最近在做结构化数据迁移,从MySQL迁移到MongoDB,一些小的细节和知识点还是挺多的
MySQL里的多表连接查询,先确定一个主表,然后通过JOIN将其它表有选择性地和主表连接起来
说实话用起来很蛋疼
比如先来个班级表
mysql root@localhost:test> select * from class; +----+--------+---------+ | id | number | user_id | +----+--------+---------+ | 1 | 33 | 3 | | 2 | 11 | 1 | | 3 | 22 | 2 | | 5 | 55 | 5 | +----+--------+---------+
再来个学生表,主键存在了班级表的user_id字段,作为外键
mysql root@localhost:test> select * from user; +----+--------+-----+-----+ | id | name | job | age | +----+--------+-----+-----+ | 1 | lihui | IT | 11 | | 2 | lilei | IT | 11 | | 3 | liwei | IT | 11 | | 4 | MeiMei | HR | 14 | +----+--------+-----+-----+
假如此时需要得到class表里现有班级里已录入信息的学生的详细信息
mysql root@localhost:test> select c.id, c.number, c.user_id, u.name, u.job, u.age from class c INNER JOIN user u ON c.user_id=u.id +----+--------+---------+-------+-----+-----+ | id | number | user_id | name | job | age | +----+--------+---------+-------+-----+-----+ | 1 | 33 | 3 | liwei | IT | 11 | | 2 | 11 | 1 | lihui | IT | 11 | | 3 | 22 | 2 | lilei | IT | 11 | +----+--------+---------+-------+-----+-----+
外键user_id作为关联字段得到了所有人的信息
但是有一点注意,内连接INNER JOIN只返回同时存在于两张表的数据,所以class表和user表的最后一行都没返回
如果你还有其他需求,在连接过程中,想保证左表或者右表必须存在,那么可以用外连接OUTER JOIN
比如LEFT OUTER JOIN
mysql root@localhost:test> select c.id, c.number, c.user_id, u.name, u.job, u.age from class c LEFT OUTER JOIN user u ON c.user_id=u.id +----+--------+---------+--------+--------+--------+ | id | number | user_id | name | job | age | +----+--------+---------+--------+--------+--------+ | 1 | 33 | 3 | liwei | IT | 11 | | 2 | 11 | 1 | lihui | IT | 11 | | 3 | 22 | 2 | lilei | IT | 11 | | 5 | 55 | 5 | <null> | <null> | <null> | +----+--------+---------+--------+--------+--------+
和INNER JOIN唯一不同的就是最后一行,主表依旧在,但user表由于没有这个user_id信息,null填充
同样的RIGHT OUTER JOIN
mysql root@localhost:test> select c.id, c.number, c.user_id, u.name, u.job, u.age from class c RIGHT OUTER JOIN user u ON c.user_id=u.id +--------+--------+---------+--------+-----+-----+ | id | number | user_id | name | job | age | +--------+--------+---------+--------+-----+-----+ | 1 | 33 | 3 | liwei | IT | 11 | | 2 | 11 | 1 | lihui | IT | 11 | | 3 | 22 | 2 | lilei | IT | 11 | | <null> | <null> | <null> | MeiMei | HR | 14 | +--------+--------+---------+--------+-----+-----+
保证user表存在,class表全部null填充
上面都是SQL的用法,下面是MongoDB里如何实现类似的用法
首先是collection class,student_id作为外键
// 1 { "_id": ObjectId("5f13a185167a9003cf76206b"), "class_num": 1, "student_id": 1 } // 2 { "_id": ObjectId("5f13a18b167a9003cf76206c"), "class_num": 2, "student_id": 2 } // 3 { "_id": ObjectId("5f13a18f167a9003cf76206d"), "class_num": 3, "student_id": 3 }
然后是collection student
// 1 { "_id": 1, "name": "lihui", "job": "IT", "age": 17 } // 2 { "_id": 2, "name": "lilei", "job": "TEA", "age": 18 } // 3 { "_id": 3, "name": "MEIMEI", "job": "HR", "age": 16 }
下面这个链接,是MongoDB官方文档$lookup的用法,完成类似JOIN的功能
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
具体的做法如下:
db.class.aggregate([ { $lookup: { from: "student", localField: "student_id", foreignField: "_id", as: "class_student_infos" } } ])
这里几个字段的含义:
from:class作为主collection,from字段表示需要链接的collection,不明白为啥命名个from
localField:主collection里的外键字段
foreignField:连接collection里对应外键的字段
as:最终连接的部分输出的字段名称
返回的结果
// 1 { "_id": ObjectId("5f12bef9167a9003cf762068"), "class_id": 3, "student_id": 3, "class_student_infos": [ { "_id": 3, "name": "MEIMEI", "job": "HR", "age": 16 } ] } // 2 { "_id": ObjectId("5f12bf1d167a9003cf762069"), "class_id": 2, "student_id": 2, "class_student_infos": [ { "_id": 2, "name": "lilei", "job": "TEA", "age": 18 } ] } // 3 { "_id": ObjectId("5f12bf27167a9003cf76206a"), "class_id": 1, "student_id": 1, "class_student_infos": [ { "_id": 1, "name": "lihui", "job": "IT", "age": 17 } ] }
此时,如果和上面SQL的例子一样,加两条两个表独有的记录
比如class collection
// 1 { "_id": ObjectId("5f13a185167a9003cf76206b"), "class_num": 1, "student_id": 1 } // 2 { "_id": ObjectId("5f13a18b167a9003cf76206c"), "class_num": 2, "student_id": 2 } // 3 { "_id": ObjectId("5f13a18f167a9003cf76206d"), "class_num": 3, "student_id": 3 } // 4 { "_id": ObjectId("5f13aff0167a9003cf76206e"), "class_num": 4, "student_id": 4 }
student collection
// 1 { "_id": 1, "name": "lihui", "job": "IT", "age": 17 } // 2 { "_id": 2, "name": "lilei", "job": "TEA", "age": 18 } // 3 { "_id": 3, "name": "MEIMEI", "job": "HR", "age": 16 } // 4 { "_id": 5, "name": "LIWEI", "job": "TEA", "age": 15 }
再次执行上面的$lookup命令
// 1 { "_id": ObjectId("5f13a185167a9003cf76206b"), "class_num": 1, "student_id": 1, "class_student_infos": [ { "_id": 1, "name": "lihui", "job": "IT", "age": 17 } ] } // 2 { "_id": ObjectId("5f13a18b167a9003cf76206c"), "class_num": 2, "student_id": 2, "class_student_infos": [ { "_id": 2, "name": "lilei", "job": "TEA", "age": 18 } ] } // 3 { "_id": ObjectId("5f13a18f167a9003cf76206d"), "class_num": 3, "student_id": 3, "class_student_infos": [ { "_id": 3, "name": "MEIMEI", "job": "HR", "age": 16 } ] } // 4 { "_id": ObjectId("5f13aff0167a9003cf76206e"), "class_num": 4, "student_id": 4, "class_student_infos": [ ] }
可以看到多了第四条document,主collection字段都在,但是连接的student信息为空,原因是外键4在collection student里找不到,因此为空,相当于SQL里的LEFT OUTER JOIN功能