How to perform a JOIN in MongoDB

For a full overview of MongoDB and all my posts on it, check out my overview.

If you have relational data in your MongoDB instance, you can perform an operation similar to a JOIN typically done in SQL queries.

Consider the following data set:

db.podcasts.insertMany([
	{id: 1, name: "Off The Clock", category: "Technology", rss: "https://anchor.fm/s/76aafa5c/podcast/rss"},
	{id: 2, name: "Tech Over Tea", category: "Technology", rss: "https://anchor.fm/s/149fd51c/podcast/rss"}
]);

db.episodes.insertMany([
	{podcast_id: 1, title: "Resume Tips", published_on: "2022-01-11"},
	{podcast_id: 2, title: "#75 Welcome Our Hacker Neko Waifu | Cyan Nyan", published_on: "2021-08-04"},
	{podcast_id: 2, title: "Square Enix Refuses To Take My Money | Solo", published_on: "2022-01-26"},
	{podcast_id: 1, title: "Find the Right Job", published_on: "2022-01-25"}
]);

If you want to get every podcast with all of its associated episodes, you can accomplish this using the $lookup aggregation stage.

The aggregation would look something like this:

db.podcasts.aggregate([
	{ $lookup: {
		from: "episodes", // Name of the other collection to "join" from
		localField: "id", // Name of the field your current documents contain to compare with
		foreignField: "podcast_id", // Name of field to compare to in the "from" collection's documents
		as: "episodes" // What to call the field that contains the array of sub documents that matched
	}}
]);

Any documents in the episodes collection that had a podcast_id that matched one of the podcasts documents would have been added to an array of sub-documents called episodes.

The result would look like this:

[
	{
		id: 1,
	 	name: "Off The Clock",
		category: "Technology",
		rss: "https://anchor.fm/s/76aafa5c/podcast/rss",
		episodes: [
			{podcast_id: 1, title: "Resume Tips", published_on: "2022-01-11"},
			{podcast_id: 1, title: "Find the Right Job", published_on: "2022-01-25"}
		]
	},
	{
		id: 2,
		name: "Tech Over Tea",
		category: "Technology",
		rss: "https://anchor.fm/s/149fd51c/podcast/rss",
		episodes: [
			{podcast_id: 2, title: "#75 Welcome Our Hacker Neko Waifu | Cyan Nyan", published_on: "2021-08-04"},
			{podcast_id: 2, title: "Square Enix Refuses To Take My Money | Solo", published_on: "2022-01-26"},
		]
	}
]

This would be the equivalent of the SQL query:

SELECT
	*
FROM
	podcasts
LEFT JOIN episodes ON episodes.podcast_id = podcasts.id;

I specified LEFT JOIN in the SQL example because if no documents match the $lookup aggregation stage, the podcasts would still be returned with the episodes field being an empty array.


Did you find this information useful? If so, consider heading over to my donation page and drop me some support.

Want to ask a question or just chat? Contact me here