MongoDB Aggregation Pipelines: A Hands-on Tutorial
Analyzing data and extracting metrics, requires several tasks such as filtering, sorting, grouping, etc. In MongoDB, we can perform these operations using aggregation pipelines.
Before we dive into the details, to understand different stages in MongoDB aggregation pipelines and to prepare the dataset, read this article on stages in MongoDB Aggregation pipelines .
Let’s discuss how to break complex analytical problems into smaller, manageable parts and how to utilize the different stages in MongoDB aggregation pipelines to gain insights from data. We’ll explore three analytical questions to illustrate how to work with MongoDB aggregation pipelines effectively.
What Are Aggregation Pipelines in MongoDB?
MongoDB aggregation pipelines consist of one or more stages that perform specific operations on documents. A stage in the MongoDB pipeline can filter data, group documents, sort them, or select specific fields from the documents.
- We use the
$match
stage to filter the documents. - We use the
$group
stage to group the data by values in specific fields. - We use the
$sort
stage to sort the documents in the data by a particular field. - We use the
$project
stage to select specific fields from the documents.
We use the aggregate()
function to combine all the stages together and execute the MongoDB aggregation pipeline.
Using Different Stages to Create Aggregation Pipelines in MongoDB
To solve complex problems, we create MongoDB aggregation pipelines with different stages like $match
, $group
, $sort
, and others. We also use operators, such as $sum
, $min
, $max
, and $avg
.
To efficiently use aggregation pipelines to solve a problem, we follow this three-step process:
- Break the problem into small sub-problems like filtering, sorting, and selecting.
- Identify which MongoDB aggregation pipeline stage we can use to solve each sub-problem.
- Write the code to solve each sub-problem and combine them to build the pipeline to solve the original problem.
To understand the above steps, let’s consider the following examples:
- Find the average salary of employees in the department ID 100 for the given employee data.
- Find the average salary of the employees in each department of an organization and show the values in descending order of the average salary.
- Show the name and salary of all the employees in department 100. Order the results by employee age in ascending order.
Using the $match
, $group
, and $project
Stages Together
We will use the following steps to create a MongoDB aggregation pipeline for finding the average salary of employees in a department with id
100:
- First, we will use the
$match
stage to filter documents for departmentid
100. - Next, we will use the
$group
stage to group all the documents with departmentid
100. For this, we will passnull
to the_id
key in the$group
stage. Subsequently, we will use the$avg
operator to calculate the average salary for the given department. - Finally, we will use the
$project
stage to show the average salary value.
An example of this is as follows:
codecademy> db.employees.aggregate([{$match:{"dept_id":100}},{$group:{_id:null, average_salary:{$avg:"$salary"}}},{$project:{_id:0,average_salary:1}}])
Output:
[ { average_salary: 11750 } ]
Using the $group
, $sort
, and $project
Stages Together
To calculate the average salary in each department of the organization and show the values in descending order of the average salary, we will build a MongoDB aggregation pipeline using the $group
, $sort
, and $project
stages. For this, we use the following steps:
- First, we will use the
$group
stage and the$avg
operator to find the average salary in each department. - Next, we will use the
$sort
stage to sort the data by the average salary field. - Finally, we use the
$project
stage to show the output.
You can observe this in the following example:
codecademy> db.employees.aggregate([{$group: {_id: {dept_id:"$dept_id"}, average_salary:{$avg:"$salary"}}},{$sort:{average_salary:-1}},{$project:{_id:1, average_salary:1}}])
Output:
[{ _id: { dept_id: 300 }, average_salary: 22500 },{ _id: { dept_id: 200 }, average_salary: 15000 },{ _id: { dept_id: 400 }, average_salary: 15000 },{ _id: { dept_id: 100 }, average_salary: 11750 }]
Using the $match
, $sort
, and $project
Stages Together
We will use the $match
, $sort
, and $project
stages to build a MongoDB aggregation pipeline showing the name
and salary
of all the employees in department 100, ordered by employee age. We do this like so::
- We use the
$match
stage to filter documents for the department ID 100. - We use the
$sort
stage to sort the documents using theemp_age
field. - We will use the
$project
stage to show only theemp_name
andsalary
fields.
You can see this in the following example:
codecademy> db.employees.aggregate([{$match:{"dept_id":100}},{$sort:{emp_age:1}},{$project:{_id:0,emp_name:1,salary:1}}])
Output:
[{ emp_name: 'Katy', salary: 12000 },{ emp_name: 'Aditya', salary: 12000 },{ emp_name: 'Adam', salary: 13000 },{ emp_name: 'Ankit', salary: 10000 }]
Conclusion
Mastering MongoDB aggregation pipelines can help you solve analytical questions easily. Using the examples, we discussed how to break complex analytical problems into sub-problems and solve them using different stages in MongoDB aggregation pipelines.
To better understand the concepts, we suggest you frame some questions to calculate different metrics on any given dataset and create aggregation pipelines for it. This will help you understand how to break problems down, solve them using different stages, and create a complete aggregation pipeline in MongoDB.
To read more tutorials on topics like data science, cloud computing, artificial intelligence, etc., visit the Codecademy article hub.
Author
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'
Meet the full teamRelated articles
Learn more on Codecademy
- Skill path
Code Foundations
Start your programming journey with an introduction to the world of code and basic concepts.Includes 5 CoursesWith CertificateBeginner Friendly4 hours - Career path
Full-Stack Engineer
A full-stack engineer can get a project done from start to finish, back-end to front-end.Includes 51 CoursesWith Professional CertificationBeginner Friendly150 hours