Today we are going to talk about Mongo Aggregates, a framework that helps us run complex queries in MongoDB and one of the best things that happened to Mongo. Let's start by pulling out a few differences between a traditional relational database and Mongo.

Mongo belongs to the family of NoSQL databases that shook up the industry a few years ago. Everyone was talking about them. Everyone wanted to move their stack to these flexible databases, and everyone was talking about how data needed to move in that direction. As the hype began to settle, people started realizing that moving the stack helps only if it is implemented correctly, and for most of them the shift wasn't even necessary.

NoSQL is a wide term that covers a variety of database models. Mongo is one of them; Cassandra is another, and there are many more. Apache Spark often gets mentioned alongside them, although it is a data processing engine rather than a database.

MongoDB is a document-based, distributed database. In production, people tend to run it as a replica set with three members: one is the primary and the other two are secondaries. This provides redundancy and high data availability. You can configure this in other ways, but by default reads and writes are handled by the primary, and each write is then replicated to the secondaries.

Because Mongo doesn't enforce a well-defined schema, it can be pretty hard to query the data. Mongoose is a library that can help you with this. It provides a lot of benefits, like making it easy to create hooks and indexes on collections. BTW, the equivalent of a table in Mongo is known as a collection, and the equivalent of a row is known as a document.

Mongo Aggregates

MongoDB aggregates make it easier to query data from any collection. They cover things like matching, getting data from other collections, selecting fields and many more. Let's discuss a few of these aggregate queries.

Starting out a Mongo Aggregate

You can start the aggregate using the following code.

    db.collection.aggregate([ <aggregate pipeline commands> ], options)

Here collection is the name of the collection on which the aggregate is applied and db is the instance of the connected DB object.

Following are the commands that you can use in the aggregate pipeline.

Match in Mongo Aggregate

The match query is the equivalent of the WHERE clause in SQL terms. You tell the aggregate to get only the data that satisfies the given condition. It is recommended to place the match query as early as possible in the pipeline, because this reduces the number of documents passed on to the later stages. It is also a good idea to index the fields on which you run the match query so that results come back faster. For example, in a student database you can write this query as follows

    { $match: { "roll_number": 901 } }

or

    { $match: { "class": 5 } }

This returns all the documents which satisfy the given condition. Every further stage is simply added to the main array of the pipeline.

Whenever you want to refer to the value of a field, prefix the name of the field with a $.

Use match as early as possible

You should use $match as early as possible to make use of indexes.

$match operations use suitable indexes to scan only the matching documents in a collection. When possible, place $match operators at the beginning of the pipeline.

Always try to explain the $match part of your queries, and prefer to create compound indexes that fit your queries.

Project in Mongo Aggregate

Project is the part where you tell the query which keys to pick from the given document.

    { $project: { student_name: 1, student_age: '$age' } }

This picks up only the fields with the value 1, plus any fields you compute. It is important for reducing the amount of data being transferred to the next part of the pipeline. It can also be used to change the name of a field: above, age is returned as student_age. This is the equivalent of SELECT in SQL. The _id field is added to the result by default. You can use _id: 0 if you don't want to include that field.

Grouping in Mongo Aggregate

Group is used to group documents together. For example, if you want to calculate the sum of the ages of the students in each class, the query will look something like this.

    db.students.aggregate([
      { $group: { _id: "$class", total: { $sum: "$age" } } }
    ])

You can use the other accumulators as well, like average, min and max.

Lookup other collections in Mongo Aggregate

Lookup is one of the most important aggregate queries you can use. It allows us to join different collections together and get data from the other collections as well. The simplest form of $lookup is as follows.

    {
      $lookup: {
        from: <collection to join>,
        localField: <field from the input documents>,
        foreignField: <field from the documents of the "from" collection>,
        as: <output array field>
      }
    }

The lookup can be used with a pipeline as well.

    {
      $lookup: {
        from: "students",
        let: { studentId: "$_id", age: "$age" },
        pipeline: [
          { $match: { $expr: { $eq: [ "$student_id", "$$studentId" ] } } },
          { $project: { student_id: 1, age: 1 } }
        ],
        as: "data"
      }
    }

With this type of lookup, you can apply a condition that decides which documents from the other collection are joined. This is the form you want to use when you need to run $match on the data being picked from the other collection.

Unwind in Mongo Aggregate

Unwinding is an operation which deconstructs an array field from the input documents to output one document for each element. For example, consider this document:

    [
      {
        "collection": "collection",
        "count": 10,
        "content": [
          {
            "k": { "type": "int", "minInt": 0, "maxInt": 10 },
            "a": { "type": "int", "minInt": 0, "maxInt": 10 },
            "b": { "type": "int", "minInt": 0, "maxInt": 10 }
          },
          {}
        ]
      }
    ]

On applying $unwind

    db.collection.aggregate([
      { "$unwind": { "path": "$content", "preserveNullAndEmptyArrays": true } },
      { "$project": { "_id": 0, "content": 1 } }
    ])

we get the following result:

    [
      {
        "content": {
          "a": { "maxInt": 10, "minInt": 0, "type": "int" },
          "b": { "maxInt": 10, "minInt": 0, "type": "int" },
          "k": { "maxInt": 10, "minInt": 0, "type": "int" }
        }
      },
      { "content": {} }
    ]

How to Explain mongo aggregate queries

To see how a query is run, call explain() on the collection before calling aggregate.

    db.getCollection("author").explain().aggregate([
      { $match: { "email": "[email protected]" } }
    ])

This returns the query planner output. Inside queryPlanner, winningPlan contains an object which tells us more about the plan that was used to run the query, and rejectedPlans contains an array of the other plans which were tried. Mongo chooses the best plan and uses it to run the query. If you use explain with executionStats, it will also give you things like the number of documents returned (nReturned) and the number of documents examined (totalDocsExamined), which can be helpful in finding the most suitable index for your collection.

    db.getCollection("author").explain("executionStats").aggregate([
      { $match: { "email": "[email protected]" } }
    ])

There are a few more options to choose from. The internal optimizer of the Mongo aggregate also optimizes queries to make them as fast as possible.

Conclusion

Aggregate is one of the most important parts of MongoDB. If you deal with this database daily, it is useful to know a little about it. Thanks for stopping by. Let me know if you want to know about something else as well; I love to write about tech topics. Also, let me know if I have made any mistake in this post.
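As a closing sketch, here is how the stages discussed in this post could be chained into one pipeline, with $match placed first, then $project, then $group. The helper name buildClassAgePipeline is hypothetical, and the field names (class, age) are assumed from the student examples in this post. It only builds the pipeline array as plain JavaScript, so it runs without a database connection.

```javascript
// Hypothetical helper (not part of MongoDB): returns an aggregation
// pipeline for a `students` collection. The field names `class` and `age`
// are assumptions taken from the examples in this post.
function buildClassAgePipeline(classNumber) {
  return [
    // $match goes first so that later stages receive fewer documents
    // and an index on `class`, if one exists, can be used.
    { $match: { class: classNumber } },
    // Keep only the fields the next stage needs and drop _id.
    { $project: { _id: 0, class: 1, age: 1 } },
    // One output document per class, with the sum and average of ages.
    {
      $group: {
        _id: "$class",
        total: { $sum: "$age" },
        average: { $avg: "$age" },
      },
    },
  ];
}

// Print the pipeline so it can be inspected or pasted into a shell.
const pipeline = buildClassAgePipeline(5);
console.log(JSON.stringify(pipeline, null, 2));
```

In a Mongo shell connected to a database with a students collection, the result would be passed as db.students.aggregate(buildClassAgePipeline(5)). Keeping the pipeline in a function like this is just one way to make the stage order easy to read and test.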