Learning MongoDB Aggregation Operations by Example

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

The MongoDB website provides a sample data set of United States zip codes with population figures, which can be downloaded from:

http://media.mongodb.org/zips.json

Sample data:

[root@localhost cluster]# head zips.json
{"_id": "01001", "city": "AGAWAM", "loc": [-72.622739, 42.070206], "pop": 15338, "state": "MA"}
{"_id": "01002", "city": "CUSHMAN", "loc": [-72.51564999999999, 42.377017], "pop": 36963, "state": "MA"}
{"_id": "01005", "city": "BARRE", "loc": [-72.10835400000001, 42.409698], "pop": 4546, "state": "MA"}
{"_id": "01007", "city": "BELCHERTOWN", "loc": [-72.41095300000001, 42.275103], "pop": 10579, "state": "MA"}
{"_id": "01008", "city": "BLANDFORD", "loc": [-72.936114, 42.182949], "pop": 1240, "state": "MA"}
{"_id": "01010", "city": "BRIMFIELD", "loc": [-72.188455, 42.116543], "pop": 3706, "state": "MA"}
{"_id": "01011", "city": "CHESTER", "loc": [-72.988761, 42.279421], "pop": 1688, "state": "MA"}
{"_id": "01012", "city": "CHESTERFIELD", "loc": [-72.833309, 42.38167], "pop": 177, "state": "MA"}
{"_id": "01013", "city": "CHICOPEE", "loc": [-72.607962, 42.162046], "pop": 23396, "state": "MA"}
{"_id": "01020", "city": "CHICOPEE", "loc": [-72.576142, 42.176443], "pop": 31495, "state": "MA"}

Import the data into a MongoDB database using mongoimport:

[root@localhost cluster]# mongoimport -d test -c "zipcodes" --file zips.json -h 192.168.199.219:27020
2016-01-16T18:31:29.424+0800 connected to: 192.168.199.219:27020
2016-01-16T18:31:32.420+0800 [#.] test.zipcodes 2.1 MB/3.0 MB (68.5%)
2016-01-16T18:31:34.471+0800 [#] test.zipcodes 3.0 MB/3.0 MB (100.0%)
2016-01-16T18:31:34.471+0800 imported 29353 documents

I. Single-purpose aggregation operations

These are simple operations such as count and distinct.

Example 1.1: find the number of documents in the zipcodes collection

db.zipcodes.count()

Example 1.2: find the number of documents for the state of MA

db.zipcodes.count({state: "MA"})

Example 1.3: find which states appear in zipcodes

db.zipcodes.distinct("state")
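To make the semantics of count and distinct concrete without a running server, here is a minimal plain-JavaScript sketch. It is not how MongoDB implements them; the helper names count and distinct are chosen only to mirror the shell methods, and the "XX" row is hypothetical.

```javascript
// Rows copied from the sample data (loc omitted), plus one hypothetical
// non-MA row so that distinct has two values to report.
const docs = [
  { _id: "01001", city: "AGAWAM", pop: 15338, state: "MA" },
  { _id: "01002", city: "CUSHMAN", pop: 36963, state: "MA" },
  { _id: "01005", city: "BARRE", pop: 4546, state: "MA" },
  { _id: "99999", city: "EXAMPLE", pop: 100, state: "XX" }, // hypothetical
];

// count(query): number of documents whose fields equal those in the query
function count(docs, query = {}) {
  return docs.filter((d) =>
    Object.entries(query).every(([k, v]) => d[k] === v)
  ).length;
}

// distinct(field): unique values of one field, in first-seen order
function distinct(docs, field) {
  return [...new Set(docs.map((d) => d[field]))];
}

console.log(count(docs));                  // 4
console.log(count(docs, { state: "MA" })); // 3
console.log(distinct(docs, "state"));      // [ 'MA', 'XX' ]
```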

II. Use the aggregation framework (aggregate) for more complex aggregation operations

Example 2.1: find the total population of each state

db.zipcodes.aggregate([{$group: {_id: "$state", total: {$sum: "$pop"}}}])

Aggregation queries are run with the collection's aggregate method.

The $group stage takes the field to group by (be sure to use the $ prefix when referencing a field) together with one or more accumulator expressions such as $sum.

_id is the required key of $group: it holds the grouping expression and becomes the primary key of each document in the result set.

The equivalent SQL of this query is

select state as _id, sum(pop) as total from zipcodes group by state
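As a rough illustration of what $group with $sum computes, the following plain-JavaScript sketch groups an in-memory array. The function name groupSum is made up for this example, and the two "XX" rows are hypothetical; the MA rows come from the sample data.

```javascript
// MA rows copied from the sample data; the two "XX" rows are hypothetical.
const rows = [
  { state: "MA", pop: 15338 },
  { state: "MA", pop: 36963 },
  { state: "MA", pop: 4546 },
  { state: "XX", pop: 100 },
  { state: "XX", pop: 200 },
];

// Imitates {$group: {_id: "$<keyField>", total: {$sum: "$<sumField>"}}}
function groupSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d[keyField], (totals.get(d[keyField]) || 0) + d[sumField]);
  }
  // One output document per distinct key, with the key stored in _id
  return [...totals].map(([_id, total]) => ({ _id, total }));
}

console.log(groupSum(rows, "state", "pop"));
// [ { _id: 'MA', total: 56847 }, { _id: 'XX', total: 300 } ]
```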

Example 2.2: find the total population of each city in each state

db.zipcodes.aggregate([{$group: {_id: {state: "$state", city: "$city"}, pop: {$sum: "$pop"}}}])

When grouping by more than one field, _id must be a document in which each field is given a name, such as state: "$state".

Example 2.3: for each state, find the total population of the areas with a population of more than 10,000 (each document is one zip code area)

db.zipcodes.aggregate([{$match: {"pop": {$gt: 10000}}}, {$group: {_id: {state: "$state"}, pop: {$sum: "$pop"}}}])

The $match stage takes the filter criteria for the collection. This statement is equivalent to the following SQL

select state, sum(pop) as pop from zipcodes where pop > 10000 group by state

Example 2.4: find the states with a total population of more than 10 million

db.zipcodes.aggregate([{$group: {_id: {state: "$state"}, pop: {$sum: "$pop"}}}, {$match: {"pop": {$gt: 10000000}}}])

Placing $match after $group filters the grouped result set rather than the input documents. The equivalent SQL is as follows

select state, sum(pop) as pop from zipcodes group by state having sum(pop) > 10000000
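The difference between filtering before and after grouping (WHERE versus HAVING) can be seen on a tiny array. This is only a sketch: match, groupPop and run are illustrative helpers, the "XX" rows are hypothetical, the threshold 10000 is used in both cases for comparison, and _id is simplified to the plain state value.

```javascript
// MA rows copied from the sample data; the two "XX" rows are hypothetical.
const zipRows = [
  { state: "MA", pop: 15338 },
  { state: "MA", pop: 36963 },
  { state: "MA", pop: 4546 },
  { state: "XX", pop: 100 },
  { state: "XX", pop: 200 },
];

// A stage is a function from an array of documents to an array of documents.
const match = (pred) => (docs) => docs.filter(pred);
const groupPop = (docs) => {
  const totals = new Map();
  for (const d of docs) totals.set(d.state, (totals.get(d.state) || 0) + d.pop);
  return [...totals].map(([_id, pop]) => ({ _id, pop }));
};
// Run the stages in order, feeding each stage the previous stage's output.
const run = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

// $match first: rows with pop <= 10000 never reach the sum (SQL WHERE)
const whereStyle = run(zipRows, [match((d) => d.pop > 10000), groupPop]);
// $match last: the filter sees the per-state totals (SQL HAVING)
const havingStyle = run(zipRows, [groupPop, match((d) => d.pop > 10000)]);

console.log(whereStyle);  // [ { _id: 'MA', pop: 52301 } ]
console.log(havingStyle); // [ { _id: 'MA', pop: 56847 } ]
```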

Example 5: find the average city population of each state

db.zipcodes.aggregate([{$group: {_id: {state: "$state", city: "$city"}, pop: {$sum: "$pop"}}}, {$group: {_id: "$_id.state", avgPop: {$avg: "$pop"}}}])

A pipeline can apply $group more than once, each stage working on the output of the previous one. The equivalent SQL of this statement is

select state, avg(pop) as avgPop from (select state, city, sum(pop) pop from zipcodes group by state, city) group by state
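The two grouping passes can be imitated in plain JavaScript as follows. This is a sketch under the assumption that only three rows from the sample data exist; sumByStateCity and avgByState are illustrative names. CHICOPEE has two zip code rows, which is why the city totals must be computed before averaging.

```javascript
// Three rows taken from the sample data; CHICOPEE spans two zip codes.
const sample = [
  { state: "MA", city: "AGAWAM", pop: 15338 },
  { state: "MA", city: "CHICOPEE", pop: 23396 },
  { state: "MA", city: "CHICOPEE", pop: 31495 },
];

// Pass 1: sum pop per (state, city). The compound key is serialized with
// JSON.stringify so that it can serve as a Map key.
function sumByStateCity(docs) {
  const groups = new Map();
  for (const { state, city, pop } of docs) {
    const k = JSON.stringify([state, city]);
    const g = groups.get(k) || { _id: { state, city }, pop: 0 };
    g.pop += pop;
    groups.set(k, g);
  }
  return [...groups.values()];
}

// Pass 2: average the city totals per state.
function avgByState(cityDocs) {
  const acc = new Map();
  for (const { _id, pop } of cityDocs) {
    const a = acc.get(_id.state) || { sum: 0, n: 0 };
    a.sum += pop;
    a.n += 1;
    acc.set(_id.state, a);
  }
  return [...acc].map(([state, { sum, n }]) => ({ _id: state, avgPop: sum / n }));
}

console.log(avgByState(sumByStateCity(sample)));
// [ { _id: 'MA', avgPop: 35114.5 } ]
```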

Example 2.5: find the most and least populous city in each state, with the corresponding population

The pipeline is the one listed under Example 2.6, without the final $project stage.

The first $group computes the population per (state, city) group.

The $sort stage then orders those documents by population.

The second $group groups by state. Because the documents arrive sorted by cityPop, the first document of each group ($first) is the least populous city and the last ($last) is the most populous.

Example 2.6: use $project to reformat the result

db.zipcodes.aggregate([
  {$group: {_id: {state: "$state", city: "$city"}, cityPop: {$sum: "$pop"}}},
  {$sort: {cityPop: 1}},
  {$group: {_id: "$_id.state", biggestCity: {$last: "$_id.city"}, biggestPop: {$last: "$cityPop"}, smallestCity: {$first: "$_id.city"}, smallestPop: {$first: "$cityPop"}}},
  {$project: {_id: 0, state: "$_id", biggestCity: {name: "$biggestCity", pop: "$biggestPop"}, smallestCity: {name: "$smallestCity", pop: "$smallestPop"}}}
])
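The sort, first/last and reshaping steps can be sketched in plain JavaScript. The input below stands for the output of the first $group; extremesByState is an illustrative name, and the CHICOPEE total is simply the sum of its two sample rows.

```javascript
// City totals: AGAWAM and BLANDFORD come from the sample rows, and
// CHICOPEE is the sum of its two sample rows (23396 + 31495).
const cityTotals = [
  { _id: { state: "MA", city: "CHICOPEE" }, cityPop: 54891 },
  { _id: { state: "MA", city: "BLANDFORD" }, cityPop: 1240 },
  { _id: { state: "MA", city: "AGAWAM" }, cityPop: 15338 },
];

function extremesByState(docs) {
  // Like $sort: {cityPop: 1}; copy first so the input is not mutated
  const sorted = [...docs].sort((a, b) => a.cityPop - b.cityPop);
  const byState = new Map();
  for (const d of sorted) {
    // The first document seen for a state is its smallest city ($first)
    const g = byState.get(d._id.state) || { first: d, last: d };
    g.last = d; // overwritten on every document, so it ends as $last
    byState.set(d._id.state, g);
  }
  // Like $project: reshape into {state, biggestCity, smallestCity}
  return [...byState].map(([state, { first, last }]) => ({
    state,
    biggestCity: { name: last._id.city, pop: last.cityPop },
    smallestCity: { name: first._id.city, pop: first.cityPop },
  }));
}

console.log(JSON.stringify(extremesByState(cityTotals)));
// [{"state":"MA","biggestCity":{"name":"CHICOPEE","pop":54891},"smallestCity":{"name":"BLANDFORD","pop":1240}}]
```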

Example 2.7: aggregate over the contents of an array

Assume there is a collection recording the courses each student takes. Sample data:

db.course.insert({name: "Zhang San", age: 10, grade: "fourth grade", course: ["math", "English", "politics"]})
db.course.insert({name: "Li Si", age: 9, grade: "third grade", course: ["math", "Chinese", "nature"]})
db.course.insert({name: "Wang Wu", age: 11, grade: "fourth grade", course: ["math", "English"]})
db.course.insert({name: "Zhao Liu", age: 9, grade: "fourth grade", course: ["math", "history", "politics"]})

Find how many students take each course.

db.course.aggregate([{$unwind: "$course"}, {$group: {_id: "$course", sum: {$sum: 1}}}, {$sort: {sum: -1}}])

$unwind unpacks the array, producing one document per array element, and the documents are then grouped on the unpacked value. In addition, $group had no $count accumulator when this was written (later MongoDB releases added one), so {$sum: 1} is used to compute the count.
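What $unwind followed by a counting $group does can be sketched with flatMap in plain JavaScript. The student data mirrors the sample inserts, with the course names written consistently (an assumption, since the counting only makes sense if the same course is spelled the same way).

```javascript
// The four students from the sample data (age and grade omitted).
const students = [
  { name: "Zhang San", course: ["math", "English", "politics"] },
  { name: "Li Si", course: ["math", "Chinese", "nature"] },
  { name: "Wang Wu", course: ["math", "English"] },
  { name: "Zhao Liu", course: ["math", "history", "politics"] },
];

// Like $unwind: one output document per array element
const unwound = students.flatMap((s) =>
  s.course.map((c) => ({ ...s, course: c }))
);

// Like $group with {$sum: 1}, followed by $sort: {sum: -1}
const counts = new Map();
for (const d of unwound) counts.set(d.course, (counts.get(d.course) || 0) + 1);
const perCourse = [...counts]
  .map(([_id, sum]) => ({ _id, sum }))
  .sort((a, b) => b.sum - a.sum);

console.log(unwound.length); // 11
console.log(perCourse[0]);   // { _id: 'math', sum: 4 }
```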

Example 2.8: find which cities each state has

db.zipcodes.aggregate([{$group: {_id: "$state", cities: {$addToSet: "$city"}}}])

$addToSet collects the distinct city values of each group into an array.
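The set-like behaviour of $addToSet can be imitated with a JavaScript Set. In this sketch citiesByState is an illustrative name; note that the array order here is first-seen order, whereas MongoDB does not specify the order of the elements.

```javascript
// Rows from the sample data; CHICOPEE appears under two zip codes.
const places = [
  { state: "MA", city: "AGAWAM" },
  { state: "MA", city: "CHICOPEE" },
  { state: "MA", city: "CHICOPEE" }, // second zip code for the same city
];

function citiesByState(docs) {
  const sets = new Map();
  for (const { state, city } of docs) {
    if (!sets.has(state)) sets.set(state, new Set());
    sets.get(state).add(city); // a Set ignores values it already holds
  }
  return [...sets].map(([_id, cities]) => ({ _id, cities: [...cities] }));
}

console.log(citiesByState(places));
// [ { _id: 'MA', cities: [ 'AGAWAM', 'CHICOPEE' ] } ]
```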

Suppose we have the following data structure

db.book.insert({
  _id: 1,
  title: "MongoDB Documentation",
  tags: ["Mongodb", "NoSQL"],
  year: 2014,
  subsections: [
    {subtitle: "Section 1: Install MongoDB", tags: ["NoSQL", "Document"], content: "Section 1: This is the content of section 1."},
    {subtitle: "Section 2: MongoDB CRUD Operations", tags: ["Insert", "Mongodb"], content: "Section 2: This is the content of section 2."},
    {subtitle: "Section 3: Aggregation", tags: ["Aggregate"], content: {text: "Section 3: This is the content of section3.", tags: ["MapReduce", "Aggregate"]}}
  ]
})

This document describes a book and its sections; each section has a tags field, and the book itself has a tags field.

Suppose the requirement is to find books tagged Mongodb and to show only the sections tagged Mongodb. This cannot be done with the find() method alone.

db.book.find({$or: [{tags: {$in: ["Mongodb"]}}, {"subsections.tags": {$in: ["Mongodb"]}}]})

A query like the one above returns every part of the matching document, so sections without the Mongodb tag are displayed as well.

The aggregation framework provides a $redact stage that can trim the result.

db.book.aggregate([{$redact: {$cond: {if: {$gt: [{$size: {$setIntersection: ["$tags", ["Mongodb"]]}}, 0]}, then: "$$DESCEND", else: "$$PRUNE"}}}])

$$DESCEND: if the condition is met, the fields at the current document level are returned, and the same condition is then applied to each embedded document.

$$PRUNE: if the condition is not met, the current document or embedded document is excluded from the output.

The query results are as follows

{"_id": 1, "title": "MongoDB Documentation", "tags": ["Mongodb", "NoSQL"], "year": 2014, "subsections": [{"subtitle": "Section 2: MongoDB CRUD Operations", "tags": ["Insert", "Mongodb"], "content": "Section 2: This is the content of section 2."}]}
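The descend-or-prune walk can be approximated with a small recursive function in plain JavaScript. This is a simplified sketch, not MongoDB's implementation: redact, isDoc and hasMongodbTag are illustrative names, the book is abbreviated (year and string contents omitted), and a level without a tags array is simply pruned here.

```javascript
// Abbreviated copy of the book document from the example.
const book = {
  _id: 1,
  title: "MongoDB Documentation",
  tags: ["Mongodb", "NoSQL"],
  subsections: [
    { subtitle: "Section 1: Install MongoDB", tags: ["NoSQL", "Document"] },
    { subtitle: "Section 2: MongoDB CRUD Operations", tags: ["Insert", "Mongodb"] },
    { subtitle: "Section 3: Aggregation", tags: ["Aggregate"],
      content: { text: "Section 3: This is the content of section3.",
                 tags: ["MapReduce", "Aggregate"] } },
  ],
};

const isDoc = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

// Returns a trimmed copy of doc, or undefined when the level is pruned.
function redact(doc, keep) {
  if (!keep(doc)) return undefined; // like $$PRUNE
  const out = {};                   // like $$DESCEND: keep this level's fields
  for (const [field, value] of Object.entries(doc)) {
    if (isDoc(value)) {
      const kept = redact(value, keep);
      if (kept !== undefined) out[field] = kept;
    } else if (Array.isArray(value)) {
      // Embedded documents in arrays are tested one by one; scalars pass
      out[field] = value
        .map((item) => (isDoc(item) ? redact(item, keep) : item))
        .filter((item) => item !== undefined);
    } else {
      out[field] = value;
    }
  }
  return out;
}

const hasMongodbTag = (d) => Array.isArray(d.tags) && d.tags.includes("Mongodb");
const trimmed = redact(book, hasMongodbTag);
console.log(trimmed.subsections.map((s) => s.subtitle));
// [ 'Section 2: MongoDB CRUD Operations' ]
```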

III. Use mapReduce

Example 3.1: find the total population of each state

db.zipcodes.mapReduce(
  function () { emit(this.state, this.pop) },        // map function
  (key, values) => { return Array.sum(values) },     // reduce function
  {out: "zipcodes_groupby_state"}
)

mapReduce takes at least three arguments: the map function, the reduce function, and the out output option.

In the map function, this refers to the document currently being processed. The emit function passes a key-value pair on to the reduce function.

reduce receives the output of the map function as its input. The values argument of reduce is a list. In the example above, state is the key, and the pop values of all records with the same state form the list of values, as follows

state = "CA"  values = [51841, 40629, ...]

The key is carried into the output automatically, so the reduce function returns only the value; here the return statement adds up the elements of values.

out: the collection in which the output is saved
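The flow from map through emit to reduce can be simulated in a few lines of plain JavaScript. This is only a sketch with assumptions: the mapReduce helper below is written for this example, emit is passed to the map function as an argument (in the mongo shell it is a global), keys are assumed to be strings, and the "XX" row is hypothetical.

```javascript
// MA rows copied from the sample data; the "XX" row is hypothetical.
const popRows = [
  { state: "MA", pop: 15338 },
  { state: "MA", pop: 36963 },
  { state: "MA", pop: 4546 },
  { state: "XX", pop: 100 },
];

function mapReduce(docs, map, reduce) {
  const groups = new Map();
  // emit collects every value under its key
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit); // `this` is the current document
  return [...groups].map(([key, values]) => ({
    _id: key,
    // As in MongoDB, reduce is not called for a key with a single value
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

const totalsByState = mapReduce(
  popRows,
  function (emit) { emit(this.state, this.pop); },    // map function
  (key, values) => values.reduce((a, b) => a + b, 0)  // reduce function
);

console.log(totalsByState);
// [ { _id: 'MA', value: 56847 }, { _id: 'XX', value: 100 } ]
```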

Example 3.2: find the population of each city and the number of documents for each city

db.zipcodes.mapReduce(
  function () {
    var key = {state: this.state, city: this.city}
    emit(key, {count: 1, pop: this.pop})
  },                                                  // map function
  (key, values) => {
    var retval = {count: 0, pop: 0}
    for (var i = 0; i < values.length; i++) {
      retval.count += values[i].count
      retval.pop += values[i].pop
    }
    return retval
  },                                                  // reduce function
  {out: "zipcodes_groupby_state_city"}
)

In the map function, the {state, city} object is emitted as the key and the {count: 1, pop: this.pop} object as the value.

The reduce function then sums count and pop over the values and returns the result.

The equivalent SQL is as follows

select state, city, count(*) as count, sum(pop) as pop from zipcodes group by state, city
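MongoDB may call reduce more than once for the same key and feed an earlier result back in as one of the values, so the returned object should have the same shape as the emitted values. The following plain-JavaScript check, using the two CHICOPEE rows from the sample data, is one way to convince yourself that a reduce function of this form tolerates that; reduceCityStats is an illustrative name.

```javascript
// Same logic as the reduce function in Example 3.2.
const reduceCityStats = (key, values) => {
  const retval = { count: 0, pop: 0 };
  for (const v of values) {
    retval.count += v.count;
    retval.pop += v.pop;
  }
  return retval;
};

// The two CHICOPEE rows from the sample data, as the map function emits them.
const cityKey = { state: "MA", city: "CHICOPEE" };
const emitted = [{ count: 1, pop: 23396 }, { count: 1, pop: 31495 }];

// Reduce everything in one call ...
const allAtOnce = reduceCityStats(cityKey, emitted);
// ... or reduce a part first and feed that result back in with the rest.
const partial = reduceCityStats(cityKey, [emitted[0]]);
const inTwoSteps = reduceCityStats(cityKey, [partial, emitted[1]]);

console.log(allAtOnce);  // { count: 2, pop: 54891 }
console.log(inTwoSteps); // { count: 2, pop: 54891 }
```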
