In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/03 Report--
In statistical application projects, we often encounter such requirements: to sort a large number of objects, and then only need to take the top N names as the ranking data, this is the TopN algorithm. As a typical representative of nosql database, mongodb can store large amounts of data, and it often meets the needs of TopN in the process of use, such as getting the latest data of required fields from mongodb. The following is to use the aggregator SPL language operation, through a case study of how to achieve the above functions.
Collection last3 has two fields: variable and timestamp, which are first grouped by variable, then select the latest three timestamp in each set of documents, and finally find the earliest timestamp from these documents.
Some of the data of last3 are as follows:
{"_ id": ObjectId ("54f69645e4b077ed8d997857"), "variable": "A", "timestamp": ISODate ("1995-01-01T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997856"), "variable": "A", "timestamp": ISODate ("1995-01-02T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997855"), "variable": "A", "timestamp": ISODate ("1995-01-03T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997854"), "variable": "B", "timestamp": ISODate ("1995-01-02T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997853"), "variable": "B", "timestamp": ISODate ("1995-01-01T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997852"), "variable": "B", "timestamp": ISODate ("1994-01-03T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997851"), "variable": "C", "timestamp": ISODate ("1994-01-03T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997850"), "variable": "C", "timestamp": ISODate ("1994-01-02T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997858"), "variable": "C", "timestamp": ISODate ("1994-01-01T00:00:00Z")}
{"_ id": ObjectId ("54f69645e4b077ed8d997859"), "variable": "C", "timestamp": ISODate ("1993-01-01T00:00:00Z")}
Aggregator code:
AB1=mongo_open ("mongodb://localhost:27017/local?user=test&password=test") 2=mongo_shell (A1, "last3.find (, {_ id:0}; {variable:1})") 3for A2bot variabletimestamp) 4
= @ | B35=B4.minp (~ .timestamp)
6=mongo_close (A1)
A1: connect MongoDB, the connection word format is mongo://ip:port/db?arg=value& ….
A2: use the find function to take numbers from MongoDB and sort them to form cursors. Collectoin is last3, the filter condition is empty, take out all the fields except _ id, and sort by variable.
A3: loop the reading from the cursor, fetching a set of documents with the same variable field each time. The scope of A3 loop is indented B3 to B4, in which A3 can be used to refer to loop variables, where A3 is memory data. You can view the result of a fetch in debug mode as follows:
VariabletimestampC1994-01-03 08:00:00C1994-01-02 08:00:00C1994-01-01 08:00:00C1993-01-01 08:00:00
B3: select the 3 latest (largest) timestamp in this group of documents.
B4: constantly append B3 to B4. B4 is as follows:
VariabletimestampA1995-01-03 08:00:00A1995-01-02 08:00:00A1995-01-01 08:00:00B1995-01-02 08:00:00B1995-01-01 08:00:00B1994-01-03 08:00:00C1994-01-03 08:00:00C1994-01-02 08:00:00C1994-01-01 08:00:00
A5: select the document with the earliest (smallest) timstamp in B4, that is:
VariabletimestampC1994-01-01 08:00:00
A6: close the mongodb connection.
To achieve the requirements of topN similar to Mongodb, using the SPL language can simplify the implementation of mongodb shell, which is much easier than mongodb scripts.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.