2025-09-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how distributed crawlers deal with data in Redis: a small consumer script pops the crawled items from a Redis list and saves them to MongoDB or to MySQL.
Save to MongoDB
1. Start the MongoDB database: sudo mongod
2. Run the following script: py2 process_youyuan_mongodb.py
# process_youyuan_mongodb.py
# -*- coding: utf-8 -*-

import json
import redis
import pymongo


def main():
    # Specify the Redis database information
    rediscli = redis.StrictRedis(host='192.168.199.108', port=6379, db=0)
    # Specify the MongoDB database information
    mongocli = pymongo.MongoClient(host='localhost', port=27017)

    # Database name
    db = mongocli['youyuan']
    # Collection name
    sheet = db['beijing_18_25']

    while True:
        # FIFO mode is blpop, LIFO mode is brpop; both return the key and the value
        source, data = rediscli.blpop(["youyuan:items"])

        item = json.loads(data)
        # Note: insert() is deprecated since PyMongo 3.0; insert_one() replaces it there
        sheet.insert(item)

        try:
            print u"Processing: %(name)s" % item
        except KeyError:
            print u"Error processing: %r" % item


if __name__ == '__main__':
    main()
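The loop above blocks forever and stops on the first malformed payload. As a minimal sketch of one way to harden it, the helper below pops items with a timeout and skips values that are not valid JSON. It is written for Python 3, unlike the py2 script above; the names `drain_items` and `handle` are hypothetical and not part of the original script. It relies only on `blpop` returning a `(key, value)` pair, or `None` when the timeout expires.

```python
import json


def drain_items(rediscli, key, handle, timeout=1):
    """Pop JSON items from the Redis list `key` and pass each one to `handle`.

    Stops once the list has stayed empty for `timeout` seconds and
    returns the number of items handled. `drain_items` and `handle`
    are illustrative names, not part of the original script.
    """
    processed = 0
    while True:
        # blpop returns a (key, value) pair, or None if the timeout expires.
        popped = rediscli.blpop([key], timeout=timeout)
        if popped is None:
            break
        _source, data = popped
        try:
            item = json.loads(data)
        except ValueError:
            # Skip payloads that are not valid JSON instead of crashing the loop.
            continue
        handle(item)
        processed += 1
    return processed
```

With a real connection this might be called as `drain_items(rediscli, "youyuan:items", sheet.insert_one)`, assuming PyMongo 3.0 or later for `insert_one`. Passing the storage step in as `handle` keeps the queue logic independent of MongoDB or MySQL.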
Save to MySQL
1. Start MySQL: mysql.server start (the command differs by platform)
2. Log in as the root user: mysql -uroot -p
3. Create the database youyuan: create database youyuan;
4. Switch to that database: use youyuan
5. Create the table beijing_18_25, defining the column name and data type of every field.
6. Run the following script: py2 process_youyuan_mysql.py
# process_youyuan_mysql.py
# -*- coding: utf-8 -*-

import json
import redis
import MySQLdb


def main():
    # Specify the Redis database information
    rediscli = redis.StrictRedis(host='192.168.199.108', port=6379, db=0)
    # Specify the MySQL database information
    mysqlcli = MySQLdb.connect(host='127.0.0.1', user='power', passwd='xxxxxxx',
                               db='youyuan', port=3306, use_unicode=True)

    while True:
        # FIFO mode is blpop, LIFO mode is brpop; get the key and the value
        source, data = rediscli.blpop(["youyuan:items"])
        item = json.loads(data)

        try:
            # Use the cursor() method to get an operation cursor
            cur = mysqlcli.cursor()
            # Execute the SQL INSERT statement: one %s placeholder per column
            cur.execute("INSERT INTO beijing_18_25 (username, crawled, age, spider, "
                        "header_url, source, pic_urls, monologue, source_url) "
                        "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)",
                        [item['username'], item['crawled'], item['age'],
                         item['spider'], item['header_url'], item['source'],
                         item['pic_urls'], item['monologue'], item['source_url']])
            # Commit the SQL transaction
            mysqlcli.commit()
            # Close the cursor
            cur.close()
            print "inserted %s" % item['source_url']
        except MySQLdb.Error as e:
            print "Mysql Error %d: %s" % (e.args[0], e.args[1])


if __name__ == '__main__':
    main()
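Step 5 asks for the table beijing_18_25 but does not show its definition. The sketch below, in Python 3, builds a possible CREATE TABLE statement and a matching INSERT statement from a single column list, so the number of `%s` placeholders always equals the number of columns. The column names are taken from the INSERT statement in the script; the SQL types, the `utf8mb4` charset, and the helper names `build_create_table`, `build_insert`, and `row_values` are assumptions made for illustration.

```python
# Column names come from the INSERT statement in process_youyuan_mysql.py.
# The SQL types are assumptions; adjust them to the data actually crawled.
COLUMNS = [
    ("username", "VARCHAR(255)"),
    ("crawled", "VARCHAR(64)"),
    ("age", "VARCHAR(16)"),
    ("spider", "VARCHAR(64)"),
    ("header_url", "VARCHAR(512)"),
    ("source", "VARCHAR(255)"),
    ("pic_urls", "TEXT"),
    ("monologue", "TEXT"),
    ("source_url", "VARCHAR(512)"),
]


def build_create_table(table, columns):
    """Return a CREATE TABLE statement for the given (name, type) pairs."""
    body = ",\n    ".join("{} {}".format(name, sqltype) for name, sqltype in columns)
    return "CREATE TABLE IF NOT EXISTS {} (\n    {}\n) DEFAULT CHARSET=utf8mb4".format(
        table, body
    )


def build_insert(table, columns):
    """Return a parameterized INSERT with one %s placeholder per column."""
    names = ", ".join(name for name, _ in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(table, names, placeholders)


def row_values(item, columns):
    """Return the item's values in column order; missing keys become None (NULL)."""
    return [item.get(name) for name, _ in columns]
```

With an open cursor, `cur.execute(build_create_table("beijing_18_25", COLUMNS))` would create the table once, and `cur.execute(build_insert("beijing_18_25", COLUMNS), row_values(item, COLUMNS))` would insert one item. If a field such as `pic_urls` arrives as a list, it would likely need to be serialized to a string first.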
Thank you for reading. That covers how distributed crawlers deal with data in Redis. The scripts above still need to be verified in practice against your own environment.