This article shares how to use Python to collect product data in batches. It is quite practical, so it is shared here as a reference; follow along and take a look.
Purpose
Use Python to collect product data from Taobao search results in batches.
Knowledge points
requests to send the HTTP request
re to parse the web page data
json to extract the JSON data
csv to save the tabular data
A minimal sketch of how these pieces fit together is shown right below; the full code follows later.
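The sketch uses a made-up URL and field names purely for illustration; the real request, regular expression, and field names appear in the core code further down.

import csv
import json
import re

import requests

# hypothetical search page; the real request is shown in the core code below
html = requests.get('https://example.com/search?q=socks').text
# many pages embed their data as a JavaScript object inside a <script> tag
matches = re.findall(r'g_page_config = (.*);', html)
if matches:
    data = json.loads(matches[0])
    with open('items.csv', mode='a', encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        for item in data.get('items', []):  # assumed key layout, for the sketch only
            writer.writerow([item.get('title'), item.get('price')])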
Development environment
Python 3.8
PyCharm
requests, pymysql (third-party libraries used by the code below)
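If these two libraries are not installed yet, running pip install requests pymysql will normally add them; the other modules imported below ship with Python.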
Code
Import module
import json
import random
import time
import csv
import requests
import re
import pymysql
Core code
# Connect to the database and save one record
def save_sql(title, pic_url, detail_url, view_price, item_loc, view_sales, nick):
    count = pymysql.connect(
        host='xxx.xxx.xxx.xxx',  # database address
        port=3306,               # database port
        user='xxxx',             # database account
        password='xxxx',         # database password
        db='xxxx'                # database name
    )
    # create a cursor object
    db = count.cursor()
    # build the sql statement
    sql = (
        f"insert into goods (title, pic_url, detail_url, view_price, item_loc, view_sales, nick) "
        f"values ('{title}', '{pic_url}', '{detail_url}', '{view_price}', '{item_loc}', '{view_sales}', '{nick}')"
    )
    # execute the sql
    db.execute(sql)
    # save the changes and close the connection
    count.commit()
    db.close()


headers = {
    # paste your own cookie copied from a logged-in browser session; the value from the original article is omitted here
    'cookie': 'xxxx',
    'referer': 'https://s.taobao.com/search?q=%E4%B8%9D%E8%A2%9C&imgfile=&js=1&stats_click=search_radio_all%3A1&initiative_id=staobaoz_20220323&ie=utf8&bcoffset=1&ntoffset=1&p4ppushleft=2%2C48&s=',
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="99", "Google Chrome";v="99"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36',
}

# write the CSV header row once
with open('Taobao.csv', mode='a', encoding='utf-8', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerow(['title', 'pic_url', 'detail_url', 'view_price', 'item_loc', 'view_sales', 'nick'])

for page in range(1, 11):  # the page count is unclear in the source; adjust the upper bound as needed
    # each result page is offset by 44 items via the s parameter
    url = f'https://s.taobao.com/search?q=%E4%B8%9D%E8%A2%9C&imgfile=&js=1&stats_click=search_radio_all%3A1&initiative_id=staobaoz_20220323&ie=utf8&bcoffset=1&ntoffset=1&p4ppushleft=2%2C48&s={44 * page}'
    response = requests.get(url=url, headers=headers)
    # the product data is embedded in the page as the g_page_config JavaScript object
    json_str = re.findall('g_page_config = (.*);', response.text)[0]
    json_data = json.loads(json_str)
    auctions = json_data['mods']['itemlist']['data']['auctions']
    for auction in auctions:
        try:
            title = auction['raw_title']
            pic_url = auction['pic_url']
            detail_url = auction['detail_url']
            view_price = auction['view_price']
            item_loc = auction['item_loc']
            view_sales = auction['view_sales']
            nick = auction['nick']
            print(title, pic_url, detail_url, view_price, item_loc, view_sales, nick)
            # save to the database
            save_sql(title, pic_url, detail_url, view_price, item_loc, view_sales, nick)
            # append the record to the CSV file
            with open('Taobao.csv', mode='a', encoding='utf-8', newline='') as f:
                csv_writer = csv.writer(f)
                csv_writer.writerow([title, pic_url, detail_url, view_price, item_loc, view_sales, nick])
        except Exception:
            pass
    # pause a few seconds between pages
    time.sleep(random.randint(3, 5))
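One weak point in save_sql above is that the SQL statement is built with an f-string, so a product title containing a quote character will break the insert and also leaves the query open to SQL injection. Below is a minimal sketch, assuming the same goods table and the same placeholder connection details, of how the insert could be written with pymysql's parameterized queries, which let the driver escape the values itself.

import pymysql

def save_sql_safe(title, pic_url, detail_url, view_price, item_loc, view_sales, nick):
    # placeholder connection details, same assumptions as in the code above
    connection = pymysql.connect(host='xxx.xxx.xxx.xxx', port=3306,
                                 user='xxxx', password='xxxx', db='xxxx')
    try:
        with connection.cursor() as cursor:
            sql = ("insert into goods "
                   "(title, pic_url, detail_url, view_price, item_loc, view_sales, nick) "
                   "values (%s, %s, %s, %s, %s, %s, %s)")
            # pymysql fills in the %s placeholders and escapes the values
            cursor.execute(sql, (title, pic_url, detail_url, view_price,
                                 item_loc, view_sales, nick))
        connection.commit()
    finally:
        connection.close()

Opening a new connection for every record, as the original code does, works but is slow; reusing a single connection for the whole run is the usual choice.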
Thank you for reading! This is the end of the article on how to collect product data in batches with Python. I hope the above content is of some help and lets you learn something new. If you found the article useful, feel free to share it so more people can see it!