2025-02-24 Update. From: SLTechnology News&Howtos > Internet Technology
Shulou (Shulou.com) 06/02 Report --
This article explains how to scrape WeChat official account articles, their titles, and their article URLs with Python. The method described here is simple, fast, and practical; interested readers are welcome to follow along.
Preface
The text and images in this article are from the internet and are intended for learning and communication only, not for any commercial use. If you have any questions, please contact us promptly so they can be handled.
Free online case tutorials on Python crawlers, data analysis, website development, and more: https://space.bilibili.com/523606542

Basic development environment
Python 3.6
PyCharm
Scrape articles from two official accounts:

1. All articles from the Green Light Programming official account.
2. All official account articles about Python.

Scraping all articles from the Green Light Programming official account
1. Log in to the official account backend and click into the image-and-text (图文) section.
2. Click the hyperlink.
When the relevant data loads, a request appears in the developer tools whose response is a data package containing each article's title, link, digest, release time, and so on. You can scrape other official accounts the same way, but this approach requires that you have a WeChat official account of your own.

Add your own cookie to the request headers.
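To make the parsing step concrete, here is a minimal, standalone sketch that extracts the same fields from a hand-made sample payload. The structure is an assumption based only on the field names used in this article (app_msg_list, title, link, update_time); the real response contains more fields.

```python
import json
import time

# Hypothetical sample of the JSON the appmsg endpoint returns; field names
# follow those used in this article, not a verbatim capture of the real API.
sample = json.loads('''
{
  "app_msg_list": [
    {"title": "Intro to crawlers",
     "link": "https://mp.weixin.qq.com/s/abc",
     "update_time": 1610779538}
  ]
}
''')

rows = []
for item in sample['app_msg_list']:
    # update_time is a Unix timestamp; gmtime is used here (instead of
    # localtime as in the full script) so the result is timezone-independent
    readable = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(item['update_time']))
    rows.append((item['title'], readable, item['link']))

print(rows[0])  # ('Intro to crawlers', '2021-01-16 06:45:38', 'https://mp.weixin.qq.com/s/abc')
```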
import pprint
import time
import requests
import csv

f = open('Green Light official account articles.csv', mode='a', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=['title', 'article release time', 'article address'])
csv_writer.writeheader()

for page in range(0, 40, 5):
    url = f'https://mp.weixin.qq.com/cgi-bin/appmsg?action=list_ex&begin={page}&count=5&fakeid=&type=9&query=&token=1252678642&lang=zh_CN&f=json&ajax=1'
    headers = {
        'cookie': 'your own cookie',
        'referer': 'https://mp.weixin.qq.com/cgi-bin/appmsg?t=media/appmsg_edit_v2&action=edit&isNew=1&type=10&createType=0&token=1252678642&lang=zh_CN',
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    }
    response = requests.get(url=url, headers=headers)
    html_data = response.json()
    pprint.pprint(html_data)
    lis = html_data['app_msg_list']
    for li in lis:
        title = li['title']
        link_url = li['link']
        update_time = li['update_time']
        # convert the Unix timestamp to a readable date string
        timeArray = time.localtime(int(update_time))
        otherStyleTime = time.strftime('%Y-%m-%d %H:%M:%S', timeArray)
        dit = {
            'title': title,
            'article release time': otherStyleTime,
            'article address': link_url,
        }
        csv_writer.writerow(dit)
        print(dit)

Scraping all official account articles about Python
1. Search for python on Sogou and select the WeChat tab.
Note: without logging in you can only scrape the first ten pages of results; after logging in you can scrape more than 20,000 articles.
2. Scrape the title, official account name, article URL, and release time directly from the static web page.
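The pagination works by incrementing the page query parameter in the Sogou search URL. A small sketch of how those URLs can be built and checked; the parameter names come from the search URL used in this article, while the helper function itself is illustrative:

```python
from urllib.parse import urlencode, urlparse, parse_qs

def build_search_url(query: str, page: int) -> str:
    # type=2 selects article search results; page drives the pagination
    params = {'query': query, 'type': 2, 'page': page, 'ie': 'utf8'}
    return 'https://weixin.sogou.com/weixin?' + urlencode(params)

url = build_search_url('python', 3)
print(url)

# Parse the URL back to confirm the page parameter round-trips correctly
parsed = parse_qs(urlparse(url).query)
```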
import time
import requests
import parsel
import csv

f = open('official account articles.csv', mode='a', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=['title', 'official account', 'article release time', 'article address'])
csv_writer.writeheader()

for page in range(1, 2447):
    url = f'https://weixin.sogou.com/weixin?query=python&_sug_type_=&s_from=input&_sug_=n&type=2&page={page}&ie=utf8'
    headers = {
        'Cookie': 'your own cookie',
        'Host': 'weixin.sogou.com',
        'Referer': 'https://www.sogou.com/web?query=python&_asf=www.sogou.com&_ast=&w=01019900&p=40040100&ie=utf8&from=index-nologin&s_from=index&sut=1396&sst0=1610779538290&lkt=0%2C0%2C0&sugsuv=1590216228113568&sugtime=1610779538290',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
    }
    response = requests.get(url=url, headers=headers)
    selector = parsel.Selector(response.text)
    lis = selector.css('.news-list li')
    for li in lis:
        title_list = li.css('.txt-box h4 a::text').getall()
        num = len(title_list)
        # the search keyword is highlighted in the results, which splits the
        # title text nodes, so "python" is joined back into the title here
        if num == 1:
            title_str = 'python' + title_list[0]
        else:
            title_str = 'python'.join(title_list)
        href = li.css('.txt-box h4 a::attr(href)').get()
        article_url = 'https://weixin.sogou.com' + href
        name = li.css('.s-p a::text').get()
        date = li.css('.s-p::attr(t)').get()
        timeArray = time.localtime(int(date))
        otherStyleTime = time.strftime('%Y-%m-%d %H:%M:%S', timeArray)
        dit = {
            'title': title_str,
            'official account': name,
            'article release time': otherStyleTime,
            'article address': article_url,
        }
        csv_writer.writerow(dit)
        print(title_str, name, otherStyleTime, article_url)

At this point, I believe you have a deeper understanding of how to scrape WeChat official account articles, titles, and article URLs with Python. Why not try it out in practice? For more related content, follow us and keep learning!
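One closing tip: the scripts above open the CSV file but never close it. A with block does the same job and guarantees the file is flushed and closed even if the scraper crashes mid-run. A minimal sketch using the same DictWriter pattern; the file name and rows here are illustrative:

```python
import csv

# Illustrative rows shaped like the scraper's output
rows = [
    {'title': 'Intro to crawlers',
     'article release time': '2021-01-16 06:45:38',
     'article address': 'https://mp.weixin.qq.com/s/abc'},
]

# The with block closes and flushes the file automatically
with open('articles.csv', mode='w', encoding='utf-8', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['title', 'article release time', 'article address'])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm the rows round-tripped
with open('articles.csv', encoding='utf-8', newline='') as f:
    read_back = list(csv.DictReader(f))

print(read_back[0]['title'])  # Intro to crawlers
```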