How to crawl WeChat official account articles, titles and article addresses with Python

Updated 2025-02-24. From: SLTechnology News&Howtos


Shulou (Shulou.com), 06/02

This article explains how to use Python to crawl WeChat official account articles, including each article's title and address. The method is simple, fast and practical. Let's walk through it.

Preface

The text and images in this article come from the internet and are for learning and exchange only, with no commercial use. If you have any questions, please contact us promptly.

Free case tutorials on Python crawlers, data analysis, website development and more can be watched online at:

https://space.bilibili.com/523606542

Basic development environment

Python 3.6

PyCharm

Two crawling tasks are covered:

1. Crawl all the articles of the Green Light Programming official account.

2. Crawl all official account articles about python.

Crawl all the articles of the Green Light Programming official account

1. After logging in to your official account, click the image-and-text message option.

3. Click the hyperlink button.

Once the relevant data has loaded, a data packet appears in the browser's network panel that contains each article's title, link, summary, release time and so on. You can select other official accounts and crawl them the same way, but this requires you to have a WeChat official account of your own.
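To make the shape of that data packet concrete, here is a minimal sketch of pulling the fields out of one decoded response. The keys `app_msg_list`, `title`, `link` and `update_time` are the ones the script in this article reads; the sample values and the helper name `extract_rows` are hypothetical, made up purely for illustration.

```python
import time

# Hypothetical sample shaped like the JSON the listing request returns.
# Only the key names come from the article; the values are invented.
sample_payload = {
    "app_msg_list": [
        {
            "title": "Example article",
            "link": "https://mp.weixin.qq.com/s/example",
            "update_time": 1610779538,
        },
    ]
}


def extract_rows(payload):
    """Turn one decoded response into rows ready for csv.DictWriter."""
    rows = []
    # .get() with a default keeps the loop safe if the key is absent,
    # e.g. when the request was rejected and no article list came back.
    for item in payload.get("app_msg_list", []):
        published = time.localtime(int(item["update_time"]))
        rows.append({
            "title": item["title"],
            "article release time": time.strftime("%Y-%m-%d %H:%M:%S", published),
            "article address": item["link"],
        })
    return rows


for row in extract_rows(sample_payload):
    print(row)
```

Keeping the parsing in its own function makes it possible to check the field handling against a saved response before sending any real requests.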

You need to add your own cookie to the request headers:

    import pprint
    import time
    import requests
    import csv

    # Append rows to a CSV file; newline='' avoids blank lines on Windows
    f = open('Tsing Deng official account article.csv', mode='a', encoding='utf-8', newline='')
    csv_writer = csv.DictWriter(f, fieldnames=['title', 'article release time', 'article address'])
    csv_writer.writeheader()

    # 'begin' is the offset of the first article on each page; each page holds 5 articles
    for page in range(0, 40, 5):
        # fakeid is left empty here, as in the source
        url = f'https://mp.weixin.qq.com/cgi-bin/appmsg?action=list_ex&begin={page}&count=5&fakeid=&type=9&query=&token=1252678642&lang=zh_CN&f=json&ajax=1'
        headers = {
            'cookie': 'your cookie here',  # paste the cookie from your logged-in session
            'referer': 'https://mp.weixin.qq.com/cgi-bin/appmsg?t=media/appmsg_edit_v2&action=edit&isNew=1&type=10&createType=0&token=1252678642&lang=zh_CN',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
        }
        response = requests.get(url=url, headers=headers)
        html_data = response.json()
        pprint.pprint(html_data)
        lis = html_data['app_msg_list']
        for li in lis:
            title = li['title']
            link_url = li['link']
            update_time = li['update_time']
            # update_time is a Unix timestamp; convert it to a readable local time
            timeArray = time.localtime(int(update_time))
            otherStyleTime = time.strftime("%Y-%m-%d %H:%M:%S", timeArray)
            dit = {
                'title': title,
                'article release time': otherStyleTime,
                'article address': link_url,
            }
            csv_writer.writerow(dit)
            print(dit)

Crawl all official account articles about python

1. Search for python on Sogou and select the WeChat tab.

Note: if you are not logged in, you can only crawl the first ten pages of results. After logging in, you can crawl more than 20,000 articles.

2. Crawl the title, official account name, article address and release time directly from the static web page.

    import time
    import requests
    import parsel
    import csv

    f = open('official account article.csv', mode='a', encoding='utf-8', newline='')
    csv_writer = csv.DictWriter(f, fieldnames=['title', 'official account', 'article release time', 'article address'])
    csv_writer.writeheader()

    for page in range(1, 2447):
        url = f'https://weixin.sogou.com/weixin?query=python&_sug_type_=&s_from=input&_sug_=n&type=2&page={page}&ie=utf8'
        headers = {
            'Cookie': 'your own cookie',  # paste the cookie from your logged-in session
            'Host': 'weixin.sogou.com',
            'Referer': 'https://www.sogou.com/web?query=python&_asf=www.sogou.com&_ast=&w=01019900&p=40040100&ie=utf8&from=index-nologin&s_from=index&sut=1396&sst0=1610779538290&lkt=0%2C0%2C0&sugsuv=1590216228113568&sugtime=1610779538290',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
        }
        response = requests.get(url=url, headers=headers)
        selector = parsel.Selector(response.text)
        lis = selector.css('.news-list li')
        for li in lis:
            # The title comes back as text fragments; rejoin them with the keyword
            title_list = li.css('.txt-box h4 a::text').getall()
            num = len(title_list)
            if num == 1:
                title_str = 'python' + title_list[0]
            else:
                title_str = 'python'.join(title_list)
            href = li.css('.txt-box h4 a::attr(href)').get()
            article_url = 'https://weixin.sogou.com' + href
            # This selector was garbled in the source; '.s-p a::text' is a
            # reconstruction (an assumption), so check it against the live page
            name = li.css('.s-p a::text').get()
            date = li.css('.s-p::attr(t)').get()
            timeArray = time.localtime(int(date))
            otherStyleTime = time.strftime("%Y-%m-%d %H:%M:%S", timeArray)
            dit = {
                'title': title_str,
                'official account': name,
                'article release time': otherStyleTime,
                'article address': article_url,
            }
            csv_writer.writerow(dit)
            print(title_str, name, otherStyleTime, article_url)

By now you should have a deeper understanding of how to crawl WeChat official account articles, titles and article addresses with Python. The best way to consolidate it is to try it out yourself.
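One detail of the Sogou script worth isolating is how it rebuilds the title. The following is a minimal sketch, assuming (as the script's logic suggests, though the article does not state it) that the `a::text` selector returns only the pieces of the title around the search keyword. The helper name `rebuild_title` is hypothetical.

```python
def rebuild_title(fragments, keyword="python"):
    """Rejoin title fragments around the search keyword.

    Mirrors the script's logic: a single fragment gets the keyword
    prepended, several fragments are joined with the keyword between them.
    """
    if len(fragments) == 1:
        # The script assumes the keyword leads the title in this case.
        return keyword + fragments[0]
    return keyword.join(fragments)


print(rebuild_title(["Learn ", " in 10 days"]))
print(rebuild_title([" crawler tutorial"]))
```

Note that the single-fragment branch always puts the keyword first, so a title that ends with the keyword would be rebuilt in the wrong order. That limitation is inherited from the original script.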
