How to use PHP code to collect WeChat official account articles

2025-04-06 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/03

This article explains how to use PHP code to collect WeChat official account articles. Many people run into the difficulties listed below when working on real cases, so this walkthrough shows how to deal with them.

Collecting an official account's historical messages through Sogou search has several problems:

1. There is a verification code (CAPTCHA).

2. The list of historical messages contains only the last 10 mass messages.

3. Article addresses expire after a limited time.

4. Batch collection reportedly requires changing IP addresses.

The method from my previous articles has none of these problems. Building the collection system is not as simple as writing a crawl rule for a traditional collector, but once it is built, batch collection is reasonably efficient. The collected article addresses are permanent, and all the historical messages of an official account can be collected.

Let's start with the link address of an official account article:

1. The link address copied from the menu in the upper right corner of WeChat:

http://mp.weixin.qq.com/s/fF34bERZ0je_8RWEJjoZ5A

2. The address obtained in the list of historical messages:

http://mp.weixin.qq.com/s?__biz=MjM5NDAwMTA2MA==&mid=2695729619&idx=1&sn=8be0b6bd0210cee0d492ebdf20f7371f&chksm=83d74818b4a0c10ef286b33bb7deb73226125f866ddb5b2781166066a69afef3705eabdb3b85&scene=4#wechat_redirect

3. Complete real address:

https://mp.weixin.qq.com/s?__biz=MjM5NDAwMTA2MA==&mid=2695729619&idx=1&sn=8be0b6bd0210cee0d492ebdf20f7371f&chksm=83d74818b4a0c10ef286b33bb7deb73226125f866ddb5b2781166066a69afef3705eabdb3b85&scene=37&key=c81d77271180a0e6ce32be2d9dcaa2a7436aeba2c1d47a20d02194d1c944a8286a8eded93495eeadd05da412bbfaa638a379750aeaa4cf5c00e4d7851c5710d9b9736b80e3c72770a57a515c23ff2400&ascene=3&uin=MzUyOTIyNQ%3D%3D&devicetype=iOS10.1.1&version=16050120&nettype=WIFI&fontScale=100&pass_ticket=FGRyGfXLPEa4AeOsIZu7KFJo6CiXOZex83Y5YBRglW4%3D&wx_header=1

These three addresses all point to the same article; the link takes three completely different forms depending on where it is obtained.

As with the historical message page, WeChat automatically appends parameters to article links. The first address comes from the copy-link menu and appears to be an obfuscated short code. It is of no use for collection, so we ignore it. The second address is the link taken from the JSON article list of the historical messages, using the method described in the previous article; we can save it to the database and later fetch the article content from the server through it. The third address carries extra parameters whose purpose is to let the JavaScript on the article page fetch the JSON results for the read and like counts. In the method from the previous article, the article page is opened and displayed by the WeChat client, so these parameters are present and the JavaScript in the page requests the read count automatically. That is how we capture the read count of the article through the proxy service.
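To see which parameters WeChat appends, the query strings of the second and third addresses can be compared. The following is a minimal sketch using PHP's parse_url() and parse_str(); the function name addedQueryParams and the shortened URLs with placeholder values are assumptions made for illustration, not part of the original collection system.

```php
<?php
// Return the names of query parameters present in $fullUrl but absent from $baseUrl.
function addedQueryParams(string $baseUrl, string $fullUrl): array
{
    parse_str((string) parse_url($baseUrl, PHP_URL_QUERY), $base);
    parse_str((string) parse_url($fullUrl, PHP_URL_QUERY), $full);

    // array_diff_key keeps the entries of $full whose keys do not exist in $base.
    return array_keys(array_diff_key($full, $base));
}

// Shortened stand-ins for the second and third addresses (all values are placeholders).
$listUrl = 'http://mp.weixin.qq.com/s?__biz=BIZ&mid=1&idx=1&sn=SN&chksm=CK&scene=4#wechat_redirect';
$fullUrl = 'https://mp.weixin.qq.com/s?__biz=BIZ&mid=1&idx=1&sn=SN&chksm=CK&scene=37&key=KEY&uin=UIN&pass_ticket=TICKET';

print_r(addedQueryParams($listUrl, $fullUrl));
```

Run against the real second and third addresses above, the same function lists every parameter that only the complete address carries.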

This article builds on the previous articles in this column, which showed how to obtain a large number of WeChat article links, and looks in detail at how to get the article content and some other useful information.

(list of articles saved in my database, some fields)

1. Get the source code of the article:

The article source can be read into a variable with PHP's file_get_contents() function. Because the source of a WeChat article can be viewed from any browser, I will not paste it here, to save page space.
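A minimal sketch of this step, assuming allow_url_fopen is enabled so that file_get_contents() can open HTTP URLs. The function name fetchArticleSource, the timeout, and the User-Agent string are hypothetical choices, not taken from the original code.

```php
<?php
// Fetch a page and return its source, or null if the request fails.
function fetchArticleSource(string $url, int $timeoutSeconds = 10): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeoutSeconds,
            // Hypothetical User-Agent; some servers reject requests that send none.
            'header'  => "User-Agent: Mozilla/5.0 (compatible; ArticleCollector/1.0)\r\n",
        ],
    ]);

    // @ suppresses the warning on failure; the false return value is checked instead.
    $source = @file_get_contents($url, false, $context);

    return $source === false ? null : $source;
}
```

The address passed in would be the second kind of link saved in the database, for example `$content = fetchArticleSource($row['content_url']);`, where `$row` and its `content_url` field are placeholders for your own table layout.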

2. Useful information in the source code:

1) Content of the original text:

The original text is contained in a single tag and is extracted with PHP code:
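The tag name did not survive in the text above. The sketch below assumes the container is a div with id="js_content", which is how WeChat article pages have commonly been structured; treat both that id and the function name extractArticleBody as assumptions and check them against the actual page source. It cuts roughly from the opening container tag to the next script tag, so the result may include a few trailing closing tags.

```php
<?php
// Extract the article body from the page source, or return null if it is not found.
function extractArticleBody(string $pageHtml): ?string
{
    // Assumed container: <div ... id="js_content" ...>. Lazy match up to the next <script.
    $pattern = '/<div[^>]*\bid="js_content"[^>]*>(.*?)<script/is';
    if (preg_match($pattern, $pageHtml, $m) !== 1) {
        return null;
    }

    // Re-open the container so the saved fragment starts with a matching tag.
    return '<div id="js_content">' . trim($m[1]);
}
```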

Then there is the video. Videos do not display correctly. After long-term testing, I found that this can be fixed by replacing a page address. I will skip the process and just give the result:

After these two replacements, the pictures and videos in the HTML of the article content display normally.
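The exact strings being replaced are missing from the text, so the following sketch rests on two assumptions that should be verified against real article source: that images and video frames are lazy-loaded with the real address in a data-src attribute, and that the video frame address ends in preview.html where player.html is the page that plays correctly. The function name fixArticleMedia is hypothetical.

```php
<?php
// Apply the two string replacements to the extracted article HTML.
function fixArticleMedia(string $contentHtml): string
{
    // Assumption: the real address sits in data-src; promote it to src.
    $contentHtml = str_replace(' data-src="', ' src="', $contentHtml);

    // Assumption: the video frame points at preview.html and player.html is the working page.
    return str_replace('preview.html', 'player.html', $contentHtml);
}
```

If a tag already carries its own src attribute next to data-src, this simple replacement produces a duplicate attribute, so an attribute-aware rewrite would be needed in that case.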

3) Official account related information:

In the previous article of this column, we described how, after any historical message page of an official account is opened in the WeChat client, the system identifies the biz value, checks the database, finds no record for it, and inserts a new one. After that, the collection queue periodically fetches the historical message list of the official account according to its biz.

But at that point we only have the biz of the official account. Two important pieces of information, the account name and the profile picture, are still missing, mainly because they are not available on the historical messages page. We can get them from the article page instead.

At the bottom of the HTML of a WeChat article page there is some JavaScript variable assignment code. With regular expression matching, we can get these two pieces of account information:

With these two regular expression matches we get the profile picture and nickname of the official account; then, using the biz in the article address, we can save them to the corresponding record in the WeChat data table.
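A minimal sketch of the matching step. The JavaScript variable names nickname and round_head_img are assumptions about the page source, and the helper name matchJsStringVar is hypothetical; confirm the actual variable names in the article HTML before relying on them.

```php
<?php
// Return the string assigned to a JavaScript variable in the page source, or null.
function matchJsStringVar(string $pageHtml, string $varName): ?string
{
    // Matches: var NAME = "value";
    $pattern = '/var\s+' . preg_quote($varName, '/') . '\s*=\s*"(.*?)"\s*;/s';

    return preg_match($pattern, $pageHtml, $m) === 1 ? $m[1] : null;
}

// Sample input standing in for the script block at the bottom of an article page.
$page = '<script>var nickname = "Example Account"; var round_head_img = "http://example.com/head.jpg";</script>';

$nickname = matchJsStringVar($page, 'nickname');
$headImg  = matchJsStringVar($page, 'round_head_img');
```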
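The biz value can be read from the article address with PHP's parse_url() and parse_str(). This is a sketch; the function name extractBiz is hypothetical.

```php
<?php
// Return the __biz query parameter of an article address, or null if it has none.
function extractBiz(string $articleUrl): ?string
{
    $query = parse_url($articleUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string at all
    }

    parse_str($query, $params);

    return isset($params['__biz']) && is_string($params['__biz']) ? $params['__biz'] : null;
}
```

Note that parse_str() URL-decodes values, so a "+" in a biz value would come back as a space and would need separate handling.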

3. Saving and processing articles

The previous code has fetched the article content into a variable. How to save it is up to you; everyone may have their own ideas. Here is briefly how I save content:

Save the HTML of the article content to an .html file, using the database id as the file name and the biz field as the directory name.

The above code is the standard PHP way of creating a folder and saving a file; you can arrange the saving method to suit your own situation.
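A minimal sketch of that layout, assuming a base directory of your choosing. The function name saveArticleHtml, the directory permissions, and the replacement of path separators in biz are illustrative choices rather than the original code.

```php
<?php
// Save article HTML as <baseDir>/<biz>/<id>.html and return the path, or null on failure.
function saveArticleHtml(string $baseDir, string $biz, int $id, string $contentHtml): ?string
{
    // Keep path separators out of the directory name (base64-style text may contain "/").
    $dir = rtrim($baseDir, '/') . '/' . str_replace(['/', '\\'], '_', $biz);

    // Create the directory (and any parents) if it does not exist yet.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        return null;
    }

    $path = $dir . '/' . $id . '.html';

    return file_put_contents($path, $contentHtml) === false ? null : $path;
}
```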

After that, we have an HTML file on our server containing the official account article. Open it in a browser and you may find that the pictures are hotlink-protected and do not display properly! This includes the article cover image saved in the database and the profile picture of the official account.

Don't worry, this problem is easy to solve: just save the pictures to your own server, although that will take up your own server space and bandwidth.

The principle of image hotlink protection is this: when a picture is displayed on a web page, the image server checks the domain name of the page that references the image, and if that domain name does not contain qq.com or qpic.cn, it serves a hotlink protection image instead.

However, if no referring domain name is detected, the image is served normally, so we can fetch the binary data of the image with PHP's file_get_contents() function and save it on our server under a file name of our own choosing. There is also another way to save images. I currently use Tencent Cloud's "Wanxiang Youtu" (万象优图) service and save images to cloud storage through its API. The advantage is that, when reading a picture, you can add the desired image size parameter to the image link and get a thumbnail directly, which is much more convenient than using your own server. Aliyun should have a similar product, which seems to be called object storage.

In addition, my purpose in collecting official account content is to build a news app. When the HTML is displayed in an app, there is no referring domain name either, so the hotlink protection server does not treat the picture as hotlinked, and the picture can be displayed directly.
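A minimal sketch of saving an image to your own server. file_get_contents() sends no Referer header unless one is configured, which is what makes the request look like it has no referring page. The function name saveRemoteImage, the md5-based file name, and the default extension are hypothetical choices.

```php
<?php
// Download an image and store it under $destDir; return the saved path, or null on failure.
function saveRemoteImage(string $imageUrl, string $destDir, string $extension = 'jpg'): ?string
{
    // No Referer header is set, so the request carries no referring page.
    $binary = @file_get_contents($imageUrl);
    if ($binary === false || $binary === '') {
        return null;
    }

    if (!is_dir($destDir) && !mkdir($destDir, 0755, true) && !is_dir($destDir)) {
        return null;
    }

    // Hypothetical naming scheme: a hash of the source address keeps names unique and stable.
    $path = rtrim($destDir, '/') . '/' . md5($imageUrl) . '.' . $extension;

    return file_put_contents($path, $binary) === false ? null : $path;
}
```

After saving, the image addresses in the stored article HTML would be rewritten to point at the saved copies.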

This concludes "How to use PHP code to collect WeChat official account articles". Thank you for reading.
