2025-04-05 Update · From: SLTechnology News&Howtos > Development
Shulou (Shulou.com) 06/01 Report --
This article introduces how to crawl data with the phpspider crawler framework for PHP. The content is detailed but easy to follow, the steps are simple and quick, and it should be a useful reference. I hope you get something out of it; let's take a look.
My environment is a Pagoda (BT Panel) LNMP stack with PHP 5.4. Do not use this PHP version: it is missing various extension libraries.
Error: a required extension library is missing (enabling it in php.ini is not enough if the library itself is not installed). Running the demo reported this error:

PHP Fatal error: Call to undefined function phpspider\core\mb_detect_encoding() in / on line 474

Solution: install the mbstring extension, e.g. `yum install -y php-mbstring`
1. Run the demo on Linux.

Prerequisite: a working PHP environment on the Linux machine. Upload the code and execute `php -f demo.php`.

To quit, type quit or press Ctrl+C.
You may wonder: the spider is running, but where is the crawled data?
2. You need to add these two options to $configs (see the configs members explained in the documentation):

    // where the log is stored
    'log_file' => '',
    'export' => array(
        'type' => 'csv',
        'file' => '', // the crawled data goes into the data directory; create the directories and files in advance
    ),

The saved CSV needs to be downloaded to your local machine to view, since it opens in Excel, and it is easy to download.
Of course, you can also save it in the database.
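As a sketch, the two export options might look like this in $configs. The file paths and database settings below are placeholder assumptions, not values from the original article, and the database variant assumes your phpspider version supports the 'db' export type:

```php
$configs['log_file'] = 'data/spider.log';   // where run logs are written

// Option 1: export the crawled fields to a CSV file (create data/ first)
$configs['export'] = array(
    'type' => 'csv',
    'file' => 'data/result.csv',
);

// Option 2: export to a MySQL table instead (placeholder credentials)
// $configs['db_config'] = array(
//     'host' => '127.0.0.1', 'port' => 3306,
//     'user' => 'root', 'pass' => '', 'name' => 'demo',
// );
// $configs['export'] = array('type' => 'db', 'table' => 'articles');
```

Either way, the export rule is applied automatically as pages are crawled; you do not write the rows yourself.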
3. The following is a complete example:
3.1. Idea: the specifics are in the code; the idea just makes it easier to understand and remember.
// this page is the home page of the website
// this page is a list page
// this page is the pager below the list page
To be clear, what we are going to crawl this time is the data on the list pages:
3.1.1. First, set the crawl rules.
3.1.2. Instantiate the spider, passing the configuration to the class constructor.
3.1.3. Add the entry URL to the to-crawl queue.
3.1.4. Filter the crawled data (such as title and content) and assemble it.
3.1.5. Perform the storage operation.
3.2. Code
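The code listing itself did not survive in this copy of the article, so what follows is only a minimal sketch of a complete phpspider script along the lines of steps 3.1.1-3.1.5. It assumes the phpspider framework is checked out at the path shown; the domain, URL regexes, XPath selectors, and file paths are placeholders, not the original article's values:

```php
<?php
// assumes the phpspider framework sources sit next to this script
require dirname(__FILE__) . '/phpspider/core/phpspider.php';

use phpspider\core\phpspider;

/* 3.1.1 crawl rules: entry page, list-page pattern, content-page pattern */
$configs = array(
    'name'      => 'demo_spider',
    'domains'   => array('www.example.com'),           // placeholder domain
    'scan_urls' => array('https://www.example.com/'),  // the home page
    'list_url_regexes'    => array("https://www.example.com/list/\d+"),
    'content_url_regexes' => array("https://www.example.com/article/\d+"),
    'log_file'  => 'data/spider.log',
    'export'    => array('type' => 'csv', 'file' => 'data/result.csv'),
    'fields'    => array(
        array('name' => 'title',   'selector' => "//h1",                     'required' => true),
        array('name' => 'content', 'selector' => "//div[@class='content']", 'required' => true),
    ),
);

/* 3.1.2 instantiate, passing the configuration to the constructor */
$spider = new phpspider($configs);

/* 3.1.4 filter and assemble each extracted field via a callback */
$spider->on_extract_field = function ($fieldname, $data, $page) {
    if ($fieldname == 'content') {
        $data = trim(strip_tags($data)); // strip HTML, keep plain text
    }
    return $data;
};

/* 3.1.3 and 3.1.5: start() queues the scan_urls, and the export rule stores the rows */
$spider->start();
```

Run it the same way as the demo (`php -f demo.php`), and the rows accumulate in the CSV file named in the export rule.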