In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-31 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/03 Report--
Overview
Usually, there are not many cases of capturing cascade data, but if it is really needed, it does cause some trouble, such as grabbing commodity classification level information and so on. This content will describe how to collect unlimited cascade data and take GoldData to capture the latest provincial, municipal and county levels in 2019 as an example.
Create a dataset
In dataset management, add the dataset area. As shown in the following figure:
A dataset is equivalent to a table in a database, except that the fields are flexible and can be added and changed as needed.
Create a rule
In the rule management, add the rule arearule and fill in the address http://xzqh.mca.gov.cn/map as the crawl entry address.
Through analysis, we know that we can get provincial data from http://xzqh.mca.gov.cn/map, then request http://xzqh.mca.gov.cn/selectJson with provincial name through level data to obtain city data, and finally request http://xzqh.mca.gov.cn/selectJson with city data to obtain county data through city data.
And it is found that the request http://xzqh.mca.gov.cn/selectJson needs to send a POST request, so we need to prefix url with fake:, and then use the rule to request URL with JavaScript.
The area dataset we wrote here has the following fields:
Name description sn takes the area code as the record unique field name name code takes the region code abbr provincial name abbreviated parent_code takes the parent area code
Therefore, the writing rules are as follows:
(note: for specific rules, please see the end of the article)
Then, when the writing is complete, we can start the crawler for fetching.
View data
Open data Management and select the area dataset to see the following figure:
Export data
Go back to data management, select conditions, select the fields you want to export, and export. For example, there is a lot of data here, and GoldData will be downloaded as a packaged excel file and compressed as a zip file. Extract it locally, and then open excel to see the captured data, as shown in the following figure:
End
Through this section, you can learn how to grab cascading data through GoldData. But the next question is how to import the data into the associated list, and the next article will talk about how to integrate cascading data into the database autoassociated table.
Recordings:
(grab rules)
[{_ _ sample: http://xzqh.mca.gov.cn/map match0: http\:\ /\ / xzqh\ .mca\ .gov\ .cn\ / map fields0: {_ _ model: true _ _ node: js _ _ js:''var exp11=/json\ s =\ s (. +)\ s +\ $\ (doc/ var ret=exp11.exec (html)) Var ss=eval (ret [1]) for (var iTunes 0 I
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 280
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.