2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
How do you crawl the secondary pages of Zhihu live in R? This article walks through the analysis and a solution in detail, in the hope of giving readers who face this problem a simpler, workable method.
I previously wrote a short article on crawling Zhihu live course information. It traversed only the courses shown directly on the Zhihu live home page, which are a small fraction of the total.
Today's article upgrades that small project: it traverses the secondary pages of the courses on the live home page, module by module, to collect much richer course information. The total this time is around 800 courses.
I will not repeat the details of the packet-capture analysis of the course pages here; see the earlier article below if you want them. This article only covers the approach to traversing the secondary pages.
Hands-on: crawling Zhihu live course data
Because the number of courses is relatively large, the requests are sent with a cookie directly, so you need to log in first to obtain the cookie value.
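As a rough illustration of sending requests with a cookie, the sketch below attaches a cookie string to a request through httr::add_headers and parses the JSON body with jsonlite::fromJSON. The helper names build_headers and get_json, the User-Agent value, and the placeholder cookie are assumptions made for this example, not taken from the original code.

```r
# Hypothetical helper: builds the header vector sent with every request.
# The cookie is a placeholder; replace it with the value copied from a
# logged-in browser session.
build_headers <- function(cookie, user_agent = "Mozilla/5.0") {
  stopifnot(is.character(cookie), length(cookie) == 1, nzchar(cookie))
  c("Cookie" = cookie, "User-Agent" = user_agent)
}

# Hypothetical wrapper: sends a GET request with those headers and returns
# the parsed JSON body. Requires the httr and jsonlite packages when called.
get_json <- function(url, headers) {
  resp <- httr::GET(url, httr::add_headers(.headers = headers))
  httr::stop_for_status(resp)  # fail loudly on HTTP errors
  jsonlite::fromJSON(
    httr::content(resp, as = "text", encoding = "UTF-8"),
    flatten = TRUE
  )
}

headers <- build_headers("PASTE_YOUR_COOKIE_HERE")
```

Keeping the headers in one named vector means the same cookie can be reused for the first-level and the secondary-page requests.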
library("httr")
library("jsonlite")
library("magrittr")
library("plyr")
library("rlist")
The first-level pages are traversed to get the topic information of each course module and the course id values it contains.
Based on the packet-capture analysis described earlier, the crawling function for the first-level course modules is as follows:
Mylive <- function(...) {
  ...
    ... %>% fromJSON(flatten = TRUE)
    ...
    if (... %>% `[[`(2) %>% `[`(1) == TRUE) break
    Sys.sleep(runif(...))
    i = i + 1
  }
  cat("all page is OK!", sep = "\n")
  return(myresult)
}
Execute the code:
system.time(myresult ...
  ...
    if (... %>% `[`(2) %>% `[`(1) == TRUE) break
    Sys.sleep(runif(1, 0.5, ...))
    i = i + 1
  }
  return(myresult)
}
Use a loop to execute the subpage traversal function above.
Fulloutdata ...
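The paging pattern visible in the fragments above (fetch a page, stop when the end flag is TRUE, otherwise sleep for a random interval and increment the counter) can be sketched roughly as follows. This is a minimal sketch and not the author's function: the names crawl_pages, crawl_modules and fake_fetcher, the is_end/data page layout, and the offset/limit parameters are all assumptions, and an offline stand-in replaces the real API so the loop can be run without a network connection.

```r
# Generic paging loop. `fetch_page(offset, limit)` is a hypothetical function
# returning one parsed page as list(is_end = <logical>, data = <data.frame>).
crawl_pages <- function(fetch_page, limit = 10, pause = c(0.5, 1.5),
                        max_pages = 1000) {
  pages <- list()
  i <- 0
  while (i < max_pages) {
    page <- fetch_page(offset = i * limit, limit = limit)
    pages[[length(pages) + 1]] <- page$data
    if (isTRUE(page$is_end)) break            # last page reached
    Sys.sleep(runif(1, pause[1], pause[2]))   # random pause between requests
    i <- i + 1
  }
  do.call(rbind, pages)
}

# Run the paging loop once per module id and stack the results.
# `make_fetcher(id)` is assumed to return a fetch_page function for that module.
crawl_modules <- function(module_ids, make_fetcher, ...) {
  parts <- lapply(module_ids, function(id) {
    out <- crawl_pages(make_fetcher(id), ...)
    out$module_id <- id
    out
  })
  do.call(rbind, parts)
}

# Offline stand-in for the real API: 25 fake courses per module.
fake_fetcher <- function(id) {
  function(offset, limit) {
    idx <- seq_len(25)
    keep <- idx[idx > offset & idx <= offset + limit]
    list(
      is_end = offset + limit >= 25,
      data = data.frame(course_id = paste0(id, "-", keep),
                        stringsAsFactors = FALSE)
    )
  }
}

demo_data <- crawl_modules(c("a", "b"), fake_fetcher, pause = c(0, 0))
```

With real responses, pages may not share the same columns, in which case base rbind fails; plyr::rbind.fill is one way to stack such data frames. The max_pages cap is a safety limit added for this sketch, so a missing end flag cannot cause an endless loop.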