2025-02-24 Update. From: SLTechnology News&Howtos (shulou), Development
Shulou (Shulou.com), 06/01 report
This article shows how to crawl a web page with C#, Selenium, and ChromeDriver. It covers the background, the problem this approach solves, and the key code, and should be useful to anyone facing the same problem.
1. Background
Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as if a real user were operating it. For crawlers, driving a browser with Selenium is a powerful way to collect data from the web. This article introduces the general use of Selenium with Google Chrome.
2. Requirements
In everyday crawler development, a page is sometimes little more than a pile of JavaScript that does a lot of asynchronous work. An ordinary HTTP request (from a console program, for example) then returns source that is mostly JavaScript, and you have to assemble the data yourself, which is laborious. With Selenium + ChromeDriver you get the page as the browser renders it: what you see is what you get.
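To make the difference concrete, here is a minimal sketch that fetches the same page both ways. The URL is a placeholder, and the class name is made up for illustration; it assumes the Selenium.WebDriver NuGet package is installed and a matching chromedriver is available. Note that PageSource reflects the DOM at the moment it is read, so pages that load data late may still need an explicit wait.

```csharp
using System;
using System.Net.Http;
using OpenQA.Selenium.Chrome;

class RawVersusRendered
{
    static void Main()
    {
        // Placeholder URL, not taken from the article.
        const string url = "https://example.com/";

        // Plain HTTP request: returns the HTML exactly as served.
        // Scripts in the page are NOT executed.
        using (var http = new HttpClient())
        {
            string raw = http.GetStringAsync(url).GetAwaiter().GetResult();
            Console.WriteLine("Raw HTML length: " + raw.Length);
        }

        // Selenium: Chrome loads the page and runs its scripts,
        // so PageSource is the DOM as the browser currently sees it.
        using (var driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(url);
            string rendered = driver.PageSource;
            Console.WriteLine("Rendered HTML length: " + rendered.Length);
        }
    }
}
```

On a script-heavy page the second string typically contains the data you can see in the browser, while the first contains mostly script tags.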
3. Implementation
Project structure: a WinForms program, chosen for ease of use, with the required packages installed through NuGet.
The following is the code from form1.cs; only the key methods are shown here. You need to install the latest Chrome browser. The chromedriver used in the code is v2.9.248315.
// Requires: using System; using System.Runtime.InteropServices; using OpenQA.Selenium;

#region Exit chromedriver after an abnormal shutdown

[DllImport("user32.dll", EntryPoint = "FindWindow")]
private extern static IntPtr FindWindow(string lpClassName, string lpWindowName);

[DllImport("user32.dll", EntryPoint = "SendMessage")]
public static extern int SendMessage(IntPtr hWnd, int Msg, int wParam, int lParam);

public const int SW_HIDE = 0;
public const int SW_SHOW = 5;

[DllImport("user32.dll", EntryPoint = "ShowWindow")]
public static extern int ShowWindow(IntPtr hwnd, int nCmdShow);

/// <summary>
/// Get the window handle of the chromedriver console window.
/// </summary>
public IntPtr GetWindowHandle()
{
    // The window is looked up by title, here the full path to chromedriver.exe.
    string name = Environment.CurrentDirectory + "\\chromedriver.exe";
    IntPtr hwd = FindWindow(null, name);
    return hwd;
}

/// <summary>
/// Close the chromedriver window.
/// </summary>
public void CloseWindow()
{
    try
    {
        IntPtr hwd = GetWindowHandle();
        SendMessage(hwd, 0x10, 0, 0); // 0x10 is WM_CLOSE
    }
    catch { } // best effort: ignore failures while cleaning up
}

/// <summary>
/// Exit chromedriver.
/// </summary>
public void CloseChromeDriver(IWebDriver driver)
{
    try
    {
        driver.Quit();
        driver.Dispose();
    }
    catch { } // the driver may already be gone
    CloseWindow();
}

#endregion
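The helper above takes an IWebDriver, but the article does not show how the driver is created. The following is one possible setup, not the author's code: it assumes chromedriver.exe sits in the current directory (the same path the window lookup uses), and the URL and class name are placeholders. Setting HideCommandPromptWindow keeps the chromedriver console from appearing in the first place, which may reduce the need to close it by window handle.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class DriverSetupSketch
{
    static void Main()
    {
        // Assumption: chromedriver.exe is in the current directory.
        ChromeDriverService service =
            ChromeDriverService.CreateDefaultService(Environment.CurrentDirectory);

        // Do not show the chromedriver console window.
        service.HideCommandPromptWindow = true;

        var options = new ChromeOptions();
        options.AddArgument("--start-maximized");

        IWebDriver driver = new ChromeDriver(service, options);
        try
        {
            // Placeholder URL, not taken from the article.
            driver.Navigate().GoToUrl("https://example.com/");
            Console.WriteLine(driver.Title);
        }
        finally
        {
            // Quit closes the browser and stops the chromedriver process.
            driver.Quit();
        }
    }
}
```

In the WinForms program, the finally block is where a call such as CloseChromeDriver(driver) would go instead of a bare Quit.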
The overall approach:
1. Navigate to the target page with driver.Navigate().GoToUrl(...).
2. Identify the data source and read the data from driver.PageSource.
3. Parse the HTML data.
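The three steps above can be sketched as follows. The article does not say which HTML parser it uses, so HtmlAgilityPack (a separate NuGet package) is an assumption here, as are the placeholder URL, the class name, and the ExtractLinks helper, which simply collects link targets as an example of "parsing".

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class CrawlSketch
{
    // Step 3: parse the HTML and collect the href of every link.
    // Hypothetical helper, shown only as an example of parsing.
    static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();
        // SelectNodes returns null when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes == null)
        {
            return links;
        }
        foreach (HtmlNode node in nodes)
        {
            links.Add(node.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    static void Main()
    {
        IWebDriver driver = new ChromeDriver();
        try
        {
            // Step 1: navigate to the target page (placeholder URL).
            driver.Navigate().GoToUrl("https://example.com/");

            // Step 2: read the rendered HTML from the browser.
            string html = driver.PageSource;

            // Step 3: parse it.
            foreach (string link in ExtractLinks(html))
            {
                Console.WriteLine(link);
            }
        }
        finally
        {
            driver.Quit();
        }
    }
}
```

Keeping the parsing in its own method means it can be exercised with a literal HTML string, without starting a browser.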
Thank you for reading. That is the whole of "how to crawl a web page with C#+Selenium+ChromeDriver"; the best way to learn it is to try it yourself.