How to bypass webdriver detection and simulate a Taobao login with Selenium

2025-01-17 Update From: SLTechnology News&Howtos, Shulou (Shulou.com) 06/03

This article explains how to get past webdriver detection and simulate a Taobao login with Selenium. Many people run into this problem in real projects, so the sections below walk through one way of handling it.

Brief introduction

Simulating a login to Taobao is nothing new. In the past, crawlers sent plain GET/POST requests and used an IP proxy pool to get past the site's checks, but as large websites have upgraded their defenses this strategy has become harder to apply. Crawling with GET/POST now triggers a login prompt, and logging in is a major obstacle because it requires passing a slider CAPTCHA. If you try to use an IP proxy pool to avoid that verification, you find that logging in then requires an SMS verification code. Fully automatic crawling in the old style has therefore become much harder on large websites.

Selenium is an excellent web automation testing tool, so this article uses Selenium for semi-automatic crawling, with support for simulated login to Taobao and automatic handling of the slider CAPTCHA.

Approach

Large websites now check for Selenium; if they detect it, they treat the visitor as a robot and deny access. The first step is therefore to avoid being identified as a robot. When Chrome is driven by Selenium, typing window.navigator.webdriver in the browser console returns true, whereas in a normally used browser the value is false (older Chrome versions report undefined). So we need to hide window.navigator.webdriver.
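The console check described above can also be run from Python. The following is a minimal sketch, assuming a Selenium browser object such as the article's self.browser; the helper name reports_webdriver is hypothetical and not part of the original code.

```python
def reports_webdriver(browser):
    """Return True if the page's JavaScript sees navigator.webdriver as truthy.

    `browser` is assumed to be a Selenium WebDriver (or any object that
    offers an execute_script method with the same behaviour).
    """
    # execute_script runs the snippet in the current page and returns its
    # value; a missing (undefined) property comes back to Python as None.
    return bool(browser.execute_script("return window.navigator.webdriver"))
```

Calling reports_webdriver(self.browser) right after the browser starts gives a quick way to see whether the page can still tell that it is being automated.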

Add the following to the code:

    options = webdriver.ChromeOptions()
    # This step is important: switch to developer mode so that major
    # websites are less likely to recognize that Selenium is in use.
    options.add_experimental_option('excludeSwitches', ['enable-automation'])
    # Note: Selenium 4 replaces executable_path with a Service object.
    self.browser = webdriver.Chrome(executable_path=chromedriver_path, options=options)

To speed up crawling, we also tell the browser not to load images by adding:

    options = webdriver.ChromeOptions()
    # Do not load images, which speeds up page access.
    options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2})
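The two option snippets above can be combined into one browser factory. This is only a sketch under assumptions: the helper names apply_experimental_options and create_browser are hypothetical, and the browser is created in the Selenium 4 style with a Service object instead of the executable_path argument used in the article.

```python
# The two experimental options used in the article, collected in one place.
EXPERIMENTAL_OPTIONS = {
    "excludeSwitches": ["enable-automation"],
    "prefs": {"profile.managed_default_content_settings.images": 2},
}


def apply_experimental_options(options, settings=None):
    """Apply each experimental option to a ChromeOptions-like object."""
    settings = EXPERIMENTAL_OPTIONS if settings is None else settings
    for name, value in settings.items():
        options.add_experimental_option(name, value)
    return options


def create_browser(chromedriver_path):
    """Start Chrome with the options above (assumes Selenium 4 is installed)."""
    # Imported here so the helper above stays usable without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = apply_experimental_options(webdriver.ChromeOptions())
    return webdriver.Chrome(service=Service(chromedriver_path), options=options)
```

Keeping the settings in a dictionary makes it easy to switch a single option off while testing which one a site reacts to.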

With the key steps understood, all that remains is to write the code. The example assumes some knowledge of HTML and CSS.

For example, consider the following code:

    self.browser.find_element_by_xpath('//*[@class="btn_tip"]/a/span').click()
    taobao_name = self.wait.until(EC.presence_of_element_located((
        By.CSS_SELECTOR,
        '.site-nav-bd > ul.site-nav-bd-l > li#J_SiteNavLogin > '
        'div.site-nav-menu-hd > div.site-nav-user > a.site-nav-login-info-nick')))
    print(taobao_name.text)

The first line of the snippet searches the whole document from the root (//) for any element (*) whose class attribute is btn_tip, then selects the span element inside the a tag that is a child of that btn_tip element, and clicks it.
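The XPath pattern from the first line can be tried without a browser. The sketch below uses Python's standard xml.etree.ElementTree, which supports a limited XPath subset; the markup is invented for illustration and is not Taobao's real page.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed markup standing in for the real page.
PAGE = """
<html>
  <body>
    <div class="btn_tip"><a href="#"><span>Log in with Weibo</span></a></div>
    <div class="other"><a href="#"><span>Something else</span></a></div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)
# ElementTree needs a leading "." to search relative to the root element;
# the rest of the expression is the same as in the Selenium call.
matches = root.findall('.//*[@class="btn_tip"]/a/span')
print([span.text for span in matches])
```

Only the span under the btn_tip element is matched; the span under the element with class other is ignored.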

The second line waits for an element matching a CSS selector to appear; until it does, the code stays at this line and keeps checking, and once the wait's timeout expires Selenium raises a TimeoutException. In a CSS selector, a name that begins with . is a class name (class) and a name that begins with # is an ID (id). A > B means the child element B of A. So this line can be read as looking for the child element F of the child element E of the child element D of the child element C of the child element B of A, and it keeps checking until that element is present.
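To make the selector notation concrete, the sketch below splits the selector from the second line into its steps. It is a deliberately simplified illustration that only understands tag names, #id, .class and the > combinator; it is not a real CSS parser, and the function name describe_selector is hypothetical.

```python
import re


def describe_selector(selector):
    """Split a chain of '>' child combinators into tag/id/class steps."""
    steps = []
    for part in selector.split(">"):
        part = part.strip()
        tag = re.match(r"[A-Za-z][\w-]*", part)      # leading tag name, if any
        element_id = re.search(r"#([\w-]+)", part)   # "#name" is an id
        classes = re.findall(r"\.([\w-]+)", part)    # ".name" is a class
        steps.append({
            "tag": tag.group(0) if tag else None,
            "id": element_id.group(1) if element_id else None,
            "classes": classes,
        })
    return steps


SELECTOR = ('.site-nav-bd > ul.site-nav-bd-l > li#J_SiteNavLogin > '
            'div.site-nav-menu-hd > div.site-nav-user > a.site-nav-login-info-nick')

# Each step is a direct child of the previous one, shown by the indentation.
for depth, step in enumerate(describe_selector(SELECTOR)):
    print("  " * depth + str(step))
```

The output lists six nested steps, ending with the a element whose class is site-nav-login-info-nick.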

The third line prints the text content of the element that was found.
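The waiting behaviour of self.wait.until in the second line can be pictured as a polling loop. The following is a simplified stand-in written for illustration, not Selenium's own implementation; wait_until and WaitTimeout are hypothetical names, and the real WebDriverWait additionally ignores certain exceptions while polling.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # the condition's value is handed back to the caller
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage: a fake condition that only succeeds on the third check.
attempts = []


def element_appears_on_third_try():
    attempts.append(1)
    return "nickname" if len(attempts) >= 3 else None


print(wait_until(element_appears_on_third_try, timeout=2.0, poll_interval=0.01))
```

The point of the sketch is that the wait does not block forever: it either returns the element once the condition holds or gives up with an exception when the timeout passes.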

Usage

Download the Chrome browser.

Check the version number of your Chrome browser and download the chromedriver build with the matching version number.

Install the following package with pip:

pip install selenium

Log in to Weibo and link your Taobao account to your Weibo account, so that you can log in to Taobao through Weibo.

Fill in the absolute path of chromedriver in main.

Fill in your Weibo account and password in main.

    # Change to the full path of your chromedriver
    chromedriver_path = "/Users/bird/Desktop/chromedriver.exe"
    # Change to your Weibo account
    weibo_username = "change to your Weibo account"
    # Change to your Weibo password
    weibo_password = "change to your Weibo password"
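Before starting the browser it can help to sanity-check the values filled in above. The sketch below is one possible check, not part of the original script: the function names are hypothetical, and the optional version arguments are assumed to be dotted version strings that you have looked up yourself.

```python
import os


def major_version(version):
    """Return the leading number of a dotted version string."""
    return int(version.strip().split(".")[0])


def check_settings(chromedriver_path, weibo_username, weibo_password,
                   chrome_version=None, chromedriver_version=None):
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    if not os.path.isabs(chromedriver_path):
        problems.append("chromedriver_path must be an absolute path")
    if not weibo_username.strip() or not weibo_password.strip():
        problems.append("Weibo account and password must both be filled in")
    # The version comparison only runs when both versions are supplied.
    if chrome_version and chromedriver_version:
        if major_version(chrome_version) != major_version(chromedriver_version):
            problems.append("Chrome and chromedriver major versions differ")
    return problems
```

Printing the returned list at the top of main makes a missing path or mismatched driver visible before Selenium fails with a less obvious error.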

Demo picture

That concludes this walkthrough of getting past webdriver detection and simulating a Taobao login with Selenium. Thank you for reading.
