2025-02-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces how to use the Python library selenium to collect Douyin data. Many people have questions about this in day-to-day work, so the editor consulted various materials and put together a set of simple, easy-to-use operations. Let's go through them step by step.
1. Install selenium

    pip install selenium

2. Initialize the browser
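Chrome initializes the Google Chrome browser
Firefox initializes the Firefox browser
Edge initializes the Microsoft Edge browser
PhantomJS is a headless browser (no user interface); Selenium 4 no longer supports it, so use headless Chrome or Firefox there instead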
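As a minimal sketch of how the choice of browser could be made configurable, the helper below maps a name to the matching `webdriver` class. `make_driver` and `SUPPORTED_BROWSERS` are hypothetical names introduced for illustration, not part of selenium, and headless Chrome is assumed here as the replacement for PhantomJS.

```python
# Hypothetical helper: map a browser name to a webdriver class name.
SUPPORTED_BROWSERS = {"chrome": "Chrome", "firefox": "Firefox", "edge": "Edge"}


def make_driver(name="chrome", headless=False):
    """Create a driver for the named browser (sketch, not part of selenium).

    The headless flag is only honoured for Chrome in this sketch.
    """
    key = name.lower()
    if key not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {name!r}")

    # Imported lazily so the name check above works without selenium installed.
    from selenium import webdriver

    if key == "chrome" and headless:
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")  # run Chrome without a visible window
        return webdriver.Chrome(options=options)
    return getattr(webdriver, SUPPORTED_BROWSERS[key])()
```

Calling `make_driver("firefox")` would then start Firefox, while an unknown name fails early with a `ValueError` instead of an obscure attribute error.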
    from selenium import webdriver

    driver = webdriver.Chrome()

3. Set the browser window size
maximize_window maximizes the window
set_window_size sets a custom window size
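The two calls can be combined in one small helper. This is only a sketch: `apply_window_size` is a hypothetical name, and the 1280x800 size in the usage comment is an arbitrary example value.

```python
def apply_window_size(driver, size=None):
    """Maximize the window, or resize it when size is a (width, height) pair."""
    if size is None:
        driver.maximize_window()
    else:
        width, height = size
        driver.set_window_size(width, height)


# Usage with a real browser (requires selenium and a running driver):
#   apply_window_size(driver)               # maximize
#   apply_window_size(driver, (1280, 800))  # custom size in pixels
```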
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.maximize_window()

4. Visit a page

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')

5. Locate elements
The basic methods for locating elements are as follows:

| Locate one element | Locate multiple elements | Description |
|:--|:--|:--|
| find_element_by_id | find_elements_by_id | Locate by element id |
| find_element_by_name | find_elements_by_name | Locate by element name |
| find_element_by_xpath | find_elements_by_xpath | Locate by XPath expression |
| find_element_by_link_text | find_elements_by_link_text | Locate by full hyperlink text |
| find_element_by_partial_link_text | find_elements_by_partial_link_text | Locate by partial hyperlink text |
| find_element_by_tag_name | find_elements_by_tag_name | Locate by tag name |
| find_element_by_class_name | find_elements_by_class_name | Locate by class name |
| find_element_by_css_selector | find_elements_by_css_selector | Locate by CSS selector |

These helpers were deprecated in Selenium 4 and removed in Selenium 4.3; on newer versions use find_element and find_elements with the By module instead.
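Each helper listed above corresponds to a locator strategy that can be passed to the generic `find_element` or `find_elements` call. A minimal sketch of that correspondence follows; `locate` and `LEGACY_SUFFIX_TO_STRATEGY` are hypothetical names for illustration, and the strategy strings are the values of the constants on selenium's `By` class.

```python
# Locator strategy strings, identical to the constants on selenium's By class
# (By.ID == "id", By.CSS_SELECTOR == "css selector", and so on).
LEGACY_SUFFIX_TO_STRATEGY = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def locate(driver, legacy_name, value):
    """Emulate find_element_by_* / find_elements_by_* via find_element(s)."""
    finders = (
        ("find_elements_by_", driver.find_elements),
        ("find_element_by_", driver.find_element),
    )
    for prefix, finder in finders:
        if legacy_name.startswith(prefix):
            strategy = LEGACY_SUFFIX_TO_STRATEGY[legacy_name[len(prefix):]]
            return finder(strategy, value)
    raise ValueError(f"not a legacy locator name: {legacy_name!r}")
```

With a real driver, `locate(driver, "find_element_by_id", "kw")` would behave like `driver.find_element("id", "kw")`.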
**Example:** find the input box on the Baidu home page

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')
    driver.find_element_by_id('kw')

6. Another way to locate elements
The By module needs to be imported.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')
    driver.find_element(By.ID, 'kw')  # 'kw' is the input box id used in the previous example

7. Element interaction

| Method | Description |
|:--|:--|
| click() | Click the element |
| send_keys(value) | Simulate typing the given value |
| clear() | Clear the element's content |
| submit() | Submit the form |
| get_attribute(name) | Get the value of the named attribute |
| location | Get the position of the element |
| text | Get the text of the element |
| size | Get the size of the element |
| id | Get the element's internal selenium id (not the HTML id attribute) |
| tag_name | Get the tag name of the element |
**Example:** type "I am autofelix" into the Baidu input box and click the search button.

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')
    driver.find_element_by_id('kw').send_keys('I am autofelix')
    driver.find_element_by_id('su').click()

8. Execute js

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://www.baidu.com')
    js_sql = "document.getElementById('kw').value = 'I am autofelix'"
    driver.execute_script(js_sql)

9. Frame operations
If the page contains frames, you need to switch into a frame before working with its content and switch back out afterwards.
switch_to.frame(id or name of the child iframe) switches into the frame
switch_to.parent_frame() switches back out to the parent frame; it takes no arguments
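Because every switch into a frame needs a matching switch back out, the pair can be wrapped in a context manager. This is a sketch under that assumption; `inside_frame` is a hypothetical helper name, and `'login-frame'` in the usage comment is a made-up frame id.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into a frame for the duration of a with-block, then back out."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Always return to the parent frame, even if the block raises.
        driver.switch_to.parent_frame()


# Usage with a real browser (requires selenium and a running driver):
#   with inside_frame(driver, 'login-frame'):
#       driver.find_element('id', 'username').send_keys('autofelix')
```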
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://www.baidu.com')
    # This URL has no iframe; the frame name below is a guess, just to show the call
    driver.switch_to.frame('my guessed iframe')

10. Cookie operations

| Method | Description |
|:--|:--|
| delete_all_cookies() | Delete all cookies of the current page |
| get_cookie(name) | Get the cookie with the given name |
| get_cookies() | Get all cookies of the current page |
| add_cookie() | Set a cookie |

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://www.baidu.com')
    driver.delete_all_cookies()
    driver.add_cookie({'name': 'name', 'domain': '.baidu.com', 'value': 'autofelix'})

11. Tab management

| Method | Description |
|:--|:--|
| window_handles | List of the handles of all open tabs |
| switch_to.window() | Switch to the tab with the given handle |

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get('https://www.baidu.com')
    # Open the second site in a new tab; a second driver.get() would reuse the first tab
    driver.execute_script("window.open('https://www.taobao.com')")
    driver.switch_to.window(driver.window_handles[0])
    driver.switch_to.window(driver.window_handles[1])

12. Mouse events
The ActionChains module needs to be imported for mouse events.

| Method | Description |
|:--|:--|
| move_to_element() | Move the mouse over an element (hover) |
| context_click() | Right-click |
| double_click() | Double-click |
| drag_and_drop() | Hold the left button on one element, drag it onto another and release |
| perform() | Execute all queued actions |
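The methods above only queue actions; nothing reaches the browser until `perform()` is called. A minimal sketch of that queue-then-perform pattern follows, where `hover_and_double_click` is a hypothetical helper name and `chain` is expected to be an `ActionChains` instance.

```python
def hover_and_double_click(chain, target):
    """Queue a hover and a double-click on target, then run both."""
    chain.move_to_element(target)  # queued, nothing happens yet
    chain.double_click(target)     # queued as well
    chain.perform()                # executes the queued actions in order


# Usage with a real browser (requires selenium and a running driver):
#   from selenium.webdriver import ActionChains
#   hover_and_double_click(ActionChains(driver), some_element)
```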
**Example:** slider CAPTCHA

    from selenium import webdriver
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Initialize the Chrome browser
    driver = webdriver.Chrome()
    # Maximize the window
    driver.maximize_window()
    # Open the Toutiao login URL
    driver.get('https://sso.toutiao.com')
    # Wait until the element shows the expected text
    WebDriverWait(driver, 10).until(
        EC.text_to_be_present_in_element(
            (By.XPATH, '//*[@id="mobile-code-get"]/span'), u'send'))
    # Instantiate the mouse action chain
    action = ActionChains(driver)
    # Press and hold the slider
    action.click_and_hold(driver.find_element_by_xpath('//*[@id="captcha_container"]')).perform()
    # Distance to drag in pixels; the original leaves x undefined, so this value is a placeholder
    x = 100
    # Move the slider x pixels to the right
    action.move_by_offset(xoffset=x, yoffset=0).perform()
    # Release the slider
    action.release().perform()

13. Waits
Implicit wait
Every element lookup keeps polling for up to the given number of seconds: it returns as soon as the element is found, and throws an exception if the element is still missing when the time runs out

    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.implicitly_wait(10)
    driver.get('https://www.baidu.com')
Explicit wait
The process blocks at the wait until the given condition is met; if the condition is still not met when the timeout expires, an exception is thrown
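To make the blocking behaviour concrete, here is a simplified, selenium-free illustration of the polling idea behind an explicit wait. It is not selenium's implementation: `wait_until` and its parameters are hypothetical names, and the built-in `TimeoutError` stands in for selenium's `TimeoutException`.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if no truthy value is seen within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)  # block here before trying again
```

The caller stays blocked inside `wait_until` between attempts, which is the same shape as `WebDriverWait(driver, 10).until(...)`: the condition is retried until it succeeds or the timeout is reached.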
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    driver.implicitly_wait(10)
    driver.get('https://www.baidu.com')
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'kw')))

14. Forward, back and refresh
back goes back one page
forward goes forward one page
refresh reloads the current page
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')
    driver.get('https://www.taobao.com')
    driver.get('https://www.jd.com')
    driver.back()
    driver.forward()
    driver.refresh()

15. Close the browser
close closes the current tab
quit closes the entire browser
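If a script fails halfway, the browser stays open unless `quit` is still called. One way to guarantee that, sketched here with a hypothetical `managed_driver` helper, is a context manager that quits the driver when the block ends.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # closes every tab and ends the browser session


# Usage with a real browser (requires selenium):
#   from selenium import webdriver
#   with managed_driver(webdriver.Chrome) as driver:
#       driver.get('https://www.baidu.com')
```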
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get('https://www.baidu.com')
    # Open the Baidu page, then close the entire browser
    driver.quit()

This concludes the walkthrough of "how to use the Python library selenium to collect Douyin data". Combining theory with practice is the best way to learn, so try the examples yourself.