Source: Shulou (Shulou.com), SLTechnology News&Howtos. Updated 2025-04-14.
This article shows how to simulate a web login and capture screenshots with Python, Selenium, and PhantomJS. It covers installing the tools, a complete example script, cropping a screenshot down to a single element, the basic element location methods, and the three ways of waiting in Selenium.
All steps in this article were carried out on Windows.
Install Python
Python is a cross-platform programming language that runs on Windows, Mac, and various Linux/Unix systems. It is an object-oriented, dynamically typed language that was originally designed for writing automation scripts. As new versions and language features have been added, it has increasingly been used for standalone, large-scale projects.
Download the installer from Python's website, www.python.org, and run it.
During installation, tick the option to install pip (the Python package management tool).
After Python is installed, open the command line tool cmd, type "python -V", and press Enter. If the Python version number is printed, the installation succeeded.
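The same check can also be made from inside Python. This is a minimal sketch that uses only the standard library; the helper name describe_environment is made up for this example.

```python
import importlib.util
import sys


def describe_environment():
    """Return the running Python version and whether pip can be imported."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # find_spec returns None when the module is not installed
    has_pip = importlib.util.find_spec("pip") is not None
    return {"python": version, "pip_available": has_pip}


if __name__ == "__main__":
    print(describe_environment())
```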
Install selenium
Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and others. Selenium is a complete web application testing system that covers test recording (Selenium IDE), test writing and running (Selenium Remote Control), and parallel test execution (Selenium Grid).
Install it with the Python package management tool pip:
pip install selenium
The script in this article relies on webdriver.PhantomJS and the find_element_by_* methods, which exist in Selenium 3.x but were removed in Selenium 4. If pip installs a 4.x release, install a 3.x release instead:
pip install "selenium<4"
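Because webdriver.PhantomJS is no longer shipped in Selenium 4, it can help to confirm which release pip actually installed. The following is a rough sketch; the function names supports_phantomjs and installed_selenium_version are invented for illustration, and the check only looks at the major version number.

```python
def supports_phantomjs(selenium_version):
    """Return True when the version string is older than Selenium 4."""
    major = int(selenium_version.split(".")[0])
    return major < 4


def installed_selenium_version():
    """Return the installed Selenium version string, or None if unavailable."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # importlib.metadata needs Python 3.8+; this sketch gives up on older versions
        return None
    try:
        return version("selenium")
    except PackageNotFoundError:
        return None
```

For example, supports_phantomjs("3.141.0") is True, while supports_phantomjs("4.10.0") is False.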
Install phantomjs
PhantomJS is a headless WebKit browser that is scripted through a JavaScript API. It uses QtWebKit as its browser core and relies on WebKit to parse, interpret, and execute JavaScript code, so anything a WebKit-based browser can do, PhantomJS can do as well. Besides being a browser without a visible window, with support for CSS selectors, web standards, DOM manipulation, JSON, HTML5, and so on, it also provides file-handling operations for reading and writing files on the operating system. PhantomJS is used for a wide range of purposes, such as network monitoring, web page screenshots, web testing without a visible browser, and page access automation.
PhantomJS download link: www.phantomjs.org
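Once PhantomJS is unpacked, the script needs the path to its executable. One possible way to locate it is sketched below, assuming either an explicit path or a phantomjs entry on PATH; find_phantomjs is a hypothetical helper, not part of Selenium or PhantomJS.

```python
import os
import shutil


def find_phantomjs(explicit_path=None):
    """Return a usable PhantomJS path, or None if none is found."""
    # An explicit path wins if it points at an existing file
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # Otherwise search the directories listed in the PATH variable
    return shutil.which("phantomjs")
```

The returned value could then be passed as executable_path when the browser object is created.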
On the desktop, create a demo folder. Inside it, create a demo.py file for the script and an img folder to hold the captured screenshots.
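The folder layout can also be created from Python rather than by hand. This is a small sketch under the assumption that base_dir is whatever directory should contain the demo folder (the desktop in this article); make_demo_layout is an invented name.

```python
import os


def make_demo_layout(base_dir):
    """Create base_dir/demo, base_dir/demo/img and an empty demo.py if missing."""
    demo_dir = os.path.join(base_dir, "demo")
    img_dir = os.path.join(demo_dir, "img")
    # exist_ok avoids an error when the folders are already there
    os.makedirs(img_dir, exist_ok=True)
    script_path = os.path.join(demo_dir, "demo.py")
    if not os.path.exists(script_path):
        with open(script_path, "w", encoding="utf-8") as handle:
            handle.write("# coding=utf-8\n")
    return demo_dir, img_dir, script_path
```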
demo.py:
# coding=utf-8
# Import the WebDriver module
from selenium import webdriver
# Import the WebDriverWait explicit-wait helper
from selenium.webdriver.support.wait import WebDriverWait
import time
# Create a browser object backed by PhantomJS
# executable_path is the location where phantomjs is installed
driver = webdriver.PhantomJS(executable_path="D:\\Python27\\Scripts\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe")
# Website to visit (the CCTV site is used as an example)
driver.get("http://www.cctv.com/")
# Maximize the browser window
driver.maximize_window()
# Click the login button to open the pop-up login box (element location methods are described later)
driver.find_elements_by_xpath('//span[@class="btn_icon"]')[1].click()
# Wait for the login box to load with WebDriverWait (waiting methods are described later)
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_element_by_xpath('//a[@class="dl"]'), message="")
time.sleep(2)
# Capture the page with the login box and save it
driver.save_screenshot('demo\\img\\login1.png')
# Locate the username and password fields and fill them in
driver.find_element_by_name("username").send_keys('xxxxxxxxxxx')
driver.find_element_by_name("passwd_view").send_keys('xxxxxxxxxxx')
# Click the login button to log in (the link text must match the text shown on the page)
driver.find_element_by_link_text('login').click()
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_elements_by_xpath('//span[@class="btn_icon"]'), message="")
time.sleep(2)
# Capture the page after login and save it
driver.save_screenshot('demo\\img\\login2.png')
# Click the link that jumps to the sports page
driver.find_element_by_link_text('sports').click()
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_element_by_link_text('CBA'), message="")
time.sleep(2)
# Capture the sports page and save it
driver.save_screenshot('demo\\img\\sport.png')
# Quit the driver and close all windows
driver.quit()
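If any step in the script raises an exception (for instance, an element is not found), the final driver.quit() is never reached and the PhantomJS process may be left running. One way to guard against that is a try/finally wrapper. This is only a sketch: run_session and the steps callable are hypothetical names, and driver stands for any object with a quit() method, such as the browser object created in the script.

```python
def run_session(driver, steps):
    """Run steps(driver) and always quit the driver afterwards."""
    try:
        return steps(driver)
    finally:
        # Runs on success and on error, so the browser process is not leaked
        driver.quit()
```

The login and screenshot steps would then go into a function that takes the driver as its only argument and is passed as steps.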
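Run the python script
Open the command line window cmd, change to the directory that contains demo.py, and type:
python demo.py
The screenshot paths in the script (demo\img\...) are relative to the current working directory, so adjust them if they do not resolve to the img folder from where you run the script.
When the script runs, it fills in the user name and password that were set, logs in, captures the specified pages, and saves the screenshots to the img folder.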
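A quick way to confirm that the run produced its screenshots is to look for the three files the script saves. The sketch below assumes the file names used in demo.py; missing_screenshots is an invented helper.

```python
import os


def missing_screenshots(img_dir, names=("login1.png", "login2.png", "sport.png")):
    """Return the expected screenshot files that are absent or empty."""
    missing = []
    for name in names:
        path = os.path.join(img_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

An empty list means every expected file exists and has content.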
Screenshots produced by the run (images not included here): the login box, the page after login, and the sports page.
Some of the methods used above are introduced below.
Block location screenshot (secondary screenshot) method:
PIL (Python Imaging Library) is a third-party image processing library for Python. It is powerful, its API is easy to use, and it has become the de facto standard image processing library on the Python platform. PIL itself only supports Python 2.x; on Python 3.x you need to install Pillow, a compatible fork of PIL that supports Python 3.x.
On Python 2.x, install PIL for the secondary screenshot (if pip cannot find a PIL package, older Pillow releases also run on Python 2.7 and provide the same PIL import):
pip install PIL
On Python 3.x, install Pillow for block location screenshots:
pip install pillow
demo.py:
# Import the Image class
from PIL import Image
# Locate the block element that needs the secondary screenshot
img = driver.find_element_by_xpath('//*[@class="weui-img"]')
# x coordinate of the upper left corner of the block element in the page
left = img.location['x']
# y coordinate of the upper left corner of the block element in the page
top = img.location['y']
# x coordinate of the lower right corner of the block element in the page
right = img.location['x'] + img.size['width']
# y coordinate of the lower right corner of the block element in the page
bottom = img.location['y'] + img.size['height']
# Open the full-page screenshot saved earlier with driver.save_screenshot()
photo = Image.open('demo\\img\\img_page.png')
# Crop the screenshot to the coordinates of the block element
photo = photo.crop((left, top, right, bottom))
# Save the secondary screenshot
photo.save('demo\\img\\img.png')
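The four coordinates above can be computed in one place. The sketch below takes the dictionaries returned by an element's location and size attributes and returns the tuple expected by crop(). The scale parameter is an assumption added for illustration: it covers the case where screenshot pixels do not match page coordinates one to one, and it is not something the steps above use.

```python
def element_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) suitable for Image.crop().

    location and size are dicts shaped like a WebElement's .location
    and .size values. scale is a hypothetical factor for screenshots
    whose pixels differ from page coordinates.
    """
    left = location["x"] * scale
    top = location["y"] * scale
    right = (location["x"] + size["width"]) * scale
    bottom = (location["y"] + size["height"]) * scale
    # crop() works with integer pixel coordinates
    return tuple(int(round(value)) for value in (left, top, right, bottom))


print(element_box({"x": 10, "y": 20}, {"width": 100, "height": 50}))
# prints (10, 20, 110, 70)
```

With this helper, the crop step becomes photo.crop(element_box(img.location, img.size)).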
WebDriver's 8 basic element location methods:
1. find_element_by_id() locates by the id attribute
For example: find_element_by_id("one") locates the element whose id is one
2. find_element_by_name() locates by the name attribute
For example: find_element_by_name("one") locates the element whose name attribute is one
3. find_element_by_class_name() locates by the class name
For example: find_element_by_class_name("one") locates the element whose class is one
4. find_element_by_xpath() locates by XPath, an XML path language that identifies elements by their position in the document tree
For example: find_element_by_xpath("//div[@id='one']") locates the div element whose id is one
find_element_by_xpath("//*[@class='two']") locates the element whose class is two
5. find_element_by_css_selector() locates by CSS selector
For example: find_element_by_css_selector("#one") locates the element whose id is one
find_element_by_css_selector(".two") locates the element whose class is two
6. find_element_by_tag_name() locates by tag name
For example: find_element_by_tag_name("input") locates the input element
7. find_element_by_link_text() locates by the complete text of an <a> link
find_element_by_partial_link_text() locates by part of the text of an <a> link
For example: find_element_by_link_text("news") locates the <a> element whose text is "news"
find_element_by_partial_link_text("new") locates an <a> element whose text contains "new"
8. Locating with By
(requires importing the By class: from selenium.webdriver.common.by import By)
For example: find_element(By.ID, "one") locates the element whose id is one
find_element(By.NAME, "one") locates the element whose name attribute is one
find_element(By.CLASS_NAME, "one") locates the element whose class is one
find_element(By.TAG_NAME, "div") locates the div element
Each of these methods returns the first matching element. When several elements match, use the plural form: replace element in the method name with elements to get a list of all matching elements, then pick a single element by index.
For example: find_elements_by_class_name("one")[1] locates the second of all elements whose class is one
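The first seven methods and the By form in item 8 describe the same lookups. The sketch below maps a find_element_by_* name to the strategy string that the corresponding By constant holds (By.ID is the string "id", By.CLASS_NAME is "class name", and so on). LEGACY_TO_BY and to_by_args are names made up for this example; the table is written out by hand so the snippet runs without Selenium installed.

```python
# String values of the constants in selenium.webdriver.common.by.By
LEGACY_TO_BY = {
    "id": "id",
    "name": "name",
    "class_name": "class name",
    "xpath": "xpath",
    "css_selector": "css selector",
    "tag_name": "tag name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
}


def to_by_args(legacy_method, value):
    """Translate e.g. ("find_element_by_id", "one") into ("id", "one")."""
    for prefix in ("find_elements_by_", "find_element_by_"):
        if legacy_method.startswith(prefix):
            suffix = legacy_method[len(prefix):]
            return LEGACY_TO_BY[suffix], value
    raise ValueError("not a find_element(s)_by_* name: " + legacy_method)
```

For instance, driver.find_element(*to_by_args("find_element_by_id", "one")) performs the same lookup as driver.find_element(By.ID, "one").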
Three waiting methods in selenium:
During automated testing, a step often depends on the result of the previous one and cannot run until that step has finished. Waiting is how the script determines that the previous step is complete before it continues. For example, when logging in, the script must wait for the login page to finish loading before it can locate the user name and password fields, fill them in, and log in.
1. Forced wait
time.sleep(s) pauses for s seconds before continuing
Disadvantages: the wait time is fixed and hard to choose. If the page is ready before the time is up, the script still waits out the remainder. If the time is up and the previous step has not finished, the next step fails and an error is raised.
2. Implicit wait
driver.implicitly_wait(s) makes each element lookup keep retrying for up to s seconds: if the element appears within that time the script proceeds at once, otherwise the lookup fails after s seconds
Disadvantages: it applies only to finding elements, and if the element has still not appeared when the time is up, an error (NoSuchElementException) is raised.
3. Explicit wait (recommended)
WebDriverWait(driver, timeout, poll_frequency=0.5, ignored_exceptions=None)
Waits until a given condition is met and then continues with the following code. If the timeout is exceeded, an exception is thrown.
driver: the browser driver
timeout: the maximum time to wait, in seconds
poll_frequency: the interval between checks. Default is 0.5 s.
ignored_exceptions: exception classes to ignore while polling. NoSuchElementException is ignored by default.
Used in conjunction with until():
WebDriverWait(driver, s).until(method, message="")
The method is called every 0.5 seconds for up to s seconds. As soon as it returns a truthy value, the script proceeds to the next step. If it has not done so when the time is up, a TimeoutException is raised.
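The polling behaviour behind an explicit wait with until() can be illustrated without a browser. The following is a simplified re-implementation for illustration only, not Selenium's own code: wait_until and WaitTimeout are invented names, the condition takes no arguments (Selenium passes the driver to it), and nothing is ignored by default.

```python
import time


class WaitTimeout(Exception):
    """Stand-in for Selenium's TimeoutException in this sketch."""


def wait_until(condition, timeout, poll_frequency=0.5, ignored_exceptions=()):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once timeout seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored_exceptions:
            # Treated the same as "not ready yet"
            pass
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within {} s".format(timeout))
        time.sleep(poll_frequency)
```

Unlike time.sleep(s), the loop returns as soon as the condition holds, so the full timeout is only spent when the condition never becomes true.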
That concludes this walkthrough of simulating a web login and taking screenshots with Python, Selenium, and PhantomJS. Theory is best learned together with practice, so go and try it!