How to implement simulated login and web page screenshots with Python + Selenium + PhantomJS

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to implement simulated web page login and screenshots with Python + Selenium + PhantomJS. Many people are unsure how to do this in day-to-day work, so the editor has consulted various materials and put together a simple, easy-to-use method. Hopefully it answers that question. Let's get started!

All steps in this article were carried out on Windows.

Install Python

Python is a cross-platform programming language that runs on Windows, Mac, and various Linux/Unix systems. It is an object-oriented, dynamically typed language originally designed for writing automation scripts (shell). As new versions and language features have been added, it is increasingly used for independent, large-scale project development.

Download the installer from the Python website, www.python.org, and run it.

During installation, make sure the pip option (Python's package management tool) is checked so that pip is installed as well.

After Python is installed, open the command line tool cmd, type "python -V", and press Enter. If the Python version number is printed, the installation succeeded.
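The same check can be made from inside Python, which is handy when several interpreters are installed and it is unclear which one a script will run under. This is a minimal sketch; the helper name describe_interpreter is made up for this illustration.

```python
import sys


def describe_interpreter():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return "{}.{}.{}".format(info.major, info.minor, info.micro)


if __name__ == "__main__":
    # Prints the version of the interpreter that is executing this file
    print("Python", describe_interpreter())
```

Save it as a .py file and run it with the same python command you plan to use for demo.py.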

Install selenium

Selenium is a tool for testing web applications. Selenium tests run directly in the browser, just as a real user would operate it. Supported browsers include IE (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google Chrome, Opera, and others. Selenium is a complete web application testing suite, covering test recording (Selenium IDE), writing and running tests (Selenium Remote Control), and parallel test execution (Selenium Grid).

Install it with pip, Python's package management tool:

pip install selenium
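One thing worth checking after installation: the webdriver.PhantomJS class used in this article was removed from the Python bindings in Selenium 4, so the script needs a Selenium 3.x release. The sketch below reads the installed version with the standard library (Python 3.8 or later); the two helper names are made up for this illustration.

```python
from importlib import metadata


def phantomjs_driver_available(selenium_version):
    """Return True when the given Selenium version string is older than 4.0."""
    major = int(selenium_version.split(".")[0])
    return major < 4


def installed_selenium_version():
    """Return the installed Selenium version string, or None if it is missing."""
    try:
        return metadata.version("selenium")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_selenium_version()
    if version is None:
        print("selenium is not installed; run: pip install selenium")
    elif phantomjs_driver_available(version):
        print("selenium", version, "still provides webdriver.PhantomJS")
    else:
        print("selenium", version, "no longer provides webdriver.PhantomJS")
```

If the last message appears, installing an older release with pip install "selenium<4" is one way to keep following the article as written.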

Install phantomjs

PhantomJS is a headless, WebKit-based browser that is scripted through a JavaScript API. It uses QtWebKit as its browser core and uses WebKit to parse, interpret, and execute JavaScript code. Anything a WebKit-based browser can do, PhantomJS can do too. It is not only an invisible browser with support for CSS selectors, web standards, DOM manipulation, JSON, HTML5, and so on; it also provides file operations, so scripts can read and write files on the operating system. PhantomJS serves a wide range of purposes, such as network monitoring, web page screenshots, web testing without a visible browser, and page access automation.

Download PhantomJS from www.phantomjs.org

Create a demo folder on the desktop. Inside it, create a demo.py file as the script file and an img folder to hold the captured images.
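The folder layout can also be created from Python instead of by hand. This is a small sketch using only the standard library; the function name create_demo_layout and the choice of base directory are assumptions for this illustration.

```python
from pathlib import Path


def create_demo_layout(base_dir):
    """Create <base_dir>/demo/demo.py and <base_dir>/demo/img; return the demo folder."""
    demo_dir = Path(base_dir) / "demo"
    img_dir = demo_dir / "img"
    img_dir.mkdir(parents=True, exist_ok=True)  # also creates demo_dir if missing
    script = demo_dir / "demo.py"
    script.touch(exist_ok=True)                 # leaves an existing script untouched
    return demo_dir
```

For example, create_demo_layout(Path.home() / "Desktop") would place the folder on the desktop, assuming the desktop lives at that path on your machine.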

demo.py:

# coding=utf-8

# Import the browser driver module
from selenium import webdriver
# Import the WebDriverWait waiting module
from selenium.webdriver.support.wait import WebDriverWait
import time

# Create the browser object from the PhantomJS executable;
# executable_path is the location where PhantomJS is installed
driver = webdriver.PhantomJS(executable_path="D:\\Python27\\Scripts\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe")

# Open the website (CCTV is used as the example)
driver.get("http://www.cctv.com/")

# Maximize the browser window
driver.maximize_window()

# Click the login button so that the login box pops up (element location methods are described later)
driver.find_elements_by_xpath('//span[@class="btn_icon"]')[1].click()

# Wait for the login box to load with WebDriverWait (waiting methods are described later)
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_element_by_xpath('//a[@class="dl"]'), message="")
time.sleep(2)

# Capture the page showing the login box and save it
driver.save_screenshot('demo\\img\\login1.png')

# Locate the user name and password fields and fill them in
driver.find_element_by_name("username").send_keys('xxxxxxxxxxx')
driver.find_element_by_name("passwd_view").send_keys('xxxxxxxxxxx')

# Click the login button to log in
driver.find_element_by_link_text('login').click()
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_elements_by_xpath('//span[@class="btn_icon"]'), message="")
time.sleep(2)

# Capture the page after login and save it
driver.save_screenshot('demo\\img\\login2.png')

# Click the link that jumps to the sports page
driver.find_element_by_link_text('sports').click()
WebDriverWait(driver, 10, 0.5).until(lambda driver: driver.find_element_by_link_text('CBA'), message="")
time.sleep(2)

# Capture the sports page and save it
driver.save_screenshot('demo\\img\\sport.png')

# Quit the driver and close all windows
driver.quit()
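For readers on Selenium 4, where webdriver.PhantomJS and the find_element_by_* helpers are no longer available, the login part of the script might be adapted roughly as follows. This is a sketch under stated assumptions, not the article's method: headless Chrome is swapped in for PhantomJS, Selenium 4 and Chrome are assumed to be installed, the function name login_and_capture is made up, and the locators are simply the ones from demo.py, so they only match that page.

```python
import os


def login_and_capture(url, username, password, out_dir):
    """Log in on the page at url and save two screenshots into out_dir.

    Assumptions: Selenium 4 and Chrome are installed; headless Chrome
    stands in for PhantomJS; the locators are taken from demo.py.
    """
    # Imported here so that defining the function does not require Selenium
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")           # run Chrome without a window
    options.add_argument("--window-size=1920,1080")  # fixed size instead of maximize_window()
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        wait = WebDriverWait(driver, 10, 0.5)

        # Open the login box and wait until its link is present
        driver.find_elements(By.XPATH, '//span[@class="btn_icon"]')[1].click()
        wait.until(EC.presence_of_element_located((By.XPATH, '//a[@class="dl"]')))
        driver.save_screenshot(os.path.join(out_dir, "login1.png"))

        # Fill in the credentials and submit
        driver.find_element(By.NAME, "username").send_keys(username)
        driver.find_element(By.NAME, "passwd_view").send_keys(password)
        driver.find_element(By.LINK_TEXT, "login").click()
        wait.until(EC.presence_of_all_elements_located((By.XPATH, '//span[@class="btn_icon"]')))
        driver.save_screenshot(os.path.join(out_dir, "login2.png"))
    finally:
        driver.quit()  # always release the browser, even when a step fails
```

Wrapping the steps in try/finally means the browser process is closed even if a locator fails, which the straight-line script does not guarantee.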

Run the Python script

Open the command line window cmd, change to the directory that contains demo.py, and type

python demo.py

After the script runs, it automatically fills in the user name and password we set, logs in, captures the specified pages, and saves them to the img folder.

Screenshot of the page of the login box:

Screenshot of the page after login:

Screenshot of sports page:

Notes on some of the methods used:

Cropping a screenshot to a page block (secondary screenshot):

PIL (Python Imaging Library) is a third-party image processing library for Python. It is powerful, its API is easy to use, and it has become the de facto standard image processing library on the Python platform. PIL only supports Python 2.x; on Python 3.x, install Pillow instead. Pillow is a friendly fork of PIL that supports Python 3.x and is imported under the same PIL package name.

On Python 2.x, install PIL for the secondary screenshot

pip install PIL

On Python 3.x, install Pillow for the secondary screenshot

pip install pillow

demo.py:

# Import the Image class
from PIL import Image

# Locate the element of the block to crop
img = driver.find_element_by_xpath('//*[@class="weui-img"]')

# x coordinate of the upper left corner of the block element in the page
left = img.location['x']
# y coordinate of the upper left corner of the block element in the page
top = img.location['y']
# x coordinate of the lower right corner of the block element in the page
right = img.location['x'] + img.size['width']
# y coordinate of the lower right corner of the block element in the page
bottom = img.location['y'] + img.size['height']

# Open the screenshot of the whole page (it must already have been saved, e.g. with driver.save_screenshot)
photo = Image.open('demo\\img\\img_page.png')

# Crop the screenshot to the coordinates of the block element
photo = photo.crop((left, top, right, bottom))

# Save the cropped screenshot
photo.save('demo\\img\\img.png')
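The coordinate arithmetic above can be isolated in a small function, which makes it easy to check without a browser. This is a minimal sketch; crop_box is a made-up helper name, and the sample numbers are made-up values standing in for img.location and img.size.

```python
def crop_box(location, size):
    """Return (left, top, right, bottom) from an element's location and size dicts."""
    left = location["x"]
    top = location["y"]
    right = left + size["width"]    # right edge = left edge + element width
    bottom = top + size["height"]   # bottom edge = top edge + element height
    return (left, top, right, bottom)


# Hypothetical values standing in for img.location and img.size
box = crop_box({"x": 40, "y": 120}, {"width": 300, "height": 200})
print(box)  # (40, 120, 340, 320)
```

The resulting tuple can be passed directly to photo.crop(box). This assumes one screenshot pixel per page pixel; on a display with scaling, the coordinates would likely need to be multiplied by the scale factor.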

Eight basic WebDriver element location methods:

1. find_element_by_id() locates by the id attribute

For example: find_element_by_id("one") locates the element whose id is one

2. find_element_by_name() locates by the name attribute

For example: find_element_by_name("one") locates the element whose name attribute is one

3. find_element_by_class_name() locates by the class name

For example: find_element_by_class_name("one") locates the element whose class is one

4. find_element_by_xpath() locates by XPath, an XML path language that identifies elements by their position in the document tree

For example: find_element_by_xpath("//div[@id='one']") locates the div element whose id is one

find_element_by_xpath("//*[@class='two']") locates the element whose class is two

5. find_element_by_css_selector() locates by CSS selector

For example: find_element_by_css_selector("#one") locates the element whose id is one

find_element_by_css_selector(".two") locates the element whose class is two

6. find_element_by_tag_name() locates by tag name

For example: find_element_by_tag_name("input") locates the input element

7. find_element_by_link_text() locates by the complete text of an a (link) element

find_element_by_partial_link_text() locates by part of the text of an a element

For example: find_element_by_link_text("news") locates the a element whose text is "news"

find_element_by_partial_link_text("new") locates the a element whose text contains "new"

8. Locating with By

(requires importing the By class: from selenium.webdriver.common.by import By)

For example: find_element(By.ID, "one") locates the element whose id is one

find_element(By.NAME, "one") locates the element whose name attribute is one

find_element(By.CLASS_NAME, "one") locates the element whose class is one

find_element(By.TAG_NAME, "div") locates the div element

When several elements match, use the plural form: replace element with elements in the method name to get all matching elements back as a list, and then pick a single element by index.

For example: find_elements_by_class_name("one")[1] locates the second of all elements whose class is one
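The first seven methods each correspond to one By strategy from the eighth, so any of them can be rewritten as a (strategy, value) pair. The sketch below shows that correspondence using the string values behind the By constants, so it runs without Selenium installed; the table name and the to_locator helper are made up for this illustration.

```python
# String values of the By constants in selenium.webdriver.common.by
LEGACY_TO_BY = {
    "find_element_by_id": "id",
    "find_element_by_name": "name",
    "find_element_by_class_name": "class name",
    "find_element_by_xpath": "xpath",
    "find_element_by_css_selector": "css selector",
    "find_element_by_tag_name": "tag name",
    "find_element_by_link_text": "link text",
    "find_element_by_partial_link_text": "partial link text",
}


def to_locator(legacy_method, value):
    """Translate a find_element(s)_by_* name and its argument into a (by, value) tuple."""
    # The plural helpers use the same strategy as their singular counterparts
    singular = legacy_method.replace("find_elements_", "find_element_", 1)
    return (LEGACY_TO_BY[singular], value)


print(to_locator("find_element_by_id", "one"))           # ('id', 'one')
print(to_locator("find_elements_by_class_name", "one"))  # ('class name', 'one')
```

With a live driver, such a pair can be unpacked into the generic call, for example driver.find_element(*to_locator("find_element_by_id", "one")). The By form is the one that keeps working on Selenium 4, where the find_element_by_* helpers have been dropped.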

Three waiting methods in Selenium:

In automated testing, the next operation often depends on the result of the previous one and cannot proceed until that step has completed. A wait is used to confirm that the previous operation has finished before continuing. For example, when logging in, you must wait for the login page to load before you can locate the user name and password elements and fill them in.

1. Forced waiting

time.sleep(s) always waits s seconds before continuing

Disadvantages: the waiting time is fixed and hard to tune. If the page is ready before the time is up, the script still waits out the remaining time for nothing. If the time is up and the previous operation has not finished, the following operation fails and an error is raised.

2. Implicit waiting

driver.implicitly_wait(s) makes element lookups retry for up to s seconds: if the element appears within that time, the script proceeds immediately; otherwise the lookup fails after s seconds

Disadvantages: if the element has still not appeared when the time is up, the following operation cannot proceed and an error is raised.

3. Explicit wait (recommended)

WebDriverWait(driver, timeout, poll_frequency=0.5, ignored_exceptions=None)

Polls until a given condition is met and then continues with the following code. If the timeout is exceeded, an exception is thrown.

driver: the browser driver

timeout: the maximum time to wait, in seconds

poll_frequency: the interval between checks; the default is 0.5 s

ignored_exceptions: exceptions to ignore while polling; NoSuchElementException is ignored by default

Used in conjunction with until():

WebDriverWait(driver, s).until(method, message="")

The method is called every 0.5 seconds for up to s seconds. As soon as it returns a true value, the script continues. If it has not done so when the time is up, a TimeoutException is raised with the given message.
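To make the polling behaviour concrete, here is a plain-Python sketch of an explicit wait that needs no browser. It is loosely modelled on the description above and is not Selenium's implementation: wait_until is a made-up name, the built-in TimeoutError stands in for Selenium's TimeoutException, and LookupError stands in for NoSuchElementException.

```python
import time


def wait_until(method, timeout, poll_frequency=0.5, ignored_exceptions=(LookupError,)):
    """Call method() repeatedly until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = method()
            if value:
                return value
        except ignored_exceptions:
            pass  # an ignored exception just means "not ready yet"
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll_frequency)


# Hypothetical condition that becomes true on the third check
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(ready, timeout=2, poll_frequency=0.1))  # True
print(len(attempts))                                     # 3
```

Unlike time.sleep(2), this returns as soon as the condition holds, so a fast page costs only a couple of polling intervals while a slow page still gets the full timeout.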

That concludes this walkthrough of simulated web page login and screenshots with Python + Selenium + PhantomJS. Hopefully it has answered your questions. Theory sticks best when paired with practice, so go and try the script yourself!
