2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains in detail how to use Playwright in Python. The editor finds it very practical and shares it here as a reference. I hope you get something out of it.
Playwright is a new-generation automated testing tool that Microsoft open-sourced in early 2020. Like Selenium and Pyppeteer, it drives browsers to perform automated operations. It supports the mainstream browsers on the market, and its API is simple yet powerful. Although it arrived relatively late, it has grown quickly in popularity.
Now that Pyppeteer is no longer maintained, Playwright is a great open source choice: well documented and powerful.
Installation method

    conda config --add channels conda-forge
    conda config --add channels microsoft
    conda install playwright
    playwright install

The commands above install the Playwright package and then download the browser binaries for Chromium, Firefox, and WebKit.
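To confirm that the package landed in the environment you expect, a small check like the following may help. It is only a sketch: `installed_version` is a hypothetical helper name, it relies on the standard-library `importlib.metadata` module, and it says nothing about whether the browser binaries were downloaded.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "playwright") -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("playwright is not installed in this environment")
else:
    print(f"playwright {version} is installed")
```

If the check reports that the package is missing, the active environment is probably not the one the install commands were run in.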
Characteristics
Playwright supports all current mainstream browsers, including Chrome and Edge (based on Chromium), Firefox, and Safari (based on WebKit), and provides a complete automation API for them.
Playwright supports testing mobile pages; device emulation can be used to test responsive web applications in mobile browsers.
Playwright supports both headless and non-headless mode for all browsers.
Playwright is simple to install and configure. The installation process downloads the browsers it drives, so there is no separate WebDriver to configure.
Playwright provides a large number of automation APIs. It automatically waits for the target node to load before acting on it, which greatly simplifies writing automation scripts.
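As an illustration of the device emulation mentioned above, here is a minimal sketch. It assumes the device name "iPhone 11" is present in Playwright's built-in `devices` table and uses WebKit as the engine; `title_on_device` and `main` are hypothetical names chosen for this example.

```python
def title_on_device(playwright, url, device_name="iPhone 11"):
    """Open url in WebKit with a device profile and return the page title."""
    # playwright.devices maps device names to context options
    # (viewport, user agent, touch support, and so on).
    device = playwright.devices[device_name]
    browser = playwright.webkit.launch()
    try:
        context = browser.new_context(**device)
        page = context.new_page()
        page.goto(url)
        return page.title()
    finally:
        browser.close()


def main():
    # Imported here so the helper above can be exercised without a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        print(title_on_device(p, "https://www.baidu.com"))
```

Calling `main()` runs the helper against a real browser. The device profile is applied to the browser context, so every page opened from that context shares the same emulated viewport and user agent.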
Usage
Import Playwright into the Python script and launch one of the three browsers (Chromium, Firefox, and WebKit). Playwright supports two programming modes: an asynchronous mode like Pyppeteer's, and a synchronous mode like Selenium's. We can choose whichever fits our needs.
Let's first look at a basic example in synchronous mode:
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        for browser_type in [p.chromium, p.firefox, p.webkit]:
            browser = browser_type.launch(headless=False)
            page = browser.new_page()
            page.goto("https://www.baidu.com")
            page.screenshot(path=f"screenshot-{browser_type.name}.png")
            print(page.title())
            browser.close()
First we import the sync_playwright function and call it. It returns a PlaywrightContextManager object, which can be understood as the browser's context manager; inside the with statement, the Playwright object it yields is assigned to the variable p. We then take the chromium, firefox, and webkit browser types from p and use a for loop to call the launch method of each in turn, with headless set to False.
Note: if headless is not set to False, launch starts the browser in headless mode by default, and no window is shown.
The launch method returns a Browser object, which we assign to the variable browser. We then call its new_page method, which is equivalent to opening a new tab; it returns a page object, which we assign to the variable page. From there we call the page object's automation APIs. Once the page has loaded, the script takes a screenshot, prints the title to the console, and closes the browser. The code above calls two methods of the page object:
1. screenshot: takes a file path as its parameter; the screenshot is saved to that file.
2. title: returns the title of the page.
At this point, three screenshots are generated in the current directory, all of the Baidu home page, each with the browser name in its file name (screenshot-chromium.png, screenshot-firefox.png, and screenshot-webkit.png).
Console output:
Baidu, you will know.
Baidu, you will know.
Baidu, you will know.
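The screenshot method used above captures only the visible viewport by default. As a sketch of one variation, the helper below passes `full_page=True` to capture the whole scrollable page. `save_full_page` and its default file name are hypothetical choices for this example; the `page` argument is any Playwright page object created as in the script above.

```python
from pathlib import Path


def save_full_page(page, url, out_dir=".", name="full-page.png"):
    """Load url and save a full-page screenshot; return the file path."""
    page.goto(url)
    target = Path(out_dir) / name
    # full_page=True captures the whole scrollable page, not just the viewport.
    page.screenshot(path=str(target), full_page=True)
    return target
```

Inside the loop from the example above, `save_full_page(page, "https://www.baidu.com", name=f"full-{browser_type.name}.png")` would write one full-page image per browser.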
In addition to the synchronous mode described above, Playwright also supports an asynchronous mode. If your project uses asyncio, consider the asynchronous mode and its async API, as follows:
    import asyncio

    from playwright.async_api import async_playwright


    async def main():
        async with async_playwright() as p:
            for browser_type in [p.chromium, p.firefox, p.webkit]:
                browser = await browser_type.launch()
                page = await browser.new_page()
                await page.goto("https://www.baidu.com")
                await page.screenshot(path=f"screenshot-{browser_type.name}.png")
                print(await page.title())
                await browser.close()


    asyncio.run(main())
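As you can see from the code above, it is written almost exactly like the synchronous version.
Note:
1. async_playwright is imported (from playwright.async_api) instead of sync_playwright.
2. The async/await keywords are added: the function is declared with async def, the context manager is entered with async with, and each Playwright call is awaited.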
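One reason to prefer the asynchronous mode is that independent browsers can run concurrently. The following sketch uses `asyncio.gather` for that; it is an illustration built on the example above, not part of the original article, and `fetch_title` is a hypothetical helper name.

```python
import asyncio


async def fetch_title(browser_type, url):
    """Launch one browser, load url, and return (browser name, page title)."""
    browser = await browser_type.launch()
    try:
        page = await browser.new_page()
        await page.goto(url)
        return browser_type.name, await page.title()
    finally:
        await browser.close()


async def main(url="https://www.baidu.com"):
    # Imported here so fetch_title can be tested with stand-in objects.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # gather runs the three coroutines concurrently instead of in sequence.
        results = await asyncio.gather(
            *(fetch_title(bt, url) for bt in (p.chromium, p.firefox, p.webkit))
        )
    for name, title in results:
        print(f"{name}: {title}")
```

Run it with `asyncio.run(main())`. The three browsers start at roughly the same time, so the total run time should be closer to that of the slowest browser than to the sum of all three.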
Code generation
Playwright has another powerful feature: it can record our actions in the browser and automatically generate the corresponding code. This is done with the codegen subcommand of the playwright command. Let's take a look at the parameters of the codegen command.

    playwright codegen --help
The results are similar to the following:
    Usage: npx playwright codegen [options] [url]

    open page and generate code for user actions

    Options:
      -o, --output            saves the generated script to a file
      --target                language to generate, one of javascript, test, python, python-async, csharp (default: "python")
      -b, --browser           browser to use, one of cr, chromium, ff, firefox, wk, webkit (default: "chromium")
      --channel               Chromium distribution channel, "chrome", "chrome-beta", "msedge-dev", etc
      --color-scheme          emulate preferred color scheme, "light" or "dark"
      --device                emulate device, for example "iPhone 11"
      --geolocation           specify geolocation coordinates, for example "37.819722,-122.478611"
      --ignore-https-errors   ignore https errors
      --load-storage          load context storage state from the file, previously saved with --save-storage
      --lang                  specify language / locale, for example "en-GB"
      --proxy-server          specify proxy server, for example "http://myproxy:3128" or "socks5://myproxy:8080"
      --save-storage          save context storage state at the end, for later use with --load-storage
      --save-trace            record a trace for the session and save it to a file
      --timezone              time zone to emulate, for example "Europe/Rome"
      --timeout               timeout for Playwright actions in milliseconds (default: "10000")
      --user-agent            specify user agent string
      --viewport-size         specify browser viewport size in pixels, for example "1280"
      -h, --help              display help for command

    Examples:

      $ codegen
      $ codegen --target=python
      $ codegen -b webkit https://example.com
Several of the options above are worth noting:
-o sets the name of the output code file.
--target sets the language of the generated code. The default is python, which generates synchronous-mode code; passing python-async generates asynchronous-mode code.
-b sets the browser to use. The default is Chromium.
--device emulates a mobile device.
--lang sets the browser language.
--timeout sets the timeout for Playwright actions, in milliseconds.
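When codegen is launched from a script rather than typed by hand, the options above can be assembled into an argument list. The sketch below is illustrative: `codegen_command` is a hypothetical helper, and it uses only flags that appear in the help output above.

```python
import shlex
from typing import List, Optional


def codegen_command(
    output: str,
    target: str = "python",
    browser: str = "chromium",
    device: Optional[str] = None,
    url: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `playwright codegen`."""
    cmd = ["playwright", "codegen", "-o", output, "--target", target, "-b", browser]
    if device is not None:
        cmd += ["--device", device]
    if url is not None:
        cmd.append(url)
    return cmd


cmd = codegen_command("test3.py", target="python-async", device="iPhone 11")
print(shlex.join(cmd))
# playwright codegen -o test3.py --target python-async -b chromium --device 'iPhone 11'
```

Passing the list to `subprocess.run(cmd)` would start a recording session, assuming the `playwright` executable is on the PATH. Keeping the arguments as a list avoids quoting problems with device names that contain spaces.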
With these options in mind, let's launch the default Chromium browser and write the result to test3.py with the following command:

    playwright codegen -o test3.py --target python-async

While recording, the browser highlights the node you are working on, along with the node name.
The generated code updates in real time as you operate. When you are finished, close the browser, and Playwright writes a test3.py file with the following contents:
    import asyncio

    from playwright.async_api import Playwright, async_playwright


    async def run(playwright: Playwright) -> None:
        browser = await playwright.chromium.launch(headless=False)
        context = await browser.new_context()

        # Open new page
        page = await context.new_page()

        # Go to https://www.baidu.com/
        await page.goto("https://www.baidu.com/")

        # Click input[name="wd"]
        await page.click("input[name=\"wd\"]")

        # Click input[name="wd"]
        await page.click("input[name=\"wd\"]")

        # Fill input[name="wd"]
        await page.fill("input[name=\"wd\"]", "how to be a rich woman on the list")

        # Click text=Baidu
        # async with page.expect_navigation(url="https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=%E5%A6%82%E4%BD%95%E6%A6%9C%E4%B8%8A%E5%AF%8C%E5%A9%86&fenlei=256&rsv_pq=ca59e3ec000cf6aa&rsv_t=5f82kcndi6iqNSwqOVo5sd%2BHSoqhzQHKLGVs1HFegxx02UtWAA5gHQbWBfw&rqlang=cn&rsv_enter=0&rsv_dl=tb&rsv_sug3=24&rsv_sug1=14&rsv_sug7=100&rsv_btype=i&prefixsug=%25E5%25A6%2582%25E4%25BD%2595%25E6%25A6%259C%25E4%25B8%258A%25E5%25AF%258C%25E5%25A9%2586&rsp=4&inputT=8686&rsv_sug4=68370&rsv_jmp=fail"):
        async with page.expect_navigation():
            await page.click("text=Baidu")
        # assert page.url == "https://www.baidu.com/s?ie=utf-8&...&rsp=4&inputT=8686&rsv_sug4=68370"

        # Close page
        await page.close()

        # ---------------------
        await context.close()
        await browser.close()


    async def main() -> None:
        async with async_playwright() as playwright:
            await run(playwright)


    asyncio.run(main())
The generated code is very similar to the code we wrote earlier. It can be run as is, and when it runs it replays the actions we just performed.
Note that new_page here is called on the context variable rather than on browser, and context is created by calling new_context on the Browser object. The context variable is a BrowserContext object: an independent context similar to incognito mode, whose resources are isolated so that contexts do not interfere with each other.
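The isolation between contexts can be checked with a small experiment. The sketch below sets a cookie in one context and counts the cookies visible in each of two contexts. `cookie_counts`, the cookie name and value, and the example.com URL are all placeholders chosen for illustration.

```python
def cookie_counts(browser, url="https://example.com"):
    """Set a cookie in one context and count cookies in two separate contexts."""
    first = browser.new_context()
    second = browser.new_context()
    try:
        # The cookie is added to the first context only.
        first.add_cookies([{"name": "session", "value": "abc", "url": url}])
        return len(first.cookies()), len(second.cookies())
    finally:
        first.close()
        second.close()


def main():
    # Imported here so cookie_counts can be tested with stand-in objects.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        print(cookie_counts(browser))
        browser.close()
```

If contexts are isolated as described, calling `main()` should report one cookie in the first context and none in the second, even though both belong to the same browser process.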
Selector
Playwright's documentation on selectors is extensive; you can refer to https://playwright.dev/python/docs/selectors directly.
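As a brief illustration, the sketch below uses the two selector styles that already appear in the generated code above: a CSS selector for the search box and a `text=` selector for a button. `search` and `SEARCH_BOX` are hypothetical names, and the button text is left as a parameter because it depends on the page.

```python
# CSS selector: an <input> element whose name attribute is "wd".
SEARCH_BOX = 'input[name="wd"]'


def search(page, query, button_text=None):
    """Fill the search box, then submit with Enter or by clicking a button."""
    page.fill(SEARCH_BOX, query)
    if button_text is None:
        # Press Enter inside the search box.
        page.press(SEARCH_BOX, "Enter")
    else:
        # Text selector: matches an element by its visible text.
        page.click(f"text={button_text}")
```

With a page object from the synchronous API, `search(page, "playwright")` types the query and submits it with the Enter key. Keeping selectors in named constants makes them easier to update when the page markup changes.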
Event monitoring
The page object provides an on method that can be used to listen for events in the page, such as close, console, load, request, response, and so on.
For example, we can listen for the response event, which is triggered each time a network request receives a response. By registering a callback, we can access all the information in the corresponding Response object.
    from playwright.sync_api import sync_playwright


    def on_response(response):
        # Print the status code and URL of every response the page receives.
        print(f'Status {response.status}: {response.url}')


    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.on('response', on_response)
        page.goto('https://www.kenshujun.cn/')
        page.wait_for_load_state('networkidle')
        browser.close()
After creating the page object, we start listening for the response event and set the callback to on_response. The on_response function accepts one parameter, the Response object, and prints its status code and URL.
The output matches what the browser's Network panel shows for the same page load.
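Instead of printing each response, the callback can collect them for later analysis. The following sketch stores status codes and URLs and summarizes them afterwards. `ResponseLog` is a hypothetical class written for this example; it reads only the `status` and `url` attributes used in the code above.

```python
from collections import Counter


class ResponseLog:
    """Collect (status, url) pairs from response events."""

    def __init__(self):
        self.entries = []

    def record(self, response):
        self.entries.append((response.status, response.url))

    def status_counts(self):
        """Count how many responses were seen per status code."""
        return Counter(status for status, _ in self.entries)

    def failures(self):
        """Return the entries whose status code is 400 or higher."""
        return [(status, url) for status, url in self.entries if status >= 400]


def main(url="https://www.kenshujun.cn/"):
    # Imported here so ResponseLog can be tested with stand-in objects.
    from playwright.sync_api import sync_playwright

    log = ResponseLog()
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("response", log.record)
        page.goto(url)
        page.wait_for_load_state("networkidle")
        browser.close()
    print(log.status_counts())
    print(log.failures())
```

Calling `main()` loads the page and prints a summary once the network is idle. Collecting first and reporting afterwards keeps the callback cheap, which matters when a page triggers many requests.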
This concludes the article on how to use Playwright in Python. I hope the content above is helpful and that you learned something from it. If you found the article useful, please share it so more people can see it.