2025-01-17 Update. From: SLTechnology News&Howtos (shulou), Servers
Shulou(Shulou.com)06/01 Report--
This article explains how to create a fake desktop on a Linux server so that an automated browser can run in headed mode.
Anyone who regularly uses Selenium or Puppeteer knows that the Chrome browser they launch can run in either headed mode or headless mode. In headed mode on your own computer, a Chrome window pops up and you can watch the browser operate automatically. Headless mode does not open any window; there is only a process.
In an earlier article, on the dozens of Selenium and Puppeteer features that a website can detect, we introduced a website that detects the characteristics of an automated browser. It shows that, without any extra settings, a browser launched by Selenium or Puppeteer exposes dozens of features that let the target site identify it as a crawler. Moreover, headless mode exposes far more of these features than headed mode.
In other words, even if you use no technique to hide these features, you are much safer simply using headed mode. If the site's anti-crawler measures are not very strict, in many cases a headless browser is easily detected, while a headed browser is much harder to detect.
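To make "features" concrete: one widely known signal is the User-Agent string, in which headless Chrome has by default identified itself with the token HeadlessChrome. The sketch below illustrates only that single check. The function name looks_headless and both sample User-Agent strings are made up for illustration, and real detection sites combine many more signals.

```python
def looks_headless(user_agent: str) -> bool:
    """Return True if the User-Agent carries the HeadlessChrome token."""
    return "HeadlessChrome" in user_agent


# Illustrative User-Agent strings (hypothetical version numbers).
headed_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
headless_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36"
)

for ua in (headed_ua, headless_ua):
    print(looks_headless(ua), ua)
```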
The following figure shows headed mode visiting the detection website without any feature-hiding technique:
The following figure shows headless mode visiting the detection website without any feature-hiding technique:
In headless mode, the results page is a sea of red.
So, in general, you should prefer headed mode.
The problem is that when we run a crawler with Selenium or Puppeteer on a Linux server, headed mode always reports an error. Headed mode needs the system to provide a graphical interface in order to draw the browser window, but a Linux server generally has none, so headed mode is bound to fail.
To use the automated browser's headed mode in this situation, we need to create a fake graphical interface that deceives the browser so that headed mode works properly.
To do this, we can use a tool called Xvfb. Wikipedia [1] introduces it as follows:
Xvfb or X virtual framebuffer is a display server implementing the X11 display server protocol. In contrast to other display servers, Xvfb performs all graphical operations in virtual memory without showing any screen output.
Xvfb implements the X11 display server protocol on a machine without a display device. It provides the same interfaces that other graphical environments provide, but there is no real graphical output. When a program performs graphics-related operations under Xvfb, they all run in virtual memory and nothing is shown on any screen.
With Xvfb, we can trick Selenium or Puppeteer into thinking it is running on a system with a graphical interface, so that headed mode works properly.
Installing Xvfb is very simple. On Ubuntu, you only need to run the following two commands:

    sudo apt-get update
    sudo apt-get install xvfb
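After installation it can be useful to confirm that the xvfb-run wrapper is actually on the PATH before launching anything. A minimal sketch, assuming the Ubuntu xvfb package (which ships xvfb-run); the helper name xvfb_run_path is hypothetical:

```python
import shutil


def xvfb_run_path():
    """Return the full path of the xvfb-run wrapper, or None if it is not on PATH."""
    return shutil.which("xvfb-run")


path = xvfb_run_path()
if path:
    print("xvfb-run found at", path)
else:
    print("xvfb-run not found; install the xvfb package first")
```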
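Now, let's write a very simple piece of code that uses Selenium to drive Chrome:

    import time

    from selenium.webdriver import Chrome

    # Selenium 3 style: the positional argument is the path to chromedriver.
    # Selenium 4 expects a Service object instead of a positional path.
    driver = Chrome('./chromedriver')
    driver.get('https://bot.sannysoft.com/')
    time.sleep(5)  # give the detection page time to finish its checks
    driver.save_screenshot('screenshot.png')
    driver.quit()  # close the browser and stop the chromedriver process
    print('run complete')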
If you run it directly on the server, it fails. Because there is no graphical interface, the program is bound to report an error.
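On X11 systems a graphical program locates the display server through the DISPLAY environment variable, which is typically unset in a shell on a server without a desktop. The following is a minimal sketch for checking this before starting the browser; the helper name has_display is hypothetical, and it only covers X11, not other display systems.

```python
import os


def has_display(env=None) -> bool:
    """Return True if an X11 display is advertised through DISPLAY."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY"))


if has_display():
    print("DISPLAY is set to", os.environ["DISPLAY"])
else:
    print("No DISPLAY set; a headed browser will probably fail to start")
```

When a program is started through xvfb-run, the wrapper sets DISPLAY for the child process, so the same check should succeed there.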
Now we just need to prefix the command that runs this code with xvfb-run (for example, xvfb-run python3 test.py) and see how it works:
The code ran successfully and no error was reported. Now we download the generated screenshot.png file from the server, and when we open it we see the following:
As you can see, although the window is relatively small, it is indeed the detection result for headed mode. We can also change the size of the virtual screen by passing server arguments to xvfb-run. The options must come before the command being run, otherwise they are passed to the script instead: xvfb-run -s "-screen 0 1920x1080x16" python3 test.py pretends to run the program on a monitor with a resolution of 1920x1080 and 16-bit color. Then modify the Selenium code to set the browser window to the same size.
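The original code for this step is not preserved here, so the following is only a sketch of one way to do it, not necessarily the author's. It passes Chrome's --window-size switch through ChromeOptions. The helper names window_size_argument and take_screenshot are hypothetical, and the sketch assumes a Selenium version in which Chrome accepts an options keyword and that chromedriver can be found automatically or on the PATH.

```python
def window_size_argument(width: int, height: int) -> str:
    """Build Chrome's --window-size switch for the given size in pixels."""
    return f"--window-size={width},{height}"


def take_screenshot(url: str, path: str, width: int = 1920, height: int = 1080) -> None:
    """Open url in headed Chrome at the given window size and save a screenshot."""
    # Imported here so the helper above works even without Selenium installed.
    import time

    from selenium.webdriver import Chrome, ChromeOptions

    options = ChromeOptions()
    options.add_argument(window_size_argument(width, height))
    # Assumption: chromedriver is discoverable without an explicit path.
    driver = Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(5)  # give the page time to finish rendering
        driver.save_screenshot(path)
    finally:
        driver.quit()
```

Calling take_screenshot('https://bot.sannysoft.com/', 'screenshot.png') from a script started under xvfb-run with a matching 1920x1080 screen should then capture a full-size window.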
The running effect is shown in the following figure:
This article demonstrates driving Selenium from Python. You can also try Puppeteer; just change the startup command to xvfb-run node index.js.
Thank you for reading. That covers how to create a fake desktop on a Linux server to run an automated browser in headed mode; the details still need to be verified in practice.
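Since the only thing that changes between the Python and Node.js cases is the command being wrapped, the launch step can be sketched as a small helper that builds the xvfb-run command line. The function names xvfb_command and run_under_xvfb are hypothetical. The -s option (server arguments) belongs to xvfb-run itself, and the default 16-bit depth simply mirrors the screen example used in this article.

```python
import subprocess


def xvfb_command(command, width=None, height=None, depth=16):
    """Prefix command with xvfb-run, optionally requesting a screen size."""
    prefix = ["xvfb-run"]
    if width and height:
        # xvfb-run options must come before the wrapped command.
        prefix += ["-s", f"-screen 0 {width}x{height}x{depth}"]
    return prefix + list(command)


def run_under_xvfb(command, **screen):
    """Run command under xvfb-run and raise if it exits with an error."""
    return subprocess.run(xvfb_command(command, **screen), check=True)


print(xvfb_command(["python3", "test.py"]))
print(xvfb_command(["node", "index.js"], width=1920, height=1080))
```

For example, run_under_xvfb(["node", "index.js"], width=1920, height=1080) would start a Puppeteer script on a 1920x1080 virtual screen, provided xvfb-run and node are installed.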