
Example Analysis of selenium+chromedriver running on Server

Updated 2025-01-18. From: SLTechnology News&Howtos (shulou)

Shulou (Shulou.com), 06/01

This article walks through an example of running selenium+chromedriver on a server. It is practical enough to be worth sharing as a reference, and hopefully you will get something out of it.

1. Preface

I want to use selenium to scrape data from a website, but phantomjs sometimes fails for me. Chrome now has a headless (no-interface) mode, so phantomjs is no longer needed.

However, installing chrome on a server produced some errors. Here is a summary of the whole installation process.

2. Install chrome on ubuntu

# Install Google Chrome
# https://askubuntu.com/questions/79280/how-to-install-chrome-browser-properly-via-command-line
sudo apt-get install libxss1 libappindicator1 libindicator7
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome*.deb  # Might show "errors", fixed by next line
sudo apt-get install -f

By this time, chrome should be installed. Run a test with the following command line:

google-chrome --headless --remote-debugging-port=9222 https://chromium.org --disable-gpu

Here, headless mode is started with remote debugging enabled. The server has no gpu, so --disable-gpu is passed to avoid errors.

You can then open another ssh connection to the server and use the command line to access the server's local port 9222:

curl http://localhost:9222

If chrome is installed correctly, you will see debugging information. In my case an error was reported instead; the fix is described below.
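The same check can be scripted. The following is a minimal Python 3 sketch, assuming headless chrome was started with --remote-debugging-port=9222 as above. It queries the /json/version endpoint of the DevTools HTTP interface; the helper name devtools_version is hypothetical and not part of the original setup.

```python
import json
import urllib.error
import urllib.request


def devtools_version(port=9222, host="localhost", timeout=3.0):
    """Return the DevTools version info as a dict, or None if unreachable."""
    url = "http://%s:%d/json/version" % (host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Nothing listening, connection refused, or a non-JSON reply.
        return None


if __name__ == "__main__":
    info = devtools_version()
    if info is None:
        print("chrome is not answering on port 9222")
    else:
        print("chrome is up:", info.get("Browser"))
```

Returning None instead of raising keeps the check easy to use in a startup script that retries until chrome is ready.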

1) Fixing a possible error

Running the above command may report an error saying that chrome cannot be run as root. In that case, change the chrome launcher script as follows.

1. Find the google-chrome file

Mine is in /opt/google/chrome/

2. Open the google-chrome file with vi

vi /opt/google/chrome/google-chrome

Find this line in the file

exec -a "$0" "$HERE/chrome" "$@"

3. Just add --user-data-dir --no-sandbox at the end, so that the whole shell command is

exec -a "$0" "$HERE/chrome" "$@" --user-data-dir --no-sandbox

4. Run google-chrome again and it starts normally!
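As an alternative to editing the launcher script, the sandbox flag can be passed per session from selenium. This is only a sketch and not the method used in this article: the helper names sandbox_args and apply_args are hypothetical, and --no-sandbox is added only when the process actually runs as root.

```python
import os


def sandbox_args(is_root=None):
    """Return the extra chrome flags needed when running as root."""
    if is_root is None:
        # os.geteuid() is available on Linux and macOS; 0 means root.
        is_root = os.geteuid() == 0
    return ["--no-sandbox"] if is_root else []


def apply_args(options, args):
    """Add each flag to a selenium ChromeOptions object and return it."""
    for arg in args:
        options.add_argument(arg)
    return options


# Usage (requires selenium to be installed):
#   from selenium import webdriver
#   options = apply_args(webdriver.ChromeOptions(), sandbox_args())
```

Disabling the sandbox removes a layer of isolation, so where it is possible, running chrome as a non-root user avoids the need for the flag altogether.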

3. Install the chrome driver chromedriver

Download chromedriver

Chromedriver provides an api for operating chrome and is the bridge through which selenium controls chrome.

It is best to install the latest version of chromedriver. I did not install the latest version at first and got errors; with the latest version there was no problem. The latest version can be found at the address below.

https://sites.google.com/a/chromium.org/chromedriver/downloads

When I wrote this article, the latest version was 2.37.

wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip

unzip chromedriver_linux64.zip

At this point, the headless (interface-less) version of chrome is installed on the server.
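The download and unzip steps can also be done from Python. Below is a minimal sketch using only the standard library and the 2.37 URL pattern shown above. The function names and the destination directory are hypothetical, and the code assumes the archive holds a top-level file named chromedriver.

```python
import os
import stat
import urllib.request
import zipfile

BASE_URL = "https://chromedriver.storage.googleapis.com"


def chromedriver_url(version):
    """Build the linux64 download URL for a given chromedriver version."""
    return "%s/%s/chromedriver_linux64.zip" % (BASE_URL, version)


def unpack_chromedriver(zip_path, dest_dir):
    """Extract the archive and mark the chromedriver binary executable."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    # Assumption: the archive contains a top-level file named "chromedriver".
    binary = os.path.join(dest_dir, "chromedriver")
    # zipfile does not restore Unix permission bits, so set them explicitly.
    mode = os.stat(binary).st_mode
    os.chmod(binary, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return binary


def install_chromedriver(version, dest_dir):
    """Download the archive into dest_dir and unpack it there."""
    os.makedirs(dest_dir, exist_ok=True)
    zip_path = os.path.join(dest_dir, "chromedriver_linux64.zip")
    urllib.request.urlretrieve(chromedriver_url(version), zip_path)
    return unpack_chromedriver(zip_path, dest_dir)
```

For example, install_chromedriver("2.37", "/home/chrome") would leave the binary at /home/chrome/chromedriver, the path used as executable_path in the next section.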

4. How to use chrome without an interface

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
# Run without a window, and without gpu acceleration (the server has no gpu)
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
# Present a regular desktop chrome user-agent
chrome_options.add_argument("user-agent='Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'")
wd = webdriver.Chrome(chrome_options=chrome_options, executable_path='/home/chrome/chromedriver')
wd.get("https://www.163.com")
content = wd.page_source.encode('utf-8')
print(content)
wd.quit()

The third argument set on chrome_options (the user-agent) can be used to prevent the site from detecting that you are using headless mode and applying anti-crawling measures.
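To illustrate why the user-agent override matters: headless chrome, at least in the versions contemporary with this article, reports a default user-agent containing the token HeadlessChrome, which a site can match on. The sketch below uses a hypothetical helper, looks_headless, and the first sample string only illustrates the default format rather than being captured from a real session.

```python
def looks_headless(user_agent):
    """Return True if the user-agent string advertises headless chrome."""
    return "HeadlessChrome" in user_agent


# Illustrative default user-agent of a headless session (assumed format).
default_ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) HeadlessChrome/62.0.3202.94 Safari/537.36")
# The override used in the example above.
override_ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36")

print(looks_headless(default_ua))   # True
print(looks_headless(override_ua))  # False
```

With a live session, the string the site actually sees can be read with wd.execute_script("return navigator.userAgent") and passed to the same check.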

The other two arguments, shown below, enable headless mode. Without them, chrome opens with an interface on a desktop version of linux or on mac. When debugging, you can comment out these two lines and use the interface version of chrome to debug the program.

chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
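Rather than commenting the two lines in and out, the flags can be switched by a setting. This is a sketch that assumes a hypothetical environment variable named HEADLESS; neither selenium nor chrome reads this variable itself.

```python
import os


def display_args(environ=None):
    """Return the headless flags unless HEADLESS is set to '0'."""
    environ = os.environ if environ is None else environ
    if environ.get("HEADLESS", "1") == "0":
        return []  # debugging: open chrome with its normal window
    return ["--headless", "--disable-gpu"]


# Usage (requires selenium to be installed):
#   from selenium import webdriver
#   chrome_options = webdriver.ChromeOptions()
#   for arg in display_args():
#       chrome_options.add_argument(arg)
```

Running the script with HEADLESS=0 on a desktop machine then opens the interface version, while the server keeps the headless default.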
