2025-02-27 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
Many people are not familiar with how to batch-download wallpapers in Python, so this article summarizes the process: the content is detailed, the steps are clear, and it should have some reference value. I hope you get something out of reading it. Let's take a look.
Initialize the project
The project uses virtualenv to create a virtual environment, so the global Python installation is not polluted. Install it with pip3:

pip3 install virtualenv

Then create a new wallpaper-downloader directory in a suitable place, and use virtualenv inside it to create a virtual environment named venv:

virtualenv venv
. venv/bin/activate

Next, create a requirements.txt dependency file listing the libraries the script needs (requests, beautifulsoup4 and lxml). Finally, download and install the dependencies:

pip3 install -r requirements.txt

Analyze the crawler's working steps
For simplicity, let's go directly to the wallpaper list page for the "Aviation" category.
As you can see, there are 10 wallpapers available for download on this page. But the page only shows thumbnails, whose resolution is far too low for a wallpaper, so we need to go into the wallpaper detail page and find the HD download link. Clicking the first wallpaper opens a new page:
Because my machine has a Retina screen, I intend to download the largest size directly to guarantee high definition (the size circled in red).
Once the steps are understood, you can locate the corresponding DOM nodes with the browser's developer tools and extract the matching urls. That process is not walked through here; readers can try it on their own. Next, we move on to the coding part.
Visit the page
Create a new download.py file and import two libraries:

from bs4 import BeautifulSoup
import requests

Next, write a helper function that fetches a url and returns the page html:

def visit_page(url):
    headers = {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36'
    }
    r = requests.get(url, headers=headers)
    r.encoding = 'utf-8'
    soup = BeautifulSoup(r.text, 'lxml')
    return soup

To avoid being caught by the site's anti-crawling mechanism, we disguise the crawler as a normal browser by adding a UA to the headers, then specify utf-8 encoding, and finally return the parsed page.
Extract Link
After obtaining the page html, you need to extract the url corresponding to the wallpaper list of the page:
def get_paper_link(page):
    links = page.select('#content > div > ul > li > div > a')
    collect = []
    for link in links:
        collect.append(link.get('href'))
    return collect

This function extracts the urls of all the wallpaper detail pages on a list page.
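Since bs4 Tag objects expose .get() just like dicts, the collecting loop can be exercised, and shortened to a comprehension, with plain dicts standing in for tags. The hrefs below are made-up examples, not real links from the site:

```python
# stand-ins for bs4 Tag objects; Tag.get('href') behaves like dict.get('href')
links = [
    {'href': '/aviation/plane-wallpapers.html'},
    {'href': '/aviation/jet-wallpapers.html'},
]

# equivalent to the append loop in get_paper_link
collect = [link.get('href') for link in links]
print(collect)  # ['/aviation/plane-wallpapers.html', '/aviation/jet-wallpapers.html']
```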
Download wallpaper
With the address of the detailed page, we can go in and choose the right size. After analyzing the dom structure of the page, we can see that each size corresponds to a link:
So the first step is to extract the links corresponding to these sizes:
wallpaper_source = visit_page(link)
wallpaper_size_links = wallpaper_source.select('#wallpaper-resolutions > a')
size_list = []
for link in wallpaper_size_links:
    href = link.get('href')
    size_list.append({
        'size': eval(link.get_text().replace('x', '*')),
        'name': href.replace('/download/', ''),
        'url': href
    })

size_list is a collection of these links. To make it easy to pick the highest-resolution (largest) wallpaper, eval is applied to the size label, so that for example the text 5120x3200 is computed directly into a number and stored as the value of size.
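The eval() trick works because a label like "5120x3200" becomes the expression 5120*3200 after the replace. Since eval() will execute arbitrary text scraped from the page, a safer equivalent is to parse the two integers directly; size_value below is a hypothetical helper, not part of the article's code:

```python
def size_value(label):
    # "5120x3200" -> 5120 * 3200, the total pixel count, without eval()
    width, height = label.strip().split('x')
    return int(width) * int(height)

# same number that eval('5120*3200') would produce
print(size_value('5120x3200'))  # 16384000
```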
After getting all the collections, you can use the max () method to select the highest definition item:
biggest_one = max(size_list, key=lambda item: item['size'])
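To see how max() with a key function picks the entry, here is a self-contained example with hard-coded stand-in data (the file names are made up for illustration):

```python
size_list = [
    {'size': 1920 * 1080, 'name': '1920x1080.jpg'},
    {'size': 5120 * 3200, 'name': '5120x3200.jpg'},
    {'size': 2560 * 1440, 'name': '2560x1440.jpg'},
]

# max() compares the items by the value the key function returns
biggest_one = max(size_list, key=lambda item: item['size'])
print(biggest_one['name'])  # 5120x3200.jpg
```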
The url in this biggest_one is the download link of the corresponding size. Next, you only need to download the linked resources through the requests library:
result = requests.get(PAGE_DOMAIN + biggest_one['url'])
if result.status_code == 200:
    open('wallpapers/' + biggest_one['name'], 'wb').write(result.content)

Note that you first need to create a wallpapers directory under the project root, otherwise the script will report an error at runtime.
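Alternatively, the script can create the directory itself at startup; os.makedirs with exist_ok=True is a standard-library way to do that (this call is an addition, not part of the article's code):

```python
import os

# create the output directory if it does not already exist, so the
# open('wallpapers/...', 'wb') call cannot fail on a missing folder
os.makedirs('wallpapers', exist_ok=True)
```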
Putting it all together, the complete download_wallpaper function looks like this (index and total are passed in so the progress message can be printed):

def download_wallpaper(link, index, total):
    wallpaper_source = visit_page(PAGE_DOMAIN + link)
    wallpaper_size_links = wallpaper_source.select('#wallpaper-resolutions > a')
    size_list = []
    for link in wallpaper_size_links:
        href = link.get('href')
        size_list.append({
            'size': eval(link.get_text().replace('x', '*')),
            'name': href.replace('/download/', ''),
            'url': href
        })
    biggest_one = max(size_list, key=lambda item: item['size'])
    print('Downloading the ' + str(index + 1) + '/' + str(total) + ' wallpaper: ' + biggest_one['name'])
    result = requests.get(PAGE_DOMAIN + biggest_one['url'])
    if result.status_code == 200:
        open('wallpapers/' + biggest_one['name'], 'wb').write(result.content)
Batch operation
The steps above only download a single wallpaper. If we want to download all the wallpapers across multiple list pages, we need to call these methods in a loop. First, let's define a few constants:
import sys

if len(sys.argv) != 4:
    print('3 arguments were required but only found ' + str(len(sys.argv) - 1) + '!')
    exit()

category = sys.argv[1]

try:
    page_start = [int(sys.argv[2])]
    page_end = int(sys.argv[3])
except:
    print('The second and third arguments must be numbers, not strings')
    exit()
Here, by obtaining the command line parameters, three constants category, page_start and page_end are specified, corresponding to the wallpaper classification, the start page number and the end page number, respectively.
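The same validation can be wrapped in a testable function; parse_args below is a hypothetical refactor of the inline checks, not the article's code:

```python
def parse_args(argv):
    """Validate [script, category, page_start, page_end] and return the three values."""
    if len(argv) != 4:
        raise SystemExit('3 arguments were required but only found ' + str(len(argv) - 1) + '!')
    category = argv[1]
    try:
        page_start, page_end = int(argv[2]), int(argv[3])
    except ValueError:
        raise SystemExit('The second and third arguments must be numbers, not strings')
    return category, page_start, page_end

print(parse_args(['download.py', 'aviation', '1', '3']))  # ('aviation', 1, 3)
```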
For convenience, define two more url-related constants:
PAGE_DOMAIN = 'http://wallpaperswide.com'
PAGE_URL = 'http://wallpaperswide.com/' + category + '-desktop-wallpapers/page/'
Then you can do the batch operation happily. Before that, we define a start() entry function:

def start():
    if page_start[0]
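The listing of start() breaks off after its first line. Based on the one-element page_start list and the helpers defined earlier, it presumably walks the pages from page_start up to page_end, downloading every wallpaper on each list page. The sketch below is an assumption, not the article's exact code, and page_urls is a hypothetical helper added so the range logic can be shown without network access:

```python
def page_urls(page_url, first, last):
    # hypothetical helper: every list-page url in the range, inclusive
    return [page_url + str(n) for n in range(first, last + 1)]

def start():
    # page_start is a one-element list, so its value can be mutated between calls
    if page_start[0] <= page_end:
        page = visit_page(PAGE_URL + str(page_start[0]))  # helpers defined earlier
        links = get_paper_link(page)
        total = len(links)
        for index, link in enumerate(links):
            download_wallpaper(link, index, total)
        page_start[0] += 1
        start()  # move on to the next list page
```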
© 2024 shulou.com SLNews company. All rights reserved.