How to batch-download Onmyoji website wallpapers with Python

2025-03-26 Update From: SLTechnology News&Howtos


This article introduces how to batch-download wallpapers from the Onmyoji website with Python, walking through the process with a working example. The method is simple, fast, and practical; I hope it helps you solve the same problem.

The code below can be copied and used directly; just remember to install the dependencies first with pip install requests beautifulsoup4.

Final version

```python
# Standard library for regular expressions
import re
# Network request library
import requests

# Directory the pictures are saved to
path = 'D:/Onmyoji'
# Onmyoji wallpaper page
html_doc = "https://yys.163.com/media/picture.html"
requests_html_doc = requests.get(html_doc).text
# Regex-match every href that points at a 2732x2048 wallpaper
regex = re.compile(r'href="(.*?2732x2048\.jpg)" rel="external nofollow"')
urls = regex.findall(requests_html_doc)

# A set prevents downloading the same image twice
result = set()
for i in urls:
    result.add(i)

# Counter used to name the pictures
num = 0
# file path, mode, encoding
f = open(r'result.txt', 'w', encoding='utf-8')
for a in result:
    try:
        f.write(a + '\n')                    # record the link in result.txt
        image_data = requests.get(a).content
        image_name = '{}.jpg'.format(num)    # name each image
        save_path = path + '/' + image_name
        # Save the image (renamed from f so it does not shadow the log file)
        with open(save_path, 'wb') as pic:
            pic.write(image_data)
        print(image_name, '=======================')
        num = num + 1                        # next picture number
    except Exception:
        pass
# close the log file
f.close()
print("scan results have been written to result.txt")
```

Process

Reference code

Starting from zero, with no clue and only a shaky grasp of Python, I began by borrowing other people's code. The first piece of code I borrowed is below.

```python
# Document parsing library
from bs4 import BeautifulSoup
# Network request libraries
import urllib.request
import requests

path = 'D:/Onmyoji'
html_doc = "https://yys.163.com/media/picture.html"
# Build the request
req = urllib.request.Request(html_doc)
# Open the page
webpage = urllib.request.urlopen(req)
# Read the page content
html = webpage.read()
# Parse it into a document object
soup = BeautifulSoup(html, 'html.parser')

# Invalid URL 1
invalidLink1 = '#'
# Invalid URL 2
invalidLink2 = 'javascript:void(0)'
# A set prevents downloading duplicate links
result = set()
# Counter used to name the pictures
num = 0
# Find every <a> tag in the document
for k in soup.find_all('a'):
    # Read its href attribute
    link = k.get('href')
    # Skip tags without an href
    if link is not None:
        # Filter out invalid links
        if link == invalidLink1:
            pass
        elif link == invalidLink2:
            pass
        elif link.find('javascript:') != -1:
            pass
        else:
            result.add(link)

# file path, mode, encoding (opened once, outside the loop)
f = open(r'result.txt', 'w', encoding='utf-8')
for a in result:
    # image_data = urllib.request.urlopen(a).read()
    image_data = requests.get(url=a).content
    image_name = '{}.jpg'.format(num)  # name each image
    save_path = path + '/' + image_name
    # Save the image
    with open(save_path, 'wb') as pic:
        pic.write(image_data)
    print(image_name, '=======================')
    num = num + 1  # next picture number
f.close()
print("scan results are written to result.txt")
```

Thoughts: urllib.request vs. requests

The borrowed code uses urllib.request to make the request, and most of the examples I saw when I was first learning also used urllib.request. Later I noticed other code using requests instead. Subjectively, requests felt more convenient and saved a few lines of code, so I looked into the difference between the two.
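As a minimal sketch of that difference: urllib.request builds a Request object, opens it, and reads raw bytes in separate steps, while requests does it in one call and decodes text for you. The snippet below compares what each API prepares without actually hitting the network (the headers and URL are just illustrative).

```python
import urllib.request
import requests

url = "https://yys.163.com/media/picture.html"

# urllib.request: build a Request, then you would urlopen() it and .read() bytes,
# decoding manually if you want text.
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
# html = urllib.request.urlopen(req).read().decode("utf-8")  # three explicit steps

# requests: one call; .text decodes using the response's declared encoding.
# html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}).text

# Without sending anything, we can still inspect what each library prepared:
prepared = requests.Request("GET", url, headers={"User-Agent": "Mozilla/5.0"}).prepare()
print(req.full_url)   # urllib keeps the URL on the Request object
print(prepared.url)   # requests does too, after .prepare()
```

In practice the two lines commented out above are equivalent, which is why requests usually reads as a few lines shorter.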

BeautifulSoup

This was also my first contact with BeautifulSoup, which I had seen praised in the comments of several articles. Reading its documentation changed my impression that extracting particular element nodes from a document in Python had to be difficult.

Beautiful Soup 4 Documentation
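To show how little code node extraction takes, here is a small self-contained example in the same style as the borrowed code. The HTML fragment is hypothetical stand-in data for the wallpaper page, including the kinds of junk links the filter has to deal with.

```python
from bs4 import BeautifulSoup

# A small HTML fragment standing in for the wallpaper page (hypothetical data).
html = """
<div>
  <a href="#">top</a>
  <a href="javascript:void(0)">menu</a>
  <a href="https://example.com/pic_2732x2048.jpg">wallpaper</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
# find_all('a') returns every <a> tag in document order; .get('href') reads an
# attribute and returns None when it is missing instead of raising.
links = [a.get("href") for a in soup.find_all("a")]
print(links)
```

Two method calls replace what would otherwise be manual string searching over the raw HTML.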

Optimization

The regular expression was added because the image links scraped at first included an empty string, which crashed the whole program during download, and the invalidLink1/invalidLink2 blacklist in the reference code was genuinely uncomfortable to look at. Matching with a regex at the source guarantees every link is valid, and wrapping the download code in try/except ensures that a single error cannot bring the program down.
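The two fixes can be sketched together: whitelist only the wallpaper URLs with a regex instead of blacklisting bad links one by one, then guard each download. The candidate list below is hypothetical sample data mimicking what the page returns.

```python
import re

# Candidate hrefs as they might come back from the page (hypothetical sample).
candidates = [
    "",                                       # empty string that crashed the first version
    "#",
    "javascript:void(0)",
    "https://example.com/a_2732x2048.jpg",
    "https://example.com/b_2732x2048.jpg",
]

# Whitelist: keep only real http(s) URLs ending in the wallpaper resolution.
pattern = re.compile(r'^https?://.*2732x2048\.jpg$')
valid = [u for u in candidates if pattern.match(u)]
print(valid)

# Guard each download so one failed request cannot kill the whole run:
for url in valid:
    try:
        pass  # e.g. image_data = requests.get(url).content; write it to disk
    except Exception:
        pass  # skip this image and move on to the next one
```

Filtering at the source means the download loop never sees an empty or javascript: link in the first place.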

That concludes the introduction to batch-downloading Onmyoji website wallpapers with Python. Thank you for reading. For more industry-related knowledge, follow the industry information channel, which is updated daily.
