Today I will talk about how to capture simple web page images with Python. Many people may not know much about this, so I have summarized the following content for you and hope you can take something away from this article.
Python: a complete code example of simple web page image capture
Copyright notice: this is an original article by the blogger, licensed under the CC 4.0 BY-SA agreement. Please attach the original source link and this statement when reprinting.
Original article: https://blog.csdn.net/chengxun03/article/details/106321564
This article presents a complete code example of simple web page image capture in Python. It has some reference value, and readers who need it can refer to it.
The steps for capturing web images with Python are:
1. Fetch the source code of the web page from the given URL.
2. Use a regular expression to filter the image addresses out of the source code (a small sketch of this step follows the list).
3. Download the images from the filtered addresses.
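To make step 2 concrete, here is a minimal sketch of the regular-expression filtering on a one-line HTML snippet. The sample snippet is a hypothetical example in the markup style Baidu Tieba used at the time (an img tag whose src ends in .jpg, followed by a pic_ext attribute); the pattern is the same one the full scripts below use.

import re

# Hypothetical one-line sample in the Tieba markup style assumed by this tutorial.
sample_html = '<img class="BDE_Image" src="http://example.com/a.jpg" pic_ext="jpeg">'
pattern = r'src="(.+?\.jpg)" pic_ext'  # same pattern as the scripts below
print(re.findall(pattern, sample_html))  # prints ['http://example.com/a.jpg']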
Below is a relatively simple implementation that grabs the pictures from a Baidu Tieba page:
# -*- coding: utf-8 -*-
# feimengjuan
import re
import urllib
import urllib2

# capture web page images
# fetch the page for the given URL; the returned html is the page source
def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getImg(html):
    # use a regular expression to filter the image addresses out of the source
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)  # addresses of all matching images on the page
    x = 0
    for imgurl in imglist:
        urllib.urlretrieve(imgurl, '%s.jpg' % x)  # download each image and save it locally
        x = x + 1

html = getHtml("http://tieba.baidu.com/p/2460150866")  # page source for the given URL
getImg(html)  # parse the source, then download and save the pictures
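The script above targets Python 2 (urllib and urllib2). As a rough sketch only, not part of the original article, the same three steps under Python 3 could use urllib.request instead; decoding the page as UTF-8 is an assumption about the page's encoding.

# -*- coding: utf-8 -*-
# Python 3 sketch of the same three steps: fetch the page, filter .jpg addresses, download.
import re
import urllib.request

def get_html(url):
    # fetch the page source; decoding as UTF-8 is an assumption here
    with urllib.request.urlopen(url) as page:
        return page.read().decode('utf-8', errors='ignore')

def get_img(html):
    # same regular expression as the Python 2 version above
    imglist = re.findall(r'src="(.+?\.jpg)" pic_ext', html)
    for x, imgurl in enumerate(imglist):
        urllib.request.urlretrieve(imgurl, '%s.jpg' % x)  # download and save locally

html = get_html("http://tieba.baidu.com/p/2460150866")
get_img(html)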
Tidying the code up further, the next version creates a local "pictures" folder to save the images into:
# -*- coding: utf-8 -*-
# feimengjuan
import re
import urllib
import urllib2
import os

# capture web page images
# fetch the page for the given URL; the returned html is the page source
def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html

# create a folder to save the pictures
def mkdir(path):
    path = path.strip()
    # check whether the path already exists
    # True: it exists; False: it does not
    isExists = os.path.exists(path)
    if not isExists:
        print u'Created a new folder named', path
        # create the directory
        os.makedirs(path)
        return True
    else:
        # do not create the directory if it exists; report that it is already there
        print u'A folder named', path, u'already exists'
        return False

# save multiple pictures under the given folder name
def saveImages(imglist, name):
    number = 1
    for imageURL in imglist:
        splitPath = imageURL.split('.')
        fTail = splitPath.pop()
        if len(fTail) > 3:
            fTail = 'jpg'
        fileName = name + "/" + str(number) + "." + fTail
        # download and save each image address
        try:
            u = urllib2.urlopen(imageURL)
            data = u.read()
            f = open(fileName, 'wb+')
            f.write(data)
            print u'Saving image:', fileName
            f.close()
        except urllib2.URLError as e:
            print(e.reason)
        number += 1

# get the addresses of all the pictures on the page
def getAllImg(html):
    # use a regular expression to filter the image addresses out of the source
    reg = r'src="(.+?\.jpg)" pic_ext'
    imgre = re.compile(reg)
    imglist = imgre.findall(html)  # addresses of all matching images on the page
    return imglist

# create a local folder, then download and save the pictures
if __name__ == '__main__':
    html = getHtml("http://tieba.baidu.com/p/2460150866")  # page source for the given URL
    path = u'pictures'
    mkdir(path)  # create the local folder
    imglist = getAllImg(html)  # list of image addresses
    saveImages(imglist, path)  # save the pictures
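The regular expression in getAllImg is tied to the Tieba markup of that time (the trailing pic_ext attribute). As a more general sketch under Python 3, the standard-library html.parser can collect every img src ending in .jpg regardless of attribute order; the class name ImgCollector and the sample snippet are illustrative assumptions, not part of the original article.

# Sketch: collect <img src="...jpg"> addresses with the standard-library HTML parser
# instead of a markup-specific regular expression (Python 3).
from html.parser import HTMLParser

class ImgCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.img_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs on the opening tag
        if tag == 'img':
            for name, value in attrs:
                if name == 'src' and value and value.endswith('.jpg'):
                    self.img_urls.append(value)

collector = ImgCollector()
collector.feed('<img class="BDE_Image" src="http://example.com/a.jpg" pic_ext="jpeg">')
print(collector.img_urls)  # prints ['http://example.com/a.jpg']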
After running the script, dozens of pictures end up saved in the pictures folder.
After reading the above, do you have a better understanding of how to capture simple web page images with Python? Thank you for your support.