How to use Python to crawl the Baidu image site and download pictures in batch

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article shows how to use Python to crawl the Baidu image site and download pictures in batches. The walkthrough is detailed and should serve as a useful reference, so interested readers are encouraged to read to the end.

Less talk, show the code

import os
import re

import requests

keyWord = "Yang Chaoyue"  # theme (search keyword) of the pictures to crawl
number = 10  # number of pictures to crawl

# Pictures are saved in a folder named after the keyword
if not os.path.exists(keyWord):
    os.makedirs(keyWord)

# Paged ("flip") Baidu image search; the keyword goes in the `word` parameter
url = r'http://image.baidu.com/search/flip?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq' \
      r'=1497491098685_R&pv=&ic=0&nc=1&z=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&ie=utf-8&ctd' \
      r'=1497491098685%5E00_1519X735&word=' + keyWord

get = requests.get(url)
# Each picture address sits in an "objURL" field of the page source
picture_url = re.findall(r'objURL":"(.*?)",', get.text)

a = 0  # number of pictures attempted so far
for i in picture_url:
    p_type = i.split('.')[-1]  # file extension, taken from the end of the link
    a += 1
    try:
        picture = requests.get(i, timeout=10)
        name = "%s/%s_%d.%s" % (keyWord, keyWord, a, p_type)
        with open(name, 'wb') as f:
            f.write(picture.content)
        print('Picture %d is downloading' % a)
    except (requests.RequestException, OSError):
        # Network errors and unwritable file names are skipped, not fatal
        print('Picture %d download failed! Skipped.' % a)
    if a >= number:
        break

The logic is straightforward: fetch the page source of the Baidu image search results, extract the address of each picture from that source, and then loop over the addresses to save each picture.
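The link-extraction step can be tried offline on a small sample string before touching the network. The following is only a minimal sketch, assuming the page source embeds each address as an "objURL" field followed by a comma, as the regular expression in the article expects. The sample HTML fragment and the helper names extract_picture_urls and file_extension are made up for illustration and are not part of the original code.

```python
import re

# Hypothetical fragment standing in for the downloaded page source.
SAMPLE_HTML = (
    '{"thumbURL":"http://example.com/t1.jpg","objURL":"http://example.com/a.jpg",'
    '"fromURL":"x"},{"objURL":"http://example.com/b.png","fromURL":"y"}'
)


def extract_picture_urls(html):
    """Return every address found in an "objURL" field, in page order."""
    return re.findall(r'"objURL":"(.*?)",', html)


def file_extension(url):
    """Mirror the article's approach: take the text after the last dot."""
    return url.split('.')[-1]


if __name__ == "__main__":
    for link in extract_picture_urls(SAMPLE_HTML):
        print(link, file_extension(link))
```

Taking the text after the last dot is simple but fragile: a link that ends in a query string would produce an odd extension, so a real crawler may want to validate it before building the file name.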

You can copy this code directly into a Python editor. Just set the keyWord and number variables to the theme and number of pictures you want to crawl, then run it to download them. The result of the crawl is shown below:

Advanced version of the crawler code

The code above can only crawl one page, because it only extracts the picture links from a single URL. To crawl a large number of pictures, you also need to extract the link to the next page of results. The core code is as follows. If you need the full version of the code, you can reply [line 01] in the background to get all of it.

def get_url_one_page(url):
    html = requests.get(url)
    html.encoding = 'utf-8'
    html = html.text
    # Picture links on the current page
    url_pic_this_page = re.findall(r'"objURL":"(.*?)",', html)
    # Assumed pattern: the next-page link is an <a ... class="n"> anchor labelled 下一页 ("next page")
    url_next_page_prefix = re.findall(r'<a href="(.*?)" class="n">下一页', html)
    if len(url_next_page_prefix) != 0:
        url_next_page = 'http://image.baidu.com' + url_next_page_prefix[0]
    else:
        print("Last page reached!")
        url_next_page = None
    return url_pic_this_page, url_next_page
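The full version of the code is not reproduced in this article, so the following is only a sketch of how a function like get_url_one_page might be driven page by page until enough links have been gathered. The names collect_picture_urls, FAKE_PAGES and fake_get_page are hypothetical. The page getter is passed in as a parameter so that the loop can be exercised here against an in-memory stand-in instead of the real site.

```python
def collect_picture_urls(start_url, limit, get_page):
    """Follow next-page links until `limit` picture links are collected.

    `get_page` is any callable that takes a page URL and returns
    (picture_links_on_that_page, next_page_url_or_None).
    """
    collected = []
    visited = set()  # guards against a next-page link that loops back
    page_url = start_url
    while (page_url is not None
           and page_url not in visited
           and len(collected) < limit):
        visited.add(page_url)
        links, page_url = get_page(page_url)
        collected.extend(links)
    # The last page fetched may overshoot, so trim to the requested count.
    return collected[:limit]


# Hypothetical in-memory "site" so the sketch runs without network access.
FAKE_PAGES = {
    'page1': (['a.jpg', 'b.jpg'], 'page2'),
    'page2': (['c.jpg', 'd.jpg'], 'page3'),
    'page3': (['e.jpg'], None),
}


def fake_get_page(url):
    """Stand-in with the same return shape as get_url_one_page."""
    return FAKE_PAGES[url]


if __name__ == "__main__":
    print(collect_picture_urls('page1', 3, fake_get_page))
```

With the real crawler, the call would presumably look like collect_picture_urls(url, number, get_url_one_page), after which each collected link can be downloaded with the same loop as in the single-page version.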

The crawler can be used without code

After the earlier crawler posts, many readers said that they had never worked with Python or crawlers: they wanted the functionality but could not understand the code. So Brother Xing has converted this Python code into an exe that can be used directly, as shown in the following figure:

Enter the theme of the pictures you want to crawl in the crawl keyword field, enter the number of pictures to crawl, select the path where the pictures should be saved, and click to start crawling. Then you just need to wait (if the network is fast enough, it generally downloads about one picture per second). The final download result is as follows:

That is all for "How to use Python to crawl the Baidu image site and download pictures in batch". Thank you for reading, and I hope it helps!
