How to Use a Python Crawler to Download Music from a Website

In this article I would like to share how to use a Python crawler to download music from a website. Many readers may not be familiar with the topic, so I am sharing this walkthrough for your reference; I hope you learn something useful from it.
Implementation
1. Import the third-party library used to send network requests
import requests  # third-party library for sending network requests
Installation:
pip install requests
2. Import the third-party library used for data parsing
from lxml import etree  # third-party library for parsing data
Installation:
pip install lxml
3. The toplist URL of the cloud music site is 'https://music.163.com/#/discover/toplist?id=3778678'.
url = 'https://music.163.com/#/discover/toplist?id=3778678'
4. Send a request to get page data
response = requests.get(url=url)  # request the page data
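Before going further, a quick sanity check (my own addition, not part of the original tutorial) confirms the request succeeded and shows what the server actually returned:

import requests

url = 'https://music.163.com/#/discover/toplist?id=3778678'
response = requests.get(url=url)
response.raise_for_status()   # raises an HTTPError on a 4xx/5xx response
print(response.status_code)   # expect 200
print(response.text[:200])    # peek at the beginning of the returned page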
5. Parse the data
html = etree.HTML(response.text)  # parse the page data
6. Get the collection of all song tags (a tags)
id_list = html.xpath('//a[contains(@href, "song?")]')  # collection of all song <a> tags
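Before downloading anything, it is worth printing a few of the matched anchors to see what the xpath really found (a debugging sketch of my own, which becomes relevant in the pitfalls below):

import requests
from lxml import etree

url = 'https://music.163.com/#/discover/toplist?id=3778678'
html = etree.HTML(requests.get(url=url).text)
id_list = html.xpath('//a[contains(@href, "song?")]')
for a in id_list[:5]:  # inspect the first few matches
    print(a.xpath('./@href')[0], a.xpath('./text()'))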
7. Download songs
base_url = 'http://music.163.com/song/media/outer/url?id='  # download-URL prefix
# download URL = prefix + music id
for data in id_list:
    href = data.xpath('./@href')[0]
    music_id = href.split('=')[1]            # music id
    music_url = base_url + music_id          # music download URL
    music_name = data.xpath('./text()')[0]   # music name
    music = requests.get(url=music_url)
    # save the downloaded music as a file
    with open('./music/%s.mp3' % music_name, 'wb') as file:
        file.write(music.content)
    print('%s downloaded successfully' % music_name)
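The loop assumes the ./music directory already exists and that every matched anchor carries both an id in its href and a text node. A slightly more defensive version (my own hardening, not from the original) could look like this:

import os
import requests
from lxml import etree

base_url = 'http://music.163.com/song/media/outer/url?id='
url = 'https://music.163.com/#/discover/toplist?id=3778678'
html = etree.HTML(requests.get(url=url).text)

os.makedirs('./music', exist_ok=True)  # make sure the target folder exists
for a in html.xpath('//a[contains(@href, "song?")]'):
    href = a.xpath('./@href')
    name = a.xpath('./text()')
    if not href or not name or '=' not in href[0]:
        continue                       # skip anchors without a usable id or title
    music_id = href[0].split('=')[1]
    music = requests.get(url=base_url + music_id)
    # note: odd characters in a song title may need sanitizing for the filename
    with open('./music/%s.mp3' % name[0], 'wb') as file:
        file.write(music.content)
    print('%s downloaded' % name[0])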
Pitfalls I ran into
I learned the method above from a video that was published about half a year ago. It probably worked fine back then, but when I tried to download music files this way today, it suddenly failed.
First, extracting music_name and music_id failed. Looking more closely, the href values in the collected id_list (that is, the collection of <a> tags) were not real ids at all, only code; my guess was that the music site had added an anti-scraping mechanism here.
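There may be a simpler explanation than anti-scraping (this is my own guess, not something the original investigates): the part of a URL after '#' is never sent to the server, so requests actually fetched the site's outer shell page, whose song list is filled in by JavaScript in a browser. A sketch worth trying is to request the list page without the fragment:

import requests
from lxml import etree

# Assumption: the toplist is also served directly at the same path
# without the '#', which is what a browser's embedded frame loads.
url = 'https://music.163.com/discover/toplist?id=3778678'
html = etree.HTML(requests.get(url=url).text)
print(len(html.xpath('//a[contains(@href, "song?")]')))  # should be > 0 if the list is present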
Second, I picked a song on the website, looked up its id by hand, and assigned it to music_id. But when I used the outer link to download it, the response said the network was too crowded; apparently this download URL no longer works.
base_url = 'http://music.163.com/song/media/outer/url?id='
music_id = '1804320463.mp3'
music_url = base_url + music_id
music = requests.get(url=music_url)
print(music.text)
{"msg": "the network is too crowded, please try again later!" , "code":-460, "message": "the network is too crowded, please try again later!" }
Finally, I printed out music_url, clicked it, and found I could play and download the music in the browser just fine. I don't know why.
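One plausible explanation (my assumption; the article leaves the question open) is that the browser identifies itself with a User-Agent header, while requests announces itself as python-requests, which the server may reject. Sending browser-like headers is worth a try:

import requests

base_url = 'http://music.163.com/song/media/outer/url?id='
music_url = base_url + '1804320463.mp3'
headers = {
    # a browser-like User-Agent string; any current browser value should do
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
}
music = requests.get(url=music_url, headers=headers)
print(music.status_code, music.headers.get('Content-Type'))  # an audio/* type suggests success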
In the end I left the download line commented out and simply printed the URL to open by hand:

base_url = 'http://music.163.com/song/media/outer/url?id='
music_id = '1804320463.mp3'
music_url = base_url + music_id
# music = requests.get(url=music_url)
print(music_url)

That is everything in this article on how to use a Python crawler to download music from a website. Thank you for reading! I hope this walkthrough has given you a clear picture of the approach and of the pitfalls you may run into.