2025-04-09 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article shows how to use a Python crawler to call Baidu Translate. The content is straightforward and clearly organized; I hope it helps resolve your doubts as we study how to call Baidu Translate from Python.
1. Analyzing the Baidu Translate web page
First, open Baidu Translate in your browser.
Then press F12 to open the developer tools and click the Network tab.
After some analysis, we find that the real URL Baidu Translate POSTs to is Request URL: https://fanyi.baidu.com/sug, and that the form data contains a key-value pair kw: day.
This preliminary analysis gives us a general idea: we POST some data to this URL, and it returns a value to us (as we will see later, the data comes back in JSON format).
2. Writing the code
1. First, import the required libraries and define the URL and the word to translate (here taken from user input):
from urllib import request, parse
import json

baseurl = "https://fanyi.baidu.com/sug"
word = input("Please enter the word you want to translate: ")
2. From the analysis above, we know that the value we pass (the word to translate) is sent as a key-value pair, so we can define it with a Python dictionary.
# the data we need to send
datas = {
    'kw': word
}
3. Next, encode datas with parse.urlencode: urlencode turns the dictionary into a query string (a str), and .encode() converts that to bytes, which is what urlopen expects for POST data. Without this encoding step we would get an error later!
# encode the data
data = parse.urlencode(datas).encode()
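As a quick sanity check of this encoding step, here is an offline sketch using a hypothetical word "day" in place of user input; it shows that urlencode produces a str and .encode() turns it into the bytes that urlopen expects:

```python
from urllib import parse

# hypothetical payload, standing in for the user's input
datas = {'kw': 'day'}

encoded_str = parse.urlencode(datas)   # 'kw=day' (a str)
data = encoded_str.encode()            # b'kw=day' (bytes, ready to POST)

print(type(encoded_str).__name__)  # str
print(type(data).__name__)         # bytes
```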
4. Next we write the headers for the request to the Baidu Translate site, simulating a browser. For this request we only need the length of the data we are sending; no other header parameters are required (urllib actually fills in Content-Length automatically when data is supplied, so this header is mostly illustrative).
# write the HTTP headers; at minimum Content-Length
headers = {
    # length of the encoded data
    'Content-Length': len(data)
}
5. With the data (the word) and the headers prepared, the most important step is to send them to the Baidu Translate site.
# send the data
req = request.Request(url=baseurl, data=data, headers=headers)
res = request.urlopen(req)
We first build a request.Request object from the URL baseurl, the POST body data, and the headers. We then pass the req object to request.urlopen, which sends the request and returns the response.
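To see what the Request object carries before any network call is made, here is a small offline sketch with a fixed payload b'kw=day' in place of user input. Supplying data makes urllib treat the request as a POST:

```python
from urllib import request

# build the same kind of request offline, with a fixed payload
req = request.Request(url="https://fanyi.baidu.com/sug",
                      data=b'kw=day',
                      headers={'Content-Length': 6})

print(req.get_method())   # POST, because data was supplied
print(req.full_url)       # https://fanyi.baidu.com/sug
print(req.data)           # b'kw=day'
```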
6. At this point, the POST is done, and res is the response object returned to us. We read the response with the read method, decode the raw bytes with decode (giving a JSON-formatted string), and finally parse that string with json.loads.
json_data = res.read()
json_data = json_data.decode()
json_data = json.loads(json_data)
Let's print the json_data.
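For illustration, here is the parsing step applied to a made-up JSON string shaped like the response the sug endpoint is assumed to return (an errno field plus a data list of k/v pairs); the real response will contain actual dictionary entries for the queried word:

```python
import json

# hypothetical response body, shaped like the assumed sug output
sample = '{"errno": 0, "data": [{"k": "day", "v": "n. daytime; a 24-hour period"}]}'

json_data = json.loads(sample)
print(json_data['errno'])          # 0
print(json_data['data'][0]['k'])   # day
```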
7. The last step is to extract what the user wants to see. Our analysis shows that the value of data in this JSON is a list, so once we extract it we can process it like any Python list!
data_list = json_data.get('data')
for item in data_list:
    print(item['k'], '---', item['v'])
The final result:
Complete code
"""
Use a crawler to call Baidu Translate - power: IT Resource Jun
"""
from urllib import request, parse
import json

if __name__ == '__main__':
    baseurl = "https://fanyi.baidu.com/sug"
    word = input("Please enter the word you want to translate: ")
    # the data we need to send
    datas = {
        'kw': word
    }
    # encode the data
    data = parse.urlencode(datas).encode()
    # write the HTTP headers; at minimum Content-Length
    headers = {
        # length of the encoded data
        'Content-Length': len(data)
    }
    # send the data
    req = request.Request(url=baseurl, data=data, headers=headers)
    res = request.urlopen(req)
    json_data = res.read()
    json_data = json_data.decode()
    json_data = json.loads(json_data)
    # data contains a list
    data_list = json_data.get('data')
    for item in data_list:
        print(item['k'], '---', item['v'])

That is all of "How to Use a Python Crawler to Call Baidu Translate". Thank you for reading! I hope this article has helped you; if you want to learn more, you are welcome to follow the industry information channel!
© 2024 shulou.com SLNews company. All rights reserved.