2025-02-27 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 05/31 Report
This article introduces how to handle exceptions in urllib when writing a Python crawler. It walks through the process with a practical example, and the method is simple and quick to apply. I hope it helps you solve the problem.
Exception handling in urllib
When we write a crawler, an error in the URL means we cannot crawl the content we want. To deal with this, we use exception handling in urllib.
Components of a URL
A URL consists of six parts. For example:
https://www.baidu.com/s?wd=Yi Yi Qianxi
Protocol (http/https)
Host (www.baidu.com)
Port number (80 for http, 443 for https)
Path (s)
Parameters (wd=Yi Yi Qianxi)
Anchor
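The six parts listed above can be inspected with the standard library's urllib.parse.urlparse. The following is a minimal sketch: the URL is a hypothetical one (not the article's) written with an explicit port and anchor so that every part has a value.

```python
from urllib.parse import urlparse

# Hypothetical URL that spells out all six parts; it is not taken from the article.
url = "https://www.baidu.com:443/s?wd=python#top"
parts = urlparse(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # host
print(parts.port)      # port number (None when the URL does not state one)
print(parts.path)      # path
print(parts.query)     # parameters
print(parts.fragment)  # anchor
```

When the port is left out of the URL, parts.port is None and the client falls back to the default port for the protocol.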
Common port numbers:
http (80), https (443), mysql (3306), oracle (1521), redis (6379), mongodb (27017)
URLError
Generally speaking, a URLError is caused by a mistake in the host part of the URL:
Example:
url = 'https://www.baidu.com1/'
Running result:
urllib.error.URLError:
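One way to keep the crawler from stopping on this error is to wrap the request in try/except. The sketch below is an illustration rather than the article's own code: the helper name fetch, its timeout default, and its return shape are assumptions made for this example.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=5):
    """Hypothetical helper: return (body, None) on success or (None, reason) on URLError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace"), None
    except urllib.error.URLError as exc:
        # exc.reason describes what went wrong, for example a failed host lookup.
        return None, exc.reason
```

Calling fetch('https://www.baidu.com1/') would then be expected to return None together with a reason instead of ending the program with a traceback, so the crawler can log the failure and move on to the next URL.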