Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What is the method of formatting web page text by python?

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/03 Report--

This article mainly explains "what is the method of formatting web page text by python". The content in the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn "what is the method of python formatting web page text".

1. A web page usually contains text information. For different text types, we can choose the appropriate HTML semantic elements for tagging.

2. The em element is used to mark and emphasize parts of the content, and the small element is used for comments and signed text.

The theory of examples comes first in pragmatics.

Author: Confucius 1 (September 28, 551-April 11, 479)

This quotation

"Learning and" is the title of the first article in the Analects of Confucius. The first two or three words of the first chapter are generally used as the title of each article in the Analects of Confucius. "Xueer" consists of 16 chapters, covering many aspects. The key points are "three self-examination of my day", "economical use and love, so that the people can keep up with the time", "the use of propriety, harmony is precious" and moral categories such as benevolence, filial piety and faith.

Original text

The Master said, "Don't you also say that you learn and practice it from time to time? Isn't it a pleasure to have friends coming from afar? Isn't it a gentleman if you don't know it, but don't you think it's a gentleman? "

Knowledge point expansion:

Transformation between Python int and string

String- > int

1, decimal string converted to int

Int ('12')

2. Convert hexadecimal string to int

Int ('12, 16)

Int- > string

1. Convert int to decimal string

Str (18)

2. Convert int to hexadecimal string

Hex (18)

2. Since when Lianjia selected the second page online, he only added a "D2" at the back of the page, such as http://sh.lianjia.com/ershoufang/pudong/d2, so if you want to crawl more pages, you only need to update the requests page URL in a loop.

3. After adding a loop, you can print all the crawl results

From lxml import etreeimport requestsimport stringurl = 'http://sh.lianjia.com/ershoufang/'region =' pudong'price = 'p23'finalURL = url+region+pricedef spider_room (finallyURL): r = requests.get (finallyURL) html = requests.get (finalURL) .content.decode (' utf-8') dom_tree = etree.HTML (html) # all the messages all_message = dom_tree.xpath ("/ / ul [@ class='js_fang_list'] / li") for index in Range (len (all_message)): print (all_ message [index] .XPath ('string (.)'). Strip () returnfor i in range (20): finallyURL = finalURL +'/ d'+str (I) spider_room (finallyURL)

4. 20 pages of content have been crawled, but the form of the output of the content has not changed.

Thank you for your reading, the above is the content of "what is the method of formatting web page text by python". After the study of this article, I believe you have a deeper understanding of what the method of formatting web page text by python is, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report