How to use pyquery in Python

2025-02-27 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/03

This article explains how to use pyquery in Python. It is intended as a reference for readers who are not yet familiar with the library.

I. Introduction to pyquery

pyquery is a Python library for parsing HTML and XML with a jQuery-like API. Using it requires some basic web knowledge, in particular CSS selectors, and some familiarity with jQuery.

II. Using pyquery

1. Initialization

There are several ways to initialize pyquery. The argument passed in can be a string, a URL, or a file name. Each initialization method is described below.

String

    from pyquery import PyQuery as pq

    html = '''
    <html><head><title>test02.html</title></head><body></body></html>
    '''
    doc = pq(html)
    print(doc('title'))

[running result]

    <title>test02.html</title>

URL

This example uses the CSDN home page as the URL:

    from pyquery import PyQuery as pq

    doc = pq(url='https://www.csdn.net/')
    print(doc('title'))

[running result]

CSDN-Professional developer Community

File initialization

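We save the following content as a file named test02.html and initialize pyquery from that file.

[test02.html]

    <bookstore>
      <book>
        <title lang="eng">Harry Potter</title>
        <price>29.99</price>
      </book>
      <book>
        <title lang="eng">Learning XML</title>
        <price>39.95</price>
      </book>
    </bookstore>

Then initialize from the file:

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    print(doc('title'))

[running result]

    <title lang="eng">Harry Potter</title>
    <title lang="eng">Learning XML</title>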

2. Finding nodes

(1) Finding child nodes

Use the find() method to look up nodes below the current selection. Its argument is a CSS selector.

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    item = doc('book')
    print(item)
    lis1 = item.find('title')
    lis2 = item.find('price')
    print(lis1)
    print(lis2)

[running result]

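    <book>
      <title lang="eng">Harry Potter</title>
      <price>29.99</price>
    </book>
    <book>
      <title lang="eng">Learning XML</title>
      <price>39.95</price>
    </book>
    <title lang="eng">Harry Potter</title>
    <title lang="eng">Learning XML</title>
    <price>29.99</price>
    <price>39.95</price>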

As you can see, we first match the book nodes, and then match the title and price nodes under them.

Strictly speaking, find() matches all descendant nodes, not only direct children. To match only the direct child nodes, use the children() method.

(2) Matching the parent node

Use the parent() method to match the direct parent node. To match ancestor nodes, use the parents() method.

(3) Matching sibling nodes

Use the siblings() method.

3. Traversing

If the selection contains a single node, it can be converted to a string directly. If it contains multiple nodes, call the items() method and iterate over the result: items() returns a generator that yields each node as its own PyQuery object.

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    items = doc('title').items()
    print(items)
    print(type(items))
    for i in items:
        print(type(i))
        print(i)

[running result]

    <generator object PyQuery.items at 0x...>
    <class 'generator'>
    <class 'pyquery.pyquery.PyQuery'>
    <title lang="eng">Harry Potter</title>
    <class 'pyquery.pyquery.PyQuery'>
    <title lang="eng">Learning XML</title>

4. Getting information

(1) Getting attributes

Use the attr() method.

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    items = doc('title')
    for i in items.items():
        print(i.attr('lang'))

[running result]

    eng
    eng

By traversing the selection, you can get the lang attribute value of every title node.

(2) Getting the text

Use the text() method.

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    items = doc('title')
    for i in items.items():
        print(i.text())

[running result]

Harry Potter

Learning XML

Again, the selection is traversed to get the text of each title node.

5. Node operations

(1) Adding or removing a class on a node

Use the addClass() and removeClass() methods.

    from pyquery import PyQuery as pq

    doc = pq(filename='test02.html')
    items = doc('title')
    for i in items.items():
        print(i)
        i.addClass('book01')
        print(i)
        i.removeClass('book01')
        print(i)

[running result]

Harry Potter

Harry Potter

Harry Potter

Learning XML

Learning XML

Learning XML

As you can see, each title node is printed three times: first in its original form, then after the class book01 is added, and finally after that class is removed again.

(2) attr, text, html

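attr(): gets an attribute value, or sets it when a new value is passed in

text(): gets the text of a node, or replaces it when a new value is passed in

html(): gets the inner markup of a node, or replaces it when a new value is passed in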

(3) remove

The remove() method deletes the matched nodes entirely, including their contents. It is useful for stripping unwanted nodes before extracting text.

6. Pseudo-class selectors

pyquery supports a variety of pseudo-class selectors, for example to select the first node, the last node, odd-numbered nodes, even-numbered nodes, or nodes that contain a given text.

That covers the basics of using pyquery in Python. Thank you for reading; we hope this article helps you.
