2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail how to use XPath. The editor finds it very practical and shares it here as a reference. I hope you get something out of it.
1 The use of XPath
You need to install the lxml library before using it.
Installation command: pip3 install lxml
1.1 Common XPath rules:
/ selects direct child nodes
// selects all descendant nodes
. selects the current node
.. selects the parent node of the current node
@ selects attributes
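The rules above can be tried one at a time. The following is a minimal sketch, assuming lxml is installed; the markup and the variable names are hypothetical and chosen only for illustration.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
html = etree.HTML('<ul id="menu"><li><a href="link1.html">one</a></li></ul>')

direct = html.xpath('//ul/li')      # /  : li is a direct child of ul
descendants = html.xpath('//ul//a') # // : a is a descendant of ul
not_direct = html.xpath('//ul/a')   # empty, a is not a direct child of ul

a = html.xpath('//a')[0]
current = a.xpath('.')              # .  : the current node (a itself)
parent = a.xpath('..')              # .. : the parent of a, which is li
hrefs = html.xpath('//a/@href')     # @  : the href attribute values

print(direct, descendants, not_direct)
print(current, parent, hrefs)
```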
Still confused after reading the rules? Let's put them into practice.
1.2 Example
As shown in the figure:
Import the etree module.
etree.HTML() parses an HTML string and returns an object that XPath queries can be run against.
etree.tostring() outputs the corrected HTML. If part of the markup is missing, such as a closing tag, it is repaired automatically.
The method is relatively simple, so no screenshot of the result is included.
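The steps described here can be sketched as follows. The HTML string is a hypothetical sample whose closing li and ul tags are left out on purpose, so the automatic repair is visible in the output.

```python
from lxml import etree

# Hypothetical sample text; the final </li> and </ul> are deliberately missing.
text = (
    '<ul>'
    '<li class="li_1"><a href="link1.html">first item</a></li>'
    '<li class="li_2"><a href="link2.html">second item</a>'
)

html = etree.HTML(text)        # parse the string into an element tree
result = etree.tostring(html)  # serialize the repaired tree (returns bytes)
print(result.decode('utf-8'))
```

The printed markup is wrapped in html and body tags, and the missing closing tags are filled in.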
What if we want to parse a local file instead? We can write it like this.
The first parameter of etree.parse() is the path of the HTML file, and the second, etree.HTMLParser(), makes it parse the file the same way etree.HTML() does above. For convenience, I will parse the local file from here on.
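A minimal sketch of parsing a local file is shown below. The file name test.html and its content are assumptions. The sketch writes the file into a temporary directory first so that it runs on its own; with a real file you would pass its path directly.

```python
import os
import tempfile

from lxml import etree

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name and content, created only for this example.
    path = os.path.join(tmp_dir, 'test.html')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('<ul><li><a href="link1.html">first item</a></li></ul>')

    # First argument: path of the HTML file. Second: the HTML parser.
    tree = etree.parse(path, etree.HTMLParser())
    result = etree.tostring(tree).decode('utf-8')
    links = tree.xpath('//a/@href')

print(result)
print(links)
```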
The html text is as follows:
1.3 Get all nodes
Results:
An expression starting with // selects all matching nodes, and * matches every node.
Don't those two statements mean the same thing? They can be understood in two steps:
First, // selects all nodes that meet the requirement, but on its own it does not say what the requirement is.
Second, * stands for any node, so //* gets every node.
Note: a list is returned.
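As a rough illustration of //*, the sketch below runs it against a hypothetical HTML sample (not the article's original text) and prints the tag name of every node found.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<div>
  <ul>
    <li class="li_1"><a href="link1.html">first item</a></li>
    <li class="li_2"><a href="link2.html">second item</a></li>
    <li class="li_3"><a href="link3.html">third item</a></li>
  </ul>
</div>
"""

html = etree.HTML(HTML_TEXT)
result = html.xpath('//*')          # every element node in the document
tags = [el.tag for el in result]    # tag names, in document order
print(type(result))
print(tags)
```

The list also contains html and body, because etree.HTML() wraps the fragment in a full document.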
1.4 Get the specified node
Using the same HTML text as above, what if we want to get the li nodes?
You just need to change result_text=html.xpath('//*') to result_text=html.xpath('//li').
If you want to get the a nodes, change it to //a. Writing it as //li//a, //ul//a, or //li/a also works.
However, //ul/a returns nothing, because / selects only direct child nodes and a is not a direct child of ul.
Note: what is returned is a list of nodes, not text.
That is, each item is an Element object rather than a string.
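The difference between these expressions can be checked with a short sketch. The markup is a hypothetical sample; the variable name result_text follows the article, the other names are illustrative.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li class="li_1"><a href="link1.html">first item</a></li>
  <li class="li_2"><a href="link2.html">second item</a></li>
  <li class="li_3"><a href="link3.html">third item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
result_text = html.xpath('//li')       # all li nodes
a_under_li = html.xpath('//li/a')      # a is a direct child of li
a_under_ul = html.xpath('//ul//a')     # a is a descendant of ul
a_child_of_ul = html.xpath('//ul/a')   # empty: a is not a direct child of ul

print(result_text)    # a list of Element objects, not strings
print(len(a_under_li), len(a_under_ul), len(a_child_of_ul))
```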
1.4 Attribute matching
If we want the href attribute of the a tags, we can change it to //a/@href.
The result is also returned as a list.
If we want to match the li whose class is li_1, we can change it to //li[@class="li_1"].
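Both uses of @, reading an attribute and filtering on one, can be sketched as follows. The markup is a hypothetical sample.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li class="li_1"><a href="link1.html">first item</a></li>
  <li class="li_2"><a href="link2.html">second item</a></li>
  <li class="li_3"><a href="link3.html">third item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
hrefs = html.xpath('//a/@href')             # attribute values as strings
li_1 = html.xpath('//li[@class="li_1"]')    # li nodes filtered by class

print(hrefs)   # ['link1.html', 'link2.html', 'link3.html']
print(li_1)
```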
1.5 Parent node matching
Let's get the class attribute of the parent of the a node whose href is link2.html. The expression becomes //a[@href="link2.html"]/../@class. Here .. selects the parent node, and a list is still returned.
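A minimal sketch of the parent lookup, using a hypothetical sample in which the a node with href link2.html sits inside a li whose class is li_2:

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li class="li_1"><a href="link1.html">first item</a></li>
  <li class="li_2"><a href="link2.html">second item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
# Find the a node, step up to its parent with .., then read @class.
parent_class = html.xpath('//a[@href="link2.html"]/../@class')
print(parent_class)   # ['li_2']
```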
1.6 Get text
Let's get the text of the a node under the li whose class is li_3, which can be written as //li[@class="li_3"]/a/text().
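The text() step can be sketched like this. The markup and the link text are hypothetical.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li class="li_2"><a href="link2.html">second item</a></li>
  <li class="li_3"><a href="link3.html">third item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
# text() returns the text content of the matched a node as a list of strings.
text = html.xpath('//li[@class="li_3"]/a/text()')
print(text)   # ['third item']
```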
1.7 contains() function
For example, one of the li is:
At this point the li has two class names. If we write //li[@class="li"], we will not get the node, because @class="li" must match the whole attribute value.
Instead, we can get the node with //li[contains(@class, "li")].
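The sketch below compares the exact match with contains(). The li with two class names is a hypothetical stand-in for the one the article refers to; the second class name li-first is an assumption.

```python
from lxml import etree

# Hypothetical sample markup; "li li-first" stands in for the article's
# li element that carries two class names.
HTML_TEXT = """
<ul>
  <li class="li li-first" id="caidan"><a href="link1.html">first item</a></li>
  <li class="item"><a href="link2.html">second item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
exact = html.xpath('//li[@class="li"]')             # compares the whole value
partial = html.xpath('//li[contains(@class, "li")]')  # substring match

print(exact)     # []
print(partial)
```

Note that contains() does a plain substring match on the attribute value, so a class such as li_1 would also match contains(@class, "li").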
1.8 Multi-attribute matching
For the same li, suppose we need the li whose class contains li and whose id is caidan. We can write //li[contains(@class, "li") and @id="caidan"].
To get a li whose class contains li or whose id is caidan, use or instead of and.
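Combining conditions with and and or can be sketched as follows. The three li elements are hypothetical and chosen so that the two operators give different results.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li class="li li-first" id="caidan">first item</li>
  <li class="li li-second">second item</li>
  <li class="item" id="other">third item</li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
# Both conditions must hold: only the first li matches.
both = html.xpath('//li[contains(@class, "li") and @id="caidan"]')
# Either condition may hold: the first and second li match.
either = html.xpath('//li[contains(@class, "li") or @id="caidan"]')

print(len(both), len(either))   # 1 2
```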
1.9 last() and position() functions
The HTML above has many li nodes. If I just want the first one, I can write:
//li[1]. Similarly, change the index to 2 for the second one. To get the last one: //li[last()]
If you want to get the first two: //li[position()
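A minimal sketch of positional selection is shown below, on hypothetical markup. The expression for the first two nodes is cut off in the text above, so position()<3 is used here as an assumption; it is one common way to select the first two.

```python
from lxml import etree

# Hypothetical sample markup, invented for this illustration.
HTML_TEXT = """
<ul>
  <li><a href="link1.html">first item</a></li>
  <li><a href="link2.html">second item</a></li>
  <li><a href="link3.html">third item</a></li>
</ul>
"""

html = etree.HTML(HTML_TEXT)
first = html.xpath('//li[1]/a/text()')          # XPath positions start at 1
last = html.xpath('//li[last()]/a/text()')      # the last li
# Assumption: position()<3 stands in for the truncated expression.
first_two = html.xpath('//li[position()<3]/a/text()')

print(first)      # ['first item']
print(last)       # ['third item']
print(first_two)  # ['first item', 'second item']
```

Positions are counted among siblings under the same parent, so with several ul elements //li[1] would return the first li of each one.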