XPath syntax (URL):
| Syntax | Meaning |
|---|---|
| tag | Selects all child elements with the given tag. For example, spam selects all child elements named spam, and spam/egg selects all grandchildren named egg in all children named spam. |
| * | Selects all child elements. For example, */egg selects all grandchildren named egg. |
| . | Selects the current node. This is mostly useful at the beginning of the path, to indicate that it’s a relative path. |
| // | Selects all subelements, on all levels beneath the current element. For example, .//egg selects all egg elements in the entire tree. |
| .. | Selects the parent element. |
| [@attrib] | Selects all elements that have the given attribute. |
| [@attrib='value'] | Selects all elements for which the given attribute has the given value. The value cannot contain quotes. |
| [tag] | Selects all elements that have a child named tag. Only immediate children are supported. |
| [tag='text'] | Selects all elements that have a child named tag whose complete text content, including descendants, equals the given text. |
| [position] | Selects all elements that are located at the given position. The position can be either an integer (1 is the first position), the expression last() (for the last position), or a position relative to the last position (e.g. last()-1). |
Predicates (expressions within square brackets) must be preceded by a tag name, an asterisk, or another predicate. Position predicates must be preceded by a tag name.
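The path syntax and predicate rules above can be tried out with a short sketch. The `doc` string and the `texts` helper are invented here for illustration; only `ET.fromstring` and `findall` come from `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
doc = """
<menu>
  <spam id="a"><egg>1</egg><egg>2</egg></spam>
  <spam><egg>3</egg><ham>x</ham></spam>
  <toast kind="rye"><egg>4</egg></toast>
</menu>
"""
root = ET.fromstring(doc)


def texts(path):
    """Return the text of every element matched by path, in document order."""
    return [el.text for el in root.findall(path)]


spam_eggs = texts("spam/egg")           # egg grandchildren under spam children
grandchild_eggs = texts("*/egg")        # egg grandchildren under any child
every_egg = texts(".//egg")             # egg elements at any depth
with_id = root.findall("spam[@id]")     # spam children that carry an id attribute
rye = root.findall("*[@kind='rye']")    # children whose kind attribute equals 'rye'
with_ham = root.findall("spam[ham]")    # spam children that have a ham child
ham_is_x = root.findall("spam[ham='x']")  # spam children whose ham child has text 'x'
second_egg = texts("spam/egg[2]")       # the second egg within each spam
last_egg = texts("spam/egg[last()]")    # the last egg within each spam
```

Each predicate follows a tag name or an asterisk, as the rule above requires, and the position predicates (`[2]`, `[last()]`) follow a tag name.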
XPath usage example in Python:
Here’s an example that demonstrates some of the XPath capabilities of the module. We’ll be using the countrydata XML document from the Parsing XML section:
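A minimal sketch of such queries follows. The `countrydata` string below is an abbreviated stand-in whose shape is assumed; it is not the full document from the Parsing XML section.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the countrydata document (assumed shape).
countrydata = """
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""

root = ET.fromstring(countrydata)

# Top-level element
top = root.findall(".")

# All 'neighbor' grandchildren under 'country' children of the top-level element
neighbors = root.findall("./country/neighbor")

# Nodes with name='Singapore' that have a 'year' child
singapore = root.findall(".//year/..[@name='Singapore']")

# 'year' nodes that are children of nodes with name='Singapore'
singapore_years = root.findall(".//*[@name='Singapore']/year")

# 'neighbor' nodes that are the second neighbor within their parent
second_neighbors = root.findall(".//neighbor[2]")
```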
The example above works on XML text. To apply it to HTML text, change import xml.etree.ElementTree as ET to import lxml.html, and change root = ET.fromstring(countrydata) to root = lxml.html.fromstring(HTMLData).
XPath Axes
An axis defines a node-set relative to the current node.
| AxisName | Result |
|---|---|
| ancestor | Selects all ancestors (parent, grandparent, etc.) of the current node |
| ancestor-or-self | Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself |
| attribute | Selects all attributes of the current node |
| child | Selects all children of the current node |
| descendant | Selects all descendants (children, grandchildren, etc.) of the current node |
| descendant-or-self | Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself |
| following | Selects everything in the document after the closing tag of the current node |
| following-sibling | Selects all siblings after the current node |
| namespace | Selects all namespace nodes of the current node |
| parent | Selects the parent of the current node |
| preceding | Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes |
| preceding-sibling | Selects all siblings before the current node |
| self | Selects the current node |
Axis Examples
| Example | Result |
|---|---|
| child::book | Selects all book nodes that are children of the current node |
| attribute::lang | Selects the lang attribute of the current node |
| child::* | Selects all element children of the current node |
| attribute::* | Selects all attributes of the current node |
| child::text() | Selects all text node children of the current node |
| child::node() | Selects all children of the current node |
| descendant::book | Selects all book descendants of the current node |
| ancestor::book | Selects all book ancestors of the current node |
| ancestor-or-self::book | Selects all book ancestors of the current node, and the current node as well if it is a book node |
| child::*/child::price | Selects all price grandchildren of the current node |
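The `axis::` notation is not part of the limited XPath subset that `xml.etree.ElementTree` supports, so with that module the same selections have to be written in the abbreviated path syntax or in plain Python. The sketch below shows rough equivalents for a few of the axis examples; the `doc` string and the `ancestors` helper are invented for illustration. A full XPath 1.0 engine, such as the `xpath()` method in lxml, should accept the axis forms directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
doc = """
<bookstore>
  <book lang="en"><title>A</title><price>10</price></book>
  <shelf><book lang="fr"><title>B</title><price>20</price></book></shelf>
</bookstore>
"""
root = ET.fromstring(doc)

# child::book  ->  a plain tag name
child_books = root.findall("book")

# descendant::book  ->  the // step
descendant_books = root.findall(".//book")

# child::*/child::price  ->  an asterisk followed by a tag name
grandchild_prices = [p.text for p in root.findall("*/price")]

# attribute::lang  ->  attributes are read from the element, not from the path
langs = [b.get("lang") for b in descendant_books]

# child::text()  ->  roughly the element's .text (the text before its first child)
first_title_text = root.find("book/title").text

# ancestor::*  ->  no path syntax; walk a child-to-parent map instead
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(el):
    """Return the ancestors of el, nearest first."""
    chain = []
    while el in parent_of:
        el = parent_of[el]
        chain.append(el)
    return chain


french_book = root.find(".//book[@lang='fr']")
ancestor_tags = [a.tag for a in ancestors(french_book)]
```

The parent map is built once per tree because ElementTree elements do not keep a reference to their parent.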