XPath syntax:
Syntax | Meaning |
---|---|
`tag` | Selects all child elements with the given tag. For example, `spam` selects all child elements named `spam`, and `spam/egg` selects all grandchildren named `egg` in all children named `spam`. |
`*` | Selects all child elements. For example, `*/egg` selects all grandchildren named `egg`. |
`.` | Selects the current node. This is mostly useful at the beginning of the path, to indicate that it’s a relative path. |
`//` | Selects all subelements, on all levels beneath the current element. For example, `.//egg` selects all `egg` elements in the entire tree. |
`..` | Selects the parent element. |
`[@attrib]` | Selects all elements that have the given attribute. |
`[@attrib='value']` | Selects all elements for which the given attribute has the given value. The value cannot contain quotes. |
`[tag]` | Selects all elements that have a child named `tag`. Only immediate children are supported. |
`[tag='text']` | Selects all elements that have a child named `tag` whose complete text content, including descendants, equals the given `text`. |
`[position]` | Selects all elements that are located at the given position. The position can be either an integer (1 is the first position), the expression `last()` (for the last position), or a position relative to the last position (e.g. `last()-1`). |
Predicates (expressions within square brackets) must be preceded by a tag name, an asterisk, or another predicate. `position` predicates must be preceded by a tag name.
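The predicate forms from the table can be tried out with a short sketch like the one below. The `menu` document, the `ids` helper, and the result variable names are made up for illustration and are not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
menu = ET.fromstring(
    "<menu>"
    "<dish id='1' vegan='yes'><name>spam</name></dish>"
    "<dish id='2'><name>egg</name><note>fresh</note></dish>"
    "<dish id='3'><name>ham</name></dish>"
    "</menu>"
)

def ids(path):
    """Return the id attribute of every element matched by path (helper for this sketch)."""
    return [dish.get("id") for dish in menu.findall(path)]

with_attr = ids("dish[@vegan]")         # [@attrib]: dishes that have a vegan attribute
by_attr = ids("dish[@id='2']")          # [@attrib='value']
with_child = ids("dish[note]")          # [tag]: dishes with an immediate <note> child
by_text = ids("dish[name='ham']")       # [tag='text']
first = ids("dish[1]")                  # [position]: positions start at 1
last = ids("dish[last()]")              # last position
second_last = ids("dish[last()-1]")     # position relative to the last one
chained = ids("dish[@id][note]")        # a predicate may follow another predicate
```

Every predicate here follows the tag name `dish`, which satisfies the rule that position predicates must be preceded by a tag name.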
XPath usage example in Python:
Here’s an example that demonstrates some of the XPath capabilities of the `xml.etree.ElementTree` module. We’ll be using the `countrydata` XML document from the Parsing XML section of the Python documentation:
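A minimal sketch of such an example follows. It assumes an abbreviated `countrydata` string modeled on that sample. The shortened data and the names of the result variables are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the countrydata document (assumed for illustration).
countrydata = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""

root = ET.fromstring(countrydata)

# The top-level element itself
top = root.findall(".")

# All 'neighbor' grandchildren reached through 'country' children
neighbors = root.findall("./country/neighbor")

# Elements with name='Singapore' that have a 'year' child
singapore = root.findall(".//year/..[@name='Singapore']")

# 'year' elements that are children of elements with name='Singapore'
years = root.findall(".//*[@name='Singapore']/year")

# 'neighbor' elements that are the second 'neighbor' child of their parent
second_neighbors = root.findall(".//neighbor[2]")
```

Each call returns a list of `Element` objects, so attributes and text can be read with `.get("name")` or `.text` on the items.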
The example above works on XML text. To apply it to HTML text, change `import xml.etree.ElementTree as ET` to `import lxml.html`, and change `root = ET.fromstring(countrydata)` to `root = lxml.html.fromstring(HTMLData)`.
XPath Axes
An axis defines a node-set relative to the current node.
AxisName | Result |
---|---|
ancestor | Selects all ancestors (parent, grandparent, etc.) of the current node |
ancestor-or-self | Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself |
attribute | Selects all attributes of the current node |
child | Selects all children of the current node |
descendant | Selects all descendants (children, grandchildren, etc.) of the current node |
descendant-or-self | Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself |
following | Selects everything in the document after the closing tag of the current node |
following-sibling | Selects all siblings after the current node |
namespace | Selects all namespace nodes of the current node |
parent | Selects the parent of the current node |
preceding | Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes |
preceding-sibling | Selects all siblings before the current node |
self | Selects the current node |
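The standard library's `xml.etree.ElementTree` implements only the limited path syntax described earlier and does not parse axis names. The sibling axes can still be approximated with ordinary list operations, as in the sketch below. The `<list>` document and all variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
root = ET.fromstring(
    "<list><item>a</item><item>b</item><item>c</item><item>d</item></list>"
)

siblings = list(root)              # children of the shared parent, in document order
current = siblings[1]              # the <item>b</item> element plays the current node
pos = siblings.index(current)

preceding_siblings = siblings[:pos]        # roughly preceding-sibling::*
following_siblings = siblings[pos + 1:]    # roughly following-sibling::*
descendants_or_self = list(root.iter())    # roughly descendant-or-self::* from root
```

The slices keep document order, whereas XPath treats `preceding-sibling` as a reverse axis, so positions counted inside a predicate would differ from this emulation.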
Axis Examples
Example | Result |
---|---|
child::book | Selects all book nodes that are children of the current node |
attribute::lang | Selects the lang attribute of the current node |
child::* | Selects all element children of the current node |
attribute::* | Selects all attributes of the current node |
child::text() | Selects all text node children of the current node |
child::node() | Selects all children of the current node |
descendant::book | Selects all book descendants of the current node |
ancestor::book | Selects all book ancestors of the current node |
ancestor-or-self::book | Selects all book ancestors of the current node, and the current node as well if it is a book node |
child::*/child::price | Selects all price grandchildren of the current node |
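Because `xml.etree.ElementTree` does not accept axis syntax such as `ancestor::book`, a few of the examples above can only be approximated with it. The sketch below shows one way to do that, assuming a made-up `bookstore` document. The `parent_of` map and the `ancestors` helper are hypothetical names introduced here, not library features.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = ET.fromstring(
    "<bookstore>"
    "<book lang='en'><title>A</title><price>10</price></book>"
    "<book lang='fr'><title>B</title><price>20</price></book>"
    "</bookstore>"
)

# Elements keep no reference to their parent, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestors(elem):
    """Emulate the ancestor axis: parent, grandparent, and so on up to the root."""
    result = []
    while elem in parent_of:
        elem = parent_of[elem]
        result.append(elem)
    return result

first_price = doc.find("./book/price")

child_books = doc.findall("book")                   # child::book
all_children = list(doc)                            # child::*
lang = doc.find("book").get("lang")                 # attribute::lang of the first book
descendant_prices = list(doc.iter("price"))         # descendant::price
price_ancestor_tags = [e.tag for e in ancestors(first_price)]  # ancestor::*
grandchild_prices = doc.findall("*/price")          # child::*/child::price
```

Note that `iter("price")` would also return the starting element if it were itself named `price`, so strictly it behaves like `descendant-or-self`. Here the starting element is `bookstore`, so the two coincide.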