Learning XPath

While learning HtmlUnit I came across XPath, which is mainly used to select elements in HTML or XML documents.

Here is some code to start with:

WebClient client = new WebClient(BrowserVersion.INTERNET_EXPLORER_8);
HtmlPage page = client.getPage("http://218.75.208.250:8089/opac/jdjsjg.jsp");

This fetches the page as an HtmlPage.

List<DomNode> nodeList = page.getByXPath("//table[@class='xxtable']");

Alternatively, you can use Jsoup here: Document d = Jsoup.parse(page.asXml());
and then call d.select(...) to get the relevant elements.

Today I will focus on XPath, which I studied on W3cschool. The basic path expressions are:

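nodename   Selects all nodes with the given name
/          Selects from the root node
//         Selects matching nodes anywhere below the current node, no matter where they are
.          Selects the current node
..         Selects the parent of the current node
@          Selects attributes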
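The path expressions above can be tried out without HtmlUnit. The following is a minimal sketch that uses the JDK's built-in javax.xml.xpath API; the XML string, the class name XPathBasics and the helper names parse and select are made up for illustration and are not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathBasics {
    // Hypothetical sample data, not taken from the page used earlier.
    static final String XML =
        "<bookstore>"
      + "<book><title lang='eng'>Harry Potter</title><price>29.99</price></book>"
      + "<book><title lang='eng'>Learning XML</title><price>39.95</price></book>"
      + "</bookstore>";

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates expr relative to the given context node and returns the matches.
    static NodeList select(Object context, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expr, context, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // "/" starts at the root: exactly one bookstore element.
        System.out.println(select(doc, "/bookstore").getLength());           // prints 1
        // "//" matches at any depth: both title elements.
        System.out.println(select(doc, "//title").getLength());              // prints 2
        Node firstTitle = select(doc, "//title").item(0);
        // "." is the context node itself, ".." is its parent.
        System.out.println(select(firstTitle, ".").item(0).getNodeName());   // prints title
        System.out.println(select(firstTitle, "..").item(0).getNodeName());  // prints book
        // "@" selects an attribute node rather than an element.
        System.out.println(select(doc, "//title/@lang").item(0).getNodeValue()); // prints eng
    }
}
```

The same expressions should also work as arguments to page.getByXPath(...) in HtmlUnit, since both evaluate XPath 1.0.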

Here are a few examples:

/bookstore/book[1]
/bookstore/book[last()]
/bookstore/book[last()-1]
/bookstore/book[position()<3]
//title[@lang]
//title[@lang='eng']
/bookstore/book[price>35.00]
/bookstore/book[price>35.00]/title
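To see what each predicate picks, the sketch below runs the expressions against a small bookstore document using the JDK's javax.xml.xpath API. The four books, the class name XPathPredicates and the helper texts are assumptions made for this illustration. For the expressions that select book elements, "/title" is appended so that the printed result is just the title.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPredicates {
    // Hypothetical sample data: four books, the last title has no lang attribute.
    static final String XML =
        "<bookstore>"
      + "<book><title lang='eng'>Everyday Italian</title><price>30.00</price></book>"
      + "<book><title lang='eng'>Harry Potter</title><price>29.99</price></book>"
      + "<book><title lang='en-us'>XQuery Kick Start</title><price>49.99</price></book>"
      + "<book><title>Learning XML</title><price>39.95</price></book>"
      + "</bookstore>";

    // Returns the text content of every node matched by expr.
    static List<String> texts(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String[] exprs = {
            "/bookstore/book[1]/title",            // first book
            "/bookstore/book[last()]/title",       // last book
            "/bookstore/book[last()-1]/title",     // second to last book
            "/bookstore/book[position()<3]/title", // first two books
            "//title[@lang]",                      // titles that have a lang attribute
            "//title[@lang='eng']",                // titles whose lang is exactly 'eng'
            "/bookstore/book[price>35.00]/title"   // titles of books priced above 35.00
        };
        for (String expr : exprs) {
            System.out.println(expr + " -> " + texts(expr));
        }
    }
}
```

Note that XPath positions start at 1, so book[1] is the first book, not the second.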

* Matches any element node
@* Matches any attribute node
node() Matches any node of any kind

/bookstore/* Selects all the child nodes of the bookstore element
//* Selects all elements in the document
//title[@*] Selects all title elements which have any attribute
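The wildcards can be checked the same way. This is a small sketch with the JDK's javax.xml.xpath API that counts the nodes each expression matches; the two-book document, the class name XPathWildcards and the helper count are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathWildcards {
    // Hypothetical sample data: only the first book carries attributes.
    static final String XML =
        "<bookstore>"
      + "<book category='web'><title lang='eng'>Learning XML</title><price>39.95</price></book>"
      + "<book><title>Harry Potter</title><price>29.99</price></book>"
      + "</bookstore>";

    // Returns how many nodes expr matches in the sample document.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("/bookstore/*"));  // prints 2: the two book elements
        System.out.println(count("//*"));           // prints 7: every element in the document
        System.out.println(count("//title[@*]"));   // prints 1: only one title has an attribute
        System.out.println(count("//@*"));          // prints 2: category and lang
        // node() also matches text nodes: the first title has one text child.
        System.out.println(count("/bookstore/book[1]/title/node()")); // prints 1
    }
}
```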

//book/title | //book/price Selects all the title AND price elements of all book elements
//title | //price Selects all the title AND price elements in the document
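/bookstore/book/title | //price Selects all the title elements of the book elements under bookstore AND all the price elements in the document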
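The | operator returns the union of two node sets. The sketch below, again using the JDK's javax.xml.xpath API, counts the matches of the three union expressions above; the sample document (two books plus a magazine element that is not a book), the class name XPathUnion and the helper count are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathUnion {
    // Hypothetical sample data: the magazine shows the difference between
    // "//book/title" (titles of books only) and "//title" (any title).
    static final String XML =
        "<bookstore>"
      + "<book><title>Harry Potter</title><price>29.99</price></book>"
      + "<book><title>Learning XML</title><price>39.95</price></book>"
      + "<magazine><title>XML Weekly</title><price>5.00</price></magazine>"
      + "</bookstore>";

    // Returns how many nodes expr matches in the sample document.
    static int count(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // 2 book titles + 2 book prices
        System.out.println(count("//book/title | //book/price"));     // prints 4
        // 3 titles + 3 prices, magazine included
        System.out.println(count("//title | //price"));               // prints 6
        // 2 book titles + 3 prices anywhere in the document
        System.out.println(count("/bookstore/book/title | //price")); // prints 5
    }
}
```

A node matched by both sides of | appears only once in the result, because the result is a set.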
