Java Crawlers: Commonly Used Maven Dependencies

Third-party packages commonly used to write crawlers in Java:

  • httpclient, for HTTP requests
  • jsoup, for parsing the DOM
  • rhino, for executing JavaScript
  • jackson, for JSON

Excerpt from pom.xml

<dependencies>

    <!-- headless browser: simulates client-side actions (JavaScript, forms, clicks) -->
    <dependency>
        <groupId>net.sourceforge.htmlunit</groupId>
        <artifactId>htmlunit</artifactId>
        <version>2.33</version>
    </dependency>

    <!-- HTTP client: sends requests the way a web browser would -->
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.5.7</version>
    </dependency>

    <!-- parse DOM -->
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.11.3</version>
    </dependency>

    <!-- parse and generate JSON -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.9.8</version>
    </dependency>

    <!-- upgrade JUnit to JUnit 4 (default is v3.8.1) -->
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>

</dependencies>
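
The package list at the top mentions rhino, but the excerpt above does not declare it. If a standalone Rhino engine is wanted, a declaration along these lines could be placed inside `<dependencies>`. This is only a sketch: the `org.mozilla:rhino` coordinates and the version `1.7.10` are assumptions, not taken from the original pom.xml.

```xml
<!-- execute JavaScript (assumed coordinates and version, not from the original pom.xml) -->
<dependency>
    <groupId>org.mozilla</groupId>
    <artifactId>rhino</artifactId>
    <version>1.7.10</version>
</dependency>
```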

Reposted from www.cnblogs.com/godwithus/p/10564175.html