Scraping the tracking history of a single 17track parcel (Part 1)

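I have been learning Python recently, and since it ties in with my previous work, I decided to try scraping parcel tracking records from 17track in my spare time.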

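Approach: parse a static page, here a tracking page saved locally as an HTML file. This requires some familiarity with front-end HTML, since the data is located by tag name and CSS class.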
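To make the idea of static-page parsing concrete, here is a minimal sketch that uses only the standard library's `html.parser` rather than pyquery, which the script in this post uses. The markup in `SAMPLE_HTML` is hypothetical, loosely modeled on the `<time>` and `<p>` elements that the script selects; the real 17track page may be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="ori-block">
  <div><time>2018-07-05 10:12</time><p>LAS VEGAS, NV, US, DELIVERED</p></div>
  <div><time>2018-07-04 08:30</time><p>Origin Scan</p></div>
</div>
"""


class EventParser(HTMLParser):
    """Collect the text of every <time> and <p> element."""

    def __init__(self):
        super().__init__()
        self.current = None  # tag whose text is currently being read, if any
        self.times = []
        self.messages = []

    def handle_starttag(self, tag, attrs):
        if tag in ("time", "p"):
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current == "time":
            self.times.append(data.strip()[:10])  # keep only YYYY-MM-DD
        elif self.current == "p":
            self.messages.append(data.strip())


parser = EventParser()
parser.feed(SAMPLE_HTML)
events = list(zip(parser.times, parser.messages))
print(events)
```

Each `(date, message)` pair corresponds to one tracking event. pyquery expresses the same lookup more compactly with CSS selectors.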

Third-party package: pyquery

Without further ado, here is the code:

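#!/usr/bin/env python3
# coding=utf-8

from pyquery import PyQuery as pq


def get_time(d1):
    """Return the date (first 10 characters) of every <time> element."""
    dates = []
    for data in d1('time'):
        msg = d1(data).text()
        dates.append(msg[0:10])
    return dates


def get_message(d1):
    """Return the text of every <p> element (the event description)."""
    messages = []
    for data in d1('p'):
        messages.append(d1(data).text())
    return messages


def main():
    d = pq(filename="18.html")  # tracking page saved as a local file
    d1 = d(".ori-block")  # select the HTML block with class "ori-block"
    d2 = d('.text-uppercase').text()  # text of the element with class "text-uppercase"
    print(type(d2))  # check the returned type; it is str
    # Extract each list once, then pair every date with its message.
    times = get_time(d1)
    messages = get_message(d1)
    for t, m in zip(times, messages):
        print(d2 + "/" + t + "/" + m)


if __name__ == "__main__":
    main()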

The scraped result:


1Z3Y18900337899118/2018-07-05/LAS VEGAS, NV, US, DELIVERED
1Z3Y18900337899118/2018-07-05/Las Vegas, NV, United States, Destination Scan
1Z3Y18900337899118/2018-07-04/Las Vegas, NV, United States, Arrival Scan
1Z3Y18900337899118/2018-07-04/Departure Scan
1Z3Y18900337899118/2018-07-04/Arrival Scan
1Z3Y18900337899118/2018-07-04/Ontario, CA, United States, Departure Scan
1Z3Y18900337899118/2018-07-04/Origin Scan

1Z3Y18900337899118/2018-06-30/United States, Order Processed: Ready for UPS
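If these lines need further processing, they can be split back into fields. The following is a small sketch, not part of the original script; `Event` and `parse_line` are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple


class Event(NamedTuple):
    tracking_number: str
    date: str
    message: str


def parse_line(line):
    """Split 'number/date/message' into an Event.

    maxsplit=2 preserves any '/' that might appear inside the message.
    """
    number, date, message = line.strip().split("/", 2)
    return Event(number, date, message)


lines = [
    "1Z3Y18900337899118/2018-07-05/LAS VEGAS, NV, US, DELIVERED",
    "1Z3Y18900337899118/2018-06-30/United States, Order Processed: Ready for UPS",
]
events = [parse_line(line) for line in lines]

# Dates in YYYY-MM-DD form sort correctly as plain strings.
latest = max(events, key=lambda e: e.date)
print(latest.message)
```

Splitting with a maximum of two separators keeps the message intact even if a carrier's status text happens to contain a slash.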

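Planned follow-up posts:

Scraping dynamically by URL

Scraping 40 parcels at once

Scraping more than 40 parcels

Fetching through an API from Python, and more.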




Reposted from blog.csdn.net/weixin_42532882/article/details/80961757