Python 3 Crawler: Capturing Mobile App Data with Fiddler

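  1. What is Fiddler?

    Fiddler is an HTTP debugging proxy. It records all HTTP traffic between your computer and the internet and lets you inspect it, set breakpoints, and examine all data passing in and out of Fiddler (cookies, HTML, JS, CSS and other files).

    Fiddler is simpler to use than other network debuggers because it not only exposes HTTP traffic but also presents it in a user-friendly format. Similar tools include HttpWatch, Firebug and Wireshark.

    Fiddler usage guide: https://www.cnblogs.com/miantest/p/7289694.html

    Fiddler download: https://www.telerik.com/fiddler

    Installation is straightforward: accept the defaults and click through to the end.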

  2. Capture setup for mobile apps

   2.1 Fiddler settings

    Open Fiddler and open its options dialog (menu bar: Tools->Options).

    

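    On the HTTPS tab, enable HTTPS capture. For decrypting app traffic this normally means checking "Capture HTTPS CONNECTs" and "Decrypt HTTPS traffic".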

    

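    On the Connections tab, keep the default port 8888 (you may change it, as long as it does not clash with a port already in use) and, so that the phone can reach the proxy, enable "Allow remote computers to connect".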
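
    Whether a port is already taken can be checked before entering it in Fiddler. The following is a minimal sketch, not part of the original setup: the helper name port_is_free and the candidate port list are assumptions for illustration, and the check should be run while Fiddler is closed (otherwise Fiddler itself occupies the port).

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if nothing is currently bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # bind() fails with OSError when another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 8888 is Fiddler's default; the other values are arbitrary alternatives.
    for candidate in (8888, 8889, 8899):
        print(candidate, "free" if port_is_free(candidate) else "in use")
```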

    

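  2.2 Downloading the security certificate

    In the phone's browser, open http://<computer-IP>:8888/, where <computer-IP> is the computer's LAN address from section 2.4 (localhost would point at the phone itself, not at Fiddler). Then tap "FiddlerRoot certificate" to download the security certificate.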

    

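  2.3 Installing the security certificate

    Using a Huawei phone as an example:

    On the phone, go to Settings -> Advanced settings -> Security -> Show trusted CA certificates -> User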

    

    

    

    After a successful installation, the certificate is listed there.

    

  

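  2.4 LAN settings

    To capture phone traffic with Fiddler, the phone and the computer must be on the same local network. The simplest way is to connect both to the same router; alternatively, the computer can host a Wi-Fi hotspot that the phone joins.

    Here, both the phone and the computer are connected to the same router, and the phone is then configured to use the computer's IP address as its proxy.

    First, find the computer's IP address by running ipconfig in cmd. Locate the IPv4 address of the wireless LAN adapter (WLAN) and note it down.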
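
    As an alternative to reading the ipconfig output by eye, the address can be looked up from Python. This is a best-effort sketch under stated assumptions: the function name local_lan_ip is made up for illustration, 8.8.8.8 is just an arbitrary routable address, and on a machine with several adapters or a VPN the result may not be the WLAN address, so it is worth comparing against ipconfig.

```python
import socket


def local_lan_ip() -> str:
    """Best-effort guess of this machine's outbound IPv4 address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # connect() on a UDP socket sends no packets; it only asks the OS
            # which local interface would be used to reach that address.
            sock.connect(("8.8.8.8", 80))
            return sock.getsockname()[0]
        except OSError:
            # No usable route (for example, offline): fall back to loopback.
            return "127.0.0.1"


if __name__ == "__main__":
    print("Use this as the proxy host on the phone:", local_lan_ip())
```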

     

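    On the phone, open the connected Wi-Fi network, choose to modify it, and add a proxy. Select manual configuration: the host name is the computer's IP address found in the previous step, and the port is the Fiddler port, 8888.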
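
    If the phone loses connectivity after the proxy is set, it helps to confirm that something is actually listening on the computer's address and port. The sketch below is only a TCP reachability check (it does not verify that the listener is Fiddler); proxy_reachable is a hypothetical helper name.

```python
import socket


def proxy_reachable(host: str, port: int = 8888, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

    For example, proxy_reachable("192.168.1.100") (a placeholder address; substitute the IPv4 address noted earlier) should return True while Fiddler is running. Running it from a second machine on the same router is the closer match to what the phone will do, since a firewall on the computer can block LAN access while local access still works.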

    

  

  3. Fiddler capture test on the phone

    With all the steps above completed, open the Toutiao (今日头条) app on the phone.

   

     

    Fiddler's session list now shows the requests sent by the app, including their URLs and request headers.
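
    A captured request can be copied as raw text and turned into a URL and a headers dictionary. The sketch below assumes the copied text has the usual HTTP shape: a request line of the form "GET <url> HTTP/1.1" followed by "Name: value" header lines. The function name parse_raw_request and the shortened sample request are made up for illustration. Note that the trailing " HTTP/1.1" on the request line is the protocol version, not part of the URL.

```python
def parse_raw_request(raw: str):
    """Split raw HTTP request text into (method, url, headers)."""
    lines = raw.strip().splitlines()
    # Request line: METHOD, request target, protocol version.
    method, url, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, url, headers


if __name__ == "__main__":
    # Shortened, illustrative sample; a real capture has many more parameters.
    sample = (
        "GET https://lf-hl.snssdk.com/user/profile/homepage/v7/?refer=&user_id=65860205302 HTTP/1.1\r\n"
        "Accept-Charset: UTF-8\r\n"
        "Host: lf-hl.snssdk.com\r\n"
        "\r\n"
    )
    method, url, headers = parse_raw_request(sample)
    print(method)
    print(url)
    print(headers)
```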

  

  

  4. Python code test

    With the URL and headers from the captured request, we can write the code:

# -*- coding: UTF-8 -*-
import requests

class app_data:
    def __init__(self):
        self.headers = {'Accept-Charset': 'UTF-8',
                   'X-Requested-With': 'XMLHttpRequest',
                   'Host': 'lf-hl.snssdk.com',
                   'Connection': 'Keep-Alive',
                   'Accept-Encoding': 'gzip',
                   'X-SS-REQ-TICKET': '1544235590880',
                   'sdk-version': '1',
                   'User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 7.0; HUAWEI CAZ-AL10 Build/HUAWEICAZ-AL10) NewsArticle/7.0.1 cronet/TTNetVersion:pre_blink_merge-277498-gd2bb364e 2018-08-24',
                   'X-SS-TC': '0'
                   }
        # Full request URL copied from the Fiddler capture (query string included).
        self.heros_url1 = "https://lf-hl.snssdk.com/user/profile/homepage/v7/?refer=" \
                    "&user_id=65860205302" \
                    "&iid=53115531269" \
                    "&device_id=52727404130" \
                    "&ac=wifi" \
                    "&channel=huawei" \
                    "&aid=13" \
                    "&app_name=news_article" \
                    "&version_code=701" \
                    "&version_name=7.0.1" \
                    "&device_platform=android" \
                    "&ab_version=624153%2C593993%2C623879%2C617391%2C635142%2C628023%2C625730%2C631388%2C622716%2C618681%2C622136%2C622991%2C623895%2C631604%2C631594%2C636162%2C554836%2C549647%2C630836%2C621573%2C572465%2C608437%2C615291%2C606547%2C442255%2C633552%2C630218%2C546700%2C280447%2C628958%2C281295%2C633175%2C632887%2C622043%2C325617%2C578588%2C634871%2C625065%2C612913%2C616220%2C616208%2C636135%2C498375%2C613888%2C554330%2C467513%2C631638%2C630331%2C623322%2C595556%2C630551%2C611285%2C622103%2C621398%2C486953%2C604157%2C292722%2C630693%2C596282%2C608565%2C571131%2C239098%2C612193%2C636117%2C170988%2C493250%2C617806%2C609105%2C374118%2C588069%2C633376%2C631359%2C633720%2C627387%2C550042%2C435214%2C635022%2C603542%2C586994%2C609625%2C631781%2C627130%2C614229%2C614099%2C620527%2C522766%2C617328%2C416055%2C621360%2C636125%2C392460%2C636212%2C630238%2C558140%2C617836%2C555254%2C378450%2C635503%2C471406%2C603443%2C596391%2C550817%2C598626%2C631351%2C634911%2C631592%2C603385%2C603397%2C603403%2C603405%2C629151%2C607361%2C618798%2C609338%2C326532%2C636089%2C586291%2C609314%2C562442%2C627740%2C589102%2C553951%2C618166%2C457480%2C618233" \
                    "&ab_client=a1%2Cc4%2Ce1%2Cf1%2Cg2%2Cf7" \
                    "&ab_group=94567%2C102749%2C181430" \
                    "&ab_feature=94567%2C102749" \
                    "&abflag=3" \
                    "&ssmix=a" \
                    "&device_type=HUAWEI+CAZ-AL10" \
                    "&device_brand=HUAWEI&language=zh" \
                    "&os_api=24" \
                    "&os_version=7.0" \
                    "&uuid=864590038380239" \
                    "&openudid=47628a3804ad50be" \
                    "&manifest_version_code=701" \
                    "&resolution=1080*1788" \
                    "&dpi=480" \
                    "&update_version_code=70108" \
                    "&_rticket=1544235590871" \
                    "&fp=DrT_L2w1cST5FlT_F2U1FYK7FrxO" \
                    "&tma_jssdk_version=1.5.3.9" \
                    "&rom_version=emotionui_5.0.4_caz-al10c00b386" \
                    "&plugin=26958" \
                    "&ts=1544235590" \
                    "&as=a24552e0d684ec3a1b4355" \
                    "&mas=0085f891c37633622601d48d5117a8a8b542ec864006686e80"
  

    def catch_app_data1(self):
        # Request the profile endpoint with the captured headers and parse the JSON body.
        req = requests.get(url=self.heros_url1, headers=self.headers).json()

        data = req.get("data")
        name = data.get("name")
        print('Account:', name)

        verified_content = data.get("verified_content")
        print('Verification:', verified_content)

        area = data.get("area")
        print('Location:', area)

        description = data.get("description")
        print('Bio:', description)

        user_id = data.get("user_id")
        print('user_id:', user_id)


if __name__ == '__main__':
    obj = app_data()
    obj.catch_app_data1()
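
A side note on catch_app_data1: it calls .get() on whatever the "data" key holds, so a response without a "data" object would raise AttributeError. A more defensive variant could separate the field extraction from the request, roughly as sketched below. The helper name extract_profile and the sample payloads are made up for illustration; the field names are the ones used in the script.

```python
def extract_profile(payload: dict) -> dict:
    """Pick the profile fields out of a parsed JSON response.

    Missing fields (or a missing "data" object) yield None instead of raising.
    """
    data = payload.get("data") or {}
    fields = ("name", "verified_content", "area", "description", "user_id")
    return {field: data.get(field) for field in fields}


if __name__ == "__main__":
    # Hypothetical payloads, not real server responses.
    print(extract_profile({"data": {"name": "demo", "user_id": 1}}))
    print(extract_profile({"message": "error"}))
```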

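  Running catch_app_data1() prints the account name, verification text, location, bio and user_id of the profile.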

  

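  5. Summary

    Testing shows that, among the URLs used to fetch this information, the key difference is the value of user_id: each account has its own user_id, while the rest of the URL and the headers are essentially the same. The crux is therefore obtaining the user_id of the account you are interested in.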
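
    Since only user_id changes between accounts, the captured URL can serve as a template. The following is a minimal sketch using the standard library's urllib.parse; the helper name with_user_id and the replacement ID "123" are hypothetical. One caveat: re-encoding the query may normalize some characters (for example "*" becomes "%2A"), which is equivalent after decoding but not byte-identical to the captured URL. Whether the server keeps accepting the captured ts, as and mas values over time is not something this test established.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_user_id(url: str, user_id: str) -> str:
    """Return a copy of url whose user_id query parameter is replaced."""
    parts = urlsplit(url)
    # keep_blank_values preserves empty parameters such as "refer=".
    query = parse_qsl(parts.query, keep_blank_values=True)
    query = [(key, user_id if key == "user_id" else value) for key, value in query]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Shortened version of the captured URL, for illustration only.
    template = ("https://lf-hl.snssdk.com/user/profile/homepage/v7/"
                "?refer=&user_id=65860205302&ac=wifi")
    print(with_user_id(template, "123"))
```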

  

  Reference: https://blog.csdn.net/memoryofyck/article/details/80955615


Reposted from www.cnblogs.com/shaosks/p/10087252.html