Python Web Scraping: POST Login with Cookies and Sessions

import re
import requests

Login method 1: carry the Cookie from an already logged-in session

One point to pay attention to: to get the value of 'Cookie', right-click on the login page and choose Inspect Element (or Inspect), then switch to the Network tab. Log in normally and many requests will appear in the Network panel; find the POST request named login and copy the content of its Set-Cookie header. Finally, save the fetched page to an HTML file and open it in a browser to check whether it shows the logged-in content; if it does, the login worked.

# Method 1: attach the Cookie copied from the browser to the request headers
r_sess = requests.session()
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Cookie':'session_data_places="085473362e3d14780a01ad87b0fb929e:2AL8y-7KDRGhS69z8rBt-gjYKaBF7yQqxRBF7XJ77ef4DvrNzVnx9muBKrlJddtaBBbtviK10MuY67gurSJO9geKYkBSpQIEUr7AhXs4EEyfiQn5j5q1wethXn0TsiKiuVYOSn5YOuDWDht3aYFrErkUN9OQT0dJXZ3pu2AC6vMv94icWWFcobv8e0FVBq_CD_9Z4RBaN8giEFsKqXZq-nQypMlBpiQAZJJd6zKz06W9zU5VynAHWkV1GJDDnCOWie0AHIfnfpQoub5pGgeHlNaf5Ia8eLq9ZV7hVeLpseL7sgvRbo6gBbpFXLuOfSggPGMSrGUXQNZ-ANBwpMcBqlnzkdngfWYRJSEbFkdtOm335sAmh0xjGegO-xH0ud78cnsIopPpXd11q1mOyuIjIBQsz7s-rDILkxJgsl7uxZklx5qXQg0tuDfrh-yeb-AP41XBnyeQ0uAP-t6K21nxAfWUL9ppeUTRkbCUh9S0d-1wJyLeaguGhumK4tINFdXiSlVxHedICFMlPFMFbubnzYShS-tdOJZfQzRMyT3EmMiVdC0M5Pmnx96unKutNt1TX5FQN02JBQ8KO_2cd15ru4l5HcoZ9XYgTLMeS3r0uRp2GusSOUY2Su3kNWOdfzgDcl6eoCTXU2zTSRkQI2PAo-qcGiGZRZiUGOxcCwf2or2QHwfSXT8SxXimRUYen0vspuCjKkpyVj0P_nwnoqpctEJ5aX6X8EV5yU0IN9_IjBMpAfS1GT5VmUN9tvSosEdgbedQLieZ_AjW3cDk-d1PfbW2q_bh70EbUTU5v2gVK0E5CQAw9ObgR_n5afB_AZTixJnKDP_tJJ-aXfoOcg-lY89RC9jJ15To6fqGJgiJDdqnsvNyC9XgSYO26GaQLvvD-mv5M34iqNiR6fkS-YvdGOrs3hqpE3LqfhlEIWWL4oQOKNQuBV4AF2vhjzIGJat3yrRrSq95Er_mv5J3OaNKLzlHs1QijQf4ieGRVpWl1tsqOfLCLiqkzUNfSvTXGa3JsridTJ4uDqE151yhtxpLzlux7BepTBU98TGr2jHjAYT45iPKpZivbhRdikFsiw5T-MKPiImWFDalB8kHS0AEV6qv1wMDEKDUffhu1PiNF8zYdFnHTQDIYlu2VIQ6QJfPtqR11I5mpE5BLVqeFbeuMg=="; Path=/'
}
url = 'http://example.webscraping.com/places/default/user/login?_next=/places/default/index'
response = r_sess.get(url, headers=headers)
# Print the cookies the server sent back, to confirm the session is recognised
print(response.cookies.get_dict())
html = response.text
# Save the page so it can be opened in a browser to verify the logged-in content
with open('login.html', 'w+', encoding='utf-8') as fb:
    fb.write(html)
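
As a variation on the same idea, the copied cookie can also be passed through the cookies= parameter of requests instead of a raw Cookie header. This is only a sketch; the session_data_places value below is a placeholder that has to be replaced with the one copied from your own browser:

r_sess = requests.session()
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0'
}
# Placeholder: paste the session_data_places value copied from the browser here
cookies = {'session_data_places': 'paste the copied value here'}
url = 'http://example.webscraping.com/places/default/user/login?_next=/places/default/index'
response = r_sess.get(url, headers=headers, cookies=cookies)
with open('login.html', 'w+', encoding='utf-8') as fb:
    fb.write(response.text)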

Login method 2: using session()

The second method is a bit more involved, but easier to understand.
First, find the form data among the request parameters; this is the data submitted when you log in.
The crucial step is obtaining the value of _formkey, because it changes on every login, so a regular expression is used here to extract _formkey. After that, the rest falls into place.


# Fetch the login page first so that the current _formkey can be read from the form
r_sess = requests.session()
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0',
    'Cookie':'session_data_places="21992393c08629ead6fa70d0b6078b6e:tFkhX_r032wONHyGzsFxenHy6WMdCKEm85a2bPegmOhIjKavWSAZe7gdDrPdbF9EiyCEcRN38SO6S7qJoS7h5M4ifDTWTQXtqJNwYBCQ6ezLEX9Sa6l6Pdzoa58NuZrocig4hSFrs0kX_Le10XawB3saudb4NhkbtFllvDZAY6_RWKyEims_XIt5kBBIKBVvntzCC_dyjg0P7oh2IlN1ltF_matkt7lDqbfoUWjbZ3ov4HaNnSVkyKIZ2KjMxUkPGqc5gFkwBTQjOOX260SiGBQK38zBZ4vl_QMEXc5qdfa47k3jYOMCg07eH0h_taBudKGOdqpBXzPnDA9BoonobgEQl6oAjQg3WuT-oJ95A5dJLZ6ovFF_Dd3QPd1fH4m2YkS7HqcvqMv-DhdNH30UvD8YuZmw3uOLsbWyXFynZrOdDvOsq5sOA5Eh_66wiL972TyDW2FLBNAn_hA1CWmbEqS_yjMIL7JcnIQ0L83-UV4_Frb2ZrhB-w9dryHDoxVdZpXwfJZTNOZmlRDeA11dbA=="; Path=/'
}
url = 'http://example.webscraping.com/places/default/user/login?_next=/places/default/index'
response = r_sess.get(url,headers=headers)
html = response.text

# Use a regular expression to extract the value of _formkey
rule = re.compile(r'<input name="_formkey" type="hidden" value="(.*?)" />')
result1 = rule.findall(html)[0]
print(result1)

# Form data submitted with the login request
data = {}
data['email'] = 'your email address'
data['password'] = 'your password'
data['_next'] = '/places/default/index'
data['_formkey'] = result1
data['_formname'] = 'login'
# Send the POST request with the same session, then save the result for inspection
res = r_sess.post(url, data=data)
with open('web.html', 'w+', encoding='utf-8') as fb:
    fb.write(res.text)
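
If the regular expression ever fails because the markup differs slightly (attribute order, extra whitespace), a parser-based lookup is more tolerant. A minimal sketch, assuming BeautifulSoup (bs4) is installed and html is the login page fetched above:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Find the hidden input regardless of attribute order or spacing
formkey_input = soup.find('input', {'name': '_formkey'})
if formkey_input is not None:
    result1 = formkey_input['value']  # same value the regex above extracts
    print(result1)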

Supplement:

When logging into other sites, the value copied from Set-Cookie sometimes does not work. The fix is to replace the Set-Cookie content with the value of the Cookie field from the request headers instead, as in the sketch below:
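A minimal sketch of that fix, assuming the copied request-header Cookie is a single 'name=value; name=value' string (the raw_cookie value below is a placeholder):

import requests

# Placeholder: paste the whole Cookie string from the request headers here
raw_cookie = 'session_data_places="paste the copied value here"'

# Split "name=value" pairs on "; " and build a dict that requests accepts via cookies=
cookie_dict = dict(pair.split('=', 1) for pair in raw_cookie.split('; '))

url = 'http://example.webscraping.com/places/default/user/login?_next=/places/default/index'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0'}
response = requests.get(url, headers=headers, cookies=cookie_dict)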


Reposted from blog.csdn.net/Mr_791063894/article/details/85199755