Automatic login with a CAPTCHA using Selenium

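Note: this page does not crawl data. It only uses Selenium to log in to a website automatically (one that has a CAPTCHA) and then fetch the full content of one page.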

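from selenium import webdriver
import requests
from lxml import etree
import base64
# Note: the find_element_by_* calls below are the Selenium 3 API; Selenium 4.3+ removed them in favor of driver.find_element(By.ID, ...)
url='https://accounts.douban.com/login?alias=&redir=https%3A%2F%2Fwww.douban.com%2F&source=index_nav&error=1001'
driver=webdriver.Chrome()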

Open the URL

driver.get(url)

Locate each input by its id (password, email); the argument to send_keys is the text typed into that input

driver.find_element_by_id('password').send_keys('jintao1121328421')
driver.find_element_by_id('email').send_keys('15836076372')

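# Manual alternative: type the CAPTCHA value into the PyCharm console (res is not used again below; the recognition service supplies the value instead)

res=input('Enter the CAPTCHA: ')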

Get the page source and parse it

html_img_str=driver.page_source
res_str=etree.HTML(html_img_str)

Extract the URL of the CAPTCHA image

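code_img_url=res_str.xpath('//img[@id="captcha_image"]/@src')[0]
# download the image and base64-encode it for the recognition service
res_img=requests.get(code_img_url)
b64_str=base64.b64encode(res_img.content)
form={
    'v_pic':b64_str,
    'v_type':'cn'
}
headers={
    'Authorization':'APPCODE 1141ff308f034d9d80770c812d4bb929'
}
yun_url='http://yzmplus.market.alicloudapi.com/fzyzm'
# the original sends the form with GET; switch to requests.post if the service expects POST
yunshichang_response=requests.get(yun_url,data=form,headers=headers)
# the recognized CAPTCHA text is in the "v_code" field of the JSON reply
return_value=yunshichang_response.json()["v_code"]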
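
The call to the recognition service comes down to two small steps: base64-encoding the image bytes into the form, and reading `v_code` out of the JSON reply. Below is a minimal sketch of just those two steps, with no network call. The helper names `build_captcha_form` and `parse_captcha_reply` are hypothetical; the field names `v_pic`, `v_type`, and `v_code` are taken from the code above. Decoding the base64 result to `str` and returning `None` on a malformed reply are assumptions of this sketch, not something the original does.

```python
import base64
import json


def build_captcha_form(image_bytes, v_type="cn"):
    """Build the form fields used above: base64 image plus CAPTCHA type."""
    # b64encode returns bytes; decode to str so the value prints and serializes cleanly.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"v_pic": encoded, "v_type": v_type}


def parse_captcha_reply(reply_text):
    """Return the recognized code, or None if the reply has no 'v_code' key."""
    try:
        reply = json.loads(reply_text)
    except ValueError:
        # Not valid JSON (json.JSONDecodeError is a subclass of ValueError).
        return None
    if not isinstance(reply, dict):
        return None
    return reply.get("v_code")


if __name__ == "__main__":
    form = build_captcha_form(b"fake image bytes")  # placeholder bytes, not a real image
    print(form["v_type"], form["v_pic"])
    print(parse_captcha_reply('{"v_code": "abcd"}'))
```

Checking for `None` before typing the value into the page avoids sending an empty or missing code when the service returns an error body.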

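Steps for the automated login:

<1> Get the CAPTCHA image. The image is fetched by its URL, so the key is finding that URL.

<2> Import lxml and use XPath to look the URL up in the page source.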
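
One fragile spot in these steps is indexing the XPath result with `[0]`: `xpath()` returns a list, and if the page shows no CAPTCHA image the list is empty and `[0]` raises `IndexError`. Here is a small sketch of a guard, assuming you pass it the list returned by `xpath('.../@src')` and the URL of the login page. The helper name `captcha_url_from_xpath` is hypothetical, and the relative path in the usage example is made up for illustration.

```python
from urllib.parse import urljoin


def captcha_url_from_xpath(result, page_url):
    """Turn the list returned by xpath('.../@src') into an absolute URL.

    Returns None when the list is empty, i.e. no CAPTCHA image was found.
    """
    if not result:
        return None
    # urljoin leaves absolute URLs unchanged and resolves relative ones
    # against the page URL.
    return urljoin(page_url, str(result[0]).strip())


if __name__ == "__main__":
    login_page = "https://accounts.douban.com/login"
    print(captcha_url_from_xpath([], login_page))
    print(captcha_url_from_xpath(["/misc/captcha?id=1"], login_page))  # made-up path
```

When the helper returns `None`, the script can skip the recognition step and submit the form directly.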

Type the recognized CAPTCHA value into the CAPTCHA field

driver.find_element_by_id("captcha_field").send_keys(return_value)

Click the login button

driver.find_element_by_class_name("btn-submit").click()

After logging in, some pages can only be fetched with the login cookies; without them the request returns nothing useful

cookie_selenium=driver.get_cookies()
# build a "name=value; name=value" Cookie header from Selenium's cookie dicts
cookies=[]
for i in cookie_selenium:
    cookie=i['name']+'='+i['value']
    # print(cookie)
    cookies.append(cookie)
cookie='; '.join(cookies)
headers={
    'Cookie':cookie,
    "User-Agent":'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
}
url='https://www.douban.com/accounts/'
response1=requests.get(url,headers=headers)
# save the fetched page to disk
with open('personsl.html','wb') as ff:
    ff.write(response1.content)
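
The cookie handling above can be pulled into a small function, which makes it easy to check without opening a browser. This is a sketch only: `cookies_to_header` is a hypothetical name, and the sample cookies are made up. It relies on the fact that `driver.get_cookies()` returns a list of dicts that each carry `name` and `value` keys.

```python
def cookies_to_header(selenium_cookies):
    """Join Selenium cookie dicts into a single 'name=value; ...' string."""
    return "; ".join(
        "{}={}".format(c["name"], c["value"]) for c in selenium_cookies
    )


if __name__ == "__main__":
    # Made-up cookies in the shape returned by driver.get_cookies().
    sample = [
        {"name": "sessionid", "value": "abc123", "domain": ".example.com"},
        {"name": "token", "value": "xyz", "domain": ".example.com"},
    ]
    print(cookies_to_header(sample))  # sessionid=abc123; token=xyz
```

The result goes straight into the `Cookie` entry of the `headers` dict passed to `requests.get`.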


Reposted from blog.csdn.net/chengjintao1121/article/details/82055298