"Mastering Python Web Crawlers" Study Notes, Part 3


Other 2018-08-18 10:14:04

Cookie
- Cookiejar

Cookiejar

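First, press F12 to open the browser developer tools, click the login button, and find the request URL of the resulting POST request. Then look in the page source for the name attributes of the login form fields.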
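As a rough sketch of the second step, the field names can also be pulled out of the page source programmatically with the standard library's html.parser. The FormFieldParser class and the sample_html form below are hypothetical stand-ins, not taken from the book or from any real login page.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the name attribute of every <input> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.field_names.append(name)


# Hypothetical login form standing in for the real page source
sample_html = """
<form action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" value="Log in">
</form>
"""

parser = FormFieldParser()
parser.feed(sample_html)
print(parser.field_names)  # ['username', 'password']
```

The names collected this way are the keys to use in the POST data; the submit button is skipped because it has no name attribute.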
Start by logging in without any cookie handling:

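import urllib.parse
import urllib.request

url = "....."  # login URL (the POST request URL found via F12)
# The keys must match the name attributes of the login form fields
postdata = urllib.parse.urlencode({
    "username": "....",
    "password": "...."
}).encode('utf-8')
req = urllib.request.Request(url, postdata)  # passing data makes this a POST request
req.add_header('User-Agent', '.....')
data = urllib.request.urlopen(req).read()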

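This fetches the page that the site redirects to after login. Because no cookie is saved, the login state is lost on any later request to another page of the site.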
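To see what the request above actually carries, the following sketch builds the same kind of Request object without sending it. The URL, credentials, and User-Agent string are made-up placeholders, not values from the original notes.

```python
import urllib.parse
import urllib.request

# Placeholder credentials, for illustration only
postdata = urllib.parse.urlencode({
    "username": "alice",
    "password": "s3cret",
}).encode("utf-8")

# Placeholder URL; nothing is sent over the network in this sketch
req = urllib.request.Request("http://example.com/login", postdata)
req.add_header("User-Agent", "Mozilla/5.0")

print(postdata)          # b'username=alice&password=s3cret'
print(req.get_method())  # POST
# urllib stores header names capitalized, so the lookup key is 'User-agent'
print(req.get_header("User-agent"))  # Mozilla/5.0
```

urlencode produces the form-encoded string, encode turns it into the bytes that Request expects, and supplying that data is what switches the request method from GET to POST.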

Next, log in with cookie handling and fetch the home page shown after login:

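import http.cookiejar
import urllib.request
# ... same as above (url, postdata, req)
cjar = http.cookiejar.CookieJar()
# Build an opener whose HTTPCookieProcessor stores and resends cookies via cjar
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cjar))
urllib.request.install_opener(opener)  # install the opener globally so urlopen() uses it too
file = opener.open(req)
data = file.read()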
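The difference between the two approaches can be tried out locally without a real site. The sketch below is only an illustration under stated assumptions: DemoHandler is a hypothetical toy server that sets a session cookie on POST /login and checks for it on GET /home, and login_then_fetch_home is a hypothetical helper. Neither comes from the book.

```python
import http.cookiejar
import http.server
import threading
import urllib.parse
import urllib.request


class DemoHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical site: POST /login sets a cookie, GET /home checks for it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the form body; credentials are not checked
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.end_headers()
        self.wfile.write(b"logged in")

    def do_GET(self):
        logged_in = "session=abc123" in self.headers.get("Cookie", "")
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"welcome back" if logged_in else b"please log in")

    def log_message(self, *args):
        pass  # keep the demo output quiet


def login_then_fetch_home(opener, base):
    """Log in with the given opener, then request the home page with it."""
    postdata = urllib.parse.urlencode(
        {"username": "demo", "password": "demo"}).encode("utf-8")
    opener.open(urllib.request.Request(base + "/login", postdata)).read()
    return opener.open(base + "/home").read()


# Start the toy server on a free local port
server = http.server.HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# ProxyHandler({}) bypasses any system proxy so the local server is reached directly
no_proxy = urllib.request.ProxyHandler({})
plain = urllib.request.build_opener(no_proxy)
cjar = http.cookiejar.CookieJar()
with_cookies = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(cjar))

without_result = login_then_fetch_home(plain, base)
with_result = login_then_fetch_home(with_cookies, base)
server.shutdown()
server.server_close()

print(without_result)          # b'please log in'
print(with_result)             # b'welcome back'
print([c.name for c in cjar])  # ['session']
```

The plain opener forgets the Set-Cookie header as soon as the login response is read, while the opener built with HTTPCookieProcessor stores the cookie in cjar and sends it back on the next request.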


Reposted from blog.csdn.net/sinat_25721683/article/details/81116496
