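1. Common HTTP request methods
OPTIONS:
Returns the HTTP request methods that the server supports for a given resource.
HEAD:
Asks the server for the same response as a GET request, except that the response body is not returned.
GET:
Requests a representation of the specified resource.
PUT:
Uploads the latest content to the specified resource location.
POST:
Submits data to the specified resource for processing.
DELETE:
Asks the server to delete the resource identified by the URI.
PATCH:
Applies a partial modification to a resource.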
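The difference between GET and HEAD can be sketched with the standard library alone, against a throwaway local server. `DemoHandler`, `fetch` and `BODY` are names made up for this illustration and are not part of requests; the sketch assumes the loopback interface is available.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: sends the same headers for GET and HEAD."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)  # GET returns headers and body

    def do_HEAD(self):
        self._send_headers()  # HEAD returns the headers only

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(method, port):
    """Send one request to the local server; return (status, Content-Length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Content-Length"), resp.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch("GET", port)
head_result = fetch("HEAD", port)
server.shutdown()
server.server_close()

print(get_result)   # status, length header and the body
print(head_result)  # same status and headers, empty body
```

With requests, the same comparison would be `requests.get(url)` against `requests.head(url)`.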
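2. Common HTTP status codes
200/OK:
The request succeeded.
201/Created:
The request has been fulfilled and a new resource has been created; its URI is returned in the Location header.
202/Accepted:
The server has accepted the request but has not processed it yet.
400/Bad Request:
The server could not understand the request.
401/Unauthorized:
The request requires user authentication.
403/Forbidden:
The server understood the request but refuses to fulfil it.
404/Not Found:
The server could not find the requested resource.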
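The codes above can be looked up programmatically. This is a small sketch using the standard library's `http.HTTPStatus`; the helper name `describe` and the category labels are made up for this illustration.

```python
from http import HTTPStatus


def describe(code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {status.phrase} ({category})"


for code in (200, 201, 202, 400, 401, 403, 404):
    print(describe(code))
```

The first digit carries the category, so a check like `200 <= r.status_code < 300` is usually enough to tell success from failure.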
3. requests.get(url): getting the response object for a URL
The response object holds the server's reply to the request: the status code, headers, encoding, and body.
import requests
from io import BytesIO
from PIL import Image
import json
url = 'http://www.baidu.com'
r = requests.get(url)  # get the Response object
print(r)
print(r.status_code)  # status code; 200 means success
print(r.encoding)  # encoding used to decode r.text
<Response [200]>
200
ISO-8859-1
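The ISO-8859-1 above is worth a second look. As far as I know, requests falls back to it when the server sends a text Content-Type without a charset, which garbles pages that are really UTF-8. The sketch below reproduces the effect with plain bytes; the sample text is an assumption chosen for illustration.

```python
text = "百度一下"
raw = text.encode("utf-8")  # bytes as a UTF-8 server would send them

wrong = raw.decode("iso-8859-1")  # decoding with the fallback encoding
right = raw.decode("utf-8")       # decoding with the real encoding

print(wrong == text)  # the fallback does not recover the text
print(right == text)  # the correct encoding does
```

With a real response, setting `r.encoding = 'utf-8'` (or `r.encoding = r.apparent_encoding`) before reading `r.text` should have the same effect.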
4. Passing parameters, for example: http://aaa.com?pageId=1&type=content
params = {'k1':'v1','k2':'v2'}
r = requests.get('http://httpbin.org/get', params=params)  # the dict becomes the query string
print(r.url)
http://httpbin.org/get?k1=v1&k2=v2
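To see how a dict turns into the query string shown above, here is a sketch with the standard library's `urllib.parse` instead of requests itself. `build_url` is a made-up helper that assumes the base URL has no query string yet.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, params):
    """Append params to base as a query string (base must not have one yet)."""
    return base + "?" + urlencode(params)


url = build_url("http://httpbin.org/get", {"k1": "v1", "k2": "v2"})
print(url)

# Characters with a special meaning in URLs are escaped automatically.
special = build_url("http://httpbin.org/get", {"q": "a b&c"})
print(special)
print(parse_qs(urlsplit(special).query))  # decoding restores the original value
```

Passing a dict rather than pasting values into the URL by hand is what keeps values such as `a b&c` from breaking the query.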
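5. Binary data: fetching and saving an image as an example. Note:
- response.text returns the body decoded into a Unicode string (str).
- response.content returns the raw body as bytes, that is, binary data.
- response.iter_content(chunk_size=1024) returns the binary data in chunks (here up to 1024 bytes at a time); chunk_size defaults to 1.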
r = requests.get('https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec\
=1530370070095&di=42cc5a0b3201fe4e89f05ca243a9502b&imgtype=0&src=http\
%3A%2F%2Fww2.sinaimg.cn%2Fbmiddle%2F0067Ewosgw1f5xwpkn00nj30ij0rsth2.jpg')
image = Image.open(BytesIO(r.content))  # content holds the binary data, text holds the decoded string
image.save('meinv.png')
with open('meinv2.png', 'wb') as fw:
    for chunk in r.iter_content(1024):  # write up to 1024 bytes at a time
        fw.write(chunk)
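The chunked write above can be wrapped in a small helper. This is a sketch only: `save_chunks` is a made-up name, and `fake_chunks` is a hypothetical stand-in for `r.iter_content(chunk_size)` so that the example runs without a network connection.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of bytes chunks to path; return the number of bytes written."""
    total = 0
    with open(path, "wb") as fw:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fw.write(chunk)
                total += len(chunk)
    return total


def fake_chunks(data, chunk_size):
    """Hypothetical stand-in for r.iter_content(chunk_size): slice bytes into chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


data = bytes(range(256)) * 10  # 2560 bytes of sample data
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
written = save_chunks(fake_chunks(data, 1024), path)
print(written)
```

With requests, something like `r = requests.get(url, stream=True)` followed by `save_chunks(r.iter_content(chunk_size=1024), path)` should avoid holding the whole file in memory.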
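6. JSON handling
r = requests.get('https://github.com/timeline.json')
print(type(r.json))  # r.json is a method; without parentheses this prints the method, not the data
print(r.json)
<class 'method'>
<bound method Response.json of <Response [410]>>
To get the parsed data, call r.json() with parentheses. Note that this endpoint answered with status 410 here, so check r.status_code before relying on the body.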
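A defensive way to read JSON might look like the sketch below. `parse_json` is a made-up helper, and `FakeResponse` is a hypothetical stand-in that exposes only `status_code` and `json()`, so the example runs offline; a real `requests.Response` could be passed in its place.

```python
import json


class FakeResponse:
    """Hypothetical stand-in for a response: just a status code and a text body."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


def parse_json(response):
    """Return the parsed JSON body of a 2xx response, otherwise None."""
    if not 200 <= response.status_code < 300:
        return None
    try:
        return response.json()  # note the parentheses: json is a method
    except ValueError:  # the body was not valid JSON
        return None


ok = parse_json(FakeResponse(200, '{"k1": "v1"}'))
gone = parse_json(FakeResponse(410, '{"message": "gone"}'))
bad = parse_json(FakeResponse(200, "<html>"))
print(ok, gone, bad)
```

Returning None for both failure cases is a simplification; real code may prefer to raise or log the status code.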
7. Submitting a form: commonly used to simulate a login
requests.post(url, data=form, headers=headers) sends a POST request to url, carrying the form data and the given headers.
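What ends up in the request body depends on how the dict is passed. The sketch below uses only the standard library to build the two bodies by hand: a URL-encoded form, which is what a dict passed as `data` is sent as, and a JSON string, which is what `json.dumps` produces. The variable names are illustrative.

```python
import json
from urllib.parse import urlencode

form = {"username": "user", "password": "pass"}

form_body = urlencode(form)   # form fields: key=value pairs joined by &
json_body = json.dumps(form)  # a single JSON string

print(form_body, len(form_body))
print(json_body, len(json_body))
```

The lengths printed here are what a Content-Length header would report for each body. requests also accepts `json=form`, which serializes the dict and sets a JSON Content-Type header for you.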
form = {'username':'user','password':'pass'}  # a dict passed as data is treated as a form
r = requests.post('http://httpbin.org/post', data = form)
print(r.text)
{"args":{},"data":"","files":{},"form":{"password":"pass","username":"user"},\
"headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close",\
"Content-Length":"27","Content-Type":"application/x-www-form-urlencoded",\
"Host":"httpbin.org","User-Agent":"python-requests/2.18.4"},"json":null,\
"origin":"114.213.252.227","url":"http://httpbin.org/post"}
Here json.dumps(form) converts the dict to a str, so the body is sent as one JSON string instead of form fields.
r = requests.post('http://httpbin.org/post', data = json.dumps(form))
print(r.text)
{"args":{},"data":"{\"username\": \"user\", \"password\": \"pass\"}","files":{},"form":{},\
"headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close",\
"Content-Length":"40","Host":"httpbin.org","User-Agent":"python-requests/2.18.4"},\
"json":{"password":"pass","username":"user"},"origin":"114.213.252.227",\
"url":"http://httpbin.org/post"}
8. Cookies: data (usually encrypted) that some websites store on the user's local device to identify the user and track sessions
url = 'http://www.baidu.com'
r = requests.get(url)
cookies = r.cookies  # get the cookies
for k, v in cookies.get_dict().items():
    print(k, v)
BDORZ 27315
cookies = {'c1':'v1', 'c2':'v2'}
r = requests.get('http://httpbin.org/cookies', cookies = cookies)
print(r.text)
{"cookies":{"c1":"v1","c2":"v2"}}
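On the wire, cookies are plain header text. The sketch below parses a Set-Cookie style string with the standard library's `http.cookies` and builds the Cookie header a client would send back. The attribute values (`Max-Age`, `Path`) are assumptions for illustration, not what the site above actually returns.

```python
from http.cookies import SimpleCookie

jar = SimpleCookie()
jar.load("BDORZ=27315; Max-Age=86400; Path=/")  # attributes are illustrative

morsel = jar["BDORZ"]
print(morsel.value)       # the cookie value
print(morsel["path"])     # where the cookie applies
print(morsel["max-age"])  # lifetime in seconds

# The Cookie request header is just name=value pairs joined by "; ".
cookies = {"c1": "v1", "c2": "v2"}
header = "; ".join(f"{k}={v}" for k, v in cookies.items())
print(header)
```

For a login flow, `requests.Session()` is usually the easier route, since a session keeps cookies between requests.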
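9. Redirects and redirect history
Simply put, HTTPS is HTTP layered on top of SSL, giving a protocol with encrypted transport and identity authentication that is more secure than plain HTTP.
A redirect sends a network request on to a different location (for example, a web page redirect).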
r = requests.head('http://github.com', allow_redirects = True)
print(r.url)
print(r.status_code)
print(r.history)
https://github.com/ # github was redirected to the https page
200
[<Response [301]>]
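To show what sits behind `r.history`, here is a sketch that follows redirects by hand with the standard library, against a throwaway local server. `RedirectHandler` and `head_with_history` are names made up for this illustration, the `/old` and `/new` paths are assumptions, and requests does considerably more than this internally.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin, urlsplit


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old redirects permanently to /new."""

    def do_HEAD(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
        else:
            self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def head_with_history(url, max_redirects=5):
    """Follow redirects manually; return (final_url, final_status, history)."""
    history = []
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        conn = http.client.HTTPConnection(parts.hostname, parts.port)
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        status, location = resp.status, resp.getheader("Location")
        conn.close()
        if status in (301, 302, 303, 307, 308) and location:
            history.append(status)         # remember the intermediate response
            url = urljoin(url, location)   # Location may be relative
            continue
        return url, status, history
    raise RuntimeError("too many redirects")


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

final_url, final_status, history = head_with_history(f"http://127.0.0.1:{port}/old")
server.shutdown()
server.server_close()

print(final_url, final_status, history)
```

Capping the number of hops matters because two URLs that redirect to each other would otherwise loop forever.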