response = urllib.request.urlopen(req)
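Limitations of calling urlopen directly: you cannot plug in an IP pool (proxy IPs), and it does not handle cookies.
So instead of relying on the default urlopen, build a custom opener from a handler, here HTTPSHandler: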
import urllib.request

# Handler: HTTPSHandler covers https; build_opener() also keeps the default
# HTTPHandler, so plain http URLs work as well
handler = urllib.request.HTTPSHandler()
# Opener built from the handler
opener = urllib.request.build_opener(handler)

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"
}
url = 'http://www.baidu.com'
# Build the request
req = urllib.request.Request(url, headers=headers)

"""
Option 1: open the page through the opener directly.
Parameters of opener.open(): fullurl, data=None, timeout
"""
# response = opener.open(req)
# print(response.read().decode('utf-8'))

"""
Option 2: install the opener globally, then open the page with urlopen.
urlopen() ends with `return opener.open(url, data, timeout)`, so
response = urllib.request.urlopen(req) is in essence response = opener.open(req)
"""
# Install the global opener
urllib.request.install_opener(opener)
# Open the page
response = urllib.request.urlopen(req)
print(response.read().decode('utf-8'))
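The two limitations mentioned at the top (proxy IPs and cookies) can be addressed with the same handler/opener pattern. The following is only a minimal sketch: the helper name make_opener and the proxy address are hypothetical placeholders, not part of the original notes. It combines the standard-library ProxyHandler and HTTPCookieProcessor with HTTPSHandler and sends no request by itself.

```python
import http.cookiejar
import urllib.request


def make_opener(proxy_url=None):
    """Build an opener that stores cookies and optionally uses a proxy.

    proxy_url is a hypothetical placeholder, e.g. "http://127.0.0.1:8888".
    """
    # The cookie jar collects cookies set by responses and resends them
    cookie_jar = http.cookiejar.CookieJar()
    handlers = [
        urllib.request.HTTPSHandler(),
        urllib.request.HTTPCookieProcessor(cookie_jar),
    ]
    if proxy_url is not None:
        # Route both http and https traffic through the given proxy
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    return urllib.request.build_opener(*handlers), cookie_jar


# Nothing is fetched here; call opener.open(req) to send a request
opener, cookie_jar = make_opener()
```

To rotate through an IP pool, one option is to call make_opener with a different proxy address per request; to make urlopen use the result everywhere, pass the opener to urllib.request.install_opener as above.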