5. Using requests-html, the latest Python scraping tool

Other 2018-08-20 11:21:03

1. Installation: run pip install requests-html on the command line. Once the install succeeds, the package can be imported in PyCharm.
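Before moving on, it can help to confirm that the interpreter PyCharm is configured with actually sees the package. The following is a minimal sketch using only the standard library; the helper name is_installed is made up for illustration and is not part of requests-html. Note that the package is installed as requests-html but imported as requests_html.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # pip's name uses a hyphen; the import name uses an underscore.
    print("requests_html available:", is_installed("requests_html"))
```

If this prints False inside PyCharm but pip reported success, the project is probably using a different interpreter from the one pip installed into.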

2. The code is as follows:

from requests_html import HTMLSession
import requests

session = HTMLSession()

r = session.get('http://www.win4000.com/wallpaper_2358_0_10_1.html')

# Select every <a> element in the wallpaper list on the page
images = r.html.find('ul.clearfix > li > a')


def save_image(url, title):
    # Save the image into the target directory
    # (the bg folder on drive E: must be created by hand beforehand)
    html_response = requests.get(url)
    with open('E:/bg/' + title + '.jpg', 'wb') as file:
        file.write(html_response.content)


# Find the wallpapers on the page, follow each link to its detail page,
# and read the address of the full-size image there
for image in images:
    image_url = image.attrs.get('href', '')  # href of this <a> element
    if '/wallpaper_detail' in image_url:
        r = session.get(image_url)
        # The full-size <img> element on the detail page
        item = r.html.find('img.pic-large', first=True)
        if item is None:
            continue  # no full-size image on this page, skip it
        url = item.attrs['src']
        title = item.attrs['title']
        print(url, title)
        save_image(url, title)

3. The scraped images can then be found in the target directory.
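The save function in step 2 builds the file path by joining 'E:/bg/' and the image title directly, so it depends on the folder already existing and on the title containing no characters that Windows rejects in file names. One possible way to relax both requirements is sketched below. The names build_image_path and _INVALID_CHARS are hypothetical helpers, not part of the original code or of requests-html, and the replacement character and the "untitled" fallback are arbitrary choices.

```python
import re
from pathlib import Path

# Characters that Windows does not allow in file names.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def build_image_path(directory, title, suffix=".jpg"):
    """Return a safe file path for an image, creating the folder if needed."""
    # Replace forbidden characters; fall back to a fixed name if nothing is left.
    safe_title = _INVALID_CHARS.sub("_", title).strip() or "untitled"
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # no manual folder creation
    return folder / (safe_title + suffix)
```

With a helper like this, the save function could open build_image_path('E:/bg', title) instead of concatenating strings, and the bg folder would be created on first use.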

　　

Reposted from www.cnblogs.com/android-it/p/9504388.html
