Scraping Images with a Python Web Crawler

Other 2018-11-20 10:22:17

1. I have recently been learning Python (Python 3) and tried scraping photos from the web.

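2. I started from code I found online and noticed that it fails to scrape pages whose encoding (utf-8, gbk, and so on) differs from what it expects. I therefore used the chardet module to detect the encoding. With that small change it works fairly well: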

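import re
import urllib.request

import chardet


def getHtml(url):
    # Download the page as raw bytes.
    page = urllib.request.urlopen(url)
    html = page.read()
    # chardet guesses the encoding; the value is None when it cannot tell.
    encoding = chardet.detect(html)['encoding']
    if encoding and encoding.lower() == 'utf-8':
        html = html.decode('utf-8')
    else:
        # Treat everything else as GBK.
        html = html.decode('GBK')
    return html


def getImg(html):
    # Capture the value of every src="....jpg" attribute.
    imgre = re.compile(r'src="(.+?\.jpg)"')
    imglist = imgre.findall(html)
    for x, imgurl in enumerate(imglist):
        # The folder D:\pic must already exist; the raw string keeps the
        # backslashes in the Windows path literal.
        urllib.request.urlretrieve(imgurl, r'D:\pic\%s.jpg' % x)


url = "https://pixabay.com/"
html = getHtml(url)
getImg(html)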
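
The script above has two weak spots: a page that is neither utf-8 nor GBK fails to decode, and a relative src value such as /static/a.jpg cannot be passed to urlretrieve directly. The following is a minimal sketch of one way to handle both, not the method from the referenced blog. It swaps chardet for a simpler technique (trying candidate encodings in order) and resolves image paths with urljoin. The names decode_page and find_jpg_urls and the example.com addresses are hypothetical and used only for illustration.

```python
import re
from urllib.parse import urljoin


def decode_page(raw, candidates=("utf-8", "gbk")):
    """Decode raw page bytes by trying each candidate encoding in order."""
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: decode with replacement characters instead of failing.
    return raw.decode(candidates[0], errors="replace")


def find_jpg_urls(html, base_url):
    """Return absolute URLs for every src="....jpg" attribute in html."""
    sources = re.findall(r'src="(.+?\.jpg)"', html)
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the address of the page they came from.
    return [urljoin(base_url, src) for src in sources]


if __name__ == "__main__":
    # Hypothetical page content, encoded as GBK to exercise the fallback.
    sample = (
        '<img src="/static/a.jpg">'
        '<img src="https://cdn.example.com/b.jpg">'
    ).encode("gbk")
    page = decode_page(sample)
    print(find_jpg_urls(page, "https://example.com/index.html"))
```

The list returned by find_jpg_urls could then be passed to urllib.request.urlretrieve one URL at a time, as getImg does. Trying encodings in order is cruder than chardet's statistical detection, but it needs no third-party package and never raises on undecodable bytes.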

Reference blog post: http://blog.csdn.net/longshengguoji/article/details/9946675

Reposted from blog.csdn.net/boyheroes/article/details/71246083
