Batch-scraping and downloading images in two formats with Python

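This article uses the official Codemao (编程猫) website as its teaching example; please bear with that. In the third update the author also worked out code that scrapes two image formats, and the article includes some worked examples, such as scraping Huitu (汇图网) and CSDN.

PS: the code for scraping two formats is at the end of the article.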

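Preface

As a web-scraping beginner, I have been learning with Codemao. Recently Codemao moved on from video processing to web scraping, and I got to benefit from that...
Today I will share a way to download images in bulk.
PS: a method for scraping images from Huitu is included later in the article.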

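Finding the resource

Go to the Codemao handbook site and find where the images are listed

Open https://shequ.codemao.cn/wiki/book to reach the sprite handbook page of the official Codemao community, press F12 to open the developer tools, then click Network and XHR in turn.
Press F5 to refresh, then click the requests one by one and check the Response tab of each, until you find the one named all.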

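Getting the URL of the listing

This response contains many image URLs, so it is the one we want! Next we need its address: click Headers and you will find the request URL: https://api.codemao.cn/api/sprite/list/all
Open this URL and there is the data!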

{"code":200,"msg":"成功","description":"Http request finish without mistake","data":{"sprite_list":[{"id":32,"name":"编程猫","faction_id":2,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/001%E7%BC%96%E7%A8%8B%E7%8C%AB.png","NO":1,"faction_name":"普通"},{"id":29,"name":"猫老祖","faction_id":12,"star":6,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/002%E7%8C%AB%E8%80%81%E7%A5%96.png","NO":2,"faction_name":"神圣"},{"id":57,"name":"黑色编程猫","faction_id":2,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook/%E9%BB%91%E8%89%B2%E7%BC%96%E7%A8%8B%E7%8C%AB%E5%9B%BE%E9%89%B4%E5%89%AF%E6%9C%AC.png","NO":6,"faction_name":"普通"},{"id":26,"name":"木叶龙","faction_id":3,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/003%E6%9C%A8%E5%8F%B6%E9%BE%99.png","NO":7,"faction_name":"草"},{"id":40,"name":"雷电猴","faction_id":5,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/006%E9%9B%B7%E7%94%B5%E7%8C%B4.png","NO":10,"faction_name":"电"},{"id":24,"name":"星能猫","faction_id":11,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/009%E6%98%9F%E8%83%BD%E7%8C%AB2.png","NO":13,"faction_name":"超能"},{"id":30,"name":"疾风雀","faction_id":10,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/012%E7%96%BE%E9%A3%8E%E9%9B%80.png","NO":16,"faction_name":"飞行"},{"id":22,"name":"导弹鲨","faction_id":7,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/015%E6%8D%A3%E8%9B%8B%E9%B2%A8.png","NO":18,"faction_name":"水"},{"id":71,"name":"花粉虫","faction_id":6,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/021-花粉虫-V2.png","NO":21,"faction_name":"虫"},{"id":33,"name":"花粉蝶","faction_id":6,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E8%8A%B1%E7%B2%89%E8%9D%B62.png","NO":23,"faction_name":"虫"},{"id":20,"name":"呆鲤鱼","faction_id":7,"star":1,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/024%E5%91%86%E9%B2%A4%E9%B1%BC
.png","NO":24,"faction_name":"水"},{"id":21,"name":"大黄鸡","faction_id":9,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/027%E5%A4%A7%E9%BB%84%E9%B8%A1.png","NO":27,"faction_name":"机械"},{"id":28,"name":"熔岩龙","faction_id":8,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E7%86%94%E5%B2%A9%E9%BE%99.png","NO":32,"faction_name":"火"},{"id":31,"name":"笨笨鸭","faction_id":7,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/032%E4%B8%91%E5%B0%8F%E9%B8%AD.png","NO":33,"faction_name":"水"},{"id":34,"name":"草灵灵","faction_id":3,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/034%E8%8D%89%E7%81%B5%E7%81%B5.png","NO":35,"faction_name":"草"},{"id":50,"name":"象牙螺","faction_id":7,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/036%E8%B1%A1%E7%89%99%E8%9E%BA.png","NO":37,"faction_name":"水"},{"id":36,"name":"蓝雀","faction_id":10,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/038%E8%93%9D%E9%9B%80.png","NO":39,"faction_name":"飞行"},{"id":37,"name":"达达蟹","faction_id":7,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/058%E8%BE%BE%E8%BE%BE%E8%9F%B92.png","NO":41,"faction_name":"水"},{"id":42,"name":"飞电鼠","faction_id":5,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/043%E9%A3%9E%E7%94%B5%E9%BC%A02.png","NO":43,"faction_name":"电"},{"id":58,"name":"独角蛛","faction_id":6,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/068%E7%8B%AC%E8%A7%92%E8%9B%9B.png","NO":45,"faction_name":"虫"},{"id":59,"name":"妙音龙","faction_id":11,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/069%E5%A6%99%E9%9F%B3%E9%BE%99.png","NO":47,"faction_name":"超能"},{"id":61,"name":"地龙","faction_id":4,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/070%E5%9C%B0%E9%BE%99.png","NO":49,"faction_name":"地"},{"id":62,"name":"绅士猫","faction_id":2,"star":3,"handbook_image":"https://static.
codemao.cn/sprite/handbook-v2/071%E7%BB%85%E5%A3%AB%E7%8C%AB3.png","NO":52,"faction_name":"普通"},{"id":65,"name":"拳击袋鼠","faction_id":4,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/074%E6%8B%B3%E5%87%BB%E8%A2%8B%E9%BC%A0.png","NO":54,"faction_name":"地"},{"id":64,"name":"冰牙犬","faction_id":7,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/073%E5%86%B0%E7%89%99%E7%8A%AC.png","NO":55,"faction_name":"水"},{"id":70,"name":"小火熊","faction_id":8,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E5%B0%8F%E7%81%AB%E7%86%8A.png","NO":57,"faction_name":"火"},{"id":63,"name":"晴天娃娃","faction_id":10,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/072%E6%99%B4%E5%A4%A9%E5%A8%83%E5%A8%83.png","NO":62,"faction_name":"飞行"},{"id":67,"name":"涂鸦狐","faction_id":2,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E6%B6%82%E9%B8%A6%E7%8B%B8.png","NO":64,"faction_name":"普通"},{"id":72,"name":"炸弹齿轮","faction_id":9,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E7%82%B8%E5%BC%B9%E9%BD%BF%E8%BD%AE.png","NO":66,"faction_name":"机械"},{"id":39,"name":"阿尔法","faction_id":11,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/051-阿尔法-V2.png","NO":78,"faction_name":"超能"},{"id":68,"name":"画笔海龟","faction_id":7,"star":3,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E7%94%BB%E7%AC%94%E6%B5%B7%E9%BE%9F.png","NO":79,"faction_name":"水"},{"id":69,"name":"网络爬虫","faction_id":6,"star":4,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/%E7%BD%91%E7%BB%9C%E7%88%AC%E8%99%AB.png","NO":80,"faction_name":"虫"},{"id":44,"name":"贝塔1","faction_id":11,"star":5,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/115%E8%B4%9D%E5%A1%941.png","NO":81,"faction_name":"超能"},{"id":45,"name":"贝塔2","faction_id":11,"star":5,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/116%E8%B4%9D%E5%A1%942.png","NO":82,"faction_name
":"超能"},{"id":56,"name":"玛洛斯","faction_id":12,"star":6,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/064%E7%8E%9B%E6%B4%9B%E6%96%AF.png","NO":84,"faction_name":"神圣"},{"id":66,"name":"玛洛斯-贝塔","faction_id":11,"star":6,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/075%E6%9A%97%E9%BB%91%E7%8E%9B%E6%B4%9B%E6%96%AF.png","NO":85,"faction_name":"超能"},{"id":73,"name":"黄金呆鲤鱼","faction_id":7,"star":2,"handbook_image":"https://static.codemao.cn/sprite/handbook-v2/huangjingdaliyu.png","NO":86,"faction_name":"水"}]}}

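The data follows a pattern: every handbook_image value is a URL that starts with https and ends with png.
So all we need to do is build a regular expression!!!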
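Since the response is JSON, the image URLs could also be read straight from the `handbook_image` field instead of being matched as text. The sketch below shows that alternative; it is not the author's method, which stays with a regular expression. It runs on a two-entry excerpt of the response shown above (shortened by hand), and the helper name `extract_image_urls` is made up for illustration.

```python
import json

# Two entries copied from the API response above; the other fields are trimmed.
sample = '''{"code":200,"msg":"成功","data":{"sprite_list":[
{"id":32,"name":"编程猫","handbook_image":"https://static.codemao.cn/sprite/handbook-v2/001%E7%BC%96%E7%A8%8B%E7%8C%AB.png","NO":1},
{"id":29,"name":"猫老祖","handbook_image":"https://static.codemao.cn/sprite/handbook-v2/002%E7%8C%AB%E8%80%81%E7%A5%96.png","NO":2}
]}}'''


def extract_image_urls(text):
    """Return the handbook_image URL of every sprite in the JSON text."""
    payload = json.loads(text)
    return [sprite["handbook_image"] for sprite in payload["data"]["sprite_list"]]


urls = extract_image_urls(sample)
for url in urls:
    print(url)
```

With requests, the same structure is available through `webPage.json()` on the response object, before it is turned into text.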
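First, a quick look at the wildcards the pattern relies on: . matches any single character, * repeats the preceding item zero or more times, and a ? placed after * makes the match non-greedy, so it stops at the first possible ending.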
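To see why the non-greedy `?` matters, here is a minimal sketch comparing the article's pattern `https.*?png` with a greedy variant. The input string and the `a.example` host are invented for the demonstration.

```python
import re

# A made-up fragment that imitates two entries of the JSON response.
text = '"handbook_image":"https://a.example/1.png","x":1,"handbook_image":"https://a.example/2.png"'

lazy = re.findall(r'https.*?png', text)    # non-greedy: stop at the first "png"
greedy = re.findall(r'https.*png', text)   # greedy: run on to the last "png"

print(len(lazy), lazy)
print(len(greedy), greedy)
```

The non-greedy pattern returns the two URLs separately, while the greedy one swallows everything from the first https to the last png as a single match.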

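The code

Importing the required libraries

About the re library

This time we use the familiar requests library and the string-matching library re. A short introduction to re: it is part of Python's standard library and is mainly used for string matching.
To use it: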

import re

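Ways to write a regular expression:
raw string type:
Regular expressions for re are usually written as raw strings, in the form r'text'
For example: r'[1-9]\d{5}'
In a raw string the backslash is not treated as an escape character, so it reaches re unchanged
string type, which is more cumbersome because every backslash has to be doubled.
For example: '[1-9]\\d{5}' or '\\d{3}-\\d{8}|\\d{4}-\\d{7}'
When a regular expression contains backslashes, it is best written as a raw string.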
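A small check of the two notations, as a sketch: the raw string and the doubled-backslash string spell the same pattern. The sample text with two six-digit numbers is invented for the demonstration.

```python
import re

raw_pattern = r'[1-9]\d{5}'       # raw string: the backslash is kept as written
plain_pattern = '[1-9]\\d{5}'     # ordinary string: the backslash must be doubled

same = raw_pattern == plain_pattern
print(same)                       # both variables hold the same characters

# In a raw string \n stays two characters; in an ordinary string it is one newline.
print(len(r'\n'), len('\n'))

codes = re.findall(raw_pattern, 'zip 100081 and 200030')
print(codes)
```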
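Main functions of the re library:
re.search(): finds the first place in the string where the pattern matches and returns a match object
re.match(): matches only at the beginning of the string and returns a match object
re.findall(): returns all matching substrings as a list
re.split(): splits the string at every match and returns a list
re.finditer(): returns an iterator that yields one match object per match
re.sub(): replaces every match and returns the new string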
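A minimal sketch that calls each of the main re functions once on the same string, to show how their results differ. The sample string is invented for the demonstration.

```python
import re

text = 'a1.png b22.jpg c333.png'

first = re.search(r'\d+', text)       # first match anywhere in the string
at_start = re.match(r'\d+', text)     # None here: the string does not start with a digit
all_nums = re.findall(r'\d+', text)   # every match, as a list of strings
pieces = re.split(r'\s+', text)       # split the string on whitespace
spans = [m.span() for m in re.finditer(r'png', text)]  # position of every match
replaced = re.sub(r'\.png', '.jpg', text)               # replace every match

print(first.group(), at_start)
print(all_nums)
print(pieces)
print(spans)
print(replaced)
```

For the scraper, `findall` is the relevant one, because it hands back every image URL in a single list.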

Code

Ahem! Enough talk. Here is the code that imports the libraries:

import requests
import re

Fetching the content of the whole page

webPage=requests.get("https://api.codemao.cn/api/sprite/list/all")
webPage=webPage.text

Some will ask: why add webPage=webPage.text?
Without it (code below)

import requests
import re
webPage=requests.get("https://api.codemao.cn/api/sprite/list/all")
print(webPage)

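the output is only the response object with its status code, such as <Response [200]>, not the content of the page.

Extension: what status codes mean

The first digit gives the class of the status code: 1xx is informational, 2xx means success (200 is OK), 3xx is a redirect, 4xx is a client error (404 is Not Found), and 5xx is a server error.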
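Python's standard library already carries the official phrase for each status code, so a lookup table does not have to be typed by hand. A minimal sketch follows; the helper name `describe_status` is made up for illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short text for an HTTP status code, e.g. '404 Not Found'."""
    try:
        return "%d %s" % (code, HTTPStatus(code).phrase)
    except ValueError:
        # HTTPStatus raises ValueError for numbers it does not know.
        return "%d (unknown status code)" % code


for code in (200, 301, 404, 500):
    print(describe_status(code))
```

With requests, the number itself is available as `webPage.status_code` on the response object (before it is replaced by its text), and `webPage.raise_for_status()` raises an exception for 4xx and 5xx responses.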

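The rest of the code...

There is quite a bit of code left, so here it is all at once, with the explanations written as comments.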

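image_re = re.compile(r'https.*?png')  # build the regular expression
sprite_image = image_re.findall(webPage)  # extract every matching image URL
a = range(len(sprite_image))  # the indices 0, 1, 2, ...
for b in a:
    sprite_image_1 = requests.get(sprite_image[b])  # request the image
    # save the data
    with open("图鉴%s.png" % b, "wb") as spritePage:  # create (or overwrite) the file
        # open() takes the file name first and the mode second;
        # at run time the value of b replaces %s
        spritePage.write(sprite_image_1.content)  # write the image bytes
    # leaving the with block closes and saves the file
    print("Saved %s image(s)\n" % (b + 1))  # progress message; b starts at 0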

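Complete code

import requests
import re
# fetch the content
webPage = requests.get("https://api.codemao.cn/api/sprite/list/all")
webPage = webPage.text
image_re = re.compile(r'https.*?png')  # build the regular expression
sprite_image = image_re.findall(webPage)  # extract every matching image URL
a = range(len(sprite_image))  # the indices 0, 1, 2, ...
for b in a:
    sprite_image_1 = requests.get(sprite_image[b])  # request the image
    # save the data
    with open("图鉴%s.png" % b, "wb") as spritePage:  # create (or overwrite) the file
        # open() takes the file name first and the mode second;
        # at run time the value of b replaces %s
        spritePage.write(sprite_image_1.content)  # write the image bytes
    # leaving the with block closes and saves the file
    print("Saved %s image(s)\n" % (b + 1))  # progress message; b starts at 0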

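Running it

Before running

Save the script in a folder of its own: the images are written to the current working directory, which is normally that folder.
Run it!
Then open the folder and take a look!
The images were downloaded successfully!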
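If the images should go into a dedicated subfolder rather than next to the script, the file path can be built with pathlib. This is only a sketch: the helper `target_path` and the folder name `sprites` are assumptions, not part of the original script, and the demo writes into a temporary directory so that it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def target_path(folder, prefix, index, ext):
    """Build folder/<prefix><index>.<ext>, creating the folder if it is missing."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / ("%s%s.%s" % (prefix, index, ext))


with tempfile.TemporaryDirectory() as tmp:
    path = target_path(Path(tmp) / "sprites", "图鉴", 0, "png")
    path.write_bytes(b"fake image bytes")  # stands in for sprite_image_1.content
    print(path.name, path.exists())
```

In the scraper, the returned path would be passed to `open(..., "wb")` in place of the bare file name.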

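Fixing one problem

Because list indices start at 0, one image ends up with the number 0. If we simply add b=b+1... this breaks when the line comes before sprite_image[b], because b is also used as the list index.
So we have to rename that file by hand.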
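One way to avoid renaming by hand is to keep the list index and the file number apart. The sketch below uses `enumerate` with `start=1`; the helper `numbered_names` and the `a.example` URLs are invented for the demonstration, and no download takes place.

```python
def numbered_names(urls, prefix="图鉴", ext="png", start=1):
    """Pair every URL with a file name numbered from `start`."""
    return [("%s%s.%s" % (prefix, number, ext), url)
            for number, url in enumerate(urls, start=start)]


# Stand-ins for the list that findall() returns.
urls = ["https://a.example/x.png", "https://a.example/y.png"]
pairs = numbered_names(urls)
for name, url in pairs:
    print(name, "<-", url)
```

In the original loop the same effect comes from leaving sprite_image[b] alone and using b + 1 only in the file name.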

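Example: scraping images from Huitu (汇图网)

Analyzing the format

Open http://www.huitu.com/
This site is put together very tightly. After a long search I found one image!
From it we can tell that the image URLs have the form http://show.huitu.com/pic/…………jpg

Modifying the code

Change the code to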

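import requests
import re
# fetch the content
webPage = requests.get("http://www.huitu.com/")
webPage = webPage.text
image_re = re.compile(r'http://show.huitu.com/pic/.*?jpg')  # build the regular expression
sprite_image = image_re.findall(webPage)  # extract every matching image URL
a = range(len(sprite_image))  # the indices 0, 1, 2, ...
for b in a:
    sprite_image_1 = requests.get(sprite_image[b])  # request the image
    # save the data
    with open("图%s.jpg" % b, "wb") as spritePage:  # the images are JPEGs, so save them as .jpg
        # open() takes the file name first and the mode second;
        # at run time the value of b replaces %s
        spritePage.write(sprite_image_1.content)  # write the image bytes
    # leaving the with block closes and saves the file
    print("Saved %s image(s)\n" % (b + 1))  # progress message; b starts at 0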

We changed lines 4 and 6 of the code (the page URL and the regular expression) and the file name on line 12.

Result

Here is the result:
؏؏☝ᖗ乛◡乛ᖘ☝؏؏ Perfect!!

Advanced code

The code

The advanced code fetches two formats in one run.
Code:

import requests
import re
webPage = requests.get("URL")  # replace "URL" with the address of the page to scrape
webPage = webPage.text
# first pass: jpg images
image_re_jpg = re.compile(r'https.*?jpg')
sprite_image_jpg = image_re_jpg.findall(webPage)
a = range(len(sprite_image_jpg))
for b in a:
    sprite_image_1_jpg = requests.get(sprite_image_jpg[b])
    with open("jpg图%s.jpg" % b, "wb") as spritePage_jpg:
        spritePage_jpg.write(sprite_image_1_jpg.content)
    print("Saved %s jpg image(s)\n" % (b + 1))
# second pass: png images
image_re_png = re.compile(r'https.*?png')
sprite_image_png = image_re_png.findall(webPage)
c = range(len(sprite_image_png))
for d in c:
    sprite_image_1_png = requests.get(sprite_image_png[d])
    with open("png图%s.png" % d, "wb") as spritePage_png:
        spritePage_png.write(sprite_image_1_png.content)
    print("Saved %s png image(s)\n" % (d + 1))
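The two passes can also be folded into one pattern that captures the extension. The sketch below is a variation, not the author's code: the pattern, the helper `find_images` and the `a.example` markup are all assumptions made for illustration. The character class stops a match at quotes, spaces and angle brackets, which a pattern like `https.*?jpg` does not do: on an HTML page it can start at an unrelated link and run on to the next "jpg" on the same line.

```python
import re

# http or https, then characters that can sit inside an attribute value,
# ending in .jpg or .png; the extension is captured as group 1.
IMAGE_RE = re.compile(r'https?://[^\s"\'<>]+?\.(jpg|png)')


def find_images(html):
    """Return (url, extension) pairs for every .jpg/.png URL in the text."""
    return [(m.group(0), m.group(1)) for m in IMAGE_RE.finditer(html)]


# Made-up markup: one ordinary link followed by two images.
html = ('<a href="https://a.example/page">link</a> '
        '<img src="https://a.example/cat.jpg"> <img src="http://a.example/dog.png">')
found = find_images(html)
for url, ext in found:
    print(ext, url)
```

Each pair carries its own extension, so a single download loop can choose the file name from it.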

Example

Let's scrape my homepage (I won't give the link here; just click my avatar)

Code

import requests
import re
webPage = requests.get("https://blog.csdn.net/weixin_43233491")
webPage = webPage.text
# first pass: jpg images
image_re_jpg = re.compile(r'https.*?jpg')
sprite_image_jpg = image_re_jpg.findall(webPage)
a = range(len(sprite_image_jpg))
for b in a:
    sprite_image_1_jpg = requests.get(sprite_image_jpg[b])
    with open("jpg图%s.jpg" % b, "wb") as spritePage_jpg:
        spritePage_jpg.write(sprite_image_1_jpg.content)
    print("Saved %s jpg image(s)\n" % (b + 1))
# second pass: png images
image_re_png = re.compile(r'https.*?png')
sprite_image_png = image_re_png.findall(webPage)
c = range(len(sprite_image_png))
for d in c:
    sprite_image_1_png = requests.get(sprite_image_png[d])
    with open("png图%s.png" % d, "wb") as spritePage_png:
        spritePage_png.write(sprite_image_1_png.content)
    print("Saved %s png image(s)\n" % (d + 1))

Give it a try!!!

Result

Wow! Great!!! The result looks really good!


Reposted from blog.csdn.net/weixin_43233491/article/details/104711697