Web Scraping Study Notes: The requests Library

Category: Other · 2020-02-17 12:10:31

Five scraping examples from the MOOC course
1. Scraping a JD.com product page

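import requests

url = "https://item.jd.com/100007926792.html"
try:
    r = requests.get(url, timeout=10)
    r.raise_for_status()              # raise an exception on 4xx/5xx responses
    r.encoding = r.apparent_encoding  # use the encoding detected from the body
    print(r.text[:1000])              # show the first 1000 characters
except requests.RequestException:
    print("error")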
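The same try/raise_for_status/apparent_encoding pattern recurs in every example in these notes, so it can be wrapped in one function. This is a minimal sketch, not code from the course: the name `fetch_text` and the 10-second default timeout are assumptions chosen for illustration.

```python
import requests

def fetch_text(url, timeout=10):
    """Return the decoded page body, or None if the request fails.

    fetch_text is a hypothetical helper name; the timeout default is an
    assumption, not a value taken from the original notes.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()              # raise HTTPError on 4xx/5xx
        r.encoding = r.apparent_encoding  # guess the encoding from the body
        return r.text
    except requests.RequestException as exc:
        # Covers connection errors, timeouts, bad URLs and HTTP errors.
        print(f"request failed: {exc}")
        return None
```

A call such as `fetch_text("https://item.jd.com/100007926792.html")` would then return the page text or `None`. Catching `requests.RequestException` rather than using a bare `except` keeps unrelated bugs, such as a misspelled variable name, from being silently reported as "error".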

2. Scraping an Amazon product page

import requests

url = "https://www.amazon.cn/dp/B00D20QFXQ?ref_=Oct_DLandingS_D_4a27ed07_61&smid=A3TEGLC21NOO5Y"
try:
    kv = {'user-agent': 'Mozilla/5.0'}  # present the request as coming from a browser
    r = requests.get(url, headers=kv, timeout=10)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[0:1000])
except requests.RequestException:
    print('error')
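The Amazon example differs from the JD one only in the `user-agent` header. The notes do not say why, but a likely reason is that some sites reject the identifier that requests sends by default. The sketch below shows that default and how a session-level override changes the prepared request, without sending anything over the network. The bare `https://www.amazon.cn/` URL is used only as a placeholder target.

```python
import requests

# The identifier requests sends when no User-Agent header is supplied.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# Override the header once on a session; every request made through the
# session then carries it.
session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})

# prepare_request builds the outgoing request without sending it,
# so the final headers can be inspected offline.
prepared = session.prepare_request(
    requests.Request("GET", "https://www.amazon.cn/")
)
print(prepared.headers["User-Agent"])
```

Setting the header on a `Session` is a design choice for scripts that fetch several pages; for a single request, passing `headers=kv` as in the example above is equivalent.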

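3. Submitting search keywords to Baidu and 360
Baidu keyword interface: http://www.baidu.com/s?wd=keyword
360 keyword interface: http://www.so.com/s?q=keyword

Replace keyword with the search term to submit a different query. The two interfaces differ only in the parameter name: Baidu uses wd, 360 uses q.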

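import requests

keyword = "python"
try:
    kv = {'wd': keyword}  # for 360, use {'q': keyword} with http://www.so.com/s
    r = requests.get("http://www.baidu.com/s", params=kv, timeout=10)
    print(r.request.url)  # the URL that was actually requested
    r.raise_for_status()
    print(len(r.text))
except requests.RequestException:
    print("error")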
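To see how `params` turns into the query string for both engines, the URL can be built without sending the request. This is a small sketch: the `SEARCH_ENGINES` table and the function name `build_search_url` are hypothetical, while the endpoints and parameter names come from the interfaces listed above.

```python
import requests

# Keyword interfaces from the notes: (endpoint, query-parameter name).
SEARCH_ENGINES = {
    "baidu": ("http://www.baidu.com/s", "wd"),
    "360": ("http://www.so.com/s", "q"),
}

def build_search_url(engine, keyword):
    """Return the URL requests would send for this search, without sending it."""
    base, param = SEARCH_ENGINES[engine]
    prepared = requests.Request("GET", base, params={param: keyword}).prepare()
    return prepared.url

print(build_search_url("baidu", "python"))
print(build_search_url("360", "python"))
```

Letting requests encode the parameters also takes care of percent-encoding keywords that contain spaces or non-ASCII characters.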

4. Downloading an image; image URLs have the form:
http://www.example.com/picture.jpg

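import os
import requests

url = "http://www.example.com/picture.jpg"  # replace with the image URL
root = "D://"                               # replace with the save directory
path = root + url.split('/')[-1]            # file name = last segment of the URL
try:
    if not os.path.exists(root):
        os.mkdir(root)
    if not os.path.exists(path):
        r = requests.get(url, timeout=10)
        r.raise_for_status()                # do not save an error page as an image
        with open(path, 'wb') as f:         # the with block closes the file itself
            f.write(r.content)
        print("File saved")
    else:
        print("File already exists")
except (requests.RequestException, OSError):
    print("error")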
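Taking the last `/`-separated segment as the file name works for URLs like the one above, but it would also pick up a query string or give an empty name for a URL ending in `/`. A slightly more defensive sketch using only the standard library follows. The function name `filename_from_url` and the fallback name `download.bin` are assumptions for illustration.

```python
import os
from urllib.parse import urlparse, unquote

def filename_from_url(url, default="download.bin"):
    """Derive a local file name from the path part of a URL.

    The query string and fragment are ignored; if the path has no final
    segment, the (hypothetical) default name is returned instead.
    """
    path = urlparse(url).path            # e.g. "/picture.jpg"
    name = os.path.basename(unquote(path))
    return name or default

print(filename_from_url("http://www.example.com/picture.jpg"))
print(filename_from_url("http://www.example.com/picture.jpg?size=large"))
print(filename_from_url("http://www.example.com/"))
```

The result can then be combined with the save directory via `os.path.join(root, filename_from_url(url))`.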

5. Looking up an IP address

import requests

url = "http://m.ip138.com/ip.asp?ip="
try:
    r = requests.get(url + '202.24.80.112', timeout=10)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    print(r.text[-500:])  # show the last 500 characters of the page
except requests.RequestException:
    print("error")
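Because the address is concatenated directly onto the URL, a malformed value would only be noticed after the request is made. One option is to validate it first with the standard-library `ipaddress` module. This is a sketch under assumptions: the function name `build_ip_query_url` is hypothetical, and whether the endpoint accepts IPv6 addresses is not stated in the notes.

```python
import ipaddress

IP138_URL = "http://m.ip138.com/ip.asp?ip="  # endpoint from the notes

def build_ip_query_url(ip):
    """Validate ip and return the lookup URL.

    Raises ValueError if ip is not a well-formed IPv4 or IPv6 address.
    """
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return IP138_URL + str(address)

print(build_ip_query_url("202.24.80.112"))
```

The returned URL can be passed to `requests.get` exactly as in the example above.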

Author: 卡小葵

Reposted from blog.csdn.net/kaxiaokui/article/details/104315267
