Web Crawler Study Notes

1. Basics

import requests
from bs4 import BeautifulSoup

res = requests.get('http://www.baidu.com')
# The parser name is an argument to BeautifulSoup, not to requests.get
soup = BeautifulSoup(res.text, 'html.parser')
# Select by class ('.name'); select() returns a list, [0] takes the first match
soup.select('.time-source')[0].text
# Select by id ('#name')
soup.select('#artibodytitle')[0].text
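
The class and id selectors above can be tried without any network access. The following is a minimal sketch, assuming a small inline HTML string in place of a real page; the markup, its sample values, and the helper name first_text are made up for illustration. It also guards against the IndexError that [0] raises when a selector matches nothing.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (res.text)
html = """
<html><body>
  <h1 id="artibodytitle">Sample headline</h1>
  <span class="time-source">2018 06 15 09:30</span>
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')


def first_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    matches = soup.select(selector)
    return matches[0].text.strip() if matches else None


title = first_text(soup, '#artibodytitle')    # select by id
timesource = first_text(soup, '.time-source')  # select by class
missing = first_text(soup, '#no-such-id')      # no match: None instead of IndexError
```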

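# Time: parse the timestamp string, then reformat it
from datetime import datetime
# timesource is the timestamp text scraped from the page; the value here is only a placeholder
timesource = '2018 06 15 09:30'
dt = datetime.strptime(timesource, '%Y %m %d %H:%M')  # the format must match the text exactly
dt.strftime('%Y-%m-%d')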
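
Scraped timestamps often carry stray whitespace or do not match the expected format, in which case strptime raises ValueError. A minimal sketch of a tolerant wrapper follows; the function name normalize_date and the sample strings are assumptions for illustration, and the default formats are the ones used above.

```python
from datetime import datetime


def normalize_date(raw, in_fmt='%Y %m %d %H:%M', out_fmt='%Y-%m-%d'):
    """Parse raw with in_fmt and reformat it with out_fmt; return None if it does not match."""
    try:
        return datetime.strptime(raw.strip(), in_fmt).strftime(out_fmt)
    except ValueError:
        return None


sample = normalize_date(' 2018 06 15 09:30 ')  # surrounding whitespace is stripped first
bad = normalize_date('not a date')             # unparseable input gives None
```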

# select() returns a list, so slicing works: [:-1] drops the last match
soup.select('#id')[:-1]

# Join a list of strings into one string separated by spaces
article=[]
' '.join(article)
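
The slice and the join are typically used together to assemble an article body from its paragraphs. Below is a minimal sketch, assuming the paragraph texts have already been extracted into a list of strings; the sample strings are invented, and the last one stands in for a trailing line (such as an editor credit) that should be dropped.

```python
# Hypothetical paragraph texts, as they might come from each matched element's .text
paragraphs = ['  First paragraph. ', 'Second paragraph.\n', '', 'Editor: someone']

# Drop the last element with [:-1], strip whitespace, and skip empty strings
article = [p.strip() for p in paragraphs[:-1] if p.strip()]

# Merge the remaining pieces into a single string separated by spaces
text = ' '.join(article)
```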

Reposted from blog.csdn.net/myxuan475/article/details/80710346