Python 3 Web Crawler: Environment Setup

This post covers:

  Installing Python, MongoDB, Redis, MySQL, and the libraries commonly used for Python crawlers; the GUI tools include Robo 3T, a Redis GUI client, and Navicat for MySQL

Python:

  I have two versions on my machine, Python 3.5 and 3.7. My editor is Sublime Text 3.
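With two Python versions installed side by side, it is easy for pip3 and the editor to point at different interpreters. The sketch below is only one way to check which interpreter is running a script; the function name interpreter_summary is made up for illustration.

```python
import sys


def interpreter_summary():
    """Return the version and path of the interpreter running this script."""
    v = sys.version_info
    return {
        # str.format is used instead of f-strings so this also runs on Python 3.5
        "version": "{}.{}.{}".format(v.major, v.minor, v.micro),
        "executable": sys.executable,
    }


info = interpreter_summary()
print("Python", info["version"], "at", info["executable"])
```

Running the printed executable with "-m pip install lxml" installs into that same interpreter, which avoids guessing which version pip3 belongs to.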

  Libraries to install with pip:

pip3 install lxml

from bs4 import BeautifulSoup
soup = BeautifulSoup("<html></html>","lxml")

   For details on what it can do, see: https://www.cnblogs.com/zhangxinqi/p/9210211.html#_label6

pip3 install pyquery

from pyquery import PyQuery as pq
doc = pq("<html>Hello</html>")
result = doc("html").text()
print(result)  # Hello

   For details on what it can do, see: https://www.cnblogs.com/zhaof/p/6935473.html

pip3 install pymysql

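import pymysql

# Connection settings are the author's local values; adjust them to your setup.
conn = pymysql.connect(host='localhost', user='root', password='123456', port=3306, database='cookbook')
cursor = conn.cursor()
cursor.execute('select * from color')
print(cursor.fetchall())  # every row of the color table
cursor.close()
conn.close()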

   pymysql is a third-party library for working with MySQL databases from Python.
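Since pymysql follows the Python DB-API, the connect / cursor / execute / fetchall pattern can be wrapped in a small helper that always closes the cursor. This is a minimal sketch, not part of pymysql itself: the name fetch_rows is hypothetical, and it works with any DB-API connection object.

```python
def fetch_rows(conn, sql, params=None):
    """Run one query on a DB-API connection and return the rows as a list."""
    cursor = conn.cursor()
    try:
        if params is None:
            cursor.execute(sql)
        else:
            # Passing values separately lets the driver escape them safely.
            cursor.execute(sql, params)
        return list(cursor.fetchall())
    finally:
        cursor.close()


# Usage with pymysql (needs a running MySQL server, so left as comments;
# the column name "name" is an assumption about the color table):
#   conn = pymysql.connect(host='localhost', user='root', password='123456',
#                          port=3306, database='cookbook')
#   rows = fetch_rows(conn, 'select * from color where name = %s', ('red',))
```

Note that pymysql uses %s as its placeholder, while other drivers may use a different marker.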

pip3 install pymongo

import pymongo

client = pymongo.MongoClient('localhost')
db = client['newtestdb']
# Collection.insert() was removed in PyMongo 4; insert_one() is the current call.
db['table'].insert_one({'name': 'Bob'})
print(db['table'].find_one({'name': 'Bob'}))

pip3 install redis

import redis

r = redis.Redis(host='localhost', port=6379)
r.set('name', 'Bob')
print(r.get('name'))  # b'Bob' (bytes, unless decode_responses=True is passed to Redis)

Differences between MongoDB, Redis, and MySQL: https://www.cnblogs.com/noah0532/p/10943120.html

 

pip3 install beautifulsoup4

  For details on what it can do, see: https://www.cnblogs.com/hanmk/p/8724162.html

pip3 install flask

pip3 install django

  A comparison of Flask and Django features: https://www.cnblogs.com/crss/p/8532950.html

pip3 install jupyter

Jupyter is a handy browser-based editor for writing and running Python interactively.
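Once the pip3 install commands above have run, a quick sanity check is to ask Python whether each library can be found. The sketch below uses only the standard library; the PACKAGES mapping and the check_installed name are illustrative, and using jupyter_core as the module to look for is an assumption about what the jupyter package pulls in.

```python
import importlib.util

# pip package name -> module name used in "import" (the two sometimes differ)
PACKAGES = {
    "lxml": "lxml",
    "beautifulsoup4": "bs4",
    "pyquery": "pyquery",
    "pymysql": "pymysql",
    "pymongo": "pymongo",
    "redis": "redis",
    "flask": "flask",
    "django": "django",
    "jupyter": "jupyter_core",  # assumption: installed as a dependency of jupyter
}


def check_installed(packages):
    """Map each pip name to True if its module can be found, else False."""
    return {
        pip_name: importlib.util.find_spec(module) is not None
        for pip_name, module in packages.items()
    }


for pip_name, ok in sorted(check_installed(PACKAGES).items()):
    print("{:<16}{}".format(pip_name, "ok" if ok else "missing"))
```

Anything reported as missing was probably installed into the other Python version, or not installed at all.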

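How do I open an .ipynb file?

  In cmd, change to the folder that contains the .ipynb file, then run jupyter notebook. A browser page opens where you can view the notebook.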
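An .ipynb file is plain JSON, so its code can also be inspected without starting the notebook server. The following is a rough sketch using only the standard library; list_code_cells is a made-up helper name, and it assumes the common notebook layout with a top-level "cells" list.

```python
import json


def list_code_cells(path):
    """Return the source text of every code cell in the notebook at path."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)

    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        src = cell.get("source", "")
        # The source field may be a single string or a list of lines.
        if isinstance(src, list):
            src = "".join(src)
        sources.append(src)
    return sources
```

Calling list_code_cells on a notebook path returns one string per code cell, which can then be printed or searched.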

How do I use Selenium + Chrome?

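from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--headless')  # run Chrome without a visible window
chrome_options.add_argument('--disable-gpu')
# Pass the options with options=; the older chrome_options= keyword is deprecated
# and has been removed in newer Selenium 4 releases.
driver = webdriver.Chrome(options=chrome_options)
driver.get("http://www.baidu.com")
print(driver.page_source)
driver.quit()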
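The page_source value is just an HTML string, so the usual next step in a crawler is to hand it to a parser. As a rough illustration that runs without a browser, the sketch below pulls link targets out of a fixed sample string with the standard library's html.parser; LinkCollector, extract_links, and sample_html are names invented here. With the libraries installed earlier, BeautifulSoup or pyquery would do the same job.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Stand-in for driver.page_source so the example runs on its own.
sample_html = '<html><body><a href="/a">A</a><a href="https://example.com/b">B</a></body></html>'
print(extract_links(sample_html))  # ['/a', 'https://example.com/b']
```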


Reposted from www.cnblogs.com/HannahGreen/p/11929209.html