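After installing Python 3 on Windows 7, set the environment variables.
1. Install visualcppbuildtools_full.exe first (I keep a copy in my Huawei Cloud storage). Without it, some build tools are missing and the installation will fail with errors.
2. pip install scrapy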
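To confirm that the install step worked before going further, a quick check like the following may help. This is only a minimal sketch: the helper name is_installed is my own and is not part of Scrapy or pip. It only asks Python whether a module can be found on the import path.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if module_name can be located on the current import path."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("scrapy"):
        print("scrapy is importable")
    else:
        print("scrapy not found; re-check the pip install step")
```

If the check fails even though pip reported success, pip may belong to a different Python installation than the one on PATH.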
cmd
d:
scrapy startproject mySpider
D:\>tree /f mySpider
Folder PATH listing for volume New Volume
Volume serial number is 4647-06DA
D:\MYSPIDER
│  scrapy.cfg
│
└─mySpider
    │  items.py
    │  middlewares.py
    │  pipelines.py
    │  settings.py
    │  __init__.py
    │
    ├─spiders
    │  │  __init__.py
    │  │
    │  └─__pycache__
    └─__pycache__
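The same listing can also be produced from Python, which is handy on machines where tree is not available. This is a rough sketch, assuming the project folder is named mySpider and sits in the current directory. The function name list_files is hypothetical.

```python
from pathlib import Path


def list_files(root):
    """Return the relative paths of all files under root, sorted, with forward slashes."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    project = Path("mySpider")  # assumed location of the generated project
    if project.is_dir():
        for rel in list_files(project):
            print(rel)
```

Unlike tree /f, this prints a flat list of files and leaves out empty folders.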
cd mySpider
scrapy genspider stockInfo "itcast.cn"
stockInfo is the name of the spider.
itcast.cn is the value used for allowed_domains.
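To make the role of allowed_domains more concrete: requests to hosts outside the listed domains are normally filtered out, while the domain itself and its subdomains are allowed. The sketch below only illustrates that idea with the standard library. It is not Scrapy's own implementation, and the function name is_allowed is hypothetical.

```python
from typing import Iterable
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: Iterable[str]) -> bool:
    """Return True if the URL's host is an allowed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Exact match, or a subdomain such as www.itcast.cn for itcast.cn.
        if host == domain or host.endswith("." + domain):
            return True
    return False


if __name__ == "__main__":
    for url in ("http://itcast.cn/", "http://www.itcast.cn/a", "http://example.com/"):
        print(url, is_allowed(url, ["itcast.cn"]))
```

Note that the dot in the suffix check matters: a host such as notitcast.cn ends with "itcast.cn" but is a different domain, so it is rejected.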
D:\mySpider>tree /f
Folder PATH listing for volume New Volume
Volume serial number is 4647-06DA
D:.
│  scrapy.cfg
│
└─mySpider
    │  items.py
    │  middlewares.py
    │  pipelines.py
    │  settings.py
    │  __init__.py
    │
    ├─spiders
    │  │  stockInfo.py
    │  │  __init__.py
    │  │
    │  └─__pycache__
    │          __init__.cpython-37.pyc
    │
    └─__pycache__
            settings.cpython-37.pyc
            __init__.cpython-37.pyc
Note: a file named stockInfo.py was generated automatically. Its contents:
import scrapy


class StockinfoSpider(scrapy.Spider):
    name = 'stockInfo'
    allowed_domains = ['itcast.cn']
    start_urls = ['http://itcast.cn/']

    def parse(self, response):
        pass