Installation
pip install scrapy
Create a crawler project
scrapy startproject <project_name>
scrapy startproject itcast
Generate a spider
scrapy genspider <spider_name> "<allowed_domain>"
scrapy genspider itcast "itcast.cn"
The spider file is generated in the project's spiders directory
Edit itcast.py
# -*- coding: utf-8 -*-
import scrapy


class ItcastSpider(scrapy.Spider):
    name = "itcast"
    allowed_domains = ["itcast.cn"]
    start_urls = (
        'http://www.itcast.cn/channel/teacher.shtml',
    )

    def parse(self, response):
        # print(response)
        data_list = response.xpath("//div[@class='tea_con']//h3/text()").extract()
        # extract() returns a list of strings; if it is not called,
        # xpath() returns a list of selectors instead
        print(data_list)
        # The list prints garbled, e.g. u'\u5218...'. Adding
        # FEED_EXPORT_ENCODING = 'utf-8' to settings.py or leaving it out
        # makes no difference; the reason is unknown.
        for i in data_list:
            print(i)  # each item printed here shows up as Chinese
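The two print calls in parse() differ in what they print: the first prints the list itself, which shows the repr of each element, while the loop prints the text of each element. The following is a minimal sketch of that difference, not Scrapy code. It assumes Python 3, where the built-in ascii() is used to imitate the escaped form, and the teacher names are made-up placeholders rather than data from the site.

```python
# Hypothetical sample data standing in for the extracted <h3> text.
data_list = ["刘老师", "王老师"]

# ascii() returns the repr of the list with non-ASCII characters escaped,
# similar to the u'\u5218...' form noted in the spider's comment.
escaped = ascii(data_list)
print(escaped)

# Joining (or printing) the elements themselves yields the readable text.
readable = "\n".join(data_list)
print(readable)
```

If the escaped form is all that is needed for debugging, printing the list is fine; for readable output, print the elements one by one as the loop in parse() does.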
The garbled output appears because the Ubuntu terminal has no Chinese language pack installed
Install the Chinese language pack
apt-get install language-pack-zh
Modify /etc/environment
sudo gedit /etc/environment
Add the LANG and LANGUAGE lines below the existing PATH line
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"
LANG="zh_CN.UTF-8"
LANGUAGE="zh_CN:zh:en_US:en"
The second line (LANG) sets the default Chinese character encoding. Note: you can change the default Chinese encoding here, for example to zh_CN.GBK
Modify the /var/lib/locales/supported.d/local file
sudo gedit /var/lib/locales/supported.d/local
Add the following lines
zh_CN.UTF-8 UTF-8
en_US.UTF-8 UTF-8
Once saved, execute the command
sudo locale-gen
Restart
sudo reboot
After the reboot the garbled output is gone and Chinese is displayed correctly
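To confirm that the locale change took effect, it can help to look at what Python itself sees. The sketch below uses only the standard library; encoding_report is a hypothetical helper name introduced here for illustration, not part of Scrapy.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the locale and encoding settings Python currently sees."""
    return {
        # Environment variables such as those set in /etc/environment
        # (None if they are not set in this session).
        "LANG": os.environ.get("LANG"),
        "LANGUAGE": os.environ.get("LANGUAGE"),
        # Encoding Python derives from the locale for text I/O.
        "preferred_encoding": locale.getpreferredencoding(),
        # Encoding used when print() writes to the terminal, if known.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

On a system configured as described above, LANG would be expected to read zh_CN.UTF-8 and the encodings to report UTF-8; if they do not, the new settings have probably not been picked up by the current session.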
The terminal also prints a lot of other data (Scrapy's log output)
Configure the log level in settings.py to reduce it
LOG_LEVEL = "WARNING"
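Scrapy's logging is built on Python's standard logging module, so LOG_LEVEL works as a severity threshold: messages below WARNING are dropped. The following is a standard-library sketch of that threshold, not Scrapy itself; the logger name and the messages are made up for illustration.

```python
import io
import logging

# Capture log output in memory so the effect of the level is easy to inspect.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("demo_spider")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False          # keep the demo output out of the root logger
logger.setLevel(logging.WARNING)  # same threshold as LOG_LEVEL = "WARNING"

logger.debug("debug detail")      # below the threshold, dropped
logger.info("info detail")        # below the threshold, dropped
logger.warning("warning detail")  # at the threshold, kept
logger.error("error detail")      # above the threshold, kept

output = buffer.getvalue()
print(output)
```

Only the WARNING and ERROR lines reach the handler, which is why the terminal becomes much quieter once the level is raised.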