Crawler: pyspider environment setup

1、Set up in SH

 1.1、It seems the Python version can be switched the same way as the JDK: the PATH environment variable setting decides which version of Python gets used
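
A minimal sketch for checking which Python a new console would pick up, assuming only the standard library. The helper name first_on_path is made up for this illustration; shutil.which searches the PATH entries in order and returns the first match.

```python
import shutil
import sys


def first_on_path(command, path=None):
    """Return the full path of the first match for `command` on PATH, or None.

    `path` may be a PATH-style string to search instead of the real PATH.
    """
    return shutil.which(command, path=path)


if __name__ == "__main__":
    # The interpreter running this script (not necessarily the first on PATH).
    print("running interpreter:", sys.executable)
    # What typing "python" or "pip" in a console would resolve to.
    print("python found first on PATH:", first_on_path("python"))
    print("pip found first on PATH:", first_on_path("pip"))
```

If the two printed Python paths differ, the console is resolving a different installation than the one running the script, which is the situation the PATH edits in these notes are meant to control.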

2、Running log:

 2.1、At the start, with Python37 x64 installed, the PATH environment variable looked like this:

C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\libnvvp;C:\Program Files\Python37\Scripts\;C:\Program Files\Python37\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\dotnet\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\;C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2019.3.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;G:\NVidia\cuda_win7\bin;D:\Program Files (x86)\MATLAB\R2015b\runtime\win32;D:\Program Files (x86)\MATLAB\R2015b\bin;D:\Program Files (x86)\MATLAB\R2015b\polyspace\bin;C:\Program Files (x86)\MATLAB\MATLAB Runtime\v90\runtime\win32;D:\OpenCV_something\opencv-3.4.6-vc14_vc15\build\x86_zz\release\vc14\bin;D:\Program Files\nodejs\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\

  Split into one entry per line, it looks like this:

C:\Program Files (x86)\Common Files\Oracle\Java\javapath
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\libnvvp
C:\Program Files\Python37\Scripts\
C:\Program Files\Python37\
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files\dotnet\
C:\Program Files\Microsoft SQL Server\130\Tools\Binn\
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\
C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\
C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\
C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\
C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\
C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\
C:\Program Files\Git\cmd
C:\Program Files\NVIDIA Corporation\Nsight Compute 2019.3.0\
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
G:\NVidia\cuda_win7\bin
D:\Program Files (x86)\MATLAB\R2015b\runtime\win32
D:\Program Files (x86)\MATLAB\R2015b\bin
D:\Program Files (x86)\MATLAB\R2015b\polyspace\bin
C:\Program Files (x86)\MATLAB\MATLAB Runtime\v90\runtime\win32
D:\OpenCV_something\opencv-3.4.6-vc14_vc15\build\x86_zz\release\vc14\bin
D:\Program Files\nodejs\
C:\Program Files\Microsoft SQL Server\120\Tools\Binn\

  ZC: Only two entries relate to Python: "C:\Program Files\Python37\Scripts\" and "C:\Program Files\Python37\"

  With those two entries removed, the environment variable becomes:

C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\libnvvp;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\dotnet\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\;C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2019.3.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;G:\NVidia\cuda_win7\bin;D:\Program Files (x86)\MATLAB\R2015b\runtime\win32;D:\Program Files (x86)\MATLAB\R2015b\bin;D:\Program Files (x86)\MATLAB\R2015b\polyspace\bin;C:\Program Files (x86)\MATLAB\MATLAB Runtime\v90\runtime\win32;D:\OpenCV_something\opencv-3.4.6-vc14_vc15\build\x86_zz\release\vc14\bin;D:\Program Files\nodejs\;C:\Program Files\Microsoft SQL Server\120\Tools\Binn\;
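
The splitting and filtering done by hand above can be sketched as follows. This is an illustration only: the function names are made up, the sample value is a shortened PATH, and matching on the substring "python" is a rough heuristic that happens to fit the entries listed in these notes.

```python
def split_path(path_value, sep=";"):
    """Split a PATH-style string into entries, dropping empty ones."""
    return [entry for entry in path_value.split(sep) if entry]


def python_entries(path_value, sep=";"):
    """Return the entries that look Python-related (heuristic substring match)."""
    return [e for e in split_path(path_value, sep) if "python" in e.lower()]


def without_python(path_value, sep=";"):
    """Return the PATH string with the Python-related entries removed."""
    kept = [e for e in split_path(path_value, sep) if "python" not in e.lower()]
    return sep.join(kept)


if __name__ == "__main__":
    # Shortened sample; a real run could pass os.environ["PATH"] instead.
    sample = (
        r"C:\Program Files\Python37\Scripts\;"
        r"C:\Program Files\Python37\;"
        r"C:\Windows\system32;C:\Windows"
    )
    for entry in split_path(sample):
        print(entry)
    print("Python-related:", python_entries(sample))
    print("Without Python:", without_python(sample))
```

The result is only a string; it still has to be written back through the system environment variable dialog (or "set path=..." for the current console) to take effect.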

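 Then install python-3.5.4.exe (this is the 32-bit installer)
 After python-3.5.4.exe finished installing, it seems Path was not updated automatically, and pycurl was not included either...

 Add the entries manually in CMD: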

set path=%path%;"D:\Python\Python35-32\Scripts\";"D:\Python\Python35-32\"

 PS: the whl files below were all downloaded from https://www.lfd.uci.edu/~gohlke/pythonlibs/#pycurl

 Installing pycurl manually:
  pip install pycurl
  Error: Command "python setup.py egg_info" failed with error code 10 in C:\Users\ADMINI~1\AppData\Local\Temp\pip-build-mpbd0qzy\pycurl\
  A search suggested installing two other things first, but those would not install either. I then downloaded pycurl-7.43.0.3-cp27-cp27m-win32.whl (from https://www.lfd.uci.edu/~gohlke/pythonlibs/#pycurl) and
  followed this article (pip安装pycurl报错: Complete output from command python setup.py egg_info_ Please specify --curl-dir=_path_to_built_libcurl - 血染&征袍 - 博客园.html [https://www.cnblogs.com/xueranzp/p/5010656.html]):
  pip install wheel
  pip install D:\IDE\Python\pycurl-7.43.0.3-cp27-cp27m-win32.whl
  Error: "pycurl-7.43.0.3-cp27-cp27m-win32.whl is not a supported wheel on this platform." A search showed the wheel has to match the Python version. Mine is Python 3.5.4 (32-bit), so the matching file is pycurl-7.43.0.3-cp35-cp35m-win32.whl (cp35 for CPython 3.5, win32 for the 32-bit build)
  pip install D:\IDE\Python\pycurl-7.43.0.3-cp35-cp35m-win32.whl
  pip install pycurl
  ZC: pycurl is installed
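
The "not a supported wheel on this platform" mismatch can be spotted before running pip by reading the tags in the wheel filename. The sketch below is a simplified illustration with made-up helper names: it only compares the Python tag (cp27, cp35, py2.py3, ...) against the running interpreter and does not check the ABI or platform tags (win32 vs win_amd64), which pip also takes into account.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag).

    Wheel filenames end in -<python tag>-<abi tag>-<platform tag>.whl,
    so the last three dash-separated fields are the tags.
    """
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return tuple(stem.split("-")[-3:])


def current_cpython_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp35'."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def python_tag_matches(filename):
    """Rough check: does the wheel's Python tag fit this interpreter?"""
    # A tag such as "py2.py3" is a dot-separated set of alternatives.
    candidates = wheel_tags(filename)[0].split(".")
    generic = "py{}".format(sys.version_info.major)
    return current_cpython_tag() in candidates or generic in candidates


if __name__ == "__main__":
    for name in (
        "pycurl-7.43.0.3-cp27-cp27m-win32.whl",
        "pycurl-7.43.0.3-cp35-cp35m-win32.whl",
    ):
        print(name, wheel_tags(name), python_tag_matches(name))
```

Run with the Python 3.5.4 interpreter from these notes, the cp27 file would be rejected by this check and the cp35 file accepted, which matches what pip reported.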

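 PS: for the whl files below I only ran "pip install xxxxx.whl" and did not follow up with "pip install xxxxx" (unlike pycurl, where "pip install pycurl" was run once more at the end)

 pip install pyspider  (reference: windows 下安装pyspider - 幽篁晓筑 - 博客园.html, https://www.cnblogs.com/woods1815/p/9637856.html)
  It would not install because some dependencies failed to download, so again I went to https://www.lfd.uci.edu/~gohlke/pythonlibs/#pycurl to download them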

Collecting pyspider
Using cached https://files.pythonhosted.org/packages/d0/97/d6062c928f53d899ff2a8538fed11d4d425ba3d
27c96248a2c601c1c9fef/pyspider-0.3.10.tar.gz
Collecting Flask>=0.10 (from pyspider)
Downloading https://files.pythonhosted.org/packages/9b/93/628509b8d5dc749656a9641f4caf13540e2cdec8
5276964ff8f43bbb1d3b/Flask-1.1.1-py2.py3-none-any.whl (94kB)
86% |███████████████████████████▊ | 81kB 5.4kB/s eta 0:00:03Excep
C:\Users\Administrator>pip install D:\IDE\Python\Flask-1.1.1-py2.py3-none-any.whl
Processing d:\ide\python\flask-1.1.1-py2.py3-none-any.whl
Collecting Werkzeug>=0.15 (from Flask==1.1.1)
Cache entry deserialization failed, entry ignored
Downloading https://files.pythonhosted.org/packages/ce/42/3aeda98f96e85fd26180534d36570e4d18108d62
ae36f87694b476b83d6f/Werkzeug-0.16.0-py2.py3-none-any.whl (327kB)
37% |████████████ | 122kB 3.7kB/s eta 0:00:56Exception:

  pip install D:\IDE\Python\Werkzeug-0.16.0-py2.py3-none-any.whl
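  Then work back up the dependency chain (install each package whose download failed from its local whl, then retry the package that needed it) until "pip install pyspider" goes through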

C:\Users\Administrator>pip install pyspider
Collecting pyspider
Using cached https://files.pythonhosted.org/packages/d0/97/d6062c928f53d899ff2a8538fed11d4d425ba3d
27c96248a2c601c1c9fef/pyspider-0.3.10.tar.gz
Requirement already satisfied: Flask>=0.10 in d:\python\python35-32\lib\site-packages (from pyspider
)
Requirement already satisfied: Jinja2>=2.7 in d:\python\python35-32\lib\site-packages (from pyspider
)
Collecting chardet>=2.2 (from pyspider)
Cache entry deserialization failed, entry ignored
Cache entry deserialization failed, entry ignored
Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec751
0b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
7% |██▌ | 10kB 2.0kB/s eta 0:01:03Exception:

  pip install D:\IDE\Python\chardet-3.0.4-py2.py3-none-any.whl
  pip install D:\IDE\Python\cssselect-1.1.0-py2.py3-none-any.whl
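
The install order worked out above (dependencies first, then the package that needs them) can be written down as a small script. This is a sketch under assumptions: the wheel list contains only the four files named in these notes (other dependencies may need the same treatment), the directory is the D:\IDE\Python folder used above, and pip_commands is a made-up helper that only builds the command lines.

```python
import ntpath
import sys

WHEEL_DIR = r"D:\IDE\Python"  # download folder used in these notes

# Dependencies before dependents: Werkzeug is required by Flask.
WHEELS_IN_ORDER = [
    "Werkzeug-0.16.0-py2.py3-none-any.whl",
    "Flask-1.1.1-py2.py3-none-any.whl",
    "chardet-3.0.4-py2.py3-none-any.whl",
    "cssselect-1.1.0-py2.py3-none-any.whl",
]


def pip_commands(wheel_dir, wheels, final_package="pyspider"):
    """Build one 'pip install' command per local wheel, then one for the package."""
    commands = []
    for name in wheels:
        # ntpath joins with Windows separators regardless of the host OS.
        wheel_path = ntpath.join(wheel_dir, name)
        commands.append([sys.executable, "-m", "pip", "install", wheel_path])
    commands.append([sys.executable, "-m", "pip", "install", final_package])
    return commands


if __name__ == "__main__":
    for command in pip_commands(WHEEL_DIR, WHEELS_IN_ORDER):
        # Printed only; pass each list to subprocess.check_call to actually run it.
        print(" ".join(command))
```

Using "python -m pip" through sys.executable ties each install to the interpreter running the script, which avoids picking up a pip from another Python that may still be on PATH.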

 pyspider all
  This gives a warning: [W 200108 13:46:23 run:413] phantomjs not found, continue running without it.
  and an error:

ValueError: Invalid configuration:
- Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead.

  Unzipping phantomjs-2.1.1-windows.zip yields phantomjs.exe; copy phantomjs.exe into the Python root directory (here, "D:\Python\Python35-32\")

  With phantomjs.exe in place, the error still appears:

Deprecated option 'domaincontroller': use 'http_authenticator.domain_controller' instead

  Fix: (ValueError_ Invalid configuration_ - Deprecated option domaincontroller_ use http_authenticator_qq_37253540的博客-CSDN博客.html [https://blog.csdn.net/qq_37253540/article/details/88196994])

  After installing the pyspider crawler framework, running the "pyspider all" command and opening http://localhost:5000 produces the error above
  The cause is that WsgiDAV released a 3.x pre-release version.
  The fix is as follows:
   Locate the installed pyspider package, open the webdav.py file in its webui folder, and modify line 209.
   Change
   'domaincontroller': NeedAuthController(app),
   to:
   'http_authenticator':{
    'HTTPAuthenticator':NeedAuthController(app),
   },
  After that, run "pyspider all" again and the page opens at http://localhost:5000.
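
The same one-line edit can be applied to the file's text programmatically. The sketch below is a hypothetical helper, not part of pyspider: it works on a string, replaces the deprecated line with the nested form given above while keeping its indentation, and leaves everything else untouched. The sample text is invented for the demonstration; the real webdav.py will look different around that line.

```python
OLD_LINE = "'domaincontroller': NeedAuthController(app),"
NEW_LINES = [
    "'http_authenticator': {",
    "    'HTTPAuthenticator': NeedAuthController(app),",
    "},",
]


def patch_webdav_source(source):
    """Return `source` with the deprecated option line replaced.

    The replacement keeps the indentation of the original line.
    Text without the deprecated line is returned unchanged.
    """
    out = []
    for line in source.splitlines():
        if line.strip() == OLD_LINE:
            indent = line[: len(line) - len(line.lstrip())]
            out.extend(indent + new for new in NEW_LINES)
        else:
            out.append(line)
    patched = "\n".join(out)
    # splitlines() drops a final newline; restore it if the input had one.
    if source.endswith("\n"):
        patched += "\n"
    return patched


if __name__ == "__main__":
    # Invented sample text, only to show the effect of the replacement.
    sample = "config = {\n    'domaincontroller': NeedAuthController(app),\n}\n"
    print(patch_webdav_source(sample))
```

To use it on the real file, one would read webdav.py, pass its contents through patch_webdav_source, and write the result back, keeping a backup copy first.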

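 ZC: Whether "pyspider all" comes up seems to depend entirely on luck? The first few attempts did not start, so I force-killed the processes python.exe, pyspider.exe and phantomjs.exe (not sure whether any other processes also need to be killed by hand...). Retrying after that still did not seem to work, although no error was reported; after a few more rounds of killing the processes and retrying, it was OK again... (the pyspider "Dashboard" page came up as well)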

3、

4、

5、


Reposted from www.cnblogs.com/pythonzc/p/12166546.html