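On the second day of summer vacation I started down the web-scraping road, and it turned out there were a lot of environments to set up. Things went smoothly at first, but today I somehow spent more than four hours getting the tesserocr library installed. The bumps along the way are almost too painful to recount, so enough chatter, let's get straight to the useful part.
Of course, before installing tesserocr you have to install tesseract. That part is simple, so I won't go into detail here.
My first installation attempt produced the following error: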
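Before moving on, it can be worth confirming that tesseract is actually reachable from the command line. The snippet below is only a rough sketch: it assumes the tesseract executable has been added to PATH, and the helper name `find_tesseract` is made up for illustration.

```python
import shutil
import subprocess


def find_tesseract(executable="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; install it or add its folder to PATH")
else:
    # Some tesseract releases print the version banner to stderr,
    # so stderr is merged into stdout here.
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    print(result.stdout)
```

If this prints a version banner, the tesseract side is probably fine and any remaining trouble lies with the Python package.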
Exception:
Traceback (most recent call last):
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\compat\__init__.py", line 73, in console_to_str
return s.decode(sys.__stdout__.encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 70: invalid continuation byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\commands\install.py", line 335, in run
wb.build(autobuilding=True)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\req\req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\req\req_set.py", line 634, in _prepare_file
abstract_dist.prep_for_dist()
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\req\req_set.py", line 129, in prep_for_dist
self.req_to_install.run_egg_info()
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\req\req_install.py", line 439, in run_egg_info
command_desc='python setup.py egg_info')
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\utils\__init__.py", line 676, in call_subprocess
line = console_to_str(proc.stdout.readline())
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\compat\__init__.py", line 75, in console_to_str
return s.decode('utf_8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 70: invalid continuation byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\basecommand.py", line 215, in main
status = self.run(options, args)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\commands\install.py", line 385, in run
requirement_set.cleanup_files()
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\utils\build.py", line 38, in __exit__
self.cleanup()
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\utils\build.py", line 42, in cleanup
rmtree(self.name)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\_vendor\retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\_vendor\retrying.py", line 212, in call
raise attempt.get()
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\_vendor\retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\_vendor\six.py", line 686, in reraise
raise value
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\_vendor\retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\utils\__init__.py", line 102, in rmtree
onerror=rmtree_errorhandler)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\shutil.py", line 494, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\shutil.py", line 384, in _rmtree_unsafe
_rmtree_unsafe(fullname, onerror)
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\shutil.py", line 393, in _rmtree_unsafe
onerror(os.rmdir, path, sys.exc_info())
File "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\utils\__init__.py", line 114, in rmtree_errorhandler
func(path)
PermissionError: [WinError 32] 另一个程序正在使用此文件,进程无法访问。: 'C:\\Users\\upup\\AppData\\Local\\Temp\\pip-build-fixlbed7\\tesserocr'
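The fix: edit the pip source code. The root cause is the UnicodeDecodeError; the final PermissionError ([WinError 32], "the file is being used by another process") is only a follow-on failure while pip cleans up its build directory.
Open "C:\Users\upup\AppData\Local\conda\conda\envs\http\lib\site-packages\pip\compat\__init__.py"
and on line 75 change return s.decode('utf_8') to return s.decode('cp936'). cp936 is the code page (GBK) used by the Simplified Chinese Windows console, which is why decoding its output as UTF-8 fails. Don't make this change in Notepad; use an editor such as EditPlus. With that problem out of the way, the next one showed up (which was really infuriating).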
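To see why swapping the codec helps, here is a small self-contained sketch that reproduces the same kind of failure outside pip. It assumes the console output is cp936-encoded, as on a Simplified Chinese Windows install; the helper `decode_console` is hypothetical and is not part of pip.

```python
# Text taken from the Windows error message in the traceback above.
message = "另一个程序正在使用此文件"

# Bytes as a cp936 (GBK) console would emit them.
raw = message.encode("cp936")


def decode_console(data, encoding):
    """Try to decode console bytes; return None when the codec does not fit."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = decode_console(raw, "utf_8")
as_cp936 = decode_console(raw, "cp936")

print("decoded as utf_8:", as_utf8 is not None)
print("decoded as cp936:", as_cp936 is not None)
```

The UTF-8 attempt fails while the cp936 attempt round-trips the text, which is the same effect the one-line change in pip relies on.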
I installed with the following command:
pip3 install tesserocr pillow
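But the installation just kept hanging.
It stayed stuck at that one spot. At first I assumed it was a slow network connection, and then I went through a lot of tutorials online, none of which helped. The end of the error output also contained a further message.
After searching on Baidu, I installed rpy2-2.9.5-cp36-cp36m-win_amd64.whl. That file installed without trouble, a single command did it, but it still didn't fix anything. So I kept looking for tutorials and found that the following command could be used: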
pip install tesseract-ocr
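Unfortunately, that failed as well.
If you hit this, don't rush to install Microsoft Visual C++ 14.0. Try a different installation route instead: first download the matching .whl file, which must pair with your tesseract version (the cp36 and win_amd64 tags in the filename must also match your Python version and architecture).
Then run: pip3 install tesserocr-2.2.2-cp36-cp36m-win_amd64.whl (I actually paid money for this file....).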
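Choosing the right .whl is mostly a matter of reading the tags in its filename. The sketch below splits the filename used above and compares it with the running interpreter. It assumes CPython, and the helper names `wheel_tags` and `interpreter_python_tag` are made up for illustration.

```python
import struct
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # The last three dash-separated fields are the python, abi and platform tags.
    return tuple(parts[-3:])


def interpreter_python_tag():
    """Tag such as 'cp36' for CPython 3.6 (assumes the interpreter is CPython)."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


python_tag, abi_tag, platform_tag = wheel_tags(
    "tesserocr-2.2.2-cp36-cp36m-win_amd64.whl"
)
bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64

print("wheel targets:", python_tag, abi_tag, platform_tag)
print("this Python  :", interpreter_python_tag(), "{}-bit".format(bits))
if python_tag != interpreter_python_tag():
    print("Python tag mismatch: pip would most likely refuse this wheel")
```

Here cp36 means CPython 3.6 and win_amd64 means 64-bit Windows, so this particular file only fits a 64-bit Python 3.6 environment.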
After running this command it suddenly just worked, hahaha.
I don't know why, and I don't dare ask.
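To double-check that the install really took, a quick import test is enough. This is a minimal sketch, assuming it is run in the same environment the wheel was installed into; `is_installed` is a hypothetical helper, while `tesseract_version()` and `get_languages()` are functions provided by tesserocr.

```python
import importlib
import importlib.util


def is_installed(module_name):
    """True when the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("tesserocr"):
    try:
        tesserocr = importlib.import_module("tesserocr")
        # Version of the tesseract library that tesserocr was built against.
        print(tesserocr.tesseract_version())
        # Tessdata path and the list of languages found there.
        print(tesserocr.get_languages())
    except ImportError as exc:
        print("tesserocr was found but could not be loaded:", exc)
else:
    print("tesserocr is not importable in this environment")
```

An empty language list usually points at a tessdata path problem rather than at the Python package itself.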