Fixing installation problems with tesserocr, a CAPTCHA recognition library

tesserocr is an OCR library for Python. It is a Python API wrapper around tesseract.

<1> Before installing tesserocr, you need to install tesseract.

The Windows installer can be downloaded from

https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.01.exe

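Other versions are available too; this is the one I chose. Builds with dev in the name are development versions, and those without it are stable releases. During installation, tick the "Additional language data" option, then wait patiently for the installer to finish.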

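After installation, add the environment variables under System in the Control Panel. Append the tesseract installation path to the PATH environment variable, and add a new environment variable, TESSDATA_PREFIX, that also points to the tesseract installation directory. The result on my system is shown in the figure.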

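To check that the installation succeeded, run tesseract -v on the command line. If it prints the version information, the installation worked.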
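The same check can be scripted. Below is a minimal sketch, using only the Python standard library, that reports whether a tesseract executable is visible on PATH, what TESSDATA_PREFIX is set to, and what tesseract -v prints. The helper name check_tesseract is my own and is not part of tesseract or tesserocr.

```python
import os
import shutil
import subprocess


def check_tesseract():
    """Return a dict describing what this process can see of the tesseract install."""
    exe = shutil.which("tesseract")  # None if the install directory is not on PATH
    report = {
        "executable": exe,
        "tessdata_prefix": os.environ.get("TESSDATA_PREFIX"),  # None if unset
        "version_output": None,
    }
    if exe is not None:
        # Some tesseract releases write the version banner to stderr,
        # so merge stderr into stdout and capture both.
        result = subprocess.run(
            [exe, "-v"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        report["version_output"] = result.stdout.strip()
    return report


if __name__ == "__main__":
    for key, value in check_tesseract().items():
        print(f"{key}: {value}")
```

If executable comes back as None, PATH was probably not updated, or the command prompt was opened before the variable was changed and needs to be reopened.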

<2> Install tesserocr

This step is a real pitfall. Running pip3 install tesserocr pillow fails with the following error:

c:\Python37\Scripts>pip install tesserocr pillow
Collecting tesserocr
  Downloading https://files.pythonhosted.org/packages/92/2d/05a7f8387e93c192919b508e4f4936f232bd3d2ca388b9130ae538a9f9ad/tesserocr-2.4.0.tar.gz (56kB)
    100% |████████████████████████████████| 61kB 208kB/s
Collecting pillow
  Downloading https://files.pythonhosted.org/packages/55/ea/305f61258278790706e69f01c53e107b0830ea5a4a69aa1f2c11fe605ed3/Pillow-5.3.0-cp37-cp37m-win_amd64.whl (1.6MB)
    100% |████████████████████████████████| 1.6MB 1.9MB/s
Building wheels for collected packages: tesserocr
  Running setup.py bdist_wheel for tesserocr ... error
  Complete output from command c:\python37\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\ZHANGM~1\\AppData\\Local\\Temp\\pip-install-aixq4ve2\\tesserocr\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d C:\Users\ZHANGM~1\AppData\Local\Temp\pip-wheel-7w7aj1_3 --python-tag cp37:
  C:\Users\ZHANGM~1\AppData\Local\Temp\pip-install-aixq4ve2\tesserocr\setup.py:134: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
    _LOGGER.warn('Failed to extract tesseract version from executable: {}'.format(e))
  Failed to extract tesseract version from executable: [WinError 2] 系统找不到指定的文件。
  Supporting tesseract v3.04.00
  Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
  c:\python37\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'long_description_content_type'
    warnings.warn(msg)
  running bdist_wheel
  running build
  running build_ext
  building 'tesserocr' extension
  error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

  ----------------------------------------
  Failed building wheel for tesserocr
  Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr, pillow
  Running setup.py install for tesserocr ... error
    Complete output from command c:\python37\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\ZHANGM~1\\AppData\\Local\\Temp\\pip-install-aixq4ve2\\tesserocr\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\ZHANGM~1\AppData\Local\Temp\pip-record-tobwu8mn\install-record.txt --single-version-externally-managed --compile:
    C:\Users\ZHANGM~1\AppData\Local\Temp\pip-install-aixq4ve2\tesserocr\setup.py:134: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
      _LOGGER.warn('Failed to extract tesseract version from executable: {}'.format(e))
    Failed to extract tesseract version from executable: [WinError 2] 系统找不到指定的文件。
    Supporting tesseract v3.04.00
    Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
    c:\python37\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'long_description_content_type'
      warnings.warn(msg)
    running install
    running build
    running build_ext
    building 'tesserocr' extension
    error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

    ----------------------------------------

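My first thought was to follow the link in the error message and download Visual C++ 14.0, but unfortunately it returned a 404 error. I then found that the problem can be avoided by installing a prebuilt package in whl format, which can be downloaded from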

https://github.com/simonflueckiger/tesserocr-windows_build/releases

The result of the installation is shown in the figure.
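Once a wheel has been installed, a quick way to confirm that the binding actually loads is to import it and ask which tesseract it was built against. This is only a sketch: tesserocr.tesseract_version() is a function provided by tesserocr, while the helper name tesserocr_status is hypothetical. The import is wrapped in try/except so the script also gives a readable message when the package is missing or its DLLs fail to load.

```python
def tesserocr_status():
    """Return a one-line description of whether tesserocr can be imported."""
    try:
        import tesserocr  # raises ImportError if missing or if its DLLs cannot load
    except ImportError as exc:
        return f"tesserocr is not importable: {exc}"
    # tesseract_version() reports the tesseract/leptonica versions the binding uses.
    return f"tesserocr OK: {tesserocr.tesseract_version()}"


if __name__ == "__main__":
    print(tesserocr_status())
```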

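One problem I ran into is worth mentioning. The first file I downloaded was tesserocr-2.2.2-cp36-cp36m-win_amd64.whl. As a Python beginner, I had not yet worked out what cp36 stood for, so I kept getting the following error: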

c:\Python37\Scripts>pip3 install tesserocr-2.2.2-cp36-cp36m-win_amd64.whl
tesserocr-2.2.2-cp36-cp36m-win_amd64.whl is not a supported wheel on this platform.
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

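As you can see, I am using Python 3.7, which is why this error kept appearing. It eventually dawned on me that the file I had downloaded was not built for 3.7 (cp36 means CPython 3.6), so I went back to the same link, picked tesserocr-2.3.1-cp37-cp37m-win_amd64.whl, and the installation succeeded.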
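The cp36 versus cp37 mismatch can be spotted before running pip. The sketch below reads the Python tag out of a wheel file name and compares it with the running interpreter. It is a simplification under stated assumptions: it only handles CPython-style tags such as cp37, and it ignores the ABI and platform tags (cp37m, win_amd64) that pip also checks. The function names are my own.

```python
import sys


def wheel_python_tag(filename):
    """Extract the Python tag (e.g. 'cp36') from a wheel file name."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel names end with: ...-<python tag>-<abi tag>-<platform tag>
    return stem.split("-")[-3]


def current_python_tag():
    """Build the CPython tag of the interpreter running this script."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def matches_this_interpreter(filename):
    """True if the wheel's Python tag equals the running interpreter's tag."""
    return wheel_python_tag(filename) == current_python_tag()


if __name__ == "__main__":
    for name in (
        "tesserocr-2.2.2-cp36-cp36m-win_amd64.whl",
        "tesserocr-2.3.1-cp37-cp37m-win_amd64.whl",
    ):
        print(name, "->", wheel_python_tag(name),
              "| matches this Python:", matches_this_interpreter(name))
```

Run under Python 3.7, only the second file name would be reported as a match, which is consistent with the error shown earlier for the cp36 wheel.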


Reposted from www.cnblogs.com/zmiao/p/10182040.html