Installing tesseract-ocr on a VPS (Ubuntu Server 10.04)

1. Add package sources:

See http://conanca.iteye.com/blog/1044256

2. Install the compiler, build tools, image format support, and leptonica

sudo apt-get update
sudo apt-get upgrade

sudo apt-get install build-essential gcc make
sudo apt-get install libc6 libc6-dev

sudo apt-get install autoconf automake libtool

sudo apt-get install libpng12-dev
sudo apt-get install libjpeg62-dev
sudo apt-get install libtiff4-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libxpm-dev
sudo apt-get install libgif-dev

sudo apt-get install libleptonica-dev
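
Before moving on to the build, it can help to confirm that the packages above were actually installed. The following is a minimal sketch, not part of the original steps: `missing_packages` is a hypothetical helper name, and it assumes `dpkg-query` is available, as it is on a stock Ubuntu system.

```shell
# Print every package from the argument list that dpkg does not report
# as fully installed ("install ok installed").
missing_packages() {
    for pkg in "$@"; do
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
                | grep -q "install ok installed"; then
            echo "$pkg"
        fi
    done
}

# Check the build prerequisites from this step; no output means all are present.
missing_packages build-essential autoconf automake libtool \
    libpng12-dev libjpeg62-dev libtiff4-dev zlib1g-dev libleptonica-dev
```

Any name that is printed can be passed to sudo apt-get install again.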

3. Install tesseract and its language data

wget http://tesseract-ocr.googlecode.com/files/tesseract-3.00.tar.gz
tar -xzf tesseract-3.00.tar.gz
cd tesseract-3.00
./runautoconf
./configure
make
sudo make install

sudo ldconfig

wget http://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
gzip -d eng.traineddata.gz
sudo cp -r eng.traineddata /usr/local/share/tessdata/

wget http://tesseract-ocr.googlecode.com/files/chi_sim.traineddata.gz
gzip -d chi_sim.traineddata.gz
sudo cp -r chi_sim.traineddata /usr/local/share/tessdata/
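
Once the language files are copied, a quick check and a first OCR run might look like the sketch below. It is an illustration under stated assumptions rather than part of the original steps: `TESSDATA_DIR`, `have_lang`, and `ocr_image` are hypothetical names introduced here, and the directory defaults to the /usr/local/share/tessdata path used above. The call itself follows the usual tesseract form of image, output base name, then -l and a language code.

```shell
# Hypothetical variable: where the .traineddata files were copied.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# Succeeds if the trained data file for the given language code is present.
have_lang() {
    [ -f "$TESSDATA_DIR/$1.traineddata" ]
}

# Run OCR on an image: ocr_image <image> <outputbase> <lang>
# The recognized text is written to <outputbase>.txt.
ocr_image() {
    command -v tesseract >/dev/null 2>&1 || {
        echo "tesseract not found on PATH" >&2
        return 1
    }
    have_lang "$3" || {
        echo "missing $3.traineddata in $TESSDATA_DIR" >&2
        return 1
    }
    tesseract "$1" "$2" -l "$3"
}

# Report which of the two language files from this step are in place.
for lang in eng chi_sim; do
    if have_lang "$lang"; then
        echo "$lang: ok"
    else
        echo "$lang: missing"
    fi
done
```

With a test image at hand (sample.tif is a placeholder name), ocr_image sample.tif result chi_sim should leave the recognized text in result.txt.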

4. Configure Ubuntu to support Chinese

See http://hi.baidu.com/cobala/blog/item/993ef23bd677fcf814cecb1f.html
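
The linked post is not reproduced here. As a rough sketch of one way to see whether a Chinese locale is already available on the system, the output of locale -a can be filtered. `locale_listed` is a hypothetical helper name, and the zh_CN prefix is an assumption about which locale is wanted.

```shell
# Succeeds if any locale name on stdin starts with the given prefix
# (case-insensitive match).
locale_listed() {
    grep -qi "^$1"
}

# List the installed locales and look for a Simplified Chinese one.
if locale -a 2>/dev/null | locale_listed zh_CN; then
    echo "zh_CN locale available"
else
    echo "zh_CN locale not found; it may need to be generated"
fi
```

On Ubuntu a missing locale can usually be generated with sudo locale-gen zh_CN.UTF-8, though the linked post may describe a different procedure.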

Reposted from conanca.iteye.com/blog/1155118