1. Add the package sources:
See http://conanca.iteye.com/blog/1044256
2. Install the compiler, build tools, image format libraries, and leptonica:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential gcc make
sudo apt-get install libc6 libc6-dev
sudo apt-get install autoconf automake libtool
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg62-dev
sudo apt-get install libtiff4-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libxpm-dev
sudo apt-get install libgif-dev
sudo apt-get install libleptonica-dev
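Before moving on to the build, it can help to confirm that the build tools are actually on the PATH. The following is only a minimal sketch: missing_tools is a hypothetical helper name, not part of any package, and the list of tools checked is an assumption based on the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes): print each named
# command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed list of tools that the tesseract build relies on.
missing=$(missing_tools gcc make autoconf automake)
if [ -n "$missing" ]; then
    echo "not installed yet:" $missing
else
    echo "build tools found"
fi
```

If anything is reported as missing, rerun the matching apt-get install line from this step.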
3. Install tesseract and its language data:
wget http://tesseract-ocr.googlecode.com/files/tesseract-3.00.tar.gz
tar -xzf tesseract-3.00.tar.gz
cd tesseract-3.00
./runautoconf
./configure
make
sudo make install
sudo ldconfig
wget http://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
gzip -d eng.traineddata.gz
sudo cp eng.traineddata /usr/local/share/tessdata/
wget http://tesseract-ocr.googlecode.com/files/chi_sim.traineddata.gz
gzip -d chi_sim.traineddata.gz
sudo cp chi_sim.traineddata /usr/local/share/tessdata/
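Once the language files are copied, a quick check that they landed in the right place can save some confusion later. This is a rough sketch under a few assumptions: missing_langs is a hypothetical helper, TESSDATA_DIR is a variable name made up for this script (it is not read by tesseract itself), and sample.tif stands for any test image of your own.

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes): print each language
# code whose .traineddata file is absent from the given directory.
# Usage: missing_langs DIR LANG...
missing_langs() {
    dir=$1
    shift
    for lang in "$@"; do
        [ -f "$dir/$lang.traineddata" ] || echo "$lang"
    done
}

# TESSDATA_DIR is an assumed name; the default is the path used in step 3.
TESSDATA_DIR=${TESSDATA_DIR:-/usr/local/share/tessdata}
missing=$(missing_langs "$TESSDATA_DIR" eng chi_sim)
if [ -n "$missing" ]; then
    echo "missing language data:" $missing
elif command -v tesseract >/dev/null 2>&1 && [ -f sample.tif ]; then
    # sample.tif is a placeholder image; recognized text goes to out.txt.
    tesseract sample.tif out -l chi_sim
fi
```

The script only runs tesseract when both language files are present and a sample.tif exists in the current directory; otherwise it just reports what is missing.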
4. Configure Ubuntu to support Chinese:
See http://hi.baidu.com/cobala/blog/item/993ef23bd677fcf814cecb1f.html