Image recognition using Tess4J

 

1. Download

1. Go to the official website download page

https://sourceforge.net/projects/tess4j/

 

2. Click download

 

3. Unzip the archive after downloading. Three folders in the extracted directory are needed: dist, lib, and tessdata

 

2. Use Tess4J

1. Add the JAR files under dist and lib to the Java project's build path

 

2. Copy the tessdata folder into the root directory of the project

 

3. The demo code is as follows

import java.io.File;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OCRDemo {

    public static void main(String[] args) {
        try {
            long start = System.currentTimeMillis();
            File imageFile = new File("C:\\Users\\dan\\Desktop\\12345.png"); // Image location
            ITesseract instance = new Tesseract();
            // instance.setDatapath(""); // Set the tessdata location if it is not in the project root
            instance.setLanguage("chi_sim"); // Select the language data (chi_sim.traineddata)
            String result = instance.doOCR(imageFile); // Run recognition
            long end = System.currentTimeMillis();
            System.out.println(result); // Print the recognized text
            System.out.println("Elapsed time: " + (end - start) / 1000.0 + "s");
        } catch (TesseractException e) {
            e.printStackTrace();
        }
    }

}

Notes:

① If tessdata is not placed in the project's root directory, be sure to set the location of tessdata:

instance.setDatapath(""); // Set tessdata location

② The language name passed to setLanguage is written without the file suffix (chi_sim, not chi_sim.traineddata). The Simplified Chinese data chi_sim may not be included in the default tessdata folder, in which case you need to download it yourself:

https://github.com/tesseract-ocr/tessdata
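
The two points above can be checked before OCR runs. The following is a minimal sketch that uses only the Java standard library; the class TessdataCheck and its methods are hypothetical helpers, not part of Tess4J, and the relative path "tessdata" assumes the folder was copied into the project root as described earlier.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tess4J: checks the tessdata folder before OCR runs.
final class TessdataCheck {

    // Language data files are named "<language>.traineddata", e.g. chi_sim.traineddata.
    static String trainedDataFileName(String language) {
        return language + ".traineddata";
    }

    // True if the given tessdata directory contains the data file for the language.
    static boolean hasLanguage(Path tessdataDir, String language) {
        return Files.isRegularFile(tessdataDir.resolve(trainedDataFileName(language)));
    }

    public static void main(String[] args) {
        // Assumption: tessdata sits in the working directory (the project root).
        Path tessdataDir = Paths.get("tessdata");
        String language = "chi_sim";
        if (hasLanguage(tessdataDir, language)) {
            System.out.println("Found " + trainedDataFileName(language));
        } else {
            System.out.println("Missing "
                    + tessdataDir.resolve(trainedDataFileName(language)).toAbsolutePath()
                    + "; download it or point setDatapath at the right folder");
        }
    }
}
```

If the file is reported missing, either download chi_sim.traineddata into that folder or pass the folder that does contain it to instance.setDatapath.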

 

3. Results

 

The recognition rate with the official language data is still fairly low. If you need high accuracy, you will have to train your own language data.
