Accessing Baidu Brain table text recognition: digitize information quickly and cut data-entry costs

Table text recognition extracts the content of paper forms, such as registration forms with personal details, product lists, and public notices, and quickly turns it into an electronic, structured table. This makes registration information easier to collate and tally, significantly reduces the cost of manual data entry, and makes the information easier to manage.

I. Platform access

This step is fairly simple, so I will not go into detail. See the earlier post:

https://ai.baidu.com/forum/topic/show/943162

II. Analyzing the interface documentation

1. Open the API documentation page and analyze the interface requirements

https://ai.baidu.com/docs#/OCR-API/87932804

(1) Interface description

The interface extracts and recognizes the text in a table image and returns the header, the footer, and the text of each cell in structured form. It recognizes both regular tables and tables with merged cells, and the result can be returned as JSON or as an Excel file.

(2) Request description

The information needed is:

Request URL: https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/request

Header format: Content-Type: application/x-www-form-urlencoded

The request parameters are placed in the body.

The interface is asynchronous and consists of two APIs: a submit-request interface and a get-result interface. The key parameter is is_sync. When it is "false", the recognition result has to be fetched through the get-result interface; when it is "true", the result is returned synchronously and the get-result interface does not need to be called. Since one call is simpler than two, just set the parameter to "true".
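As a minimal sketch of the two modes, the helper below builds the form-encoded request body for either a synchronous or an asynchronous call. The function name `build_form_ocr_body` is hypothetical; the parameter names `image`, `is_sync`, and `request_type` are the ones used in the source listing later in this post.

```python
import base64
import urllib.parse


def build_form_ocr_body(image_bytes, sync=True, request_type="json"):
    """Build the x-www-form-urlencoded body for the table recognition request.

    Hypothetical helper: only the parameter names (image, is_sync,
    request_type) come from this post; the function itself is illustrative.
    """
    params = {
        "image": base64.b64encode(image_bytes),
        # "true": the result comes back in the same response.
        # "false": the result must be fetched through the get-result interface.
        "is_sync": "true" if sync else "false",
        "request_type": request_type,  # "json" or "excel"
    }
    return urllib.parse.urlencode(params).encode("utf-8")


body = build_form_ocr_body(b"fake image bytes", sync=True)
print(body.decode("utf-8"))
```

Keeping the body construction in one place makes it easy to switch between the two modes without touching the code that sends the request.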

(3) Return parameters

Example response (ret_msg "已完成" means "completed"):

{
  "result": {
    "result_data": "http://bj.bcebos.com/v1/ai-edgecloud/4F00EC7AED4E4827BD517CB105E56DEB?authorization=bce-auth-v1%2Ff86a2044998643b5abc89b59158bad6d%2F2019-08-10T07%3A28%3A13Z%2F172800%2F%2F374c64232876bcbe78a54105e438a97376f530788e5386e04f67d0cba4935f3d",
    "ret_msg": "已完成",
    "percent": 100,
    "ret_code": 3
  },
  "log_id": 1565422091617865
}
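To make the fields in the example response concrete, here is a small sketch that pulls the download link out of a response once recognition has finished. It assumes, based only on the example above, that `ret_code` 3 together with `percent` 100 means the job is complete. The function name `extract_result_data` and the shortened sample URL are hypothetical.

```python
import json


def extract_result_data(content):
    """Return result_data if the job is finished, otherwise None.

    Assumption: ret_code == 3 and percent == 100 signal completion,
    as in the example response shown in this post.
    """
    info = json.loads(content)
    result = info.get("result")
    if not isinstance(result, dict):
        # No usable "result" object in this response.
        return None
    if result.get("ret_code") == 3 and result.get("percent") == 100:
        return result.get("result_data")
    return None


# Sample with the same shape as the example response (URL is a placeholder).
sample = json.dumps({
    "result": {
        "result_data": "http://example.com/result.xls",
        "ret_msg": "已完成",
        "percent": 100,
        "ret_code": 3,
    },
    "log_id": 1565422091617865,
})
print(extract_result_data(sample))
```

Returning None for an unfinished or malformed response lets the caller decide whether to retry or report an error.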

2. Obtain the access_token

# encoding:utf-8
import base64
import urllib.parse
import urllib.request

request_url = "https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/request"

# Open the image file in binary mode and base64-encode it
with open('[local file]', 'rb') as f:
    img = base64.b64encode(f.read())

# The image is sent in the "image" parameter
params = {"image": img}
params = urllib.parse.urlencode(params).encode('utf-8')

access_token = '[token obtained from the authentication interface]'
request_url = request_url + "?access_token=" + access_token

request = urllib.request.Request(url=request_url, data=params)
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
response = urllib.request.urlopen(request)
content = response.read()
if content:
    print(content)

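III. Recognition results

1. Recognition result: [image not preserved]

2. Recognition result: [image not preserved]

3. Recognition result: [image not preserved]

4. Recognition result: [image not preserved]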

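Conclusion:

Recognition results: tests with complex tables in several different layouts gave fairly accurate results, which can greatly reduce data-entry work.

Processing speed: each image takes 3-5 s to process, which is acceptable.

IV. Source code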

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import base64
import json
import time
import urllib.parse
import urllib.request

# client_id is the AK and client_secret is the SK obtained from the official website
client_id = '*******************'
client_secret = '*********************'



# Get the token
def get_token():
    host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=' + client_id + '&client_secret=' + client_secret
    request = urllib.request.Request(host)
    request.add_header('Content-Type', 'application/json; charset=UTF-8')
    response = urllib.request.urlopen(request)
    token_content = response.read()
    if not token_content:
        # Fail loudly instead of returning an unbound variable
        raise RuntimeError('empty response from the token endpoint')
    token_info = json.loads(token_content.decode("utf-8"))
    return token_info['access_token']



# Read the image
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()





# Get the table information
def get_table_info(path):
    request_url = "https://aip.baidubce.com/rest/2.0/solution/v1/form_ocr/request"

    f = get_file_content(path)
    access_token = get_token()
    print(access_token)
    img = base64.b64encode(f)
    # For JSON output use: params = {"image": img, "is_sync": 'true', "request_type": 'json'}
    params = {"image": img, "is_sync": 'true', "request_type": 'excel'}
    params = urllib.parse.urlencode(params).encode('utf-8')
    request_url = request_url + "?access_token=" + access_token
    # time.clock() was removed in Python 3.8; perf_counter() measures elapsed time
    tic = time.perf_counter()
    request = urllib.request.Request(url=request_url, data=params)
    request.add_header('Content-Type', 'application/x-www-form-urlencoded')
    response = urllib.request.urlopen(request)
    content = response.read()
    toc = time.perf_counter()
    print('Processing time: ' + '%.2f' % (toc - tic) + ' s')
    if content:
        print(content)
        table_info = json.loads(content.decode("utf-8"))
        # result_data is used here as the download URL of the Excel file
        excel_url = table_info['result']['result_data']
        excel = urllib.request.urlopen(excel_url)
        with open("sbg.xls", "wb") as code:
            code.write(excel.read())
        return content
    else:
        return ''


# Raw string so the backslashes in the Windows path are not treated as escapes
image_path = r'F:\paddle\sbg\s6.jpg'
get_table_info(image_path)
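The listing above sets is_sync to 'true', so a single call returns the result. For the asynchronous mode described in section II, the result has to be fetched separately, and the sketch below shows one way to poll for it. It is not taken from the Baidu documentation: `fetch_status` is a hypothetical callable standing in for a call to the get-result interface, the completion test (`ret_code` 3) is inferred from the example response, and the `ret_code` 2 used for "in progress" in the simulated replies is likewise an assumption.

```python
import time


def poll_until_done(fetch_status, interval=1.0, max_attempts=10):
    """Call fetch_status() until the job reports completion.

    fetch_status is a hypothetical callable that returns the parsed
    "result" dict of a get-result response (ret_code, percent, result_data).
    Returns result_data, or raises TimeoutError after max_attempts tries.
    """
    for attempt in range(max_attempts):
        result = fetch_status()
        if result.get("ret_code") == 3:  # assumed to mean "completed"
            return result.get("result_data")
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError("recognition did not finish in time")


# Simulated replies: two "in progress" responses, then a finished one.
replies = iter([
    {"ret_code": 2, "percent": 30},
    {"ret_code": 2, "percent": 80},
    {"ret_code": 3, "percent": 100, "result_data": "http://example.com/result.xls"},
])
print(poll_until_done(lambda: next(replies), interval=0))
```

Passing the fetch step in as a callable keeps the retry logic separate from the HTTP call, so the loop can be exercised without network access.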

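V. Suggestions

1. Overall recognition is good, but accuracy still has room to improve and the handling of details could be more polished. For example, in complex tables text sometimes runs across rows, and individual characters are lost or wrong.

2. Recognition of handwritten text in tables is poor; I suggest adding support for handwriting recognition.

Author: wangwei8638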

Origin www.cnblogs.com/AIBOOM/p/11725472.html