CN115376149A - A method for identifying reimbursement invoices - Google Patents

A method for identifying reimbursement invoices
Info

Publication number
CN115376149A
Authority
CN
China
Prior art keywords
invoice
picture
text
network
target frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211056210.3A
Other languages
Chinese (zh)
Inventor
励建科
胡艳
陈再蝶
朱晓秋
章星星
樊伟东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Kangxu Technology Co ltd
Original Assignee
Zhejiang Kangxu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Kangxu Technology Co ltd
Priority to CN202211056210.3A
Publication of CN115376149A

Abstract

The invention discloses a method for identifying reimbursement invoices, comprising the following steps: S1, performing invoice instance detection and segmentation on the picture to be detected; S2, obtaining the invoice target frame and the invoice mask within the target frame from the segmentation result, and cropping a single-invoice picture according to the coordinates of the invoice target frame; S3, performing OCR on the single-invoice picture to obtain a recognition result, and determining the invoice type from that result; and S4, obtaining an angle-corrected invoice picture at 0 degrees or 180 degrees through an angle correction algorithm, based on the single-invoice picture, its corresponding mask, and the invoice type. The method handles reimbursement pictures that contain multiple invoices of multiple types; it corrects the angle of each invoice, which improves recognition accuracy and reduces the misrecognition and missed-recognition rates; and it applies to many kinds of invoices, including electronic invoices and photographed pictures, with or without tables.

Description

Translated from Chinese
A method for identifying reimbursement invoices

Technical field

The invention relates to the technical field of invoice recognition, and in particular to a method for identifying reimbursement invoices.

Background

Financial management involves a large amount of data collection and information processing. For individuals and enterprises alike, the practice of issuing invoices and reimbursing only against an invoice is increasingly widespread. Because expenses fall into different categories, several invoices of different types are usually pasted onto a single reimbursement form. Invoices also differ in shape, so to save space on the page some are pasted vertically or upside down. The result is a form with varied invoices in unfixed positions, on which existing reimbursement invoice recognition often misses invoices or fails outright, which lowers the precision with which a company can manage its financial invoices.

Automatic invoice recognition solutions have been introduced in response. They use invoice recognition technology to collect information from value-added tax invoices and other bills in batches and output structured data. Compared with traditional manual data entry, this greatly reduces the workload of financial staff and improves their efficiency.

However, existing reimbursement invoice recognition technology has the following drawbacks:

(1) Existing recognition and matching methods assume a high-quality input picture that contains a single invoice, with all text in the picture lying inside that invoice. When the invoice background is complex, such methods struggle or produce matching errors;

(2) Some existing methods do not correct the invoice angle, which leads to low text detection and recognition accuracy and frequent missed or failed recognition. The methods that do correct the angle rely on the corner points of the invoice table, so they cannot readily correct invoices that have no table;

(3) Existing methods usually build a template for one particular type of reimbursement invoice and then perform detection and recognition. Combined with the high picture quality they require, this makes them hard to adapt to multiple invoice types.

Summary of the invention

To solve the technical problems described in the background above, a method for identifying reimbursement invoices is proposed.

To achieve this, the invention adopts the following technical solution:

A method for identifying reimbursement invoices, comprising the following steps:

S1. Perform invoice instance detection and segmentation on the picture to be detected;

S2. From the segmentation result, obtain the invoice target frame and the invoice mask within the target frame, and crop a single-invoice picture according to the coordinates of the invoice target frame;

S3. Perform OCR on the single-invoice picture to obtain a recognition result, and determine the invoice type from that result;

S4. Based on the single-invoice picture, its corresponding mask, and the invoice type, obtain an angle-corrected invoice picture at 0 degrees or 180 degrees through an angle correction algorithm;

S5. Input the angle-corrected invoice picture into a trained direction classifier and output a 0-degree invoice picture;

S6. Input the 0-degree invoice picture into a trained text detection model to obtain the text box positions;

S7. Classify the direction of each text box and rotate the text regions upright;

S8. Input the upright text regions into a trained text recognition model to obtain the recognition result;

S9. Based on the text box positions, the recognition result, and the invoice type, complete the text matching through the text matching rules;

S10. Based on the invoice text matching result, verify the authenticity of the invoice through an invoice verification interface.
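Step S3 states only that the invoice type is determined from the OCR result. One way this could work is a keyword lookup, sketched below. The table `TYPE_KEYWORDS`, the type names, and the function `guess_invoice_type` are all hypothetical; the source does not list the rules it uses.

```python
# Hypothetical keyword table: the source does not specify the rules used in S3.
TYPE_KEYWORDS = {
    "vat_invoice": ["增值税", "发票代码"],
    "taxi_invoice": ["出租", "上车", "下车"],
    "train_ticket": ["检票", "车次"],
}

def guess_invoice_type(ocr_lines):
    """Return the type whose keywords occur most often in the OCR text, or None."""
    text = "".join(ocr_lines)
    scores = {
        name: sum(1 for keyword in keywords if keyword in text)
        for name, keywords in TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched at all: the type is unknown.
    return best if scores[best] > 0 else None
```

A production system would more likely combine several cues (layout, printed title, code formats), so this is only an illustration of the idea.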

As a further description of the above technical solution:

In step S4, the angle correction algorithm comprises:

S41. Obtain the single-invoice picture and its corresponding binary mask image;

S42. Apply findContours from OpenCV to the binary mask image to obtain the edge points of the invoice;

S43. Use the minAreaRect method to obtain the minimum-area bounding rectangle rect of the invoice edge;

S44. Depending on the invoice type, compute the angle between the invoice and the horizontal direction, separately for portrait and landscape invoices;

S45. Use getRotationMatrix2D() and warpAffine to rotate the invoice until its angle with the horizontal direction is 0 degrees.
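The angle computation in S44 can be sketched in plain Python. The sketch assumes the four corners of the minimum-area rectangle are available in order around the rectangle (for example from `cv2.boxPoints(rect)`) and that the invoice type tells us whether the invoice is landscape or portrait. The function names are hypothetical, and the source does not give its exact formula.

```python
import math

def long_edge_angle(corners):
    """Angle in degrees, in [0, 180), between the rectangle's long edge and the x-axis.

    `corners` holds four (x, y) points in order around the rectangle,
    in image coordinates (y grows downward).
    """
    (x0, y0), (x1, y1), (x2, y2) = corners[0], corners[1], corners[2]
    edge_a = (x1 - x0, y1 - y0)
    edge_b = (x2 - x1, y2 - y1)
    dx, dy = max(edge_a, edge_b, key=lambda e: math.hypot(*e))
    return math.degrees(math.atan2(dy, dx)) % 180.0

def correction_angle(corners, is_landscape):
    """Rotation in degrees, in (-90, 90], that makes the long edge horizontal
    for landscape invoices or vertical for portrait invoices."""
    target = 0.0 if is_landscape else 90.0
    delta = (long_edge_angle(corners) - target) % 180.0
    return delta - 180.0 if delta > 90.0 else delta
```

With OpenCV's convention that a positive angle rotates counter-clockwise in image coordinates, the returned value can serve as the `angle` argument of `getRotationMatrix2D`. The result is upright only up to a 180-degree ambiguity, which is what step S5 resolves.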

As a further description of the above technical solution:

In step S9, the text matching rules comprise:

S91. Preset the keywords to be recognized for each invoice type;

S92. Based on the text detection result, if a keyword and its value appear in the same text box, separate the keyword from the value to complete the text matching;

S93. If the keyword and its value do not appear in the same text box, extend the keyword text box according to two parameters, the extension direction and the extension ratio, and check whether any other text box intersects the extended box. If one does, merge the contents of the two text boxes and separate the keyword from the value to complete the text matching; if none does, the text matching fails.
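Rule S92 can be illustrated with a small helper. It is a minimal sketch that assumes the value follows the keyword in the recognized string, optionally separated by a colon or whitespace; the function name `split_keyword_value` is hypothetical.

```python
import re

def split_keyword_value(text, keywords):
    """Return (keyword, value) if `text` contains a keyword followed by a value.

    Returns None when no keyword is present or nothing follows it, in which
    case rule S93 (box extension) would be tried instead.
    """
    for keyword in sorted(keywords, key=len, reverse=True):  # prefer the longest match
        index = text.find(keyword)
        if index == -1:
            continue
        rest = text[index + len(keyword):]
        # Drop separators such as ASCII or full-width colons and whitespace.
        value = re.sub(r"^[\s::]+", "", rest)
        return (keyword, value) if value else None
    return None
```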

As a further description of the above technical solution:

In step S93, the keyword text box is expressed as [center_x, center_y, w, h], where center_x, center_y are the coordinates of the box center and w, h are its width and height;

Let the extension ratio be ratio. When the extension direction is horizontally to the right, the extended text box is [center_x+(ratio-1)*w/2, center_y, w*ratio, h];

When the extension direction is vertically downward, the extended text box is [center_x, center_y+(ratio-1)*h/2, w, h*ratio];

When the extension direction is toward the lower right, the extended text box is [center_x+(ratio-1)*w/2, center_y+(ratio-1)*h/2, w*ratio, h*ratio].
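The center offsets in these formulas follow from keeping the left (or top) edge of the box fixed while scaling its width (or height). A short derivation for the rightward case, writing $c_x$ for center_x and $r$ for ratio (shorthand introduced here, not in the source):

```latex
\begin{aligned}
x_{\text{left}} &= c_x - \tfrac{w}{2}, \qquad w' = r\,w,\\
c_x' &= x_{\text{left}} + \tfrac{w'}{2}
      = c_x - \tfrac{w}{2} + \tfrac{r\,w}{2}
      = c_x + \frac{(r-1)\,w}{2}.
\end{aligned}
```

For example, with $c_x = 100$, $w = 40$ and $r = 2$, the extended box has center $120$ and width $80$, so it spans $[80, 160]$ and its left edge stays at $80$. The downward case is identical with $c_y$ and $h$ in place of $c_x$ and $w$, since image $y$ coordinates grow downward.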

As a further description of the above technical solution:

In step S1, a trained MaskRCNN network performs invoice instance detection and segmentation on the picture to be detected. The MaskRCNN network is trained on a training data set as follows:

S11. Initialize the MaskRCNN network weights, using network parameters pre-trained on the ImageNet data set to initialize the backbone network;

S12. Randomly scale the training pictures by their short side, and use a ResNet50+FPN network to extract the overall feature map of each training image;

S13. Feed the overall feature map into the RPN network to predict candidate ROIs, and select positive and negative samples according to the overlap ratio between each candidate frame and the annotated target frames;

S14. Apply ROIAlign pooling to the candidate ROIs of the positive and negative samples on the feature map to obtain fixed-size candidate feature maps;

S15. Perform classification and target frame regression on the candidate feature maps;

S16. Compute the loss function of the MaskRCNN network, compute its gradients with stochastic gradient descent, and update the network weights;

S17. Repeat steps S12-S16, stop after at least 20 rounds of training, and save the network, which is then the trained MaskRCNN network.
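The sample selection in S13 depends on the overlap ratio (intersection over union, IoU) between candidate frames and annotated frames. The sketch below shows one common way to label candidates. The thresholds are assumptions, since the source gives none, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_proposals(proposals, gt_boxes, pos_thresh=0.5, neg_thresh=0.5):
    """Label each proposal 1 (positive), 0 (negative) or -1 (ignored).

    The default thresholds are assumed values, not taken from the source.
    """
    labels = []
    for proposal in proposals:
        best = max((iou(proposal, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)
    return labels
```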

As a further description of the above technical solution:

In step S14, ROIAlign pooling first maps the ROI target frame onto the feature map, then obtains the ROI region on the feature map according to the minimum bounding rectangle algorithm, divides that region into an m×m grid, samples 4 points on the feature map in each grid cell by bilinear interpolation, and finally produces a feature map of size m×m.
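The pooling described above can be sketched on a single-channel feature map stored as a list of rows. This is a simplified illustration: the ROI is assumed to be already expressed in feature-map coordinates, the 4 samples per cell are laid out as a 2×2 pattern and averaged (max pooling is the other common choice), and pixel-center offsets are ignored. The function names are hypothetical.

```python
def bilinear(feature, x, y):
    """Bilinearly interpolate a 2D list `feature[row][col]` at real-valued (x, y)."""
    height, width = len(feature), len(feature[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = feature[y0][x0] * (1 - fx) + feature[y0][x1] * fx
    bottom = feature[y1][x0] * (1 - fx) + feature[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def roi_align(feature, roi, m):
    """Pool `roi` = (x1, y1, x2, y2) on the feature map to an m x m output."""
    x1, y1, x2, y2 = roi
    cell_w, cell_h = (x2 - x1) / m, (y2 - y1) / m
    output = []
    for row in range(m):
        out_row = []
        for col in range(m):
            # Four sample points per cell, at the quarter positions of the cell.
            samples = [
                bilinear(feature,
                         x1 + (col + (sx + 0.5) / 2) * cell_w,
                         y1 + (row + (sy + 0.5) / 2) * cell_h)
                for sy in range(2) for sx in range(2)
            ]
            out_row.append(sum(samples) / 4.0)
        output.append(out_row)
    return output
```

Because no coordinate is rounded to an integer before sampling, the output varies smoothly with the ROI position, which is the point of ROIAlign compared with quantized ROI pooling.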

As a further description of the above technical solution:

In step S2, the invoice target frame and the invoice mask within the target frame are obtained as follows:

S21. Resize the original picture to obtain the picture to be detected;

S22. Feed the picture to be detected into the MaskRCNN network and interpolate the prediction back to the original picture size to obtain the invoice target frame and the invoice mask within the target frame.
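Mapping a predicted frame back to the original picture and cropping a single invoice can be sketched as follows. Images are represented as lists of rows to keep the example dependency-free; the function names and the rounding choice are assumptions.

```python
def scale_box(box, from_size, to_size):
    """Map (x1, y1, x2, y2) from an image of from_size=(w, h) to one of to_size=(w, h)."""
    sx = to_size[0] / from_size[0]
    sy = to_size[1] / from_size[1]
    x1, y1, x2, y2 = box
    return (round(x1 * sx), round(y1 * sy), round(x2 * sx), round(y2 * sy))

def crop(image, box):
    """Crop a row-major image (list of rows) to the box, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(width, x2), min(height, y2)
    return [row[x1:x2] for row in image[y1:y2]]
```

The same crop would be applied to the mask so that the picture and its mask stay aligned for the angle correction in S4.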

As a further description of the above technical solution:

The direction classifier uses a convolutional neural network model consisting of a CNN backbone network and a streaming module. The input image is resized to 224x224 and fed into the CNN backbone, whose convolutions extract features and produce a 7x7 feature map. The streaming module downsamples with a depthwise convolution (DWConv) layer whose stride is greater than 1, and outputs the 7x7 feature map as a 1-dimensional feature vector (1x1024).
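The sizes quoted above are consistent with the usual convolution output-size formula. The sketch below checks the arithmetic under an assumed layout: five stride-2 stages with 3x3 kernels and padding 1 in the backbone, and a 7x7 depthwise kernel without padding in the streaming module. None of these kernel sizes or stage counts are stated in the source.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def backbone_sizes(input_size=224, stages=5):
    """Feature-map side after each stride-2, 3x3, padding-1 stage (assumed layout)."""
    sizes = [input_size]
    for _ in range(stages):
        sizes.append(conv_out(sizes[-1], kernel=3, stride=2, padding=1))
    return sizes

sizes = backbone_sizes()
# Hypothetical 7x7 depthwise convolution with stride 2 and no padding.
stream_out = conv_out(sizes[-1], kernel=7, stride=2)
```

Under these assumptions the side length halves at each stage from 224 down to 7, and the depthwise layer collapses the 7x7 map to 1x1. A depthwise convolution keeps the channel count, so a 1x1024 output would imply 1024 channels at that point.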

As a further description of the above technical solution:

The CNN backbone network comprises BlockA and BlockB. BlockA is the inverted residual block proposed in MobilenetV2, BlockB serves as the downsampling module of the model, and the auxiliary branch on the left side of BlockB uses average pooling (AVG Pool).

In summary, the technical solution above has the following beneficial effects: (1) the method supports reimbursement pictures that contain multiple invoices of multiple types; (2) it corrects the angle of each invoice, which improves recognition accuracy and reduces the misrecognition and missed-recognition rates; (3) it has been tested on no fewer than 12 kinds of invoice and applies to many types, including electronic invoices and photographed pictures, with or without tables; (4) it verifies reimbursement invoices through a verification interface to identify false invoices.

Description of drawings

Fig. 1 is a schematic workflow diagram of a method for identifying reimbursement invoices according to an embodiment of the invention;

Fig. 2 is a schematic diagram of the training process of the MaskRCNN network and the direction classifier according to an embodiment of the invention;

Fig. 3 is a schematic diagram of the MaskRCNN network structure according to an embodiment of the invention;

Fig. 4 is a schematic structural diagram of the direction classifier according to an embodiment of the invention;

Fig. 5 is a schematic diagram of the CNN backbone network structure according to an embodiment of the invention;

Fig. 6 is a schematic diagram of the structural modules of the CNN backbone network according to an embodiment of the invention.

Detailed description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.

Embodiment 1

Referring to Figs. 1-6, the invention provides a technical solution: a method for identifying reimbursement invoices, comprising the following steps:

S1. Use a trained MaskRCNN network to perform invoice instance detection and segmentation on the picture to be detected;

Specifically, the MaskRCNN network is trained on a training data set. MaskRCNN extends FasterRCNN by adding a branch that predicts a mask for each predicted detection frame. The training steps are as follows:

S11. Initialize the MaskRCNN network weights, using network parameters pre-trained on the ImageNet data set to initialize the backbone network;

S12. Randomly scale the training pictures so that the short side takes one of the values [640, 672, 704, 736, 768, 800], and use a ResNet50+FPN network to extract the overall feature map of each training image;

S13. Feed the overall feature map into the RPN network to predict candidate ROIs, and select positive and negative samples according to the overlap ratio between each candidate frame and the annotated target frames;

S14. Apply ROIAlign pooling to the candidate ROIs of the positive and negative samples on the feature map to obtain fixed-size candidate feature maps. Specifically, ROIAlign pooling first maps the ROI target frame onto the feature map, then obtains the ROI region on the feature map according to the minimum bounding rectangle algorithm, divides that region into an m×m grid, samples 4 points on the feature map in each grid cell by bilinear interpolation, and finally produces a feature map of size m×m;

S15. Perform classification and target frame regression on the candidate feature maps;

S16. Compute the loss function of the MaskRCNN network, compute its gradients with stochastic gradient descent, and update the network weights;

S17. Repeat steps S12-S16, stop after at least 20 rounds of training, and save the network, which is then the trained MaskRCNN network;
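The random short-side scaling in S12 can be sketched as a shape computation that preserves the aspect ratio. The list of short-side values comes from the text; the function names and the rounding are assumptions.

```python
import random

SHORT_SIDES = [640, 672, 704, 736, 768, 800]

def resized_shape(width, height, short_side):
    """Scale (width, height) so that the shorter side equals `short_side`."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)

def random_resized_shape(width, height, rng=random):
    """Pick one of SHORT_SIDES at random and return the resized (width, height)."""
    return resized_shape(width, height, rng.choice(SHORT_SIDES))
```

Training at several scales in this way is a common form of augmentation that exposes the network to invoices of different apparent sizes.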

Further, the training data consists of a training set and a validation set. A total of 2132 pictures covering 12 commonly used types of reimbursement invoice were selected as the training and validation data for invoice instance segmentation. The samples were annotated with the polygon region and category of each invoice in the image, split 8:2 into a training set and a validation set, and converted to the COCO data format;
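A polygon annotation can be converted to a COCO-style annotation record roughly as follows. The field names follow the COCO instance annotation format; the helper names and the example identifiers are hypothetical.

```python
def polygon_area(points):
    """Shoelace area of a polygon given as [(x, y), ...]."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def coco_annotation(ann_id, image_id, category_id, points):
    """Build one COCO instance annotation from an invoice polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat [x1, y1, x2, y2, ...] list.
        "segmentation": [[coord for point in points for coord in point]],
        # COCO boxes are [x, y, width, height].
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": polygon_area(points),
        "iscrowd": 0,
    }
```

A full data set file would also carry `images` and `categories` lists, with one category per invoice type.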

S2. From the segmentation result, obtain the invoice target frame and the invoice mask within the target frame, and crop a single-invoice picture according to the coordinates of the invoice target frame;

Specifically, the invoice target frame and the invoice mask within the target frame are obtained as follows:

S21. Resize the original picture to obtain the picture to be detected;

S22. Feed the picture to be detected into the MaskRCNN network and interpolate the prediction back to the original picture size to obtain the invoice target frame and the invoice mask within the target frame;

S3. Perform OCR on the single-invoice picture to obtain a recognition result, and determine the invoice type from that result;

S4. Based on the single-invoice picture, its corresponding mask, and the invoice type, obtain an angle-corrected invoice picture at 0 degrees or 180 degrees through an angle correction algorithm;

Specifically, the angle correction algorithm comprises:

S41. Obtain the single-invoice picture and its corresponding binary mask image;

S42. Apply findContours from OpenCV to the binary mask image to obtain the edge points of the invoice;

S43. Use the minAreaRect method to obtain the minimum-area bounding rectangle rect of the invoice edge, which consists of the rectangle center rect[0], the rectangle width and height rect[1], and the rectangle angle rect[2];

S44. Depending on the invoice type, compute the angle between the invoice and the horizontal direction, separately for portrait and landscape invoices;

S45. Use getRotationMatrix2D() and warpAffine to rotate the invoice until its angle with the horizontal direction is 0 degrees;

S5. Input the angle-corrected invoice picture into a trained direction classifier and output a 0-degree invoice picture;

Specifically, the direction classifier uses a convolutional neural network model for image classification; its structure is shown in Fig. 4;

First, the input image is resized to 224x224 and fed into the backbone network, which applies convolutions and other feature-extraction operations to produce a 7x7 feature map. The CNN backbone is built mainly from BlockA and BlockB. BlockA is the inverted residual block proposed in MobilenetV2, and BlockB serves as the downsampling module of the model. The auxiliary branch on the left side of BlockB uses average pooling (AVG Pool) because it can embed multi-scale information and aggregate features across different receptive fields, which improves performance;

Second, after the CNN backbone, a streaming module is used to better extract the information in the feature map, as shown in Fig. 4. The streaming module downsamples with a depthwise convolution (DWConv) layer whose stride is greater than 1 and outputs a 1-dimensional feature vector (1x1024). This reduces the overfitting risk that a fully connected layer would introduce. The feature vector is then used to compute the loss and make the prediction;

Finally, the initial stage of the backbone adopts a fast downsampling strategy, which shrinks the feature map quickly at a small parameter cost. This avoids the weak feature embedding and long processing time that a slow downsampling process causes when computing power is limited;
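Once the classifier has predicted whether an angle-corrected picture is at 0 or 180 degrees, producing the 0-degree picture of step S5 is a simple flip. A minimal sketch on a row-major image follows; the label encoding (0 for 0 degrees, 1 for 180 degrees) and the function names are assumptions.

```python
def rotate_180(image):
    """Rotate a row-major image (list of rows) by 180 degrees."""
    return [row[::-1] for row in image[::-1]]

def to_upright(image, predicted_label):
    """Return the 0-degree picture.

    predicted_label: 0 means the picture is already upright, 1 means it is
    upside down (hypothetical encoding).
    """
    return rotate_180(image) if predicted_label == 1 else image
```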

Because a single picture may contain several invoices, the original picture must first be processed into single-invoice pictures before the deep learning direction classifier can be trained, as shown in Fig. 1. Specifically, the training steps of the direction classifier are as follows:

S51. Pass the original invoice picture through the trained MaskRCNN to obtain each single-invoice target frame and its mask;

S52. Determine the type of each single invoice;

S53. Correct the angle of each single invoice to obtain an invoice picture at 0 or 180 degrees;

S54. Annotate the data, producing 1831 pictures at 0 degrees and 1831 pictures at 180 degrees;

S55. Split the data 8:2 into the training and validation sets for the direction classifier;
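An 8:2 split such as the one in S55 can be sketched with a seeded shuffle. The function name, the fixed seed, and rounding the training count down are assumptions; the source states only the ratio.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples reproducibly and split them into (train, validation)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)  # rounds down (assumption)
    return shuffled[:cut], shuffled[cut:]

# 1831 pictures at each of the two orientations, labeled by angle.
samples = [(index, angle) for index in range(1831) for angle in (0, 180)]
train_set, val_set = split_dataset(samples)
```

Splitting at the level of source invoices rather than individual pictures would keep the 0-degree and 180-degree versions of one invoice in the same subset; the source does not say which approach was used.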

S6. Input the 0-degree invoice picture into a trained text detection model to obtain the text box positions;

S7. Classify the direction of each text box and rotate the text regions upright;

S8. Input the upright text regions into a trained text recognition model to obtain the recognition result;

S9. Based on the text box positions, the recognition result, and the invoice type, complete the text matching through the text matching rules;

Specifically, the text matching rules comprise:

S91. Preset the keywords to be recognized for each invoice type; for example, the keywords for a value-added tax invoice are ['发票代码' (invoice code), '发票号码' (invoice number), '开票日期' (issue date), '校验码' (check code), ...];

S92. Based on the text detection result, if a keyword and its value appear in the same text box, separate the keyword from the value to complete the text matching;

S93. If the keyword and its value do not appear in the same text box, extend the keyword text box according to two parameters, the extension direction and the extension ratio, and check whether any other text box intersects the extended box. If one does, merge the contents of the two text boxes and separate the keyword from the value to complete the text matching; if none does, the text matching fails. Setting different parameters customizes how far the text box is extended, which makes the matching of key text information more effective;

Step S93 addresses the problem of matching a text keyword with its value. When the gap between a keyword and its value is large, the text recognition model detects two separate boxes for them. Since it is known in advance whether the keyword box and the value box are arranged horizontally or vertically, the value could be found by collecting the positions of all text boxes that lie in the corresponding direction from the keyword box. The invention proposes a better approach: match keyword and value by testing whether text boxes intersect. The extension direction is set to horizontally right, vertically down, or lower right, and the extension ratio is set between 1.5 and 5 to enlarge the keyword text box. With these settings there is usually a box that intersects the enlarged keyword box, and its recognized text is the value of the keyword;

The keyword text box is expressed as [center_x, center_y, w, h], where center_x, center_y are the coordinates of the box center and w, h are its width and height;

Let the extension ratio be ratio. When the extension direction is horizontally to the right, the extended text box is [center_x+(ratio-1)*w/2, center_y, w*ratio, h]; when the extension direction is vertically downward, the extended text box is [center_x, center_y+(ratio-1)*h/2, w, h*ratio]; when the extension direction is toward the lower right, the extended text box is [center_x+(ratio-1)*w/2, center_y+(ratio-1)*h/2, w*ratio, h*ratio]. All text boxes are then traversed, and the box that intersects the extended box is the value box. The extension ratio of the text box is customized by adjusting the parameters. The parameter is usually set as a single value; if several exist, the first is taken by default;
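The extension formulas and the intersection test can be put together as follows. This is a minimal sketch: boxes are axis-aligned [center_x, center_y, w, h] lists as in the text, the direction names are hypothetical, and the function returns the text of the value box directly rather than merging and re-splitting the two box contents.

```python
def extend_box(box, direction, ratio):
    """Extend a [center_x, center_y, w, h] box using the formulas from the text."""
    cx, cy, w, h = box
    if direction == "right":
        return [cx + (ratio - 1) * w / 2, cy, w * ratio, h]
    if direction == "down":
        return [cx, cy + (ratio - 1) * h / 2, w, h * ratio]
    if direction == "down_right":
        return [cx + (ratio - 1) * w / 2, cy + (ratio - 1) * h / 2, w * ratio, h * ratio]
    raise ValueError(f"unknown direction: {direction}")

def intersects(a, b):
    """True if two axis-aligned [center_x, center_y, w, h] boxes overlap."""
    return abs(a[0] - b[0]) * 2 < a[2] + b[2] and abs(a[1] - b[1]) * 2 < a[3] + b[3]

def find_value(keyword_index, boxes, texts, direction="right", ratio=2.0):
    """Return the text of the first other box that intersects the extended keyword box."""
    extended = extend_box(boxes[keyword_index], direction, ratio)
    for index, box in enumerate(boxes):
        if index != keyword_index and intersects(extended, box):
            return texts[index]
    return None  # no intersecting box: text matching fails
```

A ratio that is too small misses the value box, while one that is too large may reach an unrelated box first, which is presumably why the text makes the ratio configurable per keyword.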

S10、根据发票文本匹配结果,通过发票验真接口实现发票验真。S10. According to the invoice text matching result, implement the invoice authenticity verification through the invoice authenticity verification interface.

Specifically, compared with the existing patent with publication number CN111062262A, titled "Invoice Recognition Method and Invoice Recognition Device": that patent must first identify the corner points of the invoice table, but some reimbursement invoices contain no table, so corner detection performs poorly and subsequent operations such as text detection fail. The present invention supports reimbursement invoices both with and without tables, for example taxi invoices, toll invoices, and bullet train tickets;

Compared with the existing patent with publication number CN111768565B, titled "A post-processing method for identifying invoice codes in value-added tax invoices": that patent only post-processes the invoice code of value-added tax invoices, which limits its scope;

Compared with the existing patent with publication number CN114004962A, titled "A Method for OCR Recognition of Invoices in Electric Power Business Halls": that patent uses a projection method to determine the text angle, which does not depend on the amount of data. Its drawback is that noisy invoices are hard to recognize, since handwriting on the invoice or shadows in the photograph affect the recognition result. The present invention uses deep learning to build an invoice instance segmentation model, which determines the invoice orientation more effectively;

Compared with the existing patent with publication number CN109977957A, titled "A Method and System for Invoice Recognition Based on Deep Learning": that patent trains an invoice detection model with Faster-RCNN and then performs OCR recognition on the detected invoice, but it does not address improving the accuracy of invoice OCR. The present invention performs mask instance segmentation on the invoice, obtaining the invoice mask while detecting the invoice target. The mask is used to correct the invoice angle, after which direction classification and OCR recognition are performed on the corrected invoice, with the aim of improving invoice OCR accuracy;

Compared with the existing patent with publication number CN111062262A, titled "Invoice Recognition Method and Invoice Recognition Device": that patent builds a corner-point graph model template for each invoice type and then performs OCR detection and recognition on the table cells, so it is a recognition method for table-type invoices only. The present invention supports recognition of multiple invoice types, including invoices with tables and invoices without tables;

In summary, the present invention has the following advantages in reimbursement invoice recognition: (1) the method handles reimbursement pictures that contain multiple invoices of multiple types; (2) angle correction of the reimbursement invoice improves recognition accuracy and reduces the false recognition and missed recognition rates; (3) the method was tested on no fewer than 12 invoice types and applies to many kinds of invoices, including electronic invoices and photographed pictures, and to invoices with or without tables; (4) the reimbursement invoice is verified through a verification interface to identify false invoices.

The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any equivalent replacement or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims (9)

CN202211056210.3A | 2022-08-31 | 2022-08-31 | A method for identifying reimbursement invoices | Pending | CN115376149A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211056210.3A, CN115376149A (en) | 2022-08-31 | 2022-08-31 | A method for identifying reimbursement invoices

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211056210.3A, CN115376149A (en) | 2022-08-31 | 2022-08-31 | A method for identifying reimbursement invoices

Publications (1)

Publication Number | Publication Date
CN115376149A (en) | 2022-11-22

Family

ID=84070151

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211056210.3A (Pending), CN115376149A (en) | A method for identifying reimbursement invoices | 2022-08-31 | 2022-08-31

Country Status (1)

Country | Link
CN (1) | CN115376149A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116434245A (en) * | 2023-04-07 | 2023-07-14 | 中国平安财产保险股份有限公司 | Multi-mode document classification method, system, computer equipment and storage medium
CN116612479A (en) * | 2023-05-04 | 2023-08-18 | 西安电子科技大学 | A lightweight bill OCR recognition method and system
CN116721425A (en) * | 2023-05-29 | 2023-09-08 | 北京啄木鸟云健康科技有限公司 | Certificate information input method and device, electronic equipment and storage medium
CN117576717A (en) * | 2023-11-15 | 2024-02-20 | 希维科技(广州)有限公司 | Recognition methods, equipment and storage media for engineering drawings
CN117727059A (en) * | 2024-02-18 | 2024-03-19 | 蓝色火焰科技成都有限公司 | Method and device for checking automobile financial invoice information, electronic equipment and storage medium
CN119130390A (en) * | 2024-11-13 | 2024-12-13 | 中博信息技术研究院有限公司 | A financial reimbursement process optimization method based on intelligent reimbursement cloud platform

Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106934632A (en) * | 2015-12-30 | 2017-07-07 | 远光软件股份有限公司 | Invoice verification method and invoice true check system
CN111612003A (en) * | 2019-02-22 | 2020-09-01 | 北京京东尚科信息技术有限公司 | Method and device for extracting text in picture
CN111931664A (en) * | 2020-08-12 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Mixed note image processing method and device, computer equipment and storage medium
CN113158895A (en) * | 2021-04-20 | 2021-07-23 | 北京中科江南信息技术股份有限公司 | Bill identification method and device, electronic equipment and storage medium
CN113449623A (en) * | 2021-06-21 | 2021-09-28 | 浙江康旭科技有限公司 | Light living body detection method based on deep learning
WO2022057471A1 (en) * | 2020-09-17 | 2022-03-24 | 深圳壹账通智能科技有限公司 | Bill identification method, system, computer device, and computer-readable storage medium
CN114550189A (en) * | 2021-12-23 | 2022-05-27 | 上海浦东发展银行股份有限公司 | Bill recognition method, device, equipment, computer storage medium and program product

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106934632A (en) * | 2015-12-30 | 2017-07-07 | 远光软件股份有限公司 | Invoice verification method and invoice true check system
CN111612003A (en) * | 2019-02-22 | 2020-09-01 | 北京京东尚科信息技术有限公司 | Method and device for extracting text in picture
CN111931664A (en) * | 2020-08-12 | 2020-11-13 | 腾讯科技(深圳)有限公司 | Mixed note image processing method and device, computer equipment and storage medium
WO2022057471A1 (en) * | 2020-09-17 | 2022-03-24 | 深圳壹账通智能科技有限公司 | Bill identification method, system, computer device, and computer-readable storage medium
CN113158895A (en) * | 2021-04-20 | 2021-07-23 | 北京中科江南信息技术股份有限公司 | Bill identification method and device, electronic equipment and storage medium
CN113449623A (en) * | 2021-06-21 | 2021-09-28 | 浙江康旭科技有限公司 | Light living body detection method based on deep learning
CN114550189A (en) * | 2021-12-23 | 2022-05-27 | 上海浦东发展银行股份有限公司 | Bill recognition method, device, equipment, computer storage medium and program product

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116434245A (en) * | 2023-04-07 | 2023-07-14 | 中国平安财产保险股份有限公司 | Multi-mode document classification method, system, computer equipment and storage medium
CN116612479A (en) * | 2023-05-04 | 2023-08-18 | 西安电子科技大学 | A lightweight bill OCR recognition method and system
CN116721425A (en) * | 2023-05-29 | 2023-09-08 | 北京啄木鸟云健康科技有限公司 | Certificate information input method and device, electronic equipment and storage medium
CN117576717A (en) * | 2023-11-15 | 2024-02-20 | 希维科技(广州)有限公司 | Recognition methods, equipment and storage media for engineering drawings
CN117727059A (en) * | 2024-02-18 | 2024-03-19 | 蓝色火焰科技成都有限公司 | Method and device for checking automobile financial invoice information, electronic equipment and storage medium
CN117727059B (en) * | 2024-02-18 | 2024-05-03 | 蓝色火焰科技成都有限公司 | Method and device for checking automobile financial invoice information, electronic equipment and storage medium
CN119130390A (en) * | 2024-11-13 | 2024-12-13 | 中博信息技术研究院有限公司 | A financial reimbursement process optimization method based on intelligent reimbursement cloud platform

Similar Documents

Publication | Publication Date | Title
CN112686812B (en) | Bank card tilt correction detection method, device, readable storage medium and terminal
CN115376149A (en) | A method for identifying reimbursement invoices
CN111325203B (en) | An American license plate recognition method and system based on image correction
CN111626146B (en) | Merging cell table segmentation recognition method based on template matching
CN109658584B (en) | Bill information identification method and device
CN105046252B (en) | A kind of RMB prefix code recognition methods
CN101561866B (en) | Character recognition method based on SIFT feature and gray scale difference value histogram feature
CN109086714A (en) | Table recognition method, identifying system and computer installation
CN110796186A (en) | Dry and wet garbage identification and classification method based on improved YOLOv3 network
CN113780087B (en) | A postal package text detection method and device based on deep learning
CN111738055B (en) | Multi-category text detection system and bill form detection method based on the system
CN115331245A (en) | A table structure recognition method based on image instance segmentation
CN112949455B (en) | Value-added tax invoice recognition system and method
CN105184225B (en) | A kind of multinational banknote image recognition methods and device
CN113688821A (en) | OCR character recognition method based on deep learning
CN112307919A (en) | A method for identifying digital information regions in document images based on improved YOLOv3
CN118968537B (en) | Bill scene recognition method, device, equipment and storage medium
CN113792780B (en) | Container number recognition method based on deep learning and image post-processing
CN114927236A (en) | Detection method and system for multiple target images
CN116030266A (en) | Pavement crack detection and classification method in natural scenes based on improved YOLOv3
CN116152824A (en) | Invoice information extraction method and system
CN115100714A (en) | Method, device and server for living body detection based on face image
CN106709474A (en) | Handwritten telephone number identification, verification and information sending system
CN111652117B (en) | A method and medium for image segmentation of multiple documents
CN113191195A (en) | Face detection method and system based on deep learning

Legal Events

Date | Code | Title | Description
PB01 | Publication
PB01 | Publication
SE01 | Entry into force of request for substantive examination
SE01 | Entry into force of request for substantive examination
