CN111583502B - Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network - Google Patents

Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network

Info

Publication number
CN111583502B
Authority
CN
China
Prior art keywords
layer
image
neural network
model
rmb
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202010381442.0A
Other languages
Chinese (zh)
Other versions
CN111583502A (en)
Inventor
田莹
王澧冰
董惠文
汪洋
崔龙磊
苗丰泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Liaoning USTL
Original Assignee
University of Science and Technology Liaoning USTL
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Liaoning USTL
Priority to CN202010381442.0A
Publication of CN111583502A
Application granted
Publication of CN111583502B
Expired - Fee Related
Anticipated expiration

Abstract

The invention relates to the technical field of banknote identification, in particular to a multi-label method for identifying RMB serial numbers (crown word numbers) based on a deep convolutional neural network. The banknote image is preprocessed; the serial number is first roughly located using prior knowledge and then precisely located to obtain an RMB serial-number image; all serial-number images are scaled to a preset common size; image features are extracted with a deep convolutional neural network, the model is trained to produce a prediction vector, and the model is saved once it reaches a given accuracy. In the prediction stage, the image is fed into the deep convolutional neural network to extract image features; the feature map is flattened and passed through a fully connected layer to obtain a prediction vector; a Sigmoid operation is applied to the prediction vector; the resulting vector is split into ten segments, the maximum of each segment is found and mapped to the corresponding label vector, giving the final classification result. Compared with traditional identification methods, the method is fast, stable, and highly accurate.

Description

Translated from Chinese
Multi-label identification method for RMB serial numbers based on a deep convolutional neural network

TECHNICAL FIELD

The present invention relates to the technical field of banknote identification, and in particular to a multi-label method for identifying RMB serial numbers based on a deep convolutional neural network.

BACKGROUND ART

The serial number of a banknote records its issuance sequence and serves both to control the quantity of banknotes issued and to deter counterfeiting. The serial number can be regarded as the ID card of each banknote: banks and self-service financial equipment can record the serial numbers of incoming and outgoing banknotes, which facilitates evidence collection and tracking of banknote circulation. Self-service equipment such as automatic teller machines and deposit-withdrawal machines can also verify the authenticity of banknotes from their serial numbers. Accurately recognizing banknote serial numbers is therefore very important.

Existing methods for banknote serial-number recognition include transferring the banknote image to a host computer over USB for processing, where the limited USB transfer speed leads to poor real-time performance, and recognition on a DSP platform, where inefficient methods are used for edge finding, orientation recognition, localization and segmentation of the serial-number region, and serial-number recognition, resulting in poor recognition performance and weak software robustness. For example, edge finding on the banknote image is performed without outlier removal, so the detected banknote edges are inaccurate, which degrades serial-number localization and recognition. As another example, orientation recognition uses coarse grid features, which severely reduces the efficiency of the program.

The main drawbacks of these methods are low efficiency, poor recognition performance, and a low serial-number recognition rate.

SUMMARY OF THE INVENTION

To overcome the shortcomings of the prior art, the present invention provides a multi-label method for identifying RMB serial numbers based on a deep convolutional neural network which, compared with traditional identification methods, is fast, stable, and highly accurate.

The multi-label RMB serial-number identification method based on a deep convolutional neural network comprises the following steps:

1) Preprocess the banknote image, including improving the lighting intensity, extracting the serial-number image, and registering the serial-number image;

2) First locate the serial number roughly using prior knowledge, then locate it precisely to obtain the RMB serial-number image;

3) Scale all RMB serial-number images to the same preset size;

4) Extract image features with a deep convolutional neural network, train the model to obtain a prediction vector, and save the model once it reaches a given accuracy;

5) In the prediction stage, pass the image through the deep convolutional neural network to extract image features;

6) Flatten the feature map and feed it into the fully connected layer to obtain the prediction vector;

7) Apply a Sigmoid operation to the prediction vector;

8) Split the Sigmoid-transformed prediction vector into ten segments, find the maximum value in each segment, and map it to the corresponding label vector to obtain the final classification result.

Step 1 specifically includes: applying a top-hat transform on top of grayscaling to improve the binarization of the banknote image; extracting the rectangular region containing the banknote to remove irrelevant background; and registering the image with a homography matrix to correct tilt and eliminate perspective distortion.

The first image-preprocessing step specifically includes:

1) Establish the correspondence between the four corner coordinates of the banknote image before registration and those of the registered image;

2) Compute the homography matrix from this coordinate correspondence;

3) Use the homography matrix to find, for each point of the registered banknote image, the corresponding point in the banknote image before registration;

4) Assign values to the registered banknote image by bilinear interpolation (a code sketch of this registration procedure follows).
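A minimal sketch of steps 1) to 4), assuming OpenCV is available; the output size and the corner-ordering convention are illustrative assumptions, not values given in the patent:

```python
import cv2
import numpy as np

def register_banknote(image, corners, out_size=(400, 192)):
    """Warp a tilted banknote photo to a canonical top-down view.

    image    : BGR banknote photo as a numpy array
    corners  : 4x2 array of corner coordinates in the photo, ordered
               top-left, top-right, bottom-right, bottom-left (assumed order)
    out_size : (width, height) of the registered output image (assumed)
    """
    w, h = out_size
    src = np.asarray(corners, dtype=np.float32)
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                   dtype=np.float32)
    # Homography H mapping the source corners onto the canonical rectangle
    H, _ = cv2.findHomography(src, dst)
    # Back-map every output pixel to the source and sample bilinearly
    registered = cv2.warpPerspective(image, H, (w, h), flags=cv2.INTER_LINEAR)
    return H, registered
```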

Step 2 specifically includes: using prior knowledge to roughly locate a rectangular region in the left quarter and lower third of the registered image; then applying precise localization based on block binarization, that is, splitting the rough localization image into a left block and a right block, binarizing each block with its own global threshold, and recombining them for scan-based localization (see the sketch below).
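A minimal sketch of this rough-then-precise localization; Otsu's method is assumed for the per-block global threshold and dark characters are assumed to become foreground, neither of which the patent specifies:

```python
import cv2
import numpy as np

def locate_serial_number(registered_gray):
    """Rough localization by prior knowledge, then block binarization.

    registered_gray : grayscale, registered (top-down) banknote image
    Returns the cropped binary serial-number region.
    """
    h, w = registered_gray.shape[:2]
    # Rough localization: the serial number lies in the left 1/4, lower 1/3
    roi = registered_gray[h - h // 3:, : w // 4]
    # Block binarization: threshold left and right halves separately
    mid = roi.shape[1] // 2
    _, left_bw = cv2.threshold(roi[:, :mid], 0, 255,
                               cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    _, right_bw = cv2.threshold(roi[:, mid:], 0, 255,
                                cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    binary = np.hstack([left_bw, right_bw])
    # Scan rows and columns of the recombined map to trim to the characters
    rows = np.where(binary.max(axis=1) > 0)[0]
    cols = np.where(binary.max(axis=0) > 0)[0]
    if rows.size == 0 or cols.size == 0:
        return binary
    return binary[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
```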

Step 4 specifically includes: in the training stage, feeding the normalized binary serial-number image into the deep convolutional neural network and obtaining a feature vector through self-training of the model; flattening the feature map and feeding it into the fully connected layer to obtain the prediction vector; and training the prediction vector against the label vector with the Sigmoid cross-entropy function to obtain the final model (see the sketch below).
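A minimal training-step sketch, assuming PyTorch; the stand-in network, the 36-character alphabet per position, and the Adam optimizer are assumptions, since the patent only states that a Sigmoid cross-entropy loss is applied to a ten-position label vector:

```python
import torch
import torch.nn as nn

# Stand-in network: flatten + linear to a (10 positions x 36 classes) logit
# vector; the real feature extractor is the deep CNN described below, and the
# 36-class alphabet (0-9, A-Z) is an assumption, not stated in the patent.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 64, 10 * 36))
criterion = nn.BCEWithLogitsLoss()                     # Sigmoid cross-entropy
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(images, label_vectors):
    """images: float tensor (N, 3, 128, 64); label_vectors: 0/1 tensor (N, 360)."""
    optimizer.zero_grad()
    logits = model(images)                             # prediction vector (pre-Sigmoid)
    loss = criterion(logits, label_vectors)            # Sigmoid + cross-entropy in one op
    loss.backward()
    optimizer.step()
    return loss.item()
```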

Step 5 specifically includes: in the prediction stage, feeding the image into the saved deep convolutional neural network model to extract features.

The structure of the deep convolutional neural network model is as follows:

First there are 4 layers: a convolutional layer, a batch normalization layer, an activation layer, and a max-pooling layer;

The input image size is (128, 64, 3); the convolution kernel of the convolutional layer is 7x7, the kernel depth is 64, and the convolution stride is 2;

The batch normalization layer normalizes its input without changing its size;

The activation layer adds nonlinearity to the network without changing the input size;

The max-pooling layer uses a 3x3 window with stride 2; it reduces the size of the model, speeds up computation, and improves the robustness of the extracted features;

Next comes the bottleneck module, which contains nine layers: the first is a convolutional layer, the second a batch normalization layer, the third an activation layer, the fourth a padding layer, the fifth a convolutional layer, the sixth a batch normalization layer, the seventh an activation layer, the eighth a convolutional layer, and the ninth a batch normalization layer;

A total of 16 bottleneck modules are stacked;

Next is the shortcut residual block: within the bottleneck module, a weighted connection spans three layers from the first to the third, which effectively alleviates the gradient-vanishing problem in deep networks; the weight of the shortcut path is set to 1;

Then comes a global average pooling layer, which reduces the number of parameters and thereby mitigates model overfitting;

Finally comes the fully connected layer, which outputs highly refined features that are passed to the final classifier for classification (a code sketch of this architecture is given below).
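A minimal PyTorch sketch of the architecture described above; the bottleneck channel widths, the padding choices, and the 36-class-per-position output head are assumptions made only so the sketch runs end to end:

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Nine-layer bottleneck: conv-BN-ReLU, padding, conv-BN-ReLU, conv-BN,
    with an identity shortcut (weight fixed to 1) added to the branch output."""
    def __init__(self, channels, mid):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),   # 1x1 reduce
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.ZeroPad2d(1),                                       # padding layer
            nn.Conv2d(mid, mid, kernel_size=3, bias=False),        # 3x3 conv
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1, bias=False),   # 1x1 expand
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.branch(x) + x)        # shortcut weight = 1

class SerialNumberNet(nn.Module):
    """Stem (7x7/64 conv, BN, ReLU, 3x3 max-pool), 16 stacked bottlenecks,
    global average pooling, and a fully connected multi-label head."""
    def __init__(self, num_positions=10, num_classes=36):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )
        self.blocks = nn.Sequential(*[Bottleneck(64, 16) for _ in range(16)])
        self.pool = nn.AdaptiveAvgPool2d(1)         # global average pooling
        self.fc = nn.Linear(64, num_positions * num_classes)

    def forward(self, x):                           # x: (N, 3, 128, 64)
        x = self.blocks(self.stem(x))
        x = self.pool(x).flatten(1)
        return self.fc(x)                           # logits, before Sigmoid
```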

Step 6 specifically includes: flattening the feature map and feeding it into the fully connected layer to obtain the prediction vector.

Step 7 specifically includes: passing the prediction vector through the Sigmoid function used in the Sigmoid cross-entropy loss to obtain a prediction vector with values in the range 0 to 1.

Step 8 specifically includes: since the serial-number image always contains ten characters, the vector is split into 10 segments; the maximum value in each segment is found and mapped onto the corresponding label vector, and the prediction result is then output (see the decoding sketch below).
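A minimal decoding sketch for step 8; the per-position alphabet is an assumption, since the patent only states that the vector is cut into ten segments and the maximum of each segment is mapped to a label:

```python
import numpy as np

def decode_prediction(pred, alphabet="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Decode a Sigmoid-activated prediction vector into a 10-character string.

    pred     : 1-D array of length 10 * len(alphabet), values in [0, 1]
    alphabet : assumed per-position class labels (not specified in the patent)
    """
    pred = np.asarray(pred).reshape(10, len(alphabet))   # ten segments
    indices = pred.argmax(axis=1)                        # maximum of each segment
    return "".join(alphabet[i] for i in indices)

# Usage: probs = 1.0 / (1.0 + np.exp(-logits)); text = decode_prediction(probs)
```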

Compared with the prior art, the beneficial effects of the present invention are:

1. The invention adopts a banknote image registration method based on a homography matrix, so that input banknote images taken at different angles, against different backgrounds, and under different lighting intensities and resolutions can all be output as a uniform top-down view of the banknote.

2. The invention uses the image texture features of RMB banknotes together with the preprocessed, registered banknote image to quickly decide whether the banknote shows its front or back side, and thus quickly locates the RMB serial number.

3. The invention no longer requires the tedious character-segmentation step on the RMB serial-number image, which greatly improves recognition efficiency and brings the character recognition accuracy to 99.84%.

DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of the RMB serial-number recognition system of the present invention;

Figure 2 shows the homography-based banknote image registration process of the present invention;

Figure 3 shows the structure of the deep convolutional neural network of the present invention;

Figure 4 shows an example of an extracted RMB serial-number image according to the present invention;

Figure 5 shows the structure of the bottleneck module of the present invention;

Figure 6 illustrates the convolution computation of the present invention;

Figure 7 illustrates the pooling computation of the present invention;

Figure 8 shows the structure of the shortcut residual block of the present invention.

DETAILED DESCRIPTION

The present invention discloses a multi-label method for identifying RMB serial numbers based on a deep convolutional neural network. Those skilled in the art can draw on the content herein and adjust the parameters appropriately in their own implementation. It should be particularly noted that all similar substitutions and modifications will be obvious to those skilled in the art and are all deemed to be included in the present invention. The method and its applications have been described through preferred embodiments, and those concerned can clearly modify, appropriately vary, or combine the methods and applications described herein, without departing from the content, spirit, and scope of the present invention, in order to realize and apply the technology of the invention.

Embodiment:

As shown in Figures 1 to 8, a multi-label RMB serial-number identification method based on a deep convolutional neural network comprises the following steps. First, the banknote image is preprocessed, including correcting severe exposure, extracting the banknote image, and registering it. The rectangular region containing the banknote is extracted to remove irrelevant background; the image is registered with a homography matrix to correct tilt and eliminate perspective distortion; whether the banknote is upside down is judged from the left-right distribution of pixels in the binary banknote image, and the front or back side is then judged from the color tone of the lower-left region. This preprocessing algorithm adapts well to the subsequent serial-number localization and recognition and places few constraints on the input banknote image: at any angle, lighting intensity, or resolution, as long as the banknote is visually clear and identifiable, an upright banknote image can be output.

A two-step approach is used to locate the serial number: the first step locates it roughly using prior knowledge, and the second step locates it precisely.

Finally, a deep convolutional neural network is used to train on and predict from the images, achieving a high recognition rate.

To output a uniform top-down view of the banknote, a banknote image registration method based on a homography matrix is used to correct tilt and eliminate perspective distortion. Let the banknote image before registration be A; registration yields the registered image B, where B = HA and H is the homography matrix.

The banknote image registration proceeds as follows:

Establish the correspondence between the four corner coordinates of the original image and those of the registered image;

Compute the homography matrix H from the coordinate correspondence;

Use H to find the point in A corresponding to each point in B;

Assign values to B by bilinear interpolation.

In addition, based on the image texture of RMB banknotes, whether the banknote is upside down is judged from the distribution of pixel values in the left and right halves of its binarized image, and the front or back side is then judged from the color tone of the lower-left region (see the sketch below).
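A heuristic sketch of this orientation check; the comparison direction and the hue band for the front side are illustrative assumptions, since the patent does not give concrete thresholds:

```python
import cv2
import numpy as np

def orient_banknote(registered_bgr, binary):
    """Judge upside-down status from left/right foreground mass, then judge
    front/back from the hue of the lower-left region (assumed heuristics).

    registered_bgr : registered colour banknote image (BGR)
    binary         : its binarized counterpart (0/255)
    Returns (flipped, is_front).
    """
    h, w = binary.shape[:2]
    # Upside-down test: compare foreground mass in the left and right halves
    left_mass = int(np.count_nonzero(binary[:, : w // 2]))
    right_mass = int(np.count_nonzero(binary[:, w // 2:]))
    flipped = left_mass > right_mass              # assumed convention

    # Front/back test: mean hue of the lower-left region
    hsv = cv2.cvtColor(registered_bgr, cv2.COLOR_BGR2HSV)
    lower_left_hue = hsv[2 * h // 3:, : w // 4, 0]
    is_front = lower_left_hue.mean() < 30         # assumed "warm tone" band
    return flipped, is_front
```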

The structure of the deep convolutional neural network model is as follows:

First there are 4 layers: a convolutional layer, a batch normalization layer, an activation layer, and a max-pooling layer.

The input image size is (128, 64, 3). A convolutional layer is obtained by extracting features of the input image with a convolution operation; the kernel size is 7x7, the kernel depth is 64, and the stride is 2. The convolutional layer consists of 32 feature maps, for a total of (7x7+1)x32 = 1600 parameters.

The batch normalization layer normalizes its input without changing its size.

The activation layer increases the nonlinearity of the model, likewise without changing the input size.

The max-pooling layer uses a 3x3 window with stride 2; it reduces the size of the model, speeds up computation, and improves the robustness of the extracted features. After the max-pooling operation, a 64x32x32 feature map is obtained.

Next is the bottleneck module, which contains nine layers: the first is a convolutional layer, the second a batch normalization layer, the third an activation layer, the fourth a padding layer, the fifth a convolutional layer, the sixth a batch normalization layer, the seventh an activation layer, the eighth a convolutional layer, and the ninth a batch normalization layer.

A total of 16 bottleneck modules are stacked.

Then comes the shortcut residual block: within the bottleneck module, a weighted connection spans three layers from the first to the third, which effectively alleviates the gradient-vanishing problem in deep networks. The weight of the shortcut path is set to 1.

Then comes a global average pooling layer, which reduces the number of parameters and thereby mitigates model overfitting.

Finally comes the fully connected layer, which outputs highly refined features for the final classifier to classify.

Convolution operation: in a convolutional neural network, the two most common computations are the convolution operation and the pooling operation. Convolution is essentially an integral operation that describes the input-output relationship of a linear time-invariant system: the output is obtained by convolving the input with a function characterizing the system. When F(n) is a signal of finite length N and S(n) is a signal of finite length M, the main way to compute the convolution F(n)*S(n) is direct computation from the definition:

(F*S)(n) = Σ_m F(m)·S(n-m)

If F(n) and S(n) are both real-valued signals, MN multiplications are required, so the complexity of computing the convolution is O(M*N) (a direct-computation sketch follows).
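A minimal sketch of the direct O(N*M) computation described above, using plain NumPy:

```python
import numpy as np

def direct_convolution(F, S):
    """Direct computation of the discrete convolution (F*S)(n) from its
    definition; requires len(F) * len(S) multiplications."""
    F, S = np.asarray(F, dtype=float), np.asarray(S, dtype=float)
    N, M = len(F), len(S)
    out = np.zeros(N + M - 1)
    for n in range(N + M - 1):
        for m in range(N):
            k = n - m
            if 0 <= k < M:
                out[n] += F[m] * S[k]     # one multiplication per valid pair
    return out

# Sanity check against NumPy's built-in implementation:
# np.allclose(direct_convolution([1, 2, 3], [0, 1]), np.convolve([1, 2, 3], [0, 1]))
```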

Image convolution process: a trainable convolution kernel is convolved with the input image, and a bias b is then added to obtain the convolutional feature layer, i.e. f(xW_ij + b), where f is the ReLU function.

Pooling operation: after the convolution operation, the most basic image features have been extracted. In principle these features could be used directly for classification, but they do not necessarily represent abstract concepts, and their volume is large, which easily leads to overfitting. A higher-level abstraction is therefore applied, namely the pooling operation, i.e. secondary feature extraction; this allows more feature information to be captured while reducing computational complexity.

During the pooling operation, a matrix window is scanned across the tensor, and the elements within each window are reduced by taking, for example, their maximum or average value. If contiguous pixels of the image are chosen as the pooling region, and only features produced by the same hidden units are pooled, then the pooling units are translation invariant, and translation invariance is very important for recognition (see the sketch below).
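A minimal max-pooling sketch with the 3x3 window and stride 2 used in the model above:

```python
import numpy as np

def max_pool2d(feature_map, window=3, stride=2):
    """Scan a window across a 2-D feature map and keep each window's maximum."""
    h, w = feature_map.shape
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = feature_map[i * stride:i * stride + window,
                                j * stride:j * stride + window]
            out[i, j] = patch.max()       # replace the window by its maximum
    return out
```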

Bottleneck module: the bottleneck module contains three convolutional layers. The first is a 1x1 convolution that reduces the number of channels, so that the second, 3x3 convolutional layer performs its computation on an input of relatively low dimensionality, improving computational efficiency. The third, 1x1 convolutional layer restores the channel dimension. The remaining layers mainly serve to regularize and optimize the model.

Shortcut residual block: as network depth increases, convolutional neural networks perform better, but training becomes more difficult. This is mainly because, during network training based on stochastic gradient descent, back-propagating the error signal through many layers easily causes vanishing gradients. Special weight-initialization strategies, batch normalization, and similar techniques can alleviate this, but once the model converges, the training error rises rather than falls as the network becomes deeper. The shortcut residual block solves this training difficulty caused by network depth.

The residual computation is: y = F(x, w_f)*T(x, w_t) + x*C(x, w_c). A traditional convolutional neural network has no T and C terms. T(x, w_t) is a nonlinear transformation, called the transform gate, which controls the strength of the transformation. C(x, w_c) is also a nonlinear transformation, called the carry gate, which controls how strongly the original input signal is retained; in other words, y is a weighted combination of F and x, with T and C controlling the two corresponding weights, where T + C = 1. The residual block makes training the model easier than training the original mapping (see the sketch below).
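A minimal sketch of this gated combination, assuming PyTorch; the 1x1-convolution-plus-Sigmoid parameterization of the transform gate is an assumption, since the patent does not specify how T and C are computed (and in the network above the shortcut weight is simply fixed to 1):

```python
import torch
import torch.nn as nn

class GatedResidual(nn.Module):
    """y = F(x)*T(x) + x*C(x) with C = 1 - T, as in the formula above.
    `branch` plays the role of F(x, w_f); its output must match x's shape."""
    def __init__(self, branch, channels):
        super().__init__()
        self.branch = branch                        # F(x, w_f)
        self.gate = nn.Sequential(                  # T(x, w_t); assumed form
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        t = self.gate(x)                            # transform gate T
        return self.branch(x) * t + x * (1.0 - t)   # carry gate C = 1 - T
```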

The invention adopts a banknote image registration method based on a homography matrix, so that input banknote images taken at different angles, against different backgrounds, and under different lighting intensities and resolutions can all be output as a uniform top-down view of the banknote. The invention uses the image texture features of RMB banknotes together with the preprocessed, registered banknote image to quickly decide whether a banknote shows its front or back side, and thereby quickly locates the RMB serial number. The invention no longer requires the tedious character-segmentation step on the RMB serial-number image, which greatly improves recognition efficiency and brings the character recognition accuracy to 99.84%.

The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification of the technical solution and inventive concept of the present invention, made by a person skilled in the art within the technical scope disclosed herein, shall be covered by the scope of protection of the present invention.

Claims (3)

1. The RMB crown word number multi-label identification method based on the deep convolutional neural network is characterized by comprising the following steps:
1) preprocessing the banknote image, including improving brightness intensity, extracting a crown word number image and registering the crown word number image;
2) firstly, roughly positioning by using priori knowledge, and then accurately positioning the serial number to obtain a RMB serial number image;
3) scaling all RMB crown word number images to the same preset size;
4) extracting image features by using a deep convolutional neural network, training a model to obtain a prediction vector, and storing the model when the model reaches a certain accuracy rate;
in the training stage, inputting the normalized binary crown word number image into a deep convolution neural network to obtain a characteristic vector through self-training of a model; stretching and inputting the feature map into a full-connection layer to obtain a prediction vector; training the prediction vector and the label vector through a Sigmoid cross entropy function to obtain a final model;
5) in the prediction stage, the image is input into a stored deep convolution neural network model to extract image characteristics;
the deep convolutional neural network model structure is as follows:
firstly, 4 layers are provided, namely a convolution layer, a batch regularization layer, an activation layer and a maximum pooling layer;
the input image size is (128,64,3), where the convolution kernel size of the convolution layer is 7x7, the depth of the convolution kernel is 64, and the convolution step size is 2;
the batch regularization layer normalizes the input without changing the size of the input;
the activation layer increases the nonlinearity of the neural network without changing the size of the input;
the sampling layer in the maximum pooling layer is 3x3, the step length is 2, the maximum pooling layer is used for reducing the size of the model, the calculation speed is increased, and meanwhile the robustness of the extracted features is improved;
then, a bottleneck module is provided, the bottleneck module comprises nine layers, the first layer is a convolution layer, the second layer is a batch regularization layer, the third layer is an activation layer, the fourth layer is a padding layer, the fifth layer is a convolution layer, the sixth layer is a batch regularization layer, the seventh layer is an activation layer, the eighth layer is a convolution layer, and the ninth layer is a batch regularization layer;
a total of 16 bottleneck modules are stacked;
the following is a shortcut residual block, whose calculation formula is: y = F(x, w_f)*T(x, w_t) + x*C(x, w_c),
T(x, w_t) is a nonlinear transformation, called a transform gate, responsible for controlling the strength of the transformation; C(x, w_c) is also a nonlinear transformation, called a carry gate, responsible for controlling the retained strength of the original input signal; y is a weighted combination of F and x, and T and C respectively control the two corresponding weights, wherein T + C = 1; the problem of difficult training caused by network depth is well solved by the shortcut residual block;
in the bottleneck module, the first layer is connected to the third layer by a weighted connection spanning three layers, which effectively solves the problem of gradient divergence in a deep network, and the weight in the shortcut channel is set to 1;
then a global mean pooling layer is carried out, wherein the global mean pooling layer is used for reducing the number of parameters, namely reducing the occurrence of model overfitting;
finally, the full connection layer outputs highly purified characteristics for being sent to a final classifier for classification;
6) performing Sigmoid operation on the prediction vector, and taking the prediction vector as a Sigmoid cross entropy function to obtain the prediction vector with the value range of 0-1;
7) and dividing the prediction vector after Sigmoid operation into ten items, finding the maximum value from each item, and mapping the maximum value to the corresponding label vector to obtain the final classification result.
2. The method for identifying the Renminbi (RMB) crown word number multiple tags based on the deep convolutional neural network as claimed in claim 1, wherein the step 1 specifically comprises: improving the binarization of the banknote image by combining a top-hat transform with grayscaling; extracting the rectangular area where the banknote image is located to remove irrelevant background information; and registering the images with a homography matrix to correct for tilt and eliminate perspective effects;
the first image preprocessing specifically includes:
1) establishing a corresponding relation between four vertex angle coordinates of the banknote image before registration and the registered image;
2) solving a homography matrix according to the coordinate correspondence;
3) solving the corresponding points of the banknote image before registration in the registered banknote image by using a homography matrix;
4) and assigning the matched paper money image by a bilinear interpolation method.
3. The method for identifying the Renminbi number with multiple tags based on the deep convolutional neural network as claimed in claim 1, wherein the step 2 specifically comprises: roughly locating, with a priori knowledge, a rectangular region in the left 1/4 and lower 1/3 of the registered image; and adopting accurate positioning based on block binarization, namely dividing the approximate positioning map into a left block and a right block, binarizing each with a global threshold value, and then splicing the two blocks for scanning positioning.
CN202010381442.0A | 2020-05-08 | 2020-05-08 | Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network | Expired - Fee Related | CN111583502B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010381442.0A (CN111583502B) | 2020-05-08 | 2020-05-08 | Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010381442.0A (CN111583502B) | 2020-05-08 | 2020-05-08 | Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network

Publications (2)

Publication Number | Publication Date
CN111583502A (en) | 2020-08-25
CN111583502B | 2022-06-03

Family

ID=72113338

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010381442.0A (CN111583502B, Expired - Fee Related) | 2020-05-08 | 2020-05-08 | Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network

Country Status (1)

Country | Link
CN | CN111583502B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113033635B* | 2021-03-12 | 2024-05-14 | 中钞长城金融设备控股有限公司 | Method and device for detecting invisible graphics context of coin
CN113240643B* | 2021-05-14 | 2024-11-19 | 广州广电运通金融电子股份有限公司 | A banknote quality detection method, system and medium based on multispectral image
CN114202759B* | 2021-12-10 | 2025-09-12 | 江苏国光信息产业股份有限公司 | Multi-currency banknote serial number recognition method and device based on deep learning
CN116129448A* | 2022-11-29 | 2023-05-16 | 中国银行股份有限公司 | Commemorative coin identification method and device
CN115984866B* | 2022-12-03 | 2025-05-23 | 广州大有网络科技有限公司 | A calligraphy stroke intelligent extraction system and method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105957238A* | 2016-05-20 | 2016-09-21 | 聚龙股份有限公司 | Banknote management method and system
CN108320374A* | 2018-02-08 | 2018-07-24 | 中南大学 | A kind of multinational paper money number character identifying method based on finger image
CN111046866A* | 2019-12-13 | 2020-04-21 | 哈尔滨工程大学 | Method for detecting RMB crown word number region by combining CTPN and SVM

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP3321267B2* | 1993-10-21 | 2002-09-03 | グローリー工業株式会社 | Mask Optimization Method Using Genetic Algorithm in Pattern Recognition
US7089218B1* | 2004-01-06 | 2006-08-08 | Neuric Technologies, Llc | Method for inclusion of psychological temperament in an electronic emulation of the human brain
CN102024262B* | 2011-01-06 | 2012-07-04 | 西安电子科技大学 | Method for performing image segmentation by using manifold spectral clustering
CN102800148B* | 2012-07-10 | 2014-03-26 | 中山大学 | RMB sequence number identification method
CN104036271B* | 2014-06-11 | 2017-08-25 | 新达通科技股份有限公司 | The recognition methods of a kind of character and paper money number and device and ATM
CN106056751B* | 2016-05-20 | 2019-04-12 | 聚龙股份有限公司 | The recognition methods and system of serial number
WO2018168521A1* | 2017-03-14 | 2018-09-20 | Omron Corporation | Learning result identifying apparatus, learning result identifying method, and program therefor
CN107025716B* | 2017-06-05 | 2019-12-10 | 深圳怡化电脑股份有限公司 | Method and device for detecting contamination of paper money crown word number
CN107358575A* | 2017-06-08 | 2017-11-17 | 清华大学 | A kind of single image super resolution ratio reconstruction method based on depth residual error network
CN108427921A* | 2018-02-28 | 2018-08-21 | 辽宁科技大学 | A kind of face identification method based on convolutional neural networks
CN108734142A* | 2018-05-28 | 2018-11-02 | 西南交通大学 | A kind of core in-pile component surface roughness appraisal procedure based on convolutional neural networks
CN110033021B* | 2019-03-07 | 2021-04-06 | 华中科技大学 | Fault classification method based on one-dimensional multipath convolutional neural network
CN110276881A* | 2019-05-10 | 2019-09-24 | 广东工业大学 | A Banknote Serial Number Recognition Method Based on Convolutional Recurrent Neural Network
CN110517277B* | 2019-08-05 | 2022-12-06 | 西安电子科技大学 | SAR image segmentation method based on PCANet and high-order CRF

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105957238A* | 2016-05-20 | 2016-09-21 | 聚龙股份有限公司 | Banknote management method and system
CN108320374A* | 2018-02-08 | 2018-07-24 | 中南大学 | A kind of multinational paper money number character identifying method based on finger image
CN111046866A* | 2019-12-13 | 2020-04-21 | 哈尔滨工程大学 | Method for detecting RMB crown word number region by combining CTPN and SVM

Also Published As

Publication number | Publication date
CN111583502A (en) | 2020-08-25

Similar Documents

Publication | Title
CN111583502B (en) | Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network
CN111401372B (en) | A method for extracting and identifying image and text information from scanned documents
Doush et al. | Currency recognition using a smartphone: Comparison between color SIFT and gray scale SIFT algorithms
KR102207533B1 (en) | Bill management method and system
CN110298376B (en) | An Image Classification Method of Bank Notes Based on Improved B-CNN
CN109522908A (en) | Image significance detection method based on area label fusion
CN112381775A (en) | Image tampering detection method, terminal device and storage medium
CN118552973B (en) | Bill identification method, device, equipment and storage medium
CN103345631B (en) | Image characteristics extraction, training, detection method and module, device, system
CN106683073B (en) | A license plate detection method, camera and server
CN110569782A (en) | A target detection method based on deep learning
CN108197644A (en) | A kind of image-recognizing method and device
CN109977899B (en) | A method and system for training, reasoning and adding new categories of item recognition
CN110210297B (en) | Method for locating and extracting Chinese characters in customs clearance image
CN109740572A (en) | A face detection method based on local color texture features
CN110046617A (en) | A kind of digital electric meter reading self-adaptive identification method based on deep learning
CN113688821A (en) | OCR character recognition method based on deep learning
CN112560858B (en) | Character and picture detection and rapid matching method combining lightweight network and personalized feature extraction
CN112365451A (en) | Method, device and equipment for determining image quality grade and computer readable medium
CN110852311A (en) | Three-dimensional human hand key point positioning method and device
Lv et al. | Research on plant leaf recognition method based on multi-feature fusion in different partition blocks
CN114581928B (en) | A table recognition method and system
CN119338827A (en) | Surface detection method and system for precision fasteners
CN114444565B (en) | Image tampering detection method, terminal equipment and storage medium
CN110991201B (en) | Bar code detection method and related device

Legal Events

Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220603

