Real-time license plate recognition method based on deep learning in complex scenes
Technical Field
The invention belongs to the field of image processing and text recognition, and relates to a method for automatic license plate recognition in complex scenes using deep learning.
Background
The number of motor vehicles in China is increasing year by year, which places higher demands on traffic and vehicle management, and intelligent traffic and monitoring systems have been developed against this background. Automatic License Plate Recognition (ALPR) plays an important role in such systems and has a wide range of applications, such as parking lot access control, road traffic monitoring, and future unmanned traffic. ALPR locates the license plate region in a scene, extracts the license plate character information, and finally recognizes the license plate characters. The classical ALPR pipeline comprises four steps: license plate region detection, license plate positioning, character segmentation, and character recognition.
The main purpose of license plate region detection is to determine the region of the whole image that contains a license plate. In most cases the shooting angle causes the license plate in the scene to appear at an angle, so the purpose of license plate positioning is to locate the plate more precisely within that region and correct problems such as skew, yielding a standard license plate image. Traditional methods for license plate region detection and accurate positioning mainly rely on hand-crafted features of the plate, such as contour, color, edge density, shape, and texture. The representational power of hand-crafted features is limited, however, so for license plates in different environments and scenes the complex surroundings can interfere strongly with region detection and cause positioning errors. In addition, because Chinese license plates come in many types, shallow hand-crafted features cannot cope with the variety of plates.
The main purpose of character segmentation is to cut continuous text into individual characters for subsequent recognition. Traditional character segmentation mainly uses connected-component search, horizontal and vertical projection, feature projection, and similar methods. Because of the structure of Chinese characters and the diversity of their strokes, connected-component and projection methods are prone to segmentation errors, and improper segmentation loses license plate character information, which seriously affects subsequent recognition.
Character recognition identifies the license plate characters; commonly used methods include template matching, hidden Markov models, and neural networks. The main difficulty lies in recognizing Chinese characters: because they have many fine strokes, traditional recognition methods struggle to capture character detail, which leads to misclassification. In recent years, owing to the strength of Convolutional Neural Networks (CNNs) in image feature representation, most existing license plate recognition systems use a CNN to recognize each character in turn and obtain good results. However, because a license plate has a fixed length, each plate must be recognized seven times in character order, which introduces considerable network redundancy and increases the time spent on character recognition.
In summary, every step of an ALPR system has a decisive impact on the final recognition result. Existing ALPR methods have the following defects: license plate region detection and positioning are not robust enough to interference in complex scenes, the multi-stage pipeline loses character information at each step, and the recognition speed cannot meet real-time requirements.
Disclosure of Invention
The invention provides a novel deep-learning-based automatic license plate recognition method. It aims to solve the problems that existing ALPR systems have insufficient recognition capability in complex scenes, support a limited range of license plate types, and suffer from information loss and low recognition speed caused by multi-stage processing.
The technical scheme of the invention is as follows. The ALPR system of the invention consists of three parts: end-to-end license plate region detection and classification based on a deep learning object detection method, accurate license plate positioning and correction, and end-to-end character recognition based on a CNN. First, an improved deep learning object detection and classification algorithm, the Single Shot MultiBox Detector (SSD), a single-stage multi-scale object detector, is applied to license plate region detection, so that the type of the license plate is given at the same time as the region containing the plate is located in a complex background image. Then the license plate is accurately positioned and corrected based on multi-level threshold binarization: for each level of the binarized license plate region image, rectangular contours that satisfy the prior aspect ratio and size requirements of characters are found; four boundary point sets of the license plate are determined from these rectangular contours; the license plate boundary lines are determined by line fitting; the intersection points of the boundary lines are the four corner points of the license plate; and the correction is completed by perspective transformation. The method removes the character segmentation step of traditional systems. Finally, the complete, accurately positioned and corrected license plate is fed directly into a CNN with seven outputs, which recognizes the seven characters of the plate simultaneously and end to end.
The method comprises the following steps:
The first step: network setup for license plate region detection and classification:
SSD is taken as the basis of the license plate region detection and classification algorithm. The feature extraction network in SSD is replaced by a lightweight neural network (MobileNet), forming a much lighter SSD network, SSD-MobileNet. The aspect ratios of the license plate candidate boxes are set to {2, 3, 4}, so that the candidate boxes fit the shape of a license plate more closely and the resource consumption of computing candidate boxes with unreasonable aspect ratios is avoided. The network parameters of SSD-MobileNet are set as follows: first, the network is pre-trained for localization and classification on a large database to determine suitable initial parameters; then transfer learning is performed by training on the scene license plate database and fine-tuning the network parameters so that they are optimal on the target data set, which completes the parameter determination of the SSD-MobileNet network. The trained SSD-MobileNet is an end-to-end license plate detector and classifier: when a scene license plate image is input to the network, the rectangular box coordinates of the license plate region and the corresponding license plate category are obtained.
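As a rough illustration of the candidate box setting described above, the following Python sketch generates SSD-style default boxes with aspect ratios {2, 3, 4} for a single feature map. The function name `default_boxes`, the `scale` value, and the 3 × 3 feature map are assumptions made for illustration and are not taken from the invention.

```python
import math

def default_boxes(fmap_size, scale, aspect_ratios=(2, 3, 4)):
    """Return (cx, cy, w, h) default boxes, normalised to [0, 1], for one feature map."""
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Box centre sits in the middle of the feature-map cell.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Same area (scale ** 2) for every box; only the shape changes.
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes

# Hypothetical 3 x 3 feature map with an assumed scale of 0.3.
boxes = default_boxes(fmap_size=3, scale=0.3)
```

Each cell contributes one box per aspect ratio, so restricting the ratios to wide shapes keeps the number of boxes per cell small.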
The second step: accurate positioning and correction of the license plate:
The obtained license plate region image is normalized to a fixed size and converted to grayscale or inverse grayscale according to the license plate type, so that the gray value of the characters is greater than that of the background (that is, the characters are lighter than the background). Multi-level threshold binarization is then applied to the grayscale region image. The basic threshold is determined adaptively from the mean of the neighborhood around each pixel, so the binarization threshold at each pixel position is not fixed but is determined by the distribution of the surrounding neighborhood pixels, which preserves character detail. The final threshold is obtained by subtracting a constant C from the basic threshold, and multiple threshold levels are obtained by varying the value of C. With mean(R_i) denoting the mean of the pixels in region R_i, the final threshold th_{R_i} of region R_i is:
$$th_{R_i} = \mathrm{mean}(R_i) - C$$
For each level of the binarized region image, the bounding rectangles of connected regions that satisfy the character priors (aspect ratio and area) are found. The upper-left corners of these rectangles are added to the upper boundary point set and the lower-right corners to the lower boundary point set; the upper-left and lower-left corners of rectangles in the left region of the image are added to the left boundary point set, and the upper-right and lower-right corners of rectangles in the right region of the image are added to the right boundary point set. The four boundary lines of the license plate are obtained by line fitting on the corresponding point sets, the intersection points of the boundary lines are the corner points of the license plate, and finally the license plate skew is corrected by perspective transformation.
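The final perspective correction can be sketched in a few lines of Python. The example below solves for the 3 × 3 perspective (homography) matrix that maps four detected corner points onto the corners of an upright plate, then uses it to map points. The detected corner values are made up for illustration, the 136 × 36 target size follows the detailed description, and the helper names are hypothetical; in practice an image library would also resample the pixels.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def homography(src, dst):
    """Nine row-major coefficients of the matrix mapping src points to dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve_linear(rows, rhs) + [1.0]  # last coefficient fixed to 1

def warp_point(h, x, y):
    """Apply the perspective matrix to one point."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)

# Illustrative skewed corners: upper-left, upper-right, lower-right, lower-left.
detected = [(12, 20), (125, 8), (130, 50), (9, 60)]
standard = [(0, 0), (136, 0), (136, 36), (0, 36)]
h_matrix = homography(detected, standard)
```

Mapping each detected corner through `warp_point` should land on the matching standard corner, which is a convenient sanity check for the fitted matrix.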
The third step: end-to-end license plate character recognition network:
License plate character recognition is implemented with a CNN. Because the CNN sees the content of the whole image and the relative positions within it as it scans the whole license plate image, the invention adopts a CNN architecture of three groups of convolution modules followed by seven parallel fully connected layers to achieve end-to-end character recognition; each group of convolution modules comprises two convolutional layers and one max-pooling (maximum value down-sampling) layer. The network is trained on a pure license plate data set with categorical cross-entropy as the loss function, and training of the network parameters is complete when the loss function converges.
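To make the seven-output idea concrete, the sketch below computes a plate-level categorical cross-entropy by summing over seven per-character probability vectors (summing matches the detailed description) and decodes a plate by taking the most likely class of each head. The four-symbol alphabet and the probability values are toy placeholders, and sharing one alphabet across all heads is an assumption.

```python
import math

ALPHABET = ["A", "B", "1", "2"]  # toy alphabet, not the real character set

def cross_entropy(probs, target):
    """Categorical cross-entropy for one head: -log p(target)."""
    return -math.log(probs[target])

def plate_loss(head_probs, targets):
    """Sum of the per-character losses over all output heads."""
    return sum(cross_entropy(p, t) for p, t in zip(head_probs, targets))

def decode(head_probs, alphabet=ALPHABET):
    """Pick the most probable symbol of every head and join them."""
    return "".join(alphabet[max(range(len(p)), key=p.__getitem__)]
                   for p in head_probs)

def peaked(index, size=4, peak=0.7):
    """Toy probability vector with most of its mass on one class."""
    rest = (1.0 - peak) / (size - 1)
    return [peak if i == index else rest for i in range(size)]

# Seven heads, one per character position.
targets = [0, 1, 2, 3, 0, 1, 2]
head_probs = [peaked(t) for t in targets]
plate = decode(head_probs)
loss = plate_loss(head_probs, targets)
```

Because all seven heads read the same shared features, one forward pass yields the whole plate string.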
The invention has the beneficial effects that:
(1) The license plate region detection part adopts the deep-learning-based object detection algorithm SSD, which is robust for license plate localization in complex environments. By placing license plate candidate boxes on multi-level feature maps, SSD locates small license plates more accurately. The feature extraction network in SSD is replaced by the lightweight MobileNet, which reduces the number of network parameters and the detection time. Because the network is end to end, the license plates are classified at the same time as the license plate region is detected.
(2) The accurate positioning and correction part is based on multi-level threshold binarization: the license plate corner points are located accurately and a perspective transformation is then applied, so that accurate positioning and correction are achieved in one step. This avoids the staged processing of traditional methods, which handle license plate extraction and license plate angle and skew separately, simplifies the processing flow, and reduces information loss.
(3) Compared with traditional methods, the method eliminates the character segmentation step and uses an end-to-end CNN to recognize all characters in parallel in the character recognition part. This reduces the character information lost in intermediate processing, reduces algorithm complexity, and significantly improves the speed and accuracy of character recognition.
Drawings
FIG. 1 is an overall flow diagram of a license plate recognition system of the present invention;
FIG. 2 is a schematic diagram of a license plate detection and classification network SSD-MobileNet structure according to the present invention;
FIG. 3 is a flowchart of an algorithm for accurate license plate location and correction according to the present invention;
FIG. 4 is a schematic diagram of a character recognition network structure according to the present invention.
Detailed Description
FIG. 1 is an overall flow chart of the license plate recognition system of the present invention. The license plate identification process comprises three steps:
Firstly, setting up the license plate region detection and classification network:
A deep learning object detection algorithm, SSD combined with MobileNet (SSD-MobileNet), is used as the license plate region detection and classification method, so that license plate region localization and classification are completed simultaneously in one end-to-end network.
FIG. 2 is a schematic diagram of a license plate detection and classification network SSD-MobileNet.
The method detects and classifies five types of Chinese license plates (classes = 5): blue plates, yellow plates, white plates, black plates, and new energy plates. First, the image is sent to the feature extraction network, which takes MobileNet from the first full convolutional layer (Conv1) to the 7th depthwise separable convolutional layer (Conv_dw7). The last two depthwise separable convolutional layers of MobileNet are replaced by full convolutional layers (Conv7-Conv8), 4 further full convolutional layers (Conv9-Conv12) are added for multi-scale detection, and the final boxes are output after redundant boxes are removed by non-maximum suppression. To achieve multi-scale license plate detection, default candidate boxes with different aspect ratios are proposed in each cell of 6 feature maps of different sizes, taken from 5 full convolutional layers (Conv7-Conv12) and the depthwise convolutional layer (Conv_dw6) of the 5th depthwise separable convolutional layer in MobileNet. Based on prior knowledge of the license plate (the standard Chinese license plate is 440 mm long and 140 mm wide, an aspect ratio of approximately 3.14), the aspect ratios of the candidate boxes are set to {2, 3, 4}; three aspect ratios are used to allow for the uncertainty in aspect ratio caused by license plate distortion at different shooting angles. For each candidate box in each feature map cell, classification and position regression are performed by a small 3 × 3 convolution kernel (the candidate box is regressed by computing offsets relative to the coordinates of the upper-left corner point and the length and width of the ground-truth box). The final output of the network gives the position and category of the license plate box.
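The non-maximum suppression step mentioned above could be sketched as follows. This is a plain greedy version, assuming boxes given as (x1, y1, x2, y2) corners and an IoU threshold of 0.5; the function names and the threshold are illustrative assumptions, and the source does not specify which variant the network uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one plate and one separate detection.
boxes = [(10, 10, 110, 42), (12, 11, 112, 43), (200, 50, 300, 82)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first almost completely, so only the higher-scoring of the two survives alongside the separate detection.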
The network parameters of SSD-MobileNet are set as follows. The hyper-parameters of MobileNet are a width multiplier (Width Multiplier) of 1 and a resolution multiplier (Resolution Multiplier) of 224. First, MobileNet is pre-trained for classification on the large ImageNet data set (more than 1.2 million images in 1000 classes) to determine the initial parameters of the network. The last 6 full convolutional layers are then added (weights W initialized from a Gaussian distribution with parameters (0, 0.03), biases initialized to 0), and transfer training is performed on the scene license plate data set. The overall objective loss function of the network is the weighted sum of the localization loss and the classification loss, and training is iterated until the loss function converges.
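A minimal sketch of such a weighted objective for a single matched candidate box is given below. It assumes the choices commonly used with SSD (smooth L1 for the box offsets, softmax cross-entropy for the class, and a weight `alpha`), and six class scores (five plate types plus background); the source only states that the two losses are combined in a weighted sum, so these specific loss terms, the class layout, and the names are assumptions.

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def softmax_cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against one target class index."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def detection_loss(pred_offsets, true_offsets, class_logits, true_class, alpha=1.0):
    """Classification loss plus alpha times localization loss for one box."""
    loc = sum(smooth_l1(p - t) for p, t in zip(pred_offsets, true_offsets))
    cls = softmax_cross_entropy(class_logits, true_class)
    return cls + alpha * loc

# Illustrative values: four box offsets and six class scores.
example = detection_loss([0.1, -0.2, 0.0, 0.3], [0.0, 0.0, 0.0, 0.0],
                         [2.0, 0.5, 0.1, 0.1, 0.1, 0.1], 0)
```

Raising `alpha` makes the training favour precise box placement over class confidence, and lowering it does the opposite.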
Secondly, accurately positioning and correcting the license plate:
The license plate is further extracted from the located region, and the skew caused by the shooting angle is corrected to obtain a standard license plate image. First, the license plate region image is normalized to a length of 136 and a height of 72. According to the license plate type obtained above, the region image is preprocessed by graying (for plate types with white characters) or inverse graying (for plate types with black characters), so that the gray value of the characters is greater than that of the background (that is, character gray > background gray). The grayscale image of the license plate region is then binarized with multi-level thresholds: a local adaptive binarization method is used, and 20 threshold levels are obtained by varying the constant C, which takes 20 values uniformly distributed in [-50, 0).
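The multi-level local adaptive binarization can be illustrated with the short pure-Python sketch below. It thresholds each pixel against its neighbourhood mean minus C and repeats this for 20 values of C. The neighbourhood radius, the exact spacing of the C values (here -50, -47.5, ..., -2.5), and the function names are assumptions; a practical implementation would more likely use an image library.

```python
def local_mean(gray, row, col, radius):
    """Mean of the square neighbourhood around a pixel, clipped at the border."""
    h, w = len(gray), len(gray[0])
    r0, r1 = max(0, row - radius), min(h, row + radius + 1)
    c0, c1 = max(0, col - radius), min(w, col + radius + 1)
    total = sum(gray[r][c] for r in range(r0, r1) for c in range(c0, c1))
    return total / ((r1 - r0) * (c1 - c0))

def binarize(gray, c_value, radius=1):
    """Set a pixel to 1 when it exceeds mean(neighbourhood) - C, otherwise 0."""
    return [[1 if gray[r][c] > local_mean(gray, r, c, radius) - c_value else 0
             for c in range(len(gray[0]))]
            for r in range(len(gray))]

def c_levels(count=20, low=-50.0, high=0.0):
    """`count` values spread uniformly over the half-open interval [low, high)."""
    step = (high - low) / count
    return [low + k * step for k in range(count)]

def multilevel_binarize(gray, radius=1):
    """One binary image per threshold level."""
    return [binarize(gray, c, radius) for c in c_levels()]

# Tiny grayscale patch with one bright horizontal stroke on a dark background.
gray = [[10, 10, 10, 10, 10],
        [10, 200, 200, 200, 10],
        [10, 10, 10, 10, 10]]
masks = multilevel_binarize(gray)
```

Because C is negative, each threshold sits above the local mean, so only pixels clearly brighter than their surroundings are kept, and the levels differ in how strict that margin is.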
FIG. 3 is a flowchart of an algorithm for accurate license plate location and correction.
(1) For the binary image obtained at each level, all connected-component contours are found, the bounding rectangle of each contour is determined, and the bounding rectangles that satisfy the character priors are retained (100 < area of the bounding rectangle < 1200; height-to-width ratio > 0.7 or height-to-width ratio > 3, which allows for the character '1' and for Chinese characters with disconnected strokes);
(2) The upper-left and lower-right corner coordinates of each qualifying bounding rectangle are added to the upper boundary point set and the lower boundary point set, respectively. In addition, if the rectangle lies in the leftmost 20% of the image, its upper-left and lower-left corner coordinates are added to the left boundary point set, and if it lies in the rightmost 20% of the image, its upper-right and lower-right corner coordinates are added to the right boundary point set;
(3) After the binarized images of all threshold levels have been processed as above, the point sets of the upper, lower, left, and right boundaries are obtained. Lines are fitted to the upper and lower boundary point sets using the Huber loss function (Huber Loss), which is robust to outliers, and to the left and right boundary point sets using the least squares method. This gives the four boundary lines of the license plate, and their intersection points are the corner positions of the license plate;
(4) Given the four corner points of the license plate and the expected corner points of the standard license plate, namely lower-right [136,36], lower-left [0,36], upper-right [136,0], and upper-left [0,0], the standard license plate is obtained in one step by perspective transformation.
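Steps (2) and (3) above can be sketched as follows, assuming the character bounding rectangles from step (1) are already available as (x, y, w, h) tuples. The Huber fit is implemented here by iteratively reweighted least squares, which is one common way to minimise a Huber loss; the `delta` value, the iteration count, the reading of the 20% side regions as a test on the box edges, and all function names are assumptions rather than details given in the source.

```python
def collect_boundary_points(rects, image_width, side_fraction=0.2):
    """Split corners of character boxes (x, y, w, h) into four boundary point sets."""
    top, bottom, left, right = [], [], [], []
    for x, y, w, h in rects:
        top.append((x, y))                 # upper-left corner
        bottom.append((x + w, y + h))      # lower-right corner
        if x < side_fraction * image_width:            # box in the left strip
            left += [(x, y), (x, y + h)]
        if x + w > (1 - side_fraction) * image_width:  # box in the right strip
            right += [(x + w, y), (x + w, y + h)]
    return top, bottom, left, right

def weighted_line_fit(ts, vs, ws):
    """Weighted least squares for the line v = k * t + b."""
    sw = sum(ws)
    mt = sum(w * t for w, t in zip(ws, ts)) / sw
    mv = sum(w * v for w, v in zip(ws, vs)) / sw
    var = sum(w * (t - mt) ** 2 for w, t in zip(ws, ts))
    cov = sum(w * (t - mt) * (v - mv) for w, t, v in zip(ws, ts, vs))
    k = cov / var if var > 0 else 0.0
    return k, mv - k * mt

def least_squares_fit(ts, vs):
    """Ordinary least squares: every point has weight 1."""
    return weighted_line_fit(ts, vs, [1.0] * len(ts))

def huber_line_fit(ts, vs, delta=2.0, iterations=20):
    """Fit v = k * t + b under a Huber loss by iteratively reweighted least squares."""
    ws = [1.0] * len(ts)
    for _ in range(iterations):
        k, b = weighted_line_fit(ts, vs, ws)
        residuals = [abs(v - (k * t + b)) for t, v in zip(ts, vs)]
        # Points farther than delta from the line are down-weighted.
        ws = [1.0 if r <= delta else delta / r for r in residuals]
    return k, b

def intersect(horizontal, vertical):
    """Intersect y = k1 * x + b1 with x = k2 * y + b2."""
    (k1, b1), (k2, b2) = horizontal, vertical
    x = (k2 * b1 + b2) / (1.0 - k1 * k2)
    return x, k1 * x + b1

def plate_corners(rects, image_width):
    """Corners in the order upper-left, upper-right, lower-right, lower-left."""
    top, bottom, left, right = collect_boundary_points(rects, image_width)
    # Near-horizontal edges: y as a function of x, robust Huber fit.
    top_line = huber_line_fit([p[0] for p in top], [p[1] for p in top])
    bottom_line = huber_line_fit([p[0] for p in bottom], [p[1] for p in bottom])
    # Near-vertical edges: x as a function of y, ordinary least squares.
    left_line = least_squares_fit([p[1] for p in left], [p[0] for p in left])
    right_line = least_squares_fit([p[1] for p in right], [p[0] for p in right])
    return [intersect(top_line, left_line), intersect(top_line, right_line),
            intersect(bottom_line, right_line), intersect(bottom_line, left_line)]

# Three hypothetical character boxes in a 136-pixel-wide region image.
rects = [(5, 4, 12, 28), (60, 4, 12, 28), (120, 4, 12, 28)]
corners = plate_corners(rects, image_width=136)
```

Fitting the side edges as x in terms of y avoids the near-infinite slopes that a y-in-terms-of-x fit would produce for almost vertical lines. The sketch assumes every point set is non-empty; a fuller version would need to handle a plate with no boxes in a side strip.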
Thirdly, end-to-end character recognition:
All characters are recognized with a multi-output convolutional neural network structure.
FIG. 4 is a schematic diagram of the structure of the character recognition network.
The network achieves end-to-end character recognition with a CNN architecture of three groups of convolution modules and seven parallel fully connected layers. Each group of convolution modules comprises two convolutional layers and one down-sampling layer. The convolutional layers of all three modules use 3 × 3 convolution kernels, and the numbers of feature maps of the three modules are 32, 64, and 128, respectively. All down-sampling layers use max pooling with a 2 × 2 sampling window, and finally the seven fully connected layers output the confidence results for the seven characters in parallel. The final loss function of the network is the sum of the classification losses of the seven characters. The character recognition network is trained on the pure license plate data set until the loss function converges.
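The layer sizes implied by this architecture can be traced with a little arithmetic. The sketch below assumes a 136 × 36 single-channel input (the corrected plate size from step (4)), zero padding that preserves the spatial size in every 3 × 3 convolution, and stride-2 pooling; none of these three points is stated in the source, and the function names are illustrative.

```python
def trace_shapes(height, width, channels=(32, 64, 128)):
    """Feature-map shape (channels, height, width) after each convolution module."""
    shapes = []
    for c in channels:
        # Two padded 3x3 convolutions keep height and width; pooling halves them.
        height, width = height // 2, width // 2
        shapes.append((c, height, width))
    return shapes

def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one convolutional layer."""
    return c_out * (c_in * kernel * kernel + 1)

# Assumed input: corrected plate, 36 pixels high and 136 pixels wide.
shapes = trace_shapes(36, 136)
# Number of features each of the seven fully connected heads would receive.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

Under these assumptions the three convolution modules are shared by all seven heads, so only the fully connected layers are replicated per character.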
The results of the license plate detection, classification, and fine positioning and correction method of the invention were obtained on 600 license plate images from complex scenes. The experimental environment was an Intel Core i7 quad-core CPU with a clock frequency of 2.6 GHz and an NVIDIA GeForce GTX 965M graphics card with 4 GB of video memory. A comparison with license plate positioning algorithms that use other features is shown in Table 1. Accuracy is the percentage of images in the whole data set that are validly positioned (valid positioning means that the positioned license plate is in standard form, has no obvious skew or missing characters, and is classified correctly). As the table shows, even though Chinese license plates are more complex than those of other countries, the method remains competitive, and the positioning speed is greatly increased while positioning accuracy is maintained. (Processing speed is reported in milliseconds (ms) and in frames processed per second (FPS).)
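The two speed units are related by FPS = 1000 / (milliseconds per image), and accuracy is the share of validly positioned images. The helper names and the numbers below are placeholders chosen for the arithmetic only; they are not results from Table 1 or Table 2.

```python
def fps_from_ms(ms_per_image):
    """Frames per second for a given per-image processing time in milliseconds."""
    return 1000.0 / ms_per_image

def accuracy_percent(valid_count, total_count):
    """Percentage of images counted as validly positioned."""
    return 100.0 * valid_count / total_count

# Placeholder inputs, not measurements from the invention.
example_fps = fps_from_ms(40.0)
example_accuracy = accuracy_percent(3, 4)
```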
The license plate recognition system of the invention was also compared with existing license plate recognition systems; the results are shown in Table 2. The accuracy of the proposed design is competitive with other methods, and because license plate detection and recognition are both end to end, the recognition speed is significantly improved and real-time detection and recognition can be achieved.
TABLE 1 Performance comparison of license plate positioning methods based on different features
TABLE 2 Comparison of different license plate recognition systems