Disclosure of Invention
The invention aims to provide an end-to-end frame number identification system based on deep learning, so as to solve the technical problems of low recognition efficiency and frequent missed and wrong detections in the conventional manual recognition mode.
To achieve the above purpose, the invention adopts the following technical scheme:
an end-to-end frame number identification system based on deep learning is provided for automatically identifying the frame number, and comprises the following components:
the image input module is used for inputting an image containing the whole frame number character string;
the image feature extraction module is connected with the image input module and is used for extracting the image features corresponding to the image, obtaining a feature map corresponding to the image, and converting the feature map into a corresponding feature vector;
and the character recognition module is connected with the image feature extraction module and is used for recognizing the character type of each frame number character in the frame number character string in the image according to the feature vector and based on a preset character recognition model, and finally obtaining a character recognition result of the frame number character string.
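As a non-limiting sketch of how the above modules could be composed, assuming a PyTorch-style implementation (all class and variable names below are illustrative, not terms taken from this patent):

import torch
import torch.nn as nn

class FrameNumberRecognizer(nn.Module):
    # End-to-end recognizer: an input image passes through the feature extraction module,
    # and the resulting feature vector is fed to one character recognition unit per position.
    def __init__(self, backbone: nn.Module, heads: nn.ModuleList):
        super().__init__()
        self.backbone = backbone   # image feature extraction module
        self.heads = heads         # character recognition module (one unit per character position)

    def forward(self, image: torch.Tensor) -> list:
        feature_vector = self.backbone(image)                  # feature map -> feature vector
        return [head(feature_vector) for head in self.heads]   # one prediction vector per position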
As a preferred solution of the present invention, the end-to-end frame number recognition system performs convolution operations on the image through a convolutional neural network to obtain the feature map corresponding to the image.
As a preferred solution of the present invention, the end-to-end frame number recognition system converts the feature map corresponding to the image into the feature vector through the convolutional neural network.
As a preferable scheme of the present invention, the character recognition module includes a plurality of character recognition units, each of the character recognition units is respectively used for recognizing a character type corresponding to the frame number character at one of the designated character positions in the frame number character string,
each character recognition unit specifically comprises:
the character feature positioning subunit is used for positioning, in the feature vector, the component features corresponding to the frame number character at the designated character position, and obtaining a positioning result;
the prediction vector generation subunit is connected with the character feature positioning subunit and is used for converting the feature vector into a corresponding prediction vector according to the positioning result;
the prediction probability calculating subunit is connected with the prediction vector generating subunit and is used for calculating component values corresponding to all the components in the prediction vector based on the character recognition model;
and the character type identification subunit is connected with the prediction probability calculation subunit and is used for identifying the character type corresponding to the component corresponding to the maximum component value in the prediction vector based on the character identification model, taking the identified character type as the character type corresponding to the frame number character on the specified character position, and outputting the character type identification result of the frame number character on the specified character position.
As a preferable aspect of the present invention, the number of the character recognition units is 17, and each of the character recognition units is respectively configured to recognize the character type corresponding to the frame number character at one of the designated character positions in the frame number character string.
As a preferred scheme of the present invention, the end-to-end vehicle frame number recognition system further includes a character recognition model training module, connected to the character recognition module, for training and forming the character recognition model according to the character recognition result.
The invention also provides an end-to-end frame number identification method based on deep learning, which is realized by applying the end-to-end frame number identification system and comprises the following steps:
step S1, inputting an image containing a whole frame number character string by the end-to-end frame number identification system;
step S2, the end-to-end frame number recognition system extracts the image characteristics corresponding to the image and obtains a characteristic diagram corresponding to the image;
step S3, the end-to-end frame number recognition system converts the characteristic diagram into a corresponding characteristic vector;
and step S4, the end-to-end frame number recognition system simultaneously performs corresponding character type recognition on each character in the frame number character string in the image according to the feature vector and based on a preset character recognition model, and finally obtains a character recognition result of the frame number character string through recognition.
As a preferable scheme of the present invention, in step S4, the process of identifying the character type corresponding to each character in the frame number character string by the end-to-end frame number identification system specifically includes the following steps:
step S41, the end-to-end frame number recognition system locates the components corresponding to the frame number characters on each designated character position in the frame number character string in the feature vector based on the preset character recognition model, and obtains a plurality of locating results of the frame number characters related to each designated character position;
step S42, the end-to-end frame number identification system converts the same feature vector into a plurality of corresponding prediction vectors according to each positioning result;
step S43, the end-to-end frame number identification system calculates component values corresponding to the components in the prediction vectors based on the preset character identification model;
step S44, the end-to-end frame number recognition system recognizes, based on the preset character recognition model, a character type corresponding to the component corresponding to the maximum component value in each of the prediction vectors, uses the recognized character type as a character type corresponding to the frame number character on the corresponding designated character position, and finally obtains a character recognition result for the frame number character string by recognition.
The end-to-end frame number recognition system provided by the invention can automatically recognize the frame number characters in an input image containing the frame number character string; the recognition process is fast and efficient, the recognition accuracy is high, and the system thereby solves the technical problems of low recognition efficiency and frequent missed and wrong detections in the conventional manual recognition mode.
Detailed Description
The technical scheme of the invention is further explained below through specific embodiments in combination with the accompanying drawings.
The drawings are for illustration only, are shown in schematic form rather than the actual form, and are not to be construed as limiting the present patent; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, it should be understood that terms such as "upper", "lower", "left", "right", "inner", and "outer", when used to indicate an orientation or positional relationship, are based on the orientation or positional relationship shown in the drawings and are used only for convenience and simplicity of description; they do not indicate or imply that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation. Therefore, the terms describing positional relationships in the drawings are used for illustrative purposes only and are not to be construed as limiting the present patent; the specific meanings of these terms can be understood by those skilled in the art according to the specific situation.
In the description of the present invention, unless otherwise explicitly specified or limited, the term "connected" and the like, if used to indicate a connection relationship between components, is to be understood broadly: the connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, or indirect through an intervening medium, or through one or more other components, or the components may simply interact with one another. The specific meanings of the above terms in the present invention can be understood by those skilled in the art in specific cases.
Referring to fig. 1, an end-to-end frame number recognition system based on deep learning according to an embodiment of the present invention is used for automatically recognizing a frame number, and the frame number recognition system includes:
the image input module 1 is used for inputting an image containing a whole vehicle frame number character string;
the image feature extraction module 2 is connected with the image input module 1 and is used for extracting the image features corresponding to the image, obtaining a feature map corresponding to the image, and converting the feature map into a corresponding feature vector;
and the character recognition module 3 is connected with the image feature extraction module 2 and is used for recognizing the character type of each frame number character in the frame number character string in the image according to the feature vector and based on a preset character recognition model, and finally obtaining a character recognition result of the frame number character string.
In this technical scheme, the end-to-end frame number recognition system performs convolution operations on the image through the convolutional neural network to obtain the feature map corresponding to the input image. Referring specifically to fig. 6, the convolutional neural network preferably adopts the VGGNet or ResNet network architecture existing in the prior art to extract the image features. The network architecture includes convolutional layers, ReLU layers, and batch normalization layers. The input to the convolutional neural network is an image containing the entire vehicle frame number character string, with a size of 3 × 448. The input image first passes through a convolutional layer with a 3 × 3 convolution kernel, which outputs feature maps of size 64 × 448; four further stages of convolutional feature extraction then output feature maps of sizes 64 × 224, 128 × 112, 256 × 64, and 512 × 32 in turn. Finally, the 512 × 32 feature map is compressed into a 1024-dimensional feature vector, which encodes the position and shape features of the frame number character string in the input image; this feature vector is then fed into the subsequent character recognition network to carry out character recognition of the frame number character string.
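The following is a minimal, non-limiting sketch of such a feature extraction backbone (PyTorch-style). The number of layers per stage, the strides, and the use of global average pooling before the final linear layer are assumptions made only to keep the example runnable; the patent names only the staged channel widths and the 1024-dimensional output.

import torch
import torch.nn as nn

def conv_stage(in_ch: int, out_ch: int, downsample: bool) -> nn.Sequential:
    # One stage: 3x3 convolution + batch normalization + ReLU, optionally halving the resolution.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2 if downsample else 1, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class FeatureExtractor(nn.Module):
    def __init__(self, feature_dim: int = 1024):
        super().__init__()
        self.stages = nn.Sequential(
            conv_stage(3, 64, downsample=False),    # input image -> 64-channel feature maps
            conv_stage(64, 64, downsample=True),
            conv_stage(64, 128, downsample=True),
            conv_stage(128, 256, downsample=True),
            conv_stage(256, 512, downsample=True),  # channel widths follow the stages named above
        )
        self.pool = nn.AdaptiveAvgPool2d(1)          # collapse the spatial dimensions
        self.fc = nn.Linear(512, feature_dim)        # compress to the 1024-dimensional feature vector

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        x = self.stages(image)
        x = self.pool(x).flatten(1)
        return self.fc(x)                            # shape: (batch, 1024)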
It should be noted that the convolutional neural network image feature extraction method adopted by the end-to-end frame number recognition system is an image feature extraction method existing in the prior art, and the image feature extraction method is not within the scope of the claimed invention, so the specific process of extracting the feature map of the input image by the end-to-end frame number recognition system is not described herein.
In this technical scheme, the end-to-end frame number recognition system also converts the feature map corresponding to the image into the feature vector through the convolutional neural network. The method of converting the feature map into the feature vector using the convolutional neural network exists in the prior art and is not within the scope of the claimed invention, so its detailed conversion process is not described here.
Referring to fig. 2, the character recognition module 3 includes a plurality of character recognition units 31, and each character recognition unit 31 is used for recognizing the character type corresponding to the frame number character at one designated character position in the frame number character string.
Referring to fig. 3, each character recognition unit 31 specifically includes:
the character feature positioning subunit 311 is configured to position, in the feature vector, the component features corresponding to the frame number character at the designated character position, and obtain a positioning result;
the prediction vector generation subunit 312 is connected with the character feature positioning subunit 311 and is configured to convert the feature vector into a corresponding prediction vector according to the positioning result;
the prediction probability calculation subunit 313 is connected with the prediction vector generation subunit 312 and is configured to calculate, based on the character recognition model, the component values (prediction probabilities) corresponding to the components in the prediction vector;
and the character type identification subunit 314 is connected with the prediction probability calculation subunit 313 and is configured to identify, based on the character recognition model, the character type corresponding to the component with the largest component value in the prediction vector, take the identified character type as the character type corresponding to the frame number character at the designated character position, and output the character type identification result for the frame number character at the designated character position.
It is emphasized here that each character recognition unit recognizes the frame number character at only one designated character position in the frame number character string. For example, the first character recognition unit recognizes the frame number character at the first character position in the frame number character string, the second character recognition unit recognizes the frame number character at the second character position in the frame number character string, and so on.
Since the frame number is generally composed of 17 characters, each being a letter, a digit, or drawn from a combination of letters and digits, the number of character recognition units 31 is preferably 17, and each character recognition unit 31 is used for recognizing the character type corresponding to the frame number character at one designated character position in the frame number character string.
The characters cover 36 character types, namely the 26 English letters and the 10 digits 0 through 9.
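For reference, this 36-way label set can be written down directly in code; the constant name below is illustrative only:

import string

VIN_ALPHABET = string.ascii_uppercase + string.digits   # 26 letters followed by the 10 digits, 36 classes in total
assert len(VIN_ALPHABET) == 36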
In the above technical solution, the process of recognizing the character type of the frame number character by the end-to-end frame number recognition system is detailed as follows:
Fig. 7 shows the network architecture of the convolutional neural network adopted by the end-to-end frame number recognition system provided by the present invention to recognize the character type corresponding to the frame number character at a designated character position in the frame number character string. Referring to fig. 7, the network architecture is composed of a first fully connected layer, a second fully connected layer, and a ReLU layer.
The 1024-dimensional feature vector output by the system passes through the first fully connected layer 100, the second fully connected layer 200, and the ReLU layer 300, and a 36-dimensional prediction vector is then output. The 36 components of the 36-dimensional prediction vector represent the 26 English letters and the 10 digits, respectively, and the component values corresponding to the 36 components represent the prediction probabilities that the character is the corresponding English letter or digit.
Specifically, the 1024-dimensional feature vector output by the system is fed simultaneously to the 17 character recognition units. Each character recognition unit extracts the component features of the frame number character at the designated character position it is responsible for and outputs the corresponding prediction vector, i.e. the 36-dimensional vector described above. The component values of the components in this vector are then calculated according to the preset character recognition model (that is, the prediction probability that the character is each corresponding English letter or digit is calculated), the component with the maximum component value is taken as the prediction result, and the character type corresponding to that component is output; for example, if that character type is the character "A", the unit outputs the character "A" as the frame number character at the designated character position it is responsible for.
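A minimal sketch of one such character recognition unit, and of the 17 parallel units, follows (PyTorch-style). The hidden width of 512 and the placement of the ReLU between the two fully connected layers are assumptions made here only to give a runnable example; the patent names the layers but does not fix these details.

import torch
import torch.nn as nn

NUM_POSITIONS = 17   # one character recognition unit per frame number character position
NUM_CLASSES = 36     # 26 letters + 10 digits

class CharacterRecognitionUnit(nn.Module):
    def __init__(self, feature_dim: int = 1024, hidden_dim: int = 512):
        super().__init__()
        self.fc1 = nn.Linear(feature_dim, hidden_dim)   # first fully connected layer
        self.relu = nn.ReLU(inplace=True)               # ReLU layer (placement assumed)
        self.fc2 = nn.Linear(hidden_dim, NUM_CLASSES)   # second fully connected layer

    def forward(self, feature_vector: torch.Tensor) -> torch.Tensor:
        # Returns the 36-dimensional prediction vector (unnormalized scores) for one character position.
        return self.fc2(self.relu(self.fc1(feature_vector)))

# All 17 units receive the same 1024-dimensional feature vector as input.
heads = nn.ModuleList([CharacterRecognitionUnit() for _ in range(NUM_POSITIONS)])

feature_vector = torch.randn(1, 1024)                                    # stand-in for the feature extractor output
prediction_vectors = [head(feature_vector) for head in heads]            # 17 prediction vectors, each (1, 36)
probabilities = [torch.softmax(p, dim=1) for p in prediction_vectors]    # component values as prediction probabilities
predicted_classes = [int(p.argmax(dim=1)) for p in probabilities]        # index of the largest component per position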
It should be noted that, according to the preset character recognition model, each character recognition unit can locate, within the 1024-dimensional feature vector, the component features associated with the designated character position it is responsible for, while ignoring the other feature parts of the 1024-dimensional feature vector.
Preferably, the end-to-end frame number recognition system provided by the invention further comprises a character recognition model training module 4 connected with the character recognition module 3 and used for training and forming the character recognition model according to the character recognition result.
The loss function used to train the character recognition model is calculated by the following formula:

L = -\sum_{c=1}^{M} y_c \log(p_c)

In the above formula, L is used to represent the loss function;
M is used to represent the number of character categories to which each frame number character may belong (i.e., the 26 English letters and the 10 digits);
y_c is an indicator variable representing whether character category c predicted by the system is consistent with the real character category: if so, y_c is 1, otherwise y_c is 0;
p_c is used to represent the prediction probability that the training sample belongs to character category c.
The character recognition model is optimized by an Adam optimization method in the prior art.
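A minimal training-step sketch consistent with the loss and optimizer named above follows (PyTorch-style). Summing the 17 per-position cross-entropy losses into a single objective, the 448 × 448 input shape, and the learning rate shown are assumptions made only to complete the example; the patent does not specify these details.

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()   # categorical cross-entropy: L = -sum_c y_c * log(p_c)

def training_step(model: nn.Module, optimizer: torch.optim.Optimizer,
                  images: torch.Tensor, labels: torch.Tensor) -> float:
    # images: e.g. (batch, 3, 448, 448); labels: (batch, 17) class indices in [0, 35].
    prediction_vectors = model(images)          # list of 17 tensors, each of shape (batch, 36)
    loss = sum(criterion(pred, labels[:, i]) for i, pred in enumerate(prediction_vectors))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# For example: optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # Adam, as named above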
The invention also provides an end-to-end frame number identification method based on deep learning, which is implemented by applying the above end-to-end frame number identification system; referring to fig. 4 and 8, the method comprises the following steps:
step S1, inputting an image containing a whole frame number character string by the end-to-end frame number identification system;
step S2, the end-to-end frame number recognition system extracts the image characteristics corresponding to the image and obtains a characteristic diagram corresponding to the image;
step S3, the end-to-end frame number recognition system converts the characteristic diagram into a corresponding characteristic vector;
and step S4, the end-to-end frame number recognition system simultaneously carries out corresponding character type recognition on each character in the frame number character string in the image according to the characteristic vector and based on a preset character recognition model, and finally obtains a character recognition result of the frame number character string through recognition.
Referring to fig. 5, in step S4, the process of the end-to-end frame number recognition system recognizing the character type corresponding to each character in the frame number character string specifically includes the following steps:
step S41, the end-to-end frame number recognition system locates the components corresponding to the frame number characters on each appointed character position in the frame number character string in the feature vector based on the preset character recognition model, and obtains a plurality of locating results of the frame number characters related to each appointed character position;
step S42, the end-to-end frame number recognition system converts the same feature vector into a plurality of corresponding prediction vectors according to each positioning result;
step S43, the end-to-end frame number recognition system calculates component values corresponding to components in the prediction vectors based on a preset character recognition model;
and step S44, the end-to-end frame number recognition system recognizes the character type corresponding to the component corresponding to the maximum component value in each prediction vector based on a preset character recognition model, uses the recognized character type as the character type corresponding to the frame number character on the corresponding designated character position in the frame number character string, and finally recognizes to obtain the character recognition result of the frame number character string.
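Steps S41 to S44 can be illustrated with the following non-limiting sketch (PyTorch-style), reusing the illustrative model and the VIN_ALPHABET constant from the earlier sketches:

import torch

@torch.no_grad()
def recognize_frame_number(model, image: torch.Tensor) -> str:
    # image: a (1, 3, H, W) tensor containing the whole frame number character string.
    prediction_vectors = model(image)                      # 17 prediction vectors, each of shape (1, 36)
    characters = []
    for pred in prediction_vectors:
        probabilities = torch.softmax(pred, dim=1)         # component values as prediction probabilities (step S43)
        class_index = int(probabilities.argmax(dim=1))     # component with the largest value (step S44)
        characters.append(VIN_ALPHABET[class_index])       # map the component back to its character type
    return "".join(characters)                             # the 17 recognized characters, in position order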
In the above technical solution, in step S41, the positioning of the components corresponding to the frame number characters at the designated character positions in the feature vector is performed by the pre-trained character recognition model. This positioning method exists in the prior art and is not within the scope of the claimed invention, so the specific process by which the system positions the components based on the convolutional neural network is not described here.
In step S42, the prediction vector is a 36-dimensional vector whose 36 components indicate the possible character categories of the frame number character at the designated character position, namely the corresponding 26 English letters or 10 digits, and the component value corresponding to each of the 36 components is the prediction probability that the character belongs to that category.
In step S42, the method by which the system locates the character features corresponding to the frame number character at the designated character position in the 1024-dimensional feature vector is a conventional positioning method existing in the prior art; this positioning method is not within the scope of the claimed invention and is therefore not described here.
In step S43, the method by which the system calculates the component values corresponding to the components in the 36-dimensional prediction vector is also a method existing in the prior art, preferably implemented with the above convolutional neural network, and the specific calculation process is not described herein.
It should be emphasized that each character recognition unit only recognizes the frame number character at its designated character position in the frame number character string; for example, the first character recognition unit only recognizes the character type corresponding to the frame number character at the first designated character position, the second character recognition unit only recognizes the character type corresponding to the frame number character at the second designated character position, and so on. As a result, the characters in the recognition result produced by the system for the frame number character string are arranged in order, and no disorder occurs.
In conclusion, the end-to-end frame number recognition system provided by the invention can automatically recognize the frame number characters in an input image containing the frame number character string; the recognition process is fast and efficient, the recognition accuracy is high, and the system solves the technical problems of low recognition efficiency and frequent missed and wrong detections in the conventional manual recognition mode.
It should be understood that the above-described embodiments are merely preferred embodiments of the invention and the technical principles applied thereto. It will be understood by those skilled in the art that various modifications, equivalents, changes, and the like can be made to the present invention. However, such variations are within the scope of the invention as long as they do not depart from the spirit of the invention. In addition, certain terms used in the specification and claims of the present application are not limiting, but are used merely for convenience of description.