Disclosure of Invention
The invention aims to provide a picture correcting method, a picture correcting device, picture correcting equipment and a readable storage medium, which can solve the technical problem that the efficiency of manually correcting the direction of a certificate picture in the prior art is low.
One aspect of the present invention provides a picture correction method, including: acquiring a picture to be corrected; dividing the picture to be corrected to generate a plurality of subunit pictures; inputting a plurality of subunit pictures into a picture direction recognition model which is trained in advance, obtaining a direction label of each subunit picture output by the picture direction recognition model, counting the category to which the direction label of each output subunit picture belongs, and taking the direction label with the largest category number as a target direction label of the picture to be corrected; and carrying out correction operation on the picture to be corrected according to the target direction label to generate a standard forward picture.
Optionally, before the step of inputting a plurality of subunit pictures into a picture direction recognition model trained in advance and obtaining a direction label of each subunit picture output by the picture direction recognition model, the method further includes: acquiring a first training set, wherein the first training set is composed of standard forward pictures; determining the size of the standard forward picture, judging whether the size of the standard forward picture meets a preset standard size, and if not, adjusting the standard forward picture to the preset standard size; performing equal-proportion segmentation on the adjusted standard forward picture to generate a plurality of sub-unit pictures with the same size, adding a 0-degree direction label to each sub-unit picture, and taking the sub-unit picture with the 0-degree direction label as a second training set; respectively rotating all the subunit pictures in the second training set by 90 degrees, 180 degrees and 270 degrees in the same rotation direction, taking the rotation angle as a direction label of each subunit picture, and taking the subunit pictures with the direction labels after rotation as a third training set; and inputting the sub-unit pictures in the second training set and the third training set into the initial learning model for training to obtain a picture direction recognition model.
Optionally, the initial learning model obtains the picture direction recognition model through a training process including: judging the text information weight of the subunit pictures in the input second and third training sets through a first convolutional neural network of the initial learning model, and if the text information weight of a subunit picture is larger than a first preset threshold value, taking that subunit picture as the output of the first convolutional neural network; inputting the subunit pictures output by the first convolutional neural network of the initial learning model into a second convolutional neural network of the initial learning model, identifying the direction labels carried by the input subunit pictures through the second convolutional neural network of the initial learning model, and outputting the direction labels of all the subunit pictures; comparing the direction labels of all the subunit pictures output by the second convolutional neural network of the initial learning model with the direction labels of the subunit pictures in the second and third training sets, and judging whether the accuracy of the second convolutional neural network of the initial learning model is greater than a second preset threshold value; and when the accuracy of the second convolutional neural network of the initial learning model is greater than the second preset threshold value, determining that the training of the initial learning model is finished, and taking the trained initial learning model as the picture direction recognition model.
Optionally, the step of inputting a plurality of subunit pictures into a pre-trained picture direction recognition model, obtaining a direction label of each subunit picture output by the picture direction recognition model, counting the categories to which the direction labels of the output subunit pictures belong, and using the direction label with the largest number of categories as a target direction label of the picture to be corrected includes: receiving each subunit picture through a convolutional layer and a pooling layer of a first convolutional neural network of the picture direction recognition model, and outputting a first feature matrix and a second feature matrix corresponding to each subunit picture; receiving the first feature matrix and the second feature matrix through a first fully connected layer of the first convolutional neural network of the picture direction recognition model, calculating products of a first category vector in the first fully connected layer with the first feature matrix and the second feature matrix respectively, and outputting probability scores of the text information weights of all subunit pictures; inputting the subunit pictures whose probability scores of text information weight are larger than a preset threshold value into a convolutional layer and a pooling layer of a second convolutional neural network of the picture direction recognition model, and outputting a third feature matrix and a fourth feature matrix corresponding to each subunit picture; receiving the third feature matrix and the fourth feature matrix through a second fully connected layer of the second convolutional neural network of the picture direction recognition model, calculating products of a second category vector in the second fully connected layer with the third feature matrix and the fourth feature matrix, determining probability scores of the text information direction categories of all the subunit pictures according to the calculation results, taking the direction category with the highest probability score as the direction label of each subunit picture, and outputting the direction labels of the subunit pictures; and counting the categories to which the direction labels of the output subunit pictures belong, and taking the direction label with the largest number of categories as the target direction label of the picture to be corrected.
Optionally, the step of performing a correction operation on the picture to be corrected according to the target direction tag, and generating a standard forward picture includes: determining the association angle between the geometric center of the picture to be corrected and the target direction label; and rotating the picture to be corrected by the same angle in the reverse direction of the associated angle of the target direction label according to the associated angle of the target direction label and the geometric center of the picture to be corrected to generate a standard forward picture.
Optionally, after the step of acquiring the first training set, the method further comprises: enhancing the image attributes of the standard direction pictures in the first training set according to a preset first image enhancement algorithm, wherein the image attributes at least comprise one of the following: image brightness, image chroma, image contrast, image sharpness, and image resolution; and enhancing the image quality of the standard direction pictures in the first training set according to a preset second image enhancement algorithm, wherein the second image enhancement algorithm at least comprises one of the following: a Gaussian blur enhancement algorithm, a motion blur enhancement algorithm, and a Gaussian noise enhancement algorithm.
Another aspect of the present invention provides a picture correcting apparatus, including: the acquisition module is used for acquiring a picture to be corrected; the segmentation module is used for segmenting the picture to be corrected to generate a plurality of subunit pictures; the recognition module is used for inputting a plurality of subunit pictures into a pre-trained picture direction recognition model, obtaining a direction label of each subunit picture output by the picture direction recognition model, counting the category to which the direction label of each output subunit picture belongs, and taking the direction label with the largest category number as a target direction label of the picture to be corrected; and the generating module is used for carrying out correction operation on the picture to be corrected according to the target direction label to generate a standard forward picture.
Optionally, the apparatus further comprises a training module, the training module comprising: the acquisition submodule is used for acquiring a first training set, wherein the first training set consists of standard forward pictures; the adjusting submodule is used for determining the size of the standard forward picture, judging whether the size of the standard forward picture meets the preset standard size or not, and if not, adjusting the standard forward picture to the preset standard size; the segmentation submodule is used for carrying out equal proportion segmentation on the adjusted standard forward picture to generate a plurality of subunit pictures with the same size, adding a 0-degree direction label to each subunit picture, and taking the subunit picture with the 0-degree direction label as a second training set; the rotation sub-module is used for respectively rotating all the sub-unit pictures in the second training set by 90 degrees, 180 degrees and 270 degrees in the same rotation direction, taking the rotation angle as a direction label of each sub-unit picture, and taking the sub-unit picture with the direction label after rotation as a third training set; and the training submodule is used for inputting the subunit pictures in the second training set and the third training set into the initial learning model for training to obtain a picture direction recognition model.
Yet another aspect of the present invention provides a computer apparatus, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the picture correction method of any of the above embodiments when executing the computer program.
Yet another aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the picture correction method of any of the above embodiments. Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
According to the picture correcting method, the picture correcting device, the picture correcting equipment and the readable storage medium, after the picture to be corrected is obtained, it is divided to generate a plurality of subunit pictures, so that local areas of the picture to be corrected can be identified in a targeted manner, improving the reading efficiency and recognition accuracy of the picture; all the subunit pictures are input into a pre-trained picture direction recognition model, which outputs the direction labels of all the subunit pictures; the categories to which the output direction labels belong are counted, and the direction label with the largest number of categories is taken as the target direction label of the picture to be corrected; and a correction operation is performed on the picture to be corrected according to the target direction label to generate a standard forward picture. In this embodiment, picture direction is recognized automatically by the pre-trained picture direction recognition model, and pictures can be recognized and corrected in batches, so that correction efficiency and accuracy are greatly improved over manual correction, which handles only one picture at a time; the cost of manual correction is also saved. The present application thereby solves the technical problem in the prior art that manually correcting picture direction has low efficiency and accuracy.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Example one
Fig. 1 shows a flowchart of a picture correcting method according to an embodiment of the present invention, and as shown in fig. 1, the picture correcting method may include steps S1 to S4, where:
and step S1, acquiring the picture to be corrected.
The picture mentioned in this embodiment may be in any form that contains text information. An additional document may be added as enhancement data to improve the accuracy of subsequently identifying the picture to be corrected based on its text information, where the additional document may be a text document in the same language as the text contained in the picture. At the same time, some pictures without text or containing only a small amount of text information are acquired as negative samples for comparison, to detect whether the pictures to be corrected meet the requirements. The manner of acquiring the picture to be corrected is not limited at all: the picture may be imported directly from a digital camera (including reproductions of photographs shot on film), retrieved from an image website, a slide, picture material, an optical disc or network disk, an e-mail, a forum or a blog, captured with the screenshot function of a browser or an input method such as Sogou Pinyin, or drawn by hand in drawing software such as Photoshop. After the corresponding picture to be corrected is obtained, it is stored in a database.
Step S2, the picture to be corrected is divided to generate a plurality of sub-unit pictures.
A complete picture to be corrected contains many local features; reading it as a whole is time-consuming, and each local feature cannot be scanned and identified precisely, which lowers the recognition accuracy of the picture to be corrected. Therefore, after the picture to be corrected is obtained, it is segmented into a plurality of subunit pictures, and each subunit picture is then identified in a targeted manner, improving the reading efficiency and recognition accuracy of the picture. Preferably, the size of the picture to be corrected is determined and compared with a preset standard size; if it does not match, the picture is adjusted to the preset standard size, and the adjusted picture is segmented in equal proportions to generate a plurality of subunit pictures of the same size, effectively improving reading efficiency.
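As an illustration of this segmentation step, the following Python sketch splits a pixel matrix that has already been adjusted to the standard size into equal-sized subunit tiles. The grid size of 4×4 is an assumption for illustration only; the specification does not fix the number of subunits.

```python
def split_into_subunits(pixels, grid=4):
    """Split a pixel matrix (list of rows) into grid*grid equal-sized tiles.

    `pixels` is assumed to be already adjusted to the preset standard size,
    so height and width divide evenly by `grid` (an illustrative choice).
    """
    h, w = len(pixels), len(pixels[0])
    th, tw = h // grid, w // grid          # tile height and width
    tiles = []
    for r in range(grid):                  # row of the grid
        for c in range(grid):              # column of the grid
            tile = [row[c * tw:(c + 1) * tw]
                    for row in pixels[r * th:(r + 1) * th]]
            tiles.append(tile)             # one same-sized subunit picture
    return tiles
```

Each tile can then be fed to the recognition model independently, which is what enables targeted identification of local areas.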
Step S3, inputting a plurality of subunit pictures into a picture direction recognition model which is trained in advance, obtaining a direction label of each subunit picture output by the picture direction recognition model, counting the category to which the direction label of each output subunit picture belongs, and taking the direction label with the largest category number as a target direction label of the picture to be corrected;
The divided subunit pictures are input into a pre-trained picture direction recognition model for a scanning operation, and the output result is a direction label for each subunit picture, where the direction label represents the placement angle of the subunit picture. The number of subunit pictures in each direction-label category is counted, and the direction label belonging to the largest category is taken as the direction label of the picture to be corrected; this indicates that most areas of the picture to be corrected are placed in one direction, so the direction label of the largest category can represent the direction of the picture to be corrected.
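The counting step described above is a simple majority vote over the per-subunit labels, which can be sketched as follows (the label values 0/90/180/270 follow the four direction categories used later in this embodiment):

```python
from collections import Counter

def target_direction(labels):
    """Majority vote: return the direction label shared by the most
    subunit pictures, which then represents the whole picture."""
    return Counter(labels).most_common(1)[0][0]
```

For example, if 13 of 16 subunit pictures are labeled 90 degrees, the target direction label of the picture to be corrected is 90 degrees.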
The picture direction recognition model automatically recognizes the directions of all local areas of the pictures to be corrected; pictures to be corrected can be processed simultaneously in batches with high recognition speed and high recognition accuracy, which solves the technical problem of the low efficiency of recognition by manual experience.
In an optional embodiment of the present invention, a preferable scheme is further provided in which step S3 determines the target direction label of the picture to be corrected through two cascaded convolutional neural networks of the picture direction recognition model. Specifically, step S3 may include the following steps S31 to S35:
step S31, receiving each subunit picture through the convolution layer and the pooling layer of the first convolution neural network of the picture direction identification model, and outputting a first characteristic matrix and a second characteristic matrix corresponding to each subunit picture;
step S32, receiving a first characteristic matrix and a second characteristic matrix through a first full-connection layer of a first convolution neural network of the picture direction recognition model, calculating products of a first class vector in the first full-connection layer and the first characteristic matrix and the second characteristic matrix respectively, and outputting probability scores of character information weights of all sub-unit pictures;
step S33, inputting the subunit pictures with the probability scores of the text information weight larger than the preset threshold value into the convolution layer and the pooling layer of the second convolution neural network of the picture direction identification model, and outputting a third feature matrix and a fourth feature matrix corresponding to each subunit picture;
step S34, receiving a third characteristic matrix and a fourth characteristic matrix through a second full-connection layer of a second convolutional neural network of the picture direction identification model, calculating the product of a second category vector in the second full-connection layer and the third characteristic matrix and the fourth characteristic matrix, determining the probability score of the character information direction category of each subunit picture according to the calculation result, taking the direction category with the highest probability score as the direction label of the subunit picture, and outputting the direction label of the subunit picture;
step S35, counting the categories to which the direction labels of each sub-unit picture output belong, and taking the direction label with the largest number of categories as the target direction label of the picture to be corrected.
The trained picture direction recognition model comprises two parts, namely recognition by the first convolutional neural network and recognition by the second convolutional neural network.
Typically, a convolutional neural network includes convolutional layers, pooling layers, and fully connected layers. In the convolutional layer, the dot product between a region of the input subunit picture and the weight matrix of the filter (convolution kernel) is calculated to produce the first feature matrix; the filter slides across the whole subunit picture, repeating the same dot-product operation until feature extraction is completed for each subunit picture. The pooling layer, also called a down-sampling layer, reduces the spatial dimension of the feature space without reducing the depth; its sliding operation is basically the same as that of the convolutional layer. The fully connected layer flattens the feature matrices obtained from the convolutional and pooling layers.
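The sliding dot product and pooling operations described above can be written out directly. The following minimal Python sketch is a naive, framework-free illustration of a valid-mode convolution and a 2×2 max pooling (no padding, stride 1 for the convolution, stride 2 for the pooling); a real model would of course use an optimized library implementation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and take the dot product of the
    kernel with each image region (the convolutional-layer operation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(kh) for dj in range(kw))
    return out

def max_pool2x2(fmap):
    """2x2 max pooling: halve the spatial dimensions of a feature map
    without reducing the depth (the pooling-layer operation)."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]
```
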
When the first convolutional neural network receives the subunit pictures of a picture, its convolutional layer and pooling layer extract features from each subunit picture to obtain the first feature matrix and the second feature matrix. The first fully connected layer contains two category vectors, namely "contains text information" and "contains no text information"; the feature matrices obtained by the convolutional and pooling layers are each dot-multiplied with the category vectors in the first fully connected layer to obtain a probability vector of the text information weight in each subunit picture. The probability vectors are normalized with a softmax function to obtain probability values, and the category with the maximum probability value indicates the proportion of text information in the corresponding subunit picture. The subunit pictures containing text information are screened out, and from these, the subunit pictures whose probability value is larger than the preset threshold value are determined.
After the second convolutional neural network receives the subunit pictures with text information, its convolutional layer and pooling layer extract features from each subunit picture to obtain the third feature matrix and the fourth feature matrix. The second fully connected layer contains four direction category vectors: 0 degrees, 90 degrees, 180 degrees and 270 degrees. The feature matrices obtained by the convolutional and pooling layers are each subjected to a dot-product operation with the category vectors in the second fully connected layer to obtain a probability vector for each subunit picture, and the probability vectors are normalized with a softmax function to obtain probability values. The direction category with the maximum probability value is the direction label corresponding to each subunit picture, and the direction label of each subunit picture is output.
Counting the category to which the direction label of each output subunit picture belongs, and taking the direction label with the largest category number as a target direction label of the picture to be corrected, wherein the target direction label represents the actual placement direction of the picture to be recognized.
The picture direction recognition model recognizes the picture to be corrected twice through convolutional neural networks, which improves recognition accuracy. At the same time, the first recognition screens the subunit pictures of the picture to be corrected, retaining only those whose text information weight meets the requirement; direction recognition by the second convolutional neural network is then performed on this reduced set, and the smaller number of subunit pictures improves recognition efficiency.
In addition, in another optional embodiment of the present invention, the method is further optimized in how the picture direction recognition model is obtained. Specifically, before step S3 of inputting a plurality of subunit pictures into a pre-trained picture direction recognition model, obtaining a direction label of each subunit picture output by the picture direction recognition model, counting the categories to which the direction labels of the output subunit pictures belong, and taking the direction label with the largest number of categories as the target direction label of the picture to be corrected, the method may further include steps A1 to A5:
step A1, obtaining a first training set, wherein the first training set is composed of standard forward pictures;
step A2, determining the size of the standard forward picture, judging whether the size of the standard forward picture meets the preset standard size, if not, adjusting the standard forward picture to the preset standard size;
step A3, performing equal proportion segmentation on the adjusted standard forward picture to generate a plurality of subunit pictures with the same size, adding a 0-degree direction label to each subunit picture, and taking the subunit picture with the 0-degree direction label as a second training set;
step A4, respectively rotating all the subunit pictures in the second training set by 90 degrees, 180 degrees and 270 degrees in the same rotation direction, taking the rotation angle as the direction label of each subunit picture, and taking the subunit pictures with the direction labels after rotation as a third training set;
and step A5, inputting the subunit pictures in the second training set and the third training set into the initial learning model for training to obtain a picture direction recognition model.
Specifically, the training process for constructing the picture direction recognition model is as follows: a first training set composed of a plurality of standard forward (0-degree placement) pictures containing text information is acquired; the actual size of each standard forward picture is determined and compared with the preset standard size, and if it does not match, the picture is adjusted to the preset standard size; the adjusted standard forward pictures are segmented in equal proportions to generate a plurality of subunit pictures of the same size, and a 0-degree direction label is added to each subunit picture to obtain the second training set; the subunit pictures of the second training set are rotated by 90 degrees, 180 degrees and 270 degrees respectively to obtain subunit pictures in each direction, with the rotation angle taken as the direction label of each subunit picture, to obtain the third training set; and the subunit pictures in the second and third training sets are input into the initial learning model for training to generate the picture direction recognition model.
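The construction of the second and third training sets from steps A3 and A4 can be sketched as follows. Each subunit picture is modeled as a pixel matrix; `rotate90` performs one quarter turn clockwise, an assumed rotation direction (the specification only requires that all rotations use the same direction).

```python
def build_rotation_sets(tiles):
    """Steps A3-A4 sketch: label the standard-forward subunit tiles with
    0 degrees (second training set), then derive 90-, 180- and
    270-degree rotated copies labeled with their rotation angle
    (third training set)."""
    def rotate90(m):
        # one quarter turn clockwise of a pixel matrix
        return [list(row) for row in zip(*m[::-1])]

    second = [(tile, 0) for tile in tiles]
    third = []
    for tile, _ in second:
        rotated = tile
        for angle in (90, 180, 270):
            rotated = rotate90(rotated)      # cumulative rotation = angle
            third.append((rotated, angle))
    return second, third
```

The union of both sets gives the model examples of every orientation it must later distinguish.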
In an optional embodiment of the present invention, a preferable scheme is further provided for step A5, in which the subunit pictures in the second and third training sets are input into the initial learning model for training to obtain the picture direction recognition model. In a specific implementation, the initial learning model performs the following steps A51 to A54 to obtain the picture direction recognition model:
step A51, respectively judging the weight of the text information in the sub-unit pictures in the second training set and the third training set which are input through the first convolution neural network of the initial learning model, and if the weight of the text information in the sub-unit pictures is larger than a first preset threshold value, taking the sub-unit pictures as the output of the first convolution neural network;
step A52, inputting the subunit pictures output by the first convolutional neural network of the initial learning model into the second convolutional neural network of the initial learning model, identifying the direction labels carried by the input subunit pictures through the second convolutional neural network of the initial learning model, and outputting the direction labels of the subunit pictures;
step A53, comparing the direction labels of all the subunit pictures output by the second convolutional neural network of the initial learning model with the direction labels of the subunit pictures in the second training set and the third training set, and judging whether the accuracy of the second convolutional neural network of the initial learning model is greater than a second preset threshold value;
and step A54, when the accuracy of the second convolutional neural network of the initial learning model is greater than a second preset threshold, determining that the training of the initial learning model is finished, and taking the trained initial learning model as a picture direction recognition model.
The initial learning model comprises a first convolutional neural network and a second convolutional neural network. After the subunit pictures of the second and third training sets are input into the first convolutional neural network, it recognizes the text information weight of each input subunit picture; if the text information weight is greater than the first preset threshold value, the corresponding subunit picture is taken as the output of the first convolutional neural network. The second convolutional neural network receives the output of the first convolutional neural network as its input, recognizes the direction labels of the input subunit pictures, and compares the recognized direction labels with the direction labels of the subunit pictures in the second and third training sets to judge whether its recognition accuracy is greater than the second preset threshold value. If the accuracy is greater than the second preset threshold value, the initial learning model has been trained successfully, and the trained initial learning model is taken as the picture direction recognition model.
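The stopping criterion of steps A53 and A54 amounts to comparing predicted labels against the ground-truth labels of the training sets. A minimal sketch follows; the 0.95 threshold is an illustrative assumption, as the specification leaves the second preset threshold value unspecified.

```python
def training_complete(predicted, expected, threshold=0.95):
    """Steps A53-A54 sketch: compute the fraction of subunit pictures
    whose predicted direction label matches the label carried by the
    second and third training sets, and report whether it exceeds the
    second preset threshold (0.95 here is an assumed value)."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    accuracy = correct / len(expected)
    return accuracy > threshold
```
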
In another alternative embodiment of the present invention, the method described above is further optimized for image enhancement of standard forward pictures. Specifically, after the first training set is obtained in step a1, the method may further include steps B1 to B2:
step B1, the image attributes of the standard direction pictures in the first training set are enhanced according to a preset first image enhancement algorithm, and the image attributes at least include one of the following: image brightness, image chroma, image contrast, image sharpness, and image resolution;
step B2, the image quality of the standard direction pictures in the first training set is enhanced according to a preset second image enhancement algorithm, and the second image enhancement algorithm at least comprises one of the following: a Gaussian blur enhancement algorithm, a motion blur enhancement algorithm, and a Gaussian noise enhancement algorithm.
Because the attribute values of the standard direction pictures are all under ideal conditions, while the acquired pictures to be corrected may be blurred, training directly on the standard direction pictures would limit the subsequent recognition effect to pictures with standard attribute values. Therefore, preset image enhancement algorithms are used to enhance the standard forward pictures in the first training set along the two dimensions of image attributes and image quality, thereby improving the training effect of the picture direction recognition model.
It should be noted that steps B1 and B2 both serve to enhance the standard forward pictures with a preset image enhancement algorithm after the pictures are acquired, so as to improve the training effect of the picture direction recognition model, and there is no special requirement on their processing order. In a specific implementation, step B1 may be executed before step B2, step B2 may be executed before step B1, or steps B1 and B2 may be executed at the same time.
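As a rough illustration of the two enhancement dimensions, the following Python sketch treats a grayscale picture as a nested list of pixel intensities: `adjust_brightness` stands in for an attribute enhancement (step B1) and `add_gaussian_noise` for a quality enhancement (step B2). Both function names are illustrative assumptions; a real implementation would use an image-processing library.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale picture: rows of pixel intensities in [0, 255]

def adjust_brightness(img: Image, factor: float) -> Image:
    """Attribute enhancement: scale every pixel by `factor`, clamped to [0, 255]."""
    return [[min(255.0, max(0.0, p * factor)) for p in row] for row in img]

def add_gaussian_noise(img: Image, sigma: float, seed: int = 0) -> Image:
    """Quality enhancement: perturb each pixel with zero-mean Gaussian noise."""
    rng = random.Random(seed)  # fixed seed for reproducible augmentation
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]
```

Applying several such transforms to each standard forward picture multiplies the variety of the training data without changing the direction labels.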
Step S4, a correction operation is performed on the picture to be corrected according to the target direction label to generate a standard forward picture.
After the target direction label of the picture to be corrected is determined, the picture to be corrected is automatically corrected according to the target direction label to obtain a standard forward picture. This saves the cost of manual correction and improves correction efficiency.
In still another alternative embodiment of the present invention, a preferred scheme is further provided for step S4, in which the correction operation is performed on the picture to be corrected according to the target direction label to generate a standard forward picture. Specifically, step S4 may include the following steps S41 to S42:
step S41, determining the association angle between the geometric center of the picture to be corrected and the target direction label;
and step S42, rotating the picture to be corrected by the same angle in the reverse direction of the association angle of the target direction label according to the association angle of the target direction label and the geometric center of the picture to be corrected, and generating a standard forward picture.
After the target direction label of the picture to be corrected is obtained, the associated angle of the target direction label is determined and the geometric center of the picture to be corrected is located; the standard forward picture is then obtained by rotating the picture through the same angle in the reverse direction, which completes the correction process. For example, when the associated angle of the direction label of the picture to be corrected is 90 degrees clockwise, the standard forward picture can be obtained by rotating the picture 90 degrees counterclockwise about its geometric center.
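Steps S41 to S42 can be illustrated with a minimal Python sketch that treats a picture as a nested list of pixels: for every 90 degrees of the clockwise direction label, the picture is rotated 90 degrees counterclockwise, which undoes the detected rotation. The helper names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # picture as a nested list of pixels

def rotate_ccw_90(img: Image) -> Image:
    """Rotate a picture 90 degrees counterclockwise (transpose, then reverse rows)."""
    return [list(row) for row in zip(*img)][::-1]

def correct_picture(img: Image, label_angle_cw: int) -> Image:
    """Undo the clockwise rotation named by the target direction label (0/90/180/270)."""
    out = img
    for _ in range((label_angle_cw % 360) // 90):
        out = rotate_ccw_90(out)
    return out
```

For instance, a picture that was rotated 90 degrees clockwise from its standard orientation is restored by `correct_picture(img, 90)`.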
According to the picture correcting method provided by the invention, after the picture to be corrected is obtained, it is divided to generate a plurality of subunit pictures, so that local areas of the picture can be recognized in a targeted manner, improving the reading efficiency and recognition accuracy; all the subunit pictures are input into the pre-trained picture direction recognition model, which outputs a direction label for each subunit picture; the categories to which the output direction labels belong are counted, and the direction label with the largest category count is taken as the target direction label of the picture to be corrected; finally, the picture to be corrected is corrected according to the target direction label to generate a standard forward picture. In this embodiment, the picture direction is recognized automatically on the basis of the pre-trained picture direction recognition model, and pictures can be recognized and corrected in batches, so that correction efficiency and accuracy are greatly improved. This is necessarily faster than manual correction, which handles only one picture at a time, and it also saves the cost of manual correction. The present application thereby solves the technical problem in the prior art that manually correcting the direction of pictures has low efficiency and accuracy.
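The counting step above, taking the direction label output for the most subunit pictures as the target direction label, reduces to a simple majority vote; a Python sketch using `collections.Counter` follows (the function name is illustrative):

```python
from collections import Counter
from typing import List

def target_direction_label(subunit_labels: List[int]) -> int:
    """Majority vote: the direction label with the largest category count wins."""
    counts = Counter(subunit_labels)
    # most_common(1) returns the (label, count) pair with the highest count.
    return counts.most_common(1)[0][0]
```

Voting over many subunit pictures makes the result robust: a few misrecognized subunits cannot change the target direction label.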
Example two
The second embodiment of the present invention further provides a picture correcting device, which corresponds to the picture correcting method provided by the first embodiment, and corresponding technical features and technical effects are not described in detail in this embodiment, and reference may be made to the first embodiment for relevant points. Specifically, fig. 2 shows a block diagram of a picture correction apparatus. As shown in fig. 2, the picture correcting apparatus 200 includes an obtaining module 201, a dividing module 202, an identifying module 203, and a generating module 204, wherein:
an obtaining module 201, configured to obtain a picture to be corrected;
the segmentation module 202 is connected to the acquisition module 201, and is configured to segment a picture to be corrected to generate a plurality of sub-unit pictures;
the recognition module 203 is connected with the segmentation module 202, and is configured to input a plurality of subunit pictures into a pre-trained picture direction recognition model, obtain a direction label of each subunit picture output by the picture direction recognition model, count categories to which the direction labels of each output subunit picture belong, and take the direction label with the largest number of categories as a target direction label of a picture to be corrected;
the generating module 204 is connected to the identifying module 203, and is configured to perform a correction operation on the picture to be corrected according to the target direction tag, and generate a standard forward picture.
Optionally, the apparatus further comprises a training module, the training module comprising: the acquisition submodule is used for acquiring a first training set, wherein the first training set consists of standard forward pictures; the adjusting submodule is used for determining the size of the standard forward picture, judging whether the size of the standard forward picture meets the preset standard size or not, and if not, adjusting the standard forward picture to the preset standard size; the segmentation submodule is used for carrying out equal proportion segmentation on the adjusted standard forward picture to generate a plurality of subunit pictures with the same size, adding a 0-degree direction label to each subunit picture, and taking the subunit picture with the 0-degree direction label as a second training set; the rotation sub-module is used for respectively rotating all the sub-unit pictures in the second training set by 90 degrees, 180 degrees and 270 degrees in the same rotation direction, taking the rotation angle as a direction label of each sub-unit picture, and taking the sub-unit picture with the direction label after rotation as a third training set; and the training submodule is used for inputting the subunit pictures in the second training set and the third training set into the initial learning model for training to obtain a picture direction recognition model.
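The segmentation and rotation submodules can be illustrated with a small Python sketch that represents a picture as a nested list of pixels: the picture is split into equal-sized tiles, and each tile is emitted four times with direction labels 0, 90, 180, and 270 degrees, corresponding to the second and third training sets. The helper names are hypothetical, and the sketch assumes the picture has already been adjusted to a size divisible by the tile size.

```python
from typing import List, Tuple

Image = List[List[int]]

def split_into_tiles(img: Image, tile: int) -> List[Image]:
    """Equal-proportion segmentation into tile-by-tile subunit pictures.
    Assumes height and width are multiples of `tile` (after size adjustment)."""
    h, w = len(img), len(img[0])
    return [[row[c:c + tile] for row in img[r:r + tile]]
            for r in range(0, h, tile) for c in range(0, w, tile)]

def rotate_cw_90(img: Image) -> Image:
    """Rotate a subunit picture 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def build_training_sets(img: Image, tile: int) -> List[Tuple[Image, int]]:
    """Second training set (label 0) plus third training set (labels 90/180/270)."""
    samples = []
    for sub in split_into_tiles(img, tile):
        rotated = sub
        for angle in (0, 90, 180, 270):
            samples.append((rotated, angle))
            rotated = rotate_cw_90(rotated)  # same rotation direction each time
    return samples
```

Each subunit thus contributes one labeled sample per rotation angle, quadrupling the training data from each standard forward picture.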
Optionally, the initial learning model obtains the picture direction recognition model through the following training mode: judging the weight of the text information in each input subunit picture of the second training set and the third training set through the first convolutional neural network of the initial learning model, and if the text information weight of a subunit picture is greater than a first preset threshold value, taking that subunit picture as the output of the first convolutional neural network; inputting the subunit pictures output by the first convolutional neural network into the second convolutional neural network of the initial learning model, recognizing the direction labels carried by the input subunit pictures through the second convolutional neural network, and outputting the direction labels of all the subunit pictures; comparing the direction labels of all the subunit pictures output by the second convolutional neural network with the direction labels of the subunit pictures in the second training set and the third training set, and judging whether the accuracy of the second convolutional neural network is greater than a second preset threshold value; and when the accuracy of the second convolutional neural network is greater than the second preset threshold value, determining that the training of the initial learning model is finished, and taking the trained initial learning model as the picture direction recognition model.
Optionally, when executing the step of inputting a plurality of subunit pictures into the pre-trained picture direction recognition model, obtaining the direction label of each subunit picture output by the model, counting the categories to which the output direction labels belong, and taking the direction label with the largest category count as the target direction label of the picture to be corrected, the recognition module is specifically configured to: receive each subunit picture through the convolutional layer and the pooling layer of the first convolutional neural network of the picture direction recognition model, and output a first feature matrix and a second feature matrix corresponding to each subunit picture; receive the first feature matrix and the second feature matrix through the first fully connected layer of the first convolutional neural network, calculate the products of a first class vector in the first fully connected layer with the first feature matrix and the second feature matrix respectively, and output the probability score of the text information weight of each subunit picture; input the subunit pictures whose text information weight probability scores are greater than a preset threshold value into the convolutional layer and the pooling layer of the second convolutional neural network of the picture direction recognition model, and output a third feature matrix and a fourth feature matrix corresponding to each subunit picture; receive the third feature matrix and the fourth feature matrix through the second fully connected layer of the second convolutional neural network, calculate the products of a second class vector in the second fully connected layer with the third feature matrix and the fourth feature matrix, determine the probability score of each text information direction category for each subunit picture according to the calculation results, take the direction category with the highest probability score as the direction label of the subunit picture, and output the direction labels of the subunit pictures; and count the categories to which the output direction labels belong, taking the direction label with the largest category count as the target direction label of the picture to be corrected.
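The fully connected scoring described above can be illustrated in simplified form by the following Python sketch: it treats the pooled features as a single vector rather than two feature matrices, computes the product of each class vector with the feature vector, and converts the results into probability scores with a softmax, then takes the highest-scoring category as the label. This is a sketch under those simplifying assumptions, not the exact computation of the embodiment.

```python
import math
from typing import List

def class_scores(feature: List[float],
                 class_vectors: List[List[float]]) -> List[float]:
    """Fully connected layer as dot products of each class vector with the
    feature vector, turned into probability scores with a softmax."""
    logits = [sum(w * f for w, f in zip(vec, feature)) for vec in class_vectors]
    peak = max(logits)                         # subtract max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predicted_label(feature: List[float], class_vectors: List[List[float]],
                    labels: List[int]) -> int:
    """Take the direction category with the highest probability score."""
    scores = class_scores(feature, class_vectors)
    return labels[scores.index(max(scores))]
```

The same scoring pattern serves both stages: in the first network the "categories" are text-weight levels, and in the second they are the four direction labels.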
Optionally, the generating module is specifically configured to, when executing the step of performing a correction operation on the picture to be corrected according to the target direction tag to generate a standard forward picture: determining the association angle between the geometric center of the picture to be corrected and the target direction label; and rotating the picture to be corrected by the same angle in the reverse direction of the associated angle of the target direction label according to the associated angle of the target direction label and the geometric center of the picture to be corrected to generate a standard forward picture.
Optionally, the apparatus further comprises an image enhancement module, the image enhancement module comprising: a first image enhancement submodule, configured to enhance the image attributes of the standard forward pictures in the first training set according to a preset first image enhancement algorithm, wherein the image attributes include at least one of the following: image brightness, image chroma, image contrast, image sharpness, and image resolution; and a second image enhancement submodule, configured to enhance the image quality of the standard forward pictures in the first training set according to a preset second image enhancement algorithm, wherein the second image enhancement algorithm includes at least one of the following: a Gaussian blur enhancement algorithm, a motion blur enhancement algorithm, and a Gaussian noise enhancement algorithm.
EXAMPLE III
Fig. 3 shows a block diagram of a computer device suitable for implementing the picture correction method according to the third embodiment of the present invention. In this embodiment, the computer device 300 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers) that executes programs. As shown in fig. 3, the computer device 300 of this embodiment includes, at least but not limited to: a memory 301, a processor 302, and a network interface 303, which may be communicatively coupled to each other via a system bus. It is noted that fig. 3 only shows the computer device 300 with components 301-303, but it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead.
In this embodiment, the memory 301 includes at least one type of computer-readable storage medium, which includes flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 301 may be an internal storage unit of the computer device 300, such as a hard disk or memory of the computer device 300. In other embodiments, the memory 301 may also be an external storage device of the computer device 300, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the computer device 300. Of course, the memory 301 may also include both the internal storage unit and the external storage device of the computer device 300. In this embodiment, the memory 301 is generally used for storing the operating system installed in the computer device 300 and various types of application software, such as the program code of the picture correction method.
The processor 302 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip in some embodiments. The processor 302 generally serves to control the overall operation of the computer device 300, for example, performing control and processing related to data interaction or communication with the computer device 300. In this embodiment, the processor 302 is configured to execute the program code of the steps of the picture correction method stored in the memory 301.
In this embodiment, the picture correcting method stored in the memory 301 can be further divided into one or more program modules and executed by one or more processors (in this embodiment, the processor 302) to complete the present invention.
The network interface 303 may comprise a wireless network interface or a wired network interface, and is typically used to establish communication links between the computer device 300 and other computer devices. For example, the network interface 303 is used to connect the computer device 300 to an external terminal via a network, and to establish a data transmission channel and a communication link between the computer device 300 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
Example four
The present embodiment also provides a computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application mall, etc., on which a computer program is stored, which when executed by a processor implements the steps of the picture correcting method.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that described herein. Alternatively, they may be separately fabricated as individual integrated circuit modules, or multiple of them may be fabricated as a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
It should be noted that the numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.