Detailed Description of the Embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary; they are intended to explain the present invention and shall not be construed as limiting it.
The image processing method and apparatus of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of an image processing method according to an embodiment of the present invention.
As shown in Fig. 1, the image processing method includes the following steps.
S101: obtain an input picture.
Specifically, a picture that needs OCR recognition is obtained. For example, a user reading a book in a library finds a page whose content he or she likes, photographs the page with a mobile phone, and wants to obtain the text in the picture through OCR and then edit it. The photo taken by the user can thus serve as the input picture.
S102: extract features of the picture.
Specifically, after the input picture is obtained, features of the picture are extracted. The features of the picture include morphological features and texture features.
The morphological features include one or more of the aspect ratio, the area convexity ratio, the perimeter convexity ratio, the sphericity, the eccentricity, and the rotation angle of the picture. The texture features include one or more of the gradient dominance, the gray-level distribution, the gradient distribution, the gray-level mean, the gradient mean, the gray-level mean square deviation, and the gradient mean square deviation.
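For illustration only, the following is a minimal Python sketch of computing a few of the listed features with OpenCV and NumPy; the function name and the choice of Sobel gradients are assumptions made for this example and are not prescribed by the embodiment.

```python
import cv2
import numpy as np

def extract_features(image_path):
    # Minimal sketch: compute a subset of the morphological and texture features.
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    # Gradient magnitude via Sobel filters, used for the gradient statistics below.
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1)
    grad = np.sqrt(gx ** 2 + gy ** 2)
    return {
        "aspect_ratio": w / h,            # morphological: aspect ratio of the picture
        "gray_mean": float(img.mean()),   # texture: gray-level mean
        "gray_std": float(img.std()),     # texture: gray-level mean square deviation
        "grad_mean": float(grad.mean()),  # texture: gradient mean
        "grad_std": float(grad.std()),    # texture: gradient mean square deviation
    }
```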
S103: input the features into a plurality of prediction models in sequence, and judge, according to each prediction model, whether to apply the preprocessing mode corresponding to that prediction model.
Each prediction model is used to judge whether to apply the preprocessing mode corresponding to that prediction model. The preprocessing modes may include direction correction, keystone correction, deblurring, white-noise removal, sharpening, contrast adjustment, shadow and brightness processing, and the like.
It should be understood that, in the embodiments of the present invention, the prediction models correspond to the preprocessing modes. That is, each preprocessing mode has a corresponding prediction model. For example, the direction-correction preprocessing mode has a corresponding direction-correction prediction model, which is used to judge whether direction-correction preprocessing should be applied to the picture so as to correct its orientation.
Specifically, the extracted features are input into the plurality of prediction models in sequence, and whether to apply the preprocessing mode corresponding to each prediction model is judged according to that prediction model.
For example, the extracted features of the picture may be input in sequence into the direction-correction prediction model, the keystone-correction prediction model, the deblurring prediction model, the white-noise-removal prediction model, the sharpening prediction model, the contrast-adjustment prediction model, and the shadow-and-brightness-processing prediction model. Through these prediction models it is judged whether to apply, to the picture, the corresponding direction-correction preprocessing, keystone-correction preprocessing, deblurring, white-noise-removal preprocessing, sharpening, contrast-adjustment preprocessing, and shadow and brightness processing. In other words, the optimal combination of preprocessing modes is determined by the prediction models.
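For illustration only, a minimal Python sketch of this selection step is given below; it assumes that each trained prediction model exposes a scikit-learn-style predict() method returning 1 when the corresponding preprocessing mode should be applied, and the mode names are merely illustrative.

```python
def select_preprocessing(features, models):
    """Step S103 sketch: `models` maps a preprocessing-mode name (e.g.
    "direction_correction", "deblur") to its trained binary prediction model."""
    selected = []
    for mode, model in models.items():          # features are fed to each model in turn
        if model.predict([features])[0] == 1:   # 1 -> apply this preprocessing mode
            selected.append(mode)
    return selected
```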
It should be noted that the order in which the picture features are input into the prediction models can be set according to actual needs and is not limited here.
In addition, before the features of the picture are input into the prediction models, the prediction models can be trained. The training process of a prediction model is shown in Fig. 2 and includes the following steps.
S201: obtain picture samples.
Specifically, a large number of picture samples that can be used for OCR recognition are obtained.
S202: apply the corresponding preprocessing mode to each picture sample to obtain a preprocessing result of the picture sample.
After the picture samples are obtained, the corresponding preprocessing mode is applied to each picture sample to obtain its preprocessing result. The preprocessing modes may include, but are not limited to, direction correction, keystone correction, deblurring, white-noise removal, sharpening, contrast adjustment, shadow and brightness processing, and the like.
S203: perform OCR analysis on the picture sample and on the preprocessed picture sample respectively, to obtain a first result and a second result.
After the preprocessing result of a picture sample is obtained, OCR analysis is performed on the picture sample and on the corresponding preprocessed picture sample respectively, to obtain a first result and a second result. The first result is the result of performing OCR analysis on the picture sample, and the second result is the result of performing OCR analysis on the preprocessed picture sample. For example, when the direction-correction prediction model is trained, the first result is the result of performing OCR analysis directly on the picture sample, and the second result is the result of performing OCR analysis on the picture preprocessed by direction correction.
S204: compare the first result with the second result, and label the picture sample according to the comparison.
Specifically, the first result is compared with the second result. When the first result is better than the second result, the picture sample is labeled as not using the preprocessing mode; when the second result is better than the first result, the picture sample is labeled as using the preprocessing mode. That is to say, when the OCR analysis result obtained without the preprocessing mode is better than the OCR analysis result obtained with it, the picture sample is labeled as not using the preprocessing mode; when the OCR analysis result obtained with the preprocessing mode is better than the OCR analysis result obtained without it, the picture sample is labeled as using the preprocessing mode.
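For illustration only, a minimal Python sketch of steps S203 and S204 for one picture sample and one preprocessing mode follows; the embodiment does not prescribe how one OCR result is judged "better" than another, so the sketch assumes a caller-supplied accuracy function scored against ground-truth text.

```python
def label_sample(ocr_fn, accuracy_fn, raw_img, preprocessed_img, ground_truth):
    """Return 1 if the preprocessed sample yields the better OCR result, else 0."""
    first_result = accuracy_fn(ocr_fn(raw_img), ground_truth)            # without preprocessing
    second_result = accuracy_fn(ocr_fn(preprocessed_img), ground_truth)  # with preprocessing
    return 1 if second_result > first_result else 0
```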
S205: input the labeled picture samples into the prediction model to train the prediction model.
The labeled picture samples are input into the initial prediction model corresponding to the preprocessing mode, and the initial prediction model is trained in a machine-learning manner based on logistic regression or random forests.
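As a sketch of this training step, assuming scikit-learn is used (the library choice and hyperparameters are assumptions for illustration, not part of the embodiment):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

def train_prediction_model(feature_vectors, labels, use_random_forest=False):
    """Train the prediction model of one preprocessing mode from the labeled samples.
    feature_vectors: one morphological/texture feature vector per picture sample.
    labels: 1 if the sample was labeled "use this preprocessing mode", else 0."""
    if use_random_forest:
        model = RandomForestClassifier(n_estimators=100)
    else:
        model = LogisticRegression(max_iter=1000)
    model.fit(feature_vectors, labels)
    return model
```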
Further, in order to improve the accuracy of the judgments of the prediction model, the trained prediction model can be validated and optimized. Specifically, the obtained picture samples can be divided into two parts, for example in a ratio of 80% to 20%: 80% of the picture samples are used to train the prediction model, and 20% of the picture samples are used to validate and optimize the trained prediction model. By validating and optimizing the prediction model, the accuracy of its judgments is improved.
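Continuing the scikit-learn sketch above (again an assumption, reusing feature_vectors and labels from that sketch and taking the split ratio from this example), the 80/20 division and validation might look as follows:

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Split the labeled picture samples into 80% for training and 20% for validation.
X_train, X_val, y_train, y_val = train_test_split(
    feature_vectors, labels, test_size=0.2, random_state=0)

model = train_prediction_model(X_train, y_train)
val_accuracy = accuracy_score(y_val, model.predict(X_val))  # accuracy on the held-out 20%
print(f"validation accuracy: {val_accuracy:.3f}")
```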
It should be noted that the proportion between the picture samples used for training the prediction model and those used for validating and optimizing the prediction model can be set according to actual needs and is not limited here.
S104: if the preprocessing mode corresponding to a prediction model is to be applied, preprocess the picture and then perform OCR recognition on the picture.
Specifically, the features of the picture are input into the plurality of prediction models in sequence; if it is judged that the preprocessing mode corresponding to a prediction model should be applied, the corresponding preprocessing is performed on the picture, and OCR recognition is then performed on the preprocessed picture.
As an example, the extracted features of a picture are input in sequence into seven prediction models: the direction-correction prediction model, the keystone-correction prediction model, the deblurring prediction model, the white-noise-removal prediction model, the sharpening prediction model, the contrast-adjustment prediction model, and the shadow-and-brightness-processing prediction model. Assume the judgments of the seven prediction models are, in order: apply direction correction to the picture, do not apply keystone correction, apply deblurring, apply white-noise removal, do not apply sharpening, do not apply contrast adjustment, and apply shadow and brightness processing. That is, four preprocessing modes can be applied to the picture: direction correction, deblurring, white-noise removal, and shadow and brightness processing. Therefore, according to the judgment results, direction correction, deblurring, white-noise removal, and shadow and brightness processing are performed on the picture. After the picture is preprocessed, OCR recognition is performed on it.
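For illustration only, a minimal Python sketch of step S104 follows; the individual preprocessing operations and the OCR engine are left as caller-supplied callables, since the embodiment does not prescribe particular implementations of them.

```python
def process_and_recognize(img, selected_modes, preprocessors, ocr_fn):
    """Apply the preprocessing modes chosen by the prediction models, then run OCR.
    preprocessors: dict mapping a mode name to a function(image) -> image.
    selected_modes: e.g. ["direction_correction", "deblur",
                          "white_noise_removal", "shadow_brightness"]."""
    for mode in selected_modes:        # the application order can be set as needed
        img = preprocessors[mode](img)
    return ocr_fn(img)                 # OCR recognition on the preprocessed picture
```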
It should be noted that, when the prediction models judge that a plurality of preprocessing modes can be applied to the picture, the order in which these preprocessing modes are applied can be set according to actual needs and is not limited here.
In this way, after the plurality of prediction models select the optimal combination of preprocessing modes, the picture is preprocessed according to the selected preprocessing modes, thereby improving the accuracy of the OCR recognition result.
To sum up, in the image processing method of the embodiments of the present invention, the features of a picture are input into a plurality of prediction models in sequence, and whether to apply the preprocessing mode corresponding to each prediction model is judged; if a preprocessing mode corresponding to a prediction model is to be applied, the picture is preprocessed and OCR recognition is then performed on it. By selecting the best combination of preprocessing modes through the plurality of prediction models and preprocessing the picture accordingly, the method improves the accuracy of the OCR recognition result.
The picture processing apparatus proposed by the embodiments of the present invention is described in detail below with reference to Fig. 3. Fig. 3 is a schematic structural diagram of a picture processing apparatus according to an embodiment of the present invention.
As shown in Fig. 3, the picture processing apparatus may include an acquisition module 310, an extraction module 320, a judgment module 330, and a processing module 340.
The acquisition module 310 is configured to obtain an input picture.
Specifically, the acquisition module 310 obtains a picture that needs OCR recognition. For example, a user reading a book in a library finds a page whose content he or she likes, photographs the page with a mobile phone, and wants to obtain the text in the picture through OCR and then edit it. The photo taken by the user can thus serve as the input picture.
The extraction module 320 is configured to extract features of the picture.
Specifically, after the acquisition module 310 obtains the input picture, the extraction module 320 extracts features of the picture. The features of the picture include morphological features and texture features.
The morphological features include one or more of the aspect ratio, the area convexity ratio, the perimeter convexity ratio, the sphericity, the eccentricity, and the rotation angle of the picture. The texture features include one or more of the gradient dominance, the gray-level distribution, the gradient distribution, the gray-level mean, the gradient mean, the gray-level mean square deviation, and the gradient mean square deviation.
The judgment module 330 is configured to input the features into a plurality of prediction models in sequence, and to judge, according to each prediction model, whether to apply the preprocessing mode corresponding to that prediction model.
Each prediction model is used to judge whether to apply the preprocessing mode corresponding to that prediction model. The preprocessing modes may include direction correction, keystone correction, deblurring, white-noise removal, sharpening, contrast adjustment, shadow and brightness processing, and the like.
It should be understood that, in the embodiments of the present invention, the prediction models correspond to the preprocessing modes. That is, each preprocessing mode has a corresponding prediction model. For example, the direction-correction preprocessing mode has a corresponding direction-correction prediction model, which is used to judge whether direction-correction preprocessing should be applied to the picture so as to correct its orientation.
Specifically, the judgment module 330 is configured to input the extracted features into the plurality of prediction models in sequence, and to judge, according to each prediction model, whether to apply the preprocessing mode corresponding to that prediction model.
For example, the extracted features of the picture may be input in sequence into the direction-correction prediction model, the keystone-correction prediction model, the deblurring prediction model, the white-noise-removal prediction model, the sharpening prediction model, the contrast-adjustment prediction model, and the shadow-and-brightness-processing prediction model. Through these prediction models, the judgment module 330 judges whether to apply, to the picture, the corresponding direction-correction preprocessing, keystone-correction preprocessing, deblurring, white-noise-removal preprocessing, sharpening, contrast-adjustment preprocessing, and shadow and brightness processing. In other words, the optimal combination of preprocessing modes is determined by the prediction models.
It should be noted that the order in which the picture features are input into the prediction models can be set according to actual needs and is not limited here.
In addition, as shown in Fig. 4, before the features of the picture are input into the prediction models, the prediction models can be trained by a training module 350. The training module 350 includes an acquiring unit 351, a preprocessing unit 352, an analysis unit 353, a labeling unit 354, and a training unit 355.
The acquiring unit 351 is configured to obtain picture samples.
The preprocessing unit 352 is configured to apply the corresponding preprocessing mode to each picture sample to obtain a preprocessing result of the picture sample.
The analysis unit 353 is configured to perform OCR analysis on the picture sample and on its preprocessing result respectively, to obtain a first result and a second result.
The labeling unit 354 is configured to label the picture sample as not using the preprocessing mode when the first result is better than the second result, and to label the picture sample as using the preprocessing mode when the second result is better than the first result.
The training unit 355 is configured to input the labeled picture samples into the prediction model to train the prediction model.
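For illustration only, the following Python sketch shows how the modules of Fig. 3 might be composed into one apparatus object; the class and method names are assumptions made for this example, not limitations of the embodiment.

```python
class PictureProcessingDevice:
    """Sketch of the module structure of the picture processing apparatus."""

    def __init__(self, acquisition, extraction, judgment, processing):
        self.acquisition_module = acquisition  # 310: obtains the input picture
        self.extraction_module = extraction    # 320: extracts morphological/texture features
        self.judgment_module = judgment        # 330: runs the prediction models
        self.processing_module = processing    # 340: preprocesses the picture and runs OCR

    def run(self, source):
        picture = self.acquisition_module.acquire(source)
        features = self.extraction_module.extract(picture)
        selected_modes = self.judgment_module.judge(features)
        return self.processing_module.process(picture, selected_modes)
```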
The training process of the prediction model is shown in Fig. 2 and includes the following steps.
S201: obtain picture samples.
Specifically, the acquiring unit 351 obtains a large number of picture samples that can be used for OCR recognition.
S202: apply the corresponding preprocessing mode to each picture sample to obtain a preprocessing result of the picture sample.
After the picture samples are obtained, the preprocessing unit 352 applies the corresponding preprocessing mode to each picture sample to obtain its preprocessing result. The preprocessing modes may include, but are not limited to, direction correction, keystone correction, deblurring, white-noise removal, sharpening, contrast adjustment, shadow and brightness processing, and the like.
S203: perform OCR analysis on the picture sample and on the preprocessed picture sample respectively, to obtain a first result and a second result.
After the preprocessing result of a picture sample is obtained, the analysis unit 353 performs OCR analysis on the picture sample and on the preprocessed picture sample respectively, to obtain a first result and a second result. The first result is the result of performing OCR analysis on the picture sample, and the second result is the result of performing OCR analysis on the preprocessed picture sample. For example, when the direction-correction prediction model is trained, the first result is the result of performing OCR analysis directly on the picture sample, and the second result is the result of performing OCR analysis on the picture preprocessed by direction correction.
S204: compare the first result with the second result, and label the picture sample according to the comparison.
Specifically, the first result is compared with the second result. When the first result is better than the second result, the labeling unit 354 labels the picture sample as not using the preprocessing mode; when the second result is better than the first result, the labeling unit 354 labels the picture sample as using the preprocessing mode. That is to say, when the OCR analysis result obtained without the preprocessing mode is better than the OCR analysis result obtained with it, the labeling unit 354 labels the picture sample as not using the preprocessing mode; when the OCR analysis result obtained with the preprocessing mode is better than the OCR analysis result obtained without it, the labeling unit 354 labels the picture sample as using the preprocessing mode.
S205: input the labeled picture samples into the prediction model to train the prediction model.
The labeled picture samples are input into the initial prediction model corresponding to the preprocessing mode, and the training unit 355 may train the initial prediction model in a machine-learning manner based on logistic regression or random forests.
Further, in order to improve the accuracy of the judgments of the prediction model, the trained prediction model can be validated and optimized. Specifically, the obtained picture samples can be divided into two parts, for example in a ratio of 80% to 20%: 80% of the picture samples are used to train the prediction model, and 20% of the picture samples are used to validate and optimize the trained prediction model. By validating and optimizing the prediction model, the accuracy of its judgments is improved.
It should be noted that the proportion between the picture samples used for training the prediction model and those used for validating and optimizing the prediction model can be set according to actual needs and is not limited here.
The processing module 340 is configured to, if the preprocessing mode corresponding to a prediction model is to be applied, preprocess the picture and then perform OCR recognition on the picture.
Specifically, the features of the picture are input into the plurality of prediction models in sequence; if it is judged that the preprocessing mode corresponding to a prediction model should be applied, the processing module 340 performs the corresponding preprocessing on the picture and then performs OCR recognition on the preprocessed picture.
As an example, the extracted features of a picture are input in sequence into seven prediction models: the direction-correction prediction model, the keystone-correction prediction model, the deblurring prediction model, the white-noise-removal prediction model, the sharpening prediction model, the contrast-adjustment prediction model, and the shadow-and-brightness-processing prediction model. Assume the judgments of the seven prediction models are, in order: apply direction correction to the picture, do not apply keystone correction, apply deblurring, apply white-noise removal, do not apply sharpening, do not apply contrast adjustment, and apply shadow and brightness processing. That is, the processing module 340 can apply four preprocessing modes to the picture: direction correction, deblurring, white-noise removal, and shadow and brightness processing. Therefore, according to the judgment results, the processing module 340 performs direction correction, deblurring, white-noise removal, and shadow and brightness processing on the picture. After preprocessing the picture, the processing module 340 performs OCR recognition on it.
It should be noted that, when the prediction models judge that a plurality of preprocessing modes can be applied to the picture, the order in which these preprocessing modes are applied can be set according to actual needs and is not limited here.
In this way, after the plurality of prediction models select the optimal combination of preprocessing modes, the picture is preprocessed according to the selected preprocessing modes, thereby improving the accuracy of the OCR recognition result.
To sum up, in the picture processing apparatus of the embodiments of the present invention, the features of a picture are input into a plurality of prediction models in sequence, and whether to apply the preprocessing mode corresponding to each prediction model is judged; if a preprocessing mode corresponding to a prediction model is to be applied, the picture is preprocessed and OCR recognition is then performed on it. Through the plurality of prediction models, the apparatus selects from the multiple preprocessing modes the best combination with which to preprocess the picture, thereby improving the accuracy of the OCR recognition result.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features concerned. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic references to the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may, without mutual contradiction, combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art can make changes, modifications, replacements, and variations to the above embodiments within the scope of the present invention.