CN110245545A - A kind of character recognition method and device - Google Patents

Character recognition method and device

Info

Publication number
CN110245545A
Authority
CN
China
Prior art keywords
proposal box
candidate
text
model
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811126275.4A
Other languages
Chinese (zh)
Inventor
任宇鹏
卢维
殷俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN201811126275.4A
Publication of CN110245545A
Legal status: Pending (current)


Abstract

The invention discloses a character recognition method and device, for solving the problem that text in images is recognized with low accuracy. The method comprises: inputting an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtaining the position of each proposal box in the image together with a first score indicating whether the content of each proposal box is text; selecting as candidate proposal boxes those whose score exceeds a preset scoring threshold; merging the candidate proposal boxes according to their positions to obtain target proposal boxes; and inputting each target proposal box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and recognizing the text contained in each target proposal box.

Description

Character recognition method and device
Technical field
The present invention relates to the fields of deep learning and character recognition technology, and in particular to a character recognition method and device.
Background technique
With the rapid development of image capture devices, ever more image information needs to be managed. Automating the management of image information with Internet technology is currently the most effective means of doing so.
Before the text in an image can be recognized, it must first be located. Current text localization methods fall broadly into two categories. The first is bounding-box regression based on networks such as Faster RCNN (Faster Region Convolutional Neural Networks), YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector); such methods directly output text-line scores and bounding boxes. The second is segmentation based on fully convolutional networks (Fully Convolutional Networks, FCN); such methods predict pixel-level text classification results and post-process them into enclosing rectangles. The Faster RCNN method, which offers both high speed and high precision, uses a Region Proposal Networks (RPN) method to generate candidate boxes for different text regions on the convolutional feature map, and then classifies each candidate region and regresses its bounding box with a neural network. However, because text-line length varies drastically, conventional candidate-box schemes struggle to localize such objects accurately; at the same time, because of computational cost and real-time constraints, the required precision cannot be met simply by multiplying the candidate-box sizes and shapes, so the existing RPN scheme needs improvement.
In the field of image character recognition, the closest existing implementation to the present invention is the patent "A complex character recognition method based on deep learning" filed by Chengdu Shulian Mingpin Technology Co., Ltd. That scheme recognizes single characters with a single convolutional neural network and does not take into account the context and semantic information carried by a character sequence, so its recognition accuracy is low.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a character recognition method and device, for solving the problem that text in images is recognized with low accuracy.
An embodiment of the invention provides a character recognition method, comprising:
inputting an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtaining the position of each proposal box in the image together with a first score indicating whether the content of each proposal box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine window features, and predicts proposal boxes at each window position according to preset widths and heights; each row of window features of the feature map is fed as a sequence into the recurrent neural network, which outputs the position of each proposal box in the image and the first score indicating whether its content is text;
selecting as candidate proposal boxes those whose first score exceeds a preset scoring threshold;
merging the candidate proposal boxes according to their positions to obtain target proposal boxes;
inputting each target proposal box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and recognizing the text contained in each target proposal box.
Further, before the image containing the text to be recognized is input into the pre-trained first model comprising a convolutional neural network and a recurrent neural network, the method also includes:
processing the image with threshold segmentation and connected-component analysis;
and performing text-orientation correction on the processed image.
Further, merging the candidate proposal boxes according to their positions to obtain target proposal boxes includes:
for a first candidate proposal box among the candidate proposal boxes, determining whether there exists a second candidate proposal box whose horizontal distance from the first candidate proposal box is less than a preset first threshold, whose vertical overlap with it exceeds a preset second threshold, and whose shape similarity to it exceeds a preset third threshold; if so, merging the first candidate proposal box and the second candidate proposal box into a new first candidate proposal box; if not, taking the first candidate proposal box as a target proposal box.
Further, determining the vertical overlap includes:
according to the first height and first vertical coordinate of the first candidate proposal box and the second height and second vertical coordinate of the second candidate proposal box, determining the vertical overlap with the formula overlap = |yA2 - yD1| / min(h1, h2), where yA2 is the second vertical coordinate of the second candidate proposal box, yD1 is the first vertical coordinate of the first candidate proposal box, and h1 and h2 are the first height of the first candidate proposal box and the second height of the second candidate proposal box respectively.
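As an illustration, the overlap formula can be sketched in Python. The helper below is a hypothetical reading of the patent's notation in which |yA2 - yD1| denotes the length of the vertical segment the two boxes share, so the ratio becomes the shared vertical extent divided by the smaller height:

```python
def vertical_overlap(y1, h1, y2, h2):
    """Vertical overlap of two boxes given as (top-edge y, height).

    Interprets overlap = |yA2 - yD1| / min(h1, h2) as the length of the
    shared vertical segment divided by the smaller of the two heights.
    """
    top = max(y1, y2)                # lower of the two top edges
    bottom = min(y1 + h1, y2 + h2)   # upper of the two bottom edges
    return max(0.0, bottom - top) / min(h1, h2)
```

Under this reading, two boxes of equal height 10 whose top edges differ by 5 pixels overlap by 0.5, and disjoint boxes overlap by 0.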
Further, determining the shape similarity includes:
according to the first height of the first candidate proposal box and the second height of the second candidate proposal box, determining the shape similarity with the formula similarity = min(h1, h2) / max(h1, h2), where h1 and h2 are the heights of the first candidate proposal box and the second candidate proposal box respectively.
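The shape-similarity formula translates directly into code; a minimal sketch:

```python
def shape_similarity(h1, h2):
    """similarity = min(h1, h2) / max(h1, h2): 1.0 for equal heights,
    approaching 0.0 as the two heights diverge."""
    return min(h1, h2) / max(h1, h2)
```

The measure is symmetric in the two boxes, so the merging decision does not depend on which box is taken as "first".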
Further, the process of pre-training the first model includes:
obtaining sample images, each annotated with the position of every proposal box and a second score indicating whether the content of each proposal box is text;
inputting each sample image into the first model comprising a convolutional neural network and a recurrent neural network, and training the first model according to its output.
Further, the process of pre-training the second model includes:
obtaining the text lines annotated in the sample images;
inputting each sample image, with its corresponding text lines, into the second model comprising a convolutional neural network and a recurrent neural network, and training the second model according to its output.
An embodiment of the invention provides a character recognition device, comprising:
an acquisition module, configured to input an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtain the position of each proposal box in the image together with a first score indicating whether the content of each proposal box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine window features, and predicts proposal boxes at each window position according to preset widths and heights; each row of window features of the feature map is fed as a sequence into the recurrent neural network submodel, which outputs the position of each proposal box in the image and the first score indicating whether its content is text;
a screening module, configured to select as candidate proposal boxes those whose first score exceeds a preset scoring threshold;
a merging module, configured to merge the candidate proposal boxes according to their positions to obtain target proposal boxes;
a recognition module, configured to input each target proposal box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and recognize the text contained in each target proposal box.
Further, the device also includes:
a correction module, configured to process the image with threshold segmentation and connected-component analysis, and to perform text-orientation correction on the processed image.
Further, the merging module is specifically configured to: for a first candidate proposal box among the candidate proposal boxes, determine whether there exists a second candidate proposal box whose horizontal distance from the first candidate proposal box is less than a preset first threshold, whose vertical overlap exceeds a preset second threshold, and whose shape similarity exceeds a preset third threshold; if so, merge the first candidate proposal box and the second candidate proposal box into a new first candidate proposal box; if not, take the first candidate proposal box as a target proposal box.
Further, the merging module is specifically configured to determine the vertical overlap from the first height and first vertical coordinate of the first candidate proposal box and the second height and second vertical coordinate of the second candidate proposal box, using the formula overlap = |yA2 - yD1| / min(h1, h2), where yA2 is the second vertical coordinate of the second candidate proposal box, yD1 is the first vertical coordinate of the first candidate proposal box, and h1 and h2 are the heights of the first candidate proposal box and the second candidate proposal box respectively.
Further, the merging module is specifically configured to determine the shape similarity from the first height of the first candidate proposal box and the second height of the second candidate proposal box, using the formula similarity = min(h1, h2) / max(h1, h2), where h1 and h2 are the first height of the first candidate proposal box and the second height of the second candidate proposal box respectively.
Further, the device also includes:
a first training module, configured to obtain sample images, each annotated with the position of every proposal box and a second score indicating whether the content of each proposal box is text; and to input each sample image into the first model comprising a convolutional neural network and a recurrent neural network and train the first model according to its output.
Further, the device also includes:
a second training module, configured to obtain the text lines annotated in the sample images; and to input each sample image, with its corresponding text lines, into the second model comprising a convolutional neural network and a recurrent neural network and train the second model according to its output.
An embodiment of the present invention provides a character recognition method and device. The method inputs an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtains the position of each proposal box in the image together with a first score indicating whether the content of each proposal box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine window features, and predicts proposal boxes at each window position according to preset widths and heights; each row of window features of the feature map is fed as a sequence into the recurrent neural network, which outputs the position of each proposal box and the first score indicating whether its content is text. Candidate proposal boxes whose first score exceeds a preset scoring threshold are then selected; the candidate proposal boxes are merged according to their positions to obtain target proposal boxes; and each target proposal box is input into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, which recognizes the text contained in each target proposal box.
In the embodiments of the present invention, the image containing the text to be recognized is input into the pre-trained first model comprising a convolutional neural network and a recurrent neural network, which outputs the position of each proposal box in the image and a first score indicating whether its content is text. This first model can effectively capture the context of the character sequence and incorporate it into localization; specifically, the score of a proposal box covering the blank space between characters in the same line is raised by the sequence features of the surrounding characters, so the resulting text-line bounding boxes better match the positional characteristics of character sequences and text-line localization is more accurate. Secondly, each target proposal box is input into the pre-trained second model comprising a convolutional neural network and a recurrent neural network, which recognizes the text it contains; because this second model includes a recurrent neural network, it strengthens the extraction of character-sequence context, making the predicted text sequence more accurate.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a character recognition method provided by embodiment 1 of the present invention;
Fig. 2 is a diagram of the specific implementation of the text-line localization operation provided by embodiment 1 of the present invention;
Fig. 3 is a diagram of the effect of the recurrent neural network operation provided by embodiment 1 of the present invention;
Fig. 4 is a diagram of the specific implementation of the text-line recognition operation provided by embodiment 1 of the present invention;
Fig. 5 is a diagram of the expected position information of proposal boxes provided by embodiment 3 of the present invention;
Fig. 6 is an overall flow diagram of express-waybill text recognition provided by embodiment 7 of the present invention;
Fig. 7 is a structural diagram of a character recognition device provided by embodiment 8 of the present invention.
Specific embodiment
The present invention will be described below in further detail with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1:
Fig. 1 is a flow diagram of a character recognition method provided by an embodiment of the present invention; the method includes the following steps:
S101: inputting an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtaining the position of each proposal box in the image together with a first score indicating whether the content of each proposal box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine window features, and predicts proposal boxes at each window position according to preset widths and heights; each row of window features of the feature map is fed as a sequence into the recurrent neural network, which outputs the position of each proposal box in the image and the first score indicating whether its content is text.
Text in an image may appear at any position, and possibly only a small region of the image contains the text to be recognized. Therefore, before the text in the image can be recognized, it must first be located: a localization operation yields the position of each text line in the image, and the recognition operation is then applied to the text contained in the located lines.
The two neural networks contained in the first model are a convolutional neural network and a recurrent neural network. Since the operations of both networks serve the localization of text-line positions in the image, the two networks are collectively referred to as the first model. The image containing the text to be recognized is input into the convolutional neural network; after several layers of convolution and pooling, the feature map of the image is obtained. A sliding-window convolution is performed on this feature map to obtain window features, and during the sliding-window operation a set of proposal boxes is predicted at each sliding-window centre according to the set widths and heights. The window features obtained by the sliding-window convolution are then input into the recurrent neural network, which finally outputs the coordinates of each proposal box and the first score that the box contains text; this score is used to judge whether the proposal box is a candidate proposal box.
When determining proposal boxes, the embodiment of the present invention predicts proposal boxes at each sliding-window centre using set widths and heights. Since the height of text lines in an image is not fixed, the prior-art method of generating proposal boxes with fixed sizes and shapes causes inaccurate text-line localization; the proposal-box generation method provided by the embodiment of the present invention solves this problem. Moreover, judging by threshold whether a proposal box is a candidate proposal box and removing redundant proposal boxes reduces the computational overhead that enlarging the set of box sizes and shapes would bring. Meanwhile, a recurrent neural network model is introduced into the first model to locate text lines; because a recurrent network inherently has memory, it can effectively capture the context of the character sequence and incorporate it into localization. In a specific implementation, the score of a proposal box covering the blank space between characters in the same line can be raised by the sequence features of the surrounding characters, so the resulting text-line proposal boxes better match the positional characteristics of character sequences and the localization result is more accurate.
For example, taking express-waybill text recognition as an example, the specific implementation of the text-line localization operation on a waybill image to be recognized is shown in Fig. 2, where Convx_x denotes the convolution operations of the different modules and the dotted connections between convolution modules denote pooling operations. BLSTM (Bidirectional Long Short-term Memory) is a bidirectional long short-term memory neural network, and FC (Fully Connected) denotes a fully connected layer. k proposal boxes in total are predicted on feature map conv5_3; after the BLSTM and FC layers, the predicted position of each proposal box and the score that its content is text are output.
The image to be recognized is first input into a pre-trained VGGNet-based convolutional neural network to extract image features. The network alternates convolution and pooling operations: the image passes through thirteen 3 × 3 convolutional layers and four 2 × 2 max-pooling layers in total, finally yielding a feature map conv5_3 of shape W × H × C, where W, H and C are the feature map's width, height and number of channels respectively;
A sliding-window convolution with stride 1 and kernel size 3 × 3 is performed on the feature map conv5_3 obtained above, and k proposal boxes of certain shapes and sizes are predicted at each sliding-window centre;
In a specific implementation, k is set to 10, and the shapes and sizes are as follows: a small fixed width is used, and only the height varies. Specifically, the width can be fixed at 16 pixels, and the heights are obtained by repeatedly reducing 283 pixels by a ratio of 0.7 down to 11 pixels, predicting 10 proposal boxes in total by this method.
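The height schedule described above (fixed 16-pixel width, heights shrinking from 283 pixels by a factor of 0.7 down to roughly 11 pixels) can be sketched as follows; the function names are illustrative, not from the patent:

```python
def anchor_heights(h_max=283.0, h_min=11.0, ratio=0.7):
    """Heights of the k proposal boxes: start at h_max and multiply by
    ratio until h_min is reached.  With the defaults this yields k = 10."""
    heights = []
    h = h_max
    while round(h) >= h_min:
        heights.append(round(h))
        h *= ratio
    return heights

def proposals_at(cx, cy, width=16):
    """Fixed-width proposal boxes (x, y, w, h) centred on a
    sliding-window position (cx, cy)."""
    return [(cx - width // 2, cy - h // 2, width, h)
            for h in anchor_heights()]
```

With the defaults, the heights run 283, 198, 139, ..., 11, matching the k = 10 fixed-width, variable-height proposals the embodiment describes.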
Next, the t-th 3 × 3 × C window feature of feature map conv5_3 obtained by the sliding-window convolution is input as a feature sequence into the BLSTM neural network, which cyclically updates the internal hidden state Ht according to the formula: Ht = φ(Ht-1, Xt), t = 1, 2, ..., W,
where Xt ∈ R3×3×C is the feature obtained from the t-th sliding window in each row of feature map conv5_3, W is the width of conv5_3, C is its number of channels, and φ is a nonlinear function. The effective context information obtained is passed to the FC layer, which outputs the position of each proposal box and the first score that its content is text.
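A minimal sketch of the recurrent update Ht = φ(Ht-1, Xt): the toy cell below substitutes an elementwise tanh combination for the real (B)LSTM weights, purely to show how the hidden state threads context left to right through a row of window features:

```python
import math

def rnn_step(h_prev, x, w_h=0.5, w_x=0.5):
    # Toy phi: H_t = tanh(w_h * H_{t-1} + w_x * X_t), elementwise.
    return [math.tanh(w_h * h + w_x * xi) for h, xi in zip(h_prev, x)]

def run_row(window_features):
    """Scan one row of window features, carrying the hidden state so each
    output depends on every window feature seen so far."""
    h = [0.0] * len(window_features[0])
    states = []
    for x in window_features:
        h = rnn_step(h, x)
        states.append(h)
    return states

# three toy 2-dimensional window features standing in for the 3x3xC ones
states = run_row([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

A BLSTM additionally runs the same scan right to left and concatenates both states, so each position sees context from both sides of the row.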
For example, Fig. 3 shows an example from a specific implementation: the proposal boxes predicted after the BLSTM operation together with their first scores. The boxes in the third row represent proposal boxes, the numbers in the second row are the first scores of the proposal boxes, and the numbers in the first row are the position index values of the corresponding proposal boxes, used to traverse the proposal boxes.
S102: selecting as candidate proposal boxes those whose first score exceeds a preset scoring threshold.
Among the proposal boxes determined above, some may contain no text information. Therefore, using the score that the content of each proposal box obtained in the previous step is text, redundant proposal boxes are eliminated by a preset scoring threshold, leaving the candidate proposal boxes. Specifically, if a proposal box's score exceeds the preset scoring threshold, it is taken as a candidate proposal box; otherwise it is regarded as a redundant proposal box and removed.
For example, in a specific implementation, the preset scoring threshold can be set to 0.7: if a proposal box's score exceeds 0.7, it is a candidate proposal box; otherwise it is regarded as a redundant proposal box and eliminated.
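The screening step S102 is a plain threshold filter; a sketch, with an assumed proposal representation (a dict with "box" and "score" keys, which is not the patent's own data structure):

```python
def filter_candidates(proposals, threshold=0.7):
    """Keep proposals whose text score exceeds the threshold; the rest
    are discarded as redundant."""
    return [p for p in proposals if p["score"] > threshold]

proposals = [
    {"box": (0, 0, 16, 40),  "score": 0.95},  # text -> kept
    {"box": (16, 0, 16, 40), "score": 0.71},  # text -> kept
    {"box": (64, 0, 16, 40), "score": 0.30},  # background -> dropped
]
candidates = filter_candidates(proposals)
```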
S103: merging the candidate proposal boxes according to their positions to obtain target proposal boxes.
To locate the text of each line in the image, the candidate proposal boxes obtained above must be merged to find the target proposal boxes. Therefore, according to the positions of the candidate proposal boxes found above, the candidate proposal boxes are merged one by one to obtain the target proposal boxes.
As for the process of merging two candidate proposal boxes, in a specific implementation one possible embodiment takes the minimum enclosing rectangle of the two candidate proposal boxes as the box obtained after merging, i.e. the target proposal box.
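The minimum enclosing rectangle of two boxes in (x, y, w, h) form can be sketched as:

```python
def merge_boxes(a, b):
    """Minimum enclosing rectangle of two boxes given as (x, y, w, h)."""
    x = min(a[0], b[0])
    y = min(a[1], b[1])
    right = max(a[0] + a[2], b[0] + b[2])
    bottom = max(a[1] + a[3], b[1] + b[3])
    return (x, y, right - x, bottom - y)
```

Applied repeatedly along a line, this grows one wide box spanning the whole text line out of the narrow fixed-width proposals.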
In another possible embodiment, for any two candidate proposal boxes, it is judged whether their horizontal distance is less than a set threshold; if so, the two candidate proposal boxes are merged.
S104: inputting each target proposal box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and recognizing the text contained in each target proposal box.
After the text-line localization result is obtained, the text within each line must be recognized, and recognition accuracy is crucial to the automatic management of images. Therefore, after the text lines in the image have been located by the above operations and the target proposal boxes obtained, in order to recognize the text in the located target proposal boxes, each target proposal box obtained above is input into the pre-trained second model comprising a convolutional neural network and a recurrent neural network, which recognizes the text information in the target proposal box.
The two neural networks contained in the second model are a convolutional neural network and a recurrent neural network. Since the operations of both networks serve the recognition of text information in the image, the two networks are collectively referred to as the second model.
The target proposal box obtained above is taken as the input of the convolutional neural network; after several convolution and max-pooling operations, image convolution features are obtained. These features are then taken as the input of the recurrent neural network, and the classification scores corresponding to the width dimension are computed from the recurrent layer's output. A connectionist temporal classification method converts the recurrent neural network's output into a label sequence, defines a probability over label sequences from the per-frame predictions, and uses the negative log-likelihood of that probability as the objective function for training the network; the image can thus be matched directly to its label sequence, without annotating individual characters.
For example, taking an express waybill image as an example, the input image is the target Suggestion box obtained by the above operations, and the specific procedure of the text-line recognition operation on the express waybill image to be recognized is shown in Fig. 4, in which Convolution denotes a 3 × 3 convolutional layer, Dense Blocks denote combined 1 × 1 and 3 × 3 convolutional layers, Transition Layers denote 2 × 2 max-pooling layers, and BGRU (Bidirectional Gated Recurrent Unit) is a recurrent neural network model based on bidirectional GRUs.
The detailed process is as follows: the located express waybill image is input into a pre-trained ultra-deep network structure based on DenseNet to extract image features. The image first passes through a 3 × 3 convolutional layer, then alternately through several combined 1 × 1 and 3 × 3 convolutional layers, 1 × 1 convolutional layers and 2 × 2 max-pooling layers; the depth of the network model reaches 120 layers.
The image features obtained above are used as the input of a recurrent neural network layer based on bidirectional GRUs, which produces the layer output and computes the classification score corresponding to each position along the width dimension;
Using the connectionist temporal classification (CTC) method, the output of the recurrent neural network layer is converted into a label sequence. A probability is defined over label sequences from the per-frame predictions, and the negative log-likelihood of this probability is used as the objective function for training the network; the image can thus be directly associated with its label sequence without annotating individual characters.
In the embodiment of the present invention, in contrast to the Transition Layers of a conventional DenseNet, which use average pooling, max-pooling layers are used to preserve the texture information of the feature maps; in the last two max-pooling layers, a pooling stride of 1 is used in the width dimension to retain more feature information along the width, making the detection of narrow characters more robust. The GRU used in the embodiment of the present invention is a recurrent neural network that is more efficient than an LSTM network; it enhances the extraction of contextual information from the character sequence, so that the prediction of the text sequence is more accurate. The CTC method used in the embodiment of the present invention is a common transformation method for processing the output of a recurrent neural network: the output is converted into a label sequence, and the final text result is obtained by operations such as removing duplicates and eliminating blanks. The object of processing is the entire label sequence rather than individual characters.
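As a rough illustration only (not the patented implementation), the "remove duplicates, then eliminate blanks" step of greedy CTC decoding described above can be sketched in Python; the blank index of 0 and the integer labels are assumptions for the example:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence: merge consecutive repeats,
    then drop the CTC blank symbol.

    frame_labels: best class index per width position, i.e. the argmax
    over the recurrent network's per-frame classification scores.
    """
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:  # new, non-blank symbol
            decoded.append(label)
        prev = label                          # remember for repeat merging
    return decoded

# frames "a a - b b - - b" (with '-' as blank) decode to "a b b"
print(ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 2]))  # [1, 2, 2]
```

Note that the blank between the two runs of label 2 is what allows a genuinely repeated character to survive the duplicate-merging step.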
Embodiment 2:
On the basis of the above embodiment, in the embodiment of the present invention, in order to make text recognition more accurate, before the image containing the text to be recognized is input into the pre-trained first model comprising a convolutional neural network and a recurrent neural network, the method further comprises:
processing the image using a threshold segmentation method and a connected-component analysis method;
and performing text-orientation correction on the processed image.
The image containing the text to be recognized is captured by an image acquisition device such as a camera. Since the text information may be distributed anywhere in the image, and possibly only some region of the image contains the text to be recognized, before the text in the image is recognized the image is first processed using a threshold segmentation method and a connected-component analysis method: redundant areas are removed, and the region image containing the text to be recognized is retained. In addition, to make the text recognition result of the image more accurate, text-orientation correction is performed on the region image containing the text to be recognized so that the text lines are horizontal. The processes of threshold segmentation, connected-component analysis and text-orientation correction of the region image belong to the prior art and are not repeated here.
Embodiment 3:
Since each candidate Suggestion box obtained above contains only a small fraction of the text information, in order to obtain the complete text-line information corresponding to each line, on the basis of the above embodiments, in the embodiment of the present invention, merging the candidate Suggestion boxes into target Suggestion boxes according to the position of each candidate Suggestion box comprises:
for a first candidate Suggestion box among the candidate Suggestion boxes, identifying whether there exists a second candidate Suggestion box whose horizontal distance from the first candidate Suggestion box is less than a preset first threshold, whose overlap with it in the vertical direction is greater than a preset second threshold, and whose shape similarity with it is greater than a preset third threshold; if so, merging the first candidate Suggestion box and the second candidate Suggestion box into a new first candidate Suggestion box; if not, taking the first candidate Suggestion box as a target Suggestion box.
Specifically, the horizontal distance is the absolute value of the difference between the minimum abscissa of the four corner points of the first candidate Suggestion box and the minimum abscissa of the four corner points of the second candidate Suggestion box, or the absolute value of the difference between the maximum abscissa of the four corner points of the first candidate Suggestion box and the maximum abscissa of the four corner points of the second candidate Suggestion box; the smaller this absolute value, the more likely the two candidate Suggestion boxes are a pair of linked boxes. The vertical overlap is determined from the overlapping part of the first and second candidate Suggestion boxes in the vertical direction; the larger the overlap, the more likely the two boxes are a pair of linked boxes. The shape similarity is the similarity of the overall shapes of the first and second candidate Suggestion boxes; the larger the shape similarity, the more likely the two boxes are a pair of linked boxes.
Specifically, whether the first candidate Suggestion box and the second candidate Suggestion box form a pair of linked boxes is determined as follows: for each first candidate Suggestion box, judge whether there exists a second candidate Suggestion box whose horizontal distance d from the first candidate Suggestion box is less than a preset first threshold thresh1, whose vertical overlap overlap exceeds a preset second threshold thresh2, and whose shape similarity similarity is greater than a preset third threshold thresh3. If such a box exists, the first and second candidate Suggestion boxes are regarded as a pair of linked boxes, and the minimum enclosing rectangle of the pair is taken as the new first candidate Suggestion box; otherwise, the first candidate Suggestion box is taken as a target Suggestion box.
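A minimal sketch of this linking test and merge step, under stated assumptions: boxes are axis-aligned (x_min, y_min, x_max, y_max) tuples, the threshold values are illustrative, the horizontal distance takes the smaller of the left-edge and right-edge differences, and the vertical overlap here uses the min-height variant of the later embodiment:

```python
def is_linked(b1, b2, thresh1=50, thresh2=0.7, thresh3=0.7):
    """Decide whether two candidate boxes form a pair of linked boxes."""
    x1a, y1a, x2a, y2a = b1
    x1b, y1b, x2b, y2b = b2
    h1, h2 = y2a - y1a, y2b - y1b
    d = min(abs(x1a - x1b), abs(x2a - x2b))   # horizontal distance d
    inter = min(y2a, y2b) - max(y1a, y1b)     # vertical intersection length
    overlap = max(inter, 0) / min(h1, h2)     # vertical overlap
    similarity = min(h1, h2) / max(h1, h2)    # height-based shape similarity
    return d < thresh1 and overlap > thresh2 and similarity > thresh3

def merge(b1, b2):
    """Minimum enclosing rectangle of a linked pair."""
    return (min(b1[0], b2[0]), min(b1[1], b2[1]),
            max(b1[2], b2[2]), max(b1[3], b2[3]))

a, b = (10, 20, 60, 40), (55, 21, 110, 41)
if is_linked(a, b):
    a = merge(a, b)       # becomes the new first candidate box
print(a)                  # (10, 20, 110, 41)
```

In a full pass, merging repeats with the enclosing rectangle as the new first candidate box until no linkable second box remains, at which point the box becomes a target Suggestion box.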
As shown in Fig. 5, the dotted boxes are the candidate Suggestion boxes obtained according to the above process in the embodiment of the present invention, the dashed box is the target Suggestion box, and the two boxes below the arrow are two candidate Suggestion boxes, namely the first candidate Suggestion box and the second candidate Suggestion box. A1, B1, C1, D1 and A2, B2, C2, D2 denote the four corner points of the first and second candidate Suggestion boxes respectively, and h1 and h2 denote the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box.
When computing the vertical overlap of the first candidate Suggestion box and the second candidate Suggestion box, one possible implementation is to divide the length of the overlapping part of their vertical coordinates by the maximum of h1 and h2, i.e. according to the formula overlap = |yA2 - yD1| / max(h1, h2).
Another possible implementation is to divide the overlapping part of the vertical coordinates by the average of h1 and h2, i.e. according to the formula overlap = |yA2 - yD1| / mean(h1, h2).
A third possible implementation is to divide the overlapping part of the vertical coordinates by the union of h1 and h2, i.e. according to the formula overlap = |yA2 - yD1| / union(h1, h2).
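The three normalizations above can be sketched in one helper; the vertical extents and the interpretation of "union" as the combined vertical span are assumptions for the example:

```python
def vertical_overlap(e1, e2, mode="max"):
    """Overlap of two boxes' vertical extents, normalised three ways.

    e1, e2: (y_min, y_max) vertical extents of the two candidate boxes.
    mode picks the denominator: the larger height ("max"), the mean
    height ("mean"), or the height of the union of the two extents.
    """
    h1 = e1[1] - e1[0]
    h2 = e2[1] - e2[0]
    inter = max(0, min(e1[1], e2[1]) - max(e1[0], e2[0]))  # shared length
    if mode == "max":
        denom = max(h1, h2)
    elif mode == "mean":
        denom = (h1 + h2) / 2
    else:  # "union": total vertical span covered by either box
        denom = max(e1[1], e2[1]) - min(e1[0], e2[0])
    return inter / denom

e1, e2 = (20, 40), (25, 40)   # heights 20 and 15, intersection 15
print(vertical_overlap(e1, e2, "max"))    # 0.75
print(vertical_overlap(e1, e2, "mean"))   # ~0.857
print(vertical_overlap(e1, e2, "union"))  # 0.75
```

The "union" variant is the most conservative when the extents barely touch, since the denominator grows as the boxes drift apart.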
Embodiment 4:
In order to determine the vertical overlap more accurately, on the basis of the above embodiments, in the embodiment of the present invention, determining the vertical overlap comprises:
according to the first height and the first vertical coordinate of the first candidate Suggestion box and the second height and the second vertical coordinate of the second candidate Suggestion box, determining the vertical overlap using the formula overlap = |yA2 - yD1| / min(h1, h2), where yA2 denotes the second vertical coordinate of the second candidate Suggestion box, yD1 denotes the first vertical coordinate of the first candidate Suggestion box, and h1 and h2 denote the heights of the first and second candidate Suggestion boxes respectively.
In order to accurately determine whether the first candidate Suggestion box and the second candidate Suggestion box are a pair of linked boxes, in the embodiment of the present invention, after all candidate Suggestion boxes have been determined according to the above process, for any two candidate Suggestion boxes, the two boxes are taken as the first and second candidate Suggestion boxes respectively, and the first height and first vertical coordinate of the first candidate Suggestion box and the second height and second vertical coordinate of the second candidate Suggestion box are identified. First, the absolute value of the difference between the first and second vertical coordinates is computed; second, the minimum of the first height and the second height is computed; finally, the ratio of the absolute difference to the minimum height is computed. This ratio is the overlap of the first and second candidate Suggestion boxes in the vertical direction; the larger its value, the more likely the two boxes are a pair of linked boxes.
Specifically, according to the first height and first vertical coordinate of the first candidate Suggestion box and the second height and second vertical coordinate of the second candidate Suggestion box, the overlap of the first and second candidate Suggestion boxes is determined by the formula overlap = |yA2 - yD1| / min(h1, h2), where yA2 denotes the second vertical coordinate of the second candidate Suggestion box, yD1 denotes the first vertical coordinate of the first candidate Suggestion box, and h1 and h2 denote the heights of the first and second candidate Suggestion boxes respectively.
Embodiment 5:
In order to determine the shape similarity more accurately, on the basis of the above embodiments, in the embodiment of the present invention, determining the shape similarity comprises:
according to the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box, determining the shape similarity using the formula similarity = min(h1, h2) / max(h1, h2), where h1 and h2 denote the heights of the first and second candidate Suggestion boxes respectively.
In order to determine more accurately whether the first candidate Suggestion box and the second candidate Suggestion box are a pair of linked boxes, in the embodiment of the present invention, after all candidate Suggestion boxes have been determined according to the above process, for any two candidate Suggestion boxes, the two boxes are taken as the first and second candidate Suggestion boxes respectively, and the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box are identified. First, the minimum of the two heights is determined; second, the maximum of the two heights is determined; finally, the ratio of the minimum to the maximum is determined. This ratio is the shape similarity of the first and second candidate Suggestion boxes; the larger its value, the more likely the two boxes are a pair of linked boxes.
Specifically, according to the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box, the shape similarity of the two boxes is determined by the formula similarity = min(h1, h2) / max(h1, h2), where h1 and h2 denote the heights of the first and second candidate Suggestion boxes respectively.
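The two formulas of Embodiments 4 and 5 translate directly into code; the numeric inputs in the example are illustrative assumptions:

```python
def overlap_min(yA2, yD1, h1, h2):
    """Vertical overlap of Embodiment 4: |yA2 - yD1| / min(h1, h2)."""
    return abs(yA2 - yD1) / min(h1, h2)

def shape_similarity(h1, h2):
    """Shape similarity of Embodiment 5: min(h1, h2) / max(h1, h2)."""
    return min(h1, h2) / max(h1, h2)

# two boxes of heights 20 and 16 whose vertical extents share 14 pixels
print(overlap_min(34, 20, 20, 16))   # 0.875
print(shape_similarity(20, 16))      # 0.8
```

Both values are then compared against thresh2 and thresh3 respectively when deciding whether the pair of candidate boxes is linked.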
Embodiment 6:
In order to locate text in a newly input image containing text to be recognized, a pre-training process is performed before localization. On the basis of the above embodiments, in the embodiment of the present invention, the process of pre-training the first model comprises:
obtaining sample images, in which the location information of each Suggestion box and a second score, indicating that the content contained in each position Suggestion box is text, are annotated;
inputting each sample image into the first model comprising a convolutional neural network and a recurrent neural network, and training the first model according to the output of the first model.
Since the purpose of the first model is to locate the text lines in an image to be recognized, the image is input into the first model in order to obtain the location information of each position Suggestion box in the image and the second score indicating that the content contained in each position Suggestion box is text; this score is used to determine whether the position Suggestion box is a candidate Suggestion box. Therefore, before pre-training the first model, the image data must first be annotated to obtain sample images. Specifically, the location information of each position Suggestion box in each image and the second score indicating that the content contained in each position Suggestion box is text are annotated.
In a specific implementation, a certain number of sample images are input as a batch each time, and the model parameters are updated through the steps of forward propagation, error computation, backward propagation and weight update; batches of samples are input continually and the above steps are repeated, the parameters are adjusted continuously, and the error between the network output and the reference value is corrected, finally yielding the optimized network parameters, i.e. the trained network model.
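The batch training cycle described above (forward propagation, error computation, backward propagation, weight update) can be illustrated with a minimal NumPy sketch; the tiny linear model, synthetic data and learning rate are stand-in assumptions, not the actual first model:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))                 # synthetic inputs
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true                                # reference values

w = np.zeros(3)                               # initial parameter values
lr = 0.1
for step in range(200):                       # repeated batches
    idx = rng.integers(0, len(X), size=32)    # sample one batch
    xb, yb = X[idx], y[idx]
    pred = xb @ w                             # forward propagation
    err = pred - yb                           # error computation
    grad = xb.T @ err / len(xb)               # backward propagation
    w -= lr * grad                            # weight update

print(np.round(w, 2))                         # approaches w_true
```

Each cycle nudges the parameters against the batch gradient, so the error between the output and the reference values shrinks as batches are fed in repeatedly.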
In particular, before the network model starts training, the usual training method sets the initial parameter values of the model by random initialization. Although randomly initialized model parameters can in theory converge to the optimum, the drawbacks are also obvious: the training time required for the model to converge is long, the model easily falls into local optima, and a high-precision network model is not easy to obtain. Therefore, in the embodiment of the present invention, a transfer learning method is adopted: the parameters of a model already trained in the prior art are transferred to the new model in place of random initialization of the original model parameters, which accelerates and optimizes the learning efficiency and convergence speed of the new model. Specifically, the parameters of a text recognition model trained on some general data are used as the initial parameters of the model of the embodiment of the present invention for training.
Further, an incremental learning training method is adopted. Since the quantities of simulated annotated samples and real annotated data differ greatly, in the embodiment of the present invention, tens of millions of simulated annotated samples are trained on first, and the real annotated data are then learned incrementally. When real samples increase dynamically, repeated learning of the massive simulated annotated samples is avoided while the historical training results are fully utilized; the final model is continuously adjusted and optimized, reducing the time and storage-space requirements of model training.
Embodiment 7:
In order to recognize the located image, a pre-training process is also performed before recognition. On the basis of the above embodiments, in the embodiment of the present invention, the process of pre-training the second model comprises:
obtaining each text line annotated in the sample images;
inputting each sample image containing the corresponding text line into the second model comprising a convolutional neural network and a recurrent neural network, and training the second model according to the output of the second model.
Since the purpose of the second model is to recognize the text lines in an image to be recognized, the image is input into the second model in order to obtain the text lines in the image; once a text line has been determined, the text information in the text line can be obtained through a Chinese dictionary. Therefore, before pre-training the second model, the image data must first be annotated to obtain sample images; specifically, each text line in each image is annotated. The second model is then trained in the same way as the first model, finally obtaining a trained second model for text recognition of newly input images.
For example, taking text recognition of an express waybill image as an example, the entire flow of waybill text recognition is shown in Fig. 6.
First, for the input express waybill image, the waybill region is extracted using threshold segmentation and connected-component analysis, and preliminary text-orientation correction is performed on the extracted waybill region so that the text lines are all horizontal.
The waybill region image after the above operations is input into the text-line locating module. The specific procedure is as follows: the waybill region image is used as the input of a convolutional neural network to obtain a feature map; a sliding-window operation is performed on the feature map, and at each sliding-window centre k position Suggestion boxes are predicted according to certain shapes and sizes; the feature map obtained above is used as the input of a recurrent neural network, which outputs the location information of the position Suggestion boxes and the score indicating that the content contained in each position Suggestion box is text; candidate Suggestion boxes are obtained by applying a threshold to the scores of the position Suggestion boxes and are merged according to the candidate-box merging algorithm described above to obtain target Suggestion boxes, which are the text-line location results finally produced by the locating module.
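The step of predicting k position Suggestion boxes of preset shapes at each sliding-window centre can be sketched as follows; the fixed width and the particular height values are illustrative assumptions, not the patent's preset sizes:

```python
def anchors_at(cx, cy, width=16, heights=(11, 16, 23, 33, 48)):
    """Predict k position boxes at one sliding-window centre.

    Each box shares a fixed width and takes one of k preset heights;
    boxes are returned as (x_min, y_min, x_max, y_max) tuples.
    """
    boxes = []
    for h in heights:
        boxes.append((cx - width / 2, cy - h / 2,
                      cx + width / 2, cy + h / 2))
    return boxes

boxes = anchors_at(100, 50)
print(len(boxes))   # k = 5 boxes at this centre
print(boxes[0])     # (92.0, 44.5, 108.0, 55.5)
```

Repeating this at every feature-map position yields the dense set of position Suggestion boxes whose text scores are then thresholded to obtain candidate Suggestion boxes.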
The text-line location results obtained by the above operations are input into the text-line recognition module. The specific procedure is as follows: the text-line location results are used as the input of a convolutional neural network to extract a feature map; the feature map is used as the input of a recurrent neural network to obtain the layer output and compute the classification score corresponding to each position along the width dimension; the CTC method converts the output of the recurrent neural network into a label sequence, which is compared with a Chinese dictionary to obtain the final text information. The obtained text information is sorted by name, telephone number, address and so on, yielding structured, quickly readable electronic waybill information.
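The final structuring step (sorting recognized lines into name, phone and address fields) could look roughly like the naive rule-based sketch below; the rules, field names and sample lines are all assumptions for illustration, not the patent's classification logic:

```python
import re

def structure_waybill(lines):
    """Sort recognised text lines into name / phone / address fields.

    Naive rules: an 11-digit line is treated as a mobile number, the
    first short digit-free line as the name, everything else as address.
    """
    info = {"name": None, "phone": None, "address": []}
    for line in lines:
        digits = re.sub(r"\D", "", line)      # strip non-digits
        if len(digits) == 11:                 # mainland mobile number length
            info["phone"] = digits
        elif info["name"] is None and len(line.split()) <= 2 and not digits:
            info["name"] = line
        else:
            info["address"].append(line)
    return info

result = structure_waybill(["Zhang San", "138 0013 8000", "Hangzhou, Zhejiang"])
print(result["phone"])  # 13800138000
```

A production system would use far stronger cues (keyword labels on the waybill, field positions), but the sketch shows how the recognized label sequences become structured electronic waybill records.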
Embodiment 8:
Fig. 7 shows a character recognition device provided by an embodiment of the present invention. The device comprises:
an obtaining module 701, configured to input the image containing the text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtain the location information of each Suggestion box contained in the image and a first score indicating that the content contained in each Suggestion box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine each window feature, and predicts each position Suggestion box in each window feature according to preset widths and heights; the window-feature sequence corresponding to each row of the feature map is used as the input of the recurrent neural network, and the location information of each Suggestion box contained in the image and the first score indicating that the content contained in each Suggestion box is text are obtained based on the recurrent neural network;
a screening module 702, configured to screen the candidate Suggestion boxes whose first score is greater than a preset score threshold;
a merging module 703, configured to merge the candidate Suggestion boxes into target Suggestion boxes according to the position of each candidate Suggestion box;
an identification module 704, configured to input each target Suggestion box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and identify the text contained in each target Suggestion box.
The device further comprises: a correction module 705, configured to process the image using a threshold segmentation method and a connected-component analysis method, and perform text-orientation correction on the processed image.
The merging module 703 is specifically configured to, for a first candidate Suggestion box among the candidate Suggestion boxes, identify whether there exists a second candidate Suggestion box whose horizontal distance from the first candidate Suggestion box is less than a preset first threshold, whose vertical overlap is greater than a preset second threshold, and whose shape similarity is greater than a preset third threshold; if so, merge the first and second candidate Suggestion boxes into a new first candidate Suggestion box; if not, take the first candidate Suggestion box as a target Suggestion box.
The merging module 703 is specifically configured to determine the vertical overlap according to the first height and first vertical coordinate of the first candidate Suggestion box and the second height and second vertical coordinate of the second candidate Suggestion box, using the formula overlap = |yA2 - yD1| / min(h1, h2), where yA2 denotes the second vertical coordinate of the second candidate Suggestion box, yD1 denotes the first vertical coordinate of the first candidate Suggestion box, and h1 and h2 denote the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box respectively.
The merging module 703 is specifically configured to determine the shape similarity according to the first height of the first candidate Suggestion box and the second height of the second candidate Suggestion box, using the formula similarity = min(h1, h2) / max(h1, h2), where h1 and h2 denote the heights of the first and second candidate Suggestion boxes respectively.
The device further comprises:
a first training module 706, configured to obtain sample images in which the location information of each Suggestion box and the second score indicating that the content contained in each position Suggestion box is text are annotated; and to input each sample image into the first model comprising a convolutional neural network and a recurrent neural network, and train the first model according to the output of the first model.
The device further comprises:
a second training module 707, configured to obtain each text line annotated in the sample images; and to input each sample image containing the corresponding text line into the second model comprising a convolutional neural network and a recurrent neural network, and train the second model according to the output of the second model.
In summary, the embodiments of the present invention provide a character recognition method and device, comprising: inputting an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtaining the location information of each Suggestion box contained in the image and a first score indicating that the content contained in each Suggestion box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine each window feature, and predicts each position Suggestion box in each window feature according to preset widths and heights, and the window-feature sequence corresponding to each row of the feature map is used as the input of the recurrent neural network, based on which the location information of each Suggestion box and the first score are obtained; screening the candidate Suggestion boxes whose first score is greater than a preset score threshold; merging the candidate Suggestion boxes into target Suggestion boxes according to the position of each candidate Suggestion box; and inputting each target Suggestion box into a pre-trained second model comprising a convolutional neural network and a recurrent neural network, and identifying the text contained in each target Suggestion box.
In the embodiments of the present invention, the image containing the text to be recognized is input into the pre-trained first model comprising a convolutional neural network and a recurrent neural network, and the location information of each Suggestion box contained in the image and the first score indicating that the content contained in each Suggestion box is text are obtained. The first model can effectively obtain the contextual information of the text sequence and add it to the localization process; specifically, the score assigned to a blank-space Suggestion box between text in the same row can be boosted by the sequence features of the surrounding text, so that the resulting text-line position boxes better match the positional features of the text sequence and the text-line location results are more accurate. Then each target Suggestion box is input into the pre-trained second model comprising a convolutional neural network and a recurrent neural network, and the text contained in each target Suggestion box is identified. Because this second model includes a recurrent neural network, it enhances the extraction of contextual information from the character sequence, making the prediction of the text sequence more accurate.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular way, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they know the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these modifications and variations.

Claims (14)

Inputting an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and obtaining the location information of each suggestion box contained in the image and a first score indicating that the content contained in each suggestion box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine each window feature, and predicts a suggestion box at each position in each window feature according to a preset width and height; the window feature sequence corresponding to each row of the feature map is used as the input of the recurrent neural network, and the location information of each suggestion box contained in the image and the first score indicating that the content contained in each suggestion box is text are obtained based on the recurrent neural network;
An obtaining module, configured to input an image containing text to be recognized into a pre-trained first model comprising a convolutional neural network and a recurrent neural network, and to obtain the location information of each suggestion box contained in the image and a first score indicating that the content contained in each suggestion box is text, wherein the first model obtains a feature map of the image, performs a sliding-window operation on the feature map to determine each window feature, and predicts a suggestion box at each position in each window feature according to a preset width and height; the window feature sequence corresponding to each row of the feature map is used as the input of the recurrent neural network, and the location information of each suggestion box contained in the image and the first score indicating that the content contained in each suggestion box is text are obtained based on the recurrent neural network;
11. The device as claimed in claim 10, wherein the merging module is specifically configured to determine the degree of overlapping in the vertical direction according to the first height and the first vertical coordinate of the first candidate suggestion box and the second height and the second vertical coordinate of the second candidate suggestion box, using the following formula: overlap = |yA2 - yD1| / min(h1, h2), wherein yA2 represents the second vertical coordinate of the second candidate suggestion box, yD1 represents the first vertical coordinate of the first candidate suggestion box, and h1 and h2 respectively represent the first height of the first candidate suggestion box and the second height of the second candidate suggestion box.
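The vertical-overlap test in claim 11 can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `Box` type, the top-left coordinate origin, the reading of yA2 as the lower edge of the second box and yD1 as the upper edge of the first box, and the merge threshold of 0.7 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Box:
    x: float  # left edge
    y: float  # upper vertical coordinate (top-left origin assumed)
    w: float  # width
    h: float  # height

def vertical_overlap(first: Box, second: Box) -> float:
    """Degree of vertical overlap per the claim's formula:
    overlap = |yA2 - yD1| / min(h1, h2).
    Interpreting yA2 as the lower edge of the second candidate box and
    yD1 as the upper edge of the first is an assumption."""
    y_a2 = second.y + second.h
    y_d1 = first.y
    return abs(y_a2 - y_d1) / min(first.h, second.h)

def merge_if_same_line(first: Box, second: Box,
                       threshold: float = 0.7) -> Optional[Box]:
    """Merge two candidate suggestion boxes into one target box when their
    vertical overlap exceeds the (assumed) threshold; otherwise None."""
    if vertical_overlap(first, second) <= threshold:
        return None
    left = min(first.x, second.x)
    top = min(first.y, second.y)
    right = max(first.x + first.w, second.x + second.w)
    bottom = max(first.y + first.h, second.y + second.h)
    return Box(left, top, right - left, bottom - top)
```

For two boxes sitting on the same text line, e.g. `Box(0, 0, 16, 10)` and `Box(16, 1, 16, 9)`, the overlap is 10/9 and the boxes merge into a single `Box(0, 0, 32, 10)` spanning both.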
CN201811126275.4A · Priority date 2018-09-26 · Filing date 2018-09-26 · A kind of character recognition method and device · Pending · CN110245545A (en)

Priority Applications (1)

Application Number · Priority Date · Filing Date · Title
CN201811126275.4A (CN110245545A) · 2018-09-26 · 2018-09-26 · A kind of character recognition method and device

Applications Claiming Priority (1)

Application Number · Priority Date · Filing Date · Title
CN201811126275.4A (CN110245545A) · 2018-09-26 · 2018-09-26 · A kind of character recognition method and device

Publications (1)

Publication Number · Publication Date
CN110245545A (en) · 2019-09-17

Family

ID=67882838

Family Applications (1)

Application Number · Title · Priority Date · Filing Date
CN201811126275.4A (CN110245545A, Pending) · A kind of character recognition method and device · 2018-09-26 · 2018-09-26

Country Status (1)

Country · Link
CN (1) · CN110245545A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN110991435A (en)* · 2019-11-27 · 2020-04-10 · 南京邮电大学 · A method and device for locating key information of express waybill based on deep learning
CN111310762A (en)* · 2020-03-16 · 2020-06-19 · 天津得迈科技有限公司 · Intelligent medical bill identification method based on Internet of things
CN111611985A (en)* · 2020-04-23 · 2020-09-01 · 中南大学 · An OCR recognition method based on model fusion
CN111639566A (en)* · 2020-05-19 · 2020-09-08 · 浙江大华技术股份有限公司 · Method and device for extracting form information
CN111666937A (en)* · 2020-04-17 · 2020-09-15 · 广州多益网络股份有限公司 · Method and system for recognizing text in image
CN112016547A (en)* · 2020-08-20 · 2020-12-01 · 上海天壤智能科技有限公司 · Image character recognition method, system and medium based on deep learning
CN112749695A (en)* · 2019-10-31 · 2021-05-04 · 北京京东尚科信息技术有限公司 · Text recognition method and device
CN113139539A (en)* · 2021-03-16 · 2021-07-20 · 中国科学院信息工程研究所 · Method and device for detecting characters of arbitrary-shaped scene with asymptotic regression boundary
CN113392844A (en)* · 2021-06-15 · 2021-09-14 · 重庆邮电大学 · Deep learning-based method for identifying text information on medical film
CN113762237A (en)* · 2021-04-26 · 2021-12-07 · 腾讯科技(深圳)有限公司 · Text image processing method, device and equipment and storage medium
CN113887375A (en)* · 2021-09-27 · 2022-01-04 · 浙江大华技术股份有限公司 · Text recognition method, device, equipment and storage medium
CN114627456A (en)* · 2020-12-10 · 2022-06-14 · 航天信息股份有限公司 · Bill text information detection method, device and system
CN114743200A (en)* · 2022-04-18 · 2022-07-12 · 福建捷宇电脑科技有限公司 · A recognition-based handwriting segmentation method for electronic signatures

Citations (4)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN106570497A (en)* · 2016-10-08 · 2017-04-19 · 中国科学院深圳先进技术研究院 · Text detection method and device for scene image
CN108073898A (en)* · 2017-12-08 · 2018-05-25 · 腾讯科技(深圳)有限公司 · Number of people area recognizing method, device and equipment
CN108288078A (en)* · 2017-12-07 · 2018-07-17 · 腾讯科技(深圳)有限公司 · Character identifying method, device and medium in a kind of image
CN108564084A (en)* · 2018-05-08 · 2018-09-21 · 北京市商汤科技开发有限公司 · Character detecting method, device, terminal and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN106570497A (en)* · 2016-10-08 · 2017-04-19 · 中国科学院深圳先进技术研究院 · Text detection method and device for scene image
CN108288078A (en)* · 2017-12-07 · 2018-07-17 · 腾讯科技(深圳)有限公司 · Character identifying method, device and medium in a kind of image
CN108073898A (en)* · 2017-12-08 · 2018-05-25 · 腾讯科技(深圳)有限公司 · Number of people area recognizing method, device and equipment
CN108564084A (en)* · 2018-05-08 · 2018-09-21 · 北京市商汤科技开发有限公司 · Character detecting method, device, terminal and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHI TIAN et al.: "Detecting Text in Natural Image with Connectionist Text Proposal Network", 14th European Conference on Computer Vision (ECCV)*
CAI Wenzhe et al.: "Image text detection method based on dual-threshold gradient pattern" (基于双门限梯度模式的图像文字检测方法), Computer Science (计算机科学)*

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN112749695A (en)* · 2019-10-31 · 2021-05-04 · 北京京东尚科信息技术有限公司 · Text recognition method and device
CN110991435A (en)* · 2019-11-27 · 2020-04-10 · 南京邮电大学 · A method and device for locating key information of express waybill based on deep learning
CN111310762A (en)* · 2020-03-16 · 2020-06-19 · 天津得迈科技有限公司 · Intelligent medical bill identification method based on Internet of things
CN111666937A (en)* · 2020-04-17 · 2020-09-15 · 广州多益网络股份有限公司 · Method and system for recognizing text in image
CN111611985A (en)* · 2020-04-23 · 2020-09-01 · 中南大学 · An OCR recognition method based on model fusion
CN111639566A (en)* · 2020-05-19 · 2020-09-08 · 浙江大华技术股份有限公司 · Method and device for extracting form information
CN111639566B (en)* · 2020-05-19 · 2024-08-09 · 浙江大华技术股份有限公司 · Method and device for extracting form information
CN112016547A (en)* · 2020-08-20 · 2020-12-01 · 上海天壤智能科技有限公司 · Image character recognition method, system and medium based on deep learning
CN114627456A (en)* · 2020-12-10 · 2022-06-14 · 航天信息股份有限公司 · Bill text information detection method, device and system
CN113139539A (en)* · 2021-03-16 · 2021-07-20 · 中国科学院信息工程研究所 · Method and device for detecting characters of arbitrary-shaped scene with asymptotic regression boundary
CN113762237A (en)* · 2021-04-26 · 2021-12-07 · 腾讯科技(深圳)有限公司 · Text image processing method, device and equipment and storage medium
CN113762237B (en)* · 2021-04-26 · 2023-08-18 · 腾讯科技(深圳)有限公司 · Text image processing method, device, equipment and storage medium
CN113392844A (en)* · 2021-06-15 · 2021-09-14 · 重庆邮电大学 · Deep learning-based method for identifying text information on medical film
CN113887375A (en)* · 2021-09-27 · 2022-01-04 · 浙江大华技术股份有限公司 · Text recognition method, device, equipment and storage medium
CN114743200A (en)* · 2022-04-18 · 2022-07-12 · 福建捷宇电脑科技有限公司 · A recognition-based handwriting segmentation method for electronic signatures

Similar Documents

Publication · Title
CN110245545A (en) · A kind of character recognition method and device
Yuliang et al. · Detecting curve text in the wild: New dataset and new solution
CN106683091B (en) · A kind of target classification and attitude detecting method based on depth convolutional neural networks
CN111445488B (en) · A Weakly Supervised Learning Approach to Automatically Identify and Segment Salt Bodies
CN108918536B (en) · Tire mold surface character defect detection method, device, equipment and storage medium
CN110991435A (en) · A method and device for locating key information of express waybill based on deep learning
CN109377445A (en) · Model training method, method, apparatus and electronic system for replacing image background
CN108898047A (en) · The pedestrian detection method and system of perception are blocked based on piecemeal
CN114549507B (en) · Improved Scaled-YOLOv4 fabric defect detection method
CN110414344B (en) · A video-based person classification method, intelligent terminal and storage medium
CN106548151B (en) · Target analyte detection track identification method and system towards intelligent robot
CN107871124A (en) · A remote sensing image target detection method based on deep neural network
CN113780087B (en) · A postal package text detection method and device based on deep learning
CN104881673B (en) · The method and system of pattern-recognition based on information integration
KR20220125719A (en) · Method and equipment for training target detection model, method and equipment for detection of target object, electronic equipment, storage medium and computer program
CN110598698B (en) · Natural scene text detection method and system based on adaptive regional suggestion network
CN113221956B (en) · Target identification method and device based on improved multi-scale depth model
CN110796018A (en) · A Hand Motion Recognition Method Based on Depth Image and Color Image
CN107609575A (en) · Calligraphy evaluation method, calligraphy evaluating apparatus and electronic equipment
CN113673482B (en) · Cell antinuclear antibody fluorescence recognition method and system based on dynamic label distribution
CN113420727B (en) · Training method and device for table detection model, and table detection method and device
CN112734803A (en) · Single target tracking method, device, equipment and storage medium based on character description
CN111696079A (en) · Surface defect detection method based on multi-task learning
CN110163208A (en) · A kind of scene character detecting method and system based on deep learning
CN110348423A (en) · A kind of real-time face detection method based on deep learning

Legal Events

Code · Title · Description
PB01 · Publication
SE01 · Entry into force of request for substantive examination
RJ01 · Rejection of invention patent application after publication · Application publication date: 2019-09-17

