CN108304814A - A kind of construction method and computing device of literal type detection model - Google Patents

A kind of construction method and computing device of literal type detection model

Info

Publication number
CN108304814A
Authority
CN
China
Prior art keywords
picture
region
character area
original image
literal type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810128155.1A
Other languages
Chinese (zh)
Other versions
CN108304814B (en)
Inventor
徐行
刘辉
刘宁
张东祥
郭龙
陈李江
李启林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan Avanti Technology Co ltd
Original Assignee
Hainan Cloud River Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan Cloud River Technology Co Ltd
Priority to CN201810128155.1A
Publication of CN108304814A
Application granted
Publication of CN108304814B
Status: Active
Anticipated expiration

Abstract

The invention discloses a construction method for a literal type detection model and a literal type detection method, both suitable for execution in a computing device. The model construction method includes: acquiring training pictures; expanding each training picture into a square picture; obtaining the results of annotating the printed text regions and handwritten text regions of each square picture; and training a convolutional neural network on each training picture and its annotation results to obtain the literal type detection model. The detection method includes: obtaining an original picture to be recognized and cutting it into multiple subgraphs; detecting the printed text regions and handwritten text regions in each subgraph with the literal type detection model to obtain the coordinate information and literal type of each text region; and merging adjacent cut text regions of the same type that belong to different subgraphs to obtain the printed text regions and handwritten text regions of the original picture. The invention also discloses a corresponding computing device.

Description

A kind of construction method and computing device of literal type detection model
Technical field
The present invention relates to the field of image data processing, and more particularly to a construction method for a literal type detection model, a literal type detection method, and a computing device.
Background technology
With the development of computer and Internet technology, people increasingly use automated equipment to grade student examination papers. In examination paper analysis, it is often necessary to identify whether the text in each recognition region is in a handwritten font or a printed font. Current character recognition methods are typically based on character color or simple character features. Such methods place very high demands on image quality; if the image contains shadows, handwriting bleed-through or blur, detection accuracy suffers. Moreover, such methods can usually only segment and detect text along horizontal lines, and cannot handle rotated images well. In addition, text itself has many features; detecting and distinguishing handwriting based only on color features fails to fully exploit the features of handwriting, which limits the detection performance to a certain extent.
Accordingly, it is desirable to provide a more effective method for detecting handwritten text and printed text.
Summary of the invention
In view of the above problems, the present invention proposes a construction method for a literal type detection model, a literal type detection method, and a computing device, in an effort to solve, or at least alleviate, the problems above.
According to an aspect of the present invention, a construction method for a literal type detection model is provided, suitable for execution in a computing device. The method includes: acquiring training pictures, wherein each training picture contains at least one of printed text and handwritten text; expanding each training picture into a square picture according to its length and width values; obtaining the results of annotating the printed text regions and handwritten text regions of each square picture; and training a convolutional neural network on each training picture and its annotation results to obtain the literal type detection model.
Optionally, in the construction method of the literal type detection model according to the present invention, the convolutional neural network includes 6 convolutional layers and 2 fully connected layers.
Optionally, in the construction method of the literal type detection model according to the present invention, the convolution kernels of the intermediate convolutional layers include 3*3, 5*5 and 7*7 kernels, and the final output layer covers 3 classes: printed text region, handwritten text region and background region.
Optionally, in the construction method of the literal type detection model according to the present invention, the operation of annotating the printed text regions and handwritten text regions of a square picture includes: determining each text line in the square picture and the text regions within each text line; annotating the text region type of each text line line by line, the text region types including printed text region and handwritten text region; and saving the coordinate information of each text region in each text line together with its text class.
Optionally, in the construction method of the literal type detection model according to the present invention, the step of expanding a training picture into a square picture according to its length and width values includes: selecting the larger of the length and the width to frame a white background picture, and placing the training picture at the center of the white background picture.
According to a further aspect of the invention, a literal type detection method is provided, suitable for execution in a computing device in which a literal type detection model is stored, the literal type detection model being built with the construction method described above. The literal type detection method includes: obtaining an original picture whose literal types are to be recognized, and cutting the original picture into multiple subgraphs, wherein the subgraphs do not overlap and are adjacent to one another; detecting the printed text regions and handwritten text regions in each subgraph with the literal type detection model, obtaining the coordinate information of each text region and the literal type it belongs to; and merging adjacent cut text regions of the same type that belong to different subgraphs, and taking the set of printed text regions and the set of handwritten text regions of all subgraphs as the printed text regions and handwritten text regions of the original picture.
Optionally, in the literal type detection method according to the present invention, the step of merging adjacent cut text regions of the same type that belong to different subgraphs includes: obtaining the first coordinate information of the printed text regions and handwritten text regions of each subgraph within that subgraph, and converting the first coordinate information into second coordinate information based on the original picture; detecting, according to the second coordinate information of each text region, whether two or more text regions of the same type are adjacent cuts of one another; and if so, merging these adjacent cut regions to obtain all printed text regions and handwritten text regions of the original picture.
Optionally, in the literal type detection method according to the present invention, the step of cutting the original picture into multiple subgraphs includes: expanding the original picture into a square picture according to its length and width values, and cutting the square picture into multiple subgraphs.
Optionally, in the literal type detection method according to the present invention, the coordinate information of a text region includes the top-left vertex coordinate and the bottom-right vertex coordinate of the text region.
Optionally, in the literal type detection method according to the present invention, if the coordinate of the top-left vertex of the original picture within its square picture is (x, y), the coordinate of the top-left vertex of a subgraph within the square picture is (x1, y1), and the coordinate of the top-left vertex of a text region within that subgraph is (x2, y2), then the coordinate of that text region within the original picture is (x1 + x2 - x, y1 + y2 - y).
According to a further aspect of the invention, a computing device is provided, including: at least one processor; and a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor and include instructions for executing the construction method of the literal type detection model and/or the literal type detection method described above.
According to a further aspect of the invention, a readable storage medium storing program instructions is provided. When the program instructions are read and executed by a computing device, the computing device executes the construction method of the literal type detection model and/or the literal type detection method described above.
According to the technical scheme of the present invention, during model training, a large number of text pictures carrying printed and handwritten text are acquired, each is given square expansion processing, the printed text regions and handwritten text regions therein are manually annotated, and the results are fed into a convolutional neural network for learning, yielding the literal type detection model. Square expansion processing effectively prevents the model training effect from degrading in the subsequent training process because annotated regions are too small or irregular in size. Line-by-line manual annotation along the horizontal direction enables the subsequent model training to recognize the text regions of a single line, avoiding the coarseness of whole-picture detection and improving the granularity and precision of detection.
During model use, an original picture to be recognized can be cut into multiple subgraphs according to its actual size, and the printed text regions and handwritten text regions in each subgraph are detected separately. Finally, the printed text regions and handwritten text regions of the subgraphs are merged to obtain the printed text regions and handwritten text regions of the original picture. Cutting the original picture into subgraphs suits the region detection model better than recognizing the full picture directly, improving the granularity and precision of recognition. Merging the results of all subgraphs recovers more realistic printed and handwritten text regions, reducing the region fragments formed during subgraph detection and yielding regions that better match the text distribution of the full picture.
Description of the drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in conjunction with the following description and the drawings. These aspects indicate the various ways in which the principles disclosed herein may be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features and advantages of the disclosure will become apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout the disclosure, like reference numerals generally refer to like components or elements.
Fig. 1 shows the structure diagram of computing device 100 according to an embodiment of the invention;
Fig. 2 shows the flow charts of the construction method 200 of literal type detection model according to an embodiment of the invention;
Fig. 3 shows the flow chart of literal type detection method 300 according to an embodiment of the invention;
Fig. 4A and Fig. 4B respectively illustrate sample pictures that meet the model training requirements;
Fig. 4C and 4D respectively illustrate sample pictures that do not meet the model training requirements;
Fig. 5A and Fig. 5B respectively illustrate schematic diagrams of square expansion processing of a picture;
Fig. 6 shows a schematic diagram of line-by-line annotation of the text regions of each text line according to an embodiment of the invention;
Fig. 7 shows the structural schematic diagram of convolutional neural networks according to an embodiment of the invention;
Fig. 8 shows a schematic diagram of adaptively cutting an original picture into multiple subgraphs according to an embodiment of the invention; and
Fig. 9 shows a schematic diagram of base coordinate system transformation according to an embodiment of the invention.
Detailed description of embodiments
Exemplary embodiments of the disclosure are described more fully below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Fig. 1 is a block diagram of an example computing device 100. In a basic configuration 102, the computing device 100 typically includes a system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processors 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level-1 cache 110 and a level-2 cache 112, a processor core 114 and registers 116. An example processor core 114 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, the system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM) or non-volatile memory (such as ROM or flash memory), or any combination thereof. The system memory 106 may include an operating system 120, one or more applications 122 and program data 124. In some embodiments, the applications 122 may be arranged to operate with the program data 124 on the operating system. The program data 124 includes instructions; in the computing device 100 according to the present invention, the program data 124 includes instructions for executing the construction method 200 of the literal type detection model and/or the literal type detection method 300.
The computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (for example, output devices 142, peripheral interfaces 144 and communication devices 146) to the basic configuration 102 via a bus/interface controller 130. Example output devices 142 include a graphics processing unit 148 and an audio processing unit 150, which may be configured to communicate with various external devices such as a display or speakers via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to communicate via one or more I/O ports 158 with external devices such as input devices (for example, a keyboard, mouse, pen, voice input device or touch input device) or other peripherals (such as a printer or scanner). An example communication device 146 may include a network controller 160, which may be arranged to facilitate communication with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied as computer-readable instructions, data structures or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) or other wireless media. The term computer-readable media as used herein may include both storage media and communication media.
The computing device 100 may be implemented as a server, such as a file server, database server, application server or web server, or as part of a small-form-factor portable (or mobile) electronic device, such as a cellular phone, a personal digital assistant (PDA), a personal media player device, a wireless web-browsing device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. The computing device 100 may also be implemented as a personal computer including both desktop and notebook configurations. In some embodiments, the computing device 100 is configured to execute the construction method 200 of the literal type detection model and/or the literal type detection method 300 according to the present invention.
Fig. 2 shows a construction method 200 of a literal type detection model according to an embodiment of the invention, which can be executed in a computing device, such as the computing device 100. As shown in Fig. 2, the method starts at step S220.
In step S220, training pictures are acquired, wherein each training picture contains at least one of printed text and handwritten text.
For a specific application scenario, text pictures containing printed and/or handwritten text under that scenario can be collected. It should be noted that the text lines in a picture should not be too many or too crowded, in order to reduce the labor cost of subsequent manual annotation. Fig. 4A and Fig. 4B respectively illustrate sample pictures that meet the model training requirements, with a suitable number of text lines and suitable spacing; Fig. 4C and 4D respectively illustrate sample pictures that do not meet the requirements, with text lines that are too many and too crowded.
Then, in step S240, each training picture is expanded into a square picture according to its length and width values.
The collected training pictures do not necessarily meet the training requirements of the subsequent detection model, so each picture undergoes square expansion processing. This reduces problems in later model training where the training effect degrades because annotated regions are too small or irregular in size. Square expansion can proceed from the original size of the picture (say length w and height h): choose the larger of w and h to frame a white background image, and place the picture at the center of the white image, so that the original picture is expanded into a square picture of w*w or h*h. Fig. 5A and Fig. 5B respectively illustrate two examples of square processing: in Fig. 5A the picture width w exceeds the height h, so the picture is expanded to a square according to the width value w; in Fig. 5B the picture width w is less than the height h, so it is expanded according to the height value h. Of course, if the picture itself is already square, no square expansion is needed.
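The square expansion step can be sketched as a small geometric helper. The function name and the centering arithmetic below are illustrative assumptions; the patent only specifies a white canvas sized by the larger of the two dimensions with the picture placed at its center.

```python
def expand_to_square(w, h):
    """Compute the side of the white square canvas and the (left, top)
    offset at which a w*h picture is pasted so it sits at the center.
    The canvas itself would be filled with white, RGB (255, 255, 255)."""
    side = max(w, h)                       # expand along the larger dimension
    offset = ((side - w) // 2, (side - h) // 2)
    return side, offset

print(expand_to_square(640, 480))  # landscape: expanded to a 640*640 square
print(expand_to_square(480, 640))  # portrait: expanded to a 640*640 square
print(expand_to_square(500, 500))  # already square: offset (0, 0)
```

In an actual pipeline the returned offset would be passed to an image library's paste operation; only the geometry is shown here.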
Then, in step S260, the results of annotating the printed text regions and handwritten text regions of each square picture are obtained.
The operation of annotating the printed text regions and handwritten text regions of a square picture includes: determining each text line in the square picture and the text regions within each text line; annotating the text region type of each text line line by line, the text region types including printed text region and handwritten text region; and saving the coordinate information of each text region in each text line together with its text class. The coordinate information of a text region generally includes the top-left vertex coordinate and the bottom-right vertex coordinate of the region, although other coordinate representations may of course be chosen, such as the bottom-left and top-right vertex coordinates, or the top-left vertex coordinate together with the length and width of the region, as long as the position of a text region can be represented accurately; the present invention does not limit this. In addition, it should be understood that text regions may be recognized with any existing region recognition method, such as an OCR recognition method; the invention is not limited in this regard.
Fig. 6 shows a schematic diagram of line-by-line annotation of the text regions of each text line according to an embodiment of the invention. All 4 text lines are printed text; each of the first 3 text lines contains one text region, while the 4th text line contains four text regions. This line-by-line annotation enables the subsequent model training to recognize the text regions of a single line, avoids coarse whole-picture detection results, and improves the granularity and precision of detection.
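The line-by-line annotations described above might be stored as simple records. The field names below are hypothetical; the patent only requires that each region's coordinate information and text class be saved per text line.

```python
# Each record: text line number, (x0, y0, x1, y1) top-left / bottom-right
# vertex coordinates, and the text class of the region. The values are
# made-up illustrations of the Fig. 6 layout (3 one-region lines, then
# one line with four regions).
annotations = [
    {"line": 1, "bbox": (12, 10, 310, 42), "type": "printed"},
    {"line": 2, "bbox": (12, 50, 310, 82), "type": "printed"},
    {"line": 3, "bbox": (12, 90, 310, 122), "type": "printed"},
    {"line": 4, "bbox": (12, 130, 80, 162), "type": "printed"},
    {"line": 4, "bbox": (95, 130, 160, 162), "type": "printed"},
]

def regions_in_line(annos, line_no):
    """Collect the annotated text regions of one text line."""
    return [a["bbox"] for a in annos if a["line"] == line_no]

print(len(regions_in_line(annotations, 1)))  # one region in line 1
print(len(regions_in_line(annotations, 4)))  # multiple regions in line 4
```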
Then, in step S280, a convolutional neural network is trained on each training picture and its annotation results to obtain the literal type detection model.
The present invention performs model training on a picture set of a certain scale that has already been annotated. Specifically, the square-processed picture set and the annotation information of each picture are used to train a detection model based on an improved fast region convolutional neural network. The training model is adapted from a detection model based on a fast region convolutional neural network (the ZF network). Those skilled in the art can set the structure of the convolutional neural network and the content of each layer as needed; the invention is not limited in this regard.
According to an embodiment of the present invention, the convolutional neural network includes 6 convolutional layers and 2 fully connected layers; Fig. 7 shows its structural schematic diagram. Since a deep neural network requires a fixed input picture size (pictures of different sizes must be cropped to a specified size), the present invention cuts the input w*w or h*h original pictures to a unified size, such as 224*224, through multi-scale processing, ensuring that the model can support multi-scale image input. In addition, the intermediate convolutional layers can add convolution kernels of various sizes, such as 3x3, 5x5 and 7x7 kernels, with a parameter reduction strategy applied as appropriate after the convolutional layers. The number of classes of the final output layer is set to 3, covering the three categories of printed text, handwritten text and background. Here, background refers to plain white background with pixel value RGB (255, 255, 255), which neither interferes with nor influences the original picture region during neural network computation. Of course, each layer structure of the convolutional neural network can also be set to other values as needed; the present invention does not limit this.
As shown in Fig. 7, the convolutional neural network contains a 12-layer structure, where the code names of the layers are Input Layer (input data layer), conv (convolutional layer), pool (pooling layer), fc (fully connected layer) and output (output layer). In Fig. 7 some convolutional layers are paired with pooling layers, such as conv2+pool2, conv3+pool3 and conv5+pool5, while others are standalone convolutional layers without pooling layers, such as conv1, conv4 and conv6. That is, the complete structure of the convolutional neural network is: input layer → first convolutional layer → second convolutional layer + second pooling layer → third convolutional layer + third pooling layer → fourth convolutional layer → fifth convolutional layer + fifth pooling layer → sixth convolutional layer → first fully connected layer → second fully connected layer → output layer. The parameters of each layer are as shown in the table:
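Since the parameter table itself is not reproduced in this text, the following sketch only illustrates the standard output-size arithmetic such a conv/pool stack relies on. The 7x7, 5x5 and 3x3 kernel sizes come from the text above; the strides, padding and the number of pooled stages are assumptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Trace a 224x224 crop (the unified input size mentioned above) through
# three assumed conv+pool stages: "same"-style padding keeps the conv
# output size, and each 2x2 / stride-2 pooling roughly halves it.
size = 224
for k in (7, 5, 3):
    size = conv_out(size, k, stride=1, padding=k // 2)  # convolution
    size = conv_out(size, 2, stride=2)                  # max pooling
print(size)  # 224 -> 112 -> 56 -> 28
```

The same formula is what fixes the input size requirement: the fully connected layers at the end expect one specific spatial size, which is why differently sized pictures are first cropped to 224*224.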
In addition, cross validation may be used for model selection during training: the entire picture set is divided into three parts, a training set, a validation set and a test set; training is carried out on the training set pictures; training models at appropriate epochs are selected according to the decrease of the loss function over the iteration cycles, and their detection performance is tested on the validation set; and the training model that performs best on the validation set is chosen as the candidate optimal training model.
Fig. 3 shows a literal type detection method 300 according to an embodiment of the invention, which can be executed in a computing device, such as the computing device 100. A literal type detection model as described above is stored in the computing device, the literal type detection model being built with the construction method described above. As shown in Fig. 3, the method starts at step S320.
In step S320, an original picture whose literal types are to be recognized is obtained, and the original picture is cut into multiple subgraphs, wherein the subgraphs do not overlap and are adjacent to one another.
As noted above, prior-art methods for detecting printed and handwritten text place high demands on the image, usually requiring high-definition images obtained by scanner. The present invention provides a literal type detection model that effectively reduces the requirement on image definition. Therefore, the original picture to be recognized can be a high-definition text image obtained by a scanner, or an image captured by a mobile phone or camera. Moreover, picture acquisition has no strict environmental requirements (such as illumination, angle or paper texture); ordinary photographing of plain paper under natural lighting suffices, which effectively improves the universality of text image recognition and reduces the workload and cost of image recognition.
The original picture can be cut with an adaptive cutting method, i.e., the original picture is divided into regions according to its length and width, where the regions do not overlap and are adjacent to one another, and each region serves as one subgraph (as shown by the picture cutting in Fig. 8). Usually, the size of a subgraph can be limited to no more than 480*320, so that an original picture of 1920*1280 can be cut into 16-20 subgraphs. Cutting into subgraphs suits the region detection model better than recognizing the full picture directly, and improves the granularity and precision of recognition. Further, the original picture can also first be expanded into a square picture according to its length and width values, and the square picture then cut into multiple subgraphs. The square expansion method is as described above and is not repeated here.
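One way the adaptive cutting could work is sketched below. The text only fixes the 480*320 upper bound and the non-overlapping, adjacent tiling; the even-grid layout and the function name are assumptions.

```python
import math

def adaptive_cut(width, height, max_w=480, max_h=320):
    """Divide a picture into non-overlapping, edge-adjacent tiles no
    larger than max_w x max_h, spreading the sizes as evenly as possible.
    Each tile is (x0, y0, x1, y1) in picture coordinates."""
    cols = math.ceil(width / max_w)
    rows = math.ceil(height / max_h)
    return [
        (c * width // cols, r * height // rows,
         (c + 1) * width // cols, (r + 1) * height // rows)
        for r in range(rows) for c in range(cols)
    ]

tiles = adaptive_cut(1920, 1280)
print(len(tiles))   # a 1920*1280 picture yields a 4 x 4 grid: 16 subgraphs
print(tiles[0])     # first tile: (0, 0, 480, 320)
```

This lands at the low end of the 16-20 subgraphs mentioned above; a layout with a small overlap margin, for example, would give more tiles.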
Then, in step S340, the printed text regions and handwritten text regions in each subgraph are detected with the literal type detection model, obtaining the coordinate information of each text region and the literal type it belongs to. That is, printed and handwritten text region detection is carried out one by one on each subgraph obtained from the cutting in step S320, obtaining the coordinate information of the printed and handwritten text regions in each subgraph and the type of each detected region (printed or handwritten). Similarly, the coordinate information of a text region includes the top-left vertex coordinate and the bottom-right vertex coordinate of the region, but is not limited to this, as long as the position of the text region can be represented accurately.
Then, in step S360, adjacent cut text regions of the same type that belong to different subgraphs are merged, and the set of printed text regions and the set of handwritten text regions of all subgraphs are taken as the printed text regions and handwritten text regions of the original picture.
The printed regions and handwritten regions of all subgraphs are merged respectively, which recovers more realistic printed and handwritten text regions and reduces the region fragments formed during subgraph detection, yielding regions that better match the text distribution of the full picture. The rules for merging subgraph results include: 1) the regions of the same type in different subgraphs are gathered together as the regions of the corresponding type of the original picture; 2) since the detected (printed or handwritten) region information in each subgraph is first coordinate information based on that subgraph, the first coordinate information needs to be mapped to second coordinate information based on the original picture (which involves a base coordinate system transformation); 3) after conversion into the second coordinate information based on the original picture, it is detected whether two or more regions are adjacent cuts of one another, and if they overlap, those regions are merged; 4) finally, all non-overlapping printed and handwritten regions of the original picture are collated and obtained.
According to one embodiment of the present invention, if the coordinate of the top-left vertex of the original picture within its enclosing square picture is (x, y), the coordinate of the top-left vertex of a sub-image within that square picture is (x1, y1), and the coordinate of the top-left vertex of a text region within that sub-image is (x2, y2), then the coordinate of that text region in the original picture is (x1+x2-x, y1+y2-y).
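The mapping above can be written directly as a small function; variable names follow the formula in the text (this is only a sketch of the stated arithmetic):

```python
def to_original_coords(x, y, x1, y1, x2, y2):
    """Map a text region's top-left vertex from sub-image coordinates
    (x2, y2) to original-picture coordinates, given the original picture's
    top-left (x, y) and the sub-image's top-left (x1, y1), both measured
    within the square picture."""
    return (x1 + x2 - x, y1 + y2 - y)
```

For example, with the original picture at (x, y) = (100, 50) inside the square picture, a sub-image at (0, 0), and a region at (150, 80) in that sub-image, the region's top-left in the original picture is (50, 30).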
Fig. 9 is a schematic diagram of the base coordinate system conversion principle according to an embodiment of the present invention, mainly showing how the coordinates of a text region detected in a sub-image are converted to coordinates in the original w*w or h*h picture after square expansion. As shown in Fig. 9, in the square picture obtained by expansion (including the white background), the text picture region occupies only the central part, and the top-left vertex of this region (the five-pointed star at the left border) has coordinate (x, y). Since the printed/handwritten text detection of the present invention is performed on sub-images 1-4 (the square-expanded picture is cut into 4 pieces in the example figure; it could of course be cut into another number of sub-images, such as 8, 12 or 16), the coordinates of the detected printed or handwritten text are also based on the sub-images, i.e. the first coordinate information. For example, the rectangular handwritten region in sub-image 2 has top-left vertex coordinate (x2, y2); this coordinate value is relative to the vertex of sub-image 2 (the five-pointed star at the upper border in the figure). The goal of the present invention is to convert the coordinate (x2, y2) into the coordinate value (x2', y2') relative to the original picture's vertex (x, y) in the square picture, i.e. the second coordinate information based on the original picture. By calculation, x2'=x1+x2-x and y2'=y1+y2-y.
According to another embodiment of the present invention, after obtaining the second coordinate information of each text region relative to the original picture, it can be detected whether two or more regions are adjacently cut. Here, "adjacently cut" means that printed or handwritten regions at the edges of different sub-images are adjacent, which mainly covers the case where a single text region has been split apart by different sub-images. Such split text needs to be merged to obtain a complete line of text. In general, whether two text regions are adjacently cut can be determined from their top-left and bottom-right vertex coordinate values; when regions are adjacently cut, one abscissa or ordinate value is usually identical. For example, the rectangular boxes of sub-image 1 and sub-image 3 in Fig. 9 are adjacently cut: they form one whole region in the original picture and therefore need to be merged.
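One possible reading of this adjacency criterion is sketched below: two axis-aligned regions are adjacently cut when one edge coordinate coincides (the shared sub-image border) while the regions overlap along the other axis. This is an illustrative interpretation, not the patent's literal algorithm.

```python
# A region is ((x_top_left, y_top_left), (x_bottom_right, y_bottom_right))
# in original-picture (second) coordinates.

def adjacently_cut(a, b):
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    # stacked vertically: the bottom edge of one equals the top edge of the
    # other, with overlap in the horizontal direction
    vertical = (ay2 == by1 or by2 == ay1) and (ax1 < bx2 and bx1 < ax2)
    # side by side: the right edge of one equals the left edge of the other,
    # with overlap in the vertical direction
    horizontal = (ax2 == bx1 or bx2 == ax1) and (ay1 < by2 and by1 < ay2)
    return vertical or horizontal
```

In practice a small tolerance instead of exact equality may be needed, since detected boxes rarely align to the pixel.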
Specifically, adjacently cut text regions of the same type can be merged as follows: obtain, for each sub-image, the first coordinate information of its printed text regions and handwritten text regions in that sub-image, and convert this first coordinate information to second coordinate information based on the original picture; then, according to the second coordinate information of each text region, detect whether two or more text regions of the same type are adjacently cut, and if so, merge these adjacently cut regions to obtain all printed text regions and handwritten text regions in the original picture. Merging here may mean taking the maximal union region of the two or more text regions.
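Taking the "maximal union region" of two rectangles amounts to the smallest axis-aligned rectangle covering both; a sketch of that step:

```python
# Merge two adjacently cut regions of the same type into their maximal
# union region. Regions are ((x_tl, y_tl), (x_br, y_br)) pairs in
# original-picture coordinates.

def merge_union(a, b):
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    return ((min(ax1, bx1), min(ay1, by1)),
            (max(ax2, bx2), max(ay2, by2)))
```

Per the text, this merge is applied only to regions of the same type (printed with printed, handwritten with handwritten) that were detected as adjacently cut.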
According to the technical solution of the present invention, performing square expansion on each picture reduces the problem in subsequent model training of labeled regions being too small or irregular in size, which would degrade the training effect. Labeling the training pictures line by line in the horizontal direction allows subsequent model training to recognize single-line text regions, avoiding coarse whole-image detection results and improving the granularity and precision of detection. In view of the characteristics of the image data set in the present invention, the network structure is modified and model training is performed based on an improved fast region convolutional neural network, so that model performance is higher. Cutting into sub-images makes the input better suited to the region detection model; compared with recognizing directly on the original picture, the granularity and precision of recognition can be improved. Merging the sub-image results yields printed and handwritten text regions that better match reality, reducing the region fragments formed by per-sub-image detection and producing regions that better conform to the actual distribution of text in the original picture.
B9. The method of any one of B6-B8, wherein the coordinate information of a text region includes the top-left vertex coordinate and the bottom-right vertex coordinate of that text region.
B10. The method of B7, wherein if the coordinate of the top-left vertex of the original picture within its enclosing square picture is (x, y), the coordinate of the top-left vertex of a sub-image within that square picture is (x1, y1), and the coordinate of the top-left vertex of a text region within that sub-image is (x2, y2), then the coordinate of that text region in the original picture is (x1+x2-x, y1+y2-y).
Numerous specific details are set forth in the description provided here. It is understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiments, or alternatively may be located in one or more devices different from the devices in the examples. The modules in the foregoing examples may be combined into one module or may additionally be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the devices of the embodiments may be adaptively changed and arranged in one or more devices different from those of the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component, and may additionally be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various techniques described herein may be implemented in connection with hardware or software, or a combination thereof. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy disks, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to execute, according to the instructions in the program code stored in the memory, the method for constructing a character type detection model and/or the character type detection method of the present invention.
In addition, some of the embodiments are described herein as methods, or as combinations of method elements, that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor having the necessary instructions for carrying out such a method or method element forms a means for carrying out the method or method element. Furthermore, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third", etc. to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be devised within the scope of the invention as thus described. It should also be noted that the language used in this specification has been principally selected for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, the scope of the invention being defined by the appended claims.

Claims (10)

CN201810128155.1A | 2018-02-08 | 2018-02-08 | Method for constructing character type detection model and computing equipment | Active | CN108304814B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810128155.1A | 2018-02-08 | 2018-02-08 | CN108304814B (en) Method for constructing character type detection model and computing equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810128155.1A | 2018-02-08 | 2018-02-08 | CN108304814B (en) Method for constructing character type detection model and computing equipment

Publications (2)

Publication Number | Publication Date
CN108304814A (en) | 2018-07-20
CN108304814B (en) | 2020-07-14

Family

ID=62864779

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810128155.1A | Active, CN108304814B (en), Method for constructing character type detection model and computing equipment | 2018-02-08 | 2018-02-08

Country Status (1)

Country | Link
CN (1) | CN108304814B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109263271A (en)* | 2018-08-15 | 2019-01-25 | 同济大学 | Printing equipment determination method based on big data
CN109685055A (en)* | 2018-12-26 | 2019-04-26 | 北京金山数字娱乐科技有限公司 | Method and device for detecting text regions in an image
CN109740473A (en)* | 2018-12-25 | 2019-05-10 | 东莞市七宝树教育科技有限公司 | Picture content automatic marking method and system based on paper marking system
CN109766879A (en)* | 2019-01-11 | 2019-05-17 | 北京字节跳动网络技术有限公司 | Character detection model generation, character detection method, device, equipment and medium
CN109919037A (en)* | 2019-02-01 | 2019-06-21 | 汉王科技股份有限公司 | Text positioning method and device, text recognition method and device
CN109977762A (en)* | 2019-02-01 | 2019-07-05 | 汉王科技股份有限公司 | Text positioning method and device, text recognition method and device
CN110059559A (en)* | 2019-03-15 | 2019-07-26 | 深圳壹账通智能科技有限公司 | Processing method for OCR-identified files and electronic equipment thereof
CN110321788A (en)* | 2019-05-17 | 2019-10-11 | 平安科技(深圳)有限公司 | Training data processing method, device, equipment and computer-readable storage medium
CN110490232A (en)* | 2019-07-18 | 2019-11-22 | 北京捷通华声科技股份有限公司 | Method, device, equipment and medium for training a text line direction prediction model
CN111144191A (en)* | 2019-08-14 | 2020-05-12 | 广东小天才科技有限公司 | Font identification method and device, electronic equipment and storage medium
CN111191668A (en)* | 2018-11-15 | 2020-05-22 | 零氪科技(北京)有限公司 | Method for identifying disease content in medical record text
CN111275139A (en)* | 2020-01-21 | 2020-06-12 | 杭州大拿科技股份有限公司 | Handwritten content removal method, device and storage medium
CN111582267A (en)* | 2020-04-08 | 2020-08-25 | 北京皮尔布莱尼软件有限公司 | Text detection method, computing device and readable storage medium
CN111611421A (en)* | 2019-02-26 | 2020-09-01 | 鸿富锦精密工业(武汉)有限公司 | Image augmentation and annotation method, device and computer storage medium
CN111753830A (en)* | 2020-06-22 | 2020-10-09 | 作业不凡(北京)教育科技有限公司 | Job image correction method and computing device
CN113901952A (en)* | 2021-11-06 | 2022-01-07 | 浙江星算科技有限公司 | Character recognition method separating printed and handwritten text based on deep learning
CN114120305A (en)* | 2021-11-26 | 2022-03-01 | 北京百度网讯科技有限公司 | Training method of text classification model, and recognition method and device of text content

Citations (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20050102135A1 (en)* | 2003-11-12 | 2005-05-12 | Silke Goronzy | Apparatus and method for automatic extraction of important events in audio signals
CN104966097A (en)* | 2015-06-12 | 2015-10-07 | 成都数联铭品科技有限公司 | Complex character recognition method based on deep learning
CN105574513A (en)* | 2015-12-22 | 2016-05-11 | 北京旷视科技有限公司 | Character detection method and device
CN105809164A (en)* | 2016-03-11 | 2016-07-27 | 北京旷视科技有限公司 | Character identification method and device
CN105956626A (en)* | 2016-05-12 | 2016-09-21 | 成都新舟锐视科技有限公司 | Deep-learning-based license plate recognition method insensitive to license plate position
CN106874902A (en)* | 2017-01-19 | 2017-06-20 | 博康智能信息技术有限公司北京海淀分公司 | License plate information recognition method and device
CN107346629A (en)* | 2017-08-22 | 2017-11-14 | 贵州大学 | Intelligent blind reading method and intelligent blind reader system
CN107403130A (en)* | 2017-04-19 | 2017-11-28 | 北京粉笔未来科技有限公司 | Character identification method and character recognition device


Cited By (27)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109263271A (en)* | 2018-08-15 | 2019-01-25 | 同济大学 | Printing equipment determination method based on big data
CN109263271B (en)* | 2018-08-15 | 2020-06-12 | 同济大学 | Printing equipment detection and analysis method based on big data
CN111191668A (en)* | 2018-11-15 | 2020-05-22 | 零氪科技(北京)有限公司 | Method for identifying disease content in medical record text
CN111191668B (en)* | 2018-11-15 | 2023-04-28 | 零氪科技(北京)有限公司 | Method for identifying disease content in medical record text
CN109740473A (en)* | 2018-12-25 | 2019-05-10 | 东莞市七宝树教育科技有限公司 | Picture content automatic marking method and system based on paper marking system
CN109740473B (en)* | 2018-12-25 | 2020-10-16 | 东莞市七宝树教育科技有限公司 | Picture content automatic marking method and system based on paper marking system
CN109685055A (en)* | 2018-12-26 | 2019-04-26 | 北京金山数字娱乐科技有限公司 | Method and device for detecting text regions in an image
CN109685055B (en)* | 2018-12-26 | 2021-11-12 | 北京金山数字娱乐科技有限公司 | Method and device for detecting text area in image
CN109766879A (en)* | 2019-01-11 | 2019-05-17 | 北京字节跳动网络技术有限公司 | Character detection model generation, character detection method, device, equipment and medium
CN109919037B (en)* | 2019-02-01 | 2021-09-07 | 汉王科技股份有限公司 | Text positioning method and device and text recognition method and device
CN109977762A (en)* | 2019-02-01 | 2019-07-05 | 汉王科技股份有限公司 | Text positioning method and device, text recognition method and device
CN109919037A (en)* | 2019-02-01 | 2019-06-21 | 汉王科技股份有限公司 | Text positioning method and device, text recognition method and device
CN111611421A (en)* | 2019-02-26 | 2020-09-01 | 鸿富锦精密工业(武汉)有限公司 | Image augmentation and annotation method, device and computer storage medium
CN110059559A (en)* | 2019-03-15 | 2019-07-26 | 深圳壹账通智能科技有限公司 | Processing method for OCR-identified files and electronic equipment thereof
CN110321788A (en)* | 2019-05-17 | 2019-10-11 | 平安科技(深圳)有限公司 | Training data processing method, device, equipment and computer-readable storage medium
CN110321788B (en)* | 2019-05-17 | 2024-07-02 | 平安科技(深圳)有限公司 | Training data processing method, device, equipment and computer-readable storage medium
CN110490232A (en)* | 2019-07-18 | 2019-11-22 | 北京捷通华声科技股份有限公司 | Method, device, equipment and medium for training a text line direction prediction model
CN110490232B (en)* | 2019-07-18 | 2021-08-13 | 北京捷通华声科技股份有限公司 | Method, device, equipment and medium for training a text line direction prediction model
CN111144191B (en)* | 2019-08-14 | 2024-03-22 | 广东小天才科技有限公司 | Font identification method, device, electronic equipment and storage medium
CN111144191A (en)* | 2019-08-14 | 2020-05-12 | 广东小天才科技有限公司 | Font identification method and device, electronic equipment and storage medium
CN111275139A (en)* | 2020-01-21 | 2020-06-12 | 杭州大拿科技股份有限公司 | Handwritten content removal method, device and storage medium
CN111275139B (en)* | 2020-01-21 | 2024-02-23 | 杭州大拿科技股份有限公司 | Handwritten content removal method, device and storage medium
CN111582267A (en)* | 2020-04-08 | 2020-08-25 | 北京皮尔布莱尼软件有限公司 | Text detection method, computing device and readable storage medium
CN111582267B (en)* | 2020-04-08 | 2023-06-02 | 北京皮尔布莱尼软件有限公司 | Text detection method, computing device and readable storage medium
CN111753830A (en)* | 2020-06-22 | 2020-10-09 | 作业不凡(北京)教育科技有限公司 | Job image correction method and computing device
CN113901952A (en)* | 2021-11-06 | 2022-01-07 | 浙江星算科技有限公司 | Character recognition method separating printed and handwritten text based on deep learning
CN114120305A (en)* | 2021-11-26 | 2022-03-01 | 北京百度网讯科技有限公司 | Training method of text classification model, and recognition method and device of text content

Also Published As

Publication number | Publication date
CN108304814B (en) | 2020-07-14

Similar Documents

Publication | Title
CN108304814A (en) | Construction method and computing device of a character type detection model
CN110443250B (en) | Method and device for identifying the category of a contract seal, and computing equipment
CN106780512B (en) | Method, application and computing device for segmenting an image
CN109829453A (en) | Method, device and computing equipment for recognizing text in a blocked card
CN110674804A (en) | Text image detection method and device, computer equipment and storage medium
Bovik | The essential guide to image processing
CN108416345A (en) | Answer card area recognition method and computing device
CN112949766A (en) | Target area detection model training method, system, device and medium
CN107798321A (en) | Examination paper analysis method and computing device
US9519734B2 (en) | Systems and methods for improved property inspection management
CN109978063B (en) | Method for generating an alignment model of a target object
CN108898142B (en) | Recognition method of handwritten formula and computing device
CN108762740B (en) | Page data generation method and device, and electronic equipment
US9099007B1 (en) | Computerized processing of pictorial responses in evaluations
CN109684980A (en) | Automatic marking method and device
CN110097059B (en) | Document image binarization method, system and device based on generative adversarial network
CN110427946B (en) | Document image binarization method and device, and computing equipment
CN111626295B (en) | Training method and device for license plate detection model
US8042039B2 (en) | Populating a dynamic page template with digital content objects according to constraints specified in the dynamic page template
CN109117760A (en) | Image processing method, device, electronic equipment and computer-readable medium
KR102239588B1 (en) | Image processing method and apparatus
CN107977624A (en) | Semantic segmentation method, device and system
CN106204424B (en) | Image watermark removal method, device and computing equipment
US20200175727A1 (en) | Color Handle Generation for Digital Image Color Gradients using Machine Learning
CN111768405B (en) | Method, device, equipment and storage medium for processing a marked image

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
CP03 | Change of name, title or address

Address after: 571924 Hainan Ecological Software Park, Laocheng High-tech Industrial Demonstration Zone, Haikou City, Hainan Province

Patentee after: Hainan Avanti Technology Co., Ltd.

Address before: 571924 Hainan Ecological Software Park, Laocheng High-tech Industrial Demonstration Zone, Hainan Province

Patentee before: HAINAN YUNJIANG TECHNOLOGY CO., LTD.

