Blurred text enhancement method and device based on a deep neural network
Technical field
The present invention relates to the technical field of image processing, and more particularly to a blurred text enhancement method and device based on a deep neural network.
Background art
With the development of society and the progress of science and technology, people place ever higher demands on image processing technology. In image processing, text recognition is an important basic technology with great practical value and broad application prospects, especially text recognition in natural scene images. For example, text in an image can be recognized by OCR. OCR (Optical Character Recognition) refers to the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text using a character recognition method.
Text is an important information carrier. According to incomplete statistics, the main body of about 90% of information resources is still provided by document information. With the rapid development of information technology, automatic recognition of this information has become a trend and a focus of attention. The automatic recognition rate for text in high-quality text images can exceed 99%.
However, in the prior art, when image quality declines, and in particular when blur is caused by low image resolution or by an image that is itself unclear, the text recognition rate declines as well.
Summary of the invention
In view of the above shortcomings of the prior art, an object of the present invention is to provide a blurred text enhancement method and device based on a deep neural network, so as to solve the problem in the prior art that text in an image cannot be accurately recognized when the image is blurred.
To achieve the above object and other related objects, the present invention provides a blurred text enhancement method based on a deep neural network, comprising:
establishing a reference database;
collecting a test image containing text;
dividing the test image into multiple test image blocks according to an image block division rule;
searching the reference database with each test image block as the retrieval target, and selecting the multiple preset image blocks most similar to the test image block;
fusing the multiple most similar preset image blocks into a restored image block by weighted fusion according to fusion coefficients, and fusing the adjacent restored image blocks of the image by weighted fusion to obtain a restored image.
Another object of the present invention is to provide a blurred text enhancement device based on a deep neural network, comprising:
a reference database, for establishing the reference database;
an acquisition module, for collecting a test image containing text;
a processing module, for dividing the test image into multiple test image blocks according to an image block division rule;
a retrieval module, for searching the reference database with each test image block as the retrieval target, and selecting the multiple preset image blocks most similar to the test image block;
a fusion module, for fusing the multiple most similar preset image blocks into a restored image block by weighted fusion according to fusion coefficients, and fusing the adjacent restored image blocks of the image by weighted fusion to obtain a restored image.
As described above, the blurred text enhancement method and device based on a deep neural network of the present invention have the following beneficial effects:
The present invention builds a reference database and trains a recognition model on that database; collects a test image containing text; divides the test image into multiple test image blocks; matches, based on deep neural network features, the preset image blocks in the database that are most similar to each image block; fuses the multiple most similar preset image blocks by weighted fusion to obtain restored image blocks; and restores the adjacent restored image blocks into a clear image according to their positions in the image. Introducing deep neural network features when building the reference database and retrieving image blocks improves the robustness of image block retrieval. Even in an offline state, an image containing blurred text can be restored to a clear image using the trained database, which makes the text in the image easier to display or recognize and improves its recognizability and clarity.
Brief description of the drawings
Fig. 1 is a flow chart of a blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 2 is a detailed flow chart of step S1 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 3 is a schematic diagram of image segmentation in the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 4 is a detailed flow chart of step S4 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 5 is a detailed flow chart of step S5 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 6 is a schematic diagram of the cell structure in the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 7 is a flow chart of a first implementation of the blurred text enhancement method based on a deep neural network provided by the present invention;
Fig. 8 is a structural block diagram of a blurred text enhancement device based on a deep neural network provided by the present invention;
Fig. 9 is a structural block diagram of the database in the blurred text enhancement device based on a deep neural network provided by the present invention;
Fig. 10 is a structural block diagram of the retrieval module in the blurred text enhancement device based on a deep neural network provided by the present invention;
Fig. 11 is a structural block diagram of the fusion module in the blurred text enhancement device based on a deep neural network provided by the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described below by way of specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other, different specific embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present invention. It should be noted that, where no conflict arises, the following embodiments and the features in them can be combined with one another.
It should be noted that the drawings provided with the following embodiments illustrate the basic concept of the present invention only schematically. The drawings show only the components related to the present invention and are not drawn according to the number, shape, and size of the components in an actual implementation. In an actual implementation the form, quantity, and proportion of each component may be changed at will, and the component layout may be more complex.
Embodiment 1
Referring to Fig. 1, a flow chart of a blurred text enhancement method based on a deep neural network provided by the present invention, the method comprises:
Step S1, establishing a reference database;
Specifically, the reference database is established in order to build a prior knowledge base for text deblurring, dedicated to assisting blurred image enhancement.
Step S2, collecting a test image containing text;
Specifically, what is collected is a blurred text image (picture) to be restored; the text includes text rows, text lines, characters, and the like.
Step S3, dividing the test image into multiple test image blocks according to an image block division rule;
Specifically, the test image is first normalized to a standard format and then uniformly divided into blocks according to pixel size, yielding multiple test image blocks. Under the image block division rule, each image block is labeled by its character and its block position.
Step S4, searching the reference database with each test image block as the retrieval target, and selecting the multiple preset image blocks most similar to the test image block;
Specifically, the test image blocks obtained by segmentation are used as retrieval targets, and the reference database is searched for each target. The search criterion is the distance measure between the target test image block and the preset image blocks in the reference database; the smaller the distance measure, the more similar the blocks.
Step S5, fusing the multiple most similar preset image blocks into a restored image block by weighted fusion according to fusion coefficients, and fusing the adjacent restored image blocks of the image by weighted fusion to obtain a restored image.
Specifically, the multiple most similar preset image blocks are first fused by weighted fusion, so that every image block of the image to be restored yields its corresponding restored image block; all restored image blocks are then fused pixel by pixel by weighted fusion to obtain the restored image.
In this embodiment, deep neural network features are introduced when building the reference database and retrieving image blocks, which improves the robustness of image block retrieval. Even in an offline state, an image containing blurred text can be restored to a clear image using the trained database, which makes the text in the image easier to display or recognize and improves its recognizability and clarity.
Embodiment 2
Fig. 2 is a detailed flow chart of step S1 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Step S101, collecting images with clear text, in which each character is covered by images in multiple different forms;
Specifically, the collected clear text images are high-quality pictures. To ensure adequate character coverage and a sufficient number of pictures, the collected pictures should include at least the commonly used Chinese character set, the secondary commonly used Chinese character set, and other common characters. In addition, since the same character may be written (rendered) in different ways, each character is covered by images in multiple different forms, so that the characters in the prior knowledge base are as complete as possible and can support the feature space during subsequent blurred image enhancement.
Step S102, normalizing the images and dividing them into multiple preset image blocks, each preset image block being labeled by character and block position;
Specifically, the collected clear text images are normalized. Normalization mainly covers image size, grayscale conversion, contrast enhancement, and so on, and converts each image to a single standard format. The normalized image is then divided into multiple preset image blocks. The block size can be set to roughly 10 to 40 pixels, and the step between blocks is preferably fixed at half the block size; that is, adjacent image blocks overlap by 50%, and the displacement increment between blocks is 50% of the block width and height. Fig. 3 shows a schematic diagram of this image segmentation: each preset image block is 16*16 pixels and overlaps each adjacent preset image block by 8 pixels. Blocks are labeled jointly by the character and the block position within the image; that is, preset image blocks cut from the same character at the same position receive the same label and are treated as one class of preset image block.
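The block division described above can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: it assumes the normalized image is held as a list of rows of grayscale values, and the function name `split_into_blocks` and its parameters are hypothetical. The defaults follow the 16*16-pixel block and 8-pixel step given in the text.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values (assumed layout)


def split_into_blocks(image: Image, block: int = 16,
                      stride: int = 8) -> List[Tuple[Tuple[int, int], Image]]:
    """Cut an image into square blocks that overlap by (block - stride) pixels.

    Returns ((top, left), block_pixels) pairs. The (top, left) offset can serve
    as the position part of the block label; the character part of the label
    would come from the training data and is not modeled here.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height - block + 1, stride):
        for left in range(0, width - block + 1, stride):
            patch = [row[left:left + block] for row in image[top:top + block]]
            blocks.append(((top, left), patch))
    return blocks
```

With these defaults, a hypothetical 40*40-pixel normalized image yields a 4 by 4 grid of overlapping blocks, and every interior 8*8 region is covered by four different blocks.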
Step S103, training a convolutional neural network with a softmaxwithloss structure on the preset image blocks to obtain a deep learning module;
Specifically, using image blocks with clear text and identical labels, each preset image block is used to train a convolutional neural network with a softmaxwithloss structure optimized for the classification objective, yielding the deep learning module; the softmaxwithloss structure is as follows:
In formula (1), z is the vector output by the fully connected layer of the convolutional neural network, z = (z1, z2, …, zn); f(z) is the output of the softmax.
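Formula (1) itself is not reproduced in this text. As a hedged illustration, the standard softmax maps z to f(z)_j = exp(z_j) / (exp(z_1) + … + exp(z_n)), and a softmax-with-loss layer is commonly taken to mean softmax followed by the negative log-likelihood of the true class. The sketch below assumes that standard definition; the function names are hypothetical and the original formula may differ in detail.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Standard softmax: f(z)_j = exp(z_j) / sum_i exp(z_i).

    max(z) is subtracted before exponentiating for numerical stability;
    this does not change the result.
    """
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_loss(z: List[float], label: int) -> float:
    """Negative log of the probability softmax assigns to the true class (assumed loss form)."""
    return -math.log(softmax(z)[label])
```

Here `z` plays the role of the fully connected layer output and `label` is the index of the block class (same character, same position) used as the classification target.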
Step S104, establishing the recognition model of the reference database, using the output of the deep learning module as the index.
Specifically, for the trained deep learning module (deep neural network), the output of its last fully connected layer is preferably used as the image feature, and the image blocks are modeled and indexed with it. Before images are restored, deep neural network features are extracted from the image blocks corresponding to each character and a feature index is built, generating the recognition model for the reference database. When a blurred image to be restored is processed, the offline-indexed features can be used directly, so that the N image blocks with the highest similarity can be found conveniently and quickly.
In this embodiment, building the reference database not only shortens the blurred text image restoration process but also improves the efficiency of blurred text image enhancement. At the same time, the introduced deep neural network can greatly improve the robustness of image block search and improve the restoration of blurred text images.
Embodiment 3
Fig. 4 is a detailed flow chart of step S4 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Step S401, extracting the deep neural network features of each test image block;
Specifically, for each test image block, the 10 most similar preset image blocks are retrieved from the reference database; the search uses deep neural network features and measures them by L1 distance.
Step S402, calculating, according to the following formula, the distance measure between the deep neural network features of each test image block and the deep neural network features of the preset image blocks in the reference database;
$$ d(p, q) = \lVert p - q \rVert_1 \qquad (2) $$
In formula (2), d(p, q) is the distance measure between the deep neural network features of the test image block and the preset image block, and p and q are the CNN feature vectors of the respective image blocks;
Expanding formula (2) gives:
$$ d(p, q) = |p_1 - q_1| + |p_2 - q_2| + \dots + |p_n - q_n| $$
where p and q are the CNN feature vectors of the respective image blocks, p = (p_1, p_2, …, p_n) and q = (q_1, q_2, …, q_n), each feature being an n-dimensional vector.
Step S403, selecting the multiple image blocks with the smallest distance measures as the most similar preset image blocks.
Specifically, the 10 preset image blocks in the reference database with the smallest distance measure to the target image block are the most similar preset image blocks.
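Steps S402 and S403 can be sketched as a brute-force nearest-neighbor search under the L1 distance of formula (2). This is a minimal sketch under stated assumptions: the index is a hypothetical dictionary mapping a block label to its precomputed feature vector, and the names `l1_distance` and `nearest_blocks` are illustrative. The text does not specify how the offline index is stored or searched.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """d(p, q) = |p1 - q1| + |p2 - q2| + ... + |pn - qn|."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_blocks(query: Sequence[float],
                   index: Dict[str, Sequence[float]],
                   k: int = 10) -> List[Tuple[str, float]]:
    """Return the k (label, distance) pairs with the smallest L1 distance to the query."""
    scored = ((label, l1_distance(query, feature)) for label, feature in index.items())
    return heapq.nsmallest(k, scored, key=lambda item: item[1])
```

The default `k=10` mirrors the ten most similar preset image blocks used in this embodiment. The returned distances can be passed on to the fusion step, since the fusion coefficients depend on them.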
In this embodiment, deep neural network features are used because of their stronger representational power. They make the search for high-definition image blocks more robust, give good restoration for blurred images captured in complex real environments, and improve the ability to restore blurred images.
Embodiment 4
Fig. 5 is a detailed flow chart of step S5 in the blurred text enhancement method based on a deep neural network provided by the present invention;
Step S501, fixing the mapping between the image block at each position of the image to be restored and its most similar preset image blocks as one-to-ten, and fusing the ten most similar preset image blocks by weighted fusion according to the following formula to obtain a restored image block;
In formula (3), f(x, y) is the image block function after fusion, g_k(x, y) is the corresponding most similar preset image block retrieved from the reference database, and ω(x_k) is the fusion coefficient;
Specifically, the fusion coefficients can be expressed in the following form:
where x_i represents the reciprocal of the distance between the feature parameter p of the blurred test image block and the feature parameter q_i of the i-th of the ten retrieved preset image blocks. The image block function fuses the ten most similar preset image blocks into one restored image block. All test image blocks of the image to be restored can therefore be fused in this way, giving the restored image blocks of the weighted fusion.
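Formula (3) and the fusion coefficient formula are not reproduced in this text, so the sketch below rests on an assumption: the fused block is the weighted sum of the retrieved blocks, and each coefficient is the reciprocal feature distance divided by the sum of all reciprocal distances, so that the coefficients add up to 1. The function names and the small `eps` guard against division by zero are hypothetical additions.

```python
from typing import List, Sequence

Block = List[List[float]]  # one image block as rows of pixel values (assumed layout)


def fusion_weights(distances: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Inverse-distance weights normalized to sum to 1 (assumed form of the coefficients)."""
    inverse = [1.0 / (d + eps) for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse_blocks(blocks: Sequence[Block], distances: Sequence[float]) -> Block:
    """Pixel-wise weighted sum of the retrieved blocks: f(x, y) = sum_k w_k * g_k(x, y)."""
    weights = fusion_weights(distances)
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(w * b[r][c] for w, b in zip(weights, blocks)) for c in range(cols)]
            for r in range(rows)]
```

Under this assumption, a retrieved block whose features are closer to the blurred test block contributes more to the restored block, which matches the reciprocal-distance description in the text.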
Step S502, dividing each restored image block into four cells and, with the cell as the basic unit, fusing pixel by pixel by weighted fusion according to the following formula to obtain the restored image;
In formula (4), g(x, y) is the cell after final fusion, f_k(x, y) are the four corresponding overlapping cells, (x, y) is the position of a pixel within the cell, and ω_k(x, y) is the weight coefficient.
Specifically, because the overlap ratio was set to 50% when the image was divided into blocks, each image block in the image is divided into four cells in the above manner. Fig. 6 shows a schematic diagram of the cell structure: a 16*16-pixel image block is divided into four 8*8-pixel cells, which are fused by weighted fusion according to formula (4), where the weight coefficient ω_k(x, y) is calculated as follows:
where ||P_k(x, y)||_2 is the Euclidean distance from the point (x, y) on the corresponding cell to the center (x', y') of the corresponding image block, and ω'_k(x, y) is the normalized weight coefficient, which is used as the final fusion weight.
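Formula (4) and the weight formula are likewise not reproduced here. The sketch below assumes that each pixel of the restored image is a normalized weighted average of the overlapping restored blocks that cover it, with a weight that decreases with the Euclidean distance from the pixel to the center of the block it comes from. The specific weight `1 / (1 + distance)` and the function name `blend_blocks` are assumptions chosen for illustration, not the formula of the invention.

```python
import math
from typing import List, Sequence, Tuple

Block = List[List[float]]


def blend_blocks(placed_blocks: Sequence[Tuple[Tuple[int, int], Block]],
                 height: int, width: int, block: int = 16) -> Block:
    """Blend overlapping restored blocks into one image, pixel by pixel.

    placed_blocks holds ((top, left), pixels) pairs. Each contribution is
    weighted by 1 / (1 + distance to its block center) (assumed form), and the
    weights are normalized per pixel by dividing by their sum.
    """
    numerator = [[0.0] * width for _ in range(height)]
    denominator = [[0.0] * width for _ in range(height)]
    center = (block - 1) / 2.0
    for (top, left), pixels in placed_blocks:
        for r in range(block):
            for c in range(block):
                weight = 1.0 / (1.0 + math.hypot(r - center, c - center))
                numerator[top + r][left + c] += weight * pixels[r][c]
                denominator[top + r][left + c] += weight
    return [[numerator[r][c] / denominator[r][c] if denominator[r][c] else 0.0
             for c in range(width)] for r in range(height)]
```

With 16*16 blocks placed at an 8-pixel step, every interior 8*8 cell receives contributions from four blocks, which corresponds to the four overlapping cells f_k(x, y) in formula (4). Pixels near a block center are dominated by that block, which smooths the seams between neighboring blocks.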
In this embodiment, weighted fusion is applied first to the image blocks and then to the cells, the second weighted fusion building on the first. This improves the clarity of the blurred text and facilitates later recognition of the text image.
Embodiment 5
Fig. 7 is a flow chart of a first implementation of the blurred text enhancement method based on a deep neural network provided by the present invention. To restore an image of the blurred character "heresy", the image is first normalized and then divided into blocks, giving 16 test image blocks. The 16 test image blocks are labeled according to their positions in the image, for example image block 1 through image block 16. With each labeled test image block as the retrieval target, the 10 most similar preset image blocks are retrieved from the reference database and fused by weighted fusion, giving the restored image block (denoised image block) for each test image block (image block 1 to image block 16). The restored image blocks of the "heresy" character image are then fused pixel by pixel by weighted fusion, with the cell as the basic unit, to obtain the denoised "heresy" character image. As shown in Fig. 7, the clarity of the blurred text is clearly improved, which aids visual understanding. Image blocks containing lines of text can likewise be restored with this method.
Embodiment
Fig. 8 is a structural block diagram of a blurred text enhancement device based on a deep neural network provided by the present invention, comprising:
Reference database 1, for establishing the reference database;
Specifically, the reference database is established in order to build a prior knowledge base for text deblurring, dedicated to assisting blurred image enhancement.
Acquisition module 2, for collecting a test image containing text;
Specifically, what is collected is a blurred text test image (picture) to be restored; the text includes text rows, text lines, characters, and the like.
Processing module 3, for dividing the test image into multiple test image blocks according to an image block division rule;
Specifically, the test image is first normalized to a standard format and then uniformly divided into blocks according to pixel size, yielding multiple test image blocks. Under the image block division rule, each image block is labeled by its character and its block position.
Retrieval module 4, for searching the reference database with each test image block as the retrieval target, and selecting the multiple preset image blocks most similar to the image block;
Specifically, the test image blocks obtained by segmentation are used as retrieval targets, and the reference database is searched for each target. The search criterion is the distance measure between the target test image block and the preset image blocks in the reference database; the smaller the distance measure, the more similar the blocks.
Fusion module 5, for fusing the multiple most similar preset image blocks into a restored image block by weighted fusion according to fusion coefficients, and fusing the adjacent restored image blocks of the image by weighted fusion to obtain a restored image.
Specifically, the multiple most similar preset image blocks are first fused by weighted fusion, so that every image block of the image to be restored yields its corresponding restored image block; all restored image blocks are then fused pixel by pixel by weighted fusion to obtain the restored image.
In this embodiment, deep neural network features are introduced when building the reference database and retrieving image blocks, which improves the robustness of image block retrieval. Even in an offline state, an image containing blurred text can be restored to a clear image using the trained database, which makes the text in the image easier to display or recognize and improves its recognizability and clarity.
Fig. 9 is a structural block diagram of the database in the blurred text enhancement device based on a deep neural network provided by the present invention, comprising:
Collecting unit 11, for collecting images with clear text, in which each character is covered by images in multiple different forms;
Specifically, the collected clear text images are high-quality pictures. To ensure adequate character coverage and a sufficient number of pictures, the collected pictures should include at least the commonly used Chinese character set, the secondary commonly used Chinese character set, and other common characters. In addition, since the same character may be written (rendered) in different ways, each character is covered by images in multiple different forms, so that the characters in the prior knowledge base are as complete as possible and can support the feature space during subsequent blurred image enhancement.
Processing unit 12, for normalizing the images and dividing them into multiple preset image blocks, each preset image block being labeled by character and block position;
Specifically, the collected clear text images are normalized. Normalization mainly covers image size, grayscale conversion, contrast enhancement, and so on, and converts each image to a single standard format. The normalized image is then divided into multiple preset image blocks. The block size can be set to roughly 10 to 40 pixels, and the step between blocks is preferably fixed at half the block size; that is, adjacent preset image blocks overlap by 50%, and the displacement increment between preset image blocks is 50% of the block width and height. Fig. 3 shows a schematic diagram of this image segmentation: each preset image block is 16*16 pixels and overlaps each adjacent preset image block by 8 pixels. Blocks are labeled jointly by the character and the block position within the image; that is, image blocks cut from the same character at the same position receive the same label and are treated as one class of image block.
Training unit 13, for training a convolutional neural network with a softmaxwithloss structure on the preset image blocks to obtain a deep learning module;
Specifically, using image blocks with clear text and identical labels, each preset image block is used to train a convolutional neural network with a softmaxwithloss structure optimized for the classification objective, yielding the deep learning module; the softmaxwithloss structure is as follows:
In formula (1), z is the vector output by the fully connected layer of the convolutional neural network, z = (z1, z2, …, zn); f(z) is the output of the softmax.
Model identification unit 14, for establishing the recognition model of the reference database, using the output of the deep learning module as the index.
Specifically, for the trained deep learning module (deep neural network), the output of its last fully connected layer is preferably used as the image feature, and the image blocks are modeled and indexed with it. Before images are restored, deep neural network features are extracted from the image blocks corresponding to each character and a feature index is built, generating the recognition model for the reference database. When a blurred image to be restored is processed, the offline-indexed features can be used directly, so that the N image blocks with the highest similarity can be found conveniently and quickly.
In this embodiment, building the reference database not only shortens the blurred text image restoration process but also improves the efficiency of blurred text image enhancement. At the same time, the introduced deep neural network can greatly improve the robustness of image block search and improve the restoration of blurred text images.
Fig. 10 is a structural block diagram of the retrieval module in the blurred text enhancement device based on a deep neural network provided by the present invention, comprising:
Extraction unit 41, for extracting the deep neural network features of each test image block;
Specifically, for each test image block, the 10 most similar preset image blocks are retrieved from the reference database; the search uses deep neural network features and measures them by L1 distance.
Computing unit 42, for calculating, according to the following formula, the distance measure between the deep neural network features of each test image block and the deep neural network features of the preset image blocks in the reference database,
$$ d(p, q) = \lVert p - q \rVert_1 \qquad (2) $$
In formula (2), d(p, q) is the distance measure between the deep neural network features of the test image block and the preset image block, and p and q are the CNN feature vectors of the respective image blocks;
Expanding formula (2) gives:
$$ d(p, q) = |p_1 - q_1| + |p_2 - q_2| + \dots + |p_n - q_n| $$
where p and q are the CNN feature vectors of the respective image blocks, p = (p_1, p_2, …, p_n) and q = (q_1, q_2, …, q_n), each feature being an n-dimensional vector.
Screening unit 43, for selecting the multiple image blocks with the smallest distance measures as the most similar preset image blocks.
Specifically, the 10 preset image blocks in the reference database with the smallest distance measure to the target image block are the most similar preset image blocks.
In this embodiment, deep neural network features are used because of their stronger representational power. They make the search for high-definition image blocks more robust, give good restoration for blurred images captured in complex real environments, and improve the ability to restore blurred images.
Fig. 11 is a structural block diagram of the fusion module in the blurred text enhancement device based on a deep neural network provided by the present invention, comprising:
First fusion unit 51, for fixing the mapping between the image block at each position of the image to be restored and its most similar preset image blocks as one-to-ten, and fusing the ten most similar image blocks by weighted fusion according to the following formula to obtain a restored image block;
In formula (3), f(x, y) is the image block function after fusion, g_k(x, y) is the corresponding most similar image block retrieved from the reference database, and ω(x_k) is the fusion coefficient;
Specifically, the fusion coefficients can be expressed in the following form:
where x_i represents the reciprocal of the distance between the feature parameter p of the blurred test image block and the feature parameter q_i of the i-th of the ten retrieved preset image blocks. The image block function fuses the ten most similar image blocks into one restored image block. All test image blocks of the image to be restored can therefore be fused in this way, giving the image blocks of the weighted fusion.
Second fusion unit 52, for dividing each restored image block into four cells and, with the cell as the basic unit, fusing pixel by pixel by weighted fusion according to the following formula to obtain the restored image;
In formula (4), g(x, y) is the cell after final fusion, f_k(x, y) are the four corresponding overlapping cells, (x, y) is the position of a pixel within the cell, and ω_k(x, y) is the weight coefficient.
Specifically, because the overlap ratio was set to 50% when the image was divided into blocks, each restored image block in the image is divided into four cells in the above manner. Fig. 6 shows a schematic diagram of the cell structure: a 16*16-pixel image block is divided into four 8*8-pixel cells, which are fused by weighted fusion according to formula (4), where the weight coefficient ω_k(x, y) is calculated as follows:
where ||P_k(x, y)||_2 is the Euclidean distance from the point (x, y) on the corresponding cell to the center (x', y') of the corresponding image block, and ω'_k(x, y) is the normalized weight coefficient, which is used as the final fusion weight.
In this embodiment, weighted fusion is applied first to the image blocks and then to the cells, the second weighted fusion building on the first. This improves the clarity of the blurred text and facilitates later recognition of the text image.
In summary, the present invention builds a reference database and trains a recognition model on that database; collects a test image containing text; divides the test image into multiple image blocks to be tested; matches, based on deep neural network features, the preset image blocks in the database that are most similar to each image block; fuses the multiple most similar preset image blocks by weighted fusion to obtain restored image blocks; and restores the adjacent restored image blocks into a clear image according to their positions in the image. Introducing deep neural network features when building the reference database and retrieving image blocks improves the robustness of image block retrieval. Even in an offline state, an image containing blurred text can be restored to a clear image using the trained database, which makes the text in the image easier to display or recognize and improves its recognizability and clarity. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial applicability.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.