Intelligent medical bill identification method based on the Internet of things

Technical Field
The invention relates to character recognition, in particular to an intelligent medical bill recognition method based on the Internet of things.
Background
With the continuous development and progress of the medical industry in China in recent years, its systems and institutions have become relatively mature. At present, however, the management of medical bills remains disorderly: because the bill format issued by each hospital varies, bills circulating between a hospital and a reimbursement unit must be keyed in manually by the accepting unit after the bills are submitted. This is unfavorable to archiving and later retrieval, and the collection of bill information is still performed by traditional manual entry.
In a typical medical bill reimbursement workflow, after the information of an insured person is entered manually, the insured person submits the bill, a front-desk worker keys in the invoice information item by item, and a checker verifies the entered information before submitting it for audit, reimbursement comparison, and determination of the reimbursement amount. The efficiency of the whole process is very low, and entry errors that are difficult to detect often occur.
Disclosure of Invention
Technical problem to be solved
Aiming at the defects in the prior art, the invention provides an intelligent medical bill identification method based on the Internet of things, which can effectively overcome the defect that bill information on medical bills cannot be identified accurately and efficiently in the prior art.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
an intelligent medical bill identification method based on the Internet of things comprises the following steps:
S1, extracting features using a VGG16 model as the base network, and obtaining a feature map from the first five convolutional stages of the VGG16 model;
S2, sliding a window over the feature map to obtain feature vectors;
S3, feeding the feature vector at each sliding-window position into an RNN recurrent layer to obtain the recurrent-layer output;
S4, feeding the recurrent-layer output into a fully connected layer, and obtaining densely predicted text candidate regions through classification and regression;
S5, filtering the densely predicted text candidate regions using the standard Soft-NMS (soft non-maximum suppression) algorithm;
S6, merging the filtered text candidate regions into text boxes using a simple text-line construction algorithm;
and S7, recognizing the characters in the merged text boxes with a CRNN model.
Preferably, the size of the feature map is W × H × C.
Preferably, the window size of the sliding window is 3 × 3, and each window position yields a feature vector of length 3 × 3 × C.
Preferably, each feature vector predicts the offsets of 10 target candidate regions, so that each sliding-window center predicts 10 text candidate regions.
Preferably, the RNN recurrent layer uses a BLSTM, and the recurrent-layer output is W × 256.
Preferably, the fully connected layer is 512-dimensional.
Preferably, the determination condition for merging the filtered text candidate regions into the text box includes:
the horizontal distance between the two text candidate regions is the smallest (they are nearest neighbors), that distance is less than 50 pixels, and the vertical overlap of the two regions is greater than 0.7.
Preferably, the CRNN model consists of a CNN convolutional layer, an RNN recurrent layer, and a CTC transcription layer.
(III) advantageous effects
Compared with the prior art, the intelligent medical bill identification method based on the Internet of things integrates and streamlines the existing business processes. Using intelligent recognition technology, it identifies medical bill information scanned in batches and can quickly recognize text in different formats, so that heavy, repetitive entry work is completed by the system and turned into editable, structured information. Staff can therefore process the medical bill information of an insured person efficiently and accurately, the waiting time of the reimbursement applicant is shortened, and the related business processes are improved. The method realizes intelligent processing of medical bills, creates an efficient and convenient medical bill reimbursement service environment, shortens the service cycle, and greatly improves the efficiency of and satisfaction with the medical bill reimbursement service.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of a dense predictive text candidate region obtained by classification or regression according to the present invention;
FIG. 2 is a schematic diagram of FIG. 1 after filtering with the standard Soft-NMS algorithm;
FIG. 3 is a schematic diagram of combining filtered text candidate regions in FIG. 2 into a text box according to a simple text line construction algorithm;
FIG. 4 is a diagram illustrating the decision conditions for merging filtered text candidates into a text box according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An intelligent medical bill identification method based on the internet of things is shown in fig. 1 to 3, and comprises the following steps:
S1, extracting features using a VGG16 model as the base network, and obtaining a feature map from the first five convolutional stages of the VGG16 model;
S2, sliding a window over the feature map to obtain feature vectors;
S3, feeding the feature vector at each sliding-window position into an RNN recurrent layer to obtain the recurrent-layer output;
S4, feeding the recurrent-layer output into a fully connected layer, and obtaining densely predicted text candidate regions through classification and regression;
S5, filtering the densely predicted text candidate regions using the standard Soft-NMS (soft non-maximum suppression) algorithm;
S6, merging the filtered text candidate regions into text boxes using a simple text-line construction algorithm;
and S7, recognizing the characters in the merged text boxes with a CRNN model.
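Step S5 above relies on Soft-NMS, which decays the scores of overlapping candidate boxes instead of discarding them outright. A minimal NumPy sketch of the linear-decay variant follows; the function name, box format (x1, y1, x2, y2), and thresholds are illustrative assumptions, not part of the claimed method:

```python
import numpy as np

def soft_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.001):
    """Linear Soft-NMS: decay the score of every box that overlaps
    the current best box, instead of suppressing it outright."""
    scores = scores.astype(float).copy()
    keep = []
    idxs = list(range(len(boxes)))
    while idxs:
        # pick the remaining box with the highest (possibly decayed) score
        best = max(idxs, key=lambda i: scores[i])
        keep.append(best)
        idxs.remove(best)
        x1, y1, x2, y2 = boxes[best]
        area_best = (x2 - x1) * (y2 - y1)
        for i in idxs[:]:
            a1, b1, a2, b2 = boxes[i]
            iw = max(0.0, min(x2, a2) - max(x1, a1))
            ih = max(0.0, min(y2, b2) - max(y1, b1))
            inter = iw * ih
            union = area_best + (a2 - a1) * (b2 - b1) - inter
            iou = inter / union if union > 0 else 0.0
            if iou > iou_thresh:
                scores[i] *= (1.0 - iou)  # linear score decay
            if scores[i] < score_thresh:
                idxs.remove(i)            # only drop near-zero scores
    return keep
```

For example, two heavily overlapping boxes both survive, but the lower-scoring one is demoted below a clearly separate box, so the returned order reflects the decayed scores.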
The size of the feature map is W × H × C.
The window size of the sliding window is 3 × 3, and each window position yields a feature vector of length 3 × 3 × C.
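The 3 × 3 sliding window over an H × W × C feature map can be sketched with NumPy as follows; zero padding at the border (so every spatial position yields one 3 × 3 × C vector) and the function name are illustrative assumptions:

```python
import numpy as np

def sliding_window_features(fmap, k=3):
    """fmap: H x W x C feature map.
    Returns an H x W grid of flattened k*k*C feature vectors,
    one per spatial position, using zero padding at the border."""
    H, W, C = fmap.shape
    p = k // 2
    padded = np.pad(fmap, ((p, p), (p, p), (0, 0)))
    out = np.empty((H, W, k * k * C))
    for y in range(H):
        for x in range(W):
            out[y, x] = padded[y:y + k, x:x + k, :].ravel()
    return out
```

Each row of window vectors (W vectors of length 3 × 3 × C) is what the recurrent layer then consumes as a sequence.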
Each feature vector predicts the offsets of 10 target candidate regions, so that each sliding-window center predicts 10 text candidate regions.
The RNN recurrent layer uses a BLSTM, and the recurrent-layer output is W × 256.
The fully connected layer is 512 dimensions.
The decision condition for merging the filtered text candidate regions into the text box includes:
the horizontal distance between the two text candidate regions is the smallest (they are nearest neighbors), that distance is less than 50 pixels, and the vertical overlap of the two regions is greater than 0.7.
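The distance and overlap conditions translate directly into code; a sketch follows, with the box format (x1, y1, x2, y2) and helper names as assumptions. The remaining condition (smallest horizontal distance, i.e. nearest neighbor) is checked later during pairing:

```python
def vertical_overlap(a, b):
    """Vertical IoU-style overlap of two boxes in (x1, y1, x2, y2) form."""
    inter = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    union = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union if union > 0 else 0.0

def can_merge(a, b, max_gap=50, min_overlap=0.7):
    """True if two text candidate regions may join one text line:
    horizontal gap under 50 px and vertical overlap above 0.7."""
    gap = max(a[0], b[0]) - min(a[2], b[2])  # horizontal distance
    return gap < max_gap and vertical_overlap(a, b) > min_overlap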
The CRNN model consists of a CNN convolutional layer, an RNN recurrent layer, and a CTC transcription layer.
Steps S1-S6 are based on the CTPN network structure. CTPN predicts text position only in the vertical direction: the anchor width is fixed at 16 pixels, the anchor heights vary from 11 pixels to 128 pixels, giving 10 target candidate regions in total, and the horizontal position is not predicted.
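The 10 anchor heights can be generated, for example, as a geometric sequence from 11 to 128 pixels; the geometric spacing is an assumption, since the exact spacing is not specified here:

```python
def anchor_heights(h_min=11, h_max=128, k=10):
    """k anchor heights geometrically spaced between h_min and h_max
    (the spacing scheme is an illustrative assumption)."""
    ratio = (h_max / h_min) ** (1 / (k - 1))
    return [round(h_min * ratio ** i) for i in range(k)]
```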
Every two adjacent text candidate regions form a text box, and text boxes are merged until no further merging is possible. Two text candidate regions Bi and Bj can form a text box on the condition that Bi → Bj and, at the same time, Bj → Bi; the decision conditions are shown in FIG. 4.
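The mutual rule Bi → Bj and Bj → Bi (each region is the other's nearest neighbor) can be sketched as follows; using horizontal center distance as the nearness measure, and boxes as (x1, y1, x2, y2) tuples, are both assumptions:

```python
def center_x(box):
    return (box[0] + box[2]) / 2.0

def nearest(i, boxes):
    """Index of the box whose horizontal center is closest to box i."""
    others = [j for j in range(len(boxes)) if j != i]
    return min(others, key=lambda j: abs(center_x(boxes[j]) - center_x(boxes[i])))

def mutual_pairs(boxes):
    """Pairs (i, j) such that i -> j and j -> i, i.e. mutual nearest
    neighbors, which may then be merged into one text box."""
    pairs = set()
    for i in range(len(boxes)):
        j = nearest(i, boxes)
        if nearest(j, boxes) == i:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)
```

In a full implementation the nearest-neighbor search would also apply the distance and overlap conditions above before accepting a pair.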
As noted above, the CRNN model consists of a CNN convolutional layer, an RNN recurrent layer, and a CTC transcription layer. Its characteristics are: end-to-end training, with the CNN convolutional layer and the RNN recurrent layer trained jointly; input of arbitrary length (arbitrary image width, arbitrary word length); no character-level annotation required for the training set; and applicability both with and without a lexicon (dictionary).
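The CTC transcription layer maps the recurrent layer's per-frame predictions to a character sequence by collapsing repeated labels and removing the blank symbol. A minimal greedy decoder illustrating that rule (label values and the blank index are assumptions):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks.
    frame_labels: per-frame argmax label indices from the recurrent layer."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

For instance, the frame sequence [a, a, blank, a, b] decodes to "aab": the first two frames collapse to one "a", while the blank separates the second "a" so it is kept.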
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.