Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
An embodiment of the present application provides an optical character recognition model training method. The executing entity of the optical character recognition model training method includes, but is not limited to, at least one of a server, a terminal, and the like that can be configured to execute the method provided by the embodiment of the application. In other words, the optical character recognition model training method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server side includes, but is not limited to, a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Referring to FIG. 1, a flowchart of an optical character recognition model training method according to an embodiment of the present invention is shown. In the embodiment of the present invention, the optical character recognition model training method includes the following steps:
S1, acquiring an original picture set from actual production and an original data set corresponding to the original picture set, and storing the original picture set and the original data set in a preset message queue channel.
In the embodiment of the invention, the original picture set is unstructured data collected from the actual production environment using a preset optical character recognition interface, and the original data set is the character information extracted from the original picture set by the optical character recognition interface, where the character information is structured data.
Structured data refers to data that can be stored in a database and expressed with a two-dimensional logical (table) structure; unstructured data refers to data that cannot be expressed in such a two-dimensional structure, such as text, pictures, XML, HTML, audio, and video.
Specifically, in this embodiment, the recognized structured data and unstructured data are sent to the message queue channel using the preset optical character recognition interface, and the data are then processed according to the user's requirements.
In detail, before storing the original picture set and the original data set in the preset message queue channel, the method further includes:
establishing a link between the original data set and original picture set on one side and the message middleware on the other, and forming the message queue channel through the link;
and storing the original picture set and the original data set through the message queue channel.
In the embodiment of the invention, the message queue channel is a channel, formed by linking the original data with the message middleware, that can receive, store, and send messages.
Preferably, the link may be a TCP connection.
Preferably, the message middleware may be Kafka.
In another embodiment of the present invention, the recognized structured data may be sent to the message queue channel using the preset optical character recognition interface, while the unstructured data is stored on a NAS disk, and the data are then processed according to the user's requirements.
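The decoupling that the message queue channel provides can be sketched as follows. This is a minimal in-process stand-in using Python's standard `queue` module; in the embodiment itself the channel would be a Kafka topic reached over a TCP connection, and all names below are illustrative, not part of any real Kafka API.

```python
import queue

# The "message queue channel": the OCR interface produces
# (picture, extracted-text) pairs into it, and a downstream consumer
# processes them only when the user decides to (asynchronous storage).
channel = queue.Queue()

def produce(picture_id: str, extracted_text: str) -> None:
    """OCR-interface side: store one picture/data pair in the channel."""
    channel.put({"picture": picture_id, "text": extracted_text})

def consume_all() -> list:
    """Consumer side: drain the channel when processing is wanted."""
    items = []
    while not channel.empty():
        items.append(channel.get())
    return items

# The producer can run long before the consumer ever looks at the data.
produce("plate_001.jpg", "XA12345")
produce("plate_002.jpg", "XB123")
pending = consume_all()  # both pairs are still available, in order
```

The point of the design is that ingestion and processing are fully decoupled: the OCR interface never blocks on downstream work.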
S2, when a preset search engine is idle, acquiring the original data set corresponding to the original picture set from the message queue channel using the search engine, performing error data screening on the original data set, determining that the screened error data form a negative sample data set, and that the non-error data other than the error data form a positive sample data set.
In the embodiment of the invention, the original picture set and the original data set can be stored asynchronously through the preset message queue channel: when the two sets are transmitted to the channel, the channel need not process them immediately, and the time of processing can be determined according to the user's requirements. The original data set processed in the message queue channel is then transmitted to the preset search engine. After use authorization is obtained for the original data set corresponding to the original picture set in actual production, the search engine screens error data in that original data set using a preset screening statement, where the error data is erroneous character information in the original data set corresponding to the original picture set. The screened error data form the negative sample data set, and the remaining non-error data form the positive sample data set.
Preferably, the preset search engine may be Elasticsearch.
In detail, performing error data screening on the original data set, determining that the screened error data form a negative sample data set and that the non-error data other than the error data form a positive sample data set, includes:
acquiring the sequence length of the original data in the original data set using the preset search engine, and setting a sequence length index using a preset screening statement in the preset search engine;
comparing the sequence length with the sequence length index; the original data whose sequence length is inconsistent with the sequence length index form the negative sample data set, and the original data whose sequence length is consistent with the sequence length index form the positive sample data set.
In the embodiment of the invention, the sequence length may be the character length corresponding to the original data, and the sequence length index set using the preset screening statement is an index set for the fixed character length expected in the original data.
For example, if a certain original picture is a car license plate picture, the sequence length contained in its corresponding original data is seven characters (i.e., the license plate contains seven characters), and the screening statement sets the sequence length index to length 7. If the sequence length of a piece of original data is identified as 7, that data is determined to be positive sample data; if the sequence length is not 7, the data is determined to be negative sample data.
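The length screen above can be sketched in a few lines. The Elasticsearch query is replaced here by a plain Python comparison, and the record field names are illustrative assumptions:

```python
# Screen OCR records by sequence length: records whose recognized text
# matches the expected length (7 characters for a license plate) become
# positive samples, the rest become negative samples.
def screen_by_length(records, length_index=7):
    positive, negative = [], []
    for rec in records:
        if len(rec["text"]) == length_index:
            positive.append(rec)   # length matches the index
        else:
            negative.append(rec)   # wrong length -> likely OCR error
    return positive, negative

records = [
    {"picture": "p1.jpg", "text": "XA12345"},  # 7 characters: positive
    {"picture": "p2.jpg", "text": "XB123"},    # 5 characters: negative
]
pos, neg = screen_by_length(records)
```

In the embodiment itself this comparison would be expressed as an Elasticsearch screening statement so that no manual review is needed.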
Further, after the determining that the filtered error data forms a negative sample data set and the non-error data other than the error data forms a positive sample data set, the method further includes:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and desensitizing the sensitive field by using a preset desensitizing function.
In the embodiment of the present invention, fields involving personal privacy information such as personal names, identification card information, and mobile phone numbers are all sensitive fields. After a sensitive field is identified, data replacement or mask shielding can be performed on it to implement desensitization; for example, replacing the middle four digits of a mobile phone number yields 13800001248, and masking the middle four digits yields 138****1248.
For example, personal identification information, mobile phone numbers, bank card information, etc. collected by institutions and enterprises are subjected to desensitization processing.
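The two desensitization options described above can be sketched as follows for an 11-digit mobile phone number; the function names and the filler value are illustrative, not a prescribed desensitization function:

```python
# Desensitize the middle four digits of an 11-digit mobile phone number.
def replace_middle_four(phone: str, filler: str = "0000") -> str:
    """Data replacement: 138xxxx1248 -> 13800001248."""
    return phone[:3] + filler + phone[7:]

def mask_middle_four(phone: str) -> str:
    """Mask shielding: 138xxxx1248 -> 138****1248."""
    return phone[:3] + "****" + phone[7:]

replaced = replace_middle_four("13812341248")  # "13800001248"
masked = mask_middle_four("13812341248")       # "138****1248"
```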
In the embodiment of the invention, the privacy information in the positive sample data and the negative sample data can be shielded or hidden through the desensitization processing of the desensitization function on the sensitive field, so that the security of the privacy data of users in the real production environment is protected.
For example, because the embodiment of the invention uses data from the actual production process for model training, confidential enterprise data may be involved. A data use permission can therefore be preset, with use restrictions configured in the permission, so that developers cannot view or download the data.
In an embodiment of the invention, the preset data use permission can also ensure that a developer can only use the data when performing model iteration and training, but cannot view or download it, thereby avoiding the risk of data leakage.
S3, acquiring a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, wherein the error character labeling set is dynamically updated in real time.
In the embodiment of the invention, the positive sample data set and the negative sample data set can be transmitted to a preset labeling platform. The labeling platform labels the real characters of the positive sample data set, together with the corresponding position information of the positive sample data set in the original pictures (i.e., the position information of the real characters in the original pictures), and the real character labels and this position information are combined into the real character labeling set. Similarly, the labeling platform labels the characters of the negative sample data set, together with the corresponding position information of the negative sample data set in the original pictures (i.e., the position information of the error characters in the original pictures), and these character labels and position information are combined into the error character labeling set.
For example, for positive sample data whose license plate number is the seven-character string XA·XXXXX, the labeling platform interface can be called to label the real characters XA·XXXXX together with the position in the license plate number to which each labeled real character corresponds, and these real character labels and positions are combined into the real character labeling set. For negative sample data whose license plate number is XB·XXX, the labeling platform interface can likewise be called to label the characters XB·XXX together with the position to which each labeled character corresponds, and these character labels and positions form the error character labeling set.
In the embodiment of the invention, after the labeling platform discovers error character labels, the negative sample data set corresponding to those error character labels can be updated in real time, and the updated negative sample data set is input into the subsequent model as part of the training data set, so as to improve the accuracy of subsequent model training.
S4, the positive sample data set, the negative sample data set and the original picture set are used as training data sets to be input into a preset optical character recognition model, and the optical character recognition model is used for recognizing a predicted character set of the training data set.
In the embodiment of the present invention, the preset optical character recognition model may be a deep learning model with a CRNN structure, where the CRNN structure is CNN + LSTM + CTC, and the optical character recognition model includes a convolutional layer (CNN), a recurrent layer (LSTM), a transcription layer (CTC), and a loss function.
In detail, the recognizing the predicted character set of the training data set using the preset optical character recognition model includes:
extracting a feature sequence of the training data set using the convolutional layer in the preset optical character recognition model to obtain a character vector set;
predicting a character tag set of the character vector set using the recurrent layer in the optical character recognition model;
and integrating the character tag set using the transcription layer in the optical character recognition model to obtain a predicted character set.
In an embodiment of the present invention, the convolutional layer includes a convolution sub-layer and a pooling layer. Feature extraction is performed on the training data set by the convolution sub-layer to obtain a feature map, and the feature sequence vectors in the feature map are then extracted by the pooling layer to obtain the character vector set.
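The two sub-steps above can be sketched on a toy example. A real CRNN uses many learned kernels; the single fixed 2x2 kernel and tiny "image" below are purely illustrative:

```python
# Convolution sub-layer: 2x2 valid convolution over a small 2-D image.
def conv2d_valid(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# Pooling layer: max-pool each column of the feature map, yielding one
# value per horizontal position -- a left-to-right feature sequence
# (the "character vector set").
def column_max_pool(feature_map):
    return [max(col) for col in zip(*feature_map)]

image = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 1, 0]]
kernel = [[1, 0],
          [0, 1]]  # fixed illustrative kernel, not a learned one
feature_map = conv2d_valid(image, kernel)   # 2x3 feature map
sequence = column_max_pool(feature_map)     # one vector per column
```

The essential property is that the output sequence runs left to right across the image, so each element can later be aligned with a character position.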
In another embodiment of the present invention, the recurrent layer mainly consists of the LSTM, a variant of the RNN. Because the RNN suffers from the vanishing gradient problem and cannot capture longer-range context, the LSTM is used in place of the RNN so that context information can be better extracted. The recurrent layer includes an input gate, a forget gate, and an output gate.
In detail, the predicting the character tag set of the character vector set using the recurrent layer in the optical character recognition model includes:
calculating a state value of the character vector set using the input gate in the recurrent layer;
calculating an activation value of the character vector set using the forget gate in the recurrent layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating, using the output gate in the recurrent layer, the character tag set of the state update value to obtain the character tag set of the character vector set.
In the embodiment of the invention, the input gate controls how much of the character vector set enters the cell; the forget gate controls how much of the character vector set from the previous moment flows to the current moment; the state update value is the part of the character vector set that the forget gate does not choose to forget; and the output gate outputs the character tag set of the character vector set.
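A single LSTM cell step, reduced to scalars, can make the gate interplay above concrete. The weights are arbitrary illustrative values, not learned parameters, and for brevity the three gates share one weight set here (real LSTMs give each gate its own weights):

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w=0.5, u=0.3, b=0.1):
    z = w * x + u * h_prev + b
    i = sigmoid(z)            # input gate: how much new input enters
    f = sigmoid(z)            # forget gate: how much old state survives
    g = math.tanh(z)          # candidate state value
    c = f * c_prev + i * g    # state update value (new cell state)
    o = sigmoid(z)            # output gate
    h = o * math.tanh(c)      # output / hidden state
    return h, c

h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0)
```

Stacking this step over the character vector sequence gives the per-step outputs from which the character tag set is predicted.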
In the embodiment of the invention, the transcription layer mainly consists of CTC (Connectionist Temporal Classification), whose main purpose is to convert the per-frame character tag sequence predicted by the LSTM into the final predicted character set.
Further, the integrating the character tag set by using the transcription layer in the optical character recognition model to obtain a predicted character set includes:
acquiring, using the transcription layer, all path probabilities of the character tag set, and searching among the multiple path probabilities for the maximum path probability corresponding to each character tag;
and merging the labels along each maximum-probability path to obtain the predicted characters of the character tag set.
In the embodiment of the invention, the predicted character can be obtained through the following formula:

y = B( argmax_π P(π | x) )

In the embodiment of the invention, P(π | x) is the probability of a character label path π given the input x, B is the mapping over the path set of all character labels (merging repeated labels and removing blanks), argmax_π P(π | x) is the path with the maximum path probability, and y is the predicted character sequence corresponding to the character labels.
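The transcription step above can be sketched as greedy (best path) CTC decoding: pick the maximum-probability label at each time step, then apply the mapping B, which collapses repeated labels and removes blanks. The probability rows below are made-up illustrative values:

```python
BLANK = "-"

def ctc_greedy_decode(prob_rows, alphabet):
    # Best path: argmax label per time step.
    path = [alphabet[max(range(len(row)), key=row.__getitem__)]
            for row in prob_rows]
    # B(path): collapse repeats, then drop blanks.
    decoded, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return "".join(decoded)

alphabet = [BLANK, "A", "B"]
# Each row: per-time-step probabilities over (blank, 'A', 'B').
probs = [
    [0.1, 0.8, 0.1],  # -> A
    [0.1, 0.7, 0.2],  # -> A (repeat, collapsed by B)
    [0.8, 0.1, 0.1],  # -> blank (removed by B)
    [0.1, 0.2, 0.7],  # -> B
]
result = ctc_greedy_decode(probs, alphabet)  # "AB"
```

Greedy decoding is the simplest realization of the maximum-path-probability search; beam search variants consider more paths at higher cost.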
S5, calculating loss values between the predicted character set and the real character labeling set and between the predicted character set and the error character labeling set, and if the loss value does not meet a preset condition, adjusting the parameters of the optical character recognition model until the loss value meets the preset condition, so as to obtain the trained optical character recognition model.
In an embodiment of the present invention, a first loss value between the predicted character set and the real character labeling set and a second loss value between the predicted character set and the error character labeling set are calculated, and the two loss values are fused to obtain the loss value. If the loss value does not satisfy the preset condition, the parameters of the optical character recognition model are adjusted until the loss value satisfies the preset condition, and the trained optical character recognition model is obtained.
For example, the preset condition may be a preset threshold of 0.1: when the loss value is greater than or equal to 0.1, the model parameters are adjusted until the loss value is less than 0.1, so as to obtain the trained optical character recognition model.
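The stopping criterion above can be sketched with a toy training loop. The "model" here is a single parameter fitted by gradient descent on a squared error; everything about it is illustrative, not the actual CRNN/CTC training procedure:

```python
# Keep adjusting the model parameter until the loss value drops below
# the preset threshold of 0.1 (the "preset condition").
def train_until_threshold(target=2.0, lr=0.1, threshold=0.1, max_steps=1000):
    w = 0.0  # model parameter
    loss = (w - target) ** 2
    for step in range(max_steps):
        loss = (w - target) ** 2    # stand-in for the fused loss value
        if loss < threshold:        # preset condition met: stop training
            return w, loss, step
        w -= lr * 2 * (w - target)  # parameter adjustment (gradient step)
    return w, loss, max_steps

w, loss, steps = train_until_threshold()
```

The loop mirrors S5: compute the loss, test it against the preset condition, and adjust parameters only while the condition is unmet.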
In the embodiment of the invention, the original picture set in actual production and the original data set corresponding to the original picture set are obtained first, which avoids the data distribution difference between the development environment and the production environment and improves the accuracy of subsequent model training. Secondly, when the preset search engine is idle, the original data set corresponding to the original picture set is acquired from the message queue channel using the search engine, and error data in the original data set is identified by the search engine; the error data can thus be screened directly by the search engine instead of manually, saving manpower and time, shortening the iteration period of subsequent model training, and improving training efficiency. Furthermore, labeling the real characters corresponding to the positive sample data and the error characters corresponding to the negative sample data facilitates training of the subsequent model. Finally, the predicted characters of the training data are recognized using the optical character recognition model, the loss values of the predicted characters against the real characters and against the error characters are calculated, and if the loss value does not meet the preset condition, the parameters of the optical character recognition model are adjusted until the loss value meets the preset condition, further improving the accuracy of the model and yielding the trained optical character recognition model. Therefore, the optical character recognition model training method provided by the embodiment of the invention can improve the efficiency and accuracy of optical character recognition model training.
FIG. 2 is a functional block diagram of the optical character recognition model training device of the present invention.
The optical character recognition model training apparatus 100 of the present invention may be installed in an electronic device. Depending on the functions implemented, the optical character recognition model training apparatus may include a data set acquisition module 101, a data set screening module 102, a data set labeling module 103, a training data set recognition module 104, and a model training module 105. A module, which may also be referred to herein as a unit, refers to a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
The data set obtaining module 101 is configured to obtain an original picture set in actual production and an original data set corresponding to the original picture set, and store the original picture set and the original data set into a preset message queue channel.
In the embodiment of the invention, the original picture set is unstructured data collected from the actual production environment using a preset optical character recognition interface, and the original data set is the character information extracted from the original picture set by the optical character recognition interface, where the character information is structured data.
Structured data refers to data that can be stored in a database and expressed with a two-dimensional logical (table) structure; unstructured data refers to data that cannot be expressed in such a two-dimensional structure, such as text, pictures, XML, HTML, audio, and video.
Specifically, in this embodiment, the recognized structured data and unstructured data are sent to the message queue channel using the preset optical character recognition interface, and the data are then processed according to the user's requirements.
The data set acquisition module may be configured to:
establishing a link between the original data set and original picture set on one side and the message middleware on the other, and forming the message queue channel through the link;
and storing the original picture set and the original data set through the message queue channel.
In the embodiment of the invention, the message queue channel is a channel, formed by linking the original data with the message middleware, that can receive, store, and send messages.
Preferably, the link may be a TCP connection.
Preferably, the message middleware may be Kafka.
In another embodiment of the present invention, the recognized structured data may be sent to the message queue channel using the preset optical character recognition interface, while the unstructured data is stored on a NAS disk, and the data are then processed according to the user's requirements.
The data set screening module 102 is configured to, when a preset search engine is idle, obtain an original data set corresponding to the original picture set from the message queue channel by using the search engine, perform error data screening on the original data set, determine that the screened error data form a negative sample data set, and form non-error data other than the error data into a positive sample data set.
In the embodiment of the invention, the original picture set and the original data set can be stored asynchronously through the preset message queue channel: when the two sets are transmitted to the channel, the channel need not process them immediately, and the time of processing can be determined according to the user's requirements. The original data set processed in the message queue channel is then transmitted to the preset search engine. After use authorization is obtained for the original data set corresponding to the original picture set in actual production, the search engine screens error data in that original data set using a preset screening statement, where the error data is erroneous character information in the original data set corresponding to the original picture set. The screened error data form the negative sample data set, and the remaining non-error data form the positive sample data set.
Preferably, the preset search engine may be Elasticsearch.
In detail, the data set screening module 102 performs error data screening on the original data set by performing the following operations, determining that the screened error data form a negative sample data set and that the non-error data other than the error data form a positive sample data set:
acquiring the sequence length of the original data in the original data set using the preset search engine, and setting a sequence length index using a preset screening statement in the preset search engine;
comparing the sequence length with the sequence length index; the original data whose sequence length is inconsistent with the sequence length index form the negative sample data set, and the original data whose sequence length is consistent with the sequence length index form the positive sample data set.
In the embodiment of the invention, the sequence length may be the character length corresponding to the original data, and the sequence length index set using the preset screening statement is an index set for the fixed character length expected in the original data.
For example, if a certain original picture is a car license plate picture, the sequence length contained in its corresponding original data is seven characters (i.e., the license plate contains seven characters), and the screening statement sets the sequence length index to length 7. If the sequence length of a piece of original data is identified as 7, that data is determined to be positive sample data; if the sequence length is not 7, the data is determined to be negative sample data.
The data set screening module 102 may further be configured to:
acquiring data fields of the positive sample data set and the negative sample data set, and identifying sensitive fields in the data fields;
and desensitizing the sensitive field by using a preset desensitizing function.
In the embodiment of the present invention, fields involving personal privacy information such as personal names, identification card information, and mobile phone numbers are all sensitive fields. After a sensitive field is identified, data replacement or mask shielding can be performed on it to implement desensitization; for example, replacing the middle four digits of a mobile phone number yields 13800001248, and masking the middle four digits yields 138****1248.
For example, personal identification information, mobile phone numbers, bank card information, etc. collected by institutions and enterprises are subjected to desensitization processing.
In the embodiment of the invention, the privacy information in the positive sample data and the negative sample data can be shielded or hidden through the desensitization processing of the desensitization function on the sensitive field, so that the security of the privacy data of users in the real production environment is protected.
For example, because the embodiment of the invention uses data from the actual production process for model training, confidential enterprise data may be involved. A data use permission can therefore be preset, with use restrictions configured in the permission, so that developers cannot view or download the data.
In an embodiment of the invention, the preset data use permission can also ensure that a developer can only use the data when performing model iteration and training, but cannot view or download it, thereby avoiding the risk of data leakage.
The data set labeling module 103 is configured to obtain a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, where the error character labeling set is dynamically updated in real time.
In the embodiment of the invention, the positive sample data set and the negative sample data set can be transmitted to a preset labeling platform. The labeling platform labels the real characters of the positive sample data set, together with the corresponding position information of the positive sample data set in the original pictures (i.e., the position information of the real characters in the original pictures), and the real character labels and this position information are combined into the real character labeling set. Similarly, the labeling platform labels the characters of the negative sample data set, together with the corresponding position information of the negative sample data set in the original pictures (i.e., the position information of the error characters in the original pictures), and these character labels and position information are combined into the error character labeling set.
For example, for positive sample data whose license plate number is the seven-character string XA·XXXXX, the labeling platform interface can be called to label the real characters XA·XXXXX together with the position in the license plate number to which each labeled real character corresponds, and these real character labels and positions are combined into the real character labeling set. For negative sample data whose license plate number is XB·XXX, the labeling platform interface can likewise be called to label the characters XB·XXX together with the position to which each labeled character corresponds, and these character labels and positions form the error character labeling set.
In the embodiment of the invention, after the labeling platform discovers error character labels, the negative sample data set corresponding to those error character labels can be updated in real time, and the updated negative sample data set is input into the subsequent model as part of the training data set, so as to improve the accuracy of subsequent model training.
The training data set recognition module 104 is configured to input the positive sample data set, the negative sample data set, and the original picture set as training data sets to a preset optical character recognition model, and recognize a predicted character set of the training data set using the optical character recognition model.
In the embodiment of the present invention, the preset optical character recognition model may be a deep learning model with a CRNN structure, where the CRNN structure is CNN + LSTM + CTC, and the optical character recognition model includes a convolutional layer (CNN), a recurrent layer (LSTM), a transcription layer (CTC), and a loss function.
In detail, the training data set recognition module 104 recognizes the predicted characters of the training data set using the preset optical character recognition model by performing operations including:
extracting a feature sequence of the training data set using the convolutional layer in the preset optical character recognition model to obtain a character vector set;
predicting a character tag set of the character vector set using the recurrent layer in the optical character recognition model;
and integrating the character tag set using the transcription layer in the optical character recognition model to obtain a predicted character set.
In an embodiment of the present invention, the convolutional layer includes a convolution sub-layer and a pooling layer. Feature extraction is performed on the training data set by the convolution sub-layer to obtain a feature map, and the feature sequence vectors in the feature map are then extracted by the pooling layer to obtain the character vector set.
In another embodiment of the present invention, the loop layer is mainly composed of the LSTM variant of the RNN. Since an RNN suffers from the vanishing-gradient problem and cannot capture longer-range context information, the LSTM replaces the RNN so that context information can be better extracted. The loop layer includes an input gate, a forget gate, and an output gate.
In detail, the predicting the character tag set of the character vector set using the loop layer in the optical character recognition model includes:
calculating a state value of the character vector set by using an input gate in the loop layer;
calculating an activation value of the character vector set by using a forgetting gate in the loop layer;
calculating a state update value of the character vector set according to the state value and the activation value;
and calculating a character tag set of the state update value by using an output gate in the loop layer to obtain the character tag set of the character vector set.
In the embodiment of the invention, the input gate controls how much of the character vector set enters the cell at the current moment; the forget gate controls how much of the character vector set from the previous moment flows to the current moment; the state update value is the part of the character vector set that passes through the forget gate without being forgotten; and the output gate outputs the character tag set of the character vector set.
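The gate computations described above can be sketched as a single LSTM cell step (a hedged pure-Python illustration over scalar inputs; the weight names are placeholders, not the embodiment's parameters):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, w):
    """One LSTM time step over scalar inputs, for illustration.

    x      : current element of the character vector sequence
    h_prev : previous hidden state
    c_prev : previous cell state
    w      : dict of scalar weights, one pair per gate
    Returns the new hidden state h and cell state c.
    """
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate: state value
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate: activation value
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev)  # candidate state
    c = f * c_prev + i * g                         # state update value
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    h = o * math.tanh(c)                           # emitted hidden state
    return h, c
```

The forget gate scales the previous cell state `c_prev`, matching the description that it controls how much of the previous moment's character vector set flows to the current moment.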
In the embodiment of the invention, the transcription layer mainly consists of CTC (Connectionist Temporal Classification), and its main purpose is to convert the character tag set predicted by the LSTM into the predicted character set.
Further, the integrating the character tag set by using the transcription layer in the optical character recognition model to obtain a predicted character set includes:
acquiring all path probabilities of the character tag set by using the transcription layer, and searching for the maximum path probability corresponding to each character tag among the plurality of path probabilities;
and merging each maximum path probability to obtain the predicted characters of the character tag set.
In the embodiment of the invention, the predicted character can be obtained through the following formula:

y = B(argmax_π P(π | x))

where P(π | x) is the probability of a character label path π given the input sequence x, argmax_π P(π | x) selects the maximum path probability corresponding to each character label, B(π) is the mapping that merges repeated labels in a path and removes blanks, and y is the predicted character corresponding to the character labels.
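The path-merging step of the transcription layer can be illustrated with best-path (greedy) CTC decoding, a common approximation of maximum-path-probability decoding (a hedged sketch; the label alphabet and blank index are assumptions, not the embodiment's configuration):

```python
def ctc_greedy_decode(label_probs, blank=0):
    """Best-path CTC decoding.

    label_probs : list of per-time-step probability lists over labels,
                  as produced by the recurrent layer (one row per step).
    blank       : index of the CTC blank label.

    Picks the most probable label at each time step (the maximum path
    probability), then applies the mapping B: merge consecutive repeats
    and drop blanks, yielding the predicted character indices.
    """
    # argmax per time step -> the best path pi
    best_path = [max(range(len(row)), key=lambda k: row[k])
                 for row in label_probs]
    # B(pi): collapse consecutive repeats, then remove blanks
    decoded = []
    prev = None
    for label in best_path:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

For example, for probabilities `[[0.1, 0.9], [0.1, 0.9], [0.8, 0.2]]` with `blank=0`, the best path is `[1, 1, 0]`, which collapses to the single predicted label `[1]`.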
The model training module 105 is configured to calculate a loss value from the predicted character set, the real character labeling set, and the error character labeling set, and if the loss value does not satisfy a preset condition, to adjust parameters of the optical character recognition model until the loss value satisfies the preset condition, thereby obtaining a trained optical character recognition model.
In an embodiment of the present invention, a first loss value is calculated between the predicted character set and the real character labeling set, a second loss value is calculated between the predicted character set and the error character labeling set, and the first loss value and the second loss value are fused to obtain the loss value. If the loss value does not satisfy a preset condition, the parameters of the optical character recognition model are adjusted until the loss value satisfies the preset condition, thereby obtaining the trained optical character recognition model.
For example, the preset condition may be that the loss value is less than a preset threshold of 0.1; when the loss value is greater than or equal to 0.1, the model parameters are adjusted until the loss value is less than 0.1, so as to obtain the trained optical character recognition model.
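The fused-loss stopping criterion described above can be sketched as follows (pure Python; the loss callbacks, fusion weight, and update rule are hypothetical placeholders for the embodiment's actual losses and optimizer):

```python
def train_until_converged(loss_fn, update_params, threshold=0.1,
                          alpha=0.5, max_steps=1000):
    """Adjust model parameters until the fused loss drops below threshold.

    loss_fn       : returns (first_loss, second_loss) for the current
                    parameters, i.e. the losses of the predicted character
                    set against the real and the error character label sets.
    update_params : performs one parameter-adjustment step.
    alpha         : fusion weight between the two loss values.
    """
    fused = float("inf")
    for step in range(max_steps):
        first_loss, second_loss = loss_fn()
        fused = alpha * first_loss + (1.0 - alpha) * second_loss
        if fused < threshold:          # preset condition satisfied
            return fused, step
        update_params()                # otherwise adjust parameters
    return fused, max_steps
```

A simple weighted sum is used here for the fusion; the embodiment does not specify the fusion operator, so this weight is purely illustrative.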
In the embodiment of the invention, the original picture set in actual production and the original data set corresponding to the original picture set are firstly obtained, so that data distribution differences between the development environment and the production environment are avoided and the accuracy of subsequent model training is improved. Secondly, when a preset search engine is idle, the original data set corresponding to the original picture set is acquired from the message queue channel by using the search engine, and error data identification is performed on the original data set by using the search engine, so that error data can be screened directly by the search engine rather than manually, thereby saving manpower and time, shortening the iteration period of subsequent model training, and improving the training efficiency of the subsequent model. Furthermore, labeling the real characters corresponding to the positive sample data and the error characters corresponding to the negative sample data facilitates the training of the subsequent model. Finally, predicted characters of the training data are recognized by using the optical character recognition model, loss values are calculated from the predicted characters, the real characters, and the error characters, and if the loss values do not meet the preset conditions, the parameters of the optical character recognition model are adjusted until the loss values meet the preset conditions, further improving the accuracy of the model, so as to obtain the trained optical character recognition model. Therefore, the optical character recognition model training device provided by the embodiment of the invention can improve the efficiency and accuracy of optical character recognition model training.
Fig. 3 is a schematic structural diagram of an electronic device for implementing the training method of the optical character recognition model according to the present invention.
The electronic device may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program, such as an optical character recognition model training program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 11 may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, or a flash card (Flash Card) provided on the electronic device. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only for storing application software installed in the electronic device and various types of data, such as the code of the optical character recognition model training program, but also for temporarily storing data that has been output or is to be output.
The processor 10 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit (Control Unit) of the electronic device; it connects the various components of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (e.g., the optical character recognition model training program), and invokes data stored in the memory 11 to perform the various functions of the electronic device and process data.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, and so on. The communication bus 12 is arranged to enable connection and communication between the memory 11 and the at least one processor 10, among other components. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.
Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 is not limiting of the electronic device and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for supplying power to the respective components, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device may further include various sensors, bluetooth modules, wi-Fi modules, etc., which are not described herein.
Optionally, the communication interface 13 may comprise a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further comprise a user interface, which may be a display or an input unit such as a keyboard (Keyboard), or a standard wired interface or wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying the information processed in the electronic device and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The optical character recognition model training program stored by the memory 11 in the electronic device is a combination of a plurality of computer programs that, when run in the processor 10, implement:
Acquiring an original picture set in actual production and an original data set corresponding to the original picture set, and storing the original picture set and the original data set into a preset message queue channel;
When a preset search engine is idle, acquiring an original data set corresponding to the original picture set from the message queue channel by using the search engine, performing error data screening on the original data set, and determining that the screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
Acquiring a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, wherein the error character labeling set is dynamically updated in real time;
Inputting the positive sample data set, the negative sample data set and the original picture set as training data sets to a preset optical character recognition model, and recognizing a predicted character set of the training data set by using the optical character recognition model;
And obtaining the loss values of the predicted character set, the real character labeling set and the error character labeling set through calculation, and if the loss values do not meet the preset conditions, adjusting the parameters of the optical character recognition model until the loss values meet the preset conditions, so as to obtain the trained optical character recognition model.
In particular, the specific implementation method of the processor 10 on the computer program may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
Further, the modules/units integrated in the electronic device may be stored in a computer readable medium if implemented in the form of software functional units and sold or used as stand-alone products. The computer readable medium may be non-volatile or volatile. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (ROM).
Embodiments of the present invention may also provide a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, may implement:
Acquiring an original picture set in actual production and an original data set corresponding to the original picture set, and storing the original picture set and the original data set into a preset message queue channel;
When a preset search engine is idle, acquiring an original data set corresponding to the original picture set from the message queue channel by using the search engine, performing error data screening on the original data set, and determining that the screened error data form a negative sample data set and non-error data except the error data form a positive sample data set;
Acquiring a real character labeling set corresponding to the positive sample data set and an error character labeling set corresponding to the negative sample data set, wherein the error character labeling set is dynamically updated in real time;
Inputting the positive sample data set, the negative sample data set and the original picture set as training data sets to a preset optical character recognition model, and recognizing a predicted character set of the training data set by using the optical character recognition model;
And obtaining the loss values of the predicted character set, the real character labeling set and the error character labeling set through calculation, and if the loss values do not meet the preset conditions, adjusting the parameters of the optical character recognition model until the loss values meet the preset conditions, so as to obtain the trained optical character recognition model.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
In the several embodiments provided by the present invention, it should be understood that the disclosed media, devices, apparatuses, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, where each block contains information from a batch of network transactions, used to verify the validity (anti-counterfeiting) of its information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude the plural. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.