CN112507658B - A method, device and equipment for generating prediction models and normalizing detection data


Info

Publication number
CN112507658B
Authority
CN
China
Prior art keywords
feature
standard
data
nonstandard
detection data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011401732.3A
Other languages
Chinese (zh)
Other versions
CN112507658A (en)
Inventor
艾迪歌
张陈
金圣海
李楠超
赵刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Group Guangzhou Co ltd
Neusoft Corp
Original Assignee
Neusoft Group Guangzhou Co ltd
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Group Guangzhou Co ltd, Neusoft Corp
Priority to CN202011401732.3A
Publication of CN112507658A
Application granted
Publication of CN112507658B
Active
Anticipated expiration


Abstract

The present application discloses a method for generating a prediction model, and a method, apparatus, and device for normalizing detection data. First non-standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a first non-standard feature sentence that serves as first non-standard feature data. Standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a standard feature sentence that serves as standard feature data. A prediction model is trained using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data. For input second non-standard feature data and standard feature data, the prediction model outputs a matching result. From the matching result output by the prediction model, the standard detection data to which the second non-standard detection data can be normalized is determined quickly and accurately, which improves the efficiency of normalizing detection data.

Description

Method, device and equipment for generating prediction model and normalizing detection data
Technical Field
The present application relates to the technical field of data processing, and in particular to a method for generating a prediction model, and a method, apparatus, and device for normalizing detection data.
Background
The detection data is data obtained after detecting the detection object by the detection device. The detection data may be used for data analysis. For example, in the medical field, medical detection data obtained by a detection device can be used for analysis of diseases, drugs, and the like.
The detection data obtained by different detection devices is not uniform, so it must be normalized before it can be used for analysis. At present, normalization of detection data is generally performed manually. However, manual normalization is inefficient, can handle only a small amount of detection data, and may produce incorrect results.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method for generating a prediction model, and a method, apparatus, and device for normalizing detection data, which can normalize non-standard detection data to standard detection data quickly and accurately.
In order to solve the above problems, the technical solutions provided by the embodiments of the present application are as follows:
A method for generating a prediction model, the method comprising:
acquiring first non-standard detection data, wherein the first non-standard detection data comprises an attribute value of at least one field;
splicing the attribute values of the fields of the first non-standard detection data to generate a first non-standard feature sentence, and taking the first non-standard feature sentence as first non-standard feature data;
acquiring standard detection data, wherein the standard detection data comprises an attribute value of at least one field;
splicing the attribute values of the fields of the standard detection data to generate a standard feature sentence, and taking the standard feature sentence as standard feature data;
training a prediction model by using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data, wherein the prediction model is used for outputting, when second non-standard feature data and standard feature data are input, a matching result indicating whether the second non-standard feature data matches the input standard feature data, so that whether the second non-standard detection data corresponding to the second non-standard feature data can be normalized to the standard detection data corresponding to the standard feature data is determined according to the matching result.
In one possible implementation, splicing the attribute values of the fields of the first non-standard detection data to generate a first non-standard feature sentence comprises:
forming a first attribute feature pair from the field name of each field in the first non-standard detection data and the corresponding attribute value;
splicing all first attribute feature pairs in the first non-standard detection data to generate the first non-standard feature sentence;
and splicing the attribute values of the fields of the standard detection data to generate a standard feature sentence comprises:
forming a second attribute feature pair from the field name of each field in the standard detection data and the corresponding attribute value;
splicing all second attribute feature pairs in the standard detection data to generate the standard feature sentence.
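The pairing step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_feature_sentence`, the colon between a field name and its value, and the period between pairs are all hypothetical choices, since the text does not fix a concrete format.

```python
def build_feature_sentence(record, pair_sep=":", field_sep="."):
    """Join each (field name, attribute value) pair of a record into one sentence.

    The separators are illustrative assumptions, not values taken from the text.
    """
    # Skip fields whose attribute value is missing or empty.
    pairs = [f"{name}{pair_sep}{value}" for name, value in record.items() if value]
    return field_sep.join(pairs)


# Hypothetical non-standard record; dict insertion order gives the splicing order.
non_standard = {
    "detection name": "albumin",
    "unit": "g/l",
    "sampling type": "drainage fluid",
}
sentence = build_feature_sentence(non_standard)
```

The same helper would apply unchanged to a standard record, which keeps the two feature sentences in a comparable form.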
In one possible implementation, the method further comprises:
training a first language model and a second language model by using the first non-standard feature sentence and the standard feature sentence.
In one possible implementation, before training the prediction model by using the first non-standard feature data, the standard feature data, and the label indicating whether the first non-standard feature data matches the standard feature data, the method further comprises:
inputting the first non-standard feature sentence into the first language model to obtain a feature vector of the first non-standard feature sentence, and taking the feature vector of the first non-standard feature sentence as the first non-standard feature data;
inputting the standard feature sentence into the second language model to obtain a feature vector of the standard feature sentence, and taking the feature vector of the standard feature sentence as the standard feature data.
In one possible implementation, the first language model, the second language model, and the prediction model form a twin network (Siamese network).
A method for normalizing detection data, the method comprising:
acquiring second non-standard detection data, wherein the second non-standard detection data comprises an attribute value of at least one field;
splicing the attribute values of the fields of the second non-standard detection data to generate a second non-standard feature sentence, and taking the second non-standard feature sentence as second non-standard feature data;
inputting the second non-standard feature data and target standard feature data into a prediction model to obtain a matching result indicating whether the second non-standard feature data matches the target standard feature data, wherein the target standard feature data is generated from a target standard feature sentence, the target standard feature sentence is generated by splicing the attribute values of the fields of target standard detection data, and the target standard detection data is any one item of the standard detection data;
if the matching result is that the second non-standard feature data matches the target standard feature data, normalizing the second non-standard detection data corresponding to the second non-standard feature data to the standard detection data corresponding to the target standard feature data.
In one possible implementation, when the target standard feature sentence is generated by splicing the field names and attribute values of the fields of the target standard detection data, splicing the attribute values of the fields of the second non-standard detection data to generate a second non-standard feature sentence comprises:
forming a third attribute feature pair from the field name of each field in the second non-standard detection data and the corresponding attribute value;
splicing all third attribute feature pairs in the second non-standard detection data to generate the second non-standard feature sentence.
In one possible implementation, when the target standard feature data is generated from the feature vector of the target standard feature sentence, the method further comprises:
inputting the second non-standard feature sentence into a first language model to generate a non-standard feature vector, and taking the non-standard feature vector as the second non-standard feature data;
wherein the feature vector of the target standard feature sentence is obtained by inputting the target standard feature sentence into a second language model;
and the first language model and the second language model are generated according to the method of claim 3.
An apparatus for generating a prediction model, the apparatus comprising:
a first acquisition unit, configured to acquire first non-standard detection data, wherein the first non-standard detection data comprises an attribute value of at least one field;
a first splicing unit, configured to splice the attribute values of the fields of the first non-standard detection data to generate a first non-standard feature sentence, and to take the first non-standard feature sentence as first non-standard feature data;
a second acquisition unit, configured to acquire standard detection data, wherein the standard detection data comprises an attribute value of at least one field;
a second splicing unit, configured to splice the attribute values of the fields of the standard detection data to generate a standard feature sentence, and to take the standard feature sentence as standard feature data;
a first training unit, configured to train a prediction model by using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data, wherein the prediction model is used for outputting, when second non-standard feature data and standard feature data are input, a matching result indicating whether the second non-standard feature data matches the input standard feature data, so that whether the second non-standard detection data corresponding to the second non-standard feature data can be normalized to the standard detection data corresponding to the standard feature data is determined according to the matching result.
An apparatus for normalizing detection data, the apparatus comprising:
a third acquisition unit, configured to acquire second non-standard detection data, wherein the second non-standard detection data comprises an attribute value of at least one field;
a third splicing unit, configured to splice the attribute values of the fields of the second non-standard detection data to generate a second non-standard feature sentence, and to take the second non-standard feature sentence as second non-standard feature data;
an input unit, configured to input the second non-standard feature data and target standard feature data into a prediction model to obtain a matching result indicating whether the second non-standard feature data matches the target standard feature data;
a normalization unit, configured to normalize the second non-standard detection data corresponding to the second non-standard feature data to the standard detection data corresponding to the target standard feature data if the matching result is that the second non-standard feature data matches the target standard feature data.
A device for generating a prediction model, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for generating a prediction model described above.
A device for normalizing detection data, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for normalizing detection data described above.
A computer-readable storage medium having instructions stored therein which, when run on a terminal device, cause the terminal device to perform the method for generating a prediction model described above, or the method for normalizing detection data described above.
It can be seen that the embodiments of the present application have the following beneficial effects:
In the method for generating a prediction model provided by the embodiments of the present application, first non-standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a first non-standard feature sentence that serves as first non-standard feature data. Standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a standard feature sentence that serves as standard feature data. A prediction model is trained using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data. For input second non-standard feature data and standard feature data, the prediction model outputs the corresponding matching result. The standard detection data to which the second non-standard detection data can be normalized is determined from the matching result output by the prediction model, so that this standard detection data is obtained more quickly through the prediction model. Because the prediction model is trained on the first non-standard feature data, the standard feature data, and the corresponding labels, it is accurate, so the standard detection data corresponding to the second non-standard detection data can be determined accurately from the matching result.
Drawings
FIG. 1 is a schematic framework diagram of an exemplary application scenario according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for generating a prediction model according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a twin network according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for normalizing detection data according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an apparatus for generating a prediction model according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an apparatus for normalizing detection data according to an embodiment of the present application.
Detailed Description
In order to make the above objects, features, and advantages of the present application easier to understand, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and specific implementations.
To facilitate understanding of the technical solutions provided by the embodiments of the present application, the background of the present application is explained first.
After studying the conventional process of normalizing detection data, the inventors found that detection data is generated by different detection devices, and each detection device names its detection data differently, so the resulting detection data is not uniform. To use the detection data for processing or analysis, it must first be normalized manually to obtain standard detection data. Because the normalization rules are complex, errors may occur during manual processing. Manual normalization is also inefficient and cannot process large amounts of detection data quickly. For example, in practical applications, a large amount of non-standard medical detection data cannot be quickly normalized to the corresponding standard medical detection data.
Based on this, the embodiments of the present application provide a method for generating a prediction model. First non-standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a first non-standard feature sentence that serves as first non-standard feature data. Standard detection data comprising an attribute value of at least one field is acquired, and the attribute values of the fields are spliced to generate a standard feature sentence that serves as standard feature data. A prediction model is trained using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data. For input second non-standard feature data and standard feature data, the prediction model outputs the corresponding matching result. The standard detection data to which the second non-standard detection data can be normalized is determined from the matching result output by the prediction model, so that this standard detection data is obtained more quickly through the prediction model. Because the prediction model is trained on the first non-standard feature data, the standard feature data, and the corresponding labels, it is accurate, so the standard detection data corresponding to the second non-standard detection data can be determined accurately from the matching result.
To facilitate understanding of the method for generating a prediction model provided by the embodiments of the present application, the method is described below with reference to the scenario example shown in FIG. 1, which is a schematic framework diagram of an exemplary application scenario according to an embodiment of the present application.
In practical applications, second non-standard detection data 101 is first acquired, a second non-standard feature sentence is generated from the second non-standard detection data, and the second non-standard feature sentence is taken as second non-standard feature data 102. The second non-standard feature data 102 and target standard feature data 103 are then input into a prediction model 104, which outputs a matching result indicating whether the second non-standard feature data 102 matches the target standard feature data 103. According to the matching result, it can be determined whether the second non-standard detection data 101 corresponding to the second non-standard feature data 102 can be normalized to the target standard detection data corresponding to the target standard feature data 103.
Those skilled in the art will appreciate that the framework diagram shown in FIG. 1 is only one example in which the embodiments of the present application may be implemented. The scope of applicability of the embodiments of the present application is not limited in any way by this framework.
To facilitate understanding of the present application, a method for generating a prediction model according to an embodiment of the present application is described below with reference to the accompanying drawings.
FIG. 2 is a flowchart of a method for generating a prediction model according to an embodiment of the present application. As shown in FIG. 2, the method may include S201 to S205:
S201, acquiring first non-standard detection data, wherein the first non-standard detection data comprises an attribute value of at least one field.
The first non-standard detection data is the non-standard detection data used to train the prediction model. It may be non-standard detection data for which corresponding standard detection data exists, for example first non-standard medical detection data produced by a medical detection device. It may also be generated from standard detection data, for example by replacing or deleting part of the data in the standard detection data, such as first non-standard medical detection data obtained by modifying standard medical detection data.
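As a rough illustration of deriving non-standard training data from standard data by replacement or deletion, the sketch below perturbs a standard record. Everything in it is an assumption made for illustration: the function name `perturb_standard_record`, the `synonyms` lookup, the deletion probability, and the sample field values are hypothetical and do not come from the text.

```python
import random


def perturb_standard_record(record, synonyms, drop_prob=0.3, rng=None):
    """Derive a non-standard record by deleting or replacing attribute values.

    `synonyms` maps a standard value to a hypothetical non-standard variant.
    """
    rng = rng or random.Random()
    noisy = {}
    for field, value in record.items():
        if rng.random() < drop_prob:
            continue  # deletion: this field is left out of the non-standard record
        noisy[field] = synonyms.get(value, value)  # replacement when a variant is known
    if not noisy:
        # Keep at least one field so the record still has an attribute value.
        first_field = next(iter(record))
        noisy[first_field] = synonyms.get(record[first_field], record[first_field])
    return noisy


# Hypothetical standard record and variant table.
standard = {"component": "albumin", "system": "serum", "scale": "quantitative"}
variants = {"albumin": "Alb", "serum": "blood serum"}
derived = perturb_standard_record(standard, variants, rng=random.Random(0))
```

A record derived this way has a known source, so the pair (derived record, original standard record) can be treated as a matching pair when labels are assigned.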
The first non-standard detection data has at least one field and an attribute value corresponding to each field. A field indicates a kind of data contained in the first non-standard detection data. The fields may be selected from commonly used and representative kinds of detection data. Taking medical detection data as an example, the fields may be detection identification, detection name, normal value range, unit, sampling type, detection method, and the like. The detection identification is the Chinese or English abbreviation of the detected substance, such as ACTH (adrenocorticotropic hormone) or corticotropin. The detection name is the name of the detection item, and may be the specific component to be detected, such as urine microalbumin. The normal value range is the range of standard normal values, and may be expressed as a numerical range, for example <20, or in words, for example positive. The sampling type is the location or tissue to be sampled, for example cerebrospinal fluid. The detection method is the method used in the detection, for example the Gal-G2-CNP method.
An attribute value is the information corresponding to a field, and may be expressed as a number, text, or a symbol. Taking the six fields above as an example, the corresponding attribute values may be AlbG (detection identification), albumin (detection name), bromocresol green method (detection method), 35-53 (normal value range), g/l (unit), and drainage fluid (sampling type).
It should be noted that the first non-standard detection data includes the attribute value of at least one field. In practical applications, the first non-standard detection data needs to include at least the detection name field. In one possible implementation, attribute values of several fields may be selected so as to improve the accuracy of the prediction model while preserving its calculation speed.
S202, splicing the attribute values of the fields of the first non-standard detection data to generate a first non-standard feature sentence, and taking the first non-standard feature sentence as the first non-standard feature data.
It can be understood that the fields in non-standard detection data and the fields in standard detection data do not correspond one-to-one in meaning, order, or semantics, so the correspondence cannot be determined directly from such structured data.
Therefore, the attribute values of the fields in the first non-standard detection data are spliced to generate the corresponding first non-standard feature sentence. Taking the attribute values above as an example, the resulting first non-standard feature sentence is "AlbG. albumin. bromocresol green method. 35-53. g/l. drainage fluid." To distinguish the attribute values of the individual fields, a separator may be added between them. The separator may be a punctuation mark or another commonly used separator symbol, such as a period, a comma, or a separator line.
The resulting first non-standard feature sentence is taken as the first non-standard feature data.
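The splicing in S202 amounts to joining the attribute values with a separator. The following is a minimal sketch, assuming a period as the separator and a hypothetical helper name `splice_attribute_values`; the sample values are the ones used in the example above.

```python
def splice_attribute_values(values, separator="."):
    """Join attribute values into one feature sentence, skipping empty values.

    The separator is configurable; a period is used here as one of the
    separators the text mentions.
    """
    cleaned = [str(v).strip() for v in values if v is not None and str(v).strip()]
    return separator.join(cleaned)


values = ["AlbG", "albumin", "bromocresol green method", "35-53", "g/l", "drainage fluid"]
feature_sentence = splice_attribute_values(values)
```

Skipping empty values is a design choice of this sketch, so that a record with missing fields still produces a well-formed sentence.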
In one possible implementation, the field name of each field may also be added to the first non-standard feature sentence. The embodiments of the present application further provide a specific implementation of splicing the attribute values of the fields of the first non-standard detection data to generate the first non-standard feature sentence, which is described below.
S203, acquiring standard detection data, wherein the standard detection data comprises an attribute value of at least one field.
Standard detection data is acquired. It may be obtained by annotation according to an annotation standard widely accepted in the field. In the medical field, the standard detection data may be standard medical detection data, such as LOINC codes in the LOINC (Logical Observation Identifiers Names and Codes) system.
The standard detection data includes an attribute value of at least one field. The fields in the standard detection data may be selected from the annotation standard. In one possible implementation, for standard medical detection data, six fields may be selected from LOINC as the fields of the standard detection data, namely component, property, time aspect, system, scale, and method. These six fields determine a unique LOINC code, that is, a unique item of standard detection data. By training the prediction model with LOINC codes, the second non-standard detection data can be normalized through the prediction model to its uniquely corresponding standard detection data.
Here, component is the component or analyte to be detected, that is, the object of the detection. Property is the kind of property of the object that is measured. Time aspect is the temporal characteristic of the detection, for example whether the measurement is taken at a single point in time or over a period of time. System is the sample to be tested. Scale is the scale or precision type of the detection, for example qualitative or quantitative. Method is the type of method used for the detection.
S204, splicing the attribute values of the fields of the standard detection data to generate a standard feature sentence, and taking the standard feature sentence as standard feature data.
The attribute values of the fields of the standard detection data are spliced to generate a standard feature sentence. The standard feature sentence is generated in a manner similar to the first non-standard feature sentence, which is not repeated here. The resulting standard feature sentence is taken as the standard feature data.
In one possible implementation, the field name of each field may also be added to the standard feature sentence. The embodiments of the present application further provide a specific implementation of splicing the attribute values of the fields of the standard detection data to generate the standard feature sentence, which is described below.
S205, training a prediction model by using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data, wherein the prediction model is used for outputting, when second non-standard feature data and standard feature data are input, a matching result indicating whether the second non-standard feature data matches the input standard feature data, so that whether the second non-standard detection data corresponding to the second non-standard feature data can be normalized to the standard detection data corresponding to the standard feature data is determined according to the matching result.
The first non-standard detection data used to train the prediction model may have a known matching relationship with the standard detection data. Correspondingly, the first non-standard feature data and the standard feature data may have a matching relationship. This matching relationship is used to determine the label indicating whether the first non-standard feature data matches the standard feature data. The label may be represented by "0" or "1": the label is "1" when the first non-standard feature data matches the standard feature data, and "0" when it does not.
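The labelling rule can be made concrete with a small sketch. It assumes the matching relationship is available as a mapping from each non-standard sentence to the index of its matching standard sentence, and it pairs every non-standard sentence with every standard sentence; both the helper name `build_labelled_pairs` and this exhaustive pairing strategy are illustrative assumptions, since the text does not say how non-matching pairs are chosen.

```python
def build_labelled_pairs(non_standard_sentences, standard_sentences, matches):
    """Return (non-standard sentence, standard sentence, label) triples.

    `matches` maps the index of a non-standard sentence to the index of the
    standard sentence it matches. Label 1 means match, 0 means no match.
    """
    pairs = []
    for i, non_standard in enumerate(non_standard_sentences):
        for j, standard in enumerate(standard_sentences):
            label = 1 if matches.get(i) == j else 0
            pairs.append((non_standard, standard, label))
    return pairs


# Hypothetical feature sentences.
non_standard_sentences = ["Alb.g/l.serum", "Glu.mmol/l.urine"]
standard_sentences = ["albumin.mass concentration.serum", "glucose.substance concentration.urine"]
training_pairs = build_labelled_pairs(non_standard_sentences, standard_sentences, {0: 0, 1: 1})
```

With a large standard catalogue, exhaustive pairing yields far more "0" labels than "1" labels, so in practice the negatives would probably need to be sampled.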
The prediction model is trained using the obtained first non-standard feature data, the standard feature data, and the label indicating whether they match. The prediction model may be a BERT (Bidirectional Encoder Representations from Transformers) model, or may consist of a two-layer fully connected network with Dropout.
The feature vector $v_s$ extracted from the first non-standard detection data and the feature vector $v_l$ extracted from the standard detection data may be combined in several ways, for example by addition, subtraction, multiplication, and concatenation, and fused into an intermediate vector $v_c$. The loss function of the prediction model may be the cross entropy commonly used for classification models. $y_t$ is a 2-dimensional vector indicating whether $v_s$ and $v_l$ match: $y_t = [1, 0]$ if they match, and $y_t = [0, 1]$ otherwise. The output of the prediction model and its loss are calculated as follows:
\[
\begin{aligned}
v_c &= \mathrm{Concatenate}(v_s + v_l,\; v_s - v_l,\; v_s \times v_l) \\
v_d &= \mathrm{Dropout}(v_c) \\
v_h &= \tanh(W_h v_d + b_d) \\
y_p &= \mathrm{softmax}(W_o v_h + b_h) \\
\mathrm{Loss} &= \mathrm{CrossEntropy}(y_p, y_t)
\end{aligned}
\]
where $W_h$ and $W_o$ are trainable parameters that are adjusted to optimize the loss function, and $b_d$ and $b_h$ are the bias terms of the two layers. Through optimization of the loss function, the prediction model learns to correctly match the first non-standard feature data with the standard feature data. Training of the prediction model is complete once its accuracy satisfies a preset condition.
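The forward computation above can be traced in plain Python. This is a sketch under stated assumptions rather than the patented implementation: the multiplication is taken to be element-wise, Dropout is written in the common "inverted" form that rescales kept entries, the dimensions and random weights are arbitrary, and no training step is shown.

```python
import math
import random


def fuse(vs, vl):
    """vc = Concatenate(vs + vl, vs - vl, vs * vl), all element-wise (assumed)."""
    return ([a + b for a, b in zip(vs, vl)]
            + [a - b for a, b in zip(vs, vl)]
            + [a * b for a, b in zip(vs, vl)])


def dropout(v, rate, rng, training):
    """Inverted dropout (an assumption): zero entries with probability `rate`."""
    if not training or rate == 0.0:
        return list(v)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in v]


def affine(W, x, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(yp, yt):
    return -sum(t * math.log(p) for p, t in zip(yp, yt) if t > 0)


def forward(vs, vl, Wh, bd, Wo, bh, rate=0.1, rng=None, training=False):
    vc = fuse(vs, vl)
    vd = dropout(vc, rate, rng or random.Random(), training)
    vh = [math.tanh(x) for x in affine(Wh, vd, bd)]
    return softmax(affine(Wo, vh, bh))


rng = random.Random(0)
dim, hidden = 4, 3  # arbitrary sizes chosen for the example
vs = [rng.uniform(-1, 1) for _ in range(dim)]
vl = [rng.uniform(-1, 1) for _ in range(dim)]
Wh = [[rng.uniform(-0.5, 0.5) for _ in range(3 * dim)] for _ in range(hidden)]
bd = [0.0] * hidden
Wo = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(2)]
bh = [0.0, 0.0]

yp = forward(vs, vl, Wh, bd, Wo, bh)
loss = cross_entropy(yp, [1.0, 0.0])  # yt = [1, 0] encodes "match"
```

In a real system the parameters would be updated by gradient descent on `loss`, typically with an automatic differentiation library instead of hand-written lists.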
Given second non-standard feature data and standard feature data as input, the generated prediction model outputs a matching result indicating whether the two match. The matching result is then used to determine whether the second non-standard detection data corresponding to the second non-standard feature data can be normalized to the standard detection data corresponding to the standard feature data.
Based on S201-S205 above, in the embodiment of the present application, the first non-standard detection data and the standard detection data are acquired, and the attribute values of the fields in each are spliced, so that both are converted into the form of natural sentences. This makes it possible to determine the degree of similarity between attribute values of non-corresponding fields in different detection data. The generated prediction model can therefore determine whether input second non-standard feature data matches target standard feature data, and the matching result obtained is more accurate. The standard detection data to which the second non-standard detection data is normalized can in turn be determined quickly and accurately from the matching result.
During splicing, the meaning of each field of the first non-standard detection data may differ from the meaning of each field of the standard detection data. Splicing only the attribute values of the first non-standard detection data, and only the attribute values of the standard detection data, may lose the meaning carried by the fields.
To address this problem, in one possible implementation, the embodiment of the present application provides a specific implementation of splicing the attribute values of the fields of the first non-standard detection data to generate the first non-standard feature sentence, which can further improve the accuracy of the generated prediction model, and specifically includes:
forming a first attribute feature pair from the field name of each field in the first non-standard detection data and the corresponding attribute value;
and splicing all the first attribute feature pairs in the first non-standard detection data to generate the first non-standard feature sentence.
That is, the field name of each field in the first non-standard detection data and the attribute value corresponding to that field first form a first attribute feature pair, and the first attribute feature pairs are then spliced. In this way the field name is used as a feature in the first non-standard feature sentence, and the information contained in the field name is not lost.
To facilitate training of the prediction model and reduce the number of embedding layers it contains, the field name and the corresponding attribute value may be formed into a first attribute feature pair in the form of a key-value pair, for example similar to JSON. To distinguish the field name from the field attribute value, a separator may be placed between them. The embodiment of the present application does not limit this separator; it may be any common separator symbol. For example, the first attribute feature pair may take the format "field attribute value|field name". Taking the field name "detection name" and the field attribute value "urine microalbumin" as an example, the first attribute feature pair formed is "urine microalbumin|detection name".
Further, Chinese field names raise a Chinese word segmentation problem and may occupy more dimensions. Preferably, when the field names are in Chinese, the field name in the first attribute feature pair may be replaced by the English term corresponding to the Chinese term, so that feature extraction treats the English term as a whole and the number of dimensions occupied by the field name is reduced. For example, the English term corresponding to the field name "detection name" is "item_name", and the corresponding first attribute feature pair is "urine microalbumin|item_name".
The generated first attribute feature pairs are spliced to obtain the corresponding first non-standard feature sentence. The first attribute feature pairs may be separated from one another by a separator symbol.
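The construction of attribute feature pairs and their splicing into a feature sentence can be sketched as follows. The function name `build_feature_sentence`, its parameters, and the choice of ". " as the separator between pairs are assumptions made for illustration; the source fixes only the "attribute value|field name" pattern and leaves the separators open.

```python
from typing import Dict, Optional


def build_feature_sentence(
    record: Dict[str, str],
    english_names: Optional[Dict[str, str]] = None,
    pair_sep: str = "|",
    sentence_sep: str = ". ",
) -> str:
    """Splice "attribute value|field name" pairs into one feature sentence.

    english_names optionally maps a field name to an English term that
    replaces it, as described for Chinese field names.
    """
    names = english_names or {}
    pairs = []
    for field_name, value in record.items():
        name = names.get(field_name, field_name)
        pairs.append(f"{value}{pair_sep}{name}")
    return sentence_sep.join(pairs)


# Example taken from the text: one field, with and without the English term.
plain = build_feature_sentence({"detection name": "urine microalbumin"})
english = build_feature_sentence(
    {"detection name": "urine microalbumin"},
    english_names={"detection name": "item_name"},
)
```

With more than one field, the pairs are joined in the order in which the fields appear in the record.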
Correspondingly, splicing the attribute values of the fields of the standard detection data to generate the standard feature sentence specifically includes:
Forming a second attribute characteristic pair by the field name of each field in the standard detection data and the corresponding attribute value;
and splicing all second attribute feature pairs in the standard detection data to generate a standard feature sentence.
The method of composing the field name and the attribute value of each field in the standard detection data into the second attribute feature pair is similar to the method of composing the field name and the attribute value of each field in the first non-standard detection data into the first attribute feature pair, and the method of generating the standard feature sentence is similar to the method of generating the first non-standard feature sentence, which will not be described herein.
In the embodiment of the present application, the first attribute feature pairs and the second attribute feature pairs are obtained by combining the field names and field attribute values of the first non-standard detection data and the standard detection data respectively, so that the information contained in the field names is retained without changing the structure of the prediction model, and the accuracy of the generated prediction model is improved. The matching result obtained by the prediction model for the second non-standard feature data and the target standard feature data is therefore more accurate.
When training the prediction model by using the first non-standard feature sentence, the standard feature sentence and the corresponding label, the time for calculating the feature vectors of the first non-standard feature sentence and the standard feature sentence by the prediction model may be long, which may increase the time for outputting the result by the prediction model.
Based on the above problems, in a possible implementation manner, the embodiment of the application further provides a method for generating a prediction model. In addition to the above steps, the method further comprises:
training the first language model and the second language model by using the first nonstandard feature sentence and the standard feature sentence.
The first language model is a model for extracting the feature vector of the first non-standard feature sentence, and the second language model is a model for extracting the feature vector of the standard feature sentence. The feature vectors are extracted by the first language model and the second language model respectively and then input into the prediction model, so that the prediction model only has to calculate the degree of similarity between the feature vectors, which increases the calculation speed of the prediction model.
The first language model and the second language model may be trained using the Next Sentence Prediction (NSP) method of the BERT (Bidirectional Encoder Representations from Transformers) model.
It should be noted that the order of the first non-standard feature sentence and the standard feature sentence may affect the training of the first language model and the second language model. Therefore, the first non-standard feature sentence followed by the standard feature sentence may be spliced into one combination, and the standard feature sentence followed by the first non-standard feature sentence may be spliced into another combination, and both combinations are used together to train the first language model and the second language model.
In addition, the first non-standard feature sentence may be spliced from only the attribute values of the fields of the first non-standard detection data, or from the first attribute feature pairs obtained by combining the field name and attribute value of each field of the first non-standard detection data. These two kinds of first non-standard feature sentence correspond to two first language models respectively. Correspondingly, the standard feature sentence may be spliced from the attribute values of the fields of the standard detection data, or from the second attribute feature pairs obtained by combining the field name and attribute value of each field of the standard detection data, and there are likewise two corresponding second language models.
Based on the above, the first language model and the second language model are trained using the first non-standard feature sentence and the standard feature sentence, and are then used to extract the feature vectors of the first non-standard feature sentence and the standard feature sentence respectively. Because feature vector extraction and feature vector matching are performed by different models, the calculation time of feature extraction is reduced, and the matching result for the second non-standard feature data and the standard feature data is computed faster.
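Building both orderings of each sentence pair can be sketched as follows. The function name `both_orders` and the reuse of the 0/1 match label as the target for both orderings are assumptions; the source states only that both splicing orders are used together in training.

```python
from typing import List, Tuple

Sample = Tuple[str, str, int]


def both_orders(pairs: List[Sample]) -> List[Sample]:
    """Emit each (non-standard, standard) pair in both sentence orders.

    The label is carried over unchanged to the swapped pair, which is an
    assumption made for this sketch.
    """
    samples = []
    for nonstandard_sentence, standard_sentence, label in pairs:
        samples.append((nonstandard_sentence, standard_sentence, label))
        samples.append((standard_sentence, nonstandard_sentence, label))
    return samples


# Hypothetical feature sentences, for illustration only.
samples = both_orders([("Albumin. g/L", "Albumin. g/L. Serum", 1)])
```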
Further, before the prediction model is trained using the first non-standard feature data, the standard feature data, and the label indicating whether the first non-standard feature data matches the standard feature data, the method further includes:
inputting the first non-standard feature sentence into the first language model to obtain the feature vector of the first non-standard feature sentence, and taking the feature vector of the first non-standard feature sentence as the first non-standard feature data;
and inputting the standard feature sentence into the second language model to obtain the feature vector of the standard feature sentence, and taking the feature vector of the standard feature sentence as the standard feature data.
After training and generating the first language model and the second language model by using the first nonstandard feature sentence and the standard feature sentence, the first nonstandard feature sentence can be input into the first language model to obtain a feature vector corresponding to the first nonstandard feature sentence, and the feature vector of the obtained first nonstandard feature sentence is used as first nonstandard feature data. And inputting the standard feature sentence into the second language model to obtain an output feature vector corresponding to the standard feature sentence, and taking the feature vector of the standard feature sentence as standard feature data. The obtained first non-standard characteristic data and standard characteristic data are used for training a prediction model.
It can be understood that the first non-standard feature sentence can be spliced in two ways, and there are correspondingly two first language models. When the feature vector of the first non-standard feature sentence is extracted, the first language model corresponding to the way the sentence was spliced is selected. Similarly, the standard feature sentence can be spliced in two ways, and the corresponding second language model must be selected to extract its feature vector.
In the embodiment of the present application, the feature vectors of the first non-standard feature sentence and the standard feature sentence are extracted by the first language model and the second language model respectively, and the extracted feature vectors are used as the data for training the prediction model. Feature extraction is thus performed by the language models while the prediction model is trained on the resulting feature vectors, which improves the calculation speed of the prediction model. In addition, the language models extract feature vectors faster than the prediction model would, so the overall speed of normalizing the detection data is improved.
In one possible implementation, the first language model, the second language model, and the prediction model form a twin network (also known as a Siamese network).
Referring to Fig. 3, which shows the structure of a twin network according to an embodiment of the present application, the first language model 301 is used to extract the feature vector of the first non-standard feature sentence, and the second language model 302 is used to extract the feature vector of the standard feature sentence. The two feature vectors are input into the prediction model 303, which judges the degree of similarity between the feature vector of the first non-standard feature sentence and the feature vector of the standard feature sentence, and outputs, according to the degree of similarity, a matching result indicating whether the two feature vectors match.
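The wiring of the twin network (two encoders feeding one matching model) can be sketched as follows. Only the data flow is taken from the text. The class name `TwinMatcher` is hypothetical, and `toy_encoder` and `toy_head` are deliberately trivial stand-ins (a character-count vector and a cosine threshold) for the trained language models 301 and 302 and the trained prediction model 303; they are not the models described in the source.

```python
import math
from typing import Callable, List

Vector = List[float]


def toy_encoder(dim: int = 16) -> Callable[[str], Vector]:
    """Stand-in for a language model: a normalized character-count vector."""
    def encode(sentence: str) -> Vector:
        vec = [0.0] * dim
        for ch in sentence.lower():
            vec[ord(ch) % dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]
    return encode


def toy_head(threshold: float = 0.9) -> Callable[[Vector, Vector], int]:
    """Stand-in for the prediction model: thresholded cosine similarity."""
    def predict(vs: Vector, vl: Vector) -> int:
        cosine = sum(a * b for a, b in zip(vs, vl))
        return 1 if cosine >= threshold else 0
    return predict


class TwinMatcher:
    """Two encoders (301, 302) feeding one matching model (303)."""

    def __init__(self, first_lm, second_lm, prediction_model):
        self.first_lm = first_lm
        self.second_lm = second_lm
        self.prediction_model = prediction_model

    def match(self, nonstandard_sentence: str, standard_sentence: str) -> int:
        vs = self.first_lm(nonstandard_sentence)   # non-standard branch
        vl = self.second_lm(standard_sentence)     # standard branch
        return self.prediction_model(vs, vl)       # 1 = match, 0 = no match


matcher = TwinMatcher(toy_encoder(), toy_encoder(), toy_head())
same = matcher.match("Albumin. g/L", "Albumin. g/L")
different = matcher.match("aaaa", "bbbb")
```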
Based on the above, in the embodiment of the present application, matching of the second nonstandard detection data and the standard detection data can be achieved through the twin network formed by the first language model, the second language model and the prediction model, and normalization of the detection data can be achieved rapidly and accurately.
Based on the above prediction model generation method, the embodiment of the present application also provides a method for normalizing detection data. Referring to Fig. 4, which is a flowchart of a method for normalizing detection data according to an embodiment of the present application, the method may include S401 to S404:
S401, second non-standard detection data is acquired, wherein the second non-standard detection data comprises attribute values of at least one field.
The second non-standard detection data is detection data that is not in the standard detection data format. It may be detection data obtained by actual detection. For example, in the medical field, the second non-standard detection data may be medical detection data produced by an actual medical detection device.
The second non-standard detection data includes an attribute value of at least one field. A field may represent a detected item, and the attribute value of the field represents the detected value corresponding to the field. The fields of the second non-standard detection data may include detection ID, detection name, normal value range, unit, sampling type, detection method and the like.
S402, splicing the attribute values of the fields of the second non-standard detection data to generate a second non-standard feature sentence, and taking the second non-standard feature sentence as the second non-standard feature data.
The attribute values of the fields of the second non-standard detection data are spliced to obtain the corresponding second non-standard feature sentence. The second non-standard feature sentence is spliced in a manner similar to the first non-standard feature sentence used in training, which is not repeated here.
For example, Table 1 shows second non-standard detection data provided by an embodiment of the present application:
Detection ID | Detection name | Detection method | Normal value range | Unit | Sampling type
AlbG | Albumin | Bromocresol green (BCG) method | 35--53 | g/L | Drainage liquid
Table 1
Taking the second non-standard detection data in Table 1 as an example, the corresponding second non-standard feature sentence is "AlbG. Albumin. Bromocresol green (BCG) method. 35--53. g/L. Drainage liquid."
In one possible implementation, if the prediction model was trained on a first non-standard feature sentence spliced from first attribute feature pairs, a standard feature sentence spliced from second attribute feature pairs, and the corresponding label, the second non-standard feature sentence should also be generated from field names and attribute values. The embodiment of the present application provides a specific implementation of splicing the attribute values of the fields of the second non-standard detection data to generate the second non-standard feature sentence, which is described below.
S403, inputting the second non-standard feature data and target standard feature data into a prediction model to obtain a matching result indicating whether the second non-standard feature data matches the target standard feature data. The target standard feature data is generated from a target standard feature sentence; the target standard feature sentence is generated by splicing the attribute values of the fields of target standard detection data; the target standard detection data is any item of the standard detection data; and the prediction model is generated by the prediction model generation method of any of the above embodiments.
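Splicing only the attribute values of the Table 1 record can be sketched as follows. The function name `splice_values`, the dictionary representation of the record, and the ". " separator are assumptions made for illustration.

```python
from typing import Dict


def splice_values(record: Dict[str, str], sep: str = ". ") -> str:
    """Join the attribute values of all fields, in field order."""
    return sep.join(str(value) for value in record.values())


# The second non-standard detection data from Table 1.
table_1_record = {
    "detection ID": "AlbG",
    "detection name": "Albumin",
    "detection method": "Bromocresol green (BCG) method",
    "normal value range": "35--53",
    "unit": "g/L",
    "sampling type": "Drainage liquid",
}

second_nonstandard_sentence = splice_values(table_1_record)
```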
One item is selected from the standard detection data as the target standard detection data, the attribute values of the fields of the target standard detection data are spliced to generate the corresponding target standard feature sentence, and the target standard feature data to be input into the prediction model is obtained from the target standard feature sentence.
To increase the speed of normalizing the detection data, the existing standard detection data may be processed in advance to obtain the standard feature data corresponding to each item of standard detection data. When the second non-standard detection data is normalized, target standard detection data is selected from the standard detection data, and the target standard feature data corresponding to the target standard detection data is retrieved from the standard feature data obtained in advance and used to normalize the second non-standard detection data.
The second non-standard feature data and the target standard feature data are input into the prediction model, which may be generated by the above prediction model generation method, and the matching result output by the prediction model, indicating whether the second non-standard feature data matches the target standard feature data, is obtained.
S404, if the matching result is that the second nonstandard feature data is matched with the target standard feature data, normalizing the second nonstandard detection data corresponding to the second nonstandard feature data to the standard detection data corresponding to the target standard feature data.
The prediction model determines the matching result according to the degree of similarity between the second non-standard feature data and the target standard feature data. If the matching result is that the second non-standard feature data matches the target standard feature data, this indicates that the second non-standard detection data corresponding to the second non-standard feature data matches the standard detection data corresponding to the target standard feature data, and the second non-standard detection data is normalized to the target standard detection data, thereby normalizing the second non-standard detection data.
Based on S401-S404 above, the matching result is obtained by inputting the second non-standard feature data and the target standard feature data into the prediction model, and the second non-standard detection data can be normalized according to the matching result. The prediction model thus enables fast and accurate normalization of detection data and improves the efficiency of determining the standard detection data to which detection data is normalized.
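The flow of S403 and S404 can be sketched as a loop over candidate standard items. This is an assumption-laden sketch: the source says only that the target standard detection data is any one item of the standard detection data, so trying candidates in turn and stopping at the first match is an assumption, as are the names `normalize_detection` and `toy_predict`. `toy_predict` is a trivial stand-in for the trained prediction model.

```python
from typing import Callable, Dict, Optional


def normalize_detection(
    second_nonstandard_feature: str,
    standard_features: Dict[str, str],
    predict: Callable[[str, str], int],
) -> Optional[str]:
    """Return the id of the first standard item the model reports as matching.

    Returns None when no candidate matches.
    """
    for standard_id, target_feature in standard_features.items():
        # S403: obtain the matching result for this candidate.
        if predict(second_nonstandard_feature, target_feature) == 1:
            # S404: normalize to the matching standard detection data.
            return standard_id
    return None


def toy_predict(nonstandard: str, standard: str) -> int:
    """Stand-in for the prediction model: case-insensitive equality."""
    return 1 if nonstandard.lower() == standard.lower() else 0


# Hypothetical standard feature sentences, for illustration only.
standard_features = {"std_1": "glucose. mmol/l", "std_2": "albumin. g/l"}
result = normalize_detection("Albumin. g/L", standard_features, toy_predict)
```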
Further, when the target standard feature sentence is generated by splicing field names and attribute values of fields of the target standard detection data, the embodiment of the present application further provides a specific implementation manner of splicing attribute values of fields of the second non-standard detection data to generate the second non-standard feature sentence, which specifically includes:
forming a third attribute feature pair from the field name of each field in the second non-standard detection data and the corresponding attribute value;
and splicing all the third attribute feature pairs in the second non-standard detection data to generate the second non-standard feature sentence.
If the target standard feature sentence is generated by splicing the field names and attribute values of the fields, the corresponding second non-standard feature sentence is also generated by splicing field names and attribute values, so that the degree of similarity can be compared in the prediction model and a more accurate matching result is obtained.
First, the field name of each field in the second non-standard detection data and the corresponding attribute value form a third attribute feature pair. Taking the second non-standard detection data in Table 1 as an example, the third attribute feature pairs obtained are "AlbG|detection ID", "Albumin|detection name", "Bromocresol green (BCG) method|detection method", "35--53|normal value range", "g/L|unit" and "Drainage liquid|sampling type".
In one possible implementation, the field names may be the corresponding English terms, in which case the third attribute feature pairs are "AlbG|term_id", "Albumin|item_name", "Bromocresol green (BCG) method|method_type", "35--53|scale", "g/L|unit_name" and "Drainage liquid|sample_type".
The third attribute feature pairs are spliced to obtain the second non-standard feature sentence. Taking the above third attribute feature pairs as an example, the second non-standard feature sentence obtained is "AlbG|term_id. Albumin|item_name. Bromocresol green (BCG) method|method_type. 35--53|scale. g/L|unit_name. Drainage liquid|sample_type." The third attribute feature pairs are separated by separators; the embodiment of the present application does not limit the specific separator, which may be a common separator such as a period or a comma.
In the embodiment of the present application, the second non-standard feature sentence obtained by splicing the third attribute feature pairs carries the field name information. Inputting the second non-standard feature data corresponding to this second non-standard feature sentence, together with the target standard feature data, into the prediction model yields a more accurate matching result and therefore a more accurate normalization result for the detection data.
In one possible implementation, the target standard feature data is generated from feature vectors of the target standard feature sentences. Correspondingly, the detection data normalization method further comprises the following steps:
Inputting the second nonstandard feature sentence into the first language model, generating a nonstandard feature vector, and taking the nonstandard feature vector as second nonstandard feature data;
The feature vector of the target standard feature sentence is obtained by inputting the target standard feature sentence into a second language model;
the first language model and the second language model are generated according to the generation methods of the first language model and the second language model.
In order to improve the calculation speed of the prediction model, the target standard feature sentence can be input into the second language model to obtain a corresponding feature vector, and then target standard feature data is generated according to the feature vector of the target standard feature sentence.
In one possible implementation, to increase the matching speed for the second non-standard detection data, the standard feature sentences may be input into the second language model in advance to obtain the feature vector corresponding to each standard feature sentence, and these feature vectors are stored in a feature vector library. After the second non-standard feature data is obtained, the feature vector corresponding to the target standard feature sentence is selected from the feature vector library, used as the target standard feature data, and input into the prediction model to obtain the corresponding matching result.
Correspondingly, the second non-standard feature sentence is input into the first language model to generate the non-standard feature vector corresponding to the second non-standard feature sentence, and the non-standard feature vector is used as the second non-standard feature data to be input into the prediction model.
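The feature vector library can be sketched as a dictionary that is filled once and then only read. The names `build_vector_library` and `counting_lm`, the in-memory dictionary, and the example sentences are assumptions; `counting_lm` is a trivial stand-in for the second language model that also counts how often it is called, to show that each standard sentence is encoded only once.

```python
from typing import Callable, Dict, List

Vector = List[float]


def build_vector_library(
    standard_sentences: Dict[str, str],
    second_lm: Callable[[str], Vector],
) -> Dict[str, Vector]:
    """Encode every standard feature sentence once and store the vectors."""
    return {
        standard_id: second_lm(sentence)
        for standard_id, sentence in standard_sentences.items()
    }


calls = {"n": 0}


def counting_lm(sentence: str) -> Vector:
    """Stand-in encoder: sentence length and number of '|' separators."""
    calls["n"] += 1
    return [float(len(sentence)), float(sentence.count("|"))]


# Hypothetical standard feature sentences, for illustration only.
library = build_vector_library(
    {
        "std_1": "Albumin|item_name. g/L|unit_name",
        "std_2": "Glucose|item_name",
    },
    counting_lm,
)

# Later lookups read the stored vector and do not call the encoder again.
target_vector = library["std_1"]
target_again = library["std_1"]
```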
It should be noted that there are two kinds of second language models corresponding to two kinds of target standard feature sentences, that is, a target standard feature sentence composed of attribute values of respective fields and a target standard feature sentence composed of field names and attribute values of respective fields. The corresponding second language model may be selected according to the kind of the specific target standard feature sentence.
Similarly, there are two kinds of second non-standard feature sentence and two corresponding first language models, and the first language model corresponding to the kind of the second non-standard feature sentence may be selected to generate the non-standard feature vector.
Taking the second non-standard detection data in Table 1 as an example, the second non-standard feature sentence is "AlbG|term_id. Albumin|item_name. Bromocresol green (BCG) method|method_type. 35--53|scale. g/L|unit_name. Drainage liquid|sample_type." The second non-standard feature sentence is input into the first language model, and the non-standard feature vector obtained, -0.2, 0.03, -0.5, 0.06, ..., 0.33, is used as the second non-standard feature data. The standard detection data with LOINC code 77148-5 is selected from the feature vector library as the target standard detection data; its feature vector, used as the target standard feature data, is -0.17, 0.02, -0.55, 0.06, ..., 0.32. The second non-standard feature data and the target standard feature data are input into the prediction model, the matching result obtained is 1, and the second non-standard detection data is normalized to the standard detection data with LOINC code 77148-5.
Based on the foregoing, in the embodiment of the present application, when the target standard feature data is generated from a feature vector, the non-standard feature vector corresponding to the second non-standard feature sentence may be generated by the first language model and used as the second non-standard feature data. The prediction model therefore does not have to spend time computing feature vectors, and the second non-standard detection data can be normalized faster.
Based on the prediction model generating method provided by the method embodiment, the embodiment of the application also provides a prediction model generating device, and the prediction model generating device will be described with reference to the accompanying drawings.
Referring to Fig. 5, the structure of a prediction model generating device according to an embodiment of the present application is shown. As shown in Fig. 5, the prediction model generating device includes:
A first obtaining unit 501, configured to obtain first non-standard detection data, where the first non-standard detection data includes an attribute value of at least one field;
The first splicing unit 502 is configured to splice attribute values of fields of the first nonstandard detection data, generate a first nonstandard feature sentence, and use the first nonstandard feature sentence as first nonstandard feature data;
a second obtaining unit 503, configured to obtain standard detection data, where the standard detection data includes an attribute value of at least one field;
a second splicing unit 504, configured to splice attribute values of fields of the standard detection data to generate a standard feature sentence, and use the standard feature sentence as standard feature data;
The first training unit 505 is configured to train and generate a prediction model by using the first non-standard feature data, the standard feature data, and a label indicating whether the first non-standard feature data matches the standard feature data, where the prediction model is configured to output, when second non-standard feature data and standard feature data are input, a matching result indicating whether the second non-standard feature data matches the standard feature data, so that whether the second non-standard detection data corresponding to the second non-standard feature data can be normalized to the standard detection data corresponding to the standard feature data is determined according to the matching result.
In a possible implementation manner, the first splicing unit 502 is specifically configured to form a first attribute feature pair from the field name of each field in the first non-standard detection data and the corresponding attribute value, and splice each first attribute feature pair in the first non-standard detection data to generate the first non-standard feature sentence;
the second splicing unit 504 is specifically configured to form a second attribute feature pair from the field name of each field in the standard detection data and the corresponding attribute value, and splice each second attribute feature pair in the standard detection data to generate the standard feature sentence.
In one possible implementation, the apparatus further includes:
and the second training unit is used for training the first language model and the second language model by using the first non-standard feature sentence and the standard feature sentence.
In one possible implementation, the apparatus further includes:
the first input unit is used for inputting the first nonstandard feature sentence into the first language model to obtain a feature vector of the first nonstandard feature sentence, and the feature vector of the first nonstandard feature sentence is used as first nonstandard feature data;
And the second input unit is used for inputting the standard feature sentence into the second language model to obtain the feature vector of the standard feature sentence, and taking the feature vector of the standard feature sentence as standard feature data.
In one possible implementation, the first language model, the second language model, and the prediction model form a twin network.
Based on the method for normalizing the detection data provided by the embodiment of the method, the embodiment of the application also provides a device for normalizing the detection data, and the device for normalizing the detection data is described below with reference to the accompanying drawings.
Referring to fig. 6, the structure of a device for normalizing detection data according to an embodiment of the present application is shown. As shown in fig. 6, the apparatus for implementing normalization of detection data includes:
a third obtaining unit 601, configured to obtain second non-standard detection data, where the second non-standard detection data includes an attribute value of at least one field;
a third splicing unit 602, configured to splice the attribute values of the fields of the second non-standard detection data to generate a second non-standard feature sentence, and use the second non-standard feature sentence as second non-standard feature data;
The third input unit 603 is configured to input the second non-standard feature data and target standard feature data into a prediction model, to obtain a matching result of whether the second non-standard feature data and the target standard feature data are matched, wherein the target standard feature data are generated according to a target standard feature sentence, the target standard feature sentence is generated by splicing attribute values of fields of the target standard detection data, and the target standard detection data are any one of standard detection data;
And a normalizing unit 604, configured to normalize the second non-standard detection data corresponding to the second non-standard feature data to the standard detection data corresponding to the target standard feature data if the matching result is that the second non-standard feature data matches the target standard feature data.
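The cooperation of units 601 to 604 can be sketched as a loop over candidate standard detection data. The helper names, the sample records and the token-overlap matcher are assumptions for illustration; in the application the matching result comes from the trained prediction model, not from a Jaccard score.

```python
def values_sentence(record):
    """Stitch the attribute values of all fields into one feature sentence."""
    return " ".join(str(value) for value in record.values())


def jaccard_match(a, b, threshold=0.5):
    """Hypothetical stand-in for the prediction model: token-overlap test."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return bool(union) and len(ta & tb) / len(union) >= threshold


def normalize(nonstandard_record, standard_records, predict_match, to_sentence):
    """Return the first standard record reported as a match, or None."""
    query = to_sentence(nonstandard_record)
    for candidate in standard_records:  # each candidate acts as the target standard data
        if predict_match(query, to_sentence(candidate)):
            return candidate
    return None


# Hypothetical standard detection data and one non-standard record.
standards = [
    {"name": "hemoglobin", "unit": "g/L"},
    {"name": "white blood cell count", "unit": "10^9/L"},
]
result = normalize({"name": "white blood cell", "unit": "10^9/L"},
                   standards, jaccard_match, values_sentence)
print(result)
```

Passing the matcher and the sentence builder as arguments keeps the loop independent of how feature data is produced, so a vector-based prediction model could be substituted without changing `normalize`.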
In one possible implementation, when the target standard feature sentence is generated by stitching the field names and attribute values of the fields of the target standard detection data, the third stitching unit 602 is specifically configured to combine the field name of each field in the second non-standard detection data with its corresponding attribute value to form a third attribute feature pair, and to stitch together all third attribute feature pairs in the second non-standard detection data to generate the second non-standard feature sentence.
In one possible implementation manner, when the target standard feature data is generated according to the feature vector of the target standard feature sentence, the apparatus further includes:
a fourth input unit, configured to input the second non-standard feature sentence into the first language model to generate a non-standard feature vector, and to use the non-standard feature vector as the second non-standard feature data;
where the feature vector of the target standard feature sentence is obtained by inputting the target standard feature sentence into the second language model, and the first language model and the second language model are generated according to the above prediction model generation method.
In addition, the embodiment of the application also provides a prediction model generating device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein when the processor executes the computer program, any implementation mode of the prediction model generating method according to the embodiment is realized. The embodiment of the application also provides equipment for normalizing detection data, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein when the processor executes the computer program, any implementation mode of the method for normalizing detection data according to the embodiment is realized.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing instructions which, when executed on a terminal device, cause the terminal device to perform any implementation of the prediction model generation method according to the above embodiments, or any implementation of the method for normalizing detection data according to the above embodiments.
The standard detection data to which the second non-standard detection data can be normalized is determined according to the matching result output by the prediction model, so that this standard detection data is obtained more quickly by means of the prediction model. Moreover, because the prediction model is generated using the first non-standard feature data, the standard feature data and the corresponding labels, it is relatively accurate, and the standard detection data to which the second non-standard detection data should be normalized can therefore be accurately determined from the matching result.
It should be noted that the embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the system or apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, that only B exists, or that both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of" or a similar expression means any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b or c may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or plural.
It is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

Application Number | Priority Date | Filing Date | Title | Status | Publication
CN202011401732.3A | 2020-12-04 | 2020-12-04 | A method, device and equipment for generating prediction models and normalizing detection data | Active | CN112507658B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011401732.3A, CN112507658B (en) | 2020-12-04 | 2020-12-04 | A method, device and equipment for generating prediction models and normalizing detection data

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011401732.3A, CN112507658B (en) | 2020-12-04 | 2020-12-04 | A method, device and equipment for generating prediction models and normalizing detection data

Publications (2)

Publication Number | Publication Date
CN112507658A (en) | 2021-03-16
CN112507658B (en) | 2025-01-14

Family

ID=74968267

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011401732.3A (Active, CN112507658B (en)) | A method, device and equipment for generating prediction models and normalizing detection data | 2020-12-04 | 2020-12-04

Country Status (1)

Country | Link
CN (1) | CN112507658B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115407686A (en) * | 2022-08-04 | 2022-11-29 | 华能秦煤瑞金发电有限责任公司 | Wisdom operating system of thermal power plant
CN115186650B (en) * | 2022-09-07 | 2022-12-09 | 中国中金财富证券有限公司 | Data detection method and related device
US12360962B1 (en) | 2024-02-23 | 2025-07-15 | Crowdstrike, Inc. | Semantic data determination using a large language model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111737533A (en) * | 2020-06-19 | 2020-10-02 | 东软集团股份有限公司 | Processing method and device for inspection items, storage medium and equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US6754668B2 (en) * | 2000-10-24 | 2004-06-22 | Raytheon Company | Multilingual system having dynamic language selection
CN110188350B (en) * | 2019-05-22 | 2021-06-01 | 北京百度网讯科技有限公司 | Text consistency calculation method and device
CN110851713B (en) * | 2019-11-06 | 2023-05-30 | 腾讯科技(北京)有限公司 | Information processing method, recommending method and related equipment
CN111259127B (en) * | 2020-01-15 | 2022-05-31 | 浙江大学 | Long text answer selection method based on transfer learning sentence vector
CN111859960B (en) * | 2020-07-27 | 2023-08-01 | 中国平安人寿保险股份有限公司 | Semantic matching method, device, computer equipment and medium based on knowledge distillation
CN112017744A (en) * | 2020-09-07 | 2020-12-01 | 平安科技(深圳)有限公司 | Electronic case automatic generation method, device, equipment and storage medium

Also Published As

Publication number | Publication date
CN112507658A (en) | 2021-03-16

Similar Documents

Publication | Title
CN112507658B (en) | A method, device and equipment for generating prediction models and normalizing detection data
CN110335653B (en) | Non-standard medical record analysis method based on openEHR medical record format
CN111292751B (en) | Semantic analysis method and device, voice interaction method and device, and electronic equipment
CN113808758B (en) | Method and device for normalizing check data, electronic equipment and storage medium
CN111368094A (en) | Entity knowledge map establishing method, attribute information acquiring method, outpatient triage method and device
CN111291187B (en) | Emotion analysis method and device, electronic equipment and storage medium
CN113688630A (en) | Text content auditing method and device, computer equipment and storage medium
CN111611775B (en) | Entity identification model generation method, entity identification device and equipment
CN111666766B (en) | Data processing method, device and equipment
CN112069316B (en) | Emotion recognition method and device
CN107301411B (en) | Mathematical formula identification method and device
CN109920473B (en) | General method for analyzing metabonomics marker weight
CN114398492B (en) | Knowledge graph construction method, terminal and medium in digital field
CN110705262A (en) | Improved intelligent error correction method applied to medical skill examination report
Pham et al. | A hybrid approach to Vietnamese word segmentation using part of speech tags
CN113778875B (en) | System test defect classification method, device, equipment and storage medium
CN113221573B (en) | A method, device, computing device and storage medium for entity classification
JP7040155B2 (en) | Information processing equipment, information processing methods and programs
CN114154029A (en) | Sample query method and server based on artificial intelligence and chromatographic analysis
CN117216132B (en) | Mathematical test question similarity judging method, system and application
CN114880471A (en) | Electronic medical record quality evaluation method and system based on text classification algorithm
CN110427330B (en) | Code analysis method and related device
CN108304362B (en) | Clause detection method and device
CN113806529A (en) | Method and device for limiting word pair tagging and computer readable storage medium
CN116187299B (en) | Scientific and technological project text data verification and evaluation method, system and medium

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
