CN111192689A

Info

Publication number: CN111192689A
Application number: CN201811361095.4A
Authority: CN
Inventors: 罗立刚; 张旸; 陈超; 王飞; 周剑
Original assignee: Linkdoc Technology Beijing Co ltd
Current assignee: Linkdoc Technology Beijing Co ltd
Priority date: 2018-11-15
Filing date: 2018-11-15
Publication date: 2020-05-22
Anticipated expiration: 2038-11-15
Also published as: CN111192689B

Abstract

The invention provides a patient identification method based on medical data, comprising the following steps: A. extracting point locations from each piece of medical record data, and determining whether the extracted point locations are the same; B. classifying together the medical record data that have the same point locations; C. assigning the medical record data classified in step B to the same patient. The technical scheme directly determines whether point locations are the same and initially groups medical records with the same point locations into one class, so that the different medical records belonging to the same patient are finally identified. The prior art calculates a similarity between point locations, which can lead to probabilistic misclassification. This method instead judges on exact equality of point locations, which can improve the accuracy of patient identification and eliminates the defects of the prior art to a certain extent.

Description

Patient identification method based on medical data

Technical Field

The invention relates to the technical field of medical big data, in particular to a patient identification method based on medical data.

Background

In medical data processing, the patient is the basic unit of medical analysis. In most data processing scenarios, however, the association between a patient and that patient's multiple pieces of data is inaccurate.

As shown in FIG. 6, the conventional method is as follows: point locations from the same data source are extracted from different medical record data and their similarity is evaluated; if the similarity exceeds a threshold, the records are judged to belong to the same person, and otherwise to two different persons. Extracting these point locations includes extracting information such as names, identity card numbers, mobile phone numbers, discharge dates, and birthdays from the different medical record data.

However, this method is prone to identification errors and cannot split data that has been wrongly associated. The results of medical analysis based on such inaccurate data will therefore also be inaccurate, and using those results in medical treatment poses a potential risk.

Disclosure of Invention

The invention mainly aims to provide a patient identification method based on medical data, which comprises the following steps:

A. extracting point locations from each piece of medical record data, and determining whether the extracted point locations are the same;

B. associating the medical record data that have the same point locations;

C. classifying the medical record data associated in step B as medical record data of the same patient.

The technical scheme directly determines whether point locations are the same and initially groups medical records with the same point locations into one class, so that the different medical records belonging to the same patient are finally identified. The prior art calculates a similarity between point locations, which can lead to probabilistic misclassification. This method instead judges on exact equality of point locations, which can improve the accuracy of patient identification and eliminates the defects of the prior art to a certain extent.

Step B comprises:

setting different weights for each point location;

and, if at least one high-weight point location is the same in different medical record data, associating those medical record data.

In this way, judging on exact equality of point locations can improve the accuracy of patient identification and eliminates the defects of the prior art to a certain extent.

Step B comprises:

setting different weights for each point location;

and, if at least two low-weight point locations are the same in different medical record data, associating those medical record data.

Thus, when high-weight point locations are missing, the accuracy of patient identification can be maintained by matching multiple low-weight point locations.

Step B comprises:

setting different weights for each point location;

and, if only one low-weight point location is the same in different medical record data, issuing a prompt for manual screening and associating the medical record data after confirmation.

Thus, when only a single low-weight point location matches, misidentification is avoided by prompting for manual screening.
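The three variants of step B above can be read together as one decision rule. The following is a minimal sketch under stated assumptions, not the patent's implementation: the split of point locations into `HIGH_WEIGHT` and `LOW_WEIGHT` sets, the field names, and the `Decision` enum are all hypothetical, since the source does not fix them.

```python
from enum import Enum


class Decision(Enum):
    ASSOCIATE = "associate"
    MANUAL_REVIEW = "manual_review"
    NO_MATCH = "no_match"


# Hypothetical split of point locations by weight; the source does not
# specify which point locations count as high or low weight.
HIGH_WEIGHT = {"id_number"}
LOW_WEIGHT = {"name", "birthday", "mobile_number", "hospital_number"}


def decide(record_a: dict, record_b: dict) -> Decision:
    """Apply the three step-B rules to a pair of medical records."""
    # A point location counts as "the same" only if it is present in both
    # records and the values are exactly equal.
    same = {
        field
        for field in HIGH_WEIGHT | LOW_WEIGHT
        if record_a.get(field) is not None
        and record_a.get(field) == record_b.get(field)
    }
    if same & HIGH_WEIGHT:
        return Decision.ASSOCIATE        # at least one high-weight match
    low_hits = len(same & LOW_WEIGHT)
    if low_hits >= 2:
        return Decision.ASSOCIATE        # at least two low-weight matches
    if low_hits == 1:
        return Decision.MANUAL_REVIEW    # single low-weight match: prompt
    return Decision.NO_MATCH
```

A pair that yields `Decision.MANUAL_REVIEW` would be associated only after a human confirms it, as the third variant requires.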

Step C further comprises:

splitting the medical record data that cannot be associated in step B back into the medical record data of step A.

Thus, when medical record data cannot be classified, it is split out again, which likewise preserves the accuracy of patient identification.

The point locations include at least one of: name, identification number, mobile phone number, birthday, sex, blood type, hospital number, department, date of admission, and date of discharge.

Thus, whether medical record data are the same is judged on the basis of different kinds of information.

The method further comprises a step D: identifying the identity of the patient according to the point locations.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is a schematic diagram of point location determination for different medical records;

FIG. 3 is a diagram illustrating the initial classification of medical records having the same point location;

FIG. 4 is a corresponding diagram of associated medical records and a patient;

FIG. 5-1 is a schematic diagram illustrating point location extraction performed synchronously on different medical record data and determining whether the point locations are the same;

FIG. 5-2 is a schematic diagram of the initial classification of medical records having the same point location;

FIG. 5-3 is a schematic diagram of checking the initial classification;

FIG. 5-4 is a schematic diagram of the patient to which each medical record is assigned after checking;

FIG. 6 is a flow chart of the prior art.

Detailed Description

The method for identifying a patient based on medical data according to the present invention will be described in detail with reference to FIGS. 1 to 5.

As shown in fig. 1, the method comprises the following steps:

S100: Synchronously extract point locations from different medical record data.

Compared with the prior art, in which point locations are extracted from medical record data in batches, this embodiment extracts point locations from different medical record data synchronously.

As shown in FIG. 2, the extracted point locations include, but are not limited to, names, identification numbers, mobile phone numbers, and birthdays. Point locations are extracted from four medical records (the first to fourth medical records in FIG. 2) simultaneously. The point locations may also include a hospital number, hospital, department, date of admission, date of discharge, and so on, which are not described in detail here.

S200: Determine whether the point locations extracted in step S100 are the same.

Unlike the similarity-based judgment of the prior art, this embodiment identifies patients based on whether point locations are the same.

Referring to FIG. 2, the identification numbers and mobile phone numbers in the first and second medical records are the same; the names and birthdays in the second and third medical records are the same; and the names, identification numbers, and birthdays in the third and fourth medical records are the same.
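The exact-equality test of step S200 can be sketched as a comparison of two records field by field. This is an illustrative sketch only: the field names, the `shared_points` helper, and the sample values are hypothetical and are not taken from FIG. 2. It also assumes that a point location missing from either record never counts as the same.

```python
# Hypothetical names for the point locations mentioned in the text.
POINT_FIELDS = ("name", "id_number", "mobile_number", "birthday")


def shared_points(record_a: dict, record_b: dict) -> set:
    """Return the point locations whose values are exactly equal in both records."""
    return {
        field
        for field in POINT_FIELDS
        if record_a.get(field) is not None
        and record_a.get(field) == record_b.get(field)
    }


# Made-up sample records: the first has no name or birthday recorded.
first = {"id_number": "ID-001", "mobile_number": "555-0100"}
second = {
    "name": "Patient X",
    "id_number": "ID-001",
    "mobile_number": "555-0100",
    "birthday": "1980-01-01",
}

print(shared_points(first, second))
```

No similarity score or threshold is involved: a point location either matches exactly or does not match at all.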

S300: Perform an initial classification of all medical records that have the same point locations.

Referring to FIGS. 2 and 3, at least two identical point locations link the four medical records. This can serve as the basis for the initial classification: assuming that the first, second, and third medical records belong to the first patient, the fourth medical record shares point locations with the third medical record, so the four medical records are associated. Through this association, the first to fourth medical records point to the same patient; that is, the fourth medical record is also identified as belonging to the first patient in the initial classification.

Further, the initial classification establishes a preliminary correspondence between each medical record and a patient. As shown in FIG. 4, when two medical records have the same point location, they are considered related, i.e., they may belong to the same patient. In FIG. 4, the first to third medical records, which share point locations, are grouped as the first patient, and the fourth to seventh medical records, which share point locations, are grouped as the second patient.
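Because association is transitive here (the fourth record joins the first patient through the third record), the initial classification behaves like finding connected groups of records. One common way to do this is a union-find (disjoint-set) structure. The source does not name a data structure, so this is a swapped-in technique offered as a sketch. The record numbers follow FIG. 4, but the specific matching pairs are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure over hashable items."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, item):
        # Walk up to the root, halving the path as we go.
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def initial_classes(record_ids, same_point_pairs):
    """Group records that are linked, directly or indirectly, by same point locations."""
    uf = UnionFind(record_ids)
    for a, b in same_point_pairs:
        uf.union(a, b)
    groups = defaultdict(list)
    for record_id in record_ids:
        groups[uf.find(record_id)].append(record_id)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical pairs of records with the same point locations.
pairs = [(1, 2), (2, 3), (4, 5), (5, 6), (6, 7)]
classes = initial_classes(range(1, 8), pairs)
print(classes)  # [[1, 2, 3], [4, 5, 6, 7]]
```

Each resulting group is only a preliminary patient; it is still subject to the check in step S400.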

S400: Check the initial classification. If it passes the check, proceed to step S500; otherwise, split the initial classification back into its original state and return to step S200.

Different weights are set for each point location or combination of point locations. For example, a combination such as name + birthday is relatively likely to coincide between different people, so it is given a lower weight; likewise, the combination hospital number + hospital + department is relatively likely to coincide, so it is also given a lower weight. The combination of identity card number + mobile phone number is unique, so it is given the highest weight. In practice, many combinations of identical point locations are possible; they are not listed exhaustively here.
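A weight table keyed by combinations of point locations is one way to represent this. The sketch below is an assumption-laden illustration: the numeric weights, the field names, and the `best_matching_weight` helper are hypothetical, and the source gives only the relative ordering (identity card number + mobile phone number highest; name + birthday and hospital number + hospital + department lower).

```python
# Hypothetical weight table: a larger number means stronger evidence that
# two records belong to the same patient. Only the ordering comes from the text.
COMBINATION_WEIGHTS = {
    frozenset({"id_number", "mobile_number"}): 3,               # unique: highest
    frozenset({"name", "birthday"}): 1,                         # coincidences likely
    frozenset({"hospital_number", "hospital", "department"}): 1,
}


def best_matching_weight(same_points: set) -> int:
    """Return the highest weight among combinations fully contained in same_points.

    Returns 0 when no listed combination is fully matched.
    """
    return max(
        (
            weight
            for combination, weight in COMBINATION_WEIGHTS.items()
            if combination <= same_points  # subset test
        ),
        default=0,
    )
```

For example, records that agree on `id_number`, `mobile_number`, and `name` would score 3 under this table, while records that agree only on `name` would score 0.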

When two medical records share a high-weight combination, they are considered to belong to the same patient. For example, the first and second medical records in FIG. 2 agree on the highest-weight combination of identification number + mobile phone number, so the two medical records are identified as the same patient. The second and third medical records have the same name and date of birth, but because the weight of this combination is low, the two medical records may not belong to the same patient even though these point locations are the same.

In addition, because a mobile phone number may be reassigned to patient B after patient A cancels it, the weight of the mobile phone number may be set lower than that of the identity card number, or lower than that of the name and the birthday. When the mobile phone numbers are the same and the identity card number is absent, the matching condition can be that at least two point locations, including the mobile phone number, are the same, so that multiple medical records can still be accurately associated with the patient when the identity card number is missing. The number of point locations that must match is not limited here.
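The fallback for a missing identity card number can be sketched as follows. This is a hedged illustration rather than the patented logic: the function name, the field list, and the reading of "absent" as "missing from at least one of the two records" are assumptions. The `min_points` parameter reflects the statement that the required number of matching points is not fixed.

```python
# Hypothetical list of point locations considered by the fallback.
FALLBACK_FIELDS = ("mobile_number", "name", "birthday", "hospital_number")


def phone_rule_matches(record_a: dict, record_b: dict, min_points: int = 2) -> bool:
    """Fallback match when the identity card number is absent.

    Requires the mobile number plus enough other point locations to be
    exactly equal, so a reassigned phone number alone never links two records.
    """
    both_have_id = (
        record_a.get("id_number") is not None
        and record_b.get("id_number") is not None
    )
    if both_have_id:
        return False  # ID numbers are available, so this fallback does not apply
    same = [
        field
        for field in FALLBACK_FIELDS
        if record_a.get(field) is not None
        and record_a.get(field) == record_b.get(field)
    ]
    return "mobile_number" in same and len(same) >= min_points
```

Raising `min_points` makes the fallback stricter at the cost of leaving more records unassociated.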

An example is described with reference to FIG. 4. Suppose an eighth medical record exists, and the eighth medical record and the third medical record share the highest-weight combination of identity card number + mobile phone number. If, at the same time, the eighth medical record matches the fifth medical record on the low-weight combination name + birthday + hospital + department, the first patient and the second patient may actually be the same patient.

In this situation, if only high-weight matches are used, the low-weight link between the eighth and fifth medical records is ignored; the records are then taken to belong to two different patients A and B, and the medical records corresponding to patient A include the first, third, and eighth medical records. If low-weight matches are also used, the first patient and the second patient are considered to be the same patient.

In FIG. 5, A to F represent six different medical records. The number 1 indicates that the point locations are the same, and the number 0 indicates that they are different. The dashed box between medical record C and medical record E represents a low-weight match; every other match between medical records is a high-weight match.

FIGS. 5-1 and 5-2 present the initial classification of the medical records as a table and a block diagram, respectively.

FIG. 5-3 shows that, according to the high-weight matching relationships, medical records A to D are classified as patient A and medical records E and F as patient B.

A low-weight match exists between medical record C and medical record E, with no other basis for reference. A prompt is issued in this situation so that manual screening can be carried out.
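The FIG. 5 walkthrough can be sketched as grouping by high-weight matches only, then flagging any low-weight match that bridges two groups. The figure's actual table is not reproduced in the text, so the specific high-weight pairs below are hypothetical; only the C to E low-weight match and the resulting grouping come from the description. The function name and the traversal approach are likewise assumptions.

```python
def group_and_flag(records, high_pairs, low_pairs):
    """Group records by high-weight matches; flag low-weight links across groups."""
    neighbours = {record: set() for record in records}
    for a, b in high_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    group_of = {}   # record -> index of its group
    groups = []
    for start in records:
        if start in group_of:
            continue
        # Depth-first traversal that follows high-weight matches only.
        group_of[start] = len(groups)
        stack, members = [start], []
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in neighbours[node]:
                if nxt not in group_of:
                    group_of[nxt] = len(groups)
                    stack.append(nxt)
        groups.append(sorted(members))

    # Low-weight matches joining two different groups need manual screening.
    review = [(a, b) for a, b in low_pairs if group_of[a] != group_of[b]]
    return groups, review


records = ["A", "B", "C", "D", "E", "F"]
high = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]  # hypothetical pairs
low = [("C", "E")]                                       # the dashed-box match
groups, review = group_and_flag(records, high, low)
print(groups)  # [['A', 'B', 'C', 'D'], ['E', 'F']]
print(review)  # [('C', 'E')]
```

The two groups correspond to patient A and patient B, and the flagged pair is what a reviewer would be asked to confirm or reject.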

S500: Classify the medical records that passed the check in step S400 as belonging to the same patient.

FIG. 5-4 shows the result determined after manual screening, namely that patient A and patient B are the same person. The patient's identity can then be recognized from the point locations.

Unlike the prior art, which calculates the similarity between point locations, the technical scheme of this application directly determines whether the point locations are the same. Medical records with the same point locations are initially grouped into the same class, and by checking these initial classes, the different medical records belonging to the same patient are finally identified.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A method for identifying a patient based on medical data, comprising the steps of:

B. associating the medical record data with the same point location;

2. The method of claim 1, wherein step B comprises:

setting different weights for each point;

3. The method of claim 1, wherein step B comprises:

setting different weights for each point;

4. The method of claim 1, wherein step B comprises:

setting different weights for each point;

5. The method of claim 1, wherein step C further comprises:

6. The method of any one of claims 1 to 5, wherein the point locations comprise at least one of: name, identification number, mobile phone number, birthday, sex, blood type, hospital number, department, date of admission and date of discharge.

7. The method of claim 1, further comprising a step D of identifying the identity of the patient from the point location.