CN111241086A - Data quality improvement method and system based on medical big data - Google Patents

Data quality improvement method and system based on medical big data

Info

Publication number
CN111241086A
Authority
CN
China
Prior art keywords
data
quality improvement
data quality
medical big
verification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010050443.7A
Other languages
Chinese (zh)
Other versions
CN111241086B (en)
Inventor
路杰
姚进文
牛宝童
蒲旭虹
殷利霞
白焕莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gansu Health Statistics Information Center Northwest Population Information Center
Original Assignee
Gansu Health Statistics Information Center Northwest Population Information Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-01-17
Filing date
2020-01-17
Publication date
2020-06-05
Application filed by Gansu Health Statistics Information Center Northwest Population Information Center
Priority to CN202010050443.7A
Publication of CN111241086A
Application granted
Publication of CN111241086B
Legal status: Active

Abstract

The invention belongs to the technical field of data quality control and discloses a data quality improvement method and system based on medical big data. The method computes HIS atomic index values and performs quality management through field-level checks on normative detail data, non-normative detail data, state data, atomic index summaries and the like; computes platform atomic values and finely checks resident personal information and service treatment records using data collected through a public service platform; computes BI atomic index values and performs directional rule verification on the related base tables, guided by the atomic indexes; and writes dynamic SQL execution statements to perform data quality control and statistics based on Hadoop and hash calculation engines. By combining the three paths and checking the data multiple times, the method keeps data quality under control.

Description

Data quality improvement method and system based on medical big data
Technical Field
The invention belongs to the technical field of data quality control, and particularly relates to a data quality improvement method and system based on medical big data.
Background
Currently, the closest prior art is as follows: as society develops, requirements on the quality and accuracy of medical data keep rising. With existing big data technology, conventional software tools cannot manage data quality within an acceptable time frame, and data quality is uneven.
In summary, the problems of the prior art are as follows: existing medical data are complex in type and low in quality. Hospitals differ in level, which makes checking the data difficult and the checking time excessively long.
The difficulty of solving these technical problems is as follows: because data types are complex and hospitals are numerous, the types of uploaded data are not uniform.
The data uploaded by different hospitals differ, so many errors arise during verification of the uploaded data and data quality is low.
Different scoring standards must be defined according to hospital level, and the standards must be customized to hospital services.
The data uploaded by hospitals differ widely, so verification takes a long time.
The significance of solving these technical problems is as follows: data standards are defined, and the data uploaded by hospitals are mapped between standards, so that the data uploaded by all hospitals are unified and standardized for display in an electronic medical record system.
A verification report is provided to help each hospital correct the erroneous relationships listed in it, improving data quality.
Different verification rules and scoring rules are defined according to hospital level, so that hospitals are scored according to their level.
An upload standard is defined and a standard conversion is performed before data acquisition, which standardizes the data, reduces conversion during verification, and thereby speeds up verification and shortens verification time.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a data quality improvement method based on medical big data.
The invention is realized as follows. A data quality improvement method based on medical big data comprises the following steps:
adopting the PDLMV data cleaning framework theory to carry out multi-level data verification, and displaying the results comprehensively through aggregated result data such as a data exchange report, a data verification report and a special subject report.
Further, the data quality improvement method based on medical big data comprises the following steps:
step one, calculating based on HIS atomic index values, and performing quality management through field-level checks on normative detail data, non-normative detail data, state data, atomic index summaries and the like;
step two, calculating based on platform atomic values, and finely checking resident personal information and service treatment records using data collected through a public service platform;
step three, calculating based on BI atomic index values, and performing directional rule verification on the related base tables, guided by the atomic indexes;
step four, writing dynamic SQL execution statements, and performing data quality control and statistics based on Hadoop and hash calculation engines.
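The field-level checks in step one can be illustrated with a minimal sketch. The field names, code values and patterns below are hypothetical and are not taken from the patent; the sketch only shows how records might be partitioned into normative and non-normative detail data.

```python
import re
from datetime import date

# Hypothetical field-level rules; field names, code sets and patterns are
# illustrative assumptions, not rules defined by the patent.
RULES = {
    "patient_id": lambda v: isinstance(v, str) and re.fullmatch(r"\d{18}", v) is not None,
    "gender_code": lambda v: v in {"1", "2", "9"},
    "visit_date": lambda v: isinstance(v, date) and v <= date.today(),
}

def check_record(record):
    """Return the names of fields that are missing or fail their rule."""
    failed = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None or not rule(value):
            failed.append(field)
    return failed

def split_records(records):
    """Partition records into normative and non-normative detail data."""
    normative, non_normative = [], []
    for rec in records:
        failed = check_record(rec)
        if failed:
            # Keep the failing field names so a report can cite them
            non_normative.append((rec, failed))
        else:
            normative.append(rec)
    return normative, non_normative
```

Keeping the failing field names next to each non-normative record is one way to feed a later verification report.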
Another object of the present invention is to provide a data quality improvement system based on medical big data that implements the above data quality improvement method, the system comprising:
a data verification module: used for performing three-path-in-one, multi-level data verification with the PDLMV data cleaning framework;
a data exchange module: used for exchanging data with the ETL middleware KETTLE;
an analysis module: used for tracking and analyzing production logs and system logs with Hadoop, hash and other analysis frameworks;
a display module: used for comprehensively displaying the various aggregated result data through a data exchange report, a data verification report and a special subject report;
a data quality control module: used for fully displaying data quality and data verification problems in terms of the consistency, integrity, normalization and timeliness of the data.
It is another object of the present invention to provide a computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface for implementing said method for improving data quality based on medical big data when executed on an electronic device.
Another object of the present invention is to provide a computer-readable storage medium, comprising instructions, which when executed on a computer, cause the computer to execute the method for improving data quality based on medical big data.
In summary, the advantages and positive effects of the invention are as follows. The method combines three paths and checks the data multiple times, thereby controlling data quality. To address current problems in medical data quality, such as complex data types and low data quality, the invention adopts the PDLMV data cleaning framework theory to realize multi-level data verification, and displays the results comprehensively through aggregated result data such as a data exchange report, a data verification report and a special subject report. The invention controls data quality through advanced theory and core check rules. The invention also supports user-defined path templates for searching and relationship maintenance, multi-stage aggregation through data marts, and, using Solr, columnar storage with fast search and storage by map (key, value).
Drawings
Fig. 1 is a flow chart of a data quality improvement method based on medical big data provided by an embodiment of the invention.
Fig. 2 is a schematic structural diagram of a data quality improvement system based on medical big data according to an embodiment of the present invention.
In the figure: 1. data verification module; 2. data exchange module; 3. analysis module; 4. display module; 5. data quality control module.
Fig. 3 is a schematic diagram of a data quality improvement method based on medical big data according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a data quality improvement system based on medical big data provided by an embodiment of the invention.
Fig. 5 is a data interface diagram for data quality control monitoring provided by an embodiment of the present invention.
Fig. 6 is a diagram of a scheduling interface of a data quality control program according to an embodiment of the present invention.
Fig. 7 is a diagram of a data verification script execution code interface provided by an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The existing medical data are complex in type and low in data quality.
In order to solve the problems in the prior art, the present invention provides a method and a system for improving data quality based on medical big data, and the present invention is described in detail below with reference to the accompanying drawings.
The data quality improvement method based on the medical big data provided by the embodiment of the invention comprises the following steps:
adopting the PDLMV data cleaning framework theory to carry out multi-level data verification, and displaying the results comprehensively through aggregated result data such as a data exchange report, a data verification report and a special subject report.
As shown in Fig. 1, the data quality improvement method based on medical big data provided by the embodiment of the invention comprises the following steps:
S101, calculating based on HIS atomic index values, and performing quality management through field-level checks on normative detail data, non-normative detail data, state data, atomic index summaries and the like.
S102, calculating based on platform atomic values, and finely checking resident personal information and service treatment records using data collected through the public service platform.
S103, calculating based on BI atomic index values, and performing directional rule verification on the related base tables, guided by the atomic indexes.
S104, writing dynamic SQL execution statements, and performing data quality control and statistics based on Hadoop and hash calculation engines.
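The dynamic SQL of S104 can be sketched as follows. This is an illustration only: the rule table, table names and predicates are hypothetical, and an in-memory SQLite database stands in for the big data calculation engine so that the example is runnable.

```python
import sqlite3

# Hypothetical rule configuration: each rule names a table and a SQL predicate
# that a valid row must satisfy. Identifiers are assumed to come from trusted
# configuration, not from user input.
RULES = [
    {"rule_id": "R001", "table": "visit", "predicate": "patient_id IS NOT NULL"},
    {"rule_id": "R002", "table": "visit", "predicate": "fee >= 0"},
]

def build_check_sql(rule):
    """Build one statement that counts all rows and the rows violating the rule."""
    # A predicate that evaluates to NULL falls into the ELSE branch and is
    # therefore counted as a failure.
    return (
        f"SELECT '{rule['rule_id']}' AS rule_id, COUNT(*) AS total, "
        f"SUM(CASE WHEN {rule['predicate']} THEN 0 ELSE 1 END) AS failed "
        f"FROM {rule['table']}"
    )

def run_checks(conn, rules):
    """Execute every generated statement and collect (rule_id, total, failed)."""
    return [conn.execute(build_check_sql(rule)).fetchone() for rule in rules]

def demo():
    """Run the two hypothetical rules against three sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visit (patient_id TEXT, fee REAL)")
    conn.executemany(
        "INSERT INTO visit VALUES (?, ?)",
        [("p1", 10.0), (None, 5.0), ("p3", -1.0)],
    )
    return run_checks(conn, RULES)
```

Generating the statements from a rule table means a new check can be added by configuration rather than by writing a new script, which fits the centralized rule management described in the embodiment.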
As shown in fig. 2, a data quality improvement system based on medical big data provided by an embodiment of the invention includes:
Data verification module 1: used for performing three-path-in-one, multi-level data verification with the PDLMV data cleaning framework.
Data exchange module 2: used for exchanging data with the ETL middleware KETTLE.
Analysis module 3: used for tracking and analyzing production logs and system logs with Hadoop, hash and other analysis frameworks.
Display module 4: used for comprehensively displaying the various aggregated result data through a data exchange report, a data verification report and a special subject report.
Data quality control module 5: used for fully displaying data quality and data verification problems in terms of the consistency, integrity, normalization and timeliness of the data.
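The four dimensions named for the data quality control module can be sketched as per-dimension pass rates. The patent does not define how each dimension is measured, so the field names, the three-day upload threshold and the individual checks below are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical required fields and timeliness threshold (assumptions)
REQUIRED = ("patient_id", "admit_date", "discharge_date", "upload_date")
MAX_UPLOAD_DELAY_DAYS = 3
DIMENSIONS = ("integrity", "normalization", "consistency", "timeliness")

def dimension_flags(rec):
    """Evaluate one record against the four quality dimensions."""
    integrity = all(rec.get(f) is not None for f in REQUIRED)
    if not integrity:
        # The other dimensions cannot be evaluated on an incomplete record
        return {dim: False for dim in DIMENSIONS}
    delay = (rec["upload_date"] - rec["discharge_date"]).days
    return {
        "integrity": True,
        # Normalization: the identifier follows the expected format
        "normalization": isinstance(rec["patient_id"], str) and rec["patient_id"].isdigit(),
        # Consistency: related fields do not contradict each other
        "consistency": rec["admit_date"] <= rec["discharge_date"],
        # Timeliness: the record was uploaded soon enough after discharge
        "timeliness": 0 <= delay <= MAX_UPLOAD_DELAY_DAYS,
    }

def pass_rates(records):
    """Return the share of records passing each dimension."""
    totals = {dim: 0 for dim in DIMENSIONS}
    for rec in records:
        for dim, ok in dimension_flags(rec).items():
            totals[dim] += ok
    n = len(records)
    return {dim: (count / n if n else 0.0) for dim, count in totals.items()}
```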
Fig. 3 is a schematic diagram of a data quality improvement method based on medical big data according to an embodiment of the present invention.
Data are uploaded from the hospital business library to a front-end library by means such as summarization, preserving the original data. The front-end library uploads the data to the big data platform using ESB + ETL for verification, where a second summarization is carried out. DataX then distributes the summarized data and the verification data to each subject repository for use by the application platforms. A third summarization is carried out on the statistical tables uploaded by the hospital. Based on the third summarization, a first three-path comparison is carried out, and a report is compiled from the comparison results.
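The three-path comparison can be sketched as a comparison of one aggregate per index across the three summarizations. The index names and the tolerance parameter are hypothetical; the patent does not specify how a mismatch is defined.

```python
def three_way_compare(first, second, third, tolerance=0):
    """Compare per-index aggregates from three summarization paths.

    Each argument maps an index name to its aggregated value. An index is
    consistent only if all three paths report it and the values differ by
    no more than `tolerance` (an assumed parameter).
    """
    report = []
    for index in sorted(set(first) | set(second) | set(third)):
        values = (first.get(index), second.get(index), third.get(index))
        present = [v for v in values if v is not None]
        consistent = len(present) == 3 and max(present) - min(present) <= tolerance
        report.append({
            "index": index,
            "first": values[0],
            "second": values[1],
            "third": values[2],
            "consistent": consistent,
        })
    return report
```

An index missing from any path is reported as inconsistent rather than skipped, so gaps in an upload surface in the synthesized report.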
Fig. 4 is a schematic diagram of a data quality improvement system based on medical big data provided by an embodiment of the invention.
Data are accumulated in a medical data knowledge base, and the quality control center forms quality control rules and measurable quality control rules. Using these rules, the medical data stream is verified on a Spark data calculation engine, producing a problem report with data lineage. Operation and maintenance personnel select the data with the most prominent problems and, in order of report priority, trace each problem back to the vendor of the data source. After the vendor corrects the problem, the data are supplemented or retransmitted. The quality control platform uniformly schedules the retransmitted or supplemented data over the data bus for a second verification, forms a second verification report, and generates the final quality control scoring result from that report.
The technical solution of the present invention is further described with reference to the following specific embodiments.
Example:
The data quality improvement method based on medical big data provided by the embodiment of the invention comprises the following steps:
(1) Based on the national medical standards and the localized medical standards of Gansu Province, measurement standards are sorted out, including HIS atomic index values, field normative rules, business association rules and data consistency verification rules. Quality control weights are assigned according to the priority of medical service relationships, forming a trackable and measurable quality control scoring standard.
(2) The quality control rules are managed centrally and adjusted and configured in a unified way; the quality control rules for hospitals of different levels and the data verification levels for every link in the medical data chain are determined.
(3) Data verification is carried out based on the quality control rules, the verification results are synchronized to a quality control scoring rule table, and hospital data quality is scored.
(4) Quality control verification continuously tests the data structure and detects abnormal content, so that the process is controllable and problems are traceable.
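The weighted scoring in steps (1) to (3) can be sketched as a weighted average of per-rule pass rates. The rule identifiers, hospital levels and weights below are hypothetical; the patent states only that weights follow medical service relationship priority and that rules differ by hospital level.

```python
# Hypothetical per-level weight tables (assumptions, not values from the patent)
WEIGHTS_BY_LEVEL = {
    "tertiary": {"R1": 3, "R2": 1},
    "secondary": {"R1": 1, "R2": 1},
}

def quality_score(results, weights):
    """Compute a 0-100 score as the weighted average of rule pass rates.

    results: rule_id -> (passed_rows, total_rows)
    weights: rule_id -> non-negative weight
    """
    weight_sum = sum(weights[rule_id] for rule_id in results)
    if weight_sum == 0:
        return 0.0
    weighted = 0.0
    for rule_id, (passed, total) in results.items():
        rate = passed / total if total else 0.0
        weighted += weights[rule_id] * rate
    return round(100 * weighted / weight_sum, 2)
```

With the same verification results, a hospital is scored differently depending on the weight table chosen for its level, for example `quality_score(results, WEIGHTS_BY_LEVEL["tertiary"])`.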
The invention is further described below in connection with specific experiments.
The data quality control monitoring data interface is shown in Fig. 5.
The data quality control program scheduling interface is shown in Fig. 6.
The data verification script execution code interface is shown in Fig. 7.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (5)

Application number: CN202010050443.7A; priority date: 2020-01-17; filing date: 2020-01-17; title: Data quality improvement method and system based on medical big data; status: Active; publication: CN111241086B (en)

Priority Applications (1)

Application number: CN202010050443.7A (CN111241086B, en); priority date: 2020-01-17; filing date: 2020-01-17; title: Data quality improvement method and system based on medical big data

Applications Claiming Priority (1)

Application number: CN202010050443.7A (CN111241086B, en); priority date: 2020-01-17; filing date: 2020-01-17; title: Data quality improvement method and system based on medical big data

Publications (2)

Publication number: CN111241086A (en); publication date: 2020-06-05
Publication number: CN111241086B (en); publication date: 2021-08-31

Family

ID=70878386

Family Applications (1)

Application number: CN202010050443.7A; status: Active; publication: CN111241086B (en); priority date: 2020-01-17; filing date: 2020-01-17; title: Data quality improvement method and system based on medical big data

Country Status (1)

Country: CN (1); link: CN111241086B (en)


Legal Events

Code: PB01; title: Publication
Code: SE01; title: Entry into force of request for substantive examination
Code: GR01; title: Patent grant
