
Data quality guarantee method under big data scene

Info

Publication number
CN117591508B
Authority
CN
China
Prior art keywords
data
monitoring
report
data table
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410078740.0A
Other languages
Chinese (zh)
Other versions
CN117591508A (en)
Inventor
卢辉
杨小茹
李齐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunzhu Information Technology Chengdu Co ltd
Original Assignee
Yunzhu Information Technology Chengdu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunzhu Information Technology Chengdu Co ltd
Priority to CN202410078740.0A
Publication of CN117591508A
Application granted
Publication of CN117591508B
Legal status: Active (Current)
Anticipated expiration

Abstract

The invention discloses a data quality assurance method for big data scenarios. The method comprises: performing quality evaluation on data in the pre-online stage based on field information and data volume; performing push verification on data in the pre-online stage based on the total data volumes before and after pushing; analyzing and repairing core data in the online stage based on automated testing of report interfaces; and monitoring and analyzing data in the post-online stage. Taking the data life cycle as a timeline, the invention effectively associates multiple systems and provides a set of data quality assurance methods for different stages and different aspects. Through data monitoring in the pre-online, online and post-online stages respectively, the invention achieves data quality assurance with advance prevention, rapid in-process handling and after-the-fact remediation.

Description

Data quality guarantee method under big data scene
Technical Field
The invention relates to the technical field of data guarantee, in particular to a data quality guarantee method under a big data scene.
Background
Data assurance means in the current big data field are single-purpose and fragmented: monitoring usually relies on manual comparison and checking or on isolated monitoring scripts, and the workload of the test engineer responsible for configuring monitoring tasks is huge, so data quality assurance cannot be carried out systematically as a whole.
In general, a big data team has a metadata management system, a data lineage system and other systems, assisted by various monitoring tools, and a test engineer can write automated test cases, so that data quality is ensured by switching among the systems. However, the existing systems cannot be effectively associated at the business level, and a complete data assurance method cannot be formed.
Disclosure of Invention
Aiming at the technical problems described above, the invention provides a data quality guarantee method for big data scenarios that runs through the entire data life cycle and completes all data quality guarantee work in a one-stop manner.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
The data quality guarantee method in a big data scenario comprises the following steps:
step 1, perform quality evaluation on data in the pre-online stage based on field information and data volume;
step 2, perform push verification on data in the pre-online stage based on the total data volumes before and after pushing;
step 3, analyze and repair core data in the online stage based on automated testing of report interfaces;
and step 4, monitor and analyze data in the post-online stage.
Further, the method also comprises: acquiring all database information in the data warehouse and all data table information in each database before step 1.
Further, step 1 comprises: step 11, acquiring any data table of a database using a model analysis tool, and counting the names and types of all fields of the current data table; step 12, looping over all fields of the current data table and counting field information: the null value count and null value rate for fields of all types, the de-duplicated (distinct) count for character-type fields, and the maximum value, minimum value and value range for numeric-type fields; step 13, counting the data volume of the current data table; step 14, looping over all data tables of the database, repeating steps 11-13, and writing the statistical results of steps 11-13 into an Excel file, each worksheet of which stores the statistics of one data table; and step 15, performing quality evaluation on the data in the database according to the statistical results.
Further, step 2 comprises: step 21, using a data comparison tool to judge whether the total data volume of the source data table in the data warehouse before pushing is consistent with that of the target data table after pushing; and step 22, if the total data volumes of the source and target data tables are consistent, randomly selecting N rows from the source data table (N being a positive integer) and comparing them field by field with the corresponding rows in the target data table; if every field comparison matches, push verification succeeds.
Further, the following monitoring methods are adopted for monitoring and analysis in step 4: method one, enumeration value monitoring, which analyzes and repairs the data tables of the original data layer in the data warehouse; method two, interface result monitoring, which analyzes and repairs the reports generated by the report interfaces; method three, layered data monitoring, which monitors data extraction into the original data layer of the data warehouse and the scheduling tasks of the data detail layer; method four, incremental monitoring, which monitors data changes in the original data layer of the data warehouse; method five, global transaction table monitoring, which monitors misuse of transaction tables; methods six to eight, push ES/CK/MYSQL consistency monitoring, which monitor the consistency of the total data volume before and after a data table in the data warehouse is pushed; method nine, specified table monitoring, which checks whether the current-day partition of a specified data table contains data; method ten, dimension table uniqueness monitoring, which monitors the data uniqueness of data tables with a uniqueness configuration; and method eleven, index null value rate monitoring, which monitors null values in data tables with a core field configuration.
Further, the enumeration value monitoring method comprises: monitoring fixed fields of the data tables of the original data layer, writing the enumeration values of the fixed fields into a database, checking at regular intervals whether the enumeration values have changed, triggering an alarm when they change, analyzing the influence of the change on the data tables, and repairing the data tables.
Further, the interface result monitoring method comprises: creating a test plan on the automation platform, invoking and testing all the automation cases entered for the interfaces on the automation platform, monitoring the test results, triggering an alarm when an automation case fails, analyzing the test results, and repairing the report.
Further, in step 3, interface result monitoring is adopted to test only the automation cases entered for the interfaces of the core reports on the automation platform, the test results are analyzed, and the core reports are repaired.
Further, in step 3, if no automation case has been entered for the interface of a core report, method one to method eleven of step 4 are adopted to analyze and repair the core data in the online stage.
Further, the method also comprises: obtaining the SQL logic of an interface configuration by parsing the interface configuration associated with a report, the SQL logic being configured from a single data table or from multiple associated data tables; when an abnormal data table is identified by any of monitoring methods one to eleven, extracting the interfaces associated with the abnormal data table, then extracting the reports of those interfaces, and analyzing the affected scope of the reports.
Compared with the prior art, the invention has the following beneficial effects:
Taking the data life cycle as a timeline, the invention effectively associates multiple systems and thus provides a set of data quality assurance methods for different stages and different aspects. Through data monitoring in the pre-online, online and post-online stages respectively, the invention achieves data quality assurance with advance prevention, rapid in-process handling and after-the-fact remediation.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
Term interpretation:
ES, short for Elasticsearch, is a distributed, scalable, real-time search and analytics engine.
CK, short for ClickHouse, is an open-source columnar database.
MYSQL is an open-source relational database management system.
Doris is an MPP (massively parallel processing) analytical database.
SQL logic refers to the rules and principles used for data querying and filtering.
The present invention will be described in further detail with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the data quality guarantee method in a big data scenario provided by the invention comprises the following steps:
step 1, perform quality evaluation on data in the pre-online stage based on field information and data volume;
step 2, perform push verification on data in the pre-online stage based on the total data volumes before and after pushing;
step 3, analyze and repair core data in the online stage based on automated testing of report interfaces;
and step 4, monitor and analyze data in the post-online stage.
Taking the data life cycle as a timeline, the invention effectively associates multiple systems and thus provides a set of data quality assurance methods for different stages and different aspects. Through data monitoring in the pre-online, online and post-online stages respectively, the invention achieves data quality assurance with advance prevention, rapid in-process handling and after-the-fact remediation.
Before step 1, all database information and all data table information in the databases are acquired. The database information comprises the name of each database, the data tables under it, and its remark information; the data table information comprises the name of the database a table belongs to, the table name, the person responsible for the table, whether it is a partitioned table, the number of partitions, the partition type, the partition field type, whether it is a transaction table, and the table's remark information. Together, the database information and data table information form the data warehouse information. The asset status of the data platform, such as the number of databases and data tables, can be fully understood from the data warehouse information. Meanwhile, subsequent monitoring configuration can directly use the library names of the databases and the table names of the data tables, reducing the response time of the database to local interface calls. Some necessary information of a database or data table cannot be obtained directly and must be completed through later maintenance; for example, the responsible person recorded in the data warehouse may change after the data table structure is edited, and some data tables often require manual correction because early data was not standardized, resulting in non-standard partition types and partition formats.
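Purely as an illustration of the metadata collection described above, the following Python sketch queries a MySQL-compatible information_schema; the pymysql client, the connection parameters and the chosen columns are assumptions and not part of the patent:

# Illustrative sketch only: collect the "data warehouse information" (database list plus
# table list) from a MySQL-compatible endpoint. pymysql and information_schema are
# assumptions; partition and transaction-table flags would come from the platform's own metadata.
import pymysql

def collect_warehouse_info(host, user, password):
    conn = pymysql.connect(host=host, user=user, password=password,
                           cursorclass=pymysql.cursors.DictCursor)
    try:
        with conn.cursor() as cur:
            # Database-level information: the database names.
            cur.execute("SELECT schema_name AS db_name FROM information_schema.schemata")
            databases = cur.fetchall()
            # Table-level information: owning database, table name and remark.
            cur.execute(
                "SELECT table_schema AS db_name, table_name, table_comment AS remark "
                "FROM information_schema.tables WHERE table_type = 'BASE TABLE'")
            tables = cur.fetchall()
    finally:
        conn.close()
    return {"databases": databases, "tables": tables}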
First, data correctness must be verified in the pre-online stage, that is, the research and development testing stage; both quality evaluation and push verification are required here.
The quality evaluation is an analysis of the data at scale, and specifically comprises: step 11, acquiring any data table of a database using a model analysis tool, and counting the names and types of all fields of the current data table; step 12, looping over all fields of the current data table and counting field information: the null value count and null value rate for fields of all types, the de-duplicated (distinct) count for character-type fields, and the maximum value, minimum value and value range for numeric-type fields; step 13, counting the data volume of the current data table; step 14, looping over all data tables of the database, repeating steps 11-13, and writing the statistical results into an Excel file, each worksheet of which stores the statistics of one data table; and step 15, performing quality evaluation on the data in the database according to the statistical results, for example judging whether fields holding amounts and quantities reasonably use numeric field types, or whether the value range of an order-amount field is consistent with the business, and repairing data that fails the evaluation.
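A minimal Python sketch of steps 11-14, under the assumption that the data tables can be loaded into pandas DataFrames (the table list is supplied by the caller) and that openpyxl is available to back the Excel writer:

# Illustrative sketch of steps 11-14: per-field statistics, one Excel worksheet per table.
import pandas as pd

def profile_table(df: pd.DataFrame) -> pd.DataFrame:
    rows, total = [], len(df)
    for col in df.columns:
        s = df[col]
        info = {"field": col, "dtype": str(s.dtype),
                "null_count": int(s.isna().sum()),
                "null_rate": float(s.isna().mean()) if total else 0.0}
        if pd.api.types.is_numeric_dtype(s):
            info["min"], info["max"] = s.min(), s.max()        # value range of numeric fields
        else:
            info["distinct_count"] = int(s.nunique())          # de-duplicated count of character fields
        rows.append(info)
    return pd.DataFrame(rows)

def profile_database(tables: dict, out_path: str = "quality_report.xlsx") -> None:
    with pd.ExcelWriter(out_path) as writer:
        for name, df in tables.items():
            stats = profile_table(df)
            stats.loc[len(stats)] = {"field": "(row count)", "dtype": "",
                                     "null_count": len(df), "null_rate": None}  # step 13
            stats.to_excel(writer, sheet_name=name[:31], index=False)           # one sheet per table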
Because data is used for different purposes, data in the data warehouse may also be pushed by a push tool to different types of data storage media (MYSQL, CK, ES, Doris, etc.), and the correctness of the push needs to be verified there. This specifically comprises: step 21, using a data comparison tool to judge whether the total data volume of the source data table (the data being pushed) in the data warehouse is consistent with that of the target data table (the pushed data in the target storage medium and database); the totals are obtained by connecting to the databases and running SQL count statements, and are then compared; step 22, if the total data volumes of the source and target data tables are consistent, randomly selecting N rows from the source data table via the configured unique key (one or more fields that identify exactly one row) and querying the corresponding rows in the target data table by the same unique key; if the query fails, push verification fails; otherwise the fields of the N rows are compared one by one, and if every field comparison matches, push verification succeeds; otherwise push verification fails, which indicates an abnormality in the push program.
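A hedged Python sketch of steps 21-22; fetch_all(), the two DB-API connections and the in-memory key sampling are placeholders standing in for the real data comparison tool:

# Illustrative sketch of steps 21-22: count comparison, then a field-by-field spot check
# of N rows located through the configured unique key.
import random

def fetch_all(conn, sql, params=()):
    with conn.cursor() as cur:
        cur.execute(sql, params)
        return cur.fetchall()

def verify_push(src_conn, dst_conn, table, key, n=100):
    # Step 21: the total data volume must match before and after pushing.
    src_cnt = fetch_all(src_conn, f"SELECT COUNT(*) FROM {table}")[0][0]
    dst_cnt = fetch_all(dst_conn, f"SELECT COUNT(*) FROM {table}")[0][0]
    if src_cnt != dst_cnt:
        return False, f"total mismatch: source={src_cnt}, target={dst_cnt}"
    # Step 22: sample N unique-key values and compare each row field by field.
    keys = [r[0] for r in fetch_all(src_conn, f"SELECT {key} FROM {table}")]  # small tables only;
    for k in random.sample(keys, min(n, len(keys))):                          # sample in SQL for big ones
        src_row = fetch_all(src_conn, f"SELECT * FROM {table} WHERE {key} = %s", (k,))
        dst_row = fetch_all(dst_conn, f"SELECT * FROM {table} WHERE {key} = %s", (k,))
        if not dst_row:                      # the key cannot be found in the target: push failed
            return False, f"key {k} missing in target"
        if src_row[0] != dst_row[0]:         # assumes both ends return columns in the same order
            return False, f"field mismatch for key {k}"
    return True, "push verification succeeded"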
Next, the scheduling platform executes scheduling tasks at fixed times every day, processing the previous day's data in the early morning. At this stage, because the scheduling tasks of the scheduling platform are highly complex and place high demands on every dependency such as environment, network and equipment resources, a failure at any link will cause that day's data processing task to fail and thus affect the on-time, high-quality output of all reports. In this scenario, the whole big data team needs to guarantee the correctness of the core reports first and ensure the accuracy of the core data. Therefore, the scheduling platform invokes the interface result monitoring method described below; monitoring the interface results can ensure that the report system functions correctly and that its logic is unchanged. Specifically: interface result monitoring is adopted to test the automation cases entered for the interfaces of the core reports on the automation platform and judge whether the core data has been processed normally, and finally the test results are analyzed and the core reports are repaired. If any automation case fails, an alarm is triggered; the alarm contains the number and proportion of failed automation cases and carries a link to the query detail page of the interface result monitoring. This ensures that problems are found before the user notices them, leaving sufficient time to repair them.
After the scheduling platform has executed all its tasks comes the post-online stage. At this point, the data that has gone online is monitored and analyzed using the following monitoring methods:
Method one: enumeration value monitoring is adopted to analyze and repair the data tables of the original data layer in the data warehouse. A data warehouse can be roughly divided into an original data layer (ods layer), a data detail layer (dwd layer), a data service layer (dws layer), a data application layer (ads layer), and so on. The data in the original data layer (ods layer) is synchronized from the business system databases into the data warehouse by various synchronization programs, and all big data development starts from the ods layer. When an enumeration value in the business system changes, data correctness is affected; for example, in a case when statement, if a changed or newly added enumeration value is not specially handled (for instance by an else branch), the abnormal data will affect the correctness of the report results.
The enumeration value monitoring method comprises: monitoring fixed fields of the data tables of the original data layer (ods layer), writing the enumeration values of the fixed fields into a database, checking at regular intervals whether the enumeration values have changed, triggering an alarm when they change, analyzing the influence of the change on the data tables, and repairing the data tables. For example, the enumeration values (A/B/C) of a fixed field are written into the database; when the enumeration values change to (A/B/C/E), a new enumeration value E has been added, and the responsible development engineer needs to analyze, from the fixed field and the newly added value E, what influence it will cause and how to repair the data.
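A minimal sketch of this enumeration check; SQLite is used only as an illustrative baseline store, live_values is assumed to come from a DISTINCT query on the ods-layer table, and the alarm channel is reduced to a print statement:

# Illustrative sketch of method one: keep a baseline of a fixed field's enumeration
# values and alarm when the live values diverge from it.
import json
import sqlite3

def check_enum(meta_conn: sqlite3.Connection, table: str, field: str, live_values: set):
    cur = meta_conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS enum_baseline "
                "(tbl TEXT, fld TEXT, vals TEXT, PRIMARY KEY (tbl, fld))")
    row = cur.execute("SELECT vals FROM enum_baseline WHERE tbl = ? AND fld = ?",
                      (table, field)).fetchone()
    if row is None:
        # First run: record the baseline, e.g. {"A", "B", "C"}.
        cur.execute("INSERT INTO enum_baseline VALUES (?, ?, ?)",
                    (table, field, json.dumps(sorted(live_values))))
        meta_conn.commit()
        return [], []
    baseline = set(json.loads(row[0]))
    added, removed = live_values - baseline, baseline - live_values
    if added or removed:
        # Alarm so the responsible engineer can assess the impact (e.g. a new value "E").
        print(f"[ALARM] {table}.{field} enumeration changed: +{sorted(added)} -{sorted(removed)}")
    return sorted(added), sorted(removed)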
Method two: interface result monitoring is adopted to analyze and repair the reports generated by the report interfaces. A test plan is created on the automation platform, all the automation cases entered for the interfaces on the automation platform are invoked and tested, the test results are monitored, an alarm is triggered when an automation case fails, and the test results are analyzed and the report is repaired. The automation cases can ensure that the report interfaces are invoked normally and that the report and data logic remain unchanged. The interface logic is driven by the interface paths passed in by the scheduling platform: the corresponding automation cases are found through the interface definition dolphinRunApi(List<String> apiList) and then executed; when an automation case fails, the information of the person on duty is obtained directly from a data table stored in the data warehouse, and finally an alarm is triggered.
Method three: layered data monitoring is adopted to monitor data extraction into the original data layer (ods layer) of the data warehouse and the scheduling tasks of the data detail layer (dwd layer). Monitoring the ods layer shows whether the source business system data has been extracted into big data normally and verifies data integrity; configuring monitoring for the dwd layer shows whether the scheduling tasks of the data warehouse ran normally and verifies link integrity.
Method four: incremental monitoring is adopted to monitor data changes in the original data layer (ods layer) of the data warehouse. It is generally used for ods-layer data tables whose data is synchronized by a full synchronization scheme, and its purpose is to monitor how the data changes: the business system does not allow physical deletion, so the data should only grow and never shrink, and if it shrinks, the existing reports will be affected. For tables with large business growth, a situation of no growth or very little growth also needs to be investigated. The goal is to ensure data consistency.
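A sketch of this incremental check, assuming count_rows() wraps a SELECT COUNT(*) against the ods table and history is any persistent key-value store holding the previous day's counts:

# Illustrative sketch of method four: track the daily row count of a fully synchronized
# ods table, alarm when it shrinks, and warn when a high-growth table stops growing.
import datetime

def incremental_check(table, count_rows, history, min_growth=0):
    today = datetime.date.today().isoformat()
    current = count_rows(table)                 # e.g. SELECT COUNT(*) FROM <table>
    previous = history.get(table)
    history[table] = current
    if previous is None:
        return f"{table}: baseline {current} recorded on {today}"
    if current < previous:
        return f"[ALARM] {table}: row count dropped {previous} -> {current} on {today}"
    if current - previous < min_growth:
        return f"[WARN] {table}: growth {current - previous} below expected {min_growth}"
    return f"{table}: ok ({previous} -> {current})"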
Method five: global transaction table monitoring is adopted to monitor the transaction tables and prevent their misuse.
Methods six to eight: push ES/CK/MYSQL consistency monitoring is adopted to monitor the consistency of the total data volume before and after a data table in the data warehouse is pushed. These three kinds of monitoring all verify that the total data volumes before and after pushing are consistent.
Method nine: specified table monitoring is adopted to check whether the current-day partition of a specified data table contains data. It is an extension of layered data monitoring: the data table information does not have to follow the layering rules, and whether the current-day partition of the specified data table contains data can be verified directly.
Method ten: dimension table uniqueness monitoring is adopted to monitor the data uniqueness of data tables with a uniqueness configuration. The uniqueness configuration can be a single field or multiple fields, and its purpose is to ensure data uniqueness, with no duplicate data.
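A sketch of the uniqueness check as a single GROUP BY ... HAVING query; run_query() and the key_fields list (the single- or multi-field uniqueness configuration) are assumptions:

# Illustrative sketch of method ten: find duplicate rows over the configured uniqueness fields.
def uniqueness_check(run_query, table, key_fields):
    keys = ", ".join(key_fields)                      # single field or several fields
    sql = (f"SELECT {keys}, COUNT(*) AS cnt FROM {table} "
           f"GROUP BY {keys} HAVING COUNT(*) > 1 LIMIT 100")
    duplicates = run_query(sql)
    if duplicates:
        return f"[ALARM] {table}: {len(duplicates)} duplicated key groups, e.g. {duplicates[0]}"
    return f"{table}: ok, no duplicate keys"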
Method eleven: index null value rate monitoring is adopted to monitor null values in data tables with a core field configuration. Core fields of certain data tables are configured: if no null rate is configured for a field, null values must not appear at all; if a null rate is configured, the actual null rate must be smaller than the configured value. The null rate formula is: (number of rows where the field is null) / (total number of rows in the table); if the field is of character type, the formula is: (number of rows where the field is null + number of rows where the field is blank) / (total number of rows in the table).
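A sketch applying the null rate formula above in one SQL pass; run_query(), the is_char flag and the way the threshold is configured are assumptions:

# Illustrative sketch of method eleven: compute a core field's null rate and compare it
# with the configured threshold; for character fields blank strings also count as null.
def null_rate_check(run_query, table, field, is_char, max_rate=None):
    null_expr = (f"SUM(CASE WHEN {field} IS NULL OR {field} = '' THEN 1 ELSE 0 END)"
                 if is_char else
                 f"SUM(CASE WHEN {field} IS NULL THEN 1 ELSE 0 END)")
    nulls, total = run_query(f"SELECT {null_expr}, COUNT(*) FROM {table}")[0]
    rate = (nulls or 0) / total if total else 0.0
    if max_rate is None:
        bad = (nulls or 0) > 0          # no threshold configured: null values are not allowed at all
    else:
        bad = rate >= max_rate          # the actual rate must stay below the configured value
    if bad:
        return f"[ALARM] {table}.{field}: null rate {rate:.4f} (nulls={nulls}, total={total})"
    return f"{table}.{field}: ok, null rate {rate:.4f}"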
Each of the above monitoring methods can be configured to run at fixed times according to actual needs, and multiple monitoring tasks of each type can be configured according to the actual situation.
In the online stage, if no automation case has been entered for the interface of a core report, the above eleven monitoring methods are adopted to analyze and repair the core data.
The invention also comprises: obtaining the SQL logic of an interface configuration by parsing the interface configuration associated with a report, the SQL logic being configured from a single data table or from multiple associated data tables; when an abnormal data table is identified by any of monitoring methods one to eleven, the interfaces associated with the abnormal data table are extracted, then the reports of those interfaces are extracted, and the affected scope of the reports is analyzed.
The report processing logic comprises interface registration and report development. Interface registration means that, driven by an explicit report requirement, a BI development engineer develops the SQL script (SQL logic) on the interface platform from the monitored single or multiple associated data tables, registers the SQL script as an interface, and stores the relationship between the data and the interface in the data warehouse. Report development means that a front-end development engineer builds a report on the report platform by adding front-end low-code configuration to the data returned by the registered interface, and stores the relationship between the report and the interface in the data warehouse. Through these operations the data-interface-report relationship is established: when abnormal data is detected, the affected interfaces are found on the interface platform from the abnormal data, the affected reports are then quickly located on the report platform from the affected interfaces, and the affected report scope can be analyzed.
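A sketch of the data to interface to report lookup; the relationship table names data_interface_rel and interface_report_rel are hypothetical stand-ins for the mappings the patent says are stored in the data warehouse, and run_query() again wraps the client:

# Illustrative sketch of the data -> interface -> report impact analysis.
def affected_reports(run_query, abnormal_table):
    # Interfaces whose registered SQL logic reads the abnormal data table.
    interfaces = [r[0] for r in run_query(
        "SELECT interface_id FROM data_interface_rel WHERE table_name = %s",
        (abnormal_table,))]
    if not interfaces:
        return {"interfaces": [], "reports": []}
    # Reports built on those interfaces on the report platform.
    placeholders = ", ".join(["%s"] * len(interfaces))
    reports = [r[0] for r in run_query(
        f"SELECT DISTINCT report_id FROM interface_report_rel "
        f"WHERE interface_id IN ({placeholders})", tuple(interfaces))]
    return {"interfaces": interfaces, "reports": reports}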
After all monitoring tasks have been executed, the monitoring results are recorded and the monitoring details are output by day, forming a dashboard for the whole quality platform that facilitates problem location and troubleshooting.
Finally, it should be noted that the above embodiments are merely preferred embodiments used to illustrate the technical solution of the invention and do not limit its protection scope. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, without departing from the spirit of the corresponding technical solutions. In other words, even if the main design concept and spirit of the invention are modified or refined in an insubstantial manner, as long as the technical problem solved remains consistent with that of the invention, the result falls within the protection scope of the invention. In addition, applying the technical solution of the invention directly or indirectly to other related technical fields also falls within the protection scope of the invention.

Claims (8)

Step 1 comprises: step 11, acquiring any data table of a database using a model analysis tool, and counting the names and types of all fields of the current data table; step 12, looping over all fields of the current data table and counting field information: the null value count and null value rate for fields of all types, the de-duplicated (distinct) count for character-type fields, and the maximum value, minimum value and value range for numeric-type fields; step 13, counting the data volume of the current data table; step 14, looping over all data tables of the database, repeating steps 11-13, and writing the statistical results of steps 11-13 into an Excel file, each worksheet of which stores the statistics of one data table; and step 15, performing quality evaluation on the data in the database according to the statistical results;
and in step 4, monitoring and analysis are performed using the following monitoring methods: method one, enumeration value monitoring, which analyzes and repairs the data tables of the original data layer in the data warehouse; method two, interface result monitoring, which analyzes and repairs the reports generated by the report interfaces; method three, layered data monitoring, which monitors data extraction into the original data layer of the data warehouse and the scheduling tasks of the data detail layer; method four, incremental monitoring, which monitors data changes in the original data layer of the data warehouse; method five, global transaction table monitoring, which monitors misuse of transaction tables; methods six to eight, push ES/CK/MYSQL consistency monitoring, which monitor the consistency of the total data volume before and after a data table in the data warehouse is pushed; method nine, specified table monitoring, which checks whether the current-day partition of a specified data table contains data; method ten, dimension table uniqueness monitoring, which monitors the data uniqueness of data tables with a uniqueness configuration; and method eleven, index null value rate monitoring, which monitors null values in data tables with a core field configuration.
CN202410078740.0A | 2024-01-19 | 2024-01-19 | Data quality guarantee method under big data scene | Active | CN117591508B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410078740.0A (CN117591508B (en)) | 2024-01-19 | 2024-01-19 | Data quality guarantee method under big data scene

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202410078740.0A (CN117591508B (en)) | 2024-01-19 | 2024-01-19 | Data quality guarantee method under big data scene

Publications (2)

Publication Number | Publication Date
CN117591508A (en) | 2024-02-23
CN117591508B (en) | 2024-05-28

Family

ID=89913763

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202410078740.0A | Active | CN117591508B (en) | 2024-01-19 | 2024-01-19 | Data quality guarantee method under big data scene

Country Status (1)

Country | Link
CN (1) | CN117591508B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11971874B2 (en)* | 2019-01-31 | 2024-04-30 | Salesforce, Inc. | Systems, methods, and apparatuses for implementing efficient storage and validation of data and metadata within a blockchain using distributed ledger technology (DLT)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104636484A (en)* | 2015-02-16 | 2015-05-20 | 广东省公安厅 | Monitoring task generation method and device based on data monitoring
CN111177139A (en)* | 2019-12-31 | 2020-05-19 | 青梧桐有限责任公司 | Data quality verification monitoring and early warning method and system based on data quality system
CN111858646A (en)* | 2020-07-21 | 2020-10-30 | 国网浙江省电力有限公司营销服务中心 | A method and system for checking the quality data format of an electric energy meter
CN114155054A (en)* | 2020-09-08 | 2022-03-08 | 福建天泉教育科技有限公司 | A test method and system for big data report statistics based on kafka
CN112328612A (en)* | 2020-11-03 | 2021-02-05 | 平安信托有限责任公司 | Field consistency verification test data generation method, system, device and medium
CN112100250A (en)* | 2020-11-23 | 2020-12-18 | 支付宝(杭州)信息技术有限公司 | Data processing method and device
CN115345163A (en)* | 2022-07-11 | 2022-11-15 | 陕西千山航空电子有限责任公司 | Outfield quality analysis method and system based on fault data
CN115525575A (en)* | 2022-10-30 | 2022-12-27 | 中国太平洋人寿保险股份有限公司 | A data automation testing method and system based on Dataworks platform
CN116521664A (en)* | 2023-05-11 | 2023-08-01 | 杭州网易云音乐科技有限公司 | Data monitoring method and device for data warehouse, computing equipment and storage medium
CN116860740A (en)* | 2023-07-28 | 2023-10-10 | 北京沃东天骏信息技术有限公司 | Method, device, electronic equipment and medium for monitoring data quality

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Carlo Batini et al., "Methodologies for data quality assessment and improvement", ACM Computing Surveys, 2009-07-30, 41(3), pp. 1-52.*
郑丹辉 (Zheng Danhui), "Research on the application of big data warehouse data models in e-government", China Master's Theses Full-text Database, Social Sciences I, 2021-02-15, No. 2, G110-624.*

Also Published As

Publication number | Publication date
CN117591508A (en) | 2024-02-23

Similar Documents

Publication | Publication Date | Title
CN110287052B (en)Root cause task determination method and device for abnormal task
US11829365B2 (en)Systems and methods for data quality monitoring
CN112800044B (en)Data quality judging and monitoring method, management system, storage medium and terminal
US10452625B2 (en)Data lineage analysis
CN111539633A (en)Service data quality auditing method, system, device and storage medium
WO2008060015A1 (en)System and method for management of performance fault using statistical analysis
US6708185B2 (en)SQL execution analysis
CN116303380B (en)Data quality checking method, equipment and medium in monitoring service
CN110716539A (en)Fault diagnosis and analysis method and device
CN116775405A (en)Automatic inspection system and inspection method for database
CN114238463A (en)Calculation engine control method and device for distributed index calculation
CN118410106B (en)Cross-data source real-time synchronization method based on time line mapping
CN115599621A (en)Micro-service abnormity diagnosis method, device, equipment and storage medium
CN112199423A (en)ETL data quality judgment and feedback method
CN117591508B (en)Data quality guarantee method under big data scene
CN119806873A (en) A method for analyzing the causes of faults in power information systems
CN115718690B (en)Data accuracy monitoring system and method
CN115941446B (en)Alarm root cause positioning method, device, electronic equipment and computer readable medium
CN115629949A (en)Optimization method and device for management information system, electronic equipment and storage medium
Tadano et al.Automatic synthesis of SRN models from system operation templates for availability analysis
CN109377039B (en)Method for analyzing key fault factors of system task reliability
CN118113622A (en)Detection and repair method, device and equipment applied to batch operation scheduling
CN117093875A (en)Screening item search quality verification method and device and computer equipment
CN119337838A (en) Method and device for generating financial regulatory reports, storage medium and electronic equipment
CN120705057A (en)Method and device for adjusting Mock object, electronic equipment and storage medium

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
