CN107609124A - A kind of data managing method and data management platform - Google Patents

A kind of data managing method and data management platform

Info

Publication number
CN107609124A
Authority
CN
China
Prior art keywords
data
model
metadata
target data
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710831467.4A
Other languages
Chinese (zh)
Inventor
宋设
单震
张延群
刘骥飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Cloud Service Information Technology Co Ltd
Original Assignee
Shandong Inspur Cloud Service Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Cloud Service Information Technology Co Ltd
Priority to CN201710831467.4A
Publication of CN107609124A
Current legal status: Pending

Abstract

The invention provides a data management method and a data management platform. The method includes: creating at least one data model in a preset asset library; determining a data source, and determining, among the at least one data model, a target data model corresponding to the data source; collecting source data from the data source; processing the source data according to the target data model to determine target data and metadata; storing the metadata in a preset management library; and storing the target data in the target data model. The scheme provided by the invention can improve data management efficiency.

Description

A kind of data managing method and data management platform
Technical field
The present invention relates to the field of computer technology, and in particular to a data management method and a data management platform.
Background
With rapid economic development, massive amounts of data are generated, and these data come from different data sources. Managing data from different data sources effectively is therefore vital.
In the prior art, data are typically managed manually.
However, managing data manually is inefficient.
Summary of the invention
Embodiments of the invention provide a data management method and a data management platform that can improve data management efficiency.
In a first aspect, embodiments of the invention provide a data management method in which at least one data model is created in a preset asset library, the method further including:
Determining a data source, and determining, among the at least one data model, a target data model corresponding to the data source;
Collecting source data from the data source;
Processing the source data according to the target data model to determine target data and metadata;
Storing the metadata in a preset management library;
Storing the target data in the target data model.
Preferably,
Creating at least one data model in the preset asset library includes:
Receiving at least one piece of model information submitted by a user;
For each piece of model information, judging whether the current model information satisfies a preset model access rule and, if so, creating the data model corresponding to the current model information in the preset asset library.
Preferably,
Before the metadata is stored in the preset management library, the method further includes:
Judging whether the metadata satisfies a preset data access rule and, if so, storing the metadata in the preset management library.
Preferably,
After the source data is processed according to the target data model to determine the target data and the metadata, the method further includes:
Determining label information of the target data according to a preset label rule and the target data;
Storing the label information in the management library.
Preferably,
The method further includes:
Counting the data volume of the metadata stored in the management library and the data volume of the target data stored in the asset library;
Presenting the counted data volume of the metadata and the counted data volume of the target data to the user.
Preferably,
The model information includes any one or a combination of: a model name, a model code, and a model description.
In a second aspect, embodiments of the invention provide a data management platform, including:
A model access module, configured to create at least one data model in a set asset library and to store metadata in a set management library;
A data access module, configured to determine a data source and determine, among the at least one data model created by the model access module, a target data model corresponding to the data source; collect source data from the data source; process the source data according to the target data model to determine target data and the metadata; and store the target data in the target data model.
Preferably,
The model access module is configured to receive at least one piece of model information submitted by a user and, for each piece of model information, judge whether the current model information satisfies a set model access rule and, if so, create the data model corresponding to the current model information in the set asset library.
Preferably,
The data access module is further configured to judge whether the metadata satisfies a set data access rule and, if so, trigger storing the metadata in the set management library.
Preferably,
The data access module is further configured to determine label information of the target data according to a set label rule and the target data;
The model access module is further configured to store the label information in the management library.
Preferably,
The platform further includes a display module;
The display module is configured to count the data volume of the metadata stored in the management library and the data volume of the target data stored in the asset library, and to present the counted data volume of the metadata and the counted data volume of the target data to the user.
Preferably,
The model information includes any one or a combination of: a model name, a model code, and a model description.
Embodiments of the invention provide a data management method and a data management platform. The method processes the source data of a data source by means such as data conversion and data cleaning to obtain target data and metadata, imports the target data into the target data model of the asset library, and imports the metadata into the management library. The process requires no manual handling and is highly automated, so data management efficiency can be improved.
Brief description of the drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a data management method provided by one embodiment of the invention;
Fig. 2 is a flowchart of a data management method provided by another embodiment of the invention;
Fig. 3 is a structural diagram of a data management platform provided by one embodiment of the invention;
Fig. 4 is a structural diagram of a data management platform provided by another embodiment of the invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained from them by a person of ordinary skill in the art without creative effort fall within the scope of protection of the invention.
As shown in Fig. 1, an embodiment of the invention provides a data management method, which may include the following steps:
Step 101: Create at least one data model in a preset asset library;
Step 102: Determine a data source, and determine, among the at least one data model, a target data model corresponding to the data source;
Step 103: Collect source data from the data source;
In this embodiment, source data can be collected from multiple data sources at the same time and each set of source data is processed separately, but the processing method is the same for each.
Step 104: Process the source data according to the target data model to determine target data and metadata;
Step 105: Store the metadata in a preset management library;
Step 106: Store the target data in the target data model.
The method processes the source data of the data source by data conversion, data cleaning, and similar means to obtain target data and metadata, imports the target data into the target data model of the asset library, and imports the metadata into the management library. The process requires no manual handling and is highly automated, so data management efficiency can be improved.
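Steps 101 to 106 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Warehouse` class, the in-memory dictionaries, and the particular cleaning and conversion rules (dropping rows with missing values, lower-casing column names) are hypothetical and are not specified by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    # Asset library: one list of target rows per data model (hypothetical layout).
    asset_library: dict = field(default_factory=dict)
    # Management library: metadata records describing the stored target data.
    management_library: list = field(default_factory=list)


def create_model(wh, model_name):
    """Step 101: create an (empty) data model in the asset library."""
    wh.asset_library.setdefault(model_name, [])


def process(source_rows, model_name):
    """Step 104: derive target data and metadata from source data."""
    # Cleaning: drop rows with missing values. Conversion: normalise column names.
    target = [
        {k.lower(): v for k, v in row.items()}
        for row in source_rows
        if all(v is not None for v in row.values())
    ]
    metadata = {"model": model_name, "rows": len(target)}
    return target, metadata


def ingest(wh, source_rows, model_name):
    """Steps 102 to 106 for one data source whose target model is already known."""
    if model_name not in wh.asset_library:
        raise KeyError(f"no data model named {model_name!r}")
    target, metadata = process(source_rows, model_name)
    wh.management_library.append(metadata)        # step 105
    wh.asset_library[model_name].extend(target)   # step 106
    return metadata


wh = Warehouse()
create_model(wh, "orders")
meta = ingest(wh, [{"ID": 1, "Amount": 9.5}, {"ID": 2, "Amount": None}], "orders")
```

Keeping the metadata and the target rows in separate containers mirrors the separation between the management library and the asset library described in the text.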
In a practical application scenario, the data warehouse includes three sub-warehouses: a cleaning library, an asset library, and a management library.
The cleaning library is mainly responsible for handling frequent external data access. It serves as a buffer layer for data processing and cleaning and is open to data processing personnel; data and models in it are cleared once they have not been operated on for a certain period of time. The asset library is mainly responsible for storing the mass data assets. The management library is mainly responsible for storing the metadata of the data assets.
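The clearing behaviour of the cleaning library, where entries that have not been operated on for some time are removed, might look like the following sketch. The class name `CleaningLibrary`, the `max_idle` window, and the choice to treat a read as an operation are assumptions made for illustration; the patent does not state the retention period or what counts as an operation.

```python
from datetime import datetime, timedelta


class CleaningLibrary:
    """Buffer for incoming source data; idle entries are purged."""

    def __init__(self, max_idle):
        self.max_idle = max_idle          # hypothetical retention window
        self._entries = {}                # name -> (data, time of last operation)

    def put(self, name, data, now):
        self._entries[name] = (data, now)

    def get(self, name, now):
        data, _ = self._entries[name]
        self._entries[name] = (data, now)  # assumption: a read counts as an operation
        return data

    def purge(self, now):
        """Remove entries not operated on within max_idle; return their names."""
        stale = [n for n, (_, t) in self._entries.items() if now - t > self.max_idle]
        for n in stale:
            del self._entries[n]
        return sorted(stale)

    def names(self):
        return sorted(self._entries)


t0 = datetime(2017, 9, 1)
lib = CleaningLibrary(max_idle=timedelta(days=7))
lib.put("crm_dump", [1, 2], now=t0)
lib.put("erp_dump", [3], now=t0)
lib.get("erp_dump", now=t0 + timedelta(days=5))
removed = lib.purge(now=t0 + timedelta(days=8))
```

Passing `now` explicitly instead of reading the system clock keeps the sketch deterministic and easy to test.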
In this embodiment, the cleaning library stores source data from different data sources.
The data management method involves two audit processes, data audit and model audit, which are described further below.
In one embodiment of the invention, to control the model access process, the legitimacy of the model information must be verified. Creating at least one data model in the preset asset library therefore includes:
Receiving at least one piece of model information submitted by a user;
For each piece of model information, judging whether the current model information satisfies the preset model access rule and, if so, creating the data model corresponding to the current model information in the preset asset library.
The model information includes any one or a combination of: a model name, a model code, and a model description.
The model access rule is used to screen the model information and determine which model information satisfies the rule. In this way multiple data models can be created in the asset library, each with different characteristics, to store the target data corresponding to its model information.
In one embodiment of the invention, to verify the legitimacy of the target data to be accessed, before the metadata is stored in the preset management library, the method further includes:
Judging whether the metadata satisfies the preset data access rule and, if so, storing the metadata in the preset management library.
This audit function ensures that the metadata and the target data are transferred correctly.
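The data audit can be pictured as a gate in front of the management library. The patent does not say what the data access rule checks, so the rule below (three required fields and a non-negative integer row count) and the names `REQUIRED_FIELDS`, `meets_data_access_rule`, and `store_metadata` are purely hypothetical.

```python
REQUIRED_FIELDS = ("model", "source", "row_count")   # hypothetical data access rule


def meets_data_access_rule(metadata):
    """Return True if every required field is present and the count is sane."""
    if any(metadata.get(f) in (None, "") for f in REQUIRED_FIELDS):
        return False
    count = metadata["row_count"]
    return isinstance(count, int) and count >= 0


def store_metadata(management_library, metadata):
    """Store metadata only if it passes the audit; report whether it was stored."""
    if not meets_data_access_rule(metadata):
        return False
    management_library.append(metadata)
    return True


management_library = []
accepted = store_metadata(
    management_library, {"model": "orders", "source": "crm", "row_count": 3}
)
rejected = store_metadata(management_library, {"model": "orders", "row_count": 3})
```

Metadata that fails the rule is simply not stored here; a real platform would presumably also report the rejection to the user.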
To make it easy for users to retrieve the target data they need, labels must be added to data models and target data.
The label of a data model is the model's own feature label. It describes the characteristics of the data model and is maintained manually when the model is created or imported.
The label of target data is a feature label generated after processing, and is the result of business enablement applied to the target data. Label management divides labels into multiple types, and different types implement different label management methods. A label management method covers how a label implements its page display logic, how the label's screening logic is converted into a filter that engines such as ES can query, and how the label's query result is converted into a suitable result format through the related label dimension table.
In one embodiment of the invention, after the source data is processed according to the target data model to determine the target data and the metadata, the method further includes:
Determining the label information of the target data according to the preset label rule and the target data, and storing the label information in the management library.
For example, according to the label rule, the target data is divided by month into two parts, A and B; the label "August" is added to part A and the label "September" to part B. When the target data is searched, the keyword "August" retrieves the part A target data and the keyword "September" retrieves the part B target data.
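The month example can be written out as a small sketch. The record layout (a `date` field on each record), the `MONTH_LABELS` table, and the index structure are assumptions for illustration; the patent only states that labels are derived from a label rule and stored in the management library.

```python
from datetime import date

MONTH_LABELS = {8: "August", 9: "September"}   # hypothetical label rule


def label_for(record):
    """Apply the label rule: label a record by the month of its date."""
    return MONTH_LABELS.get(record["date"].month)


def build_label_index(target_data):
    """Map each label to the positions of the records that carry it."""
    index = {}
    for pos, record in enumerate(target_data):
        label = label_for(record)
        if label is not None:
            index.setdefault(label, []).append(pos)
    return index


def search(target_data, index, keyword):
    """Retrieve the records whose label equals the keyword."""
    return [target_data[pos] for pos in index.get(keyword, [])]


target_data = [
    {"id": 1, "date": date(2017, 8, 3)},
    {"id": 2, "date": date(2017, 9, 15)},
    {"id": 3, "date": date(2017, 8, 28)},
]
label_index = build_label_index(target_data)
august = search(target_data, label_index, "August")
```

Here `label_index` plays the role of the label information kept in the management library, while `target_data` stays in the asset library.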
The data management method is implemented on the basis of ETL (Extract-Transform-Load) technology.
A data warehouse is analysis-oriented, whereas an operational database is application-oriented. In this method, therefore, it is mainly the target data model in the data warehouse that determines which data need to be extracted from the application database.
In actual development, developers will often find that some ETL steps do not match the description of the target data model. In that case the design requirements must be checked again and the ETL redone. As discussed in the database series, any change that affects the requirements means starting over and updating the requirements document.
The transformation process mainly converts the structure of the extracted data so that it satisfies the target data model. Transformation is also responsible for data quality; this part is also called data cleaning.
The loading process loads the target data, whose quality has been ensured by transformation, into the target data model. Loading is of two kinds: first loading and refresh loading. First loading can involve a large amount of data, whereas refresh loading is a micro-batch style of loading.
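The difference between the two kinds of loading can be illustrated as follows. This is a sketch under assumptions: the target data model is represented as a dictionary keyed by a hypothetical `id` column, first loading is treated as a full replacement, and refresh loading as an upsert of a small batch. The patent does not prescribe these mechanics.

```python
def first_load(model_table, rows, key="id"):
    """First loading: replace the table contents with the full extract."""
    model_table.clear()
    model_table.update({row[key]: row for row in rows})
    return len(rows)


def refresh_load(model_table, batch, key="id"):
    """Refresh loading: upsert a small batch of new or changed rows."""
    changed = 0
    for row in batch:
        if model_table.get(row[key]) != row:
            model_table[row[key]] = row
            changed += 1
    return changed


table = {}
loaded = first_load(table, [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}])
changed = refresh_load(table, [{"id": 2, "v": "b"}, {"id": 3, "v": "c"}])
```

In the refresh batch, the row with `id` 2 is unchanged and is skipped, so only the new row with `id` 3 is written.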
In practice, label management can also be extended to add, modify, and delete labels; the details are not repeated here.
In one embodiment of the invention, to show the user the amount of data stored in the data warehouse, the method further includes:
Counting the data volume of the metadata stored in the management library and the data volume of the target data stored in the asset library;
Presenting the counted data volume of the metadata and the counted data volume of the target data to the user.
In addition, information of other dimensions can be shown to the user, such as the capacity of the asset library and the management library and the number of data models in the asset library.
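A minimal sketch of the statistics step, assuming the asset library is a mapping from model name to a list of rows and the management library is a list of metadata records. Measuring "data volume" as a record count is an assumption; the patent does not define the unit.

```python
def collect_statistics(asset_library, management_library):
    """Count stored records; asset_library maps model name -> list of rows."""
    per_model = {name: len(rows) for name, rows in asset_library.items()}
    return {
        "metadata_records": len(management_library),
        "target_records": sum(per_model.values()),
        "model_count": len(per_model),
        "per_model": per_model,
    }


def render(stats):
    """Render the statistics as plain text for display to the user."""
    lines = [
        f"metadata records: {stats['metadata_records']}",
        f"target records:   {stats['target_records']}",
        f"data models:      {stats['model_count']}",
    ]
    lines += [f"  {name}: {n}" for name, n in sorted(stats["per_model"].items())]
    return "\n".join(lines)


stats = collect_statistics(
    {"orders": [{"id": 1}, {"id": 2}], "customers": [{"id": 7}]},
    [{"model": "orders"}, {"model": "customers"}],
)
report = render(stats)
```

The same `stats` dictionary could feed a chart instead of text, which matches the text-or-chart presentation mentioned for step 212.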
As shown in Fig. 2, this embodiment describes the data management method in detail, taking as an example the access of source data collected from a data source into the data warehouse. The method includes the following steps:
Step 201: Receive at least one piece of model information submitted by a user.
The model information includes a model name, a model code, and a model description.
The model description includes the characteristic information of the data model.
Step 202: For each piece of model information, judge whether the current model information satisfies the preset model access rule; if so, perform step 203.
For example, suppose the model access rule sets the type of data model to the relational model. If the data model type contained in the model description is also the relational model, the data model is created from the model information; if the type contained in the model description is the hierarchical model, creation is not performed.
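The relational-versus-hierarchical example can be sketched as below. The `ModelInfo` class, the use of a dictionary for the model description, and keying the asset library by model code are illustrative assumptions rather than details given in the patent.

```python
from dataclasses import dataclass

ALLOWED_MODEL_TYPE = "relational"   # hypothetical model access rule


@dataclass(frozen=True)
class ModelInfo:
    name: str
    code: str
    description: dict   # characteristic information, e.g. {"type": "relational"}


def meets_model_access_rule(info):
    """Step 202: accept only models whose declared type matches the rule."""
    return info.description.get("type") == ALLOWED_MODEL_TYPE


def create_models(asset_library, submissions):
    """Step 203 for each accepted submission; return the rejected model codes."""
    rejected = []
    for info in submissions:
        if meets_model_access_rule(info):
            asset_library[info.code] = {"name": info.name, "rows": []}
        else:
            rejected.append(info.code)
    return rejected


asset_library = {}
rejected = create_models(
    asset_library,
    [
        ModelInfo("Orders", "M001", {"type": "relational"}),
        ModelInfo("Org chart", "M002", {"type": "hierarchical"}),
    ],
)
```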
Step 203: Create the data model corresponding to the current model information in the preset asset library.
Step 204: Determine a data source, and determine, among the at least one data model, a target data model corresponding to the data source.
At least one access task set by the user is determined. Each access task includes the data source to be collected and the target data model that the data source is to access. The method determines the execution order according to the time at which each access task was received. Furthermore, in practice an execution period can be set for each access task, so that the task is performed once every such period.
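One way to picture the ordering of access tasks is a priority queue keyed by time. In this sketch the `AccessTask` class, the integer "ticks" used as time, and the `run_schedule` helper are hypothetical; the first `due` value stands in for the time at which the task was received, and a task with a period is re-queued after each run.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class AccessTask:
    due: int                                   # next run time (abstract ticks)
    source: str = field(compare=False)
    target_model: str = field(compare=False)
    period: Optional[int] = field(default=None, compare=False)


def run_schedule(tasks, until):
    """Run tasks in order of due time up to `until`; re-queue periodic ones."""
    queue = list(tasks)
    heapq.heapify(queue)
    executed = []
    while queue and queue[0].due <= until:
        task = heapq.heappop(queue)
        executed.append((task.due, task.source, task.target_model))
        if task.period:
            heapq.heappush(
                queue,
                AccessTask(task.due + task.period, task.source,
                           task.target_model, task.period),
            )
    return executed


executed = run_schedule(
    [AccessTask(5, "erp", "invoices"), AccessTask(1, "crm", "customers", period=3)],
    until=7,
)
```

The one-off `erp` task runs once at tick 5, while the periodic `crm` task runs at ticks 1, 4, and 7.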
Step 205: Collect source data from the data source.
In this embodiment, source data can be collected from multiple data sources at the same time and each set of source data is processed separately, but the processing method is the same for each.
Step 206: Process the source data according to the target data model to determine target data and metadata.
The source data undergoes a series of processing steps, such as conversion and cleaning, according to the format requirements of the target data model, finally yielding the target data and the metadata that describes the target data.
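Step 206 might be sketched as follows, assuming the format requirements of a target data model can be expressed as one converter per column. The `ORDER_MODEL` definition, the column names, and the metadata fields are invented for illustration; rows that cannot be converted are dropped as a simple form of cleaning.

```python
from datetime import datetime

# Hypothetical target data model: column name -> converter.
ORDER_MODEL = {
    "order_id": int,
    "amount": float,
    "placed_on": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
}


def transform(source_rows, model, source_name):
    """Convert rows to the model's format; drop rows that cannot be converted."""
    target, dropped = [], 0
    for row in source_rows:
        try:
            target.append({col: conv(row[col]) for col, conv in model.items()})
        except (KeyError, TypeError, ValueError):
            dropped += 1          # cleaning: discard malformed rows
    metadata = {
        "source": source_name,
        "columns": list(model),
        "row_count": len(target),
        "dropped": dropped,
    }
    return target, metadata


rows = [
    {"order_id": "7", "amount": "19.90", "placed_on": "2017-08-03"},
    {"order_id": "x", "amount": "1", "placed_on": "2017-08-04"},   # bad id
    {"order_id": "9", "amount": "5"},                              # missing column
]
target, metadata = transform(rows, ORDER_MODEL, "crm")
```

The returned `metadata` describes the target data (its source, columns, and row count), which is what would then pass through the data audit before being stored in the management library.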
Step 207: Determine the label information of the target data according to the preset label rule and the target data, and store the label information in the management library.
The target data is differentiated according to the label rule, and different label information is added for different types of target data.
Step 208: Judge whether the metadata satisfies the preset data access rule; if so, perform step 209.
Step 209: Store the metadata in the preset management library.
Step 210: Store the target data in the target data model.
In this embodiment, the target data and the metadata are stored separately, in the asset library and the management library of the data warehouse respectively. Separating target data from metadata benefits data management: the user can search by metadata to find the required target data in the asset library.
Step 211: Count the data volume of the metadata stored in the management library and the data volume of the target data stored in the asset library.
To let the user understand how data is stored in the data warehouse and adjust storage locations in time, this embodiment counts the data volume of the management library and of the asset library separately.
Step 212: Present the counted data volume of the metadata and the counted data volume of the target data to the user.
The statistics can be shown to the user as text, charts, or similar.
As shown in Fig. 3, an embodiment of the invention provides a data management platform, including:
A model access module 301, configured to create at least one data model in a set asset library and to store metadata in a set management library;
A data access module 302, configured to determine a data source and determine, among the at least one data model created by the model access module 301, a target data model corresponding to the data source; collect source data from the data source; process the source data according to the target data model to determine target data and metadata; and store the target data in the target data model.
In one embodiment of the invention, the model access module 301 is configured to receive at least one piece of model information submitted by a user and, for each piece of model information, judge whether the current model information satisfies a set model access rule and, if so, create the data model corresponding to the current model information in the set asset library.
In one embodiment of the invention, the data access module 302 is further configured to judge whether the metadata satisfies a set data access rule and, if so, trigger storing the metadata in the set management library.
In one embodiment of the invention, the data access module 302 is further configured to determine the label information of the target data according to a set label rule and the target data;
The model access module 301 is further configured to store the label information in the management library.
In one embodiment of the invention, as shown in Fig. 4, the data management platform further includes a display module 303;
The display module 303 is configured to count the data volume of the metadata stored in the management library and the data volume of the target data stored in the asset library, and to present the counted data volume of the metadata and the counted data volume of the target data to the user.
In one embodiment of the invention, the model information includes any one or a combination of: a model name, a model code, and a model description.
The information exchange between the units of the above apparatus and their execution processes are based on the same concept as the method embodiments of the invention. For details, refer to the description of the method embodiments; they are not repeated here.
Embodiments of the invention provide a computer-readable storage medium including execution instructions; when a processor of a storage controller executes the execution instructions, the storage controller performs the method of the above embodiments.
Embodiments of the invention provide a storage controller, including: a processor, a memory, and a bus;
The memory is used to store execution instructions, and the processor is connected to the memory through the bus; when the storage controller runs, the processor executes the execution instructions stored in the memory, so that the storage controller performs the method of the above embodiments.
In summary, the embodiments of the invention have at least the following effects:
1. In embodiments of the invention, the data management method processes the source data of a data source by data conversion, data cleaning, and similar means to obtain target data and metadata, imports the target data into the target data model of the asset library, and imports the metadata into the management library. The process requires no manual handling and is highly automated, so data management efficiency can be improved.
2. In embodiments of the invention, the model access process is audited through the model information and the data access process is audited through the metadata, which ensures the security of the data warehouse.
3. In embodiments of the invention, label information is added to data models through the model information and to target data through the label rule. Users can search data models and target data by label information and thereby obtain the data they need.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, and optical discs.
Finally, it should be noted that the above are only preferred embodiments of the invention, intended merely to illustrate its technical solution and not to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention falls within its scope of protection.

Claims (10)

CN201710831467.4A | 2017-09-15 | 2017-09-15 | A kind of data managing method and data management platform | Pending | CN107609124A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710831467.4A, CN107609124A (en) | 2017-09-15 | 2017-09-15 | A kind of data managing method and data management platform

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201710831467.4A, CN107609124A (en) | 2017-09-15 | 2017-09-15 | A kind of data managing method and data management platform

Publications (1)

Publication Number | Publication Date
CN107609124A (en) | 2018-01-19

Family

ID=61064105

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201710831467.4A | Pending | CN107609124A (en) | 2017-09-15 | 2017-09-15 | A kind of data managing method and data management platform

Country Status (1)

Country | Link
CN (1) | CN107609124A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103699693A (en) * | 2014-01-10 | 2014-04-02 | 中国南方电网有限责任公司 | Metadata-based data quality management method and system
CN106021624A (en) * | 2016-07-21 | 2016-10-12 | 中国农业银行股份有限公司 | ETL (extract-transform-load) model generation method and device
CN106777243A (en) * | 2016-12-27 | 2017-05-31 | 浪潮软件集团有限公司 | A Dynamic Modeling for Streaming Data Analysis
CN107092701A (en) * | 2017-05-02 | 2017-08-25 | 山东浪潮通软信息科技有限公司 | The data processing method and device of a kind of Multidimensional Data Model

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113127455A (en) * | 2019-12-30 | 2021-07-16 | 北京奇虎科技有限公司 | Data management method and device, electronic equipment and readable storage medium
CN113128804A (en) * | 2019-12-30 | 2021-07-16 | 北京奇虎科技有限公司 | Data management method and device, electronic equipment and readable storage medium
CN113128805A (en) * | 2019-12-30 | 2021-07-16 | 北京奇虎科技有限公司 | Method and device for treating streaming data, electronic equipment and storage medium
CN111767267A (en) * | 2020-06-18 | 2020-10-13 | 杭州数梦工场科技有限公司 | Metadata processing method and device and electronic equipment
CN111767267B (en) * | 2020-06-18 | 2024-05-10 | 杭州数梦工场科技有限公司 | Metadata processing method and device and electronic equipment


Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180119
