Disclosure of Invention
The invention therefore provides a cross-industry data processing method that can be instantiated for different application scenarios. It addresses the problem that data models differ so widely between industries that data management and data analysis cannot be carried out in a unified way.
To achieve this purpose, the invention provides the following technical scheme: a cross-industry data processing method in which business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical affiliation with the main entity and comprises affiliated data that exists in association with the main entity; the behavior sub-entity has a logical affiliation with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.
As a preferred scheme of the cross-industry data processing method, the sub-entities and the behavior sub-entities exist only in logical affiliation with a main entity, and each sub-entity or behavior sub-entity is affiliated with exactly one main entity.
As a preferred scheme of the cross-industry data processing method, one-to-many or many-to-one association relationships exist between the main entities of different business data.
As a preferred scheme of the cross-industry data processing method, the data structure and fields are customized for each business data, and the customized business data is stored independently so that the business data is isolated from one another.
As a preferred scheme of the cross-industry data processing method, several mutually isolated business data are aggregated on demand by means of pushing and association configuration.
As a preferred scheme of the cross-industry data processing method, the business data is managed in two dimensions, function modularization and data personalization;
function modularization means that, for each business data, whether the label management, grouping management, index management or user portrait functions are needed can be freely configured;
data personalization means that each business data has its own label system, grouping and index statistics, and that the collected business data is deduplicated to generate a dedicated user portrait.
As a preferred scheme of the cross-industry data processing method, the relationship between a business data source and its destination entity is defined as a lineage relationship, and the lineage relationships include business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, and main entity to main entity.
As a preferred scheme of the cross-industry data processing method, the flow of the business data is displayed in a data lineage analysis chart, data problems are located by means of the data lineage analysis chart, and the data found to be affected is re-extracted or re-pushed through the upstream and downstream business data.
As a preferred scheme of the cross-industry data processing method, data cleaning is performed on the collected business data, the data cleaning including value replacement, length truncation, UTM value extraction and MD5 aggregation.
The invention has the following advantages: the business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical affiliation with the main entity and comprises affiliated data that exists in association with the main entity; the behavior sub-entity has a logical affiliation with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity. The invention helps to reduce enterprise costs, is convenient and fast to use, and improves staff efficiency; the data is stored in isolation, which meets business requirements and improves data security; a wide variety of data source types can be supported; multiple business data can each be personalized while still being managed in a unified way; the full path of the data can be traced through the data lineage; and data cleaning improves data quality and therefore accuracy.
Detailed Description
The present invention is described below by way of particular embodiments; other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure. The described embodiments are merely examples of the invention and are not intended to limit the invention to the particular embodiments disclosed. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
Referring to fig. 1 and 2, a cross-industry data processing method is provided, in which business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical affiliation with the main entity and comprises affiliated data that exists in association with the main entity; the behavior sub-entity has a logical affiliation with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.
Specifically, the main entity is the main carrier for storing data and the main object of data analysis; applications of the data are mainly applications of the main entity. Examples are contacts and enterprise information. The sub-entity is affiliated data attached to the main entity, that is, data that has a logical affiliation with the main entity. Examples are the education history and work history of a contact. The behavior sub-entity is behavior information generated by the main entity; it has a logical affiliation with the main entity, inherits from the sub-entity, and extends the sub-entity with characteristic information of the behavior (such as time and behavior type). An example is the purchase information of a contact. When business data enters the data management system, business entities with the same structure as the business data are generated, which ensures the security and availability of the data; the business entities are the source entities for the data of all other entities.
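The four entity types described above can be illustrated with a minimal Python sketch. The class names, field names and sample values below are hypothetical and chosen only for illustration; the specification does not prescribe any particular data structure or language.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class BusinessEntity:
    """Raw business record as it enters the system; the source of all other entities."""
    source: str
    payload: Dict[str, Any]


@dataclass
class SubEntity:
    """Affiliated data that exists only in association with one main entity."""
    main_id: str
    kind: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class BehaviorSubEntity(SubEntity):
    """A sub-entity extended with behavior characteristics (time, behavior type)."""
    occurred_at: datetime = field(default_factory=datetime.now)
    behavior_type: str = "unknown"


@dataclass
class MainEntity:
    """Main carrier of business data and main object of analysis."""
    entity_id: str
    kind: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    sub_entities: List[SubEntity] = field(default_factory=list)

    def attach(self, sub: SubEntity) -> None:
        # A sub-entity may only be attached to the main entity it names.
        if sub.main_id != self.entity_id:
            raise ValueError("sub-entity belongs to a different main entity")
        self.sub_entities.append(sub)


# Hypothetical sample: one raw record feeds a contact, its education, and a purchase.
raw = BusinessEntity(
    source="crm_export",
    payload={"name": "Ada", "school": "Example University", "order": "A-17"},
)
contact = MainEntity(entity_id="c1", kind="contact",
                     attributes={"name": raw.payload["name"]})
contact.attach(SubEntity(main_id="c1", kind="education",
                         attributes={"school": raw.payload["school"]}))
contact.attach(BehaviorSubEntity(main_id="c1", kind="purchase",
                                 attributes={"order": raw.payload["order"]},
                                 occurred_at=datetime(2024, 1, 5),
                                 behavior_type="purchase"))
```

Modeling the behavior sub-entity as a subclass of the sub-entity mirrors the inheritance described in the text; a relational schema with a shared base table would be an equally valid reading.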
In particular, the sub-entities and the behavior sub-entities exist only in logical affiliation with a main entity, and each sub-entity or behavior sub-entity is affiliated with exactly one main entity. In other words, a sub-entity or behavior sub-entity cannot exist independently and cannot be affiliated with more than one main entity. In addition, one-to-many or many-to-one association relationships exist between the main entities of different business data.
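The two rules above (a single owning main entity per sub-entity, and associations between main entities) could be enforced as in the following sketch. The `AffiliationRegistry` class and the identifiers used are assumptions made for illustration only.

```python
from collections import defaultdict


class AffiliationRegistry:
    """Tracks which main entity owns each sub-entity and how main entities are linked."""

    def __init__(self):
        self._owner = {}                  # sub-entity id -> main entity id
        self._links = defaultdict(set)    # main entity id -> associated main entity ids

    def affiliate(self, sub_id, main_id):
        # Reject a second, different owner: one sub-entity, one main entity.
        current = self._owner.get(sub_id)
        if current is not None and current != main_id:
            raise ValueError(f"{sub_id} is already affiliated with {current}")
        self._owner[sub_id] = main_id

    def associate(self, main_id, other_main_id):
        # One main entity may be associated with many others (one-to-many).
        self._links[main_id].add(other_main_id)

    def owner_of(self, sub_id):
        return self._owner[sub_id]

    def associated_with(self, main_id):
        return sorted(self._links.get(main_id, ()))


registry = AffiliationRegistry()
registry.affiliate("education:1", "contact:1")
registry.associate("enterprise:1", "contact:1")
registry.associate("enterprise:1", "contact:2")
```

Here one enterprise is associated with two contacts, which is the one-to-many case; attempting to affiliate `education:1` with a second contact raises an error.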
Specifically, the data structure and fields are customized for each business data, and the customized business data is stored independently so that the business data is isolated from one another. Because each business data defines its own data structure and fields and is stored on its own, each business data is equivalent to a small, self-contained business system, which achieves genuine data isolation.
Specifically, several mutually isolated business data are aggregated on demand by means of pushing and association configuration. That is, although the business data is stored in isolation, data from several business data can still be aggregated and associated as required through pushing, association configuration and similar means.
In the cross-industry data processing method, the business data is managed in two dimensions, function modularization and data personalization;
function modularization means that, for each business data, whether the label management, grouping management, index management or user portrait functions are needed can be freely configured;
data personalization means that each business data has its own label system, grouping and index statistics, and that the collected business data is deduplicated to generate a dedicated user portrait.
For each business data, whether functions such as label management, grouping management, index management and user portrait are needed can be freely configured, which avoids redundant functional modules. Each business data has its own label system, grouping and statistical indexes; the collected business data is deduplicated and dedicated user portraits are generated automatically, supporting and safeguarding the enterprise's precision marketing.
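As an illustration of per-business-data module configuration and deduplication, consider the following sketch. The `ModuleConfig` class, the business data names and the choice of a phone number as the deduplication key are all hypothetical; the specification does not state how duplicates are identified.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ModuleConfig:
    """Function switches that can be set independently for each business data."""
    label_management: bool = False
    grouping_management: bool = False
    index_management: bool = False
    user_portrait: bool = False

    def enabled(self):
        # Names of the modules that are switched on, in declaration order.
        return [name for name, on in asdict(self).items() if on]


def deduplicate(records, key):
    """Keep the first record seen for each value of `key` (assumed identity field)."""
    seen = {}
    for record in records:
        seen.setdefault(record[key], record)
    return list(seen.values())


# Hypothetical configuration: each business data enables only what it needs.
configs = {
    "retail_members": ModuleConfig(label_management=True, user_portrait=True),
    "b2b_leads": ModuleConfig(index_management=True),
}

collected = [
    {"phone": "100", "name": "Ada"},
    {"phone": "100", "name": "Ada L."},
    {"phone": "200", "name": "Bob"},
]
unique_contacts = deduplicate(collected, key="phone")
```

Only the enabled modules would be loaded for a given business data, so a business data that needs no labels carries no label machinery.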
In one embodiment of the cross-industry data processing method, the relationship between a business data source and its destination entity is defined as a lineage relationship, and the lineage relationships include business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, and main entity to main entity.
Referring to fig. 2, the original business data forms business entities in the system. According to the lineage relationships, the business entities are pushed to the designated main entity or entities, sub-entities and behavior sub-entities, and the affiliations between them are established. When several main entities are stored at the same time, the association relationships between those main entities can also be established. After the last behavior sub-entity has been stored, the data is pushed to the next main entity according to the lineage relationship of the main entity and enters the next round of data flow, and so on until no further target remains.
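The round-by-round push along lineage relationships can be sketched as a breadth-first traversal. This is one possible reading of the flow described above, not the patented implementation; the function name, the entity identifiers and the sample lineage map are assumptions.

```python
from collections import defaultdict, deque


def push_along_lineage(lineage, source, record):
    """Push `record` from `source` to every entity reachable via lineage relationships.

    `lineage` maps a source entity name to the list of its direct target entities.
    Targets of one round are all stored before the next round begins.
    """
    store = defaultdict(list)      # target entity name -> records received
    visited = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in lineage.get(node, ()):
            store[target].append(record)
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return dict(store)


# Hypothetical lineage: a business entity feeds three entities, and the
# contact main entity in turn feeds an enterprise main entity.
lineage = {
    "business:orders": ["main:contact", "sub:address", "behavior:purchase"],
    "main:contact": ["main:enterprise"],
}
delivered = push_along_lineage(lineage, "business:orders", {"order": "A-17"})
```

In this sample the contact, address and purchase entities receive the record in the first round, and the enterprise main entity receives it in the second round via the main-entity-to-main-entity relationship.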
Specifically, the flow of the business data is displayed in a data lineage analysis chart, data problems are located by means of the data lineage analysis chart, and the data found to be affected is re-extracted or re-pushed through the upstream and downstream business data.
The visualized data lineage analysis chart shows clearly which table the data comes from, which fields and what data volume are received, and how the data flows. The data flow can therefore be understood at a glance, the root cause of a problem can be located quickly, and the affected upstream and downstream data can be re-extracted or re-pushed so that the data problem is corrected thoroughly.
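Locating the upstream sources and downstream consumers of a problematic entity amounts to a reachability query on the lineage graph. The sketch below assumes the lineage is available as a list of (source, target) edges; the edge list and helper names are hypothetical.

```python
from collections import defaultdict


def build_graph(edges, reverse=False):
    """Adjacency lists for the lineage edges, optionally with direction reversed."""
    graph = defaultdict(list)
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        graph[src].append(dst)
    return graph


def reachable(graph, start):
    """All nodes reachable from `start` (excluding unreached nodes), depth-first."""
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                stack.append(nxt)
    return order


# Hypothetical lineage edges, from source table to final main entity.
edges = [
    ("table:orders", "business:orders"),
    ("business:orders", "main:contact"),
    ("business:orders", "behavior:purchase"),
    ("main:contact", "main:enterprise"),
]

# Where could bad data in main:contact have come from?
upstream = reachable(build_graph(edges, reverse=True), "main:contact")
# Which entities must be re-extracted or re-pushed if business:orders was wrong?
downstream = reachable(build_graph(edges), "business:orders")
```

The upstream set narrows down the root cause, while the downstream set lists the entities whose data would need to be re-extracted or re-pushed.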
In one embodiment of the cross-industry data processing method, data cleaning is performed on the collected business data, the data cleaning including value replacement, length truncation, UTM value extraction and MD5 aggregation. After the business data is collected, it can be processed so that values are normalized or derived into new fields. The data conversion module provides cleaning tools such as value replacement, length truncation, UTM value extraction and MD5 aggregation, and supports extension with further cleaning tools.
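The four cleaning operations named above could look as follows in Python, using only the standard library. The function names are hypothetical, and the reading of "MD5 aggregation" as hashing the concatenation of several field values into one derived field is an assumption; the specification does not define the operation further.

```python
import hashlib
from urllib.parse import parse_qs, urlparse


def replace_value(value, mapping):
    """Value replacement: map known raw values to normalized ones, else keep as is."""
    return mapping.get(value, value)


def truncate(value, max_length):
    """Length truncation: keep at most `max_length` characters."""
    return value[:max_length]


def extract_utm(url):
    """UTM value extraction: collect utm_* query parameters from a URL."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


def md5_aggregate(*parts, separator="|"):
    """MD5 aggregation (assumed meaning): hash several field values into one key."""
    joined = separator.join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Hypothetical raw record and its cleaned form.
raw = {
    "gender": "M",
    "note": "very long free-text remark",
    "landing": "https://example.com/?utm_source=news&utm_medium=email&id=7",
    "name": "Ada",
    "phone": "100",
}
cleaned = {
    "gender": replace_value(raw["gender"], {"M": "male", "F": "female"}),
    "note": truncate(raw["note"], 9),
    **extract_utm(raw["landing"]),
    "identity_key": md5_aggregate(raw["name"], raw["phone"]),
}
```

A derived key of this kind can serve as a stable identifier for deduplication; MD5 is used here only because the text names it, not as a security measure.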
In the cross-industry data processing method, business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical affiliation with the main entity and comprises affiliated data that exists in association with the main entity; the behavior sub-entity has a logical affiliation with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity. The data management system supports the customized management of multiple business data: each business data can define its own types (fields or data relationships) and functional modules (whether labels, data rating and the like are needed). After the data is accessed from data sources in various ways, all main business data is managed together in a single data management platform or system, while the different main business data is stored in complete isolation, which improves data security. The upstream and downstream of all data uploaded to the data management platform or system can be queried through the data lineage, so that a data problem can be located quickly when it occurs; if the problem is serious, the dirty data can be cleared with one click and the data can then be re-extracted or re-pushed.
The invention helps to reduce enterprise costs, is convenient and fast to use, and improves staff efficiency; the data is stored in isolation, which meets business requirements and improves data security; a wide variety of data source types can be supported; multiple business data can each be personalized while still being managed in a unified way; the full path of the data can be traced through the data lineage; and data cleaning improves data quality and therefore accuracy.
Although the invention has been described in detail above with reference to a general description and specific examples, it will be apparent to one skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.