Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic application scenario diagram of a Cube-based data query processing method according to an embodiment of the present invention, and fig. 2 is a schematic flowchart of the method. The Cube-based data query processing method is applied to a server, which performs data interaction with a terminal. The server stores data in a multidimensional data set based on Cube technology. Multidimensional analysis, data mining and other technologies then support enterprise data analysis and decision-making from multiple angles, which improves the efficiency and performance of data query and analysis and reduces their time and resource consumption.
In this embodiment, Cube is an analysis and visualization back-end framework that can help developers quickly build and deploy multidimensional analysis applications. The background technology of Cube mainly comprises the following aspects:
Multidimensional data analysis and OLAP (online analytical processing) technology: the technical basis of Cube is mainly multidimensional data analysis and online analytical processing. Multidimensional data analysis is a data analysis method based on a multidimensional data model, and OLAP is a data processing technology based on the multidimensional data model that helps users analyze and query data from different angles. Cube is written in JavaScript, a programming language widely used in Web development that is simple, easy to learn, flexible and efficient. Cube also uses Node.js, a server-side JavaScript runtime environment based on an event-driven, non-blocking I/O model, which helps developers build high-performance network applications.
Data warehouse and business intelligence: the background art of Cube is mainly based on the technology in the fields of data warehouse and business intelligence. Data warehouse is an integrated, theme-oriented, time-varying, non-volatile data set used to support enterprise decisions, while business intelligence is a technique that converts data inside and outside an enterprise into useful information and knowledge through data analysis and data mining, etc., to help the enterprise make better decisions.
Web technology and cloud computing: Cube also uses a range of Web technologies, such as the HTTP protocol, RESTful APIs and WebSocket, and cloud computing technologies, such as AWS Lambda and Google Cloud Functions, to facilitate the construction and deployment of high-performance analysis and visualization applications.
Fig. 2 is a flow chart of a Cube-based data query processing method according to an embodiment of the present invention. As shown in fig. 2, the method includes the following steps S110 to S160.
S110, defining the dimension and index of the data to be queried.
In this embodiment, the dimensions of the data to be queried, such as time, region, product, etc., are determined. An index of data to be queried, such as sales, profits, access amounts, etc., is determined.
Specifically, according to the service requirement, dimensions and indexes are defined, wherein the dimensions represent classification attributes of the data, and the indexes represent measurement or calculation results of the data. The definition of dimensions and indices is determined for subsequent data processing and analysis.
In one embodiment, referring to fig. 3, the step S110 may include steps S111 to S112.
S111, defining service requirements and analyzing targets.
Specifically, the business requirements are determined: the problem to be solved or the goal to be achieved is made explicit, such as understanding sales trends, assessing market share, or optimizing the supply chain.
Defining an analysis target: specific analysis targets are defined according to business requirements, such as determining the most popular product category, finding out the region with the fastest sales growth, or finding out inefficient production links.
S112, defining dimensions and indexes according to the structure and the content of the data to be queried.
Specifically, dimensions and indexes are defined mainly in view of the hierarchical relationships and relevance of the data.
Understanding the structure and content of the data: the data to be queried is examined comprehensively, including its data sources, data formats, field definitions and the like.
Defining dimensions: the dimensions used for analysis, such as time, region, product and customer, are determined based on the characteristics of the data and the business requirements. The dimensions are used to slice and filter the data.
Defining indexes: according to the business requirements, the indexes to be measured and calculated, such as sales, profit, order quantity and access quantity, are determined. The indexes are used to measure and evaluate business performance.
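As a non-limiting illustration of step S110, the dimensions and indexes may be recorded as simple declarative objects and checked against the fields of the data to be queried. The following is a minimal sketch only; the class names `Dimension` and `Index`, the function `validate`, and all column names are hypothetical and are not part of Cube.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str    # classification attribute, e.g. "region"
    column: str  # hypothetical source column holding the attribute


@dataclass(frozen=True)
class Index:
    name: str         # measured quantity, e.g. "sales"
    column: str       # hypothetical source column being measured
    aggregation: str  # how the column is aggregated, e.g. "sum"


# Hypothetical definitions for a sales analysis target.
DIMENSIONS = [
    Dimension("time", "order_date"),
    Dimension("region", "region"),
    Dimension("product", "product_name"),
]
INDEXES = [
    Index("sales", "amount", "sum"),
    Index("order_count", "order_id", "count"),
]


def validate(dimensions, indexes, available_columns):
    """Return the names of definitions whose source column is missing."""
    defined = list(dimensions) + list(indexes)
    return [d.name for d in defined if d.column not in available_columns]
```

Calling `validate` with the field names found in the data source gives an early check that every dimension and index can actually be computed from the data to be queried.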
S120, connecting the Cube and the data to be queried, and configuring parameters of the data to be queried.
In this embodiment, the parameters specifically refer to connection parameters, including database addresses and authentication information.
The data to be queried is connected to the Cube so that the Cube can read the data correctly. Specifically, the Cube is connected to the data source, that is, the source of the data to be queried and processed; the data source can be a relational database, a NoSQL database or a data warehouse. Cube provides adapters and connectors that facilitate integration with different data sources. The data source connection is configured so that the required raw data can be extracted from the data source.
In one embodiment, referring to fig. 4, the step S120 may include steps S121 to S124.
S121, determining a data source adapter or connector, and configuring the determined data source adapter or connector.
Specifically, depending on the type and characteristics of the data source, an appropriate data source adapter or connector is selected, such as a database connector, an API adapter, or the like. And configuring relevant information of the adapter or the connector according to the connection parameters of the data source, such as database addresses, user names, passwords and the like.
S122, configuring connection parameters of the data to be queried.
In this embodiment, the data source connection parameters, including the database address, authentication information and the like, are configured to ensure that a connection to the data source can be established.
Specifically, according to the requirement of the data source, connection parameters of the data to be queried and processed are determined, such as data table names, field names, screening conditions and the like. The connection parameters of the data to be queried are configured into the data source adapter or connector to ensure proper connection to the data source.
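As a non-limiting illustration of steps S121 and S122, the connection parameters (database address and authentication information) may be collected from a key-value mapping and checked before a connection is attempted. This is a minimal sketch; the key names `DB_TYPE`, `DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER` and `DB_PASS`, the default port, and the function name are assumptions made for illustration and are not Cube's own configuration keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionParams:
    db_type: str   # selects the adapter or connector, e.g. "postgres"
    host: str      # database address
    port: int
    database: str
    user: str      # authentication information
    password: str


def load_connection_params(env):
    """Build connection parameters from a mapping such as os.environ."""
    required = ["DB_TYPE", "DB_HOST", "DB_NAME", "DB_USER", "DB_PASS"]
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing connection parameters: " + ", ".join(missing))
    return ConnectionParams(
        db_type=env["DB_TYPE"],
        host=env["DB_HOST"],
        port=int(env.get("DB_PORT", "5432")),  # assumed default port
        database=env["DB_NAME"],
        user=env["DB_USER"],
        password=env["DB_PASS"],
    )
```

Failing early with a list of the missing keys makes a misconfigured data source visible before any query is issued.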
S123, defining a Cube in the Cube item.
In this embodiment, a Cube is defined in a Cube item, and the definition of the name, data table or view, dimension, and index of the Cube is specified.
Specifically, a new Cube item (that is, a Cube project) is created in the data analysis platform or tool for defining and managing the data analysis model. The Cube, i.e., the data analysis model, is defined in the Cube item. The name and description of the Cube, the data source adapter or connector to which the Cube belongs, the data table or view, and the definitions of the dimensions and indexes are determined.
S124, configuring the incidence relation, the aggregation function and other optional parameters of the Cube according to the dimension and the index.
In this embodiment, the dimensions of the Cube are configured according to the service requirements and the defined dimensions: the name, type and level of each dimension and the association relations between the dimensions are determined. The indexes of the Cube are configured according to the service requirements and the defined indexes: the name, aggregation function and calculation method of each index and the association relations among the indexes are determined. Other optional parameters of the Cube, such as filtering conditions, ordering rules and permission settings, are configured as required.
Through the implementation steps, the data connection parameters can be configured according to the data source adapter or the connector, then the Cube is defined in Cube items, and the incidence relation, the aggregation function and other optional parameters of the Cube are configured according to the dimension and the index. These steps will help ensure proper connection of the data and build a data analysis model for subsequent query processing and analysis.
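As a non-limiting illustration of steps S123 and S124, a Cube definition may be thought of as a structured record holding a name, a data table or view, dimensions, indexes with their aggregation functions, association relations, and optional parameters. The following sketch uses a plain Python dictionary; its layout, the table and column names, and the function `check_cube` are hypothetical and do not reproduce Cube's actual data model syntax.

```python
ALLOWED_AGGREGATIONS = {"count", "sum", "avg", "min", "max"}

# Hypothetical definition of a Cube over an orders table.
cube_definition = {
    "name": "orders",
    "table": "public.orders",
    "dimensions": {
        "region": {"column": "region", "type": "string"},
        "order_date": {"column": "order_date", "type": "time"},
    },
    "indexes": {
        "sales": {"column": "amount", "aggregation": "sum"},
        "order_count": {"column": "order_id", "aggregation": "count"},
    },
    # Association relation with another (hypothetical) table.
    "joins": {"customers": "orders.customer_id = customers.id"},
    # Optional parameters such as default filtering conditions.
    "default_filters": [],
}


def check_cube(definition):
    """Return a list of problems found in a definition; empty means usable."""
    problems = []
    for key in ("name", "table", "dimensions", "indexes"):
        if not definition.get(key):
            problems.append(f"missing {key}")
    for name, index in definition.get("indexes", {}).items():
        if index.get("aggregation") not in ALLOWED_AGGREGATIONS:
            problems.append(f"index {name}: unsupported aggregation")
    return problems
```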
S130, preprocessing the data to be queried according to the service requirement to obtain a preprocessing result.
In this embodiment, the preprocessing result refers to data formed after performing operations such as data cleansing, data format conversion, and data aggregation on the data to be queried.
Specifically, the original data undergoes the necessary preprocessing and conversion according to the service requirements, which may include data cleansing, data format conversion, data aggregation and the like. Cube provides data pipeline and ETL capabilities that make it convenient to process and load the data.
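As a non-limiting illustration of step S130, the data cleansing and data format conversion may be sketched as a single pass over the raw records. The field names (`order_id`, `region`, `order_date`, `amount`) and the input date format `YYYY/MM/DD` are assumptions made for illustration only.

```python
from datetime import datetime


def preprocess(rows):
    """Clean raw rows: drop incomplete records, normalize formats, drop duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        # Data cleansing: skip records without an amount or a date.
        if row.get("amount") in (None, "") or not row.get("order_date"):
            continue
        # Data format conversion: ISO dates, numeric amounts, normalized text.
        record = {
            "order_id": row["order_id"],
            "region": (row.get("region") or "").strip().lower(),
            "order_date": datetime.strptime(row["order_date"], "%Y/%m/%d").date().isoformat(),
            "amount": float(row["amount"]),
        }
        # Deduplication: keep only the first record for each order_id.
        if record["order_id"] in seen:
            continue
        seen.add(record["order_id"])
        cleaned.append(record)
    return cleaned
```

The returned list plays the role of the preprocessing result that the later aggregation step consumes.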
S140, configuring a data pre-aggregation strategy in the Cube, and aggregating the pre-processing result to obtain an analysis result.
In this embodiment, according to the aggregation function and dimension in Cube definition, the data pre-aggregation policy is configured to improve query performance.
The data pre-aggregation strategy is configured in the Cube as required, which can be achieved through the aggregation functions and dimensions in the Cube definition. Cube aggregates the raw data in the background according to the predefined strategy and stores the results in the data source for subsequent quick querying and analysis.
In one embodiment, referring to fig. 5, the step S140 may include steps S141 to S143.
S141, configuring a data pre-aggregation strategy according to the aggregation function of the Cube and the dimension.
Specifically, according to the defined aggregation function of Cube, the index and aggregation method requiring pre-aggregation, such as summation, average value and the like, are determined. From the defined Cube dimensions, it is determined which dimensions need to be pre-aggregated, e.g., time, geographic location, etc. The policy of pre-aggregation is configured, e.g., by hour, by day, by week, etc., according to the aggregation function and dimensions.
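As a non-limiting illustration of step S141, a pre-aggregation strategy may be expressed as a choice of index, dimensions and time granularity, and applied as a grouped summation. In this sketch the function name `rollup`, the field `order_date`, and the granularities day, month and year (chosen because they are prefixes of an ISO date string) are assumptions; other granularities such as hour or week would need a different bucketing rule.

```python
from collections import defaultdict


def rollup(rows, dimensions, index_column, granularity="day"):
    """Sum index_column grouped by a time bucket and the given dimensions."""
    # Prefix length of an ISO date string (YYYY-MM-DD) for each granularity.
    width = {"day": 10, "month": 7, "year": 4}[granularity]
    totals = defaultdict(float)
    for row in rows:
        bucket = row["order_date"][:width]
        key = (bucket,) + tuple(row[d] for d in dimensions)
        totals[key] += row[index_column]
    return dict(totals)
```

For example, `rollup(rows, ["region"], "amount", "month")` yields one summed value per month and region, which is the kind of table a later query can read instead of scanning the raw rows.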
S142, setting a time interval and a range of data pre-aggregation according to the update frequency and the query requirement of the data to be queried.
In this embodiment, the update frequency of the data is determined according to the characteristics of the data source and the service requirement, such as every hour, every day, etc. The pre-aggregate time interval is set, e.g., hourly, daily, etc., according to the update frequency and query requirements. The scope of the pre-aggregation is set according to the query requirement, such as the last week, the last month, etc.
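As a non-limiting illustration of step S142, the time interval and the range of pre-aggregation may be held in a small settings object. The class name `PreAggregationWindow`, its field names, and the example values (hourly refresh, last seven days) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass(frozen=True)
class PreAggregationWindow:
    refresh_every: timedelta  # time interval between pre-aggregation runs
    lookback: timedelta       # range of data covered, e.g. the last week

    def is_refresh_due(self, last_run: datetime, now: datetime) -> bool:
        """True when at least one refresh interval has passed since last_run."""
        return now - last_run >= self.refresh_every

    def covers(self, day: date, today: date) -> bool:
        """True when day lies in the lookback range ending at today (inclusive)."""
        return today - self.lookback < day <= today


# Data updated hourly, queries mostly about the last week (assumed values).
window = PreAggregationWindow(refresh_every=timedelta(hours=1), lookback=timedelta(days=7))
```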
S143, in the time interval of data pre-aggregation, the pre-processing results in the range are aggregated by adopting a data pre-aggregation strategy, so that analysis results are obtained.
In this embodiment, according to the pre-aggregation strategy, the preprocessing results are pre-aggregated within the set time interval to generate pre-aggregated results. The pre-aggregated results are stored in a data warehouse or other data storage medium for subsequent queries. According to the query requirement, the analysis result is obtained from the pre-aggregated results to support business analysis and decision-making.
Through the implementation steps, the data pre-aggregation strategy can be configured according to the aggregation function and the dimension of the Cube, and the time interval and the range of the data pre-aggregation can be set according to the update frequency and the query requirement of the data to be queried. And then, in the time interval of data pre-aggregation, adopting a data pre-aggregation strategy to aggregate the pre-processing results in the range so as to obtain analysis results to support business analysis and decision.
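As a non-limiting illustration of step S143, the aggregation over the configured range and the later lookup may be sketched as two functions: one builds a stored table from the preprocessing results, and the other answers a query from that table. The function names, the field names and the summation index are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def pre_aggregate(rows, today, lookback_days):
    """Sum amounts per (day, region) for rows inside the configured range."""
    start = today - timedelta(days=lookback_days)
    table = defaultdict(float)
    for row in rows:
        day = date.fromisoformat(row["order_date"])
        if start < day <= today:  # keep only rows inside the range
            table[(row["order_date"], row["region"])] += row["amount"]
    return dict(table)


def query_total(table, region):
    """Answer a query from the stored table instead of the raw rows."""
    return sum(value for (_, r), value in table.items() if r == region)
```

Because `query_total` reads only the stored table, its cost depends on the number of (day, region) groups rather than on the number of raw records.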
In this embodiment, the time interval and range of the pre-aggregation are set according to the update frequency of the data and the query requirement. The code is as follows:
S150, developing an API interface and a query statement.
In this embodiment, the API and query languages (such as CQL) provided by Cube are used to develop the corresponding API interfaces and query statements. In this way, an application or front end can perform a query by sending an HTTP request and obtain the analysis results.
Specifically, an API interface is developed for receiving a query request and returning the corresponding result. A query statement is designed and written according to the service requirement and the Cube data model.
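As a non-limiting illustration of step S150, the handling of a query request may be sketched as a function that takes the JSON body of an HTTP request and returns a status code and a JSON response. The request shape (`index`, `filters`), the sample table `SALES` and the function name are hypothetical and do not reproduce Cube's own API or query format.

```python
import json

# Hypothetical pre-aggregated data: (month, region) -> sales.
SALES = {
    ("2024-01", "east"): 15.0,
    ("2024-01", "west"): 7.0,
    ("2024-02", "east"): 4.0,
}


def handle_query(body):
    """Handle the JSON body of a query request; return (HTTP status, JSON text)."""
    try:
        query = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "invalid JSON"})
    if not isinstance(query, dict) or query.get("index") != "sales":
        return 400, json.dumps({"error": "unknown index"})
    filters = query.get("filters", {})
    # A filter that is absent matches every value of that dimension.
    rows = [
        {"month": month, "region": region, "sales": value}
        for (month, region), value in sorted(SALES.items())
        if filters.get("region", region) == region
        and filters.get("month", month) == month
    ]
    return 200, json.dumps({"data": rows})
```

A function of this shape can be mounted behind any HTTP server; keeping it free of server-specific code makes the query logic easy to test on its own.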
S160, according to the query request acquired by the API, the analysis result is arranged in a visual and report form and is sent to the terminal, so that a visual chart and a report corresponding to the analysis result are displayed at the terminal.
In this embodiment, the Cube analysis result is integrated with a visualization tool (e.g., a chart library or a report tool) to generate a visualized analysis report. Cube provides integration support for common visualization tools, such as React and Vue.
Specifically, a visual report is created according to the service requirement by using a chart library or a report component of the visualization tool, and the analysis result of the Cube is displayed in it.
The analysis results are converted, using a visualization tool or a programming language, into visual charts and reports that are easy to understand and present. The generated visual charts and reports are sent to the terminal, so that a user can intuitively understand the analysis result.
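As a non-limiting illustration of step S160, the conversion of an analysis result into a report may be sketched with a plain-text bar chart. A real deployment would typically use a chart library at the terminal; the function name `render_report` and its layout are assumptions made only to show the data flow from analysis result to report.

```python
def render_report(title, results, width=20):
    """Render analysis results (label -> value) as a plain-text bar chart."""
    # Scale bars against the largest value; fall back to 1 to avoid dividing by zero.
    peak = max(results.values(), default=0) or 1
    lines = [title]
    for label, value in results.items():
        bar = "#" * round(width * value / peak)
        lines.append(f"{label:<8}{bar} {value:g}")
    return "\n".join(lines)
```

The returned text can be sent to the terminal as the body of the response to the query request.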
Cube index processing generates multidimensional data indexes by pre-computing and aggregating data to facilitate quick query and analysis. It can group and aggregate raw data by fact tables and dimension tables to generate a variety of indexes. Cube index processing also supports multiple aggregation functions and calculation logics, such as count, sum, average, max, min and percentile, to meet different data analysis requirements.
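As a non-limiting illustration of the aggregation functions named above, the following sketch groups fact rows by one dimension and computes count, sum, average, max, min and a percentile for each group. The nearest-rank percentile definition, the choice of the 90th percentile, and all names are assumptions made for illustration.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile of values, with p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * p / 100))
    return ordered[rank - 1]


AGGREGATIONS = {
    "count": len,
    "sum": sum,
    "average": statistics.mean,
    "max": max,
    "min": min,
    "p90": lambda values: percentile(values, 90),
}


def aggregate(fact_rows, dimension, index_column):
    """Group fact rows by one dimension and apply every aggregation function."""
    groups = {}
    for row in fact_rows:
        groups.setdefault(row[dimension], []).append(row[index_column])
    return {
        key: {name: fn(values) for name, fn in AGGREGATIONS.items()}
        for key, values in groups.items()
    }
```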
By pre-calculating and generating the index, the Cube index processing can greatly improve the efficiency and performance of data query and analysis. The method can finish a large amount of calculation work in advance, and reduces the time and resource consumption of data query and analysis. Meanwhile, the Cube index processing can also improve the accuracy and consistency of data, because the Cube index processing can ensure that all data are processed according to the same aggregation logic and calculation method.
The method of this embodiment defines the range and the target of the query by defining the data dimensions and indexes, which improves the accuracy and consistency of data processing; connecting the Cube and the data to be queried ensures the reliability and consistency of the data, so that the Cube can correctly read and process the data; preprocessing the data allows data cleaning and conversion according to service requirements, so that the data is better suited to subsequent analysis and aggregation; configuring the data pre-aggregation strategy in the Cube improves query performance and response speed and saves computing resources; developing the API interface and the query statement makes it convenient for the user to submit a query request and receive the corresponding results, thereby providing flexible and customized query functions; arranging the analysis results in the form of visualizations and reports lets the user understand the data more intuitively and find potential patterns and trends; and sending the visual charts and reports to the terminal lets the user access and share the analysis results at any time and place, which improves working efficiency and communication.
In summary, the Cube-based data query processing method can be implemented through the steps of defining data dimensions and indexes, connecting data sources, preprocessing data, configuring a pre-aggregation strategy, developing an API interface and query sentences, visually displaying and the like. The method can provide accurate, efficient, flexible and visual data query and analysis functions, and help users better understand and utilize data.
For example, the query analysis target may be: analyzing the medical record data of patients to learn the number of visits by patients of different ages across different time dimensions, and their common diseases.
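As a non-limiting illustration of this example, age group and month may serve as dimensions, and the number of visits and the most frequent diagnosis may serve as indexes. The field names (`age`, `visit_date`, `diagnosis`) and the age group boundaries are hypothetical.

```python
from collections import Counter


def age_group(age):
    """Map an age to a hypothetical age-group dimension value."""
    if age < 18:
        return "0-17"
    if age < 60:
        return "18-59"
    return "60+"


def visits_by_age_and_month(records):
    """Count visits per (age group, month)."""
    return Counter((age_group(r["age"]), r["visit_date"][:7]) for r in records)


def common_diseases(records, top=1):
    """Return the most frequent diagnoses for each age group."""
    by_group = {}
    for r in records:
        by_group.setdefault(age_group(r["age"]), Counter())[r["diagnosis"]] += 1
    return {
        group: [name for name, _ in counts.most_common(top)]
        for group, counts in by_group.items()
    }
```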
According to the Cube-based data query processing method, the dimensions and indexes are defined, the connection parameters of the data are configured, the data is preprocessed, the data pre-aggregation strategy is set, and an interface is developed for querying; after a query, the analysis result can be presented in the form of visualizations and reports. This improves the efficiency and performance of data query and analysis and reduces their time and resource consumption.
Fig. 6 is a schematic block diagram of a Cube-based data query processing apparatus 300 according to an embodiment of the present invention. As shown in fig. 6, corresponding to the Cube-based data query processing method described above, the present invention further provides a Cube-based data query processing apparatus 300. The Cube-based data query processing apparatus 300 includes units for performing the Cube-based data query processing method described above, and may be configured in a server. Specifically, referring to fig. 6, the Cube-based data query processing apparatus 300 includes a defining unit 301, a configuration unit 302, a preprocessing unit 303, an aggregation unit 304, a development unit 305, and a display unit 306.
A defining unit 301, configured to define dimensions and indexes of data to be queried and processed; the configuration unit 302 is configured to connect Cube and data to be queried and configure parameters of the data to be queried; a preprocessing unit 303, configured to preprocess the data to be queried according to the service requirement, so as to obtain a preprocessing result; an aggregation unit 304, configured to configure a data pre-aggregation policy in Cube, and aggregate the pre-processing result to obtain an analysis result; a development unit 305 for developing an API interface and a query statement; and the display unit 306 is configured to sort the analysis result in a form of visualization and report according to the query request acquired by the API interface, and send the analysis result to the terminal, so as to display a visualization chart and report corresponding to the analysis result on the terminal.
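As a non-limiting illustration of how such units may cooperate, each unit can be modeled as a callable that receives a shared context and returns an updated one, with the apparatus running the units in order. The class name, the context keys and the three example units below are hypothetical; only simplified stand-ins for units 301, 303 and 304 are shown.

```python
from typing import Callable, List


class QueryProcessingApparatus:
    """Runs unit callables in the order in which they are supplied."""

    def __init__(self, units: List[Callable[[dict], dict]]):
        self.units = units

    def run(self, context: dict) -> dict:
        for unit in self.units:
            context = unit(context)
        return context


def defining_unit(ctx):
    # Stand-in for unit 301: record the dimension and index to use.
    return {**ctx, "dimension": "region", "index": "sales"}


def preprocessing_unit(ctx):
    # Stand-in for unit 303: drop rows whose index value is missing.
    rows = [r for r in ctx["raw"] if r.get(ctx["index"]) is not None]
    return {**ctx, "rows": rows}


def aggregation_unit(ctx):
    # Stand-in for unit 304: sum the index per dimension value.
    totals = {}
    for r in ctx["rows"]:
        key = r[ctx["dimension"]]
        totals[key] = totals.get(key, 0) + r[ctx["index"]]
    return {**ctx, "analysis": totals}
```

Passing the units as a list reflects the point that the division into units is a logical one, so units can be combined or replaced without changing the apparatus itself.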
In one embodiment, as shown in fig. 7, the definition unit 301 includes a first definition subunit 3011 and a second definition subunit 3012.
A first defining subunit 3011, configured to define a service requirement and an analysis target; a second definition subunit 3012, configured to define dimensions and indexes according to the structure and content of the data to be queried.
In one embodiment, as shown in fig. 8, the configuration unit 302 includes a determination subunit 3021, a parameter configuration subunit 3022, a content definition subunit 3023, and a relationship configuration subunit 3024.
A determining subunit 3021 for determining a data source adapter or connector and configuring the determined data source adapter or connector; a parameter configuration subunit 3022, configured to configure connection parameters of the data to be queried and processed; a content definition subunit 3023 for defining Cube in Cube items; a relationship configuration subunit 3024, configured to configure the association relationship, the aggregation function, and other optional parameters of the Cube according to the dimension and the index.
In an embodiment, the preprocessing unit 303 is configured to preprocess the data to be queried according to the service requirement, so as to obtain a preprocessing result.
In one embodiment, as shown in fig. 9, the aggregation unit 304 includes: policy configuration subunit 3041, setting subunit 3042, and data aggregation subunit 3043.
A policy configuration subunit 3041, configured to configure a data pre-aggregation policy according to the aggregation function of the Cube and the dimension; a setting subunit 3042, configured to set a time interval and a range of data pre-aggregation according to an update frequency and a query requirement of the to-be-queried processing data; and the data aggregation subunit 3043 is configured to aggregate the preprocessing results in the range by adopting a data pre-aggregation policy in a data pre-aggregation time interval, so as to obtain an analysis result.
It should be noted that, as will be clearly understood by those skilled in the art, the specific implementation process of the Cube-based data query processing apparatus 300 and each unit may refer to the corresponding description in the foregoing method embodiment, and for convenience and brevity of description, the description is omitted here.
The Cube-based data query processing apparatus 300 described above may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 10.
Referring to fig. 10, fig. 10 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a server, where the server may be a stand-alone server or may be a server cluster formed by a plurality of servers.
With reference to FIG. 10, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform a Cube-based data query processing method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall computer device 500.
The internal memory 504 provides an environment for the execution of a computer program 5032 in the non-volatile storage medium 503, which computer program 5032, when executed by the processor 502, causes the processor 502 to perform a Cube-based data query processing method.
The network interface 505 is used for network communication with other devices. It will be appreciated by those skilled in the art that the structure shown in FIG. 10 is merely a block diagram of some of the structures associated with the present inventive arrangements and does not constitute a limitation of the computer device 500 to which the present inventive arrangements may be applied, and that a particular computer device 500 may include more or fewer components than shown, or may combine certain components, or may have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program 5032 stored in a memory to implement the steps of:
Defining the dimension and index of the data to be inquired; connecting Cube and data to be queried, and configuring parameters of the data to be queried; preprocessing the data to be queried according to the service requirement to obtain a preprocessing result; configuring a data pre-aggregation strategy in Cube, and aggregating the pre-processing result to obtain an analysis result; developing an API interface and a query statement; and according to the query request acquired by the API interface, arranging the analysis result in a visual and report form, and sending the analysis result to the terminal so as to display a visual chart and report corresponding to the analysis result at the terminal.
The connection parameters comprise a database address and authentication information.
In one embodiment, when the step of defining the dimensions and indexes of the data to be queried is implemented by the processor 502, the following steps are specifically implemented:
defining business requirements and analysis targets; and defining dimensions and indexes according to the structure and the content of the data to be queried.
In an embodiment, when the processor 502 implements the step of connecting Cube with the data to be queried and configuring parameters of the data to be queried, the following steps are specifically implemented:
determining a data source adapter or connector, and configuring the determined data source adapter or connector; configuring connection parameters of data to be queried and processed; defining a Cube in Cube items; and configuring the incidence relation, the aggregation function and other optional parameters of the Cube according to the dimension and the index.
In an embodiment, when the step of preprocessing the data to be queried according to the service requirement to obtain the preprocessing result is implemented by the processor 502, the following steps are specifically implemented:
preprocessing the data to be queried according to the service requirement to obtain a preprocessing result.
In an embodiment, when the processor 502 implements the step of configuring the data pre-aggregation policy in Cube and aggregating the pre-processing result to obtain the analysis result, the following steps are specifically implemented:
Configuring a data pre-aggregation strategy according to the aggregation function of the Cube and the dimension; setting a time interval and a range of data pre-aggregation according to the update frequency and the query requirement of the data to be queried; and in the time interval of data pre-aggregation, adopting a data pre-aggregation strategy to aggregate the pre-processing results in the range so as to obtain an analysis result.
It should be appreciated that in embodiments of the present application, the processor 502 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
Those skilled in the art will appreciate that all or part of the flow of the methods in the above embodiments may be accomplished by a computer program instructing the relevant hardware. The computer program comprises program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the method embodiments described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of:
Defining the dimension and index of the data to be inquired; connecting Cube and data to be queried, and configuring parameters of the data to be queried; preprocessing the data to be queried according to the service requirement to obtain a preprocessing result; configuring a data pre-aggregation strategy in Cube, and aggregating the pre-processing result to obtain an analysis result; developing an API interface and a query statement; and according to the query request acquired by the API interface, arranging the analysis result in a visual and report form, and sending the analysis result to the terminal so as to display a visual chart and report corresponding to the analysis result at the terminal.
The connection parameters comprise a database address and authentication information.
In one embodiment, when the processor executes the computer program to implement the steps of defining dimensions and indexes of the data to be queried, the steps are specifically implemented as follows:
defining business requirements and analysis targets; and defining dimensions and indexes according to the structure and the content of the data to be queried.
In an embodiment, when the processor executes the computer program to realize the step of connecting Cube with the data to be queried and configuring parameters of the data to be queried, the processor specifically realizes the following steps:
determining a data source adapter or connector, and configuring the determined data source adapter or connector; configuring connection parameters of data to be queried and processed; defining a Cube in Cube items; and configuring the incidence relation, the aggregation function and other optional parameters of the Cube according to the dimension and the index.
In an embodiment, when the processor executes the computer program to perform the preprocessing on the data to be queried according to the service requirement to obtain a preprocessing result, the following steps are specifically implemented:
preprocessing the data to be queried according to the service requirement to obtain a preprocessing result.
In one embodiment, when the processor executes the computer program to implement the step of configuring the data pre-aggregation policy in Cube and aggregating the pre-processing result to obtain the analysis result, the following steps are specifically implemented:
Configuring a data pre-aggregation strategy according to the aggregation function of the Cube and the dimension; setting a time interval and a range of data pre-aggregation according to the update frequency and the query requirement of the data to be queried; and in the time interval of data pre-aggregation, adopting a data pre-aggregation strategy to aggregate the pre-processing results in the range so as to obtain an analysis result.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. For example, the division of each unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be combined, divided and deleted according to actual needs. In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If implemented in the form of a software functional unit and sold or used as a stand-alone product, the integrated unit may be stored in a storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents substituted without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.