Movatterモバイル変換


[0]ホーム

URL:


CN118689947B - Cloud-based data processing and storage system and method - Google Patents

Cloud-based data processing and storage system and method
Download PDF

Info

Publication number
CN118689947B
CN118689947BCN202411178773.9ACN202411178773ACN118689947BCN 118689947 BCN118689947 BCN 118689947BCN 202411178773 ACN202411178773 ACN 202411178773ACN 118689947 BCN118689947 BCN 118689947B
Authority
CN
China
Prior art keywords
query
data
dimension
cloud
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411178773.9A
Other languages
Chinese (zh)
Other versions
CN118689947A (en
Inventor
杨顺
尹伟华
袁俊敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Fuiou Payment Service Ltd By Share Ltd
Original Assignee
Shanghai Fuiou Payment Service Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Fuiou Payment Service Ltd By Share LtdfiledCriticalShanghai Fuiou Payment Service Ltd By Share Ltd
Priority to CN202411178773.9ApriorityCriticalpatent/CN118689947B/en
Publication of CN118689947ApublicationCriticalpatent/CN118689947A/en
Application grantedgrantedCritical
Publication of CN118689947BpublicationCriticalpatent/CN118689947B/en
Activelegal-statusCriticalCurrent
Anticipated expirationlegal-statusCritical

Links

Classifications

Landscapes

Abstract

The application relates to a cloud-based data processing and storing method, which comprises the steps of 1) analyzing and recording query dimensions of merchants to construct a query dimension buried point table, 2) carrying out statistical analysis on the query dimension buried points of the merchants to define the query dimensions of high-frequency queries, 3) analyzing summary data in a plurality of bins, wherein the method comprises the steps of synchronizing all metadata related to the query dimensions defined as the high-frequency queries from a self-built database to each plurality of bins of the cloud through a data transmission service, executing query sql scripts related to the query dimensions defined as the high-frequency queries at each plurality of bins and carrying out analysis summary statistics on query results, 4) storing the analyzed statistical result data in a corresponding plurality of bins in an intermediate table form, and 5) associating the query dimensions defined as the high-frequency queries with corresponding intermediate tables to construct a query dimension routing table.

Description

Cloud-based data processing and storing system and method
Technical Field
The application relates to the field of data processing and storage, in particular to a scheme for automatically processing mass data and intelligently storing statistical result data according to the requirements of different business transaction queries.
Background
When the transaction inquiry platform or app inquires about transaction summary data in a specific dimension, massive total transaction table data in a self-built database is generally inquired directly, but the inquiry time is too long (current situation, reason) in this way. The database read-write pressure is easy to be excessively high, the performance is reduced, the transaction system is seriously delayed, unresponsive and even crashed, the normal operation of the underlying system is not only influenced, but also the user experience is poor.
Taking a code scanning and order receiving service scene as an example, merchants log in a query platform or query app to check the total income amount and total number of strokes in the last month or the current month, and the total income amount and total number of strokes in the last year, even the ring comparison data of the income number of strokes, and when the business is accompanied by concurrent access of multiple merchants, a large number of requests for accessing a database appear, and related dimension statistical results are queried from massive transaction data. If a large number of access requests are concurrently executed at the database, the time consumption of a single request is too long, the database connection is full, and the like, so that great read-write pressure is brought to the database.
It is therefore desirable to be able to build a new data storage form on the basis of a full database to further simplify the transaction inquiry operation and to increase the speed and efficiency of the inquiry.
Meanwhile, how to analyze and summarize the statistical result data is also a problem to be considered, if the data analysis task is put into the self-built database, the conditions of long total analysis time and drag-down of the database performance can occur, and the access of other businesses to the database is affected. It is necessary to find an additional platform environment to accomplish the task of data analysis summary statistics.
In view of the foregoing, it is desirable to provide a novel cloud-based data processing and storage system that enables fast processing and efficient data statistics analysis of transaction queries.
Disclosure of Invention
The application relates to a smart storage and analysis scheme using high-frequency query-based mass data cloud processing and storage.
According to a first aspect of the present application, there is provided a cloud-based data processing and storing method, including:
1) Analyzing and recording the query dimension of a merchant to construct a query dimension buried point table;
2) Carrying out statistical analysis on the query dimension buried points of the merchant to define the query dimension of the high-frequency query;
3) Analyzing summary data in a plurality of bins, comprising:
Synchronizing all metadata related to the query dimension defined as high-frequency query from a self-built database to each number bin of the cloud through a data transmission service;
Executing a query sql script associated with the query dimension defined as a high-frequency query at each of the bins and analyzing, summarizing and counting query results;
4) Storing the analyzed statistical result data in the form of intermediate table into correspondent several bins, and
5) The query dimension defined as a high frequency query is associated with a corresponding intermediate table to construct a query dimension routing table.
According to a second aspect of the present application, there is provided a cloud-based data processing and storage system, comprising:
A client in the front-end server configured to receive a transaction query request from a merchant;
a server node in the back-end server configured to process data queries for merchant query dimensions;
the distributed database module comprises a self-built database and a cloud database, wherein the cloud database comprises a real-time number bin and an offline number bin;
a data analyzer configured to perform the cloud-based data processing and storage method of the first aspect.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Drawings
In order to describe the manner in which the above-recited and other advantages and features of the invention can be obtained, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
FIG. 1 shows a schematic system hierarchy of a conventional transaction querying platform without the use of the inventive arrangements.
Fig. 2 shows a transaction inquiry process flow based on the transaction inquiry system in fig. 1.
FIG. 3 illustrates a schematic system hierarchy of a cloud-based data processing and storage system according to one embodiment of the application.
Fig. 4 shows a schematic flow of a method of cloud-based data processing and storage according to an embodiment of the application.
Detailed Description
The application can be applied to the field of real-time extremely high frequency access to massive data, such as industries of financial payment, stock quotation, futures transaction and the like. For convenience, in the following examples, a transaction environment of the payment industry is mainly described as an example.
For ease of understanding and comparison, a schematic system hierarchy of a conventional trading platform that does not use the scheme of the present application is first presented in FIG. 1.
As shown, the graph can be divided into three layers from top to bottom, namely, a front-end server (i.e., user layer), a back-end server (i.e., business layer), and a distributed database (i.e., data layer, which can also be referred to as a distributed database module). Among them, the front-end server (user layer) includes clients and devices thereof that can be used by various merchants, such as smartphones, personal computers, tablets, notebooks, etc. on which the clients are installed. The back-end server (business layer) includes a plurality of business processing systems for processing various transaction queries. While the distributed database module (data layer) includes a distributed database that stores the full amount of transaction data.
A conventional transaction processing flow based on the conventional transaction platform of fig. 1 is shown in fig. 2. Firstly, a user at a user layer initiates a query to a front-end server through a query interface of a client, the front-end server distributes query conditions in a query request to a back-end server, the back-end server directly queries a transaction data table according to the query conditions, and statistical result data is returned to the query interface in a primary way.
As can be readily seen from fig. 1 and 2, all query requests are issued directly to the database and massive amounts of transaction data are queried. Therefore, when massive concurrent query requests exist, the database read-write pressure is overlarge, the performance is reduced, and even the risk of death or downtime occurs.
Thus, the above-described problem can be solved if statistics including only a small number of fields required for the platform query dimension can be summarized from the data of the transaction table in advance and stored for query.
Therefore, a solution is needed to collect in advance from the data of the transaction table a statistics data containing only a small number of fields required for the platform query dimension, to be queried by the merchant or the user. For example, the merchant wants to view transaction data collected on the same day/month, such as total amount of money, total amount of money and the like, or according to different transaction types, such as WeChat, payment treasures, cloud flash payment and the like.
For this reason, the technical scheme of the application firstly uses the concept of 'several bins' to store data.
A data warehouse, or data warehouse (hereinafter also referred to simply as "data warehouse"), is a theme-oriented, integrated, relatively stable, data set that reflects historical changes, primarily for supporting decision-making in management. It is generated for further mining data resources and meeting decision-making needs on the basis that the database already exists in a large number. The data in the data warehouse is obtained by systematic processing, summarizing and sorting on the basis of extracting and cleaning the original scattered database data, and inconsistency in the source data must be eliminated so as to ensure that the information in the data warehouse is consistent global information about the whole enterprise. The data of the data warehouse is mainly used for decision analysis of enterprises, the related data operation is mainly data query, and once certain data enter the data warehouse, the data is generally reserved for a long time. The data in the data warehouse typically contains historical information from which quantitative analysis and predictions of the development history and future trends of the enterprise can be made.
Alternatively, the bins may be divided into real-time bins and offline bins. Both real-time and offline data bins are different types of data warehouses that are used to store and manage enterprise data, but they differ significantly in the time, speed, and purpose of data processing and use.
An offline data store (Offline Data Warehouse) is a system for storing and processing batch data. It is characterized in that the processing and analysis of data is based on batch processing jobs, typically in units of longer time periods. The main application scenarios of the offline digital warehouse include applications requiring historical data analysis and report generation, such as sales reports, month and year financial reports, and the like, and applications with large data volume and non-critical processing time.
Thus, the offline bins include the following features:
Batch processing-off-line number bin processes data by batch processing job, which means that data is collected, stored for a period of time and then processed once.
High capacity offline bins are typically designed to store large amounts of historical data.
The delay is high and off-line bins are not suitable for applications requiring real-time or near real-time data because the data processing is batch.
On the other hand, real-time digital (Real-time Data Warehouse) is a system for storing and processing Real-time data. Its main feature is that the processing and analysis of data is performed instantaneously, and the data enters several bins almost instantaneously and can be used for analysis and decision immediately. The application scene of the real-time digital bin mainly comprises applications needing to monitor business indexes in real time, such as financial transaction signboards, real-time sales reports, online advertisement putting analysis and the like, and applications needing to immediately take actions to cope with real-time events, such as abnormal monitoring large screens, fraud real-time detection and the like.
Thus, the real-time digital bin includes the following features:
low latency-real time bins are capable of capturing and processing data quickly after it is generated, typically at a speed on the order of seconds or sub-seconds.
Data stream processing-real-time bins typically use stream processing techniques to process the data, which allow the data to be converted and calculated immediately upon entry into the warehouse.
Real-time analysis the data can be used for real-time monitoring, dashboards, predictions and decision support.
High throughput-real-time bins require processing of a large number of data streams and thus require high throughput performance.
Complexity the architecture and techniques of real-time bins are often complex due to the need to process real-time data streams.
The real-time digital warehouse is applicable to applications requiring real-time monitoring of business indexes, such as financial transaction signboards, real-time sales reports, online advertisement putting analysis and the like. Immediate action is required to cope with real-time event applications such as anomaly monitoring large screens, fraud real-time detection, etc.
To overcome the above problem of all query requests collectively querying a huge amount of transaction table data,
The application provides an intelligent storage and analysis processing scheme for carrying out cloud number bin analysis and summarization on high-frequency query dimension data. In other words, the scheme of the application meets different transaction data processing requirements by simultaneously introducing the real-time data bins and the off-line data bins into the transaction data processing and storing system, thereby fully utilizing the respective advantages thereof and realizing quick and efficient transaction inquiry.
First, a schematic hierarchical structure of a data processing and storage system based on a query cloud database (also referred to simply as "cloud") according to the present application is shown in fig. 3.
The hierarchy includes:
A client in the front-end server, which may be a mobile device, PC, tablet or notebook, etc. with a corresponding program or APP installed, is configured to receive transaction query requests from merchants.
A server node in the back-end server configured to process data queries for merchant query dimensions;
Databases in the (distributed) database include a self-built database and a cloud database. The self-built database is a database built by an enterprise, is a database where the original sql script and the meta-table are located, and is used for storing all data related to transactions and executing query operation. Unlike the operation of the self-built database in the traditional technology for executing a large amount of transaction inquiry on mass data, the method and the device have the advantages that the sql script corresponding to some high-frequency inquiry dimensions is put into a plurality of bins from the self-built database to be executed, the statistical result data executed in the plurality of bins is stored in an intermediate table, and the intermediate table can be inquired by the next same inquiry dimension without inquiring in the self-built database. Thus, the primary responsibility of the self-built database of the present application is to act as a backup store and not to frequently perform transaction inquiry operations. While most query operations (operations in the high frequency query dimension) are transferred to the cloud database. The cloud database may be an online number bin provided by a third party platform and configured to store the statistical result data after the number bin analysis and to execute a corresponding query. According to different characteristics of the stored statistics data, the cloud database may be further divided into a real-time data bin and an offline data bin, wherein the offline data bin is configured to store data with larger data volume and relatively lower query frequency in the statistics, and the real-time data bin is configured to store data with smaller data volume and relatively higher query frequency in the statistics.
In addition to the above structures, the transaction inquiry platform system further comprises a data analysis layer and a data analyzer therein, wherein the data analyzer is configured to perform buried point and statistical analysis on inquiry requests of merchants, and perform analysis and summarization processing on massive transaction data in corresponding bins. The specific implementation of the method flow of the data analyzer will be described in detail below.
Overall, the cloud-based data processing and storing method of the present application generally includes the following steps:
1. a query dimension buried point table is constructed to record a query dimension (query condition), a query sql script and an associated meta-table, wherein the query dimension buried point is also the basis for defining a high-frequency query dimension later.
2. And determining that certain query dimensions are high-frequency query dimensions according to the data records in the query dimension buried table. Transferring the sql script of each high-frequency query dimension to a plurality of bins (a real-time plurality of bins and an offline plurality of bins) for execution; and storing statistical result data obtained by executing the sql script into an intermediate table, wherein different intermediate tables are constructed by different sql scripts.
3. And carrying out one-to-one association on the query dimension, the table names of the intermediate table and the data sources of the intermediate table so as to construct a query dimension routing table.
After the flow of the method is finished, when the merchant of the user layer initiates the query again, the query dimension in the query request can be matched with the query dimension routing table. If the match is successful, the query request may be directed to execute on a corresponding intermediate table on the data source recorded by the query dimension routing table to save time and improve efficiency. If no match is successful, the query request is performed on the metadata in the self-built database, as in the conventional query.
Fig. 4 is a schematic flow chart of the cloud-based data processing and storing method of the present application.
And step 1, analyzing and recording the query dimension of the merchant to construct a query dimension buried point table.
First, merchants at the user layer initiate query requests through clients on a query platform, which are forwarded by the clients to a backend server. The server node at the back-end server collects the merchant's query terms, such as querying total revenue for a day, paying treasures total revenue for a day, etc., from the received query request. And then, the server node carries out correlative record on the collected query conditions and the date of the current day, simultaneously stores the execution sql script of the current query, records the execution times of the current query on the current day and the time consumption of the current query, and is used for statistics and analysis of the high-frequency query dimension.
These recorded data may be referred to as buried points of the query dimension. Moreover, the data to be recorded may be stored in the form of a table (which may be referred to as a "query dimension buried point table"), as shown in table 1:
TABLE 1
As shown in table 1, the query dimension embedded table records a query dimension (query_condition), and the corresponding query sql script and the associated meta table are also the basis for defining the high-frequency query dimension later.
It should be understood that the fields and field names in the table are exemplary only and not limiting. The technician can add, delete or adjust the query dimension according to the actual requirement to obtain the required query dimension embedded point table, which belongs to the scope of the application to be protected.
By constructing the query dimension buried point table, the subsequent statistical analysis of the high-frequency query dimension is facilitated.
And 2, carrying out statistical analysis on the query dimension buried points of the merchant to define a high-frequency query dimension.
The statistical analysis includes counting high frequency query dimensions among the query dimensions. The high frequency query dimension, as the name implies, refers to the query dimension that is queried more frequently. With the query dimension buried point table constructed in step 1, a query dimension which is determined to be relatively high in query consumption and high in query frequency can be defined as a high-frequency query dimension based on buried point data collected every day. Higher query time consumption means that the statistical result data amount is larger, and higher query frequency means that the number of times of query is larger.
Thus, again taking Table 1 as an example, the definition criteria for the high frequency query dimension may be as follows:
1. The time consumption is less than or equal to 1000 milliseconds (query_time is less than or equal to 1000 ms), and the query times of the day is less than or equal to 10000 times (query_count is less than or equal to 10000 times), and the time is not defined as high-frequency query;
2. The time consumption is less than or equal to 1000 milliseconds, the number of queries in the same day is more than or equal to 10000, and the time consumption is not defined as high-frequency query;
3. the time consumption is more than 1000 milliseconds, the number of queries in the same day is less than or equal to 10000, and the time is not defined as high-frequency query;
4. takes more than 1000 milliseconds, and the number of queries per day is more than or equal to 10000, which is defined as high-frequency query.
It should be appreciated that the above time-consuming threshold of 1000 milliseconds and the number of queries threshold of 10000 times are given as examples. In practical applications, technicians may adjust the transaction amount, server processing capacity, network bandwidth, and other factors as appropriate. Such modifications are within the scope of the present application.
For the query dimension defined as the high-frequency query dimension, the corresponding query sql script needs to be put into a number bin to be executed asynchronously, and data analysis and summarization processing are performed.
And 3, analyzing the summarized data in a plurality of bins.
Firstly, all metadata related to a query dimension defined as a high-frequency query are synchronized from a self-built database to a plurality of bins through a data transmission service, wherein the metadata refer to table data (generally a plurality of tables are associated, such as a transaction table, a merchant table and the like) related to an original sql script, and the synchronization can be divided into:
synchronizing metadata with high real-time requirements, recent metadata and relatively small data volume into a real-time digital bin;
And synchronizing the metadata with low real-time requirements, history and relatively large data volume into the offline data bin.
Where "recent" generally refers to data of the present year or the last year (last 12 months). In practical application, most of transaction inquiry of scenes is aimed at recent data, and the metadata amount is small.
Where "historical" generally refers to data over a year, the amount of metadata in this case is large. In practical applications, there will of course be transaction queries that query transactions several years ago, such as transaction summary queries within 5 years, 10 years, which are slow to analyze, but the query scenario is less frequent.
Next, a query sql script associated with a query dimension defined as a high frequency query is executed at each bin and the query results are analyzed, summarized and counted.
By transferring the sql script query to the plurality of bins, the advantage of the data processing performance of the plurality of bins can be fully utilized, so that the sql execution speed is faster.
Step 4, storing the statistical result data after the analysis of the number bins into the corresponding number bins of the cloud database, wherein:
The data with larger data quantity of the statistical result and relatively lower query frequency are stored in an offline data bin;
the data quantity of the statistical result is small, and the data with relatively high query frequency is stored in a real-time number bin;
For the statistical result data, also called intermediate table data, different scripts obtain different intermediate tables, where the intermediate tables include query dimensions of merchants and required query results, such as merchant numbers, transaction dates, payment modes, total amounts, and the like. The data format can be referred to in the following table 2:
TABLE 2
Also, it should be understood that the fields and field names in the table are exemplary only and not limiting. The technician can add, delete or adjust the query dimension and the query result in the table according to the actual requirement, which belong to the scope of the application to be protected.
The intermediate table corresponding to the high-frequency query dimension is constructed according to the statistical result data after the sql script execution of different high-frequency query dimensions, and massive metadata table data can be replaced by the intermediate table to execute the query so as to improve the query efficiency.
Step 5, associating the query dimension defined as the high frequency dimension with the corresponding intermediate table to construct a query dimension routing table
The query dimension defined as the high-frequency query, the table name of the constructed corresponding intermediate table and the data source (such as real-time number bin or offline number bin) where the intermediate table is located are associatively and uniformly placed in a query dimension routing table to maintain association relation, and an example of the query dimension routing table can refer to table 3:
TABLE 3 Table 3
By using the query dimension routing table, a query request aiming at a high-frequency query dimension can be conveniently routed to a data source where a corresponding intermediate table is located to execute a script on the intermediate table, so that mass queries on a self-built database are reduced.
For each query dimension defined as a high-frequency query, steps 3-5 are sequentially executed to construct a corresponding query dimension routing table for the query dimension of each high-frequency query, so that the schematic flow of the cloud-based data processing and storing method is basically ended.
After the cloud-based data processing and storing method is used for realizing the characteristics of real-time number bins, offline number bins, query dimension embedded point tables, intermediate tables, query dimension routing tables and the like, when a subsequent query request is received, a server side can process a merchant query request according to the following flow, namely, a back-end system judges whether the data of the intermediate tables need to be read or not by verifying the query conditions in the query request, and the method comprises the following steps:
If no matched query dimension routing table exists, the query is carried out without querying intermediate table data, and the flow is the original logic, namely the data such as transaction tables are directly queried in a self-built database;
If the matched query dimension routing table exists, the query can be realized by querying the intermediate table data, namely, the corresponding intermediate table name and the data source where the intermediate table name is located are acquired from the matched query dimension routing table, the data source is connected, and the query on the intermediate table is directly executed on the data source so as to replace the massive data in the query self-built database.
Because the intermediate table data only contains the statistical result data of the high-frequency query dimension, the data volume is far smaller than the metadata in the self-built database. Thus, the query speed and efficiency based on the intermediate table data are far greater than corresponding operations performed on the self-built database.
While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be understood by those of ordinary skill in the relevant art(s) that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims. Thus, the breadth and scope of the present invention as disclosed herein should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (7)

Translated fromChinese
1.一种基于云端的数据处理及存储的方法,包括:1. A cloud-based data processing and storage method, comprising:1)分析并记录商家的查询维度以构建查询维度埋点表;1) Analyze and record the merchant's query dimensions to build a query dimension tracking table;2)对商家的查询维度埋点进行统计学分析以定义高频查询的查询维度;2) Conduct statistical analysis on the merchant’s query dimension embedding points to define the query dimensions of high-frequency queries;3)在数仓中分析汇总数据,包括:3) Analyze and summarize data in the data warehouse, including:将定义为高频查询的所述查询维度所涉及到的所有元表数据通过数据传输服务从自建数据库同步到所述云端的各数仓中;Synchronize all meta-table data involved in the query dimension defined as a high-frequency query from the self-built database to each data warehouse in the cloud through the data transmission service;在所述各数仓处执行与定义为高频查询的所述查询维度相关联的查询sql脚本并对查询结果进行分析汇总统计;Executing, at each data warehouse, a query SQL script associated with the query dimension defined as a high-frequency query and analyzing and summarizing the query results;4)将分析后的统计结果数据以中间表形式保存到对应数仓中;以及4) Save the analyzed statistical results in the form of an intermediate table to the corresponding data warehouse; and5)将定义为高频查询的所述查询维度与对应的中间表进行关联以构建查询维度路由表;5) associating the query dimension defined as a high-frequency query with the corresponding intermediate table to construct a query dimension routing table;其中,所述方法还包括:Wherein, the method further comprises:在接收到后续查询请求时,通过验证本次查询请求中的查询条件,来判断是否需要读取所述中间表的数据,包括:When a subsequent query request is received, the query condition in the query request is verified to determine whether the data in the intermediate table needs to be read, including:如果不存在匹配的查询维度路由表,则所述查询请求不需要查询中间表的数据,直接在所述自建数据库中进行交易查询;If there is no matching query dimension routing table, the query request does not need to query the data in the intermediate table, and directly performs transaction query in the self-built database;如果存在匹配的查询维度路由表,则从匹配的所述查询维度路由表中获取对应的中间表表名和所述中间表所在的数据源,连接所述数据源,并在所述数据源上直接执行对所述中间表的查询。If there is a matching query dimension routing table, the corresponding intermediate table name and the data source where the intermediate table is located are obtained from the matching query dimension routing table, the data source is connected, and the query on the intermediate table is directly executed on the data source.2.如权利要求1所述的方法,其特征在于,所述数仓包括实时数仓和离线数仓。2. The method according to claim 1, characterized in that the data warehouse includes a real-time data warehouse and an offline data warehouse.3.如权利要求1所述的方法,其特征在于,所述查询维度埋点表记录了查询条件,用来记录所述查询sql脚本及所关联的元表。3. The method as claimed in claim 1 is characterized in that the query dimension tracking table records the query conditions and is used to record the query SQL script and the associated meta table.4.如权利要求1所述的方法,其特征在于,所述中间表包含了商家的查询维度和所需要的查询结果。4. The method as claimed in claim 1 is characterized in that the intermediate table contains the merchant's query dimensions and the required query results.5.如权利要求1所述的方法,其特征在于,所述查询维度路由表包括定义为高频查询的所述查询维度、对应的中间表表名和所述中间表所在数据源。5. The method as claimed in claim 1 is characterized in that the query dimension routing table includes the query dimension defined as a high-frequency query, the corresponding intermediate table name and the data source where the intermediate table is located.6. 如权利要求1所述的方法,其特征在于,针对每个定义为高频查询的所述查询维度,都执行步骤3)- 5)。6. The method according to claim 1, characterized in that steps 3) to 5) are performed for each query dimension defined as a high-frequency query.7.一种基于云端的数据处理及存储系统,包括:7. A cloud-based data processing and storage system, comprising:在前端服务器中的客户端,被配置用于接收来自商家处的交易查询请求;The client in the front-end server is configured to receive a transaction query request from a merchant;在后端服务器中的服务器节点,被配置用于处理针对商家查询维度的数据查询;A server node in a backend server, configured to process data queries for a merchant query dimension;分布式数据库模块,包括自建数据库和云端数据库,其中,所述云端数据库包括实时数仓和离线数仓;Distributed database module, including self-built database and cloud database, wherein the cloud database includes real-time data warehouse and offline data warehouse;数据分析器,被配置用于执行如权利要求1所述的基于云端的数据处理及存储的方法。A data analyzer is configured to execute the cloud-based data processing and storage method as claimed in claim 1.
CN202411178773.9A2024-08-272024-08-27 Cloud-based data processing and storage system and methodActiveCN118689947B (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN202411178773.9ACN118689947B (en)2024-08-272024-08-27 Cloud-based data processing and storage system and method

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN202411178773.9ACN118689947B (en)2024-08-272024-08-27 Cloud-based data processing and storage system and method

Publications (2)

Publication NumberPublication Date
CN118689947A CN118689947A (en)2024-09-24
CN118689947Btrue CN118689947B (en)2024-11-29

Family

ID=92772207

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN202411178773.9AActiveCN118689947B (en)2024-08-272024-08-27 Cloud-based data processing and storage system and method

Country Status (1)

CountryLink
CN (1)CN118689947B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN112784136A (en)*2019-11-052021-05-11阿里巴巴集团控股有限公司Data query method and device and electronic equipment
CN116610680A (en)*2023-07-202023-08-18上海富友支付服务股份有限公司Gao Pinku and data hierarchical storage and query method and system using high frequency libraries

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US7424470B2 (en)*2004-05-112008-09-09Sap AgLocal data repository generation
US7546312B1 (en)*2005-09-232009-06-09Emc CorporationSystem and methods for modeling a report query database
CN105893827B (en)*2016-04-292018-12-28宇龙计算机通信科技(深圳)有限公司A kind of applicative notifications message inspection method and terminal
CN109377397A (en)*2018-11-072019-02-22中国平安财产保险股份有限公司 Insurance business policy review method, device, computer equipment and storage medium
CN110658906A (en)*2019-08-302020-01-07华为技术有限公司Display method and electronic equipment
CN115801543B (en)*2022-11-292025-01-07北京伽睿智能科技集团有限公司High concurrency data link buried point log analysis method, device and system
CN115829699A (en)*2022-12-072023-03-21哈尔滨哈银消费金融有限责任公司Full-flow lean operation system based on mass data and index system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN112784136A (en)*2019-11-052021-05-11阿里巴巴集团控股有限公司Data query method and device and electronic equipment
CN116610680A (en)*2023-07-202023-08-18上海富友支付服务股份有限公司Gao Pinku and data hierarchical storage and query method and system using high frequency libraries

Also Published As

Publication numberPublication date
CN118689947A (en)2024-09-24

Similar Documents

PublicationPublication DateTitle
US11868373B2 (en)Method and apparatus for monitoring an in-memory computer system
CN113094434B (en) Database synchronization method, system, device, electronic device and medium
JP5421269B2 (en) Non-overlapping ETL-less system and method for reporting OLTP data
CN109034993A (en)Account checking method, equipment, system and computer readable storage medium
CN111640040A (en)Power supply customer value evaluation method based on customer portrait technology and big data platform
US10417265B2 (en)High performance parallel indexing for forensics and electronic discovery
CN110489699A (en)A kind of asynchronous data collection method and system
CN102156933A (en)Method and counting system for counting electronic commerce transaction data
CN114880405A (en)Data lake-based data processing method and system
WO2024230735A1 (en)Data processing method, electronic device, storage medium, and apparatus
CN118069712A (en)Data life cycle management method and device, electronic equipment and storage medium
CN116910079A (en)Method, system, device and storage medium for realizing delay association of Flink with respect to CDC data dimension table
CN114897613A (en)Abnormal transaction behavior detection method and system, electronic device and storage medium
CN118689947B (en) Cloud-based data processing and storage system and method
CN116610680B (en)Gao Pinku and data hierarchical storage and query method and system using high frequency libraries
CN114090686B (en) A method and device for accelerating payment
CN114519059B (en)Data processing method, device, electronic equipment and storage medium
CN116361340A (en)Detail query method and device, electronic equipment and storage medium
CN115718690A (en)Data accuracy monitoring system and method
Morozova et al.Approaches to Organization of Change Tables for Real-Time Business Intelligence.
US20250190499A1 (en)Method and system for data archiving and retrieval in distributed search and analytics environment
CN112131302B (en)Commercial data analysis method and platform
CN120561158A (en)Data aggregation method, device, medium and equipment
CN119646083A (en) Data analysis system and analysis method
CN114116421A (en)User habit analysis method and system based on ELK

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
GR01Patent grant
GR01Patent grant

[8]ページ先頭

©2009-2025 Movatter.jp