Disclosure of Invention
The invention aims to provide a multi-database data processing method and system, so as to solve the technical problem that the prior art lacks a multi-database arrangement whose data storage and screening functions can meet different data processing requirements.
In order to solve the technical problems, the invention specifically provides the following technical scheme:
A multi-database data processing method comprises the following steps:
step 100, dividing each database into a plurality of subunits, wherein each subunit is used for storing the data acquired on one day, and the plurality of subunits of each database store the terminal data acquired in one year in the form of a two-dimensional plane coordinate system;
step 200, adding a year time dimension on the basis of the two-dimensional plane coordinate system, and establishing a three-dimensional storage system of a plurality of databases, wherein the plurality of databases are arranged in the three-dimensional storage system in a stacked manner and store, in the stacking sequence, the terminal data acquired in successive years;
step 300, distinguishing a starting subunit and a terminating subunit of each database for storing data, wherein the terminating subunits and the starting subunits of two adjacent databases respectively establish a one-way transfer relationship according to a stacking sequence, and the data stored in a plurality of subunits of each database establish a two-way transfer relationship according to a time sequence;
step 400, setting the data processing order of all databases according to the stacking sequence of the three-dimensional storage system, screening the subunits of each database in turn according to the bidirectional transfer relationship, and proceeding from one database to the next according to the unidirectional transfer relationship.
As a preferred aspect of the present invention, in step 100, each database is first split into a plurality of stack storage units according to the month of each year, and the space capacities of the plurality of stack storage units are complementary to each other to achieve that the storage capacity of each stack storage unit is expandable, each stack storage unit is divided into a plurality of sub-units according to the data capacity collected each day, and the space capacities of the plurality of sub-units are complementary to each other to achieve that the storage capacity of each sub-unit is expandable.
As a preferred scheme of the present invention, the stack storage units of each database correspond to the Y axis of the two-dimensional plane coordinate system, the subunits of each stack storage unit correspond to the X axis of the two-dimensional plane coordinate system, and one year of data of each database is tiled on a two-dimensional plane located at a distinct time-dimension height in the three-dimensional storage system;
each coordinate point of the two-dimensional plane coordinate system is used for storing the daily data collected by one subunit, and, at the time point where one day ends and the next day begins, the stack storage unit automatically closes the current subunit and automatically generates the next subunit.
As a preferred aspect of the present invention, in step 200, data of each year is stored in a plane in different time dimensions, and data of a plurality of years are sequentially queried in the three-dimensional storage system according to the time dimension and the position of the origin of the two-dimensional plane coordinate system;
and the three-dimensional storage system provides a data retrieval entrance, at the position where the X-axis coordinate is zero, for the data stored in each row of stack storage units, and performs data retrieval on the plurality of databases simultaneously according to a retrieval condition in a synchronous insertion traversal mode, so as to process the data rapidly.
As a preferred scheme of the present invention, in step 400, each database generates two data traversal modes according to the bidirectional transfer relationship, namely a data sequence traversal mode and a forward and reverse order combined traversal mode;
the data sequence traversal mode is used for retrieving according to the stacking sequence of the databases in the three-dimensional storage system and the storage time sequence of each database, correspondingly outputting corresponding retrieval data according to the sequence of the acquired terminal data and directly carrying out data processing related to the time sequence;
the forward and reverse order combined traversal mode takes the starting subunit and the terminating subunit of each database as two traversal starting points, performs data retrieval on the subunits of each database from these starting points according to the bidirectional transfer relationship, and, after the retrieval of one database is finished, performs data retrieval on the next database in the same mode according to the stacking sequence of the databases in the three-dimensional storage system, so that the data of the databases are output database by database, and the data of each database is divided into two parts: forward-order data and reverse-order data.
As a preferred scheme of the present invention, the data retrieval duration increases in the order of the synchronous insertion traversal mode, the forward and reverse order combined traversal mode, and the data sequence traversal mode, while the complexity of the subsequent time-sequence-related data processing decreases in the same order.
As a preferred scheme of the present invention, a specific implementation method of the data sequence traversal pattern is as follows:
determining a first stored database according to the stacking sequence of all databases, and performing data processing on a plurality of databases according to a fixed sequence after determining the first stored database;
performing data retrieval on the first stored database in one direction, from the starting subunit to the terminating subunit;
and when the retrieval reaches the terminating subunit, determining the next database for data retrieval according to the one-way transfer relationship of the plurality of databases, so as to realize data retrieval in the data sequence traversal mode.
As a preferred scheme of the present invention, a specific implementation method of the forward and reverse order combined traversal mode is as follows:
determining a first stored database according to the stacking sequence of all databases, and performing data processing on a plurality of databases according to a fixed sequence after determining the first stored database;
taking the starting subunit and the terminating subunit of the first stored database as two traversal starting points, and performing data retrieval from the two traversal starting points in opposite directions according to the bidirectional transfer relationship;
and when the two traversals reach the same subunit and the traversed data overlap, switching to the next database according to the fixed sequence to perform the bidirectional traversal operation.
As a preferred scheme of the present invention, the data output by the synchronous insertion traversal mode, the forward and reverse order combined traversal mode, and the data sequence traversal mode all carry a tag; the content of the tag is the three-dimensional coordinate value of the subunit where the data is located, so that the acquisition time of the output data can be read directly from the tag.
In order to solve the above technical problems, the present invention further provides the following technical solutions: a data processing system for a method of processing multiple database data, comprising:
databases, wherein the capacity of each database is used for storing the data generated by a terminal in one year, each database is provided with a plurality of stack storage units, each stack storage unit is divided into a plurality of subunits, each subunit is used for storing the terminal data acquired on one day, and each stack storage unit is used for storing the terminal data acquired in one month;
the three-dimensional storage system takes the collection days, the collection months and the collection years as an X axis, a Y axis and a Z axis respectively, the subunits of each database perform plane storage according to a two-dimensional plane coordinate system corresponding to the collection days and the collection months, and the databases are sequentially distributed on coordinate points of each year in an up-and-down stacking mode;
and the data traversing module comprises a synchronous insertion traversal mode, a forward and reverse order combined traversal mode and a data sequence traversal mode, and performs data retrieval operations on the terminal data of each database in the three-dimensional storage system according to the acquisition time, in the manner defined by the selected traversal mode.
Compared with the prior art, the invention has the following beneficial effects:
the mode of the multi-database for storing the data supports various data screening operations, the storage is simple and orderly, the number of traversal entries is large, the traversal operation speed is high, and a user selects a data traversal mode according to specific requirements, so that the data screening speed can be improved or the time sequence regularity of the screened data during output can be increased.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 and fig. 2, when data stored in a plurality of databases is processed, the present invention improves the efficiency of data processing mainly by improving the efficiency of data retrieval through data association; the storage order of the data is also closely related to the efficiency of subsequent data processing, and affects both the time and the difficulty of that processing.
In view of the above, the present embodiment provides a multi-database storage schema that affects data processing, and by designing a schema for data storage and a schema for data retrieval, the efficiency of data processing and the basic steps of data processing are adjusted.
The multi-database data processing system comprises: a database 1, a three-dimensional storage system 2 and a data traversing module 3.
The capacity of each database 1 is used for storing the data generated by a terminal in one year, each database 1 is divided into a plurality of stack storage units, each stack storage unit is divided into a plurality of subunits, each subunit is used for storing the terminal data acquired on one day, and each stack storage unit is used for storing the terminal data acquired in one month;
the three-dimensional storage system 2 takes the collection days, the collection months and the collection years as an X axis, a Y axis and a Z axis respectively, the subunits of each database 1 perform plane storage according to a two-dimensional plane coordinate system corresponding to the collection days and the collection months, and the databases are sequentially distributed on coordinate points of each year in an up-and-down laminated mode;
the data traversing module 3 comprises a synchronous insertion traversal mode, a forward and reverse order combined traversal mode and a data sequence traversal mode, and performs data retrieval operations on the terminal data of each database 1 in the three-dimensional storage system 2 according to the acquisition time, in the manner defined by the selected traversal mode.
In addition, as shown in fig. 3, the multi-database data processing method includes the steps of:
step 100, dividing each database into a plurality of subunits, wherein each subunit is used for storing data collected every day, and the plurality of subunits of each database store terminal data collected every year in a two-dimensional plane coordinate system mode.
In step 100, each database is first split into a plurality of stack storage units according to the month of each year, the space capacities of the plurality of stack storage units are complementary to each other so as to achieve the expandable storage capacity of each stack storage unit, each stack storage unit is divided into a plurality of sub-units according to the data capacity acquired each day, and the space capacities of the plurality of sub-units are complementary to each other so as to achieve the expandable storage capacity of each sub-unit.
That is, under the condition that the total space capacity of a database is constant, the capacity of each stack storage unit is unequal and can be adjusted according to the storage data capacity of each month, and the capacity of the sub-units in each stack storage unit is unequal and is adjusted according to the storage data capacity of each day, so that the redundant capacity of the database can be recycled.
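The capacity sharing described above can be illustrated with a minimal sketch. It assumes a fixed total capacity per stack storage unit from which each daily subunit draws as data arrives; the names `CapacityPool`, `free` and `reserve` are hypothetical and are not taken from the embodiment.

```python
class CapacityPool:
    """Fixed total capacity shared by the subunits of one stack storage unit (hypothetical model)."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.used = {}  # day number -> units of capacity consumed by that subunit

    def free(self) -> int:
        # Capacity not yet claimed by any subunit.
        return self.total - sum(self.used.values())

    def reserve(self, day: int, amount: int) -> bool:
        # A subunit may grow only while the shared pool still has room.
        if amount > self.free():
            return False
        self.used[day] = self.used.get(day, 0) + amount
        return True


pool = CapacityPool(total=100)
ok_small = pool.reserve(day=1, amount=10)   # a light day uses little space
ok_large = pool.reserve(day=2, amount=80)   # a heavy day takes up the space left unused
ok_over = pool.reserve(day=3, amount=20)    # rejected: only 10 units remain
```

Because no subunit has a fixed size of its own, space left over on a light day remains available to a heavy day, which is the recycling of redundant capacity the paragraph refers to.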
The stack storage units of each database correspond to the Y axis of the two-dimensional plane coordinate system, the subunits of each stack storage unit correspond to the X axis of the two-dimensional plane coordinate system, and one year of data of each database is tiled on a two-dimensional plane located at a distinct time-dimension height in the three-dimensional storage system.
Each coordinate point of the two-dimensional plane coordinate system is used for storing the daily data collected by one subunit, and, at the time point where one day ends and the next day begins, the stack storage unit automatically closes the current subunit and automatically generates the next subunit.
Each coordinate point of the Y axis of the two-dimensional plane coordinate system corresponds to a month of the year, such as January, February, or March, and each coordinate point of the X axis corresponds to a day of the month, such as the first, second, or third, so that the database in the two-dimensional plane coordinate system stores the terminal data of one year.
The terminal data is stored in the two-dimensional plane coordinate system in the row-and-column arrangement formed by the stack storage units and the subunits, so that the data in the database is stored in time order, which makes the data more orderly and easier to retrieve.
Furthermore, the end of each subunit is continuous in time with the start of the next subunit, so that when the terminal data stored in each subunit is traversed, the traversal starting point of the data in the next subunit can be found directly through this temporal continuity.
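The plane layout and the automatic day rollover can be sketched as follows. This is only an illustration, assuming the X axis is the day of the month and the Y axis is the month; `plane_coordinate` and `YearPlane` are hypothetical names.

```python
from datetime import datetime


def plane_coordinate(ts: datetime) -> tuple:
    # X axis: day of month (subunit); Y axis: month (stack storage unit).
    return (ts.day, ts.month)


class YearPlane:
    """One database: a year of terminal data laid out on the X-Y plane (hypothetical model)."""

    def __init__(self) -> None:
        self.subunits = {}   # (x, y) -> list of records
        self.current = None  # coordinate of the subunit currently being written

    def store(self, ts: datetime, record: str) -> None:
        coord = plane_coordinate(ts)
        if coord != self.current:
            # Day boundary crossed: stop writing the old subunit, open the next one.
            self.current = coord
            self.subunits.setdefault(coord, [])
        self.subunits[coord].append(record)


plane = YearPlane()
plane.store(datetime(2021, 1, 31, 23, 59), "a")
plane.store(datetime(2021, 2, 1, 0, 0), "b")   # first record of the next day
```

In this sketch the record written one minute after midnight lands in a new subunit, and because that subunit is also the first day of February it sits in the next row of the plane.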
Step 200, adding a year time dimension on the basis of the two-dimensional plane coordinate system, and establishing a three-dimensional storage system of a plurality of databases, wherein the plurality of databases are arranged in the three-dimensional storage system in a stacked manner and store, in the stacking sequence, the terminal data acquired in successive years.
The data of each year is stored as a plane at a distinct position along the year time dimension, and the data of multiple years is queried in turn in the three-dimensional storage system according to the time dimension and the origin position of the two-dimensional plane coordinate system. That is, in the three-dimensional storage system, the two-dimensional plane coordinate system formed by the X and Y axes stores the terminal data of one year, and the year time dimension serves as the Z axis along which the terminal data of multiple years is stored. This makes data storage more convenient, and the terminal data of a given year can be located quickly and accurately through the Z axis of the three-dimensional storage system.
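A minimal sketch of the three-dimensional arrangement follows, assuming each cell is keyed by a coordinate (x = day, y = month, z = year). The class `ThreeDStore` and its methods are hypothetical and stand in for whatever storage engine an implementation would actually use.

```python
from collections import defaultdict


class ThreeDStore:
    """Year planes stacked along Z; each cell is keyed by (x, y, z) (hypothetical model)."""

    def __init__(self) -> None:
        self.cells = defaultdict(list)

    def put(self, day: int, month: int, year: int, record: str) -> None:
        self.cells[(day, month, year)].append(record)

    def year_plane(self, year: int) -> dict:
        # Fixing a Z value selects the whole X-Y plane for that year.
        return {(x, y): recs for (x, y, z), recs in self.cells.items() if z == year}


store = ThreeDStore()
store.put(5, 3, 2020, "r1")
store.put(5, 3, 2021, "r2")
store.put(6, 3, 2021, "r3")
plane_2021 = store.year_plane(2021)
```

Selecting a year is a lookup on the Z coordinate alone, which is the sense in which the Z axis gives direct access to the data of a given year.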
The three-dimensional storage system provides a data retrieval entrance, at the position where the X-axis coordinate is zero, for the data stored in each row of stack storage units, and performs data retrieval on the plurality of databases simultaneously according to a retrieval condition in the synchronous insertion traversal mode, so as to process the data rapidly.
The synchronous insertion traversal mode retrieves data from all databases, and from every stack storage unit of each database, at the same time, so retrieval is fast; this distinguishes it from existing database storage and retrieval modes.
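One way to picture the synchronous insertion traversal mode is as one search task per stack storage unit, all started together. The sketch below uses a thread pool for this; the concurrency mechanism, the nested-dictionary layout and the names `search_row` and `synchronous_search` are all assumptions, since the embodiment does not specify them.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: data[year][month] is the row of daily subunits for that month,
# and each string stands in for the content of one daily subunit.
data = {
    2020: {1: ["ok", "fault"], 2: ["ok"]},
    2021: {1: ["fault"], 2: ["ok", "ok"]},
}


def search_row(year, month, row, predicate):
    # Enter the row at its X = 0 entrance and scan the subunits in day order.
    return [(day, month, year, rec)
            for day, rec in enumerate(row, start=1) if predicate(rec)]


def synchronous_search(data, predicate):
    # One task per stack storage unit of every database, all submitted at once.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_row, y, m, row, predicate)
                   for y, months in data.items() for m, row in months.items()]
        return [hit for f in futures for hit in f.result()]


hits = synchronous_search(data, lambda rec: rec == "fault")
```

Each hit carries its (x, y, z) coordinate, so results gathered from many rows at once can still be placed in time afterwards.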
Step 300, distinguishing a starting subunit and a terminating subunit of each database for storing data, wherein the terminating subunits and the starting subunits of two adjacent databases respectively establish a one-way transfer relationship according to a stacking sequence, and the data stored in a plurality of subunits of each database establish a two-way transfer relationship according to a time sequence;
step 400, setting data processing sequences of all databases according to a stacking sequence of the three-dimensional storage system, sequentially screening the subunits of each database according to a bidirectional transfer relationship, and sequentially screening next data by the databases according to a unidirectional transfer relationship.
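The two transfer relationships of step 300 resemble a doubly linked chain inside each database plus a single forward link between databases. The following is a sketch under that assumption; `Subunit`, `build_database` and `link_databases` are hypothetical names.

```python
class Subunit:
    def __init__(self, label):
        self.label = label
        self.prev = None      # bidirectional link inside a database (earlier day)
        self.next = None      # bidirectional link inside a database (later day)
        self.next_db = None   # one-way link, set only on a terminating subunit


def build_database(labels):
    # Chain the subunits of one database in time order with prev/next links.
    units = [Subunit(label) for label in labels]
    for a, b in zip(units, units[1:]):
        a.next, b.prev = b, a
    return units[0], units[-1]   # (starting subunit, terminating subunit)


def link_databases(databases):
    # The terminating subunit of each database points to the start of the next one.
    for (_, end), (start, _) in zip(databases, databases[1:]):
        end.next_db = start


dbs = [build_database(["2020-d1", "2020-d2"]),
       build_database(["2021-d1", "2021-d2"])]
link_databases(dbs)
```

In this model a traversal can move in either direction within a year, but can cross from one year to the next only at the terminating subunit and only forwards.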
Each database generates two data traversing modes according to the bidirectional transfer relationship, namely a data sequence traversing mode and a forward and reverse sequence combined traversing mode;
the data sequence traversal mode is used for retrieving according to the stacking sequence of the databases in the three-dimensional storage system and the storage time sequence of each database, correspondingly outputting corresponding retrieval data according to the sequence of the acquired terminal data and directly carrying out data processing related to the time sequence;
the forward and reverse order combined traversal mode takes the starting subunit and the terminating subunit of each database as two traversal starting points, performs data retrieval on the subunits of each database from these starting points according to the bidirectional transfer relationship, and, after the retrieval of one database is finished, performs data retrieval on the next database in the same mode according to the stacking sequence of the databases in the three-dimensional storage system, so that the data of the databases are output database by database, and the data of each database is divided into two parts: forward-order data and reverse-order data.
The specific implementation method of the data sequence traversal mode comprises the following steps:
determining a first stored database according to the stacking sequence of all databases, and performing data processing on a plurality of databases according to a fixed sequence after determining the first stored database;
performing data retrieval on the first stored database in one direction, from the starting subunit to the terminating subunit;
and when the retrieval reaches the terminating subunit, determining the next database for data retrieval according to the one-way transfer relationship of the plurality of databases, so as to realize data retrieval in the data sequence traversal mode.
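The steps of the data sequence traversal mode can be sketched as a simple nested iteration. The list-of-lists layout and the function name `sequence_traversal` are assumptions made for illustration.

```python
def sequence_traversal(databases):
    """Yield subunits database by database, each from start to end (illustrative)."""
    for subunits in databases:      # stacking order, i.e. year order
        for subunit in subunits:    # starting subunit -> terminating subunit
            yield subunit
        # Reaching the terminating subunit hands over to the next database.


# Hypothetical data: each inner list is one database with its subunits in time order.
databases = [["2020-d1", "2020-d2", "2020-d3"], ["2021-d1", "2021-d2"]]
ordered = list(sequence_traversal(databases))
```

The output is already in acquisition order, which is why this mode needs no reordering before time-sequence-related processing.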
The specific implementation method of the forward and reverse order combined traversal mode comprises the following steps:
determining a first stored database according to the stacking sequence of all databases, and performing data processing on a plurality of databases according to a fixed sequence after determining the first stored database;
taking the starting subunit and the terminating subunit of the first stored database as two traversal starting points, and performing data retrieval from the two traversal starting points in opposite directions according to the bidirectional transfer relationship;
and when the two traversals reach the same subunit and the traversed data overlap, switching to the next database according to the fixed sequence to perform the bidirectional traversal operation.
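The bidirectional traversal can be sketched with two indices that move toward each other. This serial loop only simulates two scans that an implementation might run concurrently, and `combined_traversal` is a hypothetical name.

```python
def combined_traversal(databases):
    """Scan each database from both ends until the two scans meet (illustrative)."""
    out = []
    for subunits in databases:
        lo, hi = 0, len(subunits) - 1
        while lo < hi:
            out.append(subunits[lo])   # forward scan from the starting subunit
            out.append(subunits[hi])   # reverse scan from the terminating subunit
            lo, hi = lo + 1, hi - 1
        if lo == hi:
            out.append(subunits[lo])   # both scans met on one subunit: emit it once
    return out


# Hypothetical data: each inner list is one database with its subunits in time order.
databases = [["2020-d1", "2020-d2", "2020-d3"], ["2021-d1", "2021-d2"]]
mixed = combined_traversal(databases)
# mixed == ["2020-d1", "2020-d3", "2020-d2", "2021-d1", "2021-d2"]
```

Each database is covered in roughly half the steps of a one-directional scan, but within a database the output is no longer in time order.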
The data retrieval duration corresponding to the synchronous insertion traversal mode, the forward and reverse combined traversal mode and the data sequence traversal mode is sequentially increased, and the data processing complexity corresponding to the synchronous insertion traversal mode, the forward and reverse combined traversal mode and the data sequence traversal mode and related to the time sequence is sequentially reduced.
When the forward and reverse order combined traversal mode is used to screen the databases, the databases are screened one after another according to the stacking sequence along the year time dimension of the three-dimensional storage system. Within each database, however, two retrievals run toward each other, taking the starting subunit and the terminating subunit as their respective traversal starting points. When both reach the same subunit, one traversal process stops automatically and the other continues until that subunit is finished; after the traversal is finished, the screening jumps automatically to the starting subunit and the terminating subunit of the next database according to the database stacking sequence.
This improves the efficiency of data traversal, but when the data output by this traversal mode is processed, it must first be rearranged into time order. The screening speed of the forward and reverse order combined traversal mode is therefore higher than that of the data sequence traversal mode and lower than that of the synchronous insertion traversal mode, while the data screened in the forward and reverse order combined traversal mode requires more processing steps than the data screened in the data sequence traversal mode.
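The rearrangement step mentioned above can be illustrated as a sort on the coordinate of each record. The sketch assumes every output record carries its (x = day, y = month, z = year) coordinate; the tuple layout and the name `restore_time_order` are hypothetical.

```python
# Hypothetical output of a combined traversal: (x=day, y=month, z=year, record).
mixed = [
    (1, 1, 2020, "a"),
    (31, 12, 2020, "z"),
    (2, 1, 2020, "b"),
    (1, 1, 2021, "c"),
]


def restore_time_order(records):
    # Sort by year, then month, then day to recover the acquisition order.
    return sorted(records, key=lambda r: (r[2], r[1], r[0]))


ordered = restore_time_order(mixed)
```

This extra sort is the additional processing step that the data sequence traversal mode avoids.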
Conversely, when the data sequence traversal mode is used to screen the databases, the databases are screened one after another according to the stacking sequence along the year time dimension of the three-dimensional storage system, and within each database a single unidirectional retrieval runs from the starting subunit, as the traversal starting point, to the terminating subunit. Traversal of the plurality of databases is therefore slower, although the retrieved data is output in acquisition order and can be used directly for time-sequence-related processing.
In addition, the data output by the synchronous insertion traversal mode, the forward and reverse order combined traversal mode and the data sequence traversal mode all carry a tag; the content of each tag is the three-dimensional coordinate value of the subunit where the data is located, so that the acquisition time of the output data can be read directly from the tag.
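A tag of this kind can be modelled as a small record whose coordinate fields decode straight into a date. The sketch assumes x is the day of the month, y the month and z the calendar year; `Tagged` and `acquired_on` are hypothetical names.

```python
from datetime import date
from typing import NamedTuple


class Tagged(NamedTuple):
    x: int        # day of month (subunit)
    y: int        # month (stack storage unit)
    z: int        # year (database)
    payload: str  # the retrieved terminal data

    def acquired_on(self) -> date:
        # The tag alone identifies the acquisition day.
        return date(self.z, self.y, self.x)


item = Tagged(x=15, y=7, z=2021, payload="reading")
day = item.acquired_on()
```

Because the tag travels with the data, the acquisition day can be recovered regardless of which traversal mode produced the record.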
Therefore, in summary, the data storage mode of the multiple databases of the present embodiment supports multiple data screening operations, the storage is simple and ordered, meanwhile, there are multiple traversal entries and the traversal operation speed is fast, and the user selects a data traversal mode according to a specific requirement, so that the data screening speed can be increased or the timing regularity of the screened data during output can be increased.
The above embodiments are only exemplary embodiments of the present application, and are not intended to limit the present application, and the protection scope of the present application is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present application and such modifications and equivalents should also be considered to be within the scope of the present application.