CN110413595B


Info

Publication number: CN110413595B
Application number: CN201910588284.3A
Authority: CN
Inventors: 石康; 唐小珍
Original assignee: Wanyi Technology Co Ltd
Current assignee: Wanyi Technology Co Ltd
Priority date: 2019-06-28
Filing date: 2019-06-28
Publication date: 2022-07-12
Anticipated expiration: 2039-06-28
Also published as: CN110413595A

Abstract

The application discloses a data migration method and a related device applied to a distributed database, wherein the method comprises the following steps: acquiring data to be migrated from a source database, wherein the data to be migrated belongs to N source data tables, and N is a positive integer; establishing first data migration mapping information between the source database and a target database, wherein the first data migration mapping information is used for representing a mapping relation between the N source data tables and N target data tables of the target database, each target data table of the N target data tables belongs to each target data sub-database of the N target data sub-databases, and the N target data sub-databases belong to the target database; and migrating the data to be migrated to the target database according to the first data migration mapping information. By implementing the embodiment of the invention, the data can be migrated from the source database to the target database, and the diversified storage requirements in the future scene can be met.

Description

Data migration method applied to distributed database and related device

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a data migration method and a related apparatus for a distributed database.

Background

With the advent of the big data era, storage requirements keep rising. As the amount of data grows, a single database often cannot meet the storage requirement. Enterprises therefore want to migrate data stored in one database to another database in order to store massive amounts of data.

In the prior art, when data is migrated from one database to another, differing table structures between the databases can prevent the data from being migrated from the source database to the target database, which makes it difficult to meet diversified storage requirements in future scenarios.

Disclosure of Invention

The embodiment of the invention provides a data migration method and a related device applied to a distributed database, and by implementing the embodiment of the invention, data can be migrated from a source database to a target database, so that diversified storage requirements in future scenes can be met.

The invention provides a data migration method applied to a distributed database in a first aspect, which comprises the following steps:

acquiring data to be migrated from a source database, wherein the data to be migrated belongs to N source data tables, and N is a positive integer;

establishing first data migration mapping information between the source database and a target database, wherein the first data migration mapping information is used for representing a mapping relation between the N source data tables and N target data tables of the target database, each target data table of the N target data tables belongs to each target data sub-database of the N target data sub-databases, and the N target data sub-databases belong to the target database;

and migrating the data to be migrated to the target database according to the first data migration mapping information.

A second aspect of the present invention provides a server, comprising:

an acquisition module, configured to acquire data to be migrated from a source database, wherein the data to be migrated belongs to N source data tables, and N is a positive integer;

an establishing module, configured to establish first data migration mapping information between the source database and the target database, where the first data migration mapping information is used to represent a mapping relationship between the N source data tables and N target data tables of the target database, each target data table of the N target data tables belongs to each target data sub-database of the N target data sub-databases, and the N target data sub-databases belong to the target database;

and the migration module is used for migrating the data to be migrated to the target database according to the first data migration mapping information.

A third aspect of the invention provides a computer readable storage medium for storing a computer program, the computer program being executed by a processor to perform any one of the data migration methods described above.

It can be seen that, in the above technical solution, the data to be migrated is first obtained from the source database, where the data to be migrated belongs to N source data tables. This prepares for the subsequent migration of the data to the target database. The data to be migrated may belong to different source data tables or to the same source data table.

Then, first data migration mapping information between the source database and the target database is established, where the first data migration mapping information is used to represent a mapping relationship between the N source data tables and N target data tables of the target database, each target data table of the N target data tables belongs to each target data sub-database of the N target data sub-databases, and the N target data sub-databases belong to the target database. Establishing the first data migration mapping information determines the location in the target database to which the data is to be migrated: a plurality of source data tables may correspond to different target data tables of different target data sub-databases, or one source data table may correspond to one target data table of one target data sub-database.

Finally, the data to be migrated is migrated to the target database according to the first data migration mapping information, so that the data is migrated from the source database to the target database, the target database is split into sub-databases and sub-tables, and diversified storage requirements in future scenarios are met.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

Wherein:

FIG. 1-a is a schematic flow chart of a data migration method applied to a distributed database according to an embodiment of the present invention;

FIG. 1-b is a schematic diagram of a server according to another embodiment of the present invention;

FIG. 2 is a schematic flowchart of a data migration method applied to a distributed database according to another embodiment of the present invention;

FIG. 3 is a schematic diagram of a server according to an embodiment of the present invention;

FIG. 4 is a schematic server structure diagram of a hardware operating environment according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Details are given below.

The terms "comprising" and "having," and any variations thereof, in the description and claims of this invention and the drawings described herein are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Referring to fig. 1-a and fig. 1-b, fig. 1-a is a schematic flow chart of a data migration method applied to a distributed database according to an embodiment of the present invention. The scheme shown in fig. 1-a may be embodied in a system having the architecture shown in fig. 1-b. As shown in fig. 1-a, a data migration method applied to a distributed database according to an embodiment of the present invention may include:

101. The server acquires the data to be migrated from the source database.

The data to be migrated belong to N source data tables, and N is a positive integer.

Where N may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

The source database may include, for example: a distributed database.

The source data table is a data structure abstracted from all data software, and comprises information such as fields, indexes, keys and the like.

102. The server establishes first data migration mapping information between the source database and the target database.

The first data migration mapping information is used to indicate mapping relationships between the N source data tables and N target data tables of the target database, where each of the N target data tables belongs to each of N target data sub-databases, and the N target data sub-databases belong to the target database.

The target database may include, for example: a distributed database.

The target data table is a data structure abstracted from all data software, and comprises information such as fields, indexes and keys.

103. The server migrates the data to be migrated to the target database according to the first data migration mapping information.
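Steps 101 to 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the application: the databases are modeled as in-memory dictionaries, and the names `TargetLocation`, `build_mapping`, and `migrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetLocation:
    """Hypothetical address of a target table: sub-database name plus table name."""
    sub_database: str
    table: str


def build_mapping(source_tables, target_locations):
    """Step 102: pair each of the N source tables with one target location."""
    if len(source_tables) != len(target_locations):
        raise ValueError("expected N source tables and N target tables")
    return dict(zip(source_tables, target_locations))


def migrate(data_to_migrate, mapping, target_db):
    """Step 103: copy rows into the mapped table of the mapped sub-database."""
    for source_table, rows in data_to_migrate.items():
        location = mapping[source_table]
        sub_db = target_db.setdefault(location.sub_database, {})
        sub_db.setdefault(location.table, []).extend(rows)
    return target_db


# Step 101: data to be migrated, grouped by source table (N = 2 here).
data = {"orders": [{"id": 1}, {"id": 2}], "users": [{"id": 7}]}
mapping = build_mapping(
    ["orders", "users"],
    [TargetLocation("sub_db_0", "orders_0"), TargetLocation("sub_db_1", "users_0")],
)
target = migrate(data, mapping, {})
```

Keeping the mapping as a separate object mirrors the text: where the data lands is decided in step 102, and step 103 only follows that decision.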

Referring to fig. 2, fig. 2 is a schematic flowchart of a data migration method applied to a distributed database according to another embodiment of the present invention. As shown in fig. 2, a data migration method applied to a distributed database according to another embodiment of the present invention may include:

201. The server obtains configuration information.

The configuration information includes a source database address, a source database username, and a source database password.

The source database address here may include, for example: a domain name, an Internet Protocol (IP) address, or an IP address plus a port number.

202. The server sends a first login request to the source database according to the source database address, wherein the first login request carries the source database username and the source database password, the first login request is used for indicating the source database to perform identity authentication according to the source database username and the source database password, and a first login response is sent to the server when the identity authentication is passed.

In the present application, the source database uses the source database username and the source database password to verify that the server has permission to perform read and write operations on the source database.

203. The server establishes a connection with the source database according to the first login response.
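The exchange in steps 201 to 203 can be sketched as follows. This is a simplified in-process simulation, not a real database client: `DbConfig`, `split_address`, `handle_login`, and `connect` are hypothetical names, the account table stands in for the source database's own authentication, and the default port 3306 is an assumption (the usual MySQL port).

```python
from dataclasses import dataclass


@dataclass
class DbConfig:
    """Configuration information from step 201 (field names are illustrative)."""
    address: str
    username: str
    password: str


def split_address(address, default_port=3306):
    """Split 'host' or 'host:port' into (host, port).

    Covers a domain name, an IP address, or an IP address plus port.
    IPv6 literals are not handled in this sketch.
    """
    host, sep, port = address.partition(":")
    return host, int(port) if sep else default_port


def handle_login(request, accounts):
    """Source-database side of step 202: authenticate, then answer."""
    ok = accounts.get(request["username"]) == request["password"]
    return {"authenticated": ok}


def connect(config, accounts):
    """Steps 202 and 203: send the login request, connect on a positive response."""
    request = {"username": config.username, "password": config.password}
    response = handle_login(request, accounts)
    if not response["authenticated"]:
        raise PermissionError("identity authentication failed")
    host, port = split_address(config.address)
    return {"host": host, "port": port, "user": config.username}
```

For example, `connect(DbConfig("10.0.0.5:3307", "migrator", "s3cret"), {"migrator": "s3cret"})` yields a connection record for host `10.0.0.5` on port `3307`, while a wrong password raises `PermissionError` and no connection is established.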

204. The server acquires the data to be migrated from the source database.

The source database may include, for example: a distributed database.

The source data table is a data structure abstracted from all data software, and comprises information such as fields, indexes and keys.

Optionally, in a possible implementation manner of the present application, the configuration information further includes a target database address, a target database user name, and a target database password, and before the data to be migrated is obtained from the source database, the method further includes:

sending a second login request to the target database according to the target database address, wherein the second login request carries the target database user name and the target database password, the second login request is used for indicating the target database to perform identity authentication according to the target database user name and the target database password, and a second login response is sent to a server when the identity authentication is passed;

and establishing connection with the target database according to the second login response.

The target database address here may include, for example: a domain name, an Internet Protocol (IP) address, or an IP address plus a port number.

In the present application, the target database uses the target database user name and the target database password to verify that the server has permission to perform read and write operations on the target database.

Optionally, in a possible implementation manner of the present application, the obtaining data to be migrated from a source database includes:

acquiring data of all source data tables from the source database to obtain M pieces of data, wherein M is a positive integer;

deleting data which do not belong to the N source data tables from the M pieces of data to obtain the data to be migrated.

Where M may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Further, the server obtains data of all source data tables in the source database from the source database through a command-line interface shell (CLI shell).

Specifically, the command-line tool may include, for example, mysqldump, a MySQL utility that exports (dumps) a database; MySQL is a relational database management system.

Further, the server obtains data of all source data tables in the source database from a Relational Database Service (RDS) of the source database through mysql, and then uses sed to delete the data that does not belong to the N source data tables from the data of all source data tables, thereby obtaining the data to be migrated.

Here, sed is a Linux stream-editor command.
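The filtering step (obtain M pieces of data, then drop whatever does not belong to the N source data tables) can be sketched as follows. The text performs this with sed; the sketch swaps in a Python regular expression instead, and it assumes dump lines in the form of mysqldump-style `INSERT INTO` statements with backtick-quoted table names. The function name `keep_tables` is hypothetical, and real dump output also contains DDL and comments, which this sketch simply discards.

```python
import re

# Matches the table name at the start of a mysqldump-style INSERT statement.
INSERT_RE = re.compile(r"^INSERT INTO `([^`]+)`")


def keep_tables(dump_lines, wanted_tables):
    """Keep only the INSERT statements that target one of the N source tables."""
    wanted = set(wanted_tables)
    kept = []
    for line in dump_lines:
        match = INSERT_RE.match(line)
        if match and match.group(1) in wanted:
            kept.append(line)
    return kept


# M = 3 pieces of data dumped from all source tables.
dump = [
    "INSERT INTO `orders` VALUES (1,'a');",
    "INSERT INTO `audit_log` VALUES (9,'x');",
    "INSERT INTO `users` VALUES (7,'bob');",
]
# N = 2 source tables are to be migrated.
to_migrate = keep_tables(dump, ["orders", "users"])
```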

205. The server establishes first data migration mapping information between the source database and the target database.

The target database may include, for example: a distributed database.

The target data table is a data structure abstracted from all data software, and comprises information such as fields, indexes, keys and the like.

Optionally, the fragmentation (sharding) policy of the source database is the same as that of the target database, and the fragmentation policy distributes data by taking a modulo.

206. The server migrates the data to be migrated to the target database according to the first data migration mapping information.
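A modulo-based fragmentation policy can be illustrated with a short sketch. The function name `shard_index` and the choice of an integer shard key are assumptions made for illustration; the text does not say which column is used as the key.

```python
def shard_index(shard_key, shard_count):
    """Modulo distribution: a row goes to sub-database shard_key % shard_count."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return shard_key % shard_count


# With the same policy (4 sub-databases) on both sides, a given key
# lands in the same sub-database index in the source and the target.
placements = [shard_index(key, 4) for key in (1, 5, 10, 12)]
```

When both databases use the same modulus, the table-to-table mapping stays one to one, which is presumably why the text prefers identical policies.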

Optionally, in a first aspect, in a possible implementation manner of the present application, after the migrating the data to be migrated to the target database according to the first data migration mapping information, the method further includes:

detecting whether the data to be migrated exist in the N target data tables;

if the data to be migrated does not exist in the N target data tables, detecting whether the fragmentation strategies of the source database and the target database are the same;

if the fragmentation strategies of the source database and the target database are different, the fragmentation strategy of the target database is adjusted to the fragmentation strategy of the source database.
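The post-migration check above can be sketched as a small decision function. The function name `verify_and_repair`, the representation of a fragmentation strategy as a plain string, and the layout of the inputs (rows grouped by target table name) are all assumptions made for illustration.

```python
def verify_and_repair(data_to_migrate, target_tables, source_policy, target_policy):
    """Return the fragmentation policy the target database should use.

    data_to_migrate: {target_table_name: [rows expected in that table]}
    target_tables:   {target_table_name: [rows actually present]}
    """
    rows_present = all(
        row in target_tables.get(table, [])
        for table, rows in data_to_migrate.items()
        for row in rows
    )
    if rows_present:
        return target_policy   # migration succeeded, nothing to adjust
    if source_policy != target_policy:
        return source_policy   # adjust the target policy to the source policy
    return target_policy       # policies already match; cause lies elsewhere
```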

Optionally, based on the first aspect, in a first possible implementation manner of the present application, the method further includes:

if the data to be migrated exists in the N target data tables, determining an importance level corresponding to the data to be migrated contained in the N target data tables;

setting a storage time limit for the data to be migrated according to the importance level;

acquiring a preset storage period corresponding to the importance level;

and when the storage period exceeds a preset storage period, deleting the data to be migrated contained in the N target data tables.
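The retention rule can be sketched as follows. The preset storage periods, the per-row `stored_at` and `level` fields, and the names `PRESET_PERIODS`, `is_expired`, and `purge` are hypothetical; the text only states that each importance level has a preset storage period and that data is deleted once that period is exceeded.

```python
from datetime import datetime, timedelta

# Hypothetical preset storage periods per importance level.
PRESET_PERIODS = {1: timedelta(days=30), 2: timedelta(days=365)}


def is_expired(stored_at, importance_level, now):
    """True when the data has been stored longer than its preset period."""
    return now - stored_at > PRESET_PERIODS[importance_level]


def purge(tables, now):
    """Return the tables with expired rows removed.

    Each row is assumed to carry 'stored_at' (datetime) and 'level' (int).
    """
    return {
        name: [r for r in rows if not is_expired(r["stored_at"], r["level"], now)]
        for name, rows in tables.items()
    }
```

With `now = datetime(2020, 1, 1)`, a level 1 row stored on 2019-11-01 has been kept for 61 days and is removed, whereas a level 2 row stored on the same day is retained.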

Optionally, based on the first aspect or the first possible implementation manner of the first aspect, in a second possible implementation manner of the present application, the determining the importance levels corresponding to the data to be migrated included in the N target data tables includes:

extracting H key fields contained in the data to be migrated, wherein H is a positive integer;

comparing the H key fields with G preset key fields to determine whether J key fields identical to the G preset key fields exist in the H key fields, wherein G is an integer greater than or equal to H, and J is a positive integer less than or equal to H;

if J key fields identical to the G preset key fields exist in the H key fields, acquiring the key field weights corresponding to the J key fields to obtain J key field weights;

determining a first key field weight with the largest key field weight in the J key field weights;

acquiring an importance level function corresponding to the first key field weight from an importance level function library;

and respectively inputting the J key field weights into the importance level function to obtain the importance level.

Where H may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Where G may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Where J may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Wherein different key field weights correspond to different importance level functions.
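The importance level determination can be sketched as follows. The preset key fields, their weights, and the contents of the importance level function library are invented for illustration; only the overall flow (match fields, collect weights, pick the function by the largest weight, apply it to the weights) follows the text. Returning `None` when no field matches is also an assumption, since the text does not cover that case.

```python
# Hypothetical preset key fields (G = 3) and their weights.
PRESET_WEIGHTS = {"id_number": 0.9, "phone": 0.6, "nickname": 0.2}

# Hypothetical importance level function library, keyed by the largest weight.
FUNCTION_LIBRARY = {
    0.9: lambda weights: 3 if sum(weights) >= 1.0 else 2,
    0.6: lambda weights: 2 if sum(weights) >= 0.8 else 1,
    0.2: lambda weights: 1,
}


def importance_level(record):
    """Derive an importance level from the key fields of one record."""
    key_fields = list(record)                                  # H key fields
    matched = [f for f in key_fields if f in PRESET_WEIGHTS]   # J matching fields
    if not matched:
        return None  # assumption: the text does not define this case
    weights = [PRESET_WEIGHTS[f] for f in matched]             # J key field weights
    first_weight = max(weights)                                # largest weight
    level_function = FUNCTION_LIBRARY[first_weight]            # matching function
    return level_function(weights)
```

For a record with the fields `id_number`, `phone`, and `city`, two fields match, the largest weight is 0.9, and the selected function returns level 3 because the weights sum to 1.5.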

Optionally, in a possible implementation manner of the present application, the method further includes:

acquiring first bit point information;

acquiring a log of the source database according to the first bit point information, wherein the log of the source database is used for recording a corresponding structured query language when data of the source database is changed;

acquiring changed data in the source database according to the log of the source database, wherein the changed data in the source database belong to L source data tables, and L is a positive integer;

establishing second data migration mapping information between the source database and the target database, wherein the second data migration mapping information is used for representing a mapping relationship between the L source data tables and L target data tables of the target database, each target data table of the L target data tables belongs to each target data sub-database of L target data sub-databases, and the L target data sub-databases belong to the target database;

and migrating the changed data in the source database to the target database according to the second data migration mapping information.

Wherein the structured query language does not include a data query language.

Where L may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Wherein the first location information (referred to above as the first bit point information) includes at least one of first start location information and first end location information.

Optionally, the first location information is used to instruct the server to obtain the log of the source database from the location point matched with the first location information.

Further, the first start location information is contained in the log. The log is a binlog, a binary-format file that records the SQL statements by which a user updates the database. For example, statements that change the database or a data table are all recorded in the binlog, whereas query statements against the database or a data table are not.

For example, when the first location information is the first start location information, the server obtains the log starting from the first start location information. When the first location information includes both first start location information and first end location information, the server obtains the log between the first start location information and the first end location information. When the first location information is the first end location information, the server obtains the log up to the first end location information.

Specifically, the server obtains the log of the source database according to the first bit point information through the distributed database synchronization system.
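Selecting log entries by location information can be sketched as follows. The log is modeled as a list of `(position, sql)` tuples rather than a real binlog file, and `select_log_events` is a hypothetical name. The handling of the end-only case (read from the beginning of the log up to the end position) is an interpretation, since the text is ambiguous on that point.

```python
def select_log_events(events, start=None, end=None):
    """Pick log events according to the location information.

    events: list of (position, sql) tuples in ascending position order.
    start only    -> from start to the end of the log
    start and end -> between start and end, inclusive
    end only      -> from the beginning up to end (an assumption)
    """
    if start is None and end is None:
        raise ValueError("need a start position, an end position, or both")
    selected = []
    for position, sql in events:
        if start is not None and position < start:
            continue
        if end is not None and position > end:
            continue
        selected.append((position, sql))
    return selected


# Only data-changing statements appear; queries are not logged.
events = [
    (4, "INSERT INTO orders VALUES (1)"),
    (120, "UPDATE users SET name='a' WHERE id=7"),
    (310, "DELETE FROM orders WHERE id=1"),
]
```

The selected statements identify which L source data tables changed, and those tables are then mapped and migrated in the same way as in the full migration.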

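Optionally, in a possible implementation manner of the present application, the method further includes:

acquiring second bit point information;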

acquiring a log of the target database according to the second location information, wherein the log of the target database is used for recording a corresponding structured query language when the data of the target database is changed;

acquiring changed data in the target database according to the log of the target database, wherein the changed data in the target database belong to K target data tables, and K is a positive integer;

establishing third data migration mapping information between the source database and the target database, wherein the third data migration mapping information is used for representing mapping relations between the K target data tables and K source data tables of the source database, each source data table of the K source data tables belongs to each source data sub-database of the K source data sub-databases, and the K source data sub-databases belong to the source database;

and migrating the changed data in the target database to the source database according to the third data migration mapping information.

Wherein the structured query language does not include a data query language.

Where K may be equal to 1, 2, 3, 5, 6, 11, 13, 20, or other values, for example.

Wherein the second location information (referred to above as the second bit point information) includes at least one of second start location information and second end location information.

Optionally, the second location information is used to instruct the server to obtain the log of the target database from the location point matched with the second location information.

Further, the second start location information is contained in the log. The log is a binlog, a binary-format file that records the SQL statements by which a user updates the database. For example, statements that change the database or a data table are all recorded in the binlog, whereas query statements against the database or a data table are not.

For example, when the second location information is the second start location information, the server obtains the log starting from the second start location information. When the second location information includes both second start location information and second end location information, the server obtains the log between the second start location information and the second end location information. When the second location information is the second end location information, the server obtains the log up to the second end location information.

Specifically, the server obtains the log of the target database according to the second location point information through the distributed database synchronization system.

It can be seen that, in the above technical solution, the second bit point information is obtained to obtain the log of the target database according to the second bit point information, so as to obtain the changed data in the target database according to the log of the target database, and the changed data in the target database is migrated to the source database according to the third data migration mapping information, so that when there is data change in the target database, the changed data in the target database is migrated to the source database, in other words, migration of real-time data in the target database is achieved, and real-time synchronization between the target database and the source database is ensured.
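The third data migration mapping information points in the opposite direction, from target data tables back to source data tables. One way to obtain it, sketched below, is to invert a forward mapping and keep only the K target tables that changed. This derivation is an assumption: the text only says that the third mapping is established, not how. The name `reverse_mapping` and the `(sub_database, table)` tuples are hypothetical.

```python
def reverse_mapping(forward_mapping, changed_target_tables):
    """Build a target-to-source mapping for the K changed target tables.

    forward_mapping: {source_location: target_location}, where each
    location is a hashable (sub_database, table) tuple.
    """
    inverse = {target: source for source, target in forward_mapping.items()}
    return {target: inverse[target] for target in changed_target_tables}


forward = {
    ("src_db_0", "orders"): ("sub_db_0", "orders_0"),
    ("src_db_1", "users"): ("sub_db_1", "users_0"),
}
# K = 1: only the users table changed in the target database.
third_mapping = reverse_mapping(forward, [("sub_db_1", "users_0")])
```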

Referring to fig. 3, fig. 3 is a schematic diagram of a server according to an embodiment of the present invention. As shown in fig. 3, a server 300 provided by an embodiment of the present invention may include:

an obtaining module 301, configured to obtain data to be migrated from a source database.

The source database may include, for example: a distributed database.

An establishing module 302, configured to establish first data migration mapping information between the source database and the target database.

The first data migration mapping information is used to represent mapping relationships between the N source data tables and N target data tables of the target database, where each target data table of the N target data tables belongs to each target data sub-base of N target data sub-bases, and the N target data sub-bases belong to the target database.

The target database may include, for example: a distributed database.

The migration module 303 is configured to migrate the data to be migrated to the target database according to the first data migration mapping information.

Referring to fig. 4, fig. 4 is a schematic diagram of a server structure of a hardware operating environment according to an embodiment of the present application. As shown in fig. 4, a server of a hardware operating environment according to an embodiment of the present application may include:

a processor 401, such as a CPU.

The memory 402 may be a high-speed RAM or a non-volatile memory such as a disk memory.

A communication interface 403 for implementing connection communication between the processor 401 and the memory 402.

Those skilled in the art will appreciate that the configuration of the server shown in fig. 4 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.

As shown in fig. 4, the memory 402 may include an operating system, a network communication module, and a program for data processing. The operating system is a program that manages and controls the hardware and software resources of the server and supports the running of the data processing program and of other software or programs. The network communication module is used to implement communication between the components within the memory 402 and with other hardware and software within the server.

In the server shown in fig. 4, a processor 401 is configured to execute a program for data migration stored in a memory 402, and implement the following steps:

For specific implementation of the server according to the present application, reference may be made to the embodiments of the data migration method, which are not described herein again.

The present application further provides a computer readable storage medium for storing a computer program, the stored computer program being executable by the processor to perform the steps of:

For specific implementation of the computer-readable storage medium related to the present application, reference may be made to the embodiments of the data migration method, which are not described herein again.

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A data migration method applied to a distributed database is characterized by comprising the following steps:

acquiring data to be migrated from a source database, wherein the data to be migrated belongs to N source data tables, N is a positive integer, the source data tables are data structures abstracted from all data software, and the data structures comprise at least one of the following items: fields, indices, keys;

migrating the data to be migrated to the target database according to the first data migration mapping information;

wherein the method further comprises:

if the data to be migrated exist in the N target data tables, determining the importance levels corresponding to the data to be migrated contained in the N target data tables;

acquiring a preset storage period corresponding to the importance level;

when the storage period exceeds a preset storage period, deleting the data to be migrated contained in the N target data tables;

wherein the determining the importance levels corresponding to the data to be migrated included in the N target data tables includes:

if J key fields identical to the G preset key fields exist in the H key fields, acquiring key field weights corresponding to the J key fields to obtain J key field weights;

acquiring an importance level function corresponding to the weight of the first key field from an importance level function library;

2. The method according to claim 1, wherein before the obtaining the data to be migrated from the source database, the method further comprises:

acquiring configuration information, wherein the configuration information comprises a source database address, a source database user name and a source database password;

sending a first login request to the source database according to the source database address, wherein the first login request carries the source database username and the source database password, the first login request is used for indicating the source database to perform identity authentication according to the source database username and the source database password, and sending a first login response to the server when the identity authentication is passed;

and establishing connection with the source database according to the first login response.

3. The method according to claim 1, wherein the obtaining the data to be migrated from the source database comprises:

4. The method according to any one of claims 1-3, further comprising:

acquiring first bit point information;

establishing second data migration mapping information between the source database and the target database, wherein the second data migration mapping information is used for representing mapping relationships between the L source data tables and L target data tables of the target database, each target data table of the L target data tables belongs to each target data sub-database of L target data sub-databases, and the L target data sub-databases belong to the target database;

5. The method according to any one of claims 1-3, further comprising:

acquiring second position point information;

acquiring change data in the target database according to the log of the target database, wherein the change data in the target database belong to K target data tables, and K is a positive integer;

6. A server, comprising:

an acquisition module, configured to acquire data to be migrated from a source database, wherein the data to be migrated belongs to N source data tables, N is a positive integer, the source data tables are data structures abstracted from all data software, and the data structures comprise at least one of the following: fields, indices, keys;

the migration module is used for migrating the data to be migrated to the target database according to the first data migration mapping information;

wherein the server is further specifically configured to:

acquiring a preset storage period corresponding to the importance level;

7. The server according to claim 6, wherein the establishing module is further configured to establish the connection between the server and the server

8. The server according to claim 6, wherein the obtaining module is specifically configured to

9. The server according to any of claims 6-8, wherein the migration module is further configured to perform:

acquiring first bit point information;

acquiring a log of the source database according to the first location information, wherein the log of the source database is used for recording a corresponding structured query language when data of the source database is changed;

10. The server according to any of claims 6-8, wherein the migration module is further configured to perform:

acquiring second bit point information;