Distributed metadata management method and system
Technical Field
The present invention relates to the field of metadata management technologies, and in particular, to a distributed metadata management method and system.
Background
At present, as data volumes keep growing, existing storage approaches are increasingly unable to meet demand because of performance and cost limits. The market needs a data storage system that offers large capacity, scalability, security and high availability, and distributed storage has emerged to meet this need.
To manage the storage nodes of a distributed storage system efficiently, a distributed file system typically stores metadata and data separately, according to their different storage and access characteristics. The metadata storage system is the bridge between users and the data storage servers. Efficient metadata management is therefore crucial to the performance and scalability of a distributed storage system, and distributed metadata management has become an active research topic. Existing metadata management strategies suffer from problems such as unbalanced load, large amounts of metadata movement caused by rename operations, and poor scalability of the metadata management system.
Disclosure of Invention
The present invention is directed to a distributed metadata management method and system, so as to solve the foregoing problems in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a distributed metadata management method, comprising: a static load balancing method of metadata and a dynamic load balancing method of metadata;
the static load balancing method of the metadata comprises the following steps: distributing the metadata to the metadata server nodes by adopting a consistent hash function of the virtual nodes and a metadata server list; the metadata server list is a table for recording mapping relations between all virtual nodes and metadata servers, and each metadata server node stores a list of virtual nodes stored on the node;
the dynamic load balancing method of the metadata comprises the following steps: migrating part of the metadata from overloaded nodes to the lightest nodes by means of metadata migration.
Preferably, the static load balancing method for metadata includes the following steps:
a1, after the system is started, the manager of the metadata server generates a metadata server list according to the information of each metadata server and the configuration information of the list items;
a2, using a consistent hash function to find items in the metadata server list according to the complete path of the file, and finding a corresponding target metadata server;
a3, adding metadata information in the virtual node of the target metadata server according to the list of virtual nodes stored on the metadata server node.
Preferably, the number of items in the metadata server list in which each metadata server appears is calculated by using the following function:
where $U_i$ denotes the number of times the i-th metadata server appears in the list, C denotes the number of entries in the list, and n denotes the total number of metadata servers.
Preferably, the consistent hash function is:
NameNode_Locator = Hash(f) mod NNT_Length,
wherein NameNode_Locator represents the selected item in the metadata server list, f is the complete path name of the file, and NNT_Length is the total number of items in the metadata server list.
Preferably, the method for dynamically balancing load of metadata includes the following steps:
b1, the metadata server collects load information periodically and sends the load information to a metadata server manager;
b2, the metadata server manager periodically calculates the load balance of each metadata server; if the load balance of a metadata server exceeds a set threshold, the metadata server is an overloaded node, and if it does not reach the set threshold, the metadata server is a too light node;
b3, the metadata server manager migrating part of the metadata from the overloaded node to the too light node;
b4, the overloaded node and the too light node update the load information and send to the metadata server manager.
Preferably, the load balance of the metadata server is calculated by using the following formula:
$T_i = \eta_1 d_i + \eta_2 m_i$,
in the formula,
$j_i$ is the load balance index of the i-th node within time t;
$w_i$ is the load index of the i-th metadata server node within time t;
n is the number of metadata servers;
$\eta_1 + \eta_2 = 1$;
$T_i$ is the load index of item i in the metadata server list at time t;
$d_i$ is the operation response delay of item i in the metadata server list within time t;
$m_i$ is the quantity of metadata of item i in the metadata server list at time t.
Preferably, the method for dynamically balancing load of metadata further includes the steps of:
calculating the overall load degree of the system, and if the overall load degree of the system exceeds a set threshold value, adding a metadata server node in the system; wherein the overall load degree of the system is calculated by adopting the following function:
wherein,
e is a load index of the system,
n is the number of metadata server nodes;
$w_i$ is the load index of the i-th metadata server node within time t.
Preferably, the method further comprises: performing delayed metadata movement by using a directory redirection table, so as to solve the problem of local consistency of metadata, which specifically comprises the following steps:
maintaining a directory path redirection table on each metadata server, the directory path redirection table being used for storing information on metadata that is not on the current metadata server;
each entry in the directory path redirection table is a key-value pair <hash(directory path), virtual node>, where the former is the hash value of the renamed directory path and the latter is the current storage location of the metadata to be moved.
A distributed metadata management system, comprising: the system comprises a metadata server manager and a metadata server, wherein the metadata server manager comprises a metadata server list maintenance module, a metadata server selection module and a load balancing module; the metadata server comprises a metadata processing module and a load measuring module;
the metadata server list maintenance module is responsible for maintaining correct corresponding relation between the virtual nodes and the metadata server nodes;
the metadata server selection module is used for performing a random distribution of metadata;
the load balancing module is used for receiving load information of each metadata server, calculating a system load value, sequencing the load of each metadata server, and moving metadata when the system load is unbalanced or when a metadata server cluster needs to be adjusted;
the load measurement module is used for collecting load information on the current server, calculating the load of each virtual node, calculating the load on the current server again, and sending the load information to a metadata server manager;
the metadata processing module comprises a metadata reading module, a metadata writing module and a metadata modifying module, the metadata reading module is responsible for acquiring the metadata, the metadata writing module is responsible for storing the metadata, the metadata modifying module is responsible for processing the metadata after renaming operation, a directory redirection table is maintained, and the directory path redirection table is used for storing the metadata information which is not on the current metadata server.
Preferably, the system further comprises a backup server, wherein the backup server comprises a backup server of a metadata server manager and a backup server of the metadata server, and the backup server of the metadata server manager is used for replacing the metadata server manager when the metadata server manager fails and recovering the data of the metadata server manager; and the backup server of the metadata server is responsible for recovering data of the metadata server when the metadata server node fails.
The invention has the beneficial effects that: according to the distributed metadata management method and system provided by the embodiment of the invention, two strategies, namely static load balancing during metadata distribution and dynamic load balancing during system operation, are used, so that the utilization rate of metadata service resources is improved, the load balancing of the system is ensured, and the expandability of the system is improved; in addition, the problem of massive metadata movement caused by renaming operation is solved by using a metadata delay movement scheme based on a directory redirection table, and the stability of system efficiency is ensured.
Drawings
FIG. 1 is an architecture diagram of a distributed file system having a metadata management system in accordance with the present invention;
FIG. 2 is an illustration of a NameNode list in a metadata management system according to an embodiment of the present invention;
FIG. 3 is a flow of metadata reading provided by an embodiment of the present invention;
FIG. 4 is a diagram of a reliability guarantee strategy of a metadata management system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Example one
The embodiment of the invention provides a distributed metadata management method, which comprises the following steps: a static load balancing method of metadata and a dynamic load balancing method of metadata;
the static load balancing method of the metadata comprises the following steps: distributing the metadata to the metadata server nodes by adopting a consistent hash function of the virtual nodes and a metadata server list; the metadata server list is a table for recording mapping relations between all virtual nodes and metadata servers, and each metadata server node stores a list of virtual nodes stored on the node;
the dynamic load balancing method of the metadata comprises the following steps: migrating part of the metadata from overloaded nodes to the lightest nodes by means of metadata migration.
In the embodiment of the present invention, the architecture of a distributed file system that includes the distributed metadata management system is shown in FIG. 1. As can be seen from FIG. 1, the overall architecture of the distributed file system includes four parts. The data storage server DN (DataNode) serves as the storage node for application data and stores the data blocks obtained after files are split. The metadata server NN (NameNode) serves as the metadata response and update node and is responsible for maintaining a global namespace that includes file and folder attributes; the NN maintains the namespace tree and stores the mapping from the data blocks of a file to the DNs, and one or more NNs form a cluster. The Client supports file system operations such as reading, writing and deleting files and creating and deleting directories; it exchanges control information (metadata) with the NNs and data streams (application data) with the DNs. The NameNode management node NNManager is responsible for periodically collecting the state information of each NN and maintaining the NameNode list.
The architecture also uses the following tables: the NameNode list NNT records the mapping between items (virtual nodes) and NameNodes; the NameNode Personal table (NNP) stores, on each NN, the information of the items corresponding to that NN; both the NNT and the NNP are maintained and updated by the NNManager; and the directory redirection table DPRT stores directory information for metadata that is not on the current metadata server, with each NN maintaining one DPRT.
In the method, when metadata is initially distributed, it is assigned to the metadata server nodes by a consistent hash function optimized with virtual nodes, which ensures load balance in the static distribution of metadata.
As the system runs, the load on the metadata server nodes becomes unbalanced, and part of the metadata is migrated from overloaded nodes to too light nodes by means of metadata migration, so that load balance among the metadata server nodes is achieved. When the amount of metadata stored in the system is large enough, the overall load degree of the system may exceed a threshold, in which case the load of the system is reduced by adding a metadata server node to the system.
Therefore, the embodiment of the invention improves the utilization rate of metadata service resources, ensures the load balance of the system and improves the expandability of the system by using two strategies: static load balancing during metadata distribution and dynamic load balancing during system operation.
In a preferred embodiment of the present invention, the method for static load balancing of metadata includes the following steps:
a1, after the system is started, the manager of the metadata server generates a metadata server list according to the information of each metadata server and the configuration information of the list items;
a2, using a consistent hash function to find items in the metadata server list according to the complete path of the file, and finding a corresponding target metadata server;
a3, adding metadata information in the virtual node of the target metadata server according to the list of virtual nodes stored on the metadata server node.
The metadata server manager maintains a metadata server list (denoted as the NameNode list or NNT), which is a table that records the mapping between all virtual nodes and the metadata servers. After the system is started, the number of items in the table remains unchanged, i.e., the number of virtual nodes remains unchanged. To make the load adjustment of the metadata servers more flexible and finer-grained, the number of items needs to be sufficiently large within a reasonable range.
The number of items in the metadata server list in which each metadata server appears is calculated by using the following function:
where $U_i$ denotes the number of times the i-th metadata server appears in the list, C denotes the number of entries in the list, and n denotes the total number of metadata servers.
FIG. 2 shows a schematic representation of the metadata server list. There are 7 items in the list, which correspond to 4 metadata servers: A, B, C and D.
In practical use, this can be implemented as follows:
The client obtains the NameNode list from the metadata server manager on its first access. Thereafter, during system operation, whenever the NameNode list changes, the metadata server manager sends the latest metadata server list to the client. When a client reads a file, it calculates the number of the virtual node in which the metadata is stored from the hash value of the complete path name of the file, and then looks up in the NameNode list which server stores the metadata.
In a preferred embodiment of the present invention, the consistent hash function is:
NameNode_Locator = Hash(f) mod NNT_Length,
wherein NameNode_Locator represents the selected item in the metadata server list, f is the complete path name of the file, and NNT_Length is the total number of items in the metadata server list.
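The locator formula can be illustrated with a minimal Python sketch. The choice of MD5 as the hash, the function names, and the particular 7-item list mapped onto servers A to D are assumptions made for illustration only; the text specifies only that a hash of the complete path name is taken modulo the list length.

```python
import hashlib
from typing import List


def namenode_locator(path: str, nnt_length: int) -> int:
    """Return Hash(f) mod NNT_Length for the complete path name f."""
    # MD5 is an assumed hash; the text only says "Hash".
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % nnt_length


def find_target_server(path: str, nnt: List[str]) -> str:
    """Look up the metadata server that owns the selected list item."""
    return nnt[namenode_locator(path, len(nnt))]


# Hypothetical list: 7 items (virtual nodes) mapped onto 4 servers.
NNT = ["A", "B", "C", "D", "A", "B", "C"]

if __name__ == "__main__":
    print(find_target_server("/home/user/report.txt", NNT))
```

Because the number of items is fixed after start-up, the same path always selects the same item; rebalancing changes only which server an item maps to.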
In a preferred embodiment of the present invention, the method for dynamically balancing load of metadata includes the following steps:
b1, the metadata server collects load information periodically and sends the load information to a metadata server manager;
b2, the metadata server manager periodically calculates the load balance of each metadata server; if the load balance of a metadata server exceeds a set threshold, the metadata server is an overloaded node, and if it does not reach the set threshold, the metadata server is a too light node;
b3, the metadata server manager migrating part of the metadata from the overloaded node to the too light node;
b4, the overloaded node and the too light node update the load information and send to the metadata server manager.
The load balance of the metadata server is calculated by adopting the following formula:
$T_i = \eta_1 d_i + \eta_2 m_i$,
in the formula,
$j_i$ is the load balance index of the i-th node within time t;
$w_i$ is the load index of the i-th metadata server node within time t;
n is the number of metadata servers;
$\eta_1 + \eta_2 = 1$;
$T_i$ is the load index of item i in the metadata server list at time t;
$d_i$ is the operation response delay of item i in the metadata server list within time t;
$m_i$ is the quantity of metadata of item i in the metadata server list at time t.
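As a small worked illustration of the weighted sum for the load index of a list item, the following Python sketch computes T_i from d_i and m_i. The function name, the default weights of 0.5, and the assumption that both inputs have already been normalized to comparable scales are hypothetical; the text only requires that the two weights sum to 1.

```python
def item_load_index(delay: float, metadata_count: float,
                    eta1: float = 0.5, eta2: float = 0.5) -> float:
    """Compute T_i = eta1 * d_i + eta2 * m_i for one list item.

    The inputs are assumed to be normalized beforehand, since a raw
    delay and a raw count are not directly comparable.
    """
    if abs(eta1 + eta2 - 1.0) > 1e-9:
        raise ValueError("eta1 + eta2 must equal 1")
    return eta1 * delay + eta2 * metadata_count


if __name__ == "__main__":
    # Hypothetical normalized measurements for one item.
    print(item_load_index(delay=0.2, metadata_count=0.6))
```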
In this embodiment of the present invention, the method for dynamically balancing load of metadata further includes:
calculating the overall load degree of the system, and if the overall load degree of the system exceeds a set threshold value, adding a metadata server node in the system; wherein the overall load degree of the system is calculated by adopting the following function:
wherein,
e is a load index of the system,
n is the number of metadata server nodes;
$w_i$ is the load index of the i-th metadata server node within time t.
In the embodiment of the invention, the dynamic load balancing method of the metadata comprises two aspects:
First, when the load of a metadata server node exceeds the threshold set by the system, the virtual node with the maximum load is selected from the metadata server with the maximum load, and the metadata information on that virtual node is moved to the metadata server with the minimum load.
Second, when the load of the whole system exceeds a set threshold, the metadata server cluster at its current scale can no longer meet the requirements of the system, and a metadata server needs to be added; load balancing adjustment is then carried out according to the strategy of the first case. After the adjustment is completed, the metadata server manager adjusts the NameNode list and sends the latest NameNode list to the client, the metadata servers and the data storage servers.
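The migration rule of the first case can be sketched in Python as follows. This is only an illustration under stated assumptions: the load of a server is taken here to be the sum of the loads of its virtual nodes, and the names plan_migration and apply_migration, the dictionary layout of the list, and the threshold value are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def plan_migration(nnt: Dict[int, str], vnode_load: Dict[int, float],
                   servers: Iterable[str],
                   threshold: float) -> Optional[Tuple[int, str, str]]:
    """Pick (virtual node, source server, destination server), or None."""
    # Assumption: server load = sum of the loads of its virtual nodes.
    # Servers without virtual nodes (e.g. newly added) start at 0.
    server_load = {s: 0.0 for s in servers}
    for vnode, server in nnt.items():
        server_load[server] += vnode_load.get(vnode, 0.0)
    heaviest = max(server_load, key=server_load.get)
    lightest = min(server_load, key=server_load.get)
    if server_load[heaviest] <= threshold or heaviest == lightest:
        return None  # no server is overloaded
    # Select the most loaded virtual node on the most loaded server.
    candidates = [v for v, s in nnt.items() if s == heaviest]
    vnode = max(candidates, key=lambda v: vnode_load.get(v, 0.0))
    return vnode, heaviest, lightest


def apply_migration(nnt: Dict[int, str],
                    plan: Tuple[int, str, str]) -> Dict[int, str]:
    """Return an updated list in which the virtual node maps to the target."""
    vnode, _source, destination = plan
    updated = dict(nnt)
    updated[vnode] = destination
    return updated
```

Adding a server, as in the second case, corresponds in this sketch to passing a server name that does not yet own any virtual node; it then has load 0 and becomes the migration target.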
The distributed metadata management method provided by the embodiment of the present invention may further include: performing delayed metadata movement by using a directory redirection table, so as to solve the problem of local consistency of metadata, which specifically comprises the following steps:
maintaining a directory path redirection table on each metadata server, the directory path redirection table being used for storing information on metadata that is not on the current metadata server;
each entry in the directory path redirection table is a key-value pair <hash(directory path), virtual node>, where the former is the hash value of the renamed directory path and the latter is the current storage location of the metadata to be moved.
Wherein, the directory redirection table may be represented as DPRT;
as shown in fig. 3, the specific implementation process of the method may be:
When the client accesses a file, it calculates the number of the virtual node in which the metadata is stored from the hash value of the complete path name of the file, and then uses the metadata server list to find the server on which the metadata is stored, i.e., the target metadata server. Because the present system uses delayed metadata movement, the accessed metadata may not yet be on the target server. Therefore, when querying the target metadata information, the target metadata server first searches the DPRT it maintains for an entry whose hash value corresponds to the complete path name of the file. If such an entry exists, the metadata to be queried is not on the target metadata server; the target metadata information is then queried from the server where the metadata is currently located, the metadata information is moved to the target metadata server, and the corresponding entry is deleted from the DPRT maintained on the target metadata server. If no such entry exists, the target metadata information is queried directly on the target metadata server.
In this method, metadata is moved only when it is accessed, not when a directory or file is renamed. Delaying the movement in this way keeps system throughput stable even when a rename operation would otherwise trigger large-scale metadata movement.
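The read path with delayed movement can be sketched in Python as follows. This is a simplified illustration, not the described implementation: it keys the redirection table by the hash of a full file path rather than a directory path, points each entry at a server object rather than a virtual node, and assumes that a rename re-keys the metadata in place on the old server. The class and function names and the use of MD5 are hypothetical.

```python
import hashlib
from typing import Dict


def path_hash(path: str) -> str:
    # MD5 is an assumed hash function for this sketch.
    return hashlib.md5(path.encode("utf-8")).hexdigest()


class MetadataServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, dict] = {}             # path hash -> metadata
        self.dprt: Dict[str, "MetadataServer"] = {}  # path hash -> holder

    def read(self, path: str) -> dict:
        key = path_hash(path)
        holder = self.dprt.get(key)
        if holder is not None:
            # Metadata is still at its old location: fetch it, move it
            # here, and delete the redirection entry.
            self.store[key] = holder.store.pop(key)
            del self.dprt[key]
        return self.store[key]


def rename(old_path: str, new_path: str,
           old_server: MetadataServer, new_server: MetadataServer) -> None:
    """Record a rename without moving metadata between servers."""
    old_key, new_key = path_hash(old_path), path_hash(new_path)
    # Assumption: the old server re-keys the metadata locally.
    old_server.store[new_key] = old_server.store.pop(old_key)
    if new_server is not old_server:
        new_server.dprt[new_key] = old_server
```

In this sketch the cost of a rename is one local re-keying and one table entry; the cross-server transfer happens on the first read of the renamed path.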
Example two
An embodiment of the present invention provides a distributed metadata management system, including: the system comprises a metadata server manager and a metadata server, wherein the metadata server manager comprises a metadata server list maintenance module, a metadata server selection module and a load balancing module; the metadata server comprises a metadata processing module and a load measuring module;
the metadata server list maintenance module is responsible for maintaining correct corresponding relation between the virtual nodes and the metadata server nodes;
the metadata server selection module is used for performing a random distribution of metadata;
the load balancing module is used for receiving load information of each metadata server, calculating a system load value, sequencing the load of each metadata server, and moving metadata when the system load is unbalanced or when a metadata server cluster needs to be adjusted;
the load measurement module is used for collecting load information on the current server, calculating the load of each virtual node, calculating the load on the current server again, and sending the load information to a metadata server manager;
the metadata processing module comprises a metadata reading module, a metadata writing module and a metadata modifying module, the metadata reading module is responsible for acquiring the metadata, the metadata writing module is responsible for storing the metadata, the metadata modifying module is responsible for processing the metadata after renaming operation, a directory redirection table is maintained, and the directory path redirection table is used for storing the metadata information which is not on the current metadata server.
The distributed metadata management system with the above structure has already been described in detail in the first embodiment, and is not described in detail here.
The distributed metadata management system with the structure can realize the following functions: the static load balance during metadata distribution and the dynamic load balance during system operation improve the utilization rate of metadata service resources, ensure the load balance of the system and improve the expandability of the system; in addition, the problem of massive metadata movement caused by renaming operation can be solved by using a metadata delay movement scheme based on a directory redirection table, and the stability of system efficiency is ensured.
The distributed metadata management system provided by the embodiment of the invention can also comprise a backup server, wherein the backup server comprises a backup server of a metadata server manager and a backup server of the metadata server, and the backup server of the metadata server manager is used for replacing the metadata server manager when the metadata server manager fails and recovering the data of the metadata server manager; and the backup server of the metadata server is responsible for recovering data of the metadata server when the metadata server node fails.
The backup server is used as a reliability guarantee of the metadata management system, and the actual working process of the backup server can be as follows:
The backup server backNNM of the metadata server manager adopts a redundancy mechanism; the backup server backNN of the metadata server adopts a logging mechanism.
The main NNM and the backNNM run the same program at the same time and communicate through a read-write module and the network. In normal operation the backNNM mainly supervises the main NNM, analyzing the state of the main NNM through a message processing module. The main NNM periodically sends heartbeat information to the backNNM to report its state. If the backNNM receives no heartbeat from the main NNM for more than one period, the main NNM is considered to have failed; the backNNM then automatically takes over all work of the main NNM, provides service for the system, and restores the main NNM. After the main NNM has recovered, it sends heartbeat information to the backNNM to announce that it is back to normal and takes over all work from the backNNM, and the backNNM returns to the monitoring state. This strategy ensures that service is not interrupted.
The NN (metadata server) and the backNN also interact through a communication module. Once connected, they exchange data, mainly log files and metadata images. The backNN then merges the log files and the metadata image in memory through a synthesis module to produce a new metadata image file. This strategy can cause service interruption. See FIG. 4 for details.
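The heartbeat supervision between the main manager and its backup can be sketched in Python as follows. The class name BackupMonitor, its methods, and the explicit passing of timestamps are hypothetical choices for illustration; the sketch covers only the timeout and takeover decision, not the restoration of the main manager or the network communication.

```python
class BackupMonitor:
    """Backup manager state: monitoring, or active after a takeover."""

    def __init__(self, period: float) -> None:
        self.period = period        # heartbeat period in seconds
        self.last_heartbeat = 0.0   # time of the last heartbeat received
        self.active = False         # True once the backup has taken over

    def on_heartbeat(self, now: float) -> None:
        # A heartbeat means the main manager is healthy (or has
        # recovered), so the backup returns to the monitoring state.
        self.last_heartbeat = now
        self.active = False

    def check(self, now: float) -> bool:
        # No heartbeat for more than one period: treat the main
        # manager as failed and take over its work.
        if now - self.last_heartbeat > self.period:
            self.active = True
        return self.active


if __name__ == "__main__":
    monitor = BackupMonitor(period=5.0)
    monitor.on_heartbeat(0.0)
    print(monitor.check(4.0), monitor.check(5.1))
```

Timestamps are passed in rather than read from a clock so that the decision logic can be exercised deterministically.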
By adopting the technical scheme disclosed by the invention, the following beneficial effects are obtained: according to the distributed metadata management method and system provided by the embodiment of the invention, two strategies, namely static load balancing during metadata distribution and dynamic load balancing during system operation, are used, so that the utilization rate of metadata service resources is improved, the load balancing of the system is ensured, and the expandability of the system is improved; in addition, the problem of massive metadata movement caused by renaming operation is solved by using a metadata delay movement scheme based on a directory redirection table, and the stability of system efficiency is ensured.
Specifically, by adopting a consistent hash function optimized by using virtual nodes to distribute metadata, static load balance during metadata distribution is realized; when the load balance degree of the metadata server node exceeds a set threshold value, the dynamic load balance of the metadata is realized by adopting a metadata migration mode; when the overall load of the system exceeds a set threshold, reducing the load of the system by adding metadata server nodes; when the renaming operation causes a large amount of movement of metadata, the stability of the system efficiency is ensured by adopting a metadata delay movement scheme based on a directory redirection table.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
It should be understood by those skilled in the art that the timing sequence of the method steps provided in the above embodiments may be adaptively adjusted according to actual situations, or may be concurrently performed according to actual situations.
All or part of the steps in the methods according to the above embodiments may be implemented by a program instructing related hardware, where the program may be stored in a storage medium readable by a computer device and used to execute all or part of the steps in the methods according to the above embodiments. The computer device, for example: personal computer, server, network equipment, intelligent mobile terminal, intelligent home equipment, wearable intelligent equipment, vehicle-mounted intelligent equipment and the like; the storage medium, for example: RAM, ROM, magnetic disk, magnetic tape, optical disk, flash memory, U disk, removable hard disk, memory card, memory stick, network server storage, network cloud storage, etc.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.