Summary of the invention
The technical problem to be solved by the present invention is how to improve the disaster tolerance of a cluster system. To this end, a cluster storage system and a data storage method thereof are provided.
To solve the above problem, the invention discloses a cluster storage system comprising a shared storage device, and a data staging server and a distributed replicated block device (DRBD) located on a head node, wherein:
the data staging server determines the priority value of each file in the shared storage device, and uploads to the DRBD the data of the files whose priority value is greater than a set threshold;
the DRBD receives and stores the file data uploaded by the data staging server.
Preferably, in the above system, the data staging server determining the priority value of each file in the shared storage device means that:
the data staging server takes the sum of the parameter values of a file in the shared storage device as the priority value of that file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
Alternatively, in the above system, the data staging server determining the priority value of each file in the shared storage device means that:
the data staging server determines a weight for each parameter value of a file in the shared storage device, takes the product of each parameter value and its corresponding weight as a priority-value calculation parameter, and takes the sum of all priority-value calculation parameters as the priority value of that file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
Preferably, in the above cluster storage system, at least two head nodes each have a data staging server and a DRBD.
The invention also discloses a data storage method for the cluster storage system described above, comprising:
the cluster storage system determines the priority value of each file in the shared storage device, and stores only the files whose priority value is greater than a set threshold in the distributed replicated block device (DRBD).
Preferably, in the above method, the cluster storage system determining the priority value of each file in the shared storage device means that:
the cluster storage system takes the sum of the parameter values of a file in the shared storage device as the priority value of that file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
Alternatively, in the above method, the cluster storage system determining the priority value of each file in the shared storage device means that:
the cluster storage system determines a weight for each parameter value of a file in the shared storage device, takes the product of each parameter value and its corresponding weight as a priority-value calculation parameter, and takes the sum of all priority-value calculation parameters as the priority value of that file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
Preferably, in the above cluster storage system, at least two head nodes have a DRBD.
The embodiments of the invention adopt a hybrid storage architecture that combines the large capacity of centralized storage with the high reliability of distributed storage, and at the same time build a data extraction model to classify and place data. This facilitates data management, improves the disaster tolerance of the whole cluster, and provides an effective guarantee for the safe operation of electronic information.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, provided they do not conflict, the embodiments of the application and the features in the embodiments may be combined with one another arbitrarily.
At present, two storage modes are widely used in cluster storage systems. The first is centralized storage; with this mode, the storage becomes a single point of failure. The second is distributed storage; with this mode, disk utilization is too low and the data placement policy is too simple for data to be managed effectively. On this basis, the applicant proposes a hybrid storage architecture that combines the large capacity of centralized storage with the high reliability of distributed storage, together with a data extraction model, so as to facilitate data management and improve the disaster tolerance of the whole cluster.
Specifically, by modifying the /etc/multipath.conf configuration file, each node in the cluster is given multipath access to the shared storage device, with failover between paths. In addition, DRBD is installed on at least two head nodes, so that an entire device is mirrored synchronously over the network, somewhat like a network RAID. That is, when a user writes data to the file system on the local DRBD device, the data is simultaneously sent to another host in the network and recorded there in a file system in the same form, thereby achieving distributed storage. This both meets the storage requirements of massive data and partly meets the requirements of data security, improving disk utilization and balancing cost.
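The synchronous mirroring described above can be pictured with a toy in-memory model. This is only an illustrative sketch and not DRBD's interface: the class names `ToyBlockDevice` and `MirroredDevice` are hypothetical, and the "network" is an ordinary method call.

```python
from typing import Dict


class ToyBlockDevice:
    """In-memory stand-in for a block device: block number -> data."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}

    def write(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data


class MirroredDevice:
    """Applies every write to the local device and to a peer device.

    In a real deployment the peer write would travel over the network;
    here it is a direct call, purely for illustration.
    """

    def __init__(self, local: ToyBlockDevice, peer: ToyBlockDevice) -> None:
        self.local = local
        self.peer = peer

    def write(self, block_no: int, data: bytes) -> None:
        self.local.write(block_no, data)
        self.peer.write(block_no, data)


local_dev = ToyBlockDevice()
peer_dev = ToyBlockDevice()
mirror = MirroredDevice(local_dev, peer_dev)
mirror.write(0, b"important record")

# Both hosts now hold the same blocks, so either copy survives the loss of the other.
print(local_dev.blocks == peer_dev.blocks)
```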
Embodiment 1
Based on the above idea, this embodiment provides a cluster storage system whose architecture is shown in Figure 1. It comprises a data staging server, a distributed replicated block device (DRBD, Distributed Replicated Block Device) and a shared storage device. In this embodiment a shared array is used as the shared storage device. The shared array meets the service-level high-availability (HA) requirement, ensuring that service is not interrupted when a node fails, while the DRBD meets the storage-level HA requirement, ensuring that important data is not lost. As can be seen from Figure 1, all nodes are connected to the shared array, and the two head nodes, besides being connected to the shared array, also have DRBD installed.
The data staging server is located on the two head nodes. It is mainly responsible for building a data extraction model for the files in the shared array in order to determine the priority value of each file, and for uploading to the DRBD the data of the files in the shared array whose priority value is greater than the set threshold.
Specifically, on the basis of user surveys, the data staging server takes the sum of a file's parameter values as the priority value of the file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
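As a minimal sketch of this unweighted variant, the priority value can be computed by summing whichever of the four parameter values are available for a file. The parameter names and the sample values below are assumptions chosen for illustration; the source does not define units or scales.

```python
from typing import Mapping

# Illustrative names for the four parameters listed in the text.
PARAMETERS = ("size", "read_frequency", "modification_frequency", "user_grade")


def priority_value(params: Mapping[str, float]) -> float:
    """Sum the parameter values that are present for a file (one or more)."""
    unknown = set(params) - set(PARAMETERS)
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    if not params:
        raise ValueError("at least one parameter value is required")
    return sum(params.values())


# A file described by three of the four parameters (made-up values).
example = {"read_frequency": 3.0, "modification_frequency": 1.5, "user_grade": 2.0}
print(priority_value(example))  # 6.5
```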
In a preferred scheme, rather than simply summing the parameter values of a file to obtain the priority value, the data staging server also takes the weight of each parameter into account. That is, it determines a weight for each parameter value of a file in the shared storage device, takes the product of each parameter value and its corresponding weight as a priority-value calculation parameter, and takes the sum of all priority-value calculation parameters as the priority value of the file. For example, let the size value of the file data be x, the read-frequency value of the file data be y, the modification-frequency value of the file data be z, and the grade value of the user to whom the file belongs be v. The data extraction model is then built, that is, the priority value of the file is determined as follows:
ax + by + cz + dv = f
where a, b, c and d are the weights of the respective parameters, which can be determined by sample training;
f is the priority value of the file.
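The weighted model f = ax + by + cz + dv can be sketched as follows. The type names `FileParams` and `Weights` and all numeric values are hypothetical; real weights would come from the sample training mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileParams:
    size: float                    # x
    read_frequency: float          # y
    modification_frequency: float  # z
    user_grade: float              # v


@dataclass(frozen=True)
class Weights:
    a: float  # weight of x
    b: float  # weight of y
    c: float  # weight of z
    d: float  # weight of v


def weighted_priority(p: FileParams, w: Weights) -> float:
    """Return f = a*x + b*y + c*z + d*v."""
    return (w.a * p.size
            + w.b * p.read_frequency
            + w.c * p.modification_frequency
            + w.d * p.user_grade)


# Made-up weights and parameter values, for illustration only.
weights = Weights(a=0.5, b=2.0, c=1.0, d=4.0)
params = FileParams(size=2.0, read_frequency=3.0,
                    modification_frequency=1.0, user_grade=0.5)
print(weighted_priority(params, weights))  # 10.0
```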
In addition, key information on which the operation of the whole cluster depends is treated in the same way as the data of files whose priority value is greater than the set threshold, and is also uploaded to the DRBD. Thus, if the cluster crashes or the array is damaged, the cost of data loss is minimized and the cluster can resume operation in the shortest possible time, achieving the goal of improved disaster tolerance.
The data of files whose priority value is less than the set threshold remains in the shared array.
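The placement rule can be sketched as a simple partition of files by priority value. The function name `place_files` and the sample data are hypothetical. The source does not say what happens when a priority value equals the threshold exactly; this sketch assumes such files stay on the shared array.

```python
from typing import Dict, List, Tuple


def place_files(priorities: Dict[str, float],
                threshold: float) -> Tuple[List[str], List[str]]:
    """Split file names into (upload to DRBD, keep on shared array)."""
    to_drbd = sorted(name for name, f in priorities.items() if f > threshold)
    # Assumption: a priority value equal to the threshold stays on the array.
    on_array = sorted(name for name, f in priorities.items() if f <= threshold)
    return to_drbd, on_array


# Made-up priority values for three files.
priorities = {"orders.db": 9.2, "tmp.log": 1.1, "users.db": 7.5}
to_drbd, on_array = place_files(priorities, threshold=5.0)
print(to_drbd)   # ['orders.db', 'users.db']
print(on_array)  # ['tmp.log']
```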
The DRBD stores the file data uploaded by the data staging server.
To improve the disaster tolerance of the cluster storage system, the DRBD is generally located on the head nodes.
In this way, a fiber switch connects each node to the shared storage device (the shared array in this embodiment). /etc/corosync/corosync.conf is configured, and a high-availability cluster in active/active mode is established through Pacemaker, so that every node becomes a potential standby resource node. Two servers with large memory are selected as head nodes and, by setting up DRBD and its configuration file, a high-availability cluster in active/passive mode is established. A single cluster thus contains both the active/active mode and the active/passive mode, which realizes the hybrid architecture.
The shared array stores the data of files whose priority value is less than the set threshold.
In this embodiment, on the basis of extensive experiments and sampling statistics, the read-frequency value of the file data, the modification-frequency value of the file data and the grade value of the user to whom the file belongs are taken as parameter values, and a data extraction model is designed and coded so that data is placed automatically, thereby meeting the requirement of improved disaster tolerance. An index is also maintained, which makes it easy to look up data and to record data migrations. In addition, important configuration files and device information in the cluster are backed up in the DRBD, so that the cluster can recover quickly if it crashes. Restricting users' access rights to the DRBD through user authorization improves the security of the cluster.
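The source does not describe the structure of the index. One possible shape, shown here purely as an assumption, is a map from file name to current location plus an append-only migration history; `PlacementIndex` and the location labels are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SHARED_ARRAY = "shared_array"
DRBD = "drbd"


@dataclass
class PlacementIndex:
    """Tracks where each file lives and records every migration."""
    location: Dict[str, str] = field(default_factory=dict)
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def register(self, name: str) -> None:
        # New files start on the shared array.
        self.location.setdefault(name, SHARED_ARRAY)

    def migrate(self, name: str, target: str) -> None:
        source = self.location[name]
        if source != target:
            self.location[name] = target
            self.history.append((name, source, target))

    def find(self, name: str) -> str:
        return self.location[name]


index = PlacementIndex()
index.register("orders.db")
index.migrate("orders.db", DRBD)
print(index.find("orders.db"))  # drbd
print(index.history)            # [('orders.db', 'shared_array', 'drbd')]
```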
Embodiment 2
Based on the above cluster storage system, this embodiment proposes a data storage method for the cluster storage system. Its core is to screen the data: important data (the data of files whose priority value is greater than the set threshold) is placed in the distributed replicated block device (DRBD, Distributed Replicated Block Device), and ordinary data (the data of files whose priority value is less than the set threshold) is placed in the shared storage device (the shared array in this embodiment). Thus, even if the shared array is damaged, the loss caused by data loss is minimized. Important system information (such as server configuration files, administrator information and log information, as determined by the administrator) can also be backed up in the DRBD, so that the cluster can recover quickly after a crash.
Specifically, the method comprises: the cluster storage system determines the priority value of each file in the shared storage device, and stores only the files whose priority value is greater than the set threshold in the DRBD. The DRBD is generally located on each head node.
Specifically, the cluster storage system determining the priority value of each file in the shared storage device means that:
the sum of the parameter values of a file in the shared storage device is taken as the priority value of that file, wherein the parameter values of the file comprise one or more of the following:
the size value of the file data, the read-frequency value of the file data, the modification-frequency value of the file data, and the grade value of the user to whom the file belongs.
In some preferred schemes, the cluster storage system determines a weight for each parameter value of a file in the shared storage device, takes the product of each parameter value and its corresponding weight as a priority-value calculation parameter, and takes the sum of all priority-value calculation parameters as the priority value of the file. For example, let the size value of the file data be x, the read-frequency value of the file data be y, the modification-frequency value of the file data be z, and the grade value of the user to whom the file belongs be v. The data extraction model is then built, that is, the priority value of the file is determined as follows:
ax + by + cz + dv = f
where a, b, c and d are the weights of the respective parameters, which can be determined by sample training;
f is the priority value of the file.
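The source does not specify how the sample training is carried out. One possible approach, shown here only as an assumption, is a least-squares fit by gradient descent over samples that pair the four parameter values with a target priority value (for instance, one assigned from user surveys). The sample data, the learning rate and the scaling of the parameter values to the range 0 to 1 are all illustrative choices.

```python
from typing import List, Sequence, Tuple

# ((x, y, z, v), target priority value); parameter values assumed scaled to [0, 1].
Sample = Tuple[Sequence[float], float]


def mean_squared_error(weights: Sequence[float], samples: Sequence[Sample]) -> float:
    total = 0.0
    for features, target in samples:
        predicted = sum(w * x for w, x in zip(weights, features))
        total += (predicted - target) ** 2
    return total / len(samples)


def train_weights(samples: Sequence[Sample],
                  learning_rate: float = 0.05,
                  epochs: int = 2000) -> List[float]:
    """Fit a, b, c, d by batch gradient descent on the mean squared error."""
    weights = [0.0, 0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        gradient = [0.0, 0.0, 0.0, 0.0]
        for features, target in samples:
            error = sum(w * x for w, x in zip(weights, features)) - target
            for i, x in enumerate(features):
                gradient[i] += 2.0 * error * x / n
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Made-up training samples, for illustration only.
samples: List[Sample] = [
    ((0.2, 0.9, 0.1, 1.0), 3.0),
    ((0.8, 0.1, 0.0, 0.2), 0.6),
    ((0.5, 0.5, 0.7, 0.6), 2.4),
    ((0.1, 0.3, 0.9, 0.4), 1.9),
    ((0.9, 0.7, 0.2, 0.8), 2.7),
]

a, b, c, d = train_weights(samples)
print(a, b, c, d)
```

The fitted a, b, c and d would then be used in the formula above to compute f for each file.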
As can be seen from the above embodiments, the embodiments of the invention use DRBD together with a shared storage device to classify data and store it separately, which improves the disaster tolerance of the whole system while balancing storage security against cost.
The above are merely preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.