Movatterモバイル変換


[0]ホーム

URL:


CN117097746A - Node repairing method and related equipment - Google Patents

Node repairing method and related equipment
Download PDF

Info

Publication number
CN117097746A
CN117097746ACN202311058784.9ACN202311058784ACN117097746ACN 117097746 ACN117097746 ACN 117097746ACN 202311058784 ACN202311058784 ACN 202311058784ACN 117097746 ACN117097746 ACN 117097746A
Authority
CN
China
Prior art keywords
node
surviving
nodes
repair
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311058784.9A
Other languages
Chinese (zh)
Inventor
李诗逸
张帅蓬
夏文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute Of Technology shenzhen Shenzhen Institute Of Science And Technology Innovation Harbin Institute Of Technology
Original Assignee
Harbin Institute Of Technology shenzhen Shenzhen Institute Of Science And Technology Innovation Harbin Institute Of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute Of Technology shenzhen Shenzhen Institute Of Science And Technology Innovation Harbin Institute Of TechnologyfiledCriticalHarbin Institute Of Technology shenzhen Shenzhen Institute Of Science And Technology Innovation Harbin Institute Of Technology
Priority to CN202311058784.9ApriorityCriticalpatent/CN117097746A/en
Publication of CN117097746ApublicationCriticalpatent/CN117097746A/en
Pendinglegal-statusCriticalCurrent

Links

Classifications

Landscapes

Abstract

The embodiment of the application provides a node repairing method and related equipment, which are used for solving the network transmission problem caused by repairing a plurality of nodes in a distributed storage system. The method of the embodiment of the application comprises the following steps: acquiring all data nodes in a distributed storage system; wherein, all data nodes comprise fault nodes and surviving nodes; scheduling and grouping surviving nodes, wherein surviving nodes in any group exist in pairs; merging the surviving nodes in the same group of times to transmit the node repair data in the current surviving nodes to other surviving nodes in the same group of times to form a node repair block; the node repair block is formed by node repair data of other surviving nodes and node repair data of the current surviving nodes; and transmitting the node repair blocks in any group of times to the fault node, and repairing the fault node so that the fault node completes fault repair according to the node repair blocks.

Description

Translated fromChinese
节点修复方法及相关设备Node repair methods and related equipment

技术领域Technical field

本申请实施例涉及数据存储技术领域,尤其涉及一种节点修复方法及相关设备。The embodiments of the present application relate to the field of data storage technology, and in particular, to a node repair method and related equipment.

背景技术Background technique

里德-所罗门码(RS(k,m))是擦除码家族中应用最广泛的一种纠删码。尤其是在分布式存储系统中。Reed-Solomon code (RS(k, m)) is the most widely used erasure code in the erasure code family. Especially in distributed storage systems.

为了实现高可用性和低存储成本,纠删码被广泛地使用来代替复制。与复制相比,纠删码可以降低存储成本,但也带来了更高的修复开销。例如,目前的多节点修复算法是并行多个单节点修复算法,这将会产生两个严重的问题:节点间负载不平衡和多个独立的单节点修复产生的冗余块传输。当修复条带内的多个块时,每个节点的上传块数和下载块数不平衡,容易导致修复时间取决于负载最重的节点,其中网络传输开销在实际应用中占据主导地位,且这种开销随着故障节点的增加而线性增加,从而造成修复时间的提升。To achieve high availability and low storage costs, erasure coding is widely used to replace replication. Compared with replication, erasure coding can reduce storage costs, but also introduces higher repair overhead. For example, the current multi-node repair algorithm is to parallel multiple single-node repair algorithms, which will produce two serious problems: load imbalance among nodes and redundant block transmission caused by multiple independent single-node repairs. When repairing multiple blocks within a stripe, the number of uploaded blocks and downloaded blocks of each node is unbalanced, which easily causes the repair time to depend on the most heavily loaded node, where network transmission overhead dominates in practical applications, and This overhead increases linearly with the number of faulty nodes, resulting in an increase in repair time.

发明内容Contents of the invention

本申请实施例提供了一种节点修复方法及相关设备,用于解决分布式存储系统下修复多个节点所带来的网络传输问题,以尽可能地加快修复速度。Embodiments of the present application provide a node repair method and related equipment to solve network transmission problems caused by repairing multiple nodes in a distributed storage system, so as to speed up the repair as much as possible.

本申请实施例第一方面提供了一种节点修复方法,应用于分布式存储系统,包括:The first aspect of the embodiment of this application provides a node repair method, which is applied to a distributed storage system, including:

获取所述分布式存储系统中的所有数据节点;其中,所述所有数据节点包括故障节点及幸存节点,所述幸存节点为所述所有数据节点中除开所述故障节点的节点;Obtain all data nodes in the distributed storage system; wherein, all data nodes include fault nodes and survivor nodes, and the survivor nodes are nodes among all data nodes except the fault node;

将所述幸存节点进行调度分组,其中,任一组次中的所述幸存节点为成对存在;Schedule the surviving nodes into groups, where the surviving nodes in any group exist in pairs;

将同一组次中的所述幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,形成节点修复块;其中,所述节点修复块由所述其他幸存节点的节点修复数据及所述当前幸存节点的节点修复数据形成;The surviving nodes in the same group are merged to transmit the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block; wherein the node repair block is composed of the The node repair data of other surviving nodes and the node repair data of the current surviving node are formed;

将任一组次中的所述节点修复块传输至所述故障节点,对所述故障节点进行修复,以使得所述故障节点根据所述节点修复块完成故障修复。The node repair block in any group is transmitted to the fault node, and the fault node is repaired, so that the fault node completes fault repair according to the node repair block.

可选地,所述将所述幸存节点进行调度分组包括:Optionally, the scheduling and grouping of the surviving nodes includes:

将k个所述幸存节点分为个组,其中,任一/>个组中包括两个所述幸存节点,所述k为正整数;Divide the k surviving nodes into groups, any of which/> Each group includes two surviving nodes, and k is a positive integer;

所述将同一组次中的所述幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,包括:The merging operation of the surviving nodes in the same group to transmit the node repair data in the current surviving node to other surviving nodes in the same group includes:

将第一幸存节点的第一幸存节点修复数据传输至第二幸存节点;其中,所述第一幸存节点为在所述个组中处于同一组次且位于后半部分的幸存节点,所述第二幸存节点为在所述/>个组中处于同一组次且位于前半部分的幸存节点。Transmitting the first survivor node repair data of the first survivor node to the second survivor node; wherein the first survivor node is the Survivor nodes in the same group and located in the second half of the group, the second survivor node is in the/> The surviving nodes in the same group and located in the first half of the group.

可选地,所述将第一幸存节点的节点修复数据传输至第二幸存节点之后,所述方法还包括:Optionally, after transmitting the node repair data of the first survivor node to the second survivor node, the method further includes:

将所述个组进行归并操作,获取/>个组;其中,任一个组中包括四个所述幸存节点;will be described Groups are merged and obtained/> groups; wherein any group includes four of the surviving nodes;

获取第三幸存节点的第三幸存节点修复数据及第四幸存节点的第四幸存节点修复数据,并确定所述第三幸存节点对应的第三加权系数及所述第四幸存节点对应的第四加权系数;其中,所述第三幸存节点及所述所述第四幸存节点分别为在所述个组中处于同一组次且位于后半部分的幸存节点;Obtain the third surviving node repair data of the third surviving node and the fourth surviving node repair data of the fourth surviving node, and determine the third weighting coefficient corresponding to the third surviving node and the fourth corresponding to the fourth surviving node. Weighting coefficient; wherein, the third survivor node and the fourth survivor node are respectively in the Surviving nodes in the same group and located in the second half of each group;

确定基于所述第三幸存节点修复数据、所述第四幸存节点修复数据及所述第三加权系数的与所述第五幸存节点对应的第五幸存节点修复数据,并确定基于所述第三幸存节点修复数据、所述第四幸存节点修复数据及所述第四加权系数的与所述第六幸存节点对应的第六幸存节点修复数据;其中,所述第五幸存节点及所述第六幸存节点分别为在所述个组中处于同一组次且位于前半部分的幸存节点;Determine fifth surviving node repair data corresponding to the fifth surviving node based on the third surviving node repair data, the fourth surviving node repair data and the third weighting coefficient, and determine based on the third surviving node repair data. The surviving node repair data, the fourth surviving node repair data and the sixth surviving node repair data corresponding to the sixth surviving node of the fourth weighting coefficient; wherein, the fifth surviving node and the sixth surviving node The surviving nodes are as described in Surviving nodes in the same group and located in the first half of each group;

将所述第五幸存节点修复数据传输至所述第五幸存节点,将所述第六幸存节点修复数据传输至所述第六幸存节点。The fifth survivor node repair data is transmitted to the fifth survivor node, and the sixth survivor node repair data is transmitted to the sixth survivor node.

可选地,所述方法还包括:Optionally, the method also includes:

若同一分组中的所述幸存节点的数量的一半小于所述故障节点的数量,将当前分组中前半部分的所述幸存节点的节点修复数据传输至后半部分的所述幸存节点;If half of the number of surviving nodes in the same group is less than the number of failed nodes, transmit the node repair data of the surviving nodes in the first half of the current group to the surviving nodes in the second half;

所述形成节点修复块,包括:The forming node repair block includes:

将经过传输所述节点修复数据后的任一所述幸存节点进行归并,生成所述节点修复块;其中,所述节点修复块包含当前幸存节点获取的所有节点修复数据。Merge any of the surviving nodes after transmitting the node repair data to generate the node repair block; wherein the node repair block includes all node repair data obtained by the current surviving node.

可选地,所述方法还包括:Optionally, the method also includes:

若存在未位于任一个组中的幸存节点,确定未位于任一/>个组中的幸存节点为剩余幸存节点,并将所有所述剩余幸存节点归并至剩余组;If there is not located in any Surviving nodes in groups, determined not to be in any/> The surviving nodes in each group are the remaining surviving nodes, and all remaining surviving nodes are merged into the remaining groups;

所述形成节点修复块,包括:The forming node repair block includes:

确定所述剩余组中的剩余幸存节点的节点修复数据及与所述节点修复数据对应的加权系数;Determine the node repair data of the remaining surviving nodes in the remaining group and the weighting coefficient corresponding to the node repair data;

根据所述剩余幸存节点的节点修复数据及加权系数生成与所述剩余组对应的节点修复块。A node repair block corresponding to the remaining group is generated according to the node repair data and weighting coefficient of the remaining surviving nodes.

可选地,若所述剩余组中的所述剩余幸存节点的数量小于所述故障节点的数量,且所述故障节点包括第一故障节点、第二故障节点及第三故障节点,所述剩余幸存节点包括第一剩余幸存节点及第二剩余幸存节点,所述根据所述剩余幸存节点的节点修复数据及加权系数生成与所述剩余组对应的节点修复块,包括:Optionally, if the number of remaining surviving nodes in the remaining group is less than the number of faulty nodes, and the faulty nodes include a first faulty node, a second faulty node and a third faulty node, the remaining The surviving nodes include a first remaining surviving node and a second remaining surviving node. Generating a node repair block corresponding to the remaining group according to the node repair data and weighting coefficient of the remaining surviving node includes:

根据所述第一剩余幸存节点的节点修复数据、所述第二剩余幸存节点的节点修复数据及第一剩余加权系数确定第一剩余节点修复块;其中,所述第一剩余加权系数为与所述第一故障节点对应的加权系数;The first remaining node repair block is determined according to the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node and the first remaining weighted coefficient; wherein the first remaining weighted coefficient is the same as the first remaining weighted coefficient. The weighting coefficient corresponding to the first fault node;

根据所述第一剩余幸存节点的节点修复数据、所述第二剩余幸存节点的节点修复数据及第二剩余加权系数确定第二剩余节点修复块;其中,所述第二剩余加权系数为与所述第二故障节点对应的加权系数;The second remaining node repair block is determined according to the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node and the second remaining weighted coefficient; wherein the second remaining weighted coefficient is the same as the remaining weighted coefficient. The weighting coefficient corresponding to the second fault node;

根据所述第一剩余幸存节点的节点修复数据、所述第二剩余幸存节点的节点修复数据及第三剩余加权系数确定第三剩余节点修复块;其中,所述第三剩余加权系数为与所述第三故障节点对应的加权系数;The third remaining node repair block is determined according to the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node and the third remaining weighted coefficient; wherein the third remaining weighted coefficient is the same as the remaining weighted coefficient. The weighting coefficient corresponding to the third fault node;

所述将任一组次中的所述节点修复块传输至所述故障节点,对所述故障节点进行修复,包括:Transmitting the node repair block in any group to the faulty node and repairing the faulty node includes:

将所述第一剩余节点修复块、所述第二剩余节点修复块或所述第三剩余节点修复块分别传输至所述第一故障节点、所述第二故障节点或所述第三故障节点,以分别对所述第一故障节点、所述第二故障节点或所述第三故障节点进行修复。Transmitting the first remaining node repair block, the second remaining node repair block or the third remaining node repair block to the first fault node, the second fault node or the third fault node respectively , to respectively repair the first fault node, the second fault node or the third fault node.

可选地,所述将任一组次中的所述节点修复块传输至所述故障节点,对所述故障节点进行修复,包括:Optionally, transmitting the node repair block in any group to the faulty node and repairing the faulty node includes:

将任一组次中的所述幸存节点进行排序;Sort the surviving nodes in any group;

若所述故障节点的数量为m个,将任一组次中排序为前m个的所述幸存节点对应的所述节点修复块分别传输至所述故障节点,对m个所述故障节点进行修复;其中,所述m为正整数。If the number of faulty nodes is m, the node repair blocks corresponding to the top m surviving nodes in any group are respectively transmitted to the faulty nodes, and the m faulty nodes are Repair; where m is a positive integer.

本申请实施例第二方面提供了一种节点修复系统,应用于分布式存储系统,包括:The second aspect of the embodiment of this application provides a node repair system, which is applied to a distributed storage system, including:

获取单元,用于获取所述分布式存储系统中的所有数据节点;其中,所述所有数据节点包括故障节点及幸存节点,所述幸存节点为所述所有数据节点中除开所述故障节点的节点;An acquisition unit configured to acquire all data nodes in the distributed storage system; wherein all data nodes include fault nodes and survivor nodes, and the survivor nodes are nodes among all data nodes excluding the fault node. ;

分组单元,用于将所述幸存节点进行调度分组,其中,任一组次中的所述幸存节点为成对存在;A grouping unit, configured to schedule and group the surviving nodes, wherein the surviving nodes in any group exist in pairs;

归并单元,用于将同一分组中的所述幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一分组中的其他幸存节点,形成节点修复块;其中,所述节点修复块由所述其他幸存节点的节点修复数据及所述当前幸存节点的节点修复数据形成;A merging unit, configured to perform a merging operation on the surviving nodes in the same group, so as to transmit the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block; wherein, the node repair block It is formed by the node repair data of the other surviving nodes and the node repair data of the current surviving node;

传输单元,用于将任一分组中的所述节点修复块传输至所述故障节点,对所述故障节点进行修复,以使得所述故障节点根据所述节点修复块完成故障修复。A transmission unit, configured to transmit the node repair block in any group to the fault node, and repair the fault node, so that the fault node completes fault repair according to the node repair block.

本申请实施例第二方面提供的用于执行第一方面所述的节点修复方法。The second aspect of the embodiment of the present application provides a method for executing the node repair method described in the first aspect.

本申请实施例第三方面提供了一种节点修复装置,包括:The third aspect of the embodiment of the present application provides a node repair device, including:

中央处理器,存储器,输入输出接口,有线或无线网络接口以及电源;Central processing unit, memory, input and output interfaces, wired or wireless network interfaces and power supply;

所述存储器为短暂存储存储器或持久存储存储器;The memory is a short-term storage memory or a persistent storage memory;

所述中央处理器配置为与所述存储器通信,并执行所述存储器中的指令操作以执行第一方面所述的节点修复方法。The central processing unit is configured to communicate with the memory and execute instruction operations in the memory to perform the node repair method described in the first aspect.

本申请实施例第四方面提供了一种计算机可读存储介质,其特征在于,所述计算机可读存储介质包括指令,当所述指令在计算机上运行时,使得计算机执行第一方面所述的节点修复方法。The fourth aspect of the embodiments of the present application provides a computer-readable storage medium, which is characterized in that the computer-readable storage medium includes instructions that, when the instructions are run on a computer, cause the computer to execute the steps described in the first aspect. Node repair method.

从以上技术方案可以看出,本申请实施例具有以下优点:通过本申请实施例公开的一种节点修复方法,先获取分布式存储系统中的所有数据节点;其中,所有数据节点包括故障节点及幸存节点,幸存节点为所有数据节点中除开故障节点的节点;再将幸存节点进行调度分组,其中,任一组次中的幸存节点为成对存在;然后将同一组次中的幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,形成节点修复块;其中,节点修复块由其他幸存节点的节点修复数据及当前幸存节点的节点修复数据形成;最后将任一组次中的节点修复块传输至故障节点,对故障节点进行修复,以使得故障节点根据节点修复块完成故障修复。从而,在分布式存储系统中,能通过幸存节点有效修复故障节点,同时,将幸存节点进行分组传输,也便于故障节点修复时降低网络传输开销,从而加快修复速度,从而尽可能地优化在分布式存储系统下的多节点修复过程。It can be seen from the above technical solutions that the embodiments of the present application have the following advantages: through a node repair method disclosed in the embodiments of the present application, all data nodes in the distributed storage system are first obtained; where all data nodes include fault nodes and Surviving nodes, the surviving nodes are all data nodes except the fault node; then the surviving nodes are scheduled and grouped, where the surviving nodes in any group exist in pairs; then the surviving nodes in the same group are merged Operation to transmit the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block; wherein the node repair block is formed by the node repair data of other surviving nodes and the node repair data of the current surviving node. ; Finally, the node repair block in any group is transmitted to the faulty node, and the faulty node is repaired, so that the faulty node completes fault repair according to the node repair block. Therefore, in the distributed storage system, the failed nodes can be effectively repaired through the surviving nodes. At the same time, the surviving nodes are grouped for transmission, which also facilitates the reduction of network transmission overhead when repairing the failed nodes, thus speeding up the repair speed and optimizing the distribution as much as possible. Multi-node repair process under traditional storage system.

附图说明Description of the drawings

为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请中记载的一些实施例,对于本领域普通技术人员来讲,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments will be briefly introduced below. Obviously, the drawings in the following description are only some implementations recorded in the present application. For example, those of ordinary skill in the art can also obtain other drawings based on these drawings.

图1为一种现有的RS(4,2)码的编码和解码过程的示意图;Figure 1 is a schematic diagram of the encoding and decoding process of an existing RS(4,2) code;

图2为一种现有的RS(4,2)码的修复过程的示意图;Figure 2 is a schematic diagram of the repair process of an existing RS(4,2) code;

图3为一种现有的RS(4,2)码的PPR修复过程的示意图;Figure 3 is a schematic diagram of the PPR repair process of an existing RS(4,2) code;

图4为一种现有的RS(6,2)码的负载不平衡的示意图;Figure 4 is a schematic diagram of the load imbalance of an existing RS(6,2) code;

图5为一种现有的RS(4,3)码的传输时隙的示意图;Figure 5 is a schematic diagram of an existing RS(4,3) code transmission time slot;

图6为本申请实施例公开的一种节点修复方法的流程示意图;Figure 6 is a schematic flow chart of a node repair method disclosed in the embodiment of the present application;

图7为本申请实施例公开的另一种节点修复方法的流程示意图;Figure 7 is a schematic flow chart of another node repair method disclosed in the embodiment of the present application;

图8为本申请实施例公开的一种RS(6,3)码的多节点修复的示意图;Figure 8 is a schematic diagram of a multi-node repair of RS(6,3) code disclosed in the embodiment of the present application;

图9为本申请实施例公开的一种节点修复系统的结构示意图;Figure 9 is a schematic structural diagram of a node repair system disclosed in the embodiment of the present application;

图10为本申请实施例公开的一种节点修复装置的结构示意图。Figure 10 is a schematic structural diagram of a node repair device disclosed in an embodiment of the present application.

具体实施方式Detailed ways

本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。The terms "first", "second", "third", "fourth", etc. (if present) in the description and claims of this application and the above-mentioned drawings are used to distinguish similar objects without necessarily using Used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions, e.g., a process, method, system, product, or apparatus that encompasses a series of steps or units and need not be limited to those explicitly listed. Those steps or elements may instead include other steps or elements not expressly listed or inherent to the process, method, product or apparatus.

需要说明的是,在本申请中涉及“第一”、“第二”等的描述仅用于描述目的,而不能理解为指示或暗示其相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括至少一个该特征。另外,各个实施例之间的技术方案可以相互结合,但是必须是以本领域普通技术人员能够实现为基础,当技术方案的结合出现相互矛盾或无法实现时应当认为这种技术方案的结合不存在,也不在本申请要求的保护范围之内。It should be noted that the descriptions involving “first”, “second”, etc. in this application are for descriptive purposes only and cannot be understood as indicating or implying their relative importance or implicitly indicating the number of indicated technical features. . Therefore, features defined as "first" and "second" may explicitly or implicitly include at least one of these features. In addition, the technical solutions in various embodiments can be combined with each other, but it must be based on the realization by those of ordinary skill in the art. When the combination of technical solutions is contradictory or cannot be realized, it should be considered that such a combination of technical solutions does not exist. , nor is it within the scope of protection required by this application.

为方便对本申请实施例中的技术方案进行理解,以下对目前主流的纠删码多节点故障修复优化的方法进行描述。不难理解的是,当前的纠删码多节点故障修复优化方法主要是执行并行多个单节点的修复方案。In order to facilitate understanding of the technical solutions in the embodiments of the present application, the current mainstream erasure coding multi-node fault repair and optimization methods are described below. It is not difficult to understand that the current erasure coding multi-node fault repair optimization method mainly executes multiple single-node repair solutions in parallel.

以下将对目前主流的节点故障修复方法进行描述。其中,需要提前说明的是,纠删码具有这样的特性:当条带中的失败数据块数小于或等于m时,始终可以重建原始的数据。用户可以灵活地配置不同的参数,以便在数据可靠性和冗余性之间取得平衡。The following will describe the current mainstream node failure repair methods. Among them, it should be noted in advance that erasure coding has the following characteristics: when the number of failed data blocks in the stripe is less than or equal to m, the original data can always be reconstructed. Users have the flexibility to configure different parameters to strike a balance between data reliability and redundancy.

为方便理解和描述,请参阅图1,图1为一种现有的RS(4,2)码的编码和解码过程的示意图。对于(k,m)纠删码,大小为R的数据对象被分成k个数据块{C0,C1,...Ck-1,},每个块的大小为,然后根据编码方程/>k个数据块被编码为m个校验块,cij(0≤i≤m-1,0≤j≤k-1)表示计算校验块Pi时Cj的系数,编码过程也可以用(n,k)的生成矩阵G与(k,1)的数据列向量的乘积来表示。不难理解的是,对于(k,m)纠删码而言,表示的是k个数据块被编码为m个校验块,(k,m)代表的是解码时k行m列的子矩阵,为方便理解和描述,后续不再对此进行赘述。不难理解的是,本实施例中的i,j均为整数。For convenience of understanding and description, please refer to Figure 1, which is a schematic diagram of the encoding and decoding process of an existing RS(4,2) code. For (k, m) erasure coding, a data object of size R is divided into k data blocks {C0 , C1 ,...Ck-1 ,}, and the size of each block is , and then according to the encoding equation/> k data blocks are encoded into m check blocks, cij (0≤i≤m-1,0≤j≤k-1) represents the coefficient of Cj when calculating the check blockPi . The encoding process can also be used It is represented by the product of the generating matrix G of (n, k) and the data column vector of (k, 1). It is not difficult to understand that for (k, m) erasure code, it means that k data blocks are encoded into m check blocks, and (k, m) represents the sub-code of k rows and m columns during decoding. Matrix, for the convenience of understanding and description, will not be described in detail later. It is easy to understand that i and j in this embodiment are both integers.

其中,RS码的解码操作可以通过将生成矩阵中的(k,k)的子矩阵求逆并与(k,1)的幸存数据向量相乘来完成故障块的恢复。如图1所示,当两个数据块C0,C1故障时,恢复分为两个步骤。首先,对生成矩阵中剩余块所对应的行(2,3,4,5)的(4,4)子矩阵求逆。其次,将逆矩阵与幸存块向量C2,C3,P0,P1相乘。乘积的结果是原始数据块C0,C1。需要理解的是,由于在计算机语言中,一般以0为起始值,由此,0代表第一行的(1,0,0,0)。Among them, the decoding operation of the RS code can complete the recovery of the fault block by inverting the sub-matrix of (k, k) in the generating matrix and multiplying it with the surviving data vector of (k, 1). As shown in Figure 1, when two data blocks C0 and C1 fail, recovery is divided into two steps. First, invert the (4, 4) submatrix of the row (2, 3, 4, 5) corresponding to the remaining block in the generator matrix. Second, multiply the inverse matrix with the surviving block vectors C2 , C3 , P0 , P1 . The result of the product is the original data block C0 , C1 . What needs to be understood is that in computer languages, 0 is generally used as the starting value, so 0 represents (1,0,0,0) in the first line.

在其他的现有技术方案中,请参阅图2,图2为一种现有的RS(4,2)码的修复过程的示意图。即k=4,m=2。分布式存储群集中的数据块修复需要修复节点从k不同的幸存节点检索k个数据块,所有这些数据块都与故障数据块在同一条带。不难理解的是,假设一个块Di出现故障,并且是k个幸存块的集合。由此,便可以使用k块的线性组合来修复,其中aij是修复块Di时Dj所对应的系数。在图2中,当块D0故障时,幸存的4个节点N1,N2,N3,N4分别将块D1,D2,D3,D4传送给N6节点,然后根据解码方程/>完成块的修复,总共花费4个时隙。不难理解的是,本申请实施例中的i,j均为整数。In other existing technical solutions, please refer to FIG. 2 , which is a schematic diagram of an existing RS(4,2) code repair process. That is, k=4, m=2. Data block repair in a distributed storage cluster requires the repair node to retrieve k data blocks from k different surviving nodes, all of which are on the same stripe as the failed data block. It is easy to understand that suppose a block Di fails, and is the set of k surviving blocks. From this, we can use a linear combination of k blocks to repair , where aij is the coefficient corresponding to Dj when repairing block Di . In Figure 2, when block D0 fails, the four surviving nodes N1 , N2 , N3 , and N4 transmit blocks D1 , D2 , D3 , and D4 to the N6 node respectively, and then according to Decoding equations/> It takes a total of 4 time slots to complete the repair of the block. It is easy to understand that i and j in the embodiment of this application are both integers.

在其他的现有技术方案中,请参阅图3,图3为一种现有的RS(4,2)码的PPR修复过程的示意图。即k=4,m=2。单节点修复算法PPR会将整个修复过程分解为多个子过程减少修复节点的下载流量。图3表明了这种算法的修复过程,首先节点N2和N4接收N1和N3发送的块a01D1和a03D3,节点N2将接收来的块与自身的块进行计算生成中间块a01D1+a02D2,然后发送中间块给节点N4,最后N4完成块的计算并传送给修复节点N6,总共花费3个时隙。In other existing technical solutions, please refer to FIG. 3 , which is a schematic diagram of an existing PPR repair process of RS(4,2) code. That is, k=4, m=2. The single node repair algorithm PPR will decompose the entire repair process into multiple sub-processes to reduce the download traffic of the repair node. Figure 3 shows the repair process of this algorithm. First, nodes N2 and N4 receive the blocks a01 D1 and a03 D3 sent by N1 and N3. The node N2 compares the received blocks with its own blocks. Calculate and generate the intermediate block a01 D1 + a02 D2 , and then send the intermediate block to node N4 . Finally, N4 completes the calculation of the block and transmits it to the repair node N6 , spending a total of 3 time slots.

在其他的现有技术方案中,请参阅图4,图4为一种现有的RS(6,2)码的负载不平衡的示意图。即k=6,m=2。在该现有方案中,是一种并行多个单节点修复算法。这将会产生两个严重的问题:节点间负载不平衡和多个独立的单节点修复产生的冗余块传输。图4表明负载不平衡问题,当修复条带内2个块时每个节点的上传块数和下载块数不平衡导致修复时间取决于负载最重的节点。其中,图4中的左侧upload及download所对应的数字,即为现有方案在传输过程中上传及下载的负载。右侧upload及download所对应的数字,即为本申请技术方案在传输过程中上传及下载的负载。In other existing technical solutions, please refer to FIG. 4 , which is a schematic diagram of load imbalance of an existing RS(6,2) code. That is, k=6, m=2. In this existing solution, it is a parallel multiple single-node repair algorithm. This will create two serious problems: load imbalance between nodes and redundant block transfers resulting from multiple independent single-node repairs. Figure 4 illustrates the load imbalance problem. When repairing 2 blocks within a stripe, the number of uploaded blocks and downloaded blocks per node is imbalanced, causing the repair time to depend on the most heavily loaded node. Among them, the numbers corresponding to upload and download on the left side in Figure 4 are the loads uploaded and downloaded during the transmission process of the existing solution. The numbers corresponding to upload and download on the right are the loads uploaded and downloaded during the transmission process of the technical solution of this application.

还存在一种现有技术方案,请参阅图5,图5为一种现有的RS(4,3)码的传输时隙的示意图。即k=4,m=3。其中,图5表明多个独立的单节点修复产生的冗余块传输问题,在传输数据块时可以传输原始数据块以供多个节点同时修复多个块而减少不必要的块传输,从而减少修复时间。需要说明的是,图5的左面为现有技术方案,图5的右面为本申请技术方案。由图5还可以看出,传输原始的数据块给其他节点,那么其他节点就可以利用这个原始数据块来修复多个节点上的数据块。其中,图5左面逻辑上花费了3个时隙,右面花费2个时隙。由此,本申请技术方案与现有技术方案对比,可明显减少传输时间。There is also an existing technical solution. Please refer to Figure 5. Figure 5 is a schematic diagram of an existing RS(4,3) code transmission time slot. That is, k=4, m=3. Among them, Figure 5 shows the redundant block transmission problem caused by multiple independent single-node repairs. When transmitting data blocks, original data blocks can be transmitted for multiple nodes to repair multiple blocks at the same time to reduce unnecessary block transmission, thereby reducing Repair time. It should be noted that the left side of Figure 5 shows the existing technical solution, and the right side of Figure 5 shows the technical solution of the present application. It can also be seen from Figure 5 that by transmitting the original data block to other nodes, other nodes can use this original data block to repair data blocks on multiple nodes. Among them, the left side of Figure 5 logically takes 3 time slots, and the right side takes 2 time slots. Therefore, compared with the existing technical solutions, the technical solution of the present application can significantly reduce the transmission time.

为方便对本申请技术方案进行详细描述,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。In order to facilitate a detailed description of the technical solutions of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only part of the implementation of the present application. examples, not all examples. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of this application.

基于上述描述,本申请技术方案公开了一种分布式纠删码存储系统下多节点修复调度优化方法,请参阅图6,图6为本申请实施例公开的一种节点修复方法的流程示意图。包括步骤601-步骤604。Based on the above description, the technical solution of this application discloses a multi-node repair scheduling optimization method under a distributed erasure code storage system. Please refer to Figure 6. Figure 6 is a schematic flow chart of a node repair method disclosed in an embodiment of this application. Including steps 601 to 604.

601、获取分布式存储系统中的所有数据节点。601. Obtain all data nodes in the distributed storage system.

需要提前说明的是,在本申请技术方案中,算法将修复过程主要分为两个阶段,分别是归并阶段和传输阶段。其中在归并节点,只有幸存节点彼此传输数据块,而在传输节点,数据块被传输到故障节点(也可理解为修复节点)。为方便理解,后续将对此进行详细描述。It should be noted in advance that in the technical solution of this application, the algorithm mainly divides the repair process into two stages, namely the merging stage and the transmission stage. Among them, at the merging node, only the surviving nodes transmit data blocks to each other, while at the transmission node, the data blocks are transmitted to the failed node (which can also be understood as a repair node). For ease of understanding, this will be described in detail later.

在其中一个实施例中,分布式存储系统可运行于元数据服务器上。在分布式存储系统中,若某些数据节点出现故障,分布式存储系统可获取当前系统中所有的数据节点。具体的,分布式存储群集中的数据块修复需要幸存节点检索k个数据块,所有这些数据块都与故障数据块在同一条带。由此,便可以获取到当前分布式存储系统中与故障节点处于同一条带上的所有数据节点。不难理解的是,该数据节点包括幸存节点和故障节点。故障节点即为发生故障的数据块,幸存节点即为未发生故障的数据块。为方便理解和描述,后续不再对此进行赘述。其中,为方便理解,可参阅图3中对条带的描述,具体此处不再赘述。In one embodiment, the distributed storage system may run on a metadata server. In a distributed storage system, if some data nodes fail, the distributed storage system can obtain all data nodes in the current system. Specifically, data block repair in a distributed storage cluster requires the surviving nodes to retrieve k data blocks, all of which are on the same stripe as the failed data block. Thus, all data nodes in the current distributed storage system that are on the same strip as the faulty node can be obtained. It is not difficult to understand that the data node includes surviving nodes and failed nodes. The faulty node is the data block that has failed, and the surviving node is the data block that has not failed. To facilitate understanding and description, this will not be described again. For ease of understanding, please refer to the description of the strips in Figure 3, and the details will not be described again here.

602、将幸存节点进行调度分组。602. Schedule and group the surviving nodes.

分布式存储系统中的算法可以先确定数据节点中的所有幸存节点,然后将幸存节点进行调度分组。其中,任一组次中的幸存节点为成对存在。可以理解为,任一组次中的幸存节点的数量为偶数个。The algorithm in the distributed storage system can first determine all the surviving nodes in the data node, and then schedule the surviving nodes into groups. Among them, the surviving nodes in any group exist in pairs. It can be understood that the number of surviving nodes in any group is an even number.

在其中一个具体的实施例中,算法可以将k个幸存节点分成多个部分进行调度,每个部分的节点数为2的指数次幂,各部分之间可以并行执行。其中,每个部分可以理解为一个组次或分组。In one specific embodiment, the algorithm can divide k surviving nodes into multiple parts for scheduling. The number of nodes in each part is an exponential power of 2, and each part can be executed in parallel. Among them, each part can be understood as a group or grouping.

具体的,若幸存节点集合为S,故障节点集合为M。在将k个幸存节点分为p个部分(或组)中,不难理解的是,k,p,i均为自然数,/>即为对应的部分或组次,在该部分中,幸存节点的数量为偶数个。Specifically, if the set of surviving nodes is S and the set of failed nodes is M. In dividing k surviving nodes into p parts (or groups), It is not difficult to understand that k, p, and i are all natural numbers,/> That is, the corresponding part or group, in which the number of surviving nodes is an even number.

603、将同一组次中的幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,形成节点修复块。603. Merge the surviving nodes in the same group to transmit the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block.

将幸存节点分组完成后,便可以将在同一组次中的幸存节点进行归并操作,从而将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,以形成节点修复块。具体可参阅图4及图5中对于传输数据的描述。After grouping the surviving nodes, the surviving nodes in the same group can be merged, thereby transmitting the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block. For details, please refer to the description of transmission data in Figure 4 and Figure 5 .

在其中一个具体的实施例中,算法将同一组次中的幸存节点相互之间传输原始数据块,具体的,让每个组的后半部分中的节点向前半部分中的节点传输节点修复数据。不难理解的是,该节点修复数据可以理解为该幸存节点的节点数据,具体此处不做赘述。基于该实施例,在另外一个具体的实施例中,如果故障节点的数量大于任一组次中节点数量的一半,则该组内前半部分的节点再传输另外的块给后半部分的节点。同时,如果归并操作不能在同一时隙内完成,则继续按照上述规则进行归并,直到归并完成。不难理解的是,在归并阶段结束时,k个幸存节点被归并到同一组中。In one specific embodiment, the algorithm transmits original data blocks between surviving nodes in the same group. Specifically, the nodes in the second half of each group transmit node repair data to the nodes in the first half. . It is easy to understand that the node repair data can be understood as the node data of the surviving node, and the details will not be described here. Based on this embodiment, in another specific embodiment, if the number of faulty nodes is greater than half of the number of nodes in any group, then the first half of the nodes in the group will transmit additional blocks to the second half of the nodes. At the same time, if the merging operation cannot be completed within the same time slot, merging will continue according to the above rules until the merging is completed. It is not difficult to understand that at the end of the merging phase, the k surviving nodes are merged into the same group.

由此,任一幸存节点便可接收其他幸存节点传输过来的节点修复数据,从而与本身的节点修复数据形成节点修复块。As a result, any surviving node can receive the node repair data transmitted from other surviving nodes, thereby forming a node repair block with its own node repair data.

604、将任一组次中的节点修复块传输至故障节点,对故障节点进行修复,以使得故障节点根据节点修复块完成故障修复。604. Transmit the node repair block in any group to the faulty node, and repair the faulty node, so that the faulty node completes fault repair according to the node repair block.

在归并阶段结束后,便可进入传输阶段。其中,将任一组次中的节点修复块传输至故障节点,对故障节点进行修复,从而使得故障节点根据节点修复块完成故障修复。After the merge phase is completed, the transfer phase can be entered. Among them, the node repair block in any group is transmitted to the faulty node, and the faulty node is repaired, so that the faulty node completes fault repair according to the node repair block.

在其中一个具体的实施例中,若故障节点的数量为m个,则可以将任一组次中的前m个节点向m个故障节点发送节点修复块,从而使得故障节点完成修复。其中,需要说明的是,发送节点修复块为一一对应发送,具体的,第一个幸存节点对应的节点修复块发送至第一个故障节点,第二个幸存节点对应的节点修复块发送至第二个故障节点,依次类推。具体此处不做赘述。In one specific embodiment, if the number of faulty nodes is m, the first m nodes in any group can send node repair blocks to the m faulty nodes, so that the faulty nodes can be repaired. Among them, it should be noted that the node repair block is sent in one-to-one correspondence. Specifically, the node repair block corresponding to the first surviving node is sent to the first failed node, and the node repair block corresponding to the second surviving node is sent to The second faulty node, and so on. The details will not be described here.

通过本实施例公开的一种节点修复方法,先获取分布式存储系统中的所有数据节点;其中,所有数据节点包括故障节点及幸存节点,幸存节点为所有数据节点中除开故障节点的节点;再将幸存节点进行调度分组,其中,任一组次中的幸存节点为成对存在;然后将同一组次中的幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一组次中的其他幸存节点,形成节点修复块;其中,节点修复块由其他幸存节点的节点修复数据及当前幸存节点的节点修复数据形成;最后将任一组次中的节点修复块传输至故障节点,对故障节点进行修复,以使得故障节点根据节点修复块完成故障修复。从而,在分布式存储系统中,能通过幸存节点有效修复故障节点,同时,将幸存节点进行分组传输,也便于故障节点修复时降低网络传输开销,从而加快修复速度,从而尽可能地优化在分布式存储系统下的多节点修复过程。Through a node repair method disclosed in this embodiment, all data nodes in the distributed storage system are first obtained; where all data nodes include fault nodes and surviving nodes, and the surviving nodes are nodes among all data nodes excluding the fault node; and then Schedule the surviving nodes into groups, where the surviving nodes in any group exist in pairs; then merge the surviving nodes in the same group to transfer the node repair data in the current surviving node to the same group The other surviving nodes in the group form a node repair block; among them, the node repair block is formed by the node repair data of other surviving nodes and the node repair data of the current surviving node; finally, the node repair block in any group is transmitted to the fault node, Repair the faulty node so that the faulty node completes fault repair according to the node repair block. Therefore, in the distributed storage system, the failed nodes can be effectively repaired through the surviving nodes. At the same time, the surviving nodes are grouped for transmission, which also facilitates the reduction of network transmission overhead when repairing the failed nodes, thus speeding up the repair speed and optimizing the distribution as much as possible. Multi-node repair process under traditional storage system.

为方便对本申请实施例公开的一种节点修复方法进行详细描述,请参阅图7,图7为本申请实施例公开的另一种节点修复方法的流程示意图。包括步骤701-步骤709。To facilitate a detailed description of a node repair method disclosed in an embodiment of the present application, please refer to Figure 7 , which is a schematic flow chart of another node repair method disclosed in an embodiment of the present application. Including steps 701 to 709.

701、获取分布式存储系统中的所有数据节点。701. Obtain all data nodes in the distributed storage system.

本实施例中步骤701与前述图6中步骤601类似,具体此处不做赘述。Step 701 in this embodiment is similar to the aforementioned step 601 in Figure 6, and will not be described in detail here.

702、将k个幸存节点分为个组,并将第一幸存节点的第一幸存节点修复数据传输至第二幸存节点。702. Divide the k surviving nodes into group, and transmit the first survivor node repair data of the first survivor node to the second survivor node.

当获取到所有数据节点后,便可以将所有幸存节点两两一组相互传输原始数据块。具体的,将k个幸存节点划分为个组,每个组中有两个幸存节点。若至少存在一组,则让每个组的后半部分中的幸存节点向前半部分中的幸存节点传输节点修复数据。不难理解的是,k为正整数,且第一幸存节点为在同一组次中位于后半部分的幸存节点。第二幸存节点为在同一组次中位于前半部分的幸存节点。When all data nodes are obtained, all surviving nodes can be transferred to each other in pairs to transmit original data blocks. Specifically, the k surviving nodes are divided into groups, with two surviving nodes in each group. If there is at least one group, let the surviving nodes in the second half of each group transmit node repair data to the surviving nodes in the first half. It is not difficult to understand that k is a positive integer, and the first surviving node is the surviving node located in the second half of the same group. The second surviving node is the surviving node located in the first half of the same group.

在其中一个具体的实施例中,为方便对本实施例中的归并阶段及传输阶段进行详细描述,请参阅图8,图8为本申请实施例公开的一种RS(6,3)码的多节点修复的示意图。其中,k=6,m=3。幸存节点的数量为6,其中,N3至N8为幸存节点,N0至N2为故障节点,为方便理解和描述后续不再对此进行赘述。由图8可以看出,将幸存节点分为3组,其中,N3与N4在一组,N5与N6在一组,N7与N8在一组。在第一个时隙中,N4将节点修复数据D4传输至N3,N6将节点修复数据D6传输至N5,N8将节点修复数据D8传输至N7。需要理解的是,在本步骤中,可以将第一幸存节点理解为N4,N6或N8,对应的第二幸存节点为N3,N5或N7。第一幸存节点修复数据对应为D4,D6或D8。In one of the specific embodiments, in order to facilitate a detailed description of the merging stage and the transmission stage in this embodiment, please refer to FIG. 8. FIG. Schematic diagram of node repair. Among them, k=6, m=3. The number of surviving nodes is 6, among which N3 to N8 are surviving nodes and N0 to N2 are faulty nodes. For the convenience of understanding and description, this will not be described again. As can be seen from Figure 8, the surviving nodes are divided into three groups, among which N3 and N4 are in one group, N5 and N6 are in one group, and N7 and N8 are in one group. In the first time slot, N4 transmits the node repair data D4 to N3, N6 transmits the node repair data D6 to N5, and N8 transmits the node repair data D8 to N7. It should be understood that in this step, the first survivor node can be understood as N4, N6 or N8, and the corresponding second survivor node is N3, N5 or N7. The repair data of the first surviving node corresponds to D4, D6 or D8.

703、若同一分组中的幸存节点的数量的一半小于故障节点的数量,将当前分组中前半部分的幸存节点的节点修复数据传输至后半部分的幸存节点。703. If half of the number of surviving nodes in the same group is less than the number of failed nodes, transmit the node repair data of the first half of the surviving nodes in the current group to the second half of the surviving nodes.

当同一分组中的幸存节点的数量的一般小于故障节点的数量时,则将当前分组中的前半部分的幸存节点的节点修复数据传输至后半部分的幸存节点。需要说明的是,本步骤的执行前提为故障节点与幸存节点的数量大小的判定依据。When the number of surviving nodes in the same group is generally smaller than the number of failed nodes, the node repair data of the first half of the surviving nodes in the current group is transmitted to the second half of the surviving nodes. It should be noted that the premise for the execution of this step is the basis for determining the number of failed nodes and surviving nodes.

在其中一个具体的实施例中,可参阅图8,由于在同一组次中幸存节点的数量为2,则一半为1,而故障节点的数量为3,由此需要执行该步骤。即,前半部分的幸存节点将节点修复数据传输至后半部分的幸存节点。具体的,第二幸存节点N3,N5或N7将对应的节点修复数据D3,D5或D7传输至第一幸存节点N4,N6或N8。不难理解的是,由于N3至N8之间数据的传输方式是幸存节点之间两两相互传输,因此任一幸存节点的负载数均为1,由此可以在同一时隙中执行完成。In one specific embodiment, please refer to Figure 8. Since the number of surviving nodes in the same group is 2, half is 1, and the number of failed nodes is 3, so this step needs to be performed. That is, the first half of the surviving nodes transmit the node repair data to the second half of the surviving nodes. Specifically, the second survivor node N3 , N5 or N7 transmits the corresponding node repair data D3 , D5 or D7 to the first survivor node N4 , N6 or N8 . It is not difficult to understand that since the data transmission method between N3 and N8 is that the surviving nodes transmit each other in pairs, the load number of any surviving node is 1, so the execution can be completed in the same time slot.

还需要说明的是,本实施例中步骤702-步骤703中的节点修复数据D3至D8还可以理解为原始数据块。It should also be noted that the node repair data D3 to D8 in steps 702 to 703 in this embodiment can also be understood as original data blocks.

704、将个组进行归并操作,获取/>个组。704. will Groups are merged and obtained/> group.

基于上述步骤702后,将个组两两归并为/>个组,每个组具有4个幸存节点。若存在幸存节点未能构成一组,则执行步骤705。若幸存节点能构成一组,则执行步骤706。Based on the above step 702, the The groups are merged into/> groups, each group has 4 surviving nodes. If there are surviving nodes that cannot form a group, step 705 is executed. If the surviving nodes can form a group, step 706 is executed.

在其中一个具体的实施例中,可参阅图8。对应的,N3至N6可分为同一组,而N7和N8由于该幸存节点的数量为2,不能构成分组,由此,N7和N8执行步骤705,对于N3至N6的幸存节点执行步骤706。In one specific embodiment, please refer to FIG. 8 . Correspondingly, N3 to N6 can be divided into the same group, but N7 and N8 cannot form a group because the number of surviving nodes is 2. Therefore, N7 and N8 execute step 705. For N3 to N The surviving node6 executes step 706.

705、若存在未位于任一个组中的幸存节点,确定未位于任一/>个组中的幸存节点为剩余幸存节点,并将所有剩余幸存节点归并至剩余组。705. If there is no Surviving nodes in groups, determined not to be in any/> The surviving nodes in each group are the remaining surviving nodes, and all remaining surviving nodes are merged into the remaining groups.

当存在未位于任一个组中的幸存节点时,则确定未位于任一/>个组中的幸存节点为剩余幸存节点。同时,也可以将所有的剩余幸存节点归并至剩余组。在其中一个实施例中,也可以不将剩余幸存节点归并至剩余组,具体此处不做赘述。When there is not located in any When there are surviving nodes in groups, it is determined that they are not located in any/> The surviving nodes in each group are the remaining surviving nodes. At the same time, all remaining surviving nodes can also be merged into the remaining group. In one embodiment, the remaining surviving nodes may not be merged into the remaining group, and details will not be described here.

在其中一个具体的实施例中,可参阅图8,在图8中,N7及N8即为剩余幸存节点。对应的N3至N6为在一个组次中的幸存节点,具体此处不做赘述。In one specific embodiment, please refer to Figure 8. In Figure 8, N7 and N8 are the remaining surviving nodes. The corresponding N3 to N6 are the surviving nodes in a group, and the details will not be described here.

706、获取第三幸存节点的第三幸存节点修复数据及第四幸存节点的第四幸存节点修复数据,并确定第三幸存节点对应的第三加权系数及第四幸存节点对应的第四加权系数。706. Obtain the third surviving node repair data of the third surviving node and the fourth surviving node repair data of the fourth surviving node, and determine the third weighting coefficient corresponding to the third surviving node and the fourth weighting coefficient corresponding to the fourth surviving node. .

将幸存节点分为具有4个节点的组后,将该组的幸存节点分别定义为第三幸存节点、第四幸存节点、第五幸存节点和第六幸存节点,其中,第三幸存节点和第四幸存节点为在该组中位于后半部分的幸存节点,第五幸存节点及第六幸存节点为在该组中位于前半部分的幸存节点。具体的,先获取第三幸存节点的第三幸存节点修复数据及第四幸存节点的第四幸存节点修复数据,并确定第三幸存节点对应的第三加权系数及第四幸存节点对应的第四加权系数。After dividing the surviving nodes into groups with 4 nodes, the surviving nodes of the group are defined as the third surviving node, the fourth surviving node, the fifth surviving node and the sixth surviving node respectively, where the third surviving node and the sixth surviving node The fourth surviving node is the surviving node located in the second half of the group, and the fifth surviving node and the sixth surviving node are the surviving nodes located in the first half of the group. Specifically, first obtain the third surviving node repair data of the third surviving node and the fourth surviving node repair data of the fourth surviving node, and determine the third weighting coefficient corresponding to the third surviving node and the fourth corresponding to the fourth surviving node. Weighting coefficient.

在其中一个具体的实施例中,可参阅图8。其中,N5和N6分别对应第三幸存节点和第四幸存节点,N3和N4分别对应第五幸存节点和第六幸存节点,为方便描述,后续不再对此进行赘述,以具体的节点编号进行说明。那么有,对应的第三幸存节点修复数据的数据块为D5,第四幸存节点修复数据的数据块为D6。由于归并的规则是让每个组的后半部分中的幸存节点向前半部分中的幸存节点传输数据,由此,便需要确定N5和N6对应的第三加权系数和第四加权系数,具体的,第三加权系数为a05和a06,第四加权系数为a15和a16In one specific embodiment, please refer to FIG. 8 . Among them, N5 and N6 correspond to the third surviving node and the fourth surviving node respectively, and N3 and N4 correspond to the fifth surviving node and the sixth surviving node respectively. For the convenience of description, this will not be described in detail later. node number for description. Then, the corresponding data block of the third surviving node's repair data is D5 , and the data block of the fourth surviving node's repair data is D6 . Since the rule of merging is to let the surviving nodes in the second half of each group transmit data to the surviving nodes in the first half, it is necessary to determine the third and fourth weighting coefficients corresponding to N5 and N6 , Specifically, the third weighting coefficient is a05 and a06 , and the fourth weighting coefficient is a15 and a16 .

707、确定基于第三幸存节点修复数据、第四幸存节点修复数据及第三加权系数的与第五幸存节点对应的第五幸存节点修复数据,并确定基于第三幸存节点修复数据、第四幸存节点修复数据及第四加权系数的与第六幸存节点对应的第六幸存节点修复数据,并将第五幸存节点修复数据传输至第五幸存节点,将第六幸存节点修复数据传输至第六幸存节点。707. Determine the fifth surviving node repair data corresponding to the fifth surviving node based on the third surviving node repair data, the fourth surviving node repair data and the third weighting coefficient, and determine the fifth surviving node repair data based on the third surviving node repair data, the fourth surviving node repair data and the third weighting coefficient. The node repair data and the sixth surviving node repair data corresponding to the sixth surviving node of the fourth weighted coefficient are transmitted to the fifth surviving node repair data, and the sixth surviving node repair data is transmitted to the sixth surviving node. node.

基于上述实施例,便可以根据第三幸存节点修复数据、第四幸存节点修复数据及第三加权系数的节点修复数据确定第五幸存节点修复数据,其中,该第五幸存节点修复数据与第五幸存节点对应。对应的,还可以根据第三幸存节点修复数据、第四幸存节点修复数据及第四加权系数的节点修复数据确定第六幸存节点修复数据,其中,该第六幸存节点修复数据与第六幸存节点对应。Based on the above embodiment, the fifth surviving node repair data can be determined based on the third surviving node repair data, the fourth surviving node repair data and the node repair data of the third weighting coefficient, wherein the fifth surviving node repair data is the same as the fifth surviving node repair data. Surviving node correspondence. Correspondingly, the sixth surviving node repair data can also be determined based on the third surviving node repair data, the fourth surviving node repair data and the node repair data of the fourth weighted coefficient, wherein the sixth surviving node repair data is the same as the sixth surviving node repair data. correspond.

在其中一个具体的实施例中,可参阅图8,具体的,对应于第五幸存节点(N3)的数据为a05D5+a06D6,对应于第六幸存节点(N4)的数据为a15D5+a16D6In one specific embodiment, please refer to Figure 8. Specifically, the data corresponding to the fifth survivor node (N3) is a05 D5 +a06 D6 , and the data corresponding to the sixth survivor node (N4) is is a15 D5 +a16 D6 .

基于上述实施例,在另外一个具体的实施例中,由于故障节点的数量大于组中幸存节点数量的一半,即故障节点的数量为3,组中幸存节点数量的一半为2,由此,组内前半部分的幸存节点还需要再传输另外的数据块给后半部分的幸存节点。具体的,可参阅图8,其中,N3可以将D3与D4对应的数据块传输至N5。即a23D3+a24D4。其中,a23和a24均为加权系数,具体此处不对具体的加权系数的数值进行赘述。不难理解的是,在本步骤中,在第二时隙(Timeslot2)中,a05D5+a06D6,a15D5+a16D6,a23D3+a24D4可以同时传输,因为每个幸存节点可以同时上传和下载。由于D7和D8满足传输阶段条件,则直接进行传输。Based on the above embodiment, in another specific embodiment, since the number of faulty nodes is greater than half of the number of surviving nodes in the group, that is, the number of faulty nodes is 3, and half of the number of surviving nodes in the group is 2. Therefore, the group The surviving nodes in the first half also need to transmit additional data blocks to the surviving nodes in the second half. Specifically, please refer to Figure 8, where N3 can transmit the data blocks corresponding to D3 and D4 to N5 . That is a23 D3 +a24 D4 . Among them, a23 and a24 are both weighting coefficients, and the specific values of the weighting coefficients will not be described here. It is not difficult to understand that in this step, in the second time slot (Timeslot2), a05 D5 +a06 D6 , a15 D5 +a16 D6 , a23 D3 +a24 D4 Simultaneous transfers are possible because each surviving node can upload and download at the same time. Since D7 and D8 meet the transmission stage conditions, transmission is performed directly.

具体的传输方式可参阅图4或图5的描述。For specific transmission methods, please refer to the description in Figure 4 or Figure 5.

708、将经过传输节点修复数据后的任一幸存节点进行归并,生成节点修复块。708. Merge any surviving node after the data has been repaired by the transmission node to generate a node repair block.

将传输过节点修复数据后的幸存节点进行归并,从而生成节点修复块。Merge the surviving nodes after the node repair data has been transmitted to generate a node repair block.

在其中一个具体的实施例中,基于步骤707,并结合图8可以看出,由于N3至N6的幸存节点在同一组次,由于,对应于N3的幸存节点,其节点修复块的数据为对于N4的幸存节点,其节点修复块的数据为/>对于N5的幸存节点,其节点修复块的数据为不难理解的是,其中的a0i,a1i或a2i均属于加权系数,上述步骤以对此进行详细说明,具体此处不做赘述,同时,也不对加权系数的具体数值进行限制。In one specific embodiment, based on step 707 and combined with Figure 8, it can be seen that since the surviving nodes from N3 to N6 are in the same group, since, corresponding to the surviving node of N3 , its node repair block The data is For N4 surviving nodes, the data of its node repair block is/> For N5 surviving nodes, the data of its node repair block is It is easy to understand that a0i , a1i or a2i are all weighting coefficients. The above steps will be explained in detail and will not be repeated here. At the same time, the specific values of the weighting coefficients are not limited.

基于上述实施例,在另外一个具体的实施例中,基于步骤705,由于N7和N8节点属于剩余幸存节点,由此,其对应的N7或N8的节点修复块为a07D7+a08D8,a17D7+a18D8和a27D7+a28D8。不难理解的是,由于故障节点的数量为3,而N7和N8所在组的幸存节点的数量为2,而修复每个数据块时都需要任意k个幸存的数据块,比如修复D0需要D3至D8。由此,便需要三个数据块进行修复,本实施例中,采用a27D7+a28D8的数据块作为对N2节点的节点修复块。不难理解的是,a07,a17,a27,a08,a18及a28均为加权系数,即a07和a08为上述中所描述的第一剩余加权系数,a17和a18为上述中所描述的第二剩余加权系数,a27和a28为上述中所描述的第三剩余加权系数,N7为第一剩余幸存节点,N8为第二剩余幸存节点。具体此处不做赘述,也不对加权系数的具体数值进行限制。Based on the above embodiment, in another specific embodiment, based on step 705, since nodes N7 and N8 belong to the remaining surviving nodes, their corresponding node repair blocks of N7 or N8 are a07 D7 + a08 D8 , a17 D7 +a18 D8 and a27 D7 +a28 D8 . It is not difficult to understand that since the number of failed nodes is 3 and the number of surviving nodes in the group of N7 and N8 is 2, any k surviving data blocks are needed to repair each data block, such as repairing D0 requires D3 to D8 . Therefore, three data blocks are needed for repair. In this embodiment, the data block a27 D7 + a28 D8 is used as the node repair block for the N2 node. It is not difficult to understand that a07 , a17 , a27 , a08 , a18 and a28 are all weighting coefficients, that is, a07 and a08 are the first remaining weighting coefficients described above, a17 and a18 is the second remaining weighting coefficient described in the above, a27 and a28 are the third remaining weighting coefficient described in the above, N7 is the first remaining surviving node, and N8 is the second remaining surviving node. The specific details will not be described here, nor will the specific value of the weighting coefficient be limited.

需要说明的是,具体的加权系数的计算方式可参阅图1的描述,具体此处不进行赘述。It should be noted that the specific calculation method of the weighting coefficient can refer to the description in Figure 1, and will not be described again here.

709、将任一组次中的幸存节点进行排序,以当故障节点的数量为m个时,将任一组次中排序为前m个的幸存节点对应的节点修复块分别传输至故障节点,对m个故障节点进行修复。709. Sort the surviving nodes in any group, so that when the number of failed nodes is m, transmit the node repair blocks corresponding to the top m surviving nodes in any group to the failed nodes respectively. Repair m faulty nodes.

基于上述步骤,当故障节点的数量为m个时,可以将任一组次中排序前m个的幸存节点对应的节点修复块分别传输至故障节点,从而对m个故障节点进行修复。其中,可以对任一组次中的幸存节点进行排序。不难理解的是,在本实施例中,若上述中已确定相关幸存节点的排序,在本步骤中也可不执行相关的排序动作。Based on the above steps, when the number of faulty nodes is m, the node repair blocks corresponding to the top m surviving nodes in any group can be transmitted to the faulty node respectively, thereby repairing the m faulty nodes. Among them, the surviving nodes in any group can be sorted. It is easy to understand that in this embodiment, if the sorting of relevant surviving nodes has been determined above, the relevant sorting action may not be performed in this step.

在其中一个具体的实施例中,可参阅图8,由于m=3,由此便需要将在同一组次中的N3至N5节点所对应的节点修复块传输至故障节点。具体的,基于步骤708,N3节点对应的节点修复块的数据为N4节点对应的节点修复块的数据为/>N5节点对应的节点修复块的数据为/>In one specific embodiment, please refer to Figure 8. Since m=3, it is necessary to transmit the node repair blocks corresponding to theN3 toN5 nodes in the same group to the faulty node. Specifically, based on step 708, the data of the node repair block corresponding to the N3 node is The data of the node repair block corresponding to N4 nodes is/> The data of the node repair block corresponding to N5 nodes is/>

由此,将N3节点对应的节点修复块传输至N0的故障节点,以对D0的数据块进行修复;将N4节点对应的节点修复块传输至N1的故障节点,以对D1的数据块进行修复;将N5节点对应的节点修复块传输至N2的故障节点,以对D2的数据块进行修复。不难理解的是,由于N0至N2节点并未接收其他的数据块,即负载为0,由此,在第三时隙(Timeslot3)可以同步执行相关的传输步骤,即将相关的节点修复块分别传输至N0至N2节点。Therefore, the node repair block corresponding to the N3 node is transmitted to the faulty node N0 to repair the data block of D0 ; the node repair block corresponding to the N4 node is transmitted to the faulty node N1 to repair the data block D1 Repair the data block of the node; transmit the node repair block corresponding to the N5 node to the faulty node of N2 to repair the data block of D2 . It is not difficult to understand that since nodes N0 to N2 have not received other data blocks, that is, the load is 0, therefore, the relevant transmission steps can be executed synchronously in the third time slot (Timeslot3), that is, the relevant nodes can be repaired Blocks are transmitted to N0 to N2 nodes respectively.

由于第二时隙及第三时隙(即Timeslot2及Timeslot3)的N7和N8的幸存节点并未发生变化,由此,在TimeSlot2时N7和N8节点发送a07D7+a08D8和a17D7+a18D8,在Timeslot3未发生变化是因为此时N0,N1,N2三个节点的下载带宽已经占满了(上述已描述),所以只能等待下一个时隙发送,即在第四时隙(Timeslot4)由N7节点发送相关的数据块a27D7+a28D8至N2Since the surviving nodes of N7 and N8 in the second and third time slots (i.e., Timeslot2 and Timeslot3) have not changed, therefore, at TimeSlot2, the N7 and N8 nodes send a07 D7 + a08 D8 and a17 D7 + a18 D8 do not change in Timeslot3 because the download bandwidth of the three nodes N0 , N1 , and N2 is already full at this time (described above), so we can only wait. The next time slot is sent, that is, in the fourth time slot (Timeslot4), the N7 node sends the relevant data block a27 D7 +a28 D8 to N2 .

综上所述,本实施例通过提出一种节点修复方法,可以在通用的分布式纠删码存储系统下实现多节点修复调度优化,从而解决分布式纠删码存储系统下修复多个节点所带来的网络传输开销问题,加快节点的修复速度。在其他实施例中,本申请方案也可以应用在异或(XOR,Exclusive OR)码或里所(RS,Reed-solomon codes)码等多个流行的编码之上,具体此处不进行限制。In summary, by proposing a node repair method, this embodiment can realize multi-node repair scheduling optimization under a general distributed erasure code storage system, thereby solving the problem of repairing multiple nodes under a distributed erasure code storage system. The network transmission overhead caused by this problem will speed up the repair of nodes. In other embodiments, the solution of the present application can also be applied to multiple popular codes such as XOR (Exclusive OR) codes or Reed-solomon codes (RS) codes, which are not specifically limited here.

应该理解的是,虽然如上所述的各实施例所涉及的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,如上所述的各实施例所涉及的流程图中的至少一部分步骤可以包括多个步骤或者多个阶段,这些步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤中的步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that although the steps in the flowcharts involved in the above-mentioned embodiments are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated in this article, there is no strict order restriction on the execution of these steps, and these steps can be executed in other orders. Moreover, at least some of the steps in the flowcharts involved in the above embodiments may include multiple steps or stages. These steps or stages are not necessarily executed at the same time, but may be completed at different times. The execution order of these steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least part of the steps or stages in other steps.

若方案涉及敏感信息(如用户信息、企业信息),则应当说明针对敏感信息的收集、使用和处理需要遵守相关国家和地区的法律法规和标准,且需要在相应主体(如用户或企业等)许可或同意的情况下进行。If the plan involves sensitive information (such as user information, enterprise information), it should explain that the collection, use and processing of sensitive information need to comply with the laws, regulations and standards of relevant countries and regions, and need to be disclosed to the corresponding subject (such as users or enterprises, etc.) performed with permission or consent.

请参阅图9,图9为本申请实施例公开的一种节点修复系统的结构示意图。Please refer to FIG. 9 , which is a schematic structural diagram of a node repair system disclosed in an embodiment of the present application.

获取单元901,用于获取分布式存储系统中的所有数据节点;其中,所有数据节点包括故障节点及幸存节点,幸存节点为所有数据节点中除开故障节点的节点;The acquisition unit 901 is used to acquire all data nodes in the distributed storage system; where all data nodes include fault nodes and survivor nodes, and survivor nodes are nodes among all data nodes excluding the fault node;

分组单元902,用于将幸存节点进行调度分组,其中,任一组次中的幸存节点为成对存在;The grouping unit 902 is used to schedule and group the surviving nodes, where the surviving nodes in any group exist in pairs;

归并单元903,用于将同一分组中的幸存节点进行归并操作,以将当前幸存节点中的节点修复数据传输至同一分组中的其他幸存节点,形成节点修复块;其中,节点修复块由其他幸存节点的节点修复数据及当前幸存节点的节点修复数据形成;The merging unit 903 is used to merge the surviving nodes in the same group to transmit the node repair data in the current surviving node to other surviving nodes in the same group to form a node repair block; wherein the node repair block is composed of other surviving nodes. The node repair data of the node and the node repair data of the current surviving node are formed;

传输单元904,用于将任一分组中的节点修复块传输至故障节点,对故障节点进行修复,以使得故障节点根据节点修复块完成故障修复。The transmission unit 904 is used to transmit the node repair block in any group to the faulty node, and repair the faulty node, so that the faulty node completes fault repair according to the node repair block.

示例性地,系统包括:By way of example, the system includes:

分组单元902,具体用于将k个幸存节点分为个组,其中,任一/>个组中包括两个幸存节点,k为正整数;Grouping unit 902 is specifically used to divide the k surviving nodes into groups, any of which/> Each group includes two surviving nodes, and k is a positive integer;

传输单元904,具体用于将第一幸存节点的第一幸存节点修复数据传输至第二幸存节点;其中,第一幸存节点为在个组中处于同一组次且位于后半部分的幸存节点,第二幸存节点为在/>个组中处于同一组次且位于前半部分的幸存节点。The transmission unit 904 is specifically configured to transmit the first survivor node repair data of the first survivor node to the second survivor node; wherein the first survivor node is Among the surviving nodes in the same group and located in the second half, the second surviving node is in/> The surviving nodes in the same group and located in the first half of the group.

示例性地,系统还包括:确定单元905;Exemplarily, the system further includes: a determining unit 905;

归并单元903,还用于将个组进行归并操作,获取/>个组;其中,任一个组中包括四个幸存节点;The merging unit 903 is also used to merge Groups are merged and obtained/> groups; among them, any group includes four surviving nodes;

获取单元901,还用于获取第三幸存节点的第三幸存节点修复数据及第四幸存节点的第四幸存节点修复数据,并确定第三幸存节点对应的第三加权系数及第四幸存节点对应的第四加权系数;其中,第三幸存节点及第四幸存节点分别为在个组中处于同一组次且位于后半部分的幸存节点;The obtaining unit 901 is also used to obtain the third surviving node repair data of the third surviving node and the fourth surviving node repair data of the fourth surviving node, and determine the third weighting coefficient corresponding to the third surviving node and the corresponding fourth surviving node. The fourth weighting coefficient of ; where, the third surviving node and the fourth surviving node are respectively Surviving nodes in the same group and located in the second half of each group;

确定单元905,还用于确定基于第三幸存节点修复数据、第四幸存节点修复数据及第三加权系数的与第五幸存节点对应的第五幸存节点修复数据,并确定基于第三幸存节点修复数据、第四幸存节点修复数据及第四加权系数的与第六幸存节点对应的第六幸存节点修复数据;其中,第五幸存节点及第六幸存节点分别为在个组中处于同一组次且位于前半部分的幸存节点;The determining unit 905 is also used to determine the fifth surviving node repair data corresponding to the fifth surviving node based on the third surviving node repair data, the fourth surviving node repair data and the third weighting coefficient, and determine the fifth surviving node repair data based on the third surviving node repair data. data, the fourth surviving node repair data and the sixth surviving node repair data corresponding to the sixth surviving node with the fourth weighting coefficient; among which, the fifth surviving node and the sixth surviving node are respectively Surviving nodes in the same group and located in the first half of each group;

传输单元904,还用于将第五幸存节点修复数据传输至第五幸存节点,将第六幸存节点修复数据传输至第六幸存节点。The transmission unit 904 is also used to transmit the repair data of the fifth survivor node to the fifth survivor node, and transmit the repair data of the sixth survivor node to the sixth survivor node.

示例性地,系统还包括:Exemplarily, the system also includes:

传输单元904,还用于当同一分组中的幸存节点的数量的一半小于故障节点的数量时,将当前分组中前半部分的幸存节点的节点修复数据传输至后半部分的幸存节点;The transmission unit 904 is also configured to transmit the node repair data of the first half of the surviving nodes in the current group to the second half of the surviving nodes when half of the number of surviving nodes in the same group is less than the number of failed nodes;

归并单元903,具体用于将经过传输节点修复数据后的任一幸存节点进行归并,生成节点修复块;其中,节点修复块包含当前幸存节点获取的所有节点修复数据。The merging unit 903 is specifically used to merge any surviving node after transmitting node repair data to generate a node repair block; wherein the node repair block includes all node repair data obtained by the current surviving node.

示例性地,系统还包括:生成单元906;Exemplarily, the system further includes: a generating unit 906;

确定单元905,具体用于当存在未位于任一个组中的幸存节点时,确定未位于任一/>个组中的幸存节点为剩余幸存节点,并将所有剩余幸存节点归并至剩余组;Determining unit 905 is specifically used when there is a When the surviving node in the group is determined not to be in any/> The surviving nodes in each group are the remaining surviving nodes, and all remaining surviving nodes are merged into the remaining groups;

确定单元905,还用于确定剩余组中的剩余幸存节点的节点修复数据及与节点修复数据对应的加权系数;The determination unit 905 is also used to determine the node repair data of the remaining surviving nodes in the remaining group and the weighting coefficient corresponding to the node repair data;

生成单元906,用于根据剩余幸存节点的节点修复数据及加权系数生成与剩余组对应的节点修复块。The generating unit 906 is configured to generate node repair blocks corresponding to the remaining groups according to the node repair data and weighting coefficients of the remaining surviving nodes.

示例性地,若剩余组中的剩余幸存节点的数量小于故障节点的数量,且故障节点包括第一故障节点、第二故障节点及第三故障节点,剩余幸存节点包括第一剩余幸存节点及第二剩余幸存节点,系统还包括:修复单元907;For example, if the number of remaining surviving nodes in the remaining group is less than the number of faulty nodes, and the faulty nodes include a first faulty node, a second faulty node, and a third faulty node, the remaining surviving nodes include the first remaining surviving node and the third faulty node. Two remaining surviving nodes, the system also includes: repair unit 907;

确定单元905,具体用于根据第一剩余幸存节点的节点修复数据、第二剩余幸存节点的节点修复数据及第一剩余加权系数确定第一剩余节点修复块;其中,第一剩余加权系数为与第一故障节点对应的加权系数;The determining unit 905 is specifically configured to determine the first remaining node repair block based on the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node, and the first remaining weighted coefficient; wherein the first remaining weighted coefficient is and The weighting coefficient corresponding to the first failed node;

确定单元905,还用于根据第一剩余幸存节点的节点修复数据、第二剩余幸存节点的节点修复数据及第二剩余加权系数确定第二剩余节点修复块;其中,第二剩余加权系数为与第二故障节点对应的加权系数;The determining unit 905 is also configured to determine the second remaining node repair block based on the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node, and the second remaining weighting coefficient; wherein the second remaining weighting coefficient is and The weighting coefficient corresponding to the second failed node;

确定单元905,还用于根据第一剩余幸存节点的节点修复数据、第二剩余幸存节点的节点修复数据及第三剩余加权系数确定第三剩余节点修复块;其中,第三剩余加权系数为与第三故障节点对应的加权系数;The determining unit 905 is also configured to determine the third remaining node repair block based on the node repair data of the first remaining surviving node, the node repair data of the second remaining surviving node, and the third remaining weighted coefficient; wherein the third remaining weighted coefficient is and The weighting coefficient corresponding to the third failed node;

修复单元907,用于将第一剩余节点修复块、第二剩余节点修复块或第三剩余节点修复块分别传输至第一故障节点、第二故障节点或第三故障节点,以分别对第一故障节点、第二故障节点或第三故障节点进行修复。The repair unit 907 is configured to transmit the first remaining node repair block, the second remaining node repair block or the third remaining node repair block to the first fault node, the second fault node or the third fault node respectively to repair the first fault node respectively. The faulty node, the second faulty node or the third faulty node are repaired.

示例性地,系统还包括:排序单元908;Exemplarily, the system also includes: a sorting unit 908;

排序单元908,用于将任一组次中的幸存节点进行排序;Sorting unit 908, used to sort the surviving nodes in any group;

传输单元904,具体用于当故障节点的数量为m个时,将任一组次中排序为前m个的幸存节点对应的节点修复块分别传输至故障节点,对m个故障节点进行修复;其中,m为正整数。The transmission unit 904 is specifically used to transmit the node repair blocks corresponding to the top m surviving nodes in any group to the faulty nodes when the number of faulty nodes is m, and repair the m faulty nodes; Among them, m is a positive integer.

下面请参阅图10,本申请实施例公开的一种节点修复装置的结构示意图包括:Please refer to Figure 10 below. A schematic structural diagram of a node repair device disclosed in an embodiment of the present application includes:

中央处理器1001,存储器1005,输入输出接口1004,有线或无线网络接口1003以及电源1002;Central processing unit 1001, memory 1005, input and output interface 1004, wired or wireless network interface 1003 and power supply 1002;

存储器1005为短暂存储存储器或持久存储存储器;Memory 1005 is a short-term storage memory or a persistent storage memory;

中央处理器1001配置为与存储器1005通信,并执行存储器1005中的指令操作以执行前述图6或图7所示实施例中的节点修复方法。The central processing unit 1001 is configured to communicate with the memory 1005 and execute instruction operations in the memory 1005 to perform the node repair method in the embodiment shown in FIG. 6 or 7 .

本申请实施例还提供一种芯片系统,其特征在于,芯片系统包括至少一个处理器和通信接口,通信接口和至少一个处理器通过线路互联,至少一个处理器用于运行计算机程序或指令,以执行前述图6或图7所示实施例中的节点修复方法。Embodiments of the present application also provide a chip system, which is characterized in that the chip system includes at least one processor and a communication interface, the communication interface and the at least one processor are interconnected through lines, and the at least one processor is used to run computer programs or instructions to execute The node repair method in the embodiment shown in Figure 6 or Figure 7 is mentioned above.

所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working processes of the systems, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented. On the other hand, the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.

所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。In addition, each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit. The above integrated units can be implemented in the form of hardware or software functional units.

所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,read-onlymemory)、随机存取存储器(RAM,random access memory)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application is essentially or contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of this application. The aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, read-only memory), random access memory (RAM, random access memory), magnetic disk or optical disk and other media that can store program code.

Claims (10)

CN202311058784.9A2023-08-222023-08-22Node repairing method and related equipmentPendingCN117097746A (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN202311058784.9ACN117097746A (en)2023-08-222023-08-22Node repairing method and related equipment

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN202311058784.9ACN117097746A (en)2023-08-222023-08-22Node repairing method and related equipment

Publications (1)

Publication NumberPublication Date
CN117097746Atrue CN117097746A (en)2023-11-21

Family

ID=88781627

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN202311058784.9APendingCN117097746A (en)2023-08-222023-08-22Node repairing method and related equipment

Country Status (1)

CountryLink
CN (1)CN117097746A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN119011080A (en)*2024-07-252024-11-22哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院)Node repairing method based on erasure codes and related equipment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN119011080A (en)*2024-07-252024-11-22哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院)Node repairing method based on erasure codes and related equipment

Similar Documents

PublicationPublication DateTitle
US10963341B2 (en)Isolating the introduction of software defects in a dispersed storage network
RU2501072C2 (en)Distributed storage of recoverable data
US10223201B2 (en)Method of storing encoded data slices using a distributed agreement protocol
CN109643258B (en) Multi-node repair of erasure codes using high-rate minimal storage regeneration
US20170123922A1 (en)Expansion of dispersed storage network (dsn) memory
US20170285999A1 (en)Using separate read and write memory devices in a distributed storage network
CN103810061B (en)A kind of High Availabitity cloud storage method
CN111149093A (en)Data coding, decoding and repairing method of distributed storage system
US20170212683A1 (en)Provisioning ds units on the fly in a dsn memory in response to load
US20170177228A1 (en)Generation collapse
CN105518996B (en)A kind of data decoding method based on binary field reed-solomon code
EP2533450B1 (en)Method and device for data check processing
CN113541870A (en)Recovery optimization method for erasure code storage single node failure
CN113687975A (en)Data processing method, device, equipment and storage medium
WO2023151290A1 (en)Data encoding method and apparatus, device, and medium
CN102843212B (en)Coding and decoding processing method and device
CN112799605B (en)Square part repeated code construction method, node repair method and capacity calculation method
CN117097746A (en)Node repairing method and related equipment
CN111045843B (en)Distributed data processing method with fault tolerance capability
US10469406B2 (en)Partial task execution in a dispersed storage network
CN114237970A (en)Method and device for expanding erasure code storage system
US10110258B2 (en)Accelerated erasure coding for storage systems
Xie et al.AZ-Recovery: An efficient crossing-AZ recovery scheme for erasure coded cloud storage systems
US10958731B2 (en)Indicating multiple encoding schemes in a dispersed storage network
Li et al.Parallelizing degraded read for erasure coded cloud storage systems using collective communications

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination

[8]ページ先頭

©2009-2025 Movatter.jp