CN112612653B - A business recovery method, device, arbitration server and storage system - Google Patents

A business recovery method, device, arbitration server and storage system

Publication number
CN112612653B
Authority
CN
China
Prior art keywords
storage array
arbitration
storage
array
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011331689.8A
Other languages
Chinese (zh)
Other versions
CN112612653A (en
Inventor
段虎成
梁永贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Huawei Technology Co Ltd
Original Assignee
Chengdu Huawei Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Huawei Technology Co Ltd
Priority to CN202011331689.8A
Publication of CN112612653A
Application granted
Publication of CN112612653B

Abstract

A service recovery method, device, arbitration server and storage system. The method includes: after the storage system switches from a dual-slave state to a fault recovery state, the first storage array sends a first arbitration request to the arbitration server. On receiving the first arbitration request, the arbitration server determines a first arbitration result for the first storage array and a second arbitration result for the second storage array, and then sends the first arbitration result to the first storage array and the second arbitration result to the second storage array. After receiving the first arbitration result, the first storage array initiates a negotiation with the second storage array and determines a negotiation result. If the negotiation result is that the first storage array is in the winning state and the second storage array is in the losing state, the first storage array provides services to the client.

Description

Service recovery method and device, arbitration server and storage system
Technical Field
The present application relates to the field of storage technologies, and in particular, to a service recovery method, a device, an arbitration server, and a storage system.
Background
With the advent of the internet era, the service requirements of various industries are increasing. To ensure service reliability, dual-active storage systems are adopted to provide cloud computing services.
Referring to FIG. 1, a schematic diagram of a dual-active storage system is shown. As shown in FIG. 1, the dual-active storage system includes an arbitration server and two storage arrays (storage array A and storage array B) that back each other up. When the dual-active storage system operates normally, both storage arrays are in the running state and provide services at the same time. When storage array A cannot process services, for example because of a power failure or an equipment fault, the arbitration server arbitrates the states of storage array A and storage array B so that the services originally distributed on storage array A are smoothly taken over by storage array B. The working mode of the dual-active storage system thus changes from storage array A and storage array B providing services simultaneously to storage array B providing services alone. In this way, higher data reliability and service continuity can be provided to the user.
However, due to limitations of the networking environment, a dual-active storage system in use frequently enters a dual-slave state, in which neither storage array can process services, because of network faults, power supply faults and the like. Once the dual-active storage system is in the dual-slave state, even after the fault is recovered, the logs of each storage array must be analyzed manually before one storage array is selected to pull up the service. Manual log analysis takes a long time and often delays service recovery.
Disclosure of Invention
The embodiments of the application provide a service recovery method, a device, an arbitration server and a storage system, which are used to improve the service recovery efficiency of a dual-active storage system.
In a first aspect, an embodiment of the present application provides a service recovery method applied to a storage system that includes a first storage array, a second storage array, and an arbitration server. After the storage system switches from a dual-slave state to a failure recovery state, the first storage array sends a first arbitration request to the arbitration server. On receiving the first arbitration request, the arbitration server determines a first arbitration result for the first storage array and a second arbitration result for the second storage array, and then sends the first arbitration result to the first storage array and the second arbitration result to the second storage array. After receiving the first arbitration result, the first storage array initiates a negotiation with the second storage array and determines a negotiation result. If the negotiation result is that the first storage array is in the winning state and the second storage array is in the losing state, the first storage array provides services to the client.
In the above technical solution, after the storage system changes from the dual-slave state to the failure recovery state, the first storage array can actively initiate a negotiation process to determine the winning storage array automatically, and the winning storage array actively pulls up the service. This reduces the time spent waiting for the service to be pulled up manually and improves the service recovery efficiency of the storage system.
In one possible design, after receiving the first arbitration result, the first storage array initiates a negotiation with the second storage array. When the second storage array receives the negotiation request, it feeds back to the first storage array the arbitration result that the arbitration server determined for it, that is, the second arbitration result. After receiving the second arbitration result from the second storage array, the first storage array determines the negotiation result according to the first arbitration result and the second arbitration result.
In the above technical solution, the first storage array may initiate negotiation according to the arbitration result sent by the arbitration server, so as to improve accuracy of the negotiation result.
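The message order described above can be sketched as follows. This is only an illustrative trace under stated assumptions: the function name `recover_service`, the array labels `array1` and `array2`, and the two stub callables are hypothetical and do not come from the application, which does not specify message formats.

```python
from typing import Callable, Dict, List, Tuple

# State labels taken from the text.
WINNING, LOSING, CHECKING = "winning", "losing", "checking"


def recover_service(
    arbitrate: Callable[[str], Dict[str, str]],
    negotiate: Callable[[str, str], Tuple[str, str]],
) -> List[str]:
    """Trace the recovery steps from the first storage array's point of view."""
    log = ["array1 -> server: first arbitration request"]
    # The server determines a result for each array (hypothetical dict shape).
    results = arbitrate("array1")
    log.append(f"server -> array1: {results['array1']}")
    log.append(f"server -> array2: {results['array2']}")
    log.append("array1 -> array2: negotiation request")
    # The second array feeds back the result the server determined for it.
    log.append(f"array2 -> array1: {results['array2']}")
    state1, state2 = negotiate(results["array1"], results["array2"])
    if state1 == WINNING and state2 == LOSING:
        log.append("array1: provide service to clients")
    return log


# Stub callables with illustrative values only.
steps = recover_service(
    arbitrate=lambda requester: {"array1": WINNING, "array2": CHECKING},
    negotiate=lambda first, second: (WINNING, LOSING),
)
```

Passing the arbitration and negotiation logic in as callables keeps the sketch focused on the order of messages rather than on any particular decision rule.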
In one possible design, the first storage array may determine the negotiation result according to the following negotiation principles, which may include, but are not limited to, the following:
If the first arbitration result is the winning state and the second arbitration result is the checking state, the first storage array determines that the first storage array is in the winning state and the second storage array is in the losing state; or
If the first arbitration result is the winning state and the second arbitration result is the losing state, the first storage array determines that the first storage array is in the winning state and the second storage array is in the losing state; or
If the first arbitration result is the losing state and the second arbitration result is the winning state, the first storage array determines that the first storage array is in the losing state and the second storage array is in the winning state; or
If the first arbitration result is the checking state and the second arbitration result is the checking state, the first storage array determines that the first storage array is in the winning state and the second storage array is in the losing state.
In the above technical solution, predefining multiple negotiation principles increases the flexibility of the storage system during service recovery.
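The four negotiation principles listed above can be expressed as a lookup table. The following is a minimal sketch; the names `State`, `NEGOTIATION_RULES` and `negotiate` are hypothetical, and raising an error for combinations the text does not list is an assumption of this sketch, since the application states the list is not exhaustive.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    WINNING = "winning"
    LOSING = "losing"
    CHECKING = "checking"


# (first arbitration result, second arbitration result)
#   -> (first storage array state, second storage array state)
NEGOTIATION_RULES: Dict[Tuple[State, State], Tuple[State, State]] = {
    (State.WINNING, State.CHECKING): (State.WINNING, State.LOSING),
    (State.WINNING, State.LOSING): (State.WINNING, State.LOSING),
    (State.LOSING, State.WINNING): (State.LOSING, State.WINNING),
    (State.CHECKING, State.CHECKING): (State.WINNING, State.LOSING),
}


def negotiate(first: State, second: State) -> Tuple[State, State]:
    """Return the negotiated states for the first and second storage arrays."""
    try:
        return NEGOTIATION_RULES[(first, second)]
    except KeyError:
        # Assumption: combinations not listed in the text are rejected here.
        raise ValueError(f"no rule listed for {first}, {second}") from None
```

For example, `negotiate(State.CHECKING, State.CHECKING)` yields the first storage array as winning and the second as losing, matching the last principle in the list.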
In one possible design, the first storage array sends synchronization data to the second storage array after the first storage array provides business services to clients.
After the difference data of the first storage array has been completely synchronized to the second storage array, the dual-active characteristic of the storage system is restored, so that the first storage array and the second storage array can again provide services simultaneously.
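One way to picture the synchronization of difference data is shown below. This is a simplified sketch, not the application's mechanism: it assumes each array's data can be modeled as a mapping from a block number to its contents, and the function names `difference_data` and `synchronize` are hypothetical. Blocks that exist only on the second array are not handled.

```python
from typing import Dict


def difference_data(winner: Dict[int, bytes], loser: Dict[int, bytes]) -> Dict[int, bytes]:
    """Blocks on the winning array that are missing or different on the other array."""
    return {block: data for block, data in winner.items() if loser.get(block) != data}


def synchronize(winner: Dict[int, bytes], loser: Dict[int, bytes]) -> bool:
    """Copy the difference data to the other array; True once nothing is left to send."""
    loser.update(difference_data(winner, loser))
    return not difference_data(winner, loser)


# Illustrative contents only.
first_array = {0: b"a", 1: b"b2", 2: b"c"}
second_array = {0: b"a", 1: b"b1"}
pending = difference_data(first_array, second_array)
fully_synchronized = synchronize(first_array, second_array)
```

In this model, `fully_synchronized` becoming true corresponds to the point at which both arrays can provide services simultaneously again.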
In one possible design, after the replication link between the first storage array and the second storage array is disconnected and before the storage system enters the dual-slave state, the arbitration server may determine and store an arbitration result according to a received third arbitration request sent by the first storage array and/or a fourth arbitration request sent by the second storage array, without feeding the result back to the first storage array and the second storage array. In that case, after receiving the first arbitration request sent by the first storage array, the arbitration server determines the first arbitration result and the second arbitration result directly from the stored arbitration result.
In the above technical solution, if the arbitration server already stored the arbitration result of the first storage array and the arbitration result of the second storage array before the storage system entered the dual-slave state, the arbitration server may directly send the stored arbitration results to the first storage array and the second storage array, so that the energy consumption of the arbitration server may be reduced.
In one possible design, the arbitration results stored in the arbitration server include, but are not limited to, the following:
the first arbitration result is the winning state and the second arbitration result is the losing state; or
the first arbitration result is the winning state and the second arbitration result is the checking state; or
the first arbitration result is the losing state and the second arbitration result is the winning state; or
the first arbitration result is the checking state and the second arbitration result is the checking state.
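The reuse of a stored arbitration result can be sketched as follows. The class `ArbitrationServer` and its method names are hypothetical, and a single stored result is assumed for simplicity; the application does not describe how the result is stored.

```python
from typing import Callable, Optional, Tuple

Result = Tuple[str, str]  # (first arbitration result, second arbitration result)


class ArbitrationServer:
    """Hypothetical in-memory model, not the application's implementation."""

    def __init__(self) -> None:
        self.stored_result: Optional[Result] = None

    def store_without_feedback(self, result: Result) -> None:
        # Determined after the replication link was disconnected and before the
        # dual-slave state; kept locally and not sent to either storage array.
        self.stored_result = result

    def handle_first_request(self, decide: Callable[[], Result]) -> Result:
        # Reuse the stored result if one exists; otherwise decide afresh.
        if self.stored_result is not None:
            return self.stored_result
        return decide()


server = ArbitrationServer()
server.store_without_feedback(("winning", "checking"))
reused = server.handle_first_request(lambda: ("losing", "winning"))
```

Here `reused` is the stored pair, so the fallback decision function is never called.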
In one possible design, if the arbitration server receives a second arbitration request from the second storage array before determining the first arbitration result of the first storage array and the second arbitration result of the second storage array, the arbitration server determines the first arbitration result and the second arbitration result according to the order in which the first arbitration request and the second arbitration request are received. For example, if the arbitration server receives the first arbitration request before the second arbitration request, the arbitration server determines that the first arbitration result is the winning state and the second arbitration result is the losing state.
In the above technical solution, the arbitration server may also determine the arbitration result according to the order in which the arbitration requests of the two storage arrays are received, for example, by setting the storage array whose arbitration request was received earlier to the winning state and the other storage array to the losing state, which may improve the flexibility of the arbitration server.
In one possible design, if the arbitration server determines that a second arbitration request has not been received from the second storage array within a predetermined time period from the receipt of the first arbitration request, the arbitration server determines that the first arbitration result is the winning state and the second arbitration result is the checking state.
In the above technical solution, if the arbitration server does not receive an arbitration request from a certain storage array, that storage array may be considered faulty. The storage array whose arbitration request was received may therefore be set to the winning state, and the storage array whose arbitration request was not received may be set to the checking state, which may improve the flexibility of the arbitration server.
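The two designs above, ordering by arrival and the timeout rule, can be combined in one small function. This is a sketch under assumptions: the function name `arbitrate_by_arrival`, the use of seconds as arrival times, and the tie-break in favor of the first storage array are all hypothetical choices not stated in the application.

```python
from typing import Optional, Tuple


def arbitrate_by_arrival(
    first_arrival: float,
    second_arrival: Optional[float],
    timeout: float,
) -> Tuple[str, str]:
    """Return (first arbitration result, second arbitration result).

    first_arrival and second_arrival are receipt times in seconds;
    second_arrival is None if no second arbitration request was received.
    """
    # No second request within the predetermined period after the first one.
    if second_arrival is None or second_arrival - first_arrival > timeout:
        return ("winning", "checking")
    # The array whose request was received earlier wins (ties: first array).
    if first_arrival <= second_arrival:
        return ("winning", "losing")
    return ("losing", "winning")
```

For example, `arbitrate_by_arrival(0.0, 1.0, 5.0)` gives the first storage array the winning state and the second the losing state, while `arbitrate_by_arrival(0.0, None, 5.0)` leaves the second storage array in the checking state.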
In a second aspect, an embodiment of the present application provides a service restoration device applied to a first storage array, where the first storage array is located in a storage system that further includes a second storage array and an arbitration server. The device includes a processor configured to implement the method performed by the first storage array in the method described in the first aspect. The device may also include a memory for storing program instructions and data. The memory is coupled to the processor, which may call and execute the program instructions stored in the memory to implement the method performed by the first storage array in the method described in the first aspect. The device may also include a communication interface through which the device communicates with other devices. Illustratively, the other devices include the second storage array or the arbitration server mentioned in the first aspect.
In one possible design, the device includes a communication interface and a processor. Specifically, after the storage system switches from the dual-slave state to the failure recovery state, the processor sends a first arbitration request to the arbitration server through the communication interface, receives a first arbitration result from the arbitration server through the communication interface, initiates a negotiation with the second storage array through the communication interface according to the first arbitration result, and determines a negotiation result. If the negotiation result is that the first storage array is in the winning state and the second storage array is in the losing state, the processor provides services to the client through the communication interface.
In one possible design, the processor initiates a negotiation with the second storage array through the communication interface according to the first arbitration result, receives a second arbitration result from the second storage array through the communication interface, where the second arbitration result is the arbitration server's arbitration result for the second storage array, and determines the negotiation result according to the first arbitration result and the second arbitration result.
In one possible design, the processor determines that the first storage array is in the winning state and the second storage array is in the losing state if the first arbitration result is the winning state and the second arbitration result is the checking state; or determines that the first storage array is in the winning state and the second storage array is in the losing state if the first arbitration result is the winning state and the second arbitration result is the losing state; or determines that the first storage array is in the losing state and the second storage array is in the winning state if the first arbitration result is the losing state and the second arbitration result is the winning state; or determines that the first storage array is in the winning state and the second storage array is in the losing state if the first arbitration result is the checking state and the second arbitration result is the checking state.
In one possible design, the processor is further configured to send synchronization data to the second storage array after the processor provides business services to clients through the communication interface.
In a third aspect, an embodiment of the present application provides an arbitration server located in a storage system, where the storage system further includes a first storage array and a second storage array, and the arbitration server includes a processor configured to implement the method performed by the arbitration server in the method described in the first aspect. The arbitration server may also include a memory for storing program instructions and data. The memory is coupled to the processor, which may call and execute the program instructions stored in the memory to implement the method performed by the arbitration server in the method described in the first aspect. The arbitration server may also include a communication interface through which the arbitration server communicates with other devices. Illustratively, the other devices include the first storage array or the second storage array mentioned in the first aspect.
In one possible design, the arbitration server includes a processor and a communication interface, specifically, after the storage system is switched from the dual-slave state to the failure recovery state, the processor receives a first arbitration request from a first storage array through the communication interface, the processor determines a first arbitration result for the first storage array and a second arbitration result for the second storage array, the processor sends the first arbitration result to the first storage array through the communication interface, and sends the second arbitration result to the second storage array through the communication interface.
In one possible design, the processor determines the first arbitration result and the second arbitration result from a stored arbitration result. The stored arbitration result is an arbitration result that the arbitration server determined, and did not feed back to the first storage array and the second storage array, after the replication link of the storage system was disconnected and before the storage system entered the dual-slave state, the replication link being the link between the first storage array and the second storage array;
The processor determines the stored arbitration result based on a third arbitration request sent by the first storage array and received through the communication interface and/or a fourth arbitration request sent by the second storage array and received through the communication interface, after the replication link of the storage system is disconnected and before the storage system enters the dual-slave state.
In one possible design, the stored arbitration result includes:
the first arbitration result is the winning state and the second arbitration result is the losing state; or
the first arbitration result is the winning state and the second arbitration result is the checking state; or
the first arbitration result is the losing state and the second arbitration result is the winning state; or
the first arbitration result is the checking state and the second arbitration result is the checking state.
In one possible design, before the processor determines the first arbitration result of the first storage array and the second arbitration result of the second storage array, a second arbitration request is received from the second storage array through the communication interface, the processor determines the first arbitration result and the second arbitration result according to a sequence of receiving the first arbitration request and the second arbitration request, and if the processor receives the second arbitration request after receiving the first arbitration request, the processor determines that the first arbitration result is winning state and the second arbitration result is losing state.
In one possible design, if the processor determines that no second arbitration request has been received from the second storage array through the communication interface within a predetermined time period from receipt of the first arbitration request, the processor determines that the first arbitration result is the winning state and the second arbitration result is the checking state.
In a fourth aspect, an embodiment of the present application provides a service restoration device, applied to a first storage array, where the first storage array is located in a storage system, where the storage system further includes a second storage array and an arbitration server, where the device may include a transceiver unit and a processing unit, where the modules may perform the corresponding functions of the first storage array in any of the design examples of the first aspect, and the modules may be implemented by software modules, or may be implemented by corresponding hardware entities, for example, when implemented by corresponding hardware entities, where the function of the transceiver unit is similar to the function of the communication interface in the second aspect, and the function of the processing unit is similar to the function of the processor in the second aspect.
In a fifth aspect, an embodiment of the present application provides an arbitration server, where the arbitration server may include a transceiver unit and a processing unit, where the modules may perform corresponding functions of the arbitration server in any of the design examples of the first aspect, and the modules may be implemented by using software modules, or may be implemented by using corresponding hardware entities, for example, when implemented by using corresponding hardware entities, the transceiver unit may function similarly to the function of the communication interface in the third aspect, and the processing unit may function similarly to the processor in the third aspect.
In a sixth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of the first aspects performed by a first storage array.
In a seventh aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of the first aspects performed by an arbitration server.
In an eighth aspect, an embodiment of the present application provides a computer program product storing a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of the first aspects performed by the first storage array.
In a ninth aspect, embodiments of the present application provide a computer program product storing a computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of the first aspects, performed by an arbitration server.
In a tenth aspect, the present application provides a chip system comprising a processor, and optionally a memory, for implementing the method performed by the first storage array of the first aspect or the method performed by the arbitration server. The chip system may consist of a chip, or may include a chip and other discrete devices.
In an eleventh aspect, the present application provides a storage system including the service restoration apparatus of the second aspect or the fourth aspect and the arbitration server of the third aspect or the fifth aspect.
In a twelfth aspect, the present application provides a storage system, which includes the service restoration device of the second aspect or the fourth aspect, the arbitration server of the third aspect or the fifth aspect, and the second storage array.
For the advantageous effects of the second to twelfth aspects and their implementations, refer to the description of the advantageous effects of the method of the first aspect and its implementations.
Drawings
FIG. 1 is a block diagram of a dual-active storage system according to an embodiment of the present application;
FIG. 2 is a block diagram of another dual-active storage system according to an embodiment of the present application;
FIG. 3A is a block diagram of another dual-active storage system according to an embodiment of the present application;
FIG. 3B is a block diagram of another dual-active storage system according to an embodiment of the present application;
FIG. 4 is a flowchart of an example of a service restoration method according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating a state change process of a storage array according to an embodiment of the present application;
FIG. 6 is a flowchart of another example of a service restoration method according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a service restoration device according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another service restoration device according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an arbitration server according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of another arbitration server according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be described in detail below with reference to the accompanying drawings and specific embodiments of the present application.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, unless otherwise specified, the character "/" herein generally indicates an "or" relationship between the associated objects.
In addition, it should be understood that "a plurality of" means two or more. Words such as "first" and "second" are used only to distinguish between the items described and are not to be construed as indicating or implying relative importance or order.
The embodiment of the application provides a service recovery method, which is applied to a dual-active storage system. The dual-active storage system may be a file storage system, a block storage system, an object storage system, or a combination of these storage systems, which is not limited in this embodiment of the present application.
Referring to FIG. 1 to FIG. 3B, four possible architecture diagrams of a dual-active storage system according to an embodiment of the application are shown. The dual-active storage system shown in FIG. 1 has been described above and is not described again here. Unlike the dual-active storage system shown in FIG. 1, in the dual-active storage system shown in FIG. 2 each storage array is a coupled node set formed of a plurality of storage nodes that cooperate to provide services externally. As shown in FIG. 2, storage array A of the dual-active storage system comprises storage nodes 0 to 2, and storage array B comprises storage nodes 3 to 5.
It should be noted that, in the dual-active storage system shown in FIG. 1 or FIG. 2, storage array A and storage array B may be located in the same region. As shown in FIG. 3A, storage array A and storage array B are located in the same region and may be connected through a high-speed network to ensure low latency. Storage array A and storage array B may also be located in different regions. As shown in FIG. 3B, storage array A is located in region 1 and storage array B is located in region 2, and wired connections such as Fibre Channel (FC) and Internet Small Computer System Interface (iSCSI) are supported between the regions.
It should be noted that, the dual-active storage system is not limited to the architecture shown in fig. 1 to 3B, and the dual-active storage system described in the embodiment of the present application is for more clearly describing the technical solution of the embodiment of the present application, and does not constitute a limitation on the technical solution provided by the embodiment of the present application, and those skilled in the art can know that, along with the evolution of the storage technology and the architecture of the storage system, the technical solution provided by the embodiment of the present application is also applicable to similar technical problems.
In the following, terms related to a storage system are described to facilitate understanding by those skilled in the art.
(1) Replication link: a link between the two storage arrays in a dual-active storage system.
(2) Arbitration link: a link between a storage array in a dual-active storage system and the arbitration server, for example, the link between storage array A and the arbitration server and the link between storage array B and the arbitration server.
(3) Dual-active: the two storage arrays back each other up, are both in the running state, and can process the same service at the same time. When one storage array in the dual-active storage system fails, the service is quickly switched to the other storage array, which ensures service continuity.
(4) Dual-slave: neither storage array can process services. When the service in the dual-active storage system cannot be switched to either storage array, the service is interrupted. Causes of the dual-slave state include, but are not limited to, the following three: a) the two arrays of the dual-active storage system are powered down simultaneously; b) the two arrays of the dual-active storage system and the arbitration server are powered down simultaneously; and c) the replication link and the arbitration link of the dual-active storage system fail simultaneously.
(5) Consistency group: composed of logical storage units (logical unit numbers, LUNs) or files on the two storage arrays that back each other up in a dual-active storage system. The two storage arrays that constitute the consistency group are designated as a priority array and a non-priority array; for example, storage array A is the priority array and storage array B is the non-priority array.
(6) Arbitration server: when the replication link is disconnected, the arbitration server arbitrates with the dual-active consistency group as the unit. The arbitration server may be a stand-alone device, such as a computer or a mobile terminal, or may be a logical concept, such as a software module or a virtual machine in a virtualized implementation, without limitation.
In the following, the service recovery procedure in the prior art is described by taking the dual-active storage system shown in FIG. 1 as an example, where the dual-active storage system entered the dual-slave state because the replication link and the arbitration link failed simultaneously.
Step 1: When the dual-active storage system operates normally, the two storage arrays provide services simultaneously; for example, storage array A processes one part of the services and storage array B processes the other part. When storage array A or storage array B processes a service, it can send synchronization data to the other array in real time to keep the data in the two storage arrays synchronized. For example, when storage array A processes service 1, it sends the data of service 1 to storage array B, and when storage array B processes service 2, it sends the data of service 2 to storage array A.
In this case, because the arbitration server is not needed for arbitration, the states of storage array A and storage array B recorded in the arbitration server are the unknown state.
Step 2, when the machine room of the dual-active storage system is powered down, or the storage array a, the storage array B and the arbitration server are powered down simultaneously, or after the replication link fails, the arbitration link between the storage array a and the arbitration server and the arbitration link between the storage array B and the arbitration server also fail, etc., in which case the arbitration server cannot arbitrate the storage array. Memory array a and memory array B determine an arbitration timeout such that both memory array a and memory array B become losing and the dual active memory system then becomes the dual slave state.
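The timeout behaviour of step 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-process queue stands in for the arbitration link, and the function name and the timeout value are hypothetical rather than taken from the prior-art system.

```python
import queue


def wait_for_arbitration(result_queue, timeout_seconds):
    """Wait for an arbitration result; fall back to losing on timeout.

    result_queue is a stand-in (assumption) for the arbitration link.
    """
    try:
        return result_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        # No result arrived in time: the array treats itself as losing.
        return "losing"


# Both arbitration links are down, so neither array ever gets a result.
state_a = wait_for_arbitration(queue.Queue(), timeout_seconds=0.01)
state_b = wait_for_arbitration(queue.Queue(), timeout_seconds=0.01)
dual_slave = state_a == "losing" and state_b == "losing"
```

When both arrays time out in this way, neither can serve requests, which is the dual-slave state described above.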
Step 3, after the faults of the replication link and the arbitration links are recovered, in order to ensure service continuity, the storage array holding the latest service data must be selected to continue processing the service. Technicians therefore need to analyze the logs of the two storage arrays and determine the point in time at which each storage array entered the losing state. Since, in this example, storage array A entered the losing state later than storage array B, the technician selects storage array A to pull up the service. After receiving the instruction to pull up the service, storage array A continues to process the service.
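The manual selection in step 3 amounts to comparing the times at which each array entered the losing state. A minimal Python sketch, assuming the timestamps have already been extracted from the logs (the function name and the sample timestamps are hypothetical):

```python
from datetime import datetime


def pick_array_to_pull_up(losing_times):
    """Return the array that entered the losing state last.

    losing_times maps an array name to the time it became losing; the
    array that stayed in service longest holds the latest service data.
    """
    return max(losing_times, key=losing_times.get)


# Hypothetical timestamps read from the logs of the two arrays.
losing_times = {
    "A": datetime(2020, 11, 24, 10, 0, 5),
    "B": datetime(2020, 11, 24, 10, 0, 2),
}
chosen = pick_array_to_pull_up(losing_times)
```

The comparison itself is trivial; the cost in the prior art lies in obtaining the timestamps by manual log analysis.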
Because manually analyzing the logs of the storage arrays takes a long time, the prior-art service recovery method delays the recovery of the service.
In view of this, the embodiment of the application provides a service recovery method for improving the service recovery efficiency of a dual-active storage system. The following describes the technical scheme provided by the embodiment of the application with reference to the accompanying drawings.
Referring to fig. 4, a flowchart of a service restoration method according to an embodiment of the present application is described below:
Step 401, the storage system operates normally, and the storage arrays process the service.
In the embodiment of the present application, the storage system includes a first storage array, a second storage array and an arbitration server; that is, the storage system is a dual-active storage system, and may specifically be any one of the storage systems shown in fig. 1 to fig. 3B. For convenience of explanation, the following description takes the storage system shown in fig. 1 as an example, that is, the first storage array is storage array A in fig. 1, the second storage array is storage array B in fig. 1, and the arbitration server is the arbitration server in fig. 1.
When the storage system is operating normally, services are processed by storage array A and storage array B together. The specific process is the same as step 1 in the prior art and is not described here again.
It should be noted that, since the arbitration server is not required to arbitrate when the storage system is operating normally, the states of storage array A and storage array B in the arbitration server are the unknown state. Fig. 5 shows the state change process of a storage array according to an embodiment of the application. In this case, the state of the storage array is state 1.
Step 402, the replication link of the storage system fails, the storage array a sends a third arbitration request to the arbitration server and/or the storage array B sends a fourth arbitration request to the arbitration server, and the arbitration server receives the arbitration request.
It should be noted that the third arbitration request and the fourth arbitration request are the arbitration requests sent by the storage arrays before the storage system changes to the dual-slave state.
Specifically, storage array a sends a third arbitration request to the arbitration server and/or storage array B sends a fourth arbitration request to the arbitration server, including the following three cases:
In the first sending case, the replication link between storage array A and storage array B fails, for example the replication link is disconnected. If storage array A and storage array B are both in a normal state, that is, both are powered on and neither has a device failure, both storage arrays send an arbitration request to the arbitration server. If the arbitration links between the two storage arrays and the arbitration server have not failed, the arbitration server receives both arbitration requests, namely the third arbitration request and the fourth arbitration request. If the arbitration link between one of the storage arrays and the arbitration server has failed, the arbitration server cannot receive that storage array's arbitration request even though the storage array sent it. For example, if the arbitration link between storage array A and the arbitration server is disconnected, the arbitration server cannot receive the third arbitration request sent by storage array A.
In the second sending case, if storage array A is in a normal state and storage array B is powered down or has a device failure, only storage array A sends the third arbitration request to the arbitration server. If the arbitration link between storage array A and the arbitration server has not failed, the arbitration server receives only the third arbitration request sent by storage array A.
In the third sending case, if storage array B is in a normal state and storage array A is powered down or has a device failure, only storage array B sends the fourth arbitration request to the arbitration server. If the arbitration link between storage array B and the arbitration server has not failed, the arbitration server receives only the fourth arbitration request sent by storage array B.
It should be noted that in step 402 the storage system is not yet in the dual-slave state; the arbitration process is triggered by the failure of the replication link. In addition, in the embodiment of the present application, an arbitration request may be a data packet carrying specific content or an empty packet carrying a specific packet header, where the specific content and the specific packet header are agreed in advance between the arbitration server and the storage arrays. Of course, an arbitration request may also take other forms, which is not limited herein. Fig. 4 takes as an example the case in which storage array A sends the third arbitration request to the arbitration server and storage array B sends the fourth arbitration request to the arbitration server.
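The three sending cases of step 402 reduce to one rule: a request reaches the arbitration server only if its sender is healthy and its arbitration link is intact. The following Python sketch illustrates this; the function and parameter names are hypothetical.

```python
def requests_received(a_ok, b_ok, a_link_ok, b_link_ok):
    """Return the set of arbitration requests that reach the arbitration server.

    a_ok / b_ok: the array is powered on and has no device failure.
    a_link_ok / b_link_ok: that array's arbitration link has not failed.
    """
    received = set()
    if a_ok and a_link_ok:
        received.add("third")   # sent by storage array A
    if b_ok and b_link_ok:
        received.add("fourth")  # sent by storage array B
    return received


both = requests_received(True, True, True, True)        # first sending case
only_a = requests_received(True, False, True, True)     # second sending case
only_b = requests_received(False, True, True, True)     # third sending case
a_link_down = requests_received(True, True, False, True)
```

In the last call storage array A does send its request, but the broken arbitration link prevents the arbitration server from receiving it.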
Step 403, the arbitration server determines and stores the arbitration result.
In the embodiment of the application, the arbitration server determines arbitration results including but not limited to the following two modes. These two modes are described separately below.
In the first determining manner, the arbitration server determines an arbitration result according to the received third arbitration request and/or fourth arbitration request. The following description is directed to different arbitration processes.
In the first arbitration process, if the arbitration server receives both the third arbitration request and the fourth arbitration request, it determines the arbitration result according to the order in which the two requests are received: the storage array whose arbitration request is received first is determined to be winning, and the other storage array is determined to be losing. For example, each arbitration request carries identification information of the sending storage array, such as a number or an index number of the storage array. After receiving the earliest arbitration request, the arbitration server obtains the identification information from it; if that request carries the number of storage array A, the arbitration server determines that the arbitration result of storage array A is winning and the arbitration result of storage array B is losing.
In the second arbitration process, if the arbitration server receives an arbitration request from only one storage array, it determines from the identification information carried in the request whether the sender is storage array A or storage array B, and determines that the storage array that sent the request is winning and that the other storage array is losing, unknown or checking. For example, if the arbitration server determines that the received arbitration request carries the number of storage array A, the arbitration server determines that the arbitration result of storage array A is winning and the arbitration result of storage array B is checking.
Note that when the state of a storage array is checking, the storage array can be neither read nor written.
In the third arbitration process, the arbitration server may also determine the arbitration result of a storage array from the received arbitration request together with other information, such as the heartbeat of the storage array. For example, storage array A and storage array B send the third arbitration request and the fourth arbitration request to the arbitration server respectively, but because the arbitration link between the arbitration server and storage array B has failed, the arbitration server receives only one arbitration request. From the identification information carried in that request, the arbitration server determines that the sender is storage array A, and it determines that the arbitration result of storage array B, from which no arbitration request was received, is checking. If, for some other reason, storage array A is powered down at the moment it sends the third arbitration request, the arbitration server finds after receiving the third arbitration request that it cannot detect the heartbeat of storage array A, and therefore determines that the arbitration result of storage array A is checking. It should be noted that the arbitration server may determine whether each storage array is in a normal state from its heartbeat. For example, each storage array periodically sends a heartbeat to the arbitration server; if the heartbeat of a storage array is detected, that storage array is determined to be in a normal state, and if the heartbeat cannot be detected, that storage array is determined to be in a fault state. The heartbeat of each storage array is carried over the same communication link as the arbitration link, for example an iSCSI link.
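The three arbitration processes above can be combined into one illustrative Python function. This is a sketch, not the claimed implementation: the function name is hypothetical, and where the text allows losing, unknown or checking for an array whose request was not received, the sketch picks checking, as in the examples.

```python
def arbitrate(arrival_order, heartbeat_ok, arrays=("A", "B")):
    """Determine an arbitration result for each array.

    arrival_order: array identifiers in the order their requests arrived.
    heartbeat_ok: maps an array identifier to whether its heartbeat is detected.
    """
    # Third process: a sender whose heartbeat cannot be detected is not a candidate.
    live_senders = [a for a in arrival_order if heartbeat_ok.get(a, False)]
    winner = live_senders[0] if live_senders else None
    results = {}
    for array in arrays:
        if array == winner:
            results[array] = "winning"    # earliest request from a live array
        elif array in live_senders:
            results[array] = "losing"     # request arrived, but later
        else:
            results[array] = "checking"   # no request received, or no heartbeat
    return results


both_received = arbitrate(["A", "B"], {"A": True, "B": True})
only_a_received = arbitrate(["A"], {"A": True, "B": True})
a_powered_down = arbitrate(["A"], {"A": False, "B": False})
```

The identifiers "A" and "B" play the role of the identification information carried in each arbitration request.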
Of course, the arbitration server may also determine the arbitration result in other ways, which are not listed here. With continued reference to FIG. 5, in this case the state of a storage array changes to winning (state 2) if it wins arbitration, and to losing (state 3) if it loses arbitration.
When the arbitration server determines an arbitration result, the arbitration result is stored.
Step 404, the states of storage array A and storage array B are checking, and the storage system changes to the dual-slave state.
If the machine room of the storage arrays is powered down, or storage array A, storage array B and the arbitration server are powered down simultaneously, or, after the replication link fails, the arbitration link between storage array A and the arbitration server and the arbitration link between storage array B and the arbitration server also fail, then the arbitration server cannot send the arbitration results to the storage arrays after determining them. Storage array A and storage array B are therefore both in the checking state, and the storage system is in the dual-slave state. In this case, storage array A and storage array B can be neither read nor written and cannot provide any service.
With continued reference to FIG. 5, in this case the state of each storage array changes to checking (state 4) as the storage system changes to the dual-slave state.
It should be noted that the scenarios that cause the storage system to enter the dual-slave state include, but are not limited to, the above cases, which is not limited in the embodiment of the present application.
Steps 401 to 404 are optional steps, i.e. not necessarily performed, and are illustrated by dashed lines in fig. 4.
Step 405, the storage system switches from the dual slave state to the failure recovery state, and the storage array a sends a first arbitration request to the arbitration server, and the arbitration server receives the first arbitration request.
If the storage system becomes a dual-slave state due to power failure of a machine room of the storage system or simultaneous power failure of the storage array a, the storage array B and the arbitration server, the storage array a will send a first arbitration request to the arbitration server after the storage system is powered on. Or if the storage system becomes a dual slave state due to a failure of a replication link of the storage system, an arbitration link between the storage array a and the arbitration server, and an arbitration link between the storage array B and the arbitration server, after the storage array a detects that the failure of the replication link and the two arbitration links is recovered, the storage array a sends a first arbitration request to the arbitration server.
It should be noted that the first arbitration request may be understood as an arbitration request sent by the storage array a to the arbitration server after the storage system changes from the dual slave state to the failure recovery state. In the storage system shown in fig. 1, the storage array a is a main storage array of the storage system, so the storage array a sends a first arbitration request to the arbitration server, which can be understood as that the main storage array of the storage system sends the first arbitration request after the storage system changes from the dual slave state to the failure recovery state.
Step 406, the storage array B sends a second arbitration request to the arbitration server, and the arbitration server receives the second arbitration request.
After the storage system changes from the dual-slave state to the failure recovery state, storage array B may also send an arbitration request, namely the second arbitration request, to the arbitration server. This can be understood as the storage arrays other than the main storage array sending arbitration requests to the arbitration server; when the storage system includes three or more storage arrays, each storage array other than the main storage array sends an arbitration request to the arbitration server.
It should be noted that step 406 is an optional step, that is, it need not be performed. If step 406 is performed, it may be performed simultaneously with step 405, before step 405, or after step 405; the execution order of step 405 and step 406 is not limited in the embodiment of the present application. Fig. 4 takes as an example performing step 405 before step 406.
Step 407, the arbitration server determines a first arbitration result for storage array A and a second arbitration result for storage array B.
After the arbitration server receives the first arbitration request sent by storage array A and/or the second arbitration request sent by storage array B, the arbitration server determines whether arbitration results for storage array A and storage array B are already stored. Because the arbitration server persists its arbitration results, and the arbitration result determined in step 403 was never fed back to storage array A and storage array B, the arbitration server determines the first arbitration result and the second arbitration result directly from the stored arbitration result. For example, when the stored arbitration result is that storage array A is winning and storage array B is losing, the arbitration server determines that the first arbitration result is winning and the second arbitration result is losing; when the stored arbitration result is that storage array A is unknown and storage array B is checking, the first arbitration result is unknown and the second arbitration result is checking. Of course, if the stored arbitration result has other content, the first arbitration result and the second arbitration result change accordingly, which is not listed here.
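Persisting the result in step 403 and reusing it in step 407 can be sketched as follows. The JSON file, the class name and the function name are assumptions made for illustration; the text only requires that the arbitration result survive the failure.

```python
import json
import os
import tempfile


class ArbitrationStore:
    """Hypothetical persistent store for arbitration results (a JSON file)."""

    def __init__(self, path):
        self.path = path

    def save(self, results):
        # Step 403: persist the result so that it survives a power failure.
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(results, f)

    def load(self):
        # Returns None when no arbitration result has been stored.
        if not os.path.exists(self.path):
            return None
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)


def results_after_recovery(store):
    """Step 407: derive the first and second arbitration results from storage."""
    stored = store.load()
    if stored is None:
        return None  # nothing stored, so the server would have to arbitrate anew
    return stored["A"], stored["B"]


path = os.path.join(tempfile.mkdtemp(), "arbitration.json")
store = ArbitrationStore(path)
store.save({"A": "winning", "B": "losing"})
first_result, second_result = results_after_recovery(store)
```

The `None` branch corresponds to the case in which no arbitration took place before the dual-slave state was entered.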
Step 408, the arbitration server sends the first arbitration result to storage array A and sends the second arbitration result to storage array B; storage array A receives the first arbitration result, and storage array B receives the second arbitration result.
Step 409, the storage array a initiates negotiation to the storage array B according to the first arbitration result.
Specifically, after storage array A receives the first arbitration result, it sends a negotiation request to storage array B. The negotiation request may be request information used to query storage array B for the second arbitration result, or may be a data packet carrying specific content or an empty packet carrying a specific packet header, where the specific content and the specific packet header are agreed in advance between storage array A and storage array B. Of course, the negotiation request may also take other forms, which is not limited herein.
Step 410, storage array B feeds back the second arbitration result to storage array A, and storage array A receives the second arbitration result.
After storage array B receives the negotiation request sent by storage array A, it feeds back the second arbitration result obtained from the arbitration server to storage array A, and storage array A receives the second arbitration result.
Step 411, storage array A determines a negotiation result according to the first arbitration result and the second arbitration result.
After storage array A receives the second arbitration result fed back by storage array B, it determines the negotiation result with storage array B according to a preset negotiation principle. As an example, the negotiation principle of storage array A may refer to table 1. As shown in table 1: when the local array state is checking and the peer array state is winning, the negotiation result is that the local array is losing and the peer array is winning; when the local array state is checking and the peer array state is losing, the negotiation result is that the local array is winning and the peer array is losing; when the local array state is checking and the peer array state is unknown, the negotiation result is that the local array is losing and the peer array is winning; and when the local array state is checking and the peer array state is checking, the negotiation result is that the local array is winning and the peer array is losing. In table 1, the local array is the array that determines the negotiation result, for example storage array A, and the peer array is the array that negotiates with it, for example storage array B. Of course, if the array that determines the negotiation result is storage array B, the local array is storage array B and the peer array is storage array A; those skilled in the art should interpret "local array" and "peer array" accordingly, which is not described further here.
TABLE 1
Local array state | Peer array state | Negotiation result (local, peer)
Checking state    | Winning state    | (losing, winning)
Checking state    | Losing state     | (winning, losing)
Checking state    | Unknown state    | (losing, winning)
Checking state    | Checking state   | (winning, losing)
Specifically, in each of the following cases, storage array A determines, according to the negotiation principle shown in table 1, that the negotiation result is that storage array A is winning and storage array B is losing: the first arbitration result is winning and the second arbitration result is checking; the first arbitration result is checking and the second arbitration result is losing; the first arbitration result is unknown and the second arbitration result is checking; or the first arbitration result is checking and the second arbitration result is checking.
It should be noted that table 1 describes the negotiation principle only for the case in which the state of the local array is checking; this is merely an example and should not be construed as limiting the negotiation principle. When the arbitration result of the local array is another state, the negotiation principle can be adapted with reference to table 1, which is not listed here.
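The negotiation principle of table 1 is a lookup from a pair of arbitration results to a pair of negotiated states. The following Python sketch transcribes the four rows of table 1; the names `NEGOTIATION_RULES` and `negotiate` are hypothetical, and state pairs that table 1 does not list return `None` rather than a guessed result.

```python
# Keys: (local array state, peer array state).
# Values: negotiation result as (local array, peer array), as in table 1.
NEGOTIATION_RULES = {
    ("checking", "winning"): ("losing", "winning"),
    ("checking", "losing"): ("winning", "losing"),
    ("checking", "unknown"): ("losing", "winning"),
    ("checking", "checking"): ("winning", "losing"),
}


def negotiate(local_state, peer_state):
    """Return the (local, peer) negotiation result, or None if table 1 has no row."""
    return NEGOTIATION_RULES.get((local_state, peer_state))


result = negotiate("checking", "losing")
not_listed = negotiate("winning", "winning")
```

The lookup is evaluated by the array that determines the negotiation result, so "local" refers to that array and "peer" to the array it negotiates with.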
It should be further noted that the negotiation result includes only two states, that is, winning and losing.
With continued reference to FIG. 5, in this case the storage system has changed from the dual-slave state to the failure recovery state, and once the negotiation result is determined, the state of a storage array changes to winning (state 2) if it wins the negotiation and to losing (state 3) if it loses the negotiation.
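The state changes described with reference to FIG. 5 can be summarized as a small state machine. The sketch below encodes only the transitions mentioned in the text (state 1 to states 2, 3 or 4, and state 4 to states 2 or 3); FIG. 5 itself may show more, and the names `ArrayState`, `ALLOWED` and `transition` are hypothetical.

```python
from enum import Enum


class ArrayState(Enum):
    UNKNOWN = 1   # state 1: normal operation, no arbitration needed
    WINNING = 2   # state 2: won arbitration or negotiation
    LOSING = 3    # state 3: lost arbitration or negotiation
    CHECKING = 4  # state 4: dual-slave, neither readable nor writable


# Transitions taken from the description only (an assumption about FIG. 5).
ALLOWED = {
    ArrayState.UNKNOWN: {ArrayState.WINNING, ArrayState.LOSING, ArrayState.CHECKING},
    ArrayState.CHECKING: {ArrayState.WINNING, ArrayState.LOSING},
}


def transition(current, target):
    """Return the new state, or raise if the text describes no such transition."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"{current.name} -> {target.name} is not described")
    return target


state = transition(ArrayState.UNKNOWN, ArrayState.CHECKING)  # dual-slave
state = transition(state, ArrayState.WINNING)                # negotiation won
```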
Step 412, storage array A determines that the negotiation result is that storage array A is winning and storage array B is losing, and storage array A provides service for the client.
When storage array A determines that the negotiation result is that the local array is winning, storage array A actively restores its own state to winning so as to provide service. Of course, when storage array A determines that the negotiation result is that the peer array is winning, storage array A may send the negotiation result to storage array B, so that storage array B, on determining that it is the winning array, actively pulls up the service.
It should be noted that, if the storage array a determines that the first arbitration result and the second arbitration result are both winning states, or the storage array a determines that the first arbitration result and the second arbitration result are both losing states, in this case, the storage array a determines that negotiation fails. If the storage array A determines that the negotiation fails, the storage system cannot automatically restore the service.
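The outcome handling described for step 412 can be sketched as follows. The function name and the returned action strings are hypothetical; the sketch only mirrors the three outcomes named in the text.

```python
def act_on_negotiation(first_result, second_result, negotiation_result):
    """Decide what storage array A does after negotiating with storage array B.

    negotiation_result is (state of A, state of B), or None if undetermined.
    """
    # Both winning or both losing: negotiation fails, no automatic recovery.
    if first_result == second_result and first_result in ("winning", "losing"):
        return "negotiation failed"
    if negotiation_result == ("winning", "losing"):
        return "A provides service"
    if negotiation_result == ("losing", "winning"):
        return "A notifies B, B provides service"
    return "negotiation failed"


action = act_on_negotiation("checking", "losing", ("winning", "losing"))
conflict = act_on_negotiation("winning", "winning", None)
```

A failed negotiation leaves the decision to an operator, as in the prior-art procedure.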
Step 413, the storage array a sends the synchronization data to the storage array B.
After storage array A starts providing service, it sends the latest service data to storage array B so that the service data in storage array B is synchronized with storage array A. After the difference data of storage array A has been completely synchronized to storage array B, the dual-active characteristic of the storage system is restored, and storage array A and storage array B can again provide services simultaneously.
Step 413 is an optional step, i.e., not necessarily performed.
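The synchronization of difference data in step 413 can be illustrated with a minimal Python sketch. Representing each array as a dictionary of blocks and tracking changed block identifiers are assumptions made for illustration; the text does not specify how difference data is tracked.

```python
def synchronize(source_blocks, target_blocks, changed_block_ids):
    """Copy the blocks that changed on the winning array to the other array."""
    for block_id in sorted(changed_block_ids):
        target_blocks[block_id] = source_blocks[block_id]
    # Dual-active operation can resume once both copies are identical.
    return source_blocks == target_blocks


# Hypothetical block contents: blocks 1 and 2 changed while B was out of service.
array_a = {0: b"old", 1: b"new-1", 2: b"new-2"}
array_b = {0: b"old", 1: b"stale"}
in_sync = synchronize(array_a, array_b, changed_block_ids={1, 2})
```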
In the above technical solution, after the storage system changes from the dual-slave state to the failure recovery state, the winning storage array is determined automatically through the negotiation principle and actively pulls up the service. This reduces the time spent waiting for the service to be pulled up manually and improves the service recovery efficiency of the storage system.
In the embodiment shown in fig. 4, when the storage system is changed from the dual slave state to the failure recovery state, the arbitration server determines the arbitration result of each storage array after the failure recovery state according to the previously stored arbitration result. Next, an embodiment will be described in which the arbitration server redetermines the arbitration result after the storage system changes from the dual slave state to the failure recovery state.
Referring to fig. 6, a flowchart of another embodiment of a service recovery method according to an embodiment of the present application is described below:
Step 601, the storage system operates normally, and the storage array processes the service.
In the embodiment of the present application, the storage system includes a first storage array, a second storage array and an arbitration server; that is, the storage system is a dual-active storage system, and may specifically be any one of the storage systems shown in fig. 1 to fig. 3B. For convenience of explanation, the following description takes the storage system shown in fig. 1 as an example, that is, the first storage array is storage array A in fig. 1, the second storage array is storage array B in fig. 1, and the arbitration server is the arbitration server in fig. 1.
Step 601 is the same as step 401 and will not be described again here.
Step 602, the storage system changes to the dual-slave state, and the states of storage array A and storage array B are checking.
In this embodiment of the application, the storage system changes directly from normal operation to the dual-slave state. For example, the machine room of the storage arrays is powered down, or storage array A, storage array B and the arbitration server are powered down simultaneously, or the arbitration link between storage array A and the arbitration server and the arbitration link between storage array B and the arbitration server fail at the same moment as the replication link. In this case, storage array A and storage array B have no opportunity to request arbitration from the arbitration server before the storage system enters the dual-slave state.
It should be noted that, the steps 601 to 602 are optional steps, i.e. not necessarily performed.
Step 603, the storage system switches from the dual slave state to the failure recovery state, and the storage array a sends a first arbitration request to the arbitration server, and the arbitration server receives the first arbitration request.
Step 603 is the same as step 405 and the first arbitration request is the same as the first arbitration request in step 405.
Step 604, the storage array B sends a second arbitration request to the arbitration server, and the arbitration server receives the second arbitration request.
Step 604 is the same as step 406 and the second arbitration request is the same as the second arbitration request in step 406.
It should be noted that step 604 is an optional step, that is, it need not be performed. For example, after the storage system changes from the dual-slave state to the failure recovery state, storage array B may be unable to send the second arbitration request to the arbitration server because the arbitration link between storage array B and the arbitration server has failed, or because storage array B has a device failure.
Step 605, the arbitration server determines a first arbitration result for storage array A and a second arbitration result for storage array B.
After the arbitration server receives the first arbitration request sent by storage array A and/or the second arbitration request sent by storage array B, the arbitration server determines whether arbitration results for storage array A and storage array B are already stored. Because the storage system changed directly from normal operation to the dual-slave state, neither storage array A nor storage array B sent an arbitration request to the arbitration server before the storage system entered the dual-slave state. The arbitration server therefore holds no stored arbitration result for storage array A and storage array B.
If the method of the embodiment of the present application performs step 603 and step 604, that is, the arbitration server receives two arbitration requests from the storage array a and the storage array B, the arbitration server needs to re-arbitrate the state of the storage array according to the first arbitration request received from the storage array a and the second arbitration request received from the storage array B. Specifically, the arbitration server determines the first arbitration result and the second arbitration result according to the sequence of receiving the first arbitration request and the second arbitration request. If the arbitration server receives the second arbitration request after receiving the first arbitration request, it determines that the first arbitration result is winning and determines that the second arbitration result is losing. For example, after receiving the first arbitration request, the arbitration server determines that the arbitration request carries the identification information of the storage array a, and then the arbitration server receives the second arbitration request, where the arbitration request carries the identification information of the storage array B, and then the first arbitration request is determined to be received first and then the second arbitration request is received, so that it is determined that the arbitration result of the storage array a is winning, and the arbitration result of the storage array B is losing.
If the method in the embodiment of the present application only executes step 603, that is, the arbitration server only receives the first arbitration request from the storage array a, and does not receive the second arbitration request from the storage array B within the preset time period from the receipt of the first arbitration request, the arbitration server determines that the first arbitration result is winning and the second arbitration result is checking.
It should be noted that the preset duration may be preconfigured by the arbitration server, or may be preset by the arbitration server and the storage in the array, specifically may be 2ms or 5ms, which is not limited herein.
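The arbitration rule described above (the first request received wins; the peer loses if its request arrives within the preset duration, and is otherwise marked checking) can be illustrated with a minimal sketch. The function name `arbitrate`, the use of integer microsecond timestamps, and the array identifiers are assumptions made for illustration only and do not come from the embodiment.

```python
from typing import Dict, List, Tuple

WINNING, LOSING, CHECKING = "winning", "losing", "checking"


def arbitrate(
    requests: List[Tuple[int, str]],
    arrays: Tuple[str, str],
    preset_duration_us: int,
) -> Dict[str, str]:
    """Return an arbitration result for each of the two storage arrays.

    requests: (arrival_time_in_microseconds, array_id) pairs (hypothetical format).
    arrays: identifiers of the two storage arrays.
    preset_duration_us: the preset duration, e.g. 2000 for 2 ms.
    """
    if not requests:
        raise ValueError("at least one arbitration request is required")

    # The request that arrived first determines the winning array.
    ordered = sorted(requests)
    first_time, first_id = ordered[0]
    other_id = arrays[1] if first_id == arrays[0] else arrays[0]

    # The peer is 'losing' only if its request arrived within the preset
    # duration counted from the first request; otherwise it is 'checking'.
    peer_arrived = any(
        array_id == other_id and arrival - first_time <= preset_duration_us
        for arrival, array_id in ordered[1:]
    )
    return {first_id: WINNING, other_id: LOSING if peer_arrived else CHECKING}
```

For example, `arbitrate([(0, "A"), (1000, "B")], ("A", "B"), 2000)` corresponds to the case in which both step 603 and step 604 are performed, and `arbitrate([(0, "A")], ("A", "B"), 2000)` corresponds to the case in which only step 603 is performed.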
Step 606, the arbitration server sends the first arbitration result to the storage array A and sends the second arbitration result to the storage array B; the storage array A receives the first arbitration result and the storage array B receives the second arbitration result.
Step 607, the storage array A initiates a negotiation with the storage array B according to the first arbitration result.
Step 608, the storage array B feeds back the second arbitration result to the storage array A, and the storage array A receives the second arbitration result.
Step 609, the storage array A determines a negotiation result according to the first arbitration result and the second arbitration result.
Step 610, the storage array A determines that the negotiation result is that the storage array A is winning and the storage array B is losing, and the storage array A provides services for the client.
Step 611, the storage array A sends synchronization data to the storage array B.
Steps 606 to 611 are the same as steps 408 to 413, and are not described here again.
In this technical solution, after the storage system changes from the dual-slave state to the fault recovery state, the winning storage array can be determined automatically through the negotiation principle, and the winning storage array actively brings the service back up. This reduces the time spent waiting for the service to be brought up manually and improves the service recovery efficiency of the storage system.
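Steps 606 to 611 can be sketched from the point of view of the first storage array as follows. This is only an illustrative sketch: the `StorageArray` class, its fields, and the `recover` function are hypothetical names, and the negotiation is reduced to the single outcome described in step 610.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageArray:
    """Hypothetical in-memory stand-in for a storage array."""
    name: str
    state: str = "unknown"
    serving: bool = False
    data: List[str] = field(default_factory=list)


def recover(first: StorageArray, second: StorageArray,
            first_result: str, second_result: str) -> bool:
    """Return True if the first array ends up providing services."""
    # Step 606: each array receives its own arbitration result.
    first.state, second.state = first_result, second_result

    # Steps 607-608: the first array initiates the negotiation and the
    # second array feeds back its arbitration result.
    peer_state = second.state

    # Steps 609-610: the first array serves only when it is winning
    # and its peer is losing.
    if first.state == "winning" and peer_state == "losing":
        first.serving = True
        # Step 611: the first array sends synchronization data to the peer.
        second.data = list(first.data)
        return True
    return False
```

Copying the whole data list in step 611 is a simplification; an actual array would presumably transfer only the data written while the peer was not processing services.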
In the embodiments of fig. 4 and fig. 6, the storage system brings up services based on the arbitration result of the arbitration server. In other scenarios, such as multiple-fault scenarios, the storage system may also resume services without relying on an arbitration result from the arbitration server, with a storage array actively initiating the negotiation. The service recovery process in a multiple-fault scenario is described below.
As an example, when the dual-active storage system shown in fig. 1 is in the normal operation state, the arbitration link between the arbitration server and the storage array A may be disconnected for some reason. Because the disconnection of the arbitration link does not affect the operation of the storage system, the storage array A and the storage array B can still maintain the normal operation state, that is, the storage array A and the storage array B process services at the same time. While both storage arrays process services, service data is written to both the storage array A and the storage array B. If the storage array B fails, the service data cannot be successfully written to the storage array B. In this case, the storage array B may set its own state to losing so as to stop processing services. When the storage array A determines that the service data cannot be successfully written to the storage array B, it confirms that the storage array B has failed; the storage array A therefore determines that its own state needs to be set to winning, so that services are processed by the storage array A.
If the replication link between the storage array A and the storage array B is disconnected after the storage array A determines that its state needs to be set to winning but before its state actually changes to winning, the arbitration process of the arbitration server is triggered. Because the arbitration link between the arbitration server and the storage array A is disconnected, the arbitration server cannot receive the third arbitration request sent by the storage array A, and the storage array A therefore does not receive an arbitration result from the arbitration server. When the storage array A does not receive the arbitration result within the preset time period, it determines that the arbitration has timed out and sets its own state to checking.
In this way, after the replication link between the storage array A and the storage array B is restored, the storage array A may actively initiate a negotiation with the storage array B. The storage array B then sends its own stored state, that is, the losing state, to the storage array A. According to the negotiation principle shown in table 1, the storage array A may determine that the negotiation result of the storage array A is the winning state and that the negotiation result of the storage array B is the losing state, thereby determining that services are processed by the storage array A.
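The timeout behaviour in this example, in which a storage array that receives no arbitration result within the preset time period sets its state to checking, can be sketched as follows. The function name `await_arbitration_result` and the use of a `queue.Queue` as the channel on which results arrive are assumptions for illustration; the embodiment does not specify how results are delivered.

```python
import queue


def await_arbitration_result(inbox: "queue.Queue[str]", preset_duration_s: float) -> str:
    """Wait for an arbitration result; fall back to 'checking' on timeout.

    inbox: hypothetical channel on which the arbitration server's result
           would arrive if the arbitration link were intact.
    preset_duration_s: the preset time period, in seconds.
    """
    try:
        # Block for at most the preset duration.
        return inbox.get(timeout=preset_duration_s)
    except queue.Empty:
        # No result arrived in time: the arbitration has timed out.
        return "checking"
```

With a disconnected arbitration link the inbox stays empty, so the call returns `"checking"` once the preset duration has elapsed.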
As another example, when the dual-active storage system shown in fig. 1 is in the normal operation state, the arbitration link between the arbitration server and the storage array A may be disconnected for some reason. Because the disconnection of the arbitration link does not affect the operation of the storage system, the storage array A and the storage array B can still maintain the normal operation state, that is, the storage array A and the storage array B process services at the same time. Because the storage array B is in the normal state, the state of the storage array B is unknown; if the storage array B is restarted after a power failure, its state is also unknown after the restart is completed.
If the storage array B is powered down and restarted, the replication link between the storage array A and the storage array B is disconnected, and the arbitration process of the arbitration server is triggered. Because the arbitration link between the arbitration server and the storage array A is disconnected, the arbitration server cannot receive the third arbitration request sent by the storage array A and therefore cannot perform arbitration, so the storage array A cannot receive an arbitration result from the arbitration server. When the storage array A does not receive the arbitration result within the preset time period, it determines that the arbitration has timed out and sets its own state to checking. Because the storage array B is restarted after the power failure, the state of the storage array B after the restart is unknown.
In this way, after the replication link between the storage array A and the storage array B is restored, the storage array A may actively initiate a negotiation with the storage array B. The storage array B then sends its own stored state, that is, the unknown state, to the storage array A. According to the negotiation principle shown in table 1, the storage array A may determine that the negotiation result of the storage array A is the losing state and that the negotiation result of the storage array B is the winning state, thereby determining that services are processed by the storage array B.
With this technical solution, the storage arrays can actively negotiate with each other to determine which storage array processes services after fault recovery. This reduces the interaction between the storage arrays and the arbitration server and improves processing efficiency.
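The negotiation outcomes stated in the text can be collected into a small lookup, shown here as a sketch. Table 1 itself is not reproduced in this passage, so the mapping below contains only the three state pairs whose outcomes are stated in the description (step 610 and the two multiple-fault examples); it is not the complete negotiation principle, and the names `NEGOTIATION_CASES` and `negotiate` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (initiator state, peer state) -> (initiator result, peer result).
# Only the cases stated in the description are listed; the remaining
# rows of table 1 are deliberately left out.
NEGOTIATION_CASES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("winning", "losing"): ("winning", "losing"),    # step 610
    ("checking", "losing"): ("winning", "losing"),   # first multiple-fault example
    ("checking", "unknown"): ("losing", "winning"),  # second multiple-fault example
}


def negotiate(initiator_state: str, peer_state: str) -> Optional[Tuple[str, str]]:
    """Return the negotiation result, or None if the pair is not covered here."""
    return NEGOTIATION_CASES.get((initiator_state, peer_state))


def serving_array(initiator: str, peer: str,
                  initiator_state: str, peer_state: str) -> Optional[str]:
    """Return the name of the array that processes services, if determined."""
    result = negotiate(initiator_state, peer_state)
    if result is None:
        return None
    return initiator if result[0] == "winning" else peer
```

Returning `None` for uncovered pairs keeps the sketch from guessing at rows of table 1 that are not described in this passage.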
In the foregoing embodiments of the present application, the method provided by the embodiments is described from the perspective of the interaction among the first storage array, the second storage array, and the arbitration server. To implement the functions in the method provided by the embodiments of the present application, the first storage array may include a hardware structure and/or a software module, and implement the functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether a given function is performed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.
Fig. 7 shows a schematic structural diagram of a service restoration apparatus 700. The service restoration device 700 may be applied to the first storage array or a device in the first storage array, and may be capable of implementing the function of the first storage array in the method provided by the embodiment of the present application, or the service restoration device 700 may be a device capable of supporting the first storage array to implement the function of the first storage array in the method provided by the embodiment of the present application. The service restoration apparatus 700 may be a hardware structure, a software module, or a hardware structure plus a software module. The service restoration apparatus 700 may be implemented by a system-on-chip. In the embodiment of the application, the chip system can be formed by a chip, and can also comprise the chip and other discrete devices.
The service restoration apparatus 700 may include a transceiving unit 701 and a processing unit 702.
The transceiver unit 701 may be used to perform steps 401, 402, 405, 408-410, and 413 in the embodiment shown in fig. 4, and/or to perform steps 601, 603, 606-608, and 611 in the embodiment shown in fig. 6, and/or to support other processes of the techniques described herein. The transceiver unit 701 is used for communication between the service restoration apparatus 700 and other modules, and may be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other apparatus capable of implementing communication.
The processing unit 702 may be used to perform steps 401, 404, 411, and 412 in the embodiment shown in fig. 4, and/or to perform steps 601, 602, 609, and 610 in the embodiment shown in fig. 6, and/or to support other processes of the techniques described herein.
All relevant contents of each step related to the above method embodiment may be cited to the functional description of the corresponding functional module, which is not described herein.
The division of the modules in the embodiments of the present application is schematically only one logic function division, and there may be another division manner in actual implementation, and in addition, each functional module in each embodiment of the present application may be integrated in one processor, or may exist separately and physically, or two or more modules may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules.
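The logical division into a transceiver unit and a processing unit can be pictured with a short sketch. The class and method names below are hypothetical, and the split shown (communication in one class, the decision in the other) is only one of the possible divisions that the description allows.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TransceiverUnit:
    """Handles communication; here it only records outgoing messages."""
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def send(self, destination: str, message: str) -> None:
        self.sent.append((destination, message))


class ProcessingUnit:
    """Handles decisions, such as evaluating the negotiation result."""

    def should_serve(self, own_state: str, peer_state: str) -> bool:
        return own_state == "winning" and peer_state == "losing"


@dataclass
class ServiceRestorationApparatus:
    """Composes the two units, mirroring the logical function division."""
    transceiver: TransceiverUnit = field(default_factory=TransceiverUnit)
    processing: ProcessingUnit = field(default_factory=ProcessingUnit)

    def handle_negotiation(self, own_state: str, peer_state: str) -> bool:
        serve = self.processing.should_serve(own_state, peer_state)
        if serve:
            # After services are provided, synchronization data is sent to the peer.
            self.transceiver.send("second storage array", "synchronization data")
        return serve
```

Because the two units are separate objects, they could equally be merged into one class or backed by hardware, which is the flexibility the paragraph above describes.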
Fig. 8 shows a service restoration device 800 according to an embodiment of the present application. The service restoration device 800 may be the first storage array in the embodiment shown in fig. 4 or fig. 6, or a device in the first storage array, and can implement the function of the first storage array in the embodiment shown in fig. 4 or fig. 6. Alternatively, the service restoration device 800 may be a device capable of supporting the first storage array in implementing the function of the first storage array in the method provided in the embodiment shown in fig. 4 or fig. 6. The service restoration device 800 may be a chip system. In the embodiment of the application, the chip system may consist of a chip, or may include a chip and other discrete devices.
The service restoration device 800 includes at least one processor 820 for implementing, or for supporting the service restoration device 800 in implementing, the function of the first storage array in the embodiment shown in fig. 4 or fig. 6. For example, the processor 820 may initiate a negotiation with the second storage array according to the first arbitration result and determine the negotiation result; if the negotiation result is that the first storage array is in the winning state and the second storage array is in the losing state, the first storage array provides services to the client.
The service restoration device 800 may also include at least one memory 830 for storing program instructions and/or data. Memory 830 is coupled to processor 820. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules, which may be in electrical, mechanical, or other forms for information interaction between the devices, units, or modules. Processor 820 may operate in conjunction with memory 830. Processor 820 may execute program instructions stored in memory 830. At least one of the at least one memory may be included in the processor. The methods shown in fig. 4 or 6 may be implemented when processor 820 executes program instructions in memory 830.
The service restoration apparatus 800 may further include a communication interface 810 for communicating with other devices over a transmission medium, so that the service restoration apparatus 800 can communicate with other devices. The other device may be, for example, a second storage array or an arbitration server. Processor 820 may transmit and receive data using communication interface 810.
The specific connection medium between the communication interface 810, the processor 820, and the memory 830 is not limited in the embodiment of the present application. In the embodiment of the present application, the memory 830, the processor 820 and the communication interface 810 are connected through the bus 840 in fig. 8, where the bus is indicated by a thick line in fig. 8, and the connection manner between other components is only schematically illustrated, but not limited thereto. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 8, but not only one bus or one type of bus.
In an embodiment of the present application, processor 820 may be a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, which may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
In an embodiment of the present application, the memory 830 may be a nonvolatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, such as a random-access memory (RAM). The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory in embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
Fig. 9 shows a schematic diagram of an arbitration server 900. The arbitration server 900 may be a device capable of supporting the arbitration server to implement the function of the arbitration server in the method provided by the embodiment of the present application. The arbitration server 900 may be a hardware structure, a software module, or a hardware structure plus a software module. The arbitration server 900 may be implemented by a system-on-chip. In the embodiment of the application, the chip system can be formed by a chip, and can also comprise the chip and other discrete devices.
The arbitration server 900 may include a transceiving unit 901 and a processing unit 902.
The transceiver unit 901 may be used to perform steps 402, 405, 406, and 408 in the embodiment shown in fig. 4, and/or to perform steps 603, 604, and 606 in the embodiment shown in fig. 6, and/or to support other processes of the techniques described herein. The transceiver unit 901 is used for communication between the arbitration server 900 and other modules, and may be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other apparatus capable of implementing communication.
The processing unit 902 may be configured to perform step 403 and step 407 in the embodiment shown in fig. 4, and/or to perform step 605 in the embodiment shown in fig. 6, and/or to support other processes of the techniques described herein.
All relevant contents of each step related to the above method embodiment may be cited to the functional description of the corresponding functional module, which is not described herein.
The division of the modules in the embodiments of the present application is schematically only one logic function division, and there may be another division manner in actual implementation, and in addition, each functional module in each embodiment of the present application may be integrated in one processor, or may exist separately and physically, or two or more modules may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules.
Fig. 10 shows an arbitration server 1000 according to an embodiment of the present application. The arbitration server 1000 may be the arbitration server in the embodiment shown in fig. 4 or fig. 6 and can implement the functions of the arbitration server in that embodiment. Alternatively, the arbitration server 1000 may be a device capable of supporting the arbitration server in implementing the functions of the arbitration server in the method provided in the embodiment shown in fig. 4 or fig. 6. The arbitration server 1000 may be a chip system. In the embodiment of the application, the chip system may consist of a chip, or may include a chip and other discrete devices.
The arbitration server 1000 includes at least one processor 1020 for implementing, or for supporting the arbitration server 1000 in implementing, the functions of the arbitration server in the embodiment shown in fig. 4 or fig. 6. For example, the processor 1020 may determine a first arbitration result of the first storage array and a second arbitration result of the second storage array according to the first arbitration request. For details, refer to the description in the method example, which is not repeated here.
The arbitration server 1000 may also include at least one memory 1030 for storing program instructions and/or data. Memory 1030 is coupled to processor 1020. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules, which may be in electrical, mechanical, or other forms for information interaction between the devices, units, or modules. Processor 1020 may operate in conjunction with memory 1030. Processor 1020 may execute program instructions stored in memory 1030. At least one of the at least one memory may be included in the processor. The methods shown in fig. 4 or 6 may be implemented when the processor 1020 executes program instructions in the memory 1030.
The arbitration server 1000 may also include a communication interface 1010 for communicating with other devices over a transmission medium, so that the arbitration server 1000 can communicate with other devices. The other device may be, for example, a second storage array or a first storage array. The processor 1020 may transmit and receive data using the communication interface 1010.
The specific connection medium between the communication interface 1010, the processor 1020, and the memory 1030 is not limited in this embodiment. In the embodiment of the present application, the memory 1030, the processor 1020 and the communication interface 1010 are connected by a bus 1040 in fig. 10, where the bus is indicated by a thick line in fig. 10, and the connection manner between other components is merely illustrative and not limited thereto. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 10, but not only one bus or one type of bus.
In an embodiment of the present application, the processor 1020 may be a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
In an embodiment of the present application, the memory 1030 may be a nonvolatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, such as a random-access memory (RAM). The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory in embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
Embodiments of the present application also provide a computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform a method performed by a first storage array in an embodiment as shown in fig. 4 or 6.
Embodiments of the present application also provide a computer-readable storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method performed by the arbitration server in the embodiments shown in fig. 4 or 6.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method performed by the first storage array in the embodiments shown in fig. 4 or fig. 6.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method performed by the arbitration server in the embodiments shown in fig. 4 or fig. 6.
The embodiment of the application provides a chip system that includes a processor and may further include a memory, and that is configured to implement the function of the first storage array in the foregoing method. The chip system may consist of a chip, or may include a chip and other discrete devices.
The embodiment of the application provides a chip system that includes a processor and may further include a memory, and that is configured to implement the function of the arbitration server in the foregoing method. The chip system may consist of a chip, or may include a chip and other discrete devices.
The embodiment of the application also provides a storage system which comprises the first storage array in the method and the arbitration server in the method.
The embodiment of the application also provides a storage system which comprises the first storage array, the second storage array and the arbitration server in the method.
The method provided by the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the method may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user device, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (e.g., infrared, radio, or microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., an SSD).
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A service recovery method, characterized in that it is applied to a storage system, wherein the storage system comprises a first storage array, a second storage array, and an arbitration server, and the method comprises:
after the storage system switches from a dual-slave state to a fault recovery state, the first storage array sends a first arbitration request to the arbitration server, and the second storage array sends a second arbitration request to the arbitration server, wherein when the storage system is in the dual-slave state, neither the first storage array nor the second storage array can process services;
the first storage array receives a first arbitration result from the arbitration server;
the second storage array receives a second arbitration result from the arbitration server; and
if the first arbitration result indicates that the first storage array is in a winning state and the second arbitration result indicates that the second storage array is in a losing state, the first storage array provides services to a client.

2. The method according to claim 1, characterized in that after the first storage array provides services to the client, the method further comprises:
the first storage array sends synchronization data to the second storage array.

3. A service recovery method, characterized in that it is applied to a storage system, wherein the storage system comprises a first storage array and a second storage array, and the method comprises:
after the storage system switches from a dual-slave state to a fault recovery state, the first storage array initiates a negotiation with the second storage array and determines a negotiation result, wherein when the storage system is in the dual-slave state, neither the first storage array nor the second storage array can process services; and
if it is determined that the negotiation result is that the first storage array is in a winning state and the second storage array is in a losing state, the first storage array provides services to a client.

4. The method according to claim 3, characterized in that after the first storage array provides services to the client, the method further comprises:
the first storage array sends synchronization data to the second storage array.

5. A storage system, characterized in that the storage system comprises a first storage array, a second storage array, and an arbitration server, wherein:
the first storage array is configured to send a first arbitration request to the arbitration server after the storage system switches from a dual-slave state to a fault recovery state;
the second storage array is configured to send a second arbitration request to the arbitration server, wherein when the storage system is in the dual-slave state, neither the first storage array nor the second storage array can process services;
the first storage array is further configured to receive a first arbitration result from the arbitration server;
the second storage array is further configured to receive a second arbitration result from the arbitration server; and
the first storage array is further configured to provide services to a client when the first arbitration result indicates that the first storage array is in a winning state and the second arbitration result indicates that the second storage array is in a losing state.

6. The storage system according to claim 5, characterized in that the first storage array is further configured to send synchronization data to the second storage array after the first storage array provides services to the client.

7. A storage system, characterized in that the storage system comprises a first storage array and a second storage array, wherein:
the first storage array is configured to initiate a negotiation with the second storage array and determine a negotiation result after the storage system switches from a dual-slave state to a fault recovery state, wherein when the storage system is in the dual-slave state, neither the first storage array nor the second storage array can process services; and
the first storage array is configured to provide services to a client when it is determined that the negotiation result is that the first storage array is in a winning state and the second storage array is in a losing state.

8. The storage system according to claim 7, characterized in that the first storage array is further configured to send synchronization data to the second storage array after the first storage array provides services to the client.

9. A computer-readable storage medium, characterized in that instructions are stored on the medium, and when the instructions are run on a computer, the computer is caused to perform the method according to any one of claims 1-2 or 3-4.
CN202011331689.8A | 2018-08-31 | 2018-08-31 | A business recovery method, device, arbitration server and storage system | Active | CN112612653B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011331689.8A, CN112612653B (en) | 2018-08-31 | 2018-08-31 | A business recovery method, device, arbitration server and storage system

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
CN201811015654.6A, CN109445984B (en) | 2018-08-31 | 2018-08-31 | A service recovery method, device, arbitration server and storage system
CN202011331689.8A, CN112612653B (en) | 2018-08-31 | 2018-08-31 | A business recovery method, device, arbitration server and storage system

Related Parent Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811015654.6A (Division), CN109445984B (en) | A service recovery method, device, arbitration server and storage system | 2018-08-31 | 2018-08-31

Publications (2)

Publication Number | Publication Date
CN112612653A (en) | 2021-04-06
CN112612653B (en) | 2025-07-01

Family

ID=65532596

Family Applications (2)

Application Number | Title | Priority Date | Filing Date
CN201811015654.6A (Active), CN109445984B (en) | A service recovery method, device, arbitration server and storage system | 2018-08-31 | 2018-08-31
CN202011331689.8A (Active), CN112612653B (en) | A business recovery method, device, arbitration server and storage system | 2018-08-31 | 2018-08-31

Family Applications Before (1)

Application Number | Title | Priority Date | Filing Date
CN201811015654.6A (Active), CN109445984B (en) | A service recovery method, device, arbitration server and storage system | 2018-08-31 | 2018-08-31

Country Status (1)

Country | Link
CN (2) | CN109445984B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110502326A (en)* | 2019-08-27 | 2019-11-26 | 珠海格力电器股份有限公司 | Cloud service scheduling and recovering method based on fault detection and terminal equipment
CN114416501A (en)* | 2021-12-23 | 2022-04-29 | 中国农业银行股份有限公司云南省分行 | Storage double-activity and test system and method
CN117614805B (en)* | 2023-11-21 | 2024-06-14 | 杭州沃趣科技股份有限公司 | Data processing system for monitoring state of data center
CN119728701B (en)* | 2025-02-28 | 2025-09-02 | 苏州元脑智能科技有限公司 | Arbitration control method, product, electronic device and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106909307A (en)* | 2015-12-22 | 2017-06-30 | 华为技术有限公司 | A kind of method and device for managing dual-active storage array

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20150339200A1 (en)* | 2014-05-20 | 2015-11-26 | Cohesity, Inc. | Intelligent disaster recovery
CN104469699B (en)* | 2014-11-27 | 2018-09-21 | 华为技术有限公司 | Cluster quorum method and more cluster coupled systems
CN105426275B (en)* | 2015-10-30 | 2019-04-19 | 成都华为技术有限公司 | The method and device of disaster tolerance in dual-active group system
WO2017107110A1 (en)* | 2015-12-23 | 2017-06-29 | 华为技术有限公司 | Service take-over method and storage device, and service take-over apparatus
CN107526652B (en)* | 2016-06-21 | 2021-08-20 | 华为技术有限公司 | A data synchronization method and storage device

Also Published As

Publication number | Publication date
CN109445984B (en) | 2020-12-15
CN112612653A (en) | 2021-04-06
CN109445984A (en) | 2019-03-08

Similar Documents

Publication | Title
US10983880B2 (en) | Role designation in a high availability node
US11194679B2 (en) | Method and apparatus for redundancy in active-active cluster system
CN109842651B (en) | Uninterrupted service load balancing method and system
US7661022B2 (en) | System for error handling in a dual adaptor system where one adaptor is a master
CN112612653B (en) | A business recovery method, device, arbitration server and storage system
CN106330475B (en) | A method and device for managing active and standby nodes in a communication system and a high-availability cluster
CN110275680B (en) | A dual-control dual-active storage system
CN108512753B (en) | A method and device for message transmission in a cluster file system
CN103905247B (en) | Two-unit standby method and system based on multi-client judgment
CN102394914A (en) | Cluster brain-split processing method and device
WO2016107443A1 (en) | Snapshot processing method and related device
JPH086910A (en) | Cluster computer system
CN106325768B (en) | A kind of two-shipper storage system and method
CN111209265B (en) | Database switching method and terminal equipment
CN110620684A (en) | Storage double-control split-brain-preventing method, system, terminal and storage medium
CN114218169A (en) | File operation method, distributed storage system, electronic device and storage medium
CN109344015B (en) | A method and system for preventing dual-primary nodes by using HA for database services
US12141448B2 (en) | Volume promotion management and visualization in a metro cluster
US11947431B1 (en) | Replication data facility failure detection and failover automation
US12430062B2 (en) | Managing transitions from metro-cluster to standalone objects
JP6822706B1 (en) | Cluster system, server equipment, takeover method, and program
CN118331496A (en) | Arbitration method, device, computer equipment and storage medium for storage cluster
CN116540940A (en) | Storage cluster management and control method, device, equipment and storage medium
CN118672118A (en) | Redundancy management method and device for virtualized control system
CN119719224A (en) | Data synchronization method, system and related products

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
