CN106250048B - Method and apparatus for managing a storage array - Google Patents

Method and apparatus for managing a storage array

Info

Publication number
CN106250048B
Authority
CN
China
Prior art keywords
storage array
controller
lun
write lock
main controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510307041.XA
Other languages
Chinese (zh)
Other versions
CN106250048A (en
Inventor
丁文强
袁舟
陈立钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201510307041.XA
Publication of CN106250048A
Application granted
Publication of CN106250048B
Active
Anticipated expiration


Abstract

Translated from Chinese

Embodiments of the present invention provide a method and an apparatus for managing a storage array. The method includes: when a controller in a first storage array receives a request to write data to a LUN, and the running state of the write lock permission manager of the LUN is online, the controller in the first storage array sends, over a communication link, a request for the write lock permission to the write lock permission manager of the LUN; and after the controller in the first storage array obtains the write lock permission granted by the write lock permission manager, it writes the data to the LUN. In the embodiments of the present invention, the local storage array writes data only after it has obtained the write lock permission of the LUN from the peer storage array, which keeps the data consistent.

Description

Method and apparatus for managing a storage array

Technical Field

Embodiments of the present invention relate to communication technologies, and in particular to a method and an apparatus for managing a storage array.

Background

An active-active data center uses a dual-write mode to write data to a local storage system and a remote storage system at the same time, so that the data on the remote storage system stays synchronized in real time with the data on the local storage system. When the local storage system fails, the services it carries can be switched to the remote storage system. This yields a solution in which both the recovery point objective (RPO) and the recovery time objective (RTO) are 0.

The active-active (AA) data center solution includes a dual main controller mode. In AA mode, the local storage system and the remote storage system both serve reads and writes to logical unit numbers (LUNs). When the local storage system fails, the remote storage system can seamlessly take over the services of the local storage system.

In the prior art, the AA mode treats the local storage system and the remote storage system as a whole and sets up one or more clusters in each storage system. A cluster includes one or more engines, and the engines synchronize data over a Fibre Channel (FC) or Internet Protocol (IP) network. In this prior art, however, the local storage system cannot learn the state of the storage controllers in the remote storage system. Moreover, when a local LUN is locked by a controller of the remote storage system, a write by the local controller breaks data consistency.

Summary of the Invention

Embodiments of the present invention provide a method and an apparatus for managing a storage array, in which the local storage array writes data only after it has obtained the write lock permission of the LUN from the peer storage array, so that data consistency is maintained.

In a first aspect, an embodiment of the present invention provides a method for managing a storage array, applied to a controller of a first storage array. The first storage array and a second storage array communicate over a communication link. The controller in the first storage array stores controller running state information and write lock permission information, where the controller running state information includes the running state of each controller in the second storage array, the write lock permission information includes the write lock permission manager of a logical unit number (LUN) in the first storage array, and the write lock permission manager of the LUN is a controller in the second storage array. The method includes:

when the controller in the first storage array receives a request to write data to the LUN, and the running state of the write lock permission manager of the LUN is online, sending, by the controller in the first storage array over the communication link, a request for the write lock permission to the write lock permission manager of the LUN; and

after the controller in the first storage array obtains the write lock permission granted by the write lock permission manager, writing the data to the LUN.
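The write path of the first aspect can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the class names, the dictionary layout of the two stored tables, and the direct method call standing in for the communication link are all assumptions. What happens when the manager is not online is not described in this passage, so the sketch simply declines the write in that case.

```python
class RemoteLockManager:
    """Hypothetical stand-in for the second-array controller that manages a LUN's write lock."""

    def __init__(self):
        self.locked = set()  # LUNs whose write lock is currently granted

    def request_write_lock(self, lun):
        # Grant the lock only if nobody holds it.
        if lun in self.locked:
            return False
        self.locked.add(lun)
        return True


class FirstArrayController:
    """Hypothetical controller in the first storage array."""

    def __init__(self, controller_states, lock_managers, links):
        # Controller running state information: remote controller id -> state.
        self.controller_states = controller_states
        # Write lock permission information: LUN -> id of its write lock permission manager.
        self.lock_managers = lock_managers
        # Assumed stand-in for the communication link: controller id -> remote object.
        self.links = links
        self.luns = {}  # LUN -> list of written data blocks

    def write(self, lun, data):
        manager_id = self.lock_managers[lun]
        # Only apply for the lock when the manager is known to be online.
        if self.controller_states.get(manager_id) != "online":
            return False  # the non-online case is outside this passage
        if not self.links[manager_id].request_write_lock(lun):
            return False
        # Lock obtained: write the data to the LUN.
        self.luns.setdefault(lun, []).append(data)
        return True


manager = RemoteLockManager()
controller = FirstArrayController(
    controller_states={"B1": "online"},
    lock_managers={"lun0": "B1"},
    links={"B1": manager},
)
controller.write("lun0", b"payload")
```

Keeping both tables inside the controller mirrors the text: the controller can decide locally whether the manager is reachable before it sends anything over the link.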

With reference to the first aspect, in a first possible implementation of the first aspect, after the controller in the first storage array sends, over the communication link, the request for the write lock permission to the lock permission owner of the LUN, the method further includes:

after the controller in the first storage array obtains the write lock permission granted by the write lock permission manager, sending, to the second storage array, an instruction to write the data to a mirror LUN of the LUN, where the mirror LUN is managed by a controller in the second storage array.

With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes:

after the write to the LUN is complete and a response message indicating that the write to the mirror LUN is complete has been received from the second storage array, releasing, by the controller in the first storage array, the write lock permission.

With reference to the first aspect or either of the first and second possible implementations of the first aspect, in a third possible implementation of the first aspect, the method further includes:

before the controller in the first storage array releases the write lock permission, when the controller in the first storage array receives another request to write data to the LUN and the running state of the write lock permission manager of the LUN is online, sending, over the communication link, a request for the write lock permission to the write lock permission manager of the LUN; and

receiving, by the controller in the first storage array, a response message from the write lock permission manager refusing to grant the write lock permission.
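The three implementations above (mirror write, release after both writes complete, refusal of a second request while the lock is held) can be illustrated with a small sketch. All names here are hypothetical, the mirror write is modelled as a local list append instead of an instruction and a completion response sent over the link, and the request identifiers are an assumption added so that only the holder can release the lock.

```python
class WriteLockManager:
    """Hypothetical write lock permission manager for LUNs of the first array."""

    def __init__(self):
        self.holder = {}  # LUN -> id of the request holding the write lock

    def request(self, lun, request_id):
        # Refuse while another request still holds the lock.
        if lun in self.holder:
            return False
        self.holder[lun] = request_id
        return True

    def release(self, lun, request_id):
        # Only the current holder can release.
        if self.holder.get(lun) == request_id:
            del self.holder[lun]


def mirrored_write(manager, lun, request_id, data, local_lun, mirror_lun):
    """Apply for the lock, then write to the LUN and to its mirror LUN."""
    if not manager.request(lun, request_id):
        return "refused"
    local_lun.append(data)   # write to the LUN in the first array
    mirror_lun.append(data)  # stands in for the instruction to the second array
    return "written"


def finish_write(manager, lun, request_id):
    """Release the lock once both writes are reported complete."""
    manager.release(lun, request_id)


manager = WriteLockManager()
local, mirror = [], []
first = mirrored_write(manager, "lun0", "w1", b"a", local, mirror)
# A second request that arrives before the release is refused.
second = mirrored_write(manager, "lun0", "w2", b"b", local, mirror)
finish_write(manager, "lun0", "w1")
```

Separating `finish_write` from `mirrored_write` makes the window between obtaining and releasing the lock explicit, which is the window in which the third implementation's refusal occurs.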

With reference to the first aspect, in a fourth possible implementation of the first aspect, the method further includes:

receiving, by a main controller in the first storage array, a to-be-processed event sent by a main controller in the second storage array, where the to-be-processed event occurred in the second storage array, the main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array;

updating, by the main controller in the first storage array according to the to-be-processed event, the running state of the controller in the second storage array that is included in the controller running state information; and

sending, by the main controller in the first storage array, the updated controller running state information to the other controllers in the first storage array, each of which updates its own stored copy of the controller running state information.

With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, when the to-be-processed event is a controller failure event, the updating, by the main controller in the first storage array according to the to-be-processed event, of the running state of the controller in the second storage array that is included in the controller running state information includes:

removing, by the main controller in the first storage array, the running state of the controller in which the controller failure event occurred from the controller running state information; or

updating, by the main controller in the first storage array, in the controller running state information, the running state of the controller in which the controller failure event occurred to offline or faulty.
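The two alternatives for handling a controller failure event, followed by distribution to the other controllers, might look like the sketch below. The function names, the `mode` parameter, and the dictionary representation of the controller running state information are assumptions made for illustration.

```python
def apply_failure_event(state_info, failed_id, mode="mark", new_state="offline"):
    """Return an updated copy of the controller running state information.

    mode="remove" drops the failed controller's entry;
    mode="mark" sets its state to "offline" or "faulty".
    """
    updated = dict(state_info)
    if mode == "remove":
        updated.pop(failed_id, None)
    elif mode == "mark":
        if new_state not in ("offline", "faulty"):
            raise ValueError("new_state must be 'offline' or 'faulty'")
        updated[failed_id] = new_state
    else:
        raise ValueError("mode must be 'remove' or 'mark'")
    return updated


def distribute(updated, other_copies):
    """Overwrite each other controller's stored copy with the updated information."""
    for copy in other_copies:
        copy.clear()
        copy.update(updated)


# The main controller of the first array learns that remote controller B2 failed.
main_copy = {"B1": "online", "B2": "online"}
other_copies = [dict(main_copy), dict(main_copy)]
main_copy = apply_failure_event(main_copy, "B2", mode="mark", new_state="faulty")
distribute(main_copy, other_copies)
```

Either alternative has the same effect on the write path: a controller that is absent from the table or not marked online is never sent a lock request.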

With reference to the first aspect, in a sixth possible implementation of the first aspect, the method further includes:

receiving, by the main controller in the first storage array, a recovery request sent by the main controller in the second storage array, where the recovery request carries the running state of each controller in the second storage array;

adding, by the main controller in the first storage array, the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online, or updating the running state of each controller in the second storage array to online; and

sending, by the main controller in the first storage array, a first response message to the main controller in the second storage array, where the first response message carries the running state of each controller in the first storage array;

where the main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

With reference to the first aspect, in a seventh possible implementation of the first aspect, the method further includes:

when the main controller in the first storage array detects a communication link recovery event, generating, by the main controller in the first storage array, a recovery request, where the recovery request carries the running state of each controller in the first storage array;

sending, by the main controller in the first storage array, the recovery request to the main controller in the second storage array;

receiving, by the main controller in the first storage array, a second response message sent by the main controller in the second storage array, where the second response message carries the running state of each controller in the second storage array;

parsing, by the main controller in the first storage array, the second response message to obtain the running state of each controller in the second storage array; and

adding, by the main controller in the first storage array, the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online, or updating the running state of each controller in the second storage array to online;

where the main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.
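The sixth and seventh implementations describe the two ends of one exchange: the side that detects link recovery sends a recovery request carrying its own controller states, and the receiving side records them and answers with its own. A minimal sketch under assumed names follows; the message format (a plain dictionary) and the direct method call in place of the communication link are illustrative only.

```python
class MainController:
    """Hypothetical main controller of one storage array."""

    def __init__(self, own_states):
        self.own_states = dict(own_states)  # running states of this array's controllers
        self.peer_states = {}               # controller running state information about the peer

    def on_link_recovered(self, peer):
        # Seventh implementation: generate and send a recovery request,
        # then parse the response and record the peer's controller states.
        request = {"type": "recovery_request", "states": dict(self.own_states)}
        response = peer.on_recovery_request(request)
        self._record(response["states"])

    def on_recovery_request(self, request):
        # Sixth implementation: record the sender's controller states,
        # then reply with this array's controller states.
        self._record(request["states"])
        return {"type": "response", "states": dict(self.own_states)}

    def _record(self, states):
        # Add missing entries or update existing ones.
        for controller_id, state in states.items():
            self.peer_states[controller_id] = state


first = MainController({"A1": "online", "A2": "online"})
second = MainController({"B1": "online"})
first.on_link_recovered(second)
```

After the exchange, each main controller holds the other array's controller states, which it would then pass on to the other controllers in its own array.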

With reference to the first aspect or any one of the first to seventh possible implementations of the first aspect, in an eighth possible implementation of the first aspect, the controller running state is any one of: online, offline, and faulty.

In a second aspect, an embodiment of the present invention provides an apparatus for managing a storage array, integrated in a controller of a first storage array. The first storage array and a second storage array communicate over a communication link. The controller in the first storage array stores controller running state information and write lock permission information, where the controller running state information includes the running state of each controller in the second storage array, the write lock permission information includes the write lock permission manager of a logical unit number (LUN) in the first storage array, and the write lock permission manager of the LUN is a controller in the second storage array. The apparatus includes a receiving module, a sending module, a detection module, and a processing module.

The receiving module is configured to receive a request to write data to the LUN.

The detection module is configured to detect the running state of the write lock permission manager of the LUN.

The sending module is configured to: when the receiving module receives the request to write data to the LUN and the detection module detects that the running state of the write lock permission manager of the LUN is online, send, over the communication link, a request for the write lock permission to the write lock permission manager of the LUN.

The receiving module is further configured to receive a reply message from the write lock permission manager granting the write lock permission.

The processing module is configured to write the data to the LUN after the write lock permission granted by the write lock permission manager is obtained.

With reference to the second aspect, in a second possible implementation of the second aspect, the processing module is further configured to:

after the write lock permission granted by the write lock permission manager is obtained, trigger the sending module to send, to the second storage array, an instruction to write the data to a mirror LUN of the LUN, where the mirror LUN is managed by a controller in the second storage array.

With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the receiving module is further configured to receive a response message, sent by the second storage array, indicating that the write to the mirror LUN is complete.

The processing module is further configured to release the write lock permission after the write to the LUN is complete and the receiving module has received the response message indicating that the write to the mirror LUN is complete.

With reference to the second aspect or either of the first and second possible implementations of the second aspect, in a third possible implementation of the second aspect, the sending module is further configured to: before the processing module releases the write lock permission, when the receiving module receives another request to write data to the LUN and the detection module detects that the running state of the write lock permission manager of the LUN is online, send, over the communication link, a request for the write lock permission to the write lock permission manager of the LUN.

The receiving module is further configured to receive a response message from the write lock permission manager refusing to grant the write lock permission.

With reference to the second aspect, in a fourth possible implementation of the second aspect, the controller in which the apparatus is integrated is the main controller in the first storage array, and the main controller in the first storage array is one of the controllers in the first storage array.

The receiving module is further configured to receive a to-be-processed event sent by the main controller in the second storage array, where the to-be-processed event occurred in the second storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

The processing module is further configured to update, according to the to-be-processed event, the running state of the controller in the second storage array that is included in the controller running state information.

The sending module is further configured to send the updated controller running state information to the other controllers in the first storage array, each of which updates its own stored copy of the controller running state information.

With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, when the to-be-processed event is a controller failure event, the processing module, when updating, according to the to-be-processed event, the running state of the controller in the second storage array that is included in the controller running state information, is specifically configured to:

remove the running state of the controller in which the controller failure event occurred from the controller running state information; or

update, in the controller running state information, the running state of the controller in which the controller failure event occurred to offline or faulty.

With reference to the second aspect, in a sixth possible implementation of the second aspect, the controller in which the apparatus is integrated is the main controller in the first storage array, and the main controller in the first storage array is one of the controllers in the first storage array.

The receiving module is further configured to receive a recovery request sent by the main controller in the second storage array, where the recovery request carries the running state of each controller in the second storage array.

The processing module is further configured to add the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online, or to update the running state of each controller in the second storage array to online.

The sending module is further configured to send a first response message to the main controller in the second storage array, where the first response message carries the running state of each controller in the first storage array.

The main controller in the second storage array is one of the controllers in the second storage array.

With reference to the second aspect, in a seventh possible implementation of the second aspect, the controller in which the apparatus is integrated is the main controller in the first storage array, and the main controller in the first storage array is one of the controllers in the first storage array.

The detection module is further configured to detect a communication link recovery event.

The processing module is further configured to generate a recovery request when the detection module detects the communication link recovery event, where the recovery request carries the running state of each controller in the first storage array.

The sending module is further configured to send the recovery request to the main controller in the second storage array.

The receiving module is further configured to receive a second response message sent by the main controller in the second storage array, where the second response message carries the running state of each controller in the second storage array.

The processing module is further configured to parse the second response message to obtain the running state of each controller in the second storage array, and to add the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online, or to update the running state of each controller in the second storage array to online.

The main controller in the second storage array is one of the controllers in the second storage array.

With reference to the second aspect or any one of the first to seventh possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the controller running state is any one of: online, offline, and faulty.

In the method and apparatus for managing a storage array according to the embodiments of the present invention, the controller in the first storage array stores controller running state information and write lock permission information. The controller running state information includes the running state of each controller in the second storage array, the write lock permission information includes the write lock permission manager of the LUN in the first storage array, the write lock permission manager of the LUN is a controller in the second storage array, and the first storage array and the second storage array communicate over a communication link. This allows the first storage array to monitor the running state of the controllers in the second storage array; that is, the first storage array can directly learn of the presence of the second storage array and of faults that occur in it. In addition, when the controller in the first storage array receives a request to write data to the LUN and the running state of the write lock permission manager of the LUN is online, the controller in the first storage array sends a request for the write lock permission to the write lock permission manager of the LUN, obtains the write lock permission granted by the write lock permission manager, and only then writes the data to the LUN, which keeps the data consistent.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a diagram of an example application scenario of the present invention;

FIG. 2 is a flowchart of Embodiment 1 of the method for managing a storage array according to the present invention;

FIG. 3 is a diagram illustrating how the method for managing a storage array according to the present invention works in the application scenario shown in FIG. 1;

FIG. 4 is a flowchart of Embodiment 2 of the method for managing a storage array according to the present invention;

FIG. 5 is an example diagram of Embodiment 3 of the method for managing a storage array according to the present invention;

FIG. 6 is a flowchart of Embodiment 4 of the method for managing a storage array according to the present invention;

FIG. 7 is an example diagram of Embodiment 5 of the method for managing a storage array according to the present invention;

FIG. 8 is an example diagram of Embodiment 6 of the method for managing a storage array according to the present invention;

FIG. 9 is a flowchart of Embodiment 7 of the method for managing a storage array according to the present invention;

FIG. 10 is a flowchart of Embodiment 8 of the method for managing a storage array according to the present invention;

FIG. 11 is a schematic structural diagram of Embodiment 1 of the apparatus for managing a storage array according to the present invention;

FIG. 12 is a schematic structural diagram of Embodiment 2 of the apparatus for managing a storage array according to the present invention.

具体实施方式Detailed ways

为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。In order to make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments These are some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

应理解,尽管本文中可使用术语第一、第二等等以描述各种元件或事件或报文,但此等元件或事件或报文不应受到此等术语限制。此等术语仅用以区分一个元件或事件或报文与另一元件或事件或报文。举例而言,在不脱离本申请保护范围的情况下,第一存储阵列可被称为第二存储阵列,且类似地,第二存储阵列可被称为第一存储阵列。其中,存储阵列包括控制器和磁盘,控制器接收到用户数据以后,发送给磁盘进行存储。It will be understood that, although the terms first, second, etc. may be used herein to describe various elements or events or messages, these elements or events or messages should not be limited by these terms. These terms are only used to distinguish one element or event or message from another element or event or message. For example, a first storage array may be referred to as a second storage array, and similarly, a second storage array may be referred to as a first storage array, without departing from the scope of this application. The storage array includes a controller and a disk, and after the controller receives user data, it sends the data to the disk for storage.

Here, the recovery time objective (RTO) is the length of time for which an enterprise can tolerate a service interruption. For example, if service must be restored within half a day after a disaster, the RTO is twelve hours. The recovery point objective (RPO) is the point in time to which the recovered data corresponds once service is restored. For example, if an enterprise performs one backup every day at midnight, then after service is restored the system holds only the data as of the last midnight before the disaster. Using these two simple measures, an enterprise can choose the most suitable disaster recovery solution for its existing data system, and can also choose one that meets predetermined RTO and RPO requirements.

In the embodiments of the present invention, an active-active (AA) mode data center solution is provided as a feature of the storage system. The embodiments achieve an RTO and an RPO of 0 without adding devices such as an engine. The embodiments can also be extended to active-active data center solutions built from other storage systems, such as network attached storage (NAS) and object storage.

In the embodiments of the present invention, a controller in the first storage array stores controller running-state information, which includes the running state of each controller in the second storage array. In this way, the controller clusters of the two independent storage arrays that form the active-active data center are virtualized as a whole, and every controller in each storage array holds a copy of the controller running-state information. The controller running-state information lets a storage array sense both the presence of its peer and faults that occur in the peer. In addition, the controller in the first storage array stores write lock authority information, which identifies the write lock authority manager of a LUN in the first storage array; that manager is a controller in the second storage array. This provides unified management for the distributed lock.

FIG. 1 shows two storage arrays, storage array A and storage array B. A person skilled in the art may regard storage array A as the first storage array in the embodiments of the present invention, in which case storage array B is the second storage array. Equally, storage array B may be regarded as the first storage array, in which case storage array A is the second storage array.

In the following embodiments, storage array A is used as the first storage array and storage array B as the second storage array.

Storage array A includes controller A1, controller A2, controller A3, and controller A4. Controller A1 is the master controller of storage array A, that is, the master controller in the first storage array, and the remaining controllers are slave controllers in the first storage array. Storage array B includes controller B1, controller B2, controller B3, and controller B4. Controller B1 is the master controller of storage array B, that is, the master controller in the second storage array, and the remaining controllers are slave controllers in the second storage array. Note that the number of slave controllers in the first storage array and in the second storage array is not limited to the three shown in FIG. 1.

In the embodiments of the present invention, each storage array maintains a copy of the controller running-state information and the write lock authority information; that is, this information is stored in the controllers of the storage array. The form in which a storage array stores this information is not limited; for example, it may be stored as a view. When stored as a view, the view includes the mappings, within storage array A, of the controllers in storage array B and, optionally, the local mappings of the controllers in storage array A. The mapping of a controller in storage array B may be referred to as an alias node. Consequently, storage array A and storage array B each see a different view.

As shown in FIG. 1, the view of storage array A includes the local mappings A1, A2, A3, and A4 of controllers A1, A2, A3, and A4, together with the alias nodes B1’, B2’, B3’, and B4’ obtained by mapping controllers B1, B2, B3, and B4. Correspondingly, the view of storage array B includes the local mappings B1, B2, B3, and B4 of controllers B1, B2, B3, and B4, together with the alias nodes A1’, A2’, A3’, and A4’ obtained by mapping controllers A1, A2, A3, and A4. Communication between the controllers of storage array A and storage array B is handled by switch 10 and switch 20, each of which serves a different connection fabric.
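The two per-array views described above can be sketched as follows. This is only an illustration: the function name `build_view`, the dictionary layout, and the use of a trailing apostrophe to mark an alias node are assumptions made for the example, not structures defined by the patent.

```python
def build_view(local_ids, remote_ids):
    """Build one array's view (hypothetical layout).

    Local controllers map to themselves; controllers of the peer array
    appear as alias nodes, marked here with a trailing apostrophe.
    """
    view = {cid: {"node": cid, "alias": False} for cid in local_ids}
    for cid in remote_ids:
        view[cid] = {"node": cid + "'", "alias": True}
    return view


array_a = ["A1", "A2", "A3", "A4"]
array_b = ["B1", "B2", "B3", "B4"]

view_a = build_view(array_a, array_b)  # the view held by storage array A
view_b = build_view(array_b, array_a)  # the view held by storage array B
```

In this sketch `view_a` contains B1' to B4' as alias nodes while `view_b` contains A1' to A4', so the two arrays see different views of the same eight controllers.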

FIG. 2 is a flowchart of Embodiment 1 of a method for managing a storage array according to the present invention. This embodiment provides a method for managing a storage array that is applied to a controller of a first storage array, where the first storage array and a second storage array communicate over a communication link. The controller in the first storage array stores controller running-state information and write lock authority information. The controller running-state information includes the running state of each controller in the second storage array. The write lock authority information identifies the write lock authority manager of a LUN in the first storage array, and that manager is a controller in the second storage array.

The method includes:

S201. When the controller in the first storage array receives a request to write data to the LUN, and the running state of the write lock authority manager of the LUN is online, the controller in the first storage array sends a request for the write lock permission to the write lock authority manager of the LUN over the communication link.

S202. After the controller in the first storage array obtains the write lock permission granted by the write lock authority manager, it writes the data to the LUN.

The requester of the write lock permission is the controller. Specifically, the write lock permission is the permission, requested by the controller on behalf of the data, to write that data to the LUN.

Specifically, in an active-active data center solution based on a distributed lock, host input/output (IO) can read from and write to both storage arrays at the same time. When the host writes to a LUN of a controller in the first storage array, the data is also written to the LUN of a controller in the second storage array, which keeps the LUNs of the controllers in the two storage arrays synchronized in real time. Here, a LUN is a block storage unit that a storage array presents to the host. Because the host can read and write the LUN in the first storage array and the LUN in the second storage array at the same time, host access to the LUN must be protected by a mutual exclusion lock between the storage arrays. This inter-array mutual exclusion lock is a distributed lock: the write lock authority manager may be any controller in either storage array, and every other controller that needs the write lock permission must request it from the write lock authority manager.

The controllers in the first storage array and the controllers in the second storage array are mapped into a view, and this view gives the inter-array mutual exclusion lock a view of the controllers across both storage arrays. The embodiments of the present invention manage the two storage arrays that form the active-active data center as one cluster, which simplifies deployment of the inter-array mutual exclusion lock, reduces the complexity of IO in the active-active data center, and improves IO performance.

The running state of a controller may be any one of online, offline, and faulty. Therefore, when the controller in the first storage array receives a request to write data to the LUN, it checks whether the running state of the write lock authority manager is online. If it is, the controller in the first storage array sends a request for the write lock permission to the write lock authority manager of the LUN over the communication link, and writes the data to the LUN after obtaining the write lock permission granted by the write lock authority manager.
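The check-then-request sequence of S201 and S202 might look like the following minimal sketch. All names here (`State`, `LockManager`, `handle_write`, the string return values) are hypothetical, and the in-process `LockManager` object stands in for a remote controller reached over the communication link. What the controller does when the manager is not online is outside this passage, so the sketch only reports that case.

```python
from enum import Enum


class State(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    FAULT = "fault"


class LockManager:
    """Stand-in for the write lock authority manager of one LUN."""

    def __init__(self):
        self.holder = None

    def request(self, requester):
        # Grant only if nobody currently holds the write lock permission.
        if self.holder is None:
            self.holder = requester
            return True
        return False


def handle_write(controller_id, lun, data, states, lock_info, managers, storage):
    """S201/S202: check the manager's state, request the lock, then write."""
    manager_id = lock_info[lun]                  # write lock authority information
    if states.get(manager_id) is not State.ONLINE:
        return "manager-not-online"              # handling not covered in this passage
    if not managers[manager_id].request(controller_id):
        return "lock-denied"
    storage.setdefault(lun, []).append(data)     # S202: write to the LUN
    return "written"
```

The sketch deliberately stops after the write; releasing the permission and mirroring the data are described later in the text.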

In the method and apparatus for managing a storage array according to the embodiments of the present invention, the controller in the first storage array stores controller running-state information and write lock authority information. The controller running-state information includes the running state of each controller in the second storage array; the write lock authority information identifies the write lock authority manager of a LUN in the first storage array, which is a controller in the second storage array; and the two storage arrays communicate over a communication link. This lets the first storage array monitor the running state of the controllers in the second storage array; that is, the first storage array can directly learn of the presence of the second storage array and of faults that occur in it. In addition, when the controller in the first storage array receives a request to write data to the LUN and the running state of the write lock authority manager of the LUN is online, the controller requests the write lock permission from the write lock authority manager and writes the data to the LUN only after the permission is granted, which keeps the data consistent.

On the basis of the foregoing embodiment, after the controller in the first storage array sends the request for the write lock permission to the write lock authority manager of the LUN over the communication link, the method may further include: after the controller in the first storage array obtains the write lock permission granted by the write lock authority manager, sending to the second storage array an instruction to write the data to the mirror LUN of the LUN, where the mirror LUN is managed by a controller in the second storage array. The LUN and the mirror LUN may have the same LUN identifier (ID), and the data stored in the LUN and the mirror LUN is kept synchronized. The LUN and the mirror LUN therefore protect each other's data: when the controller that manages the LUN fails, the data can be obtained from the mirror LUN, which avoids data loss. The LUN and the mirror LUN are located in different storage arrays and are managed by their respective controllers.

The embodiments of the present invention involve two storage arrays, and the host is connected to both. When the host issues an instruction to write data to a LUN, the LUN indicated by the LUN ID carried in the instruction is called an active-active LUN, and the two storage arrays together may be called an active-active storage array. The active-active LUN is virtual and corresponds to two physical LUNs, namely the LUN and the mirror LUN described above. The LUN IDs of the active-active LUN and of the two physical LUNs may all be the same. The two storage arrays therefore present only one LUN (the active-active LUN) to the user, while the data is actually stored in two physical LUNs (the LUN and the mirror LUN). When data is written to the LUN, it is actually written to both physical LUNs, so the data is in effect backed up. Because the local storage system can directly learn of the presence of the remote storage system and of faults that occur in it, seamless takeover of data writes can be achieved. During a write, both physical LUNs behind the active-active LUN are active (writable), which is why it is called an active-active LUN.

The running state is any one of online, offline, and faulty. An online controller works normally; an offline or faulty controller does not.

If, while data is being written to the LUN, another controller also writes other data to the LUN or to its mirror LUN, the data in the LUN and the mirror LUN becomes inconsistent; that is, data consistency between the LUN and the mirror LUN is broken. The embodiments of the present invention use the write lock permission, which is a mutually exclusive permission, to prevent this. Only the controller that has obtained the write lock permission has the right to write data to the LUN, and that controller also has the right to write the same data to the mirror LUN. Until the write lock permission is released, no other controller in either array may write data to the LUN or the mirror LUN.

The write lock permission is bound to the data being written; that is, it only allows writing the data specified when the permission was requested. Even a controller that holds the write lock permission cannot write any other data. In other words, the requester of the write lock permission is the controller, and the permission is the one the controller requests on behalf of the data in order to write that data to the LUN.

In addition, the method may further include: after the write to the LUN is complete and a response message indicating that the write to the mirror LUN is complete has been received from the second storage array, the controller in the first storage array releases the write lock permission. This ensures that the write lock permission is held by different controllers only at different times.

On this basis, the method may further include: before the controller in the first storage array releases the write lock permission, the controller in the first storage array receives another request to write data to the LUN and, when the running state of the write lock authority manager of the LUN is online, sends a request for the write lock permission to the write lock authority manager over the communication link; the controller in the first storage array then receives a response message from the write lock authority manager refusing to grant the write lock permission. That is, the write lock permission can be held by only one controller at a time, and while one controller holds it and has not released it, the other controllers wait.

The foregoing embodiment is illustrated here with a specific example. FIG. 3 shows the working principle of the example in FIG. 1. Referring to FIG. 3, a user accesses an active-active LUN through controller A2 of storage array A and controller B2 of storage array B. The active-active LUN comprises physical LUN1, which corresponds to controller A2, and physical LUN2, which corresponds to controller B2, and its write lock authority manager is controller B1. The controllers of storage array B see controller B1 as the write lock authority manager, whereas storage array A sees the alias node B1’ of controller B1 in the view of storage array A. When the user issues a write request through controller B2 of storage array B, controller B2 requests the write lock permission from controller B1, writes the data to physical LUN2 and physical LUN1, and releases the write lock permission once both writes are complete. Suppose that, between the time controller B2 obtains the write lock permission and the time it releases it, the user issues another request to write data to physical LUN1 through controller A2 of storage array A. Controller A2 requests the write lock permission from the alias node B1’ of controller B1 in the view of storage array A, determines that the physical controller represented by the alias node B1’ is controller B1 of storage array B, and sends the request to controller B1. Controller B1 finds that the write lock permission is held by controller B2 and has not been released, so it returns a request failure to controller A2, and the write request on controller A2 waits for the write lock permission. After controller B2 releases the write lock permission, controller B1 grants it to controller A2, and the write request on controller A2 proceeds with the write. Reads, by contrast, are served locally by the storage array that receives them; they involve no mutual exclusion across storage arrays and require no read lock permission.
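The contention between controller B2 and controller A2 in this example can be traced with a small sketch. The class `WriteLockManager`, its waiter queue, and the list-based LUNs are illustrative assumptions; the patent does not specify how controller B1 remembers that controller A2 is waiting.

```python
from collections import deque


class WriteLockManager:
    """Hypothetical model of controller B1 managing one write lock permission."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def request(self, controller):
        if self.holder is None:
            self.holder = controller
            return True
        # Permission is held and not yet released: refuse, remember the waiter.
        if controller != self.holder and controller not in self.waiters:
            self.waiters.append(controller)
        return False

    def release(self, controller):
        if self.holder != controller:
            raise ValueError(f"{controller} does not hold the write lock permission")
        # Hand the permission to the next waiting controller, if any.
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder


b1 = WriteLockManager()
lun1, lun2 = [], []                 # physical LUN1 (array A) and LUN2 (array B)

assert b1.request("B2")             # B2 obtains the permission
lun2.append("d1")                   # B2 writes its local LUN ...
lun1.append("d1")                   # ... and the mirror in array A
granted_to_a2 = b1.request("A2")    # A2 asks while B2 still holds it
next_holder = b1.release("B2")      # B2 releases after both writes complete
lun1.append("d2")                   # A2 now holds the permission and writes
lun2.append("d2")
b1.release("A2")
```

Because the second write starts only after the first has reached both physical LUNs, the two LUNs end with identical contents.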

The following specific embodiments describe how the controller running-state information stored by the controllers in the first storage array is managed.

FIG. 4 is a flowchart of Embodiment 2 of a method for managing a storage array according to the present invention. On the basis of the embodiment shown in FIG. 2, as shown in FIG. 4, the method may further include:

S401. The master controller in the first storage array receives a to-be-processed event sent by the master controller in the second storage array.

Specifically, the to-be-processed event occurs in the second storage array. The master controller in the first storage array is one of the controllers in the first storage array, and the master controller in the second storage array is one of the controllers in the second storage array.

S402. The master controller in the first storage array updates, according to the to-be-processed event, the running state of the controllers in the second storage array recorded in the controller running-state information.

S403. The master controller in the first storage array sends the updated controller running-state information to the other controllers in the first storage array, and each of those controllers updates its own stored copy of the controller running-state information.

This embodiment ensures that the controller running-state information stored by the controllers in the first storage array is consistent with that stored by the controllers in the second storage array. The consistency meant here is eventual consistency rather than strong consistency. Strong consistency requires the copies to agree at every moment. Eventual consistency does not; it requires only that the copies agree once the configuration or event-handling flow has finished.

Optionally, the to-be-processed event may be at least one of the following types: a controller failure event, a controller addition event, a failure event of the communication link between the first storage array and the second storage array, and a communication link recovery event. A to-be-processed event is an event that can change the number and/or running state of the controllers in the first storage array, and/or the number and/or running state of the controllers in the second storage array.

Different to-be-processed events affect the controller running-state information stored by the controllers in the first storage array differently. Several specific implementations are described in detail below.

In the first specific implementation, the to-be-processed event is a controller failure event. In this implementation, S402 may include: the master controller in the first storage array removes the running state of the failed controller from the controller running-state information; or the master controller in the first storage array updates the running state of the failed controller in the controller running-state information to offline or faulty.

In the first optional manner, for a failure of a controller in the first storage array, that is, a local controller failure event, the master controller in the first storage array removes the running state of the failed controller from the controller running-state information, or updates the running state of the failed controller in the controller running-state information to offline or faulty. Referring to FIG. 5, when the master controller in the first storage array, controller A1, detects that controller A4 has failed, it removes the mapping of controller A4 from the view, or updates the running state of controller A4 to offline or faulty.

In the second optional manner, if the failed controller is the master controller in the first storage array, the first storage array determines a new master controller. The new master controller in the first storage array then removes the running state of the failed controller from the controller running-state information, or updates the running state of the failed controller in the controller running-state information to offline or faulty.

The failed controller may be a controller in the second storage array, the master controller in the first storage array, or another controller in the first storage array. If it is a controller in the second storage array, the controller running-state information is updated as described for the first specific implementation; if it is the master controller in the first storage array, the second optional manner is used; and if it is another controller in the first storage array, the first optional manner is used.

In the second specific implementation, the to-be-processed event is a controller addition event. In this implementation, S402 may include: the master controller in the first storage array adds the controller introduced by the controller addition event to the controller running-state information.

The handling of a controller addition event is similar to that of a controller failure event. The added controller is added to the controller running-state information of the first storage array as an alias node. If the controller addition event occurs in the first storage array, the master controller in the first storage array adds the new controller to the controller running-state information.
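The effect of controller failure and controller addition events on the controller running-state information can be sketched as a pure function over a state table. The event dictionary layout, the function name `apply_event`, and the choice to record a newly added controller as "online" are assumptions for illustration; the text only says the controller is removed, marked offline or faulty, or added.

```python
def apply_event(status, event, remove_on_failure=False):
    """Return an updated copy of the controller running-state table."""
    updated = dict(status)
    kind, controller = event["type"], event["controller"]
    if kind == "controller_failure":
        if remove_on_failure:
            updated.pop(controller, None)   # option 1: remove the entry
        else:
            updated[controller] = "fault"   # option 2: mark offline or faulty
    elif kind == "controller_added":
        updated[controller] = "online"      # assumed initial state
    else:
        raise ValueError(f"unsupported event type: {kind}")
    return updated


# Table held by storage array A; B1' is an alias node for controller B1.
status = {"A1": "online", "A4": "online", "B1'": "online"}

failure = {"type": "controller_failure", "controller": "A4"}
after_fail = apply_event(status, failure)
after_remove = apply_event(status, failure, remove_on_failure=True)
after_add = apply_event(status, {"type": "controller_added", "controller": "B5'"})
```

Returning a copy keeps the original table intact until the master controller decides to distribute the update to the other controllers.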

FIG. 6 and FIG. 7 illustrate how the master controller in the first storage array and the master controller in the second storage array handle a controller failure event.

As shown in FIG. 6, the method may include:

S601. The master controller in the first storage array detects a controller failure event in the first storage array.

S602. The master controller in the first storage array updates the controller running-state information according to the controller failure event.

S603. The master controller in the first storage array synchronizes the updated controller running-state information to the other controllers in the first storage array.

S604. The master controller in the first storage array forwards the controller failure event to the master controller in the second storage array.

The master controller in the first storage array then ends its processing flow.

On receiving the controller failure event, the master controller in the second storage array updates the controller running-state information stored by the controllers in the second storage array, synchronizes the updated controller running-state information to the other controllers in the second storage array, and then ends its processing flow.

Referring to FIG. 7, ① indicates that the master controller in the first storage array updates the controller running-state information within the first storage array, ② indicates that the master controller in the first storage array sends the controller failure event to the master controller in the second storage array, and ③ indicates that the master controller in the second storage array updates the controller running-state information within the second storage array.
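The three numbered steps can be sketched with two in-memory arrays. The `Array` class, its per-controller tables, and the direct method call standing in for the inter-array message are assumptions for illustration. The sketch marks the failed controller as faulty rather than removing it, and it does not cover failure of the master controller itself.

```python
class Array:
    """Hypothetical model of one storage array and its per-controller state tables."""

    def __init__(self, controllers, peer_controllers):
        self.master = controllers[0]
        table = {c: "online" for c in controllers}
        table.update({c + "'": "online" for c in peer_controllers})  # alias nodes
        self.tables = {c: dict(table) for c in controllers}
        self.peer = None

    def _update_and_sync(self, key):
        master_table = self.tables[self.master]
        master_table[key] = "fault"                 # master updates its own copy
        for c in self.tables:                       # then syncs the other controllers
            if c != self.master:
                self.tables[c] = dict(master_table)

    def on_local_failure(self, controller):
        self._update_and_sync(controller)           # S602 and S603 (step 1)
        self.peer.on_peer_failure(controller)       # S604 (step 2)

    def on_peer_failure(self, controller):
        self._update_and_sync(controller + "'")     # peer master updates (step 3)


a = Array(["A1", "A2", "A3", "A4"], ["B1", "B2", "B3", "B4"])
b = Array(["B1", "B2", "B3", "B4"], ["A1", "A2", "A3", "A4"])
a.peer, b.peer = b, a

a.on_local_failure("A4")   # master A1 detects that A4 has failed (S601)
```

The copies in the two arrays disagree briefly between steps 1 and 3 and agree once the flow finishes, which is the eventual consistency described earlier.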

The main controller in the first storage array can handle the above events, in which the number and/or the running state of the controllers in the first storage array and the second storage array changes.

In a third specific implementation, the event to be processed is a communication link failure event. In this implementation, the main controller in the first storage array removes the running state of the controllers in the second storage array from the controller running state information; or, the main controller in the first storage array updates the running state of the controllers in the second storage array, as recorded in the controller running state information, to offline or fault.

When the communication link between the first storage array and the second storage array fails, the two storage arrays cannot communicate. Each of the first storage array and the second storage array removes the other's alias nodes from its controller running state information, or updates their running state to offline or fault, as shown in FIG. 8.
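The two alternatives for a communication link failure event (removing the peer's entries, or marking them offline) might look like this. The function name `on_link_failure`, its `remove` flag and the dictionary representation are hypothetical and chosen only for illustration.

```python
def on_link_failure(state_info, peer_controllers, remove=False):
    """Return a copy of the table with the peer array's controllers dropped or marked offline."""
    updated = dict(state_info)
    for name in peer_controllers:
        if remove:
            # Alternative 1: remove the peer's alias node from the table.
            updated.pop(name, None)
        else:
            # Alternative 2: keep the entry but mark it offline.
            updated[name] = "offline"
    return updated
```

Each array would apply the same update to its own table, since neither can reach the other while the link is down.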

In a fourth specific implementation, the event to be processed is a communication link recovery event, and the event to be processed is an event that occurs in the first storage array.

FIG. 9 is a flowchart of Embodiment 7 of the method for managing a storage array according to the present invention. On the basis of the embodiment shown in FIG. 2, as shown in FIG. 9, the method may further include:

S901. The main controller in the first storage array receives a recovery request sent by the main controller in the second storage array, where the recovery request carries the running state of each controller in the second storage array.

S902. The main controller in the first storage array adds the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online; or, it updates the running state of each controller in the second storage array to online.

S903. The main controller in the first storage array sends a first response message to the main controller in the second storage array, where the first response message carries the running state of each controller in the first storage array.

The main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

In a fifth specific implementation, the event to be processed is a link recovery event, and the event to be processed is an event generated on the second storage array side. FIG. 10 is a flowchart of Embodiment 8 of the method for managing a storage array according to the present invention. On the basis of the embodiment shown in FIG. 2, as shown in FIG. 10, the method may further include:

S110. When the main controller in the first storage array detects a communication link recovery event, it generates a recovery request, where the recovery request carries the running state of each controller in the first storage array.

S120. The main controller in the first storage array sends the recovery request to the main controller in the second storage array.

S130. The main controller in the first storage array receives a second response message sent by the main controller in the second storage array, where the second response message carries the running state of each controller in the second storage array.

S140. The main controller in the first storage array parses the second response message and obtains the running state of each controller in the second storage array.

S150. The main controller in the first storage array adds the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online; or, it updates the running state of each controller in the second storage array to online.

The main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

Next, the fourth and fifth specific implementations are described together.

When the communication link between the first storage array and the second storage array returns to normal, the two storage arrays perform smooth configuration of the controller running state information through the main controller in the first storage array and the main controller in the second storage array. Those skilled in the art can understand smooth configuration as follows: when the communication link between the two storage arrays returns to normal, the party that initiated the creation of the controller running state information queries the other party, with which it jointly created that information, for controller information, converts it into alias nodes, and writes them into its local controller running state information. That is, if the creation of the controller running state information was initiated by the first storage array in the embodiments of the present invention, the first storage array uses the fifth specific implementation to update the controller running state information; if the creation was initiated by the second storage array, the first storage array uses the fourth specific implementation to update the controller running state information.
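The recovery exchange described in the fourth and fifth implementations can be sketched from both sides. The message layout (a dictionary with `type` and `states` keys) and the function names are assumptions made for this example; the document does not specify a message format.

```python
def local_states(state_info, local_names):
    """Extract the running states of this array's own controllers."""
    return {name: state_info[name] for name in local_names}


def build_recovery_request(state_info, local_names):
    # Initiator side (S110): the request carries the initiator's own controller states.
    return {"type": "recovery_request", "states": local_states(state_info, local_names)}


def handle_recovery_request(state_info, local_names, request):
    # Responder side (S901 to S903): add the peer's states to the local table,
    # then answer with the responder's own controller states.
    state_info.update(request["states"])
    return {"type": "recovery_response", "states": local_states(state_info, local_names)}


def handle_recovery_response(state_info, response):
    # Initiator side (S140, S150): parse the response and add the peer's states.
    state_info.update(response["states"])
```

The array that initiated creation of the controller running state information would call `build_recovery_request` and `handle_recovery_response`; the other array would call `handle_recovery_request`.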

The foregoing embodiments describe, through specific implementations, how to update the controller running state information. The precondition for updating it is that the controller running state information has been created.

Creating the controller running state information may include:

1) The main controller in the first storage array receives a command for creating the controller running state information.

A user sends the command for creating the controller running state information to the first storage array, and the main controller in the first storage array executes it.

2) The main controller in the first storage array obtains the running state of the controllers in the first storage array.

3) The main controller in the first storage array sends a creation command to the main controller in the second storage array, where the creation command includes the running state of the controllers in the first storage array.

The first storage array generates the creation command according to the running state of the controllers in the first storage array and sends it to the main controller in the second storage array.

Correspondingly, after receiving the creation command, the main controller in the second storage array first obtains the running state of its local controllers. It then configures the controller running state information of the second storage array according to the running state of the controllers in the first storage array, which is included in the creation command, and the running state of the controllers in the second storage array. Once configuration is complete, the main controller in the second storage array synchronizes the configured controller running state information to the other controllers in the second storage array, and returns to the main controller in the first storage array a reply message carrying the running state of the controllers in the second storage array and the result of executing the creation.

4) The main controller in the first storage array receives the reply message sent by the main controller in the second storage array, where the reply message includes the running state of the controllers in the second storage array.

5) The main controller in the first storage array creates the controller running state information of the first storage array according to the running state of the controllers in the first storage array and the running state of the controllers in the second storage array.

In summary, after receiving the command for creating the controller running state information, the first storage array sends the running state of its own controllers to the second storage array and waits for the second storage array's execution result. It then creates the local controller running state information according to the running states of the controllers of both storage arrays.

Finally, the main controller in the first storage array synchronizes the local controller running state information to the other controllers in the first storage array. Optionally, the main controller in the first storage array returns the result to the user.
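Steps 2) to 5) of the creation procedure, together with the second storage array's handling of the creation command, can be sketched as below. The `send_create` callable stands in for the communication link, and the `result` field and its `"ok"` value are hypothetical placeholders for the creation result that the reply message is said to carry.

```python
def handle_create_command(local_states, command):
    """Second array: configure its table from both sets of states and build the reply."""
    table = {**command["states"], **local_states}
    reply = {"states": dict(local_states), "result": "ok"}
    return table, reply


def create_state_info(local_states, send_create):
    """First array: send its own states, wait for the reply, then create the local table."""
    # Steps 3) and 4): send the creation command and receive the reply message.
    reply = send_create({"states": dict(local_states)})
    if reply["result"] != "ok":
        raise RuntimeError("peer failed to create controller running state information")
    # Step 5): combine local and peer controller states.
    return {**local_states, **reply["states"]}
```

Synchronizing the resulting table to the other controllers in each array is omitted here for brevity.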

In the foregoing embodiments, a bitmap may be used to represent the running state of the controllers in the first storage array and/or the running state of the controllers in the second storage array. For example, online is represented as 1 and fault or offline as 0; the embodiments of the present invention are not limited to this.
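A bitmap of this kind could be packed into an integer, with bit i holding the state of controller i. The bit ordering and the function names are assumptions for the sake of the example.

```python
def encode_states(states):
    """Pack a sequence of booleans (True = online) into an int, bit i for controller i."""
    bitmap = 0
    for index, online in enumerate(states):
        if online:
            bitmap |= 1 << index
    return bitmap


def is_online(bitmap, index):
    """Return True if the bit for the controller at `index` is 1 (online)."""
    return ((bitmap >> index) & 1) == 1
```

With this encoding, four controllers of which only the second is faulty or offline are represented as binary 1101.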

FIG. 11 is a schematic structural diagram of Embodiment 1 of the apparatus for managing a storage array according to the present invention. This embodiment provides an apparatus for managing a storage array. The apparatus is integrated in a controller of a first storage array, and the first storage array and a second storage array communicate through a communication link. The controller in the first storage array stores controller running state information and write lock authority information, where the controller running state information includes the running state of each controller in the second storage array, and the write lock authority information includes the write lock authority manager of a LUN in the first storage array. The write lock authority manager of the LUN is a controller in the second storage array.

As shown in FIG. 11, the apparatus 100 for managing a storage array includes a receiving module 11, a sending module 22, a detection module 33 and a processing module 44.

The receiving module 11 is configured to receive a request to write data to the LUN. The detection module 33 is configured to detect the running state of the write lock authority manager of the LUN. The sending module 22 is configured to send, through the communication link, a request for write lock permission to the write lock authority manager of the LUN when the receiving module 11 receives the request to write data to the LUN and the detection module 33 detects that the running state of the write lock authority manager of the LUN is online. The receiving module 11 is further configured to receive a reply message in which the write lock authority manager provides the write lock permission. The processing module 44 is configured to write the data to the LUN after obtaining the write lock permission provided by the write lock authority manager.

The apparatus of this embodiment can be used to execute the technical solution of any of the foregoing method embodiments. Its implementation principles and technical effects are similar and are not repeated here.

On this basis, the processing module 44 may further be configured to: after obtaining the write lock permission provided by the write lock authority manager, trigger the sending module 22 to send to the second storage array an instruction to write the data into the mirror LUN of the LUN, where the mirror LUN is managed by a controller in the second storage array.

In addition, the receiving module 11 may further be configured to receive a response message, sent by the second storage array, indicating that writing to the mirror LUN is complete. The processing module 44 may further be configured to release the write lock permission once writing to the LUN is complete and the receiving module 11 has received the response message indicating that writing to the mirror LUN is complete.

Further, the sending module 22 may be configured to send, through the communication link, a request for write lock permission to the write lock authority manager of the LUN when, before the processing module 44 releases the write lock permission, the receiving module 11 receives another request to write data to the LUN and the detection module 33 detects that the running state of the write lock authority manager of the LUN is online. The receiving module 11 may further be configured to receive a response message, returned by the write lock authority manager, refusing to grant the write lock permission.
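The interplay of the four modules on the write path (request the lock, write the LUN, write the mirror LUN, release the lock, and a refusal while the lock is held) can be sketched as follows. `WriteLockManager` and `write_to_lun` are hypothetical names; in-memory lists stand in for the LUN and its mirror, and direct method calls stand in for messages over the communication link. The case where the manager is not online is not described in this passage, so the sketch does not handle it.

```python
class WriteLockManager:
    """Hypothetical stand-in for the write lock authority manager on the second array."""

    def __init__(self):
        self.holder = None  # controller currently holding the write lock, if any

    def request(self, requester):
        # Refuse the request while another holder has not released the lock.
        if self.holder is not None:
            return False
        self.holder = requester
        return True

    def release(self, requester):
        if self.holder == requester:
            self.holder = None


def write_to_lun(controller, data, manager, manager_state, lun, mirror_lun):
    """Return True if the data was written, False if the lock was refused."""
    if manager_state != "online":
        raise NotImplementedError("manager not online: outside the scope of this sketch")
    if not manager.request(controller):
        return False
    try:
        lun.append(data)         # write to the local LUN
        mirror_lun.append(data)  # stands in for the mirror write and its completion response
    finally:
        # Release only after both the local and the mirror write have been attempted.
        manager.release(controller)
    return True
```

Serializing writers through a single manager is what keeps the LUN and its mirror consistent in this sketch: a second writer gets a refusal instead of interleaving with the first.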

In one implementation, the controller in which the apparatus 100 for managing a storage array is integrated is the main controller in the first storage array, which is one of the controllers in the first storage array.

In this implementation, the receiving module 11 may further be configured to receive an event to be processed sent by the main controller in the second storage array, where the event to be processed occurs in the second storage array and the main controller in the second storage array is one of the controllers in the second storage array. The processing module 44 may further be configured to update, according to the event to be processed, the running state of the controllers in the second storage array contained in the controller running state information. The sending module 22 may further be configured to send the updated controller running state information to the other controllers in the first storage array, each of which updates its own stored controller running state information.

Optionally, when the event to be processed is a controller failure event, the processing module 44 may perform this update specifically by: removing the running state of the controller on which the controller failure event occurred from the controller running state information; or, in the controller running state information, updating the running state of that controller to offline or fault.

In another implementation, the controller in which the apparatus 100 for managing a storage array is integrated is the main controller in the first storage array, which is one of the controllers in the first storage array.

In this implementation, the receiving module 11 may further be configured to receive a recovery request sent by the main controller in the second storage array, where the recovery request carries the running state of each controller in the second storage array. The processing module 44 may further be configured to add the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online; or, to update the running state of each controller in the second storage array to online. The sending module 22 may further be configured to send a first response message to the main controller in the second storage array, where the first response message carries the running state of each controller in the first storage array. The main controller in the second storage array is one of the controllers in the second storage array.

In yet another implementation, the controller in which the apparatus 100 for managing a storage array is integrated is the main controller in the first storage array, which is one of the controllers in the first storage array.

In this implementation, the detection module 33 may further be configured to detect a communication link recovery event. The processing module 44 may further be configured to generate a recovery request when the detection module 33 detects the communication link recovery event, where the recovery request carries the running state of each controller in the first storage array. The sending module 22 may further be configured to send the recovery request to the main controller in the second storage array. The receiving module 11 may further be configured to receive a second response message sent by the main controller in the second storage array, where the second response message carries the running state of each controller in the second storage array. The processing module 44 may further be configured to parse the second response message and obtain the running state of each controller in the second storage array; and to add the running state of each controller in the second storage array to the controller running state information, where the running state of each controller in the second storage array is online, or to update the running state of each controller in the second storage array to online. The main controller in the second storage array is one of the controllers in the second storage array.

It should be noted that, in any of the foregoing embodiments, the controller running state may be any one of online, offline and fault. In addition, the controller-added event, the communication link failure event, the creation of the controller running state information and so on are not described again here; for details, refer to the foregoing method embodiments.

FIG. 12 is a schematic structural diagram of Embodiment 2 of the apparatus for managing a storage array according to the present invention. As shown in FIG. 12, the apparatus 200 for managing a storage array provided in this embodiment includes a processor 210 and a memory 220. The memory 220 stores execution instructions. When the apparatus 200 runs, the processor 210 communicates with the memory 220 and calls the execution instructions in the memory 220 to execute the foregoing method for managing a storage array. The implementation principles and technical effects are similar and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units or modules is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or modules may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or modules, and may be electrical, mechanical or in other forms.

The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art can understand that all or part of the steps of the foregoing method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disc.

Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (18)

1. A method for managing a storage array, characterized in that the method is applied in a controller of a first storage array, the first storage array and a second storage array communicate through a communication link, and the controller in the first storage array stores controller running state information and write lock authority information, wherein the controller running state information includes the running state of each controller in the second storage array, the write lock authority information includes the write lock authority manager of a logical unit number (LUN) in the first storage array, and the write lock authority manager of the LUN is a controller in the second storage array, the method comprising: when the controller in the first storage array receives a request to write data to the LUN and the running state of the write lock authority manager of the LUN is online, sending, by the controller in the first storage array through the communication link, a request for write lock permission to the write lock authority manager of the LUN; and after the controller in the first storage array obtains the write lock permission provided by the write lock authority manager, writing the data to the LUN.

2. The method according to claim 1, characterized in that, after the controller in the first storage array sends the request for write lock permission to the lock authority owner of the LUN through the communication link, the method further comprises: after the controller in the first storage array obtains the write lock permission provided by the write lock authority manager, sending to the second storage array an instruction to write the data into a mirror LUN of the LUN, the mirror LUN being managed by a controller in the second storage array.

3. The method according to claim 2, characterized in that the method further comprises: when writing to the LUN is complete and a response message indicating that writing to the mirror LUN is complete has been received from the second storage array, releasing, by the controller in the first storage array, the write lock permission.

4. The method according to any one of claims 1 to 3, characterized in that the method further comprises: when, before the controller in the first storage array releases the write lock permission, the controller in the first storage array receives another request to write data to the LUN and the running state of the write lock authority manager of the LUN is online, sending, through the communication link, a request for write lock permission to the write lock authority manager of the LUN; and receiving, by the controller in the first storage array, a response message returned by the write lock authority manager refusing to grant the write lock permission.

5. The method according to claim 1, characterized in that the method further comprises: receiving, by the main controller in the first storage array, an event to be processed sent by the main controller in the second storage array, the event to be processed occurring in the second storage array, wherein the main controller in the first storage array is one of the controllers in the first storage array and the main controller in the second storage array is one of the controllers in the second storage array; updating, by the main controller in the first storage array according to the event to be processed, the running state of the controllers in the second storage array contained in the controller running state information; and sending, by the main controller in the first storage array, the updated controller running state information to the other controllers in the first storage array, each of which updates its own stored controller running state information.

6. The method according to claim 5, characterized in that, when the event to be processed is a controller failure event, the updating, by the main controller in the first storage array according to the event to be processed, of the running state of the controllers in the second storage array contained in the controller running state information comprises: removing, by the main controller in the first storage array, the running state of the controller on which the controller failure event occurred from the controller running state information; or, in the controller running state information, updating, by the main controller in the first storage array, the running state of the controller on which the controller failure event occurred to offline or fault.

7. The method according to claim 1, characterized in that the method further comprises: receiving, by the main controller in the first storage array, a recovery request sent by the main controller in the second storage array, the recovery request carrying the running state of each controller in the second storage array; adding, by the main controller in the first storage array, the running state of each controller in the second storage array to the controller running state information, wherein the running state of each controller in the second storage array is online, or updating the running state of each controller in the second storage array to online; and sending, by the main controller in the first storage array, a first response message to the main controller in the second storage array, the first response message carrying the running state of each controller in the first storage array; wherein the main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

8. The method according to claim 1, characterized in that the method further comprises: when the main controller in the first storage array detects a communication link recovery event, generating, by the main controller in the first storage array, a recovery request, the recovery request carrying the running state of each controller in the first storage array; sending, by the main controller in the first storage array, the recovery request to the main controller in the second storage array; receiving, by the main controller in the first storage array, a second response message sent by the main controller in the second storage array, the second response message carrying the running state of each controller in the second storage array; parsing, by the main controller in the first storage array, the second response message to obtain the running state of each controller in the second storage array; and adding, by the main controller in the first storage array, the running state of each controller in the second storage array to the controller running state information, wherein the running state of each controller in the second storage array is online, or updating the running state of each controller in the second storage array to online; wherein the main controller in the first storage array is one of the controllers in the first storage array, and the main controller in the second storage array is one of the controllers in the second storage array.

9. The method according to any one of claims 1 to 3, characterized in that the controller running state is any one of online, offline and fault.

10. An apparatus for managing a storage array, characterized in that the apparatus is integrated in a controller of a first storage array, the first storage array and a second storage array communicate through a communication link, and the controller in the first storage array stores controller running state information and write lock authority information, wherein the controller running state information includes the running state of each controller in the second storage array, the write lock authority information includes the write lock authority manager of a logical unit number (LUN) in the first storage array, and the write lock authority manager of the LUN is a controller in the second storage array, the apparatus comprising a receiving module, a sending module, a detection module and a processing module; the receiving module is configured to receive a request to write data to the LUN; the detection module is configured to detect the running state of the write lock authority manager of the LUN; the sending module is configured to, when the receiving module receives the request to write data to the LUN and the detection module detects that the running state of the write lock authority manager of the LUN is online, send a request for write lock permission to the write lock authority manager of the LUN through 
the communication link;所述接收模块,还用于接收所述写锁权限管理者提供的写锁定权限的应答消息;The receiving module is further configured to receive a response message of the write lock authority provided by the write lock authority manager;所述处理模块,用于获得所述写锁权限管理者提供的写锁定权限后,向所述LUN写入所述数据。The processing module is configured to write the data to the LUN after obtaining the write lock authority provided by the write lock authority manager.11.根据权利要求10所述的装置,其特征在于,所述处理模块还用于:11. The apparatus according to claim 10, wherein the processing module is further configured to:在获得所述写锁权限管理者提供的写锁定权限后,触发所述发送模块向所述第二存储阵列发送向所述LUN的镜像LUN中写入所述数据的指令,所述镜像LUN由所述第二存储阵列中的控制器管理。After obtaining the write lock permission provided by the write lock permission manager, the sending module is triggered to send an instruction to write the data to the mirror LUN of the LUN to the second storage array, and the mirror LUN consists of A controller in the second storage array manages.12.根据权利要求11所述的装置,其特征在于,12. The apparatus of claim 11, wherein所述接收模块,还用于接收所述第二存储阵列发送的所述镜像LUN写入完成的响应消息;The receiving module is further configured to receive a response message of the mirror LUN writing completion sent by the second storage array;所述处理模块,还用于当所述LUN写入完成,并且所述接收模块收到所述镜像LUN写入完成的响应消息后,释放所述写锁定权限。The processing module is further configured to release the write lock authority when the LUN writing is completed and the receiving module receives a response message that the mirror LUN writing is completed.13.根据权利要求10至12任一项所述的装置,其特征在于,13. 
The device according to any one of claims 10 to 12, characterized in that,所述发送模块,还用于在所述处理模块释放所述写锁定权限之前,所述接收模块接收到另外一个向所述LUN写入数据的请求,并且所述检测模块检测到所述LUN的写锁权限管理者的运行状态是在线时,通过所述通信链路,向所述LUN的写锁权限管理者发送写锁定权限的申请请求;The sending module is further configured to, before the processing module releases the write lock authority, the receiving module receives another request for writing data to the LUN, and the detection module detects that the LUN is When the running state of the write lock authority manager is online, send an application request for write lock authority to the write lock authority manager of the LUN through the communication link;所述接收模块,还用于接收所述写锁权限管理者返回拒绝给予写锁定权限的响应消息。The receiving module is further configured to receive a response message returned by the write lock authority manager for refusing to grant the write lock authority.14.根据权利要求10所述的装置,其特征在于,集成所述装置的控制器为所述第一存储阵列中的主控制器,所述第一存储阵列中的主控制器为所述第一存储阵列中的控制器中的一个;14. The device according to claim 10, wherein the controller integrated with the device is a main controller in the first storage array, and the main controller in the first storage array is the first storage array. 
one of the controllers in a storage array;所述接收模块,还用于接收所述第二存储阵列中的主控制器发送的待处理事件,所述待处理事件发生在所述第二存储阵列中,其中,所述第二存储阵列中的主控制器为所述第二存储阵列中的控制器中的一个;The receiving module is further configured to receive a pending event sent by a main controller in the second storage array, where the pending event occurs in the second storage array, wherein the second storage array The main controller is one of the controllers in the second storage array;所述处理模块,还用于根据所述待处理事件,更新所述控制器运行状态信息包含的所述第二存储阵列中的控制器的运行状态;The processing module is further configured to update the operating state of the controller in the second storage array included in the controller operating state information according to the to-be-processed event;所述发送模块,还用于发送更新后的控制器运行状态信息给第一存储阵列中的其他控制器,各个其他控制器更新自己存储的所述控制器运行状态信息。The sending module is further configured to send the updated controller running state information to other controllers in the first storage array, and each other controller updates the controller running state information stored by itself.15.根据权利要求14所述的装置,其特征在于,当所述待处理事件为控制器故障事件时,所述处理模块执行根据所述待处理事件,更新所述控制器运行状态信息包含的所述第二存储阵列中的控制器的运行状态时,具体为:15. The apparatus according to claim 14, wherein when the to-be-processed event is a controller failure event, the processing module executes, according to the to-be-processed event, updating the information included in the controller operating state information. The running state of the controller in the second storage array is specifically:将发生所述控制器故障事件的控制器的运行状态从所述控制器运行状态信息中移除,或remove the operational status of the controller in which the controller failure event occurred from the controller operational status information, or在所述控制器运行状态信息中,将发生所述控制器故障事件的控制器的运行状态更新为离线或故障。In the controller operating state information, the operating state of the controller in which the controller failure event occurs is updated to offline or failure.16.根据权利要求10所述的装置,其特征在于,集成所述装置的控制器为所述第一存储阵列中的主控制器,所述第一存储阵列中的主控制器为所述第一存储阵列中的控制器中的一个;16 . 
The device according to claim 10 , wherein the controller integrated with the device is a main controller in the first storage array, and the main controller in the first storage array is the first storage array. 17 . one of the controllers in a storage array;所述接收模块,还用于接收所述第二存储阵列中的主控制器发送的恢复请求,所述恢复请求携带所述第二存储阵列中各控制器的运行状态;The receiving module is further configured to receive a recovery request sent by the main controller in the second storage array, where the recovery request carries the operating status of each controller in the second storage array;所述处理模块,还用于将所述第二存储阵列中各控制器的运行状态添加至所述控制器运行状态信息,其中,所述第二存储阵列中各控制器的运行状态为在线,或,将所述第二存储阵列中各控制器的运行状态更新为在线;The processing module is further configured to add the operating status of each controller in the second storage array to the controller operating status information, wherein the operating status of each controller in the second storage array is online, or, updating the running status of each controller in the second storage array to online;所述发送模块,还用于发送第一响应报文给所述第二存储阵列中的主控制器,所述第一响应报文携带所述第一存储阵列中各控制器的运行状态;The sending module is further configured to send a first response message to the main controller in the second storage array, where the first response message carries the operating status of each controller in the first storage array;其中,所述第二存储阵列中的主控制器为所述第二存储阵列中的控制器中的一个。Wherein, the main controller in the second storage array is one of the controllers in the second storage array.17.根据权利要求10所述的装置,其特征在于,集成所述装置的控制器为所述第一存储阵列中的主控制器,所述第一存储阵列中的主控制器为所述第一存储阵列中的控制器中的一个;17 . The device according to claim 10 , wherein the controller integrated with the device is a main controller in the first storage array, and the main controller in the first storage array is the first storage array. 18 . 
one of the controllers in a storage array;所述检测模块,还用于检测通信链路恢复事件;The detection module is also used to detect a communication link recovery event;所述处理模块,还用于当所述检测模块检测到所述通信链路恢复事件时,生成恢复请求,所述恢复请求携带所述第一存储阵列中各控制器的运行状态;The processing module is further configured to generate a recovery request when the detection module detects the communication link recovery event, where the recovery request carries the operating status of each controller in the first storage array;所述发送模块,还用于发送所述恢复请求给所述第二存储阵列中的主控制器;The sending module is further configured to send the recovery request to the main controller in the second storage array;所述接收模块,还用于接收所述第二存储阵列中的主控制器发送的第二响应报文,所述第二响应报文携带所述第二存储阵列中各控制器的运行状态;The receiving module is further configured to receive a second response message sent by the main controller in the second storage array, where the second response message carries the operating status of each controller in the second storage array;所述处理模块,还用于解析所述第二响应报文,获取所述第二存储阵列中各控制器的运行状态;及,将所述第二存储阵列中各控制器的运行状态添加至所述控制器运行状态信息,其中,所述第二存储阵列中各控制器的运行状态为在线,或,将所述第二存储阵列中各控制器的运行状态更新为在线;The processing module is further configured to parse the second response message to obtain the operating status of each controller in the second storage array; and add the operating status of each controller in the second storage array to a The controller operating status information, wherein the operating status of each controller in the second storage array is online, or the operating status of each controller in the second storage array is updated to online;其中,所述第二存储阵列中的主控制器为所述第二存储阵列中的控制器中的一个。Wherein, the main controller in the second storage array is one of the controllers in the second storage array.18.根据权利要求10至12任一项所述的装置,其特征在于,所述控制器运行状态包括:在线、离线以及故障中的任意一个。18. The device according to any one of claims 10 to 12, wherein the controller operating state comprises: any one of online, offline and fault.
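For illustration only, the write path of claims 10 to 13 and the failure handling of claim 15 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `Status`, `LockManager`, `LocalController`, `write`, and `on_controller_failure`, as well as the string return values, are hypothetical, and plain in-process method calls stand in for messages exchanged over the communication link.

```python
from enum import Enum


class Status(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    FAULTY = "faulty"


class LockManager:
    """Hypothetical stand-in for the second-array controller that manages a LUN's write lock."""

    def __init__(self):
        self.held = set()  # LUN ids whose write lock permission is currently granted

    def request(self, lun):
        if lun in self.held:
            return False  # refuse: the permission has not been released yet (claim 13)
        self.held.add(lun)
        return True

    def release(self, lun):
        self.held.discard(lun)


class LocalController:
    """Hypothetical controller in the first storage array."""

    def __init__(self, peer_status, lock_managers, peers):
        self.peer_status = peer_status      # controller id -> Status (operating status information)
        self.lock_managers = lock_managers  # LUN id -> managing controller id (write lock permission information)
        self.peers = peers                  # controller id -> LockManager (stands in for the link)
        self.lun_data = {}
        self.mirror_data = {}

    def write(self, lun, data):
        manager_id = self.lock_managers[lun]
        # Only ask for the lock when the manager is recorded as online.
        if self.peer_status.get(manager_id) is not Status.ONLINE:
            return "manager-not-online"  # this branch is outside the claims in this section
        manager = self.peers[manager_id]
        if not manager.request(lun):
            return "refused"
        self.lun_data[lun] = data      # write to the local LUN
        self.mirror_data[lun] = data   # stands in for the mirror-LUN write instruction (claim 11)
        manager.release(lun)           # release once both writes are complete (claim 12)
        return "written"

    def on_controller_failure(self, controller_id, remove=False):
        # Claim 15: either remove the entry or mark the controller as faulty.
        if remove:
            self.peer_status.pop(controller_id, None)
        else:
            self.peer_status[controller_id] = Status.FAULTY
```

In this sketch a second write to the same LUN is refused while the permission is still held, and no lock request is sent at all once the manager has been marked faulty or removed from the status information.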
CN201510307041.XA | 2015-06-05 | 2015-06-05 | Manage the method and device of storage array | Active | CN106250048B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201510307041.XA (CN106250048B, en) | 2015-06-05 | 2015-06-05 | Manage the method and device of storage array

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201510307041.XA (CN106250048B, en) | 2015-06-05 | 2015-06-05 | Manage the method and device of storage array

Publications (2)

Publication Number | Publication Date
CN106250048A (en) | 2016-12-21
CN106250048B (en) | 2019-06-28

Family

ID=57626470

Family Applications (1)

Application Number | Status | Priority Date | Filing Date | Title
CN201510307041.XA (CN106250048B, en) | Active | 2015-06-05 | 2015-06-05 | Manage the method and device of storage array

Country Status (1)

Country | Link
CN (1) | CN106250048B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108345515A (en) * | 2017-01-22 | 2018-07-31 | 中国移动通信集团四川有限公司 | Storage method and device and its storage system
WO2019080150A1 (en) * | 2017-10-25 | 2019-05-02 | 华为技术有限公司 | Dual active storage system and address allocation method
CN110209641A (en) * | 2018-02-12 | 2019-09-06 | 杭州宏杉科技股份有限公司 | A kind of trunking service processing method and device applied in more controlled storage systems
CN109857341B (en) * | 2019-01-15 | 2022-04-12 | 新华三技术有限公司成都分公司 | Method and device for determining write lock prefetch length

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US6073218A (en) * | 1996-12-23 | 2000-06-06 | Lsi Logic Corp. | Methods and apparatus for coordinating shared multiple raid controller access to common storage devices
CN102541471A (en) * | 2011-12-28 | 2012-07-04 | 创新科软件技术(深圳)有限公司 | Storage system with multiple controllers
CN104520845A (en) * | 2012-09-06 | 2015-04-15 | 惠普发展公司,有限责任合伙企业 | Scalable file system
CN103731485A (en) * | 2013-12-26 | 2014-04-16 | 华为技术有限公司 | Network equipment, cluster storage system and distributed lock management method
CN104486319A (en) * | 2014-12-09 | 2015-04-01 | 上海爱数软件有限公司 | Real-time synchronization method and real-time synchronization system for configuration file applied to high-availability system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
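Research and Design of Lock Management for Distributed Parallel File Systems; Zhao Wang (赵旺); China Master's Theses Full-text Database (Information Science and Technology); 2009-05-15 (2009, No. 5); main text pp. 9-11
Research and Implementation of a Dual-Controller RAID System; Yan Liang (严亮); Wanfang Data (Dissertations); 2012-12-25; main text pp. 31-32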

Also Published As

Publication number | Publication date
CN106250048A (en) | 2016-12-21

Similar Documents

Publication | Title
US11422908B2 (en) | Non-disruptive controller replacement in a cross-cluster redundancy configuration
US11163653B2 (en) | Storage cluster failure detection
US12073091B2 (en) | Low overhead resynchronization snapshot creation and utilization
US11249857B2 (en) | Methods for managing clusters of a storage system using a cloud resident orchestrator and devices thereof
US10146472B2 (en) | Tertiary storage unit management in bidirectional data copying
EP3062226B1 (en) | Data replication method and storage system
WO2016070375A1 (en) | Distributed storage replication system and method
JP2004532442A (en) | Failover processing in a storage system
JP2006209775A (en) | Storage replication system with data tracking
KR20110044858A (en) | Maintain data indetermination in data servers across data centers
CN106331166B (en) | A kind of access method and device of storage resource
CN104994168A (en) | distributed storage method and distributed storage system
US9830237B2 (en) | Resynchronization with compliance data preservation
CN106250048B (en) | Manage the method and device of storage array
WO2021115043A1 (en) | Distributed database system and data disaster backup drilling method
US8949562B2 (en) | Storage system and method of controlling storage system
CN105988901A (en) | Data copying method and storage system
WO2015196692A1 (en) | Cloud computing system and processing method and apparatus for cloud computing system
US9582384B2 (en) | Method and system for data replication
CN102281159A (en) | Recovery method of cluster system
US20240256147A1 (en) | Volume promotion management and visualization in a metro cluster

Legal Events

Code | Title
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
