CN102662787A


Info

Publication number: CN102662787A
Application number: CN201210116418XA
Authority: CN
Inventors: 孙磊; 李瑞东
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: IEIT Systems Co Ltd
Priority date: 2012-04-20
Filing date: 2012-04-20
Publication date: 2012-09-12

Abstract

The invention provides a method for protecting a system disk RAID. The system comprises a detection module and an operation module. The detection module detects the state of the system disk RAID, and the operation module shuts down or resumes the system according to the state reported by the detection module. Based on the RAID information it collects, the detection module sends a mandatory recovery instruction to the operation module, forcing the user to perform a recovery operation; otherwise the system stays shut down. The specific steps are as follows. The system first uses the interface information provided by the RAID card's API to obtain the RAID degrade information caused by a dropped disk, either by receiving the alarm log issued by the RAID card or by having the detection module query the RAID state itself through the provided API. Once the detection module obtains the dropped-disk information, it automatically triggers the next operation and warns the user currently using the system. If the user does not act in time, the operation module forcibly suspends the user's system and shuts it down. If the user then tries to force a boot, the operation module refuses. Only after the user has actually replaced the failed hard disk and the RAID state has been restored can the user boot normally.

Description

Translated from Chinese

A method for protecting a system disk RAID

Technical field

The invention relates to the field of computer storage testing, and in particular to a method for protecting a system disk RAID.

Background

At present, many system disk RAIDs share a fatal problem: if a damaged disk is not repaired in time, the whole system may crash. When a RAID fails, storage systems generally raise alarms, for example a controller buzzer, a red light on the disk enclosure, and accompanying log alerts. However, if the machine sits in a remote machine room and the user neither checks the logs nor repairs the fault in time, the RAID is likely to degrade further and suffer unrecoverable damage. The present invention addresses this problem by identifying the RAID state and forcing the user to repair the RAID.

Summary of the invention

The purpose of the invention is to provide a method for protecting a system disk RAID.

The purpose of the invention is achieved as follows. The system comprises a detection module and an operation module. The detection module detects the state of the system disk RAID, and the operation module shuts down or resumes the system according to the state reported by the detection module. Based on the RAID information it collects, the detection module sends a mandatory recovery instruction to the operation module, forcing the user to perform a recovery operation; otherwise the system stays shut down. The specific steps are as follows:

The system first uses the interface information provided by the RAID card's API to obtain the RAID degrade information caused by a dropped disk, either by receiving the alarm log issued by the RAID card or by having the detection module query the RAID state itself through the provided API. Once the detection module obtains the dropped-disk information, it automatically triggers the next operation and warns the user currently using the system. If the user does not act in time, the operation module forcibly suspends the user's system and shuts it down. If the user then tries to force a boot, the operation module refuses. Only after the user has actually replaced the failed hard disk and the RAID state has been restored can the user boot normally.
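The decision flow in the steps above can be sketched as a small pure function. This is only an illustration under stated assumptions: the names `Action` and `decide` and the `grace_seconds` default are hypothetical, since the patent says only that the user must act "in time" and gives no concrete interval.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "none"                # nothing to do right now
    WARN = "warn"                # notify the user currently on the system
    SHUTDOWN = "shutdown"        # forced shutdown after the warning is ignored
    REFUSE_BOOT = "refuse_boot"  # keep the system off while still degraded
    ALLOW_BOOT = "allow_boot"    # RAID restored, normal boot permitted


def decide(degraded: bool, system_running: bool,
           seconds_since_warning: Optional[float],
           grace_seconds: float = 600.0) -> Action:
    """Map the detected RAID state to the operation module's next action.

    grace_seconds is an assumed value, not taken from the patent.
    """
    if not system_running:
        # Boot request: permitted only once the RAID is no longer degraded.
        return Action.REFUSE_BOOT if degraded else Action.ALLOW_BOOT
    if not degraded:
        return Action.NONE
    if seconds_since_warning is None:
        return Action.WARN  # first detection: warn before anything else
    if seconds_since_warning >= grace_seconds:
        return Action.SHUTDOWN  # warning ignored for too long
    return Action.NONE  # warning issued, still inside the grace period
```

Keeping the policy free of side effects means the warning, shutdown, and boot-refusal paths can be checked without touching real hardware.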

The beneficial effects of the invention are as follows. It prevents the irrecoverable data loss caused by user negligence, insufficiently skilled network administrators, or lax management. Because the method forces the user to handle and recover from the fault, it greatly strengthens system reliability and stability. It also simplifies the RAID error alarm mechanism: previously, when a RAID error occurred, there was always a large pile of logs to interpret before deciding what to do. The invention simplifies this process so that the user only needs to follow the prompts to recover from the fault.

Description of the drawings

Fig. 1 is a flow chart of the system;

Fig. 2 is a schematic diagram of the system structure.

Detailed description

The method of the invention is described in detail below with reference to the accompanying drawings.

In the method of the invention for protecting a system disk RAID, the interface information provided by the RAID card's API is first used to obtain the RAID degrade information for a dropped disk. This can be done in more than one way: the state can be judged from the alarm log issued by the RAID card, or the detection module can query the RAID state itself through the provided API. After obtaining the dropped-disk information, the detection module does not simply report a fault as in conventional practice; instead it automatically triggers the next operation, which is to warn the user currently using the system. If the user does not act in time, the operation module forcibly suspends the user's system and shuts it down. If the user then tries to force a boot, the operation module does not allow it. Only after the user has actually replaced the failed hard disk and the RAID state has been restored can the user boot normally.
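The two detection paths described here, reading the card's alarm log and querying the state directly, might look roughly like the following. The log wording and the `"degrade"` / `"optimal"` keywords are assumptions made for illustration; the patent does not specify a log format, and `query_state` is a hypothetical stand-in for whatever call a particular RAID card's API exposes.

```python
from typing import Callable, Iterable


def degraded_from_alarm_log(lines: Iterable[str]) -> bool:
    """Passive path: the latest state-bearing alarm-log line wins.

    The keywords matched here are assumed, not a real vendor log format.
    """
    degraded = False
    for line in lines:
        text = line.lower()
        if "degrade" in text:      # also matches "degraded"
            degraded = True
        elif "optimal" in text:    # assumed marker for a restored array
            degraded = False
    return degraded


def degraded_from_api(query_state: Callable[[], str]) -> bool:
    """Active path: the detection module asks the card for its state itself."""
    return query_state().strip().lower() == "degraded"
```

Both functions reduce to the same boolean, so the rest of the detection module does not need to know which path produced it.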

Embodiment

As shown in the figures, the current RAID state can be obtained through the RAID card's API interface information. If the state is degrade, the detection module, on receiving this information, sends an instruction to the operation module to forcibly shut down the system.

The operation module forcibly shuts down the system to compel the user to restore it; until the RAID is restored, the user cannot enter the system. After the user has completed the recovery, the detection module, based on the new RAID state reported by the RAID card API, issues the instruction to start the system, and only then can the user enter the system.
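One way to make the boot refusal survive a power cycle is a marker written at forced shutdown and cleared only when the card reports a healthy state again. The sketch below assumes this design; the `BootGate` class, the marker file, and the `raid_is_degraded` and `power_off` callables are all hypothetical, as the patent does not say how the refusal is enforced.

```python
from pathlib import Path
from typing import Callable


class BootGate:
    """Keeps a lock marker on disk between forced shutdown and RAID recovery."""

    def __init__(self, lock_path: Path, raid_is_degraded: Callable[[], bool]):
        self.lock_path = lock_path                # hypothetical marker location
        self.raid_is_degraded = raid_is_degraded  # stand-in for the RAID card API

    def force_shutdown(self, power_off: Callable[[], None]) -> None:
        # Record why the system went down before actually powering off.
        self.lock_path.write_text("raid-degraded\n")
        power_off()

    def may_boot(self) -> bool:
        if not self.lock_path.exists():
            return True   # no forced shutdown is pending
        if self.raid_is_degraded():
            return False  # disk not replaced yet: keep refusing
        self.lock_path.unlink()  # RAID restored: clear the lock
        return True
```

In this sketch the gate only refuses a boot that follows a forced shutdown; a stricter variant could check the RAID state on every boot.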

The method can be implemented by embedding the detection module and operation module software in the OS. The detection module is responsible for communicating with the RAID card, continuously detecting and exchanging RAID information. The operation module is bound to the OS and performs the corresponding operations according to the input from the detection module.
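The continuous detection described here could be a simple polling loop that hands control to the operation module as soon as a degraded state appears. This is a sketch under assumptions: the function name, the polling interval, and both callables are hypothetical, and a real implementation might instead subscribe to events from the card.

```python
import time
from typing import Callable, Optional


def run_detection_loop(raid_is_degraded: Callable[[], bool],
                       on_degraded: Callable[[], None],
                       interval_seconds: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep,
                       max_polls: Optional[int] = None) -> int:
    """Poll the RAID state and notify the operation module once it degrades.

    Returns the number of polls performed. interval_seconds is an assumed
    value; max_polls exists only so the loop can be bounded in tests.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if raid_is_degraded():
            on_degraded()  # hand over to the operation module
            break
        sleep(interval_seconds)
    return polls
```

Passing `sleep` and the two callables in as parameters keeps the loop independent of any particular RAID card or operating system.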

Everything other than the technical features described in the specification is known to those skilled in the art.

Claims

1. A method for protecting a system disk RAID, characterized in that the system comprises a detection module and an operation module; the detection module detects the state of the system disk RAID; the operation module shuts down or resumes the system according to the state reported by the detection module; the detection module, based on the RAID information it collects, sends a mandatory recovery instruction to the operation module, thereby forcing the user to perform a recovery operation, otherwise the system stays shut down; the specific steps are as follows:

The system first uses the interface information provided by the RAID card's API to obtain the RAID degrade information caused by a dropped disk, either by receiving the alarm log issued by the RAID card to judge the state or by having the detection module query the RAID state itself through the provided API; after obtaining the dropped-disk information, the detection module automatically triggers the next operation and warns the user currently using the system; if the user does not act in time, the operation module forcibly suspends the user's system and shuts it down; if the user then tries to force a boot, the operation module refuses; only after the user has actually replaced the failed hard disk and the RAID state has been restored can the user boot normally.