CN106021065A

Info

Publication number: CN106021065A
Application number: CN201610333684.6A
Authority: CN
Inventors: 曲洪磊
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: IEIT Systems Co Ltd
Priority date: 2016-05-19
Filing date: 2016-05-19
Publication date: 2016-10-12

Abstract

The invention discloses a method for automatically detecting bad-sector information of disks in a RAID array under Linux. Based on a shell script, the method calls the Linux RAID tool provided by each RAID card manufacturer, checks the track information of all disks in the server at regular intervals, and reports any bad-sector information. By using the interfaces provided by the RAID card manufacturers together with the OS monitoring service for I/O processing, the method monitors bad-sector information of disks in the RAID array in real time, discovers latent disk faults promptly, avoids data loss and service interruption, and so improves server stability and data security.

Description

A method for automatically detecting bad-sector information of disks in a RAID array under Linux

Technical field

The invention relates to the technical field of system maintenance, and in particular to a method for automatically detecting bad-sector information of disks in a RAID array under Linux.

Background

Disks are an essential component of a server. Most servers today use their disks in a RAID array, which protects data and also makes it possible to replace a disk online. However, the alarm function built into RAID usually raises an alert only once a disk can no longer work; it does not periodically check for bad sectors or warn about them. For services that read and write data frequently, bad sectors can easily cause reads and writes to deadlock, which seriously affects the security and timeliness of data.

Summary of the invention

The technical problem to be solved by the invention is as follows. To address the problem described above, the invention proposes a method for automatically detecting bad-sector information of disks in a RAID array under Linux. The method checks the bad-sector information of the disks in real time and, as soon as bad sectors appear, raises an early warning and tells the user the number of bad sectors.

The technical solution adopted by the invention is:

A method for automatically detecting bad-sector information of disks in a RAID array under Linux. The method is based on a shell script that calls the Linux RAID tool of each RAID card manufacturer (for example, LSI's storcli and PMC's arcconf), checks the track information of all disks in the server at regular intervals (for example, every hour), and reports the bad-sector information of the disks.

The method saves the bad-sector information of the disks to a log through the shell script.

When the bad-sector information of a disk reaches the warning line, the method blinks the disk's locate light to prompt the user to check the disk and replace it in time.

The script runs as follows:

1) display the current disk list;

2) obtain the total number of disks;

3) traverse all disks and check their temperature; if the temperature exceeds the limit, raise an alarm and light the indicator;

4) traverse all disks and check their bad-sector information; if it exceeds the limit, raise an alarm and light the indicator.

The whole process mainly uses a shell script to detect bad-sector information, save logs, and blink the locate light as a warning. The method is convenient and simple to operate. It finds bad-sector information promptly without disrupting normal reads and writes, which protects data security and keeps services running normally.

The beneficial effects of the invention are:

The method uses the interfaces provided by each RAID card manufacturer and the OS monitoring service for I/O processing to monitor bad-sector information of disks in the RAID array in real time. It discovers latent disk faults promptly, avoids data loss and service interruption, and so improves server stability and data security.

Description of drawings

Fig. 1 is a flow chart of the script of the invention.

Detailed description

The invention is further described below through specific embodiments, with reference to the accompanying drawing:

Example 1:

Example 2

On the basis of Example 1, the method of this example saves the bad-sector information of the disks to a log through the shell script.
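A minimal sketch of the log-saving step, under stated assumptions: the log path /tmp/disk_check.log, the function name log_disk_status, and the record layout are all hypothetical, since the patent does not specify them.

```shell
#!/bin/bash
# Hypothetical log location; the patent does not name one.
LOG_FILE="${LOG_FILE:-/tmp/disk_check.log}"

# Append one timestamped record for a disk slot to the log.
#   $1: slot index of the disk
#   $2: bad-sector count reported for that disk
log_disk_status() {
    local slot="$1" bad_count="$2"
    printf '%s slot=%s bad_sectors=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$slot" "$bad_count" >> "$LOG_FILE"
}
```

Calling log_disk_status 3 7 appends a line ending in "slot=3 bad_sectors=7". Appending rather than overwriting keeps the history of earlier checks, so a rising count can be seen over time.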

Example 3

On the basis of Example 2, in this example, when the bad-sector information of a disk reaches the warning line, the method blinks the disk's locate light to prompt the user to check the disk and replace it in time.
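The warning-line check could look like the following sketch. WARN_THRESHOLD and both function names are hypothetical, and the patent gives no numeric value for the warning line. RAID_TOOL defaults to raid_check, the command name used in the patent's own listing, and the "start locate" subcommand is taken from that listing as well.

```shell
#!/bin/bash
RAID_TOOL="${RAID_TOOL:-raid_check}"    # command name as used in the patent's listing
WARN_THRESHOLD="${WARN_THRESHOLD:-1}"   # hypothetical warning line; not given in the patent

# Succeed when the bad-sector count has reached the warning line.
reached_warning_line() {
    local count="$1"
    [ "$count" -ge "$WARN_THRESHOLD" ]
}

# Print a warning and blink the locate light of a slot
# when its bad-sector count has reached the warning line.
warn_if_needed() {
    local slot="$1" count="$2"
    if reached_warning_line "$count"; then
        echo "WARNING: disk in slot $slot has $count bad sectors"
        $RAID_TOOL "/c0/eall/s$slot" start locate
    fi
}
```

Keeping the threshold in one variable lets an operator tune the warning line without editing the check itself.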

Example 4

On the basis of Example 3, the script of this example mainly performs the following functions:

It automatically detects the bad-sector information of the disks, saves the information to the log as it goes, and repeats the check every hour. When the number of bad sectors reaches the warning count, the script automatically lights the disk's locate light to tell the user that the number of bad sectors has reached the warning level.

As shown in Fig. 1, execute the script /disk.sh;

1) check the bad-sector information of the disks every hour;

2) when the number of bad sectors exceeds the threshold, immediately light the disk's locate light, display an alarm in the system, and go to step 4);

3) when the number of bad sectors does not exceed the threshold;

4) continue checking, save the results as they are produced, and monitor the log in real time;

5) execution complete.
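The hourly cycle in these steps could be driven by a small loop such as the sketch below. run_periodically and its parameters are hypothetical names; the patent only states that the check repeats every hour.

```shell
#!/bin/bash
# Run a command repeatedly, sleeping between rounds.
#   $1: seconds to sleep between rounds (3600 for an hourly check)
#   $2: number of rounds; 0 means run until interrupted
#   remaining arguments: the command to run each round
run_periodically() {
    local interval="$1" rounds="$2"
    shift 2
    local n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        "$@"
        n=$((n + 1))
        # Sleep only if another round will follow.
        if [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
}

# Example (not executed here): run_periodically 3600 0 /disk.sh
```

A cron entry that starts the script once an hour would be an alternative to a long-running loop.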

The main code is as follows:

raid_check /c0/eall/sall show    # display the current disk list
sas_number=$(raid_check /c0/eall/sall show | grep -i HDD | wc -l)    # get the total number of disks
for ((i = 0; i < sas_number; i++))
do
    # Check the temperature of each disk; if it exceeds the limit, raise an alarm and light the indicator.
    # Assumption: the first integer on the matching line is the temperature value.
    temperature=$(raid_check /c0/eall/s$i show all | grep -i temperature | grep -oE '[0-9]+' | head -n 1)
    if [ "${temperature:-0}" -gt 64 ]
    then
        echo "the SAS $i temperature is too hot!!!!!"
        raid_check /c0/eall/s$i start locate
    fi
    # Check the bad-sector information of each disk; if it exceeds the limit, raise an alarm and light the indicator.
    # Assumption: the first integer on the matching line is the error count.
    sas_error=$(raid_check /c0/eall/s$i show all | grep -i predictive | grep -oE '[0-9]+' | head -n 1)
    if [ "${sas_error:-0}" -gt 0 ]
    then
        echo "the SAS $i has $sas_error error!!!"
        raid_check /c0/eall/s$i start locate
    fi
done
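The numeric comparisons in the listing need a bare integer, whereas grep returns a whole line of text. The sketch below shows one way to pull the number out. The helper name first_number_for is hypothetical, and the two sample lines are an assumed output format, not something the patent shows.

```shell
#!/bin/bash
# Read tool output on stdin and print the first integer found
# on the lines that contain the keyword (case-insensitive).
first_number_for() {
    local keyword="$1"
    grep -i -- "$keyword" | grep -oE '[0-9]+' | head -n 1
}

# Assumed sample of per-disk output, for illustration only.
sample='Drive Temperature =  30C (86.00 F)
Predictive Failure Count = 2'

temperature=$(printf '%s\n' "$sample" | first_number_for temperature)
if [ "${temperature:-0}" -gt 64 ]; then
    echo "temperature is too hot"
fi
```

The ${temperature:-0} default keeps the test from failing with a syntax error when no matching line is found.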

The above embodiments are intended only to illustrate the invention, not to limit it. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the invention. All equivalent technical solutions therefore also fall within the scope of the invention, and the scope of patent protection of the invention shall be defined by the claims.

Claims

1. A method for automatically detecting bad-sector information of disks in a RAID array under Linux, characterized in that: the method is based on a shell script, calls the Linux RAID tool of each RAID card manufacturer, checks the track information of all disks in the server at set intervals, and reports the bad-sector information of the disks.

2. The method for automatically detecting bad-sector information of disks in a RAID array under Linux according to claim 1, characterized in that the method saves the bad-sector information of the disks to a log through the shell script.

3. The method for automatically detecting bad-sector information of disks in a RAID array under Linux according to claim 2, characterized in that, when the bad-sector information of a disk has reached the warning line, the method blinks the disk's locate light to prompt the user to pay attention to the disk information.

4. The method for automatically detecting bad-sector information of disks in a RAID array under Linux according to claim 3, characterized in that the script runs as follows:

1) display the current disk list;

2) obtain the total number of disks;

3) traverse all disks and check their temperature; if the temperature exceeds the limit, raise an alarm and light the indicator;

4) traverse all disks and check their bad-sector information; if it exceeds the limit, raise an alarm and light the indicator.