Technical Field
The invention relates to the technical field of system maintenance, and in particular to a method for automatically detecting bad track information of disks in a RAID under Linux.
Background Art
Disks are an indispensable part of a server. Most servers today use their disks in a RAID array, which protects the data and also makes it possible to replace a disk online. However, the alarm function built into the RAID in many cases raises an alert only once a disk can no longer work; it does not periodically check for bad tracks or warn about them. For services that read and write data frequently, bad tracks can easily cause deadlocks during reads and writes, which seriously affects the security and timeliness of the data.
Summary of the Invention
The technical problem to be solved by the invention is as follows: to address the problem described above, the invention proposes a method for automatically detecting bad track information of disks in a RAID under Linux. The method checks the bad track information of the disks in the RAID in real time and, as soon as bad tracks appear, raises a warning that tells the user how many there are.
The technical solution adopted by the invention is:
A method for automatically detecting bad track information of disks in a RAID under Linux. The method is based on a shell script that calls the Linux RAID tools of the various RAID card vendors (for example, LSI's storcli and PMC's arcconf). At regular intervals (for example, every hour) it checks the track information of all disks in the server and reports any bad track information.
The method saves the bad track information of the disks to a log by means of the shell script.
When the bad track information of a disk reaches the warning line, the method blinks the disk's locate light to prompt the user to check the disk information and replace the disk in time.
The script runs as follows:
1) Display the current disk list information;
2) Get the total number of disks;
3) Iterate over all disks and check their temperature; if the temperature exceeds the limit, raise an alarm and light the indicator;
4) Iterate over all disks and check their bad track information; if the bad track information exceeds the limit, raise an alarm and light the indicator.
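Steps 1) and 2) amount to listing the drives and counting the rows that describe a disk. The following is a minimal sketch, assuming the listing prints one row per drive and that each such row contains the word HDD, as the `grep` in the main code suggests. The helper name `count_hdd_rows` and the sample rows in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a drive listing on stdin and print how many
# lines mention HDD (case-insensitive). Prints 0 when nothing matches.
count_hdd_rows() {
    grep -ci 'HDD'
}
```

In use, the listing would be piped in, for example `raid_check /c0/eall/sall show | count_hdd_rows`. Note that drives of another media type would not be counted by this filter.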
The whole process mainly uses a shell script to detect bad track information, save it to a log, and flash the locate light as a synchronized warning. The method is convenient to use and simple to operate. It finds bad track information promptly without disturbing normal reads and writes, which protects the data and keeps the service running normally.
The beneficial effects of the invention are:
The method uses the interfaces provided by the RAID card vendors, together with the monitoring service for I/O handling in the OS, to monitor the bad track information of disks in the RAID in real time. Hidden disk problems are found promptly, data loss and service interruption are avoided, and the stability of the server and the security of the data are greatly improved.
Brief Description of the Drawings
Fig. 1 is a flow chart of the script of the invention.
Detailed Description
The invention is further described below through specific embodiments, with reference to the accompanying drawing:
Embodiment 1:
A method for automatically detecting bad track information of disks in a RAID under Linux. The method is based on a shell script that calls the Linux RAID tools of the various RAID card vendors (for example, LSI's storcli and PMC's arcconf). At regular intervals (for example, every hour) it checks the track information of all disks in the server and reports any bad track information.
Embodiment 2
On the basis of Embodiment 1, the method of this embodiment saves the bad track information of the disks to a log by means of the shell script.
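The logging step can be sketched as a small append-only helper. This is an illustration under stated assumptions, not the script of the invention: the log path `/var/log/disk_check.log`, the variable `DISK_LOG`, the function name `log_disk_status`, and the record layout are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical log location; export DISK_LOG beforehand to override it.
DISK_LOG="${DISK_LOG:-/var/log/disk_check.log}"

# Append one timestamped record for a disk slot to the log file.
# Arguments: slot number, error count reported for that slot.
log_disk_status() {
    local slot="$1" errors="$2"
    printf '%s slot=%s errors=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$slot" "$errors" >> "$DISK_LOG"
}
```

Appending with `>>` keeps earlier rounds in the file, so the log can be followed with `tail -f` while the checks continue.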
Embodiment 3
On the basis of Embodiment 2, in this embodiment, when the bad track information of a disk reaches the warning line, the method blinks the disk's locate light to prompt the user to check the disk information and replace the disk in time.
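The warning-line check can be sketched as a function that compares a count with a limit and, if the limit is exceeded, prints a warning and starts the locate light. The `start locate` call is the one used in the main code. The sketch assumes `raid_check` is available as a command or shell function wrapping the vendor tool; the names `BAD_TRACK_LIMIT` and `alert_if_over_limit` are hypothetical, and the default limit of 0 mirrors the comparison in the main code rather than a value stated in the text.

```shell
#!/usr/bin/env bash
# Hypothetical warning line; the default of 0 mirrors the main code's check.
BAD_TRACK_LIMIT="${BAD_TRACK_LIMIT:-0}"

# Warn and start the locate light when the count exceeds the limit.
# Arguments: slot number, bad track count.
# Returns 0 if an alert was raised, 1 otherwise.
alert_if_over_limit() {
    local slot="$1" count="$2"
    if [ "$count" -gt "$BAD_TRACK_LIMIT" ]; then
        echo "WARNING: disk in slot $slot reports $count bad tracks"
        raid_check "/c0/eall/s$slot" start locate
        return 0
    fi
    return 1
}
```

Returning a status lets the caller decide what else to do on an alert, for example writing an extra log record.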
Embodiment 4
On the basis of Embodiment 3, the script of this embodiment mainly performs the following functions:
It automatically detects the bad track information of the disks, saves the information to the log as it goes, and repeats the check every hour. When the number of bad tracks reaches the warning level, the script automatically lights the disk's locate light to tell the user that the number of bad tracks on that disk has reached the warning level.
As shown in Fig. 1, execute the script /disk.sh:
1) Check the disks' bad track information once every hour;
2) When the number of bad tracks exceeds the threshold, immediately light the disk's locate light, display an alarm in the system, and go to step 4);
3) When the number of bad tracks does not exceed the threshold, go directly to step 4);
4) Continue checking, save the results as they are produced, and monitor the log in real time;
5) Execution is complete.
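The hourly repetition in step 1) can be sketched as a loop that runs a command and then sleeps. This is one possible arrangement, not necessarily the one used in the invention; a cron entry would be an alternative. The function name `run_periodically` and its round-count argument, added so the loop can be bounded for a dry run, are hypothetical.

```shell
#!/usr/bin/env bash
# Run a command repeatedly with a fixed pause between runs.
# Arguments: interval in seconds, number of rounds (0 = run forever), command...
run_periodically() {
    local interval="$1" rounds="$2" done_rounds=0
    shift 2
    while [ "$rounds" -eq 0 ] || [ "$done_rounds" -lt "$rounds" ]; do
        "$@"
        done_rounds=$((done_rounds + 1))
        # Skip the pause once the last requested round has finished.
        if [ "$rounds" -eq 0 ] || [ "$done_rounds" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
}
```

For the one-hour interval described above, the call would be `run_periodically 3600 0 /disk.sh`. A failing round does not stop the loop, so a transient tool error only costs one check.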
The main code is as follows:
# Display the current disk list information
raid_check /c0/eall/sall show
# Get the total number of disks
sas_number=$(raid_check /c0/eall/sall show | grep -i HDD | wc -l)
for ((i = 0; i < sas_number; i++))
do
    # Check the temperature of each disk; if it exceeds the limit, raise an alarm and light the indicator.
    # Only the first integer on the temperature line is kept (assumes a line such as "Drive Temperature = 30C").
    temperature=$(raid_check "/c0/eall/s$i" show all | grep -i temperature | grep -o '[0-9]\+' | head -n 1)
    if [ "${temperature:-0}" -gt 64 ]
    then
        echo "the SAS $i temperature is too hot!!!!!"
        raid_check "/c0/eall/s$i" start locate
    fi
    # Check the bad track information of each disk; if it exceeds the limit, raise an alarm and light the indicator.
    # Only the first integer on the predictive-failure line is kept.
    sas_error=$(raid_check "/c0/eall/s$i" show all | grep -i predictive | grep -o '[0-9]\+' | head -n 1)
    if [ "${sas_error:-0}" -gt 0 ]
    then
        echo "the SAS $i has $sas_error error!!!"
        raid_check "/c0/eall/s$i" start locate
    fi
done
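Both checks in the listing depend on pulling a number out of a line of tool output before comparing it with a limit. The following is a minimal sketch of that extraction. The helper name `first_number_for` is hypothetical, and the sample text is an illustrative assumption about what the tool prints, not captured output; real output depends on the tool and firmware version.

```shell
#!/usr/bin/env bash
# Read tool output on stdin and print the first integer found on the first
# line that matches the given keyword (case-insensitive). Prints nothing
# when no line matches.
first_number_for() {
    local keyword="$1"
    grep -i -- "$keyword" | head -n 1 | grep -o '[0-9][0-9]*' | head -n 1
}

# Illustrative sample only (assumed format).
sample='Drive Temperature =  30C (86.00 F)
Predictive Failure Count = 2'

temp="$(printf '%s\n' "$sample" | first_number_for temperature)"
errs="$(printf '%s\n' "$sample" | first_number_for predictive)"
echo "temperature=$temp errors=$errs"
```

With a clean integer in hand, the comparison can use `-gt` inside `[ ]`. A bare `>` there would be parsed by the shell as output redirection rather than as a comparison.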
The above embodiments are intended only to illustrate the invention, not to limit it. A person of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the invention. All equivalent technical solutions therefore also fall within the scope of the invention, and the scope of patent protection of the invention shall be defined by the claims.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610333684.6A | 2016-05-19 | 2016-05-19 | Method for automatically detecting bad track information of raid disk under linux |
| Publication Number | Publication Date |
|---|---|
| CN106021065A | 2016-10-12 |
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2016-10-12 |