Data scrubbing is an error correction technique that uses a background task to periodically inspect main memory or storage for errors, then corrects detected errors using redundant data in the form of different checksums or copies of data. Data scrubbing reduces the likelihood that single correctable errors will accumulate, leading to reduced risks of uncorrectable errors.
Data integrity is a high-priority concern in writing, reading, storage, transmission, or processing of data in computer operating systems and in computer storage and data transmission systems. However, only a few of the file systems currently in use provide sufficient protection against data corruption.[1][2][3]
To address this issue, data scrubbing routinely checks data for inconsistencies and, more generally, helps prevent hardware or software failure. This "scrubbing" feature is common in memory, disk arrays, file systems, and FPGAs as a mechanism of error detection and correction.[4][5][6]
With data scrubbing, a RAID controller may periodically read all hard disk drives in a RAID array and check for defective blocks before applications access them. This reduces the probability of silent data corruption and data loss due to bit-level errors.[7]
In Dell PowerEdge RAID environments, a feature called "patrol read" can perform data scrubbing and preventive maintenance.[8]
In OpenBSD, the bioctl(8) utility allows the system administrator to control these patrol reads through the BIOCPATROL ioctl on the /dev/bio pseudo-device; as of 2019, this functionality is supported in some device drivers for LSI Logic and Dell controllers, including mfi(4) since OpenBSD 5.8 (2015) and mfii(4) since OpenBSD 6.4 (2018).[9][10]
In FreeBSD and DragonFly BSD, patrol can be controlled through a RAID controller-specific utility, mfiutil(8), since FreeBSD 8.0 (2009) and 7.3 (2010).[11] The implementation from FreeBSD was used by the OpenBSD developers for adding patrol support to their generic bio(4) framework and the bioctl utility, without a need for a separate controller-specific utility.
In NetBSD in 2008, the bio(4) framework from OpenBSD was extended to support consistency checks, implemented for the /dev/bio pseudo-device under the BIOCSETSTATE ioctl command, with the options being start and stop (BIOC_SSCHECKSTART_VOL and BIOC_SSCHECKSTOP_VOL, respectively); as of 2019, this is supported by only a single driver, arcmsr(4).[12]
Linux MD RAID, as a software RAID implementation, makes data consistency checks available and provides automated repairing of detected data inconsistencies. Such procedures are usually performed by setting up a weekly cron job. Maintenance is performed by issuing the operations check, repair, or idle to each of the examined MD devices. Statuses of all performed operations, as well as general RAID statuses, are always available.[13][14][15]
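On Linux, these operations are typically issued by writing the operation name to the array's sync_action file in sysfs, and the number of inconsistencies found is exposed in mismatch_cnt. The following is a minimal Python sketch under that assumption; the helper names and the sysfs_root parameter are hypothetical, introduced only so the logic can be exercised against a scratch directory rather than a live array.

```python
from pathlib import Path

# Operations named in the text above.
VALID_ACTIONS = {"check", "repair", "idle"}


def md_dir(device, sysfs_root="/sys/block"):
    """Return the md attribute directory for a device such as 'md0'."""
    return Path(sysfs_root) / device / "md"


def request_action(device, action, sysfs_root="/sys/block"):
    """Ask the md driver to start a check or repair, or to go idle.

    Writing to the real sysfs file normally requires root privileges.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    (md_dir(device, sysfs_root) / "sync_action").write_text(action + "\n")


def read_status(device, sysfs_root="/sys/block"):
    """Read the current operation and the mismatch counter."""
    d = md_dir(device, sysfs_root)
    return {
        "sync_action": (d / "sync_action").read_text().strip(),
        "mismatch_cnt": int((d / "mismatch_cnt").read_text().strip()),
    }
```

A weekly cron job could call request_action("md0", "check") for each array and later inspect read_status("md0") to decide whether a repair pass is warranted.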
As a copy-on-write (CoW) file system for Linux, Btrfs provides fault isolation, corruption detection and correction, and file-system scrubbing. If the file system detects a checksum mismatch while reading a block, it first tries to obtain (or create) a good copy of this block from another device, provided its internal mirroring or RAID techniques are in use.[16]
Btrfs can initiate an online check of the entire file system by triggering a file system scrub job that is performed in the background. The scrub job scans the entire file system for integrity and automatically attempts to report and repair any bad blocks it finds along the way.[17][18]
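The checksum-and-mirror idea behind such a scrub can be illustrated with a small Python model. This is a sketch of the general technique, not Btrfs's implementation: the scrub function, the list-of-blocks representation of a device, and the use of zlib.crc32 as the checksum are all assumptions made for illustration.

```python
import zlib


def checksum(block):
    """Stand-in checksum (plain CRC-32); real file systems may use others."""
    return zlib.crc32(block)


def scrub(mirrors, checksums):
    """Verify every block on every mirror and repair from a good copy.

    mirrors   -- list of devices, each a list of bytes blocks
    checksums -- expected checksum for each block index
    Returns a report of repaired (device, block) pairs and of block
    indices for which no mirror held a valid copy.
    """
    report = {"repaired": [], "unrecoverable": []}
    for idx, expected in enumerate(checksums):
        # Find any mirror whose copy still matches the stored checksum.
        good = next(
            (m[idx] for m in mirrors if checksum(m[idx]) == expected), None
        )
        if good is None:
            report["unrecoverable"].append(idx)
            continue
        # Overwrite every bad copy with the verified one.
        for dev, m in enumerate(mirrors):
            if checksum(m[idx]) != expected:
                m[idx] = bytes(good)
                report["repaired"].append((dev, idx))
    return report
```

In this model a block is lost only when every mirror holds a corrupted copy, which is why scrubbing regularly, before errors accumulate on more than one device, matters.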
ReFS features automatic data scrubbing. Files that should not be scrubbed can be marked with the FILE_ATTRIBUTE_NO_SCRUB_DATA flag.[19]
The features of ZFS, which is a combined file system and logical volume manager, include verification against data corruption modes, continuous integrity checking, and automatic repair. Sun Microsystems designed ZFS from the ground up with a focus on data integrity and to protect the data on disks against issues such as disk firmware bugs and phantom writes (a write that never actually makes it to disk).[20]
ZFS provides a repair utility called scrub that examines and repairs silent data corruption caused by data rot and other problems.
Due to the high integration density of contemporary computer memory chips, the individual memory cell structures became small enough to be vulnerable to cosmic rays and/or alpha particle emission. The errors caused by these phenomena are called soft errors. This can be a problem for DRAM- and SRAM-based memories.
Memory scrubbing performs error detection and correction of bit errors in computer RAM by using ECC memory, other copies of the data, or other error-correction codes.
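As an illustration of how a scrubber can use an error-correcting code, the Python sketch below stores each 4-bit value as a Hamming(7,4) codeword and walks memory correcting single-bit flips. The code choice is an assumption made for brevity: real ECC memory generally uses wider codes, and all function names here are hypothetical.

```python
def encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming codeword (list of bits).

    Positions are numbered 1..7; parity bits sit at positions 1, 2 and 4.
    """
    code = [0] * 8  # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def syndrome(word):
    """XOR of the positions of all set bits; 0 means no error detected."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s


def decode(word):
    """Extract the 4 data bits from positions 3, 5, 6 and 7."""
    return word[2] | (word[4] << 1) | (word[5] << 2) | (word[6] << 3)


def scrub(memory):
    """Walk all words, flipping back any single erroneous bit in place.

    Returns the number of words corrected. Two flips in one word would be
    miscorrected by this code, which is why scrubbing must run before
    errors accumulate.
    """
    corrected = 0
    for word in memory:
        s = syndrome(word)
        if s:
            word[s - 1] ^= 1
            corrected += 1
    return corrected
```

For a single flipped bit, the syndrome equals the position of that bit, so the scrubber needs no second copy of the data to repair the word.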
Scrubbing is a technique used to reprogram an FPGA. It can be used periodically to avoid the accumulation of errors without the need to find one in the configuration bitstream, thus simplifying the design.
Numerous approaches can be taken with respect to scrubbing, from simply reprogramming the FPGA to partial reconfiguration. The simplest method of scrubbing is to completely reprogram the FPGA at some periodic rate (typically 1/10 the calculated upset rate). However, the FPGA is not operational during that reprogram time, which is on the order of microseconds to milliseconds. For situations that cannot tolerate that type of interruption, partial reconfiguration is available. This technique allows the FPGA to be reprogrammed while still operational.[21]
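The difference between the two approaches can be modeled in a few lines of Python. This is a conceptual sketch only: configuration memory is represented as a list of frames, the "golden" reference image and both function names are hypothetical, and comparing frames against the reference before rewriting is one assumed way of using partial reconfiguration, not a description of any vendor's mechanism.

```python
def full_reprogram(config_memory, golden_frames):
    """Rewrite every frame from the reference image without looking for
    errors. Simple, but the device is out of service while this runs."""
    for i, frame in enumerate(golden_frames):
        config_memory[i] = frame


def partial_scrub(config_memory, golden_frames):
    """Rewrite only the frames that differ from the reference image,
    leaving the rest of the configuration untouched.

    Returns the indices of the frames that were rewritten.
    """
    rewritten = []
    for i, frame in enumerate(golden_frames):
        if config_memory[i] != frame:
            config_memory[i] = frame
            rewritten.append(i)
    return rewritten
```

The first function needs no error detection logic at all, which is the design simplification noted above; the second touches fewer frames at the cost of a comparison step.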
The Patrol Read feature is designed as a preventative measure to ensure physical disk health and data integrity. Patrol Read scans for and resolves potential problems on configured physical disks.