In computer main memory, auxiliary storage and computer buses, data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can simply be a complete copy of the actual data (a type of repetition code), or only select pieces of data that allow detection of errors and reconstruction of lost or damaged data up to a certain level.
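As a minimal illustration of a repetition code, the following Python sketch (the function names are illustrative and not drawn from any particular library) stores three copies of every bit and recovers the original value by majority vote, so a single corrupted copy in any group is tolerated:

```python
def encode(bits):
    """Repetition code: store three copies of every bit."""
    return [b for b in bits for _ in range(3)]

def decode(coded):
    """Majority vote over each group of three copies; tolerates
    one corrupted copy per group."""
    out = []
    for i in range(0, len(coded), 3):
        group = coded[i:i + 3]
        out.append(1 if sum(group) >= 2 else 0)
    return out

data = [1, 0, 1, 1]
stored = encode(data)          # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
stored[4] = 1                  # simulate a single-bit error in one copy
assert decode(stored) == data  # the vote still recovers the original
```

Tripling every bit is the simplest possible redundancy scheme; codes used in practice, such as Hamming codes in ECC memory, achieve the same single-error correction with far fewer additional bits.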
For example, by including computed check bits, ECC memory is capable of detecting and correcting single-bit errors within each memory word, while RAID 1 combines two hard disk drives (HDDs) into a logical storage unit that allows stored data to survive a complete failure of one drive.[1][2] Data redundancy can also be used as a measure against silent data corruption; for example, file systems such as Btrfs and ZFS use data and metadata checksumming in combination with copies of stored data to detect silent data corruption and repair its effects.[3]
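The checksum-plus-copy approach can be sketched in a few lines of Python; the structure below is illustrative only and does not reflect the actual on-disk formats of Btrfs or ZFS. Each block is stored as two copies together with a checksum of the content, so a read can detect a silently corrupted copy and repair it from the intact one:

```python
import hashlib

def checksum(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

def write_block(data: bytes):
    """Store two copies of the data plus a checksum of the content."""
    return {"copies": [bytearray(data), bytearray(data)],
            "sum": checksum(data)}

def read_block(stored) -> bytes:
    """Verify copies against the checksum; repair a silently
    corrupted copy from an intact one ("self-healing")."""
    good = None
    for copy in stored["copies"]:
        if checksum(bytes(copy)) == stored["sum"]:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("both copies corrupted; data lost")
    for copy in stored["copies"]:
        if checksum(bytes(copy)) != stored["sum"]:
            copy[:] = good        # rewrite the bad copy from the good one
    return good

blk = write_block(b"important payload")
blk["copies"][0][0] ^= 0xFF       # simulate silent corruption of one copy
assert read_block(blk) == b"important payload"
```

The checksum alone only detects corruption; it is the redundant copy that makes repair possible, which is why these file systems pair checksumming with mirrored or otherwise replicated data.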
While different in nature, data redundancy also occurs in database systems that have values repeated unnecessarily in one or more records or fields, within a table, or where the field is replicated/repeated in two or more tables. Often this is found in unnormalized database designs; it complicates database management, introduces the risk of corrupting the data, and increases the required amount of storage. When done on purpose from a previously normalized database schema, it may be considered a form of database denormalization, used to improve the performance of database queries (shortening the database response time).
For instance, when customer data are duplicated and attached to each product bought, redundancy of data is a known source of inconsistency, since a given customer might appear with different values for one or more of their attributes.[4] Data redundancy leads to data anomalies and corruption and generally should be avoided by design;[5] applying database normalization prevents redundancy and makes the best possible use of storage.[6]
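The following Python sketch illustrates the resulting update anomaly with invented records and field names. In the denormalized layout the customer's address is repeated in every order, so updating one record but not the others leaves conflicting values; in the normalized layout each attribute is stored exactly once and referenced by key:

```python
# Denormalized: customer attributes repeated in every order record.
orders = [
    {"order_id": 1, "customer": "Alice", "address": "1 Oak St", "product": "Lamp"},
    {"order_id": 2, "customer": "Alice", "address": "1 Oak St", "product": "Desk"},
]

# Update anomaly: changing the address on one record but not the other
# leaves two conflicting values for the same customer.
orders[0]["address"] = "9 Elm Ave"
assert orders[0]["address"] != orders[1]["address"]   # now inconsistent

# Normalized: each customer attribute is stored exactly once and
# referenced by key, so a single write updates every order's view.
customers = {"alice": {"name": "Alice", "address": "1 Oak St"}}
orders_n = [
    {"order_id": 1, "customer_id": "alice", "product": "Lamp"},
    {"order_id": 2, "customer_id": "alice", "product": "Desk"},
]
customers["alice"]["address"] = "9 Elm Ave"  # one update, no inconsistency
```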