Denormalization is a strategy used on a previously normalized database to increase performance. In computing, denormalization is the process of trying to improve the read performance of a database, at the expense of losing some write performance, by adding redundant copies of data or by grouping data.[1][2] It is often motivated by performance or scalability in relational database software needing to carry out very large numbers of read operations. Denormalization differs from the unnormalized form in that denormalization benefits can only be fully realized on a data model that is otherwise normalized.
A normalized design will often "store" different but related pieces of information in separate logical tables (called relations). If these relations are stored physically as separate disk files, completing a database query that draws information from several relations (a join operation) can be slow. If many relations are joined, it may be prohibitively slow. There are two strategies for dealing with this by denormalization:
In the first approach, database administrators keep the logical design normalized but allow the database management system (DBMS) to store additional redundant information on disk to optimize query response. In this case it is the DBMS software's responsibility to ensure that any redundant copies are kept consistent. This method is often implemented in SQL as indexed views (Microsoft SQL Server) or materialized views (Oracle, PostgreSQL). Among other things, a view can present information in a format convenient for querying, and the index ensures that queries against the view are physically optimized.
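The idea of a stored, redundant copy of a view can be sketched in Python with the standard `sqlite3` module. SQLite has no materialized or indexed views, so this sketch emulates one by hand: `refresh_snapshot` copies the result of an ordinary view into an indexed table, which is a stand-in for the refresh that a DBMS with real materialized views would manage itself. The table names, column names, and helper functions are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized logical design: two relations linked by customer_id.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount_cents INTEGER NOT NULL
    );
    -- Ordinary view: the join and aggregation are re-run on every query.
    CREATE VIEW customer_totals AS
        SELECT c.id AS customer_id,
               c.name AS name,
               COALESCE(SUM(o.amount_cents), 0) AS total_cents
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
""")


def refresh_snapshot(conn):
    """Rebuild the stored copy of the view (emulates a materialized-view refresh)."""
    with conn:
        conn.execute("DROP TABLE IF EXISTS customer_totals_snapshot")
        conn.execute(
            "CREATE TABLE customer_totals_snapshot AS SELECT * FROM customer_totals"
        )
        # The index is what makes lookups against the stored copy cheap.
        conn.execute(
            "CREATE UNIQUE INDEX idx_snapshot_customer "
            "ON customer_totals_snapshot(customer_id)"
        )


def total_for(conn, customer_id):
    """Read a total from the redundant copy without touching the orders table."""
    row = conn.execute(
        "SELECT total_cents FROM customer_totals_snapshot WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO orders (customer_id, amount_cents) VALUES (?, ?)",
    [(1, 500), (1, 250), (2, 100)],
)
refresh_snapshot(conn)
```

In this emulation the snapshot goes stale as soon as a new order is inserted and stays stale until `refresh_snapshot` is called again. Keeping such a copy consistent is the work that the first approach delegates to the DBMS.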
In the second approach, a database administrator or designer denormalizes the logical data design. With care this can achieve a similar improvement in query response, but at a cost: it is now the database designer's responsibility to ensure that the denormalized database does not become inconsistent. This is done by creating rules in the database, called constraints, that specify how the redundant copies of information must be kept synchronized, which may easily make the denormalization procedure pointless. It is the increase in logical complexity of the database design and the added complexity of the additional constraints that make this approach hazardous. Moreover, constraints introduce a trade-off, speeding up reads (SELECT in SQL) while slowing down writes (INSERT, UPDATE, and DELETE). This means a denormalized database under heavy write load may offer worse performance than its functionally equivalent normalized counterpart.
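A minimal sketch of a hand-denormalized design, again using Python's standard `sqlite3` module, may make the trade-off concrete. It assumes that the synchronization rules are written as SQL triggers, which is one common way to express them and not the only one. The schema, the redundant `order_count` column, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        order_count INTEGER NOT NULL DEFAULT 0  -- redundant: derivable from orders
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    -- Synchronization rules: every insert or delete on orders now also
    -- performs an extra write on customers.
    CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
    BEGIN
        UPDATE customers SET order_count = order_count + 1
        WHERE id = NEW.customer_id;
    END;
    CREATE TRIGGER orders_after_delete AFTER DELETE ON orders
    BEGIN
        UPDATE customers SET order_count = order_count - 1
        WHERE id = OLD.customer_id;
    END;
""")


def stored_count(conn, customer_id):
    """Fast read: a single-row lookup of the redundant column, no join."""
    return conn.execute(
        "SELECT order_count FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()[0]


def derived_count(conn, customer_id):
    """Normalized read: recompute the count from the orders table."""
    return conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,), (1,)])
conn.execute("DELETE FROM orders WHERE id = 1")
```

After these statements, `stored_count(conn, 1)` and `derived_count(conn, 1)` agree. The sketch deliberately omits a trigger for updates that move an order to a different customer. Such an update would leave `order_count` wrong for both customers, which illustrates how an incomplete set of rules lets the redundant copy drift.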
A denormalized data model is not the same as a data model that has not been normalized. Denormalization should only take place after a satisfactory level of normalization has been reached and after any required constraints and/or rules have been created to deal with the inherent anomalies in the design. For example, all relations should be in third normal form, and any relations with join dependencies and multi-valued dependencies should be handled appropriately.
Examples of denormalization techniques include:
With the continued dramatic increase in storage, processing power, and bandwidth at all levels, denormalization in databases has moved from being an unusual or specialized technique to being commonplace, or even the norm. For example, one specific downside of denormalization was simply that it "uses more storage" (that is to say, literally more columns in a database). With the exception of truly enormous systems, increased storage requirements are considered a relatively small problem in the 2020s.