In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered (or edited) in only one place, providing data normalization to a canonical form (for example, in database normalization or content transclusion).[1]
There are several scenarios with respect to copies and updates:
The advantages of SSOT architectures include easier prevention of mistaken inconsistencies (such as a duplicate value or copy somewhere being forgotten) and greatly simplified version control. Without an SSOT, dealing with inconsistencies implies either complex and error-prone consensus algorithms or a simpler architecture that is liable to lose data in the face of inconsistency. The latter may seem unacceptable, but it is sometimes a very good choice; it is how most blockchains operate: a transaction is actually final only if it was included in the next block that is mined.
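The "forgotten duplicate" problem can be illustrated with a minimal Python sketch. The `Customer` and `Order` classes, the `customers` dictionary, and the addresses are hypothetical names invented for this illustration; the only point is that an order stores a reference to the mastered record rather than its own copy of the address.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    shipping_address: str


# Assumption for this sketch: this dictionary is the one place where
# customer records are mastered (edited).
customers = {1: Customer(1, "1 Old Road")}


@dataclass
class Order:
    order_id: int
    customer_id: int  # a reference to the master record, not a copy of it

    def shipping_address(self) -> str:
        # Resolved from the master record at read time, so no stale copy exists.
        return customers[self.customer_id].shipping_address


orders = [Order(100, 1), Order(101, 1)]

# A single edit in the master record is seen by every order that refers to it.
customers[1].shipping_address = "2 New Street"
addresses = [order.shipping_address() for order in orders]
```

Had each `Order` stored its own address string, the update would have needed to be repeated for every order, and any order that was missed would silently keep the outdated value.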
Ideally, SSOT systems provide data that are authentic (and authenticatable), relevant, and referable.[2]
Deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements (a direct consequence of intentional or unintentional denormalization of any explicit data model) pose a risk for retrieval of outdated, and therefore incorrect, information. Common examples (i.e., example classes of implementation) are as follows:
An acknowledged prerequisite for any given single source of truth to exist is the ontologic condition that no more than a single truth (about any particular fact or idea) exists, an assertion that is ontologic in both the IT sense and the general sense of that word. In many instances, this presents no problem (for example, within particular namespaces, or even across them, as long as naming collisions or broader name conflicts are adequately handled). The broadest contexts (and thus the thorniest, regarding ontologic discrepancies) require adequate epistemic regime comparison and reconciliation (or at least negotiation or transactional exchanges). An archetypal example of this class of reconciliation is that two theological seminary libraries, from two different religions (X and Y), could exchange information with an SSOT architecture, but the unification of truth would reside on the level of the statement that "religion X asserts that God is purple whereas religion Y asserts that God is green", rather than on the level of "God is purple" or "God is green".
An ideal implementation of SSOT is rarely possible in most enterprises. This is because many organisations have multiple information systems, each of which needs access to data relating to the same entities (e.g., customer). Often these systems are purchased as commercial off-the-shelf products from vendors and cannot be modified in trivial ways. Each of these systems therefore needs to store its own copy of common data or entities (hence immediately violating the SSOT approach defined above). For example, an enterprise resource planning (ERP) system (such as SAP or Oracle e-Business Suite) may store a customer record; the customer relationship management (CRM) system also needs a copy of the customer record (or part of it), and the warehouse dispatch system might also need a copy of some or all of the customer data (e.g., shipping address). In cases where vendors do not support such modifications, it is not always possible to replace these records with pointers to the SSOT.
For organisations (with more than one information system) wishing to implement a Single Source of Truth (without modifying all but one master system to store pointers to other systems for all entities), some supporting architectures are:
A master data management (MDM) system typically serves as the source of truth for an organization's metadata, helping to ensure accuracy and consistency throughout that organization's multiple data sources.[4] Typically the MDM acts as a hub for multiple systems, many of which could allow (be the source of truth for) updates to different aspects of information on a given entity. For example, the CRM system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may (for example) also update their address via a customer service web site, with a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative (the golden record), and then syndicates this updated data to all subscribing systems. The MDM application normally requires an enterprise service bus (ESB) to syndicate its data to multiple subscribing systems.[5]
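The brokering role described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MdmHub` class, the `AUTHORITY` rule table, and the system names `crm` and `web_portal` are hypothetical, and plain in-process callbacks stand in for the ESB. Real MDM products use far richer matching and survivorship rules.

```python
from typing import Callable, Dict, List

# Hypothetical rule table: which source system is authoritative per attribute.
AUTHORITY = {"name": "crm", "address": "web_portal"}

Record = Dict[str, str]
Subscriber = Callable[[str, Record], None]


class MdmHub:
    """Toy hub that keeps golden records and syndicates accepted updates."""

    def __init__(self) -> None:
        self.golden: Dict[str, Record] = {}
        self.subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self.subscribers.append(callback)

    def submit(self, source: str, entity_id: str, attribute: str, value: str) -> bool:
        # Broker step: accept only updates from the authoritative source.
        if AUTHORITY.get(attribute) != source:
            return False
        record = self.golden.setdefault(entity_id, {})
        record[attribute] = value
        # Syndication step: push a copy of the golden record to each subscriber.
        for notify in self.subscribers:
            notify(entity_id, dict(record))
        return True


hub = MdmHub()
warehouse_copy: Dict[str, Record] = {}


def update_warehouse(entity_id: str, record: Record) -> None:
    warehouse_copy[entity_id] = record


hub.subscribe(update_warehouse)
accepted = hub.submit("web_portal", "c1", "address", "2 New Street")
rejected = hub.submit("crm", "c1", "address", "9 Wrong Lane")
```

In this sketch the subscribing system still holds a copy of the data, but that copy is only ever written by the hub, so the golden record remains the single place where the value is decided.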
In event-oriented architectures, it has become increasingly common to find an implementation of the Event Sourcing pattern, which stores the system state as an ordered sequence of state changes.[6] This requires an Event Store, a particular type of database designed to hold all the events that change the state of the system. The event store in an Event Sourcing + Command Query Responsibility Segregation + Domain Driven Design + Messaging architecture is in fact a "single source of truth", with the additional advantage that it can also act as an Enterprise Service Bus, since other components can listen directly to the event store for state changes as every event passes through it. In addition, by saving all the events, it also plays the role of a Data Warehouse. One last advantage is that this system makes it possible to implement the Shared Database pattern, another technique, not otherwise covered here, for obtaining a single source of truth.
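A minimal in-memory sketch of the pattern follows, assuming a toy bank-account domain. The `Event` and `EventStore` classes and the event kinds `deposited` and `withdrawn` are hypothetical names chosen for this illustration, and a production event store would be a durable database rather than a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrawn" (hypothetical event kinds)
    account: str
    amount: int


class EventStore:
    """Append-only log of events; current state is derived, never stored."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._listeners: List[Callable[[Event], None]] = []

    def subscribe(self, listener: Callable[[Event], None]) -> None:
        self._listeners.append(listener)

    def append(self, event: Event) -> None:
        self._events.append(event)
        # Listeners see every event as it is stored (the bus-like role).
        for listener in self._listeners:
            listener(event)

    def replay(self) -> Dict[str, int]:
        # Rebuild the current balances by folding over the ordered events.
        balances: Dict[str, int] = {}
        for event in self._events:
            if event.kind == "deposited":
                delta = event.amount
            elif event.kind == "withdrawn":
                delta = -event.amount
            else:
                raise ValueError(f"unknown event kind: {event.kind}")
            balances[event.account] = balances.get(event.account, 0) + delta
        return balances


store = EventStore()
seen: List[Event] = []
store.subscribe(seen.append)

store.append(Event("deposited", "acc-1", 100))
store.append(Event("withdrawn", "acc-1", 30))
balances = store.replay()
```

Because the balances are recomputed from the log, the log is the only place where state is mastered; any cached or projected view can be discarded and rebuilt from it.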
While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that such data has been combined (according to business logic embedded in the data transformation and integration processes) means that the data warehouse is often used as a de facto SSOT. Generally, however, the data available from the data warehouse are not used to update other systems; rather the DW becomes the "single source of truth" for reporting to multiple stakeholders. In this context, the data warehouse is more correctly referred to as a "single version of the truth" since other versions of the truth exist in its operational data sources (no data originates in the DW; it is simply a reporting mechanism for data loaded from operational systems).[7]
Data warehouses are solely intended to perform queries and analysis and often contain large amounts of historical data. The data within a data warehouse is usually derived from a wide range of sources such as application log files and transaction applications.