An in-memory database (IMDB, or main memory database system (MMDB) or memory resident database) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. In-memory databases are faster than disk-optimized databases because disk access is slower than memory access and the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.[1][2]
Applications where response time is critical, such as those running telecommunications network equipment and mobile advertising networks, often use main-memory databases.[3] IMDBs have gained much traction, especially in the data analytics space, starting in the mid-2000s – mainly due to multi-core processors that can address large memory and due to less expensive RAM.[4][5]
A potential technical hurdle with in-memory data storage is the volatility of RAM. Specifically, in the event of a power loss, intentional or otherwise, data stored in volatile RAM is lost.[6] With the introduction of non-volatile random-access memory technology, in-memory databases will be able to run at full speed and maintain data in the event of power failure.[7][8][9]
In its simplest form, main memory databases store data on volatile memory devices. These devices lose all stored information when the device loses power or is reset. In this case, IMDBs can be said to lack support for the "durability" portion of the ACID (atomicity, consistency, isolation, durability) properties. Volatile memory-based IMDBs can, and often do, support the other three ACID properties of atomicity, consistency and isolation.
Many IMDBs have added durability via the following mechanisms:
Some IMDBs allow the database schema to specify different durability requirements for selected areas of the database – thus, faster-changing data that can easily be regenerated or that has no meaning after a system shut-down would not need to be journaled for durability (though it would have to be replicated for high availability), whereas configuration information would be flagged as needing preservation.
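The idea of per-area durability requirements can be illustrated with a minimal Python sketch. It is not modelled on any particular product: the class `TinyStore`, its `durable_tables` parameter, and the JSON-lines journal format are all hypothetical names and choices made for illustration. Writes to tables flagged as durable are appended to a journal file before being applied in memory, while writes to other tables touch memory only and are lost on restart.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory store; only tables listed in durable_tables are journaled."""

    def __init__(self, journal_path, durable_tables):
        self.journal_path = journal_path
        self.durable_tables = set(durable_tables)
        self.tables = {}  # table name -> {key: value}, held in memory only
        self._replay()    # rebuild durable tables from the journal, if any
        self._journal = open(journal_path, "a", encoding="utf-8")

    def _replay(self):
        # Re-apply every journaled write in order (no torn-write handling here).
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                self.tables.setdefault(rec["table"], {})[rec["key"]] = rec["value"]

    def put(self, table, key, value):
        if table in self.durable_tables:
            # Journal first and force it to stable storage, then update memory.
            rec = {"table": table, "key": key, "value": value}
            self._journal.write(json.dumps(rec) + "\n")
            self._journal.flush()
            os.fsync(self._journal.fileno())
        self.tables.setdefault(table, {})[key] = value

    def get(self, table, key):
        return self.tables.get(table, {}).get(key)

    def close(self):
        self._journal.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    store = TinyStore(path, durable_tables={"config"})
    store.put("config", "max_sessions", 100)     # flagged: journaled
    store.put("session_counters", "active", 42)  # not flagged: memory only
    store.close()  # simulate a shutdown; the in-memory tables are discarded

    restarted = TinyStore(path, durable_tables={"config"})
    result = (restarted.get("config", "max_sessions"),
              restarted.get("session_counters", "active"))
    restarted.close()
    return result
```

After the simulated restart, the configuration value is recovered from the journal, whereas the session counter is gone. The unjournaled write also skips the `fsync` call, which is where the speed advantage for fast-changing data would come from in this sketch.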
While storing data in-memory confers performance advantages, it is an expensive method of data storage. An approach to realising the benefits of in-memory storage while limiting its costs is to store the most frequently accessed data in-memory and the rest on disk. Since there is no hard distinction between which data should be stored in-memory and which should be stored on disk, some systems dynamically update where data is stored based on the data's usage.[10] This approach is subtly different from caching, in which the most recently accessed data is cached, as opposed to the most frequently accessed data being stored in-memory.
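The difference between frequency-based placement and recency-based caching can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not how any specific system decides placement: the class `FrequencyTieredStore` and its `memory_capacity` parameter are hypothetical, and a plain dictionary stands in for the on-disk tier.

```python
from collections import Counter


class FrequencyTieredStore:
    """Keeps the most frequently read keys in memory and the rest on 'disk'."""

    def __init__(self, memory_capacity):
        self.memory_capacity = memory_capacity
        self.memory = {}       # fast tier
        self.disk = {}         # stand-in for slower on-disk storage
        self.hits = Counter()  # read count per key

    def put(self, key, value):
        if key in self.memory:
            self.memory[key] = value
        else:
            self.disk[key] = value
            self._maybe_promote(key)

    def get(self, key):
        if key in self.memory:
            self.hits[key] += 1
            return self.memory[key]
        value = self.disk[key]  # raises KeyError for unknown keys
        self.hits[key] += 1
        self._maybe_promote(key)
        return value

    def _maybe_promote(self, key):
        # Move a disk-resident key into memory if there is room, or if it has
        # been read more often than the least-read key currently in memory.
        if self.memory_capacity <= 0:
            return
        if len(self.memory) < self.memory_capacity:
            self.memory[key] = self.disk.pop(key)
            return
        coldest = min(self.memory, key=lambda k: self.hits[k])
        if self.hits[key] > self.hits[coldest]:
            self.disk[coldest] = self.memory.pop(coldest)
            self.memory[key] = self.disk.pop(key)

    def tier_of(self, key):
        return "memory" if key in self.memory else "disk"


def demo():
    store = FrequencyTieredStore(memory_capacity=1)
    store.put("a", 1)
    store.put("b", 2)
    for _ in range(3):
        store.get("a")
    store.get("b")  # most recent access, but still read less often than "a"
    after_one_read = (store.tier_of("a"), store.tier_of("b"))
    for _ in range(3):
        store.get("b")  # "b" now has 4 reads against 3 for "a"
    after_four_reads = (store.tier_of("a"), store.tier_of("b"))
    return after_one_read, after_four_reads
```

In the demo, a single read of `"b"` does not displace `"a"`, even though `"b"` is then the most recently accessed key; a recency-based (LRU) cache of size one would have swapped them at that point. Only once `"b"` has been read more often than `"a"` does it move into the memory tier.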
The flexibility of hybrid approaches allows a balance to be struck between:
In the cloud computing industry, the terms "data temperature", or "hot data" and "cold data", have emerged to describe how data is stored in this respect.[11] Hot data is used to describe mission-critical data that needs to be accessed frequently, while cold data describes data that is needed less often and less urgently, such as data kept for archiving or auditing purposes. Hot data should be stored in ways offering fast retrieval and modification, often but not always accomplished by in-memory storage. Cold data, on the other hand, can be stored in a more cost-effective way, and it is accepted that data access will likely be slower compared to hot data. While these descriptions are useful, "hot" and "cold" lack concrete definitions.[11]
Manufacturing efficiency provides another reason for selecting a combined in-memory/on-disk database system. Some device product lines, especially in consumer electronics, include some units with permanent storage, and others that rely on memory for storage (set-top boxes, for example). If such devices require a database system, a manufacturer can adopt a hybrid database system at lower and upper cost, and with less customization of code, rather than using separate in-memory and on-disk databases, respectively, for its disk-less and disk-based products.
The first database engine to support both in-memory and on-disk tables in a single database, WebDNA, was released in 1995.
Another variation involves large amounts of nonvolatile memory in the server, for example, flash memory chips as addressable memory rather than structured as disk arrays. A database in this form of memory combines very fast access speed with persistence over reboots and power losses.[12]