Movatterモバイル変換

Ordered key–value store

From Wikipedia, the free encyclopedia

(Redirected fromOrdered Key-Value Store)

Anordered key–value store (OKVS) is a type of data storage paradigm that can supportmulti-model databases. An OKVS is an ordered mapping of bytes to bytes. An OKVS will keep the key–value pairs sorted by the key lexicographic order. OKVS systems provides different set of features and performance trade-offs. Most of them are shipped as a library without network interfaces, in order to be embedded in another process. Most OKVS supportACID guarantees. Some OKVS aredistributed databases. Ordered key–value stores found their way into many modern database systems includingNewSQL database systems.

History

[edit]

The origin of ordered key–value store stems from the work ofKen Thompson ondbm in 1979. Later in 1991,Berkeley DB was released that featured aB-Tree backend that allowed the keys to stay sorted. Berkeley DB was said to be very fast and made its way into various commercial product. It was included in Python standard library until 2.7.^[1] In 2009, Tokyo Cabinet was released that was superseded byKyoto Cabinet that support both transaction and ordered keys. In 2011,LMDB was created to replace Berkeley DB inOpenLDAP. There is also Google'sLevelDB that was forked by Facebook in 2012 asRocksDB. In 2014,WiredTiger, successor of Berkeley DB was acquired byMongoDB and is since 2019 the primary backend of MongoDB database.

Other notable implementation of the OKVS paradigm are Sophia^[2] and SQLite3 LSM extension. Another notable use of OKVS paradigm is the multi-model database system calledArangoDB^[3] based on RocksDB.

SomeNewSQL databases are supported by ordered key–value stores.JanusGraph, a property graph database, has both a Berkeley DB backend andFoundationDB backend.

Key concepts

[edit]

Lexicographic encoding

[edit]

There are algorithms that encode basic data types (boolean, string, number) and composition of those data types inside sorted containers (tuple, list, vector) that preserve their natural ordering. It is possible to work with an ordered key–value store without having to work directly with bytes. In FoundationDB, it is called the tuple layer.^[4]

Range query

[edit]

Inside an OKVS, keys are ordered, and because of that it is possible to do range queries. A range query retrieves all keys between two specified keys, ensuring that the fetched keys are returned in a sorted order.

Subspaces

[edit]

Key composition

[edit]

One can construct key spaces to build higher level abstractions. The idea is to construct keys, that takes advantage of the ordered nature of the top level key space. When taking advantage of the ordered nature of the key space, one can query ranges of keys that have particular pattern.

Denormalization

[edit]

Denormalization, as in, repeating the same piece of data in multiple subspace is common practice. It allows to create secondary representation, also called indices, that will allow to speed up queries.

Higher level abstractions

[edit]

The following abstraction or databases were built on top ordered key–value stores:

Timeseries database,
Record Database,^[5] also known as Row store databases, they behave similarly to what is dubbedRDBMS,
Tuple Stores, also known asTriple Store or Quad Store but also Generic Tuple Store,^[6]^[7]
Document database,^[8] that mimics MongoDB API,
Full-text search^[9]
Geographic Information Systems^[10]
Property Graph^[11]
Versioned Data^[12]
Vector space database for Approximate Nearest Neighbor^[13]

All those abstraction can co-exist with the same OKVS database and when ACID is supported, the operations happens with the guarantees offered by the transaction system.

Feature matrix

[edit]

Comparison of several ordered key–value stores
OKVS	License	Transactions	Distributed	In-memory	Multiple threads	Multiple processes	Nested Transactions
Berkeley DB	AGPL	Yes	No	No	Yes	Yes	Yes
FoundationDB	Apache	Yes	Yes	Yes	Yes	Yes	No
Kyoto Cabinet	GPL	Yes	No	No	?	No	No
LevelDB	Apache	No	No	No	?	No	No
LMDB	OpenLDAP	Yes	No	No	Yes	Yes	Yes
RocksDB	Apache	Yes	No	No	Yes	No	No
SQLite (LSM extension)	Public domain	Yes	No	Yes	Yes	Yes	Yes
TiKV	Apache	Yes	Yes	No	Yes	Yes	No
Tkrzw	Apache	Yes	No	?	Yes	No	?
Tokyo Cabinet	LGPL	Some	No	?	Yes	?	?
WiredTiger	GPL	Yes	No	Yes	Yes	No	No

Use-cases

[edit]

OKVS are useful to implement two strategies: optimize a small feature e.g. to make a 10% improvement in read or write latency; the second strategy is to take advantage of the distributed nature of FoundationDB, and TiKV, for which there is no equivalent at very large scale in resilience. Both users need to re-implement the needed high level abstractions, because there are no portable ready-to-use libraries of high-level abstraction. There is still a complex balance, of complexity, maintainability, fine-tuning, and readily available features that makes it still a choice of experts. Sometime more specialized data-structures can be faster than a high-level abstraction on top of an OKVS.

Another interest of OKVS paradigm stems from it simple, and versatile interface, that makes it an interesting target for experimental storage algorithms, and data structures.

References

[edit]

^"11.11. bsddb — Interface to Berkeley DB library — Python 2.7.17 documentation".docs.python.org. Retrieved2020-01-16.
^"sophia - modern transactional key-value/row storage library".sophia.systems. Retrieved2020-01-16.
^"Comparing new RocksDB and MMFiles storage engines".ArangoDB. Retrieved2020-01-16.
^"Python API — FoundationDB 6.2".apple.github.io. Retrieved2020-01-19.
^A record-oriented store built on FoundationDB., FoundationDB, 2020-01-16, retrieved2020-01-17
^"Generic Tuple Store Database".srfi.schemers.org. Retrieved2020-01-17.
^"Generic Tuple Store".GitHub.
^A document data model on FoundationDB, implementing MongoDB® wire protocol: FoundationDB/fdb-document-layer, FoundationDB, 2019-12-09, retrieved2020-01-17
^meilisearch/MeiliSearch, MeiliSearch, 2021-06-19, retrieved2021-06-19
^"6.1. GeoMesa Index Structure — GeoMesa 1.3.1 Manuals".www.geomesa.org. Retrieved2020-01-19.^{[permanent dead link]}
^"The JanusGraph FoundationDB Storage Adapter - Ted Wilmes, Expero Inc".www.youtube.com. Retrieved2020-01-17.
^"Lightning Talk: Entity Store: A FoundationDB Layer for Versioned... - Stephen Pimentel, - YouTube".www.youtube.com. Retrieved2020-01-17.
^meilisearch/arroy, Meilisearch, 2024-08-05, retrieved2024-08-06

v t e Database models
Common models	Flat Hierarchical Dimensional Network Relational Entity–relationship Enhanced Graph Object-oriented Entity–attribute–value
Other models	Multi-dimensional Array Semantic Star schema XML database
Implementations	Flat file Column-oriented Document-oriented Object–relational Deductive Temporal Valid time Transaction time Decision time XML data store Key–value store Ordered Key-Value Store Triplestore