Developer(s) | Joe Thornber, Heinz Mauelshagen, Mike Snitzer and others |
---|---|
Initial release | April 28, 2013 (Linux 3.9) |
Written in | C |
Operating system | Linux |
Type | Linux kernel feature |
License | GNU GPL |
Website | kernel |
dm-cache is a component (more specifically, a target) of the Linux kernel's device mapper, which is a framework for mapping block devices onto higher-level virtual block devices. It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), to act as a cache for one or more slower storage devices such as hard disk drives (HDDs); this effectively creates hybrid volumes and provides secondary storage performance improvements.
The design of dm-cache requires three physical storage devices for the creation of a single hybrid volume; dm-cache uses those storage devices to separately store actual data, cache data, and required metadata. Configurable operating modes and cache policies, with the latter in the form of separate modules, determine the way data caching is actually performed.
dm-cache is licensed under the terms of the GNU General Public License (GPL), with Joe Thornber, Heinz Mauelshagen and Mike Snitzer as its primary developers.
dm-cache uses solid-state drives (SSDs) as an additional level of indirection while accessing hard disk drives (HDDs), improving the overall performance by using fast flash-based SSDs as caches for the slower mechanical HDDs based on rotational magnetic media. As a result, the speed of costly SSDs is combined with the storage capacity offered by slower but less expensive HDDs.[1] Moreover, in the case of storage area networks (SANs) used in cloud environments as shared storage systems for virtual machines, dm-cache can also improve overall performance and reduce the load of SANs by providing data caching using client-side local storage.[2][3][4]
dm-cache is implemented as a component of the Linux kernel's device mapper, which is a volume management framework that allows various mappings to be created between physical and virtual block devices. The way a mapping between devices is created determines how the virtual blocks are translated into underlying physical blocks, with the specific translation types referred to as targets.[5] Acting as a mapping target, dm-cache makes it possible for SSD-based caching to be part of the created virtual block device, while the configurable operating modes and cache policies determine how dm-cache works internally. The operating mode selects the way in which the data is kept in sync between an HDD and an SSD, while the cache policy, selectable from separate modules that implement each of the policies, provides the algorithm for determining which blocks are promoted (moved from an HDD to an SSD), demoted (moved from an SSD to an HDD), cleaned, etc.[6]
When configured to use the multiqueue (mq) or stochastic multiqueue (smq) cache policy, with the latter being the default, dm-cache uses SSDs to store the data associated with performed random reads and writes, capitalizing on the near-zero seek times of SSDs and avoiding the I/O operations that are typical HDD performance bottlenecks. The data associated with sequential reads and writes is not cached on SSDs, avoiding undesirable cache invalidation during such operations; performance-wise, this is beneficial because sequential I/O operations are well suited to HDDs due to their mechanical nature. Not caching the sequential I/O also helps in extending the lifetime of SSDs used as caches.[7]
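As a rough illustration of the behaviour described above, the following Python sketch promotes blocks that keep being hit by random I/O while letting long sequential runs bypass the cache. It is a deliberately simplified toy model, not the kernel's mq/smq implementation; the class, thresholds and names are hypothetical.

```python
# Toy hit-count-based promotion heuristic; NOT the kernel's mq/smq code.
class ToyCachePolicy:
    def __init__(self, promote_threshold=4, sequential_threshold=8):
        self.hit_counts = {}          # origin block -> number of cache misses seen
        self.cached = set()           # origin blocks currently promoted to the SSD
        self.promote_threshold = promote_threshold
        self.sequential_threshold = sequential_threshold
        self._last_block = None
        self._run_length = 0

    def _is_sequential(self, block):
        # Long runs of consecutive blocks are treated as sequential I/O
        # and left to the HDD, so they never pollute the cache.
        if self._last_block is not None and block == self._last_block + 1:
            self._run_length += 1
        else:
            self._run_length = 0
        self._last_block = block
        return self._run_length >= self.sequential_threshold

    def access(self, block):
        """Return 'cache' if the block should be served from the SSD,
        otherwise 'origin' (the HDD)."""
        if self._is_sequential(block):
            return "origin"                      # bypass the cache for streaming I/O
        if block in self.cached:
            return "cache"                       # cache hit
        self.hit_counts[block] = self.hit_counts.get(block, 0) + 1
        if self.hit_counts[block] >= self.promote_threshold:
            self.cached.add(block)               # promote a frequently missed block
        return "origin"


policy = ToyCachePolicy()
for _ in range(4):
    policy.access(42)                 # repeated random hits on block 42
print(policy.access(42))              # -> 'cache' once the block has been promoted
```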
Another dm-cache project with similar goals was announced by Eric Van Hensbergen and Ming Zhao in 2006, as the result of internship work at IBM.[8]
Later, Joe Thornber, Heinz Mauelshagen and Mike Snitzer provided their own implementation of the concept, which resulted in the inclusion of dm-cache into the Linux kernel. dm-cache was merged into the Linux kernel mainline in kernel version 3.9, which was released on April 28, 2013.[6][9]
In dm-cache, creating a mapped virtual block device that acts as a hybrid volume requires three physical storage devices:[6]

* Origin device – provides slow primary storage (usually an HDD)
* Cache device – provides a fast cache (usually an SSD)
* Metadata device – records which blocks are cached, which of them are dirty, and other internal data required by a cache policy
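When such a hybrid volume is set up through the device mapper directly, the three devices appear as fields of a single "cache" target line. The following Python sketch composes such a line; the field order follows the kernel's documentation for the cache target, while the device names and sizes used here are placeholder examples.

```python
# Minimal sketch: composing a device-mapper "cache" target table line from the
# three devices described above. Device paths and sizes are placeholders.
def cache_table_line(origin_sectors, metadata_dev, cache_dev, origin_dev,
                     block_size_sectors, features=(), policy="default",
                     policy_args=()):
    fields = [
        "0", str(origin_sectors), "cache",     # start, length (512-byte sectors), target
        metadata_dev, cache_dev, origin_dev,   # the three physical devices
        str(block_size_sectors),               # caching-extent size, in 512-byte sectors
        str(len(features)), *features,         # feature arguments (e.g. operating mode)
        policy, str(len(policy_args)), *policy_args,
    ]
    return " ".join(fields)

# 20 GiB origin, 256 KiB extents (512 sectors), write-through mode, smq policy:
print(cache_table_line(41943040,
                       "/dev/mapper/cache-metadata",   # placeholder device names
                       "/dev/mapper/cache-blocks",
                       "/dev/mapper/origin",
                       512,
                       features=("writethrough",),
                       policy="smq"))
```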
Internally, dm-cache references each of the origin devices through a number of fixed-size blocks; the size of these blocks, equal to the size of a caching extent, is configurable only during the creation of a hybrid volume. The size of a caching extent must range between 32 KB and 1 GB, and it must be a multiple of 32 KB; typically, the size of a caching extent is between 256 and 1024 KB. The choice of caching extents bigger than disk sectors acts as a compromise between the size of metadata and the possibility of wasting cache space. Having too small caching extents increases the size of metadata, both on the metadata device and in kernel memory, while having too large caching extents increases the amount of wasted cache space, because whole extents are cached even when only some of their parts see high hit rates.[6][10]
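The size constraints above can be summarized in a short sketch, assuming the limits exactly as stated in the text; the helper also converts the chosen extent size into 512-byte sectors, the unit used in device-mapper tables.

```python
# Sketch of the caching-extent size constraints described above: the size must
# be a multiple of 32 KiB and lie between 32 KiB and 1 GiB.
KIB = 1024
SECTOR = 512
MIN_EXTENT = 32 * KIB
MAX_EXTENT = 1024 * 1024 * KIB   # 1 GiB

def extent_size_to_sectors(extent_bytes):
    if extent_bytes % MIN_EXTENT != 0:
        raise ValueError("extent size must be a multiple of 32 KiB")
    if not MIN_EXTENT <= extent_bytes <= MAX_EXTENT:
        raise ValueError("extent size must be between 32 KiB and 1 GiB")
    return extent_bytes // SECTOR   # device-mapper tables use 512-byte sectors

print(extent_size_to_sectors(256 * KIB))   # -> 512 sectors, a typical extent size
```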
Operating modes supported by dm-cache are write-back, which is the default, write-through, and pass-through. In the write-back operating mode, writes to cached blocks go only to the cache device, while the blocks on the origin device are only marked as dirty in the metadata. In the write-through operating mode, write requests are not returned as completed until the data reaches both the origin and cache devices, with no clean blocks becoming marked as dirty. In the pass-through operating mode, all reads are performed directly from the origin device, avoiding the cache, while all writes go directly to the origin device; any cache write hits also cause invalidation of the cached blocks. The pass-through mode allows a hybrid volume to be activated when the state of the cache device is not known to be consistent with the origin device.[6][11]
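The practical difference between the three operating modes can be illustrated by modelling how a single write to an already cached block is handled; the sketch below mirrors the behaviour described above and is not the kernel implementation.

```python
# Illustrative model of a write to a block that is already on the cache device,
# under each operating mode described above; not dm-cache internals.
def handle_write_to_cached_block(mode, block, cached_blocks, dirty_blocks):
    """cached_blocks: blocks currently on the SSD;
       dirty_blocks: cached blocks whose copy on the origin device is stale."""
    if mode == "writeback":
        # Write goes only to the cache device; metadata marks the block dirty.
        dirty_blocks.add(block)
        return ["cache"]
    if mode == "writethrough":
        # Completion is reported only once both devices have the data;
        # the block never becomes dirty.
        return ["cache", "origin"]
    if mode == "passthrough":
        # All writes go directly to the origin device; a write hit
        # additionally invalidates the cached copy.
        cached_blocks.discard(block)
        dirty_blocks.discard(block)
        return ["origin"]
    raise ValueError("unknown operating mode")

cached, dirty = {7}, set()
print(handle_write_to_cached_block("writeback", 7, cached, dirty))    # ['cache']; 7 is now dirty
print(handle_write_to_cached_block("passthrough", 7, cached, dirty))  # ['origin']; 7 dropped from cache
```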
The rate of data migration that dm-cache performs in both directions (i.e., data promotions and demotions) can be throttled down to a configured speed so regular I/O to the origin and cache devices can be preserved. Decommissioning a hybrid volume or shrinking a cache device requires use of the cleaner policy, which effectively flushes all blocks marked in metadata as dirty from the cache device to the origin device.[6][7]
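As a rough sketch of what the cleaner policy and migration throttling amount to, the following toy function copies every dirty block back to the origin device at a bounded rate; the rate limit and data structures are simplifications introduced here purely for illustration.

```python
import time

# Toy cleaner-style flush: write each dirty block back to the origin device,
# pacing the copies so regular I/O is not starved. Not dm-cache internals.
def flush_dirty_blocks(dirty_blocks, copy_block, max_blocks_per_second=100):
    interval = 1.0 / max_blocks_per_second
    for block in sorted(dirty_blocks):
        copy_block(block)              # copy the block from the SSD to the HDD
        time.sleep(interval)           # throttle the migration rate
    dirty_blocks.clear()               # every cached block is now clean

dirty = {3, 5, 8}
flush_dirty_blocks(dirty, copy_block=lambda b: print(f"flushing block {b}"),
                   max_blocks_per_second=1000)
print(dirty)   # -> set(): the cache can now be shrunk or decommissioned
```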
As of August 2015 and version 4.2 of the Linux kernel,[12] the following three cache policies are distributed with the Linux kernel mainline, out of which dm-cache by default uses the stochastic multiqueue policy:[6][7]

* multiqueue (mq)
* stochastic multiqueue (smq), the default
* cleaner, which flushes all dirty blocks back to the origin device
Logical Volume Manager includes lvmcache, which provides a wrapper for dm-cache integrated with LVM; it acts as a read and write hot-spot cache, using the dm-cache kernel module.[13]