Ceph Storage | |
---|---|
Original author(s) | Inktank Storage |
Developer(s) | |
Stable release | 19.2.2[2] |
Repository | |
Written in | C++, Python[3] |
Operating system | Linux, FreeBSD,[4] Windows[5] |
Type | Distributed object store |
License | LGPLv2.1[6] |
Website | ceph |
Ceph (pronounced /ˈsɛf/) is a free and open-source software-defined storage platform that provides object storage,[7] block storage, and file storage built on a common distributed cluster foundation. Ceph provides distributed operation without a single point of failure and scalability to the exabyte level. Since version 12 (Luminous), Ceph no longer relies on a conventional filesystem: it directly manages HDDs and SSDs with its own storage back end, BlueStore, and can expose a POSIX filesystem.
Ceph replicates data with fault tolerance,[8] using commodity hardware and Ethernet IP and requiring no specific hardware support. Ceph is highly available and ensures strong data durability through techniques including replication, erasure coding, snapshots and clones. By design, the system is both self-healing and self-managing, minimizing administration time and other costs.
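The difference between replication and erasure coding can be illustrated with a minimal sketch. It assumes the simplest possible code, two data chunks plus one XOR parity chunk (k=2, m=1); Ceph's own erasure-code plugins use more general codes, and the function names below are hypothetical.

```python
from __future__ import annotations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> list[bytes]:
    """Split data into two equal chunks and append one XOR parity chunk."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length; a real system records the true size
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    return [d0, d1, xor_bytes(d0, d1)]


def recover(chunks: list[bytes | None]) -> list[bytes]:
    """Rebuild at most one missing chunk (marked None) from the other two."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("k=2, m=1 tolerates only one lost chunk")
    repaired = list(chunks)
    if missing:
        a, b = (c for c in chunks if c is not None)
        repaired[missing[0]] = xor_bytes(a, b)
    return repaired


chunks = encode(b"cephalopod")               # three chunks on three devices
damaged = [chunks[0], None, chunks[2]]       # simulate losing one device
restored = recover(damaged)
original = restored[0] + restored[1]         # data chunks concatenated again
stored_bytes = sum(len(c) for c in chunks)   # space used by the encoded form
```

In this toy scheme the 10-byte payload occupies 15 bytes and survives the loss of any single chunk, whereas keeping three full replicas would occupy 30 bytes.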
Large-scale production Ceph deployments include CERN,[9][10] OVH[11][12][13][14] and DigitalOcean.[15][16]
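Ceph employs five distinct kinds of daemons:[17] cluster monitors (ceph-mon), object storage devices (ceph-osd), metadata servers (ceph-mds), HTTP gateways (ceph-rgw), and managers (ceph-mgr).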
All of these are fully distributed, and may be deployed on disjoint, dedicated servers or in a converged topology. Clients with different needs directly interact with appropriate cluster components.[21]
Ceph distributes data across multiple storage devices and nodes to achieve higher throughput, in a fashion similar to RAID. Adaptive load balancing is supported whereby frequently accessed services may be replicated over more nodes.[22]
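The distribution of data across devices described above can be sketched as deterministic, hash-based placement. This is not Ceph's placement algorithm: the sketch swaps in plain rendezvous (highest-random-weight) hashing as a stand-in, and the device names and the `place` function are hypothetical.

```python
from __future__ import annotations

import hashlib


def score(object_name: str, osd: str) -> int:
    """Deterministic pseudo-random weight for an (object, device) pair."""
    digest = hashlib.sha256(f"{object_name}/{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Pick `replicas` distinct devices: the ones with the highest scores."""
    if replicas > len(osds):
        raise ValueError("not enough devices for the requested replica count")
    ranked = sorted(osds, key=lambda osd: score(object_name, osd), reverse=True)
    return ranked[:replicas]


osds = [f"osd.{i}" for i in range(6)]        # hypothetical device names
placement = {name: place(name, osds) for name in ("obj-a", "obj-b", "obj-c")}

# Removing one device only affects objects that had a copy on it;
# every other copy stays where it was.
survivors = [o for o in osds if o != "osd.2"]
after = {name: place(name, survivors) for name in placement}
```

Because every client can compute the same placement from the object name and the device list, no central lookup table is needed in this model.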
As of September 2017, BlueStore is the default and recommended storage back end for production environments.[23] It provides better latency and configurability than the older Filestore back end and avoids the shortcomings of filesystem-based storage, which involves additional processing and caching layers. The Filestore back end was deprecated as of the Reef release in mid-2023. XFS was the recommended underlying filesystem for Filestore OSDs, and Btrfs could be used at one's own risk. ext4 filesystems were not recommended due to limited metadata capacity.[24] The BlueStore back end still uses XFS for a small metadata partition.[25]
Ceph implements distributed object storage via the RADOS Gateway (ceph-rgw), which exposes the underlying storage layer via an interface compatible with Amazon S3 or OpenStack Swift.
Ceph RGW deployments scale readily and often utilize large and dense storage media for bulk use cases that include Big Data (datalake), backups & archives, IoT, media, video recording, and deployment images for virtual machines and containers.[26]
Ceph's software libraries provide client applications with direct access to the reliable autonomic distributed object store (RADOS) object-based storage system. More frequently used are libraries for Ceph's RADOS Block Device (RBD), RADOS Gateway, and Ceph File System services. In this way, administrators can maintain their storage devices within a unified system, which makes it easier to replicate and protect the data.
The "librados" software libraries provide access in C, C++, Java, PHP, and Python. The RADOS Gateway also exposes the object store as a RESTful interface which can present as both native Amazon S3 and OpenStack Swift APIs.
Ceph can provide clients with thin-provisioned block devices. When an application writes data to Ceph using a block device, Ceph automatically stripes and replicates the data across the cluster. Ceph's RADOS Block Device (RBD) also integrates with Kernel-based Virtual Machines (KVMs).
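A simplified model of how a thin-provisioned, striped block image might map byte offsets onto fixed-size backing objects is sketched below. The 4 MiB object size, the `ThinImage` class, and the in-memory dictionary standing in for the object store are assumptions made for illustration; this is not RBD's actual layout or API, and replication is left out.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size for this sketch (4 MiB)


class ThinImage:
    """Toy block image: backing objects are created only when first written."""

    def __init__(self, size, object_size=OBJECT_SIZE):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray holding that object's data

    def _extents(self, offset, length):
        """Yield (object index, offset inside object, chunk length) tuples."""
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("access outside the image")
        while length > 0:
            index, inner = divmod(offset, self.object_size)
            chunk = min(length, self.object_size - inner)
            yield index, inner, chunk
            offset += chunk
            length -= chunk

    def write(self, offset, data):
        pos = 0
        for index, inner, chunk in self._extents(offset, len(data)):
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[inner:inner + chunk] = data[pos:pos + chunk]
            pos += chunk

    def read(self, offset, length):
        out = bytearray()
        for index, inner, chunk in self._extents(offset, length):
            obj = self.objects.get(index)
            # Unwritten regions read back as zeros without consuming space.
            out += obj[inner:inner + chunk] if obj is not None else bytes(chunk)
        return bytes(out)


image = ThinImage(size=1024 * OBJECT_SIZE)   # a 4 GiB virtual image
image.write(OBJECT_SIZE - 2, b"span")        # write crosses an object boundary
touched = sorted(image.objects)              # indices of objects that now exist
```

Only the two objects touched by the write are allocated, although the image advertises 4 GiB. In a real cluster each such object could be placed on different devices, which is what lets large images benefit from striping.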
Ceph block storage may be deployed on traditional HDDs and/or SSDs for use cases including databases, virtual machines, data analytics, artificial intelligence, and machine learning. Block storage clients often require high throughput and IOPS, thus Ceph RBD deployments increasingly utilize SSDs with NVMe interfaces.
RBD is built on Ceph's foundational RADOS object storage system, which provides the librados interface and the CephFS file system. Since RBD is built on librados, RBD inherits librados's abilities, including clones and snapshots. By striping volumes across the cluster, Ceph improves performance for large block device images.
"Ceph-iSCSI" is a gateway which enables access to distributed, highly available block storage from Microsoft Windows and VMware vSphere servers or clients capable of speaking the iSCSI protocol. By using ceph-iscsi on one or more iSCSI gateway hosts, Ceph RBD images become available as Logical Units (LUs) associated with iSCSI targets, which can be accessed in an optionally load-balanced, highly available fashion.
Since ceph-iscsi configuration is stored in the Ceph RADOS object store, ceph-iscsi gateway hosts are inherently without persistent state and thus can be replaced, augmented, or reduced at will. As a result, Ceph Storage enables customers to run a truly distributed, highly-available, resilient, and self-healing enterprise storage technology on commodity hardware and an entirely open source platform.
The block device can be virtualized, providing block storage to virtual machines, in virtualization platforms such as OpenShift, OpenStack, Kubernetes, OpenNebula, Ganeti, Apache CloudStack and Proxmox Virtual Environment.
Ceph's file system (CephFS) runs on top of the same RADOS foundation as Ceph's object storage and block device services. The CephFS metadata server (MDS) provides a service that maps the directories and file names of the file system to objects stored within RADOS clusters. The metadata server cluster can expand or contract, and it can rebalance file system metadata ranks dynamically to distribute data evenly among cluster hosts. This ensures high performance and prevents heavy loads on specific hosts within the cluster.
Clients mount the POSIX-compatible file system using a Linux kernel client. An older FUSE-based client is also available. The servers run as regular Unix daemons.
Ceph's file storage is often associated with log collection, messaging, and file storage.
Since 2018 there has also been a Dashboard web UI project, which helps to manage the cluster. It is developed by the Ceph community under the LGPL-3 license and uses ceph-mgr, Python, Angular and Grafana.[27] Its landing page was refreshed at the beginning of 2023.[28]
Earlier dashboards were developed but have since been discontinued: Calamari (2013–2018), OpenAttic (2013–2019), VSM (2014–2016), Inkscope (2015–2016) and Ceph-Dash (2015–2017).[29]
Beginning in 2019 the Crimson project has been reimplementing the OSD data path. The goal of Crimson is to minimize latency and CPU overhead. Modern storage devices and interfaces including NVMe and 3D XPoint have become much faster than HDD and even SAS/SATA SSDs, but CPU performance has not kept pace. Moreover, crimson-osd is meant to be a backward-compatible drop-in replacement for ceph-osd. While Crimson can work with the BlueStore back end (via AlienStore), a new native ObjectStore implementation called SeaStore is also being developed along with CyanStore for testing purposes. One reason for creating SeaStore is that transaction support in the BlueStore back end is provided by RocksDB, which needs to be re-implemented to achieve better parallelism.[30][31][32]
Ceph was created by Sage Weil for his doctoral dissertation,[33] which was advised by Professor Scott A. Brandt at the Jack Baskin School of Engineering, University of California, Santa Cruz (UCSC), and sponsored by the Advanced Simulation and Computing Program (ASC), including Los Alamos National Laboratory (LANL), Sandia National Laboratories (SNL), and Lawrence Livermore National Laboratory (LLNL).[34] The first line of code that ended up being part of Ceph was written by Sage Weil in 2004 while at a summer internship at LLNL, working on scalable filesystem metadata management (known today as Ceph's MDS).[35] In 2005, as part of a summer project initiated by Scott A. Brandt and led by Carlos Maltzahn, Sage Weil created a fully functional file system prototype which adopted the name Ceph. Ceph made its debut with Sage Weil giving two presentations in November 2006, one at USENIX OSDI 2006[36] and another at SC'06.[37]
After his graduation in autumn 2007, Weil continued to work on Ceph full-time, and the core development team expanded to include Yehuda Sadeh Weinraub and Gregory Farnum. On March 19, 2010, Linus Torvalds merged the Ceph client into Linux kernel version 2.6.34,[38][39] which was released on May 16, 2010. In 2012, Weil created Inktank Storage for professional services and support for Ceph.[40][41]
In April 2014, Red Hat purchased Inktank, bringing the majority of Ceph development in-house to make it a production version for enterprises with support (hotline) and continuous maintenance (new versions).[42]
In October 2015, the Ceph Community Advisory Board was formed to assist the community in driving the direction of open source software-defined storage technology. The charter advisory board includes Ceph community members from global IT organizations that are committed to the Ceph project, including individuals from Red Hat, Intel, Canonical, CERN, Cisco, Fujitsu, SanDisk, and SUSE.[43]
In November 2018, the Linux Foundation launched the Ceph Foundation as a successor to the Ceph Community Advisory Board. Founding members of the Ceph Foundation included Amihan, Canonical, China Mobile, DigitalOcean, Intel, OVH, ProphetStor Data Services, Red Hat, SoftIron, SUSE, Western Digital, XSKY Data Technology, and ZTE.[44]
In March 2021, SUSE discontinued its Enterprise Storage product incorporating Ceph in favor of Rancher's Longhorn,[45] and the former Enterprise Storage website was updated stating "SUSE has refocused the storage efforts around serving our strategic SUSE Enterprise Storage Customers and are no longer actively selling SUSE Enterprise Storage."[46]
Name | Release | First release | End of life | Milestones |
---|---|---|---|---|
Argonaut | Unsupported: 0.48 | July 3, 2012 | | First major "stable" release |
Bobtail | Unsupported: 0.56 | January 1, 2013 | | |
Cuttlefish | Unsupported: 0.61 | May 7, 2013 | | ceph-deploy is stable |
Dumpling | Unsupported: 0.67 | August 14, 2013 | May 2015 | namespace, region, monitoring REST API |
Emperor | Unsupported: 0.72 | November 9, 2013 | May 2014 | multi-datacenter replication for RGW |
Firefly | Unsupported: 0.80 | May 7, 2014 | April 2016 | erasure coding, cache tiering, primary affinity, key/value OSD backend (experimental), standalone RGW (experimental) |
Giant | Unsupported: 0.87 | October 29, 2014 | April 2015 | |
Hammer | Unsupported: 0.94 | April 7, 2015 | August 2017 | |
Infernalis | Unsupported: 9.2.0 | November 6, 2015 | April 2016 | |
Jewel | Unsupported: 10.2.0 | April 21, 2016 | 2018-06-01 | Stable CephFS, experimental OSD back end named BlueStore, daemons no longer run as the root user |
Kraken | Unsupported: 11.2.0 | January 20, 2017 | 2017-08-01 | BlueStore is stable, EC for RBD pools |
Luminous | Unsupported: 12.2.0 | August 29, 2017 | 2020-03-01 | pg-upmap balancer |
Mimic | Unsupported: 13.2.0 | June 1, 2018 | 2020-07-22 | snapshots are stable, Beast is stable, official GUI (Dashboard) |
Nautilus | Unsupported: 14.2.0 | March 19, 2019 | 2021-06-01 | asynchronous replication, auto-retry of failed writes due to grown defect remapping |
Octopus | Unsupported: 15.2.0 | March 23, 2020 | 2022-06-01 | |
Pacific | Unsupported: 16.2.0 | March 31, 2021[47] | 2023-06-01 | |
Quincy | Unsupported: 17.2.0 | April 19, 2022[48] | 2024-06-01 | auto-setting of min_alloc_size for novel media |
Reef | Supported: 18.2.0 | August 3, 2023[49] | 2025-08-01[50] | |
Squid | Latest version: 19.2.0 | September 26, 2024[51] | 2026-09-19[52] | |
Tentacle[53] | Future version: TBA | TBA | | |
While built primarily for Linux, Ceph has also been partially ported to the Windows platform. It is production-ready for Windows Server 2016 (some commands might be unavailable due to the lack of a UNIX socket implementation), Windows Server 2019 and Windows Server 2022, but testing and development can also be done on Windows 10 and Windows 11. Ceph RBD and CephFS can be used on Windows, but OSDs are not supported on this platform.[54][5][55]
There is also a FreeBSD implementation of Ceph.[4]
MicroCeph is a simplified Ceph deployment system for non-experts that uses the snap packaging system, created by Canonical in 2022.[56] It is isolated from the underlying host, platform-independent, and scalable, and it offers minimal setup and maintenance overhead.[57] MicroCeph supports all Ceph data access protocols (block, file and object) and can be deployed with full disk encryption.[58]
The name "Ceph" is a shortened form of "cephalopod", a class of molluscs that includes squids, cuttlefish, nautiloids, and octopuses. The name (emphasized by the logo) suggests the highly parallel behavior of an octopus and was chosen to associate the file system with "Sammy", the banana slug mascot of UCSC.[17] Both cephalopods and banana slugs are molluscs.