Persistent Volumes
This document describes persistent volumes in Kubernetes. Familiarity with volumes, StorageClasses and VolumeAttributesClasses is suggested.
Introduction
Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, ReadWriteMany, or ReadWriteOncePod, see AccessModes).
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than size and access modes, without exposing users to the details of how those volumes are implemented. For these needs, there is the StorageClass resource.
See the detailed walkthrough with working examples.
Lifecycle of a volume and claim
PVs are resources in the cluster. PVCs are requests for those resources and also actas claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
Provisioning
There are two ways PVs may be provisioned: statically or dynamically.
Static
A cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
Dynamic
When none of the static PVs the administrator created match a user's PersistentVolumeClaim, the cluster may try to dynamically provision a volume specially for the PVC. This provisioning is based on StorageClasses: the PVC must request a storage class and the administrator must have created and configured that class for dynamic provisioning to occur. Claims that request the class "" effectively disable dynamic provisioning for themselves.
To enable dynamic storage provisioning based on storage class, the cluster administrator needs to enable the DefaultStorageClass admission controller on the API server. This can be done, for example, by ensuring that DefaultStorageClass is among the comma-delimited, ordered list of values for the --enable-admission-plugins flag of the API server component. For more information on API server command-line flags, check the kube-apiserver documentation.
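For illustration, a minimal sketch of what that flag might look like on a kube-apiserver invocation (the surrounding flags, and how you edit them, depend on how your control plane is deployed, for example via a static Pod manifest):

```shell
# Hypothetical excerpt of a kube-apiserver invocation; only the
# --enable-admission-plugins flag is the point of this sketch.
kube-apiserver --enable-admission-plugins=NodeRestriction,DefaultStorageClass ...
```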
Binding
A user creates, or in the case of dynamic provisioning, has already created, a PersistentVolumeClaim with a specific amount of storage requested and with certain access modes. A control loop in the control plane watches for new PVCs, finds a matching PV (if possible), and binds them together. If a PV was dynamically provisioned for a new PVC, the loop will always bind that PV to the PVC. Otherwise, the user will always get at least what they asked for, but the volume may be in excess of what was requested. Once bound, PersistentVolumeClaim binds are exclusive, regardless of how they were bound. A PVC to PV binding is a one-to-one mapping, using a ClaimRef which is a bi-directional binding between the PersistentVolume and the PersistentVolumeClaim.
Claims will remain unbound indefinitely if a matching volume does not exist. Claims will be bound as matching volumes become available. For example, a cluster provisioned with many 50Gi PVs would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.
Using
Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a Pod. For volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a Pod.
Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule Pods and access their claimed PVs by including a persistentVolumeClaim section in a Pod's volumes block. See Claims As Volumes for more details on this.
Storage Object in Use Protection
The purpose of the Storage Object in Use Protection feature is to ensure that PersistentVolumeClaims (PVCs) in active use by a Pod and PersistentVolumes (PVs) that are bound to PVCs are not removed from the system, as this may result in data loss.
Note:
A PVC is in active use by a Pod when a Pod object exists that is using the PVC. If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any Pods. Also, if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately. PV removal is postponed until the PV is no longer bound to a PVC.
You can see that a PVC is protected when the PVC's status is Terminating and the Finalizers list includes kubernetes.io/pvc-protection:
```shell
kubectl describe pvc hostpath
Name:          hostpath
Namespace:     default
StorageClass:  example-hostpath
Status:        Terminating
Volume:
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-class=example-hostpath
               volume.beta.kubernetes.io/storage-provisioner=example.com/hostpath
Finalizers:    [kubernetes.io/pvc-protection]
...
```
You can see that a PV is protected when the PV's status is Terminating and the Finalizers list includes kubernetes.io/pv-protection too:
```shell
kubectl describe pv task-pv-volume
Name:            task-pv-volume
Labels:          type=local
Annotations:     <none>
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    standard
Status:          Terminating
Claim:
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        1Gi
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/data
    HostPathType:
Events:            <none>
```
Reclaiming
When a user is done with their volume, they can delete the PVC objects from the API, which allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
Retain
The Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume. An administrator can manually reclaim the volume with the following steps.
- Delete the PersistentVolume. The associated storage asset in external infrastructure still exists after the PV is deleted.
- Manually clean up the data on the associated storage asset accordingly.
- Manually delete the associated storage asset.
If you want to reuse the same storage asset, create a new PersistentVolume with the same storage asset definition.
Delete
For volume plugins that support the Delete reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure. Volumes that were dynamically provisioned inherit the reclaim policy of their StorageClass, which defaults to Delete. The administrator should configure the StorageClass according to users' expectations; otherwise, the PV must be edited or patched after it is created. See Change the Reclaim Policy of a PersistentVolume.
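For example, a PV's reclaim policy can be patched in place (a minimal sketch; task-pv-volume is a placeholder PV name):

```shell
# Switch an existing PV from the Delete to the Retain reclaim policy.
kubectl patch pv task-pv-volume -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```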
Recycle
Warning:
The Recycle reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.

If supported by the underlying volume plugin, the Recycle reclaim policy performs a basic scrub (rm -rf /thevolume/*) on the volume and makes it available again for a new claim.
However, an administrator can configure a custom recycler Pod template using the Kubernetes controller manager command line arguments as described in the reference. The custom recycler Pod template must contain a volumes specification, as shown in the example below:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pv-recycler
  namespace: default
spec:
  restartPolicy: Never
  volumes:
    - name: vol
      hostPath:
        path: /any/path/it/will/be/replaced
  containers:
    - name: pv-recycler
      image: "registry.k8s.io/busybox"
      command: ["/bin/sh", "-c", "test -e /scrub && rm -rf /scrub/..?* /scrub/.[!.]* /scrub/* && test -z \"$(ls -A /scrub)\" || exit 1"]
      volumeMounts:
        - name: vol
          mountPath: /scrub
```
However, the particular path specified in the custom recycler Pod template in the volumes part is replaced with the particular path of the volume that is being recycled.
PersistentVolume deletion protection finalizer
Kubernetes v1.33 [stable] (enabled by default)

Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes having a Delete reclaim policy are deleted only after the backing storage is deleted.
The finalizer external-provisioner.volume.kubernetes.io/finalizer (introduced in v1.31) is added to both dynamically provisioned and statically provisioned CSI volumes.
The finalizer kubernetes.io/pv-controller (introduced in v1.31) is added to dynamically provisioned in-tree plugin volumes and skipped for statically provisioned in-tree plugin volumes.
The following is an example of a dynamically provisioned in-tree plugin volume:
```shell
kubectl describe pv pvc-74a498d6-3929-47e8-8c02-078c1ece4d78
Name:            pvc-74a498d6-3929-47e8-8c02-078c1ece4d78
Labels:          <none>
Annotations:     kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers:      [kubernetes.io/pv-protection kubernetes.io/pv-controller]
StorageClass:    vcp-sc
Status:          Bound
Claim:           default/vcp-pvc-1
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:               vSphereVolume (a Persistent Disk resource in vSphere)
    VolumePath:         [vsanDatastore] d49c4a62-166f-ce12-c464-020077ba5d46/kubernetes-dynamic-pvc-74a498d6-3929-47e8-8c02-078c1ece4d78.vmdk
    FSType:             ext4
    StoragePolicyName:  vSAN Default Storage Policy
Events:                 <none>
```
The finalizer external-provisioner.volume.kubernetes.io/finalizer is added for CSI volumes. The following is an example:
```shell
Name:            pvc-2f0bab97-85a8-4552-8044-eb8be45cf48d
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
Finalizers:      [kubernetes.io/pv-protection external-provisioner.volume.kubernetes.io/finalizer]
StorageClass:    fast
Status:          Bound
Claim:           demo-app/nginx-logs
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        200Mi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi.vsphere.vmware.com
    FSType:            ext4
    VolumeHandle:      44830fa8-79b4-406b-8b58-621ba25353fd
    ReadOnly:          false
    VolumeAttributes:  storage.kubernetes.io/csiProvisionerIdentity=1648442357185-8081-csi.vsphere.vmware.com
                       type=vSphere CNS Block Volume
Events:                <none>
```
When the CSIMigration{provider} feature flag is enabled for a specific in-tree volume plugin, the kubernetes.io/pv-controller finalizer is replaced by the external-provisioner.volume.kubernetes.io/finalizer finalizer.
The finalizers ensure that the PV object is removed only after the volume is deleted from the storage backend, provided the reclaim policy of the PV is Delete. This also ensures that the volume is deleted from the storage backend irrespective of the order of deletion of the PV and PVC.
Reserving a PersistentVolume
The control plane can bind PersistentVolumeClaims to matching PersistentVolumes in the cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC. If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its claimRef field, then the PersistentVolume and PersistentVolumeClaim will be bound.
The binding happens regardless of some volume matching criteria, including node affinity. The control plane still checks that storage class, access modes, and requested storage size are valid.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foo-pvc
  namespace: foo
spec:
  storageClassName: "" # Empty string must be explicitly set otherwise default StorageClass will be set
  volumeName: foo-pv
  ...
```
This method does not guarantee any binding privileges to the PersistentVolume. If other PersistentVolumeClaims could use the PV that you specify, you first need to reserve that storage volume. Specify the relevant PersistentVolumeClaim in the claimRef field of the PV so that other PVCs can not bind to it.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: foo-pv
spec:
  storageClassName: ""
  claimRef:
    name: foo-pvc
    namespace: foo
  ...
```
This is useful if you want to consume PersistentVolumes that have their persistentVolumeReclaimPolicy set to Retain, including cases where you are reusing an existing PV.
Expanding Persistent Volumes Claims
Kubernetes v1.24 [stable]

Support for expanding PersistentVolumeClaims (PVCs) is enabled by default. You can expand the following types of volumes:
- csi (including some CSI migrated volume types)
- flexVolume (deprecated)
- portworxVolume (deprecated)
You can only expand a PVC if its storage class's allowVolumeExpansion field is set to true.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-vol-default
provisioner: vendor-name.example/magicstorage
parameters:
  resturl: "http://192.168.10.100:8080"
  restuser: ""
  secretNamespace: ""
  secretName: ""
allowVolumeExpansion: true
```
To request a larger volume for a PVC, edit the PVC object and specify a larger size. This triggers expansion of the volume that backs the underlying PersistentVolume. A new PersistentVolume is never created to satisfy the claim. Instead, an existing volume is resized.
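As a minimal sketch, the larger size can be requested with kubectl edit or a patch like the following (the PVC name and size are placeholders):

```shell
# Trigger expansion by raising spec.resources.requests.storage on the PVC.
kubectl patch pvc myclaim -p '{"spec":{"resources":{"requests":{"storage":"16Gi"}}}}'
```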
Warning:
Directly editing the size of a PersistentVolume can prevent an automatic resize of that volume. If you edit the capacity of a PersistentVolume, and then edit the .spec of a matching PersistentVolumeClaim to make the size of the PersistentVolumeClaim match the PersistentVolume, then no storage resize happens. The Kubernetes control plane will see that the desired state of both resources matches, and conclude that the backing volume size has been manually increased and that no resize is necessary.

CSI Volume expansion
Kubernetes v1.24 [stable]

Support for expanding CSI volumes is enabled by default but it also requires a specific CSI driver to support volume expansion. Refer to the documentation of the specific CSI driver for more information.
Resizing a volume containing a file system
You can only resize volumes containing a file system if the file system is XFS, Ext3, or Ext4.
When a volume contains a file system, the file system is only resized when a new Pod is using the PersistentVolumeClaim in ReadWrite mode. File system expansion is either done when a Pod is starting up or when a Pod is running and the underlying file system supports online expansion.
FlexVolumes (deprecated since Kubernetes v1.23) allow resize if the driver is configured with the RequiresFSResize capability set to true. The FlexVolume can be resized on Pod restart.
Resizing an in-use PersistentVolumeClaim
Kubernetes v1.24 [stable]

In this case, you don't need to delete and recreate a Pod or deployment that is using an existing PVC. Any in-use PVC automatically becomes available to its Pod as soon as its file system has been expanded. This feature has no effect on PVCs that are not in use by a Pod or deployment. You must create a Pod that uses the PVC before the expansion can complete.
Similar to other volume types, FlexVolume volumes can also be expanded when in use by a Pod.
Note:
FlexVolume resize is possible only when the underlying driver supports resize.

Recovering from Failure when Expanding Volumes
If a user specifies a new size that is too big to be satisfied by the underlying storage system, expansion of the PVC will be continuously retried until the user or cluster administrator takes some action. This can be undesirable, and hence Kubernetes provides the following methods of recovering from such failures.
If expanding the underlying storage fails, the cluster administrator can manually recover the Persistent Volume Claim (PVC) state and cancel the resize requests. Otherwise, the resize requests are continuously retried by the controller without administrator intervention.
- Mark the PersistentVolume (PV) that is bound to the PersistentVolumeClaim (PVC) with the Retain reclaim policy.
- Delete the PVC. Since the PV has the Retain reclaim policy, we will not lose any data when we recreate the PVC.
- Delete the claimRef entry from the PV specs, so a new PVC can bind to it. This should make the PV Available.
- Re-create the PVC with a smaller size than the PV and set the volumeName field of the PVC to the name of the PV. This should bind the new PVC to the existing PV.
- Don't forget to restore the reclaim policy of the PV. (A kubectl sketch of these steps is shown below.)
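A minimal kubectl sketch of the steps above, assuming a PV named example-pv bound to a PVC named example-pvc (both placeholder names):

```shell
# 1. Protect the data: set the PV's reclaim policy to Retain.
kubectl patch pv example-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# 2. Delete the PVC whose expansion keeps failing.
kubectl delete pvc example-pvc

# 3. Remove the claimRef entry so the PV becomes Available.
kubectl patch pv example-pv --type json -p '[{"op":"remove","path":"/spec/claimRef"}]'

# 4. Re-create the PVC with a smaller request and spec.volumeName: example-pv,
#    then restore the PV's original reclaim policy.
```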
If expansion has failed for a PVC, you can retry expansion with a smaller size than the previously requested value. To request a new expansion attempt with a smaller proposed size, edit .spec.resources for that PVC and choose a value that is less than the value you previously tried. This is useful if expansion to a higher value did not succeed because of a capacity constraint. If that has happened, or you suspect that it might have, you can retry expansion by specifying a size that is within the capacity limits of the underlying storage provider. You can monitor the status of the resize operation by watching .status.allocatedResourceStatuses and events on the PVC.
Note that, although you can specify a lower amount of storage than what was requested previously, the new value must still be higher than .status.capacity. Kubernetes does not support shrinking a PVC to less than its current size.
Types of Persistent Volumes
PersistentVolume types are implemented as plugins. Kubernetes currently supports the following plugins:
- csi - Container Storage Interface (CSI)
- fc - Fibre Channel (FC) storage
- hostPath - HostPath volume (for single node testing only; WILL NOT WORK in a multi-node cluster; consider using local volume instead)
- iscsi - iSCSI (SCSI over IP) storage
- local - local storage devices mounted on nodes
- nfs - Network File System (NFS) storage
The following types of PersistentVolume are deprecated but still available. If you are using these volume types except for flexVolume, cephfs, and rbd, please install the corresponding CSI drivers.
- awsElasticBlockStore - AWS Elastic Block Store (EBS) (migration on by default starting v1.23)
- azureDisk - Azure Disk (migration on by default starting v1.23)
- azureFile - Azure File (migration on by default starting v1.24)
- cinder - Cinder (OpenStack block storage) (migration on by default starting v1.21)
- flexVolume - FlexVolume (deprecated starting v1.23, no migration plan and no plan to remove support)
- gcePersistentDisk - GCE Persistent Disk (migration on by default starting v1.23)
- portworxVolume - Portworx volume (migration on by default starting v1.31)
- vsphereVolume - vSphere VMDK volume (migration on by default starting v1.25)
Older versions of Kubernetes also supported the following in-tree PersistentVolume types:
- cephfs (not available starting v1.31)
- flocker - Flocker storage (not available starting v1.25)
- glusterfs - GlusterFS storage (not available starting v1.26)
- photonPersistentDisk - Photon controller persistent disk (not available starting v1.15)
- quobyte - Quobyte volume (not available starting v1.25)
- rbd - Rados Block Device (RBD) volume (not available starting v1.31)
- scaleIO - ScaleIO volume (not available starting v1.21)
- storageos - StorageOS volume (not available starting v1.25)
Persistent Volumes
Each PV contains a spec and status, which is the specification and status of the volume. The name of a PersistentVolume object must be a valid DNS subdomain name.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0003
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2
```
Note:
Helper programs relating to the volume type may be required for consumption of a PersistentVolume within a cluster. In this example, the PersistentVolume is of type NFS and the helper program /sbin/mount.nfs is required to support the mounting of NFS filesystems.

Capacity
Generally, a PV will have a specific storage capacity. This is set using the PV's capacity attribute, which is a Quantity value.
Currently, storage size is the only resource that can be set or requested.Future attributes may include IOPS, throughput, etc.
Volume Mode
Kubernetes v1.18 [stable]

Kubernetes supports two volumeModes of PersistentVolumes: Filesystem and Block.
volumeMode is an optional API parameter. Filesystem is the default mode used when the volumeMode parameter is omitted.
A volume with volumeMode: Filesystem is mounted into Pods into a directory. If the volume is backed by a block device and the device is empty, Kubernetes creates a filesystem on the device before mounting it for the first time.
You can set the value of volumeMode to Block to use a volume as a raw block device. Such a volume is presented into a Pod as a block device, without any filesystem on it. This mode is useful to provide a Pod the fastest possible way to access a volume, without any filesystem layer between the Pod and the volume. On the other hand, the application running in the Pod must know how to handle a raw block device. See Raw Block Volume Support for an example on how to use a volume with volumeMode: Block in a Pod.
Access Modes
A PersistentVolume can be mounted on a host in any way supported by the resource provider. As shown in the table below, providers will have different capabilities and each PV's access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV's capabilities.
The access modes are:
ReadWriteOnce - the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access (read from or write to) that volume when the pods are running on the same node. For single pod access, please see ReadWriteOncePod.

ReadOnlyMany - the volume can be mounted as read-only by many nodes.

ReadWriteMany - the volume can be mounted as read-write by many nodes.

ReadWriteOncePod - the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across the whole cluster can read that PVC or write to it.

FEATURE STATE: Kubernetes v1.29 [stable]
Note:
The ReadWriteOncePod access mode is only supported for CSI volumes and Kubernetes version 1.22+. To use this feature you will need to update the following CSI sidecars to these versions or greater:
In the CLI, the access modes are abbreviated to:
- RWO - ReadWriteOnce
- ROX - ReadOnlyMany
- RWX - ReadWriteMany
- RWOP - ReadWriteOncePod
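For instance, a claim that relies on ReadWriteOncePod to guarantee single-Pod access could look like the following sketch (the name and size are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-pvc   # placeholder name
spec:
  accessModes:
    - ReadWriteOncePod      # only one Pod across the cluster may use this volume
  resources:
    requests:
      storage: 1Gi
```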
Note:
Kubernetes uses volume access modes to match PersistentVolumeClaims and PersistentVolumes. In some cases, the volume access modes also constrain where the PersistentVolume can be mounted. Volume access modes do not enforce write protection once the storage has been mounted. Even if the access modes are specified as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, they don't set any constraints on the volume. For example, even if a PersistentVolume is created as ReadOnlyMany, it is no guarantee that it will be read-only. If the access modes are specified as ReadWriteOncePod, the volume is constrained and can be mounted on only a single Pod.

Important! A volume can only be mounted using one access mode at a time, even if it supports many.
| Volume Plugin | ReadWriteOnce | ReadOnlyMany | ReadWriteMany | ReadWriteOncePod |
|---|---|---|---|---|
| AzureFile | ✓ | ✓ | ✓ | - |
| CephFS | ✓ | ✓ | ✓ | - |
| CSI | depends on the driver | depends on the driver | depends on the driver | depends on the driver |
| FC | ✓ | ✓ | - | - |
| FlexVolume | ✓ | ✓ | depends on the driver | - |
| HostPath | ✓ | - | - | - |
| iSCSI | ✓ | ✓ | - | - |
| NFS | ✓ | ✓ | ✓ | - |
| RBD | ✓ | ✓ | - | - |
| VsphereVolume | ✓ | - | - (works when Pods are collocated) | - |
| PortworxVolume | ✓ | - | ✓ | - |
Class
A PV can have a class, which is specified by setting the storageClassName attribute to the name of a StorageClass. A PV of a particular class can only be bound to PVCs requesting that class. A PV with no storageClassName has no class and can only be bound to PVCs that request no particular class.
In the past, the annotation volume.beta.kubernetes.io/storage-class was used instead of the storageClassName attribute. This annotation is still working; however, it will become fully deprecated in a future Kubernetes release.
Reclaim Policy
Current reclaim policies are:
- Retain -- manual reclamation
- Recycle -- basic scrub (rm -rf /thevolume/*)
- Delete -- delete the volume
For Kubernetes 1.35, only nfs and hostPath volume types support recycling.
Mount Options
A Kubernetes administrator can specify additional mount options for when a Persistent Volume is mounted on a node.
Note:
Not all Persistent Volume types support mount options.

The following volume types support mount options:
- csi (including CSI migrated volume types)
- iscsi
- nfs
Mount options are not validated. If a mount option is invalid, the mount fails.
In the past, the annotation volume.beta.kubernetes.io/mount-options was used instead of the mountOptions attribute. This annotation is still working; however, it will become fully deprecated in a future Kubernetes release.
Node Affinity
Note:
For most volume types, you do not need to set this field. You need to explicitly set this for local volumes.

A PV can specify node affinity to define constraints that limit what nodes this volume can be accessed from. Pods that use a PV will only be scheduled to nodes that are selected by the node affinity. To specify node affinity, set nodeAffinity in the .spec of a PV. The PersistentVolume API reference has more details on this field.
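A minimal sketch of a local PV that pins the volume to one node (the node name, path, capacity, and class are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv          # placeholder name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage # placeholder class
  local:
    path: /mnt/disks/ssd1         # placeholder path on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - example-node    # placeholder node name
```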
Updates to node affinity
Kubernetes v1.35 [alpha] (disabled by default)

If the MutablePVNodeAffinity feature gate is enabled in your cluster, the .spec.nodeAffinity field of a PersistentVolume is mutable. This allows cluster administrators or an external storage controller to update the node affinity of a PersistentVolume when the data is migrated, without interrupting the running pods.
When updating the node affinity, you should ensure that the new node affinity still matches the nodes where the volume is currently in use. Pods that violate the new affinity may continue to run if they are already running, but Kubernetes does not support this configuration, and you should terminate the violating pods soon. Due to in-memory caching, pods created after the update may still be scheduled according to the old node affinity for a short period of time.
To use this feature, you should enable the MutablePVNodeAffinity feature gate on the following components:
- kube-apiserver
- kubelet
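As an illustrative sketch, the gate is turned on like any other feature gate on those components (exact flag placement depends on how each component is deployed; kubelet feature gates can also be set in the kubelet configuration file):

```shell
# Hypothetical invocations showing only the relevant flag.
kube-apiserver --feature-gates=MutablePVNodeAffinity=true ...
kubelet --feature-gates=MutablePVNodeAffinity=true ...
```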
Phase
A PersistentVolume will be in one of the following phases:
Available - a free resource that is not yet bound to a claim

Bound - the volume is bound to a claim

Released - the claim has been deleted, but the associated storage resource is not yet reclaimed by the cluster

Failed - the volume has failed its (automated) reclamation
You can see the name of the PVC bound to the PV using kubectl describe persistentvolume <name>.
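For scripted checks, the bound claim can also be read from the PV's claimRef (a sketch; example-pv is a placeholder name):

```shell
# Print the namespace/name of the PVC bound to this PV.
kubectl get pv example-pv -o jsonpath='{.spec.claimRef.namespace}/{.spec.claimRef.name}'
```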
Phase transition timestamp
Kubernetes v1.31 [stable] (enabled by default)

The .status field for a PersistentVolume can include a lastPhaseTransitionTime field. This field records the timestamp of when the volume last transitioned its phase. For newly created volumes the phase is set to Pending and lastPhaseTransitionTime is set to the current time.
PersistentVolumeClaims
Each PVC contains a spec and status, which is the specification and status of the claim. The name of a PersistentVolumeClaim object must be a valid DNS subdomain name.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: slow
  selector:
    matchLabels:
      release: "stable"
    matchExpressions:
      - {key: environment, operator: In, values: [dev]}
```
Access Modes
Claims use the same conventions as volumes when requesting storage with specific access modes.
Volume Modes
Claims use the same convention as volumes to indicate the consumption of the volume as either a filesystem or block device.
Volume Name
Claims can use the volumeName field to explicitly bind to a specific PersistentVolume. You can also leave volumeName unset, indicating that you'd like Kubernetes to set up a new PersistentVolume that matches the claim. If the specified PV is already bound to another PVC, the binding will be stuck in a pending state.
Resources
Claims, like Pods, can request specific quantities of a resource. In this case,the request is for storage. The sameresource modelapplies to both volumes and claims.
Note:
For Filesystem volumes, the storage request refers to the "outer" volume size (i.e. the allocated size from the storage backend). This means that the writeable size may be slightly lower for providers that build a filesystem on top of a block device, due to filesystem overhead. This is especially visible with XFS, where many metadata features are enabled by default.

Selector
Claims can specify a label selector to further filter the set of volumes. Only the volumes whose labels match the selector can be bound to the claim. The selector can consist of two fields:
- matchLabels - the volume must have a label with this value
- matchExpressions - a list of requirements made by specifying key, list of values, and operator that relates the key and values. Valid operators include In, NotIn, Exists, and DoesNotExist.
All of the requirements, from both matchLabels and matchExpressions, are ANDed together – they must all be satisfied in order to match.
Class
A claim can request a particular class by specifying the name of a StorageClass using the attribute storageClassName. Only PVs of the requested class, ones with the same storageClassName as the PVC, can be bound to the PVC.
PVCs don't necessarily have to request a class. A PVC with its storageClassName set equal to "" is always interpreted to be requesting a PV with no class, so it can only be bound to PVs with no class (no annotation or one set equal to ""). A PVC with no storageClassName is not quite the same and is treated differently by the cluster, depending on whether the DefaultStorageClass admission plugin is turned on.
- If the admission plugin is turned on, the administrator may specify a default StorageClass. All PVCs that have no storageClassName can be bound only to PVs of that default. Specifying a default StorageClass is done by setting the annotation storageclass.kubernetes.io/is-default-class equal to true in a StorageClass object (an example is sketched below). If the administrator does not specify a default, the cluster responds to PVC creation as if the admission plugin were turned off. If more than one default StorageClass is specified, the newest default is used when the PVC is dynamically provisioned.
- If the admission plugin is turned off, there is no notion of a default StorageClass. All PVCs that have storageClassName set to "" can be bound only to PVs that have storageClassName also set to "". However, PVCs with missing storageClassName can be updated later once a default StorageClass becomes available. If the PVC gets updated it will no longer bind to PVs that have storageClassName also set to "".
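A sketch of a StorageClass annotated as the cluster default (the name and provisioner are placeholders):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-default-sc    # placeholder name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # marks this class as the default
provisioner: example.com/provisioner                      # placeholder provisioner
```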
Seeretroactive default StorageClass assignment for more details.
Depending on the installation method, a default StorageClass may be deployed to a Kubernetes cluster by an addon manager during installation.
When a PVC specifies a selector in addition to requesting a StorageClass, the requirements are ANDed together: only a PV of the requested class and with the requested labels may be bound to the PVC.
Note:
Currently, a PVC with a non-empty selector can't have a PV dynamically provisioned for it.

In the past, the annotation volume.beta.kubernetes.io/storage-class was used instead of the storageClassName attribute. This annotation is still working; however, it won't be supported in a future Kubernetes release.
Retroactive default StorageClass assignment
Kubernetes v1.28 [stable]

You can create a PersistentVolumeClaim without specifying a storageClassName for the new PVC, and you can do so even when no default StorageClass exists in your cluster. In this case, the new PVC is created as you defined it, and the storageClassName of that PVC remains unset until a default becomes available.
When a default StorageClass becomes available, the control plane identifies any existing PVCs without storageClassName. For the PVCs that either have an empty value for storageClassName or do not have this key, the control plane then updates those PVCs to set storageClassName to match the new default StorageClass. If you have an existing PVC where the storageClassName is "", and you configure a default StorageClass, then this PVC will not get updated.
In order to keep binding to PVs with storageClassName set to "" (while a default StorageClass is present), you need to set the storageClassName of the associated PVC to "".
This behavior helps administrators change the default StorageClass by removing the old one first and then creating or setting another one. This brief window while there is no default causes PVCs without storageClassName created at that time to not have any default, but due to the retroactive default StorageClass assignment this way of changing defaults is safe.
Claims As Volumes
Pods access storage by using the claim as a volume. Claims must exist in thesame namespace as the Pod using the claim. The cluster finds the claim in thePod's namespace and uses it to get the PersistentVolume backing the claim.The volume is then mounted to the host and into the Pod.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
```
A Note on Namespaces
PersistentVolume binds are exclusive, and since PersistentVolumeClaims are namespaced objects, mounting claims with "Many" modes (ROX, RWX) is only possible within one namespace.
PersistentVolumes typed hostPath
A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage. See an example of hostPath typed volume.
Raw Block Volume Support
Kubernetes v1.18 [stable]

The following volume plugins support raw block volumes, including dynamic provisioning where applicable:
- CSI (including some CSI migrated volume types)
- FC (Fibre Channel)
- iSCSI
- Local volume
PersistentVolume using a Raw Block Volume
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: block-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  persistentVolumeReclaimPolicy: Retain
  fc:
    targetWWNs: ["50060e801049cfd1"]
    lun: 0
    readOnly: false
```
PersistentVolumeClaim requesting a Raw Block Volume
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 10Gi
```
Pod specification adding Raw Block Device path in container
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: ["tail -f /dev/null"]
      volumeDevices:
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc
```
Note:
When adding a raw block device for a Pod, you specify the device path in the container instead of a mount path.

Binding Block Volumes
If a user requests a raw block volume by indicating this using the volumeMode field in the PersistentVolumeClaim spec, the binding rules differ slightly from previous releases that didn't consider this mode as part of the spec. Listed below is a table of possible combinations the user and admin might specify for requesting a raw block device. The table indicates if the volume will be bound or not given the combinations. Volume binding matrix for statically provisioned volumes:
| PV volumeMode | PVC volumeMode | Result |
|---|---|---|
| unspecified | unspecified | BIND |
| unspecified | Block | NO BIND |
| unspecified | Filesystem | BIND |
| Block | unspecified | NO BIND |
| Block | Block | BIND |
| Block | Filesystem | NO BIND |
| Filesystem | Filesystem | BIND |
| Filesystem | Block | NO BIND |
| Filesystem | unspecified | BIND |
Note:
Only statically provisioned volumes are supported for the alpha release. Administrators should take care to consider these values when working with raw block devices.

Volume Snapshot and Restore Volume from Snapshot Support
Kubernetes v1.20 [stable]

Volume snapshots only support the out-of-tree CSI volume plugins. For details, see Volume Snapshots. In-tree volume plugins are deprecated. You can read about the deprecated volume plugins in the Volume Plugin FAQ.
Create a PersistentVolumeClaim from a Volume Snapshot
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    name: new-snapshot-test
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```
Volume Cloning
Volume Cloning is only available for CSI volume plugins.
Create PersistentVolumeClaim from an existing PVC
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloned-pvc
spec:
  storageClassName: my-csi-plugin
  dataSource:
    name: existing-src-pvc-name
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```
Volume populators and data sources
Kubernetes v1.24 [beta]

Kubernetes supports custom volume populators. To use custom volume populators, you must enable the AnyVolumeDataSource feature gate for the kube-apiserver and kube-controller-manager.
Volume populators take advantage of a PVC spec field called dataSourceRef. Unlike the dataSource field, which can only contain either a reference to another PersistentVolumeClaim or to a VolumeSnapshot, the dataSourceRef field can contain a reference to any object in the same namespace, except for core objects other than PVCs. For clusters that have the feature gate enabled, use of the dataSourceRef is preferred over dataSource.
Cross namespace data sources
Kubernetes v1.26 [alpha]

Kubernetes supports cross namespace volume data sources. To use cross namespace volume data sources, you must enable the AnyVolumeDataSource and CrossNamespaceVolumeDataSource feature gates for the kube-apiserver and kube-controller-manager. Also, you must enable the CrossNamespaceVolumeDataSource feature gate for the csi-provisioner.
Enabling the CrossNamespaceVolumeDataSource feature gate allows you to specify a namespace in the dataSourceRef field.
Note:
When you specify a namespace for a volume data source, Kubernetes checks for a ReferenceGrant in the other namespace before accepting the reference. ReferenceGrant is part of the gateway.networking.k8s.io extension APIs. See ReferenceGrant in the Gateway API documentation for details. This means that you must extend your Kubernetes cluster with at least ReferenceGrant from the Gateway API before you can use this mechanism.

Data source references
The dataSourceRef field behaves almost the same as the dataSource field. If one is specified while the other is not, the API server will give both fields the same value. Neither field can be changed after creation, and attempting to specify different values for the two fields will result in a validation error. Therefore the two fields will always have the same contents.
There are two differences between the dataSourceRef field and the dataSource field that users should be aware of:
- The dataSource field ignores invalid values (as if the field was blank) while the dataSourceRef field never ignores values and will cause an error if an invalid value is used. Invalid values are any core object (objects with no apiGroup) except for PVCs.
- The dataSourceRef field may contain different types of objects, while the dataSource field only allows PVCs and VolumeSnapshots.
When the CrossNamespaceVolumeDataSource feature is enabled, there are additional differences:
- The dataSource field only allows local objects, while the dataSourceRef field allows objects in any namespaces.
- When namespace is specified, dataSource and dataSourceRef are not synced.
Users should always use dataSourceRef on clusters that have the feature gate enabled, and fall back to dataSource on clusters that do not. It is not necessary to look at both fields under any circumstance. The duplicated values with slightly different semantics exist only for backwards compatibility. In particular, a mixture of older and newer controllers are able to interoperate because the fields are the same.
Using volume populators
Volume populators are controllers that can create non-empty volumes, where the contents of the volume are determined by a Custom Resource. Users create a populated volume by referring to a Custom Resource using the dataSourceRef field:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: populated-pvc
spec:
  dataSourceRef:
    name: example-name
    kind: ExampleDataSource
    apiGroup: example.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```
Because volume populators are external components, attempts to create a PVC that uses one can fail if not all the correct components are installed. External controllers should generate events on the PVC to provide feedback on the status of the creation, including warnings if the PVC cannot be created due to some missing component.
You can install the alpha volume data source validator controller into your cluster. That controller generates warning Events on a PVC in the case that no populator is registered to handle that kind of data source. When a suitable populator is installed for a PVC, it's the responsibility of that populator controller to report Events that relate to volume creation and issues during the process.
Using a cross-namespace volume data source
Kubernetes v1.26 [alpha]

Create a ReferenceGrant to allow the namespace owner to accept the reference. You define a populated volume by specifying a cross namespace volume data source using the dataSourceRef field. You must already have a valid ReferenceGrant in the source namespace:
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-ns1-pvc
  namespace: default
spec:
  from:
    - group: ""
      kind: PersistentVolumeClaim
      namespace: ns1
  to:
    - group: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: new-snapshot-demo
```
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foo-pvc
  namespace: ns1
spec:
  storageClassName: example
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-demo
    namespace: default
  volumeMode: Filesystem
```
Writing Portable Configuration
If you're writing configuration templates or examples that run on a wide range of clustersand need persistent storage, it is recommended that you use the following pattern:
- Include PersistentVolumeClaim objects in your bundle of config (alongside Deployments, ConfigMaps, etc).
- Do not include PersistentVolume objects in the config, since the user instantiating the config may not have permission to create PersistentVolumes.
- Give the user the option of providing a storage class name when instantiatingthe template.
  - If the user provides a storage class name, put that value into the persistentVolumeClaim.storageClassName field. This will cause the PVC to match the right storage class if the cluster has StorageClasses enabled by the admin.
  - If the user does not provide a storage class name, leave the persistentVolumeClaim.storageClassName field as nil. This will cause a PV to be automatically provisioned for the user with the default StorageClass in the cluster. Many cluster environments have a default StorageClass installed, or administrators can create their own default StorageClass.
- In your tooling, watch for PVCs that are not getting bound after some time and surface this to the user, as this may indicate that the cluster has no dynamic storage support (in which case the user should create a matching PV) or the cluster has no storage system (in which case the user cannot deploy config requiring PVCs). A sketch of such a check follows this list.
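A simple sketch of such a check, listing claims that are still unbound:

```shell
# PVCs whose STATUS is Pending have not been bound; a long-lived Pending
# phase suggests no matching PV and no dynamic provisioning for the class.
kubectl get pvc --all-namespaces | grep Pending
```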
What's next
- Learn more about Creating a PersistentVolume.
- Learn more about Creating a PersistentVolumeClaim.
- Read the Persistent Storage design document.
API references
Read about the APIs described in this page: