Kubernetes Pod filesystems are ephemeral by default. This is in keeping with the stateless nature of containers. Persistent data should be stored outside the container, even when it looks like it’s within the container’s filesystem. Here’s how to provision persistent storage in Kubernetes.

The basic unit of Kubernetes persistent storage is a Persistent Volume. This is an abstraction over the more fundamental Volume.

Persistent Volumes exist independently of any specific Pod. Similarly to plain Docker volumes, Kubernetes’ Persistent Volumes can remain in your cluster even when there are no Pods using them.

Pods gain access to Persistent Volumes through a Persistent Volume Claim. This is another resource type which represents a request for persistent storage. Kubernetes satisfies the claim by binding it to an existing Persistent Volume that meets the request or, where the storage class supports it, by provisioning a new one.

A Basic Example

Let’s look at how to create a persistent storage system by manually setting up a Persistent Volume and Persistent Volume Claim. Each resource will go into its own manifest file. You can apply these files to your cluster with kubectl apply.

Create a Persistent Volume

Begin by creating your volume:
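A minimal sketch of such a manifest follows. The name, capacity, path and storage class match the description below. The hostPath volume type and the ReadWriteOnce access mode are assumptions chosen to suit a simple single-node test cluster.

```yaml
# pv.yaml (file name is illustrative)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  storageClassName: manual   # must match the class requested by the claim
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce          # assumption: single-node read-write access
  hostPath:
    path: /mnt/data          # directory on the host Node
```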

This definition creates a volume called my-volume. It has a capacity of 2Gi and will be stored at /mnt/data on the host Node. Because we’re creating this volume manually, the storageClassName is set to manual. Storage classes can be used to mandate that volumes are only bound to volume claims requesting the same class.

Create a Persistent Volume Claim

You can now configure a Persistent Volume Claim:
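A claim along these lines would fit the description below. The claim name my-volume-claim is a hypothetical placeholder, and the ReadWriteOnce access mode is an assumption. The requested access mode has to be one that the volume offers.

```yaml
# pvc.yaml (file name is illustrative)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-volume-claim      # hypothetical name, referenced later by Pods
spec:
  storageClassName: manual   # only volumes of this class are considered
  accessModes:
    - ReadWriteOnce          # assumption: must be offered by the volume
  resources:
    requests:
      storage: 1Gi           # any matching volume with at least 1Gi can bind
```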

The claim requests 1Gi of storage from a volume using the manual class. The volume we created earlier can fulfil these conditions. When the claim is created, Kubernetes should realise this and bind the claim to the volume.

If you inspect the volume and claim, for example with kubectl get pv and kubectl get pvc, you’ll see they both show a status of Bound.

Add a Pod

The final stage is to use your volume claim to add persistent storage to a Pod.
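One possible Pod manifest is sketched here. The Pod name, container name, image, volume name and claim name are all illustrative assumptions. Only the mount path /path/in/container comes from the text.

```yaml
# pod.yaml (file name is illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-storage             # hypothetical Pod name
spec:
  containers:
    - name: app                      # hypothetical container name
      image: nginx:latest            # any image works; nginx is a placeholder
      volumeMounts:
        - name: storage              # must match the name under volumes
          mountPath: /path/in/container
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: my-volume-claim   # hypothetical; use your claim's metadata.name
```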

Within the volumes section, a reference to the Persistent Volume Claim is configured. You don’t need to specify any other information about the volume. The Pod will use the claim, which will provide the volume it’s bound to.

The volume is then referenced by name in the container’s volumeMounts. Make sure you use the same name in volumes and volumeMounts. The volume will be mounted into your Pod at the location specified by mountPath.

Your Pod now has persistent storage available. Anything written to /path/in/container will be stored to the Persistent Volume. The Persistent Volume Claim will be reused by new Pods that reference it, allowing data to outlive any individual Pod.

Storage Classes

The manual storage class used above is simply a label that ties a hand-written volume to the claims that should bind to it. No provisioner sits behind it. Storage drivers typically ship their own storage classes, which can create volumes on demand. Reference the storage class that represents the volume type you want to use.

Managed Kubernetes services usually provide their own storage classes which map to the platform’s block storage implementation. Examples include standard-rwo on Google Kubernetes Engine, which is backed by Compute Engine persistent disks, and do-block-storage on DigitalOcean Managed Kubernetes.

In these scenarios, you don’t need to create the PersistentVolume manifest manually. Create a PersistentVolumeClaim with the correct storageClassName and use the resources.requests.storage field to specify the desired capacity. The storage class’s provisioner will then create a compatible volume and bind the claim to it automatically.
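As a rough illustration, a dynamically provisioned claim using the do-block-storage class mentioned above might look like this. The claim name and the 10Gi size are assumptions. Swap the class name for whichever one your platform provides.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim                 # hypothetical name
spec:
  storageClassName: do-block-storage  # platform-provided storage class
  accessModes:
    - ReadWriteOnce                   # block storage is typically single-node
  resources:
    requests:
      storage: 10Gi                   # assumption: desired capacity
```

No PersistentVolume manifest accompanies this claim, because the provisioner creates the volume itself.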

Access Modes

There are four supported values for the accessModes field:

ReadWriteOnce – The volume can only be mounted to a single Kubernetes node. That node will have full read-write access to the volume, and several Pods running on it can share the volume.

ReadOnlyMany – The volume can be consumed by multiple nodes simultaneously. Each node has read-only access (nothing can write to the volume).

ReadWriteMany – The volume can be mounted to multiple nodes simultaneously. Each node can read and write to the volume.

ReadWriteOncePod – The volume can be mounted read-write by a single Pod across the whole cluster. This mode is only supported for CSI volumes on Kubernetes 1.22 and later.

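A volume can only be mounted using one access mode at a time, even if it supports several. A Persistent Volume also binds to exactly one claim, so Pods that need to share storage should reference the same claim rather than rely on separate claims binding to the same volume.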

The access mode of your volumes affects the Kubernetes scheduler’s ability to spread replicas of your Pods across multiple nodes. Use ReadOnlyMany or ReadWriteMany if you need Pods to share persistent storage while being replicated over multiple nodes.
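To make this concrete, here is a hedged sketch of a Deployment whose replicas share a single ReadWriteMany claim. Every name in it is hypothetical, including the example-rwx-class storage class, which stands in for whatever class in your cluster supports ReadWriteMany.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-claim                     # hypothetical name
spec:
  storageClassName: example-rwx-class    # placeholder: a class that supports ReadWriteMany
  accessModes:
    - ReadWriteMany                      # lets Pods on different nodes read and write
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shared-storage-demo              # hypothetical name
spec:
  replicas: 3                            # replicas may land on different nodes
  selector:
    matchLabels:
      app: shared-storage-demo
  template:
    metadata:
      labels:
        app: shared-storage-demo
    spec:
      containers:
        - name: app
          image: nginx:latest            # placeholder image
          volumeMounts:
            - name: shared
              mountPath: /path/in/container
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-claim      # every replica references the same claim
```

With a ReadWriteOnce claim in the same position, replicas scheduled onto a second node would generally be unable to mount the volume.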

Be aware that not all storage drivers support all access modes – you should check with your plugin’s provider. A non-exhaustive list of volume plugins and compatible access modes is provided in the Kubernetes documentation.

Conclusion

Persistent storage in Kubernetes isn’t as daunting as it seems at first glance. Make sure Pods which need access to storage have volumes which reference a Persistent Volume Claim.

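When Persistent Volume Claims are used, they bind to Persistent Volumes which outlive individual Pods. Those volumes are either created by you or provisioned automatically by a storage class. When your Pods are replaced, the claimed volumes will be automatically mounted into the new Pods. Data persists at least until the claim is deleted. What happens to it afterwards depends on the volume’s reclaim policy.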