Tech ONTAP Blogs
When working with stateful, data-rich applications in Kubernetes, you might run into situations where you need to move your persistent volumes (PVs) to a different storage back end, for example to achieve better performance, to lower cost, or to phase out old storage hardware. With dynamic provisioning, this means migrating your PVs to a different storage class. The data management capabilities of NetApp® Trident™ protect offer an easy and safe way to migrate persistent volumes to a different storage class while minimizing application downtime.
NetApp® Trident™ protect provides application-aware data protection, mobility, and disaster recovery for any workload running on any K8s distribution. Trident protect enables administrators to easily protect, back up, migrate, and create working clones of K8s applications, through either its CLI or its Kubernetes-native custom resource definitions (CRDs).
We use an NGINX application deployed on an Azure Kubernetes Service (AKS) cluster, with a persistent volume backed by Azure Disk via the AKS managed-csi storage class. We want to migrate this PV to Azure NetApp Files storage in the Standard performance tier with minimal downtime and effort. The corresponding storage class was already created when we installed and configured Trident on the cluster:
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azure-netapp-files-standard (default) csi.trident.netapp.io Delete Immediate true 2d
azurefile file.csi.azure.com Delete Immediate true 2d
azurefile-csi file.csi.azure.com Delete Immediate true 2d
azurefile-csi-premium file.csi.azure.com Delete Immediate true 2d
azurefile-premium file.csi.azure.com Delete Immediate true 2d
default disk.csi.azure.com Delete WaitForFirstConsumer true 2d
managed disk.csi.azure.com Delete WaitForFirstConsumer true 2d
managed-csi disk.csi.azure.com Delete WaitForFirstConsumer true 2d
managed-csi-premium disk.csi.azure.com Delete WaitForFirstConsumer true 2d
managed-premium disk.csi.azure.com Delete WaitForFirstConsumer true 2d
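For reference, a Trident storage class for Azure NetApp Files typically looks like the following sketch. The parameters shown are illustrative and depend on how your Trident back end is configured, so check them against your own setup:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-netapp-files-standard
provisioner: csi.trident.netapp.io   # the Trident CSI provisioner
parameters:
  backendType: azure-netapp-files    # selects Trident's ANF driver
  # Additional parameters (for example, a selector for a service level
  # label on the back end) depend on your back-end configuration.
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
```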
Here is the Kubernetes configuration of the NGINX application in the namespace web-ad:
$ kubectl get all,pvc -n web-ad
NAME READY STATUS RESTARTS AGE
pod/web-64cdb84b99-sdfff 1/1 Running 0 20h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 1/1 1 1 22h
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-64cdb84b99 1 1 1 22h
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/nginxdata Bound pvc-6ec8b2c0-bd05-4b50-b1bb-3ad970855a4d 2Gi RWO managed-csi <unset> 22h
Let’s add some random data to the application’s persistent volume:
$ for i in {1..5}; do kubectl -n web-ad exec -it pod/web-64cdb84b99-sdfff -- dd if=/dev/urandom of=/data/file${i} bs=1024k count=10; done
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0498185 s, 210 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0608314 s, 172 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0531475 s, 197 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0577863 s, 181 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0535681 s, 196 MB/s
$ kubectl -n web-ad exec -it pod/web-64cdb84b99-sdfff -- ls -l /data
total 51216
-rw-r--r-- 1 root root 10485760 Apr 17 11:40 file1
-rw-r--r-- 1 root root 10485760 Apr 17 11:40 file2
-rw-r--r-- 1 root root 10485760 Apr 17 11:40 file3
-rw-r--r-- 1 root root 10485760 Apr 17 11:40 file4
-rw-r--r-- 1 root root 10485760 Apr 17 11:40 file5
drwx------ 2 root root 16384 Apr 16 13:24 lost+found
$ kubectl -n web-ad exec -it pod/web-64cdb84b99-sdfff -- df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/sdc 2.0G 51M 1.9G 3% /data
Trident and Trident protect are already installed and configured on the cluster, so we can create the corresponding Trident protect application web-ad right away:
$ tridentctl-protect create app web-ad --namespaces web-ad -n web-ad
Application "web-ad" created.
$ tridentctl-protect get application -n web-ad
+--------+------------+-------+-----+
| NAME | NAMESPACES | STATE | AGE |
+--------+------------+-------+-----+
| web-ad | web-ad | Ready | 19s |
+--------+------------+-------+-----+
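Because Trident protect is driven entirely by Kubernetes custom resources, the same application definition can also be declared as a CR instead of using the CLI. The following is a sketch based on our understanding of the Trident protect CRDs; verify the API version and field names against the CRDs installed in your cluster:

```yaml
apiVersion: protect.trident.netapp.io/v1
kind: Application
metadata:
  name: web-ad
  namespace: web-ad
spec:
  includedNamespaces:
    # All resources in this namespace belong to the application
    - namespace: web-ad
```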
The next sections walk you through two slightly different scenarios that explain how to migrate the PV of the NGINX application to a different storage class with minimum application downtime.
Option 1: Migrating the application to a new namespace

This option uses backup and restore with Trident protect to clone the NGINX application into a new namespace (web-clone in our example) and into a new storage class. If you don't need to keep the application in its original namespace, this is the easiest, fastest, and safest way to migrate to a new storage class, because the original application stays in place for an easy failback in the unlikely event that an error occurs.
First, we stop application traffic by scaling the web deployment down to zero replicas:
$ kubectl -n web-ad scale deployment.apps/web --replicas=0
deployment.apps/web scaled
$ kubectl get all,pvc -n web-ad
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 0/0 0 0 104m
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-64cdb84b99 0 0 0 104m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/nginxdata Bound pvc-6ec8b2c0-bd05-4b50-b1bb-3ad970855a4d 2Gi RWO managed-csi <unset> 104m
Then we create a Trident protect backup, wait for it to complete with the tridentctl-protect wait command, and immediately restore from the backup into the new namespace web-clone and the target storage class azure-netapp-files-standard, using the --storageclass-mapping option of tridentctl-protect:
$ tridentctl-protect create backup web-bkp --appvault demo --app web-ad -n web-ad; tridentctl-protect wait backup web-bkp -n web-ad; tridentctl-protect create backuprestore --backup web-ad/web-bkp --namespace-mapping web-ad:web-clone --storageclass-mapping managed-csi:azure-netapp-files-standard -n web-clone
Backup "web-bkp" created.
Waiting for resource to be in final state: 0s
Resource is in final state: Completed
BackupRestore "web-ad-vrqjv3" created.
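The restore step can likewise be declared as a BackupRestore custom resource. The sketch below reflects our understanding of the Trident protect CRDs; the appArchivePath value is not invented here but must be copied from the backup's status, and all field names should be checked against the CRDs in your cluster:

```yaml
apiVersion: protect.trident.netapp.io/v1
kind: BackupRestore
metadata:
  name: web-restore
  namespace: web-clone          # restore target namespace
spec:
  appVaultRef: demo             # object storage vault holding the backup
  # Copy this value from .status.appArchivePath of the backup CR:
  appArchivePath: <appArchivePath-of-web-bkp>
  namespaceMapping:
    - source: web-ad
      destination: web-clone
  storageClassMapping:
    - source: managed-csi
      destination: azure-netapp-files-standard
```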
The restored app comes up quickly in the target namespace:
$ kubectl get all,pvc -n web-clone
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 0/0 0 0 4m15s
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-64cdb84b99 0 0 0 4m15s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/nginxdata Bound pvc-03c8efd6-c68f-4927-b035-00e4d0809948 50Gi RWO azure-netapp-files-standard <unset> 4m17s
The last step after the clone operation is to start NGINX again in the new namespace by scaling up the web deployment:
$ kubectl -n web-clone scale deployment.apps/web --replicas=1
deployment.apps/web scaled
$ kubectl get all,pvc -n web-clone
NAME READY STATUS RESTARTS AGE
pod/web-64cdb84b99-d2c84 1/1 Running 0 11s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 1/1 1 1 4m48s
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-64cdb84b99 1 1 1 4m48s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/nginxdata Bound pvc-03c8efd6-c68f-4927-b035-00e4d0809948 50Gi RWO azure-netapp-files-standard <unset> 4m51s
We confirm that the persistent data has been successfully copied to the azure-netapp-files-standard storage class:
$ kubectl -n web-clone exec -it pod/web-64cdb84b99-d2c84 -- ls -l /data
total 51444
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file1
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file2
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file3
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file4
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file5
drwx------ 2 nobody nogroup 4096 Apr 16 13:24 lost+found
$ kubectl -n web-clone exec -it pod/web-64cdb84b99-d2c84 -- df -h /data
Filesystem Size Used Avail Use% Mounted on
10.21.2.4:/pvc-03c8efd6-c68f-4927-b035-00e4d0809948 50G 51M 50G 1% /data
Option 2: Keeping the application in the same namespace

Use this approach if your workflows require the application to remain in the same K8s namespace after the storage class migration. In this case, we must delete the original namespace before we can restore an application backup into the same namespace and into a different storage class. Again, we migrate using the Trident protect CLI.
Deleting the source namespace also deletes the Trident protect custom resources in that namespace, including the backup CR (but not the actual backup in the object storage, which is kept under the default reclaim policy of Retain for backups). We therefore need to find and save the path of the backup in the object storage archive before deleting the namespace. With this appArchivePath value available, we can restore from the object storage archive without the backup CR. To make the steps less error prone, we use this little script:
$ cat backuprestore-scmig.sh
#!/bin/bash
#
APPVAULT=demo
APP=web-ad
APPNS=web-ad
APPSC=managed-csi
CLONE=web-clone
CLONENS=web-clone
CLONESC=azure-netapp-files-standard
BKUPNAME=web-bkp
# Create backup
tridentctl-protect create backup ${BKUPNAME} --appvault ${APPVAULT} --app ${APP} --reclaim-policy Retain -n ${APPNS}
# Wait for backup to finish
tridentctl-protect wait backup ${BKUPNAME} -n ${APPNS}
# Check if backup succeeded
BKUPSTATE=$(kubectl -n ${APPNS} get backup ${BKUPNAME} -o yaml | yq '.status.state')
if [[ $BKUPSTATE != "Completed" ]]
then
printf "Backup didn't complete successfully, exiting. \n"
exit 10
fi
# Get APPARCHIVEPATH
APPARCHIVEPATH=$(kubectl -n ${APPNS} get backup ${BKUPNAME} -o yaml | yq '.status.appArchivePath')
# Delete app namespace
kubectl delete ns ${APPNS}
# Run BackupRestore with Storage Class mapping:
tridentctl-protect create backuprestore --appvault ${APPVAULT} --path ${APPARCHIVEPATH} --namespace-mapping ${APPNS}:${APPNS} --storageclass-mapping ${APPSC}:${CLONESC} -n ${APPNS}
In more detail, the script creates a backup with the reclaim policy Retain, waits for the backup to finish, aborts if the backup didn't complete successfully, saves the backup's appArchivePath, deletes the application namespace, and finally restores from the object storage archive with the storage class mapping applied. Let's run the script:
$ sh ./backuprestore-scmig.sh
Backup "web-bkp" created.
Waiting for resource to be in final state: 0s
Resource is in final state: Completed
namespace "web-ad" deleted
BackupRestore "web-ad-b5sr14" created.
Once the restore is complete, we start NGINX in the web-ad namespace by scaling up the web deployment:
$ kubectl -n web-ad scale deployment.apps/web --replicas=1
deployment.apps/web scaled
$ kubectl get all,pvc -n web-ad
NAME READY STATUS RESTARTS AGE
pod/web-64cdb84b99-6vrhm 1/1 Running 0 21s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/web 1/1 1 1 10m
NAME DESIRED CURRENT READY AGE
replicaset.apps/web-64cdb84b99 1 1 1 10m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
persistentvolumeclaim/nginxdata Bound pvc-027a8609-c9de-47b1-aae9-5061b959ccc0 50Gi RWO azure-netapp-files-standard <unset> 10m
Finally, we check for the successful restore of the persistent data to the new storage class:
$ kubectl -n web-ad exec -it pod/web-64cdb84b99-6vrhm -- ls -l /data
total 51444
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file1
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file2
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file3
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file4
-rw-r--r-- 1 nobody nogroup 10485760 Apr 17 11:40 file5
drwx------ 2 nobody nogroup 4096 Apr 16 13:24 lost+found
$ kubectl -n web-ad exec -it pod/web-64cdb84b99-6vrhm -- df -h /data
Filesystem Size Used Avail Use% Mounted on
10.21.2.4:/pvc-027a8609-c9de-47b1-aae9-5061b959ccc0 50G 51M 50G 1% /data
When you need to migrate the data of your data-rich Kubernetes applications between storage classes, the data management capabilities of NetApp Trident protect offer an easy and safe way to migrate persistent volumes to a different storage class with minimal application downtime.
In this blog post we demonstrated two different ways of migrating a stateful K8s application to a different storage class with Trident protect, depending on whether your application can be migrated to a different namespace or needs to remain in the same namespace after the storage migration.