In a recent series of blog posts, we introduced NetApp® Trident™ protect, its advanced application data management (ADM) and protection capabilities for stateful Kubernetes applications, and the new Trident protect CLI. We discussed how its Kubernetes-native custom resources (CRs) let you integrate application protection into your automation and deployment workflows by using either manifests or the Trident protect CLI. In addition, we explored how Trident protect can seamlessly integrate into your GitOps workflow.
Most recently, another Community blog post delved into how NetApp Trident protect, built on top of the NetApp ONTAP® SnapMirror® feature, can empower your business to achieve seamless application mobility and disaster recovery (DR) for mission-critical applications. As a result, your organization can attain very low recovery point objectives (RPOs) and recovery time objectives (RTOs).
However, not all workloads require an RTO and RPO of minutes or less. For those workloads, backup and restore can still be the most appropriate strategy, because it is the simplest and least expensive to implement. By replicating your backup data to another data center or region, you can effectively manage large-scale disasters. If a disaster prevents your workload from operating in a region, the workload can be restored to a recovery region or data center and can continue operations from there.
When using public cloud Kubernetes services, you can even use an account or a subscription that differs from the one in your primary region, with distinct credentials. This approach can prevent human error or malicious actions in one region from affecting another, enabling you to recover your services even if, for example, a ransomware attack compromises your primary account. Replicating your backup data from an on-premises object storage bucket to a cloud-based object storage bucket and then restoring your services in the cloud during a disaster is another effective strategy to protect your business-critical applications.
You might decide, for example, to store only replicated backup data in the cloud while keeping your production environment in your own data center. With this hybrid approach, you still gain the advantages of scalability and geographic distance without having to move your production environment. In a cloud-to-cloud model, both production and DR are in the cloud, although at different sites and in different subscriptions to maintain enough physical and logical separation.
NetApp Trident protect provides advanced application data management capabilities that enhance the functionality and availability of stateful Kubernetes applications supported by NetApp ONTAP storage systems and the NetApp Trident Container Storage Interface (CSI) storage provisioner. It is compatible with a wide range of fully managed and self-managed Kubernetes offerings (see the supported Kubernetes distributions and storage back ends), making it an optimal solution for protecting your Kubernetes services across various platforms and regions.
In this blog post, I show you how to combine Trident protect backup and restore with bucket replication between two different regions, with clusters and buckets hosted on Amazon Web Services (AWS). However, other supported clusters and object storage solutions work the same way. For example, the NetApp StorageGRID® object-based storage solution provides the CloudMirror replication service for its S3 buckets.
This blog post walks through a basic backup and restore workflow of a sample application with persistent data running on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster backed by Amazon FSx for NetApp ONTAP storage. The Trident protect backups are stored in an Amazon S3 bucket in the eu-west-1 region and are replicated to an Amazon S3 bucket in eu-central-1. From there, we restore the sample application into another Amazon EKS cluster in the eu-central-1 region.
If you plan to follow the process in this blog post step by step, you must have the following available:
- Two Kubernetes clusters in different regions (in our example, Amazon EKS clusters in eu-west-1 and eu-central-1) with NetApp Trident installed and persistent storage from a supported back end (here, Amazon FSx for NetApp ONTAP)
- Trident protect and the Trident protect CLI installed on both clusters
- Two object storage buckets (here, Amazon S3 buckets) with replication configured between them, as described in the next section
- The AWS CLI, configured with credentials that allow you to manage the buckets and their replication
To set up the Amazon S3 buckets and the replication between the buckets for our sample environment, we follow this example AWS walkthrough. We create two Amazon S3 buckets, pu-repl-source in the eu-west-1 region and pu-repl-dest in the eu-central-1 region. In the AWS console, we confirm the created replication configuration, as shown in Figure 1.
Figure 1) Replication configuration between buckets pu-repl-source and pu-repl-dest.
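If you prefer scripting the setup to clicking through the console, the replication configuration comes down to a few AWS CLI calls. The following is a minimal sketch of the walkthrough's steps; the IAM role ARN and account ID are placeholders, and the role must already exist with the required replication permissions. Versioning must be enabled on both buckets, and we enable delete marker replication so that deletions propagate to the destination bucket:
$ aws s3api put-bucket-versioning --bucket pu-repl-source --versioning-configuration Status=Enabled
$ aws s3api put-bucket-versioning --bucket pu-repl-dest --versioning-configuration Status=Enabled
$ cat replication.json
{
  "Role": "arn:aws:iam::123456789012:role/pu-repl-role",
  "Rules": [
    {
      "ID": "pu-repl-rule",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "DeleteMarkerReplication": { "Status": "Enabled" },
      "Destination": { "Bucket": "arn:aws:s3:::pu-repl-dest" }
    }
  ]
}
$ aws s3api put-bucket-replication --bucket pu-repl-source --replication-configuration file://replication.json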
Let’s quickly test the configured bucket replication by using the AWS CLI. We list the two buckets and confirm that they are empty.
$ aws s3 ls | grep pu
2025-03-27 15:24:27 pu-repl-dest
2025-03-27 17:07:10 pu-repl-source
$ aws s3 ls s3://pu-repl-source
$ aws s3 ls s3://pu-repl-dest
Now we upload a random image file to the source bucket pu-repl-source and confirm that it’s replicated to the destination bucket pu-repl-dest.
$ aws s3 cp ~/Downloads/Image.jpeg s3://pu-repl-source
upload: Downloads/Image.jpeg to s3://pu-repl-source/Image.jpeg
$ aws s3 ls s3://pu-repl-source
2025-03-28 11:24:49 379931 Image.jpeg
$ aws s3 ls s3://pu-repl-dest
2025-03-28 11:24:49 379931 Image.jpeg
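Replication is asynchronous, so an object can take a few moments to show up in the destination bucket. If you want to check the state explicitly, the source object carries a replication status that you can query:
$ aws s3api head-object --bucket pu-repl-source --key Image.jpeg --query ReplicationStatus
"COMPLETED"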
Because delete marker replication is enabled in our replication rule, deleting the object from the source bucket also removes it from the destination bucket.
$ aws s3 rm s3://pu-repl-source/Image.jpeg
delete: s3://pu-repl-source/Image.jpeg
$ aws s3 ls s3://pu-repl-source
$ aws s3 ls s3://pu-repl-dest
For the tests, we use two Amazon EKS clusters in the same AWS regions as the Amazon S3 buckets. Both clusters have persistent storage backed by Amazon FSx for NetApp ONTAP and provisioned through NetApp Trident, and both provide the following storage classes, backed by the respective Trident back ends.
$ kubectl get sc
NAME                        PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
fsx-netapp-block            csi.trident.netapp.io   Delete          Immediate              true                   12m
fsx-netapp-file (default)   csi.trident.netapp.io   Delete          Immediate              true                   12m
gp2                         kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  32m
$ tridentctl get backends
+-----------------------+----------------+--------------------------------------+--------+------------+---------+
| NAME | STORAGE DRIVER | UUID | STATE | USER-STATE | VOLUMES |
+-----------------------+----------------+--------------------------------------+--------+------------+---------+
| backend-fsx-ontap-nas | ontap-nas | fa49d82c-8600-4114-be54-49947ebbe80a | online | normal | 0 |
| backend-fsx-ontap-san | ontap-san | a0e43009-b193-40c8-af66-24ad21c65dfb | online | normal | 0 |
+-----------------------+----------------+--------------------------------------+--------+------------+---------+
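If you're building a comparable environment, the Trident back ends and storage classes can be defined declaratively. Here is a minimal sketch for the NAS back end and its storage class only; the management LIF, SVM name, and credentials secret are placeholders that you must replace with the values from your FSx for NetApp ONTAP file system:
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-fsx-ontap-nas
  namespace: trident
spec:
  version: 1
  storageDriverName: ontap-nas
  managementLIF: <FSX_MANAGEMENT_LIF>
  svm: <SVM_NAME>
  credentials:
    name: <FSX_CREDENTIALS_SECRET>
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fsx-netapp-file
provisioner: csi.trident.netapp.io
parameters:
  backendType: ontap-nas
allowVolumeExpansion: true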
For our testing purposes, we deploy a simple Alpine container with a persistent volume that’s backed by Amazon FSx for NetApp ONTAP storage on the Amazon EKS cluster eks-source-cluster in the namespace alpine.
$ kubectl apply -f - <<EOF
> apiVersion: v1
> kind: Namespace
> metadata:
>   name: alpine
>   labels:
>     app: alpine
> ---
> apiVersion: apps/v1
> kind: Deployment
> metadata:
>   labels:
>     app: alpine
>   name: alpine
>   namespace: alpine
> spec:
>   replicas: 1
>   selector:
>     matchLabels:
>       app: alpine
>   strategy: {}
>   template:
>     metadata:
>       labels:
>         app: alpine
>     spec:
>       containers:
>         - image: alpine:latest
>           name: alpine-container
>           command: ["/bin/sh", "-c", "sleep infinity"] # Keep the container running
>           volumeMounts:
>             - mountPath: /data
>               name: data
>       volumes:
>         - name: data
>           persistentVolumeClaim:
>             claimName: alpinedata
> ---
> apiVersion: v1
> kind: PersistentVolumeClaim
> metadata:
>   name: alpinedata
>   namespace: alpine
> spec:
>   accessModes:
>     - ReadWriteMany
>   resources:
>     requests:
>       storage: 2Gi
>   storageClassName: fsx-netapp-file
> EOF
namespace/alpine created
deployment.apps/alpine created
persistentvolumeclaim/alpinedata created
$ kubectl get all,pvc -n alpine
NAME                          READY   STATUS    RESTARTS   AGE
pod/alpine-5bdb97fb48-f9wgr   1/1     Running   0          78s

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/alpine   1/1     1            1           79s

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/alpine-5bdb97fb48   1         1         1       79s

NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/alpinedata   Bound    pvc-d5c38879-83f5-49b0-9ad9-667792dd3154   2Gi        RWX            fsx-netapp-file   <unset>                 79s
Let’s also add some random data files to the persistent volume.
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-f9wgr -- df -h /data
Filesystem Size Used Available Use% Mounted on
198.19.255.230:/trident_pvc_d5c38879_83f5_49b0_9ad9_667792dd3154
2.0G 768.0K 2.0G 0% /data
$ for i in 1 2 3 4 5; do kubectl -n alpine exec -it pod/alpine-5bdb97fb48-f9wgr -- dd if=/dev/urandom of=/data/file${i} bs=1024k count=100; done
100+0 records in
100+0 records out
104857600 bytes (100.0MB) copied, 1.169176 seconds, 85.5MB/s
100+0 records in
100+0 records out
104857600 bytes (100.0MB) copied, 1.169829 seconds, 85.5MB/s
100+0 records in
100+0 records out
104857600 bytes (100.0MB) copied, 1.165403 seconds, 85.8MB/s
100+0 records in
100+0 records out
104857600 bytes (100.0MB) copied, 1.167324 seconds, 85.7MB/s
100+0 records in
100+0 records out
104857600 bytes (100.0MB) copied, 1.175639 seconds, 85.1MB/s
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-f9wgr -- df -h /data
Filesystem Size Used Available Use% Mounted on
198.19.255.230:/trident_pvc_d5c38879_83f5_49b0_9ad9_667792dd3154
2.0G 509.2M 1.5G 25% /data
Before we can protect the sample application with Trident protect on the primary cluster, we need to create the AppVault CR that provides Trident protect with the access details for the Amazon S3 bucket storing the backup data. First, we store the Amazon S3 access credentials for the source bucket pu-repl-source in the secret pu-repl-source-secret in the trident-protect namespace.
$ kubectl create secret generic pu-repl-source-secret --from-literal=accessKeyID=<REDACTED> --from-literal=secretAccessKey=<REDACTED> -n trident-protect
secret/pu-repl-source-secret created
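With the secret in place, we create the AppVault CR pu-repl-source with the Trident protect CLI, pointing Trident protect at the source bucket in the eu-west-1 region (the same pattern we use later for the destination bucket):
$ tridentctl-protect create appvault AWS pu-repl-source --bucket pu-repl-source --secret pu-repl-source-secret --endpoint s3.eu-west-1.amazonaws.com -n trident-protect
AppVault "pu-repl-source" created.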
Now we can create the alpine Trident protect application by defining the complete alpine namespace as an application in Trident protect.
$ tridentctl-protect create application alpine --namespaces alpine -n alpine
Application "alpine" created.
$ tridentctl-protect get application -A
+-----------+--------+------------+-------+-----+
| NAMESPACE | NAME | NAMESPACES | STATE | AGE |
+-----------+--------+------------+-------+-----+
| alpine | alpine | alpine | Ready | 14s |
+-----------+--------+------------+-------+-----+
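Because every Trident protect CLI command creates a Kubernetes-native CR under the hood, the same application definition can also be kept as a manifest in a Git repository, as discussed in our GitOps blog post. The equivalent Application CR looks roughly like this (verify the schema against the CRDs of your installed Trident protect version):
apiVersion: protect.trident.netapp.io/v1
kind: Application
metadata:
  name: alpine
  namespace: alpine
spec:
  includedNamespaces:
    - namespace: alpine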
To regularly protect the application, we also create a protection schedule, making hourly backups to the AppVault pu-repl-source and retaining the last three backups and snapshots, again using the Trident protect CLI.
$ tridentctl-protect create schedule --app alpine --appvault pu-repl-source --snapshot-retention 3 --backup-retention 3 --granularity Hourly --minute 10 -n alpine
Schedule "alpine-x7ye33" created.
$ tridentctl-protect get schedules -A
+-----------+---------------+--------+---------------+---------+-------+-------+-----+
| NAMESPACE | NAME | APP | SCHEDULE | ENABLED | STATE | ERROR | AGE |
+-----------+---------------+--------+---------------+---------+-------+-------+-----+
| alpine | alpine-x7ye33 | alpine | Hourly:min=10 | true | | | 4s |
+-----------+---------------+--------+---------------+---------+-------+-------+-----+
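You don't have to wait for the schedule to fire: if you want a baseline backup right away, you can also trigger one on demand with the Trident protect CLI (the backup name is arbitrary):
$ tridentctl-protect create backup alpine-initial --app alpine --appvault pu-repl-source -n alpine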
After the first backup is complete, we check the content of the AppVault pu-repl-source.
$ tridentctl-protect get backup -n alpine
+-----------------------------+--------+----------------+-----------+-------+-------+
| NAME | APP | RECLAIM POLICY | STATE | ERROR | AGE |
+-----------------------------+--------+----------------+-----------+-------+-------+
| hourly-54241-20250401141000 | alpine | Retain | Completed | | 3m52s |
+-----------------------------+--------+----------------+-----------+-------+-------+
$ tridentctl-protect get appvaultcontent pu-repl-source
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
| CLUSTER | APP | TYPE | NAME | NAMESPACE | TIMESTAMP |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
| eks-source-cluster | alpine | backup | hourly-54241-20250401141000 | alpine | 2025-04-01 14:11:50 (UTC) |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
On our destination Amazon EKS cluster eks-dest-cluster in the eu-central-1 region, we configure Trident protect to access the replicated backup content in the destination Amazon S3 bucket pu-repl-dest through the AppVault CR pu-repl-dest. After creating the secret pu-repl-dest-secret with the access credentials for the destination bucket, we create the AppVault CR with the Trident protect CLI, allowing Trident protect to access the bucket.
$ kubectl create secret generic pu-repl-dest-secret --from-literal=accessKeyID=<REDACTED> --from-literal=secretAccessKey=<REDACTED> -n trident-protect
secret/pu-repl-dest-secret created
$ tridentctl-protect create appvault AWS pu-repl-dest --bucket pu-repl-dest --secret pu-repl-dest-secret --endpoint s3.eu-central-1.amazonaws.com -n trident-protect
AppVault "pu-repl-dest" created.
$ tridentctl-protect get appvault --show-full-error
+--------------+----------+-----------+-------+---------+-------+
| NAME | PROVIDER | STATE | ERROR | MESSAGE | AGE |
+--------------+----------+-----------+-------+---------+-------+
| pu-repl-dest | AWS | Available | | | 8m |
+--------------+----------+-----------+-------+---------+-------+
Now we can check the content of the replicated bucket by using the get appvaultcontent command of the Trident protect CLI.
$ tridentctl-protect get appvaultcontent pu-repl-dest
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
| CLUSTER | APP | TYPE | NAME | NAMESPACE | TIMESTAMP |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
| eks-source-cluster | alpine | backup | hourly-54241-20250401141000 | alpine | 2025-04-01 14:11:50 (UTC) |
| eks-source-cluster | alpine | backup | hourly-54241-20250401151000 | alpine | 2025-04-01 15:11:38 (UTC) |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+
In the meantime, the protection schedule on the primary cluster, eks-source-cluster, has taken a second hourly backup of the sample application, so we now see two backups in the replicated bucket and can start a restore test.
We test a restore on the DR cluster eks-dest-cluster with the most recent backup, hourly-54241-20250401151000, available in the replicated Amazon S3 bucket pu-repl-dest. To use the create backuprestore command on the DR cluster, we first need to determine the path of the backup archive in the AppVault by using the --show-paths option of the get appvaultcontent command.
$ tridentctl-protect get appvaultcontent pu-repl-dest --show-paths
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+----------------------------------------------------------------------------------------------------------------------+
| CLUSTER | APP | TYPE | NAME | NAMESPACE | TIMESTAMP | PATH |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+----------------------------------------------------------------------------------------------------------------------+
| eks-source-cluster | alpine | backup | hourly-54241-20250401141000 | alpine | 2025-04-01 14:11:50 (UTC) | alpine_ea2ea171-1c23-40bb-8625-d33a7e7c2edd/backups/hourly-54241-20250401141000_3a9b5d2e-5d0c-4bc9-86cc-f53f84cf7faa |
| eks-source-cluster | alpine | backup | hourly-54241-20250401151000 | alpine | 2025-04-01 15:11:38 (UTC) | alpine_ea2ea171-1c23-40bb-8625-d33a7e7c2edd/backups/hourly-54241-20250401151000_f41b3bf0-2aff-404b-90c3-c9c1a960fcce |
+--------------------+--------+--------+-----------------------------+-----------+---------------------------+----------------------------------------------------------------------------------------------------------------------+
With the path value alpine_ea2ea171-1c23-40bb-8625-d33a7e7c2edd/backups/hourly-54241-20250401151000_f41b3bf0-2aff-404b-90c3-c9c1a960fcce of the most recent backup, we can now start the restore from the replicated backup on the destination cluster. We map the alpine namespace to the same name on the DR cluster; the --namespace-mapping flag would equally allow restoring into a different namespace, such as alpine:alpine-restore.
$ tridentctl-protect create backuprestore --appvault pu-repl-dest --path alpine_ea2ea171-1c23-40bb-8625-d33a7e7c2edd/backups/hourly-54241-20250401151000_f41b3bf0-2aff-404b-90c3-c9c1a960fcce --namespace-mapping alpine:alpine -n alpine
BackupRestore "alpine-heqp4w" created.
We follow the progress of the restore, which finishes quickly.
$ kubectl -n alpine get backuprestore alpine-heqp4w -w
NAME            STATE       ERROR   AGE
alpine-heqp4w   Running             17s
alpine-heqp4w   Running             23s
alpine-heqp4w   Running             23s
alpine-heqp4w   Running             36s
alpine-heqp4w   Running             36s
alpine-heqp4w   Running             36s
alpine-heqp4w   Running             36s
alpine-heqp4w   Running             36s
alpine-heqp4w   Running             40s
alpine-heqp4w   Running             40s
alpine-heqp4w   Running             40s
alpine-heqp4w   Running             40s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Running             47s
alpine-heqp4w   Completed           47s
The sample application comes up successfully after the restore, and the sample data files are also available, so the replicated backup was valid.
$ kubectl get all,pvc -n alpine
NAME                          READY   STATUS    RESTARTS   AGE
pod/alpine-5bdb97fb48-2lnb5   1/1     Running   0          52s

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/alpine   1/1     1            1           52s

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/alpine-5bdb97fb48   1         1         1       52s

NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/alpinedata   Bound    pvc-b5d6ced6-5734-40ae-8c9e-b473b032e59e   2Gi        RWX            fsx-netapp-file   <unset>                 55s
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-2lnb5 -- df -h /data
Filesystem Size Used Available Use% Mounted on
198.19.255.178:/trident_pvc_b5d6ced6_5734_40ae_8c9e_b473b032e59e
2.0G 508.7M 1.5G 25% /data
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-2lnb5 -- ls -l /data
total 474848
-rw-r--r-- 1 root root 104857600 Apr 1 13:51 file1
-rw------- 1 root root 104857600 Apr 2 11:03 file2
-rw-r--r-- 1 root root 104857600 Apr 1 13:51 file3
-rw-r--r-- 1 root root 104857600 Apr 1 13:51 file4
-rw-r--r-- 1 root root 104857600 Apr 1 13:51 file5
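If you want a stronger integrity check than file names, sizes, and timestamps, you can compare checksums between the source and the restored volume; matching sums confirm that the contents survived replication and restore unchanged. A quick sketch with the pod names from our two clusters:
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-f9wgr -- md5sum /data/file1   # on eks-source-cluster
$ kubectl -n alpine exec -it pod/alpine-5bdb97fb48-2lnb5 -- md5sum /data/file1   # on eks-dest-cluster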
In conclusion, the integration of NetApp Trident protect with bucket replication for DR across regions offers a robust and cost-effective solution for maintaining the availability and protection of your business-critical stateful Kubernetes applications. By using object storage bucket replication, you achieve geographic redundancy and safeguard your critical data against regional failures and disasters. The step-by-step guide in this blog post demonstrates how to configure and use Trident protect for backup and restore operations so that your applications can be quickly and reliably restored in a DR scenario.
The seamless integration of Trident protect with Kubernetes-native CRs and the Trident protect CLI simplifies the automation of backup and restore processes, making it easier to integrate these critical operations into your existing workflows. This approach not only enhances data protection but also provides flexibility in managing your backups across different environments, whether on premises, hybrid, or cloud based.
By implementing the strategies outlined in this blog post, your organization can effectively mitigate risks, maintain business continuity, and keep your applications resilient and recoverable in the face of unforeseen events. If you’re seeking to enhance your DR capabilities, NetApp Trident protect offers a comprehensive and scalable solution that you can tailor to meet the specific needs of your Kubernetes environments.
If you want to see for yourself how easy it is to protect persistent Kubernetes applications with Trident protect, get started today!