Tech ONTAP Blogs

Kubernetes-driven data management: The new era with Trident protect

PatricU
NetApp

In today’s digital landscape, Kubernetes has become the de facto standard for container orchestration and application deployment. With its scalability and flexibility, it offers numerous benefits for managing and scaling applications. However, as organizations rely more heavily on Kubernetes for critical workloads, it becomes crucial to have a robust data protection strategy in place. 

 

NetApp® Trident protect software provides advanced data management capabilities that enhance the functionality and availability of stateful Kubernetes applications backed by storage systems running NetApp ONTAP® data management software and the proven Trident Container Storage Interface (CSI) storage provisioner. Trident protect simplifies the management, protection, and movement of containerized workloads across public clouds and on-premises environments. It also offers automation capabilities through its Kubernetes-native API and powerful tridentctl-protect CLI, enabling programmatic access for seamless integration with existing workflows. 

 

By integrating with Kubernetes APIs and resources, data protection can become an inherent part of the application lifecycle through an organization’s existing continuous integration and continuous deployment (CI/CD) and/or GitOps tools.

 

In this blog, we’ll show how to manage a Kubernetes cluster with Trident protect and its custom resources (CRs), and we’ll protect an application running on the cluster using several kubectl commands. We’ll also dive into the structure and format of these commands and manifests, and then we’ll wrap everything up with some additional resources and areas to investigate to learn even more about Trident protect’s Kubernetes-native architecture.

 

Another blog post will show you how to achieve the same task easily with Trident protect’s tridentctl-protect CLI.

Prerequisites

This blog post assumes that you’re working with the default admin user. In an upcoming blog post, we’ll show you how Trident protect’s role-based access model enables you to isolate tenant user activity while using Trident protect’s capabilities.

 

If you plan to follow this blog step by step, you need to have the following available:

  • A Kubernetes cluster with the latest Trident installed, and its associated kubeconfig 
  • An ONTAP storage system, with Trident storage back ends, storage classes, and volume snapshot classes configured
  • A configured object storage bucket for storing backups and metadata information
  • A workstation with kubectl configured to use the kubeconfig 
  • A workstation with Helm installed (or a different means of deploying a sample Kubernetes application)
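
If you’re not sure whether your environment meets these requirements, a few quick checks can confirm it. The following is a minimal sketch that assumes Trident runs in the trident namespace and that the back ends were created as TridentBackendConfig CRs; adjust it to your installation:

$ kubectl -n trident get tridentversions
$ kubectl -n trident get tridentbackendconfigs
$ kubectl get storageclasses,volumesnapshotclasses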

Trident protect installation

The Trident protect Kubernetes CR-driven architecture requires installing some components on the cluster: Kubernetes custom resource definitions (CRDs) and the Trident protect controller manager. Fortunately, this is a straightforward process.

 

If your environment meets the requirements, you can follow these easy steps to install Trident protect on your Kubernetes cluster:

  • Create the namespace trident-protect:

 

$ kubectl create ns trident-protect
namespace/trident-protect created

 

  • Add the Trident Helm repository:

 

$ helm repo add netapp-trident-protect https://netapp.github.io/trident-protect-helm-chart
"netapp-trident-protect" has been added to your repositories

 

  • Use Helm to install the Trident protect CRDs:

 

$ helm install trident-protect-crds netapp-trident-protect/trident-protect-crds --version 100.2410.0
NAME: trident-protect-crds
LAST DEPLOYED: Fri Nov  8 13:40:24 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

 

  • Helm-install the Trident protect controller manager. Make sure to replace <name_of_cluster> with a name for your cluster; it will be assigned to the cluster and used to identify backups and snapshots from this cluster:

 

$ helm install trident-protect netapp-trident-protect/trident-protect --set clusterName=<name_of_cluster> --version 100.2410.0
NAME: trident-protect
LAST DEPLOYED: Fri Nov  8 13:43:07 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

 

Now the Trident protect CRDs are installed in your cluster:

 

$ kubectl get crd | grep protect.trident
applications.protect.trident.netapp.io                 2024-10-18T13:18:54Z
appmirrorrelationships.protect.trident.netapp.io       2024-10-18T13:18:54Z
appmirrorupdates.protect.trident.netapp.io             2024-10-18T13:18:54Z
appvaults.protect.trident.netapp.io                    2024-10-18T13:18:54Z
autosupportbundles.protect.trident.netapp.io           2024-10-18T13:18:54Z
autosupportbundleschedules.protect.trident.netapp.io   2024-10-18T13:18:54Z
backupinplacerestores.protect.trident.netapp.io        2024-10-18T13:18:54Z
backuprestores.protect.trident.netapp.io               2024-10-18T13:18:54Z
backups.protect.trident.netapp.io                      2024-10-18T13:18:54Z
exechooks.protect.trident.netapp.io                    2024-10-18T13:18:54Z
exechooksruns.protect.trident.netapp.io                2024-10-18T13:18:54Z
kopiavolumebackups.protect.trident.netapp.io           2024-10-18T13:18:54Z
kopiavolumerestores.protect.trident.netapp.io          2024-10-18T13:18:54Z
pvccopies.protect.trident.netapp.io                    2024-10-18T13:18:54Z
pvcerases.protect.trident.netapp.io                    2024-10-18T13:18:54Z
resourcebackups.protect.trident.netapp.io              2024-10-18T13:18:54Z
resourcedeletes.protect.trident.netapp.io              2024-10-18T13:18:54Z
resourcerestores.protect.trident.netapp.io             2024-10-18T13:18:54Z
resticvolumebackups.protect.trident.netapp.io          2024-10-18T13:18:54Z
resticvolumerestores.protect.trident.netapp.io         2024-10-18T13:18:54Z
schedules.protect.trident.netapp.io                    2024-10-18T13:18:54Z
shutdownsnapshots.protect.trident.netapp.io            2024-10-18T13:18:54Z
snapshotinplacerestores.protect.trident.netapp.io      2024-10-18T13:18:54Z
snapshotrestores.protect.trident.netapp.io             2024-10-18T13:18:54Z
snapshots.protect.trident.netapp.io                    2024-10-18T13:18:54Z

 

And the controller manager will be up and running:

 

$ kubectl -n trident-protect get all
NAME                                                               READY   STATUS    RESTARTS   AGE
pod/trident-protect-controller-manager-8d4c94b56-dnj4b             2/2     Running   0          3d15h

NAME                                                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/tp-webhook-service                                   ClusterIP   10.0.160.11    <none>        443/TCP    11d
service/trident-protect-controller-manager-metrics-service   ClusterIP   10.0.189.168   <none>        8443/TCP   11d

NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/trident-protect-controller-manager   1/1     1            1           11d

NAME                                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/trident-protect-controller-manager-8d4c94b56   1         1         1       11d

 

Before we can manage and protect the first application, we need to define at least one application vault (appVault) CR corresponding to an existing object storage bucket, because Trident protect uses these buckets to store application and snapshot metadata in addition to backup data.

 

The following example configures an appVault CR named demo, backed by the Azure storage container demo in the Azure storage account putest. First, we create a secret containing the accountName and accountKey of the Azure storage account in the trident-protect namespace:

 

$ kubectl -n trident-protect create secret generic bucket-demo --from-literal=accountName=putest --from-literal=accountKey=<ACCOUNT-KEY>

 

With the secret in place, we now create the appVault CR in the trident-protect namespace:

 

$ kubectl apply -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: AppVault
metadata:
  name: demo
  namespace: trident-protect
spec:
  providerConfig:
    azure:
      accountName: putest
      bucketName: demo
      endpoint: core.windows.net
  providerCredentials:
    accountKey:
      valueFromSecret:
        key: accountKey
        name: bucket-demo
  providerType: azure
EOF

 

Application management

Now that our cluster has all the necessary software components installed, and an appVault CR is configured, we’re ready to manage an application. As a sample application for this blog post, we use a simple MinIO application deployed on an Azure Kubernetes Service (AKS) cluster in the namespace minio, with one persistent volume provisioned through Trident and backed by an Azure NetApp Files volume. If your cluster already has another test application with Trident-provisioned persistent volumes running on it, feel free to use that instead.

 

$ kubectl get all,pvc,volumesnapshots -n minio
NAME                         READY   STATUS    RESTARTS   AGE
pod/minio-795cc45c9d-v4dbk   1/1     Running   0          12m
 
NAME            TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE
service/minio   ClusterIP   10.0.151.74   <none>        9000/TCP,9001/TCP   12m
 
NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio   1/1     1            1           12m
 
NAME                               DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-795cc45c9d   1         1         1       12m
 
NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/minio   Bound    pvc-851427ec-3cd5-43bd-8538-98c634ece592   8Gi        RWO            netapp-anf     12m

 

We now manage the MinIO application with this kubectl apply command.

 

$ kubectl apply -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: Application
metadata:
  name: minio1
  namespace: minio
spec:
  includedNamespaces:
  - namespace: minio
EOF

 

Let’s dig into this CR a bit more by section.

 

apiVersion: protect.trident.netapp.io/v1
kind: Application

 

If you’re already familiar with Kubernetes CRs, this should be straightforward. The apiVersion field points to the specific version of the API that the resource adheres to. In this case, it’s an extension of the Kubernetes API (as are all CR definitions). The kind specifies the type of resource under that API definition.  

 

metadata:
  name: minio1
  namespace: minio

 

As with all Kubernetes resources, there’s a top-level metadata field, with the name of a CR and the namespace to which the CR is scoped. As Trident protect’s application data management CRs need to be created in the same namespace as their associated application, we create it in the minio namespace.

 

spec:
  includedNamespaces:
  - namespace: minio

 

Finally, the spec contains the specifics of the application definition. This is a simple application with only a single namespace, but it’s possible for this specification to include any number of namespaces, label selectors, and cluster-scoped resources.  
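
For illustration, here’s a hedged sketch of what a broader application definition could look like, with a hypothetical second namespace and label selector (the per-namespace labelSelector field follows the structure of the Application CRD):

spec:
  includedNamespaces:
  - namespace: minio
  - namespace: minio-tools
    labelSelector:
      matchLabels:
        app: minio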

 

Checking for existing application CRs, we see that the protection state of our application minio1 is none, because we haven’t created any snapshots or backups yet:

 

$ kubectl get applications -n minio
NAME      PROTECTION STATE   AGE
minio1    none               50m

 

Now that our application is defined, let’s perform some application data management operations.

Snapshot creation

We have successfully managed our application by creating a CR. We can now carry out any other application data management operation in the same manner, and we’ll start with a snapshot. 

 

In your terminal, execute the following command:

 

$ kubectl -n trident-protect get appvaults
NAME   AGE   STATE
demo   2h    Available

 

You should see the appVault that we created at the end of the installation step. You might wonder why this bucket (or appVault) was defined on the cluster. Many application data management actions specifically reference an appVault (as we’ll see in a moment with our snapshot definition). The appVault is then used to store the necessary information about the application, enabling “self-contained” applications (or snapshots, or backups) that don’t require any further inventory for restoring from a backup or snapshot.

 

Run the following command to store the name of the appVault in a shell variable to be used in the next step.

 

$ appVault=$(kubectl -n trident-protect get appvaults | grep -v NAME | awk '{print $1}')
$ echo $appVault
demo
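
If there’s exactly one appVault on the cluster, an equivalent way to capture its name is a JSONPath query, which avoids the grep/awk pipeline:

$ appVault=$(kubectl -n trident-protect get appvaults -o jsonpath='{.items[0].metadata.name}')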

 

Finally, let’s create a snapshot by running the following command:

 

$ kubectl apply -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: Snapshot
metadata:
  name: minio1-snap-1
  namespace: minio
spec:
  appVaultRef: $appVault
  applicationRef: minio1
EOF

 

Let’s investigate the fields that are different from the application CR previously inspected.

 

apiVersion: protect.trident.netapp.io/v1
kind: Snapshot

 

Although the apiVersion field is the same, the kind is unsurprisingly different. Rather than an “application” definition, this is a “snapshot” definition.

 

spec:
  appVaultRef: $appVault
  applicationRef: minio1

 

The spec field contains two references: one to the application we previously defined and one to the appVault that we just discussed. These references are the core of the instruction to Trident protect: take a snapshot of this application and store the application metadata in this appVault (or bucket).

Note

Even though the application metadata is stored in an external bucket for a snapshot, the Kubernetes persistent volume data is still stored locally on the cluster through a CSI volume snapshot. This means that if the namespace or cluster is destroyed, the application snapshot data is destroyed, too. This is the key difference between a snapshot and a backup, where the volume snapshot data is also copied to the referenced bucket.
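
Because the volume snapshot data stays on the cluster, you can list the underlying CSI VolumeSnapshot resources directly; for our sample app, the following command should show a new volume snapshot in the minio namespace:

$ kubectl -n minio get volumesnapshots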

Let’s make sure that our snapshot completed successfully by running the following command:

 

$ kubectl -n minio get snapshots
NAME            STATE     ERROR   AGE
minio1-snap-1   Completed         2m7s

 

So, we have successfully taken an application snapshot from our CR definition. Next, let’s make this snapshot more robust in case of a disaster.

Backup creation

As mentioned in the previous section, if our minio namespace or the Kubernetes cluster is destroyed, we’ll lose our application’s persistent volumes, including their snapshots. Let’s change that by applying the following CR to create a backup:

 

$ kubectl apply -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: Backup
metadata:
  name: minio1-bkup-1
  namespace: minio
spec:
  appVaultRef: $appVault
  applicationRef: minio1
  snapshotRef: minio1-snap-1
EOF

 

Again, we see the same apiVersion field, but as we expected, a different kind (Backup). Let’s further inspect the spec section. 

 

spec:
  appVaultRef: $appVault
  applicationRef: minio1
  snapshotRef: minio1-snap-1

 

We see the same application and appVault references as in our snapshot CR, but we also see a new snapshot reference field. This is an optional entry for a backup. If it isn’t specified, a new snapshot will first be created. When the snapshot is complete, the CSI volume snapshot data is copied to the referenced appVault.
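
For example, a minimal backup sketch that omits snapshotRef, letting Trident protect create the intermediate snapshot itself, could look like this (the name minio1-bkup-2 is hypothetical):

$ kubectl apply -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: Backup
metadata:
  name: minio1-bkup-2
  namespace: minio
spec:
  appVaultRef: $appVault
  applicationRef: minio1
EOF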

 

We can follow the backup progress with this command until it completes:

 

$ kubectl -n minio get backups.protect.trident.netapp.io/minio1-bkup-1 -w
NAME            STATE     ERROR   AGE
minio1-bkup-1   Running           22s
minio1-bkup-1   Running           2m30s
minio1-bkup-1   Completed         2m30s

 

Protection schedules

If we want to protect our application by regularly creating snapshots and backups, we can assign one (or multiple) protection schedules to it. The following command assigns a protection schedule to our sample app that creates a daily snapshot and backup at 17:30 UTC and retains the last three snapshots and backups.

 

$ kubectl create -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: Schedule
metadata:
  generateName: minio1-sched1-daily
  namespace: minio
spec:
  appVaultRef: $appVault
  applicationRef: minio1
  backupRetention: '3'
  dayOfMonth: ''
  dayOfWeek: ''
  granularity: daily
  hour: '17'
  minute: '30'
  snapshotRetention: '3'
EOF

 

We can list all configured protection schedules like this:

 

$ kubectl -n minio get schedules
NAME                       AGE
minio1-sched1-dailyd8bqf   52s 
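
Other granularities follow the same pattern. As a hedged sketch, a weekly schedule would set granularity: weekly and fill in dayOfWeek instead of leaving it empty (we assume here that dayOfWeek is numeric, with 1 denoting Monday):

spec:
  appVaultRef: $appVault
  applicationRef: minio1
  backupRetention: '3'
  dayOfMonth: ''
  dayOfWeek: '1'
  granularity: weekly
  hour: '17'
  minute: '30'
  snapshotRetention: '3'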

 

Application restore operations

Trident protect allows you to restore your protected application completely or partially from snapshots or backups: on the same cluster or a different cluster, to the original or a different namespace, and to the same or a different storage class. With our sample application protected by a snapshot and a backup, we can now walk through some of the possible restore scenarios in this section.

Restore to a new namespace

To restore our application from a backup, we first need to find the appArchivePath within the backup CR that we want to restore from. This can easily be done with the following command, which stores the path in a shell variable for use in the next steps.

 

$ appArchivePath=$(kubectl -n minio get backups minio1-bkup-1 -o yaml | yq '.status.appArchivePath')
$ echo $appArchivePath
minio1_350649a1-b303-4af9-a27e-9e6a88cf9725/backups/minio1-bkup-1_38d01d54-0609-43c2-b0dc-86fbea14c4c7

 

With the appVault already stored in an environment variable, these commands restore our sample application into the namespace minio-restore, which we need to create first.

 

$ kubectl create ns minio-restore

$ kubectl create -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: BackupRestore
metadata:
  generateName: minio-restore-
  namespace: minio-restore
spec:
  appArchivePath: $appArchivePath
  appVaultRef: $appVault
  namespaceMapping:
  - destination: minio-restore
    source: minio
EOF

 

The kind of the restore CR is BackupRestore. In the spec field, besides the appArchivePath and appVaultRef defining the backup location to restore from, we see the namespaceMapping, which instructs Trident protect to restore into the destination namespace minio-restore.

 

Note that Trident protect automatically creates an application CR for the restored application in its namespace, so it can directly be managed by Trident protect, too.
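
To verify, you can check the restore CR, the automatically created application CR, and the restored resources; for our example, the commands look like this:

$ kubectl -n minio-restore get backuprestores
$ kubectl -n minio-restore get applications
$ kubectl -n minio-restore get all,pvc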

If we want to select only certain resources of the application to restore, we can add filtering that includes or excludes resources marked with particular labels:

  • "<INCLUDE-EXCLUDE>": (Required for filtering) Use include or exclude to include or exclude a resource defined in resourceMatchers. Add the following resourceMatchers parameters to define the resources to be included or excluded:
    • <GROUP>: (Optional) Group of the resource to be filtered.
    • <KIND>: (Optional) Kind of the resource to be filtered.
    • <VERSION>: (Optional) Version of the resource to be filtered.
    • <NAMES>: (Optional) Names in the Kubernetes metadata.name field of the resource to be filtered.
    • <NAMESPACES>: (Optional) Namespaces in the Kubernetes metadata.namespace field of the resources to be filtered.
    • <SELECTORS>: (Optional) Label selector string matching labels in the Kubernetes metadata.labels field of the resource, as defined in the Kubernetes documentation. Example: '"trident.netapp.io/os=linux"'.

The spec section would then have additional entries like this:

 

spec:
  resourceFilter:
    resourceSelectionCriteria: "<INCLUDE-EXCLUDE>"
    resourceMatchers:
    - group: <GROUP>
      kind: <KIND>
      version: <VERSION>
      names: <NAMES>
      namespaces: <NAMESPACES>
      labelSelectors: <SELECTORS>

 

If you want to restore the application to a different cluster that has Trident protect installed, you can use the same restore command/manifest. The destination cluster must have Trident protect installed, access to the object storage bucket containing the backup, and the respective appVault CR defined.
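
As a minimal sketch of the preparation on the destination cluster, assuming you saved the appVault manifest from above in a file named appvault-demo.yaml (a hypothetical filename) and that kubectl now points to the destination cluster:

$ kubectl create ns trident-protect
$ kubectl -n trident-protect create secret generic bucket-demo --from-literal=accountName=putest --from-literal=accountKey=<ACCOUNT-KEY>
$ kubectl apply -f appvault-demo.yaml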

 

To restore from a snapshot, the procedure is quite similar. In this case, the appArchivePath must be taken from the snapshot CR that we want to restore from, so we run this command:

 

$ appArchivePath=$(kubectl -n minio get snapshots minio1-snap-1 -o yaml | yq '.status.appArchivePath')
$ echo $appArchivePath
minio1_350649a1-b303-4af9-a27e-9e6a88cf9725/snapshots/20240808153119_minio1-snap-1_f6f98dde-da71-478f-82db-1151581ff017

 

The manifest for the restore from the snapshot is very similar to the restore from backup case, but the kind is now SnapshotRestore. This command will restore our sample application from the snapshot to the namespace minio-restore1, after creating the namespace:

 

$ kubectl create ns minio-restore1

$ kubectl create -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: SnapshotRestore
metadata:
  generateName: minio-restore1-
  namespace: minio-restore1
spec:
  appArchivePath: $appArchivePath
  appVaultRef: $appVault
  namespaceMapping:
  - destination: minio-restore1
    source: minio
EOF

 

Lastly, you can also migrate data from one storage class to a different storage class when restoring from a snapshot or backup. To do so, simply add a storageClassMapping to the spec section of the respective restore manifest:

 

spec:
  storageClassMapping:
    destination: "${destinationStorageClass}"
    source: "${sourceStorageClass}"
  …

 

Restore to the original namespace

When restoring an application from a backup or snapshot to its original namespace on the same cluster, Trident protect first stops the application and removes its resources from the namespace. Then it restores the data from the backup or snapshot, and finally restores the application metadata. To do this, we use a CR of kind SnapshotInplaceRestore or BackupInplaceRestore, as in the following command to restore the sample app from its snapshot into its original namespace:

 

$ kubectl create -f - <<EOF
apiVersion: protect.trident.netapp.io/v1
kind: SnapshotInplaceRestore
metadata:
  generateName: snapshotipr-minio-
  namespace: minio
spec:
  appArchivePath: $appArchivePath
  appVaultRef: $appVault
EOF

 

Next steps

Although manually applying YAML through kubectl is great for a demonstration, a blog post, or other one-off actions, it’s not how most organizations prefer to perform application data management operations. Instead, these operations are typically automated through CI/CD platforms and GitOps tools, minimizing the likelihood of human error.

 

The Trident protect architecture enables seamless integration between these tools and application data management. Deploying a new application through an automated pipeline? Simply add a final pipeline step that applies a protect.trident.netapp.io CR to the same cluster. Using a GitOps tool like Argo CD to update the application through a git push? Add a presync resource hook to the git repository that applies a backups.protect.trident.netapp.io CR. These tools already have access to the application’s Kubernetes cluster, so applying one additional CR is a trivial step.
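
As a hedged sketch of the Argo CD case, a Backup CR checked into the application’s git repository could carry Argo CD’s PreSync hook annotation so that a backup is taken before every sync. The names are illustrative, and the appVault is referenced by its literal name because shell variables aren’t available in GitOps manifests:

apiVersion: protect.trident.netapp.io/v1
kind: Backup
metadata:
  generateName: minio1-presync-
  namespace: minio
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  appVaultRef: demo
  applicationRef: minio1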

 

Instead of manually applying YAML with kubectl, you can also use Trident protect’s powerful tridentctl-protect CLI, which we’ll cover in another blog post.

Conclusion

In summary, we installed Trident protect on a Kubernetes cluster. We then managed a demo application by creating an application CR through kubectl and protected the application by creating snapshot and backup CRs. Finally, we showed how the application can be restored from a snapshot or a backup in various ways.
