Tech ONTAP Blogs
The cloud infrastructure landscape demands tools that offer speed, scalability, and consistency. Infrastructure as code (IaC), spearheaded by Terraform, empowers teams to automate and manage infrastructure with precision and ease, using code to provision and maintain resources across diverse cloud environments.
This blog introduces a GitHub repository designed to fast-track the deployment of Kubernetes clusters across the leading cloud providers—AWS, Azure, and Google Cloud. Each provider is paired with NetApp's first-party cloud storage—FSx for NetApp ONTAP, Azure NetApp Files, and Google Cloud NetApp Volumes. The code for each provider is tailored for NetApp® customers and partners, facilitating a quick start with Kubernetes and first-party NetApp cloud storage, leveraging the efficiency of Terraform and the reliability of NetApp storage solutions.
With dedicated directories for each cloud provider, the repository sets up Kubernetes clusters configured with NetApp Trident™ container storage interface (CSI), complete with the necessary back ends and storage classes. It showcases how users can quickly deploy NetApp storage solutions in a Kubernetes context, with the flexibility to adapt the code to their specific needs. Join us as we delve into the details of this repository and illustrate how it can serve as a launchpad for your cloud-native storage initiatives with Kubernetes and NetApp.
The NetApp 1st Party Cloud Storage and Kubernetes Terraform IaC repository is provided under the MIT license, which means that you’re free to use the code without restriction. If you want to use just a portion of the code, forking or cloning the repository is probably not necessary; instead, you can just copy the necessary components.
If you instead want to modify the code, but plan to keep the overall structure, it is recommended that you fork the repository. If you’re not sure how you’ll use the code, or if you just want to get hands-on with Kubernetes and NetApp first-party storage, simply clone the repository and change into the created directory:
git clone https://github.com/MichaelHaigh/netapp-1p-k8s-terraform.git
cd netapp-1p-k8s-terraform
Before we can deploy our NetApp first-party cloud storage and Kubernetes cluster, we need to configure the credentials that Terraform uses.
Although cloud platforms vary in how access credentials and service accounts are created, the Terraform code in this repository works in a similar fashion on each platform by reading the credentials from a file path on the host system. Within each cloud directory the default.tfvars file contains one or more variables with which you can customize the location and name of this file.
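As a rough illustration of this pattern (the repository's actual variable and provider blocks may differ), Terraform's built-in file(), pathexpand(), and jsondecode() functions can read such a credentials file and feed it to a provider block:

```hcl
# Hypothetical sketch of the credentials-from-file pattern; the variable
# and key names mirror the AWS example below but are not the repo's code.
variable "aws_cred_file" {
  type    = string
  default = "~/.aws/aws-terraform.json"
}

locals {
  # pathexpand() resolves the leading "~" to the user's home directory
  aws_creds = jsondecode(file(pathexpand(var.aws_cred_file)))
}

provider "aws" {
  access_key = local.aws_creds.aws_access_key_id
  secret_key = local.aws_creds.aws_secret_access_key
}
```

Keeping the credentials in an external file means they never land in the .tfvars files or Terraform state that you might commit to version control.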
However, for each cloud there are a number of other ways to authenticate (see AWS, Azure, and Google Cloud), so feel free to modify the code to use a different method.
The AWS code has a variable called aws_cred_file that defines the location for the AWS access key file:
$ grep aws_cred_file fsxn-eks/default.tfvars
aws_cred_file = "~/.aws/aws-terraform.json"
This file must be in the following format:
$ cat ~/.aws/aws-terraform.json
{
"aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
"aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}
To generate these keys for your user, see this AWS page.
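Once you have a key pair, you can create the credentials file from the terminal. This is a minimal sketch using AWS's documented example keys, which you should replace with your own:

```shell
# Create the credentials file that Terraform will read; the keys below
# are AWS's published example values -- substitute your real access keys.
mkdir -p ~/.aws
cat > ~/.aws/aws-terraform.json <<'EOF'
{
  "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
  "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}
EOF
# Restrict permissions, since this file contains secrets
chmod 600 ~/.aws/aws-terraform.json
```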
The Azure code has a variable called sp_creds that defines the location for the Azure service principal file:
$ grep sp_creds anf-aks/default.tfvars
sp_creds = "~/.azure/azure-sp-tme-demo2-terraform.json"
This file must be in the following format:
$ cat ~/.azure/azure-sp-tme-demo2-terraform.json
{
"subscriptionId": "acb5685a-dead-4d22-beef-ad9330cd14b4",
"appId": "c16a3d0b-dead-4a32-beef-576623b3706c",
"displayName": "azure-sp-terraform",
"password": "11F8Q~4deadbeefNOBbOtnOfN3~FRhrsD9N0SaCP",
"tenant": "d26875b4-dead-456e-beef-bafc77f348b5"
}
To create an Azure service principal and generate the needed password, see this Terraform document.
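For reference, a service principal with the Contributor role can be created with the Azure CLI; the name below is a placeholder and the subscription ID must be your own. Note that the command's JSON output contains appId, displayName, password, and tenant, but you must add your subscriptionId to the file yourself:

```shell
# Hypothetical example: create a service principal scoped to your
# subscription (requires an authenticated "az login" session)
az ad sp create-for-rbac \
  --name "azure-sp-terraform" \
  --role "Contributor" \
  --scopes "/subscriptions/<subscription-id>"
```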
The Google Cloud code has four variables that must be updated to match your environment:
$ grep -e sa -e gcp_project gcnv-gke/default.tfvars
sa_creds = "~/.gcp/astracontroltoolkitdev-terraform-sa-f8e9.json"
gcp_sa = "terraform-sa@astracontroltoolkitdev.iam.gserviceaccount.com"
gcp_project = "astracontroltoolkitdev"
gcp_project_number = "239048101169"
To gather the service account key and email, click the service account name on the service account page of the console. You can gather your Google Cloud project name and number from the welcome page of the console.
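If you still need to generate a key file for an existing service account, the Google Cloud CLI can create one; the file name, account email, and project ID below are placeholders:

```shell
# Hypothetical example: download a JSON key for an existing service
# account (requires an authenticated gcloud session)
gcloud iam service-accounts keys create ~/.gcp/terraform-sa.json \
  --iam-account="terraform-sa@<project-id>.iam.gserviceaccount.com"
```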
The bottom of the default.tfvars file in each cloud provider directory contains an authorized_networks variable, which is a list of network entries. To access the resources that Terraform deploys, your IP address must be present in this list. If you’re not sure of your IP address, run the following command:
curl http://checkip.amazonaws.com
Two sample networks are present, one for a range of company VPN addresses and another for a home address. Feel free to modify these examples or to add any number of additional entries.
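As a hypothetical sketch (check default.tfvars in your chosen cloud directory for the exact attribute names, which vary by provider), the entries look something like this, using documentation IP ranges:

```hcl
authorized_networks = [
  {
    cidr_block   = "203.0.113.0/24"   # company VPN range (example)
    display_name = "company-vpn"
  },
  {
    cidr_block   = "198.51.100.27/32" # single home IP (example)
    display_name = "home"
  },
]
```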
There are a handful of other variables in each of the default.tfvars files that are worth reviewing at a high level, because they may need to be updated for your environment. In the AWS deployment:
In the Azure deployment:
In the Google Cloud deployment:
Now that we’ve covered the available variables and configurations, we’re ready to initialize the environment.
The provider versions in the main.tf file in each cloud directory are constrained by the ~> operator, as shown in this example:
$ head -13 anf-aks/main.tf
terraform {
required_version = ">= 0.12"
required_providers {
azuread = {
source = "hashicorp/azuread"
version = "~> 2.53.1"
}
azurerm = {
source = "hashicorp/azurerm"
version = "~> 4.1.0"
}
}
}
This operator sets both a lower and an upper bound on the provider version, which ensures code compatibility while still allowing bug fixes and other minor updates. Depending on your use case, and on how much time has passed since the repository was last updated, it may be beneficial to loosen or remove these version constraints.
Doing so allows you to use the most up-to-date provider code; however, you run the risk that the underlying Terraform code will need updates due to incompatibilities. If you’re just experimenting, no modifications are needed.
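For example, relaxing the azurerm pin from ~> 4.1.0 (which permits only 4.1.x patch releases) to ~> 4.1 would allow any 4.x release at or above 4.1:

```hcl
azurerm = {
  source  = "hashicorp/azurerm"
  # "~> 4.1" allows any 4.x release >= 4.1; removing the version line
  # entirely always pulls the latest release, which is riskier
  version = "~> 4.1"
}
```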
Assuming that you’re not already in a cloud provider directory, change into it now. You need to initialize Terraform, which downloads and installs the specified providers:
terraform init
We’re now ready to deploy our infrastructure.
We first run Terraform plan, which enables us to view the proposed deployment and make sure that there aren’t any issues with the updates made so far. (See the next section if you’re curious about the workspace component.)
terraform plan -var-file="$(terraform workspace show).tfvars"
Make sure that there are no errors reported and that you aren’t prompted to input any variables. Scroll through the output. If the infrastructure looks as expected, then run Terraform apply:
terraform apply -var-file="$(terraform workspace show).tfvars"
Enter yes at the prompt and then wait for the infrastructure to be deployed. This can take anywhere between 10 and 60 minutes, depending on the cloud and options selected.
The final step in each cloud deployment is installing the Trident back-end configuration and Kubernetes storage classes. Although the precise steps vary by cloud, the overall workflow is:
To verify that everything was created correctly, you can run the following two commands from your terminal—the deployed Kubernetes cluster should already be your current context:
kubectl -n trident get tbc
kubectl get sc
The exact output will vary by cloud and deployment configuration, but here are AWS, Azure, and Google Cloud examples.
AWS:
$ kubectl -n trident get tbc
NAME BACKEND NAME BACKEND UUID PHASE STATUS
backend-fsx-ontap-nas backend-fsx-ontap-nas bddd862e-7af0-4584-8e80-5a1d22089450 Bound Success
backend-fsx-ontap-san backend-fsx-ontap-san 0a25e388-51cb-439c-a55f-eb8ad0ca1801 Bound Success
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
fsx-netapp-block csi.trident.netapp.io Delete Immediate true 10m
fsx-netapp-file (default) csi.trident.netapp.io Delete Immediate true 10m
gp2 kubernetes.io/aws-ebs Delete WaitForFirstConsumer false 36m
Azure:
$ kubectl -n trident get tbc
NAME BACKEND NAME BACKEND UUID PHASE STATUS
backend-aks-default-netapppool backend-aks-default-netapppool 34956d74-70de-4dab-a4ee-06bc4852305c Bound Success
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
azure-netapp-files-standard (default) csi.trident.netapp.io Delete Immediate true 34s
azurefile file.csi.azure.com Delete Immediate true 30m
azurefile-csi file.csi.azure.com Delete Immediate true 30m
azurefile-csi-premium file.csi.azure.com Delete Immediate true 30m
azurefile-premium file.csi.azure.com Delete Immediate true 30m
default disk.csi.azure.com Delete WaitForFirstConsumer true 30m
managed disk.csi.azure.com Delete WaitForFirstConsumer true 30m
managed-csi disk.csi.azure.com Delete WaitForFirstConsumer true 30m
managed-csi-premium disk.csi.azure.com Delete WaitForFirstConsumer true 30m
managed-premium disk.csi.azure.com Delete WaitForFirstConsumer true 30m
Google Cloud:
$ kubectl -n trident get tbc
NAME BACKEND NAME BACKEND UUID PHASE STATUS
backend-gke-default-standard-pool backend-gke-default-standard-pool 7aa74b9e-e1ee-4751-bf1b-d97a0a674bb1 Bound Success
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
netapp-gcnv-standard (default) csi.trident.netapp.io Delete Immediate true 17m
premium-rwo pd.csi.storage.gke.io Delete WaitForFirstConsumer true 22m
standard kubernetes.io/gce-pd Delete Immediate true 22m
standard-rwo pd.csi.storage.gke.io Delete WaitForFirstConsumer true 22m
In the commands in the previous section, you probably noticed the -var-file argument, which references a Terraform workspace command. Workspaces enable separate instances of state data within the same working directory, so that you can have multiple deployments running at the same time.
All of the code in this repository has been designed to natively support workspaces. Perhaps you’ve deployed an environment in the eastern United States, and now realize that you need a second environment in the western U.S. for business continuity and disaster recovery. Simply run the following command to create a new Terraform workspace—providing a descriptive, unique workspace name:
terraform workspace new <workspace-name>
Then copy the default.tfvars variable file to match your new workspace name:
cp default.tfvars <workspace-name>.tfvars
Open the <workspace-name>.tfvars file and modify the region variable to a western U.S. region. Optionally update any other variables; for instance, perhaps the default node count can be lower for a disaster recovery workload. Then create your new environment with the same apply command that we previously ran:
terraform apply -var-file="$(terraform workspace show).tfvars"
The resources deployed contain the <workspace-name> field in their name, rather than default from the default workspace. If you need to switch to the default workspace, run:
terraform workspace select default
Finally, to view all available workspaces, run:
terraform workspace list
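Putting the workspace steps together, a second deployment under a hypothetical workspace name might look like this:

```shell
# Hypothetical end-to-end workspace flow (workspace name is an example)
terraform workspace new us-west-dr
cp default.tfvars us-west-dr.tfvars
# ...edit us-west-dr.tfvars: set a western US region, adjust node counts...
terraform apply -var-file="$(terraform workspace show).tfvars"
terraform workspace list   # the active workspace is marked with an asterisk
```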
When your Kubernetes cluster and NetApp first-party cloud storage have reached the end of their lifecycle, Terraform makes it easy to clean up the deployed resources. Before running the following command, be sure to clean up any resources deployed outside of Terraform, such as Kubernetes-created resources like elastic IP addresses and persistent volumes, or NetApp volume Snapshot™ copies.
terraform destroy -var-file="$(terraform workspace show).tfvars"
Make sure that the infrastructure displayed in the output matches the components that you want to destroy, and then enter yes at the prompt. This process takes around 10 to 60 minutes, depending on the cloud environment and infrastructure choices.
In summary, the GitHub repository we've delved into is a testament to the power of combining Terraform's infrastructure as code with NetApp's cloud storage for Kubernetes environments. The combination offers a streamlined pathway for NetApp partners and customers to deploy robust, scalable Kubernetes clusters across the major cloud platforms with the added performance of NetApp first-party storage.
The repository's modular design allows customization and scalability, meeting diverse infrastructure needs. With Terraform, you can automate your infrastructure provisioning, manage multiple environments with workspaces, and easily decommission resources when necessary, showcasing flexibility and control over your cloud resources.
Whether you're starting fresh or optimizing your cloud infrastructure, this repository stands as a foundational asset for your cloud-native journey. As the cloud landscape evolves, employing such resources will be key to maintaining a competitive edge in the fast-paced world of technology. We encourage you to engage with the community, share your insights, and contribute to the ongoing enhancement of cloud-native storage and management practices.