Tech ONTAP Blogs

Deploying StorageGRID in a Kubernetes Cluster

MauricioS
NetApp



NetApp’s StorageGRID object store is easy to use, but assembling the right infrastructure to deploy it can be challenging, especially if you just want to try it out. Under the covers, however, StorageGRID is built and deployed as container images, and recent Kubernetes-friendly enhancements make it possible to deploy those images in an existing Kubernetes cluster, which makes it easy to try as well!

Note: This deployment method is not officially supported.

Overview

In this post, we will extract the Docker images from the Ubuntu/Debian bare-metal distribution packages, make the images available to our Kubernetes worker nodes, and deploy StorageGRID onto our Kubernetes cluster. Our example shows a minimal, single-site StorageGRID deployment with one primary Admin Node and three Storage Nodes.

 

Networking

Kubernetes pods within the same cluster can communicate directly with each other; this is perfect for a single StorageGRID site as all the StorageGRID nodes need to intercommunicate. Additionally, the three Storage Nodes will register with the Admin Node by providing the Admin Node’s DNS name during deployment. A Kubernetes headless service will provide DNS mappings for the StorageGRID nodes. Using the DNS name, as opposed to the IP, will also allow the Storage Nodes to be deployed in parallel with the Admin Node.
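
To make the DNS mapping concrete, here is a minimal sketch of how Kubernetes names StatefulSet pods behind a headless service. The StatefulSet, service, and namespace names are hypothetical placeholders rather than the names used in the GitHub manifests, and cluster.local is assumed as the cluster domain (the Kubernetes default).

```shell
# Print the stable DNS name Kubernetes assigns to a StatefulSet pod that is
# governed by a headless service:
#   <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
# Assumption: the cluster domain is "cluster.local" (the Kubernetes default).
pod_fqdn() {
  statefulset="$1"
  ordinal="$2"
  service="$3"
  namespace="$4"
  printf '%s-%s.%s.%s.svc.cluster.local\n' \
    "$statefulset" "$ordinal" "$service" "$namespace"
}

# Hypothetical names, for illustration only.
pod_fqdn primary-admin 0 storagegrid default
for i in 0 1 2; do
  pod_fqdn storage "$i" storagegrid default
done
```

Because these names are known before any pod starts, a Storage Node can be given the Admin Node's name up front instead of waiting for an IP address to be assigned.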

 

Note: A multi-site deployment would require that the StorageGRID nodes at each site be able to route directly to the StorageGRID nodes at the other sites. This could be accomplished with a VPN or other tunneling technology and is outside the scope of this blog.

 

StorageGRID Nodes

Admin Node (one required): A primary Admin Node is required for every StorageGRID deployment (secondary Admin Nodes are optional). Admin Nodes provide the Grid Manager Interface (GMI) for administering your StorageGRID deployment using a web browser.

 

Storage Node (three required): Three Storage Nodes are required per StorageGRID site. Storage Nodes are the backbone of a StorageGRID deployment.

 

Gateway Node (optional): The API Gateway Node monitors the health of the grid and the usage of each Storage Node. Since this is a minimal deployment, we will forgo the Gateway Node’s health checks and use Kubernetes’ basic built-in round-robin load balancing.

 

For further discussion on StorageGRID load balancing, see this Technical Report.

 

SG-11.1.1-K8s-diagram.gif

 

Prerequisites

Kubernetes

  • Command line access to Master and Worker nodes of the cluster.
  • Resources for four StorageGRID nodes (1 primary Admin Node, 3 Storage Nodes). Each StorageGRID node will require:
    • 24Gi of memory (total of 96Gi)
    • 8 CPUs (total of 32 CPUs)
  • Persistent Volumes for the StorageGRID primary Admin Node:
    • 100Gi mounted at /var/local
    • 200Gi mounted at /var/local/mysql_ibdata
    • 200Gi mounted at /var/local/audit/export
  • Persistent Volumes for three StorageGRID Storage Nodes:
    • 100Gi mounted at /var/local (per Storage Node)
    • 4Ti mounted at /var/local/rangedb/0 (per Storage Node)
    • 4Ti mounted at /var/local/rangedb/1 (per Storage Node)
    • 4Ti mounted at /var/local/rangedb/2 (per Storage Node)
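
As a quick sanity check before provisioning, the totals implied by the list above can be worked out with shell arithmetic. This is only a sketch of the sums; the variable names are illustrative, and 1Ti is taken as 1024Gi.

```shell
# Totals for the example grid: 1 primary Admin Node + 3 Storage Nodes.
nodes=4
storage_nodes=3

total_mem_gi=$((nodes * 24))   # 24Gi of memory per node
total_cpus=$((nodes * 8))      # 8 CPUs per node

# Persistent Volume capacity in Gi (1Ti = 1024Gi).
admin_pv_gi=$((100 + 200 + 200))            # /var/local, mysql_ibdata, audit/export
per_storage_pv_gi=$((100 + 3 * 4 * 1024))   # /var/local + three 4Ti rangedb volumes
total_pv_gi=$((admin_pv_gi + storage_nodes * per_storage_pv_gi))

echo "memory:          ${total_mem_gi}Gi"
echo "cpus:            ${total_cpus}"
echo "admin PVs:       ${admin_pv_gi}Gi"
echo "per-storage PVs: ${per_storage_pv_gi}Gi"
echo "all PVs:         ${total_pv_gi}Gi"
```

The memory and CPU totals match the 96Gi and 32 CPUs listed above; the Persistent Volumes add up to 37664Gi, or roughly 36.8Ti, across the four nodes.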

StorageGRID Docker Images

StorageGRID 11.9.0 (SG1000 or SG100 Primary Admin Node tgz)

Kubernetes Worker Nodes Preparation

Load the StorageGRID Docker images on each Kubernetes worker node.

$ tar -C /tmp -xzf StorageGRID-Webscale-11.9.0-DEB-20241007.1657.e515f58.tgz

$ dpkg -x /tmp/StorageGRID-Webscale-11.9.0/debs/storagegrid-webscale-images-11-9-0_11.9.0-20241007.1657.e515f58_amd64.deb  /tmp/StorageGRID-Webscale-11.9.0/

$ docker load -i /tmp/StorageGRID-Webscale-11.9.0/var/lib/storagegrid/images/11.9.0/storagegrid-11.9.0.tgz
Loaded image: storagegrid-11.9.0:API_Gateway
Loaded image: storagegrid-11.9.0:Admin_Node
Loaded image: storagegrid-11.9.0:Storage_Node

$ docker images storagegrid-11.9.0
REPOSITORY           TAG            IMAGE ID       CREATED        SIZE
storagegrid-11.9.0   Admin_Node     be885ca2ff2f   4 months ago   2.77GB
storagegrid-11.9.0   Storage_Node   a539a72ea1ec   4 months ago   2.65GB
storagegrid-11.9.0   API_Gateway    a805db0f8f88   4 months ago   1.82GB
 

Note: Alternatively, the StorageGRID images could be hosted on a private Docker repository; just make sure to update the primary-admin.yaml and the storage.yaml files with the image path/names.
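
If you go the private registry route, the images loaded above would need to be re-tagged and pushed. The following is a dry-run sketch only: registry.example.com:5000 is a hypothetical registry address, and the commands are printed rather than executed.

```shell
# Hypothetical private registry; replace with your own host[:port].
REGISTRY="registry.example.com:5000"
REPO="storagegrid-11.9.0"

# Compose the registry-qualified reference for one of the loaded images.
registry_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$REPO" "$1"
}

# Dry run: print the docker commands instead of executing them.
# Remove the leading "echo" on each line to actually tag and push.
for tag in Admin_Node Storage_Node API_Gateway; do
  echo docker tag "${REPO}:${tag}" "$(registry_ref "$tag")"
  echo docker push "$(registry_ref "$tag")"
done
```

The registry-qualified references printed here are what the image fields in primary-admin.yaml and storage.yaml would then need to point to.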

 

Deploy StorageGRID on Kubernetes

Follow the detailed kubectl commands documented on GitHub:
https://github.com/NetApp-StorageGRID/Kubernetes/blob/master/README.md

  • The Admin Node service (GMI HTTPS - 443) will be mapped to an external IP.
  • The Storage Node service will map the S3 protocol (port 18082) from the Storage Nodes to an external, load-balanced IP.
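
Once the services exist, their external IPs can be read back with kubectl. The helpers below are a sketch under two assumptions: the services are of type LoadBalancer, and your provider reports an IP address (some report a hostname instead). The function names are illustrative.

```shell
# Look up the external IP that a LoadBalancer service was assigned.
# Assumption: the service reports an IP rather than a hostname.
external_ip() {
  kubectl get service "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Build the Grid Manager URL (HTTPS on port 443) for a given service name.
gmi_url() {
  printf 'https://%s\n' "$(external_ip "$1")"
}
```

For example, running gmi_url admin-node-service prints the URL to open in your browser, assuming the Admin Node service is named admin-node-service and lives in your current namespace.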

Install StorageGRID

  • Direct your browser to https://<EXTERNAL-IP> (of the admin-node-service)
  • Accept the default StorageGRID insecure certificate.
  • Set an Installer password.
  • Click Install a StorageGRID system.
  • From this point forward, you can follow the StorageGRID 11.9.0 installation documentation.
 

SG-11.9.0-2.png

 

Follow the installation wizard – paying special attention to:

  • Step 1: License – Use the demo license that came with the download: /tmp/StorageGRID-Webscale-11.9.0/debs/NLF000000.txt
  • Step 2: Grid Network – Use the network available to the Kubernetes pods
  • Step 3: Grid Nodes – Make sure all four StorageGRID nodes get approved.

Note: NTP, DNS, and other fields will be specific to your environment; fill them in accordingly. Click the Install button and monitor the installation.

 

After installation, simply:

  • Log in as the administrator (use the password created during installation).
  • Add Tenants and Buckets (making note of AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID).
  • Don’t forget to direct your S3 client to port 18082 of the EXTERNAL-IP of the service.
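
As a sketch of that last step, the helpers below point the AWS CLI at the grid. They assume the AWS CLI is installed, that port 18082 is served over HTTPS, and that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already exported with the tenant's keys. The function names are illustrative.

```shell
# Compose the S3 endpoint URL for the Storage Node service.
# Assumption: port 18082 is served over HTTPS.
s3_endpoint() {
  printf 'https://%s:18082\n' "$1"
}

# List the tenant's buckets with the AWS CLI (assumed to be installed).
# --no-verify-ssl skips certificate validation, which is only reasonable
# for a test grid that still uses its default certificate.
list_buckets() {
  aws s3 ls --endpoint-url "$(s3_endpoint "$1")" --no-verify-ssl
}
```

Call list_buckets with the EXTERNAL-IP of the Storage Node service as its only argument.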

Summary

 

At this point, you have a fully functional StorageGRID system, albeit a minimal, isolated, single-site grid. You can now leverage information lifecycle management (ILM) rules to manage object data.

Kubernetes and its deployment model have greatly simplified this whole process. We’re specifying compute resources, storage resources, DNS names, services to expose, etc. We’re also deploying all the pods as StatefulSets, which keeps each PVC bound to its pod.

 

The yaml files can be found on GitHub. Go kick the tires!!!
