Modern Data Platform – Dremio with NetApp ONTAP (FSxN-AWS)
Introduction
This document explains how to integrate Dremio with NetApp ONTAP first-party (1P) FSxN, that is, Amazon FSx for NetApp ONTAP on AWS. We assume you are already familiar with Dremio and NetApp. Our focus is to help data professionals, such as data engineers, data analysts, and data scientists, quickly and securely access data stored in NetApp backends without major data movement.
Combining Dremio and NetApp enables you to perform daily data activities, build ETL/ELT pipelines, and develop AI/ML and NLP (MCP-LLM) use cases on your trusted NetApp storage environment.
High-Level Flow

Objective
The key objectives are:
- Enable and engage customers to leverage their data using Dremio with NetApp.
- Allow quick and easy access to data stored on NetApp ONTAP 1P FSxN (AWS).
- Build data pipelines for ETL/ELT, NLP (MCP-LLM), and exploratory data analysis (EDA) to drive effective business decisions.
- Avoid data movement by accessing data in situ (in place). This reduces cost, prevents data silos, and improves security because data is not transferred over the network to another storage system.
Prerequisites
- Kubernetes Cluster (On-Premises, AKS, EKS, GKE, etc.)
- Dremio (free or paid subscription) for big data analytics, ACID transactions, and time travel on open table formats (OTF) such as Iceberg and Hudi over file formats such as Parquet, plus a query engine for future AI/ML engineering.
- NetApp ONTAP FS (On-Premises, 1P/3P [ANF, FSxN, GCNV] or CVO)
- Claude (Anthropic subscription) for MCP Agentic GenAI LLM solutions.
Details
Setup: Dremio on Kubernetes with FSxN for Data Lake
- Deploy a Kubernetes Cluster: Set up any type of Kubernetes cluster (EKS, AKS, GKE, or on-premises). Ensure you can access the cluster by running:
kubectl get svc
kubectl get ns
kubectl get pods
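The access check above can also be wrapped in a small fail-fast helper. This is a minimal sketch, assuming kubectl is installed and your kubeconfig points at the target cluster; check_cluster_access is a hypothetical helper name, not part of kubectl.

```shell
# Sketch: stop early if kubectl cannot reach the cluster.
# check_cluster_access is a hypothetical helper name (an assumption).
check_cluster_access() {
  # `kubectl cluster-info` exits non-zero when the API server is unreachable
  if ! kubectl cluster-info >/dev/null 2>&1; then
    echo "cluster unreachable: check your kubeconfig and current context" >&2
    return 1
  fi
  # List the nodes as a readable confirmation that the cluster responds
  kubectl get nodes
}
```

Run check_cluster_access before the remaining steps; a non-zero exit status means the later kubectl and helm commands would fail as well.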
- Prepare Deployment Directory: Create a directory for your deployment (e.g., dremiok8s) on the instance where you access your Kubernetes cluster:
mkdir dremiok8s && cd dremiok8s
- Download & Update Configuration File: Obtain the values-overrides.yaml configuration file for deploying Dremio on Kubernetes. It may be provided via email or available from Dremio if you are subscribed. Then edit values-overrides.yaml to include your license keys and other required parameters.

- Configure NetApp ONTAP Storage: Update the storage-related properties in values-overrides.yaml with your FSxN keys and IP addresses.
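Before writing the storage values into values-overrides.yaml, it may help to confirm that the endpoint and keys actually work. The sketch below assumes the "FSxN keys" are ONTAP S3 access and secret keys exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and that the AWS CLI is installed. check_ontap_s3 is a hypothetical helper, and the endpoint and bucket arguments are placeholders for your own values.

```shell
# Sketch: list a bucket through the ONTAP S3 endpoint to validate the
# endpoint address and keys. Assumes the AWS CLI is installed and the keys
# are exported as AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
# check_ontap_s3 is a hypothetical helper name (an assumption).
check_ontap_s3() {
  endpoint="$1"   # placeholder, e.g. http://<fsxn-s3-ip>
  bucket="$2"     # placeholder, e.g. <bucket-name>
  if [ -z "$endpoint" ] || [ -z "$bucket" ]; then
    echo "usage: check_ontap_s3 <endpoint-url> <bucket>" >&2
    return 2
  fi
  # --endpoint-url points the AWS CLI at a non-AWS S3 endpoint
  aws s3 ls "s3://$bucket" --endpoint-url "$endpoint"
}
```

If the listing succeeds, the same endpoint, bucket, and keys are reasonable candidates for the storage section of values-overrides.yaml.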

- Create and Set Namespace:
kubectl create ns dremiok8s
kubectl config set-context --current --namespace=dremiok8s
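Note that kubectl create ns fails if the namespace already exists, which matters when the setup is re-run. A re-runnable variant could look like the sketch below; ensure_namespace is a hypothetical helper name.

```shell
# Sketch: create the namespace only if needed, then make it the default
# for the current context. ensure_namespace is a hypothetical helper name.
ensure_namespace() {
  ns="$1"
  # Render the Namespace manifest client-side and apply it;
  # `kubectl apply` leaves an existing namespace unchanged.
  kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f - || return 1
  kubectl config set-context --current --namespace="$ns"
}
```

For this guide the call would be ensure_namespace dremiok8s.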
- Validate Storage Classes, PVs, and PVCs: Ensure your persistent volumes and storage classes are set up correctly:
kubectl get pv
kubectl get pvc
kubectl get sc
If needed, set your desired storage class as default (e.g., gp2 for ONTAP S3):
kubectl patch storageclass gp2 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
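To see which storage class is currently the default, the output of kubectl get sc can be filtered for the "(default)" marker that kubectl prints next to the class name. This is a small sketch; default_storage_classes is a hypothetical helper name.

```shell
# Sketch: read `kubectl get sc` output on stdin and print the names of the
# classes that kubectl marks with "(default)".
# default_storage_classes is a hypothetical helper name (an assumption).
default_storage_classes() {
  # Skip the header row; the marker appears as the second field
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}
```

Usage: kubectl get sc | default_storage_classes. If this prints more than one name, the default is ambiguous and the extra annotations should be removed before installing Dremio.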
- Install Dremio with FSxN Backend: Deploy Dremio using Helm with your customized configuration:
helm install dremiok8s oci://quay.io/dremio/dremio-helm -f ./values-overrides.yaml
Monitor the pods, events, and logs. Once the pods are running, get the load balancer URL:
kubectl get svc
Access Dremio at http://<load-balancer-url>:9047/
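The wait-then-open step can be scripted. The sketch below assumes the Dremio UI is exposed through a LoadBalancer service named dremio-client; that name is an assumption, so confirm it in the kubectl get svc output. The helper names wait_for_pods, dremio_lb_host, and dremio_url are hypothetical.

```shell
# Sketch: wait for the pods, read the load balancer hostname, build the URL.
# All three helper names are hypothetical (assumptions).

# Block until every pod in the current namespace reports Ready
wait_for_pods() {
  kubectl wait --for=condition=Ready pod --all --timeout="${1:-600s}"
}

# Print the external hostname of the UI service.
# "dremio-client" is an assumed service name; pass your own if it differs.
dremio_lb_host() {
  kubectl get svc "${1:-dremio-client}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Build the Dremio UI URL (port 9047) from a hostname
dremio_url() {
  host="$1"
  if [ -z "$host" ]; then
    echo "load balancer address not assigned yet" >&2
    return 1
  fi
  printf 'http://%s:9047/\n' "$host"
}
```

Usage: wait_for_pods && dremio_url "$(dremio_lb_host)". AWS load balancers report a hostname; on platforms that report an IP address instead, the jsonpath field would be .ip rather than .hostname.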

- Add Data Sources: Add your volumes or desired datasets from FSxN NAS or S3 buckets in the Dremio UI.
Dremio MCP-LLM Setup with ONTAP
With Dremio and the FSxN backend ready, follow the Dremio Agentic AI MCP installation steps to enable NLP queries via Claude against your datasets in FSxN (NetApp ONTAP 1P storage). Ensure Claude is configured to point to your ONTAP-backed Dremio instance so that Q&A runs against your own datasets.
Demo
[Insert demo links here] An external link will be added later.
Internal Link: Dremio_1_2_Merged.mp4
Below are example screenshots and results from the demo environment:
- Alpha FSxN and Bucket Details:

- Dremio-SQL (Analytics, ACID, Time Travel): Basic SQL operations through Dremio (on EKS) for Iceberg, including snapshots, metadata, and actual data details from FSxN.

- AI Agentic LLM with NetApp 1P ONTAP FSxN (Dremio – Claude MCP LLMs): Launch Claude and start prompting queries as desired.
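For the Dremio-SQL item above, the Iceberg metadata and time travel queries could look like the statements printed by this sketch. The helper names are hypothetical, the table name and snapshot ID are placeholders you supply, and the statements are meant to be pasted into the Dremio SQL editor.

```shell
# Sketch: print Dremio SQL for inspecting an Iceberg table and for reading
# it at an earlier snapshot. Helper names are hypothetical (assumptions).

# Metadata queries: commit history and snapshot list of the table
iceberg_metadata_sql() {
  table="$1"
  if [ -z "$table" ]; then
    echo "usage: iceberg_metadata_sql <table>" >&2
    return 2
  fi
  cat <<EOF
SELECT * FROM TABLE(table_history('$table'));
SELECT * FROM TABLE(table_snapshot('$table'));
EOF
}

# Time travel: read the table as of a given snapshot ID
iceberg_time_travel_sql() {
  table="$1"
  snapshot_id="$2"
  printf "SELECT * FROM %s AT SNAPSHOT '%s';\n" "$table" "$snapshot_id"
}
```

A snapshot ID taken from the table_snapshot result can be passed to iceberg_time_travel_sql to reproduce the data as it was at that commit.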

A few basic example NLP queries for interacting with datasets on NetApp storage:
- Find the list of available datasets.
- Find the shortest distance trip from fsxn.silver.yellow_tripdata_2024-06.parquet.
- Find the longest distance trip from fsxn.silver.yellow_tripdata_2024-06.parquet.
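To cross-check Claude's answers to the two trip prompts, equivalent SQL can be run directly in Dremio. The sketch below prints one plausible form of that SQL. It assumes the dataset has a trip_distance column, as in the public NYC yellow taxi trip schema. trip_distance_sql is a hypothetical helper, and the SQL that Claude actually generates through MCP may differ.

```shell
# Sketch: print SQL for the shortest or longest trip in the demo dataset.
# trip_distance_sql is a hypothetical helper name; the trip_distance column
# is an assumption based on the NYC yellow taxi trip schema.
trip_distance_sql() {
  case "$1" in
    shortest) order="ASC" ;;
    longest)  order="DESC" ;;
    *) echo "usage: trip_distance_sql shortest|longest" >&2; return 2 ;;
  esac
  # The file name contains dashes and a dot, so it is double-quoted
  cat <<EOF
SELECT *
FROM fsxn.silver."yellow_tripdata_2024-06.parquet"
ORDER BY trip_distance $order
LIMIT 1
EOF
}
```

Usage: trip_distance_sql longest prints the statement for the longest trip, which can be pasted into the Dremio SQL editor and compared with the result returned through Claude.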
Conclusion
This document outlines an approach to harness the combined power of Dremio and NetApp ONTAP. You can build sophisticated, cost-effective data pipelines, conduct extensive data analysis, and use data lake and NLP (MCP-LLM) capabilities on your own datasets, without data movement and while maintaining data security. This approach helps minimize risk, reduce cost, and get good performance from your data management infrastructure.
Learn More