Tech ONTAP Blogs

Part 2 — What is an MCP Server and why does it matter?

MinithP
NetApp

MCP: The structured interface between AI and infrastructure (aka the hands)

Once you understand what an AI agent does, the next question is: how does the agent safely interact with enterprise systems? That is the job of the Model Context Protocol (MCP). MCP is an open standard that lets AI models connect to tools, resources, and data sources through a structured interface. In practice, the MCP server becomes the structured interface layer between a stateless model and systems like ONTAP, APIs, databases, monitoring platforms, and operational tools — governing what tools the model can discover and invoke.

MCP has three main jobs: it exposes context, constrains access, and structures execution. It publishes what the AI is allowed to use, limits tool discovery and invocation through policy and authorization, and translates requests into structured operations. The model never gets direct administrative access to infrastructure. Instead, it invokes pre-defined, scoped tools that map to validated API workflows.
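
To make the three jobs concrete, here is a minimal sketch in plain Python. It is not the MCP SDK and not how any NetApp server is implemented; the `Tool` and `ToolRegistry` names, the tool names, and the "set of required parameters" schema are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set


@dataclass(frozen=True)
class Tool:
    """A hypothetical tool definition: a name, a tiny schema, and a handler."""
    name: str
    required_params: frozenset
    handler: Callable[..., Any]


class ToolRegistry:
    """Illustrative registry showing expose / constrain / structure."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def discover(self, allowed: Set[str]) -> List[str]:
        # Expose context: the caller only sees tools it is allowed to use.
        return sorted(name for name in self._tools if name in allowed)

    def invoke(self, name: str, args: Dict[str, Any], allowed: Set[str]) -> Any:
        # Constrain access: unknown or unauthorized tools are rejected.
        if name not in self._tools or name not in allowed:
            raise PermissionError(f"tool not available: {name}")
        tool = self._tools[name]
        # Structure execution: arguments must match the declared schema.
        missing = set(tool.required_params) - set(args)
        unexpected = set(args) - set(tool.required_params)
        if missing or unexpected:
            raise ValueError(
                f"bad arguments for {name}: missing={sorted(missing)}, "
                f"unexpected={sorted(unexpected)}"
            )
        return tool.handler(**args)
```

In this sketch, a model that is only allowed `list_volumes` never learns that a provisioning tool exists, and a call with the wrong parameters fails before any handler runs.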

 

The NetApp MCP ecosystem

MCP is the cornerstone of NetApp's strategy to bridge the gap between AI applications like Claude Desktop, VS Code, and GitHub Copilot, and enterprise storage resources.

Rather than deploying a one-size-fits-all server, NetApp has modularized its MCP capabilities into three distinct categories: operational servers for managing storage systems, observability servers for querying performance and health metrics, and retrieval servers for feeding enterprise knowledge bases to LLMs.

Operational MCP servers: Storage & infrastructure orchestration

These servers expose actionable tools that allow AI agents to manage physical and virtual storage systems, translating natural language commands into safe, validated API requests.

ONTAP MCP Server (ontap-mcp): A Go-based, open-source server that allows AI clients to administer NetApp ONTAP systems through standard REST API controls. Key capabilities include:

  • Multi-cluster support: Manage multiple ONTAP clusters through a single, secure unified endpoint
  • NFS/CIFS provisioning: Dynamically create, configure, resize, and manage volumes, shares, and exports
  • Access control: Handle export policies, access controls, and administrative lifecycle operations
  • Safety guardrails: Operations are standardized into strict schemas. The agent calls predefined ONTAP REST API workflows instead of executing raw scripts, which keeps outputs predictable and lets parameters be validated before anything reaches the cluster
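
The guardrail idea can be sketched as a typed request that is validated before it is turned into a REST call. This is an illustration only, not code from ontap-mcp: the `CreateVolumeRequest` class, the simplified name pattern, and the `MAX_SIZE_GIB` limit are assumptions. The request shape follows the public ONTAP REST volume endpoint (`POST /api/storage/volumes`) as commonly documented; check the API reference for your ONTAP version before relying on it.

```python
import re
from dataclasses import dataclass

# Simplified naming rule for illustration; not the authoritative ONTAP rule.
_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
MAX_SIZE_GIB = 1024  # hypothetical site policy limit


@dataclass(frozen=True)
class CreateVolumeRequest:
    svm: str
    name: str
    size_gib: int
    aggregate: str

    def validate(self) -> None:
        for label, value in (("svm", self.svm), ("name", self.name),
                             ("aggregate", self.aggregate)):
            if not _NAME_RE.match(value):
                raise ValueError(f"invalid {label}: {value!r}")
        if not 1 <= self.size_gib <= MAX_SIZE_GIB:
            raise ValueError(f"size_gib out of range: {self.size_gib}")


def to_rest_call(req: CreateVolumeRequest) -> dict:
    """Validate the request, then describe the REST call it maps to."""
    req.validate()
    return {
        "method": "POST",
        "path": "/api/storage/volumes",
        "body": {
            "name": req.name,
            "svm": {"name": req.svm},
            "aggregates": [{"name": req.aggregate}],
            "size": req.size_gib * 1024 ** 3,  # bytes
        },
    }
```

The design point is that the model only fills in four fields. It never composes the URL, the HTTP method, or the body, so a malformed or oversized request is rejected locally instead of being sent to the cluster.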

 

DataOps toolkit MCP server (Python-based): Designed for developer, AI engineering, and MLOps workflows, this local MCP server (running on the stdio transport) enables rapid data manipulation.

  • NFS volume lifecycle: Rapidly create and list NFS volumes with customizable parameters
  • FlexClone technology: Instantly trigger space-efficient volume clones, ideal for spinning up isolated sandboxes for AI experimentation
  • Snapshots & replication: Create and list snapshots for data versioning, and establish SnapMirror relationships for replication
  • FlexCache & CIFS shares: Manage FlexCache volumes for cloud bursting and create CIFS shares with specific ACLs

 

Kubernetes & cloud-native MCP servers

  • Trident CSI MCP Server: Manage persistent volumes and JupyterLab workspaces inside Kubernetes clusters, compatible across ONTAP, FSx for ONTAP, Azure NetApp Files, and Google Cloud NetApp Volumes
  • GCNV MCP Server: A TypeScript-based server for provisioning and managing NAS and iSCSI volumes on Google Cloud NetApp Volumes

 

Observability MCP server: Insights & analytics

Harvest MCP Server: Instead of hunting through complex dashboards or raw metric databases, the Harvest MCP Server translates open source metrics into conversational data.

  • Conversational metrics: Query historical and real-time operational data using natural language
  • Cross-platform visibility: Pull performance, capacity, and health metrics across NetApp ONTAP, StorageGRID, E-Series, and integrated Cisco switches

 

Retrieval MCP server: AI knowledge enrichment

NetApp Workload Factory GenAI MCP Server: This server focuses on the data layer rather than the infrastructure layer, allowing enterprise LLMs to retrieve context from private, unstructured files.

  • Unstructured file integration: Connect unstructured file data residing on-premises, in Amazon FSx for NetApp ONTAP, or Cloud Volumes ONTAP via SMB or NFS
  • Instant RAG: Expose NetApp-managed knowledge bases as standard MCP tools. When a user asks a question in Claude Desktop or Amazon Q, the LLM fetches real-time, relevant context directly from secure file systems.
  • No-code knowledge setup: Integrate with NetApp Console Workload Factory to continuously catalog and vectorize enterprise documents without requiring custom database pipelines

 

How NetApp implements MCP

ONTAP MCP exposes storage operations as structured tools, from volume lifecycle to data protection policies. What we want to focus on now is how that interface works and why the architecture matters for security.

The official ONTAP MCP documentation shows that the server can be run as a container, integrated with clients like GitHub Copilot or Claude Desktop, and configured against one or more ONTAP clusters using ontap.yaml. The server also supports streamable HTTP, multiple client connections, and centralized cluster registration, making it significantly more practical than building custom AI-to-storage integrations from scratch.

 

Critically, ONTAP MCP supports a --read-only mode so only non-mutating tools are registered. This is particularly useful for discovery-first deployments where teams want to explore what an agent can see before granting it the ability to act.
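
Conceptually, a read-only mode comes down to filtering the tool list at registration time. The sketch below illustrates that idea in Python. It is not the ontap-mcp implementation (which is written in Go), and the tool names and the `mutating` flag are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ToolSpec:
    name: str
    mutating: bool  # True if the tool changes cluster state


# Hypothetical tool catalog for illustration.
ALL_TOOLS = [
    ToolSpec("list_volumes", mutating=False),
    ToolSpec("get_cluster_health", mutating=False),
    ToolSpec("create_volume", mutating=True),
    ToolSpec("delete_snapshot", mutating=True),
]


def tools_to_register(all_tools: Iterable[ToolSpec], read_only: bool) -> List[ToolSpec]:
    """Return the tools a server would expose for the given mode."""
    if read_only:
        # Mutating tools are never registered, so they cannot be
        # discovered or invoked by the agent at all.
        return [t for t in all_tools if not t.mutating]
    return list(all_tools)
```

Filtering at registration is a stronger guarantee than rejecting calls at invocation time: in read-only mode the agent has no mutating tool to ask for in the first place.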

 

What a standard (secured) MCP flow should look like

[Figure: standard (secured) MCP flow]

 

 

How NetApp secures the MCP layer

The security value of MCP is that it creates a bounded execution path between AI and infrastructure. It is important to be precise here: ONTAP MCP itself supports several credential-handling patterns, including static credentials, credentials_script, and credentials_file, and the docs explicitly note that static passwords are not recommended for production use. Instead, external credential retrieval patterns are preferred.

Separately, ONTAP itself supports OAuth 2.0 for REST API access and uses REST roles for authorization decisions. The secure pattern therefore has two halves: the MCP layer retrieves credentials or tokens safely, and ONTAP then enforces what that identity is actually allowed to do.

 

MCP secret wrapper

One question comes up immediately: if MCP is the interface layer, how are the credentials it uses protected?

A modern pattern is to use tools like Secret Wrapper, so the MCP server does not store long-lived credentials locally. The MCP Secret Wrapper is designed as a lightweight layer that injects secrets from enterprise vaults — such as CyberArk and HashiCorp Vault — into MCP services on demand, without persisting them in local config files, environment variables, or source repositories. This reduces the risk of credential exposure if the MCP host or user device is compromised.

This fits naturally with the broader NetApp security story:

  • The wrapper controls how the MCP workflow retrieves credentials.
  • ONTAP OAuth and REST RBAC control what the workflow is allowed to do.
  • ONTAP audit/REST/EMS logging records what the workflow actually did.
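
The first bullet, retrieving credentials on demand rather than storing them, can be sketched as follows. This is not the MCP Secret Wrapper's actual interface. The `SecretFetcher` type stands in for whatever vault client is in use, and the secret path and field names (`username`, `password`) are assumptions.

```python
from typing import Any, Callable, Dict

# A hypothetical stand-in for a vault client call: given a secret path,
# return a dict with "username" and "password".
SecretFetcher = Callable[[str], Dict[str, str]]


def run_with_secret(
    fetch: SecretFetcher,
    secret_path: str,
    operation: Callable[[str, str], Any],
) -> Any:
    """Fetch credentials just in time, use them once, then drop them."""
    creds = fetch(secret_path)  # retrieved per call, never written to disk
    try:
        return operation(creds["username"], creds["password"])
    finally:
        # Best effort only: Python cannot guarantee the strings are
        # wiped from memory, but nothing is cached or persisted here.
        creds.clear()
```

Because the secret is fetched per operation, rotating it in the vault takes effect on the next call, and a copied config file or repository contains nothing worth stealing.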

MCP & secret wrapper secure flow

[Figure: MCP and secret wrapper secure flow]

 

 

Full audit trail: Tracking every agent action

One of the most critical requirements for running AI agents in production is the ability to answer a simple question after the fact: what exactly did the agent do, when, and under what identity?

ONTAP provides three complementary logging mechanisms that together create a complete audit trail for every action an AI agent takes through MCP.

 

ONTAP audit logs (Management activity)

ONTAP's management audit log captures administrative operations performed on the cluster, including every REST API call made by an MCP-authenticated identity. This includes:

  • What was requested: The specific API endpoint, HTTP method (GET, POST, PATCH, DELETE), and parameters
  • Who requested it: The authenticated user or OAuth client identity
  • When it happened: Timestamped entries for every operation
  • Whether it succeeded or failed: Response status for each call

This is the primary record for tracking agent-initiated changes like volume creation, resizing, snapshot policy application, and export policy modifications. Audit logs can be configured to capture read operations (GET requests) as well, which is important for environments where even discovery activity by an agent needs to be tracked.

 

REST API logs

Every interaction between the MCP server and ONTAP flows through the ONTAP REST API. ONTAP provides detailed logging of REST API activity, including:

  • Request and response tracking: Full visibility into the API calls the agent made and the responses ONTAP returned
  • Job tracking: Asynchronous operations (like volume creation or SnapMirror initialization) generate job objects that can be queried for status, duration, and outcome
  • Error capture: Failed API calls are logged with error codes and diagnostic messages, making it possible to trace exactly where and why an agent workflow failed.

For AI agent workflows, this is especially valuable because it creates a machine-readable record of every infrastructure action, taking you from the message that "the agent provisioned a volume" to the exact API call, parameters, and result.
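
Job tracking in particular lends itself to a short sketch. The helper below polls a job until it finishes. The `get_job` callable is a hypothetical stand-in for a REST client call that fetches the job object by UUID, and the terminal state names (`success`, `failure`) are what the ONTAP REST API is generally documented to return; verify both against the API reference for your ONTAP version.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"success", "failure"}  # assumed terminal job states


def wait_for_job(
    get_job: Callable[[str], Dict[str, Any]],
    job_uuid: str,
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal state or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        job = get_job(job_uuid)
        if job.get("state") in TERMINAL_STATES:
            return job  # caller inspects state, message, and timestamps
        if clock() >= deadline:
            raise TimeoutError(f"job {job_uuid} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the helper testable without real delays. In an agent workflow, the returned job object is what turns "the volume was probably created" into a recorded outcome.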

 

EMS (Event Management System) logs

ONTAP's Event Management System generates events for significant operational activities across the cluster. In the context of AI agent workflows, EMS provides:

  • Operational event visibility: Events for volume state changes, snapshot operations, policy modifications, replication status changes, and capacity thresholds
  • Security-relevant events: Authentication failures, authorization denials, configuration changes, and certificate events
  • Alerting and forwarding: EMS events can be configured to trigger alerts, forward to syslog destinations, or integrate with SIEM platforms for centralized monitoring.

EMS is particularly useful for detecting anomalous agent behavior. For example, if an AI agent triggers an unusual volume of snapshot deletions or repeatedly hits authorization failures, EMS events surface that activity for investigation.
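
The detection logic for that example can be sketched as a sliding-window counter over forwarded events. This would run in whatever system receives the events (a SIEM rule or a small consumer), not inside ONTAP. The event name used in the usage note is a hypothetical label, not a real EMS message name, and the window and threshold are arbitrary.

```python
from collections import deque
from typing import Deque, Dict


class EventRateMonitor:
    """Flag an event type when it occurs too often within a time window."""

    def __init__(self, window_s: float, threshold: int) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self._seen: Dict[str, Deque[float]] = {}

    def observe(self, event_name: str, timestamp: float) -> bool:
        """Record one event; return True if its rate exceeds the threshold."""
        times = self._seen.setdefault(event_name, deque())
        times.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while times and timestamp - times[0] > self.window_s:
            times.popleft()
        return len(times) > self.threshold
```

For example, `EventRateMonitor(window_s=60, threshold=3)` fed four `"snapshot.deleted"` events within a few seconds returns `True` on the fourth, which is the point at which an alert or a human review would be triggered.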

 

Bringing It All Together

When combined, these three logging layers provide full observability into AI agent operations:

Layer | What it captures | Why it matters for AI agents
Audit Logs | Who did what, when, and whether it succeeded | Identity-level accountability for every agent action
REST API Logs | Exact API calls, parameters, job status, and errors | Machine-readable proof of every infrastructure operation
EMS Events | Operational and security events across the cluster | Anomaly detection and real-time alerting on agent behavior

 

All three can be forwarded to external SIEM/SOC platforms, enabling security teams to monitor AI agent activity alongside other enterprise operations. This is what makes the difference between "we deployed an AI agent" and "we deployed an AI agent and can prove exactly what it did."

 

Example workflow

A realistic MCP-driven prompt might be:

"Show me the top five volumes by utilization, then create a new volume on the best target and apply the standard snapshot policy."

Through MCP, the agent:

  • Invokes a discovery tool to query volume capacity and utilization across registered clusters
  • Receives structured data back from ONTAP and reasons over the results to rank volumes and identify the best target
  • Invokes a provisioning tool to create the volume with the correct parameters on the selected target
  • Applies the snapshot policy through a policy tool

This is an example of agentic tool chaining where the agent sequences multiple MCP tool calls with LLM reasoning between each step. At no point does the model craft raw REST API calls or hold administrative credentials directly. Every action flows through a scoped, auditable MCP tool interface, and every step is captured across ONTAP's audit logs, REST API logs, and EMS events.

In production environments, workflows like this can be configured with human-in-the-loop confirmation before mutating operations, adding an additional layer of control over what the agent is allowed to execute autonomously.
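
The chained workflow and the confirmation gate can be sketched together. Everything here is illustrative: the tool names, the record fields (`name`, `used_pct`), and the function signatures are assumptions, `pick_target` stands in for the model's reasoning step, and `confirm` stands in for a human approval prompt.

```python
from typing import Any, Callable, Dict, List


def run_workflow(
    tools: Dict[str, Callable[..., Any]],
    pick_target: Callable[[List[Dict[str, Any]]], str],
    confirm: Callable[[str, Dict[str, Any]], bool],
    new_volume: str,
    policy: str,
) -> Dict[str, Any]:
    # Step 1: discovery (read-only, no confirmation needed).
    volumes = tools["list_volumes"]()
    top_five = sorted(volumes, key=lambda v: v["used_pct"], reverse=True)[:5]

    # Step 2: reasoning over structured results (the LLM's job in practice).
    target = pick_target(volumes)

    # Steps 3 and 4: mutating calls, each gated by confirmation.
    steps = [
        ("create_volume", {"name": new_volume, "aggregate": target}),
        ("apply_snapshot_policy", {"volume": new_volume, "policy": policy}),
    ]
    executed: List[str] = []
    for tool_name, args in steps:
        if not confirm(tool_name, args):
            break  # stop the chain; later steps depend on earlier ones
        tools[tool_name](**args)
        executed.append(tool_name)

    return {
        "top_five": [v["name"] for v in top_five],
        "target": target,
        "executed": executed,
    }
```

Stopping the chain on the first rejection matters here because the snapshot policy step assumes the volume exists. The returned summary gives the operator a compact record to compare against the ONTAP-side logs.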

 

Key takeaways

If the AI agent is “the brain,” then MCP is “the hands” — the structured interface that allows the agent to reach into enterprise infrastructure and take action. However, hands without discipline are dangerous. NetApp makes those hands precise and trustworthy by giving MCP a real enterprise control surface for storage operations across operational, observability, and retrieval use cases — secured through safer credential-handling patterns, ONTAP OAuth, REST RBAC, optional read-only mode, and a full audit trail that captures every action the agent takes across three complementary logging layers.

 

Part 3 will focus on the Shield: how NetApp's security foundation protects everything the agent touches.

 

Here are links to all the parts of this blog series. 

Blog Intro - Running AI agents on NetApp: Securely, practically, and without surprises

Part 1 — What an AI agent actually is, and why the data layer decides whether it succeeds

Part 2 — What is an MCP Server and why does it matter?

Part 3 — How NetApp empowers AI Agentic workflows

Part 4 — Configuring your NetApp infrastructure for AI agents
