NetApp Intelligent Data Services for AI Powered by NVIDIA NeMo Retriever & NVIDIA NIM Microservices

ArunGururajan · 2024-09-24

The rapid evolution of artificial intelligence has ushered in an era of unprecedented innovation and transformation across sectors. From healthcare and finance to retail and manufacturing, AI models are increasingly being deployed to enhance operational efficiency, drive decision-making, and unlock new growth opportunities. In healthcare, AI-based image analysis helps detect diseases more accurately and at an earlier stage; in finance, AI-powered chatbots are revolutionizing customer service by providing personalized support; and in retail, AI-driven recommendation systems personalize product offerings and enhance the customer experience.

However, despite these compelling use cases and their potential benefits, several key challenges hinder smooth deployment in enterprises. The first is effectively leveraging existing data regardless of its location, whether it is stored on premises, in the cloud, or in multi-cloud environments. This calls for robust integration capabilities across environments to deliver seamless data access. The second is making sure that the right data is always being used, because data quality and completeness affect the results and, when poor, can erode customer trust. Lastly, security and compliance concerns have emerged as major barriers to moving generative AI solutions from proof of concept to production. Addressing these challenges is crucial for the widespread adoption and successful deployment of AI and generative AI technologies.

Accelerating time to insights with NetApp and NVIDIA NeMo on NetApp AIPod

At the NVIDIA GTC conference in March, we announced how we are working with NVIDIA to advance the NetApp platform’s retrieval-augmented generation (RAG) capabilities for developers building generative AI applications and copilots. At NetApp INSIGHT, we will demonstrate an integration that combines the power of NetApp's robust data platform with the rich capabilities of NVIDIA’s NeMo ecosystem to deliver a comprehensive solution that addresses key challenges faced by businesses in their AI deployments.

The integrated solution offers several compelling benefits:

Accelerated Time to Insights: By breaking down data silos and simplifying data discovery, NetApp enables rapid access to data across hybrid multi-cloud data estates.
Enhanced Security and Compliance: Robust policy-based data governance on enterprise data and protection against large language model (LLM) attacks help ensure the security and compliance of AI deployments, safeguarding sensitive information and preventing regulatory risks.
Simplified Data Management: The integrated solution streamlines data management across hybrid multi-cloud environments, reducing complexity and administrative burdens associated with managing dispersed data assets.

NetApp's Data Platform integrates seamlessly with NVIDIA NeMo Retriever, providing a full-stack solution for deploying generative AI at scale. This approach enables organizations to:

Deploy AI Applications in Minutes: With NetApp's single-click secure deployments, businesses can rapidly deploy RAG applications, thus streamlining the development and deployment process.
Maximize Throughput and Minimize Latency: Accelerated performance speeds up both data access and model deployment, reducing time to insights and enhancing overall efficiency.
Increase Return on Investment (ROI) and Efficiency: With higher throughput and lower latency, businesses can build new generative AI applications quickly and at scale, leading to faster time to solution and improved ROI.

Let’s look at an example to understand the value. Data personas typically spend more than 80% of their time on data preparation, and the lack of data discoverability at this stage significantly impacts their productivity. To counter this, NetApp’s data explorer lets data personas access data from any storage anywhere in their data estate, and the global namespace enables a unified search for data discovery. This search looks not only for keywords in the filename but also for a semantic match with the file content.

In the example below (Figure 1), when a data persona searches for “Diabetic Patient Record,” NetApp’s data explorer finds and shows files related to diabetes, insulin, and asthma in milliseconds, thanks to the AI-powered search.
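To make the idea of combining a filename keyword match with a content match concrete, here is a minimal Python sketch. It is not NetApp's implementation: the function names, the `name_weight` parameter, and the sample files are all hypothetical. A simple bag-of-words cosine similarity stands in for the embedding model that a real semantic search would use, so it only rewards shared words (it would not, for example, relate "diabetic" to "diabetes" the way an embedding model can).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hybrid_search(query, files, name_weight=0.3):
    """Rank files by a weighted mix of filename and content scores.

    files: dict mapping filename -> text content (hypothetical sample data).
    name_weight: assumed blend factor, not a value from any product.
    """
    query_terms = set(tokenize(query))
    query_vec = Counter(tokenize(query))
    results = []
    for name, content in files.items():
        name_terms = set(tokenize(name))
        # Fraction of query terms that appear in the filename.
        name_score = (
            len(query_terms & name_terms) / len(query_terms) if query_terms else 0.0
        )
        # Stand-in for a semantic score; a real system would compare embeddings.
        content_score = cosine(query_vec, Counter(tokenize(content)))
        score = name_weight * name_score + (1 - name_weight) * content_score
        if score > 0:
            results.append((name, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


sample_files = {
    "patient_4411.txt": "Patient diagnosed with type 2 diabetes; insulin dosage adjusted.",
    "diabetic_diet.pdf": "Meal plan notes",
    "q3_invoice.txt": "Invoice for office supplies",
}
for name, score in hybrid_search("Diabetic Patient Record", sample_files):
    print(f"{name}: {score:.3f}")
```

In this sketch, a file can surface either because its name contains a query term or because its content overlaps with the query, and a file that matches on both ranks highest. Swapping `cosine` over word counts for a similarity over embedding vectors is what would turn the content score into a genuinely semantic one.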

Data personas can choose the files they want to use and create a curated logical container of the relevant datasets, which we call a data collection (Figure 2). These data collections can then power AI workflows ranging from model training to inferencing (such as RAG). With the advent of foundation models, obtaining insights from hundreds of files need not be a time-consuming task. With NetApp’s integration with the NVIDIA NeMo ecosystem, data personas can deploy their data collection in a secure and compliant way to the NVIDIA OVX-powered AIPod, which can run LLMs and provide insights in seconds (Figure 3).

Figure 2. An example of a data collection containing curated medical data

Figure 3. The chatbot experience powered by the underlying curated data collection.
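As a rough illustration of how a curated collection can feed a RAG-style chatbot, the Python sketch below models a data collection as a named list of file references and assembles retrieved passages into a grounded prompt. Everything here is hypothetical: the `DataCollection` class, the `build_rag_prompt` helper, the file paths, and the prompt wording are assumptions made for illustration and do not reflect NetApp or NVIDIA APIs. The retrieval step and the LLM call are deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataCollection:
    """Hypothetical logical container: stores references to files, not copies."""

    name: str
    file_paths: list[str] = field(default_factory=list)

    def add(self, path: str) -> None:
        # Skip duplicates so each file is referenced only once.
        if path not in self.file_paths:
            self.file_paths.append(path)


def build_rag_prompt(question: str, passages: list[str], max_passages: int = 3) -> str:
    """Combine the top retrieved passages and the question into one prompt."""
    context = "\n".join(
        f"[{i + 1}] {passage}" for i, passage in enumerate(passages[:max_passages])
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


collection = DataCollection(name="diabetes-study")
collection.add("/clinic/records/patient_4411.txt")  # illustrative paths only
collection.add("/clinic/records/patient_5120.txt")

# In a real pipeline these passages would come from a retriever over the collection.
passages = ["Patient 4411: insulin dosage adjusted.", "Patient 5120: diet plan updated."]
print(build_rag_prompt("Which patients had insulin changes?", passages))
```

Keeping the collection as a set of references rather than copies mirrors the "logical container" idea in the text: the curated view can be handed to a downstream workflow without moving the underlying data.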

To summarize, NetApp’s integration with the NVIDIA NeMo ecosystem is not only seamless (1-click deployment) but also efficient, secure, and compliant, allowing a data persona to quickly take a proof of concept into an enterprise-grade deployment.
By embracing this approach, organizations can drive digital transformation, enhance competitiveness, and achieve long-term success in an increasingly crowded market.

The solution will be available for Tech Preview later this calendar year.