Tech ONTAP Blogs
The true value of generative artificial intelligence (GenAI) is in leveraging your proprietary data—or “context”—to differentiate and enhance the general AI capabilities of public foundational models. This is where retrieval-augmented generation (RAG) solutions come into play.
But while basic designs for RAG solutions may handle small, localized datasets, scaling data access across hybrid and on-premises environments for enterprise-grade solutions introduces significant challenges.
In this post, I’ll outline how Amazon FSx for NetApp ONTAP (FSx for ONTAP) and NetApp BlueXP™ workload factory on AWS (workload factory) enable context-aware GenAI by simplifying data management and RAG-based design.
Read on, here’s what we cover:
For any GenAI model to generate contextually accurate responses, it needs seamless access to your proprietary data. Whether you're working with a first-party or third-party AI solution, your data needs to be nearby, embedded, and retrievable.
However, that can be easier said than done. Especially when working with on-premises or hybrid data deployments, integrating your data sources into a RAG-based GenAI infrastructure comes with a few challenges:
Amazon FSx for NetApp ONTAP (FSx for ONTAP) is a data storage service from AWS that extends the benefits of NetApp® ONTAP® to the AWS Cloud. Designed for multi-protocol access, automatic storage scaling, and many other data management features, FSx for ONTAP offers high performance and cost optimization for enterprises whose business relies on data.
FSx for ONTAP acts as an intelligent data layer for GenAI that addresses the main data infrastructure challenges based on best practices. This diagram shows the Amazon GenAI stack with FSx for ONTAP:
For the data management of your RAG-based GenAI applications on AWS, FSx for ONTAP provides a seamless path to access and manage your data with extended data mobility and infrastructure support.
For the deployment of your RAG-based GenAI applications, FSx for ONTAP exposes a REST API that allows easy data access. You can develop your custom workloads using these APIs directly, or rely on the best-practice no-dev workloads available to you by using workload factory, which we’ll take a look at next.
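As a minimal sketch of what working with the REST API directly can look like, the snippet below builds (but does not send) a request that lists volumes. It assumes the standard ONTAP REST path `/api/storage/volumes`, its `fields` query parameter, and HTTP basic authentication; the management endpoint hostname, the user name, and the password are placeholders, not values from this post.

```python
import base64
import urllib.parse
import urllib.request


def build_volume_list_request(management_endpoint, username, password, fields=("name", "size")):
    """Build (but do not send) a GET request that lists volumes over the ONTAP REST API."""
    # Ask only for the fields we need; keep the comma readable in the query string.
    query = urllib.parse.urlencode({"fields": ",".join(fields)}, safe=",")
    url = f"https://{management_endpoint}/api/storage/volumes?{query}"
    # HTTP basic authentication header (assumed auth scheme for this sketch).
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Hypothetical endpoint and credentials, for illustration only.
request = build_volume_list_request(
    "management.fs-example.fsx.example.com", "fsxadmin", "example-password"
)
print(request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since it needs network access to a real file system and appropriate TLS settings.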
BlueXP workload factory (workload factory) is a free-of-charge NetApp service that abstracts the FSx for ONTAP API, enabling users to build and manage RAG-based GenAI applications in AWS without any development overhead.
Through its user-friendly UI, workload factory simplifies the management of data, infrastructure, and AI infrastructure configurations, so your applications are both secure and easy to align with compliance goals.
Here you can see the workload factory UI:
Additionally, workload factory gives you the ability to embed your FSx for ONTAP workloads and artifacts in your custom GenAI applications and infrastructure with the workload factory APIs.
Building a GenAI application with workload factory and FSx for ONTAP involves a few key steps:
Workload factory then sets up the RAG pipeline for you, which includes embedding the data sources into a LanceDB vector database and automatically syncing changes made to those data sources. After the first sync completes, you can publish the knowledge base so that GenAI applications can access it.
3. Test and deploy your GenAI chatbot: The RAG-based chatbot is available in the workload factory UI and accessible for external applications via the workload factory API.
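To make the embed, sync, and retrieve flow described above more concrete, here is a toy sketch. It is not how workload factory or LanceDB is implemented: the `KnowledgeBase` class is a hypothetical in-memory stand-in for a vector database, `embed` is a bag-of-words placeholder for a real embedding model, and the file names and contents are made up for illustration.

```python
import hashlib
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a normalized bag-of-words vector (a real pipeline would call an embedding model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


class KnowledgeBase:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.entries = {}  # path -> (content hash, embedding, text)

    def sync(self, files):
        """Re-embed new or changed files, drop deleted ones; return the re-embedded paths."""
        changed = []
        for path, text in files.items():
            digest = hashlib.sha256(text.encode()).hexdigest()
            if path not in self.entries or self.entries[path][0] != digest:
                self.entries[path] = (digest, embed(text), text)
                changed.append(path)
        for path in set(self.entries) - set(files):
            del self.entries[path]
        return changed

    def search(self, question, top_k=1):
        """Return the top_k paths ranked by similarity to the question."""
        query = embed(question)
        ranked = sorted(self.entries, key=lambda p: cosine(query, self.entries[p][1]), reverse=True)
        return ranked[:top_k]


kb = KnowledgeBase()
files = {
    "manuals/pump.txt": "Pump maintenance: replace the seal every six months.",
    "policies/leave.txt": "Employees accrue vacation leave monthly.",
}
first_sync = kb.sync(files)   # first sync embeds every file
files["manuals/pump.txt"] = "Pump maintenance: replace the seal every three months."
second_sync = kb.sync(files)  # later syncs re-embed only what changed
hits = kb.search("How often should I replace the pump seal?")
print(first_sync, second_sync, hits)
```

Hashing file contents is just one simple way to detect changes; it keeps repeat syncs cheap because unchanged files are never re-embedded.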
Below we explore three real-world use cases to provide a practical perspective on what workload factory with FSx for ONTAP can offer for your GenAI initiatives.
Use case: Doctors need quick access to accurate medical records and drug information in real time to provide optimal patient care.
Using workload factory and FSx for ONTAP, healthcare organizations can integrate patient records and drug databases across regions into a centralized knowledge base. That data will be securely stored and ready to be accessed in real time thanks to data mirroring. Access control lists (ACLs) then limit the RAG-based solution to surfacing only the patient-record information it is permitted to access.
Medical professionals can interact with a chatbot, allowing them to quickly cross-reference patient records with drug databases, reducing the risk of adverse drug interactions and improving patient outcomes.
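The ACL-based restriction mentioned above can be pictured as a filter applied to retrieved chunks before they reach the model. The sketch below is an illustration under assumptions, not the product's mechanism: the `Chunk` type, the group names, and the sample texts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_groups: frozenset  # groups permitted to read the source file


def filter_by_acl(chunks, user_groups):
    """Keep only chunks whose source file is readable by at least one of the user's groups."""
    groups = frozenset(user_groups)
    return [chunk for chunk in chunks if chunk.allowed_groups & groups]


# Hypothetical retrieval results with the ACL carried over from each source file.
retrieved = [
    Chunk("Drug interaction reference entry", frozenset({"physicians", "pharmacists"})),
    Chunk("Patient record summary", frozenset({"physicians"})),
    Chunk("Billing dispute notes", frozenset({"finance"})),
]
visible = filter_by_acl(retrieved, ["pharmacists"])
print([chunk.text for chunk in visible])
```

Filtering at retrieval time means restricted text never enters the prompt, so the model cannot repeat it in an answer.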
Here’s what this medical use case would look like:
Use case: Factory personnel need immediate guidance when machines malfunction, and downtime leads to significant revenue loss.
With the GenAI-based chatbot powered by Amazon Bedrock and FSx for ONTAP, technicians can input error codes and get real-time instructions. The chatbot pulls from maintenance logs, user manuals, and schematics stored in the embedded knowledge base, guiding the staff through troubleshooting procedures.
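The "augmented" step of this flow, where retrieved passages are combined with the technician's question, might look like the sketch below. The function name, prompt wording, file names, and the error code `E42` are hypothetical examples; the call to a foundation model (for instance through Amazon Bedrock) is deliberately left out.

```python
def build_prompt(question, passages, max_passages=3):
    """Assemble an augmented prompt: numbered context passages followed by the question."""
    selected = passages[:max_passages]
    context = "\n".join(
        f"[{i}] {source}: {text}" for i, (source, text) in enumerate(selected, start=1)
    )
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical passages, as if returned by a knowledge base search.
passages = [
    ("manual.pdf", "Error E42 indicates low hydraulic pressure. Check the reservoir level."),
    ("maintenance-log.csv", "E42 last cleared by refilling the reservoir."),
]
prompt = build_prompt("How do I resolve error E42?", passages)
print(prompt)
```

Numbering the passages and naming their sources makes it easier for the model's answer to point back to the manual or log entry it relied on.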
Here’s what a factory chatbot use case would look like:
The data is protected and managed via FSx for ONTAP Snapshot capabilities, cloning, and replication. Because technicians receive immediate, context-aware guidance without waiting for expert intervention, downtime is reduced and operational efficiency improves.
Use case: Enterprises need GenAI solutions that work seamlessly across both cloud and on-premises data environments while maintaining high-performance AI inference and low-latency access.
FSx for ONTAP delivers the high performance required for AI workloads, especially in hybrid deployments where data may reside both on-premises and in the cloud. Leveraging its advanced caching features, businesses can create GenAI chatbots that perform well even during cloud-bursting scenarios, where high-demand workloads are dynamically shifted between cloud and local resources.
Here’s what the hybrid use case for a GenAI chatbot would look like:
The RAG-based solution provides the chatbot with access to the most up-to-date information from both environments. Also, FSx for ONTAP retains metadata and enforces security policies across different data sources, keeping sensitive information protected, regardless of where it’s accessed.
With FSx for ONTAP, you can address critical data challenges like managing large datasets, retaining metadata, and ensuring data security. With workload factory, you can use no-code workloads to develop RAG-based GenAI applications that have access to your FSx for ONTAP sources.
By combining these two technologies, businesses can accelerate adoption of GenAI applications that deliver real-time, contextually accurate insights, without the overhead of managing complex data environments.
Ready to take the next step? Visit the BlueXP workload factory homepage or get started now with a free account.