Maximizing the value of GenAI with Amazon Bedrock and Amazon FSx for NetApp ONTAP

YuvalK · 2024-07-10

Generative artificial intelligence (GenAI) has revolutionized how businesses and individuals find inspiration, develop knowledge, prototype ideas, and create content. For your business, the capabilities unlocked by GenAI translate into a new way of thinking about processes for internal teams and customers, as well as a new perspective on your data.

Traditionally, businesses have based data-driven processes on structured data hosted in databases. Unstructured data, such as text, images, video, and audio, has been sidelined because its unsystematic nature and sheer volume make it hard to process. With 90% of new enterprise data being unstructured, the lost opportunity has been substantial. GenAI models, however, can process and generate unstructured data at scale. With GenAI, you can now use the full potential of your data to drive your business processes.

Unlocking this value is made easy using Amazon Bedrock with Amazon FSx for NetApp ONTAP (FSx for ONTAP). You’ll get a secure, scalable, highly performant GenAI solution for your proprietary data that’s customizable and fully managed by AWS. Let’s see how it’s done.

Read on to find out more about how GenAI and proprietary data work, or jump ahead using these links:

How do you use GenAI on proprietary data?

What is retrieval-augmented generation in GenAI?

The benefits of GenAI for your business

How to create a RAG application with FSx for ONTAP data

Solution components

How it works: Solution overview

Build a RAG-based AI agent with FSx for ONTAP and Amazon Bedrock

Prerequisites

Ingestion phase

Query phase

Unlock the full potential of GenAI

How do you use GenAI on proprietary data?

Pre-trained GenAI models need access to your proprietary data to be able to extract relevant information and patterns that are tailored to your business when generating their outputs.

The tasks performed by GenAI models are complex. Such models have been trained on vast quantities of data using an immense amount of computing resources. Training your own GenAI model from scratch on proprietary data and reaching a similar level of state-of-the-art performance is simply not feasible. Instead, you can use existing pre-trained models as foundation models (FMs) and connect them to your proprietary data.

Foundation models are general-purpose models that can support a wide variety of end-user scenarios out of the box.

There are a lot of foundation models to choose from, many of which are supported by Amazon Bedrock, the fully managed service from AWS for connecting to established foundation models via APIs.

While you can further specialize FMs via instruction tuning to achieve a more precise and reliable performance for a downstream task of interest, connecting your proprietary data to the foundation model is the only necessary step to tailor these models to your business context.

A large language model (LLM) such as Amazon Titan can perform state-of-the-art text generation, text classification, question answering, information extraction, and text embedding, all in one service.

Using FSx for ONTAP with Amazon Bedrock provides an efficient solution for using LLMs on your proprietary data: retrieval-augmented generation (RAG).

What is retrieval-augmented generation in GenAI?

RAG allows an LLM to retrieve information from external sources that it did not see during training, such as your proprietary data, when generating responses.

When you prompt an LLM, you are asking the model to find an answer to a query. RAG provides the model with dynamic access to your data sources so that it can automatically augment the prompt with relevant proprietary information.
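As a rough illustration of what "augmenting the prompt" means, here is a minimal Python sketch. The function name, the prompt wording, and the sample passages are all hypothetical; a real RAG system would obtain the passages from a retrieval step and may format the prompt differently.

```python
def build_augmented_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    # Number each passage so the model's answer can be traced back to it.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages that a retrieval step might return.
passages = [
    "Orders placed before 2 p.m. ship the same business day.",
    "Returns are accepted within 30 days of delivery.",
]
prompt = build_augmented_prompt("When will my order ship?", passages)
print(prompt)
```

The model never sees your whole data set, only the few passages selected for this particular question.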

RAG needs a way to search through your documents quickly and accurately in order to support real-time queries. The solution for that is vector embeddings.

Here is an example user query on proprietary data with a RAG-based GenAI approach (source: AWS News Blog)

A vector embedding is a meaningful numerical representation of a chunk of your document that can capture the semantic context of your data. A chunk can be a set number of characters, a sentence, or a paragraph. Vector embeddings are stored in vector databases, which use advanced indexing to support quick retrieval. The N most similar vector embeddings to the embedded input prompt are retrieved at runtime to provide business context.
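The sketch below illustrates chunking and top-N retrieval in Python. It is a toy under stated assumptions: `toy_embed` uses word counts as a stand-in for a real embedding model, the similarity search is a plain sort rather than the indexed lookup a vector database performs, and all names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 80) -> list[str]:
    """Fixed-size chunking: split text into pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n(query: str, chunks: list[str], n: int = 2) -> list[str]:
    """Return the n chunks whose embeddings are most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, toy_embed(c)), reverse=True)
    return ranked[:n]


# Here each chunk is one sentence; chunk_text shows the fixed-size alternative.
chunks = [
    "Invoices are due within 30 days.",
    "The office is closed on public holidays.",
    "Late invoices incur a 2% fee.",
]
print(top_n("When are invoices due?", chunks, n=2))
```

A real embedding model captures meaning rather than exact word overlap, so it can also match a query such as "payment deadline" to the first chunk, which this toy cannot.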

With RAG, you customize the FM’s behavior at runtime while keeping the latency of response generation within budget.

The benefits of GenAI for your business

GenAI, especially RAG-based GenAI applications, can dramatically improve your organization’s processes and offerings, and even unlock business avenues that were previously unfeasible, providing you with a unique and modern edge.

Both your internal teams and external customers can benefit from GenAI solutions. Internal teams from all lines of business can boost their productivity and enjoy optimized workflows. Customers and partners can experience new methods of interacting with your products and services, in ways that are more effective and can be highly tailored to them.

For example, you could have a specialized GenAI chatbot to support your enterprise IT data science team with questions on proprietary data sources as well as general best practices, and another specialized GenAI chatbot to support your customers in learning about your products as well as placing an order or keeping track of deliveries.

With FSx for ONTAP and Amazon Bedrock, you can develop these kinds of GenAI solutions in just minutes.

How to create a RAG application with FSx for ONTAP data

You can connect your data from FSx for ONTAP with Amazon Bedrock via API calls in TypeScript or via NetApp BlueXP™ workload factory for AWS, the workload-oriented service built into NetApp® BlueXP to help you migrate, protect, and optimize your workloads on FSx for ONTAP.

This section will explain how to do it.

Solution components

First, let’s take a look at the components involved.

FSx for ONTAP is the shared file, block, and object storage service that delivers NetApp® ONTAP® capabilities as an AWS-managed service.

Many organizations use ONTAP to store enterprise data in large quantities. With FSx for ONTAP you can use your data to build GenAI models leveraging AWS tools and services. FSx for ONTAP provides you with some key capabilities such as data collection from multiple sources, continuous data updates, and access to management functionalities for your RAG data.

Other benefits include multi-protocol access, default high availability, data protection, cost-cutting storage efficiencies, and adjustable high-performance throughput and IOPS levels.

Amazon Bedrock is the fully managed service on the AWS cloud that provides access, via APIs, to foundation models from Amazon and from third-party providers such as AI21 Labs, Anthropic, and Meta. With Amazon Bedrock, you can connect your data sources, adding your datasets to the FM's knowledge.
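To give a feel for API access to a Bedrock model, here is a hedged Python sketch that requests a text embedding through the AWS SDK for Python (boto3) with `invoke_model` on the `bedrock-runtime` client. It shows direct Bedrock access and is not the integration that workload factory performs for you. The model ID `amazon.titan-embed-text-v1` and the region are assumptions; use whichever embedding model and region are enabled in your account. `FakeBedrockClient` is a hypothetical offline stand-in so the sketch can be tried without AWS credentials.

```python
import io
import json


def embed_text(client, text: str, model_id: str = "amazon.titan-embed-text-v1") -> list[float]:
    """Return the embedding vector for `text` from a Bedrock embedding model."""
    response = client.invoke_model(
        modelId=model_id,  # assumed model ID; adjust to your account
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    # The response body is a stream containing a JSON document.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def make_bedrock_client(region: str = "us-east-1"):
    """Create a real Bedrock runtime client (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the sketch also runs offline
    return boto3.client("bedrock-runtime", region_name=region)


class FakeBedrockClient:
    """Hypothetical offline stand-in that mimics the invoke_model response shape."""

    def invoke_model(self, **kwargs):
        text = json.loads(kwargs["body"])["inputText"]
        fake = {"embedding": [float(len(text)), 0.0]}  # not a meaningful vector
        return {"body": io.BytesIO(json.dumps(fake).encode("utf-8"))}


vector = embed_text(FakeBedrockClient(), "hello")
print(vector)
```

To call the real service, pass `make_bedrock_client()` instead of the fake client.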

BlueXP workload factory is the new NetApp service that helps in simplifying and automating your FSx for ONTAP deployments with wizards, dashboards, and infrastructure as code. The BlueXP workload factory GenAI capability will let you connect your organization’s private data on FSx for ONTAP with Amazon Bedrock to create GenAI applications.

How it works: Solution overview

The high-level solution diagram below showcases how you can seamlessly connect FSx for ONTAP with Amazon Bedrock.

A typical architecture for a GenAI-powered chatbot assistant developed using FSx for ONTAP with Amazon Bedrock

The solution architecture is composed of these components:

Amazon GenAI infrastructure
Amazon Bedrock provides access to embedding models, such as Amazon Titan, which can be used to create vector embeddings for your data sources, and to foundation models, such as those from AI21 Labs or Anthropic’s Claude, which are used to create your GenAI chatbot assistants.
Enterprise data on FSx for ONTAP
FSx for ONTAP hosts your enterprise data in volumes, offering state-of-the-art data management functionalities such as efficient data change tracking, automated NetApp Snapshots™ and backups, and support for both Windows and Unix/Linux environments.
Embedding datastore
The vector embeddings are stored in a vector database (LanceDB) hosted on FSx for ONTAP, fully managed for you from deployment to backups. For the first sync, a full scan of your data sources is performed to populate the database. After that, the vector embeddings are updated asynchronously through a periodic scan of the data sources for changes; in the future, changes will be detected synchronously using NetApp FPolicy®.

Updating the vector embeddings keeps them in sync with your file system: the system automatically chunks and embeds new or updated files stored in the data sources by sending the new data to the embedding model on Amazon Bedrock. This way, users can trust that the responses from the GenAI model are based on the latest data.

To set up this knowledge base, you just need to specify the data location, select an embedding model, and provide the details for your vector database.

NetApp AI engine
NetApp AI engine connects FSx for ONTAP and Amazon Bedrock to support you with the creation and updates of the vector database with your enterprise private data. The NetApp AI engine is also used by GenAI applications to get that data for response generation.
NetApp AI service backend
Once the user asks the RAG-based GenAI application a question, the model can provide the most relevant answers using both static knowledge gathered during training and dynamic knowledge retrieved from your data sources. Users can now chat with a personalized GenAI chatbot.

With the capabilities offered by FSx for ONTAP, you don’t need to worry about infrastructure and creating connectors for integration. You can focus all your attention on getting the full business value out of your GenAI applications.

Build a RAG-based AI agent with FSx for ONTAP and Amazon Bedrock

Here is an explanation of the process flow pictured in the diagram above.

The process can be divided into two phases:

Ingestion phase: To define the RAG system, you give Amazon Bedrock access to your proprietary data stored in FSx for ONTAP in an organized way, via an embedding store.
Query phase: The RAG system supports real-time user interactions with your AI agent.

Prerequisites

Make sure the relevant enterprise data to be accessed by the AI agent is in your knowledge base on FSx for ONTAP. Remember that FSx for ONTAP allows you to connect data both on premises and on AWS.

To learn more about how this is done, check out our post How to deploy and manage RAG knowledge base on FSx for ONTAP with BlueXP workload factory GenAI.

Ingestion phase

The initial setup of the RAG system is orchestrated by the NetApp AI engine.

First, the AI engine scans the data from your data sources, beginning with a full scan and then only looking at new and updated information.
Next, the AI engine performs customizable data cleaning and chunking, which allow parallel preprocessing and efficient analysis, and then calls the embedding model on Amazon Bedrock.
Then, the AI engine stores the embedded data in the vector database together with relevant metadata.

Your proprietary data is now in a format that AI agents can access.
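The ingestion steps above can be sketched in Python as follows. This is not the NetApp AI engine's implementation: the in-memory `VectorStore`, the content-hash change detection, the whitespace-only cleaning, and the fixed-size chunking are all simplifying assumptions chosen to keep the example short, and `embed` is any function that maps text to a vector.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class VectorStore:
    """In-memory stand-in for a vector database."""
    rows: list[dict] = field(default_factory=list)
    seen: dict[str, str] = field(default_factory=dict)  # path -> content hash


def clean(text: str) -> str:
    """Collapse runs of whitespace; a placeholder for real data cleaning."""
    return " ".join(text.split())


def chunk(text: str, size: int) -> list[str]:
    """Split text into pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def ingest(files: dict[str, str], store: VectorStore,
           embed: Callable[[str], list[float]], chunk_size: int = 200) -> list[str]:
    """Embed new or changed files and return the paths that were processed."""
    processed = []
    for path, text in files.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if store.seen.get(path) == digest:
            continue  # unchanged since the last scan, nothing to do
        # Drop stale rows for this file before re-embedding it.
        store.rows = [row for row in store.rows if row["path"] != path]
        for index, piece in enumerate(chunk(clean(text), chunk_size)):
            store.rows.append(
                {"path": path, "chunk": index, "text": piece, "vector": embed(piece)}
            )
        store.seen[path] = digest
        processed.append(path)
    return processed


# Hypothetical data source and a dummy embedding function for illustration.
store = VectorStore()
files = {"policy.txt": "Returns are   accepted within 30 days.", "faq.txt": "x" * 450}
print(ingest(files, store, embed=lambda t: [float(len(t))]))  # first run: full scan
print(ingest(files, store, embed=lambda t: [float(len(t))]))  # second run: no changes
```

The first call behaves like the initial full scan; later calls only process files whose content changed. The sketch does not handle deleted files, which a production system would also need to remove from the database.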

Query phase

Real-time user interactions require your solution to connect the user to an AI chat model and to give that model access to the encoded knowledge base.

The process workflow for carrying out a chat is as follows:

The user enters a prompt through the UI.
Next, the prompt goes to the NetApp AI service backend.
Then, the prompt is sent to the AI engine.
The AI engine accesses the vector database to retrieve the most relevant embedded data to support the user request.
The AI model in Amazon Bedrock generates an answer to the user request based on the original prompt and retrieved embedded internal knowledge.
The AI backend sends the most relevant answer back to the user.

This process is repeated multiple times as the user continues to interact with the RAG-based AI chatbot assistant.
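The chat workflow can be sketched in Python as two small classes. The class names mirror the roles described above but are hypothetical, as is the way conversation history is folded into the prompt; `retrieve` and `generate` stand in for the vector database lookup and the Amazon Bedrock model call.

```python
from typing import Callable

Retriever = Callable[[str, int], list[str]]  # (prompt, k) -> relevant passages
Generator = Callable[[str], str]             # full prompt -> model answer


class AIEngine:
    """Retrieves context for a prompt and asks the model for an answer."""

    def __init__(self, retrieve: Retriever, generate: Generator, top_k: int = 3):
        self.retrieve = retrieve
        self.generate = generate
        self.top_k = top_k

    def answer(self, prompt: str, history: list[tuple[str, str]]) -> str:
        passages = self.retrieve(prompt, self.top_k)
        context = "\n".join(f"- {p}" for p in passages)
        past = "".join(f"User: {q}\nAssistant: {a}\n" for q, a in history)
        full_prompt = f"Context:\n{context}\n\n{past}User: {prompt}\nAssistant:"
        return self.generate(full_prompt)


class ServiceBackend:
    """Receives prompts from the UI, forwards them, and keeps the chat history."""

    def __init__(self, engine: AIEngine):
        self.engine = engine
        self.history: list[tuple[str, str]] = []

    def chat(self, prompt: str) -> str:
        reply = self.engine.answer(prompt, self.history)
        self.history.append((prompt, reply))
        return reply


# Dummy retrieval and generation functions for illustration only.
backend = ServiceBackend(
    AIEngine(
        retrieve=lambda prompt, k: ["Orders ship within 2 business days."][:k],
        generate=lambda full_prompt: f"(model answer based on {len(full_prompt)} characters)",
    )
)
print(backend.chat("When will my order ship?"))
print(backend.chat("And if I order on a Friday?"))
```

Because the backend stores each exchange, the second question is sent to the model together with the first question and its answer, which is what lets follow-up questions make sense.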

Unlock the full potential of GenAI

It’s a rapidly evolving IT world and GenAI is at the center of it. To unlock the full potential of GenAI for your organization, you can connect your data on Amazon FSx for NetApp ONTAP with Amazon Bedrock.

You can seamlessly connect your unstructured and structured data hosted on FSx for ONTAP to the GenAI models hosted on Amazon Bedrock while ensuring performance, cost optimization, data protection, and security.

With FSx for ONTAP and Amazon Bedrock, you can create your custom AI chatbot assistants in minutes without the need for deep AI knowledge and without your data ever leaving your environment.

To learn more, visit the BlueXP workload factory homepage, read How to deploy and manage a RAG knowledge base on FSx for ONTAP with BlueXP workload factory GenAI, or get started now.