Tech ONTAP Blogs

How to deploy and manage a RAG knowledge base using FSx for ONTAP with BlueXP workload factory GenAI

robertbell
NetApp

BlueXP™ workload factory for AWS (workload factory) is here to help orchestrate and automate workloads for your Amazon FSx for NetApp ONTAP (FSx for ONTAP) data. Among its many capabilities, workload factory GenAI brings enterprise data into generative artificial intelligence (GenAI) architectures. Together with Amazon Bedrock, workload factory can easily set up retrieval-augmented generation (RAG) pipelines.

 

This post will guide you on how to set up and operate workload factory GenAI to create a knowledge base, connect it to your data sources on FSx for ONTAP, and make it accessible to your RAG-based AI applications. Additionally, you will discover how the workload factory API enables you to easily embed your RAG-based solution within your custom GenAI applications and infrastructure.

 

 

Set up your RAG-based GenAI pipeline using workload factory

 

Step 1: Start the workload factory GenAI workload

First, go to the workload factory home page. Log in with an existing workload factory account or sign up to create a new one.

 

Workload factory is used to deploy and manage a variety of different workloads. Workload factory GenAI is located in the AI section of the workload factory home page. To learn more about this process, read our post Maximizing the value of GenAI with Amazon Bedrock and Amazon FSx for NetApp ONTAP.

 

Go to the navigation menu on the left and select the GenAI icon (the chip shape). 

Another way to do this is to select “Deploy & manage” from the options in the AI section of the workload factory homepage.

 


On the next screen you can read the Introduction and then click “Get started.”

 

Step 2: Create your first knowledge base

Workload factory GenAI starts you off by asking you either to select and manage an existing knowledge base from the list, or to add a new knowledge base for the RAG function.

 

A knowledge base consists of one or more data sources on FSx for ONTAP or on-premises NetApp® ONTAP®. The data within the knowledge base is an embedding representation of your source data, automatically stored under the hood in a vector database. This data can then be used to augment prompts from the GenAI application. 
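The flow described above is the standard RAG pattern: source documents are turned into embedding vectors, stored in a vector database, and the closest matches to a query are retrieved to augment the prompt. A minimal sketch of that idea, using toy bag-of-words vectors in place of a real embedding model (all document text and function names here are illustrative, not part of workload factory):

```python
from collections import Counter
import math

def embed(text):
    # Toy embedding: bag-of-words term counts. A real pipeline would call
    # an Amazon Bedrock embedding model instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "vector database": embeddings of your source documents
docs = [
    "ONTAP snapshots protect volume data",
    "RAG augments prompts with retrieved context",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query, k=1):
    # Return the k documents closest to the query embedding
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Augment the prompt with retrieved context before calling the chat model
question = "How does RAG augment a prompt?"
prompt = f"Context: {retrieve(question)[0]}\nQuestion: {question}"
```

Workload factory performs this embedding and retrieval for you under the hood; the sketch only shows why the knowledge base stores vectors rather than raw files.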

To get started, click the “Add knowledge base” button.


 

Step 3: Define infrastructure

On this screen you will be asked to enter your AWS credentials and to define the deployment location and the security key pair.


 

Step 4: Configure your knowledge base

In the next screen you’ll be asked to define the details of your knowledge base.

Make sure to follow your organization’s best practices on styling conventions and AI governance as you complete this section.


Click on each of the sections to configure the knowledge base’s name, description, embedding model, and chat model.

 

You can also set data guardrails: private data masking before the data is embedded. This feature is powered by BlueXP classification.


 

Next, you can set conversation starters. Automatic mode will generate four conversation starters once your data is scanned, and Manual mode will allow you to define your own.

 

Finally, choose the FSx for ONTAP file system and SVM where you want the knowledge base vector data to be stored, and then give the vector database volume a name and a snapshot policy.

 

Once you have entered all the required information for the knowledge base, click the “Create Knowledge base” button.

 

Step 5: Add an FSx for ONTAP data source

Workload factory GenAI lets you add FSx for ONTAP volumes as data sources for your RAG’s knowledge base.

 

If the data source is an ONTAP system located on-premises, you need to replicate it to an FSx for ONTAP file system using workload factory. To set up the replication relationship, follow these instructions.

 

  1. At this stage there are two options depending on how you’re using workload factory: 
  • If you just created a new knowledge base (as shown in the previous step), click the "Add data source" button.

You’ll now be in the “Add data source” screen, where you’ll be prompted to select one or more file systems from a list of available FSx for ONTAP file systems. (Note: If you haven’t set up an FSx for ONTAP file system before, here are the instructions.)


  • If you already have a knowledge base, you’ll be starting from the workload factory AI workload homepage, where you'll see a list of all your existing knowledge bases. Select the knowledge base to which you wish to add data sources. 


Select the file system that you want the GenAI application to access, and click “Next.” 


  2. In the next screen, select one or more volumes where your private data is stored.


  3. For SMB volume(s), you need to enter your user credentials, domain name, and Active Directory IP address.

When you’re done, click the “Apply” button.


  4. In the next screen, choose to embed either the entire volume or specific folders in the selected volume(s).


  5. If you choose the option for specific folders, you’ll be presented with a list of all the folders that reside in the volume.

Select each of the folders you want to use for the RAG pipeline.


  6. Next, define the parameters for the embedding model that will be used to create embedding vectors for your data sources. Here you can choose how your data is chunked and stored in the vector database.


  7. SMB volume data sources also have the option to enable “Permission aware.” This setting restricts answers to sources that the querying user is allowed to access.


  8. Once your data source is added, it will be listed under the corresponding knowledge base on the workload factory GenAI screen.


  9. From here, you can view and manage this knowledge base and any others that you create.

Find the knowledge base you want to manage in the list and click the three-dot menu icon to its right. Then select the “Manage Knowledge base” option from the drop-down menu.


  10. This opens a screen where you can view the details of the knowledge base and add more data sources.

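The chunking parameters and the “Permission aware” setting configured above can be illustrated together. A rough sketch, assuming hypothetical chunk sizes and ACL metadata (in practice workload factory handles the chunking and SMB permission checks for you):

```python
def chunk(text, size=200, overlap=40):
    # Split text into overlapping chunks; size and overlap mirror the
    # embedding parameters you configure for the data source.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# Hypothetical index entries: each chunk keeps the SMB principals allowed to
# read its source file, so retrieval can be permission aware.
source = "FSx for ONTAP keeps the vector database volume next to your data. " * 10
index = [{"text": c, "allowed": {"alice", "engineering"}} for c in chunk(source)]

def retrieve_for(principals):
    # Permission-aware retrieval: only return chunks whose source-file ACL
    # intersects the calling user's principals.
    return [e["text"] for e in index if e["allowed"] & set(principals)]
```

Overlap between adjacent chunks helps keep sentences that straddle a chunk boundary retrievable from either side.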

 

Step 6: Publish the knowledge base

Publishing the knowledge base activates a unique API endpoint so that your GenAI applications can access the knowledge base data.

Go to the Actions menu at the upper right of the screen and select “Manage authentication settings.”


Now go to the Actions menu again and select the Publish option.

 

Your knowledge base has now been published for your GenAI apps to find.
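Once published, your applications reach the knowledge base over HTTPS using the token from the authentication settings. A minimal sketch of building such a call; the endpoint URL, payload shape, and token below are placeholders for illustration, not the actual workload factory API:

```python
import json
import urllib.request

# Placeholders: take the real endpoint URL and token from your published
# knowledge base's authentication settings.
ENDPOINT = "https://example.com/api/v1/knowledge-bases/<kb-id>/chat"
TOKEN = "YOUR_API_TOKEN"

def build_chat_request(question: str) -> urllib.request.Request:
    # Build an authenticated POST to the published knowledge base endpoint.
    body = json.dumps({"message": question}).encode()
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("What does our retention policy say?")
# urllib.request.urlopen(req) would send it; omitted here because the
# endpoint above is illustrative.
```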

 

 

Interact with your RAG-based GenAI chatbots

Workload factory provides you with a chat widget that lets you interact with your RAG-based GenAI chatbot directly from the console. This is particularly useful for quickly testing how the chatbot surfaces your proprietary data from the knowledge base and iteratively refining the data sources available in the knowledge base.

 

After publishing the knowledge base, workload factory also makes it accessible via API so that you can integrate your RAG-based pipeline within your custom external GenAI applications. 

 

The workload factory AI API allows you, given your personal account token, to access AWS and FSx deployments, data sources, knowledge bases, and chats. You can review the API specifications by clicking the View Swagger button at the top right of the chat widget.


To get started with the workload factory AI API, follow our example chatbot application on GitHub. It showcases how to access your custom knowledge base programmatically and how to build a local chatbot browser experience in Next.js with TypeScript.


Using the workload factory AI API, you can seamlessly embed your RAG GenAI chatbot in your applications end-to-end with minimum overhead.

 

Conclusion

Workload factory GenAI is a quick and easy no-code way to get a knowledge base up and running with your organization’s private data on FSx for ONTAP and to create custom GenAI chatbots. You can interact with your GenAI chatbots directly within the workload factory UI, or you can use the workload factory API to embed the models within your custom GenAI applications.

 

There’s a lot more that workload factory can do for you. To learn more, visit the BlueXP workload factory for AWS homepage or get started now.

 
