RAG (LlamaIndex)
A guide on integrating Retrieval-Augmented Generation (RAG) into your realtime Voice Bot using the OpenAI Realtime API.
This example currently only works on your local machine. It doesn’t work with Outspeed Cloud.
In this guide, we will extend the basic Voice Bot by integrating Retrieval-Augmented Generation (RAG). This enhancement allows your bot to provide more informed and contextually accurate responses by retrieving relevant information from a structured knowledge base in real-time.
Integrating RAG into Your Realtime API Voice Bot
We will modify the existing Voice Bot to include RAG functionality, leveraging the llama_index library for indexing and querying documents. This involves setting up the data index, creating a search tool, and integrating it into the bot's AI service pipeline.
- The full code is available here.
Setting Up the Data Index
To enable RAG, we need to index our data sources. We use LlamaIndex (llama_index) for this purpose.
Load and Parse Documents
First, load your documents from the data directory and parse them into nodes.
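A minimal sketch of this step with llama_index (import paths vary slightly across llama_index versions, and the data directory name is just an example):

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SimpleNodeParser

# Load all files from the local "data" directory
documents = SimpleDirectoryReader("data").load_data()

# Split the documents into nodes (chunks) suitable for indexing
parser = SimpleNodeParser.from_defaults()
nodes = parser.get_nodes_from_documents(documents)
```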
Creating the Vector Store Index
Next, create a vector store index from the parsed nodes. This index enables efficient similarity-based queries.
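Continuing the sketch, the index and its query engine can be built from those nodes:

```python
from llama_index.core import VectorStoreIndex

# Build an in-memory vector index over the parsed nodes
index = VectorStoreIndex(nodes)

# Expose the index as a query engine for similarity-based retrieval
query_engine = index.as_query_engine()
```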
Defining the Search Tool
Define a custom search tool that uses the query engine to retrieve relevant information based on user queries.
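A sketch of such a tool is below. It assumes sp is the Outspeed SDK (import outspeed as sp) and that sp.Tool is configured with the name, description, and parameter/response types described under Understanding the Integration; the exact keyword arguments and the Pydantic-based Query and SearchResult models are assumptions here, so check them against the full example code.

```python
import outspeed as sp
from pydantic import BaseModel


class Query(BaseModel):
    """Input expected by the search tool."""
    query: str


class SearchResult(BaseModel):
    """Result returned by the search tool."""
    result: str


class SearchTool(sp.Tool):
    def __init__(self, query_engine):
        # Name, description, parameter types, response types, and the query
        # engine instance are set in the constructor (keyword names assumed).
        super().__init__(
            name="search",
            description="Search the indexed documents for information relevant to the user's question",
            parameters_type=Query,
            response_type=SearchResult,
        )
        self.query_engine = query_engine

    async def run(self, query: Query) -> SearchResult:
        # Ask the LlamaIndex query engine and wrap the answer in a SearchResult
        response = self.query_engine.query(query.query)
        return SearchResult(result=str(response))
```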
Integrating RAG into the VoiceBot
Incorporate the SearchTool into the OpenAIRealtime AI service pipeline within the VoiceBot setup.
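Putting the pieces together, the setup method might look roughly like this (the Outspeed-specific pieces, such as sp.App, OpenAIRealtime, and its tools argument, follow the names used in this guide but should be treated as assumptions about the exact SDK signatures):

```python
import outspeed as sp
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SimpleNodeParser


@sp.App()
class VoiceBot:
    async def setup(self) -> None:
        # Load and index the knowledge base
        documents = SimpleDirectoryReader("data").load_data()
        nodes = SimpleNodeParser.from_defaults().get_nodes_from_documents(documents)
        index = VectorStoreIndex(nodes)
        query_engine = index.as_query_engine()

        # Attach the search tool to the realtime LLM node
        self.llm_node = sp.OpenAIRealtime(tools=[SearchTool(query_engine)])
```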
Understanding the Integration
SearchTool Class
The SearchTool class extends sp.Tool and is responsible for interacting with the query engine to perform searches based on user inputs.
- Constructor (__init__ method): Initializes the tool with a name, description, parameter types, response types, and the query engine instance.
- run method: This asynchronous method takes a Query object, uses the query engine to retrieve relevant information, and returns a SearchResult object containing the search outcome.
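As a quick illustration, the tool could also be exercised directly (outside the pipeline), using the hypothetical Query and SearchResult models from the sketch above:

```python
# Hypothetical direct invocation, e.g. for testing the tool in isolation
search_tool = SearchTool(query_engine)
result = await search_tool.run(Query(query="What does the FAQ say about refunds?"))
print(result.result)
```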
VoiceBot Setup
In the setup method of the VoiceBot class, we perform the following actions:
- Load and Index Data: Use SimpleDirectoryReader to load documents from the specified data directory and parse them into nodes using SimpleNodeParser.
- Create Vector Store Index: Utilize VectorStoreIndex to create an index from the parsed nodes, enabling efficient similarity-based searches.
- Initialize AI Services: Instantiate the OpenAIRealtime node with the SearchTool integrated into its toolset. This setup allows the Voice Bot to perform real-time searches based on user queries, enhancing its response accuracy and relevance.
Streaming Endpoint (run Method)
The run method is decorated with @sp.streaming_endpoint() and is responsible for handling incoming audio and text streams. It sets up the AI service pipeline by running the llm_node with the provided input streams and returns the output streams.
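A minimal sketch of that method inside the VoiceBot class, assuming Outspeed exposes sp.AudioStream and sp.TextStream stream types (the parameter names are illustrative):

```python
    @sp.streaming_endpoint()
    async def run(self, audio_input_queue: sp.AudioStream, text_input_queue: sp.TextStream):
        # Run the realtime LLM node over the incoming audio and text streams
        audio_output_stream = self.llm_node.run(audio_input_queue, text_input_queue)
        return audio_output_stream
```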
Support
For any assistance or questions, feel free to join our Discord community. We’re excited to see what you build!