This example currently only works on your local machine. It doesn’t work with Outspeed Cloud.

In this guide, we will extend the basic Voice Bot by integrating Retrieval-Augmented Generation (RAG). This enhancement lets your bot give more informed and contextually accurate responses by retrieving relevant information from an indexed knowledge base in real time.

Integrating RAG into Your Realtime API Voice Bot

We will modify the existing Voice Bot to include RAG functionality, leveraging the llama_index library for indexing and querying documents. This involves setting up the data index, creating a search tool, and integrating it into the bot’s AI service pipeline.

  • The full code is available here.

1. Setting Up the Data Index

To enable RAG, we first need to index our data sources. We use LlamaIndex (the llama_index package) for this purpose.

Load and Parse Documents

First, load your documents from the data directory and parse them into nodes.

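import os

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SimpleNodeParser

# Directory containing this script; the documents live in its data/ subfolder
PARENT_DIR = os.path.dirname(os.path.abspath(__file__))

# Load every file in data/, then split the documents into chunks (nodes) of up to 512 tokens
documents = SimpleDirectoryReader(f"{PARENT_DIR}/data/").load_data()
node_parser = SimpleNodeParser.from_defaults(chunk_size=512)
nodes = node_parser.get_nodes_from_documents(documents=documents)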
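To make the idea of parsing documents into nodes concrete, here is a minimal sketch of fixed-size chunking in plain Python. It is not how SimpleNodeParser is implemented: the real parser counts tokens and respects sentence boundaries, while this stand-in simply counts whitespace-separated words. The function name `chunk_words` and the sample text are made up for illustration.

```python
def chunk_words(text: str, chunk_size: int = 8) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    # Step through the word list in strides of chunk_size
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


# Hypothetical sample text (15 words), chunked 6 words at a time
sample = (
    "Outspeed voice bots can answer questions from your own documents "
    "once those documents are indexed"
)
chunks = chunk_words(sample, chunk_size=6)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

Smaller chunks tend to give more precise matches at query time, while larger chunks keep more surrounding context together, so the chunk size is worth tuning for your own documents.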

2. Creating the Vector Store Index

Next, create a vector store index from the parsed nodes. This index enables efficient similarity-based queries.

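from llama_index.core import VectorStoreIndex

# Embed the nodes and build a vector index over them
vector_index = VectorStoreIndex(nodes)
# Inside VoiceBot.setup(), keep the engine on the instance; each query retrieves the 2 most similar nodes
self.query_engine = vector_index.as_query_engine(similarity_top_k=2)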
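The effect of similarity_top_k=2 can be illustrated with a small, self-contained sketch. It assumes cosine similarity as the ranking measure (a common choice for embedding search) and uses hand-written three-dimensional vectors in place of real embeddings, which come from an embedding model and have far more dimensions. The names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical and not part of llama_index.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: list[float],
    node_vecs: dict[str, list[float]],
    k: int = 2,
) -> list[tuple[str, float]]:
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in node_vecs.items()
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" for three nodes (made-up values)
toy_vectors = {
    "pricing": [1.0, 0.0, 0.0],
    "refunds": [0.8, 0.6, 0.0],
    "careers": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.1, 0.0], toy_vectors, k=2)
for name, score in results:
    print(name, round(score, 3))
```

With k=2, only the two closest nodes are passed on as context for the answer; the unrelated "careers" node is left out.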

3. Defining the Search Tool

Define a custom search tool that uses the query engine to retrieve relevant information based on user queries.

import logging

import outspeed as sp
from pydantic import BaseModel


class Query(BaseModel):
    # Text the model wants to look up in the index
    query_for_neural_search: str


class SearchResult(BaseModel):
    # Response from the query engine, as plain text
    result: str


class SearchTool(sp.Tool):
    def __init__(
        self,
        name: str,
        description: str,
        parameters_type: type[Query],
        response_type: type[SearchResult],
        query_engine,
    ):
        super().__init__(name, description, parameters_type, response_type)
        self.query_engine = query_engine

    async def run(self, query: Query) -> SearchResult:
        # Query the index and return the response as a string
        logging.info(f"Searching for: {query.query_for_neural_search}")
        response = self.query_engine.query(query.query_for_neural_search)
        logging.info(f"RAG Response: {response}")
        return SearchResult(result=str(response))
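Note that run is a coroutine but calls query_engine.query(...) synchronously, so the event loop waits while the search completes. Whether that matters in practice depends on how Outspeed schedules tool calls, which this guide does not specify. As a hedged sketch of one possible mitigation, the example below moves a blocking call to a worker thread with asyncio.to_thread (Python 3.9+). It uses only the standard library: `FakeQueryEngine` and `ThreadedSearchTool` are hypothetical stand-ins, and dataclasses replace the pydantic models so the snippet runs without extra dependencies.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Query:
    query_for_neural_search: str


@dataclass
class SearchResult:
    result: str


class FakeQueryEngine:
    """Hypothetical stand-in for a query engine with a blocking query() method."""

    def query(self, text: str) -> str:
        time.sleep(0.05)  # simulate retrieval and response latency
        return f"answer for: {text}"


class ThreadedSearchTool:
    """Hypothetical tool whose run() keeps the event loop free while searching."""

    def __init__(self, query_engine) -> None:
        self.query_engine = query_engine

    async def run(self, query: Query) -> SearchResult:
        # Run the blocking call in a worker thread instead of on the event loop
        response = await asyncio.to_thread(
            self.query_engine.query, query.query_for_neural_search
        )
        return SearchResult(result=str(response))


async def main() -> SearchResult:
    tool = ThreadedSearchTool(FakeQueryEngine())
    return await tool.run(Query(query_for_neural_search="opening hours"))


search_result = asyncio.run(main())
print(search_result.result)
```

The same pattern could be applied inside SearchTool.run by wrapping the existing query call, leaving the rest of the class unchanged.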

4. Integrating RAG into the VoiceBot

Incorporate the SearchTool into the OpenAIRealtime AI service pipeline within the VoiceBot setup.

@sp.App()
class VoiceBot:
    async def setup(self) -> None:
        # Load and index data
        documents = SimpleDirectoryReader(f"{PARENT_DIR}/data/").load_data()
        node_parser = SimpleNodeParser.from_defaults(chunk_size=512)
        nodes = node_parser.get_nodes_from_documents(documents=documents)
        vector_index = VectorStoreIndex(nodes)
        self.query_engine = vector_index.as_query_engine(similarity_top_k=2)

        # Initialize the AI services
        self.llm_node = sp.OpenAIRealtime(
            tools=[
                SearchTool(
                    name="search",
                    description="Search the indexed knowledge base for information",
                    parameters_type=Query,
                    response_type=SearchResult,
                    query_engine=self.query_engine,
                )
            ]
        )

Understanding the Integration

SearchTool Class

The SearchTool class extends sp.Tool and is responsible for interacting with the query engine to perform searches based on user inputs.

  • Constructor (__init__ Method): Initializes the tool with a name, a description, the parameter type (Query), the response type (SearchResult), and the query engine instance.

  • run Method: This asynchronous method takes a Query object, uses the query engine to retrieve relevant information, and returns a SearchResult object containing the search outcome.

VoiceBot Setup

In the setup method of the VoiceBot class, we perform the following actions:

  1. Load and Index Data: Use SimpleDirectoryReader to load documents from the specified data directory and parse them into nodes using SimpleNodeParser.

  2. Create Vector Store Index: Utilize VectorStoreIndex to create an index from the parsed nodes, enabling efficient similarity-based searches.

  3. Initialize AI Services: Instantiate the OpenAIRealtime node with the SearchTool integrated into its toolset. This setup allows the Voice Bot to perform real-time searches based on user queries, enhancing its response accuracy and relevance.

Streaming Endpoint (run Method)

The run method is decorated with @sp.streaming_endpoint() and is responsible for handling incoming audio and text streams. It sets up the AI service pipeline by running the llm_node with the provided input streams and returns the output streams.

Support

For any assistance or questions, feel free to join our Discord community. We’re excited to see what you build!