What is a Custom LLM?

A custom LLM (large language model) lets you tailor the language processing capabilities of your voice bot to better fit specific use cases or requirements.

By integrating a custom LLM, you can control the behavior, responses, and interactions of your voice bot more precisely, ensuring it aligns with your application’s goals.

Make sure you have gone over QuickStart before trying this example.

Integrating a Custom LLM in 4 Steps!

This guide will walk you through creating a custom LLM node and integrating it into a voice bot application using Outspeed.

  • The full code is available here.
1

Install Dependencies

2

Create CustomLLM Application

To create a custom LLM, let’s set up a file named voice_bot.py.

# voice_bot.py
import json
import os

import requests
import outspeed as sp

class CustomLLM(sp.CustomLLMNode):
    def __init__(self, system_prompt: str):
        super().__init__()
        self.system_prompt = system_prompt
        self.chat_history = []

        if self.system_prompt:
            self.chat_history.append({"role": "system", "content": self.system_prompt})

    async def process(self, input_text: str):
        self.chat_history.append({"role": "user", "content": input_text})

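        # Note: requests.post is a blocking call; it holds up the event loop until OpenAI responds.
        response = requests.post(
            "https://api.openai.com/v1/chat/completions",
            json={"messages": self.chat_history, "model": "gpt-4o-mini"},
            headers={
                "Authorization": f"Bearer {os.getenv('OPENAI_API_KEY')}",
                "Content-Type": "application/json",
            },
            timeout=30,  # fail instead of hanging if the API never answers
        )
        response.raise_for_status()  # surface HTTP errors (e.g. a bad API key) early

        # Parse the body once and reuse the assistant's reply.
        reply = response.json()["choices"][0]["message"]["content"]
        self.chat_history.append({"role": "assistant", "content": reply})

        return reply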

This CustomLLM class extends sp.CustomLLMNode to integrate a custom language model.

It initializes with a system prompt and manages the chat history to maintain conversational context.
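
Because process is a coroutine, a blocking HTTP call inside it pauses everything else on the event loop while it waits for the reply. The following is a minimal sketch, assuming Python 3.9 or newer, of one way to move a blocking call onto a worker thread with asyncio.to_thread while keeping the same chat-history bookkeeping. The function slow_llm_call is a hypothetical stand-in for the HTTP request and is not part of Outspeed or OpenAI.

```python
import asyncio
import time


def slow_llm_call(messages: list[dict]) -> str:
    """Hypothetical stand-in for a blocking HTTP request to an LLM API."""
    time.sleep(0.05)  # simulate network latency
    return f"echo: {messages[-1]['content']}"


async def process(chat_history: list[dict], input_text: str) -> str:
    # Record the user's turn before calling the model.
    chat_history.append({"role": "user", "content": input_text})
    # Run the blocking call in a worker thread so the event loop stays free.
    reply = await asyncio.to_thread(slow_llm_call, chat_history)
    # Record the assistant's turn so the next call sees the full context.
    chat_history.append({"role": "assistant", "content": reply})
    return reply


history = [{"role": "system", "content": "Keep answers short."}]
reply = asyncio.run(process(history, "hello"))
```

After one turn, history holds the system, user, and assistant messages in that order, which is the context sent along with the next request.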

Next, we’ll integrate this custom LLM into the VoiceBot application.

# voice_bot.py

@sp.App()
class VoiceBot:
    """
    This class handles the setup, running, and teardown of various AI services
    used to process audio input, generate responses, and convert text to speech.
    """

    async def setup(self) -> None:
        """
        This method is called when the app starts. It should be used to set up services, load models, and perform any necessary initialization.
        """
        self.deepgram_node = sp.DeepgramSTT()

        # .... update the LLM node to use our custom LLM ....
        self.llm_node = CustomLLM(
            system_prompt="You are a helpful assistant. Keep your answers very short. No special characters in responses.",
        )
        # .... update the LLM node to use our custom LLM ....

        self.token_aggregator_node = sp.TokenAggregator()
        self.tts_node = sp.CartesiaTTS(
            voice_id="95856005-0332-41b0-935f-352e296aa0df",
        )
        self.vad_node = sp.SileroVAD(min_volume=0)

Next, we’ll add the run and teardown methods to the VoiceBot class.

    @sp.streaming_endpoint()
    async def run(self, audio_input_queue: sp.AudioStream, text_input_queue: sp.TextStream):
        """
        It sets up and runs the various AI services in a pipeline to process audio input and generate audio output.
        """
        deepgram_stream: sp.TextStream = self.deepgram_node.run(audio_input_queue)

        vad_stream: sp.VADStream = self.vad_node.run(audio_input_queue.clone())

        text_input_queue = sp.map(text_input_queue, lambda x: json.loads(x).get("content"))

        llm_input_queue: sp.TextStream = sp.merge(
            [deepgram_stream, text_input_queue],
        )

        llm_token_stream: sp.TextStream = self.llm_node.run(llm_input_queue)

        token_aggregator_stream: sp.TextStream = self.token_aggregator_node.run(llm_token_stream)
        tts_stream: sp.AudioStream = self.tts_node.run(token_aggregator_stream)

        self.llm_node.set_interrupt_stream(vad_stream)
        self.token_aggregator_node.set_interrupt_stream(vad_stream.clone())
        self.tts_node.set_interrupt_stream(vad_stream.clone())

        return tts_stream

    async def teardown(self) -> None:
        """
        Clean up resources when the VoiceBot is shutting down.
        """
        await self.deepgram_node.close()
        await self.llm_node.close()
        await self.token_aggregator_node.close()
        await self.tts_node.close()


if __name__ == "__main__":
    # Start the VoiceBot when the script is run directly
    VoiceBot().start()

By integrating the CustomLLM, the VoiceBot now utilizes a tailored language model for generating responses, providing greater control over the conversational behavior.
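
In run, each item on the text input stream is passed through json.loads(x).get("content") before it is merged with the transcription stream. The sketch below shows what that mapping does to a single message using only the standard library. The "role" field in the sample message is an assumption for illustration; the code above only reads "content".

```python
import json


def extract_content(raw_message: str):
    """Mirror the lambda passed to sp.map in run: parse JSON, return its "content" field."""
    return json.loads(raw_message).get("content")


# A sample text message; only the "content" key is read.
sample = json.dumps({"role": "user", "content": "What is the weather?"})
content = extract_content(sample)

# dict.get returns None when the key is absent, so such messages yield None.
missing = extract_content(json.dumps({"role": "user"}))
```

Since a message without a "content" key maps to None rather than raising an error, a frontend that sends text should always include that field.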

3

Set Up API Keys and Run

To run this example locally, you’ll need API keys for the following services, set as environment variables:

  1. Deepgram - For transcription. Sign up and navigate to https://console.deepgram.com/ to get the API key.
  2. OpenAI - For the custom LLM. Sign up and navigate to https://platform.openai.com/account/api-keys to get the API key.
  3. Cartesia - For text-to-speech. Sign up and navigate to https://play.cartesia.ai/keys to get the API key.

All of these providers have a free tier. Once you have your keys, create a .env file in the same directory as voice_bot.py and add the following:

DEEPGRAM_API_KEY=<your_deepgram_api_key>
OPENAI_API_KEY=<your_openai_api_key>
CARTESIA_API_KEY=<your_cartesia_api_key>
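
A missing key usually only shows up as a failed request once the bot is running. As a rough sketch, the helper below reports which of the three variables are unset or empty so you can check before starting the server. The name missing_keys is hypothetical and not part of Outspeed, and the check only inspects the process environment; it does not read the .env file itself.

```python
import os
from typing import Mapping

# The three variables listed above.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY", "CARTESIA_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
example = missing_keys({"DEEPGRAM_API_KEY": "dg-123", "OPENAI_API_KEY": ""})
```

Calling missing_keys() with no arguments checks the current process environment; an empty list means all three keys are present.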

Finally, run the following command to start the server:

# in examples/voice_bot_custom_llm
python voice_bot.py

The console will output the URL you can use to connect to the server (default is http://localhost:8080).

4

Try it Out

You can use our playground to interact with the customized voice bot.

  1. Navigate to the playground and select “Voice Bot”.
  2. Paste the link you received from the previous step into the URL field.
  3. Select Audio device. Leave Video device blank. Click Run to begin.

The playground is built using our React SDK. You can use it to build your own frontends or integrate with an existing one!

Support

For any assistance or questions, feel free to join our Discord community. We’re excited to see how you enhance your voice bots with custom LLMs!