What is Outspeed?

Outspeed is a platform for real-time voice and video AI applications. The Outspeed SDK provides a simple interface for building them: it abstracts away the complexities of handling multimedia inputs, so developers can focus on creating multimodal AI experiences.

Key features of the Outspeed SDK include:

  • Easy-to-use API inspired by PyTorch
  • Built-in support for processing voice and video inputs
  • Seamless integration with various AI models and services
  • Real-time processing capabilities for responsive applications

Now, let’s get started!

Demo

In this quick start guide, we’ll demonstrate how to build a voice bot using Outspeed. This bot can engage in real-time conversations and respond based on the LLM prompt.

Below, you’ll find a brief video demonstration showcasing the capabilities of the voice bot: (PS: Adapt was our previous name 😅)

Prerequisites

Make sure to have the following installed:

  1. Python: version 3.9 or higher, but less than 3.12
  2. pip: latest version (comes with Python)
  3. Git: for cloning the repository
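To confirm that the active interpreter falls inside the supported range, a small check like the one below can be run before installing anything. This is a minimal sketch: the function name `python_version_supported` is illustrative and not part of the Outspeed SDK, and the bounds simply restate the prerequisite above.

```python
import sys

# Supported range restated from the prerequisites: >= 3.9 and < 3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def python_version_supported(version=None):
    """Return True if the (major, minor) version is inside the supported range.

    `version` defaults to the running interpreter; passing a tuple such as
    (3, 10) makes the check easy to try out.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    found = "{}.{}".format(*sys.version_info[:2])
    if python_version_supported():
        print(f"Python {found} is supported.")
    else:
        print(f"Python {found} is outside the supported range (3.9 to 3.11).")
```

If the check fails, switching to a 3.9, 3.10, or 3.11 interpreter (for example in a virtual environment) before continuing should avoid installation problems.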

Creating a Voice Bot in 4 steps!

This app will process input from your microphone, send it to an LLM, and convert the response back to voice.

This application only supports English as the source language. For other languages, switch out the models/configs used in this example.
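Conceptually, each conversational turn passes through three stages: speech-to-text, the LLM, and text-to-speech. The sketch below illustrates only that data flow. Every function in it is a hypothetical placeholder written for this illustration; none of them is an Outspeed API, and the real voice_bot.py is structured differently.

```python
# All stage functions here are hypothetical stand-ins, not Outspeed APIs.
# They exist only to show how data moves through one conversational turn.


def transcribe(audio: bytes) -> str:
    """Stand-in for speech-to-text: treats the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def generate_reply(transcript: str) -> str:
    """Stand-in for the LLM: echoes the transcript back with a prefix."""
    return f"You said: {transcript}"


def synthesize(reply: str) -> bytes:
    """Stand-in for text-to-speech: encodes the reply as UTF-8 bytes."""
    return reply.encode("utf-8")


def voice_turn(audio, stt=transcribe, llm=generate_reply, tts=synthesize) -> bytes:
    """Run one turn: audio in, transcript, LLM reply, audio out."""
    return tts(llm(stt(audio)))


if __name__ == "__main__":
    print(voice_turn(b"hello"))
```

Because each stage is passed in as a plain callable, swapping one of them (for instance, to support another source language) does not affect the other two, which mirrors the advice above about switching out models and configs.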

1

Set up environment and install dependencies.

2

Clone the repository

git clone https://github.com/outspeed-ai/outspeed.git
cd outspeed/examples/voice_bot/

3

Run Backend

Ensure you’re in the same directory as voice_bot.py!

voice_bot.py contains an @outspeed.App-annotated class that implements a low-latency conversational voice bot using the Outspeed SDK.

To run this example locally, you’ll need API keys set up as environment variables for the following services:

  1. Deepgram - For transcription. Sign up and navigate to https://console.deepgram.com/ to get the API key.
  2. Groq - For LLM. Sign up and navigate to https://console.groq.com/keys to get the API key.
  3. Cartesia - For text-to-speech. Sign up and navigate to https://play.cartesia.ai/keys to get the API key.

All of these providers have a free tier. Once you have your keys, run the following commands:

export DEEPGRAM_API_KEY=<your_deepgram_api_key>
export GROQ_API_KEY=<your_groq_api_key>
export CARTESIA_API_KEY=<your_cartesia_api_key>
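Before starting the server, it can help to confirm that all three keys are actually visible to Python in the current shell. The following is a minimal sketch: the helper name `missing_api_keys` is illustrative and not part of the Outspeed SDK, and the variable names are the ones exported above.

```python
import os

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "GROQ_API_KEY", "CARTESIA_API_KEY")


def missing_api_keys(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to os.environ; any mapping can be passed instead.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Note that `export` only affects the current shell session, so the check should be run in the same terminal that will launch the server.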

Finally, run the following command to start the server:

python voice_bot.py

The console will output the URL you can use to connect to the server (default is http://localhost:8080).
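If the playground cannot connect later on, a quick way to rule out a server that never started is to test whether anything is accepting TCP connections at that URL. The sketch below uses only the Python standard library. The helper name `server_is_reachable` is illustrative, and the check only confirms that a port is open, not that the process behind it is the voice bot.

```python
import socket
from urllib.parse import urlparse


def server_is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False  # The URL has no host component to connect to.
    # Fall back to the scheme's conventional port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    url = "http://localhost:8080"  # Default URL mentioned above.
    state = "reachable" if server_is_reachable(url) else "not reachable"
    print(f"{url} is {state}")
```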

4

Try it Out

You can use our playground to interact with the voice bot.

  1. Navigate to playground and select “WebRTC”
  2. Paste the link you received from the previous step into the URL field.
  3. Select Audio device. Leave Video device blank. Click Run to begin.

The playground is built using our React SDK. You can use it to build your own frontends, or integrate with an existing one!

Support

For any assistance or questions, feel free to join our Discord community. We’re excited to see what you build!