This API is currently in Alpha

The Outspeed Live API lets you build low-latency, multi-modal conversational experiences by connecting to open-source speech-to-speech models hosted on Outspeed’s infrastructure. These models enable realtime text and audio interaction, voice activity detection, function calling, and other capabilities.

The Outspeed Live API is compatible with the OpenAI Realtime API.
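Because the API mirrors the OpenAI Realtime API, an existing Realtime-style client can typically be pointed at Outspeed by swapping the WebSocket URL and model name. The sketch below shows the connection parameters such a client expects; the endpoint `wss://api.outspeed.com/v1/realtime`, the model identifier, and the beta header behavior are assumptions for illustration, not documented values.

```python
# Hypothetical endpoint -- check the Outspeed docs for the real value.
BASE_URL = "wss://api.outspeed.com/v1/realtime"

def build_connection(api_key: str, model: str = "MiniCPM-o-2_6"):
    """Return the WebSocket URL and headers an OpenAI-Realtime-style client uses."""
    url = f"{BASE_URL}?model={model}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # The OpenAI Realtime API expects this beta header; a compatible
        # server is assumed to accept or ignore it.
        "OpenAI-Beta": "realtime=v1",
    }
    return url, headers

url, headers = build_connection("sk-demo")
```

These values can then be passed to any WebSocket library, for example `websockets.connect(url, additional_headers=headers)`.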

Key Capabilities

  • Text and Audio Interactions: Support for text and audio inputs and outputs
  • Real-time Processing: Low-latency responses for natural conversations
  • Contextual Memory: Maintains conversation history within sessions
  • Voice Activity Detection: Automatically detects speech start/end for fluid interactions
  • Function Calling: Integrate external tools and services seamlessly (coming soon)
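Several of the capabilities above are configured per session through Realtime-style JSON events sent over the WebSocket. As an illustration, a `session.update` event enabling text-and-audio output with server-side voice activity detection could look like the following; the field names follow the OpenAI Realtime API event shape that this API mirrors, and the specific values are examples, not Outspeed defaults.

```python
import json

# session.update payload in the OpenAI Realtime event shape.
session_update = {
    "type": "session.update",
    "session": {
        "modalities": ["text", "audio"],   # text and audio interactions
        "instructions": "You are a helpful voice assistant.",
        "turn_detection": {
            "type": "server_vad",          # server detects speech start/end
            "silence_duration_ms": 500,    # example value, not a default
        },
    },
}

# Serialized and sent as a text frame over the WebSocket connection.
message = json.dumps(session_update)
```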

Supported Models

  • Fast MiniCPM-o 2.6
  • StepFun/Step-Audio-Chat (coming soon)

API Events