Voice DevTools

Test and compare different speech-to-speech models with our interactive Voice DevTools.

Voice DevTools is based on OpenAI’s realtime console.

Voice DevTools lets you test the same prompt against different speech-to-speech models. The supported models are:

  1. OpenAI Realtime API
  2. Fast MiniCPM-o hosted by Outspeed
  3. Gemini Multimodal Live (coming soon)
  4. Moshi (coming soon)

Usage

The Voice DevTools provides a simple interface to test and interact with various speech-to-speech models:

  1. Select your preferred model from the supported options
  2. Configure model settings like voice, temperature, and response length
  3. Use your microphone to speak directly to the model
  4. Receive real-time audio responses from the model
  5. View conversation transcripts and debug information in the logging panel
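
As a rough illustration of step 2, the settings chosen in the interface can be thought of as one configuration message sent to the model. The sketch below assumes the OpenAI Realtime API's `session.update` event shape (fields such as `voice`, `temperature`, and `max_response_output_tokens`); the helper name `build_session_update` and the default values are hypothetical, and Voice DevTools may assemble its configuration differently, particularly for other models.

```python
import json


def build_session_update(instructions, voice="alloy", temperature=0.8,
                         max_output_tokens=1024):
    """Build a session configuration event as a plain dict.

    Hypothetical helper: the field names are assumed to follow the
    OpenAI Realtime API's `session.update` event.
    """
    if max_output_tokens <= 0:
        raise ValueError("max_output_tokens must be positive")
    return {
        "type": "session.update",
        "session": {
            "instructions": instructions,            # the prompt under test
            "voice": voice,                          # voice used for responses
            "temperature": temperature,              # sampling temperature
            "max_response_output_tokens": max_output_tokens,  # response length cap
            "turn_detection": {"type": "server_vad"},  # automatic speech detection
        },
    }


event = build_session_update("You are a concise voice assistant.", voice="alloy")
payload = json.dumps(event)  # serialized form that would go over the connection
print(payload)
```

Keeping the settings in one dictionary makes it straightforward to reuse the same prompt and parameters when switching between models for comparison.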

Capabilities

The Voice DevTools offers several key features:

  • Multi-model Support: Test and compare different speech-to-speech models in one interface
  • Real-time Processing: Experience low-latency voice conversations with automatic speech detection
  • Voice Customization: Choose from available voices for model responses
  • Debug Tools: Access detailed logs of events and model interactions
  • Audio Controls: Configure input/output audio formats and settings
  • Function Calling: Test model capabilities with tool integration (where supported)
  • Session Management: Maintain conversation context within testing sessions
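
To make the Function Calling item more concrete, here is a minimal sketch of a tool definition and a handler for a function call requested by the model. It assumes the event and item shapes of the OpenAI Realtime API (`response.function_call_arguments.done` and a `function_call_output` conversation item); the `get_weather` tool, its canned result, and the sample event are invented for illustration and are not part of Voice DevTools.

```python
import json

# Hypothetical tool definition using a JSON Schema for its parameters.
WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Return a short forecast for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city):
    # Canned result; a real tool would query a weather service.
    return {"city": city, "forecast": "sunny"}


HANDLERS = {"get_weather": get_weather}


def handle_function_call(event):
    """Run the requested tool and wrap its result as a reply item."""
    args = json.loads(event["arguments"])      # arguments arrive as a JSON string
    result = HANDLERS[event["name"]](**args)   # dispatch by tool name
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": event["call_id"],       # ties the output to the call
            "output": json.dumps(result),
        },
    }


# Sample event of the kind that might appear in the logging panel (assumed shape).
sample_event = {
    "type": "response.function_call_arguments.done",
    "call_id": "call_123",
    "name": "get_weather",
    "arguments": '{"city": "Paris"}',
}
reply = handle_function_call(sample_event)
print(json.dumps(reply, indent=2))
```

Models that do not support tool use would simply never emit such a call, which is why the feature is listed as available only where supported.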