Gemini Multimodal Live is Google’s model for realtime multimodal interactions. It supports:
  • Realtime video understanding
  • Live speech conversation
  • Multimodal streaming
  • Voice customization
This model will soon be available in Voice DevTools. For more information, visit Google’s Gemini documentation.