Gemini Multimodal Live is Google’s model for real-time multimodal interactions. It supports:

  • Real-time video understanding
  • Live speech conversation
  • Multimodal streaming
  • Voice customization

This model will soon be available in Voice DevTools.

For more information, visit Google’s Gemini documentation.