Send audio and video streams
This document describes how to send audio and video streams to the Live API for real-time, bidirectional communication with Gemini models. Learn how to configure and transmit audio and video data to build dynamic and interactive applications.
Send audio streams
Implementing real-time audio requires strict adherence to sample rate specifications and careful buffer management to ensure low latency and natural interruptibility.
The Live API supports the following audio formats:
- Input audio: Raw 16-bit PCM audio at 16 kHz, little-endian
- Output audio: Raw 16-bit PCM audio at 24 kHz, little-endian
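As a rough illustration of the input format, the sketch below converts floating-point samples to raw 16-bit little-endian PCM and splits the result into fixed-duration chunks. The helper names `float_to_pcm16` and `chunk_pcm`, the 100 ms chunk size, and the single-channel (mono) layout are assumptions made for this example; they are not part of the Live API.

```python
import struct
from typing import Iterator, Sequence

INPUT_SAMPLE_RATE = 16_000  # Hz, the input rate listed above
BYTES_PER_SAMPLE = 2        # 16-bit PCM


def float_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    # Clip out-of-range values, then scale to the signed 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_pcm(data: bytes, chunk_ms: int = 100,
              sample_rate: int = INPUT_SAMPLE_RATE) -> Iterator[bytes]:
    """Yield consecutive chunks of about chunk_ms milliseconds (mono assumed)."""
    chunk_bytes = sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(data), chunk_bytes):
        yield data[start:start + chunk_bytes]
```

Each yielded chunk could then be wrapped in a blob with the `audio/pcm;rate=16000` MIME type. Smaller chunks generally lower latency at the cost of more messages.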
The following code sample shows you how to send streaming audio data:
    import asyncio

    # Assumes session is an active Live API session
    # and chunk_data contains bytes of raw 16-bit PCM audio at 16 kHz.
    from google.genai import types

    # Send audio input data in chunks
    await session.send_realtime_input(
        audio=types.Blob(data=chunk_data, mime_type="audio/pcm;rate=16000")
    )

The client must maintain a playback buffer. The server streams audio in chunks within server_content messages. The client's responsibility is to decode, buffer, and play the data.
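One possible way to structure the playback buffer is a queue drained by a background worker task. The sketch below is a minimal illustration under stated assumptions: `playback_worker`, `play`, and `reset` are hypothetical names, and using `None` as a marker that asks the worker to discard pending audio is a convention chosen for this example.

```python
import asyncio
from typing import Callable, Optional


async def playback_worker(
    audio_queue: "asyncio.Queue[Optional[bytes]]",
    play: Callable[[bytes], None],
    reset: Callable[[], None],
) -> None:
    """Drain audio_queue, playing chunks and resetting on a None marker."""
    while True:
        chunk = await audio_queue.get()
        if chunk is None:
            # Assumed convention: None asks the worker to drop any audio
            # already handed to the output device.
            reset()
        else:
            play(chunk)
        audio_queue.task_done()
```

The worker would typically be started with `asyncio.create_task` and cancelled when the session closes. If `play` performs a blocking write to an audio device, moving that call off the event loop (for example with `asyncio.to_thread`) keeps message handling responsive.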
The following code sample shows you how to process streaming audio data:
    import asyncio

    # Assumes session is an active Live API session
    # and audio_queue is an asyncio.Queue for buffering audio for playback.
    import numpy as np

    async for msg in session.receive():
        server_content = msg.server_content
        if server_content:
            # 1. Handle Interruption
            if server_content.interrupted:
                print("\n[Interrupted] Flushing buffer...")
                # Clear the Python queue
                while not audio_queue.empty():
                    try:
                        audio_queue.get_nowait()
                    except asyncio.QueueEmpty:
                        break
                # Send signal to worker to reset hardware buffers if needed
                await audio_queue.put(None)
                continue

            # 2. Process Audio chunks
            if server_content.model_turn:
                for part in server_content.model_turn.parts:
                    if part.inline_data:
                        # Add PCM data to playback queue
                        await audio_queue.put(
                            np.frombuffer(part.inline_data.data, dtype='int16')
                        )

Send video streams
Video streaming provides visual context. The Live API expects a sequence of discrete image frames and supports video frames input at 1 FPS. For best results, use native 768x768 resolution at 1 FPS.
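The 768x768 target can be reached in more than one way. As an illustrative sketch, the hypothetical helper `fit_within` below computes dimensions that fit a frame inside a 768x768 box while preserving its aspect ratio, which avoids stretching non-square frames. Whether padded, cropped, or stretched frames work best is not stated here, so treat this as one option under that assumption.

```python
from typing import Tuple

MAX_SIDE = 768  # target resolution mentioned above


def fit_within(width: int, height: int,
               max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return (width, height) scaled down to fit inside a max_side square.

    Frames that already fit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The returned tuple is in (width, height) order, which matches the size argument of `cv2.resize`, so it could be passed as `cv2.resize(frame, fit_within(w, h))`.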
The following code sample shows you how to send streaming video data:
    import asyncio

    # Assumes session is an active Live API session
    # and chunk_data contains bytes of a JPEG image.
    from google.genai import types

    # Send video input data in chunks
    await session.send_realtime_input(
        media=types.Blob(data=chunk_data, mime_type="image/jpeg")
    )

The client implementation captures a frame from the video feed, encodes it as a JPEG blob, and transmits it using the realtime_input message structure.
    import cv2
    import asyncio
    from google.genai import types

    async def send_video_stream(session):
        # Open webcam
        cap = cv2.VideoCapture(0)

        while True:
            ret, frame = cap.read()
            if not ret:
                break

            # 1. Resize to optimal resolution (768x768 max)
            frame = cv2.resize(frame, (768, 768))

            # 2. Encode as JPEG
            _, buffer = cv2.imencode('.jpg', frame)

            # 3. Send as realtime input
            await session.send_realtime_input(
                media=types.Blob(data=buffer.tobytes(), mime_type="image/jpeg")
            )

            # 4. Wait 1 second (1 FPS)
            await asyncio.sleep(1.0)

        cap.release()

Configure media resolution
You can specify the resolution for input media by setting the media_resolution field in the session configuration. Lower resolution reduces token usage and latency, while higher resolution improves detail recognition. Supported values include low, medium, and high.
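Because the field takes one of a small set of values, a client can check it before opening a session. The sketch below uses a hypothetical helper, `build_config`, that is not part of the SDK; it only assembles a plain dictionary in the shape used in this document and rejects values other than the three named above.

```python
from typing import Any, Dict

# Values named in the text above. The text says "include", so the API
# may accept others; this tuple is an assumption for the example.
KNOWN_RESOLUTIONS = ("low", "medium", "high")


def build_config(media_resolution: str) -> Dict[str, Any]:
    """Return a session config dict with a checked media_resolution value."""
    if media_resolution not in KNOWN_RESOLUTIONS:
        raise ValueError(
            f"media_resolution must be one of {KNOWN_RESOLUTIONS}, "
            f"got {media_resolution!r}"
        )
    return {
        "response_modalities": ["audio"],
        "media_resolution": media_resolution,
    }
```

Failing early on a misspelled value is usually easier to debug than a rejected session configuration.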
    config = {
        "response_modalities": ["audio"],
        "media_resolution": "low",
    }
Last updated 2025-12-15 UTC.