
Realtime Configuration

Run Configuration

RealtimeRunConfig

Bases: TypedDict

Configuration for running a realtime agent session.

Source code in src/agents/realtime/config.py

class RealtimeRunConfig(TypedDict):
    """Configuration for running a realtime agent session."""

    model_settings: NotRequired[RealtimeSessionModelSettings]
    """Settings for the realtime model session."""

    output_guardrails: NotRequired[list[OutputGuardrail[Any]]]
    """List of output guardrails to run on the agent's responses."""

    guardrails_settings: NotRequired[RealtimeGuardrailsSettings]
    """Settings for guardrail execution."""

    tracing_disabled: NotRequired[bool]
    """Whether tracing is disabled for this run."""

    async_tool_calls: NotRequired[bool]
    """Whether function tool calls should run asynchronously. Defaults to True."""

model_settings instance-attribute

model_settings: NotRequired[RealtimeSessionModelSettings]

Settings for the realtime model session.

output_guardrails instance-attribute

output_guardrails: NotRequired[list[OutputGuardrail[Any]]]

List of output guardrails to run on the agent's responses.

guardrails_settings instance-attribute

guardrails_settings: NotRequired[RealtimeGuardrailsSettings]

Settings for guardrail execution.

tracing_disabled instance-attribute

tracing_disabled: NotRequired[bool]

Whether tracing is disabled for this run.

async_tool_calls instance-attribute

async_tool_calls: NotRequired[bool]

Whether function tool calls should run asynchronously. Defaults to True.
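
Because RealtimeRunConfig is a TypedDict with only NotRequired keys, it can be built as a plain dict containing just the fields you want to set. A minimal sketch (the values are illustrative, not documented defaults):

from agents.realtime.config import RealtimeRunConfig

run_config: RealtimeRunConfig = {
    "guardrails_settings": {"debounce_text_length": 200},
    "tracing_disabled": False,
    "async_tool_calls": True,  # asynchronous function tool calls are the documented default
}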

Model Settings

RealtimeSessionModelSettings

Bases: TypedDict

Model settings for a realtime model session.

Source code in src/agents/realtime/config.py

class RealtimeSessionModelSettings(TypedDict):
    """Model settings for a realtime model session."""

    model_name: NotRequired[RealtimeModelName]
    """The name of the realtime model to use."""

    instructions: NotRequired[str]
    """System instructions for the model."""

    prompt: NotRequired[Prompt]
    """The prompt to use for the model."""

    modalities: NotRequired[list[Literal["text", "audio"]]]
    """The modalities the model should support."""

    voice: NotRequired[str]
    """The voice to use for audio output."""

    speed: NotRequired[float]
    """The speed of the model's responses."""

    input_audio_format: NotRequired[RealtimeAudioFormat | OpenAIRealtimeAudioFormats]
    """The format for input audio streams."""

    output_audio_format: NotRequired[RealtimeAudioFormat | OpenAIRealtimeAudioFormats]
    """The format for output audio streams."""

    input_audio_transcription: NotRequired[RealtimeInputAudioTranscriptionConfig]
    """Configuration for transcribing input audio."""

    input_audio_noise_reduction: NotRequired[RealtimeInputAudioNoiseReductionConfig | None]
    """Noise reduction configuration for input audio."""

    turn_detection: NotRequired[RealtimeTurnDetectionConfig]
    """Configuration for detecting conversation turns."""

    tool_choice: NotRequired[ToolChoice]
    """How the model should choose which tools to call."""

    tools: NotRequired[list[Tool]]
    """List of tools available to the model."""

    handoffs: NotRequired[list[Handoff]]
    """List of handoff configurations."""

    tracing: NotRequired[RealtimeModelTracingConfig | None]
    """Configuration for request tracing."""

model_name instance-attribute

model_name: NotRequired[RealtimeModelName]

The name of the realtime model to use.

instructions instance-attribute

instructions: NotRequired[str]

System instructions for the model.

prompt instance-attribute

prompt: NotRequired[Prompt]

The prompt to use for the model.

modalities instance-attribute

modalities: NotRequired[list[Literal['text', 'audio']]]

The modalities the model should support.

voice instance-attribute

voice: NotRequired[str]

The voice to use for audio output.

speed instance-attribute

speed: NotRequired[float]

The speed of the model's responses.

input_audio_format instance-attribute

input_audio_format: NotRequired[RealtimeAudioFormat | RealtimeAudioFormats]

The format for input audio streams.

output_audio_format instance-attribute

output_audio_format: NotRequired[RealtimeAudioFormat | RealtimeAudioFormats]

The format for output audio streams.

input_audio_transcription instance-attribute

input_audio_transcription: NotRequired[RealtimeInputAudioTranscriptionConfig]

Configuration for transcribing input audio.

input_audio_noise_reduction instance-attribute

input_audio_noise_reduction: NotRequired[RealtimeInputAudioNoiseReductionConfig | None]

Noise reduction configuration for input audio.

turn_detection instance-attribute

turn_detection: NotRequired[RealtimeTurnDetectionConfig]

Configuration for detecting conversation turns.

tool_choice instance-attribute

tool_choice: NotRequired[ToolChoice]

How the model should choose which tools to call.

tools instance-attribute

tools: NotRequired[list[Tool]]

List of tools available to the model.

handoffs instance-attribute

handoffs: NotRequired[list[Handoff]]

List of handoff configurations.

tracing instance-attribute

tracing: NotRequired[RealtimeModelTracingConfig | None]

Configuration for request tracing.
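
A sketch of a partial settings dict; every field is NotRequired, so only overrides need to be supplied. The instructions text and voice name below are placeholders, not values defined on this page:

from agents.realtime.config import RealtimeSessionModelSettings

model_settings: RealtimeSessionModelSettings = {
    "instructions": "You are a concise voice assistant.",
    "modalities": ["text", "audio"],
    "voice": "alloy",  # placeholder voice name
    "speed": 1.0,
    "input_audio_transcription": {"model": "gpt-4o-mini-transcribe"},
    "turn_detection": {"type": "semantic_vad", "interrupt_response": True},
}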

Audio Configuration

RealtimeInputAudioTranscriptionConfig

Bases: TypedDict

Configuration for audio transcription in realtime sessions.

Source code in src/agents/realtime/config.py

class RealtimeInputAudioTranscriptionConfig(TypedDict):
    """Configuration for audio transcription in realtime sessions."""

    language: NotRequired[str]
    """The language code for transcription."""

    model: NotRequired[Literal["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"] | str]
    """The transcription model to use."""

    prompt: NotRequired[str]
    """An optional prompt to guide transcription."""

language instance-attribute

language: NotRequired[str]

The language code for transcription.

model instance-attribute

model: NotRequired[Literal["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1"] | str]

The transcription model to use.

prompt instance-attribute

prompt: NotRequired[str]

An optional prompt to guide transcription.
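
For example, a config that pins the transcription model and language might look like this (the language code and prompt text are illustrative):

from agents.realtime.config import RealtimeInputAudioTranscriptionConfig

transcription: RealtimeInputAudioTranscriptionConfig = {
    "language": "en",  # illustrative language code
    "model": "gpt-4o-transcribe",
    "prompt": "Expect vocabulary about realtime agents and audio formats.",
}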

RealtimeInputAudioNoiseReductionConfig

Bases: TypedDict

Noise reduction configuration for input audio.

Source code in src/agents/realtime/config.py

class RealtimeInputAudioNoiseReductionConfig(TypedDict):
    """Noise reduction configuration for input audio."""

    type: NotRequired[Literal["near_field", "far_field"]]
    """Noise reduction mode to apply to input audio."""

type instance-attribute

type: NotRequired[Literal['near_field', 'far_field']]

Noise reduction mode to apply to input audio.
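
A minimal sketch selecting one of the two documented modes:

from agents.realtime.config import RealtimeInputAudioNoiseReductionConfig

noise_reduction: RealtimeInputAudioNoiseReductionConfig = {"type": "near_field"}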

RealtimeTurnDetectionConfig

Bases: TypedDict

Turn detection config. Allows extra vendor keys if needed.

Source code in src/agents/realtime/config.py

class RealtimeTurnDetectionConfig(TypedDict):
    """Turn detection config. Allows extra vendor keys if needed."""

    type: NotRequired[Literal["semantic_vad", "server_vad"]]
    """The type of voice activity detection to use."""

    create_response: NotRequired[bool]
    """Whether to create a response when a turn is detected."""

    eagerness: NotRequired[Literal["auto", "low", "medium", "high"]]
    """How eagerly to detect turn boundaries."""

    interrupt_response: NotRequired[bool]
    """Whether to allow interrupting the assistant's response."""

    prefix_padding_ms: NotRequired[int]
    """Padding time in milliseconds before turn detection."""

    silence_duration_ms: NotRequired[int]
    """Duration of silence in milliseconds to trigger turn detection."""

    threshold: NotRequired[float]
    """The threshold for voice activity detection."""

    idle_timeout_ms: NotRequired[int]
    """Threshold for server-vad to trigger a response if the user is idle for this duration."""

type instance-attribute

type: NotRequired[Literal['semantic_vad', 'server_vad']]

The type of voice activity detection to use.

create_response instance-attribute

create_response: NotRequired[bool]

Whether to create a response when a turn is detected.

eagerness instance-attribute

eagerness: NotRequired[Literal["auto", "low", "medium", "high"]]

How eagerly to detect turn boundaries.

interrupt_response instance-attribute

interrupt_response: NotRequired[bool]

Whether to allow interrupting the assistant's response.

prefix_padding_ms instance-attribute

prefix_padding_ms: NotRequired[int]

Padding time in milliseconds before turn detection.

silence_duration_ms instance-attribute

silence_duration_ms: NotRequired[int]

Duration of silence in milliseconds to trigger turn detection.

threshold instance-attribute

threshold: NotRequired[float]

The threshold for voice activity detection.

idle_timeout_ms instance-attribute

idle_timeout_ms: NotRequired[int]

Threshold for server-vad to trigger a response if the user is idle for this duration.
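
A sketch of a server-VAD configuration; the numeric values are illustrative, not defaults documented here:

from agents.realtime.config import RealtimeTurnDetectionConfig

turn_detection: RealtimeTurnDetectionConfig = {
    "type": "server_vad",
    "threshold": 0.5,            # illustrative VAD sensitivity
    "prefix_padding_ms": 300,    # illustrative padding before detected speech
    "silence_duration_ms": 500,  # illustrative silence needed to end a turn
    "create_response": True,
    "interrupt_response": True,
}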

Guardrails Settings

RealtimeGuardrailsSettings

Bases: TypedDict

Settings for output guardrails in realtime sessions.

Source code in src/agents/realtime/config.py

class RealtimeGuardrailsSettings(TypedDict):
    """Settings for output guardrails in realtime sessions."""

    debounce_text_length: NotRequired[int]
    """
    The minimum number of characters to accumulate before running guardrails on transcript
    deltas. Defaults to 100. Guardrails run every time the accumulated text reaches
    1x, 2x, 3x, etc. times this threshold.
    """

debounce_text_length instance-attribute

debounce_text_length: NotRequired[int]

The minimum number of characters to accumulate before running guardrails on transcript deltas. Defaults to 100. Guardrails run every time the accumulated text reaches 1x, 2x, 3x, etc. times this threshold.
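
With the documented default of 100, guardrails run once the accumulated transcript reaches 100 characters, again at 200, then 300, and so on. A sketch that simply makes the default explicit:

from agents.realtime.config import RealtimeGuardrailsSettings

guardrails_settings: RealtimeGuardrailsSettings = {
    "debounce_text_length": 100,  # guardrails run at 100, 200, 300, ... accumulated characters
}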

Model Configuration

RealtimeModelConfig

Bases: TypedDict

Options for connecting to a realtime model.

Source code in src/agents/realtime/model.py

class RealtimeModelConfig(TypedDict):
    """Options for connecting to a realtime model."""

    api_key: NotRequired[str | Callable[[], MaybeAwaitable[str]]]
    """The API key (or function that returns a key) to use when connecting. If unset, the model will
    try to use a sane default. For example, the OpenAI Realtime model will try to use the
    `OPENAI_API_KEY` environment variable.
    """

    url: NotRequired[str]
    """The URL to use when connecting. If unset, the model will use a sane default. For example,
    the OpenAI Realtime model will use the default OpenAI WebSocket URL.
    """

    headers: NotRequired[dict[str, str]]
    """The headers to use when connecting. If unset, the model will use a sane default.
    Note that, when you set this, the authorization header won't be set under the hood.
    e.g., {"api-key": "your api key here"} for Azure OpenAI Realtime WebSocket connections.
    """

    initial_model_settings: NotRequired[RealtimeSessionModelSettings]
    """The initial model settings to use when connecting."""

    playback_tracker: NotRequired[RealtimePlaybackTracker]
    """The playback tracker to use when tracking audio playback progress. If not set, the model will
    use a default implementation that assumes audio is played immediately, at realtime speed.

    A playback tracker is useful for interruptions. The model generates audio much faster than
    realtime playback speed. So if there's an interruption, it's useful for the model to know how
    much of the audio has been played by the user. In low-latency scenarios, it's fine to assume
    that audio is played back immediately at realtime speed. But in scenarios like phone calls or
    other remote interactions, you can set a playback tracker that lets the model know when audio
    is played to the user.
    """

    call_id: NotRequired[str]
    """Attach to an existing realtime call instead of creating a new session.

    When provided, the transport connects using the `call_id` query string parameter rather than a
    model name. This is used for SIP-originated calls that are accepted via the Realtime Calls API.
    """

api_key instance-attribute

api_key: NotRequired[str | Callable[[], MaybeAwaitable[str]]]

The API key (or function that returns a key) to use when connecting. If unset, the model will try to use a sane default. For example, the OpenAI Realtime model will try to use the OPENAI_API_KEY environment variable.

url instance-attribute

url: NotRequired[str]

The URL to use when connecting. If unset, the model will use a sane default. For example, the OpenAI Realtime model will use the default OpenAI WebSocket URL.

headers instance-attribute

headers: NotRequired[dict[str, str]]

The headers to use when connecting. If unset, the model will use a sane default. Note that when you set this, the authorization header won't be set under the hood, e.g. {"api-key": "your api key here"} for Azure OpenAI Realtime WebSocket connections.

initial_model_settings instance-attribute

initial_model_settings: NotRequired[RealtimeSessionModelSettings]

The initial model settings to use when connecting.

playback_tracker instance-attribute

playback_tracker: NotRequired[RealtimePlaybackTracker]

The playback tracker to use when tracking audio playback progress. If not set, the model will use a default implementation that assumes audio is played immediately, at realtime speed.

A playback tracker is useful for interruptions. The model generates audio much faster than realtime playback speed, so if there's an interruption, it's useful for the model to know how much of the audio has been played by the user. In low-latency scenarios, it's fine to assume that audio is played back immediately at realtime speed. But in scenarios like phone calls or other remote interactions, you can set a playback tracker that lets the model know when audio is played to the user.

call_id instance-attribute

call_id: NotRequired[str]

Attach to an existing realtime call instead of creating a new session.

When provided, the transport connects using the call_id query string parameter rather than a model name. This is used for SIP-originated calls that are accepted via the Realtime Calls API.
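
A sketch that supplies the API key via a callable and seeds the initial model settings (the environment-variable lookup and voice name are illustrative; when api_key is unset, the OPENAI_API_KEY fallback described above already applies):

import os

from agents.realtime.model import RealtimeModelConfig


def fetch_api_key() -> str:
    # Illustrative: any callable returning a str (or an awaitable str) is accepted.
    return os.environ["OPENAI_API_KEY"]


model_config: RealtimeModelConfig = {
    "api_key": fetch_api_key,
    "initial_model_settings": {"voice": "alloy"},  # placeholder voice name
}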

Tracing Configuration

RealtimeModelTracingConfig

Bases: TypedDict

Configuration for tracing in realtime model sessions.

Source code in src/agents/realtime/config.py

class RealtimeModelTracingConfig(TypedDict):
    """Configuration for tracing in realtime model sessions."""

    workflow_name: NotRequired[str]
    """The workflow name to use for tracing."""

    group_id: NotRequired[str]
    """A group identifier to use for tracing, to link multiple traces together."""

    metadata: NotRequired[dict[str, Any]]
    """Additional metadata to include with the trace."""

workflow_name instance-attribute

workflow_name: NotRequired[str]

The workflow name to use for tracing.

group_id instance-attribute

group_id: NotRequired[str]

A group identifier to use for tracing, to link multiple traces together.

metadata instance-attribute

metadata: NotRequired[dict[str, Any]]

Additional metadata to include with the trace.
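
For example, traces from related sessions can be linked by reusing a group_id (all names below are illustrative):

from agents.realtime.config import RealtimeModelTracingConfig

tracing: RealtimeModelTracingConfig = {
    "workflow_name": "support-voice-agent",  # illustrative workflow name
    "group_id": "conversation-1234",         # illustrative group identifier
    "metadata": {"environment": "staging"},  # illustrative metadata
}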

User Input Types

User input that can be a string or structured message.

RealtimeUserInputText

Bases: TypedDict

A text input from the user.

Source code in src/agents/realtime/config.py

class RealtimeUserInputText(TypedDict):
    """A text input from the user."""

    type: Literal["input_text"]
    """The type identifier for text input."""

    text: str
    """The text content from the user."""

type instance-attribute

type: Literal['input_text']

The type identifier for text input.

text instance-attribute

text: str

The text content from the user.

RealtimeUserInputMessage

Bases: TypedDict

A message input from the user.

Source code in src/agents/realtime/config.py

class RealtimeUserInputMessage(TypedDict):
    """A message input from the user."""

    type: Literal["message"]
    """The type identifier for message inputs."""

    role: Literal["user"]
    """The role identifier for user messages."""

    content: list[RealtimeUserInputText | RealtimeUserInputImage]
    """List of content items (text and image) in the message."""

type instance-attribute

type: Literal['message']

The type identifier for message inputs.

role instance-attribute

role: Literal['user']

The role identifier for user messages.

content instance-attribute

content: list[RealtimeUserInputText | RealtimeUserInputImage]

List of content items (text and image) in the message.
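
A sketch of a structured user message wrapping a single text item; unlike the configs above, the type, role, and content keys are required here:

from agents.realtime.config import RealtimeUserInputMessage, RealtimeUserInputText

text_item: RealtimeUserInputText = {
    "type": "input_text",
    "text": "What is the weather like in Tokyo right now?",
}

message: RealtimeUserInputMessage = {
    "type": "message",
    "role": "user",
    "content": [text_item],
}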

Client Messages

RealtimeClientMessage

Bases: TypedDict

A raw message to be sent to the model.

Source code in src/agents/realtime/config.py

class RealtimeClientMessage(TypedDict):
    """A raw message to be sent to the model."""

    type: str  # explicitly required
    """The type of the message."""

    other_data: NotRequired[dict[str, Any]]
    """Merged into the message body."""

type instance-attribute

type: str

The type of the message.

other_data instance-attribute

other_data: NotRequired[dict[str, Any]]

Merged into the message body.
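
A sketch of a raw client message; other_data is merged into the message body when sent. The event type and payload below are illustrative, not defined on this page:

from agents.realtime.config import RealtimeClientMessage

raw_message: RealtimeClientMessage = {
    "type": "session.update",  # illustrative raw event type
    "other_data": {"session": {"voice": "alloy"}},  # merged into the message body; illustrative payload
}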

Type Aliases

RealtimeModelName

The name of a realtime model.

RealtimeAudioFormat

The audio format for realtime audio streams.

