Model
TTSVoice module-attribute
Exportable type for the TTSModelSettings voice enum
TTSModelSettings dataclass
Settings for a TTS model.
Source code in src/agents/voice/model.py
voice class-attribute instance-attribute
voice: TTSVoice | None = None
The voice to use for the TTS model. If not provided, the default voice for the respective model will be used.
buffer_size class-attribute instance-attribute
The minimum size of the audio chunks that are streamed out.
dtype class-attribute instance-attribute
The data type for the audio data to be returned in.
transform_data class-attribute instance-attribute
A function to transform the data from the TTS model. This is useful if you want the resulting audio stream to already have the data in a specific shape.
instructions class-attribute instance-attribute
instructions: str = "You will receive partial sentences. Do not complete the sentence just read out the text."
The instructions to use for the TTS model. This is useful if you want to control the tone of the audio output.
text_splitter class-attribute instance-attribute
text_splitter: Callable[[str], tuple[str, str]] = get_sentence_based_splitter()
A function to split the text into chunks. This is useful if you want to split the text into chunks before sending it to the TTS model rather than waiting for the whole text to be processed.
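As an illustration of the text_splitter shape, here is a minimal sketch of a custom splitter. It is not the library's get_sentence_based_splitter. It assumes the returned tuple means (text ready to send to the TTS model, text still buffered), which this section does not spell out; make_last_sentence_splitter and min_chars are hypothetical names introduced for the example.

```python
import re
from typing import Callable

# Matches the end of a sentence: ".", "!" or "?" followed by whitespace.
_SENTENCE_END = re.compile(r"[.!?]\s+")


def make_last_sentence_splitter(min_chars: int = 20) -> Callable[[str], tuple[str, str]]:
    """Return a splitter with the Callable[[str], tuple[str, str]] shape.

    Assumption: the tuple is (text ready for the TTS model, text still buffered).
    """

    def split(buffer: str) -> tuple[str, str]:
        # Find the last sentence boundary in the buffered text.
        matches = list(_SENTENCE_END.finditer(buffer))
        if not matches:
            return "", buffer
        cut = matches[-1].end()
        ready, rest = buffer[:cut].strip(), buffer[cut:]
        # Hold back very short fragments so the model is not called on a few characters.
        if len(ready) < min_chars:
            return "", buffer
        return ready, rest

    return split


splitter = make_last_sentence_splitter(min_chars=5)
ready, rest = splitter("Hello there. How are")
```

If the assumption about the tuple holds, a function like this could be passed as text_splitter when constructing TTSModelSettings.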
TTSModel
Bases: ABC
A text-to-speech model that can convert text into audio output.
run abstractmethod
run(text: str, settings: TTSModelSettings) -> AsyncIterator[bytes]
Given a text string, produces a stream of audio bytes, in PCM format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The text to convert to audio. | required |
| settings | TTSModelSettings | The settings to use for the TTS model. | required |
Returns:
| Type | Description |
|---|---|
| AsyncIterator[bytes] | An async iterator of audio bytes, in PCM format. |
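The run contract can be illustrated with a toy subclass. The sketch below defines small stand-ins for TTSModel and TTSModelSettings so it runs without the library installed; the stand-ins mirror only what is described above. SilenceTTSModel and collect are hypothetical names, and treating buffer_size as a chunk size in bytes is an assumption made for the example.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from dataclasses import dataclass


# Stand-ins that mirror only the interface described above.
@dataclass
class TTSModelSettings:
    buffer_size: int = 120  # hypothetical default, chosen for this sketch


class TTSModel(ABC):
    @abstractmethod
    def run(self, text: str, settings: TTSModelSettings) -> AsyncIterator[bytes]:
        ...


class SilenceTTSModel(TTSModel):
    """Toy model: emits 16-bit PCM silence, one sample per input character."""

    async def run(self, text: str, settings: TTSModelSettings) -> AsyncIterator[bytes]:
        pcm = b"\x00\x00" * len(text)  # two zero bytes per 16-bit sample
        step = settings.buffer_size  # assumption: interpreted here as bytes per chunk
        for start in range(0, len(pcm), step):
            yield pcm[start : start + step]
            await asyncio.sleep(0)  # hand control back to the event loop


async def collect(text: str) -> list[bytes]:
    model = SilenceTTSModel()
    return [chunk async for chunk in model.run(text, TTSModelSettings(buffer_size=8))]


chunks = asyncio.run(collect("hello"))
```

Writing run as an async generator is one way to satisfy the AsyncIterator[bytes] return type: callers consume it with async for and receive audio as soon as each chunk is produced.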
StreamedTranscriptionSession
Bases: ABC
A streamed transcription of audio input.
transcribe_turns abstractmethod
Yields a stream of text transcriptions. Each transcription is a turn in the conversation.
This method is expected to return only after close() is called.
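The relationship between transcribe_turns and close() can be sketched with a queue-backed toy session. The base class below is a stand-in so the example runs on its own; that close() is an async method taking no arguments is an assumption, and QueueTranscriptionSession, push_turn and demo are hypothetical names.

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator


class StreamedTranscriptionSession(ABC):
    # Stand-in mirroring the interface described above.
    @abstractmethod
    def transcribe_turns(self) -> AsyncIterator[str]:
        ...

    @abstractmethod
    async def close(self) -> None:  # assumption: async, no arguments
        ...


class QueueTranscriptionSession(StreamedTranscriptionSession):
    """Toy session: turns are pushed in by hand instead of coming from a recognizer."""

    _CLOSED = object()  # sentinel that ends the stream

    def __init__(self) -> None:
        self._turns: asyncio.Queue = asyncio.Queue()

    def push_turn(self, text: str) -> None:  # hypothetical helper
        self._turns.put_nowait(text)

    async def transcribe_turns(self) -> AsyncIterator[str]:
        while True:
            item = await self._turns.get()
            if item is self._CLOSED:
                return  # only reached once close() has been called
            yield item

    async def close(self) -> None:
        self._turns.put_nowait(self._CLOSED)


async def demo() -> list[str]:
    session = QueueTranscriptionSession()
    session.push_turn("hello")
    session.push_turn("how are you")
    await session.close()
    return [turn async for turn in session.transcribe_turns()]


turns = asyncio.run(demo())
```

Without the call to close(), the async for loop in demo would wait indefinitely for the next turn, which matches the expectation that the method returns only after the session is closed.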
STTModelSettings dataclass
Settings for a speech-to-text model.
temperature class-attribute instance-attribute
The temperature of the model.
STTModel
Bases: ABC
A speech-to-text model that can convert audio input into text.
transcribe abstractmethod async
transcribe(input: AudioInput, settings: STTModelSettings, trace_include_sensitive_data: bool, trace_include_sensitive_audio_data: bool) -> str
Given an audio input, produces a text transcription.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input | AudioInput | The audio input to transcribe. | required |
| settings | STTModelSettings | The settings to use for the transcription. | required |
| trace_include_sensitive_data | bool | Whether to include sensitive data in traces. | required |
| trace_include_sensitive_audio_data | bool | Whether to include sensitive audio data in traces. | required |
Returns:
| Type | Description |
|---|---|
| str | The text transcription of the audio input. |
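To show how the two trace flags might be honored, here is a minimal sketch of a class whose transcribe method has the same parameters as above. AudioInput and STTModelSettings are stand-ins with hypothetical fields (buffer as raw bytes, an optional float temperature), and CannedSTTModel with its trace_log list is invented for the example; it does not use the library's tracing.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioInput:  # stand-in; the real type is not described in this section
    buffer: bytes


@dataclass
class STTModelSettings:  # stand-in; the field type and default are assumptions
    temperature: Optional[float] = None


class CannedSTTModel:
    """Toy model: returns a fixed transcript and keeps a trace-like log."""

    def __init__(self, transcript: str) -> None:
        self._transcript = transcript
        self.trace_log: list[dict] = []  # hypothetical, stands in for real tracing

    async def transcribe(
        self,
        input: AudioInput,
        settings: STTModelSettings,
        trace_include_sensitive_data: bool,
        trace_include_sensitive_audio_data: bool,
    ) -> str:
        entry: dict = {"audio_bytes": len(input.buffer)}
        if trace_include_sensitive_audio_data:
            entry["audio"] = input.buffer  # raw audio only when explicitly allowed
        if trace_include_sensitive_data:
            entry["transcript"] = self._transcript  # text only when explicitly allowed
        self.trace_log.append(entry)
        return self._transcript


model = CannedSTTModel("hi")
text = asyncio.run(
    model.transcribe(AudioInput(b"\x00\x01"), STTModelSettings(), False, False)
)
```

With both flags set to False, the log entry records only the size of the audio, while the caller still receives the full transcript.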
create_session abstractmethod async
create_session(input: StreamedAudioInput, settings: STTModelSettings, trace_include_sensitive_data: bool, trace_include_sensitive_audio_data: bool) -> StreamedTranscriptionSession
Creates a new transcription session, which you can push audio to, and receive a stream of text transcriptions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input | StreamedAudioInput | The audio input to transcribe. | required |
| settings | STTModelSettings | The settings to use for the transcription. | required |
| trace_include_sensitive_data | bool | Whether to include sensitive data in traces. | required |
| trace_include_sensitive_audio_data | bool | Whether to include sensitive audio data in traces. | required |
Returns:
| Type | Description |
|---|---|
| StreamedTranscriptionSession | A new transcription session. |
VoiceModelProvider
Bases: ABC
The base interface for a voice model provider.
A model provider is responsible for creating speech-to-text and text-to-speech models, given a name.
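One way to picture "creating models given a name" is a registry that maps names to factories. The sketch below is an assumption-heavy illustration: the method names get_stt_model and get_tts_model, the optional model_name argument, and the fallback to a default name are not stated in this section, and RegistryVoiceModelProvider is a hypothetical class that does not subclass the real interface.

```python
from typing import Callable, Optional


class RegistryVoiceModelProvider:
    """Toy provider: looks models up by name in two registries of factories."""

    def __init__(
        self,
        stt_factories: dict[str, Callable[[], object]],
        tts_factories: dict[str, Callable[[], object]],
        default_stt: str,
        default_tts: str,
    ) -> None:
        self._stt_factories = stt_factories
        self._tts_factories = tts_factories
        self._default_stt = default_stt
        self._default_tts = default_tts

    # Method names below are assumptions, not taken from this section.
    def get_stt_model(self, model_name: Optional[str]) -> object:
        return self._build(self._stt_factories, model_name or self._default_stt)

    def get_tts_model(self, model_name: Optional[str]) -> object:
        return self._build(self._tts_factories, model_name or self._default_tts)

    @staticmethod
    def _build(factories: dict[str, Callable[[], object]], name: str) -> object:
        try:
            factory = factories[name]
        except KeyError:
            raise ValueError(f"Unknown voice model: {name!r}") from None
        return factory()


provider = RegistryVoiceModelProvider(
    stt_factories={"stt-a": lambda: "STT-A"},  # strings stand in for model objects
    tts_factories={"tts-a": lambda: "TTS-A"},
    default_stt="stt-a",
    default_tts="tts-a",
)
stt = provider.get_stt_model(None)
tts = provider.get_tts_model("tts-a")
```

Keeping factories rather than instances in the registry means a model is only constructed when it is actually requested by name.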