The pronunciation can also contain pitch accents. The start of a pitch phrase is specified with `^` and the down-pitch position is specified with `!`, for example:

::

    phrase:端  pronunciation:^はし
    phrase:箸  pronunciation:^は!し
    phrase:橋  pronunciation:^はし!

We currently only support the Tokyo dialect, which allows at most one down-pitch per phrase (i.e. at most one `!` between `^`).

PHONETIC_ENCODING_PINYIN (4):
    Used to specify pronunciations for Mandarin words. See https://en.wikipedia.org/wiki/Pinyin. For example: 朝阳, the pronunciation is "chao2 yang2". The number represents the tone, and there is a space between syllables. Neutral tones are represented by 5, for example 孩子 "hai2 zi5".
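The pitch-accent rule described above (each `^` starts a pitch phrase, and a phrase may hold at most one `!`) can be checked locally before a pronunciation is sent. The following is a minimal sketch, not part of the client library: the function name ``has_valid_pitch_accents`` is hypothetical, and it assumes that any text before the first `^` is checked like any other phrase.

```python
def has_valid_pitch_accents(pronunciation: str) -> bool:
    """Return True if every pitch phrase has at most one down-pitch mark."""
    # Each "^" opens a new pitch phrase, so splitting on it yields the phrases.
    # Assumption: text before the first "^" is checked like any other phrase.
    return all(phrase.count("!") <= 1 for phrase in pronunciation.split("^"))


# The three examples from the description above all pass the check.
for sample in ("^はし", "^は!し", "^はし!"):
    assert has_valid_pitch_accents(sample)

# Two down-pitch marks inside one phrase violate the rule.
assert not has_valid_pitch_accents("^は!し!")
```

The check only covers the marker count; it does not verify that the remaining characters are valid yomigana.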

CustomPronunciations

A collection of pronunciation customizations.

CustomVoiceParams

Description of the custom voice to be synthesized.

ReportedUsage

Deprecated. The usage of the synthesized audio. Usage does not affect billing.

ListVoicesRequest

The top-level message sent by the client for the ListVoices method.

ListVoicesResponse

The message returned to the client by the ListVoices method.

MultiSpeakerMarkup

A collection of turns for multi-speaker synthesis.

MultiSpeakerVoiceConfig

Configuration for a multi-speaker text-to-speech setup. Enables the use of up to two distinct voices in a single synthesis request.

MultispeakerPrebuiltVoice

Configuration for a single speaker in a Gemini TTS multi-speaker setup. Enables dialogue between two speakers.

SsmlVoiceGender

Gender of the voice as described in `SSML voice element <https://www.w3.org/TR/speech-synthesis11/#edef_voice>`__.

StreamingAudioConfig

Description of the desired output audio data.

StreamingSynthesisInput

Input to be synthesized.

This message has `oneof`_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
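The oneof behaviour described above (setting one member clears the others) can be illustrated with a toy model. This is only a sketch of the semantics, not how proto-plus implements them; the class ``OneofGroup`` and the member names ``option_a`` and ``option_b`` are hypothetical and do not correspond to real message fields.

```python
from typing import Any, Optional


class OneofGroup:
    """Toy model of a oneof group: at most one member holds a value."""

    def __init__(self, *members: str) -> None:
        self._members = frozenset(members)
        self._active: Optional[str] = None  # name of the member currently set
        self._value: Any = None

    def set(self, member: str, value: Any) -> None:
        if member not in self._members:
            raise KeyError(member)
        # Keeping a single (name, value) pair is what clears the other members.
        self._active, self._value = member, value

    def get(self, member: str) -> Any:
        if member not in self._members:
            raise KeyError(member)
        return self._value if member == self._active else None

    def which(self) -> Optional[str]:
        """Return the name of the member that is set, or None."""
        return self._active


group = OneofGroup("option_a", "option_b")
group.set("option_a", "hello")
group.set("option_b", "world")  # option_a is cleared by this assignment
assert group.get("option_a") is None
assert group.which() == "option_b"
```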

StreamingSynthesizeConfig

Provides configuration information for the StreamingSynthesize request.

StreamingSynthesizeRequest

Request message for the StreamingSynthesize method. Multiple StreamingSynthesizeRequest messages are sent in one call. The first message must contain a streaming_config that fully specifies the request configuration and must not contain input. All subsequent messages must only have input set.

This message has `oneof`_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
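The message ordering required for a streaming call (configuration first, input afterwards) can be sketched with plain dictionaries standing in for the request messages. The helper names ``build_requests`` and ``follows_ordering`` are hypothetical, the dictionaries are placeholders rather than the client library's message classes, and treating an empty stream as invalid is an assumption.

```python
from typing import Any, Dict, Iterable, Iterator


def build_requests(config: Any, inputs: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Yield request stand-ins in the required order."""
    # The first message carries only the configuration.
    yield {"streaming_config": config}
    # Every later message carries only input.
    for item in inputs:
        yield {"input": item}


def follows_ordering(requests: Iterable[Dict[str, Any]]) -> bool:
    """Check that a sequence of request stand-ins obeys the ordering rule."""
    count = 0
    for count, request in enumerate(requests, start=1):
        expected = {"streaming_config"} if count == 1 else {"input"}
        if set(request) != expected:
            return False
    # Assumption: a stream with no messages at all is treated as invalid.
    return count > 0


requests = list(build_requests({"voice": "placeholder"}, ["Hello, ", "world."]))
assert follows_ordering(requests)
assert not follows_ordering([{"input": "Hello"}])  # configuration is missing
```

Writing the producer as a generator mirrors how a streaming call typically consumes an iterator of requests, so the configuration message cannot accidentally be sent late.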

StreamingSynthesizeResponse

StreamingSynthesizeResponse is the only message returned to the client by the StreamingSynthesize method. A series of zero or more StreamingSynthesizeResponse messages are streamed back to the client.

SynthesisInput

Contains text input to be synthesized. Either text or ssml must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT. The input size is limited to 5000 bytes.

This message has `oneof`_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
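The SynthesisInput constraints above (exactly one of text or ssml, and a 5000-byte limit) can be checked on the client side before a request is made. This is a hedged sketch: the function ``check_synthesis_input`` is hypothetical, it raises ``ValueError`` locally rather than reproducing the service's INVALID_ARGUMENT response, and it assumes the limit is measured on the UTF-8 encoding of the input.

```python
from typing import Optional

MAX_INPUT_BYTES = 5000  # limit stated in the description above


def check_synthesis_input(text: Optional[str] = None,
                          ssml: Optional[str] = None) -> str:
    """Return "text" or "ssml" for the supplied field, or raise ValueError."""
    provided = {
        name: value
        for name, value in (("text", text), ("ssml", ssml))
        if value is not None
    }
    # Supplying both or neither is rejected.
    if len(provided) != 1:
        raise ValueError("exactly one of text or ssml must be supplied")
    (name, value), = provided.items()
    # Assumption: the size limit applies to the UTF-8 encoded input.
    if len(value.encode("utf-8")) > MAX_INPUT_BYTES:
        raise ValueError(f"{name} exceeds {MAX_INPUT_BYTES} bytes")
    return name


assert check_synthesis_input(text="Hello") == "text"
assert check_synthesis_input(ssml="<speak>Hello</speak>") == "ssml"
```

Counting bytes rather than characters matters for non-ASCII input, where a single character can occupy several bytes.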

SynthesizeLongAudioMetadata

Metadata for response returned by the SynthesizeLongAudio method.

SynthesizeLongAudioRequest

The top-level message sent by the client for the SynthesizeLongAudio method.

SynthesizeLongAudioResponse

The message returned to the client by the SynthesizeLongAudio method.

SynthesizeSpeechRequest

The top-level message sent by the client for the SynthesizeSpeech method.


SynthesizeSpeechResponse

The message returned to the client by the SynthesizeSpeech method.

Voice

Description of a voice supported by the TTS service.

VoiceCloneParams

The configuration of the Voice Clone feature.

VoiceSelectionParams

Description of which voice to use for a synthesis request.

TextToSpeechAsyncClient

Service that implements Google Cloud Text-to-Speech API.

TextToSpeechClient

Service that implements Google Cloud Text-to-Speech API.

TextToSpeechLongAudioSynthesizeAsyncClient

Service that implements Google Cloud Text-to-Speech API.

TextToSpeechLongAudioSynthesizeClient

Service that implements Google Cloud Text-to-Speech API.

AdvancedVoiceOptions

Used for advanced voice options.


AudioConfig

Description of audio data to be synthesized.

AudioEncoding

Configuration to set up audio encoder. The encoding determines the output audio format that we'd like.

CustomPronunciationParams

Pronunciation customization for a phrase.


PhoneticEncoding

The phonetic encoding of the phrase.

    The pronunciation can also contain pitch accents. The start of a pitch phrase is specified with `^` and the down-pitch position is specified with `!`, for example:

    ::

        phrase:端  pronunciation:^はし
        phrase:箸  pronunciation:^は!し
        phrase:橋  pronunciation:^はし!

    We currently only support the Tokyo dialect, which allows at most one down-pitch per phrase (i.e. at most one `!` between `^`).

    PHONETIC_ENCODING_PINYIN (4):
        Used to specify pronunciations for Mandarin words. See https://en.wikipedia.org/wiki/Pinyin. For example: 朝阳, the pronunciation is "chao2 yang2". The number represents the tone, and there is a space between syllables. Neutral tones are represented by 5, for example 孩子 "hai2 zi5".
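The numbered Pinyin format described above (syllables separated by a single space, each ending in a tone digit, with 5 for the neutral tone) lends itself to a quick format check. The sketch below is illustrative only: the function name ``looks_like_numbered_pinyin`` is hypothetical, and it assumes syllables are written with lowercase ASCII letters and that tone digits range from 1 to 5. It does not verify that a syllable is a real Mandarin syllable.

```python
import re

# One or more syllables, each "letters + tone digit", separated by one space.
# Assumption: lowercase ASCII letters only, tone digits 1 through 5.
_NUMBERED_PINYIN = re.compile(r"[a-z]+[1-5](?: [a-z]+[1-5])*")


def looks_like_numbered_pinyin(pronunciation: str) -> bool:
    """Return True if the string matches the numbered Pinyin layout."""
    return _NUMBERED_PINYIN.fullmatch(pronunciation) is not None


# Both examples from the description above match the layout.
assert looks_like_numbered_pinyin("chao2 yang2")
assert looks_like_numbered_pinyin("hai2 zi5")

# A missing space between syllables or a missing tone digit is rejected.
assert not looks_like_numbered_pinyin("chao2yang2")
assert not looks_like_numbered_pinyin("hai2 zi")
```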

CustomPronunciations

A collection of pronunciation customizations.

CustomVoiceParams

Description of the custom voice to be synthesized.

ReportedUsage

Deprecated. The usage of the synthesized audio. Usage does not affect billing.

ListVoicesRequest

The top-level message sent by the client for the ListVoices method.

ListVoicesResponse

The message returned to the client by the ListVoices method.

MultiSpeakerMarkup

A collection of turns for multi-speaker synthesis.

Turn

A multi-speaker turn.

MultiSpeakerVoiceConfig

Configuration for a multi-speaker text-to-speech setup. Enables the use of up to two distinct voices in a single synthesis request.

MultispeakerPrebuiltVoice

Configuration for a single speaker in a Gemini TTS multi-speaker setup. Enables dialogue between two speakers.

SsmlVoiceGender

Gender of the voice as described in `SSML voice element <https://www.w3.org/TR/speech-synthesis11/#edef_voice>`__.

StreamingAudioConfig

Description of the desired output audio data.

StreamingSynthesisInput

Input to be synthesized.

This message has `oneof`_ fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.

.. _oneof: https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-01-13 UTC.


Package Classes (2.34.0)

Classes
