Recognizers
The Cloud Speech-to-Text API V2 supports a Google Cloud resource called recognizers. Recognizers represent stored and reusable recognition configuration. You can use them to logically group together transcriptions or traffic for your application.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Verify that billing is enabled for your Google Cloud project.
Enable the Speech-to-Text APIs.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
Make sure that you have the following role or roles on the project: Cloud Speech Administrator
Check for the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
- For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
Grant the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- Click Grant access.
- In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
- In the Select a role list, select a role.
- To grant additional roles, click Add another role and add each additional role.
- Click Save.
Install the Google Cloud CLI.
Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
To initialize the gcloud CLI, run the following command:
    gcloud init
If you're using a local shell, then create local authentication credentials for your user account:
    gcloud auth application-default login
You don't need to do this if you're using Cloud Shell.
If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.
Client libraries can use Application Default Credentials to authenticate with Google APIs and send requests to those APIs. With Application Default Credentials, you can test your application locally and deploy it without changing the underlying code. For more information, see Authenticate for using client libraries.
Also ensure you have installed the client library.
Understand recognizers
Recognizers are configurable, reusable recognition configurations. Creating recognizers with frequently used recognition configuration helps to simplify and reduce the size of recognition requests.
The core element of a recognizer is its default configuration. This is the configuration for every recognition request that this recognizer performs. You can override this default per request. Keep the default configuration for features you need across requests for a given recognizer, while overriding specific features for specific requests.
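The relationship between a default configuration and a per-request override can be pictured as a masked merge. The following is a conceptual sketch in plain Python dictionaries, not the service's implementation: the function name `apply_override` and the dictionary layout are hypothetical, and it models only the case where a field mask explicitly names the fields to override (as the `config_mask` in the override sample on this page does).

```python
from copy import deepcopy
from typing import Any, Dict, List


def apply_override(
    default_config: Dict[str, Any],
    request_config: Dict[str, Any],
    mask_paths: List[str],
) -> Dict[str, Any]:
    """Return the default config with only the masked fields taken from the request.

    Hypothetical helper for illustration; assumes every masked path exists
    in request_config.
    """
    effective = deepcopy(default_config)
    for path in mask_paths:
        keys = path.split(".")
        source: Any = request_config
        target = effective
        # Walk down to the parent of the final key in both dictionaries.
        for key in keys[:-1]:
            source = source[key]
            target = target.setdefault(key, {})
        # Copy only the field named by the mask; everything else keeps its default.
        target[keys[-1]] = deepcopy(source[keys[-1]])
    return effective


# Defaults stored on the recognizer (illustrative values).
recognizer_default = {
    "language_codes": ["en-US"],
    "model": "long",
    "features": {
        "enable_automatic_punctuation": True,
        "enable_word_time_offsets": True,
    },
}

# A single request that turns off one feature.
request_override = {"features": {"enable_word_time_offsets": False}}

effective_config = apply_override(
    recognizer_default,
    request_override,
    ["features.enable_word_time_offsets"],
)
print(effective_config)
```

In this sketch the masked field changes for the one request while the other defaults, and the stored default itself, stay as they were.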
Reuse recognizers as often as possible. Creating one for each request dramatically increases the latency of your application and consumes your resource quotas. Create them infrequently during integration and setup, then reuse them for recognition requests.
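One way to follow this advice inside a long-running process is to remember which recognizers have already been set up, so the setup call runs once per recognizer ID. The sketch below is an assumption-laden illustration: `RecognizerNameCache`, `ensure_recognizer`, and `fake_ensure` are hypothetical names, and the stand-in function replaces a real get-or-create call against the API so the example runs without credentials.

```python
from typing import Callable, Dict, List


class RecognizerNameCache:
    """Remembers recognizer resource names so setup runs once per recognizer ID."""

    def __init__(self, ensure_recognizer: Callable[[str], str]) -> None:
        # ensure_recognizer is a hypothetical callable that fetches or creates
        # the recognizer and returns its full resource name.
        self._ensure_recognizer = ensure_recognizer
        self._names: Dict[str, str] = {}

    def name_for(self, recognizer_id: str) -> str:
        """Return the cached resource name, running setup only on first use."""
        if recognizer_id not in self._names:
            self._names[recognizer_id] = self._ensure_recognizer(recognizer_id)
        return self._names[recognizer_id]


setup_calls: List[str] = []


def fake_ensure(recognizer_id: str) -> str:
    """Stand-in for a real get-or-create call; records each invocation."""
    setup_calls.append(recognizer_id)
    return f"projects/my-project/locations/global/recognizers/{recognizer_id}"


cache = RecognizerNameCache(fake_ensure)
first = cache.name_for("my-recognizer")
second = cache.name_for("my-recognizer")
print(first, len(setup_calls))
```

In real code, the callable passed to the cache would wrap the client calls for looking up or creating a recognizer; the cache only ensures those calls are not repeated for every recognition request.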
Create recognizers
Here is an example of creating a recognizer that can be used to send recognition requests:
Python
    import os

    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


    def create_recognizer(recognizer_id: str) -> cloud_speech.Recognizer:
        """Creates a recognizer with a unique ID and default recognition configuration.

        Args:
            recognizer_id (str): The unique identifier for the recognizer to be created.

        Returns:
            cloud_speech.Recognizer: The created recognizer object with configuration.
        """
        # Instantiates a client
        client = SpeechClient()

        request = cloud_speech.CreateRecognizerRequest(
            parent=f"projects/{PROJECT_ID}/locations/global",
            recognizer_id=recognizer_id,
            recognizer=cloud_speech.Recognizer(
                default_recognition_config=cloud_speech.RecognitionConfig(
                    language_codes=["en-US"], model="long"
                ),
            ),
        )

        # Sends the request to create a recognizer and waits for the operation to complete
        operation = client.create_recognizer(request=request)
        recognizer = operation.result()

        print("Created Recognizer:", recognizer.name)
        return recognizer

Use an existing recognizer to send requests
Here is an example of sending multiple recognition requests using the same recognizer:
Python
    import os

    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


    def transcribe_reuse_recognizer(
        audio_file: str,
        recognizer_id: str,
    ) -> cloud_speech.RecognizeResponse:
        """Transcribe an audio file using an existing recognizer.

        Args:
            audio_file (str): Path to the local audio file to be transcribed.
                Example: "resources/audio.wav"
            recognizer_id (str): The ID of the existing recognizer to be used for transcription.

        Returns:
            cloud_speech.RecognizeResponse: The response containing the transcription results.
        """
        # Instantiates a client
        client = SpeechClient()

        # Reads a file as bytes
        with open(audio_file, "rb") as f:
            audio_content = f.read()

        request = cloud_speech.RecognizeRequest(
            recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/{recognizer_id}",
            content=audio_content,
        )

        # Transcribes the audio into text
        response = client.recognize(request=request)

        for result in response.results:
            print(f"Transcript: {result.alternatives[0].transcript}")

        return response

Enable features in a recognizer
Recognizers can be used to enable various features in recognition, such as automatic punctuation or profanity filtering.
Here is an example of enabling automatic punctuation in a recognizer's default configuration, so that it applies to recognition requests that use this recognizer:
Note: Samples throughout the documentation repeatedly illustrate creating recognizers. However, you should create them infrequently and reuse them often.
Python
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech
    from google.api_core.exceptions import NotFound

    # Instantiates a client
    client = SpeechClient()

    # TODO(developer): Update and un-comment below lines
    # PROJECT_ID = "your-project-id"
    # recognizer_id = "id-recognizer"
    # audio_file = "resources/audio.wav"
    recognizer_name = (
        f"projects/{PROJECT_ID}/locations/global/recognizers/{recognizer_id}"
    )
    try:
        # Use an existing recognizer
        recognizer = client.get_recognizer(name=recognizer_name)
        print("Using existing Recognizer:", recognizer.name)
    except NotFound:
        # Create a new recognizer
        request = cloud_speech.CreateRecognizerRequest(
            parent=f"projects/{PROJECT_ID}/locations/global",
            recognizer_id=recognizer_id,
            recognizer=cloud_speech.Recognizer(
                default_recognition_config=cloud_speech.RecognitionConfig(
                    auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
                    language_codes=["en-US"],
                    model="latest_long",
                    features=cloud_speech.RecognitionFeatures(
                        enable_automatic_punctuation=True,
                    ),
                ),
            ),
        )
        operation = client.create_recognizer(request=request)
        recognizer = operation.result()
        print("Created Recognizer:", recognizer.name)

    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        audio_content = f.read()

    request = cloud_speech.RecognizeRequest(
        recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/{recognizer_id}",
        content=audio_content,
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

Override recognizer features in recognition requests
Here is an example of enabling multiple features in a recognizer (automatic punctuation and word time offsets), but disabling word time offsets for this recognition request:
Python
    import os

    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech
    from google.protobuf.field_mask_pb2 import FieldMask

    PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


    def transcribe_override_recognizer(
        audio_file: str,
        recognizer_id: str,
    ) -> cloud_speech.RecognizeResponse:
        """Transcribe an audio file using an existing recognizer with overridden settings
        for the recognition request.

        Args:
            audio_file (str): Path to the local audio file to be transcribed.
                Example: "resources/audio.wav"
            recognizer_id (str): The unique ID of the recognizer to be used for transcription.

        Returns:
            cloud_speech.RecognizeResponse: The response containing the transcription results.
        """
        # Instantiates a client
        client = SpeechClient()

        request = cloud_speech.CreateRecognizerRequest(
            parent=f"projects/{PROJECT_ID}/locations/global",
            recognizer_id=recognizer_id,
            recognizer=cloud_speech.Recognizer(
                default_recognition_config=cloud_speech.RecognitionConfig(
                    auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
                    language_codes=["en-US"],
                    model="latest_long",
                    features=cloud_speech.RecognitionFeatures(
                        enable_automatic_punctuation=True,
                        enable_word_time_offsets=True,
                    ),
                ),
            ),
        )
        operation = client.create_recognizer(request=request)
        recognizer = operation.result()
        print("Created Recognizer:", recognizer.name)

        # Reads a file as bytes
        with open(audio_file, "rb") as f:
            audio_content = f.read()

        request = cloud_speech.RecognizeRequest(
            recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/{recognizer_id}",
            config=cloud_speech.RecognitionConfig(
                features=cloud_speech.RecognitionFeatures(
                    enable_word_time_offsets=False,
                ),
            ),
            config_mask=FieldMask(paths=["features.enable_word_time_offsets"]),
            content=audio_content,
        )

        # Transcribes the audio into text
        response = client.recognize(request=request)

        for result in response.results:
            print(f"Transcript: {result.alternatives[0].transcript}")

        return response

Send requests without recognizers
Recognizers are optional in recognition requests. To make a request without a recognizer, use the recognizer resource ID _ (a single underscore) in the location where you are making the request. Here is an example:
Python
    import os

    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


    def quickstart_v2(audio_file: str) -> cloud_speech.RecognizeResponse:
        """Transcribe an audio file.

        Args:
            audio_file (str): Path to the local audio file to be transcribed.

        Returns:
            cloud_speech.RecognizeResponse: The response from the recognize request,
                containing the transcription results
        """
        # Reads a file as bytes
        with open(audio_file, "rb") as f:
            audio_content = f.read()

        # Instantiates a client
        client = SpeechClient()

        config = cloud_speech.RecognitionConfig(
            auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
            language_codes=["en-US"],
            model="long",
        )

        request = cloud_speech.RecognizeRequest(
            recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/_",
            config=config,
            content=audio_content,
        )

        # Transcribes the audio into text
        response = client.recognize(request=request)

        for result in response.results:
            print(f"Transcript: {result.alternatives[0].transcript}")

        return response

Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Optional: Revoke the authentication credentials that you created, and delete the local credential file.
    gcloud auth application-default revoke
Optional: Revoke credentials from the gcloud CLI.
    gcloud auth revoke
What's next
- Learn how to transcribe short audio files.
- Learn how to transcribe streaming audio.
- Learn how to transcribe long audio files.
- For best performance, accuracy, and other tips, see the best practices documentation.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.