Official Python SDK for Deepgram.
The official Python SDK for Deepgram's automated speech recognition, text-to-speech, and language understanding APIs. Power your applications with world-class speech and Language AI models.
Comprehensive API documentation and guides are available at developers.deepgram.com.
Install the Deepgram Python SDK using pip:
```shell
pip install deepgram-sdk
```
- API Reference - Complete reference for all SDK methods and parameters
- WebSocket Reference - Detailed documentation for real-time WebSocket connections
The Deepgram SDK provides both synchronous and asynchronous clients for all major use cases:
Our newest and most advanced speech recognition model with contextual turn detection (WebSocket Reference):
```python
from deepgram import DeepgramClient
from deepgram.core.events import EventType

client = DeepgramClient()

with client.listen.v2.connect(
    model="flux-general-en",
    encoding="linear16",
    sample_rate="16000",
) as connection:
    def on_message(message):
        print(f"Received {message.type} event")

    connection.on(EventType.OPEN, lambda _: print("Connection opened"))
    connection.on(EventType.MESSAGE, on_message)
    connection.on(EventType.CLOSE, lambda _: print("Connection closed"))
    connection.on(EventType.ERROR, lambda error: print(f"Error: {error}"))

    # Start listening and send audio data
    connection.start_listening()
```
Transcribe pre-recorded audio files (API Reference):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

with open("audio.wav", "rb") as audio_file:
    response = client.listen.v1.media.transcribe_file(
        request=audio_file.read(),
        model="nova-3",
    )

print(response.results.channels[0].alternatives[0].transcript)
```
Generate natural-sounding speech from text (API Reference):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

response = client.speak.v1.audio.generate(
    text="Hello, this is a sample text to speech conversion."
)

# Save the audio file
with open("output.mp3", "wb") as audio_file:
    audio_file.write(response.stream.getvalue())
```
Analyze text for sentiment, topics, and intents (API Reference):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

response = client.read.v1.text.analyze(
    request={"text": "Hello, world!"},
    language="en",
    sentiment=True,
    summarize=True,
    topics=True,
    intents=True,
)
```
Build interactive voice agents (WebSocket Reference):
```python
from deepgram import DeepgramClient
from deepgram.extensions.types.sockets import (
    AgentV1SettingsMessage,
    AgentV1Agent,
    AgentV1AudioConfig,
    AgentV1AudioInput,
    AgentV1Listen,
    AgentV1ListenProvider,
    AgentV1Think,
    AgentV1OpenAiThinkProvider,
    AgentV1SpeakProviderConfig,
    AgentV1DeepgramSpeakProvider,
)

client = DeepgramClient()

with client.agent.v1.connect() as agent:
    settings = AgentV1SettingsMessage(
        audio=AgentV1AudioConfig(
            input=AgentV1AudioInput(encoding="linear16", sample_rate=44100)
        ),
        agent=AgentV1Agent(
            listen=AgentV1Listen(
                provider=AgentV1ListenProvider(type="deepgram", model="nova-3")
            ),
            think=AgentV1Think(
                provider=AgentV1OpenAiThinkProvider(type="open_ai", model="gpt-4o-mini")
            ),
            speak=AgentV1SpeakProviderConfig(
                provider=AgentV1DeepgramSpeakProvider(type="deepgram", model="aura-2-asteria-en")
            ),
        ),
    )

    agent.send_settings(settings)
    agent.start_listening()
```
For comprehensive documentation of all available methods, parameters, and options:
API Reference - Complete reference for REST API methods including:
- Listen (Speech-to-Text): File transcription, URL transcription, and media processing
- Speak (Text-to-Speech): Audio generation and voice synthesis
- Read (Text Intelligence): Text analysis, sentiment, summarization, and topic detection
- Manage: Project management, API keys, and usage analytics
- Auth: Token generation and authentication management
WebSocket Reference - Detailed documentation for real-time connections:
- Listen v1/v2: Real-time speech recognition with different model capabilities
- Speak v1: Real-time text-to-speech streaming
- Agent v1: Conversational voice agents with integrated STT, LLM, and TTS
The Deepgram SDK supports two authentication methods:
Use access tokens for temporary or scoped access (recommended for client-side applications):
```python
from deepgram import DeepgramClient

# Explicit access token
client = DeepgramClient(access_token="YOUR_ACCESS_TOKEN")

# Or via environment variable DEEPGRAM_TOKEN
client = DeepgramClient()

# Generate access tokens using your API key
auth_client = DeepgramClient(api_key="YOUR_API_KEY")
token_response = auth_client.auth.v1.tokens.grant()
token_client = DeepgramClient(access_token=token_response.access_token)
```
Use your Deepgram API key for server-side applications:
```python
from deepgram import DeepgramClient

# Explicit API key
client = DeepgramClient(api_key="YOUR_API_KEY")

# Or via environment variable DEEPGRAM_API_KEY
client = DeepgramClient()
```
The SDK automatically discovers credentials from these environment variables:
- `DEEPGRAM_TOKEN` - Your access token (takes precedence)
- `DEEPGRAM_API_KEY` - Your Deepgram API key
Precedence: Explicit parameters > Environment variables
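The precedence rules above can be sketched as a small resolver. This is illustrative only: `resolve_credentials` is a hypothetical helper written for this example, not the SDK's internal logic.

```python
import os

def resolve_credentials(access_token=None, api_key=None, env=None):
    """Sketch of the documented precedence: explicit parameters win over
    environment variables, and an access token wins over an API key."""
    env = os.environ if env is None else env
    if access_token:
        return ("access_token", access_token)
    if api_key:
        return ("api_key", api_key)
    if env.get("DEEPGRAM_TOKEN"):
        return ("access_token", env["DEEPGRAM_TOKEN"])
    if env.get("DEEPGRAM_API_KEY"):
        return ("api_key", env["DEEPGRAM_API_KEY"])
    raise ValueError("No Deepgram credentials found")

# An explicit parameter overrides anything in the environment
print(resolve_credentials(api_key="explicit", env={"DEEPGRAM_TOKEN": "from-env"}))
# → ('api_key', 'explicit')
```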
The SDK provides full async/await support for non-blocking operations:
```python
import asyncio

from deepgram import AsyncDeepgramClient
from deepgram.core.events import EventType

async def main():
    client = AsyncDeepgramClient()

    # Async file transcription
    with open("audio.wav", "rb") as audio_file:
        response = await client.listen.v1.media.transcribe_file(
            request=audio_file.read(),
            model="nova-3",
        )

    # Async WebSocket connection
    async with client.listen.v2.connect(
        model="flux-general-en",
        encoding="linear16",
        sample_rate="16000",
    ) as connection:
        async def on_message(message):
            print(f"Received {message.type} event")

        connection.on(EventType.MESSAGE, on_message)
        await connection.start_listening()

asyncio.run(main())
```
The SDK provides detailed error information for debugging and error handling:
```python
from deepgram import DeepgramClient
from deepgram.core.api_error import ApiError

client = DeepgramClient()

try:
    response = client.listen.v1.media.transcribe_file(
        request=audio_data,
        model="nova-3",
    )
except ApiError as e:
    print(f"Status Code: {e.status_code}")
    print(f"Error Details: {e.body}")
    print(f"Request ID: {e.headers.get('x-dg-request-id', 'N/A')}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
Access raw HTTP response data including headers:
```python
from deepgram import DeepgramClient

client = DeepgramClient()

response = client.listen.v1.media.with_raw_response.transcribe_file(
    request=audio_data,
    model="nova-3",
)

print(response.headers)  # Access response headers
print(response.data)     # Access the response object
```
Configure timeouts, retries, and other request options:
```python
from deepgram import DeepgramClient

# Global client configuration
client = DeepgramClient(timeout=30.0)

# Per-request configuration
response = client.listen.v1.media.transcribe_file(
    request=audio_data,
    model="nova-3",
    request_options={
        "timeout_in_seconds": 60,
        "max_retries": 3,
    },
)
```
Use a custom httpx client for advanced networking features:
```python
import httpx

from deepgram import DeepgramClient

client = DeepgramClient(
    httpx_client=httpx.Client(
        proxies="http://proxy.example.com",
        timeout=httpx.Timeout(30.0),
    )
)
```
The SDK automatically retries failed requests with exponential backoff:
```python
# Automatic retries for 408, 429, and 5xx status codes
response = client.listen.v1.media.transcribe_file(
    request=audio_data,
    model="nova-3",
    request_options={"max_retries": 3},
)
```
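To make the backoff behavior concrete, the loop below retries on the same status codes with exponentially increasing delays. `with_retries` and `send_request` are hypothetical names for this sketch; this is not the SDK's internal implementation.

```python
import time

RETRYABLE_STATUSES = {408, 429} | set(range(500, 600))

def with_retries(send_request, max_retries=3, base_delay=0.5):
    """Retry send_request() on 408, 429, and 5xx responses, sleeping
    base_delay * 2**attempt between tries (exponential backoff)."""
    for attempt in range(max_retries + 1):
        status, body = send_request()
        if status not in RETRYABLE_STATUSES or attempt == max_retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1.0s, 2.0s, ...

# Simulated endpoint: rate-limited twice, then succeeds
attempts = []
def send_request():
    attempts.append(1)
    return (429, "rate limited") if len(attempts) < 3 else (200, "ok")

print(with_retries(send_request, base_delay=0.01))  # → (200, 'ok')
```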
We welcome contributions to improve this SDK! However, please note that this library is primarily generated from our API specifications.
Install Poetry (if not already installed):
```shell
curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1
```
Install dependencies:
```shell
poetry install
```
Install example dependencies:
```shell
poetry run pip install -r examples/requirements.txt
```
Run tests:
```shell
poetry run pytest -rP
```
Run examples:
```shell
python -u examples/listen/v2/connect/main.py
```
See our CONTRIBUTING guide.
- Python 3.8+
- See `pyproject.toml` for the full dependency list
Please see our community code of conduct before contributing to this project.
This project is licensed under the MIT License - see the LICENSE file for details.