TeamADAPT/deepgram-python-sdk (public repository, forked from deepgram/deepgram-python-sdk)

Deepgram Python SDK

The official Python SDK for Deepgram's automated speech recognition, text-to-speech, and language understanding APIs. Power your applications with world-class speech and Language AI models.

Documentation

Comprehensive API documentation and guides are available at developers.deepgram.com.

Migrating From Earlier Versions

v2 to v3+
v3+ to v5 (current)

Installation

Install the Deepgram Python SDK using pip:

pip install deepgram-sdk

Reference

API Reference - Complete reference for all SDK methods and parameters
WebSocket Reference - Detailed documentation for real-time WebSocket connections

Usage

Quick Start

The Deepgram SDK provides both synchronous and asynchronous clients for all major use cases:

Real-time Speech Recognition (Listen v2)

Our newest and most advanced speech recognition model with contextual turn detection (WebSocket Reference):

    from deepgram import DeepgramClient
    from deepgram.core.events import EventType

    client = DeepgramClient()

    with client.listen.v2.connect(
        model="flux-general-en",
        encoding="linear16",
        sample_rate="16000"
    ) as connection:
        def on_message(message):
            print(f"Received {message.type} event")

        connection.on(EventType.OPEN, lambda _: print("Connection opened"))
        connection.on(EventType.MESSAGE, on_message)
        connection.on(EventType.CLOSE, lambda _: print("Connection closed"))
        connection.on(EventType.ERROR, lambda error: print(f"Error: {error}"))

        # Start listening and send audio data
        connection.start_listening()
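The connection above is configured for raw `linear16` audio at 16 kHz. As a minimal sketch, audio bytes could be split into fixed-duration chunks before being sent. The helper name `pcm_chunks`, the 80 ms chunk duration, and the mono assumption are illustrative choices, not part of the SDK; the call that actually sends each chunk is deliberately left out (see the WebSocket Reference for it).

```python
from typing import Iterator


def pcm_chunks(
    audio: bytes,
    sample_rate: int = 16000,
    chunk_ms: int = 80,
    bytes_per_sample: int = 2,  # linear16 = 16-bit samples; mono is assumed
) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM audio, each about chunk_ms long.

    Hypothetical helper: the name and defaults are assumptions for
    illustration, not SDK API.
    """
    chunk_size = sample_rate * bytes_per_sample * chunk_ms // 1000
    if chunk_size <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


# One second of silence at 16 kHz, 16-bit mono
one_second = bytes(16000 * 2)
chunks = list(pcm_chunks(one_second))
print(len(chunks), len(chunks[0]))
```

The final chunk may be shorter than the others when the audio length is not a multiple of the chunk size.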

File Transcription

Transcribe pre-recorded audio files (API Reference):

    from deepgram import DeepgramClient

    client = DeepgramClient()

    with open("audio.wav", "rb") as audio_file:
        response = client.listen.v1.media.transcribe_file(
            request=audio_file.read(),
            model="nova-3"
        )
        print(response.results.channels[0].alternatives[0].transcript)
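The attribute chain in the example above raises if the response has no channels or no alternatives. As a minimal sketch, assuming only the attribute path shown above, a small helper can return `None` instead. `first_transcript` is a hypothetical name, and the `SimpleNamespace` object merely stands in for a real response.

```python
from types import SimpleNamespace
from typing import Any, Optional


def first_transcript(response: Any) -> Optional[str]:
    """Return the first alternative's transcript of the first channel, or None."""
    try:
        return response.results.channels[0].alternatives[0].transcript
    except (AttributeError, IndexError, TypeError):
        return None


# Stand-in object with the same attribute path as the example above
fake = SimpleNamespace(
    results=SimpleNamespace(
        channels=[
            SimpleNamespace(alternatives=[SimpleNamespace(transcript="hello world")])
        ]
    )
)
print(first_transcript(fake))
```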

Text-to-Speech

Generate natural-sounding speech from text (API Reference):

    from deepgram import DeepgramClient

    client = DeepgramClient()

    response = client.speak.v1.audio.generate(
        text="Hello, this is a sample text to speech conversion."
    )

    # Save the audio file
    with open("output.mp3", "wb") as audio_file:
        audio_file.write(response.stream.getvalue())

Text Analysis

Analyze text for sentiment, topics, and intents (API Reference):

    from deepgram import DeepgramClient

    client = DeepgramClient()

    response = client.read.v1.text.analyze(
        request={"text": "Hello, world!"},
        language="en",
        sentiment=True,
        summarize=True,
        topics=True,
        intents=True
    )

Voice Agent (Conversational AI)

Build interactive voice agents (WebSocket Reference):

    from deepgram import DeepgramClient
    from deepgram.extensions.types.sockets import (
        AgentV1SettingsMessage, AgentV1Agent, AgentV1AudioConfig,
        AgentV1AudioInput, AgentV1Listen, AgentV1ListenProvider,
        AgentV1Think, AgentV1OpenAiThinkProvider, AgentV1SpeakProviderConfig,
        AgentV1DeepgramSpeakProvider
    )

    client = DeepgramClient()

    with client.agent.v1.connect() as agent:
        settings = AgentV1SettingsMessage(
            audio=AgentV1AudioConfig(
                input=AgentV1AudioInput(encoding="linear16", sample_rate=44100)
            ),
            agent=AgentV1Agent(
                listen=AgentV1Listen(
                    provider=AgentV1ListenProvider(type="deepgram", model="nova-3")
                ),
                think=AgentV1Think(
                    provider=AgentV1OpenAiThinkProvider(
                        type="open_ai", model="gpt-4o-mini"
                    )
                ),
                speak=AgentV1SpeakProviderConfig(
                    provider=AgentV1DeepgramSpeakProvider(
                        type="deepgram", model="aura-2-asteria-en"
                    )
                )
            )
        )

        agent.send_settings(settings)
        agent.start_listening()

Complete SDK Reference

For comprehensive documentation of all available methods, parameters, and options:

API Reference - Complete reference for REST API methods including:
- Listen (Speech-to-Text): File transcription, URL transcription, and media processing
- Speak (Text-to-Speech): Audio generation and voice synthesis
- Read (Text Intelligence): Text analysis, sentiment, summarization, and topic detection
- Manage: Project management, API keys, and usage analytics
- Auth: Token generation and authentication management
WebSocket Reference - Detailed documentation for real-time connections:
- Listen v1/v2: Real-time speech recognition with different model capabilities
- Speak v1: Real-time text-to-speech streaming
- Agent v1: Conversational voice agents with integrated STT, LLM, and TTS

Authentication

The Deepgram SDK supports two authentication methods:

Access Token Authentication

Use access tokens for temporary or scoped access (recommended for client-side applications):

    from deepgram import DeepgramClient

    # Explicit access token
    client = DeepgramClient(access_token="YOUR_ACCESS_TOKEN")

    # Or via environment variable DEEPGRAM_TOKEN
    client = DeepgramClient()

    # Generate access tokens using your API key
    auth_client = DeepgramClient(api_key="YOUR_API_KEY")
    token_response = auth_client.auth.v1.tokens.grant()
    token_client = DeepgramClient(access_token=token_response.access_token)
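Because access tokens are temporary, a long-running process would typically reuse a token until shortly before it expires rather than granting a new one per request. The following is a minimal sketch of that idea, not SDK functionality: `TokenCache` is a hypothetical class, `grant` is any callable you supply, and the assumption that the grant result carries an expiry in seconds is not confirmed by this README.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Cache a short-lived access token and refresh it shortly before expiry.

    Hypothetical helper, not SDK API. `grant` is any callable returning
    (access_token, expires_in_seconds); the expiry value is an assumption
    about what the token response provides.
    """

    def __init__(
        self,
        grant: Callable[[], Tuple[str, float]],
        refresh_margin: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._grant = grant
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a cached token, granting a new one if it is missing or near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._grant()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

In practice `grant` could wrap the `auth_client.auth.v1.tokens.grant()` call shown above and return the token together with its lifetime.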

API Key Authentication

Use your Deepgram API key for server-side applications:

    from deepgram import DeepgramClient

    # Explicit API key
    client = DeepgramClient(api_key="YOUR_API_KEY")

    # Or via environment variable DEEPGRAM_API_KEY
    client = DeepgramClient()

Environment Variables

The SDK automatically discovers credentials from these environment variables:

DEEPGRAM_TOKEN - Your access token (takes precedence)
DEEPGRAM_API_KEY - Your Deepgram API key

Precedence: Explicit parameters > Environment variables
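The precedence rules can be sketched as a small standalone function. This illustrates the documented order only and is not the SDK's actual implementation: `resolve_credentials` is a hypothetical name, and letting an explicit access token win over an explicit API key is an assumption that mirrors the environment-variable order.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    api_key: Optional[str] = None,
    access_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (kind, value), where kind is "access_token" or "api_key"."""
    # Explicit parameters win over anything in the environment.
    if access_token:
        return ("access_token", access_token)  # assumed to beat an explicit api_key
    if api_key:
        return ("api_key", api_key)
    # Among environment variables, DEEPGRAM_TOKEN takes precedence.
    if env.get("DEEPGRAM_TOKEN"):
        return ("access_token", env["DEEPGRAM_TOKEN"])
    if env.get("DEEPGRAM_API_KEY"):
        return ("api_key", env["DEEPGRAM_API_KEY"])
    raise ValueError("no Deepgram credentials found")


example_env = {"DEEPGRAM_TOKEN": "env-token", "DEEPGRAM_API_KEY": "env-key"}
print(resolve_credentials(env=example_env))
print(resolve_credentials(api_key="explicit-key", env=example_env))
```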

Async Client

The SDK provides full async/await support for non-blocking operations:

    import asyncio
    from deepgram import AsyncDeepgramClient
    from deepgram.core.events import EventType

    async def main():
        client = AsyncDeepgramClient()

        # Async file transcription
        with open("audio.wav", "rb") as audio_file:
            response = await client.listen.v1.media.transcribe_file(
                request=audio_file.read(),
                model="nova-3"
            )

        # Async WebSocket connection
        async with client.listen.v2.connect(
            model="flux-general-en",
            encoding="linear16",
            sample_rate="16000"
        ) as connection:
            async def on_message(message):
                print(f"Received {message.type} event")

            connection.on(EventType.MESSAGE, on_message)
            await connection.start_listening()

    asyncio.run(main())
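One common reason to use the async client is to process several files concurrently. The sketch below shows the general pattern with `asyncio.gather` and a semaphore that caps concurrency. `transcribe_many` and `fake_transcribe` are hypothetical names; in real use, the `transcribe` argument would be a coroutine that wraps the SDK call shown above.

```python
import asyncio
from typing import Awaitable, Callable, List


async def transcribe_many(
    paths: List[str],
    transcribe: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Run `transcribe` over all paths, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str) -> str:
        async with semaphore:
            return await transcribe(path)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(worker(p) for p in paths))


async def fake_transcribe(path: str) -> str:
    # Stand-in for a real async SDK call
    await asyncio.sleep(0)
    return f"transcript of {path}"


results = asyncio.run(transcribe_many(["a.wav", "b.wav"], fake_transcribe))
print(results)
```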

Exception Handling

The SDK provides detailed error information for debugging and error handling:

    from deepgram import DeepgramClient
    from deepgram.core.api_error import ApiError

    client = DeepgramClient()

    try:
        response = client.listen.v1.media.transcribe_file(
            request=audio_data,
            model="nova-3"
        )
    except ApiError as e:
        print(f"Status Code: {e.status_code}")
        print(f"Error Details: {e.body}")
        print(f"Request ID: {e.headers.get('x-dg-request-id', 'N/A')}")
    except Exception as e:
        print(f"Unexpected error: {e}")
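For structured logging, the same three fields can be collected into a dictionary instead of printed. This is a minimal sketch that relies only on the attributes used in the example above (`status_code`, `body`, `headers`). `describe_api_error` and `FakeApiError` are hypothetical names, and the error values are made up for the demonstration.

```python
from typing import Any, Dict


def describe_api_error(error: Any) -> Dict[str, Any]:
    """Collect status code, body, and request ID into a dict suitable for logging."""
    headers = getattr(error, "headers", None) or {}
    return {
        "status_code": getattr(error, "status_code", None),
        "body": getattr(error, "body", None),
        "request_id": headers.get("x-dg-request-id", "N/A"),
    }


class FakeApiError(Exception):
    """Stand-in exposing the attributes the example above reads from ApiError."""

    def __init__(self, status_code, body, headers):
        super().__init__(f"status {status_code}")
        self.status_code = status_code
        self.body = body
        self.headers = headers


try:
    # Made-up values, for demonstration only
    raise FakeApiError(401, {"message": "example"}, {"x-dg-request-id": "abc123"})
except FakeApiError as e:
    info = describe_api_error(e)

print(info)
```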

Advanced Features

Raw Response Access

Access raw HTTP response data including headers:

    from deepgram import DeepgramClient

    client = DeepgramClient()

    response = client.listen.v1.media.with_raw_response.transcribe_file(
        request=audio_data,
        model="nova-3"
    )

    print(response.headers)  # Access response headers
    print(response.data)     # Access the response object

Request Configuration

Configure timeouts, retries, and other request options:

    from deepgram import DeepgramClient

    # Global client configuration
    client = DeepgramClient(timeout=30.0)

    # Per-request configuration
    response = client.listen.v1.media.transcribe_file(
        request=audio_data,
        model="nova-3",
        request_options={
            "timeout_in_seconds": 60,
            "max_retries": 3
        }
    )

Custom HTTP Client

Use a custom httpx client for advanced networking features:

    import httpx
    from deepgram import DeepgramClient

    client = DeepgramClient(
        httpx_client=httpx.Client(
            # httpx 0.26+ takes `proxy=`; older releases used `proxies=`
            proxy="http://proxy.example.com",
            timeout=httpx.Timeout(30.0)
        )
    )

Retry Configuration

The SDK automatically retries failed requests with exponential backoff:

    # Automatic retries for 408, 429, and 5xx status codes
    response = client.listen.v1.media.transcribe_file(
        request=audio_data,
        model="nova-3",
        request_options={"max_retries": 3}
    )
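To make the retry behavior concrete, here is a generic sketch of exponential backoff for the status codes listed above. It is not the SDK's implementation: the function names, the base and maximum delays, and the jitter scheme are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {408, 429}


def is_retryable(status_code: int) -> bool:
    """Treat 408, 429, and any 5xx status as retryable."""
    return status_code in RETRYABLE_STATUSES or 500 <= status_code <= 599


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 3,
    base_delay: float = 0.5,  # assumed value, not taken from the SDK
    max_delay: float = 8.0,   # assumed value, not taken from the SDK
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or retries run out."""
    attempt = 0
    while True:
        status, body = send()
        if not is_retryable(status) or attempt >= max_retries:
            return status, body
        # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay,
        # scaled by random jitter in [0.5, 1.0).
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay * (0.5 + random.random() / 2))
        attempt += 1
```

Passing `sleep` as a parameter keeps the function testable without real waiting; with `max_retries=3`, `send` is called at most four times.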

Contributing

We welcome contributions to improve this SDK! However, please note that this library is primarily generated from our API specifications.

Development Setup

Install Poetry (if not already installed):

curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1

Install dependencies:
```
poetry install
```

Install example dependencies:

poetry run pip install -r examples/requirements.txt

Run tests:
```
poetry run pytest -rP .
```

Run examples:

python -u examples/listen/v2/connect/main.py

Contribution Guidelines

See our CONTRIBUTING guide.

Requirements

Python 3.8+
See pyproject.toml for the full dependency list

Community Code of Conduct

Please see our community code of conduct before contributing to this project.

License

This project is licensed under the MIT License - see the LICENSE file for details.
