Status Endpoint

Monitor self-hosted node health and readiness.

The /v1/status endpoint provides real-time health and readiness information for your Deepgram self-hosted nodes. This endpoint is essential for monitoring your deployment and integrating with load balancers, orchestration platforms, and health check systems.

Overview

The status endpoint reports the current operational state of a Deepgram node, tracking it through various states as it starts up, serves requests, and responds to runtime conditions. The endpoint helps prevent false critical alerts and provides accurate information about whether a node is ready to handle requests.

Response Format

The status endpoint returns a JSON object with the following fields:

{
  "system_health": "Healthy",
  "active_batch_requests": 0,
  "active_stream_requests": 0
}

system_health: The current state of the node (Initializing, Ready, Healthy, or Critical)
active_batch_requests: Number of pre-recorded transcription requests currently being processed
active_stream_requests: Number of real-time streaming requests currently active
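
The fields above can be read into a small typed structure before acting on them. The following is a minimal sketch, not part of the Deepgram tooling: the names NodeStatus, parse_status, and KNOWN_STATES are hypothetical, and the only assumption about the payload is the three fields and four states listed in this document.

```python
import json
from dataclasses import dataclass

# The four states named in this document.
KNOWN_STATES = {"Initializing", "Ready", "Healthy", "Critical"}


@dataclass(frozen=True)
class NodeStatus:
    """Hypothetical container for one /v1/status response."""
    system_health: str
    active_batch_requests: int
    active_stream_requests: int


def parse_status(body: str) -> NodeStatus:
    """Parse a status response body and validate the documented fields."""
    data = json.loads(body)
    health = data["system_health"]
    if health not in KNOWN_STATES:
        raise ValueError(f"unexpected system_health value: {health!r}")
    return NodeStatus(
        system_health=health,
        active_batch_requests=int(data["active_batch_requests"]),
        active_stream_requests=int(data["active_stream_requests"]),
    )


example_body = (
    '{"system_health": "Healthy", '
    '"active_batch_requests": 0, "active_stream_requests": 0}'
)
print(parse_status(example_body))
```

Rejecting unknown state values early keeps later routing or alerting logic from silently treating an unexpected value as healthy.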

Status States

The system_health field reports one of four possible states:

Initializing

Reported during node startup. When a Deepgram API node first starts, it reports Initializing status while it:

Establishes connections to Engine drivers
Loads configuration
Prepares to service requests

The node automatically transitions to Ready once initialization completes successfully.

Example Response:

{
  "system_health": "Initializing",
  "active_batch_requests": 0,
  "active_stream_requests": 0
}

Ready

The node can service requests. Once initialization is complete, the node transitions to Ready status, indicating it is capable of handling transcription and other API requests.

From the Ready state, the node will:

Transition to Healthy after successfully processing enough requests
Transition to Critical if errors occur during request processing

Example Response:

{
  "system_health": "Ready",
  "active_batch_requests": 2,
  "active_stream_requests": 1
}

Healthy

Sustained successful operation. After a node has successfully processed multiple requests, it transitions to Healthy status, indicating stable, production-ready operation.

A Healthy node can transition to Critical if failures arise during request processing.

Example Response:

{
  "system_health": "Healthy",
  "active_batch_requests": 0,
  "active_stream_requests": 0
}

Critical

Node is experiencing failures. When a node encounters errors that prevent it from successfully servicing requests, it transitions to Critical status.

This state indicates:

The node is experiencing operational issues
Requests may fail or produce errors
Intervention may be required

A node in Critical status can recover and transition back to Ready once it can successfully service requests again.

Example Response:

{
  "system_health": "Critical",
  "active_batch_requests": 0,
  "active_stream_requests": 0
}

State Transitions

Nodes transition between states as follows:

Initializing → Ready: Automatic transition when node startup completes
Ready → Healthy: After processing enough successful requests
Ready → Critical: If errors occur during request processing
Healthy → Critical: If failures arise during operation
Critical → Ready: When the node can successfully service requests again
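
The transitions listed above can be written down as a lookup table, which is handy when a monitoring tool wants to flag state changes it does not expect. This is a sketch under assumptions: ALLOWED_TRANSITIONS and is_documented_transition are hypothetical names, and the table encodes only the five transitions listed in this document.

```python
# Documented transitions, keyed by the state a node is leaving.
ALLOWED_TRANSITIONS = {
    "Initializing": {"Ready"},
    "Ready": {"Healthy", "Critical"},
    "Healthy": {"Critical"},
    "Critical": {"Ready"},
}


def is_documented_transition(previous: str, current: str) -> bool:
    """Return True if previous -> current is listed above or the state is unchanged."""
    if previous == current:
        return True
    return current in ALLOWED_TRANSITIONS.get(previous, set())


print(is_documented_transition("Ready", "Healthy"))
print(is_documented_transition("Healthy", "Ready"))
```

A poller samples the state at intervals, so it can miss an intermediate state between two polls, and a restarted node reports Initializing again regardless of its earlier state. An unlisted transition seen by a poller is therefore a prompt to look at the logs rather than proof of a fault.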

Using the Status Endpoint

Making a Request

Query the status endpoint with a simple GET request:

$ curl http://localhost:8080/v1/status

Integration with Load Balancers

Configure your load balancer to use the status endpoint for health checks. Different states may require different handling:

Initializing: Consider the node unhealthy/not ready
Ready: Node is healthy and can receive traffic
Healthy: Node is healthy and can receive traffic
Critical: Remove node from rotation or reduce traffic
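
For custom routing or service-discovery code, the handling above reduces to a membership test on the reported state. The sketch below is illustrative only: should_receive_traffic, nodes_in_rotation, and the node names are hypothetical, and it takes the stricter option for Critical (remove from rotation rather than reduce traffic).

```python
# States in which this document treats a node as able to receive traffic.
IN_ROTATION_STATES = frozenset({"Ready", "Healthy"})


def should_receive_traffic(system_health: str) -> bool:
    """Return True for Ready and Healthy, False for Initializing and Critical."""
    return system_health in IN_ROTATION_STATES


def nodes_in_rotation(node_states: dict) -> list:
    """Given {node_name: system_health}, return the sorted names that may take traffic."""
    return sorted(
        name for name, health in node_states.items()
        if should_receive_traffic(health)
    )


# Hypothetical node names, for illustration only.
observed = {
    "api-0": "Healthy",
    "api-1": "Initializing",
    "api-2": "Critical",
    "api-3": "Ready",
}
print(nodes_in_rotation(observed))
```

Any state value outside the four documented ones is treated as not ready, which is the conservative default.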

Example: AWS Application Load Balancer

Health Check Configuration:
  Protocol: HTTP
  Path: /v1/status
  Healthy threshold: 2
  Unhealthy threshold: 2
  Timeout: 5 seconds
  Interval: 30 seconds
  Success codes: 200

Integration with Kubernetes

Use the status endpoint for liveness and readiness probes:

apiVersion: v1
kind: Pod
metadata:
  name: deepgram-api
spec:
  containers:
  - name: api
    image: quay.io/deepgram/self-hosted-api:release-251029
    livenessProbe:
      httpGet:
        path: /v1/status
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /v1/status
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 3

Monitoring and Alerting

The status endpoint is valuable for monitoring dashboards and alerting systems:

Python Monitoring Script

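import logging
import time

import requests

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("deepgram-status")


def log(message):
    """Record a routine status message."""
    logger.info(message)


def alert(message):
    """Placeholder alert hook: replace with your paging or notification system."""
    logger.error(message)


def check_node_status(url):
    """Query the status endpoint once, then log or alert based on the result."""
    try:
        response = requests.get(f"{url}/v1/status", timeout=5)
        data = response.json()
        status = data['system_health']
        batch_requests = data['active_batch_requests']
        stream_requests = data['active_stream_requests']
    except (requests.RequestException, KeyError, ValueError) as e:
        # Network failures, invalid JSON, or missing fields all count as a failed check.
        alert(f"Failed to check status for {url}: {e}")
        return None

    if status == 'Critical':
        alert(f"Node {url} is in Critical state!")
    elif status == 'Initializing':
        log(f"Node {url} is still initializing...")
    else:
        log(f"Node {url} is {status} - "
            f"Batch: {batch_requests}, Stream: {stream_requests}")

    return status


# Check every 30 seconds
while True:
    check_node_status("http://localhost:8080")
    time.sleep(30)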

Best Practices

Startup Handling

During node deployment or restart:

Wait for the Initializing state to transition to Ready before sending production traffic
Allow adequate time for initialization (typically 30-60 seconds)
Configure health checks with appropriate initial delays
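
In a deployment script, the startup guidance above amounts to polling until the node reports Ready or Healthy, with an upper bound on the wait. The following is a minimal sketch under assumptions: wait_until_ready is a hypothetical helper, the 120-second timeout and 2-second interval are illustrative defaults, and get_health stands for any callable that returns the current system_health string (for example, one that wraps a GET to /v1/status).

```python
import time
from typing import Callable


def wait_until_ready(
    get_health: Callable[[], str],
    timeout_s: float = 120.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the node reports Ready or Healthy; return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        try:
            health = get_health()
        except OSError:
            # The endpoint may refuse connections while the container starts.
            # urllib and requests network errors are OSError subclasses.
            health = None
        if health in ("Ready", "Healthy"):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated startup sequence, so the sketch runs without a live node.
simulated = iter(["Initializing", "Initializing", "Ready"])
print(wait_until_ready(lambda: next(simulated), sleep=lambda _: None))
```

The sleep and clock parameters exist only so the loop can be exercised without real delays; production callers can leave them at their defaults.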

Error Recovery

When a node enters Critical state:

Check node logs for specific error messages
Verify Engine connectivity and resource availability
Monitor for automatic recovery to Ready state
Consider restarting the node if it remains in Critical state
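
One way to decide when a node has remained in Critical long enough to justify a restart is to track how long consecutive polls have reported that state. This is a sketch only: CriticalTracker is a hypothetical helper, and the 300-second default is an arbitrary illustration, not a value recommended by Deepgram.

```python
from typing import Optional


class CriticalTracker:
    """Tracks how long a node has continuously reported Critical."""

    def __init__(self, restart_after_s: float = 300.0):
        self.restart_after_s = restart_after_s
        self._critical_since: Optional[float] = None

    def observe(self, system_health: str, now: float) -> bool:
        """Record one poll result; return True once a restart looks warranted."""
        if system_health != "Critical":
            # Any other state counts as recovery and resets the timer.
            self._critical_since = None
            return False
        if self._critical_since is None:
            self._critical_since = now
        return now - self._critical_since >= self.restart_after_s


tracker = CriticalTracker(restart_after_s=60)
for timestamp, health in [(0, "Critical"), (30, "Critical"), (60, "Critical")]:
    print(timestamp, health, tracker.observe(health, now=timestamp))
```

Passing the timestamp in explicitly keeps the class independent of any particular clock; a poller would typically pass time.monotonic().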

High Availability

For production deployments:

Deploy multiple API nodes for redundancy
Configure load balancers to remove Critical nodes from rotation
Set up automated alerts for Critical state transitions
Monitor the proportion of nodes in each state across your deployment
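
The proportion of nodes in each state can be computed from one round of status polls. The sketch below assumes the system_health values have already been collected into a list; state_proportions is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Iterable


def state_proportions(states: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of nodes in each reported state."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: count / total for state, count in counts.items()}


# One polling round across a hypothetical four-node deployment.
print(state_proportions(["Healthy", "Healthy", "Ready", "Critical"]))
```

An alert rule could then fire when the Critical fraction crosses a threshold chosen for the deployment, rather than on every single-node transition.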

Monitoring Active Requests

Use the active_batch_requests and active_stream_requests fields to:

Track node utilization and load distribution
Identify nodes that may be overloaded
Plan capacity based on request patterns
Implement graceful shutdowns by waiting for active requests to complete
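
A graceful shutdown based on these counters might look like the following sketch. It assumes the node has already been removed from load balancer rotation so that no new requests arrive; wait_for_drain, the 60-second timeout, and the 1-second interval are hypothetical, and get_status stands for any callable that returns the parsed status response as a dict.

```python
import time
from typing import Callable, Dict


def wait_for_drain(
    get_status: Callable[[], Dict],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once both active request counters reach zero, False on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        active = status["active_batch_requests"] + status["active_stream_requests"]
        if active == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated responses, so the sketch runs without a live node.
responses = iter([
    {"system_health": "Healthy", "active_batch_requests": 2, "active_stream_requests": 1},
    {"system_health": "Healthy", "active_batch_requests": 0, "active_stream_requests": 1},
    {"system_health": "Healthy", "active_batch_requests": 0, "active_stream_requests": 0},
])
print(wait_for_drain(lambda: next(responses), sleep=lambda _: None))
```

When the function returns False, the caller decides whether to extend the wait or stop the node with requests still in flight; long-lived streaming connections may need a longer timeout than batch requests.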

Troubleshooting

Node Stuck in Initializing

If a node remains in Initializing state for an extended period:

Verify Engine containers are running and accessible
Check network connectivity between API and Engine nodes
Review API and Engine logs for initialization errors
Ensure proper configuration in api.toml and engine.toml

Frequent Critical State Transitions

If nodes frequently transition to Critical:

Review Engine resource allocation (GPU/CPU/memory)
Check for model loading issues or corrupted model files
Verify license validity and connectivity to license servers
Monitor for request patterns that may cause failures

Status Endpoint Not Responding

If the status endpoint is unreachable:

Verify the API container is running: docker ps
Check API logs: docker logs <container_id>
Ensure port 8080 is accessible and not blocked by firewall rules
Verify the API container has started successfully

What’s Next

Now that you understand how to monitor node health with the status endpoint, explore related topics:

Metrics Guide - Detailed metrics and monitoring
System Maintenance - Keeping your deployment healthy
Prometheus Integration - Advanced monitoring setup

