Entity Detection
`detect_entities` *boolean*. Default: `false`
When Entity Detection is enabled, the Punctuation feature will be enabled by default.
Enable Feature
To enable Entity Detection, add a `detect_entities` parameter set to `true` in the query string when you call Deepgram's API:
`detect_entities=true&punctuate=true`
To transcribe audio from a file on your computer, run the following curl command in a terminal or your favorite API client.
```shell
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?detect_entities=true&punctuate=true'
```
Replace `YOUR_DEEPGRAM_API_KEY` with your Deepgram API Key.
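The same request can also be sketched in Python using only the standard library. The `transcribe` helper below is illustrative, not part of any Deepgram SDK; the file path and API key are placeholders you would supply yourself.

```python
import json
import urllib.parse
import urllib.request

# Query parameters from the docs: enable Entity Detection (Punctuation is
# enabled automatically, but we pass it explicitly to mirror the curl example).
PARAMS = {"detect_entities": "true", "punctuate": "true"}
URL = "https://api.deepgram.com/v1/listen?" + urllib.parse.urlencode(PARAMS)

def transcribe(path: str, api_key: str) -> dict:
    """POST a local WAV file to Deepgram and return the parsed JSON response."""
    with open(path, "rb") as f:
        audio = f.read()
    req = urllib.request.Request(
        URL,
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Calling `transcribe("youraudio.wav", "YOUR_DEEPGRAM_API_KEY")` returns the JSON response described below as a Python dictionary.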
Analyze Response
When the file is finished processing (often after only a few seconds), you’ll receive a JSON response that has the following basic structure:
1 { 2 "metadata": { 3 "transaction_key": "string", 4 "request_id": "string", 5 "sha256": "string", 6 "created": "string", 7 "duration": 0, 8 "channels": 0 9 }, 10 "results": { 11 "channels": [ 12 { 13 "alternatives":[], 14 } 15 ] 16 }
Let's look more closely at the `alternatives` object:
1 "alternatives":[ 2 { 3 "transcript":"Welcome to the Ai show. I'm Scott Stephenson, cofounder of Deepgram...", 4 "confidence":0.9816771, 5 "words": [...], 6 "entities":[ 7 { 8 "label":"NAME", 9 "value":" Scott Stephenson", 10 "confidence":0.9999924, 11 "start_word":6, 12 "end_word":8 13 }, 14 { 15 "label":"ORG", 16 "value":" Deepgram", 17 "confidence":0.9999757, 18 "start_word":10, 19 "end_word":11 20 }, 21 { 22 "label": "CARDINAL_NUM", 23 "value": "one", 24 "confidence": 1, 25 "start_word": 186, 26 "end_word": 187 27 }, 28 ... 29 ] 30 } 31 ]
In this response, we see that each alternative contains:
- `transcript`: Transcript for the audio being processed.
- `confidence`: Floating-point value between 0 and 1 that indicates overall transcript reliability. Larger values indicate higher confidence.
- `words`: Object containing each word in the transcript, along with its start time and end time (in seconds) from the beginning of the audio stream, and a confidence value.
- `entities`: Object containing information about entities for the audio being processed.
And we see that each `entities` object contains:
- `label`: Type of entity identified.
- `value`: Text of the entity identified.
- `confidence`: Floating-point value between 0 and 1 that indicates entity reliability. Larger values indicate higher confidence.
- `start_word`: Location of the first character of the first word in the section of audio being inspected for entities.
- `end_word`: Location of the first character of the last word in the section of audio being inspected for entities.
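The fields above can be consumed with a few lines of Python. The `entities_by_label` helper below is an illustrative sketch, not a Deepgram API; it groups detected entities by their label and drops low-confidence hits, using a hypothetical sample shaped like the response above.

```python
# Hypothetical sample mirroring the response above; field names match the docs.
alternative = {
    "transcript": "Welcome to the Ai show. I'm Scott Stephenson, cofounder of Deepgram...",
    "confidence": 0.9816771,
    "entities": [
        {"label": "NAME", "value": " Scott Stephenson", "confidence": 0.9999924,
         "start_word": 6, "end_word": 8},
        {"label": "ORG", "value": " Deepgram", "confidence": 0.9999757,
         "start_word": 10, "end_word": 11},
    ],
}

def entities_by_label(alternative: dict, min_confidence: float = 0.9) -> dict:
    """Group detected entities by label, keeping only confident detections."""
    grouped = {}
    for ent in alternative["entities"]:
        if ent["confidence"] >= min_confidence:
            # Entity values may carry a leading space, so strip them.
            grouped.setdefault(ent["label"], []).append(ent["value"].strip())
    return grouped

print(entities_by_label(alternative))
# {'NAME': ['Scott Stephenson'], 'ORG': ['Deepgram']}
```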
All entities are available in English.
Identifiable Entities
View all options here: Supported Entity Types
Use Cases
Some examples of uses for Entity Detection include:
- Customers who want to improve Conversational AI and Voice Assistant experiences by triggering particular workflows and responses based on identified names, addresses, locations, and other key entities.
- Customers who want to enhance customer service and user experience by extracting meaningful and relevant information about key entities such as a person, organization, email, or phone number.
- Customers who want to derive meaningful and actionable insights from audio data based on the entities identified in conversations.