Entity Detection
`detect_entities` *boolean*. Default: `false`
When Entity Detection is enabled, the Punctuation feature will be enabled by default.
Enable Feature
To enable Entity Detection, add a `detect_entities` parameter set to `true` in the query string when you call Deepgram's API:
`detect_entities=true&punctuate=true`
To transcribe audio from a file on your computer, run the following curl command in a terminal or your favorite API client.
```shell
curl \
  --request POST \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'Content-Type: audio/wav' \
  --data-binary @youraudio.wav \
  --url 'https://api.deepgram.com/v1/listen?detect_entities=true&punctuate=true'
```
Replace `YOUR_DEEPGRAM_API_KEY` with your Deepgram API Key.
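The same request can also be sketched in Python using only the standard library. The `transcribe` helper below is illustrative, not part of any Deepgram SDK; the file path and API key are placeholders you would supply yourself.

```python
import json
import urllib.parse
import urllib.request

# Query parameters from the docs: enable Entity Detection (Punctuation is
# enabled automatically, but we pass it explicitly to mirror the curl example).
PARAMS = {"detect_entities": "true", "punctuate": "true"}
URL = "https://api.deepgram.com/v1/listen?" + urllib.parse.urlencode(PARAMS)

def transcribe(path: str, api_key: str) -> dict:
    """POST a local WAV file to Deepgram and return the parsed JSON response."""
    with open(path, "rb") as f:
        audio = f.read()
    req = urllib.request.Request(
        URL,
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Calling `transcribe("youraudio.wav", "YOUR_DEEPGRAM_API_KEY")` returns the JSON response described below as a Python dictionary.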
Analyze Response
When the file is finished processing (often after only a few seconds), you’ll receive a JSON response that has the following basic structure:
1 { 2 "metadata": { 3 "transaction_key": "string", 4 "request_id": "string", 5 "sha256": "string", 6 "created": "string", 7 "duration": 0, 8 "channels": 0 9 }, 10 "results": { 11 "channels": [ 12 { 13 "alternatives":[], 14 } 15 ] 16 }
Let's look more closely at the `alternatives` object:
1 "alternatives":[ 2 { 3 "transcript":"Welcome to the Ai show. I'm Scott Stephenson, cofounder of Deepgram...", 4 "confidence":0.9816771, 5 "words": [...], 6 "entities":[ 7 { 8 "label":"NAME", 9 "value":" Scott Stephenson", 10 "confidence":0.9999924, 11 "start_word":6, 12 "end_word":8 13 }, 14 { 15 "label":"ORG", 16 "value":" Deepgram", 17 "confidence":0.9999757, 18 "start_word":10, 19 "end_word":11 20 }, 21 { 22 "label": "CARDINAL_NUM", 23 "value": "one", 24 "confidence": 1, 25 "start_word": 186, 26 "end_word": 187 27 }, 28 ... 29 ] 30 } 31 ]
In this response, we see that each alternative contains:
- `transcript`: Transcript for the audio being processed.
- `confidence`: Floating-point value between 0 and 1 that indicates overall transcript reliability. Larger values indicate higher confidence.
- `words`: Object containing each word in the transcript, along with its start time and end time (in seconds) from the beginning of the audio stream, and a confidence value.
- `entities`: Object containing information about entities for the audio being processed.
And we see that each `entities` object contains:
- `label`: Type of entity identified.
- `value`: Text of the entity identified.
- `confidence`: Floating-point value between 0 and 1 that indicates entity reliability. Larger values indicate higher confidence.
- `start_word`: Location of the first character of the first word in the section of audio being inspected for entities.
- `end_word`: Location of the first character of the last word in the section of audio being inspected for entities.
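The fields above can be consumed with a few lines of Python. The `entities_by_label` helper below is an illustrative sketch, not a Deepgram API; it groups detected entities by their label and drops low-confidence hits, using a hypothetical sample shaped like the response above.

```python
# Hypothetical sample mirroring the response above; field names match the docs.
alternative = {
    "transcript": "Welcome to the Ai show. I'm Scott Stephenson, cofounder of Deepgram...",
    "confidence": 0.9816771,
    "entities": [
        {"label": "NAME", "value": " Scott Stephenson", "confidence": 0.9999924,
         "start_word": 6, "end_word": 8},
        {"label": "ORG", "value": " Deepgram", "confidence": 0.9999757,
         "start_word": 10, "end_word": 11},
    ],
}

def entities_by_label(alternative: dict, min_confidence: float = 0.9) -> dict:
    """Group detected entities by label, keeping only confident detections."""
    grouped = {}
    for ent in alternative["entities"]:
        if ent["confidence"] >= min_confidence:
            # Entity values may carry a leading space, so strip them.
            grouped.setdefault(ent["label"], []).append(ent["value"].strip())
    return grouped

print(entities_by_label(alternative))
# {'NAME': ['Scott Stephenson'], 'ORG': ['Deepgram']}
```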
All entities are available in English.
Identifiable Entities
View all options here: Supported Entity Types
Use Cases
Some examples of uses for Entity Detection include:
- Customers who want to improve Conversational AI and Voice Assistant experiences by triggering particular workflows and responses based on identified names, addresses, locations, and other key entities.
- Customers who want to enhance customer service and user experience by extracting meaningful and relevant information about key entities such as a person, organization, email, or phone number.
- Customers who want to derive meaningful and actionable insights from audio data based on the entities identified in conversations.