AssemblyAI Audio Transcripts
The AssemblyAIAudioTranscriptLoader transcribes audio files with the AssemblyAI API and loads the transcribed text into documents.
To use it, you need the assemblyai Python package installed and the environment variable ASSEMBLYAI_API_KEY set to your API key. Alternatively, the API key can be passed as an argument.
More info about AssemblyAI:
Installation
First, you need to install the assemblyai Python package.
You can find more info about it inside the assemblyai-python-sdk GitHub repo.
%pip install --upgrade --quiet assemblyai
Example
The AssemblyAIAudioTranscriptLoader needs at least the file_path argument. Audio files can be specified as a URL or a local file path.
from langchain_community.document_loaders import AssemblyAIAudioTranscriptLoader

audio_file = "https://storage.googleapis.com/aai-docs-samples/nbc.mp3"
# or a local file path: audio_file = "./nbc.mp3"
loader = AssemblyAIAudioTranscriptLoader(file_path=audio_file)
docs = loader.load()
Note: Calling loader.load() blocks until the transcription is finished.
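Because loader.load() blocks, an asyncio application may prefer to run it in a worker thread. The following is a minimal sketch under that assumption; load_without_blocking is a hypothetical helper name, not part of LangChain or the assemblyai package.

```python
import asyncio


async def load_without_blocking(loader):
    """Run a blocking loader.load() call in a worker thread.

    `loader` is any object with a blocking load() method, such as an
    AssemblyAIAudioTranscriptLoader. Hypothetical helper for illustration.
    """
    # asyncio.to_thread (Python 3.9+) keeps the event loop free
    # while load() waits for the transcription to finish.
    return await asyncio.to_thread(loader.load)
```

From synchronous code, this could be driven with docs = asyncio.run(load_without_blocking(loader)).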
The transcribed text is available in the page_content:
docs[0].page_content
"Load time, a new president and new congressional makeup. Same old ..."
The metadata contains the full JSON response with more meta information:
docs[0].metadata
{'language_code': <LanguageCode.en_us: 'en_us'>,
'audio_url': 'https://storage.googleapis.com/aai-docs-samples/nbc.mp3',
'punctuate': True,
'format_text': True,
...
}
Transcript Formats
You can specify the transcript_format argument for different formats.
Depending on the format, one or more documents are returned. These are the different TranscriptFormat options:
- TEXT: One document with the transcription text
- SENTENCES: Multiple documents, splits the transcription by each sentence
- PARAGRAPHS: Multiple documents, splits the transcription by each paragraph
- SUBTITLES_SRT: One document with the transcript exported in SRT subtitles format
- SUBTITLES_VTT: One document with the transcript exported in VTT subtitles format
from langchain_community.document_loaders.assemblyai import TranscriptFormat

loader = AssemblyAIAudioTranscriptLoader(
    file_path="./your_file.mp3",
    transcript_format=TranscriptFormat.SENTENCES,
)
docs = loader.load()
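For the subtitle formats, which return a single document, the text can be written straight to a file. The sketch below assumes only that each document exposes page_content; save_subtitles is a hypothetical helper, not a LangChain function.

```python
from pathlib import Path


def save_subtitles(docs, path):
    """Write the single subtitle document to `path` and return the Path.

    Hypothetical helper: expects the one-document list returned for
    the SUBTITLES_SRT or SUBTITLES_VTT transcript formats.
    """
    if len(docs) != 1:
        raise ValueError(f"expected exactly one document, got {len(docs)}")
    target = Path(path)
    # page_content already holds the subtitle text in the requested format.
    target.write_text(docs[0].page_content, encoding="utf-8")
    return target
```

With a loader created using transcript_format=TranscriptFormat.SUBTITLES_SRT, a call such as save_subtitles(loader.load(), "./your_file.srt") would store the subtitles next to the audio file.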
Transcription Config
You can also specify the config argument to use different audio intelligence models.
Visit the AssemblyAI API Documentation to get an overview of all available models!
import assemblyai as aai

config = aai.TranscriptionConfig(
    speaker_labels=True, auto_chapters=True, entity_detection=True
)
loader = AssemblyAIAudioTranscriptLoader(file_path="./your_file.mp3", config=config)
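Since the metadata holds the JSON response, results from the enabled models can presumably be read from it after loading. The sketch below is an assumption rather than documented behavior: it expects a "chapters" list whose items carry "start" and "headline" keys, so check the key names against your own output. chapter_headlines is a hypothetical helper.

```python
def chapter_headlines(metadata):
    """Return (start, headline) pairs from a transcript's metadata dict.

    Assumption: metadata has a "chapters" list whose items carry
    "start" and "headline" keys when auto_chapters is enabled.
    A missing or empty "chapters" entry yields an empty list.
    """
    chapters = metadata.get("chapters") or []
    return [(c.get("start"), c.get("headline")) for c in chapters]
```

With the default TEXT format, this would be called as chapter_headlines(docs[0].metadata) once loader.load() has returned.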
Pass the API Key as argument
Besides setting the API key in the ASSEMBLYAI_API_KEY environment variable, you can also pass it as an argument.
loader = AssemblyAIAudioTranscriptLoader(
    file_path="./your_file.mp3", api_key="YOUR_KEY"
)
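To support both options in one place and fail early with a clear message, the key can be resolved before constructing the loader. This is a minimal sketch; resolve_api_key is a hypothetical helper, not part of LangChain or the assemblyai package.

```python
import os


def resolve_api_key(explicit_key=None, env_var="ASSEMBLYAI_API_KEY"):
    """Return the explicit key if given, else the environment variable.

    Hypothetical helper: raises RuntimeError when neither source
    provides a key, so the problem surfaces before any upload starts.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"no API key passed and {env_var} is not set")
    return key
```

The result could then be passed as api_key=resolve_api_key() when creating the loader.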
Related
- Document loader conceptual guide
- Document loader how-to guides