
Subtitle

The SubRip file format is described on the Matroska multimedia container format website as "perhaps the most basic of all subtitle formats." SubRip (SubRip Text) files are named with the extension .srt, and contain formatted lines of plain text in groups separated by a blank line. Subtitles are numbered sequentially, starting at 1. The timecode format used is hours:minutes:seconds,milliseconds with time units fixed to two zero-padded digits and fractions fixed to three zero-padded digits (00:00:00,000). The fractional separator used is the comma, since the program was written in France.
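To make the layout described above concrete, here is a minimal sketch of a parser for it that uses only the Python standard library. The names `parse_timecode`, `parse_srt`, and `SAMPLE` are hypothetical and exist only for this illustration. The sketch also assumes that each block's second line holds the start and end timecodes separated by ` --> `, a detail of the format that the paragraph above does not spell out.

```python
import re
from datetime import timedelta

# hours:minutes:seconds,milliseconds with zero-padded fields, e.g. 00:00:00,000
TIMECODE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def parse_timecode(value: str) -> timedelta:
    """Convert one SRT timecode string into a timedelta."""
    match = TIMECODE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not an SRT timecode: {value!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return timedelta(
        hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis
    )


def parse_srt(text: str) -> list:
    """Split SRT text into blank-line-separated blocks and parse each one."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip malformed blocks in this sketch
        # Assumption: the timing line looks like "start --> end".
        start, _, end = lines[1].partition(" --> ")
        entries.append(
            {
                "index": int(lines[0]),
                "start": parse_timecode(start),
                "end": parse_timecode(end),
                "text": "\n".join(lines[2:]),
            }
        )
    return entries


# Made-up sample data, not taken from the example file.
SAMPLE = """1
00:00:01,000 --> 00:00:04,500
<i>Hello there.</i>

2
00:00:05,250 --> 00:00:07,000
Second line
of dialogue.
"""

entries = parse_srt(SAMPLE)
```

This is only meant to show the structure of the format. Real files may contain variations that this sketch does not handle.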

How to load data from subtitle (.srt) files

Please download the example .srt file from here.

%pip install --upgrade --quiet pysrt
from langchain_community.document_loaders import SRTLoader
loader = SRTLoader(
    "example_data/Star_Wars_The_Clone_Wars_S06E07_Crisis_at_the_Heart.srt"
)
docs = loader.load()
docs[0].page_content[:100]
'<i>Corruption discovered\nat the core of the Banking Clan!</i> <i>Reunited, Rush Clovis\nand Senator A'
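The output above shows that the loaded text still contains subtitle markup such as `<i>` tags and the line breaks inside each subtitle. If plain text is preferred downstream, one possible post-processing step is sketched below. The helper name `clean_subtitle_text` is hypothetical and not part of LangChain or pysrt, and the regular expression only assumes simple HTML-like tags.

```python
import re

# Matches simple opening or closing tags such as <i>, </i>, or <font color="...">.
TAG = re.compile(r"</?[a-zA-Z][^>]*>")


def clean_subtitle_text(text: str) -> str:
    """Strip simple markup tags and collapse all whitespace to single spaces."""
    without_tags = TAG.sub("", text)
    return " ".join(without_tags.split())


raw = (
    "<i>Corruption discovered\nat the core of the Banking Clan!</i> "
    "<i>Reunited, Rush Clovis\nand Senator A"
)
cleaned = clean_subtitle_text(raw)
```

The same function could be applied to `docs[0].page_content` after loading. Whether dropping the markup is appropriate depends on the use case.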
