Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
baxtree/subaligner
Subtitle: SubRip, TTML, WebVTT, (Advanced) SubStation Alpha, MicroDVD, MPL2, TMP, EBU STL, SAMI, SCC and SBV.
Video/Audio: MP4, WebM, Ogg, 3GP, FLV, MOV, Matroska, MPEG TS, WAV, MP3, AAC, FLAC, etc.
ℹ️ Subaligner relies on file extensions as default hints to process a wide range of audiovisual or subtitle formats. It is recommended to use extensions widely accepted by the community to ensure compatibility.
Required by basic: FFmpeg
$ apt-get install ffmpeg
or
$ brew install ffmpeg
$ pip install -U pip && pip install -U setuptools wheel
$ pip install subaligner
or install from source:
$ git clone git@github.com:baxtree/subaligner.git && cd subaligner
$ pip install -U pip && pip install -U setuptools
$ python setup.py install
ℹ️ It is highly recommended to create a virtual environment prior to installation.
# Install dependencies for enabling translation and transcription
$ pip install 'subaligner[llm]'
# Install dependencies for enabling forced alignment
$ pip install 'setuptools<65.0.0'
$ pip install 'subaligner[stretch]'
# Install dependencies for setting up the development environment
$ pip install 'setuptools<65.0.0'
$ pip install 'subaligner[dev]'
Note that both subaligner[stretch] and subaligner[dev] require additional dependencies to be pre-installed:
$ apt-get install espeak libespeak1 libespeak-dev espeak-data
or
$ brew install espeak
To install all supported features:
$ pip install 'setuptools<65.0.0'
$ pip install 'subaligner[harmony]'
If you prefer using a containerised environment over installing everything locally, run:
$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner bash
Windows users can install Subaligner through Windows Subsystem for Linux (WSL). Alternatively, you can use Docker Desktop to pull and run the image. Assuming your media assets are stored under d:\media, open the built-in command prompt, PowerShell, or Windows Terminal and run:
docker pull baxtree/subaligner
docker run -v "/d/media":/media -w "/media" -it baxtree/subaligner bash
# Single-stage alignment (high-level shift with lower latency)
$ subaligner -m single -v video.mp4 -s subtitle.srt
$ subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
# Dual-stage alignment (low-level shift with higher latency)
$ subaligner -m dual -v video.mp4 -s subtitle.srt
$ subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
# Generate subtitles by transcribing audiovisual files
$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf small -o subtitle_aligned.srt
$ subaligner -m transcribe -v video.mp4 -ml zho -mr whisper -mf medium -o subtitle_aligned.srt
$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" -o subtitle_aligned.srt
$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" --word_time_codes -o raw_subtitle.json
$ subaligner -m transcribe -v video.mp4 -s subtitle.srt -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
$ subaligner -m transcribe -v video.mp4 -s subtitle.srt --use_prior_prompting -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
# Alignment on segmented plain texts (double newlines as the delimiter)
$ subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt
$ subaligner -m script -v video.mp4 -s subtitle.txt --word_time_codes -o raw_subtitle.json
$ subaligner -m script -v https://example.com/video.mp4 -s https://example.com/subtitle.txt -o subtitle_aligned.srt
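The script mode above expects plain text whose segments are separated by double newlines. The following is a minimal sketch of what that segmentation could look like. It is not Subaligner's own parser: the function name split_script and the tolerance for whitespace-only lines between segments are assumptions made for illustration.

```python
import re
from typing import List


def split_script(text: str) -> List[str]:
    """Split a plain-text script into segments delimited by blank lines.

    Hypothetical helper for illustration; Subaligner's real parsing may differ.
    """
    # Normalise Windows and old Mac line endings so "\n" is the only newline.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    # Two newlines, optionally with whitespace in between, end a segment.
    chunks = re.split(r"\n\s*\n", normalised)
    # Drop empty chunks caused by leading or trailing blank lines.
    return [chunk.strip() for chunk in chunks if chunk.strip()]


script = "Hello there.\nHow are you?\n\nFine, thanks.\n\n\nGoodbye."
segments = split_script(script)
print(segments)
```

A quick check like this before running the alignment can confirm that a script file yields the number of segments you expect.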
# Alignment on multiple subtitles against the single media file
$ subaligner -m script -v video.mp4 -s subtitle_lang_1.txt -s subtitle_lang_2.txt
$ subaligner -m script -v video.mp4 -s subtitle_lang_1.txt subtitle_lang_2.txt
# Alignment on embedded subtitles
$ subaligner -m single -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
$ subaligner -m dual -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
# Translative alignment with the ISO 639-3 language code pair (src,tgt)
$ subaligner --languages
$ subaligner -m single -v video.mp4 -s subtitle.srt -t src,tgt
$ subaligner -m dual -v video.mp4 -s subtitle.srt -t src,tgt
$ subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt -t src,tgt
$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-mbart -tf large -o subtitle_aligned.srt -t src,tgt
$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-m2m100 -tf small -o subtitle_aligned.srt -t src,tgt
$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr whisper -tf small -o subtitle_aligned.srt -t src,eng
# Transcribe audiovisual files and generate translated subtitles
$ subaligner -m transcribe -v video.mp4 -ml src -mr whisper -mf small -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
# Shift subtitle manually by offset in seconds
$ subaligner -m shift --subtitle_path subtitle.srt -os 5.5
$ subaligner -m shift --subtitle_path subtitle.srt -os -5.5 -o subtitle_shifted.srt
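To make the effect of a manual offset concrete, here is a small sketch of shifting SubRip timestamps by a number of seconds. It is an illustration only, not the code behind the shift mode: the helper names shift_timestamp and shift_srt are hypothetical, and clamping negative results to zero is an assumption of this sketch.

```python
import re

# SubRip timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_seconds):
    """Return one timestamp moved by offset_seconds (hypothetical helper)."""
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    # Assumption of this sketch: times never go below zero.
    total_ms = max(0, total_ms + round(offset_seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_srt(srt_text, offset_seconds):
    """Shift every cue timing line in SubRip text; cue text is left untouched."""
    shifted_lines = []
    for line in srt_text.split("\n"):
        if "-->" in line:  # only timing lines contain the arrow
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_seconds), line)
        shifted_lines.append(line)
    return "\n".join(shifted_lines)


cue = "1\n00:00:01,000 --> 00:00:03,500\nHello\n"
print(shift_srt(cue, 5.5))
```

A positive offset delays the cues and a negative offset brings them forward, mirroring the sign convention of the -os values shown above.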
# Run batch alignment against directories
$ subaligner_batch -m single -vd videos/ -sd subtitles/ -od aligned_subtitles/
$ subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/
$ subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/ -of ttml
# Run alignments with pipx
$ pipx run subaligner -m single -v video.mp4 -s subtitle.srt
$ pipx run subaligner -m dual -v video.mp4 -s subtitle.srt
# Run the module as a script
$ python -m subaligner -m single -v video.mp4 -s subtitle.srt
$ python -m subaligner -m dual -v video.mp4 -s subtitle.srt
# Run alignments with the docker image
$ docker pull baxtree/subaligner
$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m single -v video.mp4 -s subtitle.srt
$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m dual -v video.mp4 -s subtitle.srt
$ docker run -it baxtree/subaligner subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
$ docker run -it baxtree/subaligner subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
The aligned subtitle will be saved at subtitle_aligned.srt. To obtain the subtitle in raw JSON format for downstream processing, replace the output file extension with .json. For details on the CLIs, run subaligner -h or subaligner_batch -h; for additional utilities, run subaligner_convert -h, subaligner_train -h and subaligner_tune -h. subaligner_1pass and subaligner_2pass are shortcuts for running subaligner with the -m single and -m dual options, respectively.
You can train a new model with your own audiovisual files and subtitle files:
$ subaligner_train -vd VIDEO_DIRECTORY -sd SUBTITLE_DIRECTORY -tod TRAINING_OUTPUT_DIRECTORY
You can then apply the trained model to your subtitle synchronisation with the commands above. For more details on how to train and tune your own model, please refer to the Subaligner Docs.
For larger media files taking longer to process, you can reconfigure various timeouts using the following options:
-mpt [Maximum waiting time in seconds when processing media files]
-sat [Maximum waiting time in seconds when aligning each segment]
-fet [Maximum waiting time in seconds when embedding features for training]
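When alignment is driven from a script, the timeout options can be appended to the command programmatically. The sketch below only builds the argument list. The function build_align_command and the timeout values 600 and 120 are illustrative assumptions, and it assumes -mpt and -sat are accepted by the subaligner command, which subaligner -h can confirm. The -fet option is left out because its description refers to training.

```python
import shlex
from typing import List, Optional


def build_align_command(
    video: str,
    subtitle: str,
    mode: str = "dual",
    media_process_timeout: Optional[int] = None,
    segment_alignment_timeout: Optional[int] = None,
) -> List[str]:
    """Build a subaligner argument list (hypothetical helper, nothing is executed)."""
    command = ["subaligner", "-m", mode, "-v", video, "-s", subtitle]
    if media_process_timeout is not None:
        # -mpt: maximum waiting time in seconds when processing media files
        command += ["-mpt", str(media_process_timeout)]
    if segment_alignment_timeout is not None:
        # -sat: maximum waiting time in seconds when aligning each segment
        command += ["-sat", str(segment_alignment_timeout)]
    return command


command = build_align_command(
    "video.mp4",
    "subtitle.srt",
    media_process_timeout=600,      # illustrative value, not a recommendation
    segment_alignment_timeout=120,  # illustrative value, not a recommendation
)
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.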
Subtitles can be out of sync with their companion audiovisual media files for a variety of causes including latency introduced by Speech-To-Text on live streams or calibration and rectification involving human intervention during post-production.
A model has been trained with synchronised video and subtitle pairs and is later used for predicting shifting offsets and directions under the guidance of a dual-stage aligning approach.
First Stage (Global Alignment):
Second Stage (Parallelised Individual Alignment):
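As a rough picture of how the two stages compose, the toy sketch below applies one global offset to all cues and then refines each cue with its own offset in parallel. It is not Subaligner's model or code: the offsets are supplied by hand where the real tool would predict them, and the names Cue, global_alignment and individual_alignment are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Cue = Tuple[float, float]  # (start, end) in seconds


def shift_cue(cue: Cue, offset: float) -> Cue:
    """Move a single cue by offset seconds."""
    start, end = cue
    return (start + offset, end + offset)


def global_alignment(cues: List[Cue], global_offset: float) -> List[Cue]:
    """Stage one: shift every cue by the same offset."""
    return [shift_cue(cue, global_offset) for cue in cues]


def individual_alignment(cues: List[Cue], local_offsets: List[float]) -> List[Cue]:
    """Stage two: shift each cue by its own offset, processed in parallel."""
    if len(cues) != len(local_offsets):
        raise ValueError("one offset is required per cue")
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so cues stay in sequence.
        return list(pool.map(shift_cue, cues, local_offsets))


cues = [(1.0, 2.5), (4.0, 6.0)]
stage_one = global_alignment(cues, 2.0)                    # hand-picked global offset
stage_two = individual_alignment(stage_one, [0.25, -0.5])  # hand-picked per-cue offsets
print(stage_one)
print(stage_two)
```

The second stage costs more because every cue is handled separately, which fits the higher latency noted for dual-stage alignment in the usage examples.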
This tool wouldn't be possible without the following packages: librosa, tensorflow, scikit-learn, pycaption, pysrt, pysubs2, aeneas, transformers, openai-whisper.
Thanks to Alan Robinson and Nigel Megitt for their invaluable feedback.