Official repository for LTX-Video
We're excited to announce LTX-2 - the next generation of LTX with synchronized audio+video generation!
LTX-2 is the first DiT-based audio-video foundation model that contains all core capabilities of modern video generation in one model. LTX-2 is now the primary home for LTX development and includes significant improvements:
- 🎵 Synchronized Audio+Video Generation - Generate videos with perfectly synchronized audio
- 🎬 Latest Model - LTX-2 with improved quality and capabilities
- 🔌 ComfyUI Integration - Built into ComfyUI core for seamless workflows
- 🎯 Advanced Features:
  - Multiple keyframe support
  - IC-LoRA control models for precise generation
  - Standard LoRA support for style customization
  - Latent upsampler for multiscale pipelines
- 🛠️ Training Tools - LoRA training capabilities
- 📚 Comprehensive Documentation - Full documentation at https://docs.ltx.video
- 🔄 Active Development - Ongoing improvements and community support
- Introduction
- What's New
- Models
- Quick Start Guide
- Model User Guide
- Community Contribution
- Training
- Control Models
- Join Us!
- Acknowledgement
LTX-Video is the first DiT-based video generation model that contains all core capabilities of modern video generation in one model: synchronized audio and video, high fidelity, multiple performance modes, production-ready outputs, API access, and open access. It can generate up to 50 FPS videos at native 4K resolution with synchronized audio in one pass. The model is trained on a large-scale dataset of diverse videos and can generate high-resolution videos with realistic and diverse content.
The model supports image-to-video, multi-keyframe conditioning, keyframe-based animation, video extension (both forward and backward), video-to-video transformations, and any combination of these features.
Today we announced our newest foundation model, LTX-2. LTX-2 represents a major leap forward from our previous model, LTXV 0.9.8. Here’s what’s new:
- Audio + Video, Together: Visuals and sound are generated in one coherent process, with motion, dialogue, ambience, and music flowing simultaneously.
- 4K Fidelity: Professional-grade precision with native 4K and up to 50 fps, sharp textures, clean motion, and synchronized audio.
- Longer Generations: LTX-2 supports longer, continuous clips with synchronized audio up to 10 seconds.
- Low Cost & Efficiency: Up to 50% lower compute cost than competing models, powered by a multi-GPU inference stack.
- Creative Control: Multi-keyframe conditioning, 3D camera logic, and LoRA fine-tuning deliver frame-level precision and style consistency.
For more details, please see our blog post. LTX-2 model weights, code, and benchmarks will be released to the community later in 2025.
- Long shot generation in LTXV-13B!
- LTX-Video now supports up to 60 seconds of video.
- Also compatible with the official IC-LoRAs.
- Try it now in ComfyUI.
- Release new distilled models:
  - 13B distilled model ltxv-13b-0.9.8-distilled
  - 2B distilled model ltxv-2b-0.9.8-distilled
  - Both models are distilled from the same base model ltxv-13b-0.9.8-dev and are compatible for use together in the same multiscale pipeline.
  - Improved prompt understanding and detail generation
  - Includes corresponding FP8 weights and workflows.
- Release a new detailer model LTX-Video-ICLoRA-detailer-13B-0.9.8
  - Available in ComfyUI.
- Released three new control models for LTX-Video on HuggingFace:
  - Depth Control: LTX-Video-ICLoRA-depth-13b-0.9.7
  - Pose Control: LTX-Video-ICLoRA-pose-13b-0.9.7
  - Canny Control: LTX-Video-ICLoRA-canny-13b-0.9.7
- Release a new 13B distilled model ltxv-13b-0.9.7-distilled
  - Amazing for iterative work - generates HD videos in 10 seconds, with a low-res preview after just 3 seconds (on H100)!
  - Does not require classifier-free guidance or spatio-temporal guidance.
  - Supports sampling with 8 (recommended) or fewer diffusion steps.
  - Also released a LoRA version of the distilled model, ltxv-13b-0.9.7-distilled-lora128
    - Requires only 1GB of VRAM
    - Can be used with the full 13B model for fast inference
- Release a new quantized distilled model ltxv-13b-0.9.7-distilled-fp8 for real-time generation (on H100) with even less VRAM
- Release a new 13B model ltxv-13b-0.9.7-dev
- Release a new quantized model ltxv-13b-0.9.7-dev-fp8 for faster inference with less VRAM
- Release new upscalers
- Breakthrough prompt adherence and physical understanding.
- New pipeline for multi-scale video rendering for fast and high-quality results
- Release a new checkpoint ltxv-2b-0.9.6-dev-04-25 with improved quality
- Release a new distilled model ltxv-2b-0.9.6-distilled-04-25
  - 15x faster inference than the non-distilled model.
  - Does not require classifier-free guidance or spatio-temporal guidance.
  - Supports sampling with 8 (recommended) or fewer diffusion steps.
- Improved prompt adherence, motion quality, and fine details.
- New default resolution and FPS: 1216 × 704 pixels at 30 FPS
  - Still real-time on H100 with the distilled model.
  - Other resolutions and FPS are still supported.
- Support stochastic inference (can improve visual quality when using the distilled model)
- New license for commercial use (OpenRail-M)
- Release a new checkpoint v0.9.5 with improved quality
- Support keyframes and video extension
- Support higher resolutions
- Improved prompt understanding
- Improved VAE
- New online web app in LTX-Studio
- Automatic prompt enhancement
- Improve STG (Spatiotemporal Guidance) for LTX-Video
- Support MPS on macOS with PyTorch 2.3.0
- Add support for 8-bit model, LTX-VideoQ8
- Add TeaCache for LTX-Video
- Add ComfyUI-LTXTricks
- Add Diffusion-Pipe
- Release the research paper
- Release a new checkpoint v0.9.1 with improved quality
- Support for STG / PAG
- Support loading checkpoints of LTX-Video in Diffusers format (conversion is done on-the-fly)
- Support offloading unused parts to CPU
- Support the new timestep-conditioned VAE decoder
- Reference contributions from the community in the readme file
- Relax transformers dependency
- Initial release of LTX-Video
- Support text-to-video and image-to-video generation
| Name | Notes | inference.py config | ComfyUI workflow (Recommended) |
|---|---|---|---|
| ltxv-13b-0.9.8-dev | Highest quality, requires more VRAM | ltxv-13b-0.9.8-dev.yaml | ltxv-13b-i2v-base.json |
| ltxv-13b-0.9.8-mix | Mix ltxv-13b-dev and ltxv-13b-distilled in the same multi-scale rendering workflow for balanced speed-quality | N/A | ltxv-13b-i2v-mixed-multiscale.json |
| ltxv-13b-0.9.8-distilled | Faster, less VRAM usage, slight quality reduction compared to 13b. Ideal for rapid iterations | ltxv-13b-0.9.8-distilled.yaml | ltxv-13b-dist-i2v-base.json |
| ltxv-2b-0.9.8-distilled | Smaller model, slight quality reduction compared to 13b distilled. Ideal for fast generation with light VRAM usage | ltxv-2b-0.9.8-distilled.yaml | N/A |
| ltxv-13b-0.9.8-dev-fp8 | Quantized version of ltxv-13b | ltxv-13b-0.9.8-dev-fp8.yaml | ltxv-13b-i2v-base-fp8.json |
| ltxv-13b-0.9.8-distilled-fp8 | Quantized version of ltxv-13b-distilled | ltxv-13b-0.9.8-distilled-fp8.yaml | ltxv-13b-dist-i2v-base-fp8.json |
| ltxv-2b-0.9.8-distilled-fp8 | Quantized version of ltxv-2b-distilled | ltxv-2b-0.9.8-distilled-fp8.yaml | N/A |
| ltxv-2b-0.9.6 | Good quality, lower VRAM requirement than ltxv-13b | ltxv-2b-0.9.6-dev.yaml | ltxvideo-i2v.json |
| ltxv-2b-0.9.6-distilled | 15× faster, real-time capable, fewer steps needed, no STG/CFG required | ltxv-2b-0.9.6-distilled.yaml | ltxvideo-i2v-distilled.json |
The model is accessible right away via the following links:
- LTX-Studio image-to-video (13B-mix)
- LTX-Studio image-to-video (13B distilled)
- Fal.ai image-to-video (13B full)
- Fal.ai image-to-video (13B distilled)
- Replicate image-to-video
The codebase was tested with Python 3.10.5, CUDA 12.2, and supports PyTorch >= 2.1.2. On macOS, MPS was tested with PyTorch 2.3.0 and should work with PyTorch == 2.3 or >= 2.6.
```bash
git clone https://github.com/Lightricks/LTX-Video.git
cd LTX-Video

# create env
python -m venv env
source env/bin/activate
python -m pip install -e .[inference]
```
FP8 kernels developed for LTX-Video provide a performance boost on supported graphics cards (Ada architecture and later). To install the FP8 kernels, follow the instructions in that repository.
📝 Note: For best results, we recommend using our ComfyUI workflow. We're working on updating the inference.py script to match the high quality and output fidelity of ComfyUI.
To use our model, please follow the inference code in inference.py:
```bash
python inference.py --prompt "PROMPT" --conditioning_media_paths IMAGE_PATH --conditioning_start_frames 0 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.8-distilled.yaml
```

📝 Note: Input video segments must contain a multiple of 8 frames plus 1 (e.g., 9, 17, 25, etc.), and the target frame number should be a multiple of 8.
```bash
python inference.py --prompt "PROMPT" --conditioning_media_paths VIDEO_PATH --conditioning_start_frames START_FRAME --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.8-distilled.yaml
```

You can now generate a video conditioned on a set of images and/or short video segments. Simply provide a list of paths to the images or video segments you want to condition on, along with their target frame numbers in the generated video. You can also specify the conditioning strength for each item (default: 1.0).
```bash
python inference.py --prompt "PROMPT" --conditioning_media_paths IMAGE_OR_VIDEO_PATH_1 IMAGE_OR_VIDEO_PATH_2 --conditioning_start_frames TARGET_FRAME_1 TARGET_FRAME_2 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.8-distilled.yaml
```

The same generation can also be run from Python:

```python
from ltx_video.inference import infer, InferenceConfig

infer(
    InferenceConfig(
        pipeline_config="configs/ltxv-13b-0.9.8-distilled.yaml",
        prompt=PROMPT,
        height=HEIGHT,
        width=WIDTH,
        num_frames=NUM_FRAMES,
        output_path="output.mp4",
    )
)
```
To use our model with ComfyUI, please follow the instructions at https://github.com/Lightricks/ComfyUI-LTXVideo/.
To use our model with the Diffusers Python library, check out the official documentation.
Diffusers also supports an 8-bit version of LTX-Video; see details below.
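For reference, here is a minimal text-to-video sketch using the Diffusers LTXPipeline API; the prompt and parameter values are illustrative, so check the official Diffusers documentation for up-to-date usage:

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the LTX-Video checkpoint from the Hugging Face Hub
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Illustrative settings: width/height divisible by 32, num_frames = 8*k + 1
video = pipe(
    prompt="A woman with long brown hair sips coffee at a sunlit kitchen table",
    negative_prompt="worst quality, inconsistent motion, blurry, jittery, distorted",
    width=704,
    height=480,
    num_frames=161,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=24)
```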
When writing prompts, focus on detailed, chronological descriptions of actions and scenes. Include specific movements, appearances, camera angles, and environmental details - all in a single flowing paragraph. Start directly with the action, and keep descriptions literal and precise. Think like a cinematographer describing a shot list. Keep within 200 words. For best results, build your prompts using this structure (an illustrative example follows the list below):
- Start with main action in a single sentence
- Add specific details about movements and gestures
- Describe character/object appearances precisely
- Include background and environment details
- Specify camera angles and movements
- Describe lighting and colors
- Note any changes or sudden events
- See examples for more inspiration.
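For illustration only (this is not one of the official examples), a prompt following the structure above might read:

> A woman with long brown hair lifts a steaming ceramic mug to her lips and takes a slow sip. She wears a loose grey sweater and sits at a wooden kitchen table. Soft morning light falls across her face from a window on the left, and a blurred counter with a kettle and a bowl of oranges fills the background. The camera holds a static medium close-up, then slowly pushes in. Warm golden lighting with muted green and brown tones. She lowers the mug and smiles faintly.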
When using LTXVideoPipeline directly, you can enable prompt enhancement by setting enhance_prompt=True.
- Resolution Preset: Use higher resolutions for detailed scenes, lower resolutions for faster generation and simpler scenes. The model works on resolutions divisible by 32 and frame counts that are a multiple of 8 plus 1 (e.g., 257). If the resolution or number of frames does not satisfy these constraints, the input is padded with -1 and then cropped to the desired resolution and number of frames. The model works best at resolutions under 720 × 1280 and with fewer than 257 frames (see the helper sketch after this list).
- Seed: Save seed values to recreate specific styles or compositions you like
- Guidance Scale: 3-3.5 are the recommended values
- Inference Steps: More steps (40+) for quality, fewer steps (20-30) for speed
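To make the constraints above concrete, here is a small hypothetical helper (not part of the repository) that snaps a requested size to the nearest valid values:

```python
def snap_to_valid_dims(height: int, width: int, num_frames: int) -> tuple[int, int, int]:
    """Round to the nearest dimensions LTX-Video accepts:
    height and width divisible by 32, num_frames equal to a multiple of 8 plus 1."""
    height = round(height / 32) * 32
    width = round(width / 32) * 32
    num_frames = round((num_frames - 1) / 8) * 8 + 1
    return height, width, num_frames


# Example: a 720x1280 request with 120 frames snaps to 704x1280 with 121 frames.
print(snap_to_valid_dims(720, 1280, 120))  # (704, 1280, 121)
```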
📝 For advanced parameter usage, please run python inference.py --help
A community project providing additional nodes for enhanced control over the LTX Video model. It includes implementations of advanced techniques like RF-Inversion, RF-Edit, FlowEdit, and more. These nodes enable workflows such as Image and Video to Video (I+V2V), enhanced sampling via Spatiotemporal Skip Guidance (STG), and interpolation with precise frame settings.
- Repository: ComfyUI-LTXTricks
- Features:
  - 🔄 RF-Inversion: Implements RF-Inversion with an example workflow here.
  - ✂️ RF-Edit: Implements RF-Solver-Edit with an example workflow here.
  - 🌊 FlowEdit: Implements FlowEdit with an example workflow here.
  - 🎥 I+V2V: Enables Video to Video with a reference image. Example workflow.
  - ✨ Enhance: Partial implementation of STGuidance. Example workflow.
  - 🖼️ Interpolation and Frame Setting: Nodes for precise control of latents per frame. Example workflow.
LTX-VideoQ8 is an 8-bit optimized version of LTX-Video, designed for faster performance on NVIDIA ADA GPUs.
- Repository: LTX-VideoQ8
- Features:
  - 🚀 Up to 3X speed-up with no accuracy loss
  - 🎥 Generate 720x480x121 videos in under a minute on RTX 4060 (8GB VRAM)
  - 🛠️ Fine-tune 2B transformer models with precalculated latents
- Community Discussion: Reddit Thread
- Diffusers integration: A diffusers integration for the 8-bit model is already out! Details here
TeaCache is a training-free caching approach that leverages timestep differences across model outputs to accelerate LTX-Video inference by up to 2x without significant visual quality degradation.
- Repository: TeaCache4LTX-Video
- Features:
  - 🚀 Speeds up LTX-Video inference.
  - 📊 Adjustable trade-offs between speed (up to 2x) and visual quality using configurable parameters.
  - 🛠️ No retraining required: works directly with existing models.
Your contribution is welcome! If you have a project or tool that integrates with LTX-Video, please let us know by opening an issue or pull request.
We provide an open-source repository for fine-tuning the LTX-Video model: LTX-Video-Trainer. This repository supports both the 2B and 13B model variants, enabling full fine-tuning as well as LoRA (Low-Rank Adaptation) fine-tuning for more efficient training. This includes:
- Control LoRAs: Train custom control models like depth, pose, and canny control
- Effect LoRAs: Create specialized effects and transformations for video generation
Explore the repository to customize the model for your specific use cases! More information and training instructions can be found in the README.
The ComfyUI-LTXVideo repository now contains workflows and models for three specialized models that enable precise control over LTX-Video generation: Pose Control, Depth Control, and Canny Control.
Example ComfyUI workflow (for all control types): ic-lora.json
Want to work on cutting-edge AI research and make a real impact on millions of users worldwide?
At Lightricks, an AI-first company, we're revolutionizing how visual content is created.
If you are passionate about AI, computer vision, and video generation, we would love to hear from you!
Please visit our careers page for more information.
We are grateful for the following awesome projects when implementing LTX-Video:
- DiT and PixArt-alpha: vision transformers for image generation.
📄 Our tech report is out! If you find our work helpful, please ⭐️ star the repository and cite our paper.
```bibtex
@article{HaCohen2024LTXVideo,
  title={LTX-Video: Realtime Video Latent Diffusion},
  author={HaCohen, Yoav and Chiprut, Nisan and Brazowski, Benny and Shalem, Daniel and Moshe, Dudu and Richardson, Eitan and Levin, Eran and Shiran, Guy and Zabari, Nir and Gordon, Ori and Panet, Poriya and Weissbuch, Sapir and Kulikov, Victor and Bitterman, Yaki and Melumian, Zeev and Bibi, Ofir},
  journal={arXiv preprint arXiv:2501.00103},
  year={2024}
}
```