# Torch-TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Torch-TensorRT brings the power of TensorRT to PyTorch, accelerating inference by up to 5x compared to eager execution with just one line of code.
Stable versions of Torch-TensorRT are published on PyPI:

```bash
pip install torch-tensorrt
```
Nightly versions of Torch-TensorRT are published on the PyTorch package index:

```bash
pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu130
```
Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks.
For more advanced installation methods, please see here.
You can use Torch-TensorRT anywhere you use `torch.compile`:
```python
import torch
import torch_tensorrt

model = MyModel().eval().cuda()  # define your model here
x = torch.randn((1, 3, 224, 224)).cuda()  # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x)  # compiled on first run
optimized_model(x)  # this will be fast!
```
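To check the claimed speedup on your own model, a minimal, library-agnostic timing helper can be used. This is a sketch, not part of the Torch-TensorRT API; the `model`, `optimized_model`, and `x` names in the usage comment refer to the example above, and `torch.cuda.synchronize` is passed in because GPU execution is asynchronous.

```python
import time
from statistics import median

def benchmark(fn, iters=50, warmup=5, sync=None):
    """Return the median latency in seconds of calling fn().

    sync, if given, is called to flush asynchronous work before timing stops
    (e.g. torch.cuda.synchronize for CUDA models).
    """
    for _ in range(warmup):
        fn()
    if sync:
        sync()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync:
            sync()
        times.append(time.perf_counter() - start)
    return median(times)

# Hypothetical usage with the example above (requires a CUDA environment):
# eager = benchmark(lambda: model(x), sync=torch.cuda.synchronize)
# compiled = benchmark(lambda: optimized_model(x), sync=torch.cuda.synchronize)
# print(f"speedup: {eager / compiled:.2f}x")
```

Using the median rather than the mean makes the measurement robust to one-off stalls such as the first-run compilation.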
If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).
```python
import torch
import torch_tensorrt

model = MyModel().eval().cuda()  # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()]  # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)
# PyTorch only supports the Python runtime for an ExportedProgram.
# For C++ deployment, use a TorchScript file.
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)
```
```python
import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()]  # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module()  # this also works
model(*inputs)
```
```cpp
#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...];  // fill this with your inputs
auto results = trt_mod.forward({input_tensor});
```
- Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT
- Up to 50% faster Stable Diffusion inference with one line of code
- Optimize LLMs from Hugging Face with Torch-TensorRT
- Run your model in FP8 with Torch-TensorRT
- Accelerated Inference in PyTorch 2.X with Torch-TensorRT
- Tools to resolve graph breaks and boost performance [coming soon]
- Tech Talk (GTC '23)
- Documentation
| Platform | Support |
|---|---|
| Linux AMD64 / GPU | Supported |
| Linux SBSA / GPU | Supported |
| Windows / GPU | Supported (Dynamo only) |
| Linux Jetson / GPU | Source Compilation Supported on JetPack-4.4+ |
| Linux Jetson / DLA | Source Compilation Supported on JetPack-4.4+ |
| Linux ppc64le / GPU | Not supported |
Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.
The following dependencies were used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.
- Bazel 8.1.1
- Libtorch 2.10.0.dev (latest nightly)
- CUDA 13.0 (CUDA 12.6 on Jetson)
- TensorRT 10.14.1.48 (TensorRT 10.3 on Jetson)
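Since the tests are only verified against the versions above, it can be useful to compare your installed versions against them at runtime. The sketch below is not part of Torch-TensorRT; `parse_version` and `at_least` are hypothetical helpers, and the commented usage assumes the standard `__version__` attributes that `torch` and `tensorrt` expose.

```python
def parse_version(v):
    """Split a dotted version string like '10.14.1.48' into an int tuple,
    stopping at non-numeric suffixes such as 'dev' in '2.10.0.dev'."""
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

def at_least(installed, required):
    """True if the installed version is >= the required version."""
    return parse_version(installed) >= parse_version(required)

# Hypothetical usage in a real environment:
# import torch, tensorrt
# assert at_least(torch.__version__, "2.10.0")
# assert at_least(tensorrt.__version__, "10.14.1.48")
```

Tuple comparison handles mismatched lengths correctly, so "10.14.1.48" compares as newer than "10.3".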
Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:
Deprecation notices are communicated in the Release Notes. Deprecated API functions will have a statement in the source documenting when they were deprecated. Deprecated methods and classes will issue deprecation warnings at runtime, if they are used. Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period. After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
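The runtime-warning behavior described above can be illustrated with the standard `warnings` module. This is a generic sketch of the pattern, not Torch-TensorRT's actual deprecation machinery; the `deprecated` decorator and `old_api` function are hypothetical.

```python
import functools
import warnings

def deprecated(since, removal):
    """Decorator that makes a function emit a DeprecationWarning when called."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is deprecated since {since} "
                f"and will be removed in {removal}",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(since="2.3", removal="a future major release")
def old_api(x):
    return x * 2
```

During the six-month migration period the decorated function keeps working; the warning gives callers time to move off the API before it is removed.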
Take a look at CONTRIBUTING.md.
The Torch-TensorRT license can be found in the LICENSE file. It is licensed under a BSD-style license.