Installing on Linux via pip

  1. Install TensorRT LLM (tested on Ubuntu 24.04).

    Install prerequisites

    Before the pre-built Python wheel can be installed via pip, a few prerequisites must be put into place:

    Install CUDA Toolkit following the CUDA Installation Guide for Linux and make sure the CUDA_HOME environment variable is properly set.

    # By default, PyTorch CUDA 12.8 package is installed. Install PyTorch CUDA 13.0 package to align with the CUDA version used for building TensorRT LLM wheels.
    pip3 install torch==2.9.0 torchvision --index-url https://download.pytorch.org/whl/cu130

    sudo apt-get -y install libopenmpi-dev

    # Optional step: Only required for disagg-serving
    sudo apt-get -y install libzmq3-dev

    Tip

    Instead of manually installing the prerequisites as described above, it is also possible to use the pre-built TensorRT LLM Develop container image hosted on NGC (see here for information on container tags).

    Install pre-built TensorRT LLM wheel

    Once all prerequisites are in place, TensorRT LLM can be installed as follows:

    pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm

    This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

  2. Sanity check the installation by running the following in Python (tested on Python 3.12):

     from tensorrt_llm import LLM, SamplingParams


     def main():

         # Model could accept HF model name, a path to local HF model,
         # or TensorRT Model Optimizer's quantized checkpoints like nvidia/Llama-3.1-8B-Instruct-FP8 on HF.
         llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

         # Sample prompts.
         prompts = [
             "Hello, my name is",
             "The capital of France is",
             "The future of AI is",
         ]

         # Create a sampling params.
         sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

         for output in llm.generate(prompts, sampling_params):
             print(
                 f"Prompt: {output.prompt!r}, Generated text: {output.outputs[0].text!r}"
             )

         # Got output like
         # Prompt: 'Hello, my name is', Generated text: '\n\nJane Smith. I am a student pursuing my degree in Computer Science at [university]. I enjoy learning new things, especially technology and programming'
         # Prompt: 'The president of the United States is', Generated text: 'likely to nominate a new Supreme Court justice to fill the seat vacated by the death of Antonin Scalia. The Senate should vote to confirm the'
         # Prompt: 'The capital of France is', Generated text: 'Paris.'
         # Prompt: 'The future of AI is', Generated text: 'an exciting time for us. We are constantly researching, developing, and improving our platform to create the most advanced and efficient model available. We are'


     if __name__ == '__main__':
         main()
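
If the sanity check fails, one way to narrow down the cause is to re-check the prerequisites from step 1. The following is a minimal sketch, not part of TensorRT LLM: the helper names `check_cuda_home` and `check_torch_cuda` are hypothetical, and the `"13."` prefix is an assumption taken from the CUDA 13.0 PyTorch package mentioned above. It only reads the `CUDA_HOME` environment variable and PyTorch's `torch.version.cuda` attribute.

```python
import importlib.util
import os
from pathlib import Path


def check_cuda_home(env=None):
    """Return (ok, message) for whether CUDA_HOME points at an existing directory."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return False, "CUDA_HOME is not set"
    if not Path(cuda_home).is_dir():
        return False, f"CUDA_HOME={cuda_home!r} is not a directory"
    return True, f"CUDA_HOME={cuda_home!r}"


def check_torch_cuda(expected_prefix="13."):
    """Return (ok, message) comparing PyTorch's CUDA build to expected_prefix.

    expected_prefix is an assumption based on the CUDA 13.0 package above.
    """
    if importlib.util.find_spec("torch") is None:
        return False, "torch is not installed"
    import torch

    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        return False, f"torch {torch.__version__} is a CPU-only build"
    if not cuda_version.startswith(expected_prefix):
        return False, f"torch {torch.__version__} was built for CUDA {cuda_version}"
    return True, f"torch {torch.__version__} was built for CUDA {cuda_version}"


if __name__ == "__main__":
    for ok, message in (check_cuda_home(), check_torch_cuda()):
        print("OK  " if ok else "FAIL", message)
```

A `FAIL` line for the PyTorch check usually points back to the PyTorch install command in step 1 rather than to TensorRT LLM itself.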

Known limitations

The pre-built TensorRT LLM wheel has some known limitations when installed via pip.

  1. MPI in the Slurm environment

    If you encounter an error like the one below while running TensorRT LLM in a Slurm-managed cluster, you need to reconfigure the MPI installation to work with Slurm. The setup method depends on your Slurm configuration, so check with your cluster administrator. This is not specific to TensorRT LLM; it is a general MPI and Slurm issue.

    The application appears to have been direct launched using "srun",
    but OMPI was not built with SLURM support. This usually happens
    when OMPI was not configured --with-slurm and we weren't able
    to discover a SLURM installation in the usual places.

  2. Prevent pip from replacing existing PyTorch installation

    On certain systems, particularly Ubuntu 22.04, installing TensorRT LLM can cause pip to uninstall an existing CUDA 13.0 compatible PyTorch installation (e.g., torch==2.9.0+cu130) and replace it with a CUDA 12.8 version (torch==2.9.0). This leaves the TensorRT LLM installation unusable and leads to runtime errors.

    The solution is to create a pip constraints file that locks torch to the currently installed version. Here is an example of how this can be done manually:

    CURRENT_TORCH_VERSION=$(python3 -c "import torch; print(torch.__version__)")
    echo "torch==$CURRENT_TORCH_VERSION" > /tmp/torch-constraint.txt
    pip3 install --upgrade pip setuptools && pip3 install tensorrt_llm -c /tmp/torch-constraint.txt
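
The same constraints file can also be produced from Python, which may be convenient inside a provisioning script. This is a sketch under assumptions, not an official tool: the helper names `write_torch_constraint` and `pip_install_args` are hypothetical, and it assumes the version reported by `importlib.metadata` matches `torch.__version__` (including a local suffix such as `+cu130`), which is worth verifying on your system.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def write_torch_constraint(path, version=None):
    """Write a pip constraints file pinning torch and return the pinned line.

    If version is None, the installed torch version is looked up; this raises
    importlib.metadata.PackageNotFoundError when torch is not installed.
    """
    if version is None:
        version = metadata.version("torch")
    line = f"torch=={version}"
    Path(path).write_text(line + "\n", encoding="utf-8")
    return line


def pip_install_args(constraint_path):
    """Return pip arguments that install tensorrt_llm under the constraint."""
    return ["install", "tensorrt_llm", "-c", str(constraint_path)]


if __name__ == "__main__":
    target = Path(tempfile.gettempdir()) / "torch-constraint.txt"
    try:
        pinned = write_torch_constraint(target)
    except metadata.PackageNotFoundError:
        print("torch is not installed; nothing to pin")
    else:
        print(f"Wrote {pinned!r} to {target}")
        print("Next: pip3", " ".join(pip_install_args(target)))
```

The script only writes the file and prints the follow-up command; running pip is left to the user so that the install step stays visible and easy to review.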