Run LLM-API with PyTorch backend on Slurm

Source: NVIDIA/TensorRT-LLM.

#!/bin/bash
#SBATCH -A <account>    # parameter
#SBATCH -p <partition>  # parameter
#SBATCH -t 01:00:00
#SBATCH -N 1
#SBATCH --ntasks-per-node=2
#SBATCH -o logs/llmapi-distributed.out
#SBATCH -e logs/llmapi-distributed.err
#SBATCH -J llmapi-distributed-task

# NOTE: this feature is experimental and may not work on all systems.
# trtllm-llmapi-launch is a script that launches the LLM-API code on
# Slurm-like systems, and can support multi-node and multi-GPU setups.

# Note that the number of MPI processes must match the model world size,
# e.g. for tensor_parallel_size=16 you may use 2 nodes with 8 GPUs each,
# 4 nodes with 4 GPUs each, or other combinations.

# The docker image should have tensorrt_llm installed, or you need to
# install it in the task.

# The following variables are expected to be set in the environment;
# you can set them via --export in the srun/sbatch command.
#   CONTAINER_IMAGE: the docker image to use; it should have tensorrt_llm
#      installed, or install it in the task.
#   MOUNT_DIR: the directory to mount in the container
#   MOUNT_DEST: the destination directory in the container
#   WORKDIR: the working directory in the container
#   SOURCE_ROOT: the path to the TensorRT LLM source
#   PROLOGUE: the prologue to run before the script
#   LOCAL_MODEL: the local model directory to use. NOTE: downloading from HF
#      is not supported in Slurm mode; download the model beforehand and put
#      it in the LOCAL_MODEL directory.

# Adjust the paths to run
export script=$SOURCE_ROOT/examples/llm-api/quickstart_advanced.py

# Launch the PyTorch example with the trtllm-llmapi-launch command.
srun -l \
    --container-image=${CONTAINER_IMAGE} \
    --container-mounts=${MOUNT_DIR}:${MOUNT_DEST} \
    --container-workdir=${WORKDIR} \
    --export=ALL \
    --mpi=pmix \
    bash -c "
        $PROLOGUE
        export PATH=$PATH:~/.local/bin
        trtllm-llmapi-launch python3 $script \
            --model_dir $LOCAL_MODEL \
            --prompt 'Hello, how are you?' \
            --tp_size 2
    "
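
The seven variables above come from the submission environment rather than from the script itself. As a minimal sketch of how a submission could look — the file name llmapi_distributed.slurm, the image path, and every directory below are illustrative placeholders, not values from the TensorRT-LLM repository — you can export the variables in the submitting shell and rely on sbatch propagating the environment (the default, equivalent to --export=ALL):

# All values are placeholders; adjust them for your cluster.
export CONTAINER_IMAGE=/path/to/tensorrt_llm_image.sqsh   # image with tensorrt_llm installed
export MOUNT_DIR=$HOME/workspace
export MOUNT_DEST=/workspace
export WORKDIR=/workspace
export SOURCE_ROOT=/workspace/TensorRT-LLM
export LOCAL_MODEL=/workspace/models/my-model   # downloaded beforehand; HF download is not supported here
export PROLOGUE=""                              # optional setup commands run inside the container before the example
sbatch llmapi_distributed.slurm                 # assumes the script above is saved under this name

Passing the same variables on the command line, e.g. sbatch --export=ALL,CONTAINER_IMAGE=...,MOUNT_DIR=... llmapi_distributed.slurm, works the same way.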
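
As noted in the script's comments, the number of MPI ranks (nodes × --ntasks-per-node) has to equal the model world size; the header above requests 1 node with 2 tasks, which matches --tp_size 2. Purely as an illustration of that bookkeeping (not a configuration taken from the repository), running with tensor_parallel_size=16 on 8-GPU nodes would mean changing the header and the flag together:

#SBATCH -N 2                  # 2 nodes ...
#SBATCH --ntasks-per-node=8   # ... x 8 ranks per node = 16 MPI ranks, one per GPU

# and, inside the srun command:
        trtllm-llmapi-launch python3 $script \
            --model_dir $LOCAL_MODEL \
            --prompt 'Hello, how are you?' \
            --tp_size 16      # must equal nodes * ntasks-per-node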