Continuous Integration Overview#
This page explains how TensorRT‑LLM’s CI is organized and how individual tests map to Jenkins stages. Most stages execute integration tests defined in YAML files, while unit tests run as part of a merge‑request pipeline. The sections below describe how to locate a test and trigger the stage that runs it.
CI pipelines#
Pull requests do not start testing by themselves. Developers trigger the CI by commenting /bot run (optionally with arguments) on the pull request (see Pull Request Template for more details). That kicks off the merge-request pipeline (defined in jenkins/L0_MergeRequest.groovy), which runs unit tests and integration tests whose YAML entries specify stage: pre_merge. Once a pull request is merged, a separate post-merge pipeline (defined in jenkins/L0_Test.groovy) runs every test marked post_merge across all supported GPU configurations.
The stage tags live in the YAML files under tests/integration/test_lists/test-db/. Searching those files for stage: pre_merge shows exactly which tests the merge-request pipeline covers.
Test definitions#
Integration tests are listed under tests/integration/test_lists/test-db/. Most YAML files are named after the GPU or configuration they run on (for example l0_a100.yml). Some files, like l0_sanity_check.yml, use wildcards and can run on multiple hardware types. Entries contain conditions and a list of tests. Two important terms in each entry are:
- stage: either pre_merge or post_merge.
- backend: pytorch, tensorrt or triton.
Example from l0_a100.yml:
    terms:
      stage: post_merge
      backend: triton
    tests:
    - triton_server/test_triton.py::test_gpt_ib_ptuning[gpt-ib-ptuning]
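As a rough illustration of how the stage and backend terms select tests, the sketch below filters entries that have already been parsed into Python dictionaries. The entry layout (a terms mapping next to a tests list) is a simplification assumed from the excerpt above, the second entry and its test name are made up, and select_tests is a hypothetical helper rather than anything in the repository.

```python
# Hypothetical, already-parsed entries; the real YAML files carry more fields.
ENTRIES = [
    {
        "terms": {"stage": "post_merge", "backend": "triton"},
        "tests": [
            "triton_server/test_triton.py::test_gpt_ib_ptuning[gpt-ib-ptuning]"
        ],
    },
    {
        "terms": {"stage": "pre_merge", "backend": "pytorch"},
        "tests": ["unittest/example.py::test_placeholder"],  # made-up test name
    },
]


def select_tests(entries, stage, backend=None):
    """Collect tests from entries whose terms match the stage (and backend, if given)."""
    selected = []
    for entry in entries:
        terms = entry.get("terms", {})
        if terms.get("stage") != stage:
            continue
        if backend is not None and terms.get("backend") != backend:
            continue
        selected.extend(entry.get("tests", []))
    return selected


post_merge_tests = select_tests(ENTRIES, "post_merge")
print(post_merge_tests)
```

Passing backend="triton" narrows the result further; leaving it as None matches any backend.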
Unit tests#
Unit tests live under tests/unittest/ and run during the merge-request pipeline. They are invoked from jenkins/L0_MergeRequest.groovy and do not require mapping to specific hardware stages.
Jenkins stage names#
jenkins/L0_Test.groovy maps stage names to these YAML files. For A100 the mapping includes:
    "A100X-Triton-[Post-Merge]-1": ["a100x", "l0_a100", 1, 2],
    "A100X-Triton-[Post-Merge]-2": ["a100x", "l0_a100", 2, 2],

The array elements are: GPU type, YAML file (without extension), shard index, and total number of shards. Only tests with stage: post_merge from that YAML file are selected when a Post-Merge stage runs.
Finding the stage for a test#
1. Locate the test in the appropriate YAML file under tests/integration/test_lists/test-db/ and note its stage and backend values.
2. Search jenkins/L0_Test.groovy for a stage whose YAML file matches (for example l0_a100) and whose name contains [Post-Merge] if the YAML entry uses stage: post_merge.
3. The resulting stage name(s) are what you pass to Jenkins via the stage_list parameter when triggering a job.
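The manual lookup above can be approximated with a small script. The sketch below runs a regular expression over a snippet copied from the A100 mapping shown earlier; stages_for is a hypothetical helper, and treating every stage name without [Post-Merge] as a pre-merge stage is an assumption, not something this page states. It also ignores the backend, so the result may still need narrowing by hand.

```python
import re

# Two mapping lines copied from the A100 example; the real file has many more.
GROOVY_SNIPPET = '''
"A100X-Triton-[Post-Merge]-1": ["a100x", "l0_a100", 1, 2],
"A100X-Triton-[Post-Merge]-2": ["a100x", "l0_a100", 2, 2],
'''

# "<stage name>": ["<gpu type>", "<yaml file>", <shard index>, <total shards>
ENTRY_RE = re.compile(
    r'"(?P<stage>[^"]+)"\s*:\s*\[\s*"(?P<gpu>[^"]+)"\s*,\s*"(?P<yaml>[^"]+)"\s*,'
    r'\s*(?P<shard>\d+)\s*,\s*(?P<total>\d+)'
)


def stages_for(groovy_text, yaml_stem, post_merge):
    """Return stage names that reference yaml_stem and match the post-merge flag."""
    names = []
    for match in ENTRY_RE.finditer(groovy_text):
        if match.group("yaml") != yaml_stem:
            continue
        # Assumption: a stage is post-merge exactly when its name has the tag.
        if ("[Post-Merge]" in match.group("stage")) == post_merge:
            names.append(match.group("stage"))
    return names


stage_names = stages_for(GROOVY_SNIPPET, "l0_a100", post_merge=True)
print(",".join(stage_names))
```

The comma-joined output has the same shape as the value passed through the stage_list parameter.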
Using test_to_stage_mapping.py#
Manually searching YAML and Groovy files can be tedious. The helper script scripts/test_to_stage_mapping.py automates the lookup:

    python scripts/test_to_stage_mapping.py --tests "triton_server/test_triton.py::test_gpt_ib_ptuning[gpt-ib-ptuning]"
    python scripts/test_to_stage_mapping.py --tests gpt_ib_ptuning
    python scripts/test_to_stage_mapping.py --stages A100X-Triton-Post-Merge-1
    python scripts/test_to_stage_mapping.py --test-list my_tests.txt
    python scripts/test_to_stage_mapping.py --test-list my_tests.yml

The first two commands print the Jenkins stages that run the specified tests or patterns. Patterns are matched by substring, so partial test names are supported out of the box. The third lists every test executed in the given stage. When providing tests on the command line, quote each test string so the shell does not interpret the [ and ] characters as globs. Alternatively, store the tests in a newline-separated text file or a YAML list and supply it with --test-list.
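To make the substring matching and the quoting advice concrete, here is a minimal sketch. It does not reproduce the script's internals: match_tests and build_command are hypothetical helpers, and KNOWN_TESTS is a tiny stand-in for the real test lists. The quoting relies on the standard library's shlex.join, which wraps any argument containing [ or ] in single quotes.

```python
import shlex

# Stand-in for the tests found in the YAML lists.
KNOWN_TESTS = [
    "triton_server/test_triton.py::test_gpt_ib_ptuning[gpt-ib-ptuning]",
    "examples/test_openai.py::test_llm_openai_triton_1gpu",
]


def match_tests(patterns, known_tests):
    """Return tests that contain at least one pattern as a plain substring."""
    return [t for t in known_tests if any(p in t for p in patterns)]


def build_command(test):
    """Build a shell-safe command line for a single test string."""
    return shlex.join(
        ["python", "scripts/test_to_stage_mapping.py", "--tests", test]
    )


matched = match_tests(["gpt_ib_ptuning"], KNOWN_TESTS)
print(matched)
print(build_command(matched[0]))
```

Generating the command this way avoids hand-quoting mistakes when test IDs contain brackets.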
To run the same tests on your pull request, comment:
    /bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2"

This executes the same tests that run post-merge for this hardware/backend.
Waiving tests#
Sometimes a test is known to fail due to a bug or unsupported feature. Instead of removing it from the YAML test lists, add the test name to tests/integration/test_lists/waives.txt. Every CI run passes this file to pytest via --waives-file, so the listed tests are skipped automatically.
Each line contains the fully qualified test name followed by an optional SKIP (reason) marker. A full:GPU_TYPE/ prefix restricts the waive to a specific hardware family. Example:

    examples/test_openai.py::test_llm_openai_triton_1gpu SKIP (https://nvbugspro.nvidia.com/bug/4963654)
    full:GH200/examples/test_qwen2audio.py::test_llm_qwen2audio_single_gpu[qwen2_audio_7b_instruct] SKIP (arm is not supported)

Changes to waives.txt should include a bug link or brief explanation so other developers understand why the test is disabled.
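The line format described above can be read with a short parser. This is only a sketch of the format as this page describes it, not the parser the CI actually uses; Waive and parse_waive are hypothetical names, and the pattern assumes test names contain no spaces.

```python
import re
from typing import NamedTuple, Optional


class Waive(NamedTuple):
    gpu: Optional[str]     # hardware family from the full:GPU_TYPE/ prefix
    test: str              # fully qualified test name
    reason: Optional[str]  # text inside SKIP (...), if present


# [full:GPU_TYPE/]<test name>[ SKIP (<reason>)]
LINE_RE = re.compile(
    r"^(?:full:(?P<gpu>[^/]+)/)?(?P<test>\S+?)"
    r"(?:\s+SKIP\s*\((?P<reason>.*)\))?\s*$"
)


def parse_waive(line):
    """Parse one waives.txt line; return None if it does not fit the format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return Waive(match.group("gpu"), match.group("test"), match.group("reason"))


first = parse_waive(
    "examples/test_openai.py::test_llm_openai_triton_1gpu "
    "SKIP (https://nvbugspro.nvidia.com/bug/4963654)"
)
second = parse_waive(
    "full:GH200/examples/test_qwen2audio.py::"
    "test_llm_qwen2audio_single_gpu[qwen2_audio_7b_instruct] "
    "SKIP (arm is not supported)"
)
print(first)
print(second)
```

A check like this could flag entries that lack a reason before a change to waives.txt is submitted.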
Triggering CI Best Practices#
Triggering Post-merge tests#
When you only need to verify a handful of post-merge tests, avoid the heavy /bot run --post-merge command. Instead, specify exactly which stages to run:

    /bot run --stage-list "stage-A,stage-B"

This runs only the stages listed. You can also add stages on top of the default pre-merge set:

    /bot run --extra-stage "stage-A,stage-B"

Both options accept any stage name defined in jenkins/L0_Test.groovy. Being selective keeps CI turnaround fast and conserves hardware resources.
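If stage names come out of a script, the comment text can be assembled rather than typed by hand. The helper below is a hypothetical convenience, not part of the repository; it only formats the two options described above and does not check that the stage names exist.

```python
def bot_comment(stages, extra=False):
    """Format a /bot run comment for the given stage names.

    extra=False -> run only these stages (--stage-list)
    extra=True  -> add them to the default pre-merge set (--extra-stage)
    """
    if not stages:
        raise ValueError("at least one stage name is required")
    flag = "--extra-stage" if extra else "--stage-list"
    joined = ",".join(stages)
    return f'/bot run {flag} "{joined}"'


only_these = bot_comment(["stage-A", "stage-B"])
on_top = bot_comment(["stage-A", "stage-B"], extra=True)
print(only_these)
print(on_top)
```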
Avoiding unnecessary --disable-fail-fast usage#
Avoid habitually using --disable-fail-fast as it wastes scarce hardware resources. The CI system automatically reuses successful test stages when commits remain unchanged, and subsequent /bot run commands only retry failed stages. Overusing --disable-fail-fast keeps failed pipelines consuming resources (like DGX-H100s), increasing queue backlogs and reducing team efficiency.