TensorRT Documentation
NVIDIA TensorRT is an SDK that facilitates high-performance machine learning inference. It complements training frameworks such as TensorFlow, PyTorch, and MXNet. It focuses on running an already-trained network quickly and efficiently on NVIDIA hardware.
Attention: Refer to the Release Notes, which describe the newest features, software enhancements and improvements, and known issues for this TensorRT release.
The Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
The Support Matrix provides an overview of the supported platforms, features, and hardware capabilities of the TensorRT APIs, parsers, and layers.
The Installing TensorRT section provides the installation requirements, a list of what is included in the TensorRT package, and step-by-step instructions for installing TensorRT.
The Architecture section demonstrates how to use the C++ and Python APIs to implement the most common deep learning layers, and how to take an existing model built with a deep learning framework and build a TensorRT engine using the provided parsers.
The Inference Library section covers the same C++ and Python APIs in more depth, walking through the process of building a TensorRT engine from an existing framework model with the provided parsers.
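The parser workflow these sections describe can be sketched with the TensorRT Python API. This is a minimal, hedged sketch, not the guide's own code: it assumes the `tensorrt` package, a CUDA-capable GPU, and a placeholder ONNX file path, so the import is guarded and the function returns `None` when TensorRT is unavailable.

```python
def build_serialized_engine(onnx_path: str):
    """Parse an ONNX model and build a serialized TensorRT engine (plan).

    Returns the plan bytes, or None if TensorRT is not installed in this
    environment. Sketch only; API details vary across TensorRT versions.
    """
    try:
        import tensorrt as trt
    except ImportError:
        return None  # TensorRT not available; nothing to build

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # Networks populated from ONNX use explicit batch dimensions.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse the ONNX model")
    config = builder.create_builder_config()
    # The serialized network is the plan that the runtime later deserializes.
    return builder.build_serialized_network(network, config)
```

The resulting bytes are typically written to disk and loaded later with `trt.Runtime.deserialize_cuda_engine` for inference.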
The Sample Support Guide provides an overview of all the supported TensorRT samples on GitHub. The samples cover areas such as recommenders, machine comprehension, character recognition, image classification, and object detection.
The Performance section introduces trtexec, a command-line tool designed for TensorRT performance benchmarking, which you can use to measure the inference performance of your deep learning models.
The API section enables developers in C++ and Python based development environments, and those looking to experiment with TensorRT, to easily parse models (for example, from ONNX) and generate and run PLAN files.
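As a rough illustration of the trtexec workflow, the hypothetical helper below shells out to the tool from Python. The `--onnx` and `--fp16` flags are documented trtexec options; the helper itself (its name and the `"model.onnx"` path) is an assumption for this sketch, and it degrades gracefully when trtexec is not on the PATH.

```python
import shutil
import subprocess

def run_trtexec_benchmark(onnx_path: str) -> str:
    """Benchmark an ONNX model with trtexec and return the tool's stdout."""
    trtexec = shutil.which("trtexec")
    if trtexec is None:
        # trtexec ships with the TensorRT package; it may not be installed here.
        return "trtexec not found on PATH"
    # --onnx selects the model to benchmark; --fp16 also permits FP16 tactics.
    completed = subprocess.run(
        [trtexec, f"--onnx={onnx_path}", "--fp16"],
        capture_output=True, text=True, check=False)
    return completed.stdout
```

trtexec prints throughput and latency statistics for the built engine, which is usually all you need for a first performance measurement.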
The API Migration Guide highlights the TensorRT API modifications. If you are unfamiliar with these changes, refer to our sample code for clarification.
The Reference section answers commonly asked questions about typical use cases and provides additional resources for assistance.