Image classification with NVIDIA TensorRT from TensorFlow models.
NVIDIA-AI-IOT/tf_to_trt_image_classification
This repository contains examples, scripts, and code related to image classification using TensorFlow models (from here) converted to TensorRT. Converting TensorFlow models to TensorRT offers significant performance gains on the Jetson TX2, as seen below.
- Models
- Setup
- Download models and create frozen graphs
- Convert frozen graph to TensorRT engine
- Execute TensorRT engine
- Benchmark all models
The table below shows various details related to pretrained models ported from the TensorFlow slim model zoo.
| Model | Input Size | TensorRT (TX2 / Half) | TensorRT (TX2 / Float) | TensorFlow (TX2 / Float) | Input Name | Output Name | Preprocessing Fn. |
|---|---|---|---|---|---|---|---|
| inception_v1 | 224x224 | 7.98ms | 12.8ms | 27.6ms | input | InceptionV1/Logits/SpatialSqueeze | inception |
| inception_v3 | 299x299 | 26.3ms | 46.1ms | 98.4ms | input | InceptionV3/Logits/SpatialSqueeze | inception |
| inception_v4 | 299x299 | 52.1ms | 88.2ms | 176ms | input | InceptionV4/Logits/Logits/BiasAdd | inception |
| inception_resnet_v2 | 299x299 | 53.0ms | 98.7ms | 168ms | input | InceptionResnetV2/Logits/Logits/BiasAdd | inception |
| resnet_v1_50 | 224x224 | 15.7ms | 27.1ms | 63.9ms | input | resnet_v1_50/SpatialSqueeze | vgg |
| resnet_v1_101 | 224x224 | 29.9ms | 51.8ms | 107ms | input | resnet_v1_101/SpatialSqueeze | vgg |
| resnet_v1_152 | 224x224 | 42.6ms | 78.2ms | 157ms | input | resnet_v1_152/SpatialSqueeze | vgg |
| resnet_v2_50 | 299x299 | 27.5ms | 44.4ms | 92.2ms | input | resnet_v2_50/SpatialSqueeze | inception |
| resnet_v2_101 | 299x299 | 49.2ms | 83.1ms | 160ms | input | resnet_v2_101/SpatialSqueeze | inception |
| resnet_v2_152 | 299x299 | 74.6ms | 124ms | 230ms | input | resnet_v2_152/SpatialSqueeze | inception |
| mobilenet_v1_0p25_128 | 128x128 | 2.67ms | 2.65ms | 15.7ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| mobilenet_v1_0p5_160 | 160x160 | 3.95ms | 4.00ms | 16.9ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| mobilenet_v1_1p0_224 | 224x224 | 12.9ms | 12.9ms | 24.4ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| vgg_16 | 224x224 | 38.2ms | 79.2ms | 171ms | input | vgg_16/fc8/BiasAdd | vgg |
The times recorded include data transfer to GPU, network execution, and data transfer back from GPU. Time does not include preprocessing. See scripts/test_tf.py, scripts/test_trt.py, and src/test/test_trt.cu for implementation details.
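To make the gains in the table concrete, the following sketch computes speedup ratios for three of the rows. The latencies are copied from the table above; the function and variable names are illustrative and not part of this repository.

```python
# Latencies in milliseconds, copied from the table above:
# (TensorRT half, TensorRT float, TensorFlow float) on the Jetson TX2.
LATENCY_MS = {
    "inception_v1": (7.98, 12.8, 27.6),
    "resnet_v1_50": (15.7, 27.1, 63.9),
    "vgg_16": (38.2, 79.2, 171.0),
}


def speedups(trt_half_ms, trt_float_ms, tf_float_ms):
    """Return (half, float) TensorRT speedups relative to TensorFlow."""
    return tf_float_ms / trt_half_ms, tf_float_ms / trt_float_ms


if __name__ == "__main__":
    for name in sorted(LATENCY_MS):
        half, full = speedups(*LATENCY_MS[name])
        print("{}: {:.1f}x (half), {:.1f}x (float)".format(name, half, full))
```

For these three rows, the half-precision engines work out to roughly 3.5x to 4.5x faster than TensorFlow, and the float engines to a little over 2x.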
Flash the Jetson TX2 using JetPack 3.2. Be sure to install:
- CUDA 9.0
- OpenCV4Tegra
- cuDNN
- TensorRT 3.0
Install pip on Jetson TX2.
sudo apt-get install python-pip
Install TensorFlow on Jetson TX2.
Download the TensorFlow 1.5.0 pip wheel from here. This build of TensorFlow is provided as a convenience for the purposes of this project.
Install TensorFlow using pip
sudo pip install tensorflow-1.5.0rc0-cp27-cp27mu-linux_aarch64.whl
Install uff exporter on Jetson TX2.
Download the TensorRT 3.0.4 tar package for Ubuntu 16.04 and CUDA 9.0 from https://developer.nvidia.com/nvidia-tensorrt-download.
Extract archive
tar -xzf TensorRT-3.0.4.Ubuntu-16.04.3.x86_64.cuda-9.0.cudnn7.0.tar.gz
Install uff python package using pip
sudo pip install TensorRT-3.0.4/uff/uff-0.2.0-py2.py3-none-any.whl
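Before moving on, it can help to confirm that the Python packages are importable. The sketch below is a hypothetical helper, not part of this repository; the module names `tensorflow` and `uff` are the import names these wheels are expected to provide, which is an assumption worth checking on your system.

```python
import importlib


def check_modules(names):
    """Try to import each module and report whether it succeeded."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    for name, ok in sorted(check_modules(["tensorflow", "uff"]).items()):
        print("{}: {}".format(name, "ok" if ok else "MISSING"))
```

Run it with the same Python interpreter that pip installed the wheels for, otherwise the result says nothing about the environment the scripts will use.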
Clone and build this project
git clone --recursive https://github.com/NVIDIA-Jetson/tf_to_trt_image_classification.git
cd tf_to_trt_image_classification
mkdir build
cd build
cmake ..
make
cd ..
Run the following bash script to download all of the pretrained models.
source scripts/download_models.sh
If there are any models you don't want to use, simply remove the URL from the model list in scripts/download_models.sh.
Next, because the TensorFlow models are provided in checkpoint format, we must convert them to frozen graphs for optimization with TensorRT. Run the scripts/models_to_frozen_graphs.py script.
python scripts/models_to_frozen_graphs.py
If you removed any models in the previous step, you must add 'exclude': True to the corresponding item in the NETS dictionary located in scripts/model_meta.py. If you are following the instructions for executing engines below, you may also need some sample images. Run the following script to download a few images from ImageNet.
source scripts/download_images.sh
Run the scripts/convert_plan.py script from the root directory of the project, referencing the models table for relevant parameters. For example, to convert the Inception V1 model, run the following:
python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float
The inputs to the convert_plan.py script are:
- frozen graph path
- output plan path
- input node name
- input height
- input width
- output node name
- max batch size
- max workspace size
- data type (float or half)
This script assumes models with a single image input and a single output, and may not work out of the box for models other than those in the table above.
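When converting several models, the positional arguments are easy to mix up. The following sketch builds the argument list in the order given above; `build_convert_command` is a hypothetical helper, not part of this repository, and its defaults simply reuse the values from the Inception V1 example.

```python
def build_convert_command(frozen_graph, plan, input_name, height, width,
                          output_name, max_batch_size=1,
                          max_workspace_size=0, data_type="float"):
    """Build the argv list for scripts/convert_plan.py (hypothetical helper)."""
    if data_type not in ("float", "half"):
        raise ValueError("data_type must be 'float' or 'half'")
    return [
        "python", "scripts/convert_plan.py",
        frozen_graph, plan,
        input_name, str(height), str(width),
        output_name,
        str(max_batch_size), str(max_workspace_size),
        data_type,
    ]


if __name__ == "__main__":
    cmd = build_convert_command(
        "data/frozen_graphs/inception_v1.pb",
        "data/plans/inception_v1.plan",
        "input", 224, 224,
        "InceptionV1/Logits/SpatialSqueeze",
    )
    print(" ".join(cmd))
    # To run the conversion from the project root, pass cmd to
    # subprocess.check_call after importing subprocess.
```

Keeping the arguments as a list rather than a single string avoids shell quoting issues if a path contains spaces.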
Call the examples/classify_image program from the root directory of the project, referencing the models table for relevant parameters. For example, to run the Inception V1 model converted as above:
./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception
For reference, the inputs to the example program are:
- input image path
- plan file path
- labels file (one label per line, line number corresponds to index in output)
- input node name
- output node name
- preprocessing function (either vgg or inception)
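The two preprocessing options share their names with the TensorFlow slim preprocessing modules. As a rough illustration only, the sketch below shows the per-pixel normalization those modules are commonly described as applying: scaling to [-1, 1] for `inception`, and subtracting per-channel RGB means for `vgg`. The constants and function names are assumptions made for this example; check the repository source for what this project actually does. Resizing, cropping, and channel order are not covered.

```python
# Per-channel RGB means commonly used by VGG-style preprocessing (assumed).
VGG_MEAN_RGB = (123.68, 116.78, 103.94)


def preprocess_inception(pixel):
    """Map an (R, G, B) pixel from [0, 255] to [-1, 1]."""
    return tuple((c / 255.0 - 0.5) * 2.0 for c in pixel)


def preprocess_vgg(pixel):
    """Subtract the assumed per-channel mean from an (R, G, B) pixel."""
    return tuple(c - m for c, m in zip(pixel, VGG_MEAN_RGB))


if __name__ == "__main__":
    print(preprocess_inception((0, 127.5, 255)))
    print(preprocess_vgg((123.68, 116.78, 103.94)))
```

Whatever the exact constants, the preprocessing argument should match the "Preprocessing Fn." column of the models table for the model being run.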
We provide two image label files in the data folder. Some of the TensorFlow models were trained with an additional "background" class, causing the model to have 1001 outputs instead of 1000. To determine the number of outputs for each model, reference the NETS variable in scripts/model_meta.py.
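To illustrate how a label file lines up with the network output, here is a minimal sketch that maps the highest-scoring output indices to labels. It assumes one label per line with line order matching the output index, as described above; the tiny in-memory label list and the helper names are made up for the example.

```python
import io


def load_labels(stream):
    """Read one label per line; the line position is the output index."""
    return [line.rstrip("\n") for line in stream]


def top_k(scores, labels, k=1):
    """Return the k (label, score) pairs with the highest scores."""
    if len(scores) != len(labels):
        # A 1000-vs-1001 mismatch may mean the wrong label file was chosen.
        raise ValueError("got {} scores but {} labels".format(
            len(scores), len(labels)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in order[:k]]


if __name__ == "__main__":
    # Made-up labels; index 0 plays the role of the extra "background" class.
    labels = load_labels(io.StringIO(u"background\ntench\ngoldfish\n"))
    print(top_k([0.1, 0.2, 0.7], labels, k=2))
```

The length check is the useful part: using the 1000-entry file with a 1001-output model (or the reverse) would otherwise shift every label by one without any visible error.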
To benchmark all of the models, first convert all of the models that you downloaded above into TensorRT engines. Run the following script to convert all models:
python scripts/frozen_graphs_to_plans.py
If you want to change parameters related to TensorRT optimization, just edit the scripts/frozen_graphs_to_plans.py file. Next, to benchmark all of the models, run the scripts/test_trt.py script:
python scripts/test_trt.py
Once finished, the timing results will be stored at data/test_output_trt.txt. If you want to also benchmark the TensorFlow models, simply run:
python scripts/test_tf.py
The results will be stored at data/test_output_tf.txt. This benchmarking script loads an example image as input, so make sure you have downloaded the sample images as described above.
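For a sense of what such a benchmark measures, the sketch below times repeated calls to an inference function while keeping preprocessing outside the timed region, in line with the note under the models table. It is a generic illustration, not the logic of scripts/test_trt.py or scripts/test_tf.py; `run_inference`, the warm-up count, and the iteration count are placeholders.

```python
import time


def average_latency_ms(run_inference, data, warmup=2, iterations=10):
    """Average wall-clock time of run_inference(data) in milliseconds."""
    for _ in range(warmup):
        run_inference(data)  # warm-up runs are not timed
    start = time.time()
    for _ in range(iterations):
        run_inference(data)
    elapsed = time.time() - start
    return 1000.0 * elapsed / iterations


if __name__ == "__main__":
    # Placeholder standing in for copy-to-GPU, execution, and copy back.
    def fake_inference(data):
        return sum(data)

    preprocessed = [0.0] * 1000  # preprocessing happens before timing starts
    print("{:.3f} ms".format(average_latency_ms(fake_inference, preprocessed)))
```

Averaging over several iterations after a warm-up reduces the influence of one-time costs such as lazy initialization on the reported number.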