C-H-D/tensorRT-Caffe


A demo using tensorRT on NVIDIA Jetson TX2 accelerating the Caffe model of AlexNet.

Please refer to NVIDIA JETSON TX2 tensorRT 加速 Caffe 实战.pdf for a detailed description in Chinese.

Prerequisites:

  • NVIDIA Jetson TX2
  • CUDA 8.0
  • cuDNN
  • tensorRT
  • .prototxt file
  • .caffemodel file
  • .binaryproto file

It is recommended to flash the TX2 device with JetPack 3.1, so that all the required tools are installed automatically.

Caffe model we use

We try to classify three different types of parking slots:

  • ParkingSlotType1
  • ParkingSlotType2
  • ParkingSlotType3

So we used AlexNet in Caffe to implement this task.

The input and output are specified by the prototxt file of the Caffe model:

```
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
```

tensorRT will look for the layer in your prototxt whose type is "Input" and use it as the input data.

As you can see above, our model has only one input named "data", which is of the size 3*227*227. The first dim is the batch size, which doesn't affect the input size here.

Also, for the output layer, at the end of our prototxt file:

```
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 3
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}
```

It's a softmax layer that outputs the probabilities of an image belonging to the 3 types of parking slots.

The output layer's name is "prob"; later, in tensorRT, you will have to refer to it by this name, so keep it in mind.

In our case the output is a 1*3 array.

Also, you will need the .caffemodel file and the .binaryproto file.
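
With the prototxt and caffemodel in hand, here is a minimal sketch of how such a network can be handed to tensorRT through its Caffe parser, following the pattern of the official sampleMNIST example. The function name and file paths are placeholders, and the exact API may differ slightly between tensorRT versions:

```cpp
#include <iostream>
#include "NvInfer.h"
#include "NvCaffeParser.h"

using namespace nvinfer1;
using namespace nvcaffeparser1;

// tensorRT requires a logger implementation.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

// Build an engine from the deploy prototxt and the trained caffemodel.
// The parser finds the layer of type "Input" by itself; "prob" is the
// output blob name defined at the end of the prototxt above.
ICudaEngine* buildEngine(const char* deployFile, const char* modelFile)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    ICaffeParser* parser = createCaffeParser();

    const IBlobNameToTensor* blobNameToTensor =
        parser->parse(deployFile, modelFile, *network, DataType::kFLOAT);

    // This is where the output blob name matters.
    network->markOutput(*blobNameToTensor->find("prob"));

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20);
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    network->destroy();
    parser->destroy();
    builder->destroy();
    return engine;
}
```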

Emphasis on some issues

  • Resizing the image: In our case the images are usually of size 48*210, while the input of the Caffe model is 227*227, so the images need to be resized. I did this before running tensorRT.

    Resizing is done by simply scaling instead of padding. Here is a resized image:

    resize

  • Reading the image: Our input images are in JPEG format, but tensorRT itself doesn't provide methods for reading images.

    So we used stb_image to read them.

  • Converting the data format: The image data returned by stb is interleaved, i.e. H*W*C, where C is the channel (RGB), W is the width and H is the height.

    However, as I found by experiment, the input to Caffe should be planar, i.e. C*H*W, with the channels in BGR order.

    Both buffers (the one read by stb and the one required by Caffe) are 1*n arrays, so they look like this:

    stb: RGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGBRGB......

    Caffe: BBBBBBBBBBBBBBBB...GGGGGGGGGGGGGGGG...RRRRRRRRRRRRRRRRR...

    So the data read by stb has to be repacked. See the code for details, and the sketch after this list.

  • Subtracting the mean image: In Caffe you should subtract the mean image, stored in the .binaryproto file, from the input image. However, when tensorRT reads that file, the result is a large array rather than 3 numbers representing the per-channel means, so you have to compute those means yourself.

    In my case I used pyCaffe to read an .npy file converted from the .binaryproto file to get the 3 numbers, so in the code I simply use them directly without any calculation (an alternative sketch using the tensorRT parser is shown after this list).
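
To make the repacking concrete, here is a sketch of loading a (pre-resized) JPEG with stb_image and repacking it into the planar BGR float buffer, with the per-channel means subtracted. The mean values below are placeholders, not the numbers used in this repository; substitute the ones computed from your own .binaryproto.

```cpp
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

#include <vector>

// Placeholder per-channel means in (B, G, R) order -- replace with the
// values computed from your own mean .binaryproto.
static const float kMean[3] = { 104.0f, 117.0f, 123.0f };

// Load a JPEG with stb_image and repack it from interleaved RGB (H*W*C)
// into the planar BGR (C*H*W) float buffer that Caffe/tensorRT expects.
// The image is assumed to already be resized to width*height (227*227 here).
std::vector<float> loadForCaffe(const char* path, int width, int height)
{
    int w, h, channels;
    unsigned char* rgb = stbi_load(path, &w, &h, &channels, 3); // force RGB
    // (real code should check rgb != nullptr and that w == width, h == height)

    std::vector<float> planar(3 * height * width);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int c = 0; c < 3; ++c)
            {
                // stb:   RGBRGBRGB...        index = (y*W + x)*3 + c
                // Caffe: BBB...GGG...RRR...  index = c'*H*W + y*W + x
                int dstChannel = 2 - c; // RGB -> BGR
                planar[dstChannel * height * width + y * width + x] =
                    static_cast<float>(rgb[(y * width + x) * 3 + c]) - kMean[dstChannel];
            }

    stbi_image_free(rgb);
    return planar;
}
```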
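The author obtained the three mean values via pyCaffe. As an alternative, here is a hedged sketch of reading the .binaryproto with the tensorRT Caffe parser itself and averaging each channel plane. The mean image height and width are assumptions and must match whatever was stored when the training data was prepared:

```cpp
#include <cstdio>
#include "NvCaffeParser.h"

using namespace nvcaffeparser1;

// Print the per-channel means of a Caffe mean image.
// meanHeight/meanWidth must match the mean image stored in the .binaryproto.
void printChannelMeans(const char* binaryprotoFile, int meanHeight, int meanWidth)
{
    ICaffeParser* parser = createCaffeParser();
    IBinaryProtoBlob* meanBlob = parser->parseBinaryProto(binaryprotoFile);
    const float* data = reinterpret_cast<const float*>(meanBlob->getData());

    // The mean blob is planar: B plane, then G plane, then R plane.
    for (int c = 0; c < 3; ++c)
    {
        double sum = 0.0;
        for (int i = 0; i < meanHeight * meanWidth; ++i)
            sum += data[c * meanHeight * meanWidth + i];
        printf("channel %d mean: %f\n", c, sum / (meanHeight * meanWidth));
    }

    meanBlob->destroy();
    parser->destroy();
}
```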

The code I wrote is based on the official sampleMNIST example. See giexec.cpp.
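
For completeness, here is a sketch of the inference step in the same style as sampleMNIST, using the "data" and "prob" binding names from the prototxt. The buffer sizes assume batch size 1 and the 3*227*227 input / 1*3 output described above:

```cpp
#include <cuda_runtime_api.h>
#include "NvInfer.h"

using namespace nvinfer1;

// Run one image through the engine and write the 3 class probabilities
// into `output`.
void doInference(ICudaEngine& engine, const float* input, float* output)
{
    static const int INPUT_SIZE  = 3 * 227 * 227;
    static const int OUTPUT_SIZE = 3;

    IExecutionContext* context = engine.createExecutionContext();

    // One device buffer per input/output blob, located by blob name.
    int inputIndex  = engine.getBindingIndex("data");
    int outputIndex = engine.getBindingIndex("prob");

    void* buffers[2];
    cudaMalloc(&buffers[inputIndex],  INPUT_SIZE  * sizeof(float));
    cudaMalloc(&buffers[outputIndex], OUTPUT_SIZE * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy the input to the GPU, run the network, copy the result back.
    cudaMemcpyAsync(buffers[inputIndex], input, INPUT_SIZE * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    context->enqueue(1, buffers, stream, nullptr);
    cudaMemcpyAsync(output, buffers[outputIndex], OUTPUT_SIZE * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    context->destroy();
}
```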

As a result, the model runs 6 times faster than with pyCaffe on the GPU. Here are the results:

Caffe-GPU: (timing screenshot "GPU")

Caffe-tensorRT: (timing screenshot "TRT")
