This repository was archived by the owner on Jan 26, 2022. It is now read-only.

roytseng-tw/Detectron.pytorch (public archive)

A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

License

MIT license

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark

A Pytorch Implementation of Detectron

Example output of e2e_mask_rcnn-R-101-FPN_2x using Detectron pretrained weight.

Corresponding example output from Detectron.

Example output of e2e_keypoint_rcnn-R-50-FPN_s1x using Detectron pretrained weight.

This code follows the implementation architecture of Detectron. Only part of the functionality is supported. Check this section for more information.

With this code, you can...

Train your model from scratch.
Run inference using the pretrained weight files (*.pkl) from Detectron.

This repository was originally built on jwyang/faster-rcnn.pytorch. However, after many modifications, the structure has changed a lot and it is now more similar to Detectron. I deliberately made everything similar or identical to Detectron's implementation, so that results can be reproduced directly from the official pretrained weight files.

This implementation has the following features:

It is pure PyTorch code. Of course, there is some CUDA code.
It supports multi-image batch training.
It supports multi-GPU training.
It supports three pooling methods. Note that only roi align has been revised to match the implementation in Caffe2, so use it.
It is memory efficient. For data batching, two techniques are available to reduce memory usage: 1) Aspect grouping: group images with similar aspect ratios in a batch. 2) Aspect cropping: crop images that are too long. Aspect grouping is implemented in Detectron, so it is used by default. Aspect cropping is an idea from jwyang/faster-rcnn.pytorch, and it is not used by default.
Besides that, I implement a customized nn.DataParallel module which enables different batch blob sizes on different GPUs. Check the My nn.DataParallel section for more details.
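
To make the aspect grouping idea concrete, here is a minimal sketch. It is not the code used in this repository: the function name `aspect_grouped_batches` is made up for illustration, and splitting images into just two groups (landscape and portrait) is an assumption about how the grouping could be done.

```python
from typing import List, Sequence, Tuple


def aspect_grouped_batches(
    sizes: Sequence[Tuple[int, int]], batch_size: int
) -> List[List[int]]:
    """Return batches of image indices in which every batch holds only
    landscape (width >= height) or only portrait images.

    `sizes` is a sequence of (width, height) pairs. Hypothetical helper,
    written only to illustrate the idea.
    """
    landscape = [i for i, (w, h) in enumerate(sizes) if w >= h]
    portrait = [i for i, (w, h) in enumerate(sizes) if w < h]
    batches = []
    for group in (landscape, portrait):
        # Cut each group into consecutive chunks of at most batch_size.
        for start in range(0, len(group), batch_size):
            batches.append(group[start:start + batch_size])
    return batches


sizes = [(640, 480), (480, 640), (500, 375), (375, 500), (640, 427)]
batches = aspect_grouped_batches(sizes, batch_size=2)
print(batches)
```

Because images in a batch are padded to a common size, keeping similar shapes together means less padding is wasted.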

News

(2018/05/25) Support ResNeXt backbones.
(2018/05/22) Add group normalization baselines.
(2018/05/15) PyTorch 0.4 is supported now!

Getting Started

Clone the repo:

git clone https://github.com/roytseng-tw/mask-rcnn.pytorch.git

Requirements

Tested under python3.

python packages
- pytorch>=0.3.1
- torchvision>=0.2.0
- cython
- matplotlib
- numpy
- scipy
- opencv
- pyyaml
- packaging
- pycocotools — for COCO dataset, also available from pip.
- tensorboardX — for logging the losses in Tensorboard
An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have a GPU implementation.
NOTICE: different versions of the PyTorch package have different memory usage.

Compilation

Compile the CUDA code:

cd lib  # please change to this directory
sh make.sh

If you are using Volta GPUs, uncomment this line in lib/make.sh and remember to append a backslash to the line above. CUDA_PATH defaults to /usr/local/cuda. If you want to use a CUDA library on a different path, change this line accordingly.

It will compile all the modules you need, including NMS, ROI_Pooling, ROI_Crop and ROI_Align. (Actually gpu nms is never used ...)

Note that if you use CUDA_VISIBLE_DEVICES to set GPUs, make sure at least one GPU is visible when compiling the code.

Data Preparation

Create a data folder under the repo,

cd {repo_root}
mkdir data

COCO: Download the coco images and annotations from the coco website.
Make sure to put the files in the following structure:
```
coco
├── annotations
│   ├── instances_minival2014.json
│   ├── instances_train2014.json
│   ├── instances_train2017.json
│   ├── instances_val2014.json
│   ├── instances_val2017.json
│   ├── instances_valminusminival2014.json
│   ├── ...
│
└── images
    ├── train2014
    ├── train2017
    ├── val2014
    ├── val2017
    ├── ...
```
Download coco mini annotations from here. Please note that minival is exactly equivalent to the recently defined 2017 val set. Similarly, the union of valminusminival and the 2014 train is exactly equivalent to the 2017 train set.
Feel free to put the dataset anywhere you want, and then soft link the dataset under the data/ folder:
```
ln -s path/to/coco data/coco
```
It is recommended to put the images on an SSD for possibly better training performance.
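
Before starting a long training run, it can help to check that the soft-linked dataset has the layout shown above. The following is a small sketch of such a check; the helper `missing_coco_entries` and the `EXPECTED_ENTRIES` list are made up for illustration and only cover the 2017 entries, so adjust them to the splits you actually use.

```python
import tempfile
from pathlib import Path

# Assumption: only the 2017 train/val entries are checked here.
EXPECTED_ENTRIES = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "images/train2017",
    "images/val2017",
]


def missing_coco_entries(coco_root, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in expected if not (root / rel).exists()]


# Demo on a throwaway directory that holds only the train2017 entries.
with tempfile.TemporaryDirectory() as tmp:
    coco = Path(tmp) / "coco"
    (coco / "annotations").mkdir(parents=True)
    (coco / "annotations" / "instances_train2017.json").write_text("{}")
    (coco / "images" / "train2017").mkdir(parents=True)
    missing = missing_coco_entries(coco)

print(missing)
```

In a real checkout you would call `missing_coco_entries("data/coco")` from the repo root and expect an empty list.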

Pretrained Model

I use ImageNet pretrained weights from Caffe for the backbone networks.

ResNet50, ResNet101, ResNet152
VGG16 (vgg backbone is not implemented yet)

Download them and put them into {repo_root}/data/pretrained_model.

You can use the following command to download them all:

extra required packages: argparse_color_formater, colorama, requests

python tools/download_imagenet_weights.py

NOTE: Caffe pretrained weights have slightly better performance than PyTorch pretrained ones. It is suggested to use the Caffe pretrained models from the above link to reproduce the results. By the way, Detectron also uses pretrained weights from Caffe.

If you want to use PyTorch pretrained models, please remember to convert images from BGR to RGB, and also use the same data preprocessing (subtract the mean and normalize) as used for the PyTorch pretrained models.

ImageNet Pretrained Model provided by Detectron

Besides using the pretrained weights for ResNet above, you can also use the weights from Detectron by changing the corresponding line in the model config file as follows:

RESNETS:
  IMAGENET_PRETRAINED_WEIGHTS: 'data/pretrained_model/R-50.pkl'

R-50-GN.pkl and R-101-GN.pkl are required for gn_baselines.

X-101-32x8d.pkl, X-101-64x4d.pkl and X-152-32x8d-IN5k.pkl are required for ResNeXt backbones.

Training

DO NOT CHANGE anything in the provided config files (configs/**/xxxx.yml) unless you know what you are doing.

Use the environment variable CUDA_VISIBLE_DEVICES to control which GPUs to use.

Adaptive config adjustment

Let's define some terms first

       batch_size: NUM_GPUS x TRAIN.IMS_PER_BATCH
       effective_batch_size: batch_size x iter_size
       change of something: new value of something / old value of something

The following config options are adjusted automatically according to the actual training setup: 1) the number of GPUs NUM_GPUS, 2) the batch size per GPU TRAIN.IMS_PER_BATCH, 3) the update period iter_size

SOLVER.BASE_LR: adjusted directly proportional to the change of batch_size.
SOLVER.STEPS, SOLVER.MAX_ITER: adjusted inversely proportional to the change of effective_batch_size.
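
The two scaling rules above can be written out as a short sketch. This is not the code from this repository: the function name `adjust_solver`, the rounding to the nearest integer, and the assumption that the original config was written for iter_size = 1 are all mine.

```python
def adjust_solver(base_lr, steps, max_iter,
                  old_batch_size, new_batch_size, iter_size=1):
    """Scale solver settings for a new training setup (illustrative only).

    Assumption: the config was originally tuned with iter_size = 1, so the
    original effective batch size equals old_batch_size.
    """
    # change of batch_size = new value / old value
    batch_change = new_batch_size / old_batch_size
    # change of effective_batch_size = (new batch_size x iter_size) / old value
    effective_change = (new_batch_size * iter_size) / old_batch_size

    new_lr = base_lr * batch_change                      # directly proportional
    new_steps = [int(round(s / effective_change)) for s in steps]  # inverse
    new_max_iter = int(round(max_iter / effective_change))         # inverse
    return new_lr, new_steps, new_max_iter


# Example values: a config written for 8 GPUs x 2 images, run on 2 GPUs x 2.
lr, steps, max_iter = adjust_solver(
    base_lr=0.02, steps=[0, 120000, 160000], max_iter=180000,
    old_batch_size=16, new_batch_size=4, iter_size=1,
)
print(lr, steps, max_iter)
```

With a quarter of the batch size, the learning rate is divided by four and the schedule is stretched to four times as many iterations. Running the same setup with `iter_size=4` would keep the schedule unchanged while still lowering the learning rate.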

Train from scratch

Take mask-rcnn with res50 backbone for example.

python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --use_tfboard --bs {batch_size} --nw {num_workers}

Use --bs to override the default batch size with a value that fits into your GPUs. Similarly for --nw: the number of data loader threads defaults to 4 in config.py.

Specify --use_tfboard to log the losses on Tensorboard.

NOTE: use --dataset keypoints_coco2017 when training for keypoint-rcnn.

The use of `--iter_size`

As in Caffe, the network is updated once (optimizer.step()) every iter_size iterations (forward + backward). This gives a larger effective batch size for training. Note that the step count is only increased after a network update.

python tools/train_net_step.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --bs 4 --iter_size 4

iter_size defaults to 1.
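
The following toy example sketches the accumulation pattern in plain Python, with a single scalar weight standing in for the network. It is not the training loop of this repository. In particular, averaging the accumulated gradient over iter_size is an assumption made for this sketch; check the training script for how the gradients are actually combined.

```python
def train_with_iter_size(samples, iter_size, lr=0.1):
    """Fit y = w * x with SGD, updating w once every iter_size samples.

    Toy stand-in for a training loop; returns the final weight and the
    number of updates (the "step count").
    """
    w = 0.0
    grad_accum = 0.0
    step = 0
    for i, (x, y) in enumerate(samples, start=1):
        # forward + backward: d/dw of 0.5 * (w * x - y) ** 2 is (w * x - y) * x
        # Assumption: gradients are averaged over the iter_size iterations.
        grad_accum += (w * x - y) * x / iter_size
        if i % iter_size == 0:
            w -= lr * grad_accum   # the equivalent of optimizer.step()
            grad_accum = 0.0       # the equivalent of optimizer.zero_grad()
            step += 1              # step count grows only after an update
    return w, step


samples = [(1.0, 2.0)] * 8
w_accum, steps_accum = train_with_iter_size(samples, iter_size=4)
w_plain, steps_plain = train_with_iter_size(samples, iter_size=1)
print(steps_accum, steps_plain)
```

Eight forward/backward passes give two updates with `iter_size=4` and eight updates with `iter_size=1`, which is why the step count and the iteration count differ.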

Finetune from a pretrained checkpoint

python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint}

or using Detectron's checkpoint file

python tools/train_net_step.py ... --load_detectron {path/to/the/checkpoint}

Resume training with the same dataset and batch size

python tools/train_net_step.py ... --load_ckpt {path/to/the/checkpoint} --resume

When resuming training, the step count and optimizer state will also be restored from the checkpoint. For the SGD optimizer, the optimizer state contains the momentum for each trainable parameter.

NOTE: --resume is not yet supported for --load_detectron

Set config options in command line

  python tools/train_net_step.py ... --no_save --set {config.name1} {value1} {config.name2} {value2} ...

For example, to run in debug mode:
```
python tools/train_net_step.py ... --no_save --set DEBUG True
```
This loads fewer annotations to accelerate training. Add --no_save to avoid saving any checkpoints or logs.
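
As an illustration of how a flat list of name/value pairs such as the one passed to --set can be applied to a nested config, here is a minimal sketch. It is not the parser used by this repository: `apply_overrides` is a hypothetical helper, and parsing values with `ast.literal_eval` is my choice for the example.

```python
import ast


def apply_overrides(cfg: dict, pairs: list) -> dict:
    """Apply [name1, value1, name2, value2, ...] to a nested dict in place.

    Dotted names such as "SOLVER.BASE_LR" address nested sections.
    Hypothetical helper for illustration only.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in name/value pairs")
    for name, raw in zip(pairs[0::2], pairs[1::2]):
        *parents, leaf = name.split(".")
        node = cfg
        for key in parents:
            node = node[key]              # KeyError for unknown sections
        if leaf not in node:
            raise KeyError(f"unknown config option: {name}")
        try:
            value = ast.literal_eval(raw)  # "True" -> True, "0.01" -> 0.01
        except (ValueError, SyntaxError):
            value = raw                    # keep plain strings as they are
        node[leaf] = value
    return cfg


cfg = {"DEBUG": False, "SOLVER": {"BASE_LR": 0.02, "LR_POLICY": "step"}}
apply_overrides(cfg, ["DEBUG", "True", "SOLVER.BASE_LR", "0.01"])
print(cfg)
```

Rejecting unknown option names early is a deliberate choice here: a typo in an option name then fails loudly instead of being silently ignored.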

Show command line help messages

python tools/train_net_step.py --help

Two Training Scripts

In short, use train_net_step.py.

In train_net_step.py:

SOLVER.LR_POLICY: steps_with_decay is supported.
Training warm up from Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour is supported.
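
For readers unfamiliar with warm up, the sketch below shows the kind of factor that multiplies the learning rate during the first iterations. It is my reading of the linear and constant warm up schemes, not code copied from this repository, and the values 500 and 1/3 are only example settings for SOLVER.WARM_UP_ITERS and SOLVER.WARM_UP_FACTOR.

```python
def warmup_multiplier(cur_iter, warm_up_iters, warm_up_factor, method="linear"):
    """Return the factor to multiply the learning rate by at cur_iter.

    Illustrative sketch: "constant" holds the factor fixed during warm up,
    "linear" ramps it from warm_up_factor up to 1.
    """
    if cur_iter >= warm_up_iters:
        return 1.0                      # warm up is over
    if method == "constant":
        return warm_up_factor
    if method == "linear":
        alpha = cur_iter / warm_up_iters
        return warm_up_factor * (1 - alpha) + alpha
    raise ValueError(f"unknown warm up method: {method}")


# Example values only.
start = warmup_multiplier(0, 500, 1 / 3)
middle = warmup_multiplier(250, 500, 1 / 3)
end = warmup_multiplier(500, 500, 1 / 3)
print(start, middle, end)
```

The multiplier starts at the warm up factor, rises linearly, and reaches 1 once the warm up iterations are done.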

(Deprecated) In train_net.py, some config options have no effect and are worth noting:

SOLVER.LR_POLICY, SOLVER.MAX_ITER, SOLVER.STEPS, SOLVER.LRS: For now, the training policy is controlled by these command line arguments:
- --epochs: How many epochs to train. One epoch means one pass through the whole training set. Defaults to 6.
- --lr_decay_epochs: Epochs at which to decay the learning rate. Decay happens at the beginning of an epoch. Epochs are 0-indexed. Defaults to [4, 5].
For more command line arguments, please refer to python train_net.py --help
SOLVER.WARM_UP_ITERS, SOLVER.WARM_UP_FACTOR, SOLVER.WARM_UP_METHOD: Training warm up is not supported.

Inference

Evaluate the training results

For example, test mask-rcnn on coco2017 val set

python tools/test_net.py --dataset coco2017 --cfg configs/baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml --load_ckpt {path/to/your/checkpoint}

Use --load_detectron to load Detectron's checkpoint. If multiple GPUs are available, add --multi-gpu-testing.

To specify a different output directory, use --output_dir {...}. Defaults to {the/parent/dir/of/checkpoint}/test

Visualize the training results on images

python tools/infer_simple.py --dataset coco --cfg configs/baselines/e2e_mask_rcnn_R-50-C4.yml --load_ckpt {path/to/your/checkpoint} --image_dir {dir/of/input/images}  --output_dir {dir/to/save/visualizations}

--output_dir defaults to infer_outputs.

Supported Network modules

Backbone:
- ResNet: ResNet50_conv4_body, ResNet50_conv5_body, ResNet101_Conv4_Body, ResNet101_Conv5_Body, ResNet152_Conv5_Body
- ResNeXt: [fpn_]ResNet101_Conv4_Body, [fpn_]ResNet101_Conv5_Body, [fpn_]ResNet152_Conv5_Body
- FPN: fpn_ResNet50_conv5_body, fpn_ResNet50_conv5_P2only_body, fpn_ResNet101_conv5_body, fpn_ResNet101_conv5_P2only_body, fpn_ResNet152_conv5_body, fpn_ResNet152_conv5_P2only_body
Box head: ResNet_roi_conv5_head, roi_2mlp_head, roi_Xconv1fc_head, roi_Xconv1fc_gn_head
Mask head: mask_rcnn_fcn_head_v0upshare, mask_rcnn_fcn_head_v0up, mask_rcnn_fcn_head_v1up, mask_rcnn_fcn_head_v1up4convs, mask_rcnn_fcn_head_v1up4convs_gn
Keypoints head: roi_pose_head_v1convX

NOTE: the naming is similar to the one used in Detectron. Just remove any prepended add_.

Supported Datasets

Only COCO is supported for now. However, the whole dataset library implementation is almost identical to Detectron's, so it should be easy to add more datasets supported by Detectron.

Configuration Options

Architecture specific configuration files are put under configs. The general configuration file lib/core/config.py has almost all the options with the same default values as in Detectron, so it is effortless to transform architecture specific configs from Detectron.

Some options from Detectron are not used because the corresponding functionalities are not implemented yet. For example, data augmentation on testing.

Extra options

MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS = True: Whether to load ImageNet pretrained weights.
- RESNETS.IMAGENET_PRETRAINED_WEIGHTS = '': Path to pretrained residual network weights. If it starts with '/', it is treated as an absolute path. Otherwise, it is treated as a path relative to ROOT_DIR.
TRAIN.ASPECT_CROPPING = False, TRAIN.ASPECT_HI = 2, TRAIN.ASPECT_LO = 0.5: Options for aspect cropping to restrict the image aspect ratio range.
RPN.OUT_DIM_AS_IN_DIM = True, RPN.OUT_DIM = 512, RPN.CLS_ACTIVATION = 'sigmoid': The official implementation of RPN has the same number of input and output feature channels and uses sigmoid as the activation function for fg/bg class prediction. jwyang's implementation fixes the number of output channels to 512 and uses softmax as the activation function.
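
To show what restricting the aspect ratio range means in practice, here is a rough sketch that computes the size an image would be cropped to. It is not the cropping code of this repository: the function `aspect_crop_size` is hypothetical, and defining the aspect ratio as width / height is an assumption.

```python
def aspect_crop_size(width, height, aspect_hi=2.0, aspect_lo=0.5):
    """Return the (width, height) after cropping to keep the aspect ratio
    within [aspect_lo, aspect_hi].

    Assumption: aspect ratio = width / height. Illustrative helper only.
    """
    ratio = width / height
    if ratio > aspect_hi:
        # Too wide: shorten the width so that width / height == aspect_hi.
        width = int(height * aspect_hi)
    elif ratio < aspect_lo:
        # Too tall: shorten the height so that width / height == aspect_lo.
        height = int(width / aspect_lo)
    return width, height


wide = aspect_crop_size(1000, 200)    # ratio 5.0, above ASPECT_HI
tall = aspect_crop_size(200, 1000)    # ratio 0.2, below ASPECT_LO
normal = aspect_crop_size(640, 480)   # ratio within range, unchanged
print(wide, tall, normal)
```

Only the size is computed here; where the crop window is placed inside the image, and how the annotations are adjusted, are left out of the sketch.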

How to transform configuration files from Detectron

Remove MODEL.NUM_CLASSES. It will be set according to the dataset specified by --dataset.
Remove TRAIN.WEIGHTS, TRAIN.DATASETS and TEST.DATASETS.
For module type options (e.g. MODEL.CONV_BODY, FAST_RCNN.ROI_BOX_HEAD ...), remove add_ from the string if it exists.
If you want to load ImageNet pretrained weights for the model, add RESNETS.IMAGENET_PRETRAINED_WEIGHTS pointing to the pretrained weight file. If not, set MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS to False.
[Optional] Delete OUTPUT_DIR: . at the last line.
Do NOT change the option NUM_GPUS in the config file. It is used to infer the original batch size for training, and the learning rate will be linearly scaled according to the batch size change. Proper learning rate adjustment is important for training with a different batch size.
For group normalization baselines, add RESNETS.USE_GN: True.
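
The steps above can be sketched as a small function working on a config that has already been loaded into a nested dict. This is not a tool shipped with this repository: `convert_detectron_cfg` and `MODULE_OPTIONS` are made-up names, the module option list only contains the two options named above, and the example values (including the weights path) are placeholders.

```python
import copy

# Assumption: only the two module type options named in the text are listed;
# extend this list for the other heads your config uses.
MODULE_OPTIONS = [("MODEL", "CONV_BODY"), ("FAST_RCNN", "ROI_BOX_HEAD")]


def convert_detectron_cfg(detectron_cfg, imagenet_weights=None,
                          module_options=MODULE_OPTIONS):
    """Return a converted copy of a Detectron config dict (illustrative)."""
    cfg = copy.deepcopy(detectron_cfg)
    # Options that are derived from --dataset or from command line arguments.
    cfg.get("MODEL", {}).pop("NUM_CLASSES", None)
    cfg.get("TRAIN", {}).pop("WEIGHTS", None)
    cfg.get("TRAIN", {}).pop("DATASETS", None)
    cfg.get("TEST", {}).pop("DATASETS", None)
    cfg.pop("OUTPUT_DIR", None)              # optional step
    # Strip "add_" from module type options.
    for section, option in module_options:
        value = cfg.get(section, {}).get(option)
        if isinstance(value, str):
            cfg[section][option] = value.replace("add_", "")
    # Either point to ImageNet weights or disable loading them.
    if imagenet_weights:
        cfg.setdefault("RESNETS", {})["IMAGENET_PRETRAINED_WEIGHTS"] = imagenet_weights
    else:
        cfg.setdefault("MODEL", {})["LOAD_IMAGENET_PRETRAINED_WEIGHTS"] = False
    return cfg                               # NUM_GPUS is left untouched


detectron_cfg = {
    "MODEL": {"CONV_BODY": "FPN.add_fpn_ResNet50_conv5_body", "NUM_CLASSES": 81},
    "NUM_GPUS": 8,
    "FAST_RCNN": {"ROI_BOX_HEAD": "fast_rcnn_heads.add_roi_2mlp_head"},
    "TRAIN": {"WEIGHTS": "path/to/weights.pkl", "DATASETS": ("coco_2014_train",)},
    "TEST": {"DATASETS": ("coco_2014_minival",)},
    "OUTPUT_DIR": ".",
}
converted = convert_detectron_cfg(detectron_cfg, "data/pretrained_model/R-50.pkl")
print(converted)
```

The function works on a copy, so the original dict can still be compared against the result when checking a conversion by hand.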

My nn.DataParallel

Keep certain keyword inputs on CPU: The official DataParallel broadcasts all the input Variables to GPUs. However, many RPN related computations are done on the CPU, and it is unnecessary to put those inputs on GPUs.
Allow different blob sizes for different GPUs: To save GPU memory, images are padded separately for each GPU.
Work with returned values of dictionary type.
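
A quick back-of-the-envelope sketch shows why padding separately for each GPU can save memory. It only counts padded pixels per channel and does not touch any GPU; the function name `padded_blob_elems` and the example image sizes are made up for illustration.

```python
def padded_blob_elems(sizes):
    """Number of pixels per channel in a blob that holds all images in
    `sizes` after padding each one to the largest height and width.

    `sizes` is a list of (height, width) pairs. Illustrative helper only.
    """
    max_h = max(h for h, w in sizes)
    max_w = max(w for h, w in sizes)
    return len(sizes) * max_h * max_w


# Example: 4 images split over 2 GPUs, 2 images each.
images = [(800, 1200), (800, 1333), (600, 800), (608, 800)]
per_gpu = [images[:2], images[2:]]

shared = padded_blob_elems(images)                            # one common size
separate = sum(padded_blob_elems(chunk) for chunk in per_gpu)  # per-GPU sizes
print(shared, separate)
```

In this example the second GPU only needs to hold blobs padded to 608 x 800 instead of 800 x 1333, so the total padded size drops from 4,265,600 to 3,105,600 pixels per channel.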

Benchmark

BENCHMARK.md
