This repository contains the code of our CVPR'15 paper *Learning from Massive Noisy Labeled Data for Image Classification* (paper link).
Clone this repository.

```shell
# Make sure to clone with --recursive to get the modified Caffe
git clone --recursive https://github.com/Cysu/noisy_label.git
```
Build the Caffe.

```shell
cd external/caffe
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make py
cd -
```
Set up an experiment directory. You can either create a new one under `external/`, or make a link to another existing directory.
```shell
mkdir -p external/exp
```
or
```shell
ln -s /path/to/your/exp/directory external/exp
```
Download the CIFAR-10 data (python version).
```shell
scripts/cifar10/download_cifar10.sh
```
Synthesize label noise and prepare LMDBs. This will corrupt the labels of 40k randomly selected training images, while leaving the labels of the other 10k images unchanged.
```shell
scripts/cifar10/make_db.sh 0.3
```
The parameter 0.3 controls the level of label noise. It can be any number in [0, 1].
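For intuition, label-noise synthesis at a given noise level can be sketched as below. This is a hypothetical illustration only, not the actual implementation in `scripts/cifar10/make_db.sh`; the function name `corrupt_labels` and the uniform-flip scheme are assumptions.

```python
import random

def corrupt_labels(labels, noise_level, num_classes=10, seed=0):
    """Hypothetical sketch: flip a noise_level fraction of labels to a
    random *different* class. The real script may use another scheme."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_corrupt = int(noise_level * len(labels))
    for i in rng.sample(range(len(labels)), n_corrupt):
        # Pick any class other than the current one.
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy

# On the 40k noisy split with noise level 0.3, this flips 12k labels.
labels = [i % 10 for i in range(40000)]
noisy = corrupt_labels(labels, 0.3)
print(sum(a != b for a, b in zip(labels, noisy)))  # 12000
```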
Run a series of experiments.

```shell
# Train a CIFAR10-quick model using only the 10k clean labeled images
scripts/cifar10/train_clean.ssh

# Baseline: treat 40k noisy labels as ground truth and finetune
# from the previous model
scripts/cifar10/train_noisy_gt_ft_clean.sh

# Our method
scripts/cifar10/train_ntype.sh
scripts/cifar10/init_noisy_label_loss.sh
scripts/cifar10/train_noisy_label_loss.sh
```
We provide the training logs in `logs/cifar10/` for reference.
Clothing1M is the dataset we proposed in our paper.
Download the dataset. Please contact tong.xiao.work[at]gmail[dot]com to get the download link. Untar the images and unzip the annotations under `external/exp/datasets/clothing1M`. The directory structure should be

```
external/exp/datasets/clothing1M/
├── category_names_chn.txt
├── category_names_eng.txt
├── clean_label_kv.txt
├── clean_test_key_list.txt
├── clean_train_key_list.txt
├── clean_val_key_list.txt
├── images
│   ├── 0
│   ├── ⋮
│   └── 9
├── noisy_label_kv.txt
├── noisy_train_key_list.txt
├── README.md
└── venn.png
```
Make the LMDBs and compute the matrix C to be used.
```shell
scripts/clothing1M/make_db.sh
```
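The matrix C mentioned above can be understood as a confusion matrix between clean and noisy labels. The sketch below is a hypothetical illustration; the helper name `estimate_c` and the row-normalization convention are assumptions, and the actual computation is performed by the script.

```python
def estimate_c(clean_labels, noisy_labels, num_classes):
    """Hypothetical sketch: C[i][j] estimates the probability that an
    image with clean label i received noisy label j."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for c, n in zip(clean_labels, noisy_labels):
        counts[c][n] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Row-normalize so each row is a probability distribution.
        matrix.append([x / total if total else 0.0 for x in row])
    return matrix

# Toy example: half of the class-0 images were mislabeled as class 1.
C = estimate_c([0, 0, 1, 1], [0, 1, 1, 1], 2)
print(C)  # [[0.5, 0.5], [0.0, 1.0]]
```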
Run experiments for our method
```shell
# Download the ImageNet pretrained CaffeNet
wget -P external/exp/snapshots/ http://dl.caffe.berkeleyvision.org/bvlc_reference_caffenet.caffemodel

# Train the clothing prediction CNN using only the clean labeled images
scripts/clothing1M/train_clean.sh

# Train the noise type prediction CNN
scripts/clothing1M/train_ntype.sh

# Train the whole net using noisy labeled data
scripts/clothing1M/init_noisy_label_loss.sh
scripts/clothing1M/train_noisy_label_loss.sh
```
We provide the training logs in `logs/clothing1M/` for reference. A final trained model is also provided here. To test its performance, please download the model, place it under `external/exp/snapshots/clothing1M/`, and then run
```shell
# Run the test
external/caffe/build/tools/caffe test \
    -model models/clothing1M/noisy_label_loss_test.prototxt \
    -weights external/exp/snapshots/clothing1M/noisy_label_loss_inference.caffemodel \
    -iterations 106 \
    -gpu 0
```
The self-brewed `external/caffe` supports data parallelism over multiple GPUs using MPI. One can accelerate the training / test process by

- Compiling the Caffe with MPI enabled
- Tweaking the training shell scripts to use multiple GPUs, for example,

```shell
mpirun -n 2 ... -gpu 0,1
```
Detailed instructions are listed here.
```
@inproceedings{xiao2015learning,
  title={Learning from Massive Noisy Labeled Data for Image Classification},
  author={Xiao, Tong and Xia, Tian and Yang, Yi and Huang, Chang and Wang, Xiaogang},
  booktitle={CVPR},
  year={2015}
}
```