dk-liang/FIDTM

[IEEE TMM 23] Focal Inverse Distance Transform Maps for Crowd Localization

License

MIT license

Folders and files

Networks/HR_Net
data
image
local_eval
.DS_Store
LICENSE
README.md
config.py
dataset.py
image.py
make_npydata.py
test.py
train_baseline.py
utils.py
video_demo.py

Focal Inverse Distance Transform Map

[Project page] [paper]
An official implementation of "Focal Inverse Distance Transform Map for Crowd Localization" (accepted by IEEE TMM).
We propose a novel label named the Focal Inverse Distance Transform (FIDT) map, which represents the location of each head.

News

We now provide the predicted coordinates txt files, so other researchers can use them to evaluate localization performance fairly.

Overview

Visualizations

Compared with density map

Visualizations for bounding boxes

Progress

Testing Code (2021.3.16)
Training baseline code (2021.4.29)
Pretrained model
- ShanghaiA (2021.3.16)
- ShanghaiB (2021.3.16)
- UCF_QNRF (2021.4.29)
- JHU-Crowd++ (2021.4.29)
- NWPU-Crowd (2021.4.29)
Bounding boxes visualizations (2021.3.24)
Video demo (2021.3.29)
Predicted coordinates txt file (2021.8.20)

Environment

python >=3.6
pytorch >=1.4
opencv-python >=4.0
scipy >=1.4.0
h5py >=2.10
pillow >=7.0.0
imageio >=1.18
nni >=2.0 (python3 -m pip install --upgrade nni)
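
The requirements above can be checked before running anything else. The following is a minimal sketch, not part of the repository: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, the pip distribution name for PyTorch is assumed to be `torch`, and `importlib.metadata` needs Python 3.8 or newer. A variant distribution such as `opencv-python-headless` would be reported as missing.

```python
from importlib import metadata  # standard library from Python 3.8 onwards

# Minimum versions copied from the list above; "torch" is the pip name of PyTorch.
MINIMUMS = {
    "torch": "1.4",
    "opencv-python": "4.0",
    "scipy": "1.4.0",
    "h5py": "2.10",
    "pillow": "7.0.0",
    "imageio": "1.18",
    "nni": "2.0",
}


def version_tuple(text):
    """Turn '1.13.1+cu117' into (1, 13, 1), keeping only leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return a dict mapping each package to 'ok', 'missing', or 'too old (<version>)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old (%s)" % installed
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(package, status)
```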

Datasets

Download ShanghaiTech dataset from Baidu-Disk, password: cjnx; or Google-Drive
Download UCF-QNRF dataset from here
Download JHU-CROWD++ dataset from here
Download NWPU-CROWD dataset from Baidu-Disk, password: 3awa; or Google-Drive

Generate FIDT Ground-Truth

cd data
python fidt_generate_xx.py

“xx” is the dataset name: one of sh, jhu, qnrf, or nwpu. Change the dataset path to your own before running.
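
To make the ground-truth step concrete, here is a minimal sketch of how an FIDT map could be computed from head coordinates. It assumes the formulation from the paper, I = 1 / (P^(αP + β) + C), where P is the distance from a pixel to its nearest head, with α = 0.02, β = 0.75 and C = 1. The function name `fidt_map` and the brute-force distance computation are illustrative only and are not the repository's `fidt_generate_xx.py`. At real image sizes, a distance transform such as `scipy.ndimage.distance_transform_edt` or `cv2.distanceTransform` would be the practical choice, since this version allocates an array of shape (heads, height, width).

```python
import numpy as np


def fidt_map(points, height, width, alpha=0.02, beta=0.75, c=1.0):
    """Build an FIDT-style map from (x, y) head coordinates given in pixels.

    The constants are assumptions taken from the paper's formulation.
    """
    pts = np.asarray(list(points), dtype=np.float64).reshape(-1, 2)
    if len(pts) == 0:
        # No annotated heads: return an all-zero map.
        return np.zeros((height, width), dtype=np.float32)

    ys, xs = np.mgrid[0:height, 0:width]
    # Distance from every pixel to every head, shape (N, H, W).
    dx = xs[None] - pts[:, 0, None, None]
    dy = ys[None] - pts[:, 1, None, None]
    # Distance to the nearest head, shape (H, W).
    nearest = np.sqrt(dx ** 2 + dy ** 2).min(axis=0)

    # The response is 1 at a head and decays quickly with distance.
    fidt = 1.0 / (np.power(nearest, alpha * nearest + beta) + c)
    return fidt.astype(np.float32)


if __name__ == "__main__":
    demo = fidt_map([(2, 3), (6, 6)], height=8, width=8)
    print(np.round(demo, 2))
```

Each head shows up as a sharp local maximum of value 1, which is what allows head positions to be read back from a predicted map.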

Model

Download the pretrained model from Baidu-Disk, password: gqqm, or OneDrive

Quickly test

git clone https://github.com/dk-liang/FIDTM.git

Download Dataset and Model
Generate FIDT map ground-truth

Generate image file list: python make_npydata.py

Test example:

python test.py --dataset ShanghaiA --pre ./model/ShanghaiA/model_best.pth --gpu_id 0
python test.py --dataset ShanghaiB --pre ./model/ShanghaiB/model_best.pth --gpu_id 1
python test.py --dataset UCF_QNRF --pre ./model/UCF_QNRF/model_best.pth --gpu_id 2
python test.py --dataset JHU --pre ./model/JHU/model_best.pth --gpu_id 3

If you want to generate bounding boxes,

python test.py --test_dataset ShanghaiA --pre model_best.pth --visual True

(Remember to change the dataset path in test.py.)

If you want to test a video,

python video_demo.py --pre model_best.pth --video_path demo.mp4

(The output video is saved to ./demo.avi. By default, the video size is reduced by a factor of two for inference. You can change the input size in video_demo.py.)

Visit bilibili or YouTube to watch the video demonstration. The original demo video can be downloaded from Baidu-Disk, password: cebh

More config information is provided in config.py

Evaluation localization performance

ShanghaiTech Part A	Precision	Recall	F1-measure
σ=4	59.1%	58.2%	58.6%
σ=8	78.1%	77.0%	77.6%

ShanghaiTech Part B	Precision	Recall	F1-measure
σ=4	64.9%	64.5%	64.7%
σ=8	83.9%	83.2%	83.5%

JHU_Crowd++ (test set)	Precision	Recall	F1-measure
σ=4	38.9%	38.7%	38.8%
σ=8	62.5%	62.4%	62.4%

UCF_QNRF	Av. Precision	Av. Recall	Av. F1-measure
σ=1, ..., 100	84.49%	80.10%	82.23%

NWPU-Crowd (val set)	Precision	Recall	F1-measure
σ=σ_l	82.2%	75.9%	78.9%
σ=σ_s	76.7%	70.9%	73.7%
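
For readers unfamiliar with these metrics, the sketch below shows how precision, recall, and F1-measure can be derived from predicted and ground-truth points at a distance threshold σ. It is a simplified illustration under stated assumptions, not the code in local_eval: the function names are hypothetical, a prediction is assumed to count as a true positive when it lies within σ pixels of an unmatched ground-truth point, and matching is done greedily by increasing distance rather than with the matching used by the NWPU evaluation code.

```python
import math


def count_true_positives(pred, gt, sigma):
    """Greedy one-to-one matching: closest pairs first, each point used at most once."""
    pairs = sorted(
        (math.hypot(px - gx, py - gy), i, j)
        for i, (px, py) in enumerate(pred)
        for j, (gx, gy) in enumerate(gt)
    )
    used_pred, used_gt = set(), set()
    true_positives = 0
    for dist, i, j in pairs:
        if dist > sigma:
            break  # all remaining pairs are farther apart than the threshold
        if i in used_pred or j in used_gt:
            continue
        used_pred.add(i)
        used_gt.add(j)
        true_positives += 1
    return true_positives


def precision_recall_f1(true_positives, n_pred, n_gt):
    """Precision = TP / predictions, recall = TP / ground truth, F1 = harmonic mean."""
    precision = true_positives / n_pred if n_pred else 0.0
    recall = true_positives / n_gt if n_gt else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    pred = [(10, 10), (50, 50), (90, 90)]
    gt = [(12, 11), (50, 58), (200, 200), (300, 300)]
    for sigma in (4, 8):
        tp = count_true_positives(pred, gt, sigma)
        print(sigma, precision_recall_f1(tp, len(pred), len(gt)))
```

A larger σ accepts looser matches, which is why the σ=8 rows in the tables are consistently higher than the σ=4 rows.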

Evaluation example:

For ShanghaiTech, JHU-Crowd (test set), and NWPU-Crowd (val set):

cd ./local_eval
python eval.py ShanghaiA
python eval.py ShanghaiB
python eval.py JHU
python eval.py NWPU

For UCF-QNRF dataset:

python eval_qnrf.py --data_path path/to/UCF-QNRF_ECCV18

For NWPU-Crowd (test set), please submit the nwpu_pred_fidt.txt to the website.

We also provide the predicted coordinates txt files in './local_eval/point_files/', and you can use them to fairly evaluate other localization metrics.

(We hope the community can provide the predicted coordinates file to help other researchers fairly evaluate the localization performance.)

Tips:
The GT format is:

1 total_count x1 y1 4 8 x2 y2 4 8 .....
2 total_count x1 y1 4 8 x2 y2 4 8 .....

The predicted format is:

1 total_count x1 y1 x2 y2.....
2 total_count x1 y1 x2 y2.....

The evaluation code is modified from NWPU.
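
The two line formats above can be read with a small parser. This is a sketch under assumptions rather than the repository's loader: the function names are hypothetical, fields are assumed to be whitespace-separated, the first field is taken to be an integer image id, and each point is assumed to occupy two fields in the predicted format (x, y) and four in the GT format (x, y and the two trailing values shown as 4 and 8).

```python
def parse_point_line(line, fields_per_point=2):
    """Parse 'image_id total_count v1 v2 ...' into (image_id, list of point tuples)."""
    tokens = line.split()
    image_id = int(tokens[0])
    total_count = int(float(tokens[1]))
    values = [float(token) for token in tokens[2:]]
    if len(values) != total_count * fields_per_point:
        raise ValueError(
            "image %d: expected %d values, found %d"
            % (image_id, total_count * fields_per_point, len(values))
        )
    points = [
        tuple(values[start:start + fields_per_point])
        for start in range(0, len(values), fields_per_point)
    ]
    return image_id, points


def load_point_file(path, fields_per_point=2):
    """Read a whole txt file into a dict mapping image id to its points."""
    result = {}
    with open(path) as handle:
        for line in handle:
            if line.strip():
                image_id, points = parse_point_line(line, fields_per_point)
                result[image_id] = points
    return result


if __name__ == "__main__":
    print(parse_point_line("1 2 10.5 20 30 40"))                  # predicted format
    print(parse_point_line("2 1 5 6 4 8", fields_per_point=4))    # GT format
```

Checking the value count against total_count catches truncated or misaligned lines before they reach the evaluation step.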

Training

The training strategy is very simple. You can replace the density map with the FIDT map in any regressor for training.
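
As an illustration of that swap, the sketch below trains a regressor against FIDT targets instead of density maps. Everything in it is an assumption made for the example: `TinyRegressor` is a hypothetical stand-in for a real network such as HRNet, the tensors are random placeholders for images and FIDT maps, and a plain MSE loss is used, which may differ from the loss in train_baseline.py.

```python
import torch
import torch.nn as nn


class TinyRegressor(nn.Module):
    """Hypothetical stand-in for any network that regresses a one-channel map."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def train_step(model, optimizer, images, fidt_targets):
    """One optimization step; the target is an FIDT map rather than a density map."""
    model.train()
    optimizer.zero_grad()
    prediction = model(images)
    loss = nn.functional.mse_loss(prediction, fidt_targets)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = TinyRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Random placeholders: a batch of 2 RGB crops and matching one-channel targets.
images = torch.rand(2, 3, 32, 32)
fidt_targets = torch.rand(2, 1, 32, 32)

losses = [train_step(model, optimizer, images, fidt_targets) for _ in range(20)]
print("first loss %.4f, last loss %.4f" % (losses[0], losses[-1]))
```

Only the supervision target changes; the network, optimizer, and data pipeline can stay as they are in an existing density-map setup.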

If you want to train based on HRNet (borrowed from the IIM-code link), please first download the ImageNet pre-trained models from the official link, and replace the pre-trained model path in HRNET/config.py (__C.PRE_HR_WEIGHTS).

Here, we provide the training baseline code:

Training baseline example:

python train_baseline.py --dataset ShanghaiA --crop_size 256 --save_path ./save_file/ShanghaiA
python train_baseline.py --dataset ShanghaiB --crop_size 256 --save_path ./save_file/ShanghaiB
python train_baseline.py --dataset UCF_QNRF --crop_size 512 --save_path ./save_file/QNRF
python train_baseline.py --dataset JHU --crop_size 512 --save_path ./save_file/JHU

For ShanghaiTech, you can train on a GPU with 8 GB of memory. For other datasets, please use a single GPU with 24 GB of memory or multiple GPUs for training.

Improvements

We have not studied the effect of some hyper-parameters. Thus, the results can be further improved by using some tricks, such as adjusting the learning rate, batch size, crop size, and data augmentation.

Reference

If you find this project useful for your research, please cite:

@article{liang2022focal,
  title={Focal inverse distance transform maps for crowd localization},
  author={Liang, Dingkang and Xu, Wei and Zhu, Yingying and Zhou, Yu},
  journal={IEEE Transactions on Multimedia},
  year={2022},
  publisher={IEEE}
}
