Commit 6d4afec

committed: Update the readme with more details for training and evaluation

1 parent 093989a · commit 6d4afec

3 files changed: +74 additions, -12 deletions


.gitignore

Lines changed: 0 additions & 1 deletion
@@ -135,4 +135,3 @@ demo/data

 # logs and checkpoints
 work_dirs/
-tools/*.sh

README.md

Lines changed: 47 additions & 11 deletions
@@ -81,6 +81,7 @@ Building upon this database, we introduce a baseline framework named <b>Embodied

 ## 🔥 News

+- \[2024-03\] We first release the data and baselines for the challenge. Please fill in the [form](https://docs.google.com/forms/d/e/1FAIpQLScUXEDTksGiqHZp31j7Zp7zlCNV7p_08uViwP_Nbzfn3g6hhw/viewform?usp=sf_link) to apply for downloading the data and try our baselines. We welcome any feedback!
 - \[2024-02\] We will co-organize [Autonomous Grand Challenge](https://opendrivelab.com/challenge2024/) in CVPR 2024. Welcome to try the Multi-View 3D Visual Grounding track! We will release more details about the challenge with the baseline after the Chinese New Year.
 - \[2023-12\] We release the [paper](./assets/EmbodiedScan.pdf) of EmbodiedScan. Please check the [webpage](https://tai-wang.github.io/embodiedscan) and view our demos!

@@ -146,8 +147,6 @@ We provide a demo for running EmbodiedScan's model on a sample scan. Please refe

 ## 📦 Model and Benchmark

-We will release the code for model training and benchmark with pretrained checkpoints in the 2024 Q1.
-
 ### Model Overview

 <p align="center">
@@ -175,31 +174,68 @@ Embodied Perceptron accepts RGB-D sequence with any number of views along with t
 <video src="assets/scannet_two_bed_demo.mp4" controls>
 </video>-->

+### Training and Inference
+
+We provide configs for different tasks [here](configs/) and you can run the train and test scripts in the [tools folder](tools/) for training and inference.
+For example, to train a multi-view 3D detection model with PyTorch, just run:
+
+```bash
+python tools/train.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet --launcher="pytorch"
+```
+
+Or, on a cluster with multiple machines, run the script with the slurm launcher, following the sample script provided [here](tools/mv-grounding.sh).
+
+NOTE: To run the multi-view 3D grounding experiments, please first download the pretrained 3D detection model to accelerate the training procedure. After downloading the detection checkpoint, please check that the path used in the config (for example, the `load_from` [here](https://github.com/OpenRobotLab/EmbodiedScan/blob/main/configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py#L210)) is correct.
+
+To run inference and evaluate the model (e.g., the checkpoint `work_dirs/mv-3ddet/epoch_12.pth`), just run the test script:
+
+```bash
+python tools/test.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py work_dirs/mv-3ddet/epoch_12.pth --launcher="pytorch"
+```
+
 ### Benchmark

-Please see the [paper](./assets/EmbodiedScan.pdf) for details of our two benchmarks, fundamental 3D perception and language-grounded benchmarks. This dataset is still scaling up and the benchmark is being polished and extended. Please stay tuned for our recent updates.
+We preliminarily provide several baseline results here with their logs and pretrained models.
+
+Note that the performance differs slightly from the results reported in the paper because we re-split the original training set into the released training and validation sets, while keeping the original validation set as the test set for the public benchmark.
+
+#### Multi-View 3D Detection
+
+| Method | Input | AP@0.25 | AR@0.25 | AP@0.5 | AR@0.5 | Download |
+|:------:|:-----:|:-------:|:-------:|:------:|:------:|:--------:|
+| [Baseline](configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py) | RGB-D | 15.22 | 52.23 | 8.13 | 26.66 | [Model](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/Efl363DOsXdAiikGcHIC3aQB_rjqHKsgxACyUgrHzqRmMA?e=XQDeY7) \| [Log](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/ET5FOsjHqOBBsGs_WHeUIpUBY-iLXeWdYPNeWZ3nh9wbYg?e=1fsirH) |
+
+#### Multi-View 3D Visual Grounding
+
+| Method | AP@0.25 | AP@0.5 | Download |
+|:------:|:-------:|:------:|:--------:|
+| [Baseline-Mini](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py) | 33.59 | 14.40 | [Model](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/EbrSZQM6bLROuVdG5MgRhqABJH0Cs91vHE9B-PfjZXvE0w?e=D5wbIK) \| [Log](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/EbhDHknA5nNMiBTiYZhnyCIBSZ881MJXUSfPgQGMm-spEw?e=Fp9fwZ) |
+| [Baseline-Mini (w/ FCAF box coder)](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof_fcaf-coder.py) | - | - | - |
+| [Baseline-Full](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof-full.py) | - | - | - |
+
+Please see the [paper](./assets/EmbodiedScan.pdf) for more details of our two benchmarks, the fundamental 3D perception benchmark and the language-grounded benchmark. This dataset is still scaling up and the benchmark is being polished and extended. Please stay tuned for our recent updates.

 ## 📝 TODO List

 - \[x\] Release the paper and partial codes for datasets.
 - \[x\] Release EmbodiedScan annotation files.
 - \[x\] Release partial codes for models and evaluation.
 - \[ \] Polish dataset APIs and related codes.
-- \[ \] Release Embodied Perceptron pretrained models.
-- \[ \] Release multi-modal datasets and codes.
-- \[ \] Release codes for baselines and benchmarks.
+- \[x\] Release Embodied Perceptron pretrained models.
+- \[x\] Release multi-modal datasets and codes.
+- \[x\] Release codes for baselines and benchmarks.
 - \[ \] Full release and further updates.

 ## 🔗 Citation

 If you find our work helpful, please cite:

 ```bibtex
-@article{wang2023embodiedscan,
-    author={Wang, Tai and Mao, Xiaohan and Zhu, Chenming and Xu, Runsen and Lyu, Ruiyuan and Li, Peisen and Chen, Xiao and Zhang, Wenwei and Chen, Kai and Xue, Tianfan and Liu, Xihui and Lu, Cewu and Lin, Dahua and Pang, Jiangmiao},
-    title={EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI},
-    journal={Arxiv},
-    year={2023}
+@inproceedings{wang2023embodiedscan,
+    title={EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI},
+    author={Wang, Tai and Mao, Xiaohan and Zhu, Chenming and Xu, Runsen and Lyu, Ruiyuan and Li, Peisen and Chen, Xiao and Zhang, Wenwei and Chen, Kai and Xue, Tianfan and Liu, Xihui and Lu, Cewu and Lin, Dahua and Pang, Jiangmiao},
+    year={2024},
+    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 }
 ```

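The training and test commands added above invoke plain `python` with `--launcher="pytorch"`. Assuming `tools/train.py` follows the usual mmengine launcher convention (an assumption, not something stated in this commit), that launcher expects the distributed environment to be set up externally, for example by `torchrun`, so a single-node multi-GPU run would look roughly like the sketch below; the GPU count and port are illustrative placeholders.

```bash
# Hedged sketch, not part of the commit: single-node, 8-GPU training launched via
# torchrun, assuming tools/train.py follows the standard mmengine "pytorch" launcher
# convention of reading RANK/WORLD_SIZE/LOCAL_RANK from the environment.
torchrun --nproc_per_node=8 --master_port=29500 \
    tools/train.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py \
    --work-dir=work_dirs/mv-3ddet \
    --launcher="pytorch"
```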

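The NOTE in the new "Training and Inference" section asks readers to verify the `load_from` path in the grounding config after downloading the detection checkpoint. Assuming the config system supports mmengine-style `--cfg-options` overrides (the new slurm script below uses the same flag for the port, though overriding `load_from` this way is an assumption), the path could also be set at launch time instead of editing the config file; the checkpoint location here is only a placeholder.

```bash
# Hedged sketch, not part of the commit: point load_from at a locally downloaded
# detection checkpoint without editing the grounding config file.
python tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py \
    --work-dir=work_dirs/mv-grounding \
    --cfg-options load_from=work_dirs/mv-3ddet/epoch_12.pth
```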
tools/mv-grounding.sh (new file)

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+#!/usr/bin/env bash
+
+set -x
+
+CKPT_PATH=/mnt/petrelfs/wangtai/EmbodiedScan/work_dirs
+PARTITION=test
+JOB_NAME=mv-grounding-challenge-benchmark
+TASK=mv-grounding-challenge-benchmark
+CONFIG=configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py
+WORK_DIR=${CKPT_PATH}/${TASK}
+CKPT=${CKPT_PATH}/${TASK}/latest.pth
+CPUS_PER_TASK=16
+GPUS=8
+GPUS_PER_NODE=8
+PORT=29320
+
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+export NCCL_IB_DISABLE=1; export NCCL_P2P_DISABLE=1; \
+srun -p ${PARTITION} \
+  --job-name=${JOB_NAME} \
+  --gres=gpu:${GPUS_PER_NODE} \
+  --ntasks=${GPUS} \
+  --ntasks-per-node=${GPUS_PER_NODE} \
+  --cpus-per-task=${CPUS_PER_TASK} \
+  --kill-on-bad-exit=1 \
+  --quotatype=reserved \
+  python -u tools/train.py ${CONFIG} --work-dir=${WORK_DIR} --launcher="slurm" --cfg-options env_cfg.dist_cfg.port=${PORT} --resume

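The new tools/mv-grounding.sh script hard-codes cluster-specific values (`CKPT_PATH`, `PARTITION`, the reserved `--quotatype`, and the port). Assuming those are first adapted to the target slurm cluster, the script submits the job itself via `srun`, so it is typically run directly from a login node:

```bash
# Assumes PARTITION, CKPT_PATH, and PORT in tools/mv-grounding.sh have been
# edited for the local slurm cluster; the script calls srun internally.
bash tools/mv-grounding.sh
```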