Commit 6d4afec

committed: Update the readme with more details for training and evaluation

1 parent 093989a · commit 6d4afec

3 files changed: +74 additions, -12 deletions


.gitignore

Lines changed: 0 additions & 1 deletion
@@ -135,4 +135,3 @@ demo/data

 # logs and checkpoints
 work_dirs/
-tools/*.sh

README.md

Lines changed: 47 additions & 11 deletions
@@ -81,6 +81,7 @@ Building upon this database, we introduce a baseline framework named <b>Embodied

 ## 🔥 News

+- \[2024-03\] We first release the data and baselines for the challenge. Please fill in the [form](https://docs.google.com/forms/d/e/1FAIpQLScUXEDTksGiqHZp31j7Zp7zlCNV7p_08uViwP_Nbzfn3g6hhw/viewform?usp=sf_link) to apply for downloading the data and try our baselines. We welcome any feedback!
 - \[2024-02\] We will co-organize [Autonomous Grand Challenge](https://opendrivelab.com/challenge2024/) in CVPR 2024. Welcome to try the Multi-View 3D Visual Grounding track! We will release more details about the challenge with the baseline after the Chinese New Year.
 - \[2023-12\] We release the [paper](./assets/EmbodiedScan.pdf) of EmbodiedScan. Please check the [webpage](https://tai-wang.github.io/embodiedscan) and view our demos!

@@ -146,8 +147,6 @@ We provide a demo for running EmbodiedScan's model on a sample scan. Please refe

 ## 📦 Model and Benchmark

-We will release the code for model training and benchmark with pretrained checkpoints in the 2024 Q1.
-
 ### Model Overview

 <p align="center">
@@ -175,31 +174,68 @@ Embodied Perceptron accepts RGB-D sequence with any number of views along with t
 <video src="assets/scannet_two_bed_demo.mp4" controls>
 </video>-->

+### Training and Inference
+
+We provide configs for different tasks [here](configs/) and you can run the train and test scripts in the [tools folder](tools/) for training and inference.
+For example, to train a multi-view 3D detection model with PyTorch, just run:
+
+```bash
+python tools/train.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py --work-dir=work_dirs/mv-3ddet --launcher="pytorch"
+```
+
+Or, on a cluster with multiple machines, run the script with the slurm launcher, following the sample script provided [here](tools/mv-grounding.sh).
+
+NOTE: To run the multi-view 3D grounding experiments, please first download the pretrained 3D detection model to accelerate the training procedure. After downloading the detection checkpoint, please check that the path used in the config (for example, the `load_from` [here](https://github.com/OpenRobotLab/EmbodiedScan/blob/main/configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py#L210)) is correct.
+
+To run inference and evaluate the model (e.g., the checkpoint `work_dirs/mv-3ddet/epoch_12.pth`), just run the test script:
+
+```bash
+python tools/test.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py work_dirs/mv-3ddet/epoch_12.pth --launcher="pytorch"
+```
+
 ### Benchmark

-Please see the [paper](./assets/EmbodiedScan.pdf) for details of our two benchmarks, fundamental 3D perception and language-grounded benchmarks. This dataset is still scaling up and the benchmark is being polished and extended. Please stay tuned for our recent updates.
+We preliminarily provide several baseline results here with their logs and pretrained models.
+
+Note that the performance differs slightly from the results reported in the paper because we re-split the original training set into the released training and validation sets, while keeping the original validation set as the test set for the public benchmark.
+
+#### Multi-View 3D Detection
+
+| Method | Input | AP@0.25 | AR@0.25 | AP@0.5 | AR@0.5 | Download |
+|:------:|:-----:|:-------:|:-------:|:------:|:------:|:--------:|
+| [Baseline](configs/detection/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py) | RGB-D | 15.22 | 52.23 | 8.13 | 26.66 | [Model](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/Efl363DOsXdAiikGcHIC3aQB_rjqHKsgxACyUgrHzqRmMA?e=XQDeY7) \| [Log](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/ET5FOsjHqOBBsGs_WHeUIpUBY-iLXeWdYPNeWZ3nh9wbYg?e=1fsirH) |
+
+#### Multi-View 3D Visual Grounding
+
+| Method | AP@0.25 | AP@0.5 | Download |
+|:------:|:-------:|:------:|:--------:|
+| [Baseline-Mini](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py) | 33.59 | 14.40 | [Model](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/EbrSZQM6bLROuVdG5MgRhqABJH0Cs91vHE9B-PfjZXvE0w?e=D5wbIK) \| [Log](https://pjlab-my.sharepoint.cn/:u:/g/personal/wangtai_pjlab_org_cn/EbhDHknA5nNMiBTiYZhnyCIBSZ881MJXUSfPgQGMm-spEw?e=Fp9fwZ) |
+| [Baseline-Mini (w/ FCAF box coder)](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof_fcaf-coder.py) | - | - | - |
+| [Baseline-Full](configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof-full.py) | - | - | - |
+
+Please see the [paper](./assets/EmbodiedScan.pdf) for more details of our two benchmarks, the fundamental 3D perception benchmark and the language-grounded benchmark. This dataset is still scaling up and the benchmark is being polished and extended. Please stay tuned for our recent updates.

 ## 📝 TODO List

 - \[x\] Release the paper and partial codes for datasets.
 - \[x\] Release EmbodiedScan annotation files.
 - \[x\] Release partial codes for models and evaluation.
 - \[ \] Polish dataset APIs and related codes.
-- \[ \] Release Embodied Perceptron pretrained models.
-- \[ \] Release multi-modal datasets and codes.
-- \[ \] Release codes for baselines and benchmarks.
+- \[x\] Release Embodied Perceptron pretrained models.
+- \[x\] Release multi-modal datasets and codes.
+- \[x\] Release codes for baselines and benchmarks.
 - \[ \] Full release and further updates.

 ## 🔗 Citation

 If you find our work helpful, please cite:

 ```bibtex
-@article{wang2023embodiedscan,
-    author={Wang, Tai and Mao, Xiaohan and Zhu, Chenming and Xu, Runsen and Lyu, Ruiyuan and Li, Peisen and Chen, Xiao and Zhang, Wenwei and Chen, Kai and Xue, Tianfan and Liu, Xihui and Lu, Cewu and Lin, Dahua and Pang, Jiangmiao},
-    title={EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI},
-    journal={Arxiv},
-    year={2023}
+@inproceedings{wang2023embodiedscan,
+    title={EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI},
+    author={Wang, Tai and Mao, Xiaohan and Zhu, Chenming and Xu, Runsen and Lyu, Ruiyuan and Li, Peisen and Chen, Xiao and Zhang, Wenwei and Chen, Kai and Xue, Tianfan and Liu, Xihui and Lu, Cewu and Lin, Dahua and Pang, Jiangmiao},
+    year={2024},
+    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 }
 ```

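The training and test commands added above invoke plain `python` with `--launcher="pytorch"`. Assuming `tools/train.py` follows the usual mmengine launcher convention (an assumption, not something stated in this commit), that launcher expects the distributed environment to be set up externally, for example by `torchrun`, so a single-node multi-GPU run would look roughly like the sketch below; the GPU count and port are illustrative placeholders.

```bash
# Hedged sketch, not part of the commit: single-node, 8-GPU training launched via
# torchrun, assuming tools/train.py follows the standard mmengine "pytorch" launcher
# convention of reading RANK/WORLD_SIZE/LOCAL_RANK from the environment.
torchrun --nproc_per_node=8 --master_port=29500 \
    tools/train.py configs/mv-det3d_8xb4_embodiedscan-3d-284class-9dof.py \
    --work-dir=work_dirs/mv-3ddet \
    --launcher="pytorch"
```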

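The NOTE in the new "Training and Inference" section asks readers to verify the `load_from` path in the grounding config after downloading the detection checkpoint. Assuming the config system supports mmengine-style `--cfg-options` overrides (the new slurm script below uses the same flag for the port, though overriding `load_from` this way is an assumption), the path could also be set at launch time instead of editing the config file; the checkpoint location here is only a placeholder.

```bash
# Hedged sketch, not part of the commit: point load_from at a locally downloaded
# detection checkpoint without editing the grounding config file.
python tools/train.py configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py \
    --work-dir=work_dirs/mv-grounding \
    --cfg-options load_from=work_dirs/mv-3ddet/epoch_12.pth
```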
tools/mv-grounding.sh (new file)

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+#!/usr/bin/env bash
+
+set -x
+
+CKPT_PATH=/mnt/petrelfs/wangtai/EmbodiedScan/work_dirs
+PARTITION=test
+JOB_NAME=mv-grounding-challenge-benchmark
+TASK=mv-grounding-challenge-benchmark
+CONFIG=configs/grounding/mv-grounding_8xb12_embodiedscan-vg-9dof.py
+WORK_DIR=${CKPT_PATH}/${TASK}
+CKPT=${CKPT_PATH}/${TASK}/latest.pth
+CPUS_PER_TASK=16
+GPUS=8
+GPUS_PER_NODE=8
+PORT=29320
+
+PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
+export NCCL_IB_DISABLE=1; export NCCL_P2P_DISABLE=1; \
+srun -p ${PARTITION} \
+  --job-name=${JOB_NAME} \
+  --gres=gpu:${GPUS_PER_NODE} \
+  --ntasks=${GPUS} \
+  --ntasks-per-node=${GPUS_PER_NODE} \
+  --cpus-per-task=${CPUS_PER_TASK} \
+  --kill-on-bad-exit=1 \
+  --quotatype=reserved \
+  python -u tools/train.py ${CONFIG} --work-dir=${WORK_DIR} --launcher="slurm" --cfg-options env_cfg.dist_cfg.port=${PORT} --resume

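The new tools/mv-grounding.sh script hard-codes cluster-specific values (`CKPT_PATH`, `PARTITION`, the reserved `--quotatype`, and the port). Assuming those are first adapted to the target slurm cluster, the script submits the job itself via `srun`, so it is typically run directly from a login node:

```bash
# Assumes PARTITION, CKPT_PATH, and PORT in tools/mv-grounding.sh have been
# edited for the local slurm cluster; the script calls srun internally.
bash tools/mv-grounding.sh
```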