
We synthesize synchronized visual appearance and tactile geometry given a sketch of objects and render the multimodal output on a surface haptic device called TanvasTouch.


RuihanGao/visual-tactile-synthesis


Content creation beyond visual outputs: We present an image-to-image method to synthesize the visual appearance and tactile geometry of different materials, given a handcrafted or DALL⋅E 2 sketch. We then render the outputs on a surface haptic device like TanvasTouch® where users can slide on the screen to feel the rendered textures. (Turn the audio ON to hear the sound of the rendering.)

website_teaser_video.mp4

Controllable Visual-Tactile Synthesis
Ruihan Gao, Wenzhen Yuan, Jun-Yan Zhu
Carnegie Mellon University
ICCV, 2023

Visual-tactile synthesis

We show an example of our visual-tactile synthesis. The tactile output is shown in the 3D height map. The patches below correspond to bounding boxes shown in the sketch input. Please see our paper for more results.
FlowerJeans_patches

Colored 3D mesh for single object

We can also render the synthesized results as a colored 3D mesh. The meshes are exaggerated in the z direction to show fine textures.
mesh_GreenTee mesh_WhiteTshirt mesh_NavyHoodie
mesh_FlowerJeans mesh_PurplePants mesh_FlowerShorts

Swapping different sketches & materials

figure7_swap_sketch


Text-guided visual-tactile synthesis

figure8_DALLE_sketch


Please see our website and paper for more interactive and comprehensive results

Updates

We plan to release our code and dataset in the following steps:

  • Inference and Evaluation code [05/04].
  • Preprocessed data of all 20 garments in our TouchClothing dataset [05/04].
  • Pretrained model (ours & baselines) on the TouchClothing dataset [05/04].
  • Training code [05/04].
  • Data preprocessing code for camera and GelSight R1.5 data.
  • Rendering code to generate friction maps for TanvasTouch.
  • Instructions on how to create new test data.

Getting Started

We tested our code with Python 3.8 and PyTorch 1.11.0. (We recommend installing PyTorch separately to avoid package conflicts.)

git clone https://github.com/RuihanGao/visual-tactile-synthesis.git
cd visual-tactile-synthesis
conda create -n VTS python=3.8
conda activate VTS
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
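Before running anything, it may help to confirm that the environment matches the tested versions. The check below is only a sketch and is not part of the repository: `check_environment` is a hypothetical helper that reports whether the interpreter is Python 3.8 and whether a `torch` package can be found, without importing it.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 8)):
    """Report whether the interpreter and PyTorch install look as expected.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Compare only major.minor, since the README states "Python 3.8".
    python_ok = sys.version_info[:2] == tuple(expected_python)
    # find_spec locates the package without importing it.
    torch_found = importlib.util.find_spec("torch") is not None
    return {"python_ok": python_ok, "torch_found": torch_found}


if __name__ == "__main__":
    print(check_environment())
```

If `torch_found` is false inside the VTS environment, the pip install step for PyTorch most likely did not complete.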

Dataset

We provide the preprocessed data for our TouchClothing dataset, which contains 20 garments of various shapes and textures. Here are the 20 objects in the TouchClothing dataset:
“figure2_dataset”
Example of preprocessed data:
“figure4_sample_data”

website_dataset_video.mp4

Use the following commands to download and unzip the dataset.
(0) Install gdown and unzip as follows if you haven't done so.

pip install gdown
sudo apt install unzip

(1) Download the preprocessed data from Google Drive via the following command. Total size: 580M.

bash scripts/download_TouchClothing_dataset.sh

(2) Put the unzipped folder datasets in the code repo.

Note:

  • In case of an "access denied" error, try pip install -U --no-cache-dir gdown --pre and run the gdown command again. Ref here
  • Use the -q flag with unzip to suppress the log, as it can be quite long.
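After unzipping, a quick check can confirm that the data landed where the scripts expect it. The sketch below is not part of the repository: `dataset_dir` and `missing_materials` are hypothetical helpers, and the folder name pattern is copied from the `--dataroot` argument used in the training command of this README.

```python
from pathlib import Path


def dataset_dir(material, root="./datasets"):
    """Return the expected data folder for one material.

    The pattern follows the --dataroot argument in the training command;
    the helper itself is hypothetical.
    """
    return Path(root) / f"singleskit_{material}_padded_1800_x1"


def missing_materials(materials, root="./datasets"):
    """List the materials whose data folder is not present under root."""
    return [m for m in materials if not dataset_dir(m, root).is_dir()]


if __name__ == "__main__":
    # BlackJeans is the material used in the example commands.
    print(missing_materials(["BlackJeans"]))
```

An empty list means every requested folder was found under datasets.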

Method

website_method_video_v3_voiceover.mp4

Pre-trained models

We provide the pretrained models for our method and several baselines included in our paper. For each method, we provide 20 models, one for each object in our TouchClothing dataset. See the Google Drive folder here. To use them:
(1) download the checkpoints

  • checkpoints for our method (124M): gdown "https://drive.google.com/uc?export=download&id=11y2jP2vT7CtBIaEDcjROZ5hupHsYWG8D"
  • checkpoints for baselines (21.5G): gdown "https://drive.google.com/uc?export=download&id=16NNU1GuOWWtarzEJkLSYbeSqQVaX-943"

(2) After unzipping the files, put all pre-trained models in the folder checkpoints so that the testing code can load them.

(3) See the testing section for more examples of how to evaluate the pretrained models.

Usage

In general, our pipeline contains two steps. We first feed the sketch input to our model to synthesize synchronized visual and tactile output. Then we convert the tactile output to a friction map required by TanvasTouch and render the multimodal output on the surface haptic device, where you can see and feel the object simultaneously.

Train our model

material=BlackJeans
CUDA_VISIBLE_DEVICES=0 python train.py --gpu_ids 0 --name "${material}_sinskitG_baseline_ours" --model sinskitG --dataroot ./datasets/"singleskit_${material}_padded_1800_x1/"

where material can be any object from our TouchClothing dataset or from your own customized dataset.
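The experiment name and data root both derive from the material name, so the command can also be assembled programmatically. The sketch below is not part of the repository: `build_train_command` is a hypothetical helper whose flags and naming pattern simply mirror the training command above.

```python
import shlex


def build_train_command(material):
    """Assemble the train.py argument list for one material.

    Hypothetical helper; flags and name patterns mirror the README command.
    CUDA_VISIBLE_DEVICES is not set here and must come from the environment.
    """
    return [
        "python", "train.py",
        "--gpu_ids", "0",
        "--name", f"{material}_sinskitG_baseline_ours",
        "--model", "sinskitG",
        "--dataroot", f"./datasets/singleskit_{material}_padded_1800_x1/",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line for one garment.
    print(shlex.join(build_train_command("BlackJeans")))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without shell quoting.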

To use our launcher scripts to run multiple experiments in tmux windows, use the following command:
(Ref here for more examples and explanations for tmux launcher)

material_idx=0
python -m experiments SingleG_AllMaterials_baseline_ours launch $material_idx

where material_idx selects which object in the dataset to use. Choose a single material_idx, or use 'all' to run multiple experiments at once. The list of materials can be found in the launcher file experiments/SingleG_AllMaterials_baseline_ours_launcher.py.
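The index-or-'all' convention can be pictured with a few lines of Python. This is only an illustration of the selection logic, not the launcher's actual code: `resolve_materials` is a hypothetical helper, and `EXAMPLE_MATERIALS` is a placeholder list whose contents and order are assumptions (the real list lives in the launcher file).

```python
def resolve_materials(selection, materials):
    """Map a launcher-style selection to a list of material names.

    `selection` is either the string "all" or an index given as an int or
    numeric string. Hypothetical helper; the real launcher may differ.
    """
    if str(selection) == "all":
        return list(materials)
    return [materials[int(selection)]]


# Placeholder list for illustration only; see the launcher file for the real one.
EXAMPLE_MATERIALS = ["BlackJeans", "FlowerJeans", "NavyHoodie"]

if __name__ == "__main__":
    print(resolve_materials(0, EXAMPLE_MATERIALS))
    print(resolve_materials("all", EXAMPLE_MATERIALS))
```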

Note: Loading the dataset to cache before training may take 20-30 minutes, and training takes about 16 hours on a single A5000 GPU. Please be patient.
For a proof-of-concept training run, set data_len in SingleG_AllMaterials_baseline_ours_launcher and verbose_freq in models/sinskitG_model.py to a smaller number (e.g., 3 or 10).

Test our model

material=BlackJeans
CUDA_VISIBLE_DEVICES=0 python test.py --gpu_ids 0 --name "${material}_sinskitG_baseline_ours" --model sinskitG --dataroot ./datasets/"singleskit_${material}_padded_1800_x1/" --epoch best --eval

Or, if you are using tmux_launcher, use the following command.

material_idx=0
python -m experiments SingleG_AllMaterials_baseline_ours test $material_idx

The results will be stored in the results directory.
To compile the quantitative metrics of the tested method in a tabulated format, run bash scripts/compile_eval_metrics_sinskitG.sh. For each method, it retrieves the eval_metrics.pkl files of all materials and takes the average. Modify the materials list in util/compile_eval_metrics_sinskitG.py and the bash script accordingly.
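The averaging step can be sketched as follows. This is not the repository's compile script: `average_metrics` is a hypothetical helper, and it assumes each eval_metrics.pkl holds a flat dictionary mapping metric names to numbers, which may not match the actual file layout.

```python
import pickle
from collections import defaultdict
from pathlib import Path


def average_metrics(pkl_paths):
    """Average each metric over a collection of eval_metrics.pkl files.

    Assumes every file holds a flat dict {metric_name: number}; the real
    files in the repository may be structured differently.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for path in pkl_paths:
        with open(Path(path), "rb") as f:
            metrics = pickle.load(f)
        for name, value in metrics.items():
            totals[name] += float(value)
            counts[name] += 1
    # Divide per metric so a metric missing from one file does not skew the rest.
    return {name: totals[name] / counts[name] for name in totals}
```

Calling it with one path per material yields a single dictionary of per-metric means, which can then be printed as one table row per method.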

Citation

@inproceedings{gao2023controllable,
  title={Controllable Visual-Tactile Synthesis},
  author={Gao, Ruihan and Yuan, Wenzhen and Zhu, Jun-Yan},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2023},
}

Acknowledgment

We thank Sheng-Yu Wang, Kangle Deng, Muyang Li, Aniruddha Mahapatra, and Daohan Lu for proofreading the draft. We are also grateful to Sheng-Yu Wang, Nupur Kumari, Gaurav Parmar, George Cazenavette, and Arpit Agrawal for their helpful comments and discussion. Additionally, we thank Yichen Li, Xiaofeng Guo, and Fujun Ruan for their help with the hardware setup. Ruihan Gao is supported by A*STAR National Science Scholarship (Ph.D.).

Our code base is built upon Contrastive Unpaired Translation (CUT).
