Content creation beyond visual outputs: We present an image-to-image method to synthesize the visual appearance and tactile geometry of different materials, given a handcrafted or DALL⋅E 2 sketch. We then render the outputs on a surface haptic device like TanvasTouch® where users can slide on the screen to feel the rendered textures. (Turn the audio ON to hear the sound of the rendering.)
website_teaser_video.mp4
Controllable Visual-Tactile Synthesis
Ruihan Gao, Wenzhen Yuan, Jun-Yan Zhu
Carnegie Mellon University
ICCV, 2023
We show an example of our visual-tactile synthesis. The tactile output is shown as a 3D height map. The patches below correspond to the bounding boxes shown in the sketch input. Please see our paper for more results.
We can also render the synthesized results as a colored 3D mesh. The meshes are exaggerated in the z direction to show fine textures.
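The mesh-export code is not part of this repo's snippets here; as a rough illustration only, the sketch below (a hypothetical `export_colored_mesh` helper, assuming the visual output is an RGB image in [0, 1] and the tactile output is a height map of the same resolution) builds a z-exaggerated, per-vertex-colored OBJ grid with NumPy.

```python
import numpy as np

def export_colored_mesh(rgb, height, path="mesh.obj", z_scale=5.0):
    """Write a colored OBJ grid mesh from an RGB image and a height map.

    rgb:     (H, W, 3) float array in [0, 1]
    height:  (H, W) float array, same resolution as rgb
    z_scale: exaggeration factor along z so fine textures stay visible
    """
    H, W = height.shape
    ys, xs = np.mgrid[0:H, 0:W]
    verts = np.stack([xs.ravel(), ys.ravel(), z_scale * height.ravel()], axis=1)
    colors = rgb.reshape(-1, 3)

    with open(path, "w") as f:
        # One vertex per pixel, with per-vertex color (supported by e.g. MeshLab).
        for (x, y, z), (r, g, b) in zip(verts, colors):
            f.write(f"v {x} {y} {z} {r} {g} {b}\n")
        # Two triangles per pixel quad; OBJ indices are 1-based.
        for i in range(H - 1):
            for j in range(W - 1):
                v00 = i * W + j + 1
                v01 = v00 + 1
                v10 = v00 + W
                v11 = v10 + 1
                f.write(f"f {v00} {v01} {v11}\n")
                f.write(f"f {v00} {v11} {v10}\n")
```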
Please see our website and paper for more interactive and comprehensive results.
We plan to release our code and dataset in the following steps:
- Inference and Evaluation code [05/04].
- Preprocessed data of all 20 garments in our TouchClothing dataset [05/04].
- Pretrained models (ours & baselines) on the TouchClothing dataset [05/04].
- Training code [05/04].
- Data preprocessing code for camera and GelSight R1.5 data.
- Rendering code to generate friction maps for TanvasTouch.
- Instructions on how to create new test data.
We tested our code with Python 3.8 and PyTorch 1.11.0. (We recommend installing PyTorch separately to avoid package conflicts.)
```bash
git clone https://github.com/RuihanGao/visual-tactile-synthesis.git
cd visual-tactile-synthesis
conda create -n VTS python=3.8
conda activate VTS
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
```
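As a quick sanity check after installation (a generic snippet, not part of this repo), you can confirm that the pinned PyTorch build sees your GPU:

```python
import torch
import torchvision

# Versions should match the pinned cu113 builds from the install step.
print("torch:", torch.__version__)              # expected: 1.11.0+cu113
print("torchvision:", torchvision.__version__)  # expected: 0.12.0+cu113
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```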
We provide the preprocessed data for our TouchClothing dataset, which contains 20 garments of various shapes and textures. Here are the 20 objects in the TouchClothing dataset:
Example of preprocessed data:
website_dataset_video.mp4
Use the following commands to download and unzip the dataset.
(0) Install `gdown` and `unzip` as follows if you haven't done so.
```bash
pip install gdown
sudo apt install unzip
```
(1) Download the preprocessed data from Google Drive via the following command (total size: 580M):
bash scripts/download_TouchClothing_dataset.sh
(2) Put the unzipped folder `datasets` in the code repo (a quick sanity check is sketched after the notes below).
Note:
- In case of an "access denied" error, try `pip install -U --no-cache-dir gdown --pre` and run the `gdown` command again (ref here).
- Use the `-q` flag with `unzip` to suppress the log, as it can be quite long.
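Once the data is in place, a generic snippet like the one below (not part of the released code; it only assumes the `singleskit_<material>_padded_1800_x1` folder naming used by the training command later in this README) lists the preprocessed objects:

```python
import glob
import os

# The training command expects ./datasets/singleskit_<material>_padded_1800_x1/
folders = sorted(glob.glob(os.path.join("datasets", "singleskit_*")))
print(f"Found {len(folders)} preprocessed objects")  # expect 20 for TouchClothing
for folder in folders:
    print(" -", os.path.basename(folder))
```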
website_method_video_v3_voiceover.mp4
We provide the pretrained models for our method and several baselines included in our paper. For each method, we provide 20 models, one for each object in our TouchClothing dataset. See the Google Drive folder here. To use them:
(1) Download the checkpoints:
- checkpoints for our method (124M):
gdown "https://drive.google.com/uc?export=download&id=11y2jP2vT7CtBIaEDcjROZ5hupHsYWG8D"
- checkpoints for baselines (21.5G):
gdown "https://drive.google.com/uc?export=download&id=16NNU1GuOWWtarzEJkLSYbeSqQVaX-943"
(2) After unzipping the files, put all pre-trained models in the folder `checkpoints` so they load properly in the testing code (a quick sanity check of this layout is sketched below).
(3) See the testing section for more examples of how to evaluate the pretrained models.
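A short generic snippet can confirm the checkpoint layout; the folder names below follow the `--name "${material}_sinskitG_baseline_ours"` convention from the training/testing commands, while the exact `.pth` file names inside each folder may differ.

```python
import glob
import os

# Folder names follow the --name "${material}_sinskitG_baseline_ours" convention.
for ckpt_dir in sorted(glob.glob(os.path.join("checkpoints", "*_sinskitG_baseline_ours"))):
    weights = glob.glob(os.path.join(ckpt_dir, "*.pth"))
    print(f"{os.path.basename(ckpt_dir)}: {len(weights)} weight file(s)")
```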
In general, our pipeline contains two steps. We first feed the sketch input to our model to synthesize synchronized visual and tactile output. Then we convert the tactile output to a friction map required by TanvasTouch and render the multimodal output on the surface haptic device, where you can see and feel the object simultaneously.
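The rendering code that generates friction maps for TanvasTouch is not released yet (see the checklist above). As a rough placeholder only, one simple heuristic is to map the local relief of the synthesized height map to a grayscale image, as in the sketch below; the actual conversion used in our pipeline may differ.

```python
import numpy as np
from PIL import Image

def height_to_friction(height, out_path="friction_map.png"):
    """Toy heuristic: tactile height map -> grayscale friction map.

    Sharper local relief (larger height gradient) -> higher friction value.
    This is only an illustrative placeholder, not the released renderer.
    """
    gy, gx = np.gradient(height.astype(np.float32))
    relief = np.hypot(gx, gy)
    relief = (relief - relief.min()) / (relief.max() - relief.min() + 1e-8)
    Image.fromarray((relief * 255).astype(np.uint8)).save(out_path)
    return relief
```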
```bash
material=BlackJeans
CUDA_VISIBLE_DEVICES=0 python train.py --gpu_ids 0 --name "${material}_sinskitG_baseline_ours" --model sinskitG --dataroot ./datasets/"singleskit_${material}_padded_1800_x1/"
```
where you can choose the variable `material` from our TouchClothing dataset or your own customized dataset.
To use our launcher scripts to run multiple experiments in a tmux window, use the following command.
(Ref here for more examples and explanations of the tmux launcher.)
```bash
material_idx=0
python -m experiments SingleG_AllMaterials_baseline_ours launch $material_idx
```
where `material_idx` sets which object in the dataset to use. Choose a single `material_idx` or use 'all' to run multiple experiments at once. The list of materials can be found in the launcher file `experiments/SingleG_AllMaterials_baseline_ours_launcher.py`.
Note: Loading the dataset into cache before training may take 20-30 minutes, and training takes about 16 hours on a single A5000 GPU. Please be patient.
For a proof-of-concept training, set `data_len` in `SingleG_AllMaterials_baseline_ours_launcher` and `verbose_freq` in `models/sinskitG_model.py` to a smaller number (e.g., 3 or 10).
```bash
material=BlackJeans
CUDA_VISIBLE_DEVICES=0 python test.py --gpu_ids 0 --name "${material}_sinskitG_baseline_ours" --model sinskitG --dataroot ./datasets/"singleskit_${material}_padded_1800_x1/" --epoch best --eval
```
Or, if you are using the `tmux_launcher`, use the following command.
```bash
material_idx=0
python -m experiments SingleG_AllMaterials_baseline_ours test $material_idx
```
The results will be stored in the `results` directory.
To compile the quantitative metrics of the tested method in a tabulated format, run `bash scripts/compile_eval_metrics_sinskitG.sh`. For each method, it retrieves the `eval_metrics.pkl` file of all materials and takes the average. Modify the `materials` list in `util/compile_eval_metrics_sinskitG.py` and the bash script accordingly.
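For reference, the aggregation amounts to something like the following sketch, which assumes each `eval_metrics.pkl` stores a flat dict of metric name to value and guesses the result-folder layout from the `--name` convention; see `util/compile_eval_metrics_sinskitG.py` for the actual implementation.

```python
import os
import pickle
from collections import defaultdict

materials = ["BlackJeans"]          # hypothetical: extend with the other objects
method = "sinskitG_baseline_ours"   # hypothetical result-folder suffix

sums, counts = defaultdict(float), defaultdict(int)
for material in materials:
    # Assumed layout: results/<material>_<method>/eval_metrics.pkl
    pkl_path = os.path.join("results", f"{material}_{method}", "eval_metrics.pkl")
    with open(pkl_path, "rb") as f:
        metrics = pickle.load(f)    # assumed: {"metric_name": value, ...}
    for name, value in metrics.items():
        sums[name] += value
        counts[name] += 1

for name, total in sums.items():
    print(f"{name}: {total / counts[name]:.4f}")
```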
```bibtex
@inproceedings{gao2023controllable,
  title={Controllable Visual-Tactile Synthesis},
  author={Gao, Ruihan and Yuan, Wenzhen and Zhu, Jun-Yan},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2023},
}
```
We thank Sheng-Yu Wang, Kangle Deng, Muyang Li, Aniruddha Mahapatra, and Daohan Lu for proofreading the draft. We are also grateful to Sheng-Yu Wang, Nupur Kumari, Gaurav Parmar, George Cazenavette, and Arpit Agrawal for their helpful comments and discussion. Additionally, we thank Yichen Li, Xiaofeng Guo, and Fujun Ruan for their help with the hardware setup. Ruihan Gao is supported by A*STAR National Science Scholarship (Ph.D.).
Our code base is built upon Contrastive Unpaired Translation (CUT).