This is the official repository of XVFI (eXtreme Video Frame Interpolation)
[ArXiv_ver.] [ICCV2021_ver.] [Supp.] [Demo(YouTube)] [Oral12mins(YouTube)] [Flowframes(GUI)] [Poster]
Last Update: 20211130 - We provide extended input sequences for X-TEST. Please refer to X4K1000FPS.
We provide the training and test code along with the trained weights and the dataset (train+test) used for XVFI. If you find this repository useful, please consider citing our paper.



The 4K@30fps input frames are interpolated to 4K@240fps frames. All results are encoded at 30fps to be played as x8 slow motion, and are spatially down-scaled due to file-size limits. All methods were trained on X-TRAIN.






Some examples from the X4K1000FPS dataset, which consists of 1000-fps, 4K-resolution frames. Our dataset contains various scenes with extreme motions. (Displayed as spatiotemporally subsampled .gif files.)
We provide our X4K1000FPS dataset, which consists of X-TEST and X-TRAIN. Please refer to our main/supplementary paper for the details of the dataset. You can download the dataset from this Dropbox link.
X-TEST consists of 15 video clips, each with 33 frames of 4K 1000-fps video. It follows the directory format below:
    ├──── YOUR_DIR/
        ├──── test/
            ├──── Type1/
                ├──── TEST01/
                    ├──── 0000.png
                    ├──── ...
                    └──── 0032.png
                ├──── TEST02/
                    ├──── 0000.png
                    ├──── ...
                    └──── 0032.png
                ├──── ...
            ├──── ...

Extended version of X-TEST (issue #9). As described in our paper, we assume that the number of input frames for VFI is fixed to 2 in X-TEST. However, for VFI methods that require more than 2 input frames, we provide an extended version of X-TEST which contains 8 input frames (at a temporal distance of 32 frames) for each test sequence. The middle two adjacent frames among the 8 frames are the same input frames as in the original X-TEST. To sort the .png files properly by their file names, we added 1000 to the frame indices (e.g., '0000.png' and '0032.png' in the original X-TEST correspond to '1000.png' and '1032.png', respectively, in the extended version). Please note that the extended version consists of input frames only, without the ground-truth intermediate frames ('1001.png'~'1031.png'). In addition, for the sequence 'TEST11_078_f4977', '1064.png', '1096.png' and '1128.png' are replicated frames, since '1064.png' is the last frame of the raw video file. The extended version of X-TEST can be downloaded from the link.
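The renumbering above implies that each extended test sequence holds exactly eight input frames, 32 apart and centered on the original input pair. A minimal sketch (a hypothetical helper, not part of this repository) that enumerates the expected filenames:

```python
# Hypothetical helper (not part of this repository): enumerate the 8
# input-frame filenames of the extended X-TEST, spaced 32 frames apart
# and centered on the original input pair ('1000.png', '1032.png').
def extended_xtest_filenames(step=32, num_inputs=8):
    # The middle two of the num_inputs frames are the original inputs,
    # so the first index sits (num_inputs // 2 - 1) steps before 1000.
    start = 1000 - (num_inputs // 2 - 1) * step
    return [f"{start + i * step:04d}.png" for i in range(num_inputs)]

print(extended_xtest_filenames())
# ['0904.png', '0936.png', '0968.png', '1000.png', '1032.png',
#  '1064.png', '1096.png', '1128.png']
```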
X-TRAIN consists of 4,408 clips from 110 scenes of various types. Each clip has 65 frames of 1000-fps video, and each frame is 768x768, cropped from a 4K frame. It follows the directory format below:
    ├──── YOUR_DIR/
        ├──── train/
            ├──── 002/
                ├──── occ008.320/
                    ├──── 0000.png
                    ├──── ...
                    └──── 0064.png
                ├──── occ008.322/
                    ├──── 0000.png
                    ├──── ...
                    └──── 0064.png
                ├──── ...
            ├──── ...

After downloading the files from the link, decompress encoded_test.tar.gz and encoded_train.tar.gz. The resulting .mp4 files can be decoded into .png files by running mp4_decoding.py. Please follow the instructions written in mp4_decoding.py.
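For reference, the decoding step amounts to dumping every frame of each .mp4 as a zero-padded .png. A minimal sketch of that idea using OpenCV (an assumed dependency; mp4_decoding.py in this repository is the authoritative script and may differ):

```python
import os
import cv2  # OpenCV is an assumption here; mp4_decoding.py is authoritative.

def decode_mp4_to_pngs(mp4_path, out_dir):
    """Dump every frame of an .mp4 clip as zero-padded .png files."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(mp4_path)
    idx = 0
    while True:
        ok, frame = cap.read()  # frame is BGR, which imwrite expects
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx:04d}.png"), frame)
        idx += 1
    cap.release()
```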
Our code is implemented using PyTorch 1.7 and was tested under the following setting:
- Python 3.7
- PyTorch 1.7.1
- CUDA 10.2
- cuDNN 7.6.5
- NVIDIA TITAN RTX GPU
- Ubuntu 16.04 LTS
Caution: since there is an "align_corners" option in "nn.functional.interpolate" and "nn.functional.grid_sample" in PyTorch 1.7, we recommend that you follow our settings. In particular, using other PyTorch versions may yield different performance.
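To see why the version matters, both functions take an explicit align_corners flag whose default and warning behavior changed across PyTorch releases, and the two settings genuinely produce different outputs. A small sketch illustrating the difference (illustrative only; this is not code from this repository):

```python
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)

# Bilinear upscaling: the two align_corners settings disagree, which is
# why mixing PyTorch versions (with different defaults/warnings) can
# silently change results.
up_t = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=True)
up_f = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
print((up_t - up_f).abs().max())  # non-zero

# grid_sample, used for backward warping with flows in VFI, has the same flag.
grid = F.affine_grid(torch.eye(2, 3).unsqueeze(0), x.size(), align_corners=False)
warped = F.grid_sample(x, grid, mode='bilinear', align_corners=False)
```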
- Download the source codes in a directory of your choice <source_path>.
- First, download our X-TEST test dataset by following the 'X4K1000FPS' section above.
- Download the pre-trained weights, which were trained on X-TRAIN, from this link and place them in <source_path>/checkpoint_dir/XVFInet_X4K1000FPS_exp1.
    XVFI
    └── checkpoint_dir
        └── XVFInet_X4K1000FPS_exp1
            ├── XVFInet_X4K1000FPS_exp1_latest.pt

- Run main.py with the following options in parse_args:
python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 8
==> It would yield (PSNR/SSIM/tOF) = (30.12/0.870/2.15).
python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 3 --multiple 8
==> It would yield (PSNR/SSIM/tOF) = (28.86/0.858/2.67).
- After running with the above test options, you can find the result images in <source_path>/test_img_dir/XVFInet_X4K1000FPS_exp1, along with the per-clip PSNR/SSIM/tOF results in "total_metrics.csv" in the same folder.
- Our proposed XVFI-Net can start from any downscaled input upward by regulating '--S_tst', which adjusts the number of scales used for inference according to the input resolution or the motion magnitude.
- You can get any multi-frame interpolation (xM) result by regulating '--multiple', as sketched below.
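For intuition, '--multiple M' requests M-1 intermediate frames per input pair; assuming (as in the paper's x8 setting) that the target times are uniformly spaced in (0, 1), the implied time steps are:

```python
# Assumed uniform sampling of the M-1 intermediate target times for
# '--multiple M'; e.g., --multiple 8 implies t = 1/8, 2/8, ..., 7/8.
def target_times(multiple):
    return [i / multiple for i in range(1, multiple)]

print(target_times(8))  # [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```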
- Download the source codes in a directory of your choice <source_path>.
- First, download the Vimeo90K dataset from this link (including 'tri_trainlist.txt') and place it in <source_path>/vimeo_triplet.
    XVFI
    └── vimeo_triplet
        ├── sequences
        ├── readme.txt
        ├── tri_testlist.txt
        └── tri_trainlist.txt

- Download the pre-trained weights (XVFI-Net_v), which were trained on Vimeo90K, from this link and place them in <source_path>/checkpoint_dir/XVFInet_Vimeo_exp1.
    XVFI
    └── checkpoint_dir
        └── XVFInet_Vimeo_exp1
            ├── XVFInet_Vimeo_exp1_latest.pt

- Run main.py with the following options in parse_args:
python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_tst 1 --multiple 2
==> It would yield PSNR = 35.07 on Vimeo90K.
- After running with the above test option, you can find the result images in <source_path>/test_img_dir/XVFInet_Vimeo_exp1.
- There are certain code lines in front of 'def main()' for convenience when running with the Vimeo option.
- The SSIM result of 0.9760 in Fig. 8 was measured with MATLAB's ssim function for a fair comparison, since other SOTA methods did so. We also provide the "compare_psnr_ssim.m" MATLAB file to obtain it (a rough Python cross-check is sketched after these notes).
- Please note that there is a typo "S_trn and S_tst are set to 2" in the current version of the XVFI paper, which should read 1 (not 2); sorry for the inconvenience. -> Updated in the latest arXiv version.
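If you want a rough Python-side sanity check of SSIM before running the MATLAB script, a sketch using scikit-image is below; its values can differ slightly from MATLAB's ssim, so compare_psnr_ssim.m remains the reference for the paper's numbers (the paths here are hypothetical):

```python
import cv2
from skimage.metrics import structural_similarity as ssim

# Hypothetical paths; compare_psnr_ssim.m (MATLAB) is the reference
# implementation behind the reported 0.9760 and may differ slightly.
pred = cv2.imread('pred.png')
gt = cv2.imread('gt.png')
print(ssim(pred, gt, channel_axis=-1, data_range=255))
```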
- Download the source codes in a directory of your choice <source_path>.
- First, prepare your own video frames in <source_path>/custom_path, following the hierarchy below:
    XVFI
    └── custom_path
        ├── scene1
        │   ├── 'xxx.png'
        │   ├── ...
        │   └── 'xxx.png'
        ├── ...
        ├── sceneN
        │   ├── 'xxxxx.png'
        │   ├── ...
        │   └── 'xxxxx.png'

- Download the pre-trained weights trained on X-TRAIN or Vimeo90K as described above.
- Run main.py with the following options in parse_args (e.g., x8 multi-frame interpolation):
# For the model trained on X-TRAIN
python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 8 --custom_path './custom_path'
# For the model trained on Vimeo90K
python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_tst 1 --multiple 8 --custom_path './custom_path'
- Our proposed XVFI-Net can start from any downscaled input upward by regulating '--S_tst', which adjusts the number of scales used for inference according to the input resolution or the motion magnitude.
- You can get any multi-frame interpolation (xM) result by regulating '--multiple'.
- Only the '.png' format is supported.
- Since we cannot cover the diverse naming rules of custom frames, please sort your own frames properly; a helper sketch is given below.
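As a convenience, a minimal sketch (a hypothetical helper, not part of this repository) that renames a scene's frames to zero-padded indices so that lexicographic order matches temporal order:

```python
import os

def zero_pad_frames(scene_dir):
    """Rename the .png frames in scene_dir to 0000.png, 0001.png, ...
    following their current sorted order (hypothetical helper; back up
    your frames first, since it renames files in place)."""
    frames = sorted(f for f in os.listdir(scene_dir) if f.endswith('.png'))
    for i, name in enumerate(frames):
        os.rename(os.path.join(scene_dir, name),
                  os.path.join(scene_dir, f"{i:04d}.png"))
```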
- Download the source codes in a directory of your choice <source_path>.
- First, download our X-TRAIN train/val/test datasets by following the 'X4K1000FPS' section above and place them as below:
    XVFI
    └── X4K1000FPS
        ├── train
        │   ├── 002
        │   ├── ...
        │   └── 172
        ├── val
        │   ├── Type1
        │   ├── Type2
        │   └── Type3
        └── test
            ├── Type1
            ├── Type2
            └── Type3

- Run main.py with the following options in parse_args:
python main.py --phase 'train' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_trn 3 --S_tst 5
- Download the source codes in a directory of your choice <source_path>.
- First, download the Vimeo90K dataset from this link (including 'tri_trainlist.txt') and place it in <source_path>/vimeo_triplet.
    XVFI
    └── vimeo_triplet
        ├── sequences
        ├── readme.txt
        ├── tri_testlist.txt
        └── tri_trainlist.txt

- Run main.py with the following options in parse_args:
python main.py --phase 'train' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_trn 1 --S_tst 1
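For reference, tri_trainlist.txt lists one sequence per line (e.g., 00001/0001), and each entry conventionally maps to a triplet folder holding im1.png, im2.png, im3.png under sequences/. A minimal sketch of resolving those paths (the loader in main.py is authoritative; the layout here is the standard Vimeo90K triplet convention, stated as an assumption):

```python
import os

# Resolve Vimeo90K triplet paths from tri_trainlist.txt; im2.png is
# conventionally the ground-truth middle frame of each triplet.
root = './vimeo_triplet'
with open(os.path.join(root, 'tri_trainlist.txt')) as f:
    seqs = [line.strip() for line in f if line.strip()]

triplets = [
    [os.path.join(root, 'sequences', s, f'im{i}.png') for i in (1, 2, 3)]
    for s in seqs
]
print(len(triplets), triplets[0])
```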
- You can freely regulate other arguments in the parser of main.py, here.
- We also provide all visual results (x8 multi-frame interpolation) on X-TEST for easier comparison, as below. Each zip file is about 1~1.5 GB.
- AdaCoF_o, AdaCoF_f, FeFlow_o, FeFlow_f, DAIN_o, DAIN_f, XVFI-Net (S_tst=3), XVFI-Net (S_tst=5)
- The quantitative comparisons (Table 2 and Figure 5) are attached below for reference.
Hyeonjun Sim*, Jihyong Oh*, and Munchurl Kim, "XVFI: eXtreme Video Frame Interpolation", In ICCV, 2021. (* equal contribution)
BibTeX

    @inproceedings{sim2021xvfi,
      title={XVFI: eXtreme Video Frame Interpolation},
      author={Sim, Hyeonjun and Oh, Jihyong and Kim, Munchurl},
      booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
      year={2021}
    }
If you have any questions, please send an email to either
[Hyeonjun Sim] - flhy5836@kaist.ac.kr or
[Jihyong Oh] - jhoh94@kaist.ac.kr.
The source codes and datasets can be freely used for research and education only. Any commercial use requires formal permission first.