Algolzw/daclip-uir

[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.


Project Page | Paper | Model Card 🤗

Open In Colab | Hugging Face | Replicate


Our follow-up work, Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models (CVPRW 2024), presents posterior sampling for better image generation and handles real-world mixed-degradation images, similar to Real-ESRGAN.

Updates

[2024.04.16] Our follow-up paper "Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models" is on ArXiv now!
[2024.04.15] Updated a wild-IR model for real-world degradations and the posterior sampling for better image generation. The pretrained weights wild-ir.pth and wild-daclip_ViT-L-14.pt are also provided for wild-ir.
[2024.01.20] 🎉🎉🎉 Our DA-CLIP paper was accepted by ICLR 2024 🎉🎉🎉 We further provide a more robust model in the model card.
[2023.10.25] Added dataset links for training and testing.
[2023.10.13] Added the Replicate demo and api 🔥. Thanks to @chenxwh!!! We updated the Hugging Face demo 🔥 and the online Colab demo 🔥. Thanks to @fffiloni and @camenduru!!! We also made a Model Card in Hugging Face 🤗 and provided more examples for testing.
[2023.10.09] The pretrained weights of DA-CLIP and the Universal IR model are released in link1 and link2, respectively. In addition, we also provide a Gradio app file in case you want to test your own images.

How to Run the Code?

Dependencies

  • OS: Ubuntu 20.04
  • nvidia:
    • cuda: 11.4
  • python 3.8

Install

We advise you to first create a virtual environment with:

```bash
python3 -m venv .env
source .env/bin/activate
pip install -U pip
pip install -r requirements.txt
```
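After installing, you can sanity-check the environment with a couple of imports. A minimal check, assuming torch and open_clip were installed via requirements.txt:

```python
# Quick environment sanity check (illustrative; not part of the repo).
import torch
import open_clip

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("open_clip:", getattr(open_clip, "__version__", "ok"))
```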

DA-CLIP Usage

Get into the universal-image-restoration directory and run:

```python
import torch
from PIL import Image
import open_clip

checkpoint = 'pretrained/daclip_ViT-B-32.pt'
model, preprocess = open_clip.create_model_from_pretrained('daclip_ViT-B-32', pretrained=checkpoint)
tokenizer = open_clip.get_tokenizer('ViT-B-32')

image = preprocess(Image.open("haze_01.png")).unsqueeze(0)
degradations = ['motion-blurry', 'hazy', 'jpeg-compressed', 'low-light', 'noisy',
                'raindrop', 'rainy', 'shadowed', 'snowy', 'uncompleted']
text = tokenizer(degradations)

with torch.no_grad(), torch.cuda.amp.autocast():
    text_features = model.encode_text(text)
    image_features, degra_features = model.encode_image(image, control=True)
    degra_features /= degra_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    text_probs = (100.0 * degra_features @ text_features.T).softmax(dim=-1)
    index = torch.argmax(text_probs[0])

print(f"Task: {degradations[index]} - {text_probs[0][index]}")
```
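If you want to classify the degradation of many images at once, the same API extends naturally to a loop. A hedged sketch, reusing `model`, `preprocess`, `tokenizer`, and `degradations` from the snippet above (`image_dir` is a placeholder):

```python
# Sketch: batch degradation classification over a directory of PNGs.
# Reuses model/preprocess/tokenizer/degradations from the snippet above.
import glob

import torch
from PIL import Image

image_dir = "images"  # placeholder: folder with test images

text = tokenizer(degradations)
with torch.no_grad(), torch.cuda.amp.autocast():
    text_features = model.encode_text(text)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    for path in sorted(glob.glob(f"{image_dir}/*.png")):
        image = preprocess(Image.open(path)).unsqueeze(0)
        _, degra_features = model.encode_image(image, control=True)
        degra_features /= degra_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * degra_features @ text_features.T).softmax(dim=-1)
        idx = torch.argmax(probs[0])
        print(f"{path}: {degradations[idx]} ({probs[0][idx]:.3f})")
```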

Dataset Preparation

Prepare the train and test datasets following the Dataset Construction section of our paper, as below:

```bash
#### for training dataset ####
#### (uncompleted means inpainting) ####
datasets/universal/train
|--motion-blurry
|  |--LQ/*.png
|  |--GT/*.png
|--hazy
|--jpeg-compressed
|--low-light
|--noisy
|--raindrop
|--rainy
|--shadowed
|--snowy
|--uncompleted

#### for testing dataset ####
#### (the same structure as train) ####
datasets/universal/val
...

#### for clean captions ####
datasets/universal/daclip_train.csv
datasets/universal/daclip_val.csv
```
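Before training, it can help to verify that every LQ image has a matching GT image. A minimal sketch (hypothetical helper, not part of the repo; adjust the root path to your layout):

```python
# Sketch: verify LQ/GT pairing for each task folder under train/.
from pathlib import Path

root = Path("datasets/universal/train")  # adjust to your layout
for task_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    lq = {f.name for f in (task_dir / "LQ").glob("*.png")}
    gt = {f.name for f in (task_dir / "GT").glob("*.png")}
    unmatched = lq - gt
    status = f", {len(unmatched)} LQ without GT" if unmatched else ""
    print(f"{task_dir.name}: {len(lq)} LQ / {len(gt)} GT{status}")
```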

Then get into the universal-image-restoration/config/daclip-sde directory and modify the dataset paths in the option files options/train.yml and options/test.yml.

You can add more tasks or datasets to both the train and val directories and add the degradation word to distortion.

Dataset Links

| Degradation | motion-blurry | hazy | jpeg-compressed* | low-light | noisy* (same as jpeg) |
|---|---|---|---|---|---|
| Datasets | Gopro | RESIDE-6k | DIV2K+Flickr2K | LOL | DIV2K+Flickr2K |

| Degradation | raindrop | rainy | shadowed | snowy | uncompleted |
|---|---|---|---|---|---|
| Datasets | RainDrop | Rain100H: train, test | SRD | Snow100K | CelebaHQ-256 |

You should only extract the train datasets for training; all validation datasets can be downloaded from the Google drive. For jpeg and noisy datasets, you can generate LQ images using this script.
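For reference, the idea behind generating those LQ images is to JPEG-compress or add Gaussian noise to the clean GT images. A rough sketch, not the repo's actual script; the quality factor and noise level here are made-up values:

```python
# Sketch: synthesize jpeg-compressed / noisy LQ images from GT images.
# Illustrative only; see the linked script for the paper's settings.
import io

import numpy as np
from PIL import Image

def jpeg_lq(gt_path: str, lq_path: str, quality: int = 30) -> None:
    # quality=30 is an assumed value, not the paper's setting
    buf = io.BytesIO()
    Image.open(gt_path).convert("RGB").save(buf, "JPEG", quality=quality)
    buf.seek(0)
    Image.open(buf).save(lq_path)  # decode and store (e.g., as PNG)

def noisy_lq(gt_path: str, lq_path: str, sigma: float = 25.0) -> None:
    # sigma=25 is an assumed noise level, not the paper's setting
    img = np.asarray(Image.open(gt_path).convert("RGB"), dtype=np.float32)
    noisy = img + np.random.normal(0.0, sigma, img.shape)
    Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8)).save(lq_path)
```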

Training

DA-CLIP

See DA-CLIP.md for details.

Universal Image Restoration

The main code for training is in universal-image-restoration/config/daclip-sde and the core network for DA-CLIP is in universal-image-restoration/open_clip/daclip_model.py.

  • Put the pretrained DA-CLIP weights into the pretrained directory and check the daclip path.

  • You can then train the model with the bash commands below:

```bash
cd universal-image-restoration/config/daclip-sde

# For single GPU:
python3 train.py -opt=options/train.yml

# For distributed training, change the gpu_ids in the option file first
python3 -m torch.distributed.launch --nproc_per_node=2 --master_port=4321 train.py -opt=options/train.yml --launcher pytorch
```

The models and training logs are saved in log/universal-ir. You can monitor the log in real time by running tail -f log/universal-ir/train_universal-ir_***.log -n 100.

The same training steps can be used for image restoration in the wild (wild-ir).

Pretrained Models

| Model Name | Description | GoogleDrive | HuggingFace |
|---|---|---|---|
| DA-CLIP | Degradation-aware CLIP model | download | download |
| Universal-IR | DA-CLIP based universal image restoration model | download | download |
| DA-CLIP-mix | Degradation-aware CLIP model (adds Gaussian blur + face inpainting and Gaussian blur + rainy) | download | download |
| Universal-IR-mix | DA-CLIP based universal image restoration model (adds robust training and mixed degradations) | download | download |
| Wild-DA-CLIP | Degradation-aware CLIP model in the wild (ViT-L-14) | download | download |
| Wild-IR | DA-CLIP based image restoration model in the wild | download | download |

Evaluation

To evaluate our method on image restoration, please modify the benchmark path and model path, then run:

```bash
cd universal-image-restoration/config/universal-ir
python test.py -opt=options/test.yml
```
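If you want a quick standalone metric check, PSNR between a restored image and its ground truth can be computed as below (a generic reference sketch, not the repo's test.py implementation):

```python
# Sketch: PSNR between a restored image and its ground truth (8-bit RGB).
import numpy as np
from PIL import Image

def psnr(restored_path: str, gt_path: str) -> float:
    x = np.asarray(Image.open(restored_path).convert("RGB"), dtype=np.float64)
    y = np.asarray(Image.open(gt_path).convert("RGB"), dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)
```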

Gradio

Here we provide an app.py file for testing your own images. Before that, you need to download the pretrained weights (DA-CLIP and UIR) and modify the model path in options/test.yml. Then, simply run python app.py and open http://localhost:7860 to test the model. (We also provide several images with different degradations in the images dir.) More examples from our test dataset are available in the google drive.
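For orientation, the overall shape of such a Gradio app looks like the minimal sketch below (hypothetical; the repo's app.py wires in the actual DA-CLIP and UIR models from options/test.yml, while `restore` here is just a placeholder):

```python
# Sketch: minimal Gradio wrapper around a restoration function.
# `restore` is a placeholder; the repo's app.py loads the real models.
import gradio as gr
from PIL import Image

def restore(image: Image.Image) -> Image.Image:
    # placeholder: run DA-CLIP + the universal IR model here
    return image

demo = gr.Interface(
    fn=restore,
    inputs=gr.Image(type="pil"),
    outputs=gr.Image(type="pil"),
    title="DA-CLIP Universal Image Restoration",
)
demo.launch()  # serves on http://localhost:7860 by default
```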

The same steps can be used for image restoration in the wild (wild-ir).

Results

(Overview figure)

Unified Image Restoration (figure)

Degradation-Specific Restoration (figure)

Image Restoration in the Wild (figure)

Notice!!

🙁 In testing, we found that the current pretrained model still struggles with some real-world images, which can exhibit distribution shifts from our training dataset (e.g., captured with different devices, resolutions, or degradations). We regard this as future work and will try to make our model more practical. We also encourage users who are interested in our work to train their own models with larger datasets and more degradation types.

🙁 We also found that directly resizing input images leads to poor performance for most tasks. We could add a resize step during training, but it tends to destroy image quality due to interpolation.
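A common workaround is to pad the input up to a size the network accepts and crop the output back, rather than resizing. A minimal sketch, assuming a network stride of 16 (the actual required multiple depends on the model):

```python
# Sketch: reflection-pad to a multiple of `stride` instead of resizing,
# then crop the model output back to the original size.
import torch
import torch.nn.functional as F

def pad_to_multiple(x: torch.Tensor, stride: int = 16):
    # x: (B, C, H, W); stride=16 is an assumption, check your model
    h, w = x.shape[-2:]
    ph = (stride - h % stride) % stride
    pw = (stride - w % stride) % stride
    return F.pad(x, (0, pw, 0, ph), mode="reflect"), (h, w)

def crop_back(y: torch.Tensor, size) -> torch.Tensor:
    h, w = size
    return y[..., :h, :w]
```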

🙁 For the inpainting task, our current model only supports face inpainting due to the dataset limitation. We provide our mask examples and you can use the generate_masked_face script to generate uncompleted faces.
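If you want to create your own test inputs, the idea is simply to blank out a region of a face image. An illustrative sketch, not the repo's generate_masked_face script (the box coordinates are made up):

```python
# Sketch: create an "uncompleted" face by blanking a rectangular region.
import numpy as np
from PIL import Image

def mask_face(src_path: str, dst_path: str, box=(64, 64, 192, 192)) -> None:
    # box=(x0, y0, x1, y1) is an arbitrary example region
    img = np.asarray(Image.open(src_path).convert("RGB")).copy()
    x0, y0, x1, y1 = box
    img[y0:y1, x0:x1] = 0  # blank the region
    Image.fromarray(img).save(dst_path)
```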


Acknowledgment: Our DA-CLIP is based on IR-SDE and open_clip. Thanks for their code!

Contact

If you have any questions, please contact: ziwei.luo@it.uu.se

Citations

If our code helps your research or work, please consider citing our paper. The following are BibTeX references:

```bibtex
@article{luo2023controlling,
  title={Controlling Vision-Language Models for Universal Image Restoration},
  author={Luo, Ziwei and Gustafsson, Fredrik K and Zhao, Zheng and Sj{\"o}lund, Jens and Sch{\"o}n, Thomas B},
  journal={arXiv preprint arXiv:2310.01018},
  year={2023}
}

@article{luo2024photo,
  title={Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models},
  author={Luo, Ziwei and Gustafsson, Fredrik K and Zhao, Zheng and Sj{\"o}lund, Jens and Sch{\"o}n, Thomas B},
  journal={arXiv preprint arXiv:2404.09732},
  year={2024}
}
```

--- Thanks for your interest! ---
