rinnakk/japanese-stable-diffusion


Japanese Stable Diffusion is a Japanese-specific latent text-to-image diffusion model capable of generating photo-realistic images given any text input.

This model was trained using a powerful text-to-image model, Stable Diffusion. Many thanks to CompVis, Stability AI, and LAION for the public release.

Table of Contents

  • News
  • Model Details
  • Usage
  • Citation
  • License

News

September 9, 2022

Web Demo

Integrated into Hugging Face Spaces 🤗 using Gradio. Try out the Web Demo on Hugging Face Spaces.

Model Details

Why Japanese Stable Diffusion?

Stable Diffusion is a very powerful text-to-image model, not only in terms of quality but also in terms of computational cost. Because Stable Diffusion was trained on an English dataset, non-English users have to either translate their prompts into English or use them directly in their own language. Surprisingly, Stable Diffusion can sometimes generate nice images even from non-English prompts. So, why do we need a language-specific Japanese Stable Diffusion?

Because we want a model that understands our culture, identity, and unique expressions such as slang. For example, one famous Japanglish term is "salary man", which means a businessman, typically imagined wearing a suit. Stable Diffusion cannot interpret such uniquely Japanese words correctly because Japanese was not its target language.

So, we made a language-specific version of Stable Diffusion! Compared to the original Stable Diffusion, Japanese Stable Diffusion can:

  • Generate Japanese-style images
  • Understand Japanglish
  • Understand uniquely Japanese onomatopoeia
  • Understand Japanese proper nouns

[Example images] Caption: "サラリーマン 油絵", which means "salary man, oil painting"

Training

Japanese Stable Diffusion was trained starting from Stable Diffusion and has the same architecture and the same number of parameters. However, this is not simply a model fully fine-tuned on Japanese datasets, because Stable Diffusion was trained on an English dataset and the CLIP tokenizer is basically for English. To make a Japanese-specific model based on Stable Diffusion, we used a 2-stage training procedure inspired by PITI (a rough sketch of the idea follows the list below).

  1. Train a Japanese-specific text encoder with our Japanese tokenizer from scratch with the latent diffusion model fixed. This stage is expected to map Japanese captions to Stable Diffusion's latent space.
  2. Fine-tune the text encoder and the latent diffusion model jointly. This stage is expected to generate Japanese-style images more.
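
The following is a minimal sketch of the stage-1 idea, not the released training code: freeze the latent diffusion model (VAE and U-Net) and optimize only a Japanese text encoder with the standard noise-prediction objective, so that Japanese captions are mapped into Stable Diffusion's latent space. The base checkpoints, the placeholder Japanese encoder architecture, the dataloader, and the hyperparameters below are illustrative assumptions.

import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import AutoConfig, AutoModel, AutoTokenizer

device = "cuda"

# Frozen Stable Diffusion components (kept fixed during stage 1)
vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae").to(device)
unet = UNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="unet").to(device)
vae.requires_grad_(False)
unet.requires_grad_(False)

# Japanese tokenizer and a text encoder trained from scratch
# (placeholder architecture; its hidden size of 768 matches SD v1's cross-attention dim)
tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-roberta-base")
config = AutoConfig.from_pretrained("rinna/japanese-roberta-base")
text_encoder = AutoModel.from_config(config).to(device)  # randomly initialized

noise_scheduler = DDPMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
optimizer = torch.optim.AdamW(text_encoder.parameters(), lr=1e-5)

# dataloader is assumed to yield (images, captions): 512x512 tensors scaled to [-1, 1] and Japanese strings
for images, captions in dataloader:
    # Encode images into Stable Diffusion's latent space and add noise at a random timestep
    latents = vae.encode(images.to(device)).latent_dist.sample() * 0.18215
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps, (latents.shape[0],), device=device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

    # Condition the frozen U-Net on embeddings from the Japanese text encoder
    tokens = tokenizer(list(captions), padding="max_length", max_length=77, truncation=True, return_tensors="pt").to(device)
    encoder_hidden_states = text_encoder(**tokens).last_hidden_state

    # Standard diffusion objective: predict the added noise
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states=encoder_hidden_states).sample
    loss = F.mse_loss(noise_pred, noise)

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

In stage 2, the same loop would also include unet.parameters() in the optimizer (with the U-Net unfrozen) so that the text encoder and the latent diffusion model are fine-tuned jointly.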

We used the following dataset for training the model:

  • Approximately 100 million images with Japanese captions, including the Japanese subset of LAION-5B.

Usage

Open In Colab | Replicate

First, install our package as follows. This package is a modified version of 🤗's Diffusers library, adapted to run Japanese Stable Diffusion.

pip install git+https://github.com/rinnakk/japanese-stable-diffusion

You need to accept the model license before downloading or using the weights, so you'll need to visit its model card, read the license, and tick the checkbox if you agree.

You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need an access token for the code to work. For more information on access tokens, please refer to this section of the documentation.

huggingface-cli login
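
As an alternative to the CLI, you can authenticate from Python with the huggingface_hub library (a small illustrative snippet, assuming a recent huggingface_hub version; the token value is a placeholder):

from huggingface_hub import login

login(token="hf_xxx")  # replace with your own access token, or call login() to be prompted interactively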

Running the pipeline with the k_lms scheduler:

import torch
from torch import autocast
from diffusers import LMSDiscreteScheduler
from japanese_stable_diffusion import JapaneseStableDiffusionPipeline

model_id = "rinna/japanese-stable-diffusion"
device = "cuda"

# Use the K-LMS scheduler here instead
scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
pipe = JapaneseStableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, use_auth_token=True)
pipe = pipe.to(device)

prompt = "猫の肖像画 油絵"  # "portrait of a cat, oil painting"

with autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5)["sample"][0]

image.save("output.png")

Note: JapaneseStableDiffusionPipeline is almost the same as diffusers' StableDiffusionPipeline, with a few added lines to initialize our models properly.

Japanese Stable Diffusion pipelines also include a safety checker module, as diffusers' StableDiffusionPipeline does.
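
For comparison, here is a minimal variant that keeps the scheduler bundled with the checkpoint instead of swapping in K-LMS (an illustrative sketch assembled from the snippet above, not a separate recipe from the original instructions; the prompt and output filename are arbitrary):

from torch import autocast
from japanese_stable_diffusion import JapaneseStableDiffusionPipeline

# Load with the default scheduler stored in the checkpoint
pipe = JapaneseStableDiffusionPipeline.from_pretrained("rinna/japanese-stable-diffusion", use_auth_token=True)
pipe = pipe.to("cuda")

prompt = "サラリーマン 油絵"  # "salary man, oil painting"
with autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5)["sample"][0]
image.save("salaryman.png")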

Citation

@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}

@misc{japanese_stable_diffusion,
    author       = {Shing, Makoto and Sawada, Kei},
    title        = {Japanese Stable Diffusion},
    howpublished = {\url{https://github.com/rinnakk/japanese-stable-diffusion}},
    month        = {September},
    year         = {2022},
}

License

The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license on which our license is based.
