jantic/DeOldify

A Deep Learning based project for colorizing and restoring old images (and video!)

This repository was archived by the owner on Oct 19, 2024. It is now read-only.


This Repository is Archived

This project has been a wild ride since I started it back in 2018, 6 years ago as of this writing (October 19, 2024)! It's time for me to move on and put this repo in the archives, as I simply don't have the time to attend to it anymore, and frankly it's ancient as far as deep-learning projects go at this point! ~Jason

Quick Start: The easiest way to colorize images using open source DeOldify (for free!) is here: DeOldify Image Colorization on DeepAI

Desktop: Want to run open source DeOldify for photos and videos on the desktop?

In Browser (new!): Check out this ONNX-based in-browser implementation: https://github.com/akbartus/DeOldify-on-Browser

The most advanced version of DeOldify image colorization is available here, exclusively. Try a few images for free! MyHeritage In Color

Replicate: Image | Video


Image (artistic): Colab for images | Video: Colab for video

Having trouble with the default image colorizer, aka "artistic"? Try the "stable" one below. It generally won't produce colors that are as interesting as "artistic", but the glitches are noticeably reduced.

Image (stable): Colab for stable model

Instructions on how to use the Colabs above have been kindly provided in video tutorial form by Old Ireland in Colour's John Breslin. It's great! Click the video image below to watch.

DeOldify Tutorial

Get more updates on Twitter.


About DeOldify

Simply put, the mission of this project is to colorize and restore old images and film footage. We'll get into the details in a bit, but first let's see some pretty pictures and videos!

New and Exciting Stuff in DeOldify

  • Glitches and artifacts are almost entirely eliminated
  • Better skin (less zombies)
  • More highly detailed and photorealistic renders
  • Much less "blue bias"
  • Video - it actually looks good!
  • NoGAN - a new and weird but highly effective way to do GAN training for image-to-image tasks.

Example Videos

Note: Click images to watch

Facebook F8 Demo

DeOldify Facebook F8 Movie Colorization Demo

Silent Movie Examples

DeOldify Silent Movie Examples

Example Images

"Migrant Mother" by Dorothea Lange (1936)

Migrant Mother

Woman relaxing in her living room in Sweden (1920)

Sweden Living Room

"Toffs and Toughs" by Jimmy Sime (1937)

Class Divide

Thanksgiving Maskers (1911)

Thanksgiving Maskers

Glen Echo Madame Careta Gypsy Camp in Maryland (1925)

Gypsy Camp

"Mr. and Mrs. Lemuel Smith and their younger children in their farm house,Carroll County, Georgia." (1941)

Georgia Farmhouse

"Building the Golden Gate Bridge" (est 1937)

Golden Gate Bridge

Note: What you might be wondering is, while this render looks cool, are the colors accurate? The original photo certainly makes it look like the towers of the bridge could be white. We looked into this and it turns out the answer is no - the towers were already covered in red primer by this time. So that's something to keep in mind: historical accuracy remains a huge challenge!

"Terrasse de café, Paris" (1925)

Cafe Paris

Norwegian Bride (est late 1890s)

Norwegian Bride

Zitkála-Šá (Lakota: Red Bird), also known as Gertrude Simmons Bonnin (1898)

Native Woman

Chinese Opium Smokers (1880)

Opium Real

Stuff That Should Probably Be In A Paper

How to Achieve Stable Video

NoGAN training is crucial to getting the kind of stable and colorful images seen in this iteration of DeOldify. NoGAN training combines the benefits of GAN training (wonderful colorization) while eliminating the nasty side effects (like flickering objects in video). Believe it or not, video is rendered using isolated image generation without any sort of temporal modeling tacked on. The process performs 30-60 minutes of the GAN portion of "NoGAN" training, using 1% to 3% of ImageNet data once. Then, as with still image colorization, we "DeOldify" individual frames before rebuilding the video.
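To make the frame-by-frame idea concrete, here is a minimal sketch of that pipeline, assuming ffmpeg is on the PATH and using the image colorizer from the project's visualize module on each extracted frame (the bundled VideoColorizer notebook and get_video_colorizer() wrap this same idea for you; the paths, frame rate and render_factor value are illustrative, and the method names should be checked against your checkout):

    import subprocess
    from pathlib import Path

    from deoldify import device
    from deoldify.device_id import DeviceId
    device.set(device=DeviceId.GPU0)  # pick a device before importing the visualizer

    from deoldify.visualize import get_image_colorizer

    frames_dir, colored_dir = Path('video/frames'), Path('video/colored')
    frames_dir.mkdir(parents=True, exist_ok=True)
    colored_dir.mkdir(parents=True, exist_ok=True)

    # 1. Split the source video into individual frames.
    subprocess.run(['ffmpeg', '-i', 'old_film.mp4', str(frames_dir / '%05d.png')], check=True)

    # 2. Colorize every frame in isolation - no temporal modeling at all.
    colorizer = get_image_colorizer(artistic=False)
    for frame in sorted(frames_dir.glob('*.png')):
        colorized = colorizer.get_transformed_image(str(frame), render_factor=21)
        colorized.save(colored_dir / frame.name)

    # 3. Reassemble the colorized frames (audio can be muxed back in from the original file).
    subprocess.run(['ffmpeg', '-framerate', '24', '-i', str(colored_dir / '%05d.png'),
                    '-pix_fmt', 'yuv420p', 'old_film_color.mp4'], check=True)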

In addition to improved video stability, there is an interesting thing going on here worth mentioning. It turns out the models I run, even different ones and with different training structures, keep arriving at more or less the same solution. That's even the case for the colorization of things you may think would be arbitrary and unknowable, like the color of clothing, cars, and even special effects (as seen in "Metropolis").

Metropolis Special FX

My best guess is that the models are learning some interesting rules about how to colorize based on subtle cues present in the black and white images that I certainly wouldn't expect to exist. This leads to nicely deterministic and consistent results, and that means you don't have to track model colorization decisions because they're not arbitrary. Additionally, they seem remarkably robust, so that even in moving scenes the renders are very consistent.

Moving Scene Example

Other ways to stabilize video add up as well. First, generally speaking, rendering at a higher resolution (higher render_factor) will increase the stability of colorization decisions. This stands to reason because the model has higher fidelity image information to work with and will have a greater chance of making the "right" decision consistently. Closely related to this is the use of resnet101 instead of resnet34 as the backbone of the generator - objects are detected more consistently and correctly with it. This is especially important for getting good, consistent skin rendering. It can be particularly visually jarring if you wind up with "zombie hands", for example.
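If you want to see the render_factor effect for yourself, a quick way is to render the same frame at several values and compare the outputs side by side. This sketch assumes the same image-colorizer API used in the project's notebooks; the file names are placeholders:

    from deoldify import device
    from deoldify.device_id import DeviceId
    device.set(device=DeviceId.GPU0)

    from deoldify.visualize import get_image_colorizer

    colorizer = get_image_colorizer(artistic=False)
    # Higher render_factor = higher internal render resolution = more stable color decisions,
    # at the cost of GPU memory (and sometimes slightly less vibrant color).
    for rf in (15, 25, 35, 45):
        result = colorizer.get_transformed_image('frame_0001.png', render_factor=rf)
        result.save(f'frame_0001_rf{rf}.png')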

Zombie Hand Example

Additionally, gaussian noise augmentation during training appears to help, but at this point the conclusions as to just how much are a bit more tenuous (I just haven't formally measured this yet). This is loosely based on work done in style transfer video, described here: https://medium.com/element-ai-research-lab/stabilizing-neural-style-transfer-for-video-62675e203e42.

Special thanks go to Rani Horev for his contributions in implementing this noise augmentation.
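For reference, the kind of augmentation being described can be implemented as a simple transform on the grayscale training inputs. This is a minimal PyTorch sketch of the general idea; the noise level and where exactly it sits in DeOldify's fastai pipeline are assumptions, not the project's actual code:

    import torch

    class AddGaussianNoise:
        """Add zero-mean gaussian noise to an image tensor with values in [0, 1]."""
        def __init__(self, std: float = 0.05):
            self.std = std

        def __call__(self, img: torch.Tensor) -> torch.Tensor:
            noisy = img + torch.randn_like(img) * self.std
            return noisy.clamp(0.0, 1.0)

    # Example: apply it to a batch of fake grayscale inputs at the 192px training size.
    noise = AddGaussianNoise(std=0.05)
    batch = torch.rand(4, 1, 192, 192)
    noisy_batch = noise(batch)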

What is NoGAN?

This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. It provides the benefits of GAN training while spending minimal time doing direct GAN training. Instead, most of the training time is spent pretraining the generator and critic separately with more straightforward, fast and reliable conventional methods. A key insight here is that those more "conventional" methods generally get you most of the results you need, and that GANs can be used to close the gap on realism. During the very short amount of actual GAN training, the generator not only gets the full realistic colorization capabilities that used to take days of progressively resized GAN training, but it also doesn't accrue nearly as much of the artifacts and other ugly baggage of GANs. In fact, you can pretty much eliminate glitches and artifacts almost entirely depending on your approach. As far as I know this is a new technique. And it's incredibly effective.

Original DeOldify Model

Before Flicker

NoGAN-Based DeOldify Model

After Flicker

The steps are as follows: First, train the generator in a conventional way by itself with just the feature loss. Next, generate images from that, and train the critic on distinguishing between those outputs and real images as a basic binary classifier. Finally, train the generator and critic together in a GAN setting (starting right at the target size of 192px in this case). Now for the weird part: all the useful GAN training here only takes place within a very small window of time. There's an inflection point where it appears the critic has transferred everything it can that is useful to the generator. Past this point, image quality oscillates between the best that you can get at the inflection point, or bad in a predictable way (orangish skin, overly red lips, etc). There appears to be no productive training after the inflection point. And this point lies within training on just 1% to 3% of the ImageNet data! That amounts to about 30-60 minutes of training at 192px.
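A skeletal PyTorch version of those three phases might look like the following. The networks, losses, data and step counts are stand-ins (DeOldify itself is built on fastai, with a pretrained U-Net generator and a VGG-based feature loss), so treat this as an outline of the schedule rather than the project's training code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Stand-in networks; the real generator is a pretrained U-Net and the critic is much larger.
    generator = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 3, 3, padding=1))
    critic = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

    def fake_batch(n=4, size=64):
        color = torch.rand(n, 3, size, size)
        gray = color.mean(dim=1, keepdim=True)
        return gray, color

    # Phase 1: pretrain the generator alone with a reconstruction/feature-style loss.
    g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
    for _ in range(10):
        gray, color = fake_batch()
        loss = F.l1_loss(generator(gray), color)  # DeOldify uses a VGG-based feature loss here
        g_opt.zero_grad(); loss.backward(); g_opt.step()

    # Phase 2: pretrain the critic as a plain binary classifier on real vs. generated images.
    c_opt = torch.optim.Adam(critic.parameters(), lr=2e-4)
    for _ in range(10):
        gray, color = fake_batch()
        with torch.no_grad():
            fake = generator(gray)
        logits = critic(torch.cat([color, fake]))
        labels = torch.cat([torch.ones(len(color), 1), torch.zeros(len(fake), 1)])
        loss = F.binary_cross_entropy_with_logits(logits, labels)
        c_opt.zero_grad(); loss.backward(); c_opt.step()

    # Phase 3: a short joint GAN phase, checkpointing often so the inflection point can be found by eye.
    for step in range(10):
        gray, color = fake_batch()
        fake = generator(gray)
        c_loss = F.binary_cross_entropy_with_logits(critic(color), torch.ones(4, 1)) + \
                 F.binary_cross_entropy_with_logits(critic(fake.detach()), torch.zeros(4, 1))
        c_opt.zero_grad(); c_loss.backward(); c_opt.step()
        g_loss = F.l1_loss(fake, color) + F.binary_cross_entropy_with_logits(critic(fake), torch.ones(4, 1))
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()
        torch.save(generator.state_dict(), f'gen_checkpoint_{step:04d}.pth')  # every ~0.1% of data in practice

The repeated critic-pretrain/GAN cycles mentioned below are just phases 2 and 3 run again, and the dense checkpointing in phase 3 is what makes it possible to hunt for the inflection point after the fact.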

The hard part is finding this inflection point. So far, I've accomplished this by making a whole bunch of model save checkpoints (every 0.1% of data iterated on) and then just looking for the point where images look great before they go totally bonkers with orange skin (always the first thing to go). Additionally, generator rendering starts immediately getting glitchy and inconsistent at this point, which is no good, particularly for video. What I'd really like to figure out is what the tell-tale sign of the inflection point is that can be easily automated as an early stopping point. Unfortunately, nothing definitive is jumping out at me yet. For one, it's happening in the middle of training loss decreasing - not when it flattens out, which would seem more reasonable on the surface.

Another key thing about NoGAN training is that you can repeat pretraining the critic on generated images after the initial GAN training, then repeat the GAN training itself in the same fashion. This is how I was able to get extra colorful results with the "artistic" model. But this does come at a cost currently - the output of the generator becomes increasingly inconsistent and you have to experiment with render resolution (render_factor) to get the best result. But the renders are still glitch free and way more consistent than I was ever able to achieve with the original DeOldify model. You can do about five of these repeat cycles, give or take, before you get diminishing returns, as far as I can tell.

Keep in mind, I haven't been entirely rigorous in figuring out what all is going on in NoGAN - I'll save that for a paper. That means there's a good chance I'm wrong about something. But I think it's definitely worth putting out there now because I'm finding it very useful - it's solving most of the remaining problems I had in DeOldify.

This builds upon a technique developed in collaboration with Jeremy Howard and Sylvain Gugger for Fast.AI's Lesson 7 in version 3 of Practical Deep Learning for Coders Part I. The particular lesson notebook can be found here: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-superres-gan.ipynb

Why Three Models?

There are now three models to choose from in DeOldify. Each of these has key strengths and weaknesses, and so they have different use cases. Video is for video, of course. But stable and artistic are both for images, and sometimes one will do images better than the other. A short snippet showing how each model is selected follows the list below.

More details:

  • Artistic - This model achieves the highest quality results in image coloration, in terms of interesting details and vibrance. The most notable drawback, however, is that it's a bit of a pain to fiddle around with to get the best results (you have to adjust the rendering resolution, or render_factor, to achieve this). Additionally, the model does not do as well as stable in a few key common scenarios - nature scenes and portraits. The model uses a resnet34 backbone on a UNet with an emphasis on depth of layers on the decoder side. This model was trained with 5 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 32% of Imagenet data trained once (12.5 hours of direct GAN training).

  • Stable - This model achieves the best results with landscapes and portraits. Notably, it produces fewer "zombies" - faces or limbs that stay gray rather than being colored in properly. It generally has fewer weird miscolorations than artistic, but it's also less colorful in general. This model uses a resnet101 backbone on a UNet with an emphasis on width of layers on the decoder side. This model was trained with 3 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 7% of Imagenet data trained once (3 hours of direct GAN training).

  • Video - This model is optimized for smooth, consistent and flicker-free video. This would definitely be the least colorful of the three models, but it's honestly not too far off from "stable". The model is the same as "stable" in terms of architecture, but differs in training. It's trained on a mere 2.2% of Imagenet data once at 192px, using only the initial generator/critic pretrain/GAN NoGAN training (1 hour of direct GAN training).
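As promised above, this is roughly how the three models map onto the project's notebook API. The artistic flag and function names follow the bundled notebooks, but treat them as assumptions and double-check against your checkout:

    from deoldify import device
    from deoldify.device_id import DeviceId
    device.set(device=DeviceId.GPU0)

    from deoldify.visualize import get_image_colorizer, get_video_colorizer

    artistic_colorizer = get_image_colorizer(artistic=True)   # "artistic" image model
    stable_colorizer = get_image_colorizer(artistic=False)    # "stable" image model
    video_colorizer = get_video_colorizer()                   # video model (stable-style architecture)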

Because the training of the artistic and stable models was done before the "inflection point" of NoGAN training described in "What is NoGAN?" was discovered, I believe this amount of training on them can be knocked down considerably. As far as I can tell, the models were stopped at "good points" that were well beyond where productive training was taking place. I'll be looking into this in the future.

Ideally, these three models will eventually be consolidated into one that has all of these desirable qualities unified. I think there's a path there, but it's going to require more work! So for now, the most practical solution appears to be to maintain multiple models.

The Technical Details

This is a deep learning based model. More specifically, what I've done is combine the following approaches:

Self-Attention Generative Adversarial Network (https://arxiv.org/abs/1805.08318) - except the generator is a pretrained U-Net, which I've just modified to have spectral normalization and self-attention. It's a pretty straightforward translation.

Two Time-Scale Update Rule (https://arxiv.org/abs/1706.08500) - this is also very straightforward: it's just one-to-one generator/critic iterations and a higher critic learning rate. This is modified to incorporate a "threshold" critic loss that makes sure the critic is "caught up" before moving on to generator training. This is particularly useful for the "NoGAN" method described below.
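A minimal sketch of that "threshold" idea in a plain alternating training loop; the 2x learning-rate ratio, the threshold value and the toy networks are illustrative assumptions, not DeOldify's exact settings:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    generator = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 3, 3, padding=1))
    critic = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
    c_opt = torch.optim.Adam(critic.parameters(), lr=2e-4)  # higher critic learning rate (TTUR)

    CRITIC_THRESHOLD = 0.6  # only train the generator once the critic loss drops below this

    for _ in range(20):
        color = torch.rand(4, 3, 64, 64)
        gray = color.mean(dim=1, keepdim=True)
        fake = generator(gray)

        # Critic step on every iteration (one-to-one with the generator step below).
        c_loss = F.binary_cross_entropy_with_logits(critic(color), torch.ones(4, 1)) + \
                 F.binary_cross_entropy_with_logits(critic(fake.detach()), torch.zeros(4, 1))
        c_opt.zero_grad(); c_loss.backward(); c_opt.step()

        if c_loss.item() > CRITIC_THRESHOLD:
            continue  # critic isn't "caught up" yet, so skip the generator update this round

        g_loss = F.binary_cross_entropy_with_logits(critic(fake), torch.ones(4, 1))
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()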

NoGAN

There's no paper here! This is a new type of GAN training that I've developed to solve some key problems in the previous DeOldify model. The gist is that you get the benefits of GAN training while spending minimal time doing direct GAN training. More details are in the What is NoGAN? section (it's a doozy).

Generator Loss

Loss during NoGAN learning is two parts: one is a basic Perceptual Loss (or Feature Loss) based on VGG16 - this just biases the generator model to replicate the input image. The second is the loss score from the critic. For the curious - Perceptual Loss isn't sufficient by itself to produce good results. It tends to just encourage a bunch of brown/green/blue - you know, cheating to the test, basically, which neural networks are really good at doing! The key thing to realize here is that GANs are essentially learning the loss function for you - which is really one big step closer to the ideal that we're shooting for in machine learning. And of course you generally get much better results when you get the machine to learn something you were previously hand coding. That's certainly the case here.
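A hedged sketch of that two-part generator loss, using torchvision's VGG16 features for the perceptual term; the layer cut-off, weighting and lack of input normalization are simplifications, and DeOldify's actual feature loss follows the fastai implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.models import vgg16, VGG16_Weights

    # Frozen VGG16 feature extractor for the perceptual/feature loss
    # (ImageNet input normalization omitted for brevity).
    vgg_features = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()
    for p in vgg_features.parameters():
        p.requires_grad_(False)

    def generator_loss(fake_rgb, real_rgb, critic, critic_weight=1.0):
        # Part 1: perceptual loss - push the generator toward the VGG features of the target image.
        perceptual = F.l1_loss(vgg_features(fake_rgb), vgg_features(real_rgb))
        # Part 2: critic score - the learned part of the loss, rewarding outputs the critic finds realistic.
        adversarial = F.binary_cross_entropy_with_logits(critic(fake_rgb), torch.ones(len(fake_rgb), 1))
        return perceptual + critic_weight * adversarial

    # Tiny usage example with stand-in tensors and a toy critic.
    critic = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    fake = torch.rand(2, 3, 192, 192)
    real = torch.rand(2, 3, 192, 192)
    loss = generator_loss(fake, real, critic)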

Of note: there's no longer any "Progressive Growing of GANs" type training going on here. It's just not needed given the superior results obtained by the "NoGAN" technique described above.

The beauty of this model is that it should be generally useful for all sorts of image modification, and it should do it quite well. What you're seeing above are the results of the colorization model, but that's just one component in a pipeline that I'm developing with the exact same approach.

This Project, Going Forward

So that's the gist of this project - I'm looking to make old photos and film look reeeeaaally good with GANs, and more importantly, make the project useful. In the meantime though, this is going to be my baby and I'll be actively updating and improving the code for the foreseeable future. I'll try to make this as user-friendly as possible, but I'm sure there are going to be hiccups along the way.

Oh and I swear I'll document the code properly...eventually. Admittedly I'mone of those people who believes in "self documenting code" (LOL).

Getting Started Yourself

Easiest Approach

The easiest way to get started is to go straight to the Colab notebooks:

Image: Colab for images | Video: Colab for video

Special thanks to Matt Robinson and María Benavente for their image Colab notebook contributions, and Robert Bell for the video Colab notebook work!

Your Own Machine (not as easy)

Hardware and Operating System Requirements

  • (Training Only) BEEFY graphics card. I'd really like to have more memory than the 11 GB in my GeForce 1080 Ti. You'll have a tough time with less. The generators and critic are ridiculously large.
  • (Colorization Alone) A decent graphics card. Video cards with approximately 4GB+ of memory should be sufficient.
  • Linux. I'm using Ubuntu 18.04, and I know 16.04 works fine too. Windows is not supported and any issues brought up related to this will not be investigated.

Easy Install

You should now be able to do a simple install with Anaconda. Here are the steps:

Open the command line and navigate to the root folder you wish to install into. Then type the following commands:

git clone https://github.com/jantic/DeOldify.git DeOldify
cd DeOldify
conda env create -f environment.yml

Then start running with these commands:

source activate deoldify
jupyter lab

From there you can start running the notebooks in Jupyter Lab, via the URL they provide you in the console.

Note: You can also now do "conda activate deoldify" if you have the latest version of conda, and in fact that's now recommended. But a lot of people don't have that yet, so I'm not going to make it the default instruction here yet.

Alternative Install: User daddyparodz has kindly created an installer script for Ubuntu, and in particular Ubuntu on WSL, that may make things easier: https://github.com/daddyparodz/AutoDeOldifyLocal

Note on test_images Folder

The images in the test_images folder have been removed because they were using Git LFS, and that costs a lot of money when GitHub actually charges for bandwidth on a popular open source project (they had a billing bug for a while that was recently fixed). The notebooks that use them (the image test ones) still point to images in that directory that I (Jason) have personally, and I'd like to keep it that way because, after all, I'm by far the primary and most active developer. But they won't work for you. Still, those notebooks are a convenient template for making your own tests if you're so inclined.

Typical training

The notebook ColorizeTrainingWandb has been created to log and monitor results through Weights & Biases. You can find a description of typical training by consulting the W&B Report.

Pretrained Weights

To start right away on your own machine with your own images or videos without training the models yourself, you'll need to download the "Completed Generator Weights" listed below and drop them in the /models/ folder.

The colorization inference notebooks should be able to guide you from here. The notebooks to use are named ImageColorizerArtistic.ipynb, ImageColorizerStable.ipynb, and VideoColorizer.ipynb.
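Outside the notebooks, a minimal inference script looks roughly like this once the completed generator weights are in ./models/. The device setup and visualizer calls mirror what the notebooks do, but verify the exact names (including the weight file names) against your checkout:

    from deoldify import device
    from deoldify.device_id import DeviceId
    device.set(device=DeviceId.GPU0)  # choices include CPU and GPU0..GPU7

    from deoldify.visualize import get_image_colorizer

    # Expects the completed generator weights (e.g. the artistic model's .pth file) in ./models/
    colorizer = get_image_colorizer(artistic=True)

    # 'my_old_photo.jpg' is a placeholder path; render_factor=35 is a common starting point for images.
    colorizer.plot_transformed_image('my_old_photo.jpg', render_factor=35)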

Completed Generator Weights

Completed Critic Weights

Pretrain Only Generator Weights

Pretrain Only Critic Weights

Want the Old DeOldify?

We suspect some of you are going to want access to the original DeOldify model for various reasons. We have that archived here: https://github.com/dana-kelley/DeOldify

Want More?

Follow #DeOldify on Twitter.

License

All code in this repository is under the MIT license, as specified by the LICENSE file.

The model weights listed in this readme under the "Pretrained Weights" section were trained by us and are released under the MIT license.

A Statement on Open Source Support

We believe that open source has done a lot of good for the world. After all, DeOldify simply wouldn't exist without it. But we also believe that there need to be boundaries on just how much is reasonable to expect from an open source project maintained by just two developers.

Our stance is that we're providing the code and documentation on research that we believe is beneficial to the world. What we have provided are novel takes on colorization, GANs, and video that are hopefully somewhat friendly for developers and researchers to learn from and adopt. This is the culmination of well over a year of continuous work, free for you. What wasn't free was shouldered by us, the developers. We left our jobs, bought expensive GPUs, and had huge electric bills as a result of dedicating ourselves to this.

What we haven't provided here is a ready-to-use free "product" or "app", and we don't ever intend on providing that. It's going to remain a Linux based project without Windows support, coded in Python, and requiring people to have some extra technical background to be comfortable using it. Others have stepped in with their own apps made with DeOldify, some paid and some free, which is what we want! We're instead focusing on what we believe we can do best: making better commercial models that people will pay for.

Does that mean you're not getting the very best for free? Of course. We simply don't believe that we're obligated to provide that, nor is it feasible! We compete on research and sell that, not a GUI or web service that wraps said research - that part isn't something we're going to be great at anyway. We're not about to shoot ourselves in the foot by giving away our actual competitive advantage for free, quite frankly.

We're also not willing to go down the rabbit hole of providing endless, open-ended and personalized support on this open source project. Our position is this: if you have the proper background and resources, the project provides more than enough to get you started. We know this because we've seen plenty of people using it and making money off of their own projects with it.

Thus, if you have an issue come up and it happens to be an actual bug whose fix will benefit users generally, then great - that's something we'll be happy to look into.

In contrast, if you're asking about something that really amounts to personalized and time-consuming support that won't benefit anybody else, we're not going to help. It's simply not in our interest to do that. We have bills to pay, after all. And if you're asking for help on something that can already be derived from the documentation or code? That's simply annoying, and we're not going to pretend to be OK with that.
