Diffusion model

From Wikipedia, the free encyclopedia
Technique for the generative modeling of a continuous probability distribution
This article is about the technique in generative statistical modeling. For other uses, see Diffusion (disambiguation).
This article discusses diffusion modeling of a continuous distribution. For the modeling of a discrete distribution, see Discrete diffusion model.

In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process, and the reverse sampling process. The goal of diffusion models is to learn a diffusion process for a given dataset, such that the process can generate new elements that are distributed similarly to the original dataset. A diffusion model models data as generated by a diffusion process, whereby a new datum performs a random walk with drift through the space of all possible data.[1] A trained diffusion model can be sampled in many ways, with different efficiency and quality.

There are various equivalent formalisms, including Markov chains, denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations.[2] They are typically trained using variational inference.[3] The model responsible for denoising is typically called its "backbone". The backbone may be of any kind, but it is typically a U-Net or a transformer.

As of 2024, diffusion models are mainly used for computer vision tasks, including image denoising, inpainting, super-resolution, image generation, and video generation. These typically involve training a neural network to sequentially denoise images blurred with Gaussian noise.[1][4] The model is trained to reverse the process of adding noise to an image. After training to convergence, it can be used for image generation by starting with an image composed of random noise, and applying the network iteratively to denoise the image.

Diffusion-based image generators have seen widespread commercial interest, such as Stable Diffusion and DALL-E. These models typically combine diffusion models with other models, such as text-encoders and cross-attention modules, to allow text-conditioned generation.[5]

Other than computer vision, diffusion models have also found applications in natural language processing[6] such as text generation[7] and summarization,[8] sound generation,[9] and reinforcement learning.[10][11]

Denoising diffusion model


Non-equilibrium thermodynamics


Diffusion models were introduced in 2015 as a method to train a model that can sample from a highly complex probability distribution. They used techniques from non-equilibrium thermodynamics, especially diffusion.[12]

Consider, for example, how one might model the distribution of all naturally occurring photos. Each image is a point in the space of all images, and the distribution of naturally occurring photos is a "cloud" in this space, which, by repeatedly adding noise to the images, diffuses out to the rest of the image space, until the cloud becomes all but indistinguishable from a Gaussian distribution $\mathcal{N}(0, I)$. A model that can approximately undo the diffusion can then be used to sample from the original distribution. This is studied in "non-equilibrium" thermodynamics, as the starting distribution is not in equilibrium, unlike the final distribution.

The equilibrium distribution is the Gaussian distribution $\mathcal{N}(0, I)$, with pdf $\rho(x) \propto e^{-\frac{1}{2}\|x\|^2}$. This is just the Maxwell–Boltzmann distribution of particles in a potential well $V(x) = \frac{1}{2}\|x\|^2$ at temperature 1. The initial distribution, being very much out of equilibrium, would diffuse towards the equilibrium distribution, making biased random steps that are a sum of pure randomness (like a Brownian walker) and gradient descent down the potential well. The randomness is necessary: if the particles were to undergo only gradient descent, then they would all fall to the origin, collapsing the distribution.

Denoising Diffusion Probabilistic Model (DDPM)


The 2020 paper by Ho et al. proposed the Denoising Diffusion Probabilistic Model (DDPM), which improves upon the previous method by variational inference.[3][13]

Forward diffusion


To present the model, some notation is required.

A forward diffusion process starts at some starting point $x_0 \sim q$, where $q$ is the probability distribution to be learned, then repeatedly adds noise to it by
$$x_t = \sqrt{1-\beta_t}\, x_{t-1} + \sqrt{\beta_t}\, z_t$$
where $z_1, \dots, z_T$ are IID (independent and identically distributed) samples from $\mathcal{N}(0, I)$. The coefficients $\sqrt{1-\beta_t}$ and $\sqrt{\beta_t}$ ensure that $\operatorname{Var}(x_t) = I$ whenever $\operatorname{Var}(x_0) = I$. The values of $\beta_t$ are chosen such that for any starting distribution of $x_0$ with finite second moment, the conditional distribution of $x_t \mid x_0$ converges in distribution to $\mathcal{N}(0, I)$ as $t \to \infty$.

The entire diffusion process then satisfies
$$q(x_{0:T}) = q(x_0)\, q(x_1 \mid x_0) \cdots q(x_T \mid x_{T-1}) = q(x_0)\, \mathcal{N}(x_1 \mid \sqrt{\alpha_1}\, x_0, \beta_1 I) \cdots \mathcal{N}(x_T \mid \sqrt{\alpha_T}\, x_{T-1}, \beta_T I)$$
or
$$\ln q(x_{0:T}) = \ln q(x_0) - \sum_{t=1}^T \frac{1}{2\beta_t} \left\| x_t - \sqrt{1-\beta_t}\, x_{t-1} \right\|^2 + C$$
where $C$ is a normalization constant, often omitted, and $\alpha_t := 1 - \beta_t$. In particular, we note that $x_{1:T} \mid x_0$ is a Gaussian process, which affords us considerable freedom in reparameterization. For example, by standard manipulation with the Gaussian process,
$$x_t \mid x_0 \sim \mathcal{N}\!\left( \sqrt{\bar{\alpha}_t}\, x_0,\, \sigma_t^2 I \right), \qquad x_{t-1} \mid x_t, x_0 \sim \mathcal{N}\!\left( \tilde{\mu}_t(x_t, x_0),\, \tilde{\sigma}_t^2 I \right)$$
where $\bar{\alpha}_t := \alpha_1 \cdots \alpha_t$ and $\sigma_t^2 := 1 - \bar{\alpha}_t$. In particular, notice that for large $t$, the variable $x_t \mid x_0$ converges to $\mathcal{N}(0, I)$. That is, after a long enough diffusion process, we end up with some $x_T$ that is very close to $\mathcal{N}(0, I)$, with all traces of the original $x_0 \sim q$ gone.

For example, since $x_t \mid x_0 \sim \mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^2 I\right)$, we can sample $x_t \mid x_0$ directly "in one step", instead of going through all the intermediate steps $x_1, x_2, \dots, x_{t-1}$.
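For illustration, a minimal PyTorch sketch of this one-step forward sampling; the linear `betas` schedule and all names here are assumptions made for the example, not prescriptions from the literature:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)     # \bar{\alpha}_t = \prod_s \alpha_s

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t | x_0 in one step: x_t = sqrt(abar_t) * x0 + sigma_t * z."""
    z = torch.randn_like(x0)
    sigma_t = (1.0 - alpha_bar[t]).sqrt()    # sigma_t^2 = 1 - abar_t
    return alpha_bar[t].sqrt() * x0 + sigma_t * z
```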

Derivation by reparameterization

We know $x_{t-1} \mid x_0$ is a Gaussian, and $x_t \mid x_{t-1}$ is another Gaussian. We also know that these are independent. Thus we can perform a reparameterization:
$$x_{t-1} = \sqrt{\bar{\alpha}_{t-1}}\, x_0 + \sqrt{1-\bar{\alpha}_{t-1}}\, z, \qquad x_t = \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1-\alpha_t}\, z'$$
where $z, z'$ are IID Gaussians.

There are 5 variables $x_0, x_{t-1}, x_t, z, z'$ and two linear equations. The two sources of randomness are $z, z'$, which can be reparameterized by rotation, since the IID Gaussian distribution is rotationally symmetric.

By plugging in the equations, we can solve for the first reparameterization:
$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \underbrace{\sqrt{\alpha_t - \bar{\alpha}_t}\, z + \sqrt{1-\alpha_t}\, z'}_{=\,\sigma_t z''}$$
where $z''$ is a Gaussian with mean zero and variance one.

To find the second one, we complete the rotation matrix:
$$\begin{bmatrix} z'' \\ z''' \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{\alpha_t - \bar{\alpha}_t}}{\sigma_t} & \frac{\sqrt{\beta_t}}{\sigma_t} \\ ? & ? \end{bmatrix} \begin{bmatrix} z \\ z' \end{bmatrix}$$

Since rotation matrices are all of the form $\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$, we know the matrix must be
$$\begin{bmatrix} z'' \\ z''' \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{\alpha_t - \bar{\alpha}_t}}{\sigma_t} & \frac{\sqrt{\beta_t}}{\sigma_t} \\ -\frac{\sqrt{\beta_t}}{\sigma_t} & \frac{\sqrt{\alpha_t - \bar{\alpha}_t}}{\sigma_t} \end{bmatrix} \begin{bmatrix} z \\ z' \end{bmatrix}$$
and since the inverse of a rotation matrix is its transpose,
$$\begin{bmatrix} z \\ z' \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{\alpha_t - \bar{\alpha}_t}}{\sigma_t} & -\frac{\sqrt{\beta_t}}{\sigma_t} \\ \frac{\sqrt{\beta_t}}{\sigma_t} & \frac{\sqrt{\alpha_t - \bar{\alpha}_t}}{\sigma_t} \end{bmatrix} \begin{bmatrix} z'' \\ z''' \end{bmatrix}$$

Plugging back in and simplifying, we have
$$x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sigma_t z'', \qquad x_{t-1} = \tilde{\mu}_t(x_t, x_0) - \tilde{\sigma}_t z'''$$

Backward diffusion


The key idea of DDPM is to use a neural network parametrized by $\theta$. The network takes in two arguments $x_t, t$, and outputs a vector $\mu_\theta(x_t, t)$ and a matrix $\Sigma_\theta(x_t, t)$, such that each step in the forward diffusion process can be approximately undone by $x_{t-1} \sim \mathcal{N}(\mu_\theta(x_t, t), \Sigma_\theta(x_t, t))$. This then gives us a backward diffusion process $p_\theta$ defined by
$$p_\theta(x_T) = \mathcal{N}(x_T \mid 0, I), \qquad p_\theta(x_{t-1} \mid x_t) = \mathcal{N}(x_{t-1} \mid \mu_\theta(x_t, t), \Sigma_\theta(x_t, t))$$
The goal now is to learn the parameters $\theta$ such that $p_\theta(x_0)$ is as close to $q(x_0)$ as possible. To do that, we use maximum likelihood estimation with variational inference.

Variational inference


The ELBO inequality states that
$$\ln p_\theta(x_0) \geq E_{x_{1:T} \sim q(\cdot \mid x_0)}\left[\ln p_\theta(x_{0:T}) - \ln q(x_{1:T} \mid x_0)\right]$$
and taking one more expectation, we get
$$E_{x_0 \sim q}[\ln p_\theta(x_0)] \geq E_{x_{0:T} \sim q}\left[\ln p_\theta(x_{0:T}) - \ln q(x_{1:T} \mid x_0)\right]$$
We see that maximizing the quantity on the right would give us a lower bound on the likelihood of observed data. This allows us to perform variational inference.

Define the loss function
$$L(\theta) := -E_{x_{0:T} \sim q}\left[\ln p_\theta(x_{0:T}) - \ln q(x_{1:T} \mid x_0)\right]$$
and now the goal is to minimize the loss by stochastic gradient descent. The expression may be simplified to[14]
$$L(\theta) = \sum_{t=1}^T E_{x_{t-1}, x_t \sim q}\left[-\ln p_\theta(x_{t-1} \mid x_t)\right] + E_{x_0 \sim q}\left[D_{KL}(q(x_T \mid x_0) \,\|\, p_\theta(x_T))\right] + C$$
where $C$ does not depend on the parameter, and thus can be ignored. Since $p_\theta(x_T) = \mathcal{N}(x_T \mid 0, I)$ also does not depend on the parameter, the term $E_{x_0 \sim q}[D_{KL}(q(x_T \mid x_0) \,\|\, p_\theta(x_T))]$ can also be ignored. This leaves just $L(\theta) = \sum_{t=1}^T L_t$ with $L_t = E_{x_{t-1}, x_t \sim q}[-\ln p_\theta(x_{t-1} \mid x_t)]$ to be minimized.

Noise prediction network


Since $x_{t-1} \mid x_t, x_0 \sim \mathcal{N}(\tilde{\mu}_t(x_t, x_0), \tilde{\sigma}_t^2 I)$, this suggests that we should use $\mu_\theta(x_t, t) = \tilde{\mu}_t(x_t, x_0)$; however, the network does not have access to $x_0$, and so it has to estimate it instead. Now, since $x_t \mid x_0 \sim \mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^2 I\right)$, we may write $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sigma_t z$, where $z$ is some unknown Gaussian noise. Now we see that estimating $x_0$ is equivalent to estimating $z$.

Therefore, let the network output a noise vector $\epsilon_\theta(x_t, t)$, and let it predict
$$\mu_\theta(x_t, t) = \tilde{\mu}_t\!\left(x_t, \frac{x_t - \sigma_t \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}\right) = \frac{x_t - \epsilon_\theta(x_t, t)\, \beta_t / \sigma_t}{\sqrt{\alpha_t}}$$
It remains to design $\Sigma_\theta(x_t, t)$. The DDPM paper suggested not learning it (since it resulted in "unstable training and poorer sample quality"), but fixing it at some value $\Sigma_\theta(x_t, t) = \zeta_t^2 I$, where either $\zeta_t^2 = \beta_t$ or $\tilde{\sigma}_t^2$ yielded similar performance.

With this, the loss simplifies to
$$L_t = \frac{\beta_t^2}{2\alpha_t \sigma_t^2 \zeta_t^2} E_{x_0 \sim q;\, z \sim \mathcal{N}(0,I)}\left[\left\|\epsilon_\theta(x_t, t) - z\right\|^2\right] + C$$
which may be minimized by stochastic gradient descent. The paper noted empirically that an even simpler loss function
$$L_{\mathrm{simple},t} = E_{x_0 \sim q;\, z \sim \mathcal{N}(0,I)}\left[\left\|\epsilon_\theta(x_t, t) - z\right\|^2\right]$$
resulted in better models.
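As a concrete illustration, a minimal sketch of one training step under $L_{\mathrm{simple}}$; the `model`, `optimizer`, and batching conventions are placeholders assumed for the example:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, x0, alpha_bar):
    """One SGD step on L_simple: teach the network to recover the added noise."""
    B = x0.shape[0]
    t = torch.randint(0, len(alpha_bar), (B,))            # random timestep per sample
    z = torch.randn_like(x0)                              # the noise to be predicted
    ab = alpha_bar[t].view(B, *([1] * (x0.dim() - 1)))    # broadcast over data dims
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * z          # one-step forward noising
    loss = F.mse_loss(model(x_t, t), z)                   # || eps_theta(x_t, t) - z ||^2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```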

Backward diffusion process


After a noise prediction network is trained, it can be used for generating data points in the original distribution in a loop as follows (a code sketch follows the list):

  1. Compute the noise estimate $\epsilon \leftarrow \epsilon_\theta(x_t, t)$
  2. Compute the original data estimate $\tilde{x}_0 \leftarrow (x_t - \sigma_t \epsilon)/\sqrt{\bar{\alpha}_t}$
  3. Sample the previous data $x_{t-1} \sim \mathcal{N}(\tilde{\mu}_t(x_t, \tilde{x}_0), \tilde{\sigma}_t^2 I)$
  4. Change time $t \leftarrow t-1$
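A sketch of this loop in PyTorch, with the standard DDPM posterior mean $\tilde{\mu}_t$ and variance $\tilde{\sigma}_t^2$ written out explicitly; all helper names are assumptions:

```python
import torch

@torch.no_grad()
def ddpm_sample(model, shape, betas):
    """Ancestral DDPM sampling; `model(x, t)` predicts the noise eps_theta."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)                                    # x_T ~ N(0, I)
    for t in range(len(betas) - 1, -1, -1):
        eps = model(x, torch.full((shape[0],), t))            # step 1: noise estimate
        ab, a, b = alpha_bar[t], alphas[t], betas[t]
        x0 = (x - (1.0 - ab).sqrt() * eps) / ab.sqrt()        # step 2: data estimate
        ab_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
        # posterior mean \tilde{mu}_t(x_t, x0) of the forward process
        mu = (a.sqrt() * (1 - ab_prev) * x + ab_prev.sqrt() * b * x0) / (1 - ab)
        if t > 0:
            var = (1 - ab_prev) * b / (1 - ab)                # \tilde{sigma}_t^2
            x = mu + var.sqrt() * torch.randn_like(x)         # step 3: sample x_{t-1}
        else:
            x = mu                                            # final step is the mean
    return x
```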

Score-based generative model


The score-based generative model is another formulation of diffusion modelling. Such models are also called noise conditional score networks (NCSN) or score matching with Langevin dynamics (SMLD).[15][16][17][18]

Score matching


The idea of score functions


Consider the problem of image generation. Let $x$ represent an image, and let $q(x)$ be the probability distribution over all possible images. If we have $q(x)$ itself, then we can say for certain how likely a certain image is. However, this is intractable in general.

Most often, we are uninterested in knowing the absolute probability of a certain image. Instead, we are usually only interested in knowing how likely a certain image is compared to its immediate neighbors — e.g., how much more likely is an image of a cat compared to some small variants of it? Is it more likely if the image contains two whiskers, or three, or with some Gaussian noise added?

Consequently, we are actually quite uninterested in $q(x)$ itself, but rather in $\nabla_x \ln q(x)$. This has two major effects:

- First, we no longer need to normalize $q(x)$: we may use any unnormalized density $\tilde{q}(x) = Cq(x)$ with unknown constant $C > 0$, since $\nabla_x \ln \tilde{q}(x) = \nabla_x \ln q(x)$.
- Second, we are comparing $q(x)$ only with its immediate neighbors $q(x + dx)$, since to first order $\frac{q(x + dx)}{q(x)} = e^{\langle \nabla_x \ln q(x),\, dx\rangle}$.

Let the score function be $s(x) := \nabla_x \ln q(x)$; then consider what we can do with $s(x)$.

As it turns out, $s(x)$ allows us to sample from $q(x)$ using thermodynamics. Specifically, if we have a potential energy function $U(x) = -\ln q(x)$, and a lot of particles in the potential well, then the distribution at thermodynamic equilibrium is the Boltzmann distribution
$$q_U(x) \propto e^{-U(x)/k_B T} = q(x)^{1/k_B T}$$
At temperature $k_B T = 1$, the Boltzmann distribution is exactly $q(x)$.

Therefore, to model $q(x)$, we may start with a particle sampled at any convenient distribution (such as the standard Gaussian distribution), then simulate the motion of the particle forwards according to the Langevin equation
$$dx_t = -\nabla_{x_t} U(x_t)\, dt + dW_t$$
and the Boltzmann distribution is, by the Fokker–Planck equation, the unique thermodynamic equilibrium. So no matter what distribution $x_0$ has, the distribution of $x_t$ converges in distribution to $q$ as $t \to \infty$.

Learning the score function


Given a density $q$, we wish to learn a score function approximation $f_\theta \approx \nabla \ln q$. This is score matching.[19] Typically, score matching is formalized as minimizing the Fisher divergence $E_q[\|f_\theta(x) - \nabla \ln q(x)\|^2]$. By expanding the integral and performing an integration by parts,
$$E_q\left[\|f_\theta(x) - \nabla \ln q(x)\|^2\right] = E_q\left[\|f_\theta\|^2 + 2\nabla \cdot f_\theta\right] + C$$
giving us a loss function, also known as the Hyvärinen scoring rule, that can be minimized by stochastic gradient descent.
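A sketch of this loss in PyTorch, with the divergence term $\nabla \cdot f_\theta$ approximated by Hutchinson's trace estimator; the estimator choice is our assumption for the example, not part of the original formulation:

```python
import torch

def hyvarinen_loss(f, x, n_probes=1):
    """Score matching loss E[||f(x)||^2 + 2 div f(x)] on a batch x of shape (B, d).

    div f(x) = tr(df/dx) is estimated as E_v[v^T (df/dx) v] with
    Rademacher probe vectors v (Hutchinson's estimator).
    """
    x = x.detach().requires_grad_(True)
    fx = f(x)
    div = torch.zeros(x.shape[0])
    for _ in range(n_probes):
        v = torch.randint_like(x, 2) * 2.0 - 1.0            # +-1 probe vector
        vjp = torch.autograd.grad(fx, x, grad_outputs=v, create_graph=True)[0]
        div = div + (vjp * v).sum(dim=-1)                   # estimates v^T (df/dx) v
    div = div / n_probes
    return (fx.pow(2).sum(dim=-1) + 2.0 * div).mean()
```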

Annealing the score function


Suppose we need to model the distribution of images, and we want $x_0 \sim \mathcal{N}(0, I)$, a white-noise image. Now, most white-noise images do not look like real images, so $q(x_0) \approx 0$ for large swaths of $x_0 \sim \mathcal{N}(0, I)$. This presents a problem for learning the score function, because if there are no samples around a certain point, then we can't learn the score function at that point. If we do not know the score function $\nabla_{x_t} \ln q(x_t)$ at that point, then we cannot impose the time-evolution equation on a particle:
$$dx_t = \nabla_{x_t} \ln q(x_t)\, dt + dW_t$$
To deal with this problem, we perform annealing. If $q$ is too different from a white-noise distribution, then progressively add noise until it is indistinguishable from one. That is, we perform a forward diffusion, then learn the score function, then use the score function to perform a backward diffusion.

Continuous diffusion processes


Forward diffusion process


Consider again the forward diffusion process, but this time in continuous time:
$$x_t = \sqrt{1-\beta_t}\, x_{t-1} + \sqrt{\beta_t}\, z_t$$
By taking the limit $\beta_t \to \beta(t)\, dt$, $\sqrt{dt}\, z_t \to dW_t$, we obtain a continuous diffusion process, in the form of a stochastic differential equation:
$$dx_t = -\frac{1}{2}\beta(t)\, x_t\, dt + \sqrt{\beta(t)}\, dW_t$$
where $W_t$ is a Wiener process (multidimensional Brownian motion).

Now, the equation is exactly a special case of the overdamped Langevin equation
$$dx_t = -\frac{D}{k_B T}(\nabla_x U)\, dt + \sqrt{2D}\, dW_t$$
where $D$ is the diffusion tensor, $T$ is the temperature, and $U$ is the potential energy field. If we substitute in $D = \frac{1}{2}\beta(t) I$, $k_B T = 1$, $U = \frac{1}{2}\|x\|^2$, we recover the above equation. This explains why the phrase "Langevin dynamics" is sometimes used in diffusion models.

Now the above equation is for the stochastic motion of a single particle. Suppose we have a cloud of particles distributed according to $q$ at time $t = 0$; then after a long time, the cloud of particles would settle into the stable distribution of $\mathcal{N}(0, I)$. Let $\rho_t$ be the density of the cloud of particles at time $t$. Then we have
$$\rho_0 = q; \qquad \rho_T \approx \mathcal{N}(0, I)$$
and the goal is to somehow reverse the process, so that we can start at the end and diffuse back to the beginning.

By the Fokker–Planck equation, the density of the cloud evolves according to
$$\partial_t \ln \rho_t = \frac{1}{2}\beta(t)\left(n + (x + \nabla \ln \rho_t) \cdot \nabla \ln \rho_t + \Delta \ln \rho_t\right)$$
where $n$ is the dimension of space, and $\Delta$ is the Laplace operator. Equivalently,
$$\partial_t \rho_t = \frac{1}{2}\beta(t)\left(\nabla \cdot (x \rho_t) + \Delta \rho_t\right)$$

Backward diffusion process


If we have solved $\rho_t$ for time $t \in [0, T]$, then we can exactly reverse the evolution of the cloud. Suppose we start with another cloud of particles with density $\nu_0 = \rho_T$, and let the particles in the cloud evolve according to
$$dy_t = \frac{1}{2}\beta(T-t)\, y_t\, dt + \beta(T-t) \underbrace{\nabla_{y_t} \ln \rho_{T-t}(y_t)}_{\text{score function}}\, dt + \sqrt{\beta(T-t)}\, dW_t$$
then by plugging into the Fokker–Planck equation, we find that $\partial_t \rho_{T-t} = \partial_t \nu_t$. Thus this cloud of points is the original cloud, evolving backwards.[20]

Noise conditional score network (NCSN)


At the continuous limit,
$$\bar{\alpha}_t = (1-\beta_1)\cdots(1-\beta_t) = e^{\sum_i \ln(1-\beta_i)} \to e^{-\int_0^t \beta(s)\, ds}$$
and so
$$x_t \mid x_0 \sim \mathcal{N}\!\left(e^{-\frac{1}{2}\int_0^t \beta(s)\, ds}\, x_0,\, \left(1 - e^{-\int_0^t \beta(s)\, ds}\right) I\right)$$
In particular, we see that we can directly sample from any point in the continuous diffusion process without going through the intermediate steps, by first sampling $x_0 \sim q$ and $z \sim \mathcal{N}(0, I)$, then computing $x_t = e^{-\frac{1}{2}\int_0^t \beta(s)\, ds}\, x_0 + \sqrt{1 - e^{-\int_0^t \beta(s)\, ds}}\, z$. That is, we can quickly sample $x_t \sim \rho_t$ for any $t \geq 0$.

Now, define a certain probability distribution $\gamma$ over $[0, \infty)$; then the score-matching loss function is defined as the expected Fisher divergence:
$$L(\theta) = E_{t \sim \gamma,\, x_t \sim \rho_t}\left[\|f_\theta(x_t, t)\|^2 + 2\nabla \cdot f_\theta(x_t, t)\right]$$
After training, $f_\theta(x_t, t) \approx \nabla \ln \rho_t$, so we can perform the backwards diffusion process by first sampling $x_T \sim \mathcal{N}(0, I)$, then integrating the SDE from $t = T$ to $t = 0$:
$$x_{t-dt} = x_t + \frac{1}{2}\beta(t)\, x_t\, dt + \beta(t)\, f_\theta(x_t, t)\, dt + \sqrt{\beta(t)}\, dW_t$$
This may be done by any SDE integration method, such as the Euler–Maruyama method.
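A minimal Euler–Maruyama sketch of this reverse-time integration; the trained score model and the $\beta(t)$ schedule are placeholders:

```python
import torch

@torch.no_grad()
def reverse_sde_sample(score_model, shape, beta_fn, T=1.0, n_steps=1000):
    """Integrate the reverse SDE from t = T down to t = 0 by Euler-Maruyama."""
    dt = T / n_steps
    x = torch.randn(shape)                                   # x_T ~ N(0, I)
    for i in range(n_steps, 0, -1):
        t = i * dt
        beta = beta_fn(t)                                    # beta(t), assumed callable
        score = score_model(x, torch.full((shape[0],), t))   # f_theta(x_t, t)
        drift = 0.5 * beta * x + beta * score
        x = x + drift * dt + (beta * dt) ** 0.5 * torch.randn_like(x)
    return x
```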

The name "noise conditional score network" is explained thus:

Their equivalence


DDPM and score-based generative models are equivalent.[16][1][21] This means that a network trained using DDPM can be used as an NCSN, and vice versa.

We know that $x_t \mid x_0 \sim \mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^2 I\right)$, so by Tweedie's formula, we have
$$\nabla_{x_t} \ln q(x_t) = \frac{1}{\sigma_t^2}\left(-x_t + \sqrt{\bar{\alpha}_t}\, E_q[x_0 \mid x_t]\right)$$
As described previously, the DDPM loss function is $\sum_t L_{\mathrm{simple},t}$ with
$$L_{\mathrm{simple},t} = E_{x_0 \sim q;\, z \sim \mathcal{N}(0,I)}\left[\left\|\epsilon_\theta(x_t, t) - z\right\|^2\right]$$
where $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sigma_t z$. By a change of variables,
$$L_{\mathrm{simple},t} = E_{x_0, x_t \sim q}\left[\left\|\epsilon_\theta(x_t, t) - \frac{x_t - \sqrt{\bar{\alpha}_t}\, x_0}{\sigma_t}\right\|^2\right] = E_{x_t \sim q,\, x_0 \sim q(\cdot \mid x_t)}\left[\left\|\epsilon_\theta(x_t, t) - \frac{x_t - \sqrt{\bar{\alpha}_t}\, x_0}{\sigma_t}\right\|^2\right]$$
and the term inside becomes a least squares regression, so if the network actually reaches the global minimum of loss, then we have
$$\epsilon_\theta(x_t, t) = \frac{x_t - \sqrt{\bar{\alpha}_t}\, E_q[x_0 \mid x_t]}{\sigma_t} = -\sigma_t \nabla_{x_t} \ln q(x_t)$$

Thus, given a good score-based network, its predicted score is a good prediction of the noise (after scaling byσt{\displaystyle \sigma _{t}}), and thus can be used for denoising.

Conversely, the continuous limit $x_{t-1} = x_{t-dt}$, $\beta_t = \beta(t)\, dt$, $z_t \sqrt{dt} = dW_t$ of the backward equation
$$x_{t-1} = \frac{x_t}{\sqrt{\alpha_t}} - \frac{\beta_t}{\sigma_t \sqrt{\alpha_t}}\, \epsilon_\theta(x_t, t) + \sqrt{\beta_t}\, z_t; \qquad z_t \sim \mathcal{N}(0, I)$$
gives us precisely the same equation as score-based diffusion:
$$x_{t-dt} = x_t(1 + \beta(t)\, dt / 2) + \beta(t)\, \nabla_{x_t} \ln q(x_t)\, dt + \sqrt{\beta(t)}\, dW_t$$
Thus, at infinitesimal steps of DDPM, a denoising network performs score-based diffusion.

Main variants


Noise schedule

[Figure: illustration of a linear diffusion noise schedule, with settings $\beta_1 = 10^{-4}$, $\beta_{1000} = 0.02$.]

In DDPM, the sequence of numbers $0 = \sigma_0 < \sigma_1 < \cdots < \sigma_T < 1$ is called a (discrete time) noise schedule. In general, consider a strictly increasing monotonic function $\sigma$ of type $\mathbb{R} \to (0, 1)$, such as the sigmoid function. In that case, a noise schedule is a sequence of real numbers $\lambda_1 < \lambda_2 < \cdots < \lambda_T$. It then defines a sequence of noises $\sigma_t := \sigma(\lambda_t)$, which then derives the other quantities $\beta_t = 1 - \frac{1 - \sigma_t^2}{1 - \sigma_{t-1}^2}$.
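For concreteness, a sketch deriving the $\beta_t$ from a $\sigma_t$ schedule; the sigmoid choice and the range of $\lambda_t$ are assumptions made for illustration:

```python
import torch

T = 1000
lam = torch.linspace(-6.0, 6.0, T)                     # lambda_1 < ... < lambda_T (assumed)
sigma = torch.sigmoid(lam)                             # sigma_t = sigma(lambda_t) in (0, 1)
sigma_prev = torch.cat([torch.zeros(1), sigma[:-1]])   # with sigma_0 = 0
beta = 1.0 - (1.0 - sigma**2) / (1.0 - sigma_prev**2)  # beta_t derived from the schedule
```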

In order to use arbitrary noise schedules, instead of training a noise prediction model $\epsilon_\theta(x_t, t)$, one trains $\epsilon_\theta(x_t, \sigma_t)$.

Similarly, for the noise conditional score network, instead of training $f_\theta(x_t, t)$, one trains $f_\theta(x_t, \sigma_t)$.

Denoising Diffusion Implicit Model (DDIM)


The original DDPM method for generating images is slow, since the forward diffusion process usually takes $T \sim 1000$ steps to make the distribution of $x_T$ appear close to Gaussian. However, this means the backward diffusion process also takes 1000 steps. Unlike the forward diffusion process, which can skip steps since $x_t \mid x_0$ is Gaussian for all $t \geq 1$, the backward diffusion process does not allow skipping steps. For example, to sample $x_{t-2} \mid x_{t-1} \sim \mathcal{N}(\mu_\theta(x_{t-1}, t-1), \Sigma_\theta(x_{t-1}, t-1))$ requires the model to first sample $x_{t-1}$. Attempting to directly sample $x_{t-2} \mid x_t$ would require us to marginalize out $x_{t-1}$, which is generally intractable.

DDIM[22] is a method to take any model trained on the DDPM loss, and use it to sample with some steps skipped, sacrificing an adjustable amount of quality. If we generalize the Markovian chain of DDPM to the non-Markovian case, DDIM corresponds to the case where the reverse process has variance equal to 0. In other words, the reverse process (and also the forward process) is deterministic. When using fewer sampling steps, DDIM outperforms DDPM.

In detail, the DDIM sampling method is as follows. Start with the forward diffusion process $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sigma_t \epsilon$. Then, during the backward denoising process, given $x_t$ and $\epsilon_\theta(x_t, t)$, the original data is estimated as
$$x_0' = \frac{x_t - \sigma_t \epsilon_\theta(x_t, t)}{\sqrt{\bar{\alpha}_t}}$$
Then the backward diffusion process can jump to any step $0 \leq s < t$, and the next denoised sample is
$$x_s = \sqrt{\bar{\alpha}_s}\, x_0' + \sqrt{\sigma_s^2 - (\sigma_s')^2}\, \epsilon_\theta(x_t, t) + \sigma_s' \epsilon$$
where $\sigma_s'$ is an arbitrary real number within the range $[0, \sigma_s]$, and $\epsilon \sim \mathcal{N}(0, I)$ is a newly sampled Gaussian noise.[14] If all $\sigma_s' = 0$, then the backward process becomes deterministic, and this special case of DDIM is also called "DDIM". The original paper noted that when the process is deterministic, samples generated with only 20 steps are already very similar at a high level to ones generated with 1000 steps.

The original paper recommended defining a single "eta value" $\eta \in [0, 1]$, such that $\sigma_s' = \eta \tilde{\sigma}_s$. When $\eta = 1$, this is the original DDPM. When $\eta = 0$, this is the fully deterministic DDIM. For intermediate values, the process interpolates between them.
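A sketch of one DDIM jump from step $t$ to an earlier step $s$ under this convention; the model signature and the form used for $\tilde{\sigma}_s$ are assumptions consistent with the formulas above:

```python
import torch

@torch.no_grad()
def ddim_step(model, x_t, t, s, alpha_bar, eta=0.0):
    """Jump x_t -> x_s with s < t. eta=0: deterministic DDIM; eta=1: DDPM-like."""
    ab_t, ab_s = alpha_bar[t], alpha_bar[s]
    sig_t, sig_s = (1 - ab_t).sqrt(), (1 - ab_s).sqrt()
    eps = model(x_t, t)
    x0 = (x_t - sig_t * eps) / ab_t.sqrt()                 # estimate x_0'
    # sigma'_s = eta * tilde{sigma}_s, where eta=1 matches the DDPM posterior std
    sig_p = eta * ((1 - ab_s) / (1 - ab_t)).sqrt() * (1 - ab_t / ab_s).sqrt()
    noise = torch.randn_like(x_t) if eta > 0 else torch.zeros_like(x_t)
    return ab_s.sqrt() * x0 + (sig_s**2 - sig_p**2).sqrt() * eps + sig_p * noise
```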

By the equivalence, the DDIM algorithm also applies for score-based diffusion models.

Latent diffusion model (LDM)

Main article: Latent diffusion model

Since the diffusion model is a general method for modelling probability distributions, if one wants to model a distribution over images, one can first encode the images into a lower-dimensional space by an encoder, then use a diffusion model to model the distribution over encoded images. Then to generate an image, one can sample from the diffusion model, then use a decoder to decode it into an image.[23]

The encoder-decoder pair is most often avariational autoencoder (VAE).

Architectural improvements


Later work[24] proposed various architectural improvements. For example, it proposed log-space interpolation during backward sampling. Instead of sampling from $x_{t-1} \sim \mathcal{N}(\tilde{\mu}_t(x_t, \tilde{x}_0), \tilde{\sigma}_t^2 I)$, it recommended sampling from $\mathcal{N}(\tilde{\mu}_t(x_t, \tilde{x}_0), (\sigma_t^v \tilde{\sigma}_t^{1-v})^2 I)$ for a learned parameter $v$.

In the v-prediction formalism, the noising formula $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon_t$ is reparameterised by an angle $\phi_t$ such that $\cos \phi_t = \sqrt{\bar{\alpha}_t}$, and a "velocity" defined by $\cos \phi_t\, \epsilon_t - \sin \phi_t\, x_0$. The network is trained to predict the velocity $\hat{v}_\theta$, and denoising is by
$$x_{\phi_t - \delta} = \cos(\delta)\, x_{\phi_t} - \sin(\delta)\, \hat{v}_\theta(x_{\phi_t})$$
[25] This parameterization was found to improve performance, as the model can be trained to reach total noise (i.e. $\phi_t = 90^\circ$) and then reverse it, whereas the standard parameterization never reaches total noise since $\sqrt{\bar{\alpha}_t} > 0$ is always true.[26]
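A sketch of the corresponding training target; the mean-squared-error loss form and all names here are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def v_prediction_loss(model, x0, t, alpha_bar):
    """Train the network to predict v = cos(phi_t) * eps - sin(phi_t) * x0."""
    B = x0.shape[0]
    ab = alpha_bar[t].view(B, *([1] * (x0.dim() - 1)))
    cos_phi, sin_phi = ab.sqrt(), (1.0 - ab).sqrt()       # cos(phi_t) = sqrt(abar_t)
    eps = torch.randn_like(x0)
    x_t = cos_phi * x0 + sin_phi * eps                    # forward noising
    v_target = cos_phi * eps - sin_phi * x0               # the "velocity" target
    return F.mse_loss(model(x_t, t), v_target)
```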

Classifier guidance


Classifier guidance was proposed in 2021 to improve class-conditional generation by using a classifier. The original publication used CLIP text encoders to improve text-conditional image generation.[27]

Suppose we wish to sample not from the entire distribution of images, but conditional on the image description. We don't want to sample a generic image, but an image that fits the description "black cat with red eyes". Generally, we want to sample from the distribution $p(x \mid y)$, where $x$ ranges over images, and $y$ ranges over classes of images (a description "black cat with red eyes" is just a very detailed class, and a class "cat" is just a very vague description).

Taking the perspective of the noisy channel model, we can understand the process as follows: to generate an image $x$ conditional on description $y$, we imagine that the requester really had in mind an image $x$, but the image is passed through a noisy channel and came out garbled, as $y$. Image generation is then nothing but inferring which $x$ the requester had in mind.

In other words, conditional image generation is simply "translating from a textual language into a pictorial language". Then, as in the noisy-channel model, we use Bayes' theorem to get
$$p(x \mid y) \propto p(y \mid x)\, p(x)$$
In other words, if we have a good model of the space of all images, and a good image-to-class translator, we get a class-to-image translator "for free". In the equation for backward diffusion, the score $\nabla \ln p(x)$ can be replaced by
$$\nabla_x \ln p(x \mid y) = \underbrace{\nabla_x \ln p(x)}_{\text{score}} + \underbrace{\nabla_x \ln p(y \mid x)}_{\text{classifier guidance}}$$
where $\nabla_x \ln p(x)$ is the score function, trained as previously described, and $\nabla_x \ln p(y \mid x)$ is found by using a differentiable image classifier.

During the diffusion process, we need to condition on the time, giving
$$\nabla_{x_t} \ln p(x_t \mid y, t) = \nabla_{x_t} \ln p(y \mid x_t, t) + \nabla_{x_t} \ln p(x_t \mid t)$$
Usually, the classifier model does not depend on time, in which case $p(y \mid x_t, t) = p(y \mid x_t)$.

Classifier guidance is defined for the gradient of the score function, thus for score-based diffusion networks; but as previously noted, score-based diffusion models are equivalent to denoising models by $\epsilon_\theta(x_t, t) = -\sigma_t \nabla_{x_t} \ln p(x_t \mid t)$, and similarly, $\epsilon_\theta(x_t, y, t) = -\sigma_t \nabla_{x_t} \ln p(x_t \mid y, t)$. Therefore, classifier guidance works for denoising diffusion as well, using the modified noise prediction:[27]
$$\epsilon_\theta(x_t, y, t) = \epsilon_\theta(x_t, t) - \underbrace{\sigma_t \nabla_{x_t} \ln p(y \mid x_t, t)}_{\text{classifier guidance}}$$
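A sketch of this modified noise prediction; `classifier(x, t)` returning class logits is a placeholder assumption:

```python
import torch

def guided_eps(model, classifier, x_t, y, t, sigma_t):
    """eps_guided = eps_theta(x_t, t) - sigma_t * grad_x log p(y | x_t, t)."""
    with torch.enable_grad():
        x = x_t.detach().requires_grad_(True)
        log_probs = classifier(x, t).log_softmax(dim=-1)   # log p(. | x_t, t)
        sel = log_probs[torch.arange(x.shape[0]), y].sum() # log p(y | x_t) per sample
        grad = torch.autograd.grad(sel, x)[0]              # gradient w.r.t. the image
    return model(x_t, t) - sigma_t * grad
```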

With temperature


The classifier-guided diffusion model samples from $p(x \mid y)$, which is concentrated around the maximum a posteriori estimate $\arg\max_x p(x \mid y)$. If we want to force the model to move towards the maximum likelihood estimate $\arg\max_x p(y \mid x)$, we can use
$$p_\gamma(x \mid y) \propto p(y \mid x)^\gamma\, p(x)$$
where $\gamma > 0$ is interpretable as an inverse temperature. In the context of diffusion models, it is usually called the guidance scale. A high $\gamma$ would force the model to sample from a distribution concentrated around $\arg\max_x p(y \mid x)$. This sometimes improves quality of generated images.[27]

This gives a modification to the previous equation:
$$\nabla_x \ln p_\gamma(x \mid y) = \nabla_x \ln p(x) + \gamma\, \nabla_x \ln p(y \mid x)$$
For denoising models, it corresponds to[28]
$$\epsilon_\theta(x_t, y, t) = \epsilon_\theta(x_t, t) - \gamma\, \sigma_t \nabla_{x_t} \ln p(y \mid x_t, t)$$

Classifier-free guidance (CFG)


If we do not have a classifier $p(y \mid x)$, we can still extract one out of the image model itself:[28]
$$\nabla_x \ln p_\gamma(x \mid y) = (1 - \gamma)\, \nabla_x \ln p(x) + \gamma\, \nabla_x \ln p(x \mid y)$$
Such a model is usually trained by presenting it with both $(x, y)$ and $(x, \mathrm{None})$, allowing it to model both $\nabla_x \ln p(x \mid y)$ and $\nabla_x \ln p(x)$.

Note that for CFG, the diffusion model cannot be merely a generative model of the entire data distribution $\nabla_x \ln p(x)$. It must be a conditional generative model $\nabla_x \ln p(x \mid y)$. For example, in Stable Diffusion, the diffusion backbone takes as input a noisy image $x_t$, a time $t$, and a conditioning vector $y$ (such as a vector encoding a text prompt), and produces a noise prediction $\epsilon_\theta(x_t, y, t)$.

For denoising models, it corresponds to
$$\epsilon_\theta(x_t, y, t, \gamma) = \epsilon_\theta(x_t, t) + \gamma\left(\epsilon_\theta(x_t, y, t) - \epsilon_\theta(x_t, t)\right)$$
As sampled by DDIM, the algorithm can be written as[29]
$$\begin{aligned}
\epsilon_{\text{uncond}} &\leftarrow \epsilon_\theta(x_t, t)\\
\epsilon_{\text{cond}} &\leftarrow \epsilon_\theta(x_t, t, c)\\
\epsilon_{\text{CFG}} &\leftarrow \epsilon_{\text{uncond}} + \gamma(\epsilon_{\text{cond}} - \epsilon_{\text{uncond}})\\
x_0 &\leftarrow (x_t - \sigma_t \epsilon_{\text{CFG}})/\sqrt{1 - \sigma_t^2}\\
x_s &\leftarrow \sqrt{1 - \sigma_s^2}\, x_0 + \sqrt{\sigma_s^2 - (\sigma_s')^2}\, \epsilon_{\text{uncond}} + \sigma_s' \epsilon
\end{aligned}$$
A similar technique applies to language model sampling. Also, if the unconditional generation $\epsilon_{\text{uncond}} \leftarrow \epsilon_\theta(x_t, t)$ is replaced by $\epsilon_{\text{neg cond}} \leftarrow \epsilon_\theta(x_t, t, c')$, the result is negative prompting, which pushes the generation away from the condition $c'$.[30][31]
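A sketch of one classifier-free-guided DDIM update following the algorithm above; the conditional model signature, with `None` for the unconditional branch, is an assumed convention:

```python
import torch

@torch.no_grad()
def cfg_ddim_step(model, x_t, t, s, c, sigma, gamma=7.5, sigma_prime=0.0):
    """One guided DDIM jump t -> s, mixing conditional and unconditional noise."""
    eps_uncond = model(x_t, t, None)                       # eps_theta(x_t, t)
    eps_cond = model(x_t, t, c)                            # eps_theta(x_t, t, c)
    eps_cfg = eps_uncond + gamma * (eps_cond - eps_uncond)
    x0 = (x_t - sigma[t] * eps_cfg) / (1 - sigma[t] ** 2).sqrt()
    eps_new = torch.randn_like(x_t)
    return ((1 - sigma[s] ** 2).sqrt() * x0
            + (sigma[s] ** 2 - sigma_prime ** 2) ** 0.5 * eps_uncond
            + sigma_prime * eps_new)
```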

Samplers


Given a diffusion model, one may regard it either as a continuous process and sample from it by integrating an SDE, or regard it as a discrete process and sample from it by iterating the discrete steps. The choice of the "noise schedule" $\beta_t$ can also affect the quality of samples. A noise schedule is a function that sends a natural number to a noise level:
$$t \mapsto \beta_t, \qquad t \in \{1, 2, \dots\},\; \beta_t \in (0, 1)$$
A noise schedule is more often specified by a map $t \mapsto \sigma_t$. The two definitions are equivalent, since $\beta_t = 1 - \frac{1 - \sigma_t^2}{1 - \sigma_{t-1}^2}$.

In the DDPM perspective, one can use the DDPM itself (with noise), or DDIM (with an adjustable amount of noise). The case where one adds noise is sometimes called ancestral sampling.[32] One can interpolate between noise and no noise. The amount of noise is denoted $\eta$ ("eta value") in the DDIM paper, with $\eta = 0$ denoting no noise (as in deterministic DDIM), and $\eta = 1$ denoting full noise (as in DDPM).

In the perspective of SDEs, one can use any of the numerical integration methods, such as the Euler–Maruyama method, Heun's method, linear multistep methods, etc. Just as in the discrete case, one can add an adjustable amount of noise during the integration.[33]

A survey and comparison of samplers in the context of image generation can be found in [34].

Other examples


Notable variants include[35] Poisson flow generative model,[36] consistency model,[37] critically damped Langevin diffusion,[38] GenPhys,[39] cold diffusion,[40] etc.

Flow-based diffusion model


Abstractly speaking, the idea of diffusion models is to take an unknown probability distribution (the distribution of natural-looking images), then progressively convert it to a known probability distribution (the standard Gaussian distribution), by building an absolutely continuous probability path connecting them. The probability path is in fact defined implicitly by the score function $\nabla \ln p_t$.

In denoising diffusion models, the forward process adds noise, and the backward process removes noise. Both the forward and backward processes are SDEs, though the forward process is integrable in closed form, so it can be done at no computational cost. The backward process is not integrable in closed form, so it must be integrated step-by-step by standard SDE solvers, which can be very expensive. The probability path in diffusion models is defined through an Itô process, and one can retrieve the deterministic process by using the probability flow ODE formulation.[1]

In flow-based diffusion models, the forward process is a deterministic flow along a time-dependent vector field, and the backward process is also a deterministic flow along the same vector field, but going backwards. Both processes are solutions toODEs. If the vector field is well-behaved, the ODE will also be well-behaved.

Given two distributions $\pi_0$ and $\pi_1$, a flow-based model is a time-dependent velocity field $v_t(x)$ on $[0, 1] \times \mathbb{R}^d$, such that if we start by sampling a point $x \sim \pi_0$, and let it move according to the velocity field:
$$\frac{d}{dt}\phi_t(x) = v_t(\phi_t(x)), \qquad t \in [0, 1], \qquad \text{starting from } \phi_0(x) = x$$
we end up with a point $x_1 \sim \pi_1$. The solution $\phi_t$ of the above ODE defines a probability path $p_t = [\phi_t]_\# \pi_0$ by the pushforward measure operator. In particular, $[\phi_1]_\# \pi_0 = \pi_1$.

The probability path and the velocity field also satisfy the continuity equation, in the sense of probability distributions:
$$\partial_t p_t + \nabla \cdot (v_t p_t) = 0$$
To construct a probability path, we start by constructing a conditional probability path $p_t(x \mid z)$ and the corresponding conditional velocity field $v_t(x \mid z)$, for some conditioning distribution $q(z)$. A natural choice is the Gaussian conditional probability path:
$$p_t(x \mid z) = \mathcal{N}\!\left(m_t(z), \zeta_t^2 I\right)$$
The conditional velocity field which corresponds to the geodesic path between the endpoints of the conditional Gaussian path is
$$v_t(x \mid z) = \frac{\zeta_t'}{\zeta_t}(x - m_t(z)) + m_t'(z)$$
The marginal probability path and velocity field are then computed by marginalizing:
$$p_t(x) = \int p_t(x \mid z)\, q(z)\, dz \qquad \text{and} \qquad v_t(x) = \mathbb{E}_{q(z)}\!\left[\frac{v_t(x \mid z)\, p_t(x \mid z)}{p_t(x)}\right]$$
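In practice the marginal $v_t$ is not computed explicitly; a network $v_\theta$ is regressed onto the conditional velocity. A sketch under the choices $z = x_1$ (a data sample), $m_t(z) = t z$, and $\zeta_t = 1 - (1 - \sigma_{\min}) t$, all of which are assumptions made for this example:

```python
import torch

def conditional_flow_matching_loss(v_model, x1, sigma_min=1e-4):
    """Regress v_theta(x, t) onto v_t(x|z) for the Gaussian path
    p_t(x|z) = N(t*z, zeta_t^2 I) with z = x1 and zeta_t = 1 - (1 - sigma_min)*t."""
    B = x1.shape[0]
    t = torch.rand(B).view(B, *([1] * (x1.dim() - 1)))
    zeta = 1.0 - (1.0 - sigma_min) * t
    x = t * x1 + zeta * torch.randn_like(x1)              # sample x ~ p_t(x | z)
    # v_t(x|z) = (zeta'/zeta)(x - m_t(z)) + m_t'(z), with zeta' = -(1 - sigma_min)
    v_target = (-(1.0 - sigma_min) / zeta) * (x - t * x1) + x1
    return ((v_model(x, t.flatten()) - v_target) ** 2).mean()
```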

Optimal transport flow


The idea of optimal transport flow[41] is to construct a probability path minimizing the Wasserstein metric. The distribution on which we condition is an approximation of the optimal transport plan between $\pi_0$ and $\pi_1$: $z = (x_0, x_1)$ and $q(z) = \Gamma(\pi_0, \pi_1)$, where $\Gamma$ is the optimal transport plan, which can be approximated by mini-batch optimal transport. If the batch size is not large, then the computed transport can be very far from the true optimal transport.

Rectified flow


The idea of rectified flow[42][43] is to learn a flow model such that the velocity is nearly constant along each flow path. This is beneficial, because we can integrate along such a vector field with very few steps. For example, if an ODE $\dot{\phi}_t(x) = v_t(\phi_t(x))$ follows perfectly straight paths, it simplifies to $\phi_t(x) = x_0 + t \cdot v_0(x_0)$, allowing for exact solutions in one step. In practice, we cannot reach such perfection, but when the flow field is nearly so, we can take a few large steps instead of many little steps.

[Figure: linear interpolation, rectified flow, and straightened rectified flow.[1]]

The general idea is to start with two distributions $\pi_0$ and $\pi_1$, then construct a flow field $\phi^0 = \{\phi_t : t \in [0, 1]\}$ from it, then repeatedly apply a "reflow" operation to obtain successive flow fields $\phi^1, \phi^2, \dots$, each straighter than the previous one. When the flow field is straight enough for the application, we stop.

Generally, for any time-differentiable process $\phi_t$, $v_t$ can be estimated by solving:
$$\min_\theta \int_0^1 \mathbb{E}_{x \sim p_t}\left[\left\| v_t(x, \theta) - v_t(x) \right\|^2\right] dt.$$

In rectified flow, by injecting strong priors that intermediate trajectories are straight, it can achieve both theoretical relevance for optimal transport and computational efficiency, as ODEs with straight paths can be simulated precisely without time discretization.

[Figure: transport by rectified flow.[42]]

Specifically, rectified flow seeks to match an ODE with the marginal distributions of the linear interpolation between points from distributions $\pi_0$ and $\pi_1$. Given observations $x_0 \sim \pi_0$ and $x_1 \sim \pi_1$, the canonical linear interpolation $x_t = t x_1 + (1-t) x_0$, $t \in [0, 1]$ yields a trivial case $\dot{x}_t = x_1 - x_0$, which cannot be causally simulated without $x_1$. To address this, $x_t$ is "projected" into a space of causally simulatable ODEs, by minimizing the least squares loss with respect to the direction $x_1 - x_0$:
$$\min_\theta \int_0^1 \mathbb{E}_{\pi_0, \pi_1, p_t}\left[\left\| (x_1 - x_0) - v_t(x_t) \right\|^2\right] dt.$$
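A sketch of this objective for an independently coupled batch; the model signature and the uniform sampling of $t$ are assumptions:

```python
import torch

def rectified_flow_loss(v_model, x0, x1):
    """Match v_theta(x_t, t) to the straight-line direction x1 - x0
    along the linear interpolation x_t = t * x1 + (1 - t) * x0."""
    B = x0.shape[0]
    t = torch.rand(B).view(B, *([1] * (x0.dim() - 1)))
    x_t = t * x1 + (1.0 - t) * x0
    return ((v_model(x_t, t.flatten()) - (x1 - x0)) ** 2).mean()
```

Sampling then amounts to integrating $\dot{x} = v_\theta(x, t)$ from a draw of $\pi_0$, e.g. with a handful of Euler steps.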

The data pair $(x_0, x_1)$ can be any coupling of $\pi_0$ and $\pi_1$, typically independent (i.e., $(x_0, x_1) \sim \pi_0 \times \pi_1$), obtained by randomly combining observations from $\pi_0$ and $\pi_1$. This process ensures that the trajectories closely mirror the density map of $x_t$ trajectories, but reroute at intersections to ensure causality.

[Figure: the reflow process.[42]]

A distinctive aspect of rectified flow is its capability for "reflow", which straightens the trajectory of ODE paths. Denote the rectified flow $\phi^0 = \{\phi_t : t \in [0, 1]\}$ induced from $(x_0, x_1)$ as $\phi^0 = \mathsf{Rectflow}((x_0, x_1))$. Recursively applying this $\mathsf{Rectflow}(\cdot)$ operator generates a series of rectified flows $\phi^{k+1} = \mathsf{Rectflow}((\phi_0^k(x_0), \phi_1^k(x_1)))$. This "reflow" process not only reduces transport costs but also straightens the paths of rectified flows, making $\phi^k$ paths straighter with increasing $k$.

Rectified flow includes a nonlinear extension where the linear interpolation $x_t$ is replaced with any time-differentiable curve that connects $x_0$ and $x_1$, given by $x_t = \alpha_t x_1 + \beta_t x_0$. This framework encompasses DDIM and probability flow ODEs as special cases, with particular choices of $\alpha_t$ and $\beta_t$. However, in the case where the path of $x_t$ is not straight, the reflow process no longer ensures a reduction in convex transport costs, and it no longer straightens the paths of $\phi_t$.[42]

Choice of architecture

Architecture of Stable Diffusion
The denoising process used by Stable Diffusion

Diffusion model


For generating images by DDPM, we need a neural network that takes a time $t$ and a noisy image $x_t$, and predicts the noise $\epsilon_\theta(x_t, t)$ in it. Since predicting the noise is equivalent to predicting the denoised image (one is recovered from the other by rescaling and subtracting from $x_t$), denoising architectures tend to work well. For example, the U-Net, which was found to be good for denoising images, is often used for denoising diffusion models that generate images.[44]

For DDPM, the underlying architecture ("backbone") does not have to be a U-Net. It just has to predict the noise somehow. For example, the diffusion transformer (DiT) uses a Transformer to predict the mean and diagonal covariance of the noise, given the textual conditioning and the partially denoised image. It is the same as a standard U-Net-based denoising diffusion model, with a Transformer replacing the U-Net.[45] Mixture-of-experts Transformers can also be applied.[46]

DDPM can be used to model general data distributions, not just natural-looking images. For example, Human Motion Diffusion[47] models human motion trajectories by DDPM. Each human motion trajectory is a sequence of poses, represented by either joint rotations or positions. It uses a Transformer network to generate a less noisy trajectory out of a noisy one.

Conditioning


The base diffusion model can only generate unconditionally from the whole distribution. For example, a diffusion model learned on ImageNet would generate images that look like a random image from ImageNet. To generate images from just one category, one would need to impose the condition, and then sample from the conditional distribution. Whatever condition one wants to impose, one needs to first convert the conditioning into a vector of floating point numbers, then feed it into the underlying diffusion model neural network. However, one has freedom in choosing how to convert the conditioning into a vector.

Stable Diffusion, for example, imposes conditioning in the form of a cross-attention mechanism, where the query is an intermediate representation of the image in the U-Net, and both key and value are the conditioning vectors. The conditioning can be selectively applied to only parts of an image, and new kinds of conditionings can be finetuned upon the base model, as used in ControlNet.[48]

As a particularly simple example, consider image inpainting. The conditions are $\tilde{x}$, the reference image, and $m$, the inpainting mask. The conditioning is imposed at each step of the backward diffusion process, by first sampling $\tilde{x}_t \sim \mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\, \tilde{x}, \sigma_t^2 I\right)$, a noisy version of $\tilde{x}$, then replacing $x_t$ with $(1-m) \odot x_t + m \odot \tilde{x}_t$, where $\odot$ means elementwise multiplication.[49] Another application of the cross-attention mechanism is prompt-to-prompt image editing.[50]
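A sketch of this replacement step as it would sit inside a sampling loop; the convention that `mask` is 1 on the known region is an assumption:

```python
import torch

def inpaint_condition(x_t, x_ref, mask, t, alpha_bar):
    """Overwrite the known region of x_t with a matched-noise copy of the reference."""
    ab = alpha_bar[t]
    x_ref_t = ab.sqrt() * x_ref + (1.0 - ab).sqrt() * torch.randn_like(x_ref)
    return (1.0 - mask) * x_t + mask * x_ref_t             # keep generation elsewhere
```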

Conditioning is not limited to generating images from a specific category, or according to a specific caption (as in text-to-image). For example, one study[47] demonstrated generating human motion conditioned on an audio clip of human walking (allowing syncing motion to a soundtrack), on a video of human running, or on a text description of human motion. For how conditional diffusion models are mathematically formulated, see the methodological summary in [51].

Upscaling


As generating an image takes a long time, one can try to generate a small image by a base diffusion model, then upscale it by other models. Upscaling can be done by GAN,[52] Transformer,[53] or signal processing methods like Lanczos resampling.

Diffusion models themselves can be used to perform upscaling. A cascading diffusion model stacks multiple diffusion models one after another, in the style of Progressive GAN. The lowest level is a standard diffusion model that generates a 32×32 image; the image is then upscaled by a diffusion model specifically trained for upscaling, and the process repeats.[44]

In more detail, the diffusion upscaler is trained on pairs of low- and high-resolution images, with each denoising step for the high-resolution image conditioned on a noised copy of the low-resolution image.[44]

Examples


This section collects some notable diffusion models, and briefly describes their architecture.

OpenAI

Main articles: DALL-E and Sora (text-to-video model)

The DALL-E series by OpenAI are text-conditional diffusion models of images.

The first version of DALL-E (2021) is not actually a diffusion model. Instead, it uses a Transformer architecture that autoregressively generates a sequence of tokens, which is then converted to an image by the decoder of a discrete VAE. Released with DALL-E was the CLIP classifier, which was used by DALL-E to rank generated images according to how close the image fits the text.

GLIDE (2022-03)[54] is a 3.5-billion-parameter diffusion model, and a small version was released publicly.[5] Soon after, DALL-E 2 was released (2022-04).[55] DALL-E 2 is a 3.5-billion-parameter cascaded diffusion model that generates images from text by "inverting the CLIP image encoder", a technique termed "unCLIP".

The unCLIP method contains 4 models: a CLIP image encoder, a CLIP text encoder, an image decoder, and a "prior" model (which can be a diffusion model, or an autoregressive model). During training, the prior model is trained to convert CLIP image encodings to CLIP text encodings. The image decoder is trained to convert CLIP image encodings back to images. During inference, a text is converted by the CLIP text encoder to a vector, then it is converted by the prior model to an image encoding, then it is converted by the image decoder to an image.

Sora (2024-02) is a diffusion Transformer model (DiT).

Stability AI

Main article: Stable Diffusion

Stable Diffusion (2022-08), released by Stability AI, consists of a denoising latent diffusion model (860 million parameters), a VAE, and a text encoder. The denoising network is a U-Net, with cross-attention blocks to allow for conditional image generation.[56][23]

Stable Diffusion 3 (2024-03)[57] replaced the U-Net backbone of the latent diffusion model with a Transformer, making it a DiT. It is trained with rectified flow.

Stable Video 4D (2024-07)[58] is a latent diffusion model for videos of 3D objects.

Google


Imagen (2022)[59][60] uses a T5-XXL language model to encode the input text into an embedding vector. It is a cascaded diffusion model with three sub-models. The first denoises white noise into a 64×64 image, conditional on the embedding vector of the text; this model has 2B parameters. The second upscales the image from 64×64 to 256×256, conditional on the embedding; this model has 650M parameters. The third is similar, upscaling from 256×256 to 1024×1024; this model has 400M parameters. The three denoising networks are all U-Nets.

Muse (2023-01)[61] is not a diffusion model, but an encoder-only Transformer that is trained to predict masked image tokens from unmasked image tokens.

Imagen 2 (2023-12) is also diffusion-based. It can generate images based on a prompt that mixes images and text; no further details have been published.[62] The same holds for Imagen 3 (2024-05).[63]

Veo (2024) generates videos by latent diffusion. The diffusion is conditioned on a vector that encodes both a text prompt and an image prompt.[64]

Meta


Make-A-Video (2022) is a text-to-video diffusion model.[65][66]

CM3leon (2023) is not a diffusion model, but an autoregressive causally masked Transformer, with mostly the same architecture as LLaMa-2.[67][68]

[Figure: Transfusion architectural diagram]

Transfusion (2024) is a Transformer that combines autoregressive text generation and denoising diffusion. Specifically, it generates text autoregressively (with causal masking), and generates images by denoising multiple times over image tokens (with all-to-all attention).[69]
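
The following is a minimal sketch of the attention pattern this implies, assuming tokens are laid out in a single sequence with contiguous image blocks; the layout and helper function are illustrative assumptions, not Transfusion's actual implementation.

    import numpy as np

    def transfusion_mask(is_image):
        # is_image: boolean array, True where a token belongs to an image block
        n = len(is_image)
        mask = np.tril(np.ones((n, n), dtype=bool))  # causal baseline for text
        i = 0
        while i < n:
            if is_image[i]:
                j = i
                while j < n and is_image[j]:
                    j += 1
                mask[i:j, i:j] = True  # all-to-all attention inside an image block
                i = j
            else:
                i += 1
        return mask  # mask[q, k] == True: query position q may attend to key position k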

Movie Gen (2024) is a series of Diffusion Transformers operating in latent space and trained with flow matching.[70]


References

  1. ^abcdSong, Yang; Sohl-Dickstein, Jascha; Kingma, Diederik P.; Kumar, Abhishek; Ermon, Stefano; Poole, Ben (2021-02-10). "Score-Based Generative Modeling through Stochastic Differential Equations".arXiv:2011.13456 [cs.LG].
  2. ^Croitoru, Florinel-Alin; Hondru, Vlad; Ionescu, Radu Tudor; Shah, Mubarak (2023). "Diffusion Models in Vision: A Survey".IEEE Transactions on Pattern Analysis and Machine Intelligence.45 (9):10850–10869.arXiv:2209.04747.Bibcode:2023ITPAM..4510850C.doi:10.1109/TPAMI.2023.3261988.PMID 37030794.S2CID 252199918.
  3. ^abHo, Jonathan; Jain, Ajay; Abbeel, Pieter (2020)."Denoising Diffusion Probabilistic Models".Advances in Neural Information Processing Systems.33. Curran Associates, Inc.:6840–6851.
  4. ^Gu, Shuyang; Chen, Dong; Bao, Jianmin; Wen, Fang; Zhang, Bo; Chen, Dongdong; Yuan, Lu; Guo, Baining (2021). "Vector Quantized Diffusion Model for Text-to-Image Synthesis".arXiv:2111.14822 [cs.CV].
  5. ^abGLIDE, OpenAI, 2023-09-22, retrieved 2023-09-24
  6. ^Li, Yifan; Zhou, Kun; Zhao, Wayne Xin; Wen, Ji-Rong (August 2023)."Diffusion Models for Non-autoregressive Text Generation: A Survey".Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization. pp. 6692–6701.arXiv:2303.06574.doi:10.24963/ijcai.2023/750.ISBN 978-1-956792-03-4.
  7. ^Xu, Weijie; Hu, Wenxiang; Wu, Fanyou; Sengamedu, Srinivasan (2023)."DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM".Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg, PA, USA: Association for Computational Linguistics:9040–9057.arXiv:2310.15296.doi:10.18653/v1/2023.findings-emnlp.606.
  8. ^Zhang, Haopeng; Liu, Xiao; Zhang, Jiawei (2023)."DiffuSum: Generation Enhanced Extractive Summarization with Diffusion".Findings of the Association for Computational Linguistics: ACL 2023. Stroudsburg, PA, USA: Association for Computational Linguistics:13089–13100.arXiv:2305.01735.doi:10.18653/v1/2023.findings-acl.828.
  9. ^Yang, Dongchao; Yu, Jianwei; Wang, Helin; Wang, Wen; Weng, Chao; Zou, Yuexian; Yu, Dong (2023)."Diffsound: Discrete Diffusion Model for Text-to-Sound Generation".IEEE/ACM Transactions on Audio, Speech, and Language Processing.31:1720–1733.arXiv:2207.09983.Bibcode:2023ITASL..31.1720Y.doi:10.1109/taslp.2023.3268730.ISSN 2329-9290.
  10. ^Janner, Michael; Du, Yilun; Tenenbaum, Joshua B.; Levine, Sergey (2022-12-20). "Planning with Diffusion for Flexible Behavior Synthesis".arXiv:2205.09991 [cs.LG].
  11. ^Chi, Cheng; Xu, Zhenjia; Feng, Siyuan; Cousineau, Eric; Du, Yilun; Burchfiel, Benjamin; Tedrake, Russ; Song, Shuran (2024-03-14). "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion".arXiv:2303.04137 [cs.RO].
  12. ^Sohl-Dickstein, Jascha; Weiss, Eric; Maheswaranathan, Niru; Ganguli, Surya (2015-06-01)."Deep Unsupervised Learning using Nonequilibrium Thermodynamics"(PDF).Proceedings of the 32nd International Conference on Machine Learning.37. PMLR:2256–2265.arXiv:1503.03585.
  13. ^Ho, Jonathan (Jun 20, 2020), hojonathanho/diffusion, retrieved 2024-09-07
  14. ^abWeng, Lilian (2021-07-11)."What are Diffusion Models?".lilianweng.github.io. Retrieved 2023-09-24.
  15. ^"Generative Modeling by Estimating Gradients of the Data Distribution | Yang Song".yang-song.net. Retrieved2023-09-24.
  16. ^abSong, Yang; Ermon, Stefano (2019)."Generative Modeling by Estimating Gradients of the Data Distribution".Advances in Neural Information Processing Systems.32. Curran Associates, Inc.arXiv:1907.05600.
  17. ^Song, Yang; Sohl-Dickstein, Jascha; Kingma, Diederik P.; Kumar, Abhishek; Ermon, Stefano; Poole, Ben (2021-02-10). "Score-Based Generative Modeling through Stochastic Differential Equations".arXiv:2011.13456 [cs.LG].
  18. ^ermongroup/ncsn, ermongroup, 2019, retrieved 2024-09-07
  19. ^"Sliced Score Matching: A Scalable Approach to Density and Score Estimation | Yang Song".yang-song.net. Retrieved2023-09-24.
  20. ^Anderson, Brian D.O. (May 1982)."Reverse-time diffusion equation models".Stochastic Processes and Their Applications.12 (3):313–326.doi:10.1016/0304-4149(82)90051-5.ISSN 0304-4149.
  21. ^Luo, Calvin (2022). "Understanding Diffusion Models: A Unified Perspective".arXiv:2208.11970v1 [cs.LG].
  22. ^Song, Jiaming; Meng, Chenlin; Ermon, Stefano (3 Oct 2023). "Denoising Diffusion Implicit Models".arXiv:2010.02502 [cs.LG].
  23. ^abRombach, Robin; Blattmann, Andreas; Lorenz, Dominik; Esser, Patrick; Ommer, Björn (13 April 2022). "High-Resolution Image Synthesis With Latent Diffusion Models".arXiv:2112.10752 [cs.CV].
  24. ^Nichol, Alexander Quinn; Dhariwal, Prafulla (2021-07-01)."Improved Denoising Diffusion Probabilistic Models".Proceedings of the 38th International Conference on Machine Learning. PMLR:8162–8171.
  25. ^Salimans, Tim; Ho, Jonathan (2021-10-06).Progressive Distillation for Fast Sampling of Diffusion Models. The Tenth International Conference on Learning Representations (ICLR 2022).
  26. ^Lin, Shanchuan; Liu, Bingchen; Li, Jiashi; Yang, Xiao (2024).Common Diffusion Noise Schedules and Sample Steps Are Flawed. IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). pp. 5404–5411.
  27. ^abcDhariwal, Prafulla; Nichol, Alex (2021-06-01). "Diffusion Models Beat GANs on Image Synthesis".arXiv:2105.05233 [cs.LG].
  28. ^abHo, Jonathan; Salimans, Tim (2022-07-25). "Classifier-Free Diffusion Guidance".arXiv:2207.12598 [cs.LG].
  29. ^Chung, Hyungjin; Kim, Jeongsol; Park, Geon Yeong; Nam, Hyelin; Ye, Jong Chul (2024-06-12). "CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models".arXiv:2406.08070 [cs.CV].
  30. ^Sanchez, Guillaume; Fan, Honglu; Spangher, Alexander; Levi, Elad; Ammanamanchi, Pawan Sasanka; Biderman, Stella (2023-06-30). "Stay on topic with Classifier-Free Guidance".arXiv:2306.17806 [cs.CL].
  31. ^Armandpour, Mohammadreza; Sadeghian, Ali; Zheng, Huangjie; Sadeghian, Amir; Zhou, Mingyuan (2023-04-26). "Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond".arXiv:2304.04968 [cs.CV].
  32. ^Yang, Ling; Zhang, Zhilong; Song, Yang; Hong, Shenda; Xu, Runsheng; Zhao, Yue; Zhang, Wentao; Cui, Bin;Yang, Ming-Hsuan (2022). "Diffusion Models: A Comprehensive Survey of Methods and Applications".arXiv:2206.00364 [cs.CV].
  33. ^Shi, Jiaxin; Han, Kehang; Wang, Zhe; Doucet, Arnaud; Titsias, Michalis K. (2024). "Simplified and Generalized Masked Diffusion for Discrete Data".arXiv:2406.04329 [cs.LG].
  34. ^Karras, Tero; Aittala, Miika; Aila, Timo; Laine, Samuli (2022). "Elucidating the Design Space of Diffusion-Based Generative Models".arXiv:2206.00364v2 [cs.CV].
  35. ^Cao, Hanqun; Tan, Cheng; Gao, Zhangyang; Xu, Yilun; Chen, Guangyong; Heng, Pheng-Ann; Li, Stan Z. (July 2024). "A Survey on Generative Diffusion Models".IEEE Transactions on Knowledge and Data Engineering.36 (7):2814–2830.Bibcode:2024ITKDE..36.2814C.doi:10.1109/TKDE.2024.3361474.ISSN 1041-4347.
  36. ^Xu, Yilun; Liu, Ziming; Tian, Yonglong; Tong, Shangyuan; Tegmark, Max; Jaakkola, Tommi (2023-07-03)."PFGM++: Unlocking the Potential of Physics-Inspired Generative Models".Proceedings of the 40th International Conference on Machine Learning. PMLR:38566–38591.arXiv:2302.04265.
  37. ^Song, Yang; Dhariwal, Prafulla; Chen, Mark; Sutskever, Ilya (2023-07-03)."Consistency Models".Proceedings of the 40th International Conference on Machine Learning. PMLR:32211–32252.
  38. ^Dockhorn, Tim; Vahdat, Arash; Kreis, Karsten (2021-10-06). "Score-Based Generative Modeling with Critically-Damped Langevin Diffusion".arXiv:2112.07068 [stat.ML].
  39. ^Liu, Ziming; Luo, Di; Xu, Yilun; Jaakkola, Tommi; Tegmark, Max (2023-04-05). "GenPhys: From Physical Processes to Generative Models".arXiv:2304.02637 [cs.LG].
  40. ^Bansal, Arpit; Borgnia, Eitan; Chu, Hong-Min; Li, Jie; Kazemi, Hamid; Huang, Furong; Goldblum, Micah; Geiping, Jonas; Goldstein, Tom (2023-12-15)."Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise".Advances in Neural Information Processing Systems.36:41259–41282.arXiv:2208.09392.
  41. ^Tong, Alexander; Fatras, Kilian; Malkin, Nikolay; Huguet, Guillaume; Zhang, Yanlei; Rector-Brooks, Jarrid; Wolf, Guy; Bengio, Yoshua (2023-11-08)."Improving and generalizing flow-based generative models with minibatch optimal transport".Transactions on Machine Learning Research.arXiv:2302.00482.ISSN 2835-8856.
  42. ^abcdLiu, Xingchao; Gong, Chengyue; Liu, Qiang (2022-09-07). "Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow".arXiv:2209.03003 [cs.LG].
  43. ^Liu, Qiang (2022-09-29). "Rectified Flow: A Marginal Preserving Approach to Optimal Transport".arXiv:2209.14577 [stat.ML].
  44. ^abcHo, Jonathan; Saharia, Chitwan; Chan, William; Fleet, David J.; Norouzi, Mohammad; Salimans, Tim (2022-01-01)."Cascaded diffusion models for high fidelity image generation".The Journal of Machine Learning Research.23 (1): 47:2249–47:2281.arXiv:2106.15282.ISSN 1532-4435.
  45. ^Peebles, William; Xie, Saining (March 2023). "Scalable Diffusion Models with Transformers".arXiv:2212.09748v2 [cs.CV].
  46. ^Fei, Zhengcong; Fan, Mingyuan; Yu, Changqian; Li, Debang; Huang, Junshi (2024-07-16). "Scaling Diffusion Transformers to 16 Billion Parameters".arXiv:2407.11633 [cs.CV].
  47. ^abTevet, Guy; Raab, Sigal; Gordon, Brian; Shafir, Yonatan; Cohen-Or, Daniel; Bermano, Amit H. (2022). "Human Motion Diffusion Model".arXiv:2209.14916 [cs.CV].
  48. ^Zhang, Lvmin; Rao, Anyi; Agrawala, Maneesh (2023). "Adding Conditional Control to Text-to-Image Diffusion Models".arXiv:2302.05543 [cs.CV].
  49. ^Lugmayr, Andreas; Danelljan, Martin; Romero, Andres; Yu, Fisher; Timofte, Radu; Van Gool, Luc (2022). "RePaint: Inpainting Using Denoising Diffusion Probabilistic Models".arXiv:2201.09865v4 [cs.CV].
  50. ^Hertz, Amir; Mokady, Ron; Tenenbaum, Jay; Aberman, Kfir; Pritch, Yael; Cohen-Or, Daniel (2022-08-02). "Prompt-to-Prompt Image Editing with Cross Attention Control".arXiv:2208.01626 [cs.CV].
  51. ^Zhao, Zheng; Luo, Ziwei; Sjölund, Jens; Schön, Thomas B. (2024). "Conditional sampling within generative diffusion models".arXiv:2409.09650 [stat.ML].
  52. ^Wang, Xintao; Xie, Liangbin; Dong, Chao; Shan, Ying (2021)."Real-ESRGAN: Training Real-World Blind Super-Resolution With Pure Synthetic Data"(PDF).Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021. International Conference on Computer Vision. pp. 1905–1914.arXiv:2107.10833.
  53. ^Liang, Jingyun; Cao, Jiezhang; Sun, Guolei; Zhang, Kai; Van Gool, Luc; Timofte, Radu (2021)."SwinIR: Image Restoration Using Swin Transformer"(PDF).Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops. International Conference on Computer Vision, 2021. pp. 1833–1844.arXiv:2108.10257v1.
  54. ^Nichol, Alex; Dhariwal, Prafulla; Ramesh, Aditya; Shyam, Pranav; Mishkin, Pamela; McGrew, Bob; Sutskever, Ilya; Chen, Mark (2022-03-08). "GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models".arXiv:2112.10741 [cs.CV].
  55. ^Ramesh, Aditya; Dhariwal, Prafulla; Nichol, Alex; Chu, Casey; Chen, Mark (2022-04-12). "Hierarchical Text-Conditional Image Generation with CLIP Latents".arXiv:2204.06125 [cs.CV].
  56. ^Alammar, Jay."The Illustrated Stable Diffusion".jalammar.github.io. Retrieved 2022-10-31.
  57. ^Esser, Patrick; Kulal, Sumith; Blattmann, Andreas; Entezari, Rahim; Müller, Jonas; Saini, Harry; Levi, Yam; Lorenz, Dominik; Sauer, Axel (2024-03-05). "Scaling Rectified Flow Transformers for High-Resolution Image Synthesis".arXiv:2403.03206 [cs.CV].
  58. ^Xie, Yiming; Yao, Chun-Han; Voleti, Vikram; Jiang, Huaizu; Jampani, Varun (2024-07-24). "SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency".arXiv:2407.17470 [cs.CV].
  59. ^"Imagen: Text-to-Image Diffusion Models".imagen.research.google. Retrieved2024-04-04.
  60. ^Saharia, Chitwan; Chan, William; Saxena, Saurabh; Li, Lala; Whang, Jay; Denton, Emily L.; Ghasemipour, Kamyar; Gontijo Lopes, Raphael; Karagol Ayan, Burcu; Salimans, Tim; Ho, Jonathan; Fleet, David J.; Norouzi, Mohammad (2022-12-06)."Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding".Advances in Neural Information Processing Systems.35:36479–36494.arXiv:2205.11487.
  61. ^Chang, Huiwen; Zhang, Han; Barber, Jarred; Maschinot, A. J.; Lezama, Jose; Jiang, Lu; Yang, Ming-Hsuan; Murphy, Kevin; Freeman, William T. (2023-01-02). "Muse: Text-To-Image Generation via Masked Generative Transformers".arXiv:2301.00704 [cs.CV].
  62. ^"Imagen 2 - our most advanced text-to-image technology".Google DeepMind. Retrieved2024-04-04.
  63. ^Imagen-Team-Google; Baldridge, Jason; Bauer, Jakob; Bhutani, Mukul; Brichtova, Nicole; Bunner, Andrew; Castrejon, Lluis; Chan, Kelvin; Chen, Yichang (2024-12-13), Imagen 3, arXiv:2408.07009
  64. ^"Veo".Google DeepMind. 2024-05-14. Retrieved2024-05-17.
  65. ^"Introducing Make-A-Video: An AI system that generates videos from text".ai.meta.com. Retrieved2024-09-20.
  66. ^Singer, Uriel; Polyak, Adam; Hayes, Thomas; Yin, Xi; An, Jie; Zhang, Songyang; Hu, Qiyuan; Yang, Harry; Ashual, Oron (2022-09-29). "Make-A-Video: Text-to-Video Generation without Text-Video Data".arXiv:2209.14792 [cs.CV].
  67. ^"Introducing CM3leon, a more efficient, state-of-the-art generative model for text and images".ai.meta.com. Retrieved2024-09-20.
  68. ^Chameleon Team (2024-05-16). "Chameleon: Mixed-Modal Early-Fusion Foundation Models".arXiv:2405.09818 [cs.CL].
  69. ^Zhou, Chunting; Yu, Lili; Babu, Arun; Tirumala, Kushal; Yasunaga, Michihiro; Shamis, Leonid; Kahn, Jacob; Ma, Xuezhe; Zettlemoyer, Luke (2024-08-20). "Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model".arXiv:2408.11039 [cs.AI].
  70. ^Movie Gen: A Cast of Media Foundation Models, The Movie Gen team @ Meta, October 4, 2024.