
Imagen (text-to-image model)

From Wikipedia, the free encyclopedia

Image-generating machine learning model

Imagen
An image generated with Imagen 4. Partial prompt: `Softly illuminated afternoon valley with meandering river`
Developer	Google DeepMind
Initial release	May 2022; 3 years ago (2022-05)

Stable release	Imagen 4 / 20 May 2025; 6 months ago (2025-05-20)

Type	Text-to-image model
Website	Imagen website


Imagen is a series of text-to-image models developed by Google DeepMind. They were developed by Google Brain until that team's merger with DeepMind in April 2023.^[1] Imagen is primarily used to generate images from text prompts, similar to Stability AI's Stable Diffusion, OpenAI's DALL-E, or Midjourney.

The original version of the model was first described in a paper from May 2022.^[2] The tool produces high-quality images and is available to all users with a Google account through services including Gemini, ImageFX, and Vertex AI.^[3]

History


Imagen's original version was first presented in a paper published in May 2022. It featured the ability to generate high-fidelity images from natural-language descriptions.^[2] The second version, Imagen 2, was released in December 2023.^[4] Its standout feature was text and logo generation.^[5] Imagen 3 was released in August 2024.^[6] Google claims that this version provides better detail and lighting in generated images.^[7] On 20 May 2025, at Google I/O 2025, the company released an improved model, Imagen 4.^[8]

Technology


Imagen uses two key technologies. The first is the use of transformer-based large language models, notably T5, to understand text and encode it for image synthesis. The second is the use of cascaded diffusion models, which provide high-fidelity image generation. Imagen generates images in three stages: a base image of 64×64 pixels is produced first, then upsampled to 256×256 and finally to 1024×1024.^[2] Imagen 4 generates images at up to 2K resolution.^[9]
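The three-stage resolution cascade can be illustrated with a toy sketch. This is not Imagen's implementation: the stage names are hypothetical, the base stage is a blank placeholder rather than a text-conditioned diffusion model, and nearest-neighbour pixel repetition stands in for the learned super-resolution diffusion models. Only the resolutions (64, 256, 1024) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str       # hypothetical label, not an Imagen identifier
    out_size: int   # side length of the square output, in pixels


# Resolutions come from the text above; the stage names are illustrative.
CASCADE = (
    Stage("base", 64),
    Stage("super_resolution_1", 256),
    Stage("super_resolution_2", 1024),
)


def upsample_factors(stages):
    """Return the scale factor between each pair of consecutive stages."""
    return [nxt.out_size // cur.out_size for cur, nxt in zip(stages, stages[1:])]


def nearest_upsample(image, factor):
    """Repeat every pixel `factor` times along both axes.

    Stand-in for a learned super-resolution diffusion model, which would
    synthesise new detail instead of copying pixels.
    """
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def run_cascade(stages=CASCADE):
    """Produce a placeholder image at each stage's resolution, in order."""
    base = stages[0]
    # Placeholder for the text-conditioned base model: a blank grey image.
    image = [[128] * base.out_size for _ in range(base.out_size)]
    outputs = [image]
    for factor in upsample_factors(stages):
        image = nearest_upsample(image, factor)
        outputs.append(image)
    return outputs


if __name__ == "__main__":
    for stage, image in zip(CASCADE, run_cascade()):
        print(f"{stage.name}: {len(image)}x{len(image[0])}")
```

Each super-resolution stage in this sketch scales the side length by a factor of 4, so the small base image determines the overall composition and the later stages only add resolution.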

Capabilities


Imagen can generate photorealistic images from text prompts.^[3] It can also produce images in various styles, such as cinematic, 35mm film, illustration, and surreal. Like most text-to-image generative AI models, Imagen has difficulty rendering human fingers, text, ambigrams, and other forms of typography.

The model can generate images in five aspect ratios, namely 9:16, 3:4, 1:1, 4:3, and 16:9. Imagen can also refine already generated images by editing existing text prompts.^[7]
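As a rough illustration of what the five aspect ratios mean in pixels, the sketch below validates a ratio string and derives width and height from it. The helper names and the 1024-pixel long side are assumptions made for this example; the article does not state the pixel dimensions that Imagen returns for each ratio.

```python
from fractions import Fraction

# The five ratios listed in the text, written as width:height.
SUPPORTED_ASPECT_RATIOS = ("9:16", "3:4", "1:1", "4:3", "16:9")


def parse_ratio(text):
    """Convert a supported 'W:H' string into an exact width/height fraction."""
    if text not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {text!r}")
    width, height = text.split(":")
    return Fraction(int(width), int(height))


def dimensions(ratio_text, long_side=1024):
    """Return (width, height) with the longer side fixed at `long_side`.

    `long_side=1024` is an assumed value for illustration only.
    """
    ratio = parse_ratio(ratio_text)
    if ratio >= 1:
        # Landscape or square: the width is the long side.
        return long_side, round(long_side / ratio)
    # Portrait: the height is the long side.
    return round(long_side * ratio), long_side


if __name__ == "__main__":
    for ratio_text in SUPPORTED_ASPECT_RATIOS:
        print(ratio_text, dimensions(ratio_text))
```

Using `Fraction` keeps the ratio exact, so rounding happens only once, when the shorter side is computed.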

References


^[1] Roth, Emma; Peters, Jay (2023-04-20). "Google's big AI push will combine Brain and DeepMind into one team". The Verge. Archived from the original on 2023-04-20. Retrieved 2025-03-18.
^[2] Saharia, Chitwan; Chan, William; Saxena, Saurabh; Li, Lala; Whang, Jay; Denton, Emily; Seyed Kamyar Seyed Ghasemipour; Burcu Karagol Ayan; Sara Mahdavi, S.; Rapha Gontijo Lopes; Salimans, Tim; Ho, Jonathan; David J Fleet; Norouzi, Mohammad (2022). "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding". arXiv:2205.11487 [cs.CV].
^[3] Peterson, Jake (2024-08-16). "Anyone With a Google Account Can Try Google's Latest AI Image Generator Right Now". Lifehacker. Retrieved 2025-03-18.
^[4] "Imagen 2 - our most advanced text-to-image technology". Google DeepMind. 2025-03-12. Retrieved 2025-03-18.
^[5] Wiggers, Kyle (2023-12-13). "Google debuts Imagen 2 with text and logo generation". TechCrunch. Retrieved 2025-03-18.
^[6] Schoon, Ben (2024-08-16). "Google opens access to Imagen 3, its latest model for AI image generation". 9to5Google. Archived from the original on 2024-08-18. Retrieved 2025-03-18.
^[7] Rowlands, Christian (2025-02-26). "Some of the most realistic AI images you'll see were created with this free tool". TechRadar. Retrieved 2025-03-18.
^[8] Wiggers, Kyle (2025-05-20). "Imagen 4 is Google's newest AI image generator". TechCrunch. Retrieved 2025-03-18.
^[9] "Imagen". Google DeepMind. Retrieved 2025-10-28.

External links


Imagen website

Retrieved from "https://en.wikipedia.org/w/index.php?title=Imagen_(text-to-image_model)&oldid=1320755746"

Categories:

Hidden categories:

[8]ページ先頭