Artificial intelligence visual art, or AI art, is visual artwork generated or enhanced through the use of artificial intelligence (AI) programs, most commonly text-to-image models. The process of automated art-making has existed since antiquity. The field of artificial intelligence was founded in the 1950s, and artists began to create art with artificial intelligence shortly after the discipline's founding. A select number of these creations have been showcased in museums and have been recognized with awards.[1] Throughout its history, AI has raised many philosophical questions related to the human mind, artificial beings, and the nature of art in human–AI collaboration.
Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music.[4][5] Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems.[6]
Also in the 19th century, Ada Lovelace wrote that "computing operations" could potentially be used to generate music and poems.[7][8] In 1950, Alan Turing's paper "Computing Machinery and Intelligence" focused on whether machines can mimic human behavior convincingly.[9] Shortly after, the academic discipline of artificial intelligence was founded at a research workshop at Dartmouth College in 1956.[10]
Since the field's founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues had previously been explored by myth, fiction, and philosophy since antiquity.[11]
Karl Sims has exhibited art created with artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines.[18][19][20] In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution.[21][22][23] In 1997, Sims created the interactive artificial evolution installation Galápagos for the NTT InterCommunication Center in Tokyo.[24] Sims received an Emmy Award in 2019 for outstanding achievement in engineering development.[25]
In 1999, Scott Draves and a team of several engineers created and released Electric Sheep as a free software screensaver.[26] Electric Sheep is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundacion Telefónica Life 4.0 prize for Electric Sheep.[27]
In 2014, Stephanie Dinkins began working on Conversations with Bina48.[28] For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman.[29][30] In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color."[31]
In 2015, Sougwen Chung began Mimicry (Drawing Operations Unit: Generation 1), an ongoing collaboration between the artist and a robotic arm.[32] In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung.[33]
In 2018, an auction sale of artificial intelligence art was held at Christie's in New York, where the AI artwork Edmond de Belamy sold for US$432,500, which was almost 45 times higher than its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective.[34][35][36]
In 2024, the Japanese film generAIdoscope was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence.[37]
In 2025, the Japanese anime television series Twins Hinahima was released. The anime was produced and animated with AI assistance, which was used for cutting and for converting photographs into anime illustrations that were later retouched by art staff. Most of the remaining parts, such as characters and logos, were hand-drawn with various software.[38][39]
Deep learning, characterized by a multi-layer structure that attempts to mimic the human brain, came to prominence in the 2010s, causing a significant shift in the world of AI art.[40] The main model families used for generative art in the deep learning era are autoregressive models, diffusion models, GANs, and normalizing flows.
In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network (GAN), a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful.[41] Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images.[12]
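The interplay between generator and discriminator can be illustrated with the loss functions often used to train the two networks. The following is a minimal sketch in plain Python, assuming the discriminator outputs a probability that an image is real. The function names are hypothetical, and the non-saturating generator loss shown here is one common choice rather than the formulation of any particular system.

```python
import math


def discriminator_loss(p_real: float, p_fake: float) -> float:
    """Binary cross-entropy for the discriminator.

    p_real: probability the discriminator assigns to a real image being real.
    p_fake: probability the discriminator assigns to a generated image being real.
    The loss is low when p_real is near 1 and p_fake is near 0.
    """
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def generator_loss(p_fake: float) -> float:
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -math.log(p_fake)


# An undecided discriminator (0.5 for everything) versus a confident, correct one.
undecided = discriminator_loss(0.5, 0.5)
confident = discriminator_loss(0.9, 0.1)
```

In this sketch the two losses pull in opposite directions: a generated image that the discriminator rates as likely real lowers the generator's loss and raises the discriminator's.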
In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia.[42][43][44] The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience.[45] Later, in 2017, a conditional GAN learned to generate 1000 image classes of ImageNet, a large visual database designed for use in visual object recognition software research.[46][47] By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models.[48]
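Conditioning on both noise and a class label can be pictured as building a combined input for the generator. The sketch below assumes the simplest scheme, in which a one-hot class vector is concatenated to the noise vector; published class-conditional models often inject the label in other ways, and the function name and parameters here are hypothetical.

```python
import random
from typing import List


def generator_input(class_index: int, num_classes: int,
                    noise_dim: int, seed: int) -> List[float]:
    """Build a conditional generator input: Gaussian noise followed by a one-hot label."""
    rng = random.Random(seed)  # seeded so the noise part is reproducible
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    one_hot = [1.0 if i == class_index else 0.0 for i in range(num_classes)]
    return noise + one_hot


# Class 3 out of 10 illustrative classes, with an 8-dimensional noise vector.
vector = generator_input(class_index=3, num_classes=10, noise_dim=8, seed=0)
```

Changing the seed while keeping the label fixed would yield different inputs for the same class, while changing the label with a fixed seed keeps the noise and switches the requested class.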
The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN[51][52] to allow users to generate and modify images such as faces, landscapes, and paintings.[53]
In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks.[2]
Example of an image made with VQGAN-CLIP (NightCafe Studio, March 2023)
Example of an image made with Flux 1.1 Pro in Raw mode (November 2024); this mode is designed to generate photorealistic images
In 2021, building on the influential generative pre-trained transformer large language models used in GPT-2 and GPT-3, OpenAI released a series of images created with the text-to-image AI model DALL-E 1.[54] It is an autoregressive generative model with essentially the same architecture as GPT-3. Later in 2021, EleutherAI released the open-source VQGAN-CLIP,[55] based on OpenAI's CLIP model.[56] Diffusion models, generative models used to create synthetic data based on existing data,[57] were first proposed in 2015,[58] but they only began to outperform GANs in early 2021.[59] The latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion (August 2022), developed through a collaboration between Stability AI, CompVis Group at LMU Munich, and Runway.[60]
Ideogram, released in August 2023, is known for its ability to generate legible text.[72][73]
In 2024, Flux was released. This model can generate realistic images and was integrated into Grok, the chatbot used on X (formerly Twitter), and Le Chat, the chatbot of Mistral AI.[3][74][75][76] Flux was developed by Black Forest Labs, founded by the researchers behind Stable Diffusion.[77] Grok later switched to its own text-to-image model, Aurora, in December of the same year.[78] Several companies have also integrated AI models into their image editing products. Adobe has released the AI model Firefly and integrated it into Premiere Pro, Photoshop, and Illustrator.[79][80] Microsoft has also publicly announced AI image-generator features for Microsoft Paint.[81] Examples of text-to-video models of the mid-2020s include Runway's Gen-4, Google's VideoPoet, OpenAI's Sora (released in December 2024), and LTX-2 (released in 2025).[82][83][84]
In 2025, several models were released. GPT Image 1 from OpenAI, launched in March 2025, introduced new text rendering and multimodal capabilities, enabling image generation from diverse inputs like sketches and text.[85] MidJourney v7 debuted in April 2025, providing improved text prompt processing.[86] In May 2025, Flux.1 Kontext by Black Forest Labs emerged as an efficient model for high-fidelity image generation,[87] while Google's Imagen 4 was released with improved photorealism.[88] Flux.2 debuted in November 2025 with improved image reference, typography, and prompt understanding.[89]
There are many approaches used by artists to develop AI visual art. When text-to-image is used, AI generates images based on textual descriptions, using models like diffusion or transformer-based architectures. Users input prompts and the AI produces corresponding visuals.[90][91] When image-to-image is used, AI transforms an input image into a new style or form based on a prompt or style reference, such as turning a sketch into a photorealistic image or applying an artistic style.[92][93] When image-to-video is used, AI generates short video clips or animations from a single image or a sequence of images, often adding motion or transitions. This can include animating still portraits or creating dynamic scenes.[94][95] When text-to-video is used, AI creates videos directly from text prompts, producing animations, realistic scenes, or abstract visuals. This is an extension of text-to-image but focuses on temporal sequences.[96]
Example of ComfyUI being used with Stable Diffusion XL. Users can adjust the variables (such as CFG, seed, and sampler) needed to generate an image.
There are many tools available to the artist when working with diffusion models. Artists can define both positive and negative prompts, and they can choose whether to use VAEs, LoRAs, hypernetworks, IP-adapters, and embeddings (textual inversions). They can tweak settings such as guidance scale (which balances creativity and accuracy), seed (to control randomness), and upscalers (to enhance image resolution), among others. Additional influence can be exerted during pre-inference by means of noise manipulation, while traditional post-processing techniques are frequently used post-inference. People can also train their own models.
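The effect of the guidance scale can be sketched numerically. The example below assumes the setting refers to classifier-free guidance, in which the model makes one prediction without the prompt and one with it, and the scale controls how far the result is pushed toward the prompted prediction. The function name is hypothetical and short lists of floats stand in for the model's outputs.

```python
from typing import List


def apply_guidance(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Combine unconditional and prompt-conditioned predictions.

    scale = 0 ignores the prompt, scale = 1 returns the conditioned
    prediction unchanged, and larger values extrapolate past it.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for per-pixel noise estimates.
unconditioned = [0.0, 1.0]
conditioned = [1.0, 3.0]
guided = apply_guidance(unconditioned, conditioned, scale=2.0)
```

Under this assumption, a higher scale follows the prompt more strictly at the cost of variety, which matches the trade-off between accuracy and creativity described above. In some implementations, the prediction for the negative prompt takes the place of the unconditional one.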
In addition, procedural "rule-based" image generation techniques have been developed, utilizing mathematical patterns, algorithms that simulate brush strokes and other painterly effects, as well as deep learning models such as generative adversarial networks (GANs) and transformers. Several companies have released applications and websites that allow users to focus exclusively on positive prompts, bypassing the need for manual configuration of other parameters. There are also programs capable of transforming photographs into stylized images that mimic the aesthetics of well-known painting styles.[97][98]
There are many options, ranging from simple consumer-facing mobile apps to Jupyter notebooks and web UIs that require powerful GPUs to run effectively.[99] Additional functionalities include "textual inversion," which enables the use of user-provided concepts (like an object or a style) learned from a few images. Novel art can then be generated from the associated word(s) (the text that has been assigned to the learned, often abstract, concept).[100][101] Other functionalities include model extensions or fine-tuning (such as DreamBooth).
AI has the potential for a societal transformation, which may include enabling the expansion of noncommercial niche genres (such as cyberpunk derivatives like solarpunk) by amateurs, novel entertainment, fast prototyping,[102] increasing art-making accessibility,[102] and increasing artistic output per unit of effort, expense, or time,[102] for example by generating drafts, draft-definitions, and image components (inpainting). Generated images are sometimes used as sketches,[103] low-cost experiments,[104] inspiration, or illustrations of proof-of-concept-stage ideas. Additional functionalities or improvements may also relate to post-generation manual editing (i.e., polishing), such as subsequent tweaking with an image editor.[104]
Prompts for some text-to-image models can also include images, keywords, and configurable parameters, such as artistic style, which is often specified via keyphrases like "in the style of [name of an artist]" in the prompt[105] or by selecting a broad aesthetic or art style.[106][103] There are platforms for sharing, trading, searching, forking/refining, or collaborating on prompts for generating specific imagery from image generators.[107][108][109][110] Prompts are often shared along with images on image-sharing websites such as Reddit and AI art-dedicated websites. A prompt is not the complete input needed for the generation of an image; additional inputs that determine the generated image include the output resolution, random seed, and random sampling parameters.[111]
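The point that a prompt alone does not determine the image can be illustrated with a toy request object. The sketch below is an assumption-laden stand-in, not any generator's actual interface: `GenerationRequest` and `initial_noise` are hypothetical names, and a seeded pseudorandom generator plays the role of the starting noise that a real model would go on to refine.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GenerationRequest:
    """Inputs that together determine an output, not just the prompt."""
    prompt: str
    width: int
    height: int
    seed: int


def initial_noise(request: GenerationRequest) -> List[float]:
    """Draw the starting noise; it depends on seed and resolution, not on the prompt."""
    rng = random.Random(request.seed)
    return [rng.gauss(0.0, 1.0) for _ in range(request.width * request.height)]


first = GenerationRequest("a lighthouse at dusk", width=4, height=4, seed=42)
repeat = GenerationRequest("a lighthouse at dusk", width=4, height=4, seed=42)
reseeded = GenerationRequest("a lighthouse at dusk", width=4, height=4, seed=43)
```

Here `first` and `repeat` start from identical noise, while `reseeded` shares the same prompt but starts from different noise, which is why shared prompts are often accompanied by the seed and other settings.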
Synthetic media, which includes AI art, was described in 2022 as a major technology-driven trend that will affect business in the coming years.[102] Harvard Kennedy School researchers voiced concerns about synthetic media serving as a vector for political misinformation soon after studying the proliferation of AI art on the X platform.[112] Synthography is a proposed term for the practice of generating images that are similar to photographs using AI.[113]
In addition to the creation of original art, research methods that use AI have been developed to quantitatively analyze digital art collections. This has been made possible by the large-scale digitization of artwork in the past few decades. According to Cetinic and She (2022), using artificial intelligence to analyze already-existing art collections can provide new perspectives on the development of artistic styles and the identification of artistic influences.[114][115]
Two computational methods, close reading and distant viewing, are the typical approaches used to analyze digitized art.[116] Close reading focuses on specific visual aspects of one piece. Some tasks performed by machines in close reading methods include computational artist authentication and analysis of brushstrokes or texture properties. In contrast, through distant viewing methods, the similarity across an entire collection for a specific feature can be statistically visualized. Common tasks relating to this method include automatic classification, object detection, multimodal tasks, knowledge discovery in art history, and computational aesthetics.[115] Synthetic images can also be used to train AI algorithms for art authentication and to detect forgeries.[117]
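A distant viewing comparison can be sketched as a similarity ranking over feature vectors. The example below assumes each artwork has already been reduced to a small numeric vector for one feature; the work names and values are invented for illustration, and cosine similarity is one possible measure among many.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_pairs(features: Dict[str, List[float]]) -> List[Tuple[str, str, float]]:
    """Score every pair of works in the collection, most similar first."""
    names = sorted(features)
    pairs = [
        (p, q, cosine_similarity(features[p], features[q]))
        for i, p in enumerate(names)
        for q in names[i + 1:]
    ]
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Hypothetical two-dimensional feature vectors for three works.
collection = {
    "work_a": [1.0, 0.0],
    "work_b": [0.9, 0.1],
    "work_c": [0.0, 1.0],
}
ranking = rank_pairs(collection)
```

With these invented values, the pair `work_a` and `work_b` ranks first. On a real collection the same pairwise scores could feed a clustering or visualization step.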
Researchers have also introduced models that predict emotional responses to art. One such model is ArtEmis, a large-scale dataset paired with machine learning models. ArtEmis includes emotional annotations from over 6,500 participants along with textual explanations. By analyzing both visual inputs and the accompanying text descriptions from this dataset, ArtEmis enables the generation of nuanced emotional predictions.[118][119]
AI has also been used in arts outside of visual arts. Generative AI has been used to create music, as well as in video game production beyond imagery, especially for level design (e.g., for custom maps) and creating new content (e.g., quests or dialogue) or interactive stories in video games.[120][121] AI has also been used in the literary arts,[122] such as helping with writer's block, inspiration, or rewriting segments.[123][124][125][126] In the culinary arts, some prototype cooking robots can dynamically taste, which can assist chefs in analyzing the content and flavor of dishes during the cooking process.[127]
The use of the label "art" for works generated by AI software has led to debate among artists, philosophers, scholars, and others. Various observers argue that referring to machine-generated images as "art" undermines the traditional characteristics of human artistry, such as creativity, skill, and intentionality. Present-day definitions of true artistic creation often emphasize the requirement of human-level intentions, personal experience and emotion, as well as historical or artistic context.[128]
According to a research study from the National Library of Medicine, humans inherently show a bias against artwork described as being AI-generated. When participants of the study were shown two comparable images, with only one presented as having been generated by AI, subjects were more likely to rate the one described as being artificially generated lower in artistic value. This suggests that social and cultural attitudes can shape the determination of whether an image is considered art, regardless of the image's other visual features.[129]
In a 2023 report submitted to the Annual Convention of Digital Art Observers, Samuel Loomis wrote that the term "AI art" acknowledges its dual nature as a product of human guidance and machine-driven generative systems, when evaluating it by the same critical standards applied to traditional art.[130]
Crevier, Daniel (1993). AI: The Tumultuous Search for Artificial Intelligence. New York, NY: BasicBooks. p. 109. ISBN 0-465-02997-3.
Newquist, HP (1994). The Brain Makers: Genius, Ego, And Greed In The Quest For Machines That Think. New York: Macmillan/SAMS. pp. 45–53. ISBN 978-0-672-30412-5.
McCorduck, Pamela (1991). AARON's Code: Meta-Art, Artificial Intelligence, and the Work of Harold Cohen. New York: W. H. Freeman and Company. p. 210. ISBN 0-7167-2173-2.
Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Nets (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680. Archived (PDF) from the original on 22 November 2019. Retrieved 26 January 2022.
Oord, Aäron van den; Kalchbrenner, Nal; Kavukcuoglu, Koray (11 June 2016). "Pixel Recurrent Neural Networks". Proceedings of the 33rd International Conference on Machine Learning. PMLR: 1747–1756. Archived from the original on 9 August 2024. Retrieved 16 September 2024.
Parmar, Niki; Vaswani, Ashish; Uszkoreit, Jakob; Kaiser, Lukasz; Shazeer, Noam; Ku, Alexander; Tran, Dustin (3 July 2018). "Image Transformer". Proceedings of the 35th International Conference on Machine Learning. PMLR: 4055–4064.
Simon, Joel. "About". Archived from the original on 2 March 2021. Retrieved 3 March 2021.
"Stable Diffusion". CompVis - Machine Vision and Learning LMU Munich. 15 September 2022. Archived from the original on 18 January 2023. Retrieved 15 September 2022.
Mohamad Diab, Julian Herrera, Musical Sleep, Bob Chernow, Coco Mao (28 October 2022). "Stable Diffusion Prompt Book" (PDF). Archived (PDF) from the original on 30 March 2023. Retrieved 7 August 2023.
Cetinic, Eva; She, James (16 February 2022). "Understanding and Creating Art with AI: Review and Outlook". ACM Transactions on Multimedia Computing, Communications, and Applications. 18 (2): 66:1–66:22. arXiv:2102.09109. doi:10.1145/3475799. ISSN 1551-6857. S2CID 231951381.