The history of artificial intelligence (AI) began in antiquity, with myths, stories, and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen. The study of logic and formal reasoning from antiquity to the present led directly to the invention of the programmable digital computer in the 1940s, a machine based on abstract mathematical reasoning. This device and the ideas behind it inspired scientists to begin discussing the possibility of building an electronic brain.
The field of AI research was founded at a workshop held on the campus of Dartmouth College in 1956.[1] Attendees of the workshop became the leaders of AI research for decades. Many of them predicted that machines as intelligent as humans would exist within a generation. The U.S. government provided millions of dollars with the hope of making this vision come true.[2]
Eventually, it became obvious that researchers had grossly underestimated the difficulty of this feat.[3] In 1974, criticism from James Lighthill and pressure from the U.S. Congress led the U.S. and British Governments to stop funding undirected research into artificial intelligence. Seven years later, a visionary initiative by the Japanese Government and the success of expert systems reinvigorated investment in AI, and by the late 1980s, the industry had grown into a billion-dollar enterprise. However, investors' enthusiasm waned in the 1990s, and the field was criticized in the press and avoided by industry (a period known as an "AI winter"). Nevertheless, research and funding continued to grow under other names.
In the early 2000s, machine learning was applied to a wide range of problems in academia and industry. The success was due to the availability of powerful computer hardware, the collection of immense data sets, and the application of solid mathematical methods. Soon after, deep learning proved to be a breakthrough technology, eclipsing all other methods. The transformer architecture debuted in 2017 and was used to produce impressive generative AI applications, amongst other use cases.
Investment in AI boomed in the 2020s. The recent AI boom, initiated by the development of the transformer architecture, led to the rapid scaling and public releases of large language models (LLMs) like ChatGPT. These models exhibit human-like traits of knowledge, attention, and creativity, and have been integrated into various sectors, fueling exponential investment in AI. However, concerns about the potential risks and ethical implications of advanced AI have also emerged, causing debate about the future of AI and its impact on society.
In Greek mythology, Talos was a creature made of bronze who acted as guardian for the island of Crete. He would throw boulders at the ships of invaders and would complete three circuits around the island's perimeter daily.[4] According to pseudo-Apollodorus' Bibliotheke, Hephaestus forged Talos with the aid of a cyclops and presented the automaton as a gift to Minos.[5] In the Argonautica, Jason and the Argonauts defeated Talos by removing a plug near his foot, causing the vital ichor to flow out from his body and rendering him lifeless.[6]
Pygmalion was a legendary king and sculptor of Greek mythology, famously represented in Ovid's Metamorphoses. In the 10th book of Ovid's narrative poem, Pygmalion becomes disgusted with women when he witnesses the way in which the Propoetides prostitute themselves. Despite this, he makes offerings at the temple of Venus asking the goddess to bring to him a woman just like a statue he carved.[7]

In Of the Nature of Things, the Swiss alchemist Paracelsus describes a procedure that he claims can fabricate an "artificial man". By placing the "sperm of a man" in horse dung, and feeding it the "Arcanum of Mans blood" after 40 days, the concoction will become a living infant.[8]
The earliest written account regarding golem-making is found in the writings of Eleazar ben Judah of Worms in the early 13th century.[9] During the Middle Ages, it was believed that the animation of a Golem could be achieved by inserting a piece of paper bearing any of God's names into the mouth of the clay figure.[10] Unlike legendary automata like Brazen Heads,[11] a Golem was unable to speak.[12]
Takwin, the artificial creation of life, was a frequent topic of Ismaili alchemical manuscripts, especially those attributed to Jabir ibn Hayyan. Islamic alchemists attempted to create a broad range of life through their work, ranging from plants to animals.[13]
In Faust: The Second Part of the Tragedy by Johann Wolfgang von Goethe, an alchemically fabricated homunculus, destined to live forever in the flask in which he was made, endeavors to be born into a full human body. Upon the initiation of this transformation, however, the flask shatters and the homunculus dies.[14]
By the 19th century, ideas about artificial men and thinking machines had become a popular theme in fiction. Notable works like Mary Shelley's Frankenstein and Karel Čapek's R.U.R. (Rossum's Universal Robots)[15] explored the concept of artificial life. Speculative essays, such as Samuel Butler's "Darwin among the Machines"[16] and Edgar Allan Poe's "Maelzel's Chess Player",[17] reflected society's growing interest in machines with artificial intelligence. AI remains a common topic in science fiction today.[18]

Realistic humanoid automata were built by craftsmen from many civilizations, including Yan Shi,[19] Hero of Alexandria,[20] Al-Jazari,[21] Haroun al-Rashid,[22] Jacques de Vaucanson,[23][24] Leonardo Torres y Quevedo,[25] Pierre Jaquet-Droz and Wolfgang von Kempelen.[26][27]
The oldest known automata were the sacred statues of ancient Egypt and Greece.[28][29] The faithful believed that craftsmen had imbued these figures with very real minds, capable of wisdom and emotion. Hermes Trismegistus wrote that "by discovering the true nature of the gods, man has been able to reproduce it".[30] English scholar Alexander Neckham asserted that the Ancient Roman poet Virgil had built a palace with automaton statues.[31]
During the early modern period, these legendary automata were said to possess the magical ability to answer questions put to them. The late medieval alchemist and proto-Protestant Roger Bacon was purported to have fabricated a brazen head, having developed a legend of having been a wizard.[32][33] These legends were similar to the Norse myth of the Head of Mímir. According to legend, Mímir was known for his intellect and wisdom, and was beheaded in the Æsir-Vanir War. Odin is said to have "embalmed" the head with herbs and spoken incantations over it such that Mímir's head remained able to speak wisdom to Odin. Odin then kept the head near him for counsel.[34]
Artificial intelligence is based on the assumption that the process of human thought can be mechanized. The study of mechanical (or "formal") reasoning has a long history. Chinese, Indian and Greek philosophers all developed structured methods of formal deduction by the first millennium BCE. Their ideas were developed over the centuries by philosophers such as Aristotle (who gave a formal analysis of the syllogism),[35] Euclid (whose Elements was a model of formal reasoning), al-Khwārizmī (who developed algebra and gave his name to the word algorithm) and European scholastic philosophers such as William of Ockham and Duns Scotus.[36]
Spanish philosopher Ramon Llull (1232–1315) developed several logical machines devoted to the production of knowledge by logical means.[37][38] Llull described his machines as mechanical entities that could combine basic and undeniable truths by simple logical operations, carried out by the machine through mechanical means, so as to produce all possible knowledge.[39] Llull's work had a great influence on Gottfried Leibniz, who redeveloped his ideas.[40]

In the 17th century, Leibniz, Thomas Hobbes and René Descartes explored the possibility that all rational thought could be made as systematic as algebra or geometry.[41] Hobbes famously wrote in Leviathan: "For reason ... is nothing but reckoning, that is adding and subtracting".[42] Leibniz envisioned a universal language of reasoning, the characteristica universalis, which would reduce argumentation to calculation so that "there would be no more need of disputation between two philosophers than between two accountants. For it would suffice to take their pencils in hand, down to their slates, and to say each other (with a friend as witness, if they liked): Let us calculate."[43] These philosophers had begun to articulate the physical symbol system hypothesis that would guide AI research.
The study of mathematical logic provided the essential breakthrough that made artificial intelligence seem plausible. The foundations had been set by such works as Boole's The Laws of Thought and Frege's Begriffsschrift.[44] Building on Frege's system, Russell and Whitehead presented a formal treatment of the foundations of mathematics in their masterpiece, the Principia Mathematica, in 1913. Inspired by Russell's success, David Hilbert challenged mathematicians of the 1920s and 30s to answer this fundamental question: "can all of mathematical reasoning be formalized?"[36] His question was answered by Gödel's incompleteness proof,[45] Turing's machine[45] and Church's Lambda calculus.[a]

Their answer was surprising in two ways. First, they proved that there were, in fact, limits to what mathematical logic could accomplish. But second (and more important for AI) their work suggested that, within these limits, any form of mathematical reasoning could be mechanized. The Church-Turing thesis implied that a mechanical device, shuffling symbols as simple as 0 and 1, could imitate any conceivable process of mathematical deduction.[45] The key insight was the Turing machine, a simple theoretical construct that captured the essence of abstract symbol manipulation.[48] This invention would inspire a handful of scientists to begin discussing the possibility of thinking machines.
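The idea of a device "shuffling symbols as simple as 0 and 1" can be illustrated with a minimal sketch of a one-tape Turing machine simulator. The function name `run_turing_machine`, the state names, and the bit-inverting rule table are assumptions made for this illustration; they do not come from Turing's paper.

```python
def run_turing_machine(tape, rules, state="scan", blank="_", max_steps=1000):
    """Run a one-tape machine; rules maps (state, symbol) -> (write, move, next_state)."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: invert every bit, then halt at the first blank cell.
INVERT_RULES = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("1011", INVERT_RULES))  # prints 0100
```

The whole "program" is the rule table: changing the table changes the computation, while the simulator loop stays the same.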
In the 18th and 19th centuries, Luigi Galvani, Emil du Bois-Reymond, Hermann von Helmholtz and others demonstrated that the nerves carried electrical signals, and Robert Bentley Todd correctly speculated in 1828 that the brain was an electrical network. Camillo Golgi's staining techniques enabled Santiago Ramón y Cajal to provide evidence for the neuron theory: "The truly amazing conclusion is that a collection of simple cells can lead to thought, action, and consciousness".[49]
Donald Hebb was a Canadian psychologist whose work laid the foundation for modern neuroscience, particularly in understanding learning, memory, and neural plasticity. His most influential book, The Organization of Behavior (1949), introduced the concept of Hebbian learning, often summarized as "cells that fire together wire together."[50]
Hebb began formulating the foundational ideas for this book in the early 1940s, particularly during his time at the Yerkes Laboratories of Primate Biology from 1942 to 1947. He made extensive notes between June 1944 and March 1945 and sent a complete draft to his mentor Karl Lashley in 1946. The manuscript for The Organization of Behavior was not published until 1949. The delay was due to various factors, including World War II and shifts in academic focus. By the time it was published, several of his peers had already published related ideas, making Hebb's work seem less groundbreaking at first glance. However, his synthesis of psychological and neurophysiological principles became a cornerstone of neuroscience and machine learning.[51][52]
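The slogan "cells that fire together wire together" is often written in later textbooks as a weight update proportional to the product of input and output activity. The sketch below assumes that simplified form; the function name `hebbian_update` and the learning rate are illustrative choices, not Hebb's own formulation.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """One Hebbian step: each weight grows by rate * pre_i * post."""
    return [w + rate * x * post for w, x in zip(weights, pre)]


weights = [0.0, 0.0]
# Input 0 is active together with the output on every trial; input 1 never is.
for _ in range(5):
    weights = hebbian_update(weights, pre=[1, 0], post=1)
print(weights)
```

Only the connection from the co-active input is strengthened; the weight of the silent input stays at zero.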
Calculating machines were designed or built in antiquity and throughout history by many people, including Gottfried Leibniz,[38][53] Joseph Marie Jacquard,[54] Charles Babbage,[54][55] Percy Ludgate,[56] Leonardo Torres Quevedo,[57] Vannevar Bush,[58] and others. Ada Lovelace speculated that Babbage's machine was "a thinking or ... reasoning machine", but warned "It is desirable to guard against the possibility of exaggerated ideas that arise as to the powers" of the machine.[59][60]
The first modern computers were the massive machines of the Second World War (such as Konrad Zuse's Z3, Tommy Flowers' Heath Robinson and Colossus, Atanasoff and Berry's ABC, and ENIAC at the University of Pennsylvania).[61] ENIAC was based on the theoretical foundation laid by Alan Turing and developed by John von Neumann,[62] and proved to be the most influential.[61]

The earliest research into thinking machines was inspired by a confluence of ideas that became prevalent in the late 1930s, 1940s, and early 1950s. Recent research in neurology had shown that the brain was an electrical network of neurons that fired in all-or-nothing pulses. Norbert Wiener's cybernetics described control and stability in electrical networks. Claude Shannon's information theory described digital signals (i.e., all-or-nothing signals). Alan Turing's theory of computation showed that any form of computation could be described digitally. The close relationship between these ideas suggested that it might be possible to construct an "electronic brain".
In the 1940s and 50s, a handful of scientists from a variety of fields (mathematics, psychology, engineering, economics and political science) explored several research directions that would be vital to later AI research.[63] Alan Turing was among the first people to seriously investigate the theoretical possibility of "machine intelligence".[64] The field of "artificial intelligence research" was founded as an academic discipline in 1956.[65]

In 1950 Turing published a landmark paper, "Computing Machinery and Intelligence", in which he speculated about the possibility of creating machines that think.[67][b] In the paper, he noted that "thinking" is difficult to define and devised his famous Turing test: if a machine could carry on a conversation (over a teleprinter) that was indistinguishable from a conversation with a human being, then it was reasonable to say that the machine was "thinking".[68] This simplified version of the problem allowed Turing to argue convincingly that a "thinking machine" was at least plausible, and the paper answered all the most common objections to the proposition.[69] The Turing test was the first serious proposal in the philosophy of artificial intelligence.
Walter Pitts and Warren McCulloch analyzed networks of idealized artificial neurons and showed how they might perform simple logical functions in 1943. They were the first to describe what later researchers would call a neural network.[70] The paper was influenced by Turing's paper "On Computable Numbers" from 1936 using similar two-state boolean 'neurons', but was the first to apply it to neuronal function.[64] One of the students inspired by Pitts and McCulloch was Marvin Minsky, who was a 24-year-old graduate student at the time. In 1951 Minsky and Dean Edmonds built the first neural net machine, the SNARC.[71] Minsky would later become one of the most important leaders and innovators in AI.
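A rough sketch of how an idealized two-state neuron can compute simple logical functions is given below. It uses a weighted sum compared against a threshold, which is a common textbook simplification rather than the exact formalism of the 1943 paper; the function names and the particular weights and thresholds are assumptions chosen for illustration.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (1) when the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def AND(a, b):
    return mp_neuron([a, b], [1, 1], threshold=2)


def OR(a, b):
    return mp_neuron([a, b], [1, 1], threshold=1)


def NOT(a):
    # A negative weight stands in for an inhibitory connection.
    return mp_neuron([a], [-1], threshold=0)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b))
```

Because AND, OR and NOT can each be built from one such unit, networks of them can in principle compute any Boolean function.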
Experimental robots such as W. Grey Walter's turtles and the Johns Hopkins Beast were built in the 1950s. These machines did not use computers, digital electronics, or symbolic reasoning; they were controlled entirely by analog circuitry.[72]
In 1951, using the Ferranti Mark 1 machine of the University of Manchester, Christopher Strachey wrote a checkers program[73] and Dietrich Prinz wrote one for chess.[74] Arthur Samuel's checkers program, the subject of his 1959 paper "Some Studies in Machine Learning Using the Game of Checkers", eventually achieved sufficient skill to challenge a respectable amateur.[75] Samuel's program was among the first uses of what would later be called machine learning.[76] Game AI would continue to be used as a measure of progress in AI throughout its history.

When access to digital computers became possible in the mid-fifties, a few scientists instinctively recognized that a machine that could manipulate numbers could also manipulate symbols and that the manipulation of symbols could well be the essence of human thought. This was a new approach to creating thinking machines.[77][78]
In 1955, Allen Newell and future Nobel Laureate Herbert A. Simon created the "Logic Theorist", with help from J. C. Shaw. The program would eventually prove 38 of the first 52 theorems in Russell and Whitehead's Principia Mathematica, and find new and more elegant proofs for some.[79] Simon said that they had "solved the venerable mind/body problem, explaining how a system composed of matter can have the properties of mind."[80][c] The symbolic reasoning paradigm they introduced would dominate AI research and funding until the mid-1990s, as well as inspire the cognitive revolution.
The Dartmouth workshop of 1956 was a pivotal event that marked the formal inception of AI as an academic discipline.[65] It was organized by Marvin Minsky and John McCarthy, with the support of two senior scientists, Claude Shannon and Nathan Rochester of IBM. The proposal for the conference stated they intended to test the assertion that "every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it".[81][d] The term "Artificial Intelligence" was introduced by John McCarthy at the workshop.[e] The participants included Ray Solomonoff, Oliver Selfridge, Trenchard More, Arthur Samuel, Allen Newell and Herbert A. Simon, all of whom would create important programs during the first decades of AI research.[87][f] At the workshop Newell and Simon debuted the "Logic Theorist".[88] The workshop was the moment that AI gained its name, its mission, its first major success and its key players, and is widely considered the birth of AI.[g]
In the autumn of 1956, Newell and Simon also presented the Logic Theorist at a meeting of the Special Interest Group in Information Theory at the Massachusetts Institute of Technology (MIT). At the same meeting, Noam Chomsky discussed his generative grammar, and George Miller described his landmark paper "The Magical Number Seven, Plus or Minus Two". Miller wrote "I left the symposium with a conviction, more intuitive than rational, that experimental psychology, theoretical linguistics, and the computer simulation of cognitive processes were all pieces from a larger whole."[90][61]
This meeting was the beginning of the "cognitive revolution", an interdisciplinary paradigm shift in psychology, philosophy, computer science and neuroscience. It inspired the creation of the sub-fields of symbolic artificial intelligence, generative linguistics, cognitive science, cognitive psychology, cognitive neuroscience and the philosophical schools of computationalism and functionalism. All these fields used related tools to model the mind, and results discovered in one field were relevant to the others.
The cognitive approach allowed researchers to consider "mental objects" like thoughts, plans, goals, facts or memories, often analyzed using high level symbols in functional networks. These objects had been forbidden as "unobservable" by earlier paradigms such as behaviorism.[h] Symbolic mental objects would become the major focus of AI research and funding for the next several decades.
The programs developed in the years after the Dartmouth Workshop were, to most people, simply "astonishing":[i] computers were solving algebra word problems, proving theorems in geometry and learning to speak English. Few at the time would have believed that such "intelligent" behavior by machines was possible at all.[94][95][93] Researchers expressed an intense optimism in private and in print, predicting that a fully intelligent machine would be built in less than 20 years.[96] Government agencies like the Defense Advanced Research Projects Agency (DARPA, then known as "ARPA") poured money into the field.[97] Artificial Intelligence laboratories were set up at a number of British and US universities in the late 1950s and early 1960s.[64]
There were many successful programs and new directions in the late 50s and 1960s. Among the most influential were these:
Many early AI programs used the same basic algorithm. To achieve some goal (like winning a game or proving a theorem), they proceeded step by step towards it (by making a move or a deduction) as if searching through a maze, backtracking whenever they reached a dead end.[98] The principal difficulty was that, for many problems, the number of possible paths through the "maze" was astronomical (a situation known as a "combinatorial explosion"). Researchers would reduce the search space by using heuristics that would eliminate paths that were unlikely to lead to a solution.[99]
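The maze metaphor can be made literal with a small sketch of depth-first search with backtracking. This is an illustration under stated assumptions, not a reconstruction of any historical program: the grid encoding, the name `solve_maze`, and the Manhattan-distance heuristic are all hypothetical choices. The heuristic here only orders the moves (most promising first) rather than discarding paths outright, which is a milder variant of the pruning described above.

```python
def solve_maze(maze, start, goal):
    """Depth-first search with backtracking; returns a list of cells or None."""
    rows, cols = len(maze), len(maze[0])
    visited = set()

    def distance(cell):
        # Heuristic: Manhattan distance to the goal (an illustrative choice).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    def search(cell, path):
        if cell == goal:
            return path
        visited.add(cell)
        r, c = cell
        neighbors = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        open_cells = [
            (nr, nc) for nr, nc in neighbors
            if 0 <= nr < rows and 0 <= nc < cols
            and maze[nr][nc] == "." and (nr, nc) not in visited
        ]
        for nxt in sorted(open_cells, key=distance):  # most promising first
            result = search(nxt, path + [nxt])
            if result is not None:
                return result
        return None  # dead end: backtrack to the caller

    return search(start, [start])


MAZE = [
    "..#.",
    ".##.",
    "....",
]
path = solve_maze(MAZE, (0, 0), (0, 3))
print(path)
```

In this example the heuristic first leads the search into a dead end at the top of the grid, after which it backtracks and finds the longer route around the wall.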
Newell and Simon tried to capture a general version of this algorithm in a program called the "General Problem Solver".[100][101] Other "searching" programs were able to accomplish impressive tasks like solving problems in geometry and algebra, such as Herbert Gelernter's Geometry Theorem Prover (1958)[102] and Symbolic Automatic Integrator (SAINT), written by Minsky's student James Slagle in 1961.[103][104] Other programs searched through goals and subgoals to plan actions, like the STRIPS system developed at Stanford to control the behavior of the robot Shakey.[105]

An important goal of AI research is to allow computers to communicate in natural languages like English. An early success was Daniel Bobrow's program STUDENT, which could solve high school algebra word problems.[106]
A semantic net represents concepts (e.g. "house", "door") as nodes, and relations among concepts as links between the nodes (e.g. "has-a"). The first AI program to use a semantic net was written by Ross Quillian,[107] and the most successful (and controversial) version was Roger Schank's Conceptual dependency theory.[108]
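The node-and-link structure described above can be sketched as a small data structure. The class name `SemanticNet`, the example concepts, and the rule that properties are inherited along "is-a" links are assumptions made for illustration (and the sketch assumes the "is-a" links contain no cycles); it is not a model of Quillian's or Schank's actual systems.

```python
from collections import defaultdict


class SemanticNet:
    def __init__(self):
        # (node, relation) -> set of target nodes
        self.links = defaultdict(set)

    def add(self, source, relation, target):
        self.links[(source, relation)].add(target)

    def query(self, source, relation):
        """Targets linked directly, plus those inherited through "is-a" links."""
        found = set(self.links.get((source, relation), ()))
        for parent in self.links.get((source, "is-a"), ()):
            found |= self.query(parent, relation)
        return found


net = SemanticNet()
net.add("house", "has-a", "door")
net.add("cottage", "is-a", "house")
net.add("cottage", "has-a", "chimney")

print(sorted(net.query("cottage", "has-a")))  # prints ['chimney', 'door']
```

The query for "cottage" finds "door" even though that link was only stated for "house", which is the kind of inference a semantic net makes cheap.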
Joseph Weizenbaum's ELIZA could carry out conversations that were so realistic that users occasionally were fooled into thinking they were communicating with a human being and not a computer program (see ELIZA effect). But in fact, ELIZA simply gave a canned response or repeated back what was said to it, rephrasing its response with a few grammar rules. ELIZA was the first chatbot.[109][110]
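The mechanism described above, a canned reply or a rephrased echo of the user's words, can be sketched in a few lines. The patterns, pronoun table and canned reply below are hypothetical and are not taken from Weizenbaum's original script.

```python
import re

# Hypothetical rules for illustration only.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]
CANNED = "Please tell me more."


def reflect(fragment):
    """Swap first-person words for second-person ones."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(text):
    text = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return CANNED  # nothing matched: fall back to a canned response


print(respond("I feel ignored by my boss."))
print(respond("The weather is nice."))
```

The first call prints "Why do you feel ignored by your boss?" and the second falls back to the canned reply, with no model of meaning involved in either case.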
In the late 60s, Marvin Minsky and Seymour Papert of the MIT AI Laboratory proposed that AI research should focus on artificially simple situations known as micro-worlds.[j] They pointed out that in successful sciences like physics, basic principles were often best understood using simplified models like frictionless planes or perfectly rigid bodies. Much of the research focused on a "blocks world," which consists of colored blocks of various shapes and sizes arrayed on a flat surface.[111]
This paradigm led to innovative work in machine vision by Gerald Sussman, Adolfo Guzman, David Waltz (who invented "constraint propagation"), and especially Patrick Winston. At the same time, Minsky and Papert built a robot arm that could stack blocks, bringing the blocks world to life. Terry Winograd's SHRDLU could communicate in ordinary English sentences about the micro-world, plan operations and execute them.[111]
In the 1960s funding was primarily directed towards laboratories researching symbolic AI; however, several people still pursued research in neural networks.

The perceptron, a single-layer neural network, was introduced in 1958 by Frank Rosenblatt[112] (who had been a schoolmate of Marvin Minsky at the Bronx High School of Science).[113] Like most AI researchers, he was optimistic about the power of perceptrons, predicting that a perceptron "may eventually be able to learn, make decisions, and translate languages."[114] Rosenblatt was primarily funded by the Office of Naval Research.[115]
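As a rough illustration of what a single-layer perceptron does, the sketch below trains one threshold unit with the standard textbook perceptron learning rule on the logical AND function. Rosenblatt's machines were built in hardware; the function names, learning rate and epoch count here are assumptions for the example.

```python
def predict(weights, bias, inputs):
    """Output 1 when the weighted sum plus bias is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, rate=1):
    """Adjust weights and bias after every misclassified sample."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])  # prints [0, 0, 0, 1]
```

A single unit of this kind can only separate its inputs with a straight line, so it learns AND but cannot represent functions such as XOR, however long it is trained.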
Bernard Widrow and his student Ted Hoff built ADALINE (1960) and MADALINE (1962), which had up to 1000 adjustable weights.[116][117] A group at Stanford Research Institute led by Charles A. Rosen and Alfred E. (Ted) Brain built two neural network machines named MINOS I (1960) and II (1963), mainly funded by the U.S. Army Signal Corps. MINOS II[118] had 6600 adjustable weights,[119] and was controlled with an SDS 910 computer in a configuration named MINOS III (1968), which could classify symbols on army maps, and recognize hand-printed characters on Fortran coding sheets.[120][121] Most neural network research during this early period involved building and using bespoke hardware, rather than simulation on digital computers.[k]
However, partly due to lack of results and partly due to competition from symbolic AI research, the MINOS project ran out of funding in 1966. Rosenblatt failed to secure continued funding in the 1960s.[122] Research came to a sudden halt with the publication of Minsky and Papert's 1969 book Perceptrons.[123] It suggested that there were severe limitations to what perceptrons could do and that Rosenblatt's predictions had been grossly exaggerated. The effect of the book was that virtually no research was funded in connectionism for 10 years.[124] The competition for government funding ended with the victory of symbolic AI approaches over neural networks.[121][122]
Minsky (who had worked on SNARC) became a staunch objector to pure connectionist AI. Widrow (who had worked on ADALINE) turned to adaptive signal processing. The SRI group (which worked on MINOS) turned to symbolic AI and robotics.[121][122]
The main problem was the inability to train multilayered networks (versions of backpropagation had already been used in other fields, but the technique was unknown to these researchers).[125][124] The AI community became aware of backpropagation in the 80s,[126] and, in the 21st century, neural networks would become enormously successful, fulfilling all of Rosenblatt's optimistic predictions. Rosenblatt did not live to see this, however, as he died in a boating accident in 1971.[127]
The first generation of AI researchers made these predictions about their work:
In June 1963, MIT received a $2.2 million grant from the newly created Advanced Research Projects Agency (ARPA, later known as DARPA). The money was used to fund project MAC, which subsumed the "AI Group" founded by Minsky and McCarthy five years earlier. DARPA continued to provide $3 million each year until the 70s.[134] DARPA made similar grants to Newell and Simon's program at Carnegie Mellon University and to Stanford University's AI Lab, founded by John McCarthy in 1963.[135] Another important AI laboratory was established at Edinburgh University by Donald Michie in 1965.[136] These four institutions would continue to be the main centers of AI research and funding in academia for many years.[137][m]
The money was given with few strings attached: J. C. R. Licklider, then the director of ARPA, believed that his organization should "fund people, not projects!" and allowed researchers to pursue whatever directions might interest them.[139] This created a freewheeling atmosphere at MIT that gave birth to the hacker culture,[140] but this "hands off" approach did not last.
In the 1970s, AI was subject to critiques and financial setbacks. AI researchers had failed to appreciate the difficulty of the problems they faced. Their tremendous optimism had raised public expectations impossibly high, and when the promised results failed to materialize, funding targeted at AI was severely reduced.[141] The lack of success indicated the techniques being used by AI researchers at the time were insufficient to achieve their goals.[142][143]
These setbacks did not affect the growth and progress of the field, however. The funding cuts only impacted a handful of major laboratories[144] and the critiques were largely ignored.[145] General public interest in the field continued to grow,[144] the number of researchers increased dramatically,[144] and new ideas were explored in logic programming, commonsense reasoning and many other areas. Historian Thomas Haigh argued in 2023 that there was no winter,[144] and AI researcher Nils Nilsson described this period as the most "exciting" time to work in AI.[146]
In the early seventies, the capabilities of AI programs were limited. Even the most impressive could only handle trivial versions of the problems they were supposed to solve;[n] all the programs were, in some sense, "toys".[148] AI researchers had begun to run into several limits that would be only conquered decades later, and others that still stymie the field in the 2020s:
The agencies which funded AI research, such as the British government, DARPA and the National Research Council (NRC), became frustrated with the lack of progress and eventually cut off almost all funding for undirected AI research. The pattern began in 1966 when the Automatic Language Processing Advisory Committee (ALPAC) report criticized machine translation efforts. After spending $20 million, the NRC ended all support.[158] In 1973, the Lighthill report on the state of AI research in the UK criticized the failure of AI to achieve its "grandiose objectives" and led to the dismantling of AI research in that country.[159] (The report specifically mentioned the combinatorial explosion problem as a reason for AI's failings.)[143][147][s] DARPA was deeply disappointed with researchers working on the Speech Understanding Research program at CMU and canceled an annual grant of $3 million.[161][t]
Hans Moravec blamed the crisis on the unrealistic predictions of his colleagues. "Many researchers were caught up in a web of increasing exaggeration."[162][u] However, there was another issue: since the passage of the Mansfield Amendment in 1969, DARPA had been under increasing pressure to fund "mission-oriented direct research, rather than basic undirected research". Funding for the creative, freewheeling exploration that had gone on in the 60s would not come from DARPA, which instead directed money at specific projects with clear objectives, such as autonomous tanks and battle management systems.[163][v]
The major laboratories (MIT, Stanford, CMU and Edinburgh) had been receiving generous support from their governments, and when it was withdrawn, these were the only places that were seriously impacted by the budget cuts. The thousands of researchers outside these institutions and the many more thousands that were joining the field were unaffected.[144]
Several philosophers had strong objections to the claims being made by AI researchers. One of the earliest was John Lucas, who argued that Gödel's incompleteness theorem showed that a formal system (such as a computer program) could never see the truth of certain statements, while a human being could.[165] Hubert Dreyfus ridiculed the broken promises of the 1960s and critiqued the assumptions of AI, arguing that human reasoning actually involved very little "symbol processing" and a great deal of embodied, instinctive, unconscious "know how".[w][167] John Searle's Chinese Room argument, presented in 1980, attempted to show that a program could not be said to "understand" the symbols that it uses (a quality called "intentionality"). If the symbols have no meaning for the machine, Searle argued, then the machine cannot be described as "thinking".[168]
These critiques were not taken seriously by AI researchers. Problems like intractability and commonsense knowledge seemed much more immediate and serious. It was unclear what difference "know how" or "intentionality" made to an actual computer program. MIT's Minsky said of Dreyfus and Searle "they misunderstand, and should be ignored."[169] Dreyfus, who also taught at MIT, was given a cold shoulder: he later said that AI researchers "dared not be seen having lunch with me."[170] Joseph Weizenbaum, the author of ELIZA, was also an outspoken critic of Dreyfus' positions, but he "deliberately made it plain that [his AI colleagues' treatment of Dreyfus] was not the way to treat a human being,"[x] and was unprofessional and childish.[172]
Weizenbaum began to have serious ethical doubts about AI when Kenneth Colby wrote a "computer program which can conduct psychotherapeutic dialogue" based on ELIZA.[173][174][y] Weizenbaum was disturbed that Colby saw a mindless program as a serious therapeutic tool. A feud began, and the situation was not helped when Colby did not credit Weizenbaum for his contribution to the program. In 1976, Weizenbaum published Computer Power and Human Reason, which argued that the misuse of artificial intelligence has the potential to devalue human life.[176]
Logic was introduced into AI research as early as 1958, byJohn McCarthy in hisAdvice Taker proposal.[177][102] In 1963,J. Alan Robinson had discovered a simple method to implement deduction on computers, theresolution andunification algorithm.[102] However, straightforward implementations, like those attempted by McCarthy and his students in the late 1960s, were especially intractable: the programs required astronomical numbers of steps to prove simple theorems.[177][178] A more fruitful approach to logic was developed in the 1970s byRobert Kowalski at theUniversity of Edinburgh, and soon this led to the collaboration with French researchersAlain Colmerauer andPhilippe Roussel [fr] who created the successful logic programming languageProlog.[179] Prolog uses a subset of logic (Horn clauses, closely related to "rules" and "production rules") that permit tractable computation. Rules would continue to be influential, providing a foundation forEdward Feigenbaum'sexpert systems and the continuing work byAllen Newell andHerbert A. Simon that would lead toSoar and theirunified theories of cognition.[180]
Critics of the logical approach noted, asDreyfus had, that human beings rarely used logic when they solved problems. Experiments by psychologists likePeter Wason,Eleanor Rosch,Amos Tversky,Daniel Kahneman and others provided proof.[z] McCarthy responded that what people do is irrelevant. He argued that what is really needed are machines that can solve problems—not machines that think as people do.[aa]
Among the critics ofMcCarthy's approach were his colleagues across the country atMIT.Marvin Minsky,Seymour Papert andRoger Schank were trying to solve problems like "story understanding" and "object recognition" thatrequired a machine to think like a person. In order to use ordinary concepts like "chair" or "restaurant" they had to make all the same illogical assumptions that people normally made. Unfortunately, imprecise concepts like these are hard to represent in logic. MIT chose instead to focus on writing programs that solved a given task without using high-level abstract definitions or general theories of cognition, and measured performance by iterative testing, rather than arguments from first principles.Schank described their "anti-logic" approaches asscruffy, as opposed to theneat paradigm used byMcCarthy,Kowalski,Feigenbaum,Newell andSimon.[181][ab]
In 1975, in a seminal paper,Minsky noted that many of his fellow researchers were using the same kind of tool: a framework that captures all ourcommon sense assumptions about something. For example, if we use the concept of a bird, there is a constellation of facts that immediately come to mind: we might assume that it flies, eats worms and so on (none of which are true for all birds). Minsky associated these assumptions with the general category and they could beinherited by the frames for subcategories and individuals, or over-ridden as necessary. He called these structuresframes.Schank used a version of frames he called "scripts" to successfully answer questions about short stories in English.[182] Frames would eventually be widely used insoftware engineering under the nameobject-oriented programming.
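The inheritance-with-override behaviour described above can be illustrated with a short sketch. This is only a toy rendering of the idea, not Minsky's formalism: the `Frame` class, its `get` method and the slot names are all hypothetical choices made for the example.

```python
class Frame:
    """A frame: named slots whose defaults are inherited from a parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Look in this frame first, then walk up the chain of parents.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


bird = Frame("bird", flies=True, eats="worms")        # general category
penguin = Frame("penguin", parent=bird, flies=False)  # overrides one default
opus = Frame("opus", parent=penguin)                  # an individual

print(opus.get("flies"))  # False (overridden by the penguin frame)
print(opus.get("eats"))   # worms (inherited from the bird frame)
```

The parent chain plays the same role that class inheritance plays in object-oriented languages, which is the connection the paragraph above draws.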
The logicians rose to the challenge. Pat Hayes claimed that "most of 'frames' is just a new syntax for parts of first-order logic." But he noted that "there are one or two apparently minor details which give a lot of trouble, however, especially defaults".[183]
Ray Reiter admitted that "conventional logics, such as first-order logic, lack the expressive power to adequately represent the knowledge required for reasoning by default".[184] He proposed augmenting first-order logic with aclosed world assumption that a conclusion holds (by default) if its contrary cannot be shown. He showed how such an assumption corresponds to the common sense assumption made in reasoning with frames. He also showed that it has its "procedural equivalent" asnegation as failure inProlog. The closed world assumption, as formulated by Reiter, "is not a first-order notion. (It is a meta notion.)"[184] However,Keith Clark showed that negation asfinite failure can be understood as reasoning implicitly with definitions in first-order logic including aunique name assumption that different terms denote different individuals.[185]
During the late 1970s and throughout the 1980s, a variety of logics and extensions of first-order logic were developed both for negation as failure in logic programming and for default reasoning more generally. Collectively, these logics have become known as non-monotonic logics.
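A minimal sketch may make default reasoning under a closed world assumption more concrete. It imitates a Prolog-style rule, `flies(X) :- bird(X), not abnormal(X)`, in Python. The fact base and the predicate names are invented for illustration, and "not" is read as "cannot be shown from the known facts".

```python
# Known facts. Under the closed world assumption, anything absent is taken to be false.
facts = {("bird", "tweety"), ("bird", "opus"), ("penguin", "opus")}


def holds(predicate, individual):
    return (predicate, individual) in facts


def abnormal(individual):
    # The only exception recorded in this toy knowledge base.
    return holds("penguin", individual)


def flies(individual):
    # flies(X) :- bird(X), not abnormal(X).
    # "not" succeeds when abnormal(X) cannot be shown: negation as failure.
    return holds("bird", individual) and not abnormal(individual)


print(flies("tweety"))  # True: nothing shows that tweety is abnormal
print(flies("opus"))    # False: opus is known to be a penguin

# Learning a new fact withdraws an earlier conclusion.
facts.add(("penguin", "tweety"))
print(flies("tweety"))  # False
```

The last step shows why such logics are called non-monotonic: adding knowledge can remove conclusions, which never happens in classical first-order logic.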
In the 1980s, a form of AI program called "expert systems" was adopted by corporations around the world andknowledge became the focus of mainstream AI research. Governments provided substantial funding, such as Japan'sfifth generation computer project and the U.S.Strategic Computing Initiative. "Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars in 1988."[126]
An expert system is a program that answers questions or solves problems about a specific domain of knowledge, using logical rules that are derived from the knowledge of experts.[186] The earliest examples were developed by Edward Feigenbaum and his students. Dendral, begun in 1965, identified compounds from spectrometer readings.[187][124] MYCIN, developed in 1972, diagnosed infectious blood diseases.[126] They demonstrated the feasibility of the approach.
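The idea of answering questions from expert-derived rules can be sketched with a tiny forward-chaining loop. This is one common inference strategy, shown under stated assumptions. It is not a reconstruction of how Dendral or MYCIN worked, and the printer-troubleshooting rules are entirely made up.

```python
# Each rule: (conditions that must all be known, conclusion to add).
# These rules are hypothetical and exist only to exercise the loop.
RULES = [
    ({"no_power_light"}, "check_power_cable"),
    ({"power_light_on", "no_output"}, "suspect_paper_jam"),
    ({"suspect_paper_jam", "tray_empty"}, "refill_paper_tray"),
]


def forward_chain(observations, rules):
    """Apply rules repeatedly until no rule adds a new fact."""
    known = set(observations)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


result = forward_chain({"power_light_on", "no_output", "tray_empty"}, RULES)
print(sorted(result))
```

Because the knowledge lives in the rule list rather than in the control loop, rules can be added or edited without touching the inference code.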
Expert systems restricted themselves to a small domain of specific knowledge (thus avoiding thecommonsense knowledge problem)[124] and their simple design made it relatively easy for programs to be built and then modified once they were in place. All in all, the programs proved to beuseful: something that AI had not been able to achieve up to this point.[188]
In 1980, an expert system calledR1 was completed atCMU for theDigital Equipment Corporation. It was an enormous success: it was saving the company 40 million dollars annually by 1986.[189] Corporations around the world began to develop and deploy expert systems and by 1985 they were spending over a billion dollars on AI, most of it to in-house AI departments.[190] An industry grew up to support them, including hardware companies likeSymbolics andLisp Machines and software companies such asIntelliCorp andAion.[191]
In 1981, the Japanese Ministry of International Trade and Industry set aside $850 million for the Fifth generation computer project. Their objectives were to write programs and build machines that could carry on conversations, translate languages, interpret pictures, and reason like human beings.[192] Much to the chagrin of scruffies, they initially chose Prolog as the primary computer language for the project.[193]
Other countries responded with new programs of their own. The UK began the £350 millionAlvey project.[194] A consortium of American companies formed theMicroelectronics and Computer Technology Corporation (or "MCC") to fund large scale projects in AI and information technology.[195][194]DARPA responded as well, founding theStrategic Computing Initiative and tripling its investment in AI between 1984 and 1988.[196][197]
The power of expert systems came from the expert knowledge they contained. They were part of a new direction in AI research that had been gaining ground throughout the 70s. "AI researchers were beginning to suspect—reluctantly, for it violated the scientific canon ofparsimony—that intelligence might very well be based on the ability to use large amounts of diverse knowledge in different ways,"[198] writesPamela McCorduck. "[T]he great lesson from the 1970s was that intelligent behavior depended very much on dealing with knowledge, sometimes quite detailed knowledge, of a domain where a given task lay".[199]Knowledge based systems andknowledge engineering became a major focus of AI research in the 1980s.[200] It was hoped that vast databases would solve thecommonsense knowledge problem and provide the support thatcommonsense reasoning required.
In the 1980s some researchers attempted to attack the commonsense knowledge problem directly, by creating a massive database that would contain all the mundane facts that the average person knows. Douglas Lenat, who started a database called Cyc, argued that there is no shortcut: the only way for machines to know the meaning of human concepts is to teach them, one concept at a time, by hand.[201]
Although symbolicknowledge representation andlogical reasoning produced useful applications in the 80s and received massive amounts of funding, it was still unable to solve problems inperception,robotics,learning andcommon sense. A small number of scientists and engineers began to doubt that the symbolic approach would ever be sufficient for these tasks and developed other approaches, such as "connectionism",robotics,"soft" computing andreinforcement learning.Nils Nilsson called these approaches "sub-symbolic".

In 1982, physicist John Hopfield was able to prove that a form of neural network (now called a "Hopfield net") could learn and process information, and that it provably converges after enough time under any fixed condition. It was a breakthrough, as it was previously thought that nonlinear networks would, in general, evolve chaotically.[202] Geoffrey Hinton proved a similar result about a device called a "Boltzmann machine".[203] (Hopfield and Hinton would eventually receive the 2024 Nobel Prize for this work.[203]) In 1986, Hinton and David Rumelhart popularized a method for training neural networks called "backpropagation".[ac] These three developments helped to revive the exploration of artificial neural networks.[126][204]
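The convergence property can be seen in a very small Hopfield net. The sketch below assumes the textbook setup (units taking values +1 or -1, Hebbian weights with a zero diagonal, one unit updated at a time). The function names `train` and `recall` are illustrative choices, not terminology from Hopfield's paper.

```python
def train(patterns):
    """Hebbian weights: w[i][j] is the sum over patterns of p[i] * p[j], zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=100):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train([stored])
noisy = [1, -1, 1, -1, -1, -1]  # the stored pattern with one unit flipped

print(recall(weights, noisy) == stored)  # True
```

Starting from the corrupted input, the network settles back into the stored pattern and then stops changing, which is the kind of stable behaviour the convergence result guarantees.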
Neural networks, along with several other similar models, received widespread attention after the 1986 publication of theParallel Distributed Processing, a two volume collection of papers edited byRumelhart and psychologistJames McClelland. The new field was christened "connectionism" and there was a considerable debate between advocates ofsymbolic AI and the "connectionists".[126] Hinton called symbols the "luminous aether of AI"―that is, an unworkable and misleading model of intelligence.[126] This was a direct attack on the principles that inspired thecognitive revolution.
Neural networks started to advance state of the art in some specialist areas such as protein structure prediction. Following pioneering work from Terry Sejnowski,[205] cascading multilayer perceptrons such as PhD[206] and PsiPred[207] reached near-theoretical maximum accuracy in predicting secondary structure.
In 1990, Yann LeCun at Bell Labs used convolutional neural networks to recognize handwritten digits. The system was widely used in the 90s, reading zip codes and personal checks. This was the first genuinely useful application of neural networks.[208][209]
Rodney Brooks,Hans Moravec and others argued that, in order to show real intelligence, a machine needs to have abody—it needs to perceive, move, survive, and deal with the world.[210] Sensorimotor skills are essential to higher level skills such ascommonsense reasoning. They can't be efficiently implemented using abstract symbolic reasoning, so AI should solve the problems of perception, mobility, manipulation and survival without using symbolic representation at all. These robotics researchers advocated building intelligence "from the bottom up".[ad]
A precursor to this idea wasDavid Marr, who had come toMIT in the late 1970s from a successful background in theoretical neuroscience to lead the group studyingvision. He rejected all symbolic approaches (bothMcCarthy's logic andMinsky's frames), arguing that AI needed to understand the physical machinery of vision from the bottom up before any symbolic processing took place. (Marr's work would be cut short by leukemia in 1980.)[212]
In his 1990 paper "Elephants Don't Play Chess",[213] robotics researcher Brooks took direct aim at thephysical symbol system hypothesis, arguing that symbols are not always necessary since "the world is its own best model. It is always exactly up to date. It always has every detail there is to be known. The trick is to sense it appropriately and often enough."[214]
In the 1980s and 1990s, manycognitive scientists also rejected the symbol processing model of the mind and argued that the body was essential for reasoning, a theory called the "embodied mind thesis".[215]
Soft computing uses methods that work with incomplete and imprecise information. They do not attempt to give precise, logical answers, but give results that are only "probably" correct. This allowed them to solve problems that precise symbolic methods could not handle. Press accounts often claimed these tools could "think like a human".[216][217]
Judea Pearl's Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, an influential 1988 book,[218] brought probability and decision theory into AI.[219] Fuzzy logic, developed by Lotfi Zadeh in the 60s, began to be more widely used in AI and robotics. Evolutionary computation and artificial neural networks also handle imprecise information, and are classified as "soft". In the 90s and early 2000s many other soft computing tools were developed and put into use, including Bayesian networks,[219] hidden Markov models,[219] information theory, and stochastic modeling. These tools in turn depended on advanced mathematical techniques such as classical optimization. For a time in the 1990s and early 2000s, these soft tools were studied by a subfield of AI called "computational intelligence".[220]
Reinforcement learning[221] gives an agent a reward every time it performs a desired action well, and may give negative rewards (or "punishments") when it performs poorly. It was described in the first half of the twentieth century by psychologists using animal models, such asThorndike,[222][223]Pavlov[224] andSkinner.[225] In the 1950s,Alan Turing[223][226] andArthur Samuel[223] foresaw the role of reinforcement learning in AI.
A successful and influential research program was led by Richard Sutton and Andrew Barto beginning in 1972. Their collaboration revolutionized the study of reinforcement learning and decision making over the following four decades.[227][228] In 1988, Sutton described machine learning in terms of decision theory (i.e., the Markov decision process). This gave the subject a solid theoretical foundation and access to a large body of theoretical results developed in the field of operations research.[228]
Also in 1988, Sutton and Barto developed the "temporal difference" (TD) learning algorithm, where the agent is rewarded only when its predictions about the future show improvement. It significantly outperformed previous algorithms.[229] TD-learning was used by Gerald Tesauro in 1992 in the program TD-Gammon, which played backgammon as well as the best human players. The program learned the game by playing against itself with zero prior knowledge.[230] In an interesting case of interdisciplinary convergence, neurologists discovered in 1997 that the dopamine reward system in brains also uses a version of the TD-learning algorithm.[231][232][233] TD learning would become highly influential in the 21st century, used in both AlphaGo and AlphaZero.[234]
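The core of temporal-difference learning is an update that moves each prediction toward the reward received plus the next prediction. The sketch below shows tabular TD(0), a simple member of the TD family, on a made-up deterministic chain. The environment, the function name `td0_chain` and the parameter values are assumptions for illustration, not the 1988 formulation or the TD-Gammon setup.

```python
def td0_chain(n_states=4, episodes=200, alpha=0.5, gamma=1.0):
    """Estimate state values on a chain that always steps right.

    Reaching the terminal state pays a reward of 1; every other step pays 0.
    """
    # The extra last entry is the terminal state, whose value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for state in range(n_states):
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # TD error: how much the newer prediction disagrees with the old one.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values[:n_states]


print([round(v, 3) for v in td0_chain()])  # [1.0, 1.0, 1.0, 1.0]
```

Only the final step is ever rewarded, yet the estimate for every earlier state converges to 1 because each state learns from the prediction of its successor rather than waiting for the final outcome.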
The business community's fascination with AI rose and fell in the 1980s in the classic pattern of aneconomic bubble. As dozens of companies failed, the perception in the business world was that the technology was not viable.[235] The damage to AI's reputation would last into the 21st century. Inside the field there was little agreement on the reasons for AI's failure to fulfill the dream of human level intelligence that had captured the imagination of the world in the 1960s. Together, all these factors helped to fragment AI into competing subfields focused on particular problems or approaches, sometimes even under new names that disguised the tarnished pedigree of "artificial intelligence".[236]
Over the next 20 years, AI consistently delivered working solutions to specific isolated problems. By the late 1990s, it was being used throughout the technology industry, although somewhat behind the scenes. The success was due to increasing computer power, collaboration with other fields (such as mathematical optimization and statistics), and higher standards of scientific accountability.
The term "AI winter" was coined by researchers who had survived the funding cuts of 1974 when they became concerned that enthusiasm for expert systems had spiraled out of control and that disappointment would certainly follow.[ae] Their fears were well founded: in the late 1980s and early 1990s, AI suffered a series of financial setbacks.[126]
The first indication of a change in weather was the sudden collapse of the market for specialized AI hardware in 1987. Desktop computers fromApple andIBM had been steadily gaining speed and power and in 1987 they became more powerful than the more expensiveLisp machines made bySymbolics and others. There was no longer a good reason to buy them. An entire industry worth half a billion dollars was demolished overnight.[238]
Eventually the earliest successful expert systems, such asXCON, proved too expensive to maintain. They were difficult to update, they could not learn, and they were "brittle" (i.e., they could make grotesque mistakes when given unusual inputs). Expert systems proved useful, but only in a few special contexts.[239]
In the late 1980s, the Strategic Computing Initiative cut funding to AI "deeply and brutally". New leadership at DARPA had decided that AI was not "the next wave" and directed funds towards projects that seemed more likely to produce immediate results.[240]
By 1991, the impressive list of goals penned in 1981 for Japan'sFifth Generation Project had not been met. Some of them, like "carry on a casual conversation", would not be accomplished for another 30 years. As with other AI projects, expectations had run much higher than what was actually possible.[241][af]
Over 300 AI companies had shut down, gone bankrupt, or been acquired by the end of 1993, effectively ending the first commercial wave of AI.[243] In 1994,HP Newquist stated inThe Brain Makers that "The immediate future of artificial intelligence—in its commercial form—seems to rest in part on the continued success of neural networks."[243]
In the 1990s, algorithms originally developed by AI researchers began to appear as parts of larger systems. AI had solved a lot of very difficult problems[ag] and their solutions proved to be useful throughout the technology industry,[244][245] such asdata mining,industrial robotics, logistics,speech recognition,[246] banking software,[247] medical diagnosis,[247] andGoogle's search engine.[248][249]
The field of AI received little or no credit for these successes in the 1990s and early 2000s. Many of AI's greatest innovations have been reduced to the status of just another item in the tool chest of computer science.[250]Nick Bostrom explains: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore."[247]
Many researchers in AI in the 1990s deliberately called their work by other names, such as informatics, knowledge-based systems, "cognitive systems" or computational intelligence. In part, this may have been because they considered their field to be fundamentally different from AI, but the new names also helped to procure funding.[246][251][252] In the commercial world at least, the failed promises of the AI winter continued to haunt AI research into the 2000s, as the New York Times reported in 2005: "Computer scientists and software engineers avoided the term artificial intelligence for fear of being viewed as wild-eyed dreamers."[253]
AI researchers began to develop and use sophisticated mathematical tools more than they ever had in the past.[254][255] Most of the new directions in AI relied heavily on mathematical models, includingartificial neural networks,probabilistic reasoning,soft computing andreinforcement learning. In the 90s and 2000s, many other highly mathematical tools were adapted for AI. These tools were applied to machine learning, perception, and mobility.
There was a widespread realization that many of the problems that AI needed to solve were already being worked on by researchers in fields likestatistics,mathematics,electrical engineering,economics, oroperations research. The shared mathematical language allowed both a higher level of collaboration with more established and successful fields and the achievement of results which were measurable and provable; AI had become a more rigorous "scientific" discipline. Another key reason for the success in the 90s was that AI researchers focused on specific problems with verifiable solutions (an approach later derided asnarrow AI). This provided useful tools in the present, rather than speculation about the future.
A new paradigm called "intelligent agents" became widely accepted during the 1990s.[256][257][ah] Although earlier researchers had proposed modular "divide and conquer" approaches to AI,[ai] the intelligent agent did not reach its modern form untilJudea Pearl,Allen Newell,Leslie P. Kaelbling, and others brought concepts fromdecision theory and economics into the study of AI.[258] When theeconomist's definition of arational agent was married tocomputer science's definition of anobject ormodule, the intelligent agent paradigm was complete.
Anintelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. By this definition, simple programs that solve specific problems are "intelligent agents", as are human beings and organizations of human beings, such asfirms. The intelligent agent paradigm defines AI research as "the study of intelligent agents".[aj] This is a generalization of some earlier definitions of AI: it goes beyond studying human intelligence; it studies all kinds of intelligence. The paradigm gave researchers license to study isolated problems and to disagree about methods, but still retain hope that their work could be combined into anagent architecture that would be capable of general intelligence.[259]
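The definition above, an agent that perceives and then acts to maximize its chances of success, can be read as expected-utility maximization. The sketch below is one hedged interpretation. The umbrella scenario, the probabilities, the utilities and every function name are invented for the example.

```python
ACTIONS = ["take_umbrella", "leave_umbrella"]

# Hypothetical utilities for each possible outcome.
UTILITY = {"dry": 0.7, "wet": 0.0, "dry_unencumbered": 1.0}


def outcome_model(percept, action):
    """Hypothetical world model: (outcome, probability) pairs for an action."""
    if action == "take_umbrella":
        return [("dry", 1.0)]
    p_rain = 0.8 if percept == "cloudy" else 0.1
    return [("wet", p_rain), ("dry_unencumbered", 1.0 - p_rain)]


def choose_action(percept, actions, model, utility):
    """Pick the action with the highest expected utility given the percept."""
    def expected_utility(action):
        return sum(p * utility[outcome] for outcome, p in model(percept, action))
    return max(actions, key=expected_utility)


print(choose_action("cloudy", ACTIONS, outcome_model, UTILITY))  # take_umbrella
print(choose_action("sunny", ACTIONS, outcome_model, UTILITY))   # leave_umbrella
```

Nothing in `choose_action` depends on the task, which reflects the point that very different systems can be described within the same agent framework.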
On 11 May 1997,Deep Blue became the first computer chess-playing system to beat a reigning world chess champion,Garry Kasparov.[260] In 2005, a Stanford robot won theDARPA Grand Challenge by driving autonomously for 131 miles along an unrehearsed desert trail. Two years later, a team from CMU won theDARPA Urban Challenge by autonomously navigating 55 miles in an urban environment while responding to traffic hazards and adhering to traffic laws.[261]
These successes were not due to some revolutionary new paradigm, but mostly to the tedious application of engineering skill and to the tremendous increase in the speed and capacity of computers by the 90s.[ak] In fact, Deep Blue's computer was 10 million times faster than the Ferranti Mark 1 that Christopher Strachey taught to play chess in 1951.[al] This dramatic increase is described by Moore's law, which in its popular form predicts that the speed and memory capacity of computers double every two years. The fundamental problem of "raw computer power" was slowly being overcome.
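As a rough consistency check on these figures, taking the two-year doubling period at face value over the 46 years between 1951 and 1997 gives:

```latex
\[
\frac{1997 - 1951}{2} = 23 \ \text{doublings},
\qquad
2^{23} = 8{,}388{,}608 \approx 8.4 \times 10^{6}
\]
```

This is the same order of magnitude as the quoted factor of 10 million. It is only back-of-the-envelope arithmetic, not a measurement of either machine.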
Electronic literature experiments such asThe Impermanence Agent (1998–2002) and digital art such asAgent Ruby used AI in their art and literature, "laying bare the bias accompanying forms of technology that feign objectivity."[263]
In the first decades of the 21st century, access to large amounts of data (known as "big data"),cheaper and faster computers and advancedmachine learning techniques were successfully applied to many problems throughout the economy. A turning point was the success ofdeep learning around 2012 which improved the performance of machine learning on many tasks, including image and video processing, text analysis, and speech recognition.[264] Investment in AI increased along with its capabilities, and by 2016, the market for AI-related products, hardware, and software reached more than $8 billion, and theNew York Times reported that interest in AI had reached a "frenzy".[265]
In 2002,Ben Goertzel and others became concerned that AI had largely abandoned its original goal of producing versatile, fully intelligent machines, and argued in favor of more direct research intoartificial general intelligence (AGI). By the mid-2010s several companies and institutions had been founded to pursue artificial general intelligence, such asOpenAI andGoogle'sDeepMind. During the same period, new insights intosuperintelligence raised concerns that AI was anexistential threat. The risks and unintended consequences of AI technology became an area of serious academic research after 2016.
The success of machine learning in the 2000s depended on the availability of vast amounts of training data and faster computers.[266] Russell and Norvig wrote that the "improvement in performance obtained by increasing the size of the data set by two or three orders of magnitude outweighs any improvement that can be made by tweaking the algorithm."[208]Geoffrey Hinton recalled that back in the 80s and 90s the problem was that "our labeled datasets were thousands of times too small. [And] our computers were millions of times too slow."[267] This was no longer true by 2010.
The most useful data in the 2000s came from curated, labeled data sets created specifically for machine learning and AI. In 2007, a group at UMass Amherst released Labeled Faces in the Wild, an annotated set of images of faces that was widely used to train and test face recognition systems for many years afterward.[268] Fei-Fei Li developed ImageNet, a database of three million images captioned by volunteers using Amazon Mechanical Turk. Released in 2009, it was a useful body of training data and a benchmark for testing for the next generation of image processing systems.[269][208] Google released word2vec in 2013 as an open source resource. It used large amounts of text scraped from the internet and word embedding to create a numeric vector to represent each word. Users were surprised at how well it was able to capture word meanings; for example, ordinary vector addition would give equivalences like China + River = Yangtze or London − England + France = Paris.[270] This resource in particular would be essential for the development of large language models in the late 2010s.
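The vector arithmetic behind an equivalence such as London − England + France = Paris can be shown with a toy example. The four-dimensional vectors below are hand-made so that the arithmetic works out exactly. They are not trained embeddings, and real word2vec vectors have hundreds of learned dimensions with no such tidy meaning.

```python
import math

# Hand-made toy vectors: (is_capital, england, france, germany).
vectors = {
    "London": [1.0, 1.0, 0.0, 0.0],
    "England": [0.0, 1.0, 0.0, 0.0],
    "Paris": [1.0, 0.0, 1.0, 0.0],
    "France": [0.0, 0.0, 1.0, 0.0],
    "Berlin": [1.0, 0.0, 0.0, 1.0],
    "Germany": [0.0, 0.0, 0.0, 1.0],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


print(analogy("London", "England", "France"))  # Paris
```

Subtracting "England" removes the country component from "London" and leaves the "capital" direction, and adding "France" then lands on the vector for "Paris".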
The explosive growth of the internet gave machine learning programs access to billions of pages of text and images that could bescraped. And, for specific problems, large privately held databases contained the relevant data.McKinsey Global Institute reported that "by 2009, nearly all sectors in the US economy had at least an average of 200 terabytes of stored data".[271] This collection of information was known in the 2000s asbig data.
In a Jeopardy! exhibition match in February 2011, IBM's question answering system Watson defeated the two best Jeopardy! champions, Brad Rutter and Ken Jennings, by a significant margin.[272] Watson's expertise would have been impossible without the information available on the internet.[208]
In 2012,AlexNet, adeep learning model,[am] developed byAlex Krizhevsky, won theImageNet Large Scale Visual Recognition Challenge, with significantly fewer errors than the second-place winner.[274][208] Krizhevsky worked withGeoffrey Hinton at theUniversity of Toronto.[an] This was a turning point in machine learning: over the next few years dozens of other approaches to image recognition were abandoned in favor of deep learning.[266]
Deep learning uses a multi-layer perceptron. Although this architecture has been known since the 60s, getting it to work requires powerful hardware and large amounts of training data.[275] Before these became available, improving performance of image processing systems required hand-crafted ad hoc features that were difficult to implement.[275] Deep learning was simpler and more general.[ao]
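A multi-layer perceptron is a stack of layers, each computing weighted sums followed by a nonlinearity. The sketch below uses a two-layer network whose weights are set by hand so that it computes XOR, a function no single-layer perceptron can represent. In deep learning the weights would be learned from data, so the values here are purely illustrative.

```python
def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def xor_mlp(x1, x2):
    # Hidden layer: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1).
    hidden = dense([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Output layer: y = h1 - 2 * h2, with no nonlinearity.
    (output,) = dense(hidden, [[1.0, -2.0]], [0.0], lambda v: v)
    return output


print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0.0, 1.0, 1.0, 0.0]
```

The hidden layer builds intermediate features that the output layer combines. Deep networks repeat this step many times, which is why they can replace hand-crafted features when enough data and compute are available.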
Deep learning was applied to dozens of problems over the next few years (such as speech recognition, machine translation, medical diagnosis, and game playing). In every case it showed enormous gains in performance.[266] Investment and interest in AI boomed as a result.[266]
It became fashionable in the 2000s to begin talking about the future of AI again and several popular books considered the possibility of superintelligent machines and what they might mean for human society. Some of this was optimistic (such as Ray Kurzweil's The Singularity is Near), but others, such as Nick Bostrom and Eliezer Yudkowsky, warned that a sufficiently powerful AI was an existential threat to humanity.[276] The topic became widely covered in the press and many leading intellectuals and politicians commented on the issue.
AI programs in the 21st century are defined by their goals: the specific measures that they are designed to optimize. Nick Bostrom's influential 2014 book Superintelligence[277] argued that, if one isn't careful about defining these goals, the machine may cause harm to humanity in the process of achieving a goal. Stuart J. Russell used the example of an intelligent robot that kills its owner to prevent itself from being unplugged, reasoning "you can't fetch the coffee if you're dead".[278] (This problem is known by the technical term "instrumental convergence".) The solution is to align the machine's goal function with the goals of its owner and humanity in general. Thus, the problem of mitigating the risks and unintended consequences of AI became known as "the value alignment problem" or AI alignment.[279]
At the same time, machine learning systems had begun to have disturbing unintended consequences.Cathy O'Neil explained how statistical algorithms had been among the causes of the2008 economic crash,[280]Julia Angwin ofProPublica argued that theCOMPAS system used by the criminal justice system exhibited racial bias under some measures,[281][ap] others showed that many machine learning systems exhibited some form of racialbias,[283] and there were many other examples of dangerous outcomes that had resulted from machine learning systems.[aq]
In 2016, the election ofDonald Trump and the controversy over the COMPAS system illuminated several problems with the current technological infrastructure, including misinformation, social media algorithms designed to maximize engagement, the misuse of personal data and the trustworthiness of predictive models.[284] Issues offairness and unintended consequences became significantly more popular at AI conferences, publications vastly increased, funding became available, and many researchers refocused their careers on these issues. Thevalue alignment problem became a serious field of academic study.[285][ar]
In the early 2000s, several researchers became concerned that mainstream AI was too focused on "measurable performance in specific applications"[287] (known as "narrow AI") and had abandoned AI's original goal of creating versatile, fully intelligent machines. An early critic wasNils Nilsson in 1995, and similar opinions were published by AI elder statesmen John McCarthy, Marvin Minsky, and Patrick Winston in 2007–2009. Minsky organized a symposium on "human-level AI" in 2004.[287]Ben Goertzel adopted the term "artificial general intelligence" for the new sub-field, founding a journal and holding conferences beginning in 2008.[288] The new field grew rapidly, buoyed by the continuing success of artificial neural networks and the hope that it was the key to AGI.
Several competing companies, laboratories and foundations were founded to develop AGI in the 2010s. DeepMind was founded in 2010 by three English scientists, Demis Hassabis, Shane Legg and Mustafa Suleyman, with funding from Peter Thiel and later Elon Musk. The founders and financiers were deeply concerned about AI safety and the existential risk of AI. DeepMind's founders had a personal connection with Yudkowsky, and Musk was among those who were actively raising the alarm.[289] Hassabis was both worried about the dangers of AGI and optimistic about its power; he hoped they could "solve AI, then solve everything else."[290] The New York Times wrote in 2023, "At the heart of this competition is a brain-stretching paradox. The people who say they are most worried about AI are among the most determined to create it and enjoy its riches. They have justified their ambition with their strong belief that they alone can keep AI from endangering Earth."[289]
In 2012, Geoffrey Hinton (who had been leading neural network research since the 1980s) was approached by Baidu, which wanted to hire him and all his students for an enormous sum. Hinton decided to hold an auction and, at a Lake Tahoe AI conference, they sold themselves to Google for a price of $44 million. Hassabis took notice and sold DeepMind to Google in 2014, on the condition that it would not accept military contracts and would be overseen by an ethics board.[289]

Larry Page of Google, unlike Musk and Hassabis, was an optimist about the future of AI. Musk and Page became embroiled in an argument about the risk of AGI at Musk's 2015 birthday party. They had been friends for decades but stopped speaking to each other shortly afterwards. Musk attended the one and only meeting of DeepMind's ethics board, where it became clear that Google was uninterested in mitigating the harm of AGI. Frustrated by his lack of influence, he founded OpenAI in 2015, enlisting Sam Altman to run it and hiring top scientists. OpenAI began as a non-profit, "free from the economic incentives that were driving Google and other corporations."[289] Musk became frustrated again and left the company in 2018. OpenAI turned to Microsoft for continued financial support, and Altman and OpenAI formed a for-profit version of the company with more than $1 billion in financing.[289]
In 2021, Dario Amodei and 14 other scientists left OpenAI over concerns that the company was putting profits above safety. They formed Anthropic, which soon had $6 billion in financing from Amazon and Google.[289]


The AI boom started with the initial development of key architectures and algorithms such as the transformer architecture in 2017, leading to the scaling and development of large language models exhibiting human-like traits of knowledge, attention, and creativity. The new AI era began in the early 2020s, with the public release of scaled large language models (LLMs) such as ChatGPT.[292]
In 2017, the transformer architecture was proposed by Google researchers in a paper titled "Attention Is All You Need". It exploits a self-attention mechanism and became widely used in large language models.[293] Large language models, based on the transformer, were further developed by other companies: OpenAI released GPT-3 in 2020, then DeepMind released Gato in 2022. These are foundation models: they are trained on vast quantities of unlabeled data and can be adapted to a wide range of downstream tasks. These models can discuss a huge number of topics and display general knowledge, which has raised questions around whether or not they are examples of artificial general intelligence.
Bill Gates was skeptical of the new technology and the hype that surrounded AGI. However, Altman presented him with a live demo of ChatGPT-4 passing an advanced biology test. Gates was convinced.[289] In 2023, Microsoft Research tested the model with a large variety of tasks, and concluded that "it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system".[294]
In 2024, OpenAI o3, a type of advanced reasoning model developed by OpenAI, was announced. On the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) benchmark developed by François Chollet in 2019, the model achieved an unofficial score of 87.5% on the semi-private test, surpassing the typical human score of 84%. The benchmark is intended to be a necessary, but not sufficient, test for AGI. Speaking of the benchmark, Chollet has said "You'll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible."[295]
Investment in AI grew sharply after 2020, with venture capital funding for generative AI companies increasing dramatically. Total AI investments rose from $18 billion in 2014 to $119 billion in 2021, with generative AI accounting for approximately 30% of investments by 2023.[296] According to metrics from 2017 to 2021, the United States outranked the rest of the world in terms of venture capital funding, number of startups, and AI patents granted.[297] The commercial AI scene became dominated by American Big Tech companies, whose investments in this area surpassed those from U.S.-based venture capitalists.[298] OpenAI's valuation reached $86 billion by early 2024,[299] while NVIDIA's market capitalization surpassed $3.3 trillion by mid-2024, making it the world's largest company by market capitalization as the demand for AI-capable GPUs surged.[300]
15.ai, launched in March 2020[301] by an anonymous MIT researcher,[302][303] was one of the earliest examples of generative AI gaining widespread public attention during the initial stages of the AI boom.[304] The free web application demonstrated the ability to clone character voices using neural networks with minimal training data, requiring as little as 15 seconds of audio to reproduce a voice, a capability later corroborated by OpenAI in 2024.[305] The service went viral on social media platforms in early 2021,[306][307] allowing users to generate speech for characters from popular media franchises, and became particularly notable for its pioneering role in popularizing AI voice synthesis for creative content and memes.[308]
Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable. This confidence must be well justified and increase with the magnitude of a system's potential effects. OpenAI's recent statement regarding artificial general intelligence, states that "At some point, it may be important to get independent review before starting to train future systems, and for the most advanced efforts to agree to limit the rate of growth of compute used for creating new models." We agree. That point is now.
Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4. This pause should be public and verifiable, and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.
ChatGPT was launched on 30 November 2022, marking a pivotal moment in artificial intelligence's public adoption. Within days of its release it went viral, gaining over 100 million users in two months and becoming the fastest-growing consumer software application in history.[310] The chatbot's ability to engage in human-like conversations, write code, and generate creative content captured public imagination and led to rapid adoption across various sectors including education, business, and research.[311] ChatGPT's success prompted unprecedented responses from major technology companies: Google declared a "code red" and rapidly launched Gemini (formerly known as Google Bard), while Microsoft incorporated the technology into Bing Chat.[312]
The rapid adoption of these AI technologies sparked intense debate about their implications. Notable AI researchers and industry leaders voiced both optimism and concern about the accelerating pace of development. In March 2023, over 20,000 signatories, including computer scientist Yoshua Bengio, Elon Musk, and Apple co-founder Steve Wozniak, signed an open letter calling for a pause in advanced AI development, citing "profound risks to society and humanity."[313] However, other prominent researchers like Juergen Schmidhuber took a more optimistic view, emphasizing that the majority of AI research aims to make "human lives longer and healthier and easier."[314]
By mid-2024, however, the financial sector began to scrutinize AI companies more closely, particularly questioning their capacity to produce a return on investment commensurate with their massive valuations. Some prominent investors raised concerns about market expectations becoming disconnected from fundamental business realities. Jeremy Grantham, co-founder of GMO LLC, warned investors to "be quite careful" and drew parallels to previous technology-driven market bubbles.[315] Similarly, Jeffrey Gundlach, CEO of DoubleLine Capital, explicitly compared the AI boom to the dot-com bubble of the late 1990s, suggesting that investor enthusiasm might be outpacing realistic near-term capabilities and revenue potential.[316] These concerns were amplified by the substantial market capitalizations of AI-focused companies, many of which had yet to demonstrate sustainable profitability models.
In March 2024, Anthropic released the Claude 3 family of large language models, including Claude 3 Haiku, Sonnet, and Opus.[317] The models demonstrated significant improvements in capabilities across various benchmarks, with Claude 3 Opus notably outperforming leading models from OpenAI and Google.[318] In June 2024, Anthropic released Claude 3.5 Sonnet, which demonstrated improved performance compared to the larger Claude 3 Opus, particularly in areas such as coding, multistep workflows, and image analysis.[319]
In 2024, the Royal Swedish Academy of Sciences awarded Nobel Prizes in recognition of groundbreaking contributions to artificial intelligence. The recipients included:
In January 2025, OpenAI announced a new AI, ChatGPT Gov, which would be specifically designed for US government agencies to use securely.[321] OpenAI said that agencies could utilize ChatGPT Gov on a Microsoft Azure cloud or Azure Government cloud, "on top of Microsoft's Azure's OpenAI Service." OpenAI's announcement stated that "Self-hosting ChatGPT Gov enables agencies to more easily manage their own security, privacy, and compliance requirements, such as stringent cybersecurity frameworks (IL5, CJIS, ITAR, FedRAMP High). Additionally, we believe this infrastructure will expedite internal authorization of OpenAI's tools for the handling of non-public sensitive data."[321]
Countries have invested in policies and funding to deploy autonomous robots in an attempt to address labor shortages and enhance efficiency, while also implementing regulatory frameworks for ethical and safe development.
In 2025, China invested approximately 730 billion yuan (roughly US$100 billion) to advance AI and robotics in smart manufacturing and healthcare.[322] The "14th Five-Year Plan" (2021–2025) prioritized service robots, with AI systems enabling robots to perform complex tasks like assisting in surgeries or automating factory assembly lines.[323] Some funding also supported defense applications, such as autonomous drones.[324][325] Starting in September 2025, China mandated labeling of AI-generated content to ensure transparency and public trust in these technologies.[326]
In January 2025, Stargate LLC was formed as a joint venture of OpenAI, SoftBank, Oracle, and MGX, which announced plans to invest US$500 billion in AI infrastructure in the United States by 2029. The venture was formally announced by U.S. President Donald Trump on 21 January 2025, with SoftBank CEO Masayoshi Son appointed as chairman.[327][328]
The U.S. government allocated approximately $2 billion to integrate AI and robotics in manufacturing and logistics.[329] State governments supplemented this with funding for service robots, such as those deployed in warehouses to fulfill verbal commands for inventory management or in eldercare facilities to respond to residents' requests for assistance.[330] Some funds were directed to defense, including lethal autonomous weapons and military robots. In January 2025, Executive Order 14179 established an "AI Action Plan" to accelerate innovation and deployment of these technologies with the declared intent of "world domination" and "victory".[331][332]
While AI voice memes have been around in some form since '15.ai' launched in 2020, [...]
AI voice tools used to create "audio deepfakes" have existed for years in one form or another, with 15.ai being a notable example.