OpenAI Five

From Wikipedia, the free encyclopedia

Machine-learned bot project using the video game Dota 2

OpenAI Five is a computer program by OpenAI that plays the five-on-five video game Dota 2. Its first public appearance was in 2017, when it was demonstrated in a live one-on-one game against the professional player Dendi, who lost to it. The following year, the system had advanced to the point of performing as a full team of five, and began playing against, and showing the capability to defeat, professional teams.

By choosing a game as complex as Dota 2 to study machine learning, OpenAI thought they could more accurately capture the unpredictability and continuity seen in the real world, and thus construct more general problem-solving systems. The algorithms and code used by OpenAI Five were eventually borrowed by another neural network in development by the company, one which controlled a physical robotic hand. OpenAI Five has been compared to other similar cases of artificial intelligence (AI) playing against and defeating humans, such as AlphaStar in the video game StarCraft II, AlphaGo in the board game Go, Deep Blue in chess, and Watson on the television game show Jeopardy!.

History

Development on the algorithms used for the bots began in November 2016. OpenAI decided to use Dota 2, a competitive five-on-five video game, as a base because it was popular on the live streaming platform Twitch, had native support for Linux, and had an application programming interface (API) available.^[1] Before the system became a team of five, its first public demonstration occurred at The International 2017 in August, the annual premier championship tournament for the game, where Dendi, a Ukrainian professional player, lost against an OpenAI bot in a live one-on-one matchup.^[2]^[3] After the match, CTO Greg Brockman explained that the bot had learned by playing against itself for two weeks of real time, and that the learning software was a step in the direction of creating software that can handle complex tasks "like being a surgeon".^[4]^[5] OpenAI used a methodology called reinforcement learning: the bots learn over time by playing against themselves hundreds of times a day for months, and are rewarded for actions such as killing an enemy and destroying towers.^[6]^[7]^[8]

By June 2018, the bots had progressed to playing together as a full team of five and were able to defeat teams of amateur and semi-professional players.^[9]^[6]^[10]^[11] At The International 2018, OpenAI Five played in two games against professional teams, one against the Brazilian-based paiN Gaming and the other against an all-star team of former Chinese players.^[12]^[13]^[14] Although the bots lost both matches, OpenAI still considered it a successful venture, stating that playing against some of the best players in Dota 2 allowed them to analyze and adjust their algorithms for future games.^[15] The bots' final public demonstration occurred in April 2019, when they won a best-of-three series against The International 2018 champions OG at a live event in San Francisco.^[16] A four-day online event to play against the bots, open to the public, occurred the same month.^[17] There, the bots played in 42,729 public games, winning 99.4% of them.^[18]

Architecture

Each OpenAI Five bot is a neural network containing a single layer with a 4096-unit^[19] LSTM that observes the current game state extracted from the Dota developer's API. The neural network acts through numerous possible action heads (no human data involved), and every head has a meaning: for instance, the number of ticks to delay an action, which action to select, or the X or Y coordinate of the action in a grid around the unit. The action heads are computed independently. The AI system observes the world as a list of 20,000 numbers and takes an action by emitting a list of eight enumeration values, which together encode the selected action and its target.^[20]
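The idea of independently computed action heads can be illustrated with a toy sketch. It is not OpenAI's implementation: the head names, their sizes, the 16-value hidden vector standing in for the LSTM output, and the random weights are all assumptions made for illustration, and only four heads are shown rather than eight.

```python
import math
import random

# Hypothetical head names and sizes, chosen only for illustration.
HEAD_SIZES = {
    "delay_ticks": 4,
    "action_type": 6,
    "offset_x": 9,
    "offset_y": 9,
}
HIDDEN_SIZE = 16  # small stand-in for the LSTM output vector


def make_heads(rng):
    """Create one small random weight matrix per action head."""
    return {
        name: [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
               for _ in range(size)]
        for name, size in HEAD_SIZES.items()
    }


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(hidden, heads, rng):
    """Sample one enumeration value per head, each head independently."""
    action = {}
    for name, weights in heads.items():
        # Each head sees the same hidden state but has its own weights.
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
        probs = softmax(logits)
        action[name] = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action


rng = random.Random(0)
heads = make_heads(rng)
hidden = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
action = select_action(hidden, heads, rng)
print(action)
```

The result is a small dictionary of integers, one per head, which mirrors the description of an action as a short list of enumeration values.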

OpenAI Five is trained on "Rapid", a general-purpose reinforcement learning training system. Rapid consists of two layers: the first spins up thousands of machines and helps them ‘talk’ to each other, and the second runs software. By 2018, OpenAI Five was playing around 180 years' worth of games per day in reinforcement learning, running on 256 GPUs and 128,000 CPU cores,^[21] using Proximal Policy Optimization, a policy gradient method.^[20]^[22]
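As a rough illustration of Proximal Policy Optimization, the sketch below computes the clipped surrogate objective described in the PPO paper^[22] for a small batch. It is a minimal sketch under stated assumptions: the function name is hypothetical, the clipping range of 0.2 is only a commonly used default, and the article does not state which variant or hyperparameters OpenAI Five used.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under
    the current and the data-collecting policy. epsilon is an assumed
    clipping range, not a value taken from OpenAI Five.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio r_t
        clipped = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Taking the minimum removes the incentive to move the policy
        # far from the one that collected the data.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# When both policies agree, the objective is just the mean advantage.
print(ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, -1.0]))  # 0.0
```

With a positive advantage, a ratio above 1 + epsilon earns no extra credit; with a negative advantage, the unclipped (more pessimistic) term is kept.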

Comparison chart
	OpenAI 1v1 bot (2017)	OpenAI Five (2018)
CPUs	60,000 CPU cores on Microsoft Azure	128,000 pre-emptible CPU cores on the Google Cloud Platform (GCP)
GPUs	256 K80 GPUs on Azure	256 P100 GPUs on the GCP
Experience collected	~300 years per day	~180 years per day
Size of observation	~3.3 kB	~36.8 kB
Observations per second of gameplay	10	7.5
Batch size	8,388,608 observations	1,048,576 observations
Batches per minute	~20	~60

Comparisons with other game AI systems

Before OpenAI Five, other AI systems had been pitted successfully against humans, such as Watson in Jeopardy!, Deep Blue in chess, and AlphaGo in Go.^[23]^[24]^[25] Compared with these games, Dota 2 differs in the following ways:^[20]

Long run view: The game runs at 30 frames per second for an average match time of 45 minutes, which results in roughly 80,000 ticks per game. OpenAI Five observes every fourth frame, generating about 20,000 moves. By comparison, chess usually ends before 40 moves, while Go ends before 150 moves.

Partially observed state of the game: Players and their allies can only see the map directly around them. The rest of it is covered in a fog of war which hides enemy units and their movements. Thus, playing Dota 2 requires making inferences based on this incomplete data, as well as predicting what the opponent could be doing at the same time. By comparison, chess and Go are "full-information games", as they do not hide elements from the opposing player.^[26]

Continuous action space: Each playable character in a Dota 2 game, known as a hero, can take dozens of actions that target either another unit or a position. The OpenAI Five developers discretize the space into 170,000 possible actions per hero. Without counting the continuous aspects of the game, there are an average of ~1,000 valid actions each tick. By comparison, the average number of actions is 35 in chess and 250 in Go.

Continuous observation space: Dota 2 is played on a large map with ten heroes, five on each team, along with dozens of buildings and non-player character (NPC) units. The OpenAI system observes the state of a game through the developer's bot API, as 20,000 numbers that constitute all the information a human is allowed to access. By comparison, a chess board is represented as about 70 enumeration values, and a Go board as about 400.
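The time-horizon figures quoted in the comparison can be reproduced with a few lines of arithmetic. The constant names are illustrative, and the article rounds the results to 80,000 ticks and 20,000 moves.

```python
FRAMES_PER_SECOND = 30
MATCH_MINUTES = 45
FRAME_SKIP = 4  # the bot observes every fourth frame

ticks_per_game = FRAMES_PER_SECOND * 60 * MATCH_MINUTES
moves_per_game = ticks_per_game // FRAME_SKIP

CHESS_MOVES = 40   # upper figure quoted for a typical chess game
GO_MOVES = 150     # upper figure quoted for a typical Go game

print(ticks_per_game)                 # 81000
print(moves_per_game)                 # 20250
print(moves_per_game // CHESS_MOVES)  # 506
print(moves_per_game // GO_MOVES)     # 135
```

On these numbers, a single Dota 2 match involves roughly 500 times as many decisions as a chess game and well over 100 times as many as a game of Go.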

Reception

OpenAI Five has received acknowledgement from the AI, tech, and video game communities at large. Microsoft founder Bill Gates called it a "big deal", as its victories "required teamwork and collaboration".^[8]^[27] Chess champion Garry Kasparov, who lost against the Deep Blue AI in 1997, stated that despite their losing performance at The International 2018, the bots would eventually "get there, and sooner than expected".^[28]

In a conversation with MIT Technology Review, AI experts also considered the OpenAI Five system a significant achievement, noting that Dota 2 was an "extremely complicated game", so even beating non-professional players was impressive.^[26] PC Gamer wrote that the bots' wins against professional players were a significant event in machine learning.^[29] In contrast, Motherboard wrote that the victory was "basically cheating" due to the simplified hero pools on both sides, as well as the fact that the bots were given direct access to the API, as opposed to using computer vision to interpret pixels on the screen.^[30] The Verge wrote that the bots were evidence that the company's approach to reinforcement learning and its general philosophy about AI was "yielding milestones".^[17]

In 2019, DeepMind unveiled a similar bot for StarCraft II, AlphaStar. Like OpenAI Five, AlphaStar used reinforcement learning and self-play. The Verge reported that the goal of this type of AI research "is not just to crush humans in various games just to prove it can be done", but to prove that, with enough time, effort, and resources, "sophisticated AI software can best humans at virtually any competitive cognitive challenge, be it a board game or a modern video game". They added that the DeepMind and OpenAI victories were also a testament to the power of certain uses of reinforcement learning.^[31]

OpenAI hoped that the technology could have applications outside of the digital realm. In 2018, they were able to reuse the same reinforcement learning algorithms and training code from OpenAI Five for Dactyl, a human-like robot hand with a neural network built to manipulate physical objects.^[32] In 2019, Dactyl solved the Rubik's Cube.^[33]

References

[1] OpenAI. "OpenAI Five". openai.com/five. Archived from the original on 1 September 2018. Retrieved 10 October 2018.
[2] Savov, Vlad (14 August 2017). "My favorite game has been invaded by killer AI bots and Elon Musk hype". The Verge. Archived from the original on 26 June 2018. Retrieved 25 June 2018.
[3] Frank, Blair Hanley. "OpenAI's bot beats top Dota 2 player so badly that he quits". Venture Beat. Archived from the original on 12 August 2017. Retrieved 12 August 2017.
[4] OpenAI (11 August 2017). "Dota 2". blog.openai.com. Archived from the original on 11 August 2017. Retrieved 12 August 2017.
[5] OpenAI (16 August 2017). "More on Dota 2". blog.openai.com. Archived from the original on 16 August 2017. Retrieved 16 August 2017.
[6] Simonite, Tom (25 June 2018). "Can Bots Outwit Humans in One of the Biggest Esports Games?". Wired. Archived from the original on 25 June 2018. Retrieved 25 June 2018.
[7] Kahn, Jeremy (25 June 2018). "A Bot Backed by Elon Musk Has Made an AI Breakthrough in Video Game World". Bloomberg.com. Archived from the original on 27 June 2018. Retrieved 27 June 2018.
[8] "Bill Gates says gamer bots from Elon Musk-backed nonprofit are 'huge milestone' in A.I." CNBC. 28 June 2018. Archived from the original on 28 June 2018. Retrieved 28 June 2018.
[9] OpenAI (18 July 2018). "OpenAI Five Benchmark". blog.openai.com. Archived from the original on 26 August 2018. Retrieved 25 August 2018.
[10] Vincent, James (25 June 2018). "AI bots trained for 180 years a day to beat humans at Dota 2". The Verge. Archived from the original on 25 June 2018. Retrieved 25 June 2018.
[11] Savov, Vlad (6 August 2018). "The OpenAI Dota 2 bots just defeated a team of former pros". The Verge. Archived from the original on 7 August 2018. Retrieved 7 August 2018.
[12] Hutson, Matthew (31 July 2019). "Just Months Old, a Game-Playing A.I. Takes on the World". Medium. Retrieved 12 June 2025.
[13] Simonite, Tom. "Pro Gamers Fend off Elon Musk-Backed AI Bots—for Now". Wired. Archived from the original on 24 August 2018. Retrieved 25 August 2018.
[14] Quach, Katyanna. "Game over, machines: Humans defeat OpenAI bots once again at video games Olympics". The Register. Archived from the original on 25 August 2018. Retrieved 25 August 2018.
[15] OpenAI (24 August 2018). "The International 2018: Results". blog.openai.com. Archived from the original on 24 August 2018. Retrieved 25 August 2018.
[16] Wiggers, Kyle (13 April 2019). "OpenAI Five defeats professional Dota 2 team, twice". Venture Beat. Archived from the original on 13 April 2019. Retrieved 13 April 2019.
[17] Statt, Nick (13 April 2019). "OpenAI's Dota 2 AI steamrolls world champion e-sports team with back-to-back victories". The Verge. Vox Media. Archived from the original on 15 April 2019. Retrieved 15 April 2019.
[18] Wiggers, Kyle (22 April 2019). "OpenAI's Dota 2 bot defeated 99.4% of players in public matches". Venture Beat. Archived from the original on 11 July 2019. Retrieved 22 April 2019.
[19] "Dota 2 with Large Scale Deep Reinforcement Learning" (PDF). OpenAI. Archived (PDF) from the original on 26 September 2024. Retrieved 29 September 2024.
[20] OpenAI (25 June 2018). "OpenAI Five". blog.openai.com. Archived from the original on 25 June 2018. Retrieved 25 June 2018.
[21] "Why are AI researchers so obsessed with games?". QUARTZ. 4 August 2018. Archived from the original on 4 August 2018. Retrieved 4 August 2018.
[22] Schulman, John; Wolski, Filip; Dhariwal, Prafulla; Radford, Alec; Klimov, Oleg (2017). "Proximal Policy Optimization Algorithms". arXiv:1707.06347 [cs.LG].
[23] Gabbatt, Adam (17 February 2011). "IBM computer Watson wins Jeopardy clash". The Guardian. Archived from the original on 21 September 2013. Retrieved 17 February 2011.
[24] "Chess grandmaster Garry Kasparov on what happens when machines 'reach the level that is impossible for humans to compete'". Business Insider. Archived from the original on 29 December 2017. Retrieved 29 December 2017.
[25] "DeepMind's Go-playing AI doesn't need human help to beat us anymore". The Verge. 18 October 2017. Archived from the original on 18 October 2017. Retrieved 18 October 2017.
[26] Knight, Will (25 June 2018). "A team of AI algorithms just crushed humans in a complex computer game". MIT Tech Review. Retrieved 25 June 2018.
[27] "Bill Gates hails 'huge milestone' for AI as bots work in a team to destroy humans at video game 'Dota 2'". Business Insider. Archived from the original on 27 June 2018. Retrieved 27 June 2018.
[28] "Garry Kasparov's Twitter". 24 August 2018. Retrieved 24 August 2018.
[29] Park, Morgan (11 August 2018). "How the OpenAI Five tore apart a team of Dota 2 pros". PC Gamer. Retrieved 25 May 2020.
[30] Gault, Matthew (17 August 2018). "OpenAI Is Beating Humans at 'Dota 2' Because It's Basically Cheating". Vice. Retrieved 25 May 2020.
[31] Statt, Nick (30 October 2019). "DeepMind's StarCraft 2 AI is now better than 99.8 percent of all human players". The Verge. Retrieved 25 May 2020.
[32] OpenAI; Andrychowicz, Marcin; Baker, Bowen; Chociej, Maciek; Józefowicz, Rafał; McGrew, Bob; Pachocki, Jakub; Petron, Arthur; Plappert, Matthias; Powell, Glenn; Ray, Alex; Schneider, Jonas; Sidor, Szymon; Tobin, Josh; Welinder, Peter; Weng, Lilian; Zaremba, Wojciech (2019). "Learning Dexterous In-Hand Manipulation". arXiv:1808.00177v5 [cs.LG].
[33] OpenAI; Akkaya, Ilge; Andrychowicz, Marcin; Chociej, Maciek; Litwin, Mateusz; McGrew, Bob; Petron, Arthur; Paino, Alex; Plappert, Matthias; Powell, Glenn; Ribas, Raphael (2019). "Solving Rubik's Cube with a Robot Hand". arXiv:1910.07113v1 [cs.LG].

Retrieved from "https://en.wikipedia.org/w/index.php?title=OpenAI_Five&oldid=1337928454"

Categories:

Hidden categories:

[8]ページ先頭