OpenAI Codex

From Wikipedia, the free encyclopedia
Artificial intelligence model geared towards programming

OpenAI Codex refers to two AI-assisted software development tools released by OpenAI. They translate natural language into code, a technology described by artificial intelligence researchers as an AI agent.[1]

On August 10, 2021, OpenAI announced Codex, a code autocompletion tool available in select IDEs such as Visual Studio Code and Neovim. It was a modified, production version of GPT-3,[2] fine-tuned on gigabytes of source code in a dozen programming languages. It was the original model powering GitHub Copilot.[3]

On April 16, 2025, OpenAI published Codex CLI, an AI agent harness that runs locally on a user's computer, to GitHub under an Apache 2.0 license.[4][5] The company also announced a language model, codex-mini-latest, available only behind an API. It was a fine-tuned version of o4-mini, specifically trained for use in Codex CLI.[6]

On May 16, 2025, OpenAI announced the launch of a research preview of a distinct tool with a similar purpose, also named Codex, based on a fine-tuned version of OpenAI o3.[7] It is a software agent that performs computer programming tasks, including writing features, answering codebase questions, running tests, and proposing pull requests for review. It has two versions: one runs in a virtual machine in the cloud, and in the other the agent runs in the cloud but performs actions on a local machine connected via API (similar in operation to Cursor or Claude Code). It is available to ChatGPT Pro, Enterprise, Team, and Plus users.[8][9]

Capabilities

Based on GPT-3, a neural network trained on text, Codex was additionally trained on 159 gigabytes of Python code from 54 million GitHub repositories.[10][11] A typical use case of Codex is for a user to type a comment, such as "//compute the moving average of an array for a given window size", then use the AI to suggest a block of code that satisfies that comment prompt.[12] OpenAI stated that Codex can complete approximately 37% of requests and is meant to make human programming faster rather than to replace it. According to OpenAI's blog, Codex excels most at "mapping... simple problems to existing code", which they describe as "probably the least fun part of programming".[13][14] Jeremy Howard, co-founder of Fast.ai, stated that "Codex is a way of getting code written without having to write as much code", and that "it is not always correct, but it is just close enough".[15] According to a paper by OpenAI researchers, when Codex attempted each test case 100 times, it generated working solutions for 70.2% of prompts.[16]
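As an illustration of the comment-to-code use case described above, the following is a minimal, hand-written sketch of the kind of block that could satisfy the quoted comment prompt. It is not actual Codex output; the function name `moving_average` and its handling of edge cases are assumptions made for this example, and the comment uses Python's `#` syntax rather than the `//` of the quoted prompt.

```python
from typing import List


# compute the moving average of an array for a given window size
def moving_average(values: List[float], window: int) -> List[float]:
    """Return the average of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        # Assumption for this sketch: no full window means no averages.
        return []

    # Keep a running sum so each step costs O(1) instead of O(window).
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages
```

For example, `moving_average([1, 2, 3, 4, 5], 3)` averages the windows (1, 2, 3), (2, 3, 4), and (3, 4, 5).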

OpenAI claims that Codex can create code in over a dozen programming languages, including Go, JavaScript, Perl, PHP, Ruby, Shell, Swift, and TypeScript, though it is most effective in Python.[3] According to VentureBeat, demonstrations uploaded by OpenAI showed impressive coreference resolution capabilities. The demonstrators were able to create a browser game in JavaScript and generate data science charts using matplotlib.[14]

OpenAI showed that Codex can interface with services and apps such as Mailchimp, Microsoft Word, Spotify, and Google Calendar.[14][17]

The Codex-1 model is trained to detect requests for malware, exploits, or policy-violating content and to return a refusal that cites the relevant policy clause. The container in which the agent runs has no outbound internet access and only whitelisted dependencies, which is intended to reduce the blast radius of any bad code.[18]

Issues

OpenAI demonstrations showcased flaws such as inefficient code and one-off quirks in code samples.[14] In an interview with The Verge, OpenAI chief technology officer Greg Brockman said that "sometimes [Codex] doesn't quite know exactly what you're asking" and that it can require some trial and error.[17] OpenAI researchers found that Codex struggles with multi-step prompts, often failing or yielding counter-intuitive behavior. Additionally, they brought up several safety issues, such as over-reliance by novice programmers, biases based on the training data, and security impacts due to vulnerable code.[16]

VentureBeat stated that because Codex[19] is trained on public data, it could be vulnerable to "data poisoning" via intentional uploads of malicious code.[14] According to a study by researchers from New York University, approximately 40% of code generated by GitHub Copilot (which uses Codex) in scenarios relevant to high-risk CWEs included glitches or other exploitable design flaws.[20]
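To make "exploitable design flaws" concrete, the sketch below contrasts two ways of writing the same database lookup. The first builds the query by string concatenation, the pattern behind CWE-89 (SQL injection), a well-known high-risk weakness; the second uses a parameterized query. This is a hand-written illustration, not code from the study or output from Copilot, and the table layout and the function names `make_db`, `find_user_unsafe`, and `find_user_safe` are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is spliced into the SQL text, so a crafted
    # name can change the meaning of the query (CWE-89).
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the driver binds the value, so it is treated only as data.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With the input `x' OR '1'='1`, the concatenated query matches every row in the table, while the parameterized query matches none.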

Copyright

The Free Software Foundation expressed concerns that code snippets generated by Copilot and Codex could violate copyright, in particular the condition of the GPL that requires derivative works to be licensed under equivalent terms.[21] Issues they raised include whether training on public repositories falls under fair use, how developers could discover infringing generated code, whether trained machine learning models could be considered modifiable source code or a compilation of the training data, and whether machine learning models could themselves be copyrighted and by whom.[21][22] An internal GitHub study found that approximately 0.1% of generated code contained direct copies from the training data. In one example, the model output training data code implementing the fast inverse square root algorithm, including comments and an incorrect copyright notice.[12]

In response, OpenAI stated that "legal uncertainty on the copyright implications of training AI systems imposes substantial costs on AI developers and so should be authoritatively resolved."[12]

The copyright issues with Codex have been compared to the Authors Guild, Inc. v. Google, Inc. court case, in which judges ruled that Google Books's use of text snippets from millions of scanned books constituted fair use.[12][23]

References

  1. Metz, Cade (2025-05-16). "OpenAI Unveils New Tool for Computer Programmers". The New York Times. Retrieved 2025-05-20.
  2. "OpenAI Releases GPT-3, The Largest Model So Far". Analytics India Magazine. 3 June 2020. Retrieved 7 April 2022.
  3. Zaremba, Wojciech (August 10, 2021). "OpenAI Codex". OpenAI. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  4. openai/codex, OpenAI, 2025-08-11, retrieved 2025-08-11
  5. Wiggers, Kyle (2025-04-16). "OpenAI debuts Codex CLI, an open source coding tool for terminals". TechCrunch. Retrieved 2025-08-11.
  6. "OpenAI Platform". platform.openai.com. Retrieved 2025-08-11.
  7. Knight, Will (2025-05-16). "OpenAI Launches an Agentic, Web-Based Coding Tool". Wired. Retrieved 2025-05-20.
  8. "OpenAI Platform". platform.openai.com. Retrieved 2025-07-31.
  9. "OpenAI Codex". openai.com. Retrieved 2025-07-31.
  10. Wiggers, Kyle (July 8, 2021). "OpenAI warns AI behind GitHub's Copilot may be susceptible to bias". VentureBeat. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  11. Alford, Anthony (August 31, 2021). "OpenAI Announces 12 Billion Parameter Code-Generation AI Codex". InfoQ. Archived from the original on 2022-07-09. Retrieved 2021-09-03.
  12. Anderson, Tim; Quach, Katyanna (July 6, 2021). "GitHub Copilot auto-coder snags emerge, from seemingly spilled secrets to bad code, but some love it". The Register. Archived from the original on 2023-06-02. Retrieved 2021-09-04.
  13. Dorrier, Jason (August 15, 2021). "OpenAI's Codex Translates Everyday Language Into Computer Code". SingularityHub. Archived from the original on 2023-05-26. Retrieved 2021-09-03.
  14. Dickson, Ben (August 16, 2021). "What to expect from OpenAI's Codex API". VentureBeat. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  15. Metz, Cade (September 9, 2021). "A.I. Can Now Write Its Own Computer Code. That's Good News for Humans". The New York Times. Archived from the original on 2022-03-30. Retrieved 2021-09-16.
  16. Chen, Mark; Tworek, Jerry; Jun, Heewoo; Yuan, Qiming; Pinto, Henrique Ponde de Oliveira; Kaplan, Jared; Edwards, Harri; Burda, Yuri; Joseph, Nicholas; Brockman, Greg; Ray, Alex (2021-07-14). "Evaluating Large Language Models Trained on Code". arXiv:2107.03374 [cs].
  17. Vincent, James (August 10, 2021). "OpenAI can translate English into code with its new machine learning software Codex". The Verge. Archived from the original on 2021-09-02. Retrieved 2021-09-03.
  18. Nuzhnyy, Sergey (May 19, 2025). "What is Codex? Exploring OpenAI's AI Coding Agentx". AI/ML API.
  19. "Coding's Next Frontier: How OpenAI Codex Is Redefining Software Engineering". 2025-05-17. Retrieved 2025-05-26.
  20. Pearce, Hammond; Ahmad, Baleegh; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh (2021-12-16). "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions". arXiv:2108.09293 [cs.CR].
  21. Krill, Paul (August 2, 2021). "GitHub Copilot is 'unacceptable and unjust,' says Free Software Foundation". InfoWorld. Archived from the original on 2021-09-03. Retrieved 2021-09-03.
  22. Robertson, Donald (2021-07-28). "FSF-funded call for white papers on philosophical and legal questions around Copilot: Submit before Monday, August 23, 2021". Free Software Foundation. Archived from the original on 2021-08-11. Retrieved 2021-09-04.
  23. Barber, Gregory (July 12, 2021). "GitHub's Commercial AI Tool Was Built From Open Source Code". WIRED. Archived from the original on 2021-07-25. Retrieved 2021-09-04.