large Reddit logo overlaying background of smaller logo silhouettes
Image Credits: TechCrunch

OpenAI inks deal to train AI on Reddit data

OpenAI has reached a deal with Reddit to use the social news site’s data for training AI models.

In a blog post on OpenAI’s press relations site, the company said that the Reddit partnership will provide it access to “real-time, structured and unique content” (e.g., posts and replies) from Reddit, allowing its tools and models to “better understand and showcase” that content. Reddit content will be incorporated into ChatGPT, OpenAI’s popular conversational AI, and the companies will work together to bring unspecified new “AI-powered features” to both Reddit users and moderators.

OpenAI will also become a Reddit advertising partner.

“Reddit will be building on OpenAI’s platform of AI models to bring its powerful vision to life,” OpenAI wrote in the post. “Using LLMs, ML, and AI allow Reddit to improve the user experience for everyone.”

OpenAI has several similar licensing deals with content providers ranging from stock media libraries to news publishers. But the unusual angle to this one is that Sam Altman, OpenAI’s CEO, has an 8.7% stake in Reddit, making him the third-largest shareholder, and was once a member of the company’s board of directors.

In an attempt to discourage scrutiny, OpenAI says in its press release that, while Altman remains a Reddit shareholder, the partnership “was led by OpenAI’s COO [Brad Lightcap]” and “approved by [OpenAI’s] independent board of directors.” (I’ll note here that Altman is a member of OpenAI’s board; he recused himself for this decision, however, an OpenAI spokesperson tells TechCrunch.)

Reddit has made data licensing agreements an increasingly central part of its growth strategy as it navigates the market as a public company.

In its IPO prospectus, Reddit revealed that it has contractual agreements to license its data to customers, including Google, worth more than $200 million combined. And, in its first earnings report as a public company, Reddit reported a 450% year-over-year increase in non-ad revenue, attributable mainly to those agreements.

Reddit stock was up 11% in extended trading following the announcement of the OpenAI deal.

“The paradox I see is that, as more content on the internet is written by machines, there’s an increasing premium on content that comes from real people,” Reddit CEO Steve Huffman said during the company’s earnings call in March. “And we have nearly two decades of authentic conversation.”

Reddit’s platform — which has over 1 billion posts and more than 16 billion comments, figures that grow every day thanks to its hundreds of millions of active users — is a gold mine for generative AI companies, whose models learn from examples of content, like text and images, to generate new, similar content.

But the company could face pushback from users concerned about how it’s monetizing their data.

It’s instructive to look at Stack Overflow, the Q&A forum for software developers, which recently inked an agreement with OpenAI to supply data for the latter’s model training. In protest, some users deleted their top-rated answers to questions on the community. But Stack Overflow restored the deleted posts and banned those users, claiming that they weren’t in compliance with its terms of service.

Reddit has already voiced its displeasure with one attempt to afford Reddit users greater control over their own data.

Vana, a startup built on the blockchain, is attempting to launch a data “DAO” (decentralized autonomous organization) to let Reddit users pool their data and decide together how that combined data is used (or sold). Reddit banned Vana’s subreddit dedicated to discussion about the DAO and, in a statement to TechCrunch, accused the company of “exploiting” its data export controls.

Kyle Wiggers

AI Editor

Kyle Wiggers was TechCrunch’s AI Editor until June 2025. His writing has appeared in VentureBeat and Digital Trends, as well as a range of gadget blogs including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Manhattan with his partner, a music therapist.