

Reddit sues Perplexity, accusing AI startup of 'Industrial-Scale Data Laundering'

  • Marijan Hassan - Tech Journalist
  • 8 minutes ago
  • 2 min read

Social media giant Reddit has filed a lawsuit in New York federal court against AI search startup Perplexity AI, accusing the company and three data-scraping firms of engaging in an "industrial-scale, unlawful" operation to steal Reddit content.



The lawsuit, filed on Wednesday, alleges that Perplexity is a "willing customer" in a new "data laundering economy" designed to feed the AI industry's insatiable appetite for human-generated content.


The core allegation: Circumvention and theft

The complaint centers on the theft of user posts and comments, which Reddit refers to as "one of the largest and most dynamic collections of human conversation ever created." Unlike major players such as OpenAI and Google, which have signed paid licensing deals with Reddit for access to its data, Perplexity is accused of obtaining the content through unauthorized means.


Reddit's lawsuit names the following co-defendants, describing them as the "would-be bank robbers" who supply the stolen goods:


  • Oxylabs UAB (Lithuania-based)

  • AWMProxy (Russia-based)

  • SerpApi (Texas-based)


Reddit alleges that these scraping companies circumvented both Reddit's own anti-scraping measures and Google's anti-scraping controls by disguising their bots and scraping Reddit content indirectly from billions of Google Search results.


Reddit’s “trap” and Perplexity’s defense

In a key piece of evidence, Reddit claims it created a "test post" on its site that was configured to be visible only to Google's search crawler. Within hours, that hidden content appeared in Perplexity's AI-generated search results, which Reddit argues is definitive proof that the startup relied on data scraped from Google.


Perplexity quickly responded to the lawsuit with a statement on its own subreddit, arguing that its approach is "principled and responsible."


"Perplexity, as an application-layer company, does not train AI models on content. Never has. So it is impossible for us to sign a license agreement to do so... We summarize Reddit discussions, and we cite Reddit threads in answers, just like people share links to posts here all the time," they wrote.


Perplexity suggested the lawsuit is a "show of force" by Reddit in its ongoing data negotiations with Google and OpenAI, and vowed it will not "bow to strong-arm tactics."


A battle for the future of the open web

Reddit's Chief Legal Officer, Ben Lee, framed the lawsuit as part of a broader fight, stating, "AI companies are locked in an arms race for quality human content, and that pressure has fuelled an industrial-scale 'data laundering' economy."


The case highlights the growing tension between AI developers, who argue that public web content should be free to ingest for model training, and content owners, who are seeking to monetize their proprietary data.
