December 3, 2025

OpenAI desperate to avoid explaining why it deleted pirated book datasets

OpenAI risks increased fines after deleting pirated books datasets.

TL;DR

  • Authors suing OpenAI claim ChatGPT was illegally trained on their copyrighted works using datasets "Books 1" and "Books 2."
  • OpenAI deleted these datasets before ChatGPT's release, citing "non-use," and has invoked attorney-client privilege over communications about the deletion.
  • US magistrate judge Ona Wang ordered OpenAI to produce communications related to the dataset deletion, including those previously withheld under privilege.
  • The judge found OpenAI's assertions about privilege inconsistent, which may have waived its privilege claims.
  • The authors believe these communications could prove willful infringement, which would lead to higher damages.
  • OpenAI disputes the ruling and intends to appeal.
