Meta faces new challenges in its ongoing copyright battle with book authors as newly unsealed emails reveal evidence of the company’s extensive torrenting activities. Last month, Meta acknowledged downloading LibGen, a controversial dataset containing millions of pirated books.
However, the full scope of these activities only became clear yesterday when unredacted emails showed Meta had torrented “at least 81.7 terabytes of data across multiple shadow libraries through the site Anna’s Archive, including at least 35.7 terabytes of data from Z-Library and LibGen.” Additionally, the company had previously torrented “80.6 terabytes of data from LibGen.”
Allegations of Copyright Infringement Against Meta
The authors’ legal team emphasized the scale of this operation, stating that “the magnitude of Meta’s unlawful torrenting scheme is astonishing.” They pointed out that “vastly smaller acts of data piracy—just .008 percent of the number of copyrighted works Meta pirated—have resulted in Judges referring the conduct to the US Attorneys’ office for criminal investigation.”
Internal communications reveal growing concerns within Meta about these practices. In April 2023, Meta research engineer Nikolay Bashlykov expressed unease, writing “Torrenting from a corporate laptop doesn’t feel right” and worrying about using Meta IP addresses “to load through torrents pirate content.”

By September 2023, Bashlykov’s concerns had escalated, leading him to consult Meta’s legal team about how torrenting would involve seeding—sharing content with others—which he warned “could be legally not OK.”
The authors allege that Meta attempted to conceal its torrenting activities. According to internal messages from Meta researcher Frank Zhang, the company avoided using Facebook servers for downloads to prevent anyone from “tracing back the seeder/downloader.”
Zhang described this work as being in “stealth mode.” During a deposition, Michael Clark, a Meta executive overseeing project management, admitted the company modified settings “so that the smallest amount of seeding possible could occur.”
These revelations have led authors to demand new depositions from Meta staff involved in the LibGen decisions, claiming the new evidence “contradicts prior deposition testimony.” This includes questioning Mark Zuckerberg’s previous statement of non-involvement, as unredacted messages indicate the “decision to use LibGen occurred” after “a prior escalation to MZ.”
Meta’s Torrenting Activity Complicates AI Copyright Case
Meta has maintained throughout that its AI training on LibGen constitutes “fair use.” In a recent motion to dismiss, the company argued that “plaintiffs do not plead a single instance in which any part of any book was, in fact, downloaded by a third party from Meta via torrent, much less that Plaintiffs’ books were somehow distributed by Meta.”
However, the new revelations about Meta’s torrenting have strengthened the authors’ distribution theory, which is crucial to proving direct copyright infringement. The claim no longer rests only on the allegation that Meta’s AI outputs unlawfully distributed the authors’ works; it now also covers the company’s seeding activities.
While Meta isn’t currently contesting the seeding aspect of the direct copyright infringement claim, it plans to address these allegations at summary judgment, where it intends to “set… the record straight and debunk… this meritless allegation.”
The case continues to evolve as limited discovery regarding Meta’s seeding activities proceeds, potentially setting important precedents for how tech companies can legally acquire and use copyrighted materials for AI training.