Over the past ten years, Google has regularly reported on its efforts to remove links to content it deems pirated from its search results; most recently, the total number of takedowns Google has reported surpassed 6 billion.
It’s a significant milestone that, according to TorrentFreak, demonstrates how Google “is slowly but surely portraying itself as a willing participant in the anti-piracy fight,” even though copyright infringement cannot be completely eliminated.
Google’s gradual transformation into a leader in the fight against piracy began in 1998. That year, the Digital Millennium Copyright Act (DMCA) granted online service providers like Google safe harbour, shielding them from claims of copyright infringement involving content from third parties, on the condition that providers disclose information on any users who may have infringed.
A decade later, in 2009, however, it appeared that Google wasn’t going far enough, and the FCC stepped in after news publishers criticised Google and others.
Publishers at the time claimed that service providers were making money from ads placed next to links from aggregators and scrapers, which they accused of stealing and republishing news content without authorisation.
When the problem first emerged, Google vowed to fix it by making it simpler for copyright holders to flag unlawful content in search results.
In 2010, Google released its first transparency report, but that document only contained data on government takedown requests. Two years later, Google updated its report, “offering information on who sends us copyright removal letters, how often, on behalf of which copyright owners and for which websites,” and “counting every takedown notice that we received.”
In 2018, Google decided to go a step further by developing a preemptive blocklist, which prevented copyright-infringing URLs from ever being indexed in search results. That action contributed to the total of 6 billion delisted URLs.
According to TorrentFreak, the 6 billion takedowns since 2012 came from 326,575 copyright holders, who flagged 4,041,845 distinct domain names. But not every report was reliable. A list of “false positive” reported domains included TorrentFreak as well as “websites of the White House, the FBI, Disney, Netflix, and the New York Times.”
The goal of Google’s efforts to be more open about takedowns over the past ten years was to help guide policy decisions as the Internet developed, senior copyright counsel Fred von Lohmann said in a 2012 blog post.
Von Lohmann wrote that Google hoped the data would add to the conversation as politicians and Internet users throughout the world weigh the benefits and drawbacks of various solutions to the issue of online copyright infringement.
Google did not immediately respond when Ars asked it to comment on the political implications of its transparency reports.
To increase transparency, Google works with Lumen to track all of its takedown notices. According to Adam Holland, project manager at Lumen, Google submits more data than any of Lumen’s other partners, including Twitter, Wikipedia, and Reddit.
According to Holland, most requests for Google’s data come from academics looking to analyse long-term trends, as well as from media outlets and non-governmental organisations, but rarely from policymakers.
“We don’t actually get a lot of attention directly from lawmakers,” Holland told Ars. “That disappoints me personally, but that’s the truth.” Holland said, however, that Lumen has recently started assisting European Union regulators as they work to put into effect the new transparency rules for online service providers included in the recently passed Digital Services Act.
Lumen’s main objective, as an impartial data source, is not to affect policy, though. According to Holland, Lumen only has one position on the subject: opposing web service providers’ invisible takedowns.
This is because “[o]ur unofficial motto is good policy demands good data,” Holland told Ars. And due to its extensive global presence, Google continues to be the main source of Lumen data.
The main advantage of Google’s reporting, according to Harvard Law School copyright expert Rebecca Tushnet, is that it demonstrates to lawmakers how challenging it is for service providers to categorise content.
Tushnet warned lawmakers when advising the US Senate Committee on the Judiciary Subcommittee on Intellectual Property in 2020, and she recently told Ars, that reports like Google’s demonstrate how scammers will find inventive ways to take advantage of any avenues created for reporting infringing content by “doing their very best to mimic people who have valid claims.”
Unfortunately, Tushnet told Ars, “I’m not sure [Google’s transparency reporting] has changed policy. The best use of transparency reports, in my opinion, is to demonstrate how challenging this is and how, even if you remove infringing content 99.9% of the time, you’re still going to be wrong a lot.”