Apple’s announcement earlier this month of a new child sexual abuse material (CSAM) detection tool drew widespread criticism. The technique, which Apple calls NeuralHash, has yet to be activated for Apple’s billion-plus users, but security researchers have already criticized it, claiming that the algorithm produces faulty results.
NeuralHash is a program that can recognize known CSAM on a user’s device without requiring Apple to possess the image or know what it contains. Because a user’s images in iCloud are end-to-end encrypted so that even Apple can’t access them, NeuralHash instead checks for known CSAM on the user’s device, which Apple claims is more privacy friendly because it confines the scanning to photos rather than the user’s entire file system.
Apple accomplishes this by scanning a user’s device for photographs with the same hash — a series of letters and numbers that can uniquely identify an image — as those provided by child protection organizations such as NCMEC. If NeuralHash discovers 30 or more matching hashes, the photos are flagged for manual review by Apple before the account owner is reported to law enforcement. According to Apple, the chance of an account being falsely flagged is around one in a trillion.
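The threshold step described above can be sketched in a few lines. This is an illustrative simplification only: the hash values, function names, and plain set lookup here are hypothetical, and Apple’s actual system uses a neural perceptual hash combined with a cryptographic private set intersection protocol rather than a direct comparison.

```python
# Hypothetical database of known-image hashes (placeholder values).
KNOWN_HASHES = {"a1b2c3", "d4e5f6"}

# Number of matches required before photos are flagged for manual review,
# per Apple's stated threshold.
MATCH_THRESHOLD = 30

def count_matches(photo_hashes):
    """Count how many of a user's photo hashes appear in the known set."""
    return sum(1 for h in photo_hashes if h in KNOWN_HASHES)

def should_flag_for_review(photo_hashes):
    """True only once the match count reaches the review threshold."""
    return count_matches(photo_hashes) >= MATCH_THRESHOLD
```

The point of the threshold is that a single stray match never triggers review; an account is surfaced only after dozens of independent matches accumulate.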
However, security experts and privacy advocates are concerned that the system might be manipulated by well-resourced parties, such as governments, to falsely implicate innocent people, or repurposed to discover other material that authoritarian nation states find objectionable. According to a leaked document circulated internally to Apple employees, NCMEC referred to opponents as “screeching voices of the minority.”
Asuhariet Ygvar reverse-engineered Apple’s NeuralHash into a Python script and published the source on GitHub last night, letting anyone test the technology without needing an Apple device. Ygvar claimed in a Reddit post that NeuralHash “already exists” as obfuscated code in iOS 14.3, but that he was able to rebuild the technology to help other security researchers better understand the algorithm before it’s rolled out to iOS and macOS devices later this year.
It wasn’t long before others began tinkering with the public code, and the first known instance of a “hash collision” — two completely different photos yielding the same hash in NeuralHash — soon followed. The collision was found by Cory Cornelius, a well-known Intel Labs research scientist, and Ygvar confirmed it a few minutes later. Hash collisions can spell disaster for systems that rely on cryptography for security, such as encryption. Several well-known cryptographic hash algorithms, such as MD5 and SHA-1, have been retired over the years because collision attacks rendered them ineffective.
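Why collisions are so damaging can be shown with a toy experiment. The sketch below, an assumption of mine rather than anything from the NeuralHash code, truncates SHA-256 to 8 bits so that the pigeonhole principle guarantees two different inputs sharing a hash almost immediately. Real attacks on MD5 and SHA-1 exploit structural weaknesses rather than brute force, but the consequence is the same: two distinct inputs, one identifier.

```python
import hashlib

def tiny_hash(data: bytes) -> str:
    """Deliberately weak hash: keep only the first 8 bits of SHA-256."""
    return hashlib.sha256(data).hexdigest()[:2]

# Brute-force search: with only 256 possible hash values, a collision
# between two different inputs is guaranteed within 257 attempts.
seen = {}
collision = None
for i in range(10_000):
    data = str(i).encode()
    h = tiny_hash(data)
    if h in seen:
        collision = (seen[h], data)  # two different inputs, same hash
        break
    seen[h] = data
```

Any system that treats a hash as a unique fingerprint for an image inherits this risk: once two unrelated inputs share a hash, the system can no longer tell them apart.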
“I think some people aren’t getting that the period between the iOS NeuralHash code being uncovered and [the] first collision was not months or days, but a handful of hours,” said Kenneth White, a cryptography specialist and founder of the Open Crypto Audit Project, in a tweet.
An Apple official declined to comment on the record when contacted. However, on a background call where reporters were not allowed to quote executives directly or by name, Apple downplayed the hash collision, arguing that the safeguards it has put in place — such as a manual review of images before they are referred to law enforcement — are designed to prevent abuse.
Civil liberties organizations and security professionals aren’t the only ones who are concerned about the technology. This week, a senior member of the German parliament wrote to Apple CEO Tim Cook, warning that the firm is on a “hazardous path” and urging the company not to deploy the system.