De-anonymization is an important topic for anyone working with sensitive data, whether in the context of academic research, IT system design, or otherwise.
I remember a talk during a Massey Grand Rounds panel where a medical researcher explained how she could pick herself out from an ‘anonymous’ database of Ontarians, on the basis that her salary was public as an exact dollar figure, only people with her specific job had it, and she was the only woman in that position.
The more general idea is that by putting pieces together you may be able to identify somebody who someone else has made some effort to keep anonymous.
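A minimal sketch of that idea in Python, assuming a tiny invented table rather than any real database: the records, the field names, and the `unique_records` helper are all hypothetical and only there for illustration. It counts how many rows share each combination of attributes and flags the rows that are alone in their group.

```python
from collections import Counter

# Entirely made-up records; no real people, jobs, or salaries.
records = [
    {"job": "Professor", "gender": "M", "salary": 151_200},
    {"job": "Professor", "gender": "M", "salary": 151_200},
    {"job": "Research Chair", "gender": "M", "salary": 187_450},
    {"job": "Research Chair", "gender": "M", "salary": 187_450},
    {"job": "Research Chair", "gender": "F", "salary": 187_450},
]

# Attributes that are not names but can still narrow down who someone is.
QUASI_IDENTIFIERS = ("job", "gender", "salary")


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose combination of `keys` appears exactly once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


# Rows that anyone who knows these three facts could single out.
exposed = unique_records(records)
```

With this invented data, job and salary alone leave every row hidden in a group of at least two, but adding gender singles out the last row. No name was ever in the table.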
It’s a challenge when doing academic research and writing on social movements, when some subjects choose to be anonymous in publications. That means not just withholding their name, but withholding any information that could be used to identify them. That gets hard when you think about adversaries who might have access to other information (in an extreme case, governments with access to masses of information) or even just ordinary people who can logically combine information from multiple sources. The date of an event described in an anonymous quote might allow someone to look up online where it happened. Another quote in which a third party’s actions are described could be used to determine that the de-anonymization target wasn’t that person. And so on and on, like the logic games on the LSAT or the intricacies of mole hunting.
Lee Ann Fujii wrote smart stuff about this, and about subject protection in research generally.
Defcon 21 – De-Anonymizing Alt.Anonymous.Messages
—
How Tor Users Got Caught – Defcon 22
Sorry, your data can still be identified even if it’s anonymized
Urban planners and researchers at MIT found that it’s shockingly easy to “reidentify” the anonymous data that people generate all day, every day in cities.
Inside the Industry That Unmasks People at Scale
Unique IDs linked to phones are supposed to be anonymous. But there’s an entire industry that links them to real people and their address.
https://www.vice.com/en/article/epnmvz/industry-unmasks-at-scale-maid-to-pii
AI Fake-Face Generators Can Be Rewound To Reveal the Real Faces They Trained On
https://yro.slashdot.org/story/21/10/13/2116205/ai-fake-face-generators-can-be-rewound-to-reveal-the-real-faces-they-trained-on
—
This Person (Probably) Exists. Identity Membership Attacks Against GAN Generated Faces
https://arxiv.org/pdf/2107.06018.pdf
This appears to be Eric Trump’s incredibly depressing YouTube playlist.
https://slate.com/news-and-politics/2019/07/this-appears-to-be-eric-trumps-incredibly-depressing-youtube-playlist.html
They Stormed the Capitol. Their Apps Tracked Them
Times Opinion was able to identify individuals from a trove of leaked smartphone location data.
https://www.nytimes.com/2021/02/05/opinion/capitol-attack-cellphone-data.html
Never use pixelation to redact text
—
Never, Ever, Ever Use Pixelation for Redacting Text