from the deep-fakes-coming-right-up dept.
Microsoft has pulled a facial recognition database offline in response to criticism, but the contents are still available from alternate sources:
If you've ever uploaded photos of yourself to the internet under a Creative Commons license, which allows for re-use under certain conditions, they may already have been used to train AI programs to recognize human faces.
Microsoft released MS-Celeb-1M, a dataset of roughly 10 million photos from 100,000 individuals collected from the internet in 2016. The database was designed to contain photos of celebrities, but as Berlin-based researcher Adam Harvey pointed out with his project Megapixels, the definition of "celebrity" was quite broad. The database also contained photos of "journalists, artists, musicians, activists, policy makers, writers, and academics," Harvey wrote.
MS-Celeb-1M's webpage is currently offline, but before the database was quietly pulled, it was used far and wide to train facial recognition programs. Entities that made use of images in the database, according to Harvey, include Chinese tech firms such as SenseTime and Megvii, which have been linked to the Chinese state's use of facial recognition to track and oppress ethnic minorities.
[...] Even though Microsoft took it down, cleaned-up versions of the database are still available to download from GitHub, for example. Tools for working with the database, such as labelling lists that can reveal the names of photo subjects, also remain easily accessible.
"Despite the recent termination of the msceleb.org website, the dataset still exists in several repositories on GitHub, the hard drives of countless researchers, and will likely continue to be used in research projects around the world," Harvey wrote on Megapixels. A facial recognition challenge this year at Imperial College London plans to use a variant of the MS-Celeb-1M database, and offers download links.
(Score: 2) by DannyB on Friday June 07, @03:10PM
If I uploaded my photo under a Creative Commons license, I would assume that it could be used for just about anything, as long as the use complies with the license.
The uses the photo is put to might include things I would not think of.
If someone is worried about that, then they probably should not put their photo online, or they should pick a license that expressly restricts it to certain types of uses.
The best way to avoid conflict and encourage diversity is to force everyone to voluntarily think alike.
(Score: 2) by DannyB on Friday June 07, @03:14PM
I'm struggling to see the harm if I uploaded a photo and someone used it, along with many others, to train a neural network to recognize what makes up a face.
All my photo would do is tweak the weights of the connections between layers of neurons. Those weights are usually expressed as a matrix, with one axis indexing the neurons in one layer and the other axis the neurons in the next layer.
All of the photos tweak these weights. After training, is it even possible to extract an individual face from the goop? (Can you un-make a sausage?)
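For anyone who wants to see the "tweak the weights" idea concretely, here is a rough sketch in plain Python. It is a toy, not how any real face recognizer is built: a single linear layer, a made-up four-number "photo", a made-up label, and one gradient-descent step on squared error. The names `forward`, `sgd_step`, `photo`, and `label` are all invented for illustration.

```python
import random


def forward(weights, inputs):
    """Apply the weight matrix: each row holds one output neuron's
    connection weights to every input neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def sgd_step(weights, inputs, target, lr=0.1):
    """One gradient-descent step on 0.5 * squared error for a linear layer.

    The gradient for weight (i, j) is error_i * input_j, so each training
    example nudges every weight a little.
    """
    outputs = forward(weights, inputs)
    errors = [o - t for o, t in zip(outputs, target)]
    return [
        [w - lr * err * x for w, x in zip(row, inputs)]
        for row, err in zip(weights, errors)
    ]


random.seed(0)
n_in, n_out = 4, 2  # toy layer sizes (assumed, far smaller than a real network)
weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

photo = [0.2, 0.9, 0.1, 0.5]  # hypothetical stand-in for one photo's features
label = [1.0, 0.0]            # hypothetical training target

updated = sgd_step(weights, photo, label)

# The matrix keeps its shape; only the numbers inside it shift slightly.
largest_change = max(
    abs(u - w)
    for new_row, old_row in zip(updated, weights)
    for u, w in zip(new_row, old_row)
)
print(f"{len(updated)}x{len(updated[0])} matrix, largest weight change: {largest_change}")
```

After the step, what remains of the photo is a small shift in each entry of the matrix, and every other training example shifts the same entries in turn. The sketch only shows that mechanism; it says nothing either way about whether a face could be recovered from a trained model.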