SoylentNews is people

posted by janrinok on Wednesday July 01 2015, @05:29AM
from the did-you-really-just-call-her-that? dept.

Google Photos tries to categorize your pictures automatically. Until very recently, it had a failure mode in which it classified some pictures of humans as "Gorillas".

Google reacted [and apologised] very quickly after a black user complained that photos of him and a friend had been misclassified.

When Brooklyn-based computer programmer Jacky Alcine looked over a set of images that he had uploaded to Google Photos on Sunday, he found that the service had attempted to classify them according to their contents. Google offers this capability as a selling point of its service, boasting that it lets you, “Search by what you remember about a photo, no description needed.” In Alcine’s case, many of those labels were basically accurate: A photograph of an airplane wing had been filed under “Airplanes,” one of two tall buildings under “Skyscrapers,” and so on.

Then there was a picture of Alcine and a friend. They’re both black. And Google had labeled the photo “Gorillas.” On investigation, Alcine found that many more photographs of the pair—and nothing else—had been placed under this literally dehumanizing rubric.

Speculating: their software may lean heavily on statistical matching, which is really hard to debug, and that may be why they wound up simply deleting "Gorilla" from the list of possible categories.

http://www.slate.com/blogs/future_tense/2015/06/30/google_s_image_recognition_software_returns_some_surprisingly_racist_results.html



 
  • (Score: 2) by bziman (3577) on Wednesday July 01 2015, @01:46PM (#203753)

    The error is probably in the training data. In photos, 90% of gorillas are very dark in coloring. In photos, only about 10% of humans are very dark in coloring. The algorithm probably thinks that coloring is the primary differentiator between humans and gorillas.
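To make the arithmetic in that guess concrete, here is a minimal sketch, not a description of Google's actual system. It takes the commenter's illustrative figures (90% of gorilla photos dark, 10% of human photos dark) and adds one assumption that is not in the comment: that the training set holds equal numbers of gorilla and human photos. The function name and parameters are hypothetical.

```python
def posterior_gorilla_given_dark(p_dark_given_gorilla=0.9,
                                 p_dark_given_human=0.1,
                                 prior_gorilla=0.5):
    """Bayes' rule for a toy classifier whose only feature is 'dark' vs 'not dark'.

    The 0.9 and 0.1 defaults are the commenter's illustrative numbers.
    The 0.5 prior (equal class sizes) is an assumption made for this sketch.
    """
    prior_human = 1.0 - prior_gorilla
    # Total probability that a training photo is labeled 'dark'.
    evidence = (p_dark_given_gorilla * prior_gorilla
                + p_dark_given_human * prior_human)
    # Probability that the label is 'gorilla' given the photo is 'dark'.
    return p_dark_given_gorilla * prior_gorilla / evidence


if __name__ == "__main__":
    print(round(posterior_gorilla_given_dark(), 3))
```

Under these assumptions, a model that looks only at coloring would assign roughly a 0.9 probability of "gorilla" to any dark-toned subject, which is the failure the comment describes. A real image classifier uses far more than one feature, so this only illustrates how a skewed feature distribution can dominate.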

    The problem is that the training data is pulled off the web, rather than manually curated. In order to get enough data for this kind of AI, you basically have to crowdsource the training set. But then you bake in these sorts of biases.

    Perhaps for the next iteration they should include equal numbers of each race of human and each subspecies of primate? It would take much longer and produce generally poorer results for the wider audience, but it might reduce outliers.
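One simple way to get "equal numbers" of each group is to downsample every group to the size of the smallest one. The sketch below assumes each training sample already carries a group label, which is itself a big assumption for web-scraped data. The helper `balance_by_group` and its parameters are hypothetical names for illustration.

```python
import random
from collections import defaultdict


def balance_by_group(samples, group_of, seed=0):
    """Return a subset of samples with the same number of items per group.

    samples:  any iterable of training items (hypothetical format)
    group_of: function mapping a sample to its group label
    seed:     fixed seed so the subset is reproducible
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[group_of(sample)].append(sample)
    if not groups:
        return []
    # Every group is cut down to the size of the smallest group.
    target = min(len(members) for members in groups.values())
    rng = random.Random(seed)
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced


if __name__ == "__main__":
    data = [("a", i) for i in range(10)] + [("b", i) for i in range(3)]
    subset = balance_by_group(data, lambda s: s[0])
    print(len(subset))
```

Downsampling throws away data from the larger groups, which is one reason a balanced set could give poorer results for the majority of users, as the comment suggests.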

  • (Score: 2) by Beryllium Sphere (r) (5062) on Thursday July 02 2015, @04:40AM (#204092)

    A CMU AI graduate I know pointed out that if, hypothetically, Google had used the web at large for training data, then it would have been poisoned by all the racist web sites out there.