
posted by chromas on Friday August 28 2020, @04:50PM
from the operation-google-2:-electric...google-fu dept.

One Database to Rule Them All: The Invisible Content Cartel that Undermines the Freedom of Expression Online:

Every year, millions of images, videos and posts that allegedly contain terrorist or violent extremist content are removed from social media platforms like YouTube, Facebook, or Twitter. A key force behind these takedowns is the Global Internet Forum to Counter Terrorism (GIFCT), an industry-led initiative that seeks to "prevent terrorists and violent extremists from exploiting digital platforms."

[...] Hashes are digital "fingerprints" of content that companies use to identify and remove content from their platforms. They are essentially unique, and allow for easy identification of specific content. When an image is identified as "terrorist content," it is tagged with a hash and entered into a database, allowing any future uploads of the same image to be easily identified.

This is exactly what the GIFCT initiative aims to do: Share a massive database of alleged 'terrorist' content, contributed voluntarily by companies, amongst members of its coalition. The database collects 'hashes', or unique fingerprints, of alleged 'terrorist', or extremist and violent content, rather than the content itself. GIFCT members can then use the database to check in real time whether content that users want to upload matches material in the database. While that sounds like an efficient approach to the challenging task of correctly identifying and taking down terrorist content, it also means that one single database might be used to determine what is permissible speech, and what is taken down—across the entire Internet.
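
To make the mechanism concrete, here is a minimal sketch in Python of the hash-and-lookup flow described above. It is not GIFCT's actual system: the real database reportedly relies on perceptual hashing rather than a plain cryptographic digest, and the data and function names below are purely illustrative.

import hashlib

def fingerprint(data: bytes) -> str:
    # A hex digest standing in for the content's "fingerprint".
    return hashlib.sha256(data).hexdigest()

# Illustrative shared database: fingerprints contributed by member platforms.
# Only the hashes are shared, never the underlying content itself.
flagged_items = [b"<bytes of an already-flagged image>"]
shared_hash_database = {fingerprint(item) for item in flagged_items}

def is_flagged(upload: bytes) -> bool:
    # Each platform checks new uploads against the same shared set.
    return fingerprint(upload) in shared_hash_database

print(is_flagged(b"<bytes of an already-flagged image>"))  # True: match, blocked
print(is_flagged(b"<some other upload>"))                  # False: no matching fingerprint

The point the article draws out is that every member platform consults the same shared set, so a single entry can decide the outcome everywhere at once.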

Countless examples have proven that it is very difficult for human reviewers—and impossible for algorithms—to consistently get the nuances of activism, counter-speech, and extremist content itself right. The result is that many instances of legitimate speech are falsely categorized as terrorist content and removed from social media platforms. Due to the proliferation of the GIFCT database, any mistaken classification of a video, picture or post as 'terrorist' content echoes across social media platforms, undermining users' right to free expression on several platforms at once. And that, in turn, can have catastrophic effects on the Internet as a space for memory and documentation.


Original Submission

 
  • (Score: 2) by JoeMerchant on Friday August 28 2020, @07:44PM (5 children)

    Most images contain 24 bits of color information per pixel, but only the most significant 18 bits are even visible on a lot of displays, and even on displays that can render variations in the least significant bits, almost nobody would ever notice, particularly if it's just a few pixels changed. That's the basis of a lot of steganography, and it will certainly defeat simple hashing algorithms (see the sketch below), until... the hashers get wind of these kinds of tweaks being made and make their hashes less sensitive to insignificant changes.

    Remember: for the last 10 years your cellphone has been able to listen to ambient sound through its microphone and correctly identify what song is playing in the room, even with heavy background noise, out of millions of songs in the ID database. If they can fingerprint a few seconds of music like that, I'm sure they can fingerprint 2D images even better - just like Google Image Search doesn't need exact matches to show you similar images.
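
    A rough sketch of the least-significant-bit tweak described above, and of why it defeats an exact (cryptographic) hash match; the pixel values are made up purely for illustration:

    import hashlib

    # Toy "image": raw 24-bit RGB pixel data (3 bytes per pixel), values made up.
    pixels = bytearray([120, 64, 200] * 4)   # four identical pixels

    before = hashlib.sha256(pixels).hexdigest()

    # Flip the least significant bit of one color channel of one pixel --
    # far below what most displays (or viewers) could ever distinguish.
    pixels[0] ^= 0b00000001

    after = hashlib.sha256(pixels).hexdigest()

    print(before == after)   # False: one invisible bit breaks an exact-hash match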

  • (Score: 1, Insightful) by Anonymous Coward on Saturday August 29 2020, @02:28AM (4 children)

    "until... the hashers get wind of these kinds of tweaks being made and make their hashes less sensitive to insignificant changes."

    That's not how hashes work. Change a single byte and the hash is completely different. What you describe would need image comparison, not hash comparison.

    • (Score: 0) by Anonymous Coward on Saturday August 29 2020, @06:50PM (3 children)

      That is how cryptographic hashes work. There are other kinds of hash families around, and fuzzy matching is a big one.
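
      For illustration, here is a toy sketch of that kind of fuzzy matching, loosely modeled on the "average hash" idea used by perceptual-hashing libraries such as ImageHash; the numbers are made up and this is not any platform's actual algorithm:

      # A toy "fuzzy" hash: threshold each value against the mean, so a tiny
      # tweak flips few or no bits, unlike a cryptographic digest.
      def average_hash(values):
          mean = sum(values) / len(values)
          return [1 if v >= mean else 0 for v in values]

      def hamming_distance(h1, h2):
          return sum(a != b for a, b in zip(h1, h2))

      original = [200, 198, 60, 61, 190, 55, 201, 58]   # made-up grayscale samples
      tweaked  = [201, 198, 60, 61, 190, 55, 201, 58]   # one value nudged by 1

      print(hamming_distance(average_hash(original), average_hash(tweaked)))  # 0: still a match

      A small Hamming distance between fingerprints counts as a match, which is why this family of hashes tolerates the kind of least-significant-bit tweaks discussed above.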