New research has shown just how bad AI is at dealing with online trolls.
Such systems struggle to automatically flag nudity and violence, don’t understand text well enough to shoot down fake news and aren’t effective at detecting abusive comments from trolls hiding behind their keyboards.
A group of researchers from Aalto University and the University of Padua found this out when they tested seven state-of-the-art models used to detect hate speech. All of them failed to recognize foul language when subtle changes were made, according to a paper [PDF] on arXiv.
Adversarial examples can be created automatically with algorithms that misspell certain words, swap characters for numbers, insert random spaces between words, or attach innocuous words such as ‘love’ to otherwise hateful sentences.
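A minimal sketch in Python of the kinds of perturbations described above (the function names and the leetspeak mapping are illustrative assumptions, not the researchers' actual code):

```python
import random

# Hypothetical character-to-digit substitutions (leetspeak-style);
# the paper's exact mapping may differ.
LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"}

def misspell(text: str) -> str:
    """Swap two adjacent characters to produce a simple misspelling."""
    if len(text) < 2:
        return text
    i = random.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

def leetify(text: str) -> str:
    """Replace letters with look-alike digits (lowercases the input)."""
    return "".join(LEET.get(c, c) for c in text.lower())

def insert_spaces(text: str) -> str:
    """Break some words apart with spaces so word-level models miss them."""
    return " ".join(" ".join(w) if random.random() < 0.5 else w
                    for w in text.split())

def append_love(text: str) -> str:
    """Attach an innocuous word to nudge the classifier toward 'benign'."""
    return text + " love"

if __name__ == "__main__":
    random.seed(0)
    sample = "you are an idiot"
    for attack in (misspell, leetify, insert_spaces, append_love):
        print(attack.__name__, "->", attack(sample))
```

A human still reads every perturbed variant as the same insult, which is exactly why these cheap transformations make effective attacks on token-based classifiers.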
The adversarial examples evaded detection by all of the models. These tricks wouldn’t fool humans, but machine learning models are easily blindsided. They can’t readily adapt to new input beyond what’s been spoon-fed to them during the training process.
(Score: 0) by Anonymous Coward on Saturday September 01 2018, @02:35PM (1 child)
Perhaps we can find a lesson in the story from the early days of ITS at the MIT AI Lab? Not sure where I read the story first, but this page https://en.wikipedia.org/wiki/Incompatible_Timesharing_System [wikipedia.org] has a short summary: anyone could crash the system with a single documented command, so doing it deliberately carried no challenge or bragging rights, and such crashes were rare.
Not suggesting that this is a reasonable solution for trolling. What I am suggesting is that maybe there is an analogous approach that removes the troll's motivation or challenge.
(Score: 1, Insightful) by Anonymous Coward on Saturday September 01 2018, @07:10PM
What I am suggesting is that maybe there is an analogous approach that removes the troll's motivation or challenge.
By far the best approach is simply not to respond, in any way. Trolls are attention seekers; attention is the prize.