
SoylentNews is people

posted by on Monday February 13 2017, @06:11AM   Printer-friendly
from the sounds-familiar dept.

A joint study carried out by researchers from Alphabet's Jigsaw and the Wikimedia Foundation has analyzed all user comments left on Wikipedia in 2015 in order to identify how and why users launch into personal attacks, one of the many faces of online abuse. To analyze the gigantic trove of sample comments, researchers developed a machine learning algorithm that was able to identify and distinguish different forms of online abuse and personal attacks. In order for the algorithm to work, it had to be trained beforehand. For this, researchers used human users to classify a small batch of 100,000 comments, with each of the test comments passing through the hands of ten different humans. The resulting data classification allowed the algorithm to accurately distinguish between direct personal attacks (statements like "You suck!"), third-party personal attacks (statements like "Bob sucks!"), and indirect personal attacks (statements like "Henry said Bob sucks").
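As a rough illustration of the labeling step described above, here is a minimal sketch of how ten human judgments per comment might be collapsed into one training label. The summary does not say how the researchers combined the ten ratings, so the majority-vote rule, the `aggregate_labels` name, the `min_agreement` threshold, and the label strings are all assumptions made for this example.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.5):
    """Collapse several human labels for one comment into a single label.

    Hypothetical helper: returns the most common label only if its share
    of the votes is strictly greater than min_agreement, otherwise None
    (meaning the annotators did not agree clearly enough).
    """
    if not annotations:
        return None
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes / len(annotations) > min_agreement else None


# Ten annotators, seven of whom flag the comment as an attack.
votes = ["attack"] * 7 + ["not_attack"] * 3
print(aggregate_labels(votes))
```

Comments on which the annotators split evenly come back as `None` and could be dropped or sent for review before training.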

After training the algorithm and unleashing it on all Wikipedia 2015 user comments, researchers were able to identify personal attacks, and then collect data on the users that launched them. Their findings reveal that around 43% of all comments left on Wikipedia came from anonymous users, but most of these were one-time commenters, and the number of comments they left was 20 times smaller than comments left by registered users. Despite this, researchers discovered that anonymous users were six times more active in posting personal attacks, but in the end, they contributed to less than half of personal attacks, meaning a large number of personal attacks came from users with a registered identity on the site.
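The two ratios in the paragraph above can be combined into a back-of-the-envelope estimate of how much of the attack volume is anonymous. The sketch below takes the summary's figures at face value and assumes that "six times more active" means an anonymous comment is six times as likely to be an attack as a registered one; that reading, and the function name, are assumptions and do not come from the study.

```python
def anonymous_attack_share(registered_to_anon_comments, anon_relative_attack_rate):
    """Estimate the fraction of all attacks that come from anonymous comments.

    registered_to_anon_comments: how many registered comments exist per
        anonymous comment (20 in the summary).
    anon_relative_attack_rate: how much more likely an anonymous comment is
        to be an attack than a registered one (6 in the summary).
    The registered attack rate cancels out, so it is normalized to 1.
    """
    anon_attacks = anon_relative_attack_rate * 1.0
    registered_attacks = registered_to_anon_comments * 1.0
    return anon_attacks / (anon_attacks + registered_attacks)


share = anonymous_attack_share(20, 6)
print(f"{share:.1%}")  # prints 23.1%
```

Under these assumptions anonymous comments would be about 1 in 21 of all comments (roughly 5%) yet about 23% of attacks, which is consistent with the "less than half" figure.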

Of all personal attacks, researchers noted that about a tenth came from extremely active users, who had an activity level of 20+, the highest on the site. A closer look at the data revealed that 34 "highly toxic users" from this 20+ category were responsible for almost 9% of all personal attacks on the site. "By comparing these figures, we see that almost 80% of attacks come from the over 9000 users who have made fewer than 5 attacking comments," the research team noted, something that's somewhat normal, as everybody tends to get mad at one point or another. "However, the 34 users with a toxicity level of more than 20 are responsible for almost 9% of attacks," researchers noted.

Source:

https://www.bleepingcomputer.com/news/security/wikipedia-comments-destroyed-by-a-few-highly-toxic-users/



 
This discussion has been archived. No new comments can be posted.
  • (Score: 4, Insightful) by AthanasiusKircher on Monday February 13 2017, @06:52AM

    by AthanasiusKircher (5291) on Monday February 13 2017, @06:52AM (#466483) Journal

    First off, here's a link to the original study [arxiv.org], which is quite a bit easier to read than the circuitous writing of the linked article.

    Secondly, isn't the headline a bit exaggerated? Yes, apparently 34 users are responsible for almost 9% of attacks (apparently out of ~227,000 users). So yeah, these 34 users are clearly "highly toxic" and deserve closer moderation. On the other hand, they are only contributing less than 10% of the personal attacks, so I'm not sure it's accurate to accuse them of "destroying" Wikipedia comments alone. In fact, the headline directly contradicts the sentiment of the abstract of the original study, whose final sentence reads: "This reveals that the majority of personal attacks on Wikipedia are not the result of a few malicious users, nor primarily the consequence of allowing anonymous contributions."

    Lastly, for some reason the actual study seems biased toward retaining anonymous comments, despite their much higher level of attack. (See the sentence just quoted.) At no point in the study do they actually provide data on what percentage of personal attacks come from anonymous comments, other than the stats quoted in the summary, i.e., "six times more active in posting personal attacks" BUT "less than half of personal attacks." What exactly is "less than half"? And given the smaller number of anonymous comments ("20 times smaller than comments left by registered users"), doesn't that mean they were actually a MAJOR contributor to personal attacks? Even if anonymous contributors were only 1/3 of personal attacks, one could remove 33% of the "toxic" environment simply by banning 5% of posts (i.e., the anonymous ones).

    I'm NOT arguing for that necessarily. I'm pointing out that the headline focuses on 9% of "toxic comments" created by a small subset of users, and the study claims "significant progress could be made by moderating" them. Yet the study also wants to claim that a much larger number of personal attacks is created by another rather small subset, i.e., anonymous commenters, yet the study seems to repeatedly downplay the significance of this finding by not citing detailed stats about them and just saying vague stuff like "Thus, while anonymous contributions are much more likely to be an attack, overall they contribute less than half of attacks." Oh... "less than half"... not really that big of a deal?

    An alternative summary for this study could read: "Despite comprising a much smaller percentage of comments, anonymous users are six times more likely to engage in personal attacks than registered users. When combined together with 34 'highly toxic' registered users, they represent about half of Wikipedia's personal attacks in comments." Quite a different spin, no?

    Again, I'm not arguing for eliminating anonymous contributions or comments, but the results here are unsurprising for anyone who knows anything about internet forums: a large percentage of the crap comes from anonymous users (who feel emboldened by their anonymity to break conventions of polite conduct), along with a small number of complete jerks who don't give a crap what people think of them.

  • (Score: 2, Informative) by pTamok on Monday February 13 2017, @08:41AM

    by pTamok (3042) on Monday February 13 2017, @08:41AM (#466514)

    Doesn't the study look at pseudonyms attacking pseudonyms? Am I wrong in saying that Wikipedia regards comments made by non-logged-in users (shown as coming from an IP address) as anonymous, and comments made by logged-in users as non-anonymous, but that there is no requirement for usernames to be linked to a real-world person?

    Obviously, the owner of a pseudonym can provide enough information voluntarily to link the pseudonym to a real-world identity, but I don't think the study concentrated only on usernames linked to real-world identities as being non-anonymous.

    • (Score: 2) by AthanasiusKircher on Monday February 13 2017, @09:45AM

      by AthanasiusKircher (5291) on Monday February 13 2017, @09:45AM (#466522) Journal

      Yes, you are correct. But pseudonyms, like real names, have a "reputation" that follows you. Pseudonymous users nevertheless tend to behave worse online than real-name accounts, and anonymous users behave worse than pseudonyms. All of this is pretty well known, I thought.

      • (Score: 1) by pTamok on Monday February 13 2017, @10:49AM

        by pTamok (3042) on Monday February 13 2017, @10:49AM (#466531)

        I'm not sure that 'real name' accounts behave better. I'm providing the first link I could find, as I don't have the time to dive deep into Google Scholar:

        https://techcrunch.com/2012/07/29/surprisingly-good-evidence-that-real-name-policies-fail-to-improve-comments/ [techcrunch.com]

        Some people are proud of being contentious, or espousing controversial viewpoints - so being regarded as a 'white supremacist' or 'unreconstructed misogynist' can be regarded by a pseudonym owner as praise; and some people regard trolling as an intellectual game. As a result, you do not necessarily 'clean up' comments by banning anonymity.

        A rational, liberal viewpoint is a minority viewpoint: trying to redefine reality to ignore the irrational, illiberal majority doesn't work. Remember "If there is hope, it must lie in the proles."

        • (Score: 2) by AthanasiusKircher on Monday February 13 2017, @07:06PM

          by AthanasiusKircher (5291) on Monday February 13 2017, @07:06PM (#466707) Journal

          I'm not sure that 'real name' accounts behave better.

          They do overall. Even your link supports this, noting a 30% aggregate reduction in "swearing and 'anti-normative' behavior" after real-name policies were introduced. To be clear, I'm NOT arguing in favor of real-name policies (which I think also have many problems) or necessarily arguing against anonymous commenting.

          The general pattern (at least from the articles I've seen) is that real names or pseudonyms tend to embolden a small percentage of users who like to actually be associated with their troll-like behavior. On average, though, real-name or pseudonymous users as a group tend to be LESS likely to "misbehave" than anonymous users. That's exactly what we see again in this Wikipedia study: overall, anonymous users are significantly more likely to make personal attacks, but there are also a small number of "heavy" users who ARE logged in and nevertheless have a disproportionate share of the bad comments.

          This is truly basic psychology. Most people tend to conform to social norms and they have a stronger incentive to do so when their reputation follows them. But for a small fraction (i.e., sociopathic trolls), they actually love the negative attention.

      • (Score: 0) by Anonymous Coward on Monday February 13 2017, @03:25PM

        by Anonymous Coward on Monday February 13 2017, @03:25PM (#466617)

        Indeed, that's why nobody takes Aristarchus seriously. He's a huge fool that contributes nothing of value while making vague remarks and not even bothering to state what his issue is. Dude's probably one of those Aspie folks that just needs to get laid.

        I don't really care who the fuck he is in real life as I'm not going to go out of my way to go wherever he is and give him the ass kicking he deserves.

  • (Score: 2) by zocalo on Monday February 13 2017, @10:13AM

    by zocalo (302) on Monday February 13 2017, @10:13AM (#466525)
    Definitely some hyperbole in the 34 users and their 9% of attacks being "responsible", but it does sound like they are a good candidate for a small pool of users that could be made an example of - say by revocation of their access rights (temporary or permanent, Wikipedia's choice) - to "encourage" the others to toe the line. Especially if Wikipedia were prepared to follow through with a second round of bans. As TFA notes though, abuse by these individuals often leads to abuse by many others in what they have termed "pile-ons", so by removing the temptation the final impact could be much greater than a 9% reduction.

    There isn't really anything new in this other than identifying names and numbers specific to Wikipedia though; many forums have a handful of toxic users that prompt this kind of response, including other "supporting" comments that spiral out of control. It is worth bearing in mind though that sometimes the cause is not someone out to deliberately troll/inflame, but actually slightly more benign; the language barrier. Not everyone is a native speaker of English and without a good grasp of both spoken and written English, including the use of idioms, it's not at all uncommon to come across someone who unintentionally comes off as slightly - or exceedingly - abrasive when using a secondary language.
    --
    UNIX? They're not even circumcised! Savages!
    • (Score: 2) by VLM on Monday February 13 2017, @02:13PM

      by VLM (445) on Monday February 13 2017, @02:13PM (#466584)

      The result would be a McDonald's hamburger of merely 91% protein of unknown animal source rather than 100%; it would feel good yet accomplish very little. Even if intimidation took hold, the effect would be twice about nothing, which is still nothing.

      There are other side issues, of course. I suspect I know the politics of those involved, and the last thing we need is a Hollywood actor's crusade and a fake news barrage of clickbait in response to action. It's entirely possible that, due to secondary effects, the situation could deteriorate rather than improve.

  • (Score: 0) by Anonymous Coward on Tuesday February 14 2017, @04:45AM

    by Anonymous Coward on Tuesday February 14 2017, @04:45AM (#466853)

    A toxic user can make a huge dent in driving away new blood when you combine it with Wikipedia's policy of "now you are an admin you can ban people complaining about you in arbitration".