SoylentNews is people

posted by Fnord666 on Thursday January 03 2019, @12:39PM
from the we-prefer-the-term-autonomous-agents dept.

A large portion of web traffic is due to bots, and has been for years.

How much of the [I]nternet is fake? Studies generally suggest that, year after year, less than 60 percent of web traffic is human; some years, according to some researchers, a healthy majority of it is bot. For a period of time in 2013, the Times reported this year, a full half of YouTube traffic was "bots masquerading as people," a portion so high that employees feared an inflection point after which YouTube's systems for detecting fraudulent traffic would begin to regard bot traffic as real and human traffic as fake. They called this hypothetical event "the Inversion."

[...] Take something as seemingly simple as how we measure web traffic. Metrics should be the most real thing on the [I]nternet: They are countable, trackable, and verifiable, and their existence undergirds the advertising business that drives our biggest social and search platforms. Yet not even Facebook, the world's greatest data-gathering organization, seems able to produce genuine figures. In October, small advertisers filed suit against the social-media giant, accusing it of covering up, for a year, its significant overstatements of the time users spent watching videos on the platform (by 60 to 80 percent, Facebook says; by 150 to 900 percent, the plaintiffs say). According to an exhaustive list at MarketingLand, over the past two years Facebook has admitted to misreporting the reach of posts on Facebook Pages (in two different ways), the rate at which viewers complete ad videos, the average time spent reading its "Instant Articles," the amount of referral traffic from Facebook to external websites, the number of views that videos received via Facebook's mobile site, and the number of video views in Instant Articles.

Can we still trust the metrics? After the Inversion, what's the point? [...]

Some metrics already measure the legitimate traffic as smaller than the bot traffic.


Original Submission

Related Stories

Inside the Black Market for Bots That Buy Designer Clothes Before They Sell Out 28 comments

In a growing number of online activities, bots are the main means of interaction. Online shopping is increasingly one of those areas. Vice has an interview with someone who built their own bot in order to compete against the other bots when buying online, just to have a chance at purchasing sought-after items.

A tool for beating others to buying the items you want consists of three main components, finalphoenix explained. A monitoring bot, which scouts the target websites for new items; an account creation part, which will make a load of accounts on the site so you have a higher chance of pushing through the crowd as you control more of it; and a purchase bot, the part that actually orders and pays for your item. Users will also need to get some server space to run their bots.

Hiding from the clothes websites that you're using a bot is a bit more complicated; companies will likely ban you if they suspect you're scraping their website. Here, buyers need to use different accounts, proxies to route their traffic, and other technical means as workarounds.

Earlier on SN:
Facebook and CMU's AI Poker Bot Beat Five Pros at Once
TrickBot Malware Learns How to Spam -- Ensnares 250M Email Addresses
How Much of the Internet Is Fake? Turns Out, a Lot of It, Actually


Original Submission

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by crafoo on Thursday January 03 2019, @01:01PM (11 children)

    by crafoo (6639) on Thursday January 03 2019, @01:01PM (#781451)

    Imagine if the advertising companies no longer had confidence in Google/Youtube stats. How on Earth could they have confidence in the price they should pay at the advertising spot auctions? Why, the whole web would just come tumbling down! Maybe we can get some laws passed. Protect these adware sites. Protect the internet as we know it. Maybe get some sweet barriers to entry in there while we are at it.

    • (Score: 5, Interesting) by Nuke on Thursday January 03 2019, @01:20PM (5 children)

      by Nuke (3162) on Thursday January 03 2019, @01:20PM (#781456)

      It is just a matter of time before it is realised that advertising, and data for advertising, is worth nowhere near what is being paid for it at the moment. We are told that the ad industry earns and handles millions for this and billions for that, but that is only what is being received by entities at the top of the sales pyramid (like Facebook) in exchange for ad data and ad space. The money is being fed into the bottom by marks who are getting nowhere near the value of what they are paying. When the marks realise this the whole over-inflated pyramid will collapse.

      • (Score: 2, Interesting) by khallow on Thursday January 03 2019, @02:02PM

        by khallow (3766) Subscriber Badge on Thursday January 03 2019, @02:02PM (#781463) Journal

        It is just a matter of time before it is realised that advertising, and data for advertising, is worth nowhere near what is being paid for it at the moment.

        There are probably two classes of advertisers. The first are the newbies trying out advertising because they heard it works. Those will be affected by a change in the perception of how effective online advertising is. Then there's the second group, which knows it works because they already have ad campaigns that worked to generate business for themselves - metrics from self-interested advertisers be damned.

      • (Score: 2, Insightful) by nitehawk214 on Thursday January 03 2019, @02:14PM

        by nitehawk214 (1304) on Thursday January 03 2019, @02:14PM (#781468)

        I think this feeds into the fact that companies like Coca-Cola and Pepsi spend absurd percentages of their budgets on advertising. They know their product costs pennies to make at the scales they are making it at, so they simply drown out competition in a flood of marketing.

        --
        "Don't you ever miss the days when you used to be nostalgic?" -Loiosh
      • (Score: 2) by crafoo on Thursday January 03 2019, @06:30PM (2 children)

        by crafoo (6639) on Thursday January 03 2019, @06:30PM (#781598)

        I think advertising works far better than most people realize. It's generally money well-spent.

        • (Score: 2) by MostCynical on Thursday January 03 2019, @08:12PM (1 child)

          by MostCynical (2589) on Thursday January 03 2019, @08:12PM (#781662) Journal

          Red Bull doesn't make any products. Everything is licensing and advertising - seems to work for them.

          --
          "I guess once you start doubting, there's no end to it." -Batou, Ghost in the Shell: Stand Alone Complex
          • (Score: 2, Informative) by Anonymous Coward on Friday January 04 2019, @06:07AM

            by Anonymous Coward on Friday January 04 2019, @06:07AM (#781932)

            A can of Red Bull in Oz is about $3. There is an off-brand drink that looks the same, has the same flavor, and comes in the same size can. It costs $1 and sells reasonably well, but nowhere near as much as Red Bull.
            I used to work where the generic was sold, and it occasionally turned up in Red Bull boxes.
            That's right, it was the same stuff, from the same factory, and people still paid triple for the name.

    • (Score: 4, Interesting) by VLM on Thursday January 03 2019, @02:26PM (3 children)

      by VLM (445) on Thursday January 03 2019, @02:26PM (#781472)

      It's interesting that there's a market of fakeness already.

      Newspapers sell $7 of advertising per issue and retail for $1.50 (or whatever, it's been a long time since I've read one) and there have been huge lawsuits over printing presses emitting directly into recycling bins. They can prove by receipts for paper that they printed 250K copies but they can only sell a smaller number, so in the recycling bin they go...

      TV advertising companies proudly claim ridiculous statistics like the average American watches 8 hours of TV per day. Of course they are 1) twisting facts by failing to mention they're counting coma patients and nursing home residents and prisoners as 24/7 viewing regardless of whether they're actually conscious and watching or even present in the room 2) they're simply lying.

      It's 2019, other than legacy neocon talk radio for boomers, I don't think anyone listens to broadcast radio anymore, but the reported listening stats are ridiculously high.

      I mean, in an absolute sense if internet stats were 100% inflated, that's still better than legacy media.

      This is aside from issues like "we got a budget, we'll spend it on what people actually see regardless of their ridiculous self promotional numbers" because no matter how the numbers are made up, there's more middle aged women paying attention to Facebook than legacy broadcast FM radio. Or more teen girls on instagram than reading old fashioned newspapers.

      • (Score: 1, Informative) by Anonymous Coward on Thursday January 03 2019, @05:01PM (1 child)

        by Anonymous Coward on Thursday January 03 2019, @05:01PM (#781547)

        > I don't think anyone listens to broadcast radio anymore

        One counter-data point -- I listen to BBC World Service (broadcast by local public/university station in USA) nearly every night.

        • (Score: 3, Insightful) by isostatic on Thursday January 03 2019, @11:34PM

          by isostatic (365) on Thursday January 03 2019, @11:34PM (#781766) Journal

          I listen to WS and Radio 4 in the car. Don’t watch TV, don’t listen to commercial radio unless in a shop/taxi/etc

          On holiday in a cottage this week - we put on the TV on New Year's Eve for the last hour to watch Big Ben. What an eye opener - 50 channels of dross, most with adverts. It was all nonsensical. We gave up, and put on “atomic clock” for the countdown, then Spotify for Auld Lang Syne.

          However it turns out that not everyone is like me. People actually watch TV. They listen to 5 songs and 10 adverts an hour. They enjoy having their brains warped by these messages. After three years of not having these adverts at all (since dumping the aerial), it was eye opening to actually watch them again.

      • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @08:40PM

        by Anonymous Coward on Thursday January 03 2019, @08:40PM (#781668)

        Newspapers sell $7 of advertising per issue and retail for $1.50 (or whatever, it's been a long time since I've read one) and there have been huge lawsuits over printing presses emitting directly into recycling bins. They can prove by receipts for paper that they printed 250K copies but they can only sell a smaller number, so in the recycling bin they go...

        $3 here for the incumbent newspaper. But isn't the Audit Bureau of Circulation still a thing? It was supposed to measure the actual distribution of advertising substrates, to allow ad placers to get a reliable idea of the spread of a publication, and the number of eyeballs they may reach.

    • (Score: 1, Interesting) by Anonymous Coward on Thursday January 03 2019, @03:36PM

      by Anonymous Coward on Thursday January 03 2019, @03:36PM (#781506)

      If advertising were really such a moneymaker, real ad firms like Saatchi & Saatchi and Ogilvy & Mather would have extreme valuations. I suppose some time we'll find out that Google's ads were just a front to countless millions received from governments to spy on people.

  • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @01:58PM (1 child)

    by Anonymous Coward on Thursday January 03 2019, @01:58PM (#781462)

    So fake that you are not reading it right now.

    • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @06:28PM

      by Anonymous Coward on Thursday January 03 2019, @06:28PM (#781597)

      I'm so meta, even this acronym.

  • (Score: 2) by stretch611 on Thursday January 03 2019, @02:51PM (4 children)

    by stretch611 (6199) on Thursday January 03 2019, @02:51PM (#781484)

    a lot of fake breasts. (not to mention male parts that are also faked from surgery.)

    --
    Now with 5 covid vaccine shots/boosters altering my DNA :P
    • (Score: 2) by OrugTor on Thursday January 03 2019, @04:51PM (3 children)

      by OrugTor (5147) on Thursday January 03 2019, @04:51PM (#781541)

      Wait - you can get surgery for male parts? If only I had known when I was younger. I am assuming you are not referring to circumcision/castration.

      • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @07:27PM (2 children)

        by Anonymous Coward on Thursday January 03 2019, @07:27PM (#781636)

        They can cut some tendons that will make the shaft slide out a couple inches, not sure if implants are a thing though.

        • (Score: 1, Funny) by Anonymous Coward on Thursday January 03 2019, @09:49PM (1 child)

          by Anonymous Coward on Thursday January 03 2019, @09:49PM (#781720)

          Long ago a car guy told me that it's bore, not stroke that counts...

          • (Score: 0) by Anonymous Coward on Friday January 04 2019, @08:57PM

            by Anonymous Coward on Friday January 04 2019, @08:57PM (#782234)

            What about Wankel engines?

  • (Score: 2) by mhajicek on Thursday January 03 2019, @03:01PM

    by mhajicek (51) on Thursday January 03 2019, @03:01PM (#781492)

    That's five ways.

    --
    The spacelike surfaces of time foliations can have a cusp at the surface of discontinuity. - P. Hajicek
  • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @03:29PM (5 children)

    by Anonymous Coward on Thursday January 03 2019, @03:29PM (#781504)

    The people around here who shit all over the problems with web stats are the same ones who boast how they have blocked all analytics from running when they browse the web.

    Hey dumbass! GIGO!

    • (Score: 2, Insightful) by Anonymous Coward on Thursday January 03 2019, @04:31PM (2 children)

      by Anonymous Coward on Thursday January 03 2019, @04:31PM (#781524)

      1. We don't support analytics so we block it.
      2. We can see the legitimate problems with analytics (such as bots) and recognize that the advertising industry is built on shaky ground.

      You can take your rusty irony and shove it up your fetid asshole.

      • (Score: 1, Funny) by Anonymous Coward on Thursday January 03 2019, @07:22PM

        by Anonymous Coward on Thursday January 03 2019, @07:22PM (#781632)

        You can take your rusty irony and shove it up your fetid asshole.

        The reports on the internet that my irony is rusty, and that my asshole is fetid, are fake.

      • (Score: 0) by Anonymous Coward on Friday January 04 2019, @07:02PM

        by Anonymous Coward on Friday January 04 2019, @07:02PM (#782180)

        Yeah, that guy claiming that because we block analytics we have no reason to gripe at their accuracy must work for some sort of metric related company.

        We're not the ones contributing fake metrics. In fact, our lack of participation is helping the industry figure out who is really fake -- not that they have an interest in doing so. As noted, the whole pyramid will collapse into ruin if everyone really understood that i block spam and have html disabled in my email client and run no script and umatrix and filter on my firewall and make dns entries for sites with many scattered IPs that I don't want anything of mine visiting and

        it's hard to call it complaining when I comment on how the system I had to take actions to force my opt-out of (since often, there is no offered choice to do so) is not working well. Even the people actively contributing are not generating as much data as the bots.

        probably, the bots are from competitors to someone. Or people like me that used to run a mouse mover and one of those applications that would pay you to watch ads. I just let it run all day like how bitcoin mining worked. I only got a few checks before that company went out of business--but the checks they sent were real and they actually paid in full when I cashed them.

        Those bots are doing the same thing at scale, and these metrics the industry has... are more accurately reflecting bot profits than people. Only the real suckers install weather channel apps or facebook messenger. Even the bots don't do that, and the bots are what generates most of the traffic to influence advert sales and impressions.

    • (Score: 5, Touché) by Nuke on Thursday January 03 2019, @05:38PM

      by Nuke (3162) on Thursday January 03 2019, @05:38PM (#781574)

      The people around here who shit all over the problems with web stats are the same ones who boast how they have blocked all analytics from running when they browse the web.

      And the problem with that is what? I block the admen's analytics and I am glad, not sad, that I am contributing to them being fucked up. I am interested in this and other articles about the failure of advertising data purely as a form of amusement.

    • (Score: 2) by crafoo on Thursday January 03 2019, @06:32PM

      by crafoo (6639) on Thursday January 03 2019, @06:32PM (#781599)

      Bots taking over the majority of web traffic - not the same as a small minority of people blocking unwanted (and unauthorized) access to their computers.

  • (Score: 3, Interesting) by Anonymous Coward on Thursday January 03 2019, @06:13PM (4 children)

    by Anonymous Coward on Thursday January 03 2019, @06:13PM (#781593)

    I have a few "google alerts" set for the title of my book and some common mis-spellings -- mostly curious who might be commenting/reviewing/blogging-about the book (or pirating it). Many of the web pages that the big G finds are nonsense, they may have English words but no intelligence. Sometimes it looks like piles of search requests (to google or other search engine??) all crammed together.

    Often these are attached to legit sites for some completely different purpose, if I shorten the URL down to the base .com/.edu/.org I get a perfectly normal informational page. For example, one of these weird pages was part of a website for an elementary school, another time a florist shop. In these cases, my guess is that the site was broken into (default password or something else easy) and the site database is being used for data storage by the attacker?

    But what is the point of this type of nonsensical (to me) page?

    • (Score: 1, Interesting) by Anonymous Coward on Thursday January 03 2019, @06:52PM (2 children)

      by Anonymous Coward on Thursday January 03 2019, @06:52PM (#781610)

      I also run into them very frequently, in the last few years more often than sites with useful information. IMHO the point is that the large number of keywords on these pages gets bots to index them under more of the words a user might combine with the most important keyword. This "blurs" the results a bit and makes the user more likely to click the wrong page. Such things are everywhere today, from news portals to e-shops. Even some WP sites have them, probably via a plug-in. They are linked using invisible hyperlinks or scripts that detect the crawler's UA.
      Similar things happen with images - there are sites which notoriously make such "gibberish" out of images found on other sites. Crawling through this generates lots of traffic and makes search quality worse. Doesn't it violate some law? Like "obstructing access to IT system" (literal translation from some country's computer security law)?
      So if some website wants to charge money for hosting a simple text or a picture, just check how much that's really worth by looking at the terabytes of junk they host without problems. Sometimes it's just better to go with your own hosting, knowing that some time ago it was a default part of an Internet access account, meant to make communication possibilities more equal than producers farting propaganda into consumers' ears.
      This knowledge also cures all "Internet freedom" fights - all this recent "Article 13" mess is just the squeak of one pig being pushed away from the food by a larger pig.

      • (Score: 0) by Anonymous Coward on Thursday January 03 2019, @09:57PM

        by Anonymous Coward on Thursday January 03 2019, @09:57PM (#781722)

        I wonder how our resident Bot feels about these pages, does "it" find them confusing?

      • (Score: 0) by Anonymous Coward on Friday January 04 2019, @06:12AM

        by Anonymous Coward on Friday January 04 2019, @06:12AM (#781936)

        Change your useragent to Googlebot and you will see some weird shit on the web.

    • (Score: 2) by toddestan on Saturday January 05 2019, @07:42PM

      by toddestan (4982) on Saturday January 05 2019, @07:42PM (#782588)

      My guess is they are probably SEO garbage sites. If you dig around a bit, you'll probably find links back to a "real" site or sites. The idea being that if search engines see a ton of websites that link to a site that they'll rank that website higher (I don't know if that really works, but that's what people believe).

      I got a similar thing when I set up an online image gallery using a pretty common PHP-based application, and stupidly left anonymous commenting on. Tons of generic comments along the lines of "Cool picture, check out my website at blahblahblah". All fake traffic of course. At least it was easy to clean up.

  • (Score: 2) by darkfeline on Friday January 04 2019, @12:14AM (2 children)

    by darkfeline (1030) on Friday January 04 2019, @12:14AM (#781792) Homepage

    You know that User Agent HTTP header? It identifies the piece of software acting on behalf of the user, the agent of the user.

    First, the way "fake" is used in the headline is clearly clickbait. It doesn't mean anything. All the article says is that more HTTP requests are made by bots.

    But wait, all HTTP requests are made by bots, specifically pieces of software acting out the instructions of people. All that's changed is the average level of indirection between the human and the user agent.

    If I enter a URL into a browser and the browser sends a request, is that fake? If I send a request with curl, is that fake? If I run a script which runs curl to send a request, is that fake? If I make that script into a cron job, is that fake?
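
The point about the User-Agent header can be sketched in Python. This is only an illustration under stated assumptions: the function name `classify_user_agent` and the marker lists are hypothetical, and the header is a self-reported string, so a heuristic like this can only guess at what kind of software sent a request.

```python
# Hypothetical substring markers for illustration; not an authoritative list.
CRAWLER_MARKERS = ("googlebot", "bingbot", "spider", "crawler")
TOOL_MARKERS = ("curl/", "wget/", "python-requests/")
BROWSER_MARKERS = ("mozilla/",)


def classify_user_agent(user_agent: str) -> str:
    """Guess the kind of agent from its self-reported User-Agent string."""
    ua = user_agent.strip().lower()
    if not ua:
        return "unknown"
    # Check crawlers first: many of them also start with "Mozilla/5.0".
    if any(marker in ua for marker in CRAWLER_MARKERS):
        return "crawler"
    if any(ua.startswith(marker) for marker in TOOL_MARKERS):
        return "tool"
    if any(ua.startswith(marker) for marker in BROWSER_MARKERS):
        return "browser"
    return "unknown"
```

A curl request sends the same string whether it was typed by hand, run from a script, or fired by a cron job, and any client can set the header to whatever it likes. The label says which software made the request, not what the human behind it intended.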

    --
    Join the SDF Public Access UNIX System today!
    • (Score: 0) by Anonymous Coward on Friday January 04 2019, @10:59PM

      by Anonymous Coward on Friday January 04 2019, @10:59PM (#782280)

      If I run my botnet to scam somebody, is that fake?

    • (Score: 2) by toddestan on Saturday January 05 2019, @07:54PM

      by toddestan (4982) on Saturday January 05 2019, @07:54PM (#782593)

      I'm going to guess, from the second paragraph in the two paragraph summary, that "fake" means traffic generated to purposely manipulate metrics. So if you set up a cron job to use curl to inflate the number of views of your Youtube videos to make them appear more popular, or to inflate the amount of traffic to your website to make it more valuable to advertisers, then that traffic is "fake".

  • (Score: 0) by Anonymous Coward on Friday January 04 2019, @01:06AM

    by Anonymous Coward on Friday January 04 2019, @01:06AM (#781821)

    Ever since they've been a thing I've been detected as a bot by the PoS' (independent of IP address) simply because I'm very fast about what I do. The inversion likely has already happened, a long time ago presumably.
