posted by janrinok on Tuesday October 29 2019, @11:59PM
from the stop-looking-at-the-wrong-thing dept.

From the following story:

Amazon has still not provided any useful information or insights into the DDoS attack that took down swathes of websites last week, so let's turn to others that were watching.

One such company is digital monitoring firm Catchpoint, which sent us its analysis of the attack in which it makes two broad conclusions: that Amazon was slow in reacting to the attack, and that tardiness was likely the result of its looking in the wrong places.

Even though cloud providers go to some lengths to protect themselves, the DDoS attack shows that even a company as big as Amazon is vulnerable. Not only that but, thanks to the way that companies use cloud services these days, the attack has a knock-on impact.

"A key takeaway is the ripple effect impact when an outage happens to a third-party cloud service like S3," Catchpoint noted.

The attack targeted Amazon's S3 - Simple Storage Service - which provides object storage through a web interface. It did not directly target the larger Amazon Web Services (AWS) but for many companies the end result was the same: their websites fell over.

[...] Amazon responded by rerouting packets through a DDoS mitigation service run by Neustar, but it took hours for the company to react. Catchpoint says its first indications that something was up came five hours before Amazon seemingly noticed: it saw "anomalies" that, it says, should have served as early warning signs.

When it had resolved the issue, Amazon said the attack happened "between 1030 and 1830 PST," but Catchpoint's system shows highly unusual activity from 0530. We should point out that Catchpoint sells monitoring services for a living so it has plenty of reasons to highlight its system's efficacy, but that said, assuming the graphic we were given [PDF] is accurate - and we have double-checked with Catchpoint - it does appear that Amazon was slow to recognize the threat.

Catchpoint says the problem is that Amazon - and many other organizations - are using an "old" way of measuring what's going on. They monitor their own systems rather than the impact on users.

"It is critical to primarily focus on the end-user," Catchpoint argues. "In this case, if you were just monitoring S3, you would have missed the problem (perhaps, being alerted first by frustrated users)."

-- submitted from IRC


Original Submission

 
  • (Score: -1, Troll) by Anonymous Coward on Wednesday October 30 2019, @03:56AM (2 children)

    by Anonymous Coward on Wednesday October 30 2019, @03:56AM (#913589)

    OK, I'll just punch you in the face, and we'll see how your broken nose is your own failure to design yourself an unbreakable face. It's not my fault that I punched you. It's a normal accident, completely the system's fault.

  • (Score: 3, Interesting) by c0lo on Wednesday October 30 2019, @07:54AM

    by c0lo (156) Subscriber Badge on Wednesday October 30 2019, @07:54AM (#913621) Journal

    Where's my "Don't you dare to touche me" ** mod when I need it.

    ** In a complex and interactive systems environment, the continuation could be "... or else accidents may happen to you and it will be your fault. Fair warning"

    --
    https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
  • (Score: 5, Insightful) by c0lo on Wednesday October 30 2019, @02:13PM

    by c0lo (156) Subscriber Badge on Wednesday October 30 2019, @02:13PM (#913702) Journal

    Before being smug, you might want to stop and think a bit. Here are some points:

    If a complex system is designed for resilience, it has a good chance of surviving a partial failure, whatever the cause - DDoS included. If you were actually more interested, or just curious, rather than keen to "display your muscles", who knows what you might have discovered? Maybe something like Resilience Design Patterns [arxiv.org] from Oak Ridge National Laboratory (yep, the people maintaining and using several of the Top 50 supercomputers; I have this nagging feeling they know a bit more than you about resilience).
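
    To make "designed for resilience" slightly less abstract, here is a generic sketch of one well-known resilience pattern, the circuit breaker - it is not taken from the ORNL catalogue, and the class and parameter names below are made up:

        import time

        class CircuitBreaker:
            """Toy circuit breaker: stop calling a failing dependency for a
            while instead of letting every caller pile up behind it."""

            def __init__(self, max_failures=3, reset_after=30.0):
                self.max_failures = max_failures
                self.reset_after = reset_after
                self.failures = 0
                self.opened_at = None

            def call(self, func, *args, **kwargs):
                # While "open", fail fast until the cool-down period has passed.
                if self.opened_at is not None:
                    if time.monotonic() - self.opened_at < self.reset_after:
                        raise RuntimeError("circuit open: dependency presumed down")
                    self.opened_at = None          # half-open: allow one trial call
                    self.failures = 0
                try:
                    result = func(*args, **kwargs)
                except Exception:
                    self.failures += 1
                    if self.failures >= self.max_failures:
                        self.opened_at = time.monotonic()   # trip the breaker
                    raise
                self.failures = 0
                return result

    A service that wraps its storage calls in something like this can degrade gracefully (serve a cached page, say) instead of hanging every request thread behind a dead dependency.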

    Now, if those complex systems are not designed for resilience, very interesting things may happen. Things like cascading failures [wikipedia.org], in which a network goes down because of a single node failure, even though the entire network has more than enough capacity to handle the load.
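
    A throwaway numeric illustration of that point (made-up capacities and loads, nothing to do with AWS's real topology): five nodes with plenty of spare capacity in total still all fail once one node's load gets redistributed onto the others.

        # Toy cascade: total capacity exceeds total load, yet one node failure
        # can still take everything down once its load is redistributed.
        def cascade(capacities, loads, first_failure):
            alive = {i: loads[i] for i in range(len(loads))}
            failed = [first_failure]
            while failed:
                node = failed.pop()
                if node not in alive:
                    continue                        # already failed earlier
                share = alive.pop(node)
                if not alive:
                    break
                extra = share / len(alive)          # orphaned load spread evenly
                for i in list(alive):
                    alive[i] += extra
                    if alive[i] > capacities[i]:    # overloaded node fails next
                        failed.append(i)
            return len(alive)

        # 5 nodes x 100 capacity = 500 total, carrying 90 each = 450 total,
        # comfortably under capacity, yet losing one node sinks the rest.
        print("surviving nodes:", cascade([100] * 5, [90] * 5, first_failure=0))   # 0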

    And this is far from being specific to computers. In spite of the idiots wanting to see the world burn down in ashes (I have moments when I'm among them), there actually are things like too big to fail [wikipedia.org] (if you're too lazy to follow the link, think of cascading bank runs).

    Now, the problem is that designing for resilience and operating resilient systems comes at extra cost, if only because such designs apply to large/complex systems: handling complexity requires brains (as opposed to punching muscles), and those brains aren't cheap.

    So yeah, I can see as plausible a statement like "AWS went down because of systemic faults; it just happened this time to be triggered by a DDoS, but it was bound to happen sooner or later from a large variety of triggers". Things like fat fingers taking down half the Web [soylentnews.org].

    --
    https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford