
SoylentNews is people

posted by cmn32480 on Tuesday October 24 2017, @06:22PM
from the somebody-blue-up-over-this dept.

Arthur T Knackerbracket has found the following story:

Back in September, IBM was left red-faced when its global load balancer and reverse DNS services fell over for 21 hours.

At the time, IBM blamed the outage on a third-party domain name registrar that was transferring some domains to another registrar. The sending registrar, IBM said, accidentally put the domains in a "hold state" that prevented them being transferred. As the load balancer and reverse DNS service relied on the domains in question, the services became inaccessible to customers.

IBM's now released an incident summary [PDF] in which it says "multiple domain names were mistakenly allowed to expire and were in hold status."

The explanation also reveals that the network-layer.net domain was caught up in the mess, in addition to the global-datacenter.com and global-datacenter.net domains that IBM reported as messed up in September.

It's unclear if IBM or its outsourced registrar was responsible for the failure to renew registration for the domains.

-- submitted from IRC



 
  • (Score: 3, Interesting) by choose another one (515) on Wednesday October 25 2017, @07:41PM (#587513)

    "It should be possible to sign uptime contracts with the registrars that would guarantee this kind of thing could never happen."

    NO, this is the wrong approach. For a start, if that registrar goes titsup (aka bankrupt) your contract guarantees _nothing_ at all.

    The real problem is having domain names designed into services and clients as a single point of failure. The solution, as with any other single point of failure, is to add redundancy. We fail over servers, services, disks/storage, network switches, network cabling, power supplies, every damned thing _except_ domain names - WHY? Fix that, problem gone. And fix it properly - depend only on multiple domain names renewed at different times with different registrars - _no_ single point of failure.
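    For illustration only, here is a rough Python sketch of what the client side of that might look like. It assumes a lapsed domain simply stops resolving, and that the same service is published under several names. The hostnames, SERVICE_HOSTS, resolve and pick_host are all made up for the example; none of this is how IBM's services actually work.

```python
import socket
from typing import Callable, Iterable, List, Tuple

# Hypothetical endpoints: one service published under several domains,
# ideally held at different registrars with different renewal dates.
SERVICE_HOSTS = ["lb.example.com", "lb.example.net", "lb.example.org"]


def resolve(host: str, port: int = 443) -> List[str]:
    """Return the IP addresses for host; raises OSError if the lookup fails."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def pick_host(
    hosts: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve,
) -> Tuple[str, List[str]]:
    """Try each domain in order; return the first one that still resolves."""
    errors = []
    for host in hosts:
        try:
            addresses = resolver(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            errors.append(f"{host}: {exc}")
            continue
        if addresses:
            return host, addresses
        errors.append(f"{host}: no addresses")
    raise RuntimeError("all service domains failed: " + "; ".join(errors))
```

    A client would call pick_host(SERVICE_HOSTS) before connecting. If the first domain has expired, the lookup error is recorded and the next name is tried. Resolution is only part of the picture: a real deployment would also need certificates and server configuration that cover every name.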
