
SoylentNews is people

posted by takyon on Tuesday August 09 2016, @03:34PM
from the you're-grounded dept.

Cringely speculates like hell:

Delta Airlines last night suffered a major power outage at its data center in Atlanta that led to a systemwide shutdown of its computer network, stranding airliners and canceling flights all over the world. You already know that. What you may not know, however, is the likely role in the crisis of IT outsourcing and offshoring.

Do any Soylentils have inside/better information?



 
  • (Score: 3, Interesting) by Snotnose on Tuesday August 09 2016, @03:43PM

    by Snotnose (1623) on Tuesday August 09 2016, @03:43PM (#385801)

    Outsourcing? [cringely.com]

    Or ancient software. The news this morning said that Delta's system had been written by an airline that Delta bought in 1982. I'm sure they've added all sorts of updates over the years, but if the architecture is 30 years old then yeah, I can see how that might be a problem. Especially considering how much air travel has changed in the last 10-15 years.

    --
    My ducks are not in a row. I don't know where some of them are, and I'm pretty sure one of them is a turkey.
  • (Score: -1, Offtopic) by Anonymous Coward on Tuesday August 09 2016, @04:06PM

    by Anonymous Coward on Tuesday August 09 2016, @04:06PM (#385814)

    air travel has changed in the last 10-15 years.

    Don't worry brother! Obama gonna CHANGE things back to the way they been in 1989, that number! Another summer! FIGHT THE POWER!

  • (Score: 3, Insightful) by Anonymous Coward on Tuesday August 09 2016, @04:24PM

    by Anonymous Coward on Tuesday August 09 2016, @04:24PM (#385823)

    It doesn't matter how old the software is. The fact is that Delta had no disaster recovery plan at all and no off-site passive standby ready to take over. That's just negligence or incompetence for a major corporation with IT infrastructure that is critical to their business. Honestly, I hope this hurts them hard enough that they go out of business. This complete lack of planning or foresight for such a critical piece of their business makes you wonder what other corners they're cutting elsewhere: like in plane maintenance.

    • (Score: -1, Flamebait) by Anonymous Coward on Tuesday August 09 2016, @04:31PM

      by Anonymous Coward on Tuesday August 09 2016, @04:31PM (#385827)

      Critical IT infrastructure? Delta isn't an IT company. America isn't even an IT country. IT is Indian Technology.

      • (Score: 2) by maxwell demon on Tuesday August 09 2016, @08:47PM

        by maxwell demon (1608) on Tuesday August 09 2016, @08:47PM (#385946) Journal

        The failure of that infrastructure left them unable to do their core business correctly. Yes, I'd call that critical infrastructure.

        --
        The Tao of math: The numbers you can count are not the real numbers.
    • (Score: 3, Interesting) by bob_super on Tuesday August 09 2016, @04:50PM

      by bob_super (1357) on Tuesday August 09 2016, @04:50PM (#385841)

      When United merged with Continental in 2010, they looked at both IT systems, and saved a few million by adopting the most ancient and underpowered one (from Continental, I think).
      I had a friend working for United in ORD, they had two months of pure hell, because the system just couldn't handle tens of thousands of novices making mistakes just as its load doubled.

      Nothing like getting yelled at all day by justifiably tired and angry customers, while the people providing the inappropriate tools celebrate their bonuses... I was actually surprised at the lack of spontaneous combustion of C-suite-owned cars and buildings.

      • (Score: 3, Interesting) by frojack on Tuesday August 09 2016, @06:17PM

        by frojack (1554) on Tuesday August 09 2016, @06:17PM (#385886) Journal

        Since they were merging, and both systems were already in-hand and sunk costs, how could one be cheaper than the other? Perhaps the newer one was bug-ridden and full of maintenance headaches and caused all sorts of downtime.

        I've seen large-scale accounting systems get newly developed replacements at the cost of years of work and multiple millions of dollars, and even after 5 years of operation the replacements couldn't manage the task the old system did with ease. Newer is not automatically better.

        --
        No, you are mistaken. I've always had this sig.
        • (Score: 2) by bob_super on Tuesday August 09 2016, @07:54PM

          by bob_super (1357) on Tuesday August 09 2016, @07:54PM (#385927)

          True, but his manager-of-users viewpoint was that the one they had was stable enough and had more productivity features, and the company chatter was that the one cheapest to double the load was selected despite being ancestral.
          It might have been impossible to double United's, or he may have been biased by his habit, but the end result was both sides of the company (and the customers, but who cares) having a horrible merger experience.

      • (Score: 3, Interesting) by JoeMerchant on Tuesday August 09 2016, @06:24PM

        by JoeMerchant (3937) on Tuesday August 09 2016, @06:24PM (#385894)

        I was actually surprised at the lack of spontaneous combustion of C-suite-owned cars and buildings.

        Don't be. When you're "in the system" close enough to access C-suite-owned cars and buildings, you're already in "lackey mode" where your prime motivation is to ingratiate yourself to those guys so they cut you in on a tiny slice of their pie.

        Or, you're at the other end of the pay-scale where you need this damn job in order to pay the past-due rent, so bombing the CEO's car might not be the best way to stay employed, or get re-employed at any of your crappy options.

        --
        🌻🌻 [google.com]
        • (Score: 2) by deadstick on Tuesday August 09 2016, @08:13PM

          by deadstick (5110) on Tuesday August 09 2016, @08:13PM (#385932)

          A fundamental principle of oppressive regimes, economic as well as political. If you want to trust people, let them dip their beaks to a depth appropriate to their level.

      • (Score: 3, Funny) by krishnoid on Tuesday August 09 2016, @07:11PM

        by krishnoid (1156) on Tuesday August 09 2016, @07:11PM (#385915)

        When United merged with Continental in 2010, they looked at both IT systems, and saved a few million by adopting the most ancient and underpowered one (from Continental, I think).
        ... pure hell, because the system just couldn't handle tens of thousands of novices making mistakes just as its load doubled.

        Too bad your friend didn't just print up a sign with that information on it, and add "If you want to complain, please call our M&A department at ..." . I bet a bunch of the customers would have been at least a little understanding.

    • (Score: 0) by Anonymous Coward on Tuesday August 09 2016, @05:12PM

      by Anonymous Coward on Tuesday August 09 2016, @05:12PM (#385851)
      Maintenance is probably better. "If I sign off on this in its current state, I'll lose my license" provides leverage that IT people don't have to push back against "not in the budget".
    • (Score: 3, Insightful) by sjames on Tuesday August 09 2016, @05:30PM

      by sjames (2882) on Tuesday August 09 2016, @05:30PM (#385862) Journal

      Beyond that, Delta, like many businesses out there, has no resiliency in their system at all. The whole damned operation worldwide runs through a single point of failure. Further, if that single point fails, even with ticket, passenger, and plane all in the same place at the same time they somehow cannot put the passenger on the plane. Why should any of that require more than a local server? Sure, I can see how there would be problems with scheduling future flights, but anyone with a ticket to fly should be able to fly.

      • (Score: 3, Insightful) by frojack on Tuesday August 09 2016, @06:12PM

        by frojack (1554) on Tuesday August 09 2016, @06:12PM (#385884) Journal

        Local server?

        Passengers have the tickets in hand. One or two gate agents are all you need to check and tear ticket stubs and board the plane.
        A few gorillas to load the baggage.
        A pilot to file a flight plan, and a call for a pushback tug.

        Done.

        Sure, it's not sustainable beyond a couple of days. But it shouldn't turn to shit the instant power fails 2000 miles away.

         

        --
        No, you are mistaken. I've always had this sig.
        • (Score: 0) by Anonymous Coward on Tuesday August 09 2016, @07:32PM

          by Anonymous Coward on Tuesday August 09 2016, @07:32PM (#385922)

          You never saw Airport 77, did you? Old lady made her own boarding passes with a felt-tip pen. Without central server to verify lady was authorized to fly, she got lots of free flights.

          TSA rules require central passenger verification anyway.

        • (Score: 2) by edIII on Tuesday August 09 2016, @08:31PM

          by edIII (791) on Tuesday August 09 2016, @08:31PM (#385940)

          No security. You could get 100 terrorists on the planes simply by creating fake tickets and then taking out the power in a single building someplace. Which brings up another point, how does the TSA check the paper tickets for authenticity? More than just the airline company is accessing airline systems, and the emphasis these days is not so much logistics but security.

          All of that could be accomplished with a local caching/authentication server. If we really wanted to create a system capable of these things we could. The problem is management and budgeting, not availability of solutions and technology.

          --
          Technically, lunchtime is at any moment. It's just a wave function.
          • (Score: 2) by sjames on Tuesday August 09 2016, @08:59PM

            by sjames (2882) on Tuesday August 09 2016, @08:59PM (#385955) Journal

            That's why I said local server. When the ticket is created, the server at the departing airport gets a record of the ticket. The ticket itself gets a barcode containing the record and a signature to make it VERY hard to forge. The signature must verify and it must match the already downloaded record.

            In addition, the local server can then inform the central control that the passenger was actually boarded once things return to normal.
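
            The signed-ticket idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any airline's actual system: the names `encode_ticket`, `LocalGateServer`, `download_record`, and `board`, the record fields, and the demo key are all hypothetical. It also uses an HMAC with a shared key because that is in the Python standard library; a real deployment would more likely use a public-key signature so the airport server never holds the signing key.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared key, for illustration only.
SECRET_KEY = b"demo-key-not-for-production"


def encode_ticket(record: dict, key: bytes = SECRET_KEY) -> str:
    """Build barcode text: base64(record JSON) + '.' + hex HMAC-SHA256."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


class LocalGateServer:
    """Airport-local server that can board passengers while central is down."""

    def __init__(self, key: bytes = SECRET_KEY):
        self.key = key
        self.records = {}        # ticket_id -> record downloaded at ticketing time
        self.pending_sync = []   # boarded ticket ids to report to central later

    def download_record(self, record: dict) -> None:
        """Store the record sent by the central system when the ticket was sold."""
        self.records[record["ticket_id"]] = record

    def board(self, barcode: str) -> bool:
        """Accept only if the signature verifies AND the record was pre-downloaded."""
        try:
            b64, sig = barcode.rsplit(".", 1)
            payload = base64.urlsafe_b64decode(b64.encode())
            record = json.loads(payload)
        except (ValueError, TypeError):
            return False  # malformed barcode
        expected = hmac.new(self.key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected.encode(), sig.encode()):
            return False  # signature does not verify
        if not isinstance(record, dict):
            return False
        ticket_id = record.get("ticket_id")
        if self.records.get(ticket_id) != record:
            return False  # no matching pre-downloaded record
        if ticket_id in self.pending_sync:
            return False  # ticket already used
        self.pending_sync.append(ticket_id)
        return True
```

            Once the central system is reachable again, the contents of `pending_sync` would be what the local server reports back as "actually boarded".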

            As for management, call it a private cloud app and they'll be pissing themselves with excitement to get it done.

        • (Score: 2) by maxwell demon on Tuesday August 09 2016, @08:51PM

          by maxwell demon (1608) on Tuesday August 09 2016, @08:51PM (#385951) Journal

          Sorry, but the zoo was not willing to give the gorillas to the airport.

          --
          The Tao of math: The numbers you can count are not the real numbers.
    • (Score: 2) by frojack on Tuesday August 09 2016, @06:02PM

      by frojack (1554) on Tuesday August 09 2016, @06:02PM (#385880) Journal

      Agreed. Old Software is not the problem.

      Not having an offsite backup MAY be a problem, but it is a secondary problem, not a causal event.

      The problem seems to be they were running on shore power only, and had zero or inadequate on-site backup power.

      It's also possible they didn't have control of their uplink to the network, had no redundant links, and left that in the hands of some telco with equally inadequate backup.

      I doubt Cringely.

      --
      No, you are mistaken. I've always had this sig.
    • (Score: 1, Informative) by Anonymous Coward on Tuesday August 09 2016, @06:04PM

      by Anonymous Coward on Tuesday August 09 2016, @06:04PM (#385881)

      At least planes get inspected every so often. But from what I read in USA Today, only part of the system failed, the backup came up, and promptly locked up. Because of the half-broken state, the offsites wouldn't fail over. This just screams to me of improper testing of failure modes.

    • (Score: 3, Informative) by DannyB on Tuesday August 09 2016, @06:19PM

      by DannyB (5839) Subscriber Badge on Tuesday August 09 2016, @06:19PM (#385889) Journal

      Delta had no disaster recovery plan at all and no off-site passive standby ready to take over. That's just negligence or incompetence for a major corporation

      What you call negligence and incompetence, executives call bigger bonuses for cost saving. Job well done!

      --
      To transfer files: right-click on file, pick Copy. Unplug mouse, plug mouse into other computer. Right-click, paste.
    • (Score: 0) by Anonymous Coward on Tuesday August 09 2016, @06:38PM

      by Anonymous Coward on Tuesday August 09 2016, @06:38PM (#385901)

      > no off-site passive standby ready to take over.

      It's hard to tell, but reading between the lines of the reporting in the popular press, they did have a standby but something went wrong with it too.

    • (Score: 2) by VLM on Tuesday August 09 2016, @08:06PM

      by VLM (445) on Tuesday August 09 2016, @08:06PM (#385930)

      It doesn't matter how old the software is. The fact is that Delta had no disaster recovery plan at all

      I started my career at a major financial services company a long time ago and their disaster recovery plan was based on having the same code at many sites, which works perfectly when the problem is a hurricane or earthquake (neither of which were issues at any of the sites, intentionally, but I digress) and fails miserably when the problem is the code itself.

      Of course we had two devs and a test system to go with our dual prods and disaster recovery schemes. Back when having a test or dev meant literally buying five mainframes instead of just two.

      Anyway if there's an old bug that barfs on 8/8/16 for whatever reason, all the hardware DR plans in the world won't help.

  • (Score: 2, Insightful) by Anonymous Coward on Tuesday August 09 2016, @04:55PM

    by Anonymous Coward on Tuesday August 09 2016, @04:55PM (#385842)

    > Or ancient software. The news this morning said that Delta's system had been written by an airline that Delta bought in 1982. I'm sure they've added all sorts of updates over the years, but if the architecture is 30 years old then yeah, I can see how that might be a problem. Especially considering how much air travel has changed in the last 10-15 years.

    Yeah, 30-year-old code must be bad. They should have scrapped it and rewritten it from scratch. That worked so well for netscape/systemd/etc.

    • (Score: 1) by fustakrakich on Wednesday August 10 2016, @04:33AM

      by fustakrakich (6150) on Wednesday August 10 2016, @04:33AM (#386118) Journal

      Netscape [seamonkey-project.org] still works for me.

      As far as system redundancy is concerned, it has to be made cheaper than lawsuits and insurance/tax write-offs for it to happen.

      --
      La politica e i criminali sono la stessa cosa..