SoylentNews is people

posted by hubie on Friday January 27 2023, @05:32AM
from the wait-did-you-say-"insert"-or-"drop"? dept.

They were in the midst of synchronizing databases, the agency revealed:

The contractors working on the Federal Aviation Administration's NOTAM system apparently deleted files by accident, leading to the delays and cancellations of thousands of US flights. If you'll recall, the FAA paused all domestic departures in the US on the morning of January 11th, because its NOTAM or Notice to Air Missions system had failed. NOTAMs typically contain important information for pilots, including warnings for potential hazards along a flight's route, flight restrictions and runway closures. 

[...] The agency later reported that the system failed after "personnel who failed to follow procedures" damaged certain files. Now, it has shared more details as part of the preliminary findings of an ongoing investigation. Apparently, its contractors were synchronizing a main and a back-up database when they "unintentionally deleted files" that turned out to be necessary to keep the alert system running. It also reiterated what it said in the past that it has "so far found no evidence of a cyberattack or malicious intent."


This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by PiMuNu on Friday January 27 2023, @10:43AM (6 children)

    by PiMuNu (3823) on Friday January 27 2023, @10:43AM (#1288888)

    > personnel who failed to follow procedures

    There is always potential for a cock-up, but the procedures should be written in such a way that the risk is negligible. For example, running multiple servers in parallel with suitable failover (they must do this anyway for such a critical system); ensuring roll back capability for a couple of days before deleting any data.
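    To make the roll-back idea concrete, here is a minimal sketch of a "soft delete" that keeps removed files recoverable for a couple of days. Everything in it is an assumption for illustration: the names `soft_delete`, `restore`, `purge_expired`, and the two-day `RETENTION_SECONDS` window are hypothetical, and nothing here reflects how the NOTAM system actually works.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional

# Hypothetical rollback window: keep "deleted" files for two days.
RETENTION_SECONDS = 2 * 24 * 60 * 60


def soft_delete(path: Path, trash_dir: Path) -> Path:
    """Move a file into trash_dir instead of deleting it outright."""
    trash_dir.mkdir(parents=True, exist_ok=True)
    # The nanosecond timestamp prefix records when the file was trashed
    # and keeps repeated deletions of the same name from colliding.
    target = trash_dir / f"{time.time_ns()}_{path.name}"
    shutil.move(str(path), str(target))
    return target


def restore(trashed: Path, original: Path) -> None:
    """Roll back a soft delete by moving the file to its original place."""
    shutil.move(str(trashed), str(original))


def purge_expired(trash_dir: Path, now: Optional[float] = None) -> List[Path]:
    """Permanently remove trashed files older than the retention window."""
    now_ns = time.time_ns() if now is None else int(now * 1e9)
    removed = []
    for item in sorted(trash_dir.iterdir()):
        stamp, _, _ = item.name.partition("_")
        if not stamp.isdigit():
            continue  # not something soft_delete created; leave it alone
        if now_ns - int(stamp) > RETENTION_SECONDS * 1_000_000_000:
            item.unlink()
            removed.append(item)
    return removed
```

    The age is read from the file name rather than the file's modification time, because a move usually preserves the old modification time and would make long-lived files look expired the moment they are trashed.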

    Easier said than done I guess.

    • (Score: 5, Insightful) by VLM on Friday January 27 2023, @01:40PM (5 children)

      by VLM (445) on Friday January 27 2023, @01:40PM (#1288904)

      There shouldn't be a procedure to manually delete files at a command line or whatever nonsense. That process should be a small shell script that can be reviewed and tested on the dev and later test cluster before hitting prod. Assuming they're not cowboying it and they even have development and testing infrastructure. Assuming they have any sort of code review or testing procedure at all.
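      As a rough illustration of what a reviewable cleanup script could look like (written in Python here rather than shell), the sketch below separates planning from doing, defaults to a dry run, and refuses to touch anything on a protected list. The function names, the glob pattern, and the protected-file list are all hypothetical; this is not the FAA's or its contractors' tooling.

```python
from pathlib import Path
from typing import Iterable, List


def plan_deletions(root: Path, pattern: str, protected: Iterable[str]) -> List[Path]:
    """List files under root matching pattern; refuse if any are protected."""
    protected_names = set(protected)
    candidates = sorted(p for p in root.glob(pattern) if p.is_file())
    blocked = [p.name for p in candidates if p.name in protected_names]
    if blocked:
        # Fail loudly before anything is removed.
        raise RuntimeError(f"refusing to delete protected files: {blocked}")
    return candidates


def apply_deletions(paths: Iterable[Path], dry_run: bool = True) -> int:
    """Delete the planned files, or only report them when dry_run is True."""
    deleted = 0
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
            deleted += 1
    return deleted
```

      The same script can then be run with `dry_run=True` on dev and test, its output reviewed, and only afterwards run for real on prod.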

      Also critical systems should be self healing. Critical files should have a dependency. It's missing, well, OK, make another, possibly automatically.
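      A minimal sketch of that self-healing check, assuming the critical file can be regenerated from some other source of truth: `ensure_file` and the `rebuild` callback are hypothetical names, and whether the real files could be rebuilt this way is not known from the article.

```python
from pathlib import Path
from typing import Callable


def ensure_file(path: Path, rebuild: Callable[[], bytes]) -> bool:
    """Recreate path from rebuild() if it is missing.

    Returns True if the file had to be regenerated, False if it was present.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so readers never see a half-written file.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_bytes(rebuild())
    tmp.replace(path)
    return True
```

      A service could call this at startup and on a timer, logging every time it returns True so that someone notices the file went missing in the first place.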

      Also critical systems should have a backup system, something involving VM snapshots and filesystem rollbacks, not "a day's work to restore from tape". Or "git reset --hard" or similar. Or, more likely, "docker rm wtf" and rerun the deploy script, or something fancier (docker-compose templates, docker-swarm, k8s, etc.).

      This is a management problem not a technical one; all this stuff is ancient, older than me.

      • (Score: 2) by DannyB on Friday January 27 2023, @03:26PM

        by DannyB (5839) on Friday January 27 2023, @03:26PM (#1288931)

        HAL: Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.

        "It is now safe to turn off your computer." -- HAL 9000

        --
        The lower I set my standards the more accomplishments I have.
      • (Score: 2) by PiMuNu on Friday January 27 2023, @03:27PM

        by PiMuNu (3823) on Friday January 27 2023, @03:27PM (#1288932)

        I had understood that the deleted files were user data i.e. flight plans. But on re-reading I realise it is unclear.

      • (Score: 0) by Anonymous Coward on Friday January 27 2023, @05:37PM (1 child)

        by Anonymous Coward on Friday January 27 2023, @05:37PM (#1288951)

        > This is a management problem not a technical one; all this stuff is ancient, older than me.

        Herein lies the death knell of every complex organization, even small labs. Once the guys who originated the system stop paying attention to the details - and promotions virtually force them out - it's the second generation in charge of operations. They were at least taught by the first generation. The third generation were taught by the second generation. And so on.

        The passion for doing "real work" is diminished on each generation, and filled with drudgery instead (see e.g. Procedural Manuals). The organization is too complex to be maintained with 1/2-ass clock punchers but the drive that motivates the original designers is completely squashed out of the role. Decrepitude approaches.

        • (Score: 3, Insightful) by Joe Desertrat on Saturday January 28 2023, @12:27AM

          by Joe Desertrat (2454) on Saturday January 28 2023, @12:27AM (#1289013)

          > The passion for doing "real work" is diminished on each generation, and filled with drudgery instead (see e.g. Procedural Manuals). The organization is too complex to be maintained with 1/2-ass clock punchers but the drive that motivates the original designers is completely squashed out of the role. Decrepitude approaches.

          And when the generation doing the work is outsourced employees, the likelihood of that drive being present is close to zero. Why are critical systems being handled by contractors?

      • (Score: 2) by legont on Saturday January 28 2023, @05:26AM

        by legont (4179) on Saturday January 28 2023, @05:26AM (#1289044)

        You forgot Indians. They are probably innovation over there as well.

        --
        "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
  • (Score: 2) by legont on Saturday January 28 2023, @05:35AM

    by legont (4179) on Saturday January 28 2023, @05:35AM (#1289045)

    It's perhaps interesting to know that no computer-provided NOTAM information is legal to use. As per the FAA, an airman has to call - yes, phone - the FAA and request a brief. Only that info is considered true, all computers be damned. Note that violating a NOTAM is a very, very serious offense.

    --
    "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.