
SoylentNews is people

posted by Fnord666 on Friday May 03 2019, @12:39PM
from the living-on-the-edge dept.

What is 768K Day, and Will It Cause Internet Outages?

You might have heard about a topic that's gaining attention in industry discussions (such as in the recent ZDNet.com article): an event that could potentially cause significant disruptions across the Internet, the so-called "768k Day". This is the point in the near future (some speculate in the coming month) when the size of the global BGP[*] routing table is expected to exceed 768,000 entries. Why is this a big deal?

In 2014, on what we now know as "512k Day", the IPv4 Internet routing table exceeded 512,000 BGP routes when Verizon advertised thousands more routes to the Internet. Many ISPs and other organizations had provisioned their router TCAMs for a limit of 512K route entries, and some older routers suffered memory overflows that caused their CPUs to crash. The crashes on old routers, in turn, created significant packet loss and traffic outages across the Internet, even in some large provider networks. Engineers and network administrators scrambled to apply emergency firmware patches that raised the limit. In many cases, the new upper limit was 768k entries.

[...] Fast forward five years, and the upcoming 768k Day is an echo of 512k Day, just with a higher threshold. So, some are worried that the Internet could have similar problems.

[...] while nobody's exactly hyperventilating about 768k day, there are still a lot of smaller ISPs, data centers and other providers who are part of the fabric of the Internet. When you look at Internet paths, a good amount of service traffic transits through these 'soft spots' of Internet infrastructure, if you will—where maintenance on legacy routers and network equipment can be neglected or missed more easily. Given the sheer size and unregulated nature of the Internet, it's fair to say that things will be missed.

[*] BGP: Border Gateway Protocol.

There's something I do not understand. Why wait for things to fall over? Why not plan ahead and force the issue, temporarily, to flag the errant boxes? Pick a fixed date and time to publish, say, 5k new routes, leave them up for 30 minutes to an hour or so, and then roll back the changes.
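To make the idea concrete, here is a rough sketch of the arithmetic behind such a planned test. Everything in it is made up for illustration: the router names, their provisioned limits, and the assumed current table size are hypothetical, and the 768,000 figure is simply the threshold quoted above. It only shows which boxes a deliberate 5k-route burst would push over their limit.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str          # hypothetical router name
    route_limit: int   # routes the forwarding table is provisioned for (assumed)


def flag_at_risk(routers, table_size, test_routes=5_000):
    """Return the names of routers whose provisioned limit is smaller than
    the table size after temporarily announcing `test_routes` extra routes."""
    projected = table_size + test_routes
    return [r.name for r in routers if r.route_limit < projected]


# Illustrative inventory, not real data.
routers = [
    Router("edge-old", 512_000),      # never patched after 512k Day
    Router("edge-patched", 768_000),  # raised to the 768k limit in 2014
    Router("core-new", 1_000_000),    # provisioned with more headroom
]

# Assumed current table size, just under the 768,000 threshold.
at_risk = flag_at_risk(routers, table_size=765_000)
print(at_risk)  # ['edge-old', 'edge-patched']
```

With a scheduled burst, the flagged boxes would fall over at a known time and recover after the rollback, rather than at whatever hour organic growth crosses the line.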

I mean, who wouldn't prefer to just wait for the limit to occur all by itself... at noon (UTC), so that all of Europe is up-and-at-it, and the whole USA from west coast to east is along for the ride, too. Might as well include South America and Africa while you are at it. Or maybe 12 hours later, when all those sysadmins are going to get roused from their sleep, never mind the people in Asia who are in the midst of their workday.

Alternatively, imagine a time when all of Asia is flying right along and then things come to a screeching stop.



 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 5, Insightful) by bradley13 on Friday May 03 2019, @01:36PM (1 child)

    by bradley13 (3053) on Friday May 03 2019, @01:36PM (#838385) Homepage Journal

    There are sadly very few companies who see the truth: smoothly running IT means you have good people who are preventing problems and enabling your business to run without unnecessary outages and disturbances.

    There are too many companies where - when things run smoothly - the bosses assume that means that the IT department has nothing to do. Time to reduce funding, or outsource for savings. Usually the infrastructure is in good enough shape for the CIO/CTO to collect a nice bonus for a year or two, based on the reduced costs, before things fall apart. By which time they have moved on, and a new CIO/CTO comes in, rebuilds the IT department, and collects their bonus for fixing things that should never have been broken. Rinse and repeat.

    On the gripping hand, you also get the anal-retentive IT departments who seem to think "keeping things running" means denying every possible user request. Servers run better when clients can't use them for anything. And client machines run better when users can't install the software they need. These IT departments also have smoothly running infrastructure, but it's kind of useless.

    --
    Everyone is somebody else's weirdo.
  • (Score: 2) by sjames on Friday May 03 2019, @05:27PM

    by sjames (2882) on Friday May 03 2019, @05:27PM (#838464) Journal

    Sometimes the latter are also a symptom of management that wants to cut budgets to the bone. If they admit that some additional service is possible, it would mean (to management) that they might have an excess resource hidden away somewhere that needs cutting. Or perhaps they really don't have the resources to spare and don't believe they'll get those resources if they allow actual need to increase.

    After all, even if the entire sales department would go idle instantly if they stopped doing their jobs, they are a "cost center", just like the guy that polishes the CEO's golf clubs before his busy day of racking up triple bogeys.