
SoylentNews is people

posted by Fnord666 on Friday May 03 2019, @12:39PM
from the living-on-the-edge dept.

What is 768K Day, and Will It Cause Internet Outages?

You might have heard about a topic that's gaining attention in industry discussions (such as the recent ZDnet.com article): an event that could cause significant disruptions across the Internet, the so-called "768k Day". This is the point in the near future (some speculate in the coming month) when the size of the global BGP[*] routing table is expected to exceed 768,000 entries. Why is this a big deal?

In 2014, on what we now know as "512k Day", the IPv4 Internet routing table exceeded 512,000 BGP routes when Verizon advertised thousands more routes to the Internet. Many ISPs and other organizations had provisioned their router TCAMs for a limit of 512K route entries, and some older routers suffered memory overflows that caused their CPUs to crash. The crashes on old routers, in turn, created significant packet loss and traffic outages across the Internet, even in some large provider networks. Engineers and network administrators scrambled to apply emergency firmware patches that raised the limit. In many cases, the new upper limit was 768k entries.
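To make the overflow concrete, here is a toy sketch of a route table with a hard capacity. The class and function names are made up for illustration, integers stand in for prefixes, and "768k" is assumed to mean 768 × 1024 entries. What a real router does when its TCAM fills up varies by platform; this only counts the routes that would not fit.

```python
class FixedCapacityTable:
    """Toy stand-in for a TCAM partition with a hard route limit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.routes = set()

    def install(self, prefix):
        """Return True if the prefix is in the table, False if it was rejected."""
        if prefix in self.routes:
            return True
        if len(self.routes) >= self.capacity:
            return False  # table is full; the route does not fit
        self.routes.add(prefix)
        return True


def count_overflow(capacity, table_size):
    """Feed table_size distinct routes into a table and count the rejects."""
    table = FixedCapacityTable(capacity)
    rejected = 0
    for prefix in range(table_size):  # integers as placeholder prefixes
        if not table.install(prefix):
            rejected += 1
    return rejected


LIMIT_768K = 768 * 1024  # assumption: "768k" read as 768 * 1024
overflow = count_overflow(LIMIT_768K, LIMIT_768K + 5000)
```

With a table 5000 routes larger than the limit, `overflow` is 5000: everything past the cap is left out, and traffic for those prefixes has nowhere to go.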

[...] Fast forward five years, and the upcoming 768k Day is an echo of 512k Day, just with a higher threshold. So, some are worried that the Internet could have similar problems.

[...] while nobody's exactly hyperventilating about 768k day, there are still a lot of smaller ISPs, data centers and other providers who are part of the fabric of the Internet. When you look at Internet paths, a good amount of service traffic transits through these 'soft spots' of Internet infrastructure, if you will—where maintenance on legacy routers and network equipment can be neglected or missed more easily. Given the sheer size and unregulated nature of the Internet, it's fair to say that things will be missed.

[*] BGP: Border Gateway Protocol.

There's something I do not understand. Why wait for things to fall over? Why not plan ahead and force the issue, temporarily, to flag the errant boxes? Pick a fixed date/time to publish, say, 5k new routes, leave them up for 30 minutes to an hour or so, and then roll back the changes.
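As a rough sketch of what such a drill would flag, the check comes down to simple arithmetic: current table size plus the temporary routes, compared against each box's limit. The function name, the inventory, and the route counts below are all made up for illustration; they are not measurements.

```python
def boxes_at_risk(current_routes, extra_routes, limits):
    """Return the names of boxes whose route limit the drill would exceed."""
    peak = current_routes + extra_routes
    return sorted(name for name, limit in limits.items() if peak > limit)


# Hypothetical inventory: box name -> configured route limit.
inventory = {
    "edge-old": 768 * 1024,
    "edge-new": 1024 * 1024,
    "lab-legacy": 512 * 1024,
}

# Hypothetical figures: 783,000 routes today plus a 5k test announcement.
flagged = boxes_at_risk(current_routes=783_000, extra_routes=5_000, limits=inventory)
```

Here `flagged` comes out as `["edge-old", "lab-legacy"]`, which is the list you would want to have in hand before the real table gets there on its own.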

I mean, who wouldn't prefer to just wait for the limit to be hit all by itself... at noon (UTC), so that all of Europe is up-and-at-it, and the whole USA from west coast to east is along for the ride, too? Might as well include South America and Africa while you are at it. Or maybe 12 hours later, when all those sysadmins are going to get roused from their sleep, never mind the people in Asia who are in the midst of their workday.

Alternatively, imagine a time when all of Asia is flying right along and then things come to a screeching stop.



 
  • (Score: 2, Informative) by sjames on Saturday May 04 2019, @12:06AM (2 children)

    by sjames (2882) on Saturday May 04 2019, @12:06AM (#838632) Journal

    Smaller services really should set up default route(s) just in case BGP craps out. At least that way they don't go hard down.

  • (Score: 2, Insightful) by Anonymous Coward on Saturday May 04 2019, @05:41AM (1 child)

    by Anonymous Coward on Saturday May 04 2019, @05:41AM (#838727)

    Who modded this informative?

    Think it through. Pretend it's traffic in NYC. Then imagine telling even 1/10th of it to all take the same route. Instant gridlock.

    Routes aren't over-provisioned; the same would happen. Better to shut it down than shunt a stream that will overload everything you point it at (and when one of these fallbacks fails and *its* traffic falls back into the remaining pool, it amplifies their loads... do you see yet?).

    • (Score: 3, Interesting) by sjames on Saturday May 04 2019, @08:54PM

      by sjames (2882) on Saturday May 04 2019, @08:54PM (#838965) Journal

      What part of "small provider" was unclear?

      Small providers don't tend to have a lot of uplinks, just a few. If they set up a handful of default routes (using policy routing), they end up with sorta balanced traffic if BGP fails. It's not as good as the BGP routing would do, and performance will suffer, but it's better than hard down.
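One way to picture the fallback described above, as a sketch only: when no BGP-learned next hop is available, pin each source address to one of a few static default uplinks by hashing it. The uplink names and both function names are hypothetical, and real policy routing lives in router configuration rather than in Python.

```python
import hashlib
import ipaddress


def pick_uplink(src_ip, uplinks):
    """Map a source address to one uplink, the same one every time."""
    packed = ipaddress.ip_address(src_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:4], "big") % len(uplinks)
    return uplinks[index]


def choose_next_hop(src_ip, bgp_next_hop, uplinks):
    """Prefer the BGP-learned next hop; fall back to a static default."""
    if bgp_next_hop is not None:
        return bgp_next_hop
    return pick_uplink(src_ip, uplinks)


# Hypothetical uplinks for a small provider.
UPLINKS = ["uplink-a", "uplink-b", "uplink-c"]

# With BGP down (no learned next hop), sources spread across the defaults.
spread = {}
for host in range(1, 251):
    hop = choose_next_hop(f"192.0.2.{host}", None, UPLINKS)
    spread[hop] = spread.get(hop, 0) + 1
```

The split in `spread` is only roughly even and ignores how much traffic each source sends, which matches the "sorta balanced" caveat: worse than what BGP would pick, but not hard down.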