posted by Fnord666 on Friday May 03 2019, @12:39PM
from the living-on-the-edge dept.

What is 768K Day, and Will It Cause Internet Outages?

You might have heard about a topic that's gaining attention in industry discussions (such as a recent ZDNet article) about an event that could potentially cause significant disruptions across the Internet: the so-called "768k Day". This is the point, expected sometime in the near future (some speculate within the coming month), when the size of the global BGP[*] routing table exceeds 768,000 entries. Why is this a big deal?

In 2014, on what we now know as "512k Day", the IPv4 Internet routing table exceeded 512,000 BGP routes when Verizon advertised thousands of additional routes to the Internet. Many ISPs and other organizations had provisioned their routers' TCAM memory for a limit of 512K route entries, and some older routers suffered memory overflows that caused them to crash. The crashes on old routers, in turn, created significant packet loss and traffic outages across the Internet, even in some large provider networks. Engineers and network administrators scrambled to apply emergency firmware patches to raise the route limit; in many cases, the new upper limit was 768k entries.
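For a sense of what operators actually watch here, below is a minimal monitoring sketch (not from the article; the 768,000-entry ceiling, the 95% warning threshold, and the idea of feeding it a route count by hand are all illustrative assumptions) that compares a route count against a platform's TCAM/FIB limit and warns as the table approaches it.

# Minimal sketch: warn when the global BGP table approaches a platform's
# TCAM/FIB route limit. The limit, the warning threshold, and how the
# current count is obtained are illustrative assumptions, not from the article.

def check_fib_headroom(current_routes: int,
                       fib_limit: int = 768_000,
                       warn_fraction: float = 0.95) -> str:
    """Return a short status string for the given route count."""
    usage = current_routes / fib_limit
    if current_routes >= fib_limit:
        return f"CRITICAL: {current_routes} routes >= limit {fib_limit} (table overflow)"
    if usage >= warn_fraction:
        return f"WARNING: {current_routes} routes is {usage:.1%} of limit {fib_limit}"
    return f"OK: {current_routes} routes ({usage:.1%} of limit {fib_limit})"

if __name__ == "__main__":
    # 767,392 is the BGP4-Table figure quoted further down in the discussion.
    print(check_fib_headroom(767_392))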

[...] Fast forward five years, and the upcoming 768k Day is an echo of 512k Day, just with a higher threshold. So, some are worried that the Internet could have similar problems.

[...] while nobody's exactly hyperventilating about 768k day, there are still a lot of smaller ISPs, data centers and other providers who are part of the fabric of the Internet. When you look at Internet paths, a good amount of service traffic transits through these 'soft spots' of Internet infrastructure, if you will—where maintenance on legacy routers and network equipment can be neglected or missed more easily. Given the sheer size and unregulated nature of the Internet, it's fair to say that things will be missed.

[*] BGP: Border Gateway Protocol.

There's something I do not understand. Why wait for things to fall over? Why not plan ahead and force the issue, temporarily, to flag the errant boxes? Pick a fixed date/time to publish, say, 5k new routes, leave them up for, say, 30 minutes to an hour or so... and then roll back the changes.
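As a rough illustration of what such a coordinated test could look like (purely a sketch: ExaBGP is one common tool for injecting routes from a script, but the placeholder prefix block, the 5,000-route count, the 30-minute hold, and the peer setup are all assumptions, not a tested procedure), a helper process could announce a batch of test prefixes and withdraw them again after a fixed window:

#!/usr/bin/env python3
# Sketch of a scheduled "push the table over the line" test, in the spirit of
# the suggestion above. Written as an ExaBGP helper process: ExaBGP reads the
# announce/withdraw commands this script prints to stdout and speaks BGP to
# the configured peers. The prefix block (RFC 1918 space, purely a
# placeholder), the route count, and the hold time are assumptions; a real
# test would use address space you are authorized to originate and would be
# coordinated with peers in advance.

import ipaddress
import sys
import time

TEST_AGGREGATE = ipaddress.ip_network("10.128.0.0/11")  # placeholder block
ROUTE_COUNT = 5_000
HOLD_SECONDS = 30 * 60  # leave the routes up for roughly half an hour

def emit(command: str) -> None:
    """Hand one command to ExaBGP via stdout."""
    sys.stdout.write(command + "\n")
    sys.stdout.flush()

def main() -> None:
    # Carve 5,000 test /24s out of the placeholder aggregate.
    test_prefixes = list(TEST_AGGREGATE.subnets(new_prefix=24))[:ROUTE_COUNT]

    for prefix in test_prefixes:
        emit(f"announce route {prefix} next-hop self")

    time.sleep(HOLD_SECONDS)

    for prefix in test_prefixes:
        emit(f"withdraw route {prefix} next-hop self")

    # Keep the process alive so ExaBGP does not restart it or tear the session down.
    while True:
        time.sleep(60)

if __name__ == "__main__":
    main()

The point would just be the shape of the exercise: announce, hold, roll back, and see which boxes misbehave while everyone is awake and watching.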

I mean, who wouldn't prefer to just wait for the limit to occur all by itself... at noon (UTC), so that all of Europe is up-and-at-it, and the whole USA from west coast to east is along for the ride, too. Might as well include South America and Africa while you are at it. Or maybe 12 hours later, when all those sysadmins are going to get roused from their sleep, never mind the people in Asia who are in the midst of their workday.

Alternatively, imagine a time when all of Asia is flying right along and then things come to a screeching stop.


Original Submission

 
  • (Score: 2, Interesting) by maggotbrain on Friday May 03 2019, @06:00PM (1 child)

    by maggotbrain (6063) on Friday May 03 2019, @06:00PM (#838486)
    I have a bit of a quibble with the article's vague hand waving around the routing table number sources. (I know, it's ZDnet.... :-/). Let me get this straight. In the article, the author states:

    CIDR Report, a website that keeps track of the global BGP routing table, puts the size of this file at 773,480 entries; however, their version of the table isn't official and contains some duplicates.

    The CIDR Report (https://www.cidr-report.org/as2.0/ [cidr-report.org]) has been around since almost the NSFNet days (pre-1996) and was created by some of the brightest minds of the early Internet (Tony Bates and Geoff Huston, who wrote numerous RFCs and served on the IETF boards, ARIN, APNIC, etc.).

    But somehow we are supposed to trust a Twitter bot that was created in 2014 by some rando CCIE at Google? If you try to look up Geoff Huston's name, you will, frankly, be a bit overwhelmed by the reports and discussions on methodology for the CIDR Report's data collection.

    A Twitter bot named BGP4-Table, which has also been tracking the size of the global BGP routing table in anticipation of 768K Day, puts the actual size of the file at 767,392, just a hair away from overflowing.

    A quick look indicates there is no data on the Twitter bot's methods. Why should I trust its numbers? Granted, as the article correctly states, most places are filtering their BGP feeds and are not taking full tables from their upstream provider. Frankly, I've trusted the CIDR Report's numbers since ~1997-ish, and I was around when we were having to mitigate against routing table bloat as the number of entries approached 100k.
    I guess I am not really questioning the foreseeable impact that ZDNet and its author claim. Their assessment seems mostly relevant and accurate. I am simply questioning why they felt the need to make such baseless claims without verification and hand-wavingly suggest that one source of numbers was more valid than another. I would just like to see the numbers behind their vague preference for one source over the other.

    *shrug* ZDNet can get off my freakin' lawn. Time for some coffee.

  • (Score: 0) by Anonymous Coward on Saturday May 04 2019, @02:05AM

    by Anonymous Coward on Saturday May 04 2019, @02:05AM (#838669)

    If you look at your report and normalize the data, it looks like you end up pretty close to the lower figure. I was surprised how large the bogons and specific allocations were, and there were a handful of duplicate or redundant entries. I didn't run all the numbers, but my estimate, based on the sample I used, is just under 4,000 entries taken off the CIDR Report total. So, depending on how the table is actually stored, that could make a big enough difference.
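    For reference, a minimal sketch of that kind of normalization (the one-prefix-per-line input format, and treating "duplicates" as exact repeats plus overlapping or adjacent more-specifics, are assumptions on my part), using Python's standard ipaddress module:

    # Sketch: read a list of IPv4 prefixes (one per line, e.g. dumped from a
    # BGP table or the CIDR Report data; the file format is an assumption),
    # drop exact duplicates, then collapse overlapping/adjacent prefixes to
    # see how much the entry count shrinks.

    import ipaddress
    import sys

    def load_prefixes(path: str) -> list:
        prefixes = []
        with open(path) as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                prefixes.append(ipaddress.ip_network(line, strict=False))
        return prefixes

    def main(path: str) -> None:
        raw = load_prefixes(path)
        unique = set(raw)                                       # exact duplicates
        collapsed = list(ipaddress.collapse_addresses(unique))  # covered/adjacent prefixes
        print(f"raw entries:       {len(raw)}")
        print(f"unique entries:    {len(unique)}")
        print(f"after aggregation: {len(collapsed)}")

    if __name__ == "__main__":
        main(sys.argv[1])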