What is 768K Day, and Will It Cause Internet Outages?
You might have heard about a topic gaining attention in industry discussions (such as in a recent ZDNet.com article): an event that could potentially cause significant disruptions across the Internet, the so-called "768k Day". This is the point, sometime in the near future (some speculate within the coming month), when the size of the global BGP[*] routing table is expected to exceed 768,000 entries. Why is this a big deal?
In 2014, on what we now know as "512k Day", the IPv4 Internet routing table exceeded 512,000 BGP routes when Verizon advertised thousands of additional routes to the Internet. Many ISPs and other organizations had provisioned their routers' TCAM memory with a limit of 512k route entries, and some older routers suffered memory overflows that crashed them. Those crashes, in turn, created significant packet loss and traffic outages across the Internet, even in some large provider networks. Engineers and network administrators scrambled to apply emergency firmware patches that raised the limit. In many cases, that new upper limit was 768k entries.
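The failure mode above can be sketched in a few lines. This is a toy illustration, not any vendor's actual behavior: the names, numbers, and the "overflow" accounting are assumptions for demonstration only. The point is simply that a fixed hardware table size leaves routes beyond the limit unprogrammable.

```python
# Illustrative sketch of a fixed-size FIB/TCAM (not real router firmware).
# Routes beyond the provisioned limit cannot be installed in hardware,
# which is roughly what bit under-provisioned routers on 512k Day.

TCAM_LIMIT = 512_000  # the ceiling many routers shipped with before 2014


def routes_at_risk(table_size: int, limit: int = TCAM_LIMIT) -> int:
    """How many routes overflow the provisioned TCAM space."""
    return max(0, table_size - limit)


# Crossing the old 512k ceiling leaves routes uninstallable:
print(routes_at_risk(512_500))                  # 500 routes overflow
# After the emergency patches raised the ceiling to 768k:
print(routes_at_risk(512_500, limit=768_000))   # 0 routes overflow
```

The same arithmetic is why 768k Day is an echo of 512k Day: the new ceiling many operators patched in back then is the one the table is approaching now.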
[...] Fast forward five years, and the upcoming 768k Day is an echo of 512k Day, just with a higher threshold. So some are worried that the Internet could see similar problems.
[...] while nobody's exactly hyperventilating about 768k Day, there are still a lot of smaller ISPs, data centers, and other providers who are part of the fabric of the Internet. When you look at Internet paths, a good amount of service traffic transits through these "soft spots" of Internet infrastructure, if you will, where maintenance on legacy routers and network equipment can be neglected or missed more easily. Given the sheer size and unregulated nature of the Internet, it's fair to say that things will be missed.
[*] BGP: Border Gateway Protocol.
There's something I do not understand. Why wait for things to fall over? Why not plan ahead and force the issue, temporarily, to flag the errant boxes? Pick a fixed date/time to publish, say, 5k new routes, leave them up for 30 minutes to an hour or so... and then roll back the changes.
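The proposed drill could be sketched as a small planning calculation. Everything here is hypothetical: the threshold, probe size, and hold time come from the comment above, and the function names are made up for illustration.

```python
# Hedged sketch of the commenter's idea: announce a batch of probe routes
# at a scheduled time to push the global table past the threshold, hold,
# then withdraw them. This only models the plan's arithmetic.

from datetime import timedelta

THRESHOLD = 768_000  # the table size expected to trip under-provisioned routers


def plan_probe(current_size: int,
               probe_routes: int = 5_000,
               hold: timedelta = timedelta(minutes=30)) -> dict:
    """Check whether a temporary probe announcement would cross the threshold."""
    peak = current_size + probe_routes
    return {
        "peak_table_size": peak,
        "crosses_threshold": peak > THRESHOLD,  # would this flag errant boxes?
        "rollback_after": hold,                 # withdraw the probe routes
    }


result = plan_probe(current_size=765_500)
print(result["peak_table_size"])    # 770,500 entries at peak
print(result["crosses_threshold"])  # True: fragile routers would surface now
```

The appeal of the idea is picking the "when": a scheduled, reversible crossing while staff are awake, instead of an organic one at whatever hour the table happens to grow past the line.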
I mean, who wouldn't prefer to just wait for the limit to be hit all by itself... at noon UTC, so that all of Europe is up-and-at-it and the whole USA, west coast to east, is along for the ride too. Might as well include South America and Africa while you're at it. Or maybe 12 hours later, when all those sysadmins get roused from their sleep, never mind the people in Asia who are in the midst of their workday.
Alternatively, imagine a time when all of Asia is flying right along and then things come to a screeching stop.
(Score: 2, Interesting) by maggotbrain on Friday May 03 2019, @06:00PM (1 child)
The CIDR Report (https://www.cidr-report.org/as2.0/) has been around since almost the NSFNet days (pre-1996) and was created by some of the brightest minds of the early Internet (Tony Bates and Geoff Huston, who wrote numerous RFCs and served on IETF boards, ARIN, APNIC, etc.).
But somehow we are supposed to trust a Twitter bot that was created in 2014 by some rando CCIE at Google? If you look up Geoff Huston's name, you will, frankly, be a bit overwhelmed by the reports and discussions on the methodology behind the CIDR Report's data collection.
A quick look turns up no data on the Twitter bot's methods. Why should I trust its numbers? Granted, as the article correctly states, most places are filtering their BGP feeds and are not taking full tables from their upstream providers. Frankly, I've trusted the CIDR Report's numbers since ~1997-ish, and I was around when we were having to mitigate routing table bloat as the number of entries approached 100k.
I guess I am not really questioning the foreseeable impact that ZDNet and its author claim. Their assessment seems mostly relevant-ish and accurate. I am simply questioning why they felt the need to make such baseless claims without verification and hand-wavingly suggest one source of numbers was more valid than another. I would just like to see the numbers behind their vague preference for one source over the other.
*shrug* ZDNet can get off my freakin' lawn. Time for some coffee.
(Score: 0) by Anonymous Coward on Saturday May 04 2019, @02:05AM
If you look at your report and normalize the data, it looks like you are pretty close to the lower range. I was surprised how large the bogons and specific allocations were, and there were a handful of duplicate or redundant entries. I didn't run all the numbers, but my estimate based on the sample I used is just under 4,000 entries taken off of the CIDR Report's count. So, depending on how the table is actually stored, that could make a big enough difference.