SoylentNews is people

posted by Woods on Wednesday August 13 2014, @08:24PM
from the oops dept.

Due to a new set of routes published yesterday, the internet has effectively undergone a schism. All routers with a TCAM allocation of 512k routes (or fewer), in particular the Cisco Catalyst 6500 and 7600 series, have started randomly forgetting portions of the internet.

"Cisco also warned its customers in May that this BGP problem was coming and that, in particular, a number of routers and networking products would be affected. There are workarounds, and, of course the equipment could have been replaced. But, in all too many cases this was not done. Unfortunately, we can expect more hiccups on the Internet as ISPs continue to deal with the BGP problem," says Steven J. Vaughan-Nichols of ZDNet.

Is it time to switch to all IPv6 yet?

Earlier today there was a hiccup in the Internet as many routers started "flapping". BGPmon.net tracks the BGP activities on the global Internet, and came up with the following analysis:

Folks quickly started to speculate that this might be related to a known default limitation in older Cisco routers. These routers have a default limit of 512K routing entries in their TCAM memory. [...] Right now the number of prefixes is still several thousands under the 512,000 limit so it shouldn't be an issue. However when we take a closer look at our BGP telemetry we see that starting at 07:48 UTC about 15,000 new prefixes were introduced into the global routing table.

Whatever happened internally at Verizon caused aggregation for their prefixes to fail which resulted in the introduction of thousands of new /24 routes into the global routing table. This caused the routing table to temporarily reach 515,000 prefixes and that caused issues for older Cisco routers.
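The arithmetic behind that overflow can be sketched in a few lines of Python. This is only an illustration using the figures quoted above (about 15,000 new prefixes, a peak of 515,000, a limit of 512,000); the 10.0.0.0/16 aggregate and the function names are made up for the example and have nothing to do with Verizon's actual address space.

```python
import ipaddress

ROUTE_LIMIT = 512_000  # limit as quoted in the story


def deaggregate(prefix, new_prefix=24):
    """Split one aggregate into the more-specific routes it covers."""
    return list(ipaddress.ip_network(prefix).subnets(new_prefix=new_prefix))


def entries_over_limit(table_size, limit=ROUTE_LIMIT):
    """How many routes no longer fit once the table exceeds the limit."""
    return max(0, table_size - limit)


# Hypothetical aggregate: a single /16 turns into 256 separate /24 routes.
specifics = deaggregate("10.0.0.0/16")
print(len(specifics))

# Figures from the analysis: ~15,000 new prefixes pushed the table to ~515,000.
table_before = 515_000 - 15_000
table_during = table_before + 15_000
print(entries_over_limit(table_before), entries_over_limit(table_during))
```

With those numbers the table fits before the event and is about 3,000 routes over the limit during it.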

Luckily Verizon quickly solved the de-aggregation problem, so we're good for now. However the Internet routing table will continue to grow organically and we will reach the 512,000 limit again soon. The good news is that there's a solution for those operating these older Cisco routers. The 512,000 route limitation can be increased to a higher number, for details see this Cisco doc.

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Insightful) by cafebabe on Wednesday August 13 2014, @08:52PM

    by cafebabe (894) on Wednesday August 13 2014, @08:52PM (#80966) Journal

    I was under the impression that Cisco had a patent for maintaining compressed routing tables. So, this 2^n limit may be artificial on some equipment.

    I was also under the impression that some parties published routes for the smallest possible range of network addresses to minimize the problem with route stealing, as occurred with Iran's objection to lewdness on YouTube.

    If this is all true then it is an unfortunate combination.

    --
    1702845791×2
  • (Score: 3, Insightful) by frojack on Wednesday August 13 2014, @08:55PM

    by frojack (1554) on Wednesday August 13 2014, @08:55PM (#80967) Journal

    All it takes is one ill-trained person at some obscure ISP to induce routing errors that could bring down the net by table flooding, route flapping, or null routing huge portions of the net.

    This is something that is terribly fragile (still). It's an attack vector that very few are watching, and every time it happens everybody talks a great deal and does just about nothing. Almost all problems have been due to mistakes or ignorance, but that might not remain that way.

    There are entirely too many routes to be managed by BGP. It was never designed with Classless Inter-Domain Routing (CIDR) in mind, and it's been a bolt-on patch kludge since CIDR was adopted.

    (Oddly, CIDR was proposed as a method of controlling routing table size growth, but it has proven over time to do the exact opposite. It was proposed so that sixteen contiguous /24 networks can be aggregated and advertised to a larger network as a single /20 route. Problem is, very few contiguous /24s NEED to be routed that way, and more often numerically contiguous /8s are geographically scattered all over the globe, and their routes have to be advertised to every backbone router because nobody wants to handle any traffic but their own regardless of the IP block.)
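The sixteen-/24s-into-one-/20 case can be tried with Python's standard `ipaddress` module. This is a small sketch, not anything a router runs; the 10.20.0.0 block is an arbitrary example range. It also shows what happens when just one of the sixteen blocks has to be routed separately.

```python
import ipaddress

# Sixteen numerically contiguous /24 networks (example range only).
blocks = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(16)]

# If all sixteen share one path, they collapse into a single /20 route.
aggregated = list(ipaddress.collapse_addresses(blocks))
print(aggregated)

# If one /24 lives elsewhere, the remainder can no longer be one route.
elsewhere = ipaddress.ip_network("10.20.5.0/24")
remaining = [b for b in blocks if b != elsewhere]
partial = list(ipaddress.collapse_addresses(remaining))
print(partial)
```

In the second case the fifteen remaining blocks need four routes (a /22, a /24, a /23 and a /21), plus the one advertised from elsewhere, which is the scattering problem described above in miniature.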

    --
    No, you are mistaken. I've always had this sig.
    • (Score: 5, Interesting) by VLM on Wednesday August 13 2014, @09:16PM

      by VLM (445) Subscriber Badge on Wednesday August 13 2014, @09:16PM (#80972)

      Not entirely correct but not bad. I got to do BGP design/support at a regional ISP about a decade ago. Or more accurately that's about when I quit after doing it for quite a few years.

      Via the joy of route-dampening which everyone has implemented for the last 20 or so years, route flapping is quite automatically ... dampened. Flap and the router puts you on a time out list of naughty kids. To the annoyance of my customers who would do stupid stuff like flip their router on and off five times in fifteen minutes and then wonder why they fell off the net for an hour and why none of their upstreams such as myself could do anything whatsoever to help them get back online.
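The time-out behaviour described here can be sketched as a penalty that grows with each flap and decays exponentially, in the spirit of RFC 2439. All four constants below are assumptions chosen for illustration; real routers make them configurable and usually add a maximum suppress time, which this sketch leaves out.

```python
import math

# Assumed parameters for illustration; not taken from the comment above.
PENALTY_PER_FLAP = 1000
SUPPRESS_LIMIT = 2000   # at or above this the route is suppressed
REUSE_LIMIT = 750       # below this a suppressed route is readmitted
HALF_LIFE_MIN = 15.0    # penalty halves every 15 minutes


def decay(penalty, minutes):
    """Exponential decay of the accumulated penalty."""
    return penalty * 0.5 ** (minutes / HALF_LIFE_MIN)


def simulate(flap_times_min):
    """Return (penalty after last flap, suppressed?, minutes until reuse)."""
    penalty, last, suppressed = 0.0, None, False
    for t in flap_times_min:
        if last is not None:
            penalty = decay(penalty, t - last)
            if suppressed and penalty < REUSE_LIMIT:
                suppressed = False  # decayed enough to be readmitted
        penalty += PENALTY_PER_FLAP
        last = t
        if penalty >= SUPPRESS_LIMIT:
            suppressed = True
    wait = 0.0
    if suppressed:
        # Time for the penalty to decay from its current value to REUSE_LIMIT.
        wait = HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)
    return penalty, suppressed, wait


# Five flaps in fifteen minutes, as in the anecdote.
penalty, suppressed, wait = simulate([0, 3, 6, 9, 12])
print(round(penalty), suppressed, round(wait))
```

Under these assumed values the route is suppressed by the third flap and stays out for a bit over half an hour after the last one, and nothing the upstream does shortens that; the penalty simply has to decay.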

      As far as flooding and null routing most upstreams have their "moment" like ours in the 90s, some earlier some later, where you stop trusting your customers and extensively filter anything they advertise. Which is easy although a huge waste of cycles. Which means a default-free tier 1-ish ISP can still screw up, but they usually know what they're doing and we had limited filtering on them anyway. Go ahead, make my day, advertise a 0/0 in my general direction while I laugh at you (probably not good customer service, but...)

      Adding to the confusion, null routing is actually good. If we give you a /18, I kinda expect you to null route the entire /18, because you don't want to waste the bandwidth of playing packet tennis with traffic to a temporarily unused /28, and from modesty alone (oh, and that route dampening thing) you really don't want to air the dirty laundry of your internal OSPF meltdown by advertising it to the entire network. Then again, inevitably a customer has an ethernet switch port fail on the LAN side of their BGP-speaking router and blackholes their own space this way. There are always tradeoffs.
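Why a covering null route is harmless while the more-specific routes are up, and a black hole when they vanish, comes down to longest-prefix matching. The sketch below uses a plain list rather than a real forwarding table; the addresses and next-hop labels are invented for the example ("Null0" is just used here as a label for "discard").

```python
import ipaddress


def lookup(table, dst):
    """Longest-prefix match: the most specific covering route wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


table = [
    (ipaddress.ip_network("10.64.0.0/18"), "Null0"),           # covering discard route
    (ipaddress.ip_network("10.64.1.0/24"), "customer-lan-1"),  # in-use subnet
    (ipaddress.ip_network("10.64.2.16/28"), "customer-lan-2"), # in-use subnet
]

print(lookup(table, "10.64.1.5"))    # in-use space goes to the LAN
print(lookup(table, "10.64.40.1"))   # unused space is dropped locally

# If the LAN-side route disappears (say, a failed switch port), the same
# traffic now falls through to the discard route: a self-inflicted black hole.
degraded = [r for r in table if r[1] != "customer-lan-1"]
print(lookup(degraded, "10.64.1.5"))
```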

      CIDR helps a lot when ARIN gives you a /20 and you don't want to advertise 16 /24s. Aside from the whole class-ful address stuff which means ARIN is quite likely to give you a /20 that under the old fashioned classFUL notation can't be represented as anything smaller than a /8 or something. So yeah CIDR is a net win. NANOG mailing list has a weekly summary of routes that need summarization and people who ignore them.

      High end networking is cool other than the lack of jobs. So I haven't done stuff like this in a decade. Just no work. Nothing. There must be like 10 CCNP-ish level people with the skills for every individual CCNP-ish level job, which is a bummer. There's probably 10-25 guys doing something else who can wrangle BGP, for every BGP wranglin' job. Kinda sucks.

      You missed a good rant about BGP security. I got out of the biz around the time the great MD5 crisis was ending. Basically you can/could reset BGP connections by spoofing, unless you add an MD5 hash to each packet, which makes it about 2 to the power of 128 or whatever harder than just sending random TCP RST packets with spoofed source addresses. Resetting a connection would be pretty bad, yeah. Expecting operators to route in a civilized manner and not allow source address spoofing on their sh!tty networks is asking too much, so we'll paper over the problem by adding a shared-secret MD5 hash to the packets. What a load of BS.
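For reference, the shared-secret hash being complained about is the TCP MD5 signature option (RFC 2385). Below is a simplified sketch of the digest computation as that RFC describes it: pseudo-header, TCP header without options and with a zeroed checksum, segment data, then the key. The function names, addresses and the all-zero header are placeholders for illustration; a real implementation lives in the kernel's TCP stack, not in application code.

```python
import hashlib
import hmac
import socket
import struct

TCP_PROTOCOL = 6


def tcp_md5_digest(src_ip, dst_ip, segment_length, header_no_options, payload, key):
    """MD5 over pseudo-header, option-less TCP header, data, and shared key.

    The caller is assumed to pass a header whose checksum field is zeroed.
    """
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, TCP_PROTOCOL, segment_length)
    )
    return hashlib.md5(pseudo_header + header_no_options + payload + key).digest()


def verify(received_digest, *args):
    """Receiver side: recompute with the local key and compare."""
    return hmac.compare_digest(received_digest, tcp_md5_digest(*args))


header = bytes(20)  # placeholder 20-byte header, checksum already zero
segment = ("192.0.2.1", "192.0.2.2", len(header), header, b"")
digest = tcp_md5_digest(*segment, b"shared-secret")

print(verify(digest, *segment, b"shared-secret"))  # peer with the right key
print(verify(digest, *segment, b"wrong-guess"))    # spoofer without the key
```

A forged RST without the key has to guess a 16-byte digest, which is where the "2 to the power of 128" figure comes from.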

      Can't say I miss ISP work all that much.

      • (Score: 1) by CyprusBlue on Wednesday August 13 2014, @10:33PM

        by CyprusBlue (943) on Wednesday August 13 2014, @10:33PM (#81005)

        Minor correction: no one does dampening anymore.

      • (Score: 3, Interesting) by isostatic on Wednesday August 13 2014, @11:14PM

        by isostatic (365) on Wednesday August 13 2014, @11:14PM (#81021) Journal

        "High end networking is cool other than the lack of jobs. So I haven't done stuff like this in a decade. Just no work. Nothing. There must be like 10 CCNP-ish level people with the skills for every individual CCNP-ish level job, which is a bummer. There's probably 10-25 guys doing something else who can wrangle BGP, for every BGP wranglin' job. Kinda sucks."

        Start your own ISP :)

        I played with BGP on a course, haven't got a hope in hell of getting anywhere near a router running BGP (at least publicly), however having an understanding of how BGP works is very useful when you're trying to work out why you're getting issues. I have a nagios plugin that reports the AS path to my remote offices. I don't care if the traceroute changes from server A to server B, but if my packets start being routed via other ISPs I want to know about it. Had one interesting one where my ISP's routing to HK was putting a hard limit at 1 Mbit/s. Talking sweetly to my ISP to change their localpref for that target subnet to go via another route solved the problem, but without that background in BGP I wouldn't have a hope in hell of talking the same language.
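The plugin itself isn't shown, so the following is only a guess at the core of such a check: compare the AS path currently seen against a list of paths considered normal and map the result onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). How the observed path is obtained (a looking glass, a route server, traceroute with AS lookup) is left out, and the AS numbers are from the range reserved for documentation.

```python
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def check_as_path(observed, allowed_paths):
    """Return (exit_code, message) for one observed AS path."""
    if not observed:
        return UNKNOWN, "UNKNOWN - no AS path observed"
    path_text = " ".join(str(asn) for asn in observed)
    if tuple(observed) in {tuple(p) for p in allowed_paths}:
        return OK, f"OK - AS path {path_text}"
    return CRITICAL, f"CRITICAL - unexpected AS path {path_text}"


if __name__ == "__main__":
    # Hypothetical data: two acceptable paths to a remote office.
    allowed = [(64496, 64499), (64496, 64497, 64499)]
    observed = [64496, 64510, 64499]  # detour via another ISP
    code, message = check_as_path(observed, allowed)
    print(message)
    sys.exit(code)
```

Comparing whole paths rather than traceroute hops gives exactly the behaviour described: a hop changing inside the same ISP is ignored, a new AS in the path raises an alert.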

        I am advocating BGP internally to separate different regional departments. I don't want an OSPF meltdown in Glasgow to affect my routing from Belfast to Cardiff for example. OK you can put OSPF filters in place, but having a separation at a protocol level makes it easier to manage, and clearer where the boundaries are.

        • (Score: 2) by VLM on Sunday August 17 2014, @12:00PM

          by VLM (445) Subscriber Badge on Sunday August 17 2014, @12:00PM (#82259)

          "haven't got a hope in hell of getting anywhere near a router running BGP"

          Sorry for the late response but google for "zebra" "quagga" and BGP

          http://www.nongnu.org/quagga/ [nongnu.org]

          The exact genealogy of zebra and quagga is unclear but quagga is probably "where it's at" today. Find old PCs (you need at least a 486 realistically, a rasp pi is overkill and harder to add extra ethernet cards).

          "but having a separation at a protocol level makes it easier to manage,"

          The main advantage is OSPF table operations scale much worse than linearly with the number of nodes in the network and beyond about 100-200 routing devices used to be pretty unwise in practice with OSPF. Or more precisely more than 200 or so routes. Maybe with "modern technology" using 400 is perfectly fine today. Anyway segmenting giant OSPF networks is a pretty stereotypical BGP job. Also separation among administrative domains. So your group runs Europe the way you want and we run North America the way we want and our demarcation point is some transatlantic links which happen to speak BGP.

          Of course the problem with administrative domains is now you've solved the inter-domain routing problem while not solving the inter-domain address allocation problem or inter-domain firewalling problem.

  • (Score: 0) by Anonymous Coward on Wednesday August 13 2014, @09:26PM

    by Anonymous Coward on Wednesday August 13 2014, @09:26PM (#80974)

    512,000 seems like an artificial limit. Is the real limit 2^19 = 524288?

    • (Score: 2) by VLM on Wednesday August 13 2014, @09:57PM

      by VLM (445) Subscriber Badge on Wednesday August 13 2014, @09:57PM (#80987)

      "512,000 seems like an artificial limit. Is the real limit 2^19 = 524288?"

      Not exactly, it's a tree not a table. So you burn nodes on parent and branch tree, uh, limb-things, such that you get fewer end nodes than total. So, like, a tree data structure always has fewer leaves than total nodes in the structure because some nodes are tree-trunks not leaves. I'm sure there's a crappy SN car analogy buried in here somewhere.
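The nodes-versus-routes point is easy to see with a toy one-bit-per-level prefix trie. This is a deliberately naive sketch, IPv4 only and uncompressed; it makes no claim about how Cisco hardware actually lays out its tables, and the three prefixes are arbitrary examples.

```python
import ipaddress


class Node:
    """One trie node; children[0] / children[1] follow the next address bit."""
    __slots__ = ("children", "is_route")

    def __init__(self):
        self.children = [None, None]
        self.is_route = False


def build_trie(prefixes):
    root = Node()
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[: net.prefixlen]
        node = root
        for bit in bits:
            i = int(bit)
            if node.children[i] is None:
                node.children[i] = Node()
            node = node.children[i]
        node.is_route = True  # this node terminates an actual route
    return root


def count(node):
    """Return (total nodes, nodes that hold a route)."""
    total, routes = 1, int(node.is_route)
    for child in node.children:
        if child is not None:
            t, r = count(child)
            total += t
            routes += r
    return total, routes


total, routes = count(build_trie(["10.0.0.0/8", "10.1.0.0/16", "192.168.0.0/24"]))
print(total, routes)
```

Three routes cost 41 nodes here. Real implementations compress far better, but the principle holds: a structure with N slots stores fewer than N routes.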

      Or more precisely, you've actually got about a million slots and the default is to hard partition between ipv4 and ipv6. So if you get the default hard partition wrong, maybe too thin for ipv4 and too fat for ipv6, one is going to hit a limit. And then inside that hard partition of space you've got the tree I mention above. And (from memory) the tree doesn't actually hold destinations; it holds indexes to a table of destinations, so you gotta store that somewhere.

      Some of this Cisco stuff, even the decade old stuff I used to use, is darn near as complicated as a modern filesystem. There's a lot more to routing than just "make a big ole table of regex and hope for the best". How the Linux kernel routes, from a long time ago when I understood / cared about how the Linux kernel routes, is/was pretty crude compared to the more ... extreme ... ideas Cisco had.

      Also there's substantial hardware acceleration, much as you'd be surprised how much specialized hardware makes graphics faster. And with that you just stick in as many elements as the FPGA/ASIC can hold and it's probably not a power of 2 and maybe not even an even number, it's just kinda what fits, which is about a million in that old TCAM.

      • (Score: 2) by cafebabe on Wednesday August 13 2014, @10:40PM

        by cafebabe (894) on Wednesday August 13 2014, @10:40PM (#81009) Journal

        "I'm sure there's a crappy SN car analogy buried in here somewhere"

        There are more cars than car dealers.

        "There's a lot more to routing than just 'make a big ole table of regex and hope for the best'."

        Routing has to be done properly but guess how a Cisco acquisition does web filtering.

        --
        1702845791×2
      • (Score: 2) by kaszz on Thursday August 14 2014, @01:35AM

        by kaszz (4211) on Thursday August 14 2014, @01:35AM (#81051) Journal

        So the 2^19 limit is the limit of the number of nodes in the graph?

        And the performance limit is the time it takes to walk that graph?

        Acceleration, what kind of specific operation needs to be accelerated? load-and-compare or pointer walking?

        (perhaps the whole concept is flawed and mesh or P2P is the networking method of the future?)

  • (Score: 1, Insightful) by Anonymous Coward on Thursday August 14 2014, @12:27AM

    by Anonymous Coward on Thursday August 14 2014, @12:27AM (#81032)

    Come on, really? The other site posed a similar useless question in the summary without thinking it through. (Hey, it generates discussion!) This has nothing to do with if/when/should we move to IPv6.

    IPv6 won't make the issue of lack of memory in these old core routers go away; if anything, it'd make the issue worse. Either you'd be running 2 layer-3 protocol stacks in the same device, which clearly needs more memory, or you'd be running pure IPv6 on the device, and given that IPv6 addresses are larger than IPv4 addresses, and the address space is also much larger, you'd still run into such memory issues eventually once the routing table reaches such a size (I don't know how the size of the Internet IPv6 table currently compares to the IPv4 one).

    The issue is all these old devices are no longer up to the task because they don't have enough memory, or are configured with defaults which are no longer suitable for what is required to hold the size of the current routing table. At least the latter allows room for a workaround to fix the problem in the short term.

    • (Score: 2) by kaszz on Thursday August 14 2014, @01:31AM

      by kaszz (4211) on Thursday August 14 2014, @01:31AM (#81049) Journal

      Perhaps not all IPv6 bits need to be used in routing. At least initially?

      I'm also thinking if one could translate groups of addresses into a label. And then route that label, not the specific address. In order to keep sizes down.

      Regarding Cisco, have they not felt the competition from, say, Juniper etc.? And due to the piggyback-listen-R'-us, a lot of foreign companies should have a shot at competing?

    • (Score: 0) by Anonymous Coward on Thursday August 14 2014, @06:56AM

      by Anonymous Coward on Thursday August 14 2014, @06:56AM (#81112)

      Try the other way around. There is some old equipment that is now running out of space for IPv4, and needs to be replaced. And what is holding IPv6 back? Old equipment that doesn't support IPv6, and thus IPv6 won't happen until this old stuff is replaced.

  • (Score: 1) by knorthern knight on Thursday August 14 2014, @01:15AM

    by knorthern knight (967) on Thursday August 14 2014, @01:15AM (#81047)

    Over time, smaller ISPs grew, and needed, and acquired more IP addresses. They were usually scattered all over the address space. Has someone considered an effort to swap IP address space to aggregate smaller routes?

  • (Score: 3, Interesting) by velex on Thursday August 14 2014, @02:50AM

    by velex (2068) on Thursday August 14 2014, @02:50AM (#81065) Journal

    Totally drunk here, but I just wanted to say I read a similar article on the other site, and the commentary was completely full of shit. The commentary here, however, I must revisit once sober. I may have been affected by this issue, and I'd like to know more. There's some data center in Atlanta (.ga.us) that hosts a website for a client that is every now and then unreachable by one ISP but reachable by the backup ISP. As far as I can tell by the traceroute from my server in the clouds and tracert from my workstation, there's a router in that datacenter that's having problems, not that I think I have a chance in hell of contacting someone who can do something about it, if something about it may be done. Damnit, Jim, I'm a programmer, not a network administrator. The mitigation for the issue is in place, but I still want to know more. I've never done a study of BGP. Any links to introductory primers?

    It'd be an interesting exercise to set up a lab environment with VMs. That's how I learned the basics like ARP, DHCP, "active domains," etc. (Not to mention the practical experience that followed when I found myself being an IT consultant in a previous life.)

    • (Score: 1) by Nollij on Thursday August 14 2014, @12:42PM

      by Nollij (4559) on Thursday August 14 2014, @12:42PM (#81216)

      Much like a comedy club, the drunker you get, the better we are.

      Remember to tip your waitstaff!