Are there any reasons to not use BFD?

  • In looking to implement Bidirectional Forwarding Detection (BFD), it seems very flexible in terms of timer tuning, lightweight in terms of overhead, and impressively broad in where it can be applied.

    So if, for example, it can be applied to detect link failure over Ethernet, over MPLS across multiple hops, at the network edge, for IGP convergence, for tunnels, and so on, why would it not be used in certain scenarios? And are there other emerging alternatives to be aware of?

  • jwbensley (correct answer)

    9 years ago

    I am only directly aware of one issue with BFD, which is CPU demand. I am currently investigating an issue with a Cisco 7301: when it pushes more traffic during our peak hours than during the rest of the day, BFD sometimes times out and routing trips over to the next link.

    It seems that under high traffic volumes the router's CPU usage rises (which isn't unusual), but at about 40-50% CPU the BFD packets aren't receiving enough resources.

    However, I have found the following information, which suggests additional issues with BFD (from this NANOG presentation; there is more in the presentation, and it's a good one, so give it a read!):

    What are the caveats?

    • Two main ones:
      1. BFD can have high resource demands depending on your scale.
      2. BFD is not visible to Layer 2 bundling protocols. (Ethernet LAGs or POS bundles)

    BFD Resource Demands

    • The number of BFD sessions on each linecard or router can impact how well BFD scales for you. Each platform has its own limits.
    • Bundled interfaces supporting min tx/rx of 250ms or 2 seconds have been seen.
    • In some cases, BFD instances on a router may need to be operated on the route-processor depending on the implementation (non-adjacency based BFD sessions).
    • Test your platform first before deploying BFD. Attempt to put load on the RP or LC CPU with your configured settings. This can be done by:
      • Executing CPU-heavy commands
      • Flooding packets whose TTL expires at the destination
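To get a rough feel for why session count and timers matter, here is a back-of-the-envelope sketch (my own, not from the presentation). It assumes every session sends and receives control packets at its configured interval and ignores the transmit jitter BFD applies, so treat the numbers as approximate. The function name and parameters are purely illustrative.

```python
def bfd_control_pps(sessions: int, tx_interval_ms: float, rx_interval_ms: float) -> float:
    """Approximate BFD control packets per second a box must send plus receive.

    Assumption: every session runs at the same intervals and jitter is ignored.
    """
    if sessions < 0 or tx_interval_ms <= 0 or rx_interval_ms <= 0:
        raise ValueError("sessions must be >= 0 and intervals must be > 0")
    sent = sessions * 1000.0 / tx_interval_ms        # packets we transmit per second
    received = sessions * 1000.0 / rx_interval_ms    # packets we must process per second
    return sent + received


if __name__ == "__main__":
    # Compare a conservative timer with an aggressive one for 100 sessions.
    for interval_ms in (300, 50):
        pps = bfd_control_pps(100, interval_ms, interval_ms)
        print(f"100 sessions at {interval_ms} ms: about {pps:.0f} packets/s")
```

The load scales linearly with the number of sessions and inversely with the interval, which is why a timer that is fine for a handful of sessions can hurt on a box that handles BFD on the route processor.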

    BFD Resource Demands (cont’d)

    • What values are safe to try?
    • Based upon speaking to several operators, 300ms with a multiplier of 3 (900ms detection) appears to be a safe value that works on most equipment fairly well.
    • This is a significant improvement over some of the alternatives.
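The 900ms figure is just interval times multiplier. As a small sketch of the arithmetic, assuming asynchronous mode as described in RFC 5880 (the local detection time is the remote system's detect multiplier times the larger of the local required min RX interval and the remote desired min TX interval); the function and parameter names are mine:

```python
def detection_time_ms(remote_detect_mult: int,
                      local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int) -> int:
    """Local BFD detection time in asynchronous mode, in milliseconds."""
    # The remote side may not send faster than we are willing to receive,
    # so the agreed interval is the slower (larger) of the two values.
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


if __name__ == "__main__":
    # 300 ms on both sides with a multiplier of 3, as suggested above.
    print(detection_time_ms(3, 300, 300))
    # A faster remote TX does not help if we only accept 300 ms.
    print(detection_time_ms(3, 300, 50))
```

Note that the slower side wins: configuring aggressive timers on one end only does not shorten detection if the other end asks for a longer interval.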

    BFD and L2 link-bundling

    • BFD is unaware of underlying L2 link bundle members.
    • A 4x10GigE L2 bundle (802.3ad) would appear as a single L3 adjacency. BFD packets would be transmitted on a single member link, rather than out all 4 links.
    • A failure of the link with BFD on it would result in the entire L3 adjacency failing.
    • However, in some scenarios the failed member link may result in only a single BFD packet being dropped. Subsequent packets may route over working member links.
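To illustrate why a single BFD session only ever exercises one member link, here is a toy model of flow hashing on a bundle. It is only a sketch: real hardware uses vendor-specific hash inputs and algorithms, and CRC32 here is a stand-in I chose for illustration. The member names and addresses are hypothetical; UDP port 3784 is the single-hop BFD control port.

```python
import zlib

BFD_SINGLE_HOP_PORT = 3784  # UDP destination port for single-hop BFD control packets


def pick_member(flow, members):
    """Pick a bundle member for a flow by hashing its 5-tuple (toy model)."""
    if not members:
        raise ValueError("LAG has no active members")
    key = "|".join(str(field) for field in flow).encode()
    # Stand-in hash: real platforms use their own (often configurable) hash.
    return members[zlib.crc32(key) % len(members)]


# Hypothetical 4-member bundle and a single BFD session's 5-tuple.
members = ["link0", "link1", "link2", "link3"]
bfd_flow = ("192.0.2.1", "192.0.2.2", 17, 49152, BFD_SINGLE_HOP_PORT)

# The 5-tuple never changes, so every BFD packet hashes to the same member.
chosen = pick_member(bfd_flow, members)
same_every_time = all(pick_member(bfd_flow, members) == chosen for _ in range(10))

# Once the bundle removes the failed member, the flow rehashes to a survivor.
surviving = [m for m in members if m != chosen]
rehashed = pick_member(bfd_flow, surviving)

if __name__ == "__main__":
    print(f"BFD always uses {chosen}: {same_every_time}")
    print(f"After {chosen} is removed, BFD moves to {rehashed}")
```

In this model the other three links are never tested by BFD at all, and whether a failure of the chosen link takes the adjacency down depends on whether the bundle rehashes before the detection time expires.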

    Another thing to note is that some platforms do not support BFD on every type of interface. The most famous example (to me): the Cisco 7600 did not support BFD on SVI (VLAN) interfaces until very, very recently (15.something required).

    Good point. The 7301 I am working on should support it, but it still isn't running as smoothly as I'd like, and it's on a very recent IOS 12 release, whereas some other 7301s and 7206s are fine. Sebastian is right: it's definitely worth mentioning that BFD is not as well supported as we'd probably all like on these common hardware platforms.

    Note that there is an IETF draft to address running BFD over LAGs: https://tools.ietf.org/html/draft-mmm-bfd-on-lags. It is not really implemented anywhere yet, but hopefully this issue will eventually be solved, since it's a very common scenario.

Licensed under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM