Preventing STP loops in EtherChannel configurations
What is the best practice when configuring EtherChannel on Cisco switches to prevent an STP loop/broadcast storm when the EtherChannel is misconfigured?
I had an instance where two ports on one switch were properly configured as trunk ports, but the opposing switch had only one port configured as a trunk and the second as a regular access port. After a power cycle, a broadcast storm originating from these two switches took down the network.
channel-group 1 mode on was configured on all ports.
From my research, L2 EtherChannel links should be configured only with
channel-group 1 mode desirable
L3 links can be configured with channel-group 1 mode on, as STP isn't running over them.
Unfortunately, in this scenario PVST+ was configured and bpduguard was missing on all edge ports :( Lastly, all switches here are Cisco; this is not a multi-vendor environment.
In new environments I always deploy rapid-pvst+; however, this particular environment was running the Cisco default, PVST+.
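For reference, the rapid-pvst+ and BPDU guard settings mentioned above might look roughly like the sketch below. The interface name is hypothetical, and the exact keyword varies by platform and IOS release (some newer releases use "spanning-tree portfast edge bpduguard default"), so treat it as an illustration and check the documentation for your platform.

```
! Switch-wide: run Rapid PVST+ instead of the default PVST+
spanning-tree mode rapid-pvst
! Enable BPDU guard on every PortFast-enabled port by default
spanning-tree portfast bpduguard default
!
! Hypothetical edge (host-facing) port
interface GigabitEthernet1/0/10
 switchport mode access
 spanning-tree portfast
```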
You should not be using "on" for link aggregation as it can lead to problems. On the side with aggregation statically on, it will use the interfaces in the etherchannel, no matter what the configuration is on the other side.
While storm control (from comments) can be very helpful with some of the problems that result, it does not resolve all of them. For instance, if one of the remote-side links is an access port on a different VLAN, traffic that goes down that port will likely never reach its destination. Depending on how constant the traffic is and how it is load-balanced across the etherchannel, this may result in a complete "outage" for some hosts.
I always recommend the use of LACP over PAgP, so instead of desirable/auto or on, use either active on both sides or active on one and passive on the other. The reason for this is that LACP is standards based while PAgP is Cisco proprietary.
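A minimal sketch of an LACP bundle with both sides set to active is shown below. The interface names and the channel-group number are assumptions for illustration; some platforms also require "switchport trunk encapsulation dot1q" before "switchport mode trunk". Apply the same configuration on the opposing switch (or use passive on one side only).

```
! Hypothetical member ports; configure both identically
interface range GigabitEthernet1/0/1 - 2
 switchport mode trunk
 ! active = always send LACP frames; the bundle only forms if the peer answers
 channel-group 1 mode active
!
interface Port-channel1
 switchport mode trunk
```

With a negotiated mode, a member whose peer is not running LACP should not be bundled into the channel, which is the failure that "on" cannot detect. "show etherchannel summary" and "show lacp neighbor" can be used to confirm which ports actually bundled.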
Of course this is in part dependent on the hardware platform, so check the appropriate documentation for your platform.
Also implement storm-control. It does not take care of the root cause but it keeps the network from melting down.
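As a rough sketch, storm-control on an interface might look like the following. The interface name and the 1.00 percent threshold are placeholders, not recommendations; available options and sensible thresholds depend on the platform and on your normal traffic levels.

```
! Hypothetical interface
interface GigabitEthernet1/0/10
 ! Suppress broadcast traffic above 1.00 percent of link bandwidth (example value)
 storm-control broadcast level 1.00
 ! Optionally err-disable the port instead of only dropping the excess traffic
 storm-control action shutdown
```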
"on" is occasionally required for multi-vendor setups. I only use LACP on host ports -- inter-switch links are *never* negotiated, STP and storm-control catch you if you royally screw it up.
LACP whenever possible. Certain platforms, like the Cisco 7200, don't support LACP, while others, like older Juniper boxes, need upgraded FEBs to run LACP.
I've not encountered any gear worth using in the last few years that supported bundles but couldn't talk LACP. Even cheap SMB switches do it, though VMware only added support in v5.
FYI - configuring an etherchannel for "mode on" != using PAgP. Only "mode auto" or "mode desirable" turns PAgP on. Compare this with Brocade's "static" vs "dynamic" LAG.
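To summarize the modes discussed in this thread, the listing below groups the channel-group keywords by the protocol they select. These lines are alternatives, not a configuration to paste as a whole, and the group number 1 is just the one used in the question.

```
! Static bundle: no negotiation protocol at all
 channel-group 1 mode on
!
! PAgP (Cisco proprietary)
 channel-group 1 mode desirable
 channel-group 1 mode auto
!
! LACP (standards-based)
 channel-group 1 mode active
 channel-group 1 mode passive
```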