shutdown one port, all LAGs go to down because of LACP timeout #75

shuaishang opened this issue Dec 6, 2023 · 0 comments
SONiC + libteam
(all SONiC/Debian/libteam versions have this issue)

Topology:
(topology diagram image, not preserved)

  • The DUT has a LAG/PortChannel with member Ethernet1; the LACP tx/rx interval is 1s
  • DUT Ethernet7 is connected to IXIA
  • DUT Ethernet7 has 16 subports
  • Each subport has one BGP session with IXIA
  • Each IXIA BGP session advertises 8K routes to the DUT
  • All 16 IXIA BGP sessions advertise the same 8K route prefixes

As a result, 8K routes are installed in the DUT's Linux kernel. Each route has 16 ECMP members, and each member is one of the Ethernet7 subports.
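
To give a sense of scale, here is a minimal sketch that counts the ECMP nexthop entries the kernel holds in this setup. It assumes "8K" means 8192 (it could equally mean 8000). The constant and function names are illustrative and do not come from SONiC, FRR, or the kernel.

```python
# Assumption: "8K" is taken as 8 * 1024; the issue does not say which.
ROUTES = 8 * 1024
ECMP_WIDTH = 16  # one nexthop per Ethernet7 subport


def nexthop_entries(routes: int, ecmp_width: int) -> int:
    """Total (route, nexthop) pairs installed for the given ECMP width."""
    return routes * ecmp_width


if __name__ == "__main__":
    print(nexthop_entries(ROUTES, ECMP_WIDTH))
```

All of these entries point at Ethernet7 subports, so all of them are affected when Ethernet7 goes down.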

Problem:

  • Shut down Ethernet7; FRR then deletes all 8K routes from the kernel
  • The LAG/PortChannel hits an LACP timeout and goes down

Why:
When Ethernet7 is shut down, libteam receives the netlink message RTM_NEWLINK for the Ethernet7 link-down event.

The following code tries to read the Ethernet7 netdev info via netlink:
https://github.com/jpirko/libteam/blob/8b843e93cee1dab61fb79b01791201cdad45e1d1/libteam/ifinfo.c#L271C9-L271C29

Because the routes (whose nexthops are Ethernet7 subports) are being deleted in the kernel at the same time, we observed that "rtnl_link_get_kernel" blocks for more than 3s. The LACP session then times out and the LAG goes down.
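
The timing can be illustrated with a small virtual-time simulation. This is only a sketch under stated assumptions, not libteam code. It assumes the netlink handler and the periodic LACPDU transmit timer run on the same single-threaded loop, so a blocking call delays every pending timer, and that the partner expires the link once it sees no LACPDU for more than 3s (the IEEE 802.1AX short timeout that goes with the 1s fast rate). All function names and the example block times are hypothetical.

```python
LACP_FAST_PERIOD_S = 1.0    # LACPDU tx interval from the issue
LACP_SHORT_TIMEOUT_S = 3.0  # partner-side expiry at fast rate (802.1AX short timeout)


def simulate_lacp_tx(block_start_s, block_duration_s, end_s=10.0):
    """Return the virtual times at which LACPDUs are sent.

    Models one blocking call (a stand-in for rtnl_link_get_kernel) that
    starts at block_start_s and stalls the loop for block_duration_s.
    """
    tx_times = []
    next_tx = 0.0
    block_done = False
    while next_tx <= end_s:
        if not block_done and block_start_s <= next_tx:
            # The blocking handler runs before the pending tx timer,
            # so the timer cannot fire until the call returns.
            next_tx = max(next_tx, block_start_s + block_duration_s)
            block_done = True
            continue
        tx_times.append(next_tx)
        next_tx += LACP_FAST_PERIOD_S
    return tx_times


def max_gap(times):
    """Largest interval between consecutive LACPDUs."""
    return max(b - a for a, b in zip(times, times[1:]))


def partner_times_out(times):
    """True if the partner would see a gap longer than the short timeout."""
    return max_gap(times) > LACP_SHORT_TIMEOUT_S


if __name__ == "__main__":
    stalled = simulate_lacp_tx(block_start_s=2.5, block_duration_s=3.5)
    print(stalled, max_gap(stalled), partner_times_out(stalled))
```

In this model a 3.5s stall that begins at t=2.5s pushes the next LACPDU from t=3s to t=6s, leaving a 4s gap after the one sent at t=2s. A short stall (for example 0.2s) finishes before the next timer is due and leaves the 1s cadence intact.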
