The on-site network topology is shown in the diagram. Two S6800 switches form an M-LAG system and act as active-active gateways for the VLAN. The M-LAG interfaces connect to third-party switches. The on-site fault was reproduced in the HCL simulator, and the IP addresses used below are not the real ones.

While the M-LAG system is healthy, terminal services run normally. However, in a power-off test of the M-LAG primary device, the terminal (10.0.0.1) lost more than ten packets when pinging the active-active gateway address (10.0.0.254).
S6800-1 configuration:
#
interface Bridge-Aggregation1
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port m-lag peer-link 1
undo mac-address static source-check enable
#
#
interface Ten-GigabitEthernet1/0/48
port link-mode route
ip address 1.1.1.1 255.255.255.252
#
#
interface Bridge-Aggregation15
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port lacp system-priority 100
port m-lag group 15
#
#
interface Ten-GigabitEthernet1/0/20
port link-mode bridge
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
port link-aggregation group 15
#
#
m-lag mad exclude interface Ten-GigabitEthernet1/0/48
m-lag mad exclude interface Vlan-interface10
m-lag role priority 100
m-lag system-mac 0002-0002-0002
m-lag system-number 1
m-lag system-priority 234
m-lag standalone enable delay 1
m-lag keepalive ip destination 1.1.1.2 source 1.1.1.1
#
#
interface Vlan-interface10
ip address 10.0.0.254 255.255.255.0
#
S6800-2 configuration:
#
interface Bridge-Aggregation1
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port m-lag peer-link 1
undo mac-address static source-check enable
#
#
interface Ten-GigabitEthernet1/0/48
port link-mode route
ip address 1.1.1.2 255.255.255.252
#
#
interface Bridge-Aggregation15
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
link-aggregation mode dynamic
port lacp system-priority 100
port m-lag group 15
#
#
interface Ten-GigabitEthernet1/0/20
port link-mode bridge
port link-type trunk
undo port trunk permit vlan 1
port trunk permit vlan 2 to 4094
port link-aggregation group 15
#
#
m-lag mad exclude interface Ten-GigabitEthernet1/0/48
m-lag mad exclude interface Vlan-interface10
m-lag role priority 200
m-lag system-mac 0002-0002-0002
m-lag system-number 2
m-lag system-priority 234
m-lag standalone enable delay 1
m-lag keepalive ip destination 1.1.1.1 source 1.1.1.2
#
#
interface Vlan-interface10
ip address 10.0.0.254 255.255.255.0
#
1. Before the M-LAG primary device is powered off, run display link-aggregation verbose and confirm that the Bridge-Aggregation15 member ports on both devices are Selected.
After the M-LAG primary device is powered off, display link-aggregation verbose on the remaining device shows that its Bridge-Aggregation15 member port is no longer Selected, and display interface Ten-GigabitEthernet1/0/20 shows that the port state is LAGG DOWN. S6800-2 outputs the following logs:
%Jan 1 13:02:45:636 2021 S6800-2 M-LAG/6/MLAG_KEEPALIVELINK_DOWN: Keepalive link went down because the local keepalive timeout timer expired. Please check the keepalive packet transmission and reception status at the two ends.
%Jan 1 13:02:45:654 2021 S6800-2 STP/5/STP_CONSISTENCY_CHECK: M-LAG role assignment finished. Please verify that the local device and the peer device have consistent global and mlag-interface-specific STP settings.
%Jan 1 13:02:45:657 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_DEVICEROLE_CHANGE: Device role changed from Secondary to Primary for peer link and keepalive link down. The device role has undergone a switchover
%Jan 1 13:02:45:664 2021 S6800-2 M-LAG/6/MLAG_IFEVT_PEERIF_NOSELECTED: Peer M-LAG interface in M-LAG group 15 does not have Selected member ports. ////No selected ports for the M-LAG interface
%Jan 1 13:02:46:823 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_MODE_CHANGE: The device working mode switchover to standalone
%Jan 1 13:02:46:947 2021 S6800-2 LAGG/6/LAGG_INACTIVE_OPERSTATE: Member port XGE1/0/20 of aggregation group BAGG15 changed to the inactive state, because the peer port did not have the Synchronization flag bit causing the port inactive
%Jan 1 13:02:46:953 2021 S6812_KXC_VDI_4E03_DS2 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Ten-GigabitEthernet1/0/20 changed to down.
%Jan 1 13:02:46:969 2021 S6800-2 M-LAG/4/MLAG_DEVICE_MADDOWN: All new service interfaces not excluded from the M-LAG MAD DOWN will change to the M-LAG MAD DOWN state because the peer link and all M-LAG interfaces went down. Please first check the peer link settings on both ends of the peer link.
%Jan 1 13:02:46:971 2021 S6800-2 M-LAG/6/MLAG_IFEVT_MLAGIF_NOSELECTED: Local M-LAG interface Bridge-Aggregation15 in M-LAG group 15 does not have Selected member ports because the aggregate interface went down. Please check the aggregate link status.
%Jan 1 13:02:47:191 2021 S6800-2 IFNET/3/PHY_UPDOWN: Physical state on the interface Bridge-Aggregation15 changed to down.
%Jan 1 13:02:47:191 2021 S6800-2 IFNET/5/LINK_UPDOWN: Line protocol state on the interface Bridge-Aggregation15 changed to down.
%Jan 1 13:02:47:216 2021 S6800-2 M-LAG/6/MLAG_IFEVT_MLAGIF_GLOBALDOWN: The state of M-LAG group 15 changed to down.
%Jan 1 13:02:47:220 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_DEVICEROLE_CHANGE: Device role changed from Primary to None for peer link and Keepalive link down. All local M-LAG interfaces down.
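To see how quickly the events in this excerpt follow one another, the timestamps can be converted into offsets from the first message. The following is a minimal Python sketch, not an H3C tool. The header layout encoded in the regular expression is inferred from the log lines above, and the message bodies are elided because only the headers are needed.

```python
import re
from datetime import datetime, timedelta

# Assumed header layout, inferred from the excerpt:
# %<Mon> <day> <HH:MM:SS:mmm> <year> <host> <MODULE>/<severity>/<MNEMONIC>: <text>
# "\s*" after the year tolerates a missing space between year and host name.
LOG_HEADER = re.compile(
    r"^%(?P<mon>[A-Za-z]{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}:\d{3})\s+(?P<year>\d{4})\s*"
    r"(?P<host>\S+)\s+(?P<module>[\w-]+)/(?P<sev>\d)/(?P<mnemonic>\w+):"
)

# Headers copied from the S6800-2 excerpt; message bodies are elided ("...").
SAMPLE = """\
%Jan 1 13:02:45:636 2021 S6800-2 M-LAG/6/MLAG_KEEPALIVELINK_DOWN: ...
%Jan 1 13:02:45:657 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_DEVICEROLE_CHANGE: ...
%Jan 1 13:02:46:823 2021 S6800-2 M-LAG/6/MLAG_SYSEVENT_MODE_CHANGE: ...
%Jan 1 13:02:46:947 2021 S6800-2 LAGG/6/LAGG_INACTIVE_OPERSTATE: ...
%Jan 1 13:02:47:216 2021 S6800-2 M-LAG/6/MLAG_IFEVT_MLAGIF_GLOBALDOWN: ...
"""


def parse_header(line):
    """Return (timestamp, mnemonic) for a log line, or None if it does not match."""
    m = LOG_HEADER.match(line.strip())
    if m is None:
        return None
    stamp = "{year} {mon} {day} {time}".format(**m.groupdict())
    # %f accepts 1 to 6 digits, so the three-digit millisecond field parses directly.
    when = datetime.strptime(stamp, "%Y %b %d %H:%M:%S:%f")
    return when, m["mnemonic"]


def timeline(text):
    """Return [(offset_ms_from_first_event, mnemonic), ...] in input order."""
    events = [e for e in map(parse_header, text.splitlines()) if e is not None]
    if not events:
        return []
    first = events[0][0]
    one_ms = timedelta(milliseconds=1)
    return [((when - first) // one_ms, name) for when, name in events]


if __name__ == "__main__":
    for offset_ms, name in timeline(SAMPLE):
        print(f"+{offset_ms:>5} ms  {name}")
```

In this excerpt the switch to standalone mode is logged roughly 1.2 s after the keepalive timeout, which is consistent with the configured delay of 1 second. That reading is an interpretation of the timestamps, not something the logs state directly.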
2. The logs suggest that the problem is related to LACP packet exchange. To confirm why the negotiation failed, enable debugging link-aggregation lacp packet all interface Ten-GigabitEthernet 1/0/20 and compare the negotiation parameters on both ends.
Before the M-LAG primary device is powered off, the debug output is normal: the M-LAG system MAC address 0002-0002-0002 is used as the local LACP system MAC address, and the peer system MAC address is xxxx-xxxx-e080:
After the M-LAG primary device is powered off, pings from the terminal start to fail. The LACP packets sent by the local device now carry the device's own system MAC address, but the partner MAC address in the LACP replies from the peer is still 0002-0002-0002. Because the negotiation parameters on the two ends remain inconsistent, the aggregation member port cannot come up:
Some time after the M-LAG primary device is powered off, the peer refreshes the partner MAC address in its LACP packets, and pings from the terminal succeed again:
3. A configuration check showed that the m-lag standalone enable command had been configured on-site to enable M-LAG standalone mode.
Command usage guidance
If the M-LAG system splits, both member devices might act as the primary device and forward traffic. To prevent this, configure this command so that an M-LAG member device switches to standalone mode immediately or after a delay.
After an M-LAG member device switches to standalone mode, the LACP packets sent by its aggregate interface no longer carry the M-LAG system parameters. They revert to the LACP system MAC address and LACP system priority of the aggregate interface. As a result, the two aggregate interfaces in the same M-LAG group advertise different LACP system MAC addresses and priorities, so only the member ports on one side can be Selected. The device whose ports are Selected forwards service traffic on its own, which avoids forwarding anomalies.
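The debug observations in step 2 fit this description. As a rough illustration, the Python sketch below models a single rule of LACP negotiation: a port can be treated as in sync only when the partner system MAC echoed by the neighbour equals the system MAC that the local device currently advertises. This is a deliberate simplification (real LACP also compares priorities, keys, and port numbers), and LOCAL_MAC is a hypothetical placeholder because the article does not give S6800-2's own system MAC address.

```python
from dataclasses import dataclass

MLAG_SYSTEM_MAC = "0002-0002-0002"   # m-lag system-mac from the configuration
PEER_MAC = "xxxx-xxxx-e080"          # third-party switch, masked as in the article
LOCAL_MAC = "aaaa-aaaa-0002"         # HYPOTHETICAL: S6800-2's own system MAC


@dataclass(frozen=True)
class Lacpdu:
    actor_system: str    # system MAC the sender advertises for itself
    partner_system: str  # system MAC the sender has recorded for its neighbour


def partner_in_sync(sent: Lacpdu, received: Lacpdu) -> bool:
    """True when the neighbour's recorded partner matches what we now advertise."""
    return received.partner_system == sent.actor_system


# Normal M-LAG operation: both sides agree on the M-LAG system MAC.
before = partner_in_sync(
    Lacpdu(actor_system=MLAG_SYSTEM_MAC, partner_system=PEER_MAC),
    Lacpdu(actor_system=PEER_MAC, partner_system=MLAG_SYSTEM_MAC),
)

# Standalone mode: the local device advertises its own MAC, the peer still echoes the old one.
stale = partner_in_sync(
    Lacpdu(actor_system=LOCAL_MAC, partner_system=PEER_MAC),
    Lacpdu(actor_system=PEER_MAC, partner_system=MLAG_SYSTEM_MAC),
)

# Later: the peer has refreshed its partner information.
refreshed = partner_in_sync(
    Lacpdu(actor_system=LOCAL_MAC, partner_system=PEER_MAC),
    Lacpdu(actor_system=PEER_MAC, partner_system=LOCAL_MAC),
)

if __name__ == "__main__":
    print("before power-off:", before)
    print("standalone, peer not refreshed:", stale)
    print("standalone, peer refreshed:", refreshed)
```

The middle case corresponds to the window in which the terminal loses packets: the mismatch lasts for as long as the peer keeps echoing the old partner MAC address.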
This command takes effect only when both the peer link and the keepalive link fail. When the peer M-LAG device performs a system reboot, it notifies the local M-LAG device, which therefore determines that the peer link and keepalive link are not faulty, and this feature does not take effect. If powering off a device is what causes both the peer link and the keepalive link to fail, set the standalone delay longer than the time the device needs for a full system restart, to avoid forwarding anomalies caused by flapping M-LAG interfaces. If both links fail for reasons other than a power-off, set a shorter delay so that the device switches to standalone mode as quickly as possible.
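The delay guidance above can be written down as a small decision rule. The Python sketch below is only an illustration: the function names, the 30-second margin, and the 1-second short delay are assumptions chosen for the example, not values from the manual. Any resulting value should be checked against the range the installed software version accepts.

```python
def standalone_delay_seconds(power_off_expected: bool,
                             reboot_seconds: int,
                             margin_seconds: int = 30,
                             short_delay_seconds: int = 1) -> int:
    """Pick a standalone delay following the guidance in the text.

    power_off_expected: True if a device power-off is the failure to plan for.
    reboot_seconds:     measured full restart time of the device (assumed input).
    margin_seconds:     HYPOTHETICAL safety margin added on top of the restart time.
    """
    if reboot_seconds < 0 or margin_seconds < 0 or short_delay_seconds < 0:
        raise ValueError("times must be non-negative")
    if power_off_expected:
        # The delay must be longer than the restart time to avoid M-LAG interface flapping.
        return reboot_seconds + margin_seconds
    # Other failures: switch to standalone mode as quickly as possible.
    return short_delay_seconds


def standalone_command(delay_seconds: int) -> str:
    """Render the command in the form used in the configuration above."""
    return f"m-lag standalone enable delay {delay_seconds}"


if __name__ == "__main__":
    delay = standalone_delay_seconds(power_off_expected=True, reboot_seconds=300)
    print(standalone_command(delay))
```

With an assumed restart time of 300 seconds, the sketch yields a delay of 330 seconds, in contrast to the 1-second delay configured on-site.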
If this command is executed multiple times, the last executed command takes precedence.
It is recommended that all M-LAG devices be configured with this function.
Before executing this command, ensure that the LACP system priority of the M-LAG device is higher than that of the device connected to the M-LAG system. This ensures that the reference port is located on the device connected to the M-LAG system and prevents the ports on the connected device from flapping frequently.
With M-LAG standalone mode disabled, the terminal loses only one packet under the same test conditions. The root cause, however, is that the peer refreshes the partner MAC address in its LACP packets too slowly, so the relevant LACP parameters should be adjusted on the peer side.