The link-heartbeat function detected that the packet loss rate reached or exceeded the threshold. (InterfaceIfIndex=[InterfaceIfIndex], InterfaceName=[InterfaceName], SendInterfaceName=[SendInterfaceName], RecvInterfaceName=[RecvInterfaceName])
The link-heartbeat function detected that the packet loss rate reached or exceeded the threshold.
Trap Attribute | Description |
---|---|
Alarm or Event | Alarm |
Trap Severity | Critical |
Mnemonic Code | hwLinkHeartbeatDropAlarm |
Trap OID | 1.3.6.1.4.1.2011.5.25.157.2.222 |
MIB | HUAWEI-PORT-MIB |
Alarm ID | 0x09ae2002 |
Alarm Name | hwLinkHeartbeatDropAlarm |
Alarm Type | qualityOfServiceAlarm |
Raise or Clear | Raise |
Match trap | FEI_1.3.6.1.4.1.2011.5.25.157.2.223 hwLinkHeartbeatDropAlarmResume |
Parameter | Description |
---|---|
InterfaceIfIndex | Interface index (IfIndex). |
InterfaceName | Interface name. |
SendInterfaceName | Name of the interface that sends packets. |
RecvInterfaceName | Name of the interface that receives packets. |
VB OID | VB Name | VB Index |
---|---|---|
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.1 | hwLinkHeartbeatIfindex | hwLinkHeartbeatIfindex |
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.2 | hwLinkHeartbeatIfName | hwLinkHeartbeatIfindex |
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.3 | hwLinkHeartbeatTxInterface | hwLinkHeartbeatIfindex |
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.4 | hwLinkHeartbeatRxInterface | hwLinkHeartbeatIfindex |
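The VB list above can be used to label the variable bindings of a received trap. The following is a minimal sketch, assuming the trap receiver already delivers the varbinds as (OID, value) pairs and that each OID may carry an instance suffix. The function name `label_varbinds` is hypothetical and is not part of any device or library API.

```python
# VB OIDs and names copied from the VB table above.
VB_NAMES = {
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.1": "hwLinkHeartbeatIfindex",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.2": "hwLinkHeartbeatIfName",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.3": "hwLinkHeartbeatTxInterface",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.4": "hwLinkHeartbeatRxInterface",
}


def label_varbinds(varbinds):
    """Map (oid, value) pairs to VB names (hypothetical helper).

    An OID matches a VB if it equals the VB OID or extends it with an
    instance suffix. Unknown OIDs are ignored.
    """
    labeled = {}
    for oid, value in varbinds:
        for base, name in VB_NAMES.items():
            if oid == base or oid.startswith(base + "."):
                labeled[name] = value
                break
    return labeled
```

A receiver could call `label_varbinds` on the decoded trap payload and then correlate the result with the matching clear trap listed in the attribute table.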
Fault detection
The ping service node performs link-heartbeat loopback detection to detect service faults. The packets used are ICMP detection packets. There are 12 packet templates in total. Each template sends two packets in sequence within a period of 30s. Therefore, a total of 24 packets are sent by the 12 templates within a period of 30s. After five periods, the system starts to collect statistics on lost packets and modified packets.
Link-heartbeat loopback detection is classified as packet modification detection or packet loss detection.
Packet loss detection checks whether the difference between the number of received heartbeat packets and the number of sent heartbeat packets is within the permitted range. If one of the following conditions is met, a trigger message is sent to instruct the SAID ping node to perform fault diagnosis:
1:The total number of lost packets exceeds 3.
2:After each packet sending period ends, the system checks the protocol status and whether ARP entries exist on the interface, and finds that no ARP entry exists in three consecutive periods.
3:The absolute value of the difference between the number of lost packets whose payload is all 0s and the number of lost packets whose payload is all Fs is greater than 25% of the total number of sent packets in five periods.
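The three trigger conditions above can be sketched as a single check. This is an illustration only, not the device implementation. It assumes that the lost-packet counters are accumulated over one five-period window and that "total number of sent packets in five periods" means 12 templates x 2 packets x 5 periods = 120 packets. The names `WindowStats` and `should_trigger_diagnosis` are hypothetical.

```python
from dataclasses import dataclass

TEMPLATES = 12              # packet templates, per the text above
PACKETS_PER_TEMPLATE = 2    # packets each template sends per 30s period
PERIODS_PER_WINDOW = 5      # periods before statistics are collected

SENT_PER_PERIOD = TEMPLATES * PACKETS_PER_TEMPLATE        # 24 packets
SENT_PER_WINDOW = SENT_PER_PERIOD * PERIODS_PER_WINDOW    # 120 packets


@dataclass
class WindowStats:
    """Counters for one statistics window (hypothetical structure)."""
    lost_total: int                       # all lost heartbeat packets
    lost_all_zeros: int                   # lost packets with all-0 payload
    lost_all_fs: int                      # lost packets with all-F payload
    consecutive_periods_without_arp: int  # periods in a row with no ARP entry


def should_trigger_diagnosis(stats: WindowStats) -> bool:
    """Return True if any of the three packet loss conditions is met."""
    # Condition 1: more than 3 packets lost in total.
    if stats.lost_total > 3:
        return True
    # Condition 2: no ARP entry in three consecutive periods.
    if stats.consecutive_periods_without_arp >= 3:
        return True
    # Condition 3: the all-0 / all-F loss imbalance exceeds 25% of the
    # packets sent in five periods (assumed here to be 120 packets).
    imbalance = abs(stats.lost_all_zeros - stats.lost_all_fs)
    return imbalance > 0.25 * SENT_PER_WINDOW
```

Under these assumptions the third threshold works out to more than 30 packets.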
Fault diagnosis
After receiving the trigger message in the fault detection state, the ping service node enters the fault diagnosis state.
1:If a packet loss error is detected on the device, the SAID ping node checks whether a module (subcard, TM, or NP) on the device is faulty. If no module is faulty, the system completes the diagnosis and returns to the fault detection state.
2:If a packet loss error is detected on the device, the SAID ping node checks whether a module (subcard, TM, or NP) on the device is faulty. If a module fault occurs, the system performs loopback diagnosis. If packet loss or modification is detected during loopback, the local device is faulty. The system then enters the fault recovery state. If no packet is lost during loopback diagnosis, the system returns to the fault detection state.
Fault recovery
If a fault is detected during loopback diagnosis, the ping service node determines whether a counting error occurs on the associated subcard.
1:If a counting error occurs on the subcard, the ping service node resets the subcard for service recovery. Then, the node enters the service recovery determination state and performs link-heartbeat loopback detection to determine whether services recover. If services recover, the node returns to the fault detection state. If services do not recover, the node returns to the fault recovery state and takes a secondary recovery action. (For a subcard reset, the secondary recovery action is board reset.)
2:If no counting error occurs on the subcard, the ping service node resets the involved board for service recovery. After the board starts, the node enters the service recovery determination state and performs link-heartbeat loopback detection to determine whether services recover. If services recover, the node returns to the fault detection state. If services do not recover, the node remains in the service recovery determination state and periodically performs link-heartbeat loopback detection until services recover.
Service recovery determination
After fault recovery is complete, the ping service node uses the fault packet template to send diagnostic packets. If a fault still exists and a subcard reset has been performed, the node generates an alarm and instructs the subcard to perform a switchover for self-healing. If a fault still exists but no subcard reset has been performed, the node generates an alarm only. If no fault exists, the node instructs the link-heartbeat loopback function to return to the initial state, and the node itself returns to the fault detection state.
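The four states described above (fault detection, fault diagnosis, fault recovery, and service recovery determination) can be summarized as a transition table. This is a simplified sketch based on the description in this document, not the device implementation. The event names are hypothetical labels for the outcomes described in the text. Leaving the state unchanged on an unlisted event is an assumption. Alarm generation is not modeled.

```python
from enum import Enum, auto


class State(Enum):
    FAULT_DETECTION = auto()
    FAULT_DIAGNOSIS = auto()
    FAULT_RECOVERY = auto()
    RECOVERY_DETERMINATION = auto()


class Event(Enum):
    # Hypothetical event names for the outcomes described in the text.
    TRIGGER_RECEIVED = auto()          # a packet loss condition was met
    NO_MODULE_FAULT = auto()           # subcard, TM, and NP are all healthy
    LOOPBACK_CLEAN = auto()            # no loss during loopback diagnosis
    LOOPBACK_FAULT = auto()            # loss or modification during loopback
    RESET_DONE = auto()                # subcard or board reset finished
    SERVICES_RECOVERED = auto()
    NOT_RECOVERED_AFTER_SUBCARD_RESET = auto()
    NOT_RECOVERED_AFTER_BOARD_RESET = auto()


TRANSITIONS = {
    (State.FAULT_DETECTION, Event.TRIGGER_RECEIVED): State.FAULT_DIAGNOSIS,
    (State.FAULT_DIAGNOSIS, Event.NO_MODULE_FAULT): State.FAULT_DETECTION,
    (State.FAULT_DIAGNOSIS, Event.LOOPBACK_CLEAN): State.FAULT_DETECTION,
    (State.FAULT_DIAGNOSIS, Event.LOOPBACK_FAULT): State.FAULT_RECOVERY,
    (State.FAULT_RECOVERY, Event.RESET_DONE): State.RECOVERY_DETERMINATION,
    (State.RECOVERY_DETERMINATION, Event.SERVICES_RECOVERED):
        State.FAULT_DETECTION,
    # After a failed subcard reset, the secondary action is a board reset.
    (State.RECOVERY_DETERMINATION, Event.NOT_RECOVERED_AFTER_SUBCARD_RESET):
        State.FAULT_RECOVERY,
    # After a failed board reset, the node keeps probing until recovery.
    (State.RECOVERY_DETERMINATION, Event.NOT_RECOVERED_AFTER_BOARD_RESET):
        State.RECOVERY_DETERMINATION,
}


def next_state(state: State, event: Event) -> State:
    """Return the next state; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

A table keeps each transition on its own line, so it can be checked against the corresponding sentence in the text.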
Fault alarm
Link-heartbeat loopback detection finds packet loss, triggers SAID ping diagnosis, and performs recovery operations (subcard or board reset). If services still fail to recover and the device continues to detect packet loss, the device reports this alarm.
1.Run the display link-heartbeat command to check information about link-heartbeat packet loss on the interface. Check whether link-heartbeat packets are lost for five consecutive intervals (150s).
2.Collect the alarm information, log information, and configuration information, and then contact technical support personnel.
3.End.