FEI_1.3.6.1.4.1.2011.5.25.157.2.224 hwLinkHeartbeatChangeAlarm

Trap Buffer Description

The link-heartbeat function detected that the packet modification rate reached or exceeded the threshold. (InterfaceIfIndex=[InterfaceIfIndex], InterfaceName=[InterfaceName], SendInterfaceName=[SendInterfaceName], RecvInterfaceName=[RecvInterfaceName])

The link-heartbeat function detected that the packet modification rate reached or exceeded the threshold.
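The parameter list in the trap buffer text can be pulled apart mechanically when the message is collected from a log. The following is a minimal sketch, not a vendor tool: the helper name `parse_trap_params` and the sample values are invented for illustration, and the sketch assumes that parameter values contain no commas or parentheses.

```python
import re

# Matches one "Name=value" pair; the value stops at a comma or parenthesis.
PARAM_RE = re.compile(r"(\w+)=([^,()]*)")


def parse_trap_params(message: str) -> dict:
    """Return the key/value pairs from the last parenthesised group."""
    start = message.rfind("(")
    end = message.rfind(")")
    if start == -1 or end < start:
        return {}
    body = message[start + 1:end]
    return {key: value.strip() for key, value in PARAM_RE.findall(body)}


# Sample values are invented for illustration only.
sample = (
    "The link-heartbeat function detected that the packet modification rate "
    "reached or exceeded the threshold. (InterfaceIfIndex=7, "
    "InterfaceName=GigabitEthernet0/1/0, "
    "SendInterfaceName=GigabitEthernet0/1/0, "
    "RecvInterfaceName=GigabitEthernet0/1/1)"
)
params = parse_trap_params(sample)
print(params["InterfaceName"])
```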

Trap Attributes

Trap Attribute    Description
Alarm or Event    Alarm
Trap Severity     Critical
Mnemonic Code     hwLinkHeartbeatChangeAlarm
Trap OID          1.3.6.1.4.1.2011.5.25.157.2.224
MIB               HUAWEI-PORT-MIB
Alarm ID          0x09ae2003
Alarm Name        hwLinkHeartbeatChangeAlarm
Alarm Type        qualityOfServiceAlarm
Raise or Clear    Raise
Match trap        FEI_1.3.6.1.4.1.2011.5.25.157.2.225 hwLinkHeartbeatChangeAlarmResume
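Because this trap is a raise that is matched by a separate clear trap, a receiver typically tracks which alarms are still active. The sketch below illustrates that pairing with the two trap OIDs listed on this page. The function name `apply_trap` and the choice to key alarms by interface name are assumptions made for the example, not documented behavior.

```python
RAISE_OID = "1.3.6.1.4.1.2011.5.25.157.2.224"  # hwLinkHeartbeatChangeAlarm
CLEAR_OID = "1.3.6.1.4.1.2011.5.25.157.2.225"  # hwLinkHeartbeatChangeAlarmResume


def apply_trap(active: set, trap_oid: str, interface: str) -> set:
    """Return a new set of interfaces that have an uncleared alarm.

    Keying by interface name is an assumption of this sketch.
    """
    updated = set(active)
    if trap_oid == RAISE_OID:
        updated.add(interface)
    elif trap_oid == CLEAR_OID:
        updated.discard(interface)
    # Any other trap OID leaves the set unchanged.
    return updated


# Interface name is invented for illustration only.
active = apply_trap(set(), RAISE_OID, "GigabitEthernet0/1/0")
print(sorted(active))
```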

Trap Buffer Parameters

Parameter            Description
InterfaceIfIndex     Interface index (ifIndex).
InterfaceName        Interface name.
SendInterfaceName    Name of the interface that sends the packets.
RecvInterfaceName    Name of the interface that receives the packets.

VB Parameters

VB OID                                  VB Name                       VB Index
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.1    hwLinkHeartbeatIfindex        hwLinkHeartbeatIfindex
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.2    hwLinkHeartbeatIfName         hwLinkHeartbeatIfindex
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.3    hwLinkHeartbeatTxInterface    hwLinkHeartbeatIfindex
1.3.6.1.4.1.2011.5.25.157.1.24.1.1.4    hwLinkHeartbeatRxInterface    hwLinkHeartbeatIfindex
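In a received trap, each variable binding normally carries the column OID from the table above followed by the instance index. The sketch below shows one way to split such an OID into its VB name and index. The helper name `split_varbind_oid` and the example index value are illustrative assumptions; only the four column OIDs come from this page.

```python
from typing import Optional, Tuple

# Column OIDs and VB names as listed in the VB Parameters table.
VB_COLUMNS = {
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.1": "hwLinkHeartbeatIfindex",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.2": "hwLinkHeartbeatIfName",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.3": "hwLinkHeartbeatTxInterface",
    "1.3.6.1.4.1.2011.5.25.157.1.24.1.1.4": "hwLinkHeartbeatRxInterface",
}


def split_varbind_oid(oid: str) -> Optional[Tuple[str, str]]:
    """Return (VB name, instance index) or None for an unknown OID."""
    for column, name in VB_COLUMNS.items():
        prefix = column + "."
        if oid.startswith(prefix):
            return name, oid[len(prefix):]
    return None


# The trailing ".7" is an invented instance index for illustration.
print(split_varbind_oid("1.3.6.1.4.1.2011.5.25.157.1.24.1.1.2.7"))
```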

Impact on the System

Fault-triggered packet modification occurs on the link with link-heartbeat detection enabled, which may affect service forwarding.

Possible Causes

Fault detection

The ping service node performs link-heartbeat loopback detection to detect service faults. The packets used are ICMP detection packets. There are 12 packet templates in total. Each template sends two packets in sequence within a period of 30s. Therefore, a total of 24 packets are sent by the 12 templates within a period of 30s. After five periods, the system starts to collect statistics on lost packets and modified packets.
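The packet counts and timing in the paragraph above follow from simple arithmetic, worked through here as a short sketch. The constant names are chosen for this example; the values are the ones stated in the text.

```python
TEMPLATES = 12             # packet templates
PACKETS_PER_TEMPLATE = 2   # packets sent per template in one period
PERIOD_SECONDS = 30        # length of one detection period
WARMUP_PERIODS = 5         # periods before statistics collection starts

packets_per_period = TEMPLATES * PACKETS_PER_TEMPLATE
warmup_seconds = WARMUP_PERIODS * PERIOD_SECONDS
warmup_packets = WARMUP_PERIODS * packets_per_period

print(packets_per_period)  # 24 packets per 30 s period
print(warmup_seconds)      # 150 s before statistics are collected
print(warmup_packets)      # 120 packets sent during those five periods
```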

Link-heartbeat loopback detection is classified as packet modification detection or packet loss detection.

Packet modification detection checks whether the content of received heartbeat packets is the same as the content of sent heartbeat packets. If one of the following conditions is met, a trigger message is sent to instruct the SAID ping node to perform fault diagnosis:

1. Modified packets are detected in each of the five periods.

2. Two or more packets are modified in a period.
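The two trigger conditions can be expressed as a small check over per-period counts of modified packets. This is a sketch under stated assumptions rather than the device's implementation: the function name is hypothetical, the input is assumed to be a list of counts with the newest period last, and the second condition is evaluated against the most recent period only.

```python
from typing import Sequence

WINDOW_PERIODS = 5     # "each of the five periods"
PER_PERIOD_LIMIT = 2   # "two or more packets are modified in a period"


def should_trigger_diagnosis(modified_per_period: Sequence[int]) -> bool:
    """Return True if either trigger condition from the text is met."""
    recent = list(modified_per_period[-WINDOW_PERIODS:])
    if not recent:
        return False
    # Condition 2 (assumed to apply to the latest period).
    if recent[-1] >= PER_PERIOD_LIMIT:
        return True
    # Condition 1: at least one modified packet in each of five periods.
    return len(recent) == WINDOW_PERIODS and all(n >= 1 for n in recent)


print(should_trigger_diagnosis([1, 1, 1, 1, 1]))
print(should_trigger_diagnosis([1, 1, 0, 1, 1]))
```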

Fault diagnosis

After receiving the trigger message in the fault detection state, the ping service node enters the fault diagnosis state.

If a packet modification error is detected on the device, the SAID ping node checks whether a module (subcard, TM, or NP) on the device is faulty. Loopback diagnosis is performed regardless of whether a module fault occurs. If packet loss or packet modification occurs during loopback, the local device is faulty. The system then enters the fault recovery state. If no packet is lost during the loopback, the system returns to the fault detection state and generates a packet modification alarm.

Fault recovery

If a fault is detected during loopback diagnosis, the ping service node determines whether a counting error occurs on the associated subcard.

1. If a counting error occurs on the subcard, the ping service node resets the subcard for service recovery. Then, the node enters the service recovery determination state and performs link-heartbeat loopback detection to determine whether services recover. If services recover, the node returns to the fault detection state. If services do not recover, the node returns to the fault recovery state and takes a secondary recovery action. (For a subcard reset, the secondary recovery action is board reset.)

2. If no counting error occurs on the subcard, the ping service node resets the involved board for service recovery. After the board starts, the node enters the service recovery determination state and performs link-heartbeat loopback detection to determine whether services recover. If services recover, the node returns to the fault detection state. If services do not recover, the node remains in the service recovery determination state and periodically performs link-heartbeat loopback detection until services recover.

Service recovery determination

After fault recovery is complete, the ping service node uses the fault packet template to send diagnostic packets. If a fault still exists and a subcard reset was performed, the node generates an alarm and instructs the subcard to perform a switchover for self-healing. If a fault still exists but no subcard reset was performed, the node only generates an alarm. If no fault exists, the node instructs the link-heartbeat loopback function to return to the initial state, and the node itself returns to the fault detection state.
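The four states described above (fault detection, fault diagnosis, fault recovery, and service recovery determination) can be summarized as a transition table. The sketch below is a simplified model for reading purposes only: the state and event names are invented, alarm generation and switchover are omitted, and the transitions follow the fault recovery description rather than any device source code.

```python
from enum import Enum, auto


class State(Enum):
    FAULT_DETECTION = auto()
    FAULT_DIAGNOSIS = auto()
    FAULT_RECOVERY = auto()
    RECOVERY_DETERMINATION = auto()


class Event(Enum):  # hypothetical event names
    TRIGGER_RECEIVED = auto()
    LOOPBACK_FAULT = auto()
    LOOPBACK_CLEAN = auto()
    RESET_DONE = auto()
    SERVICES_RECOVERED = auto()
    UNRECOVERED_AFTER_SUBCARD_RESET = auto()
    UNRECOVERED_AFTER_BOARD_RESET = auto()


TRANSITIONS = {
    (State.FAULT_DETECTION, Event.TRIGGER_RECEIVED): State.FAULT_DIAGNOSIS,
    (State.FAULT_DIAGNOSIS, Event.LOOPBACK_FAULT): State.FAULT_RECOVERY,
    (State.FAULT_DIAGNOSIS, Event.LOOPBACK_CLEAN): State.FAULT_DETECTION,
    (State.FAULT_RECOVERY, Event.RESET_DONE): State.RECOVERY_DETERMINATION,
    (State.RECOVERY_DETERMINATION, Event.SERVICES_RECOVERED):
        State.FAULT_DETECTION,
    # Secondary recovery action (board reset) follows a failed subcard reset.
    (State.RECOVERY_DETERMINATION, Event.UNRECOVERED_AFTER_SUBCARD_RESET):
        State.FAULT_RECOVERY,
    # After a board reset the node keeps re-checking until services recover.
    (State.RECOVERY_DETERMINATION, Event.UNRECOVERED_AFTER_BOARD_RESET):
        State.RECOVERY_DETERMINATION,
}


def next_state(state: State, event: Event) -> State:
    """Return the next state; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = State.FAULT_DETECTION
for event in (Event.TRIGGER_RECEIVED, Event.LOOPBACK_FAULT, Event.RESET_DONE,
              Event.SERVICES_RECOVERED):
    state = next_state(state, event)
print(state.name)
```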

Fault alarm

If link-heartbeat loopback detects packet modification, it triggers SAID ping diagnosis and reports an alarm when any of the following conditions is met:

1. If services fail to be restored after a recovery operation (a subcard or board reset), the device detects packet loss and reports an alarm.

2. If a software error occurs, the device forcibly cancels link-heartbeat loopback and reports an alarm if no other recovery operation is performed within 8 minutes.

3. If no packet loss or packet modification error occurs during link-heartbeat loopback, the device cancels the recovery operation. If no other recovery operation is performed within 8 minutes, the device reports an alarm.

4. If the board does not support SAID ping, the device reports an alarm.

Procedure

1. Run the display link-heartbeat command to check information about link-heartbeat packet modification on the interface. Check whether link-heartbeat packets are modified for 5 consecutive intervals (150s).

  • If no link-heartbeat packets are modified for 5 consecutive intervals (150s), the trap is cleared. Go to Step 3.
  • If link-heartbeat packets are modified for 5 consecutive intervals (150s), the trap is not cleared. Go to Step 2.

2. Collect the alarm information, log information, and configuration information, and then contact technical support personnel.

3. End.

Copyright © Huawei Technologies Co., Ltd.