As the manufacturing processes for electronic components advance toward deep-submicron scales, the per-unit soft failure rate of storage units in these components keeps increasing. As a result, single event upset (SEU) faults often occur and adversely affect services.
If a subcard encounters an SEU fault, SAID for SEU performs loopbacks on all interfaces of the subcard. If packet loss or modification occurs during loopback detection, the subcard is reset for fault rectification.
The SAID system diagnoses an SEU fault through three phases: fault detection, loopback detection, and troubleshooting. This enables devices to perform automatic diagnosis and fault information collection.
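The three phases can be pictured as a simple sequential flow. The following Python sketch is illustrative only and is not the device's implementation: the names `Phase`, `diagnose_subcard`, and `loopback_ok` are hypothetical, and the "none" action for a subcard whose loopbacks all pass is an assumption, since only the reset case is described here.

```python
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class Phase(Enum):
    FAULT_DETECTION = "fault detection"
    LOOPBACK_DETECTION = "loopback detection"
    TROUBLESHOOTING = "troubleshooting"


def diagnose_subcard(
    seu_detected: bool,
    interfaces: Iterable[str],
    loopback_ok: Callable[[str], bool],
) -> Tuple[List[Phase], str]:
    """Walk the three phases and return (phases visited, action taken).

    loopback_ok is a hypothetical callback that returns False when packets
    looped through the given interface are lost or modified.
    """
    phases = [Phase.FAULT_DETECTION]
    if not seu_detected:
        return phases, "none"

    # Loopback detection covers every interface on the subcard.
    phases.append(Phase.LOOPBACK_DETECTION)
    failed = [name for name in interfaces if not loopback_ok(name)]

    # Troubleshooting: reset the subcard if any loopback failed.
    # Assumption: no action is taken when every loopback passes.
    phases.append(Phase.TROUBLESHOOTING)
    action = "reset_subcard" if failed else "none"
    return phases, action
```

For example, calling `diagnose_subcard(True, ["if0", "if1"], lambda name: name != "if1")` visits all three phases and returns the action `"reset_subcard"`, because the loopback on the hypothetical interface `if1` fails.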
Fault detection
SAID for SEU detects an SEU fault on a logical subcard and starts loopback detection.
Loopback detection
During loopback detection, the CPU on the involved interface board sends ICMP packets to an interface on the faulty subcard, and the interface loops the packets back to the CPU.
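The check on looped-back packets amounts to comparing what the CPU sent with what it received. The sketch below simulates this in Python without any real network traffic. The function names and the `loop` callback, which stands in for the path from the CPU to the interface and back, are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Sequence


def loopback_check(
    payloads: Sequence[bytes],
    loop: Callable[[bytes], Optional[bytes]],
) -> Dict[str, int]:
    """Send each payload through the loop and count lost or modified packets.

    loop is a hypothetical stand-in for the loopback path: it returns the
    packet as received by the CPU, or None if the packet never came back.
    """
    result = {"sent": 0, "lost": 0, "modified": 0}
    for payload in payloads:
        result["sent"] += 1
        returned = loop(payload)
        if returned is None:
            result["lost"] += 1          # packet loss
        elif returned != payload:
            result["modified"] += 1      # packet modification
    return result


def needs_reset(result: Dict[str, int]) -> bool:
    """A subcard is reset if any packet was lost or modified."""
    return result["lost"] > 0 or result["modified"] > 0


def flip_first_bit(packet: bytes) -> bytes:
    """Simulate a single-bit upset in the first byte of a packet."""
    return bytes([packet[0] ^ 0x01]) + packet[1:]
```

With a healthy path such as `lambda p: p`, `needs_reset` returns `False`. With `flip_first_bit` as the path, every packet comes back modified and `needs_reset` returns `True`.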
Troubleshooting