SRv6 Microloop Avoidance

Because each node running an IGP maintains its own copy of the link state database (LSDB), nodes may converge at different times after a topology change. This can result in microloops, which are transient loops that disappear once all the nodes on the forwarding path have converged. Microloops cause packet loss, jitter, and out-of-order packets, and therefore must be avoided.

SR avoids potential loops with minimal impact on the network. Specifically, if a network topology change may cause a loop, SR allows network nodes to insert loop-free SRv6 segment lists that guide traffic to the destination address. Normal traffic forwarding is restored only after all the involved network nodes have completed convergence.

SRv6 Local Microloop Avoidance in a Traffic Switchover Scenario

In a traffic switchover scenario, a local microloop is formed when a node adjacent to the faulty node converges earlier than the other nodes on the network. On the network shown in Figure 1, SRv6 TI-LFA is deployed on all nodes. If node B fails, node A goes through the following process to converge its route to node C:

Node A detects the fault and enters the TI-LFA FRR process, during which node A inserts the SRv6 repair list <5::1> into the packet to direct the packet to the TI-LFA-computed PQ node, that is, node E. To direct the packet to node E, node A needs to forward the packet to the next-hop node D first. The SIDs encapsulated into the packet are <5::1, 3::1>.
After performing route convergence, node A searches for the route to node C and forwards the packet to the next-hop node D through the route. In this case, the packet does not carry any SRv6 repair list and is directly forwarded based on the destination address 3::1.
If node D has not completed convergence when receiving the packet, its routing table has node A as the next hop of the route to node C. As a result, node D forwards the packet back to node A, which in turn causes a microloop between the two nodes.
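The loop described in the preceding steps can be reproduced with a minimal sketch that follows per-node next hops hop by hop. The topology of Figure 1 is not reproduced in the text, so the next-hop tables below (including the D->E->C path) and the helper name trace_path are assumptions made for illustration only.

```python
def trace_path(fib, src, dst, max_hops=8):
    """Follow per-node next hops from src to dst.

    Returns (path, looped). looped is True if a node is visited twice.
    """
    path, seen, node = [src], {src}, src
    while node != dst and len(path) <= max_hops:
        node = fib[node][dst]
        path.append(node)
        if node in seen:
            return path, True
        seen.add(node)
    return path, False


# Hypothetical state: node A has converged (next hop D), node D has not
# (its route to C still points back to A).
stale_fib = {"A": {"C": "D"}, "D": {"C": "A"}}

# Hypothetical state after every node has converged.
converged_fib = {"A": {"C": "D"}, "D": {"C": "E"}, "E": {"C": "C"}}

stale_path, stale_loop = trace_path(stale_fib, "A", "C")
good_path, good_loop = trace_path(converged_fib, "A", "C")
print(stale_path, stale_loop)  # ['A', 'D', 'A'] True
print(good_path, good_loop)    # ['A', 'D', 'E', 'C'] False
```

The loop exists only while the two tables disagree. Once node D installs its post-convergence next hop, the same trace reaches node C.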

Figure 1 SRv6 local microloop in a traffic switchover scenario

According to the preceding convergence process, the microloop occurs because node A completes convergence, exits the TI-LFA FRR process, and resumes normal forwarding before the other nodes on the network complete convergence. The microloop can therefore be avoided by postponing the convergence of node A. Because TI-LFA backup paths are loop-free, node A can keep forwarding the packet along the TI-LFA backup path for a period of time, wait for the other nodes to complete convergence, and only then exit the TI-LFA FRR process and perform convergence.

Figure 2 SRv6 local microloop avoidance in a traffic switchover scenario

After microloop avoidance is deployed on the network shown in Figure 2, the convergence process is as follows:

After node A detects the fault, it enters the TI-LFA FRR process, encapsulating the SRv6 repair list <5::1> into the packet and forwarding the packet along the TI-LFA backup path, with node D as the next hop.
Node A starts the timer T1. During T1, node A does not respond to topology changes, the forwarding table remains unchanged, and the TI-LFA backup path continues to be used for packet forwarding. Other nodes on the network converge properly.
When T1 expires, other nodes on the network have completed convergence. Node A can now perform convergence and exit the TI-LFA FRR process to forward the packet along a post-convergence path.
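The delayed convergence described in the preceding steps can be sketched as a small state holder on the PLR. This is an illustrative model only, not a device implementation: the class name, method names, and the 5-second T1 value are assumptions, and time is passed in explicitly so that the example stays deterministic.

```python
class DelayedConvergencePlr:
    """Toy model of a PLR that freezes its forwarding entry while T1 runs."""

    def __init__(self, primary_nh, backup_nh, repair_list, t1_seconds):
        self.entry = (primary_nh, [])            # (next hop, SIDs to insert)
        self.backup = (backup_nh, list(repair_list))
        self.t1 = t1_seconds
        self.t1_expiry = None
        self.pending = None                      # post-convergence entry

    def on_local_failure(self, now):
        # Enter TI-LFA FRR: switch to the backup path and start T1.
        self.entry = self.backup
        self.t1_expiry = now + self.t1

    def on_spf_result(self, next_hop, now):
        # Remember the post-convergence next hop, but do not install it yet.
        self.pending = (next_hop, [])
        self.tick(now)

    def tick(self, now):
        timer_running = self.t1_expiry is not None and now < self.t1_expiry
        if self.pending is not None and not timer_running:
            # T1 has expired: exit FRR and install the converged entry.
            self.entry, self.pending = self.pending, None
            self.t1_expiry = None


plr = DelayedConvergencePlr("B", "D", ["5::1"], t1_seconds=5.0)
plr.on_local_failure(now=0.0)
plr.on_spf_result("D", now=1.0)
during_t1 = plr.entry        # still the TI-LFA backup entry
plr.tick(now=5.0)
after_t1 = plr.entry         # post-convergence entry, no repair list
print(during_t1, after_t1)
```

In this sketch the next hop is node D both during and after T1. What changes at expiry is that the repair list <5::1> is no longer inserted, which is safe only because the other nodes have converged by then.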

The preceding solution can protect only a point of local repair (PLR) against microloops in a traffic switchover scenario. This is because only a PLR can enter the TI-LFA FRR process and forward packets along a TI-LFA backup path. In addition, this solution applies only to single-point-of-failure scenarios. If multiple faults occur, the TI-LFA backup path may also be affected and therefore cannot be relied on for packet forwarding.

SRv6 Local Microloop Avoidance in a Traffic Switchback Scenario

In addition to traffic switchover scenarios, microloops may also occur in traffic switchback scenarios. The following uses the network shown in Figure 3 as an example to describe how a microloop occurs in a traffic switchback scenario. The process is as follows:

Node A sends a packet to destination node F along the path A->B->C->E->F. If the B-C link fails, node A sends the packet to destination node F along the post-convergence path A->B->D->E->F.
After the B-C link recovers, a node (for example, node D) first completes convergence.
When receiving the packet sent from node A, node B has not completed convergence and therefore continues to forward the packet to node D along the pre-recovery path, as shown in Figure 3.
Because node D has completed convergence, it forwards the packet to node B along the post-recovery path, resulting in a microloop between the two nodes.

Traffic switchback does not involve the TI-LFA FRR process. Therefore, delayed convergence cannot be used for microloop avoidance in such scenarios.

Figure 3 SRv6 local microloop in a traffic switchback scenario

According to the preceding process, a transient loop occurs when node D converges earlier than node B during fault recovery. Node D is unable to predict link up events on the network and so is unable to pre-compute any loop-free path for such events. To avoid loops that may occur during traffic switchback, node D needs to be able to converge to a loop-free path.

On the network shown in Figure 4, after node D detects that the B-C link goes up, it re-converges to the D->B->C->E->F path.

In addition, the B-C link up event does not affect the path from node D to node B, which means that this path is loop-free.

Topology changes triggered by a link up event affect only the post-convergence forwarding path that passes through the link. As such, if the post-convergence forwarding path from node D to node B does not pass through the B-C link, it is not affected by the B-C link up event. Similarly, topology changes triggered by a link down event affect only the pre-convergence forwarding path that passes through the link.
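The rule above reduces to checking whether a given path traverses the changed link. The following minimal sketch applies it to the paths named in this section. The function names are chosen for illustration, and paths are modeled simply as lists of node names.

```python
def traverses(path, link):
    """Return True if the node path crosses the link in either direction."""
    return any({u, v} == set(link) for u, v in zip(path, path[1:]))


def affected_by_link_up(post_convergence_path, link):
    # A link up event affects only post-convergence paths through the link.
    return traverses(post_convergence_path, link)


def affected_by_link_down(pre_convergence_path, link):
    # A link down event affects only pre-convergence paths through the link.
    return traverses(pre_convergence_path, link)


d_to_b = ["D", "B"]                      # post-convergence path D->B
d_to_f = ["D", "B", "C", "E", "F"]       # post-convergence path D->F

print(affected_by_link_up(d_to_b, ("B", "C")))  # False
print(affected_by_link_up(d_to_f, ("B", "C")))  # True
```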

A loop-free path from node D to node F can be constructed without specifying the path from node D to node B. Because the path from node C to node F is not affected by the B-C link up event, it is definitely loop-free. In this scenario, only the path from node B to node C is affected. Given this, to compute a loop-free path from node D to node F, you only need to specify a path from node B to node C. According to the preceding analysis, a loop-free path from node D to node F can be formed by inserting only an End.X SID that instructs packet forwarding from node B to node C into the post-convergence path of node D.

Figure 4 SRv6 local microloop avoidance in a traffic switchback scenario

After microloop avoidance is deployed, the convergence process is as follows:

After the B-C link recovers, a node (for example, node D) first completes convergence.
Node D starts the timer T1. Before T1 expires, node D computes a microloop avoidance segment list <2::3, 3::5, 5::6> for the packet destined for node F.
When receiving the packet sent from node A, node B has not completed convergence and therefore continues to forward the packet to node D along the pre-recovery path.
Node D inserts the microloop avoidance segment list <2::3, 3::5, 5::6> into the packet and forwards the packet to node B.

Although the packet is sent from node B to node D and then back to node B, no loop occurs because node D has changed the destination address of the packet to End.X SID 2::3.
According to the instruction bound to End.X SID 2::3, node B forwards the packet to node C through the outbound interface specified by End.X SID 2::3 and decrements the SL value by 1.
Node C and node E forward the packet to the destination node F according to the remaining SIDs in the segment list.

As previously described, node D inserts the microloop avoidance segment list <2::3, 3::5, 5::6> into the packet, avoiding loops.
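The forwarding behavior in the preceding steps can be traced with a simplified model of End.X processing. The sketch makes several assumptions that are not in the text: the destination address of node F is written as 6::, the SIDs 3::5 and 5::6 are treated as End.X SIDs for the C->E and E->F adjacencies, the segment list is kept in traversal order rather than in the reversed order used by a real SRH, and node D reaches 2::3 through a hypothetical route with node B as the next hop.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dst: str                                       # current IPv6 destination
    segments: list = field(default_factory=list)   # traversal order (simplified)
    sl: int = 0                                    # Segments Left


def insert_segment_list(pkt, sids):
    """Insert SIDs in front of the original destination and activate the first."""
    pkt.segments = list(sids) + [pkt.dst]
    pkt.sl = len(pkt.segments) - 1
    pkt.dst = pkt.segments[0]


# End.X SID -> (node that owns the SID, neighbor reached through it)
END_X = {"2::3": ("B", "C"), "3::5": ("C", "E"), "5::6": ("E", "F")}
ROUTES = {("D", "2::3"): "B"}      # hypothetical route on node D
DEST_OWNER = {"6::": "F"}          # hypothetical address of node F


def forward(node, pkt, max_hops=16):
    trace = [node]
    for _ in range(max_hops):
        if DEST_OWNER.get(pkt.dst) == node:
            return trace                           # delivered
        owner, neighbor = END_X.get(pkt.dst, (None, None))
        if owner == node:
            # End.X: decrement SL, activate the next SID, use the bound link.
            pkt.sl -= 1
            pkt.dst = pkt.segments[len(pkt.segments) - 1 - pkt.sl]
            node = neighbor
        else:
            node = ROUTES[(node, pkt.dst)]         # ordinary route lookup
        trace.append(node)
    raise RuntimeError("hop limit exceeded")


pkt = Packet(dst="6::")
insert_segment_list(pkt, ["2::3", "3::5", "5::6"])
trace = forward("D", pkt)
print(trace)  # ['D', 'B', 'C', 'E', 'F']
```

Node B never consults its stale route to node F in this trace, because the destination address it sees is its own End.X SID 2::3.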

When T1 of node D expires, other nodes on the network have completed convergence, allowing node A to forward the packet along the post-convergence path A->B->C->E->F.

SRv6 Remote Microloop Avoidance in a Traffic Switchover Scenario

In a traffic switchover scenario, a remote microloop may also occur between two nodes on a packet forwarding path if the node closer to the point of failure converges earlier than a node farther from it. The following uses the network shown in Figure 5 as an example to describe how a remote microloop occurs in a traffic switchover scenario. The process is as follows:

After detecting a C-E link fault, a node (for example, node G) first completes convergence, whereas node B has not converged.
Nodes A and B forward the packet to node G along the path used before the fault occurs.
Because node G has completed convergence, it forwards the packet to node B according to the next hop of the corresponding route, resulting in a microloop between the two nodes.

Figure 5 SRv6 remote microloop in a traffic switchover scenario

To limit the computation workload, a network node generally pre-computes loop-free paths only for failures of its directly connected links or nodes. No loop-free path is pre-computed for any other potential fault on the network. Given this, the microloop can be avoided only by installing a loop-free path after node G converges.

As mentioned above, topology changes triggered by a link down event affect only the pre-convergence forwarding path that passes through the link. If the path from a node to the destination node does not pass through the faulty link before convergence, the path is not affected by the link fault. In the topology shown in Figure 5, the path from node G to node D is not affected by the C-E link fault, and therefore does not need to be specified for computing a loop-free path from node G to node F. Similarly, the path from node E to node F is not affected by the C-E link fault, and therefore does not need to be specified, either. Because only the path from node D to node E is affected by the C-E link fault, you only need to specify the End.X SID 4::5 identifying the path from node D to node E to determine the loop-free path, as shown in Figure 6.
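The reasoning above can be sketched as a search along the post-convergence path for the last node that the source could still reach before the fault without crossing the faulty link. The link costs below are hypothetical, chosen only so that the shortest paths match the ones described for Figure 5. The function names are illustrative, and the sketch checks only the source side of the affected hop, taking it as given (as in the text) that the path beyond that hop is unaffected.

```python
import heapq


def build_graph(links):
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def shortest_path(graph, src, dst):
    """Plain Dijkstra returning the node list from src to dst, or None."""
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def traverses(path, link):
    return any({u, v} == set(link) for u, v in zip(path, path[1:]))


def avoidance_adjacency(links, src, dst, failed):
    """Return the (P, Q) hop that needs an End.X SID, or None."""
    pre = build_graph(links)
    post = build_graph({l: c for l, c in links.items() if set(l) != set(failed)})
    post_path = shortest_path(post, src, dst)
    p_index = 0
    for i in range(1, len(post_path)):
        # Stop at the first node whose pre-fault path used the faulty link.
        if traverses(shortest_path(pre, src, post_path[i]), failed):
            break
        p_index = i
    if p_index == len(post_path) - 1:
        return None                      # the whole path is unaffected
    return post_path[p_index], post_path[p_index + 1]


# Hypothetical costs; only the D-E cost is raised to shape the paths.
LINKS = {("A", "B"): 1, ("B", "G"): 1, ("G", "C"): 1, ("C", "E"): 1,
         ("E", "F"): 1, ("B", "D"): 1, ("D", "E"): 10}
END_X_SIDS = {("D", "E"): "4::5"}        # SID taken from the text

hop = avoidance_adjacency(LINKS, "G", "F", ("C", "E"))
segment_list = [END_X_SIDS[hop]]
print(hop, segment_list)  # ('D', 'E') ['4::5']
```

With these costs, node G reaches node D through node B both before and after the fault, so only the D->E hop needs to be pinned with an End.X SID.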

Figure 6 SRv6 remote microloop avoidance in a traffic switchover scenario

After microloop avoidance is deployed, the convergence process is as follows:

After detecting a C-E link fault, a node (for example, node G) first completes convergence.
Node G starts the timer T1. Before T1 expires, node G computes a microloop avoidance segment list <4::5> for the packet destined for node F.
When receiving the packet sent from node A, node B has not completed convergence and therefore continues to forward the packet to node G along the path used before the fault occurs.
Node G inserts the microloop avoidance segment list <4::5> into the packet and forwards the packet to node B.

Although the packet is sent from node B to node G and then back to node B, no loop occurs because node G has changed the destination address of the packet to End.X SID 4::5.
Node B searches the routing table for the route to destination address 4::5 and forwards the packet to node D through the route.
According to the instruction bound to the End.X SID 4::5, node D forwards the packet to node E through the outbound interface specified by the End.X SID, decrements the SL value by 1, and updates the destination address in the IPv6 packet header to 6::.
Node E forwards the packet to destination node F along the shortest path.

As previously described, node G inserts the microloop avoidance segment list <4::5> into the packet, avoiding loops.

When T1 of node G expires, other nodes on the network have completed convergence, allowing node A to forward the packet along the post-convergence path A->B->D->E->F.

SRv6 Remote Microloop Avoidance in a Traffic Switchback Scenario

The following uses the network shown in Figure 5 as an example to describe how a remote microloop occurs in a traffic switchback scenario. The process is as follows:

After the C-E link recovers, a node (for example, node B) first completes convergence, whereas node G has not converged.
Node B forwards the packet to node G.
Because node G has not completed convergence, it continues to forward the packet to node B along the pre-recovery path, resulting in a microloop between the two nodes.

To avoid the microloop, start the timer T1 on node B. Then, before T1 expires, node B inserts an End.X SID identifying the path between nodes G and C into the packet destined for node F to ensure that the packet can be forwarded to node C. In this way, node C can forward the packet to destination node F according to the destination IPv6 address 6:: along the shortest path.