EIGRP Unequal Cost Load Balancing
In this post I would like to explore unequal cost load balancing with EIGRP, and in particular how we configure EIGRP in order to achieve it. I will talk about the feasible distance, the metric weights, offset lists, and the different methods we can use to change the traffic sharing among multiple unequal cost paths with EIGRP, and I will finally cover CEF load balancing briefly.
For this post I will use the following topology:
IGP: EIGRP AS 100
Platform/IOS: Cisco 2691/12.4(15)T11 Adv IP services
R4's loopback IP address is the one I will focus on for unequal cost load balancing, from the perspective of R1.
By default EIGRP uses bandwidth and delay to calculate its composite metric. The default weighting of K1 and K3 means that only bandwidth and delay are used.
The calculation of the composite metric is as follows:
Metric = [K1 * bandwidth + (K2 * bandwidth)/(256 - load) + K3 * delay] * [K5/(reliability + K4)]
If K5 equals zero, the second half of the equation is ignored (the multiplier is treated as 1).
- Bandwidth is 10^7 divided by the minimum bandwidth (in kbps) along the path, multiplied by 256
- Delay is the sum of the delays along the path, expressed in tens of microseconds, multiplied by 256
For simplicity, I will only use delay for the composite metric calculation, so every router in the EIGRP domain is configured with the following command:
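The K values are changed with the metric weights router subcommand, whose first argument is the ToS followed by K1 through K5. A sketch of the change, assuming AS 100 as in the topology:

```
router eigrp 100
 metric weights 0 0 0 1 0 0
```

With K1 = K2 = K4 = K5 = 0 and K3 = 1, the composite metric reduces to the scaled delay alone.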
To confirm that the above command has taken effect, we can look at a Wireshark capture of a hello packet sent from R1 after the change:
We can see that K3=1 and all the other K values are equal to zero. For an EIGRP adjacency to form, the K values must be the same throughout the entire EIGRP domain.
Let's have a look at the EIGRP topology entry for the loopback of R4 (22.214.171.124) on R1, which will be the focal point from now on:
If you look at the topology above, the total delay of 30 microseconds corresponds to the path R1 -> R3 -> R4, which is the shortest path from a metric point of view. R1 F0/0 has a delay of 10 usec, R3 F0/1 has a delay of 10 usec, and R4 Loopback4 has a delay of 10 usec, which results in a total delay of 30 usec and a composite metric of 768. The composite metric is based on the delay only (we changed the default EIGRP metric weights previously) and is calculated as follows:
Metric = 3 (tens of microseconds) * 256 = 768
The reported distance of R3 is its own metric to reach the loopback of R4, so we can conclude that the metric contribution of the link between R1 and R3 is 768 - 512 = 256:
Alright! R1 should also receive routing information for the loopback of R4 through R2. However, it will never install the route through R2 in its routing table, as the metric ([10+10+1]*256 = 5376) is much higher than that of the path via R3, and unequal cost load balancing has not been configured yet. Let's verify that R1 knows the path through R2 for 126.96.36.199:
Indeed it does! We can also see the metric that we calculated just before: 5376 for the total path to R4's loopback. So these paths are unequal cost paths. How do we load balance (I should actually say load share) between these two paths?
Well, first of all, in order for the path via R2 to be considered for load balancing, the route must pass the Feasibility Condition. The Feasibility Condition states that if the reported distance of an alternate route is lower than the feasible distance of the successor, the route is a loop-free path and can be considered for load balancing.
In our case the RD (reported distance) of the path via R2 is higher than the FD (feasible distance) of the path via R3 (2816 vs 768). That means the path through R2 will never be considered for load balancing by the EIGRP process: R1 cannot prove that R2 is closer to R4 than R1 itself is, and therefore there could be a loop.
To solve this I will decrease the delay on R2's interface connected to the subnet 188.8.131.52/24 from 100 usec to 10 usec. Just for fun, we can have a look at how the EIGRP process reacts when we change the delay on R2:
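Remember that the IOS delay interface command takes its value in tens of microseconds, so 10 usec corresponds to a value of 1. A sketch of the change on R2 (the interface name is an assumption based on the topology):

```
R2(config)# interface FastEthernet0/1
R2(config-if)# delay 1
```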
From the Wireshark capture we can see that as soon as we change the delay on R2, R2 multicasts an update to all its connected neighbors with two prefixes, one of them being 184.108.40.206/32 with the new updated metric of 512 (2*256 = 512). I have not attached the next Wireshark capture here, but when receiving this update, R1 sends an EIGRP unicast acknowledgment packet to R2 to confirm that it has received the update. Let's now check on R1 that this is the case:
Indeed, R1 has updated its EIGRP topology table with the new metric. That means the alternative path via R2 can now be elected as a feasible successor, as its RD (reported distance) of 512 is less than the FD (feasible distance) of 768. Still, we are missing one step to achieve unequal cost load balancing: the EIGRP variance is 1 by default, so the alternative path will not be present in the routing table of R1:
As we can see in the above output, R1 is only considering the path via R3 for 220.127.116.11 due to the default variance of 1.
EIGRP allows load distribution among unequal cost paths, which is controlled by the variance command. If the feasible distance (the end-to-end metric of the best route) multiplied by the variance is greater than the metric of a feasible successor, that unequal cost path is installed and load balancing occurs. Only feasible successors are candidates for load balancing.
So let's change the variance on R1 so that the FD of the successor, which is 768, scaled by the variance X, is greater than the metric of the alternate path via R2:
768*X > 3072, so X > 3072/768, so the variance X must be greater than 4. Let's configure a variance of 5 on R1 and have a look at the routing table again:
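A sketch of the variance configuration on R1:

```
R1(config)# router eigrp 100
R1(config-router)# variance 5
```

With variance 5, any feasible successor whose metric is below 5 * 768 = 3840 is installed, which covers the 3072 path via R2.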
Perfect! Both paths are now installed in the routing table, although they have different metrics. The switching process of R1 will load share traffic between these two links in a ratio of 4:1.
I would like to explain how traffic share works as I think it is quite interesting.
The automatically calculated traffic share count causes the links to be used in inverse proportion to their composite metrics (the lower-metric path carries more traffic). But the load balancing (or, more accurately, load sharing) is accomplished by the switching process of the router, which in this case is CEF. So CEF performs the load balancing once the routing table has been calculated.
Cisco Express Forwarding (CEF) is an advanced Layer 3 switching technology that can be used for load balancing in routers. By default, CEF uses per-destination load balancing: if it is enabled on an interface, packets are forwarded based on the path used to reach the destination.
If two or more parallel paths exist for a destination, CEF keeps taking the same (single) path for a given source and destination pair and avoids the parallel paths. This is the default per-destination behavior of CEF. In order to utilize all the parallel paths and load balance the traffic in our test, we must enable per-packet load balancing, which here implies disabling CEF; the resulting process switching is processor intensive and impacts the overall forwarding performance.
To test load balancing I am going to configure an ACL matching ICMP packets on R2 and R3 so we can verify how many packets go through R3 compared to R2. Let's ping from R1 to 18.104.22.168 with CEF on:
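A sketch of the counting ACL, applied inbound on the transit interface (the ACL number and interface name are assumptions; the trailing permit ip any any keeps all other traffic flowing):

```
R3(config)# access-list 100 permit icmp any any
R3(config)# access-list 100 permit ip any any
R3(config)# interface FastEthernet0/0
R3(config-if)# ip access-group 100 in
```

The same ACL goes on R2; show access-lists 100 on each router then displays a match counter for the ICMP entry.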
So all the ICMP traffic is going via R3, which means there is no sharing at all. We should get a share ratio of 4:1 (see the routing table output above for destination 22.214.171.124).
Let's go ahead and disable CEF on R1 and enable per-packet load balancing (not recommended in production):
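Disabling CEF alone falls back to fast switching, which still caches per destination, so fast switching is disabled on the outgoing interfaces as well to force process switching. A sketch (the interface names are assumptions):

```
R1(config)# no ip cef
R1(config)# interface FastEthernet0/0
R1(config-if)# no ip route-cache
R1(config-if)# interface FastEthernet0/1
R1(config-if)# no ip route-cache
```

Process-switched traffic is distributed packet by packet according to the traffic share counts in the routing table.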
Let's ping again:
So now we get the desired sharing in a 4:1 ratio, which means that R1 is sending 4 times more packets via R3 than via R2.
But what if we want to change the share so that, for example, R1 sends 10 times more packets via R3 than via R2, that is, a share ratio of 10:1?
There are different solutions to do this:
- Changing the delay to affect the composite metric
- Using an offset list to affect the composite metric
Let's start by changing the delay to obtain a share ratio of 10:1. Let's have a look at the routing entry as well as the EIGRP topology table entry for 126.96.36.199 on R1 again:
The entire composite metric through R2 should be 10 times the FD, that is to say 10*768. R2 advertises a delay of 20 microseconds (512/256 = 2 tens of microseconds). So for a traffic share of 10:1, the required delay value on R1's interface towards R2 can be calculated as follows:
768*10 = (R1_TO_R2_DLY + 2) * 256
7680/256 = R1_TO_R2_DLY + 2
R1_TO_R2_DLY = 28 tens of microseconds, which is 280 microseconds
So let's change the delay on the R1 interface connecting to R2, and also change the variance to 10 to accommodate the new composite metric of the path via R2. Let's have a look at the routing table on R1 now:
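A sketch of both changes on R1 (the interface name is an assumption; the delay command takes tens of microseconds, so 280 usec is configured as 28):

```
R1(config)# interface FastEthernet0/1
R1(config-if)# delay 28
R1(config-if)# router eigrp 100
R1(config-router)# variance 10
```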
So the desired traffic share ratio of 10:1 has been achieved by modifying the delay. I will now reset the delay on R1 to the value it had before I changed it to 280 usec.
Let's now try to achieve the same 10:1 share by using an offset list, accomplishing it with a single command. With an offset list we can modify the composite metric directly. To achieve the 10:1 share, the composite metric through R2 should be 10 times the FD via R3, so we need the following offset:
Offset metric = 10*768 - 3072 = 7680 - 3072 = 4608 (with an offset list we add a value to the current metric, which is why the current metric through R2, 3072, is taken into account in the calculation).
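A sketch of the offset list on R1, applied inbound to updates received on the interface facing R2 (the ACL number and interface name are assumptions):

```
R1(config)# access-list 10 permit host 126.96.36.199
R1(config)# router eigrp 100
R1(config-router)# offset-list 10 in 4608 FastEthernet0/1
```

The variance configured earlier must still be large enough for the resulting 7680 metric via R2 to stay in the routing table.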
Let's verify the share count now in the routing table of R1:
Let's verify the share count with a ping:
Indeed, the share count is almost exactly 10:1: R1 is sending 10 times more packets via R3 than via R2.
Thanks for reading.