BGP over GRE
While working through some BGP labs I came across an interesting topic: BGP over GRE. Automatic tunneling combined with BGP is the core of MPLS VPNs, so I think it is worth seeing what happens when manual tunnels are used along with BGP. Let's have a look at the following topology, which illustrates my example:
The core IGP is EIGRP, while BGP is used between AS 100 and AS 200 to advertise both loopback IPs. Only routers R5 and R1 are running BGP. For the BGP peering, R1 IP 184.108.40.206 and R5 IP 220.127.116.11 will be used. A traceroute from R1 to R5 shows 3 hops, which means that we will have to use either ebgp multihop or ttl security: by default EBGP TCP packets are sent with a TTL of 1, because EBGP peers are expected to be directly connected. In this example I will use ttl security, which lets the session come up across several hops just like ebgp multihop but adds some security: packets are sent with a TTL of 255, and a received packet is only accepted if its TTL shows that it crossed no more than the configured number of hops. Let's build the EBGP session between R1 and R5.
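As an aside on the TTL reasoning above, here is a minimal Python sketch of why a TTL of 1 cannot reach R5 while ttl security can. It assumes that a 3-hop traceroute means two transit routers between the peers, and that the receiver accepts a packet when its TTL is at least 255 minus the configured hop count, which is how I understand the IOS check. The function names are my own.

```python
from typing import Optional


def ttl_on_arrival(initial_ttl: int, routers_in_path: int) -> Optional[int]:
    """Return the TTL seen by the destination, or None if the packet expires."""
    ttl = initial_ttl
    for _ in range(routers_in_path):
        ttl -= 1  # every forwarding router decrements the TTL
        if ttl <= 0:
            return None  # the router drops the packet instead of forwarding it
    return ttl


def ttl_security_accepts(arrival_ttl: Optional[int], max_hops: int) -> bool:
    """Assumed check: accept only if the TTL is >= 255 - max_hops."""
    return arrival_ttl is not None and arrival_ttl >= 255 - max_hops


TRANSIT_ROUTERS = 2  # assumption: 3 traceroute hops = 2 routers between R1 and R5

print(ttl_on_arrival(1, TRANSIT_ROUTERS))    # default EBGP TTL: expires in the core
print(ttl_on_arrival(255, TRANSIT_ROUTERS))  # ttl security: arrives with a high TTL
print(ttl_security_accepts(ttl_on_arrival(255, TRANSIT_ROUTERS), max_hops=3))
```

A packet spoofed from further away arrives with a lower TTL and fails the check, which is the extra security compared to ebgp multihop.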
Now we can see that the EBGP session is up between R1 and R5, with R1 being the TCP client and R5 the TCP server. Here is the output of debug ip tcp transactions and debug ip bgp on R1:
Before any connection attempt is made, the BGP peer relation must have left the Idle state. On IOS this happens once the router has a route to the neighbor address (for a plain EBGP session, on a directly connected interface). The BGP process then tries to establish a TCP session towards the neighbor on port 179, and while it is doing so the session is shown as Active.
Once the TCP three-way handshake is done (SYN[R1], SYN ACK[R5] and ACK[R1]), R1 and R5 send a BGP open message to each other containing the AS number, the BGP RID, the BGP version, the hold time and optional parameters such as capabilities advertisement (for example route refresh capability used for soft refresh, or session authentication). A router is in the OpenSent state after sending its open message and moves to OpenConfirm when it receives a valid open message from its peer. If both EBGP peers agree on the parameters carried in the open messages, they send each other a keepalive packet to signal this acceptance, and the state is then Established (neighbor coming UP).
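The state changes described above can be sketched as a small table-driven state machine. This is a simplified model loosely following the RFC 4271 state names; the event names are my own, and treating every unexpected event as a reset to Idle is a simplification.

```python
# Simplified BGP finite state machine; event names are illustrative only.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",      # OPEN is sent here
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",       # OPEN is sent here
    ("OpenSent", "open_received"): "OpenConfirm",    # KEEPALIVE is sent here
    ("OpenConfirm", "keepalive_received"): "Established",
}


def run(events, state="Idle"):
    """Feed events to the state machine and return every state visited."""
    path = [state]
    for event in events:
        # Simplification: any event not listed for the current state resets to Idle.
        state = TRANSITIONS.get((state, event), "Idle")
        path.append(state)
    return path


session = run(["start", "tcp_established", "open_received", "keepalive_received"])
print(" -> ".join(session))
```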
Now that the EBGP session is up between R1 and R5, let's advertise their respective loopbacks in BGP:
If we look at R5 and do a debug ip bgp update then we see the following:
From the above output we can see that R5 installs the prefix (18.104.22.168/24) advertised by R1 in its routing table, as the recursive lookup for the next-hop succeeded. The routing process in this case has found an outgoing interface for 22.214.171.124.
We can also see that R1 sends some BGP attributes with the prefix. Origin, AS-path and Next-hop are well-known mandatory attributes and must be present in every update message that carries a prefix. In this case we can also see the metric (MED). It is an optional non-transitive attribute, which means that it does not have to be present in every update, and a router that does not recognize it quietly ignores it and does not pass it on. The partial bit is only used for optional transitive attributes, to signal that a router along the path passed the attribute on without recognizing it.
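To make the attribute categories concrete, here is a small Python sketch that decodes the flag byte carried in front of every path attribute. The bit values come from the BGP specification; the flag values listed for each attribute are the ones typically seen on the wire and are given here as an illustration, not taken from the debug output.

```python
# Path attribute flag bits (high-order bits of the flag byte).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20


def describe(flags: int) -> str:
    """Turn an attribute flag byte into a readable category."""
    kind = "optional" if flags & OPTIONAL else "well-known"
    scope = "transitive" if flags & TRANSITIVE else "non-transitive"
    partial = ", partial" if flags & PARTIAL else ""
    return f"{kind} {scope}{partial}"


# Typical flag bytes (illustrative).
ATTRIBUTES = {
    "ORIGIN": 0x40,
    "AS_PATH": 0x40,
    "NEXT_HOP": 0x40,
    "MULTI_EXIT_DISC": 0x80,  # the "metric" seen in the update
}

for name, flags in ATTRIBUTES.items():
    print(f"{name:16} {describe(flags)}")
```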
The same advertisement process is done on R5 and the prefix 126.96.36.199/24 is advertised to R1.
So let's look at the BGP table of R1:
From the output above we can see that R1 has 2 network entries: the one originated by R1 and the one advertised by R5, with a next-hop of 188.8.131.52. We can also see that the prefix advertised by R5 comes from AS 200 and is seen as external, which means that it has been learned via EBGP.
From a BGP point of view everything looks fine, but what happens when, for example, R1 tries to ping the loopback of R5, 184.108.40.206? Let's try:
The packet does not get any further than R2. We can tell because R1 receives an ICMP type 3 (destination unreachable) from R2 (220.127.116.11). R2 sends this ICMP packet because its routing table has no route for 18.104.22.168, so it cannot forward the packet. Here is the output from R2 that illustrates the issue:
So why is that happening? The IGP has no knowledge of the loopbacks of R1 and R5. They are only advertised in BGP, and the core routers do not run BGP, which results in the prefixes being black-holed.
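The black-holing can be illustrated with a minimal longest-prefix-match lookup in Python. The addresses below are hypothetical documentation addresses, not the ones from the lab: R2 only knows the core links learned through the IGP, so a lookup for R5's loopback finds nothing while the peering address resolves fine.

```python
import ipaddress


def lookup(table, destination):
    """Longest-prefix match; returns the route description or None."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, via) for net, via in table.items() if dest in net]
    if not matches:
        return None  # no route: the router answers with ICMP unreachable
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical routing table of R2: core links only, no BGP prefixes.
R2_TABLE = {
    ipaddress.ip_network("192.0.2.0/30"): "connected (link to R1)",
    ipaddress.ip_network("192.0.2.4/30"): "connected (link to next core router)",
    ipaddress.ip_network("192.0.2.8/30"): "EIGRP (link towards R5)",
}

R5_PEERING_IP = "192.0.2.10"   # hypothetical, known to the IGP
R5_LOOPBACK = "203.0.113.5"    # hypothetical, only advertised in BGP

print(lookup(R2_TABLE, R5_PEERING_IP))
print(lookup(R2_TABLE, R5_LOOPBACK))
```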
So what is the solution in this case? Tunneling! When the packets between the loopbacks are tunneled, the non-BGP devices only ever see the outer tunnel header, so the unknown addresses 22.214.171.124 and 126.96.36.199 stay hidden from the core network.
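Here is a rough Python model of what the tunnel does to a packet. It is only a sketch: the Packet class and the addresses are hypothetical, and a real GRE header carries more than is shown here. The point is that a core router forwards on the outer destination, which the IGP knows, and never looks at the inner one.

```python
from dataclasses import dataclass
from typing import Union

GRE = 47   # IP protocol number of GRE
ICMP = 1   # IP protocol number of ICMP


@dataclass
class Packet:
    src: str
    dst: str
    protocol: int
    payload: Union["Packet", bytes]


def gre_encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap the original packet in an outer header between the tunnel endpoints."""
    return Packet(tunnel_src, tunnel_dst, GRE, inner)


def gre_decapsulate(outer: Packet) -> Packet:
    """Strip the outer header at the far tunnel endpoint."""
    if outer.protocol != GRE or not isinstance(outer.payload, Packet):
        raise ValueError("not a GRE-encapsulated packet")
    return outer.payload


# Hypothetical addresses: loopback to loopback ping, tunneled between peering IPs.
ping = Packet("198.51.100.1", "203.0.113.5", ICMP, b"echo request")
on_the_wire = gre_encapsulate(ping, "192.0.2.1", "192.0.2.10")

print(on_the_wire.dst)                    # what the core routers route on
print(gre_decapsulate(on_the_wire).dst)   # what R5 sees after decapsulation
```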
So let's configure a tunnel on R1 and R5:
In order to send the traffic destined for the loopbacks into the tunnel, the next-hop must be changed to point to the tunnel IP address of each BGP peer. In this example let's configure two route-maps on R1.
- The first route-map will be used in the inbound direction to change the next-hop of the prefix advertised by R5 (188.8.131.52/24) to point to the tunnel IP of R5.
- The second route-map will be used in the outbound direction to change the next-hop of the prefix advertised by R1 (184.108.40.206/24) to point to the tunnel IP of R1.
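The effect of the inbound route-map can be modelled in a few lines of Python. This is not router configuration, just a sketch with hypothetical addresses and interface names: rewriting the next-hop changes which outgoing interface the recursive lookup on R1 ends up with.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    next_hop: str


def set_next_hop(route, match_prefix, new_next_hop):
    """Mimic a route-map entry: match a prefix, set a new next-hop."""
    if route.prefix == match_prefix:
        return replace(route, next_hop=new_next_hop)
    return route


def resolve(next_hop, routing_table):
    """Recursive lookup: find the outgoing interface for a next-hop.

    The prefixes below do not overlap, so the first match is enough.
    """
    addr = ipaddress.ip_address(next_hop)
    for network, interface in routing_table.items():
        if addr in ipaddress.ip_network(network):
            return interface
    return None


# Hypothetical routing table of R1.
R1_TABLE = {
    "192.0.2.0/30": "FastEthernet0/0",   # link to R2
    "192.0.2.8/30": "FastEthernet0/0",   # R5 peering subnet, learned via EIGRP
    "10.0.15.0/30": "Tunnel0",           # tunnel subnet between R1 and R5
}

received = BgpRoute(prefix="203.0.113.0/24", next_hop="192.0.2.10")
rewritten = set_next_hop(received, "203.0.113.0/24", "10.0.15.2")

print(resolve(received.next_hop, R1_TABLE))   # native forwarding, black-holed at R2
print(resolve(rewritten.next_hop, R1_TABLE))  # forwarded into the tunnel
```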
Let's apply both route-maps:
After clearing the BGP session, the BGP tables of R5 and R1 are updated with the new next-hops:
So now we can actually ping both loopbacks:
That is it! A quick challenge question to my readers: Why is the traceroute showing only one hop?