BGP Unnumbered
BGP requires a next-hop for a route to be valid. In IPv4, the next-hop IP address is used only to resolve the MAC address that goes on the frame; it does not appear in the packet at all. The same holds for IPv6, where the next-hop IPv6 address is resolved to a MAC address using IPv6's equivalent of ARP: Neighbor Discovery (ND). In both cases, forwarding to the original destination involves only the next-hop's MAC address, so the next-hop IP address serves only to obtain that MAC address.
RFC 5549 builds on this observation and provides an encoding scheme that allows a router to advertise IPv4 routes with an IPv6 next-hop. It is a somewhat obscure RFC, published in 2009. Its purpose is to allow the advertisement of IPv4 routes, and the routing of IPv4 packets, over a pure IPv6 network.
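To make the encoding idea concrete, here is a minimal Python sketch of the value of an MP_REACH_NLRI attribute that carries an IPv4 prefix with an IPv6 next-hop. The function names are hypothetical and the attribute header (flags, type, length) is omitted. The RFC allows a 16- or 32-octet next-hop field; the sketch uses the 16-octet form and places the LLA in it for simplicity, which may not match what a given implementation sends on the wire.

```python
import ipaddress
import struct

AFI_IPV4 = 1      # the NLRI carries IPv4 prefixes
SAFI_UNICAST = 1  # unicast forwarding


def encode_prefix(prefix: str) -> bytes:
    """Encode an IPv4 prefix as NLRI: one length octet, then only the needed prefix octets."""
    net = ipaddress.IPv4Network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def mp_reach_value(prefixes, next_hop: str) -> bytes:
    """Build AFI, SAFI, next-hop length, next-hop, reserved octet and NLRI (hypothetical helper)."""
    nh = ipaddress.IPv6Address(next_hop).packed  # 16 octets
    header = struct.pack("!HBB", AFI_IPV4, SAFI_UNICAST, len(nh))
    reserved = b"\x00"
    nlri = b"".join(encode_prefix(p) for p in prefixes)
    return header + nh + reserved + nlri


# Addresses taken from the lab outputs in this post.
value = mp_reach_value(["192.168.2.1/32"], "fe80::5200:ff:fe04:1")
print(len(value), value.hex())
```

The point to notice is that the AFI/SAFI pair still says "IPv4 unicast" while the next-hop field holds 16 octets of IPv6 address.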
To summarize:
- BGP unnumbered uses the interface’s IPv6 LLA to set up a BGP session with a peer.
- The IPv6 LLA of the remote end is discovered via IPv6’s Router Advertisement (RA) protocol.
- RA provides not only the remote end’s LLA, but also its corresponding MAC address.
- BGP uses RFC 5549 to encode IPv4 routes as reachable over an IPv6 nexthop, using the IPv6 LLA as the nexthop.
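The MAC addresses and LLAs that appear in the outputs in this post look like they follow the modified EUI-64 scheme (flip the universal/local bit of the MAC and insert ff:fe in the middle). Assuming that derivation, the following Python sketch reproduces the LLA from the MAC; the function name is hypothetical, and interfaces configured with other address-generation modes would not match.

```python
import ipaddress


def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive an fe80::/64 address from a colon-separated MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80 followed by 48 zero bits, then the 64-bit interface identifier
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


# MAC of the spine's swp1 as shown in the interface output in this post.
print(mac_to_link_local("50:00:00:05:00:01"))  # fe80::5200:ff:fe05:1
```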
Cumulus implementation
The RIB process programs a static ARP entry with a reserved IPv4 LLA, 169.254.0.1, with the MAC address set to the one learned via RA. BGP hands down to the RIB process IPv4 routes with the IPv6 LLA as the next-hop. The RIB process converts the next-hop to 169.254.0.1 and the outgoing interface before programming the route in the forwarding table, as shown here:

    cumulus@basondole-leaf:~$ arp
    Address                  HWtype  HWaddress           Flags Mask            Iface
    169.254.0.1              ether   50:00:00:05:00:01   CM                    swp1

    cumulus@basondole-leaf:~$ ip route
    192.168.1.1 via 169.254.0.1 dev swp1 proto bgp metric 20 onlink

    cumulus@basondole-spine:~$ net sh int swp1
        Name  MAC                Speed  MTU   Mode
    --  ----  -----------------  -----  ----  -------------
    UP  swp1  50:00:00:05:00:01  1G     1500  NotConfigured

    LLDP Details
    ------------
    LocalPort  RemotePort(RemoteHost)
    ---------  ----------------------
    swp1       swp1(basondole-leaf)

    Routing
    -------
      Interface swp1 is up, line protocol is up
      .
      .
      Type: Ethernet
      HWaddr: 50:00:00:05:00:01
      inet6 fe80::5200:ff:fe05:1/64
      Hosts use stateless autoconfig for addresses.
      Neighbor address(s):
      inet6 fe80::5200:ff:fe04:1/128
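The conversion described above can be modeled in a few lines of Python. This is only an illustration of the logic, not the actual RIB code: the `BgpRoute` class, the `program_route` function and the `ra_neighbors` table are hypothetical names, and the route string simply mirrors the `ip route` output shown in this post.

```python
from dataclasses import dataclass

RESERVED_V4_LLA = "169.254.0.1"  # reserved IPv4 link-local address named in the text


@dataclass(frozen=True)
class BgpRoute:
    prefix: str     # IPv4 prefix learned via BGP
    next_hop: str   # IPv6 LLA of the peer
    interface: str  # interface the peer was learned on


def program_route(route: BgpRoute, ra_neighbors: dict) -> tuple:
    """Return (static ARP entry, kernel route) for one BGP route.

    ra_neighbors maps (interface, IPv6 LLA) to the MAC address learned via RA.
    """
    mac = ra_neighbors[(route.interface, route.next_hop)]
    arp_entry = (RESERVED_V4_LLA, mac, route.interface)
    kernel_route = (
        f"{route.prefix} via {RESERVED_V4_LLA} dev {route.interface} "
        "proto bgp metric 20 onlink"
    )
    return arp_entry, kernel_route


# Values from the leaf's point of view in the lab above.
neighbors = {("swp1", "fe80::5200:ff:fe05:1"): "50:00:00:05:00:01"}
route = BgpRoute("192.168.1.1", "fe80::5200:ff:fe05:1", "swp1")
arp_entry, kernel_route = program_route(route, neighbors)
print(arp_entry)
print(kernel_route)
```

Because the same 169.254.0.1 address is reused on every interface, the route only makes sense together with the outgoing interface, which is why the kernel route carries both `dev swp1` and `onlink`.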
Cumulus configuration
    +--------------+                                +---------------+
    |              |                                |               |
    |    spine     |swp1------------------------swp1|     leaf      |
    |              |                                |               |
    +--------------+                                +---------------+

Note: By default the ports on Cumulus are IPv6 enabled, and with BGP unnumbered we do not need an IPv4 address on the interfaces. Therefore there is nothing to configure on the swp1 interfaces.
Spine
    net add loopback lo ip address 192.168.1.1/32
    net add bgp autonomous-system 64515
    net add bgp network 192.168.1.1/32
    net add bgp neighbor swp1 interface remote-as external
    net add bgp neighbor swp1 capability extended-nexthop
    net commit
Leaf
    net add loopback lo ip address 192.168.2.1/32
    net add bgp autonomous-system 64512
    net add bgp network 192.168.2.1/32
    net add bgp neighbor swp1 interface remote-as external
    net add bgp neighbor swp1 capability extended-nexthop
    net commit
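The two configurations differ only in the loopback address and the AS number, so for a larger lab the command list could be generated. A minimal Python sketch follows; the function name and the `nodes` table are hypothetical, and it emits only the NCLU commands already used in this post.

```python
def unnumbered_commands(loopback: str, asn: int, interface: str) -> list:
    """Return the NCLU commands used in this post for one node (hypothetical helper)."""
    return [
        f"net add loopback lo ip address {loopback}",
        f"net add bgp autonomous-system {asn}",
        f"net add bgp network {loopback}",
        f"net add bgp neighbor {interface} interface remote-as external",
        f"net add bgp neighbor {interface} capability extended-nexthop",
        "net commit",
    ]


# Loopbacks and AS numbers from the spine and leaf configurations above.
nodes = {
    "spine": ("192.168.1.1/32", 64515, "swp1"),
    "leaf": ("192.168.2.1/32", 64512, "swp1"),
}
for name, args in nodes.items():
    print(f"# {name}")
    print("\n".join(unnumbered_commands(*args)))
```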
Verification on the spine
We are learning 1 prefix from the neighbor basondole-leaf.
    cumulus@basondole-spine:~$ net sh bgp summary
    show bgp ipv4 unicast summary
    =============================
    BGP router identifier 192.168.1.1, local AS number 64515 vrf-id 0
    BGP table version 2
    RIB entries 3, using 456 bytes of memory
    Peers 1, using 19 KiB of memory

    Neighbor              V    AS  MsgRcvd MsgSent TblVer InQ OutQ  Up/Down State/PfxRcd
    basondole-leaf(swp1)  4 64512      100     100      0   0    0 00:04:48            1

    Total number of neighbors 1

    cumulus@basondole-spine:~$ net sh bgp
    show bgp ipv4 unicast
    =====================
    BGP table version is 2, local router ID is 192.168.1.1
    Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
                  i internal, r RIB-failure, S Stale, R Removed
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop            Metric LocPrf Weight Path
    *> 192.168.1.1/32   0.0.0.0                  0         32768 i
    *> 192.168.2.1/32   swp1                     0             0 64512 i

    Displayed 2 routes and 2 total paths

    cumulus@basondole-spine:~$ net sh route
    show ip route
    =============
    Codes: K - kernel route, C - connected, S - static, R - RIP,
           O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
           T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
           F - PBR,
           > - selected route, * - FIB route

    C>* 192.168.1.1/32 is directly connected, lo, 00:06:49
    B>* 192.168.2.1/32 [20/0] via fe80::5200:ff:fe04:1, swp1, 00:06:38

    cumulus@basondole-spine:~$ ping 192.168.2.1 -s 192.168.1.1 -c 5
    PING 192.168.2.1 (192.168.2.1) 192(220) bytes of data.
    200 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=1.18 ms
    200 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.805 ms
    200 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=1.05 ms
    200 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=1.04 ms
    200 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=1.06 ms

    --- 192.168.2.1 ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4013ms
    rtt min/avg/max/mdev = 0.805/1.030/1.180/0.124 ms
    cumulus@basondole-spine:~$

Note: On Linux, ping's -s option sets the payload size rather than the source address, which is why the output above reports 192(220) bytes of data. The -I option is the one that selects a source address or interface.
Verification on the leaf
    cumulus@basondole-leaf:~$ net sh bgp summ
    show bgp ipv4 unicast summary
    =============================
    BGP router identifier 192.168.2.1, local AS number 64512 vrf-id 0
    BGP table version 2
    RIB entries 3, using 456 bytes of memory
    Peers 1, using 19 KiB of memory

    Neighbor               V    AS  MsgRcvd MsgSent TblVer InQ OutQ  Up/Down State/PfxRcd
    basondole-spine(swp1)  4 64515       91      92      0   0    0 00:04:20            1

    Total number of neighbors 1

    cumulus@basondole-leaf:~$ net sh bgp
    show bgp ipv4 unicast
    =====================
    BGP table version is 2, local router ID is 192.168.2.1
    Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
                  i internal, r RIB-failure, S Stale, R Removed
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop            Metric LocPrf Weight Path
    *> 192.168.1.1/32   swp1                     0             0 64515 i
    *> 192.168.2.1/32   0.0.0.0                  0         32768 i

    Displayed 2 routes and 2 total paths
Interoperability between vendors
    +--------------+                                +---------------+
    |              |                                |               |
    |    cisco     | Gi2-----------------------swp2 |    cumulus    |
    |              |                                |               |
    +--------------+                                +---------------+

With Cisco IOS XE there is more to configure. In the BGP configuration you must specify the link-local address of the neighbor, together with a route-map to change the next-hop address from the IPv6 link-local address to an IPv4 address.
    Cat8000v(config)#do sh ipv6 int g2 | i IPv6
      IPv6 is enabled, link-local address is FE80::5200:FF:FE06:1
    Cat8000v(config)#do sh ipv6 nei
    IPv6 Address                              Age Link-layer Addr State Interface
    FE80::5200:FF:FE03:2                        0 5000.0003.0002  REACH Gi2

    cumulus@basondole-cu:~$ net sh int swp2 | egrep "Int.*swp|inet|Nei"
    IP Neighbor(ARP) Entries: 2
      Interface swp2 is up, line protocol is up
      inet 10.10.20.2/30
      inet6 fe80::5200:ff:fe03:2/64
      Neighbor address(s):
      inet6 fe80::5200:ff:fe06:1/128
Configuration on IOS XE
    interface GigabitEthernet2
     description toCumulus
     ip address 10.10.20.1 255.255.255.252
     ipv6 enable

    interface Loopback0
     ip address 192.168.30.1 255.255.255.255

    router bgp 64520
     bgp log-neighbor-changes
     neighbor FE80::5200:FF:FE03:2%GigabitEthernet2 remote-as 64512
     !
     address-family ipv4
      network 192.168.30.1 mask 255.255.255.255
      neighbor FE80::5200:FF:FE03:2%GigabitEthernet2 activate
      neighbor FE80::5200:FF:FE03:2%GigabitEthernet2 route-map FRR out
     exit-address-family

    route-map FRR permit 10
     set ip next-hop 10.10.20.1
Configuration on Cumulus VX
    net add interface swp2 ip address 10.10.20.2/30
    net add loopback lo ip address 192.168.20.1/32
    net add bgp autonomous-system 64512
    net add bgp network 192.168.20.1/32
    net add bgp neighbor swp2 interface remote-as external
    net add bgp neighbor swp2 capability extended-nexthop
    net add routing route-map FRR permit 10 set ip next-hop 10.10.20.2
    net add bgp neighbor swp2 route-map FRR out
    net commit
If you observe log messages like the following on IOS XE:

    *Dec 29 14:24:02.791: %BGP-5-NBR_RESET: Neighbor 10.10.20.2 passive reset (BGP Notification sent)
    *Dec 29 14:24:02.792: %BGP-5-ADJCHANGE: neighbor 10.10.20.2 passive Down Error during connection collision

Remove the IPv4 address on swp2 and the BGP session will come up. Then restore the IPv4 address, as it is needed for next-hop resolution by the Cumulus switch.
Verification
    Cat8000v(config)#do sh bgp summ | b N
    Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    FE80::5200:FF:FE03:2%GigabitEthernet2
                    4        64512     141     139        3    0    0 00:06:53        1

    Cat8000v(config)#do sh bgp | b Net
         Network          Next Hop            Metric LocPrf Weight Path
     *>   192.168.20.1/32  10.10.20.2               0             0 64512 i
     *>   192.168.30.1/32  0.0.0.0                  0         32768 i

    Cat8000v(config)#do ping 192.168.20.1 so 192.168.30.1
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 192.168.20.1, timeout is 2 seconds:
    Packet sent with a source address of 192.168.30.1
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 1/7/18 ms

    Cat8000v(config)#do traceroute 192.168.20.1 so 192.168.30.1 numeric
    Type escape sequence to abort.
    Tracing the route to 192.168.20.1
    VRF info: (vrf in name/id, vrf out name/id)
      1 192.168.20.1 [AS 64512] 1 msec 1 msec 2 msec
    Cat8000v(config)#
    cumulus@basondole-cu:~$ net sh bgp summary
    show bgp ipv4 unicast summary
    =============================
    BGP router identifier 192.168.20.1, local AS number 64512 vrf-id 0
    BGP table version 6
    RIB entries 3, using 456 bytes of memory
    Peers 4, using 77 KiB of memory

    Neighbor  V    AS  MsgRcvd MsgSent TblVer InQ OutQ  Up/Down State/PfxRcd
    swp2      4 64520      177     180      0   0    0 00:08:49            1

    Total number of neighbors 1

    cumulus@basondole-cu:~$ net sh bgp
    show bgp ipv4 unicast
    =====================
    BGP table version is 6, local router ID is 192.168.20.1
    Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
                  i internal, r RIB-failure, S Stale, R Removed
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop            Metric LocPrf Weight Path
    *> 192.168.20.1/32  0.0.0.0                  0         32768 i
    *> 192.168.30.1/32  10.10.20.1               0             0 64520 i

    Displayed 2 routes and 2 total paths

    cumulus@basondole-cu:~$ ping 192.168.30.1 -s 192.168.20.1 -c 2
    PING 192.168.30.1 (192.168.30.1) 192(220) bytes of data.
    200 bytes from 192.168.30.1: icmp_seq=1 ttl=255 time=0.733 ms
    200 bytes from 192.168.30.1: icmp_seq=2 ttl=255 time=0.775 ms

    --- 192.168.30.1 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    rtt min/avg/max/mdev = 0.733/0.754/0.775/0.021 ms

    cumulus@basondole-cu:~$ traceroute 192.168.30.1 -s 192.168.20.1 -n
    traceroute to 192.168.30.1 (192.168.30.1), 30 hops max, 60 byte packets
     1  10.10.20.1  1.638 ms * *
    cumulus@basondole-cu:~$
Juniper configuration
    basondole@vQFX# show | compare
    [edit groups]
    +  BGP_UNNUMBERED {
    +      interfaces {
    +          xe-0/0/1 {
    +              unit 0 {
    +                  description to_GNS3_via_vboxNet3;
    +                  family inet {
    +                      address 172.17.0.1/30;
    +                  }
    +                  family inet6;
    +              }
    +          }
    +      }
    +      protocols {
    +          bgp {
    +              group FRR {
    +                  family inet {
    +                      unicast;
    +                  }
    +                  peer-as 64515;
    +                  local-as 64514;
    +                  neighbor fe80::5200:ff:fe05:1 {
    +                      local-interface xe-0/0/1.0;
    +                  }
    +              }
    +          }
    +      }
    +  }
    [edit]
    +  apply-groups BGP_UNNUMBERED;