BGP Notes

The main purpose of this article is to summarize the key concepts of BGP for my CCIE exam preparation. All commands are for IOS/IOS-XE unless stated otherwise.

In addition, I strongly recommend the book Troubleshooting BGP: A Practical Guide to Understanding and Troubleshooting BGP (Networking Technology) by Vinit Jain and Brad Edgeworth; some of the BGP content below is drawn from it.



I plan to write other blog posts summarizing EIGRP/OSPF/multicast concepts and to review them regularly. I hope they will be helpful to someone on the Internet. 😂

BGP (Border Gateway Protocol):

  • Path vector routing protocol
  • Used to exchange routing information (prefixes) between autonomous systems on the Internet
  • Transported on TCP port 179
  • ASN (autonomous system number)
    • 2-byte format
      • Public: 1-64,511
      • Private: 64,512-65,534 (65,535 is reserved)
    • 4-byte format
      • Public
      • Private: 4,200,000,000–4,294,967,294
  • AD (administrative distance)
    • EBGP: 20  
    • IBGP: 200 
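
The ASN ranges above can be checked with a few lines of Python. This is only a minimal sketch: the function name classify_asn is my own, and the "reserved" set covers just the values I am sure about (0, AS_TRANS 23456, the last ASN of each format, and the documentation ranges), not the full IANA registry.

```python
def classify_asn(asn: int) -> str:
    """Return 'private', 'reserved' or 'public' for a 2-byte or 4-byte ASN.

    Hypothetical helper for illustration; not a complete IANA registry.
    """
    if not 0 <= asn <= 4_294_967_295:
        raise ValueError("ASN must fit in 32 bits")

    # Private-use ranges (2-byte and 4-byte formats)
    if 64_512 <= asn <= 65_534 or 4_200_000_000 <= asn <= 4_294_967_294:
        return "private"

    # 0, 65535 and 4294967295 are reserved; 23456 is AS_TRANS
    if asn in (0, 23_456, 65_535, 4_294_967_295):
        return "reserved"

    # Documentation ranges (assumption: treated as reserved in this sketch)
    if 64_496 <= asn <= 64_511 or 65_536 <= asn <= 65_551:
        return "reserved"

    return "public"
```

For example, classify_asn(65000) returns "private", which is why 65000-range ASNs are so common in lab topologies.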

BGP Sessions:

  • EBGP
    • Peers are in different ASNs
    • TTL = 1
      • Configure "neighbor x.x.x.x disable-connected-check" for loopback peering between directly connected routers
      • OR configure "neighbor x.x.x.x ebgp-multihop x" to raise the TTL to x
  • IBGP
    • Peers are in the same ASN
    • TTL = 255
    • Require IP reachability
  • AFI (address family identifier) / SAFI (subsequent address family identifier)
    • In IOS, the IPv4 address-family is activated by default
      • Disable the automatic activation via "no bgp default ipv4-unicast" to prepare the router for multiprotocol support
    • In NX-OS, the address family must be specified for the neighbor.
  • Neighbor states and the finite state machine (not demonstrated)
    • Idle
    • Connect
    • Active
    • OpenSent
    • OpenConfirm
    • Established 
  • Convergence consideration:
    • Using BFD

BGP Messages:

  • Keepalive
  • Open
    • BGP identifier (router ID)
    • Optional parameters (session capabilities, e.g. route refresh)
  • Update
    • Advertise NLRIs (prefixes and their associated PAs)
    • Withdraw NLRIs (prefixes only)
    • Also acts as a keepalive
  • Notification

BGP Timers:
  • Keepalive timer: 60 seconds
  • Hold timer: 180 seconds
  • MRAI (minimum route advertisement interval)
    • IOS
      • IBGP: 0 seconds
      • EBGP: 30 seconds
    • NX-OS
      • 0 seconds for IBGP/EBGP

PAs (path attributes; not an exhaustive list):
  • Well-known mandatory
    • NEXT_HOP
      • When a prefix is advertised to an EBGP peer, the next hop is rewritten to the advertising router's peering address.
      • When a prefix learned from an EBGP peer is advertised to an IBGP peer, the next hop is not modified by default.
        • Router A (sends updates) ---- EBGP ---- Router B ---- IBGP ---- Router C
        • For updates from router A, router C sees router A's update source as the next hop; if router C has no route to that address, the next-hop reachability check fails
        • Solutions:
          • Configure "neighbor router_C_x.x.x.x next-hop-self" on router B
          • Router B advertises its EBGP peering link with router A into the IGP
    • AS_PATH
    • ORIGIN
  • Well-known discretionary
  • Optional transitive
    • Community
      • Similar to route tag but one route can carry multiple communities
      • IOS and NX-OS devices do not advertise communities by default
        • Configuration knob: neighbor x.x.x.x send-community
      • no-export
      • local-as
      • no-advertise
    • RT (route target)
      • Extended community
      • For import/export of routing updates in the VPN environment
    • SoO (site of origin)
      • Extended community
      • For multihomed sites (multiple PEs) in the VPN environment, a PE rejects BGP route updates whose SoO matches the SoO configured for the site
  • Optional non-transitive

Loop Prevention
  • EBGP
    • A router rejects route updates from EBGP peers if its own ASN appears in the AS_PATH
      • Override the rule via "neighbor x.x.x.x allowas-in"
  • IBGP
    • NLRIs received from an IBGP peer must not be advertised to other IBGP peers (split horizon)
      • Solutions:
        • Route reflectors
        • Full mesh of IBGP peers
        • Confederation
        • MPLS
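
The two loop-prevention rules above can be modeled in a few lines of Python. This is a simplified sketch, not how any router implements it: the function names are hypothetical, and allowas_in here simply means "how many occurrences of the local ASN to tolerate", with 0 standing for the feature being off.

```python
from typing import List


def accept_from_ebgp(local_asn: int, as_path: List[int], allowas_in: int = 0) -> bool:
    """EBGP loop check: reject the update if the local ASN appears in
    AS_PATH more often than allowas_in permits (0 = never allowed)."""
    return as_path.count(local_asn) <= allowas_in


def may_advertise(learned_from: str, advertise_to: str) -> bool:
    """IBGP split horizon (no RR or confederation in this model):
    a route learned from an IBGP peer is not sent to another IBGP peer."""
    return not (learned_from == "ibgp" and advertise_to == "ibgp")
```

With local ASN 65001, a path of [65002, 65001] is rejected by default but accepted once one occurrence is tolerated, which is the typical case for two sites of the same AS connected through a provider.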

Route Advertisement
  • Three tables hold the prefixes and PAs of routes
    • Adj-RIB-In
      • Contains NLRIs in the original form advertised by BGP peers
      • Purged after all inbound route policies are processed
        • "neighbor x.x.x.x soft-reconfiguration inbound" keeps the raw NLRIs in memory, so no route refresh is sent after "clear ip bgp * soft in"
    • Loc-RIB
      • Local BGP table
    • Adj-RIB-Out
      • Holds the NLRIs that remain after all outbound route policies are processed; these are sent to BGP peers
  • Network statement (network x.x.x.x mask x.x.x.x)
    • Must match the exact prefix in the RIB
    • Installs the route into the BGP table (Loc-RIB)
      • PAs of connected network
        • Origin: i (IGP)
        • Next-hop: 0.0.0.0
        • Weight: 32768
      • PAs of static routes or IGP routes
        • Origin: i (IGP)
        • Next-hop: the next-hop IP address in RIB
        • Weight: 32768
        • MED: IGP metric
    • Does not enable neighbor discovery on the interface (unlike an IGP network statement)
  • Only the best route will be advertised to the BGP peers
    • Best path selection rules:
    • Wise Lips Lover Will Apply Oral Medication Every Night
      • Weight
        • locally significant
      • Local preference
      • Locally originated
        • network > redistribution > aggregation
      • Length of AS path (shortest wins)
      • Origin
        • IGP (i) > EGP (e) > incomplete (?)
      • MED (lowest wins)
      • EBGP > IBGP
      • Lowest IGP metric to the next hop
      • ...
  • Recalculation of the best path for a prefix:
    • BGP next-hop reachability change
    • Failure of interface connected to an EBGP peer
    • Redistribution change
    • Reception of new paths for a route
  • RD (route distinguisher):
    • Makes tenant prefixes unique in the VPN environment
      • VPNv4 prefix: RD (64-bit) + IPv4 address (32-bit)
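
The best path selection order listed above (weight down to the IGP metric to the next hop) maps nicely onto a sort key. Below is a rough Python sketch under several assumptions of mine: the Path class and its field names are made up for illustration, MED is compared between all paths (by default it is only compared between paths from the same neighboring AS), and the later tie-breakers are left out.

```python
from dataclasses import dataclass

# Lower rank is preferred: IGP (i) > EGP (e) > incomplete (?)
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}


@dataclass
class Path:
    """Hypothetical container for the attributes used in the comparison."""
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path_len: int = 0
    origin: str = "i"
    med: int = 0
    is_ebgp: bool = False
    igp_metric: int = 0


def sort_key(p: Path):
    """Build a tuple so that the smallest tuple is the best path."""
    return (
        -p.weight,                 # highest weight wins
        -p.local_pref,             # highest local preference wins
        not p.locally_originated,  # locally originated wins (False < True)
        p.as_path_len,             # shortest AS path wins
        ORIGIN_RANK[p.origin],     # i > e > ?
        p.med,                     # lowest MED wins (simplified, see lead-in)
        not p.is_ebgp,             # EBGP wins over IBGP
        p.igp_metric,              # lowest IGP metric to the next hop wins
    )


def best_path(paths):
    """Return the preferred path among the candidates."""
    return min(paths, key=sort_key)
```

Because tuples compare element by element, a later attribute is only looked at when all earlier ones tie, which is exactly how the step-by-step comparison behaves.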

Default Route Advertisement:
  • network 0.0.0.0 
    • RIB must have the default route
  • neighbor x.x.x.x default-originate
    • Advertises the default route only to the specified neighbor
    • A default route in the RIB is not required
  • redistribute static
    • "default-information originate" must be configured
    • Not recommended

Route Redistribution:

  • By default, only EBGP routes will be redistributed into IGPs
    • Configure "bgp redistribute-internal" for IBGP routes
  • PAs of locally redistributed routes
    • Origin: ? (incomplete)
    • NH: the next-hop IP address in RIB
    • Weight: 32,768
    • MED: IGP metric
  • Considerations:
    • Scalability issues (for instance, router memory)
    • Loss of BGP path attributes
    • Route loops and suboptimal routes (e.g. route feedback and race conditions)

Route Summarization:

  • aggregate-address x.x.x.x x.x.x.x [option]
    • Installed in the BGP table only if at least one more-specific route within the aggregate range is present
    • Creates the Null0 summary route automatically
    • The aggregated route loses PA information (such as AS_PATH, MED and communities) due to aggregation (indicated by the atomic aggregate attribute)
      • Solution: add the option as-set
    • Option summary-only: suppress all more-specific routes
    • Option suppress-map route-map-name: suppress only the selected prefixes
      • Still possible to leak a suppressed route via "neighbor x.x.x.x unsuppress-map route-map-name"
  • Static way
    • Configure the static route "ip route x.x.x.x x.x.x.x null0" and advertise it via the network statement
    • The summary route will always be advertised even if none of the component networks exist
  • auto-summary (just let it go)
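
To make the aggregate-address behavior above concrete, here is a small Python sketch using the standard ipaddress module. It is only a model under my own assumptions: the function names are hypothetical, the BGP table is a plain list of IPv4 prefix strings, and only the default and summary-only behaviors are covered (no as-set or suppress-map).

```python
import ipaddress
from typing import List


def aggregate_is_installed(aggregate: str, bgp_table: List[str]) -> bool:
    """The aggregate exists only if a more-specific prefix inside its
    range is present in the BGP table."""
    agg = ipaddress.ip_network(aggregate)
    return any(
        net != agg and net.subnet_of(agg)
        for net in map(ipaddress.ip_network, bgp_table)
    )


def advertised_prefixes(aggregate: str, bgp_table: List[str],
                        summary_only: bool = False) -> List[str]:
    """Prefixes advertised once the aggregate is configured."""
    if not aggregate_is_installed(aggregate, bgp_table):
        return list(bgp_table)  # no component route, so no aggregate

    agg = ipaddress.ip_network(aggregate)
    result = [aggregate]
    for prefix in bgp_table:
        net = ipaddress.ip_network(prefix)
        if net == agg:
            continue  # already listed as the aggregate itself
        if summary_only and net.subnet_of(agg):
            continue  # more-specific route suppressed by summary-only
        result.append(prefix)
    return result
```

With a table of 10.1.1.0/24, 10.1.2.0/24 and 192.168.0.0/24, the aggregate 10.1.0.0/16 is installed; with summary-only, only 10.1.0.0/16 and 192.168.0.0/24 are advertised.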

Route Reflector (RR):
  • Hub-and-spoke structure to scale IBGP sessions
  • Rules:
    • Route updates received from non-RR clients are sent to RR clients only.
    • Route updates received from RR clients are sent to RR clients and non-RR clients.
    • Route updates received from EBGP peers are sent to RR clients and non-RR clients.
    • Note: a non-RR client can be the RR of another cluster.
  • Loop prevention:
    • ORIGINATOR_ID
    • CLUSTER_LIST (each RR prepends its cluster ID)
  • Design considerations:
    • Redundancy: ideally, two or more RRs per service set (address-family) within the AS
    • Hierarchical RR: multilayer RR structure
    • Partial RR: 
      • Use BGP RR group
      • Use standard BGP community
    • Out-of-band RR: dedicated RR for route reflection and they are outside of the data path
      • Do not configure "next-hop-self" on it
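
The reflection rules and the two loop-prevention attributes above can be sketched in Python as follows. Treat it as an illustration under my own assumptions: the Update class and the function names are hypothetical, router IDs and cluster IDs are plain strings, and only the IBGP side (clients and non-clients) is modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Update:
    """Hypothetical container for the RR-related attributes of a route."""
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: List[str] = field(default_factory=list)


def ibgp_targets(learned_from: str) -> Set[str]:
    """IBGP peer types the RR sends a route to, by how it was learned."""
    if learned_from == "non-client":
        return {"client"}
    if learned_from in ("client", "ebgp"):
        return {"client", "non-client"}
    raise ValueError("unknown peer type: " + learned_from)


def reflect(update: Update, source_router_id: str, rr_cluster_id: str) -> Update:
    """On reflection, keep or set ORIGINATOR_ID and prepend the cluster ID."""
    return Update(
        prefix=update.prefix,
        originator_id=update.originator_id or source_router_id,
        cluster_list=[rr_cluster_id] + update.cluster_list,
    )


def accept_reflected(update: Update, my_router_id: str,
                     my_cluster_id: Optional[str] = None) -> bool:
    """Loop checks: drop the route if this router originated it, or if
    this RR's cluster ID is already in CLUSTER_LIST."""
    if update.originator_id == my_router_id:
        return False
    if my_cluster_id is not None and my_cluster_id in update.cluster_list:
        return False
    return True
```

A route that travels through two RR clusters ends up with both cluster IDs in CLUSTER_LIST, so if it ever comes back to the first RR it is dropped there.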
