09 Apr

Transit at IXP & “next-hop-self”

And college has started again after pretty good Holi holidays. I am having a bit of a painful time again due to the hot weather, and this is just the start of summer. Well, all I can hope is that there won’t be voltage issues in the village again (like last time). And just to make sure on that part, I have filed 2 RTIs asking the Power department for details of their preparation. 🙂

 

 

Anyways, coming to the blog post topic for the day: the effect of “next-hop-self” at an IXP when a network has peers as well as transit customers there. Just to be clear at the start, this post will stick to the technical side of it, without going into the IXP policy side.

 

OK, let’s consider the case of an Internet Exchange Point (IXP) with three participants: A, B and C. A is a very big ISP, B is big (though not as big as A), and C is pretty small. All are connected to the same switch and share the same broadcast subnet 10.0.0.0/24. A has autonomous system number 1 (AS1) and is allocated 10.0.0.1 from the IXP’s /24, B has AS2 with IP 10.0.0.2, and C has AS3 with IP 10.0.0.3.

 

Now B “requests” peering from A, and A decides that since B holds a significant part of the routing table, it’s a good idea to peer. So A and B start peering. Next, C goes to A with the same request. Considering the (small) size of C, A rejects the peering request and instead offers paid transit at some price X. B hears about this and goes to C to offer “cheap transit” to reach A (since B already peers with A). Eventually C agrees and becomes a downstream customer of B.

 

I am setting this scenario up on GNS, and here’s how things look:

[Figure: 3 routers connection]

 

Looking at router B (AS2):

 

b.net>sh ip bgp summary
BGP router identifier 10.0.0.2, local AS number 2
BGP table version is 6, main routing table version 6
5 network entries using 485 bytes of memory
5 path entries using 180 bytes of memory
3 BGP path attribute entries using 180 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 893 total bytes of memory
BGP activity 5/0 prefixes, 5/0 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4     1      25      26        6    0    0 00:20:02        3
10.0.0.3        4     3       5       6        6    0    0 00:00:39        1
b.net>

 

So we have two sessions – one with A (peering) and one with C (customer).

 

Now let’s check customer C’s router to see what they have got in their table:

c.net>
c.net>sh ip bgp
BGP table version is 6, local router ID is 10.0.0.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 60.0.0.0/24      10.0.0.1                               0 2 1 i
*> 61.0.0.0/24      10.0.0.1                               0 2 1 i
*> 62.0.0.0/24      10.0.0.1                               0 2 1 i
*> 70.0.0.0/24      10.0.0.2                 0             0 2 i
*> 80.0.0.0/24      0.0.0.0                  0         32768 i
c.net>

 

OK, we can see C is getting 60.0.0.0/24, 61.0.0.0/24 and 62.0.0.0/24, which are originated by AS1 (ISP A). The AS path is 2 > 1, which seems all good, but the “next hop” is 10.0.0.1, which is the IP of router A. So traffic is NOT going via B: the AS path says C > B > A, but the actual flow of packets is C > A directly.
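To make the point about next hop versus AS path concrete, here is a rough Python sketch of the lookup a router does when forwarding. It is only an illustration: the table is copied from C’s output above, while the names C_BGP_TABLE and lookup are made up for this post and are not how any real router is implemented.

```python
import ipaddress

# C's BGP table as shown above: (prefix, next hop, AS path).
# The locally originated 80.0.0.0/24 is left out for brevity.
C_BGP_TABLE = [
    ("60.0.0.0/24", "10.0.0.1", [2, 1]),
    ("61.0.0.0/24", "10.0.0.1", [2, 1]),
    ("62.0.0.0/24", "10.0.0.1", [2, 1]),
    ("70.0.0.0/24", "10.0.0.2", [2]),
]


def lookup(table, destination):
    """Return (next_hop, as_path) for the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop, as_path in table:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, next_hop, as_path)
    if best is None:
        return None
    return best[1], best[2]


next_hop, as_path = lookup(C_BGP_TABLE, "60.0.0.1")
# Packets are sent to next_hop; as_path is only information about the route.
print(next_hop, as_path)  # 10.0.0.1 [2, 1]
```

The AS path still lists AS2 first, but the packet is handed straight to 10.0.0.1, which is router A.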

  

Let’s look at trace:

c.net>traceroute 60.0.0.1

Type escape sequence to abort.
Tracing the route to 60.0.0.1

1 10.0.0.1 16 msec 16 msec *
c.net>

 

This is something some people dislike, because traffic flows between C and A directly while C is paying B. So B is making money without even having the traffic on their ports!

Again, I am not going into whether this argument is good or not, because then we would get into the argument of why A didn’t peer with C when they were already on the same switch!
If traffic flows via B’s port instead, it has to pass the IXP switch twice, C > switch > B and B > switch > A, and twice again on the return, A > switch > B and B > switch > C.
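The switch crossings can be counted with a tiny Python sketch. It assumes every hop between two participants crosses the shared IXP switch exactly once and that the return traffic follows the same path in reverse. The function name switch_crossings is made up for this example.

```python
def switch_crossings(path):
    """Each hop between two IXP participants crosses the shared switch once."""
    return len(path) - 1


direct = ["C", "A"]      # next hop is A itself
via_b = ["C", "B", "A"]  # next hop is B, which forwards to A

# Round trip = forward path plus the same path in reverse.
print(2 * switch_crossings(direct))  # 2
print(2 * switch_crossings(via_b))   # 4
```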

 

What if the IXP tries to make sure that direct flow doesn’t happen?

Let’s say the IXP forces “next-hop-self”?

 

Firstly let’s recall what exactly “next-hop-self” means:

It’s an important parameter on a BGP session that makes the router put itself as the “next hop” in the routes it advertises to the adjacent peer. It is needed in lots of cases where the router a prefix was learned from appears to be on the same subnet but is not actually reachable by the other peer.
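Here is a simplified Python model of that decision for an eBGP session, just to show the idea. It assumes the usual behaviour on a shared subnet, where the received next hop is passed along unchanged unless next-hop-self is set. The function advertised_next_hop and its parameters are invented for this sketch and ignore everything else a real BGP implementation considers.

```python
import ipaddress


def advertised_next_hop(advertiser_ip, received_next_hop, peer_ip,
                        shared_subnet, next_hop_self=False):
    """Pick the next hop the advertiser puts in an update sent to peer_ip."""
    subnet = ipaddress.ip_network(shared_subnet)
    same_subnet = (
        ipaddress.ip_address(received_next_hop) in subnet
        and ipaddress.ip_address(peer_ip) in subnet
    )
    # With next-hop-self, or when the peer cannot reach the original
    # next hop on a common subnet, the advertiser announces itself.
    if next_hop_self or not same_subnet:
        return advertiser_ip
    # Otherwise the original next hop is kept as is.
    return received_next_hop


# B (10.0.0.2) advertises A's routes (next hop 10.0.0.1) to C (10.0.0.3).
print(advertised_next_hop("10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.0/24"))
# 10.0.0.1
print(advertised_next_hop("10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.0/24",
                          next_hop_self=True))
# 10.0.0.2
```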

Let’s reconfigure B.net’s router for its neighbour C.net at 10.0.0.3:

b.net>enable
Password:
b.net#conf t
Enter configuration commands, one per line. End with CNTL/Z.
b.net(config)#router bgp 2
b.net(config-router)#neighbor 10.0.0.3 next-hop-self
b.net(config-router)#end
b.net#
00:37:18: %SYS-5-CONFIG_I: Configured from console by console
b.net#

 

 

Checking routing table on router C again:

c.net>sh ip bgp
BGP table version is 9, local router ID is 10.0.0.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 60.0.0.0/24      10.0.0.2                               0 2 1 i
*> 61.0.0.0/24      10.0.0.2                               0 2 1 i
*> 62.0.0.0/24      10.0.0.2                               0 2 1 i
*> 70.0.0.0/24      10.0.0.2                 0             0 2 i
*> 80.0.0.0/24      0.0.0.0                  0         32768 i
c.net>

 

And as we see, the next hop is now 10.0.0.2, and thus traffic will actually pass through B.

c.net>traceroute 60.0.0.1

Type escape sequence to abort.
Tracing the route to 60.0.0.1

1 10.0.0.2 16 msec 20 msec 20 msec
2 10.0.0.1 44 msec 44 msec *
c.net>

 

In general, the default behaviour of BGP on a shared subnet is to keep the original next hop, so packets go straight to the direct router rather than being routed via the IXP switch twice. So well, that’s about it.

 

Time to get ready for college class!

 

Disclaimer: This post is completely a reflection of my personal thoughts and has NOTHING to do with my employer. It does not reflect the thoughts or vision of my employer.

28 Sep

Midnight system screwup (and fix!)

I was just working (and playing music!) and realized that the “Movie player” package that comes with the default Ubuntu installation isn’t of much use.

I decided to uninstall it. Next, I needed arping for some test and installed it (via the default Debian repository). Something crazy happened here. I opened something on my personal server and it gave a DNS error. I shot a couple of digs from the terminal and all timed out. I was scared to hell thinking of a DNS failure on my personal domain, which is very, very unlikely since I am using multiple DNS providers and close to a dozen servers serving the DNS zone.

I ran a ping to 8.8.8.8 and that also failed. I figured it was simply an internet issue, which is nothing unusual on BSNL. But when I tried to telnet to the BSNL router to see the WAN link status, telnet also failed. I couldn’t even connect to its LAN IP. Was my system off the internal network altogether? Yes!

I ran ifconfig, saw only the lo interface and realized the blunder!

As per the history in Software Center, I had uninstalled network-manager and network-manager-gnome along with the totem player. Strange!

Anyways, since I had figured out it was simply the missing network-manager package, I ran apt-get install network-manager in the terminal, and it failed since the network was already disconnected!

OK, what can one possibly do in such a situation? Can’t even Google on the system. 🙂

Anyways, I realized within a minute that I don’t have network-manager on my servers, do I? Nope!

Thus, I simply went to /etc/network/interfaces and added the following lines:

# "auto eth0" makes ifupdown bring eth0 up on a networking restart and at boot
auto eth0
iface eth0 inet static
address MY-IP
netmask 255.255.255.0
gateway BSNL-router-gateway-IP

 

In case you don’t run crazy filtering & NAT rules, you can simply use DHCP instead. Use the following lines in place of the static stanza to get DHCP working:
auto eth0
iface eth0 inet dhcp

Next, I restarted networking via the usual /etc/init.d/networking restart and was back online to install network-manager and write this blog post! 🙂
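Since you can’t Google anything while offline, it helps to sanity check static settings before restarting networking. Below is a small Python sketch that checks whether the gateway actually sits on the same subnet as the address. The function check_static_config is made up for this post, and the 192.168.1.x values are placeholder examples, not my real IPs.

```python
import ipaddress


def check_static_config(address, netmask, gateway):
    """Return the interface's network if the gateway is usable, else raise ValueError."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is not inside {iface.network}")
    if gw == iface.ip:
        raise ValueError("gateway must differ from the interface address")
    return iface.network


# Hypothetical example values for the address, netmask and gateway lines.
print(check_static_config("192.168.1.10", "255.255.255.0", "192.168.1.1"))
# 192.168.1.0/24
```

If the gateway is outside the subnet, the function raises ValueError instead of letting you find out after the restart.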

 

Have a happy package breaking ahead!