01 Apr

Why NIXI AS24029 appears to be transit ASN?

And here is my post on 1st April. Don’t take it as an April Fool’s post 😉

 

NIXI’s AS24029 has been reported multiple times as acting like a transit ASN for several networks. I have analysed this in the past, and it is very much because of route leaks by a few specific networks. I have previously explained the difference between peering and transit routes, and how they are handled, on my blog.

In short: a network is supposed to re-announce its peering and transit routes only to its customers, and not to any other peer or upstream. Whenever NIXI’s ASN appears in the global routing table, it’s always because one or more networks are re-announcing routes learnt via NIXI to their upstream transits.
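The export rule above can be sketched in a few lines of Python. This is only an illustration of the general rule; the `Rel` enum and the `should_export` function are names made up for this example, not something taken from a router configuration.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship with a BGP neighbour (illustrative names)."""
    CUSTOMER = "customer"
    PEER = "peer"          # e.g. a session at an exchange like NIXI
    UPSTREAM = "upstream"  # a transit provider


def should_export(learned_from: Rel, send_to: Rel) -> bool:
    """Return True if a route learnt from `learned_from` may be announced to `send_to`."""
    # Customer routes may be announced to everyone.
    if learned_from is Rel.CUSTOMER:
        return True
    # Peer and upstream routes may be announced only to customers.
    return send_to is Rel.CUSTOMER


if __name__ == "__main__":
    # A route learnt over an exchange session must not go to an upstream.
    print(should_export(Rel.PEER, Rel.UPSTREAM))
    # The same route is fine to hand to a customer.
    print(should_export(Rel.PEER, Rel.CUSTOMER))
```

A leak of the kind discussed here corresponds to announcing a route for which `should_export(Rel.PEER, Rel.UPSTREAM)` is false.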

 

Looking at Hurricane Electric’s bgp.he.net for NIXI’s AS24029, we get:

 

 

Now according to this, many ASNs are peers of NIXI and visible to HE. The problem with HE’s data is that it doesn’t show who is downstream and who is upstream (but it is pretty fast!). Looking at stat.ripe.net data for AS24029, we get:

[Figure: peers_of_AS24029]

 

This is very interesting data, as the ones on the left side are those actually announcing these routes to their upstreams. Finding the upstream is tricky, since these routes are filtered out at the global level and don’t stay in the global routing table. It would be hard overall to find the ones whose path count is low, but for the ones with a large path count we can likely see those routes in RIPE RIS collected data.

Using bgpdump on RIPE RIS data, I get:

anurag@server7:~/test$ grep "24029" latest.txt 
TABLE_DUMP_V2|03/31/14 16:00:05|A|68.67.63.245|22652|103.27.140.0/22|22652 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|46.149.194.2|15469|103.27.140.0/22|15469 6762 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|193.160.39.1|57821|103.27.140.0/22|57821 48039 3320 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|79.143.241.12|29608|103.27.140.0/22|29608 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|203.119.76.5|4608|103.27.140.0/22|4608 1221 4637 1273 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|46.149.194.1|15469|103.27.140.0/22|15469 6762 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|193.150.22.1|57381|103.27.140.0/22|57381 42708 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|202.12.28.1|4777|103.27.140.0/22|4777 2500 2500 2500 7660 4635 1273 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|208.51.134.248|3549|103.27.140.0/22|3549 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|203.123.48.6|37989|103.27.140.0/22|37989 4844 2914 3491 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|213.200.87.254|3257|103.27.140.0/22|3257 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|218.189.6.2|9304|103.27.140.0/22|9304 4635 1273 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|12.0.1.63|7018|103.27.140.0/22|7018 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|212.25.27.44|8758|103.27.140.0/22|8758 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|176.12.110.8|50300|103.27.140.0/22|50300 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|193.0.0.56|3333|103.27.140.0/22|3333 3356 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|195.47.235.100|6881|103.27.140.0/22|6881 15685 6939 1273 55410 {18101,24029,45804,132777}|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|208.51.134.248|3549|103.245.200.0/24|3549 6453 4755 17439 24029 9498 132542|IGP
TABLE_DUMP_V2|03/31/14 16:00:05|A|12.0.1.63|7018|103.245.200.0/24|7018 6453 4755 17439 24029 9498 132542|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|46.149.194.2|15469|116.214.114.0/24|15469 6762 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|46.149.194.1|15469|116.214.114.0/24|15469 6762 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|193.150.22.1|57381|116.214.114.0/24|57381 42708 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|176.12.110.8|50300|116.214.114.0/24|50300 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|203.123.48.6|37989|116.214.114.0/24|37989 4844 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|213.200.87.254|3257|116.214.114.0/24|3257 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|195.47.235.100|6881|116.214.114.0/24|6881 3257 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|208.51.134.248|3549|116.214.114.0/24|3549 7473 9498 17447 24029 4755 24063|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|46.149.194.2|15469|123.136.159.0/24|15469 6762 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|46.149.194.1|15469|123.136.159.0/24|15469 6762 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|193.150.22.1|57381|123.136.159.0/24|57381 42708 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|218.189.6.2|9304|123.136.159.0/24|9304 4635 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|203.123.48.6|37989|123.136.159.0/24|37989 4844 7473 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|208.51.134.248|3549|123.136.159.0/24|3549 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:06|A|202.12.28.1|4777|123.136.159.0/24|4777 2500 2500 2500 7660 4635 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|46.149.194.2|15469|182.48.209.0/24|15469 6762 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|46.149.194.1|15469|182.48.209.0/24|15469 6762 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|195.47.235.100|6881|182.48.209.0/24|6881 3257 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|208.51.134.248|3549|182.48.209.0/24|3549 10026 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|203.123.48.6|37989|182.48.209.0/24|37989 4844 7473 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|202.12.28.1|4777|182.48.209.0/24|4777 2516 10026 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|213.200.87.254|3257|182.48.209.0/24|3257 9498 17443 4755 24029 9583 45769|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|202.12.28.1|4777|183.87.210.0/24|4777 2516 6453 4755 18229 24029 9583 45841|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|202.12.28.1|4777|183.87.244.0/24|4777 2516 6453 4755 18229 24029 9583 45841|IGP
TABLE_DUMP_V2|03/31/14 16:00:09|A|202.12.28.1|4777|183.87.250.0/24|4777 2516 6453 4755 18229 24029 9583 45841|IGP
TABLE_DUMP_V2|03/31/14 16:00:12|A|193.150.22.1|57381|202.3.15.0/24|57381 42708 9583 24186 24029 4755 45927|IGP
TABLE_DUMP_V2|03/31/14 16:00:12|A|203.119.76.5|4608|202.140.137.0/24|4608 1221 4637 6453 4755 10201 10201 10201 10201 10201 10201 10201 24029 9583|IGP
TABLE_DUMP_V2|03/31/14 16:00:12|A|202.12.28.1|4777|202.140.137.0/24|4777 2516 6453 4755 10201 10201 10201 10201 10201 10201 10201 24029 9583|IGP
TABLE_DUMP_V2|03/31/14 16:00:12|A|208.51.134.248|3549|202.140.137.0/24|3549 6453 4755 10201 10201 10201 10201 10201 10201 10201 24029 9583|IGP
TABLE_DUMP_V2|03/31/14 16:00:13|A|46.149.194.2|15469|203.119.50.0/24|15469 6762 9498 24186 24029 18101 18101|IGP
TABLE_DUMP_V2|03/31/14 16:00:13|A|46.149.194.1|15469|203.119.50.0/24|15469 6762 9498 24186 24029 18101 18101|IGP
TABLE_DUMP_V2|03/31/14 16:00:13|A|193.150.22.1|57381|203.119.50.0/24|57381 42708 9498 24186 24029 18101 18101|IGP
TABLE_DUMP_V2|03/31/14 16:00:13|A|203.123.48.6|37989|203.119.50.0/24|37989 4844 9498 24186 24029 18101 18101|IGP
TABLE_DUMP_V2|03/31/14 16:00:13|A|208.51.134.248|3549|203.119.50.0/24|3549 9498 24186 24029 18101 18101|IGP
anurag@server7:~/test$

 

Refining the AS path part further, we get:

anurag@server7:~/test$ grep "24029" latest.txt |awk -F '|' '{print $7}' |awk -F '24029' '{print $1}'|awk -F ' ' '{print $NF}' |sort -u
10201 - Aircel
17439 - Netmagic
17447 - Net4India
18229 - CtrlS Datacenter
24186 - RailTel India
45528 - Tikona Networks
4755 - Tata Comm/VSNL
{18101, - Reliance Comm
anurag@server7:~/test$

Here we get the culprit ASNs. 🙂
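For anyone who prefers Python over an awk pipeline, here is a rough equivalent. It is a sketch, not the method used above: the function name and the three sample lines (copied from the dump output earlier in this post) are chosen for illustration. It matches whole ASNs rather than substrings and flattens AS_SET braces. Members of an AS_SET are unordered, so a neighbour found inside one should be treated with caution.

```python
def left_neighbours(lines, target="24029"):
    """Collect the ASN seen immediately before `target` in each AS path."""
    found = set()
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 7:
            continue  # not a bgpdump -m style line
        # Flatten AS_SET syntax: "{18101,24029}" becomes "18101 24029".
        # Caveat: the order inside an AS_SET carries no meaning.
        path = fields[6].translate(str.maketrans("{},", "   ")).split()
        for i, asn in enumerate(path):
            if asn == target and i > 0 and path[i - 1] != target:
                found.add(path[i - 1])
    return sorted(found, key=int)


SAMPLE = [
    "TABLE_DUMP_V2|03/31/14 16:00:05|A|208.51.134.248|3549|103.245.200.0/24|3549 6453 4755 17439 24029 9498 132542|IGP",
    "TABLE_DUMP_V2|03/31/14 16:00:06|A|208.51.134.248|3549|123.136.159.0/24|3549 9498 45528 45528 45528 45528 24029 18101 18101 9396|IGP",
    "TABLE_DUMP_V2|03/31/14 16:00:05|A|12.0.1.63|7018|103.27.140.0/22|7018 3356 55410 {18101,24029,45804,132777}|IGP",
]

if __name__ == "__main__":
    print(left_neighbours(SAMPLE))
```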

So why does this happen?

Mostly it happens due to the way filters are set up on routers. Most networks open their filters towards upstreams to announce their customer routes. Now if those customer routes are also received via NIXI, they get re-announced as well. So in many of these cases, these networks have (or had) the originating ASN as a customer.
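A small Python sketch may make the filter problem clearer. It assumes a made-up customer prefix list and two hypothetical filter functions. One checks only the prefix. The other also checks which kind of session the route arrived on. No real router policy is being quoted here.

```python
import ipaddress

# Hypothetical customer prefix list, for illustration only.
CUSTOMER_PREFIXES = [ipaddress.ip_network("103.245.200.0/24")]


def naive_export(prefix: str, learned_from: str) -> bool:
    """Prefix-list-only filter: ignores where the route was learnt."""
    return ipaddress.ip_network(prefix) in CUSTOMER_PREFIXES


def strict_export(prefix: str, learned_from: str) -> bool:
    """Also require that the route arrived on the customer session."""
    return learned_from == "customer" and naive_export(prefix, learned_from)


if __name__ == "__main__":
    # The same customer prefix, but this copy was heard over the exchange.
    print(naive_export("103.245.200.0/24", "ix"))   # leaks to the upstream
    print(strict_export("103.245.200.0/24", "ix"))  # stays inside
```

With the prefix-only filter, the copy of the route heard over the exchange passes just as well as the copy heard from the customer, which is the leak described above.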

These are the prefixes which are causing this:

anurag@server7:~/test$ grep "24029" latest.txt |awk -F '|' '{print $6}'|sort -u
103.245.200.0/24
103.27.140.0/22
116.214.114.0/24
123.136.159.0/24
182.48.209.0/24
183.87.210.0/24
183.87.244.0/24
183.87.250.0/24
202.140.137.0/24
202.3.15.0/24
203.119.50.0/24
anurag@server7:~/test$

 

So that’s all about NIXI route leaks. I wish NIXI would become an international hub for traffic exchange between Europe/Middle East and East Asia, but as per current policy it’s nowhere near promoting domestic traffic exchange, let alone international!

 

Disclaimer: I work for an Indian ISP and all comments here are completely personal. In no way do they reflect my employer’s views.

28 Jun

BSNL routing glitch and updates

Today I noticed some traffic on my blog from a link from Broadband forum

 

Here’s what the poster wrote:

I made a thread a few days ago complaining about BSNL’s horrible routing. Well it looks like it has been fixed. I thank all the guys who made efforts to bring this to BSNL’s notice. Especially Anurag Bhatia who highlighted the issue with much detail on his blog

anuragbhatia.com !!! » Blog Archive » BSNL > Softlayer connectivity problem & possible fix

 

 

Always good to see links to my blog. This was an interesting update, and I can see the forward path does seem good for now.

 

Here’s an updated traceroute from India to Singapore (BSNL > Softlayer):

anurag:~ anurag$ traceroute -a hostgator.in
traceroute to hostgator.in (216.12.194.67), 64 hops max, 52 byte packets
1 [AS65534] router.home (10.10.0.1) 1.183 ms 1.290 ms 0.849 ms
2 [AS9829] 117.206.176.1 (117.206.176.1) 17.517 ms 18.056 ms 17.163 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 71.872 ms 52.246 ms 114.018 ms
4 [AS4755] 115.114.57.249.static-mumbai.vsnl.net.in (115.114.57.249) 49.644 ms 50.151 ms 49.265 ms
5 [AS0] 172.31.16.193 (172.31.16.193) 83.261 ms * 82.361 ms
6 [AS0] ix-4-2.tcore1.cxr-chennai.as6453.net (180.87.36.9) 197.469 ms 199.161 ms 196.580 ms
7 [AS0] if-5-2.tcore1.svw-singapore.as6453.net (180.87.12.53) 318.931 ms 307.292 ms
[AS0] if-3-3.tcore2.cxr-chennai.as6453.net (180.87.36.6) 306.836 ms
8 [AS0] if-5-2.tcore2.svw-singapore.as6453.net (180.87.15.69) 330.831 ms
[AS0] if-2-2.tcore2.svw-singapore.as6453.net (180.87.12.2) 306.926 ms
[AS0] if-6-2.tcore2.svw-singapore.as6453.net (180.87.37.14) 227.751 ms
9 [AS0] 180.87.15.218 (180.87.15.218) 230.692 ms 265.758 ms 241.768 ms
10 [AS4637] i-1-0-0.6ntp-core01.bi.telstraglobal.net (202.84.243.81) 245.100 ms 235.299 ms 274.206 ms
11 [AS4637] i-0-1-0-0.istt02.bi.telstraglobal.net (202.84.243.110) 307.158 ms 304.905 ms 307.080 ms
12 [AS4637] unknown.telstraglobal.net (202.126.128.18) 307.409 ms 304.740 ms 307.178 ms
13 [AS36351] ae5.dar02.sr03.sng01.networklayer.com (50.97.18.199) 307.167 ms 306.263 ms
[AS36351] ae5.dar01.sr03.sng01.networklayer.com (50.97.18.197) 307.456 ms
14 [AS36351] po1.fcr01.sr03.sng01.networklayer.com (174.133.118.131) 238.486 ms
[AS36351] po2.fcr01.sr03.sng01.networklayer.com (174.133.118.133) 234.005 ms
[AS36351] po1.fcr01.sr03.sng01.networklayer.com (174.133.118.131) 306.823 ms
15 * * *
16 * *^C
anurag:~ anurag$

 

 

So the forward path does seem good, but latency is still way too high compared to the expected value of 120-150ms (from North India). There’s a jump as soon as we hit the Chennai router of AS6453.

 

Quick ping output:

anurag:~ anurag$ ping -c 5 hostgator.in
PING hostgator.in (216.12.194.67): 56 data bytes
64 bytes from 216.12.194.67: icmp_seq=0 ttl=45 time=232.593 ms
64 bytes from 216.12.194.67: icmp_seq=1 ttl=45 time=233.120 ms
64 bytes from 216.12.194.67: icmp_seq=2 ttl=45 time=259.231 ms
64 bytes from 216.12.194.67: icmp_seq=3 ttl=45 time=281.217 ms
64 bytes from 216.12.194.67: icmp_seq=4 ttl=45 time=305.450 ms

--- hostgator.in ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 232.593/262.322/305.450/28.154 ms
anurag:~ anurag$

 

We can ignore the values above 232ms because that’s simply latency added by routers, since they do not prioritise ICMP. But overall 232ms is quite high, and it seems like there is an issue in the reverse path. I am doing this test from 117.206.176.0/20, sitting on BSNL’s autonomous system, AS9829.
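To make the "take the lowest value" idea concrete, here is a short Python sketch that summarises the five RTT samples from the ping output above. The `summarise` helper is a name invented for this example. The minimum is used as the best estimate of path latency, on the assumption that anything above it is queueing or low-priority ICMP handling.

```python
import statistics


def summarise(rtts_ms):
    """Return min/avg/max/stddev of RTT samples, like the ping summary line."""
    return {
        "min": min(rtts_ms),
        "avg": statistics.mean(rtts_ms),
        "max": max(rtts_ms),
        # Population standard deviation over the samples.
        "stddev": statistics.pstdev(rtts_ms),
    }


# RTT samples (ms) copied from the ping output above.
RTTS_MS = [232.593, 233.120, 259.231, 281.217, 305.450]

if __name__ == "__main__":
    stats = summarise(RTTS_MS)
    # The minimum is the closest we get to the real path latency.
    print(f"baseline latency estimate: {stats['min']} ms")
    print({k: round(v, 3) for k, v in stats.items()})
```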

 

Looking at BGP table at Softlayer Singapore for this prefix via Softlayer Looking Glass, we get:

 

 

bbr01.eq01.sng02> show route protocol bgp 117.206.176.0 table inet.0
inet.0: 461775 destinations, 1662681 routes (461773 active, 1 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

117.206.176.0/20 *[BGP/170] 6:50:39, MED 1, localpref 160
AS Path: 4637 6453 9829 I

> to 202.126.128.17 via xe-0/2/0.0
[BGP/170] 6:50:41, MED 5, localpref 10
AS Path: 2914 6453 9829 I

> to 116.51.17.53 via ae11.0

 

So AS path is AS4637 > AS6453 > AS9829 

 

AS4637 is Reach/Telstra while AS6453 is Tata Comm, and right next to it is AS9829, which again (as per my earlier post) is an IPLC link. The AS6453 > AS9829 connection is outside India for sure; for an actual direct route from Singapore to India it should rather be AS6453 > AS4755 (VSNL) > AS9829.

 

Just to confirm this, let’s run a trace to a random IP 117.206.178.1 from Softlayer Singapore:

bbr01.eq01.sng02> traceroute 117.206.178.1
HOST: bbr01.eq01.sng02-re0 Loss% Snt Last Avg Best Wrst StDev
1. 202.126.128.17 0.0% 5 1.6 2.9 1.6 5.6 1.5
2. 202.84.180.157 0.0% 5 1.5 16.3 1.4 43.8 20.6
3. 202.84.243.114 0.0% 5 4.9 4.2 3.1 4.9 0.9
4. 180.87.15.217 0.0% 5 44.1 11.6 2.6 44.1 18.2
5. 180.87.15.70 0.0% 5 261.5 261.4 260.6 263.0 0.9
6. 180.87.36.5 0.0% 5 260.3 256.6 255.4 260.3 2.1
7. 180.87.36.33 0.0% 5 255.9 256.3 255.9 256.8 0.4
8. 80.231.217.17 0.0% 5 255.7 257.7 255.7 263.5 3.3
9. 80.231.217.2 60.0% 5 258.1 257.7 257.4 258.1 0.5
10. 80.231.200.14 80.0% 5 263.5 263.5 263.5 263.5 0.0
11. 80.231.131.34 0.0% 5 256.1 256.2 256.0 256.4 0.2
12. 195.219.144.2 0.0% 5 380.6 380.6 380.6 380.6 0.0
13. 218.248.255.101 0.0% 5 397.6 388.9 381.2 401.3 9.8
14. 218.248.169.121 0.0% 5 394.1 397.8 394.1 412.6 8.2
15. 218.248.169.121 20.0% 5 397.3 404.9 394.1 424.0 13.4
16. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

 

 

 

Clearly a high latency route, but unfortunately the Softlayer looking glass does not do rDNS (PTR) lookups to map IPs to hostnames. Let’s look up some specific hops using the dig command (with the -x argument for PTR):

anurag:~ anurag$
anurag:~ anurag$ dig +short -x 202.84.243.114
i-0-2-0-0.6ntp02.bi.telstraglobal.net.
anurag:~ anurag$ dig +short -x 180.87.15.217
ix-1-1-2-0.tcore2.SVW-Singapore.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.15.70
if-5-2.tcore2.CXR-Chennai.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.36.5
if-3-3.tcore1.CXR-Chennai.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.36.33
if-7-2.tcore1.MLV-Mumbai.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.217.17
if-9-5.tcore1.WYN-Marseille.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.217.2
if-2-2.tcore2.WYN-Marseille.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.200.14
if-9-2.tcore2.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.131.34
if-4-0-0.mcore3.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 195.219.144.2
ix-1-1-0.mcore3.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 218.248.255.101
anurag:~ anurag$
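For reference, the name that dig -x actually queries, and the city hint in the as6453.net hostnames, can both be worked out offline. The Python sketch below does no DNS lookups. The two helper names are made up, and the assumption that the third label from the right is a "CODE-City" site label is only inferred from the PTR answers shown above.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Build the in-addr.arpa name that a PTR lookup for `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def site_from_hostname(hostname: str):
    """Extract the city from names like if-5-2.tcore2.CXR-Chennai.as6453.net."""
    labels = hostname.rstrip(".").split(".")
    if len(labels) < 3 or [l.lower() for l in labels[-2:]] != ["as6453", "net"]:
        return None  # naming convention only assumed for as6453.net
    _code, _, city = labels[-3].partition("-")
    return city or None


if __name__ == "__main__":
    print(ptr_name("180.87.15.70"))
    print(site_from_hostname("if-5-2.tcore2.CXR-Chennai.as6453.net."))
    print(site_from_hostname("if-9-5.tcore1.WYN-Marseille.as6453.net."))
```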

 

 

 

So the return path for packets is:

 

Telstra (Singapore) > Tata AS6453 (Singapore) > Tata AS6453 (Chennai via Tata Indicom cable link) > Tata AS6453 (Mumbai) > Tata AS6453 (Marseille, France) > Tata AS6453 (London) > IPLC Link >>> BSNL AS9829 India.

 

So basically BSNL fixed the forward path, but the return path is badly messed up. They are not announcing this prefix, 117.206.176.0/20, along with many more prefixes, on their transit providers’ IP links. They are relying on NIXI for domestic traffic, while for transit they are relying on IPLC ports, which in this case seems to be with Tata AS6453 in London.

 

Here’s what Tata AS6453 router in Mumbai is getting:

 

AS6453 IPv4 and IPv6 Looking Glass
show ip bgp 117.206.176.0/20

Router: gin-mlv-core1
Site: IN, Mumbai, MLV
Command: show ip bgp 117.206.176.0/20

BGP routing table entry for 117.206.176.0/20
Bestpath Modifiers: deterministic-med
Paths: (3 available, best #3)
Multipath: eBGP
11 12
9829
l78-mcore3. (metric 2968) from mlv-tcore2. (66.110.10.215)
Origin IGP, valid, internal
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.
9829
l78-mcore3. (metric 2968) from mlv-tcore1. (66.110.10.202)
Origin IGP, valid, internal
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.
9829
l78-mcore3. (metric 2968) from cxr-tcore1. (66.110.10.113)
Origin IGP, valid, internal, best
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.

 

 

So clearly in all three cases Tata AS6453 is getting the route via the loopback interface of its router in London (mcore3, London). There’s not even a single route via m-core Chennai/Mumbai through VSNL AS4755.

 

 

So what’s the possible fix?

Likely something like this:

  1. BSNL should maintain good capacity on IP ports along with IPLC ports.
  2. They should announce all prefixes on IP ports at least, without any preferred more-specific announcement on IPLC, such as announcing a /18 on the IP port and a more specific /20 on IPLC.
  3. BSNL should implement BGP blackholing to keep East Asian traffic off their IPLC ports, since most of those ports are in London, New York and Los Angeles and not really in East Asia (as far as I can see from routes).
  4. BSNL “could” do a basic one-degree prepend on IPLC routes, especially with Tata AS6453, since AS6453 > AS9829 is a shorter AS path than AS6453 > AS4755 > AS9829. With a one-degree prepend they would have AS6453 > AS9829 > AS9829 (own AS repeated once), lengthening the AS path to make the route less preferred.
  5. Buy an IPLC port to reach Equinix Singapore + Hong Kong Internet Exchange (HKIX); that’s where they can find a lot of local Asian traffic.
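
The prepend idea in point 4 can be checked with a few lines of Python. This is a sketch under simple assumptions: the `prepend` helper is hypothetical, the paths are written as AS6453 would receive them, and only AS path length is compared (local preference and later tie-breakers are ignored).

```python
def prepend(as_path, own_asn, times=1):
    """Return `as_path` with `own_asn` repeated `times` extra times in front."""
    return [own_asn] * times + list(as_path)


# Paths as received by Tata AS6453 (illustrative).
VIA_IPLC = [9829]        # direct session with BSNL over IPLC
VIA_VSNL = [4755, 9829]  # through VSNL AS4755 inside India

if __name__ == "__main__":
    for n in (0, 1, 2):
        iplc = prepend(VIA_IPLC, 9829, n)
        print(n, iplc, len(iplc), "vs", len(VIA_VSNL))
```

Under these assumptions a single prepend brings the IPLC path level with the path via AS4755, so the choice would then fall to later tie-breakers; a second prepend makes the IPLC path strictly longer.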
01 Jun

BSNL > Softlayer connectivity problem & possible fix

It’s late at night here in India. I have my final 8th semester exams and, as usual, I am really bored!

This time we have interesting subjects, but the syllabus is still pretty boring, spread across multiple books, notes and PDFs. Anyway, I will be out of college after June, which sounds good.

 

Tonight, I found a routing glitch. Yes a routing glitch!! 🙂

These issues somehow keep my life in orbit and give a good understanding on how routing works over the Internet.

 

 

OK – so the issue

I noticed a really bad (forward) route from my BSNL connection to the hostgator.in website, hosted at Softlayer Singapore. Let’s look at the forward path:

anurag:~ anurag$ traceroute -a hostgator.in
traceroute to hostgator.in (216.12.194.67), 64 hops max, 52 byte packets
1 [AS65534] router.home (10.10.0.1) 1.189 ms 0.910 ms 0.810 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 17.707 ms 21.147 ms 16.925 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 30.195 ms 29.766 ms 29.976 ms
4 [AS9829] 218.248.250.82 (218.248.250.82) 75.432 ms 77.488 ms 76.761 ms
5 [AS6453] if-11-1-1.mcore3.laa-losangeles.as6453.net (209.58.85.5) 368.104 ms 303.206 ms 309.964 ms
6 [AS6453] if-10-2-0-14.tcore2.lvw-losangeles.as6453.net (216.6.84.6) 309.070 ms 308.725 ms 310.073 ms
7 [AS6453] 216.6.84.66 (216.6.84.66) 317.050 ms 318.714 ms 398.408 ms
8 [AS2914] ae-5.r21.lsanca03.us.bb.gin.ntt.net (129.250.5.85) 305.672 ms * 304.480 ms
9 [AS2914] as-2.r20.osakjp01.jp.bb.gin.ntt.net (129.250.3.202) 414.205 ms
[AS2914] as-1.r21.tokyjp01.jp.bb.gin.ntt.net (129.250.3.146) 485.451 ms
[AS2914] as-2.r20.osakjp01.jp.bb.gin.ntt.net (129.250.3.202) 414.272 ms
10 [AS2914] ae-3.r24.tokyjp05.jp.bb.gin.ntt.net (129.250.6.188) 381.221 ms
[AS2914] ae-1.r23.osakjp01.jp.bb.gin.ntt.net (129.250.2.49) 420.412 ms
[AS2914] ae-3.r25.tokyjp05.jp.bb.gin.ntt.net (129.250.6.192) 372.768 ms
11 [AS2914] ae-7.r25.tokyjp05.jp.bb.gin.ntt.net (129.250.3.223) 394.899 ms
[AS2914] ae-7.r24.tokyjp05.jp.bb.gin.ntt.net (129.250.3.221) 406.922 ms
[AS2914] ae-2.r00.tokyjp03.jp.bb.gin.ntt.net (129.250.2.5) 491.190 ms
12 [AS2914] ae-3.r00.tokyjp03.jp.bb.gin.ntt.net (129.250.4.233) 399.065 ms
[AS2914] xe-0-0-0.bbr01.eq01.tok01.networklayer.com (61.213.145.38) 307.955 ms
[AS2914] ae-2.r00.tokyjp03.jp.bb.gin.ntt.net (129.250.2.5) 392.937 ms
13 [AS2914] xe-0-0-0.bbr01.eq01.tok01.networklayer.com (61.213.145.38) 310.298 ms
[AS36351] ae1.bbr01.eq01.sng02.networklayer.com (50.97.18.165) 306.396 ms
[AS2914] xe-0-0-0.bbr01.eq01.tok01.networklayer.com (61.213.145.38) 407.191 ms
14 [AS36351] ae5.dar01.sr03.sng01.networklayer.com (50.97.18.197) 388.660 ms
[AS36351] ae5.dar02.sr03.sng01.networklayer.com (50.97.18.199) 303.546 ms 409.645 ms
15 [AS36351] po2.fcr01.sr03.sng01.networklayer.com (174.133.118.133) 407.589 ms
[AS36351] ae5.dar02.sr03.sng01.networklayer.com (50.97.18.199) 310.587 ms
[AS36351] po2.fcr01.sr03.sng01.networklayer.com (174.133.118.133) 305.969 ms
16 [AS36351] po2.fcr01.sr03.sng01.networklayer.com (174.133.118.133) 363.405 ms * 309.151 ms
17 * * *
18 * * *

 

BSNL (India) >> IPLC circuit >> Tata AS6453 Los Angeles, California >> NTT (US) >> NTT (Asia) >> NTT (Tokyo) >> Softlayer (Tokyo) >> Softlayer (Singapore)

Wow!

Pretty bad. Ideally route should be BSNL > Upstream – Tata/Reliance/Airtel/Vodafone > Singapore (that’s it. Over!)

 

Interestingly, Softlayer operates a nice looking glass, so I was able to trace the return path from there to my home router to get an idea of the complete route.

bbr02.eq01.sng02> traceroute 117.220.163.128
HOST: bbr02.eq01.sng02-re0 Loss% Snt Last Avg Best Wrst StDev
1. 63.218.213.173 0.0% 5 0.4 0.5 0.4 0.5 0.0
2. 63.218.228.65 0.0% 5 0.6 0.6 0.5 0.7 0.0 <<< PCCW Global
3. 120.29.215.33 0.0% 5 11.1 7.4 0.6 12.7 5.1 <<< Tata AS6453
4. 120.29.214.13 0.0% 5 0.6 2.4 0.6 9.6 4.0
5. 180.87.12.9 0.0% 5 62.1 61.2 60.7 62.1 0.6
6. 180.87.12.54 0.0% 5 97.7 73.3 60.8 97.7 17.4
7. 180.87.36.33 0.0% 5 103.2 75.0 59.6 103.2 18.1
8. 180.87.38.74 0.0% 5 61.1 74.0 61.1 88.9 12.4 <<< Tata AS6453
9. 115.114.131.138 0.0% 5 91.7 92.6 91.7 96.3 2.0 <<<< VSNL AS4755
10. 218.248.255.101 0.0% 5 95.5 96.5 95.5 99.7 1.8 <<<< Hits BSNL AS9829
11. 218.248.169.117 0.0% 5 106.6 110.4 106.4 126.2 8.8
12. 218.248.169.117 0.0% 5 106.3 107.0 106.3 108.6 1.0
13. ???

 

 

Overall pretty good and direct. The latency is also as expected till hop 12, because the forward route (i.e. BSNL > Softlayer) is direct from the BSNL router on hop 12, whereas routers below it take the route via the US. The return path trace is not showing those routers because BSNL is dropping ICMP.

 

Reason for problem:

The forward path is terribly bad here because BSNL lets the usual BGP route selection algorithm deal with it. Basically BSNL is getting multiple routes for Softlayer’s prefix: one from its IP port in India with Tata-VSNL AS4755, and another from its port with Tata in Los Angeles (Tata AS6453) over IPLC.

 

So possible routes as per AS paths are:

AS9829 > AS4755 > AS6453 > AS2914 > AS36351 

AS9829 > AS6453 > AS2914 > AS36351
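
As a rough illustration of how AS path length decides between these two candidates, here is a minimal Python sketch. It assumes both routes tie on local preference (nothing here shows BSNL’s actual policy), and the function name is made up for this example.

```python
def shortest_as_path(candidates):
    """Pick the candidate with the fewest AS hops (ties keep the first one)."""
    return min(candidates, key=len)


CANDIDATES = [
    [9829, 4755, 6453, 2914, 36351],  # via Tata-VSNL AS4755 inside India
    [9829, 6453, 2914, 36351],        # via Tata AS6453 over the IPLC
]

if __name__ == "__main__":
    best = shortest_as_path(CANDIDATES)
    print(" > ".join(f"AS{asn}" for asn in best))
```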

 

By default, BGP picks the shorter AS path, i.e. the 2nd one. In case of #1, the BGP session between BSNL AS9829 and Tata-VSNL AS4755 is within India.

For example:

1 [AS65534] router.home (10.10.0.1) 1.709 ms 0.912 ms 0.982 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 17.451 ms 18.075 ms 19.029 ms
3 [AS9829] 218.248.169.122 (218.248.169.122) 21.843 ms 24.584 ms 22.491 ms
4 [AS4755] 115.114.57.165.static-mumbai.vsnl.net.in (115.114.57.165) 57.399 ms 58.563 ms 57.446 ms

 

Very likely the BGP session here is configured on a usual /30 subnet: one IP on BSNL’s side, one on Tata’s side, one as the broadcast address and the fourth (the network address) lying unused due to the math game!
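
The /30 arithmetic is easy to verify with Python’s ipaddress module. This is just a sketch: the `p2p_hosts` helper is a made-up name, and the /30 prefix length is the same guess as in the text, not something confirmed by BSNL or Tata.

```python
import ipaddress


def p2p_hosts(ip: str, prefixlen: int = 30):
    """Return the enclosing subnet of `ip` and its usable host addresses."""
    net = ipaddress.ip_interface(f"{ip}/{prefixlen}").network
    # A /30 has 4 addresses: network, two usable hosts, broadcast.
    return str(net), [str(h) for h in net.hosts()]


if __name__ == "__main__":
    print(p2p_hosts("115.114.57.165"))
    print(p2p_hosts("209.58.85.5"))
```

If the /30 guess holds, the other end of 115.114.57.165 is 115.114.57.166, and the other end of 209.58.85.5 is 209.58.85.6.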

So 115.114.57.165 is part of that /30. Let’s ping it:

anurag:~ anurag$ ping -c 5 115.114.57.165
PING 115.114.57.165 (115.114.57.165): 56 data bytes
64 bytes from 115.114.57.165: icmp_seq=0 ttl=58 time=63.286 ms
64 bytes from 115.114.57.165: icmp_seq=1 ttl=58 time=66.029 ms
64 bytes from 115.114.57.165: icmp_seq=2 ttl=58 time=59.063 ms
64 bytes from 115.114.57.165: icmp_seq=3 ttl=58 time=59.439 ms
64 bytes from 115.114.57.165: icmp_seq=4 ttl=58 time=61.719 ms

--- 115.114.57.165 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 59.063/61.907/66.029/2.573 ms
anurag:~ anurag$

 

60ms latency – for sure Mumbai and all good here.

 

Now let’s look at IP just next to it:

 

anurag:~ anurag$ ping -c 5 115.114.57.166
PING 115.114.57.166 (115.114.57.166): 56 data bytes
64 bytes from 115.114.57.166: icmp_seq=0 ttl=251 time=28.784 ms
64 bytes from 115.114.57.166: icmp_seq=1 ttl=251 time=25.586 ms
64 bytes from 115.114.57.166: icmp_seq=2 ttl=251 time=28.631 ms
64 bytes from 115.114.57.166: icmp_seq=3 ttl=251 time=26.905 ms
64 bytes from 115.114.57.166: icmp_seq=4 ttl=251 time=26.213 ms

--- 115.114.57.166 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 25.586/27.224/28.784/1.282 ms
anurag:~ anurag$

 

Half the latency, and that’s the BSNL router in Delhi/Noida where they are taking the drop from Tata. It’s BSNL’s router but sitting on Tata’s IP for the BGP session. So this clearly tells us that when we see routes from AS9829 to AS4755 Tata-VSNL, they are between routers within India.

 

Now coming back to bad route between BSNL and Softlayer, in that case first few hops are:

1 [AS65534] router.home (10.10.0.1) 1.189 ms 0.910 ms 0.810 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 17.707 ms 21.147 ms 16.925 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 30.195 ms 29.766 ms 29.976 ms
4 [AS9829] 218.248.250.82 (218.248.250.82) 75.432 ms 77.488 ms 76.761 ms
5 [AS6453] if-11-1-1.mcore3.laa-losangeles.as6453.net (209.58.85.5) 368.104 ms 303.206 ms 309.964 ms

 

Hop 5 has a latency of 300ms (usual for India > US routes). Again assuming 209.58.85.5 comes from a /30 and that, as per usual BSNL practice, the next IP in that subnet (209.58.85.6) is on BSNL’s side, let’s ping 209.58.85.6:

anurag:~ anurag$ ping -c 5 209.58.85.6
PING 209.58.85.6 (209.58.85.6): 56 data bytes
64 bytes from 209.58.85.6: icmp_seq=0 ttl=250 time=373.483 ms
64 bytes from 209.58.85.6: icmp_seq=1 ttl=250 time=395.493 ms
64 bytes from 209.58.85.6: icmp_seq=2 ttl=250 time=419.340 ms
64 bytes from 209.58.85.6: icmp_seq=3 ttl=250 time=305.460 ms
64 bytes from 209.58.85.6: icmp_seq=4 ttl=250 time=362.598 ms

--- 209.58.85.6 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 305.460/371.275/419.340/38.232 ms
anurag:~ anurag$

 

 

Hmm... 300ms latency. Unexpected. I thought this router was in India, but this seems slightly more complex. Likely the BGP session here is using BSNL’s /30 subnet and not Tata Comm’s subnet.

OK, let’s look at the last BSNL IP on that trace: it was 218.248.250.82. Let’s ask the Tata AS6453 Los Angeles (LAA) router for its BGP table via the AS6453 Looking Glass:

 

Router: gin-laa-mcore3
Site: US, Los angeles, LAA
Command: show ip bgp 218.248.250.82

BGP routing table entry for 218.248.240.0/20
Bestpath Modifiers: deterministic-med
Paths: (2 available, best #1)
14 16 17 18
9829
ix-3-2.mcore3.LAA-LosAngeles. from ix-3-2.mcore3.LAA-LosAngeles. (218.248.254.99)
Origin IGP, valid, external, best
Community:
9829, (received-only)
ix-3-2.mcore3.LAA-LosAngeles. from ix-3-2.mcore3.LAA-LosAngeles. (218.248.254.99)
Origin IGP, valid, external

 

So BGP route is via – 218.248.254.99

Let’s trace:

traceroute to 218.248.254.99 (218.248.254.99), 64 hops max, 52 byte packets
1 router.home (10.10.0.1) 4.047 ms 0.875 ms 0.958 ms
2 117.220.160.1 (117.220.160.1) 18.779 ms 17.490 ms 19.334 ms
3 218.248.169.126 (218.248.169.126) 44.040 ms 32.802 ms 29.831 ms
4 218.248.250.174 (218.248.250.174) 82.626 ms 87.126 ms 84.243 ms
5 218.248.255.99 (218.248.255.99) 86.061 ms 85.503 ms 83.003 ms

 

Here we go!

So clearly the BSNL router on 218.248.255.99 is placed in India and has a BGP session with the Tata AS6453 router in Los Angeles. This is over an IPLC circuit from Tata Communications.

 

Possible fix…

Following an amazing quote: “Never call it a problem unless you have the solution!”

So the problem here is not really with Tata’s network. They are just selling bandwidth in the form of two products: IP Transit & IPLC. It’s BSNL’s careless use of IPLC that is at fault. Likely BSNL won’t care or put much effort into fixing it.

There can be a possible fix from Softlayer’s side. If they blackhole the prefix announcement to BSNL AS9829 via Tata AS6453, BSNL will never pick their IPLC (or even IP) route. Instead they will just pick the route via any other upstream like Airtel or Reliance Globalcom.

 

Let’s look at relationship of Tata AS6453 with PCCW Global (upstream for Softlayer)

anurag:~ anurag$ whois -h whois.radb.net as6453 | grep -w AS3491
import: from AS3491 action pref = 100; accept AS-CAIS
export: to AS3491 announce AS-GLOBEINTERNET
import: from AS3491 action pref = 100; accept AS-CAIS
export: to AS3491 announce AS-GLOBEINTERNET
anurag:~ anurag$

 

Clearly both are peering! 

Based on presentation from Mr Amit Dunga (from Tata Communications) at SANOG, here’s list of BGP communities used by Tata AS6453:

[Screenshot: BGP communities used by Tata AS6453]

 

 

Thus if Softlayer could get its upstream providers (like PCCW in this specific case) to use 65009:9829, this would ensure that the route learnt by Tata AS6453 from PCCW Global AS3491 is NOT exported to BSNL AS9829. BSNL would then instead get the route via Bharti Airtel AS9498 or Reliance AS18101.
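
The effect of such a community can be sketched in Python. This models only the behaviour described above, namely that a 65009:<ASN> tag means "do not export to that ASN". That meaning is taken from this post’s reading of the SANOG slide and is an assumption here, and the function name is made up.

```python
def export_allowed(communities, neighbour_asn, no_export_to=65009):
    """Return False if a '<no_export_to>:<neighbour_asn>' community is attached."""
    # Assumed semantics: 65009:<ASN> asks AS6453 not to announce to <ASN>.
    return f"{no_export_to}:{neighbour_asn}" not in communities


if __name__ == "__main__":
    tagged = {"65009:9829"}
    print(export_allowed(tagged, 9829))  # BSNL: route withheld
    print(export_allowed(tagged, 9498))  # other neighbours unaffected
```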

 

I just sent this detailed info by email to Softlayer and BSNL. And oh yes, I don’t know why hostgator.in is hosted at Softlayer Singapore anyway. They provide hosting in India out of the CtrlS datacenter. Why they host their own home site in Singapore is beyond my understanding!

 

With hopes that your packets to Singapore are not routing via US, time for me to get back to my “cramming” for exams. 🙂