21 Nov

Analysis: Inconsistent latency between two endpoints

An interesting evening here in the village. Sessional tests started at college today, and so do my blog posts (to keep myself charged with positive energy!) 😉




Learned something new while troubleshooting. 🙂

I am used to seeing latency of ~350ms to my server in Europe, as I have mentioned in past blog posts. My connection > server goes direct, but the return path goes via the US, and that is what inflates the latency. Today, all of a sudden, I saw 200ms to my server – 150ms less. That’s significant.

My immediate idea was that BSNL had changed its BGP announcements and was likely announcing the prefix in the EU to get a direct return path. To confirm, I connected to my server and fired off a traceroute. It gave a strange result: latency of over 350ms, just as it had always been. I pinged my home router from the server and the latency was still 350ms, while from the other side, i.e. my home connection > server, the ping was 200ms. Very, very strange!

Remember: “packets CAN take different forward and return paths” – but the round trip they complete stays the same. So if home > server is 200ms (for sure not via the US), how can server > home be via the US at 350ms? Based on my understanding, even if the forward and return routes differ, the round-trip latency should be identical from both ends, since a ping from either side crosses both one-way paths exactly once.
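To make that reasoning concrete, here is a tiny sketch with made-up one-way delays (the 100/250ms figures are illustrative, not measurements):

```shell
# Hypothetical one-way delays in ms (illustrative only):
fwd=100   # home -> server, direct via Europe
ret=250   # server -> home, via the US
# A ping from either end crosses both one-way paths exactly once,
# so the round trip is the same sum no matter where it is measured:
rtt_from_home=$((fwd + ret))
rtt_from_server=$((ret + fwd))
echo "from home: ${rtt_from_home}ms, from server: ${rtt_from_server}ms"
```

Both come out to 350ms, which is exactly why the asymmetric readings are so puzzling.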


I decided to collect some more data to confirm my observation, so I ran a 1000-packet ping from both points: home to server, and server to home.



Home > Server ping:

1000 packets transmitted, 1000 received, 0% packet loss, time 999739ms
rtt min/avg/max/mdev = 197.128/201.314/294.035/11.347 ms

Server > Home ping:

1000 packets transmitted, 960 received, 4% packet loss, time 999710ms
rtt min/avg/max/mdev = 319.060/377.464/499.294/18.203 ms


Clearly home > server is ~200ms while server > home is 320ms and up (377ms on average). What on earth is going on here?
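If you want to pull the average out of such summaries programmatically, a quick awk one-liner does it (the sample line here is the home > server summary from above):

```shell
# Extract the avg figure from a ping summary line; splitting fields
# on both "/" and " " puts the average in field 8:
summary='rtt min/avg/max/mdev = 197.128/201.314/294.035/11.347 ms'
avg=$(echo "$summary" | awk -F'[/ ]' '{print $8}')
echo "avg RTT: ${avg}ms"
```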

I ran a traceroute from both endpoints to compare latency hop by hop; the return path is clearly via the US.


Home to server trace:

anurag@laptop:~$ traceroute -A
traceroute to (, 30 hops max, 60 byte packets
1 router.local ( [AS1] 3.578 ms 3.894 ms 4.234 ms
2 ( [AS9829] 30.015 ms 32.443 ms 35.071 ms
3 ( [AS9829] 37.402 ms 39.812 ms 43.527 ms
4 ( [AS18101] 50.175 ms 52.586 ms 55.038 ms
5 ( [AS18101] 80.995 ms 84.568 ms 92.177 ms
6 ( [AS15412] 321.376 ms 287.380 ms 299.725 ms
7 xe-8-3-0.0.pjr04.mmb004.flagtel.com ( [AS15412] 74.802 ms 76.024 ms 77.518 ms
8 xe-0-0-0.0.pjr04.ldn001.flagtel.com ( [AS15412] 362.197 ms 364.614 ms 367.255 ms
9 xe-11-0-0.edge5.London1.Level3.net ( [AS3356/AS9057] 365.825 ms 368.374 ms 370.728 ms
10 ae-52-52.csw2.London1.Level3.net ( [AS3356] 383.136 ms 386.773 ms 387.971 ms
11 ae-57-222.ebr2.London1.Level3.net ( [AS3356] 391.669 ms ae-59-224.ebr2.London1.Level3.net ( [AS3356] 384.353 ms ae-58-223.ebr2.London1.Level3.net ( [AS3356] 390.391 ms
12 ae-24-24.ebr2.Frankfurt1.Level3.net ( [AS3356] 367.976 ms 368.210 ms ae-23-23.ebr2.Frankfurt1.Level3.net ( [AS3356] 374.705 ms
13 ae-82-82.csw3.Frankfurt1.Level3.net ( [AS3356] 368.563 ms ae-62-62.csw1.Frankfurt1.Level3.net ( [AS3356] 371.392 ms ae-82-82.csw3.Frankfurt1.Level3.net ( [AS3356] 380.242 ms
14 ae-71-71.ebr1.Frankfurt1.Level3.net ( [AS3356] 366.064 ms ae-81-81.ebr1.Frankfurt1.Level3.net ( [AS3356] 367.639 ms ae-61-61.ebr1.Frankfurt1.Level3.net ( [AS3356] 375.044 ms
15 ae-1-19.bar1.Munich1.Level3.net ( [AS3356] 388.552 ms 390.943 ms 393.417 ms
16 GIGA-HOSTIN.bar1.Munich1.Level3.net ( [AS9057/AS3356] 222.769 ms 225.136 ms 227.559 ms
17 server7 ( [AS51167] 231.750 ms 234.424 ms 236.893 ms


(Note the clear latency drop between the 15th and 16th hops – a clue that the return path is different.)


Server > Home trace:

root@server7:~# traceroute
traceroute to (, 30 hops max, 60 byte packets
1 gw.giga-dns.com ( 0.725 ms 0.719 ms 0.705 ms
2 host-93-104-204-33.customer.m-online.net ( 0.694 ms 0.678 ms 0.669 ms
3 xe-1-1-0.rt-decix-2.m-online.net ( 7.859 ms 7.855 ms 7.853 ms
4 xe-1-1-0.rt-decix-2.m-online.net ( 7.592 ms 7.597 ms 7.591 ms
5 ( 8.048 ms 8.048 ms 8.271 ms
6 ae-5.r21.frnkge03.de.bb.gin.ntt.net ( 7.776 ms ae-2.r20.frnkge04.de.bb.gin.ntt.net ( 8.073 ms ae-5.r21.frnkge03.de.bb.gin.ntt.net ( 7.820 ms
7 ae-0.r20.frnkge04.de.bb.gin.ntt.net ( 8.067 ms ae-1.r21.asbnva02.us.bb.gin.ntt.net ( 98.758 ms ae-0.r20.frnkge04.de.bb.gin.ntt.net ( 8.042 ms
8 ae-0.r20.asbnva02.us.bb.gin.ntt.net ( 93.975 ms ae-1.r21.asbnva02.us.bb.gin.ntt.net ( 113.841 ms ae-0.r20.asbnva02.us.bb.gin.ntt.net ( 93.847 ms
9 ae-0.r20.asbnva02.us.bb.gin.ntt.net ( 99.587 ms 93.837 ms 93.829 ms
10 ae-2.r04.lsanca03.us.bb.gin.ntt.net ( 168.760 ms 168.761 ms 167.983 ms
11 ae-2.r04.lsanca03.us.bb.gin.ntt.net ( 167.236 ms xe-0-1-0-10.r04.lsanca03.us.ce.gin.ntt.net ( 159.727 ms ae-2.r04.lsanca03.us.bb.gin.ntt.net ( 160.751 ms
12 xe-0-1-0-10.r04.lsanca03.us.ce.gin.ntt.net ( 163.235 ms 160.475 ms 165.715 ms
13 ( 340.748 ms 344.682 ms 338.766 ms
14 ( 336.735 ms ( 335.243 ms ( 338.721 ms
15 ( 345.967 ms ( 335.266 ms 342.750 ms
16 ( 342.260 ms 345.005 ms 343.014 ms
17 ( 347.736 ms * 353.451 ms
18 * * *
19 * * *
20 * * *



This brings me back to the question: if the return path is via the US, how can the latency be 200ms when pinging from home?


All of a sudden I thought of the multiple subnets on the server! (YES YES YES!)

There are multiple subnets configured on the server, with IPs belonging to completely different ranges – in fact, to different ASNs and BGP announcements (here we go!). Let’s call them IP1 and IP2: IP1 is the server’s primary address, and the gateway in IP1’s range is the server’s default route, while IP2 is secondary. So far I had been pinging IP2. Let’s ping IP1 from home:

5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 396.381/399.734/403.037/2.617 ms


Clearly the expected latency!! 🙂



Here’s what was happening. The server has two IPs – IP1 and IP2.
IP1 is the default and is covered by the BGP announcement of M-Online (a German ISP), while IP2 is secondary and is covered by the BGP announcement of the datacenter itself (they recently got an ASN and run their own autonomous network). M-Online is a relatively big ISP with transit from multiple providers, including Level3 and Telia, plus lots of peering, while the datacenter’s network has almost no peering and takes transit from Level3 and Telia. Thus IP1 (primary) is from M-Online, and the server’s default route points to M-Online’s gateway.

But since IP2 is from a different subnet, covered by a different BGP announcement, traffic to and from it follows the datacenter’s own routing. For IP1, the preferred route was via Level3, following Level3 (EU) > Level3 (US) > Reliance-FLAG (US) > BSNL (India) – hence a trip via the US with high latency. But for IP2, which is from the datacenter’s network, the preferred transit seems to be Telia (rather than Level3). Telia appears to have a better relationship with Tata AS6453, which is one of BSNL’s upstream transit providers besides Reliance. Also, Tata carries BSNL’s prefixes on its border routers in London, so the path for IP2 is: datacenter > Telia (Europe) > Tata (Europe) > Tata (India) > Tata-VSNL (India) > BSNL. This is a better path and gives relatively low latency.
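To see why the handoff point matters so much, here is a rough sketch with invented per-segment delays (none of these numbers are measured; only the totals roughly match the RTT halves observed above):

```shell
# Invented one-way segment delays in ms, for illustration only.
# IP1 return: Level3 (EU) > Level3 (US) > Reliance-FLAG (US) > BSNL (India)
ip1=$((10 + 90 + 30 + 220))
# IP2 return: Telia (EU) > Tata (EU) > Tata (India) > Tata-VSNL > BSNL
ip2=$((10 + 5 + 110 + 5 + 20))
echo "IP1 return via US: ~${ip1}ms, IP2 return via Europe: ~${ip2}ms"
```

Crossing the Atlantic twice is what kills IP1; IP2 hands off to Tata while still in Europe.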


TeliaSonera Looking Glass – traceroute inet as-number-lookup

Router: Munich
Command: traceroute inet as-number-lookup

traceroute to (, 30 hops max, 40 byte packets
1 ffm-bb1-link.telia.net ( 7.760 ms 8.963 ms ffm-bb1-link.telia.net ( 9.625 ms
2 ffm-b2-link.telia.net ( 8.028 ms ffm-b2-link.telia.net ( 8.049 ms ffm-b2-link.telia.net ( 18.213 ms
3 teleglobe-122701-ffm-b2.telia.net ( 8.105 ms 8.014 ms 8.120 ms
4 if-3-2.tcore1.PVU-Paris.as6453.net ( [AS 6453] 135.317 ms if-5-2.tcore1.PVU-Paris.as6453.net ( [AS 6453] 137.277 ms if-3-2.tcore1.PVU-Paris.as6453.net ( [AS 6453] 153.583 ms
MPLS Label=332640 CoS=0 TTL=1 S=1
5 if-2-2.tcore1.PYE-Paris.as6453.net ( [AS 6453] 145.658 ms if-12-2.tcore1.PYE-Paris.as6453.net ( [AS 6453] 135.219 ms if-2-2.tcore1.PYE-Paris.as6453.net ( [AS 6453] 131.225 ms
MPLS Label=522146 CoS=0 TTL=1 S=1
6 if-8-1600.tcore1.WYN-Marseille.as6453.net ( [AS 6453] 135.192 ms 149.243 ms 135.519 ms
MPLS Label=646803 CoS=0 TTL=1 S=1
7 if-9-5.tcore1.MLV-Mumbai.as6453.net ( [AS 6453] 152.803 ms 138.552 ms 171.287 ms
8 ( [AS 6453] 133.772 ms 134.308 ms 134.257 ms
9 ( [AS 4755] 179.789 ms 179.780 ms 179.786 ms
10 ( [AS 9829] 186.435 ms 176.686 ms 176.350 ms
11 ( [AS 9829] 184.978 ms 185.500 ms 194.190 ms
12 ( [AS 9829] 185.008 ms 184.857 ms 187.702 ms
13 * * *
14 * * *


This is why home > server vs server > home latency is different. Case closed! 🙂


Heart of the issue

The core cause of all these strange results is BSNL itself. On one hand, they purchase transit in India from Indian ISPs like Tata. This works well: since they purchase transit, they announce their routes to Tata-VSNL and get the global table from Tata-VSNL. Tata-VSNL then announces these prefixes to Tata Communications’ main Tier 1 backbone, AS6453, which announces them consistently across its border routers in the US, Europe, and Asia. At the same time, BSNL makes crazy purchases of IPLC (International Private Leased Circuit) capacity and connects to ISPs’ routers outside India over layer 2 leased circuits. Here BSNL’s router in India forms a BGP session with the upstream’s router outside India. That is OK, BUT it should be done everywhere. Since BSNL uses the IPLC setup only with the US, there end up being very few paths into BSNL’s network, most of them via their upstreams’ New York or Los Angeles routers. Either BSNL has to stop using IPLC and simply purchase bandwidth in India, or it has to set up BGP sessions in Europe and Asia as well.


Comparison of setup

Let’s pick the transit provider from whom BSNL purchases within India – Tata. The clear identifier here is that BSNL connects to Tata-VSNL AS4755, and AS4755 in turn connects to Tata AS6453. AS4755 is used only within India, and AS6453 is used everywhere else except India. In India, AS6453 generally connects only to AS4755, so the AS4755 > AS9829 adjacency happens only in India. 🙂

So here’s a lookup of the routing table from Tata’s routers in Mumbai, London, New York, and Singapore:

Router: gin-mlv-core1
Site: IN, Mumbai – MLV, VSNL
Command: show ip bgp

BGP routing table entry for
Bestpath Modifiers: deterministic-med
Paths: (3 available, best #3)
Multipath: eBGP
4755 9829
mlv-tcore1. (metric 1) from cxr-tcore1. (
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 1) from mlv-tcore2. (
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 1) from mlv-tcore1. (
Origin IGP, valid, internal, best


Router: gin-lhx-core1
Command: show ip bgp

BGP routing table entry for
Bestpath Modifiers: deterministic-med
Paths: (2 available, best #1)
Multipath: eBGP
4755 9829
mlv-tcore1. (metric 3636) from ldn-mcore3. (ldn-mcore3.)
Origin IGP, valid, internal, best
4755 9829
mlv-tcore1. (metric 3636) from l78-tcore1. (
Origin IGP, valid, internal


Router: gin-nyt-core1
Site: US, New York – NYT, TELEHOUSE
Command: show ip bgp

BGP routing table entry for
Bestpath Modifiers: deterministic-med
Paths: (1 available, best #1)
4755 9829
tv2-tcore2. (metric 13282) from nyy-tcore1. (
Origin IGP, valid, internal, best

Router: gin-s9r-core1
Site: SG, Singapore – S9R, GLOBAL SW F6-SC6
Command: show ip bgp

BGP routing table entry for
Bestpath Modifiers: deterministic-med
Paths: (8 available, best #6)
Multipath: eBGP
4755 9829
mlv-tcore1. (metric 3176) from tv2-core1. (tv2-core1.)
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from hk2-core3. (hk2-core3.)
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from klt-tcore1. (
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from pye-core1. (pye-core1.)
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from l78-mcore3. (Loopback5.mcore3.L78-London.)
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from mlv-tcore1. (
Origin IGP, valid, internal, best
4755 9829
mlv-tcore1. (metric 3176) from rsd-core1. (rsd-core1.)
Origin IGP, valid, internal
4755 9829
mlv-tcore1. (metric 3176) from jsd-core1. (jsd-core1.)
Origin IGP, valid, internal



Clearly, all paths go to Tata-VSNL AS4755 in India directly (not via the US) and then onward to BSNL. This is not true in the case of their IPLC bandwidth purchases, where the entry point into the network will be a router of BSNL’s upstream in New York or Los Angeles only.

This is how BSNL screws itself up. Anyone listening?



Oh, and by the way, I was “made to write” a two-page answer to a question in today’s sessional test at college. The question: what is TCP/IP? I wish the teacher would ignore the crap I wrote there and look at this blog post instead. THIS is TCP/IP! 🙂

15 Nov

Dumb script for Picasaweb backup on Linux server & Amazon S3

Just wrote a quick script to pull a dump of my Picasaweb albums onto my server and onward to Amazon S3. Overall I have good trust in Google for my data, but it’s always a poor idea to leave all your eggs in a single bucket.

OK, here’s the script (poorly written code – I literally spent 10 minutes on this, so suggestions to improve my coding are more than welcome!):


#!/bin/bash
# Working directory for downloads and archives - set this to taste
Destination=/backup/picasaweb
mkdir -p "$Destination/tmp"

# Grab album names: first comma-separated field of list-albums output
google picasa list-albums | cut -d"," -f1 > "$Destination/tmp/album_list.txt"

# Download every album, one by one
while read album; do
        google picasa get "$album" "$Destination/tmp"
done < "$Destination/tmp/album_list.txt"

# Archive with today's date, encrypt, and push to S3
FileName=PicsBackup-$(date '+%d-%B-%Y').tar
tar -cpzf "$Destination/$FileName" "$Destination/tmp"
gpg --output "$Destination/$FileName.pgp" -r <YOUR-PGP-KEY-HERE> --always-trust --encrypt "$Destination/$FileName"
s3cmd put "$Destination/$FileName.pgp" s3://YOUR-AWS-S3-BUCKET-ADDRESS-HERE

# Clean up local copies
rm -r "$Destination/tmp/"*
rm "$Destination/$FileName"
rm "$Destination/$FileName.pgp"



How to use

Simply download the Google CLI scripts and get your Google account working with the installed stack. If you also need Amazon S3 backup support, install and configure s3cmd. Once you have both configured with your accounts, simply give the script the executable bit and run it!


Code logic

Yes, it’s super crappy code, but anyway, it does the job.

I couldn’t find an easy way to download the entire album collection from Picasa. There seems to be a bug in the Google CLI tools around directory creation, so google picasa get .* . fails right after pulling the first album. The Google CLI does offer a list of album names (along with hyperlinks) via the list-albums parameter, so the first part of the code pulls that list and cuts the first field of the output using a comma as the delimiter. Next, the output is written to a text file, which is read line by line in a loop, and the loop simply downloads each album one by one. Once the download is complete, tar creates a compressed archive, followed by gpg to encrypt it. The encrypted file is then uploaded to Amazon S3 using the s3cmd tool, and finally all the downloaded files are deleted!
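The album-name extraction step boils down to this (the sample line below is made up for illustration; real list-albums output just needs the name in the first comma-separated field):

```shell
# Keep only the album name: field 1, comma as the delimiter.
sample='Holiday 2012, https://picasaweb.google.com/data/...'
album=$(echo "$sample" | cut -d"," -f1)
echo "$album"
```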


On Amazon S3 I have a bucket-expiry rule that takes care of rotating out and removing old data. I could spend a few more minutes making this fancier, but it just works! 😉


Moral: My programming is crappy, no doubt!

07 Nov

Google's routing issues because of an Indonesian ISP

Yesterday it was reported across the networking community that Google’s prefixes were having issues due to an Indonesian ISP, Moratel AS23947.


Quick analysis of what happened

From the data logged by routeviews, it seems it wasn’t exactly a prefix hijack. AS23947 did not originate the prefixes, but rather had a route leak, resulting in a leaked path of AS23947 > AS15169.

Here’s a view of the global routing table for Google’s prefix at 15:57 GMT on 4th Nov:



Then, at 02:07:27 GMT on the morning of 6th Nov, a route change was logged:

FROM 4436 15169 
TO 4436 3491 23947 15169


This path change was observed by routeviews for only one of its participating networks – nLayer AS4436; the rest of the participating networks show no change. By 02:07:31, i.e. within 4 seconds, the entire route for this specific network (and likely a few more) went via AS23947. At 02:35:06, i.e. 28 minutes after the leak began, the route was withdrawn, and within the next few seconds we can see the direct route preferred again.


The impact was limited to a very small part of the internet because Google peers directly with a lot of big networks, so a short path is preferred. E.g. Comcast in the US would ignore this leak, because Comcast AS36732 > Google AS15169 is a shorter AS path than Comcast > someone else > Google.
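The two AS paths logged by routeviews make the point by themselves; counting the hops is a one-liner:

```shell
# Direct vs leaked path as seen by nLayer (from the routeviews data above).
direct="4436 15169"
leaked="4436 3491 23947 15169"
# With other attributes equal, BGP prefers the shorter AS path:
set -- $direct; d=$#
set -- $leaked; l=$#
echo "direct: $d ASes, leaked: $l ASes"
```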


In the specific case above, nLayer preferred the longer path, likely because of its relationship with Google. Most networks prefer customer routes (for which they get paid) over settlement-free peering routes (for which they get nothing), and those over transit routes (for which they have to pay). The Indonesian ISP Moratel takes transit from a few major players, including PCCW Global, which was likely running an unfiltered BGP session with its customer. Thus the largest impact fell on PCCW Global’s network, which is Hong Kong based and fairly large in Asia.
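That customer > peer > transit ordering is usually implemented with local-preference; the numbers below are a common convention, not anything these particular networks publish:

```shell
# Sketch of a typical local-pref scheme: the higher value wins selection.
localpref() {
  case "$1" in
    customer) echo 200 ;;   # customer pays us: most preferred
    peer)     echo 100 ;;   # settlement-free peering
    transit)  echo 50  ;;   # we pay: least preferred
  esac
}
for t in customer peer transit; do
  echo "$t routes: local-pref $(localpref "$t")"
done
```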


These glitches remind network engineers to configure their routers carefully to avoid screwups! 🙂