Google Public DNS and Akamai issues in India

A quick blog post on an interesting issue caused by a combination of two problems: CDN mapping failure with Google Public DNS, and bad Akamai performance due to a Tata-NTT peering issue.

I was trying out Zimbra mail, since there’s no longer a free Google Apps edition and a friend of mine asked me to get basic email up on his domain. It was a more or less straightforward task: install Zimbra, which comes with a decent GUI.

I downloaded it on my Europe-based server and realized during installation that the package was for 64-bit, so I turned to my other server in India.
I started the download again and it was slow. DEAD SLOW!

Something like this:

root@server2:~# wget http://files2.zimbra.com/downloads/8.0.1_GA/zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz
--2012-12-18 14:02:59--  http://files2.zimbra.com/downloads/8.0.1_GA/zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz
Resolving files2.zimbra.com (files2.zimbra.com)... 23.32.241.26, 125.56.200.51
Connecting to files2.zimbra.com (files2.zimbra.com)|23.32.241.26|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 701053545 (669M) [binary/octet-stream]
Saving to: `zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz.1'

0% [ ] 5,545,378 67.5K/s eta 2h 28m ^C
root@server2:~#

At roughly 512 Kbps this would have taken more than 2 hours, even though the server is on a 100 Mbps connection and I usually get 20 Mbps or so from US/Europe-based sources. I had downloaded the same 700 MB file on the Europe-based server and it was quite fast at 40 Mbps+, while here I was getting just 512 Kbps.

I looked at the route from the Indian server, and it was:

traceroute to 23.32.241.26 (23.32.241.26), 30 hops max, 60 byte packets
1 103.6.87.1 (103.6.87.1) [AS36236] 1.310 ms 1.426 ms 1.716 ms
2 180.179.33.245 (180.179.33.245) [AS17439/AS9584] 0.843 ms 0.951 ms 0.958 ms
3 180.179.37.93 (180.179.37.93) [AS17439] 0.761 ms 0.762 ms 0.750 ms
4 * * *
5 180.179.37.137 (180.179.37.137) [AS17439] 0.840 ms 0.938 ms 1.091 ms
6 59.163.105.170.static-chennai.vsnl.net.in (59.163.105.170) [AS4755] 4.441 ms 3.891 ms 3.848 ms
7 * * *
8 ix-0-100.tcore2.MLV-Mumbai.as6453.net (180.87.39.25) [*] 27.175 ms 27.178 ms 27.927 ms
9 if-6-2.tcore1.L78-London.as6453.net (80.231.130.5) [AS6453] 143.917 ms 145.977 ms if-2-2.tcore1.MLV-Mumbai.as6453.net (180.87.38.1) [*] 135.972 ms
10 if-9-5.tcore1.WYN-Marseille.as6453.net (80.231.217.17) [AS6453] 132.133 ms 133.902 ms 133.286 ms
11 if-8-1600.tcore1.PYE-Paris.as6453.net (80.231.217.6) [AS6453] 134.763 ms 133.217 ms 136.691 ms
12 if-2-2.tcore1.PVU-Paris.as6453.net (80.231.154.17) [AS6453] 134.674 ms 137.558 ms *
13 * * *
14 ae-1.r21.parsfr01.fr.bb.gin.ntt.net (129.250.2.224) [AS2914] 153.657 ms 155.550 ms 152.412 ms
15 as-4.r22.amstnl02.nl.bb.gin.ntt.net (129.250.3.84) [AS2914] 155.136 ms 151.697 ms as-0.r25.tokyjp01.jp.bb.gin.ntt.net (129.250.3.79) [AS2914] 383.668 ms
16 * * *
17 * xe-3-2.a16.tokyjp01.jp.ra.gin.ntt.net (203.105.72.78) [AS2914] 271.296 ms 270.202 ms
18 a23-32-241-26.deploy.akamaitechnologies.com (23.32.241.26) [AS20940] 280.674 ms 282.407 ms xe-3-2.a16.tokyjp01.jp.ra.gin.ntt.net (203.105.72.78) [AS2914] 263.254 ms

An Akamai CDN node in Japan, and the route goes via Europe!!
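To make the detour easier to see, here is a minimal Python sketch that pulls the AS path and the worst round-trip time out of traceroute output. The function names (`as_path`, `max_rtt_ms`) are my own for illustration, and the sample is just a few hops copied from the trace above, so treat it as a rough helper rather than a complete traceroute parser.

```python
import re

# Matches AS annotations such as "[AS6453]" or "[AS17439/AS9584]".
# Hops tagged "[*]" (AS unknown) do not match and are skipped.
AS_TAG = re.compile(r"\[(AS\d+(?:/AS\d+)*)\]")
# Matches a round-trip time such as "132.133 ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms")

# A few hops copied from the slow trace above (not the full output).
SAMPLE = """\
 1 103.6.87.1 (103.6.87.1) [AS36236] 1.310 ms 1.426 ms 1.716 ms
 2 180.179.33.245 (180.179.33.245) [AS17439/AS9584] 0.843 ms 0.951 ms 0.958 ms
 6 59.163.105.170.static-chennai.vsnl.net.in (59.163.105.170) [AS4755] 4.441 ms 3.891 ms 3.848 ms
 8 ix-0-100.tcore2.MLV-Mumbai.as6453.net (180.87.39.25) [*] 27.175 ms 27.178 ms 27.927 ms
10 if-9-5.tcore1.WYN-Marseille.as6453.net (80.231.217.17) [AS6453] 132.133 ms 133.902 ms 133.286 ms
14 ae-1.r21.parsfr01.fr.bb.gin.ntt.net (129.250.2.224) [AS2914] 153.657 ms 155.550 ms 152.412 ms
18 a23-32-241-26.deploy.akamaitechnologies.com (23.32.241.26) [AS20940] 280.674 ms 282.407 ms xe-3-2.a16.tokyjp01.jp.ra.gin.ntt.net (203.105.72.78) [AS2914] 263.254 ms
"""

def as_path(traceroute_text):
    """Return AS numbers in order of first appearance along the path."""
    path = []
    for line in traceroute_text.splitlines():
        for tag in AS_TAG.findall(line):
            asn = tag.split("/")[0]  # keep the first AS when two are listed
            if asn not in path:
                path.append(asn)
    return path

def max_rtt_ms(traceroute_text):
    """Return the largest round-trip time (in ms) seen anywhere in the trace."""
    return max(float(value) for value in RTT.findall(traceroute_text))

print(" -> ".join(as_path(SAMPLE)))
print("worst RTT: %.1f ms" % max_rtt_ms(SAMPLE))
```

On this sample the path reads Tata India (AS4755), Tata's global network (AS6453), NTT (AS2914) and finally Akamai (AS20940), with the RTT jumping from a few milliseconds to well over 250 ms.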

This poor performance is the result of multiple issues:

  1. The datacenter is not running its own DNS servers but instead relying on Google Public DNS (8.8.8.8 and 8.8.4.4), whose resolvers have their upstream/full-table connectivity in East Asia and not really in India. Thus the Google DNS resolvers always hand back IPs of nodes outside India, in Japan or sometimes in Malaysia. I blogged about it in detail some time back here.
  2. The datacenter is picking Tata-VSNL (AS4755) as upstream for this route (not Airtel or Reliance), and interestingly enough Tata does NOT peer with NTT Communications in Japan. They peer everywhere else except in NTT's home market, which I guess will be the case with other ISPs as well. Thus the nearest point for Tata to hand off traffic to NTT is Europe, which we can see in the traceroute.
  3. Since the IP belongs to Akamai Japan, NTT brings the traffic back to Asia over NTT Asia’s network and eventually passes it to Akamai. It is strange that Akamai has not fixed this problem for such a long time. They must know about it, since I posted this problem on the NANOG mailing list last year and also blogged about it here.
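The first point is easy to check by asking two different resolvers for the same name and comparing the answers. Below is a minimal sketch in Python using only the standard library: it builds a plain DNS A query by hand (RFC 1035 wire format), sends it over UDP to a chosen resolver, and extracts the IPv4 addresses from the reply. The function names are my own, there is no retry, no TCP fallback for truncated replies and no validation of the response ID, so it is an illustration and not a replacement for dig.

```python
import socket
import struct

def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query for the A record of hostname."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def skip_name(msg, pos):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # a compression pointer is two bytes and ends the name
            return pos + 2
        pos += 1 + length

def parse_a_records(msg):
    """Extract IPv4 addresses from the answer section of a DNS response."""
    qdcount, ancount = struct.unpack("!HH", msg[4:8])
    pos = 12
    for _ in range(qdcount):
        pos = skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        pos = skip_name(msg, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        if rtype == 1 and rdlength == 4:  # A record; CNAMEs and others are skipped
            addresses.append(socket.inet_ntoa(msg[pos:pos + 4]))
        pos += rdlength
    return addresses

def resolve_via(resolver_ip, hostname, timeout=3.0):
    """Ask one specific resolver for the A records of hostname over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (resolver_ip, 53))
        response, _ = sock.recvfrom(4096)
    return parse_a_records(response)
```

Calling `resolve_via("8.8.8.8", "files2.zimbra.com")` and then `resolve_via("127.0.0.1", "files2.zimbra.com")` (the latter assumes a recursive resolver is listening locally) lets you compare which Akamai nodes each resolver hands out. The answers you get today will very likely differ from the ones shown in this post.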

OK - the fix!

I can surely do better than waiting 2 hours to download that package!
I quickly installed BIND, and since BIND runs as a “recursive resolver” by default, I simply pointed /etc/resolv.conf to 127.0.0.1:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# nameserver 8.8.8.8
# nameserver 8.8.4.4

nameserver 127.0.0.1
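A quick way to confirm which resolvers are actually in effect after such an edit is to list the uncommented nameserver lines. Here is a small Python sketch; `active_nameservers` is a name I made up, and the sample text mirrors the file shown above.

```python
def active_nameservers(resolv_conf_text):
    """Return nameserver addresses from resolv.conf text, ignoring comments."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":  # blank lines and comment lines
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers

# Sample mirroring the edited file above.
SAMPLE_RESOLV_CONF = """\
# nameserver 8.8.8.8
# nameserver 8.8.4.4

nameserver 127.0.0.1
"""

print(active_nameservers(SAMPLE_RESOLV_CONF))
```

On a real machine you would pass in the contents of /etc/resolv.conf instead, for example `active_nameservers(open("/etc/resolv.conf").read())`. Keep in mind that, as the header of the file warns, resolvconf may regenerate it and overwrite a manual edit.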

OK - now running the download again, let’s see how it works:

root@server2:~/tmp2# wget http://files2.zimbra.com/downloads/8.0.1_GA/zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz
--2012-12-18 14:31:55--  http://files2.zimbra.com/downloads/8.0.1_GA/zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz
Resolving files2.zimbra.com (files2.zimbra.com)... 125.252.226.97, 125.252.226.106
Connecting to files2.zimbra.com (files2.zimbra.com)|125.252.226.97|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 701053545 (669M) [binary/octet-stream]
Saving to: `zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz'

100%[============================================================================================================>] 701,053,545 3.07M/s in 3m 45s

2012-12-18 14:35:43 (2.98 MB/s) - `zcs-8.0.1_GA_5438.UBUNTU12_64.20121105164409.tgz' saved [701053545/701053545]

root@server2:~/tmp2# 

Fast? Yeah, a lot!

How?

A trace to the new destination this time shows:

traceroute to 125.252.226.97 (125.252.226.97), 30 hops max, 60 byte packets
1 103.6.87.1 (103.6.87.1) [AS36236] 2.089 ms 2.060 ms 2.047 ms
2 180.179.33.245 (180.179.33.245) [AS17439/AS9584] 2.012 ms 2.002 ms 1.997 ms
3 180.179.37.89 (180.179.37.89) [AS17439] 1.940 ms 1.949 ms 1.937 ms
4 180.179.37.38 (180.179.37.38) [AS17439] 2.258 ms 2.668 ms 3.144 ms
5 218.100.48.143 (218.100.48.143) [*] 2.199 ms 2.181 ms 2.174 ms
6 * 182.79.220.182 (182.79.220.182) [*] 1.683 ms 1.802 ms
7 a125-252-226-97.deploy.akamaitechnologies.com (125.252.226.97) [AS9498] 1.821 ms 1.775 ms 1.947 ms

An interesting point here is that NTT owns a majority stake in Netmagic, the datacenter where this server is located! But likely they can’t do much, since they would need a license in India to offer their own network in Netmagic. Or maybe they could simply peer more? :)

This is how I increased my download speed from 512Kbps to 24Mbps! ;)
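For anyone who wants to double-check the numbers, here is the arithmetic as a short Python sketch. It assumes that the "K" in wget's rate means 1024 bytes and that Kbps/Mbps are decimal (1000-based) units; the function names are mine.

```python
FILE_BYTES = 701_053_545  # "Length" reported by wget for the Zimbra tarball

def kib_per_s_to_kbps(kib_per_s):
    """Convert a wget-style rate (KiB/s, assumed 1024 bytes) to kilobits per second."""
    return kib_per_s * 1024 * 8 / 1000

def transfer_hours(total_bytes, kib_per_s):
    """Hours needed to move total_bytes at a constant rate given in KiB/s."""
    return total_bytes / (kib_per_s * 1024) / 3600

def average_mbps(total_bytes, seconds):
    """Average throughput in megabits per second over the whole transfer."""
    return total_bytes * 8 / seconds / 1_000_000

slow_kbps = kib_per_s_to_kbps(67.5)              # first attempt, via the Japan node
slow_hours = transfer_hours(FILE_BYTES, 67.5)    # if that rate had held for the whole file
fast_mbps = average_mbps(FILE_BYTES, 3 * 60 + 45)  # second attempt: 3m 45s

print("slow: %.0f kbps, about %.1f hours for the full file" % (slow_kbps, slow_hours))
print("fast: %.1f Mbps" % fast_mbps)
print("speed-up: about %.0fx" % (fast_mbps * 1000 / slow_kbps))
```

So 67.5K/s works out to a bit over 512 Kbps, and 669M in 3m 45s is a little under 25 Mbps on average, which matches the rounded figures above.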