05 Apr

Tata – Airtel domestic peering IRR filtering and OpenDNS latency!

Last month I noticed quite high latency to Cisco’s OpenDNS from my home fibre connection. My provider at home is IAXN (AS134316), which peers with content networks in Delhi and buys transit from Airtel.

ping -c 5 208.67.222.222
PING 208.67.222.222 (208.67.222.222) 56(84) bytes of data.
64 bytes from 208.67.222.222: icmp_seq=1 ttl=51 time=103 ms
64 bytes from 208.67.222.222: icmp_seq=2 ttl=51 time=103 ms
64 bytes from 208.67.222.222: icmp_seq=3 ttl=51 time=103 ms
64 bytes from 208.67.222.222: icmp_seq=4 ttl=51 time=103 ms
64 bytes from 208.67.222.222: icmp_seq=5 ttl=51 time=103 ms


--- 208.67.222.222 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 103.377/103.593/103.992/0.418 ms

This is a bit on the higher side for Haryana to Mumbai (OpenDNS locations list here). My ISP backhauls from Faridabad, which is probably 6-8ms away from my city, with 2-3ms further to Delhi and around 30ms from there to Mumbai. Thus latency should be around ~40-45ms.
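
To make the arithmetic explicit, here is a minimal Python sketch that adds up the per-segment guesses and compares them with the measured average. The segment figures are the rough estimates from the paragraph above, not measurements, and the segment labels are only illustrative.

```python
# Rough round-trip contributions in ms, as (low, high) guesses from the text above.
SEGMENTS_MS = {
    "home city -> Faridabad": (6, 8),
    "Faridabad -> Delhi": (2, 3),
    "Delhi -> Mumbai": (30, 30),  # single estimate, so low == high
}

OBSERVED_AVG_MS = 103.593  # avg from the ping summary above


def expected_rtt(segments):
    """Sum the low and high guesses of every segment."""
    low = sum(lo for lo, _ in segments.values())
    high = sum(hi for _, hi in segments.values())
    return low, high


low, high = expected_rtt(SEGMENTS_MS)
excess = OBSERVED_AVG_MS - high
print(f"expected RTT: {low}-{high} ms, observed: {OBSERVED_AVG_MS} ms")
print(f"unexplained: at least {excess:.1f} ms")
```

The segments add up to 38-41 ms, so the ~40-45 ms target leaves a little headroom for router hops and queuing.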

Here’s how the forward trace looked:

traceroute 208.67.222.222
traceroute to 208.67.222.222 (208.67.222.222), 30 hops max, 60 byte packets
 1  172.16.0.1 (172.16.0.1)  0.730 ms  0.692 ms  0.809 ms
 2  axntech-dynamic-218.140.201.103.axntechnologies.in (103.201.140.218)  4.904 ms  4.314 ms  4.731 ms
 3  10.10.26.1 (10.10.26.1)  6.000 ms  6.414 ms  6.326 ms
 4  10.10.26.9 (10.10.26.9)  6.836 ms  7.135 ms  7.047 ms
 5  nsg-static-77.249.75.182-airtel.com (182.75.249.77)  9.344 ms  9.416 ms  9.330 ms
 6  182.79.243.201 (182.79.243.201)  62.274 ms 182.79.177.69 (182.79.177.69)  66.874 ms 182.79.239.193 (182.79.239.193)  61.297 ms
 7  121.240.1.201 (121.240.1.201)  85.789 ms  82.250 ms  79.591 ms
 8  172.25.81.134 (172.25.81.134)  110.049 ms 172.31.29.245 (172.31.29.245)  114.350 ms  113.673 ms
 9  172.31.133.210 (172.31.133.210)  112.598 ms 172.19.138.86 (172.19.138.86)  114.889 ms 172.31.133.210 (172.31.133.210)  113.415 ms
10  115.110.234.50.static.mumbai.vsnl.net.in (115.110.234.50)  125.770 ms  125.056 ms  123.779 ms
11  resolver1.opendns.com (208.67.222.222)  113.648 ms  115.044 ms  106.066 ms

The forward trace looks fine, except that latency jumps as soon as we hit the Tata AS4755 backbone. OpenDNS connects with Tata AS4755 inside India and announces its anycast prefixes to them. If the forward trace is logically correct but shows high latency, that often points to a bad return path. So I asked friends at OpenDNS to share the return path towards me. As expected, it was via Tata AS6453 in Singapore.

Here’s what the Tata AS4755 Mumbai router had for the IAXN prefix:

BGP routing table entry for 14.102.188.0/22 
Paths: (1 available, best #1, table Default-IP-Routing-Table) 
Not advertised to any peer 
6453 9498 134316 134316 134316 134316 134316 134316 134316 134316 134316 134316 
192.168.203.194 from 192.168.199.193 (192.168.203.194) 
Origin IGP, localpref 62, valid, internal, best 
Community: 4755:44 4755:97 4755:888 4755:2000 4755:3000 4755:47552 6453:50 6453:3000 6453:3400 6453:3402 
Originator: 192.168.203.194, Cluster list: 192.168.199.193 192.168.194.15 
Last update: Mon Mar 25 15:26:36 2019

Thus what was happening is this:

Forward path: IAXN (AS134316) > Airtel (AS9498) > Tata (AS4755) > OpenDNS (AS36692)

Return path: OpenDNS (AS36692) > Tata (AS4755) > Tata (AS6453) > Airtel (AS9498) > IAXN (AS134316)
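
The return path can be read straight off the AS path in the BGP entry above. As a small sketch (the AS names in the dictionary are just shorthand for the networks named in this post), this collapses the repeated ASNs so the actual sequence of networks stands out:

```python
from itertools import groupby

# AS path copied from the BGP table entry above (as seen by the AS4755 router).
AS_PATH = "6453 9498 134316 134316 134316 134316 134316 134316 134316 134316 134316 134316"

# Shorthand labels for the networks named in this post.
AS_NAMES = {4755: "Tata (India)", 6453: "Tata (global)", 9498: "Airtel", 134316: "IAXN"}


def collapse_prepends(as_path):
    """Merge consecutive repeats of an ASN into (asn, repeat_count) pairs."""
    return [(int(asn), len(list(run))) for asn, run in groupby(as_path.split())]


hops = collapse_prepends(AS_PATH)
for asn, count in hops:
    note = f" (appears {count} times in a row)" if count > 1 else ""
    print(f"AS{asn} {AS_NAMES.get(asn, 'unknown')}{note}")
```

The heavy prepending by AS134316 makes the path long, but it is the detour through AS6453 that explains the latency.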

While this may look like a Tata – Airtel routing issue, it wasn’t. I could see some prefixes with a direct path as well. Here’s a trace from the Tata AS4755 Mumbai PoP to an IP from a different IAXN pool:

traceroute to 103.87.46.1 (103.87.46.1), 15 hops max, 60 byte packets
1 * * *
2 172.31.170.210 (172.31.170.210) 0.911 ms 0.968 ms 0.643 ms
3 172.23.78.233 (172.23.78.233) 1.233 ms 0.821 ms 0.810 ms
4 172.17.125.249 (172.17.125.249) 23.540 ms 23.454 ms 23.367 ms
5 115.110.232.174.static.Delhi.vsnl.net.in (115.110.232.174) 49.175 ms 48.832 ms 49.107 ms
6 182.79.153.87 (182.79.153.87) 48.777 ms 182.79.153.83 (182.79.153.83) 49.043 ms 182.79.177.127 (182.79.177.127) 54.879 ms
7 103.87.46.1 (103.87.46.1) 60.865 ms 60.540 ms 60.644 ms

This was clearly fine. So why was Tata treating 103.87.46.0/24 differently from 14.102.188.0/22? The reason lies in the following:

  • Airtel (AS9498) very likely peers with Tata (AS4755). They certainly interconnect, as we see in the traceroutes, and my understanding is that it’s settlement-free peering for Indian traffic.
  • Airtel (AS9498) buys IP transit from Tata (AS6453) (besides a few others). Tata AS6453 carries Airtel’s routing announcements to other networks in the transit-free zone, which confirms that Airtel (at least technically) is a downstream customer here.
  • Tata (AS4755) has IRR-based filters on peering, but Tata (AS6453) does not have them for its downstreams. Hence, while Tata rejected the route in India, it accepted it at the Singapore PoP.
  • My IP was from prefix 14.102.188.0/22, and there was no valid route object for it in any of the key IRRs like ALTDB, APNIC or RADB. The other prefix, 103.87.46.0/24, did have a valid route object in APNIC.
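
The effect of an IRR-based filter can be sketched in a few lines of Python. This is a simplified illustration, not Tata's actual policy: it assumes exact prefix-and-origin matching, and the route-object set holds only the one entry mentioned above, with the origin assumed to be IAXN's ASN. A real filter would be generated from IRR data.

```python
import ipaddress

# (prefix, origin ASN) pairs standing in for IRR route objects. Only the entry
# the post mentions is listed; the origin ASN is assumed to be IAXN's.
ROUTE_OBJECTS = {
    (ipaddress.ip_network("103.87.46.0/24"), 134316),
}


def irr_accepts(prefix, origin_asn, route_objects=ROUTE_OBJECTS):
    """Accept an announcement only if a route object matches prefix and origin exactly."""
    return (ipaddress.ip_network(prefix), origin_asn) in route_objects


for prefix in ("103.87.46.0/24", "14.102.188.0/22"):
    verdict = "accept" if irr_accepts(prefix, 134316) else "reject"
    print(f"{prefix}: {verdict}")
```

With the filter on the peering session rejecting 14.102.188.0/22, the only remaining route for it is the unfiltered one learnt via the transit relationship.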

Now, after almost 10 days, my ISP has changed the BGP announcement and is announcing 14.102.189.0/24 (which does have a valid route object on APNIC). This fixes the routing problem and gives me pretty decent latency to OpenDNS:

ping -c 5 208.67.222.222
PING 208.67.222.222 (208.67.222.222): 56 data bytes
64 bytes from 208.67.222.222: icmp_seq=0 ttl=55 time=52.552 ms
64 bytes from 208.67.222.222: icmp_seq=1 ttl=55 time=53.835 ms
64 bytes from 208.67.222.222: icmp_seq=2 ttl=55 time=53.330 ms
64 bytes from 208.67.222.222: icmp_seq=3 ttl=55 time=52.700 ms
64 bytes from 208.67.222.222: icmp_seq=4 ttl=55 time=52.504 ms

--- 208.67.222.222 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 52.504/52.984/53.835/0.518 ms

So if you are a network operator originating prefixes, please do document them in one of the IRRs. You can do that via the IRR of your RIR (APNIC, ARIN etc.) or a free IRR like ALTDB. If you have downstreams, make sure to create an AS-SET, add your downstreams’ ASNs to it and also list that AS-SET on PeeringDB for the world to see!
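
For anyone who has not seen one, a route object is just a short block of RPSL text. The sketch below renders a minimal one; the maintainer name and description are placeholders invented for illustration, and individual registries may require additional attributes, so treat it as a rough shape rather than something to submit as-is.

```python
def route_object(prefix, origin_asn, maintainer, source, descr):
    """Render a minimal RPSL route object as text."""
    fields = [
        ("route", prefix),
        ("descr", descr),
        ("origin", f"AS{origin_asn}"),
        ("mnt-by", maintainer),
        ("source", source),
    ]
    lines = []
    for key, value in fields:
        label = key + ":"
        lines.append(f"{label:<12}{value}")
    return "\n".join(lines)


# MAINT-EXAMPLE and the description are placeholders, not real registry data.
text = route_object("14.102.189.0/24", 134316, "MAINT-EXAMPLE", "APNIC", "Example route object")
print(text)
```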

Misc Notes

  • Posted strictly in my personal capacity; this has nothing to do with my employer.
  • Thanks to the folks at Cisco/OpenDNS for quick replies with relevant data, which helped in troubleshooting. 🙂
01 Mar

Encrypted DNS using DNSCrypt

Writing this post from my hotel room in Kathmandu. I found that many servers here appear to act as DNS resolvers, which is unusual.

E.g:

dig @anuragbhatia.com . ns +short
a.root-servers.net.
b.root-servers.net.
c.root-servers.net.
d.root-servers.net.
e.root-servers.net.
f.root-servers.net.
g.root-servers.net.
h.root-servers.net.
i.root-servers.net.
j.root-servers.net.
k.root-servers.net.
l.root-servers.net.
m.root-servers.net.

dig @google.com . ns +short
b.root-servers.net.
c.root-servers.net.
d.root-servers.net.
e.root-servers.net.
f.root-servers.net.
g.root-servers.net.
h.root-servers.net.
i.root-servers.net.
j.root-servers.net.
k.root-servers.net.
l.root-servers.net.
m.root-servers.net.
a.root-servers.net.

 

This is unusual and is basically the result of a port 53 DNS hijack. Let’s verify it using the popular “whoami.akamai.net” query.

dig @8.8.8.8 whoami.akamai.net a +short
202.79.32.164

dig @9.9.9.9 whoami.akamai.net a +short
202.79.32.164

dig @1.2.3.4 whoami.akamai.net a +short
202.79.32.164

So clearly something in the middle is hijacking DNS queries: no matter which DNS resolver I try to use, the queries actually hit the authoritative DNS via 202.79.32.164. This belongs to WorldLink Communications (an ISP here in Nepal) and I am just 5 hops away from it.
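
The reasoning behind this check can be written down as a tiny Python sketch. It works on the answers already collected above rather than sending queries itself, and it assumes that 1.2.3.4 is not expected to run a resolver at all, which is why any answer from it is suspicious.

```python
# Egress IPs reported by whoami.akamai.net, keyed by the resolver that was asked
# (values copied from the dig output above).
WHOAMI_ANSWERS = {
    "8.8.8.8": "202.79.32.164",
    "9.9.9.9": "202.79.32.164",
    "1.2.3.4": "202.79.32.164",
}


def interception_hints(answers, not_resolvers=("1.2.3.4",)):
    """Return human-readable reasons to suspect that port 53 is being intercepted."""
    hints = []
    # Independent public resolvers query authoritative servers from their own IPs.
    if len(answers) > 1 and len(set(answers.values())) == 1:
        hints.append("every resolver reported the same egress IP")
    # Assumption: these targets should not answer DNS queries at all.
    for target in not_resolvers:
        if target in answers:
            hints.append(f"{target} answered, although no resolver is expected there")
    return hints


for hint in interception_hints(WHOAMI_ANSWERS):
    print("suspicious:", hint)
```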

 

So what can be done about these cases? Well, one way is a VPN of course, but with a setup where the VPN server’s IP address is hardcoded in the client rather than looked up via DNS. It works and does the job, but performance can vary greatly depending on how far away the tunnel server is. A better and more modern way out is to encrypt DNS using a protocol named “DNSCrypt”. DNSCrypt encrypts DNS queries between the client and the DNS resolver. (Beyond that, the resolver still follows the usual non-encrypted chain from the root to reach the authoritative DNS servers.)

 

So how does it work?

There’s no integrated support for DNSCrypt in operating systems at this time. There are a number of projects like dnscrypt-osxclient available on GitHub which add this support. Once configured, the client changes the system’s DNS resolver to a local IP which listens for port 53 (regular/non-encrypted) requests.

cat /etc/resolv.conf |grep nameserver
nameserver 127.0.0.54

The client usually supports various open resolvers like OpenDNS, Quad9 etc.

dig @127.0.0.54 whoami.akamai.net a +short
67.215.80.66

 

 

This shows that the DNS resolver in my case happens to be Cisco’s OpenDNS. As soon as the client gets a port 53 DNS query, it encrypts it and sends it to the resolver on port 443 (UDP or TCP depending on provider and client configuration). The encryption is based on the resolver’s public key, which the client has in its configuration and uses to verify the short-lived certificate the resolver presents, rather than on the trusted root CAs used in HTTPS. DNSCrypt shares port 443 with HTTPS, but it is a separate protocol from DNS over HTTPS.

 

Here’s an example of a DNS query to resolve the A record of google.com while running tcpdump in parallel:

sudo tcpdump -i lo0 'dst port 53' -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
04:36:04.429212 IP 127.0.0.54.50966 > 127.0.0.54.53: 31576+ A? prd.col.aria.browser.skypedata.akadns.net. (59)
04:36:04.532015 IP 127.0.0.54.54914 > 127.0.0.54.53: 623+ [1au] A? google.com. (39)
^C
2 packets captured
4 packets received by filter
0 packets dropped by kernel

This shows the request went in clear text to 127.0.0.54, which is configured on loopback. If I watch for traffic towards the OpenDNS public IPs in parallel, I get:

sudo tcpdump -i en0 'dst 208.67.220.220 or dst 208.67.222.222' -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes
04:39:56.827824 IP 192.168.0.4.53763 > 208.67.220.220.443: UDP, length 512
^C
1 packet captured
63 packets received by filter
0 packets dropped by kernel

Thus all that appears here is just an encrypted packet to Cisco OpenDNS over UDP port 443.
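
To compare the two captures side by side, here is a small Python sketch that pulls the destination out of tcpdump lines like the ones above and labels them by port. The regular expression only covers the IPv4 "IP src.port > dst.port:" format shown here, and the port 443 label holds for this particular setup only.

```python
import re

# Matches "IP <src>.<sport> > <dst>.<dport>:" as printed by tcpdump -n for IPv4.
LINE_RE = re.compile(r"IP (\S+)\.(\d+) > (\S+)\.(\d+):")

CAPTURED = [
    "04:36:04.532015 IP 127.0.0.54.54914 > 127.0.0.54.53: 623+ [1au] A? google.com. (39)",
    "04:39:56.827824 IP 192.168.0.4.53763 > 208.67.220.220.443: UDP, length 512",
]


def classify(line):
    """Return (dst_ip, dst_port, label) for a tcpdump line, or None if it does not parse."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    dst_ip, dst_port = match.group(3), int(match.group(4))
    if dst_port == 53:
        label = "cleartext DNS"
    elif dst_port == 443:
        label = "encrypted DNSCrypt traffic (in this setup)"
    else:
        label = "other"
    return dst_ip, dst_port, label


for line in CAPTURED:
    print(classify(line))
```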

I ran another query and saved it in a pcap file. Here’s how it looks in Wireshark:

 

 

 

That’s all for now. I am going to keep encryption enabled from now onwards, especially when travelling. Time to get some sleep. 🙂

 

Useful Links:

  1. dnscrypt-osxclient – https://github.com/alterstep/dnscrypt-osxclient
  2. DNSCrypt Wikipedia – https://en.wikipedia.org/wiki/DNSCrypt
  3. DNS Over HTTPS (Google Public DNS) – https://developers.google.com/speed/public-dns/docs/dns-over-https
  4. DNS over TLS (Quad9) – https://quad9.net/faq/#Does_Quad9_support_DNS_over_TLS
28 Oct

Akamai CDN and DNS resolution analysis

These days open DNS resolvers are getting quite popular. By open DNS resolvers I mean public resolvers such as OpenDNS and Google Public DNS.

One of the major issues these resolvers suffer from is poor integration with CDN providers like Akamai, Limelight etc. In this post I will analyse a sample Akamai customer site, the Malaysia Airlines website: http://www.malaysiaairlines.com.

 

Looking at OpenDNS, Google Public DNS and my ISP (BSNL’s) DNS resolver for its DNS records:

OpenDNS 

;; QUESTION SECTION:
;www.malaysiaairlines.com. IN A

;; ANSWER SECTION:
www.malaysiaairlines.com. 12169 IN CNAME www.malaysiaairlines.com.edgesuite.net.
www.malaysiaairlines.com.edgesuite.net. 12169 IN CNAME a1456.b.akamai.net.
a1456.b.akamai.net. 20 IN A 125.252.225.158
a1456.b.akamai.net. 20 IN A 125.252.225.151

 

Google Public DNS

;; QUESTION SECTION:
;www.malaysiaairlines.com. IN A

;; ANSWER SECTION:
www.malaysiaairlines.com. 12312 IN CNAME www.malaysiaairlines.com.edgesuite.net.
www.malaysiaairlines.com.edgesuite.net. 12318 IN CNAME a1456.b.akamai.net.
a1456.b.akamai.net. 10 IN A 58.27.22.154
a1456.b.akamai.net. 10 IN A 58.27.22.138

 

BSNL’s DNS resolver

;; QUESTION SECTION:
;www.malaysiaairlines.com. IN A

;; ANSWER SECTION:
www.malaysiaairlines.com. 20410 IN CNAME www.malaysiaairlines.com.edgesuite.net.
www.malaysiaairlines.com.edgesuite.net. 20410 IN CNAME a1456.b.akamai.net.
a1456.b.akamai.net. 20 IN A 117.239.141.35
a1456.b.akamai.net. 20 IN A 117.239.141.10

 

Notice the different IPs returned by different DNS resolvers.

OpenDNS gives me 125.252.225.151, which is announced by Singtel in Singapore.
Google gives me 58.27.22.154, which is announced by Tmnet in Malaysia.
BSNL’s DNS resolver gives me 117.239.141.35, which is announced by BSNL-NIB itself and is within India (yay!) 🙂

This results in a latency of 300ms to www.malaysiaairlines.com when using OpenDNS or Google, against 60ms when using the ISP’s default resolver.

 

How and why is this happening?

The answer lies in the underlying DNS layer, which is doing this magic. In all cases www.malaysiaairlines.com. is a CNAME (alias record) to www.malaysiaairlines.com.edgesuite.net. Further, www.malaysiaairlines.com.edgesuite.net. is a CNAME to a1456.b.akamai.net. The real magic comes here: “b.akamai.net.” is itself a DNS zone. Let’s look at this zone from all 3 DNS resolvers:

 

anurag@laptop:/$ dig b.akamai.net. ns +short @208.67.222.222
n6b.akamai.net.
n7b.akamai.net.
n1b.akamai.net.
n2b.akamai.net.
n4b.akamai.net.
n3b.akamai.net.
n5b.akamai.net.
n0b.akamai.net.

anurag@laptop:/$ dig b.akamai.net. ns +short @8.8.8.8
n1b.akamai.net.
n4b.akamai.net.
n8b.akamai.net.
n3b.akamai.net.
n2b.akamai.net.
n6b.akamai.net.
n5b.akamai.net.
n0b.akamai.net.
n7b.akamai.net.

anurag@laptop:/$ dig b.akamai.net. ns +short @10.0.0.1
n0b.akamai.net.
n1b.akamai.net.
n2b.akamai.net.
n3b.akamai.net.
n4b.akamai.net.
n5b.akamai.net.
n6b.akamai.net.
n7b.akamai.net.
n8b.akamai.net.

 

The same set of names (apart from n8b missing in the OpenDNS answer), only in a different order. Let’s pick one at random and analyse it:

n0b.akamai.net

 

anurag@laptop:/$ dig n0b.akamai.net a @208.67.222.222 +short
124.155.223.36

anurag@laptop:/$ dig n0b.akamai.net a @8.8.8.8 +short
202.175.5.150

anurag@laptop:/$ dig n0b.akamai.net a @10.0.0.1 +short
124.124.201.156

 

All different IPs!
At this stage everything seems very confusing.
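
Since dig returns the NS records in a different order each time, it helps to compare them as sets. The sketch below does that for the answers above and also counts the distinct A records seen for n0b.akamai.net. The data is copied from the output above, with the name-server names shortened to their first label.

```python
# First labels of the NS names returned for b.akamai.net, keyed by resolver.
NS_ANSWERS = {
    "208.67.222.222": "n6b n7b n1b n2b n4b n3b n5b n0b",
    "8.8.8.8": "n1b n4b n8b n3b n2b n6b n5b n0b n7b",
    "10.0.0.1": "n0b n1b n2b n3b n4b n5b n6b n7b n8b",
}

# A record returned for n0b.akamai.net, keyed by resolver.
A_ANSWERS = {
    "208.67.222.222": "124.155.223.36",
    "8.8.8.8": "202.175.5.150",
    "10.0.0.1": "124.124.201.156",
}


def name_set(labels):
    """Expand short labels into fully qualified names, ignoring order."""
    return frozenset(f"{label}.akamai.net." for label in labels.split())


ns_sets = {resolver: name_set(labels) for resolver, labels in NS_ANSWERS.items()}
all_names = frozenset().union(*ns_sets.values())
missing = {resolver: sorted(all_names - names) for resolver, names in ns_sets.items()}
distinct_a = set(A_ANSWERS.values())

print("names missing per resolver:", missing)
print("distinct A records for n0b.akamai.net:", len(distinct_a))
```

Apart from the OpenDNS answer lacking n8b, the name-server sets agree; it is the A records behind those names that differ for every resolver.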

 

Let’s recap what we have so far

www.malaysiaairlines.com. is a CNAME to www.malaysiaairlines.com.edgesuite.net., and www.malaysiaairlines.com.edgesuite.net. is a CNAME to a1456.b.akamai.net. Now a1456.b.akamai.net. is a hostname under the DNS zone “b.akamai.net”, and it resolves to different IPs when checked from different DNS resolvers. The b.akamai.net zone has several DNS servers and I picked one of them at random: n0b.akamai.net. We see that n0b.akamai.net itself has different A records depending on the resolver, so I am going back to the parent zone, akamai.net, to find out how this is happening.
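
The chain described above can be walked with a few lines of Python. This is an offline sketch over a hand-built record table (the A records are the ones BSNL's resolver returned), not a real resolver, and the depth limit is an arbitrary safety value.

```python
# Records as seen via BSNL's resolver in the output above: name -> (type, value).
RECORDS = {
    "www.malaysiaairlines.com.": ("CNAME", "www.malaysiaairlines.com.edgesuite.net."),
    "www.malaysiaairlines.com.edgesuite.net.": ("CNAME", "a1456.b.akamai.net."),
    "a1456.b.akamai.net.": ("A", ["117.239.141.35", "117.239.141.10"]),
}


def resolve(name, records, max_depth=8):
    """Follow CNAMEs until an A record is found; return (chain, addresses)."""
    chain = [name]
    for _ in range(max_depth):
        rtype, value = records[name]
        if rtype == "A":
            return chain, value
        name = value
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


chain, addresses = resolve("www.malaysiaairlines.com.", RECORDS)
print(" -> ".join(chain))
print(addresses)
```

Only the last step, the A records under b.akamai.net, changes with the resolver; the two CNAMEs are the same everywhere.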

 

Let’s see DNS servers of akamai.net:

To avoid further confusion from these interesting DNS lookups, let’s use the whois record of the akamai.net domain, rather than a DNS query, to see which authoritative DNS servers it is using:

anurag@laptop:~$ whois akamai.net

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

Domain Name: AKAMAI.NET
Registrar: TUCOWS.COM CO.
Whois Server: whois.tucows.com
Referral URL: http://domainhelp.opensrs.net
Name Server: NS1-1.AKAMAITECH.NET
Name Server: NS2-193.AKAMAITECH.NET
Name Server: NS3-193.AKAMAITECH.NET
Name Server: NS4-193.AKAMAITECH.NET
Name Server: NS5-193.AKAMAITECH.NET
Name Server: NS6-193.AKAMAITECH.NET
Name Server: NS7-193.AKAMAITECH.NET
Name Server: ZC.AKAMAITECH.NET
Name Server: ZD.AKAMAITECH.NET
Name Server: ZE.AKAMAITECH.NET
Name Server: ZG.AKAMAITECH.NET
Name Server: ZH.AKAMAITECH.NET
Name Server: ZI.AKAMAITECH.NET
Status: clientTransferProhibited
Status: clientUpdateProhibited
Updated Date: 18-jun-2012
Creation Date: 03-mar-1999
Expiration Date: 03-mar-2022

>>> Last update of whois database: Sun, 28 Oct 2012 16:56:03 UTC <<<

 

Now again let’s pick one at random, NS1-1.AKAMAITECH.NET, and see what it tells us for the hostname “n0b.akamai.net”:

 

anurag@laptop:~$ dig @NS1-1.AKAMAITECH.NET n0b.akamai.net +short
123.201.147.5

 

 

Wow! Akamai’s DNS setup can make a boring Sunday evening very interesting. 😉

 

Now, since NS1-1.AKAMAITECH.NET. is itself on a different domain name (and so a different DNS zone), let’s put in a bit more effort to get to the core of it. NS1-1.AKAMAITECH.NET. is simply an A record on the DNS servers of the AKAMAITECH.NET. zone.

 

Let’s look at that zone now:

anurag@laptop:/$ dig AKAMAITECH.NET ns +short
zh.AKAMAITECH.NET.
ns3-193.AKAMAITECH.NET.
ns2-193.AKAMAITECH.NET.
zm-1.AKAMAITECH.NET.
zg.AKAMAITECH.NET.
zb.AKAMAITECH.NET.
ze.AKAMAITECH.NET.
zf.AKAMAITECH.NET.
ns5-193.AKAMAITECH.NET.
zd.AKAMAITECH.NET.
zi.AKAMAITECH.NET.
ns4-193.AKAMAITECH.NET.
za.AKAMAITECH.NET.
zc.AKAMAITECH.NET.

 

Again, let’s pick one, zh.AKAMAITECH.NET., and query it for NS1-1.AKAMAITECH.NET.

anurag@laptop:/$ dig NS1-1.AKAMAITECH.NET. @zh.AKAMAITECH.NET.  +short
193.108.88.1

Finally, a consistent result (YAY!). So is the server with IP 193.108.88.1 playing games? Remember that in the second-last step this server (NS1-1.AKAMAITECH.NET) gave yet another IP for the hostname n0b.akamai.net. I SMELL ANYCASTING! 🙂

Let’s do a traceroute to 193.108.88.1 from my location (BSNL Haryana), an Airtel Delhi node & my Europe server (where this blog is hosted!):

 

BSNL

traceroute to 193.108.88.1 (193.108.88.1), 30 hops max, 60 byte packets
1 10.0.0.1 (10.0.0.1) [AS1] 0.644 ms 1.022 ms 1.150 ms
2 117.220.160.1 (117.220.160.1) [AS9829] 19.467 ms 20.335 ms 21.824 ms
3 218.248.169.122 (218.248.169.122) [AS9829] 27.180 ms 29.092 ms 30.510 ms
4 115.254.1.138 (115.254.1.138) [AS18101] 61.354 ms 63.244 ms 64.209 ms
5 115.255.239.53 (115.255.239.53) [AS18101] 68.160 ms 68.907 ms 69.847 ms
6 115.248.226.21 (115.248.226.21) [AS18101] 72.336 ms 54.497 ms 54.633 ms
7 203.101.100.213 (203.101.100.213) [AS9498/AS7617] 80.766 ms 82.390 ms 83.732 ms
8 AES-Static-010.194.22.125.airtel.in (125.22.194.10) [AS24560/AS9498] 87.199 ms 88.580 ms 90.314 ms
9 * * *
10 * * *

 

Europe server

traceroute to 193.108.88.1 (193.108.88.1), 30 hops max, 60 byte packets
1 gw.giga-dns.com (91.194.90.1) [AS51167] 0.639 ms 0.637 ms 0.623 ms
2 host-93-104-204-33.customer.m-online.net (93.104.204.33) [AS8767] 0.600 ms 0.592 ms 0.585 ms
3 xe-1-1-0.rt-decix-2.m-online.net (82.135.16.102) [AS8767] 7.784 ms 7.740 ms 7.727 ms
4 xe-1-1-0.rt-decix-2.m-online.net (82.135.16.102) [AS8767] 7.464 ms 7.461 ms 7.452 ms
5 decix-fra6.netarch.akamai.com (80.81.192.28) [AS6695] 8.434 ms 8.916 ms 8.407 ms
6 * * *
7 * * *
8 * * *

 

Here we go! Surely anycasting. 193.108.88.1 comes from the prefix 193.108.88.0/24, which Akamai AS21342 announces at different locations.
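
The comparison made by eye above can be sketched in Python: take the last hop that still answered in each trace and look at its AS annotation. Both traces end in "* * *" before reaching the target, so this is only a hint towards anycast, not proof. Only the last few hops of each trace are reproduced here.

```python
import ipaddress
import re

ANYCAST_PREFIX = ipaddress.ip_network("193.108.88.0/24")
TARGET = ipaddress.ip_address("193.108.88.1")

# Last hops of the two traces above.
BSNL_TRACE = """\
7 203.101.100.213 (203.101.100.213) [AS9498/AS7617] 80.766 ms 82.390 ms 83.732 ms
8 AES-Static-010.194.22.125.airtel.in (125.22.194.10) [AS24560/AS9498] 87.199 ms 88.580 ms 90.314 ms
9 * * *
"""
EUROPE_TRACE = """\
4 xe-1-1-0.rt-decix-2.m-online.net (82.135.16.102) [AS8767] 7.464 ms 7.461 ms 7.452 ms
5 decix-fra6.netarch.akamai.com (80.81.192.28) [AS6695] 8.434 ms 8.916 ms 8.407 ms
6 * * *
"""


def last_visible_asns(trace):
    """Return the ASNs annotated on the last hop that carries an [AS...] tag."""
    last = []
    for line in trace.splitlines():
        bracket = re.search(r"\[(AS[^\]]+)\]", line)
        if bracket:
            last = [int(asn) for asn in re.findall(r"AS(\d+)", bracket.group(1))]
    return last


print("target inside prefix:", TARGET in ANYCAST_PREFIX)
print("BSNL trace ends near:", last_visible_asns(BSNL_TRACE))
print("Europe trace ends near:", last_visible_asns(EUROPE_TRACE))
```

The same destination IP is approached through an Airtel hop in India from one vantage point and through a hop at DE-CIX Frankfurt, about 8 ms away, from the other.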

 

Summary:

Let’s go in forward mode now:

Akamai has an interesting DNS setup: a mix of anycasted DNS servers and “edge” DNS servers that carry different A records for a given hostname. At the core, Akamai has a set of anycasted DNS servers like zh.AKAMAITECH.NET which hold the A records for another set of DNS servers like NS1-1.AKAMAITECH.NET., which act as DNS servers for the akamai.net domain. Next, these DNS servers hold different values for yet another set of DNS servers like n0b.akamai.net, which hold the delegation for a subzone like b.akamai.net, which holds hostnames like a1456.b.akamai.net, to which hostnames like www.malaysiaairlines.com.edgesuite.net. point! 🙂

 

Why does Akamai have such a complex setup?

My strong guess is that the multiple zones and cross-dependencies are simply there to spread load and avoid a single point of failure. The important thing is that Akamai uses anycasting at the core of its DNS, but there’s no anycasting for serving content from its web servers. E.g. I am getting the IP 117.239.141.10 for Akamai’s client site, which is a unicast IP from BSNL’s 117.239.128.0/20 prefix announcement. Akamai is NOT using anycasting for edge distribution, and my strong guess is that it’s far easier for Akamai to manage things the current way than to put caching servers on anycast IPs. E.g. if the Akamai node on BSNL is choked, they can simply redistribute traffic by having the DNS servers hand out the BSNL node’s A record 1 out of 4 times and the IP of the caching node on Airtel the rest of the time. With anycasting that is not possible: traffic simply follows the shortest AS/hop path, and partial distribution of load cannot be done. Again, that’s my guess. 🙂
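
The "1 out of 4 times" idea is easy to picture as a weighted rotation of DNS answers. The sketch below is only an illustration of that guess, not how Akamai actually does it; the Airtel node address is a made-up placeholder from a documentation range.

```python
from itertools import cycle, islice

BSNL_NODE = "117.239.141.10"  # edge IP seen earlier in this post
AIRTEL_NODE = "192.0.2.10"    # hypothetical placeholder, not a real Akamai node


def weighted_answers(weights):
    """Cycle through IPs so that each appears `weight` times per round."""
    rotation = [ip for ip, weight in weights for _ in range(weight)]
    return cycle(rotation)


# Hand out the BSNL node 1 out of 4 times and the Airtel node the rest of the time.
answers = list(islice(weighted_answers([(BSNL_NODE, 1), (AIRTEL_NODE, 3)]), 8))
for ip in answers:
    print(ip)
```

With anycast there is no such knob on the DNS side, since the network, not the DNS answer, decides which node a client reaches.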

Time for me to change DNS resolver in my router now!