23 Dec

Doomsday and the DNS resolution

Last month I did a short webinar with Indian ISPs talking about DNS servers in detail. The idea of the session was to make network engineers from fellow ISPs familiar with root DNS servers, the DNS hierarchy, anycast etc. As we went through the slides, it was clear from RIPE Atlas data that Indian networks are not reaching the local root DNS servers due to routing! (Data from RIPE Atlas here).

This may come as a surprise to policymakers, where there seem to be ongoing discussions around how India can have its own root DNS servers, even though we are not even hitting the existing local root DNS instances. Anyway, is having our own root DNS servers even possible?


First, about the current servers referred to as root DNS servers

  • Essentially 13 public-facing server identities, backed by thousands of anycast instances, act as “authoritative” for the root DNS zone.
  • More precisely, these are secondary/slave authoritative DNS nodes and are synced to master DNS servers hosted with the respective root DNS operators. Those 13 masters are in turn synced to a sort of hidden master cluster administered by ICANN.
  • The primary server (cluster) where entries are changed is hidden from the world for good technical reasons, like avoiding massive DDoS attacks or hacking attempts.
  • Thus root server operators are essentially running slave/secondary authoritative servers for the root DNS zone and ensure that the copy they receive is simply distributed to thousands of slave/secondary DNS nodes across the world (map here) using common IPs which are hardcoded in all DNS software (as the starting point for resolution).
  • The root DNS zone has been signed with DNSSEC since 2010, and thus the zone file carries “cryptographic signatures” that validate the authenticity of the data (a quick check follows below).
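
A quick way to see both points from any machine with dig (a hedged sanity check; the choice of k.root-servers.net is arbitrary and any letter works):

# Ask a root server for the root SOA with the DNSSEC OK bit set;
# the answer also carries RRSIG records, i.e. the cryptographic signatures
dig . SOA +dnssec @k.root-servers.net

# All 13 letters should return the same zone serial since they sync from the
# same hidden master; the serial is the third field of the short-form SOA
for letter in a b c d e f g h i j k l m; do
  echo -n "$letter: "
  dig +short . SOA @$letter.root-servers.net | awk '{print $3}'
done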

List of Root Servers with their IP & operators

HOSTNAME             IP ADDRESSES                        OPERATOR
a.root-servers.net   198.41.0.4, 2001:503:ba3e::2:30     Verisign, Inc.
b.root-servers.net   199.9.14.201, 2001:500:200::b       University of Southern California, Information Sciences Institute
c.root-servers.net   192.33.4.12, 2001:500:2::c          Cogent Communications
d.root-servers.net   199.7.91.13, 2001:500:2d::d         University of Maryland
e.root-servers.net   192.203.230.10, 2001:500:a8::e      NASA (Ames Research Center)
f.root-servers.net   192.5.5.241, 2001:500:2f::f         Internet Systems Consortium, Inc.
g.root-servers.net   192.112.36.4, 2001:500:12::d0d      US Department of Defense (NIC)
h.root-servers.net   198.97.190.53, 2001:500:1::53       US Army (Research Lab)
i.root-servers.net   192.36.148.17, 2001:7fe::53         Netnod
j.root-servers.net   192.58.128.30, 2001:503:c27::2:30   Verisign, Inc.
k.root-servers.net   193.0.14.129, 2001:7fd::1           RIPE NCC
l.root-servers.net   199.7.83.42, 2001:500:9f::42        ICANN
m.root-servers.net   202.12.27.33, 2001:dc3::35          WIDE Project

By looking at the list one can clearly say it’s primarily American organisations (mostly not-for-profit) running these, and that is likely what brings up the discussion of “Can we have our own root DNS server?”. That might sound like a legitimate question from the perspective of Indian policymakers, who have brought in great organisations like ISRO and UIDAI. But ignoring the 512-byte limit and assuming we could get a root DNS server managed by India, it still wouldn’t add any benefit.

Here’s why:

  1. As described above – the current set of root DNS servers are simply secondary/slave DNS nodes running in a hierarchical sync model from hidden masters. So if we add another node, it would ideally do the same task and won’t be any different from the other nodes. Taking the worst possible scenario – if during, say, a war or any other instability, someone decides to remove the reference to “in” from the root DNS zone, it will cause an impact across .in whether it comes from a root, b root or a hypothetical Indian root DNS server.
  2. If such a root DNS server runs as its own master cluster and starting point, then the issue becomes: how will Indian networks reach the rest of the domain names via the hierarchy? Indian DNS resolvers still need a way to resolve .com, .us, .co.uk, .anything. If we are talking about a totally different version of the internet for India, then that’s a different case altogether.
  3. Most mission-critical networks, like defence, run on their own separate backbone disconnected from the internet, and public-facing root DNS servers won’t have an impact on them.

But what about a possible doomsday scenario where networks in India are 100% isolated?

Assume there’s a major loss of submarine cable connectivity to just about everywhere. In such a scenario, will content hosted in India still work at the DNS layer? The answer: it depends. Let’s explore…

Let’s look at the SOA of the root DNS zone:

anurag@devops01 ~> dig . soa +short
a.root-servers.net. nstld.verisign-grs.com. 2020121401 1800 900 604800 86400
anurag@devops01 ~>

SOA – the Start of Authority record – carries a bunch of values: after the primary nameserver and the responsible party’s mailbox come the serial, refresh, retry, expire and minimum TTL. Let’s map the output we see above to these values as per BIND reference syntax.

Serial: 2020121401
Refresh: 1800
Retry: 900
Expire: 604800
Minimum TTL: 86400

Expire here defines the number of seconds after which a secondary stops serving the zone if it cannot reach the master to refresh it. So 604800 means 7 days. Effectively, after 7 days the internet as we know it will stop working in the event of a major failure where Indian networks are 100% isolated and there’s absolutely no connectivity for the root DNS server instances in India to reach their respective hidden masters and sync up the zone data. (SOA reference: Wikipedia). Can ISPs do something, in that case, to still make things work? Well, depending on the DNS software ISPs are using, they can inject hard references to TLDs including .com, .in etc. Which brings us to the next level of resolution: TLD resolution.

anurag@devops01 ~> dig com. soa +short
a.gtld-servers.net. nstld.verisign-grs.com. 1607981978 1800 900 604800 86400
anurag@devops01 ~>

For .com the expire is also 604800, i.e. a week. So .com domains will stop resolving after a week of such a mass outage.

anurag@devops01 ~> dig in. soa +short
ns1.neustar.in. hostmaster.neustar.in. 1607981693 1800 300 1814400 1800
anurag@devops01 ~>

For the .in ccTLD, that number happens to be 1814400 seconds / 21 days. Hence .in will keep working for up to 21 days of such a mass outage, and very likely even after that, since the .in ccTLD master would likely be in India, hosted with Neustar.
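
To pull these expire timers together in one go, here is a small sketch against the local resolver (field 6 of dig’s short-form SOA output is the expire value):

# How long do secondaries of ., com. and in. keep serving after losing their masters?
for zone in . com. in.; do
  echo -n "$zone expire: "
  dig +short "$zone" SOA | awk '{print $6/86400, "days"}'
done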

So, 7 days – that’s the limitation we live with. And remember, submarine cables are not the only possible connectivity. IP transit over satellite via any friendly country would keep the system working, as would simply asking Indian ISPs to put a static zone entry for .in pointing towards the .in authoritative servers hosted within India.
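
For BIND-based resolvers, such a static entry could look roughly like the sketch below. This is only an illustration: 192.0.2.1 is a placeholder documentation address and would need to be replaced with the actual .in authoritative server IPs, and other resolver software (Unbound, PowerDNS Recursor etc.) has its own equivalents.

zone "in" {
    // Resolve .in via explicitly configured servers instead of following
    // the (unreachable) delegation from the expired root zone copy
    type static-stub;
    server-addresses { 192.0.2.1; };   // placeholder - replace with real .in auth server IPs
};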

What is more concerning is that Indian networks are not hitting locally hosted root DNS instances. Primarily it’s because telcos do not peer with the root server instances in India except the ones at NIXI. And the ones at NIXI – K root (NIXI Noida), I root (NIXI Mumbai) and F root (NIXI Chennai) – do seem unstable. At the time of writing this post, my understanding is that only F root Chennai is up (and that too with some issues recently). K root is not announcing its IPv4 address in Noida (IPv6 only) and I root seems down at NIXI Mumbai.
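
If you want to check which anycast instance of a root server your network actually lands on, most operators expose the instance name via the standard CHAOS-class queries (a quick check; the exact answer format varies per operator):

# Which K root / I root instance does my path reach?
dig @k.root-servers.net hostname.bind CH TXT +short
dig @i.root-servers.net id.server CH TXT +short

# The "Query time" line of a normal query gives the round-trip time to that instance
dig @k.root-servers.net . NS | grep "Query time"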

17 Nov

Measuring latency to endpoints with blocked ICMP

And a blog post after a while. The last few months went by busy with RPKI. After my last post about RPKI and the fact that India was lagging a bit on the RPKI ROA front, we started a major push together with a set of like-minded folks. The Indian signed table has jumped from 12% in August to 32% now in October. Detailed graphs and other data can be found here on the public Grafana instance.

In absolute terms, India now has the highest number of signed prefixes in this region: 13972 Indian prefixes have a valid ROA, and the nearest to that is Taiwan at 6824. Though 13972 is just 32% of the Indian table, while 6824 is 91% of the Taiwanese table. So a long way to go for us.

If you are a network operator in India and reading this, consider joining our RPKI webinar, planned for 3 pm (IST) on 18th Nov 2020. You can register for the event here. Or buzz me to talk about RPKI!


Catching Covid-19

Besides the RPKI push, I also caught Covid-19 along with family members. Luckily for us, it went fine and wasn’t that painful. The impact was mild and everyone has recovered. Phew!
I hope readers of this blog post are well.

TraceroutePing in Smokeping

Coming to the topic of today’s blog post: I recently came across this excellent Smokeping plugin which solves a very interesting problem. There are often nodes we see in traceroute/MTR which are either not routed or simply block ICMP/TCP/UDP packets addressed to them. This can include routers with a pretty harsh firewall dropping everything addressed to them, as well as cases where an IX or any other non-routed IP shows up in the traceroute. It becomes tricky to measure latency to those. Someone used the simple idea of incremental TTLs, as used in traceroute, to get a “TTL time exceeded” reply from these middle nodes and, based on that, a way to plot latency.

Let’s look at a real-world case: one of the ISPs serving my home is IAXN AS134316, and they peer with my ex-employer’s network Spectra AS10029 at Extreme IX in Delhi. Let’s see how a traceroute to Spectra’s anycast DNS looks from my home.

traceroute -P icmp 180.151.151.151
traceroute to 180.151.151.151 (180.151.151.151), 64 hops max, 72 byte packets
 1  router01.rtk.anuragbhatia.com (172.16.0.1)  2.818 ms  1.876 ms  4.274 ms
 2  10.10.26.6 (10.10.26.6)  4.258 ms  4.301 ms  5.953 ms
 3  10.10.26.5 (10.10.26.5)  5.490 ms  5.916 ms  5.257 ms
 4  10.10.26.29 (10.10.26.29)  11.349 ms  9.246 ms  9.430 ms
 5  as10029.del.extreme-ix.net (45.120.248.51)  10.628 ms  8.802 ms  9.609 ms
 6  resolver1.anycast.spectranet.in (180.151.151.151)  8.446 ms  9.113 ms  10.699 ms

Now, hop 5 here is likely Spectra’s Delhi router’s interface, which has an Extreme IX IP – 45.120.248.51. Let’s see what we get when we ping it.

ping -c 5 45.120.248.51
PING 45.120.248.51 (45.120.248.51): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3

--- 45.120.248.51 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss

I cannot ping it. Let’s look at the trace to it to see where it drops.

traceroute -P icmp 45.120.248.51
traceroute to 45.120.248.51 (45.120.248.51), 64 hops max, 72 byte packets
 1  router01.rtk.anuragbhatia.com (172.16.0.1)  3.196 ms  1.790 ms  4.421 ms
 2  10.10.26.6 (10.10.26.6)  5.514 ms  3.624 ms  5.323 ms
 3  10.10.26.5 (10.10.26.5)  5.252 ms  4.043 ms  3.671 ms
 4  * * *
 5  * * *
 6  nsg-static-77.249.75.182-airtel.com (182.75.249.77)  14.221 ms  10.963 ms  11.574 ms
 7  116.119.68.58 (116.119.68.58)  146.531 ms  147.899 ms  146.065 ms
 8  * * *
 9  * * *
10  * * *

Now, this is an interesting and not very unexpected result. Basically, my ISP – IAXN AS134316 – does not have any route in its routing table for 45.120.248.51 and hence passes it along the default route towards its upstream, Airtel. BGP-wise, IAXN is not supposed to carry any route for the IX peering prefix anyway, and that’s expected. Likely the router which peers with Extreme IX is different from the router which serves me and is possibly not sharing connected routes via the IGP, hence the unexpected path. As soon as the traffic hits an Airtel router with a full routing table and no default route, it gets dropped.

In this setup, I cannot reach Spectra’s interface connected to Extreme IX (45.120.248.51) directly if I send packets to it. But I do know from the first trace that it sits in the middle of the path when I send packets to 180.151.151.151. That is where this plugin helps: packets are sent with an incremental TTL, the latency to that hop is measured, and it can even be graphed. The concept works even if private IPs appear before the destination.

So this goes into my Probes config:

+ TraceroutePing

binary = /usr/bin/traceroute # mandatory
binaryv6 = /usr/bin/traceroute6
forks = 5
offset = 50%
step = 300
timeout = 15

and this goes into my Targets config:

++SpectraExtremeIXInterface
probe = TraceroutePing
menu = Spectra via Extreme IX
title = Spectra via Extreme IX
host = 45.120.248.51
desthost = 180.151.151.151
maxttl = 15
minttl = 5
pings = 5
wait = 3

How does it work?

A reminder on how traceroute works!

Remember the concept of TTL in IP routing. TTL is time to live: basically, whenever a router forwards a packet, it decreases the TTL by 1, and when the TTL reaches 0, the router just drops the packet. This ensures loops aren’t as dangerous in layer 3 as they are in layer 2. Now, when a router drops a packet with TTL 0, it replies back to the source saying “TTL exceeded”, and the reply packet carries the router’s own source IP address. That way traceroute can send the 1st packet with TTL 1; the 1st router in the chain gets it, reduces the TTL by 1 and (now that the TTL is 0) drops it with a reply from its own IP. Next, another packet is sent with TTL 2, and so on.
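
You can reproduce by hand roughly what the probe does with the config above by pinning traceroute to a single hop. This is only a rough equivalent of the plugin’s behaviour, not its exact invocation; the minttl/maxttl settings play the same role as traceroute’s first and max TTL here:

# Probe only hop 5 towards the desthost; the reply comes from whichever router
# sits at that TTL - in my case Spectra's Extreme IX interface (45.120.248.51)
traceroute -f 5 -m 5 -q 5 180.151.151.151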

Note: Thanks to the networking folks from OVH Cloud who replied to me with this probe on Twitter. It wasn’t what I was looking for, but it is quite fascinating and useful!
Time to go back into the routing world! 🙂