20 Aug

Bangladesh .bd TLD outage on 18th August 2016

 


The day before yesterday, i.e. on 18th August 2016, Bangladesh’s TLD .bd had an outage. It was originally reported by Jasim Alam on the bdNOG mailing list.

 

His message shows that DNS resolution for BTCL (Bangladesh Telecommunications Company Ltd) was failing. Later Alok Das mentioned that a power problem had resulted in the outage.

Let’s ask one of the 13 root DNS servers for the NS records to see who holds the delegation for .bd.

So two out of these three seem to be on the BTCL network, and on the same /24 at that.
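As a rough way to check this kind of address diversity, here is a minimal Python sketch that groups nameserver addresses by their covering /24. The names and addresses are documentation placeholders (RFC 5737 ranges), not the real .bd nameserver addresses, and the function name is my own.

```python
import ipaddress
from collections import defaultdict

def group_by_slash24(addresses):
    """Map each covering /24 network to the nameservers inside it."""
    groups = defaultdict(list)
    for name, addr in addresses.items():
        # strict=False lets us pass a host address and get its /24 back
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        groups[str(net)].append(name)
    return dict(groups)

# Placeholder data only (RFC 5737 documentation ranges), not real .bd servers
nameservers = {
    "ns-a.example.": "192.0.2.10",
    "ns-b.example.": "192.0.2.53",
    "ns-c.example.": "198.51.100.7",
}

groups = group_by_slash24(nameservers)
for net, names in groups.items():
    print(net, names)
```

Any /24 that holds more than one nameserver is worth a second look, since a single routing or power problem can take out all of them at once.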

 

Let’s ping all three from the NLNOG Ring node of bdHUB: bdhub01.ring.nlnog.net

So clearly all three servers are local to Bangladesh, going by the very low latency from the bdHUB node. Going by traces from outside India, it is quite unlikely that there is any other anycast node outside Bangladesh. This is a serious design issue. A country’s TLD should have much more resiliency.

My good friend Fakrul from APNIC mentioned on the mailing list that PCH is becoming a secondary for .bd. The same is now visible in the authority NS records of the domain.

dig @dns.bd. bd. ns +short
jamuna.btcl.net.bd.
dns.bd.
bd-ns.anycast.pch.net.
surma.btcl.net.bd.

 

So once the same is added at the root DNS servers, PCH’s platform with its large number of anycast nodes will bring a bit more resiliency.

So what was the impact of this outage?
Well, probably a lot. The .bd TLD outage would have brought down a lot of websites running on .bd domains. Any fresh DNS lookup would have failed, and any website with a low TTL would have gone down. As per the bdIX traffic graph, some disturbance is visible across that day.

[Figure: bdIX traffic graph showing the drop]
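To make the TTL point concrete, here is a small Python sketch of when a resolver can still answer from its cache while the authoritative servers are unreachable. The timeline and TTL values are hypothetical and chosen only for illustration; the actual outage duration is not something I have measured.

```python
def answered_from_cache(cached_at, ttl, query_time):
    """True if a record cached at `cached_at` is still valid at `query_time`.

    All values are in seconds on the same (hypothetical) timeline.
    """
    return query_time < cached_at + ttl

# Hypothetical timeline: the outage starts at t=0 and a user queries
# 30 minutes in. Both records were cached one minute before the outage.
query_time = 1800

low_ttl_ok = answered_from_cache(cached_at=-60, ttl=300, query_time=query_time)
high_ttl_ok = answered_from_cache(cached_at=-60, ttl=86400, query_time=query_time)
print(low_ttl_ok, high_ttl_ok)
```

The record with the five minute TTL has already expired and needs a fresh lookup, which fails during the outage, while the one with a one day TTL is still served from cache.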

 

20 Aug

F root server transit in Chennai

A few days back I noticed that F root server (operated by ISC) had brought its anycast node at NIXI Chennai back live. They had taken it down earlier, as per my interaction with them over the mailing list. My last post about F root coming back live involved guesswork about who was providing the upstream.

 

Today I spent some time finding out who is providing transit to that node. It is important to note that most such key infrastructure nodes rely on peering for most of their traffic, but transit in the form of a full table or a default route stays in place so that packets can still be sent to a destination that is not in the table learnt from peering. In the case of the Indian deployment at NIXI Chennai, many ISPs were following the “regional routes” clause of NIXI and were announcing just their regional routes (to ISC’s F root router), but quite a few of them (like BSNL) were still learning routes in one region and exporting them into their other regions via their IGP. This created a case where my router (sitting on a BSNL link) was getting a forward path to NIXI Chennai for F root, but there was no return path from F root to my system because BSNL wasn’t announcing Northern prefixes at the Chennai-based NIXI.
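The missing return path can be illustrated with a toy route lookup in Python. This is only a sketch: the prefixes are documentation placeholders (RFC 5737), the "peer" and "transit" labels are my own, and a real router's table is of course far larger.

```python
import ipaddress

def lookup(table, destination):
    """Longest-prefix match; returns the next-hop label, or None if no route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Table learnt from peering only: just the regional prefix is announced
peering_only = {
    ipaddress.ip_network("192.0.2.0/24"): "peer",
}

# Same table, plus a default route towards a transit provider
with_default = dict(peering_only)
with_default[ipaddress.ip_network("0.0.0.0/0")] = "transit"

# A client whose prefix is NOT announced at this exchange
client = "198.51.100.25"

no_transit = lookup(peering_only, client)
via_transit = lookup(with_default, client)
regional = lookup(with_default, "192.0.2.9")
print(no_transit, via_transit, regional)
```

Without the default, replies to the client have nowhere to go. With it, they leave via transit, while regional destinations still prefer the more specific peering route.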

As I noted earlier, F root is back live in India and I am getting consistent and direct routes. It very much looks like a case of transit having been added on that node. Today I was looking at a global table dump and came across some interesting routes which reveal who is probably the transit for ISC’s F root in India. 🙂

 

 

 

F root server uses the IP 192.5.5.241, which is covered by BGP announcements of 192.5.5.0/24 as well as 192.5.4.0/23 from quite a few autonomous system numbers of ISC. In India ISC is announcing 192.5.5.0/24 from AS3557. It seems that ISC does not use AS3557 for direct peering with external networks; instead AS3557 peers with other ASNs of ISC. In ISC’s worldwide deployment of F root there seem to be 15+ ASNs with direct peering/upstream for AS3557. In the case of India, AS24049 is used by ISC for interconnection with external networks.
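A quick sanity check with Python's `ipaddress` module confirms that both prefixes cover the F root address, and that a router holding both would prefer the more specific /24:

```python
import ipaddress

f_root = ipaddress.ip_address("192.5.5.241")
prefixes = [
    ipaddress.ip_network("192.5.5.0/24"),
    ipaddress.ip_network("192.5.4.0/23"),
]

# Both announcements cover the address...
covering = [p for p in prefixes if f_root in p]

# ...and longest-prefix match selects the /24
best = max(covering, key=lambda p: p.prefixlen)
print(covering, best)
```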

 

Here’s a routing table entry from NIXI Chennai:

 

show ip bgp 192.5.5.241
Number of BGP Routes matching display condition : 1
Status codes: s suppressed, d damped, h history, * valid, > best, i internal
Origin codes: i – IGP, e – EGP, ? – incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.5.5.0/24 218.100.48.142 10 100 0 24049 3557 i

 

Looking into the global IPv4 table from Oregon route-views for AS24049 routes, we get:

route-views>sh ip bgp regexp 24049$
BGP table version is 911752723, local router ID is 128.223.51.103
Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
r RIB-failure, S Stale
Origin codes: i – IGP, e – EGP, ? – incomplete

Network Next Hop Metric LocPrf Weight Path
* 203.119.18.0 12.0.1.63 0 7018 6453 4755 37986 24049 i
* 193.0.0.56 0 3333 3356 6453 4755 37986 24049 i
* 208.74.64.40 0 19214 174 6453 4755 37986 24049 i
* 154.11.11.113 0 0 852 3257 6453 4755 37986 24049 i
* 128.223.253.10 0 3582 3701 3356 6453 4755 37986 24049 i
* 157.130.10.233 0 701 6453 4755 37986 24049 i
* 164.128.32.11 0 3303 6453 4755 37986 24049 i
*> 66.110.0.86 0 6453 4755 37986 24049 i
* 208.51.134.254 2523 0 3549 6453 4755 37986 24049 i
* 203.181.248.168 0 7660 2516 6453 4755 37986 24049 i
* 144.228.241.130 0 1239 6453 4755 37986 24049 i
* 114.31.199.1 0 0 4826 6939 1299 6453 4755 37986 24049 i
* 206.24.210.102 0 3561 6453 4755 37986 24049 i
* 194.85.102.33 0 3277 3267 174 6453 4755 37986 24049 i
* 217.75.96.60 0 0 16150 1299 6453 4755 37986 24049 i
* 202.232.0.2 0 2497 6453 4755 37986 24049 i
* 207.172.6.20 0 0 6079 3356 6453 4755 37986 24049 i
* 207.172.6.1 0 0 6079 3356 6453 4755 37986 24049 i
* 4.69.184.193 0 0 3356 6453 4755 37986 24049 i
* 66.59.190.221 0 6539 6453 4755 37986 24049 i
* 194.85.40.15 0 3267 9002 6453 4755 37986 24049 i
* 154.11.98.225 0 0 852 3257 6453 4755 37986 24049 i
* 69.31.111.244 3 0 4436 3257 6453 4755 37986 24049 i
* 202.249.2.86 0 7500 2516 6453 4755 37986 24049 i
* 89.149.178.10 10 0 3257 6453 4755 37986 24049 i
* 129.250.0.11 6 0 2914 6453 6453 4755 37986 24049 i
* 216.218.252.164 0 6939 1299 6453 4755 37986 24049 i
* 203.62.252.186 0 1221 4637 6453 4755 37986 24049 i
* 66.185.128.48 7 0 1668 6453 4755 37986 24049 i
* 209.124.176.223 0 101 101 3356 6453 4755 37986 24049 i
* 134.222.87.1 0 286 6453 4755 37986 24049 i
* 207.46.32.34 0 8075 6453 4755 37986 24049 i
route-views>

 

And here we go!

 

AS37986, i.e. Tulip Telecom, seems to be the transit. I think this is not a case of a route leak; it’s just that Tulip Telecom is providing transit.
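The same conclusion can be reached mechanically by counting which ASN sits immediately before the origin in each AS path. Below is a minimal Python sketch using a handful of the paths from the route-views output above; the function name is my own and the parsing assumes plain space-separated AS paths.

```python
from collections import Counter

# A few AS paths copied from the route-views output for 203.119.18.0
as_paths = [
    "7018 6453 4755 37986 24049",
    "3333 3356 6453 4755 37986 24049",
    "19214 174 6453 4755 37986 24049",
    "6453 4755 37986 24049",
    "2914 6453 6453 4755 37986 24049",
]

def upstreams_of(origin, paths):
    """Count the ASN seen immediately before `origin` in each path."""
    counts = Counter()
    for path in paths:
        hops = path.split()
        if not hops or hops[-1] != origin:
            continue
        # Walk back over any prepending of the origin ASN
        i = len(hops) - 1
        while i >= 0 and hops[i] == origin:
            i -= 1
        if i >= 0:
            counts[hops[i]] += 1
    return counts

upstreams = upstreams_of("24049", as_paths)
print(upstreams)
```

If every path funnels through a single ASN right before the origin, that ASN is very likely the transit provider rather than one peer among many.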

 

route-views>sh ip bgp regexp 24049 3557

route-views>

 

 

Clearly the F root prefix is not being announced directly to Tulip, which seems good since it avoids external (outside India) traffic hitting the Chennai node. I am quite sure that the default route on the router of AS24049 (or likely AS3557) points to Tulip’s gateway.

Here we can see the sites under their management subnet – http://route.robtex.com/203.119.18.0-24—internet-systems-consortium.html#sites

 

Well, thanks to Tulip Telecom for helping to bring the F root node back live. 🙂

 

 

Disclaimer: This is my personal blog and this post is entirely a reflection of my personal thoughts. It has no relation to my employer.