20 Aug

Bangladesh .bd TLD outage on 18th August 2016

 
The day before yesterday, i.e. on 18th August 2016, Bangladesh’s TLD .bd had an outage. It was originally reported by Jasim Alam on the bdNOG mailing list.

dig btcl.com.bd @8.8.8.8
; <<>> DiG 9.10.4-P2 <<>> btcl.com.bd @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8114
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;btcl.com.bd.                   IN      A
;; Query time: 76 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Aug 18 14:24:25 Bangladesh Standard Time 2016
;; MSG SIZE  rcvd: 40

 
His message shows that DNS resolution for BTCL (Bangladesh Telecommunications Company Ltd) was failing. Later Alok Das mentioned that a power problem had resulted in the outage.
Let’s ask one of the 13 root DNS servers for the NS records to see who has the delegation for .bd.

dig @k.root-servers.net. bd. ns
; <<>> DiG 9.8.3-P1 <<>> @k.root-servers.net. bd. ns
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7148
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 3
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;bd.   				IN     	NS
;; AUTHORITY SECTION:
bd.    			172800 	IN     	NS     	dns.bd.
bd.    			172800 	IN     	NS     	surma.btcl.net.bd.
bd.    			172800 	IN     	NS     	jamuna.btcl.net.bd.
;; ADDITIONAL SECTION:
dns.bd.			172800 	IN     	A      	209.58.24.3
surma.btcl.net.bd.     	172800 	IN     	A      	203.112.194.232
jamuna.btcl.net.bd.    	172800 	IN     	A      	203.112.194.231
;; Query time: 43 msec
;; SERVER: 2001:7fd::1#53(2001:7fd::1)
;; WHEN: Sat Aug 20 01:29:37 2016
;; MSG SIZE  rcvd: 136

So two out of these three seem to be on the BTCL network, and on the same /24 at that.
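
A quick way to see that grouping, as a minimal sketch: the helper name slash24 is my own, and it simply truncates each address from the output above to its /24 network.

```shell
#!/bin/sh
# slash24 is a hypothetical helper: print the /24 network an IPv4 address falls in.
slash24() {
    printf '%s\n' "$1" | awk -F. '{ printf "%s.%s.%s.0/24\n", $1, $2, $3 }'
}

# The three glue addresses returned by the root server above.
for ip in 209.58.24.3 203.112.194.232 203.112.194.231; do
    printf '%-16s %s\n' "$ip" "$(slash24 "$ip")"
done
```

The last two addresses land in the same network, so a single problem affecting that prefix can take out two of the three nameservers at once.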
 
Let’s ping all three of them from the NLNOG Ring node of bdHUB: bdhub01.ring.nlnog.net

anurag@ansible:~$ ansible -a 'ping -c 5 dns.bd'  bdhub01.ring.nlnog.net
bdhub01.ring.nlnog.net | SUCCESS | rc=0 >>
PING dns.bd (209.58.24.3) 56(84) bytes of data.
64 bytes from 209.58.24.3: icmp_req=1 ttl=60 time=0.754 ms
64 bytes from 209.58.24.3: icmp_req=2 ttl=60 time=0.728 ms
64 bytes from 209.58.24.3: icmp_req=3 ttl=60 time=0.725 ms
64 bytes from 209.58.24.3: icmp_req=4 ttl=60 time=0.726 ms
64 bytes from 209.58.24.3: icmp_req=5 ttl=60 time=0.737 ms
--- dns.bd ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 34660ms
rtt min/avg/max/mdev = 0.725/0.734/0.754/0.010 ms
anurag@ansible:~$
anurag@ansible:~$ ansible -a 'ping -c 5 surma.btcl.net.bd'  bdhub01.ring.nlnog.net
bdhub01.ring.nlnog.net | SUCCESS | rc=0 >>
PING surma.btcl.net.bd (203.112.194.232) 56(84) bytes of data.
64 bytes from host232.btcl.net.bd (203.112.194.232): icmp_req=1 ttl=60 time=0.775 ms
64 bytes from host232.btcl.net.bd (203.112.194.232): icmp_req=2 ttl=60 time=0.739 ms
64 bytes from host232.btcl.net.bd (203.112.194.232): icmp_req=3 ttl=60 time=1.02 ms
64 bytes from host232.btcl.net.bd (203.112.194.232): icmp_req=4 ttl=60 time=0.724 ms
64 bytes from host232.btcl.net.bd (203.112.194.232): icmp_req=5 ttl=60 time=0.724 ms
--- surma.btcl.net.bd ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4534ms
rtt min/avg/max/mdev = 0.724/0.796/1.022/0.119 ms
anurag@ansible:~$
anurag@ansible:~$ ansible -a 'ping -c 5 jamuna.btcl.net.bd'  bdhub01.ring.nlnog.net
bdhub01.ring.nlnog.net | SUCCESS | rc=0 >>
PING jamuna.btcl.net.bd (203.112.194.231) 56(84) bytes of data.
64 bytes from host231.btcl.net.bd (203.112.194.231): icmp_req=1 ttl=60 time=0.739 ms
64 bytes from host231.btcl.net.bd (203.112.194.231): icmp_req=2 ttl=60 time=0.785 ms
64 bytes from host231.btcl.net.bd (203.112.194.231): icmp_req=3 ttl=60 time=0.948 ms
64 bytes from host231.btcl.net.bd (203.112.194.231): icmp_req=4 ttl=60 time=1.26 ms
64 bytes from host231.btcl.net.bd (203.112.194.231): icmp_req=5 ttl=60 time=0.747 ms
--- jamuna.btcl.net.bd ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4513ms
rtt min/avg/max/mdev = 0.739/0.897/1.268/0.201 ms
anurag@ansible:~$

So clearly all three servers are local to Bangladesh, going by the very low latency from the bdHUB node. From traces run from outside India, it also seems quite unlikely that there is any anycast node outside Bangladesh. This is a serious design issue. For a country’s TLD one should have much more resiliency.
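
To repeat this check from more vantage points, it helps to pull just the average RTT out of each ping run. A minimal sketch, where avg_rtt is a made-up helper name and the parsing assumes the usual min/avg/max summary line that ping prints:

```shell
#!/bin/sh
# avg_rtt is a hypothetical helper: read ping output on stdin and print
# the average round-trip time in ms, taken from the "min/avg/max" summary line.
avg_rtt() {
    awk -F/ '/min\/avg\/max/ { print $5 }'
}

# Usage (assumes ansible access to the NLNOG Ring node, as in the runs above):
#   ansible -a 'ping -c 5 dns.bd' bdhub01.ring.nlnog.net | avg_rtt
```

If a node far away from Bangladesh also reported a very low average, that would hint at an anycast instance near it. High RTTs from everywhere else fit a Bangladesh-only setup.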
My good friend Fakrul from APNIC mentioned on the mailing list that PCH is becoming a secondary for .bd. The same is now visible in the authoritative NS records of the zone.

dig @dns.bd. bd. ns +short
jamuna.btcl.net.bd.
dns.bd.
bd-ns.anycast.pch.net.
surma.btcl.net.bd.

 
So once the same is added at the root DNS servers, it will bring a bit more resiliency, thanks to PCH’s platform with its large number of anycast nodes.
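
Until that happens, the parent (root) and the child (.bd zone) list different NS sets. Here is a rough sketch for spotting such a mismatch. missing_at_parent is a hypothetical helper that takes two lists of names, one per line, and the dig pipelines in the comments are how I would feed it, assuming dig is installed and the servers are reachable.

```shell
#!/bin/sh
# missing_at_parent is a hypothetical helper.
# $1 = NS names seen at the parent, $2 = NS names seen at the child (one per line).
# Prints the names that the child lists but the parent does not.
missing_at_parent() {
    parent_sorted=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sort -u)
    printf '%s\n' "$2" | tr 'A-Z' 'a-z' | sort -u | while read -r ns; do
        [ -n "$ns" ] || continue
        printf '%s\n' "$parent_sorted" | grep -qxF -- "$ns" || printf '%s\n' "$ns"
    done
}

# Usage (needs network access):
#   parent=$(dig +norec +noall +authority @k.root-servers.net. bd. ns | awk '{ print $5 }')
#   child=$(dig @dns.bd. bd. ns +short)
#   missing_at_parent "$parent" "$child"
```

With the two outputs shown in this post, the only name it would report is bd-ns.anycast.pch.net.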
So what was the impact of this outage?
Well, probably a lot. The .bd TLD outage would have brought down a lot of websites running on .bd domains. Any fresh DNS lookup would have failed, and any website with a low TTL would have gone down once its cached records expired. As per the bdIX traffic graph, some disturbance is visible across that day.
(Image: bdIX traffic graph showing the drop)
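
To keep an eye on this sort of failure, the status field in dig’s header is the thing to watch. A small sketch, where dig_status is a helper name I made up. It only parses dig’s output, and the loop in the comment assumes dig and network access.

```shell
#!/bin/sh
# dig_status is a hypothetical helper: read dig output on stdin and print
# the response status from the header (NOERROR, SERVFAIL, NXDOMAIN, ...).
# Prints nothing if there is no header, e.g. when the query timed out.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Usage (needs network access):
#   for name in btcl.com.bd; do
#       printf '%s %s\n' "$name" "$(dig +time=2 +tries=1 "$name" @8.8.8.8 | dig_status)"
#   done
```

During the outage a fresh lookup like the one Jasim posted would have shown up here as SERVFAIL.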
 

27 Mar

Letsencrypt – Free signed automated SSL

Last year a really good project, Letsencrypt, came up. The key objective of this project is to help secure the web by pushing SSL everywhere.
 
Two key cool features

  1. It offers free signed SSL certs!
  2. It helps in setting up SSL seamlessly via an agent, without having to deal with creating a CSR, getting it signed & updating the web server configuration.

 
At this stage Letsencrypt is itself a Certificate Authority, but its root certs are not yet in the browsers. It’s probably going to take a while till all major browsers include their root certificate.
To help with that, one of its sponsors, IdenTrust, has signed their intermediate certs. Hence certs signed by Letsencrypt are accepted by all browsers right away. All certs issued by Letsencrypt are signed by Let’s Encrypt Authority X1, which in turn has a signature from DST Root CA X3, which is accepted by pretty much all popular browsers. You can read more about How it works here.
 
Here’s an example of an SSL setup for a test domain, say “demo.anuragbhatia.com”, which is already up and working without SSL. http://demo.anuragbhatia.com shows a plain text page. This is Apache running on an Ubuntu server.
The Apache web config is pretty straightforward.

<VirtualHost *:80>
ServerName demo.anuragbhatia.com
DocumentRoot /var/www/demo.anuragbhatia.com
ErrorLog /var/log/apache2/demo.anuragbhatia.com
LogLevel notice
</VirtualHost>

 
 
Step 1 – Grab the Letsencrypt agent
git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
 
Step 2 – Execute the auto script
./letsencrypt-auto --help
 
This will grab all needed dependencies and will get the agent working.
 
Step 3 – Execute the Letsencrypt auto script with its Apache plugin
./letsencrypt-auto --apache -d demo.anuragbhatia.com
 
It takes me through a quick wizard and in the end I get:

Congratulations! You have successfully enabled
https://demo.anuragbhatia.com
You should test your configuration at:
https://www.ssllabs.com/ssltest/analyze.html?d=demo.anuragbhatia.com

 
And it’s done!
The wizard got me a signed SSL cert and installed it in the Apache config as well.
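
To check which chain the server now presents, one option is to look at the subject and issuer lines that openssl s_client prints. A rough sketch: chain_lines is a hypothetical helper, and the exact formatting of those lines varies between OpenSSL versions.

```shell
#!/bin/sh
# chain_lines is a hypothetical helper: read "openssl s_client -showcerts"
# output on stdin and keep only the subject (s:) and issuer (i:) lines
# from the "Certificate chain" section.
chain_lines() {
    grep -E '^ *[0-9]+ s:|^ +i:'
}

# Usage (needs network access and openssl):
#   openssl s_client -connect demo.anuragbhatia.com:443 \
#       -servername demo.anuragbhatia.com -showcerts </dev/null 2>/dev/null | chain_lines
```

The issuer of the first cert should be the Letsencrypt intermediate, and the issuer of that intermediate should be DST Root CA X3.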
 
The agent created an additional Apache config named demo.anuragbhatia.com-le-ssl.conf with the following content:

<IfModule mod_ssl.c>
<VirtualHost *:443>
ServerName demo.anuragbhatia.com
DocumentRoot /var/www/demo.anuragbhatia.com
ErrorLog /var/log/apache2/demo.anuragbhatia.com
LogLevel notice
SSLCertificateFile /etc/letsencrypt/live/demo.anuragbhatia.com/cert.pem
SSLCertificateKeyFile /etc/letsencrypt/live/demo.anuragbhatia.com/privkey.pem
Include /etc/letsencrypt/options-ssl-apache.conf
SSLCertificateChainFile /etc/letsencrypt/live/demo.anuragbhatia.com/chain.pem
</VirtualHost>
</IfModule>

 
Here options-ssl-apache.conf plays an important role by setting better security options. Its config:

# Baseline setting to Include for SSL sites
SSLEngine on
# Intermediate configuration, tweak to your needs
SSLProtocol             all -SSLv2 -SSLv3
SSLCipherSuite          ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA
SSLHonorCipherOrder     on
SSLCompression          off
SSLOptions +StrictRequire
# Add vhost name to log entries:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" vhost_combined
LogFormat "%v %h %l %u %t \"%r\" %>s %b" vhost_common
CustomLog /var/log/apache2/access.log vhost_combined
LogLevel warn
ErrorLog /var/log/apache2/error.log
# Always ensure Cookies have "Secure" set (JAH 2012/1)
#Header edit Set-Cookie (?i)^(.*)(;\s*secure)??((\s*;)?(.*)) "$1; Secure$3$4"

 
Some of the limitations 

  1. Signed SSL certs are valid only for 90 days and have to be renewed.
  2. Wildcard SSL certs are not supported yet.
  3. IPv6 is not supported in the automatic setup via the client. One can always get a certificate manually and use it with IPv6, but the agent itself is yet to support IPv6 (which I guess is coming next month).
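
Because of the 90 day validity, it’s worth checking how long the cert has left. A minimal sketch using openssl’s -checkend option. needs_renewal is a hypothetical helper and the 30 day threshold is just my assumption, not something Letsencrypt mandates.

```shell
#!/bin/sh
# needs_renewal is a hypothetical helper.
# $1 = path to a PEM cert, $2 = threshold in days (default 30, an assumption).
# Returns 0 if the cert expires within the threshold (or cannot be read),
# and 1 if it is still valid beyond it.
needs_renewal() {
    cert=$1
    days=${2:-30}
    # -checkend N exits 0 when the cert is still valid N seconds from now.
    if openssl x509 -checkend $((days * 86400)) -noout -in "$cert" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Usage, e.g. from cron (assumes your client version has the "renew" subcommand):
#   needs_renewal /etc/letsencrypt/live/demo.anuragbhatia.com/cert.pem 30 \
#       && ./letsencrypt-auto renew
```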

 
You can read more in their excellent documentation here, and can also consider checking the presentation by Ashley Jones from PCH at SANOG on All TLS, all the time.
 
Have fun!