29 May

What makes BSNL AS9829 the most unstable ASN in the world?!

Over the weekend I was looking at BGP Instability Report data. As usual (and unfortunately), BSNL tops that list: it is the most unstable autonomous system in the world. I have written previously about how AS9829 is the rotten IP backbone.


This isn’t a surprise, since they keep coming out on top, but I think it’s well worth checking what exactly is causing it. So I looked at the BGP table updates published on Oregon route-views from 21st May to 27th May and pulled the data specifically for AS9829. I see zero withdrawals, which is very interesting. I expected a lot of announcements and withdrawals as they switch transits to balance traffic.

If I plot the data, I get the following chart of announcements against timestamp. It is a summarised view with one point per 15 minutes, taken from 653 routing update dumps. Graphing all 653 dumps did not seem feasible, so I picked the top 300. Here’s how it looks:


Except for a few large spikes, the pattern is relatively consistent. We can see close to 50,000 fresh announcements daily.
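To make the 15-minute summarisation concrete, here is a minimal Python sketch of how updates could be bucketed. The input shape, a list of `(unix_timestamp, kind)` pairs with `"A"` for announcement and `"W"` for withdrawal, is an assumption for illustration and not the actual format of the route-views dumps, and the sample values are made up.

```python
from collections import Counter


def bucket_updates(updates, bucket_seconds=900):
    """Count announcements ("A") and withdrawals ("W") per time bucket.

    `updates` is an iterable of (unix_timestamp, kind) pairs. This shape is
    hypothetical; real dumps need to be parsed into it first.
    """
    buckets = {}
    for timestamp, kind in updates:
        # Round the timestamp down to the start of its 15-minute bucket.
        start = timestamp - (timestamp % bucket_seconds)
        buckets.setdefault(start, Counter())[kind] += 1
    return buckets


# Made-up sample: three announcements in the first bucket, then one
# announcement and one withdrawal in the second.
sample = [(0, "A"), (100, "A"), (899, "A"), (900, "A"), (1000, "W")]
buckets = bucket_updates(sample)

for start in sorted(buckets):
    print(start, buckets[start]["A"], buckets[start]["W"])
```

Plotting the `"A"` count per bucket against the bucket start time would give a chart of the kind described above.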

This data alone doesn’t tell me much. So instead of looking at updates, I pulled last week’s RIBs and extracted the AS9829 announcements. The idea is to map announcements to each upstream against the timestamp, along with announcements across the various subnet masks.

Here’s the total route announcement graph:

The graph above clearly shows that total route announcements increased on 23rd May at 06:00 UTC from 127664 to 129298, then dipped significantly at 14:00 on 26th May to 124301. So between 10:00 and 14:00 on the 26th, routes dropped by as much as 4%, clearly indicating a large outage in their network.
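The percentages behind those numbers can be checked with a few lines of Python. This is only a sanity check of the arithmetic, and it assumes the count was still at the 129298 level just before the dip on the 26th, which the figures quoted here do not confirm.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Route counts quoted in the post.
rise = percent_change(127664, 129298)  # increase on 23rd May
drop = percent_change(129298, 124301)  # dip on 26th May (assumed baseline)

print(f"rise: {rise:.2f}%")
print(f"drop: {drop:.2f}%")
```

The drop works out to a little under 4%, in line with the figure quoted above.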

The next part is to look at how they tweak their announcements to each upstream.

So clearly they are announcing a large number of routes to Tata AS6453, and these are IPLC links where they buy IP transit outside India. Some of the key spikes are mirrored on other transits, a clear hint of circuit balancing by moving route announcements.


The next part is to view their announcements by prefix size.

Both the /20s and the /22s seem relatively consistent, except for a dip on the 26th.
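A per-prefix-size view like this boils down to counting routes by mask length. Here is a minimal sketch using Python's standard `ipaddress` module; the prefixes are made-up private addresses standing in for a real RIB extract.

```python
import ipaddress
from collections import Counter


def prefix_length_histogram(prefixes):
    """Count how many prefixes exist for each mask length."""
    return Counter(ipaddress.ip_network(p).prefixlen for p in prefixes)


# Hypothetical sample; real input would be the AS9829 prefixes from a RIB.
sample = [
    "10.0.0.0/18",
    "10.0.64.0/20",
    "10.0.80.0/22",
    "10.0.84.0/22",
    "10.0.88.0/24",
]
histogram = prefix_length_histogram(sample)

for length in sorted(histogram):
    print(f"/{length}: {histogram[length]}")
```

Running this once per RIB snapshot and plotting each mask length as its own series would give a chart of this kind.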


So all I can say based on the above data is the following:

  1. BSNL had some issues last week. Possibly one of their upstream pipes had problems, and they increased their announcements on Tata AS6453 during that time.
  2. They are the only large operator buying transit from as many as 9 upstreams. This splits capacity across at least 9, and possibly 30-40, circuits, which makes capacity management across these upstreams a major challenge.
  3. They are announcing a large number of prefix sizes: /18, /20, /22, /23 and even /24s. This isn’t good practice at their scale.
  4. They need to start peering. They are the only network of that scale that isn’t peering, except with a couple of content players like Google AS15169. They need to peer aggressively inside India and do the same outside India if they want to keep running such a network. Otherwise, even buying only domestic transit would be a better strategy.


Most of these problems can be fixed if BSNL consolidates its number of transits (and circuits per transit) and aggregates its routes. For a three-transit scenario, they can follow a /18, /20 and /22 strategy and keep /24s only for emergency cases to balance traffic.
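As a small illustration of what route aggregation means, Python's standard `ipaddress.collapse_addresses` merges contiguous more-specifics into the shortest covering prefixes. The prefixes below are made-up private addresses, not BSNL's real announcements.

```python
import ipaddress

# Hypothetical more-specifics: four contiguous /24s plus one stray /24.
more_specifics = [
    ipaddress.ip_network(p)
    for p in (
        "10.0.0.0/24",
        "10.0.1.0/24",
        "10.0.2.0/24",
        "10.0.3.0/24",
        "10.0.8.0/24",
    )
]

# collapse_addresses merges adjacent networks wherever that is possible.
aggregated = list(ipaddress.collapse_addresses(more_specifics))

for network in aggregated:
    print(network)
```

Five announcements become two: the four contiguous /24s collapse into a single /22, while the stray /24 has no neighbour to merge with and stays as it is.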

30 Jun

Private IPs in Public routing

Sometimes we see interesting IPs in a traceroute, and they confuse a lot of people.

I have seen this topic discussed twice on NANOG and once on the Linux Delhi user group.


OK – let’s pick an example: 

anurag:~ anurag$ traceroute
traceroute to (, 64 hops max, 52 byte packets
1 router ( 1.176 ms 0.993 ms 0.941 ms
2 ( 20.626 ms 29.101 ms 19.216 ms
3 ( 23.983 ms 43.850 ms 45.057 ms
4 ( 118.094 ms 81.447 ms 66.838 ms
5 ( 115.979 ms 90.947 ms 90.491 ms
6 ix-4-2.tcore1.cxr-chennai.as6453.net ( 95.778 ms 98.601 ms 98.920 ms
7 if-5-2.tcore1.svw-singapore.as6453.net ( 321.174 ms
if-3-3.tcore2.cxr-chennai.as6453.net ( 331.386 ms 326.671 ms
8 if-6-2.tcore2.svw-singapore.as6453.net ( 317.442 ms
if-2-2.tcore2.svw-singapore.as6453.net ( 334.647 ms 339.289 ms
9 if-7-2.tcore2.lvw-losangeles.as6453.net ( 318.003 ms 328.334 ms 309.234 ms
10 if-2-2.tcore1.lvw-losangeles.as6453.net ( 306.500 ms 326.194 ms 341.537 ms
11 ( 315.431 ms 330.417 ms 308.372 ms
12 dls-bb1-link.telia.net ( 354.768 ms 344.360 ms 357.050 ms
13 chi-bb1-link.telia.net ( 352.479 ms 358.751 ms 359.987 ms
14 cco-ic-156108-chi-bb1.c.telia.net ( 367.467 ms 370.482 ms 377.280 ms
15 bbr01aldlmi-bue-4.aldl.mi.charter.com ( 387.269 ms 385.362 ms 365.694 ms
16 crr02aldlmi-bue-2.aldl.mi.charter.com ( 375.275 ms 375.356 ms 371.621 ms
17 dtr02grhvmi-tge-0-1-0-0.grhv.mi.charter.com ( 383.539 ms 371.817 ms 383.804 ms
18 dtr02whthmi-tge-0-1-0-0.whth.mi.charter.com ( 384.400 ms 391.197 ms 393.340 ms
19 dtr02ldngmi-tge-0-1-0-0.ldng.mi.charter.com ( 371.192 ms 375.679 ms 378.457 ms
20 acr01mnplmi-tge-0-0-0-3.mnpl.mi.charter.com ( 364.824 ms 385.534 ms 374.401 ms
21 * *^C
anurag:~ anurag$



Let’s try pinging the IP on the 14th hop (which belongs to Telia, a major backbone):

anurag:~ anurag$ ping -c 5
PING ( 56 data bytes
64 bytes from icmp_seq=0 ttl=240 time=517.305 ms
64 bytes from icmp_seq=1 ttl=240 time=329.230 ms
64 bytes from icmp_seq=2 ttl=240 time=324.397 ms
64 bytes from icmp_seq=3 ttl=240 time=331.474 ms
64 bytes from icmp_seq=4 ttl=240 time=326.409 ms

--- ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 324.397/365.763/517.305/75.809 ms
anurag:~ anurag$


Works fine! 


Game begins here…


Next, let’s try pinging the IP on the 15th hop, which belongs to Charter, a major cable company operating in US East:

PING ( 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3

--- ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
anurag:~ anurag$


So we see some nice timeouts. This confuses a lot of people: it can’t be a firewall blocking ICMP packets here, since we did get ICMP replies from the 15th hop in the last trace.


Let’s trace to this IP to see exactly where the connection breaks.

anurag:~ anurag$ traceroute -a
traceroute to (, 64 hops max, 52 byte packets
1 [AS65534] router ( 1.661 ms 0.887 ms 0.934 ms
2 [AS9829] ( 18.867 ms 31.898 ms 20.931 ms
3 [AS9829] ( 43.427 ms 22.327 ms 34.790 ms
4 [AS4755] ( 78.673 ms 79.056 ms 70.441 ms
5 * * *
6 * * *
7 * * *
8 * * *
anurag:~ anurag$


(Surprising?) Well, as we see, we can’t get beyond the Tata-VSNL AS4755 border router in Mumbai. Why? Let’s ask its upstream neighbor, Tata AS6453. Checking the route for the IP in the Tata AS6453 routing table:


show ip bgp

Router: gin-mlv-core1
Site: IN, Mumbai, MLV
Command: show ip bgp

% Network not in table 


This situation is the one this blog post is about! 🙂

What’s a bit confusing here is that we are able to reach a destination IP (as in this example) and the routers in the middle seem perfectly normal, but if we try to reach those middle routers explicitly, we don’t see a route.


Why do we see no route?

Because there’s just no route. These prefixes are not announced in the global routing table via BGP.


So technically no one is announcing any subnet in global IPv4 table which covers address space for
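To make "no covering prefix" concrete, here is a toy longest-prefix-match lookup in Python. The table holds made-up documentation prefixes, not real global routes; the point is only that when nothing in the table contains the address, the lookup has nothing to return, which is what the router reports as "Network not in table".

```python
import ipaddress


def lookup(table, address):
    """Return the most specific prefix covering `address`, or None."""
    addr = ipaddress.ip_address(address)
    matches = [network for network in table if addr in network]
    if not matches:
        return None  # equivalent of "% Network not in table"
    # Longest prefix match: the largest mask length wins.
    return max(matches, key=lambda network: network.prefixlen)


# Hypothetical routing table built from documentation ranges.
table = [
    ipaddress.ip_network(p)
    for p in ("192.0.2.0/24", "198.51.100.0/24", "198.51.100.128/25")
]

covered = lookup(table, "198.51.100.200")
uncovered = lookup(table, "203.0.113.9")

print(covered)
print(uncovered)
```

A real router uses a trie rather than a linear scan, but the outcome for an unannounced address is the same.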


Here’s the view from another major backbone router in the US:

route-server> sh bgp ipv4 unicast
% Network not in table


Did someone forget to announce a prefix?

Well, the answer is NO!
Everything is just fine in such a setup. Basically, many providers like Charter (and many ISPs) do not announce the address space allocated to their backbone routers in the middle of the chain, to avoid the possibility of packet flooding and possibly some other attacks.


Then how are we getting ICMP replies during the initial trace to the destination IP?

We get ICMP replies because we simply followed the chain. The last router in the chain before Charter was Telia’s, which announces its address space normally, so we can reach it. That specific Telia router has a BGP session with the Charter router (since Charter is their downstream customer network), and it sits on multiple broadcast domains. One is the domain that takes us to it (reachable via the BGP announcement from AS1299). The other is likely a /30 used for the BGP session with Charter. A /30 has 4 IPs: one goes to the Telia router, another to the Charter router, the third is the broadcast address and the last is the network address, which can’t be assigned to a host. 😉
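The /30 arithmetic can be verified with Python's standard `ipaddress` module. The prefix below is a documentation range chosen purely for illustration; it is not the real Telia-Charter link subnet.

```python
import ipaddress

# Hypothetical point-to-point link subnet.
link = ipaddress.ip_network("192.0.2.0/30")

total = link.num_addresses                     # all addresses in the /30
usable = [str(host) for host in link.hosts()]  # one per router
network_address = str(link.network_address)
broadcast_address = str(link.broadcast_address)

print(total, usable, network_address, broadcast_address)
```

Of the four addresses, only the two in `usable` can go on router interfaces, one on each side of the link.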

Hence that specific Telia router has Charter’s routes and knows which physical interface leads to the next hop, the Charter router. That router in turn knows its own next hop, and so on, hop after hop, until we reach the destination router (which is always on well-advertised address space).  The same logic pretty much applies to RFC 1918 based private address space too. Like or etc. 

Now, as soon as one knows this chain, one can always add static routes to a routing table and flood those routers (which defeats the reason for not announcing the address space). For IXPs this part is also important, since a lot of them use a shared peering VLAN that sits on a single broadcast subnet, often a /23 or /24. I will discuss more about IX prefixes and the impact of announcing them in future posts.
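For spotting RFC 1918 hops in a traceroute, a quick check against the three private blocks is enough. Here is a minimal Python sketch; the three ranges are the ones defined in RFC 1918, while the test addresses are arbitrary examples.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network(p)
    for p in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address):
    """True if `address` falls inside one of the RFC 1918 blocks."""
    addr = ipaddress.ip_address(address)
    return any(addr in block for block in RFC1918_BLOCKS)


for hop in ("10.1.2.3", "172.32.0.1", "192.168.1.1", "8.8.8.8"):
    print(hop, is_rfc1918(hop))
```

The check is written out explicitly because `ipaddress`'s built-in `is_private` attribute also covers other special-purpose ranges, so it is broader than RFC 1918 alone.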


So that’s all about it. Have a good week ahead! 🙂