30 Jun

Private IPs in Public routing

Sometimes we see interesting IPs in a traceroute, and they confuse a lot of people.

I have seen this topic discussed twice on NANOG and once on the Linux Delhi user group.

 

OK – let’s pick an example: 

anurag:~ anurag$ traceroute 71.89.140.11
traceroute to 71.89.140.11 (71.89.140.11), 64 hops max, 52 byte packets
1 router (10.10.0.1) 1.176 ms 0.993 ms 0.941 ms
2 117.220.160.1 (117.220.160.1) 20.626 ms 29.101 ms 19.216 ms
3 218.248.169.122 (218.248.169.122) 23.983 ms 43.850 ms 45.057 ms
4 115.114.89.21.static-mumbai.vsnl.net.in (115.114.89.21) 118.094 ms 81.447 ms 66.838 ms
5 172.31.16.193 (172.31.16.193) 115.979 ms 90.947 ms 90.491 ms
6 ix-4-2.tcore1.cxr-chennai.as6453.net (180.87.36.9) 95.778 ms 98.601 ms 98.920 ms
7 if-5-2.tcore1.svw-singapore.as6453.net (180.87.12.53) 321.174 ms
if-3-3.tcore2.cxr-chennai.as6453.net (180.87.36.6) 331.386 ms 326.671 ms
8 if-6-2.tcore2.svw-singapore.as6453.net (180.87.37.14) 317.442 ms
if-2-2.tcore2.svw-singapore.as6453.net (180.87.12.2) 334.647 ms 339.289 ms
9 if-7-2.tcore2.lvw-losangeles.as6453.net (180.87.15.26) 318.003 ms 328.334 ms 309.234 ms
10 if-2-2.tcore1.lvw-losangeles.as6453.net (66.110.59.1) 306.500 ms 326.194 ms 341.537 ms
11 66.110.59.66 (66.110.59.66) 315.431 ms 330.417 ms 308.372 ms
12 dls-bb1-link.telia.net (213.155.136.40) 354.768 ms 344.360 ms 357.050 ms
13 chi-bb1-link.telia.net (80.91.248.208) 352.479 ms 358.751 ms 359.987 ms
14 cco-ic-156108-chi-bb1.c.telia.net (213.248.89.46) 367.467 ms 370.482 ms 377.280 ms
15 bbr01aldlmi-bue-4.aldl.mi.charter.com (96.34.0.98) 387.269 ms 385.362 ms 365.694 ms
16 crr02aldlmi-bue-2.aldl.mi.charter.com (96.34.2.11) 375.275 ms 375.356 ms 371.621 ms
17 dtr02grhvmi-tge-0-1-0-0.grhv.mi.charter.com (96.34.34.83) 383.539 ms 371.817 ms 383.804 ms
18 dtr02whthmi-tge-0-1-0-0.whth.mi.charter.com (96.34.34.85) 384.400 ms 391.197 ms 393.340 ms
19 dtr02ldngmi-tge-0-1-0-0.ldng.mi.charter.com (96.34.34.87) 371.192 ms 375.679 ms 378.457 ms
20 acr01mnplmi-tge-0-0-0-3.mnpl.mi.charter.com (96.34.40.75) 364.824 ms 385.534 ms 374.401 ms
21 * *^C
anurag:~ anurag$

 

 

Let’s try pinging the IP on the 14th hop (which belongs to a major backbone, Telia): 213.248.89.46

anurag:~ anurag$ ping -c 5 213.248.89.46
PING 213.248.89.46 (213.248.89.46): 56 data bytes
64 bytes from 213.248.89.46: icmp_seq=0 ttl=240 time=517.305 ms
64 bytes from 213.248.89.46: icmp_seq=1 ttl=240 time=329.230 ms
64 bytes from 213.248.89.46: icmp_seq=2 ttl=240 time=324.397 ms
64 bytes from 213.248.89.46: icmp_seq=3 ttl=240 time=331.474 ms
64 bytes from 213.248.89.46: icmp_seq=4 ttl=240 time=326.409 ms

--- 213.248.89.46 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 324.397/365.763/517.305/75.809 ms
anurag:~ anurag$

  

Works fine! 

 

Game begins here…

 

Next, let’s try pinging the IP on the 15th hop, which belongs to Charter, a major cable company operating in the eastern US: 96.34.0.98

PING 96.34.0.98 (96.34.0.98): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3

--- 96.34.0.98 ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss
anurag:~ anurag$

  

So we see some nice timeouts. This confuses a lot of people: a firewall simply blocking ICMP does not look like the explanation, since the 15th hop did send ICMP replies in the last trace.

 

Let’s run a trace to this IP to see exactly where the connection breaks.

anurag:~ anurag$ traceroute -a 96.34.0.98
traceroute to 96.34.0.98 (96.34.0.98), 64 hops max, 52 byte packets
1 [AS65534] router (10.10.0.1) 1.661 ms 0.887 ms 0.934 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 18.867 ms 31.898 ms 20.931 ms
3 [AS9829] 218.248.169.118 (218.248.169.118) 43.427 ms 22.327 ms 34.790 ms
4 [AS4755] 115.114.89.17.static-mumbai.vsnl.net.in (115.114.89.17) 78.673 ms 79.056 ms 70.441 ms
5 * * *
6 * * *
7 * * *
8 * * *
^C
anurag:~ anurag$

 

(Surprising?) Well, as we see, we can’t get beyond the Tata-VSNL AS4755 border router in Mumbai. Why? Let’s ask its upstream neighbor, Tata AS6453. Checking the route for 96.34.0.98 in the Tata AS6453 routing table:

 

show ip bgp 96.34.0.98

Router: gin-mlv-core1
Site: IN, Mumbai, MLV
Command: show ip bgp 96.34.0.98

% Network not in table 

 

This situation is the one this blog post is about! 🙂

What’s a bit confusing here is that we are able to trace towards a destination IP, say 71.89.140.11 as in this example, and the routers in the middle seem perfectly normal, but if we try to reach those middle routers explicitly, we don’t see a route.

 

Why do we see no route?

Because there’s just no route. These prefixes are not announced in the global routing table via BGP.

 

So technically no one is announcing any prefix in the global IPv4 table that covers the address space for 96.34.0.98.
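The "% Network not in table" answer is simply what a route lookup returns when no announced prefix covers the address. Here is a minimal sketch of that lookup in Python, assuming a toy table: the 213.248.64.0/18 entry comes from this post, while the second entry is a made-up documentation prefix and AS number added only for illustration.

```python
import ipaddress

# Toy "global table". Only the first entry is taken from this post;
# the second one is hypothetical (documentation prefix, private-use ASN).
ANNOUNCED = {
    ipaddress.ip_network("213.248.64.0/18"): "AS1299 (Telia)",
    ipaddress.ip_network("198.51.100.0/24"): "AS64500 (hypothetical)",
}

def lookup(ip, table=ANNOUNCED):
    """Return (prefix, origin) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return None  # the "% Network not in table" case
    best = max(matches, key=lambda net: net.prefixlen)
    return best, table[best]

print(lookup("213.248.89.46"))  # covered by Telia's /18
print(lookup("96.34.0.98"))     # nothing covers it in this toy table
```

A real router holds hundreds of thousands of prefixes and uses a trie rather than a linear scan, but the outcome for an unannounced address is the same: no match, no route.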

 

Here’s the same check on another major backbone router in the US:

route-server>
route-server> sh bgp ipv4 unicast 96.34.0.98
% Network not in table
route-server>

 

Did someone forget to announce a prefix?

Well, the answer is NO!
Everything is just fine in such a setup. Many providers like Charter (and many ISPs) do not announce the address space allocated to their backbone routers, which sit in the middle of the chain, to avoid the possibility of packet flooding and possibly some other attacks.

 

Then how are we getting ICMP replies during the initial trace to the destination IP?

We get ICMP replies because we simply followed the chain. The last router in the chain before Charter belongs to Telia, which announces its address space normally, so we are able to reach it. That specific Telia router has a BGP session with a Charter router (since Charter is their downstream customer network), and it sits on multiple broadcast domains. One of them is the one that takes us to 213.248.89.46 (covered by the BGP announcement for 213.248.64.0/18 from AS1299). The other is likely a /30 used for the BGP session with Charter. A /30 has 4 IPs: one goes to the Telia router, another goes to the Charter router, and the remaining two are the network address and the broadcast address, which cannot be assigned to hosts thanks to the maths. 😉
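The /30 arithmetic is easy to check with Python's standard ipaddress module. The sketch below uses 192.0.2.0/30, a documentation prefix chosen as a stand-in, because the real Telia-Charter link subnet is not visible anywhere in the traces.

```python
import ipaddress

# Stand-in link subnet (documentation range); the real /30 is unknown.
link = ipaddress.ip_network("192.0.2.0/30")

def describe(net):
    """Break a subnet into network address, usable hosts and broadcast."""
    return {
        "total": net.num_addresses,
        "network": str(net.network_address),
        "usable": [str(h) for h in net.hosts()],
        "broadcast": str(net.broadcast_address),
    }

print(describe(link))
```

Of the four addresses, only the two in the middle can be put on router interfaces, one for each end of the link.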

Hence that specific Telia router has Charter’s routes in its routing table and knows which “physical interface” leads to the “next hop”, the Charter router. That router in turn knows its next hop, and so on, hop after hop, until we reach the destination router (which is always on well-advertised address space). Pretty much the same logic applies to RFC 1918 private address space too, such as 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16, which is why 172.31.16.193 can show up at hop 5 of the trace above.
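To spot RFC 1918 hops in a trace, one can test each hop against the three private ranges. This is a small sketch using the first few hop addresses copied from the traceroute above; the helper name is_rfc1918 is my own.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_rfc1918(ip):
    """True if the address falls inside any RFC 1918 private range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)

# First hops of the trace to 71.89.140.11 shown earlier in this post.
hops = ["10.10.0.1", "117.220.160.1", "218.248.169.122",
        "115.114.89.21", "172.31.16.193", "180.87.36.9"]
private_hops = [h for h in hops if is_rfc1918(h)]
print(private_hops)
```

Note that Charter's 96.34.0.98 is not private at all; it is ordinary public space that just happens not to be announced.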

Of course, as soon as one knows this chain, one can always add static routes to a routing table and flood those routers (defeating the reason for not announcing the address space). This is also important for IXPs, since a lot of them use a shared peering VLAN that sits on a single broadcast subnet, often a /23 or /24. I will discuss IX prefixes and the impact of announcing them in future posts.

 

So that’s all about it. Have a good week ahead! 🙂

28 Jun

BSNL routing glitch and updates

Today I noticed some traffic on my blog coming from a link on Broadband forum.

 

Here’s what the poster wrote:

I made a thread a few days ago complaining about BSNL’s horrible routing. Well it looks like it has been fixed. I thank all the guys who made efforts to bring this to BSNL’s notice. Especially Anurag Bhatia who highlighted the issue with much detail on his blog

anuragbhatia.com !!! » Blog Archive » BSNL > Softlayer connectivity problem & possible fix

 

 

Always good to see links to my blog. This was an interesting update, and I can see the forward path does seem good for now.

 

Here’s an updated traceroute from India to Singapore (BSNL > Softlayer):

anurag:~ anurag$ traceroute -a hostgator.in
traceroute to hostgator.in (216.12.194.67), 64 hops max, 52 byte packets
1 [AS65534] router.home (10.10.0.1) 1.183 ms 1.290 ms 0.849 ms
2 [AS9829] 117.206.176.1 (117.206.176.1) 17.517 ms 18.056 ms 17.163 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 71.872 ms 52.246 ms 114.018 ms
4 [AS4755] 115.114.57.249.static-mumbai.vsnl.net.in (115.114.57.249) 49.644 ms 50.151 ms 49.265 ms
5 [AS0] 172.31.16.193 (172.31.16.193) 83.261 ms * 82.361 ms
6 [AS0] ix-4-2.tcore1.cxr-chennai.as6453.net (180.87.36.9) 197.469 ms 199.161 ms 196.580 ms
7 [AS0] if-5-2.tcore1.svw-singapore.as6453.net (180.87.12.53) 318.931 ms 307.292 ms
[AS0] if-3-3.tcore2.cxr-chennai.as6453.net (180.87.36.6) 306.836 ms
8 [AS0] if-5-2.tcore2.svw-singapore.as6453.net (180.87.15.69) 330.831 ms
[AS0] if-2-2.tcore2.svw-singapore.as6453.net (180.87.12.2) 306.926 ms
[AS0] if-6-2.tcore2.svw-singapore.as6453.net (180.87.37.14) 227.751 ms
9 [AS0] 180.87.15.218 (180.87.15.218) 230.692 ms 265.758 ms 241.768 ms
10 [AS4637] i-1-0-0.6ntp-core01.bi.telstraglobal.net (202.84.243.81) 245.100 ms 235.299 ms 274.206 ms
11 [AS4637] i-0-1-0-0.istt02.bi.telstraglobal.net (202.84.243.110) 307.158 ms 304.905 ms 307.080 ms
12 [AS4637] unknown.telstraglobal.net (202.126.128.18) 307.409 ms 304.740 ms 307.178 ms
13 [AS36351] ae5.dar02.sr03.sng01.networklayer.com (50.97.18.199) 307.167 ms 306.263 ms
[AS36351] ae5.dar01.sr03.sng01.networklayer.com (50.97.18.197) 307.456 ms
14 [AS36351] po1.fcr01.sr03.sng01.networklayer.com (174.133.118.131) 238.486 ms
[AS36351] po2.fcr01.sr03.sng01.networklayer.com (174.133.118.133) 234.005 ms
[AS36351] po1.fcr01.sr03.sng01.networklayer.com (174.133.118.131) 306.823 ms
15 * * *
16 * *^C
anurag:~ anurag$

 

 

So the forward path does seem good, but latency is still way higher than the expected value of 120-150ms (from North India). There’s a jump as soon as we hit the Chennai router of AS6453.

 

Quick ping output:

anurag:~ anurag$ ping -c 5 hostgator.in
PING hostgator.in (216.12.194.67): 56 data bytes
64 bytes from 216.12.194.67: icmp_seq=0 ttl=45 time=232.593 ms
64 bytes from 216.12.194.67: icmp_seq=1 ttl=45 time=233.120 ms
64 bytes from 216.12.194.67: icmp_seq=2 ttl=45 time=259.231 ms
64 bytes from 216.12.194.67: icmp_seq=3 ttl=45 time=281.217 ms
64 bytes from 216.12.194.67: icmp_seq=4 ttl=45 time=305.450 ms

--- hostgator.in ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 232.593/262.322/305.450/28.154 ms
anurag:~ anurag$

 

We can ignore any value above 232ms, because that is most likely just delay added by routers that do not give ICMP priority. But overall 232ms is quite high, and it seems like there is an issue in the reverse path. I am doing this test from 117.206.176.0/20, which sits in BSNL’s autonomous system, AS9829.

 

Looking at the BGP table at Softlayer Singapore for this prefix via the Softlayer Looking Glass, we get:

 

 

bbr01.eq01.sng02> show route protocol bgp 117.206.176.0 table inet.0
inet.0: 461775 destinations, 1662681 routes (461773 active, 1 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

117.206.176.0/20 *[BGP/170] 6:50:39, MED 1, localpref 160
AS Path: 4637 6453 9829 I

> to 202.126.128.17 via xe-0/2/0.0
[BGP/170] 6:50:41, MED 5, localpref 10
AS Path: 2914 6453 9829 I

> to 116.51.17.53 via ae11.0

 

So the AS path is AS4637 > AS6453 > AS9829
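Why the first path wins can be sketched with a heavily simplified best-path comparison: highest local preference first, then shortest AS path. This is only an illustration of those two steps, not the full BGP decision process (which also looks at origin, MED, eBGP vs iBGP and more); the Path class is my own, and the values are copied from the looking glass output above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    next_hop: str
    local_pref: int
    as_path: tuple

def best_path(paths):
    """Simplified selection: highest local-pref, then shortest AS path."""
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))

# Both paths for 117.206.176.0/20 as seen at Softlayer Singapore.
paths = [
    Path("202.126.128.17", 160, (4637, 6453, 9829)),
    Path("116.51.17.53", 10, (2914, 6453, 9829)),
]
print(best_path(paths).as_path)
```

Both AS paths are three hops long, so here it is the local preference of 160 versus 10 that decides in favour of the route via AS4637.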

 

AS4637 is Reach/Telstra, while AS6453 is Tata Comm, and right next to it is AS9829, which again (as per my earlier post) means an IPLC link. The AS6453 > AS9829 connection is outside India for sure; for an actual direct route from Singapore to India it should rather be AS6453 > AS4755 (VSNL) > AS9829.

 

Just to confirm this, let’s run a trace to a random IP 117.206.178.1 from Softlayer Singapore:

bbr01.eq01.sng02> traceroute 117.206.178.1
HOST: bbr01.eq01.sng02-re0 Loss% Snt Last Avg Best Wrst StDev
1. 202.126.128.17 0.0% 5 1.6 2.9 1.6 5.6 1.5
2. 202.84.180.157 0.0% 5 1.5 16.3 1.4 43.8 20.6
3. 202.84.243.114 0.0% 5 4.9 4.2 3.1 4.9 0.9
4. 180.87.15.217 0.0% 5 44.1 11.6 2.6 44.1 18.2
5. 180.87.15.70 0.0% 5 261.5 261.4 260.6 263.0 0.9
6. 180.87.36.5 0.0% 5 260.3 256.6 255.4 260.3 2.1
7. 180.87.36.33 0.0% 5 255.9 256.3 255.9 256.8 0.4
8. 80.231.217.17 0.0% 5 255.7 257.7 255.7 263.5 3.3
9. 80.231.217.2 60.0% 5 258.1 257.7 257.4 258.1 0.5
10. 80.231.200.14 80.0% 5 263.5 263.5 263.5 263.5 0.0
11. 80.231.131.34 0.0% 5 256.1 256.2 256.0 256.4 0.2
12. 195.219.144.2 0.0% 5 380.6 380.6 380.6 380.6 0.0
13. 218.248.255.101 0.0% 5 397.6 388.9 381.2 401.3 9.8
14. 218.248.169.121 0.0% 5 394.1 397.8 394.1 412.6 8.2
15. 218.248.169.121 20.0% 5 397.3 404.9 394.1 424.0 13.4
16. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

 

 

 

Clearly a high-latency route, but unfortunately the Softlayer looking glass does not do rDNS (PTR) mapping from IP to hostname. Let’s look up some specific hops using the dig command (with the -x argument for PTR):

anurag:~ anurag$
anurag:~ anurag$ dig +short -x 202.84.243.114
i-0-2-0-0.6ntp02.bi.telstraglobal.net.
anurag:~ anurag$ dig +short -x 180.87.15.217
ix-1-1-2-0.tcore2.SVW-Singapore.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.15.70
if-5-2.tcore2.CXR-Chennai.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.36.5
if-3-3.tcore1.CXR-Chennai.as6453.net.
anurag:~ anurag$ dig +short -x 180.87.36.33
if-7-2.tcore1.MLV-Mumbai.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.217.17
if-9-5.tcore1.WYN-Marseille.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.217.2
if-2-2.tcore2.WYN-Marseille.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.200.14
if-9-2.tcore2.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 80.231.131.34
if-4-0-0.mcore3.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 195.219.144.2
ix-1-1-0.mcore3.L78-London.as6453.net.
anurag:~ anurag$ dig +short -x 218.248.255.101
anurag:~ anurag$
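What dig -x does under the hood is turn the IP into a name under in-addr.arpa and ask for its PTR record. A rough Python equivalent, using only the standard library, could look like this; the function names ptr_name and resolve are my own, and resolve needs working DNS, so it is defined here but not called.

```python
import ipaddress
import socket

def ptr_name(ip):
    """Build the in-addr.arpa name that a PTR query is sent for."""
    return ipaddress.ip_address(ip).reverse_pointer

def resolve(ip):
    """Return the PTR hostname, or None when no record is found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None

for hop in ("202.84.243.114", "180.87.15.70", "218.248.255.101"):
    print(hop, "->", ptr_name(hop))
```

A hop like 218.248.255.101, for which dig printed nothing, would simply come back as None from resolve.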

 

 

 

So the return path for packets is:

 

Telstra (Singapore) > Tata AS6453 (Singapore) > Tata AS6453 (Chennai via Tata Indicom cable link) > Tata AS6453 (Mumbai) > Tata AS6453 (Marseille, France) > Tata AS6453 (London) > IPLC Link >>> BSNL AS9829 India.

 

So basically BSNL fixed the forward path, but the return path is badly messed up. They are not announcing this prefix, 117.206.176.0/20, along with many more prefixes, on their transit providers’ IP links. They are just relying on NIXI for domestic traffic, while for transit they are relying on IPLC ports, which in this case seems to be with Tata AS6453 in London.

 

Here’s what the Tata AS6453 router in Mumbai is getting:

 

AS6453 IPv4 and IPv6 Looking Glass
show ip bgp 117.206.176.0/20

Router: gin-mlv-core1
Site: IN, Mumbai, MLV
Command: show ip bgp 117.206.176.0/20

BGP routing table entry for 117.206.176.0/20
Bestpath Modifiers: deterministic-med
Paths: (3 available, best #3)
Multipath: eBGP
11 12
9829
l78-mcore3. (metric 2968) from mlv-tcore2. (66.110.10.215)
Origin IGP, valid, internal
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.
9829
l78-mcore3. (metric 2968) from mlv-tcore1. (66.110.10.202)
Origin IGP, valid, internal
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.
9829
l78-mcore3. (metric 2968) from cxr-tcore1. (66.110.10.113)
Origin IGP, valid, internal, best
Community:
Originator: Loopback5.mcore3.L78-London.as6453.net.

 

 

So clearly in all three cases Tata AS6453 is learning the route from the loopback interface of its router in London (mcore3, London). There’s not even a single route via Chennai/Mumbai through VSNL AS4755.

 

 

So what’s the possible fix?

Likely something like this:

  1. BSNL should maintain good capacity on IP ports along with IPLC ports.
  2. They should at least announce all prefixes on IP ports, without making a preferred more-specific announcement on IPLC (such as announcing a /18 on the IP port and a more specific /20 on IPLC).
  3. BSNL should implement BGP blackholing to keep East Asian traffic off their IPLC ports, since most of those ports are in London, New York and Los Angeles and not really in East Asia (as far as I can see from routes).
  4. BSNL “could” do a basic one-degree prepend on IPLC routes, especially with Tata AS6453, since AS6453 > AS9829 is a shorter AS path than AS6453 > AS4755 > AS9829. With a one-degree prepend they would have AS6453 > AS9829 > AS9829 (repeating their own AS once), lengthening the AS path to make the route less preferred.
  5. Buy an IPLC port to reach Equinix Singapore + HongKong Internet Exchange (HKIX); that’s where they can find a lot of local Asian traffic.
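The prepend idea in point 4 can be sketched in a few lines of Python. This only models AS path length, on the assumption that every other BGP attribute is equal; the helper name prepend is my own.

```python
def prepend(as_path, times=1):
    """Repeat the origin AS (last element) `times` extra times."""
    if not as_path:
        raise ValueError("empty AS path")
    return as_path + [as_path[-1]] * times

iplc = [6453, 9829]            # AS6453 > AS9829 (IPLC)
via_vsnl = [6453, 4755, 9829]  # AS6453 > AS4755 > AS9829

for times in (0, 1, 2):
    candidate = prepend(iplc, times)
    print(times, candidate, len(candidate), "vs", len(via_vsnl))
```

One thing this makes visible: a single prepend only brings the IPLC path level with the VSNL path at three hops, so other attributes would then break the tie. Making the IPLC path strictly longer on length alone would take two prepends.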
20 Jun

Multi-dimensional effect of corruption in college system

Exam days are still going on, with a few more to go. Before I get to the actual blog topic, I have a small story to share.
 

“Being a Superman” story 😉

Recently it rained quite heavily in Radaur, and I was trapped in a situation where the stairs to my room had come into contact with live grid electricity due to a badly insulated wire joint. The stairs are made of metal, which is what made the situation dangerous. I was told about the “current in the stairs” by someone living in the neighbourhood who accidentally touched the stairs while it was raining.
Next, I had to get down from there to go home. There was just no other option, and waiting for an electrician to figure it out was impractical. Turning off the main power switch was not an option either, since the leak was before the switch, on the main line. At this point I gave it some engineering thought.
 
When can one get an “electric shock”? -> When current passes through the body.
Since the human body has a fair amount of flesh, it offers significant resistance to the flow of current, and thus for current to pass through the body there needs to be a “potential difference” across it. I gave it a thought, and considering the fact that the air was dry at that instant, there was a relatively low possibility of moisture acting as a low-potential point. With that thought, I put my hand on the stairs and felt a slight shock in my hand and in the leg that was still on the wet ground. I quickly jumped completely on to the stairs and put both hands on the metal. And as I expected (though doubted), I did not get any further shock. And I was still alive too! 🙂
I came down the stairs (while my neighbours must have thought that I was planning a suicide after being tired of boring exams and even a bad result!) 🙂
As soon as I reached the last stair, I saw another dangerous point. There’s a step-down transformer just next to it (less than 2m away), and it is quite well grounded with 4 metal rods buried deep in the ground. The ground was still wet, and there was a high risk of the ground being at zero potential (while I was at 220V), which would be enough to give an amazing shock. After a few seconds of thinking, I figured the best way would be to jump off the stairs, so that I was no longer touching them when I touched the zero-potential ground. And all that went well. With just a minor shock while getting “on to” the stairs, I safely got out of my scary room. 🙂
 

The main post…

Anyway, coming to the more serious issue: “Multi-dimensional effect of corruption in college/education system”. In recent times there have been multiple paper leak reports in newspapers for Kurukshetra University. Here’s one report (in Hindi) from today’s newspaper about the DSP and Digital Electronics exams:

Based on the amazing lecture by Dr Subramanian Swamy on “Multi-dimensional effects of corruption”, we can deduce some multi-dimensional effects of corruption/cheating cases and unfair means in the college system.

Just as corruption isn’t simply about one person bribing another but rather has a broad impact, from the security of the country to political stability, and from the economy to sub-standard utilization of resources, in a very similar manner these “paper leaks” aren’t just about someone cheating and passing an exam but rather extend their effect in multiple dimensions. Being a student myself, I am aware of such leaks and have a fair understanding of how the system behind them works.

Some of the multi-dimensional effects of paper leaks are:

  1. Paper leaks involve significant money, as papers are leaked around their origin at very high rates, in the range of $1000 or so. Students who have links to such “sources” often pay such heavy amounts and recover a few times that by re-selling the paper.
  2. Most of the students involved in such reselling at a higher rank in the chain make as much as $500-1000 overnight by simply distributing digital pictures of leaked papers.
  3. After getting the original paper in hand, students get a nice opportunity to go in with well prepared “slips” with exact answers to 8 simple questions for a 100% attempt, rather than carrying slips based on assumed questions.
  4. Students who don’t get their hands dirty with such leaked papers tend to fail the exam, since their “fair” attempt is way below the average/topper’s attempt.
  5. Students who successfully pass an exam in such a way completely ruin the value of education, and they eventually turn out to be a “B.tech degree holder” rather than an “Engineer”.
  6. As the count of “B.tech degree holders” increases, society (friends, parents, relatives etc.) increases its pressure on the real potential “engineers”, and sooner or later they give in and become corrupt.
  7. A larger number of B.tech holders versus fewer real engineers degrades the job market and causes problems for everyone, including “real engineers”.
  8. Students who make $500-$1000 overnight by selling leaked papers make the best use of that money by spending it on drinking, drugs (and even putting it into IPL match fixing!), further degrading the environment.
  9. Each “successful paper leak” is accompanied by at least half a dozen “fake paper leaks”, which tend to ruin those exams.
  10. Our society ends up having “B.tech holders” as “engineers” in various Govt. jobs, and we get degraded infrastructure everywhere.
  11. More students opt for a “B.Tech degree” since they learn about such backend management from seniors.
  12. (Classic one) Degree holders are looked upon as “educated people” in society, and even if they don’t get a job or do not opt for an engineering job, they surely secure a good rank in the “arranged marriage system”. 🙂

 
 
An interesting fact is that Kurukshetra University never writes “Fail(ed)” in a result; they always mention “Re-appear”, which gives a “Not this time, but good luck next time” sort of feeling.
 
Time for me to end this post and get ready for the next exam.