31 Mar

Dark spot in Global IPv6 routing

 

 

It’s fest time at college, which is good since I get a lot of free time to spend looking at routing tables. It has been especially interesting because last week saw some major submarine cable cuts, which had a huge impact on Indian networks.

Anyway, there is an interesting issue to post about today in global IPv6 routing. There are “dark spots” in global IPv6 routing because of a peering dispute between tier 1 ISPs, namely Hurricane Electric (AS6939) and Cogent Communications (AS174). What’s happening is that the two tier 1 providers failed to reach an agreement to keep peering up over IPv6. As a result, there are parts of the global IPv6 Internet where packets from one network (and its downstreams) can’t reach the other network or its single-homed downstream networks.

The only publicly known information about the de-peering of Cogent from HE is Mr Mike Leber’s email to the NANOG mailing list here. Overall, Hurricane Electric seems pretty open about peering, and the networking community knows this well, so it is not hard to believe Mr Leber’s mail. In fact, they even baked a cake to cheer Cogent up at NANOG 47 in Dearborn, Michigan in 2009.

 

 

Why is the IPv6 Internet broken when just two providers de-peer?

The answer lies in the fundamental definition of a Tier 1 network, i.e. a “transit free” network. Hurricane Electric is the world’s biggest IPv6 backbone in terms of number of interconnections, while Cogent Communications is a big ISP in the US and Europe with significant last mile fiber in many areas of the US. It is a popular choice for cheap datacenter upstream transit.

Since both ISPs are tier 1, i.e. transit-free, networks on the IPv6 Internet, they simply do not pay anyone (at layer 3) to reach any network. Packets from HE can’t go to Cogent simply because HE has no transit provider in IPv6 (in fact, it is the transit provider for a lot of networks!). Likewise, Cogent does not have any transit provider in IPv6. Transit is important here because many networks in the world are not directly connected to each other. For example, India’s BSNL doesn’t connect to Hurricane Electric, and Tulip Telecom doesn’t connect to AT&T directly, but packets can still be routed because in both cases they have transit from an upstream network which eventually connects to, or peers with, the other network.
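The reasoning above can be sketched as a toy model. This is only an illustration under stated assumptions: it uses the common "valley-free" export rule (routes learned from a peer or provider are passed on only to customers), the customer names HE-CUSTOMER and COGENT-CUSTOMER are hypothetical, and AS1299 is assumed to peer with AS174 in both address families. The class and function names are made up for this sketch.

```python
from collections import defaultdict


class Topology:
    """Toy AS-level graph with transit (customer -> provider) and peering links."""

    def __init__(self):
        self.providers = defaultdict(set)  # asn -> its transit providers
        self.customers = defaultdict(set)  # asn -> its transit customers
        self.peers = defaultdict(set)      # asn -> its settlement-free peers

    def add_transit(self, customer, provider):
        self.providers[customer].add(provider)
        self.customers[provider].add(customer)

    def add_peering(self, a, b):
        self.peers[a].add(b)
        self.peers[b].add(a)

    @staticmethod
    def _closure(start, edges):
        """Return start plus everything reachable by following edges."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def can_reach(self, src, dst):
        """Valley-free check: climb providers, cross at most one peering, descend customers."""
        for asn in self._closure(src, self.providers):
            if dst in self._closure(asn, self.customers):
                return True
            for peer in self.peers.get(asn, ()):
                if dst in self._closure(peer, self.customers):
                    return True
        return False


def build(ipv6):
    """Build the hypothetical topology for one address family."""
    topo = Topology()
    topo.add_transit("HE-CUSTOMER", "AS6939")      # hypothetical single-homed customer
    topo.add_transit("COGENT-CUSTOMER", "AS174")   # hypothetical single-homed customer
    topo.add_peering("AS1299", "AS174")            # assumed in both families
    if ipv6:
        topo.add_peering("AS6939", "AS1299")       # HE is transit-free in IPv6
    else:
        topo.add_transit("AS6939", "AS1299")       # HE buys IPv4 transit
    return topo


for label, topo in (("IPv4", build(ipv6=False)), ("IPv6", build(ipv6=True))):
    print(label, topo.can_reach("HE-CUSTOMER", "COGENT-CUSTOMER"))
```

In this model the IPv4 case is reachable because HE can climb to a provider that peers with Cogent. The IPv6 case is not, since a route learned over one peering is never handed to another peer.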

 

 

Looking up Cogent’s IPv6 prefix 2001:0550::/32, announced from AS174, on Hurricane Electric’s route server:

 

route-server> show bgp ipv6 2001:0550::/32
% Network not in table
route-server>

 

Cogent has no public route server, so I am using their looking glass to ping the IPv6 address of he.net to test connectivity:

PING he.net(he.net) 56 data bytes
From 2001:550:1:31f::1 icmp_seq=2 Destination unreachable: No route
From 2001:550:1:31f::1 icmp_seq=3 Destination unreachable: No route

--- he.net ping statistics ---
5 packets transmitted, 0 received, +2 errors, 100% packet loss, time 14003ms

 

 

Is the dark spot only in IPv6? What about IPv4?

Yes, this problem is specific to IPv6. HE and Cogent do not peer in IPv4 either, but since HE is not a tier 1 in IPv4, it has a couple of transit providers which seem to have a peering relationship with Cogent.

Looking at the route covering Cogent’s IPv4 address 38.100.128.10 from HE’s route server:

route-server> show ip bgp 38.0.0.0/8 long
BGP table version is 0, local router ID is 64.62.142.154
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i38.0.0.0         213.248.92.33           48     70      0 1299 174 i
* i                 213.248.92.33           60     70      0 1299 174 i
* i                 216.218.252.174         48     70      0 1299 174 i
* i                 213.248.86.53           48     70      0 1299 174 i
* i                 213.248.93.81           48     70      0 1299 174 i
* i                 213.248.93.81           48     70      0 1299 174 i
* i                 213.248.67.105          48     70      0 1299 174 i
* i                 213.248.96.177          48     70      0 1299 174 i
* i                 213.248.67.125          48     70      0 1299 174 i
* i                 213.248.70.37           48     70      0 1299 174 i
* i                 213.248.92.33           48     70      0 1299 174 i
* i                 213.248.101.145         48     70      0 1299 174 i

(short extracted view of long output)

 

So HE is clearly using AS1299, Telia’s global network and one of the IPv4 Tier 1 ISPs, to reach Cogent. My guess is that it is a transit provider for HE. At the same time, I can see a route from Cogent to HE in IPv4 via Global Crossing:

traceroute to 216.218.186.2 (216.218.186.2), 30 hops max, 60 byte packets
1 vl99.mag01.ord01.atlas.cogentco.com (66.250.250.89) 0.497 ms 0.444 ms
2 te0-5-0-3.ccr21.ord01.atlas.cogentco.com (154.54.45.193) 0.437 ms 0.569 ms
3 te0-5-0-5.ccr22.ord03.atlas.cogentco.com (154.54.44.162) 0.647 ms te0-5-0-1.ccr22.ord03.atlas.cogentco.com (154.54.43.230) 0.821 ms
4 Tenge4-4-10000M.ar3.CHI2.gblx.net (64.212.107.73) 0.554 ms 0.562 ms
5 Hurrican-Electric-LLC.Port-channel100.ar3.SJC2.gblx.net (64.214.174.246) 54.313 ms 54.016 ms
6 10gigabitethernet1-1.core1.fmt1.he.net (72.52.92.109) 54.792 ms 55.231 ms
7 * *
8 * *

 

So the two networks clearly have IPv4 connectivity via HE’s upstreams Global Crossing (which is now Level 3) and Telia. In IPv6, HE simply does not have a customer relationship with Gblx or Telia, and so the dark spot remains.

 

Another fact supporting that Telia (like Gblx) is a transit provider for HE comes from the RADB records of AS1299.

 

Anurags-MacBook-Pro:~ anurag$ whois -h whois.radb.net as1299 | grep 6939
import: from AS6939 action pref=50; accept AS-HURRICANE
export: to AS6939 announce ANY
mp-import: afi ipv6 from AS6939 accept AS-HURRICANE
mp-export: afi ipv6 to AS6939 announce AS-TELIANET-V6
Anurags-MacBook-Pro:~ anurag$

 

Clearly it is announcing ANY, i.e. every route it has (a full table), to HE on IPv4, while for IPv6 it is announcing only the routes in AS-TELIANET-V6. That means transit in IPv4 but only peering in IPv6.
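The same reading of the RADB output can be sketched in a few lines of Python. This is a rough heuristic, not a full RPSL parser: it treats "announce ANY" as a sign that the neighbour receives a full table (transit) and anything narrower as peering, and it assumes a plain export: line refers to IPv4. The function name classify_exports is made up for this sketch; the input lines are the ones shown above.

```python
import re

# The four policy lines returned by the whois query above.
RADB_LINES = """\
import: from AS6939 action pref=50; accept AS-HURRICANE
export: to AS6939 announce ANY
mp-import: afi ipv6 from AS6939 accept AS-HURRICANE
mp-export: afi ipv6 to AS6939 announce AS-TELIANET-V6
"""

# Matches "export:" and "mp-export:" lines, with an optional "afi <family>" part.
EXPORT_RE = re.compile(
    r"^(?P<attr>export|mp-export):\s+"
    r"(?:afi\s+(?P<afi>\S+)\s+)?"
    r"to\s+(?P<peer>AS\d+)\s+announce\s+(?P<filter>.+)$"
)


def classify_exports(text, neighbour):
    """Map address family -> 'transit' or 'peering' for one neighbour AS."""
    result = {}
    for line in text.splitlines():
        match = EXPORT_RE.match(line.strip())
        if not match or match.group("peer") != neighbour:
            continue
        # Assumption: an export line without an afi keyword describes IPv4.
        family = match.group("afi") or "ipv4"
        announced = match.group("filter").strip()
        # Heuristic: only a transit customer is sent everything (ANY).
        result[family] = "transit" if announced.upper() == "ANY" else "peering"
    return result


print(classify_exports(RADB_LINES, "AS6939"))
```

Registry objects are maintained by hand and can be stale, so a result like this is a hint about the relationship and not proof of it.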

 

With hope that this issue is resolved in near future, time for me to get some sleep! 🙂

 

Disclaimer: The focus of this blog post is not on who is responsible for not peering and creating this situation, but rather on a technical analysis of what happens when big Tier 1 ISPs de-peer.

Comments are personal and have nothing to do with my employer. I know most of the people mentioned in this post personally, and that fact has nothing to do with this blog post!

 

27 Mar

SMW4 Cable outage

Today a friend from Pakistan informed me about an SMW4 outage. He reported issues in Pakistan.

It seems SMW4 is damaged near Egypt, and that is what is causing high load on East Asian routes, resulting in pretty high latency.

 

I am at home, sitting on BSNL’s network, and latency to Europe has jumped terribly to 700-800ms. Right now I do not see a direct route to Europe; traffic is instead taking East Asia > US > Europe routes over other cable networks.

 

A quick look at some traceroutes:

 

To Facebook.com

anurag:~ anurag$ traceroute -a www.facebook.com
traceroute to star.c10r.facebook.com (69.171.229.25), 64 hops max, 52 byte packets
1 [AS65534] router02 (10.10.0.1) 1.759 ms 1.018 ms 0.869 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 18.184 ms 18.809 ms 17.962 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 28.761 ms 28.648 ms 28.352 ms
4 [AS4755] 115.114.57.165.static-mumbai.vsnl.net.in (115.114.57.165) 77.803 ms 63.059 ms 61.319 ms
5 [AS3549] 172.29.250.33 (172.29.250.33) 63.106 ms 62.755 ms 63.853 ms
6 * * *
7 [AS4755] 115.114.85.233 (115.114.85.233) 64.694 ms 63.013 ms 61.133 ms
8 * [AS0] if-7-2.tcore1.cxr-chennai.as6453.net (180.87.36.34) 531.243 ms *
9 [AS0] if-5-2.tcore1.svw-singapore.as6453.net (180.87.12.53) 566.615 ms 906.432 ms *
10 * * *
11 [AS0] if-2-2.tcore1.tv2-tokyo.as6453.net (180.87.180.1) 577.953 ms 542.487 ms *
12 * [AS0] if-9-2.tcore2.pdi-paloalto.as6453.net (180.87.180.17) 538.170 ms 617.144 ms
13 * [AS3549] te1-4-10g.ar1.pao2.gblx.net (208.51.134.97) 673.785 ms *
14 * [AS22566] xe10-3-1-10g.scr3.snv2.gblx.net (67.17.79.169) 563.667 ms 631.657 ms
15 [AS22566] e8-1-20g.ar5.sjc2.gblx.net (67.16.145.118) 554.785 ms * *
16 [AS3549] 64.208.158.30 (64.208.158.30) 535.164 ms 573.485 ms 546.552 ms
17 [AS32934] ae1.bb02.sjc1.tfbnw.net (204.15.21.164) 580.511 ms * 529.838 ms
18 [AS32934] ae12.bb02.prn1.tfbnw.net (74.119.79.109) 543.454 ms 569.572 ms
[AS32934] ae16.bb01.prn1.tfbnw.net (31.13.24.254) 659.153 ms
19 [AS32934] ae1.dr02.prn1.tfbnw.net (74.119.79.107) 567.662 ms *
[AS32934] ae1.dr05.prn1.tfbnw.net (204.15.23.61) 560.851 ms
20 * * *
21 * * *
22 *^C
anurag:~ anurag$

 

Route to Europe:

anurag:~ anurag$ traceroute -a server7.anuragbhatia.com
traceroute to server7.anuragbhatia.com (178.238.225.247), 64 hops max, 52 byte packets
1 [AS65534] router02 (10.10.0.1) 1.797 ms 0.989 ms 1.015 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 21.046 ms 18.046 ms 18.068 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 244.155 ms 28.669 ms 28.922 ms
4 [AS4755] 115.114.57.165.static-mumbai.vsnl.net.in (115.114.57.165) 62.840 ms 61.595 ms 60.564 ms
5 [AS0] 172.31.16.193 (172.31.16.193) 91.433 ms 94.132 ms 94.564 ms
6 [AS6453] if-2-606.tcore1.njy-newark.as6453.net (66.198.70.121) 529.370 ms * *
7 * [AS6453] 66.110.59.66 (66.110.59.66) 566.573 ms *
8 [AS1299] nyk-bb1-link.telia.net (80.91.252.226) 614.390 ms * *
9 * * [AS1299] ffm-bb1-link.telia.net (213.155.131.146) 697.499 ms
10 * [AS1299] mcn-b2-link.telia.net (213.155.134.13) 733.122 ms 721.410 ms
11 * [AS1299] gw02.contabo.net (213.248.101.78) 731.281 ms *
12 * * [AS51167] server7.anuragbhatia.com (178.238.225.247) 702.811 ms
anurag:~ anurag$
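To pinpoint where the extra latency enters the path in the trace to Europe above, here is a small sketch that takes the best (lowest) RTT at each hop and reports the largest increase between consecutive hops. The helper names best_rtts and largest_jump are made up for this sketch, and using the minimum RTT per hop is my own choice to smooth out slow individual probes. Continuation lines without a hop number (as in the Facebook trace) are simply skipped.

```python
import re

# Hop lines copied from the traceroute to server7.anuragbhatia.com above.
TRACE = """\
1 [AS65534] router02 (10.10.0.1) 1.797 ms 0.989 ms 1.015 ms
2 [AS9829] 117.220.160.1 (117.220.160.1) 21.046 ms 18.046 ms 18.068 ms
3 [AS9829] 218.248.169.126 (218.248.169.126) 244.155 ms 28.669 ms 28.922 ms
4 [AS4755] 115.114.57.165.static-mumbai.vsnl.net.in (115.114.57.165) 62.840 ms 61.595 ms 60.564 ms
5 [AS0] 172.31.16.193 (172.31.16.193) 91.433 ms 94.132 ms 94.564 ms
6 [AS6453] if-2-606.tcore1.njy-newark.as6453.net (66.198.70.121) 529.370 ms * *
7 * [AS6453] 66.110.59.66 (66.110.59.66) 566.573 ms *
8 [AS1299] nyk-bb1-link.telia.net (80.91.252.226) 614.390 ms * *
9 * * [AS1299] ffm-bb1-link.telia.net (213.155.131.146) 697.499 ms
10 * [AS1299] mcn-b2-link.telia.net (213.155.134.13) 733.122 ms 721.410 ms
11 * [AS1299] gw02.contabo.net (213.248.101.78) 731.281 ms *
12 * * [AS51167] server7.anuragbhatia.com (178.238.225.247) 702.811 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")      # hop number, then the rest of the line
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")     # every "<number> ms" sample


def best_rtts(trace):
    """Return [(hop, lowest RTT in ms)] for hops with at least one reply."""
    hops = []
    for line in trace.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # continuation line or blank line
        samples = [float(value) for value in RTT_RE.findall(match.group(2))]
        if samples:
            hops.append((int(match.group(1)), min(samples)))
    return hops


def largest_jump(hops):
    """Return (hop, increase in ms) for the biggest rise over the previous hop."""
    return max(
        ((cur[0], cur[1] - prev[1]) for prev, cur in zip(hops, hops[1:])),
        key=lambda item: item[1],
    )


hop, delta = largest_jump(best_rtts(TRACE))
print(f"largest jump at hop {hop}: +{delta:.1f} ms")
```

For this trace the largest increase lands on hop 6, the first Tata router in Newark, which fits the picture of traffic leaving India on a long detour before it ever gets to Europe.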

 

 

The issue does not seem to be isolated to BSNL or Tata; Airtel is affected too.

 

E.g. Airtel Delhi PoP to London:

 

Wed Mar 27 16:28:59 GMT+05:30 2013
traceroute 62.239.237.1

Type escape sequence to abort.
Tracing the route to 62.239.237.1

1 203.101.100.29 [MPLS: Label 716197 Exp 0] 84 msec
182.79.254.242 [MPLS: Label 716197 Exp 0] 84 msec
203.101.95.146 [MPLS: Label 677302 Exp 0] 80 msec
2 125.21.80.161 [MPLS: Label 406905 Exp 0] 156 msec
203.101.95.141 [MPLS: Label 406905 Exp 0] 76 msec
202.56.223.205 [MPLS: Label 406905 Exp 0] 92 msec
3 203.101.95.117 [MPLS: Label 569896 Exp 0] 120 msec 40 msec
203.101.100.205 [MPLS: Label 389360 Exp 0] 52 msec
4 182.79.255.18 92 msec
182.79.255.14 88 msec 88 msec
5 BHA-0007.gw1.sin0.asianetcom.net (203.192.168.53) 176 msec 180 msec 172 msec
6 te0-3-0-0.wr1.sin0.asianetcom.net (61.14.157.233) [AS 10026] 184 msec 184 msec 216 msec
7 gi3-0-0.cr2.nrt1.asianetcom.net (61.14.157.158) [AS 10026] 248 msec 248 msec 244 msec
8 po5-0-0.gw3.lax1.asianetcom.net (202.147.0.38) [AS 10026] 428 msec 432 msec 420 msec
9 linx7.ukcore.bt.net (195.66.224.56) [AS 10026] 388 msec 388 msec 388 msec
10 *
core1-te0-3-0-1.ealing.ukcore.bt.net (62.172.102.2) [AS 2856] 384 msec 392 msec
11 core1-pos1-0.birmingham.ukcore.bt.net (62.172.103.81) [AS 2856] 384 msec 384 msec 384 msec
12 iar1-gig5-4.birmingham.ukcore.bt.net (62.6.196.94) [AS 2856] 392 msec 388 msec 448 msec
13 62.172.57.218 [AS 2856] 384 msec 432 msec 392 msec
14 * * *

 

 

If we look at Tata AS6453’s routing table at Mumbai for a Europe based IP:

BGP routing table entry for 178.238.224.0/21
Bestpath Modifiers: deterministic-med
Paths: (3 available, best #3)
Multipath: eBGP
     11         12
  3356 51167, (aggregated by 51167 gw02.contabo.net.)
    ldn-icore1. (metric 9713) from mlv-tcore2. (66.110.10.215)
      Origin IGP, valid, internal, atomic-aggregate
      Community:
      Originator: ldn-icore1.
  3356 51167, (aggregated by 51167 gw02.contabo.net.)
    ldn-icore1. (metric 9713) from mlv-tcore1. (66.110.10.202)
      Origin IGP, valid, internal, atomic-aggregate
      Community:
      Originator: ldn-icore1.
  3356 51167, (aggregated by 51167 gw02.contabo.net.)
    ldn-icore1. (metric 9713) from cxr-tcore1. (66.110.10.113)
      Origin IGP, valid, internal, atomic-aggregate, best
      Community: 
      Originator: ldn-icore1.



There seems to be a direct path via mlv-tcore1 (Mumbai > Europe), but overall it is less preferred and cxr-tcore1 (Chennai > East Asian route) is given preference. The same applies to most other Europe-based prefixes. I tried pulling data from my RIPE probe #1032 but was not able to log in to the RIPE Atlas site, which is hosted in Europe!

 

That’s all for now. Will post updates as things improve.