21 Jun

Why does Indian internet traffic route outside of India?

After my last post about home networking, I am jumping back into global routing, more specifically how Indian traffic travels around the globe when it does not need to. This is an old discussion among senior management folks in telcos, policymakers, and more. It’s about “Does Indian internet traffic route outside of India?” and, if the answer is yes, then “Why?” and “How much?”

It became a hot topic, especially after the Snowden leaks. There was even an advisory back in 2018 from the Deputy National Security Advisor to ensure Indian internet traffic stays local (news here). Over time this has come up a few dozen times in my discussions with senior members of the Indian ISP community, individuals, and even latency-sensitive gamers. So I am going to document some of that here. I will include only what can be verified publicly and will avoid any private discussions I had with friends in these respective networks. The data, especially traceroutes, will have measurement IDs from RIPE Atlas so they can be independently verified by other network engineers.

Does Indian internet traffic route outside of India?

Yes, a part of it does.

How do we know that’s happening?

Well, a simple traceroute is good enough to reveal the forward path. When triggering traces from large endpoints (on projects like RIPE Atlas) to another set of large endpoints, one can find that. I am going to put some traces showing that behavior as I proceed in this blog post.

Why does that happen?

There can be broadly four reasons why that happens in the Indian context:

  1. Simple misconfiguration, where pools which are supposed to be advertised locally are missed OR a more specific pool is announced to international transits & peers compared to domestic peers/transit. The more specific prefix always wins in routing.

  2. Heterogeneous route filtering – As networks roll out route filtering (based on IRR and RPKI), they typically start with their peers and at a later stage start filtering their customers. For example, Tata Comm AS4755 filters routes from its peers (like Airtel AS9498) based on IRR route objects, but Airtel AS9498 is also a customer of Tata Comm AS6453 outside of India. There are visible cases where Tata Comm AS4755 drops peering routes from AS9498 but accepts the same routes via the Tata Comm global network AS6453 from outside of India. While both AS6453 & AS4755 belong to Tata Comm, at the technical layer AS6453 is upstream of AS4755. Remember AS9498 is a peer of Tata Comm AS4755 in India but a downstream customer of the Tata Comm global backbone AS6453 outside of India.

  3. Paths on which actual traffic flow is just not expected. This is true especially for CDN IPs. CDNs often use DNS magic to map end-users to the nearest node. They typically do not expect users on ISP 1 to be connecting to caching nodes on ISP 2. But if we trace from ISP 1 towards caching IPs on ISP 2 – it might go via outside of India. Ideally, this should not happen because of domestic peering between networks (more on this later in this post), but practically it has a very low/negligible impact on actual traffic flow. This is also often the case for loopback IPs of large operators, where the aggregate is announced somewhere far away & the more specific /32 IPv4 or /128 IPv6 stays local within the network. Again, these do not carry actual traffic.

  4. Lack of peering/interconnection within India

Out of these, the 4th is the largest cause and has a serious impact on end-users. Before jumping into the lack of peering/interconnection, let’s understand how interconnection is supposed to work.
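
The “more specific always wins” rule from reason 1 can be sketched with a small longest-prefix-match lookup. This is a minimal sketch, not how any of the routers above are configured: the prefixes come from the documentation range 203.0.113.0/24, and the next-hop labels and the function name best_route are made up for illustration.

```python
import ipaddress

def best_route(routes, destination):
    """Return the (network, next_hop) pair with the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The most specific (longest) prefix wins, regardless of where it was learned.
    return max(matches, key=lambda m: m[0].prefixlen)

# Hypothetical table: the aggregate is learned from a domestic peer, while a
# more specific /25 was (mis)announced only to an international transit.
routes = [
    ("203.0.113.0/24", "domestic peer"),
    ("203.0.113.0/25", "international transit"),
]

print(best_route(routes, "203.0.113.10")[1])   # covered by the /25
print(best_route(routes, "203.0.113.200")[1])  # covered only by the /24
```

Addresses inside the /25 follow the international path even though a perfectly good domestic route exists for the covering /24.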

A reminder of the transit-free network concept

There are around 16 networks in the world that are known as transit-free / tier 1 networks. These networks typically hold a very large share of the global routing table as their direct or indirect customers. All networks in the world are either directly purchasing IP transit from them or from their customer or from their customer’s customer and so on. These operators peer with each other, and by virtue of peering with each other they can reach any network in the world. Any given network is either their direct/indirect customer or their peer’s direct/indirect customer. The routing table these players collectively hold is often called the Default Free Zone (DFZ). Thus to ensure full internet connectivity, a network must ensure that its routes reach at least one transit-free network as a customer (as a customer so that the transit-free network announces them to the rest of its peers). Once routes reach at least one transit-free network, they are visible in the default-free zone and hence comes the guarantee of reachability. This is just about the concept of reaching every network in the world. The majority of traffic by volume actually does not hit transit-free / tier 1 networks, as it flows from content players to eyeball networks and is often routed via public peering ports over internet exchanges and through private peering (PNI) ports. If you want to learn more about peering, read Mr. Bill Norton’s Dr Peering Playbook.
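
The “as a customer” condition can be sketched with the usual customer/peer/provider export model (often called valley-free routing). It is a simplified model, assuming every BGP session falls into exactly one of these three relationships; the function name and labels below are made up for illustration.

```python
def should_export(learned_from, export_to):
    """Simplified export policy: customer routes go to everyone,
    peer and provider routes go only to customers."""
    relationships = {"customer", "peer", "provider"}
    if learned_from not in relationships or export_to not in relationships:
        raise ValueError("relationship must be customer, peer or provider")
    if learned_from == "customer":
        return True
    return export_to == "customer"

# A transit-free network that learns a route from a customer announces it
# to its peers, so the route becomes visible across the default-free zone.
print(should_export("customer", "peer"))  # True
# A route learned from one peer is not passed on to another peer.
print(should_export("peer", "peer"))      # False
```

This is why a route that reaches a transit-free network only over a peering session does not get the same global visibility as one that arrives over a customer session.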

So how do Indian networks ensure global reachability? They buy IP transit from various transit-free networks. Now any lack of domestic peering won’t lead to blackholing/total drop of traffic but instead to a spillover to the upstreams of these networks outside of the country. Thus to analyze where Indian traffic routes outside, we should focus on networks that connect to networks outside India, especially the networks in the default-free zone.

Which Indian networks connect outside of India?

As of now, there are 3105 ASNs allocated to Indian networks, out of which 2172 ASNs are active and visible in the global routing table when looking via points like RIPE RIS collectors. By looking at the raw routing table & with a test, one can find which of those 2172 ASNs connect outside of India.
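
One way such a test could look: walk every AS path and note each Indian ASN that sits directly next to a non-Indian ASN. This is a sketch under assumptions, not necessarily the exact test used for the list below: the AS paths use documentation ASNs (64496-64511), and which of them count as “Indian” is made up for illustration.

```python
def border_asns(as_paths, domestic_asns):
    """Return domestic ASNs that appear adjacent to a foreign ASN in any path."""
    borders = set()
    for path in as_paths:
        for left, right in zip(path, path[1:]):
            if left == right:
                continue  # skip AS-path prepending
            left_home = left in domestic_asns
            right_home = right in domestic_asns
            if left_home != right_home:
                borders.add(left if left_home else right)
    return borders

# Hypothetical data: 64496-64498 play the role of Indian ASNs.
domestic = {64496, 64497, 64498}
paths = [
    [64510, 64496, 64497],         # foreign ASN -> 64496 -> 64497
    [64511, 64496, 64496, 64498],  # prepended path via 64496
    [64497, 64498],                # purely domestic path
]
print(sorted(border_asns(paths, domestic)))  # [64496]
```

With real data, the paths would come from a routing table dump and the domestic set from the registry's allocation list.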

Indian ASNs which connect to the outside world on layer 3 routing:

  1. Bharti Airtel AS9498
  2. Tata Communications AS4755/AS6453
  3. Jio AS55836/AS64049
  4. Vodafone IDEA AS55410/AS55644
  5. Sify AS9583
  6. BSNL AS9829
  7. Telstra AS4637
  8. Reliance Communications FLAG AS18101/AS15412

(Note: This is purely from the view of layer 3 routing. There can be cases like Powergrid connecting to Nepal/Bangladesh but on layer 1 circuits and hence not visible when looking at the BGP data)

Now this list of networks is important as they appear to be transit for the entire Indian table with 39000+ routes. As long as routing between these networks happens within India, our traffic stays local. Else it would “leak” outside of India. One cannot ensure that 2172 networks all connect to each other, but as long as these specific backbones are connected (in a peering or transit relation) within India, traffic should stay local. Of course, denser peering by their downstreams locally is even better. Folks at DE-CIX India and Extreme IX are doing a good job of promoting that.

Notable cases where Indian traffic is being routed from outside of India:

  1. Traffic between Telstra Reach AS4637 PoPs in India and Airtel AS9498 routes from outside of India (RIPE atlas measurements here and raw traces here).

  2. Some traffic between Telstra Reach AS4637 PoPs in India and Tata Comm AS4755 routes from outside of India (RIPE atlas measurements here and raw traces here). This feels like a config issue within AS4755 because some traces also show direct paths, like this one from their looking glass. Ideally iBGP within AS4755 should have ensured direct local paths, but it seems the Delhi PoP has the routes while Hyderabad is missing them.

  3. Traffic between Sify AS9583 and Airtel AS9498 is routing from outside of India (RIPE atlas measurements here and raw traces here).

  4. Traffic from Airtel AS9498 towards BSNL AS9829 South Indian pools is routing from outside of India (RIPE atlas measurements here and raw traces here). Here, from looking glass data, it seems clear that Airtel AS9498 is rejecting AS9829 routes at NIXI Chennai. All other major operators including Tata Comm, Jio, etc. are picking up those routes, as visible from traces.

  5. Traffic destined to Vodafone IDEA (VI) for many pools is routing from outside of India. This includes cases where traffic is from Airtel, Tata Comm, Jio (RIPE atlas measurements here and raw traces here).

In all these cases, the networks are present at multiple NIXI exchanges, and traditional knowledge says that they should be getting each other’s routes in session via the NIXI route server (forced multi-lateral peering policy). But that’s clearly not happening. In the case of Telstra, routes are clearly visible at NIXI Noida (lookup result here) and thus it seems to be an issue on the Airtel & Tata Comm side. In the case of Sify, their routes are visible across key NIXI PoPs – Noida, Mumbai, and Chennai (lookup here). In the case of the BSNL issue, their South Indian routes are visible at NIXI Chennai (lookup here). In the case of Vodafone IDEA (VI), their routes are not visible at any of the NIXI PoPs and hence the only way for networks to send them traffic is from outside of India.

Is there a real world impact of this sort of awful routing?

Yes, there is. Take, e.g., a trace from Airtel fixed line AS24560 towards the portal of India’s largest state-owned bank, SBI (www.onlinesbi.com – 103.68.221.190), which is hosted on Sify. (RIPE Atlas measurements here)

2021-06-20 22:50 UTC

Traceroute to 103.68.221.190 (103.68.221.190), 48 byte packets

1 192.168.1.1 0.231ms 0.168ms 0.27ms
2 192.168.2.1 0.844ms 1.047ms 0.765ms
3 122.179.4.1 abts-kk-dynamic-001.4.179.122.airtelbroadband.in AS24560 73.924ms 69.936ms 56.354ms
4 122.185.118.173 nsg-corporate-173.118.185.122.airtel.in AS9498 67.181ms 59.309ms 73.925ms
5 182.79.154.0 198.111ms 199.051ms 205.77ms
6 149.14.224.161 be6391.rcr21.b015591-1.lon13.atlas.cogentco.com AS174 202.371ms 208.665ms 201.43ms
7 130.117.2.65 be2053.ccr41.lon13.atlas.cogentco.com AS174 197.046ms 197.73ms 198.251ms
8 154.54.58.174 be2870.ccr22.lon01.atlas.cogentco.com AS174 199.739ms 200.94ms 201.756ms
9 149.6.149.114 AS174 199.097ms 198.14ms 199.624ms
10 100.66.12.5 207.597ms 223.398ms 209.189ms
11 100.66.12.5 206.488ms 206.549ms 208.43ms
12 223.31.161.154 223-30-0-0.lan.sify.net AS9583 207.665ms 206.16ms 207.075ms
13 * * *
14 * * *
15 * * *

Thus requests from Airtel users to the SBI portal route via London. With that, we throw the concerns of national security or localized routing out of the window.
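
The London detour is also visible from latency alone. A minimal sketch: take the lowest RTT per hop from the trace above and find where the biggest jump occurs. The function name is made up, and a large jump is only a hint of a long-haul link, not proof of where a hop is located.

```python
def largest_rtt_jump(hops):
    """Given {hop_number: [rtt_ms, ...]}, return (hop, jump_ms) for the
    biggest increase in minimum RTT between consecutive answered hops."""
    ordered = sorted((hop, min(rtts)) for hop, rtts in hops.items() if rtts)
    best = None
    for (_, prev_rtt), (hop, rtt) in zip(ordered, ordered[1:]):
        jump = rtt - prev_rtt
        if best is None or jump > best[1]:
            best = (hop, jump)
    return best

# RTTs (ms) copied from the trace above; hops 13-15 did not answer.
trace = {
    1: [0.231, 0.168, 0.27],
    2: [0.844, 1.047, 0.765],
    3: [73.924, 69.936, 56.354],
    4: [67.181, 59.309, 73.925],
    5: [198.111, 199.051, 205.77],
    6: [202.371, 208.665, 201.43],
    7: [197.046, 197.73, 198.251],
    8: [199.739, 200.94, 201.756],
    9: [199.097, 198.14, 199.624],
    10: [207.597, 223.398, 209.189],
    11: [206.488, 206.549, 208.43],
    12: [207.665, 206.16, 207.075],
}

hop, jump = largest_rtt_jump(trace)
print(f"Largest jump: {jump:.1f} ms arriving at hop {hop}")
```

For this trace the jump of roughly 139 ms lands on hop 5, right after the Airtel AS9498 router at hop 4 and just before the Cogent London hostnames.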

The sad part here is that instead of acting as a major interconnection hub for Asia, we have poor routing between networks within India. I hope this broken peering ecosystem is fixed eventually.

Disclaimer: This is my personal blog and I am posting here as an individual. Post has nothing to do with my employer.

15 Jun

Designing a high-capacity home network

For the last few months, we have been working on setting up a dedicated room for my home office usage. This gives me the opportunity to plan for changes in the home network.

Key design ideas

  1. Cabling should support needs for now as well as the future, but active hardware stays strictly at what I need for now. Hardware can be upgraded easily. It is not impossible to change cabling in the future since we have ducts (0.75″ ducts everywhere and 1.5″ where it goes towards my office for aggregation), but it’s still a few hours’ task to plan, change cables, etc. Plus, if changing cables, they have to be changed together. Any partial change will be very hard as the 0.75″ ducts get filled with cables. Cables are in a star topology but ducts are daisy-chained (older POTS design).

  2. While it may not be true especially in the Western world where manpower is very expensive but hardware is relatively cheap, here in India good managed switches, good routers with gigabit ports, policy-based routing, etc can be quite expensive compared to the cost of cabling. I would prefer to centrally aggregate circuits instead of putting switches here & there. Thus if I needed 1 port, I targeted to provision 2 ports and associated cabling from that point till the aggregation.

  3. Single place aggregation also has another advantage – it makes it easy to have one single “fat” wall plate with lots of ports and just 1, 2, or 3 port faceplates in the rest of the wall plates. Most of the old telephone boxes are 3 x 3 and can give a max of 2 cat6 ports. One might be able to fit 4 ports if a faceplate is found with those but would be very hard to manage it because there’s barely space to put the extra cable in a 3 x 3 box.

  4. All cables should be terminated on a faceplate. In the past, I ran a network between different rooms without faceplates for 8 years and it gets messy as time goes on. The area gets dirty and becomes a fragile point when you disconnect, clean up, etc. Thus this time I decided that all cables must terminate on a faceplate instead of hanging out of the wall.

  5. For cabling, cat6 made the most sense to me. Cat6 cost 7000 INR for a 300m box. I used around half of the cable and thus it was 3500 INR for 150m / $48 USD for 492 feet. I have the option to return the remaining unused cable.
    I couldn’t find cat6a locally, but pricing hints were way higher than cat6. I can run gigabit right away on cat6 and can easily do a 2 x 1Gbps LACP LAG if needed between key locations (the room where uplinks come in and the office where aggregation will happen). Furthermore, I can run 2.5Gbps & 5Gbps on each leg at some point when I upgrade the switch. I might be able to run 10G on copper, though the longest leg, at around 50m including patch cords, might be marginal. I doubt I will need more than 1Gbps to individual wall jacks where bandwidth is being “consumed” for at least the next 6-7 years. But I might very well need more than 1Gbps between the uplink room (location of core router, GEPON optical units) and the office (aggregation switch). Thus I also got a two-strand single-mode fiber pulled in. That costs around 5.5Rs/meter. It’s a drop cable & I can add connectors to it directly without the need for splicing (though I would still need a cleaver for a sharp 90-degree cut). I might light it up after a few years, especially when both home uplinks are 1Gbps or above. For now, they are 100Mbps (primary) and 50Mbps (secondary). Because fiber is there in the wall, I can in theory run 100Gbps in the far-off future if I need to.

  6. Price of electronics changes very quickly. It makes sense to over-provision on the cable part but I would always avoid over-provision on the router/switch part. The cost of a 10G switch today is probably 10-20x more than what it will be after 6-7 years when I might actually need that kind of capacity.

  7. The highest traffic segment in the home is around the home server, an Intel NUC with a connected 4TB HDD. It does various monitoring, home NAS, site-to-site VPN & more. There are times when traffic touches 1Gbps between the Intel NUC, core switch, and core router. Its port is 1Gbps and thus there is no upgrade path on that except to replace it in the future. Newer NUCs have a 2.5Gbps port. As cloud computing becomes super cheap, I doubt I will be putting super high compute at home in the near future. As of now, the Intel NUC hosts only what has to be hosted locally.

  8. For faceplates, my friend helped me get a Hi-Fi modular plate – 2 x 8 ports each and thus 16-port aggregation. These are basically electric plates with cat6 modules fitted in. So far they are working well. Though I have terminated cat6 countless times, it was a hard job on that one due to excess cabling. The challenge is: if I leave excess cable, 8 “excess cables” per plate make it hard to close the faceplate, while if I leave minimal cable then it would be hard to tweak/fix/make changes in the future. I ended up not cutting the cat6 cable’s tearing string & tied it along the cable with electric tape. That way I can open up the tape anytime & just pull the string to further peel off the outer jacket. For faceplates in the rest of the home, I found Anchor face plates locally. Again, these were also electric plates with the option to add a cat6 jack in place of a single “button”. For patch cables, I prefer to use cat5e instead of cat6 because cat5e is less rigid. I prefer to use the same in datacenter environments as well. 1Gbps runs just fine on that & cable management is a lot easier. Again, all of these or a few can be upgraded to cat6 at some point when I have to run higher speeds (2.5G/5G/10G). I think I made the mistake of not having an extra “box” to hold the extra cables. Since it was a fresh design, that could have been done easily during the build. So in total this was 13 cat6 cables, 2 ends, 8 strands each = 13 x 2 x 8 = 208 punch downs. In reality, it was probably 250 as I had to redo a few things twice due to the limited space in the wall box.
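
The cabling numbers in this list can be double-checked with a quick calculation. All inputs are figures quoted above; only the variable names and the metre-to-feet constant are added here.

```python
METRES_TO_FEET = 3.28084

box_price_inr = 7000  # price of one cat6 box
box_length_m = 300    # cable length in the box
used_m = 150          # roughly half the box was used

cat6_per_metre = box_price_inr / box_length_m
cat6_cost = cat6_per_metre * used_m
used_feet = used_m * METRES_TO_FEET

cables = 13
ends_per_cable = 2
strands_per_end = 8
punch_downs = cables * ends_per_cable * strands_per_end

print(f"cat6: {cat6_per_metre:.2f} INR/m, {cat6_cost:.0f} INR for {used_m} m ({used_feet:.0f} ft)")
print(f"punch downs: {punch_downs}")
```

At roughly 23 INR per metre for cat6 against 5.5 Rs per metre for the fiber drop cable, over-provisioning the passive side stays cheap compared to the active hardware.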

And so here are some pictures!

05 Jun

Remembering M Henri Day & Google Apps forum

Blog post dedicated to my friend M Henri Day from Stockholm, Sweden. Today I learnt that he’s no more and passed away in the first week of December last year. He was one of my few good friends from college days. We both were so-called “power posters or top contributors”, as Google named us in their different forums. I was one of the top contributors in Google Apps (G Suite / Google Workspace) and he was… well, to be honest, after 11 years I don’t even recall which specific Google product he was active on. I think it was Google Bookmarks, Picasa and a few other things. We were super active in those forums for no specific reason other than that it was just fun helping people. Plus that was the time I learnt how DNS works and was very excited to talk about it with everyone. I was out of school, hadn’t performed well & got into a college which was ok. To be honest, college was less fun and life in Radaur was harsh, but somehow I developed a taste for the life there. I documented part of that life in some old posts here and here.

So in a way these groups were a good escape from the harsh life back then. It became more fun in 2011 when Google announced the Top Contributor summit & invited top contributors from all these forums to their HQ in Mountain View. That was fun and, I must say, a trip I still remember in great detail. We had a direct channel with Google for testing products, passing feedback, etc. Henri did not travel there as he was old, plus he had travelled very extensively before the 1990s. He once told me how he travelled through Iran and Pakistan and entered India on foot decades ago. So I missed the opportunity of meeting him there. As time proceeded, I got opportunities, first as a consultant on projects here & there and later as a full-time job. Things moved on. While both of us were less active on the forums after that, we stayed in regular touch.

Henri was a retired psychiatrist and was great at discussing things. We communicated often via Gtalk/Gmail chat / Google Hangouts and emails, running over 2G tethering from my cell phone back then (his end was Bahnhof running on Stokab fiber at 250Mbps! 🙂 ). I would often tell him about challenges at college and the excitement of possible developments in terms of internships, projects & jobs, and he used to encourage me to try various things. His help was immense in keeping up the excitement levels around the “possibilities” at that time. In hindsight I think that is what one needs most when at a smaller college with low exposure to the world and an overall sad ecosystem. Once in a class in the first semester a lecturer told us how he went to Bangalore in the hope of a job, couldn’t get it and was “kicked back”. I remember asking Henri “If he cannot get a job, what’s the point of me being here” and his long reply explaining “who knows what the hell lecturer did and if he actually was at right place or not”. The variety of our discussions was endless.

He would often mention the challenges he was having with Linux, as he used to promote Linux on low-end recycled desktops for elderly people around Stockholm. As time proceeded, so did our discussions. He was getting old, I was getting busy with my job. I recently sent him multiple emails and chat messages. After not hearing back, today I called his landline number & his partner told me that he had passed away. It feels bad, a huge loss of a great friend who helped a lot. I hope he rests in peace. 🙁