Analysing transit free networks
I recently came across this CNBC International interview with Tata Comm CEO Mr Amur S Lakshminarayanan. At the 3:05 mark, he mentions that Tata Comm carries a third of the world’s internet. It would be technically very hard to measure the total traffic on the internet and how much of it passes through Tata Comm’s network, or any other large network in the world. I guess it’s safe to say that he meant routes rather than traffic. Tata Comm AS6453 currently has around 249k IPv4 routes in its customer cone. The global IPv4 routing table is around 968k routes, so Tata Comm’s customer side covers about 25.7% of the global routing table.
Whether traffic for these 25.7% of routes arrives only via Tata Comm depends on whether the downstream is single-homed (only behind Tata Comm) or multi-homed. This made me curious to find the customer cone size of each transit-free player and to measure how much of it is single-homed vs multi-homed.
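To make the single-homed vs multi-homed idea concrete, here is a minimal Python sketch of one way to classify prefixes from AS paths. It is not necessarily the method used for the numbers in this post. It assumes that the transit-free AS closest to the origin in a path is the provider whose customer cone holds the prefix, which ignores peering between a transit-free network and a smaller network. The function names and the sample rows (documentation prefixes, private-use ASNs) are hypothetical.

```python
from collections import defaultdict

# Transit-free ASNs, taken from the table later in this post.
TRANSIT_FREE = {7018, 3320, 6830, 3356, 2914, 5511, 3491, 6453,
                6762, 1299, 12956, 701, 6461, 174, 3257}


def cone_owner(as_path):
    """Return the transit-free ASN closest to the origin, or None.

    Assumption: the last transit-free AS before the origin is treated as
    the provider whose customer cone contains the prefix.
    """
    # Skip the origin (last element) and walk back towards the collector.
    for asn in reversed(as_path[:-1]):
        if asn in TRANSIT_FREE:
            return asn
    return None


def classify(routes):
    """Map each prefix to (label, transit-free networks it sits behind)."""
    owners = defaultdict(set)
    for prefix, as_path in routes:
        owner = cone_owner(as_path)
        if owner is not None:
            owners[prefix].add(owner)
    return {
        prefix: ("single-homed" if len(asns) == 1 else "multi-homed", sorted(asns))
        for prefix, asns in owners.items()
    }


# Hypothetical sample rows: (prefix, as_path as seen by a collector).
sample = [
    ("192.0.2.0/24", [64500, 3320, 6453, 64496]),
    ("192.0.2.0/24", [64501, 6453, 64496]),
    ("198.51.100.0/24", [64500, 6453, 64497]),
    ("198.51.100.0/24", [64501, 1299, 64498, 64497]),
]
result = classify(sample)
for prefix, (label, asns) in sorted(result.items()):
    print(prefix, label, asns)
```

In this sample the first prefix is only ever seen behind AS6453, while the second is seen behind both AS6453 and AS1299, so it counts as multi-homed.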
Clickhouse database
I have recently started playing with the ClickHouse database and am amazed by its performance. I am able to parse a full dump of Oregon Routeviews + RIPE RIS pretty quickly.
A complete dump of all these collectors gives me 749 million routes.
SELECT count(*) FROM bgp.table
Query id: 6820f18d-148b-43fb-a941-8126d1d19004
┌───count()─┐
1. │ 749472159 │ -- 749.47 million
└───────────┘
1 row in set. Elapsed: 0.729 sec. Processed 754.36 million rows, 18.64 GB (1.04 billion rows/s., 25.59 GB/s.)
Peak memory usage: 46.88 MiB.
This also lets me find routes of networks that do not directly feed any of the collectors, like Verizon AS701, Lumen AS3356 etc.
E.g. here’s a count of all IPv4 routes in the Verizon AS701 table, built from routes fed via their peers and downstreams. It returns in under two seconds, and all this on my local machine!
SELECT countDistinct(prefix) FROM bgp.table WHERE has(as_path, 701) AND (NOT (prefix LIKE '%::/%'))
Query id: 2861968a-cf5a-4c9e-aff6-bd122580d55f
┌─countDistinct(prefix)─┐
1. │ 969008 │
└───────────────────────┘
1 row in set. Elapsed: 1.645 sec. Processed 754.36 million rows, 26.04 GB (458.67 million rows/s., 15.83 GB/s.)
Peak memory usage: 174.07 MiB.
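The query above separates IPv4 from IPv6 with a string match on `::/`, which works for typical table entries but may let through an IPv6 prefix written without `::/` (for example a fully expanded host route). As a cross-check outside the database, a small Python sketch can do the same count with the standard `ipaddress` module. The function name and the sample rows are hypothetical; only the `(prefix, as_path)` shape mirrors the table shown in this post.

```python
import ipaddress


def count_ipv4_prefixes_via(routes, asn):
    """Count distinct IPv4 prefixes whose AS path contains `asn`."""
    seen = set()
    for prefix, as_path in routes:
        if asn not in as_path:
            continue
        # strict=False tolerates prefixes with host bits set.
        net = ipaddress.ip_network(prefix, strict=False)
        if net.version == 4:
            seen.add(net)
    return len(seen)


# Hypothetical sample rows (documentation prefixes, private-use ASNs).
rows = [
    ("192.0.2.0/24", [64500, 701, 64496]),
    ("192.0.2.0/24", [64501, 701, 64496]),          # duplicate prefix, other path
    ("198.51.100.0/24", [701, 64497]),
    ("2001:db8::/32", [64500, 701, 64498]),         # IPv6, excluded
    ("2001:db8:0:1:2:3:4:5/128", [701, 64498]),     # IPv6 without "::/", excluded
    ("203.0.113.0/24", [64500, 64499]),             # path does not contain 701
]
ipv4_count = count_ipv4_prefixes_via(rows, 701)
print(ipv4_count)
```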
A more specific query can be even faster, e.g. 1.1.1.0/24 in the Tata Comm AS6453 table:
SELECT * FROM bgp.table WHERE (prefix = '1.1.1.0/24') AND has(as_path, 6453)
Query id: ae1689dd-0921-4ae1-8b9c-ab61ac7db170
┌─prefix─────┬─as_path─────────────────┬─collector────────┐
1. │ 1.1.1.0/24 │ [6453,13335] │ rrc03 │
2. │ 1.1.1.0/24 │ [3320,6453,13335] │ rrc01 │
3. │ 1.1.1.0/24 │ [17660,6453,4755,13335] │ route-views.sg │
4. │ 1.1.1.0/24 │ [271253,6453,13335] │ rrc01 │
5. │ 1.1.1.0/24 │ [1031,999,6453,13335] │ route-views.linx │
6. │ 1.1.1.0/24 │ [1031,6453,13335] │ route-views.flix │
7. │ 1.1.1.0/24 │ [3320,6453,13335] │ route-views.eqix │
8. │ 1.1.1.0/24 │ [3320,6453,13335] │ route-views.isc │
9. │ 1.1.1.0/24 │ [6453,13335] │ route-views.linx │
10. │ 1.1.1.0/24 │ [271253,6453,13335] │ amsix.ams │
└────────────┴─────────────────────────┴──────────────────┘
10 rows in set. Elapsed: 0.012 sec. Processed 65.54 thousand rows, 3.46 MB (5.51 million rows/s., 290.60 MB/s.)
Peak memory usage: 60.70 KiB.
Notice 0.012 seconds!
Disclaimer
Before coming to the data, here are some important points/disclaimers:
- Routes may not always reflect traffic. Take the case of India: Jio AS55836 or Airtel AS24560 have a much lower number of routes but extremely high inbound traffic. Networks like Google AS15169, also a customer of Tata Comm, have very few prefixes and extremely high outbound traffic. So 10-20% extra routes may not reflect 10-20% extra traffic, but 2x-3x more routes would mean something in terms of traffic at tier 1 scale.
- Having single-homed routes is of quite high value, as traffic for those routes has to be carried by that provider, including traffic handed over by other large providers.
- Having multi-homed customer routes is also nice, as it means a given network does not have to rely on its peering partners to reach that part of the internet.
- I looked at routes only from the point of view of transit-free networks. A network taking transit from a single player while peering at an IX may very well offload 80-90% of its traffic, but I assume it to be single-homed as its connectivity to the rest of the non-peered networks is via a single transit.
- The view of the routing table varies depending on where one is looking at it from, so these numbers can easily vary by 1-2%.
- The analysis is based on routing dumps from RIPE RIS and Oregon Routeviews as of Sunday, 28th Feb, 11pm IST (GMT+5:30). The post is coming out a few days later as I didn’t get a chance to publish this data earlier.
- Click here to download raw data if interested.
Current state of multi-homed routes across transit free networks
ASN | AS Name | Total downstream IPv4 routes | Single homed downstream routes | Multi-homed downstream routes
---|---|---|---|---
AS7018 | AT&T | 21197 | 8161 | 13036
AS3320 | DTAG | 22238 | 1122 | 21116
AS6830 | Liberty Global | 22207 | 428 | 21779
AS3356 | Lumen | 417898 | 24400 | 393498
AS2914 | NTT | 336180 | 3559 | 332621
AS5511 | Orange | 114615 | 3260 | 111355
AS3491 | PCCW Global | 138706 | 3571 | 135135
AS6453 | Tata Comm | 264382 | 11179 | 253203
AS6762 | Telecom Italia | 230097 | 5580 | 224517
AS1299 | Arelion | 576574 | 5139 | 571435
AS12956 | Telxius | 60519 | 6196 | 54323
AS701 | Verizon | 60569 | 4161 | 56408
AS6461 | Zayo | 169116 | 2383 | 166733
AS174 | Cogent | 554498 | 13514 | 540984
AS3257 | GTT | 325698 | 3456 | 322242
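The table gives absolute counts, and the single-homed share of each network's downstream routes is easier to compare as a percentage. A short Python sketch, using only the numbers from the table above, computes that share and sorts the networks by it. The function and variable names are just for illustration.

```python
# (ASN, name, total downstream IPv4 routes, single-homed downstream routes)
ROWS = [
    (7018, "AT&T", 21197, 8161),
    (3320, "DTAG", 22238, 1122),
    (6830, "Liberty Global", 22207, 428),
    (3356, "Lumen", 417898, 24400),
    (2914, "NTT", 336180, 3559),
    (5511, "Orange", 114615, 3260),
    (3491, "PCCW Global", 138706, 3571),
    (6453, "Tata Comm", 264382, 11179),
    (6762, "Telecom Italia", 230097, 5580),
    (1299, "Arelion", 576574, 5139),
    (12956, "Telxius", 60519, 6196),
    (701, "Verizon", 60569, 4161),
    (6461, "Zayo", 169116, 2383),
    (174, "Cogent", 554498, 13514),
    (3257, "GTT", 325698, 3456),
]


def single_homed_share(total, single):
    """Percentage of downstream routes that are single-homed."""
    return 100.0 * single / total


# Sort from the highest single-homed share to the lowest.
shares = sorted(
    ((name, asn, single_homed_share(total, single))
     for asn, name, total, single in ROWS),
    key=lambda row: row[2],
    reverse=True,
)
for name, asn, pct in shares:
    print(f"AS{asn:<6} {name:<15} {pct:5.2f}% single-homed")
```

By this measure AT&T has the highest single-homed share of its (comparatively small) downstream, and Arelion the lowest share of a much larger one.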
External network dependency
Comparing total routes in the table with downstream routes shows how much these networks depend on their peering partners.
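As a rough illustration of that comparison, the Python sketch below divides each network's downstream route count by a table size and treats the remainder as routes learned from peers. The denominator is an assumption: it uses the 969008 distinct IPv4 prefixes returned by the AS701 query earlier in the post for every network, rather than each network's own table size, so the percentages will differ slightly from the ones quoted in the conclusion. Only a few networks from the table are included.

```python
# Assumption: one table size for all networks, taken from the AS701
# countDistinct query earlier in the post.
GLOBAL_IPV4_ROUTES = 969_008

# Downstream IPv4 route counts from the table above.
DOWNSTREAM = {
    "Arelion AS1299": 576_574,
    "Cogent AS174": 554_498,
    "Tata Comm AS6453": 264_382,
    "AT&T AS7018": 21_197,
}


def dependency(downstream, total=GLOBAL_IPV4_ROUTES):
    """Return (% of table from customers, % of table from peers)."""
    from_customers = 100.0 * downstream / total
    return from_customers, 100.0 - from_customers


report = {name: dependency(count) for name, count in DOWNSTREAM.items()}
for name, (customers, peers) in report.items():
    print(f"{name:<18} customers {customers:5.2f}%  peers {peers:5.2f}%")
```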
Conclusion
Tata Comm CEO’s claim is not far off from what is visible in the routing table. Instead of a third of the routes, a little over a quarter of the routing table is in their downstream. All networks have a massive dependency on other networks for peering. From the list, Arelion AS1299 seems to have the highest share of its table as customer routes (59.23%), and even they rely on their peering partners for the remaining 40.77%. This data is a warm reminder of the first thing we learn about the Internet at school: “The Internet is indeed a network of networks”. 😀