Google to stop peering via route servers

Earlier this week, Google announced that it will stop peering via route servers. I cannot find the official announcement, as it seems to have been sent to the IXPs. Since Mr. Bill Woodcock (Director at PCH and my former boss) posted it on LinkedIn, I think it’s reliable information.

Various discussions have arisen in the networking community about why Google is making this decision and whether it will impact smaller ISPs. I do not agree with Mr. Woodcock’s comment that it’s a step backwards. It’s important to first understand the models Google follows to peer, and only then speculate on the decision to stop peering at the route servers.

How Google (AS15169) delivers traffic to its peers:

  1. Multi-lateral peering at the IXPs - Google & various eyeball networks peer with IX route servers and exchange traffic. The BGP sessions are with the route servers and not directly with the eyeball networks, but the next hop of each route remains the eyeball network’s own router on the IX fabric.
  2. Bilateral peering at the IXPs - Google & various eyeball networks establish BGP sessions directly over the IX fabric. The exchange route server does not come into the picture and they simply “exchange traffic” over the IXP switch.
  3. PNIs - Private Network Interconnect is established between Google and various large to mid-sized networks. In this case, traffic does not hit an IXP at all and simply goes over the cross-connects between the rack/transport equipment of an eyeball network and Google.
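
The difference between the three models comes down to who the BGP session is with and where the next hop sits. Here is a minimal sketch of that distinction in Python. The `Route` class, the `classify` helper and all addresses (taken from the documentation ranges) are made up for illustration and are not anything Google or an IXP actually runs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    prefix: str        # destination prefix learned over BGP
    session_peer: str  # IP address the BGP session is established with
    next_hop: str      # IP address traffic for the prefix is forwarded to


def classify(route, ix_lan, route_servers):
    """Return which of the three models a learned route belongs to."""
    lan = ip_network(ix_lan)
    if ip_address(route.next_hop) not in lan:
        return "pni"           # traffic never touches the IX fabric
    if route.session_peer in route_servers:
        return "multilateral"  # learned from the RS, forwarded straight to the member
    return "bilateral"         # session and forwarding both point at the member


# Hypothetical addressing from the documentation ranges (RFC 5737).
IX_LAN = "192.0.2.0/24"
ROUTE_SERVERS = {"192.0.2.1", "192.0.2.2"}

routes = [
    Route("203.0.113.0/25", session_peer="192.0.2.1", next_hop="192.0.2.50"),
    Route("203.0.113.128/25", session_peer="192.0.2.60", next_hop="192.0.2.60"),
    Route("198.51.100.0/25", session_peer="198.51.100.129", next_hop="198.51.100.129"),
]

for r in routes:
    print(r.prefix, classify(r, IX_LAN, ROUTE_SERVERS))
```

Note that in the multilateral case the session peer and the next hop are different addresses. That split is what matters for the failure modes discussed further down.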

What Google intends to stop is #1, pushing more and more networks towards #2, i.e. bilateral BGP sessions over the IXPs. Overall this is not surprising, and there are likely good technical reasons behind it. Amazon AWS (AS16509) is known not to peer via route servers either, and many large networks tend to avoid them.

Possible reasons not to use route servers:

  • Route leaks - When exchanging routes with hundreds of networks across various exchanges, any network can pick up a route and leak it to its upstream or peers. The same can happen with bilateral peering as well, but there Google can simply have automation shut down the offending session. A leak via the RS can be a little harder to detect, and shutting down the RS session would impact many networks at once. BGP communities at the RS can be used to limit announcements, but not all IXPs support them, and most of the time they are non-standard as well.
  • Remote peering - Remote peering is a ghost that keeps coming back just when most of us feel it’s dead. Due to remote peering, the next hop within an IX can be 1ms away or 100ms away. There have been proposals in the past to tag remote peers with a specific community, but so far that has not been implemented.
  • Blackholing of traffic - This tends to happen in IXPs with large fabrics spread across multiple datacentres. There are times when the transport does not behave well while the route server keeps announcing the routes. In the case of bilateral sessions, this would simply tear down the BGP sessions, as the TCP connection (port 179) between the peers would fail. But in the case of route servers, if they remain reachable from both sides, members are left with routes whose next hop is simply not reachable. Members will keep trying to resolve ARP for the next-hop IP, but the transport failure will not let it complete.
  • Automation - This may seem weird, but in reality bilateral sessions were discouraged due to their administrative overhead: setting up the sessions, maintaining them, and troubleshooting with peers when they were down. Route servers solve this by offering a single session that carries the routes of many networks. This “effort to set up sessions” falls primarily on Google’s side. It is Google that probably has to maintain over 50k BGP sessions, with the majority of the other networks, while the other side is essentially dealing with a few dozen most of the time. Once sufficient automation is in place to take care of setup, troubleshooting etc., the major negative impact of bilateral peering goes away.
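
The blackholing case is easier to see with a toy model. The sketch below is a simplification under stated assumptions: a BGP session is treated as "up" exactly when its peer address is reachable, and the route server is assumed to still hear the member's routes from its side. `Route`, `after_partition` and the addresses are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    session_peer: str  # who we hold the BGP session with
    next_hop: str      # where packets are actually sent


def after_partition(routes, reachable):
    """Split routes into (installed, blackholed) given the set of IX
    addresses we can still reach across the fabric."""
    # A BGP session survives only while its TCP connection does, so routes
    # from unreachable session peers are withdrawn.
    installed = [r for r in routes if r.session_peer in reachable]
    # An installed route whose next hop cannot be reached drops traffic.
    blackholed = [r for r in installed if r.next_hop not in reachable]
    return installed, blackholed


RS = "192.0.2.1"       # hypothetical route server
MEMBER = "192.0.2.50"  # hypothetical member in another datacentre

via_rs = [Route("203.0.113.0/24", session_peer=RS, next_hop=MEMBER)]
bilateral = [Route("203.0.113.0/24", session_peer=MEMBER, next_hop=MEMBER)]

# Transport towards the member fails; the route server stays reachable.
reachable = {RS}

print(after_partition(via_rs, reachable))     # route stays installed and is blackholed
print(after_partition(bilateral, reachable))  # session drops, nothing installed
```

With the bilateral session the route disappears along with the session, so traffic can fall back to another path. With the route server the route stays in place and the traffic is lost.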

Many of these problems are simply problems of scale. For small ISPs, as well as smaller IXPs, multilateral peering via route servers very much makes sense. With that being said, go and request your bilateral sessions with Google before they stop announcing routes via the route servers.