08 Apr

Espresso: Google's peering edge architecture

Back in 2017, Google shared details about Espresso, their SDN solution for scaling up routing at the peering edge.
I saw this fascinating presentation from Google at SIGCOMM 2017. There is also a blog post that covers it in detail, besides the talk.

Key design principles for their routing platform

  1. Hierarchical control plane with both global and local control. The global controller takes care of overall traffic flow, using inputs such as performance metrics, while the local controller handles failures of BGP sessions, ports, devices etc.
  2. Fail static – any part of the system can fail and the system keeps working as it was before.
  3. Software Programmability

Key features of the Espresso platform

  1. Peers physically terminate on an MPLS switch, while the BGP function runs in software hosted on a set of hosts (servers). Sessions are spread across different hosts to avoid a single point of failure. If a host fails, only a subset of the peering sessions fails, not all of them. Plus, they keep backup hosts in case the primary fails.
  2. A single BGP stack runs in software and the table is held in the server’s RAM, which gives very high scalability for holding large routing tables.
  3. Google “sprays” small amounts of traffic across all available paths (including non-best paths) to build a picture of them, and selects the path based on that data as well as inputs from applications.
  4. This platform proves that SDN is not only for walled gardens and can be used for BGP routing optimisation. Many people believed SDN was for the “internal network” only.
  5. Back in 2017, this platform was being used for around 22% of their existing capacity, and the entire new buildout was using it. Now in 2020, the number is probably much higher.

The talk ended with a nice Q&A where someone asked how they know the capacity of other paths: an “unloaded” path may look fine, but it may choke as soon as they actually send traffic over it. Clearly that does not happen often with Google peering, so I must say their platform is very quick at detecting this and re-routing traffic.

While the presenter did not mention it in response to the question, I think that because the BGP sessions are distributed across various hosts and a large table is carried in such a scalable way, they probably do not have BGP convergence issues. Also, since their traffic is outbound heavy, they can pick the path to send traffic on. This works in all cases where the other side is able to send traffic back to Google (TCP traffic) and the selected path is not dead.

Think about the peer on the other side when you bring up your BGP session with AS15169 next time. 🙂

07 Apr

Manage Wireguard users using Ansible

Day 16 of lockdown here in Haryana due to Covid19. Time for some distraction.


Last week it was reported that Wireguard will be added in the next version of the Linux kernel. I have been using Wireguard for over a year and it has been working great. I replaced OpenVPN with Wireguard for both site-to-site VPN and client-server VPN. If you are looking for a free, open source VPN for remote employees, or just for connecting to your own remote servers, Wireguard can be a really good candidate.

Recently I created a client-server VPN at home so that I can get into the home network whenever I am travelling (which is a little uncommon due to the Covid19 lockdown!).

Somehow I did not find any good automated script to generate keys. I tried a few projects, and either they did not work or they tended to rewrite everything inside the /etc/wireguard directory. I presently run 5 different VPN daemons on my Raspberry Pi. It does site-to-site VPNs to two locations over two different uplinks, and OSPF running on FRR takes care of dynamic routing. For the 5th one, which is the client-server VPN, I put together an Ansible playbook. The idea is to run the playbook each time I want to add a user and provide it with client_name and client_ip (I didn’t automate the client IP since it’s just 4-5 devices at most). The playbook takes care of generating the keys, the config (which can be copy-pasted into Wireguard running on a laptop) and also a QR code, which can be scanned to import the config along with the keys on iOS devices. Ideally I should write a more detailed version as an Ansible role, but that’s just me being lazy and settling for a playbook instead.

Here goes the playbook!

---
  - hosts: ## Put server hostname here ##
    gather_facts: no
    become: yes
    vars:
      client_name: anurag-phone
      client_ip: 10.0.0.10
      client_mask: 24
      client_dns: 10.1.0.5
      wgname: wg5
      wgport: 5005
      work_dir: "/home/anurag/config"
      server_ip: ## Put server IP here ##
    tasks:
      - name: Ensure {{ work_dir }} exists
        file:
          path: '{{ work_dir }}'
          state: directory
      - name: Generate client keys for {{ client_name }}
        shell:
          cmd: wg genkey | tee privatekey | wg pubkey > publickey
          chdir: "{{ work_dir }}"
      - name: Read client privatekey and register into variable
        shell: cat {{ work_dir }}/privatekey
        register: privatekey
      - name: Read client publickey and register into variable
        shell: cat {{ work_dir }}/publickey
        register: clientpublickey
      - name: Read server publickey and register into variable
        shell: cat /etc/wireguard/publickey
        register: serverpublickey
      - name: Add {{ client_name }} to the server
        blockinfile:
          path: '/etc/wireguard/{{ wgname }}.conf'
          marker: "## Added by Ansible"
          block: |
              # {{ client_name }}
              [Peer]
              PublicKey = {{ clientpublickey.stdout }}
              AllowedIPs = {{ client_ip }}/32
      - name: Stop wireguard for {{ wgname }}
        command: wg-quick down {{ wgname }}
        register: wireguardstop
        tags: wireguardrestart
      - debug:
          var: wireguardstop.stderr_lines
        tags: wireguardrestart
      - name: Start wireguard for {{ wgname }}
        command: wg-quick up {{ wgname }}
        register: wireguardstart
        tags: wireguardrestart
      - debug:
          var: wireguardstart.stderr_lines
        tags: wireguardrestart
      - name: Generate client config for {{ client_name }} for full internet access
        blockinfile:
          path: "{{ work_dir }}/{{ client_name }}-full.conf"
          block: |
              [Interface]
              PrivateKey = {{ privatekey.stdout }}
              Address = {{ client_ip }}/{{ client_mask }}
              DNS = {{ client_dns }}
              [Peer]
              PublicKey = {{ serverpublickey.stdout }}
              AllowedIPs = 0.0.0.0/0
              Endpoint = {{ server_ip }}:{{ wgport }}
          state: present
          create: yes
      - name: Generate QR code for {{ client_name }}
        shell: qrencode -t ansiutf8  < {{ work_dir }}/{{ client_name }}-full.conf  > {{ work_dir }}/{{ client_name }}-qr-full
        tags: qr

Some limitations of this playbook:

  1. Cannot be used to delete users. I don’t do that often, so I am OK with deleting them manually, though one could make it a little smarter to handle that. Probably define users within vars and add a check so that keys are not rewritten on each run.
  2. It keeps adding keys to the server-side config, so if it is run twice for the same user and IP, it will add junk. Again, this was a quickly written solution and not an extensively written playbook that tackles that.
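
For anyone who wants to go a step further, here is a minimal and untested sketch of how both limitations could be tackled with standard Ansible features. It assumes the same vars as the playbook above. The per-client key file names (`<client_name>.key`, `<client_name>.pub`), the marker text and the `remove_user` tag are hypothetical choices made for illustration, and the later tasks that read `privatekey` would need to point at the new file names. The idea is to use `creates` so that keys are generated only once per client, and a `blockinfile` marker containing `{mark}` plus the client name so that a re-run replaces that client's block instead of appending another one.

```yaml
# Sketch only: replacement tasks for the key generation and server-side steps.
# Assumes the vars from the playbook above: client_name, client_ip, wgname, work_dir.

- name: Generate keys for {{ client_name }} only if they do not exist yet
  shell:
    # umask keeps the private key readable by its owner only
    cmd: >-
      umask 077 &&
      wg genkey | tee {{ client_name }}.key | wg pubkey > {{ client_name }}.pub
    chdir: "{{ work_dir }}"
    # Task is skipped when this file is already present
    creates: "{{ work_dir }}/{{ client_name }}.pub"

- name: Read client public key
  slurp:
    src: "{{ work_dir }}/{{ client_name }}.pub"
  register: clientpub

- name: Add or update {{ client_name }} on the server
  blockinfile:
    path: "/etc/wireguard/{{ wgname }}.conf"
    # {mark} expands to BEGIN / END, giving each client its own managed block
    marker: "# {mark} ansible peer {{ client_name }}"
    block: |
      [Peer]
      PublicKey = {{ clientpub.content | b64decode | trim }}
      AllowedIPs = {{ client_ip }}/32

# Runs only when explicitly requested with --tags remove_user
- name: Remove {{ client_name }} from the server
  blockinfile:
    path: "/etc/wireguard/{{ wgname }}.conf"
    marker: "# {mark} ansible peer {{ client_name }}"
    state: absent
  tags: [never, remove_user]
```

With something like this, the per-user values could also be passed at run time instead of editing vars, for example `ansible-playbook wg-user.yml -e client_name=anurag-laptop -e client_ip=10.0.0.11`, where the playbook file name, client name and IP are made up. The Wireguard interface still needs a restart after adding or removing a peer, as in the original playbook.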

The key objective here was just to generate keys, insert the client’s public key into the server-side config and the server’s public key into the client-side config. And of course to make the config available in text and QR code form so that one can import it and then delete it.