10 Jul

Calculating IPv6 subnets outside the nibble boundary

This question often comes up in subnetting discussions with friends who are deploying IPv6 for the first time: how do you calculate subnets outside the 4-bit nibble boundary? It also happens to be one of the starting points of the APNIC IPv6 routing workshop, where I occasionally instruct as a community trainer.

So what is a Nibble boundary?

In the IPv6 context, a nibble is 4 bits, and any change in a multiple of 4 bits is easy to calculate. Here’s how: let’s say we have an allocation: 2001:db8::/32. Taking slices from this pool on a 4-bit boundary is quite easy:
/36 slices (1 x 4 bits)
/40 slices (2 x 4 bits)
/44 slices (3 x 4 bits)
/48 slices (4 x 4 bits)
and so on…
Clearly, this is much simpler, and that is one of the reasons we strongly recommend subnetting on the nibble boundary and not outside it for all practical use cases. However, it helps to understand why it’s easy this way, as well as how to subnet outside the nibble boundary when you have to, say if you are running a very large network and have a /29 allocation from an RIR.
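To see how regular the nibble-aligned slices are, here is a minimal sketch using Python’s standard ipaddress module. The helper name first_slices is just something made up for this illustration.

```python
import ipaddress
from itertools import islice

allocation = ipaddress.ip_network("2001:db8::/32")

def first_slices(network, new_prefix, n=2):
    """Return the first n subnets of `network` with the given prefix length."""
    return [str(s) for s in islice(network.subnets(new_prefix=new_prefix), n)]

# Every extra nibble (4 bits) of prefix multiplies the number of slices by 16.
for new_prefix in (36, 40, 44, 48):
    count = 2 ** (new_prefix - allocation.prefixlen)
    print(f"/{new_prefix}: {count} slices, starting with {first_slices(allocation, new_prefix)}")
```

On a nibble boundary each step simply moves one hexadecimal digit, which is why these slices can be written down by hand.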

Going back to fundamentals

An IPv6 address is 128 bits long and is written in hexadecimal.
IPv6 address:  _ _ _ _: _ _ _ _ :_ _ _ _ :_ _ _ _ :_ _ _ _ :_ _ _ _ :_ _ _ _ :_ _ _ _ 
Each dash here is one hexadecimal digit and represents 4 bits, thus 4+4+4+4 = 16 bits in each block and 16 x 8 = 128 bits in the whole address. This is where that magical 4-bit nibble boundary comes from.
So if we expand 4 bits into binary, we get the following combinations for each “dash” in the above representation:

0 0 0 0 
0 0 0 1
0 0 1 0
0 0 1 1 
0 1 0 0 
0 1 0 1
0 1 1 0
0 1 1 1 
1 0 0 0 
1 0 0 1 
1 0 1 0
1 0 1 1 
1 1 0 0 
1 1 0 1
1 1 1 0
1 1 1 1

Here I have simply listed the 4 bits from lowest to highest. Just like in the decimal system with base 10 (which we are all familiar with), we follow the same logic in the binary system: we start from the lowest (0 0 0 0), go to the next number (0 0 0 1) and, since it’s base 2, the next one after that is (0 0 1 0), and so on. When we change all 4 bits together, we can work with whole hexadecimal digits and do not have to worry about the individual bits, but as soon as we go inside the 4-bit zone, we have to deal with this bit-level counting.
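The same table can be generated with a few lines of Python, which also shows which hexadecimal digit each 4-bit pattern maps to. This is only an illustrative sketch; the name nibble_table is made up here.

```python
# Map every hexadecimal digit to its 4-bit binary pattern.
nibble_table = {f"{value:x}": f"{value:04b}" for value in range(16)}

for hex_digit, bits in nibble_table.items():
    print(f"{bits} -> {hex_digit}")
```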
So let’s take the real-world case of the American cable and broadband provider Comcast. They have the allocation 2001:558::/31:

NetRange: 2001:558:: - 2001:559:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
CIDR: 2001:558::/31
NetHandle: NET6-2001-558-1
Parent: ARIN-001 (NET6-2001-400-0)
NetType: Direct Allocation
OriginAS: AS7922
Organization: Comcast Cable Communications, LLC (CCCS)
RegDate: 2003-01-06
Updated: 2016-08-31
Ref: https://whois.arin.net/rest/net/NET6-2001-558-1

What exactly does /31 mean here?

Going back to CIDR fundamentals, /31 means the first 31 bits are reserved (fixed as the network part) and the remaining 128 - 31 = 97 bits are available. How can they generate /32s or, say, /36s out of this allocation?
Writing the prefix in expanded form:
2001:0558 (16 bits + 15 bits)
Above, the first 16 bits are reserved for 2001, but for the next part, “0558”, only 15 bits are reserved. Let’s expand the 2nd block further:
0 5 5 8 – 15 bits reserved
Here “0” gives 4 bits (in binary 0 0 0 0)
5 gives 4 bits (in binary 0 1 0 1)
The next 5 also reserves 4 bits
So far we are at a count of 4 + 4 + 4 = 12 bits. Since 15 bits are reserved, 3 bits of the “8” are reserved and the remaining 1 bit is available for modification.
Let’s expand 8:
8 in hexadecimal = 1 0 0 0 in binary. Here 1 0 0 is reserved (each digit being one binary bit, hence the three bits) and the 4th bit can vary.
Hence possible combinations in binary are:
1 0 0 0
1 0 0 1
The first three bits (1 0 0) cannot be altered as they are part of the network prefix. Now 1 0 0 0 in binary gives us “8” in hexadecimal and 1 0 0 1 gives us “9”. Thus the possible /32s out of this /31 allocation are:
2001:558::/31 = 2001:558::/32  and 2001:559::/32
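The bit counting above can be reproduced with plain integer arithmetic. This is a rough sketch of the reasoning, not a general-purpose tool; the variable names are invented for the example and it only looks at the second 16-bit block.

```python
PREFIX_LEN = 31
GROUP_BITS = 16

second_group = 0x0558                  # the "0558" block of 2001:0558::/31
fixed_bits = PREFIX_LEN - GROUP_BITS   # 15 bits of this block belong to the prefix
free_bits = GROUP_BITS - fixed_bits    # 1 bit is left to vary

# Mask that keeps the 15 fixed bits and clears the free one.
mask = ((1 << fixed_bits) - 1) << free_bits
base = second_group & mask

print(f"{second_group:016b}")          # the block written out bit by bit

# Count through every value of the free bit(s) to get the /32s.
candidates = [f"2001:{base | i:x}::/32" for i in range(1 << free_bits)]
print(candidates)
```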
Similarly, to calculate /36 slices from it, we vary this 1 bit (as we just did) as well as the next 4 bits (a 5-bit variation). Hence the possible /36 slices are:
2001:558::/36
2001:558:1000::/36
and so on until 2001:558:f000::/36 (16 pools here)
and next,
2001:559::/36
2001:559:1000::/36
and so on until 2001:559:f000::/36 (16 pools here). Thus we get 32 /36 blocks out of the /31 allocation.
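If you prefer to let a library do the counting, Python’s standard ipaddress module can enumerate the same /36 slices. A small sketch:

```python
import ipaddress

comcast = ipaddress.ip_network("2001:558::/31")

# 36 - 31 = 5 bits vary, so we expect 2**5 = 32 slices.
slices = [str(s) for s in comcast.subnets(new_prefix=36)]

print(len(slices))
print(slices[0], "...", slices[15])    # the 16 pools under 2001:558
print(slices[16], "...", slices[31])   # the 16 pools under 2001:559
```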
That’s all about IPv6 subnetting. Once you understand this part, you should be just fine with subnetting in the future. 🙂

20 Sep

IPv6 allocations to downwards machine with just one /64

One of my friends went for a VM with a German hosting provider. He got a single IPv4 address (quite common) and a /64 of IPv6. A /64 per VM/end server used to be OK until a few years back, but these days running applications inside LXC containers (OS-level virtualization) makes more sense. This gives the option to maintain a separate hosting environment for each application. I personally do that a lot and, in fact, the blog you are reading right now is itself on an LXC container.

anurag@server7:~$ sudo lxc-ls -f |grep websrv1
[sudo] password for anurag:
websrv1.server7.core.anuragbhatia.com RUNNING 1 - 2402:b580:1:4:1:1:1:1, 2402:b580:1:4::abcd

So my friend tried a similar setup, but it got tricky for him because he had just a single /64 from upstream. I have a /32 and originate a /48 from this location, giving me over 65k /64s of IPv6 for any testing and random fun application deployments.
The challenges in his setup were the following:

  1. One can use the available 18 quintillion IPv6 addresses in the /64 by bridging the internal container interface with the upstream interface. That’s OK for IPv6 but fails terribly for IPv4: many people do not need a dedicated IPv4 address per container, while for IPv6 it’s fun to have and gives so much flexibility. For IPv4, a custom setup makes more sense, with specific DST NAT and a reverse proxy for port 80 and port 443 traffic.
  2. For NATing IPv4, a separate virtual interface (veth) makes sense so that one can run private IPv4 addressing. Subnetting a /64 sounds stupid and weird to begin with, but even if one does it, it won’t work, because the main /64 allocation is delivered via layer 2 and is not a routed pool. Read further on why.

So after our discussion my friend decided to use a /112 for the containers (ugly, I know, but the datacenter provider quoted 4-5 Euro/month for an additional /64!). A /112 out of the 128-bit IPv6 address leaves 16 bits, i.e. 2^16 = 65,536 (65k) IPv6 addresses to use on containers, which is a good number, with a few limitations like:

  1. Many things support only a /64; for instance, use of IPv6 in OpenVPN sticks with that due to the Linux kernel implementation.
  2. IPv6 autoconfiguration heavily depends on it. In my own personal setup I have a dedicated /64 for the container interfaces and radvd takes care of autoconfig via router advertisements. With anything smaller than a /64 (i.e. a longer prefix) that’s not possible.
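The /112 arithmetic mentioned above is easy to check with Python’s ipaddress module. The sketch below uses the documentation prefix 2001:db8:1:4::/64 as a stand-in, since the real /64 from the provider is not shown here; the variable names are made up as well.

```python
import ipaddress

# Placeholder for the single /64 handed out by the hosting provider (assumption).
upstream = ipaddress.ip_network("2001:db8:1:4::/64")

# Take the first /112 out of the /64 for the container network.
container_net = next(upstream.subnets(new_prefix=112))

print(container_net)                  # the carved-out /112
print(container_net.num_addresses)    # 2**16 addresses

veth_ip = container_net[1]            # first IP, for the veth interface on the host
container_ip = container_net[2]       # second IP, for the first container
print(veth_ip, container_ip)
```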

So we broke the allocated /64 into a /112, assigned the first IP out of that to the veth-based interface and used the 2nd IP on a container. IPv4 was working fine on the container with SRC NAT, but IPv6 connectivity wasn’t. The container was able to reach the host machine but nothing beyond that.
I realised the issue was the layer 2 based allocation, which relies on IPv6 NDP. The container’s IPv6 address had internal reachability with the host machine, but whenever a packet came from the internet, the L3 device of the VM provider wasn’t able to send it further because there was no entry for that IP in its NDP table. The problem wasn’t just with the container’s IPv6 address but with any IPv6 address used on any other interface of the VM (whether a virtual veth or even loopback). Adding the IPv6 address on eth0 (which was connected to the upstream device) made IPv6 work, but then it could not be used further on a downstream device like a container. The datacenter provider offered to split the /64 into /65s and route the 2nd /65 for a monthly charge (ugly!!!).
So we ended up with a nasty workaround: proxy NDP. This is very similar in concept to proxy ARP in IPv4. It required enabling proxy NDP in sysctl.conf (the net.ipv6.conf.all.proxy_ndp setting) and then adding a proxy NDP entry for each specific IPv6 address (given as a single address, without a prefix length) using: ip neigh add proxy xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx dev eth0
This works, with the extra step of adding a proxy NDP entry for each IPv6 address in use. In general a routed pool is way better, and if I had to make a choice on behalf of his datacenter provider, I would have gone for a /64 for point-to-point connectivity and a routed /48. At Hurricane Electric (the company I work for) we offer IPv6 free of charge so that networks can grow without having to worry about address space or do nasty things like the one I described above. 😉
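Since a proxy NDP entry is needed for every container address, it can help to generate the commands rather than type them by hand. Below is a hypothetical helper that only prints the commands; it does not run them. The addresses are placeholders from the documentation prefix, eth0 is assumed to be the upstream-facing interface, and proxy NDP is assumed to be enabled already via sysctl.

```python
import ipaddress

def proxy_ndp_commands(addresses, upstream_if="eth0"):
    """Build one `ip -6 neigh add proxy` command per container address.

    Assumes net.ipv6.conf.all.proxy_ndp=1 is already set on the host.
    """
    commands = []
    for addr in addresses:
        ip = ipaddress.IPv6Address(addr)   # raises ValueError on anything that is not a plain IPv6 address
        commands.append(f"ip -6 neigh add proxy {ip} dev {upstream_if}")
    return commands

# Placeholder container addresses (assumption, not the real ones).
container_ips = ["2001:db8:1:4::2", "2001:db8:1:4::3"]

for cmd in proxy_ndp_commands(container_ips):
    print(cmd)
```

Validating each entry as a single address catches the easy mistake of pasting a prefix such as a /64 into the command.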
Haven’t deployed IPv6 on your application yet? Do it now!
Time to get back to work and do more IPv6 🙂

25 Mar

How to subnet IPv6?

Subnetting IPv6 sounds very complex but, to be honest, it is very easy!
All you need is to understand the basics of IPv6 addressing: how an address is formed and how to use CIDR notation efficiently.
Firstly, what does an IPv6 address look like? (good to clear fundamentals first!)
An IPv6 address has 8 sections separated by colons, and each section carries 4 hexadecimal digits. So an IPv6 address is something like:
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx – Each x can have a hexadecimal value, i.e. from 0 to 9 and a to f, thus 16 possible values for each x. Each x is stored in binary, i.e. as bits that are 0 or 1 (2 possible values each), and since 2x2x2x2 = 16, each x takes 4 bits. With 4 digits per section, the number of bits per section turns out to be 4 x 4 = 16 bits. Thus we have 16 bits per section and 8 sections in total. This turns out to be 16 + 16 + 16 + 16 + 16 + 16 + 16 + 16 bits = 128 bits. This is why an IPv6 address has 128 bits.
This means the total number of possible addresses in the IPv6 space is 2^128 = 340 282 366 920 938 463 463 374 607 431 768 211 456 addresses.
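The counting above is simple enough to verify in a few lines of Python. The constant names are only for illustration.

```python
HEX_DIGITS_PER_SECTION = 4
BITS_PER_HEX_DIGIT = 4      # 16 possible values per digit = 2**4
SECTIONS = 8

bits_per_section = HEX_DIGITS_PER_SECTION * BITS_PER_HEX_DIGIT
total_bits = bits_per_section * SECTIONS
total_addresses = 2 ** total_bits

print(bits_per_section, total_bits)
print(f"{total_addresses:,}")
```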
Next, an important point to remember here: client networks in IPv6 are mostly based on a /64 subnet, which means the first 64 bits go to the network part while the next 64 bits go to the host part, i.e. the usable IPv6 addresses which are allocated to end machines.

Now, getting back to the main question: how to subnet IPv6?

In most cases RIRs like ARIN/APNIC allocate a /32 IPv6 block. This means the first 2 sections (16+16 bits) are reserved and the remaining 6 sections, i.e. 128-32 = 96 bits, are available for use.
E.g. let’s pick the example of Google’s block.
Google has an allocation of 2404:6800::/32 from APNIC in Asia.
Now this is a HUGE chunk.
First let’s understand what the range of 2404:6800::/32 looks like. The :: here means that zeros are skipped, and thus we can fill in the zeros to understand the block.
2404:6800::/32 means 2404:6800:0000:0000:0000:0000:0000:0000/32, and since only the first 32 bits (16 bits per section) are reserved, we have the first 2 sections reserved while the remaining 6 sections are available and we can fill any hexadecimal value in those sections.
Thus the block 2404:6800::/32 goes from 2404:6800:0000:0000:0000:0000:0000:0000 to 2404:6800:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
Now that’s a huge amount of address space. You can simply count it by doing 2 to the power 96 (128-32), which is 79 228 162 514 264 337 593 543 950 336 unique possible addresses!
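To double check the range and the count, here is a short sketch with Python’s standard ipaddress module:

```python
import ipaddress

block = ipaddress.ip_network("2404:6800::/32")

first = block[0].exploded     # first address, with all zeros written out
last = block[-1].exploded     # last address in the block

print(first)
print(last)
print(block.num_addresses == 2 ** (128 - 32))
```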

Breaking it down further…

  1. If you have a multi-datacenter setup, it is very likely that you would like to use IPv6 space across multiple locations, and thus doing a single BGP announcement for the whole /32 isn’t a very good idea.
  2. Many people on the NANOG mailing list suggested that I use /48 blocks, as a /48 works well with BGP and most ISPs do accept a /48 block.
  3. Most servers are allocated a /64 block of IPv6 further down.

So in an ideal situation, you would break your /32 allocation into multiple /48s, which you can announce via BGP, and then into /64s out of each /48 for allocation per server/per client.
At this point you are likely wondering how many such small blocks are possible out of the main bigger block.
OK, here’s the answer. You can break a /32 into 65,536 /48s. Each can represent a separate network behind a BGP session. Next, you can further break each /48 block into 65,536 /64s and allocate each /64 to a client. Thus each client will have 2^64 addresses, i.e. 18 446 744 073 709 551 616 addresses per client!
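These numbers all come from the same rule: every extra bit of prefix length doubles the number of blocks. A tiny sketch (the function name subnet_count is made up for this example):

```python
def subnet_count(parent_prefix, child_prefix):
    """Number of child-prefix blocks that fit into one parent-prefix block."""
    return 2 ** (child_prefix - parent_prefix)

print(subnet_count(32, 48))     # /48s in a /32
print(subnet_count(48, 64))     # /64s in a /48
print(subnet_count(64, 128))    # single addresses in a /64
```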

Let’s break it!

Coming back to the example of Google’s block, 2404:6800::/32: to get /48s out of the block, all you need to do is change the 3rd section. Remember, as each section represents 16 bits, altering the 3rd section gives 16+16+16 = 48 bits. Thus possible /48s out of 2404:6800::/32 will be, for example,
2404:6800:1::/48
2404:6800:2::/48
Also, since it takes hexadecimal values, we can put a, b, c, d, e and f:
2404:6800:a::/48
One can also use the complete combination to fill all 4 digits, i.e.
2404:6800:XXXX::/48
Here XXXX can take 65,536 hexadecimal values.
Next, in a similar manner, altering the 4th section gives /64s. Possible /64s out of Google’s IPv6 block are, for example:
2404:6800:1:1::/64
2404:6800:1:2::/64
Next, each client can alter the last 4 sections and generate a ton of IPv6 addresses!
E.g. unique IP addresses: 2404:6800:1:1::1, which is 2404:6800:1:1:0000:0000:0000:0001,
2404:6800:1b11:21dd:00ab:0030:0020:0001 or just anything!
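The same breakdown can be walked through with Python’s ipaddress module. This is only a sketch for exploring the block; the choice of 2404:6800:1::/48 as the example site is arbitrary.

```python
import ipaddress
from itertools import islice

block = ipaddress.ip_network("2404:6800::/32")

# Changing the 3rd section walks through the /48s.
first_48s = [str(n) for n in islice(block.subnets(new_prefix=48), 3)]
print(first_48s)

# Changing the 4th section walks through the /64s of one /48.
site = ipaddress.ip_network("2404:6800:1::/48")
first_64s = [str(n) for n in islice(site.subnets(new_prefix=64), 3)]
print(first_64s)

# Any address a client builds from the last 4 sections stays inside its /64.
host = ipaddress.ip_address("2404:6800:1:1::1")
client_net = ipaddress.ip_network("2404:6800:1:1::/64")
print(host in client_net)
```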

Quick point to remember here:

  1. If you alter JUST the last, i.e. the 8th, section you can have 65,536 (2^16) IPs.
  2. If the letters in hexadecimal values confuse you, you can simply take last-section values from 0 to 9999, i.e. 10k possible IPs, by just altering the last section without hexadecimal letters.
  3. It’s a good idea to alter just the last section and fill zeros in the 5th, 6th and 7th sections, because 10k IPs would be sufficient per server and one can always add more later.
  4. Also, when filling 0 in the 5th, 6th and 7th sections, one can simply use the double colon notation, i.e. 2404:6800:1:1:0000:0000:0000:0001 can be written as 2404:6800:1:1::1, skipping all zeros!
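The points above can be checked quickly in Python: the ipaddress module converts between the long and the double colon form, and a short loop confirms the 10k figure for last-section values written with digits only.

```python
import ipaddress

addr = ipaddress.ip_address("2404:6800:1:1:0000:0000:0000:0001")

print(addr.compressed)   # short form with ::
print(addr.exploded)     # long form with every zero written out

# Count the last-section values (0x0000 to 0xffff) that contain no letters a-f.
decimal_only_count = sum(1 for v in range(0x10000) if f"{v:x}".isdigit())
print(decimal_only_count)
```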

Well that’s all about IPv6 addressing. Hope you will find it useful! 🙂