Why is object storage getting exciting?

Last year saw many interesting developments, and one of them has been in object storage. For those unaware, object storage is the de facto form of cloud storage: it stores data as objects rather than in a file system hierarchy. This allows simple plug-and-play horizontal scalability. It became popular when Amazon Web Services (AWS) launched S3. The idea was straightforward: pay-as-you-go storage, with a charge of a few cents/GB/month to store data and a few cents/GB to egress it. No need to plan storage, hard disks, storage servers or rack capacity; just a simple pay-as-you-go opex cost. Plus, top-tier cloud players offer data redundancy. The API replies with “success” on uploads only once the data has been replicated to multiple datacenters.

The S3 API became popular enough that it now acts as an industry standard and is offered by hundreds of companies, including Google Cloud, Microsoft Azure, Backblaze, OVH & many more. There is robust client-side library support, & clients like rclone offer easy plug-and-play storage.


The dark side of object storage

One of the dark sides of object storage since its launch has been egress pricing. Almost all players offer free ingress, i.e. user > cloud upload is free but cloud > user download is charged. The charge is only a few cents/GB, but at a large multi-TB scale it can get pretty expensive. Take AWS S3, for example (pricing here): it costs $0.023/GB/month (i.e. 1.9 INR/GB/month) to store data, and egress is $0.09/GB (i.e. 7.44 INR/GB). In many applications (like backups), where one retrieves only a fraction of the uploaded data, this works well. But for applications that pull data regularly (like streaming), it can get extremely expensive ($90/TB).

This high egress pricing applies not just to storage but to most products from the top-tier clouds, including CDN offerings, egress from compute VMs, etc. It acts as a lock-in feature for the top-tier clouds. Their customers can get cheaper compute, database hosting, storage, CDN, etc. elsewhere, but moving traffic back & forth would have such a massive impact on the bill, due to egress, that it’s just not worth it.

Cost calculation of video streaming & the walled garden issue

Let’s look at a simple cost calculation for, say, 1 million views of a video/discussion/podcast/stream of, say, 100MB in size.

(100/1000)GB x 10,00,000 views x $0.09/GB egress = $9000
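
The same arithmetic can be sketched as a small Python function, which makes it easy to plug in other file sizes, view counts or egress prices. The function name and its parameters are made up for illustration, and it assumes decimal units (1 GB = 1000 MB) as in the calculation above.

```python
def egress_cost_usd(size_mb: float, views: int, price_per_gb: float) -> float:
    """Estimated egress bill in USD for serving one file `views` times."""
    size_gb = size_mb / 1000  # decimal GB, matching the calculation above
    return size_gb * views * price_per_gb


# 100MB file, 1 million views, $0.09/GB egress (the AWS S3 price quoted earlier)
cost = egress_cost_usd(size_mb=100, views=1_000_000, price_per_gb=0.09)
print(f"${cost:,.0f}")  # prints $9,000
```

At 1/10th of that egress price the same traffic would still come to roughly $900, which is why a flat zero on egress changes the picture so much.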

This would not make sense for the majority of content creators, and hence they simply push their content via YouTube/Facebook/Instagram. For individuals that is not a problem, but media organisations going online lose direct contact with their customers. They now have YouTube/FB/others sitting between them & their viewers, and are bound by the terms of those players w.r.t. advertising.


Market for non-top-tier storage is getting hot

For the last few years, Cloudflare has been openly criticising AWS for its massive egress pricing. While I do not agree with all their points, in a broad sense they are right about AWS having a massive markup on egress. To put its money where its mouth is, Cloudflare launched R2, which has zero egress charges. The zero part is what is exciting here. There are many cheaper options than the top-tier clouds which charge 1/10th as much for egress, but even 1/10th is still a pretty high cost when looking at terabyte- or petabyte-scale egress.


Here are some popular options for object storage outside of flagship options:

| Provider | URL | Storage cost per GB per month | Egress cost per GB* | Comment |
|---|---|---|---|---|
| Backblaze B2 | Website | 0.41 INR / $0.005 | 0.82 INR / $0.01 | One of the cheapest per-GB storage options & fairly well documented in how it works. Check out Backblaze’s open source Reed-Solomon erasure coding. |
| Cloudflare R2 | Website | 1.23 INR / $0.015 | Free egress! | Newer player in the market, yet to be tested over time, but exciting. Has absurd access limits when accessing via their domain r2.dev; check out the limits here. No limits when using your own domain, but that forces the use of their DNS (as far as I can see) & there is no way to just point an A or CNAME record, or even do a sub-zone NS delegation. |
| iDrive E2 | Website | 0.33 INR / $0.004 | Free egress! | Cheapest storage price for active storage. Massively subsidizes the first year of storage. In my few days of testing, I found their Frankfurt zone to be a bit overloaded, but the others seem to be working well. Another negative here is that there is not much documentation about how resiliency is maintained. |
| AWS Glacier Deep Archive | Website | 0.081 INR / $0.00099 (that is, 81.45 INR / $0.99 per TB per month) | Complicated formula, check here | Cheapest overall storage, but cold. It archives data & it can take hours before the data can be accessed. No one knows what they use for it. Some of the chatter among sysadmins points to either robot-managed tape drives or optical disks. It’s a good option for data that is stored once & retrieved very rarely. |
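
To get a feel for how these prices play out, here is a rough Python sketch that ranks the options by monthly bill for a given amount of stored data and egress. The USD prices are the ones quoted in this post (with AWS S3 taken from the earlier section as the baseline). The function names are hypothetical, and the model deliberately ignores request fees, free tiers, minimum storage durations and first-year discounts, so treat the output as a ballpark rather than a quote. Glacier Deep Archive is left out because its retrieval pricing does not reduce to a single per-GB figure.

```python
# USD prices as quoted in this post; assumed to be flat per-GB rates.
PRICES = {
    "AWS S3":        {"storage": 0.023, "egress": 0.09},
    "Backblaze B2":  {"storage": 0.005, "egress": 0.01},
    "Cloudflare R2": {"storage": 0.015, "egress": 0.0},
    "iDrive E2":     {"storage": 0.004, "egress": 0.0},
}


def monthly_cost_usd(provider: str, stored_gb: float, egress_gb: float) -> float:
    """Storage plus egress cost for one month, ignoring all other fees."""
    price = PRICES[provider]
    return stored_gb * price["storage"] + egress_gb * price["egress"]


def compare(stored_gb: float, egress_gb: float) -> list[tuple[str, float]]:
    """Return (provider, monthly cost) pairs, cheapest first."""
    costs = {name: monthly_cost_usd(name, stored_gb, egress_gb) for name in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


# Example workload: 1 TB stored, 5 TB pulled out per month.
for name, cost in compare(stored_gb=1000, egress_gb=5000):
    print(f"{name:<14} ${cost:,.2f}")
```

For an egress-heavy workload like this one, the egress term dominates the AWS S3 bill, while for the zero-egress providers only the storage term remains.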

Enjoy offloading your storage to “cloud” 😄