I was at the Google Next 2025 conference, and they've unveiled a zonal bucket version of GCS and what seems to be a gRPC interface over Google Colossus for Rapid Storage.
Very cool! This makes Google the only major cloud that has low-latency single-zone object storage, standard regional object storage, and transparently-replicated dual-region object storage - all with the same API.
For infra systems, this is great: code against the GCS API, and let the user choose the cost/latency/durability tradeoffs that make sense for their use case.
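To make that concrete, here's roughly what it looks like with the standard Python client (bucket names and locations are just examples, and I'm assuming the new bucket types keep the same read/write surface - which is the whole point):

    # pip install google-cloud-storage
    from google.cloud import storage

    client = storage.Client()

    # Example location choices: a single region and a predefined dual-region.
    # Presumably a zonal/rapid bucket would slot in the same way once available.
    regional = client.create_bucket("example-regional-bucket", location="US-CENTRAL1")
    dual = client.create_bucket("example-dual-region-bucket", location="NAM4")

    # Application code is identical regardless of which bucket it's handed.
    def put_and_get(bucket, key, payload):
        bucket.blob(key).upload_from_string(payload)
        return bucket.blob(key).download_as_bytes()

    assert put_and_get(regional, "example/object", b"hello") == b"hello"
    assert put_and_get(dual, "example/object", b"hello") == b"hello"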
Sure, but AFAIK S3’s multi-region capabilities are quite far behind GCS’s.
S3 offers some multi-region replication facilities, but as far as I’ve seen they all come at the cost of inconsistent reads - which greatly complicates application code. GCS dual-region buckets offer strongly consistent metadata reads across multiple regions, transparently fetch data from the source region where necessary, and offer clear SLAs for replication. I don’t think the S3 offerings are comparable. But maybe I’m wrong - I’d love more competition here!
Isn't S3 Express a different API? You have to use a "directory bucket", which isn't an object store anymore, as it has actual directories.
To be honest I'm not actually sure how different the API is. I've never used it. I just frequently trip over the existence of parallel APIs for directory buckets (when I'm doing something niche, mostly; I think GetObject/PutObject are the same.)
The cross-region replication I’ve seen for S3 (including the link you’ve provided) is fundamentally different from a dual-region GCS bucket. AWS is providing a way to automatically copy objects between distinct buckets, while GCS is providing a single bucket that spans multiple regions.
It’s much, much easier to code against a dual-region GCS bucket because the bucket namespace and object metadata are strongly consistent across regions.
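For example, the usual read-after-write and list-after-write checks just work against a dual-region bucket, even though replication spans two regions behind the scenes (standard Python client; bucket and object names are made up):

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("example-dual-region-bucket")  # e.g. created with location="NAM4"

    bucket.blob("logs/2025-04-09.json").upload_from_string(b"{}")

    # Object metadata is strongly consistent: the new object is immediately
    # visible to list and stat operations, whichever region serves the request.
    names = [b.name for b in client.list_blobs(bucket, prefix="logs/")]
    assert "logs/2025-04-09.json" in names
    assert bucket.get_blob("logs/2025-04-09.json") is not None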
The semantics they are offering are very different from S3. In Colossus a writer can make a durable 1-byte append and other observers are able to reason about the commit point. S3 does not offer this property.
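To illustrate the property (a hypothetical, in-memory sketch of the semantics - not the actual Rapid Storage / Colossus gRPC API, which I haven't seen):

    # Hypothetical sketch of durable-append semantics; NOT a real client API.
    class AppendableObject:
        def __init__(self):
            self._data = bytearray()
            self._committed = 0  # the commit point every reader can rely on

        def append(self, payload: bytes) -> int:
            """Durably append; returns the new commit point (offset)."""
            self._data += payload
            self._committed = len(self._data)
            return self._committed

        def read(self, upto: int) -> bytes:
            """Readers can safely consume anything below the commit point."""
            return bytes(self._data[:min(upto, self._committed)])

    log = AppendableObject()
    offset = log.append(b"\x01")        # a durable 1-byte append
    assert log.read(offset) == b"\x01"  # any observer at this commit point sees it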
FYI this was unveiled at the 2025 Google Next conference, and they're apparently unveiling a gRPC client for Rapid Storage, which appears to be a very thin wrapper over Colossus itself, as this is just zonal storage.
I kind of thought you meant ZNS / https://zonedstorage.io/ at first, or its more recent, better, awesomer counterpart, Host Directed Placement (HDP). I wish someone would please, please advertise support for HDP; it sounds like such a free win, tackling so many write amplification issues for so little extra complexity: just say which stream you want to write to, and writes to that stream will go onto the same superblock. Duh, simple, great.
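The closest thing you can play with from userspace today is Linux's per-fd write-lifetime hints, which are a much coarser cousin of the per-stream placement you're describing. Rough sketch (the numeric constants are copied from linux/fcntl.h because I don't think Python's fcntl module exports them):

    # Tell the kernel/device that data written through this fd has a similar
    # lifetime, so it can be grouped together on flash.
    import fcntl, os, struct

    F_SET_RW_HINT = 1036          # linux/fcntl.h: F_LINUX_SPECIFIC_BASE + 12
    RWH_WRITE_LIFE_SHORT = 2      # e.g. WAL / temp data
    RWH_WRITE_LIFE_EXTREME = 5    # e.g. cold, long-lived segments

    def set_write_hint(fd, hint):
        # The hint is passed as a u64 by pointer, hence the packed buffer.
        fcntl.fcntl(fd, F_SET_RW_HINT, struct.pack("Q", hint))

    fd = os.open("/tmp/wal.log", os.O_WRONLY | os.O_CREAT, 0o644)
    set_write_hint(fd, RWH_WRITE_LIFE_SHORT)
    os.write(fd, b"short-lived record\n")
    os.close(fd)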
They charge $20/TB/month for basic cloud storage. You can build storage servers for $20/TB flat. If you add 10% for local parity, 15% free space, 5% in spare drives, and $2,000/rack/month overhead, then triple everything for redundancy, then over a 3-year period the price of using your own hard drives is $115/TB while Google's price is $720. Over 5 years it's $145 versus $1,200. And that's before they charge you massive bandwidth fees.
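Back-of-envelope version of that math, with my own guess of roughly 6,000 TB usable per rack plugged in (the parent doesn't give a rack density, so the 5-year figure lands a bit under $145):

    # Per-usable-TB cost over the lifetime of the hardware.
    HW_PER_TB = 20.0 * 1.10 * 1.15 * 1.05  # drives + parity + free space + spares (~$26.57/TB)
    RACK_PER_MONTH = 2000.0
    TB_PER_RACK = 6000.0                   # assumption, not from the parent comment
    REPLICATION = 3                        # "triple everything"

    def diy_cost_per_tb(months):
        hardware = HW_PER_TB * REPLICATION
        rack = RACK_PER_MONTH * months * REPLICATION / TB_PER_RACK
        return hardware + rack

    def gcs_cost_per_tb(months, per_tb_month=20.0):
        return per_tb_month * months

    print(diy_cost_per_tb(36), gcs_cost_per_tb(36))  # ~116 vs 720
    print(diy_cost_per_tb(60), gcs_cost_per_tb(60))  # ~140 vs 1200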
"Zonal" relates to the concept of "availability zones" which are the next-smallest unit below a (physical) "region."
Most instances of a cloud resource created in a region are allocated and exist at the zonal level (i.e., in a specific zone of that region).
A physical "region" usually consists of three or more availability zones, and each zone is physically separated from the others, limiting the potential for a foreseeable disaster event to affect multiple zones simultaneously. Zones are close enough, networking-wise, to have high-throughput, low-latency interconnection, but not as fast as same-rack or same-cluster communication.
Systems requiring high availability (or replication) generally attain it by placing instances (or replicas) in multiple availability zones. Multi-zone replication is the usual starting point; systems with even higher availability requirements may use multi-region replication, which comes at greater cost.
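In scheduling terms that just means never putting two replicas of the same thing in the same zone; a trivial sketch (zone names are only examples):

    ZONES = ["us-central1-a", "us-central1-b", "us-central1-c"]

    def place_replicas(shard, replicas=3):
        # One replica per zone, so a single-zone outage leaves the others intact.
        if replicas > len(ZONES):
            raise ValueError("more replicas than zones; add zones or another region")
        return [f"{shard}@{zone}" for zone in ZONES[:replicas]]

    print(place_replicas("shard-0"))  # ['shard-0@us-central1-a', ...]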
> Struggling to find a definition, but seemingly zonal just means there's a massive instance per cluster.
There are a number of zones in a region. A region usually means a city; a zone can mean a data center, and in rare cases just means some sort of isolation (separate power / network).
In Google Cloud parlance, "regional" usually means "transparently master-master replicated across the availability zones within a region", while "zonal" means "not replicated, it just is where it is."
This could actually speed up some of my scientific computing (in some cases, data localization/delocalization is an important part of overall instance run-time). I will be interested to try it.
Glad to see the zonal object store take off. Such massive bandwidth will redefine data analytics, with 99% of all queries able to run on a single node faster than what distributed compute can offer.
This link makes so much more sense than the previous link did.
SSDs with high random I/O speeds are a significant contributor to the advantage. I think the 20M writes per second are likely distributed over a network of drives to make that kind of speed possible.
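Rough sanity check, assuming something like 100k-500k random-write IOPS per NVMe drive (my guess, not a published figure):

    target_wps = 20_000_000
    for per_drive_iops in (100_000, 250_000, 500_000):
        print(per_drive_iops, "IOPS/drive ->", target_wps // per_drive_iops, "drives")
    # -> roughly 40-200 drives' worth of raw write IOPS, before replication/overhead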
Is S3 Express One Zone's performance greatly improved over standard S3, the way GCP Rapid Storage's is? My understanding is that S3 Express One Zone is just more cost effective.
> 20x faster random-read data loading than a Cloud Storage regional bucket.
Update: just read this article[1], which clarifies S3 Express One Zone. Yes, performance is greatly improved, but storage actually costs 8x more than a standard S3 bucket. The name "S3 Express One Zone" is terrible and a bit misleading about the pricing difference.
> For infra systems, this is great: code against the GCS API, and let the user choose the cost/latency/durability tradeoffs that make sense for their use case.
Absurd claim. S3 Express launched last year.
https://cloud.google.com/blog/products/storage-data-transfer...
s3: https://aws.amazon.com/pm/serv-s3
s3 express: https://aws.amazon.com/s3/storage-classes/express-one-zone/
cross-region replication: https://docs.aws.amazon.com/AmazonS3/latest/userguide/replic...
Did find some interesting recent (March 28th, 2025) reads though!
Colossus under the hood: How we deliver SSD performance at HDD prices https://cloud.google.com/blog/products/storage-data-transfer...
[1] https://www.warpstream.com/blog/s3-express-is-all-you-need