I’ve learned (sometimes the hard way!) that every design choice comes with real trade-offs. There’s no magic database architecture that optimizes every dimension (e.g., scalability, performance, ease of use) simultaneously.
Social media often pushes us into oversimplified "winner vs. loser" narratives, but this hides the actual complexity of building great infrastructure.
Recognizing and respecting these differences makes us smarter engineers, better community members, and frankly, just more enjoyable people to chat with.
PS Thank you for helping me add a new book to my list :-)
I know it's popular to bash the HackerNews hivemind, and often it's honestly deserved, but this line is in bad taste. The comment was not only polite and professional, it was also right. They had to introduce a columnar storage format (hypertables) to make it work. That is exactly what the comment and the follow-up comment suggest.
I can't imagine the comments if we had chosen that...
"ClickBench evaluates databases using a single table of clickstream data, representative of workloads like web analytics, BI, and log aggregation. It also favors full-table large scans and large-scale aggregations on denormalized data.
Real-time analytics inside applications is different and needs a new benchmark." [0]
This is why we published RTABench. [1]
We believe that it is more representative of real-time analytical workloads.
[0] https://www.tigerdata.com/blog/benchmarking-databases-for-re...
"The future is already here, it's just not very evenly distributed" - William Gibson
But this one, while interesting, was not a practical choice at all, from what I gather reading the blog. The reason given for not using ClickHouse, which they already use for analytics, was vague and ambiguous. ClickHouse does support JSON, which can be rewritten into a more structured table using a materialized view (MV). Aggregation and other performance tuning steps are the bread and butter of using ClickHouse.
The decision to go with Postgres, learn its already-known limitations the hard way, and then keep using it by bringing in a new technology (Timescale) does not sound good, assuming that Cloudflare at this point might already have lots of internal tools for monitoring ClickHouse clusters.
ClickHouse was fast but required a lot of extra pieces for it to work:
PostgreSQL with TimescaleDB did the job. Why overcomplicate things?