jamesblonde · 7 months ago
Data warehousing is quickly becoming a commodity through open-source. I know a company that had 2PB+ of data in Cloudera. But instead of moving to the cloud (and Databricks), they saved 5X on costs by building their own analytics platform with Iceberg, Trino and Superset. The k8s operators are enterprise quality now. On-premises S3 is good, too. You can have great hardware (servers with 128 CPUs and 1 TB) and networking. It's not just Trino. StarRocks and Clickhouse have enterprise grade k8s helm charts/operators. That 60bn valuation is an albatross around Databricks' neck - their pricing will have to justify it, and their core business is commoditizing.

Neon filled their product gap of not having an operational (row-oriented) DB.

richardw · 7 months ago
Not commoditising for enterprise. My last gig wouldn’t allow open source software or any company that might not be there in a decade, or which kept data anywhere but our own tenant. We’d look for the “call us” pricing rather than hate it, which I normally do. We added databricks and it was considered one of my top three achievements, because they don’t have to think about data platforms again, just focus on using it. It’s SO expensive for an enterprise to rejig for a new platform that you can’t rely on (insert open source project here).

I managed to add one startup and so far it’s done very well, but it was an exceptional case and the global CEO wanted the functionality. But it used MongoDB and ops team didn’t have any skills, so rather than learn one tiny thing for an irrelevant data store they added cash to use Atlas with all the support and RBAC etc etc. They couldn’t use the default Azure firewall because they only know one firewall, so added one of those too. Also loaded with contracts. Kept hiring load down, one number to call, job done. Startups cost is $5-10k per year. Support BS about $40k. (I forget the exact numbers but it dwarfed the startup costs.)

Startups are from Venus, enterprise are from Jupiter.

antruok · 7 months ago
Enterprise also often wants a full data platform (like Databricks), not a plain data warehouse.
jeffbee · 7 months ago
> My last gig wouldn’t allow open source software or any company that might not be there in a decade

I bet they had VMware all over the place.

adolph · 7 months ago
> Not commoditising for enterprise. My last gig wouldn’t allow open source software or any company that might not be there in a decade, or which kept data anywhere but our own tenant.

Hence IBM talking up Iceberg: https://www.ibm.com/think/topics/apache-iceberg

hlpn · 7 months ago
Totally agree. Happy open source StarRocks user here using the k8s operator for customer-facing analytics on terabytes of data. There's very little need for Databricks in my world.
anilshanbhag · 7 months ago
Looking at StarRocks site (https://www.starrocks.io/), they compare against Clickhouse, Druid and Trino. Don't even compare against Spark/Databricks! Guess Spark is just not competitive.
lars_francke · 7 months ago
Anyone looking for an open-source Cloudera alternative based on Kubernetes operators. We're building one (~5 years old now): https://stackable.tech/ & https://github.com/stackabletech/

On-premise open-source S3 is a problem though. MinIO is not something we're touching and other than that it looks a bit empty with enterprise ready solutions.

SOLAR_FIELDS · 7 months ago
Don’t SeaweedFS and ceph/rook also offer this? Ceph/rook is definitely enterprise ready
__turbobrew__ · 7 months ago
> On-premise open-source S3 is a problem though

Rook/ceph with object storage is pretty bulletproof: https://www.rook.io/docs/rook/v1.17/Storage-Configuration/Ob...

I do wish more systems had high quality operators out there. A lot of operators I have looked into are half baked, not reliable, or not supported.

pjdbruin · 7 months ago
Great to see cost-effective alternatives to Cloudera and Databricks! We’ve spent three years building IOMETE, a self-hosted data lakehouse that combines Apache Iceberg and Spark, designed to run natively on Kubernetes. We’re focused on on-premises deployments to address the growing need for data sovereignty and low TCO, with a streamlined setup for large-scale analytics. Early adopters are seeing strong results. Curious about your experience with Trino and Superset—any tips for optimizing performance at scale?
bittermandel · 7 months ago
Wouldn't Rook be a good solution? It's definitely proven in much larger settings than Minio, as it's just Ceph.
matt-p · 7 months ago
What's wrong with minio out of curiosity? Ceph an option?
kwillets · 7 months ago
It's been a commodity for decades now. Metrics like price-performance have a long history, but the SnowBricks products fail at them quite dramatically. The difference is hard-sell vs. soft or no-sell.
datadrivenangel · 7 months ago
Not having to buy an appliance and pay for it up front is quite a valuable option. Also the split between processing and storage allows for better archival and scaling strategies.
jflkdsjlcsuq · 7 months ago
I have been happily using ClickHouse for the past couple of years without any issue. Rock solid database with wide variety of features and fulfills all my needs. My favourite is the "external dictionary" feature which easily allows me to integrate it with other datastores like Postgres and Redis.
datadrivenangel · 7 months ago
But why would you buy an operational DB from Databricks? The only thing that makes sense is Databricks flailing to maintain market cap.
antruok · 7 months ago
In addition to the AI use cases, sometimes you wanna share the data warehouse data in oltp way for fast lookups and high concurrency. Not sure whether Neon will do that but I hope so.

One example from Snowflake is hybrid tables which adds rowstore next to columnar.

OLAP + OLTP = HTAP
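
To make the rowstore-next-to-columnar idea concrete, here is a minimal sketch in plain Python. It is not how Snowflake hybrid tables or Neon are implemented; the sample data, column names, and helper functions are all made up for illustration. It only shows why a row layout suits point lookups and a column layout suits aggregations.

```python
from typing import Any

# Hypothetical sample data; the column names are assumptions for illustration.
ROWS: list[dict[str, Any]] = [
    {"id": 1, "customer": "acme", "amount": 120.0},
    {"id": 2, "customer": "globex", "amount": 75.5},
    {"id": 3, "customer": "acme", "amount": 30.0},
]


def build_row_store(rows: list[dict[str, Any]]) -> dict[int, dict[str, Any]]:
    """Row-oriented layout: one full record per primary key (OLTP-style lookups)."""
    return {row["id"]: row for row in rows}


def build_column_store(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Column-oriented layout: one list per column (OLAP-style scans)."""
    columns: dict[str, list[Any]] = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


row_store = build_row_store(ROWS)
column_store = build_column_store(ROWS)

# A point lookup touches a single record in the row store.
order = row_store[2]

# An aggregation scans a single column in the column store.
total_amount = sum(column_store["amount"])
```

An HTAP system, roughly speaking, keeps both layouts in sync so each query type can use the one that suits it.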

ako · 7 months ago
ETL to bring all your data into Databricks/Snowflake is a lot of effort. Much better if your OLTP data already exists in Databricks and you directly access it from your OLAP layer.
swyx · 7 months ago
if Databricks just wanted a row DB they couldve done postgres themselves. paying this much for Neon i think is a sign that Neon has something special they want (which, knowing their marketing line, is "independently scalable storage and compute for postgres")
yencabulator · 7 months ago
Easy quick cheap forks of database state for AI agents to muck with.
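
A rough sketch of why such forks can be cheap, assuming a copy-on-write design: a branch stores only the pages written on it and reads everything else from its parent. The `Branch` class and page names below are hypothetical and are not Neon's actual storage implementation.

```python
from typing import Optional


class Branch:
    """Toy copy-on-write branch: reads fall through to the parent, writes stay local."""

    def __init__(self, parent: Optional["Branch"] = None) -> None:
        self.parent = parent
        self.pages: dict[int, str] = {}  # only pages written on this branch

    def read(self, page_id: int) -> str:
        # Walk up the ancestry until some branch has its own copy of the page.
        branch: Optional[Branch] = self
        while branch is not None:
            if page_id in branch.pages:
                return branch.pages[page_id]
            branch = branch.parent
        raise KeyError(page_id)

    def write(self, page_id: int, data: str) -> None:
        # Writes never touch the parent, so the fork is isolated from it.
        self.pages[page_id] = data

    def fork(self) -> "Branch":
        # Forking copies no data; it only records the parent pointer.
        return Branch(parent=self)


main = Branch()
main.write(1, "users v1")
main.write(2, "orders v1")

agent = main.fork()
agent.write(2, "orders v2 (agent)")
```

Here `agent` holds one page of its own and `main` is unchanged. This toy lets later writes on the parent show through to the child; a real system would pin the fork to a point in time.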
t0mas88 · 7 months ago
That sounds like AWS Aurora?
dustingetz · 7 months ago
"Time is the denominator"
data_marsupial · 7 months ago
what do they use for ETL?
Robdel12 · 7 months ago
I applied to neon last week and then the news broke about the acquisition. They rejected it this morning — I have never been happier to receive a rejection to an application.

This would’ve been three acquisitions straight for me and… I’m okay, they’re awful. I just want stability.

Congrats to the neon team! I use and love neon. Really hope this doesn’t change them too much.

tedivm · 7 months ago
I got hired at Kenna Security a month before they were acquired by Cisco and it was such a miserable experience that I won't work for any company the Kenna leadership are involved with, nor would I ever consider working at Cisco.
jhickok · 7 months ago
I've been through two now, and for one of them nothing much changed, and the other one I was basically lost in a stack of papers for a year. Can I ask what made the experience miserable for you?
no_wizard · 7 months ago
Had personally the opposite experience. Acquisitions being one of the most interesting times to be hired into.

In a couple cases I’ve been recruited because I have a history of scaling and integrating acquisitions into companies successfully

Robdel12 · 7 months ago
The first acquisition I was a part of wasn’t too bad! But we were still culturally very different. So after 2 years and properly transitioning things, I bounced to another startup.

Walking into something like that is tough because the two teams sort of don’t like each other and you’re really “neither”. I’d want to make sure I was interviewed by both teams

bicx · 7 months ago
I've been part of an acquisition as a first-year engineering manager, during which I had to navigate two subsequent rounds of layoffs. I was also part of the group that helped restructure teams and make calls on who to keep. Morale was terrible, and the cultures also did not gel at all.

It led to some serious burnout and I took several months off. I'm now happily working as an IC again.

gopalv · 7 months ago
> Really hope this doesn’t change them too much.

My guess is that this team gets rolled into Online Tables tech, which would make product sense.

https://docs.databricks.com/aws/en/machine-learning/feature-...

jamesblonde · 7 months ago
Yes, that is what I expect, too. They have been paying for DynamoDB and CosmosDB for a few years now. However, Neon is not competitive latency/throughput-wise for real-time workloads, needed for high end AI (like personalized recommendations). There are a few others I would have expected like Cockroach, Aerospike, or RonDB.
swyx · 7 months ago
what if you had joined at neon's previous valuation (whatever it was) and got a sudden payday (assuming you had juuuust enough vesting)
Robdel12 · 7 months ago
I was a very early employee at the other two start ups that were acquired and even with equity it was not worth it. After all the class A shares were paid out, the rest of us got little.

I mean, hindsight 20/20 here, but I would have loved the theoretical money @ 1 billion. But those are so rare and my experience in the past 15 years hasn’t matched those unicorns.

Basically I’ve come to the conclusion that unless you have serious equity or you’re a founder, acquisitions suck. You’re the one doing the work making these two companies come together, while the founders usually bounce or are stripped of any real power to change things.

flanked-evergl · 7 months ago
Maybe unrelated but Databricks is the most annoying garbage I have ever had to use. It fascinates me how anyone uses it by choice.
mritchie712 · 7 months ago
Databricks started in 2013 when Spark sucked (it still does) and they aimed to make it better / faster (which they do).

The product is still centered on Spark, but most companies don't want or need Spark, and a combination of Iceberg and DuckDB will work for 95% of companies. It's cheaper, just as fast or faster, and way easier to reason about.

We're building a data platform around that premise at Definite[0]. It includes everything you need to get started with data (ETL, BI, datalake).

0 - https://www.definite.app/

isignal · 7 months ago
Aren't the alternatives you mentioned - Iceberg and DuckDB - both storage solutions while Spark is a way to express distributed compute? I'm a bit out of touch with this space, is there a newer way to express distributed compute?
MOARDONGZPLZ · 7 months ago
Databricks is the Jira of dealing with data. No one wants to use it, it sucks, there are too many features to try to appease all possible users but none of them particularly good, and there are substantially better options now than there were not long ago. I would never, ever use it by choice.
winwang · 7 months ago
What options do you use? I don't work for Databricks but I am building my own data infra startup, so I'd like to hear what "good" looks like!
swalsh · 7 months ago
Really hard disagree. Coming from hadoop, databricks is utopia. It's stable, fast, scales really well if you have massive datasets.

The biggest gripe I have is how crazy expensive it is.

willvarfar · 7 months ago
Spark was a really big step up from hadoop.

But these days just use trino or whatever. There are lots of new ways to work on data that are all bigger steps up - ergonomically, performance and price - over spark as spark was over hadoop.

DebtDeflation · 7 months ago
Hadoop was fundamentally a batch processing system for large data files that was never intended for the sort of online reporting and analytics workloads that the DW concept addressed. No amount of Pig and Hive and HBase and subsequent tools layered on top of it could ever change that basic fact.
winwang · 7 months ago
If cost (or perf) is the issue, we're building a super-efficient, GPU-accelerated, easy-to-use Spark: https://news.ycombinator.com/item?id=43964505
robertkoss · 7 months ago
I used to be a big fan of the platform because back in 2020 / 2021 it really was the only reasonable choice compared to AWS / Azure / Snowflake for building data platforms.

Today it suffers from feature creep and too many pivots & acquisitions. That they are insanely bad at naming features doesn't help either.

kristjansson · 7 months ago
I’d settle for only one bad name per feature from them. Alas, they don’t feel so limited
winwang · 7 months ago
I'm building another Spark-based choice now with ParaQuery (GPU-accelerated Spark): https://news.ycombinator.com/item?id=43964505
apwell23 · 7 months ago
Is hosting spark really that groundbreaking ? Also isn't spark kind of too complicated for 90% of enterprisey data-processing .

I really don't understand the valuation for this company. Why is it so high.

teetertater · 7 months ago
Yes, spark is too complicated for most cases;

But if you're inclined to use it, databricks' setup of spark just saves you an incredible amount of time that you'd normally waste on configuration and wiring infrastructure (storage, compute, pipelines, unified access, VPNs etc). It's expensive and opinionated, but the cost of the data engineers you'd need to deal with constant spark OOM errors is greater. Also databricks' default configs give you MUCH better performance out of the box than anything DIY, and you don't have to fiddle with partitions and super niche config options to get even medium workloads stable.

isoprophlex · 7 months ago
The market for IBM-like software and platforms (everyone else uses this! It must be good!) apparently wasn't saturated yet
viccis · 7 months ago
They push Serverless so hard but there are SO MANY limitations and surprise gotchas. It's driving me absolutely insane.
datadrivenangel · 7 months ago
And it tends to be notably more expensive! 4-5x the price for less features...
hacliff · 7 months ago
Hey, what are the most painful limitations/gotchas you're hitting? I'm on this team and would like to hear about painpoints.
sh34r · 7 months ago
TBH it's really quite boring. You just have to go back in time to the late 2010s. They had an excellent Spark-as-a-Service product, at a time when you'd have better luck finding a leprechaun than a reliable self-hosted Spark instance in an enterprise environment. That was simply beyond the capabilities of most enterprise IT teams at the time. The first-party offerings from the hyperscalers were relatively spartan.

Databricks' proprietary notebook format that introduced subtle incompatibilities with Jupyter was infuriating embrace-extend-extinguish style bullshit, but on-prem cluster instability causing jobs to crash on a daily basis was way more infuriating, and at that time, enterprises were more than happy to pay a premium to accelerate analytics teams.

In the 2010s, Databricks had a solid billion-dollar business. But Spark-as-a-Service by itself was never going to be a unicorn idea. AWS EMR was the giant tortoise lurking in the background, slowly but surely closing the gap. The status quo couldn't hold, and who doesn't want to be a unicorn? So, they bloated the hell out of the product, drank that off-brand growth-hacker Kool-Aid, and started spewing some of the most incoherent buzz-word salad to ever come out of the Left Coast. Just slapping data, lake, and house onto the ends of everything, like it was baby oil at a Diddy Party.

Now, here we are in 2025, deep into the terminal decline of enshittification, and they're just rotting away, waiting for One Real Asshole Called Larry Ellison to scoop them up and take them straight to Hell. The State of Florida, but for Big Data companies.

It would be a mystery to me too, why anyone would pick Databricks today for a greenfield project, but those enterprises from 5+ years ago are locked in hard now. They'll squeeze those whales and they'll shit money like a golden goose for a few more years, but their market share will steadily decrease over the next few years.

It's the cycle of life. Entropy always wins. Eventually the Grim Reaper Larry comes for us all. I wouldn't hate on them too hard. They had a pretty solid run.

DarkWiiPlayer · 7 months ago
With cookies disabled I get a blank website, which is a massive red flag and an immediate nope from me.

Can't imagine someone incapable of building a website would deliver a good (digital) product.

fkyoureadthedoc · 7 months ago
They did build a website though. It even looks pretty nice. The restriction you've placed on yourself just prevents you from viewing it.
fuzzy_biscuit · 7 months ago
But.. but.... we MUST track you! That's the whole purpose of our site /s
acd10j · 7 months ago
Databricks is Oracle-level bad. They will definitely ruin Neon or make it expensive. In the medium to long term, I will start looking for Neon alternatives.
bradhe · 7 months ago
Definitely agree, their M&A strategy is set up to strangle whoever they buy and they don't even know it. They're struggling in the face of Iceberg, DuckDB and the other tectonic shifts happening in the open source world. They are trying to innovate through acquisition, but can't quite make it because their culture kills the companies they buy.

I'm biased, I'm a big-data-tech refugee (ex-Snowflake) and am working on https://tower.dev right now, but we're definitely seeing the open source trend supported by Iceberg. It'll be really interesting to see how this plays out.

everfrustrated · 7 months ago
From the actual article

>As Neon became GA last year, they noticed an interesting stat: 30% of the databases were created by AI agents, not humans. When they looked at their stats again recently, the number went from 30% to over 80%. That is, AI agents were creating 4 times more databases versus humans.

For me this has alarm bells all over it. Databricks is trying to pump postgres as some sort of AI solution. We do live in weird times.
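
For what it's worth, the "4 times" figure in that quote follows from the 80% share. A quick check, assuming the percentages are shares of all databases created (the function name is made up for illustration):

```python
def agent_to_human_ratio(agent_share: float) -> float:
    """Ratio of agent-created to human-created databases, given the agent share (0..1)."""
    human_share = 1.0 - agent_share
    return agent_share / human_share


ratio_at_80 = agent_to_human_ratio(0.80)  # roughly 4 agent-created per human-created
ratio_at_30 = agent_to_human_ratio(0.30)  # roughly 0.43 agent-created per human-created
```

So at 30% agents were still well behind humans, and at 80% they were creating about four databases for every one a human created. It says nothing about how many of those databases stay in use.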

4b11b4 · 7 months ago
and how many of those DBs are still being actively used...
higeorge13 · 7 months ago
Congratz to neon team (i like what they built), but i don’t see the value or relation to databricks. I hope neon will continue as a standalone product, otherwise we lose a solid postgres provider from the market.
rockwotj · 7 months ago
It's pretty heavy in Azure, so I would be surprised if it went away. This is a DBX play to move into the transactional database space in addition to the analytical database.
presentation · 7 months ago
They claim they will in the FAQ… but we know how this usually goes
thayne · 7 months ago
If only companies were held liable for breaking promises they made when acquiring other companies
timmg · 7 months ago
I remember the first post by the Neon team here on HN. I think I commented at the time that I thought it was a great idea. I’ve never had a need to use them yet, but thought I always would.

Cynically, am I the only one who takes pause because of an acquisition like this? It worries me that they will need to be more focused on the needs of their new owners, rather than their users. In theory, the needs should align — but I’m not sure it usually works out that way in practice.

avinassh · 7 months ago
> I remember the first post by the Neon team here on HN. I think I commented at the time that I thought it was a great idea.

Same! I remember it too. I found it quite fascinating. Separation of storage and compute was something new to me, and I was asking them about Pageserver [0]. I also asked for career advice on how to get into database development [1].

Two years later, I ended up working on very similar disaggregated storage at Turso database.

Congrats to the Neon team!

[0] - https://news.ycombinator.com/item?id=31756671

[1] - https://news.ycombinator.com/item?id=31756510

kaeshiwaza · 7 months ago
Taking a pause also... I don't believe serving AI can be aligned with serving devs. I hope that the part of the work related to the core of PostgreSQL will help the community.
beoberha · 7 months ago
Congrats to the Neon team. They make an awesome product. Obviously it’s sad to see this, but it’s inevitable when you’re VC funded. Let’s hope Nikita and co remain strong and don’t let Databricks bit.io them.