It's interesting that Snowflake went shopping for Crunchy Data over Neon. While Neon focused on bringing compute and storage separation to OLTP, Crunchy Data focused more on bringing OLTP/PostgreSQL closer to OLAP with DuckDB and Iceberg.
In a way, Crunchy Data was a competitor to Snowflake: they literally marketed themselves as a "PostgreSQL Data Warehouse", but correct me if I'm wrong. Neon sounds more complementary to Snowflake, which had been struggling with an OLTP backend, namely its Unistore product, announced 3 years ago but never taken to general availability due to scalability issues.
Maybe Neon was 4x more expensive, but if I'm being honest, this acquisition sounds more like an answer to Databricks than a strategic acquisition. Apparently Crunchy had $30M ARR, so the deal works out to roughly 8x ARR, which is a cheaper answer to Databricks.
I thought Crunchy Data Warehouse was their main product, looking at most of their marketing posts. What's the advantage of using their managed PostgreSQL offering on the cloud, compared to native offerings such as AWS RDS and GCP Cloud SQL?
1) built on an open-source Kubernetes operator, as I understand it
2) Crunchy provides true superuser access and access to physical backups – that's huge
> Part of the reason Snowflake and Databricks are interested in database companies is because PostgreSQL can serve as the underlying database for customers to create AI agents with data they store in the companies’ respective platforms.
I don't understand this part. What does PostgreSQL offer here that these vendors believe they can't add to their existing platform? Is it the ecosystem?
Oh, my bad! I was under the (evidently mistaken) impression that since they were bought by Databricks, they would just become a part of it and cease to exist.
Evidently, I was very wrong, which I’m glad to hear tbh.
Is it open source? (This? https://github.com/neondatabase/neon.git) And since it's serverless, what are you saying has disappeared from the internet: a proprietary version? Support/consulting contracts?
But why do they need serverless Postgres for that?
They could achieve the same with normal pg, or SQLite, or any number of other embedded DBs. There are also plenty of disaggregated compute options available…
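To make that point concrete, here's a minimal sketch of my own (not from any of these products) using stdlib SQLite as a transactional state store for an agent, the kind of workload that arguably doesn't need serverless Postgres. The table and function names are invented for illustration:

```python
# Illustrative sketch: stdlib SQLite as a transactional "agent memory" store.
import sqlite3

conn = sqlite3.connect(":memory:")  # swap for a file path to persist across runs
conn.execute("CREATE TABLE agent_memory (key TEXT PRIMARY KEY, value TEXT)")

def remember(key: str, value: str) -> None:
    # Transactional upsert: `with conn` commits on success, rolls back on error.
    with conn:
        conn.execute(
            "INSERT INTO agent_memory (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
            (key, value),
        )

def recall(key: str):
    row = conn.execute(
        "SELECT value FROM agent_memory WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None

remember("user_name", "Ada")
remember("user_name", "Grace")  # the upsert overwrites the earlier value
```

Nothing here needs a server at all, let alone a serverless one, which is the commenter's point.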
I imagine they are buying the expertise in managing the transactional system rather than the IP itself. Operationally running a transactional system is a different ballgame for these OLAP players.
(of course what they're not getting is scale readiness.. it's not like these companies have anything resembling RDS level customer workloads)
There are a couple different reasons that make this acquisition interesting.
First is their long pursuit of HTAP and the failures around Unistore.
Snowflake wanted to get into transactional workloads for a long time and for good reasons.
I wonder what will happen to Unistore after this acquisition.
The other interesting part is ETL/ELT, CDC and the whole business of replicating transactional databases into OLAP.
What Crunchy built with DuckDB and Iceberg is a potential solution to this problem, a problem that has been painful to solve for a long, long time.
Being able to replicate your transactional database into your data lake or data warehouse without having to deal with Debezium and all the rest of that stuff is going to make many data teams happy.
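To make the pain concrete, here's a minimal sketch of the merge step any CDC pipeline has to implement when replaying an OLTP change stream into a warehouse copy. The event shape is loosely modeled on logical-decoding output such as wal2json's; all names are hypothetical, not Crunchy's or Debezium's actual formats:

```python
# Illustrative sketch of CDC replay: apply an ordered change stream to a copy.
warehouse_table: dict = {}  # stand-in for the OLAP-side copy, keyed by primary key

def apply_change(event: dict) -> None:
    kind, pk = event["kind"], event["pk"]
    if kind in ("insert", "update"):
        warehouse_table[pk] = event["row"]  # upsert the new row image
    elif kind == "delete":
        warehouse_table.pop(pk, None)       # drop the row on delete

# Replay a small change stream in commit order
stream = [
    {"kind": "insert", "pk": 1, "row": {"id": 1, "status": "new"}},
    {"kind": "update", "pk": 1, "row": {"id": 1, "status": "shipped"}},
    {"kind": "insert", "pk": 2, "row": {"id": 2, "status": "new"}},
    {"kind": "delete", "pk": 2},
]
for ev in stream:
    apply_change(ev)
```

The hard parts in production are everything around this loop (slot management, ordering, schema changes, backfills), which is exactly what a built-in Postgres-to-Iceberg path would let teams stop operating themselves.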
There's still Xata. And plenty of other options that support a Postgres compatible API like CockroachDB and Yugabyte.
The problem is there's so much sprawl in this postgres ecosystem that it seems like no one other than the hyperscalers is really able to reach escape velocity…
That's such a wild way to view this. I see it the opposite way: the pool of incredibly awesome, fantastic Postgres technology companies uniquely at the top of their game was already down to very, very few.
The musical chairs here is who can get such long-proven, incredible, well-known talent. Who can snarf it up and convince these incredible doers to fold into the amorphous, indistinct corporate giant.
I wonder what accounts for the gap between this and Neon's $1B price tag. Is the deal structured less favorably for Neon? Does Neon have significantly more revenue?
Seems like Neon raised a lot more venture funding, too.
Value is an intriguing concept. They may not have the revenue to justify their valuation, or maybe they do. The price tag could be the result of an amazing negotiation, or it could reflect genuine forward-looking value in what they had built.
Developed my disdain after having to put up with the incredibly shitty behaviour from the sales and account teams a few years ago.
Sure, they had some novelty years ago, but everyone and their dog has disaggregated compute these days, and all their other "features" just feel like enterprise money extraction that they've acquihired in.
One good reason is that a huge population of companies just don't have enough data to justify Snowflake. We sell a product built on it, and I wish we'd had DuckDB 3-4 years ago; it's perfect for 95%+ of our clients.
I think DuckDB is great but I don't think it is necessarily playing the same game as Snowflake. A lot of people want the serverless option and DuckDB is not that.
BigQuery is more expensive than Snowflake though. You might as well just do Motherduck which would be cheaper than BQ but let you pull data from S3 which is cheaper than Snowflake storage.
Just for the data sharing feature alone it's worth using. It's so damn easy to onboard and maintain data sources when they have a Snowflake share. You don't have to worry each day about loading processes randomly failing and you don't have to write any custom logic to hit APIs and properly flatten and merge responses into the database.
Snowflake sounds like nominative determinism. I was just looking at this thing today, totally puzzled as to how to update it and Postgres itself without rolling the dice that it destroys everything on the cluster that uses Postgres. Perhaps someone with k8s experience could explain to me why CRDs are not singleton hell? The LLMs just run me in circles…
I run a two-service cluster in the home lab for fun. I use a PVC mapped to an NFS share for the actual data, so you could always run a local Postgres binary against it. In a production environment I would map these to local disk partitions like you normally would for a db.
The upgrade process is actually quite nice when it works but it is "another" thing to learn and troubleshoot.
I think of CRDs as a troubleshooting flowchart that someone with more experience than me has put together. When it's right it's great, and when it's wrong it makes troubleshooting harder. That is, until you remember that the whole point of k8s is ephemeral containers: when one breaks, just delete it and let the pgcluster CRD resync the data.
A couple of core Postgres team members work there, and iirc also the guy who spearheaded Heroku Postgres.
That's not their primary product. Crunchy Postgres is their primary offering and they recently announced Crunchy Data Warehouse.
Less ability for customers to roll-their-own => more customers for Snowflake?
Source: I work there.
It’s probably worth it just for their people.
It's not obvious that serverless Postgres is in that category.
You will never know.
Expensive, slow, and painful.
In that case BigQuery is more managed than both: pay-as-you-go for analytical queries without thinking about compute or clusters whatsoever.
I guess you have to be pretty close to C level at a big company to even understand.