kiwicopple · 2 years ago
hey hn, supabase ceo here

For background: we have a storage product for large files (like photos, videos, etc). The storage paths are mapped into your Postgres database so that you can create per-user access rules (using Postgres RLS)
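
As a rough sketch (the bucket name and folder layout here are just examples), a per-user read rule looks something like:

    -- sketch: only let users read objects inside their own folder
    create policy "users read own files"
    on storage.objects for select
    to authenticated
    using (
      bucket_id = 'avatars'
      and (storage.foldername(name))[1] = auth.uid()::text
    );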

This update adds S3 compatibility, which means that you can use it with thousands of tools that already support the protocol.

I'm also pretty excited about the possibilities for data scientists/engineers. We can do neat things like dump postgres tables into Storage (as parquet) and you can connect DuckDB/ClickHouse directly to them. We have a few ideas that we'll experiment with to make this easy.
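
To sketch the DuckDB side of that (the endpoint, keys, and bucket below are placeholders):

    -- placeholders: point DuckDB's S3 support at the Storage endpoint
    install httpfs;
    load httpfs;

    set s3_endpoint = 'project-ref.supabase.co/storage/v1/s3';
    set s3_url_style = 'path';
    set s3_access_key_id = '...';
    set s3_secret_access_key = '...';

    -- query a dumped table in place, no download step
    select count(*) from read_parquet('s3://analytics/events.parquet');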

Let us know if you have any questions - the engineers will also monitor the discussion

devjume · 2 years ago
This is great news. Now I can utilize any CDN provider that supports S3, like bunny.net [1], which has image optimization just like Supabase does, but with better pricing and features.

I have been developing with Supabase for the past two months. I would say there are still some rough corners in general and some basic features missing. For example, Supabase Storage has no direct support for metadata [2][3].

Overall I like the launch week and the development they are doing. But more attention to basic features and little details is needed, because implementing workarounds for basic stuff is not ideal.

[1] https://bunny.net/ [2] https://github.com/orgs/supabase/discussions/5479 [3] https://github.com/supabase/storage/issues/439

kiwicopple · 2 years ago
> I can utilize any CDN provider that supports S3. Like bunny.net

Bunny is a great product. I'm glad this release makes that possible for you, and I imagine this was one of the reasons the rest of the community wanted it too.

> But more attention to basic features and little details

This is what we spend most of our time doing, but you won't hear about it because those fixes aren't HN-worthy.

> no direct support for metadata

Fabrizio tells me this is next on the list. We're getting through requests as fast as we can. I understand it's frustrating, but in the meantime there is a workaround: store the metadata in the postgres database (I know, not ideal, but still usable).
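
As a sketch of that workaround (table and column names are hypothetical), a side-table keyed on the object path, joined against storage.objects:

    -- hypothetical sketch: a metadata side-table keyed on the object path
    create table file_metadata (
      bucket_id text not null,
      object_path text not null,
      metadata jsonb not null default '{}',
      primary key (bucket_id, object_path)
    );

    -- fetch an object together with its metadata
    select o.name, m.metadata
    from storage.objects o
    join file_metadata m
      on m.bucket_id = o.bucket_id
     and m.object_path = o.name;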

fenos · 2 years ago
This is indeed at the very top of the list :)
giancarlostoro · 2 years ago
I've not done a whole lot with S3 but is this due to it being easy to sync between storage providers that support S3 or something?

I'm more used to Azure Blob Storage than anything, so I'm OOL on what people do other than store files on S3.

inian · 2 years ago
Here is an example of DuckDB querying parquet files directly from Storage, since it supports the S3 protocol now - https://github.com/TylerHillery/supabase-storage-duckdb-demo

https://www.youtube.com/watch?v=diL00ZZ-q50

cmollis · 2 years ago
Yes. DuckDB works very well with parquet scans on S3 right now.
iamcreasy · 2 years ago
Does it work well with Hive tables storing parquet files on s3?
Rapzid · 2 years ago
I like to Lob my BLOBs into PG's storage. You need that 1-2TB of RDS storage for the IOPS anyway; might as well fill it up.

Large object crew, who's with me?!

vbezhenar · 2 years ago
I don't. S3-compatible storage is usually significantly cheaper and lets you offload HTTP requests. Also, huge databases make backups and recoveries slow.

The only upside of storing blobs in the database is transactional semantics. But if you're fine with some theoretical trash in S3, that's trivially implemented with proper ordering (write the object to S3 first and commit the database row second, so a failure only leaves an orphaned object, never a dangling reference).

abraae · 2 years ago
> The only upside of storing blobs in the database is transactional semantics.

Plenty more advantages than that. E.g. for SaaS you can deep copy an entire tenant, including their digital assets. Much easier copying with just "insert into ... select from" than having to copy S3 objects.
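
A sketch of what I mean (schema hypothetical):

    -- hypothetical schema: deep-copy one tenant's documents to another
    insert into documents (tenant_id, name, content)
    select 'new-tenant-id', name, content
    from documents
    where tenant_id = 'old-tenant-id';

Doing the same with objects spread across S3 means listing and copying every key.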

iamcreasy · 2 years ago
What do you mean by transactional semantics?
dymk · 2 years ago
38TB of large objects stored in Postgres right here
Rapzid · 2 years ago
A hero appears!

The client I use currently, npgsql, supports proper streaming so I've created a FS->BLOB->PG storage abstraction. Streamy, dreamy goodness. McStreamy.
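
For the curious, a tiny server-side round trip (real streaming goes through the client's large-object support instead of buffering whole values like this):

    -- sketch: round-trip a blob through Postgres large objects
    do $$
    declare
      blob_oid oid;
      data bytea;
    begin
      -- 0 lets Postgres pick the OID; returns the new large object's OID
      blob_oid := lo_from_bytea(0, 'hello large object'::bytea);
      data := lo_get(blob_oid);
      raise notice 'read back % bytes', length(data);
      perform lo_unlink(blob_oid);  -- delete the blob
    end;
    $$;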

kiwicopple · 2 years ago
that's not how this works. files are stored in s3, metadata in postgres
Rapzid · 2 years ago
Sad.

J/K. It could be a really good back-end option for Supabase's S3 front end. A lot of PG clients don't support proper "streaming", and looking at the codebase it's TypeScript... postgres.js is the only client nearing "performant" that I'm aware of (last I looked) on Node.js, but it's not clear it supports streaming outside "Copy", per the docs. Support could be added to the client if missing.

Edit: Actually it could be a good option for your normal uploads too. The docs talk about it being ideal for 6MB or smaller files? Are you using bytea, or otherwise needing to buffer the full upload/download in memory? Streaming with Lob would resolve that, and you can compute incremental hash sums for etags, etc. Lob has downsides and limitations, but for a very large number of people its benefits can carry them very far, potentially all the way.

tln · 2 years ago
Will the files get deleted with ON DELETE CASCADE somehow? That would be awesome.

Then for GDPR, when you delete a user, the associated storage can be deleted.

One could cobble this together with triggers, some kind of external process, and probably repetitious code (one metadata table per "owning" id), although it would be nice for this to be packaged up.
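
Roughly what I'm picturing (names hypothetical, and assuming the storage schema records an owner per object) - a trigger enqueues the paths and the external process does the real S3 deletes:

    -- hypothetical sketch: queue object deletions when a user is deleted
    create table storage_delete_queue (
      bucket_id text not null,
      object_path text not null,
      queued_at timestamptz not null default now()
    );

    create function enqueue_user_files() returns trigger as $$
    begin
      insert into storage_delete_queue (bucket_id, object_path)
      select bucket_id, name
      from storage.objects
      where owner = old.id;  -- assumes objects record their owning user
      return old;
    end;
    $$ language plpgsql;

    create trigger user_files_cleanup
    before delete on auth.users
    for each row execute function enqueue_user_files();

    -- an external worker drains storage_delete_queue and issues the S3 deletes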

code_biologist · 2 years ago
Lol. The most PG blob storage I've used in prod was a couple hundred GB. It was a hack and the performance wasn't ideal, but the alternatives were more complicated. Simple is good.
Rapzid · 2 years ago
Yeah, it's a great place to start. I took the time to implement streaming reads/writes via npgsql's client support for it (it can stream records, and of course the Lob storage is broken into page-sized rows) and performance is pretty darn good.
yoavm · 2 years ago
This looks great! How easy is it to self-host Supabase? Is it more like "we're open-source, but good luck getting this deployed!", or can someone really build on Supabase and, if things get a little too expensive, easily self-host the whole thing and just switch over? I wonder if people are doing that.
kiwicopple · 2 years ago
self-hosting docs are here: https://supabase.com/docs/guides/self-hosting/docker

And a 5-min demo video with Digital Ocean: https://www.youtube.com/watch?v=FqiQKRKsfZE&embeds_referring...

Anyone with basic server management skills will have no problem self-hosting. Every tool in the supabase stack[0] is a docker image and works in isolation. If you just want to use this Storage Engine, it's on Docker Hub (supabase/storage-api). Example with MinIO: https://github.com/supabase/storage/blob/master/docker-compo...

[0] architecture: https://supabase.com/docs/guides/getting-started/architectur...

replwoacause · 2 years ago
I was pretty unhappy with the self-hosted offering. It's neutered compared to the cloud, which was disappointing.
zipping1549 · 2 years ago
Some may disagree, but in my experience Supabase was definitely challenging to self-host. Don't get me wrong; I'm pretty confident with self-hosting, but Supabase was definitely on the hard side.

Pocketbase being literally a single binary doesn't make Supabase look good either, although the functionality differs.

kiwicopple · 2 years ago
Yes, we have a vastly different architecture[0] from Pocketbase. We choose individual tools based on their scaling characteristics and give you the flexibility to add/remove tools as you see fit.

I doubt we can ever squeeze the "supabase stack" into a single binary. This undoubtedly makes things more difficult for self-hosters. Just self-hosting Postgres can be a challenge for many. We will keep trying to make it easier, but it will never be as simple as Pocketbase.

[0] https://supabase.com/docs/guides/getting-started/architectur...

brap · 2 years ago
Always thought it’s kind of odd how the proprietary API of AWS S3 became sort of the de-facto industry standard
bdcravens · 2 years ago
S3 is one of the original AWS services (SQS predates it), and has been around for 18 years.

The idea of a proprietary API becoming the industry's de facto standard isn't uncommon. The same thing happened with Microsoft's XMLHttpRequest.

crubier · 2 years ago
Also, the S3 API is simple and makes sense; there's no need to reinvent something different just for the pleasure of it.
ovaistariq · 2 years ago
Supporting an existing API provides interoperability, which is beneficial for users: if there is a better storage service, it's easier to adopt it. However, S3 API compatibility can be a hindrance when you want to innovate and provide additional features and functionality. In our case, providing additional features [1] [2] while continuing to be S3 API compatible has forced us to rely on custom headers.

[1] https://www.tigrisdata.com/docs/objects/conditionals/ [2] https://www.tigrisdata.com/docs/objects/caching/#caching-on-...

garbanz0 · 2 years ago
Same thing seems to be happening with the OpenAI API.
mmcwilliams · 2 years ago
I might be misremembering this but I was under the impression that Ceph offered the same or very similar object storage API prior to Amazon building S3.
mousetree · 2 years ago
Because that's where most of the industry stores its data.
brap · 2 years ago
Yeah I understand how it came to be, it’s just kind of an uncommon situation
moduspol · 2 years ago
Yeah, though I guess kudos to AWS for not being litigious about it.
jimmySixDOF · 2 years ago
Supabase also announced this week that Oriole (the team, not just the table storage plugin) is joining them, so I guess this is part of the same story. Anyway, it's nice timing: I was thinking about a hookup to Cloudflare R2 for something, and this may be the way.
kiwicopple · 2 years ago
Oriole are joining to work on the OrioleDB postgres extension. That's slightly different to this release:

- This: for managing large files in s3 (videos, images, etc).

- Oriole: a postgres extension that's a "drop-in replacement" for the default storage engine

We also hope that the team can help develop Pluggable Storage in Postgres with the rest of the community. From the blog post[0]:

> Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. This system is available in MySQL, which uses InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Oriole aims to be a drop-in replacement for Postgres' default storage engine and supports similar use-cases with improved performance. Other storage engines, to name a few possibilities, could implement columnar storage for OLAP workloads, highly compressed timeseries storage for event data, or compressed storage for minimizing disk usage.

Tangentially: we have a working prototype for decoupled storage and compute using the Oriole extension (also in the blog post). This stores Postgres data in s3, and there could be some interplay with this release in the future.

[0] https://supabase.com/blog/supabase-aquires-oriole

kabes · 2 years ago
What's the point of acquiring them instead of just sponsoring the project? I'm trying to understand supabase's angle here, and whether this is good or bad news for non-supabase postgres users.
jonplackett · 2 years ago
Dear supabase. Please don’t get bought out by anyone and ruined. I’ve built too many websites with a supabase backend now to go back.
kiwicopple · 2 years ago
we don't have any plans to get bought.

we only have plans to keep pushing open standards/tools - hopefully we have enough of a track record here that it doesn't feel like lip service

mort96 · 2 years ago
Is it even up to you? Isn't it up to your Board of Directors (i.e. investors) in the end?
jerrygenser · 2 years ago
I can believe there are no plans right now. But having raised over $100mm, the VCs will want to liquidate their holdings eventually. They have to answer to their LPs after all, and need to be able to raise their next funds.

The primary options I can think of that are not full acquisition are:

- company buys back stock

- VC sells on secondary market

- IPO

The much more common and more likely way for these VCs to make the multiple or home run on their return is to 10x+ their money by having a first- or second-tier cloud provider buy you.

I think there's a 10% chance that a deal with Google is done in the future, so their portfolio has Firebase for NoSQL and Firebase for SQL.

jonplackett · 2 years ago
Absolutely. I am so impressed with SB. It’s like you read my mind and then make what I’ll need before I realise.

(This is not a paid promotion)

robertlagrant · 2 years ago
As long as you make it so that, if you do get bought, a team of you can always fork and move on, that's about the best anyone can hope for.
tap-snap-or-nap · 2 years ago
plans*

*Subject to change

gherkinnn · 2 years ago
This is my biggest reservation towards Supabase. Google bought Firebase in 2014. I've seen Vercel run Next.js into the ground and fuck up their pricing for some short-term gains. And Figma almost got bought by Adobe. I have a hard time trusting products with heavy VC backing.
MatthiasPortzel · 2 years ago
I’m not defending Vercel or VC-backed companies per se, but I don’t understand your comments towards Vercel. They still offer a generous hobby plan, and Next.js is still actively maintained open source software that supports self-hosting.

Heroku might be a better example of a company that was acquired and then shut down their free plan.

spxneo · 2 years ago
You know the whole point of YC companies is to flip their equity on the public market, right, and then move on to the next one?
paradite · 2 years ago
I think Vercel and Next.js are built by the same group of people: the ones who made Now.sh, started the company (Zeit), and then changed both the company and product name to Vercel.
quest88 · 2 years ago
What's the actual complaint here, other than company A buying company B?
hacker_88 · 2 years ago
Wasn't there Parse, from Facebook?
zackmorris · 2 years ago
Firebase was such a terrible loss. I had been following it quite closely on its mailing list until the Google takeover, then it seemed like progress slowed to a halt. Also having big brother watching a common bootstrap framework's data like that, used by countless MVP apps, doesn't exactly inspire confidence, but that's of course why they bought it.

At the time, the most requested feature was a push notification mechanism, because implementing that on iOS had a steep learning curve and was not cross-platform. Then probably some more advanced rules to be able to do more functional-style permissions, possibly with variables, although they had just rolled out an upgraded rules syntax. And also having a symlink metaphor for nodes might have been nice, so that subtrees could reflect changes to others like a spreadsheet, for auto-normalization without duplicate logic. And they hadn't yet implemented an incremental/diff mechanism to only download what's needed at app startup, so larger databases could be slow to load. I don't remember if writes were durable enough to survive driving through a tunnel and relaunching the app while disconnected from the internet either. I'm going from memory and am surely forgetting something.

Does anyone know if any/all of the issues have been implemented/fixed yet? I'd bet money that the more obvious ones from a first-principles approach have not, because ensh!ttification. Nobody's at the wheel to implement these things, and of course there's no budget for them anyway, because the trillions of dollars go to bowing to advertisers or training AI or whatnot.

IMHO the one true mature web database will:

- be distributed via something like Raft

- have rich access rules

- be log based, with (at least) SQL/HTTP/JSON interfaces to the last-known state and access to the underlying set selection/filtering/aggregation logic/language

- support nested transactions, or have all equivalent use-cases provided by atomic operations with examples

- be fully indexed by default, with no penalty for row- or column-based queries (to support both table and document-oriented patterns and even software transactional memories - STMs)

- have column and possibly even row views (not just table views)

- use a copy-on-write mechanism internally, like Clojure's STM, for mostly O(1) speed

- be evented, with smart merge conflicts to avoid excessive callbacks

- preferably have a synchronized clock timestamp ordered lexicographically:

https://firebase.blog/posts/2015/02/the-2120-ways-to-ensure-...

I'm not even sure that the newest UUID formats get that right:

https://uuid6.github.io/uuid6-ietf-draft/

Loosely this next-gen web database would be ACID enough for business and realtime enough for gaming, probably through an immediate event callback for dead reckoning, with an optional "final" argument to know when the data has reached consensus and was committed, with visibility based on the rules system. Basically as fast as Redis but durable.

A runner-up was the now (possibly) defunct RethinkDB. Also PouchDB/PouchBase, a web interface for CouchDB.

I haven't had time to play with Supabase yet, so any insights into whether it can do these things would be much appreciated!

codextremist · 2 years ago
Never used Supabase before, but I'm very comfortable with their underlying stack. I use a combination of postgres, PostgREST, PLv8 and Auth0 to achieve nearly the same thing.
pbronez · 2 years ago
I was unfamiliar with PLv8, found this:

“”” PLV8 is a trusted Javascript language extension for PostgreSQL. It can be used for stored procedures, triggers, etc.

PLV8 works with most versions of Postgres, but works best with 13 and above, including 14, 15, and 16. """

https://plv8.github.io/

foerster · 2 years ago
I'm a bit terrified of this as well. I have built a profitable product on the platform, and if it were to drastically change or go away, I'd be hosed.