Foundation DB Record Layer SQL API

fabianlindfors · 4 months ago

Really cool although quite an undertaking to build an entire SQL engine! I have been working on something pretty similar but using Postgres instead. Basically extending Postgres to run stateless on top of FoundationDB, which would achieve the same thing but with all the Postgres features one would expect (and without some quirks you might not want, like vacuuming).

Working with FoundationDB is a real pleasure as many people have noted already, a very intuitive abstraction to build databases on top of.

mike_hearn · 4 months ago

How do you plan to solve the N+1 query issue and the five second timeout?

fabianlindfors · 4 months ago

The five second timeout remains and trickles through to apply to Postgres transactions instead. This project would very much be for OLTP, so not really fit for using Postgres for OLAP or hybrid workloads.

The N+1 issue is a really interesting one which I have a plan for but haven't implemented yet. FoundationDB has what they call mapped ranges [0] to help with this, which works in some cases. More generally, one should make sure to issue read requests as soon as possible and not on demand, given that FoundationDB clients have a futures-based design. This is slightly tricky in Postgres because internally it has a pull-based model where one tuple is pulled from the execution plan at a time, so one needs to implement pre-fetching rather than making a read against FDB each time a new tuple is pulled.

[0] https://github.com/apple/foundationdb/wiki/Everything-about-...
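To make the pre-fetching idea above concrete, here is a minimal sketch in Python. It does not use the real FoundationDB client or any Postgres internals: `FakeStore`, `get_async`, and `prefetching_scan` are hypothetical names, and a thread pool stands in for a futures-based client. It only shows the shape of the technique: keep a window of reads in flight while the consumer still pulls one tuple at a time.

```python
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor
from itertools import islice
from typing import Deque, Dict, Iterable, Iterator, Optional, Tuple


class FakeStore:
    """Hypothetical stand-in for a remote key-value store with a futures-based client."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        self._pool = ThreadPoolExecutor(max_workers=8)

    def get_async(self, key: str) -> Future:
        # Returns immediately; the read completes in the background.
        return self._pool.submit(self._data.get, key)


def prefetching_scan(
    store: FakeStore, keys: Iterable[str], window: int = 4
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (key, value) pairs one at a time, keeping up to `window` reads in flight."""
    key_iter = iter(keys)
    in_flight: Deque[Tuple[str, Future]] = deque()

    # Issue the first batch of reads before the consumer asks for anything.
    for key in islice(key_iter, window):
        in_flight.append((key, store.get_async(key)))

    while in_flight:
        key, future = in_flight.popleft()
        # Top up the window before blocking on the oldest outstanding read.
        for next_key in islice(key_iter, 1):
            in_flight.append((next_key, store.get_async(next_key)))
        yield key, future.result()


store = FakeStore({f"k{i}": f"v{i}" for i in range(10)})
rows = list(prefetching_scan(store, [f"k{i}" for i in range(10)], window=3))
```

The consumer still sees a pull-based iterator, but each pull usually finds its read already issued, so latency overlaps instead of adding up per tuple. The window size is an arbitrary choice here.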

fidotron · 4 months ago

This is the interview question you would use to spot people that have actually used FDB vs just talked about it.

fabianlindfors · 4 months ago

If anybody wants to follow along, I'll be publishing it here once ready: https://github.com/fabianlindfors/pgfdb (currently just an empty repo!)

bognition · 4 months ago

I remember learning about FoundationDB a decade ago and being deeply impressed with what they built. Then it was acquired by Apple and went silent. Since then we've seen an explosion in new database storage layers. I'm curious: is FoundationDB still the new hotness, or has it been replaced by newer, better technologies?

jwr · 4 months ago

FoundationDB is pretty much the best distributed database out there, but it's more of a toolkit for building databases than a complete batteries-included database.

I found that once I took the time to learn about FoundationDB and think about how to best integrate with it, the toolkit concept makes a lot of sense. Most people instinctively expect a database interface with a certain level of abstraction, and while that is nice to work with, it does not provide the advantages of a deeper integration.

To take an example: FoundationDB itself has no indexing. It's a key-value store, but you get plenty of great tools for maintaining indices. That sounded strange to me, until I understood that now I can write my indexing functions in my app's language (Clojure in my case), using my model data. That is so much better than using a "database language" with a limited set of data types.
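As a rough illustration of what "writing your own indexing functions" can look like, here is a minimal sketch in Python (not the Clojure mentioned above, and not the real FoundationDB API). `FakeKV`, `save_user`, and `find_by_email` are hypothetical names, and an in-memory dict stands in for the ordered key-value store. The assumption is that, against a real store, the record write and the index writes would happen inside one transaction so they can never disagree.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, ...]


class FakeKV:
    """Hypothetical in-memory stand-in for an ordered key-value store."""

    def __init__(self) -> None:
        self._data: Dict[Key, str] = {}

    def get(self, key: Key) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: Key, value: str) -> None:
        self._data[key] = value

    def clear(self, key: Key) -> None:
        self._data.pop(key, None)

    def range_prefix(self, prefix: Key) -> List[Tuple[Key, str]]:
        # Ordered scan over all keys that start with `prefix`.
        n = len(prefix)
        return [(k, v) for k, v in sorted(self._data.items()) if k[:n] == prefix]


def save_user(kv: FakeKV, user_id: str, email: str) -> None:
    """Write the record and keep the email index in step with it."""
    old_email = kv.get(("user", user_id))
    if old_email is not None:
        # Remove the stale index entry before writing the new one.
        kv.clear(("idx_email", old_email, user_id))
    kv.set(("user", user_id), email)
    # The index entry carries everything in the key; the value is empty.
    kv.set(("idx_email", email, user_id), "")


def find_by_email(kv: FakeKV, email: str) -> List[str]:
    """Look up user ids through the index with a prefix range scan."""
    return [key[2] for key, _ in kv.range_prefix(("idx_email", email))]


kv = FakeKV()
save_user(kv, "u1", "a@example.org")
save_user(kv, "u2", "a@example.org")
save_user(kv, "u1", "b@example.org")  # moves u1 to a different index entry
```

The index is just more keys, laid out so that a range read answers the query. What goes into the key is ordinary application code, which is the point being made above.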

Incidentally, I think that using SQL with FoundationDB is a waste, and I would not write a new app this way. Why would I want to talk to my database through a thin straw that mixes data with in-band commands?

Since FoundationDB is hard to understand, there is (and will be) strong resistance to adoption. That's just how things are: we do not enjoy thinking too hard.

umvi · 4 months ago

> Since FoundationDB is hard to understand, there is (and will be) strong resistance to adoption. That's just how things are: we do not enjoy thinking too hard.

More like: we all have limited time, and if it's hard to understand you are asking for a big upfront time investment for a thing that may not even be the best fit for your use case.

Anything can be made easier to understand with the right abstractions. The theory of relativity was super hard to understand when it was first developed; you basically had to be an elite physicist. But now non-physicists can understand it at a high level thanks to YouTubers like Veritasium and MinutePhysics. Maybe FoundationDB just needs better marketing.

Also: your description of FoundationDB reminds me of ZeroMQ, which basically just dumps MQ legos at your feet and tells you to build your own MQ system (as opposed to a batteries included solution like RabbitMQ)

MarkMarine · 4 months ago

Can we see some of your indexing code?

Dave_Rosenthal · 4 months ago

FoundationDB's original promise was to combine a distributed storage engine with stateless layers on top to expose a variety of useful data structures (including SQL). The company was acquired before it could release the layers part of that equation. Apple open-sourced the core storage engine a few years later so FDB has kind of had a second life since then.

In that second life, the vast majority of the powerful databases around the industry built on FoundationDB have been built by companies making their own custom layers that are not public. This release is cool because it's a rare case that a company that has built a non-trivial layer on top of FDB is letting that source code be seen.

The group to which the FoundationDB storage engine itself appeals is fairly narrow--you have to want to go deep enough to build your own database/datastore, but not so deep to want to start from scratch. But, for this group, there is still nothing like FoundationDB in the industry--a distributed ACID KV store of extreme performance and robustness. So, yes, it's still the hotness in that sense. (As others have mentioned, see e.g. Deepseek's recent reveal of their 3FS distributed filesystem which relies on FDB.)

jbverschoor · 4 months ago

AFAIK, the SQL layer was available and released

fidotron · 4 months ago

Foundation is fundamental to iCloud at Apple, and is _something_ at Snowflake, among a few others. Recently DeepSeek used it for https://github.com/deepseek-ai/3FS "The Fire-Flyer File System (3FS) is a high-performance distributed file system designed to address the challenges of AI training and inference workloads."

I don't think that there's anything else quite the same, partly because it has some real oddities that manifest because of things like the transaction time limits. At Apple they worked around some of this with https://www.foundationdb.org/files/QuiCK.pdf

frakkingcylons · 4 months ago

Tigris (an object storage provider, I have no affiliation) also uses FoundationDB for storing metadata:

https://www.tigrisdata.com/docs/concepts/architecture/#found...

senderista · 4 months ago

Snowflake uses FoundationDB for their metadata store and...something else which isn't public.

jen20 · 4 months ago

It’s now been approximately 7 years since Apple open sourced FoundationDB. Note that it was closed source before the acquisition, which is often not appreciated.

olavgg · 4 months ago

There is a company named Cognite, based in Oslo, Norway, that delivers a data platform for Industry 4.0 and migrated from Google BigQuery to their own database on top of FoundationDB.

The video about it is available here: https://2023.javazone.no/program/85eae038-49b5-4f32-83c6-077... After watching, my thought was: why didn't you just use ClickHouse?

jbverschoor · 4 months ago

I think that was more like 1.5 decades ago by now :). Yeah, I was super enthusiastic about it. Seemed perfect. Back then I considered Riak, MongoDB, and things like Tokyo Cabinet.

FoundationDB to me is like Duke Nukem Forever.

I don’t need it anymore. At least not for now

pstuart · 4 months ago

There's an intriguing project which puts SQLite on top of FoundationDB. Unfortunately the dev seems to have moved on from that effort:

https://github.com/losfair/mvsqlite

amazingamazing · 4 months ago

At some point someone will reimplement the DynamoDB API on top of FoundationDB. That'll be nice because then you effectively have a cheap hosted version available.

conradev · 4 months ago

My favorite FoundationDB layer is per-user SQLite databases: https://github.com/losfair/mvsqlite

It's hard to tell if it's running in production, but the author works at Deno!

tough · 4 months ago

There was some discussion earlier in another thread about the one-SQLite-DB-per-vendor infra architecture. I can't remember which thread, maybe the DuckDB one?

mastabadtomm · 4 months ago

There is one more project that aims to build a MongoDB-like query engine and uses Redis wire protocol. It's Kronotop: https://github.com/kronotop/kronotop

Kronotop uses FoundationDB as a metadata store for document indexes and the other stuff. It stores the document bodies on the local disk and supports primary-follower replication.

It also works as a RESP3/RESP2 proxy for FoundationDB API.

anhldbk · 4 months ago

At last FoundationDB has a SQL layer. AFAIK the initial discussion was in 2018 [1]

[1] SQL layer in FoundationDB, https://forums.foundationdb.org/t/sql-layer-in-foundationdb/...

tehlike · 4 months ago

I really, really want Node.js bindings for the FoundationDB Record Layer. I tried using a Node-Java bridge, and it could be made to work, but it'd be quite an effort to maintain I guess...

ToJans · 4 months ago

Shouldn't be too hard. I built an Erlang/BeamVM driver/wrapper for it [1] before it got acquired by Apple... Their API is nice and clean.

[1] https://github.com/happypancake/fdb-erlang

tehlike · 4 months ago

Plain FoundationDB and the document layer have bindings. It's the record layer that's a bit more complex, with indexes, queries, etc.