API Mismatch: Why bolting SQL onto noSQL is a bad idea

What's so weird about this article is that the author recognizes that querying KV databases and querying relational databases are not the same thing, and that trying to overlay relational semantics on a KV database causes problems, but then doesn't seem to take much notice of the fact that the interface ORMs provide (querying graph databases) is also mismatched with relational databases.

chris_armstrong · 3 years ago

(Author here)

I agree: ORMs as I would traditionally think of them, like Hibernate, are very mismatched to SQL databases. In this case, Prisma describes itself as an ORM, but it maps more like a KV table description with join conditions (I’m not sure how else to describe it).

I’d still argue its semantics are more compatible with DynamoDB though, complementing rather than working against the underlying storage model.

Arch-TK · 3 years ago

You know, this tendency to call things which are not actually ORMs ORMs reminds me a lot of the tendency for a lot of "multi paradigm" languages to claim they support OO while actually not supporting true OO (whatever that really means, I'm still not personally sure yet but it's easier to tell when you DON'T have it than when you do).

You will have an argument with someone about the flaws in the concept of ORMs and they will pull out something which calls itself an ORM but seems to lack a lot of the core of what an ORM was supposed to do. (This is starting to sound like an appeal to purity but I think the problem here is that terms like OO and ORM are increasingly becoming diluted to the point of being meaningless.)

Anyway, I'm not saying that you did anything wrong. You obviously are not the person who originally described Prisma as an ORM; it's on their website. I can sort of understand why the people behind Prisma called it an ORM, as it is probably advantageous as a marketing strategy. But I will say that, just from a quick glance, Prisma is NOT really an ORM but seemingly far more of a query generator with nice syntax (e.g. SQLAlchemy Core). That's not to say it is inferior. It is, in my opinion, clearly superior, and I strongly believe this is the correct sort of tool for helping work with a relational database from any programming language. I just wish people were more precise with terminology.

Tangential to the author's point, but it’s funny to note that many new SQL databases (e.g. CockroachDB, TiDB, MyRocks) are written on top of RocksDB, a “NoSQL” key-value store.

jakewins · 3 years ago

I mean, so are the relational databases. Rocks provides similar storage primitives - trees - as most databases implement internally.

Postgres is “just” a SQL engine running on top of a key-value heap and trees pointing into it.

Neo4j is the same, key-value record store plus trees.

Rocks and its peers have made that into a dependable library, making new db dev that much faster :)

mattashii · 3 years ago

> Postgres is “just” a SQL engine running on top of a key-value heap and trees pointing into it.

I wouldn't call heaps a key-value data structure, even if you could argue that the location of your data is an implicit key. And the trees are strictly optional: you could build your database with only hash and BRIN indexes.

zffr · 3 years ago

It’s my understanding that row-based relational databases are basically key-value stores that map from row ID to column values. The “magic” of SQL-based relational databases is how the KVS is queried, and the consistency guarantees they provide.

Part of the consistency guarantees is having a reliable storage engine. That’s the value RocksDB provides.

The rest of the “SQL stuff” can be built on top of this.
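To make the "row ID to column values" picture concrete, here is a minimal sketch. It assumes a plain in-memory dict as the storage engine; `KVTable` and `select` are hypothetical names made up for illustration and are not how RocksDB or any real database exposes things. The query layer is just scan, filter, and project over the key-value store.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical in-memory stand-in for a storage engine: row ID -> column values.
Row = Dict[str, Any]


class KVTable:
    def __init__(self) -> None:
        self._rows: Dict[int, Row] = {}
        self._next_id = 1

    def put(self, row: Row) -> int:
        """Store a copy of the row under a fresh row ID and return that ID."""
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = dict(row)
        return row_id

    def get(self, row_id: int) -> Optional[Row]:
        """Point lookup by key, the operation a KV store is built for."""
        return self._rows.get(row_id)

    def scan(self) -> Iterator[Row]:
        """Full scan in row-ID order."""
        for row_id in sorted(self._rows):
            yield self._rows[row_id]


def select(table: KVTable, columns: List[str], where: Callable[[Row], bool]) -> List[Row]:
    """Rough analogue of SELECT columns FROM table WHERE ...: scan, filter, project."""
    return [{c: row[c] for c in columns} for row in table.scan() if where(row)]


users = KVTable()
users.put({"name": "ada", "age": 36})
users.put({"name": "bob", "age": 17})
users.put({"name": "cy", "age": 52})

adults = select(users, ["name"], lambda r: r["age"] >= 18)
print(adults)
```

Everything that makes a real SQL engine interesting (indexes, a planner, transactions) is missing here; the point is only that the relational part sits above a key-to-row mapping.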

yourMadness · 3 years ago

It seems fairly well shown that the "SQL stuff" can be built on top of it on the server side.

Building the "SQL stuff" on the client side seems less well proven to me.

rhacker · 3 years ago

I think the one clean SQL bolted onto NoSQL that I've seen is Spark. Since Spark treats the NoSQL store as having a structure that matches the underlying database, but also runs an SQL layer, it's kind of an interesting way to do it. Now that's also up to the Spark "driver", like the one for MongoDB: the driver code has to tell Mongo to give it some "shape" for the collection, otherwise Spark can't work with it. Now it's likely to have a shape, but the driver may skip a random column if it's not present often enough.
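A rough sketch of how a column can go missing when the shape is inferred from a sample of documents. This is an illustration under assumptions, not the actual connector code: `infer_schema` is a hypothetical helper, and it looks at the first few documents rather than a random sample so the result is repeatable.

```python
from typing import Any, Dict, List


def infer_schema(docs: List[Dict[str, Any]], sample_size: int) -> Dict[str, str]:
    """Build a field -> type-name mapping from the first `sample_size` documents only."""
    schema: Dict[str, str] = {}
    for doc in docs[:sample_size]:
        for field, value in doc.items():
            # Keep the first type seen for each field.
            schema.setdefault(field, type(value).__name__)
    return schema


# Hypothetical collection where one field appears in only one document.
docs = [
    {"_id": 1, "name": "ada"},
    {"_id": 2, "name": "bob"},
    {"_id": 3, "name": "cy", "nickname": "c"},
]

sampled = infer_schema(docs, sample_size=2)
full = infer_schema(docs, sample_size=len(docs))
print(sampled)
print(full)
```

With a sample of two, `nickname` never makes it into the schema, so an SQL layer built on that schema would not know the column exists.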

CSDude · 3 years ago

PartiQL on DynamoDB is just syntactic sugar. DynamoDB is not meant to be used that way, but its query and insert language can be a bit tricky. PartiQL is just a helper for that weird query syntax; you can't do joins or aggregates. PartiQL is supported by QLDB and Redshift as well, to unify the querying language somewhat. The docs should be much clearer about the dangers of it, to avoid confusion like this.
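As a sketch of the "syntactic sugar" point, here are the two request shapes for the same key lookup, written as plain dicts so nothing here talks to AWS. The table name `Users` and the key attribute `pk` are assumptions for illustration. With boto3, the first dict would be passed to the DynamoDB client's `query` and the second to `execute_statement`.

```python
from typing import Any, Dict


def query_request(table: str, pk_value: str) -> Dict[str, Any]:
    """Request parameters in the shape of the native Query operation."""
    return {
        "TableName": table,
        "KeyConditionExpression": "pk = :pk",
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }


def partiql_request(table: str, pk_value: str) -> Dict[str, Any]:
    """The same lookup expressed as a PartiQL statement with one positional parameter."""
    return {
        "Statement": f'SELECT * FROM "{table}" WHERE pk = ?',
        "Parameters": [{"S": pk_value}],
    }


native = query_request("Users", "user#1")
sugar = partiql_request("Users", "user#1")
print(native)
print(sugar)
```

Both describe the same partition-key lookup. The PartiQL form reads like SQL, but nothing about the access pattern changes.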

But you can scan MongoDB or Elastic with Presto in parallel, and it works great when you only need to run it a few times. But if you find yourself using a NoSQL data store as a relational one, or for OLTP cases, just because you have the ability to run SQL on it, it is going to hurt you, and that should be obvious. As with everything in software, it depends.

We use the PartiQL library directly at Resmo https://www.resmo.com because it makes querying the datastore with nested values easier and it's storage independent.

Deleted Comment

I think treating PartiQL as syntactic sugar is the mature approach - no one is under any delusions then as to how the database works.

I mostly use it for ad hoc stuff whenever I need to get at something quickly. The SDK interface is horrible in its own right.

jugg1es · 3 years ago

Yea, I always thought of PartiQL as a replacement for the overly-complicated DynamoDB SDK (it has 2 different 'versions' and multiple query modes - a low level API and a higher document level).

tjansen · 3 years ago

Even weirder than PartiQL is Microsoft's CosmosDB for NoSQL. Its query language is called SQL, despite the database name being NoSQL. Also a very limited SQL dialect, just a bit more convenient than PartiQL. Without joins across tables/containers it is a very different experience than real SQL.

https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quer...

joshstrange · 3 years ago

> This got me thinking about DynamoDB (a database I use day to day, and for which I maintain dynaglue, a single-table mapping layer for TypeScript/JavaScript), and made me wonder if a DynamoDB adapter existed.

I hadn't seen dynaglue when I went looking recently for a TypeScript library for DynamoDB. I played around with a couple of options out there and ended up settling on TypeDORM. Overall I'm happy with it. I find it a little odd that so few TypeScript/DynamoDB wrappers support fetching multiple entities in a single call (Example: PK is userId, SK is userId for the User entity, and SK is addressId for the user's addresses; get the user and all their addresses in a single call by querying only with the shared PK). I guess I understand why: you'd need to be hydrating the objects returned from DynamoDB into classes (and thus storing something on the Items in DynamoDB that hints at which class to hydrate into), and it can be weird to get back an array of mixed class instances. In the end I just query for 1 entity at a time, or a group of child entities for a given parent, and I'm pretty happy overall.
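Roughly what that single-call fetch looks like, sketched in Python with an in-memory list standing in for the table. The `type` attribute, the key formats, and the `User`/`Address` classes are all assumptions for illustration, not what TypeDORM or dynaglue actually store.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union

Item = Dict[str, Any]

# Hypothetical table contents: one partition per user, mixed entity types.
# The "type" attribute is the hint stored on each item for hydration.
TABLE: List[Item] = [
    {"PK": "user#1", "SK": "user#1", "type": "User", "name": "Ada"},
    {"PK": "user#1", "SK": "address#home", "type": "Address", "city": "London"},
    {"PK": "user#1", "SK": "address#work", "type": "Address", "city": "Cambridge"},
    {"PK": "user#2", "SK": "user#2", "type": "User", "name": "Bob"},
]


@dataclass
class User:
    user_id: str
    name: str


@dataclass
class Address:
    user_id: str
    address_id: str
    city: str


Entity = Union[User, Address]


def query_partition(pk: str) -> List[Item]:
    """Stand-in for one Query with only a partition-key condition, sorted by SK."""
    return sorted((i for i in TABLE if i["PK"] == pk), key=lambda i: i["SK"])


def hydrate(item: Item) -> Entity:
    """Pick the class to build from the stored type hint."""
    if item["type"] == "User":
        return User(user_id=item["PK"], name=item["name"])
    if item["type"] == "Address":
        return Address(user_id=item["PK"], address_id=item["SK"], city=item["city"])
    raise ValueError(f"unknown entity type: {item['type']}")


# One "call" returns a mixed list; the caller splits it by class.
entities = [hydrate(i) for i in query_partition("user#1")]
user = next(e for e in entities if isinstance(e, User))
addresses = [e for e in entities if isinstance(e, Address)]
print(user, addresses)
```

The awkward part is visible in the last three lines: the result is a mixed list, and the caller has to sort it out by class.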

If you want to learn more about Single Table Design with DynamoDB then you should absolutely check out the book: https://www.dynamodbbook.com/ -- I was skeptical as I'm not really a programming book type of guy, but this was an amazing resource for how to think about Single Table Design and how to structure your data. There is a Hacker News coupon "HACKERNEWS" for $20 off that I found in an old thread and it still works.

TylerE · 3 years ago

Postgres with jsonb is pretty great, though.

tybit · 3 years ago

adamzegelin · 3 years ago