Readit News logoReadit News
Posted by u/olalonde 9 years ago
Ask HN: What happened to the ORM?
After all those years, I sometimes feel like the whole ORM thing hasn't made progress or been settled. Which ORMs do you like and what do you like about them? What is the sweet spot? I'm asking this because database I/O has always been my biggest frustration when doing application programming. These days, I do more functional programming and tend to side with the ORM haters but I do occasionally wish I could just abstract out the database more. Anyways, I'd just like to hear HN's thoughts on this.
ciconia · 9 years ago
I'm the original author of Sequel [1], an ORM for Ruby. Lately I've been finding that ORM's actually get in the way of accomplishing stuff. I think there's a case to be made for less abstraction in programming in general, and access to data stores is a major part of that.

I believe in most cases the efficiencies that can be gained from using an ORM is quickly offset by a substantial dependency on third-party code, limited control of performance, and inability to express more complex SQL expressions, such as CTE's, lateral expressions, JSONB functions and operators etc.

ORM's also tend to pile on lots of functionality, of which most projects will normally only use a small fraction.

For an in-production system I've been maintaining for the last 10 years, I've recently ripped out the ORM code, replacing it with raw SQL queries, and a bit of DRY glue code. Results: less code, better performing queries, and less dependencies.

[1] https://github.com/jeremyevans/sequel

bitexploder · 9 years ago
I don't feel any of those limitations using SQLAlchemy. Most projects just end up implementing a subset of ORM features poorly. I think your experience as an ORM author and contributor may rather uniquely qualify you to write "a bit of DRY glue code".

Good ORM lets you use a small slice of their features without any big performance hit in terms of development or actual. With a nice object model it just seems easier to remember and work with code.

When I need complex SQL I just write a raw query... It is trivial and concise in any ORM to drop down to raw SQL when needed.

I can agree they are no silver bullet, but a good ORM does so many chores for me I can't imagine bothering to do them myself, anymore when the pattern of using objects in my code and storing them in a SQL DB solves the problem at hand.

TeMPOraL · 9 years ago
> I think there's a case to be made for less abstraction in programming in general, and access to data stores is a major part of that.

I'd like to hear more of your thoughts on that.

My experiences with ORMs are less about too much abstraction, and more about them encouraging wrong abstractions. I'm yet to work with an ORM-using project that doesn't employ ORM objects as business model objects. The ones I've worked with always exploded due to the fact that even with ORM, database model doesn't map 1:1 to how the business logic would like to see the world - and the tension of trying to use one set of objects for both purposes at the same time is what messed up the codebase.

ciconia · 9 years ago
> I'd like to hear more of your thoughts on that.

These days it seems that even for the most rudimentary programming tasks everybody reaches straightaway for their favorite framework/lib without too much thinking.

I think in a way all these frameworks and libraries have had a negative impact on the general quality of coding. Programmers end up with limited knowledge of how the underlying technologies work: the HTTP protocol, SQL, the DOM...

And because these frameworks tend to grow over time, in an effort to be comprehensive and serve everyone's needs, they introduce more and more dependencies, more and more tooling, more and more code, which you'll need to get to know sooner or later.

Programmers who use said frameworks and libs become more and more "hooked", start using more features than actually needed (do you really need that auto generated REST API for that 3-page dry cleaning website?), and in the process become, well, dumber.

You can of course take this argument to absurdity by claiming that everybody should just write machine code. That's of course not what I mean. I just think that for many programming tasks, programmers would be better off - and no less productive - just by attacking the problem head on, with a good understanding of the technologies involved.

Sometimes a simple handsaw is more efficient and less hassle to use than that fancy sophisticated table saw you got in the corner.

nogbit · 9 years ago
Sequel was the last one that I used and was the best one. I can't remember the talk..."ORM's are the Vietnam of programming".

You are right, very little of the ORM gets used in the end and one almost always needs to write custom SQL anyway (models don't always map 100% to app layer). Most importanly, as languages change, evolve or dry up it's easier to find or train new talent with SQL experience vs. a language specific ORM.

vram22 · 9 years ago
>I can't remember the talk..."ORM's are the Vietnam of programming".

Had seen that article a while ago. Saw a few other mentions of it already in this thread, but here are a few I googled for and found:

http://blogs.tedneward.com/post/the-vietnam-of-computer-scie... (the original, maybe)

HN thread:

Object-Relational Mapping is the Vietnam of Computer Science (2006) (codinghorror.com) https://news.ycombinator.com/item?id=7310077

https://blog.jooq.org/2012/05/09/martin-fowler-on-the-vietna...

joeandaverde · 9 years ago
I haven't been a big fan of ORM's even before DataMapper in Ruby. I found that it was too easy to make non-performant queries and found myself writing raw SQL anyway.

Some argue that using an ORM means you can switch underlying database technologies on a whim. I think this is an incredibly weak argument. How often do people truly switch database technologies?

I created a small wrapper around the node postgres library to make querying a little easier.

Have a look at https://github.com/joeandaverde/tinypg - It's a no frills library that makes it easy to execute SQL files as prepared statements and pass parameters as keys on objects.

mewwts · 9 years ago
I've been battling a lot of cases where I don't want to duplicate SQL statements in my application. For example I have a query defined which I want to add a WHERE clause to. How do you handle these cases elegantly in code?
nogbit · 9 years ago
Some databases treat the query themselves as components that can be chained together before execution, look at RethinkDB and ReQL. For SQL, there are lots of ways to make dynamic sql (generated by code) and while the patterns are many the implementation in the language often looks bad and breaks most sql injection protections when using db middle ware.
dyeje · 9 years ago
Would love to see the DRY glue code.
ciconia · 9 years ago
A lightweight PostgreSQL-based Ruby object store using JSONB, in ~ 100 LOC:

https://gist.github.com/ciconia/57ad430cf437f0548d596a6f0026...

tony · 9 years ago
ORM's are good form, why?

- Avoids mistakes when dealing with writing raw SQL queries (SQL is quite repetitive in practice)

- The declarative nature of classes maps well to types and relationships

- The declarative nature of classes maps out well to tables, even with polymorphism [1]

- Keeping "Models" in an ORM often maps out well to migration utility (Alembic, Django Migrations)

- Object-chaining map very well to queries

- ORM objects can be reused and composed

- They can abstract out intricacies across SQL dialects

- They can potentially make it easier to migrate to different SQL servers if no specialized features were used

- Can help avoid common security vulnerabilities like SQL injections

- When something can't be expressed via ORM relationships, they tend to allow the dev to drop down to raw SQL. In the case of SQLAlchemy, there is a core query language [2], too.

- In the case of Django, QuerySet is used as a standard throughout extensions that power a whole community. Plugins that don't even know each other (e.g. django-filter and django-tables2) can operate on the same django queryset to filter/search and sort/display data.

I mention QuerySet/Django ORM quite a bit in a recent blog post at https://www.git-pull.com/code_explorer/django-vs-flask.html.

[1] http://docs.sqlalchemy.org/en/latest/orm/inheritance.html [2] http://docs.sqlalchemy.org/en/latest/core/

barrkel · 9 years ago
> Object-chaining map very well to queries

This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

The closer an ORM's API is to a monad, the happier you'll be. Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see.

IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed. Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.

bildung · 9 years ago
> This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

That isn't true for e.g. Django's ORM, which lazily evaluates the query and only actually accesses the db after that, with a single query, and filtering done in SQL.

git-pull · 9 years ago
> Fundamentally, accessing the database is executing remote code; the more you can package up into the query before it goes off and does anything, the better performance you'll see

The queries that most tend to be making just aren't that sophisticated. Relationships tend to be basic.

And I haven't even mentioned stuff that'd really, really hard to express/manage in pure SQL like tree/nested information that has to stay balanced [1]. Thanks django-treebeard/mptt and sqlalchemy-orm-tree.

> The closer an ORM's API is to a monad, the happier you'll be.

Developers using ORM's simply aren't caring about ORM's matching a certain construct. They care that models emit correct representations of their schemas and that the data is retrieved "fast enough".

Take it a different way: the best part about ORM's? They're effective 95% of the time, right out the box, so you end up avoiding time that'd be spent over and prematurely optimizing.

> IMO trying to shoehorn objects (in the OO sense, with polymorphism, data hiding and behaviour) into a database is wrong-headed.

Objects in things like SQLAlchemy declarative and Django Models map perfectly to generated SQL, so they also act as a way to generate migrations. It's that accurate. A lot of the relationships project's need expressed tend to be vanilla joins.

> Data hiding in particular is a wrong mental model for thinking about facts in the database, and the more is hidden, the harder it will be to reason about performance and bulk remote operations generally.

ORM's strive to hit a value sweet-spot in terms of code expressiveness, reducing duplication, handling the bread and butter relationships and types. That covers what most developers really need.

Perhaps there are projects out there not fitting to ORM's. Not all projects are sophisticated data mart projects, but even then, a good share of those still go back to simple joins at the end of the day.

And I've even gone as far as trusting heavy-duty stuff like django-mptt, along with plugins that filter and sort. I don't even look at the queries, all I see is they're running performantly. In all these years, SQL queries have never been a bottleneck. Maybe it's because I'm only storing simple stuff.

[1] https://en.wikipedia.org/wiki/Nested_set_model

rodorgas · 9 years ago
An ORM provides easy DB access and various implicit operations. The trade off is that if you don't know what you're doing, your app gonna have bad performance. Seems fair to me.

Also, Django ORM documentation is very clear and you can use Django Debug Toolbar to analyse the raw sql generated.

Deleted Comment

Dayshine · 9 years ago
>This isn't remotely true; it turns what looks like an in-memory access into a network round trip. Navigating your database using an idiom of lists and member accesses is a fast path to n+1 queries all over the place, or ORM tuned to eagerly fetch the whole world when you only touch a tiny piece of it.

Only if you use naive ways of doing this. In .NET at least your overcomplicated expression can be compiled down to the minimum query needed to pull the bits of data you use.

jasonkester · 9 years ago
But they're unnecessary baggage, because your top six reasons all work in reverse.

If you define your schema in your database and derive your data layer from that, you get everything in your list (apart from that thing that totally always happens where you switch your underlying database technology every few months).

But then you don't have your database defined in two places. And if anybody ever does modify the db by hand, your build will break and it will quickly surface itself as an issue at compile time instead of via an obscure error message somewhere 40 levels deep in the call stack.

mewwts · 9 years ago
How do you in practice derive the data layer from the schema? Somehow parsing the schema in the application code?
jacobush · 9 years ago
Thank you.
richardknop · 9 years ago
Yes, with Django I would definitely stick to ORM. But when working in Golang or some other new language, unfortunately there isn't any ORM library that good. So people usually stick to SQL.
tastyham · 9 years ago
I always liked Django's ORM because it didn't even try to do any complicated stuff, it forces you to drop down to raw SQL instead (I haven't used Django in years, hopefully it's still like that).

Compared to something like Hibernate or SQLAlchemy that tries to support everything under the sun and can result in a lot confusion when trying to understand what exactly it's doing.

Deleted Comment

slackingoff2017 · 9 years ago
They're still alive and well, just not sexy anymore. The most popular one is probably still Hibernate which is helped by Java being the top high level language.

I never understood the ORM hate. Every place I worked we intermingled raw SQL when needed. Hibernate has a way to clear query caches so you can use raw SQL when you need to. You can just write raw SQL exclusively if you want within hibernate so I don't get how you could lose anything :) .

Still, my experience is mostly with Hibernate. It's extremely mature, meaning reliable, feature complete, and only 0-30% slower than raw queries in most cases. It makes adding support for things like multitenency and audited tables a breeze without losing database agnostic behavior. It makes database upgrades a no-brainer 95% of the time too. It has a built in memory caches that help enormously with common queries. Probably the biggest thing is it makes the database strongly typed so it's possible to refactor. Code refactoring in Java is easy but raw stringified SQL is nearly impossible to fix in any language.

I think the biggest counterpoint to ORM is shitty ORM. Things like SQLAlchemy generate downright horrific SQL slowing everything to a crawl and causing deadlocks. Another honest counterpoint to ORM is the learning curve. Everyone is taught SQL but the ORM is more abstract and harder to reason about, not a fun thing to learn.

TBH I think most ORM's are just poorly done. Putting an object abstraction on a relational database is hard. The only ones I've enjoyed for completeness and performance are Hibernate and to some extent Entity Framework. EF being the easiest to use but a bit slower with less features.

I have heard good things about Dapper but never used it. I like the idea of injecting a SQL DSL into the language itself, wish it was more prevalent.

Capt-RogerOver · 9 years ago
I think your experience reflects that basic gist I've gotten from most professionals: ORMs are complicated and convoluted because they provide a solution for a very complicated problem. If you are somewhat new and don't understand things, it's easy for you to see the few problems introduced by ORMs and think that it means that the whole concept is just unnecessary bloat. With more experience and wisdom, you understand that the few quirks of ORMs are a necessary evil for having a software that solves such a complex and flexible task.
klodolph · 9 years ago
That's a good analysis of things, and on top of that, when you're new you'll treat an ORM like a hammer in search of a nail. So someone who's less experienced and doesn't realize that you should supplement the ORM with raw SQL queries might try to evaluate complicated queries in the ORM, with the usual results. Some of that has changed with e.g. LINQ and IQueryable in .NET, but it's no panacea.
wiz21c · 9 years ago
>> Things like SQLAlchemy generate downright horrific SQL slowing everything to a crawl and causing deadlocks.

Not my experience... Would you care to explain ? SQLA works pretty well for me, as long as I stay in rather simple queries. For example, data analysis queries are next to impossible to express with SQLA, but that doesn't matter much.

For the simple update queries, it just works fine for me. It also allows for very good control on relationships, etc.

majewsky · 9 years ago
I've always had success with ORMs when I let them write the 95% of trivial queries (SELECT * FROM table, INSERT INTO table, UPDATE table SET ... WHERE id = x). The complex queries I'd always rather write by hand to take advantage of the expressiveness of SQL, and to make sure that queries hit my indexes correctly.
slackingoff2017 · 9 years ago
Maybe things have changed, I haven't used it since school and needed an example :-/
fzaninotto · 9 years ago
It's now called GraphQL. Let me explain.

First, the need for data has moved from the server to the client side in API-centric architectures. The backend is often just a dumb HTTP layer on top of a (sometimes relational, sometimes not) data persistence. Heck, I even built apps without server, using PostgREST.

Second, on the client side, we don't need an object-oriented interface to the datastore, because JavaScript. But we do need a powerful declarative query builder, and static data types to build up on. GraphQL provides that.

Third, if the purpose of the ORM was ever to abstract the SQL dialect of the database, GraphQL has resolvers, which turn a GraphQL request into a request for the database of your choice. So it does that job, too.

Last, for the data crunching jobs that you decide to do on the backend, you often need specialized data stores (event store, document store, full-text search index), and ORMs don't address that. And since these jobs are often resource intensive, you need the best performance - therefore falling back to raw SQL in most cases.

ORMs were a good solution for the kind of software architectures that were used for the web in the 00s. Since then, we have different needs, which require different solutions.

Disclaimer: I was the maintainer of a popular open-source PHP ORM [1] for 3 years. I no longer use ORMs.

[1] http://propelorm.org/

ruslan_talpa · 9 years ago
What have you used PostgREST for?
fzaninotto · 9 years ago
Public and private web apps for a French media company. Dumb backend, most of the logic was in the frontend (React) and in async workers (Node.js, with direct access to the database without ORM).
coldtea · 9 years ago
What happened to the ORM?

To my opinion, it was a bad solution to the wrong problem.

For one, we're not that enamoured with objects anymore (what with functional programming, immutability, etc).

Second, SQL and DDL, being declarative, is both a higher abstraction that (at least) most ORMs, and offers more fine level control to the DB at the same time!

Third, people don't really switch databases that often, for the abstraction between different SQL syntaxes to matter.

Scarblac · 9 years ago
> Third, people don't really switch databases that often, for the abstraction between different SQL syntaxes to matter.

But third-party libraries shouldn't assume any specific SQL database, so if you want libraries that can do database things, ORMs are very useful.

TeMPOraL · 9 years ago
They can always do SQL abstraction without doing the object-relational mapping.
MrBuddyCasino · 9 years ago
this is what you want: http://www.jooq.org/
codeulike · 9 years ago
An old article, but still somewhat relevant I think:

"ORM is the Vietnam of Computer Science" (2006)

https://blog.codinghorror.com/object-relational-mapping-is-t...

_Codemonkeyism · 9 years ago
I thought we've been through :-)

"ORMs are a thing of the past" (2009) http://codemonkeyism.com/orms/

"Be careful with magical code" (2010) "[...] ORMs and in particular Hibernate [...]" http://codemonkeyism.com/beware-magical-code/

barrkel · 9 years ago
My experience is strongly flavoured by developing and optimizing on the edge of what the database is able to do; and my performance condition isn't scaling up millions of tiny trivial queries, but getting big hairy dynamically generated queries to execute efficiently within a deadline.

No ORM comes remotely close to being able to solve this kind of problem. The kinds of things I need to consider:

- analyzing select / predicate / order / etc. for tables used and selectively including tables

- forcing query evaluation order by creating a subtable query with conditions, projecting out an id, and rejoining in the next query up

- analyzing predicates and rewriting predicate trees to push predicates down into subtable queries, or convert predicates into joins, or replace references to columns on other tables with denormalized columns

- gathering heuristics about query set size and changing query generation technique

- splitting queries into parallelizable queries and composing result set client-side

An ORM doesn't really help me. Even a library tuned for symbolic manipulation of SQL doesn't help hugely; my life is made much simpler when I can freely extend the SQL AST classes with my own analytic and rewrite methods.

joncrocks · 9 years ago
Isn't this the type of thing that 'the' (or a) query optimiser is meant to do?

With a suitably tweaked ORM, you can pass platform-specific hints down to the DB. The platform's query optimiser should be (obviously up for debate!) the one that's taking a query and working out the most efficient way of executing it.

Not saying it always will always pick the best execution, but if you're getting to the point of supplanting the optimiser you're starting to replace the database itself.

At which point you're probably beyonds the realms of what an ORM can and should reasonably doing, you're starting to writing your own custom persistence platform.

barrkel · 9 years ago
No, query planners are generally hopeless; for my use cases, it's trivial to get orders of magnitude better performance with different query structures.

If you have just a predicate, or a single join, most query planners are just about ok; with multiple tables and predicates touching those tables, there's ever increasing risk it will start the join on the wrong table and produce too many rows somewhere in the plan. Engines like PostgreSQL are unpredictably stupid (but sometimes very smart), while MySQL is predictably stupid (but consistently fast when you keep it simple and avoid specific constructs).

When designing for interactive use, you want to design for a maximum run time on the query. If that means using a query which is often not the fastest, but is never slower than X, it's the query to use. Smart planners are more of a hindrance than a help, because they reduce predictability.

The worst case is when a planner starts with a small row set (planners usually like starting with small row sets) and joins against a big table expecting a similarly small joined set, but ends up with an enormous result set because of the cardinality of the join. Estimating join cardinality is a tough problem for database vendors to solve, and it's critical to figuring out the correct join order - particularly since the number of join strategies grows with the factorial of the number of tables to join, so the planner can't spend much time on it. Meanwhile, the application developer may very well know the designed cardinalities and a join sequence that minimizes the chance of a blowup, the most important thing for interactive use.