Python: Just Write SQL

ORMs do much more than "write SQL". This is about 40% of the value they add.

As this argument comes up over, and over, and over, and over again, writers of the "bah ORM" club continuously thinking, well I'm not sure, that ORMs are just going to go "poof" one day? I wrote some years back the "SQL is Just As Easy as an ORM Challenge" which demonstrates maybe a few little things that ORMs do for you besides "write SQL", like persisting and loading data between classes and tables that are joined in various very common ways to represent associations between classes:

https://gist.github.com/zzzeek/5f58d007698c4a0c372edd95ab8e0...

this is why whenever someone writes one of these "just write SQL" comments, or wow here a whole blog post! wow. I just shake my head. Because this is not at all what the ORM is really getting you. Plenty of ORMs let you write raw SQL or something very close to it. The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects. Not to mention abstraction over all the other messy things the database drivers do like dealing with datatypes and stuff like that.

It looks like in this blog post, they actually implemented their own nano-ORM that stores one row and queries one table. Well great, now scale that approach up and see how much fun it is to write the same boilerplate XYZRepository / XYZPostgresqlRepository code with the same INSERT / SELECT statement over, and over again. I'd sure want to automate all that tedium. I'd want a one-to-many collection too maybe.

You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings, and still use the ORM, and still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows. But why would anyone really want to, writing SQL for CRUD is really repetitive and tedious. Computer can do that for you.

marcosdumay · 2 years ago

The one distinguishing feature of an ORM is that it drops SQL and the relational paradigm and places the developer completely within the OOP world. If you drop this, you have merely a database connector library, with much more freedom of behavior than an ORM.

Yes, since they are an notoriously bad abstraction, every ORM will give you an escape hatch for the minority of tasks that it can't abstract at all. That escape hatch is not in any way a defining feature of the system.

Now, about that extra freedom that you get from dropping the requirement that your connector is an ORM... well, neither your data at rest, the presentation to the user, the abstraction for a service client, nor your validation rules benefit from OOP. Proof of that is that OOP interfaces from all of those things precede the current popular ones, and all of them were abandoned due to fundamental dissonance between the model and the data.

The rationale for an ORM is that, even though none of the interfaces you actually want for your data is best done in OOP, somehow OOP is still the best way to integrate them so that you can reuse data and metadata. This thought is not completely without merit, but there is very little empirical motivation for it, and the tiny amount that exists is more than completely explained by the OOP hype that only started to die around a decade ago.

EDIT: Oh I saw you wrote SQL Alchemy! First, thank you for that great piece of software.

Now, SQL Alchemy does provide a lot of useful ways to postpone the object representation or even to map your values into dumb data. My comment up there is on theoretical limitations, but on practice, I do think it's the best option available on Python. (And maybe about the best option afforded, since the language is intrinsically biased into OOP.)

lelanthran · 2 years ago

Okay, you're the expert here, and I'll happily concede that I am not (and apologise in advance if I seem to be disrespectful), but ...

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

I think that that is the problem: mapping a relational dataset to a hierarchical dataset is the digital equivalent of pounding a square peg into a round hole.

I know you've read Ted Neward's "The Vietnam of Computer Science." (it's a short read, so search for it), and he articulated the same thoughts I had each time I had to work with an ORM in a mainstream language.

There is an impedance mismatch of sorts between relational data and hierarchical data. In cases where there is no hierarchy, you can do away with the ORM completely and turn out much easier to read code, for example with `sqlc` and `Go`.[1]

The problem with the ORM is the programming language - one in which hierarchical objects are the idiomatic way to code (all of the mainstream OO languages) is always going to require some wrangling.

> still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows.

Unless you're using `sqlc` and Go, where the automation is provided

[1] Now you may argue that sqlc is technically an ORM, but then where do we draw the line for calling something an ORM? Is it the case that any method for "generating boilerplate for mapping relational datasets to in-program structures" is going to be called an ORM? Because to my mind, the result is so different for in-program representation that they aren't the same thing.

zzzeek · 2 years ago

yah I read Neward's thing, and it was one of the main reasons I wrote SQLAlchemy in the first place, because he was just so wrong. It read like he tried to write some object relational thing and it didn't work out, so he goes off and rant rant ORMs are wrong. Kind of proving that post wrong was one of the primary goals of SQLAlchemy, really, where I sought to change the question of "impedance mismatch" and "leaky abstraction" and all that and redefined the ORM / SQL abstraction layer as *automation*, not any attempt to "hide" anything.

I mean, that was really an important point in time when there really werent ORMs that were easy to work with, there was Hibernate in a very early stage and there were overly simplistic things for Perl, so I thought it was important that this "better way" I had in mind could be put out there, before the idea that "yeah let's all avoid ORMs unconditionally" could take hold.

Yeah if your app has one table and two queries, write the SQL and marshall the data yourself. I think everyone should do that approach anyway for awhile so they know what's actually going on. But if you are writing for 1200 tables, it's just not practical. if the Go thing is also automating that and generating SQL / data marshalling boilerplate code for 1200 classes, yes that's kind of ORMish. that's a totally valid way ORMs are written and if I had multiple lives I'd probably write a stored procedure ORM that does something like that too.

dagmx · 2 years ago

I completely agree with you. Every single time someone says: “Just write your own abstraction over an sql generator ”, it eventually devolves into a full blown ORM.

I swear the majority of these opinion posts against ORMs are from people who must have worked in a badly implemented project that left them with a bad experience and they blamed the pattern rather than the technology.

One of the best bits about an ORM is making it consistent for non-db users on the team to simply grok and work with, without creating monstrous and hard to debug joins everywhere. But when badly set up it can lead to a lot of debugging spaghetti. Which is the same as can happen with SQL but I suppose people think that at least there’s one less layer to debug while ignoring the problem is actually how they got where they are and not the technology

I switched a project from manually written sql to sql alchemy on a project that’s used by multiple Oscar nominated films daily for reviews. The SQL version was gross and impossible for the team to manage because it had bloated over the years, with no nice way to detangle the statement generations and joins. SQL Alchemy made it so any one of the technical directors on the team could step in and add new functionality, without serious performance footguns. Instead of me having to clean up bad sql every year (projects would fork per film and merge at the end) to keep performance up, I could trust the ORM to do that for me.

At its worst, it was way too easy for TDs to get the raw SQL to be tens of minutes per review session by structuring their logic incorrectly, but it was so difficult to see. Switching to an ORM meant I could get the performance down to seconds per session and they couldn’t destroy the performance in subtle ways.

joaodlf · 2 years ago

> I swear the majority of these opinion posts against ORMs are from people who must have worked in a badly implemented project that left them with a bad experience and they blamed the pattern rather than the technology.

Fair. But you then proceed to detail a personal experience on the other side of the spectrum: Badly written code, without an ORM, and how it was fixed by introducing an ORM.

I think we can all agree that you can write bad code, with or without an ORM :). Not that this is entirely relevant to my post, I am simply advocating for writing more SQL, not how to write a good object mapper. That's a different beast, and I purposely kept that simple just to illustrate that it is possible to get started without too much pain.

waffletower · 2 years ago

Enough people have had bad experiences with ORMs who have decided to evolve and find alternatives. ORMs are bulky and do not provide value to many modern functional, data-first development paradigms.

hnfong · 2 years ago

> I swear the majority of these opinion posts against ORMs are from people who must have ...

"microservices" - https://news.ycombinator.com/item?id=37125636

That cleared things up, at least for me. I totally agree that if your job is to essentially write a custom, RESTful, fault tolerant, 99.999% uptime accessor for a table or two, basically you're hired to be the ORM, so you don't want to just suggest people to use a ORM framework and effectively make yourself redundant.

jpc0 · 2 years ago

> and still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows

I've found I would generally only need the handling of database drivers and query building since I'm already writing a validation layer and to add data marshalling to that is pretty trivial.

Likewise practicing YAGNI, what is the chances I need multiple database drivers for different databases, it's extremely unlikely that I'm going to be chopping and changing between different databases so I'm really only writing that code once.

I would argue to start with writing the basic SQL queries and adding an ORM later when you know you actually need it.

It's much easier to onboard someone into pure python code + a db driver vs having to onboard someone to SQLAlchemy, since it is quite a complex piece of software, necessarily complex for what it is trying to achieve but if you don't need it it's not a good fit.

jacurtis · 2 years ago

> I would argue to start with writing the basic SQL queries and adding an ORM later when you know you actually need it.

I think you missed the point of the parent comment, which is that ORM's are not about writing SQL queries (although they do that). But ORMs are about moving data around, transforming it from rows and columns into meaningful objects in the project's language and data model.

As the parent comment suggested, if you are dying to write your own SQL (which does often happen as queries get more complex and don't fit into ORM language model) then you can write raw SQL but still let the ORM do the heavy lifting so you can take advantage of those features, which is the majority justification for the ORM in the first place.

You are basically suggesting what the original post author is suggesting. The comment above provided a rebuttal to their argument and you replied by suggesting the same thing they originally rebutted to. Hence a circular argument.

ris · 2 years ago

> Likewise practicing YAGNI

This is why I have come to hate YAGNI. Nowadays when it's said what I hear is "I don't understand why I need it (yet)".

joaodlf · 2 years ago

First of all, thank you for SQLAlchemy! If I ever had to make a final choice in how I would interact with a database for a very large project that involves a considerable dev team, I would always bet on SQLAlchemy. Not that I would necessarily like all aspects of it, but when it comes to Python and SQL - “Nobody ever got fired for picking SQLAlchemy.”.

With that out of the way, despite ORMs doing much more than "just writing SQL", it is exactly on that point that I flinch: Most devs should be exposed to SQL. And if your project allows you to build around simple enough abstractions so that you aren't reinventing the wheel, you should definitely be writing SQL. Especially if you don't know SQL yet - which is the growing case of new devs coming into the job market.

You can achieve a lot with SQlAlchemy Core, a tool that I absolutely recommend, but my post is just a simple alternative to get developers to think about their approach. If that results in some devs reconsidering using "full fat" SQLAlchemy and to try SQLAlchemy Core, that's a win for me!

Your gist tries to highlight the difficulty of doing certain things without an ORM. Migrations (as just 1 example) doesn't need to be hard, simple tools like flyway, or migrate (https://github.com/golang-migrate/migrate) achieve a similar result (while also keeping you on the path of writing SQL!). Deep and complex relationships between objects also don't need to be hard - typically people approach this subject with a requirement to be very flexible in the way they want to build queries and objects, but that to me in a sign that maybe they should reconsider their business logic AND reconsider that, just maybe, their project doesn't require all that flexibility, it is fairly straightforward to extend objects and introduce some more complex representations as and when it is needed - will all of this make me write code faster? Absolutely not. That is why you have spent so much time perfecting SQLAlchemy, but then again, I am not advocating for devs to go and replace their usage of ORMs, just presenting an alternative that may or may not fit their needs for a new project + give devs the chance to learn something that the ORM might have taken away.

xp84 · 2 years ago

> writing SQL for CRUD is really repetitive and tedious

Agreed, and this is the primary reason that ORMs are a necessary tool. And possibly that it's easier to train a junior developer to use one than it is to get them to a basic level of proficiency (and security awareness!) in SQL.

That said, I think this is one of those areas where It's Complicated, because inefficient database calls due to ORM usage is one of the primary ways I see applications completely break down. For most types of apps I've seen, which are very read-heavy and which aren't doing intensive writes, roughly all your writes should probably be using the ORM (including such niceties as validations, default scopes, all that nice stuff), and if it's simple enough, your "show" actions (fetch and display one entity) may be fine as well, but every "dashboard" and "index" action (show many entities, basically things with joins) likely need to be written in SQL.

In my experience (Rails), the object instantiation cost is insane, much greater than the actual time talking to the DB, so not only do you need to write SQL, but you need to handle the data that comes back without instantiating ActiveRecord models.

This is much harder work (more specifically, it needs much greater skill and experience, and is easier to make a mess of) versus using the ORM and models, but only on apps with tiny amounts of data per request, or very low request volume, can the ORM be a serious exclusive option for this task. Unless you want to end up like an app I once was asked to help fix, where they were on the $9,000 per month Heroku postgres instance, and since that's the biggest one, they could "scale" no bigger. (Okay, this wasn't their only problem, their main one was not understanding that you don't sprinkle analytics DB writes all over the place because now the simple high-volume "read" pageviews can't be generated using just a read replica).

SPBS · 2 years ago

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

That's easy. It's a database mapping library. Write the query, give it an object and it fills it in or reads from it. You can do this with annotations, or struct tags in the Go world. There's no need to introduce abstractions over SQL joins which I find very off-putting, because the abstractions are never perfect and suddenly you have to learn invented concepts just because you didn't want to write a JOIN in an SQL query but rather have some clever framework introspect some classes and automagically write the JOINs for you.

IKantRead · 2 years ago

To add to what you're saying: It's not like the Object-relational impedance mismatch[0] is some great unknown property of ORMs. Since people have been putting ORMs into production designers of these systems have been well aware that you must always chose a trade-off between a full functional object system and a fully relational one.

And the mismatch has two sides. If one's answer is "just use SQL!" then you're going to have new problems dealing with the mismatch coming from the SQL side of things.

The pre-ORM solution to this was not simply to shove a bunch of SQL in your application logic (though this was done), but to have a much, much more complex set of data layer logic. I'm guessing most people today writing about eschewing ORMs in favor of pure SQL have never used: stored procedures, triggers, cascades etc. I personally do miss some of the features from that era of software, but there's a lot more complexity to the "just write SQL" approach than most people realize.

0. https://en.wikipedia.org/wiki/Object%E2%80%93relational_impe...

BeefWellington · 2 years ago

While I somewhat agree that a lot of these articles are people who just don't actually try/use the full feature set of ORMs, I don't agree with the overall premise you're presenting that they really do more than write SQL for you. The other things they provide are largely just abstractions around how the queried data is returned and some additional metadata tracking of the relationships. Your estimation of those parts being 60% of the value added is probably generally wrong for most users of SQLAlchemy. Not having to consider or use another language syntax while writing your code is probably closer to 70-80% of the value for most people.

Your example gist is essentially just "ORMs excel at this one thing, anything else is worthless" and also effectively hides code ("the benefit of the library!!!") to make it a very much disingenuous comparison.

> But why would anyone really want to, writing SQL for CRUD is really repetitive and tedious. Computer can do that for you.

Performance.

I wanted to love SQLAlchemy but even in some relatively simple things it generates really asinine queries that take insane amounts of time. It's just usually not noticeably insane amounts of time at first. So you query for some data and wind up with a query that returns in 20ms when there's 20 rows in the database but falls apart when there's 200,000.

It's also bad at projecting analytics type queries onto a transactionally-normalized database, but then most ORMs are not great at that.

I had both of these issues enough times that I instead just opted to start handcrafting queries in those cases and using a single "ResultSet" type class that projects as both a SimpleNamespace and a dict.

zzzeek · 2 years ago

Would need to know what versions of SQLAlchemy you spent time with and what exactly were these "asinine" queries you refer towards. SQLAlchemy generates very good queries these days for the information its given. Years ago there was heavy use of large complicated subqueries for many cases and those days are long gone.

Also my 40% / 60% breakdown is based on the SQLAlchemy source code itself regarding what parts of the code deal with generating SQL and what parts deal with all the rest. 40% is likely a large overestimate.

TX81Z · 2 years ago

Agree, my datasets have multiple billions of rows and if I don’t know the details of the query, or have the ability to tune it, it’s utterly insufficient for my needs.

I still fail to see how anybody who actually knows sql and works with “Big Kid” datasets would use an ORM.

ansc · 2 years ago

I’m using SQLAlchemy in my job, and have worked with Python for many years. Never have I seen a good case of someone using SQLAlchemy to hydrate objects from raw SQL queries. I’ll definitely admit — I have not gone out of my way to search for it. It seems that it is a common want to do this kind of 60% benefit ORM you speak of, but it’s definitely unclear to me how to pick those parts together with the daunting (and fantastic) piece SQLAlchemy is.

taeric · 2 years ago

My only caveat to what you are saying, is that I have yet to see anything that successfully scales up to many tables. I can also count on one hand the number of products I have seen that successfully migrated between big databases in a meaningful way without a ton of ancillary rewriting in the process.

That said, I fully agree that the ORM isn't necessarily the problem. I point the blame at over eager data modelling that ignores the OLTP/OLAP divide. Or that ignores that row creation has a different contract than row editing, such that using the same object for both leads to confusion with missing fields. Heck, even different edits have different contracts, such that mapping a single "object" to a database row is often a mistake.

hot_gril · 2 years ago

> writing SQL for CRUD is really repetitive and tedious

If you think of relational DBs as just CRUD machines, an ORM makes total sense, but that's the original mistake.

audunw · 2 years ago

Is it a mistake if that's all you need?

I'm kind of on the side of avoiding ORM unless you have a clear need for it. But I've seen projects where they are very valuable. If you mainly want to register, update and delete a bunch of structured data, it's not a bad idea to put it in a dabatase. And if you do those operations a lot in some part of the application, it's not a bad idea to use an ORM there. For analysis and reports and such I would probably suggest just writing SQL directly though.

TX81Z · 2 years ago

I think that’s part of why I’m having trouble with this framing - it’s treating CRUD as the entire universe of why you’d need to connect to a database.

Some of us do very intense compute in very large datasets, and ORM are not capable in those tasks. At all.

epgui · 2 years ago

> It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects. Not to mention abstraction

That's actually the problem.

I'm sure you're aware that ORMs have this fundamental problem called the ORM impedance mismatch problem. AFAIK it remains unsolved to this day.

> You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings

This is true, but it's like saying you can disassemble a Boeing 737 to build chopsticks. It's true but completely unnecessary.

gmassman · 2 years ago

Thanks for you perspective, Mike. Completely agree that the interface between data and code should be handled by a single tool. That tool must meet some minimum complexity, because it’s solving a very hard problem! Also just want to say that my team has benefited greatly from your work on SQLAlchemy, and we appreciate you immensely!

habitue · 2 years ago

Honestly, sqlalchemy is such a different breed of ORM, whenever people slander ORMs, I'm imagining they're thinking of like Django ORM or ActiveRecord which are like the duplos to SQLalchemy's Legos.

aforwardslash · 2 years ago

> I'd sure want to automate all that tedium. I'd want a one-to-many collection too maybe.

You don't actually need an ORM for that. And some of us (I know I'm a minority) actually want to have pure data objects. Some of us (talking about myself again) actually design schema-first - as in, you use DDL and your code just consumes the definition. Some of us don't care about code knowing about relationships which are an intrinsic part of the data model.

> You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings, and still use the ORM

SqlAlchemy is an amazing piece of software, and it is relevant becaise it is actually part of the problem. You don't really have an ecosystem of database abstraction libraries in python, as you have in other languages. The options are SqlAlchemy or Django ORM. Golang? You have 50 different libraries that implement 100 ways of doing the same stuff. C#? You have probably a dozen libraries besides Entity framework that provide different level of abstractions? Java? You got low level stuff up to high level ORMS. PHP? The same.

Python? You have SqlAlchemy, Django or roll your own. Both SqlAlchemy and Django can do raw queries with little effort - but if you're doing this, arent't you discarding most of the functionality that comes with it - namely, code maintenance? If you have a big project with a bunch handwritten queries (because you can do it, and quite easily, I may add), aren't you just ignoring any of the advantages of the orm? If, when you edit a model, you need to also check handwritten sql queries, you shouldn't be using an orm in the first place.

And with SqlAlchemy and Django, there is no middle ground. You either are all in, code-first definitions, or you are in a world of pain.

That is why specialized applications tend to be rewritten in other languages. Because optimizing the data layer alone often boasts 10-100x performance increase, when compared to this kind of abstraction.

> writing SQL for CRUD is really repetitive and tedious

It is. But between that and SqlAlchemy - in most languages, there are options. Not in Python. Between orm and no orm, I'll take no orm any day of the week - at least I can understand the queries.

tru1ock · 2 years ago

I picked up some SQL knowledge through osmosis by using ActiveRecord and I do wish there was a better connect between what the ORM did and the end result. There were some tools to see what code generated what queries but it was not that intuitive and in your face like for example how you would have to deep dive frequently in your generated front end javascript and css code.

In other word I think there are some tooling left on the table that can assist in increasing SQL literacy and comprehension.

ckmar · 2 years ago

> The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects.

If anyone is looking for this "pure" ORM (no query builder API) in Node there is https://github.com/craigmichaelmartin/pure-orm

TX81Z · 2 years ago

I’m sorry but 90% of the time I encounter somebody who swears by an ORM I’ll figure out the reason they use it is because they didn’t know SQL to start with and didn’t commit to learning a new language.

The number of people who know SQL well and still choose an ORM seems to be very, very low in my experience.

aidos · 2 years ago

Low, maybe, but we’re out there. Sqla just saves me so much time in general. Sometimes it’s easier to drop down a level, but for 99% of my db interactions the code will be significantly shorter using sqla, so I choose to use it.

hintymad · 2 years ago

Do we know what Hibernate did wrong? It used to be very popular among Java engineers but later seemed have become an obscure technology.

akkad33 · 2 years ago

Do you know why? In our company we still use hibernate heavily in our java web app

siva7 · 2 years ago

badass mic drop. i will refer to your comment whenever i see this discussion coming up again and it will come up again as long as people are still learning the art of software development.

If you're going to end up querying all the fields and putting them into a model like this dataclass anyways... Django can do that for you. If you're going to later pick and choose what fields you query on the first load, and defer other data til later.... Django can do that for you. If you're going to have some foreign relations you want to easily query.... Django can do that for you. If you're doing a bunch of joins and are using some custom postgres extensions for certain fields and filtering... Django can help you organize the integration code cleanly.

I totally understand people having issues with Django's ORM due to the query laziness making performance tricky to reason about (since an expression might or might not trigger a query depending on the context). In some specialized cases there are some real performance hits from the model creation. But Django is very good at avoiding weird SQL issues, does a lot of things correctly the first time around, and also includes wonderful things like a migration layer.

You might have a database that is _really_ unamenable to something like an ORM (like if most of your tables don't have ID rows), but I wonder how much of the wisdom around ORMs is due to people being burned by half-baked ORMs over the years.

I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern. But I think my bias is influenced by always working on software where there are just so many columns per table that it would be way too much busywork to not use something.

noirscape · 2 years ago

The main problem I've encountered with complaints surrounding ORMs usually tend to be the result of trying to overfit the ORM in a certain way.

ORMs are, for the most part, good at the CRUD operations - that is to say, they easily translate SELECT, UPDATE, INSERT and DELETE operations between conventional class objects and database rows.

Things they usually aren't very good at are when you start trying to do things that require a lot of optimization - it's very easy to have an ORM accidentally retrieve way more data than you need or to have it access a bunch of foreignkey data too many times (in Django you can thankfully preload the latter by specifying it in a queryset). That's less an issue for basic CRUD, but is an issue if you're doing say, mass calculation and only need one column and none of the foreignkey data for speed reasons.

Basically - an ORM is good but don't let yourself feel suffocated by it. If it's not a good fit for an ORM, then don't do it in the ORM, either use SQL code to do it in the DB server or do a simpler SELECT (in the ORM) and do the complex operation in your regular application before INSERTing it back in the db (if that's a goal for the operation anyway). If it's outside of the CRUD types of DB access already, the extra maintenance overhead you get from having non-ORM database code (if you're doing the SQL approach) in the application would be there anyway, you'd just get a very slow application instead of a hard error, and the latter is easier to troubleshoot (and often fix), while with the former you need to start pulling up profiling tools.

pydry · 2 years ago

>Things they usually aren't very good at are when you start trying to do things that require a lot of optimization

I find this ends up being, like, 1 or 2% of queries. It's also very hard if not impossible to guess which queries will end up in that group.

You're better off building it with the ORM first and breaking out SQL later when you are trying to performance optimize.

There is also a small % of queries which use some feature of your database engine which the ORM won't support.

agumonkey · 2 years ago

I realized too late that objects, as of now, are not capable of synthetizing a new class/type based on joins.

jononor · 2 years ago

The most stupid issues with "active record" type ORMs is the implicit queries on member access, especially in collections - leading to the N queries problem. But in SQLAlchemy one can actually turn that off - that is, make it throw an exception when undeclared table dependencies are attempted to be accessed. This restores sanity. And one gets to keep goodies you mention, plus Alembic migrations (mentioned by another). Also one can write direct SQL too with SQLAlchemy, or use the "core" layer to keep a DSL but avoid ORM.

lijok · 2 years ago

> I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern. But I think my bias is influenced by always working on software where there are just so many columns per table that it would be way too much busywork to not use something.

You end up writing a query per usecase, rather than writing generic queries that can be stitched together however in the business logic

evantbyrne · 2 years ago

The first agency I worked at did this on their Java projects. They should have just used a fully baked ORM. Basically, they ended up creating a massive query layer in the program which contained all the different queries organized into different interfaces. To edit a simple API endpoint you would have to open like 5 different files at a minimum. And because queries were usually tailored to logic in a specific controller, they were not typically reusable. It was always a relief to go back to Django after dealing with that.

baq · 2 years ago

Have this right now. Fortunately no custom query builder. It's enough of a hell as is. Cleanup will take years.

rtpg · 2 years ago

So what happens is that you have the one query that runs for a specific page and fetches the data and the relevant fields? I could definitely see that working for many projects, at least while your objects don't have too many tiny little details to pull out of thhe DB

llanowarelves · 2 years ago

I think it ends up being hybrid, like seen in CQRS and especially with DDD. Toss in "Vertical Slice Architecture" as well. What abstractions you want can decided on a per command or query basis and it feels natural.

Deleted Comment

pjmlp · 2 years ago

That is what stored procedures are for.

rowanseymour · 2 years ago

> issues with Django's ORM due to the query laziness making performance tricky to reason about

It's infuriating that this is still not a thing you can disable (https://code.djangoproject.com/ticket/30874). Pretty much my only gripe with the Django ORM which I'm a huge fan of (and I also write lots of SQL).

Daishiman · 2 years ago

There's a library that solves this for you, thankfully https://github.com/charettes/django-seal.

cushychicken · 2 years ago

I read this title and immediately thought "...but why wouldn't you just use Django?"

Having written the sort of SQL-inline code the author talks about, then refactored the whole thing to use Django: Django's ORM solves waaaay more problems than it creates in this regard.

aforwardslash · 2 years ago

There are plenty of advantages of using a dataclass, being the most obvious the fact that behaves like a pure data object (aka it doesn't have underlying associated resources). Serialization/deserialization of data is dead simple, and a dataclass is a construct you can use as a data object when building 3-tier applications. Having pure data objects also gives way more flexibility when implementing cache strategies.

While this separation isn't common in the Django ecosystem, it is very common in enterprise application design (regardless of usage of an ORM). On complex applications, Django models are often a leaky abstraction (not only because the mentioned resource connection problem, but also issues like for relations they require the inclusion of the target model, it cannot be lazy-loaded; a good example is a nullable foreign key to an optional module that may or may not be loaded), and they actually behave like a variation of the ActiveRecord pattern, that mixes two different scopes - data and operation on data. In many cases this is ok, but in many others this is a problem.

I personally use a repository pattern, coupled with a query builder and something vaguely similar to dataclasses (its a bit more complex in the sense that the data object attribute name can be different from the database field name). It is basically an object mapper with a non-related repository class.

rtpg · 2 years ago

Yeah I can understand wanting to split out the ORM model from a separate class that holds data. I just think absolving oneself of an ORM or query builder entirely for a DB schema that doesn't (glibly) fit on a postcard feels like a good way to generate a lot of busy work.

I somewhat disagree about your point on caching. If you're working with models (that, namely, are 1:1 with DB rows) stale object problems are a reality no matter what, and having the ID be put into a pure data object generates the same issues. But these are things that are not very interesting to discuss outside of specific contexts.

I am a bit of a functional programming nerd, but I've just found that for Python stuff in particular, swimming upstream is its own bug generator relative to writing concise stuff in a very imperative fashion. Using the fat models directly is a part of that calculus for me, but YMMV and every team has different strengths.

philwelch · 2 years ago

> I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern.

One successful pattern I’ve seen treats the database as its own service layer. Service code does not send arbitrary SQL to the database—instead, all of the SQL queries are set as stored procedures in the DB. People sometimes freak out when I say this, so I should clarify that the stored procedures, table schemas and other things of that nature were version controlled and deployed with some minimal build tooling we’d developed in house.

I really liked this pattern. I think anything that you need to talk to across a network should be treated as a service in itself, with a well defined interface. This simplifies testing and monitoring as well. The big risk with an ORM, architecturally, is that you end up treating your database as a sidekick to your service code, or even a dumb data store—some shameful implementation detail your service keeps locked in the basement—when they’re capable of much more than that.

ilovetux · 2 years ago

The only problem I have with using the django ORM is that it relies on the django project structure. While there are ways to use the ORM independently, they are full of hacks and trade-offs.

Granted, this problem goes away if you are building a web app or a REST API, but if I just want an ORM for a command line application, I am using django's management command functionality which is OK, but it doesn't really scale easily.

Too · 2 years ago

Yuck indeed. The way it dictates the order of imports, forcing you to import settings before any models can be imported, is reason enough not to use it. This problem spreads to any of your other files, leaving you in the end with everything depending on being launched in a full Django context. Shame, given it’s otherwise very user friendly.

rglullis · 2 years ago

Add refactoring, migrations and testing to all the reasons you mentioned, and it quickly becomes an adapted case of Greenspun's Tenth Rule for ORMs.

pjmlp · 2 years ago

Migration tooling and SQL testing predate ORMs, it is a matter of actually caring about their existence.