This is mostly all true, but there is little incentive for RDBMS vendors to implement and maintain a second query language, in particular a shared cross-vendor one. Databases are the most long-lived and costly-to-migrate dependencies in IT systems, so keeping the SQL-based interface in parallel for a long time would be mandatory. This is compounded by the standardized SQL-centric database driver APIs like ODBC and JDBC. Despite the shortcomings of SQL, there is no real killer feature that would trigger the required concerted change across the industry.
The way I've heard this phrased is, for potential customers to justify switching to your solution, it can't be 10% better, it needs to be 10x better.
(And on top of that they need to clearly perceive the value of Strange New Thing, and clearly perceive the relative lack of value of the thing they have been emotionally invested in for decades...)
> This is compounded by the standardized SQL-centric database driver APIs like ODBC and JDBC.
The criticality of JDBC/ODBC as a platform can't be overstated. The JDBC API is the dominant platform for data access libraries. Compare the number of drivers for JDBC, ODBC, Go's database/sql, etc.
Newer platforms like Arrow ADBC/FlightSQL are better suited to the high-volume, OLAP-style data queries that are becoming commonplace today, but the ecosystem and adoption haven't caught up.
In my day job the question of SQL and its role keeps coming up. Some people want to propagate SQL all the way to clients like web browsers, perhaps operating over some virtual/abstract data and not the real physical underlying data (that's a whole other layer of complexity). This seems like a bad idea/API in general.
I'm not too familiar with GraphQL but on the surface it seems like another bad idea. Shouldn't you always have some proper API abstraction between your components? My sense is that GraphQL was invented out of the frustration of frontend teams needing to rely on backend teams to add or change APIs. But the answer can't be to have no APIs?
All that said there might be some situations where your goal is to query raw/tabular data from the client. If that's your application then APIs that enable that can make sense. But most applications are not that.
EDIT: FWIW I do think SQL is pretty good at the job it is designed to do. Trying to replace it seems hard and with unclear value.
> All that said there might be some situations where your goal is to query raw/tabular data from the client. If that's your application then APIs that enable that can make sense. But most applications are not that.
IME, the majority of responses sent to the client are tabular data hammered into a JSON tree.
If you generalise all your responses to tabular data, that lets you return scalar values (a table of exactly one row and one column), arrays (a table of exactly one row with multiple columns) or actual tables (a table of multiple rows with multiple columns).
The problem comes in when some of the values within those cells are trees themselves, but I suspect that can be solved by having a response contain multiple tables, with pointer-chasing on the client side reconstructing the trees within cells using the other tables in the response.
That would still leave the 1% of responses that actually are trees, though.
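A minimal sketch of that multiple-tables idea in Python. The table names and the `("ref", table, id)` pointer encoding here are hypothetical, just to show the client-side pointer-chasing:

```python
def resolve(response, table_name, row_id):
    """Reconstruct a row, chasing ("ref", table, id) pointers into nested trees."""
    row = response[table_name][row_id]
    out = {}
    for key, value in row.items():
        if isinstance(value, tuple) and value[0] == "ref":
            _, target_table, target_id = value
            # A cell that points into another table becomes a nested object.
            out[key] = resolve(response, target_table, target_id)
        else:
            out[key] = value
    return out

# A response carrying two flat tables: orders, and the addresses they point into.
response = {
    "orders": {1: {"total": 99, "ship_to": ("ref", "addresses", 7)}},
    "addresses": {7: {"city": "Oslo", "zip": "0150"}},
}

order = resolve(response, "orders", 1)
# order == {"total": 99, "ship_to": {"city": "Oslo", "zip": "0150"}}
```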
Instead of a client dealing with a server that only presents unopinionated, overly-broad CRUD endpoints for core entities/resources, GraphQL is a tool through which the client tricks the server into creating a bespoke viewmodel for it.
But those endpoints are abstractions. Don't we want control over the surface of the API and our abstractions? If you let the client tell the server what the abstractions are at run time, haven't you just lost control over that interface?
As I was saying, there might be some situations where that's the right thing, but in general it seems you want a well-controlled layer there that specifies the contract between these pieces.
> My sense for this has been like GraphQL was invented out of the frustration of the frontend team needing to rely on backend teams for adding/changing APIs.
GraphQL was born out of the frustration of backend teams not DOCUMENTING their API changes.
It's no different ideologically from gRPC, OpenAPI, or OData -- except for the ability to select subsets of fields, which not all of those provide.
Just a type-documented API that the server allows clients to introspect and ask for a listing of operations + schema types.
GQL resolvers are the same code that you'd find behind endpoint handlers for REST "POST /users/1", etc
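A toy illustration of that point, with hypothetical function names and no real framework: the same data-access call sits behind both the REST handler and the GQL resolver (the `(parent, info, args)` shape loosely mirrors the common graphql-core resolver convention):

```python
def get_user(user_id):
    """Hypothetical shared data-access function."""
    return {"id": user_id, "name": "Ada"}

# REST: the handler for "GET /users/1" parses the path, then calls it...
def handle_get_user_endpoint(request_path):
    user_id = int(request_path.rsplit("/", 1)[-1])
    return get_user(user_id)

# GraphQL: the resolver for the "user(id:)" field calls the same thing.
def resolve_user(parent, info, id):
    return get_user(id)

print(handle_get_user_endpoint("/users/1"))  # {'id': 1, 'name': 'Ada'}
```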
Re: GQL - Explain to me what abstraction layer should exist between the data model and what data is loaded into the client? I’ve never understood why injecting arbitrary complexity on top of the data model is wise.
Perhaps unfettered write access has its problems, and GQL has permissions that handle this issue plenty gracefully, but I don’t see why your data model should be obfuscated from your clients which rely on that data.
In my view the abstraction layer should be in the domain of the application.
Let's say your software is HR software and you can add and remove employees. The abstraction is "Add an employee with these details". The data model should be completely independent of the abstraction. I.e. nobody should care how the model is implemented (even if in practice it's maybe some relational model that's more or less standard). Similarly for querying employees. Queries should not be generic, they should be driven by your application use cases, and presumably the underlying implementation and data model is optimized for those as well.
But I get it: GQL can be that layer in a more generic, schema-driven way. It still feels like a layer where you can inadvertently create the wrong contract, especially if, as I think is the case, different teams control the schema and the underlying models/implementation. So what it seems to save teams/developers is having to spell out the exact requirements/implementation details of the API. But don't you want to do that?
How do people end up using GQL in practice? What is the layer below GQL? Is it actually a SQL database?
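A sketch of what such an application-level abstraction might look like, using a hypothetical HR schema and SQLite for brevity. Callers say "add an employee"; the table layout stays a private detail behind the functions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

def add_employee(name: str, dept: str) -> int:
    """The contract is 'add an employee with these details';
    the relational model behind it can change freely."""
    cur = conn.execute(
        "INSERT INTO employees (name, dept) VALUES (?, ?)", (name, dept)
    )
    conn.commit()
    return cur.lastrowid

def employees_in_department(dept: str) -> list[str]:
    """A use-case-driven query, not a generic one."""
    rows = conn.execute(
        "SELECT name FROM employees WHERE dept = ? ORDER BY name", (dept,)
    )
    return [name for (name,) in rows]

add_employee("Ada", "Engineering")
add_employee("Grace", "Engineering")
print(employees_in_department("Engineering"))  # ['Ada', 'Grace']
```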
To use SQL effectively a certain amount of training is needed. But people are trained to read and write and do arithmetic. Understanding and writing simple relational database queries is a broadly useful skill that should be widely taught in schools.
When it comes to written English, perhaps that could do with some reforms just as with SQL. Yet the way we write remains mostly unchanged.
I hold a very unpopular opinion of GraphQL. I think it's a great internal querying API. Every web backend project I've worked on tries to implement an API for querying data, and it's usually either fast and inflexible or flexible but slow. GraphQL allows you to strike a balance: flexible and reasonably fast, with ways to optimise further.
I love GraphQL, it's great. It takes away the ambiguous way to organize REST APIs (don't we all love the endless discussion about which HTTP status code to use...), and at the top level separates operations into query/mutation/subscription instead of trying to segment everything into HTTP keywords. It takes a bunch of decision layers away and that means faster development.
Question is: do you need that flexibility if you have a backend-for-frontend? Can you design such a flexible API that makes it possible to iterate faster? If not, you just pay, in the best case, a constant overhead, or in the worst case, exponential overhead for each request! If you need to spend time optimizing because your monitoring shows slow queries, or downtime caused by never-terminating queries, then most likely you've already eaten up the implementation-speed advantage, if it existed at all in the first place.
For any language as large and complicated as SQL, it's easy to come up with a long list of design problems. The difficulty is designing something better, and then even more difficult than that is getting people to use it.
Much of the critique is that it's large and complicated because of bad design.
"Because SQL is so inexpressive, incompressible and non-porous it was never able to develop a library ecosystem. Instead, any new functionality that is regularly needed is added to the spec, often with it's own custom syntax. So if you develop a new SQL implementation you must also implement the entire ecosystem from scratch too because users can't implement it themselves. This results in an enormous language."
I would say that’s just another trade off though, in that extensibility and portability are invariably in tension.
The article simultaneously complains that the SQL standard is not universally implemented (fair) and that SQL is not easily extensible (also fair). But taken together it seems odd to me in that if you make SQL very extensible, then not only will it vary between databases, it will vary between every single application.
Also, the line between SQL and the database feels a little fuzzy to me, but don’t a lot of PostgreSQL extensions effectively add new functionality to SQL?
SQL is great. I've used it to implement knapsack optimization for Daily Fantasy Sports at scale. I use it in Big Data tools and RDBMS. It's pervasive in data tech.
Feel free to innovate and bring forth other RDBMS/Data query languages and tools, perhaps something may succeed and stick as long as SQL has.
Most of these arguments against seem like personal preferences? For example, I understand it would be convenient to give special treatment to foreign key joins, but I personally find `fk_join(foo, 'bar_id', bar, 'quux_id', quux)` less easy to understand on its own, without having to look up the underlying table structures to know which tables have which columns (i.e. is quux_id a column in foo or bar?). Not to mention I've never worked anywhere where foreign keys were consistently used, mostly for perf reasons.
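For comparison, here is roughly what that hypothetical `fk_join` shorthand would expand to as explicit joins. The ON clauses are verbose, but they do answer the "which table has quux_id" question inline (toy tables, SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quux (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE bar  (id INTEGER PRIMARY KEY, quux_id INTEGER REFERENCES quux(id));
    CREATE TABLE foo  (id INTEGER PRIMARY KEY, bar_id  INTEGER REFERENCES bar(id));
    INSERT INTO quux VALUES (1, 'hello');
    INSERT INTO bar  VALUES (10, 1);
    INSERT INTO foo  VALUES (100, 10);
""")

rows = conn.execute("""
    SELECT quux.label
    FROM foo
    JOIN bar  ON foo.bar_id  = bar.id   -- bar_id is a column of foo
    JOIN quux ON bar.quux_id = quux.id  -- quux_id is a column of bar
""").fetchall()
print(rows)  # [('hello',)]
```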
I think "SQL" is fine, whatever, I'm used to working with multiple different query and programming languages and dialects. That includes the freedom to define abstractions over SQL that meet my personal needs.
Standard SQL is not helpful, though. If that (failed) experiment was ended, database implementations would have even more freedom to explore superior syntax. Prescriptive language standards are a mistake.
I like that SQL is a standard, and it's mostly "fine". Sure, I have to constantly read the man pages because there are half a dozen different ways to do fundamentally similar things, and there are subtle differences between each vendor, and I keep running into silly errors like trailing commas. But it mostly works.
The stuff that is more painful is building any kind of interesting application on top of a database. For example, as far as I know, it's very hard to "type check" a query (to get the "type" returned by a given query). It's also hard to efficiently compose SQL. And as far as I know, there's no standard, bulletproof way to escape SQL ("named parameters" is fine when you need to escape parameters, but most of SQL isn't parameters). There's also no good way to express sum types (a "place" can be a "park" or a "restaurant" or a "library", and each of those have different associated data--I don't need a "has_cycling_trails" boolean column for a restaurant, but I do for a park). There are various workarounds, all deeply unsatisfying.
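One of those unsatisfying sum-type workarounds, sketched with a hypothetical schema: a discriminator column plus CHECK constraints, so subtype-specific columns must be NULL when they don't apply:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE places (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('park', 'restaurant', 'library')),
        has_cycling_trails INTEGER,  -- only meaningful for parks
        cuisine            TEXT,     -- only meaningful for restaurants
        CHECK (kind = 'park'       OR has_cycling_trails IS NULL),
        CHECK (kind = 'restaurant' OR cuisine IS NULL)
    )
""")
conn.execute("INSERT INTO places VALUES (1, 'park', 1, NULL)")  # OK

# The constraint rejects a restaurant claiming cycling trails:
try:
    conn.execute("INSERT INTO places VALUES (2, 'restaurant', 1, 'thai')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

This enforces the variants, but the "table of mostly-NULL columns" shape is exactly the kind of workaround that feels unsatisfying compared to a real sum type.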
In MSSQL you can `SELECT TOP 0 * INTO` a temp table and retrieve all the usual column metadata.
I’ve written basic custom report-writer functionality using this technique that lets users (usually me, the developer, or a super user) do custom sanitised SQL selects.
I assume similar functionality exists in all the different vendors' databases.
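For what it's worth, SQLite offers a rough analogue: run the query with `LIMIT 0` and read column names from the cursor metadata without fetching any rows (the table here is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL, placed_at TEXT)")

# LIMIT 0 returns no rows, but cursor.description still carries
# one entry per result column.
cur = conn.execute("SELECT * FROM orders LIMIT 0")
columns = [desc[0] for desc in cur.description]
print(columns)  # ['id', 'total', 'placed_at']
```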
https://arrow.apache.org/adbc/current/index.html
https://arrow.apache.org/docs/format/FlightSql.html
Against SQL (2021) - https://news.ycombinator.com/item?id=43777515 - April 2025 (1 comment)
Against SQL - https://news.ycombinator.com/item?id=40454627 - May 2024 (1 comment)
Against SQL (2021) - https://news.ycombinator.com/item?id=39777515 - March 2024 (1 comment)
Against SQL - https://news.ycombinator.com/item?id=27791539 - July 2021 (339 comments)