I don't disagree that SQL can be improved. It's one of the biggest reasons I use Postgres in the first place because there are so many improvements available on top of SQL.
All that said...SQL is pretty darn effective. As a language, it's the true backbone of the internet today. It's readable, explicit, fairly concise, and naturally translates to how data should be broken down for efficient storage...or lets you make some trade-offs to allow for more efficient retrieval.
There are differences with different vendor implementations...but that's what different vendors are for - to find things the other guys are doing wrong and improving on them to build a better product.
I wish the folks luck in their work to improve things, but the language I've been able to rely on consistently over the last 17 years or so has been SQL...and I've worked with a lot of languages. SQL is the one that gets the most done reliably and lets me down the least often.
Afaict, the state of SQL as a grammar with tooling is kind of pathetic.
As a standardized language, it doesn’t really exist; everyone implements numerous extensions, and almost no one is fully ANSI compliant.
Almost all formatters attempt to be generic (believing standardization exists), and fail to support the full grammar for any dialect.
Across the board, all parsers have pathetic error message support (error on line 3, which is actually just the start of the statement).
The schema offers type constraints, but querying tools/IDEs extract no value from that (that is, types are statically specified/constrained, but query editors all pretend it's fully dynamic).
There's a lot of awkward nonsense, like WHERE clauses being parsed before the SELECT in most parsing engines, causing alias usage to fail unless you wrap things in a subselect/WITH clause (see the sketch at the end of this comment).
The grammars themselves are an inconsistent, ad-hoc mess
The grammar is also unnecessarily context-dependent (e.g. FROM must follow SELECT, and WHERE after that), making programmatic composition unnecessarily difficult.
I don’t know how much of the tooling issue is a result of SQL as a language versus the history itself, but I can at least confirm that trying to parse multiple dialects is absolute hell, which would at least explain the sorry state of affairs for e.g. formatters.
But the majority of its expressive power derives from the relational algebra and has nothing to do with the SQL grammar, and that's the majority of its value. It seems obvious to me that at the very least the compositional issue of SQL, and its self-inconsistent grammar, should be vulnerable to near-lossless improvement without too much struggle, though I can’t say what the alternative would actually look like.
But it seems like it's riddled with a lot of unnecessary flaws.
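To make the alias complaint concrete, a minimal sketch (hypothetical orders table); most engines reject the first form because WHERE is resolved before the SELECT list:
SELECT price * quantity AS total
FROM orders
WHERE total > 100;   -- error: column "total" does not exist
SELECT total
FROM (SELECT price * quantity AS total FROM orders) AS t
WHERE total > 100;   -- works, at the cost of a wrapper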
Actually standards compliance is really good these days. There are a lot of custom functions that are hard to do without in some reporting applications but behavior of SQL clauses is consistent across every engine.
JetBrains tooling does use schema metadata for its autocompletion. I'm not sure though if it's anywhere close to full-on Haskell autocompletion in the Atom editor (very fiddly and prone to break on minor version changes, I must say).
The SQL spec is actually hidden away behind some book or something you have to purchase.
I was experimenting around with creating an Entity Framework equivalent in Typescript and really wanted to create a SQL AST for use under the hood (optimizing queries, SQL push-down, etc.). Ended up using the PostgreSQL types and a Ruby plugin that binds some PostgreSQL libs to work on my POC. Crazy town.
I’ve been writing some pretty ambitious Hive queries at work lately.
As I learn more about SQL and pull off more complex queries, my respect for it deepens. To have such power and support so many use cases with so few constructs is really an engineering feat. It’s timeless for a reason.
Some of my relatively common access patterns are awkward to express, but they can still be expressed in a few lines + a CTE or two, which is really impressive for a language so small.
This is not to say we can’t do better. But SQL has achieved a deep resonance with its problem space that most tools don’t even come close to. The brightest minds and most effective tooling shops in our field would be lucky merely to do as well.
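One such pattern, purely as an illustration (hypothetical events table): "latest row per key" is awkward to state directly, but compact with a CTE and a window function:
WITH ranked AS (
  SELECT e.*,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time DESC) AS rn
  FROM events e
)
SELECT *
FROM ranked
WHERE rn = 1;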
This shows that humans can learn languages and get motivated by mastering them. It does not tell us anything about the consistency, composability or orthogonality of SQL. Those are the qualities affecting a newcomer's effort to learn it.
Having used splunk's query language... I'd hope that in general vendors start with an assumption of SQL support and maybe try tweaking it at the edges rather than burning the house down - SQL has survived this long because it's extremely expressive and any replacement for it is going to need to match that expressiveness.
All that said I think it was originally structured to partially be a human readable language and it fails pretty hard at that - that's a facet I'm sure smart people could revise to make more natural.
To be fair, Splunk excels at dealing with unstructured logs, which is in many ways a harder problem. Obviously in a perfect world, all our data would be structured, which would make querying much simpler.
> I use Postgres […] SQL is pretty darn effective.
FWIW postgres used to have its own query language derived from QUEL[0] rather than SQL.
And findings that SQL is kinda shit are not exactly recent, e.g. C.J. Date's "A Critique of the SQL Language" (1983) lists the following sections
* lack of orthogonality: expressions
* lack of orthogonality: builtin functions
* lack of orthogonality: miscellaneous items
* formal definition
* mismatch with host languages
* missing function
* mistakes
* aspects of the relational model not supported
The conclusion was, obviously, prescient:
> if SQL is adopted on a wide scale in its present form, then we will to some degree have missed the relational boat, or at least failed to capitalize to the fullest possible extent on the potential of the relational model. That would be a pity, because we had an opportunity to do it right, and with a little effort we could have done so. The question is whether it is now too late. I sincerely hope not.
SQL succeeded not because it's "pretty darn effective" but because IBM decided on it (at a time when it drove technology) and Oracle are great at sales and marketing (whereas Ingres definitely wasn't).
[0] https://en.wikipedia.org/wiki/QUEL_query_languages
The various extensions/plugins allow for custom data types, indexes, use of multiple programming languages to write functions, the ability to use a foreign data wrapper to connect to Redis and build a VIEW out of the result or push data out to Redis/Memcached with a database function, varieties of powerful search capabilities, etc.
That "most" is a pretty loaded word! Last I checked, even CREATE INDEX is not part of any ANSI or ISO SQL specification. (That's why every RDBMS has such different features and syntax for this.) Good luck building a system with just "standard SQL".
I feel like Postgres is so powerful I could damn near build an entire web app backend with JUST Postgres (i'm only sorta kidding here). If that's standard SQL, well then I really like standard SQL :-)
Same. As a data scientist, I have seen multiple 'analytics workbench' solutions come, go and occasionally stay - Clementine (what is now IBM SPSS), Stata, SAS and its many variants, Statistica and in the recent years, tools built off and designed to make working with data using Python/R.
But, the common workhorse tool that has stayed strong through all these has been the common SQL. Elegant, simple, powerful and thoroughly reliable, it is my primary go-to tool. In an otherwise changing ecosystem, its simplicity and reliability is a boon. Yes, it is primarily because of the nostalgic familiarity but I also believe it continues to be extremely powerful, one that will serve you very well.
I came here to write something similar, but you said it better than I could. Sure, SQL has the problems that the article mentions and their solution looks nice on the surface, but SQL has worked much better for me than any alternatives, especially if it’s slightly extended SQL like in Postgres, as you mention, and any attempts to improve it will always be an uphill battle as SQL is pervasive and well supported. That doesn't mean people shouldn't try, but it does mean that their likelihood of success depends on factors other than whether the solution is an improvement over SQL or not.
I’ll never understand why obvious marketing makes it to the front page, only to get shat upon by 95% of the comments.
Of course we can do better than SQL. We could obviously do better than Javascript. It’s globally understood that we could all do better than English.
We sometimes struggle to express ourselves with language. We can blame the language, try to fix it, or invent a new language and try to get people to speak it.
I would rather spend my time refining my elocution than learn a new language. That said, there are words and phrases that simply work better in other languages. I don’t know a single-word corollary to ‘Simpatico’ in English.
I wish the best to those who would make programming more expressive, and the worst to those who would try to streamline away subtlety.
Performance is quite loosely linked to language, given sufficient abstractions and optimizations. I submit NumPy as an example. We humans have lots of ways to say what we want, and we want lots of different things in myriad ways. As an analyst I often wish for a richer language, some way I could transcend tabular data thinking to find and make associations around real-world state. I’m certainly self-satisfied when my nested subqueries return what I’m expecting, but I may be able to add more value if I had a better way of expressing my questions.
If we can do better than SQL, it ought to bring more people closer to the reality that lies behind the data, and further from the methods used to obtain it. Maybe I don’t have the words for what I’m looking for yet...
The article was still useful. I don't mind an ad at the bottom if there's genuine value. I am not experienced in SQL so learning of its shortcomings (especially the NULL part) will be useful when I write the odd query here and there.
It's another clever feature of Hacker News, in my book.
Perhaps YCombinator omitted the downvote button because it can be abused. For example, if there is a story that is bad news for a certain company, that company could organize a downvoting campaign to hide it. But the upvote button can also be manipulated, as you say.
However, which is worse? (A) To write a silly story and have it rightfully downvoted into obscurity, or (B) To write a silly story, have it upvoted into the limelight, but be littered with comments that expose its flaws.
I think that as long as I remember that just because a story is on the front page doesn't mean it's right, then Hacker News has a nicely curated set of moderation rules.
Yeah I'm not sold. One example from this post that struck me was the author wanted to embed a select in a table expression. I'm not a fan of this at all. I want it to be clear whether a given expression list will explode in values or not.
I like the fact that SQL has a solid foundation in relational algebra. I see no such foundation for the alternative.
I do like what LINQ did here (being SQLish), which was to put the FROM clauses first. Some SQL variants have WITH clauses that are quite convenient but you end up with:
WITH a AS (...),
b AS (...)
SELECT
a.a1,
b.b1
FROM a
JOIN b
ON a.a = b.b
Common alternative:
SELECT
a.a1,
b.b1
FROM (...) a
JOIN (...) b
ON a.a = b.b
whereas I'd prefer:
FROM (...) a
JOIN (...) b
ON a.a = b.b
SELECT
a.a1,
b.b1
Does the spec have any execution requirements re WITH? Pretty sure `WITH` in PostgreSQL acts as an optimization fence and thus can have considerable performance implications vs a nested query (or none at all depending on the query).
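As far as I know the spec is silent on execution; in PostgreSQL specifically, CTEs were an optimization fence up to v11, and since v12 you can pick the behavior per CTE. Rough sketch (hypothetical orders table):
WITH recent AS MATERIALIZED (        -- force the fence: computed once, as its own step
  SELECT * FROM orders WHERE created_at > now() - interval '1 day'
)
SELECT * FROM recent WHERE customer_id = 42;
WITH recent AS NOT MATERIALIZED (    -- let the planner inline it like a plain subquery
  SELECT * FROM orders WHERE created_at > now() - interval '1 day'
)
SELECT * FROM recent WHERE customer_id = 42;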
If the language is the problem, why write a new server?
Transpile your language to the equivalent SQL, and rely on decades of research and real world experience in things like replication, optimization, locking strategies, high availability, security etc. the things unrelated to the language current SQL databases are really good at.
And do you support other backends as well (e.g. MySQL)? And does EdgeDB still offer a regular SQL interface? Because people need time to migrate.
Even if I was convinced EQL was the future I wouldn’t throw out the old stuff. If the old stuff continues working but there is a smooth migration path, I would probably give it a try.
Because the 'servers' in existence speak only one language and it is that problematic one we want and need to get rid of.
Your 'transpile' handwaving won't fly. You are blindly presuming that the 'SQL equivalent' (a) always exists (it doesn't) and (b) can always be generated by an automated transpiler (it can't).
I have no background in databases so this may be naive or wrong, but the single biggest pain point in SQL that comes to my mind is that it can be difficult to tell what a query is doing without also knowing the constraints on the tables involved. Here's a real-life (ish) example from work:
We have some_table which we want to join to other_table, but we need to map an identifier through mapping_table in order to do it. So we end up with a query like:
SELECT (...) FROM some_table INNER JOIN mapping_table ON (...) INNER JOIN other_table ON (...) ...;
I know for sure when writing this query that the middle join to mapping_table will map every some_table row to exactly one row (no more, no fewer) in mapping_table. The problem is that the query doesn't capture this. The mapping table isn't really named something as obvious as "mapping_table" so someone reading the query has a hard time inferring what the intent was. It totally changes how you mentally parse and think about the query if the result set can be accumulating multiple matching rows from the join, or maybe even losing rows if there are no matches. You have to go bring up your database schema to figure this out.
And, as a fan of static typing, I can't help but cringe at the possibility of someone changing the constraints on the table without realizing that there are queries which implicitly depend on the old ones. SQL offers no resilience to this and will happily change the meaning of your query without a peep of complaint if you drop that constraint from mapping_table.
If there's a fancy way to capture this "mapping" relationship in standard SQL that doesn't just use a dumb inner join, I'd love to know about it. If not, I'd love a query language that supports some annotations that help reading and are either stripped out before sending to the database engine, or are actually checked at runtime.
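For what it's worth, the closest thing in plain SQL is to turn the property into a declared constraint rather than an annotation; a minimal sketch, assuming hypothetical integer id primary keys on the other two tables:
CREATE TABLE mapping_table (
  some_id  integer NOT NULL REFERENCES some_table (id),
  other_id integer NOT NULL REFERENCES other_table (id),
  UNIQUE (some_id)   -- guarantees "at most one mapping row per some_table row"
);
-- The "no fewer" half (every some_table row has a mapping row) can't be declared here;
-- it would need a constraint on some_table's side, which SQL makes awkward.
-- And dropping UNIQUE (some_id) silently turns the 1:1 join into 1:n, which is
-- exactly the static-typing complaint above.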
If I understand correctly you are describing a one to one or one to many relationship. The canonical way to express that is 'don't use a mapping table'. Mapping tables are for many to many relationships. So why is there a mapping table in your example, and could you just get rid of it altogether?
In this case it's tying together data from different, independent systems. So the schema may not be textbook ideal but I don't think there's any way around it.
> I know for sure when writing this query that the middle join to mapping_table will map every some_table row to exactly one row (no more, no fewer) in mapping_table.
Your example query joins one some_table row with 1-n rows on mapping table, and another 1-n rows from whatever else is in there to that. If you're expecting a single row in the resulting set per some_table row, it means that you're filtering very hard (which is fine) or that you've a schema problem (which is the actual problem).
In CQL (http://categoricaldata.net), which generalizes relational theory with category theory, you can annotate schemas with equations and have them checked at runtime, or at compile time with an automated theorem prover (e.g., to establish that a query into a schema with a constraint will always materialize an instance that satisfies that constraint). One example is de-normalization: https://www.categoricaldata.net/denorm.php
I get that with some queries, you're not sure how the data is going to be retrieved, and letting the database figure all of it out for you is a good strategy. But with a lot of queries, particularly the ones that I'm doing in the milliseconds of a pageload, I want to be very sure that all my joins are hitting efficient indexes. I hate that someone can change my schema, and that can turn my efficient index lookup into a horrible scan, without breaking tests.
Folks who know more SQL than me: Is there a good way to say "I would rather this query fail than try to scan a table?"
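Not that I know of, exactly; the closest thing I'm aware of in PostgreSQL is a planner setting that heavily penalizes (but doesn't outright forbid) sequential scans, combined with checking the plan yourself:
SET enable_seqscan = off;                               -- session-level: planner avoids seq scans unless there is no alternative
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;    -- hypothetical query: verify it's an Index Scan, not a Seq Scan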
Where you start to go wrong is when you start talking about "YOUR" schema. The schema isn't yours, it's the company's. And guarding it is the DBA's job.
(I understand full well that that is a problem if the company has kicked out the DBA role and handed it over to the individual programmers, but perhaps that is precisely the problem.)
I am quite sure other implementations of SQL have similar tools, but my experience comes from supporting an iSeries.
The SQL scripting function is a tool run from a desktop (the emulation and such is Java based), with a feature to ask the system what the query is doing. That feature, called Visual Explain, will explode the query into a graphic representation of how the system optimized it to run. It will recommend indexes as needed. This is very good for understanding when table scans are forced, how files actually joined up, and more.
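The textual counterpart on most engines is EXPLAIN; for example, PostgreSQL-flavored and with hypothetical tables:
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.customer_id, sum(o.total)
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY o.customer_id;
-- The plan output shows scan types (Seq Scan vs Index Scan), join strategies (hash,
-- merge, nested loop) and actual row counts: the same information Visual Explain draws graphically.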
> And, as a fan of static typing, I can't help but cringe at the possibility of someone changing the constraints on the table without realizing that there are queries which implicitly depend on the old ones
When creating views and sprocs, SQL Server lets you mark them as 'schema bound', creating dependencies on the schema objects it uses.
Not sure if something like this exists in Postgres?
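A sketch of what that looks like in T-SQL (hypothetical tables); once the view is schema bound, dropping or altering a referenced column fails instead of silently breaking the view:
CREATE VIEW dbo.active_orders
WITH SCHEMABINDING
AS
SELECT o.id, o.customer_id, o.total
FROM dbo.orders AS o
WHERE o.status = 'active';
ALTER TABLE dbo.orders DROP COLUMN total;   -- refused: the column is referenced by a schema-bound view
For what it's worth, plain Postgres views already behave a bit like this: dropping a column a view depends on is refused unless you CASCADE.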
I don't disagree with the obviously true statement, but this code comment comes to mind:
// Dear maintainer:
//
// Once you are done trying to 'optimize' this routine,
// and have realized what a terrible mistake that was,
// please increment the following counter as a warning
// to the next guy:
//
// total_hours_wasted_here = 42
You can test that for yourself. Go through the three language-related sections on my site and decide for yourself how many hours you'd need to "waste" before you would "get it".
And as for "waste" : if while learning a better language, they are in addition also learning the relational model, and to think relationally, (their knowledge of both of which will be VERY poor if all they've ever seen is SQL) then there can't have been much time "wasted", can it ?
I think "ease of use" over SQL is not the hill I would die over if I were trying to displace SQL.
It's far too embedded throughout the entire industry and as a data analyst, learning EdgeQL vs SQL and then being locked into a new startup database that could disappear in a year doesn't seem like a high probability strategy.
I wish the people all the luck but unfortunately SQL is "good enough", pretty standardized (I can use just about any relational database and get useful data by knowing the basics). The inconsistencies may be mathematically "ugly" but it's not hard to wrap your head around and overcome.
Agreed. I think this is one reason that Looker's LookML has been successful. Not that it's entirely what I'd want out of a "SQL replacement", but it's an enhancement that "compiles" down to SQL rather than looking to replace it. Plus, you can always go direct to SQL, in case you need to take advantage of some specific feature or complexity that their language doesn't address itself.
I accidentally built an in-memory database that now lives prominently in our production stack. It works great, it's incredibly performant, the codebase is relatively simple (it makes heavy use of code generation), and it will scale very well - but not a day goes by that I don't think: what if I had just taken the time to adapt an existing solution to the problem set?
There are just so many free things you get with SQL and established RDBMS that deeply impact application features, quality, stability, operations, and much, much, more. I've had to write a custom mongo-db like interface for querying, as well as a fair number of hacky bits to effectively cover the surface area of SQL in an inferior way.
I've learned tremendously, but I just wish people don't follow in my exact footsteps because that's probably wasted time.
As someone who landed up doing what you did, I also had days where I felt what you have described here. On the other hand by going custom you can usually exploit some understanding you have of your problem to great effect.
Why is it so hard to imagine something displacing SQL? A simpler, more predictable syntax seems perfectly plausible--it could ship alongside SQL. Do we have Stockholm syndrome?
The negativity is surprising and at the same time predictable.
Because of the decades of things not displacing it - when someone suggests "oh we'll just do it simpler" they often are not seeing the forest for the trees.
Simpler languages have been shipped dozens if not hundreds of times, and they generally tend towards either awkwardly expressing the things they missed or not giving enough functionality for them.
I am not saying it's impossible, but you're going to have to do a lot more than hand waving to justify the reverse position.
Yes, it was that, but, still, there's a big and growing hurdle here that many similar efforts, with similar objective merits compared to contemporary SQL implementations, have failed to overcome, and not a lot of reason provided to think EdgeDB is better positioned.
> Why is it so hard to imagine something displacing SQL?
Because systems providing just as good solutions to largely the same set of SQL deficiencies have been produced and failed to displace SQL for a couple of decades.
The problem isn't doing better than SQL. It's doing enough better than SQL to overcome the depth of knowledge, experience, support, tooling maturity, and comfort people have with SQL. And that moat gets deeper over time, on top of SQL getting internal mitigations, if not actual solutions, to some of the problems over time.
That said—as I’ve done with several before—I’ll probably download EdgeDB and try to do some stuff with it.
Would you adopt an extra abstraction layer (bugs, maintenance, incompatibilities and surprises included) over your database to get a coherent handling of nulls and a few other optimizations of this level?
I know I'm staying with the nulls. I wish the best of luck to them, and if it survives to maturity, I'll happily go get the ~1% (probably less) more productivity they offer. But right now I'm not moving. It's sad, really, but things are stacked against them. Change is costly, so we lose all the small changes that could compose into something huge.
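For context, the kind of thing "coherent handling of nulls" refers to is presumably the classic three-valued-logic gotchas, e.g. (PostgreSQL-flavored, hypothetical table t):
SELECT NULL = NULL;                  -- NULL, not TRUE: you have to remember IS NULL / IS NOT DISTINCT FROM
SELECT 1 WHERE 1 NOT IN (2, NULL);   -- returns no rows, because the comparison collapses to NULL
SELECT count(col) FROM t;            -- counts only non-NULL values of col, unlike count(*)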
The crux is: it takes much, much more to achieve that than what the average C# or Java coder yup has to offer. So the average C# or Java coder yup will dismiss those things too.
It could happen. However, the probability of success for any given attempt is low. Consider all the attempts at improving on JavaScript before Typescript.
SQL is good enough for the existing set of applications, but that's not really saying much. There are lots of other applications for which SQL is not good enough, and those applications either don't exist because an affordable alternative doesn't exist or they implement their own proprietary database (e.g., many popular BI tools). It's safe to say that your use case--using SQL to perform one-off queries--is fine; writing a program that can dynamically build (performant) SQL to access data of arbitrary schema is quite a lot harder even if you can assume a single implementation. And much of this difficulty comes down to lack of composability.
Perhaps SQL is fine if it's your interface for accessing data on a one-off basis, but if you're trying to build a complex tool on top of it (say, an analysis tool for arbitrary data), the inconsistencies and performance concerns mount. People often end up inventing their own proprietary databases to do these analyses (e.g., virtually any business intelligence tool) assuming they can afford to do so. Perhaps EdgeQL isn't the ideal alternative, but as it is SQL is not good enough for many use cases.
SQL is extremely expressive, it's almost impossible to build something that cannot be expressed in an SQL query. In most cases when people feel like SQL cannot do something it is either because the Database does not implement a part of the standard or because they are not familiar with some of the more advanced usage of SQL. Simple SELECT FROM WHERE clauses, even including JOIN, are still fairly simple compared with what you CAN do if you want.
I'd recommend reading the PGSQL manual; they go very in depth about many of the supported features and how they are implemented and can be used.
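Window functions are a good example of the gap between the SELECT-FROM-WHERE everyone knows and what's actually available (hypothetical orders table):
SELECT customer_id,
       order_date,
       amount,
       sum(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total,
       rank() OVER (ORDER BY amount DESC) AS overall_rank
FROM orders;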
Same goes for JavaScript. It's so ubiquitous that it's worth putting up with the downsides.
It'll take several unicorns and Fortune 100s hiring thousands of engineers to code in an SQL-alternative to create an ecosystem large enough to eventually overtake SQL.
Agreed on all of that. Further, I wouldn't embrace something intending to displace SQL without it being authored by someone like Anders Hejlsberg (Turbo Pascal, Delphi, C#, TypeScript). That sort of involvement grants confidence that it's as "correct" as it can be for most users. That matters for buy-in. It can happen, Kotlin is a decent example, but is not embraced as widely as TypeScript has been. I'm sure that confidence plays a big role there. The stakes are much higher here with SQL, than in Java or ECMAScript.
There's plenty of brilliant people out there, but when you're talking about replacing the most successful data storage language in the history of mankind, you need everything. SQL has been "killed" many times, everyone wants to sell something. It would probably take involvement from a FAAMG entity. I think the first clue that there's no real room for technical innovation and we're staring at only the opportunity for technical churn, is that no FAAMG players, who definitely operate at-scale, have tried to displace SQL outright already.
TypeScript or Kotlin are really the closest, best and most recent examples of what would need to happen. For me as an end-user, ubiquity and skill reusability matter. If you don't like SQL, there are ORMs.
In the financial industry, I have seen a couple of places where SQL was the interface for their payment systems. Mind you that these payment systems were not written using any kind of relational paradigm.
Hi - a lot of commenters have valid concerns and critiques, but I joined HN after lurking for years to say that I really like the direction you’ve taken! I’m particularly happy about the convenient syntax for joined inserts and sum types.
I see that you’ve built this as a patched version of Postgres, but I’m curious how much of this syntax you could implement as a client library and shell that would run against an existing Postgres instance.
Right away, you would get adoption from people who have an existing Postgres, or who want to take advantage of SaaS offerings like AWS Aurora.
Longer term, I could imagine the client/shell being extended to support multiple backend DB dialects, even things like Spark or Redshift which you’d have a hard time modifying intrusively.
It could also be cool to explore interoperation with existing schemas written in plain SQL, so people could adopt it incrementally that way.
> To have such power and support so many use cases with so few constructs is really an engineering feat.
I mean, if I gave you protons, electrons and neutrons, you could build the universe out of them!
Most of Postgres is standard SQL. It's just that most non-Postgres databases do not implement standard SQL very well.
It's got a lot of stuff going on in there.
Sure, but the non-standard enhancements like JSON support are part of what sets Postgres apart from the competition IMO.
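For example, the jsonb operators let you filter and project inside documents, with index support (hypothetical events table with a jsonb payload column):
SELECT payload->>'user_id' AS user_id
FROM events
WHERE payload @> '{"type": "click"}';
CREATE INDEX idx_events_payload ON events USING gin (payload);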
Posts have an upvote button, but no downvote button, so dissenters can only use comments to express disagreement.
"amiable"? "congenial"? "affable"?
Upvote manipulation maybe? Then once it is on the front-page it gets lots of exposure. So the probability to get upvoted becomes higher.
> Some SQL variants have WITH clauses
"Some variants", including standard SQL since SQL:1999:
https://modern-sql.com/feature/with#compatibility
Enterprises need enterprisey features.
> If the old stuff continues working but there is a smooth migration path, I would probably give it a try.
Think of TypeScript vs. JavaScript.
> I hate that someone can change my schema, and that can turn my efficient index lookup into a horrible scan, without breaking tests.
They can't, if you do perf testing as part of your pre-release testing, which you should do.
If you aren't doing perf testing, you are saying perf isn't an acceptance criteria, so why are you upset that breaking perf doesn't break tests?
That is, a reliable tool that translates 99% of normal SQL into readable Edge SQL, and vice versa, would help adoption a lot.
Remember how new JavaScript features became mainstream through transpiling, long before native implementations.
The JS framework churn makes me super appreciative of the stability that SQL brings.
> It could also be cool to explore interoperation with existing schemas written in plain SQL, so people could adopt it incrementally that way.
Yes, this will happen too.