Readit News logoReadit News
mrtimo · a year ago
This is a great lineup. Excited for the "SQL replacement" talks. Malloy is currently my favorite because of it's semantic nature. Define all the joins, measures, dimensions up in the source, and use them as needed. Makes for easy writing, and cleaner thinking.

Also, Malloy has done something clever. They embed duckdb into their VSCode extension, so you can easily query (and join) .csv or .parquet files on the fly. I laugh when I think of all the time I spent noodling with pandas code to do simple data cleaning, analysis, or joining. This also means that you can press "." on any github repo and start querying.

I teach MS Business Analytics students - they really enjoyed walking through an analysis of first names given in the USA for the past century. To see it, click to the github repo [1], press ".", install the Malloy extension, click back to the names1.malloynb file and run the queries.

[1]: https://github.com/malloydata/malloy-samples/blob/main/names...

tanvach · a year ago
I dunno, personally when I teach SQL to engineers, it’s not the syntax that is the main difficulty (yes it can be improved) but the relational data mental model that trips people up. They all want sequential data processing and loops. Translating that to vectorized code is the closest thing I got to help non-data people grok.
fxttr · a year ago
I used to hate SQL when I was a backend engineer. I had difficulties understanding exactly this mental model and tried to avoid writing SQL by using ORMs. At some point I “accidentally” switched to data engineering and was forced to get to grips with SQL. I think that in the meantime I warmed up to a functional programming style which helped me to some extent.
whatever1 · a year ago
The thing that trips everyone is that you cannot explain the SQL outputs. In sequential code, you can step through and understand why the logic error produced erroneous output.

With SQL, it is a trial and error. When your query passes your sniff tests, you sign looks good to me, and you ship it to the world. Only to silently break shortly afterwards without any warning.

In enterprise, I am convinced at this point that all of the complex ETL jobs are vending wrong outputs. Just nobody has the tools to diagnose and fix the problems.

cyberax · a year ago
For me, it's not the syntax, and not even the relational model. They are fairly easy to explain.

No, it's the morass of the interactions between GROUP BY/ORDER BY/HAVING. Like, why isn't there FIRST statement to select the first element in the group?

kmoser · a year ago
Not sure what getting the first row has to do with any issues you have with GROUP BY/ORDER BY/HAVING? Each of those clauses serves a distinct purpose, and I'm not sure I'd describe any interactions a "morass" since any connections between them are usually by design, and well documented. With the power of SQL (or any powerful and complex tool) comes great responsibility to understand how it works.
helge9210 · a year ago
To have the first element, you have to describe a relation of order. It's not ordered by default.
azurezyq · a year ago
Window function is your friend.

Row_number() over (partition by x order by) as rank

Then: where rank = 1.

Grouping is unordered, which is a clean definition.

mamcx · a year ago
> It’s not the syntax ... but the relational data mental model that trips people up.

It is also the syntax because it does not match the actual model. Even the simple example `SELECT 1` shows a mismatch.

And this cascade. Because the syntax is wrong, people have big trouble with `JOINs` (that is disconnected from the idea of making `?-to-?` specifications), `group by` (that doesn't exist in SQL), aggregation logic (that is badly implemented as window functions), and bigly, the lack of composability (that has a weird band-aid called CTEs) and so on.

A mismatch between the domain and the code is always reflected in syntax.

calmbonsai · a year ago
Can I take death?

For me, it's the lack of SQLs expression composability. And yes, I'm well aware of various libraries and plug-ins that purport to yield "composable SQL". It's never gonna' happen at-scale due to decades of investment in query optimizers for the popular relational engines.

Just make peace with all of SQL's warts or go NoSQL and make piece with code-bloat and eventual consistency.

ro_sharp · a year ago
Not sure I follow. In a world with CTEs and views, what do you think is missing for composability?
antihipocrat · a year ago
CTEs really tripped me up when I started using them professionally. My mental model was that they result in reusable objects in memory and thus could be used to improve performance as well as composition.

After discovering the truth, it was interesting to find out that almost everyone I knew who wasn't a snr db engineer shared the same incorrect assumption.

_dain_ · a year ago
https://borretti.me/article/composable-sql

>Imagine a programming language without functions. You can only write code that operates on concrete values, i.e. variables or literals. So instead of writing a function and calling it anywhere you have to write these little code templates as comments and every time you want to “call” the “function” you copy the template and do a search/replace.

>This would be tiresome. But that’s what SQL is. The concrete values are the table names. The code is the queries. And the function templates you have to search replace are your business logic, which must be inlined everywhere it is used.

benzayb · a year ago
"make sql as fast as possible" is a misnomer.

you cannot make a language fast, only its underlying implementation (i.e the specific dbms).

and this is the error I see 90% of the time. there is a confusion between conceptual (model) vs implementation concerns.

the problem with SQL the language is its conceptual foundations -- that it is not currently relational and yet it is used as the language of "relational" DBMSes in the market.

ayhanfuat · a year ago
It says “making sql go as fast as possible”, referring to the talks about optimizers.
benzayb · a year ago
To quote:

"The talks in this series will present ideas on either (1) making SQL go as fast as possible or (2) replacing SQL with something better."

How were you able to infer that the existence of the word "go" in that phrase refers to optimizers?

And 2nd idea/proposal of "replacing SQL with something better" really implies that they are talking about the language and not the optimizer.

In any case, if they really meant "optimizer" then their phrasing is very vague and imprecise; if they are talking about the "optimizer" -- then which specific optimizer? They have failed to mention that too.

calmbonsai · a year ago
Correct. I think the parent commenter is confusing a language's concrete syntax with its abstract syntax. At the end of the day, the ease (or lack there of) of translating an AST into performant executing code is highly correlated with the potential variance of those ASTs.

The more higher-level expressions your languages supports, the more work its compiler (whether targeting physical or virtual hardware) must do to translate those ASTs into "machine" code and the higher the variance of that code during its execution workload.

It's even more work when that language's expressions are declarative instead of imperative.

mg · a year ago
SQLite makes for a really nice document store with the json arrow functions. Assume you have a "book" column that is a json blob, then selecting the names of all books is just:

    SELECT book->>'name' FROM books;
The only thing I miss is auto increments.

Without an additional "id" column, inserting a new book with a unique id becomes cumbersome.

yayitswei · a year ago
I chose death and never looked back. Favorite replacement has been Diatomic, currently looking into Rama.
__mharrison__ · a year ago
I generally prefer working with dataframes, so this could be interesting.
relaxing · a year ago
> Suppose somebody has been rubbing gasoline on their body since the 1970s. Would you marry that person even if they smelled terrible?

Who wrote this? What are they trying to do here? Be funny? Stop that.