One of the big problems is that Oracle decides what does and doesn't go in. Understandable, but they provide no insight into the decision-making process, why patches don't get merged, or even if/when they'll take one.
They used to post worklogs for features they were working on, which gave some insight, but they've stopped doing that now too. Imagine working on a new feature for months and submitting a patch, only to find out Oracle has gone its own route; many people will just ask what's the point.
The last updated worklog was in 2021: https://dev.mysql.com/worklog/
Edit: As an example, one of the optimizations called out in this repo was submitted in October 2023. “Thank you for the report and contribution.” and then radio silence ever since.
16. Why aren't these improvements merged into the official MySQL?
Optimizations have been recommended to the official team and have received acknowledgment. However, they are assigned low priority in official bug fixes. Simple optimizations may take considerable time to be integrated, while complex ones might never be implemented.
As a result, the decision was made to open-source the MySQL optimized version to ensure effective application in high-end scenarios.
Sort of wild that a small improvement to relay log processing could almost certainly offset one's entire lifetime of carbon emissions. I mean, I'm genuinely happier with a tiny latency reduction, but it's still wild the scale at which MySQL operates.
Maybe these optimizations can let me avoid moving to Vitess for another year!
Keep in mind optimization effects can be counter-intuitive, since you need to consider unexpected second-order effects. Say my queries being slow forced me to optimize them with a cache I otherwise wouldn't have used, resulting in a 10x improvement. If MySQL were a bit faster, I would've reached my initial performance goals without that cache, thus increasing the total carbon footprint.
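To make that second-order point concrete, here's a toy back-of-the-envelope in Python. Every number is invented purely for illustration (nothing here is measured):

```python
# Toy illustration of the second-order effect (all numbers invented).
slow_ms, fast_ms = 100.0, 80.0    # per-query cost: stock vs. slightly faster MySQL
cache_hit_ms, hit_rate = 10.0, 0.9
goal_ms = 90.0                    # the performance goal that drives the decision

# The slow server misses the goal, forcing a cache; average latency
# then drops roughly 5x...
with_cache = hit_rate * cache_hit_ms + (1 - hit_rate) * slow_ms  # = 19.0 ms

# ...while the slightly faster server meets the goal bare, so no cache
# gets built, and every request pays the full per-query compute cost.
without_cache = fast_ms  # = 80.0 ms per request, all of it database work

assert with_cache < goal_ms < slow_ms   # cache path: goal met, and then some
assert without_cache < goal_ms          # faster-server path: goal met bare
```

Under these made-up numbers, the "faster MySQL, no cache" path does several times more database work per request than the cached path, even though both satisfy the original goal.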
And if this sounds contrived, this is basically what happened with our hardware vs. software optimization situation. We could do wonders on a 1MHz chip with 2MB of RAM in the 1980s, but now we need literally many thousands of times that capacity just to boot our OS to an empty screen.
Every time hardware improved, software bloated up. Thus eventually we had so much disposable compute just for... again, literally... playing games and crypto scams, that we invented AI running on it. And now that AI is once again blowing up our energy needs.
All that because hardware kept optimizing, software kept compensating by becoming worse, and thus new use cases revealed themselves that would have been impossible before, but are rather destructive to the climate.
>Maybe these optimizations can let me avoid moving to Vitess for another year!
Any particular reason, considering Vitess isn't exactly new and has been stable enough? Other than not wanting to introduce additional complexity unless absolutely necessary.
Was wondering the same. Such a verbose readme and not a single mention of MariaDB.
I know it's moving slower and in a lot of ways it's inferior to MySQL, but at the same time that would make it even better to have some contributions like this.
Side note: One downside of ChatGPT generated documentation (assuming this was written in conjunction with an LLM) is that humans tend to be a little less verbose.
You know a patch file can individually address each upstream file it intends to modify, right? I presume someone who wants to casually read them would need to fork the repo and cut up the ginormous .patch file into the 2361 individual patches for ease of reading or deep-linking.
I also just for-real don't understand how in the universe a ~15MB text file against an open source, _git hosted_ project is a sane way of delivering value. Not a single time in the readme did they say why $(git diff origin/tags/8.0.42...HEAD > yolo.patch) was the chosen delivery mechanism.
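For what it's worth, splitting a monolithic patch back into per-file pieces is mechanical. Here's a rough sketch in Python; the `split_patch` helper is mine, and it assumes each file's section starts with a `diff --git` header (which holds for `git diff` output, but is an assumption about this repo's .patch file):

```python
import re


def split_patch(text):
    """Split a monolithic git-style patch into (path, patch_text) pairs,
    one per modified file, assuming each section opens with a
    "diff --git a/<path> b/<path>" header line."""
    # Split at the start of every line that begins a new file section.
    pieces = re.split(r"(?m)^(?=diff --git )", text)
    result = []
    for piece in pieces:
        if not piece.startswith("diff --git"):
            continue  # skip any preamble before the first header
        # Header form: diff --git a/client/mysqldump.cc b/client/mysqldump.cc
        m = re.match(r"diff --git a/(\S+) b/\S+", piece)
        if m:
            result.append((m.group(1), piece))
    return result
```

Each returned piece can then be written out as its own `.patch` file (e.g. with `/` in the path replaced by `_`), which would at least make the changes readable and deep-linkable per file.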
Well, if Google does it then I guess I stand corrected about it being a weirdo way to deliver patches. They went so far as to .gz theirs, too, for extra non-browsing by mere mortals.
MySQL is available on GitHub (so in that sense hosted), but development doesn't happen there. Not saying that's the reason for the delivery mechanism, though.
https://bugs.mysql.com/bug.php?id=112737
https://www.percona.com/blog/what-oracle-missed-we-fixed-mor...
Having a cheaper, more available resource increases overall utilization of that resource.
Don't know about the speed divergence, though. I guess a speed rundown would be interesting...
EDIT: I missed that the authors wrote a GitHub book, including some descriptions of the problem(s): https://enhancedformysql.github.io/The-Art-of-Problem-Solvin...
After how many years, they have finally released 9.0 and are now at 9.3. I wonder how many of the problems stated in the list are still true.
At least Vitess still gets continuous development.
I find it curious that <https://github.com/google/mysql-tools/blob/02d18542735a528c4...> and yet <https://github.com/google/mysql-tools/blob/02d18542735a528c4...> says "diff -ruN base/client/mysqldump.c mysql40gpl/client/mysqldump.c"
I had no idea one could release patches of GPL software under an Apache license. That makes my head hurt.