This is also mentioned in the egui docs here https://github.com/emilk/egui#why-immediate-mode:
> egui only repaints when there is interaction (e.g. mouse movement) or an animation, so if your app is idle, no CPU is wasted.
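For a feel of that model, here's a minimal sketch (assuming a recent eframe; the creator-closure signature of `run_native` has changed across versions): the UI sits idle until input arrives unless it explicitly asks for another frame with `request_repaint`.

```rust
// Minimal sketch, assuming a recent eframe/egui; API details vary by version.
use eframe::egui;

struct Clock;

impl eframe::App for Clock {
    fn update(&mut self, ctx: &egui::Context, _frame: &mut eframe::Frame) {
        egui::CentralPanel::default().show(ctx, |ui| {
            ui.label(format!("uptime: {:.1}s", ctx.input(|i| i.time)));
        });
        // Comment this out and the label freezes between mouse/keyboard
        // events: egui only repaints on interaction unless asked.
        ctx.request_repaint();
    }
}

fn main() -> eframe::Result<()> {
    eframe::run_native(
        "repaint demo",
        eframe::NativeOptions::default(),
        Box::new(|_cc| Ok(Box::new(Clock))),
    )
}
```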
Yeah, like every other conceivable feature, ever.
So...you're mad that the government tried to fix a problem it had?
I see you also created similar issues in Polars: https://github.com/pola-rs/polars/issues/17932 and DuckDB: https://github.com/duckdb/duckdb/issues/17066
ClickHouse has a built-in memory tracker, so even if there is not enough memory, it will stop the query and send an exception to the client, instead of crashing. It also allows fair sharing of memory between different workloads.
You need to provide more info on the issue for reproduction, e.g., how to fill the tables. 16 GB of memory should be enough even for a CROSS JOIN between a 10 billion-row and a 100-row table, because it is processed in a streaming fashion without accumulating a large amount of data in memory. The same should be true for a merge join.
However, there are places where a large buffer might be needed. For example, if you insert data into a table backed by S3 storage, it requires a buffer that can be on the order of 500 MB.
There is a possibility that your machine has 16 GB of memory, but most of it is consumed by Chrome, Slack, or Safari, and not much is left for the ClickHouse server.
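To make the streaming point above concrete, here's a toy Rust sketch of the idea (illustrative only, not ClickHouse's actual implementation): a cross join that materializes only the small side and consumes the big side lazily, so peak memory is independent of the big side's size.

```rust
// Toy illustration, not ClickHouse internals: a streaming CROSS JOIN
// keeps only the small side in memory and pulls big-side rows lazily.
fn cross_join<'a, B, S>(
    big: B,
    small: &'a [S],
) -> impl Iterator<Item = (B::Item, &'a S)> + 'a
where
    B: Iterator + 'a,
    B::Item: Clone,
{
    big.flat_map(move |b| small.iter().map(move |s| (b.clone(), s)))
}

fn main() {
    let small = ["x", "y", "z"];
    // Stand-in for the huge side; it is never collected into memory.
    let big = 0u64..1_000_000;
    let n = cross_join(big, &small).count();
    println!("{n}"); // 3000000
}
```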
I do want to get a better reproduction on CH, because it seems to be an interplay between the INSERT and the SELECT in the INSERT INTO...SELECT. It's just a bit of work to generate synthetic data with the same profile as my production data (for what it's worth, I did put quite a bit of effort into following the doc guidelines for dealing with low-memory machines).
It’s important to structure your tables and queries in a way that aligns with the ordering keys, to minimize how much data needs to be loaded into RAM. You absolutely CANNOT just replicate your existing Postgres DB and its primary keys or whatever over to CH. There are tricks like projections and incremental materialized views that can help provide the appropriate “lenses” for your queries. We use incremental MVs to, for example, continuously aggregate all-time stats about tens of billions of records. In general, for CH, space is cheap and RAM is expensive, so it’s better to duplicate a table’s data with a different ordering key than to make an inefficient query.
As long as the queries align with the ordering keys, it is insanely fast, enabling analytics queries over truly massive amounts of data. We’ve been very impressed.
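To illustrate why ordering keys matter (a toy model, nothing like ClickHouse's actual sparse index): when rows are sorted by the key a query filters on, you can binary-search to a narrow slice instead of scanning every row.

```rust
// Toy model of an ordering key: sorted data lets a query touch a
// narrow range instead of every row.
fn main() {
    // One million rows sorted by user_id (the hypothetical ordering key).
    let rows: Vec<(u32, u64)> = (0..1_000_000u64).map(|i| ((i / 10) as u32, i)).collect();
    let target = 42u32;

    // Aligned with the ordering key: binary-search to the range, read ~10 rows.
    let start = rows.partition_point(|&(k, _)| k < target);
    let end = rows.partition_point(|&(k, _)| k <= target);
    let fast: u64 = rows[start..end].iter().map(|&(_, v)| v).sum();

    // Misaligned: a full scan over all million rows for the same answer.
    let slow: u64 = rows.iter().filter(|&&(k, _)| k == target).map(|&(_, v)| v).sum();
    assert_eq!(fast, slow);
    println!("{fast}");
}
```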
> This is generally good advice. Usually, extra allocations are fine, and the resulting performance degradation is not an issue. But it is a little strange that allocations are encouraged in an otherwise performance-focused language, not because the program logic demands it, but because the borrow checker does.
I often end up writing code that seems to do a million tiny clones. I've always been a little worried about fragmentation and such, but it's never been that much of an issue -- I'm sure one day it will be. I've often wanted a dynamically scoped allocator for that reason.
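For what it's worth, one common shape of those clones looks like this (a made-up example; `HashMap::entry` takes the key by value, so the easy way out is an allocation per call):

```rust
use std::collections::HashMap;

// Made-up example of an allocation the program logic doesn't strictly
// need: `entry` wants an owned String, so we allocate on every call,
// even when the key is already present in the map.
fn bump(counts: &mut HashMap<String, u32>, name: &str) {
    *counts.entry(name.to_string()).or_insert(0) += 1;
}

fn main() {
    let mut counts = HashMap::new();
    for word in ["a", "b", "a"] {
        bump(&mut counts, word);
    }
    println!("{counts:?}"); // e.g. {"a": 2, "b": 1} (order varies)
}
```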