Readit News
arecurrence commented on The two versions of Parquet   jeronimo.dev/the-two-vers... · Posted by u/tanelpoder
arecurrence · 4 months ago
Sounds similar to HDMI allowing varying levels of specification completeness to all be called the same thing.
arecurrence commented on ClickHouse raises $350M Series C   clickhouse.com/blog/click... · Posted by u/caust1c
porridgeraisin · 7 months ago
> Inserts are also expected to be in bulk (They are initially a new physical part that is later merged into the main table structure). A single DELETE is an ALTER TABLE operation in the MergeTree engine.

> They are initially a new physical part that is later merged into the main table structure

> A single DELETE is an ALTER TABLE operation

Can you explain these two further?

arecurrence · 7 months ago
The ClickHouse docs are so good that I'd point straight to them: https://clickhouse.com/docs/sql-reference/statements/alter/d...

I mentioned it because it comes as a huge surprise to some people. From the docs: "The ALTER TABLE prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. ALTER TABLE is considered a heavyweight operation that requires the underlying data to be merged before it is deleted."

There's also a "lightweight delete" available in many circumstances https://clickhouse.com/docs/sql-reference/statements/delete. Something really nice about the ClickHouse docs is that they devote quite a bit of text to describing the design and performance implications of using an operation. It reiterates the focus on performance that is pervasive across the product.

Edit: Per the other part of your question, why inserts create new parts and how they are merged is best described here https://clickhouse.com/docs/engines/table-engines/mergetree-...
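
For anyone who wants the gist without reading the docs, here is a toy Python model of the two behaviors, not ClickHouse code. `ToyMergeTree` and its methods are hypothetical names invented for illustration, and the real engine is far more involved. It assumes each insert writes a new sorted part, that a merge combines parts into one, and that a delete rewrites every part.

```python
import heapq


class ToyMergeTree:
    """Hypothetical toy model; none of these names exist in ClickHouse."""

    def __init__(self):
        # Each part is an independent list of rows, sorted by key.
        self.parts = []

    def insert(self, rows):
        # Every insert becomes its own new sorted part.
        self.parts.append(sorted(rows))

    def merge(self):
        # Stand-in for the later merge: combine all sorted parts into one.
        if len(self.parts) > 1:
            self.parts = [list(heapq.merge(*self.parts))]

    def delete_where(self, predicate):
        # Stand-in for a heavyweight delete: every part is rewritten
        # without the matching rows, however few rows actually match.
        self.parts = [
            [row for row in part if not predicate(row)] for part in self.parts
        ]


table = ToyMergeTree()
table.insert([3, 1, 2])
table.insert([6, 4, 5])
parts_before_merge = len(table.parts)
table.merge()
table.delete_where(lambda row: row == 4)
```

In this model many small inserts leave many small parts to merge later, which is one way to see why bulk inserts are preferred, and a single-row delete still touches every part.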

arecurrence commented on ClickHouse raises $350M Series C   clickhouse.com/blog/click... · Posted by u/caust1c
the__alchemist · 7 months ago
Is there an ELI5 for this company? I'm having a difficult time understanding it from their website. Is it an alternative to Postgres etc? Something that runs on top of it? And analyzes your DB automatically?
arecurrence · 7 months ago
ClickHouse has a wide range of really interesting technologies that are not in Postgres; fundamentally, it's not an OLTP database like Postgres but is aimed more at OLAP workloads. I really appreciate ClickHouse's focus on performance, and quite a bit of work goes into optimizing memory allocation and operations across different data types.

The heart of ClickHouse is its table engines (they don't exist in Postgres): https://clickhouse.com/docs/engines/table-engines . The data is ordered by the primary column (or columns), and adjacent values in memory are from the same column of the table. Index entries span wide areas (e.g., by default there's only one key record in the primary index for every 8192 rows) because most operations in ClickHouse are aggregate in nature. Inserts are also expected to be in bulk (They are initially a new physical part that is later merged into the main table structure). A single DELETE is an ALTER TABLE operation in the MergeTree engine. :)
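
To make the sparse index idea concrete, here is a minimal Python sketch, not ClickHouse's actual implementation. The function names are hypothetical, and the only number taken from the comment above is the default of one index entry per 8192 rows. It keeps the first key of each block of rows and uses binary search to decide which blocks a key range could touch.

```python
from bisect import bisect_left, bisect_right

GRANULE = 8192  # rows per index entry (the default mentioned above)


def build_sparse_index(sorted_keys, granule=GRANULE):
    """Keep only the first key of every block of `granule` rows."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule)]


def granules_to_scan(marks, lo, hi):
    """Return the block numbers that may hold keys in [lo, hi].

    The answer is conservative: a block is included whenever the index
    alone cannot rule it out.
    """
    first = max(bisect_left(marks, lo) - 1, 0)
    last = bisect_right(marks, hi)  # exclusive upper bound
    return range(first, last)


keys = list(range(100_000))        # a column already sorted by primary key
marks = build_sparse_index(keys)   # 13 entries instead of 100,000
hit = list(granules_to_scan(marks, 20_000, 20_010))
```

Here the index narrows an 11-key range query to a single block of 8192 rows, which is then scanned in full. The index stays tiny, and the trade-off is that point lookups read far more rows than they return.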

This structure allows it to literally crunch billions of values per second (by brute force, not with pre-processing, erm, "tricks", although there is a lot of support for that in ClickHouse as well). I've had tables with hundreds of columns and 100+ billion rows that are nearly as performant as a million-row table if I can structure the query to work with the table's physical ordering.

ClickHouse recommends not using nullable fields because of the performance implications (it requires storing a bit somewhere for each value). That's how much they care about perf and how close their memory layout stays to the raw data type. :)

arecurrence commented on ClickHouse raises $350M Series C   clickhouse.com/blog/click... · Posted by u/caust1c
devops000 · 7 months ago
only 2k users?

With $200/month I have a good database. $1-5M revenue?

arecurrence · 7 months ago
I've worked at a number of companies using ClickHouse and they all self-hosted. I imagine ClickHouse corporate is focused on large customers.
arecurrence commented on How Does Claude 4 Think? – Sholto Douglas and Trenton Bricken   dwarkesh.com/p/sholto-tre... · Posted by u/consumer451
arecurrence · 7 months ago
This is one of the most interesting interviews I've ever read/listened to. Reminds me of when I first heard a Lex Fridman interview (the style is completely different but it hits on a lot of material that is interesting purely due to the openness of the interviewee to talk about whatever and how the interviewer drives the conversation).

If you are at all interested in the challenges currently being grappled with in this space, this does a great job of illuminating some of them. There are many interesting passages in here, and the text transcript links to the relevant papers when their topics come up. I really like that aspect and would love to see it done a lot more often.

arecurrence commented on What’s new in Swift 6.2   hackingwithswift.com/arti... · Posted by u/ingve
codr7 · 7 months ago
There's a lot I love about Swift, but I fear it's quickly becoming too complicated for its own good.

There are just so many ways to solve a problem now that it's more or less impossible for someone to be familiar with all of them.

arecurrence · 7 months ago
I too wish deprecation with a migration path was a more common pattern in today's language development. The language has very much needed work and the numerous bugs within Apple's own libraries certainly haven't helped.

That said, some of the, erm, "new ways" to solve problems have been significant advancements. EG: Async/Await was a huge improvement over Combine for a wide variety of scenarios.

arecurrence commented on A ChatGPT mistake cost us $10k   asim.bearblog.dev/how-a-s... · Posted by u/asim-shrestha
arecurrence · 2 years ago
I wrote a bug like this once, where a database default was set to a value evaluated once when the code loaded instead of on every insert. Oops

However, luckily in my case, it was caught immediately in the staging env since collisions caused exceptions.

It's pretty easy to miss when an expression actually gets evaluated. That code is probably live somewhere else right now, surreptitiously causing issues.
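
A minimal Python sketch of this class of bug, using plain dataclasses as a stand-in rather than the actual ORM or schema involved (those details are not given here). `BuggyRow` and `FixedRow` are hypothetical names. The buggy default is computed once, when the class body runs; the fixed one is computed each time a row is created.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class BuggyRow:
    # BUG: uuid4() is called once, while the class is being defined,
    # so every row created afterwards shares this same id.
    id: str = str(uuid.uuid4())


@dataclass
class FixedRow:
    # The factory is called on every instantiation, so each row
    # gets a fresh id.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


buggy_ids = {BuggyRow().id for _ in range(3)}
fixed_ids = {FixedRow().id for _ in range(3)}
```

With a unique constraint on the id, the second insert of a `BuggyRow` collides, which matches the exceptions seen in staging.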

arecurrence commented on We Hacked Multi-Billion $ Companies in 30 Minutes with a VSCode Extension   medium.com/@amitassaraf/t... · Posted by u/amitassaraf
arecurrence · 2 years ago
This is a very well done attack. Enjoyed reading about your efforts to gain community credibility. You rapidly transformed this from a small number of victims into an epidemic.

I'm surprised that VSCode extensions don't have a permissions system (EG: "Request network access").

arecurrence commented on Suno has raised $125M to build a future where anyone can make music   suno.com/blog/fundraising... · Posted by u/whitej125
arecurrence · 2 years ago
I don't understand all the hate. I've been listening to this all morning and it's fantastic.

Sure, it's not going to trend on Apple Music... but it's the best we've ever done and a genuine step above previous efforts.

arecurrence commented on Show HN: I built a game to help you learn neural network architectures   graphgame.sabrina.dev/... · Posted by u/sabrina_ramonov
dpcx · 2 years ago
This seems like it might be interesting to me if I already had some understanding of neural networks. Unfortunately for me, I can't even complete the RNN because there's nothing to even suggest what I'm missing when I connect the dots in the only way that the UI suggests I can.
arecurrence · 2 years ago
I suspect that's a bug because if you connect Xt to Ht twice... it succeeds.

Edit: This no longer repros and only the correct solution works now from what I can tell.

u/arecurrence

Karma: 768 · Cake day: February 12, 2014