behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
henning · 2 months ago
Yes, this is what the people who will curse you out and judge you for not using wide events omit: it will greatly increase storage costs compared to the conventional metrics + traces + sample-based logging. It has both a benefit and a cost, and the cost part is always omitted.
behemot · 2 months ago
ClickHouse is pretty good at compressing the wide events, so it's not that dramatic compared to the benefits of having high-cardinality telemetry. check this out: https://clickhouse.com/blog/optimize-clickhouse-codecs-compr...
behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
Xcelerate · 2 months ago
Do wide events really have to take up this much space? I mean, observability is largely a sampling problem where the goal is to maximize the ability to reconstruct the state of the environment at a given time using a minimal amount of storage. You can accomplish that either by reducing the number of samples taken or by improving your compression capability.

For the latter, I have a very hard time believing we’ve squeezed most of the juice out of compression already. Surely there’s an absolutely massive amount of low-rank structure in all that redundant data. Yeah, I know these companies already use inverted indices and various sorts of trees, but I would have thought there are more research-y approaches (e.g. low-rank tensor decomposition) that, if we could figure out how to perform them efficiently, would blow the existing methods out of the water. But IDK, I’m not in that industry, so maybe I’m overlooking something.

behemot · 2 months ago
> Do wide events really have to take up this much space?

100PB is the total volume of the raw, uncompressed data for the full retention period (180 days). compression is what makes it cost-efficient. on this dataset, we see ~15x compression, so we only store around 6.5PB at rest.
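(sanity-checking the arithmetic: 100PB of raw data at ~15x compression works out to 100 / 15 ≈ 6.7PB, which matches the ~6.5PB at-rest figure within rounding.)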

behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
fuzzy2 · 2 months ago
Everything OTel I ever did was fully active, so I wouldn't say this is very noteworthy. Rather, it is wrong/incomplete information.
behemot · 2 months ago
we use k8s + otel filelog receiver. in this case you don't have to connect to the clickhouse instance to collect what it's writing to stdout/stderr, just tail /var/log/pods/*/*/*.log.
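for anyone curious, a minimal sketch of that kind of filelog receiver setup (illustrative only, not our exact production config; the debug exporter is just a placeholder, swap in whatever backend you actually ship to):

```yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log   # per-container stdout/stderr log files on the node
    start_at: end                 # tail only new lines once the collector is running

exporters:
  debug: {}                       # placeholder exporter for this sketch

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```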
behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
the_arun · 2 months ago
I didn’t see how long logs are kept, i.e. the retention time. After x months you may need summary/aggregated data, but I’m not sure you need the raw data.
behemot · 2 months ago
we keep it for 180 days.
behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
nijave · 2 months ago
>but don't properly account for time spent by engineers rummaging around for data they need but don't have

This is a tricky one that's come up recently. How do you quantify the value of a $$$ observability platform? Anecdotally, I know robust tracing data can help me find problems in 5-15 minutes that would have taken hours or days with manual probing and scouring logs.

Even then you have the additional challenge of quantifying the impact of the original issue.

behemot · 2 months ago
> Anecdotally I know robust tracing data can help me find problems in 5-15 minutes that would have taken hours or days with manual probing and scouring logs.

exactly. high-cardinality, wide structured events are the way.

behemot commented on Scaling our observability platform by embracing wide events and replacing OTel   clickhouse.com/blog/scali... · Posted by u/valyala
b0a04gl · 2 months ago
tbh that's not the flex. storing 100PB of logs just means we haven't figured out what's actually worth logging. metrics + structured events can usually tell 90% of the story. the rest? trace-level chaos no one reads unless prod's on fire. what could've been done better: auto-pruning logs that no alert ever looked at, or logs that never hit a search query in 3 months. call it attention-weighted retention. until then this is just a high-end digital landfill with compression
behemot · 2 months ago
hey there! I work at ClickHouse. to clarify: the vast majority of this 100PB is structured events. in our case logs are supplementary.
