A small nitpick, but having loads of data that “most likely no-one will look at ever again” is OK to an extent when that data is there to diagnose incidents. It’s not useful most of the time, until it’s really, really useful. But it’s a matter of degree, and dumping the same information redundantly is pointless and infuriating.
This is one reason why it’s nice to create readable specs from telemetry, with traces/spans initiated from test drivers and passed through the stack (rather than trying to make natural language executable the way Cucumber does; that’s a lot of work and complexity for non-production code). Then our observability data gets looked at many times, to diagnose test failures, before there’s ever a production incident. And hopefully the attributes we add to diagnose tests are also useful for similar diagnostics in prod.
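A minimal sketch of the test-driver side of that idea, using OpenTelemetry’s Python API (the endpoint, payload, and `spec.*` attribute names are made up for illustration, not an established convention): the test opens the root span, records the scenario as attributes, and injects the trace context into the outgoing request so spans emitted by the services under test join the same trace.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.propagate import inject
import requests

# Export spans to stdout for the sketch; a real setup would point an OTLP
# exporter at whatever backend renders the "readable spec" view.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
tracer = trace.get_tracer("spec-tests")

def test_checkout_applies_discount():
    # The test driver owns the root span; its attributes describe the scenario,
    # so the exported trace doubles as a readable record of what was exercised.
    with tracer.start_as_current_span("spec: checkout applies discount") as span:
        span.set_attribute("spec.given", "cart with 2 items and promo code SAVE10")
        span.set_attribute("spec.expected", "total reduced by 10%")

        headers = {}
        inject(headers)  # adds the W3C traceparent header so downstream spans attach to this trace

        resp = requests.post("http://localhost:8080/checkout",  # hypothetical service endpoint
                             json={"items": 2, "promo": "SAVE10"},
                             headers=headers)

        span.set_attribute("spec.actual_status", resp.status_code)
        assert resp.status_code == 200
```

When a test fails, you read the same kind of trace you’d read in production, and the span attributes tell you which step of the scenario diverged from the spec.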
GitHub here - hope the tool can help some folks in this thread: https://github.com/coroot/coroot
Event features:
- Experts from Google, Microsoft, Oracle, Qdrant, Manticore Search, Weaviate sharing real-world applications, best practices, and future directions in high-performance search and retrieval systems
- Presentations for all skill levels
- Live Q&A with industry leaders, plus virtual networking
A few of the presenting speakers:
- Gunjan Joyal (Google): “Indexing and Searching at Scale with PostgreSQL and pgvector – from Prototype to Production”
- Maxim Sainikov (Microsoft): “Advanced Techniques in Retrieval-Augmented Generation with Azure AI Search”
- Ridha Chabad (Oracle): “LLMs and Vector Search unified in one Database: MySQL HeatWave's Approach to Intelligent Data Discovery”
If you can’t make it live but want to learn from one of these talks, the sessions will also be recorded. Registration is free at https://vsearchcon.com/register/ - hope you learn something from the event!
I went spelunking around in the codebase trying to get the actual answer to your question, and it seems it's like many things: theoretically yes, with enough energy expended, but by default it SSHes into the target hosts and runs a pseudo-agent that talks its own protocol back over SSH. So, "no".
Then, after you go through all that effort, most of the data is utterly ignored, and the business insights are rarely much better than the trailer-park version: SSHing into a box and grepping a log file to find the error output.
We put so much effort into this ecosystem, but I don't think it has paid us back with any significant increase in uptime, performance, or ergonomics.
Coroot is an open-source project I'm working with to try to tackle this. It uses eBPF to automatically gather your data into a centralized service map, and then provides root-cause-analysis (RCA) insights (with things like mapped incident timeframes) to help you implement fixes more quickly and improve uptime.
GitHub here and we'd love any feedback if you think it can help: https://github.com/coroot/coroot