We used to work on product and engineering at Datadog and Splunk. We saw how even teams using these industry-leading tools were struggling to effectively interpret and use their logging data. The sheer volume of logs overwhelmed experts and newcomers alike, making it difficult to quickly identify meaningful issues or patterns. Despite powerful indexing and search capabilities, developers still had to manually piece together context from different logs, dashboards, and sources—a tedious and error-prone process.
The “noisy logging” problem—that is, the gap between overwhelming amounts of raw log data and insights people can act on—ultimately is a gap between machines (which generate all this data) and humans (who want and need the insights). SiftDev is built to bridge that gap and to automate the tedious, manual aspects of debugging and observability. In marketing-speak: “humans should never have to look at a log again!” We think people should interact with their data in terms that make sense on a human level.
What makes SiftDev different is its understanding of application context over time. While traditional tooling typically lets developers analyze logs in isolation, or with minimal surrounding context, SiftDev builds comprehensive profiles of your application's normal behavior patterns. This awareness allows us to understand what's truly abnormal versus what might appear unusual in a single snapshot but is actually expected behavior for your specific application. SiftDev applies semantic analysis and profiling to understand your application's logging behavior holistically. Instead of relying solely on manual search, Sift identifies core application processes, automatically detects patterns, and surfaces anomalies, including clear explanations and context.
Here are some examples of what this can look like in practice:

* Identify core processes: SiftDev instantly recognizes your payment workflows—like authorization, capture, and refunds—without manual tagging.
* Detect performance patterns: SiftDev learns your nightly batch job typically handles 10,000 records in 45 minutes, establishing a clear baseline.
* Surface hidden anomalies: SiftDev flags silent failures, such as two microservices updating the same record within 50ms—issues normally hidden by routine logs.
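To make the "baseline" idea concrete, here's a toy sketch (not our actual implementation; the job history and threshold are made up) of what flagging a deviation from a learned baseline can look like:

```python
# Toy sketch: flag a batch-job run that deviates from its learned duration baseline.
from statistics import mean, stdev

# Past run durations in minutes for the nightly batch job (made-up history).
history = [44, 46, 45, 47, 45, 44, 46, 45]

def is_anomalous(duration_min: float, past_runs: list[float], z_threshold: float = 3.0) -> bool:
    """Flag a run whose duration is more than z_threshold standard deviations from the mean."""
    mu, sigma = mean(past_runs), stdev(past_runs)
    return abs(duration_min - mu) > z_threshold * sigma

print(is_anomalous(45, history))  # False: in line with the usual ~45-minute baseline
print(is_anomalous(95, history))  # True: a run that took twice as long gets flagged
```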
You can then directly ask your logs questions like, “What's causing errors in our checkout service?” or “Why did latency spike at 2 AM?” and immediately receive insightful, actionable answers that you’d otherwise manually be searching for.
We’d love for you to test out our product via our demo playground at https://app.trysift.dev/! It’s a slightly less functional version of our platform but shares a lot of the core features. Note: we do need users to sign up for this, but the waitlist is optional (of course).
We'd love your feedback, thoughts, and experiences dealing with logging and observability challenges!
Log diving takes a lot of time, especially during some kind of outage/downtime/bug where the whole team might be watching a screen share of someone diving into logs.
At the same time, I am sceptical about "AI", especially if it is just an LLM stumbling around.
Understanding logs is probably the most brain intensive part of the job for me, more so than system design, project planning or coding.
This is because you need to know where the code is logging and imagine code paths in your head, and you constantly see stuff that is a red herring or doesn't make sense.
I hope you can improve this space but it won't be easy!
As for the skepticism with LLMs stumbling around raw logs: it's super deserved. Even the developers who wrote the program often refer to larger app context when debugging, so it's not as easy as throwing a bunch of logs into an LLM. Plus, context window limits & the relative lack of "understanding" with increasingly larger contexts are troublesome.
We found it helped a lot to profile application logs over time. Think aggregation, but for individual flows rather than similar logs. By grouping and ordering flows together, it's bringing the context of thousands of (repetitive) logs down to the core flows. Much easier to find when things are out of the ordinary.
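For a rough picture of what "grouping and ordering flows" means, here's a toy sketch (our real pipeline is more involved than this); it assumes each log line carries a request/trace ID and reduces many repetitive lines down to the distinct sequences of steps:

```python
# Toy sketch: collapse repetitive logs into distinct per-request flows.
# Assumes each entry carries a trace/request ID, a timestamp, and a message template.
from collections import Counter, defaultdict

logs = [
    {"trace": "t1", "ts": 1, "msg": "auth started"},
    {"trace": "t1", "ts": 2, "msg": "auth ok"},
    {"trace": "t2", "ts": 1, "msg": "auth started"},
    {"trace": "t2", "ts": 3, "msg": "auth timeout"},  # the odd one out
    {"trace": "t3", "ts": 1, "msg": "auth started"},
    {"trace": "t3", "ts": 2, "msg": "auth ok"},
]

# Group lines by trace ID and order each group by timestamp -> one "flow" per request.
flows = defaultdict(list)
for entry in sorted(logs, key=lambda e: e["ts"]):
    flows[entry["trace"]].append(entry["msg"])

# Count how often each distinct flow occurs; rare flows are candidates for "out of the ordinary".
flow_counts = Counter(tuple(steps) for steps in flows.values())
for flow, count in flow_counts.most_common():
    print(count, "x", " -> ".join(flow))
# 2 x auth started -> auth ok
# 1 x auth started -> auth timeout
```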
There's still a lot of room for improvement when it comes to false positives and variations in application flows.
You have to do this at the inception of the software you’re building rather than strap it on the donkey when something breaks (the usual way).
Moving fast has its downsides, and I can't say I blame people for deprioritizing good logging practices. But it does come back to bite...
Though as a caveat, you don't always have control over your logs -- especially with third party services, large but fragmented engineering organizations, etc. -- even with great internal practices, there's always something.
On another note, access to the codebase + live logs gives room to develop better auto-instrumentation tooling. Though perhaps Cursor could do a decent enough job of starting folks off.
* Make sure the logs are actionable
* Make sure the logs are readable (a quick sketch of both follows after this list)
* Make sure you are collecting operational metrics
* Make sure the metrics are useful
* Make sure you have error handling
* Make sure you have alerting
* Make sure you document how to support the application
* Make sure you have knobs and levers you can pull in an emergency to change the system's behavior or fix things
* Make sure you have vetted the system for security issues
etc.
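To make the first two items concrete, here's a minimal sketch of readable, actionable logging using Python's standard logging module with structured fields; the field names and formatter here are just one way to do it:

```python
# Minimal sketch: structured, actionable log lines with Python's standard logging module.
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object so downstream tools can parse it reliably."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed via `extra=` land on the record; pick up the ones we care about.
            **{k: v for k, v in record.__dict__.items() if k in ("order_id", "action", "hint")},
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Actionable: says what failed, for which entity, and what the operator should do about it.
logger.error(
    "payment capture failed",
    extra={"order_id": "ord_123", "action": "capture", "hint": "retry after gateway recovers"},
)
```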
I agree; even when applicable, LLMs are relegated to analyzing subselected data, so logs have to go somewhere else first. I think understanding logs is brain intensive because it can be a tricky problem. It gets easier with good tools, but often those tools are the kind that need to be used to build something else that solves the problem, rather than solve the problem themselves (e.g. building a good query + automation). I think LLMs can get better at creating the queries, which would help a lot.
We started Gravwell to try to bring some magic. It's a schema-on-read time-series data lake that will eat text or binary and comes in SaaS or self-hosted (on-prem). We built our backend from scratch to offer maximum flexibility in querying. The search syntax looks like a Linux command line, and kinda behaves like one too. Chain modules together to extract, filter, aggregate, enrich, etc. Automation system included. If you like Splunk, you should check us out.
There's a free community edition (personal or commercial use) for 2GB/day anon or 14GB/day w/ email. Tech docs are open at docs.gravwell.io.
I don't understand, what about that is a "silent failure"?
in order for your product to even know about it, wouldn't I need to write a log message for every single record update?
and if my architecture allows two microservices to update the same row in the same database...maybe it happening within 50ms is expected?
that could be an inefficient architecture for sure, but I'm confused as to whether your product is also trying to give me recommendations about "here's an architectural inefficiency we found based on feeding your logs to an LLM"
> You can then directly ask your logs questions like, “What's causing errors in our checkout service?” or “Why did latency spike at 2 AM?” and immediately receive insightful, actionable answers that you’d otherwise manually be searching for.
the general question I have with any product that's marketing itself as being "AI-powered" - how do hallucinations get resolved?
I already have human coworkers who will investigate some error or alert or performance problem, and come to an incorrect conclusion about the cause.
when that happens I can walk through their thought process and analysis chain with them and identify the gap that led them to the incorrect conclusion. often this is a useful signal that our system documentation needs to be updated, or log messages need to be clarified, or a dashboard should include a different metric, etc etc.
if I ask your product "what caused such-and-such outage" and the answer that comes back is incorrect, how do I "teach" it the correct answer?
Silent failures can be "allowed" behavior in your application that isn't actually labeled as an error but is still irregular. Think race conditions, deadlocks, silent timeouts, or even just mislabeled error logs.
> in order for your product to even know about it, wouldn't I need to write a log message for every single record update?
That's right, and this may not always be feasible (or necessary!), but if your application can be impacted by errors like these, it may be worth logging anyway.
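As a rough illustration of how a conflict like the 50ms example could be spotted from update logs alone (this is a toy sketch, not how our product is implemented; the log shape is assumed):

```python
# Toy sketch: flag two different services updating the same record within a short window.
# Assumes each update log carries a record ID, a service name, and a millisecond timestamp.
from collections import defaultdict

WINDOW_MS = 50

update_logs = [
    {"record": "acct_42", "service": "billing", "ts_ms": 1_000},
    {"record": "acct_42", "service": "invoicing", "ts_ms": 1_030},  # 30ms later, different service
    {"record": "acct_77", "service": "billing", "ts_ms": 2_000},
]

by_record = defaultdict(list)
for entry in update_logs:
    by_record[entry["record"]].append(entry)

for record, entries in by_record.items():
    entries.sort(key=lambda e: e["ts_ms"])
    for prev, curr in zip(entries, entries[1:]):
        if curr["service"] != prev["service"] and curr["ts_ms"] - prev["ts_ms"] <= WINDOW_MS:
            gap = curr["ts_ms"] - prev["ts_ms"]
            print(f"possible conflict on {record}: {prev['service']} and {curr['service']} within {gap}ms")
# possible conflict on acct_42: billing and invoicing within 30ms
```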
> the general question I have with any product that's marketing itself as being "AI-powered" - how do hallucinations get resolved?
> and if my architecture allows two microservices to update the same row in the same database...maybe it happening within 50ms is expected?
> if I ask your product "what caused such-and-such outage" and the answer that comes back is incorrect, how do I "teach" it the correct answer?
For these concerns, human-in-the-loop feedback is our preliminary approach! We have our own feedback loop running internally to account for changes and false errors, but having explanations from human input (even something as simple as "Not an error" or "Missed error" buttons) is very helpful.
> when that happens I can walk through their thought process and analysis chain with them and identify the gap that led them to the incorrect conclusion. often this is a useful signal that our system documentation needs to be updated, or log messages need to be clarified, or a dashboard should include a different metric, etc etc.
Got it, I imagine it'll be very helpful for us to display our chain of thought in our dashboards too. Great feedback, thank you!
I agree that those are bad things.
but how does your product help me with them?
I have some code that has a deadlock. are you suggesting that I can find the deadlock by shipping my logs to a 3rd-party service that will feed them into an LLM?
In most of the industries I work in we would never just send you our logs.
What stops me from building my own logger that sends a request to write a record to a DB and later asks an LLM what it means?
Where is the pricing information?
Why do I need to log in to visit your homepage? How would I pitch this to my boss if they can’t read what it does?
Edit: https://runsift.com/pricing.html
I see the landing page. The pricing should be clear, though; “Contact Us” is scary.
Yep, we have an on-prem offering as well; we've gotten similar notes from folks before!
> What stops me from building my own logger that sends a request to write a record to a DB and later asks an LLM what it means ?
Great question! The main limitation of the brute-force approach is the sheer volume of noise, and therefore of relevant context. We tried this and realized it wasn't working. From a numbers perspective, at even just 10s of GBs/day scale of data (not even close to enterprise scale), mainstream LLMs can't provide the context windows you need for more than a few minutes of operational data. And larger models suffer from other factors (like attention diffusion / dilution & drift).
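Back-of-the-envelope, assuming roughly 4 bytes of log text per token (a rule-of-thumb figure, so treat the exact numbers loosely):

```python
# Back-of-the-envelope: how much of a day's logs fits in one LLM context window.
# Assumes ~4 bytes of log text per token, which is only a rough rule of thumb.
BYTES_PER_TOKEN = 4
DAILY_LOG_BYTES = 10 * 10**9       # 10 GB/day of logs
CONTEXT_WINDOW_TOKENS = 200_000    # a generously sized mainstream context window

tokens_per_day = DAILY_LOG_BYTES / BYTES_PER_TOKEN       # ~2.5 billion tokens/day
tokens_per_second = tokens_per_day / (24 * 60 * 60)      # ~29k tokens/second
seconds_per_window = CONTEXT_WINDOW_TOKENS / tokens_per_second

print(f"~{seconds_per_window:.0f} seconds of logs fit in one context window")  # ~7 seconds
```

So even before enterprise scale, a single window covers seconds of raw logs, not the hours you'd actually want to reason over.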
> I see the landing page. The pricing should be clear though “Contact Us” is scary.

Noted!
I hope my tone wasn’t too brash.
If you can update the pricing I might be able to pitch this to my org later this year. We’d definitely like an on-prem solution though!
Logs were a natural starting point because that’s where developers often spend a significant amount of time stuck reading & searching for the right information, manually tracking down issues + jumping between logs across services. In a way, just finding & summarizing relevant logs for the user gave people an easier time debugging.
But metrics will introduce more dimensions to establish baseline behavior, so we're pretty excited about it too.
Aggregated (and simplified) versions of your logs, plus flagged anomalies, get passed through our LLMs.
There was no GH link for your npm dep, so maybe they're both private. Although npmjs shows your npm one as ISC licensed, likely because of the default in package.json.
And Kudos to SigNoz as well - have to check out other folks in the space :)
It is also good for finding out what the buffering story is, because I would want to know if I'm dragging an unbounded queue into my app (putting memory pressure on me) or whether your service returning 503s is going to eat logs. That's the kind of thing that only looking at the source would settle for sure, because the docs don't even hint at such operational concerns.
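To be concrete about the buffering behavior I'd want spelled out, here's a rough sketch of a bounded, drop-on-full buffer (purely illustrative; I'm not claiming this is how the Sift SDK behaves):

```python
# Rough sketch of a bounded, non-blocking log buffer: when the queue is full
# (e.g. the ingest endpoint keeps returning 503s), new lines are dropped instead
# of blocking the app or growing memory without limit.
import queue

class BoundedLogBuffer:
    def __init__(self, max_lines: int = 10_000):
        self._q = queue.Queue(maxsize=max_lines)
        self.dropped = 0

    def push(self, line: str) -> None:
        try:
            self._q.put_nowait(line)   # never blocks the calling thread
        except queue.Full:
            self.dropped += 1          # bounded memory: shed load instead of queueing forever

    def drain(self, batch_size: int = 500) -> list[str]:
        """Called by a background sender; returns up to batch_size buffered lines."""
        batch = []
        while len(batch) < batch_size:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        return batch
```

Whether a client library blocks, buffers without bound, or sheds load like this is exactly the tradeoff I'd want documented.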
Anyway, the only reason I mentioned the dead link is that your PyPI page linked to GH in the first place. So if you don't intend for people to think there's supposed to be a repo, I'd suggest removing the repo link.
Would you like me to create a transport for it (I'm not implying I'd be charging to do this; it'd be free)?
The benefit of LogLayer is that they'd just use the loglayer library to make their log calls, and it ships them to whatever transports they have defined for it. Better than having them manage two separate loggers (e.g. Sift and Pino) or write their own wrapper.
If you’ve already set up logging, there's a good chance you can just point your instrumentation toward us, and we know how to ingest and handle it.