Show HN: Keep – GitHub Actions for your monitoring tools

github.com/keephq/keep...

Hi Hacker News! Shahar and Tal from Keep here.

A few months ago, we introduced Keep here on HN (https://news.ycombinator.com/item?id=34806482) as an “open source alerting CLI” and got some interesting feedback, mainly around UI, automation, and supporting more tools. We were VERY early back then, and we came to understand that although the current DX around creating alerts is not great, it's not that critical, and developers don’t need another tool just for that.

But we did find something else.

While talking to developers and DevOps engineers, we found that a lot of companies use many tools that generate alerts - from CloudWatch, Prometheus, Grafana, and Datadog to tools such as Zabbix or Nagios. We definitely agree that consolidation in the observability space is a real thing, but after talking to those companies we feel there are still real use cases for having more than one tool (for example, according to Grafana’s 2023 observability survey, 52% of companies use more than 6 observability tools: https://grafana.com/observability-survey-2023/).

So with that in mind, we rebuilt Keep with a simple mindset: (1) Integrate with every tool that triggers alerts - either by having the tool push alerts to Keep via webhooks or routing policies, or by having Keep pull alerts via the tool's API. (2) Create a simple abstraction layer to run workflows on top of these alerts. (3) Maintain a great developer experience - open source, API-first, workflows as code, and generally keeping a developer mindset while building Keep.
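To make points (1) and (2) concrete, here is a minimal sketch of the idea in plain Python. It is not Keep's actual code or API: `Alert`, `from_webhook`, `from_poll`, and `run_workflow` are hypothetical names invented for illustration, and the payload fields are assumed.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Alert:
    """Hypothetical common shape that every tool's alert is normalized into."""
    source: str    # tool that produced the alert, e.g. "grafana"
    name: str
    severity: str  # assumed values: "info", "warning", "critical"


def from_webhook(source: str, payload: dict) -> Alert:
    """Push path: the tool sends a payload, which is normalized into an Alert."""
    return Alert(source, payload["name"], payload.get("severity", "info"))


def from_poll(source: str, fetch: Callable[[], Iterable[dict]]) -> list[Alert]:
    """Pull path: `fetch` stands in for a call to the tool's API."""
    return [from_webhook(source, item) for item in fetch()]


def run_workflow(
    alerts: Iterable[Alert],
    condition: Callable[[Alert], bool],
    action: Callable[[Alert], str],
) -> list[str]:
    """Abstraction layer: run the action on every alert that matches the condition."""
    return [action(alert) for alert in alerts if condition(alert)]


# Usage: one pushed alert, one pulled alert, one workflow over both.
pushed = from_webhook("grafana", {"name": "disk_full", "severity": "critical"})
pulled = from_poll("zabbix", lambda: [{"name": "cpu_high"}])
results = run_workflow(
    [pushed, *pulled],
    condition=lambda a: a.severity == "critical",
    action=lambda a: f"page on-call: {a.source}/{a.name}",
)
```

The point of the sketch is only that once push and pull sources share one shape, a workflow does not need to know which tool an alert came from.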

While we were rebuilding Keep, Datadog released their workflow automation tool (https://docs.datadoghq.com/service_management/workflows/), which made us realize that this is exactly what we solve - but for everyone who uses tools other than Datadog.

A short demo of Keep with a simple use case: https://www.youtube.com/watch?v=FPMRCZM8ZYg

You can try it yourself by signing in at https://platform.keephq.dev

As always, we invite you to try Keep, and we are eager to hear any feedback.

Since this is 2023 and we are releasing things that solve X and Y problems in YML, I do want to take the opportunity to question whether solving problem X or Y in YML is really the thing we should be building businesses around these days. I’ve spent the greater part of the last year or so undoing the pain of “reasonably complex GHA in YML” in my organization. It’s one of those things that sounds great conceptually and works really well for simple cases, but once your use case evolves beyond even remotely simple (for example, abstracting and maintaining this code in an engineering org in the tens of people, not even hundreds), it is a slow-growing cancer that ends up being a huge time suck, an unmaintainable, untestable mess, and technical debt for your org.

Jnr · 2 years ago

Did you perhaps have an alternative solution you don't mind sharing? In my opinion yaml is good enough for gitops. Easy to read, understand, modify.

SOLAR_FIELDS · 2 years ago

I’ve been using Dagger for this replacement specifically. But that is specific to CI/CD caching and workflow execution. For things like workflow automation and orchestration I would reach for something like Prefect or Dagster. The point is to be able to do something in an actual programming language so that you not only get typing, readability, reusability, unit testability, local execution and language-specific tooling, but also so that it doesn’t suck for end users to write, debug and maintain. This also gives end users an escape hatch when your abstraction is inevitably not good enough for them.

Cuelang etc., like siblings mentioned, are decent enough, but the real scalable solutions here are made available in general purpose programming languages.
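As a rough illustration of the typing and unit testability argument, here is a sketch in plain Python. It does not use the Dagger, Prefect, or Dagster APIs; `DiskAlert`, `should_page`, `render_message`, and the 90% default threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskAlert:
    """Hypothetical typed alert; a type checker can catch misuse before runtime."""
    host: str
    used_percent: float


def should_page(alert: DiskAlert, threshold: float = 90.0) -> bool:
    """Decision logic as an ordinary function: reusable and runnable locally."""
    return alert.used_percent >= threshold


def render_message(alert: DiskAlert) -> str:
    """Build the notification text for an alert."""
    return f"{alert.host}: disk at {alert.used_percent:.0f}%"


def test_should_page() -> None:
    """A unit test for the workflow logic, with no CI runner involved."""
    assert should_page(DiskAlert("db-1", 95.0))
    assert not should_page(DiskAlert("db-1", 50.0))
    assert should_page(DiskAlert("db-1", 80.0), threshold=75.0)


test_should_page()
```

The equivalent condition embedded in a YAML expression string would typically only be exercised by running the whole workflow.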

shahargl · 2 years ago

And what was the solution? How did you eventually address those issues? While I agree that GitHub Actions has its downsides, it's also widely used and simple to start with, which we thought was a good approach. Would you be more comfortable with 'Zapier for Monitoring' or an alternative to 'Datadog Workflow Automation'?

xctr94 · 2 years ago

The complaint doesn’t seem to be about GitHub Actions, but YAML. I agree 100%. As soon as I saw that Keep is using YAML, I closed the tab.

Nope. Nope. Nope.

It’s like going back to Mongo without schemas and relational checks. We have perfectly good configuration languages with schemas, checks, imports, logic, etc. YAML is unacceptable in this profession.

seperman · 2 years ago

I have been working on a data validation tool for a while. I even tried creating an extended YAML parser for data validation. You made me realize I wasted my time with that approach. Better now than later. I would love to talk to you before I throw away more code. Can we connect?

shahargl · 2 years ago

Hey! I've missed your reply here. Sent an email.

SeriousM · 2 years ago

I'm the system architect and code quality gate in my company, and I feel you... my job is to keep things sane, consistent and extendable. GHA as well as Azure Logic Apps are both helpful at small scale but, omg, so far away from being reusable or even able to deploy the same damn thing to different stages from code. On GHA: I find that GHA looks just the same as Azure DevOps Pipelines, yet GHA doesn't hold your hand when designing and evaluating the steps.

SOLAR_FIELDS · 2 years ago

Under the hood, GHA uses the same backend as Azure DevOps Pipelines, so it makes sense that they look the same.

doctorpangloss · 2 years ago

Yeah but people freak out when they see Gradle, Bash, Bazel, or even wacky raw Python.

The real competition is, what will LLMs write better? Because I have zero interest in learning new DSLs, I just want whatever will be most text based to use through an LLM.

lionkor · 2 years ago

Then, honestly, you want it to write something that is statically verifiable.

SOLAR_FIELDS · 2 years ago

You probably want Python then. I think it's been well demonstrated that it is probably the language that the most effort has gone into training LLMs to work with, in multiple facets.

nstart · 2 years ago

I'm looking at this and thinking, "you know what, this could be an awesome personal tool as well".

This is definitely outside of the use cases described but I can definitely see myself hooking this up in an IFTTT style to funnel things into my todo systems using the HTTP provider.

Will poke around this soon.

That's definitely something we want people to do: use it for their small, annoying manual tasks. At the end of the day, I think this will help Keep become very user friendly.

OJFord · 2 years ago

IMO the readme docs make it seem confusingly like it's built-in/really well integrated with Actions, because the syntax is so similar. It takes some light digging to find it's actually entirely separate (but similar) and run as `keep run --alerts-file=path`, from GH Actions or anything else at all, because it's a separate file parsed by a third-party program that just so happens to have a similar syntax.

Nice tool though, looks useful, added to the list.

talboren · 2 years ago

thanks for the feedback here, that’s really important for us and we actually just refactored the readme a couple of days ago. any concrete action item you’d suggest?

Try to read it as someone approaching the whole project for the first time, particularly the 'Workflows' section, it just doesn't really make sense, kind of implies there's something special about running it in GitHub Actions (maybe it means to say that the syntax is similar?) and doesn't tell you how to actually pass the Keep workflow file to keep (which is what will parse it, not GH Actions) at all.

Start with `keep run --actions-file`, then show the file format. Don't mention specific CI/CD, it's not relevant.

folivore · 2 years ago

This genuinely seems useful. I’ll give it a shot. Nice!

let me know how I can help

rozenmd · 2 years ago

This looks great!

I'm the maker of an alert-generating tool (OnlineOrNot), how would I go about adding an integration for Keep?

Well, actually, documentation around that is still under construction (https://docs.keephq.dev/development/adding-a-new-provider), but adding a new integration (a provider, in our terms) to Keep is a piece of cake! Happy to chat in our Slack (https://slack.keephq.dev) or over Zoom, whatever works for you :)

erulabs · 2 years ago

Haha, my team maintains something just like this internally - ours is called “info-radiator”. Great idea for a product. You should add an Amazon referral link to a small Lenovo tablet and some Velcro for developers to have a dedicated Keep monitor!

thanks! happy to help your team migrate to Keep and send out a few complimentary monitors ;)

matsimitsu · 2 years ago

We have a similar internal tool, and it's also called keep!

Besides alerts it also tracks, and displays things such as which MongoDB server is the primary, or which ElasticSearch node is the controller.

What are the odds? It would be super interesting to talk; are you up for it?

CodeAlong · 2 years ago

Looks really interesting! Does the self-hosted version support OAuth or other authentication methods to manage users through an external identity provider?

The self-hosted version supports either single-tenant mode (no users/login at all) or multi-tenant mode, where you can configure your own Auth0 account and work with all of their supported IdPs. It’s not so well documented, but I can have that documented ASAP.