Happy to have fixed up a bunch of these issues already though.
I've never heard of anyone using Opsgenie or Splunk for on-call, and Opsgenie's 3-week outage or whatever is pretty damning.
We've since built on-call directly into our product and we've had loads of those customers migrate entirely into us, dropping PagerDuty. The biggest customers we're onboarding now tend to buy us as their sole on-call and incident management tool, too.
We have a bunch of case studies of people who've moved, if that's useful to anyone. My favourite is probably Intercom, who migrated from PagerDuty into our on-call in a few weeks (the Intercom team are great fun to work with!)
I work at incident.io; it's a daily struggle.
Would be interested if anyone else sees these dynamics playing out. It seems there's a really powerful combination at work: larger players starting late, plus a rising tide of AI improvements, means the smaller shops can leapfrog ahead in ways I haven't seen before.
There’s a first wave of incident startups that responded to the market having stagnated about 4 years ago (incident.io, FireHydrant, Rootly) then a slew of extremely recent (<1 year) companies leaning into AI incident response.
It's weird that Opsgenie is just quitting that race, but realistically they weren't really competing in terms of pace of development. It felt more like Opsgenie was bought under the assumption that IR was a 'solved' problem Atlassian could just add to their stack and be done with, while today it's increasingly apparent that just paging someone is the smallest part.
We’ve got a good migration story (import OG schedules and escalation paths, etc) so most customers just migrate to us, but if you didn’t have that option the Atlassian migration is much more painful.
In a past life I struggled a lot with a public API that had some really tricky pagination performance problems. It was something we were always fighting, be it awkward edge-case data shapes or the Postgres planner having bad statistics, and everything would get difficult once a table grew past 5TB.
Was really happy to see the team find a solution that feels scalable and can be applied generically to a lot of our endpoints at incident.io. A great outcome: I think we'll be safely scalable for the next few years, instead of hitting the other problems that would have cropped up had we gone with GIN indexes or similar.
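To make the pagination point a bit more concrete: the thread doesn't say exactly what we shipped, but a generic pattern that tends to hold up at multi-TB scale (where OFFSET-based paging falls over) is keyset/cursor pagination over a composite index. A rough Go sketch against Postgres, with a made-up api_events table and (created_at, id) index purely for illustration:

    package main

    import (
        "context"
        "database/sql"
        "time"

        _ "github.com/lib/pq" // Postgres driver
    )

    // Cursor identifies the last row of the previous page.
    type Cursor struct {
        CreatedAt time.Time
        ID        string
    }

    // ListEvents fetches one page of rows strictly after the cursor.
    // The (created_at, id) tuple comparison lets Postgres seek straight
    // to the start of the page via the composite index, so cost doesn't
    // grow with page depth the way OFFSET does.
    func ListEvents(ctx context.Context, db *sql.DB, after Cursor, pageSize int) (*sql.Rows, error) {
        return db.QueryContext(ctx, `
            SELECT id, created_at, payload
            FROM api_events
            WHERE (created_at, id) > ($1, $2)
            ORDER BY created_at, id
            LIMIT $3`,
            after.CreatedAt, after.ID, pageSize,
        )
    }

The appeal of this shape is that page 10,000 costs roughly the same as page 1, because the WHERE clause walks the index to the right spot instead of scanning and discarding rows, and it leans far less on the planner getting row estimates right for deep pages.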