In my experience, the most helpful approach to performing RCA on complicated systems involves several hours, if not days, of hypothesizing and modeling prior to test(s). The hypothesis guides the tests, and without a fully formed conjecture you’re practically guaranteed to fit your hypothesis to the data ex post facto. Not to mention that in complex systems there are usually 10 benign things wrong for every 1 real issue you might find - without a clear hypothesis, it's easy to go chasing down rabbit holes with your testing.
I wonder how long it will take frontier LLMs to be able to handle something like this with ease, without needing a lot of "scaffolding".
It is designed to solve the problem of "RPG hero just killed a dragon in front of the town and no one says anything about it." All the NPCs realistically react and talk about the Hero's exploits.
Visitors to the site can vote on what quest the hero undertakes next.
I'm running into the problem that the site isn't much fun. I'm honestly not sure what to do about that!
An only slightly buggy build is at https://www.generativestorytelling.ai/tinyllmtown/index.html
Importantly, I am aiming to have everything (except voice gen) working on a small model that can be run locally.
I love my Peloton, and it was really annoying trying to find a recent ride with a difficulty over a certain threshold. I'm an embedded developer by day and did this with a lot of help from ChatGPT and Claude. Hoping to add some more features to it as time goes on.