curious how often the judge actually disagrees with the first candidate answer in practice. does the council mostly refine reasoning, or does it sometimes lead to completely different conclusions?
The harder problem is auth-gated content — Instagram feeds, dashboards, paywalled pages. Browser Steps handles it today (you can script login flows), but honestly I think the real fix is AI-assisted interaction. A small cheap model that can find what you care about without needing a brittle selector at all. That's where I want to take this — less "maintain a CSS path", more "here's what I'm interested in, figure it out."
the idea of a small model just identifying “the thing that looks like a price / status / headline” feels much closer to semantic detection than DOM-path tracking
curious though — would you run that model on every check, or only when the selector fails? seems like a nice hybrid approach to keep things cheap.
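the hybrid could be as simple as: try the stored selector, and only wake the model when it comes back empty. a toy sketch (the regex "selector" and the `semantic_extract` stand-in for the small model are both hypothetical, just to show the control flow):

```python
import re

def hybrid_check(page, css_select, semantic_extract):
    # Cheap path: the stored selector (stubbed here as a regex lookup).
    value = css_select(page)
    if value is not None:
        return value, "selector"
    # Fallback: hypothetical small model that finds "the thing that looks like a price".
    return semantic_extract(page), "model"

# Toy stand-ins: a brittle class-based selector vs. a semantic "looks like a price" match.
css_select = lambda page: (m.group(1) if (m := re.search(r'class="price-v1">([^<]+)<', page)) else None)
semantic_extract = lambda page: (m.group(0) if (m := re.search(r"\$\d+(?:\.\d{2})?", page)) else None)

old = '<span class="price-v1">$19.99</span>'
new = '<span class="p_x93k">$19.99</span>'   # class name churned after a rebuild

print(hybrid_check(old, css_select, semantic_extract))  # ('$19.99', 'selector')
print(hybrid_check(new, css_select, semantic_extract))  # ('$19.99', 'model')
```

that way the model's cost is only paid on the rare checks where the selector has already broken.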
The fragile cases are sites that generate class names on every build (React/webpack/vite apps often do this) — those selectors will just stop working.
For semantic elements like price tags, availability text, or content blocks, the selectors tend to be stable enough that it's not a real problem day-to-day. And if a filter stops matching entirely, the watch flags it with an error message rather than silently giving you empty diffs.
am i right that the narration is AI-generated? there are occasionally small pronunciation quirks, but overall the quality already sounds pretty good from what i tried.
one thing that would make this even more useful for learning (at least for me) is word-level explanations. for example clicking a word and seeing a simple definition in the same language (like german → german explanations in learner dictionaries), not just translation. that really helps build intuition.
being able to watch a specific element sounds way more useful in practice (price blocks, availability text, etc).
curious how fragile the element tracking is though. if the site slightly changes the DOM structure or class names, does the watcher usually survive or do you end up reselecting the element pretty often?
If the gain comes from giving the model another pass over its internal representation, I'd expect some sort of diminishing-returns curve as you add more repeats. But if those layers form a specific circuit, running it multiple times might actually break the computation.
It would be really interesting to see which of those regimes the model falls into.
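A toy illustration of the two regimes (nothing to do with actual transformer internals, just the math intuition): if the repeated block acts like a contraction toward a fixed point, each extra pass changes the output less; if it's a step that's only correct when applied exactly once, repeating it overshoots.

```python
def contraction_block(x):
    # Halves the distance to the fixed point 1.0 on each pass.
    return 1.0 + 0.5 * (x - 1.0)

def one_shot_block(x):
    # A step that is only correct if applied exactly once.
    return x + 1.0

# Diminishing-returns regime: deltas shrink geometrically.
x, deltas = 0.0, []
for _ in range(5):
    nx = contraction_block(x)
    deltas.append(abs(nx - x))
    x = nx
print(deltas)  # [0.5, 0.25, 0.125, 0.0625, 0.03125]

# Circuit regime: a second application breaks the intended result (1.0).
print(one_shot_block(one_shot_block(0.0)))  # 2.0, not 1.0
```

Measuring where a repeated layer block sits between those two behaviors would answer the question directly.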
the idea of learning languages from AI didn't quite sit right with me. but that might be something to circle back to.
integrating learner dictionaries does sound like a fantastic idea. will definitely explore that!
really nice project overall.