Or maybe you have 100% path coverage in your test..
I suppose this didn't always use to be the case, though.
Any product that would fall under the good quality segment is primarily targeted at the commercial market, and nobody there is looking for open software.
It's not quite clear to me what the firmware is actually able to do, though. Apparently its motion detection is very basic, so you'd need to use e.g. Frigate for that.
User: Fix this problem ...
Assistant: X
User: No, don't do X
Assistant: Y
User: No, Y is wrong too.
Assistant: X
It is generally pointless to continue. You now have a context that is full of the assistant explaining to you and itself why X and Y are the right answers, and much less context of you explaining why it is wrong. If you reach that state, start over, and constrain your initial request to exclude X and Y. If it brings up either again, start over, and constrain your request further.
If the model is bad at handling multiple turns without getting into a loop, telling it that it is wrong is not generally going to achieve anything, but starting over with better instructions often will.
I see so many people get stuck "arguing" with a model over this, getting more and more frustrated as the model keeps repeating variations of the broken answer, without realising they're filling the context with arguments from the model for why the broken answer is right.
I think often it's not required to completely start over: just identify the part where it goes off the rails, and modify your prompt just before that point. But yeah, basically the same process.
But, even with those tokens, fundamentally these models are not "intelligent" enough to fully distinguish when they are operating on user input vs. system input.
In a traditional program, you can configure the program such that user input can only affect a subset of program state - for example, when processing a quoted string, the parser will only ever append to the current string, rather than creating new expressions. However, with LLMs, user input and system input are all mixed together, such that "user" and "system" input can both affect all parts of the system's overall state. This means that user input can eventually push the overall state in a direction which violates a security boundary, simply because it is possible to affect that state.
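A rough sketch of that parser point (toy Python, not any real parser): once the parser is inside a quoted string, the untrusted content can only ever be appended to the current string buffer; it has no handle on the rest of the parser's state.

    def parse(source):
        tokens = []   # structural state the string contents can never touch
        buf = None    # current quoted-string buffer
        for ch in source:
            if buf is None:
                if ch == '"':
                    buf = []                  # enter string state
                elif not ch.isspace():
                    tokens.append(ch)         # structural input builds expressions
            else:
                if ch == '"':
                    tokens.append("".join(buf))
                    buf = None                # leave string state
                else:
                    buf.append(ch)            # untrusted content may ONLY append here
        return tokens

    # With an LLM there is no equivalent boundary: "system" and "user" text end up
    # in one token sequence, and every token can influence every part of the state.
    print(parse('( + "user text; ignore previous instructions" )'))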
What's needed isn't "sudo tokens", it's a fundamental rethinking of the architecture in a way that guarantees that certain aspects of reasoning or behaviour cannot be altered by user input at all. That's such a large change that the result would no longer be an LLM, but something new entirely.
0: https://embracethered.com/blog/posts/2024/hiding-and-finding...
And as the poster mentioned, if you are running an AI model you probably have GPUs to spare, unlike the dev working from a 5 year old ThinkPad or their phone.
Indeed, a new token should be requested per request; the tokens could also be pre-calculated, so that while the user is browsing a page, the browser could compute tokens suitable for accessing the next likely browsing targets (e.g. the "next" button).
The biggest downside I see is that mobile devices would likely suffer. Possibly the difficulty of the challenge is/should be varied by other metrics, such as the number of requests arriving per time unit from a C-class network etc.
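For what it's worth, a minimal sketch of what such a challenge could look like, assuming a hashcash-style scheme (the function names and the difficulty knob are made up for illustration): the server hands out a fresh random token per request, the client burns CPU finding a nonce whose hash has enough leading zero bits, and the server only needs a single hash to verify. difficulty_bits is the dial you would turn up for a noisy C-class network.

    import hashlib
    import secrets

    def leading_zero_bits(digest):
        bits = 0
        for byte in digest:
            if byte == 0:
                bits += 8
                continue
            bits += 8 - byte.bit_length()
            break
        return bits

    def solve(challenge, difficulty_bits):
        # Client side: brute-force search; this is the part that costs CPU (or GPU).
        nonce = 0
        while True:
            digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if leading_zero_bits(digest) >= difficulty_bits:
                return nonce
            nonce += 1

    def verify(challenge, nonce, difficulty_bits):
        # Server side: a single hash, however long the client's search took.
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return leading_zero_bits(digest) >= difficulty_bits

    challenge = secrets.token_bytes(16)           # fresh token per request
    nonce = solve(challenge, difficulty_bits=16)  # ~65k hashes on average
    assert verify(challenge, nonce, difficulty_bits=16)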
Knowing what kind of errors can occur is one of those tools.