Moscow’s street dogs are renowned for their intelligence. I have seen street dogs taking the escalators on the Metro. This dog worked out not just that the beeping + discomfort was worth the freedom, but also that he could wear out the battery faster by going up to the very edge of the fence - where the chirps became an uninterrupted beeeeep - and as soon as the beeping stopped, whoosh he was gone.
In practice, the downside of high variance is failure at basic things that a minimally competent human would almost never get wrong. In agents it's exacerbated by the compounding effect of repeated calls, but even for basic workflows it can be annoying.
This is a very interesting finding about how to improve capability.
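To put rough numbers on the compounding point, here's a minimal sketch. It assumes every call fails independently with the same probability, which is a simplification, and the 98% per-call figure is made up purely for illustration. The function name is mine, not from any library.

```python
def chain_success_rate(per_call_success: float, num_calls: int) -> float:
    """Chance that every call in a chain succeeds.

    Assumes independent, identically reliable calls (an illustrative
    simplification; real agent failures are often correlated).
    """
    if not 0.0 <= per_call_success <= 1.0:
        raise ValueError("per_call_success must be between 0 and 1")
    if num_calls < 0:
        raise ValueError("num_calls must be non-negative")
    return per_call_success ** num_calls


if __name__ == "__main__":
    # Hypothetical 98% per-call reliability across chains of growing length.
    for n in (1, 5, 20, 50):
        print(f"{n:>2} calls: {chain_success_rate(0.98, n):.1%} end-to-end")
```

Under those assumptions, a per-call rate that looks fine in isolation drops to roughly two in three by 20 calls, which is why the same variance hurts agents far more than single-shot workflows.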
I don't see reliability expressly addressed here, but my assumption is that these alloys will be less rather than more reliable - stronger, but more brittle, to extend the alloy metaphor.
Unfortunately, for many if not most B2B use cases, reliability is the primary constraint! Would love to see similar ideas in the reliability space.
The most difficult part of the process (not dealt with in this version of Passport Application, but maybe a future DLC pack?) was actually finding someone who could certify my evidence. You are meant to submit originals, but they keep the docs, including passports, for 3-6 months, which is a bit unrealistic if you are living abroad. I can't remember the exact rules, but it wasn't possible to use a US notary or a normal solicitor certification process; instead I needed to go to a council office.
After calling about 5 councils, all of whom disavowed any knowledge of the process or its requirements, I ended up finding someone at Islington Council who was delightfully helpful. But it was one of the more frustrating UK government interactions I've had.
I am very excited for the whole STT/TTS to go away and for us to have models that really "hear" exactly what you said.
Sometimes this is about accent but a lot of the time, the AI won't spot areas where you e.g. fudge a case ending or the stress on a word. Yes, you can get some of that pronunciation right by the AI repeating back with the correct stress or clear case, but you never really get the confidence that you would get from an actual human.
Another product suggestion - turn off transcription (at least for the tutor side of the conversation; I'd suggest both). Personally I find it distracting at best for languages I already speak well and a crutch for those I don't.
Finally, I find it really very hard to enjoy having a random conversation that's not very directed ("What interests you most about artificial intelligence?"). I'd suggest that there are ways of making it more goal focused without being explicitly gamified - maybe something like, here's a position and you have to persuade me (AI debate club!), or something that brings out an actual opinion or relates to a concrete experience ("what's your main goal in your job this year").
Overall though this is the first product I've seen in this space that I might actually use, so well done.
- That it's still way cheaper in most instances to book a return (especially where the "trip" straddles a weekend) rather than a one-way fare when travelling long haul - even if you just throw away the return flight.
- That you can sometimes get access to totally different inventory by booking a package including accommodation, even if that accommodation is one night in a shared dormitory in a hostel (which you just don't go to).
At least group discounts have a recognizable economic rationale. But in these examples you are getting a strict superset of the same SKU (OK, maybe the change rules might be a little tighter, but not in a way that's perceptible) for less money.