The desktop experience is the best way to use Aqua yet. It lets you talk easily into any app and comes with:
- Custom dictionary: can have up to 800 words/phrases at a time, no pronunciation tuning required.
- Context awareness: automatically identifies relevant words and phrases in the active application. This uses system accessibility APIs (not screenshots, as some other tools do), and the data is heavily processed on device before inference to preserve privacy.
- Command recognition: the system now shows you what it's going to do before it does it, for example "Deleting…", "Adding to list…", or "Fixing spelling…".
We've also spent a ton of time getting better out-of-the-box accuracy. Our core transcription engine is the most accurate real-time system that we know of. We scored 3.2% WER on LibriSpeech clean, significantly better than the next best real-time system we tested (Google) at 5.5%. We also released a benchmark that tests accuracy and human-friendly formatting, which showed that out of the box, Wispr Flow makes 10x as many mistakes as Aqua Voice for emails and technical writing. The full write-up, including audio and code, is available here: https://withaqua.com/blog/benchmark-nov-2024
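For anyone unfamiliar with the WER numbers above: word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the system's output, divided by the number of reference words. Here is a minimal sketch of that calculation. It is not the benchmark code from the write-up; the function name and the normalization (lowercasing and whitespace splitting only) are assumptions made for illustration, and real evaluations usually normalize punctuation and numbers more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: normalization here is just lowercasing and
    splitting on whitespace, which is an assumption, not the method
    used in any published benchmark.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("please send the report today",
                      "please send a report today"))  # 0.2
```

On this definition, a 3.2% WER corresponds to roughly 3 word errors per 100 reference words.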
Everyone uses dictation for different reasons. I started in sixth grade (dyslexic) with Dragon Professional and always wanted it to be more than a clunky substitute for the keyboard. Hopefully Aqua Desktop can be that for some of you.
Would love to hear your comments!
-Finn
1) What about languages other than English? Just English is supported for now.
2) Is this running locally? No.
3) Could you imagine offering a one-time payment option? Dragon is currently charging 999 € for their latest version, but I really like the fact that I actually own the software after paying once.
In any case, good luck with this!
Imagine a pen that only showed you words at the end of a paragraph. That'd be crazy! But this is how most people use voice today, because real time is so hard. Latency is 800ms-2000ms from speech.