Whisper tiny is multilingual (though I'm using the English-specific variant), and I believe Llama 3 is technically capable of multilingual use, but I'm not sure of any benchmarks.
I think it could be made better, but for now the focus is English. I'll add this to the README though. Thanks!
About a year ago, my family was really keen on getting an Alexa. I don't want Bezos spy devices in our home, so I convinced them to let me try making our own. I went with Mycroft on a Pi 4, and it did not go well. The wake word detection was inconsistent, the integrations were lacking, and I think the project had been effectively abandoned by that point. I'd intended to contribute to it and fix some of the integrations I was struggling with, but life intervened and I never got back to it. Thankfully, my family also forgot about the Alexa.
A question: does it run only on the Pi 5, or also on other (including non-Raspberry Pi) boards?
The only thing that might pose an issue is the total RAM needed for whatever LLM is responsible for responding to you, but there's a wide variety of models available on Ollama, Hugging Face, etc. that can work.
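For a rough sense of whether a given model fits, a common back-of-envelope estimate (my assumption, not anything from the project) is weights ≈ parameter count × bits-per-weight / 8, plus some overhead for the KV cache and runtime buffers:

```python
def estimate_ram_gb(params_billion: float, quant_bits: float,
                    overhead: float = 1.25) -> float:
    """Rough RAM estimate in GB for a quantized LLM.

    Rule of thumb only: 1B params at 8-bit quantization is ~1 GB of
    weights; the 1.25x overhead factor for KV cache and buffers is a
    guess and varies with context size.
    """
    weight_gb = params_billion * quant_bits / 8
    return weight_gb * overhead

# Llama 3 8B at 4-bit quantization:
print(round(estimate_ram_gb(8, 4), 1))  # 5.0 -> plausibly fits an 8 GB Pi 5
```

So an 8B model at 4-bit lands around 5 GB, which is why the 8 GB Pi 5 is borderline comfortable while smaller boards may need a smaller or more aggressively quantized model.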
I also tried using Vulkan, which is supposedly faster, but the times were a bit slower than plain CPU inference with llama.cpp.
Changing the transcription model to something a bit better, or moving the mic away from the fan, could help with this.
Additionally, since I'm streaming the LLM response, it doesn't take long to hear a reply. Because it speaks a chunk at a time, though, partial words occasionally get spoken. How long you have to wait also depends, of course, on which model you use and the context size.
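One way to fix the partial-word artifact (a sketch of the idea, not the project's actual code) is to buffer the streamed chunks and only hand complete words to the TTS engine, holding back anything after the last whitespace until more text arrives:

```python
def words_from_chunks(chunks):
    """Yield only complete words from a stream of text chunks.

    Text after the last whitespace in the buffer might be a partial
    word, so it's held back until the next chunk (or the end of the
    stream) completes it.
    """
    buf = ""
    for chunk in chunks:
        buf += chunk
        # Everything up to and including the last whitespace is safe to speak.
        cut = max(buf.rfind(" "), buf.rfind("\n"))
        if cut >= 0:
            yield buf[:cut + 1]
            buf = buf[cut + 1:]
    if buf:
        yield buf  # flush whatever remains once the stream ends

# Chunks that split words mid-token come out re-joined at word boundaries:
print("".join(words_from_chunks(["Hel", "lo wor", "ld, how ", "are you?"])))
# Hello world, how are you?
```

The trade-off is a tiny bit of extra latency (at most one word's worth of buffering) in exchange for never sending half a word to the speech synthesizer.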