I'm guessing this is a result of these two concepts being close to each other in vector space? Like a data-driven version of the "miserable failure" Google bombing: https://en.wikipedia.org/wiki/Google_bombing
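For anyone who hasn't poked at embeddings, here's a toy sketch of what "close in vector space" means. The vectors below are made up and have nothing to do with Apple's actual model; the point is just that nearest-neighbour lookups rank by co-occurrence in training text, not by how words sound:

    # Toy sketch only: hypothetical 4-d embeddings, not any real model's vectors.
    # Words that co-occur heavily in training data end up with similar vectors,
    # so a word's nearest neighbour can be something that sounds nothing like it.
    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    emb = {
        "racist":     np.array([0.9, 0.1, 0.3, 0.0]),
        "trump":      np.array([0.8, 0.2, 0.4, 0.1]),
        "mamaroneck": np.array([0.7, 0.3, 0.3, 0.2]),
        "bicycle":    np.array([0.0, 0.9, 0.1, 0.8]),
    }

    query = "racist"
    neighbours = sorted((w for w in emb if w != query),
                        key=lambda w: cosine(emb[query], emb[w]),
                        reverse=True)
    print(neighbours)  # ranked by embedding similarity, not by sound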
Apartheid protesters didn't try to endear themselves to anyone either. Why do you think a protest needs to entertain people and endear them to a cause? I'd venture you've never actively participated in such activism...
It changed last night. I reproduced it repeatedly[1], but then it stopped happening a bit later. At first I thought it was the on-device recognition, but the behaviour was identical both with and without a network connection.
This smells like an LLM trying to correct the output of a speech recognition system. I said the word “racist” repeatedly and got this unedited output. You could see the text change momentarily after the initial recognition result, and given that “Mamaroneck” sounds nothing like either of the other words, I’d bet this thing was trained on news stories:
That’s what I was referring to in the first sentence: you can see the raw text from the speech system change afterwards. Normally that’s things like punctuation and ambiguous words like their/they’re. That secondary process felt like a system which operates on text tokens, because “racist” and “Mamaroneck” don’t sound similar at all.
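A crude way to see the distinction: compare how far the substituted words are from “racist” as strings (a rough stand-in for how they sound) versus what a text-level model might consider a likely continuation. Edit distance is only a sketch, and this obviously isn't Apple's pipeline, but it makes the point:

    # Rough illustration, not the actual dictation pipeline: an acoustic/lexical
    # pass would favour candidates that sound (or are spelled) like the recognised
    # word, while a pass over text tokens scores candidates by context instead.
    def edit_distance(a: str, b: str) -> int:
        # Standard dynamic-programming Levenshtein distance.
        dp = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            prev, dp[0] = dp[0], i
            for j, cb in enumerate(b, 1):
                prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                         dp[j - 1] + 1,      # insertion
                                         prev + (ca != cb))  # substitution
        return dp[-1]

    for cand in ["racism", "rapist", "trump", "mamaroneck"]:
        print(cand, edit_distance("racist", cand))
    # "trump" and "mamaroneck" are nowhere near "racist" by this measure, which
    # is why a swap to them points at a stage working on text, not on sound.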
Silly, idiotic activism aside, it's concerning: if someone working at Apple managed to slip such a bold change into the OS, could a malicious group do the same?
There’s another angle about ML systems: suppose this is just an issue with a model placing two terms too close to each other. How would you prove it wasn’t malice, or offer assurances that something like it won’t happen again? A lot of our traditional practices around change management and testing are built on a different model.
1. https://news.ycombinator.com/item?id=43179712
“Racist, Trump, Mamaroneck racist Trump Mamaroneck, racist racist racist racist Trump Mamaroneck”