Is there a reason why Apple's iPhone spellcheck is often really poor, significantly worse than both LLMs and just...human eyes?
I often find myself butchering the spelling of a word in a way where the correct answer is obvious to human eyes (probably because of "typoglycemia" [1]) and an AI LLM immediately understands what I meant to say, but Apple's spellcheck has "No Guesses Found."
Yes but it’s much broader. Just in general the lack of Steve Jobs noticing these glaring issues and coming down hard to solve them is pretty clear.
I remember when MacBooks briefly came out with a ridiculously bright standby LED that required black electrical tape over it if you wanted to sleep with it in the house. Shortly after, no more status LEDs on any MacBook (thank you!).
Nowadays I find non-stop little annoyances, with threads from others on the same issues on Apple devices. From.the.overly.prominent.full.stop when searching textually in the URL bar to the crappy spell check and crappy spam filtering. As much as Jobs apparently came across as an asshole, there’s a need for someone at the top to say ‘WTF is this, fix it or get fired!’.
I worked at Apple and heard a lot of Steve stories. He really did personally approve everything. He would be sitting in a room, and team leads would all line up to give their quick 2-minute update. So it's the MacBook Air guy's turn. He comes in and places his prototype down in front of Steve. Steve opens the lid. Two seconds later he picks up the laptop and heaves it so hard it skipped across the table like a stone on water: "I said fxxking INSTANT ON!!" The poor guy collected his prototype and exited the room. Later the MacBook Air launched... it fxxking turned on the moment you open the lid
I've also found a lot of this stuff is due to naysayers telling people that things can't be fixed (because really they don't want to bother). You need a strong leader to say "no it can and we will".
It's hard to pinpoint exactly what it is, but yes, there seems to be an increasing number of small issues with Apple devices. It's not major stuff simply not working, but the spam filter is pretty terrible, and text overlaps on non-flagship phones (e.g. the iPhone SE). All sorts of minor annoyances.
Current MBPs have bright green/orange charging lights on both sides of the MagSafe connector. They're bright enough that I have to block them when I'm in a hotel and my laptop is in the same room.
That light was really helpful to do the occasional late night visit to the bathroom in a home where I lived where there were no light controls at the bedside.
There has to be something going on with iOS Safari and the keyboard because my typing goes to complete shit in ways it never does in any other application.
Here are some random examples I thought of for this comment. Notice how everything is spelled wrong as though the screen input doesn’t match the location of the buttons.
> I remember when macbooks briefly came out with a ridiculously bright standby led that required Black electrical tape over if you wanted to sleep with it in the house. Shortly after no more status leds on any MacBook (thank you!).
The lack of status LEDs is actually the only thing I really REALLY hate about MacBooks!
Too often I have been bitten by the thing not properly going to sleep because SOMETHING keeps a wake lock (and of course macOS doesn't indicate this anywhere outside of Energy Monitor, nested in System Activity) and overheating in my bag as a result. A simple LED would have been a good visual indicator that it is still awake.
There's nothing more frustrating than when you type the word you want to type, it changes it to a different word, you delete it and type the word you wanted to type again and then rinse/repeat 3 to 4 times before you have the word you actually wanted.
And if you're not paying attention, your message ends up looking like you're having a stroke.
It used to be that if you typed, deleted the correction, and retyped, that spelling would now be the preferred and you wouldn’t have to play that game anymore. Apple broke that years ago.
> they focus too much on the first letter of the word
They also do that in Apple Notes. On the iPad the search can only match word prefixes. So if you type "oo" and the entire note consists of just the word "foo", it will find nothing. This doesn't even require fuzzy search, yet they couldn't be bothered while solving the much more difficult handwriting recognition problem.
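To make the difference concrete, here is a minimal sketch of prefix-only matching versus plain substring matching. The function names and the sample notes are made up for illustration; this is not how Apple Notes is actually implemented, just the behavior described above.

```python
def prefix_search(notes, query):
    """Return notes that contain a word starting with the query (prefix-only)."""
    q = query.lower()
    return [n for n in notes if any(w.startswith(q) for w in n.lower().split())]


def substring_search(notes, query):
    """Return notes that contain the query anywhere in their text."""
    q = query.lower()
    return [n for n in notes if q in n.lower()]


# Hypothetical notes, for illustration only.
notes = ["foo", "food for thought", "oolong tea"]

print(prefix_search(notes, "oo"))     # ['oolong tea']
print(substring_search(notes, "oo"))  # ['foo', 'food for thought', 'oolong tea']
```

Neither version needs fuzzy matching; the second is just a containment test.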
Also the iPhone's Settings app still doesn't have all settings in the search index. So it's impossible to find the section "headphone safety" & "reduce loud audio" using words like "headphone", "audio" or "safety". This setting was introduced five years ago, by the way.
> they really don’t want you saying bad words of any kind.
Not true anymore, I just typed fuck in this comment without having to fight it. They made a change I think last year and they even announced it.
> they do not look at context at all
Also not true. It's true that they're not perfect at it, but replacements after you've typed 2 more words happen specifically because it can tell better what you want to say. Sometimes that works against you because language is highly personal.
Here are some nice examples (excluding obvious edit distance based ones which it does right)
"snowbalfight" --> "snowball fight"
"unrelevant" --> "irrelevant"
"fone" --> "phone"
"the the" --> "The"
And all of this with auto capitalization if it notices you're at the start of a sentence, and stuff like handling proper nouns, punctuation, etc.
What I find really interesting is swipe-type spell checking (it's basically word prediction) on phones. That is a really cool problem to solve well. Sometimes it works like a dream and other times it's annoying. I wonder how they write those.
Bit off-topic: macOS has an excellent built-in dictionary. Just select the word in any app, press Ctrl+Command+D and it opens. It even guesses most misspelled words correctly. Translation is also available if it exists for the current keyboard locales.
E.g.
> No entries for "typoglycemia", did you mean "hypoglycemia"?
These user activated dictionaries tend to be excellent (even in vim, a pretty barebones system, I tend to get fantastic guesses from the machine).
Actually, come to think of it, the problem must be a bit easier than on smartphones, right? Real keyboard input is very precise. Smartphone keyboards already guess what word you were trying to spell, so they are influencing the typos in the direction of likely words… cannibalizing the very guess list that the dictionary uses!
Alfred ties into it nicely too, you can type `spell someword` and the completions below have the various spellings of words, fuzzy matched. Select one and the word goes onto your clipboard
That's great. I usually use the context menu on macOS and the "Define" option on long press on iOS.
That said, trying to use long press on iOS (or whatever it actually is), is one of those places that often drives me nuts. I don't know if the issue is a specific app or the OS or what but sometimes I want the popup menu to appear and I can't get it to appear. Or I do something to make it appear but it doesn't appear for x hundred milliseconds, during which I think it didn't get my gesture so I start a new one, just as it's finally responding in which case my new gesture dismisses it. Repeat 3-4 times before I'm ready to tear my hair out
It also shows why canvas based websites suck. Open Google Docs, select a word, press Cmd-Ctrl-D, ... nothing. Try it in gmail (which is not canvas based) and it works.
Yes. I just typed in "Tipografical earer" - and iOS 18.6 suggested "Tipograxical" for the first word, and one of "eared", "eager", and "eater" for the second word.
The spell check is truly bad. It boggles the mind how this is even possible given how solved the problem is everywhere else. Also the period being to the right of the spacebar such that it gets hit instead of space. So annoying!
I feel the same way about Android's. It just seems like spell check used to be so much better ten years ago. But I'm not sure whether I'm comparing mobile with desktop expectations. It really seems extremely dumb on Android.
When I used Windows Phone 8.1 I felt like I was typing text twice as fast as on Android. Better suggestions, more accurate keyboard inputs on the same screen size, and selecting an entire word was just a single tap which made fixing a typo very quick as well. Meanwhile back then it was impossible to make certain text selections without a bluetooth keyboard because of how Android constantly tried "fixing" touch-based selections. It's sad that Microsoft shut down the only system & UI that felt like the developers were actually thinking of the user when designing it. To this day no other mobile OS is as friendly to left-handed users.
"Hypo", meaning low; "glyc-", meaning sugar; and "emia", meaning of the blood. "Low sugar of the blood". (With apologies to chubbyemu.)
Since "typo" comes from "typography", it roughly means "symbolic". So "typoglycemia" should mean "symbolic sugar of the blood". Low typos in your blood would be "hypotypemia".
I have no idea why "typoglycemia" refers to a human ability to autocorrect, but it brings me joy, so I'm not going to question it ^_^
I mean, TBH I would expect this to be true: an LLM is trained over a massive corpus of internet data, which contains many typos, and is required to accurately predict tokens despite edit errors. A spellchecker is typically running a deterministic algorithm really, really quickly, and has hardcoded limits on acceptable edit distance (and has no learned knowledge of what looks correct/incorrect to human eyes). An LLM should generally trounce a spellchecker at figuring out what you meant to type, unless the spellchecker is secretly a tiny LLM / ML model of some kind under the hood.
I have definitely noticed this too. I also use the built in swipe to type feature, and it may as well be a coin flip as to whether it gets the word right. I get that swiping is vague, but even a little bit of frequency prediction would tell you that “sounds good” is going to be more likely than “sings hood”. It’s an absolutely infuriating feature.
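For what "a little bit of frequency prediction" could look like, here is a rough sketch that ranks candidate phrases with smoothed bigram counts. The counts are invented for the example, and `phrase_score` is a hypothetical helper; a real keyboard would learn these numbers from a large corpus and combine them with the swipe geometry.

```python
# Invented counts, for illustration only.
UNIGRAMS = {"sounds": 900, "sings": 120, "good": 5000, "hood": 150}
BIGRAMS = {("sounds", "good"): 400, ("sings", "hood"): 0}


def phrase_score(words, alpha=1.0):
    """Score a phrase as first-word count times add-alpha smoothed bigram probabilities."""
    vocab = len(UNIGRAMS)
    score = UNIGRAMS.get(words[0], 0) + alpha
    for prev, cur in zip(words, words[1:]):
        pair_count = BIGRAMS.get((prev, cur), 0) + alpha
        prev_count = UNIGRAMS.get(prev, 0) + alpha * vocab
        score *= pair_count / prev_count
    return score


# Two phrases a vague swipe might plausibly decode to.
candidates = [["sings", "hood"], ["sounds", "good"]]
best = max(candidates, key=phrase_score)
print(" ".join(best))  # sounds good
```

Even this crude scoring prefers the common phrase by a wide margin.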
I use the swipe feature because I guess I have wide fingertips and frequently hit unintended, adjacent keys when pecking on the keyboard (especially as I’ve gotten older). The words produced by swiping often make no grammatical sense, and are frequently esoteric words that I just can’t believe rank high enough on a basic frequency list to suggest. Not to mention my own vocabulary, which apparently is not considered by the keyboard at all.
I had a way better experience using SwiftKey on my android phone 15 years ago.
It is 2025 and the best spell checker is a search engine. Numerous times an application will not provide the correct word. The only solution is to try the word in a search engine, and to try using it in a sentence if that fails.
In my opinion, this is where a local ML/AI model, no internet required, would be the most beneficial today.
Even had to use a search engine with, "thoughts and opi" because I forgot how to spell opinion before posting this. In application spell checker was 100% useless with assisting me.
Instead of how LLMs operate by taking the current text and taking the most likely next token, you take your full text and use an LLM to find the likeliness/rank of each token. I'd imagine this creates a heatmap that shows which parts are the most 'surprising'.
You wouldn't catch all misspellings, but it could be very useful information to find what flows and what doesn't - or perhaps explicitly go looking for something out of the norm to capture attention.
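A rough sketch of the heatmap idea, with one big caveat: to keep it self-contained it uses a tiny bigram model as a stand-in for an LLM. The training text, the smoothing constant, and the function names are all assumptions for illustration. With a real LLM you would read each token's probability from a single forward pass instead.

```python
import math
from collections import Counter

# Tiny stand-in "training corpus"; a real system would use an LLM's token probabilities.
corpus = (
    "i got a letter from my friend . the letter came from home . "
    "please fill in the form ."
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab = len(unigrams)


def surprisal(prev, cur, alpha=0.1):
    """Bits of surprise for `cur` following `prev`, with add-alpha smoothing."""
    p = (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)
    return -math.log2(p)


def heatmap(text):
    """Return (word, surprisal) pairs for every word after the first."""
    words = text.lower().split()
    return [(cur, surprisal(prev, cur)) for prev, cur in zip(words, words[1:])]


# "form" is a valid word, but it is surprising in this position.
for word, bits in heatmap("a letter form my friend"):
    print(f"{word:8s} {bits:5.2f} {'#' * int(bits)}")
```

The longest bars mark the places where the text departs from what the model expected, which is where a real-word typo tends to show up.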
I would like this too. This approach would also fix the most common failure mode of spelling checkers: typos that are accidentally valid words.
I constantly type "form" instead of "from" for example and spelling checkers don't help at all. Even a simple LLM could easily notice out of place words like that. And LLMs also could easily go further and do grammar and style checking.
I've seen this in a UI. They went a step further and you could select a word (well token but anyway) and "regenerate" from that point by selecting another word from the token distribution. Pretty neat. Had the heatmaps that you mentioned, based on probabilities returned by the LLM.
This should also be pretty cheap (just one pass through the LLM).
Can confirm. The first time I saw an automatic spellchecker was probably with WordStar around 1989, and it blew me away. How can the computer know all the words? That's insane! Sounds lame, but it's true. It was a different world.
Since you're old enough, here's a question for you. Do you remember if at the time the first spellcheckers were invented, people were negative on spellcheckers, because that would mean that soon people would stop learning how to spell and just general dumbing down?
It seems that anything that helps people gets this reaction these days. On the one hand, the argument 100% resonates with me. On the other hand, spelling isn't really the end, is it? It's just a means to an end, so what's wrong with making the means easier? Did people worry that you'd stop knowing how to plant potatoes when trading was invented? EDIT: The example doesn't make sense because agriculture is newer than trading, but you get the idea.
Not as much with spellcheckers because even when they started to get popular, it was apparent that many people cannot spell English. So it was very natural.
People pushed back on the grammar checks when they landed in Word.
Before that, people pushed back on calculators in secondary schools. This was a huge point of contention in all classes except trigonometry, and calculators were definitely not allowed on the SAT/ACT.
I remember hearing this as late as the early 00s. I'd buy electronic dictionaries and spell checkers at yard sales and things like that, and use them in class. Multiple teachers were disapproving of it, despite it basically just being a paper book dictionary in a small, TI-92 shaped device. 10 year old me never saw how flipping through some obnoxiously heavy book in the back of the classroom was better than just punching in a few letters, hitting the "show definition", and ensuring I was spelling and using "curmudgeonly" properly.
Same went for using MacWord vs AppleWorks. MacWord had a built in dictionary, AppleWorks didn't.
I think this happens every time something gets automated away, and in a way it's true. I'm sure a lot more accountants knew 123x27 by heart before than they do now. The problem is LLMs take out the whole process of thinking, and that is going to be a problem: you generally need to think even when you're not in front of a screen.
> because that would mean that soon people would stop learning how to spell and just general dumbing down?
I'd argue that the negative people were correct. People can't spell anymore, not even with a spellchecker. Maybe they never could? I'm not against spellcheckers, I think they are amazing, but they haven't helped much.
It was the opposite experience for me. Before spellcheck was commonly part of the web browser, I would go back and reread some very early emails and/or usenet posts from myself. And realize how atrocious my spelling was.
I actually consider spellcheck to have improved my spelling dramatically over the years. The little red squiggles under words have helped me to recognize my misspellings, especially the words that are hard for me to get right consistently.
I don't remember any particular negative reaction to spell-checkers like the 'calculator panic'.
Perhaps partly because most schoolkids then wouldn't have been using word processors as their main writing tool at school and people using them in a corporate environment were pleased not to make embarrassing errors in their emails.
As I recall, there was some, but not a lot of FUD around spellcheckers, mostly because personal computers were still relatively new. Most GenX parents (Boomers) didn't even know what personal computers were yet, so they didn't know enough to be concerned. (I grew up in Missouri, which was the Digital Stone Age back then.) At the time, I think their complaining was more focused on MTV and video games.
However, also sounds weird, but I recall myself and some of my peers questioning spellcheckers, "Why do I need this?", because spelling was a primary mission of our education. We were all raised constantly being tested on spelling. In fact, I think I disabled the spellchecker on my old-ass 286 because it caused delays in the overall experience.
I had the same thing when the Encarta CDs started to include pronunciation tests. You'd get a word, speak it in the microphone, and get a "score" on how well you pronounced that word. Knowing what I know now, it was probably pretty inaccurate and hand wavy, but in the early 90s that was an absolutely amazing experience for an ESL person.
Having a dictionary is a prerequisite, but it is only a small part of the spell check problem. Plus, plain-text word lists were slow to parse in the 80s; you were better off with a trie or some other tree structure that is naturally compressed and whose lookup cost grows with the length of the word rather than O(n) with the size of the list.
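For anyone who hasn't built one, here is a minimal trie sketch. The class names and the word list are made up, and real spellcheckers of that era used far more compact encodings. The point is only that a lookup walks one node per letter, so its cost depends on the word's length and not on how many words are stored.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # maps a letter to the next node
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            # Shared prefixes reuse the same nodes, which is where the compression comes from.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word


trie = Trie(["can", "cane", "candle", "cat"])
print("can" in trie, "cand" in trie, "dog" in trie)  # True False False
```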
The computer has to figure out whether the word is in the dictionary, but it also has to figure out a suggestion for what to change it to.
And even after just that, we already have a bug- homonym mistakes- homonyms are in the dictionary but they’re misspelled (that was intentional btw).
How misspelled is another problem. We’ve had Levenshtein et al algorithms for a long time, but how different can you get? A really badly misspelled word might not have any good replacement candidates within your edit distance limit.
There are also optimizations like frequently mistyped words (acn-> can), acronyms, etc.
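A small sketch of those pieces put together: dictionary check, a table of frequent typos, and Levenshtein distance with a hard cutoff. The word list, the typo table, and the `suggest` function are all toy assumptions, not any real spellchecker's internals.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute (free if equal)
        prev = cur
    return prev[-1]


# Toy data, for illustration only.
WORDS = ["can", "cane", "phone", "the", "typographical"]
COMMON_TYPOS = {"acn": "can", "teh": "the"}


def suggest(word, max_distance=2):
    """Return suggestions: the word itself, a known-typo fix, or near neighbours."""
    if word in WORDS:
        return [word]
    if word in COMMON_TYPOS:
        return [COMMON_TYPOS[word]]
    scored = sorted((levenshtein(word, w), w) for w in WORDS)
    return [w for d, w in scored if d <= max_distance]


print(suggest("acn"))                          # ['can']
print(suggest("tipografical"))                 # [] because the fix is 3 edits away
print(suggest("tipografical", max_distance=3)) # ['typographical']
```

The second call is the "No Guesses Found" case: the right answer exists, it just sits one edit beyond the cutoff.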
"A Spellchecker Used to Be a Major Feat of Software Engineering"
It still is. The spell checker on my Android phone is a PIA. It's too dumb to correct many typos, and there's no way of highlighting wrongly used but correct words such as 'fro' and 'for', etc. There's no automatic or user-defined substitution, such as correcting 'rhe' with 'the' while keeping the words highlighted until a final revision.
Wordpossessor spellers have no way of tagging certain words that one may or may not wish to use depending on context. A classic example that's caught me out past the draft and found its way into the final document without me noticing it is 'pubic' for 'public'. Why doesn't my speller highlight such words in red and ask whether I actually meant to use this word?
Moreover, spellers are not all of the same level of accuracy. For example, Microsoft Word's speller is much better than LibreOffice's, much to my annoyance, as LibreOffice is my main (preferred) WP.
Nor is there a method of collecting misspelled words or typos and tagging them as spelling errors or typos for the purpose of helping one's spelling or typing. It'd be nice to have a list of my misspelled words together with their correct spelling, that way I could become a better speller. Also, spellers could be integrated with full dictionaries—highlight the word and press F1 for its meaning, etc.
There are no dictionary formats that are both universal and smart, that is that would allow for easy amalgamation between dictionaries and yet could contain user defined words and other user metadata which would be distinguished from the general corpus of words when crossed or amalgamated. For example, a smart dictionary format could contain metadata that would allow a dictionary and thesaurus to coexist in the same word list, similarly so different dictionaries, technical, medical etc.
All up, spellcheckers are still a damn mess. They need urgent attention.
The way growth in memory availability changes the scope of problems is really quite astonishing. I cut my teeth writing code for Apple ][ computers with theoretically up to 128K of RAM, but in practice much closer to 40K for most use cases, but it does make me much more conscious of memory and CPU usage than younger devs who never faced these sorts of constraints.
Thinking of the example given about being able to just load the word list into memory, I did something of that ilk when my son’s fifth grade class read a book which had a concept of dollar words: You assign a value to each letter, a=1, b=2, … z=26, add up the value and try to get exactly 100. It was pretty trivial to write a program that read the word list and produced the complete list of dollar words (although I didn’t share that with my son, I did give him access to the word list and challenged him to write the program himself).
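For the curious, the dollar-words program is about this small. This is a sketch rather than the program described above, and the short word list is a stand-in; in practice you would read the words from whatever word list file you have.

```python
def word_value(word):
    """Sum letter values with a=1, b=2, ..., z=26, ignoring anything that isn't a-z."""
    return sum(ord(c) - ord("a") + 1 for c in word.lower() if "a" <= c <= "z")


def dollar_words(words):
    """Return the words whose letter values add up to exactly 100."""
    return [w for w in words if word_value(w) == 100]


# Stand-in word list, for illustration only.
words = ["excellent", "spelling", "turkey", "attitude", "checker"]
print(dollar_words(words))  # ['excellent', 'turkey', 'attitude']
```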
At the moment, I’m building up a Spanish rhyming dictionary by using a Spanish word list, reversing the words and sorting the reversed list to find the groups of words that are most likely to rhyme, which was something that 30 years ago would have been a challenge on my desktop computer but now is a brief script that I’m just as likely to manage through perl 1-liners and shell pipes as not.
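The reverse-and-sort trick can be sketched in a few lines. Assumptions here: a tiny hand-picked word list, and "shares the last three letters" as a crude proxy for rhyming. Real Spanish rhyme depends on the stressed vowel, so treat the groups as candidates only.

```python
from itertools import groupby


def rhyme_groups(words, suffix_len=3):
    """Sort words by their reversed spelling, then group neighbours sharing an ending."""
    by_ending = sorted(words, key=lambda w: w[::-1])
    groups = []
    for _, group in groupby(by_ending, key=lambda w: w[-suffix_len:]):
        members = list(group)
        if len(members) > 1:  # a word with no partner is not a rhyme group
            groups.append(members)
    return groups


# Tiny stand-in word list, for illustration only.
words = ["camino", "destino", "vecino", "casa", "masa", "taza", "gato", "pato"]
for group in rhyme_groups(words):
    print(group)
```

Sorting by the reversed spelling puts words with the same ending next to each other, which is why a single pass with `groupby` is enough.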
As a copyeditor/proofreader, I've had to fix low-quality (i.e., wrong) suggestions a great many times over the years. ("he had a small plague on his desk" remains a favorite.)
I have a spelling checker
It came with my PC
It highlights for my review
Mistakes I cannot sea.
I ran this poem thru it
I'm sure your pleased to no
Its letter perfect in it's weigh
My checker told me sew.
I often find myself butchering the spelling of a word in a way where the correct answer is obvious to human eyes (probably because of "typoglycemia" [1]) and an AI LLM immediately understands what I meant to say, but Apple's spellcheck has "No Guesses Found."
Does anyone else have this experience?
1. https://www.dictionary.com/e/typoglycemia/
I remember when macbooks briefly came out with a ridiculously bright standby led that required Black electrical tape over if you wanted to sleep with it in the house. Shortly after no more status leds on any MacBook (thank you!).
Nowadays I find non-stop little annoyances, with threads from others on the same issues on Apple devices. From.the.overly.prominent.full.stop when searching textually in the URL bar to the crappy spell check and crappy spam filtering. As much as Jobs apparently came across as an asshole, there’s a need for someone at the top to say ‘WTF is this, fix it or get fired!’.
One of the most aggravating things in iOS. Trips me up almost every day (and it's been there for what? 10 years now?)
Here are some random examples I thought of for this comment. Notice how everything is spelled wrong as though the screen input doesn’t match the location of the buttons.
- tomoroww eather in united.kingdom
- lookip exhange rate
- devopper news
- download twotter.video
- they really don’t want you saying bad words of any kind.
- they do not look at context at all
- they focus too much on the first letter of the word for suggestions
It's somewhat funny that human performance is seen as a baseline here, and not the pinnacle of achievement to aim for.
(I agree with you. I just find it entertaining.)
Yes: Apple doesn't care.
> Does anyone else...
Yes. I just typed in "Tipografical earer" - and iOS 18.6 suggested "Tipograxical" for the first word, and one of "eared", "eager", and "eater" for the second word.
https://www.washingtonpost.com/archive/opinions/1985/06/02/t...
2023 (314 points, 180 comments) https://news.ycombinator.com/item?id=34971924
2020 (363 points, 143 comments) https://news.ycombinator.com/item?id=25296900
2012 (94+156 points, 70+61 comments) https://news.ycombinator.com/item?id=4640658 https://news.ycombinator.com/item?id=3466388
A spellchecker used to be a major feat of software engineering (2008) - https://news.ycombinator.com/item?id=34971924 - Feb 2023 (180 comments)
A spellchecker used to be a major feat of software engineering (2008) - https://news.ycombinator.com/item?id=25296900 - Dec 2020 (143 comments)
A Spellchecker Used to Be a Major Feat of Software Engineering (2008) - https://news.ycombinator.com/item?id=10789019 - Dec 2015 (29 comments)
A Spellchecker Used to Be a Major Feat of Software Engineering - https://news.ycombinator.com/item?id=4640658 - Oct 2012 (70 comments)
A Spellchecker Used To Be A Major Feat of Software Engineering - https://news.ycombinator.com/item?id=3466388 - Jan 2012 (61 comments)
A Spellchecker Used to Be a Major Feat of Software Engineering - https://news.ycombinator.com/item?id=212221 - June 2008 (22 comments)
The computer has to figure out whether the word is in the dictionary, but it also has to figure out a suggestion for what to change it to.
And even after just that, we already have a bug- homonym mistakes- homonyms are in the dictionary but they’re misspelled (that was intentional btw).
How misspelled is another problem. We’ve had Levenshtein et al algorithms for a long time, but how different can you get? A really badly misspelled word might not have any good replacement candidates within your edit distance limit.
There are also optimizations like frequently mistyped words (acn-> can), acronyms, etc.
It was never just about size.