If I remember correctly, the original version of Wordle used a word list that was run past the creator's wife, who had learned English later in life. The result was a really accessible game: none of the words felt like ones you wouldn't know. It probably makes sense to reuse words rather than risk losing that accessibility.
(I kept a copy of the original Wordle, and it seems to have 2,315 words that are possible answers.)
It’s this. There are many five letter words that are not “wordley”. Words such as, idk, bokeh, are technically part of the lexicon but would never appear as a solution. The wordle bot will even tell you this if you guess them — “good guess, but unlikely to appear as a solution”. The crossword has a similar sort of unwritten rule, maybe not as strict, but really hard technical words seldom appear.
Yes, that's correct! Took her about a year off and on, he had made a little app for her to go through and categorize everything.
As an aside, for about $200, you can ask a true/false question of every word in the English language with a frontier LLM, and get mostly good answers. I make word games in my free time and was sort of shocked when I realized how cheap intelligence has been getting.
$200? Does this use reasoning? Does it involve forgetting to use KV caching?
This should cost well under $1. Process the prompt. Then, for each word, input that word and then the end of prompt token, get your one token of output (maybe two if your favorite model wants to start with a start-of-reply token), and that’s it.
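To make the gap between the two estimates concrete, here is a rough sketch that counts how many input tokens get processed with and without reusing the shared prompt. Everything in it is an assumption for illustration: the function names, the word count, the prompt length, the tokens-per-word figure, and the per-million-token price you would plug in. It counts compute, not what any particular provider would bill, and it ignores the one or two output tokens per word.

```python
def tokens_processed(n_words, prompt_tokens, tokens_per_word=2, reuse_prompt=True):
    """Input tokens the model must process to classify every word.

    All parameters are hypothetical placeholders, not measured values.
    """
    per_word = tokens_per_word + 1  # the word itself plus one end-of-prompt token
    if reuse_prompt:
        # The shared prompt is processed once and its cached state is reused.
        return prompt_tokens + n_words * per_word
    # Without caching, the full prompt is re-processed for every single word.
    return n_words * (prompt_tokens + per_word)


def cost_usd(tokens, usd_per_million_tokens):
    """Convert a token count to dollars at an assumed flat price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Illustrative numbers only: 500k words, a 200-token prompt.
with_cache = tokens_processed(500_000, 200)
without_cache = tokens_processed(500_000, 200, reuse_prompt=False)
print(with_cache, without_cache)
```

With these made-up inputs, re-processing the prompt for each word handles roughly 68 times as many tokens, which is the kind of gap that could separate the two estimates.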
Language or the way we use it is often used to exclude "undesired", so there is a point in using them. Not a very nice point, but a point nevertheless.
The original Wordle came with a pre-baked ordered list of 2315 "secret" words, off which the daily secret word was looked up (I think based on local time). The list was right there in the javascript code of the game (alongside the list of 12972 allowed guess words). It covered dates from 2021-06-19 to 2027-10-20.
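The date range can be checked with a few lines of Python. This is a minimal sketch assuming one word per day with no gaps, starting at index 0 on 2021-06-19; the constant and function names are mine, not taken from the game's code.

```python
from datetime import date, timedelta

FIRST_DAY = date(2021, 6, 19)   # start date quoted above
LIST_LENGTH = 2315              # size of the original secret word list


def last_covered_day(first_day, list_length):
    """Date on which the final entry of the list would be used."""
    return first_day + timedelta(days=list_length - 1)


def index_for(day, first_day=FIRST_DAY):
    """Position in the list for a given date (0-based, assumed lookup rule)."""
    return (day - first_day).days


print(last_covered_day(FIRST_DAY, LIST_LENGTH))  # 2027-10-20
```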
Then in January 2022, the NYT bought Wordle, and started tweaking both lists, first shrinking the secret word list to 2309 entries, but leaving the logic otherwise intact. Fast forward to today, I looked up the current code [1], and it seems that there are now 14855 allowed words. The first 12546 are ordered alphabetically (0: "aahed", 12545: "zymic"), and the next 2309 are not. This may suggest that the latter are the secret words, but the logic for picking them has changed: I found no obvious sequence, when compared to the last few days' secret words. So it's either a more complex sequence, or the secret word is picked server-side.
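The "sorted part, then unsorted part" observation is easy to test mechanically. A minimal sketch, using a toy list rather than the real one; `sorted_prefix_length` is a hypothetical helper, and the heuristic would misplace the boundary if the first unsorted word happened to sort after the last sorted one.

```python
def sorted_prefix_length(words):
    """Length of the longest prefix of `words` that is in non-decreasing order."""
    for i in range(1, len(words)):
        if words[i] < words[i - 1]:
            return i
    return len(words)


# Toy stand-in for the real list: alphabetical block first, then unordered words.
sample = ["aahed", "aalii", "zymic", "cigar", "rebut", "sissy"]
split = sorted_prefix_length(sample)
guess_only, candidates = sample[:split], sample[split:]
print(split, candidates)
```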
In any case, I guess they decided to re-shuffle the list now at day 1689 / 2309 in order to avoid giving particularly assiduous players an additional bit of information: they can exclude all previous secret words. (To be accurate, I think this would be 1.897 bits, but my information theory is rusty.)
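For what it's worth, the 1.897 figure checks out if you assume every unused word is equally likely and no word ever repeats: the information gained is the log of the ratio between the full list and what remains. A small sketch (the function name is mine):

```python
import math


def bits_gained(total_words, words_used):
    """Bits of information from knowing that used words cannot recur.

    Assumes a uniform pick over the remaining words.
    """
    remaining = total_words - words_used
    return math.log2(total_words / remaining)


print(round(bits_gained(2309, 1689), 3))  # 1.897
```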
1. Wordle's word list is going to be a lot more curated than TFA's word list because people want to guess words they use or have heard of, not "aahed".
2. Only a tiny group of people care to "card count" Wordle to rule out words that have already been played because they think that sort of min/maxing is fun. Most people don't even think about that, so whether Wordle reuses words every few years is trivial to them.
I will say that having used the same starter word the whole time that has not come up yet, it's a little disappointing that it may now take even longer to appear.
> Wordle's word list is going to be a lot more curated than TFA's word list because people want to guess words they use or have heard of, not "aahed"
The Times sure doesn't think that about the people who do Letter Boxed. One LB had "polymethylmethacrylate" in its dictionary.
I've saved the daily dictionaries since 2024-03-30 and that's the longest word out of the 93 393 total distinct words in the 674 dictionaries I've saved. They average 1199.47 words per dictionary.
They have some truly ridiculous words, such as "troughgeng". WTF is a troughgeng? Googling that gives a couple of pages in Chinese (or a similar looking language) and a Scottish dictionary entry for "Throu", which in one of the examples of "throu" as an adverb lists a bunch of phrases it is used in, including:
> (8) througang, throw-, throoging, trough-geng, -geong (Sh., Ork.), (i) a going over or through; a passage (I.Sc. 1972); specif. (ii) a narration, a recital (of a story); (iii) a full rotation of crops, a shift; (iv) a thoroughfare, lane, passageway, corridor open at either end (Sc. 1808 Jam.; Sh. 1908 Jak. (1928); Rxb. 1923 Watson W.-B.; Ork., w.Lth., wm.Sc. 1972). Also attrib.; (v) = (5); (vi) energy, drive (Bnff. 1866 Gregor D. Bnff. 192);
Ooh and aah aren't words, they're sounds (onomatopoeia). A sound is just a sequence of letters used for their phonological values.
You can spell the sound "ah" however you like: ah, ahh, aah, aahh, there's no wrong way to spell it.
If you write "the washing machine tringged when it finished", 'tring' is not a word, even though it follows the rules of English morphology. You could have written any sequence of letters that most faithfully reproduces the sound of the washing machine: katringged, say, or puh-tringged.
"Crisis" is a massively overblown word for this. And the "wordle community" is a drop in the bucket of regular players, and not remotely representative.
I did have a similar reaction personally to the "exciting news" framing but I'm not actually sure it's wrong. The original list of words was an excellent list, and it's been over 4 years.
It seems about right. They reshuffled the deck about three-quarters of the way through (1689 ÷ 2315 ≈ 73.0%). Blackjack shoes are typically shuffled around the same point. Different games, but similar considerations in this respect.
For my game redactle.net, I blacklist the Wikipedia article for 2 years. I figure there is a tradeoff between novelty and allowing the pool of articles to shrink. The Wikipedia vital level 4 category has 10k articles and probably half of them actually meet the criteria (length, number of languages etc) for making the cut.
As someone who recently built a daily word game[1], I 100% get it. I can say from first hand experience: there's an awful lot of words that are totally valid but not fun.
I spent approximately as much time on building the word list as I did developing the game. The author's technique of just grabbing a word list and spellchecking it is completely not sufficient, you will get so many weird unfamiliar words in there. In the end I was able to whittle down my list to about 24,000 using various automatic methods, but from that point I just had to do a manual review on the remaining list, which meant I got to see a lot of words, and many of them felt very obscure and/or not fun.
I am guessing a high percentage of Wordle players prefer a version that uses common words, and the New York Times would prefer to cater to them rather than to a smaller group of enthusiasts.
Not my experience at all.
Ask me how I know what an EPEE is
[1] https://www.nytimes.com/games-assets/v2/9003.896ec900f2a1ce8...
That isn't a correct diagnosis; people have heard of aahed. You'll find it naturally in the expression "[someone] oohed and aahed".
People don't want aahed, and their instinct that it shouldn't count is reasonable, but unfamiliarity isn't the problem with it.
Given that it is Wordle, “panic” would be a far more appropriate word.
1: shameless plug: https://wheybags.com/turntiles