voidUpdate · 15 days ago
> “Boiling water” isn’t “water that happens to be boiling.” It’s a hazard, a cooking stage, a state of matter

I guess we'll have to disagree then, because "boiling water" is "water that's boiling" to me. It's not a different state of matter to "water"; that would be "steam". It being a hazard doesn't mean it's a singular concept, same as "wet floor".

kdheiwns · 15 days ago
Yeah, if "boiling water" is one word, what about boiling sugar? Boiling milk? Boiling volcano? Boiling soup?

Adding two words together creates a new and different concept. The permutations necessary to represent every concept ever formed by combining two or more different words would be endless.

Some of them on the list, like black hole, do make sense. That's a very distinct thing. It's not a hole in the conventional sense and it's not really black. Boiling water, though, is water. And it's boiling.

vidarh · 15 days ago
[To be clear, the below is me agreeing with you]

Norwegian is almost as compound-happy as German, and we could've filled many volumes with compounds. But for a compound to enter the dictionary, it generally needs to have a meaning that is non-obvious from the individual parts, at least to some people, and typically a non-obvious meaning if interpreted as two separate words.

E.g. "akterutseilt" is an example. "Akterut" means behind, aft. "Seilt" means sailed. "Behind sailed" helps as a way to remember it, but it's not obvious whether it's strictly a sailing term, or means that you've been left behind or have left someone else behind.

In this case if you say someone has been akterutseilt, it means they've been metaphorically left behind, often by their own failure to keep up.

Those kinds of compounds deserve dictionary entries whether they are actually written in two words or one, because they function as a single unit however it is written.

I think black hole is a perfect example in English. And in fact, this is a compound that is written in two words in Norwegian as well, but is in Norwegian dictionaries despite that[1] as "svart hull".

[1] https://ordbokene.no/bm/svart%20hull

ben_w · 15 days ago
> Adding two words together creates a new and different concept. The permutations necessary to represent every concept ever formed by combining two or more different words would be endless.

May I introduce you to the German language?

We have "Gesundheitszeugnis" (health certificate) and "bärenstark" (strong as a bear), and of course "[der] Donaudampfschifffahrtsgesellschaftskapitän" ([the] Danube Steamship Navigation Company Captain) and "[Das] Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz" ([the] cattle marking and beef labeling supervision duties delegation law).

Beijinger · 15 days ago
Boiling water is not a word. The phrase contains two words. German has no single word for "boiling water" either; it also uses two words, an adjective and a noun. But the German language has the principle of compound words, and as a consequence there is an unlimited number of German words.

"Hackernewsleser" is a word I just made up, but every German can understand it: a reader of Hackernews. Obviously this makes a dictionary tricky, and it was a big problem for spell checking in early versions of MS Word.

seanhunter · 14 days ago
Agree. “boiling water” is such a staggeringly terrible example for TFA to have opened with.

“Honey, I’ve overheated the fondue! The problem is I can’t describe the liquid because English completely lacks any word that might be apposite in this situation other than the newly-minted ‘boiling water’.”

“It’s a problem. Maybe you could call it ‘boiling water that happens to be quite cheesy’. It’s not great, but it’s the best we can do.”

traveler1 · 15 days ago
Boiling point?
RHSeeger · 15 days ago
To me it boils down to (pun intended)

> Traditional dictionaries skip almost all such phrases, because they contain spaces.

Yes, because they're phrases, not words. I don't even understand what's surprising about this. Sure, the entire article talks about how dictionaries contain _some_ phrases; but it's clear it's not many of them. Dictionaries are for words, not phrases.

win311fwg · 15 days ago
Technically they are both phrases and words. You can call them lexemes if you want to avoid confusing the computer programmers who do not understand that life isn't binary.
globular-toast · 15 days ago
Yep, all of the following make perfect sense to me, they're just non-idiomatic:

- Don't put your hand in water that's boiling,

- Add the pasta to water that's boiling,

- That saucepan is full of water that's boiling.

If "boiling water" were a distinct word, all of these sentences would change meaning compared to their idiomatic counterparts.

Ekaros · 15 days ago
Boiling water is mostly the same as boiling anything. So I would just have "boiling". No need for "boiling water". I see no reason why boiling water could not just be covered by a general entry for boiling.
9rx · 15 days ago
The reason is the same reason for why the word "hot water" is found in the dictionary: Because it has picked up other meaning.

The word "boiling water" is not currently found in the dictionary because the meaning has not been considered widespread or significant enough to justify inclusion. The article is pondering what line exactly defines widespread or significant.

m-schuetz · 15 days ago
Some other words that are sorely missing from dictionaries: "Warm water", "hot water", "cold water", "dirty water"
manarth · 15 days ago
As an idiomatic expression, "Hot water" = "trouble".

Are there idiomatic expressions for warm/cold/dirty water, which mean something other than a literal adjective describing the temperature or condition of water?

epolanski · 15 days ago
> dirty water

Depending on the context you got sewage, slush, runoff, murk, waste etc.

vunderba · 15 days ago
Agree. You can of course treat "Boiling water" in its gerund form where it functions as a noun:

  "Boiling water should be performed in a metal pot".

> It’s a hazard, a cooking stage, a state of matter

All of these are ancillary and depend on context, but in every one of these downstream cases the same underlying process is happening: the water is boiling.

8bitsrule · 15 days ago
> the water is boiling.

Not necessarily. It might refer to heating water to bring it to a boil.

Q. What are you doing over there?

A. Oh, just boiling water.

gcanyon · 15 days ago
I would have agreed with you before they pointed out that "frozen water" gets a word: ice. Honestly, I think it's reasonable: people deal with frozen water far more than they do boiling water, but it changes it from a case of "what are they talking about?" to "okay, where do we draw the line?" for me.
dghf · 15 days ago
But water that has boiled into gas also gets a word: steam.

As far as I'm aware, there is no separate word for freezing water -- i.e. water that is very cold and will, if it continues to get colder (and has something to crystallise around), turn into ice.

So the symmetry seems complete: ice -> freezing water -> water -> boiling water -> steam.

uhhhhhhh · 15 days ago
Well, being pedantic, my favorite hobby:

Frozen water represents a state change and that different state commonly gets its own word: ice/water/steam equates to solid/liquid/gas

Boiling/freezing water represents the state of the liquid, not the transition. It's descriptive. Water boils away into steam, or freezes into ice.

Should we consider lukewarm water also singular? What about body-temperature water? Cool water? It makes sense not to treat adjectives/descriptive words combined with the subject as singular because the definition already exists in the root of the words (meaning of adjective word + meaning of subject word). Blue clay is another example: why would that be a singular?

It really only makes sense to me in the rare cases where the combined words represent something different from, or not obvious from, the combined meanings of the two words (e.g. to 'give up').

Izkata · 15 days ago
Ice, slush, sleet, snow, graupel, hail... And within there is a subtype "black ice", a compound noun that isn't really just a description (it's not black, it's nearly invisible - a similar sense as another one, "black hole", which you'd never figure out from the components alone).

We have a lot of words for "frozen water" because it takes a lot of forms. As far as I know "boiling water" is only one thing so we've never needed additional words to distinguish it.

NiloCK · 15 days ago
Steam?
RiverCrochet · 15 days ago
What's ice cream then?
adzm · 15 days ago
Yeah, this article brings up a good point per se, but then defeats itself with nonsensical analysis and examples
LgWoodenBadger · 15 days ago
And even more confounding, what's "water ice?"

https://www.ritasice.com

georgefrowny · 15 days ago
It used to be iced cream, which is more descriptive.

Ice cream is a shortened pronunciation.

jonplackett · 15 days ago
I’m so glad I’m not going insane. I don’t see any examples on that site that I agree are ‘one word’. Sure they’re singular concepts but so what? Are we going to have singular words to describe all adjective noun pairs now?
kllrnohj · 15 days ago
Really? none are one word? How about "of course"?
adamauckland · 15 days ago
The kettle was boiling water.

The chef was out the back, boiling water.

The chef was out the back. Boiling water had spilled everywhere.

The seas had turned to boiling water.

I dunno, could be down to interpretation.

Beijinger · 15 days ago
"a state of matter": no, boiling water is not a "state of matter".
georgefrowny · 15 days ago
It's a state that matter can be in. Which is not the same as the technical compound word "state of matter".

Which is why "state of matter" is, itself, often in the dictionary, possibly to the dismay of the Team Single Word in this comment section.

AlotOfReading · 17 days ago
A compound word isn't just a phrase. The latter is a group of words that indicate a single concept. The former is a new word that has a distinct meaning from the subwords that compose it. "I love you" is an example of a clausal phrase. The meaning is entirely evident from the words that compose it. In contrast, a "hot dog" is not a particularly warm canine, and has its own OED entry [0] as a compound word.

And some of the entries on this list are wrong. "Good night" exists in OED as "goodnight" [1] because there are multiple ways it's used. One is the clausal phrase "I hope you have a good night", which can be modified by changing the adjective, e.g. "great night" or "terrible night". "Goodnight" the bedtime ritual can't be modified the same way, so OED chooses to write it as a compound word without spaces.

[0] https://www.oed.com/dictionary/hot-dog_n

[1] https://www.oed.com/dictionary/goodnight_n

harperlee · 15 days ago
Surprised that no comment mentioned that there is a standard term (not a word :P) for the set of words that denominates a particular concept: nominal syntagm. Such as "boiling water" and also "that green parrot we saw yesterday over the left branch".

Also the slider examples are abysmal. "I love you", "Go home" and "How are you" are not words by any stretch of imagination. For someone who makes word games, I don't see a particularly deep love of words here.

Edit: Obligatory reference to Borges's Tlön: https://en.wikipedia.org/wiki/Tl%C3%B6n,_Uqbar,_Orbis_Tertiu...

michaeld123 · 15 days ago
Added a note: "'I love you' isn't opaque, but it's tight enough to put on a tile." The familiar end of the spectrum picks up collocations that are transparent but loaded — I'm not claiming they're words in the traditional sense, but they're useful vocabulary for word games, which is where I'm coming from.
vunderba · 15 days ago
> "'I love you' isn't opaque, but it's tight enough to put on a tile."

The problem with introducing phrases/sentences into a word game (let's take Scrabble) is that you'd spend half the night with your friends arguing over what is and is not acceptable with the only litmus test being its... corpus frequency?

medalblue · 15 days ago
I thought that sentence seemed out of place when I read it. Didn't realize this was all AI slop. It all makes sense now.
georgefrowny · 15 days ago
Funnily enough, "nominal syntagm" is, itself, not in the OED or Wiktionary. But Wiktionary has "syntagme nominal" as the French translation for "noun phrase".

You really have to love the human messiness of language!

win311fwg · 15 days ago
A nominal syntagm is a somewhat overlapping concept, but deviates slightly from the direct discussion taking place. The more appropriate standard term here is: open compound word. Or, as one might say casually: word.
dec0dedab0de · 17 days ago
There are nearly half a million compound phrases that aren’t in any dictionary—simply because they contain spaces. “Boiling water.” “Saturday night.” “Help me.”

I would hope that none of those examples were taking up space in a dictionary.

jakub_g · 17 days ago
It's quite interesting that "boiling water" in many Slavic languages is actually a separate word (and not derived from "water", but from "boiling"; similar to how the author mentions "ice" being used instead of "frozen water").
rjh29 · 15 days ago
Japan is similar with 熱湯 boiling、お湯 heated、白湯 boiled once then cooled down、水 cold
hbs18 · 14 days ago
Which Slavic languages have that? In mine I can't think of a single word for that other than kipuća voda, which literally means boiling water.
dec0dedab0de · 17 days ago
It was mentioned in other comments but boiled water is steam, and frozen water is ice. We do not have separate words for freezing water or boiling water.

In the Slavic languages, do they have a different way to describe boiling or freezing milk, or any other liquid?

epgui · 17 days ago
I mean it’s interesting that this is generally the case with many (or even most) words across languages… But I’d wager it’s more the norm than the exception, so I don’t know if “boiling water” is that interesting of an example.
michaeld123 · 15 days ago
This was a great detail — added Russian kipyatok and Polish wrzątok to the article as evidence that "boiling water" carries enough conceptual weight that other languages crystallized it into a single word
gligierko · 17 days ago
Some are better than others. Many semi-transparents could get legit coverage. And many are good fodder for word game content.
dec0dedab0de · 17 days ago
The rest of the article did a good job explaining that. I just think those were terrible examples for the introduction. I think "shut up", "good night", and "hot dog" would have really got the point across better, but those might already be in dictionaries.

simlevesque · 17 days ago
With the first two, I kind of understand what the author means. But "help me" and "severe pain" made me think that I'm just not the right audience for this text.
dec0dedab0de · 17 days ago
I don’t see how boiling water could ever be a single word. Would that mean we need entries for every other liquid boiling?

I guess Saturday night could have some extra details explaining the context around our standard work week. But even that is a stretch.

DonHopkins · 15 days ago
>"Boiling water" ... I would hope that none of those examples were taking up space in a dictionary.

Yeah, I agree! Fuck ICE!

thmpp · 17 days ago
While 'this analysis would not have been possible without LLM', I am not sure the LLM analysis was well reviewed after it was done. From the obscure/familiar word list, some of the n-grams, e.g. "is resource", "seq size", "db xref", surely occur in the wild (as we well know), but I doubt we can argue they are missing from the dictionary. Knowing the domain, I would argue none of them are words, not even collocations. If "is resource" is, why not "has resource"? So while the path is surely interesting, this analysis lacks scrutiny, which you would expect from a high-level LLM analysis.
michaeld123 · 17 days ago
The very bottom of the slider is there to illustrate where LLM artifacts and Wiktionary noise live — it's not presented as legitimate vocabulary. The slider lets you see the full quality gradient, including where it breaks down.
exmadscientist · 15 days ago
That's not really mentioned in the article, though. As far as the article is concerned, the right side of that slider is valid-but-possibly-too-rare-to-be-interesting, when in fact it's just garbage. This does not sell the concept well.
less_less · 15 days ago
In addition to what others have pointed out, many of these aren't actually missing from traditional dictionaries: they're just inflected differently. So your example lists phrases like "operating systems", "immune systems" and "solar systems" as missing from traditional dictionaries, but at least the online OED and M-W have "operating system", "immune system" and "solar system" in them. It's just that your script is apparently listing the plural as a separate phrase.
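
A minimal Python sketch of the normalization this paragraph implies: before flagging a phrase as missing, also look it up with a naively singularized last word. The function names and the tiny headword list are hypothetical, and the suffix rules are a rough assumption rather than real English morphology (irregular plurals are not handled).

```python
def singularize_last_word(phrase):
    """Naively strip a plural ending from the final word of a phrase."""
    words = phrase.lower().split()
    if not words:
        return ""
    last = words[-1]
    if last.endswith("ies") and len(last) > 4:
        last = last[:-3] + "y"          # "parties" -> "party"
    elif last.endswith(("ches", "shes", "sses", "xes")):
        last = last[:-2]                # "boxes" -> "box"
    elif last.endswith("s") and not last.endswith("ss"):
        last = last[:-1]                # "systems" -> "system"
    return " ".join(words[:-1] + [last])


def missing_phrases(candidates, dictionary_headwords):
    """Return candidates absent from the dictionary even after singularizing."""
    headwords = {h.lower() for h in dictionary_headwords}
    missing = []
    for phrase in candidates:
        if phrase.lower() in headwords:
            continue
        if singularize_last_word(phrase) in headwords:
            continue
        missing.append(phrase)
    return missing


# Hypothetical sample data, not taken from the OED or M-W.
headwords = ["operating system", "immune system", "solar system"]
candidates = ["operating systems", "immune systems", "boiling water"]
print(missing_phrases(candidates, headwords))  # ['boiling water']
```

With a check like this, the plural forms collapse onto entries that already exist, and only phrases with no matching headword remain on the "missing" list.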

On languages other than English: in general, different languages do word division very differently. At least in German and Dutch, many of those phrasal verbs are separable, meaning that they are one word in the infinitive but are multiple words in the present tense. So for example, where in English you would say "I log in to the website", in Dutch it would be "Ik log in op de website". "Log in" is two words in both cases, but in Dutch it's the separated form of the single-word separable verb inloggen ("I must log in now" = "Ik moet nu inloggen"). The verb is indeed separable in that the two words often don't end up next to each other: "I log in quickly" = "Ik log snel in".

Dutch, like German, has lots of compounds. But there are also agglutinative languages, which have even more complex compound words, perhaps comprising a whole sentence in another language. E.g. (from Wikipedia) Turkish "evlerinizdenmiş" = "(he/she/it) was (apparently/said to be) from your houses" or Plains Cree "paehtāwāēwesew" = "he is heard by higher powers"; and these aren't corner cases: that's how the language works.

kelseyfrog · 17 days ago
The name for these is "collocations".

Collocation dictionaries are lists of collocations. The reason they're absent from single-word dictionaries is that there are about 25x more collocations than single words.

georgefrowny · 15 days ago
And fittingly enough, "collocation dictionary" is not in "the" dictionary. At least not the OED.

Presumably if the word thesaurus was actually "synonym dictionary" it would likewise be absent.

michaeld123 · 15 days ago
You and Shorel were both right — added it. Thanks -- the lexicographers like https://www.sketchengine.eu
urbandw311er · 15 days ago
The author of this article just hasn’t been taught how to use a dictionary. The words aren’t “missing”, they’re just indexed under one of their parts. For example “wait upon” would be located within the entry for “wait”.
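
The lookup described here could be sketched roughly as follows in Python, assuming a dictionary stored as headwords with nested sub-entries. The data structure, the function name `find_under_headwords`, and the sample entries are all made up for illustration; real dictionaries organize their sub-entries in their own ways.

```python
def find_under_headwords(phrase, entries):
    """Return the headwords whose sub-entries list the given phrase."""
    target = phrase.lower()
    found = []
    # Try each word of the phrase as a possible headword.
    for word in target.split():
        if target in entries.get(word, ()) and word not in found:
            found.append(word)
    return found


# Hypothetical sample data: phrases nested under one of their parts.
entries = {
    "wait": ["wait on", "wait up", "wait upon"],
    "upon": [],
}
print(find_under_headwords("wait upon", entries))  # ['wait']
```

A check that only compares whole phrases against the list of headwords would report "wait upon" as missing, even though it sits inside the entry for "wait".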