Show HN: Semantic Calculator (king-man+woman=?)

  data + plural = number
  data - plural = research
  king - crown = (didn't work... crown gets circled in red)
  king - princess = emperor
  king - queen = kingdom
  queen - king = worker
  king + queen = queen + king = kingdom
  boy + age = (didn't work... boy gets circled in red)
  man - age = woman
  woman - age = newswoman
  woman + age = adult female body (tied with man)
  girl + age = female child
  girl + old = female child

The other suggestions are pretty similar to the results I got in most cases. But I think this helps illustrate the curse of dimensionality (i.e. distances are ill-defined in high dimensional spaces). This is still quite an unsolved problem and seems a pretty critical one to resolve that doesn't get enough attention.

n2d4 · 3 months ago

For fun, I pasted these into ChatGPT o4-mini-high and asked it for an opinion:

   data + plural    = datasets
   data - plural    = datum
   king - crown     = ruler
   king - princess  = man
   king - queen     = prince
   queen - king     = woman
   king + queen     = royalty
   boy + age        = man
   man - age        = boy
   woman - age      = girl
   woman + age      = elderly woman
   girl + age       = woman
   girl + old       = grandmother

The results are surprisingly good, I don't think I could've done better as a human. But keep in mind that this doesn't do embedding math like OP! Although it does show how generic LLMs can solve some tasks better than traditional NLP.

The prompt I used:

> Remember those "semantic calculators" with AI embeddings? Like "king - man + woman = queen"? Pretend you're a semantic calculator, and give me the results for the following:

franga2000 · 3 months ago

This is an LLM approximating a semantic calculator, based solely on trained-in knowledge of what that is and probably a good amount of sample output, yet somehow beating the results of a "real" semantic calculator. That's crazy!

The more I think about it the less surprised I am, but my initial thoughts were quite simply "now way" - surely an approximation of an NLP model made by another NLP model can't beat the original, but the LLM training process (and data volume) is just so much more powerful I guess...

nbardy · 3 months ago

I hate to be pedantic, but the llm is definitely doing embedding math. In fact that’s all it does.

godelski · 3 months ago

  > The results are surprisingly good, I don't think I could've done better as a human

I'm actually surprised that the performance is so poor and would expect a human to do much better. The GPT model has embedding PLUS a whole transformer model that can untangle the embedded structure.

To clarify some of the issues:

  data is both singular and plural, being a mass noun[0,1]. Datum is something you'll find in the dictionary, but not common in use[2]. The dictionary lags actual definitions. I mean words only mean what we collectively agree they mean (dictionary definitely helps with that but we also invent words all the time -- i.e. slang). I see how this one could trick up a human, feeling the need to change the output and would likely consult a dictionary but I don't think that's a fair comparison here as LLMs don't have these same biases.

  King - crown really seems like it should be something like "man" or "person". The crown is the manifestation of the ruling power. We still use phrases like "heavy is the head that wears the crown" in reference to general leaders, not just monarchs.

  king - princess I honestly don't know what to expect. Man is technically gender neutral so I'll take this one.

  king - queen I would expect similar outputs to the previous one. Don't quite agree here.

  queen - king I get why is removing royalty but given the previous (two) results I think is showing a weird gender bias. Remember that queen is something like (woman + crown) and king is akin to (man + crown). So subtracting should be woman - man. 

  The others I agree with. These were actually done because I was quite surprised at the results and was thinking about the aforementioned gender bias.

  > But keep in mind that this doesn't do embedding math like OP!

I think you are misunderstanding the architecture of these models. The embedding sub-network is the translation of text to numeric tokens. You'll find mention of the embedding sub-networks in both the GPT3[3] and GPT4 papers. Though they are given lower importance than other works. While much smaller than the main network, don't forget that embedding networks are still quite large. For the smaller models they constitute a significant part of the total parameter count[4]

After the embedding sub-network is your main transformer network. The purpose of this network is to perform embedding math! It is just that the goal is to do significantly more complicated math. Remember, these are learnable mappings (see Optimal Transport). We're just breaking it down into their two main intermediate mappings. But the embeddings still end up being a bottleneck. It is your literal gateway from words to numbers.

[0] https://en.wikipedia.org/wiki/Mass_noun

[1] https://www.merriam-webster.com/dictionary/data

[2] https://www.sciotoanalysis.com/news/2023/1/18/this-data-or-t...

[3] https://arxiv.org/abs/2005.14165

[4] https://arxiv.org/abs/2303.08774

[4] https://www.lesswrong.com/posts/3duR8CrvcHywrnhLo/how-does-g...

amdivia · 3 months ago

Can you do the same but each line is done in a seperate context?

refulgentis · 3 months ago

...welcome to ChatGPT, everyone! If you've been asleep since...2022?

(some might say all an LLM does is embeddings :)

mathgradthrow · 3 months ago

Distance is extremely well defined in high dimensional spaces. That isn't the problem.

godelski · 3 months ago

Would you care to elaborate? To clarify, I mean that variance reduces as dimensionality increases

Affric · 3 months ago

Yeah I did similar tests and got similar results.

Curious tool but not what I would call accurate.

gweinberg · 3 months ago

I got a bunch of red stuff also. I imagine the author cached embeddings for some words but not really all that many to save on credits. I gave it mermaid - woman and got merman, but when I tried to give it boar + woman - man or ram + woman - man, it turns out it has never heard of rams or boars.

thatguysaguy · 3 months ago

Can you elaborate on what the unsolved problem you're referring to is?

godelski · 3 months ago

Dealing with metrics in high dimensions. As you increase dimensionality the variance decreases, leading to indistinguishablity.

You can get some help in high dimensions when you're more concerned with (clearly disjoint) clusters. But this is akin to doing a dimensional reduction, treating independent clusters as individual points. (Say we have set S which has disjoint subsets {S_0,...,S_n}, your new set is now {a_0,...,a_n}, where each a_i is an element representing all elements in S_i. Think like "set of sets") But you do not get help with interrelationships (i.e. d(s_x,s_y) \in S_i \forall x≠y) and I think you can gather that when clusters are not clearly disjoint then we're in the same situation as trying to differentiate inter-cluster.

Understanding this can help you understand why these models (including LLMs) are good in broader concepts like differentiating between obvious things but struggle more in nuance. A good litmus test is to ask them about any subject you have good deep knowledge in. Essentially test yourself for Murray-Gelmann Amnesia. The things are designed for human preference. When they fail they're likely to fail without warning (i.e. in ways that are not so obvious)

sdeframond · 3 months ago

Such results are inherently limited because a same word can have different meanings depending on context.

The role of the Attention Layer in LLMs is to give each token a better embedding by accounting for context.

charlieyu1 · 3 months ago

I think you need to do A-B+C types? A+B or A-B wouldn’t make much sense when the magnitude changes

virgilp · 3 months ago

hacker+news-startup = golfer

pjc50 · 3 months ago

Ah yes, 女 + 子 = girl but if combined in a kanji you get 好 = like.

> king-man+woman=queen

Is the famous example everyone uses when talking about word vectors, but is it actually just very cherry picked?

I.e. are there a great number of other "meaningful" examples like this, or actually the majority of the time you end up with some kind of vaguely tangentially related word when adding and subtracting word vectors.

(Which seems to be what this tool is helping to illustrate, having briefly played with it, and looked at the other comments here.)

(Btw, not saying wordvecs / embeddings aren't extremely useful, just talking about this simplistic arithmetic)

loganmhb · 3 months ago

I once saw an explanation which I can no longer find that what's really happening here is also partly "man" and "woman" are very similar vectors which nearly cancel each other out, and "king" is excluded from the result set to avoid returning identities, leaving "queen" as the closest next result. That's why you have to subtract and then add, and just doing single operations doesn't work very well. There's some semantic information preserved that might nudge it in the right direction but not as much as the naive algebra suggests, and you can't really add up a bunch of these high-dimensional vectors in a sensible way.

E.g. in this calculator "man - king + princess = woman", which doesn't make much sense. "airplane - engine", which has a potential sensible answer of "glider", instead "= Czechoslovakia". Go figure.

jbjbjbjb · 3 months ago

Well when it works out it is quite satisfying

India - Asia + Europe = Italy

Japan - Asia + Europe = Netherlands

China - Asia + Europe = Soviet-Union

Russia - Asia + Europe = European Russia

calculation + machine = computer

kgeist · 3 months ago

Interesting:

  Russia - Europe = Putin
  Ukraine + Putin = Russia
  Putin - Stalin = Bush
  Stalin - purge = Lenin

That means Bush = Ukraine+Putin-Europe-Lenin-purge.

However, the site gives Bush -4%, second best option (best is -2%, "fleet ballistic missile submarine", not sure what negative numbers mean).

trhway · 3 months ago

democracy - vote = progressivism

I'll have to mediate on that.

groby_b · 3 months ago

I think it's worth keeping in mind that word2vec was specifically trained on semantic similarity. Most embedding APIs don't really give a lick about the semantic space

And, worse, most latent spaces are decidedly non-linear. And so arithmetic loses a lot of its meaning. (IIRC word2vec mostly avoided nonlinearity except for the loss function). Yes, the distance metric sort-of survives, but addition/multiplication are meaningless.

(This is also the reason choosing your embedding model is a hard-to-reverse technical decision - you can't just transform existing embeddings into a different latent space. A change means "reembed all")

Retr0id · 3 months ago

I think it's slightly uncommon for the vectors to "line up" just right, but here are a few I tried:

actor - man + woman = actress

garden + person = gardener

rat - sewer + tree = squirrel

toe - leg + arm = digit

gregschlom · 3 months ago

Also, as I just learned the other day, the result was never equal, just close to "queen" in the vector space.

charcircuit · 3 months ago

And queen isn't even the closest.

chis · 3 months ago

I mean they are floating point vectors so

raddan · 3 months ago

> is it actually just very cherry picked?

100%

bee_rider · 3 months ago

Hmm, well I got

    cherry - picker = blackwood

if that helps.

map - legend = Mercator projection noodle - wheat = egg noodle noodle - gluten = tagliatelle architecture - calculus = architectural style answer - question = comment shop - income = bookshop curry - curry powder = cuisine rice - grain = chicken and rice rice + chicken = poultry milk + cereal = grain blue - yellow = Fiji blue - Fiji = orange blue - Arkansas + Bahamas + Florida - Pluto = Grenada

life + death = mortality life - death = lifestyle drug + time = occasion drug - time = narcotic art + artist + money = creativity art + artist - money = muse happiness + politics = contentment happiness + art = gladness happiness + money = joy happiness + love = joy

montebicyclelo · 3 months ago

spindump8930 · 3 months ago

First off, this interface is very nice and a pleasure to use, congrats!

Are you using word2vec for these, or embeddings from another model?

I also wanted to add some flavor since it looks like many folks in this thread haven't seen something like this - it's been known since 2013 that we can do this (but it's great to remind folks especially with all the "modern" interest in NLP).

It's also known (in some circles!) that a lot of these vector arithmetic things need some tricks to really shine. For example, excluding the words already present in the query[1]. Others in this thread seem surprised at some of the biases present - there's also a long history of work on that [2,3].

[1] https://blog.esciencecenter.nl/king-man-woman-king-9a7fd2935...

[2] https://arxiv.org/abs/1905.09866

[3] https://arxiv.org/abs/1903.03862

nxa · 3 months ago

Thank you! I actually had a hard time finding prior work on this, so I appreciate the references.

The dictionary is based on https://wordnet.princeton.edu/, no word2vec. It's just a plain lookup among precomputed embeddings (with mxbai-embed-large). And yes, I'm excluding words that are present in the query because.

It would be interesting to see how other models perform. I tried one (forgot the name) that was focused on coding, and it didn't perform nearly as well (in terms of human joy from the results).

kaycebasques · 3 months ago

(Question for anyone) how could I go about replicating this with Gemini Embedding? Generate and store an embedding for every word in the dictionary?

antidnan · 3 months ago

Neat! Reminds me of infinite craft

https://neal.fun/infinite-craft/

thaumasiotes · 3 months ago

I went to look at infinite craft.

It provides a panel filled with slowly moving dots. Right of the panel, there are objects labeled "water", "fire", "wind", and "earth" that you can instantiate on the panel and drag around. As you drag them, the background dots, if nearby, will grow lines connecting to them. These lines are not persistent.

And that's it. Nothing ever happens, there are no interactions except for the lines that appear while you're holding the mouse down, and while there is notionally a help window listing the controls, the only controls are "select item", "delete item", and "duplicate item". There is also an "about" panel, which contains no information.

In the panel, you can drag one of the items (eg. Water) onto another one (eg. Earth), and it will create a new word (eg. Plant). It uses AI, so it goes very deep

lcnPylGDnU4H9OF · 3 months ago

Some of these make more sense than others (and bookshop is hilarious even if it's only the best answer by a small margin; no shade to bookshop owners).

C-x_C-f · 3 months ago

I don't want to dump too many but I found

   chess - checkers = wormseed mustard (63%)

pretty funny and very hard to understand. All the other options are hyperspecific grasslike plants like meadow salsify.

ccppurcell · 3 months ago

My philosophical take on it is that natural language has many many more dimensions than we could hope to represent. Whenever you do dimension reduction you lose information.

ActionHank · 3 months ago

dog - fur = Aegean civilization

jumploops · 3 months ago

This is super neat.

I built a game[0] along similar lines, inspired by infinite craft[1].

The idea is that you combine (or subtract) “elements” until you find the goal element.

I’ve had a lot of fun with it, but it often hits the same generated element. Maybe I should update it to use the second (third, etc.) choice, similar to your tool.

[0] https://alchemy.magicloops.app/

[1] https://neal.fun/infinite-craft/

lightyrs · 3 months ago

I don't get it but I'm not sure I'm supposed to.

    Life + death = mortality

is pretty good IMO, it is a nice blend of the concepts in an intuitive manner. I don’t really get

   drug + time = occasion

But

   drug - time = narcotic

Is kind of interesting; one definition of narcotic is

> a drug (such as opium or morphine) that in moderate doses dulls the senses, relieves pain, and induces profound sleep but in excessive doses causes stupor, coma, or convulsions

https://www.merriam-webster.com/dictionary/narcotic

So we can see some element of losing time in that type of drug. I guess? Maybe I’m anthropomorphizing a bit.

grey-area · 3 months ago

Does the system you’re querying ‘get it’? From the answers it doesn’t seem to understand these words or their relations. Once in a while it’ll hit on something that seems to make sense.

__MatrixMan__ · 3 months ago

Here's a challenge: find something to subtract from "hammer" which does not result in a word that has "gun" as a substring. I've been unsuccessful so far.

mrastro · 3 months ago

The word "gun" itself seems to work. Package this as a game and you've got a pretty fun game on your hands :)

Doh why didn't I think of that

aniviacat · 3 months ago

Gun related stuff works: bullet, holster, barrel

Other stuff that works: key, door, lock, smooth

Some words that result in "flintlock": violence, anger, swing, hit, impact

Well that's easy, subtract "gun" :P

ttctciyf · 3 months ago

hammer - keyboard = hammerhead

Makes no sense, admittedly!

- dulcimer and - zither are both in firmly in .*gun.* territory it seems..

downboots · 3 months ago

Bullet

soxfox42 · 3 months ago

hammer - red = lock

tough · 3 months ago

hammer + man = adult male body (75%)

rdlw · 3 months ago

Close, that's addition

neom · 3 months ago

if I'm allowed only 1 something, I can't find anything either, if I'm allowed a few somethings, "hammer - wine - beer - red - child" will get you there. Guessing given that a gun has a hammer and is also a tool, it's too heavily linked in the small dataset.