Readit News
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
kgeist · 3 months ago
Interesting:

  Russia - Europe = Putin
  Ukraine + Putin = Russia
  Putin - Stalin = Bush
  Stalin - purge = Lenin
That means Bush = Ukraine+Putin-Europe-Lenin-purge.

However, the site gives Bush at -4%, the second-best option (the best is "fleet ballistic missile submarine" at -2%; not sure what negative numbers mean).

nxa · 3 months ago
My interpretation of negative numbers is that no "synonym" was found (no vector pointing in the same direction), and that the closest expression on record is something with an opposite meaning (pointing in reverse direction), so I'd say that's an antonym.
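A minimal sketch of that interpretation, assuming the percentages are cosine similarities between the result vector and each dictionary entry (the thread does not confirm the metric). The vectors are made-up 2-D toys, not real embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points the same way as the query
orthogonal = [0.0, 3.0]       # unrelated direction
opposite = [-1.0, 0.0]        # points the reverse way

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
print(cosine_similarity(query, opposite))        # -1.0
```

Under this assumption, a best match of -2% sits much closer to "unrelated" (0) than to "opposite" (-1), so it may indicate that nothing in the dictionary points in a similar direction rather than a strong antonym.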
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
nxa · 3 months ago
artificial intelligence - bullsh*t = computer science (34%)
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
kaycebasques · 3 months ago
(Question for anyone) how could I go about replicating this with Gemini Embedding? Generate and store an embedding for every word in the dictionary?
nxa · 3 months ago
Yes, that's pretty much what it is. Watch out for homographs.
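A rough sketch of the "generate and store" step, with one entry per word sense so homographs keep separate vectors. `toy_embed` is a hypothetical stand-in for a real embedding call (Gemini Embedding or any other model); it only hashes the text so the example runs offline. Embedding the word together with its definition is an assumption, not something the thread specifies:

```python
import hashlib

def toy_embed(text, dim=8):
    """Hypothetical placeholder for a real embedding API call.
    Returns a deterministic pseudo-vector derived from a hash of the text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [(byte - 127.5) / 127.5 for byte in digest[:dim]]

def build_index(entries, embed=toy_embed):
    """Embed each (word, definition) pair once and store it for later lookup."""
    index = []
    for word, definition in entries:
        vector = embed(f"{word}: {definition}")
        index.append({"word": word, "definition": definition, "vector": vector})
    return index

# Made-up dictionary entries; "king" appears twice as a homograph.
entries = [
    ("king", "a male sovereign; ruler of a kingdom"),
    ("king", "the most important chess piece"),
    ("queen", "a female sovereign ruler"),
]

index = build_index(entries)
print(len(index))                                # 3
print(sorted({item["word"] for item in index}))  # ['king', 'queen']
```

Swapping `toy_embed` for a real model call leaves the rest unchanged; the index can then be saved to disk so each entry is embedded only once.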
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
nxa · 3 months ago
This might be helpful: I haven't implemented it in the UI, but the API response shows the word definitions for both the input and the output. If the output has homographs, the likelihood is split per definition, but the UI only shows the best one.

Also, in case it gets buried in the comments: proper nouns need to be capitalized (Paris-France+Germany).

I am planning on patching up the UI based on your feedback.

nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
spindump8930 · 3 months ago
First off, this interface is very nice and a pleasure to use, congrats!

Are you using word2vec for these, or embeddings from another model?

I also wanted to add some flavor since it looks like many folks in this thread haven't seen something like this - it's been known since 2013 that we can do this (but it's great to remind folks especially with all the "modern" interest in NLP).

It's also known (in some circles!) that a lot of these vector arithmetic things need some tricks to really shine. For example, excluding the words already present in the query[1]. Others in this thread seem surprised at some of the biases present - there's also a long history of work on that [2,3].

[1] https://blog.esciencecenter.nl/king-man-woman-king-9a7fd2935...

[2] https://arxiv.org/abs/1905.09866

[3] https://arxiv.org/abs/1903.03862

nxa · 3 months ago
Thank you! I actually had a hard time finding prior work on this, so I appreciate the references.

The dictionary is based on https://wordnet.princeton.edu/, no word2vec. It's just a plain lookup among precomputed embeddings (with mxbai-embed-large). And yes, I'm excluding words that are present in the query.

It would be interesting to see how other models perform. I tried one (forgot the name) that was focused on coding, and it didn't perform nearly as well (in terms of human joy from the results).
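The "plain lookup" with query words excluded could look roughly like the sketch below. The function name `solve`, the `exclude_query` flag, and the 2-D vectors are all hypothetical; cosine similarity is assumed as the ranking metric, and the real tool would use mxbai-embed-large vectors over WordNet entries instead:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def solve(plus, minus, vectors, exclude_query=True):
    """Add and subtract word vectors, then return the (score, word) nearest to the result."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in plus:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in minus:
        target = [t - v for t, v in zip(target, vectors[word])]
    query_words = set(plus) | set(minus)
    candidates = [
        (cosine(target, vec), word)
        for word, vec in vectors.items()
        if not (exclude_query and word in query_words)
    ]
    return max(candidates)

# Made-up toy vectors, not real embeddings.
vectors = {
    "king": [3.0, 1.0],
    "queen": [2.0, -2.0],
    "man": [0.0, 0.5],
    "woman": [0.0, -0.5],
    "apple": [-1.0, 0.2],
}

print(solve(["king", "woman"], ["man"], vectors)[1])                       # queen
print(solve(["king", "woman"], ["man"], vectors, exclude_query=False)[1])  # king
```

With these toy numbers the unfiltered nearest neighbor is the input word "king" itself, which is the effect the exclusion is meant to avoid.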

nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
nikolay · 3 months ago
Really?!

  man - brain = woman
  woman - brain = businesswoman

nxa · 3 months ago
I probably should have prefaced this with "try at your own risk, results don't reflect the author's opinions".
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
fph · 3 months ago
"King" (capital) probably was interpreted as https://en.wikipedia.org/wiki/Billie_Jean_King , that's why a tennis player showed up.
nxa · 3 months ago
When I first tried it, "king" was referring to the instrument, and I was getting king-man+woman=flute ... :-D
nxa commented on Show HN: Semantic Calculator (king-man+woman=?)   calc.datova.ai... · Posted by u/nxa
cabalamat · 3 months ago
What does it mean when it surrounds a word in red? Is this signalling an error?
nxa · 3 months ago
Yes, word in red = word not found. That's mostly the case when you try plurals or non-nouns (for now).

u/nxa

Karma: 93 · Cake day: October 3, 2023