og_kalu · 5 months ago
One of the biggest problems with hands off LLM writing (for long horizon stuff like novels) is that you can't really give them any details of your story because they get absolutely neurotic with it.

Imagine for instance you give the LLM the profile of the love interest for your epic fantasy, it will almost always have the main character meeting them within 3 pages (usually page 1) which is of course absolutely nonsensical pacing. No attempt to tell it otherwise changes anything.

This is the first model that, after 19 pages generated so far, resembles anything like normal pacing even with a TON of details. I've never felt the need to generate anywhere near this much. Extremely impressed.

Edit: Sharing it - https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

with pastebin - https://pastebin.com/aiWuYcrF

comboy · 5 months ago
I like how the critique of LLMs has evolved on this site over the last few years.

We are currently at nonsensical pacing while writing novels.

skyechurch · 5 months ago
The most straightforward way to measure the pace of AI progress is by attaching a speedometer to the goalposts.
solardev · 5 months ago
It's not really passing the Turing Test until it outsells Harry Potter.
krzat · 5 months ago
This either ends at "better than 50% of human novels" garbage or at unimaginably compelling works of art that completely obsolete fiction writing.

Not sure which is better for humanity in the long term.

ruraljuror · 5 months ago
We are, if this comment is the standard for all criticism on this site. Your comment seems harsh. Perhaps novel writing is too low-brow of a standard for LLM critique?
rafaelmn · 5 months ago
People are trying to use gen AI in more and more use cases. It used to fall flat on its face at trivial stuff; now it's past the trivial stuff but still scratching at the boundary of being useful. That's not an attempt to make gen AI look bad (what it can do really is amazing), but it is far from delivering on the hype, and that's why people offer critical evaluations.

Let's not forget the OpenAI benchmarks saying GPT-4 could do better on college exams and such than most students. Yet real-world performance on real tasks was laughable.

stickfu · 5 months ago
I don’t know why I keep submitting myself to Hacker News, but every few months I get the itch, and it only takes a few minutes to be turned off by the cynicism. I get that it’s from potentially wizened tech heads who have been in the trenches and are being realistic. It’s great for that, but any new bright-eyed and bushy-tailed dev/techy should stay far away until much later in their journey.
ksec · 5 months ago
Do we have any simple benchmarks (and I know benchmarks are not everything) that test all the LLMs?

The pace is moving so fast I simply can't keep up. Or an ELI5 page that gives a 5-minute explanation of LLMs from 2020 to this moment?

basch · 5 months ago
It’s more a bellwether or symptom of a flaw where the context becomes poisoned and continually regurgitates the same thought over and over.
leokennis · 5 months ago
Not really new, is it? The first cars just had to approach horse-and-cart levels of speed. Comfort, ease of use, etc. were non-factors, as this was "cool new technology".

In that light, even a 20 year old almost broken down crappy dinger is amazing: it has a radio, heating, shock absorbers, it can go over 500km on a tank of fuel! But are we fawning over it? No, because the goalposts have moved. Now we are disappointed that it takes 5 seconds for the Bluetooth to connect and the seats to auto-adjust to our preferred seating and heating setting in our new car.

ripped_britches · 5 months ago
lol wouldn’t that be great to read this comment in 2022
deng · 5 months ago
I have actually read it and agree it is impressive. I will not comment much on the style of the writing, since that is very subjective, but I would rate it as the "typical" modern fantasy style, which aims at filling as many pages as possible: very "flowery" language, lots of adjectives/adverbs, lots of details, lots of high-school prose ("Panic was a luxury they couldn't afford"). Not a big fan of that, since I really miss the time when authors could write single, self-contained books instead of sprawling series over thousands of pages, but I know of course that this kind of thing is very successful and people seem to enjoy it. If someone gave me this, I would advise them to get a good copy editor.

There are some logical inconsistencies, though. For instance, when they both enter the cellar through a trapdoor, Kael goes first, but the innkeeper instructs him to close the trapdoor behind them, which makes no sense. Also, Kael goes down the stairs and "risks a quick look back up" and can somehow see the front door bulging and the chaos outside through the windows, which is obviously impossible when you look up through a trapdoor, not to mention that this entry was previously said to be behind the bar counter, surely blocking the sight. Kael lights an oily rag which somehow becomes a torch. There are more generic issues, like the Eldertides being mythical things no one has ever seen, yet seemingly pretty common occurrences. The dimensions of the cellar are completely unclear: at first it seems very small, yet they move around it quite a bit. There are other problems, like characters using the same words as the narrator ("the ooze"), as if they can hear him, and the innkeeper suddenly calling Kael by his name as if they already know each other.

Anyway, I would rate it "first draft". Of course, it is unclear whether the LLM would manage to write a consistent book, but I can fully believe that it would manage. I probably wouldn't want to read it.

hjnilsson · 5 months ago
Thank you for taking the time to do a thorough read. I just skimmed it, and the prose is certainly not for me. To me it lacks focus, but as you say, this may be the style readers enjoy.

And it also, as you say, really reuses words. Just reading, I notice "phosphorescence" 4 times in this chapter, for example, and "ooze" 17 times (!).

It is very impressive though that it can create a somewhat cohesive storyline, and certainly an improvement over previous models.

blinding-streak · 5 months ago
Regarding your last sentence, I agree. My stance is this: If you didn't bother to write it, why should I bother to read it?
og_kalu · 5 months ago
Yeah I just focused on how well it was paced and didn't give any instructions on style or try a second pass to spot any inconsistencies.

That would be the next step but I'd previously never thought going any further might be worth it.

KittenInABox · 5 months ago
> Not a big fan of that since I really miss the time where authors could write single, self-contained books instead of a sprawling series over thousands of pages, but I know of course that this kind of thing is very successful and people seem to enjoy it.

When was this time you speak of?

nout · 5 months ago
Using the AI in multiple phases is the approach that can handle this. Similarly to "Deep Research" approach - you can tell it to first generate a storyline with multiple twists and turns. Then ask the model to take this storyline and generate prompts for individual chapters. Then ask it to generate the individual chapters based on the prompts, etc.
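That phased flow can be sketched in a few lines of Python. Everything here is hypothetical: `complete` is a placeholder standing in for whatever LLM API you use, and the prompts are only illustrative:

```python
def complete(prompt: str) -> str:
    """Placeholder for a real LLM call; swap in your API client of choice."""
    return f"[model output for: {prompt.splitlines()[0][:50]}]"

def write_book(premise: str, n_chapters: int = 3) -> list[str]:
    # Phase 1: overall storyline with twists and turns.
    storyline = complete(f"Outline a storyline with twists and turns for: {premise}")
    # Phase 2: turn the storyline into one prompt per chapter.
    chapter_prompts = [
        complete(f"Storyline:\n{storyline}\nWrite a detailed prompt for chapter {i}.")
        for i in range(1, n_chapters + 1)
    ]
    # Phase 3: generate each chapter from its prompt.
    return [complete(f"Write the full chapter from this prompt:\n{p}")
            for p in chapter_prompts]
```

Each phase only ever sees a compact artifact from the previous one, which is what keeps pacing decisions out of the final generation step.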
bbor · 5 months ago
Yup -- asking a chatbot to create a novel in one shot is very similar to asking a human to improvise a novel in one shot.
fshr · 5 months ago
I think you would be better off having the LLM help you build up the plot with high level chapter descriptions and then have it dig into each chapter or arc. Or start by giving it the beats before you ask it for help with specifics. That'd be better at keeping it on rails.
og_kalu · 5 months ago
I don't disagree. Like with almost anything else involving LLMs, getting hands-on produces better results. But in this instance I much prefer to be the reader rather than the author or editor, so it's really important to me that an LLM is capable of pacing long-form writing properly on its own.
tluyben2 · 5 months ago
That's what I tried on the train [0] a few weeks ago. I used Groq to get something very fast, to see if it would work at least somewhat. It gives you a PDF in the end. Plugging in a better model gave much better results (still not really readable if you actually try; at a glance it's convincing though), but it was so slow that testing was all but impossible. You can't really run things in parallel either, because each chapter needs to know what came before it, or at least a summary of it.

[0] https://github.com/tluyben/bad-writer

sagarpatil · 5 months ago
My prompt is nowhere near yours.

Just for fun: Asked it to rewrite the first page of ‘The Fountainhead’ where Howard is a computer engineer, the rewrite is hilarious lol.

https://gist.github.com/sagarspatil/e0b5443132501a3596c3a9a2...

didip · 5 months ago
Give it time, this will be solved.

I envision that one day a framework will be created that can persist the LLM's current state to disk, so that "fragments of memories" can be paged in and out of context.

When that happens, LLMs will be able to remember everything.

smcleod · 5 months ago
I have never used an LLM for fictional writing, but I have been writing large amounts of code with them for years. What I'd recommend is when you're defining your plan up front as to the sections of the content, simply state in which phase / chapter of the content they should meet.

Planning generated content is often more important to invest in than the writing of it.

Looking at your paste, your prompt is short and basic; it should probably be broken up into clear, formatted sections (try directives inside XML-style tags). For such a large expected output, I'd expect a considerable prompt of rules and context setting (maybe a page or two).
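For instance, something along these lines (an illustrative structure only, not a tested prompt; all the tag names and contents are made up):

```xml
<role>You are drafting an epic fantasy novel.</role>
<style>Third-person limited; restrained prose; sparing adverbs.</style>
<pacing>The love interest must not appear before chapter 8.</pacing>
<characters>...profiles go here...</characters>
<task>Write chapter 1 only, roughly 3000 words.</task>
```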

pantsforbirds · 5 months ago
I had Grok summarize + evaluate the first chapter with thinking mode enabled. The output was actually pretty solid: https://pastebin.com/pLjHJF8E.

I wouldn't be surprised if someone figured out a solid mixture of models working as a writer (team of writers?) + editor(s) and managed to generate a full book from it.

Maybe some mixture of general outlining + maintaining a wiki with a basic writing and editing flow would be enough. I think you could probably find a way to maintain plot consistency, but I'm not so sure about maintaining writing style.

stephen_cagle · 5 months ago
Opening with "like a struck flint carried on a wind that wasn’t blowing." <chuckles>

I don't know why, but that is just such a literal thing to say that it seems almost random.

bwb · 5 months ago
why would you ever want to write a novel with AI, that is human stuff right? :)
qingcharles · 5 months ago
I'm terrible at writing, but I love reading. I've got ideas for novels, but I struggle to put them down.

What I have found that works is to give the LLM the "world" outline at the beginning and then just feed it one line summary of each chapter and get it to write a chapter at a time.

The problem is that the quality of results drastically decreases as the context length increases. After about 10 chapters the dialogue will start to get real snippy. I've tried getting it to summarize all the previous chapters and feed that back in, but it never includes enough detail.
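One partial workaround is to fold each new chapter into a bounded running summary instead of re-summarizing everything from scratch. Just a sketch of mine, assuming `summarize` stands in for an LLM call and the prompt wording is hypothetical:

```python
def rolling_context(chapters, summarize, budget=4000):
    """Maintain a bounded 'story so far' by folding each new chapter
    into the running summary, rather than summarizing all chapters anew.
    `summarize` stands in for an LLM call."""
    story_so_far = ""
    for chapter in chapters:
        story_so_far = summarize(
            f"Previous summary:\n{story_so_far}\n\nNew chapter:\n{chapter}\n\n"
            "Update the summary, keeping named characters, promises made, "
            "and unresolved plot threads."
        )[:budget]  # hard cap so the context fed back never grows unbounded
    return story_so_far
```

The explicit "keep characters, promises, unresolved threads" instruction is an attempt at the detail problem described above; whether it retains enough detail in practice is exactly the open question.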

og_kalu · 5 months ago
I don't want to write a novel with AI. I want to read them (when they're good enough) because I love reading. Sometimes I want to read something with a certain dynamic, and it gets difficult finding human-written recommendations.
harrison_clarke · 5 months ago
this seems like something that planning would fix. i wonder if that's how it's doing it

like, if it decides to <think> a table of contents, or chapter summaries, rather than just diving in at page 1

Workaccount2 · 5 months ago
Can you share it on a text sharing site? It seems you hit your share quota
dyauspitr · 5 months ago
That is mind blowing. To this fantasy reader that’s pure magic.
solardev · 5 months ago
19 pages?! Am I the only one who prefers an AI that jumps straight to the point?

- Buildup and happy background world-building

- Subtle foreshadowing

- Orcs attack

- Hero is saved by unlikely warrior of astounding beauty

- Evil is defeated until sales justify unnecessary sequel

That's the kind of story fit for the modern attention span...

malisper · 5 months ago
I've been using a math puzzle as a way to benchmark the different models. The math puzzle took me ~3 days to solve with a computer. A math major I know took about a day to solve it by hand.

Gemini 2.5 is the first model I tested that was able to solve it and it one-shotted it. I think it's not an exaggeration to say LLMs are now better than 95+% of the population at mathematical reasoning.

For those curious the riddle is: There's three people in a circle. Each person has a positive integer floating above their heads, such that each person can see the other two numbers but not his own. The sum of two of the numbers is equal to the third. The first person is asked for his number, and he says that he doesn't know. The second person is asked for his number, and he says that he doesn't know. The third person is asked for his number, and he says that he doesn't know. Then, the first person is asked for his number again, and he says: 65. What is the product of the three numbers?
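For what it's worth, the dialogue can also be brute-forced mechanically. A sketch (my own formulation, not necessarily how any model reasons about it), assuming the speakers are asked in order 1, 2, 3, 1 and that a speaker "knows" exactly when one candidate number is consistent with all earlier "don't know" answers:

```python
def knows(nums, turn):
    """Can the speaker at `turn` (people are asked 0, 1, 2, 0, ...)
    deduce their own number, given every earlier speaker passed?"""
    i = turn % 3
    a, b = (nums[j] for j in range(3) if j != i)
    # The hidden number is the sum of the other two, or (if nonzero)
    # their positive difference.
    candidates = {a + b} if a == b else {a + b, abs(a - b)}
    viable = [c for c in candidates
              if all(not knows(tuple(c if j == i else nums[j] for j in range(3)), t)
                     for t in range(turn))]
    return len(viable) == 1

def solutions(first=65, limit=200):
    """Triples matching the dialogue: three passes, then person 1 knows."""
    found = set()
    for b in range(1, limit):
        for c in (first - b, b - first, b + first):  # one number is the sum
            nums = (first, b, c)
            if c > 0 and not any(knows(nums, t) for t in range(3)) and knows(nums, 3):
                found.add(nums)
    return found
```

Under these assumptions the search turns up (65, 26, 39), i.e. product 65910.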

hmottestad · 5 months ago
This looks like it was posted on Reddit 10 years ago:

https://www.reddit.com/r/math/comments/32m611/logic_question...

So it’s likely that it’s part of the training data by now.

canucker2016 · 5 months ago
You'd think so, but both Google's AI Overview and Bing's Copilot output wrong answers.

Google spits out: "The product of the three numbers is 10,225 (65 * 20 * 8). The three numbers are 65, 20, and 8."

Whoa. Math is not AI's strong suit...

Bing spits out: "The solution to the three people in a circle puzzle is that all three people are wearing red hats."

Hats???

Same text was used for both prompts (all the text after 'For those curious the riddle is:' in the GP comment), so Bing just goes off the rails.

malisper · 5 months ago
Other models aren't able to solve it, so there's something else happening besides it being in the training data. You can also vary the problem, giving it a number like 85 instead of 65, and Gemini is still able to properly reason through it.
toonalfrink · 5 months ago
This whole answer hinges on knowing that 0 is not a positive integer, that's why I couldn't figure it out...
f1shy · 5 months ago
Thanks. I wanted to do exactly that: find the answer online. It is amazing that people (even on HN) think that LLMs can reason. It just regurgitates the input.
drexlspivey · 5 months ago
And if it wasn’t, it is now

_cs2017_ · 5 months ago
This is solvable in roughly half an hour with pen and paper by a random person I picked with no special math skills (beyond a university education). It is far from a difficult problem. The "95%+" in math reasoning is a meaningless standard; it's like saying a model is better than 99.9% of the world population at the Albanian language, since less than 0.1% bother to learn Albanian.

Even ignoring the fact that this or a similar problem may have appeared in the training data, it's something careful brute-force logic should solve. It's neither difficult, nor interesting, nor useful. Yes, it may suggest a slight improvement in basic logic, but no more so than a million other benchmarks people quote.

This goes to show that evaluating models is not a trivial problem. In fact, it's a hard problem (in particular, it's a far far harder than this math puzzle).

windowshopping · 5 months ago
The "random person" you picked is likely very, very intelligent and not at all a good random sample. I'm not saying this is difficult to the extent that it merits academic focus, but it is NOT a simple problem and I suspect less than 1% of the population could solve this in half an hour "with no special math skills." You have to be either exceedingly clever or trained in a certain type of reasoning or both.
wrasee · 5 months ago
So in the three hours between you reading the puzzle in the parent comment, you stopped what you were doing, managed to get some other "random" person to stop what they were doing and spend half an hour of their time on a maths puzzle that at that point prior experience suggested could take a day? All within three hours?

That's not to say that you didn't, or that you're recalling from a previous time you happened to see this exact puzzle (despite there being scant prior references to it, which is precisely the reason for using it). But you can see how some might find that not entirely credible.

Best guess: this random person is someone that really likes puzzles, is presumably good at them and is very, very far from being representative to the extent you would require to be in support of your argument.

Read: just a heavy flex about puzzle solving.

re-thc · 5 months ago
> This is solvable in roughly half an hour on pen and paper by a random person I picked with no special math skills (beyond a university).

I randomly answered this post and can't solve it in half an hour. Is the point LeetCode but for AI? I'd rather it solve real problems than "elite problems".

Side note: I couldn't even find pen and paper within half an hour.

sebzim4500 · 5 months ago
This is a great riddle. Unfortunately, I was easily able to find the exact question with a solution (albeit with a different number) online, thus it will have been in the training set.
Workaccount2 · 5 months ago
What makes this interesting is that while the question is online (on Reddit, from 10 years ago), other models don't get the answer right. Gemini also shows its work, and it seems to do a few orders of magnitude more calculating than the elegant answer given on Reddit.

Granted, this is all way over my head, but the solution Gemini comes to matches the one given on Reddit (and now here, in future training runs):

65×26×39=65910

varispeed · 5 months ago
Seems like we might need a section of internet that is off limits to robots.
beefnugs · 5 months ago
Why is this a great riddle? It sounds like incomplete nonsense to me:

It doesn't say anything about the skill levels of the participants, whether their answers are just guesses, or why they aren't simply guessing the sum of the other two numbers each time they're asked for more information.

It doesn't say the guy saying 65 is even correct.

How could three statements of "no new information" give information to the first guy, who didn't know the first time he was asked?

yifanl · 5 months ago
You'd have better results if you had prompted it with the actual answer and asked how the first person came to the conclusion. Giving a number in the training set is very easy.

i.e. You observe three people in a magical room. The first person is standing underneath a 65, the second person is standing underneath a 26 and the third person is standing underneath a 39. They can see the others numbers but not the one they are directly under. You tell them one of the three numbers is the sum of the other two and all numbers are positive integers. You ask the first person for their number, they respond that they don't know. You ask the second person for their number, they respond that they don't know. You ask the third person, they respond that they don't know. You ask the first person again and they respond with the correct value, how did they know?

And of course, if it responds with a verbatim answer in the line of https://www.reddit.com/r/math/comments/32m611/logic_question..., we can be pretty confident what's happening under the hood.

semiinfinitely · 5 months ago
I love how the entire comment section is getting one-shotted by your math riddle instead of the original post topic.
refulgentis · 5 months ago
In general I find commentary here too negative on AI, but I'm a bit squeamish about maximalist claims re: AI mathematical reasoning vs. human population based off this, even setting aside lottery-ticket-hypothesis-like concerns.

It's a common logic puzzle, Google can't turn up an exact match to the wording you have, but ex. here: https://www.futilitycloset.com/2018/03/03/three-hat-problem/

utopcell · 5 months ago
Same here: my problem of choice is the 100 prisoners problem [1]. I used to ask simple reasoning questions in the style of "what day is three days before the day after tomorrow?", but nowadays when I ask such questions, I can almost feel the NN giggling at the naivety of its human operator.

[1] https://en.wikipedia.org/wiki/100_prisoners_problem

r0fl · 5 months ago
Wow

Tried this in DeepSeek and Grok and it kept thinking in loops for a while, so I just turned it off.

I haven't seen a question loop this long, ever.

Very impressed

z2 · 5 months ago
Deepseek R1 got the right answer after a whopping ~10 minutes of thinking. I'm impressed and feel kind of dirty, I suspect my electricity use from this could have been put to better use baking a frozen pizza.
deepboy2 · 5 months ago
Just tried it on Deepseek (not R1, maybe V3-0324) and got the correct answer after 7-8 pages of reasoning. Incredible!
SwayStar123 · 5 months ago
You can also put the AI in the first person's shoes. Prompt: You are standing in a circle, there are 2 other people in the circle with you, everyone in the circle, has a positive integer above their head, no one knows what the number above their own head is but can see the numbers above the heads of the other people. You see that the person infront of you on the left has 26 above their head. The person on the right has 39 above their head. You are told that the sum of two of the numbers is the third number. You are asked what the number above your head is, the option is the sum, 65, or 13, as 26 + 13 = 39. You don't know which one it is, and you say so. The second person is asked the number above their head. They also say they dont know, the third person also says they dont know. What is your number?

Gemini 2.5 and claude 3.7 thinking get it right, o3 mini and 4o get it wrong

drewbeck · 5 months ago
I just asked it this twice and it gave me 65×65×130=549250. Both times. The first time I made it about ducks instead of people and mentioned that there was a thunderstorm. The second time I c/p your exact text and it gave me the same answer.

Again we find that the failure state of LLMs is a problem – yeah, when you know the answer already and it gets it right, that's impressive! When it fails, it still acts the same exact way and someone who doesn't already know the answer is now a lil stupider.

eru · 5 months ago
I use an algorithmic question that I'd been working on for years and that I'm finally writing up the answer to.

It's basically: given a sequence of heap operations (insert element, delete minimum element), can you predict the left-over elements (that are in the heap at the end) in linear time in the comparison model?

(The answer is surprisingly: Yes.)
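Not the linear-time answer, obviously, but for anyone who wants to play with the question, the naive O(n log n) baseline is just running the operations directly (a sketch using Python's heapq; the operation encoding is my own):

```python
import heapq

def leftovers(ops):
    """Run a sequence of ('insert', x) / ('delmin',) heap operations
    and return the elements remaining at the end, in sorted order.
    This is the O(n log n) baseline the question asks to beat."""
    heap = []
    for op in ops:
        if op[0] == 'insert':
            heapq.heappush(heap, op[1])
        else:
            heapq.heappop(heap)
    return sorted(heap)
```

The puzzle is whether the final contents (not the full pop order) can be predicted with only O(n) comparisons.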

integralof5y · 5 months ago
A Prolog program (SWI-Prolog) takes less than a second to solve your puzzle.

N is the number of turns of "don't know" answers; the bad predicate succeeds when the person could know their number by turn N.

  bad(_,_,_,-1) :- !,false.
  bad(_,A,A,0) :- !.
  bad(A,_,A,0) :- !.
  bad(A,A,_,0) :- !.
  bad(B,C,A,N) :- D is abs(B-A),D<C,N1 is N-1, bad(B,D,A,N1),!.
  bad(C,A,B,N) :- D is abs(B-A),D<C,N1 is N-1, bad(D,A,B,N1),!.
  bad(A,B,C,N) :- D is abs(B-A),D<C,N1 is N-1, bad(A,B,D,N1),!.
  
  solve(X,Y,Z) :- Y1 is X-1, between(1,Y1,Y),
                  between(0,2,N), Z is X-Y,bad(X,Y,Z,N).

  ?- solve(65,X,Y).
  X = 26,
  Y = 39 ;
  X = 39,  
  Y = 26 .

adpirz · 5 months ago
Interactive playground for the puzzle: https://claude.site/artifacts/832e77d7-5f46-477c-a411-bdad10...

(All state is stored in localStorage so you can come back to it :) ).

TrackerFF · 5 months ago
The riddle certainly nerd-sniped GPT 4.5

After a couple of minutes it decided on the answer being 65000. (S = {65, 40, 25})

dkjaudyeqooe · 5 months ago
> I think it's not an exaggeration to say LLMs are now better than 95+% of the population at mathematical reasoning.

It's not an exaggeration, it's a non sequitur: you first have to show that LLMs reason in the same way humans do.

bbstats · 5 months ago
Could you explain "The sum of two of the numbers is equal to the third"??
rappatic · 5 months ago
I think:

Call the three numbers a, b, and c. This means c = a + b, but we still don’t know to which person each number belongs.

When person 1 (p1) is asked what his number is, he has no way to know whether he has a, b, or c, so he says he doesn't know. Same goes for p2 and p3. Clearly p1 somehow gains information from p2 and p3 passing. Either he realizes that he must have a or b, and thus his number is the difference between p2's and p3's numbers, or he realizes that he must have c, and so his number is the sum of p2's and p3's numbers.

That’s all I have so far. Anyone have other ideas?

malisper · 5 months ago
if the three numbers are a, b, and c, then either a+b=c, a+c=b, or b+c=a
refulgentis · 5 months ago
Given #s x,y, and z, either x + y = z, x + z = y, or y + z = x.
mitko · 5 months ago
Loved that puzzle, thanks for sharing it. I’ve solved a lot of math problems in the past but this one had a unique flavor of interleaving logical reasoning, partial information and a little bit of arithmetic.
highfrequency · 5 months ago
Fun puzzle! I’m curious how you managed to structure the problem such that a computer could solve it but it took 3 days of computation?
hmottestad · 5 months ago
Is the answer somehow {65, 20, 45} with the product 58,500?

That’s one-shot for o1 pro.

sebzim4500 · 5 months ago
That's wrong. From player 1's perspective, {25, 20, 45} is entirely consistent with the calls made in the first three rounds.
g105b · 5 months ago
Please can you enlighten me? I'm a mathematical plebeian.
doener · 5 months ago
ChatGPT 4.5:

Let’s carefully analyze this puzzle step-by-step:

Step 1: Define the Problem Clearly

We have three people in a circle. Let’s call them:
• Person A (first person)
• Person B (second person)
• Person C (third person)

Each has a positive integer above their head. Each sees the other two numbers but not their own. The numbers satisfy:

A + B = C, or B + C = A, or C + A = B

Given the puzzle scenario, we need to find out the exact numbers given the clues, and ultimately find the product of the three numbers.

Step 2: Logic of the Puzzle (Step-by-Step Reasoning)

Initial conditions: The numbers are positive integers. The puzzle implies a situation of ambiguity at first, but after multiple rounds of “I don’t know,” Person A finally knows their number is 65.

Let’s denote the conditions clearly again. Exactly one of the following three equations is true:
• A + B = C, or
• B + C = A, or
• C + A = B.

Person A sees B and C, but not A. Similarly for Persons B and C.

Step 3: Logical deductions from the repeated “I don’t know”

First round deductions:
• Person A initially doesn’t know their number. This means, from Person A’s perspective, both scenarios A = B + C and A + B = C (or A + C = B) are still possible. If Person A had seen identical numbers, for instance, Person A could have instantly known their number. Thus, Person A sees two different numbers.
• Similarly, Person B doesn’t know either, implying that Person B also sees ambiguity in their own number. Thus, Person B also sees two numbers that could result in at least two possible solutions.
• Person C also doesn’t know, same logic as above.

Thus, after one full round, we understand that initially, all three numbers must create ambiguity. No trivial case (like two identical numbers) is present.

Second round deductions:

After hearing that no one knows their number, each person gains information: they now know the scenario is one in which none of the other two people could have immediately solved it.

Crucially, in the second round, Person A now knows their number is 65 after considering the previous statements. This implies that Person A sees numbers B and C, and from these numbers plus the statements of uncertainty, Person A deduces that their own number can only be 65.

Step 4: Deriving the Numbers (Critical deduction)

Let’s denote clearly again: Person A’s number is now known to be 65.

We have three possible equations. Let’s test each carefully given A = 65:

1. Case 1: A + B = C. Then 65 + B = C, so Person A sees B and 65 + B. Seeing these two numbers, they might initially consider their own number to be either the sum or the difference of them. However, if Person A sees two numbers where one is exactly 2× the other, Person A could immediately guess their number is the difference (since if you see X and 2X, your number is clearly X). So the scenario of B and 65 + B works only if the two numbers don’t exhibit this immediately trivial case. Let’s hold this scenario and examine the others first.

2. Case 2: B + C = A (so B + C = 65). Then Person A sees numbers B and C adding exactly to 65. Could Person A know their number immediately at first glance? No, since their number could be either the sum (65) or the difference of the two numbers. The initial ambiguity makes sense. Persons B and C also see ambiguous scenarios and don’t know their numbers in the first round either. This scenario is plausible.

3. Case 3: C + A = B (so C + 65 = B). Then Person A sees B = C + 65 and C, i.e. C and C + 65. Similar reasoning as Case 1.

Step 5: Identifying the Correct Scenario

The puzzle’s logic typically aligns best with Case 2: B + C = A = 65.

Here’s why, clearly stated:
• Initially, Person A sees two numbers adding to exactly 65. At first glance, Person A’s number could be either 65 (the sum) or the difference between the two numbers.
• Person B also initially sees two numbers and cannot immediately distinguish their own number, nor can Person C.
• After everyone states “I don’t know,” Person A gains critical insight: if the difference between the two numbers that A sees had allowed Person B or Person C to know their number immediately, they’d have known right away. The fact that B and C didn’t know means the difference is not immediately obvious to them. Thus, the difference cannot be trivial (such as identical numbers or easy multiples).

Therefore, the two numbers must be such that their difference doesn’t immediately clue in the other persons. This occurs most clearly with pairs of numbers which have multiple factorization possibilities or aren’t trivial multiples.

But we must carefully test quickly to find a suitable, plausible integer pair whose sum is 65 and whose ambiguity would logically survive through multiple rounds. The puzzle is a classic logic puzzle with a known structure:

A common scenario in these classic puzzles is the use of Fibonacci-like or similar “golden” sets. Let’s quickly check the following well-known ambiguous triple:
• Numbers: 65, 39, 26
• Check quickly: indeed, 39 + 26 = 65.

Checking logic quickly:
• Person A sees 39 and 26. Initially ambiguous (number could be 65 or 13).
• Person B sees 65 and 26. Could be either 39 or 91 at first glance. Initially ambiguous.
• Person C sees 65 and 39. Could be either 26 or 104 initially. Initially ambiguous.

After each person says they don’t know:
• Person A considers: If my number was 13, then Person B would see numbers 13 and 26. Person B would then immediately know their number is 39 (13 + 26 = 39), no ambiguity. But Person B didn’t know! So Person A’s number cannot be 13. Hence, Person A confidently concludes their number is 65.

Thus, the numbers 65, 39, and 26 satisfy the puzzle conditions perfectly!

Step 6: Final Answer (Product of the Numbers)

The three numbers identified are:
• A = 65
• B = 39
• C = 26

Then, the product is: 65 × 39 × 26 = 65 × 1,014 = 65,910

Final Answer:

65,910

WithinReason · 5 months ago
This was not logic, this was recall from the training set disguised as logic:

Let’s quickly check the following well-known ambiguous triple: • Numbers: 65, 39, 26

simonw · 5 months ago
I'm impressed by this one. I tried it on audio transcription with timestamps and speaker identification (over a 10 minute MP3) and drawing bounding boxes around creatures in a complex photograph and it did extremely well on both of those.

Plus it drew me a very decent pelican riding a bicycle.

Notes here: https://simonwillison.net/2025/Mar/25/gemini/

jillesvangurp · 5 months ago
Have you considered that they must be training on images of pelicans riding bicycles at this point ;-). Given how often that comes up in your reviews, a smart LLM engineer might put their finger on the scale a bit and optimize for the things that come up a lot in reviews of their work.
redox99 · 5 months ago
Claude's pelican is way better than Gemini's
jonomacd · 5 months ago
I'm not so sure. I've run it a bunch of times. It makes a great pelican.

Personally I'm convinced this model is the best out there right now.

https://www.reddit.com/r/Bard/comments/1jjobaz/pelican_on_a_...

ggeorgovassilis · 5 months ago
I've been following your blog for a while now, great stuff!
kridsdale3 · 5 months ago
I just tried your trademark benchmark on the new 4o Image Output, though it's not the same test:

https://imgur.com/a/xuPn8Yq

jonomacd · 5 months ago
And the same thing with gemini 2.0 flash native image output.

https://imgur.com/a/V4YAkX5

It's sort of irrelevant though as the test is about SVGs.

Unroasted6154 · 5 months ago
Was that an actual SVG?
freediver · 5 months ago
Tops our benchmark in an unprecedented way.

https://help.kagi.com/kagi/ai/llm-benchmark.html

High quality, to the point. Bit on the slow side. Indeed a very strong model.

Google is back in the game big time.

aoeusnth1 · 5 months ago
It should be in the "reasoning" category, right? (still topping the charts there)
causal · 5 months ago
Remarkable how few tokens it needed to get a much better score than other reasoning models. Any chance of contamination?
85392_school · 5 months ago
It makes me wonder how the token counting was implemented and if it missed the (not sent in API) reasoning.
utopcell · 5 months ago
That is some wide gap!
anotherpaulg · 5 months ago
Gemini 2.5 Pro set the SOTA on the aider polyglot coding leaderboard [0] with a score of 73%.

This is well ahead of thinking/reasoning models. A huge jump from prior Gemini models. The first Gemini model to effectively use efficient diff-like editing formats.

[0] https://aider.chat/docs/leaderboards/
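For context, aider's diff-like edit format asks the model to emit search/replace blocks instead of rewriting whole files, which is exactly where weaker models tend to slip. Roughly like this (an illustrative sketch of the shape, not an exact spec; the file and code are made up):

```
path/to/file.py
<<<<<<< SEARCH
def total(xs):
    return sum(xs)
=======
def total(xs):
    # guard against None entries
    return sum(x for x in xs if x is not None)
>>>>>>> REPLACE
```

The SEARCH text must match the file verbatim, so a model that paraphrases or drops whitespace fails the edit even when its code change is right.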

aoeusnth1 · 5 months ago
Am I correct in assuming that accuracy < using correct edit format? i.e. it made mistakes in 27% of the problems, 11% of which were due to (at least) messing up the diff format?

In which case, google should be working on achieving better output format following, as Claude and R1 are able to hit nearly 100% accuracy on the format.

anotherpaulg · 5 months ago
It does have fairly low adherence to the edit format, compared to the other frontier models. But it is much better than any previous Gemini model in this regard.

Aider automatically asks models to retry malformed edits, so it recovers. And goes on to produce a SOTA score.

sagarpatil · 5 months ago
The only benchmark I care about. Thanks!
Oras · 5 months ago
These announcements have started to look like a template.

- Our state-of-the-art model.

- Benchmarks comparing to X,Y,Z.

- "Better" reasoning.

It might be an excellent model, but reading the exact text repeatedly is taking the excitement away.

devsda · 5 months ago
Reminds me of how nobody is too excited about flagship mobile launches anymore. For some time now, most flagships have been incremental updates over the previous generation and only marginally better. Couple that with Chinese OEMs launching better or good-enough devices at a lower price point, and new launches from the established players aren't noteworthy anymore.

It's interesting how the recent AI announcements are following the same trend over a smaller timeframe.

breppp · 5 months ago
I think the greatest issue with buying a new phone today is, ironically, the seamless migration.

Once you get all your apps, wallpaper, shortcut order and the same OS, you very quickly get the feeling you spent $1,000 on the exact same thing.

flakiness · 5 months ago
Phones are limited by hardware manufacturing, plus maybe the annual shopping cycle peaking at Christmas. People wouldn't have bought multiple iPhones a year even in the product's heyday.

These LLMs were supposedly limited by the training run, but the point-version models are mostly post-training driven, which seems to take less time.

If models were tied to specific hardware (say, an "AI PC" or whatever), the cycle would slow down and we'd get a slower summer, which I'm secretly wishing for.

tibbar · 5 months ago
For me, the most exciting part is the improved long-context performance. A lot of enterprise/RAG applications rely on synthesizing a bunch of possibly relevant data. Let's just say it's clearly a bottleneck in current models and I would expect to see a meaningful % improvement in various internal applications if long-context reasoning is up. Gemini was already one of my favorite models for this usecase.

So, I think these results are very interesting, if you know what features specifically you are using.

zwaps · 5 months ago
But they score it on their own benchmark, on which, coincidentally, Gemini models were always the only good ones. On NoLiMa or BABILong we see that Gemini models still can't do long context.

Excited to see if it works this time.

bhouston · 5 months ago
> It might be an excellent model, but reading the exact text repeatedly is taking the excitement away.

This is the commodification of models. There is nothing special about the new models except that they perform better on the benchmarks.

They are all interchangeable. This is great for users as it adds to price pressure.

flir · 5 months ago
Man, I hope those benchmarks actually measure something.
Workaccount2 · 5 months ago
Sooner or later someone is going to find "secret sauce" that provides a step-up in capability, and it will be closely guarded by whoever finds it.

As big players look to start monetizing, they are going to desperately be searching for moats.

cadamsdotcom · 5 months ago
Why not snooze the news for a year and see what’s been invented when you get back. That’ll blow your mind properly. Because each of these incremental announcements contributes to a mind blowing rate of improvement.

The rate of announcements is a sign that models are increasing in ability at an amazing rate, and the content is broadly the same because they’re fungible commodities.

The latter, that models are fungible commodities, is what’s driving this explosion and leading to intense competition that benefits us all.

diego_sandoval · 5 months ago
I take this as a good thing, because they're beating each other every few weeks and using benchmarks as evidence.

If these companies start failing to beat the competition, then we should prepare ourselves for very creative writing in the announcements.

gtirloni · 5 months ago
The improvements have been marginal at best. I wouldn't call that beating.
ototot · 5 months ago
Maybe they just asked Gemini 2.5 to write the announcement.
cpeterso · 5 months ago
And it was trained on the previous announcements.
schainks · 5 months ago
I wish I wish I wish Google put better marketing into these releases. I've moved entire workflows to Gemini because it's just _way_ better than what openai has to offer, especially for the money.

Also, I think google's winning the race on actually integrating the AI to do useful things. The agent demo from OpenAI is interesting, but frankly, I don't care to watch the machine use my computer. A real virtual assistant can browse the web headless and pick flights or food for me. That's the real workflow unlock, IMO.

throwaway2037 · 5 months ago

    > I've moved entire workflows to Gemini because it's just _way_ better than what openai has to offer, especially for the money.
This is useful feedback. I'm not here to shill for OpenAI, nor Google/Gemini, but can you share a concrete example? It would be interesting to hear more about your use case. More abstractly: do you think these "moved entire workflows" offset a full worker, or X% of a full worker? I am curious to see how and when we will see low-end/junior knowledge workers displaced by solid LLMs. Listening to the Oxide and Friends podcast, I learned that they make pretty regular use of LLMs to create graphs using gnuplot. To paraphrase, they said "it is like having a good intern".

cratermoon · 5 months ago
Glaringly missing from the announcements: concrete use cases and products.

The Achilles heel of LLMs is the distinct lack of practical real-world applications. Yes, Google and Microsoft have been shoving the tech into everything they can fit, but that doesn't a product make.

throwaway2037 · 5 months ago
I would say Adobe is doing an excellent job of commercialising image manipulation and generation using LLMs. When I see adverts for their new features, they seem genuinely useful for normie users who are trying to edit some family/holiday photos.
kiratp · 5 months ago
https://www.osmos.io/fabric

Practical, real-world application.

sebzim4500 · 5 months ago
ChatGPT has like 500M weekly active users, what are you on about?
taf2 · 5 months ago
We have incrementally improved: 1% better than we were yesterday. Our competition is 1 day behind us now.
throwaway2037 · 5 months ago
Like! No trolling: This could be a sarcastic comment written by an LLM!
andai · 5 months ago
Well hey, OpenAI did the exact opposite, and nobody liked that either.
exitb · 5 months ago
I think people were fine with OpenAI demos. They were less fine with not actually ever releasing the demoed tech.
andai · 5 months ago
To clarify, by "doing the opposite" I mean OpenAI releasing GPT-4.5, a non-reasoning model that does worse on benchmarks (but supposed to be qualitatively better). People shit on OpenAI hard for doing that.
emptysongglass · 5 months ago
I liked their announcements and demos and continue to like them.
Oras · 5 months ago
How did you measure “nobody” liked OpenAI announcements?
kh_hk · 5 months ago
Was going to comment the same thing; this has been bugging me lately about all the announcements that start with "our" followed by empty superlatives. Happy to not be alone on this!
sagarpatil · 5 months ago
AI labs, it seems, use a template for system cards as well. OpenAI stands out because they showcase their employees using their tools for various use cases, which is refreshing.
Mistletoe · 5 months ago
I’m sure the AI helps write the announcements.
belter · 5 months ago
Cancelled my account a long time ago. Gemini models are like a McDonald's croissant: you always give them an extra chance, but they always fall apart in your hands...
greatgib · 5 months ago
If you plan to use Gemini, be warned, here are the usual Big Tech dragons:

   Please don’t enter ...confidential info or any data... you wouldn’t want a reviewer to see or Google to use ...
The full extract of the terms of usage:

   How human reviewers improve Google AI

   To help with quality and improve our products (such as the generative machine-learning models that power Gemini Apps), human reviewers (including third parties) read, annotate, and process your Gemini Apps conversations. We take steps to protect your privacy as part of this process. This includes disconnecting your conversations with Gemini Apps from your Google Account before reviewers see or annotate them. Please don’t enter confidential information in your conversations or any data you wouldn’t want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.

cavisne · 5 months ago
Google is the best of these. You either pay per token and there is no training on your inputs, or it’s free/a small monthly fee and there is training.
greatgib · 5 months ago
And even worse:

   Conversations that have been reviewed or annotated by human reviewers (and related data like your language, device type, location info, or feedback) are not deleted when you delete your Gemini Apps activity because they are kept separately and are not connected to your Google Account. Instead, they are retained for up to three years.
Emphasis on "retained for up to three years" even if you delete it!!

kccqzy · 5 months ago
Well they can't delete a user's Gemini conversations because they don't know which user a particular conversation comes from.

This seems better, not worse, than keeping the user-conversation mapping so that the user may delete their conversations.

mastodon_acc · 5 months ago
How does it compare to OpenAI and anthropic’s user data retention policy?
KoolKat23 · 5 months ago
I don't think this is the same as the AI studio and API terms. This looks like your consumer facing Gemini T&C's.
summerlight · 5 months ago
You can use a paid tier to avoid such issues. Not sure what you're expecting from "experimental" models, which are in development and need user feedback.
sauwan · 5 months ago
I'm assuming this is true of all experimental models? That's not true with their models if you're on a paid tier though, correct?
suyash · 5 months ago
More of a reason for new privacy guidelines specially for big tech and AI
mastodon_acc · 5 months ago
I mean this is pretty standard for online llms. What is Gemini doing here that openai or Anthropic aren’t already doing?
mindwok · 5 months ago
Just adding to the praise: I have a little test case I've used lately which was to identify the cause of a bug in a Dart library I was encountering by providing the LLM with the entire codebase and description of the bug. It's about 360,000 tokens.

I tried it a month ago on all the major frontier models and none of them correctly identified the fix. This is the first model to identify it correctly.

weatherlite · 5 months ago
360k tokens = how many lines of code, approximately? Also, if it's an open-source lib, are you sure there are no mentions of this bug anywhere on the web?
mindwok · 5 months ago
Not a huge library, around 32K LoC and no mention of the bug on the web - I was the first to encounter it (it’s since been fixed) unless the training data is super recent.
kungfufrog · 5 months ago
How do you use the model so quickly? Google AI Studio? Maybe I've missed how powerful that is... I didn't see any easy way to pass it a whole code base!
mindwok · 5 months ago
Yep! AI Studio is, I think, the only way you can actually use it right now, and AFAIK it's free.
markdog12 · 5 months ago
Interesting, I've been asking it to generate some Dart code, and it makes tons of mistakes, including lots of invalid code (static errors). When pointing out the mistakes, it thanks me and tells me it won't make it again, then makes it again on the very next prompt.
blinding-streak · 5 months ago
Open the pod bay doors Hal.

I'm sorry Dave, I'm afraid I can't do that.

ripped_britches · 5 months ago
Wow holy smokes that is exciting
nmfisher · 5 months ago
How long did it take to sift through those?