rfurmani (u/rfurmani)

rfurmani commented on Gemini with Deep Think achieves gold-medal standard at the IMO deepmind.google/discover/... · Posted by u/meetpateltech

dvh · 5 months ago

Ok, but when this is reported by mass media, which never use SI units and instead use units like Libraries of Congress or elephants, what kind of unit should the media use to compare the computational energy of AI vs children?

rfurmani · 5 months ago

Dollars of compute at market rate is what I'd like to see, to check whether calling this tool would cost $100 or $100,000

rfurmani commented on Gemini with Deep Think achieves gold-medal standard at the IMO deepmind.google/discover/... · Posted by u/meetpateltech

modeless · 5 months ago

> AlphaGeometry and AlphaProof required experts to first translate problems from natural language into domain-specific languages, such as Lean, and vice-versa for the proofs. It also took two to three days of computation. This year, our advanced Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions

So, the problem wasn't translated to Lean first. But did the model use Lean, or internet search, or a calculator or Python or any other tool during its internal thinking process? OpenAI said theirs didn't, and I'm not sure if this is exactly the same claim. More clarity on this point would be nice.

I would also love to know the rough order of magnitude of the amount of computation used by both systems, measured in dollars. Being able to do it at all is of course impressive, but not useful yet if the price is outrageous. In the absence of disclosure I'm going to assume the price is, in fact, outrageous.

Edit: "No tool use, no internet access" confirmed: https://x.com/FredZhang0/status/1947364744412758305

rfurmani · 5 months ago

Sounds like it did not:

> This year, our advanced Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions – all within the 4.5-hour competition time limit

rfurmani commented on Solving LinkedIn Queens Using Haskell imiron.io/post/linkedin-q... · Posted by u/agnishom

dazed_confused · 6 months ago

Hmm, an interesting pattern is that every queen is a knight's move away. I haven't thought about this problem since I started dabbling in chess, but now it looks like a simple pattern.

rfurmani · 6 months ago

Only for that particular board. In general it will be very complex and depend on the shape of the colored regions.
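
A minimal sketch of why the knight's-move pattern is not forced, assuming the usual LinkedIn Queens rules (one queen per row and per column, no two queens touching, even diagonally). The function names and the two sample placements are hypothetical, the reading of "a knight's move away" as "every queen has some other queen a knight's move from it" is an assumption, and the colored-region rule is left out because it depends on the specific board.

```python
def no_touching(cols):
    """cols[r] is the queen's column in row r (one queen per row).

    Checks distinct columns and that queens in adjacent rows are at
    least two columns apart, so no two queens touch, even diagonally.
    """
    distinct = len(set(cols)) == len(cols)
    apart = all(abs(a - b) >= 2 for a, b in zip(cols, cols[1:]))
    return distinct and apart


def has_knight_neighbor(cols, r):
    """True if some other queen is a knight's move from the queen in row r."""
    return any(
        {abs(r - r2), abs(cols[r] - c2)} == {1, 2}
        for r2, c2 in enumerate(cols)
        if r2 != r
    )


def every_queen_has_knight_neighbor(cols):
    return all(has_knight_neighbor(cols, r) for r in range(len(cols)))


# Hypothetical placements, ignoring colored regions.
knight_like = [1, 3, 0, 2]        # 4x4: every queen has a knight neighbor
other = [0, 3, 5, 1, 4, 2]        # 6x6: the queen in row 0 has none

for placement in (knight_like, other):
    print(placement, no_touching(placement),
          every_queen_has_knight_neighbor(placement))
```

Both placements satisfy the row, column, and no-touching rules, but only the first has the knight's-move property, so the pattern cannot be relied on in general. Which placement is the actual solution is then decided by the colored regions.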

rfurmani commented on Waymo rides cost more than Uber or Lyft and people are paying anyway techcrunch.com/2025/06/12... · Posted by u/achristmascarl

harmmonica · 6 months ago

As a Waymo-booster on HN for a while now, here's my latest anecdote. I tried to figure out how to take Waymo to LAX even though it's not actually in their territory yet just because I value the experience so much. I was borderline going to take it within walking distance (about half a mile), but got lazy at the last minute. I took Lyft instead, and, as if the universe cursed my laziness, I booked a "comfort" car for $3 more than the base level Lyft. At first I was going to get a Tesla Model Y to take me, but that cancelled. Instead, what must have been a first generation Honda Pilot picked me up, suspension creaking and muffler that had seen better days. Did Lyft recognize what they sent instead of the "comfort" they promised and therefore charge me $3 less? Of course not. When I tried to contact customer service I ran into what I'm sure plenty of HN people have, which is a dead end where you report the issue and they (programmatically?) adjudicate the complaint on the spot. Their determination? I wasn't entitled to a $3 refund. Ironic that the rideshare app with human drivers doesn't allow me to contact their customer service whereas Waymo has no problem with it (yeah, yeah, I get it, "we'll see once they reach a huge scale." But today the experience is so much better than Uber or Lyft that while it lasts I will bask in its driverless glory).

rfurmani · 6 months ago

I've had a couple of bad experiences with Lyft recently, including one time when the driver must have clicked that they had picked me up while a block away, because I could see the Lyft driving to the destination without me. I tried to get a refund since I was obviously waiting at my start location the whole time, but the system claimed the drive went from start to finish (even though I wasn't in the car), so no refund.

rfurmani commented on The End of Sierra as We Knew It, Part 1: The Acquisition filfre.net/2025/04/the-en... · Posted by u/cybersoyuz

vunderba · 8 months ago

Sierra was responsible for creating two of my favorite games of all time - King's Quest VI (designed by Roberta Williams / Jane Jensen) and Conquests of the Longbow (designed by Christy Marx).

It's such a contrast then to read quotes like this (which I find profoundly distasteful) from the other side of the company. Ken Williams: "I read books about business executives who owned yachts and jets, and who hung out with beautiful models in fancy mansions. I knew that was my future and I couldn’t wait to claim it."

It's a tragedy that Ken Williams managed to overrule nearly everyone familiar with Sierra (including his wife) who opposed the acquisition by CUC.

https://en.wikipedia.org/wiki/CUC_International

rfurmani · 8 months ago

Completely agree on both counts! I loved those two games and felt Conquests of the Longbow didn't get the recognition it deserves.

On the second point, when I read his book (https://kensbook.com/) I was disappointed not to hear about the magic of the games themselves and the creative process behind them. It became clear that his primary goal was to grow a business. He thought being a game distributor was more exciting, but was then disrupted by Steam, shareware, and online distribution.

rfurmani commented on PaperBench openai.com/index/paperben... · Posted by u/meetpateltech

amelius · 9 months ago

One thing I'd be interested in is a UI for reading papers with AI assistance.

rfurmani · 9 months ago

I'm building such tools at https://sugaku.net. Right now there's chatting with a paper and browsing similar papers. Generally arXiv and other repositories want you to link to them and not embed their papers, which makes it hard to build inline reading tools, but it's on my roadmap to support that for uploaded papers. Would love to hear if you have any feature requests there.