This also wouldn't even be a close contest; Pluribus demonstrated a solid win rate against professional players in its evaluation.
As I was developing this project, one question kept coming to mind: how does cost versus performance compare between a purpose-built AI like Pluribus and a general-purpose LLM? I believe Pluribus's training cost was only around $144 in cloud computing credits.
Your idea of passing the game state in real time and having the LLM generate a chain of thought even when action is not on it is interesting. I'd be curious to see whether it would result in improved play.
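To make that idea concrete, here's a minimal sketch of what the prompt construction could look like. Everything here is hypothetical (the `GameEvent` class and `build_prompt` function are made-up names for illustration): the hand history streams in event by event, and each event triggers a prompt that either asks for an action (when it's the hero's turn) or just asks the model to update its reads.

```python
# Hypothetical sketch: stream hand events to an LLM and request a
# chain of thought on every event, even when action is not on the model.
# GameEvent and build_prompt are invented names, not from any real API.

from dataclasses import dataclass

@dataclass
class GameEvent:
    street: str       # "preflop", "flop", "turn", "river"
    actor: str        # seat label, e.g. "BTN"
    action: str       # e.g. "raise 2.5bb"
    hero_to_act: bool  # True when it's the LLM's turn to act

def build_prompt(history: list[GameEvent], hero_hand: str) -> str:
    """Turn the running hand history into a prompt. The model is asked
    to reason on every event, but only outputs an action when it's
    actually the hero's turn."""
    lines = [f"You hold {hero_hand}."]
    for ev in history:
        lines.append(f"[{ev.street}] {ev.actor}: {ev.action}")
    if history[-1].hero_to_act:
        lines.append("Think step by step, then state your action.")
    else:
        lines.append("Action is not on you; update your read on each "
                     "opponent's range and summarize it briefly.")
    return "\n".join(lines)

history = [
    GameEvent("preflop", "BTN", "raise 2.5bb", hero_to_act=False),
    GameEvent("preflop", "BB (you)", "to act", hero_to_act=True),
]
prompt = build_prompt(history, "Ah Kd")
# `prompt` would be sent to the LLM each time a new event arrives
```

The point of the off-turn prompts is that the model's accumulated reasoning (range reads, board texture notes) could be carried forward in context, so its in-turn decisions aren't made from a cold start.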
These LLMs are playing better than most human players I encounter (at low limits).
They're kinda bad, but not as criminally bad as the humans.
Post-flop, on the other hand, their play is all over the place...