This also wouldn't even be a close contest; Pluribus demonstrated a solid win rate against professional players in its evaluation.
As I was developing this project, one question kept coming to mind: how does cost versus performance compare between a purpose-built AI like Pluribus and a general-purpose LLM? I believe Pluribus's training cost was only around $144 in cloud computing credits.
Your idea of passing the game state in real time and having the LLM generate a chain of thought even when action is not on it is interesting. I'd be curious to see whether it would result in improved play.
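To make that idea concrete, here's a minimal sketch of what the prompt construction could look like. Everything here is hypothetical (the `GameEvent` class and `build_prompt` function are made-up names for illustration): the hand history streams in event by event, and each event triggers a prompt that either asks for an action (when it's the hero's turn) or just asks the model to update its reads.

```python
# Hypothetical sketch: stream hand events to an LLM and request a
# chain of thought on every event, even when action is not on the model.
# GameEvent and build_prompt are invented names, not from any real API.

from dataclasses import dataclass

@dataclass
class GameEvent:
    street: str       # "preflop", "flop", "turn", "river"
    actor: str        # seat label, e.g. "BTN"
    action: str       # e.g. "raise 2.5bb"
    hero_to_act: bool  # True when it's the LLM's turn to act

def build_prompt(history: list[GameEvent], hero_hand: str) -> str:
    """Turn the running hand history into a prompt. The model is asked
    to reason on every event, but only outputs an action when it's
    actually the hero's turn."""
    lines = [f"You hold {hero_hand}."]
    for ev in history:
        lines.append(f"[{ev.street}] {ev.actor}: {ev.action}")
    if history[-1].hero_to_act:
        lines.append("Think step by step, then state your action.")
    else:
        lines.append("Action is not on you; update your read on each "
                     "opponent's range and summarize it briefly.")
    return "\n".join(lines)

history = [
    GameEvent("preflop", "BTN", "raise 2.5bb", hero_to_act=False),
    GameEvent("preflop", "BB (you)", "to act", hero_to_act=True),
]
prompt = build_prompt(history, "Ah Kd")
# `prompt` would be sent to the LLM each time a new event arrives
```

The point of the off-turn prompts is that the model's accumulated reasoning (range reads, board texture notes) could be carried forward in context, so its in-turn decisions aren't made from a cold start.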
These LLMs are playing better than most human players I encounter (at low limits).
They're kinda bad, but not as criminally bad as the humans.
Post-flop, on the other hand, their play is all over the place...