One of the researchers, Tuomas Sandholm, has a real badass CV. Former pilot in the Finnish air force. Finnish windsurfing champion. Snowboarder. Professor at Carnegie Mellon. Speaks four European languages, including Swedish. And now, at the age of 51, he has created the best AI-powered poker bot.
> speaking four languages is pretty normal in Europe
Northern Europe, maybe. French people for instance tend to suck at foreign languages. We rarely go beyond 3 languages (French, English, then German or Spanish. The last two are often forgotten after school.)
> speaking four languages is pretty normal in Europe
Clearly we have different experiences (Swedish person currently living in Spain), but I haven't met that many people who speak four languages and are from a European country (though I haven't been to eastern Europe yet).
That Finns speak Swedish is a special case though; AFAIK they learn Swedish in school, and being a Finland-Swede is a thing too.
Getting in touch with two foreign languages in school is not uncommon, but speaking up to four (including your mother tongue) with any sort of sophistication definitely is not normal, at least in western Europe.
No it's not, what are you talking about? I've met thousands of young Europeans and ones that speak 4 languages are extremely rare. Unless they're from countries where they get 2 languages "for free" like Holland/Belgium/Switzerland. Definitely not "pretty normal".
French people can usually speak basic English, and a third language is common if that person has ties with another country but that's it. At school, we are normally taught two foreign languages. The first one is usually English, few people actually practice their second one.
The situation is completely different in Scandinavian countries. And it is indeed quite normal to speak 4 languages in Finland (usually Finnish, Swedish, English and a 4th one, often German). Because their native language is only spoken by a few, foreign languages are a necessity for international relationships. And as a Finnish friend told me, learning new languages is a popular way to pass time during long winter nights.
If you want to keep your conversation private it is not enough to choose a rare language in Berlin. There is always somebody who understands what you are saying.
>> Pluribus is also unusual because it costs far less to train and run than other recent AI systems for benchmark games. Some experts in the field have worried that future AI research will be dominated by large teams with access to millions of dollars in computing resources. We believe Pluribus is powerful evidence that novel approaches that require only modest resources can drive cutting-edge AI research.
That's the best part in all of this. I'm not convinced by the claim the authors repeatedly make that this technique will translate well to real-world problems. But I'm hoping there will be more of this kind of result, signalling a shift away from Big Data and huge compute and towards well-designed and efficient algorithms.
In fact, I kind of expect it. The harder it gets to do the kind of machine learning that only large groups like DeepMind and OpenAI can do, the more smaller teams will push the other way and find ways to keep making progress cheaply and efficiently.
Yes! I work for a company that does just this: pull big gears on limited data and try to generalise across groups of things to get intelligent results even on small data. In many ways, it absolutely feels like the future.
It's easy to "take away" too much from this. The takeaway is that an AI poker bot did this, without reading too much into adjacent subjects.
But what's the fun in that?
10,000 hands is an interesting number. If you search the poker forums, it's the number people throw out for how many hands you need to see before you can analyze your play. You then make adjustments and see another 10,000 hands before you can assess those changes.
In 2019, live poker is an impractically slow place for a competitive player to adapt: an online grinder can see 10,000 hands within a day, while the live poker room took 12 days. Another characteristic of online poker is that players can also use data to their advantage.
So, I wouldn't consider 10K hands long term, even if it was spread over 12 days. Once players get a chance to adapt, they'll increase their win rate against a bot. Once hand histories start being shared, it's all over. And again, give these players their own software tools.
Remember that one of the most exciting events in online poker was the run of isildur1. That run was put to rest when he went bust against players who had studied thousands of his hand histories.
This doesn't take away from the development of the bot. If we learn something from it, then all good.
> 10,000 hands is an interesting number. If you search the poker forums, it's the number people throw out for how many hands you need to see before you can analyze your play. You then make adjustments and see another 10,000 hands before you can assess those changes.
If you read the paper or the Facebook post (https://ai.facebook.com/blog/pluribus-first-ai-to-beat-pros-...), you'll see they address this (no idea why this worse article is the link here).
>Although poker is a game of skill, there is an extremely large luck component as well. It is common for top professionals to lose money even over the course of 10,000 hands of poker simply because of bad luck. To reduce the role of luck, we used a version of the AIVAT variance reduction algorithm, which applies a baseline estimate of the value of each situation to reduce variance while still keeping the samples unbiased. For example, if the bot is dealt a really strong hand, AIVAT will subtract a baseline value from its winnings to counter the good luck. This adjustment allowed us to achieve statistically significant results with roughly 10x fewer hands than would normally be needed.
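For intuition, here's a toy control-variate sketch of the idea in that quote. This is not the actual AIVAT algorithm: the per-hand numbers are invented, and the baseline here captures the luck term perfectly, whereas a real baseline is only an estimate.

```python
import random

# Toy baseline-subtraction variance reduction, in the spirit of AIVAT.
# The correction term has zero mean, so the estimator stays unbiased.

def simulate_hand(rng):
    luck = rng.gauss(0, 100)   # card luck: huge variance, zero mean
    skill = 5.0                # the true per-hand edge we want to measure
    return luck + skill, luck  # (observed winnings, baseline estimate)

rng = random.Random(0)
raw, adjusted = [], []
for _ in range(10_000):
    won, baseline = simulate_hand(rng)
    raw.append(won)
    adjusted.append(won - baseline)  # counter the good/bad luck

mean = lambda xs: sum(xs) / len(xs)
print(f"raw estimate:      {mean(raw):6.2f}")     # noisy
print(f"adjusted estimate: {mean(adjusted):6.2f}")  # exactly 5.00 in this toy
```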
>Remember that one of the most exciting events in online poker was the run of isildur1. That run was put to rest when he went bust against players who had studied thousands of his hand histories.
Perhaps more famously, Jungleman compiled hand histories from many different people while he was playing Tom Dwan in the 'durrrr' challenge (which I guess technically isn't over....)
You clearly didn’t read the additional links they posted. They mentioned why they chose 10k (AIVAT), and it goes far beyond any of the variables you mentioned.
That really doesn't address the point that was raised. It's not that the bot wins through luck and that 10k is too small a sample, it's that a good professional poker player isn't good over 10k hands, they're good over 5 years.
Any good player will have their play analyzed and exploited, and will have to re-adjust their strategy in response to that exploitative play, so there's a feedback loop there. The question is: how does the AI strategy adapt over time to players who know its hand history? That's an extremely important part of being a top-level player. To give an example: in Daniel Negreanu's vlog about his time at the WSOP, he talks about changing his strategy in response to his analysis of different players' profiles. This is especially important in Sit & Gos, where at high stakes you'll meet regular grinders who build up reputations, and less so in tournaments, where you're less likely to meet any given player.
What took you so long? I mean not the Pluribus team specifically, but Poker AI researchers in general.
The desire to master this sort of game has inspired the development of entire branches of mathematics. Computers are better at maths than humans. They're less prone to hazardous cognitive biases (gambler's fallacy etc.) and can put on an excellent poker face.
As a layperson who's rather ignorant about both no-limit Texas hold 'em and applicable AI techniques, my intuition would tell me that super-human hold 'em should have been achieved before super-human Go. Apparently your software requires way less CPU power than AlphaGo/AlphaZero, which seems to support my hypothesis. What am I missing?
Bonus questions in case you have the time and inclination to oblige:
What does this mean for people who like to play on-line Poker for real money?
Could you recommend some literature (white papers/books/lecture series/whatever) to someone interested in writing an AI (running on potato-grade hardware) for a niche "draft and pass" card game (e.g. Sushi Go!) as a recreational programming exercise?
I think it took the community a while to come up with the right algorithms. So much of early AI research was focused on beating humans at chess and later Go. But those techniques don't directly carry over to an imperfect-information game like poker. The challenge of hidden information was kind of neglected by the AI community. This line of research really has its origins in the game theory community actually (which is why the notation is completely different from reinforcement learning).
Fortunately, these techniques now work really really well for poker. It's now quite inexpensive to make a superhuman poker bot.
OP touched on this, but while that's true for perfect-information games, it is not necessarily true or straightforward for games with hidden information like poker. This is more of a game-theoretic problem (economics) than a purely mathematical one, and it had less support in the AI/ML community, hence the delay.
The lower CPU/GPU/resource use supports that, as does your intuition. Breaking poker required a lot of manual work and model design rather than brute-force algorithms and reinforcement learning.
The bot does not seem to consider previous hands in its decisions. That is to say, it does not consider who it is playing against. Should this affect how we perceive the bot as “strategic” or not? Bots that play purely mathematically optimally on expected value aren’t effective or interesting. But it feels like this is playing on just a much higher order expected value.
It feels like a more down-to-earth version of the sci-fi superhuman running impossible differential equations to predict exactly what you will do, given that he knows what you know that he knows... etc., ad infinitum. But since it doesn't actually consider the person it's predicting, it may simply be a really, really good approximation of the game-theoretic dominant strategy.
At what complexity of game and hidden information should we feel like the bot can’t win by running a lookup table?
The bot bluffs, and understands that when its opponent bets it might be a bluff. I would consider that to be strategic behavior. The fact that its strategy is determined by a mathematical process doesn't change that in my opinion.
>> Bots that play purely mathematically optimally on expected value aren’t effective or interesting.
Interesting is up to you, but effective is definitely wrong.
ICM-perfect bots, which don't take opponent behavior into account and merely model the game state, crush small tournaments. The faster the blinds and the smaller the stacks, the better, but even normal structures get killed by these so-called "expected value"-only bots.
Game Theory Optimal (GTO) attacks are incredibly effective at all levels of the game. The AI need not incorporate opponent feedback to be a winner. It can make it better, but it is not at all required.
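For readers unfamiliar with ICM: it converts chip stacks into prize-pool equity by expanding finishing-order probabilities from chip counts (the Malmuth-Harville calculation). A minimal sketch, with made-up stacks and payouts:

```python
# Independent Chip Model: P(player i finishes 1st) = stack_i / total chips;
# later places are filled in recursively among the remaining players.

def icm_equities(stacks, payouts):
    n = len(stacks)
    equity = [0.0] * n

    def recurse(remaining, place, prob):
        if place >= len(payouts) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            p = prob * stacks[i] / total        # P(i takes this place)
            equity[i] += p * payouts[place]
            recurse([j for j in remaining if j != i], place + 1, p)

    recurse(list(range(n)), 0, 1.0)
    return equity

# Three players left, payouts 50/30/20: the chip leader holds half the
# chips but gets well under half the prize pool (~38.4, not 50).
print(icm_equities([5000, 3000, 2000], [50, 30, 20]))
```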
First of all, I laughed at the 20-second average per game in self-play, since I ran into the same thing and have been trying to speed up the algorithm but haven't been able to get it faster (without throwing more hardware at it).
Second, I haven't read everything, but I believe you are playing a cash-game and not tournament-style. Is that correct? If that is the case, any chance you will be doing a tourney-style version?
[For those who don't play, in cash, a dollar is a dollar. In Tourney play, the top 2 or 3 players get paid out, so all dollars are not equal, as your strategy changes when you have only a few chips left (avoid risky bets that would knock you out) or when you are chip leader (take risky bets when they are cheap to push around your opponents).]
Also, curious how much poker you folks play in the lab for "research".
We're doing cash games in this experiment. At the end of the day, this is about advancing AI, not about making a poker bot. Going from two-player to multi-player has important implications for AI beyond just poker. I don't think the same is true for cash game vs tournament.
There's a cash game almost every night at the FBNY office! I don't usually play though -- I'm not nearly as good as the bot.
How do you think these same pros would do in a follow-up match? As described in the article, the bot put players off their game with much more varied betting and with donk bets. Do you think the margin would decrease as players are exposed to these strategies?
Players face mental fatigue, and they have so over-learned their existing strategies that it takes time to adopt new ones, and even more time for those new strategies to become second nature.
It reminds me of sports in a way. Teams start running a new wrinkle of offense in the NFL like the wildcat and it takes a few seasons for teams to instinctively know how to play defense correctly against that option.
In the paper we include a graph of performance over the course of the 10,000-hand 5 humans + 1 AI experiment that was played over 12 days. There's no indication that the bot's performance decreased over time (there is a temporary downward blip in the middle, but that's likely just variance). Based on discussions with pros, it sounds like they didn't find any weaknesses and they didn't seem to think they'd find any given more time.
I also suspect it would not be able to maintain a ~40bb/100 win rate. The thing about human players is that while the best are capable of learning and employing truly balanced GTO strategies, in practice they rarely adhere to them, because other humans (even good pros) will still have exploitable flaws in their strategies, and exploiting those flaws is more profitable than sticking to the unexploitable strategy. Of course, that also opens the exploiter to counter-exploitation, creating a fluctuating cycle of players trying to exploit, getting exploited, then moving back towards playing unexploitably. That's the normal state of a pro's strategy in a given game, so switching to a steady state of always playing unexploitably would be a fairly big adjustment even for the top-tier pros capable of it.
I remember reading in the mid-to-late aughts that a lot of old-school poker players that used more swagger and intuition were starting to be run out of the game by kids who applied statistical methods.
Could you perhaps speak to some of the engineering details that the paper glosses over? E.g.:
- Are the action and information abstraction procedures hand-engineered or learned in some manner?
- How does it decide how many bets to consider in a particular situation?
- Is there anything interesting going on with how the strategy is compressed in memory?
- How do you decide in the first betting round if a bet is far enough off-tree that online search is needed?
- When searching beyond leaf nodes, how did you choose how far to bias the strategies toward calling, raising, and folding?
- After it calculates how it would act with every possible hand, how does it use that to balance its strategy while taking into account the hand it is actually holding?
- In general, how much do these kind of engineering details and hyperparameters matter to your results and to the efficiency of training? How much time did you spend on this? Roughly how many lines of code are important for making this work?
- Why does this training method work so well on CPUs vs GPUs? Do you think there are any lessons here that might improve training efficiency for 2-player perfect-information systems such as AlphaZero?
We tried to make the paper as accessible as possible. A lot of these questions are covered in the supplementary material (along with pseudocode).
- Are the action and information abstraction procedures hand-engineered or learned in some manner?
- How does it decide how many bets to consider in a particular situation?
The information abstraction is determined by k-means clustering on certain features. There wasn't much thought put into the action abstraction because it turns out the exact sizes you use don't matter that much as long as the bot has enough options to choose from. We basically just did 0.25x pot, 0.5x pot, 1x pot, etc. The number of sizes varied depending on the situation.
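A rough sketch of what those two abstractions could look like. The scalar "equity" feature, the bucket count, and the bet fractions below are stand-ins for illustration, not the paper's actual choices (the supplementary material has those):

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: (p - centers[c]) ** 2)
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Information abstraction: cluster hands by a feature (here a fake
# "equity" score) so strategically similar hands share one bucket.
rng = random.Random(1)
hand_equities = [rng.random() for _ in range(1000)]
print(kmeans_1d(hand_equities, k=8))

# Action abstraction: a menu of pot-fraction bet sizes, as described above.
def bet_sizes(pot, fractions=(0.25, 0.5, 1.0, 2.0)):
    return [round(pot * f) for f in fractions]

print(bet_sizes(pot=300))  # [75, 150, 300, 600]
```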
- Is there anything interesting going on with how the strategy is compressed in memory?
Nope.
- How do you decide in the first betting round if a bet is far enough off-tree that online search is needed?
We set a threshold at $100.
- When searching beyond leaf nodes, how did you choose how far to bias the strategies toward calling, raising, and folding?
In each case, we multiplied the biased action's probability by a factor of 5 and renormalized. In theory it doesn't really matter what the factor is.
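In code, that biasing step is just reweight-and-renormalize (the action names and probabilities here are made up):

```python
# Multiply the biased action's probability by a factor, then renormalize
# so the result is a probability distribution again.
def bias(strategy, action, factor=5.0):
    scaled = {a: p * (factor if a == action else 1.0)
              for a, p in strategy.items()}
    total = sum(scaled.values())
    return {a: p / total for a, p in scaled.items()}

print(bias({"fold": 0.2, "call": 0.5, "raise": 0.3}, "raise"))
# raise: 1.5 / (0.2 + 0.5 + 1.5) ~= 0.68
```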
- After it calculates how it would act with every possible hand, how does it use that to balance its strategy while taking into account the hand it is actually holding?
This comes out naturally from our use of Linear Counterfactual Regret Minimization in the search space. It's covered in more detail in the supplementary material.
- In general, how much do these kind of engineering details and hyperparameters matter to your results and to the efficiency of training? How much time did you spend on this? Roughly how many lines of code are important for making this work?
I think it's all pretty robust to the choice of parameters, but we didn't do extensive testing to see. While these bots are quite easy to train, the variance is so high in poker that getting meaningful experimental results is still quite computationally expensive.
- Why does this training method work so well on CPUs vs GPUs? Do you think there are any lessons here that might improve training efficiency for 2-player perfect-information systems such as AlphaZero?
I think the key is that the search algorithm is picking up so much of the slack that we don't really need to train an amazing precomputed strategy. If we weren't using search, it would probably be infeasible to generate a strong 6-player poker AI. Search was also critical for previous AI benchmark victories like chess and Go.
I don't think the poker world would be happy with us if we did that. Heads-up limit hold'em isn't really played professionally anymore, but six-player no-limit hold'em is very popular.
In your Science paper, you mention playing 1H-5AI against 2 human players: Chris Ferguson and Darren Elias. In your blog post you also mention playing 1H-5AI against Linus Loeliger, who was within standard error of even money. Why did Linus not make it into the Science paper?
That took place after the final version of the Science paper was submitted. It would have been nice to include but it takes a while to do those experiments and we didn't feel it was worth delaying the publication process for it.
The article makes it sound like the AI is trained by evaluating results of decisions it makes on a per-hand basis. Is there any sense in which the AI learns about strategies that depend upon multiple hands? I’m thinking of bluffing/detecting bluffs and identifying recent patterns, which is something human poker players talk about.
Was Judea Pearl's work relevant for the counterfactual regret minimization, or is there some other basis? I've added CFR to the list of things to look into later, but skimming the paper it was exciting to think advances are being made using causal theory...
The CFR algorithm is actually somewhat similar to Q-learning, but the connection is difficult to see because the algorithms came out of different communities, so the notation is all different.
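The connection is easier to see in code. The inner loop of CFR is regret matching: play each action in proportion to its accumulated positive regret. Here is a single-decision toy on rock-paper-scissors (not Pluribus's Linear CFR), where self-play converges toward the 1/3-1/3-1/3 equilibrium:

```python
import random

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    return 0 if a == b else (1 if BEATS[a] == b else -1)

def strategy(regrets):
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / 3] * 3

def train(iters=20_000, seed=0):
    rng = random.Random(seed)
    regrets = [0.0, 0.0, 0.0]
    strategy_sum = [0.0, 0.0, 0.0]
    for _ in range(iters):
        sigma = strategy(regrets)
        opp = rng.choices(ACTIONS, weights=sigma)[0]  # self-play opponent
        me = rng.choices(ACTIONS, weights=sigma)[0]
        for i, alt in enumerate(ACTIONS):  # regret: how much better was alt?
            regrets[i] += payoff(alt, opp) - payoff(me, opp)
        for i in range(3):
            strategy_sum[i] += sigma[i]
    return [round(s / iters, 3) for s in strategy_sum]  # average strategy

print(train())  # roughly [0.333, 0.333, 0.333]
```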
Who were the pros? Are they credible endbosses? Seth Davies works at RIO, which deserves respect, but I've never heard of the others except Chris Ferguson, who I doubt is a very good player by today's standards (or human being, for that matter). Meanwhile, I do know the likes of LLinusLove (iirc, the king of 6-max), Polk, and Phil Galfond.
Is 10,000 hands really considered a good enough sample? Most people consider 100k hands with a 4bb win rate to be an acceptable sample, other math aside. And as you and your opponent approach equal skill, variance increases to the point where regs refuse to sit each other.
What? The pros chosen were definitely highly skilled players. They're fairly well known in the online poker community.
Furthermore, Chris Ferguson, scumbag aside, is absolutely still a very good player by today's standards, and one way above the mean participant in a research experiment.
10,000 hands is an effective enough sample at a certain win rate and analysis of variance of play; the n-value alone is not enough to tell you if it was enough hands.
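To make that concrete, a quick back-of-the-envelope. The ~10 bb per-hand standard deviation is a commonly cited ballpark for no-limit hold'em, not a figure from the paper:

```python
import math

# Hands needed before a win rate is statistically distinguishable from zero.
def hands_needed(win_rate_bb_per_hand, sd_bb_per_hand, z=1.96):
    return math.ceil((z * sd_bb_per_hand / win_rate_bb_per_hand) ** 2)

wr = 4.8 / 100   # the bot's 4.8 bb/100, expressed per hand
sd = 10.0        # ~100 bb per 100 hands, a rough ballpark

print(hands_needed(wr, sd))                  # ~167,000 raw hands
print(hands_needed(wr, sd / math.sqrt(10)))  # ~16,700 after a ~10x variance cut
```

That factor-of-ten drop is why the AIVAT-style variance reduction mentioned elsewhere in the thread makes a 10,000-hand sample workable.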
They're credible enough. I'd like the sample sizes to be bigger as well but they're enough to verify that even if the bot got lucky over the sample size, it's close enough that it doesn't really matter. Add a bit more compute, optimize some algorithms a little, and you'd make up the difference. The real point is that they have a technique that scales to 6-max, and whether it's 97% or 99% is kind of immaterial in the grand scheme of things.
FWIW, they did some variance reduction techniques that dramatically reduce the number of hands needed to be confident in your results, so the number of hands may be bigger than you think. e.g. the results of 10k HU hands have much higher variance than the results of 10k HU hands where everyone just collects their EV once they're all in.
Jimmy Chou, Jason Les, Dong Kim are affiliated with Doug Polk.
It is an interesting point that these are pros but their specialities are either tournament or heads up. The current 6 max pros are LLinusLove, Otb_RedBaron, TrueTeller.
I'm very late to this post, so not sure if you're still around.
What are your thoughts on a poker tournament for bots? Do you think it could turn into a successful product? I've always wanted to build an online poker/chess game that was designed from the ground up for bots (everything being accessible through an API), but have always worried that someone with more computational resources or the best bot would win consistently. Is it an idea you've thought about?
I have a few basic questions. I would like to implement my own generic game bot (discrete states). Are there any universal approaches? Is MCMC sampling good enough to start? My initial idea was to do importance sampling on some utility/score function.
Also, I am looking into poker game solvers - what would be a good place to start? What's the simplest algorithm?
A little bit of both. We didn't think we needed the extra computing power. And we really wanted to convey how cheap it is to make a superstrong poker AI with these latest algorithms.
Knowing when to bluff often depends on the psychology of the opponent, but since it trained playing itself it doesn't seem that knowing when to bluff would be learned. Did it bluff very often?
The bot does bluff, and in fact it learns from self-play that bluffing is (sometimes) the optimal thing to do. At the end of the day, bluffing is simply betting when you have a weak hand. The bot learns from experience that when it bets with a weak hand, the opponent (another copy of itself) sometimes folds and it makes more money than if it hadn't bet. The bot doesn't view it as deceptive or dishonest. It just views it as the action that makes it the most money.
Of course, a key part of bluffing is getting the probabilities right. You can't always bluff and you can't never bluff, because that would make you too predictable. But our self-play and search algorithms are designed to get those probabilities right.
At the highest levels of play psychological factors are pretty minimal. Before a showdown which cards you actually hold aren't particularly material, as the only information you convey is through your bids. This means if you predict that you're more likely to win a hand by bidding (and inducing a fold) than by calling and going to a showdown it makes mathematical sense to "bluff". I'm sure AIs have no trouble learning that fact.
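The arithmetic behind that point, with made-up numbers: a bluff is profitable whenever the fold equity outweighs what you lose when called.

```python
# EV of betting `bet` into `pot` when the opponent folds with prob p_fold.
def ev_bluff(pot, bet, p_fold, p_win_when_called=0.0):
    win_called = p_win_when_called * (pot + bet)
    lose_called = (1 - p_win_when_called) * bet
    return p_fold * pot + (1 - p_fold) * (win_called - lose_called)

# Betting 50 into a 100 pot with no showdown value:
print(ev_bluff(pot=100, bet=50, p_fold=0.40))  # +10.0: profitable bluff
print(ev_bluff(pot=100, bet=50, p_fold=0.25))  # -12.5: unprofitable
# Breakeven fold probability is bet / (pot + bet) = 1/3 here.
```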
Are there any ethical considerations relating to the prospect of use of this bot for cheating in real-money games? Either from your internal team or after public replication?
We're really focused on advancing the fundamental AI aspect. We're not here to kill poker. The popular poker sites have quite sophisticated anti-bot measures, but it's true that this is an arms race.
There are no ethical reasons why a game like poker must exist. In fact, poker gives a false sense of hope to the thousands of gambling addicts who enter casinos. It is a fun game, but there is an unlimited number of potentially fun games.
Very impressive. If my understanding of how the AI works is correct, it is using a pre-computed strategy developed by playing trillions of hands, but it is not dynamically updating that during game play, nor building any kind of profiles of opponents. I wonder if by playing against it many times, human opponents could discern any tendencies they could exploit. Especially if the pre-computed strategy remains static.
We played 10,000 hands of poker over the course of 12 days in the 5 humans + 1 AI experiment, and 5,000 hands per player in the 1 human + 5 AI's experiment. That's a good amount of time for a player to find a weakness in the system. There's no indication that any of the players found any weaknesses.
In fact, the methods we use are designed from the ground up to minimize exploitability. That's a really important property to have for an AI system that is actually deployed in the real world.
A hearty congratulations, Noam, on finishing another chapter of the story i opened in the early 1990s...
Another person asked "What took you so long?", and i had the same question. :) I really thought this milestone would be achieved fairly soon after i left the field in 2007. However, breakthroughs require a researcher with the right amount of reflectiveness, insight, and determination. Well done.
Thanks! I think going beyond two-player/team zero-sum games is really important. This was a first step, but it's definitely not the last. I'm hoping to continue in this direction, and maybe start looking at interactions involving the potential for cooperation in addition to competition.
I haven't finished digging through the paper and the supplement yet, but I'm curious about how many hands were multiway to the flop (and whether the percentages differ significantly between 1H/5AI and 5H/1AI). I'd guess that it's a pretty small fraction of the total hands, and I'm wondering what the performance is like in those particular cases.
I don't have the exact percentages but I think it's less than 10%. It's not really possible to measure the bot's performance just in specific situations, but my feeling is the bot performs relatively well in these situations. Multi-way flops were basically impossible to handle in a reasonable amount of time for past AIs. Our new search techniques make these situations feasible to figure out in seconds.
What table information does the bot take into account? Position? Other player's stack size?
> Regardless of which hand Pluribus is actually holding, it will first calculate how it would act with every possible hand.
Is this information used to form an idea of what other players might be holding based on how the other player acts and how closely that action matches Pluribus's 'what if' action?
No, it's to mask actions. If you bet big with monsters and check with air 100% of the time, your opponent knows when to fold and bet.
iirc, the frequency of bets in that spot is roughly equivalent to the frequency of times you're definitely in front of your opponent in that particular spot, but not always with the hands that are beating your opponent.
The concept is called Game Theory Optimal (GTO) and it's pretty popular in higher stakes games.
We talk about this a bit in the paper. Based on the feedback from the pros, the bot seems to "donk bet" (call and then bet on the next round) much more than human pros do. It also randomizes between multiple bet sizes, including very large bet sizes, while humans stick to just one or two sizes depending on the situation.
Noam - super interesting stuff. Couple of questions:
1) What were the reasons for choosing 6-handed play (assuming logistical and costs)? It would be interesting to see how the bot’s strategy would differ in a full ring game.
2) Are there any plans to commercialize the bot as a tool for training human players?
1) The goal was to show convincingly that we could handle multi-player poker. The exact number of players was kind of arbitrary. We chose six-player because that's the most common/popular format. Considering training the 6-player bot would cost less than $150 on a cloud computing service, I think it's safe to say these techniques would all work fine in other formats.
2) I'm quite happy working on fundamental AI research and plan to continue in that direction.
> Is the bot going for game-theory-optimal play, or trying to exploit weaknesses in other players?

It's going for game-theory-optimal play. It doesn't adapt to its opponents' observed weaknesses. But I think it's cool to show that you don't need to adapt to opponent weaknesses to win at poker at the highest levels. You just need to not have any weaknesses yourself.
Congratulations on the win! Can you recommend any papers, blog(post)s, or books for the interested layman? (I am currently scanning though the facebook post, which is great, but personally I am looking for something more technical).
Very interesting results. From the paper it sounds like the algorithms you used are very similar to Libratus (pre-solved blueprint + subgame solving). What change made it so that the computation requirement is much lower now?
There were several improvements but the most important was the depth-limited search. Libratus would always search to the end of the game. But that's not necessarily feasible in a game as complex as six-player poker. With these new algorithms, we don't need to go to the end of the game. Instead, we can stop at some arbitrary depth limit (as is done in chess and Go AIs). That drastically reduces the amount of compute needed.
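As a toy illustration of the idea in its perfect-information form (chess/Go style): recurse to a fixed depth, then substitute an estimated value instead of playing to the end. Pluribus's imperfect-information version is subtler, valuing leaves by letting each player choose among continuation strategies, but the compute saving comes from the same cutoff.

```python
# Depth-limited minimax on Nim: remove 1-3 stones, taking the last stone
# wins. At the depth limit we return a crude leaf estimate (0 = "unclear"),
# a stand-in for a learned value function.

def search(pile, depth, maximizing):
    if pile == 0:  # previous mover took the last stone and won
        return -1 if maximizing else 1
    if depth == 0:
        return 0.0  # leaf estimate instead of searching to the end
    values = [search(pile - take, depth - 1, not maximizing)
              for take in (1, 2, 3) if take <= pile]
    return max(values) if maximizing else min(values)

print(search(pile=8, depth=99, maximizing=True))  # exact value: -1 (a loss)
print(search(pile=8, depth=3, maximizing=True))   # cheap estimate: 0.0
```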
Can you share more details about the abstraction? The paper is kind of vague on it. How does it decide if it should use 1 or 14 bet values? Is it a perfect recall abstraction? How many information sets are there?
It is in a way disappointing that this question gets so little attention, and yet it might be the most significant. If a bot can false-card, that is, if it can discern the strategy that the opponents have in mind and deliberately mislead them to its own advantage, then we have a real-world AI. However, the skills of computer bridge programs remain at club level.
Interesting that the conventional wisdom of never open limping emerged as confirmed through self-play. What other general poker “best practices” were either confirmed or upended through this research?
For someone not in the AI field, can you explain why AI is needed and why elaborate code with conditional blocks is not enough? Where does AI fit in with a poker game?
Conditional blocks would work, but it would be an impossibly detailed and granular tree to set up. The AI component simply helps you arrive at the decisions that would otherwise require that complex tree.
This is super interesting! What steps would you recommend a professional poker player take in order to use AI to improve his/her personal poker skills?
It doesn't exploit its opponents' weaknesses. Its focus was on not having any weaknesses that its opponents could exploit. However, the algorithms are not guaranteed to converge to a Nash equilibrium in this setting because it's not a two-player zero-sum game (and in either case, it's not clear that playing a Nash equilibrium would provide much benefit in this setting).
There was real money at stake in this experiment. The pros were guaranteed $0.40 per hand just for participating, but that could increase to $1.60 per hand depending on how well they did.
To answer your question, no, I don't think human players would play at their best when not playing for actual money.
We played 10,000 hands of poker in the 5 humans + 1 AI experiment. The number of hands won isn't a useful metric in poker. If you win only 10% of your hands and make $1,000 on those hands, while losing only $1 on the other 90% of hands, then you're a winning player. The bot won at a rate of 4.8 bb/100 ($4.80 per hand if the blinds are $50/$100). This is considered a large win rate by professionals.
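Spelling out the conversion:

```python
# bb/100 -> dollars per hand, at the $50/$100 blinds used in the experiment.
win_rate_bb_per_100 = 4.8
big_blind_dollars = 100

per_hand = win_rate_bb_per_100 / 100 * big_blind_dollars
print(per_hand)           # $4.80 won per hand on average
print(per_hand * 10_000)  # ~$48,000 at that rate over the 10,000 hands
```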
They're in the extra data section of the Science article. The formatting is terrible for importing into hand-history viewers, so I'm trying to get a friend to re-format them.
Honestly, probably debugging. Training this thing is very cheap, but the variance in poker is huge (even with the best variance-reduction techniques) so it takes a very long time to tell whether one version is better than another version (or better than a human).
The number of players is kind of arbitrary given the techniques we're using. We chose 6 because that's the most popular/common format for poker. I don't think there's any scientific value in also doing 10.
Our goal is to make the research as accessible as possible to the AI community, so we include descriptions of the algorithms and pseudocode in the supplementary material. However, in part due to the potential negative impact this code could have on online poker, we're not releasing the code itself.
This is fascinating stuff. So do I understand this right: Libratus worked by computing the Nash equilibrium, while the new multiplayer version works using self-play like AlphaGo Zero? Did you run the multiplayer version against the two-player version? If yes, how did it go? Could you recommend a series of books/papers that can take me from zero to being able to reprogram this (I know programming and mathematics, but not much statistics)? And how much computing resources/time did it take to train your bot?
Training was super cheap. It would cost under $150 on cloud computing services.
The training aspect has some improvements but is at its core similar to Libratus. The search algorithm is the biggest difference.
There aren't that many great resources out there for helping new people get caught up to speed on this area. That's something we hope to fix in the future. Maybe this would be a good place to start? http://modelai.gettysburg.edu/2013/cfr/cfr.pdf
So let me see if I understand this. I don't believe it's hard to write a probabilistic program to play poker. That's enough to win against humans in 2-player.
With one AI and multiple professional human players sitting at a physical table, the humans outperform the probabilistic model because they take advantage of each other's mistakes/styles. Some players crash out faster but the winner gets ahead of the safe probabilistic style of play.
So this bot is better at the current professional player meta than the current players. In a 1v1 against a probabilistic model, it would probably also lose?
Am I understanding this properly? Or is playing the probabilistic model directly enough of a tell that it's also losing strategy? Meaning you need some variation of strategies, strategy detection, or knowledge of the meta to win?
Interesting article. Too bad I don't have a subscription to read the paper.
The bot played like 10,000 hands. There is no way that is enough to prove it's better or worse than the opponents.
More so in no-limit, where some key all-ins can turn the game upside down. The variance is higher than limit or fixed, right?
I did a heads-up Texas hold'em fixed-limit bot with counterfactual regret minimization about 8 years ago, from a paper I read. It had to play like 100,000 hands vs a crappy reference bot to prove it was better.
Strategy detection in such short games is probably worthless.
The edge is probably in seeing who is tired or drunk in live poker.
They mention that they use AIVAT to reduce variance.
> Although poker is a game of skill, there is an extremely large luck component as well. It is common for top professionals to lose money even over the course of 10,000 hands of poker simply because of bad luck. To reduce the role of luck, we used a version of the AIVAT[1] variance reduction algorithm, which applies a baseline estimate of the value of each situation to reduce variance while still keeping the samples unbiased. For example, if the bot is dealt a really strong hand, AIVAT will subtract a baseline value from its winnings to counter the good luck. This adjustment allowed us to achieve statistically significant results with roughly 10x fewer hands than would normally be needed.
Hi Noam: I'm intrigued that you trained/tested the bot against strategies that were skewed to raise a lot, fold a lot and check a lot, as well as something resembling GTO. Were there any kinds of table situations where the bot had a harder time making money? Or where the AI crushed it?
I'm thinking in particular of unbalanced tables with an ever-changing mixture of TAG and LAG play. I've changed my mind three times about whether that's humans' best refuge -- or a situation that's a bot's dream.
With the advent of AI bots in Poker, Chess etc., what happens to the old adage of "Play the player, not the game". How do modern human players manage when you don't have the psychological aspects of the game to work with?
I see on chess channels that grand masters have to rethink their whole game preparation methodology to cope with the "Alpha Zero" oddities that have now been introduced into this ancient game. They literally have to "throw out the book" of standard openings and middle games and start afresh.
The chess channels you're visiting are grossly overstating Alpha Zero's impact. AFAICT, it hasn't made any impact on opening theory at all. AZ's strength is in the middlegame, where it appears to be slightly better than traditional engines (like Stockfish) at finding material sacrifices for long term piece activity and/or mating attacks.
> what happens to the old adage of "Play the player, not the game". How do modern human players manage when you don't have the psychological aspects of the game to work with?
I would say that in poker it has thoroughly flipped to "play the game, not the player", and this isn't because of super bots like the one used in this paper.
Ever since game theory invaded poker, players who play in highly visible events such as TV tournaments try as hard as possible to make their game unexploitable.
As already stated, saying that Alpha Zero has forced the chess world to seriously reconsider the basic principles of chess openings etc. is a bit of a stretch. But interestingly enough, the current world champion (Magnus Carlsen) is having the chess streak of his life as we speak. On the side, he's been openly joking about Alpha Zero being one of his biggest chess idols. It's safe to say the streak is probably mostly related to his preparation from the last world championship match half a year ago carrying over to all the tournaments after.
However, even according to the former world champion (Viswanathan Anand) the run he's been on is something quite shocking: “His results this year is simply [great].... difficult to find words. [It’s been] completely off the charts. I think the chess world is still in a bit of a shock. The rest of the players are struggling to deal with a phenomenon [like him]. Even in 2012-13, his domination was less than it is this year. Everyone is still processing this information.” [1]
Carlsen is basically en route to breaking 2900 Elo - at 2882 Elo with a clear upwards trend - while only two other active players are even above 2800 Elo, and they are struggling to keep it above that threshold. (Elo is the rating system used in chess. Above 1500 Elo is an average player, 2000 Elo is a good player, 2500 Elo is a grandmaster. Anything above 2700 Elo is basically godlike.)
Oddly enough, instead of playing more like a machine, it seems like Carlsen has been playing chess that is much more about the human aspect of the game rather than trying to find the top ranked engine move on every turn. (The current traditional top engine - Stockfish - makes an assumption of each move's validity using a point system, which the chess world has been more or less obsessing over for the past decade. Alpha Zero doesn't have such a point system whatsoever.) He's been playing a drastically more aggressive and dynamic variety of chess compared to what has been seen in a long time at the top tournaments.
He's been playing to create dizzying positions on the board, making a few moves that aren't necessarily liked by the traditional top engines, but still finding himself in a winning position several moves after. It definitely looks like some sort of black magic, but it seems like the big thing Alpha Zero has brought to the general philosophy on how to approach chess at the top level is that it's possible to play aggressive chess, take risks and win in 2019. Magnus Carlsen is the first player to successfully reinvent that style of play, more than likely partly inspired by Alpha Zero. So, I'd say the big thing about Alpha Zero isn't necessarily that it could beat the other top engines, but more importantly that the 'artistic' aspect of its play is something that has never been seen from another chess engine. The fact that it proved that sort of style superior to the play ever before played by another chess engine is just the icing on the cake.
Garry Kasparov on Alpha Zero's chess persona: "I admit that I was pleased to see that AlphaZero had a dynamic, open style like my own. The conventional wisdom was that machines would approach perfection with endless dry maneuvering, usually leading to drawn games. But in my observation, AlphaZero prioritizes piece activity over material, preferring positions that to my eye looked risky and aggressive." [2]
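As an aside on the Elo numbers above, the rating gap maps to an expected score via the standard logistic formula; the 2782-rated opponent below is a made-up example:

```python
# Standard Elo expected-score formula.
def elo_expected(rating_a, rating_b):
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

# 2882 vs. 2782: ~0.64 expected points per game, so even a 100-point
# gap is a big practical edge at the top.
print(round(elo_expected(2882, 2782), 2))
```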
https://www.cs.cmu.edu/~sandholm/cv.pdf
> French people for instance tend to suck at foreign languages.

I suspect Spain and Italy are similar.
Average number of languages spoken: France 1.8, Germany 2.0, Spain 1.7, Portugal 1.6, Italy 1.8, Greece 1.8, Poland 1.8, Sweden 2.5, Finland 2.6, UK 1.6.
I've rarely met someone who could speak four languages fluently.
It's not normal.
Judging by his name, I'd assume Swedish is his first language, so that particular aspect isn't that surprising to me
It's not uncommon to speak four languages (often at C2 level in a couple of them) in northern Europe, especially the Baltic region.
As mentioned by a sibling comment (sakarisson), that particular part is not impressive; the rest, sure.
Science article: https://science.sciencemag.org/content/early/2019/07/10/scie...
For any number of hands, my money is on the bot.
> the top 2 or 3 players get paid out

Or top 2 or 3 thousand... depends on the tournament, but it's usually the top 15% ish.
That said, it did train by playing against itself (before the experiment against the humans began).
> Is 10,000 hands really considered a good enough sample?
We used AIVAT to reduce variance, which reduces the number of samples we need by roughly a factor of 10: https://poker.cs.ualberta.ca/publications/aaai18-burch-aivat...
Furthermore, Chris Ferguson, scumbag aside, is absolutely still a very good player by today's standards, and one way higher than the mean participant in a research experiment.
10,000 hands is an effective enough sample at a certain win rate and analysis of variance of play; the n-value alone is not enough to tell you if it was enough hands.
FWIW, they did some variance reduction techniques that dramatically reduce the number of hands needed to be confident in your results, so the number of hands may be bigger than you think. e.g. the results of 10k HU hands have much higher variance than the results of 10k HU hands where everyone just collects their EV once they're all in.
It is an interesting point that these are pros but their specialities are either tournament or heads up. The current 6 max pros are LLinusLove, Otb_RedBaron, TrueTeller.
Deleted Comment
What are your thoughts on a poker tournament for bots? Do you think it could turn into a successful product? I've always wanted to build an online poker/chess game that was designed from the ground up for bots (everything being accessible through an API), but have always worried that someone with more computational resources or the best bot would win consistently. Is it an idea you've thought about?
I have a few basic questions. I would like to implement my own generic game bot (discrete states). Are there any universal approaches? Is MCMC sampling good enough to start? My initial idea was to do importance sampling on some utility/score function.
Also, I am looking into poker game solvers - what would be a good place to start? What's the simplest algorithm?
Thanks
Of course, a key part of bluffing is getting the probabilities right. You can't always bluff and you can't never bluff, because that would make you too predictable. But our self-play and search algorithms are designed to get those probabilities right.
In fact, the methods we use are designed from the ground up to minimize exploitability. That's a really important property to have for an AI system that is actually deployed in the real world.
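A classic worked example of "getting the probabilities right" (textbook game theory, not Pluribus's code): the bluff frequency that makes a caller indifferent. Betting b into a pot of p, the caller risks b to win p + b, so the betting range should contain b / (p + 2b) bluffs:

    def indifference_bluff_fraction(pot, bet):
        """Fraction of the betting range that should be bluffs."""
        return bet / (pot + 2 * bet)

    print(indifference_bluff_fraction(100, 100))  # pot-sized bet -> 1/3 bluffs
    print(indifference_bluff_fraction(100, 50))   # half-pot bet  -> 1/4 bluffs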
Another person asked "What took you so long?", and I had the same question. :) I really thought this milestone would be achieved fairly soon after I left the field in 2007. However, breakthroughs require a researcher with the right amount of reflectiveness, insight, and determination.
Well done.
>Regardless of which hand Pluribus is actually holding, it will first calculate how it would act with every possible hand.
Is this information used to form an idea of what other players might be holding based on how the other player acts and how closely that action matches Pluribus's 'what if' action?
IIRC, the frequency of bets in that spot is roughly equivalent to the frequency of times you're actually ahead of your opponent in that spot, but not always with the specific hands that beat your opponent.
The concept is called Game Theory Optimal (GTO) and it's pretty popular in higher stakes games.
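The mechanics behind that answer amount to a Bayesian range update: if you know how often a strategy takes the observed action with each possible hand, Bayes' rule reweights the distribution over hands. A minimal sketch (illustrative, not Pluribus's code):

    def update_range(prior, action_prob_given_hand):
        """prior: dict hand -> probability; action_prob_given_hand: dict
        hand -> probability the strategy takes the observed action."""
        posterior = {h: prior[h] * action_prob_given_hand[h] for h in prior}
        total = sum(posterior.values())
        return {h: p / total for h, p in posterior.items()}

    # a raise made 90% of the time with AA but only 10% of the time with 72o
    # leaves AA nine times more likely than 72o, relative to the prior.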
1) What were the reasons for choosing 6-handed play (I assume logistics and cost)? It would be interesting to see how the bot's strategy would differ in a full-ring game. 2) Are there any plans to commercialize the bot as a tool for training human players?
2) I'm quite happy working on fundamental AI research and plan to continue in that direction.
Is the bot going for game-theory-optimal play, or trying to exploit weaknesses in other players?
It's going for game-theory-optimal play. It doesn't adapt to its opponents' observed weaknesses. But I think it's cool to show that you don't need to adapt to opponent weaknesses to win at poker at the highest levels. You just need to not have any weaknesses yourself.
To answer your question, no, I don't think human players would play at their best when not playing for actual money.
>A superhuman poker-playing bot called Pluribus has beaten top human professionals at six-player no-limit Texas hold’em poker...
The training aspect has some improvements but is at its core similar to Libratus. The search algorithm is the biggest difference.
There aren't that many great resources out there for helping newcomers get up to speed in this area. That's something we hope to fix in the future. Maybe this would be a good place to start? http://modelai.gettysburg.edu/2013/cfr/cfr.pdf
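That tutorial's warm-up exercise is regret matching on rock-paper-scissors; here's a condensed sketch of the same idea (my own code, under the usual setup): accumulate regrets against a fixed opponent and play proportionally to positive regret.

    import random

    ACTIONS = 3  # rock, paper, scissors
    PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # payoff[mine][opponent's]

    def get_strategy(regrets):
        positives = [max(r, 0.0) for r in regrets]
        total = sum(positives)
        return [p / total for p in positives] if total > 0 else [1 / ACTIONS] * ACTIONS

    def train(opp_strategy, iterations=100_000):
        regrets = [0.0] * ACTIONS
        strategy_sum = [0.0] * ACTIONS
        for _ in range(iterations):
            strategy = get_strategy(regrets)
            strategy_sum = [s + p for s, p in zip(strategy_sum, strategy)]
            mine = random.choices(range(ACTIONS), strategy)[0]
            opp = random.choices(range(ACTIONS), opp_strategy)[0]
            for a in range(ACTIONS):
                regrets[a] += PAYOFF[a][opp] - PAYOFF[mine][opp]
        total = sum(strategy_sum)
        return [s / total for s in strategy_sum]

    # vs. an opponent who overplays rock, the average strategy converges to paper:
    print(train([0.6, 0.2, 0.2]))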
With one AI and multiple professional human players sitting at a physical table, the humans outperform the probabilistic model because they take advantage of each other's mistakes/styles. Some players crash out faster but the winner gets ahead of the safe probabilistic style of play.
So this bot is better against the current professional-player meta than the current players are. In a 1v1 against another probabilistic model, would it probably also lose?
Am I understanding this properly? Or is playing the probabilistic model directly enough of a tell that it's also a losing strategy? Meaning you need some variation of strategies, strategy detection, or knowledge of the meta to win?
The bot played about 10,000 hands. There is no way that is enough to prove it's better or worse than its opponents.
Even more so in no-limit, where a few key all-ins can turn the game upside down. The variance is higher than in fixed-limit, right?
I built a heads-up fixed-limit Texas hold'em bot with counterfactual regret minimization about 8 years ago, from a paper I read. It had to play about 100,000 hands against a crappy reference bot to prove it was better.
Strategy detection over such short games is probably worthless.
The edge is probably in seeing who is tired or drunk in live poker.
> Although poker is a game of skill, there is an extremely large luck component as well. It is common for top professionals to lose money even over the course of 10,000 hands of poker simply because of bad luck. To reduce the role of luck, we used a version of the AIVAT[1] variance reduction algorithm, which applies a baseline estimate of the value of each situation to reduce variance while still keeping the samples unbiased. For example, if the bot is dealt a really strong hand, AIVAT will subtract a baseline value from its winnings to counter the good luck. This adjustment allowed us to achieve statistically significant results with roughly 10x fewer hands than would normally be needed.
[1] https://arxiv.org/abs/1612.06915
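In control-variate terms, the core trick is baseline subtraction; AIVAT's contribution is constructing baselines that are provably zero-mean so the estimate stays unbiased. A toy sketch of the idea (not the actual algorithm; see the linked paper):

    def variance_reduced_winnings(actual_winnings, baseline_values):
        """baseline_values: estimates of each hand's luck (e.g. the value of
        the cards dealt), constructed to be zero-mean so the average
        winnings stay unbiased."""
        return [w - b for w, b in zip(actual_winnings, baseline_values)]

    # winning 50 chips after being dealt aces (baseline +40) counts as +10;
    # winning 50 after being dealt 72o (baseline -5) counts as +55.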
I'm thinking in particular of unbalanced tables with an ever-changing mixture of TAG (tight-aggressive) and LAG (loose-aggressive) play. I've changed my mind three times about whether that's humans' best refuge -- or a situation that's a bot's dream.
You've done the work. Insights welcome.
I see on chess channels that grandmasters have to rethink their whole game-preparation methodology to cope with the "Alpha Zero" oddities that have now been introduced into this ancient game. They literally have to "throw out the book" of standard openings and middlegames and start afresh.
I would say that poker has thoroughly swung back to "play the game, not the player", and this isn't because of super-bots like the one used in this paper.
Ever since game theory invaded poker, players who appear in highly visible events such as TV tournaments have tried as hard as possible to make their game unexploitable.
However, even according to former world champion Viswanathan Anand, the run Magnus Carlsen has been on is something quite shocking: “His results this year is simply [great].... difficult to find words. [It’s been] completely off the charts. I think the chess world is still in a bit of a shock. The rest of the players are struggling to deal with a phenomenon [like him]. Even in 2012-13, his domination was less than it is this year. Everyone is still processing this information.” [1]
Carlsen is basically en route to breaking 2900 Elo (he sits at 2882 with a clear upward trend), while only two other active players are even above 2800, and they are struggling to stay above that threshold. (Elo is the rating system used in chess: above 1500 is an average player, 2000 is a good player, 2500 is a grandmaster, and anything above 2700 is basically godlike.)
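For scale, the standard Elo expected-score formula (ordinary chess rating math, not something from the article):

    def elo_expected_score(r_a, r_b):
        """Expected score of player A against player B."""
        return 1 / (1 + 10 ** ((r_b - r_a) / 400))

    print(elo_expected_score(2882, 2782))  # ~0.64: a 100-point gap scores ~64%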
Oddly enough, instead of playing more like a machine, Carlsen seems to be playing chess that is much more about the human aspect of the game, rather than trying to find the top-ranked engine move on every turn. (The traditional top engine, Stockfish, scores positions with a point system that the chess world has more or less obsessed over for the past decade; Alpha Zero has no such point system whatsoever.) He's been playing a drastically more aggressive and dynamic variety of chess than has been seen in a long time at the top tournaments.
He's been playing to create dizzying positions on the board, making moves that aren't necessarily liked by the traditional top engines, yet still finding himself in a winning position several moves later. It definitely looks like some sort of black magic, but the big thing Alpha Zero seems to have brought to the general philosophy of top-level chess is that it's possible to play aggressive chess, take risks, and win in 2019. Magnus Carlsen is the first player to successfully reinvent that style of play, more than likely partly inspired by Alpha Zero. So I'd say the big thing about Alpha Zero isn't necessarily that it could beat the other top engines, but that the 'artistic' aspect of its play is something never before seen from a chess engine. The fact that it proved that style superior to anything another engine had ever played is just the icing on the cake.
Garry Kasparov on Alpha Zero's chess persona: "I admit that I was pleased to see that AlphaZero had a dynamic, open style like my own. The conventional wisdom was that machines would approach perfection with endless dry maneuvering, usually leading to drawn games. But in my observation, AlphaZero prioritizes piece activity over material, preferring positions that to my eye looked risky and aggressive." [2]
[1] https://sportstar.thehindu.com/chess/viswanathan-anand-on-ma...
[2] https://science.sciencemag.org/content/362/6419/1087