This is used as a model for "reverse auctions" [1], which also charge a fee for participation, essentially making the whole thing a cross between a lottery and a scam.
There were loads advertised on TV with misleading claims about people winning iPods for 37p, etc., which, while technically true for the winner, ignores the overall cost.
I've not seen many advertised lately, so either the practice itself or the advertising of it was cracked down on.
Interesting problem and I appreciate the author bringing it to our attention. That said, I have to admit I found the Math Stack Exchange post he took the solution from much clearer: https://math.stackexchange.com/a/80743/
For one thing, the Math Stack Exchange post is correct, whereas the blog post states a crucial inequality in the wrong direction. This confuses the whole argument, since the surrounding text isn't clear enough for the reader to easily self-correct the typo.
The other factor is the flow of the exposition. For example, the blog post attempts to define Q before using it, but the definition is too vague and ends up being less useful than the "define in the course of the argument" approach used in the Math Stack Exchange post.
I wonder what would happen to the two player game if the rewards were set up so that win=1, lose=0, tie=-1? It would be interesting to see how the strategy changed as the penalty for tying is increased.
Now I'm wondering what the best strategy is if the other players follow a known non-optimal strategy (like selecting 1, 2, or 3 with p=1/3). Maybe it's a good scenario to try some reinforcement learning.
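For the concrete example of opponents picking 1, 2, or 3 uniformly, you don't even need reinforcement learning: a best response can be found by exhaustive enumeration. A rough sketch (Python; assumes the standard "lowest unique number wins" rule from the article):

```python
from itertools import product
from fractions import Fraction

def lowest_unique_winner(picks):
    """Index of the winner (lowest unique number), or None if every number repeats."""
    unique = [x for x in picks if picks.count(x) == 1]
    return picks.index(min(unique)) if unique else None

def win_prob(my_pick):
    """Our winning probability if both opponents pick 1, 2, or 3 uniformly."""
    wins = sum(lowest_unique_winner((my_pick,) + opp) == 0
               for opp in product([1, 2, 3], repeat=2))
    return Fraction(wins, 9)

for pick in range(1, 5):          # any pick above 3 behaves like 4 (wins only on an opponent tie)
    print(pick, win_prob(pick))   # best response is 1, winning 4/9 of the time
```

Against these non-optimal opponents, always picking 1 wins 4/9 ≈ 0.444, well above the ~0.296 of the symmetric equilibrium.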
There is a nice strategy: always play 1, and announce it.
Everyone else should avoid playing 1, because otherwise they would not win. But if no one else plays 1, you win. So they must take turns losing in order to make you lose too. It's a good strategy for making sure they never play with you again :)
I think a triple tie means no one wins, which would explain why the probability of winning is less than 1/3. (If each of the three players is playing the same Nash equilibrium strategy, they should each win with the same frequency.)
I worked with a nationwide lottery game in Sweden called Limbo around 2005-2006 that used this concept. I believe the winner each day won around $1000 and could turn it into $10000 in a weekly final playing the same game in a TV studio.
The game was completely shut down after people in one small town won three times in a row and foul play was suspected. It turned out the local store had been asking people to join and distributing the numbers itself, to make sure the picks were spread evenly across a huge range.
> if we follow this strategy we’ll win a little bit less than one-third of the time.
That is not possible. If N players use the same strategy then by symmetry each of them must win on average 1 time in N. So against two other players we should win exactly one third of the time, not "a little bit less" than one third of the time.
Also, if all players use the same strategy then it doesn't matter what the strategy actually is. The only requirement is that it have some probabilistic element, otherwise all players would choose the same number every time.
> If N players use the same strategy then by symmetry each of them must win on average 1 time in N.
The point here is that there's the possibility of a tie, if all players choose the same number. So P_1_win + P_2_win + P_3_win + P_tie = 1, which means P_i_win < 1/3.
> Also, if all players use the same strategy then it doesn't matter what the strategy actually is. The only requirement is that it have some probabilistic element, otherwise all players would choose the same number every time.
A player chooses its next move using some probabilities P, i.e. it randomly chooses number 1 45% of the time, number 2 25% of the time, etc. So there's a random element in the game, which means players following the same strategy don't choose the same numbers every time.
> Also, if all players use the same strategy then it doesn't matter what the strategy actually is.
But it's not very interesting to add "all players use the same strategy" as an assumption, because yes, obviously, if every player randomly picks from the set {33, 247, 17855433344} each has a 29.6% chance of winning. But that's a different game.
In fact the author doesn't assume everyone will use the same strategy; rather, the author simply asks what optimal play is, when you don't know the others' strategies. The fact that all players choose the same strategy (if they're playing optimally) is a consequence of this. Optimal play has to work regardless of the strategy your opponents pick; assuming they'll play optimally too is just a shortcut to figuring out what optimal play is.
For example, let's say you knew the other two players' strategy is to pick from this distribution. If you simply always pick 1, you win any time neither of the others picked 1, which happens with probability (1-0.456)^2 ~ 0.296 -- the same probability of winning as the optimal strategy! So why bother with the distribution? Precisely because you don't know whether your opponents are playing optimally or not. If another player gets the same bright idea as you, you counteract each other and the 3rd player wins 54% of the time (the rest are ties). An optimal strategy has to be optimal even if opponents are not using it; otherwise it's not really optimal (one may not exist).
So all players using the same strategy is not an assumption, but a consequence, and therefore much more interesting.
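Both of those numbers are easy to sanity-check with a quick Monte Carlo run. A sketch, assuming the geometric distribution with a ≈ 0.5437 given elsewhere in the thread:

```python
import random

A = 0.543689                      # real root of a^3 + a^2 + a - 1 = 0

def optimal_pick(rng):
    """Sample from P_i = a^i (1/a - 1), i.e. geometric with success prob 1 - a."""
    i = 1
    while rng.random() < A:       # continue past i with probability a
        i += 1
    return i

def winner(picks):
    unique = [x for x in picks if picks.count(x) == 1]
    return picks.index(min(unique)) if unique else None

rng = random.Random(0)
n = 200_000

# Case 1: we always pick 1, both opponents play the optimal mixture.
wins = sum(winner((1, optimal_pick(rng), optimal_pick(rng))) == 0 for _ in range(n))
print(wins / n)                   # ~0.296, same as playing optimally

# Case 2: two players always pick 1; the third, playing the mixture, wins ~54%.
wins3 = sum(winner((1, 1, optimal_pick(rng))) == 2 for _ in range(n))
print(wins3 / n)                  # ~0.544 (the rest are ties)
```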
It seems like there are pathological counterexamples. If you're player three, and player one chooses 1 always, and player two chooses 2 always, then the player using this strategy can never win.
Player 2 wins with probability ~.456 (only when player 1 gets "knocked out" by player 3 also picking 1), player 1 wins with probability ~.544, and player 3 never wins.
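Those numbers can be checked by conditioning on player 3's pick (a sketch; a ≈ 0.5437 is the root quoted elsewhere in the thread):

```python
A = 0.543689                      # real root of a^3 + a^2 + a - 1 = 0

def winner(picks):
    unique = [x for x in picks if picks.count(x) == 1]
    return picks.index(min(unique)) if unique else None

# Player 1 always picks 1, player 2 always picks 2, player 3 plays the
# geometric mixture P_i = a^i (1/a - 1) = (1 - a) a^(i-1).
p_win = [0.0, 0.0, 0.0]
for i in range(1, 60):            # tail beyond 60 is negligible (~1e-16)
    p3_prob = (1 - A) * A ** (i - 1)
    w = winner((1, 2, i))
    if w is not None:
        p_win[w] += p3_prob
print(p_win)                      # ~[0.544, 0.456, 0.0] -- player 3 never wins
```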
What's super surprising is that if I'm "dumb" and I play against two "smart" players, I can't have a bad strategy. Assuming the "smart" players play the Nash-equilibrium mixed strategy, then whatever strategy I execute, I have the same chance of winning.
But still, (1) there's some probability of both other players choosing a number bigger than one million billion trillion, and (2) there's some probability of both other players choosing the same number. Actually, the probability that both other players choose 1 is pretty big (~0.21). So even if you choose a super-big number, there's a decent probability of you still winning.
This game seems to do weird things if you have a player able to disclose their choice in advance too. Stating that you will choose 42 is stronger than stating you will choose 1, assuming other players only maximize their own odds of winning. It's kind of like a convoluted game of chicken.
It seems that this contradicts the principle of Nash equilibrium, since in a Nash equilibrium you don't have any reason to change your strategy. However, if you change a "smart" strategy to a "dumb" one you're not increasing or decreasing your winning chances, so it still satisfies the Nash equilibrium principle.
There was a cool challenge in the Code Golf community on Stack Exchange, where participants wrote Python scripts to compete in this problem over 1,000 rounds. The bots had access to previous rounds' results to inform their choice in each round.
P_i = a^i (1/a - 1)
where a is the real-valued solution of a^3 + a^2 + a - 1 = 0 .
a = 1/3 (-1 - 2/(17 + 3 sqrt(33))^(1/3) + (17 + 3 sqrt(33))^(1/3))
edit:
The probability to win comes out as a^2 ~= 0.2955977425220847709809965928515386138989754484466083115379546015...
edit2:
To arrive at this solution I expressed the equations in terms of R_i = P(choosing a number larger than i), then substituted P_i = R_{i-1} - R_i. This gets rid of the sums when you evaluate Q_{j+1} - Q_j, and arrives at R_{j+1}^2 = R_{j-1}(2 R_j - R_{j-1}). We know that R_0 = 1 and lim R_i = 0. I just tried R_i = a^i, and it works, with the value of "a" calculated above.
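Spelling out the check: substituting R_i = a^i into R_{j+1}^2 = R_{j-1}(2 R_j - R_{j-1}) gives a^(2j+2) = 2 a^(2j-1) - a^(2j-2). Dividing through by a^(2j-2) leaves a^4 - 2a + 1 = 0, which factors as (a - 1)(a^3 + a^2 + a - 1) = 0; the root a = 1 is excluded by lim R_i = 0, leaving exactly the cubic above.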
edit3:
To name names: the solution is a geometric distribution [1]
What I ended up reducing the algebra problem to is a homogeneous recurrence relation, albeit non-linear. As it has constant coefficients, a^i is the obvious candidate to test it with.
I got successfully nerd-sniped, this is a fun problem. Next time with 4 players...
[1] https://en.wikipedia.org/wiki/Geometric_distribution
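A quick numeric check of the closed form, using only the Python standard library (the bisection bracket [0, 1] is my choice; f(0) = -1 and f(1) = 2, and f is increasing there):

```python
# Solve a^3 + a^2 + a - 1 = 0 by bisection.
f = lambda a: a**3 + a**2 + a - 1
lo, hi = 0.0, 1.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
a = (lo + hi) / 2
print(a)        # ~0.5436890126920763
print(a * a)    # ~0.29559774252208477, the winning probability quoted above

# Sanity check: the proposed P_i = a^i (1/a - 1) really is a probability distribution.
total = sum(a**i * (1 / a - 1) for i in range(1, 200))
print(total)    # ~1.0
```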
[1] https://www.gamblingcommission.gov.uk/public-and-players/gui...
> To simplify the problem, let’s start with the simple scenario where you are competing against two other persons.
> P(wining)~=0.296
Where does the other 11.2% go? Is it the probability of a triple tie? Perhaps the author could add it to the article.
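Assuming all three players use the geometric strategy with a ≈ 0.54369 (the root given elsewhere in the thread), the missing mass is exactly the probability of a triple tie, which happens only when all three pick the same number:

```python
A = 0.5436890126920763              # real root of a^3 + a^2 + a - 1 = 0

# All three tie iff they pick the same i; sum P_i^3 over i, where
# P_i = (1 - A) * A**(i - 1). The sum telescopes to (1 - A)**3 / (1 - A**3).
p_tie = sum(((1 - A) * A ** (i - 1)) ** 3 for i in range(1, 100))
print(p_tie)                        # ~0.1132, i.e. 1 - 3 * 0.2956
print((1 - A) ** 3 / (1 - A ** 3))  # closed form, same value
```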
Which is why the author skips the case of two players: there, always picking 1 is a (weakly) dominant strategy, and if both players follow it neither ever wins.
The hard part is to establish a prior for this "unknown". Otherwise you can't say anything.
https://codegolf.stackexchange.com/questions/172178/smallest...
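For anyone who wants to replay something like that challenge locally, here's a minimal arena sketch. The bot names and the (history, rng) interface are my own invention, not the actual challenge API:

```python
import random
from collections import Counter

def winner(picks):
    """Index of the lowest unique pick, or None if every number repeats."""
    counts = Counter(picks)
    unique = [x for x in picks if counts[x] == 1]
    return picks.index(min(unique)) if unique else None

# Each bot sees the history of everyone's past picks and returns a positive int.
def greedy_bot(history, rng):
    return 1

def geometric_bot(history, rng):  # the Nash mixture, a ~ 0.5437
    i = 1
    while rng.random() < 0.5436890126920763:
        i += 1
    return i

def copycat_bot(history, rng):    # repeats the previous round's winning number
    if history and winner(history[-1]) is not None:
        return history[-1][winner(history[-1])]
    return 2

def play(bots, rounds=1000, seed=0):
    rng = random.Random(seed)
    history, scores = [], [0] * len(bots)
    for _ in range(rounds):
        picks = tuple(bot(history, rng) for bot in bots)
        w = winner(picks)
        if w is not None:
            scores[w] += 1
        history.append(picks)
    return scores

print(play([greedy_bot, geometric_bot, copycat_bot]))
```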