The second-generation Worms games (2, Armageddon, WWP) use the same approach (here called "Deterministic action propagation method"). It was implemented by Karl Morton, and we also use it for replay files (which is why the game logic is meticulously versioned).
The solution we used for managing secret state, when the state comes from a random source, is to delay generating the secret information for as long as possible (so, e.g., if a card is drawn, the drawn card is decided at that moment, as opposed to pre-shuffling the deck at the start of the game). We experimented with some designs in which the secret is revealed only to the player who owns it, and the rest only see a verifiable hash of the secret until the player performs an action which reveals it. However, this doesn't solve the general problem: any time the game must make a random choice immediately (without network latency), the design must compromise between allowing a hacked client to predict the result (when the RNG output is stable) or to manipulate it (when the RNG output changes very often, e.g. every frame).
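A minimal sketch of the delayed-generation idea (all names hypothetical): the deck is never shuffled up front; each draw just removes a uniformly random card from the set of cards nobody has seen yet, so there is no pre-committed order for a hacked client to peek at.

```python
import random

class LazyDeck:
    """Decides each drawn card only at the moment of the draw."""

    def __init__(self, cards, rng=None):
        self.remaining = list(cards)       # cards not yet drawn
        self.rng = rng or random.Random()  # injectable for testing

    def draw(self):
        # Pick uniformly among the cards nobody has seen yet; there is
        # no pre-shuffled order that could leak before the draw happens.
        i = self.rng.randrange(len(self.remaining))
        return self.remaining.pop(i)

deck = LazyDeck(range(52))
hand = [deck.draw() for _ in range(5)]
```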
Sadly, the newer games went for a simpler synchronization algorithm, in which if a discrepancy is detected, the entire state is simply copied over from one player's game to another. This does allow blatant cheating; I'm guessing it was a concession due to the new engine lacking in robustness or portability.
This is a really interesting approach for managing secret state. Another life ago I implemented a rule-based version of everyone's favorite collectible card game, and it was P2P (so each client ran the full rules engine and synchronized the initial random seed and each player's actions). I could never figure out a good way of limiting cheating by peeking at either your own deck or the opponent's. I looked into commutative encryption and mental poker, but at that point my interest in the project fizzled and I stopped working on it. This approach would work for keeping randomness in the game, and homomorphic encryption could be used to hide the contents of the opponent's deck.
There's a third way of thinking, best exemplified by Agar.io and the like, which basically asserts that any network latency is the player's problem. There's a deterministic state on the server, and the UI doesn't bother trying to sugarcoat it to make it feel any more instantaneous than it is. IMHO this is the only way most games will be written down the line. I'm an old casino coder, so the "tricks" were things like slowing the card-deal animation, keeping the roulette wheel animation going longer, etc. But where money's concerned you can't take shortcuts, and in the end a player with a faster connection has a better experience, and (in casino land) those are the players with money anyway. I'm not sure I understand why there'd ever need to be an instant result (sans network latency) in a turn-based game. If the player has a visual indication that the server is waiting for their command, then any jumpy movement on the part of other players is strictly down to their own bad connection as long as the server is running well. If players understand that the alternative would be the ability to cheat, they'll generally accept that and upgrade their network connection.
> I'm not sure I understand why there'd ever need to be an instant result (sans network latency) in a turn-based game. If the player has a visual indication that the server is waiting for their command, then any jumpy movement on the part of other players is strictly down to their own bad connection as long as the server is running well.
Although the Worms games are turn-based, gameplay typically involves performing a lot of actions within one turn. Walking, jumping, using utilities (which don't end your turn), and finally firing a weapon. Many of these actions can trigger an event with a random outcome, which should ideally be unpredictable and un-manipulatable, such as which weapon will be found in a weapon crate, or how long the fuse of a mine with a random fuse will be.
In a peer-to-peer game, the server is just another player, and should not be considered intrinsically more trustworthy. Ignoring latency, it's possible to implement a scheme for securely generating random data, in which everyone produces a verifiable hash (commitment) of a random number; after everyone has announced their hash, everyone announces their random number, and the numbers are XORed together to produce the outcome (see e.g. Keybase's "Cryptographic coin flipping"). The problem with this approach is that the latency to obtain the result becomes the round trip to the player with the slowest connection and back.
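The commit-reveal scheme can be sketched in a few lines (a toy with three simulated peers in one process; real peers would exchange these messages over the network, and the nonce per commitment is there to prevent brute-forcing small values):

```python
import hashlib
import secrets

def commit(value: bytes, nonce: bytes) -> str:
    # Publishing this hash binds a peer to `value` without revealing it.
    return hashlib.sha256(nonce + value).hexdigest()

# Phase 1: each peer picks a random contribution plus a nonce and
# announces only the commitment.
contribs = [(secrets.token_bytes(8), secrets.token_bytes(16)) for _ in range(3)]
commitments = [commit(value, nonce) for value, nonce in contribs]

# Phase 2: everyone reveals; peers verify each reveal against the
# commitment, then XOR the values together to get the shared outcome.
assert all(commit(v, n) == c for (v, n), c in zip(contribs, commitments))
outcome = 0
for value, _ in contribs:
    outcome ^= int.from_bytes(value, "big")
```

No single peer can steer the result: changing your value after seeing others' reveals would contradict your published commitment.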
I wonder if speculative calculation on the clients could be used.
For example let's say we're at frame 999 and have 2 players and 10 deterministic NPCs. And each player has 4 possible moves 2 of which have results depending on a random number being <= X or > X.
So each client calculates the possible next frame states from its own known input (at most 2 options, depending on the random generator) times the possible inputs from the other player (2 + 2*2 = 6 options), so there are 12 possible game states for frame 1000 from this client's POV.
NPCs are deterministic so they can react to the player moves in each branch.
The other client does the same, but with different variables missing and a different state tree (with a small overlap that depends only on the random generator).
Then both clients send their player inputs to the server and wait for the missing variables to choose which branch of the state tree they both take. Depending on the game, we may not wait and instead show the most likely branch; if we mispredicted, we switch to the right branch once the missing data arrives.
This state tree could be several levels deep if there's enough computing power, and it could also be used by an AI to choose the best moves given what the player can do (but then it gets complicated).
Advantages:
- server needs much less computing power than clients
- server can still verify the state
- clients don't have information they shouldn't have
- network latency is mostly hidden
Disadvantages:
- more computing power needed on the clients
- can get impractical quickly with more possible moves on each turn
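The counting in the frame-999 example above can be sketched as a toy enumeration (function and move names hypothetical):

```python
from typing import Iterable, Set

def count_branches(my_move: str, opponent_moves: Iterable[str],
                   rng_dependent: Set[str], rng_outcomes: int = 2) -> int:
    """Count possible next-frame states from one client's point of view."""
    def outcomes(move: str) -> int:
        # A random-dependent move can resolve rng_outcomes ways; others, one.
        return rng_outcomes if move in rng_dependent else 1

    return sum(outcomes(my_move) * outcomes(opp) for opp in opponent_moves)

# The example above: our own move is RNG-dependent (2 outcomes), and the
# opponent has 4 possible moves, 2 of them RNG-dependent: 2 * (2 + 2*2) = 12.
n = count_branches("jump", ["a", "b", "c", "d"], rng_dependent={"jump", "c", "d"})
```

The multiplicative blow-up is also visible here: each extra RNG-dependent move multiplies the branch count, which is the last disadvantage listed above.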
You don't need to simulate multiple futures; just keep going with the most likely one ("continue doing what you're already doing" is a common heuristic) and correct when you get the missing info. This is the basis of deterministic rollback networking, popular in fighting games.
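A minimal illustration of the rollback idea, assuming a deterministic step function: predict forward from the last confirmed state using guessed remote inputs, and when the real inputs arrive, re-simulate from the confirmed state.

```python
def step(state, inp):
    # Deterministic update: position advances by the input amount each frame.
    return state + inp

def simulate(confirmed_state, inputs):
    # Re-running from the last confirmed state is always safe because
    # step() is deterministic: same inputs, same result, every time.
    s = confirmed_state
    for inp in inputs:
        s = step(s, inp)
    return s

# Frames 1-3 predicted with "repeat the remote player's last input" (1, 1, 1)...
predicted = simulate(0, [1, 1, 1])   # position 3 shown on screen
# ...then the real inputs arrive: the remote player actually stopped on frame 3.
corrected = simulate(0, [1, 1, 0])   # rollback: re-simulate, position is 2
```

Real implementations keep a ring buffer of past states and inputs so they only re-simulate the frames since the last confirmed one, but the principle is the same.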
I have poked and prodded at making my own turn-based game, and one thing I have learned is validation. Don't trust input, and you will make robust serverside code. Assume every bit sent by the client was maliciously sent in order to bring down your service in flames. Keep this mentality, and security becomes that much easier.
AJAX long polling is a good way to send data while avoiding spurious requests and/or delays, albeit with a bit more load on the server.
Seconded. The way I look at this is to always build the allowable action space from nothing, and default to nothing. Most of the time it involves a state machine where each node comes with a white list of currently allowable verbs a client can send.
The alternative is an infinite action soup, which opens opportunities for inspiration by lateral thinking. I'm doing it this way for a single player Lovecraftian text editor I'm working on. It's a terrible idea for multiplayer things.
For turn based games like board games, I make clients that don't have any concept of the game rules at all. The server generates a complete list of all available actions to the player and sends an array of all actions along with the game state. When you take an action the only payload the client sends is the index into this array. In this way the client has no way to express a cheat whatsoever.
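A sketch of that protocol (state layout and action names hypothetical): the server enumerates the legal actions, and the only thing it will accept back is an index into that list.

```python
def legal_actions(state):
    # Server-side: build the complete list of actions available right now.
    actions = []
    if state["phase"] == "main":
        actions += [("play_card", card) for card in state["hand"]]
        actions.append(("end_turn", None))
    return actions

def handle_client_choice(state, index):
    # The client only ever sends an index into the list it was given,
    # so there is no way for it to express an illegal move.
    actions = legal_actions(state)
    if not (0 <= index < len(actions)):
        raise ValueError("invalid action index")
    return actions[index]

state = {"phase": "main", "hand": ["ace", "king"]}
chosen = handle_client_choice(state, 1)   # ("play_card", "king")
```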
There can (and must) be compromises between the cost of enumerating moves in messages from the server and the cost of validating client commands; command validation should be straightforward enough to be obviously correct and complete (leaving no room for cheating) and sufficiently cheap, but it doesn't have to be trivially simple.
For example, if the client in a wargame says it would like to move a unit (by name) from place A to place B on the battlefield the possible moves are implicit (units and locations are known) and validation is nontrivial but easy (the unit is at A and it can move, it can reach B considering fog of war, etc.). Far cheaper than sending 50 different moves for each of 50 different units and validating all of them preemptively every turn.
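A sketch of what such a validator might look like (the state layout and the reachability rule are made up for illustration; a real wargame would also check pathing, fog of war, and so on):

```python
def validate_move(state, unit_name, dest):
    unit = state["units"].get(unit_name)
    if unit is None:
        return False                          # no such unit
    if unit["owner"] != state["current_player"]:
        return False                          # not this player's unit
    if unit["moved"]:
        return False                          # already moved this turn
    # Cheap reachability check: Chebyshev distance within movement range.
    dx = abs(unit["pos"][0] - dest[0])
    dy = abs(unit["pos"][1] - dest[1])
    return max(dx, dy) <= unit["range"]

state = {
    "current_player": "red",
    "units": {"infantry-1": {"owner": "red", "moved": False,
                             "pos": (0, 0), "range": 2}},
}
ok = validate_move(state, "infantry-1", (1, 2))       # True: within range
too_far = validate_move(state, "infantry-1", (5, 5))  # False: out of range
```

The point is that checking one named command costs a handful of lookups, versus enumerating and shipping every legal move for every unit each turn.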
Indeed it isn't a general solution, it's just really good for almost all board games.
Your example is one that it's not the best for and you could use a different approach or amend the approach to take a payload for some actions and have some client smarts for those actions.
Although even in your example of 50 units with 50 moves each, having 2500 moves is certainly not a disaster in a game where the server only does work in response to a player's move and players don't take multiple moves per second and more likely take seconds or even minutes between moves.
You don't need to be so literal. An action like "game action XYZ" with a client smart enough to know how to show a text field and enable submission is sufficient.
The point is just every bit of logic about when an action is enabled must be implemented on the backend for verification, so spelling that out to the client reduces duplication.
In that case you would probably add a payload which you would then have to check, you're correct. It just turns out you can implement most board games without that.
I do still like this system for some games with actions that aren't convenient to enumerate; it's safe and simple by default and you add validation only in exactly the places where you need a payload.
English words are enumerable. There’s just a few million of them. The dictionary is large, but it’s not like it changes frequently, so you can treat it like a static asset that gets downloaded once and only updated periodically out of band.
About the deterministic propagation model: "It relies on the assumption that processing the actions of a player is deterministic."
I am curious: in what scenario of online gaming does this assumption fail?
I remember that about a decade ago there was a mobile game called Chaos and Order (like Dota/League of Legends, but played on Apple devices). In one game, I heavily outplayed my opponent. After winning the match we added each other as friends and realized that we had actually experienced two totally different games - he lost it in my game but won it on his side, and my in-game actions made no sense in his. We called it a "ghost".
It is just funny to see how bad the outcome can be when deterministic propagation fails. Also, in contexts such as systems theory, where errors inevitably happen, the design should ensure that the extent of a failure does not go unbounded.
This is called a "desync" and is a bug in a particular implementation of multiplayer networking. In this model instead of the server sending the state to each player, it sends their inputs, and allows the local device to completely determine the state of the game. If there are any differences at all in how the game's calculations are made, the game will have different states in different devices. Even different models of the iPhone may have tiny differences in how they calculate the square root for example, and over time these minor differences cascade into totally different games.
A common scenario where this approach fails: applying physics in realtime cross platform games. The floating point math involved is pretty much impossible to guarantee same results for. Deterministic physics engines do exist but they generally only guarantee determinism on the same OS/platform.
Attempted deterministic engine, no desync detection or reconciliation, and support for the whole PC ecosystem - I couldn't believe it when I realized what they were doing... pure madness (not in the good way).
I suspect it actually works fairly well for players on console, who aren't playing cross-platform (since they're working with basically the same hardware everywhere).
Let's say you use JDK 17 (which re-specified float and double operations to be strict everywhere) and always use `java.lang.StrictMath` instead of `Math`, wouldn't you get determinism for all the floating point operations "for free"?
The very worst kind of determinism errors are butterfly effect like in that you don’t notice the desync until long after the cause has occurred. It’s pretty imperative if you’re building a deterministic system to add a layer of error checking right from the start so you can ensure each party ends up in the same state and identify desync very early. On the other hand you also want to identify where desync doesn’t matter (for example most FX) so you limit the scope of what has to be deterministic to keep things simple.
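One way to get that early error checking: each peer periodically hashes only the gameplay-relevant part of its state and exchanges the digest, so a desync is flagged within a tick or two of the cause while cosmetic state stays out of scope. A sketch (field names hypothetical):

```python
import hashlib
import json

def state_hash(state):
    # Hash only the gameplay-relevant fields; cosmetic state (particles,
    # sound timers, etc.) is excluded so it never has to be deterministic.
    relevant = {key: state[key] for key in ("frame", "positions", "health")}
    blob = json.dumps(relevant, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

a = {"frame": 100, "positions": [[1, 2]], "health": [97], "particles": [0.3]}
b = {"frame": 100, "positions": [[1, 2]], "health": [97], "particles": [0.9]}
assert state_hash(a) == state_hash(b)   # FX differences are not a desync

b["health"] = [96]
assert state_hash(a) != state_hash(b)   # real state drift is caught at once
```

Peers can exchange these digests every N frames; the first frame whose digests disagree pinpoints when the butterfly started flapping.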
Imagine the same game being played on different platforms (different CPUs...) with different compilers, etc.; it's possible that the physics does not behave exactly the same everywhere.
In my experience, most of the pain boils down to synchronization of state between participants. This is also your principal vector for cheating.
If you are willing and able to make a very extreme sacrifice regarding client-server architecture, it is feasible to consolidate all state to 1 synchronous domain while serving an arbitrary number of participants.
The sacrifice is basically to build your game as a native take on the Google Stadia model, i.e. inputs are raw player events, and the output is a low-latency A/V stream back to the player.
Keep in mind you don't have to always serve an x264 stream to all clients. Depending on your objectives and type of game, you can serve an intermediate view model that instructs the client on how to draw its view using local assets and resources (textures, models, gpus, etc). This would be more like how Blazor server-side model operates.
What you said is kind of a complicated way of saying what I do.
Player client sends simple commands to the server. Server validates the command and updates game state.
The server then simply listens for other clients to request the state of the world, at which point it sends a complete blob of data that includes everything that client can see!
The player can then interpret that data and send new commands as required.
My games are basically turn based so I don't need to poll the server constantly, just at the start of your turn and after every action.
There's a problem with this approach. If the client thinks it's in state X, and sends a command that's only valid in state X, but it turns out that the client was lagging and was actually in state Y, the server has to reconcile this discrepancy somehow.
For example, in an FPS you turn a corner, see an enemy and shoot them. But unbeknownst to the client a grenade exploded just beyond the corner killing the player before the shot got off.
Now we have to deal with the mess of the client thinking you just shot somebody while you were actually already dead. While it's true that this is "simply" a client-side issue, you'll hear endless complaints from players who, from their perspective, turned a corner and shot the enemy and then died to a grenade, but for some reason their shot didn't connect and to them it seems unfair.
Another scenario: An enemy client is approaching a corner. The server sends you their position because it estimates they'll continue moving beyond the corner. You see the enemy and shoot them. But then the enemy client's lagging input comes in and they actually stopped short of turning the corner! How do we reconcile the fact that you've seen and shot an enemy you shouldn't have been able to?
Speak of the devil. I was just talking with someone about Neptune's Pride a few weeks ago and looked it up to see if it was still going. I really enjoyed that when it came out.
Have you ever written about the architecture of that and Blight anywhere?
I also built a similar engine to boardgames.io using the Delta-update approach, which you can try at acos.games.
Deterministic lockstep is like trying to build a house with wet paper. You'll need to stack and compress millions of layers to make it solid.
Delta-update, while not the best for bandwidth, is the cheapest for moving forward. I don't want to waste time fixing desync issues when I could be creating more games and more features.
I'm still writing about my experience and showing exactly how I built my Delta-update networking model, but it's working really well so far.
Nice article.
I built my own online version of a turn based board game because none exists at the moment. I just want to play with my family so I didn't bother coding a server at all, instead all clients communicate their actions to all others so I naturally implemented the deterministic approach. I haven't thought about secret state so it's interesting to see a possible solution for that. Fun!
I'm also writing an online version of a board game I like, to play with family and friends. I am about to start the networking part, and I actually intended to implement the deterministic approach, although I didn't know the concept had a name before reading the article. It's nice to read some validation that it is a good design.
I'm curious to know how you achieve not to need a server at all. How do you create the initial connection between clients without a server to negotiate NATs and things like that?
Initially I thought I would be able to do it with webRTC but it turns out you still need some kind of specific server to connect clients.
Instead I cheated a little bit and used a messaging service https://ably.com/.
It was easy to setup and use, and their free plan is generous.
Another thing I didn't achieve was finding an RNG I could configure with a seed, so amongst my clients I have a special one, the "host", who draws the cards, etc.
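For what it's worth, this is likely because JavaScript's built-in Math.random can't be seeded; many other languages' RNGs can, and with a shared seed every client derives the same shuffle, with no "host" needed. A sketch in Python (random.Random is the seedable generator there):

```python
import random

SEED = 20240101   # hypothetical: agreed once, e.g. sent in the game-start message

def make_shuffled_deck(seed):
    rng = random.Random(seed)   # a private seeded generator, not the global one
    deck = list(range(52))
    rng.shuffle(deck)
    return deck

# Every client computes the exact same order with no further communication,
# so no single "host" client has to draw cards on everyone's behalf.
assert make_shuffled_deck(SEED) == make_shuffled_deck(SEED)
```

In JavaScript you'd need a small userland PRNG (there are well-known tiny ones) since the standard library doesn't expose seeding.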
Overall it's been working really well. One thing that is particularly difficult without a server is debugging. Especially on a mobile phone, you're in the dark when a bug happens. And with Ably's free tier, you have to be connected in debug mode beforehand to see the logs, which is very impractical.
Good luck with your game!
Not sure how far along you are, but I could help you build the game if you are willing to try making it for acos.games. I built a simulator that can help speed up development which is done in React and NodeJS.
One nice thing about turn-based games is that the logic for the game usually runs so fast that you can run it 'in-between' user events, so using a coroutine approach makes it pretty easy to write the game engine in a direct style with the rules explicitly coded in a straightforward fashion:
    async def run_game(game, players, engine):
        for turn in game:
            for player in players:
                action = await player.get_action()
                engine.dispatch(action)
The nice thing is this approach supports remote proxy players - when you get an action from the local player you can send it out to all connected clients.
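A sketch of that proxy idea (interfaces hypothetical): local and remote players expose the same get_action() coroutine, so the loop above doesn't care where an action comes from - a remote proxy just awaits a queue fed by the networking layer.

```python
import asyncio

class LocalPlayer:
    def __init__(self, scripted_moves):
        self.moves = iter(scripted_moves)

    async def get_action(self):
        # A real game would await UI input here; we replay a script instead.
        return next(self.moves)

class RemoteProxyPlayer:
    def __init__(self):
        self.queue = asyncio.Queue()       # fed by the networking layer

    async def get_action(self):
        return await self.queue.get()      # waits for an action off the wire

async def main():
    local = LocalPlayer(["move_north"])
    remote = RemoteProxyPlayer()
    remote.queue.put_nowait("attack")      # pretend this just arrived from the peer
    log = []
    for player in (local, remote):         # one "turn" of the loop from above
        action = await player.get_action()
        log.append(action)                 # stands in for engine.dispatch(action)
    return log

log = asyncio.run(main())
```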
> I'm doing it this way for a single player Lovecraftian text editor I'm working on.
I hope that was not a typo and would love to see what a Lovecraftian text editor looks like.
They attempted to make their recent engine deterministic, and failed miserably - at least on PC: https://www.reddit.com/r/halo/comments/r78moa/halo_infinite_...
> I suspect it actually works fairly well for players on console, who aren't playing cross-platform (since they're working with basically the same hardware everywhere).
It used to be the case not so long ago.