cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
sixstringtheory · 4 years ago
Very cool project. I'm currently reading The Alignment Problem by Brian Christian on AI (very well written and easy to read), and also (very slowly) My Great Predecessors, Vol 1 by Garry Kasparov. This is a great combination of the two topics!

I'm particularly interested in the second neural net that generates explanations. Is it only a neural net to generate the natural language using artifacts from the original engine net, or is it actually inspecting the state of the engine net to derive the insights?

It's an interesting application of explainability of AI algorithms that Christian talks about in his chapter on transparency. In particular, he discusses "saliency" of algorithms (knowing what parts of the input were most important in producing a prediction) and "multitask nets" that output multiple predictions (so, maybe here, one output is the best move, and another output is the explanation).

The writeups are fantastic reading. I see almost no sources in common with the bibliography in The Alignment Problem (which has a 50-page bibliography), which makes them nice complements. The only common citations I could find were Sutton 1988 (Temporal Differences) and Silver et al. 2016 (AlphaGo), 2017 (AlphaGo Zero) and 2018 (AlphaZero).

There's a note (44) from chapter 5 entitled "Shaping" where Christian talks about "meta-reasoning: the right way to think about thinking. When you play a game–for instance, chess–you win because of the moves you chose, but it was the thoughts you had that enabled you to choose those moves...Figuring out how an aspiring chess player–or any kind of agent–should learn about its thought process seemed like a more important but also dramatically harder task than simply learning how to pick good moves." This note might as well be direct inspiration for this project. It goes on to quote Stuart Russell about "a computation that changes your mind about what is a good move to make...reward that computation by how much you changed your mind...so you could change your mind in the sense of discovering that what was the second-best move is actually even better than what was the best move." That's in the context of a cautionary tale where only optimizing for those "changes of mind" doesn't necessarily find a correct outcome, and that you have to "arrange these internal pseudorewards so that along a path, they add up to the same as true, eventually." This sounds pretty much like the task of a coach.

cbutner · 4 years ago
The commentary net inspects the final state of the engine net, but not internal layers.

Deeper introspection is a really important goal, but by the time you make serious progress there, chess is the least of your worries.

I do really like the work people have put into introspection and visualization so far though: DeepDream comes to mind. There was also another great paper or page that I can't find.

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
jmcguckin · 4 years ago
I’d like to self-host. Will this run with a GPU?
cbutner · 4 years ago
Yes! I haven't done as much testing with GPU, but did validate running with 4x V100s. You just need to adjust the "search_threads" option to the number of GPUs, but set it to at least 2.
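For example, with the 4x V100 setup that would be something like "setoption name search_threads value 4" if you're driving it over UCI - assuming the option is exposed there; otherwise it's the equivalent entry in the config file.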

Installation for GPU is covered here: https://github.com/chrisbutner/ChessCoach#installation (a little messy, sorry)

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
ritchiea · 4 years ago
You estimate it’s rated 3400-ish and it loses games????
cbutner · 4 years ago
It loses some games to Stockfish 13 and 14, and Lc0 - rarely at slow time control, and more often at blitz and bullet (actually, it has losses all the way down to Stockfish 9 in blitz).

Partly because of the way it tries to search more widely to avoid tactical traps, it can also be a little sloppy in holding advantages or minimizing losses (this could use some more work and tuning). This ends up making it a little drawish, so it loses less than you'd expect to Stockfish 14, but also doesn't beat up weaker engines as well as Stockfish 14 does.

You can see some of this in the raw tournament results[1]. At 40 moves per 15 minutes, repeating, each engine draws with the ones above and below it, but starts to win and lose at a distance of 2 or 3.

At 5+3 time control, ChessCoach goes 1-0-29 vs. Stockfish 12, but Stockfish 12 is better at beating Stockfish 8-11 than ChessCoach is, so CC ends up between SF11 and SF12 in the end.

On Lichess, where there's no "free time" to get ready for searches, ChessCoach's naïve node allocation/deallocation makes it waste time and means it can't ponder for very long on the opponent's time - a big opportunity for improvement (it needs a multi-threaded pool deallocator that can feed nodes back to local pools for the long-lived search threads). I think it's also hitting a bug with Syzygy memory mapping that Stockfish works around by reloading on every "ucinewgame" (which I don't trigger on Lichess). So, overall, its performance on Lichess is worse.

Also, you can't read too much into this data - very few games, and no opening book.

[1] https://chrisbutner.github.io/ChessCoach/data.html#appendix-...

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
thomasahle · 4 years ago
I think training this as a separate head on top of a frozen AlphaZero model makes a lot of sense. I don't think anyone has figured out how to do language learning with reinforcement training.

Actually, I can't figure out from your explanation: why did you train the whole network yourself instead of just using Leela's network and training the commentary head on top?

If you wanted to incorporate the search, maybe you could just take the 1800 or so probabilities output by the MCTS and add some layers on top of that before concatenating with the other data fed into the transformer.

In either case, this is a fantastic project and perhaps an even more impressive write-up! Congrats and thank you!

cbutner · 4 years ago
It was partly because I was looking to improve self-play and training tractability on a home desktop with 1 GPU (complete failure), and partly to learn about everything from scratch. I would be interested to see how strong it is with the same search but with Leela's inference backend (for GPU at least) and network.

In terms of search-into-commentary, concatenating like that may be interesting, as long as it can learn to map across - definitely plausible without too much work. I was originally thinking of something more complicated, combining multiple raw network outputs across the tree through some kind of trained weighting, or an additional model applied via recurrence, but punted on it.
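
To make that concrete, the shape of what you're suggesting might look roughly like this - purely a sketch with invented names and dimensions, not the actual ChessCoach code:

  # Hypothetical sketch: project the MCTS visit distribution into the same
  # width as the board features and append it as one extra "token" of the
  # memory that the commentary transformer attends over.
  import tensorflow as tf

  NUM_MOVES = 1858     # the "1800 or so" move probabilities (exact count depends on the move encoding)
  FEATURE_DIM = 256    # assumed width of the decoder memory

  visit_probs = tf.keras.Input(shape=(NUM_MOVES,), name="visit_probs")
  board_features = tf.keras.Input(shape=(64, FEATURE_DIM), name="board_features")

  # Learn a mapping from the search distribution into feature space,
  # then concatenate it with the other data fed into the transformer.
  search_token = tf.keras.layers.Dense(FEATURE_DIM, activation="relu")(visit_probs)
  search_token = tf.keras.layers.Reshape((1, FEATURE_DIM))(search_token)
  decoder_memory = tf.keras.layers.Concatenate(axis=1)([board_features, search_token])

  encoder = tf.keras.Model([visit_probs, board_features], decoder_memory)

The open question is whether the decoder can actually learn to map from visit counts to words, but it's cheap to try.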

Ignore my BLEU comment, mixed those up between replies - that was the other potential use of search trees for commentary, an MCTS/PUCT-style alternative to traditional sequential top-k/top-p sampling, once you have logits and are deciding which paragraph to generate.

Thanks!

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
gowld · 4 years ago
Can you have it play more games by giving it less time per turn (~2500 rating is plenty good for an opponent/coach) and playing games concurrently while it waits for the human to play?

How much does a game cost in CPU time / money?

How do I get the commentary for a game I played? Oh, it's in Analysis page.

It plays chess very well, but the commentary is incoherent and doesn't match the game well -- the attacks described are nonsense and the coordinates are wrong. It seems a little confused about which side is which? It thinks a rook can diagonally attack a bishop, and seems to name squares opposite from their actual names.

cbutner · 4 years ago
That's a good idea. Bigger problems than time-slicing are probably GPU/TPU device ownership and GPU/TPU memory usage with multiple games going in parallel. There may be some ways to multiplex it intelligently though.
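
One obvious shape for that multiplexing (just a sketch with made-up names, not how ChessCoach works today) is to funnel evaluation requests from all the concurrent games into shared batches:

  import queue
  import threading

  class BatchedEvaluator:
      def __init__(self, predict_fn, max_batch=16, timeout=0.01):
          self.predict_fn = predict_fn  # runs the network on a list of positions
          self.requests = queue.Queue()
          self.max_batch = max_batch
          self.timeout = timeout
          threading.Thread(target=self._loop, daemon=True).start()

      def evaluate(self, position):
          # Called from each game's search thread; blocks until the batch runs.
          done = threading.Event()
          slot = {"position": position, "done": done, "result": None}
          self.requests.put(slot)
          done.wait()
          return slot["result"]

      def _loop(self):
          while True:
              # Drain the queue into one batch, run a single prediction,
              # then wake up every waiting game.
              batch = [self.requests.get()]
              try:
                  while len(batch) < self.max_batch:
                      batch.append(self.requests.get(timeout=self.timeout))
              except queue.Empty:
                  pass
              results = self.predict_fn([s["position"] for s in batch])
              for s, result in zip(batch, results):
                  s["result"] = result
                  s["done"].set()

That keeps the device busy without each game needing its own copy of the model in memory.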

Costs are difficult to work out - it depends on cloud vs. self-hosting, what kind of TPUs/GPUs, and how long you're calculating for.

The advantage that classical/NNUE engines have is that they can more easily spread over distributed frameworks like Fishtest.

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
lemonade5117 · 4 years ago
I see. I guess compute-intensive stuff is usually implemented in C++. By the way, if you don't mind, could you share your experience in learning RL? I am struggling through Sutton and Barto's text right now and wondering if I'll progress faster if I just "dive into things." Also, nice project!
cbutner · 4 years ago
I think it always helps to have a project to apply things to as you're learning something, even if it means coming up with something small. While preparing, I found it helpful to read for at least an hour each morning, and then divided the rest of the day into learning vs. "diving in" as I felt like it.

Getting deep into RL specifically wasn't so necessary for me because I was just replicating AlphaZero there, although reading papers on other neural architectures, training methods, etc. helped with other experimentation.

You may be well past this, but my biggest general recommendation is the book, "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" to quickly cover a broad range of statistics, APIs, etc., at the right level of practicality before going further into different areas (for PyTorch, I'm not sure what’s best).

Similarly, I was familiar with the calculus underpinnings but did appreciate Andrew Ng's courses for digging into backpropagation etc., especially when covering batching.

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
lemonade5117 · 4 years ago
Could using something like AlphaZero.jl make it more efficient?

https://github.com/jonathan-laurent/AlphaZero.jl

cbutner · 4 years ago
The engine itself is in C++, but it calls into TensorFlow via Python as a portability/distribution vs. performance trade-off.

Next steps could be using one of Lc0's backends for GPU scenarios, or taking the other side of the trade and using the C++ API for TPU.

There are also your typical CPU and memory optimizations that could be made - there's some baseline work there, but nothing targeted.

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
mcyc · 4 years ago
This is a fantastic project. Thanks for sharing!

I had a nice long conversation with two of the authors of [0] at ACL.

One thing we discussed was the reverse problem. That is, as a player, could I give commands to the model and have the engine figure out the moves that would best satisfy them.

This ranges from the concrete, like "take the black square bishop" (there's still variability, like which piece should take it, or whether it's even possible), to more complex positional stuff like "set up to attack the kingside."

Any thoughts on this line of research?

[0] Automated Chess Commentator Powered by Neural Chess Engine (Zang, Yu & Wan, 2019) https://arxiv.org/pdf/1909.10413.pdf

cbutner · 4 years ago
SentiMATE[1] looks at one of the reverse problems in a way - training an engine on commentary data - although it's not exactly what you're talking about.

I think this line of thinking could eventually lead to automated metrics for commentary evaluation, which could in turn lead to better methods than top-k/top-p for turning a bunch of sequential logits into a sentence or paragraph - basically treat it like MCTS/PUCT also.
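
As a toy illustration of what I mean by treating it like MCTS/PUCT - the decoder's softmax as the prior over next tokens, and the (still missing) automated metric as the value - something like this, where every name is invented and the stubs stand in for real models:

  import math
  import numpy as np

  VOCAB_SIZE = 1000  # assumed vocabulary size
  C_PUCT = 1.5

  def decoder_priors(prefix):
      # Stub for the commentary decoder: probabilities over the next token.
      rng = np.random.default_rng(abs(hash(tuple(prefix))) % (2 ** 32))
      logits = rng.normal(size=VOCAB_SIZE)
      probs = np.exp(logits - logits.max())
      return probs / probs.sum()

  def commentary_value(prefix):
      # Placeholder for the hard part: an automated metric scoring the text.
      return 0.0

  class Node:
      def __init__(self, prior):
          self.prior = prior
          self.visits = 0
          self.value_sum = 0.0
          self.children = {}  # token -> Node

      def q(self):
          return self.value_sum / self.visits if self.visits else 0.0

  def puct_child(node):
      # PUCT: balance estimated value against prior-weighted exploration.
      total = sum(c.visits for c in node.children.values())
      return max(node.children.items(),
                 key=lambda kv: kv[1].q()
                 + C_PUCT * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits))

  def simulate(root, prefix):
      # Walk down by PUCT, expand the leaf with decoder priors, back up the value.
      node, path = root, [root]
      while node.children:
          token, node = puct_child(node)
          prefix = prefix + [token]
          path.append(node)
      for token, p in enumerate(decoder_priors(prefix)):
          node.children[token] = Node(float(p))
      value = commentary_value(prefix)
      for n in path:
          n.visits += 1
          n.value_sum += value

You'd run simulate() some number of times from the root, commit to the most-visited token, and repeat - the same shape as the move search, just with a much fuzzier value signal.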

The problem is that if you look at high-level commentary - maybe Radjabov-MVL on https://www.chess.com/news/view/2021-champions-chess-tour-fi... (I'm not the best judge, just a quick search) - it's not often possible to predict the move starting with the comment. And if you did, you might end up with very dry metrics and reverse commentary.

But I think this direction has a lot of potential beyond just chess, as more algorithmic support for generation in pure NN-based language models.

[1] https://arxiv.org/pdf/1907.08321.pdf

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
owenmarshall · 4 years ago
The gameplay seems solid, but the comments are all over the map:

https://lichess.org/4l1urWeU

I wonder if we are getting snippets of variations in some cases.

cbutner · 4 years ago
It does train on variations too, given the scarcity of data available, so that can hurt accuracy, mood, etc.

cbutner commented on Show HN: ChessCoach – A neural chess engine that comments on each player's moves   chrisbutner.github.io/Che... · Posted by u/cbutner
thomasahle · 4 years ago
So do I understand correctly: This is a new head on top of the AlphaZero model?

That is, in addition to the usual evaluation and policy heads, this takes the intermediate board representation and outputs a seed vector that is fed into a transformer text generator?

Or do other things go into the seed? Like the search tree somehow? Otherwise I suppose the commentary will not be able to comment on deeper tactics?

Or maybe this doesn't work using a seed vector at all, but with a custom integration from the board into the transformer somehow?

cbutner · 4 years ago
The original hope was for this to be a third head on top of the AlphaZero model, but I couldn't think of a way to generate commentary during self-play (such that it would gradually improve), and trying to rotate supervised commentary training into the main schedule ended up hurting both sides because of the disjoint datasets.

So, now the commentary decoder is just trained separately on the final primary model. The previous and current game positions are fed into the primary model, and the outputs are taken from the final convolutional layer, just before the value and policy heads. Then, that data plus the side to play is positionally encoded and fed into a transformer decoder.
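
Here's a very rough sketch of that wiring - shapes, layer sizes and the positional encoding are all invented stand-ins (the real model is in the repo), but it shows the data flow:

  import numpy as np
  import tensorflow as tf

  PLANES = 101      # assumed input planes per position
  FILTERS = 256     # assumed width of the final convolutional layer
  SEQ = 2 * 64 + 1  # previous + current position squares, plus side to play

  def final_conv_features(board):
      # Stand-in for the primary model's tower up to its final conv layer.
      x = tf.keras.layers.Conv2D(FILTERS, 3, padding="same", activation="relu")(board)
      return tf.keras.layers.Reshape((64, FILTERS))(x)

  previous = tf.keras.Input(shape=(8, 8, PLANES), name="previous_position")
  current = tf.keras.Input(shape=(8, 8, PLANES), name="current_position")
  side = tf.keras.Input(shape=(1,), name="side_to_play")

  # Final-conv-layer features for both positions, plus a side-to-play "token".
  tokens = tf.keras.layers.Concatenate(axis=1)([
      final_conv_features(previous),
      final_conv_features(current),
      tf.keras.layers.Reshape((1, FILTERS))(tf.keras.layers.Dense(FILTERS)(side)),
  ])

  # Fixed sinusoidal positional encoding over the combined token sequence.
  pos = np.arange(SEQ)[:, None]
  dim = np.arange(FILTERS)[None, :]
  angles = pos / np.power(10000.0, (2 * (dim // 2)) / FILTERS)
  tokens = tokens + np.where(dim % 2 == 0, np.sin(angles), np.cos(angles)).astype("float32")

  # One cross-attention block standing in for the transformer decoder, which
  # attends over the encoded positions while generating commentary tokens.
  commentary = tf.keras.Input(shape=(None, FILTERS), name="commentary_embeddings")
  attended = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=32)(
      query=commentary, value=tokens, key=tokens)

  model = tf.keras.Model([previous, current, side, commentary], attended)

In the real thing, the position features come from the trained primary model rather than a fresh conv layer, and there's a full decoder stack rather than a single attention block.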

It would be better for a search tree/algorithm to be used for commentary too so that tactics could be better understood, but that would need some kind of subjective BLEU equivalent, and metrics like those don't work well for chess commentary.

You can see a diagram of the architecture here: https://chrisbutner.github.io/ChessCoach/high-level-explanat...
