There has been more interesting work on using transformers for robot motion planning [0]. Getting a robot arm from point A to point B without hitting stuff is a very difficult problem: it's both high dimensional and continuous. Previous planning methods are computationally intensive and still not very good, which is one reason robot motion appears 'unnatural' and robots remain bad at many tasks we'd like them to do. This approach appears pretty competitive with other planning methods, planning near-optimal paths faster.
To be fair, they said at the end that their path-finder is not yet competitive with the SOTA. The paper tests how well a transformer does at predicting an execution trace (like in a JIT compiler) and whether this can help improve heuristics in things like path-finding. I'm wary, because transformers are slow.
Worse yet… the Game AI Pro books are entirely free as digital copies. You'd basically be paying $120 for them to be stitched together into a single PDF instead of one per chapter.
If you consider it like a textbook (a very small niche audience for very specialized knowledge), the price does kind of match up, given the small expected sales volume.
Planning is already well taken care of by established techniques ranging from graph search and SAT solvers to OR and Prolog. The point is usually optimization over multiple feasible alternatives, which I have a hard time seeing transformers being up to. I see the role of LLM techniques more in translating natural-language descriptions to executable programs - but then Prolog is already very close, it being designed for classic NLP after all.
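To make the "graph search already handles this" point concrete, here is a minimal breadth-first planner over an explicit state space - a toy sketch of my own, not any particular solver:

```python
from collections import deque

def bfs_plan(start, goal, successors):
    """Return a shortest action sequence from start to goal.
    successors(state) yields (action, next_state) pairs."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if state == goal:
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # goal unreachable

# Toy domain: slide a token along a bounded number line.
moves = lambda s: [("inc", s + 1), ("dec", s - 1)] if -5 <= s <= 5 else []
print(bfs_plan(0, 3, moves))  # ['inc', 'inc', 'inc']
```

That's the whole machinery; everything else (SAT, Prolog, OR) is about making the feasible-alternative search tractable at scale.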
> Searchformer [...] solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search ... [and] significantly outperforms baselines that predict the optimal plan directly with a 5-10× smaller model size and a 10× smaller training dataset.
So this can be interpreted as a generic statement saying there might be progress during learning iterations (compared to its own arbitrary first-iteration baseline). I think it's important not to get carried away and assume the recent immense progress in generative AI is easily repeated for any other AI task. Quantum computing, too, has been advertised for over a decade as a breakthrough in planning and optimization efficiency.
> We also demonstrate how Searchformer scales to larger and more complex decision making tasks like Sokoban with improved percentage of solved tasks and shortened search dynamics.
Worth mentioning that Sokoban is nowhere near a complex decision-making task, let alone state of the art, in an academic or commercial planning/optimization context.
> A∗'s search dynamics are expressed as a token sequence outlining when task states are added and removed [...] during symbolic planning.
Go ahead and compare against e.g. Quantum Prolog's readily available container planning problem and solutions [1], which generate exactly that in the browser - a plan with symbolic actions to perform - or alternatively, a classic Graphplan or CLIPS planning competition problem.
Machine Translation used to involve complicated grammatical decoding, using search. Now we just use transformers for MT, with much simpler decoding that doesn't really need search.
Perhaps we can reach full inception. Let's use our current best-of-breed predictive models to learn heuristics for neural architecture search (NAS) and search for new neural blocks (better than transformer and mamba).
...and finally enter a world where literally no one understands anymore how the technology works, not even the people developing it. Singularity, here we come...
We could then go on to create something akin to the Mechanicum, create the position of tech priests, and have them pray to the machine god on our behalf.
Shameless plug, but if you are interested in Sokoban-type games, check out https://thinky.gg . It features a fun Sokoban variant (called Sokopath) as well as another NP-hard variant called Pathology (the goal is to get from point A to point B in the fewest steps).
Many in the community have tried to create various solvers, and things get very hard once the grid gets to be over 5x5. However, some interesting levels with very high maximum step counts have been discovered by the thinky community (via simulated annealing).
So A* is optimally efficient under the specific constraints it assumes: given the same admissible, consistent heuristic, no search algorithm that still guarantees optimal solutions can expand fewer nodes (up to tie-breaking). You can't do better in general.
However, sometimes the specific domain you are searching has other constraints that can be exploited to do better than A*. An example is Jump Point Search, which exploits certain properties of uniform-cost grid searches when you can only move in certain ways.
If you were able to write a general search algorithm that can effectively exploit whatever specific properties the underlying domain has "automatically", without you having to actually work out what they are, that would be useful, right?
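For scale, the A* baseline under discussion fits in a few lines; here's a minimal grid version (my own sketch, assuming unit-cost 4-connected movement - not the paper's implementation):

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks a wall.
    Heuristic: Manhattan distance (admissible for unit-cost moves)."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, pos = heapq.heappop(open_heap)
        if pos == goal:
            return g  # cost of an optimal path
        if g > best_g.get(pos, float("inf")):
            continue  # stale heap entry
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and not grid[nr][nc]:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None  # unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # 6
```

JPS and similar tricks are essentially hand-derived ways of pruning that heap; the paper's pitch is learning such pruning from traces instead.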
Because they used a transformer to reach a nice solution, better than a typical A* search, which is the "naive" or go-to solution - and they didn't have to design an algorithm. I find it quite impressive that a simple encoder-decoder transformer can achieve so much.
It is in the abstract, very first line. "While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks ..."
This paper is interesting (to me) as an example of how to use transformers for decision making. I don't particularly care if it is up to A* standards at the moment.
Because it's further evidence for the "unreasonable effectiveness of transformers", i.e. that transformers are a fully general approach for all kinds of learning tasks, not just next-token prediction. Obviously there is a strong and a weak version of that hypothesis, and the strong version is probably not true, but to the extent that we appear to be homing in on Nature's "one true way" to learn how to accomplish a task, that seems like important news.
In certain computer science problems, a suboptimal action at time t may give rise to an optimal outcome at time >t.
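A toy illustration (made-up edge costs): the greedy choice at time t is exactly the one an optimal planner avoids:

```python
# Two routes from S to G: S->A->G costs 1 + 10 = 11, S->B->G costs 5 + 1 = 6.
cost = {("S", "A"): 1, ("S", "B"): 5, ("A", "G"): 10, ("B", "G"): 1}

# Greedy looks one step ahead; planning scores the whole path.
greedy_first = min(["A", "B"], key=lambda n: cost[("S", n)])
optimal_first = min(["A", "B"], key=lambda n: cost[("S", n)] + cost[(n, "G")])

print(greedy_first)   # A  (locally cheap, globally worse: total 11)
print(optimal_first)  # B  (locally pricier, globally optimal: total 6)
```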
Why wouldn't this be the case for research generally? Has our community really devolved to the point where things should only be noteworthy insofar as they optimize for SOTA for a given problem?
I agree. OP's comment could be quickly rewritten into something useful just by changing the tone, for example:
"26.8% fewer search steps than standard A∗ search"
For reference of prior art, it's slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).
Thank you so much for your comment!
Apologies if my comment came off like an ad - I have mild dyslexia and struggle to keep up with ML papers; I hope others also find this useful!
If you have any feedback or issues, feel free to email us at support [at] trurecord.com
[0] https://sites.google.com/ucsd.edu/vq-mpt/home
It's in Game AI Pro 2 [0] if anyone is curious.
[0] https://www.amazon.ca/Game-AI-Pro-Collected-Professionals/dp...
http://www.gameaipro.com/
Authors still gotta eat.
[1]: https://quantumprolog.sgml.io/
So, slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).
What is impressive in this paper? Why is this on Hacker News?
A* can't solve even the first level of the original 90 Sokoban levels.
What is the scientific contribution of the paper?
They trained a transformer on <sokoban_level, A*_optimal_solution> pairs.
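Roughly, each such pair gets flattened into one token sequence. The encoding below is purely illustrative - my own hypothetical scheme, not the paper's actual tokenization:

```python
def to_tokens(level_rows, plan_moves):
    """Flatten a level and its solver-produced plan into one token list.
    Hypothetical encoding for illustration only."""
    tokens = ["<level>"]
    for row in level_rows:
        tokens.extend(list(row))  # one token per grid cell
        tokens.append("<eol>")
    tokens.append("<plan>")
    tokens.extend(plan_moves)
    tokens.append("<eos>")
    return tokens

# '#' wall, '@' player, '.' goal square (standard Sokoban text notation).
seq = to_tokens(["#@#", "#.#"], ["down"])
print(seq)
```

The model then learns next-token prediction over such sequences, so at inference time it emits the plan tokens directly.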
Anything that is on HN is on HN because the community likes it
What a sad thought.
Comments like these are antithetical to a strong technical sharing community.
"26.8% fewer search steps than standard A∗ search" For reference of prior art, it's slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).