There has been more interesting work on using transformers for robot motion planning [0]. Getting a robot arm from point A to point B without hitting stuff is a very difficult problem: it's both high dimensional and continuous. Previous planning methods are computationally intensive and still not very good, which is one reason robot motion appears 'unnatural' and robots remain bad at many tasks we'd like them to do. This approach appears pretty competitive with other planning methods, planning near-optimal paths faster.
To be fair, they said at the end that their path-finder is not yet competitive with the SOTA. The paper tests how well a transformer does at predicting an execution trace (like in a JIT compiler) and whether this can help improve heuristics in things like path-finding. I'm wary, because transformers are slow.
Worse yet… the Game AI Pro books are entirely free as digital copies. You'd basically be paying $120 for them to be stitched together into a single PDF instead of one per chapter.
If you consider it like a textbook (a very small niche audience for very specialized knowledge), the price does kind of match up, given the small expected sales volume.
Planning is already well taken care of by established techniques ranging from graph search and SAT solvers to OR and Prolog. The point is usually optimization over multiple feasible alternatives, which I have a hard time seeing transformers being up to. I see the role of LLM techniques more in translating natural-language descriptions to executable programs - but then Prolog is already very close, it being designed for classic NLP after all.
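To make the "graph search already handles this" point concrete, here is a minimal breadth-first planner over an explicit state space - a toy sketch of my own, not any particular solver:

```python
from collections import deque

def bfs_plan(start, goal, successors):
    """Return a shortest action sequence from start to goal.
    successors(state) yields (action, next_state) pairs."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if state == goal:
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # goal unreachable

# Toy domain: slide a token along a bounded number line.
moves = lambda s: [("inc", s + 1), ("dec", s - 1)] if -5 <= s <= 5 else []
print(bfs_plan(0, 3, moves))  # ['inc', 'inc', 'inc']
```

That's the whole machinery; everything else (SAT, Prolog, OR) is about making the feasible-alternative search tractable at scale.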
> Searchformer [...] solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search ... [and] significantly outperforms baselines that predict the optimal plan directly with a 5-10× smaller model size and a 10× smaller training dataset.
So this can be interpreted as a generic statement saying there might be progress during learning iterations (compared to its own arbitrary first-iteration baseline). I think it's important not to get carried away and assume the recent immense progress in generative AI is easily repeated for any other AI task. Quantum computing, too, has been advertised for over a decade as a breakthrough in planning and optimization efficiency.
> We also demonstrate how Searchformer scales to larger and more complex decision making tasks like Sokoban with improved percentage of solved tasks and shortened search dynamics.
Worth mentioning that Sokoban is nowhere near a complex decision-making task, let alone state of the art, in an academic or commercial planning/optimization context.
> A∗'s search dynamics are expressed as a token sequence outlining when task states are added and removed [...] during symbolic planning.
Go ahead and compare against e.g. Quantum Prolog's readily available container planning problem and solutions [1], which generate exactly that in the browser - a plan with symbolic actions to perform - or alternatively, a classic Graphplan or CLIPS planning competition problem.
Machine Translation used to involve complicated grammatical decoding, using search. Now we just use transformers for MT, with much simpler decoding that doesn't really need search.
Perhaps we can reach full inception. Let's use our current best-of-breed predictive models to learn heuristics for neural architecture search (NAS) and search for new neural blocks (better than transformer and mamba).
...and finally enter a world where literally no one understands anymore how the technology works, not even the people developing it. Singularity, here we come...
We could then go on to create something akin to the Mechanicum, create the position of tech priests, and have them pray to the machine god on our behalf.
Shameless plug, but if you are interested in Sokoban-type games, check out https://thinky.gg . It features a fun Sokoban variant (called Sokopath) as well as another NP-hard variant called Pathology (the goal is to get from point A to point B in the fewest steps).
Many in the community have tried to create various solvers, and things get very hard once the grid gets to be over 5x5. However, some interesting levels with very high maximum step counts have been discovered by the thinky community (via simulated annealing).
So A* is optimally efficient under the specific constraints it assumes: given the same admissible, consistent heuristic, no search algorithm that still guarantees optimal solutions can expand fewer nodes (up to tie-breaking). You can't do better in general.
However, sometimes the specific domain you are searching has other constraints that can be exploited to do better than A*. An example is Jump Point Search, which exploits certain properties of uniform-cost grid searches when you can only move in certain ways.
If you were able to write a general search algorithm that can effectively exploit whatever specific properties the underlying domain has "automatically", without you having to actually work out what they are, that would be useful, right?
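For scale, the A* baseline under discussion fits in a few lines; here's a minimal grid version (my own sketch, assuming unit-cost 4-connected movement - not the paper's implementation):

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks a wall.
    Heuristic: Manhattan distance (admissible for unit-cost moves)."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    while open_heap:
        f, g, pos = heapq.heappop(open_heap)
        if pos == goal:
            return g  # cost of an optimal path
        if g > best_g.get(pos, float("inf")):
            continue  # stale heap entry
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and not grid[nr][nc]:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None  # unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # 6
```

JPS and similar tricks are essentially hand-derived ways of pruning that heap; the paper's pitch is learning such pruning from traces instead.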
Because they used a transformer to reach a nice solution, better than a typical A* search, which is the "naive" or go-to solution - and they didn't have to design an algorithm. I find it quite impressive that a simple encoder-decoder transformer can achieve so much.
It is in the abstract, very first line. "While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks ..."
This paper is interesting (to me) as an example of how to use transformers for decision making. I don't particularly care if it is up to A* standards at the moment.
Because it's further evidence for the "unreasonable effectiveness of transformers", i.e. that transformers are a fully general approach for all kinds of learning tasks, not just next-token prediction. Obviously there is a strong and a weak version of that hypothesis, and the strong version is probably not true, but to the extent that we appear to be homing in on Nature's "one true way" to learn how to accomplish a task, that seems like important news.
In certain computer science problems, a suboptimal action at time t may give rise to an optimal outcome at time >t.
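A toy illustration (made-up edge costs): the greedy choice at time t is exactly the one an optimal planner avoids:

```python
# Two routes from S to G: S->A->G costs 1 + 10 = 11, S->B->G costs 5 + 1 = 6.
cost = {("S", "A"): 1, ("S", "B"): 5, ("A", "G"): 10, ("B", "G"): 1}

# Greedy looks one step ahead; planning scores the whole path.
greedy_first = min(["A", "B"], key=lambda n: cost[("S", n)])
optimal_first = min(["A", "B"], key=lambda n: cost[("S", n)] + cost[(n, "G")])

print(greedy_first)   # A  (locally cheap, globally worse: total 11)
print(optimal_first)  # B  (locally pricier, globally optimal: total 6)
```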
Why wouldn't this be the case for research generally? Has our community really devolved to the point where things should only be noteworthy insofar as they optimize for SOTA for a given problem?
I agree. OP's comment could be quickly rewritten into something useful just by changing the tone, for example:
"26.8% fewer search steps than standard A∗ search"
For reference of prior art, it's slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).
Thank you so much for your comment!
Apologies if my comment came off like an ad - I have mild dyslexia and struggle to keep up with ML papers; I hope others also find this useful!
If you have any feedback or issues, feel free to email us at support [at] trurecord.com
[0] https://sites.google.com/ucsd.edu/vq-mpt/home
It's in Game AI Pro 2 [0] if anyone is curious.
[0] https://www.amazon.ca/Game-AI-Pro-Collected-Professionals/dp...
http://www.gameaipro.com/
Authors still gotta eat.
[1]: https://quantumprolog.sgml.io/
So, slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).
What is impressive in this paper? Why is this on Hacker News?
A* can't solve even the first level of the original 90 Sokoban levels.
What is the scientific contribution of the paper?
They trained a transformer on <sokoban_level, A*_optimal_solution> pairs.
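Roughly, each such pair gets flattened into one token sequence. The encoding below is purely illustrative - my own hypothetical scheme, not the paper's actual tokenization:

```python
def to_tokens(level_rows, plan_moves):
    """Flatten a level and its solver-produced plan into one token list.
    Hypothetical encoding for illustration only."""
    tokens = ["<level>"]
    for row in level_rows:
        tokens.extend(list(row))  # one token per grid cell
        tokens.append("<eol>")
    tokens.append("<plan>")
    tokens.extend(plan_moves)
    tokens.append("<eos>")
    return tokens

# '#' wall, '@' player, '.' goal square (standard Sokoban text notation).
seq = to_tokens(["#@#", "#.#"], ["down"])
print(seq)
```

The model then learns next-token prediction over such sequences, so at inference time it emits the plan tokens directly.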
Anything that is on HN is on HN because the community likes it
What a sad thought.
Comments like these are antithetical to a strong technical sharing community.
"26.8% fewer search steps than standard A∗ search" For reference of prior art, it's slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).