Fusing LLM & Classical Planning

In the previous part (Why are LLMs required for Planning?), we asked: can we improve LLM planning so that it comes with some formal guarantees?

To answer that, let’s first recap.

                                LLM Planning   Classical Planning
Open-world Planning             ✓              ✗
Handling Abstract Tasks         ✓              ✗
Handling Partial Observability  ✓              ✗
Feasibility                     ✗              ✓
Optimality                      ✗              ✓

The formal guarantees come from classical planning. Can we somehow combine the best of both? Any guesses? Here’s a hint to jog your memory.

[Figure]

Ring any bells now? You guessed it right.

Considering LLM Planning as System 1 and Classical Planning as System 2, we look at two different ways of combining the best of both.

  • Equip LLMs with verifiers, dynamics, and heuristic search.
  • Use LLMs as knowledge bases for open-world planning.

[Figure]

[Figure]

Let’s start with some motivation. Lightman et al. [1] show that:

  • LLMs can generate good solutions if called multiple times.
  • Verifiers help discard bad candidate actions/plans.
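
The two observations above suggest a simple generate-and-verify loop: sample several candidate plans and keep one that passes an external check. Below is a minimal sketch in Python, not the method of [1]. `toy_propose` stands in for an LLM call and `toy_verify` for an external verifier; both names, and the key-and-door domain, are hypothetical and exist only for illustration.

```python
import random
from typing import Callable, List, Optional

Plan = List[str]


def sample_and_verify(
    propose: Callable[[random.Random], Plan],
    verify: Callable[[Plan], bool],
    num_samples: int = 20,
    seed: int = 0,
) -> Optional[Plan]:
    """Sample up to num_samples plans; return the first one the verifier accepts."""
    rng = random.Random(seed)
    for _ in range(num_samples):
        plan = propose(rng)
        if verify(plan):
            return plan
    return None  # no sampled plan passed verification


# Hypothetical toy domain: a door that must be unlocked with a key before opening.
ACTIONS = ["pick up key", "unlock door", "open door"]


def toy_propose(rng: random.Random) -> Plan:
    """Stand-in for an LLM call: proposes the actions in a random order."""
    plan = ACTIONS[:]
    rng.shuffle(plan)
    return plan


def toy_verify(plan: Plan) -> bool:
    """Stand-in for an external verifier: simulates the plan and checks preconditions."""
    has_key = unlocked = opened = False
    for action in plan:
        if action == "pick up key":
            has_key = True
        elif action == "unlock door":
            if not has_key:
                return False
            unlocked = True
        elif action == "open door":
            if not unlocked:
                return False
            opened = True
    return opened


if __name__ == "__main__":
    print(sample_and_verify(toy_propose, toy_verify, num_samples=200))
```

The proposer is allowed to be wrong most of the time. Any plan this loop returns has passed the verifier, which is where the guarantee comes from.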

1. LLMs + external verifiers

[Figure]

[Figure]

SayCan [2] has a critical limitation: it selects actions based solely on their feasibility, not on their relevance to the goal. Consider this analogy: if you’re traveling from San Francisco to New York, would it make sense to fly via New Delhi simply because it’s feasible? To address this, SayCanPay [3] adds a Pay model that estimates the payoff of an action with respect to the goal. That is, actions with a higher payoff with respect to the goal are more likely to be selected.
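
The travel analogy can be turned into a small numeric sketch. It assumes the Say, Can, and Pay scores are combined by multiplication, and it gives both flights the same Say score so that only feasibility and payoff differ. The `Candidate` class, the scoring functions, and all the numbers are made up for illustration and are not taken from [2] or [3].

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    action: str
    say: float  # how likely the LLM is to propose the action (assumed value)
    can: float  # estimated feasibility of the action (assumed value)
    pay: float  # estimated payoff with respect to the goal (assumed value)


def saycan_score(c: Candidate) -> float:
    """Score without a payoff term: LLM score times feasibility."""
    return c.say * c.can


def saycanpay_score(c: Candidate) -> float:
    """Score with the payoff term included (product is an assumption of this sketch)."""
    return c.say * c.can * c.pay


# Goal: travel from San Francisco to New York.
candidates = [
    Candidate("fly San Francisco -> New Delhi", say=0.5, can=0.9, pay=0.1),
    Candidate("fly San Francisco -> New York", say=0.5, can=0.8, pay=0.9),
]

best_without_pay = max(candidates, key=saycan_score)
best_with_pay = max(candidates, key=saycanpay_score)

if __name__ == "__main__":
    print("without Pay:", best_without_pay.action)
    print("with Pay:   ", best_with_pay.action)
```

With these numbers the detour wins on feasibility alone, while the payoff term tips the choice toward the direct flight.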

2. LLMs + heuristic search

[Figure]

SayCanPay [3] also proposes heuristic search using an aggregated score of the Say (LLM), Can (feasibility), and Pay (optimality) models. As shown in the figure, Beam-Action search performs a beam search over actions, which mirrors the search in heuristic planners. The authors show that the overall score of each action is the sum of the aggregated score and a heuristic score, akin to A* planning.
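
A beam search over actions can be sketched in a few lines. This is a simplified illustration, not the implementation from [3]: it assumes the per-action score is the product of Say, Can, and Pay, and it accumulates log-scores along a plan so that products become sums. The function `beam_action_search`, the toy scorer, and the key-and-door domain are all hypothetical.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Plan = Tuple[str, ...]


def beam_action_search(
    actions: Sequence[str],
    score: Callable[[Plan, str], float],
    is_goal: Callable[[Plan], bool],
    beam_width: int = 2,
    max_steps: int = 5,
) -> Optional[List[str]]:
    """Keep the beam_width best partial plans, extending each by one action per step."""
    beams: List[Tuple[Plan, float]] = [((), 0.0)]  # (partial plan, accumulated log-score)
    for _ in range(max_steps):
        expanded = []
        for plan, log_score in beams:
            for action in actions:
                s = score(plan, action)
                if s > 0.0:  # a zero score marks the action as infeasible here
                    expanded.append((plan + (action,), log_score + math.log(s)))
        if not expanded:
            break
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
        for plan, _ in beams:  # beams are in score order, best first
            if is_goal(plan):
                return list(plan)
    return None


# Hypothetical toy domain and scorer.
ORDER = ["pick up key", "unlock door", "open door"]
ACTIONS = ORDER + ["wander around"]


def toy_score(plan: Plan, action: str) -> float:
    say = 0.25  # assumed: the LLM has no preference among the four actions
    if action == "wander around":
        can, pay = 1.0, 0.1  # always feasible, little payoff
    else:
        i = ORDER.index(action)
        ready = all(step in plan for step in ORDER[:i]) and action not in plan
        can, pay = (1.0 if ready else 0.0), 0.9
    return say * can * pay


def toy_is_goal(plan: Plan) -> bool:
    return "open door" in plan


if __name__ == "__main__":
    print(beam_action_search(ACTIONS, toy_score, toy_is_goal))
```

Widening the beam trades extra scoring calls for a lower chance of discarding a plan that only looks good a few steps later.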

3. LLMs + dynamics models

[Figure]

[Figure]

LLMs as Knowledge Base for Planners

[Figure]

[Figure]

Perhaps a more apt conclusion would be:

[Figure]

[Figure]

References

[1] Lightman, H. (2023). Let’s Verify Step by Step.

[2] Ahn, M. (2022). Do As I Can, Not As I Say: Grounding Language in Robotic Affordances. 6th Annual Conference on Robot Learning.

[3] Hazra, R. (2023). SayCanPay: Heuristic Planning with Large Language Models Using Learnable Domain Knowledge. Proceedings of the AAAI Conference on Artificial Intelligence, 20123–20133.

[4] Hao, S. (2023). Reasoning with Language Models is Planning with World Models. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 8154–8173.

[5] Du, Y. (2024). Video Language Planning. The Twelfth International Conference on Learning Representations.

[6] Liu, B. (2023). LLM+P: Empowering Large Language Models with Optimal Planning Proficiency.

[7] Wong, L. (2024). Learning Grounded Action Abstraction from Language. The Twelfth International Conference on Learning Representations.




If you found this useful, please cite this as:

Hazra, Rishi (May 2024). Fusing LLM & Classical Planning. https://rishihazra.github.io.

or as a BibTeX entry:

@article{hazra2024fusing-llm-classical-planning,
  title   = {Fusing LLM & Classical Planning},
  author  = {Hazra, Rishi},
  year    = {2024},
  month   = {May},
  url     = {https://rishihazra.github.io/llm-planning/2024/05/26/fusing_llms_and_planners.html}
}