ICAPS Online Summer School – Reading

Temporal Planning

  • Automated Planning and Acting; Chapter “Deliberation with Temporal Models”, Section 4.1 (Introduction) and Section 4.2 (Temporal Representation).
    Ghallab, M., Nau, D., Traverso, P., (2016).

Lab: Plan Synthesis

  • The PDDL wiki, which includes a reference and guide:
  • An Introduction to the Planning Domain Definition Language.
    Haslum, P., Lipovetzky, N., Magazzeni, D., Muise, C., (2019).
  • PDDL – The Planning Domain Definition Language.
    Ghallab, M., Knoblock, C., Wilkins, D., Barrett, A., Christianson, D., Friedman, M., Kwok, C., Golden, K., Penberthy, S., Smith, D., Sun, Y., & Weld, D. (1998).
  • PDDL2.1: An Extension to PDDL for Expressing Temporal Planning Domains.
    Fox, M., & Long, D. (2003). CoRR, abs/1106.4561.
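To give a feel for what these references describe, here is a minimal STRIPS-style domain and problem. This is an invented toy example (the `delivery` domain, its predicates, and object names are made up for illustration and come from none of the readings above):

```pddl
;; Hypothetical toy domain: a robot moves between connected locations
;; and picks up items.
(define (domain delivery)
  (:requirements :strips :typing)
  (:types loc item)
  (:predicates (robot-at ?l - loc)
               (item-at ?i - item ?l - loc)
               (holding ?i - item)
               (connected ?a - loc ?b - loc))
  (:action move
    :parameters (?a - loc ?b - loc)
    :precondition (and (robot-at ?a) (connected ?a ?b))
    :effect (and (not (robot-at ?a)) (robot-at ?b)))
  (:action pick-up
    :parameters (?i - item ?l - loc)
    :precondition (and (robot-at ?l) (item-at ?i ?l))
    :effect (and (holding ?i) (not (item-at ?i ?l)))))

;; A matching problem instance: start at a, the package is at b.
(define (problem deliver-pkg)
  (:domain delivery)
  (:objects a b - loc pkg - item)
  (:init (robot-at a) (item-at pkg b) (connected a b) (connected b a))
  (:goal (holding pkg)))
```

A planner should find the two-step plan (move a b), (pick-up pkg b); see the Haslum et al. guide for the full syntax.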

Planning and Plan Execution for Human-Robot Interaction tasks

Integrated Planning and Acting using Operational Models

  • Integrating acting, planning, and learning in hierarchical operational models.
    Patra, S., Mason, J., Kumar, A., Ghallab, M., Traverso, P., Nau, D., In ICAPS 2020. Best student paper honourable mention award.

Relevant Introduction

Sutton and Barto, Reinforcement Learning: An Introduction
– edition 1 (can be read quickly): http://incompleteideas.net/book/first/the-book.html
– edition 2 (more modern update): http://incompleteideas.net/book/the-book-2nd.html

Monte Carlo Tree Search
– Thomas Keller had some great insights in his THTS paper:
– His PROST planner, cited in the above paper, has won several ICAPS probabilistic planning competitions; it and more recent versions of his MCTS work can be found on his web page:
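– As a rough illustration of the core idea (plain UCT, not Keller's THTS or PROST themselves), here is a minimal sketch on an invented toy problem: a random walk on the integers that earns reward 1 for ending exactly at a target state. All names and constants are made up for this example:

```python
import math
import random

# Invented toy domain: a walk on the integers starting at 0; after HORIZON
# steps the agent earns reward 1 iff it ends exactly at TARGET.
TARGET, HORIZON, ACTIONS = 3, 3, (+1, -1)

def rollout(state, depth):
    """Random playout to the horizon; returns the terminal reward."""
    while depth < HORIZON:
        state += random.choice(ACTIONS)
        depth += 1
    return 1.0 if state == TARGET else 0.0

def uct(root=0, iters=2000, c=1.4):
    """UCT: selection by UCB1, expansion, random rollout, backpropagation."""
    N, Q = {}, {}  # visit counts / summed returns, keyed by (state, depth, action)
    for _ in range(iters):
        s, d, path, ret = root, 0, [], None
        while d < HORIZON:
            total = sum(N.get((s, d, a), 0) for a in ACTIONS)
            def ucb(a):
                n = N.get((s, d, a), 0)
                if n == 0:
                    return math.inf  # force each action to be tried once
                return Q[(s, d, a)] / n + c * math.sqrt(math.log(total) / n)
            a = max(ACTIONS, key=ucb)
            path.append((s, d, a))
            first_visit = (s, d, a) not in N
            s, d = s + a, d + 1
            if first_visit:          # expansion: random rollout from the new node
                ret = rollout(s, d)
                break
        if ret is None:              # reached the horizon inside the tree
            ret = 1.0 if s == TARGET else 0.0
        for k in path:               # backpropagation
            N[k] = N.get(k, 0) + 1
            Q[k] = Q.get(k, 0.0) + ret
    # Recommend the root action with the highest mean return.
    return max(ACTIONS, key=lambda a: Q.get((root, 0, a), 0.0)
               / max(N.get((root, 0, a), 1), 1))
```

Here only +1 can still reach the target, so `uct()` should recommend +1; THTS generalizes exactly these four phases (selection, expansion, rollout/heuristic, backup) into a configurable schema.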

RTDP papers
– Real-time Dynamic Programming (discussed above in Thomas Keller’s THTS work) is a beautiful way to combine the benefits of dynamic programming and tree search:
– Bounded Real-time Dynamic Programming is a simple but beautiful RTDP algorithm that exploits lower and upper bound properties of dynamic programming:
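– To make the lower/upper-bound idea concrete, here is a small sketch in the spirit of Bounded RTDP (not the paper's exact algorithm) on an invented stochastic shortest-path problem: both bounds are backed up along simulated trajectories, and successor states are sampled in proportion to their remaining bound gap:

```python
import random

# Invented stochastic shortest-path MDP: states 0..GOAL, every step costs 1.
# Action a in {1, 2} advances by a with probability P[a], otherwise stays put.
GOAL, P = 4, {1: 0.9, 2: 0.5}

def succ(s, a):
    """(probability, next state) outcomes of action a in state s."""
    return [(P[a], min(s + a, GOAL)), (1 - P[a], s)]

def q(V, s, a):
    """One-step lookahead cost of action a under value function V."""
    return 1.0 + sum(p * V[t] for p, t in succ(s, a))

def brtdp(trials=500, start=0):
    L = [0.0] * (GOAL + 1)           # admissible lower bound (costs are >= 0)
    U = [100.0] * (GOAL + 1)         # crude but valid upper bound
    U[GOAL] = L[GOAL] = 0.0          # the goal is absorbing and cost-free
    for _ in range(trials):
        s = start
        while s != GOAL:
            a = min(P, key=lambda b: q(L, s, b))  # greedy on the optimistic bound
            L[s] = min(q(L, s, b) for b in P)     # Bellman-back up both bounds
            U[s] = min(q(U, s, b) for b in P)
            outs = succ(s, a)
            gaps = [p * (U[t] - L[t]) for p, t in outs]
            if sum(gaps) < 1e-9:                  # successors have converged
                break
            # BRTDP's trick: sample successors in proportion to their bound gap,
            # steering simulation toward the states that still matter.
            s = random.choices([t for _, t in outs], weights=gaps)[0]
    return L, U
```

When the gap U − L at the start state is small, the greedy policy on either bound is provably near-optimal, which gives a natural anytime stopping criterion.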

– I mentioned today that Marcus Hutter has worked on optimal decision making when you only have observations and cannot easily define a state. He calls this model AIXI, and he has a website and book here: http://www.hutter1.net/ai/

Latent State Models
– I mentioned a very interesting trend of recent work on learning and planning with latent models of state (rather than the explicit models we’re used to dealing with in ICAPS). Here are some pointers:
  Predictron (DeepMind): https://arxiv.org/abs/1612.08810
  MuZero (DeepMind): https://deepmind.com/research/publications/Mastering-Atari-Go-Chess-and-Shogi-by-Planning-with-a-Learned-Model
  Masataro Asai’s Research Publications: https://scholar.google.co.jp/citations?user=b4UzH5AAAAAJ&hl=en

Planning in Continuous State MDPs
– There is excellent work on planning with continuous states and actions in the deterministic setting; however, there is not nearly as much work as I would like to see in the stochastic setting (though perhaps I have missed recent work). Let me give you two groups of pointers based on the work of my own group:
… Symbolic Dynamic Programming (exact dynamic programming solutions to continuous state and action MDPs):
    Continuous State MDPs: https://ssanner.github.io/papers/anu/cont_mdp.pdf
    Continuous State and Action MDPs: https://ssanner.github.io/papers/anu/aaai12_sdp.pdf
    (see also Related Work in these papers and many more papers with “Symbolic Dynamic Programming” in the title on my group’s web page: https://d3m.mie.utoronto.ca/publications/)
… Deriving Plans and Policies in Continuous State and Action MDPs specified in RDDL:
    Policy Learning: https://ssanner.github.io/papers/aaai19_drp.pdf
    (the above paper includes a Python version of a RDDL parser)
    Planning in Deterministic Continuous RDDL Models: https://ssanner.github.io/papers/nips17_tfplan.pdf
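– The idea behind that last paper — treating planning in a deterministic continuous model as gradient-based optimization of the action sequence — can be sketched in a few lines. The toy dynamics and cost below are invented for illustration (the paper itself backpropagates through TensorFlow models of much richer RDDL domains):

```python
# Invented toy model: deterministic 1-D dynamics s_{t+1} = s_t + a_t,
# a quadratic action penalty, and a terminal penalty for missing the goal.
GOAL, H, LAM = 10.0, 5, 0.1

def cost(actions, s0=0.0):
    """Roll the dynamics forward and score the resulting trajectory."""
    s = s0
    for a in actions:
        s += a
    return LAM * sum(a * a for a in actions) + (s - GOAL) ** 2

def plan(steps=500, lr=0.05):
    """Plan by gradient descent on the action sequence itself."""
    actions = [0.0] * H
    for _ in range(steps):
        miss = sum(actions) - GOAL  # final-state error (starting from s0 = 0)
        # Chain rule through the linear dynamics gives, for every step t:
        #   d cost / d a_t = 2 * LAM * a_t + 2 * miss
        actions = [a - lr * (2 * LAM * a + 2 * miss) for a in actions]
    return actions
```

Here the gradient is written out by hand because the dynamics are linear; the appeal of the TensorFlow approach is that automatic differentiation does this through arbitrary differentiable transition and reward functions.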

RDDL – Relational Dynamic Influence Diagram Language
– A compact language for specifying MDPs, used in the past decade of ICAPS probabilistic planning competitions. Overview, slides, exercises, source code, etc.:
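– As a rough sketch of what RDDL looks like (an invented toy domain, not one of the competition domains — consult the official overview and examples for exact syntax):

```rddl
// Hypothetical domain: pushing has an 80% chance of finishing the task;
// reward 1 per step once done. State transitions are given as conditional
// probability functions (cpfs) over primed ("next-state") fluents.
domain coin_push {
    pvariables {
        done : { state-fluent,  bool, default = false };
        push : { action-fluent, bool, default = false };
    };
    cpfs {
        done' = done | (push ^ Bernoulli(0.8));
    };
    reward = if (done) then 1.0 else 0.0;
}
```

Unlike PDDL's action-centered effects, RDDL is fluent-centered: each state variable gets one stochastic update equation, which is what makes factored MDP solvers and simulators easy to generate from it.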