Giuseppe De Giacomo, degiacomo@diag.uniroma1.it
Luca Iocchi, iocchi@diag.uniroma1.it
Fabio Patrizi, patrizi@diag.uniroma1.it
Alessandro Ronca, ronca@diag.uniroma1.it
Roberto Cipollone (guest lecturer), cipollone@diag.uniroma1.it
Gabriel Paludo Licks (guest lecturer), licks@diag.uniroma1.it
Elena Umili (guest lecturer), umili@diag.uniroma1.it
We present non-Markov decision processes, where rewards and dynamics can depend on the history of events. This is in contrast with Markov Decision Processes, where the dependency is limited to the last state and action. We study how to specify non-Markov reward functions and dynamics functions using Linear Temporal Logic on finite traces. The resulting decision processes are called Regular Decision Processes, and we show how to solve them by extending solution techniques for Markov Decision Processes. Then, we turn to Reinforcement Learning. First, we study the Restraining Bolt, a device that enables an agent to learn a specified non-Markov behaviour while relying on the Markov property. Second, we study how an agent can achieve optimal behaviour in a non-Markov domain by learning a finite-state automaton that describes rewards and dynamics. Specifically, we will cover the following topics: MDPs with Non-Markov Rewards, Non-Markov Dynamics, Regular Decision Processes, Restraining Bolts, Linear Temporal Logic on finite traces as a reward/dynamics specification language, Reinforcement Learning, Deep Reinforcement Learning, Automata Learning. This course is partially based on the work carried out in ERC Advanced Grant WhiteMech and EU ICT-48 TAILOR.
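As a rough illustration of the idea of a history-dependent reward, the following minimal Python sketch tracks a small finite-state automaton alongside the observations. The task ("reward 1 the first time `b` is observed after `a` has been observed"), the symbols, and all function names are hypothetical and chosen only for this example; they are not taken from the course material. The point is that the reward is non-Markov with respect to the observation alone, but becomes Markov once the automaton state is added to the agent's state.

```python
# Hypothetical task: give reward 1 the first time "b" is observed
# after "a" has been observed at some earlier step.
# Automaton states: 0 = nothing relevant seen, 1 = "a" seen,
# 2 = "a" then "b" seen (accepting and absorbing).
DFA_TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
}
ACCEPTING = 2


def dfa_step(q, symbol):
    """Advance the automaton; undefined pairs leave the state unchanged."""
    return DFA_TRANSITIONS.get((q, symbol), q)


def product_step(q, symbol):
    """Markov reward over the extended state: it depends only on (q, symbol)."""
    q_next = dfa_step(q, symbol)
    reward = 1 if (q_next == ACCEPTING and q != ACCEPTING) else 0
    return q_next, reward


def rewards_along(trace):
    """Per-step rewards for a trace of observations, starting in state 0."""
    q = 0
    rewards = []
    for symbol in trace:
        q, r = product_step(q, symbol)
        rewards.append(r)
    return rewards


if __name__ == "__main__":
    # The same observation "b" is rewarded or not depending on the history.
    print(rewards_along(["b", "a", "c", "b"]))
    print(rewards_along(["b", "b"]))
```

In this sketch the observation `b` alone does not determine the reward, whereas the pair (automaton state, observation) does, which is what allows standard Markov solution techniques to be applied to the extended state.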
PhD
20 hours
Semester Course
If you are an AIDA Student, complete the following two steps: (a) register at the link https://forms.gle/Nua6Sef2xgmtsbGs8; (b) also enroll in the same course in the AIDA system, in order for this course to be included on your AIDA Course Attendance Certificate. If you are not an AIDA Student, complete only step (a).
Mondays and Fridays from 11:00 to 13:00 CET (Slot 1) and from 14:00 to 16:00 CET (Slot 2).
English
Blended (Online and In Presence)
We, as instructors, will not give exams for PhD curricula other than Sapienza’s. However, we are happy to discuss the course topics informally with anyone interested in them.