Reinforcement Learning and Sequential Decision-making

Level
Intermediate, Broad, Algorithmic, Methodological.
This topic covers the study and design of machine learning algorithms for online learning, multi-armed bandits and reinforcement learning (RL).

Learning outcomes

Content / Knowledge

Students should be able to:

  • Understand the difference between online and batch learning.
  • Describe the main online learning algorithms and understand the analysis of their performance.
  • Understand the multi-armed bandit problem, describe the main algorithms, and understand the analysis of their performance.
  • Understand the goal of reinforcement learning and the mathematical MDP model.
  • Describe the basic evaluation criteria for RL: finite-horizon, infinite-horizon, and discounted return.
  • Describe the main algorithms for model-based RL and understand their performance guarantees.
  • Describe the main algorithms for model-free RL and understand their performance guarantees.
  • Understand value function approximation and deep RL.
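For reference, the three evaluation criteria named above are commonly written as follows. This is a sketch under assumed notation that does not appear in the original outline: r_t is the reward at step t, T the horizon, γ the discount factor, and the expectation is over trajectories generated by a policy π. The infinite-horizon criterion is taken here to be the average reward, which is one common reading.

```latex
% Finite-horizon expected return over T steps
J_T(\pi) = \mathbb{E}_\pi\!\left[ \sum_{t=0}^{T-1} r_t \right]

% Infinite-horizon average reward (assuming the limit exists)
J_{\mathrm{avg}}(\pi) = \lim_{T \to \infty} \frac{1}{T}\,
    \mathbb{E}_\pi\!\left[ \sum_{t=0}^{T-1} r_t \right]

% Infinite-horizon discounted return, with 0 \le \gamma < 1
J_\gamma(\pi) = \mathbb{E}_\pi\!\left[ \sum_{t=0}^{\infty} \gamma^{t} r_t \right]
```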
Methodological skills
Students should be able to:
  • Design RL solutions for new problems using a correct MDP abstraction.
  • Implement RL algorithms taking advantage of available libraries and simulation environments.
  • Evaluate the accuracy of the derived solutions in a systematic way, using available benchmarks and considering different performance metrics.
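As an illustration of the skills listed above, the following minimal sketch models a toy problem as an MDP and solves it with value iteration, a standard planning method used in model-based settings. The 3-state chain, its rewards, and every name in the code (`step`, `value_iteration`, `greedy_policy`, `GAMMA`) are hypothetical choices made for this example and are not taken from the course material.

```python
# Hypothetical toy MDP: a 3-state chain where state 2 is an absorbing goal.
# Actions: 0 = move left, 1 = move right. All names are illustrative.
N_STATES = 3
ACTIONS = (0, 1)
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Deterministic model of the MDP: return (next_state, reward)."""
    if state == 2:  # absorbing goal state: no further reward
        return 2, 0.0
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == 2 else 0.0
    return next_state, reward


def backup(state, action, values):
    """One-step lookahead value of taking `action` in `state`."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]


def value_iteration(tol=1e-10, max_iters=1000):
    """Apply the Bellman optimality backup until values stop changing."""
    values = [0.0] * N_STATES
    for _ in range(max_iters):
        new_values = [
            max(backup(s, a, values) for a in ACTIONS) for s in range(N_STATES)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


def greedy_policy(values):
    """Pick, in each state, the action with the highest one-step lookahead."""
    return [
        max(ACTIONS, key=lambda a, s=s: backup(s, a, values))
        for s in range(N_STATES)
    ]


values = value_iteration()
policy = greedy_policy(values)
```

A systematic evaluation would then compare the computed values and policy against known results for the benchmark. In this toy case the greedy policy should move right from both non-goal states.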
Transferable / Application
Students should be able to:
  • Work effectively with others in an interdisciplinary and/or international team.
  • Design and manage individual projects.
  • Clearly and succinctly communicate their ideas to technical audiences.