DATA130060.01

Algorithmic and Theoretical Foundations of RL

Lecture notes may vary slightly from year to year. Last update for Fall 2025: Jan 19, 2026. A short introduction to RL can be found here.

Introduction
MDP and Bellman Optimality
Value Iteration and Policy Iteration
Monte Carlo Methods
Monte Carlo (MC) Learning
Temporal-Difference (TD) Learning
Value Function Approximation
Policy Optimization I
Policy Optimization II
Bandit and MCTS
Introduction to RL for LLMs