Class notes for the Machine Learning Nanodegree at Udacity
Side Note: Reinforcement is a misused term, because all we care is to maximize rewards.