Reinforcement learning

Reinforcement learning (RL) is the most distinct category, with respect to the one we saw so far. The concept is quite fascinating: the algorithm is trying to find a policy to maximize the sum of rewards.

The policy is learned by an agent who uses it to take actions in an environment. The environment then returns feedback, which the agent uses to improve its policy. The feedback is the reward for the action taken and it can be a positive, null, or negative number, as shown in the following diagram: