Episodic versus continuous tasks
A lot of the tasks that we specify in the real world have a well-defined ending point. For example, if an agent is playing a game, then the episode or the task ends when the agent wins or loses, or dies.
In the situation of a self-driving car, the task ends when the car reaches the destination or it crashes. These tasks with well-defined ending points are called episodic tasks. The reward that the agent gets is given to it at the end of each episode and this is when the agent decides how well it has done in the environment. Then the agent goes on to the next episode, when it starts from scratch but has the prior information of the last episode with it and can perform better.
As time passes, over a period of episodes, the agent will learn to play the game or drive the car to a particular destination, and thus it will be trained. As you will remember, the agent's goal is to maximize the cumulative reward at the end of the episode.
However, there...