- For a more comprehensive survey of the multi-armed bandit problem, see A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit: https://arxiv.org/pdf/1510.00757.pdf.
- To read the paper that leverages intrinsic motivation to play Montezuma's Revenge, refer to Unifying Count-Based Exploration and Intrinsic Motivation: https://arxiv.org/pdf/1606.01868.pdf.
- To read the original ESBAS paper, follow this link: https://arxiv.org/pdf/1701.08810.pdf.




















































