- If you are interested in the original natural policy gradient (NPG) paper, read A Natural Policy Gradient: https://papers.nips.cc/paper/2073-a-natural-policy-gradient.pdf.
- For the paper that introduced Generalized Advantage Estimation (GAE), read High-Dimensional Continuous Control Using Generalized Advantage Estimation: https://arxiv.org/pdf/1506.02438.pdf.
- If you are interested in the original Trust Region Policy Optimization paper, then please read Trust Region Policy Optimization: https://arxiv.org/pdf/1502.05477.pdf.
- If you are interested in the paper that introduced the Proximal Policy Optimization algorithm, then please read Proximal Policy Optimization Algorithms: https://arxiv.org/pdf/1707.06347.pdf.
- For a further explanation of Proximal Policy Optimization, read the following blog post: https://openai.com/blog/openai-baselines-ppo/.
- If you are interested in...





















































