Deep Reinforcement Learning Hands-On

A practical and easy-to-follow guide to RL from Q-learning and DQNs to PPO and RLHF

Product type: Paperback
Published: Nov 2024
Publisher: Packt
ISBN-13: 9781835882702
Length: 716 pages
Edition: 3rd Edition
Author: Maxim Lapan

Table of Contents

Preface
1. Part 1 Introduction to RL
2. What Is Reinforcement Learning?
3. OpenAI Gym API and Gymnasium
4. Deep Learning with PyTorch
5. The Cross-Entropy Method
6. Part 2 Value-based methods
7. Tabular Learning and the Bellman Equation
8. Deep Q-Networks
9. Higher-Level RL Libraries
10. DQN Extensions
11. Ways to Speed Up RL
12. Stocks Trading Using RL
13. Part 3 Policy-based methods
14. Policy Gradients
15. Actor-Critic Method: A2C and A3C
16. The TextWorld Environment
17. Web Navigation
18. Part 4 Advanced RL
19. Continuous Action Space
20. Trust Region Methods
21. Black-Box Optimizations in RL
22. Advanced Exploration
23. Reinforcement Learning with Human Feedback
24. AlphaGo Zero and MuZero
25. RL in Discrete Optimization
26. Multi-Agent RL
27. Bibliography
28. Index

Changes in the third edition

In comparison to the second edition of this book (published in 2020), there are several major changes made to the book’s contents in this new edition:

  • All the dependencies of the code examples have been updated to recent versions or replaced with better alternatives. For example, OpenAI Gym is no longer supported, so the examples use Gymnasium, the fork maintained by the Farama Foundation. Similarly, the MiniWoB++ library has replaced the MiniWoB and Universe environments.

  • A new chapter on RLHF has been included, and the MuZero method has been added to the chapter on AlphaGo Zero.

  • There are many small fixes and improvements; for example, most of the figures have been redrawn to make them clearer and easier to understand.

To stay within the book’s length limit, several chapters were rearranged, which I hope has made the book more consistent and easier to read.

Download the example code files

The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On-Third-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781835882702.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: “For the reward table, it is represented as a tuple with [State, Action, State] and for the transition table, it is written as [State, Action].”

A block of code is set as follows:

import typing as tt 
import gymnasium as gym 
from collections import defaultdict, Counter 
from torch.utils.tensorboard.writer import SummaryWriter 
 
ENV_NAME = "FrozenLake-v1" 
GAMMA = 0.9 
TEST_EPISODES = 20
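
As a rough illustration of the table shapes quoted in the CodeInText example, the following minimal sketch keys a reward table by (state, action, next state) and a transition table by (state, action), using the same defaultdict and Counter imports. The names State, Action, record, and transition_probs are hypothetical and chosen only for this sketch; they are not taken from the book's code.

```python
from collections import Counter, defaultdict

# Hypothetical aliases for illustration; real environments may use other types.
State = int
Action = int

# Reward table: (state, action, next_state) -> last observed immediate reward.
rewards: dict[tuple[State, Action, State], float] = defaultdict(float)

# Transition table: (state, action) -> counts of observed next states.
transits: dict[tuple[State, Action], Counter] = defaultdict(Counter)


def record(state: State, action: Action, reward: float, next_state: State) -> None:
    """Store one observed transition in both tables."""
    rewards[(state, action, next_state)] = reward
    transits[(state, action)][next_state] += 1


def transition_probs(state: State, action: Action) -> dict[State, float]:
    """Estimate next-state probabilities from the recorded counts."""
    counts = transits.get((state, action), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing observed yet for this (state, action) pair
    return {s: c / total for s, c in counts.items()}


# Made-up observations: from state 0, action 1 led to state 4 three times
# and to state 5 once.
for _ in range(3):
    record(0, 1, 0.0, 4)
record(0, 1, 1.0, 5)
```

Counting next states per (state, action) pair keeps the transition estimate a simple ratio of counts, which is one common way to fill such tables from experience.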

Any command-line input or output is written as follows:

>>> e.action_space 
Discrete(2) 
>>> e.observation_space 
Box([-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38], [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38], (4,), float32)

Bold: Indicates a new term, an important word, or words that you see on the screen. For instance, words in menus or dialog boxes appear in the text like this. For example: “The second term is called cross-entropy, which is a very common optimization objective in deep learning.”

Citations are represented using a condensed author–year format within square brackets, similar to [Sut88] or [Kro+11]. You can find the details of the corresponding paper in the Bibliography section at the end of the book.

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.

Leave a Review!

Thank you for purchasing this book from Packt Publishing—we hope you enjoy it! Your feedback is invaluable and helps us improve and grow. Once you’ve completed reading it, please take a moment to leave an Amazon review; it will only take a minute, but it makes a big difference for readers like you.

Scan the QR code below to receive a free ebook of your choice.

https://packt.link/NzOWQ

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?

Don’t worry; with every Packt book, you now get a DRM-free PDF version of that book at no cost.

Read anywhere, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there! You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the QR code or visit the link below:

    https://packt.link/free-ebook/9781835882702

  2. Submit your proof of purchase.

  3. That’s it! We’ll send your free PDF and other benefits to your email address directly.
