
TensorFlow 2 Reinforcement Learning Cookbook: Over 50 recipes to help you build, train, and deploy learning agents for real-world applications

eBook
€20.98 €29.99
Paperback
€36.99
Subscription
Free Trial
Renews at €18.99p/m

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Chapter 2: Implementing Value-Based, Policy-Based, and Actor-Critic Deep RL Algorithms

This chapter provides a practical approach to building value-based, policy-based, and actor-critic algorithm-based reinforcement learning (RL) agents. It includes recipes for implementing value iteration-based learning agents and breaks down the implementation details of several foundational algorithms in RL into simple steps. The policy gradient-based agent and the actor-critic agent make use of the latest major version of TensorFlow 2.x to define the neural network policies.

The following recipes will be covered in this chapter:

  • Building stochastic environments for training RL agents
  • Building value-based (RL) agent algorithms
  • Implementing temporal difference learning
  • Building Monte Carlo prediction and control algorithms for RL
  • Implementing the SARSA algorithm and an RL agent
  • Building a Q-learning agent
  • Implementing policy gradients
  • Implementing actor-critic...

Technical requirements

The code in this book has been tested extensively on Ubuntu 18.04 and Ubuntu 20.04, and should work with later versions of Ubuntu if Python 3.6+ is available. With Python 3.6 installed, along with the necessary Python packages listed at the beginning of each recipe, the code should run fine on Windows and Mac OS X too. It is advised that you create and use a Python virtual environment named tf2rl-cookbook to install the packages and run the code in this book. Installing Miniconda or Anaconda for Python virtual environment management is recommended.

The complete code for each recipe in each chapter is available here: https://github.com/PacktPublishing/Tensorflow-2-Reinforcement-Learning-Cookbook.

Building stochastic environments for training RL agents

To train RL agents for the real world, we need learning environments that are stochastic, since real-world problems are stochastic in nature. This recipe will walk you through the steps for building a Maze learning environment to train RL agents. The Maze is a simple, stochastic environment where the world is represented as a grid. Each location on the grid can be referred to as a cell. The goal of an agent in this environment is to find its way to the goal state. Consider the maze shown in the following diagram, where the black cells represent walls:

Figure 2.1 – The Maze environment

The agent's location is initialized to be at the top-left cell in the Maze. The agent needs to find its way around the grid to reach the goal located at the top-right cell in the Maze, collecting a maximum number of coins along the way while avoiding walls. The location of the goal, coins, walls, and the agent...
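One way to picture what makes a grid environment stochastic is a transition function that occasionally ignores the chosen action. The sketch below is only an illustration under stated assumptions: the `SLIP_PROBABILITY` value, the `stochastic_step` helper, and the 3x3 grid with a single wall are hypothetical and are not taken from the recipe's Maze implementation.

```python
import random

# Hypothetical action set and slip probability; the recipe's own values are not shown here.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SLIP_PROBABILITY = 0.2


def stochastic_step(position, action, walls, grid_size, rng):
    """Move on a square grid; with probability SLIP_PROBABILITY a random action replaces the chosen one."""
    if rng.random() < SLIP_PROBABILITY:
        action = rng.choice(sorted(ACTIONS))
    d_row, d_col = ACTIONS[action]
    row, col = position[0] + d_row, position[1] + d_col
    inside = 0 <= row < grid_size and 0 <= col < grid_size
    # The agent stays in place when the move would leave the grid or hit a wall.
    if not inside or (row, col) in walls:
        return position
    return (row, col)


rng = random.Random(0)
walls = {(1, 1)}
position = (0, 0)
for _ in range(10):
    position = stochastic_step(position, "right", walls, grid_size=3, rng=rng)
```

Because the same action can lead to different next cells, an agent trained against such a function has to learn values that average over the possible outcomes.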

Building value-based reinforcement learning agent algorithms

Value-based reinforcement learning works by learning the state-value function or the action-value function in a given environment. This recipe will show you how to create and update the value function for the Maze environment to obtain an optimal policy. Learning value functions, especially in model-free RL problems where a model of the environment is not available, can prove to be quite effective, especially for RL problems with low-dimensional state space.
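As a rough illustration of how a state-value function can be computed and turned into a policy, here is a minimal value iteration sketch. It assumes a known, deterministic model of a hypothetical five-state corridor (not the Maze environment), so the names `transition`, `value_iteration`, and `greedy_policy` are illustrative only.

```python
# Hypothetical 1-D corridor: states 0..4, where state 4 is the terminal goal.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left or move right
GAMMA = 0.9


def transition(state, action):
    """Deterministic model (an assumption): returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def value_iteration(tolerance=1e-8):
    """Repeat Bellman optimality backups until the largest change is below tolerance."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for state in range(N_STATES):
            if state == GOAL:
                continue  # the terminal state keeps a value of 0
            best = max(
                reward + GAMMA * values[next_state]
                for next_state, reward in (transition(state, a) for a in ACTIONS)
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tolerance:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best one-step lookahead."""
    policy = {}
    for state in range(N_STATES - 1):
        policy[state] = max(
            ACTIONS,
            key=lambda a: transition(state, a)[1] + GAMMA * values[transition(state, a)[0]],
        )
    return policy


values = value_iteration()
policy = greedy_policy(values)
```

In this toy setting the values decay by a factor of `GAMMA` for each step away from the goal, and the greedy policy moves right in every state.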

Upon completing this recipe, you will have an algorithm that can generate the following optimal action sequence based on value functions:

Figure 2.3 – Optimal action sequence generated by a value-based RL algorithm with state values represented through a jet color map

Let's get started.

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install numpy...

Implementing temporal difference learning

This recipe will walk you through how to implement the temporal difference (TD) learning algorithm. TD algorithms allow us to incrementally learn from incomplete episodes of agent experiences, which means they can be used for problems that require online learning capabilities. TD algorithms are useful in model-free RL settings as they do not depend on a model of the MDP transitions or rewards. To visually understand the learning progression of the TD algorithm, this recipe will also show you how to implement the GridworldV2 learning environment, which looks as follows when rendered:

Figure 2.6 – The GridworldV2 learning environment 2D rendering with state values and grid cell coordinates
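To make the incremental nature of TD learning concrete, the following is a minimal TD(0) prediction sketch. It assumes a hypothetical seven-state random walk rather than GridworldV2, and the constants `ALPHA` and `GAMMA` are illustrative choices, not values from the recipe.

```python
import random

N_STATES = 7   # states 0 and 6 are terminal; states 1..5 are non-terminal
ALPHA = 0.05   # step size (assumed)
GAMMA = 1.0    # discount factor (assumed)


def td0_prediction(num_episodes, rng):
    """Estimate state values under a uniformly random policy with TD(0)."""
    values = [0.0] * N_STATES
    for _ in range(num_episodes):
        state = 3  # every episode starts in the middle
        while state not in (0, N_STATES - 1):
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # The update happens after every step, before the episode ends.
            td_target = reward + GAMMA * values[next_state]
            values[state] += ALPHA * (td_target - values[state])
            state = next_state
    return values


values = td0_prediction(num_episodes=5000, rng=random.Random(0))
```

Each update bootstraps from the current estimate of the next state, which is why the method does not need to wait for a complete episode.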

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install numpy gym. If the following import statements run without issues, you are ready to get...

Building Monte Carlo prediction and control algorithms for RL

This recipe provides the ingredients for building a Monte Carlo prediction and control algorithm so that you can build your RL agents. Similar to the temporal difference learning algorithm, Monte Carlo learning methods can be used to learn both the state and the action value functions. Monte Carlo methods have zero bias since they learn from complete episodes with real experience, without approximate predictions. These methods are suitable for applications that require good convergence properties. The following diagram illustrates the value that's learned by the Monte Carlo method for the GridworldV2 environment:

Figure 2.10 – Monte Carlo prediction of state values (left) and state-action values (right)
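For comparison with TD learning, here is a minimal first-visit Monte Carlo prediction sketch. It assumes a hypothetical seven-state random walk instead of GridworldV2, and the helper names `generate_episode` and `first_visit_mc` are illustrative.

```python
import random


def generate_episode(rng, n_states=7, start=3):
    """Roll out one complete episode as a list of (state, reward) pairs."""
    episode = []
    state = start
    while state not in (0, n_states - 1):
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == n_states - 1 else 0.0
        episode.append((state, reward))
        state = next_state
    return episode


def first_visit_mc(num_episodes, rng, gamma=1.0, n_states=7):
    """Average the return that follows the first visit to each state."""
    values = [0.0] * n_states
    visit_counts = [0] * n_states
    for _ in range(num_episodes):
        episode = generate_episode(rng, n_states)
        first_visit = {}
        for t, (state, _reward) in enumerate(episode):
            first_visit.setdefault(state, t)
        episode_return = 0.0
        # Walk backwards so the return is accumulated from the end of the episode.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            episode_return = reward + gamma * episode_return
            if first_visit[state] == t:
                visit_counts[state] += 1
                values[state] += (episode_return - values[state]) / visit_counts[state]
    return values


values = first_visit_mc(num_episodes=5000, rng=random.Random(0))
```

The targets here are actual sampled returns rather than bootstrapped estimates, which is the sense in which Monte Carlo estimates are unbiased.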

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install -r requirements.txt. If the following import...

Implementing the SARSA algorithm and an RL agent

This recipe will show you how to implement the State-Action-Reward-State-Action (SARSA) algorithm, as well as how to develop and train an agent using the SARSA algorithm so that it can act in a reinforcement learning environment. The SARSA algorithm can be applied to model-free control problems and allows us to optimize the value function of an unknown MDP.

Upon completing this recipe, you will have a working RL agent that, when acting in the GridworldV2 environment, will generate the following state-action value function using the SARSA algorithm:

Figure 2.15 – Rendering of the GridworldV2 environment – each triangle represents the action value of taking that directional action in that grid state
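The core of SARSA is an on-policy update that uses the action the agent actually takes next. The sketch below shows that update in tabular form on a hypothetical five-state corridor; the environment, the hyperparameters, and the helper names are assumptions for illustration and do not reproduce the GridworldV2 agent.

```python
import random

N_STATES, GOAL = 5, 4
LEFT, RIGHT = 0, 1
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed hyperparameters


def step(state, action):
    """Hypothetical corridor dynamics: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def epsilon_greedy(q_table, state, rng):
    """Explore with probability EPSILON, otherwise act greedily with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice((LEFT, RIGHT))
    best = max(q_table[state])
    return rng.choice([a for a in (LEFT, RIGHT) if q_table[state][a] == best])


def train_sarsa(num_episodes, rng):
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(num_episodes):
        state = 0
        action = epsilon_greedy(q_table, state, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q_table, next_state, rng)
            # On-policy target: bootstraps from the action that will actually be taken.
            bootstrap = 0.0 if done else GAMMA * q_table[next_state][next_action]
            q_table[state][action] += ALPHA * (reward + bootstrap - q_table[state][action])
            state, action = next_state, next_action
    return q_table


q_table = train_sarsa(num_episodes=500, rng=random.Random(0))
```

The tuple (state, action, reward, next state, next action) used in each update is where the algorithm's name comes from.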

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install -r requirements.txt. If the following import statements run...

Building a Q-learning agent

This recipe will show you how to build a Q-learning agent. Q-learning can be applied to model-free RL problems. It supports off-policy learning and therefore provides a practical solution to problems where available experiences were/are collected using some other policy or by some other agent (even humans).

Upon completing this recipe, you will have a working RL agent that, when acting in the GridworldV2 environment, will generate the following state-action value function using the Q-learning algorithm:

Figure 2.18 – State-action values obtained using the Q-learning algorithm
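To illustrate the off-policy property, the following sketch learns Q-values while the data is collected by a uniformly random behaviour policy. The five-state corridor, the hyperparameters, and the function names are hypothetical and are not the recipe's GridworldV2 agent.

```python
import random

N_STATES, GOAL = 5, 4
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.1, 0.9  # assumed hyperparameters


def step(state, action):
    """Hypothetical corridor dynamics: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train_q_learning(num_episodes, rng):
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(num_episodes):
        state = 0
        done = False
        while not done:
            # Behaviour policy: uniformly random, unrelated to the policy being learned.
            action = rng.choice((LEFT, RIGHT))
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstraps from the best action in the next state.
            bootstrap = 0.0 if done else GAMMA * max(q_table[next_state])
            q_table[state][action] += ALPHA * (reward + bootstrap - q_table[state][action])
            state = next_state
    return q_table


q_table = train_q_learning(num_episodes=500, rng=random.Random(0))
greedy_actions = [max((LEFT, RIGHT), key=lambda a: q_table[s][a]) for s in range(GOAL)]
```

The only difference from a SARSA-style update is the `max` in the target, which is what lets the learned values describe the greedy policy even though the experience came from a different one.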

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install -r requirements.txt. If the following import statements run without issues, you are ready to get started:

import numpy as np
import random

Now, let's begin.

How to do it…

Let's implement...

Implementing policy gradients

Policy gradient algorithms are fundamental to reinforcement learning and serve as the basis for several advanced RL algorithms. These algorithms directly optimize for the best policy, which can lead to faster learning compared to value-based algorithms. Policy gradient algorithms are effective for problems/applications with high-dimensional or continuous action spaces. This recipe will show you how to implement policy gradient algorithms using TensorFlow 2.0. Upon completing this recipe, you will be able to train an RL agent in any compatible OpenAI Gym environment.
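The idea of optimizing the policy directly can be shown without a neural network. The sketch below is a plain-Python REINFORCE-style update on a hypothetical two-armed bandit (a one-step episode); it deliberately swaps TensorFlow for hand-written gradients, so it should be read as an illustration of the update rule and not as the recipe's implementation. `ARM_REWARDS` and `LEARNING_RATE` are assumed values.

```python
import math
import random

ARM_REWARDS = (0.0, 1.0)  # hypothetical deterministic reward for each arm
LEARNING_RATE = 0.1       # assumed step size


def softmax(preferences):
    """Convert action preferences into a probability distribution."""
    largest = max(preferences)
    exps = [math.exp(p - largest) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(num_steps, rng):
    preferences = [0.0, 0.0]  # the policy parameters
    for _ in range(num_steps):
        probs = softmax(preferences)
        action = rng.choices((0, 1), weights=probs)[0]
        episode_return = ARM_REWARDS[action]
        for k in range(2):
            # Gradient of log softmax: indicator of the chosen action minus its probability.
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            preferences[k] += LEARNING_RATE * episode_return * grad_log_prob
    return softmax(preferences)


action_probs = train_reinforce(num_steps=500, rng=random.Random(0))
```

In a deep RL setting the preferences would be produced by a network and the gradient computed automatically, but the quantity being ascended is the same return-weighted log-probability.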

Getting ready

To complete this recipe, you will need to activate the tf2rl-cookbook Python/conda virtual environment and run pip install -r requirements.txt. If the following import statements run without issues, you are ready to get started:

import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow import keras
from tensorflow.keras import layers
import numpy as...

Implementing actor-critic RL algorithms

Actor-critic algorithms allow us to combine value-based and policy-based reinforcement learning in an all-in-one agent. While policy gradient methods directly search and optimize the policy in the policy space, leading to smoother learning curves and improvement guarantees, they tend to get stuck at local maxima (for a long-term reward optimization objective). Value-based methods do not get stuck at local optima, but they lack convergence guarantees, and algorithms such as Q-learning tend to have high variance and are not very sample-efficient. Actor-critic methods combine the good qualities of both value-based and policy gradient-based algorithms. Actor-critic methods are also more sample-efficient. This recipe will make it easy for you to implement an actor-critic-based RL agent using TensorFlow 2.x. Upon completing this recipe, you will be able to train the actor-critic agent in any OpenAI Gym-compatible reinforcement learning...
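How the two parts interact can be sketched in tabular form: the critic learns state values with a TD update, and the actor uses the same TD error to adjust its action preferences. The example below assumes a hypothetical five-state corridor and hand-written softmax gradients in place of TensorFlow networks; all names and hyperparameters are illustrative.

```python
import math
import random

N_STATES, GOAL = 5, 4
LEFT, RIGHT = 0, 1
GAMMA = 0.9
ACTOR_LR, CRITIC_LR = 0.1, 0.1  # assumed step sizes
MAX_STEPS = 1000                # safety cap on episode length (assumed)


def step(state, action):
    """Hypothetical corridor dynamics: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def softmax(preferences):
    largest = max(preferences)
    exps = [math.exp(p - largest) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(num_episodes, rng):
    preferences = [[0.0, 0.0] for _ in range(N_STATES)]  # actor parameters
    values = [0.0] * N_STATES                            # critic estimates
    for _ in range(num_episodes):
        state = 0
        for _step in range(MAX_STEPS):
            probs = softmax(preferences[state])
            action = rng.choices((LEFT, RIGHT), weights=probs)[0]
            next_state, reward, done = step(state, action)
            bootstrap = 0.0 if done else GAMMA * values[next_state]
            td_error = reward + bootstrap - values[state]
            # Critic: move the value estimate toward the TD target.
            values[state] += CRITIC_LR * td_error
            # Actor: reinforce the chosen action in proportion to the TD error.
            for k in (LEFT, RIGHT):
                grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
                preferences[state][k] += ACTOR_LR * td_error * grad_log_prob
            state = next_state
            if done:
                break
    return preferences, values


preferences, values = train_actor_critic(num_episodes=500, rng=random.Random(0))
policy = [softmax(p) for p in preferences]
```

Using the TD error instead of the full episode return as the learning signal is what lets the actor update at every step.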


Key benefits

  • Develop and deploy deep reinforcement learning-based solutions to production pipelines, products, and services
  • Explore popular reinforcement learning algorithms such as Q-learning, SARSA, and the actor-critic method
  • Customize and build RL-based applications for performing real-world tasks

Description

With deep reinforcement learning, you can build intelligent agents, products, and services that can go beyond computer vision or perception to perform actions. TensorFlow 2.x is the latest major release of the most popular deep learning framework used to develop and train deep neural networks (DNNs). This book contains easy-to-follow recipes for leveraging TensorFlow 2.x to develop artificial intelligence applications. Starting with an introduction to the fundamentals of deep reinforcement learning and TensorFlow 2.x, the book covers OpenAI Gym, model-based RL, model-free RL, and how to develop basic agents. You'll discover how to implement advanced deep reinforcement learning algorithms such as actor-critic, deep deterministic policy gradients, deep-Q networks, proximal policy optimization, and deep recurrent Q-networks for training your RL agents. As you advance, you’ll explore the applications of reinforcement learning by building cryptocurrency trading agents, stock/share trading agents, and intelligent agents for automating task completion. Finally, you'll find out how to deploy deep reinforcement learning agents to the cloud and build cross-platform apps using TensorFlow 2.x. By the end of this TensorFlow book, you'll have gained a solid understanding of deep reinforcement learning algorithms and their implementations from scratch.

Who is this book for?

The book is for machine learning application developers, AI and applied AI researchers, data scientists, deep learning practitioners, and students with a basic understanding of reinforcement learning concepts who want to build, train, and deploy their own reinforcement learning systems from scratch using TensorFlow 2.x.

What you will learn

  • Build deep reinforcement learning agents from scratch using the all-new TensorFlow 2.x and Keras API
  • Implement state-of-the-art deep reinforcement learning algorithms using minimal code
  • Build, train, and package deep RL agents for cryptocurrency and stock trading
  • Deploy RL agents to the cloud and edge to test them by creating desktop, web, and mobile apps and cloud services
  • Speed up agent development using distributed DNN model training
  • Explore distributed deep RL architectures and discover opportunities in AIaaS (AI as a Service)

Product Details

Publication date: Jan 15, 2021
Length: 472 pages
Edition: 1st
Language: English
ISBN-13: 9781838985998
Vendor: Google


Packt Subscriptions

See our plans and pricing

€18.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

€189.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

€264.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

Frequently bought together


  • Mastering Reinforcement Learning with Python: €37.99
  • TensorFlow 2.0 Computer Vision Cookbook: €32.99
  • TensorFlow 2 Reinforcement Learning Cookbook: €36.99

Total: €107.97

Table of Contents

10 Chapters
  • Chapter 1: Developing Building Blocks for Deep Reinforcement Learning Using Tensorflow 2.x
  • Chapter 2: Implementing Value-Based, Policy-Based, and Actor-Critic Deep RL Algorithms
  • Chapter 3: Implementing Advanced RL Algorithms
  • Chapter 4: Reinforcement Learning in the Real World – Building Cryptocurrency Trading Agents
  • Chapter 5: Reinforcement Learning in the Real World – Building Stock/Share Trading Agents
  • Chapter 6: Reinforcement Learning in the Real World – Building Intelligent Agents to Complete Your To-Dos
  • Chapter 7: Deploying Deep RL Agents to the Cloud
  • Chapter 8: Distributed Training for Accelerated Development of Deep RL Agents
  • Chapter 9: Deploying Deep RL Agents on Multiple Platforms
  • Other Books You May Enjoy

Customer reviews

Rating: 4 out of 5 (6 Ratings)
5 star 0%
4 star 100%
3 star 0%
2 star 0%
1 star 0%

Top Reviews

Xudong@SF Jun 14, 2021
Rating: 4 out of 5
It is a very nice book for people who want to gain hands-on experience in reinforcement learning. The book starts with an introduction to the OpenAI gym environment very early on, so that readers can visualize any reinforcement learning system very easily. The book presents a very concise introduction to the basic reinforcement learning algorithms (value-based, policy-based, actor-critic) and advanced ones (DQN, PPO, DDPG, etc). Many examples of real-world reinforcement learning projects are also provided. The last part of the book talks about deploying reinforcement learning to different platforms, which are very useful for real-world practitioners.
Amazon Verified review
Og J. Ramos Mar 26, 2021
Rating: 4 out of 5
If you're just starting and want to learn, I would definitely suggest looking at other Packt Tensorflow 2 books first. This is not a how-to book nor a book where you'll learn how to use Tensorflow 2. This book is a cookbook with recipes that will help you play with RL, find ways to implement them as an AI developer and understand what you're doing. If you're an AI Developer and you're looking for some good scripts with explanations and some commentary on how they work and what they can be used for, this book is for you. If you're starting to learn Tensorflow 2, AI, RL, etc., this book might be good later. Overall, it's a good group of code, explanations right at your fingertips.
Amazon Verified review
Adwait Ullal Apr 22, 2021
Rating: 4 out of 5
The book, TensorFlow 2 Reinforcement Learning Cookbook, contains a host of cradle-to-grave recipes i.e. from getting started with Reinforcement Learning to deployment. The book is for someone who is familiar with TensorFlow and wants to get hands-on with Reinforcement Learning. The book is full of code, as the title suggests and includes recipes basic as well advanced algorithms and for theoretical as well as real-world use-cases. The code is mostly in Python and is also available on Github. I would have liked to see some basic explanation on Reinforcement Learning, instead of diving head-on into code, etc. At least, external references or other Packt books related to TensorFlow and/or Reinforcement Learning could have been suggested. Overall, it is a very focused book for getting to understand Reinforcement Learning in TensorFlow.
Amazon Verified review
Vince S. Mar 14, 2021
Rating: 4 out of 5
The authors provided me a pdf version of this book in exchange for a review on it. I think over all this book had reinforcement learning well-explained. It is good for someone who is familiar with tensorflow and deep learning basics and want to explore reinforcement learning domain. This book is very heavy on sample code, around 80% of the book is about actual python code. It is good for someone want to have some hands-on experiences, and it has some interesting projects like building your own Cryptocurrency Trading Agents. However, on the other side, this book doesn't include enough theory and cutting edges researches on it. I would recommend this book to someone who is new to reinforcement learning and would like to have some quick hands-on fun on it.
Amazon Verified review
Andrzej Jankowski Jun 07, 2021
Rating: 4 out of 5
I have received a copy of the book from Packt Publishing and have been asked to provide a review. Opinions below are my own. "TensorFlow 2 Reinforcement Learning Cookbook" is a book you will reference to solve a specific problem, is not really supposed to be read. I see it as a kind of structured stack overflow focusing on covering specific topic. You can treat the book recipes as answers to a specific question (eg. How to implement Deep Q-Learning algorithm). They dive right into the code. It doesn't mean the author forgets about explanations - "How it works" section in every chapter is clearly written and thorough enough for you to understand the topic. The strength of the book is it's breadth. It covers all reinforcement learning algorithm, but also their usage for various real life tasks (cryptocurrency or stock trading, booking flights for yourself etc.) The author doesn't stop with simplest implementations. Trading example gives the agent opportunity to trade based on visual data. The agent observes the stock market data in the form of candlestick price charts. The biggest value of "TensorFlow 2 Reinforcement Learning Cookbook" are the parts covering distributed training and deployment. All other parts of the book are easy to find somewhere else, but these two topics are rarely covered. I believe it is worth reading the book for those 2 parts are alone. Distributed training starts with using Tensorflow API for multi GPU set up and goes up to cluster of servers using Ray framework (popular library coming from Berkeley RISE Lab). This selection of topics gives you solid understanding of how to speed up training by distributing it. Deployment scenarios get extra attention in 2 chapters (which is a good thing). It includes deployment to cloud, mobile (Tensorflow Lite), browser (Tensorflow.js) and various HW set up (using ONNX runtime or Nvidia Triton). You probably have guessed by now - it is not a beginner book.
I recommend it for more advanced data scientists interested in quick answers to their reinforcement learning algorithms questions, or looking for easy guide to deployment scenarios.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal)

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats do Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower in price than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.