Machine Learning Using TensorFlow Cookbook: Create powerful machine learning algorithms with TensorFlow

Alexia Audevart, Konrad Banachewicz, Luca Massaron
Paperback Feb 2021 416 pages 1st Edition

The TensorFlow Way

In Chapter 1, Getting Started with TensorFlow 2.x, we introduced how TensorFlow creates tensors and uses variables. In this chapter, we'll introduce how to put together all these objects using eager execution, thus dynamically setting up a computational graph. From this, we can set up a simple classifier and see how well it performs.

Also, remember that the current and updated code from this book is available online on GitHub at https://github.com/PacktPublishing/Machine-Learning-Using-TensorFlow-Cookbook.

Over the course of this chapter, we'll introduce the key components of how TensorFlow operates. Then, we'll tie it together to create a simple classifier and evaluate the outcomes. By the end of the chapter, you should have learned about the following:

  • Operations using eager execution
  • Layering nested operations
  • Working with multiple layers
  • Implementing loss functions
  • Implementing backpropagation
  • Working with batch and stochastic training
  • Combining everything together

Let's start working our way through more and more complex recipes, demonstrating the TensorFlow way of handling and solving data problems.

Operations using eager execution

Thanks to Chapter 1, Getting Started with TensorFlow 2.x, we can already create objects such as variables in TensorFlow. Now we will introduce operations that act on such objects. To do so, we'll return to eager execution with a new basic recipe showing how to manipulate matrices. This recipe and the following ones are still basic, but over the course of the chapter, we'll combine them into more complex ones.

Getting ready

To start, we load TensorFlow and NumPy, as follows:

import tensorflow as tf
import numpy as np

That's all we need to get started; now we can proceed.

How to do it...

In this example, we'll use what we have learned so far, and send each number in a list to be computed by TensorFlow commands and print the output.

First, we declare our tensors and variables. Here, out of all the various ways we could feed data into the variable using TensorFlow, we will create a NumPy array to feed into our variable and then use it for our operation:

x_vals = np.array([1., 3., 5., 7., 9.])
x_data = tf.Variable(x_vals, dtype=tf.float32)
m_const = tf.constant(3.)
operation = tf.multiply(x_data, m_const)
for result in operation:
    print(result.numpy())

The output of the preceding code is as follows:

3.0 
9.0 
15.0 
21.0 
27.0 

Once you get accustomed to working with TensorFlow variables, constants, and functions, it will become natural to start from NumPy array data, progress to scripting data structures and operations, and test their results as you go.

How it works...

Using eager execution, TensorFlow evaluates operations immediately, instead of manipulating symbolic handles that refer to the nodes of a computational graph to be compiled and executed later. You can therefore just iterate through the results of the multiplication and print the resulting values using the .numpy() method, which returns a NumPy object from a TensorFlow tensor.
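
As a minimal sketch of what immediate evaluation means in practice, assuming TensorFlow 2.x with eager execution left at its default setting (the names x, y, and as_array are illustrative and not part of the recipe), we can check the execution mode and read the values straight back:

```python
import numpy as np
import tensorflow as tf

# Eager execution is on by default in TensorFlow 2.x.
print(tf.executing_eagerly())

x = tf.constant([1., 3., 5., 7., 9.])
y = tf.multiply(x, 3.)      # evaluated immediately, no session needed

# The eager tensor already holds concrete values;
# .numpy() exposes them as a NumPy array.
as_array = y.numpy()
print(type(as_array), as_array)
```

Because the result is an ordinary NumPy array, it can be passed to any NumPy-based tool without further conversion.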

Layering nested operations

In this recipe, we'll learn how to put multiple operations to work; it is important to know how to chain operations together. This will set up layered operations to be executed by our network. In this recipe, we will multiply a variable by two matrices and then perform addition. We will feed in two matrices in the form of a three-dimensional NumPy array.

This is another easy-peasy recipe to give you ideas about how to code in TensorFlow using common constructs such as functions or classes, improving readability and code modularity. Even if the final product is a neural network, we're still writing a computer program, and we should abide by programming best practices.

Getting ready

As usual, we just need to import TensorFlow and NumPy, as follows:

import tensorflow as tf
import numpy as np

We're now ready to move forward with our recipe.

How to do it...

We will feed in two NumPy arrays of size 3 x 5. We will multiply each matrix by a constant of size 5 x 1, which will result in a matrix of size 3 x 1. We will then multiply this by a 1 x 1 matrix, resulting in a 3 x 1 matrix again. Finally, we add a 1 x 1 matrix, which is broadcast across the 3 x 1 result, as follows:

  1. First, we create the data to feed in and the variable that holds it:
    my_array = np.array([[1., 3., 5., 7., 9.], 
                         [-2., 0., 2., 4., 6.], 
                         [-6., -3., 0., 3., 6.]]) 
    x_vals = np.array([my_array, my_array + 1])
    x_data = tf.Variable(x_vals, dtype=tf.float32)
    
  2. Next, we create the constants that we will use for matrix multiplication and addition:
    m1 = tf.constant([[1.], [0.], [-1.], [2.], [4.]]) 
    m2 = tf.constant([[2.]]) 
    a1 = tf.constant([[10.]]) 
    
  3. Now, we declare the operations to be eagerly executed. As good practice, we create functions that execute the operations we need:
    def prod1(a, b):
        return tf.matmul(a, b)
    def prod2(a, b):
        return tf.matmul(a, b) 
    def add1(a, b):
        return tf.add(a, b)
    
  4. Finally, we nest our functions and display the result:
    result = add1(prod2(prod1(x_data, m1), m2), a1)
    print(result.numpy())
    [[ 102.] 
     [  66.] 
     [  58.]] 
    [[ 114.] 
     [  78.] 
     [  70.]] 
    

Using functions (and also classes, as we are going to cover) will help you write clearer code. That makes debugging more effective and allows easy maintenance and reuse of code.
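
To make the shape bookkeeping of the nested calls explicit, here is a small cross-check written in plain NumPy rather than TensorFlow. It assumes, as we understand it, that np.matmul broadcasts over the leading batch dimension in the same way tf.matmul does; the names step1, step2, and step3 are illustrative:

```python
import numpy as np

my_array = np.array([[1., 3., 5., 7., 9.],
                     [-2., 0., 2., 4., 6.],
                     [-6., -3., 0., 3., 6.]])
x_vals = np.array([my_array, my_array + 1])      # shape (2, 3, 5)
m1 = np.array([[1.], [0.], [-1.], [2.], [4.]])   # shape (5, 1)
m2 = np.array([[2.]])                            # shape (1, 1)
a1 = np.array([[10.]])                           # shape (1, 1)

step1 = np.matmul(x_vals, m1)   # (2, 3, 5) @ (5, 1) -> (2, 3, 1)
step2 = np.matmul(step1, m2)    # (2, 3, 1) @ (1, 1) -> (2, 3, 1)
step3 = step2 + a1              # broadcast addition -> (2, 3, 1)

for name, arr in [('prod1', step1), ('prod2', step2), ('add1', step3)]:
    print(name, arr.shape)
print(step3[0].ravel(), step3[1].ravel())
```

Printing the shape after every step is a cheap way to catch a mismatched dimension before it surfaces as an error deeper in a chain of operations.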

How it works...

Thanks to eager execution, there's no longer a need to resort to the "kitchen sink" programming style (meaning that you put almost everything in the global scope of the program; see https://stackoverflow.com/questions/33779296/what-is-exact-meaning-of-kitchen-sink-in-programming) that was so common when using TensorFlow 1.x. At the moment, you can adopt either a functional programming style or an object-oriented one, such as the one we present in this brief example, where you can arrange all your operations and computations in a more logical and understandable way:

class Operations():  
    def __init__(self, a):
        self.result = a
    def apply(self, func, b):
        self.result = func(self.result, b)
        return self
        
operation = (Operations(a=x_data)
             .apply(prod1, b=m1)
             .apply(prod2, b=m2)
             .apply(add1, b=a1))
print(operation.result.numpy())

Classes can help you organize your code and reuse it better than functions, thanks to class inheritance.

There's more...

In all the examples in this recipe, we've had to declare the data shape and know the outcome shape of the operations before we run the data through the operations. This is not always the case. There may be a dimension or two that we do not know beforehand or some that can vary during our data processing. To take this into account, we designate the dimension or dimensions that can vary (or are unknown) with the value None.

For example, to initialize a variable with an unknown number of rows, we would write the following lines, after which we can assign values with an arbitrary number of rows:

v = tf.Variable(initial_value=tf.random.normal(shape=(1, 5)),
                shape=tf.TensorShape((None, 5)))
v.assign(tf.random.normal(shape=(10, 5)))

It is fine for matrix multiplication to have flexible rows because that won't affect the arrangement of our operations. This will come in handy in later chapters when we are feeding data in multiple batches of varying batch sizes.

While the use of None as a dimension allows us to use variably-sized dimensions, I always recommend that you be as explicit as possible when filling out dimensions. If the size of our data is known in advance, then we should explicitly write that size as the dimensions. The use of None as a dimension is recommended to be limited to the batch size of the data (or however many data points we are computing on at once).
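
As a small sketch of how this behaves, assuming TensorFlow 2.x (the initial values and the row counts 3 and 10 are arbitrary choices for illustration), the static shape of the variable keeps the unknown dimension while the runtime shape follows whatever was last assigned:

```python
import tensorflow as tf

v = tf.Variable(initial_value=tf.zeros(shape=(1, 5)),
                shape=tf.TensorShape((None, 5)))
print(v.shape)                      # static shape: row count is unknown

for rows in (3, 10):
    v.assign(tf.ones(shape=(rows, 5)))
    print(tf.shape(v).numpy())      # runtime shape of the current value
```

Only the first dimension is flexible here; assigning a value with a different number of columns would still be rejected.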

Working with multiple layers

Now that we have covered multiple operations, we will cover how to connect various layers that have data propagating through them. In this recipe, we will introduce how to best connect various layers, including custom layers. The data we will generate and use will be representative of small random images. It is best to understand this type of operation with a simple example and see how we can use some built-in layers to perform calculations. The first layer we will explore is called a moving window. We will perform a small moving window average across a 2D image and then the second layer will be a custom operation layer.

Moving windows are useful for everything related to time series. Though there are layers specialized for sequences, a moving window may prove useful when you are analyzing, for instance, MRI scans (neuroimages) or sound spectrograms.

Moreover, we will see that the computational graph can get large and hard to look at. To address this, we will also introduce ways to name operations and create scopes for layers.

Getting ready

To start, you have to load the usual packages – NumPy and TensorFlow – using the following:

import tensorflow as tf
import numpy as np

Let's now progress to the recipe. This time things are getting more complex and interesting.

How to do it...

We proceed with the recipe as follows.

First, we create our sample 2D image with NumPy. This image will be a 4 x 4 pixel image. We will create it in four dimensions; the first and last dimensions will have a size of 1 (we keep the batch dimension distinct, so you can experiment with changing its size). Note that some TensorFlow image functions will operate on four-dimensional images. Those four dimensions are image number, height, width, and channel, and to make it work with one channel, we explicitly set the last dimension to 1, as follows:

batch_size = [1]
x_shape = [4, 4, 1]
x_data = tf.random.uniform(shape=batch_size + x_shape)

To create a moving window average across our 4 x 4 image, we will use a built-in function that convolves a constant across a window of shape 2 x 2. The function we will use is tf.nn.conv2d(); this function is quite commonly used in image processing and in TensorFlow.

This function takes an element-wise product of the window and a filter we specify and sums the results. We must also specify a stride for the moving window in both directions. Here, we will compute four moving window averages: the upper-left, upper-right, lower-left, and lower-right four pixels. We do this by creating a 2 x 2 window and having strides of length 2 in each direction. To take the average, we will convolve the 2 x 2 window with a constant of 0.25, as follows:

def mov_avg_layer(x):
    my_filter = tf.constant(0.25, shape=[2, 2, 1, 1]) 
    my_strides = [1, 2, 2, 1] 
    layer = tf.nn.conv2d(x, my_filter, my_strides, 
                         padding='SAME', name='Moving_Avg_Window')
    return layer

Note that we are also naming this layer Moving_Avg_Window by using the name argument of the function.
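
To convince ourselves that this convolution really is a block average, the following sketch compares it against means computed by hand. It uses a fixed image instead of random values so the numbers are reproducible; the names image, conv_out, and block_means are illustrative:

```python
import numpy as np
import tensorflow as tf

# A fixed 4 x 4 single-channel image in (batch, height, width, channel) layout.
image = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)

my_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
conv_out = tf.nn.conv2d(image, my_filter, strides=[1, 2, 2, 1],
                        padding='SAME')

# The same averages by hand: split the image into 2 x 2 blocks
# and take the mean of each block.
block_means = image.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(tf.squeeze(conv_out).numpy())
print(block_means)
```

With a 4 x 4 input, a 2 x 2 filter, and a stride of 2, no zero padding is needed, so both approaches should give the same four values.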

To figure out the output size of a convolutional layer, we can use the following formula: Output = (W - F + 2P)/S + 1, where W is the input size, F is the filter size, P is the padding of zeros, and S is the stride.
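
As a quick worked example, the helper below evaluates the standard relation Output = (W - F + 2P)/S + 1 along one axis. The function name conv_output_size is hypothetical, and rounding down when the division is not exact is an assumption on our part:

```python
def conv_output_size(w, f, p, s):
    """Output length along one axis: (W - F + 2P) / S + 1, rounded down."""
    return (w - f + 2 * p) // s + 1

# Our moving window: 4-pixel input, 2-pixel filter, no padding, stride 2.
print(conv_output_size(w=4, f=2, p=0, s=2))

# A 3-pixel filter with one pixel of padding and stride 1 preserves the size.
print(conv_output_size(w=5, f=3, p=1, s=1))
```

For the recipe's settings this gives 2 along each axis, which is why the layer maps the 4 x 4 image to a 2 x 2 output.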

Now, we define a custom layer that will operate on the 2 x 2 output of the moving window average. The custom function will first multiply the input by another 2 x 2 matrix tensor, and then add 1 to each entry. After this, we take the sigmoid of each element and return the 2 x 2 matrix. Since matrix multiplication only operates on two-dimensional matrices, we need to drop the extra dimensions of our image that are of size 1. TensorFlow can do this with the built-in squeeze() function. Here, we define the new layer:

def custom_layer(input_matrix):
    # Drop the size-1 batch and channel dimensions to get a 2 x 2 matrix
    input_matrix_squeezed = tf.squeeze(input_matrix)
    A = tf.constant([[1., 2.], [-1., 3.]])
    b = tf.constant(1., shape=[2, 2])
    temp1 = tf.matmul(A, input_matrix_squeezed)
    temp = tf.add(temp1, b) # Ax + b
    return tf.sigmoid(temp)

Now, we have to arrange the two layers in the network. We will do this by calling one layer function after the other, as follows:

first_layer = mov_avg_layer(x_data) 
second_layer = custom_layer(first_layer)

Now, we just feed in the 4 x 4 image into the functions. Finally, we can check the result, as follows:

print(second_layer)
 
tf.Tensor(
[[0.9385519  0.90720266]
 [0.9247799  0.82272065]], shape=(2, 2), dtype=float32)

Let's now understand more in depth how it works.

How it works...

The first layer is named Moving_Avg_Window. The second is a collection of operations called Custom_Layer. Data processed by these two layers is first collapsed on the left and then expanded on the right. As shown by the example, you can wrap all the layers into functions and call them, one after the other, in a way that later layers process the outputs of previous ones.

Implementing loss functions

For this recipe, we will cover some of the main loss functions that we can use in TensorFlow. Loss functions are a key aspect of machine learning algorithms. They measure the distance between the model outputs and the target (truth) values.

In order to optimize our machine learning algorithms, we will need to evaluate the outcomes. Evaluating outcomes in TensorFlow depends on specifying a loss function. A loss function tells TensorFlow how good or bad the predictions are compared to the desired result. In most cases, we will have a set of data and a target on which to train our algorithm. The loss function compares the target to the prediction and quantifies the difference between the two as a single number.

Getting ready

We first load matplotlib, a Python plotting package, and TensorFlow, as follows:

import matplotlib.pyplot as plt
import tensorflow as tf

Now that we are ready to plot, let's proceed to the recipe without further ado.

How to do it...

First, we will talk about loss functions for regression, which means predicting a continuous dependent variable. To start, we will create a sequence of our predictions and a target as a tensor. We will output the results across 500 x values between -1 and 1. See the How it works... section for a plot of the outputs. Use the following code:

x_vals = tf.linspace(-1., 1., 500) 
target = tf.constant(0.) 

The L2 norm loss is also known as the Euclidean loss function. It is just the square of the distance to the target. Here, we will compute the loss function as if the target is zero. The L2 norm is a great loss function because it is very curved near the target: its gradient shrinks as the prediction approaches the target, so algorithms can use this fact to take smaller steps and converge more slowly the closer they get. We can implement this as follows:

def l2(y_true, y_pred):
    return tf.square(y_true - y_pred) 

TensorFlow has a built-in form of the L2 norm, called tf.nn.l2_loss(). This function actually computes half the sum of the squared entries of its input, so applied to the difference y_true - y_pred it is the same as the previous one summed over all elements and divided by 2.

The L1 norm loss is also known as the absolute loss function. Instead of squaring the difference, we take the absolute value. The L1 norm is better for outliers than the L2 norm because it is not as steep for larger values. One issue to be aware of is that the L1 norm is not smooth at the target, and this can result in algorithms not converging well. It appears as follows:

def l1(y_true, y_pred):
    return tf.abs(y_true - y_pred)

Pseudo-Huber loss is a continuous and smooth approximation to the Huber loss function. This loss function attempts to take the best of the L1 and L2 norms by being convex near the target and less steep for extreme values. The form depends on an extra parameter, delta, which dictates how steep it will be. We will plot two forms, delta1 = 0.25 and delta2 = 5, to show the difference, as follows:

def phuber1(y_true, y_pred):
    delta1 = tf.constant(0.25) 
    return tf.multiply(tf.square(delta1), tf.sqrt(1. +  
                        tf.square((y_true - y_pred)/delta1)) - 1.) 
def phuber2(y_true, y_pred):
    delta2 = tf.constant(5.) 
    return tf.multiply(tf.square(delta2), tf.sqrt(1. +  
                        tf.square((y_true - y_pred)/delta2)) - 1.) 
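
To see the two regimes of the pseudo-Huber loss numerically, here is a small sketch in plain NumPy that uses the same formula as phuber1 and phuber2. The helper name pseudo_huber and the sample residuals 0.01 and 100 are illustrative choices:

```python
import numpy as np

def pseudo_huber(residual, delta):
    # delta^2 * (sqrt(1 + (residual / delta)^2) - 1)
    return delta**2 * (np.sqrt(1. + (residual / delta)**2) - 1.)

delta = 1.0
small, large = 0.01, 100.0

# Near the target the loss is close to residual^2 / 2 (L2-like).
print(pseudo_huber(small, delta), small**2 / 2)

# Far from the target it grows roughly like delta * |residual| (L1-like).
print(pseudo_huber(large, delta), delta * abs(large))
```

The parameter delta sets roughly where the loss switches from quadratic to linear growth, which is why the two plotted curves differ so much in steepness.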

Now, we'll move on to loss functions for classification problems. Classification loss functions are used to evaluate loss when predicting categorical outcomes. Usually, the output of our model for a class category is a real-value number between 0 and 1. Then, we choose a cutoff (0.5 is commonly chosen) and classify the outcome as being in that category if the number is above the cutoff. Next, we'll consider various loss functions for categorical outputs.

To start, we will need to redefine our predictions (x_vals) and target. We will save the outputs and plot them in the next section. Use the following:

x_vals = tf.linspace(-3., 5., 500) 
target = tf.fill([500,], 1.)

Hinge loss is mostly used for support vector machines but can be used in neural networks as well. It is meant to compute a loss among two target classes, 1 and -1. In the following code, we are using the target value 1, so the closer our predictions are to 1, the lower the loss value:

def hinge(y_true, y_pred):
    return tf.maximum(0., 1. - tf.multiply(y_true, y_pred))

Cross-entropy loss for a binary case is also sometimes referred to as the logistic loss function. It comes about when we are predicting the two classes 0 or 1. We wish to measure a distance from the actual class (0 or 1) to the predicted value, which is usually a real number between 0 and 1. To measure this distance, we can use the cross-entropy formula from information theory, as follows:

def xentropy(y_true, y_pred):
    return (- tf.multiply(y_true, tf.math.log(y_pred)) -   
          tf.multiply((1. - y_true), tf.math.log(1. - y_pred))) 

Sigmoid cross-entropy loss is very similar to the previous loss function except we transform the x values using the sigmoid function before we put them in the cross-entropy loss, as follows:

def xentropy_sigmoid(y_true, y_pred):
    return tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true,  
                                                   logits=y_pred) 

Weighted cross-entropy loss is a weighted version of sigmoid cross-entropy loss. We provide a weight on the positive target. For an example, we will weight the positive target by 0.5, as follows:

def xentropy_weighted(y_true, y_pred):
    weight = tf.constant(0.5) 
    return tf.nn.weighted_cross_entropy_with_logits(labels=y_true,
                                                    logits=y_pred,  
                                                pos_weight=weight)

Softmax cross-entropy loss operates on non-normalized outputs. This function is used to measure a loss when there is only one target category instead of multiple. Because of this, the function transforms the outputs into a probability distribution via the softmax function and then computes the loss function from a true probability distribution, as follows:

def softmax_xentropy(y_true, y_pred):
    return tf.nn.softmax_cross_entropy_with_logits(labels=y_true,
                                                   logits=y_pred)

unscaled_logits = tf.constant([[1., -3., 10.]])
target_dist = tf.constant([[0.1, 0.02, 0.88]])
print(softmax_xentropy(y_true=target_dist,
                       y_pred=unscaled_logits))
[ 1.16012561]

Sparse softmax cross-entropy loss is almost the same as softmax cross-entropy loss, except instead of the target being a probability distribution, it is an index of which category is true. Instead of a sparse all-zero target vector with one value of 1, we just pass in the index of the category that is the true value, as follows:

def sparse_xentropy(y_true, y_pred):
    return tf.nn.sparse_softmax_cross_entropy_with_logits(
                                                    labels=y_true,
                                                    logits=y_pred) 
unscaled_logits = tf.constant([[1., -3., 10.]]) 
sparse_target_dist = tf.constant([2]) 
print(sparse_xentropy(y_true=sparse_target_dist,  
                      y_pred=unscaled_logits))
[ 0.00012564] 
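
To see where these two numbers come from, the following sketch recomputes both losses by hand in plain NumPy from the same logits and targets. The helper log_softmax is written out here for illustration and is not a TensorFlow call:

```python
import numpy as np

def log_softmax(logits):
    shifted = logits - np.max(logits)      # subtract the max for stability
    return shifted - np.log(np.sum(np.exp(shifted)))

logits = np.array([1., -3., 10.])

# Dense target: a full probability distribution over the three classes.
target_dist = np.array([0.1, 0.02, 0.88])
dense_loss = -np.sum(target_dist * log_softmax(logits))

# Sparse target: just the index of the true class.
true_class = 2
sparse_loss = -log_softmax(logits)[true_class]

print(dense_loss)
print(sparse_loss)
```

Both results should agree with the TensorFlow outputs up to floating-point precision. The sparse loss is tiny because the logit of the true class is far larger than the other two.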

Now let's understand better how such loss functions operate by plotting them on a graph.

How it works...

Here is how to use matplotlib to plot the regression loss functions:

x_vals = tf.linspace(-1., 1., 500) 
target = tf.constant(0.) 
funcs = [(l2, 'b-', 'L2 Loss'),
         (l1, 'r--', 'L1 Loss'),
         (phuber1, 'k-.', 'P-Huber Loss (0.25)'),
         (phuber2, 'g:', 'P-Huber Loss (5.0)')]
for func, line_type, func_name in funcs:
    plt.plot(x_vals, func(y_true=target, y_pred=x_vals), 
             line_type, label=func_name)
plt.ylim(-0.2, 0.4) 
plt.legend(loc='lower right', prop={'size': 11}) 
plt.show()

We get the following plot as output from the preceding code:

Figure 2.1: Plotting various regression loss functions

Here is how to use matplotlib to plot the various classification loss functions:

x_vals = tf.linspace(-3., 5., 500)  
target = tf.fill([500,], 1.)
funcs = [(hinge, 'b-', 'Hinge Loss'),
         (xentropy, 'r--', 'Cross Entropy Loss'),
         (xentropy_sigmoid, 'k-.', 'Cross Entropy Sigmoid Loss'),
         (xentropy_weighted, 'g:', 'Weighted Cross Entropy Loss (x0.5)')]
for func, line_type, func_name in funcs:
    plt.plot(x_vals, func(y_true=target, y_pred=x_vals), 
             line_type, label=func_name)
plt.ylim(-1.5, 3) 
plt.legend(loc='lower right', prop={'size': 11}) 
plt.show()

We get the following plot from the preceding code:

Figure 2.2: Plots of classification loss functions

Each of these loss curves provides different advantages to the neural network optimizing it. We are now going to discuss this a little bit more.

There's more...

Here is a table summarizing the properties and benefits of the different loss functions that we have just graphically described:

Loss function | Use | Benefits | Disadvantages
--- | --- | --- | ---
L2 | Regression | More stable | Less robust
L1 | Regression | More robust | Less stable
Pseudo-Huber | Regression | More robust and stable | One more parameter
Hinge | Classification | Creates a max margin for use in SVM | Unbounded loss affected by outliers
Cross-entropy | Classification | More stable | Unbounded loss, less robust

The remaining classification loss functions all have to do with the type of cross-entropy loss. The cross-entropy sigmoid loss function is for use on unscaled logits and is preferred over computing the sigmoid loss and then the cross-entropy loss, because TensorFlow has better built-in ways to handle numerical edge cases. The same goes for softmax cross-entropy and sparse softmax cross-entropy.
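
To illustrate the kind of numerical edge case involved, here is a sketch in plain NumPy. The stable form, max(x, 0) - x * z + log(1 + exp(-|x|)) for logits x and labels z, follows our reading of the TensorFlow documentation for tf.nn.sigmoid_cross_entropy_with_logits; both helper functions are illustrative and not TensorFlow's actual implementation:

```python
import numpy as np

def naive_sigmoid_xent(labels, logits):
    p = 1. / (1. + np.exp(-logits))        # sigmoid first ...
    return -labels * np.log(p) - (1. - labels) * np.log(1. - p)

def stable_sigmoid_xent(labels, logits):
    # max(x, 0) - x * z + log(1 + exp(-|x|))
    return (np.maximum(logits, 0.) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

labels = np.array([1., 0., 1.])
logits = np.array([-2., 0.5, 3.])
print(naive_sigmoid_xent(labels, logits))
print(stable_sigmoid_xent(labels, logits))

# For an extreme logit, the naive form hits log(0); the stable form does not.
with np.errstate(divide='ignore', over='ignore', invalid='ignore'):
    print(naive_sigmoid_xent(np.array([0.]), np.array([800.])))
print(stable_sigmoid_xent(np.array([0.]), np.array([800.])))
```

For moderate logits the two forms agree, while for a very confident wrong prediction the naive version overflows to infinity and the stable one returns a finite loss.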

Most of the classification loss functions described here are for two-class predictions. This can be extended to multiple classes by summing the cross-entropy terms over each prediction/target.

There are also many other metrics to look at when evaluating a model. Here is a list of some more to consider:

Model metric | Description
--- | ---
R-squared (coefficient of determination) | For linear models, this is the proportion of variance in the dependent variable that is explained by the independent data. For models with a larger number of features, consider using adjusted R-squared.
Root mean squared error | For continuous models, this measures the difference between prediction and actual via the square root of the average squared error.
Confusion matrix | For categorical models, we look at a matrix of predicted categories versus actual categories. A perfect model has all the counts along the diagonal.
Recall | For categorical models, this is the fraction of true positives over all actual positives.
Precision | For categorical models, this is the fraction of true positives over all predicted positives.
F-score | For categorical models, this is the harmonic mean of precision and recall.
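
As a worked example of the last three metrics, the sketch below computes them from confusion-matrix counts, taking precision as TP / (TP + FP) and recall as TP / (TP + FN). The function name and the counts are hypothetical:

```python
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(precision, recall, f1)
```

Because the F-score is a harmonic mean, it sits closer to the lower of the two values, so a model cannot score well by maximizing only one of them.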

To choose the right metric, you have to both evaluate the problem you want to solve (each metric behaves differently and, depending on the problem at hand, some loss minimization strategies will prove better than others) and experiment with the behavior of the neural network.

Implementing backpropagation

One of the benefits of using TensorFlow is that it can keep track of operations and automatically update model variables based on backpropagation. In this recipe, we will introduce how to use this aspect to our advantage when training machine learning models.

Getting ready

Now, we will introduce how to change our variables in the model in such a way that a loss function is minimized. We have learned how to use objects and operations, and how to create loss functions that will measure the distance between our predictions and targets. Now, we just have to tell TensorFlow how to backpropagate errors through our network in order to update the variables in such a way to minimize the loss function. This is achieved by declaring an optimization function. Once we have an optimization function declared, TensorFlow will go through and figure out the backpropagation terms for all of our computations in the graph. When we feed data in and minimize the loss function, TensorFlow will modify our variables in the network accordingly.

For this recipe, we will do a very simple regression algorithm. We will sample random numbers from a normal distribution, with mean 1 and standard deviation 0.1. Then, we will run the numbers through one operation, which will be to multiply them by a weight tensor and then add a bias tensor. From this, the loss function will be the L2 norm between the output and the target. Our target will show a high correlation with our input, so the task won't be too complex, yet the recipe will be interestingly demonstrative, and easily reusable for more complex problems.

The second example is a very simple binary classification algorithm. Here, we will generate 100 numbers from two normal distributions, N(-3, 1) and N(3, 1). All the numbers from N(-3, 1) will be in target class 0, and all the numbers from N(3, 1) will be in target class 1. The model to differentiate these classes (which are perfectly separable) will again be a linear model, optimized according to the sigmoid cross-entropy loss function, which first applies a sigmoid transformation to the model result and then computes the cross-entropy loss.

While specifying a good learning rate helps the convergence of algorithms, we must also specify a type of optimization. In the preceding two examples, we use standard gradient descent. This is implemented with the tf.optimizers.SGD TensorFlow class.

How to do it...

We'll start with the regression example. First, we load the usual numerical Python packages that always accompany our recipes, NumPy and TensorFlow, together with matplotlib for plotting:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

Next, we create the data. In order to make everything easily replicable, we set the random seed to a specific value. We will always repeat this in our recipes, so that we obtain exactly the same results; you can check for yourself how chance may vary the results in the recipes by simply changing the seed number.

Moreover, in order to get assurance that the target and input have a good correlation, plot a scatterplot of the two variables:

np.random.seed(0)
x_vals = np.random.normal(1, 0.1, 100).astype(np.float32) 
y_vals = (x_vals * (np.random.normal(1, 0.05, 100) - 0.5)).astype(np.float32)
plt.scatter(x_vals, y_vals)
plt.show()

Figure 2.3: Scatterplot of x_vals and y_vals

We add the structure of the network (a linear model of the type bX + a) as a function:

def my_output(X, weights, biases):
    return tf.add(tf.multiply(X, weights), biases)

Next, we add our L2 Loss function to be applied to the results of the network:

def loss_func(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

Now, we have to declare a way to optimize the variables in our graph. We declare an optimization algorithm. Most optimization algorithms need to know how far to step in each iteration. Such a distance is controlled by the learning rate. Setting it to a correct value is specific to the problem we are dealing with, so we can figure out a suitable setting only by experimenting. Anyway, if our learning rate is too high, our algorithm might overshoot the minimum, but if our learning rate is too low, our algorithm might take too long to converge.
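
To make the effect of the step size concrete, here is a toy sketch in plain Python that runs gradient descent on f(w) = w**2, whose minimum is at w = 0. The function descend and the three learning rates are illustrative choices, unrelated to the recipe's data:

```python
def descend(learning_rate, steps=50, w=1.0):
    """Plain gradient descent on f(w) = w**2, whose gradient is 2 * w."""
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

for lr in (0.001, 0.1, 1.1):
    print(f'learning_rate={lr}: w after 50 steps = {descend(lr):.6g}')
```

With the smallest rate, w has barely moved after 50 steps; with the middle rate, it is very close to the minimum; with the largest, each step overshoots by more than the previous one and w diverges.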

The learning rate has a big influence on convergence and we will discuss it again at the end of the section. While we're using the standard gradient descent algorithm, there are many other alternative options. There are, for instance, optimization algorithms that operate differently and can achieve a better or worse optimum depending on the problem. For a great overview of different optimization algorithms, see the paper by Sebastian Ruder in the See also section at the end of this recipe:

my_opt = tf.optimizers.SGD(learning_rate=0.02)

There is a lot of theory on which learning rates are best. This is one of the harder things to figure out in machine learning algorithms. Good papers to read about how learning rates are related to specific optimization algorithms are listed in the See also section at the end of this recipe.

Now we can initialize our network variables (weights and biases) and set a recording list (named history) to help us visualize the optimization steps:

tf.random.set_seed(1)
np.random.seed(0)
weights = tf.Variable(tf.random.normal(shape=[1])) 
biases = tf.Variable(tf.random.normal(shape=[1])) 
history = list()

The final step is to loop through our training algorithm and tell TensorFlow to train many times. We will do this 100 times and print out results every 25th iteration. To train, we will select a random x and y entry and feed it through the graph. TensorFlow will automatically compute the loss, and slightly change the weights and biases to minimize the loss:

for i in range(100): 
    rand_index = np.random.choice(100) 
    rand_x = [x_vals[rand_index]] 
    rand_y = [y_vals[rand_index]]
    with tf.GradientTape() as tape:
        predictions = my_output(rand_x, weights, biases)
        loss = loss_func(rand_y, predictions)
    history.append(loss.numpy())
    gradients = tape.gradient(loss, [weights, biases])
    my_opt.apply_gradients(zip(gradients, [weights, biases]))
    if (i + 1) % 25 == 0: 
        print(f'Step # {i+1} Weights: {weights.numpy()} Biases: {biases.numpy()}')
        print(f'Loss = {loss.numpy()}') 
Step # 25 Weights: [-0.58009654] Biases: [0.91217995]
Loss = 0.13842473924160004
Step # 50 Weights: [-0.5050226] Biases: [0.9813488]
Loss = 0.006441597361117601
Step # 75 Weights: [-0.4791306] Biases: [0.9942327]
Loss = 0.01728087291121483
Step # 100 Weights: [-0.4777394] Biases: [0.9807473]
Loss = 0.05371852591633797

Inside the loop, tf.GradientTape() lets TensorFlow track the computations and calculate the gradient with respect to the observed variables. Every trainable variable used within the GradientTape() scope is monitored automatically (keep in mind that constants are not monitored unless you explicitly request it with tape.watch(constant)). Once the monitored computation is complete, you can compute the gradient of a target with respect to a list of sources (using tape.gradient(target, sources)) and get back a list of eager tensors, one gradient per source. Passing these gradients to my_opt.apply_gradients() then updates the sources (in our case, the weights and biases variables) with new values.
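To see what the tape hands back, it can help to derive the same gradients by hand. The sketch below is plain Python rather than TensorFlow; it assumes the model is w * x + b and the loss is the squared error, and the numeric values are made up for illustration. It compares the hand-derived gradients with a finite-difference estimate.

```python
def l2_loss(w, b, x, y):
    """Squared error of the assumed linear model w * x + b on one example."""
    return (w * x + b - y) ** 2

def analytic_gradients(w, b, x, y):
    """Hand-derived d(loss)/dw and d(loss)/db for the loss above."""
    residual = w * x + b - y
    return 2.0 * residual * x, 2.0 * residual

# Made-up illustrative values, not taken from the recipe's data.
w, b, x, y = 0.3, -0.1, 1.2, 0.5
grad_w, grad_b = analytic_gradients(w, b, x, y)

# Central finite differences as an independent check.
eps = 1e-6
numeric_w = (l2_loss(w + eps, b, x, y) - l2_loss(w - eps, b, x, y)) / (2 * eps)
numeric_b = (l2_loss(w, b + eps, x, y) - l2_loss(w, b - eps, x, y)) / (2 * eps)

print(f"d(loss)/dw: analytic={grad_w:.6f} numeric={numeric_w:.6f}")
print(f"d(loss)/db: analytic={grad_b:.6f} numeric={numeric_b:.6f}")
```

Automatic differentiation saves us from writing analytic_gradients by hand for every model, which matters once the model is more than one line.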

When the training is completed, we can visualize how the optimization process operates over successive gradient applications:

plt.plot(history)
plt.xlabel('iterations')
plt.ylabel('loss')
plt.show()

Figure 2.4: L2 loss through iterations in our recipe

At this point, we will introduce the code for the simple classification example. We can use the same TensorFlow script, with some updates. Remember, we will attempt to find an optimal set of weights and biases that will separate the data into two different classes.

First, we pull in the data from two different normal distributions, N(-3, 1) and N(3, 1). We will also generate the target labels and visualize how the two classes are distributed along our predictor variable:

np.random.seed(0)
x_vals = np.concatenate((np.random.normal(-3, 1, 50), 
                         np.random.normal(3, 1, 50))
                    ).astype(np.float32) 
y_vals = np.concatenate((np.repeat(0., 50), np.repeat(1., 50))).astype(np.float32) 
plt.hist(x_vals[y_vals==1], color='b')
plt.hist(x_vals[y_vals==0], color='r')
plt.show()

Figure 2.5: Class distribution on x_vals

Because the specific loss function for this problem is sigmoid cross-entropy, we update our loss function:

def loss_func(y_true, y_pred):
    return tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, 
                                                logits=y_pred))
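For intuition about what this loss computes, here is a NumPy sketch of the numerically stable formula that the TensorFlow documentation gives for sigmoid cross-entropy on logits, max(x, 0) - x * z + log(1 + exp(-|x|)), compared against the textbook definition. The function names and sample values are our own and are only illustrative.

```python
import numpy as np

def stable_sigmoid_xent(labels, logits):
    """Stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    labels = np.asarray(labels, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

def textbook_xent(labels, logits):
    """Direct definition: -[z * log(p) + (1 - z) * log(1 - p)], p = sigmoid(x)."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

labels = np.array([0.0, 1.0, 1.0, 0.0])     # made-up targets
logits = np.array([-2.0, 3.0, -0.5, 0.5])   # made-up model outputs

print(stable_sigmoid_xent(labels, logits))
print(textbook_xent(labels, logits))

# The stable form stays finite even for an extreme logit.
extreme = stable_sigmoid_xent([1.0], [1000.0])
print(extreme)
```

Working on logits rather than on probabilities is why the model output is left without a sigmoid: the loss applies it internally in a way that avoids taking the logarithm of a value that has rounded to zero.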

Next, we initialize our variables:

tf.random.set_seed(1)
np.random.seed(0)
weights = tf.Variable(tf.random.normal(shape=[1])) 
biases = tf.Variable(tf.random.normal(shape=[1])) 
history = list()

Finally, we loop 100 times, each time drawing a randomly selected data point and updating the weights and biases variables accordingly. As before, every 25 iterations we print out the value of our variables and the loss:

for i in range(100):    
    rand_index = np.random.choice(100) 
    rand_x = [x_vals[rand_index]] 
    rand_y = [y_vals[rand_index]]
    with tf.GradientTape() as tape:
        predictions = my_output(rand_x, weights, biases)
        loss = loss_func(rand_y, predictions)
    history.append(loss.numpy())
    gradients = tape.gradient(loss, [weights, biases])
    my_opt.apply_gradients(zip(gradients, [weights, biases]))
    if (i + 1) % 25 == 0: 
        print(f'Step # {i+1} Weights: {weights.numpy()} Biases: {biases.numpy()}')
        print(f'Loss = {loss.numpy()}')
Step # 25 Weights: [-0.01804185] Biases: [0.44081175]
Loss = 0.5967269539833069
Step # 50 Weights: [0.49321094] Biases: [0.37732077]
Loss = 0.3199256658554077
Step # 75 Weights: [0.7071932] Biases: [0.32154965]
Loss = 0.03642747551202774
Step # 100 Weights: [0.8395616] Biases: [0.30409005]
Loss = 0.028119442984461784

Here too, a plot reveals how the optimization proceeded:

plt.plot(history)
plt.xlabel('iterations')
plt.ylabel('loss')
plt.show()

Figure 2.6: Sigmoid cross-entropy loss through iterations in our recipe

The overall direction of the plot is clear, though the trajectory is a bit bumpy because we are learning from one example at a time, which makes the learning process decidedly stochastic. The bumps also suggest that it may be worth trying a slightly lower learning rate.

How it works...

For a recap and explanation, for both examples, we did the following:

  1. We created the data. Both examples needed to load data into specific variables used by the function that computes the network.
  2. We initialized variables. We used random Gaussian values, but initialization is a topic of its own, since the final results can depend heavily on how we initialize our network (change the random seed before initialization to see this for yourself).
  3. We created a loss function. We used the L2 loss for regression and the cross-entropy loss for classification.
  4. We defined an optimization algorithm. Both examples used gradient descent.
  5. We iterated across random data samples to iteratively update our variables.
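The five steps above can be sketched end to end without TensorFlow. The following NumPy version of the classification example is only an illustration: the zero initialization, the 500 iterations, and the hand-written gradient are our assumptions, not the recipe's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Create the data: two classes drawn from N(-3, 1) and N(3, 1).
x_vals = np.concatenate([rng.normal(-3, 1, 50), rng.normal(3, 1, 50)])
y_vals = np.concatenate([np.zeros(50), np.ones(50)])

# 2. Initialize the variables (zeros here, an assumption for reproducibility).
weight, bias = 0.0, 0.0

# 3. Loss: sigmoid cross-entropy on one example, with its gradients.
def loss_and_grads(x, y, weight, bias):
    logit = weight * x + bias
    prob = 1.0 / (1.0 + np.exp(-logit))
    loss = np.maximum(logit, 0) - logit * y + np.log1p(np.exp(-abs(logit)))
    # d(loss)/d(logit) = prob - y; the chain rule gives the rest.
    return loss, (prob - y) * x, (prob - y)

# 4. Optimizer: plain gradient descent with a fixed learning rate.
learning_rate = 0.02

# 5. Iterate over randomly chosen examples.
for step in range(500):
    i = rng.integers(100)
    loss, grad_w, grad_b = loss_and_grads(x_vals[i], y_vals[i], weight, bias)
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

# A positive logit means the model predicts class 1.
accuracy = np.mean((weight * x_vals + bias > 0) == (y_vals == 1))
print(f"weight={weight:.3f} bias={bias:.3f} accuracy={accuracy:.2f}")
```

With TensorFlow, step 3 shrinks to the loss alone, because the tape supplies the gradients.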

There's more...

As we mentioned before, the optimization algorithm is sensitive to the choice of learning rate. It is important to summarize the effect of this choice in a concise manner:

| Learning rate size | Advantages/disadvantages | Uses |
| Smaller learning rate | Converges slower, but gives more accurate results | If the solution is unstable, try lowering the learning rate first |
| Larger learning rate | Less accurate, but converges faster | For some problems, helps prevent solutions from stagnating |

Sometimes, the standard gradient descent algorithm can get stuck or slow down significantly. This can happen when the optimization reaches the flat spot of a saddle. To combat this, we can add a momentum term, which adds a fraction of the previous step's update to the current one. You can enable this by setting the momentum and nesterov parameters, along with the learning rate, in tf.optimizers.SGD (see https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/SGD for more details).
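As a rough illustration of why momentum helps on flat regions, the pure-Python sketch below applies the velocity-based update described in the SGD documentation linked above to a deliberately flat toy function. The function, the starting point, and the hyperparameter values are assumptions made for the example.

```python
def sgd_steps(momentum, nesterov=False, learning_rate=0.1, steps=100):
    """Minimize the nearly flat f(w) = 0.005 * w**2 from w = 1 and return w."""
    w, velocity = 1.0, 0.0
    for _ in range(steps):
        grad = 0.01 * w                                  # derivative of f
        velocity = momentum * velocity - learning_rate * grad
        if nesterov:
            # Nesterov variant: look ahead along the updated velocity.
            w = w + momentum * velocity - learning_rate * grad
        else:
            w = w + velocity
    return w

plain = sgd_steps(momentum=0.0)          # reduces to ordinary gradient descent
with_momentum = sgd_steps(momentum=0.9)
with_nesterov = sgd_steps(momentum=0.9, nesterov=True)
print(f"plain={plain:.3f} momentum={with_momentum:.3f} nesterov={with_nesterov:.3f}")
```

The minimum is at w = 0, so a smaller final value means faster progress across the flat stretch. On steep or noisy problems, the same momentum can overshoot, so it is another setting to tune rather than a free improvement.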

Another variant is to vary the optimizer step for each variable in our model. Ideally, we would like to take larger steps for slowly changing variables and shorter steps for rapidly changing ones. We will not go into the mathematics of this approach, but a common implementation of this idea is the Adagrad algorithm, which takes into account the whole history of each variable's gradients. In TensorFlow 2.x, it is available as tf.optimizers.Adagrad (https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adagrad).
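A simplified NumPy sketch of the Adagrad idea follows. It starts the accumulator at zero and uses constant made-up gradients, both of which are assumptions for illustration; the TensorFlow implementation has its own defaults.

```python
import numpy as np

def adagrad_update(weights, grads, accumulator, learning_rate=0.1, epsilon=1e-7):
    """One simplified Adagrad step: scale each step by the gradient history."""
    accumulator = accumulator + grads ** 2
    step = learning_rate * grads / (np.sqrt(accumulator) + epsilon)
    return weights - step, accumulator

weights = np.array([1.0, 1.0])
accumulator = np.zeros(2)
grads = np.array([10.0, 0.1])   # one fast-changing and one slow-changing variable

for _ in range(10):
    weights, accumulator = adagrad_update(weights, grads, accumulator)

# Effective per-variable learning rate after ten steps.
effective_rate = 0.1 / (np.sqrt(accumulator) + 1e-7)
print(effective_rate)
```

The variable that keeps receiving large gradients ends up with the smaller effective learning rate, which is the per-variable adaptation described above.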

Sometimes, Adagrad shrinks the step size toward zero too soon because it takes the whole gradient history into account. A solution is to limit how much of that history we use; this is the idea behind the Adadelta algorithm, available in TensorFlow 2.x as tf.optimizers.Adadelta (https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adadelta).
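The difference between using the whole history and a limited one can be seen in the squared-gradient statistic alone. The pure-Python sketch below compares a running sum with an exponentially decaying average; it covers only this one ingredient of Adadelta (the full algorithm also rescales by a running average of past updates), and the decay value and constant gradients are assumptions.

```python
def accumulate(grads, rho=None):
    """Squared-gradient statistic after seeing every gradient in grads.

    rho=None sums the whole history; otherwise an exponentially
    decaying average with decay rho is used, which forgets old gradients.
    """
    total = 0.0
    for g in grads:
        if rho is None:
            total += g ** 2
        else:
            total = rho * total + (1.0 - rho) * g ** 2
    return total

grads = [1.0] * 1000                      # a constant made-up gradient
full_history = accumulate(grads)          # keeps growing with every step
windowed = accumulate(grads, rho=0.95)    # levels off near the recent average

print(f"full history: {full_history}, decaying average: {windowed:.4f}")
```

Because the step is divided by the square root of this statistic, an ever-growing sum keeps shrinking the step, whereas the decaying average lets it settle.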

There are a few other implementations of different gradient descent algorithms. For these, refer to the TensorFlow documentation at https://www.tensorflow.org/api_docs/python/tf/keras/optimizers.

See also

For some references on optimization algorithms and learning rates, see the following papers and articles:

Working with batch and stochastic training

While TensorFlow updates our model variables according to backpropagation, it can operate on anything from a single observation (as we did in the previous recipe) to a large batch of data at once. Operating on one training example can make for a very erratic learning process, while using too large a batch can be computationally expensive. Choosing the right type of training is crucial for getting our machine learning algorithms to converge to a solution.

Getting ready

In order for TensorFlow to compute the variable gradients for backpropagation to work, we have to measure the loss on a sample or multiple samples. Stochastic training only works on one randomly sampled data-target pair at a time, just as we did in the previous recipe. Another option is to put a larger portion of the training examples in at a time and average the loss for the gradient calculation. The sizes of the training batch can vary, up to and including the whole dataset at once. Here, we will show how to extend the prior regression example, which used stochastic training, to batch training.
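Averaging the loss over a batch has a simple consequence: the batch gradient is the average of the per-example gradients. The NumPy sketch below checks this for the squared-error loss of a linear model; the data mimics the regression example, while the current values of w and b are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1, 0.1, 20)
y = x * (rng.normal(1, 0.05, 20) - 0.5)
w, b = 0.5, 0.0                      # made-up current parameter values

def grad_w(w, b, x, y):
    """d/dw of mean((w * x + b - y)**2) over the examples passed in."""
    return np.mean(2.0 * (w * x + b - y) * x)

# Gradient of the averaged loss over the whole batch of 20.
batch_gradient = grad_w(w, b, x, y)

# Gradients computed one example at a time, as in stochastic training.
per_example = np.array([grad_w(w, b, x[i:i + 1], y[i:i + 1]) for i in range(20)])

print(batch_gradient, per_example.mean())
```

Each stochastic step follows one of the per_example values, while a batch step follows their average, which is why batch updates fluctuate less.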

We will start by loading NumPy, matplotlib, and TensorFlow, as follows:

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 

Now we just have to script our code and test our recipe in the How to do it… section.

How to do it...

We start by declaring a batch size. This will be how many data observations we will feed through the computational graph at one time:

batch_size = 20

Next, we just apply small modifications to the code used before for the regression problem:

np.random.seed(0)
x_vals = np.random.normal(1, 0.1, 100).astype(np.float32) 
y_vals = (x_vals * (np.random.normal(1, 0.05, 100) - 0.5)).astype(np.float32)
def loss_func(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))
tf.random.set_seed(1)
np.random.seed(0)
weights = tf.Variable(tf.random.normal(shape=[1])) 
biases = tf.Variable(tf.random.normal(shape=[1])) 
history_batch = list()
for i in range(50):    
    rand_index = np.random.choice(100, size=batch_size) 
    rand_x = [x_vals[rand_index]] 
    rand_y = [y_vals[rand_index]]
    with tf.GradientTape() as tape:
        predictions = my_output(rand_x, weights, biases)
        loss = loss_func(rand_y, predictions)
    history_batch.append(loss.numpy())
    gradients = tape.gradient(loss, [weights, biases])
    my_opt.apply_gradients(zip(gradients, [weights, biases]))
    if (i + 1) % 25 == 0: 
        print(f'Step # {i+1} Weights: {weights.numpy()} '
              f'Biases: {biases.numpy()}')
        print(f'Loss = {loss.numpy()}')

Since the previous recipe, we have learned how to use matrix multiplication in our network and in our cost function. At this point, we just need to deal with inputs made up of several rows (a batch) instead of a single example. We can even compare the result with the previous approach, which we can now call stochastic optimization:

tf.random.set_seed(1)
np.random.seed(0)
weights = tf.Variable(tf.random.normal(shape=[1])) 
biases = tf.Variable(tf.random.normal(shape=[1])) 
history_stochastic = list()
for i in range(50):    
    rand_index = np.random.choice(100, size=1) 
    rand_x = [x_vals[rand_index]] 
    rand_y = [y_vals[rand_index]]
    with tf.GradientTape() as tape:
        predictions = my_output(rand_x, weights, biases)
        loss = loss_func(rand_y, predictions)
    history_stochastic.append(loss.numpy())
    gradients = tape.gradient(loss, [weights, biases])
    my_opt.apply_gradients(zip(gradients, [weights, biases]))
    if (i + 1) % 25 == 0: 
        print(f'Step # {i+1} Weights: {weights.numpy()} '
              f'Biases: {biases.numpy()}')
        print(f'Loss = {loss.numpy()}')

Just running the code will retrain our network using batches. At this point, we need to evaluate the results, get some intuition about how it works, and reflect on the results. Let's proceed to the next section.

How it works...

Batch training and stochastic training differ in their optimization methods and their convergence. Finding a good batch size can be difficult. To see how convergence differs between batch training and stochastic training, you are encouraged to change the batch size to various levels.
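One way to get a feel for different batch sizes before retraining is to measure how noisy the gradient estimate is at each size. The NumPy sketch below does this for the regression data; the fixed parameter values (w = 0, b = 0), the number of trials, and the chosen batch sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x_vals = rng.normal(1, 0.1, 100)
y_vals = x_vals * (rng.normal(1, 0.05, 100) - 0.5)

def weight_gradient(indices, w=0.0, b=0.0):
    """Gradient of the mean squared error w.r.t. w on the selected examples."""
    x, y = x_vals[indices], y_vals[indices]
    return np.mean(2.0 * (w * x + b - y) * x)

def gradient_spread(batch_size, trials=500):
    """Standard deviation of the gradient estimate across random batches."""
    estimates = [weight_gradient(rng.choice(100, size=batch_size))
                 for _ in range(trials)]
    return np.std(estimates)

spreads = {size: gradient_spread(size) for size in (1, 20, 100)}
for size, spread in spreads.items():
    print(f"batch_size={size:3d}: gradient std = {spread:.4f}")
```

Larger batches give a steadier gradient at a higher cost per step, which is the trade-off to keep in mind when changing batch_size in the recipe.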

A visual comparison of the two approaches will explain better how using batches for this problem resulted in the same optimization as stochastic training, though there were fewer fluctuations during the process. Here is the code to produce the plot of both the stochastic and batch losses for the same regression problem. Note that the batch loss is much smoother and the stochastic loss is much more erratic:

plt.plot(history_stochastic, 'b-', label='Stochastic Loss') 
plt.plot(history_batch, 'r--', label='Batch Loss') 
plt.legend(loc='upper right', prop={'size': 11}) 
plt.show() 

Figure 2.7: Comparison of L2 loss when using stochastic and batch optimization

Now our graph displays a smoother trend line. The persistent presence of bumps could be solved by reducing the learning rate and adjusting the batch size.

There's more...

| Type of training | Advantages | Disadvantages |
| Stochastic | Randomness may help move out of local minima | Generally needs more iterations to converge |
| Batch | Finds minima quicker | Takes more resources to compute |

Combining everything together

In this section, we will combine everything we have illustrated so far and create a classifier for the iris dataset. The iris dataset is described in more detail in the Working with data sources recipe in Chapter 1, Getting Started with TensorFlow 2.x. We will load this data and make a simple binary classifier to predict whether a flower is the species Iris setosa or not. To be clear, this dataset has three species, but we will only predict whether a flower is Iris setosa or not, giving us a binary classifier.

Getting ready

We will start by loading the libraries and data and then transform the target accordingly. First, we load the libraries needed for our recipe. For the Iris dataset, we need the TensorFlow Datasets module, which we haven't used before in our recipes. Note that we also load matplotlib here, because we would like to plot the resultant line afterward:

import matplotlib.pyplot as plt 
import numpy as np 
import tensorflow as tf 
import tensorflow_datasets as tfds

How to do it...

As a starting point, let's first declare our batch size using a global variable:

batch_size = 20 

Next, we load the iris data. We will also need to transform the target data to be just 1 or 0, whether the target is setosa or not. Since the iris dataset marks setosa as a 0, we will change all targets with the value 0 to 1, and the other values all to 0. We will also only use two features, petal length and petal width. These two features are the third and fourth entry in each row of the dataset:

iris = tfds.load('iris', split='train[:90%]', as_supervised=True)
iris_test = tfds.load('iris', split='train[90%:]', as_supervised=True)
def iris2d(features, label):
    return features[2:], tf.cast((label == 0), dtype=tf.float32)
train_generator = (iris
                   .map(iris2d)
                   .shuffle(buffer_size=100)
                   .batch(batch_size)
                  )
test_generator = iris_test.map(iris2d).batch(1)

As shown in the previous chapter, we use the TensorFlow Datasets functions both to load the data and to apply the necessary transformations, creating a data generator that can dynamically feed our network with data instead of keeping it in an in-memory NumPy matrix. As a first step, we load the data, specifying that we want to split it (using the parameters split='train[:90%]' and split='train[90%:]'). This allows us to reserve a part (10%) of the dataset for the model evaluation, using data that has not been part of the training phase.

We also specify the parameter, as_supervised=True, that will allow us to access the data as tuples of features and labels when iterating from the dataset.

Now we transform the dataset into an iterable generator by applying successive transformations. We shuffle the data, we define the batch to be returned by the iterable, and, most important, we apply a custom function that filters and transforms the features and labels returned from the dataset at the same time.
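To make the shuffle, map, and batch chain less abstract, here is a NumPy-only imitation of it. It is not the tf.data API: the batches() helper is hypothetical, and the feature and label arrays are random placeholders with the same shape as the training split (135 rows, four features, three classes), not the real iris measurements.

```python
import numpy as np

def iris2d(features, label):
    """Keep the last two features and turn the label into 1.0 for class 0."""
    return features[2:], np.float32(label == 0)

def batches(features, labels, batch_size, rng):
    """Hypothetical stand-in for shuffle -> map -> batch on in-memory arrays."""
    order = rng.permutation(len(labels))                        # shuffle
    mapped = [iris2d(features[i], labels[i]) for i in order]    # map
    for start in range(0, len(mapped), batch_size):             # batch
        chunk = mapped[start:start + batch_size]
        yield (np.stack([f for f, _ in chunk]),
               np.array([l for _, l in chunk]))

rng = np.random.default_rng(0)
# Random placeholder data shaped like 90% of the 150-row dataset.
fake_features = rng.uniform(0.1, 7.0, size=(135, 4)).astype(np.float32)
fake_labels = rng.integers(0, 3, size=135)

shapes = [(f.shape, l.shape)
          for f, l in batches(fake_features, fake_labels, 20, rng)]
print(shapes)
```

Note that the last batch is smaller than batch_size whenever the number of rows is not a multiple of it, so the model code must not assume a fixed batch dimension.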

Then, we define the linear model. The model will take the usual form XA + b. Remember that TensorFlow has loss functions with the sigmoid built in, so we just need to define the output of the model prior to the sigmoid function:

def linear_model(X, A, b):
    my_output = tf.add(tf.matmul(X, A), b) 
    return tf.squeeze(my_output)
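The squeeze at the end matters because the labels in each batch are a flat vector, while the matrix product has a trailing dimension of one, and the sigmoid cross-entropy function expects logits and labels of the same shape. A NumPy sketch of the shapes involved, with made-up values for X, A, and b:

```python
import numpy as np

def linear_model(X, A, b):
    """NumPy counterpart of the model above: XA + b, flattened to one logit per row."""
    return np.squeeze(X @ A + b)

X = np.ones((20, 2), dtype=np.float32)            # a batch of 20 two-feature rows
A = np.array([[0.5], [-1.0]], dtype=np.float32)   # made-up weights, shape (2, 1)
b = np.array([0.25], dtype=np.float32)            # made-up bias, shape (1,)

raw = X @ A + b                  # shape (20, 1): one column of logits
logits = linear_model(X, A, b)   # shape (20,): matches a label vector of length 20

print(raw.shape, logits.shape)
```

One caveat of squeezing: a batch of a single example collapses to a scalar, which is how the test loop later in this recipe receives its predictions.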

Now, we add our sigmoid cross-entropy loss function with TensorFlow's built-in sigmoid_cross_entropy_with_logits() function:

def xentropy(y_true, y_pred):
    return tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, 
                                                logits=y_pred))

We also have to tell TensorFlow how to optimize our computational graph by declaring an optimizing method. We will want to minimize the cross-entropy loss. We will also choose 0.02 as our learning rate:

my_opt = tf.optimizers.SGD(learning_rate=0.02) 

Now, we will train our linear model for 300 iterations, each of which is a full pass over the batches produced by train_generator. We feed in the three values that we require for each example: petal length, petal width, and the target variable. Every 30 iterations, we print the variable values:

tf.random.set_seed(1)
np.random.seed(0)
A = tf.Variable(tf.random.normal(shape=[2, 1])) 
b = tf.Variable(tf.random.normal(shape=[1]))
history = list()
for i in range(300):
    iteration_loss = list()
    for features, label in train_generator:
        with tf.GradientTape() as tape:
            predictions = linear_model(features, A, b)
            loss = xentropy(label, predictions)
        iteration_loss.append(loss.numpy())
        gradients = tape.gradient(loss, [A, b])
        my_opt.apply_gradients(zip(gradients, [A, b]))
    history.append(np.mean(iteration_loss))
    if (i + 1) % 30 == 0:
        print(f'Step # {i+1} Weights: {A.numpy().T} '
              f'Biases: {b.numpy()}')
        print(f'Loss = {loss.numpy()}')
Step # 30 Weights: [[-1.1206311  1.2985772]] Biases: [1.0116111]
Loss = 0.4503694772720337
…
Step # 300 Weights: [[-1.5611029   0.11102282]] Biases: [3.6908474]
Loss = 0.10326375812292099

If we plot the loss against the iterations, the smooth decrease of the loss over time shows that learning has been quite an easy task for the linear model:

plt.plot(history)
plt.xlabel('iterations')
plt.ylabel('loss')
plt.show() 

Figure 2.8: Cross-entropy error for the Iris setosa data

We'll conclude by checking the performance on our reserved test data. This time we just take the examples from the test dataset. As expected, the resulting cross-entropy value is analogous to the training one:

predictions = list()
labels = list()
for features, label in test_generator:
    predictions.append(linear_model(features, A, b).numpy())
    labels.append(label.numpy()[0])
    
test_loss = xentropy(np.array(labels), np.array(predictions)).numpy()
print(f"test cross-entropy is {test_loss}")
test cross-entropy is 0.10227929800748825

The next set of commands extracts the model variables and plots the line on a graph:

coefficients = np.ravel(A.numpy())
intercept = b.numpy()
# Plotting batches of examples
for j, (features, label) in enumerate(train_generator):
    setosa_mask = label.numpy() == 1
    setosa = features.numpy()[setosa_mask]
    non_setosa = features.numpy()[~setosa_mask]
    plt.scatter(setosa[:,0], setosa[:,1], c='red', label='setosa')
    plt.scatter(non_setosa[:,0], non_setosa[:,1], c='blue', label='Non-setosa')
    if j==0:
        plt.legend(loc='lower right')
# Computing and plotting the decision function
a = -coefficients[0] / coefficients[1]
xx = np.linspace(plt.xlim()[0], plt.xlim()[1], num=10000)
yy = a * xx - intercept / coefficients[1]
on_the_plot = (yy > plt.ylim()[0]) & (yy < plt.ylim()[1])
plt.plot(xx[on_the_plot], yy[on_the_plot], 'k--')
plt.xlabel('Petal Length') 
plt.ylabel('Petal Width') 
plt.show() 
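The line drawn by the plotting code comes from setting the model output to zero, coefficients[0] * length + coefficients[1] * width + intercept = 0, and solving for the width. The NumPy sketch below repeats that algebra using the weights and bias printed at step 300 above; the range of petal lengths is an arbitrary choice for the check.

```python
import numpy as np

# Values printed at step 300 in the training output above.
coefficients = np.array([-1.5611029, 0.11102282])
intercept = 3.6908474

# Solve c0 * length + c1 * width + intercept = 0 for width.
slope = -coefficients[0] / coefficients[1]
offset = -intercept / coefficients[1]

petal_length = np.linspace(1.0, 7.0, 5)      # arbitrary sample of lengths
petal_width = slope * petal_length + offset  # matching widths on the boundary

# On the boundary the logit is zero, so the sigmoid gives exactly 0.5.
logits_on_line = (coefficients[0] * petal_length
                  + coefficients[1] * petal_width + intercept)
print(slope, offset)
print(logits_on_line)
```

Points where the logit is positive are classified as setosa, and points where it is negative as non-setosa.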

The resultant graph is in the How it works... section, where we also discuss the validity and reproducibility of the obtained results.

How it works...

Our goal was to fit a line between the Iris setosa points and the other two species using only petal width and petal length. If we plot the points, and separate the area of the plot where classifications are zero from the area where classifications are one with a line, we see that we have achieved this:

Figure 2.9: Plot of Iris setosa and non-setosa for petal width versus petal length; the dashed line is the linear separator that we achieved after 300 iterations

The way the separating line is defined depends on the data, the network architecture, and the learning process. Different starting situations, even due to the random initialization of the neural network's weights, may provide you with a slightly different solution.

There's more...

While we achieved our objective of separating the two classes with a line, it may not be the best model for separating two classes. For instance, after adding new observations, we may realize that our solution badly separates the two classes. As we progress into the next chapter, we will start dealing with recipes that address these problems by providing testing, randomization, and specialized layers that will increase the generalization capabilities of our recipes.

See also


Key benefits

  • Deep Learning solutions from Kaggle Masters and Google Developer Experts
  • Get to grips with the fundamentals including variables, matrices, and data sources
  • Learn advanced techniques to make your algorithms faster and more accurate

Description

The independent recipes in Machine Learning Using TensorFlow Cookbook will teach you how to perform complex data computations and gain valuable insights into your data. Dive into recipes on training models, model evaluation, sentiment analysis, regression analysis, artificial neural networks, and deep learning - each using Google’s machine learning library, TensorFlow. This cookbook covers the fundamentals of the TensorFlow library, including variables, matrices, and various data sources. You’ll discover real-world implementations of Keras and TensorFlow and learn how to use estimators to train linear models and boosted trees, both for classification and regression. Explore the practical applications of a variety of deep learning architectures, such as recurrent neural networks and Transformers, and see how they can be used to solve computer vision and natural language processing (NLP) problems. With the help of this book, you will be proficient in using TensorFlow, understand deep learning from the basics, and be able to implement machine learning algorithms in real-world scenarios.

Who is this book for?

If you are a data scientist or a machine learning engineer, and you want to skip detailed theoretical explanations in favor of building production-ready machine learning models using TensorFlow, this book is for you. Basic familiarity with Python, linear algebra, statistics, and machine learning is necessary to make the most out of this book.

What you will learn

  • Take TensorFlow into production
  • Implement and fine-tune Transformer models for various NLP tasks
  • Apply reinforcement learning algorithms using the TF-Agents framework
  • Understand linear regression techniques and use Estimators to train linear models
  • Execute neural networks and improve predictions on tabular data
  • Master convolutional neural networks and recurrent neural networks through practical recipes

Product Details

Publication date : Feb 08, 2021
Length: 416 pages
Edition : 1st
Language : English
ISBN-13 : 9781800208865

Table of Contents

14 Chapters
Getting Started with TensorFlow 2.x
The TensorFlow Way
Keras
Linear Regression
Boosted Trees
Neural Networks
Predicting with Tabular Data
Convolutional Neural Networks
Recurrent Neural Networks
Transformers
Reinforcement Learning with TensorFlow and TF-Agents
Taking TensorFlow to Production
Other Books You May Enjoy
Index

Customer reviews

Rating: 4.9 (16 Ratings)
5 star 93.8%
4 star 6.3%
3 star 0%
2 star 0%
1 star 0%

Marleen Mar 06, 2021
Rating: 5/5
The first three chapters of this book provide an introduction to TensorFlow 2 (TF) in general as well as some TF specific operations/syntax which is very handy to know! And the third chapter on the introduction is related to Keras. Then the book continues with a chapter on Linear Regression. I got a bit confused here because this chapters also discusses Logistic Regressions and others, which are all known as linear models, I just didn't expect to find logistic regression in here - a pleasant surprise! The chapter on Boosted Trees was missing structure I find. This one, the Transformers chapter and the RL chapter are definitely for readers that know how those models work and are just looking for implementation advise using TF. Then finally, my favourite chapters: various Neural Networks and the Deploying models to PRD. Those pages were filled with very well designed examples and I enjoyed going through them a lot! Therefore overall, this book will definitely help you understand how to use TensorFlow, but if you expect to learn the models used as well, you might need an additional book with more theory on e.g. Reinforcement Learning. The examples accompanying the book are all very straight forward and easy to follow along. This book is definitely a buy if you buy it for the purpose of learning TF!! (Site-note: I wish for other RL examples in books that have chapters on RL. It seems like many use the same.)
Amazon verified review
Samuel de Zoete Jul 20, 2021
Rating: 5/5
"Getting Ready! How to do it...! How it works...!" is the structure of the recipes in the book and it works for me. The explanations are short and straight to the point, I personally love when a learning book is concise. Plenty of resource suggestions if deep dive is necessary or brushing up on certain knowledge, e.g. Matrix Computations. I would suggest, download the code from GitHub and use Google Colab with the book. I read the book from my iPad and code on my laptop, it's perfect for me this way. When reading the book it provided me a good reference and framework how to setup, design and use TensorFlow. I usually use the Keras Interface, beyond that, it seems all a bit 'difficult', but this book gives me the confidence that directly using TF is now an option when it's needed.
Amazon verified review
Vicky Apr 29, 2021
Rating: 5/5
The book is an extraordinary asset for those keen on applying complex data computations using TensorFlow. It is somewhat light on detailed theoretical explanations but provides resourceful insights for building production-ready modules in real-world scenarios. I would recommend it if you are either already familiar with basic ML theory and the math background or if you have a software engineering background and are simply looking to implement an ML solution with Tensorflow.
Amazon verified review
Mihai Maruseac Apr 23, 2021
Rating: 5/5
I really enjoyed reading this book. I tried to judge it based on how someone new to the Machine Learning field would feel when reading it as well as on how someone with deep expertise would. I am happy to report that I think both types of readers would enjoy reading the book. The book is a collection of recipes on how to solve various ML tasks using TF. It uses TF 2.x, the more modern version of TensorFlow but still has mentions for relevant TF 1.x features, as they are needed in some places of the prose. Each recipe starts from the first line of code and finishes with a fully working example, including output documentation. There are also explanations for the output and sometimes even suggestions for different avenues of experimentation. The first few chapters are very introductory. A reader would learn how to use raw TF API to create an ML model, then how to become better and more efficient by using higher level APIs, such as Keras and tf.data (and Estimators but these are TF 1.x features, so the corresponding chapter should be read with a view on the past history of TF). The last part of the book focuses on solutions for different ML tasks. Image processing, NLP, reinforcement learning, are all topics that are touched and presented at a reasonable level.
Amazon verified review
JOSET Mar 20, 2021
5 out of 5 stars
Really interesting and relevant book for the use of TensorFlow! Well done to the writers and Alexia!
Amazon Verified review Amazon

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there you will see the ‘cancel subscription’ button in the grey box containing your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle, which is a month starting from the day of subscription payment. You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage, subscription.packtpub.com, by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often-changing code base of new versions and new technologies as up to date as possible. Unfortunately, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can make it at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and the more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way for us to get our content to you more quickly, but the method of buying an Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: new versions, new frameworks, new techniques. This feature gives you a head start on our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready. We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.