Part 3: Going Distributed
In this part, you will learn how to spread the training process across multiple devices and machines. You will start with the fundamental concepts of distributed training, and then distribute the training process across multiple CPUs in a single machine. After that, you will train the model using multiple GPUs in a single machine. Finally, you will distribute the training process among multiple devices located in multiple machines.
This part has the following chapters:
- Chapter 8, Distributed Training at a Glance
- Chapter 9, Training with Multiple CPUs
- Chapter 10, Training with Multiple GPUs
- Chapter 11, Training with Multiple Machines