Distributed training on PyTorch
This section introduces the basic workflow for implementing distributed training on PyTorch, as well as the components used in this process.
Basic workflow
Generally speaking, the basic workflow for implementing distributed training on PyTorch comprises the steps illustrated in Figure 8.14:
Figure 8.14 – Basic workflow to implement distributed training in PyTorch
Let’s look at each step in more detail.
Note
The complete code shown in this section is available at https://github.com/PacktPublishing/Accelerate-Model-Training-with-PyTorch-2.X/blob/main/code/chapter08/pytorch_ddp.py.
Initialize and destroy the communication group
The communication group is the logical entity that PyTorch uses to define and control the distributed environment. Therefore, the first step in coding distributed training is to initialize a communication group. This step is performed by instantiating an object...
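For reference, the following is a minimal sketch of this step, assuming a single-node setup that uses environment-variable rendezvous; the setup and cleanup function names and the MASTER_ADDR/MASTER_PORT values are illustrative, and the complete script linked in the preceding note shows the actual implementation used in this chapter:

import os
import torch.distributed as dist

def setup(rank, world_size):
    # Rendezvous information read by init_process_group when the default
    # "env://" initialization method is used (illustrative values).
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "12355")
    # Initialize the communication group; "gloo" works on CPU,
    # while "nccl" is the usual choice for multi-GPU training.
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)

def cleanup():
    # Destroy the communication group once training has finished.
    dist.destroy_process_group()

Each process in the distributed environment calls setup with its own rank and the total number of processes (world_size), and calls cleanup after training completes.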