You're reading from Learn Amazon SageMaker A guide to building, training, and deploying machine learning models for developers and data scientists

Product type: Paperback
Published in: Aug 2020
Publisher: Packt
ISBN-13: 9781800208919
Length: 490 pages
Edition: 1st Edition
Languages: Python
Tools: AWS
Concepts: Machine Learning
Author (1): Julien Simon


Table of Contents (19) Chapters

Preface

1. Section 1: Introduction to Amazon SageMaker

2. Chapter 1: Introduction to Amazon SageMaker

3. Chapter 2: Handling Data Preparation Techniques

4. Section 2: Building and Training Models

5. Chapter 3: AutoML with Amazon SageMaker Autopilot

6. Chapter 4: Training Machine Learning Models

7. Chapter 5: Training Computer Vision Models

8. Chapter 6: Training Natural Language Processing Models

9. Chapter 7: Extending Machine Learning Services Using Built-In Frameworks

10. Chapter 8: Using Your Algorithms and Code

11. Section 3: Diving Deeper on Training

12. Chapter 9: Scaling Your Training Jobs

13. Chapter 10: Advanced Training Techniques

14. Section 4: Managing Models in Production

15. Chapter 11: Deploying Machine Learning Models

16. Chapter 12: Automating Machine Learning Workflows

17. Chapter 13: Optimizing Prediction Cost and Performance

18. Other Books You May Enjoy


Streaming datasets with pipe mode

By default, estimators copy the full dataset to the training instances before training starts; this is known as file mode. Pipe mode instead streams the dataset directly from S3. The feature takes its name from its use of Unix named pipes (also known as FIFOs): at the beginning of each epoch, one pipe is created per input channel.
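To make the named pipe mechanism concrete, here is a minimal local sketch that uses only the Python standard library and does not touch SageMaker or S3. A producer thread stands in for the service streaming data, and the reader stands in for the training code. The function names, the `train` channel name, and the `<channel>_<epoch>` file naming are illustrative assumptions, not something taken from the book.

```python
import os
import tempfile
import threading


def stream_epoch(pipe_path, records):
    """Producer: write records into the named pipe, then close it."""
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(pipe_path, "w") as pipe:
        for record in records:
            pipe.write(record + "\n")


def read_epoch(pipe_path):
    """Consumer: read records sequentially until the producer closes the pipe."""
    with open(pipe_path) as pipe:
        return [line.rstrip("\n") for line in pipe]


def run_epochs(records, channel="train", epochs=2):
    """Create one FIFO per epoch for a single channel and stream data through it."""
    seen = []
    with tempfile.TemporaryDirectory() as workdir:
        for epoch in range(epochs):
            # Hypothetical naming scheme: one pipe per (channel, epoch) pair.
            pipe_path = os.path.join(workdir, f"{channel}_{epoch}")
            os.mkfifo(pipe_path)  # Unix only
            producer = threading.Thread(
                target=stream_epoch, args=(pipe_path, records)
            )
            producer.start()
            seen.append(read_epoch(pipe_path))
            producer.join()
    return seen


if __name__ == "__main__":
    print(run_epochs(["row-1", "row-2", "row-3"]))
```

The data is never written to disk as a regular file: the reader consumes it sequentially as the producer writes it, which is why a fresh pipe is needed for every pass over the dataset.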

Pipe mode removes the need to copy any data to training instances, so training jobs start sooner. They generally run faster too, as pipe mode is highly optimized. Another benefit is that you don't have to provision any storage for the dataset on training instances.
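Switching between the two modes is a configuration choice rather than a code rewrite. As a hedged sketch, the helper below builds the `AlgorithmSpecification` fragment of a `CreateTrainingJob` request, where the `TrainingInputMode` field selects the mode. The helper name and the image URI are hypothetical placeholders, and the fragment is not a complete request. In the SageMaker Python SDK, the corresponding setting is the `input_mode` parameter of an estimator.

```python
def algorithm_specification(training_image, input_mode="Pipe"):
    """Build the AlgorithmSpecification fragment of a CreateTrainingJob request.

    Hypothetical helper for illustration only; a real request needs many
    more fields (role, input channels, output location, resources, ...).
    """
    # Deliberately restricted to the two modes discussed in this section.
    if input_mode not in {"File", "Pipe"}:
        raise ValueError(f"unsupported input mode: {input_mode!r}")
    return {
        "TrainingImage": training_image,
        "TrainingInputMode": input_mode,
    }


if __name__ == "__main__":
    # Placeholder image URI, not a real container.
    spec = algorithm_specification("<account>.dkr.ecr.<region>.amazonaws.com/<image>")
    print(spec["TrainingInputMode"])
```

Note that the algorithm or training script must also be able to read from a pipe; changing the setting alone is not enough for code that expects regular files.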

Cutting down on training time and storage means that you'll save money. The larger the dataset, the more you'll save. You can find benchmarks at https://aws.amazon.com/blogs/machine-learning/accelerate-model-training-using-faster-pipe-mode-on-amazon-sagemaker/.

In practice, you can start experimenting with pipe mode for datasets in the hundreds of...
