Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Mastering Machine Learning on AWS

You're reading from   Mastering Machine Learning on AWS Advanced machine learning in Python using SageMaker, Apache Spark, and TensorFlow

Arrow left icon
Product type Paperback
Published in May 2019
Publisher Packt
ISBN-13 9781789349795
Length 306 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Maximo Gurmendez Maximo Gurmendez
Author Profile Icon Maximo Gurmendez
Maximo Gurmendez
Dr. Saket S.R. Mengle Dr. Saket S.R. Mengle
Author Profile Icon Dr. Saket S.R. Mengle
Dr. Saket S.R. Mengle
Arrow right icon
View More author details
Toc

Table of Contents (24) Chapters Close

Preface 1. Section 1: Machine Learning on AWS FREE CHAPTER
2. Getting Started with Machine Learning for AWS 3. Section 2: Implementing Machine Learning Algorithms at Scale on AWS
4. Classifying Twitter Feeds with Naive Bayes 5. Predicting House Value with Regression Algorithms 6. Predicting User Behavior with Tree-Based Methods 7. Customer Segmentation Using Clustering Algorithms 8. Analyzing Visitor Patterns to Make Recommendations 9. Section 3: Deep Learning
10. Implementing Deep Learning Algorithms 11. Implementing Deep Learning with TensorFlow on AWS 12. Image Classification and Detection with SageMaker 13. Section 4: Integrating Ready-Made AWS Machine Learning Services
14. Working with AWS Comprehend 15. Using AWS Rekognition 16. Building Conversational Interfaces Using AWS Lex 17. Section 5: Optimizing and Deploying Models through AWS
18. Creating Clusters on AWS 19. Optimizing Models in Spark and SageMaker 20. Tuning Clusters for Machine Learning 21. Deploying Models Built in AWS 22. Other Books You May Enjoy Appendix: Getting Started with AWS

What this book covers

Chapter 1, Getting Started with Machine Learning for AWS, introduces you to machine learning. It explains why it is necessary for data scientists to learn about machine learning and how AWS can help them to solve various real-world problems. We also discuss the AWS services and tools that we will be covered in the book.

Chapter 2, Classifying Twitter Feeds with Naive Bayes, introduces the basics of the Naive Bayes algorithm and presents a text classification problem that will be addressed by the use of this algorithm and language models. We'll provide examples explaining how to apply Naive Bayes using scikit-learn and Apache Spark on SageMaker's BlazingText. Additionally, we'll explore how to use the ideas behind Bayesian reasoning in more complex scenarios. We will use the Twitter API to stream tweets from two different political candidates and predict who wrote them. We will use scikit-learn, Apache Spark, SageMaker, and BlazingText.

Chapter 3, Predicting House Value with Regression Algorithms, introduces the basics of regression algorithms and applies them to predict the price of houses given a number of features. We'll also introduce how to use logistic regression for classification problems. Examples in SageMaker for scikit-learn and Apache Spark will be provided. We'll be using the Boston Housing Price dataset (https://www.kaggle.com/c/boston-housing/) along with scikit-learn, Apache Spark, and SageMaker.

Chapter 4, Predicting User Behavior with Tree-Based Methods, introduces decision trees, random forests, and gradient-boosted trees. We will explore how to use these algorithms to predict when users will click on ads. Additionally, we will explain how to use AWS EMR and Apache Spark to engineer models at a large scale. We will use the Adform click prediction dataset (https://doi.org/10.7910/DVN/TADBY7, Harvard Dataverse, V2). We will use the XGBoost, Apache Spark, SageMaker, and EMR libraries.

Chapter 5, Customer Segmentation Using Clustering Algorithms, introduces the main clustering algorithms by exploring how to apply them for customer segmentation based on consumer patterns. Through AWS SageMaker, we will show how to run these algorithms in skicit-learn and Apache Spark. We will use the e-commerce data from Fabien Daniel (https://www.kaggle.com/fabiendaniel/customer-segmentation/data) and scikit-learn, Apache Spark, and SageMaker.

Chapter 6, Analyzing Visitor Patterns to Make Recommendations, presents the problem of finding similar users based on their navigation patterns in order to recommend custom marketing strategies. Collaborative filtering and distance-based methods will be introduced with examples in scikit-learn and Apache Spark on AWS SageMaker. We will use Kwan Hui Lim's Theme Park Attraction Visits dataset (https://sites.google.com/site/limkwanhui/datacode), Apache Spark, and SageMaker.

Chapter 7, Implementing Deep Learning Algorithms, introduces you to the main concepts behind deep learning and explains why it has become so relevant in today's AI-powered products. The aim of this chapter is to not discuss the theoretical details of deep learning, but to explain the algorithms with examples and provide a high-level conceptual understanding of deep learning algorithms. This will give you a platform to understand what you will be implementing in the next chapters.

Chapter 8, Implementing Deep Learning with TensorFlow on AWS, goes through a series of practical image-recognition problems and explains how to address them with TensorFlow on AWS. TensorFlow is a very popular deep learning framework that can be used to train deep neural networks. This chapter will explain how TensorFlow can be installed and used to train deep learning models using toy datasets. In this chapter, we'll use the MNIST handwritten digits dataset (http://yann.lecun.com/exdb/mnist/), along with TensorFlow and SageMaker.

Chapter 9, Image Classification and Detection with SageMaker, revisits the image classification problem we dealt with in the previous chapters, but using SageMaker's image classification algorithm and object detection algorithm. We'll use the Caltech256 dataset (http://www.vision.caltech.edu/Image_Datasets/Caltech256/) and AWS Sagemaker.

Chapter 10, Working with AWS Comprehend, explains the functionality of an AWS tool called Comprehend, which is a natural language processing tool that performs various useful tasks.

Chapter 11, Using AWS Rekognition, explains how to use Rekognition, which is an image recognition tool that uses deep learning. You will learn an easy way of applying image recognition in your applications.

Chapter 12, Building Conversational Interfaces Using AWS Lex, explains that AWS Lex is a tool that allows programmers to build conversational interfaces. This chapter introduces you to topics such as natural language understanding using deep learning.

Chapter 13, Creating Clusters on AWS, addresses how one of the key problems in deep learning is understanding how to scale and parallelize learning on multiple machines. In this chapter, we'll examine different ways to create clusters of learners. In particular, we'll focus on how to parallelize deep learning pipelines through distributed TensorFlow and Apache Spark.

Chapter 14, Optimizing Models in Spark and SageMaker, explains that the models that are trained on AWS can be further optimized to run smoothly in production environments. In this section, we will discuss various tricks that you can use to improve the performance of your algorithms.

Chapter 15, Tuning Clusters for Machine Learning, explains that many data scientists and machine learning practitioners face the problem of scale when attempting to run machine learning data pipelines at scale. In this chapter, we focus primarily on EMR, which is a very powerful tool for running very large machine learning jobs. There are many ways to configure EMR, and not every setup works for every scenario. We will go through the main configurations of EMR and explain how each configuration works for different objectives. Additionally, we'll present other ways to run big data pipelines through AWS.

Chapter 16, Deploying Models Built on AWS, discusses deployment. At this point, you will have your model built on AWS and would like to ship it to production. There are a variety of different contexts in which models should be deployed. In some cases, it's as easy as generating a CSV of actions that would be fed to some system. Often, we just need to deploy a web service that's capable of making predictions. However, there are many times in which we need to deploy these models to complex, low-latency, or edge systems. We will go through the different ways you can deploy machine learning models to production in this chapter.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image