Chapter 10: Scaling Up Your Machine Learning Workflow
In this chapter, you will learn techniques and patterns for scaling your machine learning (ML) workflow along several dimensions. We will use a Databricks managed environment to scale your MLflow development capabilities and add Apache Spark for cases where you have larger datasets. We will then explore NVIDIA RAPIDS with graphics processing unit (GPU) support and the Ray distributed framework to accelerate your ML workloads. The chapter takes the form of small proofs of concept that use a defined canonical dataset to demonstrate each technique and its toolchain.
Specifically, we will cover the following sections in this chapter:
- Developing models with a Databricks Community Edition environment
- Integrating MLflow with Apache Spark
- Integrating MLflow with NVIDIA RAPIDS (GPU)
- Integrating MLflow with the Ray platform
This chapter will require researching the appropriate...