Chapter 5: Advanced Model Building – Part I
In this chapter, we begin the transition from basic to advanced model building by introducing the nuanced issues and choices that a data scientist weighs when building enterprise-grade models. We will discuss data splitting options, compare modeling algorithms, present a two-stage grid-search strategy for hyperparameter optimization, introduce H2O AutoML for automatically fitting multiple algorithms to data, and further investigate feature engineering tactics that extract as much information as possible from the data. We will also introduce H2O Flow, a menu-based UI included with H2O that is useful for monitoring the health of the H2O cluster and that enables interactive data and model investigations.
Throughout the chapter, we will illustrate these advanced model-building concepts using the Lending Club problem that was introduced in Chapter 3, Fundamental Workflow – Data to Deployable Model. By the...