You're reading from Ensemble Machine Learning Cookbook: Over 35 practical recipes to explore ensemble machine learning techniques using Python

Product type Paperback

Published in Jan 2019

Publisher Packt

ISBN-13 9781789136609

Length 336 pages

Edition 1st Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Authors (2):

Vijayalakshmi Natarajan

Dipayan Sarkar

Table of Contents (14) Chapters

Preface

1. Get Closer to Your Data

2. Getting Started with Ensemble Machine Learning

3. Resampling Methods

4. Statistical and Machine Learning Algorithms

5. Bag the Models with Bagging

6. When in Doubt, Use Random Forests

7. Boosting Model Performance with Boosting

8. Blend It with Stacking

9. Homogeneous Ensembles Using Keras

10. Heterogeneous Ensemble Classifiers Using H2O

11. Heterogeneous Ensemble for Text Classification Using NLP

12. Homogenous Ensemble for Multiclass Classification Using Keras

13. Other Books You May Enjoy

k-fold and leave-one-out cross-validation

Machine learning models often generalize poorly when they're applied to unseen data to make predictions. To guard against this problem, the model isn't trained on the complete dataset. Instead, the dataset is split into training and testing subsets. The model is trained on the training subset and evaluated on the testing subset, which it doesn't see during training. This is the fundamental idea behind cross-validation.
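As a minimal sketch of the train/test split described above, the following uses scikit-learn's `train_test_split`. The Iris dataset, the logistic regression model, the 70/30 split, and the `random_state` value are all assumptions chosen for illustration; they are not taken from the recipe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative dataset (an assumption; the recipe's own data is not shown here).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for testing; the model never sees them while fitting.
# stratify=y keeps the class proportions the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Illustrative model choice (also an assumption).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy on the training rows versus accuracy on the held-out rows.
train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

A large gap between the two printed scores would suggest that the model fits the training rows better than it generalizes to rows it has not seen.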

The simplest kind of cross-validation is the holdout method, which we saw in the previous recipe, Introduction to sampling. In the holdout method, when we split our data into training and testing subsets, the testing set may differ noticeably from the training set because of the high dimensionality of the data. This can lead to instability in the outcome...
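For context on the two techniques named in this recipe's title, here is a hedged sketch using scikit-learn's `KFold` and `LeaveOneOut` splitters with `cross_val_score`. In k-fold cross-validation the data is divided into k folds and each fold serves once as the testing set; leave-one-out is the special case where every fold holds a single observation. The dataset, model, number of folds, and `random_state` below are assumptions for illustration, and the recipe's own code is not reproduced here.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

# Illustrative dataset and model (assumptions, not the recipe's own choices).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# k-fold: split the rows into 5 folds; each fold is the testing set exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
kfold_scores = cross_val_score(model, X, y, cv=kfold)
print(f"5-fold scores: {kfold_scores}")
print(f"5-fold mean accuracy: {kfold_scores.mean():.3f}")

# Leave-one-out: one observation is held out per iteration, so the model is
# fitted once per row and each score is either 0.0 or 1.0.
loo = LeaveOneOut()
loo_scores = cross_val_score(model, X, y, cv=loo)
print(f"leave-one-out fits: {len(loo_scores)}")
print(f"leave-one-out mean accuracy: {loo_scores.mean():.3f}")
```

Averaging over several testing folds makes the estimate depend less on any single split than the holdout method does. Leave-one-out needs as many model fits as there are rows, so it is usually practical only for small datasets.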

Authors (2)

Sarkar

Aurobindo Sarkar leads a team of data scientists and engineers at Session AI, developing cloud-based ML models for in-session marketing in e-commerce and retail. As a former CTO at multiple SaaS startups, he has architected secure, scalable, and highly available AWS cloud applications. His research interests now focus on AWS-based large-scale transformer models for NLP and HFT models for the futures and options market. Aurobindo holds a bachelor's degree in engineering from IIT Delhi, a master's in management from the Indian Institute of Science Bangalore, and a master's in computer science from New York University.

Natarajan

Vijayalakshmi Natarajan holds an ME in Computer Science and has 4 years of industry experience. She is a data science enthusiast and a passionate trainer in the fields of data science and data visualization. She takes a keen interest in diving deep into machine learning techniques. Her specialization includes machine learning techniques in the field of image processing.
