You're reading from Mastering Machine Learning with scikit-learn: Apply effective learning algorithms to real-world problems using scikit-learn

Product type Paperback

Published in Jul 2017

Publisher

ISBN-13 9781788299879

Length 254 pages

Edition 2nd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Author (1):

Gavin Hackeling

Table of Contents (15) Chapters

Preface

1. The Fundamentals of Machine Learning

2. Simple Linear Regression

3. Classification and Regression with k-Nearest Neighbors

4. Feature Extraction

5. From Simple Linear Regression to Multiple Linear Regression

6. From Linear Regression to Logistic Regression

7. Naive Bayes

8. Nonlinear Classification and Regression with Decision Trees

9. From Decision Trees to Random Forests and Other Ensemble Methods

10. The Perceptron

11. From the Perceptron to Support Vector Machines

12. From the Perceptron to Artificial Neural Networks

13. K-means

14. Dimensionality Reduction with Principal Component Analysis

Extracting features from categorical variables

Many problems have explanatory variables that are categorical or nominal. A categorical variable can take one of a fixed set of values. For example, an application that predicts the salary for a job might use categorical variables such as the city in which the position is located. Categorical variables are commonly encoded using one-of-k encoding, or one-hot encoding, in which the explanatory variable is represented using one binary feature for each of its possible values.

For example, let's assume our model has a city variable that can take one of three values: New York, San Francisco, or Chapel Hill. One-hot encoding represents the variable using one binary feature for each of the three possible cities. scikit-learn's DictVectorizer class is a transformer that can be used to one-hot encode categorical features:

# In[1...
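The original listing is cut off in this excerpt. As a stand-in, here is a minimal sketch of how DictVectorizer might be used for the city example described above. The variable names and the three sample instances are assumptions made for illustration and are not taken from the book's listing.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical sample data: one dict per instance, keyed by feature name.
instances = [
    {'city': 'New York'},
    {'city': 'San Francisco'},
    {'city': 'Chapel Hill'},
]

# sparse=False returns a dense NumPy array instead of a SciPy sparse matrix,
# which is easier to read for a small example.
vectorizer = DictVectorizer(sparse=False)
encoded = vectorizer.fit_transform(instances)

# Each string value becomes its own binary feature named "<key>=<value>".
print(vectorizer.feature_names_)
print(encoded)
```

Each row of the result contains a single 1 in the column for that instance's city and 0 elsewhere. By default DictVectorizer sorts the feature names, so the column order follows the alphabetical order of the city names rather than the order in which they appear in the data.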

Authors (1)

Gavin Hackeling

Gavin Hackeling develops machine learning services for large-scale document and image classification at an advertising network in New York. He received his Master's degree from New York University's Interactive Telecommunications Program and his Bachelor's degree from the University of North Carolina.
