You're reading from R Data Mining Implement data mining techniques through practical use cases and real-world datasets

Product type Paperback

Published in Nov 2017

Publisher Packt

ISBN-13 9781787124462

Length 442 pages

Edition 1st Edition

Tools

ggplot

Concepts

Data Mining

Author (1):

Andrea Cirillo

Table of Contents (16) Chapters

Preface

1. Why to Choose R for Your Data Mining and Where to Start

2. A First Primer on Data Mining - Analysing Your Bank Account Data

3. The Data Mining Process - CRISP-DM Methodology

4. Keeping the House Clean – The Data Mining Architecture

5. How to Address a Data Mining Problem – Data Cleaning and Validation

6. Looking into Your Data Eyes – Exploratory Data Analysis

7. Our First Guess – a Linear Regression

8. A Gentle Introduction to Model Performance Evaluation

9. Don't Give up – Power up Your Regression Including Multiple Variables

10. A Different Outlook to Problems with Classification Models

11. The Final Clash – Random Forests and Ensemble Learning

12. Looking for the Culprit – Text Data Mining with R

13. Sharing Your Stories with Your Stakeholders through R Markdown

14. Epilogue

15. Dealing with Dates, Relative Paths and Functions

Summary

Here I am again. You have just taken another major leap on your journey of machine learning discovery. If you took the time to absorb and practice what Andy just showed you, you should now have added two of the most widely used classification models to your toolbox: logistic regression and support vector machines.

Logistic regression predicts the probability that a given outcome occurs and estimates how much each explanatory variable contributes to that outcome. This makes the model quite useful when interpretability is one of the objectives of the analysis.
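To make the interpretability point concrete, here is a minimal sketch in base R. It assumes the built-in `mtcars` dataset, the predictors `wt` and `hp`, and a 0.5 classification cut-off; none of these come from the chapter, they are just convenient illustrative choices.

```r
# Illustrative data only (an assumption, not the chapter's dataset):
# predict whether a car has a manual gearbox (am = 1) from weight and horsepower.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Coefficients are estimated on the log-odds scale. Exponentiating them gives
# odds ratios: the multiplicative change in the odds of the outcome for a
# one-unit increase in each explanatory variable, holding the other fixed.
log_odds <- coef(fit)
odds_ratios <- exp(log_odds)
print(round(odds_ratios, 3))

# type = "response" returns predicted probabilities rather than log-odds.
probs <- predict(fit, type = "response")

# Convert probabilities to classes with an assumed 0.5 cut-off,
# then measure how often the fitted model agrees with the observed labels.
predicted_class <- ifelse(probs > 0.5, 1, 0)
accuracy <- mean(predicted_class == mtcars$am)
print(accuracy)
```

An odds ratio below 1 means the variable lowers the odds of the outcome, and one above 1 means it raises them. Being able to read each variable's contribution this way is what makes the model easy to explain.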

On the other hand, you have support vector machines, which are based on the concept of a hyperplane: a sort of blade, of different possible shapes, able to divide our population into two or more groups and, by that means, perform the desired classification task. This algorithm shows pretty high performance, especially with a non-linear hyperplane...