Mastering Machine Learning with R, Second Edition

Advanced prediction, algorithms, and learning methods with R 3.x

Product type Paperback
Published in Apr 2017
Publisher Packt
ISBN-13 9781787287471
Length 420 pages
Edition 2nd Edition
Author (1):
Cory Lesmeister
Table of Contents (17 chapters)

Preface
1. A Process for Success
2. Linear Regression - The Blocking and Tackling of Machine Learning
3. Logistic Regression and Discriminant Analysis
4. Advanced Feature Selection in Linear Models
5. More Classification Techniques - K-Nearest Neighbors and Support Vector Machines
6. Classification and Regression Trees
7. Neural Networks and Deep Learning
8. Cluster Analysis
9. Principal Components Analysis
10. Market Basket Analysis, Recommendation Engines, and Sequential Analysis
11. Creating Ensembles and Multiclass Classification
12. Time Series and Causality
13. Text Mining
14. R on the Cloud
15. R Fundamentals
16. Sources

Preface

"A man deserves a second chance, but keep an eye on him"
    - John Wayne

It is not often in life that you get a second chance. I remember that only days after we stopped editing the first edition, I kept asking myself, "Why didn't I...?", or "What the heck was I thinking saying it like that?", and on and on. In fact, the first project I started working on after it was published had nothing to do with any of the methods in the first edition. I made a mental note that, if given the chance, the methods from that project would go into a second edition.

When I started with the first edition, my goal was to create something different, maybe even create a work that was a pleasure to read, given the constraints of the topic. After all the feedback I received, I think I hit the mark. However, there is always room for improvement, and if you try to be everything to all people, you become nothing to everybody. I'm reminded of one of my favorite Frederick the Great quotes, "He who defends everything, defends nothing". So, I've tried to provide enough of the skills and tools, but not all of them, to get a reader up and running with R and machine learning as quickly and painlessly as possible. I think I've added some interesting new techniques that build on what was in the first edition. There will probably always be detractors who complain that it does not offer enough math or does not do this, that, or the other thing, but my answer is that those books already exist! Why duplicate what was already done, and very well, for that matter? Again, I have sought to provide something different, something that would keep the reader's attention and allow them to succeed in this competitive field.

Before I provide a list of the changes/improvements incorporated into the second edition, chapter by chapter, let me explain some universal changes. First of all, I have surrendered in my effort to fight the usage of the assignment operator <- versus just using =. As I shared more and more code with others, I realized I was out on my own using = and not <-. The first thing I did when under contract for the second edition was go line by line in the code and change it. The more important part, perhaps, was to clean and standardize the code. This is also important when you have to share code with coworkers and, dare I say, regulators. The most recent versions of RStudio facilitate this standardization. What sort of standards? Well, the first thing is to properly space the code. For instance, in the past I would not have hesitated to write c(1,2,3,4,5,6). Not anymore! Now, I write c(1, 2, 3, 4, 5, 6), with a space after each comma, which makes it easier to read. If you want other ideas, please have a look at Google's R style guide, https://google.github.io/styleguide/Rguide.xml/. I also received a number of e-mails saying that the data I scraped off the Web wasn't available. The National Hockey League decided to launch a completely new version of their statistics, so I had to start from scratch. Problems such as that led me to put data on GitHub.
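
As a minimal sketch of the two conventions described above, the following R snippet writes the same vector in the older style and in the standardized style. The variable names x and y are illustrative assumptions and do not come from the book's code.

```r
# Older style: '=' for assignment and no space after commas
x=c(1,2,3,4,5,6)

# Standardized style: '<-' for assignment and one space after each comma
y <- c(1, 2, 3, 4, 5, 6)

# Both lines build the same numeric vector; only readability differs
same_vector <- identical(x, y)
print(same_vector)
```

Both forms are valid R, so the change is purely one of convention: the second form is simply easier to scan when code is shared with others.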

All in all, I put forth a rather large effort to put the best possible tool in your hands to get you going. On another note, in February 2017, these comments from entrepreneur Mark Cuban drew much attention on the Web:

  • "Artificial Intelligence, deep learning, machine learning--whatever you’re doing if you don’t understand it--learn it. Because otherwise you’re going to be a dinosaur within 3 years."
  • "I personally think there's going to be a greater demand in 10 years for liberal arts majors than there were for programming majors and maybe even engineering, because when the data is all being spit out for you, options are being spit out for you, you need a different perspective in order to have a different view of the data. And so is having someone who is more of a freer thinker."

Besides the fact that these comments created a bit of a stir in the blogosphere, they also seem to be, at first glance, mutually exclusive. But think about what he is saying here. I think he gets to the core of why I felt compelled to write this book. Here is what I believe: machine learning needs to be embraced and utilized, to some extent, by the masses: the tired, the poor, the hungry, the proletariat, and the bourgeoisie. More and more availability of computational power and information will make machine learning something for virtually everyone. However, the flip side of that, and what, in my mind, has been and will continue to be a problem, is the communication of results. What are you going to do when you describe true positive rate and false positive rate and receive blank stares? How do you quickly tell a story that enlightens your audience? If you think it can't happen, please drop me a note; I'd be more than happy to share my story.

We must have people who can lead these efforts and influence their organization. If a degree in history or music appreciation helps in that endeavor, then so be it. I study history every day, and it has helped me tremendously. Cuban's comments have reinforced my belief that in many ways, the first chapter is the most important in this book. If you are not asking your business partners "what they plan to do differently", you'd better start tomorrow. There are far too many people working far too hard to complete an analysis that is completely irrelevant to the organization and its decisions.
