What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Mastering Machine Learning with R

Chapter 1. A Process for Success

	"If you don't know where you are going, any road will get you there."
	--Robert Carrol

	"If you can't describe what you are doing as a process, you don't know what you're doing."
	--W. Edwards Deming

At first glance, this chapter may seem to have nothing to do with machine learning, but it has everything to do with machine learning and specifically, its implementation and making the changes happen. The smartest people, best software, and best algorithm do not guarantee success, no matter how it is defined.

In most—if not all—projects, the key to successfully solving problems or improving decision-making is not the algorithm, but the soft, more qualitative skills of communication and influence. The problem many of us have with this is that it is hard to quantify how effective one is around these skillsets. It is probably safe to say that many of us ended up in this position because of a desire to avoid it. After all, the highly successful TV comedy The Big Bang Theory was built on this premise. Therefore, this chapter is to set you up for success. The intent is to provide a process, a flexible process no less, where you can become a Change Agent: a person who can influence and turn their insights into action without positional power. We will focus on Cross-Industry Standard Process for Data Mining (CRISP-DM). It is probably the most well-known and respected of any processes for analytical projects. Even if you use another industry process or something proprietary, there should still be a few gems in this chapter that you can take away.

I will not hesitate to say that this all is easier said than done, and without question, I'm guilty of every sin by both commission and omission that will be discussed in this chapter. With skill and some luck, you can avoid the many physical and emotional scars I've picked up over the last 10 and a half years.

Finally, we will also have a look at a flow chart (a cheat sheet) that you can use to help you identify what methodology to apply to the problem at hand.

Business understanding

One cannot underestimate how important this first step of the process is in achieving success. It is the foundational step and failure or success here will likely determine failure or success for the rest of the project. The purpose of this step is to identify the requirements of the business so that you can translate them into analytical objectives. It has the following four tasks:

Identify the business objective
Assess the situation
Determine the analytical goals
Produce a project plan

Identify the business objective

The key to this task is to identify the goals of the organization and frame the problem. An effective question to ask is, what are we going to do different? This may seem like a benign question, but it can really challenge people to ponder what they need from an analytical perspective and it can get to the root of the decision that needs to be made. It can also prevent you from going out and doing a lot of unnecessary work on some fishing expedition. As such, the key for you is to identify the decision. A working definition of a decision can be put forward to the team as the irrevocable choice to commit or not commit the resources. Additionally, remember that the choice to do nothing different is indeed a decision.

This does not mean that a project should not be launched if the choices are not absolutely clear. There will be times when the problem is not or cannot be well-defined; to paraphrase former Defense Secretary Donald Rumsfeld, there are known – unknowns. Indeed, there will probably be many times when the problem is ill-defined and the project's main goal is to further the understanding of the problem and generate hypotheses; again calling on Secretary Rumsfeld, unknown – unknowns, which means that you don't know what you don't know. However, in ill-defined problems, one should go forward with an understanding of what will happen next in terms of resource commitment based on the various outcomes of hypothesis exploration.

Another thing to consider in this task is to manage expectations. There is no such thing as a perfect data, no matter what its depth and breadth is. This is not the time to make guarantees but to communicate what is possible, given your expertise.

I recommend a couple of outputs from this task. The first is a mission statement. This is not the touchy-feely mission statement of an organization, but it is your mission statement or, more importantly, the mission statement approved by the project sponsor. I stole this idea from my years of military experience and I could write volumes on why it is effective, but that is for another day. Let's just say that in the absence of clear direction or guidance, the mission statement or whatever you want to call it becomes the unifying statement and can help prevent scope creep. It consists of the following points:

Who: This is yourself or the team or project name; everyone likes a cool project name, for example, Project Viper, Project Fusion, and so on
What: This is the task that you will perform, for example, conduct machine learning
When: This is the deadline
Where: This could be geographical; by function, department, initiative, and so on
Why: This is the purpose of doing the project, that is, the business goal

The second task is to have as clear a definition of success as possible. Literally, ask what does success look like? Help the team/sponsor paint a picture of success that you can understand. Your job then is to translate this into modeling requirements.

Assess the situation

This task helps you in project planning by gathering information on the resources available, constraints, and assumptions, identifying the risks, and building contingency plans. I would further add that this is also the time to identify the key stakeholders that will be impacted by the decisions to be made.

A couple of points here. When examining the resources that are available, do not neglect to scour the records of the past and current projects. Odds are someone in the organization has or is working on the same problem and it may be essential to synchronize your work with theirs. Don't forget to enumerate the risks considering time, people, and money. Do everything in your power to create a list of the stakeholders, both those that impact your project and those that could be impacted by your project. Identify who these people are and how they can influence/be impacted by the decision. Once this is done, work with the project sponsor to formulate a communication plan with these stakeholders.

Determine the analytical goals

Here, you are looking to translate the business goal into technical requirements. This includes turning the success criterion from the task of creating a business objective to technical success. This might be things such as RMSE or a level of predictive accuracy.

Produce a project plan

The task here is to build an effective project plan with all the information gathered up to this point. Regardless of what technique you use, whether it be a Gantt chart or some other graphic, produce it and make it a part of your communication plan. Make this plan widely available to the stakeholders and update it on a regular basis and as circumstances dictate.

Description

Machine learning is a field of Artificial Intelligence to build systems that learn from data. Given the growing prominence of R—a cross-platform, zero-cost statistical programming environment—there has never been a better time to start applying machine learning to your data. The book starts with introduction to Cross-Industry Standard Process for Data Mining. It takes you through Multivariate Regression in detail. Moving on, you will also address Classification and Regression trees. You will learn a couple of “Unsupervised techniques”. Finally, the book will walk you through text analysis and time series. The book will deliver practical and real-world solutions to problems and variety of tasks such as complex recommendation systems. By the end of this book, you will gain expertise in performing R machine learning and will be able to build complex ML projects using R and its packages.

What you will learn

Gain deep insights to learn the applications of machine learning tools to the industry

Manipulate data in R efficiently to prepare it for analysis

Master the skill of recognizing techniques for effective visualization of data

Understand why and how to create test and training data sets for analysis

Familiarize yourself with fundamental learning methods such as linear and logistic regression

Comprehend advanced learning methods such as support vector machines

Realize why and how to apply unsupervised learning methods

What do you get with eBook?

Instant access to your Digital eBook purchase

Download this book in EPUB and PDF formats

Access this title in our online reader with advanced features

DRM FREE - Read whenever, wherever and however you want

Frequently bought together

$54.99

$54.99

$38.99

Total $ 148.97

Filter reviews by

All

Amazon verified reviews

Fabien Deneuville Aug 04, 2016

J'ai apprécié ce livre. Il est globalement très bien, fait, donne de multiples exemples. Il s'adresse à qui connaît déjà bien R et a de solides bases en analytics. Il permettra d'aller plus loin sur le machine learning, les différents types d'algorithmes, les techniques existantes... J'ai appris des choses avec ce livre.

Amazon Verified review

HDFS_Python Jun 11, 2016

Overall, I think the book was good and I enjoyed reading it, for a statistics book this is a praise. The following pros will seem lacking to the cons but believe me that it is because the book was overall good and any compliment hits nearly all chapters in the book. When I did see a con, I expanded on it to give full insight into the issue. As in any endeavor of this sort, it is always a challenge to find the right balance between theory and application.Pros:The book contains companion code. This means a student can save the code for the future, load it in when necessary, and alter the code to learn from it. In my honest opinion, this is the best option for me to study and learn a topic. Each chapter covers a different over-arching problem, which is gradually solved when new-techniques and strategies are introduced then implemented to solidify the knowledge with use. Allowing the reader to see what scenarios the technique surrounds and how it is run. The book covers a wide variety of topics, allowing a student to become a jack-of-all-trades, in the use of machine learning and advanced statistical techniques in R.Cons:I believe this book is suited well for someone with a mathematical and programming background. Without either, the book would seem challenging and daunting in some areas (i.e. Neural Networks). The book would not be impossible for someone without knowledge in R to read it, but it would be advised that the person knows passing knowledge of the software before they begin this book.Lack of mathematical theory. In a few areas, the book shows how to use the topic to reach the end but does not include the deep mathematical background into how the calculation are run. It has a chance of creating a black-box scenario where someone knows how it works on the outside without a clue of how it is run on the inside. In my opinion, this isn’t always necessary knowing how to calculate acf, pacf, and eacf by hand is nice but doesn’t help when running acf(model). Side note: no reasonable person would calculate acf past five lag or pacf by hand.Overview for all subjects. The way the book was made for ease in learning makes brings up a small problem. Some challenging data sets may exceed the scope of the books training material and could lead to the reader being ill prepared. An example of this problem would be if a time series problem contains innovative or additive outlier. This means the student may receive a model with the lowest AIC value, but the formula may not be the most optimized format. For this case, a student should know when a problem is showing intriguing characteristics and should being a research process into how to confront these problems, through other reading material, internet, or professional network.

meitzmann Oct 20, 2016

Mastering Machine Learning in R provides a great introduction to machine learning and data analysis techniques. It is refreshing to read a statistics/data focused book that is written in an accessible manner by someone with good communication skills.The concepts are laid out in a logical format that includes Data Preparation and Business Cases, two things that are often left out of many similar texts. The author goes into detail on concepts that need it and avoids it on concepts that don’t while still providing enough resources for the reader.The code that comes with the book makes it a great resource for students or someone who is looking to teach themselves. Overall I highly recommend this text.

Mugdha Hota Oct 18, 2018

Nick P Jan 31, 2018

Great introductory book on the subject and no need for me to fumble through other books.

Mastering Machine Learning with R: Master machine learning techniques with R to deliver insights for complex projects

What do you get with eBook?

Mastering Machine Learning with R

Chapter 1. A Process for Success

The process

Business understanding

Identify the business objective

Assess the situation

Determine the analytical goals

Produce a project plan

Data understanding

Data preparation

Modeling

Evaluation

Deployment

Algorithm flowchart

Summary

Page 1 of 10

Description

What you will learn

Product Details

What do you get with eBook?

Product Details

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs

Mastering Machine Learning with R: Master machine learning techniques with R to deliver insights for complex projects

What do you get with eBook?

Contact Details

Billing Address

Description

What you will learn

Product Details

What do you get with eBook?

Contact Details

Billing Address

Product Details

Packt Subscriptions

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs