Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Machine Learning with R

You're reading from   Machine Learning with R R gives you access to the cutting-edge software you need to prepare data for machine learning. No previous knowledge required – this book will take you methodically through every stage of applying machine learning.

Arrow left icon
Product type Paperback
Published in Oct 2013
Publisher Packt
ISBN-13 9781782162148
Length 396 pages
Edition 1st Edition
Languages
Arrow right icon
Author (1):
Arrow left icon
Brett Lantz Brett Lantz
Author Profile Icon Brett Lantz
Brett Lantz
Arrow right icon
View More author details
Toc

Table of Contents (19) Chapters Close

Machine Learning with R
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
1. Introducing Machine Learning 2. Managing and Understanding Data FREE CHAPTER 3. Lazy Learning – Classification Using Nearest Neighbors 4. Probabilistic Learning – Classification Using Naive Bayes 5. Divide and Conquer – Classification Using Decision Trees and Rules 6. Forecasting Numeric Data – Regression Methods 7. Black Box Methods – Neural Networks and Support Vector Machines 8. Finding Patterns – Market Basket Analysis Using Association Rules 9. Finding Groups of Data – Clustering with k-means 10. Evaluating Model Performance 11. Improving Model Performance 12. Specialized Machine Learning Topics Index

Uses and abuses of machine learning


At its core, machine learning is primarily interested in making sense of complex data. This is a broadly applicable mission, and largely application agnostic. As you might expect, machine learning is used widely. For instance, it has been used to:

  • Predict the outcomes of elections

  • Identify and filter spam messages from e-mail

  • Foresee criminal activity

  • Automate traffic signals according to road conditions

  • Produce financial estimates of storms and natural disasters

  • Examine customer churn

  • Create auto-piloting planes and auto-driving cars

  • Identify individuals with the capacity to donate

  • Target advertising to specific types of consumers

For now, don't worry about exactly how the machines learn to perform these tasks; we will get into the specifics later. But across each of these contexts, the process is the same. A machine learning algorithm takes data and identifies patterns that can be used for action. In some cases, the results are so successful that they seem to reach near-legendary status.

One possibly apocryphal tale is of a large retailer in the United States, which employed machine learning to identify expectant mothers for targeted coupon mailings. If mothers-to-be were targeted with substantial discounts, the retailer hoped they would become loyal customers who would then continue to purchase profitable items like diapers, formula, and toys.

By applying machine learning methods to purchase data, the retailer believed it had learned some useful patterns. Certain items, such as prenatal vitamins, lotions, and washcloths could be used to identify with a high degree of certainty not only whether a woman was pregnant, but also when the baby was due.

After using this data for a promotional mailing, an angry man contacted the retailer and demanded to know why his teenage daughter was receiving coupons for maternity items. He was furious that the merchant seemed to be encouraging teenage pregnancy. Later on, as a manager called to offer an apology, it was the father that ultimately apologized; after confronting his daughter, he had discovered that she was indeed pregnant.

Whether completely true or not, there is certainly an element of truth to the preceding tale. Retailers, do in fact, routinely analyze their customers' transaction data. If you've ever used a shopper's loyalty card at your grocer, coffee shop, or another retailer, it is likely that your purchase data is being used for machine learning.

Retailers use machine learning methods for advertising, targeted promotions, inventory management, or the layout of the items in the store. Some retailers have even equipped checkout lanes with devices that print coupons for promotions based on the items in the current transaction. Websites also routinely do this to serve advertisements based on your web browsing history. Given the data from many individuals, a machine learning algorithm learns typical patterns of behavior that can then be used to make recommendations.

Despite being familiar with the machine learning methods working behind the scenes, it still feels a bit like magic when a retailer or website seems to know me better than I know myself. Others may be less thrilled to discover that their data is being used in this manner. Therefore, any person wishing to utilize machine learning or data mining would be remiss not to at least briefly consider the ethical implications of the art.

Ethical considerations

Due to the relative youth of machine learning as a discipline and the speed at which it is progressing, the associated legal issues and social norms are often quite uncertain and constantly in flux. Caution should be exercised when obtaining or analyzing data in order to avoid breaking laws, violating terms of service or data use agreements, abusing the trust, or violating privacy of the customers or the public.

Tip

The informal corporate motto of Google, an organization, which collects perhaps more data on individuals than any other, is "don't be evil." This may serve as a reasonable starting point for forming your own ethical guidelines, but it may not be sufficient.

Certain jurisdictions may prevent you from using racial, ethnic, religious, or other protected class data for business reasons, but keep in mind that excluding this data from your analysis may not be enough—machine learning algorithms might inadvertently learn this information independently. For instance, if a certain segment of people generally live in a certain region, buy a certain product, or otherwise behave in a way that uniquely identifies them as a group, some machine learning algorithms can infer the protected information from seemingly innocuous data. In such cases, you may need to fully "de-identify" these people by excluding any potentially identifying data in addition to the protected information.

Apart from the legal consequences, using data inappropriately may hurt your bottom line. Customers may feel uncomfortable or become spooked if aspects of their lives they consider private are made public. Recently, several high-profile web applications have experienced a mass exodus of users who felt exploited when the applications' terms of service agreements changed and their data was used for purposes beyond what the users had originally agreed upon. The fact that privacy expectations differ by context, by age cohort, and by locale, adds complexity to deciding the appropriate use of personal data. It would be wise to consider the cultural implications of your work before you begin on your project.

Tip

The fact that you can use data for a particular end does not always mean that you should.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image