Chapter 1. Getting Ready with Predictive Analytics
Analytics, predictive analytics, and data visualization are trendy topics today. The reasons are:
- Today a lot of internal and external data is available
- Technology to use this data has evolved a lot
- It is commonly accepted that there is a lot of value that can be extracted from data
As with many trendy topics, there is a lot of confusion around them. In this chapter, we will cover the following:
- Introducing the key concepts of the book and tools we're going to use
- Defining analytics, predictive analytics, and data visualization
- Explaining the purpose of this book and the methodology we'll follow
- Covering the installation of the environment we'll use to create examples of applications in each chapter
After this chapter, we'll learn how to use our data to make predictions that add value to our organizations. Before starting a data project, you always need to understand how the project will add value to the organization. In an analytics project, the two main sources of value are cost reduction and revenue increase. For example, when you're working on a fraud detection project, your objective is to reduce fraud; this leads to a cost reduction that improves the organization's margin. Finally, to understand the value of your data solution, you also need to evaluate its cost. The real value added to an organization is the difference between the value provided and the total cost of the solution.
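To make this difference concrete, here is a minimal Python sketch of the calculation. The function name net_value and the fraud detection figures are hypothetical and chosen only for illustration; they do not come from a real project.

```python
def net_value(value_provided: float, total_cost: float) -> float:
    """Return the real value added: value provided minus total cost."""
    return value_provided - total_cost


# Hypothetical figures for a fraud detection project (illustration only).
fraud_losses_avoided = 500_000.0  # assumed yearly cost reduction
solution_cost = 120_000.0         # assumed total cost of the solution

print(net_value(fraud_losses_avoided, solution_cost))
```

If the result is negative, the solution costs more than the value it provides, even if its predictions are accurate.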
Working with data to create predictive solutions sounds very glamorous, but first we'll learn how to use Rattle to load data, avoid some problems related to data quality, and explore the data. Rattle is a tool for statisticians, and sometimes we need a tool that gives us a business approach to data exploration. We'll learn how to use Qlik Sense Desktop to do this.
After learning how to explore and understand data, we'll learn how to create predictive systems. We'll divide these systems into unsupervised learning methods and supervised learning methods. We'll explain the difference later in this book.
To achieve a better understanding, in this book we'll create three different solutions using the most common predictive techniques: Clustering, Decision Trees, and Linear Regression.
To present data to the user, we need to create an application that helps the user understand the data and make decisions; for this reason, we'll look at the basics of data visualization. Data visualization, predictive analytics, and most of the other topics of this book are huge knowledge areas. In this book, we'll introduce you to these topics, and at the end of each chapter you will find a section called Further learning with references to continue learning.