You're reading from Machine Learning for Data Mining Improve your data mining capabilities with advanced predictive modeling

Product type Paperback

Published in Apr 2019

Publisher Packt

ISBN-13 9781838828974

Length 252 pages

Edition 1st Edition

Languages

Python

Tools

Combine

Concepts

Data Mining

Author (1):

Jesus Salcedo


Working with neural networks

Neural networks were initially developed in an attempt to understand how the brain operates. They were originally used in the areas of neuroscience and linguistics.

In these fields, researchers noticed that something happened in the environment (input), the individual processed the information (in the brain), and then reacted in some way (output).

So, the idea behind neural networks, or neural nets, is that the model stands in for the brain and behaves like a black box: we can observe what goes in and what comes out, but we then have to figure out what is going on inside so that the findings can be applied.

Advantages of neural networks

The following are the advantages of using a neural network:

Good for many types of problems: They work well with most of the complex problems that you might come across.
They generalize very well: Accurate generalization is a very important feature.
They are very common: Neural networks have become very common in today's world, and they are readily accepted and implemented for real-world problems.
A lot is known about them: Owing to the popularity that neural networks have gained, there is a lot of research being done and implemented successfully in different areas, so there is a lot of information available on neural networks.
Works well with non-clustered data: Neural networks suit non-clustered data in several situations, such as when the data itself is very complex, when there are many interactions, or when the relationships are nonlinear. They are powerful and robust solutions in these cases.

Disadvantages of neural networks

Good models come at the cost of a few disadvantages:

They take time to train: Neural networks are generally slower to train than a linear regression model or a decision tree model. Those models essentially make one pass over the data, whereas a neural network goes through many iterations.
The best solution is not guaranteed: You're not guaranteed to find the best solution. This also means that, in addition to running a single neural network through many iterations, you'll also need to run it multiple times using different starting points so that you can try to get closer to the best solution.
Black boxes: As we discussed earlier, it is hard to decipher what produced a given output and how.

Representing the errors

When building a neural network, our goal is to find the best possible solution and not to get stuck with a sub-optimal one. To do that, we'll need to run the neural network multiple times.

Consider this error graph as an example:

This graph depicts the amount of error in different solutions. The Global Solution is the best possible solution, the one with the lowest error. A Sub-Optimal Solution is one where training gets stuck, no longer improves, and terminates, even though it isn't the best solution.
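To make the idea of getting stuck more concrete, here is a minimal sketch in plain Python. The error curve, the learning rate, and the starting points are all assumptions chosen for illustration; they are not taken from the graph above. The curve has two valleys, and a simple downhill search settles in whichever valley is nearest to where it starts.

```python
def error(x):
    """Hypothetical error curve with two valleys (illustrative only)."""
    return x**4 - 3 * x**2 + x


def error_slope(x):
    """Derivative of error(x); its sign tells us which way is downhill."""
    return 4 * x**3 - 6 * x + 1


def descend(start, learning_rate=0.01, steps=2000):
    """Step downhill repeatedly from one starting point until it settles."""
    x = start
    for _ in range(steps):
        x -= learning_rate * error_slope(x)
    return x


# Run the same search from several different starting points.
starting_points = [-2.0, -0.5, 0.5, 2.0]
solutions = [descend(start) for start in starting_points]

# Keep the run that ended with the lowest error.
best = min(solutions, key=error)

for start, x in zip(starting_points, solutions):
    print(f"start={start:+.1f} -> settles at x={x:+.3f}, error={error(x):.3f}")
print(f"best solution found: x={best:+.3f}")
```

In this sketch, the runs that start on the right settle in the shallower valley and never leave it, which is why comparing several runs is the only way the search discovers the deeper one.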

Types of neural network models

There are different types of neural networks available for us; in this section, we will gain insights into these.

Multi-layer perceptron

The most common type is called the multi-layer perceptron model. This neural network model consists of neurons represented by circles, as shown in the following diagram. These neurons are organized into layers:

Every multi-layer perceptron model will have at least three layers:

Input Layer: This layer consists of all the predictors in our data.
Output Layer: This will consist of the outcome variable, which is also known as the dependent variable or target variable.
Hidden Layer: This layer is where you maximize the power of a neural network. Non-linear relationships can also be created in this layer, and all the complex interactions are carried out here. You can have many such hidden layers.

You will also notice in the preceding diagram that every neuron in a layer is connected to every neuron in the next layer. This forms connections, and every connecting line will have a weight associated with it. These weights will form different equations in the model.
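Because every neuron is connected to every neuron in the next layer, the number of connection weights can be worked out from the layer sizes alone. The following is a small sketch of that arithmetic; the function name and the layer sizes are assumptions for illustration, and bias terms (which many implementations add) are deliberately left out.

```python
def count_connection_weights(layer_sizes):
    """Count connecting lines in a fully connected network.

    layer_sizes lists the number of neurons per layer, from input to output.
    Each pair of adjacent layers contributes (neurons in) * (neurons out).
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


# 3 inputs, one hidden layer of 2 neurons, 1 output: 3*2 + 2*1 connections
print(count_connection_weights([3, 2, 1]))   # 8

# 20 inputs, one hidden layer of 5 neurons, 1 output: 20*5 + 5*1 connections
print(count_connection_weights([20, 5, 1]))  # 105
```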

Why are weights important?

Weights are important for several reasons. First, because every neuron in one layer is connected to every neuron in the next layer, the layers themselves are connected. It also means that a neural network model, unlike many other models, doesn't drop any predictors. So, for example, if you start off with 20 predictors, all 20 predictors will be kept. A second reason why weights are important is that they provide information on the impact or importance of each predictor to the prediction. As will be shown later, these weights start off randomly; however, through multiple iterations, the weights are modified so as to provide meaningful information.
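As a rough illustration of weights starting off randomly and being modified over many iterations, consider the sketch below. It uses a single weight and made-up data in which the outcome is always twice the predictor; the update rule (gradient descent on squared error), the learning rate, and the seed are assumptions chosen for the example, not a description of any particular tool.

```python
import random

random.seed(0)  # fixed seed so the random start is repeatable

# Made-up data: the outcome is exactly 2 times the predictor.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]

# The weight starts at a random value...
weight = random.uniform(-1.0, 1.0)
initial_weight = weight

# ...and is nudged a little on every iteration to reduce the squared error.
learning_rate = 0.01
for _ in range(500):
    gradient = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    weight -= learning_rate * gradient

print(f"initial weight: {initial_weight:.3f}")
print(f"weight after training: {weight:.3f}")
```

After training, the weight ends up very close to 2, the value that actually relates the predictor to the outcome in this made-up data.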

An example representation of a multi-layer perceptron model

Here, we will look at an example of a multi-layer perceptron model. We will try to predict whether an individual is a potential buyer of a particular item based on their age, income, and gender.

Consider the following, for example:

As you can see, our input predictors that form the Input Layer are age, income, and gender. The outcome variable that forms our Output Layer is Buy, which will determine whether someone bought a product or not. There is a hidden layer where the input predictors end up combining.
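The way the predictors combine in the hidden layer can be sketched with a few lines of plain Python. Everything numeric here is an assumption for illustration: the two hidden neurons, the weight and bias values, the sigmoid activation, and the rescaling of age and income to the range 0 to 1. In a real model, the weights would be learned from data.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Pass one individual through the input, hidden, and output layers."""
    # Each hidden neuron combines ALL the input predictors with its own weights.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output neuron combines the hidden neurons into a single Buy score.
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output


# Hypothetical individual: [age, income, gender], with age and income
# rescaled to 0..1 and gender coded as 0 or 1.
person = [0.35, 0.60, 1.0]

# Hypothetical weights: one row of three weights per hidden neuron.
hidden_weights = [[0.8, -0.4, 0.3], [-0.5, 0.9, 0.2]]
hidden_biases = [0.1, -0.2]
output_weights = [1.2, -0.7]
output_bias = 0.05

hidden, p_buy = forward(person, hidden_weights, hidden_biases,
                        output_weights, output_bias)
print(f"hidden layer values: {hidden}")
print(f"Buy score: {p_buy:.3f}")
```

Note that no single weight links age directly to Buy; the effect of age is spread across both hidden neurons, which is part of why the model is hard to interpret.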

To better understand what goes on behind the scenes of a neural network model, let's take a look at a linear regression model.

The linear regression model

Let's understand the linear regression model with the help of an example.

Consider the following:

In linear regression, every input predictor in the Input Layer is connected to the outcome field by a single connection weight, also known as the coefficient, and these coefficients are estimated by a single pass through the data. The number of coefficients will be equal to the number of predictors. This means that every predictor will have a coefficient associated with it.

Every input predictor is directly connected to the Target with a particular coefficient as its weight. So, we can easily see the impact of a one-unit change in the input predictor on the outcome variable, or Target. These kinds of connections make it easy to determine the effect of each predictor on the Target variable as well as on the equation.
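A minimal sketch of this idea, assuming a single predictor and made-up data, is shown below. The function name and the numbers are hypothetical. The coefficient is estimated from running totals collected in one pass over the data, using the standard least-squares formulas, and it can then be read directly as the effect of a one-unit change in the predictor.

```python
def fit_simple_regression(xs, ys):
    """Estimate intercept and coefficient with one pass over the data."""
    n = sum_x = sum_y = sum_xy = sum_xx = 0
    for x, y in zip(xs, ys):  # the single pass: accumulate running totals
        n += 1
        sum_x += x
        sum_y += y
        sum_xy += x * y
        sum_xx += x * x
    coefficient = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - coefficient * sum_x) / n
    return intercept, coefficient


# Made-up data that follows target = 5 + 2 * predictor exactly.
predictor = [10, 20, 30, 40]
target = [25, 45, 65, 85]

intercept, coefficient = fit_simple_regression(predictor, target)
print(f"intercept={intercept}, coefficient={coefficient}")


def predict(x):
    return intercept + coefficient * x


# A one-unit change in the predictor moves the prediction by the coefficient.
print(predict(21) - predict(20))
```

With more predictors, the same principle holds with one coefficient per predictor, although the arithmetic is then usually done with matrix operations.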