scikit-learn is a machine learning toolkit written in Python. Part of the package covers supervised learning, where each sample data point has attributes (features) together with a known label, such as the class it belongs to. We use an estimator that learns from these labeled data points and then makes predictions for other data points with similar attributes. In scikit-learn, an estimator provides two functions, fit() and predict(): fit() trains the model on the labeled data points, and predict() predicts the classes of other data points.
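To make the fit() and predict() pairing concrete, here is a minimal sketch. The data is made up for illustration, and KNeighborsClassifier is only one of many estimators that could be used here; it was picked because it needs almost no configuration.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: each row is [height_cm, weight_kg].
X_train = [[150, 50], [160, 55], [180, 85], [190, 95]]
# Made-up class labels, one per row of X_train.
y_train = ["small", "small", "large", "large"]

# fit() learns from the labeled data points.
estimator = KNeighborsClassifier(n_neighbors=1)
estimator.fit(X_train, y_train)

# predict() assigns a class to data points the estimator has not seen.
predictions = estimator.predict([[155, 52], [185, 90]])
print(predictions)
```

With a single nearest neighbor, each new point simply takes the class of the closest training point, so the first point should come out as "small" and the second as "large".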
As an example, we will use the housing dataset from https://uci.edu/ (I believe it covers the Boston area). Each record in the dataset has a number of factors, including a price factor.
We will take the following steps:
- We will break up the dataset into a training set and a test set
- From the training set, we will produce a model
- We will then...
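The first two steps, splitting the data and producing a model from the training set, might look roughly like the sketch below. It rests on several assumptions: the data is synthetic (generated with make_regression as a stand-in for the housing data, with the target playing the role of the price factor), LinearRegression is an arbitrary choice of estimator, and the 75/25 split ratio is illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (an assumption, not the real dataset):
# 200 samples, 5 numeric factors, and a continuous target acting as the price.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Step 1: break the dataset into a training set and a test set.
# test_size=0.25 is an illustrative choice; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: produce a model from the training set only.
model = LinearRegression()
model.fit(X_train, y_train)

print(X_train.shape, X_test.shape)
```

Keeping the test set out of fit() means it remains unseen data that the model has not learned from.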