Summary
Machine learning consists of constructing models, some of which rest on complicated mathematical concepts, to understand data. Scikit-learn is an open-source Python library that facilitates applying these models to data problems without requiring deep mathematical knowledge.
This chapter first covered an important step in approaching a data problem: representing the data in tabular form. It then covered creating the features and target matrices, preprocessing the data, and choosing an algorithm.
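As a reminder of what those steps look like in practice, the following minimal sketch splits a small table into a features matrix and a target, then rescales the features. The table values and the names `table`, `X`, `y`, and `X_scaled` are hypothetical and chosen only for illustration; min-max rescaling is just one of several possible preprocessing choices.

```python
import numpy as np

# Hypothetical toy table: each row is an instance, the last column is the target.
table = np.array([
    [150.0, 50.0, 0],
    [160.0, 60.0, 0],
    [170.0, 70.0, 1],
    [180.0, 80.0, 1],
])

# Features matrix: every column except the last, shape (n_instances, n_features).
X = table[:, :-1]

# Target: the last column, shape (n_instances,).
y = table[:, -1].astype(int)

# Min-max rescaling, applied per column, so each feature lies in [0, 1].
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X.shape, y.shape)
print(X_scaled)
```

Rescaling each column separately keeps a feature with large values from dominating one with small values.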
Finally, once the type of algorithm that best suits the data problem has been selected, the model can be built using the scikit-learn API, which has three interfaces: estimators, predictors, and transformers. Because the API is uniform, learning the methods for one algorithm is enough to use them with others.
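The uniformity of the API can be illustrated with a short sketch. The dataset (iris) and the two classifiers below are assumptions picked only for illustration, not the algorithms the chapter necessarily recommends, and scoring on the training data is done here purely to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset as a features matrix X and a target y.
X, y = load_iris(return_X_y=True)

# Transformer interface: fit learns the scaling parameters, transform applies them.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Estimator and predictor interfaces: the same fit, predict, and score calls
# work for two unrelated algorithms.
models = (LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=0))
for model in models:
    model.fit(X_scaled, y)                   # estimator: learn from the data
    predictions = model.predict(X_scaled)    # predictor: produce predictions
    accuracy = model.score(X_scaled, y)      # predictor: accuracy on the given data
    print(type(model).__name__, predictions.shape, round(accuracy, 3))
```

Swapping one algorithm for another changes only the line that creates the model; the calls that follow stay the same.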
With all of this in mind, the next chapter details the process of applying an unsupervised algorithm to a real-life dataset.