Datasets used in this chapter
This chapter uses several datasets that cover a wide range of topics. What they have in common is that they are all multidimensional, which makes them well suited to the techniques of this chapter. For convenience, all datasets are available online at http://scholar.harvard.edu/gerrard/mastering-scientific-computation-r. The following are some of the datasets:
Red wine: This dataset contains the chemical properties of red wines as well as the wine quality score. It comes from the paper P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009. It was downloaded from the University of California Irvine Machine Learning Repository at http://archive.ics.uci.edu/ml/.
Abalone: This is a dataset of abalone measurements. The measurements are concerned largely with sizes...