Summary
In this chapter, we discussed our first unsupervised learning task, clustering. Clustering is used to discover structures in unlabeled data. We learned about the K-means clustering algorithm, which iteratively assigns instances to clusters and refines the positions of the cluster centroids. While K-means learns from experience without supervision, its performance is still measurable; we learned to use distortion and the silhouette coefficient to evaluate clusters. We applied K-means to two different problems. First, we used K-means for image quantization, a compression technique that represents a range of colors with a single color. We also used K-means to learn features in a semi-supervised image classification problem.
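As a compact reminder of the assign-and-refine loop, the following is a minimal sketch in plain Python rather than the library code used in the chapter. The function name `kmeans`, the toy `points`, and the hand-picked starting centroids are all illustrative assumptions, and distortion is taken here to mean the sum of squared distances from each instance to its assigned centroid.

```python
import math

def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, initial_centroids, max_iter=100):
    """Illustrative K-means; returns (centroids, labels, distortion)."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        # Assignment step: each instance joins the cluster of its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its assigned instances.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # centroids stopped moving, so the assignments are stable
        centroids = updated
    # Recompute labels so they always match the final centroids.
    labels = [nearest(p, centroids) for p in points]
    # Distortion (assumed definition): sum of squared instance-to-centroid distances.
    distortion = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return centroids, labels, distortion

# Toy data: two well-separated groups of three instances each (hypothetical values).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, labels, distortion = kmeans(points, [(0, 0), (10, 10)])
print(labels)
print(round(distortion, 4))
```

The starting centroids are fixed by hand so that the run is repeatable. In practice the initial positions are usually chosen at random, and the algorithm is often restarted several times, keeping the run with the lowest distortion.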
In the next chapter we will discuss another unsupervised learning task called dimensionality reduction. Like the semi-supervised feature representations we created to classify images of cats and dogs, dimensionality reduction can be used to reduce the dimensions of...