Evaluating the Performance of Clusters
After applying a clustering algorithm, it is necessary to evaluate how well it has performed. This is especially important when the clusters are difficult to assess visually, for example, when the data has more than two or three features and cannot be plotted directly.
With supervised algorithms, evaluating performance is usually straightforward: the prediction for each instance is compared with its true value (class). Unsupervised models have no such ground truth to compare against, so other strategies are needed. In the specific case of clustering algorithms, performance can be evaluated by measuring the similarity of the data points that belong to the same cluster.
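To make the idea of similarity within a cluster concrete, the following is a minimal sketch that uses only the Python standard library. The helper name mean_intra_cluster_distance, the toy points, and both label assignments are made up for illustration and are not part of scikit-learn. The function averages the distance between each point and the centroid of its assigned cluster, so a lower value suggests that the members of each cluster are more alike.

```python
from math import dist
from statistics import mean


def mean_intra_cluster_distance(points, labels):
    """Average distance between each point and the centroid of its cluster.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    # Group the points by their assigned cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    distances = []
    for members in clusters.values():
        # The centroid is the coordinate-wise mean of the cluster's members.
        centroid = tuple(mean(coords) for coords in zip(*members))
        distances.extend(dist(p, centroid) for p in members)
    return mean(distances)


# Toy data (assumed for illustration): two visually obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # mixes the two groups

good = mean_intra_cluster_distance(points, good_labels)
bad = mean_intra_cluster_distance(points, bad_labels)
print(good, bad)
```

The assignment that keeps nearby points together yields the smaller average distance, which is the kind of comparison that can stand in for a true label when none is available.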
Available Metrics in Scikit-Learn
Scikit-learn provides two different scores for evaluating the performance of unsupervised clustering algorithms. The main idea behind these scores is to measure how well-defined the clusters' edges are, instead of measuring the dispersion within a cluster...
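As a rough sketch of how such scores can be computed, the example below clusters synthetic data and evaluates the result without any true labels. This excerpt does not name the two scores, so the choice of silhouette_score and calinski_harabasz_score (both available in sklearn.metrics) is an assumption, as are the synthetic dataset from make_blobs and the k-means settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, silhouette_score

# Synthetic data (assumed for illustration): 300 points around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Cluster the data; the true labels from make_blobs are deliberately ignored.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Silhouette score: ranges from -1 to 1, higher means better-separated clusters.
sil = silhouette_score(X, labels)

# Calinski-Harabasz score: ratio of between-cluster to within-cluster
# dispersion, higher means better-defined clusters.
ch = calinski_harabasz_score(X, labels)

print(f"Silhouette: {sil:.3f}")
print(f"Calinski-Harabasz: {ch:.1f}")
```

Both functions take only the feature matrix and the predicted cluster labels, which is what makes them usable in the unsupervised setting. Comparing the scores across different numbers of clusters or different algorithms is one common way to use them.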