Mean-Shift Algorithm
The mean-shift algorithm assigns each data point to a cluster based on the density of data points in the data space. Each cluster corresponds to a local maximum of that density, also known as a mode of the distribution function. Unlike the k-means algorithm, the mean-shift algorithm does not require you to specify the number of clusters as a parameter.
The algorithm models the data points as a distribution function, in which high-density areas (high concentrations of data points) appear as peaks. The general idea is then to shift each data point until it reaches its nearest peak. All the points that arrive at the same peak form a cluster.
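The shifting idea can be illustrated with a minimal NumPy sketch. This is not the book's implementation or a library API. The function names (mean_shift, label_by_mode), the Gaussian weighting, the bandwidth value, and the toy data are all assumptions chosen for illustration. Each point is moved repeatedly to the weighted mean of the data around it, and points that end up at the same location are given the same label.

```python
import numpy as np


def mean_shift(points, bandwidth=1.0, n_iter=300, tol=1e-6):
    """Move a copy of every point uphill toward its nearest density peak.

    Assumption: a Gaussian weighting with the given bandwidth is used.
    """
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Squared distance from every shifted point to every original point
        diffs = shifted[:, None, :] - points[None, :, :]
        sq_dist = np.sum(diffs ** 2, axis=-1)
        # Nearby points get large weights, distant points get tiny weights
        weights = np.exp(-sq_dist / (2 * bandwidth ** 2))
        # New position: weighted mean of the original data
        new_shifted = weights @ points / weights.sum(axis=1, keepdims=True)
        moved = np.max(np.linalg.norm(new_shifted - shifted, axis=1))
        shifted = new_shifted
        if moved < tol:  # stop once no point moves noticeably
            break
    return shifted


def label_by_mode(shifted, merge_radius):
    """Give the same label to points that converged to the same peak."""
    modes = []
    labels = np.empty(len(shifted), dtype=int)
    for i, p in enumerate(shifted):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing peak is close enough, so this is a new cluster
            modes.append(p)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)


# Hypothetical toy data: two small groups of 2D points
data = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [1.1, 1.3], [0.9, 0.9],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1], [5.1, 5.3], [4.9, 4.8],
])
shifted = mean_shift(data, bandwidth=1.0)
labels, modes = label_by_mode(shifted, merge_radius=0.5)
print(labels)
print(modes)
```

Note that the number of clusters is never passed in. It falls out of how many distinct peaks the points converge to, which in this sketch depends on the assumed bandwidth.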
Understanding the Algorithm
The first step of the mean-shift algorithm is to represent the data points as a density distribution. To do so, the algorithm builds upon the idea of Kernel Density Estimation (KDE), a method used to estimate the distribution of a set of data:
In the preceding diagram, the red dots represent...