A variety of models and algorithms for clustering
◆ kMeans()
The k-means clustering algorithm.
- The k-means algorithm takes vector-valued data \( \{x_1, \dots, x_n\} \subset \mathbb R^d \) and splits it into k clusters, based on centroids \( \{c_1, \dots, c_k\} \). The result is stored in a Centroids object that can be used to construct clustering models.
- If the provided centroids object (third parameter) already contains a set of k centroids, this implementation starts the search from them. Otherwise the search starts from the first k data points.
- The data set must contain at least k data points, because the current implementation does not allow for empty clusters.
- Parameters
data | vector-valued data to be clustered |
k | number of clusters |
centroids | initial centroids on input (used if it contains k centroids); resulting centroids on output |
maxIterations | maximum number of k-means iterations; 0 means unlimited |
- Returns
- number of k-means iterations
Referenced by main().
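The behavior documented above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation or signature: `Point`, `squaredDistance`, and `kMeansSketch` are hypothetical names, plain `std::vector` containers stand in for the data and Centroids types, and two details are assumptions of the sketch rather than facts from this page: the loop stops when no point changes cluster, and a cluster that becomes empty keeps its previous centroid.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;  // hypothetical stand-in for a data vector

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point& a, const Point& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical sketch of the documented contract (not the library API):
// centroids is input/output, maxIterations == 0 means unlimited,
// and the return value is the number of iterations performed.
std::size_t kMeansSketch(const std::vector<Point>& data, std::size_t k,
                         std::vector<Point>& centroids,
                         std::size_t maxIterations = 0) {
    assert(k > 0 && data.size() >= k);  // at least k data points are required

    // Use the given centroids only if exactly k were supplied;
    // otherwise start from the first k data points.
    if (centroids.size() != k) {
        centroids.assign(data.begin(),
                         data.begin() + static_cast<std::ptrdiff_t>(k));
    }

    const std::size_t dim = data.front().size();
    std::vector<std::size_t> assignment(data.size(), k);  // k marks "unassigned"
    std::size_t iterations = 0;

    while (maxIterations == 0 || iterations < maxIterations) {
        ++iterations;

        // Assignment step: attach every point to its nearest centroid.
        bool changed = false;
        for (std::size_t i = 0; i < data.size(); ++i) {
            std::size_t best = 0;
            double bestDist = std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < k; ++c) {
                const double d = squaredDistance(data[i], centroids[c]);
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            if (assignment[i] != best) {
                assignment[i] = best;
                changed = true;
            }
        }
        if (!changed) break;  // assumption: converged when assignments are stable

        // Update step: move each centroid to the mean of its members.
        std::vector<Point> sums(k, Point(dim, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < data.size(); ++i) {
            for (std::size_t j = 0; j < dim; ++j) {
                sums[assignment[i]][j] += data[i][j];
            }
            ++counts[assignment[i]];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // assumption: keep the old centroid
            for (std::size_t j = 0; j < dim; ++j) {
                centroids[c][j] = sums[c][j] / static_cast<double>(counts[c]);
            }
        }
    }
    return iterations;
}
```

Calling `kMeansSketch(data, k, centroids)` with an empty `centroids` vector exercises the fallback to the first k data points, while passing a vector that already holds k points starts the search from those instead. In both cases `centroids` holds the final cluster centers on return.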