A variety of models and algorithms for clustering

Classes
class	shark::AbstractClustering< InputT >
	Base class for clustering.

class	shark::Centroids
	Clusters defined by centroids.

class	shark::ClusteringModel< InputT, OutputT >
	Abstract model with associated clustering object.

class	shark::HardClusteringModel< InputT >
	Model for "hard" clustering.

class	shark::HierarchicalClustering< InputT >
	Clusters defined by a binary space partitioning tree.

class	shark::SoftClusteringModel< InputT >
	Model for "soft" clustering.

Functions
SHARK_EXPORT_SYMBOL std::size_t	shark::kMeans (Data< RealVector > const &data, std::size_t k, Centroids &centroids, std::size_t maxIterations=0)
	The k-means clustering algorithm.

Function Documentation

kMeans()

SHARK_EXPORT_SYMBOL std::size_t shark::kMeans (Data< RealVector > const &data, std::size_t k, Centroids &centroids, std::size_t maxIterations=0)

The k-means clustering algorithm.

The k-means algorithm takes vector-valued data \( \{x_1, \dots, x_n\} \subset \mathbb R^d \) and partitions it into k clusters, each represented by one of the centroids \( \{c_1, \dots, c_k\} \). The result is stored in a Centroids object, which can then be used to construct clustering models.
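
For orientation, k-means is usually stated as minimizing the within-cluster sum of squared distances. In the notation above, and assuming the Euclidean norm (this page does not state which distance the implementation uses), the objective can be written as:

\[ J(c_1, \dots, c_k) = \sum_{i=1}^{n} \min_{1 \le j \le k} \| x_i - c_j \|^2 \]

Each point \( x_i \) contributes its squared distance to the nearest centroid, so a point's cluster is the index j that attains the inner minimum.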

If the provided centroids object (third parameter) already contains a set of k centroids, the search starts from those centroids. Otherwise the search starts from the first k data points.

The data set must contain at least k data points, because the current implementation does not allow empty clusters.
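
The behavior described above can be illustrated with a standard-library-only sketch of Lloyd-style k-means. This is not Shark's implementation: `Point`, `squaredDistance`, `nearestCentroid`, and `kMeansSketch` are hypothetical names, plain `std::vector` containers stand in for `Data< RealVector >` and `Centroids`, and both the Euclidean distance and the returned iteration count are assumptions made for illustration.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

using Point = std::vector<double>;  // hypothetical stand-in for RealVector

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(Point const& a, Point const& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Index of the centroid closest to x (a "hard" assignment; ties go to the lower index).
std::size_t nearestCentroid(Point const& x, std::vector<Point> const& centroids) {
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t j = 0; j < centroids.size(); ++j) {
        double dist = squaredDistance(x, centroids[j]);
        if (dist < bestDist) { bestDist = dist; best = j; }
    }
    return best;
}

// Sketch only. Returns the number of iterations performed (an assumption,
// not taken from the documentation above). maxIterations == 0 means unlimited.
std::size_t kMeansSketch(std::vector<Point> const& data, std::size_t k,
                         std::vector<Point>& centroids,
                         std::size_t maxIterations = 0) {
    if (k == 0 || data.size() < k)
        throw std::invalid_argument("k-means needs k > 0 and at least k data points");

    // Warm start only if exactly k centroids were supplied;
    // otherwise start from the first k data points.
    if (centroids.size() != k)
        centroids.assign(data.begin(), data.begin() + static_cast<std::ptrdiff_t>(k));

    std::size_t const dim = data[0].size();
    std::vector<std::size_t> assignment(data.size(), k);  // k marks "unassigned"
    std::size_t iterations = 0;

    while (maxIterations == 0 || iterations < maxIterations) {
        ++iterations;

        // Assignment step: attach every point to its nearest centroid.
        bool changed = false;
        for (std::size_t i = 0; i < data.size(); ++i) {
            std::size_t j = nearestCentroid(data[i], centroids);
            if (j != assignment[i]) { assignment[i] = j; changed = true; }
        }
        if (!changed) break;  // converged: no point switched cluster

        // Update step: move each centroid to the mean of its points.
        std::vector<Point> sums(k, Point(dim, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < data.size(); ++i) {
            for (std::size_t d = 0; d < dim; ++d) sums[assignment[i]][d] += data[i][d];
            ++counts[assignment[i]];
        }
        for (std::size_t j = 0; j < k; ++j) {
            if (counts[j] == 0) continue;  // sketch choice: keep an empty cluster's centroid
            for (std::size_t d = 0; d < dim; ++d)
                centroids[j][d] = sums[j][d] / static_cast<double>(counts[j]);
        }
    }
    return iterations;
}
```

Passing an empty `centroids` vector triggers the "first k data points" start, while passing k centroids from an earlier run gives a warm start. After the call, `nearestCentroid` plays the role that a hard clustering model built from the centroids would play.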

Parameters

data	vector-valued data to be clustered
k	number of clusters
centroids	on input, optional initial centroids (used only if it holds k centroids); on output, the resulting centroids
maxIterations	maximum number of k-means iterations; 0: unlimited

Referenced by main().