
Meta-Clustering: Provides a tool to run the clustering algorithms repeatedly, perturbing the data on each run through leave-n-out selection or the addition of noise. Results are accumulated and can be displayed for both the training and test data sets.
k-Means: Splits input records into k clusters by minimizing the distances of cluster members from their cluster centroid. Supports fuzzy cluster membership. Searches for an optimal value of k.
Self-Organizing Maps (SOM): Creates a regular one- or two-dimensional (rectangular or hexagonal) grid of cluster nodes and maps data points from high-dimensional space to these nodes. The SOM matrix may be annotated with text labels or graphics thumbnails.
Bayesian Classification: Generates a probabilistic partitioning of the data into classes, with each gene (sample) being assigned a probability of belonging to each class. Determines the optimal number of classes, and provides a range of model-fit diagnostics.
Hierarchical Clustering: Starts with every gene (sample) forming its own cluster and calculates the distances between every pair of clusters, progressively amalgamating the closest clusters until all genes (samples) are in a single group. Options are available to color-code the tree by covariate value, isolate clusters for study, optimally order the tree, and automatically label cluster groups by their defining characteristic.
Principal Component (PC) Shaving: Selects a set of genes (or samples) correlated with the first principal component of the data to represent the first cluster. Repeats the process for additional clusters, using principal components that are orthogonal to the clusters already chosen.
Support Vector Machines (SVM): Uses iterative algorithms to identify points (support vectors) that define a hyperplane in n-dimensional space that optimally separates two groups. Generalizations include non-linear boundaries between regions and k-group classification. Recursive feature elimination (RFE) progressively eliminates the least valuable features to derive an effective subset of features.
Nearest Centroid: Classifies each sample into the group whose centroid lies closest. The nearest shrunken centroids variant shrinks each group centroid towards the global centroid, eliminating the influence of the genes that differentiate most weakly. Can be applied to a training data set, a test set, or a leave-n-out set.
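
As an illustration of the Nearest Centroid entry, the following is a minimal Python sketch of centroid shrinkage, not the product's implementation. The function names (fit_centroids, predict), the shrink parameter, and the toy data are all hypothetical. It also simplifies the published nearest shrunken centroids method by soft-thresholding raw centroid differences rather than differences standardized by within-group variability.

```python
from collections import defaultdict
from math import copysign


def fit_centroids(samples, labels, shrink=0.0):
    """Return {label: centroid}, each shrunk toward the global centroid.

    Assumption: shrinkage is a plain soft threshold on the raw difference
    between the group centroid and the global centroid, per feature.
    """
    n_features = len(samples[0])
    overall = [sum(row[j] for row in samples) / len(samples)
               for j in range(n_features)]

    grouped = defaultdict(list)
    for row, label in zip(samples, labels):
        grouped[label].append(row)

    centroids = {}
    for label, rows in grouped.items():
        mean = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        shrunk = []
        for j in range(n_features):
            diff = mean[j] - overall[j]
            # Reduce |diff| by `shrink`, clipping at zero, so features that
            # differentiate weakly collapse onto the global centroid.
            reduced = max(abs(diff) - shrink, 0.0)
            shrunk.append(overall[j] + copysign(reduced, diff))
        centroids[label] = shrunk
    return centroids


def predict(centroids, sample):
    """Assign the sample to the group with the closest centroid."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy data (hypothetical): feature 0 separates the groups, feature 1 does not.
X = [[1.0, 5.0], [1.2, 5.1], [3.0, 5.0], [3.2, 5.1]]
y = ["A", "A", "B", "B"]

model = fit_centroids(X, y, shrink=0.5)
prediction = predict(model, [2.9, 5.0])
```

With shrink=0.0 this reduces to plain nearest-centroid classification. Larger values remove more features from the decision, because a feature whose shrunken centroid equals the global centroid in every group contributes the same distance to each group.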


