We can make your applications smarter by giving them the tools to dynamically learn trends and make predictions.
Dataset before clustering

Clustering

Use

Clustering is a technique for grouping data into “clusters” based on how similar items are to each other. It's considered an “unsupervised” learning technique, which means that no user-assigned class labels are required.

Examples

I've found an interesting website and want to see a list of other sites similar to it, or I want to extract groups of pixels from an image (segmentation).

Popular Techniques

k-Means, k-Medoids, Kernel Clustering, Spectral Clustering (uses eigenvectors), Gravitational Clustering, Canopy Clustering, Self-Organizing Maps, Expectation Maximization, AGNES, CLARA, DBSCAN, DIANA, BIRCH, and many others.
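
To make the list above more concrete, here is a minimal sketch of k-Means, the first technique named, using only the Python standard library. The function name `kmeans`, the sample points, and the choice to seed the centroids with the first k points are assumptions made for illustration; practical implementations normally use smarter initialization and a convergence check.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters with Lloyd's algorithm (illustrative only)."""
    # Assumption: the first k points act as the initial centroids.
    # This keeps the example deterministic but is not a robust choice.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

No class labels are supplied anywhere in this sketch: the grouping comes entirely from the distances between points, which is what makes the technique unsupervised.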

Measuring

It's difficult to measure cluster performance. The simplest way is to manually compare clusters with expected results, but this isn't rigorous and requires domain expertise. Another method is to use the diameter of each cluster (how tightly its members are packed) or the distances between clusters (how well separated they are) as a performance metric. Information-theoretic measures such as entropy are also commonly used, for example to score how mixed the contents of a cluster are. Hierarchical clustering schemes also make use of tree diagrams called dendrograms, which show how items are merged into progressively larger clusters.
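
The measures mentioned above can be sketched in a few lines of standard-library Python. The function names, the sample clusters, and the use of Euclidean distance are assumptions made for illustration. The entropy example further assumes that some reference labels are available for the items in a cluster, which is not always the case in an unsupervised setting.

```python
import math
from collections import Counter
from itertools import combinations

def diameter(cluster):
    """Largest pairwise distance inside one cluster (smaller means tighter)."""
    return max((math.dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)

def single_link_distance(cluster_a, cluster_b):
    """Smallest distance between a point in one cluster and a point in the other."""
    return min(math.dist(a, b) for a in cluster_a for b in cluster_b)

def entropy(reference_labels):
    """Shannon entropy, in bits, of the reference labels found in one cluster."""
    total = len(reference_labels)
    counts = Counter(reference_labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical clusters of 2-D points.
tight = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
far = [(10.0, 0.0), (10.0, 1.0)]

print(diameter(tight), diameter(far))
print(single_link_distance(tight, far))

# Hypothetical reference labels: a pure cluster scores 0 bits,
# an evenly mixed two-label cluster scores 1 bit.
print(entropy(["a", "a", "a"]), entropy(["a", "a", "b", "b"]))
```

Small diameters combined with large inter-cluster distances suggest compact, well-separated clusters, while low entropy suggests that each cluster mostly contains items of one kind.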

Caveats