This is an old revision of the document!
4.7 Clustering
The Problem
Clustering arises when one is trying to classify a collection of objects into coherent groups.The first question that comes into mind is whether there exist some measure under which each pair of objects is similar or dissimilar to another. A common approach used is to define a distance function on the objects, where objects at a larger distance from one another are less similar to each other. However, the notion of distance usually takes on a very abstract meaning in many applications.
Given a distance function on the objects, the clustering problem seeks to divide them into groups such that objects within the same group are “close”, and objects in different groups are “far apart”. So the main challenge is figuring out what a good set of groups might look like.
Clustering of Maximum Spacing
–> Let's consider a set U of n objects{p1,p2,…,pn}
–> For each pair pi and pj, there's a numerical distance d(pi,pj)
–> such that d(pi,pi) = 0