Part of Advances in Neural Information Processing Systems 8 (NIPS 1995)
Marcelo Blatt, Shai Wiseman, Eytan Domany
A new approach for clustering is proposed. This method is based on an analogy to a physical model; the ferromagnetic Potts model at thermal equilibrium is used as an analog computer for this hard optimization problem . We do not assume any structure of the un(cid:173) derlying distribution of the data. Phase space of the Potts model is divided into three regions; ferromagnetic, super-paramagnetic and paramagnetic phases. The region of interest is that corresponding to the super-paramagnetic one, where domains of aligned spins ap(cid:173) pear. The range of temperatures where these structures are stable is indicated by a non-vanishing magnetic susceptibility. We use a very efficient Monte Carlo algorithm to measure the susceptibil(cid:173) ity and the spin spin correlation function. The values of the spin spin correlation function, at the super-paramagnetic phase, serve to identify the partition of the data points into clusters.
Many natural phenomena can be viewed as optimization processes, and the drive to understand and analyze them yielded powerful mathematical methods. Thus when wishing to solve a hard optimization problem, it may be advantageous to apply these methods through a physical analogy. Indeed, recently techniques from statistical physics have been adapted for solving hard optimization problems (see e.g. Yuille and Kosowsky, 1994). In this work we formulate the problem of clustering in terms of a ferromagnetic Potts spin model. Using the Monte Carlo method we estimate physical quantities such as the spin spin correlation function and the susceptibility, and deduce from them the number of clusters and cluster sizes. Cluster analysis is an important technique in exploratory data analysis and is ap(cid:173) plied in a variety of engineering and scientific disciplines. The problem of partitionaZ clustering can be formally stated as follows. With everyone of i = 1,2, ... N pat(cid:173) terns represented as a point Xi in a d-dimensional metric space, determine the partition of these N points into M groups, called clusters, such that points in a cluster are more similar to each other than to points in different clusters. The value of M also has to be determined.
Clustering Data through an Analogy to the Potts Model
417
The two main approaches to partitional clustering are called parametric and non(cid:173) parametric. In parametric approaches some knowledge of the clusters' structure is assumed (e.g . each cluster can be represented by a center and a spread around it) . This assumption is incorporated in a global criterion. The goal is to assign the data points so that the criterion is minimized . A typical example is variance min(cid:173) imization (Rose, Gurewitz, and Fox, 1993) . On the other hand, in non-parametric approaches a local criterion is used to build clusters by utilizing local structure of the data. For example, clusters can be formed by identifying high-density regions in the data space or by assigning a point and its K -nearest neighbors to the same cluster. In recent years many parametric partitional clustering algorithms rooted in statistical physics were presented (see e.g. Buhmann and Kiihnel , 1993). In the present work we use methods of statistical physics in non-parametric clustering.
Our aim is to use a physical problem as an analog to the clustering problem. The notion of clusters comes very naturally in Potts spin models (Wang and Swendsen, 1990) where clusters are closely related to ordered regions of spins. We place a Potts spin variable Si at each point Xi (that represents one of the patterns), and introduce a short range ferromagnetic interaction Jij between pairs of spins, whose strength decreases as the inter-spin distance Ilxi - Xj" increases. The system is governed by the Hamiltonian (energy function)
1i = - L hj D8,,8j