Affinity Propagation is an exemplar-based clustering algorithm that has been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. One drawback of Affinity Propagation, however, is that its prior over clusterings is not well understood. The granularity of the clustering is controlled by a hand-tuned parameter, called the self-similarity, and there is little theoretical justification for how to choose it.
We introduce a model that admits Dirichlet process priors into the Affinity Propagation framework, allowing us to express a family of infinite priors over cluster size distributions. By adding a factor to Affinity Propagation whose strength depends on how many points are assigned to a cluster, we show that the self-similarity is determined by a combination of a Dirichlet process concentration parameter and base distribution, which are well-understood, natural ways to express infinite priors over clusterings. We further show that any exchangeable (but not necessarily consistent) prior can also be expressed in our model.
(Joint work with Rich Zemel and Brendan Frey.)
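
As a rough illustration of how a Dirichlet process prior depends on cluster sizes, the sketch below evaluates the standard Chinese restaurant process probability of a partition, P = alpha^K * prod_k (n_k - 1)! * Gamma(alpha) / Gamma(alpha + n). It is not the model described in the abstract: the function name `crp_log_prior` and the example partitions are assumptions made for illustration, and nothing here shows how the concentration parameter maps onto the Affinity Propagation self-similarity.

```python
import math


def crp_log_prior(cluster_sizes, alpha):
    """Log-probability of a partition under a Chinese restaurant process.

    cluster_sizes: list of positive integers, one per cluster (hypothetical input).
    alpha: Dirichlet process concentration parameter, must be > 0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if any(size < 1 for size in cluster_sizes):
        raise ValueError("every cluster must contain at least one point")

    n = sum(cluster_sizes)
    k = len(cluster_sizes)

    # alpha^K: each new cluster is opened with weight alpha.
    log_p = k * math.log(alpha)
    # prod_k (n_k - 1)!: lgamma(n_k) equals log((n_k - 1)!).
    log_p += sum(math.lgamma(size) for size in cluster_sizes)
    # Normaliser Gamma(alpha) / Gamma(alpha + n).
    log_p += math.lgamma(alpha) - math.lgamma(alpha + n)
    return log_p


if __name__ == "__main__":
    # Three illustrative partitions of six points.
    partitions = {"one cluster": [6], "two halves": [3, 3], "singletons": [1] * 6}
    for alpha in (0.1, 10.0):
        for name, sizes in partitions.items():
            print(f"alpha={alpha:<5} {name:<12} log prior = {crp_log_prior(sizes, alpha):.3f}")
```

A small concentration parameter favours a few large clusters and a large one favours many small clusters, which is the kind of granularity control that the self-similarity provides by hand in standard Affinity Propagation.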
