Abstract
Over the last decade, there has been growing interest in distance function learning for semi-supervised clustering. In addition to earlier methods that learn Mahalanobis metrics (or, equivalently, linear transformations), some nonlinear metric learning methods have recently been introduced. However, these methods either restrict the choice of distance metrics, which limits their flexibility, or learn nonparametric kernel matrices and scale very poorly, which prevents their application to medium and large data sets. In this paper, we propose a novel method that learns low-rank kernel matrices from pairwise constraints and unlabeled data. We formulate the proposed method as a trace ratio optimization problem and learn appropriate distance metrics by finding optimal low-rank kernel matrices. The resulting optimization problem can be solved much more efficiently than the semidefinite programming (SDP) problems previously introduced to learn nonparametric kernel matrices. Experimental results demonstrate the effectiveness of our method on synthetic and real-world data sets.
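For readers unfamiliar with the term, a generic trace ratio problem seeks a matrix V with orthonormal columns that maximizes tr(V^T A V) / tr(V^T B V). The sketch below is not the method proposed in this paper. It only illustrates a standard iterative scheme for the generic problem, assuming A is symmetric and B is symmetric positive definite. The function name `trace_ratio`, its parameters, and the random test matrices are hypothetical choices made for illustration.

```python
import numpy as np


def trace_ratio(A, B, d, n_iter=50, tol=1e-10):
    """Maximize tr(V.T A V) / tr(V.T B V) over n x d matrices V
    with orthonormal columns.

    Assumptions (illustrative only): A is symmetric (n, n) and
    B is symmetric positive definite (n, n).
    """
    n = A.shape[0]
    V = np.eye(n)[:, :d]  # any orthonormal starting point works
    lam = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
    for _ in range(n_iter):
        # For a fixed ratio estimate lam, tr(V.T (A - lam B) V) is
        # maximized by the top-d eigenvectors of A - lam * B.
        _, vecs = np.linalg.eigh(A - lam * B)  # eigenvalues ascending
        V = vecs[:, -d:]
        new_lam = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return V, lam


# Hypothetical toy data: random symmetric A and positive definite B.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
N = rng.standard_normal((6, 6))
A = M @ M.T
B = N @ N.T + 6.0 * np.eye(6)

V, lam = trace_ratio(A, B, d=2)
```

Each iteration requires only one symmetric eigendecomposition, which gives some intuition for why trace ratio formulations tend to be cheaper to solve than a general-purpose SDP.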