2012 IEEE 53rd Annual Symposium on Foundations of Computer Science
Download PDF

Abstract

For a set of n points in \Re^d, and parameters k and e, we present a data structure that answers (1 + e)-approximate k nearest neighbor queries in logarithmic time. Surprisingly, the space used by the data-structure is \wide tilde{O}(n/k), that is, the space used is sub linear in the input size if k is sufficiently large. Our approach provides a novel way to summarize geometric data, such that meaningful proximity queries on the data can be carried out using this sketch. Using this we provide a sub linear space data-structure that can estimate the density of a point set under various measures, including: (i) sum of distances of k closest points to the query point, and (ii) sum of squared distances of k closest points to the query point. Our approach generalizes to other distance based estimation of densities of similar flavor.
Like what you’re reading?
Already a member?
Get this article FREE with a new membership!

Related Articles