Intelligent Computation Technology and Automation, International Conference on

Abstract

Most real-world data come with explicitly defined domain orders; e.g., lexicographic for strings, numeric for integers, and chronological for time. Our goal is to discover implicit domain orders that we do not already know; for instance, that the order of months in the Chinese Lunar calendar is Corner < Apricot < Peach. To do so, we enhance data profiling methods by discovering implicit domain orders in data through order dependencies. We enumerate tractable special cases and show that the general case is NP-complete but can be effectively handled by a SAT solver. We also devise an interestingness measure to rank the discovered implicit domain orders. Based on an extensive suite of experiments with real-world data, we establish the efficacy of our algorithms.
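As a rough illustration of the idea, and not the algorithm from the paper, consider one simple situation: a column with no known order (month names) co-occurs with a column whose order is known (a numeric day count). If each name covers a disjoint range of the ordered column, the ranges imply an order on the names. The function name `infer_implicit_order`, the column names, and the day values below are all hypothetical; only the month names come from the example above.

```python
def infer_implicit_order(rows, implicit_col, ordered_col):
    """Order the values of implicit_col by the ranges they span in ordered_col.

    Returns a list of values in the implied order, or None when two values
    span overlapping ranges, so that no single order is implied.
    This is a simplified sketch, not the paper's discovery algorithm.
    """
    # Track the (min, max) of the ordered column seen for each value.
    ranges = {}
    for row in rows:
        key, val = row[implicit_col], row[ordered_col]
        lo, hi = ranges.get(key, (val, val))
        ranges[key] = (min(lo, val), max(hi, val))

    # Sort values by their range, then require adjacent ranges to be disjoint.
    ordered = sorted(ranges.items(), key=lambda kv: kv[1])
    for (_, (_, prev_hi)), (_, (next_lo, _)) in zip(ordered, ordered[1:]):
        if next_lo <= prev_hi:
            return None
    return [key for key, _ in ordered]


# Hypothetical records: lunar month name and an assumed day-of-year count.
rows = [
    {"month": "Peach", "day": 65},
    {"month": "Corner", "day": 3},
    {"month": "Apricot", "day": 40},
    {"month": "Corner", "day": 28},
    {"month": "Apricot", "day": 31},
    {"month": "Peach", "day": 88},
]

print(infer_implicit_order(rows, "month", "day"))  # ['Corner', 'Apricot', 'Peach']
```

When the ranges overlap, this sketch simply gives up and returns `None`; by the abstract's account, the general problem is NP-complete and is handled with a SAT solver, which this example does not attempt.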