Hao Wang, The University of Hong Kong, Hong Kong
Yilun Cai, The University of Hong Kong, Hong Kong
Yin Yang, Advanced Digital Sciences Center, Singapore
Shiming Zhang, Noah's Ark Lab, Hong Kong
Nikos Mamoulis, The University of Hong Kong, Hong Kong
Time Series Analysis, Indexing, Trajectory, Knowledge Discovery, Data Engineering, Search Problems, Spatiotemporal Databases, Durable Query, Time Series, Historical Data
Abstract
This paper studies the problem of finding objects with durable quality over time in historical time series databases. For example, a sociologist may be interested in the top 10 web search terms during the period of some historical event, and the police may look for vehicles that stayed close to a suspect for 70 percent of a certain time period. Durable top-k (DTop-k) and durable k-nearest-neighbor (DkNN) queries can be viewed as natural extensions of the standard snapshot top-k and NN queries to timestamped sequences of values or locations. Although their snapshot counterparts have been studied extensively, to our knowledge there is little prior work that addresses this new class of durable queries. Existing methods for DTop-k processing either apply trivial solutions or rely on domain-specific properties. Motivated by this, we propose efficient and scalable algorithms for DTop-k and DkNN queries, based on novel indexing and query evaluation techniques. Our experiments show that the proposed algorithms outperform previous and baseline solutions by a wide margin.
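To make the query semantics concrete, the following is a minimal brute-force sketch of a durable top-k query. It is not the indexing approach proposed in the paper, only the kind of trivial baseline the abstract alludes to. It assumes that "durable" means an object appears in the snapshot top-k at no fewer than a fraction r of the timestamps in the query interval, in line with the 70 percent example above. The function name `durable_top_k`, its parameters, and the toy data are all hypothetical.

```python
from collections import Counter


def durable_top_k(series, k, t_start, t_end, r):
    """Brute-force durable top-k (illustrative baseline only).

    series: dict mapping object id -> list of values, one per timestamp.
    Returns the ids that rank in the snapshot top-k at no fewer than
    a fraction r of the timestamps in [t_start, t_end] (inclusive).
    """
    n_steps = t_end - t_start + 1
    hits = Counter()
    for t in range(t_start, t_end + 1):
        # Rank objects by value at time t, largest first; ties broken by id.
        ranked = sorted(series, key=lambda oid: (-series[oid][t], oid))
        hits.update(ranked[:k])
    return sorted(oid for oid, count in hits.items() if count >= r * n_steps)


# Toy data: three objects observed at four timestamps (assumed values).
series = {
    "a": [9, 8, 7, 1],
    "b": [5, 9, 8, 9],
    "c": [6, 1, 9, 8],
}
result = durable_top_k(series, k=2, t_start=0, t_end=3, r=0.75)
print(result)  # ['b', 'c']
```

This scan sorts every object at every timestamp in the interval, so its cost grows with both the interval length and the number of objects, which is the overhead that index-based evaluation aims to avoid.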