2023 IEEE International Conference on Data Mining (ICDM)

Abstract

Forecasting extreme values in time series is an important but challenging problem because extreme values are rarely observed, even when a large amount of historical data is available. Modeling extreme values requires a specific focus on estimating the tail distribution of the time series, whose statistical properties may differ from the distribution of its non-extreme values. To overcome this challenge, we present a novel self-supervised learning framework, SimEXT, to learn a robust representation of the time series that preserves the fidelity of its tail distribution. The framework combines contrastive learning with a reconstruction-based autoencoder architecture to facilitate robust representation learning of the temporal patterns associated with extreme events. SimEXT also incorporates a wavelet-based data augmentation technique with a distribution-based loss function to prioritize learning the extreme value distribution. We provide probabilistic guarantees on the wavelet-based augmentation that enable the wavelet coefficients to be perturbed during data augmentation without significantly altering the extreme values of the time series. Experimental results on real-world datasets show that SimEXT can effectively learn a robust representation of the time series that boosts the performance of downstream tasks for forecasting block maxima values.
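The idea of perturbing wavelet coefficients while leaving block maxima nearly unchanged can be illustrated with a minimal sketch. This is not the SimEXT augmentation itself: the one-level Haar transform, the Gaussian noise on the detail coefficients, the noise scale, the block size, and the synthetic Gumbel series are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def block_maxima(x, block_size):
    """Return the maximum of each consecutive non-overlapping block of x."""
    n_blocks = len(x) // block_size
    return x[: n_blocks * block_size].reshape(n_blocks, block_size).max(axis=1)

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length signal."""
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def augment(x, scale, rng):
    """Assumed augmentation: add small Gaussian noise to the detail coefficients."""
    approx, detail = haar_forward(x)
    noisy_detail = detail + rng.normal(0.0, scale, size=detail.shape)
    return haar_inverse(approx, noisy_detail)

rng = np.random.default_rng(0)
series = rng.gumbel(size=512)            # synthetic heavy-ish tailed series (assumption)
augmented = augment(series, scale=0.05, rng=rng)

original_maxima = block_maxima(series, block_size=64)
augmented_maxima = block_maxima(augmented, block_size=64)

# Largest change in any block maximum caused by the perturbation.
max_shift = np.abs(original_maxima - augmented_maxima).max()
print("largest block-maximum shift:", max_shift)
```

Because the shift in any block maximum is bounded by the largest per-sample perturbation, a small noise scale on the coefficients keeps the block maxima close to their original values in this toy setting.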
