Biased Resampling Strategies for Imbalanced Spatio-Temporal Forecasting

Abstract

Extreme and rare events, such as abnormal spikes in air pollution or weather conditions, can have serious repercussions. Many such events arise from spatio-temporal processes, and accurate predictions are a valuable tool for addressing their impact in a timely manner. In this paper, we propose a new set of resampling strategies for imbalanced spatio-temporal forecasting tasks that introduce bias into formerly random processes. This spatio-temporal bias includes a hyper-parameter that regulates the relative importance of the temporal and spatial dimensions in the selection of observations during under- or over-sampling. We test and compare our proposals against standard versions of the strategies on 10 different geo-referenced numeric time series, using 3 distinct off-the-shelf learning algorithms. Experimental results show that our proposal provides an advantage over random resampling strategies in imbalanced spatio-temporal forecasting tasks. We also find that valuing an observation’s recency is more useful when over-sampling, while valuing its spatial distance to other cases with extreme values is more beneficial when under-sampling.
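
The idea of a single hyper-parameter that trades off temporal against spatial bias can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the linear blend controlled by `alpha`, the min-max scaling of both scores, and the choice to let the spatial score grow with distance to the nearest extreme case are all assumptions made for illustration.

```python
import math
import random


def spatio_temporal_weights(times, coords, extreme_coords, alpha=0.5):
    """Blend a temporal and a spatial score into one weight per case.

    alpha = 1 uses only recency; alpha = 0 uses only spatial distance.
    Both scores and the linear blend are illustrative assumptions.
    """
    t_min, t_max = min(times), max(times)
    span = t_max - t_min
    # Recency score in [0, 1]: the most recent case scores 1.
    recency = [(t - t_min) / span if span > 0 else 1.0 for t in times]

    # Distance from each case to its nearest extreme-valued case,
    # rescaled to [0, 1] by the largest such distance.
    nearest = [min(math.dist(c, e) for e in extreme_coords) for c in coords]
    d_max = max(nearest)
    spatial = [d / d_max if d_max > 0 else 1.0 for d in nearest]

    return [alpha * r + (1.0 - alpha) * s for r, s in zip(recency, spatial)]


def biased_oversample(indices, weights, n, rng, eps=1e-9):
    """Draw n indices with replacement, proportionally to weight."""
    return rng.choices(indices, weights=[w + eps for w in weights], k=n)


def biased_undersample(indices, weights, n, rng, eps=1e-9):
    """Keep n distinct indices, favouring high weights.

    Uses Efraimidis-Spirakis keys (u ** (1 / w)) for weighted sampling
    without replacement; eps avoids division by zero for zero weights.
    """
    keyed = [(rng.random() ** (1.0 / (w + eps)), i)
             for i, w in zip(indices, weights)]
    return [i for _, i in sorted(keyed, reverse=True)[:n]]
```

Under these assumptions, setting `alpha` close to 1 makes selection depend almost entirely on recency, while values close to 0 make it depend almost entirely on spatial distance. Passing a seeded `random.Random` instance as `rng` keeps the resampling reproducible.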

Publication
Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA)

This conference paper was extended and published as a journal article.

Mariana Oliveira
Post-doctoral Fellow

Mariana Oliveira is a post-doctoral fellow at Dalhousie University, Faculty of Computer Science. Her research focuses on Machine Learning and Data Mining.