Evaluation Procedures for Forecasting with Spatiotemporal Data

Abstract

The increasing use of sensor networks has led to an ever larger number of available spatiotemporal datasets. Forecasting applications using this type of data are frequently motivated by important domains such as environmental monitoring. Properly assessing the performance of different forecasting approaches is fundamental to progress. However, traditional performance estimation procedures, such as cross-validation, face challenges due to the implicit dependence between observations in spatiotemporal datasets. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures, using both artificially generated and real-world spatiotemporal datasets. Our results show that both CV and OOS provide useful estimates, but they suggest that blocking data in space and/or in time may help mitigate CV’s tendency to underestimate error. Overall, our study shows the importance of considering data dependencies when estimating the performance of spatiotemporal forecasting models.
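
To make the distinction between the two families of procedures concrete, the following is a minimal sketch, not the exact procedures evaluated in the paper. It assumes each observation carries an integer time stamp; the function names `time_block_cv` and `oos_holdout` are hypothetical and chosen only for illustration. The first builds CV folds from contiguous blocks of time, and the second builds a single OOS split that trains on the past and tests on the future.

```python
from typing import List, Tuple

Split = Tuple[List[int], List[int]]  # (train row indices, test row indices)


def time_block_cv(times: List[int], n_folds: int) -> List[Split]:
    """Cross-validation where each fold is one contiguous block of time.

    All observations sharing a time stamp fall in the same fold, so test
    rows are never interleaved in time with training rows.
    """
    unique_times = sorted(set(times))
    if not 2 <= n_folds <= len(unique_times):
        raise ValueError("n_folds must be between 2 and the number of time stamps")
    # Map each distinct time stamp to one of n_folds contiguous blocks.
    block_of = {
        t: (rank * n_folds) // len(unique_times)
        for rank, t in enumerate(unique_times)
    }
    splits = []
    for k in range(n_folds):
        test = [i for i, t in enumerate(times) if block_of[t] == k]
        train = [i for i, t in enumerate(times) if block_of[t] != k]
        splits.append((train, test))
    return splits


def oos_holdout(times: List[int], train_frac: float = 0.8) -> Split:
    """Out-of-sample split: train on the earliest time stamps, test on the rest."""
    unique_times = sorted(set(times))
    cut = int(len(unique_times) * train_frac)
    train_times = set(unique_times[:cut])
    train = [i for i, t in enumerate(times) if t in train_times]
    test = [i for i, t in enumerate(times) if t not in train_times]
    return train, test


# Toy data: two sensor locations observed at six time steps each.
times = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
cv_splits = time_block_cv(times, n_folds=3)
oos_train, oos_test = oos_holdout(times, train_frac=0.5)
```

In the blocked CV variant, training rows may come from both before and after the test block, whereas the OOS split always respects temporal order. A spatial analogue could group observations by location instead of by time stamp.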

Publication
Mathematics

This journal article is an extension of our conference paper:

Oliveira M, Torgo L, Santos Costa V. “Evaluation Procedures for Forecasting with Spatio-Temporal Data.” In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD (pp. 703–718). Springer, Cham, 2018. doi: 10.1007/978-3-030-10925-7_43

Mariana Oliveira
Post-doctoral Fellow

Mariana Oliveira is a post-doctoral fellow at Dalhousie University, Faculty of Computer Science. Her research focuses on Machine Learning and Data Mining.