Vangumalli, D., Nikolopoulos, K. and Litsiou, K. (2021) 'Aggregate selection, individual selection, and cluster selection: an empirical evaluation and implications for systems research', Cybernetics & Systems: An International Journal, 52 (7), pp. 553-578.
Abstract
When forecasting a large number of time series, data analysts regularly employ one of two methodological approaches: either select a single forecasting method for the entire dataset (aggregate selection), or use the best forecasting method for each time series (individual selection). There is evidence in the predictive analytics literature that the former is more robust than the latter, because individual selection tends to overfit models to the data. A third approach is to first identify homogeneous clusters within the dataset, and then select a single forecasting method for each cluster (cluster selection). To that end, we examine three machine learning clustering methods: k-medoids, k-NN and random forests. The evaluation is performed on the 645 yearly series of the M3 competition. The empirical evidence suggests that (a) random forests provide the best clusters for the sequential forecasting task, and (b) cluster selection has the potential to outperform aggregate selection.
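The three selection strategies described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two candidate methods (`naive`, `average`), the toy series, the single held-out point used to score each method, and the hand-assigned cluster `labels`, which stand in for the output of a clustering step such as k-medoids or random forests.

```python
from statistics import mean

# Two simple candidate forecasting methods (illustrative, not the paper's pool).
def naive(history):
    """Forecast the next value as the last observed value."""
    return history[-1]

def average(history):
    """Forecast the next value as the mean of the history."""
    return mean(history)

METHODS = {"naive": naive, "average": average}

def validation_errors(series):
    """Absolute one-step error of each method, holding out the last point."""
    history, actual = series[:-1], series[-1]
    return {name: abs(f(history) - actual) for name, f in METHODS.items()}

def best_method(error_dicts):
    """Name of the method with the lowest total error over a group of series."""
    totals = {name: sum(e[name] for e in error_dicts) for name in METHODS}
    return min(totals, key=totals.get)

def aggregate_selection(series_list):
    """One method for the whole dataset."""
    winner = best_method([validation_errors(s) for s in series_list])
    return [winner] * len(series_list)

def individual_selection(series_list):
    """The best method for each series on its own."""
    return [best_method([validation_errors(s)]) for s in series_list]

def cluster_selection(series_list, labels):
    """One method per cluster; labels are assumed to come from a prior clustering step."""
    errors = [validation_errors(s) for s in series_list]
    chosen = {}
    for label in set(labels):
        group = [e for e, l in zip(errors, labels) if l == label]
        chosen[label] = best_method(group)
    return [chosen[l] for l in labels]

# Toy data: two trending series and two oscillating series.
series = [
    [1, 2, 3, 4, 5, 6],
    [10, 12, 14, 16, 18, 20],
    [5, 7, 5, 7, 5, 6],
    [3, 1, 3, 1, 3, 2],
]
labels = [0, 0, 1, 1]  # hypothetical cluster assignments

print(aggregate_selection(series))
print(individual_selection(series))
print(cluster_selection(series, labels))
```

In this toy setting aggregate selection applies one method everywhere, while cluster selection can assign a different method to the oscillating group. A realistic comparison would score the selected methods on a separate out-of-sample period rather than on the same point used for selection.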
Item Type: | Article |
---|---|
Full text: | (AM) Accepted Manuscript. Available under License - Creative Commons Attribution Non-commercial 4.0. |
Status: | Peer-reviewed |
Publisher Web site: | https://doi.org/10.1080/01969722.2021.1902049 |
Publisher statement: | This is an Accepted Manuscript version of the following article, accepted for publication in Cybernetics & Systems. Vangumalli, D., Nikolopoulos, K. & Litsiou, K. (2021). Aggregate selection, individual selection, and cluster selection: an empirical evaluation and implications for systems research. Cybernetics & Systems: An International Journal 52(7): 553-578. It is deposited under the terms of the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. |
Date accepted: | 08 March 2021 |
Date deposited: | 10 March 2021 |
Date of first online publication: | 14 June 2021 |
Date first made open access: | 14 June 2022 |