Reconciliation of forecasts in stretched crossvalidation #305

Closed
henningsway opened this issue Feb 3, 2021 · 7 comments

@henningsway

I've recently been working regularly with fable, and the package has been a joy to use. Thank you!

I currently would like to use tsibble::stretch_tsibble (is there an alternative? I'm wondering about the "questioning" lifecycle tag) to evaluate reconciled forecasts. It seems to be working for sliding windows, but for stretched tsibbles I run into an error.

Please find a reproducible example below:

library(tidyverse)
library(tsibble)
library(fable)
#> Loading required package: fabletools


tourism_hts <- tourism %>%
  aggregate_key(State / Region, Trips = sum(Trips))

# reconciliation with sliding window - works
fc_slided <- tourism_hts %>% 
  filter(State == "Tasmania") %>% 
  slide_tsibble(.step = 8, .size = 60) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_rec = min_trace(ets)) %>%
  forecast(h = 4)

# reconciliation with stretched window - doesn't work
fc_stretched <- tourism_hts %>%
  filter(State == "Tasmania") %>% 
  stretch_tsibble(.step = 8, .init = 60) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_rec = min_trace(ets)) %>%
  forecast(h = 4)
#> Error: Problem with `mutate()` input `ets_rec`.
#> x Error in evaluating the argument 'x' in selecting a method for function 'as.matrix': Join columns must be present in data.
#> x Problem with `date`.
#> i Input `ets_rec` is `(function (object, ...) ...`.

Created on 2021-02-03 by the reprex package (v0.3.0)

@henningsway changed the title from "Reconciliation of forecasts in stretched crossvalidation raises error" to "Reconciliation of forecasts in stretched crossvalidation" on Feb 3, 2021
@mitchelloharawild (Member)

This specific error was fixed in 683e8a9; however, reconciling cross-validated forecasts is not yet possible.

This is because the key variable used to identify the cross-validation fold becomes part of the hierarchy. As there is no <aggregated> value for these folds (which is appropriate), this produces 'disjoint' hierarchies, where each branch (or fold) should be reconciled separately.

The relevant issue for this is here: #106
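
Until that lands, here is a minimal sketch of a per-fold workaround, assuming the tourism_hts reprex above (the fold origins are illustrative, chosen to mirror stretch_tsibble(.step = 8, .init = 60)). Each training window is built by filtering on the index directly, so no fold id ever enters the key structure, and each fold is modelled, reconciled and forecast on its own:

tasmania <- tourism_hts %>%
  filter(State == "Tasmania")

# Expanding-window fold ends: quarter 60 of the data (2012 Q4), then every
# 8 quarters, mirroring stretch_tsibble(.step = 8, .init = 60)
origins <- yearquarter("2012 Q4") + seq(0, 16, by = 8)

fc_folds <- purrr::map(origins, function(origin) {
  tasmania %>%
    filter(Quarter <= origin) %>%            # training window for this fold
    model(ets = ETS(Trips)) %>%
    reconcile(ets_rec = min_trace(ets)) %>%
    forecast(h = 4)
})

Each element of fc_folds is then an ordinary reconciled fable, so the usual accuracy() workflow applies per fold.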

@claudiolaas

Hi Mitchell, I am working with @henningsway on this and we thought of a workaround: iterate over the chunks that stretch_tsibble or slide_tsibble generate and do the model-reconcile-accuracy step on each chunk individually, then average the error metrics over all chunks to get an overall error metric.

However, not all error metrics come out correctly under this scheme. Some examples: ME, MAPE and CRPS average out to the correct overall value, but MASE and RMSSE do not. We suspect one would have to pool the residuals of the chunks and then calculate the overall error metrics, rather than calculating the error metrics for each chunk and then averaging.

Or in other words, how exactly do the forecasts of a stretched tsibble get combined to arrive at one overall accuracy measure?

@mitchelloharawild (Member)

Could you elaborate on why you think the MASE and RMSSE error metrics are not accurate? Perhaps there is a problem or confusion about the scaling of these accuracy measures.

When forecasting a stretched tsibble, you will get separate forecasts for each fold of the tsibble. From there, you can compute a set of accuracy() measures for the forecast errors using the test set. Typically these accuracy measures would be summarised into a single value (across the folds of the stretched tsibble) using a mean or median.
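
For a single series (leaving reconciliation aside), a minimal sketch of that workflow, assuming the tourism data used elsewhere in this thread; the by grouping passed to accuracy() is my assumption about how to get one row per fold, not something stated above:

adelaide <- tourism %>%
  filter(Region == "Adelaide", Purpose == "Business")

cv_fc <- adelaide %>%
  stretch_tsibble(.step = 8, .init = 60) %>%
  model(ets = ETS(Trips)) %>%
  forecast(h = 1)

cv_fc %>%
  accuracy(adelaide, by = c(".id", ".model")) %>%  # one accuracy row per fold
  group_by(.model) %>%
  summarise(across(c(ME, RMSE, MAE), mean))        # mean across folds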

@claudiolaas

> Typically these accuracy measures would be summarised into a single value (across the folds of the stretched tsibble) using a mean or median.

This is exactly what we tried, but it appears that with stretch or slide some measures combine differently; see the example below.

# just one time series
test_data <- tourism %>%
  filter(Region == "Adelaide",
         State == "South Australia",
         Purpose == "Business")

# create two non-overlapping chunks of 39 rows each
fc_slide <- test_data %>% 
  slide_tsibble(.step = 39, .size = 39) %>% 
  model(ets = ETS(Trips)) %>% 
  forecast(h = 1)  %>% 
  accuracy(test_data)

# first 39 rows
fc_1 <- test_data %>%
  filter(Quarter < yearquarter("2007 Q4")) %>% 
  model(ets = ETS(Trips)) %>%
  forecast(h = 1) %>% 
  accuracy(test_data)


# second 39 rows
fc_2 <- test_data %>%
  filter(Quarter >= yearquarter("2007 Q4"),
         Quarter <= yearquarter("2017 Q2")) %>% 
  model(ets = ETS(Trips)) %>%
  forecast(h = 1) %>% 
  accuracy(test_data)


(fc_1$ME + fc_2$ME)/2 == fc_slide$ME # --> TRUE

(fc_1$RMSSE + fc_2$RMSSE)/2 == fc_slide$RMSSE # --> FALSE

@mitchelloharawild (Member)

@robjhyndman I think I asked you about this before, but I couldn't find the answer. When computing scaled accuracy measures over folds of a cross-validated dataset, is it more appropriate to use the same scaling factor or a scaling factor specific to each fold?

@robjhyndman (Member)

I would use the same scaling factor computed over the whole data set. Otherwise it just adds another source of variability.
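
A hand-rolled illustration of why this matters (illustrative error values, not fabletools internals): MASE divides the mean absolute forecast error by a scaling factor Q, the in-sample mean absolute error of the seasonal naive method. With a single Q computed from the whole series, per-fold MASE values average back to the overall MASE; fold-specific scaling factors break that identity, which is what the fc_slide comparison above runs into.

y <- tourism %>%
  filter(Region == "Adelaide", Purpose == "Business") %>%
  pull(Trips)

m <- 4                                  # quarterly seasonal period
Q <- mean(abs(diff(y, lag = m)))        # whole-series seasonal naive in-sample MAE

e <- c(5.2, -3.1)                       # illustrative one-step errors, one per fold

# One common Q: the mean of the per-fold MASEs equals the overall MASE
all.equal(mean(abs(e) / Q), mean(abs(e)) / Q)    # TRUE

# Fold-specific scaling factors generally do not average back
Q1 <- mean(abs(diff(y[1:39],  lag = m)))
Q2 <- mean(abs(diff(y[40:78], lag = m)))
mean(abs(e) / c(Q1, Q2))                         # != mean(abs(e)) / Q in general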

@mitchelloharawild (Member)

Closing, as the scaling factor used in accuracy() is the more appropriate one, and reconciliation of cross-validated models will be tracked in #106.
