can reconcile() be used for distributional forecasts #334

alexhallam · 2021-11-08T13:57:53Z

I would like to know if reconcile() can be used with distributional forecasts.

As a reproducible example I took the m5 data and generated naive forecasts for total, ca, wi, and tx. The total is the sum of the three states. I generated 8000 samples over a 10 day forecast horizon. The data set can be grabbed with the following.

wget https://storage.googleapis.com/data_xvzf/m5_hierarchical_forecast.csv

This is the head

      ds         rep y_sim_total y_sim_ca y_sim_tx y_sim_wi 
1     2016-05-23 0   42171.      15851    13205.   13970.   
2     2016-05-23 1   30535.      13646.   13284.   13211.   
3     2016-05-23 2   44897.      16215.   13132.   12586.   
4     2016-05-23 3   39906.      16860.   11361.   13291.   
5     2016-05-23 4   47499.      18238.   13715.   13108.   
6     2016-05-23 5   46119.      18963.   9978.    11286.   
7     2016-05-23 6   46307.      18532.   13197.   12959.   
8     2016-05-23 7   35262.      18930.   11556.   9308.    
9     2016-05-23 8   43877.      18618.   11755.   14966.

Can I use the reconcile() function to reconcile these forecasts to the total?

The text was updated successfully, but these errors were encountered:

robjhyndman · 2021-11-08T21:23:26Z

Almost. You need to remove the total column and set up the data as a tsibble with state as the key. Then generate the total using aggregate_key(). There is an example at https://otexts.com/fpp3/prison.html.

The distributions are reconciled as described here: https://otexts.com/fpp3/rec-prob.html

alexhallam · 2021-11-08T22:03:40Z

Thanks for the quick response!

I think I see where I am getting hung up. I am starting with the output from probabilistic forecasts (downstream from where the example starts). It looks like the bottom_up() function expects a mable model. Do you recommend that users make their own mable object in this case? Maybe I can make a simple model based on lm(y_sim ~ 1).

fit <- prison_gts %>%
  filter(year(Quarter) <= 2014) %>%
  model(base = ETS(Count)) %>%
  reconcile(
    bottom_up = bottom_up(base),  # <--- needs a mable object
    MinT = min_trace(base, method = "mint_shrink")
  )

This is as far as I can get

df <- readr::read_csv("https://storage.googleapis.com/data_xvzf/m5_hierarchical_forecast.csv") %>% 
  select(-y_sim_total) %>%
  pivot_longer(cols = y_sim_ca:y_sim_wi, names_to = "state", values_to = "y_sim") %>% 
  as_tsibble(index = ds, key = c(rep, state)) %>% 
  aggregate_key(total = mean(y_sim)) %>% 
  reconcile(
    bottom_up = bottom_up(),  # <-- missing a mable object
    MinT = min_trace(method = "mint_shrink")
  )

mitchelloharawild · 2021-11-08T22:09:50Z

Currently reconcile() only works in conjunction with the forecast() function, but this is something I'm looking at changing.
The complexity here is that you'll need to provide your own weighting matrix, as you no longer have access to the model's training residuals. Some weighting matrices (such as OLS and structural scaling) do not use the model's residuals, and will be easier to use.

At minimum, you'll need to convert your samples into a distribution using dist_sample(). This should be done by summarising over the replicates (grouping by ds and state).

From there you can achieve bottom up reconciliation by summing the distribution column in aggregate_key(). Reconciling optimally currently would require writing your own matrix algebra, but again I'd like to improve this.

alexhallam · 2021-11-09T01:03:04Z

I see what you are saying. Thanks for the info!

alexhallam · 2021-11-09T14:20:27Z

Thanks again for the help.

I just wanted to codify what you said above. This is where things are at the moment.

df <- readr::read_csv("https://storage.googleapis.com/data_xvzf/m5_hierarchical_forecast.csv") %>% 
  dplyr::select(-y_sim_total) %>%
  tidyr::pivot_longer(cols = y_sim_ca:y_sim_wi, names_to = "state", values_to = "y_sim") %>% 
  dplyr::group_by(ds, state) %>% 
  dplyr::summarise(dist = distributional::dist_sample(list(y_sim))) %>% 
  as_tsibble(index=ds, key = state) %>% 
  aggregate_key(total = sum(dist)) # <- error (I am not sure what to put here when my column is a dist object)

mitchelloharawild self-assigned this Nov 8, 2021

mitchelloharawild added the enhancement New feature or request label Nov 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

can reconcile() be used for distributional forecasts #334

can reconcile() be used for distributional forecasts #334

alexhallam commented Nov 8, 2021 •

edited

Loading

robjhyndman commented Nov 8, 2021

alexhallam commented Nov 8, 2021

mitchelloharawild commented Nov 8, 2021

alexhallam commented Nov 9, 2021

alexhallam commented Nov 9, 2021

can reconcile() be used for distributional forecasts #334

can reconcile() be used for distributional forecasts #334

Comments

alexhallam commented Nov 8, 2021 • edited Loading

robjhyndman commented Nov 8, 2021

alexhallam commented Nov 8, 2021

mitchelloharawild commented Nov 8, 2021

alexhallam commented Nov 9, 2021

alexhallam commented Nov 9, 2021

alexhallam commented Nov 8, 2021 •

edited

Loading