Mutate() issue - Hierarchical Forecasting #306

Open
deltaz3r0 opened this issue Feb 10, 2021 · 6 comments

@deltaz3r0

Hi, I am reposting this issue on GitHub, with a more complete example, as I suspect it might not be related to the data being used or code mistakes.

I am trying to perform Hierarchical Forecasting on a dataset that is fundamentally structured in the same way as the tourism tsibble referenced in Forecasting: Principles and Practice, but with more hierarchical levels. However, after the structural aggregation, a mutate() error shows up.
The data doesn't contain any missing values.

Below is a reprex containing a minimal version of the data that reproduces the error.

Thanks in advance.

library(fable)
library(dplyr)
library(tsibble)
library(tidyverse)

t_london <- tibble::tribble(
  ~Month,             ~Value.type,   ~LSOA11CD,             ~LSOA11NM,     ~WD19CD,      ~WD19NM,    ~LAD19CD,         ~LAD19NM,           ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
  "2010 Dec", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Jan", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Feb", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     3L,
  "2011 Mar", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Apr", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2011 May", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Jun", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     4L,
  "2011 Jul", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     3L,
  "2011 Aug", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Sep", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2011 Oct", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2011 Nov", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2011 Dec", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     6L
)

t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11CD, Value.type), index = Month)

london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )

fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink"),
  )
#> Warning in max(which(abs(ma) > 1e-08)): no non-missing arguments to max;
#> returning -Inf

#> Warning: 16 errors (1 unique) encountered for base
#> [16] argument must be coercible to non-negative integer

fc <- fit %>%
  forecast(h = 5)
#> Warning: Problem with `mutate()` input `mint`.
#> ℹ diag(.) had 0 or NA entries; non-finite result is doubtful
#> ℹ Input `mint` is `(function (object, ...) ...`.
#> Warning: Problem with `mutate()` input `mint`.
#> ℹ diag(.) had 0 or NA entries; non-finite result is doubtful
#> ℹ Input `mint` is `(function (object, ...) ...`.
#> Error: Problem with `mutate()` input `mint`.
#> x infinite or missing values in 'x'
#> ℹ Input `mint` is `(function (object, ...) ...`.

Created on 2021-02-10 by the reprex package (v0.3.0)

@mitchelloharawild
Member

I am unable to reproduce this issue with the latest versions of the packages. Perhaps try updating to the latest CRAN releases?

library(fable)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
library(tidyverse)

t_london <- tibble::tribble(
  ~Month,             ~Value.type,   ~LSOA11CD,             ~LSOA11NM,     ~WD19CD,      ~WD19NM,    ~LAD19CD,         ~LAD19NM,           ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
  "2010 Dec", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Jan", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Feb", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     3L,
  "2011 Mar", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Apr", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2011 May", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Jun", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     4L,
  "2011 Jul", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     3L,
  "2011 Aug", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     2L,
  "2011 Sep", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2011 Oct", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2011 Nov", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2011 Dec", "Value-Type-1         ", "E01000001", "City of London 001A", "E05009288", "Aldersgate", "E09000001", "City of London", "City Of London", "London", "England",     "UK",     6L
)

t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11CD, Value.type), index = Month)

london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )

fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink"),
  )

fc <- fit %>%
  forecast(h = 5)
fc
#> # A fable: 320 x 12 [1M]
#> # Key:     NTN21NM, Value.type, CNTY21NM, RGN19NM, CTYNM, LAD19NM, WD19NM,
#> #   LSOA11NM, .model [64]
#>    NTN21NM    Value.type CNTY21NM   RGN19NM    CTYNM      LAD19NM    WD19NM    
#>    <chr*>     <chr*>     <chr*>     <chr*>     <chr*>     <chr*>     <chr*>    
#>  1 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  2 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  3 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  4 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  5 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  6 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  7 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  8 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#>  9 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#> 10 UK         Value-Typ… England    London     City Of L… City of L… Aldersgate
#> # … with 310 more rows, and 5 more variables: LSOA11NM <chr*>, .model <chr>,
#> #   Month <mth>, Total <dist>, .mean <dbl>

Created on 2021-02-11 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Ubuntu 20.04.1 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_AU:en                    
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2021-02-11                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version    date       lib source                            
#>  anytime          0.3.9      2020-08-27 [1] CRAN (R 4.0.2)                    
#>  assertthat       0.2.1      2019-03-21 [1] CRAN (R 4.0.2)                    
#>  backports        1.2.1      2020-12-09 [1] CRAN (R 4.0.2)                    
#>  blob             1.2.1      2020-01-20 [1] CRAN (R 4.0.2)                    
#>  broom            0.7.0      2020-07-09 [1] CRAN (R 4.0.2)                    
#>  callr            3.5.1      2020-10-13 [1] CRAN (R 4.0.2)                    
#>  cellranger       1.1.0      2016-07-27 [1] CRAN (R 4.0.2)                    
#>  cli              2.3.0      2021-01-31 [1] CRAN (R 4.0.2)                    
#>  colorspace       2.0-0      2020-11-11 [1] CRAN (R 4.0.2)                    
#>  crayon           1.4.0      2021-01-30 [1] CRAN (R 4.0.2)                    
#>  DBI              1.1.0      2019-12-15 [1] CRAN (R 4.0.2)                    
#>  dbplyr           1.4.4      2020-05-27 [1] CRAN (R 4.0.2)                    
#>  desc             1.2.0      2018-05-01 [1] CRAN (R 4.0.2)                    
#>  devtools         2.3.2      2020-09-18 [1] CRAN (R 4.0.2)                    
#>  digest           0.6.27     2020-10-24 [1] CRAN (R 4.0.2)                    
#>  distributional   0.2.1      2020-10-06 [1] CRAN (R 4.0.2)                    
#>  dplyr          * 1.0.4      2021-02-02 [1] CRAN (R 4.0.2)                    
#>  ellipsis         0.3.1      2020-05-15 [1] CRAN (R 4.0.2)                    
#>  evaluate         0.14       2019-05-28 [1] CRAN (R 4.0.2)                    
#>  fable          * 0.3.0      2021-02-02 [1] local                             
#>  fabletools     * 0.3.0.9000 2021-02-02 [1] local                             
#>  fansi            0.4.2      2021-01-15 [1] CRAN (R 4.0.2)                    
#>  farver           2.0.3      2020-01-16 [1] CRAN (R 4.0.2)                    
#>  feasts           0.1.7      2021-02-08 [1] local                             
#>  forcats        * 0.5.1      2021-01-27 [1] CRAN (R 4.0.2)                    
#>  fs               1.5.0      2020-07-31 [1] CRAN (R 4.0.2)                    
#>  generics         0.1.0      2020-10-31 [1] CRAN (R 4.0.2)                    
#>  ggplot2        * 3.3.3      2020-12-30 [1] CRAN (R 4.0.2)                    
#>  glue             1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                    
#>  gtable           0.3.0      2019-03-25 [1] CRAN (R 4.0.2)                    
#>  haven            2.3.1      2020-06-01 [1] CRAN (R 4.0.2)                    
#>  highr            0.8        2019-03-20 [1] CRAN (R 4.0.2)                    
#>  hms              1.0.0      2021-01-13 [1] CRAN (R 4.0.2)                    
#>  htmltools        0.5.1      2021-01-12 [1] CRAN (R 4.0.2)                    
#>  httr             1.4.2      2020-07-20 [1] CRAN (R 4.0.2)                    
#>  jsonlite         1.7.2      2020-12-09 [1] CRAN (R 4.0.2)                    
#>  knitr            1.30       2020-09-22 [1] CRAN (R 4.0.2)                    
#>  lattice          0.20-41    2020-04-02 [2] CRAN (R 4.0.2)                    
#>  lifecycle        0.2.0      2020-03-06 [1] CRAN (R 4.0.2)                    
#>  lubridate        1.7.9.2    2020-11-13 [1] CRAN (R 4.0.2)                    
#>  magrittr         2.0.1      2020-11-17 [1] CRAN (R 4.0.2)                    
#>  Matrix           1.2-18     2019-11-27 [2] CRAN (R 4.0.2)                    
#>  memoise          1.1.0      2017-04-21 [1] CRAN (R 4.0.2)                    
#>  modelr           0.1.8      2020-05-19 [1] CRAN (R 4.0.2)                    
#>  munsell          0.5.0      2018-06-12 [1] CRAN (R 4.0.2)                    
#>  nlme             3.1-148    2020-05-24 [2] CRAN (R 4.0.2)                    
#>  pillar           1.4.7      2020-11-20 [1] CRAN (R 4.0.2)                    
#>  pkgbuild         1.2.0      2020-12-15 [1] CRAN (R 4.0.2)                    
#>  pkgconfig        2.0.3      2019-09-22 [1] CRAN (R 4.0.2)                    
#>  pkgload          1.1.0      2020-05-29 [1] CRAN (R 4.0.2)                    
#>  prettyunits      1.1.1      2020-01-24 [1] CRAN (R 4.0.2)                    
#>  processx         3.4.5      2020-11-30 [1] CRAN (R 4.0.2)                    
#>  progressr        0.7.0      2020-12-11 [1] CRAN (R 4.0.2)                    
#>  ps               1.5.0      2020-12-05 [1] CRAN (R 4.0.2)                    
#>  purrr          * 0.3.4      2020-04-17 [1] CRAN (R 4.0.2)                    
#>  R6               2.5.0      2020-10-28 [1] CRAN (R 4.0.2)                    
#>  Rcpp             1.0.6      2021-01-15 [1] CRAN (R 4.0.2)                    
#>  readr          * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                    
#>  readxl           1.3.1      2019-03-13 [1] CRAN (R 4.0.2)                    
#>  remotes          2.2.0      2020-07-21 [1] CRAN (R 4.0.2)                    
#>  reprex           0.3.0      2019-05-16 [1] CRAN (R 4.0.2)                    
#>  rlang            0.4.10     2020-12-30 [1] CRAN (R 4.0.2)                    
#>  rmarkdown        2.6        2020-12-14 [1] CRAN (R 4.0.2)                    
#>  rprojroot        2.0.2      2020-11-15 [1] CRAN (R 4.0.2)                    
#>  rvest            0.3.6      2020-07-25 [1] CRAN (R 4.0.2)                    
#>  scales           1.1.1      2020-05-11 [1] CRAN (R 4.0.2)                    
#>  sessioninfo      1.1.1      2018-11-05 [1] CRAN (R 4.0.2)                    
#>  stringi          1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                    
#>  stringr        * 1.4.0      2019-02-10 [1] CRAN (R 4.0.2)                    
#>  testthat         3.0.1      2020-12-17 [1] CRAN (R 4.0.2)                    
#>  tibble         * 3.0.6      2021-01-29 [1] CRAN (R 4.0.2)                    
#>  tidyr          * 1.1.2      2020-08-27 [1] CRAN (R 4.0.2)                    
#>  tidyselect       1.1.0      2020-05-11 [1] CRAN (R 4.0.2)                    
#>  tidyverse      * 1.3.0      2019-11-21 [1] CRAN (R 4.0.2)                    
#>  tsibble        * 1.0.0      2021-02-05 [1] Github (tidyverts/tsibble@722cc86)
#>  urca             1.3-0      2016-09-06 [1] CRAN (R 4.0.2)                    
#>  usethis          1.6.3      2020-09-17 [1] CRAN (R 4.0.2)                    
#>  utf8             1.1.4      2018-05-24 [1] CRAN (R 4.0.2)                    
#>  vctrs            0.3.6      2020-12-17 [1] CRAN (R 4.0.2)                    
#>  withr            2.4.1      2021-01-26 [1] CRAN (R 4.0.2)                    
#>  xfun             0.20       2021-01-06 [1] CRAN (R 4.0.2)                    
#>  xml2             1.3.2      2020-04-23 [1] CRAN (R 4.0.2)                    
#>  yaml             2.2.1      2020-02-01 [1] CRAN (R 4.0.2)                    
#> 
#> [1] /home/mitchell/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /opt/R/4.0.0/lib/R/library

@deltaz3r0
Author

Great, the issue is solved with the latest version of the packages. Thank you very much!

@deltaz3r0 deltaz3r0 reopened this Feb 12, 2021
@deltaz3r0
Author

deltaz3r0 commented Feb 12, 2021

The problem reappeared when I used the whole dataset.
I tried to capture some of the rows that seem to be involved; they are in the new reprex below. All packages used are at their latest versions.

library(fable)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tsibble)
library(tidyverse)

t_london <- tibble::tribble(
  ~Month,   ~Value.type,             ~LSOA11NM,     ~WD19CD,             ~WD19NM,         ~LAD19NM,           ~CTYNM, ~RGN19NM, ~CNTY21NM, ~NTN21NM, ~Count,
  "2016 Dec", "Value-Type2", "City of London 001A", "E05009288",        "Aldersgate", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2017 Jan", "Value-Type2", "City of London 001A", "E05009288",        "Aldersgate", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2016 Dec", "Value-Type2", "City of London 001B", "E05009302",       "Cripplegate", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2017 Jan", "Value-Type2", "City of London 001B", "E05009302",       "Cripplegate", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2016 Dec", "Value-Type2", "City of London 001C", "E05009302",       "Cripplegate", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2017 Jan", "Value-Type2", "City of London 001C", "E05009302",       "Cripplegate", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2016 Dec", "Value-Type2", "City of London 001E", "E05009308",         "Portsoken", "City of London", "City Of London", "London", "England",     "UK",     0L,
  "2017 Jan", "Value-Type2", "City of London 001E", "E05009308",         "Portsoken", "City of London", "City Of London", "London", "England",     "UK",     1L,
  "2016 Dec", "Value-Type2", "City of London 001F", "E05009311",            "Vintry", "City of London", "City Of London", "London", "England",     "UK",    54L,
  "2017 Jan", "Value-Type2", "City of London 001F", "E05009311",            "Vintry", "City of London", "City Of London", "London", "England",     "UK",    62L,
  "2016 Dec", "Value-Type2", "City of London 001G", "E05009304", "Farringdon Within", "City of London", "City Of London", "London", "England",     "UK",    12L,
  "2017 Jan", "Value-Type2", "City of London 001G", "E05009304", "Farringdon Within", "City of London", "City Of London", "London", "England",     "UK",     9L
)

t_london <- t_london %>%
  mutate(Month = yearmonth(Month)) %>%
  as_tsibble(key = c(LSOA11NM, Value.type), index = Month)

london_full <- t_london %>%
  aggregate_key(
    (NTN21NM / CNTY21NM / RGN19NM / CTYNM / LAD19NM / WD19NM / LSOA11NM) * Value.type,
    Total = sum(Count)
  )

fit <- london_full %>%
  model(base = ARIMA(Total)) %>%
  reconcile(
    bu = bottom_up(base),
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink"),
  )
#> Warning: 6 errors (1 unique) encountered for base
#> [6] missing value where TRUE/FALSE needed

fc <- fit %>%
  forecast(h = 1)
#> Warning in cov2cor(covm): diag(.) had 0 or NA entries; non-finite result is
#> doubtful
#> Warning in cov2cor(tar): diag(.) had 0 or NA entries; non-finite result is
#> doubtful
#> Error: Problem with `mutate()` input `mint`.
#> x infinite or missing values in 'x'
#> ℹ Input `mint` is `(function (object, ...) ...`.

Created on 2021-02-12 by the reprex package (v1.0.0)
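
Editorial note: the `cov2cor()` warnings above can be reproduced in isolation with base R. The sketch below is only an illustration and makes no claim about what `min_trace()` does internally; the matrix `res` is hypothetical stand-in data, with one column that is always zero (as several series in this reprex are). A zero variance on the diagonal of the covariance matrix gives non-finite correlations, which is the kind of input that could lead to "infinite or missing values in 'x'" further downstream.

```r
# Hypothetical residual-like matrix: the first column never varies.
res <- cbind(
  always_zero = c(0, 0, 0, 0),
  varying     = c(1.5, -0.5, 0.25, -1.25)
)

# Sample covariance; the constant column has exactly zero variance.
covm <- cov(res)
zero_var <- diag(covm) == 0

# cov2cor() divides by sqrt(diag(covm)), so a zero variance yields NaN
# entries and the "diag(.) had 0 or NA entries" warning (suppressed here).
corm <- suppressWarnings(cov2cor(covm))
has_non_finite <- any(!is.finite(corm))

print(zero_var)
print(has_non_finite)
```

Checking `diag(cov(...))` for zeros or `NA` before reconciliation is one way to spot series that are likely to cause this.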

@wdzhy123

I get the same issue. Are there any updates on this?

@slava-keshkov

Are there any updates on this, @mitchelloharawild?

Thanks in advance!

@mitchelloharawild
Member

Hi, please provide a minimal reproducible example.
I've just tried reproducing the example above. It fails because the ARIMA models are trained on just 2 observations per series; more data is required to produce sensible output.
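
Editorial note: a quick way to check for this before fitting is to count the observations in each series. The sketch below is not from the thread; it reuses a few rows from the second reprex, and the threshold `min_obs` is a hypothetical value chosen only for illustration, not a documented requirement of `ARIMA()`.

```r
library(dplyr)

# A few rows copied from the second reprex (two months per series).
counts <- tibble::tribble(
  ~Month,     ~LSOA11NM,             ~Count,
  "2016 Dec", "City of London 001A",  0L,
  "2017 Jan", "City of London 001A",  0L,
  "2016 Dec", "City of London 001F", 54L,
  "2017 Jan", "City of London 001F", 62L
)

# Hypothetical minimum series length, for illustration only.
min_obs <- 24

# Number of observations and of distinct values per bottom-level series.
series_summary <- counts %>%
  group_by(LSOA11NM) %>%
  summarise(
    n_obs = n(),
    n_distinct_values = n_distinct(Count),
    .groups = "drop"
  )

# Series that are shorter than the chosen threshold.
too_short <- series_summary %>% filter(n_obs < min_obs)
print(too_short)
```

Here both series are flagged, and `n_distinct_values` also shows that one of them is constant, which is worth knowing before fitting any model to it.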
