Add check_residuals() function #105

Open
robjhyndman opened this issue Aug 7, 2019 · 12 comments

@robjhyndman
Member

Essentially a wrapper around:

  augment(model) %>%
    features(.resid, ljung_box, lag = <10 or 2*period>, dof = <from model>)
@robjhyndman
Member Author

Or maybe this should be called test_residuals().

@mitchelloharawild
Member

Certainly possible. This would also require models to add the dof to the glance() output or similar (much like forecast:::modeldf()). The period can be determined from the tsibble.

I think the interface needs more thought to ensure that a consistent and general interface is preserved throughout the package.

  • Checking/testing the residuals would often involve more than just the Ljung-Box test. Should there be a tag which is used for model testing?
  • How would users specify the tests that they are interested in?
  • What other parameters from the model fit may be useful when computing a feature?
  • Are there other functions which should wrap features()? Should this be common practice?

@robjhyndman
Member Author

Adding dof to the glance() output seems like a good idea in any case.
Yes, making it more general might be a good idea, although the textbook almost always uses Ljung-Box (LB). The exception is regression models, where LB is controversial and Breusch-Godfrey is sometimes preferred.
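
A minimal sketch of the wrapper discussed so far, for illustration only. The name check_residuals_sketch() is hypothetical and not part of fabletools, and because models do not yet report their degrees of freedom, dof is an assumption here: the caller passes it by hand. The sketch also assumes that feasts::ljung_box() accepts lag and dof, as in the snippet that opens this issue.

```r
library(fabletools)
library(feasts)

# Hypothetical helper (not a fabletools function): runs a Ljung-Box test
# on the residuals of every model in a mable.
# `dof` must be supplied by the caller; nothing is read from the model.
check_residuals_sketch <- function(mbl, lag = 10, dof = 0) {
  resids <- augment(mbl)  # one row per observation, with a .resid column
  features(resids, .resid, ljung_box, lag = lag, dof = dof)
}
```

With a mable named fit, a call such as check_residuals_sketch(fit, lag = 8, dof = 0) should return one row per key and model, with lb_stat and lb_pvalue columns. A single dof argument applies the same value to every model, which is exactly the limitation that motivates putting dof in the glance() output.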

@mitchelloharawild mitchelloharawild added this to the v0.2.0 milestone Aug 7, 2019
@mitchelloharawild mitchelloharawild removed this from the v0.2.0 milestone May 27, 2020
@mbg-unsw

This would be great for mables where each model has a different dof.

@mitchelloharawild
Member

I've now added (experimentally) hypothesize() methods in fabletools (0f3c42f6c6e1aa837de3ca5385447387bbdc1f48) for running statistical tests on fitted models. It is very similar to features(), but more oriented to computing tests on fitted models, rather than features on data. Note that tests can be features, but not the other way round.
Note that hypothesise() will be available once r-lib/generics#55 is resolved. As an example of how this function works, I have also added breusch_godfrey() in tidyverts/fable@0f3c42f which can be used as follows:

library(fpp3)
tourism %>% 
  model(TSLM(Trips ~ trend() + season())) %>% 
  hypothesize(tests = lst(breusch_godfrey), order = 24)
#> # A tibble: 304 x 9
#>    Region   State    Purpose .model     .test  statistic order null_dist p.value
#>    <chr>    <chr>    <chr>   <chr>      <chr>      <dbl> <int>    <dist>   <dbl>
#>  1 Adelaide South A… Busine… TSLM(Trip… breus…      23.3    24    ᵪ²(24)  0.500 
#>  2 Adelaide South A… Holiday TSLM(Trip… breus…      26.1    24    ᵪ²(24)  0.346 
#>  3 Adelaide South A… Other   TSLM(Trip… breus…      34.7    24    ᵪ²(24)  0.0732
#>  4 Adelaide South A… Visiti… TSLM(Trip… breus…      29.7    24    ᵪ²(24)  0.194 
#>  5 Adelaid… South A… Busine… TSLM(Trip… breus…      24.8    24    ᵪ²(24)  0.414 
#>  6 Adelaid… South A… Holiday TSLM(Trip… breus…      24.1    24    ᵪ²(24)  0.458 
#>  7 Adelaid… South A… Other   TSLM(Trip… breus…      25.5    24    ᵪ²(24)  0.377 
#>  8 Adelaid… South A… Visiti… TSLM(Trip… breus…      12.1    24    ᵪ²(24)  0.979 
#>  9 Alice S… Norther… Busine… TSLM(Trip… breus…      26.5    24    ᵪ²(24)  0.327 
#> 10 Alice S… Norther… Holiday TSLM(Trip… breus…      30.7    24    ᵪ²(24)  0.163 
#> # … with 294 more rows

Created on 2021-04-08 by the reprex package (v1.0.0)

I do think that it should be easy to compute both Ljung-Box and Breusch-Godfrey tests on regression models; at most, the documentation should hint toward Breusch-Godfrey for regression models.

@mbg-unsw

mbg-unsw commented Apr 9, 2021

Thanks, that looks great.

I assume we'll also need new Ljung-Box and Breusch-Godfrey methods for ARIMA that can pick up the dof from each model?

@mitchelloharawild
Member

Ljung-Box and Box-Pierce tests will be written to work with any model that makes the degrees of freedom available. This will be the next one to add; however, it will require some migration of feasts::ljung_box() to fabletools::ljung_box().
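
To illustrate why the degrees of freedom matter, here is a small base R sketch using stats::Box.test(), whose fitdf argument plays the role of dof. It does not use the fabletools interface, and the simulated AR(1) series, the seed, and the choice of fitdf = 1 (one estimated AR coefficient) are assumptions made for the example.

```r
# Simulate an AR(1) series and fit an AR(1) model to it
set.seed(105)
y <- arima.sim(model = list(ar = 0.5), n = 200)
fit <- arima(y, order = c(1, 0, 0))
res <- residuals(fit)

# Same residuals and lag, with and without the adjustment for the
# one estimated AR coefficient
lb_naive    <- Box.test(res, lag = 10, type = "Ljung-Box")
lb_adjusted <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)

lb_naive$parameter     # chi-squared df = lag
lb_adjusted$parameter  # chi-squared df = lag - fitdf
```

The test statistic is identical in both calls; only the reference chi-squared distribution changes, so the adjusted p-value is never larger than the unadjusted one.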

@baumstan

baumstan commented Jul 18, 2021

@mitchelloharawild I've updated my fabletools package and am unable to run the breusch_godfrey on my TSLM.

remotes::install_github("tidyverts/fabletools")

fit_trend <- q1_ts %>%
  mutate(surfing_festival = ifelse(month(month) == 3 & year(month) > 1987, 1, 0)) %>%
  model(exponential = TSLM(log(sales) ~ trend() + season() + surfing_festival))
report(fit_trend)

I've tried:
fit_trend %>% hypothesise(tests = lst(breusch_godfrey), order = 24)
Error in hypothesise(., tests = lst(breusch_godfrey), order = 24) :
could not find function "hypothesise"

and:
fable::breusch_godfrey(fit_trend)
Error in UseMethod("breusch_godfrey") :
no applicable method for 'breusch_godfrey' applied to an object of class "c('mdl_df', 'tbl_df', 'tbl', 'data.frame')"

Any guidance would be appreciated.

@mitchelloharawild
Member

Looks like you haven't yet loaded the development version. Try restarting R to unload the CRAN version of fabletools so that next time you load the fabletools package, you will have the dev version and access to these new functions.

@baumstan

Thank you. I'd loaded but not restarted. This code works:

fit_trend %>%
  hypothesize(tests = lst(breusch_godfrey), order = 1)

But this one doesn't...

fable::breusch_godfrey(fit_trend, order =1)

Could you confirm that I've correctly used the hypothesize option given that my model is a regression not an ARIMA?

@mitchelloharawild
Member

Yes, the first code snippet is the current interface for running the test.

@mitchelloharawild
Member

An alternative generic function is needed for computing values from distributions, such as Newey-West (tidyverts/fable#332). The function could/would act very similarly to what we have described here.
