Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically select() key variables in a mable #170

Closed
hongooi73 opened this issue Mar 6, 2020 · 7 comments
Closed

Automatically select() key variables in a mable #170

hongooi73 opened this issue Mar 6, 2020 · 7 comments
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@hongooi73
Copy link
Contributor

This is somewhat related to #167. Rather than deleting a model, is there a good way to select only a specified model in a mable? Doing select(mable, mod) results in

Error: The result is not a valid mable. The key variables must uniquely identify each row.
Run rlang::last_error() to see where the error occurred.

@mitchelloharawild
Copy link
Member

select(mable, key, mod) is required here.

I'm not sure if key variables should be automatically included in the select() call or not, but currently the dplyr support is rather simple (most functions just call NextMethod() and then repair the structure with checks).

I think it is best for select.mdl_df() to match select.tbl_ts() and select.fbl_ts() in terms of handling of keys.

@hongooi73
Copy link
Contributor Author

select for tsibbles is actually inconsistent in how it treats key and indices. It automatically selects the latter, but not the former. I would suggest that it should always select the keys, if present.

@mitchelloharawild
Copy link
Member

select() for tsibble is consistent as keys and indices have different nature.
All tsibbles must have an index, but they may not have keys.

If a key is no longer necessary to uniquely identify a time series, it can be removed with select().

library(tsibble)
library(dplyr)
tourism %>% 
  filter(Purpose == "Holiday") %>% 
  select(-Purpose)
#> # A tsibble: 6,080 x 4 [1Q]
#> # Key:       Region, State [76]
#>    Quarter Region   State           Trips
#>      <qtr> <chr>    <chr>           <dbl>
#>  1 1998 Q1 Adelaide South Australia  224.
#>  2 1998 Q2 Adelaide South Australia  130.
#>  3 1998 Q3 Adelaide South Australia  156.
#>  4 1998 Q4 Adelaide South Australia  182.
#>  5 1999 Q1 Adelaide South Australia  185.
#>  6 1999 Q2 Adelaide South Australia  135.
#>  7 1999 Q3 Adelaide South Australia  136.
#>  8 1999 Q4 Adelaide South Australia  169.
#>  9 2000 Q1 Adelaide South Australia  184.
#> 10 2000 Q2 Adelaide South Australia  134.
#> # … with 6,070 more rows

Created on 2020-03-07 by the reprex package (v0.3.0)

Fundamentally, the same could be said for mables, however I expect redundant keys would be less common.

@hongooi73
Copy link
Contributor Author

hongooi73 commented Mar 6, 2020

Yeah, what I mean is, if a tsibble has a key, then selecting should retain it by default. This just seems more intuitive and easier to use.

I can understand selecting both key and index, and not selecting both key and index. But selecting index and not selecting key seems weird.

@mitchelloharawild
Copy link
Member

@earowang can comment on this.

@earowang
Copy link
Member

earowang commented Mar 8, 2020

select() a tsibble now by default includes key and index in the dev version. should be there for next CRAN release

@mitchelloharawild mitchelloharawild self-assigned this Mar 8, 2020
@mitchelloharawild mitchelloharawild added enhancement New feature or request good first issue Good for newcomers labels Mar 8, 2020
@mitchelloharawild mitchelloharawild changed the title Selecting specific model from a mable Automatically select() key variables in a mable Mar 8, 2020
@mitchelloharawild
Copy link
Member

Consolidated into #192

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants