Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Early exit for empty index in vec_assign() #1590

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mgirlich
Copy link
Contributor

Closes #1408.

For example this is relevant for coalesce() like uses, e.g. vec_assign(x, is.na(x), 1L).

I only added this optimisation for vec_assign_opts() and not for other functions in slice-assign.h e.g. vec_proxy_assign_opts(). I only saw vec_proxy_assign_opts() being called in unchop() and vec_c() but it don't think one would encounter an empty index there.

Do you want an explicit test that vec_assign(x, integer(), -1L) works? It is currently tested "indirectly" via vec_slice(x, FALSE) <- 2L.

library(rlang)
library(vctrs)

n <- 10e6
x <- 0L + 1:n
idx <- vec_rep(FALSE, n)

bench::mark(
  false_n = vec_assign(x, idx, -1L),
  false = vec_assign(x, FALSE, -1L),
  empty = vec_assign(x, integer(), -1L),
  check = FALSE,
  min_iterations = 20
)

# Before
#> # A tibble: 3 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 false_n     13.98ms  14.58ms      62.8    38.1MB     68.1
#> 2 false        5.07ms   5.49ms     179.     38.1MB    172. 
#> 3 empty        5.03ms   5.59ms     177.     38.1MB    171.

# After
#> # A tibble: 3 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 false_n      8.37ms   9.01ms      109.     3.5KB      0  
#> 2 false        1.74µs   1.86µs   371996.        0B     37.2
#> 3 empty        2.11µs   2.29µs   341040.        0B      0

Created on 2022-07-15 by the reprex package (v2.0.1)

@lionel-
Copy link
Member

lionel- commented Jul 19, 2022

Do you want an explicit test that vec_assign(x, integer(), -1L) works?

Yes please, it never hurts to add one more test :-)

Copy link
Member

@DavisVaughan DavisVaughan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually tempted to put this optimization deep within the assignment macros, like ASSIGN_INDEX(), ASSIGN_COMPACT(), the corresponding BARRIER macros, and the corresponding "shaped" ones for arrays. Otherwise I think it is very hard to get the placement of the optimization right, because we aren't always exactly sure what index is (is it compact, or not?) and we have all kinds of other checks on x/proxy/value that we want to ensure are run before we decide to early exit.

Basically I think the early exit should be done as close to the generation of the copy as possible (i.e. the vec_clone_referenced() call)

Comment on lines +34 to 36

// Cast and recycle `value`
value = KEEP(vec_cast(value, x, opts.value_arg, opts.x_arg, opts.call));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the cast and recycle checks need to run before we can think about early exiting

@lionel-
Copy link
Member

lionel- commented Sep 20, 2022

Closing for the reasons explained in #1636 (comment)

@lionel- lionel- closed this Sep 20, 2022
@hadley
Copy link
Member

hadley commented Oct 20, 2022

Reopening, because I'd like to reconsider in light of tidyverse/tidyr#1392 — the performance gains seem meaningful, and it seems much more reasonable to handle this in vctrs rather than having to wrap every use of vec_assign().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fast path for vec_assign(x, integer())?
4 participants