-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add list.index_of_in() to Expr and Series #21192
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #21192 +/- ##
==========================================
+ Coverage 79.20% 79.90% +0.70%
==========================================
Files 1585 1594 +9
Lines 225619 227899 +2280
Branches 2588 2600 +12
==========================================
+ Hits 178697 182101 +3404
+ Misses 46332 45201 -1131
- Partials 590 597 +7 ☔ View full report in Codecov by Sentry. |
We still haven't decided if we want this. I want to think of better ways than implementing every expression again over arrays and lists. Iterators over anyvalues, are things we should avoid. I want something more generic that allows us to reuse expressions on lists in way that doesn't impact binary size and reuses existing logic such as the row-encoding. I first want to thing about this one. Being faster than list.eval isn't a metric we think is important in this, as list.eval is very very slow. This is of course perfectly implementable as plugin, if needed upstream. |
Insofar as possible this uses And the fact that (if you want type coercion in IR, anyway) it required an additional kind of type coercion is at least something to consider for a more generic implementation strategy. If you decide not to accept this I will likely refactor into a plugin, yes. |
Not sure what's up with the Python docs. |
Fixes #20626.
The prototype in the draft PR (#20733) was 10× faster than using non-obvious hacky Python implementations based on
list.eval()
, so I would guess this is too.Just as a reminder, the original requested feature in #5503 that led to the addition of
Expr.index_of()
was actually what's covered in this PR. And this is also what G-Research internal users had requested, so it seems like there are multiple potential users.