Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Selector abstract type for ecosytem compat, and rethink Between #41

Open
rafaqz opened this issue Sep 14, 2021 · 8 comments
Open

Add Selector abstract type for ecosytem compat, and rethink Between #41

rafaqz opened this issue Sep 14, 2021 · 8 comments

Comments

@rafaqz
Copy link
Contributor

rafaqz commented Sep 14, 2021

DimensionalData.jl defines a Selector abstract type and selectors Between, Near, At, Where, and Contains.

AxisKeys.jl also defines Selector and Near and Inervals selectors (ping @mcabbott)

It would be useful to standardise these so we can all use the simplest words without clashes, and this is probably the place for that.

We could define Selector here, and make Between <: Selector. Near could also move here to avoid the (maybe less common) clashes between AxisKeys.jl and DimensionalData.jl, if @mcabbott is also interested in sharing these types.

Between in DD holds a 2-tuple of any types, while here it has two fields limited to Union{Int,Symbol}.

@bkamins
Copy link
Member

bkamins commented Sep 14, 2021

I agree we can make this change. This would be breaking for DataAPI.jl, so 2.0 release would be needed, which is a bit painful given then number of dependants https://juliahub.com/ui/Packages/DataAPI/3a8mN/1.8.0?t=2. However, I assume that only DataFrames.jl uses Between now, so it would be just a matter of changing 1 to 1, 2 in Project.toml files.

In DataFrames.jl we could allow both 1 and 2 release of DataAPI.jl and appropriately handle both cases.

AN ALTERNATIVE that would be non-breaking would be to define Between to hold two fields of any types (so then DD and AxisKeys.jl would need to accommodate). This would be less breaking (so I feel it is preferred), but maybe there are some benefits of holding a 2-tuple so let us discuss.

CC @nalimilan

@rafaqz
Copy link
Contributor Author

rafaqz commented Sep 14, 2021

If it's less breaking DimensionalData.jl can change to 2 fields, there is a breaking version bump coming up soon anyway, and it wont affect behavior as the constructor already accepts either Tuple or 2 arguments.

The reason for using 1 field was so that all Selectors have a val field holding the thing to be selected, which is a single value or a Tuple. Just to standardise that aspect. The val method also returns that value. But otherwise it's not important.

@rafaqz
Copy link
Contributor Author

rafaqz commented Sep 14, 2021

DD also defines rebuild methods for these to update the contents programatically, with ConstructionBase.setproperties as the default keyword constructor. Keeping field names the same removes all boilerplate. You can rebuild any selector with a modified value using the same syntax.

rebuild(x; kw...) = ConstructionBase.setproperties(x, (; kw...))

So that:

rebuild(sel; val=2 .* val(sel))

Or similar will just work, even when a Selector has additional metadata fields, like atol for At. Otherwise updating Selctor values progoramatically has to be special-cased for each selector type.

@bkamins
Copy link
Member

bkamins commented Sep 14, 2021

DD also defines rebuild methods for these to update the contents programatically

This is what I have feared that there might be such cases. Let us wait for @nalimilan to comment in the first place what he thinks about merging the definitions and then we will decide what path should be taken to merge them (as maybe we can agree that DataAPI.jl definition is "internal" and changing it is non-breaking as long as we keep the external API untouched).

@nalimilan
Copy link
Member

Merging definitions sounds like a good idea, but tagging DataAPI 2.0 now is a no-go IMO. If we really want to switch to a single field, we could define appropriate constructors and getproperty/setproperty! methods to continue supporting the previous syntax.

That said, I don't really get the value of being able to do rebuild(sel; val=2 .* val(sel)) on any type of selector. Does happen that you don't know what kind of selector you have and just want to multiply the values by 2? As a counter-argument, it seems that rebuild(sel; last=2) could be more often useful if you want to keep the starting point but change the ending point.

@mcabbott
Copy link

AxisKeys.jl takes Interval is from IntervalSets.jl, and Not from InvertedIndices.jl. I doubt these two will want extra supertypes (although IntervalSets.jl depends on EllipsisNotation.jl which I see now depends on ArrayInterface.jl so it's not 10 lines anymore.)

Its struct Near{T} <: Selector{T} val::T end holds anything, and has the meaning of taking the nearest one element. I think that's the same meaning as DD.jl. The alternative meaning would be a value with a tolerance.

@bkamins
Copy link
Member

bkamins commented Sep 14, 2021

One small consideration to also mention is

Base.Broadcast.broadcastable(x::Between) = Ref(BroadcastedSelector(x))
. This is intended to allow for lazy broadcasting of the selector (i.e. that it can be properly expanded by a caller side that is aware of the actual range of indices selected). Would this break something in the other packages?

@rafaqz
Copy link
Contributor Author

rafaqz commented Sep 14, 2021

@nalimilan Multiplying values by 2 was a bad example... practically it's more like swapping out their contents, e.g. a Vector, for the individual values of the vector. But 2 fields is fine.

But maybe DD can also move to using Interval from IntervalSets.jl like AxisKeys, and this wont be an issue. I used Between because the pattern is selectors as positional words. And interval is confusing when an index can represent intervals or points, both of with can be selected with Between.

@mcabbott yes Near means the same thing in both AK and DD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants