Rules for sortslices, unique #546
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,3 +1,7 @@ | ||||||
##### | ||||||
##### `sort` | ||||||
##### | ||||||
|
||||||
function rrule(::typeof(partialsort), xs::AbstractVector, k::Union{Integer,OrdinalRange}; kwargs...) | ||||||
inds = partialsortperm(xs, k; kwargs...) | ||||||
ys = xs[inds] | ||||||
|
@@ -33,3 +37,55 @@ function rrule(::typeof(sort), xs::AbstractVector; kwargs...) | |||||
end | ||||||
return ys, sort_pullback | ||||||
end | ||||||
|
||||||
##### | ||||||
##### `sortslices` | ||||||
##### | ||||||
|
||||||
function rrule(::typeof(sortslices), x::AbstractArray; dims::Integer, kw...) | ||||||
p = sortperm(collect(eachslice(x; dims=dims)); kw...) | ||||||
inds = ntuple(d -> d == dims ? p : (:), ndims(x)) | ||||||
function sortslices_pullback(dy) | ||||||
# No actual need to zero this, and if you didn't, then you could widen eltype | ||||||
# Also, you could use similar(dy) here not x, same size? | ||||||
dx = _zerolike_writeat(x, unthunk(dy), (), inds...) | ||||||
return (NoTangent(), ProjectTo(x)(dx)) | ||||||
end | ||||||
return x[inds...], sortslices_pullback | ||||||
end | ||||||
|
||||||
##### | ||||||
##### `unique` | ||||||
##### | ||||||
|
||||||
function rrule(::typeof(unique), x::AbstractArray{<:Number}; dims=:) | ||||||
axes_x = axes(x) | ||||||
y = unique(x; dims=dims) # accepts only dims=: or dims::Integer | ||||||
function unique_pullback(dy_raw) | ||||||
dy = unthunk(dy_raw) | ||||||
if length(x) == length(y) | ||||||
# Short-circuit for the case of all unique, since `mask` is fairly expensive: | ||||||
dx = reshape(dy, axes_x) | ||||||
return (NoTangent(), ProjectTo(x)(dx)) | ||||||
end | ||||||
|
||||||
if dims isa Colon | ||||||
|
||||||
xs, ys = vec(x), y | ||||||
else | ||||||
xs, ys = collect(eachslice(x; dims=dims)), collect(eachslice(y; dims=dims)) | ||||||
There is an issue open on BlueStyle to recommend against this. If we are going to do this, then how do you feel about:
Suggested change
We could avoid this for style. But I think the broadcast is confusing, and perhaps you do too, because I also think it's missing an easy-to-miss dot:
|
||||||
end | ||||||
mask = isequal.(permutedims(ys), xs) # unique([0.0, -0.0, NaN, NaN]) | ||||||
mask .= (mask .== cumsum(mask, dims=1) .== true) # this implements findfirst(mask; dims=1) | ||||||
keep = map(I -> I[1], findall(mask)) | ||||||
Suggested change
I don't think this will work:
wow, I hate it.
I sort-of understand why you can't splat or broadcast it, but it's pretty weird that you can still index it.
|
||||||
if dims isa Colon | ||||||
# The function `_zerolike_writeat` allows second derivatives. | ||||||
# Should perhaps eventually be shared with `getindex`. | ||||||
dx = reshape(_zerolike_writeat(vec(x), vec(dy), (), keep), axes_x) | ||||||
else | ||||||
inds = ntuple(d -> d==dims ? keep : (:), length(axes_x)) | ||||||
dx = _zerolike_writeat(x, dy, (), inds...) | ||||||
end | ||||||
return (NoTangent(), ProjectTo(x)(dx)) | ||||||
end | ||||||
return y, unique_pullback | ||||||
end |
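The `sortslices` rule in the diff can be illustrated with a minimal plain-Julia sketch. It assumes a small integer matrix and a made-up cotangent `dy`, and it uses a zero array with indexed assignment as a stand-in for the internal `_zerolike_writeat` helper, so it shows the idea rather than the rule's actual implementation.

```julia
# Forward pass: sort the rows of x by building a permutation of the slices.
x = [3 1; 1 2; 2 0]
dims = 1
p = sortperm(collect(eachslice(x; dims=dims)))        # permutation of the rows
inds = ntuple(d -> d == dims ? p : (:), ndims(x))     # (p, :) when dims == 1
y = x[inds...]

# Pullback: write each slice of dy back to the position it came from.
dy = reshape(collect(1.0:6.0), 3, 2)                  # hypothetical cotangent for y
dx = zeros(size(x))                                   # stand-in for `_zerolike_writeat`
dx[inds...] .= dy
```

Because `p` is a permutation, every slice of `dx` is written exactly once, so in this sketch the pullback amounts to applying the inverse permutation to `dy`.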
Do we need the `unthunk`? If so, should we push it down inside the `_zerolike_writeat`?

I think that ideally, `_zerolike_writeat` should be upgraded to return an InplaceThunk. And eventually it should be called grad_getindex or something, too.
I'm not sure whether it should handle un-thunking. I guess it wouldn't hurt to add a method. But since most rules at present call unthunk explicitly, maybe it's clearer to call it here?

The only other place it's called at present is:
https://github.com/JuliaDiff/ChainRules.jl/blob/main/src/rulesets/Base/array.jl#L364-L373

Arguably we shouldn't be unthunking if the destination that we are writing into can accept `Any`.
(But practically that case doesn't really matter since performance is already shot. And likely Zygote will hate that.)

Yes, this is far from being a high-performance function!
If you don't take the shortcut above, then not all entries were unique, and thus `_zerolike_writeat` has to copy dy into dx at some nontrivial indices. So it has to slice up dy; I don't think it can write just one thunk anywhere.
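
The slow path described above, where not all entries are unique, can be sketched in plain Julia for the `dims=:` case. The input `x` and the cotangent `dy` are made-up values, and a zero vector with indexed assignment stands in for `_zerolike_writeat`; this is an illustration under those assumptions, not the rule itself.

```julia
# Plain-Julia sketch of the `dims=:` branch; no ChainRules types involved.
x = [1.0, 2.0, 1.0, 3.0, 2.0]
y = unique(x)                               # [1.0, 2.0, 3.0]

# mask[i, j] is true where x[i] matches y[j]; isequal treats NaN as equal to NaN
mask = isequal.(permutedims(y), x)
# Keep only the first match in each column, as in the diff
mask .= (mask .== cumsum(mask, dims=1) .== true)
# Row index of each surviving entry. I[1] indexes the CartesianIndex,
# which deliberately does not support iteration or splatting.
keep = map(I -> I[1], findall(mask))        # [1, 2, 4]

dy = [10.0, 20.0, 30.0]                     # hypothetical cotangent for y
dx = zeros(length(x))                       # stand-in for `_zerolike_writeat`
dx[keep] .= dy                              # duplicates keep a zero gradient
```

Here `dx` ends up as `[10.0, 20.0, 0.0, 30.0, 0.0]`: only the first occurrence of each value receives a gradient, which is why `dy` has to be scattered to scattered indices of `dx` instead of being reshaped in one step.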