Description
While looking into #564, I found a number of inconsistencies in ht.unique().
Add clarification to documentation of unique() #564 is not a documentation issue. The kwarg sorted is set to False by default for ht.unique(), but it is True by default for torch.unique(). Currently, ht.unique(a) results in a sorted DNDarray if a.split=None (pure torch implementation: ht.unique(a) = torch.unique(a._DNDarray__array)), whereas if a.split is not None, the result will not be sorted (heat implementation).
Replacing ht.unique(a) = torch.unique(a._DNDarray__array) with ht.unique(a) = torch.unique(a._DNDarray__array, sorted=sorted) doesn't help, because sorted=False means different things for heat and torch:
heat interpretation: leave result unsorted;
torch interpretation: leave result REVERSE SORTED. See the discussion on #564 for an example.
I propose setting sorted=True by default in ht.unique(), as at the moment it is the only way to prevent inconsistencies with torch, although I'm aware that the sorting comes with significant overhead. Incidentally, numpy.unique() returns the "sorted unique elements of an array", and not sorting is not even an option.
If return_inverse=True, ht.unique() by design returns a list of one DNDarray (the unique elements) and one torch tensor (the inverse indices). It should return two DNDarrays.
It is currently not possible to run ht.unique(a, sorted=True, axis=axis) if axis != split. Error message:
This needs to be followed up.
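To make the proposed behaviour concrete, here is a minimal pure-Python sketch of the semantics I have in mind: unique elements always sorted, plus inverse indices that rebuild the input. The helper reference_unique is hypothetical. It is not part of heat or torch, and it only handles flat sequences. It could serve as a reference when checking ht.unique() results for different values of split.

```python
from typing import List, Sequence, Tuple


def reference_unique(values: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Return (sorted unique values, inverse indices) for a flat sequence.

    Hypothetical helper for illustration only, not part of heat or torch.
    """
    unique_sorted = sorted(set(values))
    # Map each unique value to its position in the sorted result.
    position = {value: index for index, value in enumerate(unique_sorted)}
    # For every input element, record where it sits in unique_sorted.
    inverse = [position[value] for value in values]
    return unique_sorted, inverse


data = [3, 1, 2, 3, 1]
unique_values, inverse_indices = reference_unique(data)

# The inverse indices rebuild the original input from the unique values.
reconstructed = [unique_values[i] for i in inverse_indices]

# The same elements in a different order match only as sets, so an
# unsorted (or reverse sorted) result cannot be compared element-wise.
reverse_sorted = unique_values[::-1]
```

With a sorted default, results for split=None and split != None could be compared element by element. With sorted=False, only a set comparison is meaningful.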
To Reproduce
Steps to reproduce the behavior:
Which module/class/function is affected?
manipulations.unique()
What are the circumstances under which the bug appears?
see above
What is the exact error message / erroneous behavior?
see above
Version Info
current main branch