Implement a unique function returning only the unique values in a vector. #940

Open
loiseaujc opened this issue Feb 24, 2025 · 3 comments
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments


loiseaujc commented Feb 24, 2025

Motivation

Recently, I've run into the problem of extracting the unique values of a vector (of integer, real or complex type, or possibly even character). Consider for instance the vector x = [1, 2, 3, 3, 4]. What I need is a function taking x as input and returning the vector y = [1, 2, 3, 4] as output. The interface for a real-valued vector could be as simple as

pure function unique(x, sorted) result(y)
    use stdlib_kinds, only: dp, lk
    real(dp), intent(in) :: x(:)
    !! Array whose unique values need to be extracted.
    logical(lk), optional, intent(in) :: sorted
    !! Whether the output vector needs to be sorted or not (default .false. ?)
    real(dp), allocatable :: y(:)
    !! Vector containing only the unique values from x.
end function unique

The output vector could be sorted or not, depending on the user's choice. I know there is no Fortran intrinsic function for this purpose, but I'm not sure whether something like it is already available in stdlib. If it is, could anyone point me to the correct function?
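To make the proposed interface concrete, here is a minimal sketch of one way it could be implemented. It is not an existing stdlib routine. The module name unique_sketch, the locally defined dp kind, the default logical kind for sorted, and the choice of a quadratic scan followed by an insertion sort are all assumptions made so that the example compiles on its own.

```fortran
module unique_sketch
    implicit none
    private
    public :: dp, unique

    ! Assumption: dp is defined locally so the example builds without stdlib.
    integer, parameter :: dp = kind(1.0d0)

contains

    pure function unique(x, sorted) result(y)
        real(dp), intent(in) :: x(:)
        logical, optional, intent(in) :: sorted
        real(dp), allocatable :: y(:)

        real(dp) :: buffer(size(x)), tmp
        integer :: i, j, n
        logical :: do_sort

        ! Assumed default: keep the order of first appearance.
        do_sort = .false.
        if (present(sorted)) do_sort = sorted

        ! Keep the first occurrence of each value, using exact equality.
        ! This costs O(n**2) comparisons, and NaN values are never merged.
        n = 0
        do i = 1, size(x)
            if (.not. any(buffer(1:n) == x(i))) then
                n = n + 1
                buffer(n) = x(i)
            end if
        end do

        ! Optional insertion sort of the retained values (ascending).
        if (do_sort) then
            do i = 2, n
                tmp = buffer(i)
                j = i - 1
                do while (j >= 1)
                    if (buffer(j) <= tmp) exit
                    buffer(j + 1) = buffer(j)
                    j = j - 1
                end do
                buffer(j + 1) = tmp
            end do
        end if

        y = buffer(1:n)
    end function unique

end module unique_sketch

program demo_unique
    use unique_sketch, only: dp, unique
    implicit none
    real(dp), allocatable :: y(:)

    ! Unsorted: values come back in order of first appearance.
    y = unique([3.0_dp, 1.0_dp, 3.0_dp, 2.0_dp, 1.0_dp])
    if (size(y) /= 3) error stop "expected three unique values"
    print *, y

    ! Sorted: the same values in ascending order.
    y = unique([3.0_dp, 1.0_dp, 3.0_dp, 2.0_dp, 1.0_dp], sorted=.true.)
    if (any(y /= [1.0_dp, 2.0_dp, 3.0_dp])) error stop "expected 1, 2, 3"
    print *, y
end program demo_unique
```

With the input 3, 1, 3, 2, 1, the first call keeps 3, 1, 2 and the second returns 1, 2, 3. A version meant for large arrays would more likely sort first and then drop adjacent duplicates, which brings the cost down to O(n log n).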

Prior Art

  • Matlab provides the unique function.
  • Python's built-in set type can be constructed from a list and keeps only its unique elements (without preserving their order).
  • NumPy has np.unique.
  • @jacobwilliams provides an integer-based implementation on his blog.

Additional Information

Both Matlab's and NumPy's implementations cover a relatively large set of cases (1D arrays, multidimensional arrays, different types, etc.) and return values (the unique elements, the corresponding indices, the indices needed to reconstruct the original array from this unique set, etc.).
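For readers unfamiliar with those extra return values, the following sketch illustrates what they mean for an integer vector. The names unique_indices_sketch, unique_with_indices, values, first and inverse are hypothetical and not part of stdlib, Matlab or NumPy. Unlike those libraries, this sketch keeps the unique values in order of first appearance rather than sorting them.

```fortran
module unique_indices_sketch
    implicit none
    private
    public :: unique_with_indices

contains

    ! Hypothetical helper: returns the unique values of x together with
    !   first(k)   = index in x of the first occurrence of values(k)
    !   inverse(i) = position in values of x(i), so that values(inverse) == x
    pure subroutine unique_with_indices(x, values, first, inverse)
        integer, intent(in) :: x(:)
        integer, allocatable, intent(out) :: values(:), first(:), inverse(:)

        integer :: buffer(size(x)), pos(size(x))
        integer :: i, k, n

        allocate(inverse(size(x)))
        n = 0
        do i = 1, size(x)
            ! findloc (Fortran 2008) returns 0 when the value is not found.
            k = findloc(buffer(1:n), x(i), dim=1)
            if (k == 0) then
                n = n + 1
                buffer(n) = x(i)
                pos(n) = i
                k = n
            end if
            inverse(i) = k
        end do

        values = buffer(1:n)
        first = pos(1:n)
    end subroutine unique_with_indices

end module unique_indices_sketch

program demo_unique_indices
    use unique_indices_sketch, only: unique_with_indices
    implicit none
    integer, allocatable :: values(:), first(:), inverse(:)
    integer :: x(5)

    x = [1, 2, 3, 3, 4]
    call unique_with_indices(x, values, first, inverse)

    ! Both index arrays can be checked against the original vector.
    if (any(x(first) /= values)) error stop "first-occurrence indices are wrong"
    if (any(values(inverse) /= x)) error stop "inverse indices are wrong"

    print *, "values :", values
    print *, "first  :", first
    print *, "inverse:", inverse
end program demo_unique_indices
```

For x = [1, 2, 3, 3, 4] this gives the values 1, 2, 3, 4, the first-occurrence indices 1, 2, 3, 5 and the inverse indices 1, 2, 3, 3, 4.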

I don't know if all of these cases need to be covered, at least as a starting point. I would recommend starting with the simplest one (i.e. a 1D input vector and an output vector holding its unique elements), as this is probably the most common situation where a unique function is needed. That would include integer, real, complex and character 1D arrays.

@loiseaujc loiseaujc added the idea Proposition of an idea and opening an issue to discuss it label Feb 24, 2025
loiseaujc (Author) commented

I'm also not sure which module this utility function should go into. Maybe stdlib_sorting?

perazz (Member) commented Feb 24, 2025

Good idea @loiseaujc! Please note there is an open discussion at #670. Should we merge this issue with that one?

loiseaujc (Author) commented

Oh sure! I completely overlooked this issue.
