GH-44393: [C++][Compute] Swizzle vector functions #44394
base: main
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -257,6 +257,38 @@ class ARROW_EXPORT ListFlattenOptions : public FunctionOptions { | |||||||||
bool recursive = false; | ||||||||||
}; | ||||||||||
|
||||||||||
/// \brief Options for inverse_permutation function | ||||||||||
class ARROW_EXPORT InversePermutationOptions : public FunctionOptions { | ||||||||||
public: | ||||||||||
explicit InversePermutationOptions(int64_t max_index = -1, | ||||||||||
std::shared_ptr<DataType> output_type = NULLPTR); | ||||||||||
static constexpr char const kTypeName[] = "InversePermutationOptions"; | ||||||||||
static InversePermutationOptions Defaults() { return InversePermutationOptions(); } | ||||||||||
|
||||||||||
/// \brief The max value in the input indices to process. Any indices that are greater | ||||||||||
/// than this value will be ignored. If negative, this value will be set to the length | ||||||||||
/// of the input indices minus 1. | ||||||||||
int64_t max_index = -1; | ||||||||||
/// \brief The type of the output inverse permutation. If null, the output will be of | ||||||||||
/// the same type as the input indices; otherwise it must be an integer type. An Invalid | ||||||||||
/// error will be reported if this type is not able to store the length of the input | ||||||||||
/// indices. | ||||||||||
std::shared_ptr<DataType> output_type = NULLPTR; | ||||||||||
}; | ||||||||||
|
||||||||||
/// \brief Options for scatter function | ||||||||||
class ARROW_EXPORT ScatterOptions : public FunctionOptions { | ||||||||||
public: | ||||||||||
explicit ScatterOptions(int64_t max_index = -1); | ||||||||||
static constexpr char const kTypeName[] = "ScatterOptions"; | ||||||||||
static ScatterOptions Defaults() { return ScatterOptions(); } | ||||||||||
|
||||||||||
/// \brief The max value in the input indices to process. Any values with indices that | ||||||||||
/// are greater than this value will be ignored. If negative, this value will be set to | ||||||||||
/// the length of the input minus 1. | ||||||||||
int64_t max_index = -1; | ||||||||||
}; | ||||||||||
|
||||||||||
/// @} | ||||||||||
|
||||||||||
/// \brief Filter with a boolean selection filter | ||||||||||
|
@@ -705,5 +737,52 @@ Result<std::shared_ptr<Array>> PairwiseDiff(const Array& array, | |||||||||
bool check_overflow = false, | ||||||||||
ExecContext* ctx = NULLPTR); | ||||||||||
|
||||||||||
/// \brief Return the inverse permutation of the given indices. | ||||||||||
/// | ||||||||||
/// For indices[i] = x, inverse_permutation[x] = i. And inverse_permutation[x] = null if x | ||||||||||
/// does not appear in the input indices. For indices[i] = x where x < 0 or x > max_index, | ||||||||||
/// it is ignored. If multiple indices point to the same value, the last one is used. | ||||||||||
I think this explanation is confusing, but we can work on this later. |
||||||||||
/// | ||||||||||
/// For example, with indices = [null, 0, 3, 2, 4, 1, 1], the inverse permutation is | ||||||||||
/// [1, 6, 3] if max_index = 2, | ||||||||||
/// [1, 6, 3, 2, 4, null, null] if max_index = 6. | ||||||||||
/// | ||||||||||
/// \param[in] indices array-like indices | ||||||||||
/// \param[in] options configures the max index and the output type | ||||||||||
/// \param[in] ctx the function execution context, optional | ||||||||||
/// \return the resulting inverse permutation | ||||||||||
/// | ||||||||||
/// \since 19.0.0 | ||||||||||
/// \note API not yet finalized | ||||||||||
ARROW_EXPORT | ||||||||||
Result<Datum> InversePermutation( | ||||||||||
const Datum& indices, | ||||||||||
const InversePermutationOptions& options = InversePermutationOptions::Defaults(), | ||||||||||
ExecContext* ctx = NULLPTR); | ||||||||||
|
||||||||||
/// \brief Scatter the values into specified positions according to the indices. | ||||||||||
/// | ||||||||||
/// For indices[i] = x, output[x] = values[i]. And output[x] = null if x does not appear | ||||||||||
/// in the input indices. For indices[i] = x where x < 0 or x > max_index, values[i] | ||||||||||
/// is ignored. If multiple indices point to the same value, the last one is used. | ||||||||||
Comment on lines +766 to +767:
Can you explain the point of the lenient behavior wrt. negative indices and the max_index option?
No there isn't a particular case in my mind that (And just in case you are curious about the option Thank you. |
||||||||||
/// | ||||||||||
/// For example, with values = [a, b, c, d, e, f, g] and indices = [null, 0, | ||||||||||
/// 3, 2, 4, 1, 1], the output is | ||||||||||
/// [b, g, d] if max_index = 2, | ||||||||||
/// [b, g, d, c, e, null, null] if max_index = 6. | ||||||||||
/// | ||||||||||
/// \param[in] values datum to scatter | ||||||||||
/// \param[in] indices array-like indices | ||||||||||
/// \param[in] options configures the max index to scatter | ||||||||||
/// \param[in] ctx the function execution context, optional | ||||||||||
/// \return the resulting datum | ||||||||||
/// | ||||||||||
/// \since 19.0.0 | ||||||||||
/// \note API not yet finalized | ||||||||||
ARROW_EXPORT | ||||||||||
Result<Datum> Scatter(const Datum& values, const Datum& indices, | ||||||||||
const ScatterOptions& options = ScatterOptions::Defaults(), | ||||||||||
ExecContext* ctx = NULLPTR); | ||||||||||
|
||||||||||
} // namespace compute | ||||||||||
} // namespace arrow |
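To make the documented semantics of `InversePermutation` and `Scatter` easier to check, here is a minimal plain-C++ model of them. It is a sketch only, not the kernel code from this PR: `Index`, `ResolveMaxIndex`, `ScatterModel` and `InversePermutationModel` are hypothetical names, and `std::optional` stands in for Arrow nulls. It assumes that a null index is skipped, which is what the doc-comment examples imply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model types; std::nullopt plays the role of an Arrow null.
using Index = std::optional<int64_t>;

// A negative max_index means "length of the input minus 1".
inline int64_t ResolveMaxIndex(int64_t max_index, std::size_t input_length) {
  return max_index < 0 ? static_cast<int64_t>(input_length) - 1 : max_index;
}

// output[x] = values[i] for every indices[i] = x with 0 <= x <= max_index.
// Later writes overwrite earlier ones; slots never written stay null.
template <typename T>
std::vector<std::optional<T>> ScatterModel(
    const std::vector<std::optional<T>>& values,
    const std::vector<Index>& indices, int64_t max_index = -1) {
  assert(values.size() == indices.size());
  const int64_t limit = ResolveMaxIndex(max_index, indices.size());
  std::vector<std::optional<T>> output(static_cast<std::size_t>(limit + 1));
  for (std::size_t i = 0; i < indices.size(); ++i) {
    if (!indices[i].has_value()) continue;  // null index: skipped
    const int64_t x = *indices[i];
    if (x < 0 || x > limit) continue;       // out of range: ignored
    output[static_cast<std::size_t>(x)] = values[i];
  }
  return output;
}

// By the two definitions, the inverse permutation is the scatter of the
// positions 0, 1, ..., n-1 through the same indices.
inline std::vector<Index> InversePermutationModel(
    const std::vector<Index>& indices, int64_t max_index = -1) {
  std::vector<Index> positions(indices.size());
  for (std::size_t i = 0; i < indices.size(); ++i) {
    positions[i] = static_cast<int64_t>(i);
  }
  return ScatterModel<int64_t>(positions, indices, max_index);
}
```

Feeding the model the indices `[null, 0, 3, 2, 4, 1, 1]` from the doc comments reproduces both documented outputs: the duplicate index `1` keeps the last writer (position 6, value `g`), and slots 5 and 6 stay null when `max_index = 6`.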
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1037,8 +1037,9 @@ ArrayKernelExec GenerateFloatingPoint(detail::GetTypeId get_id) { | |
// Generate a kernel given a templated functor for integer types | ||
// | ||
// See "Numeric" above for description of the generator functor | ||
template <template <typename...> class Generator, typename Type0, typename... Args> | ||
ArrayKernelExec GenerateInteger(detail::GetTypeId get_id) { | ||
template <template <typename...> class Generator, typename Type0, | ||
typename KernelType = ArrayKernelExec, typename... Args> | ||
KernelType GenerateInteger(detail::GetTypeId get_id) { | ||
So this change is for generate
Yes. Just slightly extending it like GenerateNumeric. |
||
switch (get_id.id) { | ||
case Type::INT8: | ||
return Generator<Type0, Int8Type, Args...>::Exec; | ||
|
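The effect of the added `KernelType` template parameter can be illustrated outside Arrow. The following is a self-contained sketch under stated assumptions: `TypeId`, the `*Type` tag structs, both kernel signatures, and the `BytesFor` and `BytesForWithPadding` functors are hypothetical stand-ins, not Arrow's real definitions. Only the shape of the `GenerateInteger` template parameter list mirrors the diff above.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-ins for Arrow's type ids and type tags.
enum class TypeId { INT8, INT32, INT64 };
struct Int8Type { using c_type = int8_t; };
struct Int32Type { using c_type = int32_t; };
struct Int64Type { using c_type = int64_t; };

// Assumed kernel signatures, chosen only for illustration.
using ArrayKernelExec = int (*)(int);
using BinaryKernelExec = int (*)(int, int);

// Same parameter list shape as in the diff: KernelType defaults to
// ArrayKernelExec, so existing call sites keep compiling unchanged.
template <template <typename...> class Generator, typename Type0,
          typename KernelType = ArrayKernelExec, typename... Args>
KernelType GenerateInteger(TypeId id) {
  switch (id) {
    case TypeId::INT8:
      return Generator<Type0, Int8Type, Args...>::Exec;
    case TypeId::INT32:
      return Generator<Type0, Int32Type, Args...>::Exec;
    case TypeId::INT64:
      return Generator<Type0, Int64Type, Args...>::Exec;
  }
  throw std::invalid_argument("unsupported integer type id");
}

// Functor whose Exec matches the default signature.
template <typename Type0, typename IndexType>
struct BytesFor {
  static int Exec(int count) {
    return count * static_cast<int>(sizeof(typename IndexType::c_type));
  }
};

// Functor whose Exec needs a different signature; this is only reachable
// because the caller can now name KernelType explicitly.
template <typename Type0, typename IndexType>
struct BytesForWithPadding {
  static int Exec(int count, int padding) {
    return count * static_cast<int>(sizeof(typename IndexType::c_type)) + padding;
  }
};
```

A caller that omits the third argument, as in `GenerateInteger<BytesFor, void>(TypeId::INT32)`, gets an `ArrayKernelExec` as before. A caller that needs another signature writes `GenerateInteger<BytesForWithPadding, void, BinaryKernelExec>(TypeId::INT64)`.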
Should nullability be considered here?
It'll be decided during the actual computation. If there are "holes" in the output, a validity buffer will be allocated and filled on demand.
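One way to picture the "on demand" behavior described in the reply is the sketch below. It is an assumption about the general idea, not the PR's implementation: `SketchOutput` and `InverseWithLazyValidity` are hypothetical names, a `std::vector<bool>` stands in for a validity bitmap, and `-1` is used as a "not written yet" sentinel, which is safe here because written positions are never negative.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical output holder: an absent validity vector means "no nulls".
struct SketchOutput {
  std::vector<int64_t> values;
  std::optional<std::vector<bool>> validity;
};

// Builds an inverse permutation with `length` slots and allocates the
// validity vector only if at least one slot was never written.
inline SketchOutput InverseWithLazyValidity(const std::vector<int64_t>& indices,
                                            int64_t length) {
  SketchOutput out;
  out.values.assign(static_cast<std::size_t>(length), -1);  // -1: unwritten
  for (std::size_t i = 0; i < indices.size(); ++i) {
    const int64_t x = indices[i];
    if (x < 0 || x >= length) continue;  // out-of-range indices are ignored
    out.values[static_cast<std::size_t>(x)] = static_cast<int64_t>(i);
  }
  for (std::size_t x = 0; x < out.values.size(); ++x) {
    if (out.values[x] >= 0) continue;
    // First hole found: allocate validity, initially all valid.
    if (!out.validity.has_value()) {
      out.validity.emplace(out.values.size(), true);
    }
    (*out.validity)[x] = false;
  }
  return out;
}
```

With a full permutation such as `{1, 0, 2}` and `length = 3`, no validity vector is allocated at all. With `{0, 3}` and `length = 4`, slots 1 and 2 are holes, so the validity vector is created and marks them as null.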