Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename overloaded intrinsics #117

Open
HanKuanChen opened this issue Sep 30, 2021 · 6 comments
Open

Rename overloaded intrinsics #117

HanKuanChen opened this issue Sep 30, 2021 · 6 comments
Labels
Revisit after v1.0 Features or problems we will revisit after the v1.0 release

Comments

@HanKuanChen
Copy link
Contributor

I notice the overloaded intrinsic function name is very unfriendly to C++ template.
For example, if we want to do element-wise max for two array, we will write

#include <riscv_vector.h>

vint32m8_t vle_m8(const int32_t *ptr, size_t vl)
{
    return vle32_v_i32m8(ptr, vl);
}

vint32m8_t max(const int32_t *lhs, const int32_t *rhs, size_t vl)
{
    return vmax(vle_m8(lhs, vl), vle_m8(rhs, vl), vl);
}

To make it support unsigned and float, we will write

#include <riscv_vector.h>

vint32m8_t vle_m8(const int32_t *ptr, size_t vl)
{
    return vle32_v_i32m8(ptr, vl);
}

vint32m8_t max(const int32_t *lhs, const int32_t *rhs, size_t vl)
{
    return vmax(vle_m8(lhs, vl), vle_m8(rhs, vl), vl);
}

vuint32m8_t vle_m8(const uint32_t *ptr, size_t vl)
{
    return vle32_v_u32m8(ptr, vl);
}

vuint32m8_t max(const uint32_t *lhs, const uint32_t *rhs, size_t vl)
{
    return vmaxu(vle_m8(lhs, vl), vle_m8(rhs, vl), vl);
}

vfloat32m8_t vle_m8(const float *ptr, size_t vl)
{
    return vle32_v_f32m8(ptr, vl);
}

vfloat32m8_t max(const float *lhs, const float *rhs, size_t vl)
{
    return vfmax(vle_m8(lhs, vl), vle_m8(rhs, vl), vl);
}

However, they cannot write in a simple C++ template, since max has different name (vmax, vmaxu, vfmax).

#include <riscv_vector.h>

vint32m8_t vle_m8(const int32_t *ptr, size_t vl)
{
    return vle32_v_i32m8(ptr, vl);
}

vuint32m8_t vle_m8(const uint32_t *ptr, size_t vl)
{
    return vle32_v_u32m8(ptr, vl);
}

vfloat32m8_t vle_m8(const float *ptr, size_t vl)
{
    return vle32_v_f32m8(ptr, vl);
}

template<class T>
auto max(T lhs, T rhs, size_t vl)
{
    // return vmax(vle_m8(lhs, vl), vle_m8(rhs, vl), vl); // only valid for signed
    // return vmaxu(vle_m8(lhs, vl), vle_m8(rhs, vl), vl); // only valid for unsigned
    // return vfmax(vle_m8(lhs, vl), vle_m8(rhs, vl), vl); // only valid for float
}

We should make intrinsics have same name if they have same semantic.
For example,
vadd, vfadd -> vadd
vwadd, vfwadd -> vwadd
vmax, vmaxu, vfmax -> vmax
vmseq, vmfeq -> vmseq
vredsum, vfredusum -> vredsum

Old intrinsics will be kept (backward compatibility).

@HanKuanChen HanKuanChen changed the title rename overloaded_intrinsic_funcs.md rename overloaded intrinsics Sep 30, 2021
@Hsiangkai
Copy link
Collaborator

I am not sure if we should provide such intrinsics for C++ template. In the previous discussion #7, we decide to provide 1-to-1 intrinsics. Is it possible to provide wrapper functions in users' code to meet the purpose?

@HanKuanChen
Copy link
Contributor Author

  1. If users want to use 1-to-1 intrinsic, they can use intrinsics which is in intrinsic_funcs.md. What I want to rename is only overloaded intrinsics, which is an optional implementation for compiler.
  2. The amount of riscv vector type is huge, this would be a huge effort for users to do this. Meanwhile, if users want to use C++ template, all of them will face the same problem.

@Hsiangkai
Copy link
Collaborator

It may make sense to define them in the overloaded interface. Wait for others' opinions.

@kito-cheng
Copy link
Collaborator

+1 from me.

@eopXD
Copy link
Collaborator

eopXD commented Aug 2, 2022

This is a good feature to have. Let's revisit this after the v1.0 release?

@eopXD eopXD added the Revisit after v1.0 Features or problems we will revisit after the v1.0 release label Aug 2, 2022
@eopXD eopXD changed the title rename overloaded intrinsics Rename overloaded intrinsics Oct 3, 2022
@sh1boot
Copy link

sh1boot commented Dec 22, 2023

The __riscv_vleXX_v_SXXmY() situation, where the element type can be deduced even while LMUL cannot, also exists for __riscv_vget() and __riscv_vlmul_ext() and __riscv_vlmul_trunc().

Conversely, __riscv_vreinterpret() can deduce LMUL since it's not legal to change it using reinterpret anyway.

These all become a nightmare as soon as people start using typedef under conditional compilation and then having to map all the intrinsic operations through at the same time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Revisit after v1.0 Features or problems we will revisit after the v1.0 release
Projects
None yet
Development

No branches or pull requests

5 participants