Re-interpreting a value of a large type as an array of a smaller type #28
Comments
No, bitcasting [...]

int32_t* p = /*...*/;
int16_t* q = (int16_t*) p;
int16_t i = q[1]; // the C standard does not allow this.

I don't understand the second part of your question. Can you give me some more context or an example?
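For comparison, here is a minimal Rust sketch of reading the two 16-bit halves of a 32-bit value by copying bytes instead of casting a pointer. The helper name `halves_le` is made up for illustration, and the little-endian byte order is an assumption chosen so that the result is the same on every target.

```rust
/// Splits a 32-bit value into its two 16-bit halves, least significant half first.
/// `halves_le` is a hypothetical helper, not part of any library.
fn halves_le(p: i32) -> [i16; 2] {
    // Copy the value into a byte array; no pointer to `p` is reinterpreted.
    let b = p.to_le_bytes();
    [
        i16::from_le_bytes([b[0], b[1]]),
        i16::from_le_bytes([b[2], b[3]]),
    ]
}

fn main() {
    let halves = halves_le(0x0001_0002);
    assert_eq!(halves, [2, 1]);
    assert_eq!(halves_le(-1), [-1, -1]);
    println!("{:?}", halves);
}
```

Because the halves are copies, writing to them does not change the original value.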
EDIT:

fn main() {
    use std::mem;
    let x: i32 = 200;
    // Copy the four bytes of `x` into an array of i8.
    let one: [i8; 4] = unsafe { mem::transmute_copy(&x) };
    // These values hold on little-endian targets: 200 = 0xC8, which is -56 as an i8.
    assert_eq!(-56, one[0]);
    assert_eq!(0, one[1]);
}

ORIGINAL COMMENT: I want to use the PTX warp shuffle instruction (here).

fn shuffle_words[T, T2](input: T, src_lane: i32) -> () {
    let num_words = sizeof[T]() / sizeof[T2]();
    // I know that I can't use an expr for allocation here, but I am just trying to explain my point.
    let in_words : [T2 * num_words] = bitcast[[T2 * num_words]](&input);
    let out_words : [T2 * num_words];
    for i in range(0, num_words) {
        out_words(i) = shuffle(in_words(i), src_lane);
    }
}

For more context, I am implementing parallel operations equivalent to what is provided by CUDA's CUB library. Here is an implementation of the shuffle operation from CUB's code:

template <typename T>
__device__ __forceinline__ T ShuffleDown(
    T input,                                   ///< [in] The value to broadcast
    int src_offset,                            ///< [in] The relative up-offset of the peer to read from
    int last_lane = CUB_PTX_WARP_THREADS - 1)  ///< [in] Index of first lane in segment
{
    typedef typename UnitWord<T>::ShuffleWord ShuffleWord;

    const int WORDS = (sizeof(T) + sizeof(ShuffleWord) - 1) / sizeof(ShuffleWord);
    T output;
    ShuffleWord *output_alias = reinterpret_cast<ShuffleWord *>(&output);
    ShuffleWord *input_alias = reinterpret_cast<ShuffleWord *>(&input);

    unsigned int shuffle_word;
    asm volatile("shfl.down.b32 %0, %1, %2, %3;"
        : "=r"(shuffle_word) : "r"((unsigned int) input_alias[0]), "r"(src_offset), "r"(last_lane));
    output_alias[0] = shuffle_word;

    #pragma unroll
    for (int WORD = 1; WORD < WORDS; ++WORD)
    {
        asm volatile("shfl.down.b32 %0, %1, %2, %3;"
            : "=r"(shuffle_word) : "r"((unsigned int) input_alias[WORD]), "r"(src_offset), "r"(last_lane));
        output_alias[WORD] = shuffle_word;
    }

    // ShuffleDown(input_alias, output_alias, src_offset, last_lane, Int2Type<WORDS - 1>());
    return output;
}
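The word-by-word pattern in the code above can be sketched on the host in Rust. This is only an illustration under stated assumptions, not CUB's or Impala's implementation: the marker trait `PlainData` and the functions `to_words`, `from_words` and `map_words` are hypothetical names, and the closure passed to `map_words` stands in for the hardware shuffle. The sketch copies the value into 32-bit words, transforms each word, and copies the words back, so nothing is aliased.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical marker trait (an assumption of this sketch): implementors have
/// no padding bytes and accept every bit pattern.
unsafe trait PlainData: Copy {}
unsafe impl PlainData for i32 {}
unsafe impl PlainData for u64 {}
unsafe impl PlainData for f64 {}

/// Copies the bytes of `input` into 32-bit words, zero-padding the last word.
fn to_words<T: PlainData>(input: &T) -> Vec<u32> {
    // Round up, as in the WORDS computation of the CUB code.
    let num_words = (size_of::<T>() + size_of::<u32>() - 1) / size_of::<u32>();
    let mut words = vec![0u32; num_words];
    // SAFETY: `words` holds at least `size_of::<T>()` bytes, the two regions
    // do not overlap, and `T: PlainData` has no uninitialized padding.
    unsafe {
        ptr::copy_nonoverlapping(
            input as *const T as *const u8,
            words.as_mut_ptr() as *mut u8,
            size_of::<T>(),
        );
    }
    words
}

/// Rebuilds a `T` from words produced by `to_words`.
fn from_words<T: PlainData>(words: &[u32]) -> T {
    assert!(words.len() * size_of::<u32>() >= size_of::<T>());
    let mut out = MaybeUninit::<T>::uninit();
    // SAFETY: the assertion guarantees enough source bytes, and every bit
    // pattern is a valid `T` because `T: PlainData`.
    unsafe {
        ptr::copy_nonoverlapping(
            words.as_ptr() as *const u8,
            out.as_mut_ptr() as *mut u8,
            size_of::<T>(),
        );
        out.assume_init()
    }
}

/// Applies `per_word` (a stand-in for the shuffle) to each 32-bit word of `input`.
fn map_words<T: PlainData>(input: T, per_word: impl Fn(u32) -> u32) -> T {
    let words: Vec<u32> = to_words(&input).into_iter().map(per_word).collect();
    from_words(&words)
}

fn main() {
    let x: u64 = 0x1122_3344_5566_7788;
    assert_eq!(to_words(&x).len(), 2);
    assert_eq!(map_words(x, |w| w), x); // identity per word round-trips
    assert_eq!(map_words(0u64, |w| !w), u64::MAX);
    assert_eq!(map_words(1.5f64, |w| w), 1.5);
}
```

The word count is computed at run time and stored in a `Vec`, which sidesteps the problem of an array type whose length depends on `sizeof` of a type parameter.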
Hi,

We should be careful here, line 3 is certainly legal (indexing an [...] However, if handled properly (alignment and such) the results can be [...]

Best,

On 14.04.2016 at 11:50, Roland Leißa wrote:

Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH. Management: [...] Registered office: Kaiserslautern (HRB 2313). VAT ID: DE 148646973, tax number: 19/673/0060/3
No, line 3 is illegal (i.e. undefined behavior). According to the C standard (§6.5, paragraph 7):

[...]

And the effective type of [...]

But I agree that we want to support such things in the long run. I'm just saying that there are good reasons for C to prohibit them. Allowing them will give us performance problems (missed optimization opportunities) elsewhere.

@KareemErgawy: I will check out your example at the weekend.
OK, I see. This is certainly not how C has been used for ages but it [...]

Thanks,

On 15.04.2016 at 18:43, Roland Leißa wrote:
Yes, it's a big problem. Many C programmers don't understand the standard, and there are many myths about the language. Then programmers curse the compiler engineers when the C compiler apparently emits "incorrect" code, although the input program was broken to begin with. That's why [...]

But anyway, we're getting off-topic :)
@KareemErgawy: I think the easiest solution would be to simply use Rust's idea with this [...]
Yes, that would be totally fine. For now, I ended up using bitwise operators as suggested by Arsene, which do the same trick in a less elegant way. The downside is that this only lets us read the words of a large variable, not write them, because the result is a copy rather than an actual alias. But I understand that aliasing would go against the design choice for more type safety, so this is totally fine. Thanks.
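The bitwise approach mentioned here can be sketched as follows in Rust. The helper names `byte_of` and `with_byte` are hypothetical, and byte 0 is taken to be the least significant byte. Reading is a shift and a mask; since the extracted byte is a copy, "writing" means building a new value with one byte replaced.

```rust
/// Returns byte `index` of `x` (0 = least significant) as an i8.
fn byte_of(x: i32, index: u32) -> i8 {
    assert!(index < 4);
    (((x as u32) >> (8 * index)) & 0xFF) as i8
}

/// Returns a copy of `x` with byte `index` replaced by `value`.
fn with_byte(x: i32, index: u32, value: i8) -> i32 {
    assert!(index < 4);
    let shift = 8 * index;
    // Clear the target byte, then OR in the new one.
    let cleared = (x as u32) & !(0xFFu32 << shift);
    (cleared | ((value as u8 as u32) << shift)) as i32
}

fn main() {
    let x: i32 = 200;
    assert_eq!(byte_of(x, 0), -56); // 200 = 0xC8, which is -56 as an i8
    assert_eq!(byte_of(x, 1), 0);

    // The original is untouched; the update lives in the returned value.
    let y = with_byte(x, 0, 0);
    assert_eq!(y, 0);
    assert_eq!(x, 200);
}
```

Unlike the `transmute_copy` version, the result of this variant does not depend on the target's byte order.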
I want to re-interpret a value as an array of values of a smaller type. For example, an i32 as an array of 4 i8s. Also, the source (big) and target (small) types are polymorphic so the size of the target array is not fixed (in terms of the number of elements rather than the total size in bytes). Can I do that?