-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retire struct array. #84
base: master
Are you sure you want to change the base?
Conversation
I appreciate that this change will remove the need for one dereference. However, this change will make it impossible to concatenate two of these arrays without copying, since the metadata is inline with the data. I think we can get (mostly) rid of the dereference without moving the metadata. |
Right now arrays are a struct with a single pointer to a memory blob where the data is stored. How do you concatenate two arrays without copying data in the current representation? |
One way would be using the int32_t blob[128]; // Target buffer
struct array arr1;
struct array arr2;
struct array out;
arr1.buffer = &blob[0];
arr1.length = 64;
arr1.elemSize = 4;
arr1.bytes = 256;
arr2.buffer = &blob[64];
arr2.length = 64;
arr2.elemSize = 4;
arr2.bytes = 256;
out.buffer = &blob[0];
out.length = 128;
out.elemSize = 4;
out.bytes = 512; Admittedly, no Feldspar code can currently generate this C code. I would like to keep the The reason is that in the generated body we are free to handle the metadata as we see fit, but in the interface we have to adhere to some protocol. By separating the generated code from the protocol we enable future support of other protocols than |
I'm not sure in which end I should start, so here's a random access dump of things. We definitely need to adhere to some protocol for our interface to the world, but that protocol could equally well be specified through accessor functions in feldspar_array.h (that we control) instead of a data format that might include padding. We already have a few of these accessor functions such as at and freeArray, and we could complement with getData for returning the data pointer and something resembling at for accessing the header information for arrays. There's a pretty deep assumption built into the compiler that there is only a single owner of a block of memory, and until my pointer-for-scanl-patch was merged the guarantees were even stronger: the owner is the only accessor of the memory and it can be freed at will. There will be a lot of work required to get rid of this assumption. I have code for putting array literals into NativeArrays, but Arrays and NativeArrays don't blend that well right now since we need to define copyNativeArrayToArray(), copyArrayToNativeArray() and the respective copy*ArrayLen functions as well. The code in the compiler for handling nativearrays versus arrays also became quite yucky--if nativearrays and arrays were the same (except for the header) they could be treated as a the same thing throughout the compiler. The code you've written looks elegant (except that the blob is stack allocated?), but it requires a lot of things to align to work out in practice. The compiler needs to realize that both arr1 and arr2 should share blob with out, but arr2 should start at a specific offset into the blob. Furthermore the compiler needs to realize that the blob escapes the scope, and that only the header for arr1 and arr2 can be freed. Even if things aligned perfectly and your scenario works out: how does that give us single pointer dereferences for element access? I think it's a fair assumption that out is the same size as the input, so it's not just measurement noise to have two pointer dereferences compared to a single dereference in the final loop that writes the result to out. |
Having thought a bit more about blobs: the blob in your example is typed, and having typed blobs in the compiler would simplify things a bit--maybe to the extent that we could retire the NativeElem constructor in Expr and use ArrayElem everywhere. If we/you want to keep the struct array around it should be easy to automatically generate the struct declaration for the right datatype just like we generate struct declarations for tuples. If we want to keep a single struct array around as the interface then the void pointer needs to stay, but we could still have internal blobs/structs that are typed (so the blob is attached to the struct after the computation instead of before). Furthermore, it should then be simple to keep multiple array representations around since the implementation would be nearly completely localized to a few accessor functions in the C files--users like us who need single-dereference element access could decide that we sacrifice fast concatenation in favor of element access performance, and those programs that concatenate a lot of arrays could make the other choice. |
I did the work on removing the NativeElem constructor which turned out to be quite easy; it's in pull request #88. That should be an improvement regardless of array representation.That simplification does not solve any of the other issues mentioned though, so we still need to define the copy_Array_ functions for every array representation. |
Replace the current pointer to struct array+pointer to memory blob with just a pointer to a memory blob.