-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rocr/aie: AIE Queue Processing #251
base: amd-staging
Are you sure you want to change the base?
Conversation
#19) * Using BOs of type BO_SHMEM instead of BO_CMD for kernarg memory region * Changing test to use kernarg memory region for operand allocation
…er of packets the queue supports. (#23) * Fixed workarounds in the AIE soft queue regarding the size of the command chain and the individual commands. * Added the functionality to aie_hsa_dispatch_test.cc to query the size of the AIE queue and issue the maximum number of packets it supports. * Some additional small fixes on aie_hsa_dispatch_test.cc test. * Adding links to driver source
* Freeing the commands and the command chain created during dispatch * Unmaping and closing cmd BOs as well as freeing the queue ring buffer
@@ -316,6 +317,16 @@ hsa_status_t XdnaDriver::InitDeviceHeap() { | |||
return HSA_STATUS_SUCCESS; | |||
} | |||
|
|||
hsa_status_t XdnaDriver::GetHandleMappings(std::unordered_map<uint32_t, void*> &vmem_handle_mappings) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could return a const std::unordered_map<uint32_t, void*>&
and avoid the copy here. Do we ever expect the function to fail?
@@ -117,7 +117,8 @@ hsa_status_t XdnaDriver::GetAgentProperties(core::Agent &agent) const { | |||
return HSA_STATUS_ERROR; | |||
} | |||
|
|||
aie_agent.SetNumCols(aie_metadata.cols); | |||
// Right now can only target N-1 columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need some additional context here on why we can only target N-1 columns (e.g., current hardware limitations, something else).
return HSA_STATUS_SUCCESS; | ||
} | ||
|
||
hsa_status_t XdnaDriver::GetFd(int &fd) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we can just return the fd.
@@ -100,7 +128,7 @@ class AieAqlQueue : public core::Queue, | |||
void *value) override; | |||
|
|||
// AIE-specific API | |||
AieAgent &GetAgent() { return agent_; } | |||
AieAgent &GetAgent() const { return agent_; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're giving non-const access to agent_
then this function should be non-const.
constexpr int NON_OPERAND_COUNT = 6; | ||
|
||
// Used to transform an address into a device address | ||
constexpr int DEV_ADDR_BASE = 0x04000000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a fixed-width integer types for these?
cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]); | ||
// clang-format off | ||
cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter + 1] = | ||
(uint64_t)vmem_handle_mappings[cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]] >> 32 & 0xFFFFFFFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it expected that if an entry does not exist in vmem_handle_mappings
, it should be replaced with a nullptr
? unordered_map::operator[]
will default construct a value if the key-value does not exist.
If not, then vmem_handle_mappings
should be taked by const&
and use find()
instead of []
cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]); | ||
// clang-format off | ||
cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter + 1] = | ||
(uint64_t)vmem_handle_mappings[cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]] >> 32 & 0xFFFFFFFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(uint64_t)vmem_handle_mappings[cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]] >> 32 & 0xFFFFFFFF; | |
static_cast<uint64_t>(vmem_handle_mappings[cmd_pkt_payload->data[operand_starting_index + 2 * operand_iter]]) >> 32 & 0xFFFFFFFF; |
// Transform the instruction sequence address into device address | ||
cmd_pkt_payload->data[CMD_PKT_PAYLOAD_INSTRUCTION_SEQUENCE_IDX] = | ||
DEV_ADDR_BASE | | ||
(reinterpret_cast<uint64_t>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(reinterpret_cast<uint64_t>( | |
(static_cast<uint64_t>( |
if (ioctl(fd, DRM_IOCTL_AMDXDNA_GET_BO_INFO, &cmd_bo_get_bo_info)) | ||
return HSA_STATUS_ERROR; | ||
|
||
*cmd = static_cast<amdxdna_cmd *>(mmap(nullptr, create_cmd_bo.size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should check for MAP_FAILED
here.
switch (pkt->opcode) { | ||
case HSA_AMD_AIE_ERT_START_CU: { | ||
std::vector<uint32_t> bo_args; | ||
std::vector<uint32_t> cmd_handles; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we know the sizes for those, you can call reserve
on each to avoid resizes.
This PR add's functionality to submit commands to the NPU xdna driver via a soft queue.