Add Tiling Support to All CCT Kernels and Fix CCT Operators on Siracusa Platform for L2 #35
base: devel
cd2ee51 to 8afb9f3
Hi Run, great PR addressing lots of issues and building strong ground for every fp execution on PULPOpen! A few comments to address but no critical ones.
Deeploy/Targets/PULPOpen/Bindings.py
Outdated
@@ -120,6 +124,7 @@
MemoryManagementGeneration("L3.*"),
MemoryManagementGeneration("L2"),
MemoryManagementGeneration(),
ProfilingCodeGeneration()
Let's not enable that by default. This is only useful in the case of untiled execution (with `testRunner_Siracusa.py`), so let's add this pass only in this situation. I recommend adding an argument to the untiled test runner and adding this pass from `networkGenerate.py`.
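A minimal sketch of that suggestion, not the actual Deeploy code: it assumes a hypothetical `--profileUntiled` flag on the untiled runner, and the strings below are stand-ins for the real pass objects.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with an opt-in profiling flag (flag name is hypothetical)."""
    parser = argparse.ArgumentParser(description = "Sketch of an untiled test runner CLI")
    parser.add_argument("--profileUntiled",
                        action = "store_true",
                        help = "Append the profiling pass for untiled execution")
    return parser


def select_passes(profile_untiled: bool) -> List[str]:
    """Return the pass list; strings stand in for the real pass instances."""
    passes = [
        'MemoryManagementGeneration("L3.*")',
        'MemoryManagementGeneration("L2")',
        'MemoryManagementGeneration()',
    ]
    # The profiling pass is only added when the caller asks for it.
    if profile_untiled:
        passes.append("ProfilingCodeGeneration()")
    return passes


# Explicit argument lists keep the example independent of sys.argv.
default_args = build_parser().parse_args([])
profiled_args = build_parser().parse_args(["--profileUntiled"])
print(select_passes(default_args.profileUntiled))
print(select_passes(profiled_args.profileUntiled))
```

With this shape the default binding list stays unchanged, and only a run that passes the flag pays for the extra instrumentation.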
OK, I will fix it.
Fixed #35
I agree with the interface that you used (`CodeGenVerbosity`), but I don't like that the pass is in the PULPTiling pass. A small change and this will roll.
@@ -303,6 +305,7 @@ def generate_test(self):

command = f"python {generation_script} -d {self._dir_gen} -t {self._dir_test} -p {self._platform} {self.gen_args}"
command += self._argument_parser.generate_cmd_args()
print(command)
Remove this
Fixed in commit 206bdd3.
if verbose.untilingProfiling:
    ctxt, executionBlock = self.profiluntiling.apply(ctxt, executionBlock, name)
This pass is unrelated to tiling and should not be there. You should make an independent pass that does something only when the given flag is passed (through `CodeGenVerbosity`). Also, "untiling" does not mean anything in this context; let's call it profileUntiled.
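As a rough illustration of what such an independent, flag-gated pass could look like: `VerbositySketch`, `ProfileUntiledSketch`, the `untiledProfiling` field, and the marker comments are all hypothetical stand-ins, not Deeploy's real classes or generated code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VerbositySketch:
    """Hypothetical stand-in for the verbosity object carrying the flag."""
    untiledProfiling: bool = False


class ProfileUntiledSketch:
    """Standalone pass: wraps a block with measurement markers only when asked."""

    def apply(self, ctxt: dict, executionBlock: List[str], name: str,
              verbose: VerbositySketch) -> Tuple[dict, List[str]]:
        # Without the flag the pass is a no-op and returns its inputs untouched.
        if not verbose.untiledProfiling:
            return ctxt, executionBlock
        # Placeholder comments mark where real cycle-counter code would go.
        wrapped = ([f"// begin cycle measurement: {name}"] + list(executionBlock) +
                   [f"// end cycle measurement: {name}"])
        return ctxt, wrapped


block = ["kernel_call();"]
_, untouched = ProfileUntiledSketch().apply({}, block, "conv0", VerbositySketch())
_, profiled = ProfileUntiledSketch().apply({}, block, "conv0",
                                           VerbositySketch(untiledProfiling = True))
print(untouched)
print(profiled)
```

Keeping the gate inside the pass itself means the tiling pass never needs to know about profiling.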
Fixed in commit 206bdd3. Added a new PULPProfileUntiled pass.
Description
This update improves CCT's kernel tiling support and resolves multiple operator issues on the Siracusa platform. The new kernel templates for convolution and max-pooling integrate padding and adopt an HWC layout. Additionally, key tiling constraints have been introduced, fixing several execution issues in GEMM, MatMul, and float-based computations. The layers have also been refined to handle bias broadcasting correctly, ensuring accurate output shape inference.
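To illustrate what output shape inference with a broadcasted bias involves, here is a small sketch that assumes NumPy-style broadcasting rules; the helper name `broadcast_shape` is hypothetical and this is not the PR's implementation.

```python
from itertools import zip_longest
from typing import Sequence, Tuple


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> Tuple[int, ...]:
    """Infer the output shape of an elementwise op under NumPy-style broadcasting."""
    out = []
    # Align shapes from the trailing dimension; missing leading dims count as 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue = 1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"Shapes {tuple(a)} and {tuple(b)} are not broadcastable")
        out.append(max(da, db))
    return tuple(reversed(out))


# A per-channel bias added to a 3-D activation keeps the activation's shape.
print(broadcast_shape((1, 16, 64), (64,)))
```

Under these rules the output takes the larger extent in every dimension, so a bias of shape `(64,)` does not shrink or reorder the activation's dimensions.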
Added
Float Bindings, Tilers for Pulp Target
Float Convolution, MaxPool Parser, Template, Kernel
Tiling Constraints: `conv`, `gather`, and `layernorm`, and existing constraints for other kernels.
Fixed
CycleMeasure Pass for Siracusa Untiled Profiling
GEMM Tiling Constraints Issue: `transA` and `transB` not supported.
MatMul Multi-Dimensional Input Issue
Add Layer for Broadcasted Bias
`float32` with `f` caused `inf` errors.
Changed
`add` and `gemm` to avoid unnecessary broadcasting.
PR Merge Checklist
`devel` commit and pointing to `devel`.
The `CHANGELOG.md` file has been updated.