Problem in getting optimal dataflow #253
Here is the constrained arch.yaml file
And here is the prob.yaml file
The problem with a completely unconstrained search is that the space is so vast that you cannot afford to run an exhaustive search. Even if you are comparing heuristic searches on constrained-vs-unconstrained mapspaces, the probability of arriving at a near-optimal solution (for that space) is much higher with the smaller constrained space. It's a tough problem. Better heuristics help. Georgia Tech's GAMMA mapper was ported to work with Timeloop, but is not actively supported. In our experience, constraining the mapspace seems to be the best strategy. That said, for small-ish architectures you should be able to run an exhaustive search (set algorithm to |
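To give a rough sense of how quickly an unconstrained mapspace grows, here is a toy count in Python. It only counts the ways to split each loop bound into ordered per-level tiling factors and ignores loop permutations, bypass choices, and spatial/temporal splits, so it is a simplified illustration and not Timeloop's actual mapspace size. The loop bounds and the number of levels are made-up values, not taken from the prob.yaml or arch.yaml in this thread.

```python
from math import comb, prod

def prime_exponents(n):
    """Return the exponents in the prime factorization of n."""
    exps, p = [], 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e:
            exps.append(e)
        p += 1
    if n > 1:
        exps.append(1)
    return exps

def ordered_factorizations(n, levels):
    """Number of ways to write n as an ordered product of `levels` positive integers."""
    # Each prime's exponent is distributed across the levels independently
    # (a stars-and-bars count per prime).
    return prod(comb(e + levels - 1, levels - 1) for e in prime_exponents(n))

# Hypothetical conv2d loop bounds and a hypothetical 4-level storage hierarchy.
bounds = {"C": 64, "M": 128, "P": 56, "Q": 56, "R": 3, "S": 3}
levels = 4

tilings = prod(ordered_factorizations(b, levels) for b in bounds.values())
print(f"index-factorization choices alone: {tilings:,}")
```

Even in this simplified model the count is very large before loop orderings are considered, which is why constraints that pin some of the factors shrink the space so effectively.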
Thanks for your reply. I apologize for causing you concern; my phrasing was not accurate. When I mentioned an "incorrect dataflow," I actually meant a dataflow that is too far from the optimal solution and offers no practical reference value for the design, not a genuinely incorrect dataflow. I have tried to run an exhaustive search by setting algorithm to |
I have tried setting The arch.yaml and prob.yaml files I am running are the ones I previously provided. Could this situation be due to my architecture being too complex, causing the exhaustive algorithm to not proceed? |
0 is the magic number that causes the search algorithms to ignore those termination conditions. That leaves exhausting the mapspace as the only criterion for the mapper threads to terminate. See here:
Since these are unsigned vars, setting them to -1 probably caused them to overflow to UINT_MAX, which would effectively have the same result. So I am surprised (and concerned) you saw different behavior with -1 vs. 0. Could you please confirm that this is true? If the exhaustive search is triggering successfully, it's probably not stuck but just stops reporting updates because it's not seeing any better mappings. You can turn on It's certainly possible that the mapspace happens to be laid out in such a pathologically poor way that the exhaustive search only gets to the good mappings much later. One suggestion is to add a
This will not reduce the mapspace but will early-reject mappings that don't have enough spatial fanout, so the model doesn't waste time evaluating them.
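For readers following along, the two points above about termination settings can be sketched in Python. The first part shows how -1 wraps when stored in an unsigned integer, assuming a 32-bit width. The second is a hypothetical termination check in which 0 disables a limit. The names `limit_reached`, `evaluated`, and `search_limit` are illustrative only and are not Timeloop's actual identifiers or logic.

```python
import ctypes

# -1 stored in a 32-bit unsigned integer wraps to the maximum value.
UINT_MAX = 2**32 - 1
wrapped = ctypes.c_uint32(-1).value
print(wrapped == UINT_MAX)

def limit_reached(evaluated, search_limit):
    """Hypothetical check: a limit of 0 means 'never stop on this condition'."""
    if search_limit == 0:
        return False
    return evaluated >= search_limit

# With 0 the limit is ignored; with the wrapped -1 it is practically unreachable.
print(limit_reached(10**6, 0))
print(limit_reached(10**6, wrapped))
```

Under these assumptions both settings keep the search running until the mapspace is exhausted, which is why a difference in behavior between -1 and 0 would be unexpected.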
I have retried a search with This is what the
It seems that there was only one mapping in every thread, and each one of them was invalid. The diagnostic returns one fail class, the Fanout fail class.
Here is my mapper.yaml file:
Could you please check whether my settings match what you suggested? Are any of them unreasonable in a way that would lead to the outcome above?
I retried a search with However, by repeating the search, it has indeed been confirmed that setting |
Understood. Thank you for helping us by re-verifying the behavior. Let me look into it. But in the meantime, I recommend proceeding with constrained searches.
Thank you for your patience and guidance. I'm looking forward to your further reply.
Hi,
Thank you for developing such an amazing tool and providing a clear tutorial for it. I'm currently designing a dataflow for a conv 2d workload and have encountered some challenges in achieving the optimal dataflow.
I found that when no constraints are provided in the arch.yaml file, the Timeloop mapper returns only a suboptimal or even incorrect dataflow, with PE utilization = 0.3 and pJ/compute = 24.242. When I give explicit constraints, it returns a dataflow with PE utilization up to 0.96 and pJ/compute = 9.475.
I really wonder why arch.yaml with no constraints cannot find a better dataflow. My initial thought was that an unconstrained arch.yaml would generate a larger mapspace, potentially encompassing the mapspace created by a constrained arch.yaml. However, the results suggest otherwise.
Could you provide insight into why an unconstrained arch.yaml fails to identify more optimal dataflows? Any guidance or suggestions would be greatly appreciated.