NM Profiler : Update visualize_trace.py #370
ced0868 to 2b9a9f3
def shorten_plot_legend_strings(legend, max_char_len: int):
    for t in legend.get_texts():
        t.set_text(
            trim_string_back(abbreviate_known_names(t.get_text()),
                             max_char_len))
thank you for this :)
Lucas had this already :)
I see now that it was there before this PR. The "thank you" still stands given how wide some of the previous plot legends were :)
'''


def group_trace_by_operations(trace_df: pd.DataFrame) -> pd.DataFrame:
nice this is super useful, thanks!
LGTM, thanks for improving this!
The Y-label in the
Hey Lucas. Yes, I noticed that and fixed it. Sorry, I should have mentioned it somewhere.
Migrated all changes, including all of the layer-by-layer profiling code, to neuralmagic#3
Update visualize trace utility.
Remove the ignore_sampler arg, and instead add a fold_json_node arg. This argument collapses the specified JSON tree so the plot has less clutter.

Usage:
python3 neuralmagic/tools/profiler/visualize_trace.py --json-trace profiler_fp8_trace.json --output-directory ./kernel --level kernel
This command produces 2 output files, kernel/prefill.png and kernel/decode_steps.png, which are stacked-bar plots. In these plots the operations are grouped together by high-level concepts such as gemms, attention, rms-norm, etc.

python3 neuralmagic/tools/profiler/visualize_trace.py --json-trace profiler_fp8_trace.json --output-directory ./module --level module --plot-metric pct_cuda_time
This command also produces 2 output files, module/prefill.png and module/decode_steps.png, which are stacked-bar plots. In these plots the bars sum to 100, as the requested plot metric is pct_cuda_time.
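
For readers who want a feel for what folding a node and plotting pct_cuda_time mean, here is a rough sketch in plain Python. It is not the code in visualize_trace.py: the node layout (name, cuda_time_us, children) and the helpers fold_node, leaf_times and to_percent are hypothetical names made up for this illustration, and the real script may well structure things differently.

```python
# Illustrative sketch only; NOT the implementation in visualize_trace.py.
# Assumed (hypothetical) node layout:
#   {"name": str, "cuda_time_us": float, "children": [node, ...]}


def total_time(node: dict) -> float:
    """Sum cuda_time_us over all leaves under node (or node itself if it is a leaf)."""
    children = node.get("children", [])
    if not children:
        return float(node.get("cuda_time_us", 0.0))
    return sum(total_time(child) for child in children)


def fold_node(node: dict, fold_name: str) -> dict:
    """Return a copy of the tree with every subtree named fold_name collapsed to one leaf."""
    if node["name"] == fold_name:
        return {"name": fold_name, "cuda_time_us": total_time(node), "children": []}
    return {
        "name": node["name"],
        "cuda_time_us": node.get("cuda_time_us", 0.0),
        "children": [fold_node(c, fold_name) for c in node.get("children", [])],
    }


def leaf_times(node: dict) -> dict:
    """Map each leaf name to its accumulated time; these would be the bar segments."""
    out: dict = {}

    def visit(n: dict) -> None:
        children = n.get("children", [])
        if not children:
            out[n["name"]] = out.get(n["name"], 0.0) + float(n.get("cuda_time_us", 0.0))
        for child in children:
            visit(child)

    visit(node)
    return out


def to_percent(times: dict) -> dict:
    """Rescale absolute times so that the segments of one bar sum to 100."""
    total = sum(times.values())
    if total == 0:
        return {name: 0.0 for name in times}
    return {name: 100.0 * t / total for name, t in times.items()}


# Made-up trace data, purely for demonstration.
trace = {
    "name": "model", "cuda_time_us": 0.0,
    "children": [
        {"name": "attention", "cuda_time_us": 0.0, "children": [
            {"name": "qkv_gemm", "cuda_time_us": 30.0, "children": []},
            {"name": "softmax", "cuda_time_us": 10.0, "children": []},
        ]},
        {"name": "rms-norm", "cuda_time_us": 10.0, "children": []},
    ],
}

folded = fold_node(trace, "attention")
pct = to_percent(leaf_times(folded))
print(pct)
```

With the made-up data above, folding attention leaves two segments instead of three, and the percentage view rescales them so the bar totals 100.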