Proper way to stop nsys profiling for SGLang serving #3511

Open
chenzhengda opened this issue Feb 12, 2025 · 2 comments
@chenzhengda

When I profile SGLang serving with nsys profile, the profile data is often not saved properly if I terminate the server with Ctrl+C. According to the documentation (https://docs.sglang.ai/references/benchmark_and_profiling.html#profile-with-nsight), the recommended command for profiling a server is:

nsys profile --trace-fork-before-exec=true --cuda-graph-trace=node -o sglang.out --delay 60 --duration 70 python3 -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --disable-radix-cache

Could you please clarify: what is the recommended way to stop the profiling process so that the profile data is saved?

@jhinpan jhinpan self-assigned this Feb 12, 2025
@jhinpan jhinpan added the help wanted Extra attention is needed label Feb 12, 2025
@jhinpan
Collaborator

jhinpan commented Feb 12, 2025

cc @Fridge003. Could you help take a look? Thanks!

@Fridge003
Collaborator

Fridge003 commented Feb 12, 2025

nsys profile --trace-fork-before-exec=true --cuda-graph-trace=node -o sglang.out --delay 60 --duration 70 python3 -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --disable-radix-cache

Hi @chenzhengda, if you set the --delay and --duration parameters appropriately, nsys stops profiling automatically. --delay is the time, in seconds, that nsys waits before it starts profiling, and --duration is the length of the profiling window in seconds, which should be greater than the model's running time.

This feature is also discussed in #3049.
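
To make the role of the two timing flags concrete, here is a minimal sketch that composes the same nsys command from explicit delay and duration values. The helper name `build_nsys_command` and its parameters are hypothetical and exist only for illustration; the flags and values are the ones quoted in this thread. The script only builds and prints the command, it does not launch nsys.

```python
import shlex
from typing import List


def build_nsys_command(delay_s: int, duration_s: int,
                       output: str = "sglang.out") -> List[str]:
    """Compose the nsys invocation quoted in this thread (hypothetical helper).

    delay_s:    seconds nsys waits before it starts collecting.
    duration_s: seconds of collection; per the reply above, this should
                exceed the running time of the workload being profiled.
    """
    if delay_s < 0 or duration_s <= 0:
        raise ValueError("delay must be >= 0 and duration must be > 0")
    return [
        "nsys", "profile",
        "--trace-fork-before-exec=true",
        "--cuda-graph-trace=node",
        "-o", output,
        "--delay", str(delay_s),
        "--duration", str(duration_s),
        "python3", "-m", "sglang.launch_server",
        "--model-path", "meta-llama/Llama-3.1-8B-Instruct",
        "--disable-radix-cache",
    ]


# Values taken from the command in this thread; adjust them to your workload.
cmd = build_nsys_command(delay_s=60, duration_s=70)
print(shlex.join(cmd))
```

The 60 and 70 second values are just the ones from the documentation command; a slower model load or a longer benchmark would presumably need a larger delay or duration.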
