Updating from .NET 8 SDK to .NET 9 SDK Preview 4 causes dotnet test to hang forever #5091

Closed
Youssef1313 opened this issue Jun 11, 2024 · 22 comments

Comments

@Youssef1313
Member

Description

So far, I don't have an isolated repro. But it happens for Uno Platform Wasm UI tests that we execute via dotnet test. The tests are passing, but dotnet test isn't terminating.

Upon investigation, I found:

[Two screenshots; images not preserved]

So it looks like VSTestTask2 somehow isn't terminating. It's stuck there forever. Setting the MSBUILDENSURESTDOUTFORTASKPROCESSES environment variable to 1 works around the problem for now.
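For anyone scripting this workaround, here is a minimal sketch in Python of launching a command with the variable set only for that one invocation. The helper names (`workaround_env`, `run_with_workaround`) are hypothetical; the only detail taken from this thread is the variable name and its value of 1.

```python
import os
import subprocess

# Variable name and value come from the issue description above.
WORKAROUND_VAR = "MSBUILDENSURESTDOUTFORTASKPROCESSES"


def workaround_env(base=None):
    """Return a copy of the environment with the workaround variable set to "1".

    `base` defaults to the current process environment; the original mapping
    is never modified.
    """
    env = dict(os.environ if base is None else base)
    env[WORKAROUND_VAR] = "1"
    return env


def run_with_workaround(command):
    """Run `command` (a list of arguments) with the workaround applied.

    Returns the exit code of the command.
    """
    return subprocess.run(command, env=workaround_env(), check=False).returncode
```

Assuming `dotnet` is on the PATH and the current directory contains a test project, `run_with_workaround(["dotnet", "test"])` would run the tests with the variable set, without changing the environment of the calling shell.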

Steps to reproduce

We haven't yet got a minimal repro.

Expected behavior

Actual behavior

Diagnostic logs

logs.zip

Environment

@nohwnd
Member

nohwnd commented Jun 11, 2024

vstesttask2 task starts an exe, and waits for it to exit. It will sit there as long as the exe is running. When you look in test explorer, do you see vstest.console / dotnet running under this process? Do you also see testhost running under the vstest.console process?

@Youssef1313
Member Author

> vstesttask2 task starts an exe

Is it testhost.exe? That one terminates correctly.

@Youssef1313
Member Author

I have a dump of the dotnet process, in case that helps. I'll send it to you.

@nohwnd
Member

nohwnd commented Jun 14, 2024

I've got your dump. It looks like the tool task is simply waiting for a child process to exit. The child process is vstest.console.

In the vstest.console logs I can see that it exited, but the process ID there differs from the one in the dump file, so the two are probably from different runs. That's not a big problem, but could you double-check that vstest.console has stopped under the task? If you put a long wait in your test, you should see one under some MSBuild node, and then it should exit.

You could also try using -nodereuse:false. That disables the use of "cached" MSBuild nodes and starts a new one for this run. If it still gets stuck, it is much easier to see which process is stuck, because they all run under the terminal process.

@Youssef1313
Member Author

@nohwnd Oh. I think the logs I sent to @Evangelink were from a different run than the one I took the dump from. However, when I saw the task waiting for a process, I couldn't find that process ID in Task Manager at all, so it was strange that it was waiting for a process that had somehow already exited.

I'll try to delay the test and see if I can find more information.

@nohwnd
Member

nohwnd commented Jun 14, 2024

That is indeed weird, and in that case it would be an MSBuild bug (not that I am trying to ditch responsibility, but we are fully relying on ToolTask to do this). Let me know what you find, and I will talk with the MSBuild team if there is a problem in ToolTask.

@Youssef1313
Member Author

Great. I'll double-check my analysis, try to get more info, and get back to you.

@tmds
Contributor

tmds commented Jun 20, 2024

We ran into this issue in dotnet/sdk#41198 CI.

The CI test step would time out after 30 min on each attempt.
After adding MSBUILDENSURESTDOUTFORTASKPROCESSES=1 the test step finishes in less than 4 min.

cc @ViktorHofer @rainersigwald @Forgind

nohwnd added this to the terminal logger milestone Jun 20, 2024

@Forgind
Member

Forgind commented Jun 20, 2024

On a hunch, can someone try setting MSBUILDNODEWINDOW to 1 to see if that also resolves the problem?

@Forgind
Member

Forgind commented Jun 20, 2024

Actually, I guess I can do that. I'll try to get that started later today.

@nohwnd
Member

nohwnd commented Jun 21, 2024

This is now on the work list for the MSBuild team and me to fix. We still don't know where it is happening, though, so any additional info or a repro would be very welcome. Double-checking whether vstest.console is running while the hang is observed would be especially helpful, as would diagnostic logs of dotnet test.

@MichalPavlik

There is a quirk with the WaitForExit() method when the parent process reads stdout asynchronously. If the child process starts a grandchild process, the parent's WaitForExit() also waits for the grandchild to exit. It blocks even after the child process has exited.

I'm not saying it's the root cause in this situation, but it's possible. Our ToolTask uses WaitForExit(), so I can try to avoid this situation on our side.
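The general mechanism is not specific to .NET: a grandchild that inherits the child's redirected stdout keeps the write end of the pipe open, so a parent that reads until end-of-stream keeps waiting after the child is gone. As a rough illustration only, here is a Python sketch (assuming a POSIX-like system; this is not MSBuild's ToolTask code, and the names `CHILD`, `GRANDCHILD`, and `time_read_to_eof` are made up for the example).

```python
import subprocess
import sys
import time

# Grandchild: just sleeps. It inherits the child's stdout, which is the
# write end of the pipe created by the parent below.
GRANDCHILD = "import time; time.sleep(3)"

# Child: starts the grandchild without waiting for it, prints, and exits.
CHILD = (
    "import subprocess, sys; "
    f"subprocess.Popen([sys.executable, '-c', {GRANDCHILD!r}]); "
    "print('child done')"
)


def time_read_to_eof():
    """Start the child and read its stdout until end-of-stream.

    Returns (captured output, elapsed seconds). Although the child exits
    almost immediately, the read only finishes once the grandchild has
    exited too, because the grandchild still holds the pipe open.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    output, _ = proc.communicate()  # reads to EOF, then reaps the child
    return output, time.monotonic() - start
```

Calling `time_read_to_eof()` should take roughly as long as the grandchild's sleep, even though the child itself finishes almost immediately.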

@Youssef1313
Member Author

Not sure if that would be related to dotnet/runtime#103384

@MichalPavlik

What I described relates to the issue you mentioned. @Youssef1313, could you please check whether there is a process that was started by the testhost, and terminate it? If that unblocks MSBuild, then the problem is in our codebase and should be fixed. The workaround is to use a different overload of the WaitForExit method.
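The comment does not say which overload is meant, so the following is only an analogy in Python (assuming a POSIX-like system): wait for the child process itself to exit rather than for its output stream to reach end-of-stream. The names `CHILD`, `GRANDCHILD`, and `time_wait_for_exit_only` are hypothetical.

```python
import subprocess
import sys
import time

# Grandchild sleeps while holding the inherited stdout pipe open.
GRANDCHILD = "import time; time.sleep(3)"

# Child starts the grandchild, prints one short line, and exits right away.
CHILD = (
    "import subprocess, sys; "
    f"subprocess.Popen([sys.executable, '-c', {GRANDCHILD!r}]); "
    "print('child done')"
)


def time_wait_for_exit_only():
    """Wait for the child's exit only, ignoring the state of the pipe.

    Returns (exit code, elapsed seconds). This is safe here because the
    child writes very little; a child producing a lot of output could
    fill the pipe and block if nobody reads it.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    exit_code = proc.wait()  # returns as soon as the child process exits
    elapsed = time.monotonic() - start
    proc.stdout.close()  # stop listening; the grandchild may still be running
    return exit_code, elapsed
```

Here the wait should return well before the grandchild's sleep ends. The trade-off is that output still in flight may be lost.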

@Youssef1313
Member Author

I may not be able to re-test this soon-ish, but IIRC, when WaitForExit was stuck, I wasn't able to find a matching process id that's open. So it felt like that process already terminated but WaitForExit was still blocking and didn't return.

@MichalPavlik

Yes, if testhost started another process with redirected output, then our WaitForExit will wait for the grandchild process to exit. So far I don't have any other idea of what is happening.

@MichalPavlik

I implemented a workaround, but I had to revert my changes because they caused a problem with the exit code :( Still, it would be great to know whether there is a hanging grandchild process. In that case, the testing team could make a change here so that they don't exit before their child process terminates.

@Sebazzz

Sebazzz commented Nov 19, 2024

This indeed happens when the test itself creates child processes (for instance: MSSQL localdb, or webdrivers/browsers) and fails to terminate these child processes. dotnet test will then indeed hang.

Still happens with SDK 9.0.100.
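One way to avoid leaving such helper processes behind is to tie their lifetime to the test's setup and teardown. The sketch below shows the pattern in Python purely as an illustration; the tests discussed here are .NET tests, and `helper_process` and `grace_seconds` are hypothetical names.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def helper_process(args, grace_seconds=5.0):
    """Start a helper process and make sure it is gone when the block exits.

    The process is asked to terminate first; if it has not exited within
    `grace_seconds`, it is killed. Cleanup runs even if the test body raises.
    """
    proc = subprocess.Popen(args)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running
            proc.terminate()
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

A test would wrap its body in `with helper_process([...]) as proc:` so that the helper cannot outlive the test and keep the runner's output pipe open.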

@nohwnd
Member

nohwnd commented Nov 19, 2024

Looks to be connected to capturing process output:

This change adds an option to fully disable capturing standard output: #4998. Setting it unblocked my run.

The option is the VSTEST_DISABLE_STANDARD_OUTPUT_CAPTURING=1 environment variable.

But we've been capturing the output like this for a long time; we were just redirecting it to null. So this should be a workaround, not the cause of the issue.

https://github.com/nohwnd/Cuemon/blob/3126ca48266b9ad493524a06ea9b77c79e907a30/.github/workflows/pipelines.yml#L157-L158

@nohwnd
Member

nohwnd commented Nov 19, 2024

Actually, no matter what I do, I cannot repro the hang when running on 9.0.100, so I am not sure whether the option above did anything.

@Youssef1313
Member Author

I can confirm this is now working for Uno: unoplatform/uno#18841

@nohwnd
Member

nohwnd commented Nov 19, 2024

OP confirmed this is working for them now.

@Sebazzz do you have a repro of your problem please? If it still does repro, please start a new issue.
