Does V4 work faster than the previous version? #126

Open
ertan-retinize opened this issue Aug 17, 2022 · 22 comments

Comments

@ertan-retinize

Hello there !

I was using the previous version's .exe in my Unity application to create approximate colliders for my input objects. I have now gone through the v4 docs and switched to it, and it runs a lot slower than before.
I'm using the default values for the .exe and didn't even change any parameters.

So for example I'm using character X:
VHACD in Unity (no .exe ): 3 min 23 seconds
VHACD exe (3.1) : 31 seconds
VHACD .exe V4.0 : 4 min 29 seconds.

What am I doing wrong?

@jratcliff63367
Collaborator

Can you please provide the asset you are testing with and the parameters you used (if not the default ones)? In general 4.0 is faster than 3.1 but you might be running into a case where 3.1 is exiting earlier. Also, I have never seen VHACD take four and a half minutes. That doesn't sound right to me either. Once I have the wavefront OBJ I can inspect what is going on in greater detail.

@ertan-retinize
Author

ertan-retinize commented Aug 17, 2022

Hey
Just realised that my phrasing could be better :)
It's not that I'm sending one object; I'm sending lots of small objects one by one, and their total completion time is 4 min 30 sec, whereas before it took 31 seconds (see the first comment for the comparison).
I'm using default parameters.
So, in this case, do you want all of those small OBJ files?

@jratcliff63367
Collaborator

Hmmm...this is a significantly different problem. Perhaps there is some overhead when running it many times that I haven't noticed before. Yes, I guess I would need all of the small OBJ files then. This could take a while to diagnose, but I will try to look into it. I'm doing a lot of traveling the next two months so I may not have a lot of time to dig into it.

@ertan-retinize
Author

ertan-retinize commented Aug 17, 2022

I think the problem is actually with v4. I know you are saying that it is faster and more robust, but let me show you one piece of my model processed with v3.1 and v4.

Running the model with V4 took approximately 15 seconds and created an object with the details below:
image

Running the same model with V3.1 took less than a second and created an object with the details below:
image

I used default parameters for both versions.

I couldn't add the model here because the file type isn't supported.

I hope this helps you understand me better. Just like in the example above, all of the pieces take longer to process and produce more detailed results (more tris and verts) than with V3.

@jratcliff63367
Collaborator

Can you provide this as a wavefront OBJ file? You can always just upload it to Google drive and share a link too.

@ertan-retinize
Author

@jratcliff63367
Collaborator

I can confirm that it does take longer. According to the log it appears that a lot of the time is being spent in computing the merge cost matrix, so I can investigate if I can make that go faster. However, I won't get a solution for that right away. If you would like to get a faster result you could set the voxel resolution down to say 100,000 and maybe set the error threshold to 5%. That will make it go faster but you may or may not be happy with the results.
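For reference, the faster settings suggested above could be passed on the command line roughly as in the sketch below. Treat the details as assumptions: `-r` (voxel resolution) and `-e` (volume error percent) are assumed to match the usage text of the v4 test executable, and the executable and mesh names are placeholders, so check the usage output of your own build before relying on them.

```python
from typing import List


def build_vhacd_command(exe: str, obj_path: str,
                        resolution: int = 100_000,
                        error_percent: float = 5.0) -> List[str]:
    """Build an argument list for a V-HACD test executable.

    ASSUMPTION: '-r' sets the voxel resolution and '-e' the allowed volume
    error percentage; confirm against the usage text of your own build.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    if not 0.0 < error_percent <= 100.0:
        raise ValueError("error_percent must be in (0, 100]")
    return [exe, obj_path, "-r", str(resolution), "-e", str(error_percent)]


# Placeholder names; substitute your own executable and mesh.
cmd = build_vhacd_command("TestVHACD.exe", "hips.obj")
print(" ".join(cmd))
```

Returning a list rather than a single string means it can be handed directly to `subprocess.run` without shell quoting issues.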

Your sample file is a hollow object and convergence in that use case is much different than for a solid object.

For example, try using the 'mite.obj' sample file that is provided in the 'meshes' sub-folder. That's a very high triangle count complex object but version 4.0 makes quick work of it.

It does look like I have more optimization work I still need to do though.

@ertan-retinize
Author

Thanks a lot for the confirmation. I don't think I'll switch to V4 for now, even though it looks good and has been totally rewritten. I guess it doesn't suit my models yet.
Please let me know when you have time to diagnose/fix this so that I can use the new version. Also let me know if you need any more info that would help.

@jratcliff63367
Collaborator

jratcliff63367 commented Aug 17, 2022

Ok, I figured out the issue. A while back I changed the default maximum recursion depth to 14 (2 raised to the 14th power). This would allow the algorithm to produce 16,384 convex hull fragments.

The reason I did this is because I was tuning some assets which had very fine details. The reason to use a high recursion depth is if your source object has tiny details in it that you want to be able to capture.

With solid objects (mostly what people use V-HACD for) this generally isn't a problem. However, with assets like your example which are hollow, they are effectively completely concave. In the case of hollow objects the algorithm will recurse all of the way down to the maximum recursion depth producing a massive number of convex hull fragments.

When the algorithm gets to the merge step it first needs to compute a cost matrix, which is an N squared operation. Normally this computation is pretty fast. However, if N is large (which it is in the case of a hollow object with a high recursion depth) then this section of code becomes dramatically slower, because the work grows with the square of N.

The short term fix is simply to change the default maximum recursion depth, which I have done. The default is now set to 10 which produces a maximum of 1,024 convex hull fragments.
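To put rough numbers on the two defaults, here is a small worst-case calculation. It is only a sketch under stated assumptions: every recursion level is assumed to double the fragment count, and the cost matrix is assumed to hold one entry per unordered pair of hulls. Real meshes usually stop recursing earlier, so actual counts will be lower.

```python
def max_fragments(depth: int) -> int:
    """Worst-case fragment count if every recursion level splits each piece in two."""
    return 2 ** depth


def pairwise_costs(n: int) -> int:
    """Number of unordered hull pairs (assumed: one merge cost per pair)."""
    return n * (n - 1) // 2


for depth in (10, 14):
    n = max_fragments(depth)
    print(f"depth {depth}: up to {n:,} fragments, {pairwise_costs(n):,} pair costs")

# Ratio of worst-case merge work between the old and new defaults.
ratio = pairwise_costs(max_fragments(14)) / pairwise_costs(max_fragments(10))
print(f"depth 14 vs depth 10: about {ratio:.0f}x more pair costs")
```

Four extra levels of recursion give 16 times as many fragments, but roughly 256 times as many pairs, which is why the merge step dominates for hollow objects.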

I have already changed the default value and submitted it to the master branch.

Another thing that I can do is refactor the cost matrix code so that it can run in parallel. If I do this I can expect to get, hopefully, a 4x performance increase.

There are times when you might want to have a larger maximum recursion depth. As I said, typically when you are trying to pick up some fine details. That said, a default value of 10 should more closely match how V-HACD version 3.1 was working.

@ertan-retinize
Author

Hey John,
Thanks a lot for the investigation and fix. Much appreciated. Will you release another version once you make the cost matrix code parallel?

@jratcliff63367
Collaborator

I looked into it yesterday and saw that I had already made it parallel. So just changing the default recursion depth is the fix for now.

@ertan-retinize
Author

But I have to build vhacd from the master branch, right? This change is not included in the v4 release.

@jratcliff63367
Collaborator

Yes.

@ertan-retinize
Author

Hey
I had some time to build the project from master today, but VHACD now takes even more time than before! The vhacd 4.0 release was taking 4.5 minutes in total, but now it's been more than 30 minutes and it's still going.
Using default parameters, btw.
Do you have any idea what may be causing this?

@jratcliff63367
Collaborator

That doesn't sound right to me. Is there a chance you are running a debug rather than a release configuration? I'm out of the country traveling for the next week so I won't be able to get to it right away. Are you using the TestVHACD.exe Windows binary I have checked in? Upload a test asset and I will look into it when I get back. But really I've never seen it take that long. Something weird is happening for you.

@ertan-retinize
Author

Hey, I'm using the debug configuration, and I built TestVHACD.exe from the master branch. I'll try it with the release config and update you.

@jratcliff63367
Collaborator

The debug config is extremely slow due to the STL.

@ertan-retinize
Author

Thanks a lot for your quick responses and explanations. Much appreciated!
Well, I switched to release and built the exe again. Now it works better :) but it took 3 min 19 seconds to complete, which means it's still slower than v3.1.

@jratcliff63367
Collaborator

Upload the asset and I will look into it when I get back. Realize that the main motivation for writing version 4.0 is that 3.1 had bugs in it which would cause it to stop recursion early and produce poor results. Three minutes sounds crazy long to me. When I run it 'a long time' is like 30 seconds. I've never seen it take that long so I'm still unsure why you are seeing that.

@ertan-retinize
Author

The asset is already uploaded and was sent to you via the drive link. But the thing is that we're processing lots of assets at once (one after another), which is why it takes 3ish minutes. I've tried decreasing the max recursion depth parameter to 6 and raising the error percentage from 4 to 6, but it's still slower than 3.1.

@jratcliff63367
Collaborator

Are you saying three minutes for one asset or for a whole bunch of them? That one hips asset you sent, if I recall correctly, just took like 8 seconds or so on my machine. Sorry, again I'm currently in Armenia and won't be back in the US until September 1st. Version 3.1 did have bugs which would cause it to produce invalid results on occasion, which was the major motivation for the rewrite of the code.

@ertan-retinize
Author

Thank you for your explanation again. And feel free to answer after your travel/holiday.

I'm running a bunch of assets in my project. Let's say there are 100 objects and I'm running the vhacd exe on every one of them asynchronously. It was taking 30 seconds with v3.1 and now it takes 3ish minutes.

In other words, that hips obj I sent you was taking a second with 3.1 and now takes 7-8 seconds...
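
A batch like the one described above can be timed per file with a small harness, which makes it easier to see whether a few slow meshes or a uniform slowdown accounts for the total. This is a minimal sketch, not the setup used in this thread: the command builder is supplied by the caller, the demo uses a do-nothing Python subprocess as a stand-in, and any executable path you plug in is your own assumption.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def time_one(cmd: Sequence[str]) -> Tuple[float, int]:
    """Run one command and return (elapsed seconds, exit code)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, proc.returncode


def time_batch(inputs: Sequence[str],
               make_cmd: Callable[[str], List[str]],
               max_workers: int = 4) -> List[Tuple[str, float, int]]:
    """Run make_cmd(path) for every input on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda p: time_one(make_cmd(p)), inputs))
    return [(p, secs, code) for p, (secs, code) in zip(inputs, results)]


# Stand-in command so the sketch runs anywhere. For a real comparison, replace
# the lambda with one that returns your own executable and arguments.
demo = time_batch(["a.obj", "b.obj"], lambda p: [sys.executable, "-c", "pass"])
for path, secs, code in demo:
    print(f"{path}: {secs:.2f}s (exit {code})")
```

If the executable writes output files with fixed names, give each run its own working directory (via the `cwd` argument of `subprocess.run`) so parallel runs do not overwrite each other.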
