-
Notifications
You must be signed in to change notification settings - Fork 674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite user-mode threading to work with overlapped IO/IOCP #210
Comments
Great propsition @Corillian ! I totally agree with you ❤️ Removing the threads limitation would enhance a lot the dokan speed also ! If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the About point 5, is it not related to #148 ? It seems that you have a lot investigate around how to make it, do that mean it is something that you are already planning to do ? |
Yes point 5 is identical to #148 except I don't believe the lookaside list API is available in user mode.
All of the changes would be in the user mode dokan.dll. It will keep I have no problem doing this change myself however I don't have a lot of time at the moment. There's a good chance I wouldn't be able to get to it until sometime in June at the earliest but that could change (for better or worse). |
Oups yes lookaside does not exist for user mode ofc 😄 sorry Ok I see the breaking changes you are talking about. It would be nice if we can just add a proxy fonction for each event that would convert the dynamically allocated structures into current function parameters and the other way back at the user function return. Don't worry about time, we also currently have couple of things that we need to do to push for pushing this 1.0.0 out 😞 (some stability issue) |
I have also been happily and successfully using Dokany from the beginning and have also run into the bottleneck of waiting on IO. I think a much easier solution that does not break our API, would be for the driver to handle the return value STATUS_PENDING from (at least) user mode function dokanOperations->ReadFile. All existing Dokany projects would continue to operate, and developers that want concurrent IO could do their own that worked by returning STATUS_PENDING in conjunction with IO handler thread(s). |
Driver can only communicate to the user mode if the user mode tells the driver "here you have a worker". How would that happen using the approach suggested by @Corillian? Need to understand the full flow. The approach suggested by @JohnOberschelp seems to me much simpler. But in this case one thread will still be blocked. If driver gets STATUS_PENDING it will make the same call again using maybe even the same thread, until it gets another STATUS. Here we are limited in the amount of threads resp. workers provided to the driver. Possible solution here is to create a new additional thread resp. worker in user mode, if user mode returns STATUS_PENDING. STATUS_PENDING means this thread is still working resp. thread is blocked. We need to keep the old thread running with setting a special async flag in the context like suggested by @Corillian. If once this special threads have been handled properly those can be terminated. The thread amount specified in dokan options is just the minimum amount of threads which will be handled in parallel. In generally we can increase the amount of threads in the dokan options to 64. This can be done already yet. |
You are right, @marinkobabic, one thread will be blocked, but not the rest of the threads. Just by implementing handling of STATUS_PENDING on reads, and implementing lazy writes in userland, all Dokany threads would be able to go as far as they can till they are I/O bound. For read in particular, all the driver may need to do (as @marinkobabic said) when it receives a STATUS_PENDING, is yield to other threads, and try again; the read request is still completely valid. @Corillian, wouldn't this be all you need to get your FTP requests running in parallel? |
Inside the user mode driver callback a pointer to a structure with the relevant arguments for that packet will be provided. It should be done this way because we've then gone through the trouble of creating a proper structure to marshal packet arguments so that driver writers don't have to constantly reinvent the wheel for their own async operations. They just pass around a single pointer and reference it as needed. If the user mode driver callback will be asynchronous it will return STATUS_PENDING which tells Dokan to immediately exit the completion packet callback so that the thread can begin servicing other completion packets. If it's a synchronous callback it'll return its normal error code and Dokan will then automatically call Outstanding asynchronous operations will also need to be tracked by Dokan so that during shutdown all pending IO operations will be verified complete before tearing down the IO completion port and the thread pool. Once a user-mode asynchronous operation completes it will need to call |
@JohnOberschelp @Corillian |
I agree that the current API is weird, and doesn't make much sense for some cases (my filesystem queries the network asynchronously too, so I have to workaround it by using promises and futures, and blocking the callback in DokanOperations). But note that is pretty easy to implement a filesystem in Dokan+FUSE this way. Also, I'm not very familiar with the Windows API, and even if I were, I will have to use both Windows' and Unix's if I had to write a cross-platform filesystem. My current approach for that is using a cross platform library that isolates me from that as much as possible. An alternative would be to document the protocol to talk to the kernel device, am I right? FUSE is considering it and there is already a protocol sketch. That would be helpful for people who want to use Dokan with any language. |
Interesting discussion you started @Corillian. Current threading model was always a bother to me so I'm glad to see suggestions. My opinion is that it currently works and only result to performance issue so even if I totally agree, I will personally do on it after few others issues / bug fixes. Or maybe you already planned to work on it on your side as you already contributed several times? Better to know it to avoid work duplicate as happened with WiX installer 😄. @suy For now, most current language Dokan libraries (C#, Java, ...) are wrapping the C resulting dynamic library dokan1.dll. I currently prefer this way to avoid user-mode handling work duplicate (libraries would have to through native win32 call anyway to talk to the driver). But documenting the protocol to talk to kernel device should be done for the sake of documentation indeed! |
Hi @Corillian , Since it is june, I just wanted to know if you got any time on this ? |
No unfortunately the stuff I needed to get done before I could address this ended up taking a bit longer than I had anticipated. I noticed that 1.0.0 isn't released yet so I figured the delay on my end wouldn't be a huge issue ;). I'm still planning on implementing this, hopefully soon as the lack of throughput is a near constant complaint from users of my driver. |
Alright I've started working on this. I won't be working on it every day so I don't know exactly how long it will take but it's currently under way =). |
@Corillian 👍 don't hesitate if you need help! |
@Liryna Why is |
@Corillian I don't have my environment to test with me but I would say that this come from the last pull requests of @bailey27. If I remember, there should be a first createfile to know if the file exist and a second for opening the directory. Or do you mean you have two time the same createfile? |
I changed the behavior of SL_OPEN_TARGET_DIRECTORY to fix #249 (MS Word 2007 & 2010 can not save file normally in mirror.exe). If SL_OPEN_TARGET_DIRECTORY is specified, then the caller expects to open the parent dir of the file. But it also expects to be told in create Information if the child did not exist. |
Going through directory.c it seems to me that since Dokan caches the previous directory listing and provides its own pattern search maintaining Also is it correct to assume that the kernel synchronizes all file operations on a file handle? If that's not true then there are quite a few synchronization bugs in the user-mode driver. Lastly shouldn't directory list caching be done in the kernel and not the user-mode driver? I don't understand why we have to endure 2 user/kernel transitions when |
Independent where the caching is done, how is the following scenario covered here:
Actually it seems that the cache is not updated. To make the whole system stable and faster:
This way the file and directory information can retrieve the information from cache. If the findfiles should be cached, then every external change on the mirrored filesystem needs to be reported to the kernel. FindFilesWithPattern is in generally not a bad idea, it simplifies the implementation. The developer don’t needs to check the mask for * Thanks Marinko |
Ok after looking through FastFat it looks like it's up to the driver to do its own synchronization - which is what I had assumed would be the case. FastFat obviously doesn't need to do any caching because it has direct access to the drive but it does maintain the state of the previous search. It looks like |
Here's the FastFat code for searching: //
// Determine where to start the scan. Highest priority is given
// to the file index. Lower priority is the restart flag. If
// neither of these is specified, then the Vbo offset field in the
// Ccb is used.
//
if (IndexSpecified) {
CurrentVbo = FileIndex + sizeof( DIRENT );
} else if (RestartScan) {
CurrentVbo = 0;
Ccb->OffsetToStartSearchFrom = 0;
} else {
CurrentVbo = Ccb->OffsetToStartSearchFrom;
} As @marinkobabic mentioned there is currently a problem with refreshing the cache. As far as I'm aware Dokan does not currently provide a driver with a way to notify the file system that the file system has changed? |
I agree that the cache should probably review/rewrite and move to the kernel. Line 1300 in 84c38f8
Line 598 in b845c08
|
Yes but say your driver is a remote FTP server - when a new file gets created on that server there's no way to notify Dokan and Windows Explorer won't be aware of the new file until the next time it happens to issue a request for a directory listing. |
Alright sure thing, thanks! |
@Liryna looking at FastFat this behavior does not seem to be accounted for which makes me think that Dokan is doing something wrong and that's why the OS is sending read/write's after close/cleanup. |
Ok I think I found the problem. From the documentation for IRP_MJ_CLEANUP:
Which is exactly what the documentation for IRP_MJ_CLEANUP says can happen. |
@Corillian Thank you for pointing the part of documentation showing it! So does everything is now always for you on the dokan behavior? |
More or less rewrote all of dokan.dll to support async IO. Rewrote the Mirror example to support the new Dokan API. Still need to add async IO support. User-mode drivers can reeturn STATUS_PENDING to indicate they are performing an async operation at which point they are responsible for calling EndDispatch*() to notify the driver that the operation has completed. Still some known issues that need to be fixed.
Close/Cleanup no longer gets called if Create fails and no user context is supplied Renamed EndDispatch*() functions to DokanEndDispatch*() since they're public The DOKAN_VECTOR API is now exported from dokan.dll
Made critical sections a bit easier to identify Added missing exports
Added DokanInit() and DokanShutdown() FIxed a bunch of memory leaks caused by the Close handler
I have no idea what you just asked lol. I've submitted the initial PR for review: #307 It's going to need a lot of testing and review. |
@Corillian Thank you corillian! I will take a look as soon as I can. Probably this weekend :) Hahaha yes that was a strange sentence. I just wanted to know if you still think there is an issue on dokan behavior with read and write after cleanup. |
no I believe I fixed it, so far I haven't encountered any additional problems. |
Access tokens now get passed to the user mode driver for SetFileSecurity and CreateFile Fixed an issue in Mirror with readonly files not returning access denied in CanDelete
Added unit tests for file streams
Fixes for DOKAN_OPTION_FORCE_SINGLE_THREADED Mirror now closes a file handle in Cleanup again Added more unit tests Some unit tests are still failing due to issues with setting file security
Is this still awaiting a review? Can I do anything to help test it? |
Hi @andreas-eriksson , There is a security file issue #307 (comment) After this step, we need people to test the mirror for being sure that everything is still working properly. |
I think a way forward would be for someone to resolve the merge-conflicts with the master-branch. I would be willing to modify the API to be able to use both the old API and the new API. This would simplify testing and would allow people to test the new architecture without having to modify their filesystem. We can mark the new API this PR introduced as an experimental API until it is stable. I tried to merge with master but gave up because I was not really sure how to resolve some of the conflicts :-( I did not try for too long, so maybe it was just me so I don't want to discourage anyone from giving it a try. |
This is now partially available in 2.0.0 thanks to @Corillian PR |
The current threading model used by Dokan is a major performance bottleneck in any user-mode driver that can make asynchronous IO calls - for example a driver that provides access to an FTP server. The OS can potentially make thousands of IO calls a second but with a limit of 15 threads and no async callback mechanism those threads will spend most of their time blocking on IO operations (in the FTP example they will block on socket read/writes).
To solve this I propose the following:
DeviceIoControl()
in dokan.c needs to use an IO completion port for dispatching IO events to the user-mode driver. IO events will automatically be handled on the system thread pool and neither Dokan nor the user-mode driver are required to manage their own threads nor worry about the number of threads that are allocated.Dispatch*()
functions will need to be split in two: one half (BeginDispatch*()
) for event setup and the call into the driver event handler and one half (EndDispatch*()
) for validation, cleanup, and sending the result to the kernel. Finalization for an asynchronous event (EndDispatch*()
) can be initiated by aDokanFinishAsyncEvent()
function called from the driver that takes a pointer to the context information as an argument.DokanFinishAsyncEvent()
knows where to find the information it needs to pass back to the kernel.Unfortunately this both a non-trivial undertaking and would require breaking API changes however my driver is already running into problems with throughput due to restrictions imposed by the current Dokan threading model.
The text was updated successfully, but these errors were encountered: