[Request] add support for project id (i.e. lsattr -p) #553
Comments
I haven't used project ids much myself. Do you know if there is a C API to access that info? Are those stored as extended attributes maybe, and if so, do you know the attribute name? As with the dwalk formatting change, we could hack something up to get you a quick solution here. If there is a C API, we could likely just call that at the point where the text lines are formatted. If there is not a C API, we could try to invoke and parse the
For reference, I see the
which then points to this git repo: https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/lib
Looks like this is the code that reads the file project id: https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/lib/e2p/fgetproject.c#n42
the https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/lib/e2p/project.h
Another reference to project id info on Lustre: https://jira.whamcloud.com/browse/LU-4017 @ofaaland , do you know offhand if there is an API to read a file project id ( |
@adammoody there is an This same ioctl is implemented for Lustre, XFS (original), and other filesystems that support project IDs (possibly ZFS also). See, for example
I definitely would not recommend making an lsattr system call to get this information, as that adds a ton of overhead and makes the code dependent on both the installation of e2fsprogs and the exact syntax and output of that tool. While that output is unlikely to change, depending on it adds needless complexity.
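As a rough illustration of the ioctl route described above, here is a minimal C sketch. It assumes the ioctl in question is `FS_IOC_FSGETXATTR` from `<linux/fs.h>`, which fills a `struct fsxattr` whose `fsx_projid` field carries the project id (the e2fsprogs `fgetproject.c` linked earlier appears to work this way). The helper name `read_projid` is made up for this example and is not part of mpifileutils.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR */

/* Hypothetical helper (not an mpifileutils function):
 * read the project id of `path` into *projid.
 * Returns 0 on success, -1 on failure with errno set. */
int read_projid(const char* path, uint32_t* projid)
{
    /* O_NONBLOCK avoids hanging on FIFOs; O_NOCTTY avoids
     * acquiring a controlling terminal by accident. */
    int fd = open(path, O_RDONLY | O_NONBLOCK | O_NOCTTY);
    if (fd < 0) {
        return -1;
    }

    struct fsxattr attr;
    int rc = ioctl(fd, FS_IOC_FSGETXATTR, &attr);

    /* Preserve the ioctl errno across close(). */
    int saved_errno = errno;
    close(fd);

    if (rc < 0) {
        errno = saved_errno;
        return -1;
    }

    *projid = attr.fsx_projid;
    return 0;
}
```

A caller would check the return value before using `*projid`; on file systems that do not implement this ioctl the call is expected to fail rather than return a meaningful id.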
Thank you, @adilger! That's perfect. @markmoe19, for a quick fix, you could hack this into the text formatting function that you modified in #555. Here is some example code I whipped up based on the tips above that I think will work:
You can add the definition for That will execute twice for each file, once to fetch the value when computing the number of characters needed, and once more to fetch the value for actual writing to the file. That obviously could be improved by caching the value, but at least it would work to just do it twice. Since this is a uint32_t type, I suppose it'd be best to print with
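To illustrate the "measure, then write" pattern and the `uint32_t` formatting mentioned above, here is a small sketch. It is not the actual mfu_flist_io.c code: the function name `format_projid` and the `projid='N'` layout are assumptions for this example. Because the value is fetched once and passed in, both passes reuse it instead of repeating the lookup. `PRIu32` from `<inttypes.h>` is the portable conversion specifier for `uint32_t`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical formatter: returns a malloc'd string such as
 * "projid='10001'", or NULL on failure. Caller frees the result. */
char* format_projid(uint32_t projid)
{
    /* Pass 1: ask snprintf how many characters are needed
     * (the return value excludes the terminating NUL). */
    int needed = snprintf(NULL, 0, "projid='%" PRIu32 "'", projid);
    if (needed < 0) {
        return NULL;
    }

    char* buf = malloc((size_t)needed + 1);
    if (buf == NULL) {
        return NULL;
    }

    /* Pass 2: write the text using the same cached value. */
    snprintf(buf, (size_t)needed + 1, "projid='%" PRIu32 "'", projid);
    return buf;
}
```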
Opening and closing the file twice is not free, even in newer Lustre releases that have automatic open cache on the client. Ideally, you would open an fd on the path once at the start, and pass the fd to the various functions. That also avoids a TOCTOU race when doing pathname resolution multiple times.
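The open-once idea can be sketched as follows: resolve the path a single time, then run every query against the resulting descriptor. This is only an illustration under assumptions. The function names are hypothetical, and the project id lookup assumes the `FS_IOC_FSGETXATTR` ioctl and `struct fsxattr` from `<linux/fs.h>`.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR */

/* Hypothetical helper: query the project id on an open descriptor.
 * Returns 0 on success, -1 if the ioctl fails. */
int projid_from_fd(int fd, uint32_t* projid)
{
    struct fsxattr attr;
    if (ioctl(fd, FS_IOC_FSGETXATTR, &attr) < 0) {
        return -1;
    }
    *projid = attr.fsx_projid;
    return 0;
}

/* Hypothetical helper: open `path` once and use the same fd for
 * fstat() and the project id lookup, so the path is resolved only
 * one time. Returns 0 if open and fstat succeed, -1 otherwise.
 * *have_projid is set to 1 only when the project id was read. */
int inspect_path(const char* path, struct stat* st,
                 uint32_t* projid, int* have_projid)
{
    int fd = open(path, O_RDONLY | O_NONBLOCK | O_NOCTTY);
    if (fd < 0) {
        return -1;
    }

    int rc = fstat(fd, st);
    *have_projid = (rc == 0 && projid_from_fd(fd, projid) == 0);

    close(fd);
    return rc;
}
```

Both results then describe the same inode, which is the point of avoiding repeated pathname resolution.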
Reducing the open calls is a good point, especially if you are dealing with many files, @markmoe19 . If that's a problem, the most immediate improvement would be to pull the project id lookup up into |
In case you want to jump straight to the single-open version, here is a patch:
I can push up a branch if you'd like. This code doesn't properly handle the case when Perhaps change this to use a signed datatype and then use -1 to represent an invalid project id? Or otherwise use a second array or an array of structs or something to denote whether the corresponding entry is valid... Also, when the project id has not been explicitly set (or inherited) on a file, does the project id default to 0? Anyone know? |
Went ahead and pushed this to a branch to make it easier to try: https://github.com/hpc/mpifileutils/tree/projid and here are the actual changes:
That works! I get a lot of projid='0', but project id is not set for all the files so that makes sense. I do get 10001, 10002, and 20034 as some other example project-ids. |
questions:
[2023-10-09T12:45:00] [44] [/project/selene-admin/mpifileutils_debug/mpifileutils-v0.11.1/mpifileutils/src/common/mfu_flist_io.c:1613] ERROR: ioctl failed: `/lustre/fsw/.../system/rc.service' errno=25 (Inappropriate ioctl for device)
I noticed that in those cases, "lsattr -p" also gives an error like this:
root# lsattr -p /lustre/fsw/.../system/rc.service
Could that error just be ignored and also end up with project id value of '0'?
Thanks,
Right, we don't have a field for the project id within the internal flist structure that gets stored to the binary The above hack can still work when reading a binary Adding the project id to the flist element would take some effort, because that structure needs to be updated in multiple places throughout the code base. It's also a bit disruptive since it forces a change to the As a future feature, I'd like to see whether we can enable users to attach arbitrary data to each file element. This data would need to be serializable to be stored to a binary Regarding the ioctl error, yes, that could be changed to use 0 with a one line fix.
However, since 0 seems to be the default project id for many files, I thought it might be useful to use another value to represent those error cases. For example, if we change the datatype from uint32_t to int32_t, we could initialize to -1 instead of 0. However, that assumes that one doesn't use project ids above what an int32_t can represent (int32 overflow). For that matter, we could leave the type as uint32_t and use the max uint32_t value assuming that one is not using that as a valid project id. To really handle that cleanly, it might require using a second field to indicate whether the corresponding project id is valid. It's your call on which value you want to initialize here. |
The projid needs to be a uint32 field, but the -1 value is invalid (there is a named constant for this, like I suspect from the name that the I think it would be most useful to not store a projid for such cases, rather than trying to store an invalid projid for the file. |
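One way to picture the "second field" idea, combined with the suggestion to store nothing when the lookup fails, is a value paired with a validity flag. The sketch below is only illustrative: the type `projid_opt`, the function `format_entry`, and the `projid='N'` text layout are hypothetical names chosen for this example, not part of mpifileutils.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical "optional project id": the full uint32_t range stays
 * available for real ids, and validity is tracked separately. */
typedef struct {
    bool     valid;  /* false when the lookup failed */
    uint32_t value;  /* meaningful only when valid is true */
} projid_opt;

/* Hypothetical formatter: append the projid field only when one was
 * actually read. Returns the snprintf result. */
int format_entry(char* buf, size_t len, const char* name, projid_opt p)
{
    if (p.valid) {
        return snprintf(buf, len, "%s projid='%" PRIu32 "'", name, p.value);
    }
    /* No valid project id: emit nothing for the field. */
    return snprintf(buf, len, "%s", name);
}
```

With this layout a project id of 0 stays distinguishable from "no project id available", and no sentinel value has to be reserved.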
Right, it was a link to a file on an NFS file system. Probably storing nothing for projid in that case is the best response, I agree. |
I'm a bit torn on how to handle projid when using a .mfu file. We have kept some .mfu files around, as they provide a sort of snapshot of the metadata that can be queried without the filesystem around (or, even if the filesystem is there, it is no longer the same because some time has passed). Maybe, until the .mfu file can store the projid, the only valid way to get projid in the text output file is to go directly from a file system scan to a text file (without keeping a .mfu file)? The good news is that now I have a way to capture project-id for all the files quickly, thank you! :)
@adammoody I saw your patch on this issue from last year, and am wondering if you had made any progress toward landing this feature? Is it a hard requirement that there is a field in the .mfu file even for dcp/dsync to copy between existing filesystems, or only for saving the data to a backup? For Lustre we also added an xattr interface to save/restore the projid with a virtual "trusted.projid" file that allows backup and restore of the project ID for applications that don't support it natively, but this is not exposed on the client by default. |
When copying files using dcp or dsync, extended attributes such as project id are also copied. Resolves hpc#553. Signed-off-by: Sohei Koyama <[email protected]>
@adammoody it would be great if you could review the PR that Sohei pushed.
dwalk supports group and gid. Could it also support project and project id? See "/usr/bin/lsattr -p" for project id linux attribute on a file. We make use of gid for access and project id is often the same but sometimes different and used more for quota control. Thank you.