-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add endpoint for querying a folder or asset path in a Dandiset #1837
Comments
@jwodder - in the current current return of `paths` api at the top level of 000108{
"count": 7,
"next": null,
"previous": null,
"results": [
{
"path": "dataset_description.json",
"version": 500,
"aggregate_files": 1,
"aggregate_size": 71,
"asset": {
"asset_id": "2847011b-f9fb-4933-a6b5-1641e0c1886b",
"url": "https://dandiarchive.s3.amazonaws.com/blobs/c07/71a/c0771a4f-3483-47e7-821e-b28ac8df46a5"
}
},
{
"path": "samples.tsv",
"version": 500,
"aggregate_files": 1,
"aggregate_size": 572,
"asset": {
"asset_id": "a8a5fed9-02bf-4d54-b457-12427a4d3e70",
"url": "https://dandiarchive.s3.amazonaws.com/blobs/369/c1b/369c1b33-7311-402a-a236-373b5af3a800"
}
},
{
"path": "sub-mEhm",
"version": 500,
"aggregate_files": 168,
"aggregate_size": 3102758281057,
"asset": null
},
{
"path": "sub-MITU01",
"version": 500,
"aggregate_files": 6375,
"aggregate_size": 290203968395869,
"asset": null
},
{
"path": "sub-MITU01h3",
"version": 500,
"aggregate_files": 475,
"aggregate_size": 52424119215488,
"asset": null
},
{
"path": "sub-SChmi53",
"version": 500,
"aggregate_files": 574,
"aggregate_size": 12364425736517,
"asset": null
},
{
"path": "sub-U01hm15x",
"version": 500,
"aggregate_files": 490,
"aggregate_size": 16635711419780,
"asset": null
}
]
} |
|
this could be fixed i think. the current api assumes one is using it for navigation and hence this situation never comes up. on that front can you describe your use case in webdav that would be different from the tree navigator in the web app?
i'm not sure why i would return children of anything that is a folder, unless we parameterize that with a depth field. again something we could add to the api. however, it would be good to know what the webdav use case for it is. an intent behind that api was to provide a limited view so as to not overwhelm the server and to reflect what the ui looks like. i think the specific use case of webdav that's different from the online file browser would be good to know. if not, it would be good to examine whether the filebrowser could be improved. |
@satra In order to make dandi-webdav more efficient (cf. dandi/dandi-webdav#5), it needs to be able to take a path and do the following operations on it:
|
let's consolidate this with the web file browser. @mvandenburgh - do you have thoughts on the issues raised here.
not sure why this is a requirement for webdavfs. i would suggest any zarr reading part be left to a zarr library and a zarr object is treated the same way as any other binary file. |
@satra It's a requirement because that's what @yarikoptic told me to do. You can't treat a Zarr as a binary file, as it isn't a file in the first place. |
I'd prefer we avoid this design approach. The web file browser and this webdav server have fundamentally different requirements and therefore should not be served by one consolidated API. And I'd rather not add to the public API for a single client's needs (unless and until that API shows promise for general usefulness). What I'd like to try is creating some private API tailored to the dandidav's specific needs (recognizing that dandidav, like the web client and the CLI, is an official interface to the DANDI datastore and therefore has special privileges). Later on, we can factor out anything that is suitable for the public API. I realize this may sound like an odd approach, but it has advantages:
|
what is "private API" @waxlamp in the context of DANDI? |
the goal here for having zarr as a browsable folder is multifold
Just try it at https://dandi.centerforopenneuroscience.org/dandisets/000108/draft/sub-MITU01h3/ses-20211130h19m15s39/micr/ and around there ;-) |
I think we first need to acknowledge that current API related to
{
"count": 0,
"next": null,
"previous": null,
"results": []
} it is odd because it is an implicit knowledge to give meaning to such a result. It at least does return 404 when pointing to non-existing path (phewph). {
"path": "sub-RAT123",
"version": 419,
"aggregate_files": 1,
"aggregate_size": 18792,
"asset": null
},
{
"path": "test1234",
"version": 419,
"aggregate_files": 1,
"aggregate_size": 9,
"asset": {
"asset_id": "28155d24-9342-48bc-a3a7-b69191e86e13",
"url": "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7"
}
}, From this point of acknowledging that it sucks quite a bit, it sounds like we really need to strive to make it better, and then benefit from an improvement. What I see possible to be done to make it better without breaking anything:
And our file browser could immediately benefit and expose And now we can think about possible changes which we can expose "conditionally" based on some option to the request without breaking API (we already did that for E.g. we could add
{
"path": null,
"version": 419,
"aggregate_files": 1,
"aggregate_size": 9,
"date_modified": "2023-03-16T11:50:08.973069-04:00",
"result_type": "AssetBlob",
"asset": {
"asset_id": "28155d24-9342-48bc-a3a7-b69191e86e13",
"url": "https://dandiarchive.s3.amazonaws.com/blobs/031/7cf/0317cf5a-4047-4e19-aae1-4f7b7434d2d7"
} and if path points to the folder -- it would be that folder's record as when listed from its parent, but with @jwodder do you think this would suffice for our needs in webdav implementation? |
@yarikoptic This would improve things, but WebDAV would still have to make further requests to get asset properties & metadata (both for an asset at a path and for assets under a folder). Note that I already described what information WebDAV needs per path in #1837 (comment). |
I see:
could you @jwodder list explicitly specific metadata fields and properties which are of interest here? I guess adding |
|
Let's "side discuss" at #1851 where it might be easier "per line" |
Implementing https://github.com/dandi/dandi-webdav and some other things would be much easier if it were possible to make a single API request that provided a path and received in return either the asset at that path or an indication that the path is a folder, possibly along with the folder's children. Note that the current
/dandisets/000108/versions/draft/assets/paths/
endpoint is insufficient for this, as it does not return asset properties or metadata, and even if it did, settingpath_prefix
to an asset path returns an empty list, so one cannot make a single request that returns either an asset or the contents of a folder at a path.Possible endpoint:
GET /dandiset/{dandiset_id}/versions/{version_id}/assets/atpath/?path={path}[&metadata=<0|1>]
If
path
is omitted from the request, return a 4xx error.If
path
is not an asset path and there are no assets under{path}/
, return 404.If
path
points to an asset, the following is returned:If there are assets under
{path}/
(i.e., ifpath
is a folder), the following or similar is returned:Ideally, I'd like it if including a
children=1
query parameter would, ifpath
turns out to be a folder, cause the response to include a list of all direct children ofpath
, both assets (with properties, and including metadata ifmetadata=1
) and paths of subfolders. However, this would need to be paginated, and having a response that (a) is sometimes paginated and (b) only paginates a subset of the data returned seems awkward.A parallel endpoint could also be implemented for Zarr entries.
The text was updated successfully, but these errors were encountered: