-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A discussion on needed /paths
properties and metadata
#1851
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,21 @@ | ||||||
# Discussion on metadata needed for dandidav | ||||||
|
||||||
Submitted as a PR to avoid spaghetti thread discussion on https://github.com/dandi/dandi-archive/issues/1837#issuecomment-1921864949 | ||||||
|
||||||
|
||||||
@jwodder listed needed properties and metadata fields: | ||||||
|
||||||
* Asset properties used by dandidav: | ||||||
* `asset_id` (PRESENT) | ||||||
* `blob_id` | ||||||
* `zarr_id` | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess So it feels that we might ask for asset record to include
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so in principle we can just parse out bucket, zarr_id and blob_id from contenturl, correct? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, but I'd rather dandidav not have to make too many assumptions about how our S3 URLs are structured. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. agree. We would only need to be able to construct zarr "http s3" URLs from manifests so it would need to know that , right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that asset metadata lists two
|
||||||
* `path` (PRESENT) | ||||||
* `size` (PRESENT) | ||||||
* `created` (GOOD TO HAVE) | ||||||
* `modified` (SUGGESTED to be included) | ||||||
* Note: It might be more accurate to use the `blobDateModified` metadata field for this instead; cf. dandi/dandidav#17 | ||||||
|
||||||
* Asset metadata used by dandidav: | ||||||
* `encodingFormat` (blobs only) (STRONGLY DESIRED but not REQUIRED) | ||||||
* `contentUrl` (API download URL for blobs, S3 URL for Zarrs) (INCLUDED) | ||||||
* `digest["dandi:dandi-etag"]` (blobs only) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what this one for? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Etag (reported in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in principle could be any other checksum I guess , right? note: I thought that besides ETag functionality a commonly known checksum could then be used by receiving end to validate the download if that would be the target purpose for such a listing. for versioned zarrs we would need There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant that we can use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if it makes sense to apply etags to collections, as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what for do we need blob_id if we have contentUrl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThecontentUrl
returned in.../assets/paths/
responses is a plain S3 URL, and serving it to the user in a web browser would result in the file being downloaded and named with the blob ID. In order for the downloaded file to be named with the asset's filename instead, we need a signed S3 URL, which is obtained via the API download URL in thecontentUrl
field of the metadata.EDIT: Sorry, I misinterpreted the question. For blob ID, the code just checks whether
blob_id
orzarr_id
is non-None in order to determine whether the asset is a blob or a Zarr.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so we the actually need
result_type
(Folder, AssetBlob, AssetZarr) as I suggested in #1837 (comment) .