Generalize UNF definition to apply across all Files #2

Open
mercecrosas opened this issue May 26, 2015 · 1 comment

@mercecrosas
Member

We would like to adopt a new, more general UNF algorithm that can be applied across all files. See the original discussion in IQSS/dataverse#2192.

A Functional Requirements Document (FRD) will be created and linked to this issue.

@leeper
Member

leeper commented May 26, 2015

This seems relatively easy to wrap into the existing UNF standard. Treat a file as a binary vector, base64 encode it, hash using SHA256, and truncate to the specified UNF length. This hash could then be aggregated just like dataset UNFs are currently combined to create the study-level UNF.
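A rough sketch of those steps in Python, just to make the proposal concrete. The function names `file_fingerprint` and `combine_fingerprints`, the 128-bit default, the final base64 encoding of the truncated digest, and the sort-then-rehash aggregation are all assumptions for illustration, not something the UNF standard or this issue specifies.

```python
import base64
import hashlib


def file_fingerprint(data: bytes, bits: int = 128) -> str:
    """Hypothetical file-level fingerprint following the steps described above."""
    if bits % 8 != 0 or not 0 < bits <= 256:
        raise ValueError("bits must be a multiple of 8 between 8 and 256")
    encoded = base64.b64encode(data)           # treat the file as a binary vector, base64 encode it
    digest = hashlib.sha256(encoded).digest()  # hash with SHA-256
    truncated = digest[: bits // 8]            # truncate to the requested length
    # Assumption: base64 encode the truncated digest so the result is printable.
    return base64.b64encode(truncated).decode("ascii")


def combine_fingerprints(fingerprints: list[str], bits: int = 128) -> str:
    """Hypothetical aggregation of several file fingerprints into one."""
    # Assumption: sort first so the result does not depend on file order,
    # then join with newlines and hash the joined string the same way.
    joined = "\n".join(sorted(fingerprints)).encode("utf-8")
    return file_fingerprint(joined, bits)
```

Calling `file_fingerprint(open(path, "rb").read())` on each file and passing the results to `combine_fingerprints` would give one study-level value. Sorting before hashing is one way to keep that value independent of file order; whether that matches how dataset UNFs are combined today would need checking against the spec.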

Even if MD5 is a common standard in archiving, SHA256 seems reasonably widely implemented and would be consistent with the existing UNF standard. I think you would still have to supply MD5s somewhere in Dataverse though, given their prevalence as a checksum.
