Robust file dependency download process #206

Open
nichollsh opened this issue Oct 16, 2024 · 3 comments
Labels
bug (Something isn't working) · JOSS publication: PROTEUS (TBD before PROTEUS JOSS publication) · Priority 3: standard (Priority level 3: medium time criticality or importance) · software (Relating to software and implementation)

Comments

@nichollsh
Contributor

Sometimes the tests fail because the model is unable to retrieve data from OSF. Re-running the test usually fixes this, as the data then downloads fine.

Example of it failing: https://github.com/FormingWorlds/PROTEUS/actions/runs/11366750494/job/31617828810

We could include a loop in the downloader functions that re-attempts the download if an error is thrown.
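
A minimal sketch of such a retry loop, assuming the existing downloader can be wrapped in a zero-argument callable. The name `download_with_retries` and its parameters are hypothetical and not part of the current codebase; the wait grows linearly with each attempt to give the server time to recover.

```python
import time


def download_with_retries(download, max_attempts=3, wait_seconds=5.0, sleep=time.sleep):
    """Call `download()` until it succeeds or `max_attempts` is reached.

    `download` is any zero-argument callable that raises on failure
    (hypothetical stand-in for the real downloader function).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return download()
        except Exception as err:  # broad on purpose: server errors vary
            last_error = err
            if attempt < max_attempts:
                # Wait a little longer after each failed attempt
                sleep(wait_seconds * attempt)
    raise RuntimeError(f"Download failed after {max_attempts} attempts") from last_error
```

The `sleep` argument is only there so the wait can be replaced in tests; in normal use the default is fine.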

@nichollsh nichollsh added the "bug" label Oct 16, 2024
@nichollsh nichollsh changed the title from "Tests fail during FWL data download" to "Tests sometimes fail during FWL data download" Oct 22, 2024
@lsoucasse lsoucasse added the "software" label Nov 11, 2024
@timlichtenberg timlichtenberg added the "Priority 3: standard" and "JOSS publication: PROTEUS" labels Nov 19, 2024
@timlichtenberg
Member

timlichtenberg commented Jan 20, 2025

This issue splits into two separate problems:

  • A) The download sometimes fails completely and throws an exception because of server-side issues.
  • B) Sometimes the download completes without an error but is incomplete, with one or a few files missing. This needs some sort of registry or file inventory to check against.

The first problem is probably easier to solve than the second.
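
For problem B, the inventory check could look roughly like the sketch below. It assumes the expected file list is available as relative paths; `find_missing` is a hypothetical helper name, and where the list comes from is left open.

```python
from pathlib import Path


def find_missing(expected_paths, data_dir):
    """Return the expected paths that are not present as files under `data_dir`.

    `expected_paths` is an iterable of paths relative to the data folder;
    a leading "/" is tolerated and stripped.
    """
    data_dir = Path(data_dir)
    return sorted(
        path for path in expected_paths
        if not (data_dir / path.lstrip("/")).is_file()
    )
```

An empty list means every expected file is present; anything else could be used to trigger a re-download of just the missing files.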

@nichollsh nichollsh moved this from JOSS Publication to Next up in PROTEUS Development Roadmap Jan 20, 2025
@nichollsh nichollsh changed the title from "Tests sometimes fail during FWL data download" to "Robust file dependency download check" Jan 20, 2025
@nichollsh nichollsh changed the title from "Robust file dependency download check" to "Robust file dependency download process" Jan 20, 2025
@nichollsh
Contributor Author

nichollsh commented Jan 20, 2025

Regarding issue B, we could potentially query OSF for the files we expect to download. This can be done fairly easily with curl.

For example, the node ID for the spectral files is vehxg, so we can obtain a list of all the files with:
curl -X GET https://api.osf.io/v2/nodes/vehxg/files/osfstorage/

This returns a JSON document that lists all of the folders and files within the node. We can then recursively follow the folders to build a tree of files. For each file, OSF also provides an MD5 hash that can be used to verify that the file downloaded correctly.
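
A rough sketch of that recursive walk, using only the standard library. The JSON keys used here (`attributes.kind`, `attributes.materialized_path`, `attributes.extra.hashes.md5`, `relationships.files.links.related.href`, `links.download`, and the top-level `links.next` for pagination) are my reading of the OSF API response and should be checked against a real reply; `fetch_json` and `list_files` are hypothetical names.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def list_files(url, fetch=fetch_json):
    """Map each file path in an OSF storage listing to its MD5 and download link.

    Folders are followed recursively, and paginated listings are
    followed through their "next" link (assumed response layout).
    """
    files = {}
    while url:
        page = fetch(url)
        for item in page["data"]:
            attrs = item["attributes"]
            if attrs["kind"] == "folder":
                sub_url = item["relationships"]["files"]["links"]["related"]["href"]
                files.update(list_files(sub_url, fetch))
            else:
                files[attrs["materialized_path"]] = {
                    "md5": attrs["extra"]["hashes"]["md5"],
                    "download": item["links"]["download"],
                }
        url = page["links"].get("next")
    return files
```

For the spectral files this would be called as `list_files("https://api.osf.io/v2/nodes/vehxg/files/osfstorage/")`.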

The same JSON document also contains a download link for each file. We could consider using these links as an alternative to the current osfclient library, which hasn't been updated in over 4 years.
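
As an illustration of downloading through such a link and checking the result against the MD5 from OSF, something like the following could work. This is a sketch, not a drop-in replacement for osfclient; `md5_of_file` and `download_and_verify` are hypothetical names, and the `opener` argument only exists so the network call can be swapped out.

```python
import hashlib
import shutil
from pathlib import Path
from urllib.request import urlopen


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_and_verify(url, dest, expected_md5, opener=urlopen):
    """Download `url` to `dest` and check it against `expected_md5`.

    A file whose hash does not match is deleted and a ValueError is raised,
    so a partial or corrupted download is never left on disk.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with opener(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    if md5_of_file(dest) != expected_md5:
        dest.unlink()
        raise ValueError(f"MD5 mismatch for {dest}")
    return dest
```

Raising on a mismatch means this can sit inside a retry loop, so that a corrupted file is simply fetched again.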

@nichollsh
Contributor Author

nichollsh commented Jan 20, 2025

Working example in the notebook here:
request.ipynb.zip

This provides functions for listing all files in a node, getting the download URL for a given file path on OSF, and obtaining MD5 hashes. This is done without using the old osfclient library.

@nichollsh nichollsh moved this from Next up to Done in PROTEUS Development Roadmap Jan 21, 2025
@nichollsh nichollsh moved this from Done to In Progress in PROTEUS Development Roadmap Jan 21, 2025
@nichollsh nichollsh moved this from In Progress to Next up in PROTEUS Development Roadmap Jan 22, 2025
Projects
Status: Next up
Development

No branches or pull requests

3 participants