Thanks for writing this up! I would personally vote for "don't do something scary" over "warn the user that the code does something scary, then do the thing anyway."
Creating a temp directory seems like a good enough workaround. If y'all want to take this on Margay, go for it - otherwise I'm happy to pull it into Inframundo-land for next sprint.
When using `--fsspec`, the `--deposition-path` option receives the directory where the user expects files to be downloaded. The orchestrator deletes the contents of this directory during cleanup, irrespective of whether those files were created by archiver or pre-date the run:
https://github.com/catalyst-cooperative/pudl-archiver/blob/43eacab34afb356ec430a698617e883a8eea568a/src/pudl_archiver/orchestrator.py#L29C1-L33C54
This could result in data loss if the user specifies a directory that is used for multiple purposes. We do not inform them of this risk in the README.
Potential resolutions:
- Warn the user that the `--deposition-path` directory gets wiped during cleanup, so they know what to expect. Candidate locations: [...]
- Create a temp directory under `--deposition-path` so it's guaranteed to be used only by us
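
For illustration, the temp-directory option could look roughly like the sketch below. This is not the archiver's actual code: `scratch_dir` and the `archiver-` prefix are hypothetical names, and the sketch assumes `--deposition-path` points at a local filesystem directory. The idea is that cleanup removes only the subdirectory we created, so anything the user already had in the parent directory is left alone.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def scratch_dir(deposition_path):
    """Yield a fresh subdirectory of deposition_path and remove only that on exit.

    Hypothetical helper: assumes deposition_path is a local directory path.
    """
    base = Path(deposition_path)
    base.mkdir(parents=True, exist_ok=True)
    # mkdtemp picks a unique name, so the new directory cannot collide
    # with files or directories the user already keeps here.
    workdir = Path(tempfile.mkdtemp(prefix="archiver-", dir=str(base)))
    try:
        yield workdir
    finally:
        # Cleanup is scoped to the directory we created, never its siblings.
        shutil.rmtree(workdir, ignore_errors=True)
```

Used as `with scratch_dir(path) as work: ...`, everything written under `work` disappears at the end of the block, while pre-existing files in `path` survive. Whether downloaded files should then be moved out of the scratch directory before cleanup is a separate design question this sketch does not answer.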