Ids 641 end to end ingestion test on staging server #445
…to IDS-641-end-to-end-ingestion-test-on-staging-server
…ytardis_ingestion into IDS-641-end-to-end-ingestion-test-on-staging-server
…ytardis_ingestion into IDS-641-end-to-end-ingestion-test-on-staging-server
Hey Libby, thanks for working on this change. There are a couple of things I thought we could work through, though they may be something to do after this sprint!
@@ -72,7 +73,7 @@ def _is_completed_df(


 def _filter_completed_dfs(
     config: ConfigFromEnv, datafiles: list[RawDatafile], min_file_age: Optional[int]
-) -> list[RawDatafile]:
+) -> Tuple[list[RawDatafile], list[RawDatafile]]:
     """Inspects through a list of datafiles, return datafiles which have completed ingestion."""
Update comment to clarify what's being returned.
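One way to act on this, as a minimal sketch: the new return type suggests the function now splits its input into two lists, so the docstring could name both halves of the tuple. The `RawDatafile` stand-in, the `is_completed` predicate, and the tuple order (completed first) are assumptions for illustration; the diff does not show them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RawDatafile:
    """Hypothetical stand-in for the project's RawDatafile model."""

    filename: str


def filter_completed_dfs(
    datafiles: list[RawDatafile],
    is_completed: Callable[[RawDatafile], bool],
) -> tuple[list[RawDatafile], list[RawDatafile]]:
    """Split datafiles by ingestion status.

    Returns:
        A tuple ``(completed, not_completed)``: the datafiles whose
        ingestion has completed, followed by those whose ingestion has
        not. The order is an assumption made for this sketch.
    """
    completed: list[RawDatafile] = []
    not_completed: list[RawDatafile] = []
    for datafile in datafiles:
        # Route each datafile into exactly one of the two lists.
        if is_completed(datafile):
            completed.append(datafile)
        else:
            not_completed.append(datafile)
    return completed, not_completed
```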
network_base_path = Path(
    "//files.auckland.ac.nz/research/resmed202000005-biru-shared-drive"
)
modified_source_path_base = network_base_path / source_path.parent.relative_to(
    "/mnt/biru_shared_drive"
)
Because this command is meant to be generic to different workflows, I'm wondering if it's necessary to prepend BIRU drive paths here?
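A possible way to keep the command generic, sketched under assumptions: pass the mount root and network root in as optional parameters (for example from a profile's configuration) instead of hard-coding them. The function name `remap_source_path` and both parameters are hypothetical and not part of the PR.

```python
from pathlib import PurePosixPath
from typing import Optional


def remap_source_path(
    source_path: PurePosixPath,
    mount_root: Optional[PurePosixPath] = None,
    network_root: Optional[PurePosixPath] = None,
) -> PurePosixPath:
    """Rewrite a locally mounted path to its network-share equivalent.

    If either root is missing, the path is returned unchanged, so
    workflows that need no remapping are unaffected.
    """
    if mount_root is None or network_root is None:
        return source_path
    # relative_to raises ValueError if source_path is not under mount_root.
    return network_root / source_path.relative_to(mount_root)
```

With this shape, only the BIRU profile would supply the two roots; other workflows would call the function with the path alone.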
# save file verification status to a csv file for idw project
if profile_name == "idw":
    logger.info("Saving files verification status into a csv file.")
    _save_data_status(source_data_path, verified_dfs, unverified_dfs)
I think it would be useful to have a separate command for the reporting functionality. The ids clean command mainly deletes files, whereas this functionality creates a report. To prevent accidental deletion of files, it's better for the CLI user to run it as a separate command...
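The separation suggested here could look roughly like the sketch below. It uses the standard-library `argparse` as a stand-in, because the project's actual CLI framework is not shown in this conversation; the `report` subcommand name and the `--output` option are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with separate 'clean' and 'report' subcommands."""
    parser = argparse.ArgumentParser(prog="ids")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Destructive command: deletes source files.
    clean = subcommands.add_parser("clean", help="Delete verified source files.")
    clean.add_argument("source_data_path")

    # Read-only command: writes a status report and deletes nothing.
    report = subcommands.add_parser(
        "report", help="Write a file verification status report."
    )
    report.add_argument("source_data_path")
    report.add_argument("--output", default="verification_status.csv")
    return parser
```

Because the report options exist only on the `report` subcommand, a user who wants a report never has to invoke the command that deletes files.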
In this PR:
Added the _save_data_status function, which saves file verification status to a CSV file so that BIRU users can decide whether to delete ingested files when the auto clean-up script is disabled. It can be disabled, too, depending on the cleanup strategy's decision.
Tested cmd_clean.py in the ingestion_biru.py.
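For readers unfamiliar with the change, a status report of this kind might be written as in the following sketch. It is not the PR's implementation of `_save_data_status`, which is not shown here; the column names, the status labels, and the use of plain string paths are all assumptions.

```python
import csv
from pathlib import Path
from typing import Iterable


def save_data_status(
    report_path: Path,
    verified: Iterable[str],
    unverified: Iterable[str],
) -> None:
    """Write one CSV row per file with its verification status."""
    with report_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Header names are hypothetical.
        writer.writerow(["filepath", "status"])
        for filepath in verified:
            writer.writerow([filepath, "verified"])
        for filepath in unverified:
            writer.writerow([filepath, "unverified"])
```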