When a user wishes to add images to an existing pipeline run, they modify the config to include the new inputs and relaunch the run. A check is performed to ensure that:
New inputs have been added to the config, and
No other settings have been changed.
Both of these conditions must be true for a pipeline run to be re-run in "add mode". The pipeline checks whether the inputs have changed by reading the previous config file config_prev.yml and comparing it with the updated config.yml file. Both config files are parsed and validated, and all glob expressions are resolved at that point.
Suppose that the config inputs are a simple glob expression, e.g.
If new files that match this expression are added to the filesystem, the pipeline will fail to detect that the inputs have changed. It will read both config_prev.yml and config.yml, which would contain the same glob expression in this case, and compare them. Since the globs are resolved when the config file is read, both config files will end up with the same list of inputs even though new files matching the glob were added since the run was executed.
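The failure mode can be reproduced outside the pipeline with a minimal sketch. The helper names `resolve_inputs` and `inputs_changed` and the `*.fits` file names are hypothetical and only stand in for the pipeline's config parsing; the point is just that resolving the same glob twice at comparison time always gives equal lists.

```python
import glob
import os
import tempfile


def resolve_inputs(pattern):
    """Resolve a glob expression to a sorted list of paths (hypothetical helper)."""
    return sorted(glob.glob(pattern))


def inputs_changed(prev_pattern, new_pattern):
    """Mimic the config diff: both globs are resolved now, against the current filesystem."""
    return resolve_inputs(prev_pattern) != resolve_inputs(new_pattern)


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "*.fits")

    # State of the filesystem when the run was first executed.
    open(os.path.join(tmp, "epoch01.fits"), "w").close()
    used_by_run = resolve_inputs(pattern)

    # A new file matching the glob appears after the run.
    open(os.path.join(tmp, "epoch02.fits"), "w").close()

    # config_prev.yml and config.yml hold the same glob, so the diff sees no change.
    undetected = not inputs_changed(pattern, pattern)

    # Yet the resolved list is now longer than what the run actually used.
    actually_new = len(resolve_inputs(pattern)) - len(used_by_run)
```

Here `undetected` ends up true even though `actually_new` is 1, which is the situation described above.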
The problem is that the config diff check only re-parses the previous config file, resolving its globs against the current filesystem, and never looks at which images were actually used in the run.
A potential solution would be to add a comparison of the number of resolved inputs in config.yml with the number of images stored in the Run object (i.e. Run.n_images) to the config diff check. If the number of inputs is greater than the number of images in the run object, then the run should be re-run in add mode. This won't work if images were removed, but that isn't allowed for "add mode" anyway.
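A rough sketch of that extra check, under stated assumptions: the function name and its arguments are hypothetical, `n_images_in_run` stands for the value of `Run.n_images`, and `config_inputs_changed` and `other_settings_changed` stand for the results of the existing config diff.

```python
def should_run_in_add_mode(
    resolved_inputs,
    n_images_in_run,
    config_inputs_changed,
    other_settings_changed,
):
    """Decide whether a re-run qualifies for add mode (hypothetical sketch).

    resolved_inputs: input paths from config.yml after glob resolution.
    n_images_in_run: number of images already stored on the run (Run.n_images).
    config_inputs_changed: result of the existing config_prev.yml vs config.yml diff.
    other_settings_changed: True if any non-input setting differs.
    """
    # Add mode is never allowed when other settings have changed.
    if other_settings_changed:
        return False

    # Proposed addition: more resolved inputs than images in the run means
    # new files matched an unchanged glob.
    more_inputs_than_images = len(resolved_inputs) > n_images_in_run

    return config_inputs_changed or more_inputs_than_images


# Unchanged glob, but it now resolves to 3 files while the run holds 2 images.
glob_grew = should_run_in_add_mode(["a.fits", "b.fits", "c.fits"], 2, False, False)

# Nothing new on disk and no config change: not an add-mode run.
nothing_new = should_run_in_add_mode(["a.fits", "b.fits"], 2, False, False)
```

Comparing counts rather than file lists keeps the check cheap, at the cost of not identifying which files are new; that would still have to come from comparing against the images recorded for the run.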
By the way, the context of this issue is that I found 15 low-band images that weren't included in the combined run. The inputs are specified with a glob expression per epoch, e.g.
I don't think there's a way I can add the new images to this config without fixing the config diff check. If I add the new files to the config explicitly, they'll show up twice when the globs are resolved.