Workflow changes needed to manage offline gracedb upload prep #4535
Conversation
@GarethCabournDavies This branch has conflicts; should I still be reviewing this?

Yes, I will check what the conflicts are and fix them
CC is complaining about the minifollowups module being too long - I think I can split this into minifollowups.py and minifollowup_plots.py (i.e. single_template, trigger_timeseries, qscans etc.)
I tried this, but got so many errors in codeclimate that I have reverted it
For the most part this looks good. I think there is one issue here related to file dependencies in sub-daxes (which was hard to get right, and in looking closer I think I've also spotted a bug in how this is done in the sbank workflow). Perhaps some change to `resolve_url` is needed to support the "this file is produced in the workflow above this; I don't have a PFN but pegasus would manage it" case. However, this requires somehow knowing that this is the case. Originally there was an `--is-subworkflow` flag to identify this. Alex removed that (alongside a bunch of things where the flag wasn't needed), but I think we may have lost this functionality in the process.
```python
psd_fname = args.psd_files[ifo]
psd_file = resolve_url_to_file(os.path.abspath(psd_fname),
                               attrs={'ifos': ifo})
```
I think this is a bad idea because it will copy the file into the new directory, and this is a big file. If you follow through what this function is doing, there's a call to a `resolve_url` function. That takes a `copy_to_cwd` option, but this is always True in this instance. It's perhaps worth exposing this keyword argument, and setting it to False here (and perhaps in other instances as well).
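A minimal sketch of the change being suggested, i.e. threading `copy_to_cwd` through to callers. This is an illustrative simplification, not the actual PyCBC implementation: the real `resolve_url`/`resolve_url_to_file` handle remote URLs and build `File` objects, and the signatures here are assumptions.

```python
import os
import shutil

def resolve_url(url, copy_to_cwd=True):
    # Hypothetical, simplified stand-in for the real resolve_url:
    # resolve a file:// URL or plain path, optionally copying the
    # file into the current working directory.
    path = url[len("file://"):] if url.startswith("file://") else url
    path = os.path.abspath(path)
    if copy_to_cwd:
        dest = os.path.join(os.getcwd(), os.path.basename(path))
        if path != dest:
            shutil.copy(path, dest)
        return dest
    return path

def resolve_url_to_file(path, attrs=None, copy_to_cwd=True):
    # Expose the keyword so callers with large inputs (such as PSD
    # files) can opt out of the copy; the real function would wrap
    # the resolved path in a File object carrying attrs.
    return resolve_url(path, copy_to_cwd=copy_to_cwd)
```

With this, the call quoted above could pass `copy_to_cwd=False` and avoid duplicating a large PSD file.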
... this also assumes that this is always a top-level workflow, and it would never be called from within another workflow. For a subworkflow things are a bit more complicated as you need to add file dependencies, and would not create a File object in this way.
I'd like this to be able to work as either, e.g. we can run it on an already-completed analysis, or run it within the workflow.
This actually wouldn't result in a copy. The filename passed here would always be something like `./H1L1-HDFTRIGGER_MERGE.hdf`, and pegasus copies/symlinks the file here before this runs (as per usual), so the file is already in the right place.

The main point is that these are always subworkflows, and pegasus will have already handled file transfer for subworkflows.
```python
'generate_xml',
ifos=workflow.ifos, out_dir=out_dir,
tags=tags
)
```
A more general point: in cases like this, where a particular executable is required, there should be some check that the thing in `[executables] -> generate_xml` is actually the executable that this code is written against.
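One hedged sketch of what such a guard could look like: compare the basename of whatever the `[executables]` section points at against the names the workflow module was written for. The allow-list and helper name here are invented for illustration; the real fix might instead use a subclass check as discussed below.

```python
import os

# Illustrative allow-list of executable names this workflow module is
# written against (the names are invented for this sketch).
KNOWN_GENERATE_XML_EXES = {"pycbc_upload_prep_generate_xml"}

def check_executable(cp, option, known_names, section="executables"):
    """Fail early if [executables] -> option points at an executable
    this workflow code was not written for.

    cp is assumed to be a ConfigParser-like object holding the parsed
    workflow ini file.
    """
    path = cp.get(section, option)
    name = os.path.basename(path)
    if name not in known_names:
        raise ValueError(
            "[%s] %s = %r: expected one of %s; this workflow module "
            "may not support that executable"
            % (section, option, path, sorted(known_names))
        )
    return name
```

Called once when the minifollowup section is set up, this turns a confusing runtime failure into an immediate configuration error.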
That sounds like a good idea. I think something like the subclass check used for `coinc_lim_for_thresh` in the stat module might be worthwhile, but that would require quite a few changes, so I think it is outside the scope of this PR.
```python
ifos=workflow.ifos, out_dir=dax_output, tags=tags)

node = exe.create_node()
node.add_input_opt('--config-files', config_file)
```
If you're doing this (which is right) you must not use `resolve_url` in `pycbc_upload_prep_minifollowup`. If you do, you'll end up with file paths pointing to some local condor filesystem, and this will fail.

I tried to point to a good example of this, but I think I found broken code (https://github.com/gwastro/pycbc/blob/master/bin/workflows/pycbc_make_sbank_workflow#L205). That example is a workflow that could be a top-level workflow or a subworkflow. The seed bank would be passed from the top-level workflow or be an input file. I think this would fail if seed-file is given and it is not a subworkflow.
The heart of the issue is that `resolve_url` does this (https://github.com/gwastro/pycbc/blob/master/pycbc/workflow/core.py#L2120) and `from_file` does not. We do need to add a PFN if this is a top-level workflow and this is a global input; we must not add the PFN if it is not (especially because in this case we cannot predict where the statmap-file and bank-file would actually be).
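As a sketch of the distinction being described: attach a PFN only in the top-level-workflow case. The class and function names here are illustrative stand-ins, not the actual pegasus/PyCBC API.

```python
import os

class WorkflowFile:
    """Toy stand-in for a workflow File object, just to illustrate
    when a PFN (physical file name) should be attached."""

    def __init__(self, lfn):
        self.lfn = lfn
        self.pfns = []

    def add_pfn(self, pfn, site="local"):
        self.pfns.append((site, pfn))

def make_input_file(path, is_subworkflow):
    # In a top-level workflow with a global input, pegasus needs a
    # PFN to know where to fetch the file from. In a subworkflow we
    # must NOT add one: the parent workflow manages the transfer, and
    # the eventual location (e.g. of the statmap-file or bank-file)
    # cannot be predicted here.
    f = WorkflowFile(os.path.basename(path))
    if not is_subworkflow:
        f.add_pfn("file://" + os.path.abspath(path))
    return f
```

Some flag or context check (like the old `--is-subworkflow` option mentioned above) is still needed to decide which branch applies.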
Should I add back in the `--is-subworkflow` option for this then?
I'm unsure of this now; it seems to be working in my latest tests (using the `copy_to_cwd=True` argument). However, my testing uses a cache file, so the files for which `resolve_url` is called are definitely not still local. Could this be an issue?

This also matches the way this is called in the foreground and injection minifollowups - will they need a fix as well? I'm surprised we haven't seen issues before in that case.
I think the reuse cache is affecting this test. There's no file management going on between the workflows when everything just comes as a pregenerated input file in a guaranteed location. Is there any way to turn this off, perhaps letting the workflow make `H1L1-PAGE_FOREGROUND_XMLALL-XXX.xml` again from the already existing inputs?
I do see the "incorrect" file location in the dax file:

```yaml
- lfn: H1L1-INSP_SEGMENTS-1368975466-1121610.xml
  pfns:
    - site: local
      pfn: file:///local/condor/execute/dir_2994154/pegasus.kxIWl4525/H1L1-INSP_SEGMENTS-1368975466-1121610.xml
```

but it's possible that pegasus will always prefer a PFN coming from reuse caches, and it uses the reuse cache functionality to send intermediate files up and down. So this may work okay without changes ...
Any update on what happens without using a reuse cache (or a workflow I can look at with this in place)?

We've verified that the PFN issue is not causing failures in this code. Pegasus seems to always prefer files being sent down from higher-level workflows over PFNs set in dax files. I still think we should fix this, but it doesn't hold this patch up any longer.
…o#4535) * Workflow changes needed to manage preparation of offline gracedb upload files * fix change removed after rebase * Allow waveforms to not be given * IH comments
* Need to create .cvmfscatalog file (#4619) * Workflow changes needed to manage offline gracedb upload prep (#4535) * Workflow changes needed to manage preparation of offline gracedb upload files * fix change removed after rebase * Allow waveforms to not be given * IH comments * Update setup.py --------- Co-authored-by: Gareth S Cabourn Davies <[email protected]>
The PR #4438 was getting too long and complicated to review, and there was an obvious split that would make it simpler.

Here I separate off the changes to the workflow that allow the new executables (in #4534) to be called in a minifollowup.

The basic outline of this PR is that a new (optional) minifollowup is created, which makes the files to upload and outputs them to an 'upload_files' directory in the main output directory of the search.
The main change is the code `pycbc_upload_prep_minifollowup`; this defines a minifollowup which prepares files for upload to gracedb, including:

* a `.fits` file from running bayestar (well, the wrapper around it)

Other changes are made so that this workflow can run properly.
There are two parts that affect the other minifollowups:

* `get_single_template_params`
I think this way is neatest in terms of applying / removing this functionality through the config and existing codes. However, I also see the benefits of creating these files as part of the foreground minifollowup, adding them to the open box directory (and the plots to the page?), and can change to do that instead. That would require a change in how the foreground minifollowup is called in terms of FAR threshold vs N events, though.