
Workflow changes needed to manage offline gracedb upload prep #4535

Conversation

GarethCabournDavies (Contributor):

PR #4438 was getting too long and complicated to review, and there was an obvious way to split it up to make things simpler.

Here I separate out the changes to the workflow that allow the new executables (in #4534) to be called in a minifollowup.

The basic outline of this PR: a new, optional minifollowup is created, which makes the files to upload and writes them to an 'upload_files' directory in the main output directory of the search.

The main change is the new code pycbc_upload_prep_minifollowup; this defines a minifollowup which prepares files for upload to gracedb, including:

  • an XML file, including the PSD and SNR timeseries
  • the .fits file from running bayestar (or rather, the wrapper around it)
    • the skymap plot made from this .fits file
  • a plot of the SNR timeseries of all active detectors
  • a plot of the PSD of all active detectors

Other changes are made so that this workflow can run properly.

There are two parts that affect the other minifollowups:

  • There is a change to the make_single_template_plots function so that it returns both the plots and the single_template files (this can be reversed if wanted)
  • There is a change so that getting the parameters of the event uses shared code in get_single_template_params

I think this approach is neatest in terms of applying/removing this functionality through the config and existing codes; however, I also see the benefits of creating these files as part of the foreground minifollowup and adding them to the open-box directory (and plots to the page?), and I can change to do that instead. That would, though, change how the foreground minifollowup is called, in terms of a FAR threshold vs a number of events.
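To make the two interface changes listed above concrete, here is a minimal sketch of the intended shapes; the names follow the description, but the signatures and bodies are illustrative placeholders, not PyCBC's actual implementation:

    # Illustrative sketch only: signatures and bodies are placeholders,
    # not PyCBC's implementation of these functions.

    def get_single_template_params(trigger_time, trigger_file, bank_file):
        """Shared helper: look up the parameters of an event (template
        parameters, trigger time, active detectors, ...) so that all
        minifollowups extract them the same way."""
        params = {}  # placeholder: would be read from the trigger/bank files
        return params

    def make_single_template_plots(workflow, params, out_dir, tags=None):
        """Now returns both the plot Files and the single_template data
        Files (previously only the plots), so that
        pycbc_upload_prep_minifollowup can reuse the SNR timeseries."""
        single_template_files = []  # data Files from the single_template jobs
        plot_files = []             # image Files from the plotting jobs
        return plot_files, single_template_files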

spxiwh (Contributor) commented Nov 6, 2023:

@GarethCabournDavies This branch has conflicts, should I still be reviewing this?

GarethCabournDavies (Contributor, Author):

> @GarethCabournDavies This branch has conflicts, should I still be reviewing this?

Yes, I will check what the conflicts are and fix them.

GarethCabournDavies (Contributor, Author):

CodeClimate is complaining that the minifollowups module is too long; I think I can split it into minifollowups.py and minifollowup_plots.py (i.e. single_template, trigger_timeseries, qscans, etc.)

GarethCabournDavies force-pushed the offline_gracedb_prep_workflow branch 2 times, most recently from 82a9d4e to 58b1bbd, on December 11, 2023 at 13:15.
GarethCabournDavies (Contributor, Author):

> CodeClimate is complaining that the minifollowups module is too long; I think I can split it into minifollowups.py and minifollowup_plots.py (i.e. single_template, trigger_timeseries, qscans, etc.)

I tried this, but got so many errors in CodeClimate that I have reverted it.

spxiwh (Contributor) left a comment:

For the most part this looks good. I think there is one issue here related to file dependencies in sub-daxes (which was hard to get right, and in looking closer I think I've also spotted a bug in how this is done in the sbank workflow). Perhaps some changes to resolve_url are needed to support the "the file is produced in the workflow above this; I don't have a PFN, but pegasus would manage it" case. However, this requires somehow knowing that this is the case. Originally there was a --is-subworkflow flag to identify this. Alex removed that (alongside a bunch of things where the flag wasn't needed), but I think we may have lost this functionality in the process...

(Resolved review threads on bin/minifollowups/pycbc_foreground_minifollowup and bin/minifollowups/pycbc_upload_prep_minifollowup.)

    psd_fname = args.psd_files[ifo]
    psd_file = resolve_url_to_file(os.path.abspath(psd_fname),
                                   attrs={'ifos': ifo})
spxiwh (Contributor):

I think this is a bad idea because it will copy the file into the new directory, and this is a big file. If you follow through what this function is doing, there's a call to a resolve_url function. That takes a copy_to_cwd option, but it is always True in this instance. It's perhaps worth exposing this keyword argument and setting it to False here (and perhaps in other instances as well).
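A minimal sketch of that suggestion, assuming resolve_url keeps its existing copy_to_cwd option; the copy_to_cwd keyword on resolve_url_to_file is the proposed addition, not the current signature, and the File construction is shown schematically:

    import os
    from pycbc.workflow.core import File, resolve_url

    def resolve_url_to_file(curr_pfn, attrs=None, copy_to_cwd=True):
        # Proposed: forward copy_to_cwd instead of always copying the
        # input into the working directory.
        local_path = resolve_url(curr_pfn, copy_to_cwd=copy_to_cwd)
        curr_file = File.from_path(curr_pfn, attrs=attrs)
        # Record where the file physically lives (unchanged when the
        # copy is suppressed).
        curr_file.add_pfn(local_path, site='local')
        return curr_file

    # Caller side (psd_fname and ifo as in the snippet above), with the
    # copy suppressed for this large file:
    # psd_file = resolve_url_to_file(os.path.abspath(psd_fname),
    #                                attrs={'ifos': ifo},
    #                                copy_to_cwd=False)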

spxiwh (Contributor):

... this also assumes that this is always a top-level workflow and that it would never be called from within another workflow. For a subworkflow, things are a bit more complicated: you need to add file dependencies, and you would not create a File object in this way.

GarethCabournDavies (Contributor, Author):

I'd like this to be able to work either way, e.g. we can run it on an already-completed analysis, or run it within the workflow.

GarethCabournDavies (Contributor, Author):

The copying also happens for other large files here and here; I guess these can be updated as well.

spxiwh (Contributor):

This actually wouldn't result in a copy. The filename passed here would always be something like ./H1L1-HDFTRIGGER_MERGE.hdf, and pegasus copies/symlinks the file here before this runs (as per usual), so the file is already in the right place.

The main point is that these are always subworkflows, and pegasus will have already handled file transfer for subworkflows.

(Resolved review threads on bin/minifollowups/pycbc_upload_prep_minifollowup and bin/workflows/pycbc_make_offline_search_workflow.)
        'generate_xml',
        ifos=workflow.ifos, out_dir=out_dir,
        tags=tags
    )
spxiwh (Contributor):

A more general point: in cases like this, where a particular executable is required, there should be some check that the thing in [executables] -> generate_xml is actually the executable that this code is written against.

GarethCabournDavies (Contributor, Author):

That sounds like a good idea. I think something like the subclass check used for coinc_lim_for_thresh in the stat module might be worthwhile, but that would require quite a few changes, so I think it is outside the scope of this PR.
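A minimal sketch of one way such a check could look, in the spirit of the suggestion above; the helper and the expected program name are hypothetical, not existing PyCBC code:

    import os

    def check_executable_matches(cp, exe_name, expected_basenames):
        # Raise if [executables] -> exe_name points at a program other
        # than the one this workflow code was written against.
        exe_path = cp.get('executables', exe_name)
        if os.path.basename(exe_path) not in expected_basenames:
            raise ValueError(
                "[executables] %s = %s; this workflow code expects one "
                "of %s" % (exe_name, exe_path, sorted(expected_basenames))
            )

    # Hypothetical usage before setting up the generate_xml node:
    # check_executable_matches(workflow.cp, 'generate_xml',
    #                          {'pycbc_generate_upload_xml'})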

        ifos=workflow.ifos, out_dir=dax_output, tags=tags)

    node = exe.create_node()
    node.add_input_opt('--config-files', config_file)
spxiwh (Contributor):

If you're doing this (which is right), you must not use resolve_url in pycbc_upload_prep_minifollowup. If you do, you'll end up with file paths pointing to some local condor filesystem ... and this will fail.

I tried to point out a good example of this, but I think I found broken code (https://github.com/gwastro/pycbc/blob/master/bin/workflows/pycbc_make_sbank_workflow#L205). That example is a workflow that could be a top-level workflow or a subworkflow. The seed bank would be passed from the top-level workflow, or is an input file. I think this would fail if seed-file is given and it is not a subworkflow.

The heart of the issue is that resolve_url does this (https://github.com/gwastro/pycbc/blob/master/pycbc/workflow/core.py#L2120) and from_file does not. We do need to add a PFN if this is a top-level workflow and the file is a global input; we must not add the PFN if it is not (especially because, in that case, we cannot predict where the statmap-file and bank-file would actually be).
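A sketch of the rule being described, with the top-level vs subworkflow cases made explicit; the helper name is hypothetical and PyCBC's File/resolve_url internals differ in detail:

    from pycbc.workflow.core import File, resolve_url

    def workflow_input_file(path, attrs=None, is_subworkflow=False):
        curr_file = File.from_path(path, attrs=attrs)
        if not is_subworkflow:
            # Top-level workflow: a genuine global input, so attach a
            # PFN telling pegasus where the file physically lives.
            curr_file.add_pfn(resolve_url(path), site='local')
        # Subworkflow: add no PFN; the parent workflow manages the
        # transfer and we cannot predict where the file will land.
        return curr_file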

GarethCabournDavies (Contributor, Author):

Should I add back in the --is-subworkflow option for this then?

GarethCabournDavies (Contributor, Author):

I'm unsure of this now: it seems to be working in my latest tests (using the copy_to_cwd=True argument). However, my testing uses a cache file, so the files for which resolve_url is called are definitely not still in local. Could this be an issue?

This also matches the way this is called in the foreground and injection minifollowups; will they need a fix as well? I'm surprised we haven't seen issues before in that case.

spxiwh (Contributor):

I think the reuse cache is affecting this test. There's no file management going on between the workflows when everything just comes as a pregenerated input file in a guaranteed location. Is there any way to turn this off, perhaps letting the workflow make H1L1-PAGE_FOREGROUND_XMLALL-XXX.xml again from the already existing inputs?

I do see the "incorrect" file location in the dax file:

  - lfn: H1L1-INSP_SEGMENTS-1368975466-1121610.xml
    pfns:
    - site: local
      pfn: file:///local/condor/execute/dir_2994154/pegasus.kxIWl4525/H1L1-INSP_SEGMENTS-1368975466-1121610.xml

but it's possible that pegasus will always prefer PFNs coming from reuse caches ... and it uses the reuse-cache functionality to send intermediate files up and down. So this may work okay without changes ...

spxiwh (Contributor):

Any update on what happens without using a reuse cache (or is there a workflow I can look at with this in place)?

spxiwh (Contributor) commented Feb 2, 2024:

We've verified that the PFN issue is not causing failures in this code. Pegasus seems to always prefer files being sent down from higher-level workflows over PFNs set in dax files. I still think we should fix this, but it doesn't hold this patch up any longer.

GarethCabournDavies merged commit 806750c into gwastro:master on Feb 5, 2024; 32 of 33 checks passed.

GarethCabournDavies deleted the offline_gracedb_prep_workflow branch on February 5, 2024.

GarethCabournDavies added the v23_release_branch label (PRs applied to the v2.3.X release branch or to be cherry-picked if merging to master) on Feb 9, 2024.
spxiwh pushed a commit to spxiwh/pycbc that referenced this pull request on Feb 9, 2024:

Workflow changes needed to manage offline gracedb upload prep (gwastro#4535)

* Workflow changes needed to manage preparation of offline gracedb upload files
* fix change removed after rebase
* Allow waveforms to not be given
* IH comments
spxiwh added a commit that referenced this pull request on Feb 9, 2024:

* Need to create .cvmfscatalog file (#4619)
* Workflow changes needed to manage offline gracedb upload prep (#4535)
  * Workflow changes needed to manage preparation of offline gracedb upload files
  * fix change removed after rebase
  * Allow waveforms to not be given
  * IH comments
* Update setup.py

Co-authored-by: Gareth S Cabourn Davies <[email protected]>
spxiwh removed the v23_release_branch label on Feb 20, 2024.
bhooshan-gadre pushed a commit to bhooshan-gadre/pycbc that referenced this pull request on Mar 4, 2024.

lpathak97 pushed a commit to lpathak97/pycbc that referenced this pull request on Mar 13, 2024.

acorreia61201 pushed a commit to acorreia61201/pycbc that referenced this pull request on Apr 4, 2024.