This repository has been archived by the owner on Oct 19, 2024. It is now read-only.

adding docker build materials and README #240

Closed
wants to merge 45 commits into from

Conversation

alecgunny
Collaborator

@alecgunny alecgunny commented Dec 8, 2022

Adds a Dockerfile for building an (admittedly large) container image which has all of the project environments pre-installed via pinto. We can make this available on LDG and use it as the default entrypoint for new users to get BBHNet running. Also adds a README which covers the couple of different paths to getting up and running with BBHNet, and which will eventually include developer instructions as well.

  • Add workflow for container build/push
  • Make a PR against OSG to include this container on /cvmfs
  • Include developer set-up instructions in README

Closes #237 , closes #192

@alecgunny alecgunny added this to the Documentation milestone Dec 8, 2022
@alecgunny alecgunny self-assigned this Dec 8, 2022
@alecgunny alecgunny changed the base branch from main to documentation December 8, 2022 21:25
@alecgunny
Collaborator Author

Omicron won't work with a containerized deployment right now. pyomicron doesn't support creating condor sub files that leverage singularity images, so there is no omicron executable we can point to that will actually exist inside the submitted job.

Taking a look at the pyomicron code, it seems like it might be somewhat straightforward to augment to support this behavior. Basically you would just need to add some code to its OmicronProcessJob.write_sub_file function to check if the job is meant to run in a singularity container, and if so add the corresponding

+SingularityImage = "..."
Requirements = HasSingularity

lines to the sub file and re-write it. All you would have to do then is add an arg to the command line parser and pass it to the DAG instantiation here.
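
A minimal sketch of the sub file rewrite described above, written as a standalone function rather than as a patch to pyomicron. The function name `add_singularity` and the image path in the usage note are hypothetical, and nothing here is taken from pyomicron's actual code; it only assumes that the two lines have to land before the `queue` statement and that an existing `requirements` line should be combined with `HasSingularity` rather than overwritten.

```python
import re


def add_singularity(sub_text: str, image: str) -> str:
    """Return condor sub file text with Singularity settings inserted.

    Hypothetical helper, not part of pyomicron. The new lines are placed
    before the first `queue` statement (or appended if there is none).
    """
    out = []
    has_requirements = False
    for line in sub_text.splitlines():
        # If the file already sets requirements, AND them with HasSingularity
        if re.match(r"\s*requirements\s*=", line, re.IGNORECASE):
            key, _, value = line.partition("=")
            line = f"{key.rstrip()} = ({value.strip()}) && HasSingularity"
            has_requirements = True
        out.append(line)

    extra = [f'+SingularityImage = "{image}"']
    if not has_requirements:
        extra.append("Requirements = HasSingularity")

    # Submit commands must appear before `queue`, so insert just ahead of it
    idx = next(
        (i for i, l in enumerate(out) if re.match(r"\s*queue\b", l, re.IGNORECASE)),
        len(out),
    )
    out[idx:idx] = extra
    return "\n".join(out) + "\n"
```

Calling `add_singularity(open("omicron.sub").read(), "/path/to/image.sif")` (both paths are placeholders) would give the text to write back to the sub file; the command line arg mentioned above would just supply the image path.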

@EthanMarx I know you were also planning on doing some forking/developing of omicron. Have you made any moves on that? Might make sense for us to create an ML4GW/pyomicron fork that we develop on our own and use to submit PRs upstream without having to wait for their dev cycle.

@alecgunny
Collaborator Author

Filed this in gwpy/pyomicron#150 for reference and tracking purposes

@EthanMarx
Collaborator

Yeah I like the idea of an ML4GW/pyomicron fork. We can fork and make pull requests against it ourselves, and then if the pyomicron folks deem our PRs worthy, we can make PRs from ML4GW/pyomicron to gwpy/pyomicron.

I'll try to set this up for my more basic addition (flag for cleaning the temporary files) today.

alecgunny and others added 6 commits December 16, 2022 08:50
* updating typeo dep in archs and potentially fixing workflow

* getting rid of workflow debug echo statement

* getting arch tests passing

* updating downstream environments and adding env doc

* fixing weird issue with data lib
* updating ml4gw reference

* using ml4gw distributions

* adding dummy device arg to sample patch
* Updated poetry.lock files

* Added check to Sampler to make sure the buffer is large enough

* Getting tests working

* Getting tests working

* Aligning lock files with main, I think

* Switched from raising an error for incompatible parameters to forcing parameters to be compatible. Updated pyproject.toml default values so that new definitions match the old values

* Forgot to add changes to pyproject.toml

Co-authored-by: William Benoit <[email protected]>
Co-authored-by: William Benoit <[email protected]>
@alecgunny
Collaborator Author

alecgunny commented Dec 23, 2022

PR for omicron fix filed in ML4GW/pyomicron#1. Once this is merged I'll update the pyproject.toml to reference the upstream master branch.

alecgunny and others added 6 commits December 24, 2022 16:55
* updating export tensorrt dep

* getting new trt versions working

* adding comments to pyproject for clarification

* updating infer image

* updating environments to resolve issues

* updating hermes reference

* fixing infer tests

* fixing hermes image location
* parallelize glitch project

* fix line endings

* fix line endings

* remove veto from tests

* fix glitch tests

* fix args in glitch test

* fix glitch tests

* get tests working

* change to ThreadPool
@EthanMarx
Collaborator

Been looking into docker more and how it may interact with condor.

Thoughts on the following:

  1. Instead of creating a monolithic image for the entire project (which seems to be large), we containerize each component (e.g. an individual container for each of datagen, train, analysis, etc.)
  2. Create a condor wrapper script that builds the end-to-end pipeline and runs each individual component in its respective container.
  3. Create a version of the pipeline with docker-compose for the scenario where a user does not have condor access. (For the pyomicron jobs, we could simply launch the bash file that is created, or write a wrapper that creates a more parallelized version)
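
A rough sketch of what the wrapper in (2) could generate, assuming one image per component and a strictly linear pipeline. The component order, image paths, and `run_<component>.sh` entrypoints are placeholders made up for illustration, not anything that exists in the repo; the only things taken from this thread are the component names and the `+SingularityImage` / `HasSingularity` sub file lines.

```python
from typing import Dict, List

# Hypothetical component -> image mapping; the paths are placeholders
IMAGES: Dict[str, str] = {
    "datagen": "/path/to/datagen.sif",
    "train": "/path/to/train.sif",
    "analysis": "/path/to/analysis.sif",
}


def submit_text(component: str, image: str) -> str:
    """Build a condor sub file that runs one component in its own image."""
    return "\n".join([
        "universe = vanilla",
        f"executable = run_{component}.sh",  # hypothetical entrypoint script
        f'+SingularityImage = "{image}"',
        "Requirements = HasSingularity",
        f"log = {component}.log",
        "queue",
    ]) + "\n"


def dag_text(components: List[str]) -> str:
    """Build a DAG file that runs the components one after another."""
    jobs = [f"JOB {c} {c}.sub" for c in components]
    # Chain each component to the next one in the list
    edges = [f"PARENT {a} CHILD {b}" for a, b in zip(components, components[1:])]
    return "\n".join(jobs + edges) + "\n"


if __name__ == "__main__":
    order = list(IMAGES)
    for name in order:
        print(f"# {name}.sub")
        print(submit_text(name, IMAGES[name]))
    print("# pipeline.dag")
    print(dag_text(order))
```

Writing each `submit_text` result to `<component>.sub` and the `dag_text` result to a DAG file would give something that can be handed to `condor_submit_dag`. The docker-compose version in (3) could reuse the same component list.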

@alecgunny
Collaborator Author

@EthanMarx great points. What you've described with (1) is my plan for production for sure; I would never use a container like this there. And (2) and (3) are generalizations of pinto to containerization that I've been considering for a while, but that haven't really made sense until the past couple of months, now that we can use apptainer without root to build and pull containers at will for dev purposes.

I think the idea here for me was to create something that didn't require someone to install any dependencies to run. Just a true out-of-the-box implementation that someone who just wanted to get up and running could use with 0 thought, and could also be used to recreate deterministically any results we eventually publish. This was more attractive when I thought it wouldn't take that much work, but obviously Omicron is proving that's not the case, so maybe it's a fair time to rethink this.

@EthanMarx
Collaborator

Was able to get a docker image built that installs omicron and gwollum from source via the git.ligo.org master branches, into a conda environment. I've pushed the docker image here.

Here's the Dockerfile. It's my first ever, so I'm curious to hear feedback:

ARG CONDA_TAG=22.11.1

FROM continuumio/miniconda3:${CONDA_TAG} as base

# create conda environment installing omicron/gwollum deps
COPY environment.yaml .
RUN conda env create -f environment.yaml 

ENV SRCDIR=/src/ \
    INSTALLDIR=/opt/conda/envs/omicron

SHELL ["conda", "run", "-n", "omicron", "/bin/bash", "-c"]

RUN apt-get update && apt-get install -y --no-install-recommends \
        git \
        build-essential \
        pkg-config \
        cmake

# install GWOLLUM
# note: the trailing `source` only affects this RUN step;
# the environment it sets does not persist into later layers
RUN mkdir -p ${SRCDIR} \
        && cd ${SRCDIR} \
        && git clone https://git.ligo.org/virgo/virgoapp/GWOLLUM.git && cd GWOLLUM \
        && mkdir ./build && cd ./build \
        && cmake -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} ${SRCDIR}/GWOLLUM \
        && make && make install \
        && source ${INSTALLDIR}/etc/gwollum.env.sh


# install Omicron
# note: as above, the trailing `source` does not carry over to the final image
RUN cd ${SRCDIR} \
    && git clone https://git.ligo.org/virgo/virgoapp/Omicron.git && cd Omicron \
    && mkdir ./build && cd ./build \
    && cmake -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} ${SRCDIR}/Omicron \
    && make && make install \
    && source ${INSTALLDIR}/etc/omicron.env.sh

And the environment.yaml that contains the gwollum / omicron deps

name: omicron 
channels:
  - conda-forge
dependencies:
  - fftw
  - hdf5
  - framel>=8.42 
  - root_base>=6.26
  - libstdcxx-ng 

@wbenoit26 wbenoit26 closed this Jan 11, 2024
Successfully merging this pull request may close these issues.

Containerize repo Main repo README
3 participants