Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support ripping with docker in parallel #8

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
75 changes: 50 additions & 25 deletions rip_docker.sh
Original file line number Diff line number Diff line change
Expand Up @@ -13,28 +13,53 @@ if [ "$#" -ne 2 ]; then
exit -1
fi

# Flags determined by examining docker-wine with:
# --as-me
# --workdir
# --xvfb
# --name
# https://github.com/scottyhardy/docker-wine/blob/master/docker-wine
docker run \
-it \
--rm \
--volume=${2}:/data \
--env=USER_NAME=${USER} \
--env=USER_UID=$(id -u ${USER}) \
--env=USER_GID=$(id -g ${USER}) \
--env=USER_HOME=${HOME} \
--workdir=/home/${USER} \
--env=USE_XVFB=yes \
--env=XVFB_SERVER=:95 \
--env=XVFB_SCREEN=0 \
--env=XVFB_RESOLUTION=320x240x8 \
--env=DISPLAY=:95 \
--hostname=bruker-ripper \
--name=bruker-ripper \
--shm-size=1g \
--env=TZ=America/Los_Angeles \
${1}
## Support running in parallel:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why you choose to do this in a shell script, compared to launching multiple jobs (or an array job) via slurm?

The reason I ask is because this seems to be adding some complexity around xvfb and locking (which looks tricky and is just one more thing that could break....), whereas parallel launching via slurm is straightforward. Also, it gets tricky to manage ending gracefully and/or doing retries in case where one or more of the parallel threads fails for some reason.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been running this script on my local computer to avoid filesystem slowness, so this is nice for parallelization, but perhaps better to be in new script..?

# https://stackoverflow.com/questions/30332137/xvfb-run-unreliable-when-multiple-instances-invoked-in-parallel

# allow settings to be updated via environment
: "${xvfb_lockdir:=$HOME/.xvfb-locks}"
: "${xvfb_display_min:=99}"
: "${xvfb_display_max:=599}"

# assuming only one user will use this, let's put the locks in our own home directory
# avoids vulnerability to symlink attacks.
mkdir -p -- "$xvfb_lockdir" || exit

i=$xvfb_display_min # minimum display number
while (( i < xvfb_display_max )); do
if [ -f "/tmp/.X$i-lock" ]; then # still avoid an obvious open display
(( ++i )); continue
fi
exec 5>"$xvfb_lockdir/$i" || continue # open a lockfile
if flock -x -n 5; then # try to lock it
# if locked, run

# Flags determined by examining docker-wine with:
# --as-me
# --workdir
# --xvfb
# --name
# https://github.com/scottyhardy/docker-wine/blob/master/docker-wine
exec docker run \
-i \
--rm \
--volume=${2}:/data \
--env=USER_NAME=${USER} \
--env=USER_UID=$(id -u ${USER}) \
--env=USER_GID=$(id -g ${USER}) \
--env=USER_HOME=${HOME} \
--workdir=/home/${USER} \
--env=USE_XVFB=yes \
--env=XVFB_SERVER=:$i \
--env=XVFB_SCREEN=0 \
--env=XVFB_RESOLUTION=320x240x8 \
--env=DISPLAY=:$i \
--hostname=bruker-ripper \
--name=bruker-ripper-$i \
--shm-size=1g \
--env=TZ=America/Los_Angeles \
${1}
fi
(( i++ ))
done

19 changes: 13 additions & 6 deletions two-photon/process.py
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,7 @@ def get_dirname_hdf5(sess_name, rec_name):
prev_data = get_dirname_hdf5(sn, rn) / 'data' / 'data.h5'
data_files.append(prev_data)
data_files.append(fname_hdf5)
run_suite2p(data_files, dirname_output, mdata)
run_suite2p(data_files, dirname_output, mdata, args.ops_suite2p)

if args.backup_output:
backup(dirname_output, dirname_backup / 'output')
Expand Down Expand Up @@ -156,8 +156,7 @@ def preprocess(basename_input, dirname_output, fname_csv, fname_uncorrected, fna
stim_channel_name, settle_time):
"""Main method for running processing of TIFF files into HDF5."""
size = mdata['size']

df_voltage = pd.read_csv(fname_csv, index_col='Time(ms)', skipinitialspace=True)
df_voltage = pd.read_csv(fname_csv, skipinitialspace=True)
logger.info('Read voltage recordings from: %s, preview:\n%s', fname_csv, df_voltage.head())
fname_frame_start = dirname_output / 'frame_start.h5'
frame_start = artefacts.get_frame_start(df_voltage, fname_frame_start)
Expand Down Expand Up @@ -241,13 +240,19 @@ def run_cmd(cmd, expected_returncode, shell=False):
logger.info('Output:\n%s', result.stdout.decode('utf-8'))


def run_suite2p(hdf5_list, dirname_output, mdata):
def run_suite2p(hdf5_list, dirname_output, mdata, ops_file=None):
z_planes = mdata['size']['z_planes']
fs_param = 1. / (mdata['period'] * z_planes)

# Load suite2p only right before use, as it has a long load time.
import suite2p
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loads in all of suite2p (which includes the gui stuff and the detection modules). It's not fatal, but I've found suite2p to be heavy weight. So I had just imported the subset of libraries (run_s2p) that we use below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

latest version of suite2p refactored a bit, e.g. need suite2p.default_ops() but maybe not fully necessary

from suite2p import run_s2p
default_ops = run_s2p.default_ops()
if ops_file is None:
ops = suite2p.default_ops()
else:
# from:
# https://github.com/MouseLand/suite2p/blob/4b6c3a95b53e5581dbab1feb26d67878db866068/suite2p/gui/rungui.py#L472
ops = np.load(ops_file, allow_pickle=True).item()
params = {
'input_format': 'h5',
'data_path': [str(f.parent) for f in hdf5_list],
Expand All @@ -264,7 +269,7 @@ def run_suite2p(hdf5_list, dirname_output, mdata):
logger.info('Running suite2p on files:\n%s\n%s', '\n'.join(str(f) for f in hdf5_list), params)
with open(dirname_output / 'recording_order.json', 'w') as fout:
json.dump([str(e) for e in hdf5_list], fout, indent=4)
run_s2p.run_s2p(ops=default_ops, db=params)
run_s2p(ops=ops, db=params)


def parse_args():
Expand Down Expand Up @@ -295,6 +300,8 @@ def parse_args():
help='Top level directory for SLM setup data')
group.add_argument('--output_dir', type=pathlib.Path, help='Top level directory of data processing')

group.add_argument('--suite2p-ops', type=pathlib.Path, help='.ops file for suite2p', default=None)

group.add_argument('--recording',
type=str,
help=('Name of a recording, given as a slash separated id of SESSION/RECORDING/OPTIONAL_PREFIX '
Expand Down
5 changes: 3 additions & 2 deletions two-photon/rip.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,8 @@
# Ripping process does not end cleanly, so the filesystem is polled to detect the
# processing finishing. The following variables relate to the timing of that polling
# process.
RIP_TOTAL_WAIT_SECS = 3600 # Total time to wait for ripping before killing it.
RIP_EXTRA_WAIT_SECS = 10 # Extra time to wait after ripping is detected to be done.
RIP_TOTAL_WAIT_SECS = 60*60*10 # Total time to wait for ripping before killing it.
RIP_EXTRA_WAIT_SECS = 60*60*10 # Extra time to wait after ripping is detected to be done.
RIP_POLL_SECS = 10 # Time to wait between polling the filesystem.


Expand All @@ -35,6 +35,7 @@ def determine_ripper(data_dir, ripper_dir):
version = root.attrib['version']

# Prairie View versions are given in the form A.B.C.D.
# TODO: allow match when minor version mismatches, eg fall back to `version = '5.5.1.1'`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this uses the major+minor to determine which ripper version to use. It seems, at least anecdotally, that the minor version is important.

What behavior are you expecting on a minor version mismatch?

One improvement I see is that we should throw an error right here if the ripper version is not in the repo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you're right i think minor version is sometimes (usually) important...annoying...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this TODO probably isn't necessary. We currently handle 5.4 vs 5.5, and I'd like the code to fail loudly (not silently try something we haven't tested) if it detects anything else.

match = re.match(r'^(?P<majmin>\d+\.\d+)\.\d+\.\d+$', version)
if not match:
raise RippingError('Could not parse version (expected A.B.C.D): %s' % version)
Expand Down
2 changes: 1 addition & 1 deletion two-photon/transform.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ def convert(data, fname_data, df_artefacts=None, fname_uncorrected=None):

logger.info('Writing corrected data to %s', fname_data)
with h5py.File(fname_uncorrected, 'r') as hfile:
arr = da.from_array(hfile[HDF5_KEY])
arr = da.from_array(hfile[HDF5_KEY], chunks=('auto', -1, -1, -1))
tbenst marked this conversation as resolved.
Show resolved Hide resolved
# Depth of 1 in the first coordinate means to bring in the frames before and after
# the chunk -- needed for doing diffs.
depth = (1, 0, 0, 0)
Expand Down