O2vae integration #1

Open
wants to merge 220 commits into
base: main
7b43109
Removing umap feature compression
ctr26 Jan 15, 2024
e58fbac
Merge pull request #19 from ctr26/decoder_fix
ctr26 Jan 17, 2024
38cf94d
Merge pull request #21 from ctr26/no_umap
ctr26 Jan 17, 2024
b17827a
Indexation augmentation (forgot this wasnt in here)
ctr26 Jan 5, 2024
5cf77fc
Fixed the import issue
ctr26 Jan 8, 2024
9273f8c
missing import
ctr26 Jan 8, 2024
1abc9d6
Merge pull request #17 from ctr26/indexation_aug
ctr26 Jan 17, 2024
5f46b74
Fixing tests
ctr26 Jan 17, 2024
0fc9066
First attempt at setting up the testing cicd
ctr26 Jan 16, 2024
8a2d14d
adding windows back in?
ctr26 Jan 16, 2024
74f2d13
commented instead I think this makes more sense
ctr26 Jan 16, 2024
c190e04
removing snakemake from env
ctr26 Jan 16, 2024
1e657fd
Forgot to remove sourceing
ctr26 Jan 16, 2024
9a61af2
Merge pull request #22 from ctr26/cicd
ctr26 Jan 17, 2024
510aa04
Generalised the logging a bit
ctr26 Jan 18, 2024
3b37ca9
First attempt as arg hashing for checkpoints
ctr26 Jan 18, 2024
8e18943
Early stopping on val loss to stop overfitting
ctr26 Jan 18, 2024
397abe8
Attempt at cli
ctr26 Jan 18, 2024
8534143
Merge remote-tracking branch 'origin/hashing_model_args' into all_tog…
afoix Jan 19, 2024
e4e26a9
Merge pull request #34 from ctr26/logger_improve
ctr26 Jan 19, 2024
820b816
Merge pull request #35 from ctr26/hashing_model_args
ctr26 Jan 19, 2024
4bf27f9
Early stopping on val loss to stop overfitting
ctr26 Jan 18, 2024
4ca9dd0
adding branch prose back
ctr26 Jan 20, 2024
0068fa0
Merge branch 'master' into all_together
afoix Jan 20, 2024
26bf58c
Merge remote-tracking branch 'origin/early_stopping' into all_together
afoix Jan 20, 2024
1d5c49c
Merge remote-tracking branch 'origin/readme_edits' into all_together
afoix Jan 20, 2024
a34ee72
local changes to run
afoix Jan 20, 2024
9135043
command line arguments
afoix Jan 20, 2024
43c6ed0
enable testing + uncomment dataset
afoix Jan 20, 2024
e4e0aae
added a slurm python script
afoix Jan 20, 2024
e78afd6
fix cli type
afoix Jan 20, 2024
ee021c6
add correct name for the jobs
afoix Jan 20, 2024
30c34cc
Log f1 score mean and std in wandb
afoix Jan 20, 2024
b77c4fc
choose memory allocation base on latent space size
afoix Jan 20, 2024
8f7d9d8
dynamically chose n gpus based on latent space size + fix mem allocat…
afoix Jan 20, 2024
41dec50
fix gpu allocation typo
afoix Jan 20, 2024
775548a
comment out all the mean and std login for f1
afoix Jan 20, 2024
1cf6646
added a --clear-checkpoints clarg
afoix Jan 23, 2024
704c88f
use wandblogger to log info (mean, std dev...)
afoix Jan 23, 2024
15343f6
run individual jobs in own folder to work around checkpoints
afoix Jan 23, 2024
4981381
Updating pythae
ctr26 Feb 21, 2024
9da2a22
Adding standard scalar to df scoring fun
ctr26 Feb 21, 2024
873ef92
Refactoring and adding back umap
ctr26 Feb 21, 2024
7527363
Seed "everything"
ctr26 Feb 21, 2024
afb0368
Reduce k folds (should be a hparam)
ctr26 Feb 21, 2024
941dc80
Update args to match what we now think is good
ctr26 Feb 21, 2024
2ccf8b4
Dynamic best weights finding
ctr26 Feb 21, 2024
d7761b6
Fixed umap
ctr26 Feb 21, 2024
22832d6
Made the class column categorical
ctr26 Feb 21, 2024
e567748
modification for slurm
afoix Feb 22, 2024
161b0a0
changes in the shape embed script
afoix Feb 22, 2024
ae215ae
Merge remote-tracking branch 'origin/shape_embed' into slurm
afoix Feb 22, 2024
faf72e4
Merge pull request #40 from ctr26/shape_embed
ctr26 Feb 22, 2024
a6cb292
fix merge commit + add command line args for dataset (name and path) …
afoix Feb 22, 2024
1999076
duplicated slurm script + specify dataset
afoix Feb 22, 2024
d3525a8
Fix wandb logger
afoix Feb 22, 2024
7a60b2a
Add helakyoto dataset to the slurm script
afoix Feb 22, 2024
e43e839
better imports
ctr26 Feb 29, 2024
ffbd8ea
Adding dataset path to args for better checkpointing
ctr26 Feb 29, 2024
a3e82a9
Imrproved dataset logic so that dist depends on coords
ctr26 Feb 29, 2024
5b720d4
Improved data rejection for datatsets
ctr26 Feb 29, 2024
dea41f6
Removing redundant code and adding logging
ctr26 Feb 29, 2024
85f1211
[bug] Removing hard coded idx mapper
ctr26 Feb 29, 2024
e411a9d
Adding logging and and tqdm so it doesnt look like the code is hanging
ctr26 Feb 29, 2024
66df0b9
More logging
ctr26 Feb 29, 2024
b5db367
Adding a disk cleanup step
ctr26 Feb 29, 2024
f580f97
Reverting to other path structure
ctr26 Feb 29, 2024
ae9f8d1
Adding average cross val to logs
ctr26 Feb 29, 2024
dcdfa63
Adding opencv to env file
ctr26 Feb 29, 2024
e8ca2cb
Added allen dataset
afoix Mar 1, 2024
775442c
Limit time per job increased to 24h
afoix Mar 1, 2024
5e21453
Merge branch 'shape_embed' into slurm
afoix Mar 1, 2024
2fce547
Fixing case where multiple contours are found, chose the longest
ctr26 Mar 1, 2024
f88da4f
Merge branch 'bad_data_fix' into slurm
afoix Mar 3, 2024
6e14ffd
change back to use dataset name from clarg + change default wandb job…
afoix Mar 3, 2024
f31b9a0
added back dataset subseting
afoix Mar 3, 2024
d052800
Added a tiny dataset for quick debugging (commented out in the slurm …
afoix Mar 3, 2024
df41415
use specific gpu resource
afoix Mar 4, 2024
f1e5a3c
Adding roc_auc and using balanced accuracy
ctr26 Mar 5, 2024
d31ca96
Probably should stratify
ctr26 Mar 5, 2024
abe2664
Adding coordinate debug (unchecked)
ctr26 Mar 5, 2024
ab48c09
put back frobenius norm false
afoix Mar 6, 2024
67e3708
merge of Craig's branch
afoix Mar 6, 2024
46eb3d1
Forgot an import
ctr26 Mar 6, 2024
db88d34
add the hardcode entity and add model dir
afoix Mar 7, 2024
162ae51
Merge remote-tracking branch 'origin/shape_embed' into slurm
afoix Mar 7, 2024
2505a2d
reduce epochs
afoix Mar 11, 2024
f588a2d
all changes
afoix Mar 27, 2024
6151a6f
first structure
afoix Mar 27, 2024
8208238
Properly overwrite default params from clargs
afoix Apr 1, 2024
aacac25
Use DatasetFolder to load .npy and turn the dist matrix into a 3 chan…
afoix Apr 1, 2024
579ab0b
Disable checkpoints in training by default (maybe re-enable at some f…
afoix Apr 1, 2024
f0fee88
Enable gpu accelleration by default
afoix Apr 1, 2024
5e97713
more informative verbose print
afoix Apr 1, 2024
bb425df
bring argparse to the masks2distmatrices script
afoix Apr 1, 2024
ac3b85a
training and test model
afoix Apr 2, 2024
7e7c7a2
Roll indices + normalisation + sanity_check + dataset name for latent…
afoix Apr 2, 2024
da4acf8
Added wandb logging
afoix Apr 2, 2024
fd7d122
Added the extraction of original/reconstructed matrices + clarg for o…
afoix Apr 2, 2024
ee7cd7f
created a script that renders dist matrices .npy as .png images
afoix Apr 2, 2024
20db342
new changes: sparisity, periodicity and also add a script to draw co…
afoix Apr 9, 2024
7844798
masks2distmat: turn find_contour into find_longest_contour
afoix Apr 9, 2024
177df9e
masks2distmat: enable periodic splprep for closed contours
afoix Apr 9, 2024
4bd1487
masks2distmat: updated default sparsity to 4
afoix Apr 9, 2024
6d951e4
distmat2contour: removed spurious return statement in vprint
afoix Apr 9, 2024
1481b51
drawContourFromDM: removed spurious return statement in vprint
afoix Apr 9, 2024
01c499a
set correct aspect ratio for distmat2contour scripts
afoix Apr 9, 2024
879c5f4
add different normalisations in dataset initial transformations
afoix Apr 9, 2024
fc58ad3
Add notion of class label to ditmat2emb script output
afoix Apr 9, 2024
2a4d955
Updated default model path in distmat2emb script
afoix Apr 9, 2024
0c166ec
Added umap and kmeans + original filenames list
afoix Apr 15, 2024
4826ba3
dist2emb: random seed for np and pl
afoix Apr 17, 2024
3e56247
dist2emb: test different initial transformations
afoix Apr 17, 2024
a29c57d
dist2emb: remove "TODO" from prints
afoix Apr 17, 2024
bf5ce7f
cosmetics
afoix Apr 17, 2024
41890e5
LitAutoEncoderTorch: return both loss and recon_loss
afoix Apr 17, 2024
beb5714
MaskEmbed: turn off normalisation in DistanceMatrixLoss
afoix Apr 17, 2024
da092b5
MaskEmbed: log losses in loss_function method
afoix Apr 17, 2024
78d067d
renamed varitional_loss to vq_loss + comment out kdl_vae_loss
afoix Apr 24, 2024
d0c4ffb
removed new line
afoix Apr 24, 2024
253b3f4
Normalise contour coord in mask2distmat script
afoix Apr 24, 2024
5066e3d
Use bokeh for interactive umap plot (save as html file)
afoix Apr 24, 2024
977620e
save latent_space with extra info as pickle again and have a separate…
afoix Apr 25, 2024
ccdc090
updated the render umap script with a _hardcoded_ trick to extract in…
afoix Apr 26, 2024
3601e3b
minor config + comments
afoix Apr 29, 2024
3f98d06
added a beta vae model
afoix May 8, 2024
f6ee1ac
added extra parameters in the wandb jobname
afoix May 8, 2024
01b746f
finer grained clargs around latent space related parameters
afoix May 10, 2024
d493ea5
log different losses for vq or beta models
afoix May 10, 2024
b15dff4
code to do classification using the features of the latent space
afoix May 13, 2024
1a2bcf1
new latent space size
afoix Jun 5, 2024
b6f12c2
Added imports that will be needed for next commits
afoix Jun 5, 2024
79e75ce
Added checkpoint mechanism
afoix Jun 5, 2024
a8406ea
Added regionprops + fourrier decomposition trials (! hardcoded path t…
afoix Jun 5, 2024
03a4cf7
Adding a n compression parameter for the latent space size
afoix Jun 16, 2024
5467f9f
improve scoring function and use StratifiedKFold instead of KFold for…
afoix Jun 16, 2024
344c8ab
hardcoded commited setup now points to quick test setup
afoix Jun 16, 2024
9a6456e
initial refactor commit, script with split up functionnalities, missi…
afoix Jun 16, 2024
751394e
Added predictions + kmeans of input data
afoix Jun 17, 2024
949254d
factored out evaluation functionality + added regionprops, efd and sc…
afoix Jun 17, 2024
43a3969
cleaner logging + score shapeembed itself
afoix Jun 18, 2024
9030e48
reshaped shapeembed reported dataframe
afoix Jun 18, 2024
6a5c7d0
renamed label to class
afoix Jun 18, 2024
d216cc7
updated scoring function + collate and save results
afoix Jun 18, 2024
7e2bdbd
Added clargs to control matrix normalization and roll
afoix Jun 18, 2024
86ede7b
Added umap_plot
afoix Jun 18, 2024
19edf47
fix dataset clarg
afoix Jun 24, 2024
976edc2
fix model name clarg
afoix Jun 24, 2024
d1c5d3c
fix model_name clarg again
afoix Jun 24, 2024
4851241
Added early stop clarg (default no early stop)
afoix Jun 25, 2024
b95272e
added confusion matrices to scoring function
afoix Jun 26, 2024
222f698
use integer division for compression factor clarg
afoix Jun 26, 2024
793b720
explicitly binarise image when running regionprops
afoix Jun 26, 2024
43673ee
keep 'class' as a column rather than index + keeps column names as st…
afoix Jun 26, 2024
11a6e69
change len for shape[0]
afoix Jun 26, 2024
91dd1c5
drop not needed return value from run_predictions
afoix Jun 26, 2024
963d590
added combined shapeembed + efd + regionprops scoring and comment out…
afoix Jun 26, 2024
df09ad5
save combined score
afoix Jun 27, 2024
ff6fbd4
save confusion matrices
afoix Jun 27, 2024
ebadaff
First attempt at a result gathering script
afoix Jul 5, 2024
59fab42
added barplots
afoix Jul 5, 2024
bfab20d
Added a separate regionprops script
afoix Jul 18, 2024
c4a3a23
added a separate efd script
afoix Jul 18, 2024
b799803
refactor efd and regionprops out of evaluation helpers
afoix Jul 18, 2024
6bc1947
less debug info by default + create outdir if not there
afoix Jul 18, 2024
db47da9
removed regionprops/efd from main shapeembed script + filename saniti…
afoix Jul 18, 2024
5a8b274
unify file names across efd/regionprops/shapeembed
afoix Jul 18, 2024
9d3a053
Added a readme
afoix Jul 18, 2024
8326afc
track params in reporting
afoix Jul 18, 2024
aaa55db
also add model specific params as tag columns
afoix Jul 18, 2024
ba67d36
added a slurm script to sweap shapeembed parameters
afoix Jul 18, 2024
7c422b1
added resnet50_beta_vae to the factory
afoix Jul 18, 2024
1307579
added resnet50_beta_vae to the shapeembed script
afoix Jul 18, 2024
1f82d9f
handle per model params in slurm script + chose some param values to …
afoix Jul 18, 2024
2c49fc9
better slurm jobname
afoix Jul 18, 2024
a291c03
removed compression factor 20
afoix Jul 19, 2024
a29b6bd
bumped up memory allocation to 250G
afoix Jul 19, 2024
dde96fc
added --no-early-stop flag
afoix Jul 19, 2024
8dbf551
added an oom_retry function
afoix Jul 20, 2024
38a9ff6
refined min / max epochs clargs
afoix Jul 20, 2024
10c8b50
slurm script refactor args + force 150 epochs
afoix Jul 20, 2024
c0cd3b4
bring triangular + compression computation in named function (to shar…
afoix Jul 20, 2024
3b1fdda
fix in model_str function test of model_args
afoix Jul 20, 2024
65842f6
refactor slurm script to detect already completed jobs
afoix Jul 20, 2024
f97657e
factored out some common helpers
afoix Jul 20, 2024
b74be39
Add a comment/uncomment block for quick ad-hoc single config run
afoix Jul 21, 2024
9b8e934
added a function to find currently submitted slurm jobs
afoix Jul 21, 2024
2cead3d
added clargs for job filtering enabling/disabling (enabled by default)
afoix Jul 21, 2024
650fcc1
typo fix: sweap -> sweep
afoix Jul 21, 2024
62704af
parse dataset as a SimpleNamespace from job string
afoix Jul 21, 2024
6e9ffcf
updated data gathering script to newer changes (still TODO for figures)
afoix Jul 21, 2024
ffce0d3
removed stale script string
afoix Jul 21, 2024
d802f9a
Split model name in two columns if there are model args
afoix Jul 21, 2024
663cc52
remove stale import
afoix Jul 21, 2024
7d328d9
experiment with plots
afoix Jul 21, 2024
e3796ba
keep exploring potential plots
afoix Jul 22, 2024
cecde4e
more graphs
afoix Jul 22, 2024
344cff1
fix model name in shapeembed output csv
afoix Jul 22, 2024
6846ce1
Added loss / mse to shapeembed's generated csv
afoix Jul 22, 2024
7a8972f
updated slurm script with regex filtering of squeue output
afoix Jul 25, 2024
d9c87a3
added a simple latex table to the gather_run_results script
afoix Jul 25, 2024
cced4e9
minor refactor in efd
afoix Jul 27, 2024
3e6e4c9
minor refactor in regionprops
afoix Jul 27, 2024
2908c15
generated plots and more tables in gather_run_results
afoix Jul 27, 2024
41d4603
added regionprops and efd to gather results script
afoix Jul 27, 2024
251d85f
Updated graphs titles
afoix Jul 29, 2024
932b13a
fake beta column if necessary and filter out regionprops and efd for …
afoix Jul 29, 2024
61f8e17
updated datasets + only find jobs and scores if corresponding filter …
afoix Jul 29, 2024
583e835
bugfix overwriting loop dataframe
afoix Jul 29, 2024
07bb89c
dded a clarg to control region prop properties
afoix Jul 30, 2024
a35490c
Added random order to efd and regionprops
afoix Aug 8, 2024
83a7679
force different markers for scatter plot F1vMSE
afoix Sep 7, 2024
b9c64d5
updated scatterplot
afoix Sep 9, 2024
d51a120
add standard deviation to the report for regions props and efd
afoix Sep 27, 2024
c0232c7
modification slurm script
afoix Sep 27, 2024
870aa42
changes to test o2vae integration XXX relies on an adapted o2vae repo…
afoix Sep 29, 2024
6fa5cc9
off-by-one in square recrop
afoix Sep 29, 2024
85ce853
added drop_last for uneven dataset sizes
afoix Sep 30, 2024
db3a4fb
specialized slurm script
afoix Sep 30, 2024
c45b884
added o2vae repo patch
afoix Sep 30, 2024
60 changes: 26 additions & 34 deletions .github/workflows/docker.yaml
@@ -2,8 +2,8 @@ name: Publish Docker
on:
push:
branches:
- main
- master
- main
- master
# pull_request: ~

env:
@@ -14,37 +14,29 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/[email protected]
with:
fetch-depth: 2
- name: Log in to the Container registry
uses: docker/[email protected]
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Checkout
uses: actions/[email protected]
with:
fetch-depth: 2
- name: Log in to the Container registry
if: ${{ !env.ACT }}
uses: docker/[email protected]
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/[email protected]
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/[email protected]
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

- name: Build and push Docker image (version tag)
if: steps.check-version.outputs.current-version
uses: docker/[email protected]
with:
context: .
push: true
tags: ghcr.io/${{ github.repository }}:${{ steps.check-version.outputs.current-version }}
labels: ${{ steps.meta.outputs.labels }}

- name: Build and push Docker image (latest tag)
if: steps.check-version.outputs.current-version
uses: docker/[email protected]
with:
context: .
push: true
tags: ghcr.io/${{ github.repository }}:latest
labels: ${{ steps.meta.outputs.labels }}
- name: Build and push Docker image (version tag)
if: steps.check-version.outputs.current-version
uses: docker/[email protected]
with:
context: .
push: true
tags: ghcr.io/${{ github.repository }}:${{ steps.check-version.outputs.current-version }}
labels: ${{ steps.meta.outputs.labels }}
66 changes: 37 additions & 29 deletions .github/workflows/test.yaml
@@ -1,36 +1,44 @@
# https://github.com/marketplace/actions/install-poetry-action
name: test

on: [pull_request,push]

name: conda
on: [push]
jobs:
test:
constructor:
name: conda build (${{ matrix.python-version }}, ${{ matrix.os }})
runs-on: ${{ matrix.os }}-latest
defaults:
run:
shell: bash -l {0}
shell: ${{ matrix.shell }}
strategy:
fail-fast: false
matrix:
# os: [ubuntu, windows, macos]
os: [ubuntu]
python-version: ["3.9"]
os: [ubuntu-latest]
# os: [ubuntu-18.04, macos-latest, windows-latest]
runs-on: ${{ matrix.os }}
include:
- os: ubuntu
shell: bash -l {0}
# - os: windows
# shell: cmd /C call {0}
# - os: macos
# shell: bash -l {0}
steps:
- name: Check out repository
uses: actions/checkout@v2
- uses: conda-incubator/setup-miniconda@v2
with:
auto-update-conda: true
use-mamba: true
environment-file: environment.yml
python-version: ${{ matrix.python-version }}
- name: poetry env
run: poetry env use python
- name: Poetry lock
run: poetry lock
- name: Install library
run: poetry install --no-interaction
# - name: Run tests
# run: |
# source .venv/bin/activate
# pytest tests/
- uses: actions/checkout@v2
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
with:
tool-cache: false
android: true
dotnet: true
haskell: true
large-packages: true
docker-images: true
swap-storage: true
- uses: conda-incubator/setup-miniconda@v2
with:
environment-file: environment.yml
miniforge-variant: Mambaforge
miniforge-version: latest
mamba-version: "*"
use-mamba: true
python-version: ${{ matrix.python-version }}
- name: Run tests
run: |
make test
98 changes: 0 additions & 98 deletions Makefile
@@ -9,101 +9,3 @@ download.data:
test:
pytest


GOOGLE_APPLICATION_CREDENTIALS=$(shell pwd)/credentials.json
BUCKET_NAME=idr-hipsci
TRAINING_DIR=idr0034-kilpinen-hipsci
PROJECT=prj-ext-dev-bia-binder-113155

JOB_PREFIX=vae
JOB_NAME=$(JOB_PREFIX)_$(shell date +%Y%m%d_%H%M%S)
JOB_DIR=gs://${BUCKET_NAME}/${JOB_NAME}/models
DATA_DIR=gs://${BUCKET_NAME}/${TRAINING_DIR}

.EXPORT_ALL_VARIABLES:
GOOGLE_APPLICATION_CREDENTIALS
BUCKET_NAME
TRAINING_DIR
JOB_PREFIX
JOB_NAME
JOB_DIR


# MY_VAR := $(shell echo whatever)

# test:
# @echo MY_VAR IS $(MY_VAR)

test:
@echo $$GOOGLE_APPLICATION_CREDENTIALS $$BUCKET_NAME $$TRAINING_DIR

all: get_data_list build

build:
conda activate torch
python idr_get_data.py

get_data_list:
ls /nfs/bioimage/drop/idr*/**/*.tiff > file_list.txt
ls -u /nfs/bioimage/drop/idr*/**/*.tiff > file_list.txt

run.on.cloud:
python idr_get_data_s3.py

run.on.cloud.snake:
snakemake --use-conda --cores all \
--verbose --google-lifesciences \
--default-remote-prefix idr-hipsci \
--google-lifesciences-region eu-west2

run.snake:
snakemake --cores all -F --use-conda --verbose

get.env.file:
conda env export --from-history -f environment.yml -n torch

on.gcp:
gcloud ai-platform jobs submit training ${JOB_NAME} \
--region=europe-west2 \
--master-image-uri=gcr.io/cloud-ml-public/training/pytorch-gpu.1-9 \
--scale-tier=CUSTOM \
--master-machine-type=n1-standard-8 \
--master-accelerator=type=nvidia-tesla-t4,count=1 \
--job-dir=${JOB_DIR} \
--package-path=./trainer \
--module-name=trainer.train \
--stream-logs \
-- \
--num-epochs=10 \
--batch-size=100 \
--learning-rate=0.001 \
--gpus=1


on.gcp.big:
gcloud ai-platform jobs submit training ${JOB_NAME} \
--region=europe-west2 \
--master-image-uri=gcr.io/cloud-ml-public/training/pytorch-gpu.1-9 \
--config=config.yaml \
--job-dir=${JOB_DIR} \
--package-path=./trainer \
--module-name=trainer.train \
--stream-logs \
-- \
--num-epochs=10 \
--batch-size=100 \
--learning-rate=0.001 \
--gpus=2 \
--accelerator='ddp'\
--num_nodes=3

tensorboard:
tensorboard --logdir=gs://$(BUCKET_NAME)/${JOB_NAME}
download.data:
kaggle competitions download -c data-science-bowl-2018

test:
pytest

download.idr:
rsync -avR --progress ctr26@noah-login:/nfs/bioimage/drop/idr0093-mueller-perturbation/ data/idr
9 changes: 8 additions & 1 deletion README.md
@@ -63,6 +63,13 @@ This utility makes it simple to fetch the necessary datasets:
```bash
make download.data
```
If you don't have a Kaggle account, create one and then follow these steps:
1. Install the Kaggle API package so that the Makefile can download the data. Installation instructions are in their [GitHub repository](https://github.com/Kaggle/kaggle-api).
2. To use the Kaggle API you also need to create an API token.
You can find out how to do this in their [documentation](https://github.com/Kaggle/kaggle-api#api-credentials).
3. Add your username and key to a file called `kaggle.json` in the `.kaggle` folder of your home directory, then restrict its permissions with `chmod 600 ~/.kaggle/kaggle.json`.
4. Don't forget to accept the conditions for the "2018 Data Science Bowl" on the Kaggle website.
Otherwise you will not be able to pull the data from the command line.

### 4. Developer Installation:

@@ -88,4 +95,4 @@ bioimage_embed is licensed under the MIT License. Please refer to the [LICENSE](

---

Happy Embedding! 🧬🔬
Happy Embedding! 🧬🔬
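
The credential step in the README change above can also be scripted. The following is a minimal sketch, not part of this PR: the helper name `write_kaggle_credentials` and its `home` parameter are hypothetical, and it assumes the token file holds a JSON object with `username` and `key` fields, as the Kaggle API documentation describes.

```python
import json
import os
from pathlib import Path


def write_kaggle_credentials(username: str, key: str, home: Path) -> Path:
    """Write a kaggle.json token under <home>/.kaggle (hypothetical helper)."""
    config_dir = home / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)

    token_path = config_dir / "kaggle.json"
    token_path.write_text(json.dumps({"username": username, "key": key}))

    # Same effect as `chmod 600 ~/.kaggle/kaggle.json`: owner read/write only.
    os.chmod(token_path, 0o600)
    return token_path
```

Calling `write_kaggle_credentials("your-username", "your-api-key", Path.home())` would create the file in the location the README describes; passing a temporary directory instead makes the helper easy to try without touching real credentials.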
73 changes: 35 additions & 38 deletions bioimage_embed/augmentations.py
@@ -1,40 +1,6 @@
import albumentations as A
import cv2

DEFAULT_AUGMENTATION = A.Compose(
[
# Flip the images horizontally or vertically with a 50% chance
A.OneOf(
[
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
],
p=0.5,
),
# Rotate the images by a random angle within a specified range
A.Rotate(limit=45, p=0.5),
# Randomly scale the image intensity to adjust brightness and contrast
A.RandomGamma(gamma_limit=(80, 120), p=0.5),
# Apply random elastic transformations to the images
A.ElasticTransform(
alpha=1,
sigma=50,
alpha_affine=50,
p=0.5,
),
# Shift the image channels along the intensity axis
A.ChannelShuffle(p=0.5),
# Add a small amount of noise to the images
A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),
# Crop a random part of the image and resize it back to the original size
A.RandomResizedCrop(
height=512, width=512, scale=(0.9, 1.0), ratio=(0.9, 1.1), p=0.5
),
# Adjust image intensity with a specified range for individual channels
A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
]
)

from typing import Any

import albumentations
@@ -43,6 +9,39 @@
from omegaconf import DictConfig
from PIL import Image

DEFAULT_AUGMENTATION_LIST = [
# Flip the images horizontally or vertically with a 50% chance
A.OneOf(
[
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
],
p=0.5,
),
# Rotate the images by a random angle within a specified range
A.Rotate(limit=45, p=0.5),
# Randomly scale the image intensity to adjust brightness and contrast
A.RandomGamma(gamma_limit=(80, 120), p=0.5),
# Apply random elastic transformations to the images
A.ElasticTransform(
alpha=1,
sigma=50,
alpha_affine=50,
p=0.5,
),
# Shift the image channels along the intensity axis
A.ChannelShuffle(p=0.5),
# Add a small amount of noise to the images
A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),
# Crop a random part of the image and resize it back to the original size
A.RandomResizedCrop(
height=512, width=512, scale=(0.9, 1.0), ratio=(0.9, 1.1), p=0.5
),
# Adjust image intensity with a specified range for individual channels
A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
]

DEFAULT_AUGMENTATION = A.Compose(DEFAULT_AUGMENTATION_LIST)

class TransformsWrapper:
def __init__(self, transforms_cfg: DictConfig) -> None:
@@ -81,9 +80,7 @@ def __init__(self, transforms_cfg: DictConfig) -> None:
_convert_="object",
)
valid_test_predict_aug.append(aug)
self.valid_test_predict_aug = albumentations.Compose(
valid_test_predict_aug
)
self.valid_test_predict_aug = albumentations.Compose(valid_test_predict_aug)

def set_mode(self, mode: str) -> None:
"""Set `__call__` mode.
@@ -111,4 +108,4 @@ def __call__(self, image: Any, **kwargs: Any) -> Any:
image = np.asarray(image)
if self.mode == "train":
return self.train_aug(image=image, **kwargs)
return self.valid_test_predict_aug(image=image, **kwargs)
return self.valid_test_predict_aug(image=image, **kwargs)
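
The `augmentations.py` change above splits the transform list (`DEFAULT_AUGMENTATION_LIST`) from the composed pipeline (`DEFAULT_AUGMENTATION`). One plausible benefit is that callers can extend the list before composing. The sketch below illustrates that pattern only: plain functions stand in for albumentations transforms and a small `compose` helper stands in for `A.Compose`, so every name in it is hypothetical and it runs without the library.

```python
from typing import Callable, List, Sequence

Transform = Callable[[list], list]


def reverse(row: list) -> list:
    """Stand-in for a flip transform: reverse the sequence."""
    return row[::-1]


def double(row: list) -> list:
    """Stand-in for an intensity transform: scale every value by two."""
    return [2 * x for x in row]


def compose(transforms: Sequence[Transform]) -> Transform:
    """Chain transforms left to right, in the spirit of a Compose pipeline."""
    def pipeline(row: list) -> list:
        for transform in transforms:
            row = transform(row)
        return row
    return pipeline


# The list is kept as plain data, separate from the composed pipeline.
DEFAULT_LIST: List[Transform] = [reverse]
DEFAULT = compose(DEFAULT_LIST)

# A caller builds a new list from the default, leaving the default untouched.
custom = compose([*DEFAULT_LIST, double])
```

With the real library the equivalent would presumably be composing `[*DEFAULT_AUGMENTATION_LIST, extra_transform]`, though the PR itself does not show such a call.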
12 changes: 12 additions & 0 deletions bioimage_embed/cli.py
@@ -0,0 +1,12 @@
from .hydra import train, infer
from typer import Typer

app = Typer()
app.command()(train)
app.command()(infer)

def main():
app()

if __name__ == "__main__":
main()
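
The new `cli.py` registers two existing functions as Typer subcommands by calling `app.command()` as a plain function rather than using it as a decorator. A minimal sketch of how that pattern behaves is below. The `train` and `infer` bodies and their parameters are hypothetical stand-ins, since the real signatures in `bioimage_embed.hydra` are not shown in this diff.

```python
import typer
from typer.testing import CliRunner


def train(epochs: int = 10) -> None:
    """Hypothetical stand-in for the real train function."""
    typer.echo(f"training for {epochs} epochs")


def infer(checkpoint: str = "last.ckpt") -> None:
    """Hypothetical stand-in for the real infer function."""
    typer.echo(f"inferring with {checkpoint}")


app = typer.Typer()
# Equivalent to decorating each function with @app.command();
# the subcommand name is taken from the function name.
app.command()(train)
app.command()(infer)

# Exercise the CLI in-process instead of spawning a shell.
runner = CliRunner()
train_result = runner.invoke(app, ["train", "--epochs", "3"])
infer_result = runner.invoke(app, ["infer"])
```

Registering functions this way keeps them importable and callable as ordinary Python, which may be why the PR avoids decorating them at their definition site.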