Merge pull request #3 from MilagrosMarin/main
Pipeline Architecture Refinement and Enhancement
ttngu207 authored Mar 26, 2024
2 parents 1dea4e4 + 83c68d5 commit 2e87c49
Showing 23 changed files with 3,073 additions and 2,967 deletions.
5 changes: 3 additions & 2 deletions .devcontainer/Dockerfile
@@ -44,8 +44,9 @@ ENV DJ_HOST fakeservices.datajoint.io
ENV DJ_USER root
ENV DJ_PASS simple

ENV KPMS_ROOT_DATA_DIR /workspaces/element-moseq/example_data/inbox
ENV KPMS_ROOT_OUTPUT_DIR /workspaces/element-moseq/example_data/outbox
ENV DATA_MOUNTPOINT /workspaces/element-moseq/example_data
ENV KPMS_ROOT_DATA_DIR $DATA_MOUNTPOINT/inbox
ENV KPMS_PROCESSED_DATA_DIR $DATA_MOUNTPOINT/outbox
ENV DATABASE_PREFIX neuro_

USER vscode
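
The variables above derive both data directories from a single mount point. A minimal Python sketch of how consuming code might resolve them is shown here; the function name `resolve_data_dirs` and the fallback default are assumptions for illustration and are not part of this commit.

```python
import os
from pathlib import Path


def resolve_data_dirs(env=None):
    """Return (inbox, outbox) paths derived from a single mount point.

    Hypothetical helper: it mirrors the Dockerfile layout, where
    KPMS_ROOT_DATA_DIR and KPMS_PROCESSED_DATA_DIR default to
    subdirectories of DATA_MOUNTPOINT.
    """
    env = os.environ if env is None else env
    # Assumed fallback: the value the Dockerfile assigns to DATA_MOUNTPOINT.
    mount = Path(env.get("DATA_MOUNTPOINT", "/workspaces/element-moseq/example_data"))
    # An explicitly set variable takes precedence over the derived default.
    inbox = Path(env.get("KPMS_ROOT_DATA_DIR", mount / "inbox"))
    outbox = Path(env.get("KPMS_PROCESSED_DATA_DIR", mount / "outbox"))
    return inbox, outbox
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise without touching the real environment.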
4 changes: 2 additions & 2 deletions .devcontainer/devcontainer.json
@@ -6,8 +6,8 @@
"remoteEnv": {
"LOCAL_WORKSPACE_FOLDER": "${localWorkspaceFolder}"
},
"onCreateCommand": "mkdir -p ${KPMS_ROOT_DATA_DIR} && pip install -e .",
"postStartCommand": "docker volume prune -f && s3fs ${DJ_PUBLIC_S3_LOCATION} ${KPMS_ROOT_DATA_DIR} -o nonempty,multipart_size=530,endpoint=us-east-1,url=http://s3.amazonaws.com,public_bucket=1",
"onCreateCommand": "mkdir -p ${DATA_MOUNTPOINT} && pip install -e .",
"postStartCommand": "docker volume prune -f && s3fs ${DJ_PUBLIC_S3_LOCATION} ${DATA_MOUNTPOINT} -o nonempty,multipart_size=530,endpoint=us-east-1,url=http://s3.amazonaws.com,public_bucket=1",
"hostRequirements": {
"cpus": 4,
"memory": "8gb",
8 changes: 0 additions & 8 deletions .github/workflows/release.yaml
@@ -4,14 +4,6 @@ on:
jobs:
make_github_release:
uses: datajoint/.github/.github/workflows/make_github_release.yaml@main
pypi_release:
needs: make_github_release
uses: datajoint/.github/.github/workflows/pypi_release.yaml@main
secrets:
TWINE_USERNAME: ${{secrets.TWINE_USERNAME}}
TWINE_PASSWORD: ${{secrets.TWINE_PASSWORD}}
with:
UPLOAD_URL: ${{needs.make_github_release.outputs.release_upload_url}}
mkdocs_release:
uses: datajoint/.github/.github/workflows/mkdocs_release.yaml@main
permissions:
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,20 @@
Observes [Semantic Versioning](https://semver.org/spec/v2.0.0.html) standard and
[Keep a Changelog](https://keepachangelog.com/en/1.0.0/) convention.

## [0.1.1] - 2024-03-21

+ Update - Schemas and tables renaming
+ Update - Move `PreFit` and `FullFit` to `moseq_train`
+ Update - Additional attributes and data type modification from `time` to `float` for `duration` to eliminate datetime formatting code
+ Update - Code refactoring in `make` functions and enhanced path handling
+ Update - `docs`, docstrings and table definitions
+ Update - `tutorial.ipynb` according to these changes and verify full functionality with Codespaces
+ Update - pipeline `images` according to these changes
+ Fix - `Dockerfile` environment variables
+ Update - Activation of one schema with two modules by updating `tutorial_pipeline.ipynb`
+ Update - remove PyPI release from `release.yml`
+ Update - README
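
The `duration` entry above replaces a `time` value with a `float`. As a rough illustration of why that removes formatting code, the sketch below computes a duration directly as a number; the choice of seconds as the unit and the helper name `duration_seconds` are assumptions, since the changelog does not state them.

```python
from datetime import datetime


def duration_seconds(start: datetime, end: datetime) -> float:
    """Elapsed time as a plain float (assumed unit: seconds).

    A float can be stored and compared directly, with no
    time-of-day parsing or string formatting step.
    """
    return (end - start).total_seconds()


start = datetime(2024, 3, 21, 10, 0, 0)
end = datetime(2024, 3, 21, 10, 1, 30)
print(duration_seconds(start, end))  # 90.0
```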

## [0.1.0] - 2024-03-20

+ Add - `CHANGELOG` and version for first release
4 changes: 2 additions & 2 deletions README.md
@@ -19,7 +19,7 @@ DataJoint Elements collectively standardize and automate data collection and ana
+ Clone the repository to your computer.

```bash
git clone https://github.com/<enter_github_username>/element-moseq
git clone https://github.com/<enter_github_username>/element-moseq.git
```

+ Install with `pip`:
@@ -72,4 +72,4 @@ MYSQL_VER=8.0 docker compose -f docker-compose-db.yaml up --build -d
1. We recommend you start by navigating to the `notebooks` directory on the left panel and go through the `tutorial.ipynb` Jupyter notebook. Execute the cells in the notebook to begin your walkthrough of the tutorial.
1. Once you are done, see the options available to you in the menu in the bottom-left corner. For example, in Codespace you will have an option to `Stop Current Codespace` but when running Dev Container on your own machine the equivalent option is `Reopen folder locally`. By default, GitHub will also automatically stop the Codespace after 30 minutes of inactivity. Once the Codespace is no longer being used, we recommend deleting the Codespace.
2. Once you are done, see the options available to you in the menu in the bottom-left corner. For example, in Codespace you will have an option to `Stop Current Codespace` but when running Dev Container on your own machine the equivalent option is `Reopen folder locally`. By default, GitHub will also automatically stop the Codespace after 30 minutes of inactivity. Once the Codespace is no longer being used, we recommend deleting the Codespace.
7 changes: 6 additions & 1 deletion docs/src/citation.md
@@ -10,4 +10,9 @@ If your work uses the following resources, please cite the respective manuscript
+ [RRID:SCR_021894](https://scicrunch.org/resolver/SCR_021894)

+ Keypoint-MoSeq
+ [Manuscripts](https://www.biorxiv.org/content/10.1101/2023.03.16.532307v2.full.pdf)
+ Weinreb C, Pearl J, Lin S, Osman MAM, Zhang L, Annapragada S, Conlin E, Hoffman R,
Makowska S, Gillis WF and Jay M. Keypoint-MoSeq: parsing behavior by linking point
tracking to pose dynamics. BioRxiv. 2023 Dec 23. doi: https://doi.org/10.1101/2023.03.16.532307
+ Wiltschko AB, Johnson MJ, Iurilli G, Peterson RE, Katon JM, Pashkovski SL, Abraira VE,
Adams RP, Datta SR. Mapping sub-second structure in mouse behavior. Neuron. 2015 Dec 16;
88(6):1121-35.
4 changes: 1 addition & 3 deletions docs/src/concepts.md
@@ -23,6 +23,4 @@ Key features include:
- Loading and formatting of 2D deeplabcut keypoint tracking data for model training
- Queue management and initiation of Keypoint-MoSeq analysis across multiple sessions
- Ingestion of analysis outcomes such as PCA, AR-HMM, and Keypoint-SLDS components
- Ingestion of analysis outcomes from motion sequencing inference


- Ingestion of analysis outcomes from motion sequencing inference
3 changes: 2 additions & 1 deletion docs/src/index.md
@@ -2,7 +2,8 @@

DataJoint Element for Motion Sequencing with
[Keypoint-MoSeq](https://github.com/dattalab/keypoint-moseq){:target="_blank"},
from keypoint data extracted with [DeepLabCut](x){:target="_blank"}. DataJoint Elements collectively standardize and automate
from keypoint data extracted with [DeepLabCut](http://www.mackenziemathislab.org/deeplabcut){:target="_blank"}.
DataJoint Elements collectively standardize and automate
data collection and analysis for neuroscience experiments. Each Element is a modular
pipeline for data storage and processing with corresponding database tables that can be
combined with other Elements to assemble a fully functional pipeline.
2 changes: 1 addition & 1 deletion docs/src/partnerships.md
@@ -1,3 +1,3 @@
# Key partnerships

Element MoSeq was developed in collaboration with the [Keypoint-MoSeq developers](https://github.com/dattalab/keypoint-moseq) in Datta's Lab at Harvard Medical School to promote integration and interoperability between Keypoint-MoSeq and the DataJoint Element MoSeq.
Element MoSeq was developed in collaboration with the [Keypoint-MoSeq developers](https://github.com/dattalab/keypoint-moseq), particularly with Kai Fox from Datta's Lab at Harvard Medical School, to foster integration and interoperability between Keypoint-MoSeq and the DataJoint Element MoSeq.
55 changes: 27 additions & 28 deletions docs/src/pipeline.md
@@ -5,21 +5,21 @@ corresponding table in the database. Within the pipeline, Element MoSeq
connects to upstream Elements including Lab, Animal, Session, and Event. For more
detailed documentation on each table, see the API docs for the respective schemas.

The Element is composed of two main schemas, `kpms_pca` and `kpms_model`. The `kpms_pca` schema is designed to handle the analysis and ingestion of PCA model for formatted keypoint tracking. The `kpms_model` schema is designed to handle the analysis and ingestion of Keypoint-MoSeq's motion sequencing on video recordings.
The Element is composed of two main schemas, `moseq_train` and `moseq_infer`. The `moseq_train` schema is designed to handle the analysis and ingestion of a PCA model for formatted keypoint tracking and to train the Keypoint-MoSeq model. The `moseq_infer` schema is designed to handle the analysis and ingestion of Keypoint-MoSeq's motion sequencing on video recordings by using one registered model.

## Diagrams

### `kpms_pca` module
### `moseq_train` module

- The `kpms_pca` schema is designed to handle the analysis and ingestion of a PCA model for formatted keypoint tracking.
- The `moseq_train` schema is designed to handle the analysis and ingestion of a PCA model for formatted keypoint tracking and to train the Keypoint-MoSeq model.

![pipeline](https://raw.githubusercontent.com/datajoint/element-moseq/main/images/pipeline_kpms_pca.svg)
![pipeline](https://raw.githubusercontent.com/datajoint/element-moseq/main/images/pipeline_moseq_train.svg)

### `kpms_model` module
### `moseq_infer` module

- The `kpms_model` schema is designed to handle the analysis and ingestion of Keypoint-MoSeq's motion sequencing on video recordings.
- The `moseq_infer` schema is designed to handle the analysis and ingestion of Keypoint-MoSeq's motion sequencing on video recordings by using one registered model.

![pipeline](https://raw.githubusercontent.com/datajoint/element-moseq/main/images/pipeline_kpms_model.svg)
![pipeline](https://raw.githubusercontent.com/datajoint/element-moseq/main/images/pipeline_moseq_infer.svg)

## Table Descriptions

@@ -49,36 +49,35 @@ The Element is composed of two main schemas, `kpms_pca` and `kpms_model`. The `k
| --- | --- |
| Session | Unique experimental session identifier |

### `kpms_pca` schema
### `model_train` schema

- For further details see the [kpms_pca schema API docs](https://datajoint.com/docs/elements/element-moseq/latest/api/element_moseq/kpms_pca/)
- For further details see the [`model_train` schema API docs](https://datajoint.com/docs/elements/element-moseq/latest/api/element_moseq/model_train/)

| Table | Description |
| --- | --- |
| PoseEstimationMethod | Table to store the pose estimation methods supported by the keypoint loader of `keypoint-moseq` package. |
| KeypointSet | Table to store the keypoint data and video set directory to train the model.|
| KeypointSet.VideoFile | IDs and file paths of each video file that will be used to train the model.|
| Bodyparts | Table to store the body parts to use in the analysis.|
| KeypointSet | Store keypoint data and video set directory for model training.|
| KeypointSet.VideoFile | IDs and file paths of each video file that will be used for model training. |
| Bodyparts | Store the body parts to use in the analysis. |
| PCATask | Staging table to define the PCA task and its output directory. |
| LoadKeypointSet | Table to create the `kpms_project_output_dir`, and create and update the `config.yml` by creating a new `dj_config.yml`. |
| PCAFitting | Automated fitting of the PCA model.|
| LatentDimension | Automated computation to calculate the latent dimension as one of the autoregressive hyperparameters (`ar_hypparams`) necessary for the model fitting. |
| PCAPrep | Set up the Keypoint-MoSeq project output directory (`kpms_project_output_dir`), creating the default `config.yml` and updating it in a new `dj_config.yml`. |
| PCAFit | Fit PCA model.|
| LatentDimension | Calculate the latent dimension as one of the autoregressive hyperparameters (`ar_hypparams`) necessary for the model fitting. |
| PreFitTask | Specify parameters for model (AR-HMM) pre-fitting. |
| PreFit | Fit AR-HMM model. |
| FullFitTask | Specify parameters for the model full-fitting. |
| FullFit | Fit the full (Keypoint-SLDS) model. |

### `moseq_infer` schema

### `kpms_model` schema

- For further details see the [kpms_model schema API docs](https://datajoint.com/docs/elements/element-moseq/latest/api/element_moseq/kpms_model/)
- For further details see the [`moseq_infer` schema API docs](https://datajoint.com/docs/elements/element-moseq/latest/api/element_moseq/moseq_infer/)

| Table | Description |
| --- | --- |
| PreFittingTask | Table to specify the parameters for the pre-fitting (AR-HMM) of the model. |
| PreFitting | Automated computation to fit a AR-HMM model. |
| FullFittingTask | Table to specify the parameters for the full fitting of the model. The full model will generally require a lower value of kappa to yield the same target syllable durations. |
| FullFitting | Automated computation to fit the full model. |
| Model | Table to register the models. |
| Model | Register a model. |
| VideoRecording | Set of video recordings for the Keypoint-MoSeq inference. |
| VideoRecording.File | File IDs and paths associated with a given `recording_id`. |
| InferenceTask | Table to specify the model, the video set, and the output directory for the inference task. |
| Inference | This table is used to infer the model results from the checkpoint file and save them to `{output_dir}/{model_name}/{inference_output_dir}/results.h5`. |
| Inference.MotionSequence | This table is used to store the results of the model inference.|
| Inference.GridMoviesSampledInstances | This table is used to store the grid movies sampled instances.|
| PoseEstimationMethod | Pose estimation methods supported by the keypoint loader of `keypoint-moseq` package. |
| InferenceTask | Staging table to define the Inference task and its output directory. |
| Inference | Infer the model from the checkpoint file and save the results as `results.h5` file. |
| Inference.MotionSequence | Results of the model inference. |
| Inference.GridMoviesSampledInstances | Store the sampled instances of the grid movies. |
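
One of the `Inference` descriptions in this hunk gives the results location as `{output_dir}/{model_name}/{inference_output_dir}/results.h5`. The sketch below composes that path with `pathlib`; the helper name `inference_results_path` and the example arguments are hypothetical and only illustrate the directory layout, not the Element's actual code.

```python
from pathlib import Path


def inference_results_path(output_dir, model_name, inference_output_dir):
    """Compose {output_dir}/{model_name}/{inference_output_dir}/results.h5.

    Hypothetical helper: the three arguments stand in for the
    placeholders used in the table description.
    """
    return Path(output_dir) / model_name / inference_output_dir / "results.h5"


# Example with made-up directory names.
print(inference_results_path("/outbox", "my_model", "inference_1").as_posix())
```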