MinCutTAD: Interpretable graph neural network - driven TAD prediction from Hi-C chromatin interactions and chromatin states

Abstract

MinCutTAD is a graph neural network (GNN) algorithm driven by spectral clustering that detects topologically associating domains (TADs). It is constructed with GraphConv, a message passing layer, and, in the unsupervised variant, a MinCut pooling layer.

Message passing refers to the smoothing of information across the features of directly neighboring nodes.
Pooling refers to the aggregation of strongly similar nodes, which reduces the size of the graph and forms subclusters.
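
The two operations above can be illustrated on a toy graph. The following is a minimal sketch in plain Python, not the GraphConv or MinCut pooling layers used by the model: it assumes unweighted mean aggregation instead of learned weights, and a fixed hard cluster assignment instead of the soft assignment that MinCut pooling learns. The function names `smooth_features` and `pool` are hypothetical.

```python
def smooth_features(adjacency, features):
    """One round of mean-aggregation message passing (no learned weights)."""
    smoothed = []
    for i, row in enumerate(adjacency):
        # A node's neighbourhood: itself plus every node it shares an edge with.
        neighbourhood = [i] + [j for j, w in enumerate(row) if w > 0 and j != i]
        smoothed.append(sum(features[j] for j in neighbourhood) / len(neighbourhood))
    return smoothed


def pool(adjacency, features, assignment, n_clusters):
    """Aggregate nodes into clusters given a hard node -> cluster assignment."""
    pooled_features = [0.0] * n_clusters
    pooled_adjacency = [[0.0] * n_clusters for _ in range(n_clusters)]
    for i, ci in enumerate(assignment):
        pooled_features[ci] += features[i]
        for j, cj in enumerate(assignment):
            # Edge weights between clusters are the summed edges of their members.
            pooled_adjacency[ci][cj] += adjacency[i][j]
    return pooled_adjacency, pooled_features


# Toy graph: bins 0-2 are densely connected, bins 3-4 are connected,
# and a single contact (2-3) links the two groups.
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
features = [1.0, 1.0, 1.0, 5.0, 5.0]  # toy values, one scalar feature per node

smoothed = smooth_features(adjacency, features)
pooled_adjacency, pooled_features = pool(adjacency, features, [0, 0, 0, 1, 1], 2)
print(smoothed)
print(pooled_adjacency, pooled_features)
```

After smoothing, the features of the boundary nodes 2 and 3 move towards each other, while the pooled graph has only two nodes, one per group.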

The model uses Hi-C matrix data and genomic annotations (CTCF, RAD21, SMC3, number of housekeeping genes) for the provided genomic loci of each chromosome.
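
As an illustration of how such inputs might be combined into a graph, the sketch below treats each genomic bin as a node, each Hi-C contact above a threshold as a weighted edge, and the annotations as node features. This is an assumption made for illustration, not the repository's preprocessing code: the function `build_graph`, the `min_contacts` threshold, the key `housekeeping_genes` and all numbers are hypothetical toy values.

```python
FEATURE_ORDER = ("CTCF", "RAD21", "SMC3", "housekeeping_genes")


def build_graph(contact_matrix, annotations, min_contacts=1):
    """Turn a symmetric contact matrix and per-bin annotations into a graph."""
    n_bins = len(contact_matrix)
    edges = []
    for i in range(n_bins):
        # Only the upper triangle is scanned because the matrix is symmetric.
        for j in range(i + 1, n_bins):
            if contact_matrix[i][j] >= min_contacts:
                edges.append((i, j, contact_matrix[i][j]))
    # One feature vector per bin, in a fixed annotation order.
    node_features = [[annotations[name][i] for name in FEATURE_ORDER] for i in range(n_bins)]
    return edges, node_features


# Toy 3-bin example (values are made up for illustration).
contact_matrix = [
    [10, 4, 0],
    [4, 12, 3],
    [0, 3, 9],
]
annotations = {
    "CTCF": [1, 0, 1],
    "RAD21": [1, 0, 0],
    "SMC3": [0, 0, 1],
    "housekeeping_genes": [2, 0, 1],
}

edges, node_features = build_graph(contact_matrix, annotations)
print(edges)
print(node_features)
```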

Two approaches:

Supervised: uses Arrowhead solutions as labels for the genomic bins and optimizes towards classifying the graph nodes according to those labels.
Unsupervised: no labels are provided to the model; it determines whether regions belong to a TAD and aggregates them. Its main goal is therefore to cluster the regions of a single TAD together.
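
For the supervised approach, TAD calls have to be translated into one label per genomic bin. The sketch below shows one way this could be done, assuming the calls are available as (start, end) pairs in base pairs with an exclusive end; the function `label_bins` and the example intervals are hypothetical and do not reproduce the repository's preprocessing.

```python
def label_bins(tad_intervals, n_bins, resolution):
    """Return one 0/1 label per genomic bin: 1 if the bin overlaps a TAD."""
    labels = [0] * n_bins
    for start_bp, end_bp in tad_intervals:
        first_bin = start_bp // resolution
        # end_bp is treated as exclusive, so a TAD ending exactly on a bin
        # boundary does not spill into the next bin.
        last_bin = min((end_bp - 1) // resolution, n_bins - 1)
        for b in range(first_bin, last_bin + 1):
            labels[b] = 1
    return labels


# Two made-up TAD calls on a 12-bin locus at 25 kb resolution.
tads = [(50_000, 150_000), (200_000, 250_000)]
labels = label_bins(tads, n_bins=12, resolution=25_000)
print(labels)
```

In the unsupervised approach no such label vector exists; the model instead has to produce the grouping of bins on its own.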

Further descriptions can be found in our 10 page report or our 2 page digest.

Repository Structure

The folder structure of the repository is shown below. The folders ./TopResults, ./cmap_files, ./node_annotations and ./ressources contain files necessary for running the scripts in the folder ./tad_detection.

├── cmap_files
│   ├── 25kb
│   │   ├── GM12878
│   │   │   └── intra
│   │   └── IMR-90
│   │       └── intra
│   └── 100kb
│       ├── GM12878
│       │   ├── inter
│       │   └── intra
│       └── IMR-90
│           └── intra
├── node_annotations
├── ressources
├── tad_detection
│   ├── evaluation
│   ├── model
│   ├── preprocessing
│   └── utils_general.py
├── Digest_TeamHA1.pdf
├── LICENSE
├── README.md
├── Report_TeamHA1.pdf
└── environment.yml

The scripts developed as part of this project can be found in the folder ./tad_detection and the corresponding subfolders.

A detailed description of the preprocessing scripts can be found in the folder ./tad_detection/preprocessing and the associated README.
A detailed description of the training scripts can be found in the folder ./tad_detection/model and the associated README.
A detailed description of the evaluation scripts can be found in the folder ./tad_detection/evaluation and the associated README.
A detailed description of the scripts for the benchmarking tools can be found in the folder ./tad_detection/evaluation/tools_benchmarking and the associated README.

Running the tools in this repository

The tools must be run with ./MeetEU as the working directory. An environment.yml file listing all packages required by our model and scripts is available in the repository. Please note that some of the packages may only be available for UNIX-based operating systems. Using an HPC system with access to a GPU is highly recommended for training the model.
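
Because the scripts depend on the working directory, a quick check before launching them can save a failed run. The following sketch is not part of the repository; it assumes the working directory should contain the top-level entries shown in the folder structure above, and the helper `missing_entries` is a hypothetical name.

```python
from pathlib import Path

# Top-level entries taken from the folder structure shown in this README.
REQUIRED_ENTRIES = [
    "cmap_files",
    "node_annotations",
    "ressources",
    "tad_detection",
    "environment.yml",
]


def missing_entries(root, required=REQUIRED_ENTRIES):
    """Return the required entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]


missing = missing_entries(Path.cwd())
if missing:
    print("Working directory is not the repository root; missing:", ", ".join(missing))
else:
    print("Working directory looks like the repository root.")
```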

Sample data

Sample data to run this algorithm can be found here.


meet-eu-21/Team-HA1
