This project aims to predict the chemical structure of metabolites from LC-MS/MS spectra using Deep Canonical Correlation Analysis (DeepCCA). DeepCCA is a deep learning extension of canonical correlation analysis (CCA). The work is organized in three phases, outlined below.
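
As a brief reminder of how the two methods relate, the standard objectives can be written as follows. The symbols are assumptions introduced here for illustration and do not come from the notebooks: X_1 and X_2 stand for the two views (for example spectra and structures), w_1 and w_2 are linear projection vectors, and f_1 and f_2 are neural networks with parameters \theta_1 and \theta_2.

```latex
% Linear CCA: find projections of the two views that are maximally correlated
(w_1^*, w_2^*) = \arg\max_{w_1, w_2} \operatorname{corr}\left(w_1^\top X_1,\; w_2^\top X_2\right)

% DeepCCA: replace the linear projections with neural networks
(\theta_1^*, \theta_2^*) = \arg\max_{\theta_1, \theta_2} \operatorname{corr}\left(f_1(X_1; \theta_1),\; f_2(X_2; \theta_2)\right)
```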

1. Data processing and Embeddings

In this notebook, we clean and integrate the structure and spectra datasets and generate embeddings for both.

2. Model development

This notebook contains the DeepCCA optimization code.
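
For orientation, the quantity that DeepCCA-style training typically maximizes is the total correlation between the two projected views. The following is a minimal NumPy sketch of that quantity, not the code from the notebook; the function names, the `reg` regularization value, and the toy data are assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def total_correlation(view1, view2, reg=1e-4):
    """Sum of canonical correlations between two views (rows are samples).

    `reg` is a small ridge term (an assumed value) that keeps the
    covariance matrices invertible.
    """
    n = view1.shape[0]
    v1 = view1 - view1.mean(axis=0)
    v2 = view2 - view2.mean(axis=0)
    s11 = v1.T @ v1 / (n - 1) + reg * np.eye(v1.shape[1])
    s22 = v2.T @ v2 / (n - 1) + reg * np.eye(v2.shape[1])
    s12 = v1.T @ v2 / (n - 1)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    return np.linalg.svd(t, compute_uv=False).sum()


rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
related = 2.0 * x + 1.0               # perfectly linearly related view
unrelated = rng.normal(size=(500, 3))  # independent view
print(total_correlation(x, related))
print(total_correlation(x, unrelated))
```

In a DeepCCA setting, `view1` and `view2` would be the outputs of the two networks, and the negative of this value would serve as the training loss.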

3. Prediction and Evaluation for model development

Here we perform cross-modal retrieval: the model takes spectrum embeddings as input and outputs the most likely structure for each spectrum. We then use Tanimoto scores to evaluate whether the predicted structure is similar to the true structure.
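
The retrieval and evaluation steps described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the notebook's implementation: it assumes that spectra and structures have already been projected into a shared space, that candidates are ranked by cosine similarity, and that structures are represented by binary fingerprints. The names `retrieve` and `tanimoto` and the toy arrays are hypothetical.

```python
import numpy as np


def retrieve(query_emb, candidate_embs, top_k=1):
    """Return indices of the top_k candidates most similar to the query.

    Similarity is cosine similarity in the shared embedding space
    (an assumed choice of metric).
    """
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims)[:top_k]


def tanimoto(fp_a, fp_b):
    """Tanimoto score of two binary fingerprints: |A and B| / |A or B|."""
    a = np.asarray(fp_a, dtype=bool)
    b = np.asarray(fp_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


# Toy data: one projected spectrum and three projected candidate structures.
spectrum_emb = np.array([0.9, 0.1])
structure_embs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
best = retrieve(spectrum_emb, structure_embs, top_k=1)[0]

# Toy fingerprints for the predicted and the true structure.
predicted_fp = [1, 1, 0, 0]
true_fp = [1, 0, 1, 0]
score = tanimoto(predicted_fp, true_fp)
print(best, score)
```

A Tanimoto score of 1 means the two fingerprints are identical, and a score near 0 means they share almost no set bits.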

Training the final model

After selecting the best-performing hyperparameters, we train the final model in this notebook.

Final model predictions and Evaluation

In this notebook, the final model is used to predict the structures of query spectra.

Files (80 commits)

DeepCCA_models_training.ipynb
README.md
backup_for_deleted_codes_just_incase_i_was_out_of_my_mind.ipynb
cca_initial.ipynb
data_preprocessing_and_embeddings.ipynb
deepcca_optimized_model.ipynb
final_model_prediction_and_evaluation.ipynb
old_deepcca_script.ipynb
predictions_script.py
spectra_structure_prediction.ipynb
testing_ray_tune.ipynb

LukaLmelias/DeepCCA_thesis
