
An in-depth exploratory data analysis and an interactive visualization dashboard built with Python, the Dash framework, and various data analysis libraries. The dataset used for this project is the Mice Protein Expression Dataset, which contains expression levels of 77 proteins measured in the cerebral cortex of different classes of mice.


MadniAbdulWahab/InteractiveAnalysisofMiceProteinExpressions


Interactive Analysis of Mice Protein Expressions

Description

In this project, I conducted an in-depth exploratory data analysis and built an interactive visualization dashboard using Python, the Dash framework, and several data analysis libraries. The dataset is the Mice Protein Expression Dataset, which contains expression levels of 77 proteins measured in the cerebral cortex of different classes of mice. The goal was to give insight into the relationships between protein expressions and classes, and into how several dimensionality reduction techniques represent the data. To get there, I cleaned and preprocessed the dataset, created the visualizations, and combined them in the dashboard.

Project Features and Steps

Data Cleaning and Preprocessing:

-Loaded the dataset using pandas.

-Handled missing values by filling them with mean values.

-Extracted subgroups "t-CS-s" and "c-CS-s" from the dataset.
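The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `clean_and_subset`, the `class` column name, and the tiny synthetic frame are assumptions made so the example runs without the real data file.

```python
import numpy as np
import pandas as pd


def clean_and_subset(df, class_col="class", keep=("t-CS-s", "c-CS-s")):
    """Fill missing numeric values with column means, then keep the chosen classes.

    `class_col` is an assumed column name; adjust it to match the real file.
    """
    cleaned = df.copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns
    # Means are computed over the whole dataset, before any rows are dropped.
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].mean())
    return cleaned[cleaned[class_col].isin(keep)].reset_index(drop=True)


# Tiny synthetic stand-in for the real dataset (values are made up).
# With the real file one would load it first, e.g. via pd.read_excel(<path to file>).
demo = pd.DataFrame({
    "pPKCG_N": [1.0, np.nan, 3.0, 5.0],
    "ARC_N": [0.2, 0.4, np.nan, 0.6],
    "class": ["t-CS-s", "c-CS-s", "c-SC-m", "t-CS-s"],
})
subset = clean_and_subset(demo)
print(subset)
```

Filling before filtering means the imputed values reflect all classes; filtering first would be an equally defensible choice if class-specific means are preferred.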

Parallel Coordinates Plot:

-Utilized the Plotly library to create a parallel coordinates plot.

-Plotted the protein expressions (pPKCG_N, pP70S6_N, pS6_N, pGSK3B_N, ARC_N) for the two classes with distinct colors.

-Annotated each axis with the corresponding protein name.

Dimensionality Reduction Techniques:

-Applied PCA, ISOMAP, and t-SNE as dimensionality reduction techniques to the dataset.

-Visualized the reduced data using scatter plots for each technique.

-Created an interactive interface using the Dash framework.
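The three techniques are all available in scikit-learn, so a single helper can dispatch on the technique name. The following is a sketch under assumptions: the function name `reduce_to_2d`, the neighbor count, the perplexity rule, and the random input matrix are illustrative choices, and any feature scaling the project may apply is not shown.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, Isomap


def reduce_to_2d(X, method="PCA", random_state=0):
    """Project the rows of X to two dimensions with the named technique."""
    if method == "PCA":
        model = PCA(n_components=2, random_state=random_state)
    elif method == "ISOMAP":
        # n_neighbors=10 is an arbitrary illustrative value.
        model = Isomap(n_components=2, n_neighbors=10)
    elif method == "t-SNE":
        # Perplexity must stay below the number of samples; cap it for small inputs.
        perplexity = min(30.0, (len(X) - 1) / 3)
        model = TSNE(n_components=2, perplexity=perplexity, random_state=random_state)
    else:
        raise ValueError(f"Unknown method: {method!r}")
    return model.fit_transform(X)


# Random matrix standing in for the protein expression columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 10))
embeddings = {name: reduce_to_2d(X_demo, name) for name in ("PCA", "ISOMAP", "t-SNE")}
for name, coords in embeddings.items():
    print(name, coords.shape)
```

Each resulting array has one row per sample and two columns, which is the shape a scatter plot needs for its x and y coordinates.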

Interactive Dashboard:

-Built a Dash application with a user-friendly interface.

-Included a radio component for users to select between PCA, ISOMAP, and t-SNE.

[Image: radio button]

-Displayed a scatter plot based on the selected dimensionality reduction technique.

-Added dropdown menus to select the protein expressions for x and y axes.

[Image: dropdown menu]

Dynamic Scatter Plots:

-Implemented the ability to add multiple scatter plots.

-Included an "Add" button that replicates the scatter plot with the current axis settings.

[Image: add button]

-Accommodated the addition of multiple scatter plots below the existing ones.

Here is how it looks after adding three scatter plots with different proteins:

[Image: mice protein project]

The three scatter plots in the second row of the image were added by the user by clicking the "Add" button. Users can add as many scatter plots as they need.

Technologies and Libraries Used:

I used the following languages and libraries during the development of this project:

  • Python: The primary programming language for data analysis and visualization.

  • Pandas: Data manipulation and analysis library.

  • Plotly: Interactive visualization library.

  • Dash: Web framework for building interactive web applications.

  • Scikit-learn: Machine learning library for dimensionality reduction techniques.

Code Summary:

The provided code showcases the implementation of the above features. It initializes a Dash application, sets up the layout with interactive components, and defines callback functions to update the visualizations based on user inputs. The code also handles the addition of scatter plots and dynamic updates to the figure container.

Feel free to explore the dataset, analyze protein expressions, and interact with the dynamic visualizations to gain insights into the relationships between different classes and proteins in mice.

Note:

Please ensure that the necessary dependencies (pandas, Plotly, Dash, and scikit-learn) are installed before running the application.

Thank you for visiting my project! If you have any inquiries or feedback, please don't hesitate to reach out.
