Data Cleaning and Preprocessing:
-Loaded the dataset using pandas.
-Handled missing values by filling them with mean values.
-Extracted subgroups "t-CS-s" and "c-CS-s" from the dataset.
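The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `clean_and_filter`, the label column name `class`, and the tiny in-memory frame standing in for the real CSV are all assumptions, as is the choice of per-column means for the fill.

```python
import numpy as np
import pandas as pd


def clean_and_filter(df, classes=("t-CS-s", "c-CS-s"), class_col="class"):
    """Fill numeric NaNs with column means, then keep only the given classes.

    Hypothetical helper; `class_col` is an assumed column name.
    """
    cleaned = df.copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns
    # Replace each missing value with the mean of its own column.
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].mean())
    # Keep only the two subgroups of interest.
    return cleaned[cleaned[class_col].isin(classes)].reset_index(drop=True)


# Toy stand-in for the real dataset (assumption: the real data is read with pd.read_csv).
toy = pd.DataFrame(
    {
        "pPKCG_N": [1.0, np.nan, 3.0, 5.0],
        "ARC_N": [0.2, 0.4, np.nan, 0.6],
        "class": ["t-CS-s", "c-CS-s", "c-SC-m", "t-CS-s"],
    }
)
subset = clean_and_filter(toy)
```

Here the means are computed over the full dataset before filtering, matching the order of the steps listed above; computing them per subgroup would be an equally plausible alternative.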
Parallel Coordinates Plot:
-Utilized the Plotly library to create a parallel coordinates plot.
-Plotted the protein expressions (pPKCG_N, pP70S6_N, pS6_N, pGSK3B_N, ARC_N) for the two classes with distinct colors.
-Annotated each axis with the corresponding protein name.
Dimensionality Reduction Techniques:
-Applied three dimensionality reduction techniques to the dataset: PCA, ISOMAP, and t-SNE.
-Visualized the reduced data using scatter plots for each technique.
-Created an interactive interface using the Dash framework.
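As a rough sketch of the reduction step, the three techniques can be run through scikit-learn's common `fit_transform` interface. The random matrix below stands in for the protein expression columns, and the hyperparameters (`n_neighbors=10`, `perplexity=15`, `random_state=0`) are illustrative assumptions rather than the values used in the project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, Isomap

# Synthetic stand-in: 60 samples x 5 protein expression features.
rng = np.random.default_rng(0)
X = rng.random((60, 5))

# Each reducer maps the feature matrix to two components.
reducers = {
    "PCA": PCA(n_components=2),
    "ISOMAP": Isomap(n_components=2, n_neighbors=10),
    # t-SNE's perplexity must be smaller than the number of samples.
    "t-SNE": TSNE(n_components=2, perplexity=15, random_state=0),
}

embeddings = {name: reducer.fit_transform(X) for name, reducer in reducers.items()}
```

Each entry of `embeddings` is an array with one row per sample and two columns, ready to be passed to a scatter plot.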
Interactive Dashboard:
-Built a Dash application with a user-friendly interface.
-Included a radio component for users to select between PCA, ISOMAP, and t-SNE.
-Displayed a scatter plot based on the selected dimensionality reduction technique.
-Added dropdown menus to select the protein expressions for x and y axes.
Dynamic Scatter Plots:
-Implemented the ability to add multiple scatter plots.
-Included an "Add" button that replicates the scatter plot with the current axis settings.
-Accommodated the addition of multiple scatter plots below the existing ones.
Here is how it looks after adding three scatter plots with different proteins:
The three scatter plots in the second row of the image were added by the user by clicking the "Add" button. Users can add as many scatter plots as they need.
I used the following languages and libraries during the development of this project:
-Python: The primary programming language for data analysis and visualization.
-Pandas: Data manipulation and analysis library.
-Plotly: Interactive visualization library.
-Dash: Web framework for building interactive web applications.
-Scikit-learn: Machine learning library for dimensionality reduction techniques.
The provided code showcases the implementation of the above features. It initializes a Dash application, sets up the layout with interactive components, and defines callback functions to update the visualizations based on user inputs. The code also handles the addition of scatter plots and dynamic updates to the figure container.
Feel free to explore the dataset, analyze protein expressions, and interact with the dynamic visualizations to gain insights into the relationships between different classes and proteins in mice.
Note:
Please ensure that the dependencies listed above (pandas, Plotly, Dash, and scikit-learn) are installed before running the application.
Thank you for visiting my project! If you have any inquiries or feedback, please don't hesitate to reach out.