example improved

hadexversum · May 21, 2024 · 208ca05
1 parent 6fc18b6
commit 208ca05

Showing 2 changed files with 67 additions and 57 deletions.
diff --git a/R/calculate_hires.R b/R/calculate_hires.R
@@ -150,7 +150,7 @@ calculate_hires <- function(fit_values,
            class_name = "none",
            color = "#000000")
 
-  hires_params <- filter(hires_params, aa!="PP" | is.na(aa)) %>%
+  hires_params <- filter(hires_params, aa!="P" | is.na(aa)) %>%
     rbind(hires_params_p) %>%
     arrange(position)
 

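The change in `R/calculate_hires.R` swaps `"PP"` for `"P"` in the `filter()` condition. A minimal sketch of the difference, assuming `aa` holds single-letter residue codes; the `toy` data frame and its values are made up for illustration, and only the two filter conditions come from the diff:

```r
library(dplyr)

# Hypothetical toy data: one row per residue, `aa` may be missing
toy <- data.frame(position = 1:5,
                  aa = c("A", "P", NA, "G", "P"),
                  stringsAsFactors = FALSE)

# Old condition: no value equals "PP", so no rows are removed
old_rows <- filter(toy, aa != "PP" | is.na(aa))

# New condition: proline rows are removed, rows with missing `aa` are kept
new_rows <- filter(toy, aa != "P" | is.na(aa))

# Without the is.na() clause, filter() would also drop the NA row
no_na_clause <- filter(toy, aa != "P")
```

In this toy case the old condition keeps all five rows, the new one keeps positions 1, 3 and 4, and dropping the `is.na()` clause would also lose position 3, because `filter()` discards rows where the condition evaluates to `NA`.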
diff --git a/vignettes/example.Rmd b/vignettes/example.Rmd
@@ -25,102 +25,99 @@ library(dplyr)
 
 
 
-
-```{r include = FALSE}
-
-
-kin_dat <- HRaDeX::prepare_kin_dat(alpha_dat, 
-                                   state = "Alpha_KSCN",
-                                   time_0 = 0,
-                                   time_100 = 1440)
-
-fit_values <- create_fit_dataset(kin_dat, 
-                                 fit_k_params = get_example_fit_k_params(), 
-                                 trace = F, 
-                                 fractional = T, 
-                                 workflow = 321)
-
-hires_params <- calculate_hires(fit_values,
-                                fractional = T, 
-                                method = "weighted")
-
-
-```
+# Introduction
 
 This document is an adapted version of Supplement to the HRaDeX manuscript.
 
-Here we describe the detailed step-by-step analysis of experimental data using the hadexversum family tools: HaDeX, HRaDeX and compaHRaDeX. hadexversum availability is described in appropriate section below.
+Here we describe the detailed step-by-step analysis of experimental data using the hadexversum family tools: [HaDeX](https://hadex2.mslab-ibb.pl/), [HRaDeX](https://hradex.mslab-ibb.pl/) and [compaHRaDeX](https://compahradex.mslab-ibb.pl/). 
 
-The analysed protein is eEF1Bα subunit of the human guanine-nucleotide exchange factor (GEF) complex (eEF1B), measured in [Mass Spectrometry Lab](https://mslab-ibb.pl/) in [Institute of Biochemistry and Biophysics Polish Academy of Sciences](https://ibb.edu.pl/en/) and published by [Bondarchuk et al](https://doi.org/10.1093/nar/gkac685). In the one-state classification we will focus on pure gamma state, and in comparative analysis with regard to alpha state. 
+The analysed protein is eEF1Bα subunit of the human guanine-nucleotide exchange factor (GEF) complex (eEF1B), measured in [Mass Spectrometry Lab](https://mslab-ibb.pl/) in [Institute of Biochemistry and Biophysics Polish Academy of Sciences](https://ibb.edu.pl/en/) and published by [Bondarchuk et al](https://doi.org/10.1093/nar/gkac685). In the one-state classification we will focus on pure alpha state, and in comparative analysis with regard to gamma state. 
 
-We discuss only the visualization methods of hadexversum, without making strict interpretations. For that, we suggest contacting the research group that published original research on this topic.
+We present the visualization methods of hadexversum, without making strict interpretations. For that purpose, we suggest contacting the research group that published original research on this topic.
 
 
 # HaDeX 
 
-HaDeX is available as a [web-server](), [R package](), and [standalone software](). The first version is already [published]() and the second version is in advanced state - [HadeX2](). 
-
-HaDeX is a general-use tool for widely understood analysis on the peptide level. Moreover, it provides many features for investigating directly the mass measurements and checking the experiment quality. The summary of the results is wrapped in a short and comprehensive report. 
+## General information 
 
+HaDeX is a general-use tool for widely understood analysis on the peptide level. Moreover, it provides many features for investigating directly the mass measurements and checking the experiment quality. The summary of the results is wrapped in a short and comprehensive report. HaDeX provides many methods of quality control of the experiment with in-depth analysis of measurements, uncertainty and statistical significance. Not only commonly used forms of visualization are available, but also new methods are proposed. In this document, we focus on forms corresponding to high-resolution data.
 
-To control the identification process, and labeling of the spectra, we check the overall uncertainty. If the uncertainty is very high in comparison with others and has obvious outliers, there is a probability that the outlier is mislabeled. 
 
-To check if the protein is covered by peptides on a satisfactory level, we check the coverage plot. There is also the numerical information of the coverage followed by redundancy, which is also presented on the plot, below the coverage plot. 
 
+## Peptide-level uptake analysis
 
-
-To see both the uptake level (with uncertainty of measurement) and the position of each peptide on the protein sequence, we use the comparison plot. For readability purposes, on this type of plot, we can present the data only for a single time point, but multiple biological states. However, a quick glimpse of the plot enables a general view of the exchanged regions. Moreover, let’s suppose we aim for the comparative analysis of two biological states. In that case, we use the so-called Woods plot, with differences in uptake for each peptide and information on which differences are statistically significant for the desired level. As for the comparison plot, we only present the data for a single time point. 
+To see both the uptake level (with uncertainty of measurement) and the position of each peptide on the protein sequence, we use the comparison plot. For readability purposes, on this type of plot, we can present the data only for a single time point, but multiple biological states. However, a quick glimpse of the plot enables a general view of the exchanged regions. Let’s suppose we aim for the comparative analysis of two biological states. In that case, we use the so-called Woods plot, with differences in uptake for each peptide and information on which differences are statistically significant for the desired level. As for the comparison plot, we only present the data for a single time point. 
 
 ```{r include=F}
-
 uptake_dat <- create_uptake_dataset(alpha_dat,
+                                    states = c("Alpha_KSCN", "ALPHA_Gamma"),
                                     time_0 = 0, 
                                     time_100 = 1440)
 ```
 
-```{r}
+```{r fig.width=7}
 HaDeX::plot_state_comparison(uptake_dat,
                              fractional = T,
                              time_t = 150)
 
 ```
 
 
-```{r}
-
+```{r fig.width=7}
 diff_p_uptake_dat <- create_p_diff_uptake_dataset(alpha_dat,
                                                   state_1 = "Alpha_KSCN",
                                                   state_2 = "ALPHA_Gamma")
 
 HaDeX::plot_differential(diff_p_uptake_dat = diff_p_uptake_dat,
                          fractional = T, 
                          show_houde_interval = T, 
-                         time_t = 150)
+                         time_t = 150) +
+  labs(title = "Differential plot in 150 min between Alpha and Alpha+Gamma state")
 ```
 
 
-This plot presents the results for the measurement done after 150 min of exchange. It shows one significant exchange region - between positions 20 and 60 and two regions with values barely above the significance level.
+This plot presents the results for the measurement done after 150 min of exchange. It shows one significant exchange region - between positions 25 and 80 and two regions with values barely above the significance level.
 
 
+# HRaDeX
 
-Measurement control. 
+```{r include = FALSE}
 
 
+kin_dat <- HRaDeX::prepare_kin_dat(alpha_dat, 
+                                   state = "Alpha_KSCN",
+                                   time_0 = 0,
+                                   time_100 = 1440)
 
+fit_values <- create_fit_dataset(kin_dat, 
+                                 fit_k_params = get_example_fit_k_params(), 
+                                 trace = F, 
+                                 fractional = T, 
+                                 workflow = 321)
 
+hires_params <- calculate_hires(fit_values,
+                                fractional = T, 
+                                method = "weighted")
 
-# HRaDeX
 
+```
+
+## General information 
 
-HRaDeX is currently available as a [web-server](), and [R package](). 
+HRaDeX provides classification results for one biological state at a time. To get data for comparative purposes, the classification process should be conducted twice, on selected states, with the same classification parameters. Adjusting the parameters can be challenging, especially for longer proteins due to the calculation time. In this document we discuss the results; the detailed description of the workflow is available in a dedicated article.
 
-HRaDeX provides classification results for one biological state at a time. To get data for comparative purposes, the classification process should be conducted twice, on selected states, with the same classification parameters. Adjusting the parameters can be challenging, especially for longer proteins due to the calculation time. 
+## High-resolution dynamics analysis
 
 First, we upload the experimental data. The parameter options are adjusted to the content of the file.
 
 Then, we need to decide if the default parameters are sufficient. Of course, they can be adjusted in an interactive mode. Anyway, additional knowledge about the specificity of analyzed protein is helpful. Some of the peptides have a strong “medium” exchange phase shifted towards default “slow” exchange, with “slow” exchange being very slow, close to the bottom limit of class exchange. In such cases, the broadening of the medium class is desired. 
 
-In the case of our example, we use the default limits, as they are sufficient and the fit results are very good, with small rss.
+In the case of our example, we use the default limits, as they are sufficient and the fit results are very good, with small rss. Default parameters are as follows:
+
+```{r}
+get_example_fit_k_params()
+```
+
 
 All parameters must be confirmed by clicking the button, to avoid unnecessary calculations while selecting the parameters.
 
@@ -132,15 +129,12 @@ Below, there is a plot with two parts on the left, there is normalised uptake cu
 Left plot: Measurement points are marked by circles, with the uncertainty of the measurement shown by the error bars. Mass spectrometry is a very accurate method, and the error bars are hardly visible, although present. The black line indicates the final fitted curve, with color lines indicating the three components of the final model. As described before, the red line presents the fast component, the green line is the medium exchange component, and the blue line is the slow component. Although all populations sum up to one, each population has its intensity that impacts the final classification. 
 
 ```{r  include = FALSE}
-
-example_fit_dat <- filter(fit_values, id == 8)
-example_kin_dat <- filter(kin_dat, ID == 8)
-
+example_fit_dat <- filter(fit_values, id == 112)
+example_kin_dat <- filter(kin_dat, ID == 112)
 
 ```
 
 ```{r fig.width=7}
-
 plot_double_uc(example_kin_dat, example_fit_dat)
 
 ```
@@ -151,7 +145,7 @@ The model parameters are shown below, and the resulting classficiation color is
 ```{r}
 example_fit_dat
 ```
-As we can see, the population of the slow group is the biggest, thus the final color is close to blue. However, the other groups are present and interefere with the purity of the color. Below there is also a legend, to have an understaing where in the color scale is located this classification result.
+As we can see, the population of the fast exchanging group is the biggest, thus the final color is close to red. However, the other groups are present and interfere with the purity of the color. The noticeable slow exchanging group pushes the classification color towards blue, resulting in a violet-ish shade. The small addition of green leads to the subdued color. Below you can find a legend that shows where this classification result is located on the color scale.
 
 
 ```{r, fig.width=2, fig.height=2, echo=F}
@@ -171,7 +165,7 @@ Color legend:
 
 <!-- ![](figures/rgb_plaster.png) -->
 
-```{r out.width="30%"}
+```{r out.width="30%", include = FALSE}
 knitr::include_graphics("figures/rgb_plaster.png")
 ```
 
@@ -208,7 +202,7 @@ Here we discussed only one biological state, but for the second one, the reasoni
 
 # compaHRaDeX
 
-```{r}
+```{r include=F}
 
 kin_dat_2 <- HRaDeX::prepare_kin_dat(alpha_dat, 
                                    state = "ALPHA_Gamma",
@@ -226,8 +220,7 @@ hires_params_2 <- calculate_hires(fit_values_2,
                                 method = "weighted")
 ```
 
-compaHRaDeX is currently available as a [web-server]() using the code from HRaDeX [R package](). 
-
+## High-resolution comparative analysis
 
 The ultimate goal of the experiment is usually the comparative analysis between two biological states that provides information on how the exchange is changed by specific factors. In this case, we prepared a classification analysis for two biological states of alpha: the first state (discussed above is gamma without complex) and the second state (gamma in the presence of alpha).
 
@@ -245,16 +238,33 @@ two_states <- HRaDeX::create_two_state_dataset(hires_params, hires_params_2)
 HRaDeX::plot_color_distance(two_states)
 ```
 
+In this case we see a large difference in region 30-80 of the sequence, a second in region 175-180, and a third in region 225-235 (rough estimates). Choosing 0.2 as the threshold of the distance value, we can present the regions of difference on the 3D structure, as presented below.
 ```{r}
-HRaDeX::plot_k_distance(two_states)
+color_positions <- HRaDeX::prepare_diff_data(two_states,
+                                             "dist",
+                                             0.2)
+HRaDeX::plot_3d_structure_blank(pdb_file_path = "../data/Model_eEF1Balpha.pdb") %>%
+  r3dmol::m_set_style(sel = r3dmol::m_sel(resi = color_positions),
+            style = r3dmol::m_style_cartoon(color = "aquamarine"))
+
 ```
 
 
+The plot of the distance between populations shows the regions of interest, but it does not show the direction of change: whether the region is protected from exchange or the contrary. To account for that, we propose a rough estimate of the exchange rate based on the parameters of the model, as defined in the workflow description article.
+
+```{r}
+HRaDeX::plot_k_distance(two_states)
+```
+
 
-We can see the obvious difference in the first part of the protein, in the same region as shown in the Woods plot above. We also see the small difference from Woods plot in the second part of the protein. Using x as the threshold, we plot the differences on the 3D structure.
 
-Although we see the regions of difference, it is hard to say which way goes the change.
-Alternative way to see the difference is a
+We can see the obvious difference in the first part of the protein, in the same region as shown in the Woods plot above. We also see the small difference known from the Woods plot in the second part of the protein. Although the results are somewhat analogous, the high-resolution approach accounts for the whole time course. 
 
 # Availability
 
+HaDeX is available as a [web-server](https://hadex.mslab-ibb.pl/), [R package](https://cran.r-project.org/web/packages/HaDeX/index.html), and [standalone software](). The first version is already [published](https://academic.oup.com/bioinformatics/article/36/16/4516/5862011) and the second version is at an advanced stage - [HaDeX2](https://hadex2.mslab-ibb.pl/). 
+
+
+Both HRaDeX and compaHRaDeX use code from the HRaDeX [R package](https://github.com/hadexversum/HRaDeX). The HRaDeX application is available [here](https://hradex.mslab-ibb.pl/), with open [source code](https://github.com/hadexversum/HRaDeXGUI). compaHRaDeX is available [here](https://compahradex.mslab-ibb.pl/), together with its application [source code](https://github.com/hadexversum/compahradex).
+
+
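
The vignette selects the residues to highlight on the 3D structure by applying a threshold of 0.2 to the distance value. A rough sketch of that selection in base R, assuming a data frame with hypothetical columns `position` and `dist` and assuming that values at or above the threshold are selected; the actual behaviour of `HRaDeX::prepare_diff_data()` may differ:

```r
# Hypothetical per-residue distances between two states (made-up values)
toy_states <- data.frame(position = 1:6,
                         dist = c(0.05, 0.25, 0.40, 0.10, NA, 0.20))

threshold <- 0.2

# Keep positions whose distance reaches the threshold; which() skips NA
selected <- toy_states$position[which(toy_states$dist >= threshold)]
```

Wrapping the comparison in `which()` is a deliberate choice here: residues without a distance value are left out instead of producing `NA` entries in the selection.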