diff --git a/inst/vignettes/Landmark_partition_test.Rmd b/inst/vignettes/Landmark_partition_test.Rmd index 8fa00af..b2958b7 100644 --- a/inst/vignettes/Landmark_partition_test.Rmd +++ b/inst/vignettes/Landmark_partition_test.Rmd @@ -181,17 +181,17 @@ land_front <- c(2, 11, 12) The idea is to see whether the landmark variation is greater in the landmarks in the front of the skull and relatively smaller in the back of the skull. We can measure that using two different statistics: - * The **area difference** that is the proxy of the cumulative landmark variation + * The **displacement difference** that is the proxy of the cumulative landmark variation * The **Bhattacharyya Coefficient** which is the probability of overlap between two distributions Of course, any other type of statistics can be used in the test that follows. These two however are a pretty intuitive proxy for testing our hypothesis above. -### The Area difference +### The Displacement difference This statistic can be measured as follows: -$\Delta_{area} = \int_{0}^{n-1} (f_{x} - f_{y})d(x,y)$ +$\Delta_{displacement} = \int_{0}^{n-1} (f_{x} - f_{y})d(x,y)$ Where _n_ is minimum number of comparable landmarks and $f_{x}$ and $f_{y}$ are ranked functions (i.e. $f_{0} \geq f_{1} \geq f_{2} ...$) for the landmarks in the partition and all the landmarks respectively. If one of the functions $f_{x}$ or $f_{y}$ have _m_ elements (with $m > n$) $f^{*}_{z}$, a rarefied estimated of the function with _m_ elements is used instead. @@ -209,7 +209,7 @@ This statistic is implemented in `landvR` in the `area.diff` function that simpl distribution_1 <- rnorm(10) distribution_2 <- rnorm(10) -## Their area difference +## Their displacement difference area.diff(distribution_1, distribution_2) ``` @@ -235,7 +235,7 @@ bhatt.coeff(distribution_1, distribution_2) ### Random permutation test Now that we have two statistics (note that one could be sufficient), we can finally test our hypothesis using a random permutation test. -This test measures for on statistic between two populations (e.g. the area difference between the landmarks variation in the front or the back of the skull) come from the same global statistical population (H0) or different ones (H1). +This test measures for on statistic between two populations (e.g. the displacement difference between the landmarks variation in the front or the back of the skull) come from the same global statistical population (H0) or different ones (H1). First we measured the statistic between the landmark partition of interest and all the other landmarks (including the ones from the partition). Second, we generated 1000 statistics by randomly sampling the same number of landmarks as in the partition in the whole distributions and compared them again to the full distribution. @@ -340,21 +340,21 @@ This function allows to pass the statistic as a function to the `test` argument We can use the argument `test.parameter` to calculate the _p_ value as outlined above (note that by default, these tests are two sided). ```{r} -## The random permutation test for the area difference between species +## The random permutation test for the displacement difference between species sp_mean_test_area <- rand.test(species_means_difference[[1]][, "radius"], subset = land_front, test = area.diff, test.parameter = TRUE) ## The random permutation test for the probability of overlap between species sp_mean_test_bc <- rand.test(species_means_difference[[1]][, "radius"], subset = land_front, test = bhatt.coeff, test.parameter = TRUE) -## The random permutation test for the area difference within both species +## The random permutation test for the displacement difference within both species all_sp_test_area <- rand.test(all_spec_difference$range[, "radius"], subset = land_front, test = area.diff, test.parameter = TRUE) ## The random permutation test for the probability of overlap within both species all_sp_test_bc <- rand.test(all_spec_difference$range[, "radius"], subset = land_front, test = bhatt.coeff, test.parameter = TRUE) -## The random permutation test for the area difference within both species +## The random permutation test for the displacement difference within both species ci95_sp_test_area <- rand.test(ci95_spec_difference$range[, "radius"], subset = land_front, test = area.diff, test.parameter = TRUE) ## The random permutation test for the probability of overlap within both species @@ -370,7 +370,7 @@ sp_mean_test_area ``` The `Observation` value is the observed statistic in the subset. -For example, this can be the area difference or the probability of overlap in landmark displacement between the two tested specimens in the tested subset (here the front of the skull). +For example, this can be the displacement difference or the probability of overlap in landmark displacement between the two tested specimens in the tested subset (here the front of the skull). This value is then compared to 100 replicated random values that are equivalent of measuring this statistic over any randomly picked landmark displacements. The _p_ value will indicate the type I error when rejecting the null hypothesis (that is: the landmarks in the subset are not different than the other). @@ -381,7 +381,7 @@ To facilitate interpreting these results, it is possible to plot them: par(mfrow = c(3,2)) ## Plotting one result -plot(sp_mean_test_area, main = "Species mean area difference") +plot(sp_mean_test_area, main = "Species mean displacement difference") ## Adding the p value legend("topright", legend = paste("p =", round(sp_mean_test_area$pvalue, 3)), bty = "n") @@ -389,12 +389,12 @@ legend("topright", legend = paste("p =", round(sp_mean_test_area$pvalue, 3)), bt plot(sp_mean_test_bc, main = "Species mean Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(sp_mean_test_bc$pvalue, 3)), bty = "n") -plot(all_sp_test_area, main = "Overall range area difference") +plot(all_sp_test_area, main = "Overall range displacement difference") legend("topright", legend = paste("p =", round(all_sp_test_area$pvalue, 3)), bty = "n") plot(all_sp_test_bc, main = "Overall range Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(all_sp_test_bc$pvalue, 3)), bty = "n") -plot(ci95_sp_test_area, main = "Overall range area difference") +plot(ci95_sp_test_area, main = "Overall range displacement difference") legend("topright", legend = paste("p =", round(ci95_sp_test_area$pvalue, 3)), bty = "n") plot(ci95_sp_test_bc, main = "Overall range Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(ci95_sp_test_bc$pvalue, 3)), bty = "n") diff --git a/inst/vignettes/Landmark_partition_test.html b/inst/vignettes/Landmark_partition_test.html index 6ae3714..f3c9ffc 100644 --- a/inst/vignettes/Landmark_partition_test.html +++ b/inst/vignettes/Landmark_partition_test.html @@ -11,7 +11,7 @@ - + Testing the variation in landmark position @@ -121,7 +121,7 @@

Testing the variation in landmark position

Thomas Guillerme

-

2018-10-31

+

2019-03-12

@@ -263,14 +263,14 @@

Testing the hypothesis

land_front <- c(2, 11, 12)

The idea is to see whether the landmark variation is greater in the landmarks in the front of the skull and relatively smaller in the back of the skull. We can measure that using two different statistics:

Of course, any other type of statistics can be used in the test that follows. These two however are a pretty intuitive proxy for testing our hypothesis above.

-
-

The Area difference

+
+

The Displacement difference

This statistic can be measured as follows:

-

\(\Delta_{area} = \int_{0}^{n-1} (f_{x} - f_{y})d(x,y)\)

+

\(\Delta_{displacement} = \int_{0}^{n-1} (f_{x} - f_{y})d(x,y)\)

Where n is minimum number of comparable landmarks and \(f_{x}\) and \(f_{y}\) are ranked functions (i.e. \(f_{0} \geq f_{1} \geq f_{2} ...\)) for the landmarks in the partition and all the landmarks respectively. If one of the functions \(f_{x}\) or \(f_{y}\) have m elements (with \(m > n\)) \(f^{*}_{z}\), a rarefied estimated of the function with m elements is used instead.

\(\int_{0}^{n-1}f^*_{z}d(z) = \frac{\sum_1^p\int f^*_{zi}}{s}\)

Where s is the number of rarefaction replicates. s is chosen based on the Silverman’s rule of thumb for choosing the bandwidth of a Gaussian kernel density estimator multiplied by 1000 with a result forced to be 100 \(\leq p \leq\) 1000 (Silverman 1986). This rule of thumb is implemented in R in the bw.nrd0 (a bandwith selector for Kernel Density estimations)

@@ -279,7 +279,7 @@

The Area difference

distribution_1 <- rnorm(10) distribution_2 <- rnorm(10) -## Their area difference +## Their displacement difference area.diff(distribution_1, distribution_2)
## [1] 6.195767
@@ -295,7 +295,7 @@

Probability of overlap between size differences (Bhattacharyya Coefficient)<

Random permutation test

-

Now that we have two statistics (note that one could be sufficient), we can finally test our hypothesis using a random permutation test. This test measures for on statistic between two populations (e.g. the area difference between the landmarks variation in the front or the back of the skull) come from the same global statistical population (H0) or different ones (H1).

+

Now that we have two statistics (note that one could be sufficient), we can finally test our hypothesis using a random permutation test. This test measures for on statistic between two populations (e.g. the displacement difference between the landmarks variation in the front or the back of the skull) come from the same global statistical population (H0) or different ones (H1).

First we measured the statistic between the landmark partition of interest and all the other landmarks (including the ones from the partition). Second, we generated 1000 statistics by randomly sampling the same number of landmarks as in the partition in the whole distributions and compared them again to the full distribution. This resulted in 1000 null comparisons (i.e. assuming the null hypothesis that the statistic in the partition is the same as the statistic in the total distribution). We then calculated the p value based on:

\(p=\frac{\sum_1^B\left(random_{i} >= observed\right)+1}{B + 1}\)

Where B is the number of random draws (1000 bootstrap pseudo-replicates), \(random_{i}\) is the \(i^{th}\) statistic from the comparison of the \(i^{th}\) randomly drawn partition and the distribution and \(observed\) is the observed statistic between the partition and the distribution.

@@ -390,21 +390,21 @@

Range within one species using the 0.95 CI

Running the permutation test

We can use the rand.test function from landvR that’s based on a modification from the as.randtest function from the ade4 package (Thioulouse et al. 1997)). This function allows to pass the statistic as a function to the test argument as follows. We can use the argument test.parameter to calculate the p value as outlined above (note that by default, these tests are two sided).

-
## The random permutation test for the area difference between species
+
## The random permutation test for the displacement difference between species
 sp_mean_test_area <- rand.test(species_means_difference[[1]][, "radius"], subset = land_front,
                                test = area.diff, test.parameter = TRUE)
 ## The random permutation test for the probability of overlap between species
 sp_mean_test_bc <- rand.test(species_means_difference[[1]][, "radius"], subset = land_front,
                             test = bhatt.coeff, test.parameter = TRUE)
 
-## The random permutation test for the area difference within both species
+## The random permutation test for the displacement difference within both species
 all_sp_test_area <- rand.test(all_spec_difference$range[, "radius"], subset = land_front,
                               test = area.diff, test.parameter = TRUE)
 ## The random permutation test for the probability of overlap within both species
 all_sp_test_bc <- rand.test(all_spec_difference$range[, "radius"], subset = land_front,
                             test = bhatt.coeff, test.parameter = TRUE)
 
-## The random permutation test for the area difference within both species
+## The random permutation test for the displacement difference within both species
 ci95_sp_test_area <- rand.test(ci95_spec_difference$range[, "radius"], subset = land_front,
                                test = area.diff, test.parameter = TRUE)
 ## The random permutation test for the probability of overlap within both species
@@ -424,13 +424,13 @@ 

Running the permutation test

## ## Normal residuals Random mean Random variance ## 1.484526e+00 9.682568e-04 5.835753e-05
-

The Observation value is the observed statistic in the subset. For example, this can be the area difference or the probability of overlap in landmark displacement between the two tested specimens in the tested subset (here the front of the skull). This value is then compared to 100 replicated random values that are equivalent of measuring this statistic over any randomly picked landmark displacements. The p value will indicate the type I error when rejecting the null hypothesis (that is: the landmarks in the subset are not different than the other).

+

The Observation value is the observed statistic in the subset. For example, this can be the displacement difference or the probability of overlap in landmark displacement between the two tested specimens in the tested subset (here the front of the skull). This value is then compared to 100 replicated random values that are equivalent of measuring this statistic over any randomly picked landmark displacements. The p value will indicate the type I error when rejecting the null hypothesis (that is: the landmarks in the subset are not different than the other).

To facilitate interpreting these results, it is possible to plot them:

## Graphical parameters
 par(mfrow = c(3,2))
 
 ## Plotting one result
-plot(sp_mean_test_area, main = "Species mean area difference")
+plot(sp_mean_test_area, main = "Species mean displacement difference")
 ## Adding the p value
 legend("topright", legend = paste("p =", round(sp_mean_test_area$pvalue, 3)), bty = "n")
 
@@ -438,16 +438,16 @@ 

Running the permutation test

plot(sp_mean_test_bc, main = "Species mean Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(sp_mean_test_bc$pvalue, 3)), bty = "n") -plot(all_sp_test_area, main = "Overall range area difference") +plot(all_sp_test_area, main = "Overall range displacement difference") legend("topright", legend = paste("p =", round(all_sp_test_area$pvalue, 3)), bty = "n") plot(all_sp_test_bc, main = "Overall range Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(all_sp_test_bc$pvalue, 3)), bty = "n") -plot(ci95_sp_test_area, main = "Overall range area difference") +plot(ci95_sp_test_area, main = "Overall range displacement difference") legend("topright", legend = paste("p =", round(ci95_sp_test_area$pvalue, 3)), bty = "n") plot(ci95_sp_test_bc, main = "Overall range Bhattacharyya Coefficient") legend("topright", legend = paste("p =", round(ci95_sp_test_bc$pvalue, 3)), bty = "n")
-

+