Home
Welcome to the dmrff wiki!
-
How can sample size be calculated for meta-analysed DMRs? Here is a solution with code
-
Is it okay to exclude CpG sites for a given cohort from a meta-analysis? Yes, provided the CpG sites are excluded for reasons unrelated to their associations with the variable of interest. For example, it is perfectly valid to exclude CpG sites due to genetic or technical artefacts or low population variance.
-
How are p-values adjusted for multiple tests? `p.adjust` in the output is simply the `p.value` multiplied by the total number of tests performed. The number of tests is equal to the number of regions for which DMR statistics are calculated. `dmrff` starts by identifying all candidate regions, i.e. regions covered by CpG sites with EWAS p < 0.05 with consecutive CpG sites at most 500bp apart (these parameters can be modified). It then calculates DMR statistics for sub-regions in order to cover each candidate DMR with the sub-regions that have the strongest DMR statistics. `dmrff` returns just this resulting set of sub-regions in the output, not all sub-regions considered. Consequently, it is not possible to count the number of tests, or to adjust DMR p-values for multiple tests, from the rows of the output alone. The number of tests can instead be calculated by dividing the output `p.adjust` by `p.value`.
-
Is there a way to use a 'pre' object used in a DMR meta-analysis to identify DMRs for the corresponding dataset? Yes, the `dmrff.cohort()` function can be applied to the 'pre' object. The output should be identical to actually applying the `dmrff()` function to that dataset.
-
Can `dmrff()` be applied to RRBS DNA methylation profiles? Yes, it can. In fact, our implementation deliberately avoids being tied to the Illumina BeadChip format for this reason. That said, we haven't yet used RRBS data and would be very happy to hear how it works and whether you run into any problems.
-
How does changing the 'maxgap' parameter affect dmrff? Increasing 'maxgap' may increase the statistics of returned DMRs but at the expense of a larger number of tests and therefore more stringent adjustment for multiple tests.
This is because a larger 'maxgap' value allows candidate DMRs to have greater distances between consecutive sites with EWAS p < `p.cutoff` (default 0.05) and therefore to cover more of the genome. Larger candidate DMRs have a larger number of sub-regions and therefore require more tests to be performed to identify an optimal set of DMRs covering the candidate region.
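
The link between 'maxgap', the number of tests and the adjusted p-value can be sketched in a few lines of Python. This is a simplified illustration only, not dmrff's implementation (dmrff itself is an R package). The function names are hypothetical, the site positions and p-values are made up, and the sketch assumes that every contiguous run of sites within a candidate region counts as one tested sub-region, which may differ from what dmrff actually evaluates.

```python
def candidate_regions(sites, p_cutoff=0.05, maxgap=500):
    """Group sites with p < p_cutoff into candidate regions.

    `sites` is a list of (position, p_value) pairs on one chromosome.
    Consecutive selected sites at most `maxgap` bp apart share a region.
    Hypothetical helper, not part of dmrff.
    """
    selected = sorted(pos for pos, p in sites if p < p_cutoff)
    regions = []
    for pos in selected:
        if regions and pos - regions[-1][-1] <= maxgap:
            regions[-1].append(pos)   # close enough: extend current region
        else:
            regions.append([pos])     # too far away: start a new region
    return regions


def count_subregions(regions):
    """Count contiguous sub-regions over all candidate regions.

    ASSUMPTION: each contiguous run of sites is one test, so a region
    with k sites contributes k * (k + 1) / 2 tests.
    """
    return sum(len(r) * (len(r) + 1) // 2 for r in regions)


# Made-up EWAS results: (position in bp, p-value)
sites = [(100, 0.01), (300, 0.03), (700, 0.20),
         (1500, 0.04), (1900, 0.001), (2600, 0.02)]

p_value = 1e-4  # made-up DMR p-value for one returned region

for maxgap in (500, 1200):
    regions = candidate_regions(sites, maxgap=maxgap)
    n_tests = count_subregions(regions)
    # Adjustment as described above: p.value times the number of tests
    p_adjust = p_value * n_tests
    # The number of tests can be recovered from the two output columns
    recovered = round(p_adjust / p_value)
    print(f"maxgap={maxgap}: {len(regions)} candidate region(s), "
          f"{n_tests} tests, p.adjust={p_adjust:.2e}, recovered={recovered}")
```

With these made-up sites, a 'maxgap' of 500 gives three candidate regions and 7 tests, whereas a 'maxgap' of 1200 merges them into one region and 15 tests, so the same DMR p-value is adjusted more stringently.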