Bulk RNAseq analysis

Class 3: Hypothesis and visualization

Objectives

By the end of this class, you should be able to:

  • compare the analysis tools edgeR, limma voom, and DESeq2
  • visualize and interpret results (MA-plot, volcano plot, heat map)

Assessing differential expression

FIXME: links to papers describing algorithms, short summary of differences among approaches, why prefer one over another?

edgeR

limma (voom)

DESeq2

What is the difference between the three algorithms?

  • Biostars Link
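    • Both edgeR and DESeq2 work on the assumption that most genes are not differentially expressed
    • DESeq2 uses median-of-ratios normalization, which scales each sample by the median ratio of its counts to the per-gene geometric mean
    • edgeR uses TMM (trimmed mean of M-values), a weighted trimmed mean of log expression ratios
    • limma can normalize using quantile normalization; for RNA-Seq it is usually paired with TMM normalization and voom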
  • RNA-Seq differential expression analysis: An extended review and a software tool
    • edgeR: An overdispersed Poisson (negative binomial) model is used to account for technical and biological variation. An empirical Bayes method moderates the degree of overdispersion across transcripts.
    • limma: Based on linear models, originally developed to analyze microarray data, and since extended to RNA-Seq analysis. The limma user guide recommends TMM normalization from the edgeR package followed by the voom transformation. voom converts the normalized counts to log2 counts per million and estimates the mean-variance relationship to assign a weight to each observation, which is then used when fitting the linear model.
    • DESeq2: DESeq2 first builds a model of the observed counts. It then fits the model, either with the same method as the original DESeq or in two steps. The first step finds the parameter value that maximizes the likelihood, which is called maximum likelihood estimation. The second step takes all the gene values and moves them toward an average value. DESeq2 uses Bayes' theorem to guide the amount of movement for each gene: if the information for the gene is low, its value is moved close to the average; if the information for the gene is high, its value is moved very little. The moved values are therefore useful for comparing different sets of genes as well as for applying a threshold.
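
The normalization step can be made concrete with a small example. The following is a minimal sketch of median-of-ratios size factors, the approach DESeq2 is based on; it is not the DESeq2 implementation. The function name `size_factors` and the toy counts are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by the median-of-ratios idea.

    counts: list of per-gene rows, each a list of raw counts per sample.
    This is an illustrative sketch, not the DESeq2 code.
    """
    n_samples = len(counts[0])
    # Use only genes with nonzero counts in every sample, so the
    # geometric mean is positive and the log ratios are defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # skipped when estimating factors: zero in one sample
]
factors = size_factors(counts)
# Dividing by the size factor puts both samples on a common scale.
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
```

Because the median is taken over genes, a small number of strongly changed genes has little effect on the size factors, which is why the method relies on most genes not being differentially expressed.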

Visualizing differential expression

FIXME/IMAGES for each:

  • what does plot show?
  • how is the plot interpreted?
  • primary manuscripts and example images

Most info for MA and volcano plots sourced from this paper. Also contains image examples.

MA-plot

  • MA plots are commonly used to represent log fold-change versus mean expression between two treatments
  • Each individual data point represents a gene
  • Log fold-change is on the y-axis and mean expression is on the x-axis
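
As a rough illustration of what is plotted, the sketch below computes the two coordinates for each gene from per-condition mean expression. The function name `ma_coordinates`, the pseudocount of 1, and the toy values are assumptions for this example. Here the x value is the average of the two log2 values (the classic "A"); some tools plot the mean of normalized counts on the x-axis instead.

```python
import math


def ma_coordinates(mean_control, mean_treated, pseudocount=1.0):
    """Return (M, A) lists: log2 fold-change and average log2 expression.

    A pseudocount (an assumption here) avoids taking log2 of zero.
    """
    m_values, a_values = [], []
    for c, t in zip(mean_control, mean_treated):
        log_c = math.log2(c + pseudocount)
        log_t = math.log2(t + pseudocount)
        m_values.append(log_t - log_c)        # y-axis: log fold-change
        a_values.append((log_t + log_c) / 2)  # x-axis: mean expression
    return m_values, a_values


# Hypothetical mean expression for three genes in two treatments.
control = [7.0, 63.0, 0.0]
treated = [31.0, 63.0, 3.0]
m_values, a_values = ma_coordinates(control, treated)
```

Genes with no change sit on the horizontal line at 0, and low-expression genes (small x) typically show the widest spread in log fold-change.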

volcano plot

  • A volcano plot compares two treatment conditions by plotting the adjusted P-value against the log fold-change
  • Volcano plots display the statistical significance of the difference relative to the magnitude of the difference for every gene in the comparison, usually as the negative base-10 log of the adjusted P-value and the base-2 log fold-change, respectively
  • They generally include threshold indicators for the adjusted P-value to show which genes would be considered statistically differentially expressed between treatments
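
The points above can be sketched in code as follows. This only computes the plotted coordinates and a threshold label for each gene; it draws nothing. The function name `volcano_points`, the cutoffs (adjusted P-value below 0.05, absolute log2 fold-change of at least 1), and the toy results are assumptions chosen for illustration, not recommended values.

```python
import math


def volcano_points(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Turn (gene, log2 fold-change, adjusted P-value) rows into plot points.

    Returns tuples of (gene, x, y, label), where x is the log2 fold-change,
    y is -log10(adjusted P-value), and label is "up", "down", or "ns".
    Assumes every adjusted P-value is greater than zero.
    """
    points = []
    for gene, log2fc, padj in results:
        y = -math.log10(padj)
        if padj < padj_cutoff and log2fc >= lfc_cutoff:
            label = "up"
        elif padj < padj_cutoff and log2fc <= -lfc_cutoff:
            label = "down"
        else:
            label = "ns"  # not significant, or effect too small
        points.append((gene, log2fc, y, label))
    return points


# Hypothetical results table for four genes.
results = [
    ("geneA", 2.5, 0.001),
    ("geneB", -1.8, 0.01),
    ("geneC", 0.3, 0.0001),  # significant but small change
    ("geneD", 3.0, 0.5),     # large change but not significant
]
points = volcano_points(results)
```

On the plot, "up" and "down" genes fall in the upper right and upper left corners, which gives the volcano shape its name.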

heat maps

  • From online EMBL-EBI Training (Contains example image)
    • In heat maps the data is displayed in a grid where each row represents a gene and each column represents a sample. The colour and intensity of the boxes are used to represent changes (not absolute values) of gene expression. In the EMBL-EBI example, red represents up-regulated genes and blue represents down-regulated genes. Black represents unchanged expression.
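
One common way to show changes rather than absolute values is to scale each gene (row) before colouring. The sketch below does this with row z-scores; it is an assumption that this matches any particular heat map, since tools differ in how they scale. The function name `row_z_scores` and the toy matrix are made up, and the population standard deviation is used for simplicity (many tools use the sample standard deviation).

```python
from statistics import mean, pstdev


def row_z_scores(matrix):
    """Centre and scale each row (gene) so colours reflect relative change."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = pstdev(row)
        if sd == 0:
            # A gene with identical values in every sample shows no change.
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(v - mu) / sd for v in row])
    return scaled


# Hypothetical log-scale expression: rows are genes, columns are samples.
log_expr = [
    [2.0, 2.0, 6.0, 6.0],      # higher in the last two samples
    [10.0, 10.0, 10.0, 10.0],  # highly expressed but unchanged
]
z = row_z_scores(log_expr)
```

After scaling, the second gene maps to the "unchanged" colour even though its absolute expression is the highest in the matrix.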

venn diagrams

  • A Venn diagram consists of multiple overlapping closed curves, usually circles, each representing a set.
  • The overlap represents what the sets have in common
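
For differential expression results, the sets are typically lists of genes, and the regions of a two-set Venn diagram correspond to simple set operations. A minimal sketch, with a hypothetical function name `venn_regions` and made-up gene names:

```python
def venn_regions(set_a, set_b):
    """Return the three regions of a two-set Venn diagram."""
    return {
        "only_a": set_a - set_b,  # left circle, outside the overlap
        "shared": set_a & set_b,  # the overlap
        "only_b": set_b - set_a,  # right circle, outside the overlap
    }


# Hypothetical differentially expressed genes from two comparisons.
de_comparison_1 = {"gene1", "gene2", "gene3", "gene4"}
de_comparison_2 = {"gene3", "gene4", "gene5"}
regions = venn_regions(de_comparison_1, de_comparison_2)
region_sizes = {name: len(genes) for name, genes in regions.items()}
```

The region sizes are the numbers usually printed inside each part of the diagram.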

Example papers:

Putting it together

EXERCISE: choosing setup for hypothesis testing (distractors are other metadata)

EXERCISE: interpreting volcano plot, MA-plot

EXERCISE: interpreting heat map

Wrapping up

make sure work is saved

review how to get back into work

review objectives

preview next class's objectives

Errata