---
title: 'Bioinformatics for Big Omics Data: Introduction to gene expression microarrays'
author: "Raphael Gottardo"
date: "January 21, 2014"
output:
  ioslides_presentation:
    fig_caption: yes
    fig_retina: 1
    keep_md: yes
    smaller: yes
---
## Setting up some options
Let's first turn on the cache for increased performance and improved styling
```{r, cache=FALSE}
# Set some global knitr options
library("knitr")
# Note: the knitr chunk option is `message` (singular), not `messages`
opts_chunk$set(tidy=TRUE, tidy.opts=list(blank=FALSE, width.cutoff=60), cache=TRUE, message=FALSE)
```
## Outline
- Motivation
- Reverse transcription and hybridization cDNA microarrays
- Oligonucleotide arrays (Affymetrix)
- Illumina beadarrays
## Motivation
- Why are microarrays so important?
- First technology to enable the genome-wide quantification of gene expression
- Characterize and classify diseases, design new drugs, etc.
- "Recent" technology (1995, 1996)
- Fast moving field
## Microarrays
- Isolate mRNA from cells
- From RNA we can obtain DNA using an enzyme called reverse transcriptase (found, e.g., in retroviruses)
- The derived "copy" DNA (cDNA) is hybridized to known DNA targets (genes) to quantify gene expression
## Reverse transcription
<img src="Images/RT.png" width=500>
## Hybridization
<img src="Images/hybridization.png" width=500>
## cDNA-microarray
<img src="Images/cDNA-microarrays.png" width=500>
## cDNA-microarray
<img src="Images/microarray-image.png" width=500>
## Oligonucleotide arrays
<img src="http://www.genome.gov/dmd/previews/28633_large.jpg" width=500>
- Fabricated by placing short cDNA sequences (oligonucleotides) on a small silicon chip by a photolithographic process
- Each gene is represented by a set of distinct probes (11-20 per gene), each a cDNA segment 25 nucleotides long
- Probes chosen based on uniqueness criteria and empirical rules
- Single color
## Affymetrix arrays
<img src="Images/Affy-probe.png" width=500>
## Affymetrix arrays
- About 20 probes that "perfectly" represent the gene (Perfect Match, PM)
- About 20 probes that do not match the gene sequence (Mismatch, MM)
- Together, the PM and MM probes for a gene form a probeset
## Affymetrix arrays
<img src="Images/PM-MM-probes.png" width=500>
## Affymetrix arrays
For a valid gene expression measurement, the Perfect Match sticks and the Mismatch does not!
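## Affymetrix arrays: a toy PM/MM example
The PM/MM idea can be sketched with simulated intensities. This is only an illustration: the numbers are made up, and the average PM - MM difference used here is a deliberately naive summary, not necessarily what any particular software computes.

```{r}
# Toy illustration only: simulated intensities, not real Affymetrix data
set.seed(1)
# About 20 probe pairs per probeset
n_probes <- 20
# Perfect Match probes: specific signal plus background (assumed values)
pm <- rnorm(n_probes, mean = 1000, sd = 100)
# Mismatch probes: background / non-specific binding only (assumed values)
mm <- rnorm(n_probes, mean = 200, sd = 50)
# Naive probeset summary: average PM - MM difference
probeset_signal <- mean(pm - mm)
probeset_signal
# Fraction of probe pairs where PM is brighter than MM
mean(pm > mm)
```

If the gene is expressed and the probes behave as intended, the PM intensities should clearly exceed the MM intensities for most probe pairs.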
## Probe synthesis
Let's watch a short video:
http://www.youtube.com/watch?v=ui4BOtwJEXs
## Affymetrix arrays
<img src="Images/affymetrix-layout.png" width=500>
## Affymetrix protocol
<img src="Images/affymetrix-protocol.png" width=500>
## Affymetrix probesets
<img src="Images/probe-set.png" width=500>
## Affymetrix image
<img src="Images/affymetrix-image.png" width=500>
## Illumina bead arrays
<img src="http://res.illumina.com/images/technology/beadarray_multi_sample_array_formats_lg.gif" width=500>
(Source: http://www.illumina.com/)
## Illumina bead arrays
<img src="http://www.mouseclinic.de/uploads/pics/Gene_exp_1_500.jpg" width=500>
(Source: http://www.mouseclinic.de/)
## Illumina bead arrays - image
<img src="http://www.sanger.ac.uk/research/projects/complextraits/gfx/complextraits_array-snp_400x404_72.jpg" width=500>
## Illumina bead arrays
- Illumina bead arrays can be applied to different problems
- Can use one or two colors depending on the application
- Gene expression arrays use one color
## Illumina bead arrays
- High density arrays
- HumanHT-12 v4.0 Expression BeadChip, about 47,000 probes
- A specific oligonucleotide sequence is assigned to each bead type, which is replicated about 30 times on an array
- Bead-level data can be relatively large
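## Illumina bead arrays: a toy bead-level summary
Since each bead type is replicated about 30 times, bead-level data are usually summarized to one value per bead type. Below is a minimal sketch on simulated data. The bead type names, the intensities, and the "drop beads more than 3 MADs from the median, then average" rule are assumptions chosen for illustration.

```{r}
# Toy illustration only: simulated bead-level data for 3 hypothetical bead types
set.seed(2)
beads <- data.frame(
  bead_type = rep(c("probe_A", "probe_B", "probe_C"), each = 30),
  intensity = c(rnorm(30, 500, 40), rnorm(30, 2000, 150), rnorm(30, 120, 15))
)
# Add one artificial outlier bead to probe_A
beads$intensity[1] <- 5000
# Summarize one bead type: drop beads more than 3 MADs from the median,
# then average the remaining beads
summarize_beads <- function(x) {
  keep <- abs(x - median(x)) <= 3 * mad(x)
  c(mean = mean(x[keep]), n_beads = sum(keep))
}
# One row per bead type, with the robust mean and the number of beads kept
bead_summary <- t(sapply(split(beads$intensity, beads$bead_type), summarize_beads))
bead_summary
```

The summarized table has one row per bead type instead of about 30, which is why bead-level files are so much larger than the summarized data.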
## Summary of technologies
cDNA microarray | Affymetrix | Illumina
---|---|---
1 probe/gene | 11-20 probes/gene | 30 beads (probes)/gene
Probes of variable length | 25-mer probes | 50-mer probes
Two colors | One color | One color
Flexible, choice of probes | High density/Replication | High density/Replication
- | Cross hybridization | -
## Analysis pipeline
1. Biological question
2. Experimental design
3. Experiment
4. Image analysis
5. Normalization, Batch effect removal
6. Estimation, Testing, Clustering, Prediction, Classification, Feature selection, etc.
7. Validate findings, generate new hypotheses -> New experiments
**Most of these steps require the use of statistical methods and/or computational tools**
**Note:** Even though microarrays were designed to study gene expression, they can be used to study many other things (DNA-protein binding, methylation, splicing, etc.), though they are slowly being replaced by sequencing-based assays.
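## Analysis pipeline: a toy normalization example
To make the normalization step of the pipeline concrete, here is a minimal sketch of quantile normalization in base R. Quantile normalization is one widely used choice, picked here purely for illustration. The data are simulated, and `quantile_normalize` is a hypothetical helper written for this slide, not a function from a package.

```{r}
# Toy illustration only: simulated log2 intensities, 1000 probes x 4 arrays
set.seed(3)
x <- sapply(c(7, 7.5, 8, 8.5), function(m) rnorm(1000, mean = m, sd = 1))
colnames(x) <- paste0("array", 1:4)
# Quantile normalization: give every array the same empirical distribution
quantile_normalize <- function(mat) {
  # Rank of each probe within its array
  ranks <- apply(mat, 2, rank, ties.method = "first")
  # Each column sorted in increasing order
  sorted <- apply(mat, 2, sort)
  # Reference distribution: mean of the k-th smallest values across arrays
  reference <- rowMeans(sorted)
  # Replace each value by the reference value at its rank
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(mat)
  normalized
}
x_norm <- quantile_normalize(x)
# Array means differ before normalization
round(colMeans(x), 2)
# and agree afterwards
round(colMeans(x_norm), 2)
```

After normalization all arrays share the same set of values, so differences between arrays reflect only the ordering of probes within each array.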