prime-seq

This repository contains the scripts used for the analyses performed in our manuscript:

prime-seq, efficient and powerful bulk RNA-sequencing

Aleksandar Janjic, Lucas E. Wange, Johannes W. Bagnoli, Johanna Geuder, Phong Nguyen, Daniel Richter, Beate Vieth, Christoph Ziegenhain, Binje Vick, Ines Hellmann, Wolfgang Enard

For the full prime-seq protocol, please visit protocols.io. For the full list of 384 barcoded oligo-dT primers, click here.

prime-seq is a simple RNA-seq workflow that goes from lysate to sequencing library in no time. We benchmarked its performance against the MAQC-III study using power analysis and showed that it captures known biological differences in a differentiation experiment.

The data necessary to reproduce this analysis can be found at ArrayExpress:

Accession	Dataset
E-MTAB-10140	Beads_Columns_tissue
E-MTAB-10138	Beads_Columns_PBMC
E-MTAB-10142	Beads_Columns_HEK
E-MTAB-10141	gDNA_priming
E-MTAB-10139	UHRR
E-MTAB-10133	iPSC
E-MTAB-10175	AML
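
For scripted access, the accession table above can be kept as a small lookup. The following is a minimal sketch, written in Python purely for illustration (the analyses in this repository are R notebooks). The names `ACCESSIONS`, `BASE_URL` and `study_url` are hypothetical, and the URL pattern for ArrayExpress studies is an assumption that is not stated in this README and should be checked against the current EBI site.

```python
# Accession -> dataset label, copied from the table above.
ACCESSIONS = {
    "E-MTAB-10140": "Beads_Columns_tissue",
    "E-MTAB-10138": "Beads_Columns_PBMC",
    "E-MTAB-10142": "Beads_Columns_HEK",
    "E-MTAB-10141": "gDNA_priming",
    "E-MTAB-10139": "UHRR",
    "E-MTAB-10133": "iPSC",
    "E-MTAB-10175": "AML",
}

# ASSUMPTION: URL pattern for ArrayExpress studies; verify before relying on it.
BASE_URL = "https://www.ebi.ac.uk/biostudies/arrayexpress/studies/"


def study_url(accession: str) -> str:
    """Return the assumed study page URL for an accession listed above."""
    if accession not in ACCESSIONS:
        raise KeyError(f"unknown accession: {accession}")
    return BASE_URL + accession


if __name__ == "__main__":
    # Print one tab-separated line per dataset: label, then assumed URL.
    for accession, dataset in ACCESSIONS.items():
        print(f"{dataset}\t{study_url(accession)}")
```

Restricting `study_url` to the listed accessions means a mistyped accession fails early instead of producing a dead link.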

Preprocessing

All RNA-seq data were adapter-trimmed with cutadapt and preprocessed with zUMIs (Parekh et al., 2017).

1. Development of the prime-seq protocol

Here we summarize the experiments in which previous versions of prime-seq have been used, in terms of number of samples, species, and the fractions of reads mapped to introns and exons. Next, we show that intronic reads can be used for gene expression quantification and are not derived from contaminating gDNA. R Notebooks for this analysis can be found here.

1.1 prime-seq has been used extensively and is robust with different inputs

We collected data from prime-seq experiments performed over the past years during its development and show that prime-seq works robustly on many different samples.

prime-seq robustness

1.2 Intronic reads in prime-seq are not derived from gDNA and can be used for expression quantification

gDNA priming

2. prime-seq performs as well as TruSeq

To benchmark prime-seq, we compared it to a gold-standard data set from the MAQC consortium using powsimR. R Notebooks for this analysis can be found here.

Method sensitivity
Method correlations
Method powsimR

3. Bead-based RNA extraction increases cost efficiency and throughput

To test the impact of different RNA isolation methods on gene expression, we performed prime-seq on three types of input: RNA was isolated from HEK cells, human PBMCs, and mouse striatal tissue with either columns or SPRI beads. R Notebooks for this analysis can be found here.

Lysis features
Lysis sensitivity
Lysis costs
Lysis DE
Lysis PCA

3.x Intron counts in prime-seq correlate with exon counts and show 3’ enrichment

Intron vs. exon expression

3.1 prime-seq is sensitive and works well with 1,000 cells

Low input sensitivity
Low input correlations

3.2 Cross-contamination in prime-seq is low

R Notebooks for this analysis can be found here.

cross-contamination correlation
cross-contamination cycles
cross-contamination simulation

4. Figure: proof of concept, AML and iPSC to NPC

We have already used prime-seq on many different types of samples; here we show two examples. The first data set consists of 96 archival AML PDX samples that were sampled using biopsy punching. We show that the biological differences between the patients and AML types can be measured accurately using our method. In the second data set, we compared neuronal differentiation of five iPS cell lines that we generated previously (Geuder et al. 2021). R Notebooks for this analysis can be found here.

AML PDX PCA
iPSC to NPC differentiation

5. Figure: Budget vs. Power

Finally, we show the impact of per-sample costs on the power to detect differentially expressed genes. By enabling the study of many more biological replicates within a fixed budget than Illumina's TruSeq kit, prime-seq leverages the full power of bulk RNA-seq. R Notebooks for this analysis can be found here.

method costs
power vs. budget
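
The budget argument in this section rests on simple arithmetic: for a fixed budget, a lower per-sample cost buys more biological replicates per condition. The following is a minimal sketch of that arithmetic only, written in Python for illustration (the repository's own analyses are R notebooks). The function name `replicates_per_condition` is hypothetical, and the costs in the usage example are placeholders, not the costs reported in the manuscript. Power itself is estimated by simulation with powsimR, which this sketch does not attempt to reproduce.

```python
def replicates_per_condition(budget: float,
                             cost_per_sample: float,
                             n_conditions: int = 2) -> int:
    """Whole number of replicates per condition affordable within a budget.

    Assumes every sample costs the same and that the budget is split
    evenly across conditions; both are simplifications.
    """
    if budget < 0 or cost_per_sample <= 0 or n_conditions < 1:
        raise ValueError("budget must be >= 0, cost > 0, n_conditions >= 1")
    # Floor division: partial samples cannot be bought.
    return int(budget // (cost_per_sample * n_conditions))


if __name__ == "__main__":
    # HYPOTHETICAL per-sample costs, chosen only to illustrate the scaling.
    budget = 1000
    for label, cost in [("cheaper method", 5), ("costlier method", 50)]:
        n = replicates_per_condition(budget, cost)
        print(f"{label}: {n} replicates per condition")
```

Because the replicate count scales inversely with per-sample cost, a tenfold cheaper library preparation allows ten times as many replicates for the same budget under these assumptions.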

6. Molecular Workflow of prime-seq

This schematic outlines the detailed molecular workflow from isolated RNA to sequencing library.

`R` Session Info

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Devuan GNU/Linux 3 (beowulf)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.1.0  magrittr_2.0.1  fastmap_1.1.0   tools_4.1.0    
#>  [5] htmltools_0.5.2 yaml_2.2.1      stringi_1.7.4   rmarkdown_2.11 
#>  [9] knitr_1.36      stringr_1.4.0   xfun_0.28       digest_0.6.28  
#> [13] rlang_0.4.12    evaluate_0.14

