Introduction

This is the custom code used in the following paper: Epigenetic therapy potentiates transposable element transcription to create tumor-enriched antigens in glioblastoma cells

Once the paper is published, the citation information will be here: X

Of note, the transcript analysis pipeline used in this paper can be found here: https://github.com/twlab/TEProf2Paper

Furthermore, the script used to perform cap filtering for nanocage data can be found here: https://github.com/nakul2234/cageCapfilter

The script for peak calling using nanoCAGE data and transcript profiling using long-read data can be found here: https://github.com/twlab/LRCAGE

All scripts defined below can be found in the bin folder.

Scripts and Descriptions

TITEA_Analysis.Rmd

R Notebook with analysis of mRNA, nanoCAGE, long read sequencing, and mass spectrometry data

rmskAllFramesTranslate.py

Python script that translates repeatmasker fasta files in all frames to create a potential protein database.

usage: rmskALLFramesTranslate.py [-h] [-m] [-e] [-l <LENGTH>] <FASTA FILE>

Translate FASTA file in all 6 frames. Optimized for use with Repeatmasker TE
files, but any FASTA can be tried. The FASTA fieldname orders will be
different.

positional arguments:
  <FASTA FILE>  The fasta file with DNA sequences of TEs that will be
                translated by this program. This script has been optimized to
                work with Repeatmasker Files.

optional arguments:
  -h, --help    show this help message and exit
  -m            If this flag is placed, then only those peptides starting with
                a Methionine will be considered
  -e            If this flag is placed, then peptides with no stop codon will
                be considered
  -l <LENGTH>   Minimum peptide length. Default: 1

transcriptALLFramesTranslate.py

usage: transcriptALLFramesTranslate.py [-h] [-m] [-e] [-l <LENGTH>]
                                       <FASTA FILE>

Translate FASTA file in all 3 forward frames. Optimized for use with FASTA
files with a uniqid as the only field after >, but any FASTA can be tried.

positional arguments:
  <FASTA FILE>  The fasta file with DNA sequences that will be translated by
                this program.

optional arguments:
  -h, --help    show this help message and exit
  -m            If this flag is placed, then only those peptides starting with
                a Methionine will be considered
  -e            If this flag is placed, then peptides with no stop codon will
                be considered
  -l <LENGTH>   Minimum peptide length. Default: 1

call_ATACpeak.sh

usage: call_ATACpeak.sh <ATAC BAM> <BLACKLIST BED>
Call ATAC peaks from ATAC bam file. To infer Tn5 insertion site for each read, +5 and -4 offset is applied to reads aligned to the forward and reverse strand, respectively.

<positional arguments>:
  <ATAC BAM>       The bam file of ATAC-seq data.
  <BLACKLIST BED>  The blacklist bed file that will filter out overlapping ATAC peaks.

calc_ATACscalefactor.R

usage: calc_ATACscalefactor <ATAC SAMPLE SHEET CSV> <ATAC PEAK AT HOUSE KEEPING GENES BED> <SIZE FACTOR TXT>
Calculate scale factors for each ATAC data using consensus ATAK peaks at house keeping gene promoters. 

<positional arguments>:
  <ATAC SAMPLE SHEET CSV>                 Sample sheet CSV file of ATAC samples. This format is used for DiffBind package.
  <ATAC PEAK AT HOUSE KEEPING GENES BED>  The BED file of consensus ATAC peaks overlapping house keeping gene promoters.
  <SIZE FACTOR TXT>                       Scale factors calculated for each ATAC sample.

pipe_align_ATAC.sh

usage: pipe_align_ATAC.sh <ATAC READ 1> <ATAC READ 2>
Trim adaptors in ATAC-seq reads, align to the Hg38 reference genome using bowtie2, deduplicate, extract usable reads, call peaks, and filter blacklist regions.

<positional arguments>:
  <ATAC READ 1> ATAC-seq read 1.
  <ATAC READ 2> ATAC-seq read 2.

pipe_align_WGBS.sh

usage: pipe_align_WGBS.sh <WGBS READ 1> <WGBS READ 2> <OUTPUT DIR>
Trim adaptors in WGBS-seq reads, align to the Hg38 reference genome using bismark, deduplicate, and extract methylation calls.

<positional arguments>:
  <WGBS READ 1>     WGBS-seq read 1.
  <WGBS READ 2>     WGBS-seq read 2.
  <OUTPUT DIR> Output directory.

pipe_align_nanoCAGE.sh

usage: pipe_align_nanoCAGE.sh <nanoCAGE READ 1> <nanoCAGE READ 2>
Trim adaptors in nanoCAGE-seq reads, align to the Hg38 reference genome using STAR.

<positional arguments>:
  <nanoCAGE READ 1>     nanoCAGE-seq read 1.
  <nanoCAGE READ 2>     nanoCAGE-seq read 2.

pipe_align_RNA.sh

usage: pipe_align_RNA.sh <RNA READ 1> <RNA READ 2>
Trim adaptors in RNA-seq reads, align to the Hg38 reference genome using STAR.

<positional arguments>:
  <RNA READ 1>     RNA-seq read 1.
  <RNA READ 2>     RNA-seq read 2.

pipe_assemble_RNA.sh

usage: pipe_assemble_RNA.sh <RNA BAM>
Assemble transcripts with stringtie using RNA-seq BAM file as an input (only uniquely mapped reads are used).

<positional arguments>:
  <RNA BAM> RNA BAM file.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
bin		bin
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Introduction

Scripts and Descriptions

TITEA_Analysis.Rmd

rmskAllFramesTranslate.py

transcriptALLFramesTranslate.py

call_ATACpeak.sh

calc_ATACscalefactor.R

pipe_align_ATAC.sh

pipe_align_WGBS.sh

pipe_align_nanoCAGE.sh

pipe_align_RNA.sh

pipe_assemble_RNA.sh

About

Releases 1

Packages

Contributors 2

Languages

juheon/epitherapy_induced_antigen_GBM

Folders and files

Latest commit

History

Repository files navigation

Introduction

Scripts and Descriptions

TITEA_Analysis.Rmd

rmskAllFramesTranslate.py

transcriptALLFramesTranslate.py

call_ATACpeak.sh

calc_ATACscalefactor.R

pipe_align_ATAC.sh

pipe_align_WGBS.sh

pipe_align_nanoCAGE.sh

pipe_align_RNA.sh

pipe_assemble_RNA.sh

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages