Skip to content

bogdanlab/prs-uncertainty

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

Estimate PRS Uncertainty for Biobank

Here we demonstrate how to estimate PRS uncertainty (full posterior distribution of genetic value) as described in Large uncertainty in individual PRS estimation impacts PRS-based risk stratification.

Data

Prepare the following data:

  • train_bfile: Genotype file for training individuals.
  • train_sumstats: GWAS association summary statistics output from plink.
  • val_bfile: Genotype file for validation individuals.
  • val_pheno: Phenotype file for validation individuals.
  • test_bfile: Genotype file for testing individuals

Estimate PRS Uncertainty

Install the following packages:

install.packages('optparse', 'bigsnpr', 'tibble', 'readr', 'dplyr')

Download the prs_uncertainty.R and summary.R scripts:

Compute the PRS uncertainty:

#!/bin/sh
#$ -cwd
#$ -j y
#$ -l h_data=20G,h_rt=10:00:00 -pe shared 6
#$ -o ./job_out
#$ -t 1-22

chr_i=$SGE_TASK_ID

Rscript prs_uncertainty.R
    --chr_i=$chr_i
    --train_bfile=train_bfile \
    --train_sumstats=all.assoc.linear \
    --val_bfile=val_bfile \
    --val_pheno=height.val.regressed_pheno \
    --test_bfile=test_bfile \
    --out_dir=out_dir  #store outputfiles \
    --cache_dir=cache_file #store LD cache\ 
    --n_cores=6 \
    --num_burn_in=100 \
    --num_iter=500 \

To reduce runtime and memory required for training PRS models, we train PRS models separtely for each of the twentytwo chromosomes and then merge them to obtain a final PRS model for the whole genome using the script summary.R

#!/bin/sh
#$ -cwd
#$ -j y
#$ -l h_data=10G,h_rt=1:00:00 
#$ -o ./job_out

Rscript summary.R --out_dir=out_dir

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages