how to install "bcftools +blup", exactly? #26

Open
jielab opened this issue Jan 23, 2025 · 7 comments

Comments

@jielab

jielab commented Jan 23, 2025

Hi, there:

I am trying to get bcftools +blup to work on the Ubuntu Linux that comes with Windows 11.

When I ran the first command in the "Installation" section, I got E: Unable to locate package libcholmod4. Is this package required?

After the above command, bcftools was installed. However, I had to remove it since it was version 1.13. Even after I tried "apt upgrade bcftools", it was still version 1.13. So I manually downloaded the latest version from http://github.com/samtools/bcftools/releases/download/1.20/bcftools-1.20.tar.bz2.

After I ran tar xjvf bcftools-1.20.tar.bz2, the next command failed: wget -P bcftools-1.20 http://raw.githubusercontent.com/DrTimothyAldenDavis/SuiteSparse/stable/{SuiteSparse_config/SuiteSparse_config,CHOLMOD/Include/cholmod}.h. Can you please check whether the link is valid?

Please help.

Thanks!

JH.

@freeseek
Owner

freeseek commented Jan 24, 2025

CHOLMOD4 is only required if you want to use BCFtools/pgs. If you remove BCFtools/pgs, you can skip the CHOLMOD4 installation entirely. However, I strongly discourage using BCFtools/blup, as it does not generate very predictive polygenic scores. To generate competitive polygenic scores you should really use BCFtools/pgs.

If your Ubuntu distribution comes with BCFtools 1.13, you are likely using Ubuntu 22.04. That release does not include CHOLMOD4 (or CHOLMOD5), so you would likely have to compile everything manually. I strongly advise switching to Ubuntu 24.04 to install BCFtools/pgs.

As for downloading the CHOLMOD headers, try this:

wget -P bcftools-1.20 http://raw.githubusercontent.com/DrTimothyAldenDavis/SuiteSparse/stable/SuiteSparse_config/SuiteSparse_config.h
wget -P bcftools-1.20 http://raw.githubusercontent.com/DrTimothyAldenDavis/SuiteSparse/stable/CHOLMOD/Include/cholmod.h

The problem might be that you are not running the commands inside Bash, which is the shell responsible for expanding the curly braces.
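A minimal sketch of the brace-expansion point, written as a dry run that only prints the wget commands rather than downloading anything. It assumes bash is installed; the variable names base and expanded are illustrative and not part of the installation instructions.

```shell
# Dry run: show what wget would receive, without touching the network.
base="http://raw.githubusercontent.com/DrTimothyAldenDavis/SuiteSparse/stable"

# 1) Bash expands the braces into two separate URLs before wget ever runs.
#    The braces sit inside double quotes here, so only the inner bash expands them.
expanded=$(bash -c "echo $base/{SuiteSparse_config/SuiteSparse_config,CHOLMOD/Include/cholmod}.h")
printf 'bash expands to: %s\n' $expanded   # unquoted on purpose: one URL per line

# 2) Alternative that needs no brace expansion and works in any POSIX shell.
for f in SuiteSparse_config/SuiteSparse_config.h CHOLMOD/Include/cholmod.h; do
  echo "wget -P bcftools-1.20 $base/$f"
done
```

If a shell without brace expansion runs the original one-line command, wget receives a single URL with the literal braces in it, which does not exist on the server.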

@jielab
Author

jielab commented Jan 27, 2025

Thanks!

I have now upgraded to Ubuntu 24. If I replace libcholmod4 with "libcholmod5", it works. I also manually downloaded the latest version, bcftools 1.21.

However, as the screenshots below show, I still could NOT download the plugin files before running make.

Image

Image

You said that I should really use BCFtools/pgs. I did not see any mention of it in the 2023 paper, which only says that BLUPx-ldgm is also implemented in bcftools. So, what is the difference between BCFtools/pgs and BCFtools/blup?

Thank you very much!

Best regards,
Jie

@freeseek
Owner

For whatever reason, it seems that your system is resolving raw.githubusercontent.com as 0.0.0.0. I think this is a problem with your system. You need to fix that first.

As for BCFtools/pgs, there is no official paper yet. The published paper is only about blup, though many aspects of the model are shared. Nevertheless, you should use BCFtools/pgs and not BCFtools/blup.
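For the name-resolution problem mentioned above, the following is a hedged diagnostic sketch. The thread does not say why the hostname resolved to 0.0.0.0; an override in /etc/hosts or a filtering DNS server are only assumptions about common causes. The helper name is_blackholed is hypothetical.

```shell
host="raw.githubusercontent.com"

# Succeed if the address is a placeholder that cannot reach GitHub.
is_blackholed() {
  case "$1" in
    0.0.0.0|127.0.0.1|::|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask the system resolver (which also consults /etc/hosts).
# addr stays empty if getent is unavailable or the lookup fails.
addr=""
if command -v getent >/dev/null 2>&1; then
  addr=$(getent hosts "$host" | awk '{print $1; exit}') || true
fi

if [ -z "$addr" ]; then
  echo "$host did not resolve (no network, or getent unavailable)"
elif is_blackholed "$addr"; then
  echo "$host resolves to $addr, so downloads from it will fail"
  # Assumption: a hosts-file override is one possible cause.
  grep -n "$host" /etc/hosts 2>/dev/null \
    || echo "no entry in /etc/hosts; the DNS server may be the cause"
else
  echo "$host resolves to $addr"
fi
```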

@jielab
Author

jielab commented Jan 29, 2025

Thanks!

I finally made this work after I manually downloaded those files and copied them to the "bcftools/plugins" directory. Please see the screenshot below.

Image

Your GitHub page says "Download latest version of [HTSlib] and [BCFtools]", but there is no further mention of [HTSlib] after that. Should I also compile it and copy the binaries to the $HOME/bin directory?

The above screenshot shows: -- pgs -- Compute best linear unbiased predictor from GWAS-VCF summary statistics. So, bcftools/pgs is like LDpred and PRS-CS, in that it takes GWAS summary statistics as input? After this, I need to use something like plink --score to calculate the actual PGS for individuals, correct?

Thank you & best regards,
Jie

@freeseek
Owner

If BCFtools is running on your side, it means HTSlib was correctly installed. BCFtools/pgs is in the same class of tools as LDpred and PRS-CS. It is meant to be easy to use and it works with GWAS-VCF files and LDGM-VCF files. The output loading files you generate can then be used with BCFtools/score to compute PGS values. You don't need to convert anything to PLINK format.
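One way to confirm the point about HTSlib, sketched under the assumption that bcftools is on PATH: bcftools --version reports the HTSlib version it is using, and bcftools plugin -lv lists the plugins it can load. The function name check_bcftools is hypothetical, and the exact layout of the plugin listing is an assumption, so the filter below matches plugin names loosely.

```shell
# Report the bcftools/HTSlib versions and whether the plugins discussed
# in this thread are visible. Prints a message instead of failing when
# bcftools is not installed.
check_bcftools() {
  if ! command -v bcftools >/dev/null 2>&1; then
    echo "bcftools not found on PATH"
    return 0
  fi
  # The first two lines name the bcftools and HTSlib versions in use.
  bcftools --version | head -n 2
  # List plugins and keep only the ones discussed here.
  bcftools plugin -lv 2>/dev/null | grep -E -w 'pgs|score|blup' \
    || echo "pgs/score/blup not listed (check the BCFTOOLS_PLUGINS variable)"
}

check_bcftools
```

If the plugins were copied somewhere other than the default plugin directory, pointing the BCFTOOLS_PLUGINS environment variable at that directory is the usual way to make bcftools find them.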

@jielab
Author

jielab commented Feb 2, 2025

Dear Giulio:

  1. I have now successfully installed bcftools 1.21 with plugins. At this time, I only want to use bcftools +pgs. Your GitHub page says that "The Gibbs sampling part .... This task is performed by CHOLMOD". My first question is: must I run CHOLMOD before I run bcftools +pgs?

  2. I also downloaded the LD reference file. My second question is: how can an LD matrix file be in VCF/BCF format? A traditional VCF/BCF file only has a few columns for each row [SNP]. Can I use the .hdf5 LD matrices [used by PRS-CS] with bcftools +pgs?

  3. I did not find any example GWAS VCF file on your GitHub page to test-run bcftools +pgs. For a very simple test run, I downloaded a chunk of a GWAS VCF file from here and manually inserted data for 3 more samples, since bcftools +score needs at least 5 samples in the VCF file. I simply ran: bcftools +pgs -Ov -o test.vcf --log test.log -b 5e-08 example.vcf.gz 1kg_ldgm.{AFR,EAS,EUR,AMR,SAS}.bcf. I did not receive any error message. But, as mentioned above, when and where should I run CHOLMOD? BTW, shouldn't I put --ldgm-vcfs before the LD reference files?

Your clarification and teaching would be greatly appreciated!

Best regards,
Jie

@freeseek
Owner

freeseek commented Feb 3, 2025

  1. CHOLMOD is a library required by BCFtools/pgs; it is not a separate step that you run. You can compile BCFtools/pgs without the CHOLMOD library installed and with just the two header files (SuiteSparse_config.h and cholmod.h), but you will eventually need the CHOLMOD library to run the plugin.

  2. The LD reference files are sparse LD precision matrices, that is, sparse inverse LD correlation matrices. This allows these files to be ~1,000 times smaller than typical LD correlation matrices. Please see here to understand how this works and see here for how the LDGM-VCF specification works.

  3. For examples on how to create VCF loading files see here. The --ldgm-vcfs option is not mandatory. You can run the tool as bcftools +pgs ... 1kg_ldgm.EUR.bcf 1kg_ldgm.AFR.bcf or as bcftools +pgs ... --ldgm-vcfs 1kg_ldgm.EUR.bcf,1kg_ldgm.AFR.bcf. Either way is acceptable. Multiple options are provided to allow scripting in the user's preferred way.
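The two invocation styles from point 3 can be generated from one list of population labels, as in this small sketch. It only prints the command lines; the "..." stands for the remaining options exactly as in the comment above, and the variable names pops, list, and positional are illustrative.

```shell
# Population labels for the LDGM-VCF files; adjust to the files you have.
pops="EUR AFR"

# Build the comma-separated form expected after --ldgm-vcfs.
list=""
for p in $pops; do
  list="${list:+$list,}1kg_ldgm.$p.bcf"
done

# The positional form is the same list separated by spaces.
positional=$(echo "$list" | tr ',' ' ')

# Print both equivalent command lines (dry run, nothing is executed).
echo "bcftools +pgs ... $positional"
echo "bcftools +pgs ... --ldgm-vcfs $list"
```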
