pos file from beagle/mafs #6

adeflamingh · 2022-03-17T20:16:38Z

Hello, I'm trying to run ngs-HMM using GL files that were previously generated with ANGSD (this was the ANGSD command):

angsd -GL 1 -out gl_55 -nThreads 10 -doGlf 2 -doMajorMinor 1 -doMaf 2 -SNP_pval 1e-2 -minInd 28 -b bam.filelist -ref refXX.fasta

I have a beagle.gz and mafs.gz file, and a total number of 20563944 sites.

I saved the first two columns from the mafs file and removed the header to create the .pos file needed (cat X.mafs.gz | awk '{print $1,$3}' > X.pos, then tail -n +1 X.pos > X.nohead.pos)

I checked that the beagle and the no.head.pos files have 20563944 respectively using zcat X.beagle.gz | wc -l (this was 20563945, because of the header). and cat X.nohead.pos | wc -l)

Then I tried to use ngsF-HMM to calculate stats (assigning 260G memory and 8 threads), using:

ngsF-HMM --n_ind 55 --n_sites 20563944 --geno ../gl_55.beagle.gz --lkl --pos ../gl_55.nohead.pos --min_epsilon 1e-7 --out output/file --n_threads 8

I get following error:

> Header found! Skipping line...
==> Input Arguments:
        geno: ../gl_55.beagle.gz
        pos: ../gl_55.nohead.pos
        lkl: true
        loglkl: false
        n_ind: 55
        n_sites: 20563944
        call_geno: false
        freq: r
        freq_est: 1
        e_prob: 1
        indF: 0.01-0.001
        indF_fixed: false
        alpha_fixed: false
        out: output/file
        log: 0
        log_bin: false
        min_iters: 10
        max_iters: 100
        min_epsilon: 0.0000001000
        n_threads: 8
        verbose: 1
        seed: 383
        version: 1.1.0 (Mar 14 2022 @ 11:09:52)

==> GZIP input file (not BINARY)
==> Reading data
> Sites coordinates
> GENO data
==> Setting initial inbreeding values to: 0.01-0.001
==> Using random initial frequency values.
==> Calculating initial emission probabilities

Iteration 1:
==> Forward Recursion
==> Backward Recursion
Ind 0: -20723775.856501504778862        -20723775.854976736009121 (0.001524768769741)

=====
ERROR: [iter_EM] Fw and Bw lkl do not match!
=====

        : Numerical result out of range

~~~~~~~~~~~~~~~~~~~~~~~~

Any recommendations on what I might be doing wrong? Thank you very much in advance!

The text was updated successfully, but these errors were encountered:

fgvieira · 2022-03-18T09:43:14Z

I'd try running it again with other seeds and, if that does not work, set --min_epsilon 1e-9.

adeflamingh · 2022-03-18T14:49:11Z

Thank you very much for your suggestions! I tried a different seed, and also tried --min_epsilon 1e-9 but I get the same error. Any other fixes?

What I used:

"ngsF-HMM --n_ind 55 --n_sites 20563944 --geno ../gl_55.beagle.gz --lkl --pos ../gl_55.nohead.pos --min_epsilon 1e-9 --out output/file --n_threads 8 --seed 8376"

What I get:

Header found! Skipping line...
==> Input Arguments:
geno: ../gl_55.beagle.gz
pos: ../gl_55.nohead.pos
lkl: true
loglkl: false
n_ind: 55
n_sites: 20563944
call_geno: false
freq: r
freq_est: 1
e_prob: 1
indF: 0.01-0.001
indF_fixed: false
alpha_fixed: false
out: output/file
log: 0
log_bin: false
min_iters: 10
max_iters: 100
min_epsilon: 0.0000000010
n_threads: 8
verbose: 1
seed: 8376
version: 1.1.0 (Mar 14 2022 @ 11:09:52)

==> GZIP input file (not BINARY)
==> Reading data

Sites coordinates
GENO data
==> Setting initial inbreeding values to: 0.01-0.001
==> Using random initial frequency values.
==> Calculating initial emission probabilities

Iteration 1:
==> Forward Recursion
==> Backward Recursion
Ind 0: -20723821.353790059685707 -20723821.352281708270311 (0.001508351415396)

=====
ERROR: [iter_EM] Fw and Bw lkl do not match!

    : Numerical result out of range

fgvieira · 2022-03-29T09:41:43Z

Then it looks like a precision issues due to large datasets, which seems to be the case since you have over 20M snps.

You can try to run on just a single chromosome, to check if that is the case.

Alternatively, you can try a "gentle" LD prunning (e.g. using ngsLD) just to remove sites that are in highly linked.

Let me know if it helped.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pos file from beagle/mafs #6

pos file from beagle/mafs #6

adeflamingh commented Mar 17, 2022

fgvieira commented Mar 18, 2022

adeflamingh commented Mar 18, 2022

fgvieira commented Mar 29, 2022

pos file from beagle/mafs #6

pos file from beagle/mafs #6

Comments

adeflamingh commented Mar 17, 2022

fgvieira commented Mar 18, 2022

adeflamingh commented Mar 18, 2022

===== ERROR: [iter_EM] Fw and Bw lkl do not match!

fgvieira commented Mar 29, 2022

=====
ERROR: [iter_EM] Fw and Bw lkl do not match!