Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAPQ of -2147483648 attributed to some reads in samfile (minimum value for an int32) #83

Open
OmarOakheart opened this issue Jul 19, 2020 · 1 comment

Comments

@OmarOakheart
Copy link

OmarOakheart commented Jul 19, 2020

I'm using the following version of ngmlr:

ngmlr 0.2.7 (build: Jul 3 2020 03:31:03, start: 2020-07-19.14:10:32)

I ran the following command:

ngmlr -t 8 -x ont -r $reference -q $file.fastq.gz -o $file.sam

And one (probably more) of the lines outputs with a MAPQ of -2147483648

CCN_ACA_BMB_26 2048 chromosome15 1074809 -2147483648 2561S12M2D21M2I90M1D15M1I2M2D7M1D5M1I13M1S

I didn't paste the rest to avoid clutter but it's a 2731 bp read so nothing crazy in that respect (I checked in the original fastq file)

To provide more context, I think this is an issue that used to go unnoticed for me, but I now use samtools v1.10, which seems to now check for MapQ and output an error if it's <0

if (b->core.qual < settings->min_mapQ || ((b->core.flag & settings->flag_on) != settings->flag_on) || (b->core.flag & settings->flag_off))
        return 1;

(From: https://github.com/samtools/samtools/blob/a6a160bba842d6afc3da061a8aa4f199fd7b729b/sam_view.c)

Still, I don't know what would cause a MAPQ of -1 in the first place

This issue is also found in #75 (looks like no one noticed the negative MAPQ but it's there. I'm not sure if there's a way to merge issues or so, I'm not proficient with github yet.

My hunch is that the following happens:

The mapping for the offending read is an extremely low quality, so I think there are issues with one of the signs in the these lines:

uint32_t mapq = -4.343 * log(1 - (double)abs(a->score1 - a->score2)/(double)a->score1);
mapq = (uint32_t) (mapq + 4.99);

I gather than mapq should be a positive integer, therefore I assume that

log(1 - (double)abs(a->score1 - a->score2)/(double)a->score1) <0

Therefore

1 - (double)abs(a->score1 - a->score2)/(double)a->score1 <1

Therefore

(double)abs(a->score1 - a->score2)/(double)a->score1 >0

Of course, what seems to be happening is the opposite, so the bug is that sometimes

(double)abs(a->score1 - a->score2)/(double)a->score1 <0

This is where my understanding reaches its limit (and maybe the bug is elsewhere but this seems like a good lead).

Either sometimes score1<score2, or sometimes score1<0

I'm not sure I understand what score1 and score2 represent so I don't know if either one is a plausible scenario. I hope all of this helps.

@rhysnewell
Copy link

I'm having the exact same issue, my guess would be there seems to be some sort of underflow occurring somewhere but I've not looked through the code to try and pinpoint it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants