You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The problem is that the bcf_sr_next_line is not reset for each call of bcf_sr_set_regions, the break statement is useless, the iterator continues for the first chromosome .
So , is there a method in htslib to reset a bcf_sr_ iterator before calling bcf_sr_set_regions again ?
Thanks !
The text was updated successfully, but these errors were encountered:
I think bcf_sr_seek() mostly does what you want. One wrinkle is that if the reference you seek to is missing from the file it will put you at the start of next available one, so you have to do some filtering to ensure the line you got back was for the reference you expected.
After a little experimentation, I came up with this:
#include<stdlib.h>#include<stdio.h>#include<htslib/vcf.h>#include<htslib/synced_bcf_reader.h>intscan(constchar*fname) {
intret=EXIT_FAILURE;
constchar**seqnames=NULL;
bcf_srs_t*sr=bcf_sr_init();
if (!sr) {
fprintf(stderr, "bcf_sr_init() failed\n");
returnEXIT_FAILURE;
}
bcf_sr_set_opt(sr, BCF_SR_PAIR_LOGIC, BCF_SR_PAIR_BOTH_REF);
bcf_sr_set_opt(sr, BCF_SR_REQUIRE_IDX);
bcf_sr_add_reader(sr,fname);
if (sr->errnum) {
fprintf(stderr, "Error: %s\n", bcf_sr_strerror(sr->errnum));
goto out;
}
if (sr->nreaders<1|| !sr->readers[0].header) {
fprintf(stderr, "No header\n");
goto out;
}
bcf_hdr_t*hdr=sr->readers[0].header;
intnseq=0;
seqnames=bcf_hdr_seqnames(hdr, &nseq);
if (!seqnames) {
fprintf(stderr, "Couldn't get list of reference names\n");
goto out;
}
intseq;
for (seq=0;seq<nseq;++seq) {
constchar*contig_name=seqnames[seq];
intrid=bcf_hdr_id2int(hdr, BCF_DT_CTG, contig_name);
bcf_sr_seek(sr,contig_name, 0);
while ( bcf_sr_next_line(sr) ) {
bcf1_t*line=sr->readers[0].buffer[0];
if (line->rid!=rid)
break; // No variants for this contig_namefprintf(stderr,"%d\t%s\t%"PRIhts_pos"\n",
line->rid, contig_name, line->pos);
break; // ### OK VARIANT FOUND, STOP HERE ###############################
}
}
if ( sr->errnum ) {
fprintf(stderr, "Error: %s\n", bcf_sr_strerror(sr->errnum));
} else {
ret=EXIT_SUCCESS;
}
out:
bcf_sr_destroy(sr);
free(seqnames);
returnret;
}
intmain(intargc, char**argv) {
if (argc>1)
returnscan(argv[1]);
return0;
}
Hi all,
this question was cross-posted on biostars: https://www.biostars.org/p/9599969/#9599979
I'm trying to find the first variant of each chromosome in a VCF. My code so far looks like this:
The problem is that the
bcf_sr_next_line
is not reset for each call ofbcf_sr_set_regions
, thebreak
statement is useless, the iterator continues for the first chromosome .So , is there a method in htslib to reset a
bcf_sr_
iterator before callingbcf_sr_set_regions
again ?Thanks !
The text was updated successfully, but these errors were encountered: