Difference in number of filtered isoforms in classification.txt and filtered isoform files #263

Closed
Upendra19993 opened this issue Mar 6, 2024 · 3 comments

Comments

@Upendra19993

Hi,

I ran my dataset through SQANTI3 and used the ML option to filter the isoforms. I also filtered the isoform FASTA and GTF files.

In the output of the SQANTI3 filter step, the MLresult_classification.txt file has 52,700 true isoforms (intra-priming = FALSE, filter result = Isoform), and the inclusion-list.txt file has the same number of isoforms. However, the filtered isoform file (filtered.fasta) has 52,553 isoforms, a difference of 147.

What are the possible reasons for this difference in the number of isoforms?
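For anyone reproducing this comparison, the three counts can be recomputed directly from the output files. The following is only a minimal sketch: the column name `filter_result` and the value `Isoform` are assumptions taken from the description above, and the three helper functions are hypothetical names, not part of SQANTI3.

```python
import csv
import io


def count_kept_rows(tsv_text, column="filter_result", keep="Isoform"):
    """Count rows of a tab-separated table whose `column` equals `keep`.

    The column name and value are assumptions; adjust them to the
    header of your own classification file.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return sum(1 for row in reader if row.get(column) == keep)


def count_list_ids(list_text):
    """Count non-empty lines in an ID list (one ID per line)."""
    return sum(1 for line in list_text.splitlines() if line.strip())


def count_fasta_records(fasta_text):
    """Count FASTA records by counting header lines starting with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


if __name__ == "__main__":
    # Toy data only; replace with the contents of your own files.
    toy_tsv = "isoform\tfilter_result\nA\tIsoform\nB\tArtifact\nC\tIsoform\n"
    toy_list = "A\nC\n"
    toy_fasta = ">A\nACGT\n>C\nGGCC\n"
    print(count_kept_rows(toy_tsv), count_list_ids(toy_list),
          count_fasta_records(toy_fasta))
```

If the first two counts agree and only the FASTA count differs, the discrepancy arises when the sequences are written, not when the isoforms are classified.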

Many thanks,
Upendra.

@alexpan00
Collaborator

Do you have any duplicated IDs in your inclusion-list.txt file?
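One way to check this is to count how often each ID occurs in the list. This is a rough sketch that assumes the file holds one ID per line; `find_duplicate_ids` is a hypothetical helper, not a SQANTI3 function.

```python
from collections import Counter


def find_duplicate_ids(list_text):
    """Return {id: count} for every ID that occurs more than once."""
    ids = [line.strip() for line in list_text.splitlines() if line.strip()]
    counts = Counter(ids)
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Toy data only; read your own inclusion list instead, e.g.
    # open("inclusion-list.txt").read()
    print(find_duplicate_ids("tx/0\ntx/1\ntx/0\n"))
```

Note that this only reports exact repeats. IDs that differ by a suffix, such as `tx/0` and `tx/0_dup2`, are distinct strings and are not flagged.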

@Upendra19993
Author

Yes, I do have duplicated IDs in my inclusion-list.txt file.

Example:
KP_LeafFlower_IsoSeq_HQ_transcript/0
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup11
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup2
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup5
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup8
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup3
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup4
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup6
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup10

The total number of IDs in the list is 52,700.

In the FASTA file, the same ID also appears for more than one isoform, but without the "_dup" number that was added during renaming. These records have different lengths. The total number of transcripts in the FASTA file is 52,553.

Example:
[Image: Picture1]
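The observation above can be quantified by comparing the list IDs with the FASTA headers. The sketch below assumes that duplicated IDs carry a `_dup<number>` suffix, as in the example list, and that the FASTA ID is the first whitespace-delimited token of each header. The helper names are hypothetical, and the code only reports counts; it does not explain why the suffix is missing.

```python
import re
from collections import Counter

# Assumed suffix pattern, based on the IDs shown in the example list.
DUP_SUFFIX = re.compile(r"_dup\d+$")


def base_id(seq_id):
    """Strip a trailing '_dup<number>' suffix, if present."""
    return DUP_SUFFIX.sub("", seq_id)


def fasta_ids(fasta_text):
    """Return the ID (first token) of every FASTA header line."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.append(fields[0])
    return ids


def summarize(list_ids, fasta_text):
    """Compare the ID list with the FASTA headers."""
    headers = fasta_ids(fasta_text)
    header_counts = Counter(headers)
    return {
        "list_ids": len(list_ids),
        "list_base_ids": len({base_id(i) for i in list_ids}),
        "fasta_records": len(headers),
        "fasta_unique_ids": len(header_counts),
        "fasta_repeated_ids": sorted(
            i for i, n in header_counts.items() if n > 1
        ),
    }


if __name__ == "__main__":
    # Toy data only; replace with your own list and FASTA contents.
    toy_list = ["tx/0", "tx/0_dup2", "tx/1"]
    toy_fasta = ">tx/0\nACGT\n>tx/0\nACGTAA\n>tx/1\nGG\n"
    print(summarize(toy_list, toy_fasta))
```

Comparing `list_ids` with `fasta_records`, and `list_base_ids` with `fasta_unique_ids`, shows whether records are missing from the FASTA file or only renamed.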

Could you please explain what this means?

Many thanks,
Upendra.

@carolinamonzo
Contributor

Closing, since this is a duplicate of the problem reported in issue #216.
