Difference in number of filtered isoforms in classification.txt and filtered isoform files #263

Upendra19993 · 2024-03-06T21:59:32Z

Hi,

I ran my dataset through SQANTI3 and used the ML option to filter the isoforms. I also filtered the isoform and GTF files.

In the output of the sqanti filter step, the MLresult_classification.txt file has 52,700 true isoforms (Intra-priming = FALSE, filter result = Isoform), and the inclusive.txt file has the same number of isoforms. However, the filtered isoform file (filtered.fasta) has 52,553 isoforms, a difference of 147.

What are the possible reasons for this difference in isoform counts?

Many thanks,
Upendra.

alexpan00 · 2024-03-07T09:55:43Z

Do you have any duplicated IDs in your inclusion-list.txt file?
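One way to check this is sketched below. It assumes the inclusion list is a plain-text file with one isoform ID per line; the function name `find_duplicate_ids` is made up for illustration and is not part of SQANTI3.

```python
from collections import Counter


def find_duplicate_ids(lines):
    """Return {id: count} for every ID that occurs more than once.

    `lines` is any iterable of strings, for example an open file handle.
    Blank lines are ignored and surrounding whitespace is stripped.
    """
    ids = [line.strip() for line in lines if line.strip()]
    counts = Counter(ids)
    return {isoform_id: n for isoform_id, n in counts.items() if n > 1}
```

To run it on a file: `with open("inclusion-list.txt") as fh: print(find_duplicate_ids(fh))`. An empty dictionary means that every ID in the list is unique as an exact string.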

Upendra19993 · 2024-03-11T01:22:13Z

Yes, I do have duplicated IDs in my inclusion-list.txt file.

Example:
KP_LeafFlower_IsoSeq_HQ_transcript/0
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup11
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup2
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup5
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup8
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup3
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup4
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup6
KP_LeafFlower_IsoSeq_HQ_transcript/0_dup10

The total number of IDs in the list is 52,700.

In the FASTA file, the same ID also appears for more than one isoform, without the "duplicate number" suffix from the renaming, but the sequences have different lengths. The total number of transcripts in the FASTA file is 52,553.
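The comparison described above can be reproduced with a small script. This is only a sketch under stated assumptions: the list holds one ID per line, the "duplicate number" is a trailing `_dup<number>` as in the example IDs, and the FASTA ID is the first whitespace-delimited token after `>`. The helper names are hypothetical, and this does not describe how SQANTI3 handles these IDs internally.

```python
import re
from collections import Counter

# Assumed suffix pattern, taken from the example IDs (e.g. "..._dup11").
DUP_SUFFIX = re.compile(r"_dup\d+$")


def fasta_ids(lines):
    """Collect the ID (first token after '>') of every FASTA record."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:  # skip a bare ">" header
                ids.append(tokens[0])
    return ids


def base_id(list_id):
    """Strip a trailing _dup<number> suffix, if present."""
    return DUP_SUFFIX.sub("", list_id)


def summarize(list_ids, fasta_header_ids):
    """Compare per-ID counts between the list and the FASTA headers."""
    list_counts = Counter(base_id(i) for i in list_ids)
    fasta_counts = Counter(fasta_header_ids)
    # IDs whose number of list entries differs from their number of records.
    mismatched = {
        i: (list_counts[i], fasta_counts[i])
        for i in list_counts
        if list_counts[i] != fasta_counts[i]
    }
    return {
        "list_total": sum(list_counts.values()),
        "fasta_total": sum(fasta_counts.values()),
        "mismatched": mismatched,
    }
```

Each entry in `mismatched` maps a base ID to a pair (entries in the list, records in the FASTA), which shows which IDs account for the gap between the two totals.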

Example:

Could you please explain what this means?

Many thanks,
Upendra.

carolinamonzo · 2024-04-17T09:23:30Z

Closing since this is a duplicate of issue #216.

carolinamonzo closed this as completed Apr 17, 2024
