22 Dec 12:43

KatharinaHoff

d8aaf4b

v1.0.11 - deleting transcripts with CDS features on opposite strands

Latest

Full Changelog: v1.0.10...v1.0.11
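
The release title describes the fix only briefly. As a rough illustration, a check for transcripts whose CDS features lie on both strands could look like the Python sketch below. This is not the code used in GALBA: the function names are hypothetical, and the sketch assumes GTF attribute columns of the form transcript_id "...".

```python
import re
from collections import defaultdict

# Assumption: attributes are written as transcript_id "g1.t1"; gene_id "g1";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def mixed_strand_transcripts(gtf_lines):
    """Return IDs of transcripts whose CDS features occur on more than one strand."""
    strands = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # GTF column 3 is the feature type, column 7 the strand
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            strands[match.group(1)].add(fields[6])
    return {tid for tid, seen in strands.items() if len(seen) > 1}


def drop_transcripts(gtf_lines, bad_ids):
    """Keep every line that does not belong to one of the flagged transcripts."""
    kept = []
    for line in gtf_lines:
        fields = line.rstrip("\n").split("\t")
        match = TRANSCRIPT_ID.search(fields[8]) if len(fields) >= 9 else None
        if match and match.group(1) in bad_ids:
            continue
        kept.append(line)
    return kept
```

Calling mixed_strand_transcripts on the lines of a GTF file and passing the result to drop_transcripts yields the filtered lines.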

15 Dec 16:04

KatharinaHoff

v1.0.10

573c697

X-Mas Release 2023: DIAMOND denoises AUGUSTUS predictions

This release was inspired by the manuscript

Newly Sequenced Genomes Reveal Patterns of Gene Family Expansion in select Dragonflies (Odonata: Anisoptera)

Ethan R. Tolman, Christopher D. Beatty, Paul B. Frandsen, Jonas Bush, Or R. Bruchim, Ella Simone Driever, Kathleen M. Harding, Dick Jordan, Manpreet K. Kohli, Jiwoo Park, Seojun Park, Kelly Reyes, Mira Rosari, Jisong L. Ryu, Vincent Wade, Jessica L. Ware

https://doi.org/10.1101/2023.12.11.569651

The authors state in the manuscript:

"While our genome annotations initially had a high (>50,000) number of genes compared to the annotation of P. flavescens, by conservatively retaining only genes which had a BLAST hit to a protein sequence from P. flavescens [27], we were able to generate highly complete annotations (fig 1. A,B), further supporting the efficacy of this pipeline in insects."

I adopted the idea and added a new script, filter_gtf_by_diamond_against_ref.py, that does the same thing using DIAMOND. I chose DIAMOND only because of its speed; the result should be highly similar to using BLAST.

Calling the script is integrated into galba.pl. This approach can substantially increase specificity for a marginal tradeoff in sensitivity.
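
The filtering idea can be sketched in a few lines of Python. This is a minimal illustration, not the contents of filter_gtf_by_diamond_against_ref.py: the function names and file names are hypothetical, and the sketch assumes that the protein IDs searched with DIAMOND are identical to the transcript_id values in the GTF file.

```python
import re

# Assumption: the hits were produced beforehand with commands such as
#   diamond makedb --in reference_proteins.fa -d reference
#   diamond blastp -d reference -q predicted_proteins.aa -o hits.tsv --outfmt 6
# (the file names are placeholders).

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def ids_with_hits(diamond_tsv_lines):
    """Collect query IDs, i.e. column 1 of the BLAST-style tabular format 6."""
    hits = set()
    for line in diamond_tsv_lines:
        line = line.strip()
        if line:
            hits.add(line.split("\t")[0])
    return hits


def filter_gtf(gtf_lines, keep_ids):
    """Keep GTF lines of transcripts with a hit; lines without a transcript_id pass through."""
    kept = []
    for line in gtf_lines:
        match = TRANSCRIPT_ID.search(line)
        if match is None or match.group(1) in keep_ids:
            kept.append(line)
    return kept
```

Predicted transcripts without any hit against the reference proteins are dropped, which is where the gain in specificity comes from.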

Accuracy comparison before and after DIAMOND filter for denoising AUGUSTUS predictions in GALBA:

D. melanogaster

before

gene_Sn	71.07
gene_Sp	71.09
trans_Sn	48.45
trans_Sp	63.74
cds_Sn	78.45
cds_Sp	87.54

after

gene_Sn	71.02
gene_Sp	73.28
trans_Sn	48.42
trans_Sp	65.42
cds_Sn	78.43
cds_Sp	88.90

Mus musculus

before

gene_Sn	70.64
gene_Sp	38.34
trans_Sn	28.70
trans_Sp	35.26
cds_Sn	77.43
cds_Sp	82.34

after

gene_Sn	70.29
gene_Sp	66.63
trans_Sn	28.55
trans_Sp	56.33
cds_Sn	77.10
cds_Sp	92.23
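
To make the tradeoff easier to read, the change per metric can be computed directly from the numbers above. The short Python sketch below only restates the reported values; the variable and function names are chosen for illustration.

```python
# Values copied from the before/after listings above (percent).
FLY_BEFORE = {"gene_Sn": 71.07, "gene_Sp": 71.09, "trans_Sn": 48.45,
              "trans_Sp": 63.74, "cds_Sn": 78.45, "cds_Sp": 87.54}
FLY_AFTER = {"gene_Sn": 71.02, "gene_Sp": 73.28, "trans_Sn": 48.42,
             "trans_Sp": 65.42, "cds_Sn": 78.43, "cds_Sp": 88.90}
MOUSE_BEFORE = {"gene_Sn": 70.64, "gene_Sp": 38.34, "trans_Sn": 28.70,
                "trans_Sp": 35.26, "cds_Sn": 77.43, "cds_Sp": 82.34}
MOUSE_AFTER = {"gene_Sn": 70.29, "gene_Sp": 66.63, "trans_Sn": 28.55,
               "trans_Sp": 56.33, "cds_Sn": 77.10, "cds_Sp": 92.23}


def deltas(before, after):
    """Difference after - before for each metric, in percentage points."""
    return {metric: round(after[metric] - before[metric], 2) for metric in before}


fly_delta = deltas(FLY_BEFORE, FLY_AFTER)
mouse_delta = deltas(MOUSE_BEFORE, MOUSE_AFTER)

for name, delta in (("D. melanogaster", fly_delta), ("Mus musculus", mouse_delta)):
    print(name)
    for metric, value in delta.items():
        print(f"  {metric}\t{value:+.2f}")
```

For Mus musculus, gene-level specificity rises by 28.29 percentage points while gene-level sensitivity drops by 0.35; for D. melanogaster the corresponding changes are +2.19 and -0.05.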

Acknowledgement

We thank Tolman et al. for describing this very simple but highly effective idea!

18 Sep 11:30

KatharinaHoff

v1.0.9

91a1d42

Debugged accuracy evaluation, improved training gene selection

@tomasbruna changed miniprothint to additionally output only the best gene per locus (instead of several); these genes are now used as training genes in GALBA
debugged automated accuracy evaluation

Contributors

tomasbruna

15 Sep 15:03

KatharinaHoff

v1.0.8

cfcbe97

Improved training gene selection

@tomasbruna extended miniprothint to output training genes for GALBA. His implementation is much better than the original implementation in GALBA, so GALBA now uses this miniprothint functionality
@tomasbruna also improved the specificity of hints; it should now be safer to use proteins from more distantly related species (large-scale accuracy tests are still pending)
galba_cleanup has been ported to Python (no change in functionality)

Contributors

tomasbruna

15 May 14:22

KatharinaHoff

v1.0.7

21ef2ad

Fixing redundant augargs

Related to this issue: #32 (comment)
I fixed a bug so that augargs are no longer passed twice to pygustus when running AUGUSTUS in ab initio mode

30 Mar 08:51

KatharinaHoff

v1.0.6

f96a026

Alternative transcript prediction restored

The key difference from the previous release is a bugfix that restores the prediction of alternative transcripts when evidence for them is present

29 Mar 14:56

KatharinaHoff

v1.0.5

4c05e45

Running miniprot only once

What's Changed

Run miniprot only once by @tomasbruna in #25

Full Changelog: v1.0.4...v1.0.5

Contributors

tomasbruna

15 Mar 20:03

KatharinaHoff

v1.0.4

82a8276

Better jsonfile protection

the pygustus jsonfile is now locked during fixing, which makes it safe to run multiple GALBA processes in parallel
usexisting disappears from instructions in case of error
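
The general technique of locking a JSON file while it is being rewritten can be sketched as follows. This is only an illustration in Python using an advisory POSIX lock (fcntl.flock, Unix only); how GALBA itself implements the lock is not described here, and the function name update_json_locked is hypothetical.

```python
import fcntl
import json


def update_json_locked(path, fix):
    """Apply fix(data) to the JSON file at path while holding an exclusive lock."""
    with open(path, "r+") as handle:
        # Blocks until no other cooperating process holds the lock
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            data = json.load(handle)
            fixed = fix(data)
            handle.seek(0)
            json.dump(fixed, handle, indent=2)
            handle.truncate()  # drop leftover bytes if the new content is shorter
            handle.flush()     # write buffered data before the lock is released
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Because the lock is advisory, it only protects against other processes that acquire the same lock before touching the file.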

12 Mar 10:34

KatharinaHoff

v1.0.3

30fd4f3

pygustus json config file fix

automatically update an outdated (typo-containing) json file with pygustus and augustus parameters in $AUGUSTUS_CONFIG_PATH/parameters/ if necessary
redirect miniprot stderr output to file
catch star-containing lines in miniprot output
bugfix in miniprothint (for the case of only 1 reference proteome/coverage 1; the actual fix is in the miniprothint repository)

08 Mar 12:38

KatharinaHoff

v1.0.2

b7176fa

miniprothint integration

initial miniprothint integration boosts accuracy
iterative training boosts accuracy
runtime is much worse than in the previous release
measures to speed up GALBA will be taken in the future

Releases: Gaius-Augustus/GALBA