Revising trycycler and select assembly implementations #61

Open
wants to merge 35 commits into
base: main
e2d4cc9
MAINT: Refactor contig counts
fredjaya Oct 3, 2024
976a6ad
MAINT: temp fix for trycycler_cluster input and ch
fredjaya Oct 3, 2024
efbe12a
STY: groovy and comment formatting
fredjaya Oct 4, 2024
ccc9f97
MAINT: Re-add contig count after trycycler_cluster
fredjaya Oct 4, 2024
1d0fa71
ENH: medaka polishes de novo assemblies
fredjaya Oct 4, 2024
2d9ee91
STY: rename "combined_assembly" to "denovo"
fredjaya Oct 4, 2024
cf58c6c
WIP: quast updates
fredjaya Oct 4, 2024
4e4732e
DEV: assembly qc publishdir per barcode
fredjaya Oct 8, 2024
3942568
ENH: add busco for assembly qc
fredjaya Oct 9, 2024
a88288a
DEV: Save progress for new trycycler partition
fredjaya Oct 10, 2024
2ab75c9
MAINT: Improve trycycler_cluster and refactor outs
fredjaya Oct 10, 2024
f5167cc
MAINT: Simplify reconcile
fredjaya Oct 11, 2024
71b58a8
MAINT: Tidying reconcile
fredjaya Oct 11, 2024
320ee92
MAINT: Add tidied msa implementation
fredjaya Oct 11, 2024
f941394
MAINT: new trycycler partition WIP
fredjaya Oct 15, 2024
033be77
MAINT: Fix partition and reconcile implementations
fredjaya Oct 17, 2024
f579462
MAINT: add trycycler_consensus_new
fredjaya Oct 18, 2024
6c985e5
MAINT: Update consensus and denovo polishing
fredjaya Oct 22, 2024
99ad57a
MAINT: forgot module
fredjaya Oct 22, 2024
029a169
DEV: Now outputs QC for all assemblies!
fredjaya Oct 22, 2024
65a879c
DEV: Concat per-cluster consensus fastas + tidying
fredjaya Oct 29, 2024
3a98a58
MAINT: No longer output medaka subfolders
fredjaya Oct 29, 2024
7b2c3a0
DEV: Add temp fix patch file
fredjaya Oct 29, 2024
094789e
BUG: Fix concat due to file name collision
fredjaya Oct 30, 2024
03bce1e
WIP: Add new select assembly based on busco
fredjaya Oct 30, 2024
17b8245
DEV: Update patch
fredjaya Oct 30, 2024
62f1fae
ENH: Selects assembly according to buscos
fredjaya Oct 31, 2024
29642b6
WIP/STY: Delete old implementations
fredjaya Oct 31, 2024
0466460
MAINT: Improve select assembly caching, tidy annotation processes
fredjaya Oct 31, 2024
39d69aa
MAINT: Tidy chrom. annotation modules and formats
fredjaya Oct 31, 2024
a093eb9
DEV: Replace trycycler and medaka impl.
fredjaya Oct 31, 2024
13583fa
MAINT: select assembly and multiqc fixes
fredjaya Nov 1, 2024
9a76a77
It works!
fredjaya Nov 1, 2024
2e9e869
Merge branch 'main' into issue-54
fredjaya Nov 1, 2024
85bb83a
Change publish mode for results to copy
georgiesamaha Nov 11, 2024
35 changes: 35 additions & 0 deletions bin/compare_busco.py
@@ -0,0 +1,35 @@
#!/usr/bin/env python3

import argparse
import json
from typing import Tuple

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "json_files",
        metavar="FILE",
        type=str,
        nargs="+",
        help="Paths to one or more BUSCO JSON files"
    )
    return parser.parse_args()

def get_complete_buscos(json_file) -> Tuple[str, float]:
    """
    Get the assembler name and complete BUSCO % from a json
    """
    with open(json_file, 'r') as f:
        j = json.load(f)
    assembler = j['parameters']['out']
    # get assembler only from e.g. barcode01_consensus_busco
    assembler = assembler.split('_')[1]
    assert assembler in ['flye', 'unicycler', 'consensus']
    complete_busco = j['results']['Complete percentage']
    return assembler, complete_busco

if __name__ == "__main__":
    args = parse_args()
    results = [get_complete_buscos(file) for file in args.json_files]
    highest_busco = max(results, key=lambda x: x[1])
    print(highest_busco[0], end="")  # don't print newline
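The selection logic in bin/compare_busco.py can be tried out without running BUSCO. The sketch below is a rough illustration, not part of the PR: it writes three mock JSON files that mirror only the two keys the script reads (`parameters.out` and `results["Complete percentage"]`), then picks the assembler with the highest value. The percentages are invented, real BUSCO summaries contain many more fields, and `select_assembler` is a hypothetical helper name introduced here for convenience.

```python
import json
import tempfile
from pathlib import Path
from typing import Tuple


def get_complete_buscos(json_file) -> Tuple[str, float]:
    """Return (assembler, complete BUSCO %) from one BUSCO JSON summary."""
    with open(json_file, "r") as f:
        j = json.load(f)
    # Assumes output names shaped like "barcode01_consensus_busco",
    # so the assembler is the second underscore-separated field.
    assembler = j["parameters"]["out"].split("_")[1]
    return assembler, j["results"]["Complete percentage"]


def select_assembler(json_files) -> str:
    """Hypothetical helper: name of the assembler with the highest complete BUSCO %."""
    results = [get_complete_buscos(p) for p in json_files]
    return max(results, key=lambda x: x[1])[0]


# Invented percentages, for illustration only.
mock = {
    "barcode01_flye_busco": 97.5,
    "barcode01_unicycler_busco": 98.1,
    "barcode01_consensus_busco": 98.1,
}

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for out_name, pct in mock.items():
        p = Path(tmp) / f"{out_name}.json"
        payload = {
            "parameters": {"out": out_name},
            "results": {"Complete percentage": pct},
        }
        p.write_text(json.dumps(payload))
        paths.append(p)

    winner = select_assembler(paths)
    reversed_winner = select_assembler(list(reversed(paths)))

print(winner)
print(reversed_winner)
```

Two behaviours are worth noting. Python's `max` returns the first maximal element it encounters, so when two assemblies tie on complete BUSCO percentage the result depends on the order in which the files are passed. Also, the `split('_')[1]` step assumes the barcode or sample name itself contains no underscore; a name with one would shift the assembler to a later field.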