* add help
* add config
* add run script
* add test data and expected output + script to fetch them
* add test
* update changelog
* handle input --gff has multiple=true
* cleanup config
* add direction for input arguments
* update config: add requirements, add keywords, update --config description
* remove unset IFS
* add set -eo pipefail to script and test files
* create temporary directory and clean up on exit
* cleanup changelog
* Update CHANGELOG.md

---------

Co-authored-by: Robrecht Cannoodt <[email protected]>
Showing 8 changed files with 1,203 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
name: agat_sq_stat_basic | ||
namespace: agat | ||
description: | | ||
The script aims to provide basic statistics of a gtf/gff file. | ||
keywords: [gene annotations, gff, statistics] | ||
links: | ||
homepage: https://github.com/NBISweden/AGAT | ||
documentation: https://agat.readthedocs.io/en/latest/tools/agat_sq_stat_basic.html | ||
issue_tracker: https://github.com/NBISweden/AGAT/issues | ||
repository: https://github.com/NBISweden/AGAT | ||
references: | ||
doi: 10.5281/zenodo.3552717 | ||
license: GPL-3.0 | ||
requirements: | ||
- commands: [agat] | ||
authors: | ||
- __merge__: /src/_authors/leila_paquay.yaml | ||
roles: [ author, maintainer ] | ||
argument_groups: | ||
- name: Inputs | ||
arguments: | ||
- name: --gff | ||
alternatives: [-i, --file, --input] | ||
description: | | ||
Input GTF/GFF file. | ||
type: file | ||
required: true | ||
multiple: true | ||
direction: input | ||
example: input.gff | ||
- name: --genome_size | ||
alternatives: [-g] | ||
description: | | ||
Genome size, used to calculate the percentage of the genome represented by each feature type. Provide an integer here, or pass a FASTA file with --genome_size_fasta instead. If both are provided, only the value of --genome_size is used. | ||
type: integer | ||
required: false | ||
direction: input | ||
example: 10000 | ||
- name: --genome_size_fasta | ||
description: | | ||
Genome in FASTA format, used to calculate the percentage of the genome represented by each feature type. The genome size is calculated on the fly from the FASTA file. Alternatively, pass the size directly as an integer with --genome_size. If both are provided, only the value of --genome_size is used. | ||
type: file | ||
required: false | ||
direction: input | ||
example: genome.fasta | ||
- name: Outputs | ||
arguments: | ||
- name: --output | ||
alternatives: [-o] | ||
description: | | ||
Output file. The result is in tabular format. | ||
type: file | ||
direction: output | ||
required: true | ||
example: output.txt | ||
- name: Arguments | ||
arguments: | ||
- name: --inflate | ||
description: | | ||
Inflate the statistics to account for features with | ||
multiple parents. To avoid redundancy, some GFF files | ||
factorize identical features, e.g. one exon used in two | ||
different isoforms is defined only once and has | ||
multiple parents. By default the script counts such a feature only | ||
once. With the inflate option, the feature and | ||
its size are counted once per parent. | ||
type: boolean_true | ||
- name: --config | ||
alternatives: [-c] | ||
description: | | ||
AGAT config file. By default AGAT takes the original agat_config.yaml shipped with AGAT. The `--config` option gives you the possibility to use your own AGAT config file (located elsewhere or named differently). | ||
type: file | ||
required: false | ||
example: custom_agat_config.yaml | ||
resources: | ||
- type: bash_script | ||
path: script.sh | ||
test_resources: | ||
- type: bash_script | ||
path: test.sh | ||
- type: file | ||
path: test_data | ||
engines: | ||
- type: docker | ||
image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0 | ||
setup: | ||
- type: docker | ||
run: | | ||
agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt | ||
runners: | ||
- type: executable | ||
- type: nextflow |
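The `setup` step in the config above records the tool version by rewriting the output of `agat --version` with `sed`. A minimal sketch of that rewrite follows. It assumes the version command prints a line of the form `AGAT v1.4.0` (the sample string is an assumption, not captured output) and that GNU sed is in use, since `\s` is a GNU extension. The helper name `format_version` is hypothetical.

```shell
#!/bin/bash
set -eo pipefail

# Hypothetical helper: turn a line such as "AGAT v1.4.0" into a YAML-style
# entry, using the same sed expression as the docker setup step.
format_version() {
  sed 's/AGAT\s\(.*\)/agat: "\1"/'
}

# Assumed sample input; the real output of `agat --version` may differ.
echo "AGAT v1.4.0" | format_version
```

Lines that do not start with `AGAT` followed by whitespace pass through unchanged, so it is worth checking the written file if the version banner format ever changes.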
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
```sh | ||
agat_sq_stat_basic.pl --help | ||
``` | ||
|
||
------------------------------------------------------------------------------ | ||
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 | | ||
| https://github.com/NBISweden/AGAT | | ||
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se | | ||
------------------------------------------------------------------------------ | ||
|
||
|
||
Name: | ||
agat_sq_stat_basic.pl | ||
|
||
Description: | ||
The script aims to provide basic statistics of a gtf/gff file. | ||
|
||
Usage: | ||
agat_sq_stat_basic.pl -i <input file> [-g <integer or fasta> -o <output file>] | ||
agat_sq_stat_basic.pl --help | ||
|
||
Options: | ||
-i, --gff, --file or --input | ||
STRING: Input GTF/GFF file. Several files can be processed at | ||
once: -i file1 -i file2 | ||
|
||
-g, --genome | ||
That input is design to know the genome size in order to | ||
calculate the percentage of the genome represented by each kind | ||
of feature type. You can provide an INTEGER or the genome in | ||
fasta format. If you provide the fasta, the genome size will be | ||
calculated on the fly. | ||
|
||
--inflate | ||
Inflate the statistics taking into account feature with | ||
multi-parents. Indeed to avoid redundant information, some gff | ||
factorize identical features. e.g: one exon used in two | ||
different isoform will be defined only once, and will have | ||
multiple parent. By default the script count such feature only | ||
once. Using the inflate option allows to count the feature and | ||
its size as many time there are parents. | ||
|
||
-o or --output | ||
STRING: Output file. If no output file is specified, the output | ||
will be written to STDOUT. The result is in tabulate format. | ||
|
||
-c or --config | ||
String - Input agat config file. By default AGAT takes as input | ||
agat_config.yaml file from the working directory if any, | ||
otherwise it takes the orignal agat_config.yaml shipped with | ||
AGAT. To get the agat_config.yaml locally type: "agat config | ||
--expose". The --config option gives you the possibility to use | ||
your own AGAT config file (located elsewhere or named | ||
differently). | ||
|
||
--help or -h | ||
Display this helpful text. | ||
|
||
Feedback: | ||
Did you find a bug?: | ||
Do not hesitate to report bugs to help us keep track of the bugs and | ||
their resolution. Please use the GitHub issue tracking system available | ||
at this address: | ||
|
||
https://github.com/NBISweden/AGAT/issues | ||
|
||
Ensure that the bug was not already reported by searching under Issues. | ||
If you're unable to find an (open) issue addressing the problem, open a new one. | ||
Try as much as possible to include in the issue when relevant: | ||
- a clear description, | ||
- as much relevant information as possible, | ||
- the command used, | ||
- a data sample, | ||
- an explanation of the expected behaviour that is not occurring. | ||
|
||
Do you want to contribute?: | ||
You are very welcome, visit this address for the Contributing | ||
guidelines: | ||
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md |
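The `-g` option in the help text above supplies the genome size so that each feature type can be reported as a percentage of the genome. The arithmetic can be sketched roughly as below. This is not AGAT's implementation: it simply sums feature lengths per type from GFF columns 3 to 5 and divides by the genome size, without merging overlapping features. The function name `feature_percentages` and the sample annotation are made up for illustration.

```shell
#!/bin/bash
set -eo pipefail

# Hypothetical helper, not part of AGAT: print, per feature type, the feature
# count, the summed length, and that length as a percentage of the genome size.
feature_percentages() {
  local genome_size="$1" gff="$2"
  awk -F'\t' -v gs="$genome_size" '
    /^#/ || NF < 5 { next }                 # skip comments and short lines
    { n[$3]++; len[$3] += $5 - $4 + 1 }     # GFF coordinates are 1-based, inclusive
    END {
      for (t in n) printf "%s\t%d\t%d\t%.2f\n", t, n[t], len[t], 100 * len[t] / gs
    }
  ' "$gff" | sort
}

# Tiny made-up annotation, for illustration only.
demo_gff=$(mktemp)
printf 'chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tID=g1\n' > "$demo_gff"
printf 'chr1\tsrc\texon\t1\t200\t.\t+\t.\tID=e1;Parent=g1\n' >> "$demo_gff"
printf 'chr1\tsrc\texon\t501\t800\t.\t+\t.\tID=e2;Parent=g1\n' >> "$demo_gff"

feature_percentages 10000 "$demo_gff"
rm -f "$demo_gff"
```

With a genome size of 10000, the two exons (200 and 300 bases) sum to 500 bases, or 5.00 percent, and the gene covers 10.00 percent.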
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
#!/bin/bash | ||
|
||
set -eo pipefail | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
# unset flags | ||
[[ "$par_inflate" == "false" ]] && unset par_inflate | ||
|
||
# Convert the semicolon-separated list of file names to multiple --gff arguments | ||
# (an array keeps file names containing spaces intact) | ||
input_files=() | ||
IFS=";" read -ra file_names <<< "$par_gff" | ||
for file in "${file_names[@]}"; do | ||
input_files+=(--gff "$file") | ||
done | ||
|
||
# take care of --genome (can originally be either a fasta file or an integer) | ||
if [[ -n "$par_genome_size" ]]; then | ||
genome_arg=$par_genome_size | ||
elif [[ -n "$par_genome_size_fasta" ]]; then | ||
genome_arg=$par_genome_size_fasta | ||
fi | ||
|
||
# run agat_sq_stat_basic.pl | ||
agat_sq_stat_basic.pl \ | ||
"${input_files[@]}" \ | ||
${genome_arg:+--genome "${genome_arg}"} \ | ||
--output "${par_output}" \ | ||
${par_inflate:+--inflate} \ | ||
${par_config:+--config "${par_config}"} |
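The wrapper above receives multiple `--gff` values as a single semicolon-separated string in `par_gff` and passes optional flags only when they are set. A minimal sketch of that argument handling, using a bash array so that file names with spaces survive, is shown here. The helper `build_args` and all sample values are hypothetical, and the command is printed rather than executed, so AGAT is not needed to try it.

```shell
#!/bin/bash
set -eo pipefail

# Hypothetical helper: fill the global array `args` from Viash-style par_*
# variables, mirroring the wrapper's logic without running AGAT.
build_args() {
  args=()
  local file_names file
  IFS=";" read -ra file_names <<< "$par_gff"
  for file in "${file_names[@]}"; do
    args+=(--gff "$file")
  done
  # --genome_size takes precedence over --genome_size_fasta
  local genome_arg="${par_genome_size:-$par_genome_size_fasta}"
  if [[ -n "$genome_arg" ]]; then args+=(--genome "$genome_arg"); fi
  args+=(--output "$par_output")
  if [[ "$par_inflate" == "true" ]]; then args+=(--inflate); fi
  if [[ -n "$par_config" ]]; then args+=(--config "$par_config"); fi
}

# Made-up values, for illustration only.
par_gff="a.gff;my file.gff"
par_genome_size=""
par_genome_size_fasta="genome.fasta"
par_output="out.txt"
par_inflate="false"
par_config=""

build_args
# Print the command with shell quoting instead of executing it.
printf '%q ' agat_sq_stat_basic.pl "${args[@]}"; echo
```

The `if` statements are used instead of `[[ ... ]] && ...` so that a false condition at the end of the function does not produce a non-zero return status under `set -e`.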
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
#!/bin/bash | ||
|
||
set -eo pipefail | ||
|
||
## VIASH START | ||
## VIASH END | ||
|
||
test_dir="${meta_resources_dir}/test_data" | ||
|
||
# create temporary directory and clean up on exit | ||
TMPDIR=$(mktemp -d "$meta_temp_dir/$meta_name-XXXXXX") | ||
function clean_up { | ||
[[ -d "$TMPDIR" ]] && rm -rf "$TMPDIR" | ||
} | ||
trap clean_up EXIT | ||
|
||
|
||
echo "> Run $meta_name with test data" | ||
"$meta_executable" \ | ||
--gff "$test_dir/1.gff" \ | ||
--output "$TMPDIR/output.txt" | ||
|
||
echo ">> Checking output" | ||
[ ! -f "$TMPDIR/output.txt" ] && echo "Output file output.txt does not exist" && exit 1 | ||
|
||
echo ">> Check if output is empty" | ||
[ ! -s "$TMPDIR/output.txt" ] && echo "Output file output.txt is empty" && exit 1 | ||
|
||
echo ">> Check if output matches expected output" | ||
if ! diff "$TMPDIR/output.txt" "$test_dir/agat_sq_stat_basic_1.gff"; then | ||
echo "Output file output.txt does not match expected output" | ||
exit 1 | ||
fi | ||
|
||
echo "> Test successful" |
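The test above combines a temporary directory cleaned up by an `EXIT` trap with a `diff` against the expected output. Under `set -e`, a bare `diff` that finds differences ends the script before any later `$?` check runs, so the comparison is best placed directly in the `if` condition. A minimal sketch of the pattern, with made-up file contents and a hypothetical helper `compare_files`:

```shell
#!/bin/bash
set -eo pipefail

# Hypothetical helper: report a mismatch without letting `set -e` abort first.
compare_files() {
  if ! diff "$1" "$2" > /dev/null; then
    echo "Output file $1 does not match expected output"
    return 1
  fi
}

# Temporary working directory, removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up file contents, for illustration only.
printf 'a\tb\n' > "$workdir/output.txt"
printf 'a\tb\n' > "$workdir/expected.txt"

compare_files "$workdir/output.txt" "$workdir/expected.txt" && echo "files match"
```

Because the trap fires on every exit path, the temporary directory is removed whether the comparison succeeds or fails.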