Beginners Guide
This guide assumes you have already installed BuddySuite. There is also a much more comprehensive tutorial once you are comfortable with the very basics.
You can print a help file for each module with the -h flag:
$: seqbuddy -h
$: alignbuddy -h
$: phylobuddy -h
$: databasebuddy -h
$: buddysuite -h
This will list all of the available functions and some basic usage instructions. For a more detailed description of any particular function, please refer to the appropriate wiki page.
Short form aliases are used for the BuddySuite tools in the following usage examples, as explained in the Installation Guide.
The BuddySuite modules are used in the following fashion:
$: sb "path to file" <command> <arguments>
For instance, the command
$: sb Mnemiopsis_cds.fa -tr
will take DNA sequences from Mnemiopsis_cds.fa and translate them to amino acid sequences, outputting the results to the terminal window.
To save the output to a file, use the unix redirect operator:
$: sb Mnemiopsis_cds.fa -tr > Mnemiopsis_pep.fa
You could then calculate the molecular weight of the newly translated sequences with the -mw flag:
$: sb Mnemiopsis_pep.fa -mw
If you do not need the intermediate protein sequences, however, you can simply pipe the output of one seqbuddy call into another using the '|' character:
$: sb Mnemiopsis_cds.fa -tr | sb -mw
By chaining commands and tools using pipes, you can build sophisticated pipelines without worrying about format conversion or storing unnecessary intermediate files:
$: sb Mnemiopsis_cds.fa -tr | alb -ga clustalo | alb -trm gappyout | pb -gt raxml -o nex > Mnemiopsis_tree.nex
The previous sequence of commands uses SeqBuddy to translate DNA sequences into amino acid sequences; AlignBuddy to generate a multiple sequence alignment with ClustalOmega and then clean that alignment with the gappy-out algorithm (originally popularized by trimAl); and PhyloBuddy to infer a phylogenetic tree with RAxML. The final tree is then converted to the nexus format with the -o (out_format) flag and redirected to a file.
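When a long pipeline misbehaves, it can help to run the same steps one at a time and keep the intermediate files for inspection. The following is only a sketch of that idea: the intermediate file names are hypothetical, and it assumes the short-form commands (sb, alb, pb) are on your PATH and that Clustal Omega and RAxML are installed. If anything is missing, it skips the run instead of failing.

```shell
# Input and final output names come from the examples above;
# the intermediate file names are hypothetical.
cds="Mnemiopsis_cds.fa"

if command -v sb >/dev/null 2>&1 && command -v alb >/dev/null 2>&1 \
   && command -v pb >/dev/null 2>&1 && [ -f "$cds" ]; then
    # 1. Translate DNA to protein.
    sb "$cds" -tr > Mnemiopsis_pep.fa
    # 2. Align the protein sequences with Clustal Omega.
    alb Mnemiopsis_pep.fa -ga clustalo > Mnemiopsis_aln.fa
    # 3. Trim the alignment with the gappy-out algorithm.
    alb Mnemiopsis_aln.fa -trm gappyout > Mnemiopsis_trimmed.fa
    # 4. Infer a tree with RAxML and write it in nexus format.
    pb Mnemiopsis_trimmed.fa -gt raxml -o nex > Mnemiopsis_tree.nex
    status="ran"
else
    echo "Skipping: sb, alb, pb, or $cds not found" >&2
    status="skipped"
fi
```

Each step is one BuddySuite call with one function, which is exactly what the pipe version does without writing the intermediate files to disk.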
Please note that you cannot combine commands in a single call; each command needs its own call, joined by the pipe character. For example,
$: alb align.fa -ga -trm
would only do the alignment, and not the trimming.
BuddySuite is a memory hog. If you feed it a file, it reads that entire file into memory, along with extra overhead for each record. Depending on the function you are using, it may also make a couple of full copies of all those records. So yes, a memory hog; things will get hairy if you try to pass it 15 gigabases of sequencing data. If you are working with FASTQ format, however, there's a handy GNU tool called Parallel that can help you out. Because each record in a FASTQ file is exactly 4 lines, the file can be split up into more manageable chunks and each chunk processed separately:
$: cat hiseq_reads.fq | parallel -j 1 -k --pipe -N 4000 seqbuddy --in_silico_digest SphI MluCI > res_dig.fq
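To see why a chunk size that is a multiple of 4 lines keeps FASTQ records intact, here is a minimal sketch that uses only standard Unix tools. The file name demo_reads.fq and the read contents are made up for illustration, and split merely stands in for the chunking that Parallel does on the fly; no BuddySuite call is involved.

```shell
# Build a tiny, made-up FASTQ file with 6 records (24 lines).
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
    printf '@read%s\nACGTACGT\n+\nIIIIIIII\n' "$i"
done > "$tmpdir/demo_reads.fq"

# Split into chunks of 8 lines (2 records each). Because 8 is a
# multiple of 4, no record is cut in half.
split -l 8 "$tmpdir/demo_reads.fq" "$tmpdir/chunk_"

# Report the size and first character of every chunk; each one should
# hold whole records and begin with an '@' header line.
for chunk in "$tmpdir"/chunk_*; do
    lines=$(($(wc -l < "$chunk")))
    first=$(head -c 1 "$chunk")
    echo "$(basename "$chunk"): $lines lines, starts with '$first'"
done
```

A chunk size that is not a multiple of 4 would leave some chunks starting in the middle of a record, which a FASTQ parser cannot read. Remove the temporary directory with rm -r "$tmpdir" when you are done.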
Please see the Tutorial for a detailed workflow example, and complete listings of all available BuddySuite functions can be found on their respective wiki pages.