AB Generate alignment
Generate a multiple sequence alignment using third-party alignment tools. Basic default parameters are built into the wrapper for 'quick-and-dirty' alignments with any of the supported tools, or you can specify further parameters as desired. All necessary format conversions are handled by AlignBuddy, and the output is returned in the same format as the input (unless overridden with the -o flag, as is normal). This is particularly useful when aligning sequences in a richly annotated format like GenBank, because the annotations are re-mapped back onto the new alignment at the end of the job.
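To illustrate the annotation re-mapping described above, here is a minimal Python sketch of how feature coordinates can be shifted onto a gapped alignment. The function names are hypothetical and this is not AlignBuddy's actual implementation; it only shows the idea of counting residues while walking along the aligned sequence. In the first example on this page, the TMD2 feature of Mle-Panxα3 moves from 132..152 to 134..154 because two gap columns are inserted upstream of it.

```python
def ungapped_to_aligned(aligned_seq, gap_chars="-"):
    """Map 1-based ungapped positions to 1-based alignment columns."""
    mapping = {}
    residue_index = 0
    for column, char in enumerate(aligned_seq, start=1):
        if char not in gap_chars:
            # Only real residues advance the ungapped coordinate.
            residue_index += 1
            mapping[residue_index] = column
    return mapping


def remap_feature(start, end, aligned_seq):
    """Shift a 1-based inclusive feature (start, end) onto alignment columns."""
    mapping = ungapped_to_aligned(aligned_seq)
    return mapping[start], mapping[end]


# Toy aligned row: one leading gap and a two-column internal gap.
toy_row = "-mk--fa"
print(remap_feature(1, 2, toy_row))  # (2, 3)
print(remap_feature(3, 4, toy_row))  # (6, 7)
```

A feature that spans a gap simply becomes longer in alignment coordinates, since only its two end points are looked up.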
As the job runs, any output the tool normally generates will be streamed to stderr for your reference (suppressible with the '-q' flag). If the program generates files as part of its normal operation, these are sent to a temporary directory and deleted once AlignBuddy finishes the job. To save these files, specify a directory with the '-k' flag (example 4).
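The temporary-directory behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not AlignBuddy's own code: `run_in_temp_dir` and `fake_aligner` are hypothetical names, and the stand-in job just writes two small files where a real alignment tool would write its own.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_temp_dir(job, keep_dir=None):
    """Run job(work_dir) in a temporary directory; optionally keep its files.

    `job` is any callable that writes files into the directory it is given.
    """
    with tempfile.TemporaryDirectory() as tmp:
        work_dir = Path(tmp)
        result = job(work_dir)
        if keep_dir is not None:
            # Copy everything the job wrote before the temp dir is deleted.
            shutil.copytree(work_dir, Path(keep_dir), dirs_exist_ok=True)
    return result


def fake_aligner(work_dir):
    # Stand-in for an external aligner that leaves files behind.
    (work_dir / "tmp.fa").write_text(">seq1\nMK\n")
    (work_dir / "result").write_text(">seq1\nMK\n")
    return sorted(p.name for p in work_dir.iterdir())
```

Calling `run_in_temp_dir(fake_aligner)` leaves nothing behind, while passing `keep_dir` plays the role of the '-k' flag and preserves the files after the job ends. `dirs_exist_ok` requires Python 3.8 or later.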
The alignment programs listed below are currently supported by AlignBuddy. The default binary names that AlignBuddy will search for in your PATH are the exact names listed below, except all in lower case (e.g., 'mafft', instead of 'MAFFT'). If your version of the software has a different name or is not in your system PATH, explicitly set the name or path as the first positional argument.
Note that the binaries for these programs are not included with BuddySuite, so they must be installed separately. Let us know if you regularly use a non-supported tool, because we can probably add support for it!
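The PATH lookup described above can be approximated with the standard library, as in the sketch below. The candidate list contains only the lower-cased names of tools that appear in this page's examples; the complete supported list and the order in which AlignBuddy actually searches are assumptions here, and `find_aligner` is a hypothetical helper.

```python
import shutil

# Lower-cased binary names taken from the examples on this page.
# The real supported list and search order may differ (assumption).
CANDIDATE_BINARIES = ["mafft", "prank", "clustalomega", "clustalw2"]


def find_aligner(explicit=None, candidates=CANDIDATE_BINARIES, which=shutil.which):
    """Return the path of the aligner to run, or None if nothing is found."""
    if explicit is not None:
        # A name or path supplied by the user takes priority over the search.
        return which(explicit)
    for name in candidates:
        path = which(name)
        if path is not None:
            return path  # First tool detected wins.
    return None
```

The `which` parameter exists only so the lookup can be exercised without any aligner installed; by default it is `shutil.which`, which accepts either a bare name to search for in PATH or an explicit path to a binary.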
Optional. If not set, AlignBuddy will try to find an alignment program on your system and will execute the first one it detects. Otherwise, specify the name of the alignment tool in your PATH or the path to the binary on your system; the actual name of the program is not important, as AlignBuddy will determine which program you are calling automatically.
Optional. Each alignment tool accepts many optional parameters of its own (see the tool's documentation for details). This argument injects additional parameters into the final call that AlignBuddy makes to the wrapped program. It can only be used if an alignment tool is specified as the first argument. Enclose all tool-specific arguments in double quotes so AlignBuddy doesn't try to interpret them itself. See example 3 for a demonstration of the proper syntax.
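Because the quoted parameters reach the wrapper as a single string, they have to be split back into separate tokens before the external program is called. The sketch below shows one way to do that with `shlex.split`; `build_tool_call` is a hypothetical helper, and how AlignBuddy really assembles its command line is not documented on this page.

```python
import shlex


def build_tool_call(tool, extra_params=None):
    """Return an argument list: the tool followed by any user-supplied extras."""
    command = [tool]
    if extra_params:
        # shlex.split honours shell-style quoting inside the string.
        command += shlex.split(extra_params)
    return command


print(build_tool_call("clustalomega", "--iter=2"))
# ['clustalomega', '--iter=2']
```

Passing the result as a list to `subprocess.run` avoids a second round of shell interpretation of the user's parameters.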
LOCUS Mle-Panxα3 200 aa UNA 02-JAN-2015
DEFINITION cDNA - ML036514a.
ACCESSION Mle-Panxα3
VERSION Mle-Panxα3
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(1..50,51..111,112..152,153..183,184..200)
/created_by="User"
/label="ML036514a"
/modified_by="User"
TMD1 29..49
TMD2 132..152
ORIGIN
1 mlllgslgti knlsifkdls lddwldqmnr tfmflllcfm gtivavsqyt gkniscdgft
61 kfgedfsqdy cwtqglytik eaydlpesqi pypgiipenv pacrehalkn ggkivcpped
121 qvkpltrarh lwyqwipfyf wviapvfylp ymfvkrmgld rmkpllkims dyyhcttetp
181 seeiivkcad wvynsivdrl
//
LOCUS Mle-Panxα4 200 aa UNA 02-JAN-2015
DEFINITION cDNA and genomic - ML129317a.
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
TMD1 28..48
TMD2 131..151
ORIGIN
1 mviellagyk glspfkdatv ddswdqinrc yvfiamvvmg avttmrqysg tliacdgftk
61 fhpqfaedyc wsigmytvre aydlpssmva ypgvipwdmp acvprllkng trtkcgsekd
121 vmpsekiyhl wyqwasfyfw ivailyyapy imfkqlggge ykplikllcl asgspeqqmq
181 diqervvkwl ffrfktyifa
//
LOCUS Mle-Panxα6 200 aa UNA 02-JAN-2015
DEFINITION cDNA - ML25993a.
ACCESSION Mle-Panxα6
VERSION Mle-Panxα6
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(1..42,43..92,93..125,126..171,172..200)
/created_by="User"
/label="ML25993"
/modified_by="User"
TMD1 28..48
TMD2 131..151
ORIGIN
1 mlleilanfk gatpfkeivl ddkwdqinrc ymfllcvifg tvvtfrqytg giiacdgltk
61 fsaafaedyc wtqglytike aydivdnslp ypgllpedap pclsrrlvsg griecppadl
121 yleptrvhht wyqwipfyfw visiafigpy ivykqlgvne lkpilamlhn pvdgddvtkd
181 qiskvsrwla iklnifiqek
//
LOCUS Mle-Panxα5 200 aa UNA 02-JAN-2015
DEFINITION cDNA - ML223536a.
ACCESSION Mle-Panxα5
VERSION Mle-Panxα5
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(1..49,50..94,95..135,136..200)
/created_by="User"
/label="ML223536a"
/modified_by="User"
TMD1 28..48
TMD2 133..153
ORIGIN
1 miywvwavfk rmapfkvvtl ddrwdqmnrs fmmpltmsfa ylidygiiag stikctgfed
61 sfrseafvde ycwtqgiytl reaydlentk ipypgiipeg fpncmpyerw dgmkvecpke
121 eqylkptrvy hlyyqhiqly fwlvctlfyl pymvgiclgf nytkplinll hnpltrdeee
181 lealldkaar slrlrldiys
//
If no arguments are passed in, AlignBuddy will try to find an alignment program on your system. In this example, MAFFT is found.
$: alb Mnemiopsis_Panxs.gb -ga
nseq = 4
distance = ktuples
iterate = 0
cycle = 2
nguidetree = 2
nthread = 0
sueff_global = 0.100000
done.
scoremtx = 1
Gap Penalty = -1.53, +0.00, +0.00
tuplesize = 6, dorp = p
Making a distance matrix ..
1 / 4
done.
Constructing a UPGMA tree ...
0 / 4
done.
Progressive alignment 1/2...
STEP 1 / 3 f
Reallocating..done. *alloclen = 1404
STEP 3 / 3 d
done.
Constructing a UPGMA tree ...
0 / 4
done.
Progressive alignment 2/2...
STEP 1 / 3 f
Reallocating..done. *alloclen = 1404
STEP 3 / 3 d
done.
disttbfast (aa) Version 7.186 alg=A, model=BLOSUM62, 1.53, -0.00, -0.00, noshift, amax=0.0
0 thread(s)
Strategy:
FFT-NS-2 (Fast but rough)
Progressive method (guide trees were built 2 times.)
If unsure which option to use, try 'mafft --auto input > output'.
For more information, see 'mafft --help', 'mafft --man' and the mafft page.
The default gap scoring scheme has been changed in version 7.110 (2013 Oct).
It tends to insert more gaps into gap-rich regions than previous versions.
To disable this change, add the --legacygappenalty option.
Returning to AlignBuddy...
LOCUS Mle-Panxα3 212 aa UNK 01-JAN-1980
DEFINITION
ACCESSION Mle-Panxα3
VERSION Mle-Panxα3
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(1..50,51..113,114..154,155..192,193..209)
/created_by="User"
/label="ML036514a"
/modified_by="User"
TMD1 29..49
TMD2 134..154
ORIGIN
1 mlllgslgti knlsifkdls lddwldqmnr tfmflllcfm gtivavsqyt gkniscdgft
61 k--fgedfsq dycwtqglyt ikeaydlpes qipypgiipe nvpacrehal knggkivcpp
121 edqvkpltra rhlwyqwipf yfwviapvfy lpymfvkrmg ldrmkpllki msdyyhctte
181 tp-------s eeiivkcadw vynsivdrl- --
//
LOCUS Mle-Panxα4 212 aa UNK 01-JAN-1980
DEFINITION
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
TMD1 29..49
TMD2 134..154
ORIGIN
1 -mviellagy kglspfkdat vddswdqinr cyvfiamvvm gavttmrqys gtliacdgft
61 k--fhpqfae dycwsigmyt vreaydlpss mvaypgvipw dmpacvprll kngtrtkcgs
121 ekdvmpseki yhlwyqwasf yfwivailyy apyimfkqlg ggeykplikl lc----lasg
181 sp----eqqm qdiqervvkw lffrfktyif a-
//
LOCUS Mle-Panxα6 212 aa UNK 01-JAN-1980
DEFINITION
ACCESSION Mle-Panxα6
VERSION Mle-Panxα6
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(2..43,44..95,96..128,129..182,183..212)
/created_by="User"
/label="ML25993"
/modified_by="User"
TMD1 29..49
TMD2 134..154
ORIGIN
1 -mlleilanf kgatpfkeiv lddkwdqinr cymfllcvif gtvvtfrqyt ggiiacdglt
61 k--fsaafae dycwtqglyt ikeaydivdn slpypgllpe dappclsrrl vsggriecpp
121 adlyleptrv hhtwyqwipf yfwvisiafi gpyivykqlg vnelkpilam l--------h
181 npv-dgddvt kdqiskvsrw laiklnifiq ek
//
LOCUS Mle-Panxα5 212 aa UNK 01-JAN-1980
DEFINITION
ACCESSION Mle-Panxα5
VERSION Mle-Panxα5
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
CDS order(2..50,51..95,96..136,137..209)
/created_by="User"
/label="ML223536a"
/modified_by="User"
TMD1 29..49
TMD2 134..154
ORIGIN
1 -miywvwavf krmapfkvvt lddrwdqmnr sfmmpltmsf aylidygiia gstikctgfe
61 dsfrseafvd eycwtqgiyt lreaydlent kipypgiipe gfpncmpyer wdgmkvecpk
121 eeqylkptrv yhlyyqhiql yfwlvctlfy lpymvgiclg fnytkplinl l--------h
181 npltrdeeel ealldkaars lrlrldiys- --
//
Use a specific version of PRANK that is not in your system PATH by giving the path to its binary.
$: alb Mnemiopsis_Panxs.gb -ga /path/to/prank_v140603 -o fasta
-----------------
PRANK v.140603:
-----------------
Input for the analysis
- aligning sequences in '/Volumes/Zippy/.sysTemp/tmpfn_qudo0/tmp.fa'
- using inferred alignment guide tree
- option '+F' is not used; it can be enabled with '+F'
- external tools available:
MAFFT for initial alignment
Exonerate for alignment anchoring
BppAncestor for ancestral state reconstruction
Warning: sequence names changed.
Generating multiple alignment: iteration 1.
#3#(3/3): 97% aligned
Alignment score: 341
Generating multiple alignment: iteration 2.
#3#(3/3): 97% aligned
Alignment score: 343
Generating multiple alignment: iteration 3.
#3#(3/3): 99% computed
Alignment score: 343
Generating multiple alignment: iteration 4.
#3#(3/3): 99% computed
Alignment score: 343
Generating multiple alignment: iteration 5.
#3#(3/3): 99% computed
Alignment score: 343
Writing
- alignment to '/Volumes/Zippy/.sysTemp/tmpfn_qudo0/result.best.fas'
Analysis done. Total time 5s
Returning to AlignBuddy...
>Mle-Panxα5_cDNA_-_ML223536a.
-MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAGSTIKCTGFE
DSFRSEAFVDEYCWTQGIYTLREAYDLENTKIPYPGIIPEGFPNCMPYERWDGMKVECPK
EEQYLKPTRVYHLYYQHIQLYFWLVCTLFYLPYMVGICLGFNYTKPLINLLHNPLT-RDE
EELEALLDKAARSLRLRLDIY---S
>Mle-Panxα4_cDNA_and_genomic_-_ML129317a.
-MVIELLAGYKGLSPFKDATVDDSWDQINRCYVFIAMVVMGAVTTMRQYSGTLIACDGFT
KF--HPQFAEDYCWSIGMYTVREAYDLPSSMVAYPGVIPWDMPACVPRLLKNGTRTKCGS
EKDVMPSEKIYHLWYQWASFYFWIVAILYYAPYIMFKQLGGGEYKPLIKLLCLASG-SPE
QQMQDIQERVVKWLFFRFKTYIFA-
>Mle-Panxα6_cDNA_-_ML25993a.
-MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTGGIIACDGLT
KF--SAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAPPCLSRRLVSGGRIECPP
ADLYLEPTRVHHTWYQWIPFYFWVISIAFIGPYIVYKQLGVNELKPILAMLHNPVD-GDD
VTKDQIS-KVSRWLAIKLNIFIQEK
>Mle-Panxα3_cDNA_-_ML036514a.
MLLLGSLGTIKNLSIFKDLSLDDWLDQMNRTFMFLLLCFMGTIVAVSQYTGKNISCDGFT
KF--GEDFSQDYCWTQGLYTIKEAYDLPESQIPYPGIIPENVPACREHALKNGGKIVCPP
EDQVKPLTRARHLWYQWIPFYFWVIAPVFYLPYMFVKRMGLDRMKPLLKIMSDYYHCTTE
TPSEEIIVKCADWVY---NSIVDRL
Pass in extra parameters to further refine your alignment.
$: alb Mnemiopsis_Panxs.gb -ga clustalomega "--iter=2" -o clustal
Using 24 threads
Read 4 sequences (type: Protein) from /Volumes/Zippy/.sysTemp/tmpijrd7orz/tmp.fa
not more sequences (4) than cluster-size (100), turn off mBed
Calculating pairwise ktuple-distances...
Ktuple-distance calculation progress done. CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00
Guide-tree computation done.
Progressive alignment progress done. CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00
Iteration step 1 out of 2
Computing new guide tree (iteration step 1032320)
Calculating pairwise aligned identity distances...
Pairwise identity calculation progress done. CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00
Guide-tree computation done.
Computing HMM from alignment
Progressive alignment progress done. CPU time: 0.06u 0.01s 00:00:00.06 Elapsed: 00:00:00
Iteration step 2 out of 2
Computing new guide tree (iteration step 1032320)
Calculating pairwise aligned identity distances...
Pairwise identity calculation progress done. CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00
Guide-tree computation done.
Computing HMM from alignment
Progressive alignment progress done. CPU time: 0.07u 0.00s 00:00:00.07 Elapsed: 00:00:00
Alignment written to /Volumes/Zippy/.sysTemp/tmpijrd7orz/result
Returning to AlignBuddy...
CLUSTAL X (1.81) multiple sequence alignment
Mle-Panxα3 MLLLGSLGTIKNLSIFKDLSLDDWLDQMNRTFMFLLLCFMGTIVAVSQYT
Mle-Panxα4 -MVIELLAGYKGLSPFKDATVDDSWDQINRCYVFIAMVVMGAVTTMRQYS
Mle-Panxα6 -MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYT
Mle-Panxα5 -MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIA
Mle-Panxα3 GKNISCDGFTK--FGEDFSQDYCWTQGLYTIKEAYDLPESQIPYPGIIPE
Mle-Panxα4 GTLIACDGFTK--FHPQFAEDYCWSIGMYTVREAYDLPSSMVAYPGVIPW
Mle-Panxα6 GGIIACDGLTK--FSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPE
Mle-Panxα5 GSTIKCTGFEDSFRSEAFVDEYCWTQGIYTLREAYDLENTKIPYPGIIPE
Mle-Panxα3 NVPACREHALKNGGKIVCPPEDQVKPLTRARHLWYQWIPFYFWVIAPVFY
Mle-Panxα4 DMPACVPRLLKNGTRTKCGSEKDVMPSEKIYHLWYQWASFYFWIVAILYY
Mle-Panxα6 DAPPCLSRRLVSGGRIECPPADLYLEPTRVHHTWYQWIPFYFWVISIAFI
Mle-Panxα5 GFPNCMPYERWDGMKVECPKEEQYLKPTRVYHLYYQHIQLYFWLVCTLFY
Mle-Panxα3 LPYMFVKRMGLDRMKPLLKIMSDYYHCTTETPSEEIIVKCADWVYNSIVD
Mle-Panxα4 APYIMFKQLGGGEYKPLIKLLCLAS-GSPEQQMQDIQERVVKWLFFRFKT
Mle-Panxα6 GPYIVYKQLGVNELKPILAMLHNPVDGDD--VTKDQISKVSRWLAIKLNI
Mle-Panxα5 LPYMVGICLGFNYTKPLINLLHNPLTRDE-EELEALLDKAARSLRLRLDI
Mle-Panxα3 RL---
Mle-Panxα4 YIFA-
Mle-Panxα6 FIQEK
Mle-Panxα5 YS---
Keep all temporary files
$: alb Mnemiopsis_Panxs.gb -ga clustalw2 -o phylip-sequential -k ~/alignment_files
Returning to AlignBuddy...
4 205
Mle-Panxα4 -MVIELLAGYKGLSPFKDATVDDSWDQINRCYVFIAMVVMGAVTTMRQYSGTLIACDGFTK--FHPQFAEDYCWSIGMYTVREAYDLPSSMVAYPGVIPWDMPACVPRLLKNGTRTKCGSEKDVMPSEKIYHLWYQWASFYFWIVAILYYAPYIMFKQLGGGEYKPLIKLLCLAS-GSPEQQMQDIQERVVKWLFFRFKTYIFA-
Mle-Panxα6 -MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTGGIIACDGLTK--FSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAPPCLSRRLVSGGRIECPPADLYLEPTRVHHTWYQWIPFYFWVISIAFIGPYIVYKQLGVNELKPILAMLHNPVDGDD--VTKDQISKVSRWLAIKLNIFIQEK
Mle-Panxα3 MLLLGSLGTIKNLSIFKDLSLDDWLDQMNRTFMFLLLCFMGTIVAVSQYTGKNISCDGFTK--FGEDFSQDYCWTQGLYTIKEAYDLPESQIPYPGIIPENVPACREHALKNGGKIVCPPEDQVKPLTRARHLWYQWIPFYFWVIAPVFYLPYMFVKRMGLDRMKPLLKIMSDYYHCTTETPSEEIIVKCADWVYNSIVDRL---
Mle-Panxα5 -MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAGSTIKCTGFEDSFRSEAFVDEYCWTQGIYTLREAYDLENTKIPYPGIIPEGFPNCMPYERWDGMKVECPKEEQYLKPTRVYHLYYQHIQLYFWLVCTLFYLPYMVGICLGFNYTKPLINLLHNPLTRDE-EELEALLDKAARSLRLRLDIYS---
A new directory was created:
$: ls ~/alignment_files
>>> result tmp.dnd tmp.fa