AB Extract regions
Pull out sub-alignments. If using a richly annotated format, like GenBank, features are deleted or adjusted appropriately.
AlignBuddy uses a custom syntax to specify what regions should be extracted from each alignment, and multiple regions can either be passed in as separate arguments or combined into a single comma-separated string.
Single positions: This is the simplest syntax, consisting of a comma-separated list of each column you want extracted.
e.g., "1,2,4,45,79,305"
Ranges: Use two numbers separated by a colon to designate a range of columns, similar to Python slice notation, except that columns are numbered from 1 and both ends of the range are included. If the left side of the range is left blank, the range starts at the first column, and if the right side is left blank, the range extends to the final column. Negative numbers count back from the end of the alignment, so -1 is the final column.
e.g., "5:200"
"400:-1"
":245"
Every Nth residue: Use a forward slash to select ordered but non-contiguous columns, for example every 10th column. The left side of the slash also accepts the colon notation to specify a sub-range within each block.
e.g., "1/10"
"1:10/100"
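The three forms above can be read as one small grammar: comma-separated tokens, each with an optional colon range and an optional slash block size. The following is a minimal sketch of that reading, not AlignBuddy's own parser; `Region` and `parse_regions` are hypothetical names introduced here for illustration.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    """One parsed token (hypothetical structure, for illustration only)."""
    start: Optional[int]  # None means "from the first column"
    end: Optional[int]    # None means "to the final column"
    block: Optional[int]  # number after "/", or None when no slash is present


def parse_regions(*args: str) -> List[Region]:
    """Split region arguments into tokens without resolving them against a length."""
    regions = []
    for arg in args:
        for token in arg.split(","):
            token = token.strip()
            if not token:
                continue
            span, _, block = token.partition("/")
            start, colon, end = span.partition(":")
            block_size = int(block) if block else None
            if colon:
                # Range form: either side may be blank
                regions.append(Region(int(start) if start else None,
                                      int(end) if end else None,
                                      block_size))
            else:
                # Single position: start and end are the same column
                pos = int(span)
                regions.append(Region(pos, pos, block_size))
    return regions


print(parse_regions("1,2,4", "5:200", ":245", "1:10/100"))
```

Separate arguments and comma-separated strings produce the same list, which matches the statement that regions can be passed either way.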
The examples below use Panxs.phyr, a file containing two alignments:
3 158
Mle-Panx9 -MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGSVISCDGFKKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAPPCTPVRLKDGTRLKCPDPDQLLSPTRISHLWYQWVPFYFWLAAAAFFMPYLLYKNFGM
Mle-Panx8 MVLEVLALFPRLAPFKVITLDDVWDQWNRSFMFIMTVLFGSIVTIRSYTGSVIECDGFLKVPVEFAKDYCWTQGIYTLREGYDYHSSLLPYPGVFPEDAPGCLDKVLDNGGRVICPMDKKYRKYQRVYHSWYQFTAFYFWTASCAFFLPYMMFKFFGM
Mle-Panx6 MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTGGIIACDGLTKFSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAPPCLSRRLVSGGRIECPPADLYLEPTRVHHTWYQWIPFYFWVISIAFIGPYIVYKQLGV
5 165
Ael_PanxA -MVVIRELKDILSMKIKTRHDGFCDQFNRMIMTKILIIMSVIVGFNYFYDEVSCMVFKKSDLQKEFISSSCWISGFYIFEEMKTRL-DKSSYYGIPYTINHDGIRKD-GTLCATRDR-LGLVEGCAPMTKVYYLQYQWMPFYIGSLSTFYYMPYIVFKMVNRDLM
Ael_PanxB -MVVIRGLKDILSIKMKTRHDSICDQFNRLFMTRVLLIMSVIMGFDYYSDKVSCMVLGESHLGKDFIHAACWISGFYIYEEMKTRL-DKSSYYGIPYTIDNDGIEYD-GSLCPTRDK-NGKIPGCNPMTKVYYLQYQWMPFYVGSLAIFYYIPYIIFRMVNTDLV
Ael_PanxD ----MEVLKDILSVQLKSRDDSYSDQFNRIFMCKLFLMSSIIMSVDYFSDNVNCMIPDNAQHSSSFFHSACWINGFYIFDEMRSRL-EKSGYYGIPQRVDFDGINRVTGELCITKNL-FGEAADCEPMTRIYYLHYQWMPVYMVSLGMFFYLPYIVFRFVNTDMI
Ael_PanxE --MIGDAISNIISIKIKHRDDGVTDQYNRILMVKMIIMLSAIVGYNYYSDKVSCIVANEDDGIDGFVADTCWIQGFYVFKEMKKRL-GESAYLGLPRNMDYDGLDSN-GVLCSTTDRGSDSIQTCQKMKKVYYLQYQYFPFLLAGLAMLFYFPYIVFKVTNTDLV
Ael_PanxF MGPFEDSIGKIFSFNIKRRADGITDQYNRILMVKICIIFTFVLGIDYFTNKTTCITPDMMRID---PTRTCWNEGFYIYPELENLPAKESSYYGIPKQIDNDGIDEN-GSPCTTKNI-FIKFLSCKPLKKQYYRQYQFMPFLIAVYGIIFYIPHIMFMVINTDII
Extract a range of columns, using the colon (:) operator.
$: alb Panxs.phyr -er "11:100"
3 90
Mle-Panx9 GVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGSVISCDGFKKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAP
Mle-Panx8 RLAPFKVITLDDVWDQWNRSFMFIMTVLFGSIVTIRSYTGSVIECDGFLKVPVEFAKDYCWTQGIYTLREGYDYHSSLLPYPGVFPEDAP
Mle-Panx6 GATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTGGIIACDGLTKFSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAP
5 90
Ael_PanxA ILSMKIKTRHDGFCDQFNRMIMTKILIIMSVIVGFNYFYDEVSCMVFKKSDLQKEFISSSCWISGFYIFEEMKTRL-DKSSYYGIPYTIN
Ael_PanxB ILSIKMKTRHDSICDQFNRLFMTRVLLIMSVIMGFDYYSDKVSCMVLGESHLGKDFIHAACWISGFYIYEEMKTRL-DKSSYYGIPYTID
Ael_PanxD ILSVQLKSRDDSYSDQFNRIFMCKLFLMSSIIMSVDYFSDNVNCMIPDNAQHSSSFFHSACWINGFYIFDEMRSRL-EKSGYYGIPQRVD
Ael_PanxE IISIKIKHRDDGVTDQYNRILMVKMIIMLSAIVGYNYYSDKVSCIVANEDDGIDGFVADTCWIQGFYVFKEMKKRL-GESAYLGLPRNMD
Ael_PanxF IFSFNIKRRADGITDQYNRILMVKICIIFTFVLGIDYFTNKTTCITPDMMRID---PTRTCWNEGFYIYPELENLPAKESSYYGIPKQID
Leave the left side of the range empty to begin extracting from the start of the alignment.
$: alb Panxs.phyr -er ":100"
3 100
Mle-Panx9 -MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGSVISCDGFKKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAP
Mle-Panx8 MVLEVLALFPRLAPFKVITLDDVWDQWNRSFMFIMTVLFGSIVTIRSYTGSVIECDGFLKVPVEFAKDYCWTQGIYTLREGYDYHSSLLPYPGVFPEDAP
Mle-Panx6 MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTGGIIACDGLTKFSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAP
5 100
Ael_PanxA -MVVIRELKDILSMKIKTRHDGFCDQFNRMIMTKILIIMSVIVGFNYFYDEVSCMVFKKSDLQKEFISSSCWISGFYIFEEMKTRL-DKSSYYGIPYTIN
Ael_PanxB -MVVIRGLKDILSIKMKTRHDSICDQFNRLFMTRVLLIMSVIMGFDYYSDKVSCMVLGESHLGKDFIHAACWISGFYIYEEMKTRL-DKSSYYGIPYTID
Ael_PanxD ----MEVLKDILSVQLKSRDDSYSDQFNRIFMCKLFLMSSIIMSVDYFSDNVNCMIPDNAQHSSSFFHSACWINGFYIFDEMRSRL-EKSGYYGIPQRVD
Ael_PanxE --MIGDAISNIISIKIKHRDDGVTDQYNRILMVKMIIMLSAIVGYNYYSDKVSCIVANEDDGIDGFVADTCWIQGFYVFKEMKKRL-GESAYLGLPRNMD
Ael_PanxF MGPFEDSIGKIFSFNIKRRADGITDQYNRILMVKICIIFTFVLGIDYFTNKTTCITPDMMRID---PTRTCWNEGFYIYPELENLPAKESSYYGIPKQID
Leave the right side of the range empty to extract until the end of the alignment.
$: alb Panxs.phyr -er "100:"
3 59
Mle-Panx9 PPCTPVRLKDGTRLKCPDPDQLLSPTRISHLWYQWVPFYFWLAAAAFFMPYLLYKNFGM
Mle-Panx8 PGCLDKVLDNGGRVICPMDKKYRKYQRVYHSWYQFTAFYFWTASCAFFLPYMMFKFFGM
Mle-Panx6 PPCLSRRLVSGGRIECPPADLYLEPTRVHHTWYQWIPFYFWVISIAFIGPYIVYKQLGV
5 66
Ael_PanxA NHDGIRKD-GTLCATRDR-LGLVEGCAPMTKVYYLQYQWMPFYIGSLSTFYYMPYIVFKMVNRDLM
Ael_PanxB DNDGIEYD-GSLCPTRDK-NGKIPGCNPMTKVYYLQYQWMPFYVGSLAIFYYIPYIIFRMVNTDLV
Ael_PanxD DFDGINRVTGELCITKNL-FGEAADCEPMTRIYYLHYQWMPVYMVSLGMFFYLPYIVFRFVNTDMI
Ael_PanxE DYDGLDSN-GVLCSTTDRGSDSIQTCQKMKKVYYLQYQYFPFLLAGLAMLFYFPYIVFKVTNTDLV
Ael_PanxF DNDGIDEN-GSPCTTKNI-FIKFLSCKPLKKQYYRQYQFMPFLIAVYGIIFYIPHIMFMVINTDII
Use negative numbers to specify a distance from the end of the alignment.
$: alb Panxs.phyr -er "100:-100"
3 42
Mle-Panx9 KKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAP
Mle-Panx8 LKVPVEFAKDYCWTQGIYTLREGYDYHSSLLPYPGVFPEDAP
Mle-Panx6 TKFSAAFAEDYCWTQGLYTIKEAYDIVDNSLPYPGLLPEDAP
5 35
Ael_PanxA FISSSCWISGFYIFEEMKTRL-DKSSYYGIPYTIN
Ael_PanxB FIHAACWISGFYIYEEMKTRL-DKSSYYGIPYTID
Ael_PanxD FFHSACWINGFYIFDEMRSRL-EKSGYYGIPQRVD
Ael_PanxE FVADTCWIQGFYVFKEMKKRL-GESAYLGLPRNMD
Ael_PanxF -PTRTCWNEGFYIYPELENLPAKESSYYGIPKQID
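The column counts above (42 and 35) are consistent with -100 resolving to column 59 of the 158-column alignment and column 66 of the 165-column alignment, with the two bounds then put in ascending order. The sketch below reproduces that arithmetic on the Mle-Panx9 sequence from this page. It is an interpretation inferred from the output, not AlignBuddy's code, and `resolve_range` is a hypothetical helper.

```python
def resolve_range(start: int, end: int, length: int) -> slice:
    """Map a 1-based, inclusive range (bounds may be negative) to a Python slice."""
    def absolute(pos: int) -> int:
        # Assumption: -1 is the final column, so -N maps to length - N + 1
        return length + pos + 1 if pos < 0 else pos

    low, high = sorted((absolute(start), absolute(end)))
    # Convert 1-based inclusive bounds to a 0-based, end-exclusive slice
    return slice(low - 1, high)


# Mle-Panx9, copied from the first alignment shown on this page
MLE_PANX9 = "-MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGSVISCDGFKKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAPPCTPVRLKDGTRLKCPDPDQLLSPTRISHLWYQWVPFYFWLAAAAFFMPYLLYKNFGM"

sub = MLE_PANX9[resolve_range(100, -100, len(MLE_PANX9))]
print(len(sub), sub)
```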
Pull out a group of specific columns from both alignments.
$: alb Panxs.phyr -er "32,34,35,37,38,42,43" "135,141,151"
3 10
Mle-Panx9 MVLVVTVVLL
Mle-Panx8 MIMVLIVTTM
Mle-Panx6 MLLVIVVIVI
5 10
Ael_PanxA MKIIIIVQFY
Ael_PanxB MRVLIIMQFY
Ael_PanxD MKLLMIMHVY
Ael_PanxE MKMIMIVQFY
Ael_PanxF MKIIIVLQFY
Extract every tenth column using the forward-slash (/) operator (starting at column #1).
$: alb Panxs.phyr -er "1/10"
3 16
Mle-Panx9 -GDFTSFWGYPTLWLL
Mle-Panx8 MRDFSSVWGYGGYWTM
Mle-Panx6 MGDYTGFWAYPGYWVI
5 17
Ael_PanxA -IDIVEDCESHLLVFYN
Ael_PanxB -IDFVKHCESNLKVFYN
Ael_PanxD -IDFINQCEGFLEIVYN
Ael_PanxE -IDLAKDCEAYLSVFYN
Ael_PanxF MIDLFKRCESNPKQFYN
Extract the first three columns of every ten by mixing the colon (:) and forward-slash (/) operators.
$: alb Panxs.phyr -er "1:3/10"
3 48
Mle-Panx9 -MLGVTDDGFMFTTVSVIFGSWTQGYDYPGPCTTRLLLSWYQLAALLY
Mle-Panx8 MVLRLADDVFMFSIVSVIVPVWTQGYDYPGGCLGRVYRKWYQTASMMF
Mle-Panx6 MLLGATDDKYMFTVVGIIFSAWTQAYDYPGPCLGRIYLEWYQVISIVY
5 51
Ael_PanxA -MVILSDGFIMTVIVEVSDLQCWIEMKSYYHDGLCALVEVYYFYIYMPNRD
Ael_PanxB -MVILSDSIFMTVIMKVSHLGCWIEMKSYYNDGLCPKIPVYYFYVYIPNTD
Ael_PanxD ---ILSDSYFMCIIMNVNQHSCWIEMRGYYFDGLCIEAAIYYVYMYLPNTD
Ael_PanxE --MIISDGVLMVAIVKVSDGICWIEMKAYLYDGLCSSIQVYYFLLYFPNTD
Ael_PanxF MGPIFSDGILMVFVLKTTRIDCWNELESYYNDGPCTKFLQYYFLIYIPNTD
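One way to picture the slash operator is to cut the alignment into consecutive blocks of N columns and apply the left-hand range inside each block. The sketch below follows that reading and reproduces the Mle-Panx9 rows shown for "1/10" and "1:3/10". `block_columns` is a hypothetical helper, and the handling of a partial final block is an assumption that agrees with the 165-column results on this page.

```python
from typing import List


def block_columns(first: int, last: int, block: int, length: int) -> List[int]:
    """Return 1-based columns first..last taken from every block of `block` columns."""
    columns = []
    for block_start in range(0, length, block):
        for offset in range(first, last + 1):
            column = block_start + offset
            if column <= length:  # the final block may be shorter than `block`
                columns.append(column)
    return columns


# Mle-Panx9, copied from the first alignment shown on this page
MLE_PANX9 = "-MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGSVISCDGFKKFGSTFAEDYCWTQGLYTVLEGYDQPSQNIPYPGLLPDEAPPCTPVRLKDGTRLKCPDPDQLLSPTRISHLWYQWVPFYFWLAAAAFFMPYLLYKNFGM"


def extract(sequence: str, columns: List[int]) -> str:
    """Pick the listed 1-based columns out of a sequence string."""
    return "".join(sequence[c - 1] for c in columns)


every_tenth = extract(MLE_PANX9, block_columns(1, 1, 10, len(MLE_PANX9)))
first_three = extract(MLE_PANX9, block_columns(1, 3, 10, len(MLE_PANX9)))
print(every_tenth)
print(first_three)
```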
Wacky example to illustrate how flexible the syntax is. NOTE: If an argument begins with a minus sign (-), make sure there is a space between the opening quotation mark and the minus. Otherwise Python thinks you're including a new flag.
$: alb Panxs.phyr -er " -5:8/10,45,124" "60:-100,5:42,78,-5" "1/50"
3 83
Mle-Panx9 -ILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVRQYSDGFKKAEDYTVSQNPDEPRLKPDPPRISPFYFFMLKFGM
Mle-Panx8 MVLALFPRLAPFKVITLDDVWDQWNRSFMFIMTVLFGSIIRSYSDGFLKAKDYTLSSLPEDGVLDPMDYRVYAFYFFLMKFGM
Mle-Panx6 MILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVFRQYGDGLTKAEDYTIDNSPEDPRLVPPAPRVHPFYFIGIKLGV
5 87
Ael_PanxA -IRELKDILSMKIKTRHDGFCDQFNRMIMTKILIIMSVIFNYFEVFKSDLQKEFISFYIL-DPYTHKD-DR-GAPMYQWLSTYVFKN
Ael_PanxB -IRGLKDILSIKMKTRHDSICDQFNRLFMTRVLLIMSVIFDYYKVLGSHLGKDFIHFYIL-DPYTNYD-DK-GNPMYQWLAIYIFRN
Ael_PanxD -MEVLKDILSVQLKSRDDSYSDQFNRIFMCKLFLMSSIIVDYFNIPDAQHSSSFFHFYIL-EPQRFRVTNL-DEPMYQWLGMYVFRN
Ael_PanxE -GDAISNIISIKIKHRDDGVTDQYNRILMVKMIIMLSAIYNYYKVANDDGIDGFVAFYVL-GPRNYSN-DRGTQKMYQYLAMYVFKN
Ael_PanxF MEDSIGKIFSFNIKRRADGITDQYNRILMVKICIIFTFVIDYFKTPDMRID---PTFYIPAKPKQNEN-NI-SKPLYQFYGIYMFMN
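The note about the leading space can be demonstrated with Python's standard argparse module. This is a stand-alone sketch that assumes an argparse-style command line. The parser below is a stand-in built for illustration, and only the `-er` flag name is taken from this page.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Stand-in for the real command line: "-er" accepts any number of region strings
parser.add_argument("-er", nargs="*")

# With a leading space the value does not start with "-", so it is kept as an argument
ok = parser.parse_args(["-er", " -5:8/10,45,124"])
print(ok.er)

# Without the space, argparse reads "-5:8/10,45,124" as an unknown flag and exits
try:
    parser.parse_args(["-er", "-5:8/10,45,124"])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)
```

Stripping whitespace from each token before converting it to a number is then enough to recover the intended value.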