SB Find repeats
alexanjm edited this page Sep 29, 2016 · 3 revisions
Searches all records and reports two lists: IDs that occur more than once, and groups of records that share an identical sequence.
Optional argument: the number of columns into which the reported IDs are arranged.
#NEXUS
begin data;
dimensions ntax=16 nchar=50;
format datatype=protein missing=? gap=-;
matrix
'Mle-Panxα12' -m--vidilsgf------------kgitpfkgitlddgwdqinrsfmfvl
'Mle-Panxα9' ----mldilskf------------kgvtpfkgitiddgwdqlnrsfmfvl
'Mle-Panxα10B' -m--rlsekstshdckacitrshnedcarrwgitiddgwdqlnrsfmfgl
'Mle-Panxα7A' -m--gveilfpi----------inratapiksvniddlssqlnrtfmfyl
'Mle-Panxα8' -m--vlevlalf------------prlapfkvitlddvwdqwnrsfmfim
'Mle-Panxα1' -mywifeicqei------------kraqscrkfaidgpfdwtnriimptl
'Mle-Panxα9' ----mldilskf------------kgvtpfkgitiddgwdqlnrsfmfvl
'Mle-Panxα2' -m--vldlisgs----------l-ngflkiksvsiddqwdqinrtylvmf
'Mle-Panxα5' -m--iywvwavf------------krmapfkvvtlddrwdqmnrsfmmpl
'Mle-Panxα4' -m--viellagy------------kglspfkdatvddswdqinrcyvfia
'Mle-Panxα3' ml--llgslgti------------knlsifkdlslddwldqmnrtfmfll
'Mle-Panxα6' -m--lleilanf------------kgatpfkeivlddkwdqinrcymfll
'Mle-Panxα8' ----mldilskf------------kgvtpfkgitiddgwdqlnrsfmfvl
'Mle-Panxα11' -m--lisslvqf------------srlspfkeitiddgwdqlnrsfmfvl
'Mle-Panxα10A' -m--rlsekstshdckacitrshnedcarrwgitiddgwdqlnrsfmfgl
'Mle-Panxα6' ----mldilskf------------kgvtpfkgitiddgwdqlnrsfmfvl
;
end;
$: sb Mle-Panx-C_terms.nex -frp
#### Records with duplicate IDs: ####
Mle-Panxα9
Mle-Panxα8
Mle-Panxα6
#### Records with duplicate sequences: ####
[Mle-Panxα10A, Mle-Panxα10B]
[Mle-Panxα9, Mle-Panxα9, Mle-Panxα8, Mle-Panxα6]
$: sb Mle-Panx-C_terms.nex -frp 2
#### Records with duplicate IDs: ####
Mle-Panxα9 Mle-Panxα8
Mle-Panxα6
#### Records with duplicate sequences: ####
[Mle-Panxα10A, Mle-Panxα10B], [Mle-Panxα9, Mle-Panxα9, Mle-Panxα8, Mle-Panxα6]
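The two reports above can be reproduced in spirit with a few lines of plain Python. This is only a minimal sketch of the idea, not SeqBuddy's actual implementation: the function names `find_repeats` and `in_columns` are hypothetical, records are assumed to be simple `(id, sequence)` pairs, and the sequences are shortened toy stand-ins rather than the real alignment rows.

```python
from collections import defaultdict


def find_repeats(records):
    """Return (duplicate_ids, duplicate_seq_groups) for (id, sequence) pairs.

    Hypothetical helper for illustration; not part of SeqBuddy.
    """
    id_counts = defaultdict(int)
    ids_by_seq = defaultdict(list)
    for rec_id, seq in records:
        id_counts[rec_id] += 1
        ids_by_seq[seq].append(rec_id)

    # IDs seen more than once, in order of first appearance
    duplicate_ids = [rec_id for rec_id, n in id_counts.items() if n > 1]
    # Groups of IDs whose sequences are character-for-character identical
    duplicate_seqs = [ids for ids in ids_by_seq.values() if len(ids) > 1]
    return duplicate_ids, duplicate_seqs


def in_columns(items, columns=1):
    """Split a list into rows holding at most `columns` items each."""
    return [items[i:i + columns] for i in range(0, len(items), columns)]


# Toy records: shortened, made-up sequences that mimic the pattern above
records = [
    ("Mle-Panxα9", "mldilskf"),
    ("Mle-Panxα10B", "rlsekst"),
    ("Mle-Panxα8", "vlevlalf"),
    ("Mle-Panxα9", "mldilskf"),
    ("Mle-Panxα8", "mldilskf"),
    ("Mle-Panxα10A", "rlsekst"),
]

dup_ids, dup_seqs = find_repeats(records)

print("#### Records with duplicate IDs: ####")
for row in in_columns(dup_ids, columns=2):
    print(" ".join(row))

print("#### Records with duplicate sequences: ####")
for row in in_columns(dup_seqs, columns=2):
    print(", ".join("[" + ", ".join(group) + "]" for group in row))
```

Note that an ID can appear in both reports: a record with a repeated ID and a repeated sequence is listed once under duplicate IDs and once per occurrence inside its sequence group, which matches the way `Mle-Panxα9` shows up twice in its group above.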