alicarea edited this page Oct 26, 2016 · 4 revisions

--replace_subseq, -rs

Description

Search the sequences in all records for a pattern and either replace each match with a new string or delete it.

Arguments

Search pattern ( regex )

Provide a sequence or regular expression pattern to search the sequences with (case insensitive).

Replacement ( str )

Optional. If provided, this string replaces every match found; otherwise, matches are deleted.
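
The behaviour of the two arguments can be approximated with Python's standard `re` module. This is only a minimal sketch: the helper name `replace_subseq` is hypothetical, and the code is not taken from SeqBuddy itself. It assumes a case-insensitive regular expression search and an empty default replacement, as described above.

```python
import re


def replace_subseq(seq, pattern, replacement=""):
    """Hypothetical helper mimicking the described behaviour of -rs.

    Every case-insensitive match of `pattern` in `seq` is swapped for
    `replacement`; with the default empty string, matches are deleted.
    """
    return re.sub(pattern, replacement, seq, flags=re.IGNORECASE)


seq = "MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID"

# Replacement supplied: each "LL" becomes "*"
replaced = replace_subseq(seq, "LL", "*")

# Replacement omitted: every "M" is removed (lowercase "m" still matches)
deleted = replace_subseq(seq, "m")

print(replaced)
print(deleted)
```

Defaulting the replacement to an empty string lets one function cover both the "replace" and the "delete" case.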

Examples

Input file: C-terms.fa

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
>Mle-Panxα1
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
>Mle-Panxα5
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS

Usage example 1

Simple search and replace

$: sb C-terms.fa -rs "LL" "*"

Output

>Dme-Panxδ1
YK*GSLKSYLKWQIQTDNAVFRLHNSFTTV*LTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKG*KIDQVDNNVFRMHYKATVIILIAFS*VTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVA*LACTF*SSKQYFGDPI
>Mle-Panxα1
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
>Mle-Panxα5
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6
M*EILANFKGATPFKEIVLDDKWDQINRCYMF*CVIFGTVVTFRQYTG
>Mle-Panxα9
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFV*VVMGTTVTVRQYTGS

Usage example 2

Simple search and delete of all 'M' residues, done by omitting the replacement argument. The lowercase "m" still matches because the search is case insensitive.

$: sb C-terms.fa -rs "m"

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
DVFGSVKGLLKIDQVDNNVFRHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPHVINTFC
>Dme-Panxδ4
AAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
>Mle-Panxα1
YWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIPTLVICCFLQTFTF
>Mle-Panxα5
IYWVWAVFKRAPFKVVTLDDRWDQNRSFPLTSFAYLIDYGIIAG
>Mle-Panxα6
LLEILANFKGATPFKEIVLDDKWDQINRCYFLLCVIFGTVVTFRQYTG
>Mle-Panxα9
LDILSKFKGVTPFKGITIDDGWDQLNRSFFVLLVVGTTVTVRQYTGS

Usage example 3

More complicated regular expression replacement

$: sb C-terms.fa -rs "[IL].{1,4}[IL]" "_motif_"

Output

>Dme-Panxδ1
YK_motif_KSY_motif_QTDNAVFRLHNSFTTV_motif_TCS_motif_TATQYVGQ
PI
>Dme-Panxδ2
MDVFGSVKG_motif_DQVDNNVFRMHYKATV_motif_AFSLLVTSRQY_motif_D
>Dme-Panxδ3
GF_motif_DNMVFRCHYR_motif_FTCCIIVTANN_motif_SCI_motif_NTFC
>Dme-Panxδ4
MAAVKP_motif_QFKVH_motif_FTLHSKVTVA_motif_ACTFLLSSKQYFGDPI
>Mle-Panxα1
MYW_motif_CQEIKRAQSCRKFAIDGPFDWTNR_motif_MV_motif_QTFTFM
>Mle-Panxα5
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAY_motif_IAG
>Mle-Panxα6
M_motif_ANFKGATPFKE_motif_DDKWDQINRCYMF_motif_FGTVVTFRQYTG
>Mle-Panxα9
M_motif_SKFKGVTPFKG_motif_DDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS
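
To see why particular stretches were replaced, it can help to list the individual matches. The sketch below uses Python's `re` module on the Dme-Panxδ1 sequence, on the assumption that the pattern is matched greedily and without overlaps, which is consistent with the output above. It is an illustration, not SeqBuddy's own code.

```python
import re

# Same pattern as in the command: I or L, then 1 to 4 of any residue, then I or L
pattern = re.compile(r"[IL].{1,4}[IL]", re.IGNORECASE)

seq = "YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI"

# Non-overlapping matches, scanned left to right; .{1,4} takes as many
# residues as it can while still allowing the final [IL] to match
matches = [m.group(0) for m in pattern.finditer(seq)]

result = pattern.sub("_motif_", seq)

print(matches)
print(result)
```

Here the sketch finds `LLGSL`, `LKWQI`, `LLL`, and `LII`. The resulting string is 62 characters long, which would explain why the first record in the listing above continues onto a second line after 60 characters.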

Usage example 4

Retain part of a match in the replacement string by wrapping it in a capture group and referring to it as \1

$: sb C-terms.fa -rs "[IL](.{1,4})[IL]" ">\1<"

Output

>Dme-Panxδ1
YK>LGS<KSY>KWQ<QTDNAVFRLHNSFTTV>L<TCS>I<TATQYVGQPI
>Dme-Panxδ2
MDVFGSVKG>LK<DQVDNNVFRMHYKATV>IL<AFSLLVTSRQY>GDP<D
>Dme-Panxδ3
GF>K<DNMVFRCHYR>TAI<FTCCIIVTANN>IGDP<SCI>PMHV<NTFC
>Dme-Panxδ4
MAAVKP>SKY<QFKVH>YDA<FTLHSKVTVA>L<ACTFLLSSKQYFGDPI
>Mle-Panxα1
MYW>FE<CQEIKRAQSCRKFAIDGPFDWTNR>IMPT<MV>CCF<QTFTFM
>Mle-Panxα5
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAY>IDYG<IAG
>Mle-Panxα6
M>LEI<ANFKGATPFKE>V<DDKWDQINRCYMF>LCV<FGTVVTFRQYTG
>Mle-Panxα9
M>DI<SKFKGVTPFKG>T<DDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS
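
The capture-group replacement corresponds to a backreference in Python's `re.sub`. The following sketch reproduces the Dme-Panxδ3 record, assuming the same case-insensitive, greedy matching as in the other examples. It is not SeqBuddy's implementation.

```python
import re

seq = "GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC"

# The parentheses capture the 1 to 4 residues between the flanking I/L;
# \1 in the replacement re-inserts that captured text between > and <,
# so only the flanking I/L residues are dropped
result = re.sub(r"[IL](.{1,4})[IL]", r">\1<", seq, flags=re.IGNORECASE)

print(result)
```

A raw string (`r">\1<"`) is used in Python so that the backslash reaches the regex engine unchanged, much as the quotes in the shell command keep `\1` intact.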
