a quick tutorial!
hackmd: https://hackmd.io/VYrGL_i8SA6WIALlHs16tA?view
reproduced below:
Using seqtk subseq to subselect sequences
This is a quick mini-tutorial on using seqtk to extract a set of sequences from a gzipped FASTA file; it should also work with uncompressed files and FASTQ files (compressed or not).
Installation of seqtk below requires an installation of conda & mamba; I recommend using mambaforge.
On farm, in a datalab-XX account:
First, install seqtk in a conda environment named seq:
mamba create -y -n seq seqtk
Activate seq:
mamba activate seq
Make a new working directory:
mkdir -p ~/extract-from-fasta
cd ~/extract-from-fasta
Download a FASTA file of contigs:
List names of sequences in the FASTA file:
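The original command for this step did not survive in this copy, so here is a minimal sketch of one way to list the names using only standard tools. The file `toy.fa.gz` and its contig names are made up for illustration; substitute the contigs file you actually downloaded.

```shell
# Build a tiny gzipped FASTA so the example runs without the real download.
# (toy.fa.gz and the contig names are hypothetical.)
printf '>contig_1 len=8\nACGTACGT\n>contig_2 len=4\nGGCC\n>contig_3 len=6\nTTAACC\n' | gzip > toy.fa.gz

# Header lines start with '>'; keep them and drop the leading '>'.
gunzip -c toy.fa.gz | grep '^>' | sed 's/^>//' > headers.txt
cat headers.txt
```

For a large file, piping through `head` instead of writing everything to `headers.txt` keeps the listing short.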
Make a text file containing the names of a few contigs from that file:
:::warning
Here, the names to extract can be just the first word of each header line (the sequence ID, up to the first whitespace) rather than the full header. As far as I can tell the ID itself has to match exactly, but I'm not 100% sure.
:::
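As a rough sketch of building the names file, the snippet below takes the first word of the first two header lines and writes one name per line. The input `toy.fa.gz`, the output name `names.txt`, and the choice of two contigs are all assumptions for illustration, not the original tutorial's values.

```shell
# Toy input, made up for illustration (a stand-in for the real contigs file).
printf '>contig_1 len=8\nACGTACGT\n>contig_2 len=4\nGGCC\n>contig_3 len=6\nTTAACC\n' | gzip > toy.fa.gz

# Take the first two headers, strip '>', and keep only the first
# space-delimited word (the sequence ID), one per line.
gunzip -c toy.fa.gz | grep '^>' | head -n 2 | sed 's/^>//' | cut -d ' ' -f 1 > names.txt
cat names.txt
```

Writing only the ID avoids having to guess how much of the header line needs to match.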
Extract using seqtk subseq:
et voila!
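The extraction command itself is missing from this copy, so the following is a hedged sketch of the step using the `seqtk subseq <sequences> <name list>` form. The file names `toy.fa.gz`, `names.txt`, and `subset.fa` are hypothetical stand-ins, and the check for `seqtk` is only there so the snippet fails gently when the `seq` environment is not active.

```shell
# Toy inputs, made up for illustration (stand-ins for the real contigs file and name list).
printf '>contig_1 len=8\nACGTACGT\n>contig_2 len=4\nGGCC\n>contig_3 len=6\nTTAACC\n' | gzip > toy.fa.gz
printf 'contig_1\ncontig_2\n' > names.txt

if command -v seqtk >/dev/null 2>&1; then
    # seqtk subseq reads the (optionally gzipped) sequences and the name list,
    # and writes the matching records to stdout.
    seqtk subseq toy.fa.gz names.txt > subset.fa
    # Count the extracted records.
    grep -c '^>' subset.fa
else
    echo "seqtk not found; run 'mamba activate seq' first" >&2
fi
```

Comparing the record count against the number of lines in the names file is a quick check that every requested name was found.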