Loading Files

Tony Boyles edited this page May 16, 2018 · 2 revisions

Because MicrobeTrace is a platform for network analysis, it primarily works with two datatypes: Edges and Nodes. Similarly, there are two separate (but similar) workflows. Which workflow you'll use depends upon which type of data you're starting with.

Loading FASTA Files

If you're starting with raw sequence data, you'll need that data in FASTA format. Each sequence in the FASTA file will be represented by one node in the output network.

*Note*: FASTA IDs must be unique. MicrobeTrace will automatically append '_#' characters to non-unique names to make them unique.
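If you would rather find and rename duplicate IDs yourself before loading, a short script can list them. The following is a minimal sketch, not part of MicrobeTrace: it assumes that a record's ID is the first whitespace-delimited token of the header line, and the function names and example data are hypothetical.

```python
from collections import Counter
from io import StringIO


def fasta_ids(handle):
    """Yield one ID per record: the header text after '>' up to the first whitespace.

    Assumption: MicrobeTrace may instead treat the whole header line as the ID.
    """
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            yield header.split()[0] if header else ""


def duplicate_ids(handle):
    """Return a dict mapping each repeated ID to the number of times it occurs."""
    counts = Counter(fasta_ids(handle))
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


# Hypothetical example data; replace StringIO(...) with open("your_file.fasta").
example = StringIO(">seqA\nACGT\n>seqB\nACGA\n>seqA\nACGG\n")
print(duplicate_ids(example))  # {'seqA': 2}
```

An empty result means every ID is already unique, so no IDs should need to be renamed on load.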

To load a FASTA File:

  1. Open MicrobeTrace (install it first if you have not yet done so).

  2. Click the "Choose File" button.

  3. Navigate to your FASTA File, and double-click on the file, or select (single-click) the file and then click "Open".

  4. Click "Submit".

    Note: The Submit button is not enabled until you complete steps 1-3.

Once you click "Submit", MicrobeTrace starts loading the file and displays a progress bar. You can click the "Details" link to view more information about the progress. MicrobeTrace parses the file and computes the relevant network statistics, so loading may take some time.

*Note*: There is no size limit for files loaded into MicrobeTrace, but larger files take longer to load. This is especially true when a file contains many sequences: 100 sequences of 100 nucleotides each will load (much) more slowly than 1 sequence of 10,000 nucleotides.
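One plausible reason the number of sequences matters more than their total length is that building a network involves comparing sequences to one another. The sketch below assumes every pair of sequences is compared once, position by position; this page does not say that MicrobeTrace works this way, so treat the numbers as a rough illustration only. The function name is hypothetical.

```python
def pairwise_work(num_sequences, length):
    """Estimate character comparisons if every pair of sequences is compared once.

    Assumption: an all-pairs comparison; MicrobeTrace's actual method may differ.
    """
    pairs = num_sequences * (num_sequences - 1) // 2
    return pairs * length


print(pairwise_work(100, 100))   # 4950 pairs * 100 positions = 495000
print(pairwise_work(1, 10_000))  # a single sequence has no pairs = 0
```

Under this assumption the work grows roughly with the square of the number of sequences, but only linearly with their length.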

Loading Edge Lists

If you've already compiled a list of edges using other tools, you can load that edge list directly. The file must be comma-delimited (CSV), and it must contain a `source` column and a `target` column.

*Note*: Because MicrobeTrace only represents *undirected* networks at this time, `source` and `target` are misnomers. They should be thought of as merely "node1" and "node2" and treated as interchangeable.

Any additional edge properties can be included as additional columns in the edge list.
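To make the expected layout concrete, the following sketch reads a small edge list, checks for the two required columns, and collapses reversed pairs, since `source` and `target` are interchangeable. It is an illustration using only Python's standard library, not MicrobeTrace code; the `distance` column, the function names, and the example rows are hypothetical.

```python
import csv
from io import StringIO


def read_edges(handle):
    """Read an edge CSV and return its rows as dicts, checking required columns."""
    reader = csv.DictReader(handle)
    missing = {"source", "target"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"edge list is missing column(s): {sorted(missing)}")
    return list(reader)


def undirected_key(row):
    """Order-independent key, so A,B and B,A count as the same edge."""
    return tuple(sorted((row["source"], row["target"])))


# Hypothetical example data; 'distance' stands in for any extra edge property.
example = StringIO(
    "source,target,distance\n"
    "A,B,0.010\n"
    "B,A,0.010\n"
    "B,C,0.015\n"
)
edges = read_edges(example)
unique = {undirected_key(row) for row in edges}
print(sorted(unique))  # [('A', 'B'), ('B', 'C')]
```

Running a check like this before loading can reveal a missing header or edges that are listed twice in opposite directions.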

  1. Open MicrobeTrace (install it first if you have not yet done so).

  2. Click the "Choose File" button.

  3. Navigate to an Edge CSV, and double-click on the file, or select (single-click) the file and then click "Open".

  4. Click "Submit".

    Note: The Submit button is not enabled until you complete steps 1-3.

Unlike loading a FASTA file, loading an Edge CSV should happen rather quickly. This is because computing the network from the information in a FASTA file requires several expensive data-processing steps, while loading the Edge CSV requires no difficult calculations.

Optional, Advanced: Loading Node Data

If you have additional data about the nodes in the network, you can also import it into MicrobeTrace. It must be stored in CSV format and contain an `id` column whose values match the `source` or `target` columns of an Edge CSV or the IDs of a FASTA file. If more than one `id` column exists, the first one (i.e. left-most) will be used.

*Note*: Rows in a [Node List](https://github.com/CDCgov/MicrobeTRACE/wiki/Node-CSVs) with identical node `id`s cause previous rows with the same `id` to be overwritten. Please ensure that node `id`s are unique.

If no additional node information is available, then it is not necessary to input a node CSV file.
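Before loading a node list alongside an edge list, it can help to confirm that the node `id`s are unique and that each one appears in the edge list. The sketch below is one way to do that with Python's standard library. It assumes the node file has a single `id` column; the function name, the `collection_year` column, and the example rows are hypothetical.

```python
import csv
from collections import Counter
from io import StringIO


def check_nodes(node_handle, edge_handle):
    """Return (ids that occur more than once, ids absent from the edge list)."""
    node_reader = csv.DictReader(node_handle)
    if "id" not in (node_reader.fieldnames or []):
        raise ValueError("node list has no 'id' column")
    ids = [row["id"] for row in node_reader]
    duplicates = sorted(i for i, n in Counter(ids).items() if n > 1)

    # Collect every node named in either edge column.
    endpoints = set()
    for row in csv.DictReader(edge_handle):
        endpoints.update((row["source"], row["target"]))
    unmatched = sorted(set(ids) - endpoints)
    return duplicates, unmatched


# Hypothetical example data.
nodes = StringIO("id,collection_year\nA,2016\nB,2017\nA,2018\nD,2018\n")
edges = StringIO("source,target\nA,B\nB,C\n")
print(check_nodes(nodes, edges))  # (['A'], ['D'])
```

Here `A` is reported because its second row would overwrite the first, and `D` is reported because it matches no node in the edge list.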

  1. Open MicrobeTrace.

  2. Click the "Choose File" button.

    Note: There is no size limit for files loaded into MicrobeTrace, but larger files take longer to load.

  3. Navigate to an Edge CSV or FASTA file, and double-click on the file, or select (single-click) the file and then click "Open".

  4. Click on "Advanced", and then click on the "Choose File" button which appears next to the words "Node Data".

  5. Navigate to your Node CSV file, and double-click on the file, or select (single-click) the file and then click "Open".

  6. Click "Submit".

    Note: The Submit button is not enabled until you complete steps 1-3. However, clicking it before completing steps 4 and 5 will produce a network without the additional node data.

Once you click "Submit", MicrobeTrace starts loading the file and displays a progress bar. You can click the "Details" link to view more information about the progress.

Next

Once MicrobeTrace completes its computations, you should be able to view the network.
