Dragonfly Plus
NOTE: The model and related files currently exist in a fork from the mainline CODES and will be merged shortly.
This page is a basic introduction to running the Dragonfly+ (DF+/DFP) CODES model. Future edits will expand this to be a ground-up tutorial and introduction to the topology and its features.
The Dragonfly+ CODES model is largely derived from the Dragonfly-Custom model, so the general workflow of running a simulation is similar. It can be broken down into four steps:
- Generate Topology
- Write Configuration File
- Write CODES Workload
- Run Simulation
Similar to Dragonfly-Custom, the DFP topology is stored in binary files that are read at simulation runtime. Two files make up the router topology: the inter and intra files.
The inter-file is a binary file storing pairs of integers. Each pair represents a global interconnection between two routers, where each integer is the relative router ID of one endpoint of the edge in the network graph. These integers range over [0, total_routers_in_network).
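As a rough illustration of the format described above, the sketch below reads such a file back into a list of pairs and checks that every ID is in range. The integer encoding is an assumption here (two native-endian 32-bit signed integers per link), and the helper names `read_pairs` and `check_range` are hypothetical, not part of CODES.

```python
import struct

# ASSUMPTION: each link is stored as two 32-bit native-endian signed ints.
PAIR = struct.Struct("=2i")


def read_pairs(path):
    """Return the list of (src, dst) integer pairs stored in a binary file."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % PAIR.size != 0:
        raise ValueError("file size is not a whole number of integer pairs")
    return list(PAIR.iter_unpack(data))


def check_range(pairs, upper):
    """Return the pairs that have an endpoint outside [0, upper)."""
    return [(a, b) for a, b in pairs if not (0 <= a < upper and 0 <= b < upper)]
```

For an inter-file, `upper` would be the total number of routers in the network; an empty result from `check_range` means every endpoint is a valid router ID.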
The intra-file is also a binary file storing pairs of integers, but these integers only range over [0, num_routers_per_group), i.e. they are the relative local router IDs of the connected routers. This file contains all local connections within a group. Dragonfly Plus assumes that every group has the same intra-group topology.
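Because one intra-file describes every group, its local pairs can be expanded into network-wide router IDs when inspecting a topology. The sketch below assumes routers are numbered group by group, so that a router's global ID is `group * num_routers_per_group + local_id`. That numbering and the function name `expand_intra` are assumptions for illustration, not something this page specifies.

```python
def expand_intra(local_pairs, num_groups, num_routers_per_group):
    """Replicate one group's local links across all groups using global IDs."""
    for a, b in local_pairs:
        if not (0 <= a < num_routers_per_group and 0 <= b < num_routers_per_group):
            raise ValueError(f"local router ID out of range in pair {(a, b)}")
    links = []
    for group in range(num_groups):
        base = group * num_routers_per_group  # ASSUMPTION: group-major numbering
        links.extend((base + a, base + b) for a, b in local_pairs)
    return links
```

With two groups of three routers each, the local links `[(0, 2), (1, 2)]` expand to the same two links in group 0 plus their copies offset by 3 in group 1.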
I have written Python scripts to make the generation of these topology files simple. The script dragonfly-plus-topo-gen-v2.py is stored in codes/scripts/dragonfly-plus. Its usage is as follows (note: "pg" is shorthand for "per-group"):
python3 dragonfly-plus-topo-gen-v2.py <num_groups> <num_spine_pg> <num_leaf_pg> <router_radix> <num_terminal_per_leaf> <intra_filename> <inter_filename> --<Loudness>
This script creates two files with the supplied filenames, one for each kind of link. Loudness is an optional parameter, one of --debug, --extra-loud, --loud, --standard, or --quiet, ranging from full output to stdout of every action the script takes down to zero output.
There is an additional option, --dry-run, which does minimal work to verify the input and prints stats on the to-be-generated network. It does not generate the inter or intra connections or create the files. This allows basic sanity checking of your topology before committing to the generation of very large networks.
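The kind of sanity check that a dry run enables can also be approximated by hand. The following sketch derives a few totals from the script's positional arguments. It is not the script's own logic: it assumes that each leaf connects to every spine in its group and that a spine's remaining ports (radix minus leaf links) are available for global links. The function name `dfp_stats` is hypothetical.

```python
def dfp_stats(num_groups, num_spine_pg, num_leaf_pg, router_radix, num_terminal_per_leaf):
    """Back-of-the-envelope totals for a Dragonfly+ network (assumptions in comments)."""
    # ASSUMPTION: fully connected leaf-spine (bipartite) groups, so a leaf
    # needs one port per spine plus one port per attached terminal.
    if num_spine_pg + num_terminal_per_leaf > router_radix:
        raise ValueError("leaf needs more ports than the router radix provides")
    if num_leaf_pg > router_radix:
        raise ValueError("spine needs more ports than the router radix provides")
    routers_per_group = num_spine_pg + num_leaf_pg
    return {
        "routers_per_group": routers_per_group,
        "total_routers": num_groups * routers_per_group,
        "total_terminals": num_groups * num_leaf_pg * num_terminal_per_leaf,
        # ASSUMPTION: spine ports not used for leaves are free for global links.
        "global_ports_per_spine": router_radix - num_leaf_pg,
    }
```

For example, `dfp_stats(5, 4, 4, 8, 4)` describes 5 groups of 8 routers each, for 40 routers and 80 terminals in total.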
There is also an older script (Version 1), which is not as full-featured and lacks the software structure needed for the more complex DFP topologies that Version 2 was explicitly designed to allow.
Dragonfly+ follows the same network configuration procedure as other CODES model-net models. Most of the settings are identical to those of Dragonfly-Custom; the differences lie in how the topologies themselves are described.
Dragonfly+ requires that these parameters be defined in the configuration file:
- num_router_spine="" == The number of spine routers per group
- num_router_leaf="" == The number of leaf routers per group
Other Dragonfly+-specific parameters exist in the design but have no corresponding implementation yet. Everything else follows the Dragonfly-Custom configuration file schema.
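A quick pre-flight check for the two required parameters can be sketched as follows. It assumes the `key="value";` entry syntax used by CODES configuration files and scans the text with a regular expression rather than the real CODES parser. The sample values and the `missing_dfp_params` helper are made up for illustration.

```python
import re

REQUIRED = ("num_router_spine", "num_router_leaf")
# ASSUMPTION: entries are written as key="value"; as in other CODES configs.
ENTRY = re.compile(r'(\w+)\s*=\s*"([^"]*)"\s*;')


def missing_dfp_params(config_text):
    """Return required Dragonfly+ keys that are absent or not positive integers."""
    values = dict(ENTRY.findall(config_text))
    return [k for k in REQUIRED
            if not values.get(k, "").isdigit() or int(values[k]) <= 0]


# Hypothetical fragment; a real configuration file contains many more settings.
sample = '''
PARAMS
{
   num_router_spine="4";
   num_router_leaf="4";
}
'''
print(missing_dfp_params(sample))  # prints []
```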
- routing="" == The routing algorithm to be used by the routers
This parameter currently has four implemented options:
- "minimal" == Always routes packets on the minimal route
- "non-minimal-spine" == Always routes packets to a spine in an intermediate group*
- "non-minimal-leaf" == Always routes packets to a leaf in an intermediate group
- "on-the-fly-adaptive" == Progressive adaptive routing of packets between minimal and non-minimal routes
*Note: The non-minimal-spine routing algorithm is not guaranteed to work for all topologies. Unless every spine router has a connection to every other group, there may be no non-minimal-spine route between two terminals, and the simulation will fail. Adaptive routing does not have this failure mode, as it progressively determines where to route the packet based on the available
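Whether a topology is exposed to this failure can be checked ahead of time from its global links. The sketch below reports every spine router that lacks a direct link to some other group. It assumes group-major router numbering and treats each pair as a bidirectional link. Since this page does not specify how leaves and spines are ordered within a group, the set of local IDs that are spines is passed in as an argument. `spines_missing_groups` is a hypothetical helper, not part of CODES.

```python
def spines_missing_groups(inter_pairs, num_groups, routers_per_group, spine_local_ids):
    """Map each spine's global ID to the set of other groups it has no direct link to."""
    reached = {}
    for a, b in inter_pairs:
        # ASSUMPTION: each stored pair is a bidirectional link.
        for src, dst in ((a, b), (b, a)):
            reached.setdefault(src, set()).add(dst // routers_per_group)
    missing = {}
    for group in range(num_groups):
        for local in sorted(spine_local_ids):
            rid = group * routers_per_group + local  # ASSUMPTION: group-major IDs
            absent = set(range(num_groups)) - {group} - reached.get(rid, set())
            if absent:
                missing[rid] = absent
    return missing
```

An empty dictionary means every spine has a direct link to every other group; any entry identifies a spine whose missing groups could leave some terminal pairs without a non-minimal-spine route.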
CODES model-net models generally need a workload of servers to send and receive packets throughout the network. The src/network-workloads directory contains model-net-synthetic-dfly-plus.c, the source file for the currently implemented synthetic workload. It is a direct port from Dragonfly-Custom with only minor differences in configuration parsing.
If you've followed the first three steps, all that is left is to run the simulation.
While in the build directory, executing the command below will start the simulation with the supplied configuration:
mpirun -n 4 ./bin/model-net-synthetic-dfly-plus --synch=3 -- ../src/network-workloads/conf/dragonfly-plus/modelnet-test-dragonfly-plus.conf