In tutorial-2a, we described how to construct the host code `test.cpp` file for configuring, running, and testing our AI Engine design. This tutorial focuses on how to run a hybrid software simulation of our design with the Vitis `aiesimulator`.
Before we dive into the simulation aspects, we introduce the mlir-aie
device configuration operation:
```
module @module_name {
  AIE.device(target_device) {
    ...
    AI Engine array components and connections
    ...
  }
}
```
This operation specifies which particular device is targeted by the design. This information is necessary because different AIE devices have different hardware architectures, which influence some of the lower-level mappings and library calls when running designs, both in simulation and on hardware.
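As a concrete illustration, a device operation targeting the VC1902 might look like the following minimal sketch; the module name and tile placement here are hypothetical, not taken from this tutorial's design:

```mlir
module @example {
  AIE.device(xcvc1902) {
    // One compute tile at column 1, row 4 (hypothetical placement)
    %tile14 = AIE.tile(1, 4)
  }
}
```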
This operation can be added explicitly, as can be seen in the MLIR source code (`aie.mlir`). Alternatively, `aiecc.py` adds a default device target as part of its build flow if no target was specified in the code. The supported device targets are described in `AIETargetModel.h`, with the default target being the `VC1902TargetModel`.
While single kernel simulation is important and is covered in tutorial-9, here we focus on simulating our entire AI Engine system design, where individual tiles run a cycle-accurate simulation and communication between tiles is simulated at the transaction level.
By default, `aiecc.py` generates the `sim` directory inside `aie.mlir.prj/`, which contains the stub and configuration files necessary for integrating with `aiesimulator`. These configuration files are specific to the MLIR-AIE design, so any change to the design requires a recompile by `aiecc.py` to update the files under `sim`. The `sim` directory also contains a wrapper for our top level host code source (`test.cpp`) to run on the simulation host controller. To compile this wrapper and run the simulator, you can simply call:
```
make -C {path to}/sim
```
The wrapper and necessary host API source files (`test.cpp`, `test_library.cpp`) are then compiled to generate the top level simulator library (`ps.po`), and `aiesimulator` is invoked automatically. The command to invoke `aiesimulator` directly from outside the `sim` folder is:
```
aiesimulator --pkg-dir={path to}/sim --dump-vcd foo
```
This command points the simulator at the local `sim` directory and writes the simulation waveform VCD to `foo.vcd`. The simulation results are printed to the terminal as applicable.
Because `aiesimulator` is set to run a cycle-accurate simulation of each tile, the simulation of our small tutorial design takes only a few minutes to complete on modern machines. However, much larger designs that involve many tiles may take considerably longer. A more effective strategy for such designs is to simulate a subset of the design in `aiesimulator` and then run the full design directly on hardware.
It is important to note the differences between host code timing and AIE timing in simulation as compared to hardware. On hardware, the AI Engines complete their operation extremely quickly (in real time), while host code commands take much longer (as measured in AI Engine program cycles). As a result, a host command like `usleep` can be enough to wait for a program to be done, since AIE operations complete very quickly. In simulation, the opposite is true: the cycle-accurate AIE simulator takes more time than the host program (in real time), which is why we use host API functions like `mlir_aie_acquire_lock` to synchronize AIE simulation and host code timing. These differences can also come up in unique ways, which we will highlight in later tutorials.
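To make the distinction concrete, here is a minimal sketch of the two wait styles in host code. The tile coordinates (1,4), lock ID 0, lock value 1, and timeout are illustrative assumptions, not taken from this tutorial's design:

```c++
#include <cstdio>
#include "test_library.h" // host API used throughout these tutorials

void wait_for_kernel(aie_libxaie_ctx_t *_xaie) {
  // On hardware, a coarse wait such as usleep(1000) can be enough,
  // since the AI Engines finish long before the host resumes.

  // In simulation (and as a portable pattern), block on the lock that
  // the kernel releases when it finishes. Arguments are assumptions:
  // tile (1,4), lock 0, expected value 1, timeout.
  if (mlir_aie_acquire_lock(_xaie, 1, 4, 0, 1, 10000))
    printf("Kernel done: lock 0 acquired.\n");
  else
    printf("Timed out waiting for lock 0.\n");
}
```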
NOTE: The simulator cleanup process can take a few minutes, but once you see the terminal message `"Info: /OSCI/SystemC: Simulation stopped by user"`, you can exit the simulator by pressing `Ctrl-C`.
Integration with `aiesimulator` is supported with the same host API calls in the host program, both in simulation and on hardware. In most cases, the API calls abstract away the differences in the underlying function calls depending on whether you are compiling for simulation. This is gated by the `__AIESIM__` define. However, at the moment, in order to support shim DMA in simulation, a customization is needed in the host code as shown below:
```c++
#if defined(__AIESIM__)
// Simulation: pass the physical address of each allocated buffer
mlir_aie_external_set_addr_ddr_test_buffer_in((u64)((_xaie->buffers[0])->physicalAddr));
mlir_aie_external_set_addr_ddr_test_buffer_out((u64)((_xaie->buffers[1])->physicalAddr));
#else
// Hardware: pass the virtual address returned by the allocator
mlir_aie_external_set_addr_ddr_test_buffer_in((u64)mem_ptr_in);
mlir_aie_external_set_addr_ddr_test_buffer_out((u64)mem_ptr_out);
#endif
```
In simulation, the transaction model requires the allocated physical address rather than the virtual address used in hardware. This may be better abstracted in future releases but is currently necessary for simulation with shim DMAs. More details about shim DMAs can be found in tutorial-5.
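For context, these address-setting calls typically follow buffer allocation through the host API. Below is a minimal sketch assuming the `mlir_aie_init_mems` and `mlir_aie_mem_alloc` helpers from the tutorials' `test_library`; their exact signatures may vary across releases, so treat the allocation lines as assumptions:

```c++
// Sketch only: allocation helper names/signatures assumed from test_library.
#define BUF_SIZE 256
mlir_aie_init_mems(_xaie, 2);                              // track 2 external buffers
int *mem_ptr_in  = mlir_aie_mem_alloc(_xaie, 0, BUF_SIZE); // fills _xaie->buffers[0]
int *mem_ptr_out = mlir_aie_mem_alloc(_xaie, 1, BUF_SIZE); // fills _xaie->buffers[1]

#if defined(__AIESIM__)
// Simulation: the transaction model needs the physical addresses
mlir_aie_external_set_addr_ddr_test_buffer_in((u64)((_xaie->buffers[0])->physicalAddr));
mlir_aie_external_set_addr_ddr_test_buffer_out((u64)((_xaie->buffers[1])->physicalAddr));
#else
// Hardware: the virtual addresses returned by the allocator are used
mlir_aie_external_set_addr_ddr_test_buffer_in((u64)mem_ptr_in);
mlir_aie_external_set_addr_ddr_test_buffer_out((u64)mem_ptr_out);
#endif
```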
- Compile the design, then compile the simulation wrapper and run `aiesimulator`:

  ```
  make; make -C aie.mlir.prj/sim
  ```

  You should see the simulator print a number of status messages before finally running the host code and outputting the `PASS` message.
- Modify the host code `test.cpp` to add a `mlir_aie_print_tile_status` call for tile (1,4) and rerun the simulator; a sketch of the call follows this list. Did we need to recompile `aie.mlir` and regenerate the core .elf files when we modified the host API?
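For reference, here is a minimal sketch of the added status call, assuming the `mlir_aie_print_tile_status` helper from `test_library.cpp` takes the context followed by the column and row:

```c++
// Sketch: dump the status registers of tile (1,4) from the host code
mlir_aie_print_tile_status(_xaie, 1, 4);
```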
As mentioned earlier, the Makefile for running `aiesimulator` also generates a VCD file for viewing the resulting transaction level output waveforms. This can be viewed in a VCD viewer like `gtkwave`.
- Run `gtkwave` on the generated VCD waveform:

  ```
  gtkwave foo.vcd
  ```

  `gtkwave` is like most standard waveform viewers and allows you to select nodes, zoom in and out, and view the simulation. Selecting some notable signals would give a waveform image like:
Solution:

```
gtkwave ./answers/tutorial-2b.gtkw
```
Here, we see a few notable simulation timings. First, lock 0 is released during initialization (during the time when all locks are released in the 0 state). Then the tile is activated when `mlir_aie_start_cores` is asserted. The program on the core begins to run for a number of cycles, including startup code. After some cycles, the lock 0 acquire is asserted, signaling the beginning of our main kernel code. Then, after some more time, the lock 0 release is asserted, indicating the end of our kernel code. The final lock 0 acquire is asserted by the host API (`test.cpp`) so it knows when the kernel program is done.
The next tutorial-2c walks us through running our design on hardware and measuring performance.