JCAP Log #3: Propeller 1
The Propeller 1 (officially known as the P8X32A) is an impressive 32-bit, 8-core RISC microcontroller programmed using either a high-level proprietary language called Spin or a form of assembly called PASM. A wide array of standard programming languages can also be compiled for the Propeller, most notably C and BASIC. Several other projects that use the Propeller as the foundation for a video game system (such as the HYDRA Game Development Kit) have showcased the microcontroller's abilities and shown it to be a good fit for this project.
The P8X32A implements round-robin, exclusive access to shared resources via a rotating "hub", which switches shared resource access among the 8 individual processors, called "cogs":
Propeller 1 Block Diagram
Each cog operates independently of the rest, meaning different code can execute on different cogs in parallel. All cogs share the 32 GPIO pins, and the Propeller uses a simple scheme to decide which cogs control which pins: each cog has its own direction and output registers, and the per-cog values are ORed together, so a pin is an output if any cog makes it one and is driven high if any such cog sets it high. Cogs can "communicate" through the shared memory in the hub, storing and loading data during their memory access window. This window is the only time a cog has access to the hub, and each cog gets it for 2 cog clock cycles (or 1 hub clock cycle, as the hub runs half as fast as the cogs) every 16 clocks. Executing a hub access instruction causes the cog to wait for its access window before proceeding, with each hub access instruction taking 8 clock cycles on the cog once the window is open. This means that code can be organized to optimize hub access.
Round-Robin Hub Access
Once the first hub access instruction is executed, the cog is "synced" with its hub access window. If each access takes 8 clock cycles, and the window occurs every 16 clock cycles, then 8 clock cycles remain for other instructions before the next window. Since most non-hub-access instructions take 4 clock cycles, you can execute two of them between hub access instructions, and each hub access instruction will then execute just as the access window opens. The cog never has to stall waiting for its window. The performance gain is relatively small, but the timing of the code becomes much more deterministic.
Best/Worst-Case Hub Access Waveforms
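The timing argument above can be checked with a small host-side model in C. This is only a sketch under simplifying assumptions: the cog's window is taken to open whenever the clock count is a multiple of 16, a hub instruction costs 8 clocks once the window is open, and every other instruction costs 4 clocks. The names `hub_stall_clocks` and `loop_clocks` are illustrative and not part of any Propeller toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified timing model; every name here is illustrative. */
#define HUB_PERIOD_CLOCKS 16u /* a cog's window recurs every 16 clocks */
#define HUB_OP_CLOCKS      8u /* hub instruction cost once the window is open */
#define REG_OP_CLOCKS      4u /* typical non-hub instruction */

/* Clocks a cog stalls if it reaches a hub instruction at time `now`,
 * assuming its window opens whenever now % 16 == 0. */
uint32_t hub_stall_clocks(uint32_t now)
{
    uint32_t phase = now % HUB_PERIOD_CLOCKS;
    return phase == 0u ? 0u : HUB_PERIOD_CLOCKS - phase;
}

/* Clocks for one pass of a synced loop: one hub instruction, then
 * `regular_ops` 4-clock instructions, then any stall before the
 * next hub instruction can start. */
uint32_t loop_clocks(uint32_t regular_ops)
{
    /* Guard against overflow in the arithmetic below. */
    assert(regular_ops <= (UINT32_MAX - HUB_OP_CLOCKS - HUB_PERIOD_CLOCKS) / REG_OP_CLOCKS);
    uint32_t busy = HUB_OP_CLOCKS + regular_ops * REG_OP_CLOCKS;
    return busy + hub_stall_clocks(busy);
}
```

In this model, `loop_clocks(2)` is 16 with no stall, while `loop_clocks(3)` is 32: the third instruction makes the cog miss its window and wait 12 clocks for the next one.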
PASM code is loaded into a cog via the Spin cognew command with an argument pointing to the starting address of the code. With a 32-bit architecture, the memory space is divided into 32-bit longs, 16-bit words, and 8-bit bytes. Each cog has 2 KB of RAM (512 longs): 496 general purpose registers (which hold executable instructions and data) and 16 special purpose registers (the control registers for the counters, video generator, GPIO pins, and other features). With each instruction being one long, you can load a maximum of 496 instructions into a cog, minus however much cog RAM you declare for variables. This does constrain the size of your code somewhat, but for most applications, almost 500 instructions is more than sufficient. Additionally, specialized compilers have been developed that implement a "large memory model" or LMM on the Propeller 1, in which a small kernel running in the cog fetches instructions from hub RAM, allowing a single program to be larger than a cog's RAM.
Cog Memory Space
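The cog memory arithmetic above can be written down as a short C sketch. The constants come from the figures in the text; the function name `cog_instruction_slots` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define COG_RAM_LONGS     512u /* 2 KB of cog RAM / 4 bytes per long */
#define COG_SPECIAL_LONGS  16u /* special purpose registers */
#define COG_GENERAL_LONGS (COG_RAM_LONGS - COG_SPECIAL_LONGS) /* 496 */

/* Instruction slots left in one cog after reserving `variable_longs`
 * general purpose registers for data. */
uint32_t cog_instruction_slots(uint32_t variable_longs)
{
    assert(variable_longs <= COG_GENERAL_LONGS);
    return COG_GENERAL_LONGS - variable_longs;
}
```

For example, a driver that reserves 32 longs for variables leaves `cog_instruction_slots(32)`, or 464, longs for instructions.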
The hub has 64 KB of memory, divided into 32 KB of RAM and 32 KB of ROM. The ROM contains the boot loader and Spin interpreter, several tables useful for performing mathematical functions, and an extensive character set useful for displaying basic text and symbols on the screen. The hub RAM contains the code to be loaded into the cogs, as well as any memory to be accessed from and shared by the cogs. So if you filled all of the cogs to the brim with unique sets of instructions and/or variables, you would consume 8 cogs x 2 KB = 16 KB of hub RAM, leaving the other 16 KB for shared memory.
Hub Memory Space
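The hub RAM budget described above can be sketched the same way. This assumes, as the text does, that each cog image occupies at most 2 KB of hub RAM, and it ignores any space taken by Spin or C code; `hub_shared_bytes` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HUB_RAM_BYTES       32768u /* 32 KB of hub RAM */
#define COG_COUNT               8u
#define COG_IMAGE_BYTES_MAX  2048u /* at most 2 KB per cog image */

/* Hub RAM left for shared data after storing `cogs` cog images whose
 * sizes in bytes are given in `image_bytes`. */
uint32_t hub_shared_bytes(const uint32_t image_bytes[], size_t cogs)
{
    assert(cogs <= COG_COUNT);
    uint32_t used = 0u;
    for (size_t i = 0; i < cogs; ++i) {
        assert(image_bytes[i] <= COG_IMAGE_BYTES_MAX);
        used += image_bytes[i];
    }
    return HUB_RAM_BYTES - used;
}
```

With eight full 2048-byte images the function returns 16384, matching the 16 KB of shared memory worked out above.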
The most significant element of the P8X32A for the purposes of JCAP is its implementation of video generation hardware within each cog. The video generators are actually serializing circuits, which take data and shift it out at a configurable rate much faster than the GPIO pins on the Propeller could be manually controlled with instructions. They work in tandem with the counter modules also present on each cog to create not just video but also other high-entropy signals such as radio and sound. Using a video generator shifts the focus from toggling pins by hand to simply supplying the data to output, which reduces development time and gives confidence that the hardware can generate the video and other signals necessary for JCAP.
Possible Video Circuit
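To make the "serializing circuit" idea concrete, here is a minimal software model in C of what such hardware does: it takes one 32-bit long of pixel data and emits one color value per bit. The two-color mapping and the least-significant-bit-first order are assumptions of this sketch, and `serialize_two_color` is an illustrative name rather than a real Propeller API. The real hardware does this at a programmable rate without consuming cog instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of a two-color serializer: shift `count` pixels out of
 * `pixels`, least significant bit first, writing `color1` for a 1 bit and
 * `color0` for a 0 bit. Returns the number of values written to `out`. */
size_t serialize_two_color(uint32_t pixels, uint8_t color0, uint8_t color1,
                           uint8_t *out, size_t count)
{
    assert(out != NULL && count <= 32u);
    for (size_t i = 0; i < count; ++i) {
        out[i] = (pixels & 1u) ? color1 : color0;
        pixels >>= 1; /* advance to the next pixel bit */
    }
    return count;
}
```

Feeding the pixel pattern `0x5` (binary 0101) with colors `0x02` and `0x07` for four pixels yields `0x07, 0x02, 0x07, 0x02`.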
A final non-critical but extremely useful aspect of the P8X32A is the fact that its entire design is provided by Parallax, free of charge, in an HDL format which can be loaded onto DE0 FPGA development boards! This means the Propeller 1 can be simulated for development and testing without having to purchase a P8X32A and build it into a circuit. The only drawback is that, in the DE0-Nano-compatible version, the ROM character definitions are removed so that the design fits. Even more importantly, IC components can be defined in HDL as well and connected within the design itself, a benefit which will come in handy when developing subsystems that rely on inputs from such components.
DE0-Nano FPGA Development Board
With its respectable amount of internal memory, multi-core architecture, low cost, simulatability, and most importantly, built-in video generation hardware, the Propeller 1 is a virtually ideal microcontroller to serve as the foundation for the project.
The drivers for the low-level subsystems such as video, sound, and input will be written in PASM, the Propeller's RISC assembly language, because they need finer and more deterministic control over the microcontroller. The game logic itself will be developed in C (Spin is also an option, but C is far more widely known). The game logic will be connected to the PASM subsystems via simple Spin connective tissue, which for the most part just instantiates variables and loads PASM routines into the cogs. This paradigm may be refactored to eliminate the need for Spin entirely down the road, but that's a bridge to cross when we get there, and optimization should never precede functionality.