This is an implementation of BabelStream in Julia which contains the following variants:
PlainStream.jl
- Single threadedfor
ThreadedStream.jl
- Threaded implementation withThreads.@threads
macrosDistributedStream.jl
- Process based parallelism with@distributed
macrosCUDAStream.jl
- Direct port of BabelStream's native CUDA implementation using CUDA.jlAMDGPUStream.jl
- Direct port of BabelStream's native HIP implementation using AMDGPU.jloneAPIStream.jl
- Direct port of BabelStream's native SYCL implementation using oneAPI.jlKernelAbstractions.jl
- Direct port of miniBUDE's native CUDA implementation using KernelAbstractions.jl
Prerequisites
- Julia >= 1.6+
A set of reduced dependency projects are available for the following backend and implementations:
AMDGPU
supports:AMDGPUStream.jl
CUDA
supports:CUDAStream.jl
oneAPI
supports:oneAPIStream.jl
KernelAbstractions
supports:KernelAbstractionsStream.jl
Threaded
supports:PlainStream.jl
ThreadedStream.jl
DistributedStream.jl
With Julia on path, run your selected benchmark with:
> cd JuliaStream.jl
> julia --project=<BACKEND> -e 'import Pkg; Pkg.instantiate()' # only required on first run
> julia --project=<BACKEND> src/<IMPL>Stream.jl
For example. to run the CUDA implementation:
> cd JuliaStream.jl
> julia --project=CUDA -e 'import Pkg; Pkg.instantiate()'
> julia --project=CUDA src/CUDAStream.jl
Important:
- Julia is 1-indexed, so N >= 1 in
--device N
. - Thread count for
ThreadedStream
must be set via theJULIA_NUM_THREADS
environment variable (e.gexport JULIA_NUM_THREADS=$(nproc)
) otherwise it defaults to 1. - Worker count for
DistributedStream
is set with-p <N>
as per the documentation. - Certain implementations such as CUDA and AMDGPU will do hardware detection at runtime and may download and/or compile further software packages for the platform.
Alternatively, the top-level project Project.toml
contains all dependencies needed to run all implementations in src
.
There may be instances where some packages are locked to an older version because of transitive dependency requirements.
To run the benchmark using the top-level project, run the benchmark with:
> cd JuliaStream.jl
> julia --project -e 'import Pkg; Pkg.instantiate()'
> julia --project src/<IMPL>Stream.jl