Benchmark OMEinsum.jl and pytorch tensor network contraction backends

TensorBFS/TensorNetworkBenchmarks


Benchmarking tensor network contractions

Device information:

  1. NVIDIA A100-PCIE 80G with NVIDIA Driver Version 470.82.01 and CUDA Version 11.4
  2. NVIDIA V100-SXM2 16G with NVIDIA Driver Version 470.63.01 and CUDA Version 11.4
  3. Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

The contraction order in scripts/tensornetwork.json was generated by the following code (you do not need to run this, because the contraction order is already included in the scripts folder):

julia> using OMEinsum, OMEinsumContractionOrders, Graphs

julia> function random_regular_eincode(n, k)
           g = Graphs.random_regular_graph(n, k)
           ixs = [minmax(e.src, e.dst) for e in Graphs.edges(g)]
           return EinCode((ixs..., [(i,) for i in Graphs.vertices(g)]...), ())
       end
random_regular_eincode (generic function with 1 method)

julia> code = random_regular_eincode(220, 3);

julia> optcode_tree = optimize_code(code, uniformsize(code, 2),
           TreeSA(sc_target=29, βs=0.1:0.1:20, ntrials=5, niters=30, sc_weight=2.0));

julia> timespace_complexity(optcode_tree, uniformsize(code, 2))
(33.17598124621909, 28.0)

julia> writejson("tensornetwork_permutation_optimized.json", optcode_tree)
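For readers unfamiliar with these numbers: timespace_complexity reports base-2 logarithms, so the pair above means roughly 2^33.18 scalar operations and a largest intermediate tensor of 2^28 elements. A quick back-of-the-envelope check (plain Python written for this README, not part of the benchmark scripts):

```python
# Interpret the (time, space) pair reported by timespace_complexity above.
# Both values are log2-scaled; this snippet is illustrative only.
tc, sc = 33.17598124621909, 28.0

ops = 2 ** tc                          # ~9.7e9 scalar multiply-adds
largest_tensor_elems = 2 ** sc        # 268,435,456 elements
bytes_f64 = largest_tensor_elems * 8  # Float64 storage

print(f"~{ops:.2e} scalar ops")
print(f"largest tensor: 2^28 elements = {bytes_f64 / 2**30:.0f} GiB in Float64")
```

So even the best contraction tree found here needs a ~2 GiB intermediate in double precision, which is why sc_target=29 and the GPU memory sizes above matter.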

Setup

  • Install pytorch.
  • Install Julia and related packages by typing
$ cd scripts
$ julia --project -e 'using Pkg; Pkg.instantiate()'

(NOTE: if you want to update your local environment, just run julia --project -e 'using Pkg; Pkg.update()')

pytorch

$ cd scripts
$ python benchmark_pytorch.py gpu
$ python benchmark_pytorch.py cpu

Timing

  • on A100, the minimum time is ~0.12s; 10 executions take ~1.35s
  • on V100, the minimum time is ~0.12s; 10 executions take ~1.76s
  • on CPU (MKL backend, single thread), it takes 38.87s (maybe MKL is not configured properly?)
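The numbers above are the best single run and the total over 10 runs. A minimal sketch of that methodology in plain Python, with numpy.einsum standing in for the torch backend (the actual measurement lives in scripts/benchmark_pytorch.py; for GPU timing one would additionally synchronize the device before stopping the clock):

```python
import time
import numpy as np

def benchmark(f, repeats=10):
    """Return (best single run, total over `repeats` runs) in seconds."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        f()
        times.append(time.perf_counter() - t0)
    return min(times), sum(times)

# Stand-in workload: a small einsum contraction instead of the full network.
a, b = np.random.rand(256, 256), np.random.rand(256, 256)
best, total = benchmark(lambda: np.einsum("ij,jk->ik", a, b))
print(f"min: {best:.6f}s, total over 10 runs: {total:.6f}s")
```

Reporting the minimum filters out one-off noise (page faults, frequency scaling), while the 10-run total shows sustained throughput.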

OMEinsum.jl

$ cd scripts
$ JULIA_NUM_THREADS=1 julia --project benchmark_OMEinsum.jl gpu
$ JULIA_NUM_THREADS=1 julia --project benchmark_OMEinsum.jl cpu

Timing

  • on A100, the minimum time is ~0.16s; 10 executions take ~2.25s
  • on V100, the minimum time is ~0.13s; 10 executions take ~1.39s
  • on CPU (MKL backend, single thread), it takes 23.05s

Note: The Julia garbage collection time is excluded from these measurements.
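In Julia, @timed reports GC time separately from the elapsed time, which makes this subtraction straightforward. An analogous precaution on the Python side would be to silence the cyclic collector during the timed region, sketched below (a generic illustration, not code from scripts/benchmark_pytorch.py):

```python
import gc
import time

gc.disable()            # suspend the cyclic garbage collector during timing
t0 = time.perf_counter()
total = sum(i * i for i in range(100_000))  # stand-in workload
elapsed = time.perf_counter() - t0
gc.enable()             # restore normal collection afterwards

print(f"workload result: {total}, elapsed: {elapsed:.6f}s")
```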

Notes

The Python scripts were contributed by @Fanerst. Other people in the discussion also provided helpful advice; please check the original post.
