Add a section on performance evaluation
mroavi committed Sep 11, 2023
1 parent fc447dc commit 82e4877
Showing 19 changed files with 2,716 additions and 1 deletion.
37 changes: 36 additions & 1 deletion paper/paper.bib
@@ -305,4 +305,39 @@
@software{Jutho2023
version = {v4.0.0},
doi = {10.5281/zenodo.8166121},
url = {https://doi.org/10.5281/zenodo.8166121}
}

@online{marinescu2022merlin,
  author  = {Radu Marinescu},
  title   = {Merlin},
  year    = {2022},
  url     = {https://www.ibm.com/opensource/open/projects/merlin/},
  urldate = {2023-09-11}
}

@article{mooij2010libdai,
  author  = {Joris M. Mooij},
  title   = {lib{DAI}: A Free and Open Source {C++} Library for Discrete Approximate Inference in Graphical Models},
  journal = {Journal of Machine Learning Research},
  year    = {2010},
  month   = aug,
  volume  = {11},
  pages   = {2169--2173},
  url     = {http://www.jmlr.org/papers/volume11/mooij10a/mooij10a.pdf}
}

@online{gal2010summary,
  author  = {Elidan, Gal and Globerson, Amir},
  title   = {Summary of the 2010 {UAI} approximate inference challenge},
  year    = {2010},
  url     = {https://www.cs.huji.ac.il/project/UAI10/summary.php},
  urldate = {2023-09-11}
}

@online{gogate2014uai,
  author  = {Gogate, Vibhav},
  title   = {{UAI} 2014 {Probabilistic} {Inference} {Competition}},
  year    = {2014},
  url     = {https://www.ics.uci.edu/~dechter/softwares/benchmarks/Uai14/UAI_2014_Inference_Competition.pdf},
  urldate = {2023-09-11}
}
34 changes: 34 additions & 0 deletions paper/paper.md
@@ -129,6 +129,40 @@
networks. By harnessing the best of both worlds, `TensorInference.jl` aims to
enhance the performance of probabilistic inference, thereby expanding the
tractability spectrum of exact inference for more complex, real-world models.

# Performance evaluation

\autoref{fig:performance-evaluation} compares the runtime performance of
`TensorInference.jl` against the `Merlin` [@marinescu2022merlin], `libDAI`
[@mooij2010libdai], and `JunctionTrees.jl` [@roa2022partial;@roa2023scaling]
libraries. We selected `Merlin` and `libDAI` based on the following criteria:
open-source availability, extensive documentation, and representation of
standard practices in the field. Both libraries have previously participated
in UAI inference competitions [@gal2010summary;@gogate2014uai], achieving
favorable results. Additionally, we included two versions of
`JunctionTrees.jl`, the predecessor of `TensorInference.jl`: the first does
not employ tensor technology, while the second optimizes individual
sum-product computations using tensor-based technology.
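
The tensor-based sum-product idea can be illustrated with a generic sketch (not code from either library): a sum-product over a shared variable is exactly a tensor contraction, which tensor libraries dispatch to optimized BLAS routines instead of interpreted loops.

```python
import numpy as np

# Two factors of a toy graphical model: phi1(a, b) and phi2(b, c)
phi1 = np.random.rand(2, 3)
phi2 = np.random.rand(3, 4)

# Sum-product: multiply the factors and sum (marginalize) over b.
# As explicit nested loops:
naive = np.zeros((2, 4))
for a in range(2):
    for c in range(4):
        for b in range(3):
            naive[a, c] += phi1[a, b] * phi2[b, c]

# As a single tensor contraction (the operation tensor-based code
# hands off to optimized linear-algebra kernels):
contracted = np.einsum("ab,bc->ac", phi1, phi2)

print(np.allclose(naive, contracted))  # True
```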

The benchmark problems are arranged along the x-axis in ascending order of
complexity, measured by the induced tree width. On average (geometric mean),
`TensorInference.jl` achieves a speedup of 11 times across all problems.
Notably, for the 10 most complex problems, the average speedup increases to 63
times, highlighting its superior scalability. Note that `TensorInference.jl`
incurs a computational overhead that may slow down probabilistic inference
relative to the other libraries when the problem's complexity is low. As the
problem complexity increases, however, this overhead becomes negligible, and
our method often delivers performance improvements of several orders of
magnitude.
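
The reported averages are geometric means of per-problem speedups (baseline runtime divided by `TensorInference.jl` runtime). A minimal sketch of that computation, using hypothetical runtimes rather than the actual benchmark data:

```python
from math import prod

def geomean(xs):
    """Geometric mean: the n-th root of the product of n values."""
    return prod(xs) ** (1.0 / len(xs))

# Hypothetical per-problem runtimes in seconds (baseline vs. TensorInference.jl)
baseline = [12.0, 150.0, 900.0]
tensor_inference = [6.0, 10.0, 30.0]

speedups = [b / t for b, t in zip(baseline, tensor_inference)]
print(geomean(speedups))  # geometric mean of [2.0, 15.0, 30.0]
```

The geometric mean is the natural average for ratios: a 2x speedup and a 0.5x slowdown correctly average to 1x, whereas the arithmetic mean would report 1.25x.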

![Speedup achieved by `TensorInference.jl`, relative to `Merlin`
[@marinescu2022merlin], `libDAI` [@mooij2010libdai], and `JunctionTrees.jl`
[@roa2022partial;@roa2023scaling] for the UAI 2014 inference competition
benchmark problems. The experiments were conducted on an Intel Core i9-9900K
CPU \@3.60GHz with 64 GB of RAM. \label{fig:performance-evaluation}
](scripts/performance-evaluation/out/co23/2023-09-10--20-20-45/performance-evaluation.svg){width=80%}

# Usage example

The graph below corresponds to the *ASIA network* [@lauritzen1988local], a
Binary file modified paper/paper.pdf
Binary file not shown.
13 changes: 13 additions & 0 deletions paper/scripts/performance-evaluation/Artifacts.toml
@@ -0,0 +1,13 @@
[uai2014]
git-tree-sha1 = "199ed43697fe22447c6c64a939b222fd4073f2d0"

[[uai2014.download]]
sha256 = "5d93ced227cff3eb2ae7feb77dcb6c780212b47a0c0355dda8439de6f5b9d369"
url = "https://github.com/mroavi/uai-2014-inference-competition/raw/main/uai2014.tar.gz"

[uai2014-mar]
git-tree-sha1 = "480aabc22378f9edaa9cd24798de9f416c7d1a49"

[[uai2014-mar.download]]
sha256 = "dd2265fe93eac73a3430f1d98bcd13162ca079f3a9cf7fa529b9c39c4534e671"
url = "https://gist.github.com/mroavi/8d38625bd8731cefc6788b941256cab3/raw/480aabc22378f9edaa9cd24798de9f416c7d1a49.tar.gz"
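
Each `download` stanza pairs a URL with a SHA-256 checksum that the artifact system verifies after download. The same integrity check can be sketched in a few lines (the `verify_sha256` helper is illustrative, not part of any library):

```python
import hashlib

def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `data` matches the expected hex string."""
    return hashlib.sha256(data).hexdigest() == expected_hex

# Illustrative check on in-memory bytes; in practice `data` would be the
# downloaded tarball's contents and `expected_hex` the sha256 from the TOML.
digest = hashlib.sha256(b"uai2014").hexdigest()
print(verify_sha256(b"uai2014", digest))  # True
```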
10 changes: 10 additions & 0 deletions paper/scripts/performance-evaluation/Project.toml
@@ -0,0 +1,10 @@
[deps]
ArgParse = "c7e460c6-2fb9-53a9-8c5b-16f535851c63"
Artifacts = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
JunctionTrees = "b732b382-80b5-46a8-aa9c-7d077ae04823"
PGFPlotsX = "8314cec4-20b6-5062-9cdb-752b83310925"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
TensorInference = "c2297e78-99bd-40ad-871d-f50e56b81012"
213 changes: 213 additions & 0 deletions paper/scripts/performance-evaluation/generate-graph.jl
@@ -0,0 +1,213 @@
# Color palettes:
# - https://github.com/matplotlib/matplotlib/issues/9460#issuecomment-875185352
# - https://personal.sron.nl/~pault/#tab:blindvision
# - https://yoshke.org/blog/colorblind-friendly-diagrams
# - https://davidmathlogic.com/colorblind
# - https://www.color-hex.com/color-palette/1018347

using PGFPlotsX, CSV, DataFrames, Artifacts, StatsBase
using JunctionTrees: get_td_soln

push!(
PGFPlotsX.CUSTOM_PREAMBLE,
"""
\\usepackage[T1]{fontenc}
\\usepackage{xcolor}
\\usepackage{tikz}
\\usepackage{pgfplots}
\\usepackage{amsmath,amssymb}
% Bright qualitative colour scheme that is colour-blind safe
% https://personal.sron.nl/~pault/#tab:blindvision
\\definecolor{c01}{HTML}{4477AA}
\\definecolor{c02}{HTML}{EE6677}
\\definecolor{c03}{HTML}{228833}
\\definecolor{c04}{HTML}{CCBB44}
\\definecolor{c05}{HTML}{66CCEE}
\\definecolor{c06}{HTML}{AA3377}
\\definecolor{c07}{HTML}{BBBBBB}
\\definecolor{c08}{HTML}{BBBBBB}
\\usepackage{fontsetup}
\\setmonofont{Hack}
"""
)

# Use the CSV file passed on the command line, or fall back to a known run for debugging
data_file = isempty(ARGS) ? "./out/co23/2023-09-10--20-20-45/out.csv" : ARGS[1]

df1 =
CSV.File(data_file) |> # read benchmark data from file
DataFrame |> # convert it to a data frame
x -> unstack(x, :problem, :library, :execution_time) |> # convert from long to wide format (create a column for each possible `library` value)
dropmissing # drop rows with missing values

df2 =
map(x -> joinpath(artifact"uai2014-mar", x * ".tamaki.td"), df1.problem) |> # create absolute filepaths for each problem in `df`
x -> get_td_soln.(x) |> # get the tree decomposition solution ([:nbags, :largest_bag_size, :nvars]) for each problem
x -> DataFrame(x=[i for i in x]) |> # create data frame using the constructor for vector of vectors
x -> select(x, :x => AsTable) |> # https://www.juliabloggers.com/handling-vectors-of-vectors-in-dataframes-jl/
x -> rename(x, [:nbags, :largest_bag_size, :nvars]) # rename columns

df =
hcat(df1, df2) |> # horizontally concatenate the two data frames
x -> sort(x, [:largest_bag_size, :nvars, :nbags]) |> # sort the rows by largest bag size, then number of variables and bags
x -> transform(x, [:libdai, :TensorInference] => (./) => :libdai_ti_speedup) |>
x -> transform(x, [:merlin, :TensorInference] => (./) => :merlin_ti_speedup) |>
x -> transform(x, [:JunctionTrees_v1, :TensorInference] => (./) => :jtv1_ti_speedup) |>
x -> transform(x, [:JunctionTrees_v2, :TensorInference] => (./) => :jtv2_ti_speedup)

labels =
df.problem |>
x -> match.(r"[a-zA-Z]+", x) |>
x -> getfield.(x, :match)

labels_unique = unique(labels) |> sort
xmax = maximum(df.largest_bag_size) + 1

@pgf tp = Axis(
{
# title="TensorInference.jl Speedup",
xmin = 0,
xmax = xmax,
xlabel = "Largest cluster size",
xmajorgrids = true,
ymin = 0,
ymax = 1000000,
ymode = "log",
ytick = [1e-3, 1e-2, 1e-1, 1, 1e1, 1e2, 1e3, 1e4, 1e5],
ymajorgrids = true,
ylabel = "Run time speedup",
label_style = {font = raw"\footnotesize"},
tick_label_style = {font = raw"\footnotesize"},
"scatter/classes" = {
# Warning: these classes must be defined in sorted order to keep the correspondence with `labels_unique`
Alchemy = {
mark = "x",
},
CSP = {
mark = "+",
},
# DBN = {
# mark = "square"
# },
Grids = {
mark = "asterisk",
},
ObjectDetection = {
mark = "-",
},
Pedigree = {
mark = "triangle",
},
Promedus = {
mark = "o",
# mark = "square",
# mark = "pentagon",
},
Segmentation = {
mark = "Mercedes star",
},
linkage = {
mark = "diamond",
# mark = "|",
},
},
legend_style = {
legend_columns = 3,
at = Coordinate(0.51, -0.4),
anchor = "south",
draw = "none",
font = raw"\footnotesize",
column_sep = 1.5,
},
},
Plot(
{
c01,
scatter,
"only marks",
"scatter src" = "explicit symbolic",
"legend image post style" = "black", "legend style" = {text = "black", font = raw"\footnotesize"},
},
Table(
{
meta = "label"
},
x=df.largest_bag_size,
y=df.libdai_ti_speedup,
label=labels,
),
),
Plot(
{
c02,
scatter,
"only marks",
"scatter src" = "explicit symbolic",
},
Table(
{
meta = "label"
},
x=df.largest_bag_size,
y=df.merlin_ti_speedup,
label=labels,
),
),
Plot(
{
c03,
scatter,
"only marks",
"scatter src" = "explicit symbolic",
},
Table(
{
meta = "label"
},
x=df.largest_bag_size,
y=df.jtv1_ti_speedup,
label=labels,
),
),
Plot(
{
c04,
scatter,
"only marks",
"scatter src" = "explicit symbolic",
},
Table(
{
meta = "label"
},
x=df.largest_bag_size,
y=df.jtv2_ti_speedup,
label=labels,
),
),
HLine({ dashed, black }, 1), # See: https://kristofferc.github.io/PGFPlotsX.jl/v1/examples/convenience/
Legend(labels_unique),
# Library legend (manually made with LaTeX code. See: https://kristofferc.github.io/PGFPlotsX.jl/v1/examples/latex/)
[raw"\node ",
{
draw = "black",
fill = "white",
font = raw"\scriptsize",
# pin = "outlier"
},
" at ",
Coordinate(5.5, 30000), # warning: hardcoded!
raw"{\shortstack[l] { $\textcolor{c01}{\blacksquare}$ libDAI \\ $\textcolor{c02}{\blacksquare}$ Merlin \\ $\textcolor{c03}{\blacksquare}$ JunctionTrees.jl-v1 \\ $\textcolor{c04}{\blacksquare}$ JunctionTrees.jl-v2}};"
]
)

println("Geometric mean of the speedup: $(geomean(vcat(df.libdai_ti_speedup, df.merlin_ti_speedup, df.jtv1_ti_speedup, df.jtv2_ti_speedup)))")
println("Geometric mean of the speedup of the last 10 problems: $(geomean(vcat(last(df.libdai_ti_speedup, 10), last(df.merlin_ti_speedup, 10), last(df.jtv1_ti_speedup, 10), last(df.jtv2_ti_speedup, 10))))")

output_file = joinpath(dirname(data_file), "performance-evaluation.svg")
pgfsave(output_file, tp; include_preamble=true, dpi=150)

# DEBUG
display(tp)
@@ -0,0 +1,2 @@
┌ Info: SubString{String}["Solving inference problem...", "Total process time: 8856.916000 ms.", "Used time: 9.85718 seconds.", ""]
└ @ Main /home/20180043/repos/Probabilistic-Inference-in-the-Era-of-Tensor-Networks-and-Differential-Programming/scripts/benchmarks/mar/ti-vs-jtv1-jtv2-vs-merlin-vs-libdai/run-libdai.jl:40