added plots to test suite. updated classification docs

JuliaTrustworthyAI · Dec 1, 2022 · ce09ef7
1 parent 767dd4f
commit ce09ef7

Showing 18 changed files with 11,013 additions and 285 deletions.
diff --git a/Project.toml b/Project.toml
@@ -4,6 +4,7 @@ authors = ["Patrick Altmeyer"]
 version = "0.1.3"
 
 [deps]
+CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
 MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
 MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
 MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"

diff --git a/_freeze/docs/src/classification/execute-results/md.json b/_freeze/docs/src/classification/execute-results/md.json
diff --git a/_freeze/docs/src/classification/figure-commonmark/cell-10-output-1.svg b/_freeze/docs/src/classification/figure-commonmark/cell-10-output-1.svg
diff --git a/_freeze/docs/src/classification/figure-commonmark/cell-9-output-1.svg b/_freeze/docs/src/classification/figure-commonmark/cell-9-output-1.svg
diff --git a/docs/Manifest.toml b/docs/Manifest.toml
diff --git a/docs/Project.toml b/docs/Project.toml
@@ -1,4 +1,5 @@
 [deps]
+CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
 ConformalPrediction = "98bfc277-1877-43dc-819b-a3e38c30242f"
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb"

diff --git a/docs/src/classification.md b/docs/src/classification.md
@@ -92,6 +92,27 @@ predict(mach, Xtest)[1]
 
     missing
 
+``` julia
+cov_ = .9
+conf_model = conformal_model(model; coverage=cov_)
+mach = machine(conf_model, X, y)
+fit!(mach, rows=train)
+Markdown.parse("""
+The following chart shows the resulting predicted probabilities for ``y=1`` (left) and set size (right) for a choice of ``(1-\\alpha)``=$cov_.
+""")
+```
+
+The following chart shows the resulting predicted probabilities for *y* = 1 (left) and set size (right) for a choice of (1−*α*)=0.9.
+
+``` julia
+using Plots
+p_proba = plot(mach.model, mach.fitresult, X, y)
+p_set_size = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)
+plot(p_proba, p_set_size, size=(800,250))
+```
+
+![](classification_files/figure-commonmark/cell-10-output-1.svg)
+
 The animation below should provide some more intuition as to what exactly is happening here. It illustrates the effect of the chosen coverage rate on the predicted softmax output and the set size in the two-dimensional feature space. Contours are overlaid with the moon data points (including test data). The two samples highlighted in red, *X*₁ and *X*₂, have been manually added for illustration purposes. Let’s look at these one by one.
 
 Firstly, note that *X*₁ (red cross) falls into a region of the domain that is characterized by high predictive uncertainty. It sits right at the bottom-right corner of our class-zero moon 🌜 (orange), a region that is almost entirely enveloped by our class-one moon 🌛 (green). For low coverage rates the prediction set for *X*₁ is empty: on the left-hand side this is indicated by the missing contour for the softmax probability; on the right-hand side we can observe that the corresponding set size is indeed zero. For high coverage rates the prediction set includes both *y* = 0 and *y* = 1, indicative of the fact that the conformal classifier is uncertain about the true label.
@@ -100,16 +121,12 @@ With respect to *X*₂, we observe that while also sitting on the fringe of our
 
 ``` julia
 Xtest_2 = (x1=[-0.5],x2=[0.25])
-cov_ = .9
-conf_model = conformal_model(model; coverage=cov_)
-mach = machine(conf_model, X, y)
-fit!(mach, rows=train)
 p̂_2 = pdf(predict(mach, Xtest_2)[1], 0)
 ```
 
 Well, for low coverage rates (roughly  \< 0.9) the conformal prediction set does not include *y* = 0: the set size is zero (right panel). Only for higher coverage rates do we have *C*(*X*₂) = {0}: the coverage rate is high enough to include *y* = 0, but the corresponding softmax probability is still fairly low. For example, for (1−*α*) = 0.9 we have *p̂*(*y*=0|*X*₂) = 0.72.
 
-These two examples illustrate an interesting point: for regions characterised by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive.
+These two examples illustrate an interesting point: for regions characterized by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive.
 
 ``` julia
 # Setup
@@ -122,12 +139,11 @@ anim = @animate for coverage in coverages
     conf_model = conformal_model(model; coverage=coverage)
     mach = machine(conf_model, X, y)
     fit!(mach, rows=train)
-    p1 = contourf_cp(mach, x1_range, x2_range; type=:proba, title="Softmax", axis=nothing)
-    scatter!(p1, X.x1, X.x2, group=y, ms=2, msw=0, alpha=0.75)
+    # Probabilities:
+    p1 = plot(mach.model, mach.fitresult, X, y)
     scatter!(p1, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6)
     scatter!(p1, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6)
-    p2 = contourf_cp(mach, x1_range, x2_range; type=:set_size, title="Set size", axis=nothing)
-    scatter!(p2, X.x1, X.x2, group=y, ms=2, msw=0, alpha=0.75)
+    p2 = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)
     scatter!(p2, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6)
     scatter!(p2, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6)
     plot(p1, p2, plot_title="(1-α)=$(round(coverage,digits=2))", size=(800,300))
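
The text in this hunk says that uncertain points get empty prediction sets at low coverage and large sets at high coverage. A minimal sketch of why, assuming a simple split-conformal rule with the score 1 − p̂(true label): this is not the ConformalPrediction.jl implementation, and the calibration values, `conformal_quantile` and `prediction_set` are hypothetical names and data introduced only for illustration.

```julia
# Hypothetical calibration data: softmax probability assigned to the true label.
p_true = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.55, 0.45, 0.3, 0.2]
scores = 1 .- p_true  # nonconformity scores (higher = worse fit)

# Conformal quantile: the k-th smallest score with k = ceil((n + 1) * coverage).
# If k exceeds n, no finite threshold gives the requested coverage.
function conformal_quantile(scores, coverage)
    n = length(scores)
    k = ceil(Int, (n + 1) * coverage)
    return k > n ? Inf : sort(scores)[k]
end

# A label enters the set when its score 1 - p does not exceed the threshold.
prediction_set(probs, q̂) = [label for (label, p) in probs if 1 - p <= q̂]

ambiguous = [0 => 0.5, 1 => 0.5]    # a point in a high-uncertainty region
confident = [0 => 0.97, 1 => 0.03]  # a point deep inside class zero

for coverage in (0.5, 0.9)
    q̂ = conformal_quantile(scores, coverage)
    println("coverage = ", coverage,
            ": ambiguous set size = ", length(prediction_set(ambiguous, q̂)),
            ", confident set size = ", length(prediction_set(confident, q̂)))
end
```

Raising the coverage raises the threshold `q̂`, so more labels pass the test. The ambiguous point goes from an empty set to a set holding both labels, while the confident point keeps a singleton set in both cases.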

diff --git a/docs/src/classification.qmd b/docs/src/classification.qmd
@@ -111,31 +111,27 @@ predict(mach, Xtest)[1]
 ```
 
 ```{julia}
-#| echo: false
-using Plots
+#| output: true
 
-function contourf_cp(mach::Machine, x1_range, x2_range; type=:set_size, kwargs...)
-    set_size = []
-    proba = []
-    for x2 in x2_range, x1 in x1_range
-        Xnew = (x1 = [x1], x2 = [x2])
-        p̂ = predict(mach, Xnew)[1]
-        # Set size:
-        z = ismissing(p̂) ? 0 : sum(pdf.(p̂, p̂.decoder.classes) .> 0)
-        push!(set_size, z)
-        # Probability:
-        p = ismissing(p̂) ? p̂ : pdf.(p̂, 1)
-        push!(proba, p)
-    end
-    if type == :set_size
-        plt = contourf(x1_range, x2_range, set_size; clim=(0,2), c=cgrad(:blues, 3, categorical = true),  kwargs...)
-    elseif type == :proba
-        plt = contourf(x1_range, x2_range, proba; c=:thermal, kwargs...)
-    end
-    return plt
-end
+cov_ = .9
+conf_model = conformal_model(model; coverage=cov_)
+mach = machine(conf_model, X, y)
+fit!(mach, rows=train)
+Markdown.parse("""
+The following chart shows the resulting predicted probabilities for ``y=1`` (left) and set size (right) for a choice of ``(1-\\alpha)``=$cov_.
+""")
 ```
 
+```{julia}
+#| output: true
+
+using Plots
+p_proba = plot(mach.model, mach.fitresult, X, y)
+p_set_size = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)
+plot(p_proba, p_set_size, size=(800,250))
+```
+
+
 The animation below should provide some more intuition as to what exactly is happening here. It illustrates the effect of the chosen coverage rate on the predicted softmax output and the set size in the two-dimensional feature space. Contours are overlaid with the moon data points (including test data). The two samples highlighted in red, $X_1$ and $X_2$, have been manually added for illustration purposes. Let's look at these one by one.
 
 Firstly, note that $X_1$ (red cross) falls into a region of the domain that is characterized by high predictive uncertainty. It sits right at the bottom-right corner of our class-zero moon 🌜 (orange), a region that is almost entirely enveloped by our class-one moon 🌛 (green). For low coverage rates the prediction set for $X_1$ is empty: on the left-hand side this is indicated by the missing contour for the softmax probability; on the right-hand side we can observe that the corresponding set size is indeed zero. For high coverage rates the prediction set includes both $y=0$ and $y=1$, indicative of the fact that the conformal classifier is uncertain about the true label.
@@ -146,10 +142,6 @@ With respect to $X_2$, we observe that while also sitting on the fringe of our c
 #| code-fold: true
 
 Xtest_2 = (x1=[-0.5],x2=[0.25])
-cov_ = .9
-conf_model = conformal_model(model; coverage=cov_)
-mach = machine(conf_model, X, y)
-fit!(mach, rows=train)
 p̂_2 = pdf(predict(mach, Xtest_2)[1], 0)
 ```
 
@@ -162,7 +154,7 @@ Well, for low coverage rates (roughly ``<0.9``) the conformal prediction set doe
 """)
 ```
 
-These two examples illustrate an interesting point: for regions characterised by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive. 
+These two examples illustrate an interesting point: for regions characterized by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive. 
 
 ```{julia}
 #| output: true
@@ -179,12 +171,11 @@ anim = @animate for coverage in coverages
     conf_model = conformal_model(model; coverage=coverage)
     mach = machine(conf_model, X, y)
     fit!(mach, rows=train)
-    p1 = contourf_cp(mach, x1_range, x2_range; type=:proba, title="Softmax", axis=nothing)
-    scatter!(p1, X.x1, X.x2, group=y, ms=2, msw=0, alpha=0.75)
+    # Probabilities:
+    p1 = plot(mach.model, mach.fitresult, X, y)
     scatter!(p1, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6)
     scatter!(p1, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6)
-    p2 = contourf_cp(mach, x1_range, x2_range; type=:set_size, title="Set size", axis=nothing)
-    scatter!(p2, X.x1, X.x2, group=y, ms=2, msw=0, alpha=0.75)
+    p2 = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)
     scatter!(p2, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6)
     scatter!(p2, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6)
     plot(p1, p2, plot_title="(1-α)=$(round(coverage,digits=2))", size=(800,300))
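
The `contourf_cp` helper removed from `classification.qmd` in this commit derived the set size by treating a `missing` prediction as an empty set and otherwise counting the classes with positive probability. A plain-Julia sketch of that counting rule follows. It assumes a prediction can be stood in for by `missing` or a vector of per-class probabilities, which is a simplification of the distribution objects the docs actually use, and `set_size` is a hypothetical name.

```julia
# Hypothetical stand-in for a set-valued prediction: `missing` for an empty set,
# otherwise per-class probabilities with zero for classes outside the set.
set_size(p̂) = ismissing(p̂) ? 0 : count(>(0), p̂)

predictions = [
    missing,       # empty prediction set
    [0.0, 0.72],   # only the second class is in the set
    [0.45, 0.55],  # both classes are in the set
]

sizes = set_size.(predictions)
println(sizes)
```

With this commit the docs obtain the same quantity from the package's plotting recipe via `plot_set_size=true`, so the helper is no longer needed there.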