diff --git a/Project.toml b/Project.toml index 97640a6..4b58b20 100644 --- a/Project.toml +++ b/Project.toml @@ -4,9 +4,11 @@ authors = ["Patrick Altmeyer"] version = "0.1.3" [deps] +CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d" MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" +Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" [compat] diff --git a/_freeze/docs/src/classification/execute-results/md.json b/_freeze/docs/src/classification/execute-results/md.json index 938a281..2216aa0 100644 --- a/_freeze/docs/src/classification/execute-results/md.json +++ b/_freeze/docs/src/classification/execute-results/md.json @@ -1,9 +1,9 @@ { - "hash": "a5e7b7c00cea9c148884a97a62ddf49b", + "hash": "a9026c438717b315a2ac33b3ad7cabcd", "result": { - "markdown": "\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing MLJ\nX, y = MLJ.make_blobs(1000, 2; centers=3, cluster_std=1.0)\ntrain, test = partition(eachindex(y), 0.4, 0.4, shuffle=true)\n```\n:::\n\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nEvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees\nmodel = EvoTreeClassifier() \n```\n:::\n\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconf_model = conformal_model(model)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n```\n:::\n\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nrows = rand(test, 10)\nXtest = selectrows(X, rows)\nytest = y[rows]\npredict(mach, Xtest)\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n╭───────────────────────────────────────────────────────────────────╮\n│ │\n│ (1) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │\n│ (2) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │\n│ (3) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │\n│ (4) 
UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │\n│ (5) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │\n│ (6) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │\n│ (7) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │\n│ (8) UnivariateFinite{Multiclass {#90CAF9}3} (2=>0.82{/#90CAF9}) │\n│ (9) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │\n│ (10) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │\n│ │\n│ │\n╰────────────────────────────────────────────────────── 10 items ───╯\n```\n:::\n:::\n\n\n", + "markdown": "# Classification \n\n```@meta\nCurrentModule = ConformalPrediction\n```\n\n\n\nThis tutorial is based in part on this [blog post](https://www.paltmeyer.com/blog/posts/conformal-prediction/).\n\n### Split Conformal Classification {#sec-scp}\n\nWe consider a simple binary classification problem. Let $(X_i, Y_i), \\ i=1,...,n$ denote our feature-label pairs and let $\\mu: \\mathcal{X} \\mapsto \\mathcal{Y}$ denote the mapping from features to labels. For illustration purposes we will use the moons dataset 🌙. Using [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) we first generate the data and split it into a training and a test set:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing MLJ\nusing Random\nRandom.seed!(123)\n\n# Data:\nX, y = make_moons(500; noise=0.15)\ntrain, test = partition(eachindex(y), 0.8, shuffle=true)\n```\n:::\n\n\nHere we will use a specific case of CP called *split conformal prediction* which can then be summarized as follows:^[In other places split conformal prediction is sometimes referred to as *inductive* conformal prediction.]\n\n1. Partition the training data into a proper training set and a separate calibration set: $\\mathcal{D}_n=\\mathcal{D}^{\\text{train}} \\cup \\mathcal{D}^{\\text{cali}}$.\n2. Train the machine learning model on the proper training set: $\\hat\\mu_{i \\in \\mathcal{D}^{\\text{train}}}(X_i,Y_i)$.\n3. 
Compute nonconformity scores, $\\mathcal{S}$, using the calibration data $\\mathcal{D}^{\\text{cali}}$ and the fitted model $\\hat\\mu_{i \\in \\mathcal{D}^{\\text{train}}}$. \n4. For a user-specified desired coverage ratio $(1-\\alpha)$ compute the corresponding quantile, $\\hat{q}$, of the empirical distribution of nonconformity scores, $\\mathcal{S}$.\n5. For the given quantile and test sample $X_{\\text{test}}$, form the corresponding conformal prediction set: \n\n$$\nC(X_{\\text{test}})=\\{y:s(X_{\\text{test}},y) \\le \\hat{q}\\}\n$$ {#eq-set}\n\nThis is the default procedure used for classification and regression in [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl). \n\nNow let's take this to our 🌙 data. To illustrate the package functionality we will demonstrate the envisioned workflow. We first define our atomic machine learning model following standard [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) conventions. Using [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl) we then wrap our atomic model in a conformal model using the standard API call `conformal_model(model::Supervised; kwargs...)`. To train and predict from our conformal model we can then rely on the conventional [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) procedure again. In particular, we wrap our conformal model in data (turning it into a machine) and then fit it on the training set. 
Finally, we use our machine to predict the label for a new test sample `Xtest`:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\n# Model:\nKNNClassifier = @load KNNClassifier pkg=NearestNeighborModels\nmodel = KNNClassifier(;K=50) \n\n# Training:\nusing ConformalPrediction\nconf_model = conformal_model(model; coverage=.9)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n\n# Conformal Prediction:\nXtest = selectrows(X, first(test))\nytest = y[first(test)]\npredict(mach, Xtest)[1]\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nimport NearestNeighborModels ✔\n```\n:::\n\n::: {.cell-output .cell-output-stdout}\n```\n\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\n UnivariateFinite{Multiclass{2}} \n ┌ ┐ \n 0 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.94 \n └ ┘ \n```\n:::\n:::\n\n\nThe final predictions are set-valued. While the softmax output remains unchanged for the `SimpleInductiveClassifier`, the size of the prediction set depends on the chosen coverage rate, $(1-\\alpha)$. \n\n::: {.cell execution_count=4}\n\n::: {.cell-output .cell-output-display execution_count=5}\nWhen specifying a coverage rate very close to one, the prediction set will typically include many (in some cases all) of the possible labels. Below, for example, both classes are included in the prediction set when setting the coverage rate equal to $(1-\\alpha)$=1.0. 
This is intuitive, since high coverage quite literally requires that the true label is covered by the prediction set with high probability.\n\n:::\n:::\n\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nconf_model = conformal_model(model; coverage=coverage)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n\n# Conformal Prediction:\nXtest = (x1=[1],x2=[0])\npredict(mach, Xtest)[1]\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n UnivariateFinite{Multiclass{2}} \n ┌ ┐ \n 0 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5 \n 1 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5 \n └ ┘ \n```\n:::\n:::\n\n\n::: {.cell execution_count=6}\n\n::: {.cell-output .cell-output-display execution_count=7}\nConversely, for low coverage rates, prediction sets can also be empty. For a choice of $(1-\\alpha)$=0.1, for example, the prediction set for our test sample is empty. This is a bit difficult to think about intuitively and I have not yet come across a satisfactory, intuitive interpretation.^[Any thoughts/comments welcome!] 
When the prediction set is empty, the `predict` call currently returns `missing`:\n\n:::\n:::\n\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nconf_model = conformal_model(model; coverage=coverage)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n\n# Conformal Prediction:\npredict(mach, Xtest)[1]\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\nmissing\n```\n:::\n:::\n\n\n::: {.cell execution_count=8}\n``` {.julia .cell-code}\ncov_ = .9\nconf_model = conformal_model(model; coverage=cov_)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\nMarkdown.parse(\"\"\"\nThe following chart shows the resulting predicted probabilities for ``y=1`` (left) and set size (right) for a choice of ``(1-\\\\alpha)``=$cov_.\n\"\"\")\n```\n\n::: {.cell-output .cell-output-display execution_count=9}\nThe following chart shows the resulting predicted probabilities for $y=1$ (left) and set size (right) for a choice of $(1-\\alpha)$=0.9.\n\n:::\n:::\n\n\n::: {.cell execution_count=9}\n``` {.julia .cell-code}\nusing Plots\np_proba = plot(mach.model, mach.fitresult, X, y)\np_set_size = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)\nplot(p_proba, p_set_size, size=(800,250))\n```\n\n::: {.cell-output .cell-output-display execution_count=10}\n![](classification_files/figure-commonmark/cell-10-output-1.svg){}\n:::\n:::\n\n\nThe animation below should provide some more intuition as to what exactly is happening here. It illustrates the effect of the chosen coverage rate on the predicted softmax output and the set size in the two-dimensional feature space. Contours are overlaid with the moon data points (including test data). The two samples highlighted in red, $X_1$ and $X_2$, have been manually added for illustration purposes. Let's look at these one by one.\n\nFirstly, note that $X_1$ (red cross) falls into a region of the domain that is characterized by high predictive uncertainty. 
It sits right at the bottom-right corner of our class-zero moon 🌜 (orange), a region that is almost entirely enveloped by our class-one moon 🌛 (green). For low coverage rates the prediction set for $X_1$ is empty: on the left-hand side this is indicated by the missing contour for the softmax probability; on the right-hand side we can observe that the corresponding set size is indeed zero. For high coverage rates the prediction set includes both $y=0$ and $y=1$, indicative of the fact that the conformal classifier is uncertain about the true label.\n\nWith respect to $X_2$, we observe that while also sitting on the fringe of our class-zero moon, this sample populates a region that is not fully enveloped by data points from the opposite class. In this region, the underlying atomic classifier can be expected to be more certain about its predictions, but still not highly confident. How is this reflected by our corresponding conformal prediction sets? \n\n::: {.cell execution_count=10}\n``` {.julia .cell-code code-fold=\"true\"}\nXtest_2 = (x1=[-0.5],x2=[0.25])\np̂_2 = pdf(predict(mach, Xtest_2)[1], 0)\n```\n:::\n\n\n::: {.cell execution_count=11}\n\n::: {.cell-output .cell-output-display execution_count=12}\nWell, for low coverage rates (roughly $<0.9$) the conformal prediction set does not include $y=0$: the set size is zero (right panel). Only for higher coverage rates do we have $C(X_2)=\\{0\\}$: the coverage rate is high enough to include $y=0$, but the corresponding softmax probability is still fairly low. For example, for $(1-\\alpha)=0.9$ we have $\\hat{p}(y=0|X_2)=0.72.$\n\n:::\n:::\n\n\nThese two examples illustrate an interesting point: for regions characterized by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive. 
\n\n::: {.cell execution_count=12}\n``` {.julia .cell-code}\n# Setup\ncoverages = range(0.75,1.0,length=5)\nn = 100\nx1_range = range(extrema(X.x1)...,length=n)\nx2_range = range(extrema(X.x2)...,length=n)\n\nanim = @animate for coverage in coverages\n conf_model = conformal_model(model; coverage=coverage)\n mach = machine(conf_model, X, y)\n fit!(mach, rows=train)\n # Probabilities:\n p1 = plot(mach.model, mach.fitresult, X, y)\n scatter!(p1, Xtest.x1, Xtest.x2, ms=6, c=:red, label=\"X₁\", shape=:cross, msw=6)\n scatter!(p1, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label=\"X₂\", shape=:diamond, msw=6)\n p2 = plot(mach.model, mach.fitresult, X, y; plot_set_size=true)\n scatter!(p2, Xtest.x1, Xtest.x2, ms=6, c=:red, label=\"X₁\", shape=:cross, msw=6)\n scatter!(p2, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label=\"X₂\", shape=:diamond, msw=6)\n plot(p1, p2, plot_title=\"(1-α)=$(round(coverage,digits=2))\", size=(800,300))\nend\n\ngif(anim, joinpath(www_path,\"classification.gif\"), fps=1)\n```\n\n::: {#fig-anim .cell-output .cell-output-display execution_count=13}\n```{=html}\n\n```\n\nThe effect of the coverage rate on the conformal prediction set. Softmax probabilities are shown on the left. 
The size of the prediction set is shown on the right.\n:::\n:::\n\n\n![](www/classification.gif)\n\n", "supporting": [ - "classification_files" + "classification_files/figure-commonmark" ], "filters": [] } diff --git a/_freeze/docs/src/classification/figure-commonmark/cell-10-output-1.svg b/_freeze/docs/src/classification/figure-commonmark/cell-10-output-1.svg new file mode 100644 index 0000000..de81765 --- /dev/null +++ b/_freeze/docs/src/classification/figure-commonmark/cell-10-output-1.svgdiff --git a/_freeze/docs/src/classification/figure-commonmark/cell-9-output-1.svg b/_freeze/docs/src/classification/figure-commonmark/cell-9-output-1.svg new file mode 100644 index 0000000..8791f3f --- /dev/null +++ b/_freeze/docs/src/classification/figure-commonmark/cell-9-output-1.svgdiff --git a/_freeze/docs/src/index/execute-results/md.json b/_freeze/docs/src/index/execute-results/md.json index 03a9018..bc25f51 100644 --- a/_freeze/docs/src/index/execute-results/md.json +++ b/_freeze/docs/src/index/execute-results/md.json @@ -1,7 +1,7 @@ { "hash": "c56dcfed5fce5fece3f8dd90b08af0bd", "result": { - "markdown": "```@meta\nCurrentModule = ConformalPrediction\n```\n\n# ConformalPrediction\n\nDocumentation for [ConformalPrediction.jl](https://github.com/pat-alt/ConformalPrediction.jl).\n\n\n\n`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) @blaom2020mlj. Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. \n\n# 📖 Background\n\nConformal Prediction is a scalable frequentist approach to uncertainty quantification and coverage control. It promises to be an easy-to-understand, distribution-free and model-agnostic way to generate statistically rigorous uncertainty estimates. 
Interestingly, it can even be used to complement Bayesian methods.\n\nThe animation below is lifted from a small blog post that introduces the topic and the package ([[TDS](https://towardsdatascience.com/conformal-prediction-in-julia-351b81309e30)], [[Quarto](https://www.paltmeyer.com/blog/posts/conformal-prediction/#fig-anim)]). It shows conformal prediction sets for two different samples and changing coverage rates. Standard conformal classifiers produce set-valued predictions: for ambiguous samples these sets are typically large (for high coverage) or empty (for low coverage).\n\n![Conformal Prediction in action: Prediction sets for two different samples and changing coverage rates. As coverage grows, so does the size of the prediction sets.](https://raw.githubusercontent.com/pat-alt/blog/main/posts/conformal-prediction/www/medium.gif)\n\n## 🚩 Installation \n\nYou can install the latest stable release from the general registry:\n\n```{.julia}\nusing Pkg\nPkg.add(\"ConformalPrediction\")\n```\n\nThe development version can be installed as follows:\n\n```{.julia}\nusing Pkg\nPkg.add(url=\"https://github.com/pat-alt/ConformalPrediction.jl\")\n```\n\n## 🔁 Status \n\nThis package is in its early stages of development and therefore still subject to changes to the core architecture and API. 
The following CP approaches have been implemented in the development version:\n\n**Regression**:\n\n- Inductive \n- Naive Transductive \n- Jackknife \n- Jackknife+ \n- Jackknife-minmax\n- CV+\n- CV-minmax\n\n**Classification**:\n\n- Inductive (LABEL [@sadinle2019least])\n- Adaptive Inductive\n\nThe package has been tested for the following supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/).\n\n**Regression**:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing ConformalPrediction\nkeys(tested_atomic_models[:regression])\n```\n\n::: {.cell-output .cell-output-display execution_count=12}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n**Classification**:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nkeys(tested_atomic_models[:classification])\n```\n\n::: {.cell-output .cell-output-display execution_count=13}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n## 🔍 Usage Example \n\nTo illustrate the intended use of the package, let's have a quick look at a simple regression problem. 
Using [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) we first generate some synthetic data and then determine indices for our training, calibration and test data:\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing MLJ\nX, y = MLJ.make_regression(1000, 2)\ntrain, test = partition(eachindex(y), 0.4, 0.4)\n```\n:::\n\n\nWe then import a decision tree ([`EvoTrees.jl`](https://github.com/Evovest/EvoTrees.jl)) following the standard [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) procedure.\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor() \n```\n:::\n\n\nTo turn our conventional model into a conformal model, we just need to declare it as such by using the `conformal_model` wrapper function. The generated conformal model instance can be wrapped in data to create a *machine*. Finally, we proceed by fitting the machine on training data using the generic `fit!` method:\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconf_model = conformal_model(model)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n```\n:::\n\n\nPredictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` samples. 
Each tuple contains the lower and upper bound for the prediction interval.\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nn = 10\nXtest = selectrows(X, first(test,n))\nytest = y[first(test,n)]\npredict(mach, Xtest)\n```\n\n::: {.cell-output .cell-output-display execution_count=17}\n```\n╭─────────────────────────────────────────────────────────────────╮\n│ │\n│ (1) ([0.14395897640483468], [1.5537237281612537]) │\n│ (2) ([-0.539687877793372], [0.8700768739630471]) │\n│ (3) ([-0.46442052745067525], [0.9453442243057439]) │\n│ (4) ([0.010529843675146089], [1.420294595431565]) │\n│ (5) ([0.07301045762431613], [1.4827752093807351]) │\n│ (6) ([-0.012020120998203487], [1.3977446307582158]) │\n│ (7) ([0.5297045560243977], [1.9394693077808167]) │\n│ (8) ([-0.46442052745067525], [0.9453442243057439]) │\n│ (9) ([-0.09600489213468855], [1.3137598596217306]) │\n│ (10) ([0.010529843675146089], [1.420294595431565]) │\n│ │\n│ │\n╰──────────────────────────────────────────────────── 10 items ───╯\n```\n:::\n:::\n\n\n## 🛠 Contribute \n\nContributions are welcome! Please follow the [SciML ColPrac guide](https://github.com/SciML/ColPrac).\n\n## 🎓 References \n\n", + "markdown": "```@meta\nCurrentModule = ConformalPrediction\n```\n\n# ConformalPrediction\n\nDocumentation for [ConformalPrediction.jl](https://github.com/pat-alt/ConformalPrediction.jl).\n\n\n\n`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) @blaom2020mlj. Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. \n\n# 📖 Background\n\nConformal Prediction is a scalable frequentist approach to uncertainty quantification and coverage control. It promises to be an easy-to-understand, distribution-free and model-agnostic way to generate statistically rigorous uncertainty estimates. 
Interestingly, it can even be used to complement Bayesian methods.\n\nThe animation below is lifted from a small blog post that introduces the topic and the package ([[TDS](https://towardsdatascience.com/conformal-prediction-in-julia-351b81309e30)], [[Quarto](https://www.paltmeyer.com/blog/posts/conformal-prediction/#fig-anim)]). It shows conformal prediction sets for two different samples and changing coverage rates. Standard conformal classifiers produce set-valued predictions: for ambiguous samples these sets are typically large (for high coverage) or empty (for low coverage).\n\n![Conformal Prediction in action: Prediction sets for two different samples and changing coverage rates. As coverage grows, so does the size of the prediction sets.](https://raw.githubusercontent.com/pat-alt/blog/main/posts/conformal-prediction/www/medium.gif)\n\n## 🚩 Installation \n\nYou can install the latest stable release from the general registry:\n\n```{.julia}\nusing Pkg\nPkg.add(\"ConformalPrediction\")\n```\n\nThe development version can be installed as follows:\n\n```{.julia}\nusing Pkg\nPkg.add(url=\"https://github.com/pat-alt/ConformalPrediction.jl\")\n```\n\n## 🔁 Status \n\nThis package is in its early stages of development and therefore still subject to changes to the core architecture and API. 
The following CP approaches have been implemented in the development version:\n\n**Regression**:\n\n- Inductive \n- Naive Transductive \n- Jackknife \n- Jackknife+ \n- Jackknife-minmax\n- CV+\n- CV-minmax\n\n**Classification**:\n\n- Inductive (LABEL [@sadinle2019least])\n- Adaptive Inductive\n\nThe package has been tested for the following supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/).\n\n**Regression**:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing ConformalPrediction\nkeys(tested_atomic_models[:regression])\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n**Classification**:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nkeys(tested_atomic_models[:classification])\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n## 🔍 Usage Example \n\nTo illustrate the intended use of the package, let's have a quick look at a simple regression problem. 
Using [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) we first generate some synthetic data and then determine indices for our training, calibration and test data:\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing MLJ\nX, y = MLJ.make_regression(1000, 2)\ntrain, test = partition(eachindex(y), 0.4, 0.4)\n```\n:::\n\n\nWe then import a decision tree ([`EvoTrees.jl`](https://github.com/Evovest/EvoTrees.jl)) following the standard [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) procedure.\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor() \n```\n:::\n\n\nTo turn our conventional model into a conformal model, we just need to declare it as such by using the `conformal_model` wrapper function. The generated conformal model instance can be wrapped in data to create a *machine*. Finally, we proceed by fitting the machine on training data using the generic `fit!` method:\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconf_model = conformal_model(model)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n```\n:::\n\n\nPredictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` samples. 
Each tuple contains the lower and upper bound for the prediction interval.\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nn = 5\nXtest = selectrows(X, first(test,n))\nytest = y[first(test,n)]\npredict(mach, Xtest)\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\n╭─────────────────────────────────────────────────────────╮\n│ │\n│ (1) (1.2801183281465092, 2.0024286641173816) │\n│ (2) (0.8012756658949756, 1.5235860018658482) │\n│ (3) (1.1850387604493555, 1.9073490964202282) │\n│ (4) (1.1185514282818692, 1.8408617642527418) │\n│ (5) (1.1651738766694149, 1.8874842126402875) │\n│ │\n│ │\n╰───────────────────────────────────────────── 5 items ───╯\n```\n:::\n:::\n\n\n## 🛠 Contribute \n\nContributions are welcome! Please follow the [SciML ColPrac guide](https://github.com/SciML/ColPrac).\n\n## 🎓 References \n\n", "supporting": [ "index_files" ], diff --git a/_freeze/docs/src/intro/execute-results/md.json b/_freeze/docs/src/intro/execute-results/md.json index 960f41c..966caac 100644 --- a/_freeze/docs/src/intro/execute-results/md.json +++ b/_freeze/docs/src/intro/execute-results/md.json @@ -1,7 +1,7 @@ { - "hash": "1b0ee553524705afa5795d1e05898476", + "hash": "4b0109965d9339b1ffd867a3c20a947b", "result": { - "markdown": "\n\n`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/). Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. 
\n\n## Installation 🚩\n\nYou can install the first stable release from the general registry:\n\n```{.julia}\nusing Pkg\nPkg.add(\"ConformalPrediction\")\n```\n\nThe development version can be installed as follows:\n\n```{.julia}\nusing Pkg\nPkg.add(url=\"https://github.com/pat-alt/ConformalPrediction.jl\")\n```\n\n## Status 🔁\n\nThis package is in its very early stages of development and therefore still subject to changes to the core architecture. The following approaches have been implemented in the development version:\n\n**Regression**:\n\n- Inductive \n- Naive Transductive \n- Jackknife \n- Jackknife+ \n- Jackknife-minmax\n- CV+\n- CV-minmax\n\n**Classification**:\n\n- Inductive (LABEL [@sadinle2019least])\n- Adaptive Inductive\n\nI have only tested it for a few of the supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/).\n\n## Usage Example 🔍\n\nTo illustrate the intended use of the package, let's have a quick look at a simple regression problem. Using [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) we first generate some synthetic data and then determine indices for our training, calibration and test data:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing MLJ\nX, y = MLJ.make_regression(1000, 2)\ntrain, test = partition(eachindex(y), 0.4, 0.4)\n```\n:::\n\n\nWe then import a decision tree ([`EvoTrees.jl`](https://github.com/Evovest/EvoTrees.jl)) following the standard [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) procedure.\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor() \n```\n:::\n\n\nTo turn our conventional model into a conformal model, we just need to declare it as such by using the `conformal_model` wrapper function. The generated conformal model instance can be wrapped in data to create a *machine*. 
Finally, we proceed by fitting the machine on training data using the generic `fit!` method:\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconf_model = conformal_model(model)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n```\n:::\n\n\nPredictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` samples. Each tuple contains the lower and upper bound for the prediction interval.\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nn = 10\nXtest = selectrows(X, first(test,n))\nytest = y[first(test,n)]\npredict(mach, Xtest)\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n```\n╭─────────────────────────────────────────────────────────────────╮\n│ │\n│ (1) ([-0.20063113789390163], [1.323655530145934]) │\n│ (2) ([-0.061147489871723804], [1.4631391781681118]) │\n│ (3) ([-1.4486105066363675], [0.07567616140346822]) │\n│ (4) ([-0.7160881365817455], [0.8081985314580902]) │\n│ (5) ([-1.7173644161988695], [-0.19307774815903367]) │\n│ (6) ([-1.2158809697881832], [0.3084056982516525]) │\n│ (7) ([-1.7173644161988695], [-0.19307774815903367]) │\n│ (8) ([0.26510754559144056], [1.7893942136312764]) │\n│ (9) ([-0.8716996456392521], [0.6525870224005836]) │\n│ (10) ([0.43084861624955606], [1.9551352842893919]) │\n│ │\n│ │\n╰──────────────────────────────────────────────────── 10 items ───╯\n```\n:::\n:::\n\n\n## Contribute 🛠\n\nContributions are welcome! Please follow the [SciML ColPrac guide](https://github.com/SciML/ColPrac).\n\n## References 🎓\n\n", + "markdown": "\n\n`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) @blaom2020mlj. Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. 
\n\n# 📖 Background\n\nConformal Prediction is a scalable frequentist approach to uncertainty quantification and coverage control. It promises to be an easy-to-understand, distribution-free and model-agnostic way to generate statistically rigorous uncertainty estimates. Interestingly, it can even be used to complement Bayesian methods.\n\nThe animation below is lifted from a small blog post that introduces the topic and the package ([[TDS](https://towardsdatascience.com/conformal-prediction-in-julia-351b81309e30)], [[Quarto](https://www.paltmeyer.com/blog/posts/conformal-prediction/#fig-anim)]). It shows conformal prediction sets for two different samples and changing coverage rates. Standard conformal classifiers produce set-valued predictions: for ambiguous samples these sets are typically large (for high coverage) or empty (for low coverage).\n\n![Conformal Prediction in action: Prediction sets for two different samples and changing coverage rates. As coverage grows, so does the size of the prediction sets.](https://raw.githubusercontent.com/pat-alt/blog/main/posts/conformal-prediction/www/medium.gif)\n\n## 🚩 Installation \n\nYou can install the latest stable release from the general registry:\n\n```{.julia}\nusing Pkg\nPkg.add(\"ConformalPrediction\")\n```\n\nThe development version can be installed as follows:\n\n```{.julia}\nusing Pkg\nPkg.add(url=\"https://github.com/pat-alt/ConformalPrediction.jl\")\n```\n\n## 🔁 Status \n\nThis package is in its early stages of development and therefore still subject to changes to the core architecture and API. 
The following CP approaches have been implemented in the development version:\n\n**Regression**:\n\n- Inductive \n- Naive Transductive \n- Jackknife \n- Jackknife+ \n- Jackknife-minmax\n- CV+\n- CV-minmax\n\n**Classification**:\n\n- Inductive (LABEL [@sadinle2019least])\n- Adaptive Inductive\n\nThe package has been tested for the following supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/).\n\n**Regression**:\n\n::: {.cell execution_count=2}\n``` {.julia .cell-code}\nusing ConformalPrediction\nkeys(tested_atomic_models[:regression])\n```\n\n::: {.cell-output .cell-output-display execution_count=3}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n**Classification**:\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nkeys(tested_atomic_models[:classification])\n```\n\n::: {.cell-output .cell-output-display execution_count=4}\n```\nKeySet for a Dict{Symbol, Expr} with 4 entries. Keys:\n :nearest_neighbor\n :evo_tree\n :light_gbm\n :decision_tree\n```\n:::\n:::\n\n\n## 🔍 Usage Example \n\nTo illustrate the intended use of the package, let's have a quick look at a simple regression problem. 
Using [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) we first generate some synthetic data and then determine indices for our training, calibration and test data:\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing MLJ\nX, y = MLJ.make_regression(1000, 2)\ntrain, test = partition(eachindex(y), 0.4, 0.4)\n```\n:::\n\n\nWe then import a decision tree ([`EvoTrees.jl`](https://github.com/Evovest/EvoTrees.jl)) following the standard [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) procedure.\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nEvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees\nmodel = EvoTreeRegressor() \n```\n:::\n\n\nTo turn our conventional model into a conformal model, we just need to declare it as such by using the `conformal_model` wrapper function. The generated conformal model instance can then be wrapped in data to create a *machine*. Finally, we proceed by fitting the machine on training data using the generic `fit!` method:\n\n::: {.cell execution_count=6}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconf_model = conformal_model(model)\nmach = machine(conf_model, X, y)\nfit!(mach, rows=train)\n```\n:::\n\n\nPredictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` test samples. 
Each tuple contains the lower and upper bound for the prediction interval.\n\n::: {.cell execution_count=7}\n``` {.julia .cell-code}\nn = 5\nXtest = selectrows(X, first(test,n))\nytest = y[first(test,n)]\npredict(mach, Xtest)\n```\n\n::: {.cell-output .cell-output-display execution_count=8}\n```\n╭──────────────────────────────────────────────────────────╮\n│ │\n│ (1) (-0.9864061984981062, 2.2503222170961554) │\n│ (2) (-0.7192196826151477, 2.5175087329791137) │\n│ (3) (-0.33838267507136344, 2.898345740522898) │\n│ (4) (-2.838413186252051, 0.39831522934221053) │\n│ (5) (-0.7192196826151477, 2.5175087329791137) │\n│ │\n│ │\n╰────────────────────────────────────────────── 5 items ───╯\n```\n:::\n:::\n\n\n## 🛠 Contribute \n\nContributions are welcome! Please follow the [SciML ColPrac guide](https://github.com/SciML/ColPrac).\n\n## 🎓 References \n\n", "supporting": [ "intro_files" ], diff --git a/_freeze/docs/src/regression/execute-results/md.json b/_freeze/docs/src/regression/execute-results/md.json new file mode 100644 index 0000000..0499fa3 --- /dev/null +++ b/_freeze/docs/src/regression/execute-results/md.json @@ -0,0 +1,10 @@ +{ + "hash": "f2f3a2a6c8e2023197fe550406131d55", + "result": { + "markdown": "# Regression\n\n```@meta\nCurrentModule = ConformalPrediction\n```\n\n\n\nThis tutorial mostly replicates this [tutorial](https://mapie.readthedocs.io/en/latest/examples_regression/4-tutorials/plot_main-tutorial-regression.html#) from MAPIE.\n\n## Data\n\nWe begin by generating some synthetic regression data below:\n\n::: {#fig-data .cell execution_count=2}\n``` {.julia .cell-code}\n# Regression data:\n\n# Inputs:\nN = 600\nxmax = 3.0\nusing Distributions\nd = Uniform(-xmax, xmax)\nX = rand(d, N)\nX = reshape(X, :, 1)\n\n# Outputs:\nnoise = 0.5\nfun(X) = X * sin(X)\nε = randn(N) .* noise\ny = @.(fun(X)) + ε\nusing MLJ\ntrain, test = partition(eachindex(y), 0.4, 0.4, shuffle=true)\n\nusing Plots\nscatter(X, y, label=\"Observed\")\nxrange = 
range(-xmax,xmax,length=N)\nplot!(xrange, @.(fun(xrange)), lw=4, label=\"Ground truth\", ls=:dash, colour=:black)\n```\n:::\n\n\n## Model\n\nTo model this data we will use polynomial regression. There is currently no out-of-the-box support for polynomial feature transformations in `MLJ`, but it is easy enough to add a little helper function for this. Note how we define a linear pipeline `pipe` here. Since pipelines in `MLJ` are just models, we can use the generated object as an input to `conformal_model` below.\n\n::: {.cell execution_count=3}\n``` {.julia .cell-code}\nLinearRegressor = @load LinearRegressor pkg=MLJLinearModels\ndegree_polynomial = 10\npolynomial_features(X, degree::Int) = reduce(hcat, map(i -> X.^i, 1:degree))\npipe = (X -> MLJ.table(polynomial_features(MLJ.matrix(X), degree_polynomial))) |> LinearRegressor()\n```\n:::\n\n\nNext, we conformalize our polynomial regressor using every available approach (except the Naive approach):\n\n::: {.cell execution_count=4}\n``` {.julia .cell-code}\nusing ConformalPrediction\nconformal_models = merge(values(available_models[:regression])...)\ndelete!(conformal_models, :naive)\n# delete!(conformal_models, :jackknife)\nresults = Dict()\nfor _mod in keys(conformal_models) \n conf_model = conformal_model(pipe; method=_mod, coverage=0.95)\n mach = machine(conf_model, X, y)\n fit!(mach, rows=train)\n results[_mod] = mach\nend\n```\n:::\n\n\nFinally, let us look at the resulting conformal predictions in each case.\n\n::: {.cell execution_count=5}\n``` {.julia .cell-code}\nusing Plots\nzoom = -3\nxrange = range(-xmax+zoom,xmax-zoom,length=N)\nplt_list = []\n\nfor (_mod, mach) in results\n plt = plot(mach.model, mach.fitresult, X, y, zoom=zoom, title=_mod)\n plot!(plt, xrange, @.(fun(xrange)), lw=1, ls=:dash, colour=:black, label=\"Ground truth\")\n push!(plt_list, plt)\nend\n\nplot(plt_list..., size=(1600,1000))\n```\n\n::: {.cell-output .cell-output-display execution_count=6}\n![Conformal prediction 
regions.](regression_files/figure-commonmark/fig-cp-output-1.svg){#fig-cp}\n:::\n:::\n\n\n", + "supporting": [ + "regression_files" + ], + "filters": [] + } +} \ No newline at end of file diff --git a/_freeze/docs/src/regression/figure-commonmark/fig-cp-output-1.svg b/_freeze/docs/src/regression/figure-commonmark/fig-cp-output-1.svg new file mode 100644 index 0000000..a1fba3e --- /dev/null +++ b/_freeze/docs/src/regression/figure-commonmark/fig-cp-output-1.svg @@ -0,0 +1,4606 @@ + [SVG markup omitted: figure "Conformal prediction regions."] diff --git a/_quarto.yml b/_quarto.yml index 3ce739e..c5443d7 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -1,16 +1,18 @@ project: title: "ConformalPrediction.jl" execute-dir: project - + crossref: fig-prefix: Figure tbl-prefix: Table bibliography: https://raw.githubusercontent.com/pat-alt/bib/main/bib.bib +fig-format: png execute: freeze: auto - echo: true eval: true + echo: true output: false + jupyter: julia-1.8 diff --git a/dev/plot_main-tutorial-regression.ipynb b/dev/plot_main-tutorial-regression.ipynb new file mode 100644 index 0000000..01d99d2 --- /dev/null +++ b/dev/plot_main-tutorial-regression.ipynb @@ -0,0 +1,1236 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "# Tutorial for tabular regression\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this tutorial, we compare the prediction intervals estimated by MAPIE on a\n", + "simple, one-dimensional, ground truth function\n", + "$f(x) = x \\times \\sin(x)$.\n", + "\n", + "Throughout this tutorial, we will answer the following questions:\n", + "\n", + "- How well do the MAPIE strategies capture the aleatoric uncertainty\n", + " existing in the data?\n", + "\n", + "- How do the prediction intervals estimated by the resampling strategies\n", + " evolve for new *out-of-distribution* data ?\n", + "\n", + "- How do the prediction intervals vary between regressor models ?\n", + "\n", + "Throughout this tutorial, we estimate the prediction intervals first using\n", + "a polynomial function, and then using a boosting model, and a simple neural\n", + "network.\n", + "\n", + "**For practical problems, we advise using the faster CV+ or\n", + "Jackknife+-after-Bootstrap 
strategies.\n", + "For conservative prediction interval estimates, you can alternatively\n", + "use the CV-minmax strategies.**\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import os\n", + "import subprocess\n", + "import warnings\n", + "\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "from mapie.metrics import regression_coverage_score\n", + "from mapie.regression import MapieRegressor\n", + "from mapie.quantile_regression import MapieQuantileRegressor\n", + "from mapie.subsample import Subsample\n", + "from sklearn.linear_model import LinearRegression, QuantileRegressor\n", + "from sklearn.pipeline import Pipeline\n", + "from sklearn.preprocessing import PolynomialFeatures\n", + "\n", + "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"3\"\n", + "warnings.filterwarnings(\"ignore\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 
Estimating the aleatoric uncertainty of homoscedastic noisy data\n", + "\n", + "Let's start by defining the $x \\times \\sin(x)$ function and another\n", + "simple function that generates one-dimensional data with normal noise\n", + "uniformly in a given interval.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def x_sinx(x):\n", + " \"\"\"One-dimensional x*sin(x) function.\"\"\"\n", + " return x*np.sin(x)\n", + "\n", + "\n", + "def get_1d_data_with_constant_noise(funct, min_x, max_x, n_samples, noise, zoom=5):\n", + " \"\"\"\n", + " Generate 1D noisy data uniformly from the given function\n", + " and standard deviation for the noise.\n", + " \"\"\"\n", + " np.random.seed(59)\n", + " X_train = np.linspace(min_x, max_x, n_samples)\n", + " np.random.shuffle(X_train)\n", + " X_test = np.linspace(min_x-zoom, max_x+zoom, n_samples*5)\n", + " y_train, y_mesh, y_test = funct(X_train), funct(X_test), funct(X_test)\n", + " y_train += np.random.normal(0, noise, y_train.shape[0])\n", + " y_test += np.random.normal(0, noise, y_test.shape[0])\n", + " return (\n", + " X_train.reshape(-1, 1), y_train, X_test.reshape(-1, 1), y_test, y_mesh\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first generate noisy one-dimensional data uniformly on an interval.\n", + "Here, the noise is considered as *homoscedastic*, since it remains constant\n", + "over $x$.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "min_x, max_x, n_samples, noise = -5, 5, 600, 0.5\n", + "X_train, y_train, X_test, y_test, y_mesh = get_1d_data_with_constant_noise(\n", + " x_sinx, min_x, max_x, n_samples, noise\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's visualize our noisy function.\n", + "\n" + ] + }, + { + "cell_type": "code", + 
"execution_count": 39, + "metadata": { + "collapsed": false + }, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAABGEUlEQVR4nO29eXgb1bn4/zkjyY7txHESO5uzsiVkN5iwhC2hJaRsKaVAC70tXSjcll643PSGW75tuaWFNhcK99eWNi20vS2lrDVQlrAk7ASyOAsmCdlIiLM5i+0k3iTN+f0xGluWRrJkS7NI5/M8eeTMjGZejUbnPeddhZQShUKhUOQfmtMCKBQKhcIZlAJQKBSKPEUpAIVCochTlAJQKBSKPEUpAIVCochT/E4LkA7l5eVy3LhxTouhUCgUnmLVqlUHpJQVsds9pQDGjRvHypUrnRZDoVAoPIUQYofVdmUCUigUijxFKQCFQqHIU5QCUCgUijxFKQCFQqHIUxxVAEKIW4UQdUKID4UQjwoh+jkpj0KhUOQTjikAIUQl8D2gWko5BfAB1zglj0KhUOQbTpuA/ECREMIPFAO7HZZHoVAo8gbH8gCklPVCiP8BdgKtwMtSypdjjxNC3ADcADBmzBh7hewrjTthwz/BF4DJn4eScqclUigUvWHvh7DlVeg/FCZdDgUlTkuUEYRT/QCEEIOAp4CrgUbgCeBJKeVfE72nurpaeiYRrO4f8I8bIdRm/L9oEFz9Vxh3trNyKRSK9HhzESz9KRAZKweNh+uegiHHOypWOgghVkkpq2O3O2kC+gywXUrZIKUMAk8DZzkoT+bY8R489U0YMQO+twZufBtKhsLfroaGj52WTqFQpMrKh2HpXTD1SviPLfCVGmhvhke+CG1NTkvXZ5xUADuBM4QQxUIIAVwAbHBQnswQaodnvwsDR8GXH4PB42H4VPiXGsMUVHMj6GGnpVQoFD1xcCu89F9w/Bz4/O+gfwUcP9tYyR/eDst+5rSEfcYxBSClfB94ElgNrI/IstgpeTLGB7+Hg1vgc/dCUVnX9tKRMO8XUL8K1j/pmHgKhSJFlt4FQoPLfwOar2v72LPg1Ovhg8WeX9E7GgUkpfyRlHKilHKKlPIrUsp2J+XpM6EOeO/XMO4cOPEz8funXGmsBl7/GYSD9sunUChSY+96qHsazrgJSkfE75/9X+ArhHcfsF+2DOJ0GGhu8eFTcGQ3zLrFer+mwXkL4fAnsPF5OyVTKBTp8N6voaA/nHWz9f6Scqi6DtY+Bkf32ytbBlEKIJPU/hUGHw8nXJD4mAnzYOAYWPmQfXIpFIrUaTkEHz4N067ubsaNZea3QA/CusdtEy3TKAWQKQ7vgB1vw/QvgRCJj9N8UP012P4mHNhsm3gKhSJF1vwNwu1w2jeSH1cxASqrYc0j4FA4fV9RCiBTrH/CeJ12Vc/HzrgWEF3vUSgU7mHd342Bfdjkno+d8SXY/xHsq8u+XFlAKYBMsfGfMOo0GDS252MHDDcSwj582rMzB4UiJzm41XAAT/lCasdPvBQQsOmFrIqVLZQCyARH9sLuWjjpotTfM+UKOLgZ9n2YPbkUCkV61P3DeJ10eWrHDxgGo6o9G9ShFEAm2BwpYZSOAjj5chA++OiZ7MikUCjSp64GRs2EgZWpv2fixbBnDTTVZ0uqrKEUQCb4eAmUjkrNZmhSMgRGz+xSHgqFwlmadsG+9XDypem978S5xuvWpZmXKcsoBdBXwkHY9jqcdGHy6B8rTvws7FlrmJAUCoWzbHnNeD3xwvTeN/Rko9bX9jcyL1OWUQqgr+xZCx1HYfy56b/XnDlsfiWzMikUivTZ8qqxkq+YkN77hIDjzjNCuz0W1KEUQF/Z8Y7xOqYX
hUyHTYYBI2HzkszKpFAo0sNcyZ9wQforeYDx58HRfdCwMeOiZROlAPrKJ+/AkBONaIB0EQJOmAPb3wJdz7xsCoUiNXatNMo8n2BRwysVjjvPeN3+ZuZksgGlAPqCHoad78G4Wb0/x7hzoK0R9nszkUShyAm2LjWi8syBPF3KxkBpJXz6fmblyjJKAfSFveuNWcPYPnT5MjuEbX8rMzIpFIr02fEujJgG/Qb2/hyjToNPP8icTDagFEBfML/ssWf2/hwDRxkt5j55OzMyKRSK9Ai1Q/3K3vnxohl9OjR9Cs17MiOXDSgF0BfqV0H/YcbSry+MO9twJis/gEJhP7trjd7dY/uqAGYar7u8swpQCqAv7F4Nlaf2LmogmvHnGn6AfeszIpZCoUiDzki+PqzkAYZPM5rEeMgMpBRAb2lrMso5jzyl7+cyZx47veVAUihygh3vQcVEIzu/L/gLYOQM2LUiI2LZgVIAvWX3GkBCZVXfz1VaCQNGeOrBUShyAj1sRO70dfZvUnkq7FkH4VBmzpdllALoLbtXG6+ZWAEIYUQQeMh2qFDkBPvqIpF8fbT/m4yYDqFWo9KvB3BUAQghyoQQTwohNgohNgghMqSGbaB+tRG9Uzw4M+cbdZrRK/hoQ2bOp1AoeqZ+pfFqOnD7yvBpxuuedZk5X5ZxegXwAPCSlHIiMB3Y4LA8qbO7FkZmwPxjMuo041WZgRQK+6hfBcVDoCyFRk6pUH4S+PvBXqUAkiKEGAicCzwEIKXskFI2OiVPWrQ2GvG+I6Zl7pwjZ4DmVwpAobCT+tWGGbevkXwmPr9R42vP2sycL8s4uQIYDzQAfxRC1Aoh/iCEKIk9SAhxgxBipRBiZUODS8wj+z8yXodNydw5A0UwfKpSAAqFXbQfgf0bDMdtJhk+zVgBeKAyqJMKwA+cAjwopawCjgELYw+SUi6WUlZLKasrKirsltEaswF0Og1gUmHUTGNJ6pEIAoXC0+xZixHJl2EFMGK6ESbeuCOz580CTiqAXcAuKaUZ/P4khkJwP/s+hKJBRuhmJqk8BYItnokgUCg8TX0kkq8yw8OOaRr2gBnIMQUgpdwLfCqEMLsvXAB85JQ8abGvzjD/ZMpuaDJiuvHqgQdHofA89asM529JeWbPO3QSIGCf+4czp6OAbgYeEUKsA2YAP3NWnBTQdeOLzbT5ByIRBEWRJDOFQpFV6ldn3vwDhj9v8HhocH9Qo9/Ji0sp1wDVTsqQNo2fQPBYdhSA5jMcwWoFoFBkl6MN0LQTTv92ds5fcTLsd393MKdXAN4jWw5gk5EzjAgCVRlUocgeeyOTLNPsmmmGToRDWyHUkZ3zZwilANJlXx0gDA2fDUZMN5rMH9qanfMrFAqjmRMYK+5sUHEy6CE4uCU7588QSgGky/6PDPteQXF2zq8cwQpF9tm73mjjWFSWnfMPnWi8utwPoBRAuhzYDOUTej6ut1RMNGqK767N3jUUinxn7/quuj3ZYMiJIDTX+wGUAkgHPWws6cpPzN41fAFPpZIrFJ6j45gxkcuW+Qcg0A8GH99VNcClKAWQDo07INxhhGtmkxHTjWqCHkglVyg8x/4NgMyuAgDDDNSgVgC5w4FIhq4dCqC9ySgPrVAoMotZqTPbCqDiZDi0DYJt2b1OH1AKIB0aNhmv2TQBQVeROZcvHxUKT7J3PfQbCANHZ/c6FRNA6q6O6FMKIB0OfAwlFZlrApMIM4LAA6nkCoXn2LMOhk3NfCmXWIacYLy6OBRUKYB0OLA5++YfgMIBMGicUXROoVBkDj1s5PJk2/wDSgHkHAc+zr75x2TYlK6sY4VCkRkObjV69tqhAAr7GxWDDyoTkPc5dhBaD9mzAgCjouChrRBsted6CkU+YJcD2GTICV3BIy5EKYBUOWA6gLOYBBbNsMmGA8nlYWQKhafYV2e0Xq2w6Xc85ARlAsoJDnxsvNppAgJlBlIoMknDJiNBy19oz/WGnGBYDloO2XO9NFEK
IFUObAZ/v+yHjpkMHm/0BlAKQKHIHA0b7Zv9g+sdwUoBpMqh7TBoPGg23TLNZ4SDKgWgUGSGUDsc3m7U27IL02KgFIDHObQNBh9n7zWHTTZCQVVJCIWi7xzcYvjV7FwBlI0xfA5KAXgYXTfKMgweb+91h02BloNwdL+911UochEzoMJOBeALGDk9SgF4mKN7jdhhuxXA0EnG635lBlIo+kzDJqNEs2mXt4shJ8ABpQC8y6FtxqsTJiBQJSEUikzQsNGYjQeK7L3ukBOMnB4XmnIdVwBCCJ8QolYI8U+nZUnIoe3G6yCbVwAl5VBcrnIBFIpM0LDJXgewyaBxEGqDI3vtv3YPOK4AgH8D3N037dA2w5FjVwhoNENPVgpAoegr4aBRksGuTP5ozImjC8u7O6oAhBCjgIuBPzgpR48c2gZlY8Hnt//aFROMmYsLl48KhWc4tB30oDMrANN3eHi7/dfuAadXAPcD3wf0RAcIIW4QQqwUQqxsaGiwTbBuHNpmvwPYpGIitDfDkT3OXF+hyAWciAAyGTjacD6rFUAXQohLgP1SylXJjpNSLpZSVkspqysqKmySrpsAkRBQmx3AJuaMZb+7rWQKhavpbObkgAnIXwClo7p8iS7CyRXALOAyIcQnwN+BOUKIvzoojzUtB40ZuNMKwHyAFQpF+jRshIFjjBLNTjBorFoBRCOlvF1KOUpKOQ64BlgqpbzOKXkS4lQIqElJORQNVo5ghaIvHNjkjPnHZPB45QPwJE6FgJoIoSKBFIq+oIeNYo5OKoBB4+BYA7QfdU4GC1yhAKSUr0spL3FaDksObQOEsYRziooJhgJQkUAKRfo07jDi8B1VAO4MBXWFAnA1h7bBwFH21Q+3omIitDXB0X3OyaBQeBXTf+ZECKjJoHHGq1IAHqNxR9eX5xQqEkih6D2m+dSJCCATl+YCKAXQE407jZKuTqIigRSK3tPwsdGcvajMORmKBkG/gWoF4ClC7UYCVpmD9n+A/kOhX5lyBCsUvcHuLmCJGDTedbkASgEko2mX8er0CkBFAikUvUNKY+Vc7gYFME6tADxF4w7j1WkFAMYMZv8GFQmkUKRD0y4IHnPJCmCcYVLWw05L0olSAMk47CYFMBHaGo1YYoVCkRpuiAAyGTTOKEjXvNtpSTpRCiAZjTuNMtClI52WREUCKRS9obMInAsUgDmRbPrUWTmiUAogGY07jRwAzee0JCoSSJESNbX1zLpnKeMXPs+se5ZSU1vvtEjOcmCT0VSpZIjTknQpgMadzsoRhQMF7j2EG0JATQYMh8KB0JC/K4Ca2noWLdnE7sZWRpYVsWDuBOZXVTotlmuoqa3n9qfX0xo0bMz1ja3c/vR6gPy9Tw0O1wCKZuAo47VRrQC8gZsUgBCRkhAfOy2JI5iDW31jK5KuwS3vZ7hRLFqyqXPwN2kNhlm0JE9XjVK6JwQUjF7EJUO7gktcgFIAiQi2wtG9zucARFMxwVjS5iH5Mrj1xYSzu7E1re05z9F9RgkVN9j/TcrGuMoHoExAiejMAXCTApgItX+BYwfdYdO0kb4Obl4wH/XWhGN+tkQBwiPLijItqjdwsgtYIspGw561TkvRiVIAiXBTDoCJ+SAf2AQlZzkri82MLCui3mKwjx7czIGwvrEVAZ0DYnFAI6hLgmFji1tt48lWOdFyRiuzsuIAR9tCBHXr4b8o4GPBXBcNgHZimkvdtgLY+DzoOmjOG2CUAkiE6al3owJo2ARj80sBLJg7odvs2OTwsXaq/vtlDrcEuw360cNhSzC+5bTVwOo0qaxyYlcJh1uCCc9XVhTgx5dNdtVntJWGjUb9nf7DnJaki4GjIdxhmKdKRzgtjfIBJOTwDtACRvSNWygdBYGSvAwFnV9Vyd1XTKWsKNBte0tQ7xwE082RdpttPJGpJnq71SohESWF/vwd/KGrBIQQTkvShWlSdokfQCmARDTuNOx1bsgBMNE0KD8xb2sCza+qpKQwc4tW
TQhXxcsvmDuBokD35y2gCVo6Qp1yWpnBEpHOsTmJmyKATMpGG68uyQVQCiARbgoBjaZiYl6uAEwyOaiFpXRVSKm5yqksK0JgmHAQhpnHlDOduawAxz+TYxw7CC0H3GX/B8MEBEoBuB7XKoAJcGQ3tDU7LYntZGow81mYBNwSUjq/qpJ3Fs5h+z0XU1Lo73Rcm6Rj5pLALY+tcc0Kx1YOuKgGUDSF/aFosDIBuZpgKxzb714FAHAg/xLCfvxsXZ/PITBm/la4zSeQKXncssKxFTeGgJqUjVErACHEaCHEMiHER0KIOiHEvzklSxydEUAuygEw6awJlH9+gMbWxBEvqZJsBl1WHHBVHZ1ETuHKsiIK/en9dFuDYe58ru8K1DM0bIKC/l3lF9xE2WjXlINwcgUQAm6TUk4CzgC+I4SY5KA8XZhJYKa9zk2UjQVfYV77AbKBJuBoW8hVpSZmT6yI2xbwCcYNKaI9FB/a2hOHW4KOKzXbaNhoBEy4KQLIpGysMcl0QW8Px/IApJR7gD2Rv48IITYAlcBHTsnUSXPkR+KGMtCx+PyRSKD8UwCDigNJ4977gi5Bj/lBOpErEJ3MZkUwLHln66Fen/+2x9dy62NrXJsNnTEaNsFx5zsthTUDR0OoFVoOQkm5o6K4IhFMCDEOqALet9h3A3ADwJgxNtnkm3cDwmgk7UbKT4L6VU5LYTsXTxvBX5fbazvNtl8gUfZytjD9H27Nhs4IbU1GL2832v8hqiz0DqUAhBD9gaeAW6SUcaEtUsrFwGKA6upqe9ZMTbuMRuz+AlsulzYVE6HuH9DRAgXFTkuTVXqaEWebbNbRuaNmPY8s32mZvWwHbsyGzghuLAERTWcuwKdQeaqjojiqAIQQAYzB/xEp5dNOytKN5noodfGPomICIOHgZhgx3WlpskZNbT0LnlwbFwoZTTlNXOZ7l2ptE4M4yiH6s1o/iWfDZ9LAoD5dP5t1dGpq67sN/ukyVWxjnu8DThKfUkQHu+UQ3tSn8ZI+k2AaP2u3RT5lBDdHAIGrcgEcUwBCCAE8BGyQUt7nlByWNO+GISc4LUViomsC5bACuPO5uoSDv48w/+p7hpv8z1Es2vlEH8Z+ypjCJ1wc+ID/9D/Ko+E5/E/oao6Q+irJNMNUZtlGnqx6ZzImiJ380P8XZvnqCEofW2QlLRRygbaaL/rfZJcs547g13ldn5HS+XKyUmjDRvD3c2cUH0BRGRSWdgWbOIiTK4BZwFeA9UKINZFt/yWlfME5kSI01bvXgQQw+HgQvpwPBU3k8O1PC78L/JJZvjr+GT6dX4auZKvsGqjHiT180/cC1/leZa5vJf/a8W+slieldE1z8H9n4ZxMfISEpG/SknzD9yLf9/+dIxTzk+B1PBE+l2b6AyDQOU9bx+3+v/Gngl9wX/BK/jf8eUiSO5yzlUIbNsGQE91VxiWW0squYBMHcTIK6G2SPZ1O0dYMHUfcGQFk4i+AIcfndCRQonDFElr5c8HPmSa28R/Bb/Nk+Ly4Yz6RI7gj9A0eD5/P/wZ+xaMFd/GfwRuo0c9O6drZNIvU1NanndDmI8xP/A/zZf8yXg6fysLgtzhEabdjJBqv6zN4t2Mydwf+wL8HnqRQdLAodE3nMWVFAUoK/a7uiZARGjbB6JlOS5GcgZWuWAGoTOBYOkNAXf7DqJiQ0wrAKmlJoHNf4EGmi63cHLzZcvCPZp08nvkd/81q/STuCzzIF32vp3TtbJlFzFLO6SS0+Qjzq8D/8mX/Mn4Vupwbgv8eN/hH00GA24I38rfQHL7jf5ZrfEsBI8+hqTXYLc/hzufqci8voOMYNO10rwPYpFQpAHfiFQVQPgEObYNQh9OSZJya2npL88+3ff9krm8ld4e+zEt6ajO8RgbwteD3eVufwqLAYj6vvZX0+GizSF/aM1qRTilnMBTeLwKLmedbwU+C1/E/oatJbdEs+H+h63k9PJ07/X/iZLEDXcZHGR1uCbLgybW5
pQTMEiludQCbDBxtFKsLtjkqhlIAsTRFfgwDXa4AKiaCDMOhrU5LkhHMwXbcwue55bE1cfsniJ3c5n+C58MzeSg8L61zt1PAt4K38U54Mj8PLOZMzdoEU1lWxN1XTGV+VWVWmtCna1r6gf8RvuB7i3uDV/JQ+HNpvTeMj1uDN9FEfx4I/IpCrCcKwbB0RRG8jNHg0iJwsZjji8N+AKUAYnF7EphJZySQ9x3B0YOtFRo69wT+QDPF3BH8Or1xHbVTwE3BW9guR/C7wC85UcQvv+sbW1m0ZFNn7kGmm9APjGlmk4yrfMv4pv9F/hiay/8X/nyvrneYUhYEv81JWj3f8j2f8LicCgVt2AiaHwaPd1qS5JS6QwE4ngjmOpp3GS3kfKn/WB2h/ERA5IQf4M7n6pKaRq72LaNK28ItHf/K4ST2755opoTrO75PTeEPWRy4l8s6fhoXIlrf2MqCJ9Ym7LEb257RqtG81XaAYx2hlOQ8RXzMXf6HeTM8lbtC19GXWIk39Ok8H57Jd/011Ohns0vG1xfKqVDQhk1GCLfbf79mkbomtQJwF8273W/+AQgUwaCxnlcAiez9Jv1o5xb/U6zUT6JGn9Xn6+2mnH/t+B6jxAHuCzyIIL6oWlCXCWuImYNlIhPRHTXrLbcny2mIZjgH+V3BL6mX5Xw3eDNh+h7KeFfwK+hofN//d8v9LR2h3PEDNGxyv/0fuqIMHXYE96gAhBA3CyH6llLpJZrq3R0CGk0OdAfryaTydd9LDBON3BO8hnRnwrOOH2y5faWcyE9D1/JZ3ypu8j1reYyURuXNaKIdxIlMRI++/6nl9lSK2AUI8WDBAxTRzreCt3XG+PeVPQzhj+G5XKItZ4KIzz7NGWdwsA0Ob3e//R+MCVxxuWFxcJBUVgDDgBVCiMeFEBdFMnhzl+bdRvN1L1B+klEOIpyaacGNJLM/96eFG/3P8Wq4ipUyvR91WVGAR751JoOKrU0BfwrPpSZ8Fv/hf4JztHXWJ5FQUtA1A4+uwZ9I7kTNZlLhdv/fqNK2sCD4bbbIzD6Di0OXcJR+3Op/ynJ/MCy93y/g4BaQujdWABDJBXC5CUhKeQdwIkbZhq8Bm4UQPxNCHJ9l2eynrcn9SWDRVEyEcIdRVdCjJLM/X+NbRqlo4YHQF9I+b1Mk1v5Hl06Oa7RuILg9+E0+lqN4IPArRnIg7oigLmnp6JrNN7YGOyOBEslt1W4yFS7SPuDr/pf4Y2guL+qn9+ocyWiiPw+FPsdFvhWcJKybkRxuCbqiEU6v6awB5IEVABgTTS9EAUkpJbA38i8EDAKeFEL8Iouy2U/zbuPVCz4A8Hx3sJraeloSOEYDhPiG/0XeDU9ivTzO8phkg60EZt1jJEHdfcXUbjN5k1b6cVPwFvyE+U3BAxQQb6aJnc+bkUAL5k6IUyxFAR9fOj39JkJjxV5+EfgdtfoJ/Cx0bdrvT5U/hy+kVRbwDd+LCY9xQyOcXtOwCYTm7jpe0XhhBSCE+DchxCrgF8A7wFQp5U3AqUD6UzM30+SRJDCTikh9Gw8qANOJmsg2fqn2LiPEIX4XvtRyvznYxtrpo4mueV9WbF3ae7scwYLgjczQtnKH/68pyb67sZX5VZXcfcVUKsuKEHTlENw1f2pCs5MVhXTwYOABwvj4Tsf30qrkmS6NDODJ8LnM971NOU0Jj+truKtjNGyEQePBX+i0JKlRWgntTUb5GYdI5WkbDFwhpexmZ5BS6kKIS7IjlkN4JQvYpHCAIasHHcHJs2Il3/S/yAZ9NG/o0+L2+oToTNiqHjuYO5+rS6hIzF64jUmcsEv00/ht6BJu9P+T1fqJPdYMMs0/86sqLWvp/OjSyZbJbFb82P9nJmk7uL5jAbtJvznIrOMHs3pnU8oZxg+H5/EV/6t8xf8yvwx9MeFxnswNOPCxd8w/0BUK2lwP/Xof3twXUvEB/Ch28I/atyHzIjlIcz1GEthwpyVJHY/WBEo2wMwQW5mk
7eAv4QuJjfwJaIJ7r5reOfDOr6qk9ocX8sk9FyeMETrcEuyx9PKi0NW8r0/k7sAfLCNlTATWvXp7wxXam3zJv4xfhy5jmV6V9vsHFRuObnMlkgrb5QheDVfxZd9r+EkcPOC53IBw0HACV6RW9dUVuCAXQOUBRNNcbwz+bk8iiaZiojHz0dNvEu4kyQaYL/te45gs5JnwWXH7+vfzJ6xgWZaG6SWWMD6+2/E9jlDMg4H7GUCL5XESeGT5Tu6oMUxL0SUsjr/9hYSlLGKZIrbx08DDLNdP5r4kM/FEFAV8/OjSyYChBN9ZOCdlJfC38AVUiGYu0GoTnttzZaIPbgU9BBUnOy1J6nRmAzsXCqoUQDReygEwKT8Jgi3QZB3Z4VasnKgAA2jhUt97PBM+i2PED2jJTDl9iMAEoIEyvtPxPcaI/SwK/I5ETRqjlUB0CYtUQ0AraGRxwX0cYgDf6fher5K9okNSTRbMnZBSpsQb+nT2ykGdlUKjia6H5ClMP9hQD5mABowwnNZqBeASmnd7x/5vYto8zSqIHmF+VSVfODX+Xs/3vU2R6ODR8AWW70u2cmhKo8xyIlbIidwd+jIX+VZws+8fCY+TYJn01RMFBHmw4H4GcZRvddzGQQb2Ss7okFST+VWVKXUZC+Pj8fB5nKetYwQHO7ebjXA8N/hDRAEIoxGMV/D5of9wR7OBlQIwkdL9vYCt8GhRuJraep5aFT/zucr3OnX6WMvQz4BPJDVNZMpu/VB4Hk+Fz+a2wJNJewikm/SloXNv4EGqtY+5LXgjH8lxfZLTKlonVTPQ4+Hz0YTkKt/rnduKCzw8HDRshEHjoCD19p+uYGClMgG5gvZm6DjqnRwAk+LBUDLUcwrAKgroeFHPVO0Tng6fY/memeMGJZ2dJjIrpY/gP4M38EZ4Gnf7/8AF2qoMnFPyE/8fudS3nJ8Fv8QL+hkZOGe8Mz3Ve7BLDuWt8BSu8L2FaeravP9Yp2/Dc+zf6K0IIJOBo5QJyBV05gB4zAcAkUgg75iAamrrLUs/X+57h7AUPBs+0/J97249lDRBySo2/7ozxnT+Px1C+LkpeAt1chy/CTzAXG1FmmfoQqDzI///ca3/NR4MXcriBLkN0fiEQNBzZnHsqif6HvTEM/osxmr7mSG6eko8+r63fElAVwSQl+z/JmZv4L46sHqJUgAmZhawV+oARWOGgjr0EKWDmQAWj+Ry7V3e1SfTgHXtQUnPxePMiJjt91zMOwvncNf8qZ3/T9U8YtJCP77SsZCP5Dh+HXgg5ZaS0RQQ5N7Ab7nev4SHQvP4eVSP3kQUBXzce9V0tt9zMfdeNT3hjD5RtE6qUUFLwqfRLgNc5nu3c1tfahk5xqFtoAe9uwIItUHLwZ6PzQKOKoBIcblNQogtQoiFTsrSaYfz5ApgopFReGSv05L0SKIEsCqxhbHafp7poeRzXxKUemMiaqY/13XcznL9ZBYFFnOX/6GE3bViGSX280TBnVzhe5tFwav4SQq1/WOjcGJn9OaKIJVonZ4+7xGKWarP4FLfe2gWZbE9g9dqAEVj+hwdcgQ71hBGCOEDfg18FtiFUXH0WSnlR44I1LzbCMnyUhKYSXlUSYhSd3cyS9T163LfO7TJAC+FT0v6fk0IamrrexWpYr7HbNYysChAc1uQBL1fOjlGEV8NLmSBfJwb/c9xjraen4au5RX9VKTFHKqEVr7mW8J3/M8QwscNHbfysp78c0FXFI6V3Jn4vEIQ91mfCc9inm8FZ2p1vKNPBYwaSrFNblzN/kgEULmHksBMoltDjpxh++Wd7Ag2E9gipdwGIIT4O3A54IwCaKr3RicwK6JDQY+f7awsSaiprUcQH12voXOxbzmv6qdwlORRHGEpO01IvR0Uo983fmHiVondrouPe0Jf4k19Knf6/8zigl/yiT6Ml/Vq6vSxHKOIctFEtfYxc7UVDBCtvBg+jbuC11FPz5nDmiAryVfRn3fGnS/TGBMqu0yf
QbMs4nLt3U4FYCrp6FpKrlYCDRuN5kheiwCCLpOzQ45gJxVAJRDtcdoFxNXBFULcANwAMGbMmOxJ48UQUJP+Q6FfmesjgRYt2WQZp36a2ESFaOb5cGqRMWb4YyYGpZFlRQlXJVa8q09hXsfdzNM+4BrfMr7qe5lCf9egelj252X9VP4SupA1MvWqlKX9AlkfZK3yJNop4GX9NC7yreAHoW/EFaPL5L3OGg0ejQACKKkAX4FjoaCu7wkspVwMLAaorq7Onoequd67D5EQnqgJlMh+f5HvA9pkgDf06X0+V7osmDuBWx9bk1IClUkIP8/pZ/GcfhaFdDBKNFBMO4cZQL0cYmkW6olMJLH1RCJl91L4NK70vckZ2ke8ZVF8z9WF4cIhOLAZTrzQaUl6h6YZGcEOrQCcdALXA9HF00dFttmPlMYXMNCDEUAmHlAA1olakrm+FbypT6OFfp1bAz7B/VfPSBjJkqmkr/lVlVx7xphet11vp4CtspL18jh2yYpeDf7Q5dvIJgvmTiCgxX/St/SpHJOFCUNdXV0YzssRQCYDR3VFIdqMkwpgBXCiEGK8EKIAuAawbtCabdqaIHjMmxFAJhUToeUAHIvvbOUWrKJSpoltjBSHWBLj/A2GZdLGK5m0l981fyq/vHpGxs7XG0zfRjaVwPyqShZ9cTplRd39XO0U8Lo+nQt9qxAW0UCuLgznxRpAsZSOdKwzmGMKQEoZAr4LLAE2AI9LKZ1pStqZA+BiO2dPdJaEcO8qIDZRSwi4yLeCkNR4VT8l7vhkjVeyYZN2utm1HY1Y5ldV8uPLJsetrJaEZzJUNFIltsS957bH1zJu4fPubBdpKgAvRgCZlI6EI3scqejrqA9ASvkC8IKTMgDeawRjRXlEARzYBOOSx9I7SXRUyviF/+Qi7QPe0yfRRP+4Y3tqvJJJEjmoM40Z6jl+4fOW18u2vd1MxIvNxViqz6Bd+rnIt4LVoe6DqZkc5sqooIaNUDYWCkqclqT3lI4yenu3HDACOmxEZQJDlwLwWh2gaAaOgoL+rl4BRFNTW88ErZ7jtL0sSRAjb6fpwS5Hp3mdRHb1bNvbEyXiHaWYd/XJXKR9QKIy2ODCdpFerQEUjWl6dsAMpBQAGA5goRmlWb2KiCTCuDwUFOCOmvXc+tgaPis+QJeCJeHquGMGFWc/LDIauxyd5nXs8G1YkUzRvaTPZIzWwMlJOqL1dA5bCYfg4GZv2/8hSgHY7whWCgCMG99/uFGf28t4oChcTW09jyzfiQQu8NWyTh4XV/snutuVXWSukmhiogd4O30b0SRTdK+FDT/MBdrqXp/DVg5vN0wnXl8BONga0uMjXoZo3uXtCCCTigmw9lEjqqlf7xqNZBvT1j6EJqaJbdwf+kK3/ZUOlR+ILZswsqyIlo5Qwmbz6WL1uezwbcSyYO4ESx8AwAEGskY/ngt8tfwq/HnL9we05D0ZbGVfJGZk6CRn5egrxeWgBRwxASkFAMYKYKiHeokmwpwJNXwMo3uuPeMEpvngXG0dmpAs1Wd07htUHLCshWMXvS0TkYyigM9VLRajFZ1VUtjScBW3+J9iCE2W3cqS9WS2nX11hunW6ysATTNqeCkTkAOYSWBejgAycXl3sJraerRINcs5vloa5EDqorpiHW3LfjZsOvTG1BHdf8Ct/XXNctFWvKZXoQnJ+dpay/2ZWhFlhH11RgvIQL+ej3U7pZWOKAC1AuhMAnPXj7RXlI0Ffz/XKYCa2np+/GxdZyEyH2HO1dbxcri6W+ZsUKfXlT6zQTJziRWzjh/MXfOnZlmq7FInx7FXDmK2r5an9HPj9vfUoMZW9n0Ilac6LUVmKK2E+pW2X1atADpzAHLAB6D5jFXAPmfy6aww486jq1BWic0MFC0s1avijndTiKHpqE110Fu9szG7AmWYQcVWlW8Fy8IzOFdbR4BQ3F7XNIxpPwKNO2CYvcECWaN0pLECsPn+KgVgLru8XAcommFTjZmR
S7CKO5/jW0NIarytx8+WXRNiGGF+VWXSrlzRtAZ192XKJuFHl04m4ItXbkv1KkpFK9VavDJOt6ta1ti/wXjNGQVQGUkGs7czmFIATR7uBGbF8ClwrAGO7HNaEsB6QJ+trWGlnMARi9r/rgkxjCI2ZDMZblrB9MT8qkoWXTk9boXzjj6FdulnjlbbbXvA56YIoMgkJ1cUgJmEanNnMKUAzE5gXk4Ci2bYFON1n1XfXfuJHdCHc5CTtZ0sDc+IO9aORKjeEt1r2Np0YlDf2Mp4t9bNsWB+VSV6jNmhhX4s1yd1UwAlBT4WXTndNf4Z9tVBYSkMHN3zsV7AoWQwpQCa63MjCcxkeEQB7HXeDFRTW8/hY+3dts32rQFgWYz9f1BxwJURM1b0lKQm6aqb4wUlYLXqWqpXcby2h3FiD2VFAer++yJ3fTf76ozZv5uc0n3BDEKxORdAKYDmem/XAIqlaJBRXMphP4Dp/G0Jdq9wOFtbwy5ZzmbZ/Z4XF7govrwH5ldVcl0KPQRcVzcnAVZZ0GZ+xgVaLY2tQXetaKSEfR95PwEsmpKhoPmVArCdpvrcsf+bDJ/i+ArAyvlbQJBZ2ocsC88gtviy25y/PWH2EOjJL+CFzxXt4wAj1PNTOYzNeiXna2sAl61omnZBe1Pu2P8h0hlspO0moByxe/QSKY0b7tV2cokYNgU2vwLBNseSZKwGvpnaRkpEO8uisn9N3Oj87YnozOFZ9yy1zKz1yueKzYKedc9Slh6dwfW+lyihlWMUuac/sBnmbPq7coVS+xVAfq8A2hq93wnMiuFTQIZtTwirqa1n1j1LGb/w+c6M32hma2tolwHe0+OX7rMnVtghYtZwqrpnNrijZj31ja28rs+gQISZpXWtJl2xojHNm7lQviUaBzqD5bcC6MwB8IbtOWWGReLrbfQDmDb/+sZWJNYJQ+dra3hPn0Qr8auSZRsbbJAyezhV3TPT3FGznr8uN8pBr9An0CyLmB0xA4Hh4HbcH7DvQyPrvV+pczJkg4GVtieD5bcJyCy/WpojSWAmg8dDoNhWP0CiRiMmY8Vejtf28H9Ba3ObK2aWfcSJ6p6Z5tH3P+38O4Sft/SpRuRWSGL6bRzvDLZnLYyYbv91s01pJYTaoOUQlAyx5ZJ5vgLIoTIQ0Wg+I0Jir325AD0N4OYs0sr+D96xlec6sSu31/UZDBeHmSR2dNvuWIRTWxMc2pajCsD+zmBKAQgfDMiRJLBohk8xksFsWk72NIDP1tawVR/BTjksbp9XbeW5hpVZ5/VIwl60GcjEkVWbOakZMcP+a2cb0xKR6wpACLFICLFRCLFOCPEPIUSZE3LQvNsY/LXsdoJyhOFTjdlS06c9H5sBknXUKqKNM7QNlrN/LyWA5TpWM/oGylirH8ccX23cPkdWbbvXGK9qBZARnFoBvAJMkVJOAz4GbndEiqZduVEG2ooRkUxb8weTZUwnqFWZhLO0OgpFME4BFAU0an94oRr8XUKiGf3r+gyqxBYG0dxt+7H2kP3O4D1rjd9sf29HjVnSf6hhkbAxFNQRBSClfFlKadaaXQ4444Vt3p179n+TYZONzMLd8TO3bDG/qpLigvi4gtnaGo7KfqzQu3duCunSHYlFCiDxjH5peAaakJyrreu2vbE1yK2PreGOGhvrTuWqAxgMS8QAezuDucEH8HXgRduvKmWkDESORQCZBPoZjmAbFQBYzSIl5/vW8q4+mQ66rw6CYemJUgn5QiIz3jp5HAdkKXMidZyikcAjy3fao8g7jsGBj3NXAYARCmpjRdCsKQAhxKtCiA8t/l0edcwPgBDwSJLz3CCEWCmEWNnQkMFY8bZGCLbk7goAoPIUQwHY4Ag2k8Bir3SS2MUoccCy+QvkRvhnrmCa8WKRaLyhT+c8bS0ausV+m8pg7/3QuFouKwCbs4GzpgCklJ+RUk6x+PcMgBDia8AlwLVSJh6hpJSLpZTVUsrqiooM2v06cwBy2P48sspQ
dIe3Z/Uy0UlgsZglhZdZlH8GFf7pNuZXVVo2fVkarqJMHKNKbLZ8ny2KfE+kT3FOKwB7k8GcigK6CPg+cJmUssUJGbpyAHJcAUDWzUDJksBm+9ZQp49lH4Pj9qnwT3diZQp6S59KSGqW0UBgkyLfswZKKgw7ea5SWgmhVmg9bMvlnPIB/AoYALwihFgjhPit7RKYCiDXykBEM3QS+AqzrgASzf5KOcqp4mPL8E+vlkrIB6x6ITdTwip5ErO1tXHH26bId9ca8f+50gPACptDQZ2KAjpBSjlaSjkj8u9G24VoiiSB9Y9PTMoZfAEjHyDLoaCJZn/nauvxC52l4Xj7/zsL56jB38VYdQpbGq5ikraD4XT1rS0OaPYo8rZmow/wqNOyex2n6WwMY48fwA1RQM7QvNtYSuZiElg0I6sMBaDHO+8yRaLokdm+NRyW/VkjT8jatRXZI1axmyu5831dq4DWYPaeq27sXg1IGFVtz/WcwubewHmsAHbldgSQycgq6DgCB62dd5mi0N/9URLonKet5Q19GnrMY5asp67CPSyYOwGf1mVu+ViOYpcs79Yr2LYIoF0rjNfKU7N/LSfpP8zWZLA8VgC7c9v+b2L+YHatzMrp76hZz62PraGxNdht+3SxjXLRHGf+CfhEjz11Fe5gflUlAwqjE/sEy8IzmKV9SAFd37ctEUC7VkL5BCgqy/61nESL1CZTCiCLSBlpBZkHCqD8JOhXBp8uz/ipa2rreWT5zrjYfzDMP2EpeFOf1rltUHGARVdOV7Z/D9EUo9iX6TMoEe3M1LqaDUlgxp0vZy8ZTEpjBZDr9n8TGxvD5KcCaD1shFrlgwLQNBg9E3a+n/FTL1qyyXLwB5it1VIrT6SRAQCUFPhU3R8PUhZjrntXn0ybDHQzA4FRFmLBE2uzowQOb4eWgzAqx80/JqWVSgFklVztA5CI0afDgU1Go4kMkmjpX8FhpmnbWRqV/HWsI6zq/niQ2HykNgp5T5/EbC0+tDioZ6m0x65VxmverADsSwbLUwVgtoLM0TpAsYw5w3g1HWkZIlH45+xIzZjXY+L/b396vVICHiPWBASGGWi8to9xYk/cvqz4A3atgEAJVORYD+BElI40ytS0NWb9UvmpAMwQq3xZAYw8xagMujNzfoCa2npaOkKW+y7UVrJLlvORHNttu2NdpBS9xkrJm3Wd5lg0iclKRvDOdw3zjy9POth2hoJmf7KUnwqgeXfuJ4FFU1AMw6fBpx9k5HRm7Z/DLfGzw2LaOEf7kJfD1Zg9ZKNRxd+8hVWOxy45lM16paUZqKUjwz0CWg8bReDGnp25c7odG5PB8lQB1OdHElg0Y86A+lUQjh+00yVZ7Z9ztXUUiiAv69YJO6r4m7ewKgsBhhnodG0DxbR12364JZhZU9/O5YCEcbMycz4vYGM5iPxVAPmQAxDN6NONyKc+1gWqqa23rPppMte3gkOyPyv0+NowqvibN7EqC7FMn0GBCHNOTJMYyLCp75O3jXpWlTmeARxN/+EgNKUAskZTff7Y/03GRZbQ29/o9SlM008i/IS4QKvltfAphOm+uvIJoYq/eZjYldsKfQKHZX8u8lkHFmTM1LfjHaP8Q6BfZs7nBXx+QwkoE1AWkDLSCjLPBqKScqMw3LbeK4Bkph+A07UNlIoWS/PPvVepBDAvs2DuBAJRZSFC+FkSruYz2moK6Yg7PiOmvrZmowfA2Dwy/5jYlAyWfwogn5LAYhl/Hnz6PnT0rgVDT7O6C7WVtMoC3tK7d5UqKfCpwd/jzK+q5OqZo7tte1E/nQGilbO17qvCjJn6Pn0fpJ5f9n8TmzqD5Z8CyIc+AIk47nwId/S6LESyWZ2GzjzfCl7Xp9NGYbd9P/18fJtBhbeoqa3nqVXdZ6Tv6JNplCV8zhcbXSa587k6xi98nln3LO29Q3jrMsP+P2pm797vZQaOMkzVWU4Gyz8FkA+tIBMx5kwjH6CXZqBEZZ/BMP8MFY08
Fz6zLxIqXMqPn62LM/+F8PNK+FQ+q63qVhyuNahzuCWIBOobW3sfFbTlVWP2X1DcR+k9SOlICB6DtqasXib/FEC+lYGIprC/MZva9npabzMbvt/62Br6BTSKA/GPzWXauxyV/Sybv6vkL29TU1sfV+3V5Hn9dEpFC2dpHyZ8f6+ighp3GuVLTvhMeu/LFTpDQbNrBspPBaD58ycJLJbj5xiOtSP7Ujo8uuG7xIjzbolpAhIgxDzfB7yinxpn/gGV/OV1kg3e7+hTaZbFXOJLXmww7Wdgy2vG6/EXpPe+XKE0UqYmy47g/FMATXmYBBbNhHmAhM1LUjq8p8gfgHO0dZSJYzwbPstyv0r+8jbJBu8gfl4Mz+Qi7QOKYpLCojGfAXM12aN/YOtrxiBYkad5IzYlg+WfAmjOkz4AiRg2GQaOgU0vpnR4KjO3y3zv0ihLeFuPd/aq5C/v05MCfyp8Dv1FGxdpiYsNtnSEuKNmfbfVZEL/QKgdtr4OJ1yQ2w3gkzFgOCCUCSjjNO3KzwggEyGMVcDWZSmFg/b04y/lGHO1lTwfPoMg3Yt1VZYVqeSvHCCZ8x9ghZzATr2CL/jeTHjM4ZYgjyzfGbeatPQPbF1mtDE9+dI+ye1pfAHDTJ3LKwAhxG1CCCmEKLflgvmaBBbLhHlGLkSCrODoZfqx9hABX+JZ2GW+dykSHfw9PLvb9sqyIt5ZOEcN/jmAWQ+osqwIAXF1gSQaT+vncJb2ESM4mPA8iQIa41aZG56FwoFG3ko+M7Ay6xVBHVMAQojRwIXATtsueuwAhNvzpw9AIsbOgn4Doa4mbles07exNUgwnDgW+WrfMur0sayX4zu3KbNP7jG/qpJ3Fs5h+z0Xx9UFAsMMpAnJFb630j53t1VmOAgbnzcmKf6CvojsfWxIBnNyBfBL4PsknhhknmazD0Cez0r9BTBpPmx4DjqOdduVitPXZLLYzlTtk8jsv2tW+IVTK9XMP4exMgt+KofxTngyX/a/ho/Unh+AgCa6Txa2v2E0Qpl0WQYk9ThmZ7As4ogCEEJcDtRLKdemcOwNQoiVQoiVDQ0NfbtwUx5nAccy7Woj0WTj8902pxOud63vVdpkgGdion+eWlWvOn/lMAvmTrDo9AB/Dl9IpTjIZ7VVqZ8s9kRrHoV+Zfkb/hlNaaXhC8liMljWFIAQ4lUhxIcW/y4H/gv4YSrnkVIullJWSymrKyoq+iZUZxJYnpuAwMgKHjga1v692+bYJuCJGEITX/C9zVPhc2mmf7d9rcEwP362LmOiKtzF/KpKrj1jTNz2V/VT2SXL+Zo/tRBjgGA4qo9w62FjVTrtqvyq/pkIG5LBsqYApJSfkVJOif0HbAPGA2uFEJ8Ao4DVQojh2ZKlk6ZdRm2REnt8zq5G04xVwNalcPiTzs2plh75F/8rFIogD4XnWe5vbA2qVUAOc9f8qdx/9YxuWeE6Gn8OXcgZ2gYmiU9SPlfnqnP9k4aPruq6DEvrUTo7g2Xvd2S7CUhKuV5KOVRKOU5KOQ7YBZwipdyb9Ys3R/oA5GtscSzVXzcaT3zw+85NVk3AYymmjet8r/BK+BS2ycQlNVQJiNxmflUlH/1kHvdfPYOyImPl+Fj4fJplETf7/5HyeUaWFYGuw4o/GK1LR0zPlsjewobewPmVB9BUryKAohlYCZMuh9V/gfajQGpZu9f7XmKIOMKDoeSOOlUCIj+YX1XJmh9dyP1Xz6CZ/jwcnsc83womp7AKEBg+BTa/DA0b4aybsy6vZ+if/WQwxxVAZCVwwJaLqRyAeM78DrQ3wQeLgZ6Tfso4wrf9z/Fy+FRWy5OSnlqVgMh9onNG7nyuDgE8FPocTbKY2/yPJ32vAK49YwzzZ4yEd+43fHOTP2+H2N7AXwD9h+aWCcgx9DAc2a0igGIZVQ0nXQRv3w8th7ol/UC8tez7/scooY1FoauTnlblAuQ+VoUCJXCEYn4dupw5vjV8VluZ8P1lxQGqxw6G
TS/Azvfg7FuMDFhFF1nOBcgfBXB0P+ih/CwD3RMX/BDam+G1/waMJf3siRUIujuFz9A+4sv+pfw+fDGbZWJTmiZQJSDygGQ5Iw+H57FBH82dgT8xAOuSI4dbgtz59EqOPrcQyifAqddnU1xvUlqpVgAZQYWAJmbYZMMUtOqPsPkVamrreWT5zm4ZekM5zAOBX7FdH8b9oS8kPFXAJ7jvqhlq8M8Dkvl4Qvi5PfgtKmjivsBvEOiWx90q/0L/Yzvhc78wmqErupPlZLD8UQBNkSxgZQKyZs7/g6GT4YnreeKZZ7oN/hU08ueCn9OfVr4d/HfLmv9g1P9ZdKVq/p4v9OTjWSNP4K7QdXzWt5qf+h9Gi1EC3/A9z7/4X+H3oYuNdqWKeEpHGqvztuasnD5/VG7nCkANTpYE+sF1T9Ly4AU8JH/Ib/2XslyfxIliF9/z/4MS2rgh+O98LEdbvv3+q9WsP99YMHcCtz+9PmnpkD+HL2SoOMx3/M8yQfuUh0LzOEIxV/te5xLfcl4Iz+QvJdfzLfvE9hZm1GLzbuhXmvHT548CaKqHQDEUDXJaEvdSOpIv6j/lZv033OJ/GngagDX68dwe/CYb5NiEb7396fUASgnkEeZ3vWjJJuoTmoMEi0LXsFkfxQ8Cj/Cbgv8FoEUWcl/wSn4Vns+XT85+DqhniW4MM3Rixk+fPwqgeZcx+1dJYEmpayrkRm5lZPAA47U97JWD2SpHEl+0pTtmXXelAPKL+VVG4T8zIijRaqBGP5vn2s9kmthGoQhSp4/jCEaz92Ub+1jjK5fJcjmI/FEATfXK/t8D0aUbdlPObj29khkq8St/MRX/bY+vJZygnkgYH7XyxLj6v+q5ScKA7CqA/HECN9erCKAe6GvpBpX4ld/Mr6q07BXQE+q5SYK/AEqGdpWyzzD5oQDCQTiyV60AeqAvM7HOlH5FXtObwVw9Nz2QxWSw/FAAR/YAUkUA9UBvZ2KdKf3K/p/39FRKxAr13PRAFnMB8kMBqEYwKbFg7gQCWmpO8uKAhsCI/f/l1TO4a/7U7Aqn8ARmKRGzOmhPpHpcXpPF3sD54QRWWcApYc7EbnlsTcJjfELwpdNHqwFfkZDoyKBkzxLAJdNH2COUlykdaRRsbD8ChQMyeuo8WQGoLOBUmV9Vyf1Xz4hbCWjAoOIAupQs29igmr0oemR+VWVnUcFEqBDQFOhsDLMn46fODwXQvBsKB2Zce+Yq86sqWfTF6d2W5zpd1R7rG1u5/en1SgkoeiRR/2ATFQKaAp0KIPORQPmhAKZ8AS6622kpPEd7yLqAF3QlfikUyUjUP9hEhYCmQMVEuPheGHJixk+dHwpgzOlQda3TUniKZKV+TdTsTZEK1WMHYxVbENCECgFNhZIhcNo3ocy6DldfyA8FoEibVAZ3NXtTpMKiJZvQLfLD+vfzqxBQh1EKQGFJT4O76vilSJVEk4nGlqDNkihiUQpAYYlVQo+5iq8sK1IdvxQpk2gyoVaQzuNYHoAQ4mbgO0AYeF5K+X2nZFHEE13qd3djKyPLilgwd4Ia9BVpY9U3QK0g3YEjCkAIMRu4HJgupWwXQgx1Qg5FcsyEHoWiL6jJhHtxagVwE3CPlLIdQEq53yE5FAqFDajJhDtxygdwEnCOEOJ9IcQbQojTEh0ohLhBCLFSCLGyoUFlDSoUCkWmyNoKQAjxKmDV6+0HkesOBs4ATgMeF0IcJ2V8MXEp5WJgMUB1dXX6xcYVCoVCYUnWFICU8jOJ9gkhbgKejgz4HwghdKAcUFN8hUKhsAmnTEA1wGwAIcRJQAFwwCFZFAqFIi9xygn8MPCwEOJDoAP4qpX5R6FQKBTZQ3hp3BVCNAA7evn2cty5ylBypYeSKz2UXOnhVrmgb7KNlVJWxG70lALoC0KIlVLKaqfliEXJlR5KrvRQcqWHW+WC7MimSkEoFApFnqIUgEKhUOQp+aQAFjstQAKUXOmh5EoPJVd6uFUuyIJseeMD
UCgUCkV38mkFoFAoFIoolAJQKBSKPCWnFIAQ4otCiDohhC6EqI7Zd7sQYosQYpMQYm6C94+PFKjbIoR4TAhRkAUZHxNCrIn8+0QIsSbBcZ8IIdZHjluZaTksrvdjIUR9lGyfS3DcRZF7uEUIsdAGuRYJITYKIdYJIf4hhChLcJwt96unzy+EKIx8x1siz9K4bMkSdc3RQohlQoiPIs//v1kcc74Qoinq+/1htuWKXDfp9yIM/jdyv9YJIU6xQaYJUfdhjRCiWQhxS8wxtt0vIcTDQoj9kcRYc9tgIcQrQojNkddBCd771cgxm4UQX0374lLKnPkHnAxMAF4HqqO2TwLWAoXAeGAr4LN4/+PANZG/fwvclGV57wV+mGDfJ0C5jffux8B/9HCML3LvjsMo37EWmJRluS4E/JG/fw783Kn7lcrnB/4V+G3k72uAx2z47kYAp0T+HgB8bCHX+cA/7XqeUv1egM8BL2I0nDsDeN9m+XzAXoxEKUfuF3AucArwYdS2XwALI38vtHruMQpqbou8Dor8PSida+fUCkBKuUFKucli1+XA36WU7VLK7cAWYGb0AUIIAcwBnoxs+jMwP1uyRq53FfBotq6RBWYCW6SU26SUHcDfMe5t1pBSviylDEX+uxwYlc3r9UAqn/9yjGcHjGfpgsh3nTWklHuklKsjfx8BNgBeKb5/OfB/0mA5UCaEGGHj9S8Atkope1thoM9IKd8EDsVsjn6OEo1Fc4FXpJSHpJSHgVeAi9K5dk4pgCRUAp9G/X8X8T+QIUBj1GBjdUwmOQfYJ6XcnGC/BF4WQqwSQtyQRTmi+W5kGf5wgiVnKvcxm3wdY7ZohR33K5XP33lM5Flqwni2bCFicqoC3rfYfaYQYq0Q4kUhxGSbROrpe3H6mbqGxJMwJ+6XyTAp5Z7I33uBYRbH9PneOdYTuLeIJH0GpJTP2C2PFSnK+CWSz/7PllLWC6Nd5itCiI2RmUJW5AIeBH6C8YP9CYZ56ut9uV4m5DLvlxDiB0AIeCTBaTJ+v7yGEKI/8BRwi5SyOWb3agwzx9GIf6cGONEGsVz7vUR8fJcBt1vsdup+xSGllEKIrMTre04ByCR9BpJQD4yO+v+oyLZoDmIsP/2RmZvVMRmRUQjhB64ATk1yjvrI634hxD8wzA99+uGkeu+EEL8H/mmxK5X7mHG5hBBfAy4BLpAR46fFOTJ+vyxI5fObx+yKfM8DMZ6trCKECGAM/o9IKZ+O3R+tEKSULwghfiOEKJdSZrXwWQrfS1aeqRSZB6yWUu6L3eHU/YpinxBihJRyT8QkZtU2tx7DV2EyCsP/mTL5YgJ6FrgmEqExHkOTfxB9QGRgWQZcGdn0VSBbK4rPABullLusdgohSoQQA8y/MRyhH1odmyli7K6fT3C9FcCJwoiWKsBYPj+bZbkuAr4PXCalbElwjF33K5XP/yzGswPGs7Q0kdLKFBEfw0PABinlfQmOGW76IoQQMzF++1lVTCl+L88C/xKJBjoDaIoyfWSbhKtwJ+5XDNHPUaKxaAlwoRBiUMRke2FkW+rY4eW26x/GwLULaAf2AUui9v0AI4JjEzAvavsLwMjI38dhKIYtwBNAYZbk/BNwY8y2kcALUXKsjfyrwzCFZPve/QVYD6yLPHwjYuWK/P9zGFEmW22SawuGnXNN5N9vY+Wy835ZfX7gvzEUFEC/yLOzJfIsHWfDPTobw3S3Luo+fQ640XzOgO9G7s1aDGf6WTbIZfm9xMglgF9H7ud6oqL3sixbCcaAPjBqmyP3C0MJ7QGCkfHrGxh+o9eAzcCrwODIsdXAH6Le+/XIs7YFuD7da6tSEAqFQpGn5IsJSKFQKBQxKAWgUCgUeYpSAAqFQpGnKAWgUCgUeYpSAAqFQpGnKAWgUCgUeYpSAAqFQpGnKAWgUPQBIcRpkQJ6/SKZr3VCiClOy6VQpIJKBFMo+ogQ4i6MDOAiYJeU8m6HRVIoUkIpAIWij0TqAq0A
2jBKBoQdFkmhSAllAlIo+s4QoD9GN65+DsuiUKSMWgEoFH1ECPEsRnew8RhF9L7rsEgKRUp4rh+AQuEmhBD/AgSllH8TQviAd4UQc6SUS52WTaHoCbUCUCgUijxF+QAUCoUiT1EKQKFQKPIUpQAUCoUiT1EKQKFQKPIUpQAUCoUiT1EKQKFQKPIUpQAUCoUiT/n/ASzhdE17faumAAAAAElFTkSuQmCC", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "plt.xlabel(\"x\")\n", + "plt.ylabel(\"y\")\n", + "plt.scatter(X_train, y_train, color=\"C0\")\n", + "_ = plt.plot(X_test, y_mesh, color=\"C1\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As mentioned previously, we fit our training data with a simple\n", + "polynomial function. Here, we choose a degree equal to 10 so the function\n", + "is able to perfectly fit $x \\times \\sin(x)$.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "degree_polyn = 10\n", + "polyn_model = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", LinearRegression())\n", + " ]\n", + ")\n", + "polyn_model_quant = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", QuantileRegressor(\n", + " solver=\"highs\",\n", + " alpha=0,\n", + " ))\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [], + "source": [ + "from sklearn.ensemble import GradientBoostingRegressor\n", + "polyn_model = Pipeline(\n", + " [\n", + " (\"boosted_tree\", GradientBoostingRegressor())\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then estimate the prediction intervals for all the strategies very easily\n", + "with a\n", + "`fit` and `predict` process. The prediction interval's lower and upper bounds\n", + "are then saved in a DataFrame. 
Here, we set an alpha value of 0.05\n", + "in order to obtain a 95% confidence level for our prediction intervals.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "STRATEGIES = {\n", + " \"naive\": dict(method=\"naive\"),\n", + " \"jackknife\": dict(method=\"base\", cv=-1),\n", + " \"jackknife_plus\": dict(method=\"plus\", cv=-1),\n", + " \"jackknife_minmax\": dict(method=\"minmax\", cv=-1),\n", + " \"cv\": dict(method=\"base\", cv=10),\n", + " \"cv_plus\": dict(method=\"plus\", cv=10),\n", + " \"cv_minmax\": dict(method=\"minmax\", cv=10),\n", + " \"jackknife_plus_ab\": dict(method=\"plus\", cv=Subsample(n_resamplings=50)),\n", + " \"jackknife_minmax_ab\": dict(\n", + " method=\"minmax\", cv=Subsample(n_resamplings=50)\n", + " ),\n", + " \"conformalized_quantile_regression\": dict(\n", + " method=\"quantile\", cv=\"split\", alpha=0.05\n", + " )\n", + "}\n", + "y_pred, y_pis = {}, {}\n", + "for strategy, params in STRATEGIES.items():\n", + " if strategy == \"conformalized_quantile_regression\":\n", + " mapie = MapieQuantileRegressor(polyn_model_quant, **params)\n", + " mapie.fit(X_train, y_train, random_state=1)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test)\n", + " else:\n", + " mapie = MapieRegressor(polyn_model, **params)\n", + " mapie.fit(X_train, y_train)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test, alpha=0.05)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let’s now compare the target confidence intervals with the predicted\n", + "intervals obtained with the Jackknife+, Jackknife-minmax, CV+, CV-minmax,\n", + "Jackknife+-after-Bootstrap, and conformalized quantile regression (CQR)\n", + "strategies. Note that for the Jackknife-after-Bootstrap method, we call the\n", + ":class:`~mapie.subsample.Subsample` object that allows us to train\n", + "bootstrapped models. 
Note also that the CQR method is called with\n", + ":class:`~mapie.quantile_regression.MapieQuantileRegressor` with a\n", + "\"split\" strategy.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "collapsed": false + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "def plot_1d_data(\n", + " X_train,\n", + " y_train,\n", + " X_test,\n", + " y_test,\n", + " y_sigma,\n", + " y_pred,\n", + " y_pred_low,\n", + " y_pred_up,\n", + " ax=None,\n", + " title=None\n", + "):\n", + " ax.set_xlabel(\"x\")\n", + " ax.set_ylabel(\"y\")\n", + " ax.fill_between(X_test, y_pred_low, y_pred_up, alpha=0.3)\n", + " ax.scatter(X_train, y_train, color=\"red\", alpha=0.3, label=\"Training data\")\n", + " ax.plot(X_test, y_test, color=\"gray\", label=\"True confidence intervals\")\n", + " ax.plot(X_test, y_test - y_sigma, color=\"gray\", ls=\"--\")\n", + " ax.plot(X_test, y_test + y_sigma, color=\"gray\", ls=\"--\")\n", + " ax.plot(\n", + " X_test, y_pred, color=\"blue\", alpha=0.5, label=\"Prediction intervals\"\n", + " )\n", + " if title is not None:\n", + " ax.set_title(title)\n", + " ax.legend()\n", + "\n", + "\n", + "strategies = [\n", + " \"jackknife_plus\",\n", + " \"jackknife_minmax\",\n", + " \"cv_plus\",\n", + " \"cv_minmax\",\n", + " \"jackknife_plus_ab\",\n", + " \"conformalized_quantile_regression\"\n", + "]\n", + "n_figs = len(strategies)\n", + "fig, axs = plt.subplots(3, 2, figsize=(9, 13))\n", + "coords = [axs[0, 0], axs[0, 1], axs[1, 0], axs[1, 1], axs[2, 0], axs[2, 1]]\n", + "for strategy, coord in zip(strategies, coords):\n", + " plot_1d_data(\n", + " X_train.ravel(),\n", + " y_train.ravel(),\n", + " X_test.ravel(),\n", + " y_mesh.ravel(),\n", + " np.full((X_test.shape[0]), 1.96*noise).ravel(),\n", + " y_pred[strategy].ravel(),\n", + " y_pis[strategy][:, 0, 0].ravel(),\n", + " y_pis[strategy][:, 1, 0].ravel(),\n", + " ax=coord,\n", + " title=strategy\n", + " )\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "At first glance, the four strategies give similar results and the\n", + "prediction intervals are very close to the true confidence intervals.\n", + "Let’s confirm 
this by comparing the prediction interval widths over\n", + "$x$ between all strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "fig, ax = plt.subplots(1, 1, figsize=(9, 5))\n", + "ax.axhline(1.96*2*noise, ls=\"--\", color=\"k\", label=\"True width\")\n", + "for strategy in STRATEGIES:\n", + " ax.plot(\n", + " X_test,\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0],\n", + " label=strategy\n", + " )\n", + "ax.set_xlabel(\"x\")\n", + "ax.set_ylabel(\"Prediction Interval Width\")\n", + "ax.legend(fontsize=8)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As expected, the prediction intervals estimated by the Naive method\n", + "are slightly too narrow. The Jackknife, Jackknife+, CV, CV+, JaB, and J+aB\n", + "give\n", + "similar widths that are very close to the true width. On the other hand,\n", + "the width estimated by Jackknife-minmax and CV-minmax are slightly too\n", + "wide. 
Note that the widths given by the Naive, Jackknife, and CV strategies\n", + "are constant because there is a single model used for prediction;\n", + "perturbed models are ignored at prediction time.\n", + "\n", + "It's interesting to observe that the CQR strategy offers more varying width,\n", + "often giving much higher but also lower interval width than other methods,\n", + "therefore,\n", + "with homoscedastic noise, CQR would not be the preferred method.\n", + "\n", + "Let’s now compare the *effective* coverage, namely the fraction of test\n", + "points whose true values lie within the prediction intervals, given by\n", + "the different strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "pd.DataFrame([\n", + " [\n", + " regression_coverage_score(\n", + " y_test, y_pis[strategy][:, 0, 0], y_pis[strategy][:, 1, 0]\n", + " ),\n", + " (\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0]\n", + " ).mean()\n", + " ] for strategy in STRATEGIES\n", + "], index=STRATEGIES, columns=[\"Coverage\", \"Width average\"]).round(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All strategies except the Naive one give effective coverage close to the\n", + "expected 0.95 value (recall that alpha = 0.05), confirming the theoretical\n", + "guarantees.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. 
Estimating the aleatoric uncertainty of heteroscedastic noisy data\n", + "\n", + "Let's define again the $x \\times \\sin(x)$ function and another simple\n", + "function that generates one-dimensional data with normal noise uniformly\n", + "in a given interval.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def get_1d_data_with_heteroscedastic_noise(\n", + " funct, min_x, max_x, n_samples, noise\n", + "):\n", + " \"\"\"\n", + " Generate 1D noisy data uniformly from the given function\n", + " and standard deviation for the noise.\n", + " \"\"\"\n", + " np.random.seed(59)\n", + " X_train = np.linspace(min_x, max_x, n_samples)\n", + " np.random.shuffle(X_train)\n", + " X_test = np.linspace(min_x, max_x, n_samples*5)\n", + " y_train = (\n", + " funct(X_train) +\n", + " (np.random.normal(0, noise, len(X_train)) * X_train)\n", + " )\n", + " y_test = (\n", + " funct(X_test) +\n", + " (np.random.normal(0, noise, len(X_test)) * X_test)\n", + " )\n", + " y_mesh = funct(X_test)\n", + " return (\n", + " X_train.reshape(-1, 1), y_train, X_test.reshape(-1, 1), y_test, y_mesh\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first generate noisy one-dimensional data uniformly on an interval.\n", + "Here, the noise is considered as *heteroscedastic*, since it will increase\n", + "linearly with $x$.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "min_x, max_x, n_samples, noise = 0, 5, 300, 0.5\n", + "(\n", + " X_train, y_train, X_test, y_test, y_mesh\n", + ") = get_1d_data_with_heteroscedastic_noise(\n", + " x_sinx, min_x, max_x, n_samples, noise\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's visualize our noisy function. 
As x increases, the data becomes more\n", + "noisy.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "plt.xlabel(\"x\")\n", + "plt.ylabel(\"y\")\n", + "plt.scatter(X_train, y_train, color=\"C0\")\n", + "plt.plot(X_test, y_mesh, color=\"C1\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As mentioned previously, we fit our training data with a simple\n", + "polynomial function. Here, we choose a degree equal to 10 so the function\n", + "is able to perfectly fit $x \\times \\sin(x)$.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "degree_polyn = 10\n", + "polyn_model = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", LinearRegression())\n", + " ]\n", + ")\n", + "polyn_model_quant = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", QuantileRegressor(\n", + " solver=\"highs\",\n", + " alpha=0,\n", + " ))\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then estimate the prediction intervals for all the strategies very easily\n", + "with a\n", + "`fit` and `predict` process. The prediction interval's lower and upper bounds\n", + "are then saved in a DataFrame. 
Here, we set an alpha value of 0.05\n", + "in order to obtain a 95% confidence level for our prediction intervals.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "STRATEGIES = {\n", + " \"naive\": dict(method=\"naive\"),\n", + " \"jackknife\": dict(method=\"base\", cv=-1),\n", + " \"jackknife_plus\": dict(method=\"plus\", cv=-1),\n", + " \"jackknife_minmax\": dict(method=\"minmax\", cv=-1),\n", + " \"cv\": dict(method=\"base\", cv=10),\n", + " \"cv_plus\": dict(method=\"plus\", cv=10),\n", + " \"cv_minmax\": dict(method=\"minmax\", cv=10),\n", + " \"jackknife_plus_ab\": dict(method=\"plus\", cv=Subsample(n_resamplings=50)),\n", + " \"conformalized_quantile_regression\": dict(\n", + " method=\"quantile\", cv=\"split\", alpha=0.05\n", + " )\n", + "}\n", + "y_pred, y_pis = {}, {}\n", + "for strategy, params in STRATEGIES.items():\n", + " if strategy == \"conformalized_quantile_regression\":\n", + " mapie = MapieQuantileRegressor(polyn_model_quant, **params)\n", + " mapie.fit(X_train, y_train, random_state=1)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test)\n", + " else:\n", + " mapie = MapieRegressor(polyn_model, **params)\n", + " mapie.fit(X_train, y_train)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test, alpha=0.05)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once again, let’s compare the target confidence intervals with prediction\n", + "intervals obtained with the Jackknife+, Jackknife-minmax, CV+, CV-minmax,\n", + "Jackknife+-after-Bootstrap, and CQR strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "strategies = [\n", + " \"jackknife_plus\",\n", + " \"jackknife_minmax\",\n", + " \"cv_plus\",\n", + " \"cv_minmax\",\n", + " \"jackknife_plus_ab\",\n", + " \"conformalized_quantile_regression\"\n", + 
"]\n", + "n_figs = len(strategies)\n", + "fig, axs = plt.subplots(3, 2, figsize=(9, 13))\n", + "coords = [axs[0, 0], axs[0, 1], axs[1, 0], axs[1, 1], axs[2, 0], axs[2, 1]]\n", + "for strategy, coord in zip(strategies, coords):\n", + " plot_1d_data(\n", + " X_train.ravel(),\n", + " y_train.ravel(),\n", + " X_test.ravel(),\n", + " y_mesh.ravel(),\n", + " (1.96*noise*X_test).ravel(),\n", + " y_pred[strategy].ravel(),\n", + " y_pis[strategy][:, 0, 0].ravel(),\n", + " y_pis[strategy][:, 1, 0].ravel(),\n", + " ax=coord,\n", + " title=strategy\n", + " )\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can observe that all of the strategies except CQR seem to have similar\n", + "constant prediction intervals.\n", + "On the other hand, the CQR strategy offers a solution that adapts the\n", + "prediction intervals to the local noise.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "fig, ax = plt.subplots(1, 1, figsize=(7, 5))\n", + "ax.plot(X_test, 1.96*2*noise*X_test, ls=\"--\", color=\"k\", label=\"True width\")\n", + "for strategy in STRATEGIES:\n", + " ax.plot(\n", + " X_test,\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0],\n", + " label=strategy\n", + " )\n", + "ax.set_xlabel(\"x\")\n", + "ax.set_ylabel(\"Prediction Interval Width\")\n", + "ax.legend(fontsize=8)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "One can observe that all the strategies behave in a similar way as in the\n", + "first example shown previously. One exception is the CQR method which takes\n", + "into account the heteroscedasticity of the data. In this method we observe\n", + "very low interval widths at low values of $x$.\n", + "This is the only method that\n", + "even slightly follows the true width, and therefore is the preferred method\n", + "for heteroscedastic data. 
Notice also that the true width is greater (lower)\n", + "than the predicted width from the other methods at $x \\gtrapprox 3$\n", + "($x \\leq 3$). This means that while the marginal coverage is correct for\n", + "these methods, the conditional coverage is likely not guaranteed as we will\n", + "observe in the next figure.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def get_heteroscedastic_coverage(y_test, y_pis, STRATEGIES, bins):\n", + " recap = {}\n", + " for i in range(len(bins)-1):\n", + " bin1, bin2 = bins[i], bins[i+1]\n", + " name = f\"[{bin1}, {bin2}]\"\n", + " recap[name] = []\n", + " for strategy in STRATEGIES:\n", + " indices = np.where((X_test >= bins[i]) * (X_test <= bins[i+1]))\n", + " y_test_trunc = np.take(y_test, indices)\n", + " y_low_ = np.take(y_pis[strategy][:, 0, 0], indices)\n", + " y_high_ = np.take(y_pis[strategy][:, 1, 0], indices)\n", + " score_coverage = regression_coverage_score(\n", + " y_test_trunc[0], y_low_[0], y_high_[0]\n", + " )\n", + " recap[name].append(score_coverage)\n", + " recap_df = pd.DataFrame(recap, index=STRATEGIES)\n", + " return recap_df\n", + "\n", + "\n", + "bins = [0, 1, 2, 3, 4, 5]\n", + "heteroscedastic_coverage = get_heteroscedastic_coverage(\n", + " y_test, y_pis, STRATEGIES, bins\n", + ")\n", + "\n", + "# fig = plt.figure()\n", + "heteroscedastic_coverage.T.plot.bar(figsize=(12, 5), alpha=0.7)\n", + "plt.axhline(0.95, ls=\"--\", color=\"k\")\n", + "plt.ylabel(\"Conditional coverage\")\n", + "plt.xlabel(\"x bins\")\n", + "plt.xticks(rotation=0)\n", + "plt.ylim(0.8, 1.0)\n", + "plt.legend(fontsize=8, loc=[0, 0])\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let’s now conclude by summarizing the *effective* coverage, namely the\n", + "fraction of test\n", + "points whose true values lie within the prediction intervals, given by\n", + "the different 
strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "pd.DataFrame([\n", + " [\n", + " regression_coverage_score(\n", + " y_test, y_pis[strategy][:, 0, 0], y_pis[strategy][:, 1, 0]\n", + " ),\n", + " (\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0]\n", + " ).mean()\n", + " ] for strategy in STRATEGIES\n", + "], index=STRATEGIES, columns=[\"Coverage\", \"Width average\"]).round(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All the strategies have the wanted coverage, however, we notice that the CQR\n", + "strategy has much lower interval width than all the other methods, therefore,\n", + "with heteroscedastic noise, CQR would be the preferred method.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Estimating the epistemic uncertainty of out-of-distribution data\n", + "\n", + "Let’s now consider one-dimensional data without noise, but normally\n", + "distributed.\n", + "The goal is to explore how the prediction intervals evolve for new data\n", + "that lie outside the distribution of the training data in order to see how\n", + "the strategies can capture the *epistemic* uncertainty.\n", + "For a comparison of the epistemic and aleatoric uncertainties, please have\n", + "a look at this source:\n", + "https://en.wikipedia.org/wiki/Uncertainty_quantification.\n", + "\n", + "Let's start by generating and showing the data.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def get_1d_data_with_normal_distrib(funct, mu, sigma, n_samples, noise):\n", + " \"\"\"\n", + " Generate noisy 1D data with normal distribution from given function\n", + " and noise standard deviation.\n", + " \"\"\"\n", + " np.random.seed(59)\n", + " X_train = np.random.normal(mu, sigma, n_samples)\n", + " 
X_test = np.arange(mu-4*sigma, mu+4*sigma, sigma/20.)\n", + " y_train, y_mesh, y_test = funct(X_train), funct(X_test), funct(X_test)\n", + " y_train += np.random.normal(0, noise, y_train.shape[0])\n", + " y_test += np.random.normal(0, noise, y_test.shape[0])\n", + " return (\n", + " X_train.reshape(-1, 1), y_train, X_test.reshape(-1, 1), y_test, y_mesh\n", + " )\n", + "\n", + "\n", + "mu, sigma, n_samples, noise = 0, 2, 1000, 0.\n", + "X_train, y_train, X_test, y_test, y_mesh = get_1d_data_with_normal_distrib(\n", + " x_sinx, mu, sigma, n_samples, noise\n", + ")\n", + "plt.xlabel(\"x\")\n", + "plt.ylabel(\"y\")\n", + "plt.scatter(X_train, y_train, color=\"C0\")\n", + "_ = plt.plot(X_test, y_test, color=\"C1\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As before, we estimate the prediction intervals using a polynomial\n", + "function of degree 10 and show the results for the Jackknife+ and CV+\n", + "strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "polyn_model_quant = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", QuantileRegressor(\n", + " solver=\"highs-ds\",\n", + " alpha=0,\n", + " ))\n", + " ]\n", + ")\n", + "STRATEGIES = {\n", + " \"naive\": dict(method=\"naive\"),\n", + " \"jackknife\": dict(method=\"base\", cv=-1),\n", + " \"jackknife_plus\": dict(method=\"plus\", cv=-1),\n", + " \"jackknife_minmax\": dict(method=\"minmax\", cv=-1),\n", + " \"cv\": dict(method=\"base\", cv=10),\n", + " \"cv_plus\": dict(method=\"plus\", cv=10),\n", + " \"cv_minmax\": dict(method=\"minmax\", cv=10),\n", + " \"jackknife_plus_ab\": dict(method=\"plus\", cv=Subsample(n_resamplings=50)),\n", + " \"jackknife_minmax_ab\": dict(\n", + " method=\"minmax\", cv=Subsample(n_resamplings=50)\n", + " ),\n", + " \"conformalized_quantile_regression\": dict(\n", + " 
method=\"quantile\", cv=\"split\", alpha=0.05\n", + " )\n", + "}\n", + "y_pred, y_pis = {}, {}\n", + "for strategy, params in STRATEGIES.items():\n", + " if strategy == \"conformalized_quantile_regression\":\n", + " mapie = MapieQuantileRegressor(polyn_model_quant, **params)\n", + " mapie.fit(X_train, y_train, random_state=1)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test)\n", + " else:\n", + " mapie = MapieRegressor(polyn_model, **params)\n", + " mapie.fit(X_train, y_train)\n", + " y_pred[strategy], y_pis[strategy] = mapie.predict(X_test, alpha=0.05)\n", + "\n", + "strategies = [\n", + " \"jackknife_plus\",\n", + " \"jackknife_minmax\",\n", + " \"cv_plus\",\n", + " \"cv_minmax\",\n", + " \"jackknife_plus_ab\",\n", + " \"conformalized_quantile_regression\"\n", + "]\n", + "n_figs = len(strategies)\n", + "fig, axs = plt.subplots(3, 2, figsize=(9, 13))\n", + "coords = [axs[0, 0], axs[0, 1], axs[1, 0], axs[1, 1], axs[2, 0], axs[2, 1]]\n", + "for strategy, coord in zip(strategies, coords):\n", + " plot_1d_data(\n", + " X_train.ravel(),\n", + " y_train.ravel(),\n", + " X_test.ravel(),\n", + " y_mesh.ravel(),\n", + " 1.96*noise,\n", + " y_pred[strategy].ravel(),\n", + " y_pis[strategy][:, 0, :].ravel(),\n", + " y_pis[strategy][:, 1, :].ravel(),\n", + " ax=coord,\n", + " title=strategy\n", + " )\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "At first glance, our polynomial function does not give accurate\n", + "predictions with respect to the true function when $|x| > 6$.\n", + "The prediction intervals estimated with the Jackknife+ do not seem to\n", + "increase. 
On the other hand, the CV and other related methods seem to capture\n", + "some uncertainty when $x > 6$.\n", + "\n", + "Let's now compare the prediction interval widths between all strategies.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "fig, ax = plt.subplots(1, 1, figsize=(7, 5))\n", + "ax.set_yscale(\"log\")\n", + "for strategy in STRATEGIES:\n", + " ax.plot(\n", + " X_test,\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0],\n", + " label=strategy\n", + " )\n", + "ax.set_xlabel(\"x\")\n", + "ax.set_ylabel(\"Prediction Interval Width\")\n", + "ax.legend(fontsize=8)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The prediction interval widths start to increase exponentially\n", + "for $|x| > 4$ for the CV+, CV-minmax, Jackknife-minmax, and quantile\n", + "strategies. On the other hand, the prediction intervals estimated by\n", + "Jackknife+ remain roughly constant until $|x| \\approx 5$ before\n", + "increasing.\n", + "The CQR strategy seems to perform well. However, at the extreme values\n", + "of the data the quantile regression fails to give reliable results, as it\n", + "outputs\n", + "negative widths for the prediction intervals. This occurs because the quantile\n", + "regressor with quantile $1 - \\alpha/2$ gives lower values than the\n", + "quantile regressor with quantile $\\alpha/2$.
Note that a warning will\n", + "be issued when this occurs.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "pd.DataFrame([\n", + " [\n", + " regression_coverage_score(\n", + " y_test, y_pis[strategy][:, 0, 0], y_pis[strategy][:, 1, 0]\n", + " ),\n", + " (\n", + " y_pis[strategy][:, 1, 0] - y_pis[strategy][:, 0, 0]\n", + " ).mean()\n", + " ] for strategy in STRATEGIES\n", + "], index=STRATEGIES, columns=[\"Coverage\", \"Width average\"]).round(3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In conclusion, the Jackknife-minmax, CV+, CV-minmax, and Jackknife-minmax-ab\n", + "strategies are more\n", + "conservative than the Jackknife+ strategy, and tend to result in more\n", + "reliable coverages for *out-of-distribution* data. It is therefore\n", + "advised to use these four strategies for predictions with new\n", + "out-of-distribution data.\n", + "Note however that there are no theoretical guarantees on the coverage level\n", + "for out-of-distribution data.\n", + "Note also that the CQR strategy should not be relied on for interval\n", + "widths here, as the negative\n", + "widths observed in these results make clear.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Estimating the uncertainty with different sklearn-compatible regressors\n", + "\n", + "MAPIE can be used with any kind of sklearn-compatible regressor.
Here, we\n", + "illustrate this by comparing the prediction intervals estimated by the CV+\n", + "method using different models:\n", + "\n", + "* the same polynomial function as before.\n", + "\n", + "* an XGBoost model using the Scikit-learn API.\n", + "\n", + "* a simple neural network, a Multilayer Perceptron with three dense layers,\n", + " using the KerasRegressor wrapper.\n", + "\n", + "Once again, let’s use our noisy one-dimensional data obtained from a\n", + "uniform distribution.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "subprocess.run(\"pip install scikeras\", shell=True)\n", + "subprocess.run(\"pip install tensorflow\", shell=True)\n", + "subprocess.run(\"pip install xgboost\", shell=True)\n", + "\n", + "from scikeras.wrappers import KerasRegressor # noqa: E402\n", + "from tensorflow.keras import Sequential # noqa: E402\n", + "from tensorflow.keras.layers import Dense # noqa: E402\n", + "from xgboost import XGBRegressor # noqa: E402\n", + "\n", + "\n", + "min_x, max_x, n_samples, noise = -5, 5, 100, 0.5\n", + "X_train, y_train, X_test, y_test, y_mesh = get_1d_data_with_constant_noise(\n", + " x_sinx, min_x, max_x, n_samples, noise\n", + ")\n", + "\n", + "plt.xlabel(\"x\")\n", + "plt.ylabel(\"y\")\n", + "plt.plot(X_test, y_mesh, color=\"C1\")\n", + "_ = plt.scatter(X_train, y_train)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's then define the models.
The boosting model uses 100 shallow\n", + "trees with a max depth of 2 while the Multilayer Perceptron has two hidden\n", + "dense layers with 20 neurons each, each followed by a ReLU activation.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "def mlp():\n", + " \"\"\"\n", + " Two-layer MLP model\n", + " \"\"\"\n", + " model = Sequential([\n", + " Dense(units=20, input_shape=(1,), activation=\"relu\"),\n", + " Dense(units=20, activation=\"relu\"),\n", + " Dense(units=1)\n", + " ])\n", + " model.compile(loss=\"mean_squared_error\", optimizer=\"adam\")\n", + " return model\n", + "\n", + "\n", + "polyn_model = Pipeline(\n", + " [\n", + " (\"poly\", PolynomialFeatures(degree=degree_polyn)),\n", + " (\"linear\", LinearRegression())\n", + " ]\n", + ")\n", + "\n", + "xgb_model = XGBRegressor(\n", + " max_depth=2,\n", + " n_estimators=100,\n", + " tree_method=\"hist\",\n", + " random_state=59,\n", + " learning_rate=0.1,\n", + " verbosity=0,\n", + " nthread=-1\n", + ")\n", + "mlp_model = KerasRegressor(\n", + " build_fn=mlp,\n", + " epochs=500,\n", + " verbose=0\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's now use MAPIE to estimate the prediction intervals using the CV+\n", + "method and compare the resulting prediction intervals.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "models = [polyn_model, xgb_model, mlp_model]\n", + "model_names = [\"polyn\", \"xgb\", \"mlp\"]\n", + "for name, model in zip(model_names, models):\n", + " mapie = MapieRegressor(model, method=\"plus\", cv=10)\n", + " mapie.fit(X_train, y_train)\n", + " y_pred[name], y_pis[name] = mapie.predict(X_test, alpha=0.05)\n", + "\n", + "fig, axs = plt.subplots(1, 3, figsize=(20, 6))\n", + "for name, ax in zip(model_names, axs):\n", + " plot_1d_data(\n", + "
X_train.ravel(),\n", + " y_train.ravel(),\n", + " X_test.ravel(),\n", + " y_mesh.ravel(),\n", + " 1.96*noise,\n", + " y_pred[name].ravel(),\n", + " y_pis[name][:, 0, 0].ravel(),\n", + " y_pis[name][:, 1, 0].ravel(),\n", + " ax=ax,\n", + " title=name\n", + " )\n", + "plt.show()\n", + "\n", + "\n", + "fig, ax = plt.subplots(1, 1, figsize=(7, 5))\n", + "for name in model_names:\n", + " ax.plot(X_test, y_pis[name][:, 1, 0] - y_pis[name][:, 0, 0])\n", + "ax.axhline(1.96*2*noise, ls=\"--\", color=\"k\")\n", + "ax.set_xlabel(\"x\")\n", + "ax.set_ylabel(\"Prediction Interval Width\")\n", + "ax.legend(model_names + [\"True width\"], fontsize=8)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As expected with the CV+ method, the prediction intervals are a bit\n", + "conservative since they are slightly wider than the true intervals.\n", + "However, the CV+ method on the three models gives very promising results\n", + "since the prediction intervals closely follow the true intervals with\n", + "$x$.\n", + "\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3.9.0 64-bit ('3.9.0')", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.0" + }, + "vscode": { + "interpreter": { + "hash": "9f3749d15f85194e2be8a9af590cff47c92b4121cd966ac320a66f9707b35b3a" + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/docs/Manifest.toml b/docs/Manifest.toml index 04314dd..ac17c5a 100644 --- a/docs/Manifest.toml +++ b/docs/Manifest.toml @@ -2,7 +2,7 @@ julia_version = "1.8.1" manifest_format = "2.0" -project_hash = "ae240e8e0ebf5559f152baffe6c05f54b653e10d" +project_hash = "3d726f20636a3ccda39ed827011f908499c4d97d" [[deps.ANSIColoredPrinters]] git-tree-sha1 = 
"574baf8110975760d391c710b6341da1afa48d8c" @@ -26,39 +26,50 @@ git-tree-sha1 = "52b3b436f8f73133d7bc3a6c71ee7ed6ab2ab754" uuid = "1520ce14-60c1-5f80-bbc7-55ef81b5835c" version = "0.4.3" +[[deps.Accessors]] +deps = ["Compat", "CompositionsBase", "ConstructionBase", "Dates", "InverseFunctions", "LinearAlgebra", "MacroTools", "Requires", "Test"] +git-tree-sha1 = "eb7a1342ff77f4f9b6552605f27fd432745a53a3" +uuid = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697" +version = "0.1.22" + [[deps.Adapt]] deps = ["LinearAlgebra"] git-tree-sha1 = "195c5505521008abea5aee4f96930717958eac6f" uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" version = "3.4.0" +[[deps.ArgCheck]] +git-tree-sha1 = "a3a402a35a2f7e0b87828ccabbd5ebfbebe356b4" +uuid = "dce04be8-c92d-5529-be00-80e4d2c0e197" +version = "2.3.0" + [[deps.ArgTools]] uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f" version = "1.1.1" [[deps.ArrayInterface]] deps = ["ArrayInterfaceCore", "Compat", "IfElse", "LinearAlgebra", "Static"] -git-tree-sha1 = "d6173480145eb632d6571c148d94b9d3d773820e" +git-tree-sha1 = "6d0918cb9c0d3db7fe56bea2bc8638fc4014ac35" uuid = "4fba245c-0d91-5ea0-9b3e-6abc04ee57a9" -version = "6.0.23" +version = "6.0.24" [[deps.ArrayInterfaceCore]] deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"] -git-tree-sha1 = "e6cba4aadba7e8a7574ab2ba2fcfb307b4c4b02a" +git-tree-sha1 = "c46fb7dd1d8ca1d213ba25848a5ec4e47a1a1b08" uuid = "30b0a656-2188-435a-8636-2ec0e6a096e2" -version = "0.1.23" +version = "0.1.26" [[deps.ArrayInterfaceOffsetArrays]] deps = ["ArrayInterface", "OffsetArrays", "Static"] -git-tree-sha1 = "c49f6bad95a30defff7c637731f00934c7289c50" +git-tree-sha1 = "3d1a9a01976971063b3930d1aed1d9c4af0817f8" uuid = "015c0d05-e682-4f19-8f0a-679ce4c54826" -version = "0.1.6" +version = "0.1.7" [[deps.ArrayInterfaceStaticArrays]] -deps = ["Adapt", "ArrayInterface", "ArrayInterfaceStaticArraysCore", "LinearAlgebra", "Static", "StaticArrays"] -git-tree-sha1 = "efb000a9f643f018d5154e56814e338b5746c560" +deps = ["Adapt", "ArrayInterface", 
"ArrayInterfaceCore", "ArrayInterfaceStaticArraysCore", "LinearAlgebra", "Static", "StaticArrays"] +git-tree-sha1 = "f12dc65aef03d0a49650b20b2fdaf184928fd886" uuid = "b0d46f97-bff5-4637-a19a-dd75974142cd" -version = "0.1.4" +version = "0.1.5" [[deps.ArrayInterfaceStaticArraysCore]] deps = ["Adapt", "ArrayInterfaceCore", "LinearAlgebra", "StaticArraysCore"] @@ -82,23 +93,34 @@ uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b" version = "0.2.0" [[deps.BSON]] -git-tree-sha1 = "306bb5574b0c1c56d7e1207581516c557d105cad" +git-tree-sha1 = "86e9781ac28f4e80e9b98f7f96eae21891332ac2" uuid = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0" -version = "0.3.5" +version = "0.3.6" + +[[deps.BangBang]] +deps = ["Compat", "ConstructionBase", "Future", "InitialValues", "LinearAlgebra", "Requires", "Setfield", "Tables", "ZygoteRules"] +git-tree-sha1 = "7fe6d92c4f281cf4ca6f2fba0ce7b299742da7ca" +uuid = "198e06fe-97b7-11e9-32a5-e1d131e6ad66" +version = "0.3.37" [[deps.Base64]] uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" +[[deps.Baselet]] +git-tree-sha1 = "aebf55e6d7795e02ca500a689d326ac979aaf89e" +uuid = "9718e550-a3fa-408a-8086-8db961cd8217" +version = "0.1.1" + [[deps.BitFlags]] -git-tree-sha1 = "84259bb6172806304b9101094a7cc4bc6f56dbc6" +git-tree-sha1 = "43b1a4a8f797c1cddadf60499a8a077d4af2cd2d" uuid = "d1d4a3ce-64b1-5f1a-9ba4-7e7e69966f35" -version = "0.1.5" +version = "0.1.7" [[deps.BitTwiddlingConvenienceFunctions]] deps = ["Static"] -git-tree-sha1 = "eaee37f76339077f86679787a71990c4e465477f" +git-tree-sha1 = "0c5f81f47bbbcf4aea7b2959135713459170798b" uuid = "62783981-4cbd-42fc-bca8-16325de8dc4b" -version = "0.1.4" +version = "0.1.5" [[deps.Bzip2_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] @@ -113,9 +135,9 @@ version = "0.4.2" [[deps.CPUSummary]] deps = ["CpuId", "IfElse", "Static"] -git-tree-sha1 = "9bdd5aceea9fa109073ace6b430a24839d79315e" +git-tree-sha1 = "a7157ab6bcda173f533db4c93fc8a27a48843757" uuid = "2a0fbf3d-bb9c-48f3-b0a9-814d99fd7ab9" -version = "0.1.27" +version = 
"0.1.30" [[deps.CUDA]] deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions", "TimerOutputs"] @@ -147,6 +169,12 @@ git-tree-sha1 = "23fe4c6668776fedfd3747c545cd0d1a5190eb15" uuid = "af321ab8-2d2e-40a6-b165-3d674595d28e" version = "0.1.9" +[[deps.ChainRules]] +deps = ["Adapt", "ChainRulesCore", "Compat", "Distributed", "GPUArraysCore", "IrrationalConstants", "LinearAlgebra", "Random", "RealDot", "SparseArrays", "Statistics", "StructArrays"] +git-tree-sha1 = "0c8c8887763f42583e1206ee35413a43c91e2623" +uuid = "082447d4-558c-5d27-93f4-14fc19e9eca2" +version = "1.45.0" + [[deps.ChainRulesCore]] deps = ["Compat", "LinearAlgebra", "SparseArrays"] git-tree-sha1 = "e7ff6cadf743c098e08fca25c91103ee4303c9bb" @@ -161,15 +189,9 @@ version = "0.1.4" [[deps.CloseOpenIntervals]] deps = ["ArrayInterface", "Static"] -git-tree-sha1 = "5522c338564580adf5d58d91e43a55db0fa5fb39" +git-tree-sha1 = "d61300b9895f129f4bd684b2aff97cf319b6c493" uuid = "fb6a15b2-703c-40df-9091-08a04967cfa9" -version = "0.1.10" - -[[deps.CodeTracking]] -deps = ["InteractiveUtils", "UUIDs"] -git-tree-sha1 = "cc4bd91eba9cdbbb4df4746124c22c0832a460d6" -uuid = "da1fd8a2-8d9e-5ec2-8556-3022fb5608a2" -version = "1.1.1" +version = "0.1.11" [[deps.CodecZlib]] deps = ["TranscodingStreams", "Zlib_jll"] @@ -178,10 +200,10 @@ uuid = "944b1d66-785c-5afd-91f1-9de20f533193" version = "0.7.0" [[deps.ColorSchemes]] -deps = ["ColorTypes", "ColorVectorSpace", "Colors", "FixedPointNumbers", "Random"] -git-tree-sha1 = "1fd869cc3875b57347f7027521f561cf46d1fcd8" +deps = ["ColorTypes", "ColorVectorSpace", "Colors", "FixedPointNumbers", "Random", "SnoopPrecompile"] +git-tree-sha1 = "aa3edc8f8dea6cbfa176ee12f7c2fc82f0608ed3" uuid = "35d6a980-a343-548e-a6ea-1d62b119f2f4" -version = "3.19.0" 
+version = "3.20.0" [[deps.ColorTypes]] deps = ["FixedPointNumbers", "Random"] @@ -214,25 +236,30 @@ version = "0.3.0" [[deps.Compat]] deps = ["Dates", "LinearAlgebra", "UUIDs"] -git-tree-sha1 = "3ca828fe1b75fa84b021a7860bd039eaea84d2f2" +git-tree-sha1 = "00a2cccc7f098ff3b66806862d275ca3db9e6e5a" uuid = "34da2185-b29b-5c13-b0c7-acf172513d20" -version = "4.3.0" +version = "4.5.0" [[deps.CompilerSupportLibraries_jll]] deps = ["Artifacts", "Libdl"] uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" version = "0.5.2+0" +[[deps.CompositionsBase]] +git-tree-sha1 = "455419f7e328a1a2493cabc6428d79e951349769" +uuid = "a33af91c-f02d-484b-be07-31d278c5ca2b" +version = "0.1.1" + [[deps.ComputationalResources]] git-tree-sha1 = "52cb3ec90e8a8bea0e62e275ba577ad0f74821f7" uuid = "ed09eef8-17a6-5b46-8889-db040fac31e3" version = "0.3.2" [[deps.ConformalPrediction]] -deps = ["MLJ", "MLJBase", "MLJModelInterface", "Statistics", "Term"] +deps = ["CategoricalArrays", "MLJ", "MLJBase", "MLJModelInterface", "Plots", "Statistics"] path = ".." 
uuid = "98bfc277-1877-43dc-819b-a3e38c30242f" -version = "0.1.2" +version = "0.1.3" [[deps.ConstructionBase]] deps = ["LinearAlgebra"] @@ -240,6 +267,12 @@ git-tree-sha1 = "fb21ddd70a051d882a1686a5a550990bbe371a95" uuid = "187b0558-2788-49d3-abe0-74a17ed4e7c9" version = "1.4.1" +[[deps.ContextVariablesX]] +deps = ["Compat", "Logging", "UUIDs"] +git-tree-sha1 = "25cc3803f1030ab855e383129dcd3dc294e322cc" +uuid = "6add18c4-b38d-439d-96f6-d6bc489c04c5" +version = "0.1.3" + [[deps.Contour]] git-tree-sha1 = "d05d9e7b7aedff4e5b51a029dced05cfb6125781" uuid = "d38c429a-6771-53c6-b99e-75d170b6e991" @@ -257,15 +290,15 @@ uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f" version = "4.1.1" [[deps.DataAPI]] -git-tree-sha1 = "46d2680e618f8abd007bce0c3026cb0c4a8f2032" +git-tree-sha1 = "e08915633fcb3ea83bf9d6126292e5bc5c739922" uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a" -version = "1.12.0" +version = "1.13.0" [[deps.DataFrames]] deps = ["Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Random", "Reexport", "SnoopPrecompile", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"] -git-tree-sha1 = "558078b0b78278683a7445c626ee78c86b9bb000" +git-tree-sha1 = "0f44494fe4271cc966ac4fea524111bef63ba86c" uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" -version = "1.4.1" +version = "1.4.3" [[deps.DataStructures]] deps = ["Compat", "InteractiveUtils", "OrderedCollections"] @@ -284,9 +317,14 @@ uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" [[deps.DecisionTree]] deps = ["AbstractTrees", "DelimitedFiles", "LinearAlgebra", "Random", "ScikitLearnBase", "Statistics"] -git-tree-sha1 = "fb3f7ff27befb9877bee84076dd9173185d7d86a" +git-tree-sha1 = "5ab40e9f5a554a642bc5ab1d20a198f76a1e6958" uuid = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb" -version = "0.11.2" +version = "0.12.0" + +[[deps.DefineSingletons]] +git-tree-sha1 = "0fba8b706d0178b4dc7fd44a96a92382c9065c2c" +uuid = 
"244e2a9f-e319-4986-a169-4d1fe445cd52" +version = "0.1.2" [[deps.DelimitedFiles]] deps = ["Mmap"] @@ -306,9 +344,9 @@ version = "1.1.0" [[deps.DiffRules]] deps = ["IrrationalConstants", "LogExpFunctions", "NaNMath", "Random", "SpecialFunctions"] -git-tree-sha1 = "8b7a4d23e22f5d44883671da70865ca98f2ebf9d" +git-tree-sha1 = "c5b6685d53f933c11404a3ae9822afe30d522494" uuid = "b552c78f-8df3-52c6-915a-8e097449b14b" -version = "1.12.0" +version = "1.12.2" [[deps.Distances]] deps = ["LinearAlgebra", "SparseArrays", "Statistics", "StatsAPI"] @@ -322,9 +360,9 @@ uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b" [[deps.Distributions]] deps = ["ChainRulesCore", "DensityInterface", "FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "Test"] -git-tree-sha1 = "04db820ebcfc1e053bd8cbb8d8bccf0ff3ead3f7" +git-tree-sha1 = "a7756d098cbabec6b3ac44f369f74915e8cfd70a" uuid = "31c24e10-a181-5473-b8eb-7969acd0382f" -version = "0.25.76" +version = "0.25.79" [[deps.DocStringExtensions]] deps = ["LibGit2"] @@ -363,9 +401,9 @@ version = "0.3.0" [[deps.EvoTrees]] deps = ["BSON", "CUDA", "CategoricalArrays", "Distributions", "LoopVectorization", "MLJModelInterface", "NetworkLayout", "Random", "RecipesBase", "Statistics", "StatsBase", "Tables"] -git-tree-sha1 = "966e236ded10551a44b6e25ce4bbea4c12be1557" +git-tree-sha1 = "7509d36427e6fd08aa7e591d783713263fdac57e" uuid = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" -version = "0.12.4" +version = "0.14.2" [[deps.Expat_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] @@ -407,6 +445,18 @@ git-tree-sha1 = "c6033cc3892d0ef5bb9cd29b7f2f0331ea5184ea" uuid = "f5851436-0d7a-5f13-b9de-f02708fd171a" version = "3.3.10+0" +[[deps.FLoops]] +deps = ["BangBang", "Compat", "FLoopsBase", "InitialValues", "JuliaVariables", "MLStyle", "Serialization", "Setfield", "Transducers"] +git-tree-sha1 = "ffb97765602e3cbe59a0589d237bf07f245a8576" +uuid = 
"cc61a311-1640-44b5-9fba-1b764f453329" +version = "0.2.1" + +[[deps.FLoopsBase]] +deps = ["ContextVariablesX"] +git-tree-sha1 = "656f7a6859be8673bf1f35da5670246b923964f7" +uuid = "b9860ae5-e623-471e-878b-f6a53c775ea6" +version = "0.1.1" + [[deps.FileIO]] deps = ["Pkg", "Requires", "UUIDs"] git-tree-sha1 = "7be5f99f7d15578798f338f5433b6c432ea8037b" @@ -424,9 +474,9 @@ version = "0.13.5" [[deps.FiniteDiff]] deps = ["ArrayInterfaceCore", "LinearAlgebra", "Requires", "Setfield", "SparseArrays", "StaticArrays"] -git-tree-sha1 = "5a2cff9b6b77b33b89f3d97a4d367747adce647e" +git-tree-sha1 = "04ed1f0029b6b3af88343e439b995141cb0d0b8d" uuid = "6a86dc24-6348-571c-b903-95158fe2bd41" -version = "2.15.0" +version = "2.17.0" [[deps.FixedPointNumbers]] deps = ["Statistics"] @@ -434,6 +484,18 @@ git-tree-sha1 = "335bfdceacc84c5cdf16aadc768aa5ddfc5383cc" uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93" version = "0.8.4" +[[deps.Flux]] +deps = ["Adapt", "CUDA", "ChainRulesCore", "Functors", "LinearAlgebra", "MLUtils", "MacroTools", "NNlib", "NNlibCUDA", "OneHotArrays", "Optimisers", "ProgressLogging", "Random", "Reexport", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "Zygote"] +git-tree-sha1 = "2b85cb85f5d71f05e41089a2446ac33b8e94ebed" +uuid = "587475ba-b771-5e3f-ad9e-33799f191a9c" +version = "0.13.9" + +[[deps.FoldsThreads]] +deps = ["Accessors", "FunctionWrappers", "InitialValues", "SplittablesBase", "Transducers"] +git-tree-sha1 = "eb8e1989b9028f7e0985b4268dabe94682249025" +uuid = "9c68100b-dfe1-47cf-94c8-95104e173443" +version = "0.1.1" + [[deps.Fontconfig_jll]] deps = ["Artifacts", "Bzip2_jll", "Expat_jll", "FreeType2_jll", "JLLWrappers", "Libdl", "Libuuid_jll", "Pkg", "Zlib_jll"] git-tree-sha1 = "21efd19106a55620a188615da6d3d06cd7f6ee03" @@ -448,9 +510,9 @@ version = "0.4.2" [[deps.ForwardDiff]] deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "LinearAlgebra", "LogExpFunctions", "NaNMath", "Preferences", "Printf", "Random", "SpecialFunctions", 
"StaticArrays"] -git-tree-sha1 = "187198a4ed8ccd7b5d99c41b69c679269ea2b2d4" +git-tree-sha1 = "10fa12fe96e4d76acfa738f4df2126589a67374f" uuid = "f6369f11-7733-5829-9624-2563aa707210" -version = "0.10.32" +version = "0.10.33" [[deps.FreeType]] deps = ["CEnum", "FreeType2_jll"] @@ -470,6 +532,17 @@ git-tree-sha1 = "aa31987c2ba8704e23c6c8ba8a4f769d5d7e4f91" uuid = "559328eb-81f9-559d-9380-de523a88c83c" version = "1.0.10+0" +[[deps.FunctionWrappers]] +git-tree-sha1 = "d62485945ce5ae9c0c48f124a84998d755bae00e" +uuid = "069b7b12-0de2-55c6-9aab-29f3d0a68a2e" +version = "1.1.3" + +[[deps.Functors]] +deps = ["LinearAlgebra"] +git-tree-sha1 = "993c2b4a9a54496b6d8e265db1244db418f37e01" +uuid = "d9f16b24-f501-4c13-a1f2-28368ffc5196" +version = "0.4.1" + [[deps.Future]] deps = ["Random"] uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" @@ -494,21 +567,21 @@ version = "0.1.2" [[deps.GPUCompiler]] deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"] -git-tree-sha1 = "323949b0bbdf38c93d2ea1f7d3e68ff163c3f081" +git-tree-sha1 = "76f70a337a153c1632104af19d29023dbb6f30dd" uuid = "61eb1bfa-7361-4325-ad38-22787b887f55" -version = "0.16.5" +version = "0.16.6" [[deps.GR]] -deps = ["Base64", "DelimitedFiles", "GR_jll", "HTTP", "JSON", "Libdl", "LinearAlgebra", "Pkg", "Preferences", "Printf", "Random", "Serialization", "Sockets", "Test", "UUIDs"] -git-tree-sha1 = "00a9d4abadc05b9476e937a5557fcce476b9e547" +deps = ["Artifacts", "Base64", "DelimitedFiles", "Downloads", "GR_jll", "HTTP", "JSON", "Libdl", "LinearAlgebra", "Pkg", "Preferences", "Printf", "Random", "Serialization", "Sockets", "TOML", "Tar", "Test", "UUIDs", "p7zip_jll"] +git-tree-sha1 = "051072ff2accc6e0e87b708ddee39b18aa04a0bc" uuid = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71" -version = "0.69.5" +version = "0.71.1" [[deps.GR_jll]] deps = ["Artifacts", "Bzip2_jll", "Cairo_jll", "FFMPEG_jll", "Fontconfig_jll", "GLFW_jll", "JLLWrappers", "JpegTurbo_jll", "Libdl", "Libtiff_jll", "Pixman_jll", "Pkg", 
"Qt5Base_jll", "Zlib_jll", "libpng_jll"] -git-tree-sha1 = "bc9f7725571ddb4ab2c4bc74fa397c1c5ad08943" +git-tree-sha1 = "501a4bf76fd679e7fcd678725d5072177392e756" uuid = "d2c73de3-f751-5644-a686-071e5b155ba9" -version = "0.69.1+0" +version = "0.71.1+0" [[deps.GeoInterface]] deps = ["Extents"] @@ -518,9 +591,9 @@ version = "1.0.1" [[deps.GeometryBasics]] deps = ["EarCut_jll", "GeoInterface", "IterTools", "LinearAlgebra", "StaticArrays", "StructArrays", "Tables"] -git-tree-sha1 = "12a584db96f1d460421d5fb8860822971cdb8455" +git-tree-sha1 = "fe9aea4ed3ec6afdfbeb5a4f39a2208909b162a6" uuid = "5c1252a2-5f33-56bf-86c9-59e7332b4326" -version = "0.4.4" +version = "0.4.5" [[deps.Gettext_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Libiconv_jll", "Pkg", "XML2_jll"] @@ -559,9 +632,9 @@ version = "1.12.2+2" [[deps.HTTP]] deps = ["Base64", "CodecZlib", "Dates", "IniFile", "Logging", "LoggingExtras", "MbedTLS", "NetworkOptions", "OpenSSL", "Random", "SimpleBufferStream", "Sockets", "URIs", "UUIDs"] -git-tree-sha1 = "a97d47758e933cd5fe5ea181d178936a9fc60427" +git-tree-sha1 = "e1acc37ed078d99a714ed8376446f92a5535ca65" uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3" -version = "1.5.1" +version = "1.5.5" [[deps.HarfBuzz_jll]] deps = ["Artifacts", "Cairo_jll", "Fontconfig_jll", "FreeType2_jll", "Glib_jll", "Graphite2_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg"] @@ -569,17 +642,11 @@ git-tree-sha1 = "129acf094d168394e80ee1dc4bc06ec835e510a3" uuid = "2e76f6c2-a576-52d4-95c1-20adfe4de566" version = "2.8.1+1" -[[deps.Highlights]] -deps = ["DocStringExtensions", "InteractiveUtils", "REPL"] -git-tree-sha1 = "0341077e8a6b9fc1c2ea5edc1e93a956d2aec0c7" -uuid = "eafb193a-b7ab-5a9e-9068-77385905fa72" -version = "0.5.2" - [[deps.HostCPUFeatures]] deps = ["BitTwiddlingConvenienceFunctions", "IfElse", "Libdl", "Static"] -git-tree-sha1 = "b7b88a4716ac33fe31d6556c02fc60017594343c" +git-tree-sha1 = "f64b890b2efa4de81520d2b0fbdc9aadb65bdf53" uuid = 
"3e5b6fbb-0976-4d2c-9146-d79de83f2fb0" -version = "0.1.8" +version = "0.1.13" [[deps.HypergeometricFunctions]] deps = ["DualNumbers", "LinearAlgebra", "OpenLibm_jll", "SpecialFunctions", "Test"] @@ -593,6 +660,12 @@ git-tree-sha1 = "f7be53659ab06ddc986428d3a9dcc95f6fa6705a" uuid = "b5f81e59-6552-4d32-b1f0-c071b021bf89" version = "0.2.2" +[[deps.IRTools]] +deps = ["InteractiveUtils", "MacroTools", "Test"] +git-tree-sha1 = "2e99184fca5eb6f075944b04c22edec29beb4778" +uuid = "7869d1d1-7146-5819-86e3-90919afe41df" +version = "0.4.7" + [[deps.IfElse]] git-tree-sha1 = "debdd00ffef04665ccbb3e150747a77560e8fad1" uuid = "615f187c-cbe4-4ef1-ba3b-2fcf58d6d173" @@ -603,6 +676,11 @@ git-tree-sha1 = "f550e6e32074c939295eb5ea6de31849ac2c9625" uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f" version = "0.5.1" +[[deps.InitialValues]] +git-tree-sha1 = "4da0f88e9a39111c2fa3add390ab15f3a44f3ca3" +uuid = "22cec73e-a1b8-11e9-2c92-598750a2cf9c" +version = "0.3.1" + [[deps.IntelOpenMP_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] git-tree-sha1 = "d979e54b71da82f3a65b62553da4fc3d18c9004c" @@ -626,9 +704,9 @@ uuid = "3587e190-3f89-42d0-90ee-14403ec27112" version = "0.1.8" [[deps.InvertedIndices]] -git-tree-sha1 = "bee5f1ef5bf65df56bdd2e40447590b272a5471f" +git-tree-sha1 = "82aec7a3dd64f4d9584659dc0b62ef7db2ef3e19" uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f" -version = "1.1.0" +version = "1.2.0" [[deps.IrrationalConstants]] git-tree-sha1 = "7fd44fd4ff43fc60815f8e764c0f352b83c49151" @@ -681,6 +759,12 @@ git-tree-sha1 = "b53380851c6e6664204efb2e62cd24fa5c47e4ba" uuid = "aacddb02-875f-59d6-b918-886e6ef4fbf8" version = "2.1.2+0" +[[deps.JuliaVariables]] +deps = ["MLStyle", "NameResolution"] +git-tree-sha1 = "49fb3cb53362ddadb4415e9b73926d6b40709e70" +uuid = "b14d175d-62b4-44ba-8fb7-3064adc8c3ec" +version = "0.2.4" + [[deps.KernelDensity]] deps = ["Distributions", "DocStringExtensions", "FFTW", "Interpolations", "StatsBase"] git-tree-sha1 = "9816b296736292a80b9a3200eb7fbb57aaa3917a" @@ 
-701,9 +785,9 @@ version = "3.0.0+1" [[deps.LLVM]] deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"] -git-tree-sha1 = "e7e9184b0bf0158ac4e4aa9daf00041b5909bf1a" +git-tree-sha1 = "088dd02b2797f0233d92583562ab669de8517fd1" uuid = "929cbde3-209d-540e-8aea-75f648917ca0" -version = "4.14.0" +version = "4.14.1" [[deps.LLVMExtra_jll]] deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg", "TOML"] @@ -736,9 +820,9 @@ version = "1.8.0" [[deps.LayoutPointers]] deps = ["ArrayInterface", "ArrayInterfaceOffsetArrays", "ArrayInterfaceStaticArrays", "LinearAlgebra", "ManualMemory", "SIMDTypes", "Static"] -git-tree-sha1 = "73e2e40eb02d6ccd191a8a9f8cee20db8d5df010" +git-tree-sha1 = "7e34177793212f6d64d045ee47d2883f09fffacc" uuid = "10f19ff3-798f-405d-979b-55457f8fc047" -version = "0.1.11" +version = "0.1.12" [[deps.LazyArtifacts]] deps = ["Artifacts", "Pkg"] @@ -780,9 +864,9 @@ version = "1.8.7+0" [[deps.Libglvnd_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll", "Xorg_libXext_jll"] -git-tree-sha1 = "7739f837d6447403596a75d19ed01fd08d6f56bf" +git-tree-sha1 = "6f73d1dd803986947b2c750138528a999a6c7733" uuid = "7e76a0d4-f3c7-5321-8279-8d96eeed0f29" -version = "1.3.0+3" +version = "1.6.0+0" [[deps.Libgpg_error_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] @@ -832,24 +916,24 @@ version = "3.8.0" [[deps.LogExpFunctions]] deps = ["ChainRulesCore", "ChangesOfVariables", "DocStringExtensions", "InverseFunctions", "IrrationalConstants", "LinearAlgebra"] -git-tree-sha1 = "94d9c52ca447e23eac0c0f074effbcd38830deb5" +git-tree-sha1 = "946607f84feb96220f480e0422d3484c49c00239" uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688" -version = "0.3.18" +version = "0.3.19" [[deps.Logging]] uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" [[deps.LoggingExtras]] deps = ["Dates", "Logging"] -git-tree-sha1 = "5d4d2d9904227b8bd66386c1138cf4d5ffa826bf" +git-tree-sha1 = "cedb76b37bc5a6c702ade66be44f831fa23c681e" uuid = 
"e6f89c97-d47a-5376-807f-9c37f3926c36" -version = "0.4.9" +version = "1.0.0" [[deps.LoopVectorization]] deps = ["ArrayInterface", "ArrayInterfaceCore", "ArrayInterfaceOffsetArrays", "ArrayInterfaceStaticArrays", "CPUSummary", "ChainRulesCore", "CloseOpenIntervals", "DocStringExtensions", "ForwardDiff", "HostCPUFeatures", "IfElse", "LayoutPointers", "LinearAlgebra", "OffsetArrays", "PolyesterWeave", "SIMDDualNumbers", "SIMDTypes", "SLEEFPirates", "SnoopPrecompile", "SpecialFunctions", "Static", "ThreadingUtilities", "UnPack", "VectorizationBase"] -git-tree-sha1 = "9f6030ca92d1a816e931abb657219c9fc4991a96" +git-tree-sha1 = "da5317a78e2a9f692730345cf3ff820109f406d3" uuid = "bdcacae8-1622-11e9-2a5c-532679323890" -version = "0.12.136" +version = "0.12.141" [[deps.LossFunctions]] deps = ["InteractiveUtils", "Markdown", "RecipesBase"] @@ -865,51 +949,57 @@ version = "2022.2.0+0" [[deps.MLJ]] deps = ["CategoricalArrays", "ComputationalResources", "Distributed", "Distributions", "LinearAlgebra", "MLJBase", "MLJEnsembles", "MLJIteration", "MLJModels", "MLJTuning", "OpenML", "Pkg", "ProgressMeter", "Random", "ScientificTypes", "Statistics", "StatsBase", "Tables"] -git-tree-sha1 = "025706ea81e635ac530a1d3dd365af971805bf79" +git-tree-sha1 = "9d79ef8684eb15a6fe4c3654cdb9c5de4868a81e" uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" -version = "0.18.5" +version = "0.19.0" [[deps.MLJBase]] deps = ["CategoricalArrays", "CategoricalDistributions", "ComputationalResources", "Dates", "DelimitedFiles", "Distributed", "Distributions", "InteractiveUtils", "InvertedIndices", "LinearAlgebra", "LossFunctions", "MLJModelInterface", "Missings", "OrderedCollections", "Parameters", "PrettyTables", "ProgressMeter", "Random", "ScientificTypes", "Serialization", "StatisticalTraits", "Statistics", "StatsBase", "Tables"] -git-tree-sha1 = "ace5668bc6c4fd46f3e6af67ead3778804f23e5b" +git-tree-sha1 = "decaf881165c0b3c7abf1130dfe3221ee88ef99a" uuid = "a7f614a8-145f-11e9-1d2a-a57a1082229d" -version = 
"0.20.20" +version = "0.21.2" [[deps.MLJDecisionTreeInterface]] deps = ["DecisionTree", "MLJModelInterface", "Random", "Tables"] -git-tree-sha1 = "d0d682ef8504e1ab705f10307c587239ebb20c4d" +git-tree-sha1 = "74e8076ea6f64fcb490783f2033070b18fa3466f" uuid = "c6f25543-311c-4c74-83dc-3ea6d1015661" -version = "0.2.5" +version = "0.3.0" [[deps.MLJEnsembles]] deps = ["CategoricalArrays", "CategoricalDistributions", "ComputationalResources", "Distributed", "Distributions", "MLJBase", "MLJModelInterface", "ProgressMeter", "Random", "ScientificTypesBase", "StatsBase"] -git-tree-sha1 = "ed2f724be26d0023cade9d59b55da93f528c3f26" +git-tree-sha1 = "bb8a1056b1d8b40f2f27167fc3ef6412a6719fbf" uuid = "50ed68f4-41fd-4504-931a-ed422449fee0" -version = "0.3.1" +version = "0.3.2" + +[[deps.MLJFlux]] +deps = ["CategoricalArrays", "ColorTypes", "ComputationalResources", "Flux", "MLJModelInterface", "ProgressMeter", "Random", "Statistics", "Tables"] +git-tree-sha1 = "a47257705ebca405a25320b111345a978925bcd5" +uuid = "094fc8d1-fd35-5302-93ea-dabda2abf845" +version = "0.2.7" [[deps.MLJIteration]] deps = ["IterationControl", "MLJBase", "Random", "Serialization"] -git-tree-sha1 = "024d0bd22bf4a5b273f626e89d742a9db95285ef" +git-tree-sha1 = "be6d5c71ab499a59e82d65e00a89ceba8732fcd5" uuid = "614be32b-d00c-4edb-bd02-1eb411ab5e55" -version = "0.5.0" +version = "0.5.1" [[deps.MLJLinearModels]] deps = ["DocStringExtensions", "IterativeSolvers", "LinearAlgebra", "LinearMaps", "MLJModelInterface", "Optim", "Parameters"] -git-tree-sha1 = "bfebb824a1b9a0c6d58e417e680f5e99317534e3" +git-tree-sha1 = "7c191a2975e05387da3cd12a4c8a835a3d5186f4" uuid = "6ee0df7b-362f-4a72-a706-9e79364fb692" -version = "0.7.1" +version = "0.8.0" [[deps.MLJModelInterface]] deps = ["Random", "ScientificTypesBase", "StatisticalTraits"] -git-tree-sha1 = "0a36882e73833d60dac49b00d203f73acfd50b85" +git-tree-sha1 = "c8b7e632d6754a5e36c0d94a4b466a5ba3a30128" uuid = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" -version = "1.7.0" +version = 
"1.8.0" [[deps.MLJModels]] deps = ["CategoricalArrays", "CategoricalDistributions", "Combinatorics", "Dates", "Distances", "Distributions", "InteractiveUtils", "LinearAlgebra", "MLJModelInterface", "Markdown", "OrderedCollections", "Parameters", "Pkg", "PrettyPrinting", "REPL", "Random", "RelocatableFolders", "ScientificTypes", "StatisticalTraits", "Statistics", "StatsBase", "Tables"] -git-tree-sha1 = "147a8e7939601f8c37204addbbe29f2bcfb876a8" +git-tree-sha1 = "08203fc87a7f992cee24e7a1b2353e594c73c41c" uuid = "d491faf4-2d78-11e9-2867-c94bc002c0b7" -version = "0.15.14" +version = "0.16.2" [[deps.MLJNaiveBayesInterface]] deps = ["LogExpFunctions", "MLJModelInterface", "NaiveBayes"] @@ -919,9 +1009,20 @@ version = "0.1.6" [[deps.MLJTuning]] deps = ["ComputationalResources", "Distributed", "Distributions", "LatinHypercubeSampling", "MLJBase", "ProgressMeter", "Random", "RecipesBase"] -git-tree-sha1 = "77209966cc028c1d7730001dc32bffe17a198f29" +git-tree-sha1 = "02688098bd77827b64ed8ad747c14f715f98cfc4" uuid = "03970b2e-30c4-11ea-3135-d1576263f10f" -version = "0.7.3" +version = "0.7.4" + +[[deps.MLStyle]] +git-tree-sha1 = "060ef7956fef2dc06b0e63b294f7dbfbcbdc7ea2" +uuid = "d8e11817-5142-5d16-987a-aa16d5891078" +version = "0.4.16" + +[[deps.MLUtils]] +deps = ["ChainRulesCore", "DataAPI", "DelimitedFiles", "FLoops", "FoldsThreads", "NNlib", "Random", "ShowCases", "SimpleTraits", "Statistics", "StatsBase", "Tables", "Transducers"] +git-tree-sha1 = "82c1104919d664ab1024663ad851701415300c5f" +uuid = "f1d291b0-491e-4a28-83b9-f70985020b54" +version = "0.3.1" [[deps.MacroTools]] deps = ["Markdown", "Random"] @@ -956,9 +1057,15 @@ uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" version = "2.28.0+0" [[deps.Measures]] -git-tree-sha1 = "e498ddeee6f9fdb4551ce855a46f54dbd900245f" +git-tree-sha1 = "c13304c81eec1ed3af7fc20e75fb6b26092a1102" uuid = "442fdcdd-2543-5da2-b0f3-8c86c306513e" -version = "0.3.1" +version = "0.3.2" + +[[deps.MicroCollections]] +deps = ["BangBang", "InitialValues", 
"Setfield"] +git-tree-sha1 = "4d5917a26ca33c66c8e5ca3247bd163624d35493" +uuid = "128add7d-3638-4c79-886c-908ea0c25c34" +version = "0.1.3" [[deps.Missings]] deps = ["DataAPI"] @@ -973,16 +1080,23 @@ uuid = "a63ad114-7e13-5084-954f-fe012c677804" uuid = "14a3606d-f60d-562e-9121-12d972cd8159" version = "2022.2.1" -[[deps.MyterialColors]] -git-tree-sha1 = "01d8466fb449436348999d7c6ad740f8f853a579" -uuid = "1c23619d-4212-4747-83aa-717207fae70f" -version = "0.3.0" - [[deps.NLSolversBase]] deps = ["DiffResults", "Distributed", "FiniteDiff", "ForwardDiff"] -git-tree-sha1 = "50310f934e55e5ca3912fb941dec199b49ca9b68" +git-tree-sha1 = "a0b464d183da839699f4c79e7606d9d186ec172c" uuid = "d41bc354-129a-5804-8e4c-c37616107c6c" -version = "7.8.2" +version = "7.8.3" + +[[deps.NNlib]] +deps = ["Adapt", "ChainRulesCore", "LinearAlgebra", "Pkg", "Requires", "Statistics"] +git-tree-sha1 = "37596c26f107f2fd93818166ed3dab1a2e6b2f05" +uuid = "872c559c-99b0-510c-b3b7-b6c96a88d5cd" +version = "0.8.11" + +[[deps.NNlibCUDA]] +deps = ["Adapt", "CUDA", "LinearAlgebra", "NNlib", "Random", "Statistics"] +git-tree-sha1 = "4429261364c5ea5b7308aecaa10e803ace101631" +uuid = "a00861dc-f156-4864-bf3c-e6376f28a68d" +version = "0.2.4" [[deps.NaNMath]] deps = ["OpenLibm_jll"] @@ -996,6 +1110,24 @@ git-tree-sha1 = "830c601de91378e773e7286c3a3e8964d6248657" uuid = "9bbee03b-0db5-5f46-924f-b5c9c21b8c60" version = "0.5.4" +[[deps.NameResolution]] +deps = ["PrettyPrint"] +git-tree-sha1 = "1a0fa0e9613f46c9b8c11eee38ebb4f590013c5e" +uuid = "71a1bf82-56d0-4bbc-8a3c-48b961074391" +version = "0.1.5" + +[[deps.NearestNeighborModels]] +deps = ["Distances", "FillArrays", "InteractiveUtils", "LinearAlgebra", "MLJModelInterface", "NearestNeighbors", "Statistics", "StatsBase", "Tables"] +git-tree-sha1 = "727b8f1c3f9fec6b1a805ba9bef72c73758eda02" +uuid = "636a865e-7cf4-491e-846c-de09b730eb36" +version = "0.2.1" + +[[deps.NearestNeighbors]] +deps = ["Distances", "StaticArrays"] +git-tree-sha1 = 
"440165bf08bc500b8fe4a7be2dc83271a00c0716" +uuid = "b8a86587-4115-5ab1-83bc-aa920d37bbce" +version = "0.4.12" + [[deps.NetworkLayout]] deps = ["GeometryBasics", "LinearAlgebra", "Random", "Requires", "SparseArrays"] git-tree-sha1 = "cac8fc7ba64b699c678094fa630f49b80618f625" @@ -1018,6 +1150,12 @@ git-tree-sha1 = "887579a3eb005446d514ab7aeac5d1d027658b8f" uuid = "e7412a2a-1a6e-54c0-be00-318e2571c051" version = "1.3.5+1" +[[deps.OneHotArrays]] +deps = ["Adapt", "ChainRulesCore", "GPUArraysCore", "LinearAlgebra", "MLUtils", "NNlib"] +git-tree-sha1 = "aee0130122fa7c1f3d394231376f07869f1e097c" +uuid = "0b1bfda6-eb8a-41d2-88d8-f5af5cad476f" +version = "0.2.0" + [[deps.OpenBLAS_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" @@ -1036,15 +1174,15 @@ version = "0.3.0" [[deps.OpenSSL]] deps = ["BitFlags", "Dates", "MozillaCACerts_jll", "OpenSSL_jll", "Sockets"] -git-tree-sha1 = "3c3c4a401d267b04942545b1e964a20279587fd7" +git-tree-sha1 = "df6830e37943c7aaa10023471ca47fb3065cc3c4" uuid = "4d8831e6-92b7-49fb-bdf8-b643e874388c" -version = "1.3.0" +version = "1.3.2" [[deps.OpenSSL_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] -git-tree-sha1 = "e60321e3f2616584ff98f0a4f18d98ae6f89bbb3" +git-tree-sha1 = "f6e9dba33f9f2c44e08a020b0caf6903be540004" uuid = "458c3c95-2e84-50aa-8efc-19380b2a3a95" -version = "1.1.17+0" +version = "1.1.19+0" [[deps.OpenSpecFun_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"] @@ -1054,9 +1192,15 @@ version = "0.5.5+0" [[deps.Optim]] deps = ["Compat", "FillArrays", "ForwardDiff", "LineSearches", "LinearAlgebra", "NLSolversBase", "NaNMath", "Parameters", "PositiveFactorizations", "Printf", "SparseArrays", "StatsBase"] -git-tree-sha1 = "b9fe76d1a39807fdcf790b991981a922de0c3050" +git-tree-sha1 = "1903afc76b7d01719d9c30d3c7d501b61db96721" uuid = "429524aa-4258-5aef-a3af-852621145aeb" -version = "1.7.3" +version = "1.7.4" + +[[deps.Optimisers]] 
+deps = ["ChainRulesCore", "Functors", "LinearAlgebra", "Random", "Statistics"] +git-tree-sha1 = "f1cccb9f879dd4eaa4d92b115ab793545965d763" +uuid = "3bd65402-5787-11e9-1adc-39752487f4e2" +version = "0.2.13" [[deps.Opus_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] @@ -1087,10 +1231,10 @@ uuid = "d96e819e-fc66-5662-9728-84c9c7592b0a" version = "0.12.3" [[deps.Parsers]] -deps = ["Dates"] -git-tree-sha1 = "6c01a9b494f6d2a9fc180a08b182fcb06f0958a0" +deps = ["Dates", "SnoopPrecompile"] +git-tree-sha1 = "b64719e8b4504983c7fca6cc9db3ebc8acc2a4d6" uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" -version = "2.4.2" +version = "2.5.1" [[deps.Pipe]] git-tree-sha1 = "6842804e7867b115ca9de748a0cf6b364523c16d" @@ -1122,15 +1266,21 @@ version = "1.3.1" [[deps.Plots]] deps = ["Base64", "Contour", "Dates", "Downloads", "FFMPEG", "FixedPointNumbers", "GR", "JLFzf", "JSON", "LaTeXStrings", "Latexify", "LinearAlgebra", "Measures", "NaNMath", "Pkg", "PlotThemes", "PlotUtils", "Printf", "REPL", "Random", "RecipesBase", "RecipesPipeline", "Reexport", "RelocatableFolders", "Requires", "Scratch", "Showoff", "SnoopPrecompile", "SparseArrays", "Statistics", "StatsBase", "UUIDs", "UnicodeFun", "Unzip"] -git-tree-sha1 = "0a56829d264eb1bc910cf7c39ac008b5bcb5a0d9" +git-tree-sha1 = "6a9521b955b816aa500462951aa67f3e4467248a" uuid = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" -version = "1.35.5" +version = "1.36.6" [[deps.PolyesterWeave]] deps = ["BitTwiddlingConvenienceFunctions", "CPUSummary", "IfElse", "Static", "ThreadingUtilities"] -git-tree-sha1 = "b42fb2292fbbaed36f25d33a15c8cc0b4f287fcf" +git-tree-sha1 = "050ca4aa2ca31484b51b849d8180caf8e4449c49" uuid = "1d0040c9-8b98-4ee7-8388-3f51789ca0ad" -version = "0.1.10" +version = "0.1.11" + +[[deps.Polynomials]] +deps = ["LinearAlgebra", "RecipesBase"] +git-tree-sha1 = "3010a6dd6ad4c7384d2f38c58fa8172797d879c1" +uuid = "f27b6e38-b328-58d1-80ce-0feddd5e7a45" +version = "3.2.0" [[deps.PooledArrays]] deps = ["DataAPI", "Future"] @@ -1150,16 
+1300,21 @@ git-tree-sha1 = "47e5f437cc0e7ef2ce8406ce1e7e24d44915f88d" uuid = "21216c6a-2e73-6563-6e65-726566657250" version = "1.3.0" +[[deps.PrettyPrint]] +git-tree-sha1 = "632eb4abab3449ab30c5e1afaa874f0b98b586e4" +uuid = "8162dcfd-2161-5ef2-ae6c-7681170c5f98" +version = "0.2.0" + [[deps.PrettyPrinting]] git-tree-sha1 = "4be53d093e9e37772cc89e1009e8f6ad10c4681b" uuid = "54e16d92-306c-5ea0-a30b-337be88ac337" version = "0.4.0" [[deps.PrettyTables]] -deps = ["Crayons", "Formatting", "Markdown", "Reexport", "StringManipulation", "Tables"] -git-tree-sha1 = "460d9e154365e058c4d886f6f7d6df5ffa1ea80e" +deps = ["Crayons", "Formatting", "LaTeXStrings", "Markdown", "Reexport", "StringManipulation", "Tables"] +git-tree-sha1 = "96f6db03ab535bdb901300f88335257b0018689d" uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d" -version = "2.1.2" +version = "2.2.2" [[deps.Printf]] deps = ["Unicode"] @@ -1179,9 +1334,9 @@ version = "1.7.2" [[deps.Qt5Base_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "Fontconfig_jll", "Glib_jll", "JLLWrappers", "Libdl", "Libglvnd_jll", "OpenSSL_jll", "Pkg", "Xorg_libXext_jll", "Xorg_libxcb_jll", "Xorg_xcb_util_image_jll", "Xorg_xcb_util_keysyms_jll", "Xorg_xcb_util_renderutil_jll", "Xorg_xcb_util_wm_jll", "Zlib_jll", "xkbcommon_jll"] -git-tree-sha1 = "c6c0f690d0cc7caddb74cef7aa847b824a16b256" +git-tree-sha1 = "0c03844e2231e12fda4d0086fd7cbe4098ee8dc5" uuid = "ea2cea3b-5b76-57ae-a6ef-0a8af62496e1" -version = "5.15.3+1" +version = "5.15.3+2" [[deps.QuadGK]] deps = ["DataStructures", "LinearAlgebra"] @@ -1215,17 +1370,23 @@ git-tree-sha1 = "dc84268fe0e3335a62e315a3a7cf2afa7178a734" uuid = "c84ed2f1-dad5-54f0-aa8e-dbefe2724439" version = "0.4.3" +[[deps.RealDot]] +deps = ["LinearAlgebra"] +git-tree-sha1 = "9f0a1b71baaf7650f4fa8a1d168c7fb6ee41f0c9" +uuid = "c1ae055f-0cd5-4b69-90a6-9a35b1a98df9" +version = "0.1.0" + [[deps.RecipesBase]] deps = ["SnoopPrecompile"] -git-tree-sha1 = "d12e612bba40d189cead6ff857ddb67bd2e6a387" +git-tree-sha1 = 
"18c35ed630d7229c5584b945641a73ca83fb5213" uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" -version = "1.3.1" +version = "1.3.2" [[deps.RecipesPipeline]] deps = ["Dates", "NaNMath", "PlotUtils", "RecipesBase", "SnoopPrecompile"] -git-tree-sha1 = "9b1c0c8e9188950e66fc28f40bfe0f8aac311fe0" +git-tree-sha1 = "e974477be88cb5e3040009f3767611bc6357846f" uuid = "01d81517-befc-4cb6-b9ec-a95719d0359c" -version = "0.6.7" +version = "0.6.11" [[deps.Reexport]] git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" @@ -1273,9 +1434,9 @@ version = "0.1.0" [[deps.SLEEFPirates]] deps = ["IfElse", "Static", "VectorizationBase"] -git-tree-sha1 = "938c9ecffb28338a6b8b970bda0f3806a65e7906" +git-tree-sha1 = "c8679919df2d3c71f74451321f1efea6433536cc" uuid = "476501e8-09a2-5ece-8869-fb82de89a1fa" -version = "0.6.36" +version = "0.6.37" [[deps.ScientificTypes]] deps = ["CategoricalArrays", "ColorTypes", "Dates", "Distributions", "PrettyTables", "Reexport", "ScientificTypesBase", "StatisticalTraits", "Tables"] @@ -1313,6 +1474,11 @@ version = "1.1.1" deps = ["Distributed", "Mmap", "Random", "Serialization"] uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383" +[[deps.ShowCases]] +git-tree-sha1 = "7f534ad62ab2bd48591bdeac81994ea8c445e4a5" +uuid = "605ecd9f-84a6-4c9e-81e2-4798472b76a3" +version = "0.1.0" + [[deps.Showoff]] deps = ["Dates", "Grisu"] git-tree-sha1 = "91eddf657aca81df9ae6ceb20b959ae5653ad1de" @@ -1324,6 +1490,12 @@ git-tree-sha1 = "874e8867b33a00e784c8a7e4b60afe9e037b74e1" uuid = "777ac1f9-54b0-4bf8-805c-2214025038e7" version = "1.1.0" +[[deps.SimpleTraits]] +deps = ["InteractiveUtils", "MacroTools"] +git-tree-sha1 = "5d7e3f4e11935503d3ecaf7186eac40602e7d231" +uuid = "699a6c99-e7fa-54fc-8d76-47d257e15c1d" +version = "0.9.4" + [[deps.SnoopPrecompile]] git-tree-sha1 = "f604441450a3c0569830946e5b33b78c928e1a85" uuid = "66db9d55-30c0-4569-8b51-7e840670fc0c" @@ -1334,9 +1506,9 @@ uuid = "6462fe0b-24de-5631-8697-dd941f90decc" [[deps.SortingAlgorithms]] deps = ["DataStructures"] 
-git-tree-sha1 = "b3363d7460f7d098ca0912c69b082f75625d7508" +git-tree-sha1 = "a4ada03f999bd01b3a25dcaa30b2d929fe537e00" uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c" -version = "1.0.1" +version = "1.1.0" [[deps.SparseArrays]] deps = ["LinearAlgebra", "Random"] @@ -1348,6 +1520,12 @@ git-tree-sha1 = "d75bda01f8c31ebb72df80a46c88b25d1c79c56d" uuid = "276daf66-3868-5448-9aa4-cd146d93841b" version = "2.1.7" +[[deps.SplittablesBase]] +deps = ["Setfield", "Test"] +git-tree-sha1 = "e08a62abc517eb79667d0a29dc08a3b589516bb5" +uuid = "171d559e-b47b-412a-8079-5efa626c420e" +version = "0.1.15" + [[deps.StableRNGs]] deps = ["Random", "Test"] git-tree-sha1 = "3be7d49667040add7ee151fefaf1f8c04c8c8276" @@ -1356,15 +1534,15 @@ version = "1.0.0" [[deps.Static]] deps = ["IfElse"] -git-tree-sha1 = "de4f0a4f049a4c87e4948c04acff37baf1be01a6" +git-tree-sha1 = "0559586098f3cbd2e835461254ea2fcffa4a61ba" uuid = "aedffcd0-7271-4cad-89d0-dc628f76c6d3" -version = "0.7.7" +version = "0.8.2" [[deps.StaticArrays]] deps = ["LinearAlgebra", "Random", "StaticArraysCore", "Statistics"] -git-tree-sha1 = "f86b3a049e5d05227b10e15dbb315c5b90f14988" +git-tree-sha1 = "ffc098086f35909741f71ce21d03dadf0d2bfa76" uuid = "90137ffa-7385-5640-81b9-e52037218182" -version = "1.5.9" +version = "1.5.11" [[deps.StaticArraysCore]] git-tree-sha1 = "6b7ba252635a5eff6a0b0664a41ee140a1c9e72a" @@ -1395,9 +1573,9 @@ version = "0.33.21" [[deps.StatsFuns]] deps = ["ChainRulesCore", "HypergeometricFunctions", "InverseFunctions", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"] -git-tree-sha1 = "5783b877201a82fc0014cbf381e7e6eb130473a4" +git-tree-sha1 = "89a3bfe98f5400f4ff58bb5cd1a9e46f95d08352" uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c" -version = "1.0.1" +version = "1.1.0" [[deps.StringManipulation]] git-tree-sha1 = "46da2434b41f41ac3594ee9816ce5541c6096123" @@ -1442,12 +1620,6 @@ git-tree-sha1 = "1feb45f88d133a655e001435632f019a9a1bcdb6" uuid = "62fd8b95-f654-4bbd-a8a5-9c27f68ccd50" 
version = "0.1.1" -[[deps.Term]] -deps = ["CodeTracking", "Dates", "Highlights", "InteractiveUtils", "Logging", "Markdown", "MyterialColors", "OrderedCollections", "Parameters", "ProgressLogging", "SnoopPrecompile", "Tables", "UUIDs", "UnicodeFun"] -git-tree-sha1 = "5b5d38673d148f80e7e04569a665006d3bf91cfb" -uuid = "22787eb5-b846-44ae-b979-8e399b8463ab" -version = "1.0.4" - [[deps.Test]] deps = ["InteractiveUtils", "Logging", "Random", "Serialization"] uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" @@ -1460,9 +1632,9 @@ version = "0.5.0" [[deps.TimerOutputs]] deps = ["ExprTools", "Printf"] -git-tree-sha1 = "9dfcb767e17b0849d6aaf85997c98a5aea292513" +git-tree-sha1 = "f2fd3f288dfc6f507b0c3a2eb3bac009251e548b" uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f" -version = "0.5.21" +version = "0.5.22" [[deps.TranscodingStreams]] deps = ["Random", "Test"] @@ -1470,10 +1642,16 @@ git-tree-sha1 = "8a75929dcd3c38611db2f8d08546decb514fcadf" uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa" version = "0.9.9" +[[deps.Transducers]] +deps = ["Adapt", "ArgCheck", "BangBang", "Baselet", "CompositionsBase", "DefineSingletons", "Distributed", "InitialValues", "Logging", "Markdown", "MicroCollections", "Requires", "Setfield", "SplittablesBase", "Tables"] +git-tree-sha1 = "c42fa452a60f022e9e087823b47e5a5f8adc53d5" +uuid = "28d57a85-8fef-5791-bfe6-a80928e7c999" +version = "0.4.75" + [[deps.URIs]] -git-tree-sha1 = "e59ecc5a41b000fa94423a578d29290c7266fc10" +git-tree-sha1 = "ac00576f90d8a259f2c9d823e91d1de3fd44d348" uuid = "5c2747f8-b7ea-4ff2-ba2e-563bfd36b1d4" -version = "1.4.0" +version = "1.4.1" [[deps.UUIDs]] deps = ["Random", "SHA"] @@ -1495,15 +1673,15 @@ version = "0.4.1" [[deps.UnicodePlots]] deps = ["ColorSchemes", "ColorTypes", "Contour", "Crayons", "Dates", "FileIO", "FreeType", "LinearAlgebra", "MarchingCubes", "NaNMath", "Printf", "Requires", "SnoopPrecompile", "SparseArrays", "StaticArrays", "StatsBase", "Unitful"] -git-tree-sha1 = "390b2e8e5535f5beb50885d1a1059f460547d3a5" 
+git-tree-sha1 = "e20b01d50cd162593cfd9691628c830769f68987" uuid = "b8865327-cd53-5732-bb35-84acbb429228" -version = "3.1.6" +version = "3.3.1" [[deps.Unitful]] deps = ["ConstructionBase", "Dates", "LinearAlgebra", "Random"] -git-tree-sha1 = "d57a4ed70b6f9ff1da6719f5f2713706d57e0d66" +git-tree-sha1 = "d670a70dd3cdbe1c1186f2f17c9a68a7ec24838c" uuid = "1986cc42-f94f-5a68-af5c-568840ba703d" -version = "1.12.0" +version = "1.12.2" [[deps.Unzip]] git-tree-sha1 = "ca0969166a028236229f63514992fc073799bb78" @@ -1512,9 +1690,9 @@ version = "0.2.0" [[deps.VectorizationBase]] deps = ["ArrayInterface", "CPUSummary", "HostCPUFeatures", "IfElse", "LayoutPointers", "Libdl", "LinearAlgebra", "SIMDTypes", "Static"] -git-tree-sha1 = "ba9d398034a2ba78059391492730889c6e45cf15" +git-tree-sha1 = "fc79d0f926592ecaeaee164f6a4ca81b51115c3b" uuid = "3d5dd08c-fd9d-11e8-17fa-ed2836048c2f" -version = "0.21.54" +version = "0.21.56" [[deps.Wayland_jll]] deps = ["Artifacts", "Expat_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg", "XML2_jll"] @@ -1683,6 +1861,18 @@ git-tree-sha1 = "e45044cd873ded54b6a5bac0eb5c971392cf1927" uuid = "3161d3a3-bdf6-5164-811a-617609db77b4" version = "1.5.2+0" +[[deps.Zygote]] +deps = ["AbstractFFTs", "ChainRules", "ChainRulesCore", "DiffRules", "Distributed", "FillArrays", "ForwardDiff", "GPUArrays", "GPUArraysCore", "IRTools", "InteractiveUtils", "LinearAlgebra", "LogExpFunctions", "MacroTools", "NaNMath", "Random", "Requires", "SparseArrays", "SpecialFunctions", "Statistics", "ZygoteRules"] +git-tree-sha1 = "a6f1287943ac05fae56fa06049d1a7846dfbc65f" +uuid = "e88e6eb3-aa80-5325-afca-941959d7151f" +version = "0.6.51" + +[[deps.ZygoteRules]] +deps = ["MacroTools"] +git-tree-sha1 = "8c1a8e4dfacb1fd631745552c8db35d0deb09ea0" +uuid = "700de1a5-db45-46bc-99cf-38207098b444" +version = "0.2.2" + [[deps.fzf_jll]] deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] git-tree-sha1 = "868e669ccb12ba16eaf50cb2957ee2ff61261c56" diff --git a/docs/Project.toml b/docs/Project.toml 
index ad9ad43..6291faf 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -1,15 +1,21 @@ [deps] +CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597" ConformalPrediction = "98bfc277-1877-43dc-819b-a3e38c30242f" DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" DecisionTree = "7806a523-6efd-50cb-b5f6-3fa6f1930dbb" +Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" EvoTrees = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" +Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c" MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7" MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d" MLJDecisionTreeInterface = "c6f25543-311c-4c74-83dc-3ea6d1015661" +MLJFlux = "094fc8d1-fd35-5302-93ea-dabda2abf845" MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692" MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" MLJNaiveBayesInterface = "33e4bacb-b9e2-458e-9a13-5d9a90b235fa" NaiveBayes = "9bbee03b-0db5-5f46-924f-b5c9c21b8c60" +NearestNeighborModels = "636a865e-7cf4-491e-846c-de09b730eb36" PlotThemes = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a" Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" +Polynomials = "f27b6e38-b328-58d1-80ce-0feddd5e7a45" diff --git a/docs/make.jl b/docs/make.jl index 3f5b8f5..e1a6a8f 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -32,8 +32,12 @@ makedocs(; ), pages=[ "🏠 Home" => "index.md", + "🤔 Tutorials" => [ + "Classification" => "classification.md", + "Regression" => "regression.md", + ], + "📖 Reference" => "reference.md", "🛠 Contribute" => "contribute.md", - "📖 Library" => "reference.md", ], ) diff --git a/docs/_metadata.yml b/docs/src/_metadata.yml similarity index 51% rename from docs/_metadata.yml rename to docs/src/_metadata.yml index f591beb..b4ac9c2 100644 --- a/docs/_metadata.yml +++ b/docs/src/_metadata.yml @@ -1,5 +1,4 @@ format: commonmark: variant: -raw_html - wrap: preserve - self-contained: true \ No newline at end of file + wrap: none \ No newline at end of file diff --git 
a/docs/src/classification.md b/docs/src/classification.md index 546810e..99a1994 100644 --- a/docs/src/classification.md +++ b/docs/src/classification.md @@ -1,41 +1,161 @@ +# Classification + +``` @meta +CurrentModule = ConformalPrediction +``` + +This tutorial is based in parts on this [blog post](https://www.paltmeyer.com/blog/posts/conformal-prediction/). + +### Split Conformal Classification + +We consider a simple binary classification problem. Let (*X*_(*i*),*Y*_(*i*)), *i* = 1, ..., *n* denote our feature-label pairs and let *μ* : 𝒳 ↦ 𝒴 denote the mapping from features to labels. For illustration purposes we will use the moons dataset 🌙. Using [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) we first generate the data and split into into a training and test set: + ``` julia using MLJ -X, y = MLJ.make_blobs(1000, 2; centers=3, cluster_std=1.0) -train, test = partition(eachindex(y), 0.4, 0.4, shuffle=true) +using Random +Random.seed!(123) + +# Data: +X, y = make_moons(500; noise=0.15) +train, test = partition(eachindex(y), 0.8, shuffle=true) ``` +Here we will use a specific case of CP called *split conformal prediction* which can then be summarized as follows:[1] + +1. Partition the training into a proper training set and a separate calibration set: 𝒟_(*n*) = 𝒟^(train) ∪ 𝒟^(cali). +2. Train the machine learning model on the proper training set: *μ̂*_(*i* ∈ 𝒟^(train))(*X*_(*i*),*Y*_(*i*)). +3. Compute nonconformity scores, 𝒮, using the calibration data 𝒟^(cali) and the fitted model *μ̂*_(*i* ∈ 𝒟^(train)). +4. For a user-specified desired coverage ratio (1−*α*) compute the corresponding quantile, *q̂*, of the empirical distribution of nonconformity scores, 𝒮. +5. 
For the given quantile and test sample *X*_(test), form the corresponding conformal prediction set: + +*C*(*X*_(test)) = {*y* : *s*(*X*_(test),*y*) ≤ *q̂*}   (1) + +This is the default procedure used for classification and regression in [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl). + +Now let’s take this to our 🌙 data. To illustrate the package functionality we will demonstrate the envisioned workflow. We first define our atomic machine learning model following standard [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) conventions. Using [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl) we then wrap our atomic model in a conformal model using the standard API call `conformal_model(model::Supervised; kwargs...)`. To train and predict from our conformal model we can then rely on the conventional [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) procedure again. In particular, we wrap our conformal model in data (turning it into a machine) and then fit it on the training set. Finally, we use our machine to predict the label for a new test sample `Xtest`: + ``` julia -EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees -model = EvoTreeClassifier() +# Model: +KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels +model = KNNClassifier(;K=50) + +# Training: +using ConformalPrediction +conf_model = conformal_model(model; coverage=.9) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) + +# Conformal Prediction: +Xtest = selectrows(X, first(test)) +ytest = y[first(test)] +predict(mach, Xtest)[1] ``` + import NearestNeighborModels ✔ + + UnivariateFinite{Multiclass{2}} + ┌ ┐ + 0 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.94 + └ ┘ + +The final predictions are set-valued. While the softmax output remains unchanged for the `SimpleInductiveClassifier`, the size of the prediction set depends on the chosen coverage rate, (1−*α*). 
+ +When specifying a coverage rate very close to one, the prediction set will typically include many (in some cases all) of the possible labels. Below, for example, both classes are included in the prediction set when setting the coverage rate equal to (1−*α*)=1.0. This is intuitive, since high coverage quite literally requires that the true label is covered by the prediction set with high probability. + ``` julia -using ConformalPrediction -conf_model = conformal_model(model) +conf_model = conformal_model(model; coverage=coverage) mach = machine(conf_model, X, y) fit!(mach, rows=train) + +# Conformal Prediction: +Xtest = (x1=[1],x2=[0]) +predict(mach, Xtest)[1] ``` + UnivariateFinite{Multiclass{2}} + ┌ ┐ + 0 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5 + 1 ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 0.5 + └ ┘ + +Conversely, for low coverage rates, prediction sets can also be empty. For a choice of (1−*α*)=0.1, for example, the prediction set for our test sample is empty. This is a bit difficult to think about intuitively and I have not yet come across a satisfactory, intuitive interpretation.[2] When the prediction set is empty, the `predict` call currently returns `missing`: + ``` julia -rows = rand(test, 10) -Xtest = selectrows(X, rows) -ytest = y[rows] -predict(mach, Xtest) -``` - - ╭───────────────────────────────────────────────────────────────────╮ - │ │ - │ (1) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │ - │ (2) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │ - │ (3) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │ - │ (4) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │ - │ (5) UnivariateFinite{Multiclass {#90CAF9}3} (1=>0.82{/#90CAF9}) │ - │ (6) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │ - │ (7) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │ - │ (8) UnivariateFinite{Multiclass {#90CAF9}3} (2=>0.82{/#90CAF9}) │ - │ (9) UnivariateFinite{Multiclass {#90CAF9}3} 
(1=>0.82{/#90CAF9}) │ - │ (10) UnivariateFinite{Multiclass {#90CAF9}3} (3=>0.82{/#90CAF9}) │ - │ │ - │ │ - ╰────────────────────────────────────────────────────── 10 items ───╯ +conf_model = conformal_model(model; coverage=coverage) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) + +# Conformal Prediction: +predict(mach, Xtest)[1] +``` + + missing + +``` julia +cov_ = .9 +conf_model = conformal_model(model; coverage=cov_) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) +Markdown.parse(""" +The following chart shows the resulting predicted probabilities for ``y=1`` (left) and set size (right) for a choice of ``(1-\\alpha)``=$cov_. +""") +``` + +The following chart shows the resulting predicted probabilities for *y* = 1 (left) and set size (right) for a choice of (1−*α*)=0.9. + +``` julia +using Plots +p_proba = plot(mach.model, mach.fitresult, X, y) +p_set_size = plot(mach.model, mach.fitresult, X, y; plot_set_size=true) +plot(p_proba, p_set_size, size=(800,250)) +``` + +![](classification_files/figure-commonmark/cell-10-output-1.svg) + +The animation below should provide some more intuition as to what exactly is happening here. It illustrates the effect of the chosen coverage rate on the predicted softmax output and the set size in the two-dimensional feature space. Contours are overlayed with the moon data points (including test data). The two samples highlighted in red, *X*₁ and *X*₂, have been manually added for illustration purposes. Let’s look at these one by one. + +Firstly, note that *X*₁ (red cross) falls into a region of the domain that is characterized by high predictive uncertainty. It sits right at the bottom-right corner of our class-zero moon 🌜 (orange), a region that is almost entirely enveloped by our class-one moon 🌛 (green). 
For low coverage rates the prediction set for *X*₁ is empty: on the left-hand side this is indicated by the missing contour for the softmax probability; on the right-hand side we can observe that the corresponding set size is indeed zero. For high coverage rates the prediction set includes both *y* = 0 and *y* = 1, indicative of the fact that the conformal classifier is uncertain about the true label. + +With respect to *X*₂, we observe that while also sitting on the fringe of our class-zero moon, this sample populates a region that is not fully enveloped by data points from the opposite class. In this region, the underlying atomic classifier can be expected to be more certain about its predictions, but still not highly confident. How is this reflected by our corresponding conformal prediction sets? + +``` julia +Xtest_2 = (x1=[-0.5],x2=[0.25]) +p̂_2 = pdf(predict(mach, Xtest_2)[1], 0) +``` + +Well, for low coverage rates (roughly  \< 0.9) the conformal prediction set does not include *y* = 0: the set size is zero (right panel). Only for higher coverage rates do we have *C*(*X*₂) = {0}: the coverage rate is high enough to include *y* = 0, but the corresponding softmax probability is still fairly low. For example, for (1−*α*) = 0.9 we have *p̂*(*y*=0|*X*₂) = 0.72. + +These two examples illustrate an interesting point: for regions characterized by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive. 
+ +``` julia +# Setup +coverages = range(0.75,1.0,length=5) +n = 100 +x1_range = range(extrema(X.x1)...,length=n) +x2_range = range(extrema(X.x2)...,length=n) + +anim = @animate for coverage in coverages + conf_model = conformal_model(model; coverage=coverage) + mach = machine(conf_model, X, y) + fit!(mach, rows=train) + # Probabilities: + p1 = plot(mach.model, mach.fitresult, X, y) + scatter!(p1, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6) + scatter!(p1, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6) + p2 = plot(mach.model, mach.fitresult, X, y; plot_set_size=true) + scatter!(p2, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6) + scatter!(p2, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6) + plot(p1, p2, plot_title="(1-α)=$(round(coverage,digits=2))", size=(800,300)) +end + +gif(anim, joinpath(www_path,"classification.gif"), fps=1) +``` + +The effect of the coverage rate on the conformal prediction set. Softmax probabilities are shown on the left. The size of the prediction set is shown on the right. + +![](www/classification.gif) + +[1] In other places split conformal prediction is sometimes referred to as *inductive* conformal prediction. + +[2] Any thoughts/comments welcome! diff --git a/docs/src/classification.qmd b/docs/src/classification.qmd index 0dc31b7..b5ab560 100644 --- a/docs/src/classification.qmd +++ b/docs/src/classification.qmd @@ -1,32 +1,187 @@ +# Classification + +```@meta +CurrentModule = ConformalPrediction +``` + ```{julia} #| echo: false using Pkg; Pkg.activate("docs") using Plots theme(:wong) +www_path = "docs/src/www" # output path for files don't get automatically saved in auto-generated path (e.g. GIFs) ``` + +This tutorial is based in parts on this [blog post](https://www.paltmeyer.com/blog/posts/conformal-prediction/). + +### Split Conformal Classification {#sec-scp} + +We consider a simple binary classification problem. 
Let $(X_i, Y_i), \ i=1,...,n$ denote our feature-label pairs and let $\mu: \mathcal{X} \mapsto \mathcal{Y}$ denote the mapping from features to labels. For illustration purposes we will use the moons dataset 🌙. Using [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) we first generate the data and split it into a training and test set: + ```{julia} using MLJ -X, y = MLJ.make_blobs(1000, 2; centers=3, cluster_std=1.0) -train, test = partition(eachindex(y), 0.4, 0.4, shuffle=true) +using Random +Random.seed!(123) + +# Data: +X, y = make_moons(500; noise=0.15) +train, test = partition(eachindex(y), 0.8, shuffle=true) ``` +Here we will use a specific case of CP called *split conformal prediction* which can then be summarized as follows:^[In other places split conformal prediction is sometimes referred to as *inductive* conformal prediction.] + +1. Partition the training data into a proper training set and a separate calibration set: $\mathcal{D}_n=\mathcal{D}^{\text{train}} \cup \mathcal{D}^{\text{cali}}$. +2. Train the machine learning model on the proper training set: $\hat\mu_{i \in \mathcal{D}^{\text{train}}}(X_i,Y_i)$. +3. Compute nonconformity scores, $\mathcal{S}$, using the calibration data $\mathcal{D}^{\text{cali}}$ and the fitted model $\hat\mu_{i \in \mathcal{D}^{\text{train}}}$. +4. For a user-specified desired coverage ratio $(1-\alpha)$ compute the corresponding quantile, $\hat{q}$, of the empirical distribution of nonconformity scores, $\mathcal{S}$. +5. For the given quantile and test sample $X_{\text{test}}$, form the corresponding conformal prediction set: + +$$ +C(X_{\text{test}})=\{y:s(X_{\text{test}},y) \le \hat{q}\} +$$ {#eq-set} + +This is the default procedure used for classification and regression in [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl). + +Now let's take this to our 🌙 data. To illustrate the package functionality we will demonstrate the envisioned workflow. 
We first define our atomic machine learning model following standard [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) conventions. Using [`ConformalPrediction.jl`](https://github.com/pat-alt/ConformalPrediction.jl) we then wrap our atomic model in a conformal model using the standard API call `conformal_model(model::Supervised; kwargs...)`. To train and predict from our conformal model we can then rely on the conventional [`MLJ.jl`](https://alan-turing-institute.github.io/MLJ.jl/v0.18/) procedure again. In particular, we wrap our conformal model in data (turning it into a machine) and then fit it on the training set. Finally, we use our machine to predict the label for a new test sample `Xtest`: + ```{julia} -EvoTreeClassifier = @load EvoTreeClassifier pkg=EvoTrees -model = EvoTreeClassifier() +#| output: true + +# Model: +KNNClassifier = @load KNNClassifier pkg=NearestNeighborModels +model = KNNClassifier(;K=50) + +# Training: +using ConformalPrediction +conf_model = conformal_model(model; coverage=.9) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) + +# Conformal Prediction: +Xtest = selectrows(X, first(test)) +ytest = y[first(test)] +predict(mach, Xtest)[1] ``` +The final predictions are set-valued. While the softmax output remains unchanged for the `SimpleInductiveClassifier`, the size of the prediction set depends on the chosen coverage rate, $(1-\alpha)$. ```{julia} -using ConformalPrediction -conf_model = conformal_model(model) +#| echo: false +#| output: true + +coverage = 1.0 +using Markdown +Markdown.parse(""" +When specifying a coverage rate very close to one, the prediction set will typically include many (in some cases all) of the possible labels. Below, for example, both classes are included in the prediction set when setting the coverage rate equal to ``(1-\\alpha)``=$coverage. This is intuitive, since high coverage quite literally requires that the true label is covered by the prediction set with high probability. 
+""") +``` + +```{julia} +#| output: true + +conf_model = conformal_model(model; coverage=coverage) mach = machine(conf_model, X, y) fit!(mach, rows=train) + +# Conformal Prediction: +Xtest = (x1=[1],x2=[0]) +predict(mach, Xtest)[1] ``` ```{julia} +#| echo: false #| output: true -rows = rand(test, 10) -Xtest = selectrows(X, rows) -ytest = y[rows] -predict(mach, Xtest) -``` \ No newline at end of file + +coverage = .1 +using Markdown +Markdown.parse(""" +Conversely, for low coverage rates, prediction sets can also be empty. For a choice of ``(1-\\alpha)``=$coverage, for example, the prediction set for our test sample is empty. This is a bit difficult to think about intuitively and I have not yet come across a satisfactory, intuitive interpretation.^[Any thoughts/comments welcome!] When the prediction set is empty, the `predict` call currently returns `missing`: +""") +``` + +```{julia} +#| output: true + +conf_model = conformal_model(model; coverage=coverage) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) + +# Conformal Prediction: +predict(mach, Xtest)[1] +``` + +```{julia} +#| output: true + +cov_ = .9 +conf_model = conformal_model(model; coverage=cov_) +mach = machine(conf_model, X, y) +fit!(mach, rows=train) +Markdown.parse(""" +The following chart shows the resulting predicted probabilities for ``y=1`` (left) and set size (right) for a choice of ``(1-\\alpha)``=$cov_. +""") +``` + +```{julia} +#| output: true + +using Plots +p_proba = plot(mach.model, mach.fitresult, X, y) +p_set_size = plot(mach.model, mach.fitresult, X, y; plot_set_size=true) +plot(p_proba, p_set_size, size=(800,250)) +``` + + +The animation below should provide some more intuition as to what exactly is happening here. It illustrates the effect of the chosen coverage rate on the predicted softmax output and the set size in the two-dimensional feature space. Contours are overlaid with the moon data points (including test data). 
The two samples highlighted in red, $X_1$ and $X_2$, have been manually added for illustration purposes. Let's look at these one by one. + +Firstly, note that $X_1$ (red cross) falls into a region of the domain that is characterized by high predictive uncertainty. It sits right at the bottom-right corner of our class-zero moon 🌜 (orange), a region that is almost entirely enveloped by our class-one moon 🌛 (green). For low coverage rates the prediction set for $X_1$ is empty: on the left-hand side this is indicated by the missing contour for the softmax probability; on the right-hand side we can observe that the corresponding set size is indeed zero. For high coverage rates the prediction set includes both $y=0$ and $y=1$, indicative of the fact that the conformal classifier is uncertain about the true label. + +With respect to $X_2$, we observe that while also sitting on the fringe of our class-zero moon, this sample populates a region that is not fully enveloped by data points from the opposite class. In this region, the underlying atomic classifier can be expected to be more certain about its predictions, but still not highly confident. How is this reflected by our corresponding conformal prediction sets? + +```{julia} +#| code-fold: true + +Xtest_2 = (x1=[-0.5],x2=[0.25]) +p̂_2 = pdf(predict(mach, Xtest_2)[1], 0) +``` + +```{julia} +#| echo: false +#| output: true + +Markdown.parse(""" +Well, for low coverage rates (roughly ``<0.9``) the conformal prediction set does not include ``y=0``: the set size is zero (right panel). Only for higher coverage rates do we have ``C(X_2)=\\{0\\}``: the coverage rate is high enough to include ``y=0``, but the corresponding softmax probability is still fairly low. 
For example, for ``(1-\\alpha)=$(cov_)`` we have ``\\hat{p}(y=0|X_2)=$(p̂_2).`` +""") +``` + +These two examples illustrate an interesting point: for regions characterized by high predictive uncertainty, conformal prediction sets are typically empty (for low coverage) or large (for high coverage). While set-valued predictions may be something to get used to, this notion is overall intuitive. + +```{julia} +#| output: true +#| label: fig-anim +#| fig-cap: "The effect of the coverage rate on the conformal prediction set. Softmax probabilities are shown on the left. The size of the prediction set is shown on the right." + +# Setup +coverages = range(0.75,1.0,length=5) +n = 100 +x1_range = range(extrema(X.x1)...,length=n) +x2_range = range(extrema(X.x2)...,length=n) + +anim = @animate for coverage in coverages + conf_model = conformal_model(model; coverage=coverage) + mach = machine(conf_model, X, y) + fit!(mach, rows=train) + # Probabilities: + p1 = plot(mach.model, mach.fitresult, X, y) + scatter!(p1, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6) + scatter!(p1, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6) + p2 = plot(mach.model, mach.fitresult, X, y; plot_set_size=true) + scatter!(p2, Xtest.x1, Xtest.x2, ms=6, c=:red, label="X₁", shape=:cross, msw=6) + scatter!(p2, Xtest_2.x1, Xtest_2.x2, ms=6, c=:red, label="X₂", shape=:diamond, msw=6) + plot(p1, p2, plot_title="(1-α)=$(round(coverage,digits=2))", size=(800,300)) +end + +gif(anim, joinpath(www_path,"classification.gif"), fps=1) +``` + +![](www/classification.gif) diff --git a/docs/src/classification_files/figure-commonmark/cell-10-output-1.svg b/docs/src/classification_files/figure-commonmark/cell-10-output-1.svg new file mode 100644 index 0000000..de81765 --- /dev/null +++ b/docs/src/classification_files/figure-commonmark/cell-10-output-1.svgdiff --git a/docs/src/classification_files/figure-commonmark/cell-9-output-1.svg 
b/docs/src/classification_files/figure-commonmark/cell-9-output-1.svg new file mode 100644 index 0000000..8791f3f --- /dev/null +++ b/docs/src/classification_files/figure-commonmark/cell-9-output-1.svgdiff --git a/docs/src/contribute_files/figure-commonmark/mermaid-figure-1.png b/docs/src/contribute_files/figure-commonmark/mermaid-figure-1.png index dfc3a35..9abfa0a 100644 Binary files a/docs/src/contribute_files/figure-commonmark/mermaid-figure-1.png and b/docs/src/contribute_files/figure-commonmark/mermaid-figure-1.png differ diff --git a/docs/src/index.md b/docs/src/index.md index 35ef0fe..7d71acd 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -108,27 +108,22 @@ fit!(mach, rows=train) Predictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` samples. Each tuple contains the lower and upper bound for the prediction interval. ``` julia -n = 10 +n = 5 Xtest = selectrows(X, first(test,n)) ytest = y[first(test,n)] predict(mach, Xtest) ``` - ╭─────────────────────────────────────────────────────────────────╮ - │ │ - │ (1) ([0.14395897640483468], [1.5537237281612537]) │ - │ (2) ([-0.539687877793372], [0.8700768739630471]) │ - │ (3) ([-0.46442052745067525], [0.9453442243057439]) │ - │ (4) ([0.010529843675146089], [1.420294595431565]) │ - │ (5) ([0.07301045762431613], [1.4827752093807351]) │ - │ (6) ([-0.012020120998203487], [1.3977446307582158]) │ - │ (7) ([0.5297045560243977], [1.9394693077808167]) │ - │ (8) ([-0.46442052745067525], [0.9453442243057439]) │ - │ (9) ([-0.09600489213468855], [1.3137598596217306]) │ - │ (10) ([0.010529843675146089], [1.420294595431565]) │ - │ │ - │ │ - ╰──────────────────────────────────────────────────── 10 items ───╯ + ╭─────────────────────────────────────────────────────────╮ + │ │ + │ (1) (1.2801183281465092, 2.0024286641173816) │ + │ (2) (0.8012756658949756, 1.5235860018658482) │ + │ (3) (1.1850387604493555, 1.9073490964202282) │ + │ (4) (1.1185514282818692, 
1.8408617642527418) │ + │ (5) (1.1651738766694149, 1.8874842126402875) │ + │ │ + │ │ + ╰───────────────────────────────────────────── 5 items ───╯ ## 🛠 Contribute diff --git a/docs/src/intro.md b/docs/src/intro.md index a0a13fa..d37b5cd 100644 --- a/docs/src/intro.md +++ b/docs/src/intro.md @@ -1,9 +1,17 @@ -`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/). Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. +`ConformalPrediction.jl` is a package for Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) Blaom et al. (2020). Conformal Prediction is distribution-free, easy-to-understand, easy-to-use and model-agnostic. -## Installation 🚩 +# 📖 Background -You can install the first stable release from the general registry: +Conformal Prediction is a scalable frequentist approach to uncertainty quantification and coverage control. It promises to be an easy-to-understand, distribution-free and model-agnostic way to generate statistically rigorous uncertainty estimates. Interestingly, it can even be used to complement Bayesian methods. + +The animation below is lifted from a small blog post that introduces the topic and the package (\[[TDS](https://towardsdatascience.com/conformal-prediction-in-julia-351b81309e30)\], \[[Quarto](https://www.paltmeyer.com/blog/posts/conformal-prediction/#fig-anim)\]). It shows conformal prediction sets for two different samples and changing coverage rates. Standard conformal classifiers produce set-valued predictions: for ambiguous samples these sets are typically large (for high coverage) or empty (for low coverage). 
+ +![Conformal Prediction in action: Prediction sets for two different samples and changing coverage rates. As coverage grows, so does the size of the prediction sets.](https://raw.githubusercontent.com/pat-alt/blog/main/posts/conformal-prediction/www/medium.gif) + +## 🚩 Installation + +You can install the latest stable release from the general registry: ``` julia using Pkg @@ -17,9 +25,9 @@ using Pkg Pkg.add(url="https://github.com/pat-alt/ConformalPrediction.jl") ``` -## Status 🔁 +## 🔁 Status -This package is in its very early stages of development and therefore still subject to changes to the core architecture. The following approaches have been implemented in the development version: +This package is in its early stages of development and therefore still subject to changes to the core architecture and API. The following CP approaches have been implemented in the development version: **Regression**: @@ -36,9 +44,34 @@ This package is in its very early stages of development and therefore still subj - Inductive (LABEL (Sadinle, Lei, and Wasserman 2019)) - Adaptive Inductive -I have only tested it for a few of the supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/). +The package has been tested for the following supervised models offered by [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/). + +**Regression**: + +``` julia +using ConformalPrediction +keys(tested_atomic_models[:regression]) +``` -## Usage Example 🔍 + KeySet for a Dict{Symbol, Expr} with 4 entries. Keys: + :nearest_neighbor + :evo_tree + :light_gbm + :decision_tree + +**Classification**: + +``` julia +keys(tested_atomic_models[:classification]) +``` + + KeySet for a Dict{Symbol, Expr} with 4 entries. Keys: + :nearest_neighbor + :evo_tree + :light_gbm + :decision_tree + +## 🔍 Usage Example To illustrate the intended use of the package, let’s have a quick look at a simple regression problem. 
Using [MLJ](https://alan-turing-institute.github.io/MLJ.jl/dev/) we first generate some synthetic data and then determine indices for our training, calibration and test data: @@ -67,32 +100,29 @@ fit!(mach, rows=train) Predictions can then be computed using the generic `predict` method. The code below produces predictions for the first `n` samples. Each tuple contains the lower and upper bound for the prediction interval. ``` julia -n = 10 +n = 5 Xtest = selectrows(X, first(test,n)) ytest = y[first(test,n)] predict(mach, Xtest) ``` - ╭─────────────────────────────────────────────────────────────────╮ - │ │ - │ (1) ([-0.20063113789390163], [1.323655530145934]) │ - │ (2) ([-0.061147489871723804], [1.4631391781681118]) │ - │ (3) ([-1.4486105066363675], [0.07567616140346822]) │ - │ (4) ([-0.7160881365817455], [0.8081985314580902]) │ - │ (5) ([-1.7173644161988695], [-0.19307774815903367]) │ - │ (6) ([-1.2158809697881832], [0.3084056982516525]) │ - │ (7) ([-1.7173644161988695], [-0.19307774815903367]) │ - │ (8) ([0.26510754559144056], [1.7893942136312764]) │ - │ (9) ([-0.8716996456392521], [0.6525870224005836]) │ - │ (10) ([0.43084861624955606], [1.9551352842893919]) │ - │ │ - │ │ - ╰──────────────────────────────────────────────────── 10 items ───╯ - -## Contribute 🛠 + ╭──────────────────────────────────────────────────────────╮ + │ │ + │ (1) (-0.9864061984981062, 2.2503222170961554) │ + │ (2) (-0.7192196826151477, 2.5175087329791137) │ + │ (3) (-0.33838267507136344, 2.898345740522898) │ + │ (4) (-2.838413186252051, 0.39831522934221053) │ + │ (5) (-0.7192196826151477, 2.5175087329791137) │ + │ │ + │ │ + ╰────────────────────────────────────────────── 5 items ───╯ + +## 🛠 Contribute Contributions are welcome! Please follow the [SciML ColPrac guide](https://github.com/SciML/ColPrac). -## References 🎓 +## 🎓 References + +Blaom, Anthony D., Franz Kiraly, Thibaut Lienart, Yiannis Simillides, Diego Arenas, and Sebastian J. Vollmer. 2020. 
“MLJ: A Julia Package for Composable Machine Learning.” *Journal of Open Source Software* 5 (55): 2704. . Sadinle, Mauricio, Jing Lei, and Larry Wasserman. 2019. “Least Ambiguous Set-Valued Classifiers with Bounded Error Levels.” *Journal of the American Statistical Association* 114 (525): 223–34. diff --git a/docs/src/intro.qmd b/docs/src/intro.qmd index 5ff0e97..ec4c99e 100644 --- a/docs/src/intro.qmd +++ b/docs/src/intro.qmd @@ -97,7 +97,7 @@ Predictions can then be computed using the generic `predict` method. The code be ```{julia} #| output: true -n = 10 +n = 5 Xtest = selectrows(X, first(test,n)) ytest = y[first(test,n)] predict(mach, Xtest) diff --git a/docs/src/regression.md b/docs/src/regression.md new file mode 100644 index 0000000..5c7dc9a --- /dev/null +++ b/docs/src/regression.md @@ -0,0 +1,83 @@ + +# Regression + +``` @meta +CurrentModule = ConformalPrediction +``` + +This tutorial mostly replicates this [tutorial](https://mapie.readthedocs.io/en/latest/examples_regression/4-tutorials/plot_main-tutorial-regression.html#) from MAPIE. + +## Data + +We begin by generating some synthetic regression data below: + +``` julia +# Regression data: + +# Inputs: +N = 600 +xmax = 3.0 +using Distributions +d = Uniform(-xmax, xmax) +X = rand(d, N) +X = reshape(X, :, 1) + +# Outputs: +noise = 0.5 +fun(X) = X * sin(X) +ε = randn(N) .* noise +y = @.(fun(X)) + ε +using MLJ +train, test = partition(eachindex(y), 0.4, 0.4, shuffle=true) + +using Plots +scatter(X, y, label="Observed") +xrange = range(-xmax,xmax,length=N) +plot!(xrange, @.(fun(xrange)), lw=4, label="Ground truth", ls=:dash, colour=:black) +``` + +## Model + +To model this data we will use polynomial regression. There is currently no out-of-the-box support for polynomial feature transformations in `MLJ`, but it is easy enough to add a little helper function for this. Note how we define a linear pipeline `pipe` here. 
Since pipelines in `MLJ` are just models, we can use the generated object as an input to `conformal_model` below. + +``` julia +LinearRegressor = @load LinearRegressor pkg=MLJLinearModels +degree_polynomial = 10 +polynomial_features(X, degree::Int) = reduce(hcat, map(i -> X.^i, 1:degree)) +pipe = (X -> MLJ.table(polynomial_features(MLJ.matrix(X), degree_polynomial))) |> LinearRegressor() +``` + +Next, we conformalize our polynomial regressor using every available approach (except the Naive approach): + +``` julia +using ConformalPrediction +conformal_models = merge(values(available_models[:regression])...) +delete!(conformal_models, :naive) +# delete!(conformal_models, :jackknife) +results = Dict() +for _mod in keys(conformal_models) + conf_model = conformal_model(pipe; method=_mod, coverage=0.95) + mach = machine(conf_model, X, y) + fit!(mach, rows=train) + results[_mod] = mach +end +``` + +Finally, let us look at the resulting conformal predictions in each case. + +``` julia +using Plots +zoom = -3 +xrange = range(-xmax+zoom,xmax-zoom,length=N) +plt_list = [] + +for (_mod, mach) in results + plt = plot(mach.model, mach.fitresult, X, y, zoom=zoom, title=_mod) + plot!(plt, xrange, @.(fun(xrange)), lw=1, ls=:dash, colour=:black, label="Ground truth") + push!(plt_list, plt) +end + +plot(plt_list..., size=(1600,1000)) +``` + +![Figure 1: Conformal prediction regions.](regression_files/figure-commonmark/fig-cp-output-1.svg) diff --git a/docs/src/regression.qmd b/docs/src/regression.qmd new file mode 100644 index 0000000..384b5bd --- /dev/null +++ b/docs/src/regression.qmd @@ -0,0 +1,98 @@ +# Regression + +```@meta +CurrentModule = ConformalPrediction +``` + +```{julia} +#| echo: false +using Pkg; Pkg.activate("docs") +using Plots +theme(:wong) +``` + +This tutorial mostly replicates this [tutorial](https://mapie.readthedocs.io/en/latest/examples_regression/4-tutorials/plot_main-tutorial-regression.html#) from MAPIE. 
+ +## Data + +We begin by generating some synthetic regression data below: + +```{julia} +#| label: fig-data +#| fig-cap: "Synthetic data." + +# Regression data: + +# Inputs: +N = 600 +xmax = 3.0 +using Distributions +d = Uniform(-xmax, xmax) +X = rand(d, N) +X = reshape(X, :, 1) + +# Outputs: +noise = 0.5 +fun(X) = X * sin(X) +ε = randn(N) .* noise +y = @.(fun(X)) + ε +using MLJ +train, test = partition(eachindex(y), 0.4, 0.4, shuffle=true) + +using Plots +scatter(X, y, label="Observed") +xrange = range(-xmax,xmax,length=N) +plot!(xrange, @.(fun(xrange)), lw=4, label="Ground truth", ls=:dash, colour=:black) +``` + +## Model + +To model this data we will use polynomial regression. There is currently no out-of-the-box support for polynomial feature transformations in `MLJ`, but it is easy enough to add a little helper function for this. Note how we define a linear pipeline `pipe` here. Since pipelines in `MLJ` are just models, we can use the generated object as an input to `conformal_model` below. + +```{julia} +LinearRegressor = @load LinearRegressor pkg=MLJLinearModels +degree_polynomial = 10 +polynomial_features(X, degree::Int) = reduce(hcat, map(i -> X.^i, 1:degree)) +pipe = (X -> MLJ.table(polynomial_features(MLJ.matrix(X), degree_polynomial))) |> LinearRegressor() +``` + +## Conformal Prediction + +Next, we conformalize our polynomial regressor using every available approach (except the Naive approach): + +```{julia} +using ConformalPrediction +conformal_models = merge(values(available_models[:regression])...) +delete!(conformal_models, :naive) +# delete!(conformal_models, :jackknife) +results = Dict() +for _mod in keys(conformal_models) + conf_model = conformal_model(pipe; method=_mod, coverage=0.95) + mach = machine(conf_model, X, y) + fit!(mach, rows=train) + results[_mod] = mach +end +``` + +Finally, let us look at the resulting conformal predictions in each case. + +```{julia} +#| output: true +#| label: fig-cp +#| fig-cap: "Conformal prediction regions." 
+ +using Plots +zoom = 0 +xrange = range(-xmax+zoom,xmax-zoom,length=N) +plt_list = [] + +for (_mod, mach) in results + plt = plot(mach.model, mach.fitresult, X, y, zoom=zoom, title=_mod) + plot!(plt, xrange, @.(fun(xrange)), lw=1, ls=:dash, colour=:black, label="Ground truth") + push!(plt_list, plt) +end + +plot(plt_list..., size=(1600,1000)) +``` + + diff --git a/docs/src/regression_files/figure-commonmark/fig-cp-output-1.svg b/docs/src/regression_files/figure-commonmark/fig-cp-output-1.svg new file mode 100644 index 0000000..a1fba3e --- /dev/null +++ b/docs/src/regression_files/figure-commonmark/fig-cp-output-1.svgdiff --git a/docs/src/www/classification.gif b/docs/src/www/classification.gif new file mode 100644 index 0000000..60ea94a Binary files /dev/null and b/docs/src/www/classification.gif differ diff --git a/src/ConformalModels/ConformalModels.jl b/src/ConformalModels/ConformalModels.jl index dc6e072..f68ac89 100644 --- a/src/ConformalModels/ConformalModels.jl +++ b/src/ConformalModels/ConformalModels.jl @@ -5,21 +5,20 @@ import MLJModelInterface as MMI import MLJModelInterface: predict, fit, save, restore "An abstract base type for conformal models that produce interval-valued predictions. This includes most conformal regression models." -abstract type ConformalInterval <: MMI.Interval end - -"An abstract base type for conformal models that produce set-valued deterministic predictions. This includes most conformal classification models." -abstract type ConformalSet <: MMI.Supervised end # ideally we'd have MMI.Set +abstract type ConformalInterval <: MMI.Interval end "An abstract base type for conformal models that produce set-valued probabilistic predictions. This includes most conformal classification models." 
-abstract type ConformalProbabilisticSet <: MMI.Supervised end # ideally we'd have MMI.ProbabilisticSet +abstract type ConformalProbabilisticSet <: MMI.ProbabilisticSet end "An abstract base type for conformal models that produce probabilistic predictions. This includes some conformal classifier like Venn-ABERS." abstract type ConformalProbabilistic <: MMI.Probabilistic end -const ConformalModel = Union{ConformalInterval, ConformalSet, ConformalProbabilistic} +const ConformalModel = Union{ConformalInterval, ConformalProbabilisticSet, ConformalProbabilistic} -export ConformalInterval, ConformalSet, ConformalProbabilistic, ConformalModel +export ConformalInterval, ConformalProbabilistic, ConformalModel +include("utils.jl") +include("plotting.jl") include("conformal_models.jl") # Regression Models: diff --git a/src/ConformalModels/inductive_bayes.jl b/src/ConformalModels/inductive_bayes.jl new file mode 100644 index 0000000..7ce425c --- /dev/null +++ b/src/ConformalModels/inductive_bayes.jl @@ -0,0 +1,74 @@ +# # Simple +# "The `SimpleInductiveBayes` is the simplest approach to Inductive Conformalized Bayes." 
+# mutable struct SimpleInductiveBayes{Model <: Supervised} <: ConformalModel +# model::Model +# coverage::AbstractFloat +# scores::Union{Nothing,AbstractArray} +# heuristic::Function +# train_ratio::AbstractFloat +# end + +# function SimpleInductiveBayes(model::Supervised; coverage::AbstractFloat=0.95, heuristic::Function=f(y, ŷ)=-ŷ, train_ratio::AbstractFloat=0.5) +# return SimpleInductiveBayes(model, coverage, nothing, heuristic, train_ratio) +# end + +# @doc raw""" +# MMI.fit(conf_model::SimpleInductiveBayes, verbosity, X, y) + +# For the [`SimpleInductiveBayes`](@ref) nonconformity scores are computed as follows: + +# `` +# S_i^{\text{CAL}} = s(X_i, Y_i) = h(\hat\mu(X_i), Y_i), \ i \in \mathcal{D}_{\text{calibration}} +# `` + +# A typical choice for the heuristic function is ``h(\hat\mu(X_i), Y_i)=1-\hat\mu(X_i)_{Y_i}`` where ``\hat\mu(X_i)_{Y_i}`` denotes the softmax output of the true class and ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``. The simple approach only takes the softmax probability of the true label into account. 
+# """ +# function MMI.fit(conf_model::SimpleInductiveBayes, verbosity, X, y) + +# # Data Splitting: +# train, calibration = partition(eachindex(y), conf_model.train_ratio) +# Xtrain = selectrows(X, train) +# ytrain = y[train] +# Xtrain, ytrain = MMI.reformat(conf_model.model, Xtrain, ytrain) +# Xcal = selectrows(X, calibration) +# ycal = y[calibration] +# Xcal, ycal = MMI.reformat(conf_model.model, Xcal, ycal) + +# # Training: +# fitresult, cache, report = MMI.fit(conf_model.model, verbosity, Xtrain, ytrain) + +# # Nonconformity Scores: +# ŷ = pdf.(MMI.predict(conf_model.model, fitresult, Xcal), ycal) # predict returns a vector of distributions +# conf_model.scores = @.(conf_model.heuristic(ycal, ŷ)) + +# return (fitresult, cache, report) +# end + +# @doc raw""" +# MMI.predict(conf_model::SimpleInductiveBayes, fitresult, Xnew) + +# For the [`SimpleInductiveBayes`](@ref) prediction sets are computed as follows, + +# `` +# \hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i^{\text{CAL}}\} \right\}, \ i \in \mathcal{D}_{\text{calibration}} +# `` + +# where ``\mathcal{D}_{\text{calibration}}`` denotes the designated calibration data. +# """ +# function MMI.predict(conf_model::SimpleInductiveBayes, fitresult, Xnew) +# p̂ = MMI.predict(conf_model.model, fitresult, MMI.reformat(conf_model.model, Xnew)...) 
+# v = conf_model.scores +# q̂ = Statistics.quantile(v, conf_model.coverage) +# p̂ = map(p̂) do pp +# L = p̂.decoder.classes +# probas = pdf.(pp, L) +# is_in_set = 1.0 .- probas .<= q̂ +# if !all(is_in_set .== false) +# pp = UnivariateFinite(L[is_in_set], probas[is_in_set]) +# else +# pp = missing +# end +# return pp +# end +# return p̂ +# end \ No newline at end of file diff --git a/src/ConformalModels/inductive_classification.jl b/src/ConformalModels/inductive_classification.jl index 6a1bc6a..e88774f 100644 --- a/src/ConformalModels/inductive_classification.jl +++ b/src/ConformalModels/inductive_classification.jl @@ -1,6 +1,6 @@ # Simple "The `SimpleInductiveClassifier` is the simplest approach to Inductive Conformal Classification. Contrary to the [`NaiveClassifier`](@ref) it computes nonconformity scores using a designated calibration dataset." -mutable struct SimpleInductiveClassifier{Model <: Supervised} <: ConformalSet +mutable struct SimpleInductiveClassifier{Model <: Supervised} <: ConformalProbabilisticSet model::Model coverage::AbstractFloat scores::Union{Nothing,AbstractArray} @@ -75,7 +75,7 @@ end # Adaptive "The `AdaptiveInductiveClassifier` is an improvement to the [`SimpleInductiveClassifier`](@ref) and the [`NaiveClassifier`](@ref). Contrary to the [`NaiveClassifier`](@ref) it computes nonconformity scores using a designated calibration dataset like the [`SimpleInductiveClassifier`](@ref). Contrary to the [`SimpleInductiveClassifier`](@ref) it utilizes the softmax output of all classes." 
-mutable struct AdaptiveInductiveClassifier{Model <: Supervised} <: ConformalSet +mutable struct AdaptiveInductiveClassifier{Model <: Supervised} <: ConformalProbabilisticSet model::Model coverage::AbstractFloat scores::Union{Nothing,AbstractArray} @@ -115,9 +115,9 @@ function MMI.fit(conf_model::AdaptiveInductiveClassifier, verbosity, X, y) L = p̂.decoder.classes ŷ = pdf(p̂, L) # compute probabilities for all classes scores = map(eachrow(ŷ),eachrow(ycal)) do ŷᵢ, ycalᵢ - ranks = sortperm(.-ŷᵢ) # rank in descending order - index_y = findall(L[ranks].==ycalᵢ)[1] # index of true y in sorted array - scoreᵢ = last(cumsum(ŷᵢ[ranks][1:index_y])) # sum up until true y is reached + ranks = sortperm(.-ŷᵢ) # rank in descending order + index_y = findall(L[ranks].==ycalᵢ)[1] # index of true y in sorted array + scoreᵢ = last(cumsum(ŷᵢ[ranks][1:index_y])) # sum up until true y is reached return scoreᵢ end conf_model.scores = scores @@ -152,5 +152,4 @@ function MMI.predict(conf_model::AdaptiveInductiveClassifier, fitresult, Xnew) return pp end return p̂ -end - +end \ No newline at end of file diff --git a/src/ConformalModels/inductive_regression.jl b/src/ConformalModels/inductive_regression.jl index 6fe62b8..ccde92e 100644 --- a/src/ConformalModels/inductive_regression.jl +++ b/src/ConformalModels/inductive_regression.jl @@ -60,6 +60,7 @@ function MMI.predict(conf_model::SimpleInductiveRegressor, fitresult, Xnew) v = conf_model.scores q̂ = Statistics.quantile(v, conf_model.coverage) ŷ = map(x -> (x .- q̂, x .+ q̂), eachrow(ŷ)) + ŷ = reformat_interval(ŷ) return ŷ end diff --git a/src/ConformalModels/plotting.jl b/src/ConformalModels/plotting.jl new file mode 100644 index 0000000..1d5eb8c --- /dev/null +++ b/src/ConformalModels/plotting.jl @@ -0,0 +1,131 @@ +using CategoricalArrays +using MLJ +using Plots + +function Plots.plot( + conf_model::ConformalModel,fitresult,X,y; + target::Union{Nothing,Real}=nothing, + 
colorbar=true,title=nothing,length_out=50,zoom=-1,xlims=nothing,ylims=nothing,linewidth=0.1,lw=4, + train_lab=nothing,hat_lab=nothing,plot_set_size=false, + kwargs... +) + + X = permutedims(MLJ.matrix(X)) + + is_classifier = target_scitype(conf_model.model) <: AbstractVector{<:Finite} + if !is_classifier + @assert size(X,1) == 1 "Cannot plot regression for multiple input variables." + else + @assert size(X,1) == 2 "Cannot plot classification for more than two input variables." + end + + if !is_classifier + + # REGRESSION + + # Surface range: + if isnothing(xlims) + xlims = (minimum(X),maximum(X)).+(zoom,-zoom) + else + xlims = xlims .+ (zoom,-zoom) + end + if isnothing(ylims) + ylims = (minimum(y),maximum(y)).+(zoom,-zoom) + else + ylims = ylims .+ (zoom,-zoom) + end + x_range = range(xlims[1],stop=xlims[2],length=length_out) + y_range = range(ylims[1],stop=ylims[2],length=length_out) + + title = isnothing(title) ? "" : title + + # Plot: + _lab = isnothing(train_lab) ? "Observed" : train_lab + scatter(vec(X), vec(y), label=_lab, xlim=xlims, ylim=ylims, lw=lw, title=title; kwargs...) + _x = reshape([x for x in x_range],:,1) + _x = MLJ.table(_x) + ŷ = predict(conf_model, fitresult, _x) + lb, ub = eachcol(reduce(vcat, map(y -> permutedims(collect(y)), ŷ))) + ymid = (lb .+ ub)./2 + yerror = (ub .- lb)./2 + _lab = isnothing(hat_lab) ? "Predicted" : hat_lab + plot!(x_range, ymid, label=_lab, ribbon = (yerror, yerror), lw=lw; kwargs...) + + else + + # CLASSIFICATION + + # Surface range: + if isnothing(xlims) + xlims = (minimum(X[1,:]),maximum(X[1,:])).+(zoom,-zoom) + else + xlims = xlims .+ (zoom,-zoom) + end + if isnothing(ylims) + ylims = (minimum(X[2,:]),maximum(X[2,:])).+(zoom,-zoom) + else + ylims = ylims .+ (zoom,-zoom) + end + x_range = range(xlims[1],stop=xlims[2],length=length_out) + y_range = range(ylims[1],stop=ylims[2],length=length_out) + + # Target + if !isnothing(target) + @assert target in unique(y) "Specified target does not match any of the labels." 
+ end + if length(unique(y)) > 1 + if isnothing(target) + @info "No target label supplied, using first." + end + target = isnothing(target) ? 1 : target + _default_title = plot_set_size ? "Set size" : "p̂(y=$(target))" + else + target = isnothing(target) ? 2 : target + _default_title = plot_set_size ? "Set size" : "p̂(y=$(target-1))" + end + title = isnothing(title) ? _default_title : title + + # Predictions + Z = [] + for y=y_range, x=x_range + p̂ = predict(conf_model, fitresult, [x y])[1] + if plot_set_size + z = ismissing(p̂) ? 0 : sum(pdf.(p̂, p̂.decoder.classes) .> 0) + else + z = ismissing(p̂) ? p̂ : pdf.(p̂, 1) + end + push!(Z, z) + end + Z = reduce(hcat, Z) + Z = Z[Int(target),:] + + # Contour: + if plot_set_size + _n = length(unique(y)) + clim=(0,_n) + plt = contourf( + x_range, y_range, Z; + colorbar=colorbar, title=title, linewidth=linewidth, + xlims=xlims, + ylims=ylims, + c=cgrad(:blues, _n+1, categorical = true), + clim=clim, + kwargs... + ) + else + plt = contourf( + x_range, y_range, Z; + colorbar=colorbar, title=title, linewidth=linewidth, + xlims=xlims, + ylims=ylims, + kwargs... + ) + end + + # Samples: + y = typeof(y) <: CategoricalArrays.CategoricalArray ? y : Int.(y) + scatter!(plt, X[1,:],X[2,:],group=y; kwargs...) + + end + +end \ No newline at end of file diff --git a/src/ConformalModels/transductive_classification.jl b/src/ConformalModels/transductive_classification.jl index e982d7d..9f5b6af 100644 --- a/src/ConformalModels/transductive_classification.jl +++ b/src/ConformalModels/transductive_classification.jl @@ -1,6 +1,6 @@ # Simple "The `NaiveClassifier` is the simplest approach to Inductive Conformal Classification. Contrary to the [`NaiveClassifier`](@ref) it computes nonconformity scores using a designated trainibration dataset." 
-mutable struct NaiveClassifier{Model <: Supervised} <: ConformalSet +mutable struct NaiveClassifier{Model <: Supervised} <: ConformalProbabilisticSet model::Model coverage::AbstractFloat scores::Union{Nothing,AbstractArray} @@ -25,7 +25,9 @@ A typical choice for the heuristic function is ``h(\hat\mu(X_i), Y_i)=1-\hat\mu( function MMI.fit(conf_model::NaiveClassifier, verbosity, X, y) # Setup: - Xtrain, ytrain = MMI.reformat(conf_model.model, X, y) + Xtrain = selectrows(X, :) + ytrain = y[:] + Xtrain, ytrain = MMI.reformat(conf_model.model, Xtrain, ytrain) # Training: fitresult, cache, report = MMI.fit(conf_model.model, verbosity, Xtrain, ytrain) diff --git a/src/ConformalModels/transductive_regression.jl b/src/ConformalModels/transductive_regression.jl index 63ab6c6..230b923 100644 --- a/src/ConformalModels/transductive_regression.jl +++ b/src/ConformalModels/transductive_regression.jl @@ -30,7 +30,9 @@ A typical choice for the heuristic function is ``h(\hat\mu(X_i),Y_i)=|Y_i-\hat\m function MMI.fit(conf_model::NaiveRegressor, verbosity, X, y) # Setup: - Xtrain, ytrain = MMI.reformat(conf_model.model, X, y) + Xtrain = selectrows(X, :) + ytrain = y[:] + Xtrain, ytrain = MMI.reformat(conf_model.model, Xtrain, ytrain) # Training: fitresult, cache, report = MMI.fit(conf_model.model, verbosity, Xtrain, ytrain) @@ -60,6 +62,7 @@ function MMI.predict(conf_model::NaiveRegressor, fitresult, Xnew) v = conf_model.scores q̂ = Statistics.quantile(v, conf_model.coverage) ŷ = map(x -> (x .- q̂, x .+ q̂), eachrow(ŷ)) + ŷ = reformat_interval(ŷ) return ŷ end @@ -91,7 +94,9 @@ where ``\hat\mu_{-i}(X_i)`` denotes the leave-one-out prediction for ``X_i``. 
In function MMI.fit(conf_model::JackknifeRegressor, verbosity, X, y) # Setup: - Xtrain, ytrain = MMI.reformat(conf_model.model, X, y) + Xtrain = selectrows(X, :) + ytrain = y[:] + Xtrain, ytrain = MMI.reformat(conf_model.model, Xtrain, ytrain) # Training: fitresult, cache, report = MMI.fit(conf_model.model, verbosity, Xtrain, ytrain) @@ -131,6 +136,7 @@ function MMI.predict(conf_model::JackknifeRegressor, fitresult, Xnew) v = conf_model.scores q̂ = Statistics.quantile(v, conf_model.coverage) ŷ = map(x -> (x .- q̂, x .+ q̂), eachrow(ŷ)) + ŷ = reformat_interval(ŷ) return ŷ end @@ -209,6 +215,7 @@ function MMI.predict(conf_model::JackknifePlusRegressor, fitresult, Xnew) ub = Statistics.quantile(yᵢ .+ conf_model.scores, conf_model.coverage) return (lb, ub) end + ŷ = reformat_interval(ŷ) return ŷ end @@ -286,6 +293,7 @@ function MMI.predict(conf_model::JackknifeMinMaxRegressor, fitresult, Xnew) q̂ = Statistics.quantile(v, conf_model.coverage) # For each Xnew compute ( q̂⁻(μ̂₋ᵢ(xnew)-Rᵢᴸᴼᴼ) , q̂⁺(μ̂₋ᵢ(xnew)+Rᵢᴸᴼᴼ) ): ŷ = map(yᵢ -> (minimum(yᵢ .- q̂), maximum(yᵢ .+ q̂)), eachrow(ŷ)) + ŷ = reformat_interval(ŷ) return ŷ end @@ -378,6 +386,7 @@ function MMI.predict(conf_model::CVPlusRegressor, fitresult, Xnew) ub = Statistics.quantile(yᵢ .+ conf_model.scores, conf_model.coverage) return (lb, ub) end + ŷ = reformat_interval(ŷ) return ŷ end diff --git a/src/ConformalModels/utils.jl b/src/ConformalModels/utils.jl new file mode 100644 index 0000000..dd8366f --- /dev/null +++ b/src/ConformalModels/utils.jl @@ -0,0 +1,3 @@ +function reformat_interval(ŷ) + return map(y -> map(yᵢ -> ndims(yᵢ)==1 ? 
yᵢ[1] : yᵢ,y), ŷ) +end \ No newline at end of file diff --git a/test/Manifest.toml b/test/Manifest.toml index a25928c..d564dda 100644 --- a/test/Manifest.toml +++ b/test/Manifest.toml @@ -2,7 +2,7 @@ julia_version = "1.8.1" manifest_format = "2.0" -project_hash = "988c4d4cb0a10e861795c29e95a79cf7c9c883cb" +project_hash = "c86d41ede7b316f1c0c615053739e4cfe0ac765b" [[deps.ANSIColoredPrinters]] git-tree-sha1 = "574baf8110975760d391c710b6341da1afa48d8c" @@ -117,6 +117,12 @@ git-tree-sha1 = "49549e2c28ffb9cc77b3689dc10e46e6271e9452" uuid = "052768ef-5323-5732-b1bb-66c8b64840ba" version = "3.12.0" +[[deps.Cairo_jll]] +deps = ["Artifacts", "Bzip2_jll", "Fontconfig_jll", "FreeType2_jll", "Glib_jll", "JLLWrappers", "LZO_jll", "Libdl", "Pixman_jll", "Pkg", "Xorg_libXext_jll", "Xorg_libXrender_jll", "Zlib_jll", "libpng_jll"] +git-tree-sha1 = "4b859a208b2397a7a623a03449e4636bdb17bcf2" +uuid = "83423d85-b0ee-5818-9007-b63ccbeb887a" +version = "1.16.1+1" + [[deps.Calculus]] deps = ["LinearAlgebra"] git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad" @@ -349,6 +355,12 @@ git-tree-sha1 = "966e236ded10551a44b6e25ce4bbea4c12be1557" uuid = "f6006082-12f8-11e9-0c9c-0d5d367ab1e5" version = "0.12.4" +[[deps.Expat_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "bad72f730e9e91c08d9427d5e8db95478a3c323d" +uuid = "2e619515-83b5-522b-bb60-26c02a35a201" +version = "2.4.8+0" + [[deps.ExprTools]] git-tree-sha1 = "56559bbef6ca5ea0c0818fa5c90320398a6fbf8d" uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" @@ -359,6 +371,18 @@ git-tree-sha1 = "5e1e4c53fa39afe63a7d356e30452249365fba99" uuid = "411431e0-e8b7-467b-b5e0-f676ba4f2910" version = "0.1.1" +[[deps.FFMPEG]] +deps = ["FFMPEG_jll"] +git-tree-sha1 = "b57e3acbe22f8484b4b5ff66a7499717fe1a9cc8" +uuid = "c87230d0-a227-11e9-1b43-d7ebe4e7570a" +version = "0.4.1" + +[[deps.FFMPEG_jll]] +deps = ["Artifacts", "Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "JLLWrappers", "LAME_jll", "Libdl", "Ogg_jll", "OpenSSL_jll", 
"Opus_jll", "PCRE2_jll", "Pkg", "Zlib_jll", "libaom_jll", "libass_jll", "libfdk_aac_jll", "libvorbis_jll", "x264_jll", "x265_jll"] +git-tree-sha1 = "74faea50c1d007c85837327f6775bea60b5492dd" +uuid = "b22a6f82-2f65-5046-a5b2-351ab43fb4e5" +version = "4.4.2+2" + [[deps.FileIO]] deps = ["Pkg", "Requires", "UUIDs"] git-tree-sha1 = "7be5f99f7d15578798f338f5433b6c432ea8037b" @@ -386,6 +410,12 @@ git-tree-sha1 = "335bfdceacc84c5cdf16aadc768aa5ddfc5383cc" uuid = "53c48c17-4a7d-5ca2-90c5-79b7896eea93" version = "0.8.4" +[[deps.Fontconfig_jll]] +deps = ["Artifacts", "Bzip2_jll", "Expat_jll", "FreeType2_jll", "JLLWrappers", "Libdl", "Libuuid_jll", "Pkg", "Zlib_jll"] +git-tree-sha1 = "21efd19106a55620a188615da6d3d06cd7f6ee03" +uuid = "a3f928ae-7b40-5064-980b-68af3947d34b" +version = "2.13.93+0" + [[deps.Formatting]] deps = ["Printf"] git-tree-sha1 = "8339d61043228fdd3eb658d86c926cb282ae72a8" @@ -410,10 +440,22 @@ git-tree-sha1 = "87eb71354d8ec1a96d4a7636bd57a7347dde3ef9" uuid = "d7e528f0-a631-5988-bf34-fe36492bcfd7" version = "2.10.4+0" +[[deps.FriBidi_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "aa31987c2ba8704e23c6c8ba8a4f769d5d7e4f91" +uuid = "559328eb-81f9-559d-9380-de523a88c83c" +version = "1.0.10+0" + [[deps.Future]] deps = ["Random"] uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820" +[[deps.GLFW_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Libglvnd_jll", "Pkg", "Xorg_libXcursor_jll", "Xorg_libXi_jll", "Xorg_libXinerama_jll", "Xorg_libXrandr_jll"] +git-tree-sha1 = "d972031d28c8c8d9d7b41a536ad7bb0c2579caca" +uuid = "0656b61e-2033-5cc2-a64a-77c0f6c09b89" +version = "3.3.8+0" + [[deps.GPUArrays]] deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"] git-tree-sha1 = "45d7deaf05cbb44116ba785d147c518ab46352d7" @@ -432,6 +474,18 @@ git-tree-sha1 = "323949b0bbdf38c93d2ea1f7d3e68ff163c3f081" uuid = "61eb1bfa-7361-4325-ad38-22787b887f55" version = "0.16.5" +[[deps.GR]] +deps = 
["Artifacts", "Base64", "DelimitedFiles", "Downloads", "GR_jll", "HTTP", "JSON", "Libdl", "LinearAlgebra", "Pkg", "Preferences", "Printf", "Random", "Serialization", "Sockets", "TOML", "Tar", "Test", "UUIDs", "p7zip_jll"] +git-tree-sha1 = "051072ff2accc6e0e87b708ddee39b18aa04a0bc" +uuid = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71" +version = "0.71.1" + +[[deps.GR_jll]] +deps = ["Artifacts", "Bzip2_jll", "Cairo_jll", "FFMPEG_jll", "Fontconfig_jll", "GLFW_jll", "JLLWrappers", "JpegTurbo_jll", "Libdl", "Libtiff_jll", "Pixman_jll", "Pkg", "Qt5Base_jll", "Zlib_jll", "libpng_jll"] +git-tree-sha1 = "501a4bf76fd679e7fcd678725d5072177392e756" +uuid = "d2c73de3-f751-5644-a686-071e5b155ba9" +version = "0.71.1+0" + [[deps.GeoInterface]] deps = ["Extents"] git-tree-sha1 = "fb28b5dc239d0174d7297310ef7b84a11804dfab" @@ -444,12 +498,41 @@ git-tree-sha1 = "12a584db96f1d460421d5fb8860822971cdb8455" uuid = "5c1252a2-5f33-56bf-86c9-59e7332b4326" version = "0.4.4" +[[deps.Gettext_jll]] +deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Libiconv_jll", "Pkg", "XML2_jll"] +git-tree-sha1 = "9b02998aba7bf074d14de89f9d37ca24a1a0b046" +uuid = "78b55507-aeef-58d4-861c-77aaff3498b1" +version = "0.21.0+0" + +[[deps.Glib_jll]] +deps = ["Artifacts", "Gettext_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Libiconv_jll", "Libmount_jll", "PCRE2_jll", "Pkg", "Zlib_jll"] +git-tree-sha1 = "fb83fbe02fe57f2c068013aa94bcdf6760d3a7a7" +uuid = "7746bdde-850d-59dc-9ae8-88ece973131d" +version = "2.74.0+1" + +[[deps.Graphite2_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "344bf40dcab1073aca04aa0df4fb092f920e4011" +uuid = "3b182d85-2403-5c21-9c21-1e1f0cc25472" +version = "1.3.14+0" + +[[deps.Grisu]] +git-tree-sha1 = "53bb909d1151e57e2484c3d1b53e19552b887fb2" +uuid = "42e2da0e-8278-4e71-bc24-59509adca0fe" +version = "1.0.2" + [[deps.HTTP]] deps = ["Base64", "CodecZlib", "Dates", "IniFile", "Logging", "LoggingExtras", "MbedTLS", "NetworkOptions", "OpenSSL", 
"Random", "SimpleBufferStream", "Sockets", "URIs", "UUIDs"] git-tree-sha1 = "a97d47758e933cd5fe5ea181d178936a9fc60427" uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3" version = "1.5.1" +[[deps.HarfBuzz_jll]] +deps = ["Artifacts", "Cairo_jll", "Fontconfig_jll", "FreeType2_jll", "Glib_jll", "Graphite2_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg"] +git-tree-sha1 = "129acf094d168394e80ee1dc4bc06ec835e510a3" +uuid = "2e76f6c2-a576-52d4-95c1-20adfe4de566" +version = "2.8.1+1" + [[deps.HostCPUFeatures]] deps = ["BitTwiddlingConvenienceFunctions", "IfElse", "Libdl", "Static"] git-tree-sha1 = "b7b88a4716ac33fe31d6556c02fc60017594343c" @@ -520,6 +603,12 @@ git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856" uuid = "82899510-4779-5014-852e-03e436cf321d" version = "1.0.0" +[[deps.JLFzf]] +deps = ["Pipe", "REPL", "Random", "fzf_jll"] +git-tree-sha1 = "f377670cda23b6b7c1c0b3893e37451c5c1a2185" +uuid = "1019f520-868f-41f5-a6de-eb00f4b6a39c" +version = "0.1.5" + [[deps.JLLWrappers]] deps = ["Preferences"] git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1" @@ -532,6 +621,24 @@ git-tree-sha1 = "3c837543ddb02250ef42f4738347454f95079d4e" uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6" version = "0.21.3" +[[deps.JpegTurbo_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "b53380851c6e6664204efb2e62cd24fa5c47e4ba" +uuid = "aacddb02-875f-59d6-b918-886e6ef4fbf8" +version = "2.1.2+0" + +[[deps.LAME_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "f6250b16881adf048549549fba48b1161acdac8c" +uuid = "c1c5ebd0-6772-5130-a774-d5fcae4a789d" +version = "3.100.1+0" + +[[deps.LERC_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "bf36f528eec6634efc60d7ec062008f171071434" +uuid = "88015f11-f218-50d7-93a8-a6af411a945d" +version = "3.0.0+1" + [[deps.LLVM]] deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"] git-tree-sha1 = "e7e9184b0bf0158ac4e4aa9daf00041b5909bf1a" @@ -544,6 +651,23 @@ 
git-tree-sha1 = "771bfe376249626d3ca12bcd58ba243d3f961576" uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab" version = "0.0.16+0" +[[deps.LZO_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "e5b909bcf985c5e2605737d2ce278ed791b89be6" +uuid = "dd4b983a-f0e5-5f8d-a1b7-129d4a5fb1ac" +version = "2.10.1+0" + +[[deps.LaTeXStrings]] +git-tree-sha1 = "f2355693d6778a178ade15952b7ac47a4ff97996" +uuid = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f" +version = "1.3.0" + +[[deps.Latexify]] +deps = ["Formatting", "InteractiveUtils", "LaTeXStrings", "MacroTools", "Markdown", "OrderedCollections", "Printf", "Requires"] +git-tree-sha1 = "ab9aa169d2160129beb241cb2750ca499b4e90e9" +uuid = "23fbe1c1-3f47-55db-b15f-69d7ec21a316" +version = "0.15.17" + [[deps.LatinHypercubeSampling]] deps = ["Random", "StableRNGs", "StatsBase", "Test"] git-tree-sha1 = "42938ab65e9ed3c3029a8d2c58382ca75bdab243" @@ -582,6 +706,54 @@ version = "1.10.2+0" [[deps.Libdl]] uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" +[[deps.Libffi_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "0b4a5d71f3e5200a7dff793393e09dfc2d874290" +uuid = "e9f186c6-92d2-5b65-8a66-fee21dc1b490" +version = "3.2.2+1" + +[[deps.Libgcrypt_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Libgpg_error_jll", "Pkg"] +git-tree-sha1 = "64613c82a59c120435c067c2b809fc61cf5166ae" +uuid = "d4300ac3-e22c-5743-9152-c294e39db1e4" +version = "1.8.7+0" + +[[deps.Libglvnd_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll", "Xorg_libXext_jll"] +git-tree-sha1 = "6f73d1dd803986947b2c750138528a999a6c7733" +uuid = "7e76a0d4-f3c7-5321-8279-8d96eeed0f29" +version = "1.6.0+0" + +[[deps.Libgpg_error_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "c333716e46366857753e273ce6a69ee0945a6db9" +uuid = "7add5ba3-2f88-524e-9cd5-f83b8a55f7b8" +version = "1.42.0+0" + +[[deps.Libiconv_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = 
"42b62845d70a619f063a7da093d995ec8e15e778" +uuid = "94ce4f54-9a6c-5748-9c1c-f9c7231a4531" +version = "1.16.1+1" + +[[deps.Libmount_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "9c30530bf0effd46e15e0fdcf2b8636e78cbbd73" +uuid = "4b2f31a3-9ecc-558c-b454-b3730dcb73e9" +version = "2.35.0+0" + +[[deps.Libtiff_jll]] +deps = ["Artifacts", "JLLWrappers", "JpegTurbo_jll", "LERC_jll", "Libdl", "Pkg", "Zlib_jll", "Zstd_jll"] +git-tree-sha1 = "3eb79b0ca5764d4799c06699573fd8f533259713" +uuid = "89763e89-9b03-5906-acba-b20f662cd828" +version = "4.4.0+0" + +[[deps.Libuuid_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "7f3efec06033682db852f8b3bc3c1d2b0a0ab066" +uuid = "38a345b3-de98-5d2b-a5d3-14cd9215e700" +version = "2.36.0+0" + [[deps.LightGBM]] deps = ["Dates", "Libdl", "MLJModelInterface", "SparseArrays", "Statistics"] git-tree-sha1 = "658faa6a229fb5bb4aea5cc897cd99db66aafb51" @@ -723,6 +895,11 @@ deps = ["Artifacts", "Libdl"] uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" version = "2.28.0+0" +[[deps.Measures]] +git-tree-sha1 = "c13304c81eec1ed3af7fc20e75fb6b26092a1102" +uuid = "442fdcdd-2543-5da2-b0f3-8c86c306513e" +version = "0.3.2" + [[deps.Missings]] deps = ["DataAPI"] git-tree-sha1 = "bf210ce90b6c9eed32d25dbcae1ebc565df2687f" @@ -776,6 +953,12 @@ git-tree-sha1 = "f71d8950b724e9ff6110fc948dff5a329f901d64" uuid = "6fe1bfb0-de20-5000-8ca7-80f57d26f881" version = "1.12.8" +[[deps.Ogg_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "887579a3eb005446d514ab7aeac5d1d027658b8f" +uuid = "e7412a2a-1a6e-54c0-be00-318e2571c051" +version = "1.3.5+1" + [[deps.OpenBLAS_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" @@ -816,11 +999,22 @@ git-tree-sha1 = "b9fe76d1a39807fdcf790b991981a922de0c3050" uuid = "429524aa-4258-5aef-a3af-852621145aeb" version = "1.7.3" +[[deps.Opus_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] 
+git-tree-sha1 = "51a08fb14ec28da2ec7a927c4337e4332c2a4720" +uuid = "91d4177d-7536-5919-b921-800302f37372" +version = "1.3.2+0" + [[deps.OrderedCollections]] git-tree-sha1 = "85f8e6578bf1f9ee0d11e7bb1b1456435479d47c" uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d" version = "1.4.1" +[[deps.PCRE2_jll]] +deps = ["Artifacts", "Libdl"] +uuid = "efcefdf7-47ab-520b-bdef-62a2eaa19f15" +version = "10.40.0+0" + [[deps.PDMats]] deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"] git-tree-sha1 = "cf494dca75a69712a72b80bc48f59dcf3dea63ec" @@ -839,11 +1033,40 @@ git-tree-sha1 = "6c01a9b494f6d2a9fc180a08b182fcb06f0958a0" uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0" version = "2.4.2" +[[deps.Pipe]] +git-tree-sha1 = "6842804e7867b115ca9de748a0cf6b364523c16d" +uuid = "b98c9c47-44ae-5843-9183-064241ee97a0" +version = "1.3.0" + +[[deps.Pixman_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "b4f5d02549a10e20780a24fce72bea96b6329e29" +uuid = "30392449-352a-5448-841d-b1acce4e97dc" +version = "0.40.1+0" + [[deps.Pkg]] deps = ["Artifacts", "Dates", "Downloads", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"] uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" version = "1.8.0" +[[deps.PlotThemes]] +deps = ["PlotUtils", "Statistics"] +git-tree-sha1 = "1f03a2d339f42dca4a4da149c7e15e9b896ad899" +uuid = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a" +version = "3.1.0" + +[[deps.PlotUtils]] +deps = ["ColorSchemes", "Colors", "Dates", "Printf", "Random", "Reexport", "SnoopPrecompile", "Statistics"] +git-tree-sha1 = "21303256d239f6b484977314674aef4bb1fe4420" +uuid = "995b91a9-d308-5afd-9ec6-746e21dbc043" +version = "1.3.1" + +[[deps.Plots]] +deps = ["Base64", "Contour", "Dates", "Downloads", "FFMPEG", "FixedPointNumbers", "GR", "JLFzf", "JSON", "LaTeXStrings", "Latexify", "LinearAlgebra", "Measures", "NaNMath", "Pkg", "PlotThemes", "PlotUtils", "Printf", "REPL", "Random", "RecipesBase", 
"RecipesPipeline", "Reexport", "RelocatableFolders", "Requires", "Scratch", "Showoff", "SnoopPrecompile", "SparseArrays", "Statistics", "StatsBase", "UUIDs", "UnicodeFun", "Unzip"] +git-tree-sha1 = "6a9521b955b816aa500462951aa67f3e4467248a" +uuid = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" +version = "1.36.6" + [[deps.PolyesterWeave]] deps = ["BitTwiddlingConvenienceFunctions", "CPUSummary", "IfElse", "Static", "ThreadingUtilities"] git-tree-sha1 = "b42fb2292fbbaed36f25d33a15c8cc0b4f287fcf" @@ -895,6 +1118,12 @@ git-tree-sha1 = "53b8b07b721b77144a0fbbbc2675222ebf40a02d" uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0" version = "1.94.1" +[[deps.Qt5Base_jll]] +deps = ["Artifacts", "CompilerSupportLibraries_jll", "Fontconfig_jll", "Glib_jll", "JLLWrappers", "Libdl", "Libglvnd_jll", "OpenSSL_jll", "Pkg", "Xorg_libXext_jll", "Xorg_libxcb_jll", "Xorg_xcb_util_image_jll", "Xorg_xcb_util_keysyms_jll", "Xorg_xcb_util_renderutil_jll", "Xorg_xcb_util_wm_jll", "Zlib_jll", "xkbcommon_jll"] +git-tree-sha1 = "0c03844e2231e12fda4d0086fd7cbe4098ee8dc5" +uuid = "ea2cea3b-5b76-57ae-a6ef-0a8af62496e1" +version = "5.15.3+2" + [[deps.QuadGK]] deps = ["DataStructures", "LinearAlgebra"] git-tree-sha1 = "97aa253e65b784fd13e83774cadc95b38011d734" @@ -927,6 +1156,12 @@ git-tree-sha1 = "d12e612bba40d189cead6ff857ddb67bd2e6a387" uuid = "3cdcf5f2-1ef4-517c-9805-6587b60abb01" version = "1.3.1" +[[deps.RecipesPipeline]] +deps = ["Dates", "NaNMath", "PlotUtils", "RecipesBase", "SnoopPrecompile"] +git-tree-sha1 = "e974477be88cb5e3040009f3767611bc6357846f" +uuid = "01d81517-befc-4cb6-b9ec-a95719d0359c" +version = "0.6.11" + [[deps.Reexport]] git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" uuid = "189a3867-3050-52da-a836-e630ba90ab69" @@ -1019,6 +1254,12 @@ version = "1.1.1" deps = ["Distributed", "Mmap", "Random", "Serialization"] uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383" +[[deps.Showoff]] +deps = ["Dates", "Grisu"] +git-tree-sha1 = "91eddf657aca81df9ae6ceb20b959ae5653ad1de" +uuid = 
"992d4aef-0814-514b-bc4d-f2e9a6c4116f" +version = "1.0.3" + [[deps.SimpleBufferStream]] git-tree-sha1 = "874e8867b33a00e784c8a7e4b60afe9e037b74e1" uuid = "777ac1f9-54b0-4bf8-805c-2214025038e7" @@ -1176,6 +1417,12 @@ version = "1.0.2" [[deps.Unicode]] uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" +[[deps.UnicodeFun]] +deps = ["REPL"] +git-tree-sha1 = "53915e50200959667e78a92a418594b428dffddf" +uuid = "1cfade01-22cf-5700-b092-accc4b62d6e1" +version = "0.4.1" + [[deps.UnicodePlots]] deps = ["ColorSchemes", "ColorTypes", "Contour", "Crayons", "Dates", "FileIO", "FreeType", "LinearAlgebra", "MarchingCubes", "NaNMath", "Printf", "Requires", "SnoopPrecompile", "SparseArrays", "StaticArrays", "StatsBase", "Unitful"] git-tree-sha1 = "390b2e8e5535f5beb50885d1a1059f460547d3a5" @@ -1188,6 +1435,11 @@ git-tree-sha1 = "d57a4ed70b6f9ff1da6719f5f2713706d57e0d66" uuid = "1986cc42-f94f-5a68-af5c-568840ba703d" version = "1.12.0" +[[deps.Unzip]] +git-tree-sha1 = "ca0969166a028236229f63514992fc073799bb78" +uuid = "41fe7b60-77ed-43a1-b4f0-825fd5a5650d" +version = "0.2.0" + [[deps.VectorizationBase]] deps = ["ArrayInterface", "CPUSummary", "HostCPUFeatures", "IfElse", "LayoutPointers", "Libdl", "LinearAlgebra", "SIMDTypes", "Static"] git-tree-sha1 = "ba9d398034a2ba78059391492730889c6e45cf15" @@ -1199,16 +1451,208 @@ git-tree-sha1 = "58d6e80b4ee071f5efd07fda82cb9fbe17200868" uuid = "81def892-9a0e-5fdd-b105-ffc91e053289" version = "1.3.0" +[[deps.Wayland_jll]] +deps = ["Artifacts", "Expat_jll", "JLLWrappers", "Libdl", "Libffi_jll", "Pkg", "XML2_jll"] +git-tree-sha1 = "3e61f0b86f90dacb0bc0e73a0c5a83f6a8636e23" +uuid = "a2964d1f-97da-50d4-b82a-358c7fce9d89" +version = "1.19.0+0" + +[[deps.Wayland_protocols_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "4528479aa01ee1b3b4cd0e6faef0e04cf16466da" +uuid = "2381bf8a-dfd0-557d-9999-79630e7b1b91" +version = "1.25.0+0" + +[[deps.XML2_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Libiconv_jll", "Pkg", "Zlib_jll"] 
+git-tree-sha1 = "58443b63fb7e465a8a7210828c91c08b92132dff" +uuid = "02c8fc9c-b97f-50b9-bbe4-9be30ff0a78a" +version = "2.9.14+0" + +[[deps.XSLT_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Libgcrypt_jll", "Libgpg_error_jll", "Libiconv_jll", "Pkg", "XML2_jll", "Zlib_jll"] +git-tree-sha1 = "91844873c4085240b95e795f692c4cec4d805f8a" +uuid = "aed1982a-8fda-507f-9586-7b0439959a61" +version = "1.1.34+0" + +[[deps.Xorg_libX11_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxcb_jll", "Xorg_xtrans_jll"] +git-tree-sha1 = "5be649d550f3f4b95308bf0183b82e2582876527" +uuid = "4f6342f7-b3d2-589e-9d20-edeb45f2b2bc" +version = "1.6.9+4" + +[[deps.Xorg_libXau_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "4e490d5c960c314f33885790ed410ff3a94ce67e" +uuid = "0c0b7dd1-d40b-584c-a123-a41640f87eec" +version = "1.0.9+4" + +[[deps.Xorg_libXcursor_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXfixes_jll", "Xorg_libXrender_jll"] +git-tree-sha1 = "12e0eb3bc634fa2080c1c37fccf56f7c22989afd" +uuid = "935fb764-8cf2-53bf-bb30-45bb1f8bf724" +version = "1.2.0+4" + +[[deps.Xorg_libXdmcp_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "4fe47bd2247248125c428978740e18a681372dd4" +uuid = "a3789734-cfe1-5b06-b2d0-1dd0d9d62d05" +version = "1.1.3+4" + +[[deps.Xorg_libXext_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] +git-tree-sha1 = "b7c0aa8c376b31e4852b360222848637f481f8c3" +uuid = "1082639a-0dae-5f34-9b06-72781eeb8cb3" +version = "1.3.4+4" + +[[deps.Xorg_libXfixes_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] +git-tree-sha1 = "0e0dc7431e7a0587559f9294aeec269471c991a4" +uuid = "d091e8ba-531a-589c-9de9-94069b037ed8" +version = "5.0.3+4" + +[[deps.Xorg_libXi_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll", "Xorg_libXfixes_jll"] +git-tree-sha1 = "89b52bc2160aadc84d707093930ef0bffa641246" +uuid = 
"a51aa0fd-4e3c-5386-b890-e753decda492" +version = "1.7.10+4" + +[[deps.Xorg_libXinerama_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll"] +git-tree-sha1 = "26be8b1c342929259317d8b9f7b53bf2bb73b123" +uuid = "d1454406-59df-5ea1-beac-c340f2130bc3" +version = "1.1.4+4" + +[[deps.Xorg_libXrandr_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libXext_jll", "Xorg_libXrender_jll"] +git-tree-sha1 = "34cea83cb726fb58f325887bf0612c6b3fb17631" +uuid = "ec84b674-ba8e-5d96-8ba1-2a689ba10484" +version = "1.5.2+4" + +[[deps.Xorg_libXrender_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] +git-tree-sha1 = "19560f30fd49f4d4efbe7002a1037f8c43d43b96" +uuid = "ea2f1a96-1ddc-540d-b46f-429655e07cfa" +version = "0.9.10+4" + +[[deps.Xorg_libpthread_stubs_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "6783737e45d3c59a4a4c4091f5f88cdcf0908cbb" +uuid = "14d82f49-176c-5ed1-bb49-ad3f5cbd8c74" +version = "0.1.0+3" + +[[deps.Xorg_libxcb_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "XSLT_jll", "Xorg_libXau_jll", "Xorg_libXdmcp_jll", "Xorg_libpthread_stubs_jll"] +git-tree-sha1 = "daf17f441228e7a3833846cd048892861cff16d6" +uuid = "c7cfdc94-dc32-55de-ac96-5a1b8d977c5b" +version = "1.13.0+3" + +[[deps.Xorg_libxkbfile_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libX11_jll"] +git-tree-sha1 = "926af861744212db0eb001d9e40b5d16292080b2" +uuid = "cc61e674-0454-545c-8b26-ed2c68acab7a" +version = "1.1.0+4" + +[[deps.Xorg_xcb_util_image_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] +git-tree-sha1 = "0fab0a40349ba1cba2c1da699243396ff8e94b97" +uuid = "12413925-8142-5f55-bb0e-6d7ca50bb09b" +version = "0.4.0+1" + +[[deps.Xorg_xcb_util_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxcb_jll"] +git-tree-sha1 = "e7fd7b2881fa2eaa72717420894d3938177862d1" +uuid = "2def613f-5ad1-5310-b15b-b15d46f528f5" +version = "0.4.0+1" + 
+[[deps.Xorg_xcb_util_keysyms_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] +git-tree-sha1 = "d1151e2c45a544f32441a567d1690e701ec89b00" +uuid = "975044d2-76e6-5fbe-bf08-97ce7c6574c7" +version = "0.4.0+1" + +[[deps.Xorg_xcb_util_renderutil_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] +git-tree-sha1 = "dfd7a8f38d4613b6a575253b3174dd991ca6183e" +uuid = "0d47668e-0667-5a69-a72c-f761630bfb7e" +version = "0.3.9+1" + +[[deps.Xorg_xcb_util_wm_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xcb_util_jll"] +git-tree-sha1 = "e78d10aab01a4a154142c5006ed44fd9e8e31b67" +uuid = "c22f9ab0-d5fe-5066-847c-f4bb1cd4e361" +version = "0.4.1+1" + +[[deps.Xorg_xkbcomp_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_libxkbfile_jll"] +git-tree-sha1 = "4bcbf660f6c2e714f87e960a171b119d06ee163b" +uuid = "35661453-b289-5fab-8a00-3d9160c6a3a4" +version = "1.4.2+4" + +[[deps.Xorg_xkeyboard_config_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Xorg_xkbcomp_jll"] +git-tree-sha1 = "5c8424f8a67c3f2209646d4425f3d415fee5931d" +uuid = "33bec58e-1273-512f-9401-5d533626f822" +version = "2.27.0+4" + +[[deps.Xorg_xtrans_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "79c31e7844f6ecf779705fbc12146eb190b7d845" +uuid = "c5fb5394-a638-5e4d-96e5-b29de1b5cf10" +version = "1.4.0+3" + [[deps.Zlib_jll]] deps = ["Libdl"] uuid = "83775a58-1f1d-513f-b197-d71354ab007a" version = "1.2.12+3" +[[deps.Zstd_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "e45044cd873ded54b6a5bac0eb5c971392cf1927" +uuid = "3161d3a3-bdf6-5164-811a-617609db77b4" +version = "1.5.2+0" + +[[deps.fzf_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "868e669ccb12ba16eaf50cb2957ee2ff61261c56" +uuid = "214eeab7-80f7-51ab-84ad-2988db7cef09" +version = "0.29.0+0" + +[[deps.libaom_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = 
"3a2ea60308f0996d26f1e5354e10c24e9ef905d4" +uuid = "a4ae2306-e953-59d6-aa16-d00cac43593b" +version = "3.4.0+0" + +[[deps.libass_jll]] +deps = ["Artifacts", "Bzip2_jll", "FreeType2_jll", "FriBidi_jll", "HarfBuzz_jll", "JLLWrappers", "Libdl", "Pkg", "Zlib_jll"] +git-tree-sha1 = "5982a94fcba20f02f42ace44b9894ee2b140fe47" +uuid = "0ac62f75-1d6f-5e53-bd7c-93b484bb37c0" +version = "0.15.1+0" + [[deps.libblastrampoline_jll]] deps = ["Artifacts", "Libdl", "OpenBLAS_jll"] uuid = "8e850b90-86db-534c-a0d3-1478176c7d93" version = "5.1.1+0" +[[deps.libfdk_aac_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "daacc84a041563f965be61859a36e17c4e4fcd55" +uuid = "f638f0a6-7fb0-5443-88ba-1cc74229b280" +version = "2.0.2+0" + +[[deps.libpng_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Zlib_jll"] +git-tree-sha1 = "94d180a6d2b5e55e447e2d27a29ed04fe79eb30c" +uuid = "b53b4c65-9356-5827-b1ea-8c7a1a84506f" +version = "1.6.38+0" + +[[deps.libvorbis_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Ogg_jll", "Pkg"] +git-tree-sha1 = "b910cb81ef3fe6e78bf6acee440bda86fd6ae00c" +uuid = "f27f6e37-5d2b-51aa-960f-b287f2bc3b7a" +version = "1.3.7+1" + [[deps.nghttp2_jll]] deps = ["Artifacts", "Libdl"] uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d" @@ -1218,3 +1662,21 @@ version = "1.48.0+0" deps = ["Artifacts", "Libdl"] uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0" version = "17.4.0+0" + +[[deps.x264_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "4fea590b89e6ec504593146bf8b988b2c00922b2" +uuid = "1270edf5-f2f9-52d2-97e9-ab00b5d0237a" +version = "2021.5.5+0" + +[[deps.x265_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"] +git-tree-sha1 = "ee567a171cce03570d77ad3a43e90218e38937a9" +uuid = "dfaa095f-4041-5dcd-9319-2fabd8486b76" +version = "3.5.0+0" + +[[deps.xkbcommon_jll]] +deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg", "Wayland_jll", "Wayland_protocols_jll", "Xorg_libxcb_jll", "Xorg_xkeyboard_config_jll"] 
+git-tree-sha1 = "9ebfc140cc56e8c2156a15ceac2f0302e327ac0a" +uuid = "d8fb68d0-12a3-5cfd-a85a-d49703b185fd" +version = "1.4.1+0" diff --git a/test/Project.toml b/test/Project.toml index 906e83c..76c06f2 100644 --- a/test/Project.toml +++ b/test/Project.toml @@ -8,4 +8,5 @@ MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692" MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea" MLJScikitLearnInterface = "5ae90465-5518-4432-b9d2-8a1def2f0cab" NearestNeighborModels = "636a865e-7cf4-491e-846c-de09b730eb36" +Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" diff --git a/test/classification.jl b/test/classification.jl index b71498b..ca90fc4 100644 --- a/test/classification.jl +++ b/test/classification.jl @@ -1,4 +1,5 @@ using MLJ +using Plots # Data: X, y = MLJ.make_blobs(1000, 2, centers=2) @@ -34,6 +35,9 @@ conformal_models = merge(values(available_models[:classification])...) @test !isnothing(conf_model.scores) predict(mach, selectrows(X, test)) + # Plot + plot(mach.model, mach.fitresult, X, y) + end end diff --git a/test/regression.jl b/test/regression.jl index 1d6f77d..6dd4d3b 100644 --- a/test/regression.jl +++ b/test/regression.jl @@ -1,7 +1,8 @@ using MLJ +using Plots # Data: -X, y = MLJ.make_regression(1000, 2) +X, y = MLJ.make_regression(1000, 1) train, test = partition(eachindex(y), 0.8) # Atomic and conformal models: @@ -34,6 +35,9 @@ conformal_models = merge(values(available_models[:regression])...) @test !isnothing(conf_model.scores) predict(mach, selectrows(X, test)) + # Plot + plot(mach.model, mach.fitresult, X, y) + end end diff --git a/tmp.gif b/tmp.gif new file mode 100644 index 0000000..a7dfc81 Binary files /dev/null and b/tmp.gif differ