
WIP: Implement bootstrap #107

Merged 85 commits, Apr 4, 2023
Changes from all commits

85 commits
bbd3272
Initial version of bootstrap class for empirical model
yonatank93 Jan 9, 2023
14fc880
`get_compute_arguments` can return a flat or nested list
yonatank93 Jan 9, 2023
f2aacc1
Add `has_opt_params_bounds` method to `_WrapperCalculator`
Jan 9, 2023
d7f7893
Initial updates on the neural network calculator
Jan 10, 2023
f534814
Initial working script for bootstrap neural network
Jan 10, 2023
290aa11
MOdify the default bootstrap cas generator for empirical model
Jan 12, 2023
f99bef1
Cache the initial parameter guess for empirical model
Jan 12, 2023
09f8184
Add an option to input initial guess in each step
Jan 12, 2023
facfd6d
Reset the parameters for NN and empirical model
Jan 13, 2023
22f01e0
Clean up
Jan 13, 2023
3180f70
Fix the list of residual function when using _WrapperCalculator
yonatank93 Jan 13, 2023
ff39f77
Documentation for bootstrap sampler clas for empirical models
Mar 14, 2023
cb176d3
Initial draft of bootstrap example for empirical model
Mar 14, 2023
26f783d
Test for bootstrap empirical model
Mar 16, 2023
54b844d
BUG: default generator function for empirical model when using multip…
Mar 16, 2023
c6e9c85
Documentation for bootstrap NN class
Mar 16, 2023
10d98a5
Test for bootstrap neural network model
Mar 16, 2023
2c2da19
Clean up and update documentation
Mar 16, 2023
8c01c5d
Work out the compatibility with CalculatorTorchSeparateSpecies
Mar 17, 2023
be23b7e
Finallyze the draft of bootstrap example
Mar 17, 2023
e0b2879
Refactoring
Mar 17, 2023
bb29000
Run pre-commit
Mar 17, 2023
cd5da02
Add callback function to bootstrap empirical model
Mar 20, 2023
97f8c85
Apply changes based on minor feedback
Mar 21, 2023
920b844
DOC: convert to google style
Mar 24, 2023
3e2e234
Add additional argument for default bootstrap dataset generator
Mar 24, 2023
9486269
BUG: Fix get parameter and update parameter
yonatank93 Mar 24, 2023
c3a442e
DOC: Convert to googledoc style
yonatank93 Mar 26, 2023
efd67e4
Fix example for the documentation page
yonatank93 Mar 26, 2023
466d1d3
Add an option to specify callback function for NN case
yonatank93 Mar 26, 2023
6d89512
Also import each bootstrap class for empirical and NN models
yonatank93 Mar 26, 2023
01275df
DOC: Add page about bootstrapping
yonatank93 Mar 26, 2023
4e740b9
Revert back the change to use `torch.Tensor`
yonatank93 Mar 27, 2023
08bb6bc
TST: Tests for retrieving and updating NN model parameters
yonatank93 Mar 27, 2023
4074bb6
DOC: Add the shape of numpy array.
yonatank93 Mar 28, 2023
0b885f7
Remove commands that are not needed
yonatank93 Mar 28, 2023
a0693d2
DOC: Fix typos
yonatank93 Mar 28, 2023
3d0f10c
Add a default random seed and use a local random seed generator
yonatank93 Apr 4, 2023
f5f71c6
Remove pretraining before running bootstrap and added notes about
yonatank93 Apr 4, 2023
1334451
Update how to compute sigma in MagnitudeInverseWeight
yonatank93 Apr 4, 2023
7f2ace3
Update due to `DeprecationWarning`
yonatank93 Apr 4, 2023
73791a8
Initial version of bootstrap class for empirical model
yonatank93 Jan 9, 2023
d14ccdc
`get_compute_arguments` can return a flat or nested list
yonatank93 Jan 9, 2023
2ba05a9
Add `has_opt_params_bounds` method to `_WrapperCalculator`
Jan 9, 2023
59c9b72
Initial updates on the neural network calculator
Jan 10, 2023
1f19d4b
Initial working script for bootstrap neural network
Jan 10, 2023
0f33f41
MOdify the default bootstrap cas generator for empirical model
Jan 12, 2023
fbc38cf
Cache the initial parameter guess for empirical model
Jan 12, 2023
78831f2
Add an option to input initial guess in each step
Jan 12, 2023
6912b93
Reset the parameters for NN and empirical model
Jan 13, 2023
26a7c1c
Clean up
Jan 13, 2023
7609580
Fix the list of residual function when using _WrapperCalculator
yonatank93 Jan 13, 2023
5a38a60
Documentation for bootstrap sampler clas for empirical models
Mar 14, 2023
270e687
Initial draft of bootstrap example for empirical model
Mar 14, 2023
a61f435
Test for bootstrap empirical model
Mar 16, 2023
bdb2406
BUG: default generator function for empirical model when using multip…
Mar 16, 2023
d6655a0
Documentation for bootstrap NN class
Mar 16, 2023
f50f935
Test for bootstrap neural network model
Mar 16, 2023
145cd2c
Clean up and update documentation
Mar 16, 2023
34fc791
Work out the compatibility with CalculatorTorchSeparateSpecies
Mar 17, 2023
65e56b6
Finallyze the draft of bootstrap example
Mar 17, 2023
44c26d9
Refactoring
Mar 17, 2023
2b983ff
Run pre-commit
Mar 17, 2023
7672b6d
Add callback function to bootstrap empirical model
Mar 20, 2023
248e670
Apply changes based on minor feedback
Mar 21, 2023
bc30a4e
DOC: convert to google style
Mar 24, 2023
3ebdab9
Add additional argument for default bootstrap dataset generator
Mar 24, 2023
35ef9e4
BUG: Fix get parameter and update parameter
yonatank93 Mar 24, 2023
d08e432
DOC: Convert to googledoc style
yonatank93 Mar 26, 2023
6d1f464
Fix example for the documentation page
yonatank93 Mar 26, 2023
2322919
Add an option to specify callback function for NN case
yonatank93 Mar 26, 2023
ad9aa34
Also import each bootstrap class for empirical and NN models
yonatank93 Mar 26, 2023
c13382d
DOC: Add page about bootstrapping
yonatank93 Mar 26, 2023
4c96b20
Revert back the change to use `torch.Tensor`
yonatank93 Mar 27, 2023
f59bac8
TST: Tests for retrieving and updating NN model parameters
yonatank93 Mar 27, 2023
e623e38
DOC: Add the shape of numpy array.
yonatank93 Mar 28, 2023
9de3928
Remove commands that are not needed
yonatank93 Mar 28, 2023
f872874
DOC: Fix typos
yonatank93 Mar 28, 2023
9a40764
Add a default random seed and use a local random seed generator
yonatank93 Apr 4, 2023
ad62939
Remove pretraining before running bootstrap and added notes about
yonatank93 Apr 4, 2023
cb84f2c
Update how to compute sigma in MagnitudeInverseWeight
yonatank93 Apr 4, 2023
c7b014c
Update due to `DeprecationWarning`
yonatank93 Apr 4, 2023
4e14f66
Merge branch 'implement_bootstrap' of https://github.com/yonatank93/k…
yonatank93 Apr 4, 2023
0978449
Apply pre-commit
Apr 4, 2023
ad844e6
Build documentation
yonatank93 Apr 4, 2023
3 changes: 3 additions & 0 deletions .gitignore
@@ -27,6 +27,9 @@ tests/echo*
tests/fingerprints/
tmp_*
*_kliff_trained/
tests/uq/*.pkl
tests/uq/*.json
tests/uq/kliff_saved_model

# dataset
Si_training_set_4_configs
Binary file modified docs/source/auto_examples/auto_examples_jupyter.zip
Binary file not shown.
Binary file modified docs/source/auto_examples/auto_examples_python.zip
Binary file not shown.
158 changes: 158 additions & 0 deletions docs/source/auto_examples/example_uq_bootstrap.ipynb
@@ -0,0 +1,158 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n\n# Bootstrapping\n\nIn this example, we demonstrate how to perform uncertainty quantification (UQ) using\nthe bootstrap method. We use a Stillinger-Weber (SW) potential for silicon that is archived\nin OpenKIM_.\n\nFor simplicity, we only set the energy-scaling parameters, i.e., ``A`` and ``lambda``, as\nthe tunable parameters. These parameters will be calibrated to the energies and forces of a\nsmall dataset consisting of 4 compressed and stretched configurations of the diamond silicon\nstructure.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To start, let's first install the SW model::\n\n $ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_006\n\n.. seealso::\n This installs the model and its driver into the ``User Collection``. See\n `install_model` for more information about installing KIM models.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\nimport numpy as np\n\nfrom kliff.calculators import Calculator\nfrom kliff.dataset import Dataset\nfrom kliff.loss import Loss\nfrom kliff.models import KIMModel\nfrom kliff.uq.bootstrap import BootstrapEmpiricalModel\nfrom kliff.utils import download_dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before running bootstrap, we need to define a loss function and train the model. More\ndetailed information about this step can be found in `tut_kim_sw` and\n`tut_params_transform`.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Create the model\nmodel = KIMModel(model_name=\"SW_StillingerWeber_1985_Si__MO_405512056662_006\")\n\n# Set the tunable parameters and the initial guess\nopt_params = {\"A\": [[\"default\"]], \"lambda\": [[\"default\"]]}\n\nmodel.set_opt_params(**opt_params)\nmodel.echo_opt_params()\n\n# Get the dataset\ndataset_path = download_dataset(dataset_name=\"Si_training_set_4_configs\")\n# Read the dataset\ntset = Dataset(dataset_path)\nconfigs = tset.get_configs()\n\n# Create calculator\ncalc = Calculator(model)\n# Only use the forces data\nca = calc.create(configs, use_energy=False, use_forces=True)\n\n# Instantiate the loss function\nresidual_data = {\"normalize_by_natoms\": False}\nloss = Loss(calc, residual_data=residual_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To perform UQ by bootstrapping, the general workflow starts by instantiating\n:class:`~kliff.uq.bootstrap.BootstrapEmpiricalModel`, or\n:class:`~kliff.uq.bootstrap.BootstrapNeuralNetworkModel` if using a neural network\npotential.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Instantiate bootstrap class object\nBS = BootstrapEmpiricalModel(loss, seed=1717)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, we generate some bootstrap compute arguments. This is equivalent to generating\nbootstrap data. Typically, we just need to specify how many bootstrap data samples to\ngenerate. Additionally, if we call ``generate_bootstrap_compute_arguments`` multiple\ntimes, the newly generated data samples will be appended to the previously generated\nsamples. This is also the behavior if we read the data samples from a previously\nexported file.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Generate bootstrap compute arguments\nBS.generate_bootstrap_compute_arguments(100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we will iterate over these bootstrap data samples and train the potential\nusing each data sample. The resulting optimal parameters from each data sample give a\nsingle sample of parameters. By iterating over all data samples, we will obtain an\nensemble of parameters.\n\nNote that the mapping from a bootstrap dataset to the parameters involves an optimization.\nWe suggest using the same mapping, i.e., the same optimizer settings, in each iteration.\nThis includes using the same initial parameter guess. When the loss\nfunction has multiple local minima, we don't want the parameter ensemble to be biased\nby the results of the other optimizations. For a neural network model, we need to reset\nthe initial parameter values, which is done internally.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Run bootstrap\nmin_kwargs = dict(method=\"lm\") # Optimizer setting\ninitial_guess = calc.get_opt_params() # Initial guess in the optimization\nBS.run(min_kwargs=min_kwargs, initial_guess=initial_guess)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The resulting parameter ensemble can be accessed via `BS.samples` as a `np.ndarray`.\nThen we can, for example, plot the distribution of the parameters or propagate the\nerror to the target quantities we want to study.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Plot the distribution of the parameters\nplt.figure()\nplt.plot(*(BS.samples.T), \".\", alpha=0.5)\nparam_names = list(opt_params.keys())\nplt.xlabel(param_names[0])\nplt.ylabel(param_names[1])\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
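The bootstrap data generation described above can be sketched conceptually: each bootstrap "dataset" is formed by drawing N configurations with replacement from the original N-configuration dataset. A minimal, hypothetical illustration with NumPy (this is not the KLIFF API; `generate_bootstrap_compute_arguments` does the analogous resampling over compute arguments internally):

```python
import numpy as np

# Conceptual sketch of bootstrap resampling (not the KLIFF API).
rng = np.random.default_rng(1717)

n_configs = 4    # the Si training set in this example has 4 configurations
n_samples = 100  # number of bootstrap datasets to generate

# Each row holds the configuration indices making up one bootstrap sample;
# repeated indices are expected and are the essence of bootstrapping.
bootstrap_indices = rng.integers(0, n_configs, size=(n_samples, n_configs))
print(bootstrap_indices.shape)  # (100, 4)
```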
123 changes: 123 additions & 0 deletions docs/source/auto_examples/example_uq_bootstrap.py
@@ -0,0 +1,123 @@
"""
.. _tut_bootstrap:

Bootstrapping
=============

In this example, we demonstrate how to perform uncertainty quantification (UQ) using
the bootstrap method. We use a Stillinger-Weber (SW) potential for silicon that is archived
in OpenKIM_.

For simplicity, we only set the energy-scaling parameters, i.e., ``A`` and ``lambda``, as
the tunable parameters. These parameters will be calibrated to the energies and forces of a
small dataset consisting of 4 compressed and stretched configurations of the diamond silicon
structure.
"""


##########################################################################################
# To start, let's first install the SW model::
#
# $ kim-api-collections-management install user SW_StillingerWeber_1985_Si__MO_405512056662_006
#
# .. seealso::
# This installs the model and its driver into the ``User Collection``. See
# :ref:`install_model` for more information about installing KIM models.


import matplotlib.pyplot as plt
import numpy as np

from kliff.calculators import Calculator
from kliff.dataset import Dataset
from kliff.loss import Loss
from kliff.models import KIMModel
from kliff.uq.bootstrap import BootstrapEmpiricalModel
from kliff.utils import download_dataset

##########################################################################################
# Before running bootstrap, we need to define a loss function and train the model. More
# detailed information about this step can be found in :ref:`tut_kim_sw` and
# :ref:`tut_params_transform`.

# Create the model
model = KIMModel(model_name="SW_StillingerWeber_1985_Si__MO_405512056662_006")

# Set the tunable parameters and the initial guess
opt_params = {"A": [["default"]], "lambda": [["default"]]}

model.set_opt_params(**opt_params)
model.echo_opt_params()

# Get the dataset
dataset_path = download_dataset(dataset_name="Si_training_set_4_configs")
# Read the dataset
tset = Dataset(dataset_path)
configs = tset.get_configs()

# Create calculator
calc = Calculator(model)
# Only use the forces data
ca = calc.create(configs, use_energy=False, use_forces=True)

# Instantiate the loss function
residual_data = {"normalize_by_natoms": False}
loss = Loss(calc, residual_data=residual_data)

##########################################################################################
# To perform UQ by bootstrapping, the general workflow starts by instantiating
# :class:`~kliff.uq.bootstrap.BootstrapEmpiricalModel`, or
# :class:`~kliff.uq.bootstrap.BootstrapNeuralNetworkModel` if using a neural network
# potential.


# Instantiate bootstrap class object
BS = BootstrapEmpiricalModel(loss, seed=1717)

##########################################################################################
# Then, we generate some bootstrap compute arguments. This is equivalent to generating
# bootstrap data. Typically, we just need to specify how many bootstrap data samples to
# generate. Additionally, if we call ``generate_bootstrap_compute_arguments`` multiple
# times, the newly generated data samples will be appended to the previously generated
# samples. This is also the behavior if we read the data samples from a previously
# exported file.


# Generate bootstrap compute arguments
BS.generate_bootstrap_compute_arguments(100)

##########################################################################################
# Finally, we will iterate over these bootstrap data samples and train the potential
# using each data sample. The resulting optimal parameters from each data sample give a
# single sample of parameters. By iterating over all data samples, we will obtain an
# ensemble of parameters.
#
# Note that the mapping from a bootstrap dataset to the parameters involves an
# optimization. We suggest using the same mapping, i.e., the same optimizer settings, in
# each iteration. This includes using the same initial parameter guess. When the loss
# function has multiple local minima, we don't want the parameter ensemble to be biased
# by the results of the other optimizations. For a neural network model, we need to
# reset the initial parameter values, which is done internally.


# Run bootstrap
min_kwargs = dict(method="lm") # Optimizer setting
initial_guess = calc.get_opt_params() # Initial guess in the optimization
BS.run(min_kwargs=min_kwargs, initial_guess=initial_guess)

##########################################################################################
# The resulting parameter ensemble can be accessed via ``BS.samples`` as a ``np.ndarray``.
# Then we can, for example, plot the distribution of the parameters or propagate the
# error to the target quantities we want to study.


# Plot the distribution of the parameters
plt.figure()
plt.plot(*(BS.samples.T), ".", alpha=0.5)
param_names = list(opt_params.keys())
plt.xlabel(param_names[0])
plt.ylabel(param_names[1])
plt.show()

##########################################################################################
# .. _OpenKIM: https://openkim.org
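Beyond plotting, the ensemble can be summarized and propagated to derived quantities. A minimal sketch, using a fabricated stand-in array in place of the real ``BS.samples`` (actual fitted values depend on the optimization, so the numbers and the derived quantity below are purely illustrative):

```python
import numpy as np

# Stand-in for the (n_samples, n_params) array BS.samples would hold after
# BS.run(); these numbers are fabricated for illustration only.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[15.3, 45.5], scale=[0.5, 1.0], size=(100, 2))

# Ensemble mean and standard deviation of each parameter
means = samples.mean(axis=0)
stds = samples.std(axis=0, ddof=1)

# Propagate to a derived quantity by evaluating it on every ensemble member,
# e.g. the (hypothetical) product of the two parameters, then summarizing
# the spread of the resulting values.
derived = samples[:, 0] * samples[:, 1]
print("mean:", derived.mean(), "std:", derived.std(ddof=1))
```

This member-by-member evaluation avoids linearization: any function of the parameters inherits its full bootstrap distribution.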
1 change: 1 addition & 0 deletions docs/source/auto_examples/example_uq_bootstrap.py.md5
@@ -0,0 +1 @@
d16579e397f3c9e5d2537a623bd65313