Add the relation between a Half-Cauchy and an Inverse-Gamma scale mix… #66

Draft · wants to merge 1 commit into main
Conversation

@rlouf (Member) commented Sep 25, 2022

Related to #49 #64

Here are a few important guidelines and requirements to check before your PR can be merged:

  • There is an informative high-level description of the changes.
  • The description and/or commit message(s) references the relevant GitHub issue(s).
  • pre-commit is installed and set up.
  • The commit messages follow these guidelines.
  • The commits correspond to relevant logical changes, and there are no commits that fix changes introduced by other commits in the same branch/PR. If your commit description starts with "Fix...", then you're probably making this mistake.
  • There are tests covering the changes introduced in the PR.

Don't worry, your PR doesn't need to be in perfect order to submit it. As development progresses and/or reviewers request changes, you can always rewrite the history of your feature/PR branches.

If your PR is an ongoing effort and you would like to involve us in the process, simply make it a draft PR.

@codecov (bot) commented Sep 26, 2022

Codecov Report

Base: 97.35% // Head: 97.42% // Increases project coverage by +0.06% 🎉

Coverage data is based on head (7431376) compared to base (5b5ae2f).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   97.35%   97.42%   +0.06%     
==========================================
  Files           9       10       +1     
  Lines         606      621      +15     
  Branches       63       63              
==========================================
+ Hits          590      605      +15     
  Misses          5        5              
  Partials       11       11              
Impacted Files Coverage Δ
aemcmc/transforms.py 100.00% <ø> (ø)
aemcmc/scale_mixtures.py 100.00% <100.00%> (ø)

☔ View full report at Codecov.

@rlouf (Member, Author) commented Sep 26, 2022

While the forward test passes, this shouldn't be merged.

For the backward transformation, which is many-to-one, I need to select an rng, dtype and size for the Half-Cauchy. I chose those of the inner Inverse-Gamma (arbitrarily for now), so I don't need to consider the parameters of the outer Inverse-Gamma. I am not sure whether the following way of writing this in miniKanren is correct to support both directions:

lany(
    # either unify the outer Inverse-Gamma's rng with the Half-Cauchy's,
    # or leave it unconstrained
    eq(rng_outer_lv, rng_lv),
    succeed,
),
lany(
    eq(size_outer_lv, size_lv),
    succeed,
),
lany(
    eq(type_idx_outer_lv, type_idx_lv),
    succeed,
)
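For reference, here is a minimal self-contained sketch of what this pattern does (a_lv and b_lv are illustrative logic variables, not the ones above, and I'm assuming succeed is importable from kanren.core as used in the snippet): the succeed branch makes each eq constraint optional, so the goal admits both a unified and an unconstrained answer.

from kanren import eq, lany, run, var
from kanren.core import succeed  # same `succeed` as in the snippet above

a_lv, b_lv = var(), var()

# With b_lv bound to 1, lany yields one answer per branch: the eq branch
# unifies a_lv with 1; the succeed branch leaves a_lv fresh.
print(run(0, a_lv, eq(b_lv, 1), lany(eq(a_lv, b_lv), succeed)))
# -> something like (1, ~_1); the second answer leaves a_lv unconstrained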

@rlouf (Member, Author) commented Sep 26, 2022

More concerning to me is how to handle the rngs in these one-to-many relationships. I currently set the rng of both InverseGammaRVs to the Half-Cauchy's, but that can't be correct, and I don't see how it could play nicely with the updates system. One alternative is setting it to aesara.shared(np.random.default_rng()), but then the result becomes non-deterministic even if the caller sets a seed.
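To make the determinism concern concrete, here is a minimal sketch (the names are illustrative, not from this PR):

import aesara
import aesara.tensor as at
import numpy as np

# An rng created inside a rewrite with an unseeded default_rng() is seeded
# from the OS, so its draws change between process runs no matter what seed
# the caller used for their own RandomStream.
rewrite_rng = aesara.shared(np.random.default_rng())
z = at.random.normal(0, 1, rng=rewrite_rng)
print(aesara.function([], z)())  # different value on every fresh run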

Let's walk through a simpler example and consider another kind of expansion, the convolution of two normal distributions. In particular, consider the relation that represents the sum of two independent normally-distributed random variables:

$$ \begin{equation*} \frac{ X \sim \operatorname{N}\left(\mu_x, \sigma_x^2\right), \quad Y \sim \operatorname{N}\left(\mu_y, \sigma_y^2\right), \quad X + Y = Z }{ Z \sim \operatorname{N}\left(\mu_x + \mu_y, \sigma_x^2 + \sigma_y^2\right) } \end{equation*} $$
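(As a quick sanity check of the relation itself, in plain NumPy with arbitrarily chosen parameter values and sample size:)

import numpy as np

rng = np.random.default_rng(0)
m_x, s2_x, m_y, s2_y = 1.0, 2.0, -0.5, 0.5

# Sample X + Y and compare with the claimed parameters of Z
z = rng.normal(m_x, np.sqrt(s2_x), 500_000) + rng.normal(m_y, np.sqrt(s2_y), 500_000)
print(z.mean(), z.var())  # ~0.5 and ~2.5, i.e. m_x + m_y and s2_x + s2_y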

Let's print the current etuplized version of the contracted expression:

import aesara.tensor as at
from etuples import etuplize
from pprint import pprint


srng = at.random.RandomStream(0)
mu_x = at.scalar()
sigma2_x = at.scalar()
mu_y = at.scalar()
sigma2_y = at.scalar()

X_rv = srng.normal(mu_x + mu_y, sigma2_x + sigma2_y)
pprint(etuplize(X_rv))
# e(
#   e(aesara.tensor.random.basic.NormalRV, 'normal', 0, (0, 0), 'floatX', False),
#   RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FD6BBEA6960>),
#   TensorConstant{[]},
#   TensorConstant{11},
#   e(
#     e(
#       aesara.tensor.elemwise.Elemwise,
#       <aesara.scalar.basic.Add at 0x7fd6c0e014e0>,
#       <frozendict {}>),
#     <TensorType(float64, ())>,
#     <TensorType(float64, ())>),
#   e(
#     e(
#       aesara.tensor.elemwise.Elemwise,
#       <aesara.scalar.basic.Add at 0x7fd6c0e014e0>,
#       <frozendict {}>),
#     <TensorType(float64, ())>,
#     <TensorType(float64, ())>))

And the expanded version of this expression:

Y_rv = srng.normal(mu_x, sigma2_x) + srng.normal(mu_y, sigma2_y)
pprint(etuplize(Y_rv))
# e(
#   e(
#     aesara.tensor.elemwise.Elemwise,
#     <aesara.scalar.basic.Add at 0x7fd6c0e014e0>,
#     <frozendict {}>),
#   e(
#     e(
#       aesara.tensor.random.basic.NormalRV,
#       'normal',
#       0,
#       (0, 0),
#       'floatX',
#       False),
#     RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FD6BD33A880>),
#     TensorConstant{[]},
#     TensorConstant{11},
#     <TensorType(float64, ())>,
#     <TensorType(float64, ())>),
#   e(
#     e(
#       aesara.tensor.random.basic.NormalRV,
#       'normal',
#       0,
#       (0, 0),
#       'floatX',
#       False),
#     RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FD6BD33BE60>),
#     TensorConstant{[]},
#     TensorConstant{11},
#     <TensorType(float64, ())>,
#     <TensorType(float64, ())>))

To express the relation between these two expressions with miniKanren, we start by writing the corresponding patterns with logic variables:

from etuples import etuple
from kanren import var

mu_x, mu_y, sigma2_x, sigma2_y = var(), var(), var(), var()  # these logic variables shadow the Aesara scalars above

rng, size, dtype = var(), var(), var()
X_et = etuple(
    etuplize(at.random.normal),
    rng,
    size,
    dtype,
    etuple(
        etuplize(at.add),
        mu_x,
        mu_y
    ),
    etuple(
        etuplize(at.add),
        sigma2_x,
        sigma2_y,
    )
)

rng_x, size_x, dtype_x = var(), var(), var()
rng_y, size_y, dtype_y = var(), var(), var()
Y_et = etuple(
    etuplize(at.add),
    etuple(
        etuplize(at.random.normal),
        rng_x,
        size_x,
        dtype_x,
        mu_x,
        sigma2_x
    ),
    etuple(
        etuplize(at.random.normal),
        rng_y,
        size_y,
        dtype_y,
        mu_y,
        sigma2_y
    )
)

The problem here stems from the fact that the contracted version needs one rng state while the expanded one needs two distinct rng states. The expanded -> contracted transformation is possible, although it requires choosing which rng state to keep. We will keep the first one, since it was created first:

from kanren import lall, eq

# Our goal is a function of (contract_expr, expand_expr).
# We simplify the problem by assuming all of the distribution parameters
# have the same `dtype` and `size`.
lall(
    eq(contract_expr, X_et),
    eq(expand_expr, Y_et),
    eq(rng, rng_x),  # We pick the first rng state
    eq(size, size_x),
    eq(dtype, dtype_x),
)

However, this will not work for the contracted -> expanded transformation, as rng_y, dtype_y and size_y will not be assigned a value! The dtype and size add unrelated complications, so we will set them aside for the rest of the discussion and only consider the following question: what value should we choose for rng_y?
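(To see the failure concretely, here is a sketch reusing the names defined above; I'm assuming the unification of etuplize(X_rv) with the X_et pattern succeeds as written, and the exact reification of fresh variables may differ:)

from kanren import run

# Unify a concrete contracted graph with the X_et pattern, then reify Y_et.
(expanded,) = run(
    1,
    Y_et,
    eq(etuplize(X_rv), X_et),
    eq(rng, rng_x),
    eq(size, size_x),
    eq(dtype, dtype_x),
)
# `expanded` still contains fresh variables in place of rng_y, size_y and
# dtype_y, so it cannot be evaluated back into an Aesara graph.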

We know that we need to advance the state of the random number generator, or the computations downstream will be wrong. We also know that we cannot reset it to np.random.default_rng(), or we lose determinism when the user seeds the graph. It is not clear to me what options we have left.

Now imagine that we have a symbolic operator RNGUpdate, with at.random.update = RNGUpdate(), so that we can update the rng state with:

rng_new = at.random.update(rng)

which corresponds to the etuple:

etuple(etuplize(at.random.update), rng)

We can now re-write Y_et as follows:

rng_x, size_x, dtype_x = var(), var(), var()
size_y, dtype_y = var(), var()
Y_et = etuple(
    etuplize(at.add),
    etuple(
        etuplize(at.random.normal),
        rng_x,
        size_x,
        dtype_x,
        mu_x,
        sigma2_x
    ),
    etuple(
        etuplize(at.random.normal),
        etuple(
            etuplize(at.random.update),
            rng_x
        ),
        size_y,
        dtype_y,
        mu_y,
        sigma2_y
    )
)

The following goal (leaving concerns of dtype and size aside for this discussion) now works:

from kanren import lall, eq

# Our goal is a function of (contract_expr, expand_expr).
# We simplify the problem by assuming all of the distribution parameters
# have the same `dtype` and `size`.
lall(
    eq(contract_expr, X_et),
    eq(expand_expr, Y_et),
    eq(rng, rng_x),
)
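(For symmetry with the failure sketch above, and still hypothetical since at.random.update does not exist yet; size_y and dtype_y remain set aside:)

(expanded,) = run(1, Y_et, eq(etuplize(X_rv), X_et), eq(rng, rng_x))
# No fresh rng variable remains: the second normal's rng is the etuple
# (at.random.update, rng_x), fully determined once rng_x is bound.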

The solution may be to make the behind-the-scenes updating logic in Aesara explicit during etuplization (how?), like we are planning to do with multiple outputs in aesara-devs/aesara#1036. But, as in that case, this feels like the symptom of something running deeper in Aesara's internals.

@brandonwillard (Member) commented Sep 27, 2022

> More concerning to me is how to handle the rngs in these one-to-many relationships. I currently set the rng of both InverseGammaRVs to the Half-Cauchy's, but that can't be correct, and I don't see how it could play nicely with the updates system. One alternative is setting it to aesara.shared(np.random.default_rng()), but then the result becomes non-deterministic even if the caller sets a seed.

First, let's recap the situation at a high level.

If the context requires a "strong" equivalence between a sub-graph and its rewritten form (e.g. sampling one must produce the exact same values as the other), then a lot of these rewrites are simply not appropriate (see aesara-devs/aesara#209); however, when it comes to generating samplers—as we're doing via these rewrites—we're operating under considerably weaker equivalence assumptions, and these rewrites are generally appropriate. In this case, we only need consistency/reproducibility of the sampler's results between runs.

> Let's print the current etuplized version of the contracted expression:

I think the kanren/etuple aspects may only end up complicating things, so let's consider the Aesara-only approach(es) to this first.

In some cases (e.g. ffbs_step), we've required that a user pass a RandomStream or seed value to the function(s) that generate a sampler. We may need to work this kind of thing into our general rewrites (e.g. some kind of global RandomStream that's accessible by rewrites), or start using the symbolic RNG constructors—i.e. use the original graphs' RNGs to seed new ones and effectively connect the user-specified state/seed to rewrite-generated ones.

The latter approach sounds like the best option right now, so let's see what we can do along those lines first.

@rlouf (Member, Author) commented Sep 27, 2022

> In this case, we only need consistency/reproducibility of the sampler's results between runs.

Agreed.

> I think the kanren/etuple aspects may only end up complicating things,

They do, but they also help a lot when thinking these things through.

> or start using the symbolic RNG constructors—i.e. use the original graphs' RNGs to seed new ones and effectively connect the user-specified state/seed to rewrite-generated ones.

Yes, that was the idea. We may be able to make all of this happen at the etuplization/evaluation level, in the same way we discussed adding an nth operator in the etuplized version of Ops.

> The latter approach sounds like the best option right now, so let's see what we can do along those lines first.

👍
