A fairly large reworking to allow entire method objects to be returned #434

Merged: 11 commits, Nov 9, 2024
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,20 @@

 ## [0.2.2] - 2024-XX-XX

+**Documentation**
+
+- Various fixes in documentation, including documenting returned fits.
+
+**Breaking changes**
+
+- The `return_posteriors` argument has been removed and replaced with `return_fit`.
+  An instance of one of two previously internal classes, `ExpectationPropagation`
+  and `BeliefPropagation`, are now returned when `return_fit=True`, and posteriors can
+  be obtained using `fit.node_posteriors()`.
+
+- Topology-only dating (setting `mutation_rate=None`) has been removed for tree sequences
+  of more than one tree, as tests have found that span-weighting the conditional coalescent
+  causes substantial bias.

 ## [0.2.1] - 2024-07-31

4 changes: 2 additions & 2 deletions docs/Makefile
@@ -1,7 +1,7 @@
 # Need to set PYTHONPATH so that we pick up the local tsdate
 PYPATH=$(shell pwd)/../
 TSDATE_VERSION:=$(shell PYTHONPATH=${PYPATH} \
-	python3 -c 'import tsdate; print(tsdate.__version__.split("+")[0])')
+	python -c 'import tsdate; print(tsdate.__version__.split("+")[0])')

 BUILDDIR = _build

@@ -16,4 +16,4 @@ dist:
 	PYTHONPATH=${PYPATH} ./build.sh

 clean:
-	rm -fR $(BUILDDIR)
+	rm -fR $(BUILDDIR)
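The `TSDATE_VERSION` line in the Makefile keeps only the part of `tsdate.__version__` that precedes the first `+`. A short sketch of what that expression does; the version strings below are illustrative assumptions, not real tsdate releases, and `public_version` is a hypothetical name.

```python
def public_version(version: str) -> str:
    """Drop everything from the first '+' onwards (the local version segment)."""
    return version.split("+")[0]


# Example strings are made up for illustration.
print(public_version("0.2.2.dev3+g1a2b3c4"))  # → 0.2.2.dev3
print(public_version("0.2.1"))                # → 0.2.1
```

A string without a `+` passes through unchanged, so the same expression works for both development and release builds.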
1 change: 1 addition & 0 deletions docs/_config.yml
@@ -35,6 +35,7 @@ html:
 sphinx:
   extra_extensions:
   - sphinx.ext.autodoc
+  - sphinx.ext.mathjax
   - sphinx.ext.autosummary
   - sphinx.ext.todo
   - sphinx.ext.viewcode
19 changes: 17 additions & 2 deletions docs/python-api.md
@@ -32,13 +32,28 @@ This page provides formal documentation for the _tsdate_ Python API.
 .. autofunction:: tsdate.maximization
 ```

+## Underlying fit objects
+
+Instances of the classes below are returned by setting `return_fit=True`
+when dating. The fits can be inspected to obtain more detailed results than
+may be present in the returned tree sequence and its metadata. The classes
+are not intended to be instantiated directly.
+
+```{eval-rst}
+.. autoclass:: tsdate.discrete.BeliefPropagation()
+   :members:
+
+.. autoclass:: tsdate.variational.ExpectationPropagation()
+   :members:
+```
+
 ## Prior and Time Discretisation Options

 ```{eval-rst}
 .. autofunction:: tsdate.build_prior_grid
 .. autofunction:: tsdate.build_parameter_grid
-.. autoclass:: tsdate.base.NodeGridValues
-.. autodata:: tsdate.base.DEFAULT_APPROX_PRIOR_SIZE
+.. autoclass:: tsdate.node_time_class.NodeTimeValues
+.. autodata:: tsdate.prior.DEFAULT_APPROX_PRIOR_SIZE
 ```

 ## Variable population sizes
21 changes: 9 additions & 12 deletions docs/usage.md
@@ -175,23 +175,20 @@ running _tsdate_:
    The metadata values (currently saved as `mn` and `vr`) need not be constrained by
    the topology of the tree sequence, and should be used in preference
    e.g. to `nodes_time` and `mutations_time` when evaluating the accuracy of _tsdate_.
-2. The `return_posteriors` parameter can be used when calling {func}`tsdate.date`, which
-   then returns both the dated tree sequence and a dictionary specifying the posterior
-   distributions.
-
-<!--
-The returned posterior is a dictionary keyed by integer node ID, with values representing the
-probability distribution of times. This can be read in to a [pandas](https://pandas.pydata.org)
-dataframe:
+2. The `return_fit` parameter can be used when calling {func}`tsdate.date`, which
+   then returns both the dated tree sequence and a fit object. This object can then be
+   queried for the unconstrained posterior distributions using e.g. `.node_posteriors()`
+   which can be read in to a [pandas](https://pandas.pydata.org) dataframe, as below:

 ```{code-cell} ipython3
 import pandas as pd
-redated_ts, posteriors = tsdate.date(
-    sim_ts, mutation_rate=1e-6, method="inside_outside", return_posteriors=True)
-posteriors_df = pd.DataFrame(posteriors)
-posteriors_df.head() # Show the dataframe
+redated_ts, fit = tsdate.date(
+    sim_ts, mutation_rate=1e-6, return_fit=True)
+posteriors_df = pd.DataFrame(fit.node_posteriors())  # mutation_posteriors() also available
+posteriors_df.tail()  # Show the dataframe
 ```

+<!--
 Since we are using a {ref}`sec_methods_discrete_time` method, each node
 (numbered column of the dataframe) is associated with a vector of probabilities
 that sum to one: each cell gives the probability that the time of the node
4 changes: 1 addition & 3 deletions tests/test_accuracy.py
@@ -108,9 +108,7 @@ def test_basic(
         assert sim_mutations_parameters["command"] == "sim_mutations"
         mu = sim_mutations_parameters["rate"]

-        dts, posteriors = tsdate.inside_outside(
-            ts, population_size=Ne, mutation_rate=mu, return_posteriors=True
-        )
+        dts = tsdate.inside_outside(ts, population_size=Ne, mutation_rate=mu)
         # make sure we can read node metadata - old tsdate versions didn't set a schema
         if dts.table_metadata_schemas.node.schema is None:
             tables = dts.dump_tables()
7 changes: 4 additions & 3 deletions tests/test_cli.py
@@ -232,10 +232,9 @@ def test_no_output_variational_gamma(self, tmp_path, capfd):

     @pytest.mark.parametrize(("flag", "log_status"), logging_flags.items())
     def test_verbosity(self, tmp_path, caplog, flag, log_status):
-        popsize = 10000
         ts = msprime.simulate(
             10,
-            Ne=popsize,
+            Ne=10000,
             mutation_rate=1e-8,
             recombination_rate=1e-8,
             length=2e4,
@@ -246,7 +245,9 @@ def test_verbosity(self, tmp_path, caplog, flag, log_status):
         caplog.set_level(getattr(logging, log_status))
         # either tsdate preprocess or tsdate date (in_out method has debug asserts)
         self.run_tsdate_cli(tmp_path, ts, flag, cmd="preprocess")
-        self.run_tsdate_cli(tmp_path, ts, f"-n 10 --method inside_outside {flag}")
+        self.run_tsdate_cli(
+            tmp_path, ts, f"--mutation-rate 1e-8 --rescaling-intervals 0 {flag}"
+        )
         assert log_status in caplog.text

     @pytest.mark.parametrize(
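The verbosity test above turns the parametrized `log_status` string into a numeric level with `getattr(logging, log_status)`. A small sketch of that lookup, using a made-up mapping: `example_flags` is hypothetical, since the real `logging_flags` dict is not shown in this diff.

```python
import logging

# Hypothetical flag-to-level-name mapping, for illustration only.
example_flags = {"-v": "INFO", "-vv": "DEBUG"}

for flag, log_status in example_flags.items():
    # Level names are module attributes holding the numeric level.
    level = getattr(logging, log_status)
    print(flag, log_status, level)
# prints:
# -v INFO 20
# -vv DEBUG 10
```

Because the level name is also the text that appears in formatted log records, the test can reuse the same string in `assert log_status in caplog.text`.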