Releases: bodkan/slendr
slendr 1.0.0
- A massive update introduces the possibility of simulating non-neutral slendr models with slim(). This update is too big to describe in the changelog -- for more information and motivation, see the description in the associated PR, or the new extensive vignette on the topic. (PR #155)
Implementing changes for the v1.0 release (particularly the support for non-neutral models) required changing slendr internals at a very low level across the whole codebase. Feedback on this functionality, missing features, and bug reports are highly appreciated!
Other changes:
- The behavior previously implemented via the output = and ts = arguments of slim() (and msprime()) has been changed to facilitate more straightforward handling of output paths in user-defined SLiM extensions and other packages leveraging slendr for inference. The slim() and msprime() function interfaces are now simplified in the following way:
slim(): the ts argument is now logical. TRUE switches on tree-sequence recording, FALSE switches it off. If tree-sequence recording is on (the default setting), the function automatically returns a tree-sequence R object. If users want to save it to a custom location, they should use the function ts_write() on the returned tree-sequence object. If customized output files are to be produced via user-defined extension scripts, those scripts can use a slendr/SLiM constant PATH, which is always available in the built-in SLiM script and which can be set from R via slim(..., path = <path to a directory>). In that case, the slim() function always returns that path back. Crucially, in this case slim() will not return a tree sequence object, but that object can be loaded by ts_read("<path to a directory>/slim.trees"). In other words, nothing changes for the usual SLiM-based slendr workflow, but for models generating custom output files, a small amount of work is needed to load the tree sequence -- the tree-sequence file outputs are therefore treated exactly the same way as non-tree-sequence user-defined output files. As a result of these changes, slim() no longer accepts a load = TRUE|FALSE argument.
The above is implemented in PR #157.
- ts_genotypes() now works even for non-slendr tree sequences, which do not have slendr individual names of samples in the ts_nodes() output. (#d348ec)
- Due to frequent issues with installation of Python dependencies of slendr in a completely platform-independent way (in the latest instance this being conda installation of pyslim crashing on M-architecture Macs), setup_env() now only uses conda to install msprime and tskit -- pyslim and tspop are always installed via pip regardless of whether setup_env(pip = FALSE) (the default) or setup_env(pip = TRUE) is used. (#408948)
- A new function extract_parameters() can extract parameters of either a compiled slendr model object or a tree sequence simulated from a slendr model. This can be useful particularly for simulation-based inferences where model parameters are often drawn from random distributions and there's a need to know which parameters of a model (split times, gene-flow rates, etc.) have been drawn. (#3632bd0)
- compile_model() now allows specifying a description of the time units used while scheduling slendr model events. This has a purely descriptive purpose -- in particular, these units are used in model plotting functions, etc. (#9b5b7ea0)
- The slim_script argument of compile_model() has been replaced by the extension argument, which allows users to provide their custom-designed SLiM snippets for extending the behavior of slendr's SLiM simulation engine. (#d11ac7)
- The sim_length argument of compile_model() has been removed following a long period of deprecation. (#12da50)
- When a named list of samples is used as the X input to ts_f4ratio(), the name of the element is used in the X column of the resulting data frame. (#0571a6)
- ts_table() can now extract the "sites" tskit table as ts_table(ts, "sites"). (#e708f2)
- When applied to slendr tree sequences, ts_recapitate() no longer issues the warning "TimeUnitsMismatchWarning: The initial_state has time_units=ticks but time is measured in generations in msprime. This may lead to significant discrepancies between the timescales. If you wish to suppress this warning, you can use, e.g., warnings.simplefilter('ignore', msprime.TimeUnitsMismatchWarning)". For slendr tree sequences, ticks are the same thing as generations anyway. (#43c45083)
- Running slim(..., method = "gui") was broken due to recent changes to make slendr work on Windows. A path to a generated SLiM script executed in SLiMgui was incorrectly normalized. Non-SLiMgui runs were not affected. (#ccae1df)
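The new slim() output handling described under "Other changes" above can be sketched roughly as follows. This is a minimal illustration, not an official recipe. The helper name simulate_to_dir(), its default parameter values, and the up-front creation of the output directory are hypothetical. Only slim(), its path = argument, ts_read(), and the slim.trees file name are taken from the notes above, and the model = argument of ts_read() is an assumption about its signature.

```r
# Hypothetical helper (not part of slendr): run a SLiM simulation that writes
# its output to a custom directory, then load the tree sequence manually.
simulate_to_dir <- function(model, out_dir,
                            sequence_length = 1e5,
                            recombination_rate = 1e-8) {
  # Assumption: create the directory up front in case slim() expects it to exist
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

  # With path = set, slim() returns the path rather than a tree-sequence object
  slendr::slim(
    model,
    sequence_length = sequence_length,
    recombination_rate = recombination_rate,
    path = out_dir
  )

  # The tree sequence is treated like any other output file in that directory
  # (the model = argument is an assumption about the ts_read() signature)
  slendr::ts_read(file.path(out_dir, "slim.trees"), model = model)
}
```

A call might look like ts <- simulate_to_dir(model, file.path(tempdir(), "slim_output")), where model is a compiled slendr model. In the default workflow without path =, calling slim() and then ts_write() on the returned object remains sufficient.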
slendr 0.9.1
- A new helper function get_env() now returns the name of the built-in slendr Python environment (without activating it). (#162ccc)
- clear_env() now has a new argument all = (TRUE|FALSE) which allows deleting all slendr Python environments. Previously, the function always removed only the most recent environment, which led to the accumulation of a potentially large number of slendr environments over time. (#8707b9)
- plot_model() has a new optional argument samples = which will -- when set to a result of a sampling schedule created by schedule_sampling() -- visualize the counts of samples to be recorded at each given time-point. (#d72ac5)
- The msprime dependency of slendr has been updated to version 1.3.1. As a result, setup_env() will have to be re-run to update the internal slendr Python environment. (#dcb83d)
slendr 0.9.0
- Full support for running SLiM and msprime simulations with slendr and for analyzing tree sequences using its tskit interface on Windows has been implemented. Please note that the Windows support is still rather experimental -- the internal slendr test suite currently assumes that SLiM has been installed using the msys2 system as described in section 2.3.1 of the SLiM manual, and other means of installing SLiM (such as via conda) might require additional adjustments. A fallback option in the form of the slim_path = argument of the slim() function can be used in non-standard SLiM installation circumstances. For convenience, please add the path to the directory containing the slim.exe binary to the PATH variable by editing the C:/Users/<username>/Documents/.Renviron file accordingly. See the relevant section on Windows installation in the slendr documentation for additional information. Feedback on the Windows functionality and bug reports are highly appreciated via GitHub issues! Many thanks to @GKresearch and @rdinnager for their huge help in making the Windows port happen! (PR #149)
- A trivial change has been made to slendr's SLiM back-end script fixing the issue introduced in a SLiM 4.1 upgrade (see changelog for version 0.8.1 below). This is not expected to lead to different simulation outputs between the two versions of slendr (0.8.2 vs 0.8.1) or SLiM (4.1 vs 4.0.1) used. (PR #148)
- The msprime internal dependency of slendr was updated to 1.3.0, and Python to 3.12. As a result, after loading slendr, users will be prompted to re-run setup_env() to make sure that the dedicated slendr Python environment is fully updated. At the same time, this prevents a failing installation on (at the very least) M1 macOS using pip. (#5ce212, #a210d4)
slendr 0.8.1
- Fixed an issue of apparent contradiction in time direction in models where range expansion was scheduled within some time interval together with associated "locked-in" changes in population size over that time interval. (#d2a29e)
- The introduction of tspop, which is only installable via pip (see changelog for the previous version), caused GLIBCXX-related errors between conda and pip dependencies related to the pandas Python package. To work around this issue, setup_env() no longer installs pandas from conda regardless of the setting of the pip = TRUE|FALSE parameter. Instead, pandas is installed via pip in a single step when tspop is being installed. (#cbe960)
WARNING: SLiM 4.1, which has just been released, includes a couple of backwards-incompatible changes related to the implementation of spatial maps which prevent the current version of slendr's slim() function from working correctly. If you rely on the functionality provided by the slim() function, you will have to use SLiM 4.0. (Note that if you want to have multiple versions of SLiM on your system, you can either use the slim_path = argument of slim() or specify the $PATH to the required version of SLiM in your ~/.Renviron file just like you do under normal circumstances.) Porting slendr for SLiM 4.1 is being worked on.
slendr 0.8.0
- In order to support the new ts_tracts() function backed by the tspop module (see the item below), a new slendr Python environment is required. As such, users will have to run setup_env() to get all the required Python dependencies. (#b5330c)
- Experimental support for the tspop link-ancestors algorithm for detection of ancestry tracts in the form of a new slendr function ts_tracts(). Only works on slendr-generated msprime tree sequences and "pure" msprime and SLiM tree sequences (not slendr-generated SLiM tree sequences). Please use with caution until things settle down a bit and the new functionality is more extensively tested. Feedback is appreciated! (PR #145)
- Updated Python dependencies (bugfix pyslim release v1.0.4 and tskit v0.5.6, the latter due to a broken jsonschema dependency of tskit). (#001ee5)
- Experimental support for manually created spatial tree sequences. (PR #144)
slendr 0.7.2
- A new function ts_names() has been added, avoiding the need for the extremely frequent (and, unfortunately, cumbersome) trick of getting named lists of individual symbolic names via ts_samples(ts) %>% split(., .[[split]]) %>% lapply(`[[`, "name"), which is very confusing for all but the more experienced R users. (#7db6ea)
- Fixed broken concatenation of symbolic sample names in tree-sequence statistic functions, when those were provided as unnamed single-element lists of character vectors. (#b3c650)
- plot_model() now has an argument file =, making it possible to save a visualization of a model without actually opening a plotting device. This can be useful particularly while working on a remote server, in order to avoid the often slow X11 rendering. (#e60078)
- plot_model() now has an argument order =, which allows overriding the default in-order ordering of populations along the x-axis. (#7a10ea)
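To illustrate what the split/lapply idiom replaced by ts_names() actually computes, here is a small base-R sketch. The data frame is a made-up stand-in for the table returned by ts_samples(). Its column names and values are assumptions for illustration only, and no slendr installation is needed to run it.

```r
# Mock stand-in for the output of ts_samples(ts); all values are invented
samples <- data.frame(
  name = c("AFR_1", "AFR_2", "EUR_1"),
  time = c(0, 0, 0),
  pop  = c("AFR", "AFR", "EUR"),
  stringsAsFactors = FALSE
)

# Column by which the sample names should be grouped
split_by <- "pop"

# Split the table by population, then pull the "name" column out of each piece
by_pop <- lapply(split(samples, samples[[split_by]]), `[[`, "name")

# by_pop is a named list with one character vector of sample names per population
print(by_pop)
```

With a real tree sequence, ts_names() is meant to return this kind of named list directly, without the intermediate split and lapply steps.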
slendr 0.7.1
- Starting from this release, the *spatial* simulation and data analysis functionality of slendr is conditional on the presence of R geospatial packages sf, stars, and rnaturalearth on the system. This means that users will be able to install slendr (and use all of its non-spatial functionality) even without having these R packages installed. That said, nothing really changes in practice: spatial features of slendr are just one install.packages(c("sf", "stars", "rnaturalearth")) away! The difference is that slendr doesn't try to do this during its own installation, but users are instructed to do this themselves (if needed) when the package is loaded. (#7a10ea)
If spatial dependencies are not present but a spatial slendr function is called regardless (such as world(), move(), etc.), an error message is printed with information on how to install spatial dependencies via install.packages() as above.
Why? It's true that the main reason for slendr's existence is its ability to simulate spatio-temporal data on realistic landscapes via SLiM. However, in practice, most of the "average" uses of slendr in the wild (and in classrooms!) rely on its traditional, non-spatial interface, with its spatial features being used comparatively rarely at the moment (except for some cutting-edge exploratory research). Given that setting up all of the spatial dependencies can be a bit of a hurdle, we have decided to make these dependencies optional, rather than force every user to go through the process of their installation whether they need the spatial features or not.
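For scripts that want to check up front whether the optional spatial packages named above are available, a check along these lines might be used. This is only a sketch in base R. It is not how slendr performs the check internally, and the variable names are arbitrary.

```r
# The three optional geospatial packages listed in the note above
spatial_pkgs <- c("sf", "stars", "rnaturalearth")

# TRUE/FALSE for each package, depending on whether it can be loaded
installed <- vapply(spatial_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- spatial_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  # Report what is missing; installation is left to the user
  message(
    "Missing spatial packages: ", paste(missing_pkgs, collapse = ", "),
    "\nInstall them with install.packages(c(",
    paste0('"', missing_pkgs, '"', collapse = ", "), "))"
  )
} else {
  message("All spatial dependencies are available.")
}
```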
- A function check_dependencies() is now exported and can be used to check whether a slendr Python environment () or SLiM () are present. This is useful for other software building upon slendr; normal users can freely ignore this. (#6ae6ce)
- The path to the file from which a tree sequence was loaded is now tracked internally via an attr(<tree sequence>, "path") attribute. Note that this has been implemented for the purposes of clean up for large-scale simulation studies (such as those facilitated by demografr) as a mostly internal feature, and should be considered experimental. (#f181a2)
- Attempts to resize a population right at the time of the split (which led to issues with simulations) are now prevented. (#f181a2)
- Fix for a minor issue preventing sampling an msprime population right at the time of its creation. (#aea231)
slendr 0.7.0
This is an emergency upgrade to match the latest pyslim 1.0.3 due to a serious bug in recapitation. See here and here for an extensive discussion during the process of identification of the bug and its eventual fix. For a brief summary of the practical consequences of this bug, see this thread by pyslim's developer and its formal announcement here.
- This change will require you to re-run setup_env() in order to update slendr's Python internals by creating a new internal Python virtual environment. (#45539a)
- A potential issue with a parent population being scheduled for removal before a daughter population splits from it is now caught at the moment of the daughter population() call rather than during a simulation slim() run. (#0791b5)
- The function plot_model() has a new argument gene_flow=<TRUE|FALSE> which determines whether gene-flow arrows will be visualized or not. (#104aa6)
- The possibility to perform recapitation, simplification, or mutation of a tree sequence right inside a call to ts_load() (by providing recapitate = TRUE, simplify = TRUE, and mutate = TRUE, together with their own arguments) has now been removed. The motivation for this change is the realization that there is no benefit of doing things like ts_load("<path>", recapitate = TRUE, Ne = ..., recombination_rate = ...) over ts_load("<path>") %>% ts_recapitate(Ne = ..., recombination_rate = ...), and the frequent confusion when recapitate = TRUE or other switches are forgotten by the user. All slendr teaching material and most actively used research codebases I know of use the latter, more explicit, pipeline approach anyway, and this has been the one example where redundancy does more harm than good. (#ad82ee)
Note: Loading library(slendr) will prompt a message "The legacy packages maptools, rgdal, and rgeos, underpinning the sp package, which was just loaded, will retire in October 2023. [...]." This is internal business of packages used by slendr which unfortunately cannot be silenced from slendr's side. There's no reason to panic, you can safely ignore it. Apologies for the unnecessary noise.
slendr 0.6.0
This is a relatively large update, which unfortunately had to be released in haste due to the retirement of the rgdal package -- a significant dependency of the entire spatial R ecosystem which is being phased out in the effort to move towards modern low-level geospatial architecture. Although slendr itself does not depend on rgdal, many of its dependencies used to (but won't in the short term, hence the push to remove the rgdal dependency). The most significant update has been the addition of IBD functionality of tskit, as described below. However, a large part of this functionality has not been extensively tested and should be considered extremely experimental at this stage. If you would like to use it, it might be safer to either wait for a later release in which the IBD functionality will be more stable, or use the underlying, battle-tested Python implementation in tskit.
- ts_ibd() now returns the ID number of a MRCA node of a pair of nodes sharing a given IBD segment, as well as the TMRCA of that node. (#7e2825)
- Trivial parameter errors are caught during population() calls rather than during simulation (solving minor issues discovered via big simulation runs during the development of demografr). (#e33373)
- Fix error in plotting exponential resizes which do not last until "the present". (#4c49a4)
- ts_ibd() no longer gives an obscure error when between = is provided as a named list of individuals' names (instead of an expected unnamed list). The names of list elements are not used in any way, but the error happens somewhere deep in the R->Python translation layer inside reticulate and there's no need for the users to concern themselves with it. (#7965e4)
- Population size parameters and times are now explicitly converted to integer numbers. This is more of an internal, formal change (the conversion has been happening implicitly inside the SLiM engine anyway) but is now explicitly stated, also in the documentation of each relevant function. (#b7e89e)
- Population names are now restricted to only those strings which are also valid Python identifiers. Although this restriction is only needed for the msprime back end of slendr (not SLiM), it makes sense to keep things tidy and unified. This fixes msprime crashing with ValueError: A population name must be a valid Python identifier. (#4ef518)
- The layout algorithm of plot_model() has been improved significantly. (PR #135)
- A new optional argument run = has been added to slim() and msprime(). If set to TRUE (the default), the engines will operate the usual way. If set to FALSE, no simulation will be run and the functions will simply print a command-line command to execute the engine in question (returning the CLI command invisibly). (#2e5b85)
- The following start-up note is no longer shown upon calling library(slendr):
NOTE: Due to Python setup issues on some systems which have been
causing trouble particularly for novice users, calling library(slendr)
no longer activates slendr's Python environment automatically.
In order to use slendr's msprime back end or its tree-sequence
functionality, users must now activate slendr's Python environment
manually by executing init_env() after calling library(slendr).
(This note will be removed in the next major version of slendr.)
Users have to call init_env() to manually activate the Python environment of slendr (see note under version 0.5.0 below for an extended explanation).
- ts_simplify() now accepts optional arguments keep_unary and keep_unary_in_individuals (see the official tskit docs for more detail). (#1b2112)
- Fix for ts_load() failing to load slendr-produced tree sequences after they were simplified down to a smaller set of sampled individuals (reported here). The issue was caused by incompatible sizes of the sampling table (always in the same form as used during simulation) and the table of individuals stored in the tree sequence after simplification (potentially containing a smaller set of individuals than in the original sampling table). To fix this, slendr tree sequence objects now track information about which individuals are regarded as "samples" (i.e. those with symbolic names) which is maintained through simplification, serialization and loading, and used by slendr's internal machinery during join operations. (PR #137)
- Metadata summary of ts_nodes() results is no longer printed whenever typed into the R console. Instead, a summary can be obtained by an explicit call to summary() on the ts_nodes() tables. (#01af51)
- ts_tree() and ts_phylo() now extract trees based on tskit's own zero-based indexing. (#554e13)
- ts_simplify() now accepts filter_nodes = TRUE|FALSE, with the same behavior as tskit's own method. (#f07ffed)
slendr 0.5.1
- This minor release implements an emergency fix for a CRAN warning which suddenly popped up in latest CRAN checks. (#5600a4)
- A new function ts_ibd() has been added, representing an R interface to the tskit method TreeSequence.ibd_segments(). However, note that ts_ibd() returns IBD results as a data frame (optionally, a spatially annotated sf data frame). The function does not operate around iteration, as does its Python counterpart in tskit. Until the next major version of slendr, this function should be considered experimental. (PR #123)