This repository contains code for a simulation study examining the impact of gene conversion on the "softness" of selective sweeps, used for the analyses presented in a forthcoming preprint.
This repository contains four Python scripts that generate, modify, and run SLiM scripts (for use with SLiM version 4.0.1) simulating a two-locus hitchhiking model in which gene conversion events are allowed at the selected locus but not at the linked locus. The pipeline requires stdpopsim version 0.2.0 and all of its dependencies to be installed. (This is straightforward with conda, following the instructions at https://popsim-consortium.github.io/stdpopsim-docs/stable/installation.html.) The scripts write all of their output to directories that they create within the current working directory, so to write output somewhere else you will have to modify them slightly.
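Because the output location follows the current working directory, one possible alternative to editing the scripts is to launch them from the directory where the output should go. The sketch below is not part of this repository: `run_script_in` is a hypothetical helper, and it assumes the pipeline scripts only use paths relative to the working directory and that every step is launched from the same directory.

```python
import subprocess
import sys
from pathlib import Path


def run_script_in(script, out_dir, *args):
    """Run a Python script with out_dir as its working directory.

    Hypothetical helper, not part of the repository.
    """
    # Resolve the script path first so that changing the working
    # directory does not break a relative path to the script itself.
    script = Path(script).resolve()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        [sys.executable, str(script), *args],
        cwd=out_dir,   # anything the script writes relative to "." lands here
        check=True,    # raise CalledProcessError if the script fails
    )


# Example call (the output path is a placeholder):
# run_script_in("buildAllSlimScripts.py", "/path/to/output")
```

If a script turns out to read files relative to its own location rather than the working directory, this approach will not be enough and the script itself will need to be edited.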
- buildAllSlimScripts.py: This script uses stdpopsim to generate all of the SLiM scripts, which are then modified to run our sweep simulations under the desired demographic models (using some Arabidopsis, Drosophila, and human models from the stdpopsim catalog). Note that this builds a SLiM script (which will be modified in the next step) for EVERY parameter combination for each demographic model. This is admittedly a very lazy design that results in many more simulation scripts than necessary, but it gets the job done.
- injectAllSlimScripts.py: This code modifies the SLiM scripts generated by stdpopsim via the above script. The SLiM scripts are modified to contain selective sweeps with gene conversion occurring only at the selected locus. Note that this step could be simplified substantially by taking advantage of stdpopsim's ability to condition on sweeps occurring at a specified time (by adding code for this to buildAllSlimScripts.py), but the development of this pipeline began before that functionality was incorporated into stdpopsim.
- runSlimulations.py: This code submits the simulation jobs to a high-performance computing cluster. It is written assuming that the cluster uses the SLURM scheduler and that the desired partition is named general, so it may have to be modified to run on your computing resources. It can be run for each species as follows:
    python runSlimulations.py HomSap
    python runSlimulations.py AraTha
    python runSlimulations.py DroMel
- parseOutputForSpecies.py: This code parses the output from each SLiM simulation and writes information about the simulation's outcomes in a tabular format that can be read by the analysis notebook described below. It can be run for each species as follows:
    python parseOutputForSpecies.py HomSap
    python parseOutputForSpecies.py DroMel
    python parseOutputForSpecies.py AraTha
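For readers adapting the job-submission step to another cluster, the following is a minimal sketch of how one SLiM script per job might be submitted through SLURM. It is not the code in runSlimulations.py. The function names, the `.slim` file extension, the `logs` directory, and the assumption that the `slim` executable is on the PATH of the compute nodes are all illustrative. The `sbatch` options used (`--partition`, `--job-name`, `--output`, `--wrap`) are standard SLURM flags.

```python
import shlex
import subprocess
from pathlib import Path


def build_sbatch_command(slim_script, partition="general", log_dir="logs"):
    """Return an sbatch argument list that runs one SLiM script as one job.

    Hypothetical helper; partition defaults to the name mentioned above.
    """
    script = Path(slim_script)
    log_path = Path(log_dir) / f"{script.stem}.log"
    return [
        "sbatch",
        f"--partition={partition}",
        f"--job-name={script.stem}",
        f"--output={log_path}",
        # --wrap lets sbatch run a one-line shell command without a batch file.
        f"--wrap=slim {shlex.quote(str(script))}",
    ]


def submit_all(script_dir, partition="general", dry_run=True):
    """Build one sbatch command per *.slim file found in script_dir.

    With dry_run=True the commands are only printed, which makes it easy
    to check them before anything is sent to the scheduler.
    """
    scripts = sorted(Path(script_dir).glob("*.slim"))  # extension is an assumption
    commands = [build_sbatch_command(p, partition) for p in scripts]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

On a cluster with a different partition name, only the `partition` argument would need to change. The log directory should exist before jobs are submitted, since SLURM does not create it.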
The softness_analysis.ipynb notebook contains the code that reads in the simulation summaries generated by step four of the above pipeline and generates the figures found in the paper. The notebook also contains tables with fairly detailed summaries of simulation outcomes for each demographic model and parameter combination examined in the paper.
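As a rough illustration of the kind of per-combination summary described above, the sketch below counts simulation outcomes in a delimited table using only the standard library. It is not taken from the notebook. The column names, the tab delimiter, and the file name in the example call are placeholders, since the actual layout of the parsed output is not described here.

```python
import csv
from collections import Counter


def tally_outcomes(handle, group_cols, outcome_col, delimiter="\t"):
    """Count rows per (group columns..., outcome) combination.

    handle      : an open text file with a header row
    group_cols  : column names to group by (hypothetical names)
    outcome_col : column holding the simulation outcome (hypothetical name)
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    counts = Counter()
    for row in reader:
        key = tuple(row[col] for col in group_cols) + (row[outcome_col],)
        counts[key] += 1
    return counts


# Example call (file and column names are placeholders):
# with open("HomSap_summary.tsv", newline="") as fh:
#     counts = tally_outcomes(fh, ["model", "params"], "outcome")
```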