Merge branch 'dev'
cflerin committed May 7, 2021
2 parents 25a8211 + 7bbdd9c commit 2919ae4
Showing 26 changed files with 105 additions and 1,788 deletions.
29 changes: 22 additions & 7 deletions README.rst
@@ -3,21 +3,35 @@ pySCENIC

|buildstatus|_ |pypipackage|_ |docstatus|_


pySCENIC is a lightning-fast python implementation of the SCENIC_ pipeline (Single-Cell rEgulatory Network Inference and
Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from
single-cell RNA-seq data.

The pioneering work was done in R and results were published in Nature Methods [1]_.
A new and comprehensive description of this Python implementation of the SCENIC pipeline is available in Nature Protocols [5]_ (`see here <https://doi.org/10.1038/s41596-020-0336-2>`_).
A new and comprehensive description of this Python implementation of the SCENIC pipeline is available in Nature Protocols [4]_.

pySCENIC can be run on a single desktop machine but easily scales to multi-core clusters to analyze thousands of cells
in no time. The latter is achieved via the dask_ framework for distributed computing [2]_.

**Full documentation** is available on `Read the Docs <https://pyscenic.readthedocs.io/en/latest/>`_
**Full documentation** for pySCENIC is available on `Read the Docs <https://pyscenic.readthedocs.io/en/latest/>`_

----

pySCENIC is part of the SCENIC Suite of tools!
See the main `SCENIC website <https://scenic.aertslab.org/>`_ for additional information and a full list of tools available.

----


News and releases
-----------------

0.11.2 | 2021-05-07
^^^^^^^^^^^^^^^^^^^

* Split some core cisTarget functions out into a separate repository, `ctxcore <https://github.com/aertslab/ctxcore>`_. This is now a required package for pySCENIC.

0.11.1 | 2021-02-11
^^^^^^^^^^^^^^^^^^^

@@ -96,7 +110,9 @@ All the functionality of the original R implementation is available and in addit
Additional resources
--------------------

For more information, please visit the main SCENIC_ website.
For more information, please visit LCB_,
the main `SCENIC website <https://scenic.aertslab.org/>`_,
or `SCENIC (R version) <https://github.com/aertslab/SCENIC>`_.
There is a tutorial to `create new cisTarget databases <https://github.com/aertslab/create_cisTarget_databases>`_.
The CLI to pySCENIC has also been streamlined into a pipeline that can be run with a single command, using the Nextflow workflow manager.
There are two Nextflow implementations available:
@@ -114,11 +130,10 @@ We are grateful to all providers of TF-annotated position weight matrices, in pa
References
----------

.. [1] Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat Meth 14, 1083–1086 (2017).
.. [1] Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat Meth 14, 1083–1086 (2017). `doi:10.1038/nmeth.4463 <https://doi.org/10.1038/nmeth.4463>`_
.. [2] Rocklin, M. Dask: parallel computation with blocked algorithms and task scheduling. conference.scipy.org
.. [3] Huynh-Thu, V. A. et al. Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5, (2010).
.. [4] Zeisel, A. et al. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
.. [5] Van de Sande B., Flerin C., et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat Protoc. June 2020:1-30. doi:10.1038/s41596-020-0336-2
.. [3] Huynh-Thu, V. A. et al. Inferring regulatory networks from expression data using tree-based methods. PLoS ONE 5, (2010). `doi:10.1371/journal.pone.0012776 <https://doi.org/10.1371/journal.pone.0012776>`_
.. [4] Van de Sande B., Flerin C., et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat Protoc. June 2020:1-30. `doi:10.1038/s41596-020-0336-2 <https://doi.org/10.1038/s41596-020-0336-2>`_
.. |buildstatus| image:: https://travis-ci.org/aertslab/pySCENIC.svg?branch=master
.. _buildstatus: https://travis-ci.org/aertslab/pySCENIC
19 changes: 7 additions & 12 deletions docs/faq.rst
@@ -4,17 +4,8 @@ Frequently Asked Questions
I am having problems with Dask
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Arboreto package :code:`v0.1.5`, and some steps of the cisTarget step within pySCENIC, seem to depend on an older version of Dask/Distributed.
Using a more recent version of Dask/Distributed can result in some cryptic errors.
It is recommended to use the older version of Dask and Distributed for stability here:

.. code-block:: bash
pip install dask==1.0.0 distributed'>=1.21.6,<2.0.0'
But in many cases this still results in issues with the GRN step.
An alternative is to use the multiprocessing implementation of Arboreto recently included in pySCENIC (`arboreto_with_multiprocessing.py <https://github.com/aertslab/pySCENIC/blob/master/src/pyscenic/cli/arboreto_with_multiprocessing.py>`_).
An alternative is to use the multiprocessing implementation of Arboreto included in pySCENIC (`arboreto_with_multiprocessing.py <https://github.com/aertslab/pySCENIC/blob/master/src/pyscenic/cli/arboreto_with_multiprocessing.py>`_).
This script is also available on the path once pySCENIC is installed.
This script uses the Arboreto and pySCENIC codebase to run GRNBoost2 (or GENIE3) without Dask.
This eliminates the possibility of running the GRN step across multiple nodes, but provides additional stability.
The run time is generally equivalent to the Dask implementation using the same number of workers.
@@ -101,7 +92,7 @@ Yes you can. The code snippet below shows you how to create your own databases:

.. code-block:: python
from pyscenic.rnkdb import DataFrameRankingDatabase as RankingDatabase
from ctxcore.rnkdb import DataFrameRankingDatabase as RankingDatabase
import numpy as np
import pandas as pd
@@ -114,6 +105,10 @@ Yes you can. The code snippet below shows you how to create your own databases:
dtype=np.int32)
RankingDatabase(df, 'custom').save('custom.db')
Please also see
`create_cisTarget_databases <https://github.com/aertslab/create_cisTarget_databases>`_
for more detailed and flexible methods to create custom cisTarget databases.


Can I draw the distribution of AUC values for a regulon across cells?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
57 changes: 55 additions & 2 deletions docs/installation.rst
@@ -44,7 +44,7 @@ You can also install the bleeding edge (i.e. less stable) version of the package
Containers
~~~~~~~~~
~~~~~~~~~~

**pySCENIC containers** are also available for download and immediate use. In this case, no compiling or installation is required, provided either Docker or Singularity software is installed on the user's system. Images are available from `Docker Hub`_. Usage of the containers is shown below (`Docker and Singularity Images`_).
To pull the docker images, for example:
@@ -194,7 +194,60 @@ Note that in this case, a bind needs to be specified.

.. code-block:: bash
singularity exec -B /data:/data aertslab-pyscenic-0.10.0.sif ipython kernel -f {connection_file}
singularity exec -B /data:/data aertslab-pyscenic-latest.sif ipython kernel -f {connection_file}
More generally, a local or remote kernel can be set up by using the following examples.
These would go in a kernel file in ``~/.local/share/jupyter/kernels/pyscenic-latest/kernel.json`` (for example).

**Remote singularity kernel:**

.. code-block:: json
{
"argv": [
"/software/jupyter/bin/python",
"-m",
"remote_ikernel",
"--interface",
"ssh",
"--host",
"r23i27n14",
"--workdir",
"~/",
"--kernel_cmd",
"singularity",
"exec",
"-B",
"/path/to/mounts",
"/path/to/aertslab-pyscenic-latest.sif",
"ipython",
"kernel",
"-f",
"{connection_file}"
],
"display_name": "pySCENIC singularity remote",
"language": "python"
}
**Local singularity kernel:**

.. code-block:: json
{
"argv": [
"singularity",
"exec",
"-B",
"/path/to/mounts",
"/path/to/aertslab-pyscenic-latest.sif",
"ipython",
"kernel",
"-f",
"{connection_file}"
],
"display_name": "pySCENIC singularity local",
"language": "python"
}
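The local kernel file above can also be generated from a script, which helps when the image path or bind mount differs per machine. The following is a minimal sketch, not part of pySCENIC: the helper names ``build_local_kernel_spec`` and ``write_kernel_spec`` are hypothetical, and the paths are the same placeholders used in the JSON above.

```python
import json
from pathlib import Path


def build_local_kernel_spec(image_path, bind_path,
                            display_name="pySCENIC singularity local"):
    """Return a kernel spec dict that starts IPython inside a Singularity image.

    Hypothetical helper: it mirrors the local kernel.json shown above.
    """
    return {
        "argv": [
            "singularity", "exec",
            "-B", bind_path,            # bind mount made visible inside the container
            image_path,                 # path to the .sif image
            "ipython", "kernel",
            "-f", "{connection_file}",  # placeholder filled in by Jupyter at launch
        ],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel_spec(spec, kernel_dir):
    """Write spec to <kernel_dir>/kernel.json and return the file path."""
    kernel_dir = Path(kernel_dir).expanduser()
    kernel_dir.mkdir(parents=True, exist_ok=True)
    kernel_file = kernel_dir / "kernel.json"
    kernel_file.write_text(json.dumps(spec, indent=2) + "\n")
    return kernel_file
```

For example, ``write_kernel_spec(build_local_kernel_spec("/path/to/aertslab-pyscenic-latest.sif", "/path/to/mounts"), "~/.local/share/jupyter/kernels/pyscenic-latest")`` would write the file to the location mentioned above.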
Nextflow
2 changes: 1 addition & 1 deletion docs/tutorial.rst
@@ -56,7 +56,7 @@ First we import the necessary modules and declare some constants:
from arboreto.utils import load_tf_names
from arboreto.algo import grnboost2
from pyscenic.rnkdb import FeatherRankingDatabase as RankingDatabase
from ctxcore.rnkdb import FeatherRankingDatabase as RankingDatabase
from pyscenic.utils import modules_from_adjacencies, load_motifs
from pyscenic.prune import prune2df, df2regulons
from pyscenic.aucell import aucell
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,3 +1,4 @@
ctxcore
cytoolz
multiprocessing_on_dill
llvmlite
@@ -9,7 +10,6 @@ pandas>=0.20.1
cloudpickle
dask
distributed
pyarrow>=0.11.1,<0.17.0
arboreto>=0.1.6
boltons
setuptools
1 change: 1 addition & 0 deletions requirements_docker.txt
@@ -18,6 +18,7 @@ cffi==1.14.4
chardet==3.0.4
click==7.1.2
cloudpickle==1.6.0
ctxcore==0.1.1
cycler==0.10.0
Cython==0.29.21
cytoolz==0.11.0
4 changes: 2 additions & 2 deletions src/pyscenic/aucell.py
@@ -1,10 +1,10 @@
# -*- coding: utf-8 -*-

import pandas as pd
from .recovery import enrichment4cells
from ctxcore.recovery import enrichment4cells
from tqdm import tqdm
from typing import Sequence, Type
from .genesig import GeneSignature
from ctxcore.genesig import GeneSignature
from multiprocessing import cpu_count, Process, Array
from boltons.iterutils import chunked
from multiprocessing.sharedctypes import RawArray
2 changes: 1 addition & 1 deletion src/pyscenic/cli/db2feather.py
@@ -2,7 +2,7 @@

import os
import argparse
from pyscenic.rnkdb import convert_sqlitedb_to_featherdb
from ctxcore.rnkdb import convert_sqlitedb_to_featherdb


def derive_db_name(fname: str) -> str:
2 changes: 1 addition & 1 deletion src/pyscenic/cli/gmt2regions.py
@@ -3,7 +3,7 @@
import os
import argparse
import sys
from pyscenic.genesig import GeneSignature
from ctxcore.genesig import GeneSignature
from pyscenic.regions import RegionRankingDatabase, Delineation, convert


2 changes: 1 addition & 1 deletion src/pyscenic/cli/invertdb.py
@@ -2,7 +2,7 @@

import os
import argparse
from pyscenic.rnkdb import opendb, InvertedRankingDatabase
from ctxcore.rnkdb import opendb, InvertedRankingDatabase


def derive_db_name(fname: str) -> str:
2 changes: 1 addition & 1 deletion src/pyscenic/cli/pyscenic.py
@@ -17,7 +17,7 @@
from arboreto.utils import load_tf_names

from pyscenic.utils import modules_from_adjacencies, add_correlation
from pyscenic.rnkdb import opendb, RankingDatabase
from ctxcore.rnkdb import opendb, RankingDatabase
from pyscenic.prune import prune2df, find_features, _prepare_client
from pyscenic.aucell import aucell
from pyscenic.log import create_logging_handler
2 changes: 1 addition & 1 deletion src/pyscenic/cli/utils.py
@@ -10,7 +10,7 @@
import loompy as lp
from operator import attrgetter
from typing import Type, Sequence
from pyscenic.genesig import GeneSignature, openfile
from ctxcore.genesig import GeneSignature, openfile
from pyscenic.transform import df2regulons
from pyscenic.utils import load_motifs, load_from_yaml, save_to_yaml
from pyscenic.binarization import binarize
2 changes: 1 addition & 1 deletion src/pyscenic/export.py
@@ -6,7 +6,7 @@
import loompy as lp
from sklearn.manifold import TSNE
from .aucell import aucell
from .genesig import Regulon
from ctxcore.genesig import Regulon
from typing import List, Mapping, Union, Sequence, Optional
from operator import attrgetter
from multiprocessing import cpu_count
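
Note on the import changes in this commit: ``rnkdb``, ``genesig`` and ``recovery`` are now imported from ``ctxcore`` instead of ``pyscenic``. Downstream scripts that must run against both older and newer pySCENIC installations could resolve the module location at import time. The following is a minimal sketch under that assumption; ``import_first`` is a hypothetical helper, not part of pySCENIC or ctxcore.

```python
import importlib


def import_first(*module_names):
    """Import and return the first importable module among module_names.

    Hypothetical helper for illustration; raises ImportError if none can be imported.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported: " + "; ".join(errors)
    )


# Example usage (assumes at least one of the two packages is installed):
# rnkdb = import_first("ctxcore.rnkdb", "pyscenic.rnkdb")
# RankingDatabase = rnkdb.FeatherRankingDatabase
```

Listing ``ctxcore.rnkdb`` first prefers the new location and falls back to the pre-0.11.2 module path only when ``ctxcore`` is not installed.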