diff --git a/site/404.html b/site/404.html new file mode 100644 index 0000000..09ca6a0 --- /dev/null +++ b/site/404.html @@ -0,0 +1,148 @@ + + + + + + + + Ensemblex + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ + +

404

+ +

Page not found

+ + +
+
+ +
+
+ +
+ +
+ +
+ + + + + +
+ + + + + + + + + diff --git a/site/Acknowledgement/index.html b/site/Acknowledgement/index.html new file mode 100644 index 0000000..57b24a4 --- /dev/null +++ b/site/Acknowledgement/index.html @@ -0,0 +1,164 @@ + + + + + + + + Acknowledgement - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Acknowledgement

+

The Ensemblex pipeline was produced for projects funded by the Canadian Institutes of Health Research and the Michael J. Fox Foundation Parkinson's Progression Markers Initiative (MJFF PPMI) in collaboration with The Neuro's Early Drug Discovery Unit (EDDU), McGill University. It is written by Michael Fiorini and Saeid Amiri with supervision from Rhalena Thomas and Sali Farhan at the Montreal Neurological Institute-Hospital. Copyright belongs to MNI BIOINFO CORE.

+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/Dataset1/index.html b/site/Dataset1/index.html new file mode 100644 index 0000000..7b16822 --- /dev/null +++ b/site/Dataset1/index.html @@ -0,0 +1,664 @@ + + + + + + + + Ensemblex with prior genotype information - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Ensemblex pipeline with prior genotype information

+ +
+

Introduction

+

This guide illustrates how to use the Ensemblex pipeline to demultiplex pooled scRNAseq samples with prior genotype information. Here, we will leverage a pooled scRNAseq dataset produced by Jerber et al. This pool contains induced pluripotent stem cell (iPSC) lines from 9 healthy controls that were differentiated towards a dopaminergic neuron state. The Ensemblex pipeline is illustrated in the diagram below:

+

+ +

+ +

NOTE: To download the necessary files for the tutorial please see the Downloading data section of the Ensemblex documentation.

+
+

Installation

+

[to be completed]

+

module load StdEnv/2023
+module load apptainer/1.2.4

+
+

Step 1: Set up

+

In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use.

+

First, create a dedicated folder for the analysis (hereafter referred to as the working directory). Then, define the path to the working directory and the path to ensemblex.pip:

+
## Create and navigate to the working directory
+cd ~/ensemblex_tutorial
+mkdir working_directory
+cd ~/ensemblex_tutorial/working_directory
+
+## Define the path to ensemblex.pip
+ensemblex_HOME=~/ensemblex.pip
+
+## Define the path to the working directory
+ensemblex_PWD=~/ensemblex_tutorial/working_directory
+
+

Next, we can set up the working directory and choose the Ensemblex pipeline for demultiplexing with prior genotype information (--step init-GT) using the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT
+
+

After running the above code, the working directory should have the following structure:

+
ensemblex_tutorial
+└── working_directory
+    ├── demuxalot
+    ├── demuxlet
+    ├── ensemblex_gt
+    ├── input_files
+    ├── job_info
+    │   ├── configs
+    │   │   └── ensemblex_config.ini
+    │   ├── logs
+    │   └── summary_report.txt
+    ├── souporcell
+    └── vireo_gt
+
+
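The layout above can be checked before moving on. The following is a minimal sketch, not part of the Ensemblex pipeline: the function name `check_workdir` is hypothetical, and the directory names are taken from the tree shown above.

```shell
# Sketch: confirm that the init-GT step created the expected sub-directories.
# check_workdir is a hypothetical helper; directory names follow the tree above.
check_workdir() {
  cw_root="$1"
  cw_status=0
  for cw_dir in demuxalot demuxlet ensemblex_gt input_files \
                job_info/configs job_info/logs souporcell vireo_gt; do
    if [ ! -d "$cw_root/$cw_dir" ]; then
      echo "missing directory: $cw_root/$cw_dir"
      cw_status=1
    fi
  done
  # Non-zero exit status when at least one directory is absent
  return "$cw_status"
}

# Example: check_workdir "$ensemblex_PWD" && echo "working directory looks complete"
```

The function only reports what is missing; it does not create or modify anything in the working directory.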

Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools.

+
+

Step 2: Preparation of input files

+

In Step 2, we will define the files required by Ensemblex's constituent genetic demultiplexing tools and place them within the working directory.

+

Note: For the tutorial we will be using the data downloaded in the Downloading data section of the Ensemblex documentation.

+

First, define all of the required files:

+
BAM=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam
+
+BAM_INDEX=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam.bai
+
+BARCODES=~/ensemblex_tutorial/CellRanger/outs/filtered_gene_bc_matrices/refdata-cellranger-GRCh37/barcodes.tsv
+
+SAMPLE_VCF=~/ensemblex_tutorial/sample_genotype/sample_genotype_merge.vcf
+
+REFERENCE_VCF=~/ensemblex_tutorial/reference_files/common_SNPs_only.recode.vcf
+
+REFERENCE_FASTA=~/ensemblex_tutorial/reference_files/genome.fa
+
+REFERENCE_FASTA_INDEX=~/ensemblex_tutorial/reference_files/genome.fa.fai
+
+

Next, we will sort the pooled samples and reference .vcf files according to the .bam file and place them within the working directory:

+
## Sort pooled samples .vcf file
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/pooled_samples.vcf --step sort --vcf $SAMPLE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam
+
+## Sort reference .vcf file
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/reference.vcf --step sort --vcf $REFERENCE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam
+
+

NOTE: To sort the .vcf files we use the pipeline produced by the authors of Demuxlet/Freemuxlet (Kang et al.).

+

Next, we will place the remaining necessary files within the working directory:

+
cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam
+cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai 
+cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv
+cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa
+cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai
+
+

After running the above code, $ensemblex_PWD/input_files should contain the following files:

+
input_files
+├── pooled_bam.bam
+├── pooled_bam.bam.bai
+├── pooled_barcodes.tsv
+├── pooled_samples.vcf
+├── reference.fa
+├── reference.fa.fai
+└── reference.vcf
+
+

NOTE: The file names must match those listed above exactly; otherwise, the Ensemblex pipeline will not recognize them.
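Because the pipeline depends on these exact file names, a quick check can catch a misnamed or empty file early. This is a minimal sketch under the assumption that all seven files listed above are required; `check_input_files` is a hypothetical helper, not an Ensemblex command.

```shell
# Sketch: verify that input_files holds every expected file and that none is empty.
# check_input_files is a hypothetical helper; file names follow the listing above.
check_input_files() {
  cif_dir="$1"
  cif_status=0
  for cif_file in pooled_bam.bam pooled_bam.bam.bai pooled_barcodes.tsv \
                  pooled_samples.vcf reference.fa reference.fa.fai reference.vcf; do
    # -s is true only when the file exists and has a size greater than zero
    if [ ! -s "$cif_dir/$cif_file" ]; then
      echo "missing or empty: $cif_dir/$cif_file"
      cif_status=1
    fi
  done
  return "$cif_status"
}

# Example: check_input_files "$ensemblex_PWD/input_files"
```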

+
+

Step 3: Genetic demultiplexing by constituent tools

+

In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools:

+ +

First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for each of the constituent genetic demultiplexing tools:

+
## Navigate to the .ini file
+cd $ensemblex_PWD/job_info/configs
+
+## Open the .ini file and adjust parameters directly in the terminal
+nano ensemblex_config.ini
+
+

For the tutorial, we set the following parameters for the constituent genetic demultiplexing tools:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
PAR_demuxalot_genotype_names | 'HPSI0115i-hecn_6,HPSI0214i-pelm_3,HPSI0314i-sojd_3,HPSI0414i-sebn_3,HPSI0514i-uenn_3,HPSI0714i-pipw_4,HPSI0715i-meue_5,HPSI0914i-vaka_5,HPSI1014i-quls_2'
PAR_demuxalot_prior_strength | 100
PAR_demuxalot_minimum_coverage | 200
PAR_demuxalot_minimum_alternative_coverage | 10
PAR_demuxalot_n_best_snps_per_donor | 100
PAR_demuxalot_genotypes_prior_strength | 1
PAR_demuxalot_doublet_prior | 0.25
PAR_demuxlet_field | GT
PAR_vireo_N | 9
PAR_vireo_type | GT
PAR_vireo_processes | 20
PAR_vireo_minMAF | 0.1
PAR_vireo_minCOUNT | 20
PAR_vireo_forcelearnGT | T
PAR_minimap2 | '-ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no'
PAR_freebayes | '-iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6'
PAR_vartrix_umi | TRUE
PAR_vartrix_mapq | 30
PAR_vartrix_threads | 8
PAR_souporcell_k | 9
PAR_souporcell_t | 8
+

Now that the parameters have been defined, we can demultiplex the pools with the constituent genetic demultiplexing tools.
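Before launching the tools, it can help to print the values that were actually saved in the configuration file. The sketch below assumes that ensemblex_config.ini stores one `NAME=VALUE` pair per line, which is not confirmed by this tutorial; `show_params` is a hypothetical helper.

```shell
# Sketch: list the parameters in an .ini file whose names start with a given prefix.
# Assumption: the file holds one NAME=VALUE pair per line; commented lines are skipped.
show_params() {
  # $1 = path to the .ini file, $2 = parameter prefix (for example PAR_vireo)
  grep -E "^[[:space:]]*$2[A-Za-z0-9_]*[[:space:]]*=" "$1"
}

# Example: show_params "$ensemblex_PWD/job_info/configs/ensemblex_config.ini" PAR_demuxalot
```

Comparing this output against the table above is a quick way to spot a parameter that was left at its default.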

+
+

Demuxalot

+

To run Demuxalot use the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot
+
+

If Demuxalot completed successfully, the following files should be available in $ensemblex_PWD/demuxalot:

+
demuxalot
+    ├── Demuxalot_result.csv
+    └── new_snps_single_file.betas
+
+
+

Demuxlet

+

To run Demuxlet use the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet
+
+

If Demuxlet completed successfully, the following files should be available in $ensemblex_PWD/demuxlet:

+
demuxlet
+    ├── outs.best
+    ├── pileup.cel.gz
+    ├── pileup.plp.gz
+    ├── pileup.umi.gz
+    └── pileup.var.gz
+
+
+

Souporcell

+

To run Souporcell use the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell
+
+

If Souporcell completed successfully, the following files should be available in $ensemblex_PWD/souporcell:

+
souporcell
+    ├── alt.mtx
+    ├── cluster_genotypes.vcf
+    ├── clusters_tmp.tsv
+    ├── clusters.tsv
+    ├── fq.fq
+    ├── minimap.sam
+    ├── minitagged.bam
+    ├── minitagged_sorted.bam
+    ├── minitagged_sorted.bam.bai
+    ├── Pool.vcf
+    ├── ref.mtx
+    └── soup.txt
+
+
+

Vireo

+

To run Vireo-GT use the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo
+
+

If Vireo-GT completed successfully, the following files should be available in $ensemblex_PWD/vireo_gt:

+
vireo_gt
+    ├── cellSNP.base.vcf.gz
+    ├── cellSNP.cells.vcf.gz
+    ├── cellSNP.samples.tsv
+    ├── cellSNP.tag.AD.mtx
+    ├── cellSNP.tag.DP.mtx
+    ├── cellSNP.tag.OTH.mtx
+    ├── donor_ids.tsv
+    ├── fig_GT_distance_estimated.pdf
+    ├── fig_GT_distance_input.pdf
+    ├── GT_donors.vireo.vcf.gz
+    ├── _log.txt
+    ├── prob_doublet.tsv.gz
+    ├── prob_singlet.tsv.gz
+    └── summary.tsv
+
+
+

Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications.

+

NOTE: To minimize computation time for the tutorial, we have provided the necessary output files from the constituent tools here. To access the files and place them in the working directory, use the following code:

+
## Demuxalot
+cd $ensemblex_PWD/demuxalot
+wget https://github.com/neurobioinfo/ensemblex/raw/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/Demuxalot_result.csv
+
+## Demuxlet
+cd $ensemblex_PWD/demuxlet
+wget https://github.com/neurobioinfo/ensemblex/raw/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/outs.best
+
+## Souporcell
+cd $ensemblex_PWD/souporcell
+wget https://github.com/neurobioinfo/ensemblex/raw/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/clusters.tsv
+
+## Vireo
+cd $ensemblex_PWD/vireo_gt
+wget https://github.com/neurobioinfo/ensemblex/raw/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/donor_ids.tsv
+
+
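Whether the constituent tools were run locally or their outputs were downloaded, it is worth confirming that the four files named above are present and contain data rather than a saved web page. The following is a minimal sketch; `check_constituent_outputs` is a hypothetical helper, and the HTML test is a simple heuristic rather than a format validation.

```shell
# Sketch: confirm the four constituent-tool outputs exist, are non-empty,
# and do not start with HTML (which would suggest a web page was saved instead).
check_constituent_outputs() {
  cco_root="$1"
  cco_status=0
  for cco_file in demuxalot/Demuxalot_result.csv demuxlet/outs.best \
                  souporcell/clusters.tsv vireo_gt/donor_ids.tsv; do
    cco_path="$cco_root/$cco_file"
    if [ ! -s "$cco_path" ]; then
      echo "missing or empty: $cco_path"
      cco_status=1
    elif head -n 5 "$cco_path" | grep -qi -e '<!doctype' -e '<html'; then
      echo "looks like a web page, not data: $cco_path"
      cco_status=1
    fi
  done
  return "$cco_status"
}

# Example: check_constituent_outputs "$ensemblex_PWD"
```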
+
+

Step 4: Application of Ensemblex

+

In Step 4, we will process the output files of the four constituent genetic demultiplexing tools with the three-step Ensemblex algorithm:

+
    +
  • Step 1: Probabilistic-weighted ensemble
  • +
  • Step 2: Graph-based doublet detection
  • +
  • Step 3: Ensemble-independent doublet detection
  • +
+

First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for the Ensemblex algorithm:

+
## Navigate to the .ini file
+cd $ensemblex_PWD/job_info/configs
+
+## Open the .ini file and adjust parameters directly in the terminal
+nano ensemblex_config.ini
+
+

For the tutorial, we set the following parameters for the Ensemblex algorithm:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
Pool parameters
PAR_ensemblex_sample_size | 9
PAR_ensemblex_expected_doublet_rate | 0.10
Set up parameters
PAR_ensemblex_merge_constituents | Yes
Step 1 parameters: Probabilistic-weighted ensemble
PAR_ensemblex_probabilistic_weighted_ensemble | Yes
Step 2 parameters: Graph-based doublet detection
PAR_ensemblex_preliminary_parameter_sweep | No
PAR_ensemblex_nCD | NULL
PAR_ensemblex_pT | NULL
PAR_ensemblex_graph_based_doublet_detection | Yes
Step 3 parameters: Ensemble-independent doublet detection
PAR_ensemblex_preliminary_ensemble_independent_doublet | No
PAR_ensemblex_ensemble_independent_doublet | Yes
PAR_ensemblex_doublet_Demuxalot_threshold | Yes
PAR_ensemblex_doublet_Demuxalot_no_threshold | No
PAR_ensemblex_doublet_Demuxlet_threshold | No
PAR_ensemblex_doublet_Demuxlet_no_threshold | No
PAR_ensemblex_doublet_Souporcell_threshold | No
PAR_ensemblex_doublet_Souporcell_no_threshold | No
PAR_ensemblex_doublet_Vireo_threshold | Yes
PAR_ensemblex_doublet_Vireo_no_threshold | No
Confidence score parameters
PAR_ensemblex_compute_singlet_confidence | Yes
+

If Ensemblex completed successfully, the following files should be available in $ensemblex_PWD/ensemblex_gt:

+
ensemblex_gt
+├── confidence
+│   └── ensemblex_final_cell_assignment.csv
+├── constituent_tool_merge.csv
+├── step1
+│   ├── ARI_demultiplexing_tools.pdf
+│   ├── BA_demultiplexing_tools.pdf
+│   ├── Balanced_accuracy_summary.csv
+│   └── step1_cell_assignment.csv
+├── step2
+│   ├── optimal_nCD.pdf
+│   ├── optimal_pT.pdf
+│   ├── PC1_var_contrib.pdf
+│   ├── PC2_var_contrib.pdf
+│   ├── PCA1_graph_based_doublet_detection.pdf
+│   ├── PCA2_graph_based_doublet_detection.pdf
+│   ├── PCA3_graph_based_doublet_detection.pdf
+│   ├── PCA_plot.pdf
+│   ├── PCA_scree_plot.pdf
+│   └── Step2_cell_assignment.csv
+└── step3
+    ├── Doublet_overlap_no_threshold.pdf
+    ├── Doublet_overlap_threshold.pdf
+    ├── Number_ensemblex_doublets_EID_no_threshold.pdf
+    ├── Number_ensemblex_doublets_EID_threshold.pdf
+    └── Step3_cell_assignment.csv
+
+

Ensemblex's final assignments are described in the ensemblex_final_cell_assignment.csv file.

+

Specifically, the ensemblex_assignment column describes Ensemblex's final assignments after application of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are labelled as unassigned); we recommend that users use this column to label their cells for downstream analyses. The ensemblex_best_assignment column describes Ensemblex's best assignments, independent of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are NOT labelled as unassigned).

+

The cell barcodes listed under the barcode column can be used to add the ensemblex_final_cell_assignment.csv information to the metadata of a Seurat object.
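Before adding the assignments to downstream metadata, a quick tally of the labels in each column shows how many cells were assigned to each sample. The sketch below assumes that ensemblex_final_cell_assignment.csv is a plain comma-separated file with a header row and no commas inside fields; `count_assignments` is a hypothetical helper, and only the column names come from the description above.

```shell
# Sketch: count how many barcodes carry each label in a chosen column.
# Assumptions: comma-separated file, header row, no embedded commas in fields.
count_assignments() {
  # $1 = CSV path, $2 = column name (defaults to ensemblex_assignment)
  awk -F',' -v col="${2:-ensemblex_assignment}" '
    NR == 1 {
      # Locate the requested column by its header name (quotes stripped)
      for (i = 1; i <= NF; i++) { h = $i; gsub(/"/, "", h); if (h == col) idx = i }
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    { v = $idx; gsub(/"/, "", v); n[v]++ }
    END { for (k in n) print k "\t" n[k] }
  ' "$1" | sort
}

# Example:
# count_assignments "$ensemblex_PWD/ensemblex_gt/confidence/ensemblex_final_cell_assignment.csv"
```

Running it once for ensemblex_assignment and once for ensemblex_best_assignment shows how many singlets the confidence threshold moved to unassigned.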

+
+

Resource requirements

+

The following table describes the computational resources used in this tutorial for genetic demultiplexing by the constituent tools and application of the Ensemblex algorithm.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Tool | Time | CPU | Memory
Demuxalot | 01:34:59 | 6 | 12.95 GB
Demuxlet | 03:16:03 | 6 | 138.32 GB
Souporcell | 2-14:49:21 | 1 | 21.83 GB
Vireo | 2-01:30:24 | 6 | 29.42 GB
Ensemblex | 02:05:27 | 1 | 5.67 GB
+
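The run times above mix two notations: values such as 2-14:49:21 appear to follow the D-HH:MM:SS convention used by Slurm, where the number before the hyphen counts days. That reading is an assumption, since the tutorial does not state the format. Under that assumption, the sketch below converts either notation to seconds so that run times can be compared directly; `elapsed_to_seconds` is a hypothetical helper.

```shell
# Sketch: convert an elapsed time written as [D-]HH:MM:SS to seconds.
# Assumption: a leading "D-" is a day count, as in Slurm-style elapsed times.
elapsed_to_seconds() {
  echo "$1" | awk '{
    days = 0; t = $0
    # Split off the optional day count before the hyphen
    if (index(t, "-") > 0) { split(t, a, "-"); days = a[1]; t = a[2] }
    split(t, p, ":")
    print days * 86400 + p[1] * 3600 + p[2] * 60 + p[3]
  }'
}

# Example: elapsed_to_seconds 2-14:49:21
```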
+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/Dataset2/index.html b/site/Dataset2/index.html new file mode 100644 index 0000000..3147af6 --- /dev/null +++ b/site/Dataset2/index.html @@ -0,0 +1,752 @@ + + + + + + + + Run pipeline on processed data - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

HTO analysis track: PBMC dataset

+

Contents

+ +
+

Introduction

+

This guide illustrates the steps taken for our analysis of the PBMC dataset in our pre-print manuscript. Here, we are using the HTO analysis track of scRNAbox to analyze a publicly available scRNAseq dataset produced by Stoeckius et al. This dataset describes peripheral blood mononuclear cells (PBMC) from eight human donors, which were tagged with sample-specific barcodes, pooled, and sequenced together in a single run.

+
+

Downloading the PBMC dataset

+

If you want to use the PBMC dataset to test the scRNAbox pipeline, please see here for detailed instructions on how to download the publicly available data.

+
+

Installation

+

scrnabox.slurm installation

+

To download the latest version of scrnabox.slurm (v0.1.52.50) run the following command:

+
wget https://github.com/neurobioinfo/scrnabox/releases/download/v0.1.52.5/scrnabox.slurm.zip
+unzip scrnabox.slurm.zip
+
+

For a description of the options for running scrnabox.slurm run the following command:

+
bash /pathway/to/scrnabox.slurm/launch_scrnabox.sh -h 
+
+

If scrnabox.slurm has been installed properly, the above command should return the following:

+
scrnabox pipeline version 0.1.52.50
+------------------- 
+mandatory arguments:
+                -d  (--dir)  = Working directory (where all the outputs will be printed) (give full path)
+                --steps  =  Specify what steps, e.g., 2 to run step 2. 2-6, run steps 2 through 6
+
+        optional arguments:
+                -h  (--help)  = See helps regarding the pipeline arguments. 
+                --method  = Select your preferred method: HTO and SCRNA for hashtag, and Standard scRNA, respectively. 
+                --msd  = You can get the hashtag labels by running the following code (HTO Step 4). 
+                --markergsea  = Identify marker genes for each cluster and run marker gene set enrichment analysis (GSEA) using EnrichR libraries (Step 7). 
+                --knownmarkers  = Profile the individual or aggregated expression of known marker genes. 
+                --referenceannotation  = Generate annotation predictions based on the annotations of a reference Seurat object (Step 7). 
+                --annotate  = Add clustering annotations to Seurat object metadata (Step 7). 
+                --addmeta  = Add metadata columns to the Seurat object (Step 8). 
+                --rundge  = Perform differential gene expression contrasts (Step 8). 
+                --seulist  = You can directly call the list of Seurat objects to the pipeline. 
+                --rcheck  = You can identify which libraries are not installed.  
+
+ ------------------- 
+ For a comprehensive help, visit  https://neurobioinfo.github.io/scrnabox/site/ for documentation. 
+
+
+

CellRanger installation

+

For information regarding the installation of CellRanger, please visit the 10X Genomics documentation. If CellRanger is already installed on your HPC system, you may skip the CellRanger installation procedures.

+

For our analysis of the PBMC dataset we used the 10XGenomics GRCh38-3.0.0 reference genome and CellRanger v5.0.1. For more information regarding how to prepare reference genomes for the CellRanger counts pipeline, please see the 10X Genomics documentation.

+
+

R library preparation and R package installation

+

We must prepare a common R library where we will load all of the required R packages. If the required R packages are already installed on your HPC system in a common R library, you may skip the following procedures.

+

We will first load R. The analyses presented in our pre-print manuscript were conducted using v4.2.1.

+
# Load R
+module load r/4.2.1
+
+

Then, we will run the installation code, which creates a directory where the R packages will be loaded and will install the required R packages:

+
# Folder for R packages 
+R_PATH=~/path/to/R/library
+mkdir -p $R_PATH
+
+# Install package
+Rscript ./scrnabox.slurm/soft/R/install_packages.R $R_PATH
+
+
+

scRNAbox pipeline

+

Step 0: Set up

+

Now that scrnabox.slurm, CellRanger, R, and the required R packages have been installed, we can proceed to our analysis with the scRNAbox pipeline. We will create a pipeline folder designated for the analysis and run Step 0, selecting the HTO analysis track (--method HTO), using the following code:

+
mkdir pipeline
+cd pipeline
+
+export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 0 \
+--method HTO
+
+

Next, we will navigate to the scrnabox_config.ini file in ~/pipeline/job_info/configs to define the HPC account holder (ACCOUNT), the path to the environmental module (MODULEUSE), the path to CellRanger from the environmental module directory (CELLRANGER), CellRanger version (CELLRANGER_VERSION), R version (R_VERSION), and the path to the R library (R_LIB_PATH):

+
cd ~/pipeline/job_info/configs
+nano scrnabox_config.ini
+
+ACCOUNT=account-name
+MODULEUSE=/path/to/environmental/module 
+CELLRANGER=/path/to/cellranger/from/module/directory 
+CELLRANGER_VERSION=5.0.1
+R_VERSION=4.2.1
+R_LIB_PATH=/path/to/R/library
+
+

Next, we can check to see if all of the required R packages have been properly installed using the following command:

+
bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 0 \
+--rcheck 
+
+
+

Step 1: FASTQ to gene expression matrix

+

In Step 1, we will run the CellRanger counts pipeline to generate feature-barcode expression matrices from the FASTQ files. While it is possible to manually prepare the library.csv and feature_ref.csv files for the sequencing run prior to running Step 1, for this analysis we are going to opt for automated library preparation. For more information regarding the manual preparation of library.csv and feature_ref.csv files, please see the CellRanger library preparation tutorial.
+
+For our analysis of the PBMC dataset we set the following execution parameters for Step 1 (~/pipeline/job_info/parameters/step1_par.txt):

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
par_automated_library_prep | Yes
par_fastq_directory | /path/to/directory/contaning/fastqs
par_RNA_run_names | run1GEX
par_HTO_run_names | run1HTO
par_seq_run_names | run1
par_paired_end_seq | Yes
par_id | Hash1, Hash2, Hash3, Hash4, Hash5, Hash6, Hash7, Hash8
par_name | A_TotalSeqA, B_TotalSeqA, C_TotalSeqA, D_TotalSeqA, E_TotalSeqA, F_TotalSeqA, G_TotalSeqA, H_TotalSeqA
par_read | R2
par_pattern | 5P(BC)
par_sequence | AGGACCATCCAA, ACATGTTACCGT, AGCTTACTATCC, TCGATAATGCGA, GAGGCTGAGCTA, GTGTGACGTATT, ACTGTCTAACGG, TATCACATCGGT
par_ref_dir_grch | ~/genome/10xGenomics/refdata-cellranger-GRCh38-3.0.0
par_r1_length | NULL (commented out)
par_r2_length | NULL (commented out)
par_mempercode | 30
par_include_introns | NULL (commented out)
par_no_target_umi_filter | NULL (commented out)
par_expect_cells | NULL (commented out)
par_force_cells | NULL (commented out)
par_no_bam | NULL (commented out)
+

Note: The parameters file for each step is located in ~/pipeline/job_info/parameters. For a comprehensive description of the execution parameters for each step see here.

+

Given that CellRanger runs a user interface and is not submitted as a Job, it is recommended to run Step 1 in a 'screen', which will allow the task to keep running if the connection is broken. To run Step 1, use the following command:

+
export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+screen -S run_PBMC_application_case
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 1
+
+

The outputs of the CellRanger counts pipeline are deposited into ~/pipeline/step1.

+
+

Step 2: Create Seurat object and remove ambient RNA

+

In Step 2, we are going to begin by correcting the RNA assay for ambient RNA using SoupX (Young et al. 2020). We will then use the ambient RNA-corrected feature-barcode matrices to create a Seurat object.
+
+For our analysis of the PBMC dataset we set the following execution parameters for Step 2 (~/pipeline/job_info/parameters/step2_par.txt):

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
par_save_RNA | Yes
par_save_metadata | Yes
par_ambient_RNA | Yes
par_normalization.method | LogNormalize
par_scale.factor | 10000
par_selection.method | vst
par_nfeatures | 2500
+

We can run Step 2 using the following code:

+
export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 2
+
+

Step 2 produces the following outputs:

+
~/pipeline
+step2
+├── figs2
+│   ├── ambient_RNA_estimation_run1.pdf
+│   ├── ambient_RNA_markers_run1.pdf
+│   ├── cell_cyle_dim_plot_run1.pdf
+│   ├── vioplot_run1.pdf
+│   └── zoomed_in_vioplot_run1.pdf
+├── info2
+│   ├── estimated_ambient_RNA_run1.txt
+│   ├── MetaData_1.txt
+│   ├── meta_info_1.txt
+│   ├── run1_ambient_rna_summary.rds
+│   ├── sessionInfo.txt
+│   ├── seu1_RNA.txt
+│   └── summary_seu1.txt
+├── objs2
+│   └── run1.rds
+└── step2_ambient
+    └── run1
+        ├── barcodes.tsv
+        ├── genes.tsv
+        └── matrix.mtxs 
+
+

Note: For a comprehensive description of the outputs for each analytical step, please see the Outputs section of the scRNAbox documentation.

+

+
+

+ +

Figure 1. Figures produced by Step 2 of the scRNAbox pipeline. A) Estimated ambient RNA contamination rate (Rho) by SoupX. Estimates of the RNA contamination rate using various estimators are visualized via a frequency distribution; the true contamination rate is assigned as the most frequent estimate (red line; 8.7%). B) Log10 ratios of observed counts to expected counts for marker genes from each cluster. Clusters are defined by the CellRanger counts pipeline. The red line displays the estimated RNA contamination rate if the estimation was based entirely on the corresponding gene. C) Principal component analysis (PCA) of Seurat S and G2M cell cycle reference genes. D) Violin plots showing the distribution of cells according to quality control metrics calculated in Step 2. E) Zoomed in violin plots, from the minimum to the mean, showing the distribution of cells according to quality control metrics calculated in Step 2.

+
+

Step 3: Quality control and filtering

+

In Step 3, we are going to perform quality control procedures and filter out low quality cells. We are going to filter out cells with < 50 unique RNA transcripts, > 6000 unique RNA transcripts, < 200 total RNA transcripts, > 7000 total RNA transcripts, and > 50% mitochondrial transcripts.
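To make the thresholds concrete, the sketch below applies the same cut-offs to a small tab-separated table. It only illustrates the filtering logic and is not how scRNAbox performs the filtering; the input layout (barcode, unique transcripts, total transcripts, percent mitochondrial, with a header row) and the name `qc_filter` are hypothetical.

```shell
# Sketch: keep cells that fall inside the quality control thresholds described above.
# Hypothetical input columns: 1 = barcode, 2 = unique RNA transcripts,
# 3 = total RNA transcripts, 4 = percent mitochondrial; first row is a header.
qc_filter() {
  awk -F'\t' '
    NR == 1 { print; next }   # pass the header through unchanged
    $2 >= 50 && $2 <= 6000 && $3 >= 200 && $3 <= 7000 && $4 <= 50
  ' "$1"
}

# Example: qc_filter cell_metrics.tsv > cell_metrics_filtered.tsv
```

Cells that sit exactly on a threshold are kept here, because the text describes removing only cells below the lower bounds or above the upper bounds.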

+

For our analysis of the PBMC dataset we set the following execution parameters for Step 3 (~/pipeline/job_info/parameters/step3_par.txt):

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
par_save_RNA | Yes
par_save_metadata | Yes
par_seurat_object | NULL
par_nFeature_RNA_L | 50
par_nFeature_RNA_U | 6000
par_nCount_RNA_L | 200
par_nCount_RNA_U | 7000
par_mitochondria_percent_L | 0
par_mitochondria_percent_U | 50
par_ribosomal_percent_L | 0
par_ribosomal_percent_U | 100
par_remove_mitochondrial_genes | No
par_remove_ribosomal_genes | No
par_remove_genes | NULL
par_regress_cell_cycle_genes | Yes
par_normalization.method | LogNormalize
par_scale.factor | 10000
par_selection.method | vst
par_nfeatures | 2500
par_top | 10
par_npcs_pca | 30
+

We can run Step 3 using the following code:

+
export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 3
+
+

Step 3 produces the following outputs.

+
step3
+├── figs3
+│   ├── dimplot_pca_run1.pdf
+│   ├── elbowplot_run1.pdf
+│   ├── filtered_QC_vioplot_run1.pdf
+│   └── VariableFeaturePlot_run1.pdf
+├── info3
+│   ├── MetaData_run1.txt
+│   ├── meta_info_run1.txt
+│   ├── most_variable_genes_run1.txt
+│   ├── run1_RNA.txt
+│   ├── sessionInfo.txt
+│   └── summary_run1.txt
+└── objs3
+    └── run1.rds
+
+

+
+

+ +

Figure 2. Figures produced by Step 3 of the scRNAbox pipeline. A) Violin plots showing the distribution of cells according to quality control metrics after filtering by user-defined thresholds. B) Scatter plot showing the top 2500 most variable features; the top 10 most variable features are labelled. C) Principal component analysis (PCA) visualizing the first two principal components (PCs). D) Elbow plot to visualize the percentage of variance explained by each PC.

+
+

Step 4: Demultiplexing and doublet detection

+

In Step 4, we are going to demultiplex the pooled samples and remove doublets (erroneous libraries produced by two or more cells) based on the expression of the sample-specific barcodes (antibody assay).

+

If the barcode labels used in the analysis are unknown, the first step is to retrieve them from the Seurat object. To do this, we do not need to modify the execution parameters and can go straight to running the following code:

+
export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 4 \
+--msd T 
+
+

The above code produces the following file:

+
step4
+├── figs4
+├── info4
+│   └── seu1.rds_old_antibody_label_MULTIseqDemuxHTOcounts.csv
+└── objs4
+
+

This file contains the names of the barcode labels (i.e., A_TotalSeqA, B_TotalSeqA, C_TotalSeqA, D_TotalSeqA, E_TotalSeqA, F_TotalSeqA, G_TotalSeqA, H_TotalSeqA, Doublet, Negative).

+

Now that we know the barcode labels used in the PBMC dataset, we can perform demultiplexing and doublet detection.

+

For our analysis of the PBMC dataset we set the following execution parameters for Step 4 (~/pipeline/job_info/parameters/step4_par.txt):

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Value
par_save_RNA | Yes
par_save_metadata | Yes
par_normalization.method | CLR
par_scale.factor | 10000
par_selection.method | vst
par_nfeatures | 2500
par_dimensionality_reduction | Yes
par_npcs_pca | 30
par_dims_umap | 3
par_n.neighbor | 65
par_dropDN | Yes
par_label_dropDN | Doublet, Negative
par_quantile | 0.9
par_autoThresh | TRUE
par_maxiter | 5
par_RidgePlot_ncol | 3
par_old_antibody_label | A-TotalSeqA, B-TotalSeqA, C-TotalSeqA, D-TotalSeqA, E-TotalSeqA, F-TotalSeqA, G-TotalSeqA, H-TotalSeqA, Doublet
par_new_antibody_label | sample-A, sample-B, sample-C, sample-D, sample-E, sample-F, sample-G, sample-H, Doublet
+

We can run Step 4 using the following code:

+
export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm
+export SCRNABOX_PWD=~/pipeline
+
+bash $SCRNABOX_HOME/launch_scrnabox.sh \
+-d ${SCRNABOX_PWD} \
+--steps 4
+
+

Step 4 produces the following outputs.

+
step4
+├── figs4
+│   ├── run1_DotPlot_HTO_MSD.pdf
+│   ├── run1_Heatmap_HTO_MSD.pdf
+│   ├── run1_HTO_dimplot_pca.pdf
+│   ├── run1_HTO_dimplot_umap.pdf
+│   ├── run1_nCounts_RNA_MSD.pdf
+│   └── run1_Ridgeplot_HTO_MSD.pdf
+├── info4
+│   ├── run1_filtered_MULTIseqDemuxHTOcounts.csv
+│   ├── run1_MetaData.txt
+│   ├── run1_meta_info_.txt
+│   ├── run1_MULTIseqDemuxHTOcounts.csv
+│   ├── run1_RNA.txt
+│   └── sessionInfo.txt
+└── objs4
+    └── run1.rds
+
+

+
+

+ +

Figure 3. Figures produced by Step 4 of the Cell Hashtag Analysis Track. A) Uniform Manifold Approximation and Projection (UMAP) plot, taking the first three principal components (PCs) of the antibody assay as input. B) Principal component analysis (PCA) showing the first two PCs of the antibody assay. C) Ridgeplot visualizing the enrichment of barcode labels across sample assignments at the sample level. D) Dot plot visualizing the enrichment of barcode labels across sample assignments at the sample level. E) Heatmap visualizing the enrichment of barcode labels across sample assignments at the cell level. F) Violin plot visualizing the distribution of the number of total RNA transcripts identified per cell, stratified by sample assignment.

+
+

Publication-ready figures

+

The code used to produce the publication-ready figures used in our pre-print manuscript is available here.

+
+

Job Configurations

+

The following job configurations were used for our analysis of the PBMC dataset. Job configurations can be modified for each analytical step in the scrnabox_config.ini file in ~/pipeline/job_info/configs.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Step | THREADS_ARRAY | MEM_ARRAY | WALLTIME_ARRAY
Step2 | 4 | 16g | 00-05:00
Step3 | 4 | 16g | 00-05:00
Step4 | 4 | 16g | 00-05:00
+ + + + + + + + + diff --git a/site/LICENSE/index.html b/site/LICENSE/index.html new file mode 100644 index 0000000..7e19f80 --- /dev/null +++ b/site/LICENSE/index.html @@ -0,0 +1,177 @@ + + + + + + + + License - Ensemblex + + + + + + + + + + + + + +

License

+

MIT License

+

Copyright (c) 2022 The Neuro Bioinformatics Core

+

Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions:

+

The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software.

+

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE.

+ + + + + + + + + diff --git a/site/Step0/index.html b/site/Step0/index.html new file mode 100644 index 0000000..7ee3fb8 --- /dev/null +++ b/site/Step0/index.html @@ -0,0 +1,234 @@ + + + + + + + + Step 1: Set up - Ensemblex + + + + + + + + + + + + + +

Step 1: Setting up the Ensemblex pipeline

+

In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use:

+
    +
  1. Demultiplexing with prior genotype information
  2. +
  3. Demultiplexing without prior genotype information
  4. +
+
+

Demultiplexing with prior genotype information

+

First, create a dedicated folder for the analysis (hereafter referred to as the working directory). Then, define the path to the working directory and the path to ensemblex.pip:

+
## Create and navigate to the working directory
+mkdir -p /path/to/working_directory
+cd /path/to/working_directory
+
+## Define the path to ensemblex.pip
+ensemblex_HOME=/path/to/ensemblex.pip
+
+## Define the path to the working directory
+ensemblex_PWD=/path/to/working_directory
+
+

Next, we can set up the working directory for demultiplexing with prior genotype information using the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT
+
+

After running the above code, the working directory should have the following structure

+
working_directory
+├── demuxalot
+├── demuxlet
+├── ensemblex_gt
+├── input_files
+├── job_info
+│   ├── configs
+│   │   └── ensemblex_config.ini
+│   ├── logs
+│   └── summary_report.txt
+├── souporcell
+└── vireo_gt
+
+
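As a quick sanity check after initialization, the directory tree above can be verified from the shell. The following is a minimal sketch, not part of the Ensemblex pipeline: `check_ensemblex_layout` is a hypothetical helper name, and the directory lists are copied from the trees shown on this page for the `init-GT` layout (mode `gt`) and the `init-noGT` layout (mode `nogt`).

```shell
# Hypothetical helper (not shipped with Ensemblex): confirm that the
# initialization step created the directories listed in this guide.
check_ensemblex_layout() {
  local wd="$1" mode="$2" missing=0 d dirs
  case "$mode" in
    gt)   dirs="demuxalot demuxlet ensemblex_gt souporcell vireo_gt" ;;
    nogt) dirs="demuxalot freemuxlet ensemblex souporcell vireo" ;;
    *)    echo "mode must be 'gt' or 'nogt'" >&2; return 2 ;;
  esac
  # Directories common to both layouts are appended to the mode-specific list.
  for d in $dirs input_files job_info/configs job_info/logs; do
    if [ ! -d "$wd/$d" ]; then
      echo "missing directory: $wd/$d" >&2
      missing=1
    fi
  done
  if [ ! -f "$wd/job_info/configs/ensemblex_config.ini" ]; then
    echo "missing file: $wd/job_info/configs/ensemblex_config.ini" >&2
    missing=1
  fi
  return "$missing"
}

# Example usage (the path is a placeholder):
# check_ensemblex_layout /path/to/working_directory gt && echo "Layout looks complete."
```

The helper only reports what is missing; it does not create anything, so it is safe to run repeatedly.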

Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files

+
+

Demultiplexing without prior genotype information

+

First, create a dedicated folder for the analysis (hereafter referred to as the working directory). Then, define the path to the working directory and the path to ensemblex.pip:

+
## Create and navigate to the working directory
+mkdir -p /path/to/working_directory
+cd /path/to/working_directory
+
+## Define the path to ensemblex.pip
+ensemblex_HOME=/path/to/ensemblex.pip
+
+## Define the path to the working directory
+ensemblex_PWD=/path/to/working_directory
+
+

Next, we can set up the working directory for demultiplexing without prior genotype information using the following code:

+
bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-noGT
+
+

After running the above code, the working directory should have the following structure

+
working_directory
+├── demuxalot
+├── freemuxlet
+├── ensemblex
+├── input_files
+├── job_info
+│   ├── configs
+│   │   └── ensemblex_config.ini
+│   ├── logs
+│   └── summary_report.txt
+├── souporcell
+└── vireo
+
+

Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files

+ + + + + + + + + diff --git a/site/Step1/index.html b/site/Step1/index.html new file mode 100644 index 0000000..57c6c3b --- /dev/null +++ b/site/Step1/index.html @@ -0,0 +1,335 @@ + + + + + + + + Step 2: Preparation of input files - Ensemblex + + + + + + + + + + + + + +

Step 2: Preparing input files for genetic demultiplexing

+

In Step 2, we will define the necessary files needed for Ensemblex's constituent genetic demultiplexing tools and will place them within the working directory. The necessary files vary depending on the version of the Ensemblex pipeline being used:

+ +
+

Demultiplexing with prior genotype information

+

Required files

+

To demultiplex the pooled samples with prior genotype information, the following files are required:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
File | Description
gene_expression.bam | Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam)
gene_expression.bam.bai | Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai)
barcodes.tsv | Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv)
pooled_samples.vcf | vcf file describing the genotypes of the pooled samples
genome_reference.fa | Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa)
genome_reference.fa.fai | Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai)
genotype_reference.vcf | Population reference vcf file (e.g., 1000 Genomes Project)
+

NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex documentation.

+

Placing files into the Ensemblex pipeline working directory

+

First, define all of the required files:

+
BAM=/path/to/possorted_genome_bam.bam
+BAM_INDEX=/path/to/possorted_genome_bam.bam.bai
+BARCODES=/path/to/barcodes.tsv
+SAMPLE_VCF=/path/to/pooled_samples.vcf
+REFERENCE_VCF=/path/to/genotype_reference.vcf
+REFERENCE_FASTA=/path/to/genome.fa
+REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai
+
+

Then, place the required files in the Ensemblex pipeline working directory:

+
## Define the path to the working directory
+ensemblex_PWD=/path/to/working_directory
+
+## Copy the files to the input_files directory in the working directory
+cp $BAM  $ensemblex_PWD/input_files/pooled_bam.bam
+cp $BAM_INDEX  $ensemblex_PWD/input_files/pooled_bam.bam.bai
+cp $BARCODES  $ensemblex_PWD/input_files/pooled_barcodes.tsv
+cp $SAMPLE_VCF  $ensemblex_PWD/input_files/pooled_samples.vcf
+cp $REFERENCE_VCF  $ensemblex_PWD/input_files/reference.vcf
+cp $REFERENCE_FASTA  $ensemblex_PWD/input_files/reference.fa
+cp $REFERENCE_FASTA_INDEX  $ensemblex_PWD/input_files/reference.fa.fai
+
+

If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files:

+
working_directory
+└── input_files
+    ├── pooled_bam.bam
+    ├── pooled_bam.bam.bai
+    ├── pooled_barcodes.tsv
+    ├── pooled_samples.vcf
+    ├── reference.fa
+    ├── reference.fa.fai
+    └── reference.vcf
+
+

NOTE: The names of the input files have been standardized. The input files must have the corresponding names for the Ensemblex pipeline to work properly.
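Because the pipeline depends on these exact file names, it can help to confirm that every expected file is present and non-empty before launching Step 3. The sketch below is an illustration only: `check_input_files` is a hypothetical helper, and the file names are taken from the listings in this step (mode `gt` additionally expects pooled_samples.vcf; mode `nogt` does not).

```shell
# Hypothetical helper (not shipped with Ensemblex): verify that the
# standardized input files exist and are non-empty.
check_input_files() {
  local wd="$1" mode="${2:-gt}" missing=0 f files
  files="pooled_bam.bam pooled_bam.bam.bai pooled_barcodes.tsv reference.vcf reference.fa reference.fa.fai"
  if [ "$mode" = "gt" ]; then
    # Prior genotype information is only required in the "gt" workflow.
    files="$files pooled_samples.vcf"
  fi
  for f in $files; do
    if [ ! -s "$wd/input_files/$f" ]; then
      echo "missing or empty: $wd/input_files/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (the path is a placeholder):
# check_input_files /path/to/working_directory gt && echo "All input files found."
```

The check covers presence and size only; it does not validate file contents or that the bam index matches the bam file.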

+

Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools

+
+

Demultiplexing without prior genotype information

+

Required files

+

To demultiplex the pooled samples without prior genotype information, the following files are required:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
File | Description
gene_expression.bam | Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam)
gene_expression.bam.bai | Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai)
barcodes.tsv | Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv)
genome_reference.fa | Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa)
genome_reference.fa.fai | Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai)
genotype_reference.vcf | Population reference vcf file (e.g., 1000 Genomes Project)
+

NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex documentation.

+

Placing files into the Ensemblex pipeline working directory

+

First, define all of the required files:

+
BAM=/path/to/possorted_genome_bam.bam
+BAM_INDEX=/path/to/possorted_genome_bam.bam.bai
+BARCODES=/path/to/barcodes.tsv
+REFERENCE_VCF=/path/to/genotype_reference.vcf
+REFERENCE_FASTA=/path/to/genome.fa
+REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai
+
+

Then, place the required files in the Ensemblex pipeline working directory:

+
## Define the path to the working directory
+ensemblex_PWD=/path/to/working_directory
+
+## Copy the files to the input_files directory in the working directory
+cp $BAM  $ensemblex_PWD/input_files/pooled_bam.bam
+cp $BAM_INDEX  $ensemblex_PWD/input_files/pooled_bam.bam.bai
+cp $BARCODES  $ensemblex_PWD/input_files/pooled_barcodes.tsv
+cp $REFERENCE_VCF  $ensemblex_PWD/input_files/reference.vcf
+cp $REFERENCE_FASTA  $ensemblex_PWD/input_files/reference.fa
+cp $REFERENCE_FASTA_INDEX  $ensemblex_PWD/input_files/reference.fa.fai
+
+

If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files:

+
working_directory
+└── input_files
+    ├── pooled_bam.bam
+    ├── pooled_bam.bam.bai
+    ├── pooled_barcodes.tsv
+    ├── reference.fa
+    ├── reference.fa.fai
+    └── reference.vcf
+
+

NOTE: The names of the input files have been standardized. The input files must have the corresponding names for the Ensemblex pipeline to work properly.

+

Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools

+ + + + + + + + + diff --git a/site/Step2/index.html b/site/Step2/index.html new file mode 100644 index 0000000..f68fe3d --- /dev/null +++ b/site/Step2/index.html @@ -0,0 +1,383 @@ + + + + + + + + Step 3: Genetic demultiplexing by constituent tools - Ensemblex + + + + + + + + + + + + + +

Step 3: Genetic demultiplexing by constituent demultiplexing tools

+

In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools. The constituent genetic demultiplexing tools will vary depending on the version of the Ensemblex pipeline being used:

+ +

NOTE: The analytical parameters for each constituent tool can be adjusted using the ensemblex_config.ini file located in ~/working_directory/job_info/configs. For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline, please see Execution parameters.

+
+

Demultiplexing with prior genotype information

+

When demultiplexing with prior genotype information, Ensemblex leverages the sample labels from

+ +
+

Demuxalot

+

To run Demuxalot use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot
+
+

If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot

+
working_directory
+└── demuxalot
+    ├── Demuxalot_result.csv
+    └── new_snps_single_file.betas
+
+
+

Demuxlet

+

To run Demuxlet use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet
+
+

If Demuxlet completed successfully, the following files should be available in ~/working_directory/demuxlet

+
working_directory
+└── demuxlet
+    ├── outs.best
+    ├── pileup.cel.gz
+    ├── pileup.plp.gz
+    ├── pileup.umi.gz
+    └── pileup.var.gz
+
+
+

Souporcell

+

To run Souporcell use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell
+
+

If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell

+
working_directory
+└── souporcell
+    ├── alt.mtx
+    ├── cluster_genotypes.vcf
+    ├── clusters_tmp.tsv
+    ├── clusters.tsv
+    ├── fq.fq
+    ├── minimap.sam
+    ├── minitagged.bam
+    ├── minitagged_sorted.bam
+    ├── minitagged_sorted.bam.bai
+    ├── Pool.vcf
+    ├── ref.mtx
+    └── soup.txt
+
+
+

Vireo-GT

+

To run Vireo-GT use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo
+
+

If Vireo-GT completed successfully, the following files should be available in ~/working_directory/vireo_gt

+
working_directory
+└── vireo_gt
+    ├── cellSNP.base.vcf.gz
+    ├── cellSNP.cells.vcf.gz
+    ├── cellSNP.samples.tsv
+    ├── cellSNP.tag.AD.mtx
+    ├── cellSNP.tag.DP.mtx
+    ├── cellSNP.tag.OTH.mtx
+    ├── donor_ids.tsv
+    ├── fig_GT_distance_estimated.pdf
+    ├── fig_GT_distance_input.pdf
+    ├── GT_donors.vireo.vcf.gz
+    ├── _log.txt
+    ├── prob_doublet.tsv.gz
+    ├── prob_singlet.tsv.gz
+    └── summary.tsv
+
+
+
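The four launch commands above differ only in the value passed to `--step`, so they can be wrapped in a small loop. This is a sketch under assumptions: `launch_steps` is a hypothetical helper, and it assumes the four steps do not depend on one another when prior genotype information is available, which this guide does not state explicitly. Whether jobs run one after another or are queued together will depend on how the launcher submits work on your system.

```shell
# Hypothetical helper (not shipped with Ensemblex): call launch_ensemblex.sh
# once per requested step, stopping at the first launch that fails.
launch_steps() {
  local home="$1" wd="$2" step
  shift 2
  for step in "$@"; do
    echo "Launching step: $step"
    bash "$home/launch_ensemblex.sh" -d "$wd" --step "$step" || return 1
  done
}

# Example usage (paths are placeholders):
# launch_steps /path/to/ensemblex.pip /path/to/working_directory \
#   demuxalot demuxlet souporcell vireo
```

Only the `-d` and `--step` options shown in this guide are used; no other launcher options are assumed.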

Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex

+
+

Demultiplexing without prior genotype information

+

When demultiplexing without prior genotype information, Ensemblex leverages the sample labels from

+ +
+

Freemuxlet

+

To run Freemuxlet use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step freemuxlet
+
+

If Freemuxlet completed successfully, the following files should be available in ~/working_directory/freemuxlet

+
working_directory
+└── freemuxlet
+    ├── outs.clust1.samples.gz
+    ├── outs.clust1.vcf
+    ├── outs.lmix
+    ├── pileup.cel.gz
+    ├── pileup.plp.gz
+    ├── pileup.umi.gz
+    └── pileup.var.gz
+
+
+

Souporcell

+

To run Souporcell use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell
+
+

If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell

+
working_directory
+└── souporcell
+    ├── alt.mtx
+    ├── cluster_genotypes.vcf
+    ├── clusters_tmp.tsv
+    ├── clusters.tsv
+    ├── fq.fq
+    ├── minimap.sam
+    ├── minitagged.bam
+    ├── minitagged_sorted.bam
+    ├── minitagged_sorted.bam.bai
+    ├── Pool.vcf
+    ├── ref.mtx
+    └── soup.txt
+
+
+

Vireo

+

To run Vireo use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo
+
+

If Vireo completed successfully, the following files should be available in ~/working_directory/vireo

+
working_directory
+└── vireo
+    ├── cellSNP.base.vcf.gz
+    ├── cellSNP.cells.vcf.gz
+    ├── cellSNP.samples.tsv
+    ├── cellSNP.tag.AD.mtx
+    ├── cellSNP.tag.DP.mtx
+    ├── cellSNP.tag.OTH.mtx
+    ├── donor_ids.tsv
+    ├── fig_GT_distance_estimated.pdf
+    ├── GT_donors.vireo.vcf.gz
+    ├── _log.txt
+    ├── prob_doublet.tsv.gz
+    ├── prob_singlet.tsv.gz
+    └── summary.tsv
+
+
+

Demuxalot

+

NOTE: Because the Demuxalot algorithm requires prior genotype information, the Ensemblex pipeline uses the predicted vcf file generated by Freemuxlet as input into Demuxalot when prior genotype information is not available. Therefore, it is important to wait for Freemuxlet to complete before running Demuxalot. To check if the required Freemuxlet-generated vcf file is available prior to running Demuxalot, you can use the following code:

+
if test -f /path/to/working_directory/freemuxlet/outs.clust1.vcf; then
+  echo "File exists."
+fi
+
+
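If Freemuxlet is still running, the one-off check above can be turned into a polling loop. The following is a minimal sketch: `wait_for_file` is a hypothetical helper, and the timeout and interval values are arbitrary. Note the limitation: a non-empty file does not prove that Freemuxlet has finished writing it, so confirming job completion through your scheduler remains the more reliable signal.

```shell
# Hypothetical helper: poll until a file exists and is non-empty,
# giving up after a timeout (both values in seconds).
wait_for_file() {
  local path="$1" timeout="${2:-3600}" interval="${3:-60}" waited=0
  until [ -s "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s waiting for $path" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "found: $path"
}

# Example usage (the path is a placeholder): wait up to 2 hours, checking every 5 minutes.
# wait_for_file /path/to/working_directory/freemuxlet/outs.clust1.vcf 7200 300
```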

Upon confirming that the required Freemuxlet-generated file exists, we can run Demuxalot using the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot
+
+

If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot

+
working_directory
+└── demuxalot
+    ├── Demuxalot_result.csv
+    └── new_snps_single_file.betas
+
+
+

Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex

+ + + + + + + + + diff --git a/site/Step3/index.html b/site/Step3/index.html new file mode 100644 index 0000000..4191581 --- /dev/null +++ b/site/Step3/index.html @@ -0,0 +1,361 @@ + + + + + + + + Step 4: Application of Ensemblex - Ensemblex + + + + + + + + + + + + + +

Step 4: Application of Ensemblex

+ +
+

Introduction

+

In Step 4, we will process the output files from the constituent genetic demultiplexing tools with the Ensemblex framework. Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools:

+

Step 1: Probabilistic-weighted ensemble
+In Step 1, Ensemblex utilizes an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool’s assignment probability distribution by its estimated balanced accuracy for the dataset. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell.
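To make the weighting idea concrete, the toy example below scores one cell. It is a simplified illustration, not the Ensemblex implementation: it assumes each tool contributes a single (label, probability) pair, multiplies that probability by the tool's balanced accuracy, and sums the products per label. The function name `weighted_label`, the input layout (tool, balanced accuracy, label, probability) and all numbers are invented for this sketch.

```shell
# Toy illustration of accuracy-weighted voting for a single cell.
# Reads CSV rows from stdin: tool,balanced_accuracy,label,probability
weighted_label() {
  awk -F',' '
    {
      if (!($3 in score)) order[++n] = $3   # remember first-seen order so ties are stable
      score[$3] += $2 * $4                  # weight the probability by balanced accuracy
    }
    END {
      if (n == 0) exit 1                    # no input rows
      best = order[1]
      for (i = 2; i <= n; i++)
        if (score[order[i]] > score[best]) best = order[i]
      printf "%s\n", best
    }'
}

# Made-up values: three tools favour sampleA, one favours sampleB.
printf '%s\n' \
  'Demuxalot,0.95,sampleA,0.90' \
  'Demuxlet,0.90,sampleA,0.80' \
  'Souporcell,0.85,sampleB,0.99' \
  'Vireo,0.92,sampleA,0.70' | weighted_label
# prints: sampleA   (sampleA scores 2.219, sampleB scores 0.8415)
```

In this toy case a single very confident tool is outvoted because the other three tools agree and carry comparable weights.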

+

Step 2: Graph-based doublet detection
+In Step 2, Ensemblex utilizes a graph-based approach to identify doublets that were incorrectly labeled as singlets in Step 1. Pooled cells are embedded into PCA space and the most confident doublets in the pool (nCD) are identified. Then, based on the Euclidean distance in PCA space, the pooled cells that surpass the percentile threshold (pT) of the nearest neighbour frequency to the confident doublets are labelled as doublets by Ensemblex. Ensemblex performs an automated parameter sweep to identify the optimal nCD and pT values; however, users can opt to manually define these parameters.

+

Step 3: Ensemble-independent doublet detection
+In Step 3, Ensemblex utilizes an ensemble-independent approach to further improve doublet detection. Here, cells that are labelled as doublets by Demuxalot or Vireo are labelled as doublets by Ensemblex; however, users can nominate different tools to utilize for Step 3, depending on the desired doublet detection stringency.
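The override rule in Step 3 can be illustrated on a small table. This is a sketch only: the column layout (barcode, ensemble label, Demuxalot label, Vireo label), the literal label `doublet`, and the function name `apply_doublet_override` are assumptions made for the example and do not reflect the actual Ensemblex file formats. Probability thresholds are ignored here.

```shell
# Toy illustration: relabel a cell as a doublet when either nominated tool
# (columns 3 and 4) calls it a doublet. Reads CSV with a header from stdin.
apply_doublet_override() {
  awk -F',' 'BEGIN { OFS = "," }
    NR == 1 { print; next }                                  # pass the header through
    $3 == "doublet" || $4 == "doublet" { $2 = "doublet" }    # override the ensemble label
    { print }'
}

# Example usage with made-up rows:
# printf '%s\n' 'barcode,ensemble,demuxalot,vireo' 'c1,A,A,A' 'c2,A,doublet,A' \
#   | apply_doublet_override
```

Nominating more tools makes the rule more aggressive, since any single doublet call is enough to override the ensemble label.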

+
+

Ensemblex parameters

+

Users can choose to run each step of the Ensemblex framework sequentially (Steps 1 to 3) or can opt to skip certain steps. While Step 1 is necessary to generate the ensemble sample labels, Steps 2 and 3 were implemented to improve Ensemblex's ability to identify doublets; thus, if users do not want to prioritize doublet detection, they may skip Steps 2 and/or 3. Nonetheless, we demonstrated in our pre-print manuscript that utilizing the entire Ensemblex framework is important for maximizing the demultiplexing accuracy. Users can define which steps of the Ensemblex framework they want to utilize in the adjustable parameters file.

+

The adjustable parameters file (ensemblex_config.ini) is located in ~/working_directory/job_info/configs/. For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline please see Execution parameters. The following parameters are adjustable when applying the Ensemblex algorithm:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
Pool parameters
PAR_ensemblex_sample_size | NULL | Number of samples multiplexed in the pool.
PAR_ensemblex_expected_doublet_rate | NULL | Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation.
Set up parameters
PAR_ensemblex_merge_constituents | Yes | Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to "Yes". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to "No" as the pipeline will automatically detect the previously generated merged file.
Step 1 parameters: Probabilistic-weighted ensemble
PAR_ensemblex_probabilistic_weighted_ensemble | Yes | Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to "Yes". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to "No" as the pipeline will automatically detect the previously generated Step 1 output file.
Step 2 parameters: Graph-based doublet detection
PAR_ensemblex_preliminary_parameter_sweep | No | Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively.
PAR_ensemblex_nCD | NULL | Manually defined number of confident doublets in the pool (nCD). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to "Yes".
PAR_ensemblex_pT | NULL | Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to "Yes".
PAR_ensemblex_graph_based_doublet_detection | Yes | Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values.
Step 3 parameters: Ensemble-independent doublet detection
PAR_ensemblex_preliminary_ensemble_independent_doublet | No | Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters.
PAR_ensemblex_ensemble_independent_doublet | Yes | Whether or not to perform Step 3: Ensemble-independent doublet detection.
PAR_ensemblex_doublet_Demuxalot_threshold | Yes | Whether or not to label doublets identified by Demuxalot as doublets. Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Demuxalot_no_threshold | No | Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Demuxlet_threshold | No | Whether or not to label doublets identified by Demuxlet as doublets. Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Demuxlet_no_threshold | No | Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Souporcell_threshold | No | Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Souporcell_no_threshold | No | Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Vireo_threshold | Yes | Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Vireo_no_threshold | No | Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability.
Confidence score parameters
PAR_ensemblex_compute_singlet_confidence | Yes | Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses.
+
+
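Parameters can be edited by hand in any text editor. For scripted runs, a helper such as the one below may be convenient. It is a sketch resting on an assumption this page does not confirm: that ensemblex_config.ini stores each parameter on its own line as `NAME=value`. Please inspect your own file first and adapt the pattern if the layout differs. `set_config_param` is a hypothetical name.

```shell
# Hypothetical helper: update an existing NAME=value line in a config file.
# ASSUMPTION: the file uses one NAME=value pair per line.
set_config_param() {
  local file="$1" key="$2" value="$3"
  if ! grep -q "^${key}=" "$file"; then
    # Refuse to add unknown keys, which would usually indicate a typo.
    echo "parameter not found in $file: $key" >&2
    return 1
  fi
  # Edit in place and keep a backup copy next to the file (suffix .bak).
  # Values containing "|" or "&" would need escaping and are not handled here.
  sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
}

# Example usage (path and value are placeholders):
# set_config_param /path/to/working_directory/job_info/configs/ensemblex_config.ini \
#   PAR_ensemblex_sample_size 9
```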

Applying the Ensemblex algorithm

+

To apply the Ensemblex algorithm use the following code:

+
ensemblex_HOME=/path/to/ensemblex.pip
+ensemblex_PWD=/path/to/working_directory
+
+bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step ensemblexing
+
+

If the Ensemblex algorithm completed successfully, the following files should be available in ~/working_directory/ensemblex:

+
working_directory
+└── ensemblex
+    ├── confidence
+    │   └── ensemblex_final_cell_assignment.csv
+    ├── constituent_tool_merge.csv
+    ├── step1
+    │   ├── ARI_demultiplexing_tools.pdf
+    │   ├── BA_demultiplexing_tools.pdf
+    │   ├── Balanced_accuracy_summary.csv
+    │   └── step1_cell_assignment.csv
+    ├── step2
+    │   ├── optimal_nCD.pdf
+    │   ├── optimal_pT.pdf
+    │   ├── PC1_var_contrib.pdf
+    │   ├── PC2_var_contrib.pdf
+    │   ├── PCA1_graph_based_doublet_detection.pdf
+    │   ├── PCA2_graph_based_doublet_detection.pdf
+    │   ├── PCA3_graph_based_doublet_detection.pdf
+    │   ├── PCA_plot.pdf
+    │   ├── PCA_scree_plot.pdf
+    │   └── Step2_cell_assignment.csv
+    └── step3
+        ├── Doublet_overlap_no_threshold.pdf
+        ├── Doublet_overlap_threshold.pdf
+        ├── Number_Ensemblux_doublets_EID_no_threshold.pdf
+        ├── Number_Ensemblux_doublets_EID_threshold.pdf
+        └── Step3_cell_assignment.csv
+
+
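A quick way to get a feel for the results is to tabulate how many cells received each label. The sketch below is generic: `count_labels` is a hypothetical helper, and the column name must be supplied by the user because the header of ensemblex_final_cell_assignment.csv is not described on this page. It assumes a plain comma-separated file with an unquoted header row and no commas inside fields.

```shell
# Hypothetical helper: count rows per distinct value of a named CSV column.
# Prints "value<TAB>count", sorted by value.
count_labels() {
  local file="$1" column="$2"
  awk -F',' -v col="$column" '
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == col) idx = i   # locate the requested column
      if (!idx) { print "column not found: " col > "/dev/stderr"; bad = 1; exit 1 }
      next
    }
    { count[$idx]++ }
    END { if (!bad) for (k in count) printf "%s\t%d\n", k, count[k] }
  ' "$file" | sort
}

# Example usage (path and column name are placeholders; check the file header first):
# head -n 1 /path/to/working_directory/ensemblex/confidence/ensemblex_final_cell_assignment.csv
# count_labels /path/to/working_directory/ensemblex/confidence/ensemblex_final_cell_assignment.csv <column_name>
```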

For a comprehensive description of the Ensemblex algorithm output files, please see Ensemblex outputs.

+ + + + + + + + + diff --git a/site/contributing/index.html b/site/contributing/index.html new file mode 100644 index 0000000..1a4f7d2 --- /dev/null +++ b/site/contributing/index.html @@ -0,0 +1,166 @@ + + + + + + + + Help and Feedback - Ensemblex + + + + + + + + + + + + + +

Help and Feedback

+

Any contributions or suggestions for improving the Ensemblex pipeline are welcome and appreciated. You may directly contact Michael Fiorini or Saeid Amiri.

+

If you encounter any issues, please open an issue in the GitHub repository.

+

Alternatively, you are welcome to email the developers directly; for any questions, please contact Michael Fiorini: michael.fiorini@mail.mcgill.ca

+ + + + + + + + + diff --git a/site/css/fonts/Roboto-Slab-Bold.woff b/site/css/fonts/Roboto-Slab-Bold.woff new file mode 100644 index 0000000..6cb6000 Binary files /dev/null and b/site/css/fonts/Roboto-Slab-Bold.woff differ diff --git a/site/css/fonts/Roboto-Slab-Bold.woff2 b/site/css/fonts/Roboto-Slab-Bold.woff2 new file mode 100644 index 0000000..7059e23 Binary files /dev/null and b/site/css/fonts/Roboto-Slab-Bold.woff2 differ diff --git a/site/css/fonts/Roboto-Slab-Regular.woff b/site/css/fonts/Roboto-Slab-Regular.woff new file mode 100644 index 0000000..f815f63 Binary files /dev/null and b/site/css/fonts/Roboto-Slab-Regular.woff differ diff --git a/site/css/fonts/Roboto-Slab-Regular.woff2 b/site/css/fonts/Roboto-Slab-Regular.woff2 new file mode 100644 index 0000000..f2c76e5 Binary files /dev/null and b/site/css/fonts/Roboto-Slab-Regular.woff2 differ diff --git a/site/css/fonts/fontawesome-webfont.eot b/site/css/fonts/fontawesome-webfont.eot new file mode 100644 index 0000000..e9f60ca Binary files /dev/null and b/site/css/fonts/fontawesome-webfont.eot differ diff --git a/site/css/fonts/fontawesome-webfont.svg b/site/css/fonts/fontawesome-webfont.svg new file mode 100644 index 0000000..855c845 --- /dev/null +++ b/site/css/fonts/fontawesome-webfont.svg @@ -0,0 +1,2671 @@ + + + + +Created by FontForge 20120731 at Mon Oct 24 17:37:40 2016 + By ,,, +Copyright Dave Gandy 2016. All rights reserved. 
diff --git a/site/css/fonts/fontawesome-webfont.ttf b/site/css/fonts/fontawesome-webfont.ttf new file mode 100644 index 0000000..35acda2 Binary files /dev/null and b/site/css/fonts/fontawesome-webfont.ttf differ diff --git a/site/css/fonts/fontawesome-webfont.woff b/site/css/fonts/fontawesome-webfont.woff new file mode 100644 index 0000000..400014a Binary files /dev/null and b/site/css/fonts/fontawesome-webfont.woff differ diff --git a/site/css/fonts/fontawesome-webfont.woff2 b/site/css/fonts/fontawesome-webfont.woff2 new file mode 100644 index 0000000..4d13fc6
Binary files /dev/null and b/site/css/fonts/fontawesome-webfont.woff2 differ diff --git a/site/css/fonts/lato-bold-italic.woff b/site/css/fonts/lato-bold-italic.woff new file mode 100644 index 0000000..88ad05b Binary files /dev/null and b/site/css/fonts/lato-bold-italic.woff differ diff --git a/site/css/fonts/lato-bold-italic.woff2 b/site/css/fonts/lato-bold-italic.woff2 new file mode 100644 index 0000000..c4e3d80 Binary files /dev/null and b/site/css/fonts/lato-bold-italic.woff2 differ diff --git a/site/css/fonts/lato-bold.woff b/site/css/fonts/lato-bold.woff new file mode 100644 index 0000000..c6dff51 Binary files /dev/null and b/site/css/fonts/lato-bold.woff differ diff --git a/site/css/fonts/lato-bold.woff2 b/site/css/fonts/lato-bold.woff2 new file mode 100644 index 0000000..bb19504 Binary files /dev/null and b/site/css/fonts/lato-bold.woff2 differ diff --git a/site/css/fonts/lato-normal-italic.woff b/site/css/fonts/lato-normal-italic.woff new file mode 100644 index 0000000..76114bc Binary files /dev/null and b/site/css/fonts/lato-normal-italic.woff differ diff --git a/site/css/fonts/lato-normal-italic.woff2 b/site/css/fonts/lato-normal-italic.woff2 new file mode 100644 index 0000000..3404f37 Binary files /dev/null and b/site/css/fonts/lato-normal-italic.woff2 differ diff --git a/site/css/fonts/lato-normal.woff b/site/css/fonts/lato-normal.woff new file mode 100644 index 0000000..ae1307f Binary files /dev/null and b/site/css/fonts/lato-normal.woff differ diff --git a/site/css/fonts/lato-normal.woff2 b/site/css/fonts/lato-normal.woff2 new file mode 100644 index 0000000..3bf9843 Binary files /dev/null and b/site/css/fonts/lato-normal.woff2 differ diff --git a/site/css/theme.css b/site/css/theme.css new file mode 100644 index 0000000..ad77300 --- /dev/null +++ b/site/css/theme.css @@ -0,0 +1,13 @@ +/* + * This file is copied from the upstream ReadTheDocs Sphinx + * theme. To aid upgradability this file should *not* be edited. 
+ * modifications we need should be included in theme_extra.css. + * + * https://github.com/readthedocs/sphinx_rtd_theme + */ + + /* sphinx_rtd_theme version 1.2.0 | MIT license */ +html{box-sizing:border-box}*,:after,:before{box-sizing:inherit}article,aside,details,figcaption,figure,footer,header,hgroup,nav,section{display:block}audio,canvas,video{display:inline-block;*display:inline;*zoom:1}[hidden],audio:not([controls]){display:none}*{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}html{font-size:100%;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%}body{margin:0}a:active,a:hover{outline:0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:700}blockquote{margin:0}dfn{font-style:italic}ins{background:#ff9;text-decoration:none}ins,mark{color:#000}mark{background:#ff0;font-style:italic;font-weight:700}.rst-content code,.rst-content tt,code,kbd,pre,samp{font-family:monospace,serif;_font-family:courier new,monospace;font-size:1em}pre{white-space:pre}q{quotes:none}q:after,q:before{content:"";content:none}small{font-size:85%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}sup{top:-.5em}sub{bottom:-.25em}dl,ol,ul{margin:0;padding:0;list-style:none;list-style-image:none}li{list-style:none}dd{margin:0}img{border:0;-ms-interpolation-mode:bicubic;vertical-align:middle;max-width:100%}svg:not(:root){overflow:hidden}figure,form{margin:0}label{cursor:pointer}button,input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}button,input{line-height:normal}button,input[type=button],input[type=reset],input[type=submit]{cursor:pointer;-webkit-appearance:button;*overflow:visible}button[disabled],input[disabled]{cursor:default}input[type=search]{-webkit-appearance:textfield;-moz-box-sizing:content-box;-webkit-box-sizing:content-box;box-sizing:content-box}textarea{resize:vertical}table{border-collapse:collapse;border-spacing:0}td{vertical-align:top}.chromeframe{margin:.2em 
0;background:#ccc;color:#000;padding:.2em 0}.ir{display:block;border:0;text-indent:-999em;overflow:hidden;background-color:transparent;background-repeat:no-repeat;text-align:left;direction:ltr;*line-height:0}.ir br{display:none}.hidden{display:none!important;visibility:hidden}.visuallyhidden{border:0;clip:rect(0 0 0 0);height:1px;margin:-1px;overflow:hidden;padding:0;position:absolute;width:1px}.visuallyhidden.focusable:active,.visuallyhidden.focusable:focus{clip:auto;height:auto;margin:0;overflow:visible;position:static;width:auto}.invisible{visibility:hidden}.relative{position:relative}big,small{font-size:100%}@media print{body,html,section{background:none!important}*{box-shadow:none!important;text-shadow:none!important;filter:none!important;-ms-filter:none!important}a,a:visited{text-decoration:underline}.ir a:after,a[href^="#"]:after,a[href^="javascript:"]:after{content:""}blockquote,pre{page-break-inside:avoid}thead{display:table-header-group}img,tr{page-break-inside:avoid}img{max-width:100%!important}@page{margin:.5cm}.rst-content .toctree-wrapper>p.caption,h2,h3,p{orphans:3;widows:3}.rst-content .toctree-wrapper>p.caption,h2,h3{page-break-after:avoid}}.btn,.fa:before,.icon:before,.rst-content .admonition,.rst-content .admonition-title:before,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .code-block-caption .headerlink:before,.rst-content .danger,.rst-content .eqno .headerlink:before,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning,.rst-content code.download span:first-child:before,.rst-content dl dt .headerlink:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content p .headerlink:before,.rst-content 
table>caption .headerlink:before,.rst-content tt.download span:first-child:before,.wy-alert,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before,.wy-menu-vertical li button.toctree-expand:before,input[type=color],input[type=date],input[type=datetime-local],input[type=datetime],input[type=email],input[type=month],input[type=number],input[type=password],input[type=search],input[type=tel],input[type=text],input[type=time],input[type=url],input[type=week],select,textarea{-webkit-font-smoothing:antialiased}.clearfix{*zoom:1}.clearfix:after,.clearfix:before{display:table;content:""}.clearfix:after{clear:both}/*! + * Font Awesome 4.7.0 by @davegandy - http://fontawesome.io - @fontawesome + * License - http://fontawesome.io/license (Font: SIL OFL 1.1, CSS: MIT License) + */@font-face{font-family:FontAwesome;src:url(fonts/fontawesome-webfont.eot?674f50d287a8c48dc19ba404d20fe713);src:url(fonts/fontawesome-webfont.eot?674f50d287a8c48dc19ba404d20fe713?#iefix&v=4.7.0) format("embedded-opentype"),url(fonts/fontawesome-webfont.woff2?af7ae505a9eed503f8b8e6982036873e) format("woff2"),url(fonts/fontawesome-webfont.woff?fee66e712a8a08eef5805a46892932ad) format("woff"),url(fonts/fontawesome-webfont.ttf?b06871f281fee6b241d60582ae9369b9) format("truetype"),url(fonts/fontawesome-webfont.svg?912ec66d7572ff821749319396470bde#fontawesomeregular) format("svg");font-weight:400;font-style:normal}.fa,.icon,.rst-content .admonition-title,.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content code.download span:first-child,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 
.headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink,.rst-content tt.download span:first-child,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li button.toctree-expand{display:inline-block;font:normal normal normal 14px/1 FontAwesome;font-size:inherit;text-rendering:auto;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}.fa-lg{font-size:1.33333em;line-height:.75em;vertical-align:-15%}.fa-2x{font-size:2em}.fa-3x{font-size:3em}.fa-4x{font-size:4em}.fa-5x{font-size:5em}.fa-fw{width:1.28571em;text-align:center}.fa-ul{padding-left:0;margin-left:2.14286em;list-style-type:none}.fa-ul>li{position:relative}.fa-li{position:absolute;left:-2.14286em;width:2.14286em;top:.14286em;text-align:center}.fa-li.fa-lg{left:-1.85714em}.fa-border{padding:.2em .25em .15em;border:.08em solid #eee;border-radius:.1em}.fa-pull-left{float:left}.fa-pull-right{float:right}.fa-pull-left.icon,.fa.fa-pull-left,.rst-content .code-block-caption .fa-pull-left.headerlink,.rst-content .eqno .fa-pull-left.headerlink,.rst-content .fa-pull-left.admonition-title,.rst-content code.download span.fa-pull-left:first-child,.rst-content dl dt .fa-pull-left.headerlink,.rst-content h1 .fa-pull-left.headerlink,.rst-content h2 .fa-pull-left.headerlink,.rst-content h3 .fa-pull-left.headerlink,.rst-content h4 .fa-pull-left.headerlink,.rst-content h5 .fa-pull-left.headerlink,.rst-content h6 .fa-pull-left.headerlink,.rst-content p .fa-pull-left.headerlink,.rst-content table>caption .fa-pull-left.headerlink,.rst-content tt.download span.fa-pull-left:first-child,.wy-menu-vertical li.current>a button.fa-pull-left.toctree-expand,.wy-menu-vertical li.on a button.fa-pull-left.toctree-expand,.wy-menu-vertical li 
button.fa-pull-left.toctree-expand{margin-right:.3em}.fa-pull-right.icon,.fa.fa-pull-right,.rst-content .code-block-caption .fa-pull-right.headerlink,.rst-content .eqno .fa-pull-right.headerlink,.rst-content .fa-pull-right.admonition-title,.rst-content code.download span.fa-pull-right:first-child,.rst-content dl dt .fa-pull-right.headerlink,.rst-content h1 .fa-pull-right.headerlink,.rst-content h2 .fa-pull-right.headerlink,.rst-content h3 .fa-pull-right.headerlink,.rst-content h4 .fa-pull-right.headerlink,.rst-content h5 .fa-pull-right.headerlink,.rst-content h6 .fa-pull-right.headerlink,.rst-content p .fa-pull-right.headerlink,.rst-content table>caption .fa-pull-right.headerlink,.rst-content tt.download span.fa-pull-right:first-child,.wy-menu-vertical li.current>a button.fa-pull-right.toctree-expand,.wy-menu-vertical li.on a button.fa-pull-right.toctree-expand,.wy-menu-vertical li button.fa-pull-right.toctree-expand{margin-left:.3em}.pull-right{float:right}.pull-left{float:left}.fa.pull-left,.pull-left.icon,.rst-content .code-block-caption .pull-left.headerlink,.rst-content .eqno .pull-left.headerlink,.rst-content .pull-left.admonition-title,.rst-content code.download span.pull-left:first-child,.rst-content dl dt .pull-left.headerlink,.rst-content h1 .pull-left.headerlink,.rst-content h2 .pull-left.headerlink,.rst-content h3 .pull-left.headerlink,.rst-content h4 .pull-left.headerlink,.rst-content h5 .pull-left.headerlink,.rst-content h6 .pull-left.headerlink,.rst-content p .pull-left.headerlink,.rst-content table>caption .pull-left.headerlink,.rst-content tt.download span.pull-left:first-child,.wy-menu-vertical li.current>a button.pull-left.toctree-expand,.wy-menu-vertical li.on a button.pull-left.toctree-expand,.wy-menu-vertical li button.pull-left.toctree-expand{margin-right:.3em}.fa.pull-right,.pull-right.icon,.rst-content .code-block-caption .pull-right.headerlink,.rst-content .eqno .pull-right.headerlink,.rst-content .pull-right.admonition-title,.rst-content 
code.download span.pull-right:first-child,.rst-content dl dt .pull-right.headerlink,.rst-content h1 .pull-right.headerlink,.rst-content h2 .pull-right.headerlink,.rst-content h3 .pull-right.headerlink,.rst-content h4 .pull-right.headerlink,.rst-content h5 .pull-right.headerlink,.rst-content h6 .pull-right.headerlink,.rst-content p .pull-right.headerlink,.rst-content table>caption .pull-right.headerlink,.rst-content tt.download span.pull-right:first-child,.wy-menu-vertical li.current>a button.pull-right.toctree-expand,.wy-menu-vertical li.on a button.pull-right.toctree-expand,.wy-menu-vertical li button.pull-right.toctree-expand{margin-left:.3em}.fa-spin{-webkit-animation:fa-spin 2s linear infinite;animation:fa-spin 2s linear infinite}.fa-pulse{-webkit-animation:fa-spin 1s steps(8) infinite;animation:fa-spin 1s steps(8) infinite}@-webkit-keyframes fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}to{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}@keyframes fa-spin{0%{-webkit-transform:rotate(0deg);transform:rotate(0deg)}to{-webkit-transform:rotate(359deg);transform:rotate(359deg)}}.fa-rotate-90{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=1)";-webkit-transform:rotate(90deg);-ms-transform:rotate(90deg);transform:rotate(90deg)}.fa-rotate-180{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2)";-webkit-transform:rotate(180deg);-ms-transform:rotate(180deg);transform:rotate(180deg)}.fa-rotate-270{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=3)";-webkit-transform:rotate(270deg);-ms-transform:rotate(270deg);transform:rotate(270deg)}.fa-flip-horizontal{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=0, mirror=1)";-webkit-transform:scaleX(-1);-ms-transform:scaleX(-1);transform:scaleX(-1)}.fa-flip-vertical{-ms-filter:"progid:DXImageTransform.Microsoft.BasicImage(rotation=2, mirror=1)";-webkit-transform:scaleY(-1);-ms-transform:scaleY(-1);transform:scaleY(-1)}:root 
.fa-flip-horizontal,:root .fa-flip-vertical,:root .fa-rotate-90,:root .fa-rotate-180,:root .fa-rotate-270{filter:none}.fa-stack{position:relative;display:inline-block;width:2em;height:2em;line-height:2em;vertical-align:middle}.fa-stack-1x,.fa-stack-2x{position:absolute;left:0;width:100%;text-align:center}.fa-stack-1x{line-height:inherit}.fa-stack-2x{font-size:2em}.fa-inverse{color:#fff}.fa-glass:before{content:""}.fa-music:before{content:""}.fa-search:before,.icon-search:before{content:""}.fa-envelope-o:before{content:""}.fa-heart:before{content:""}.fa-star:before{content:""}.fa-star-o:before{content:""}.fa-user:before{content:""}.fa-film:before{content:""}.fa-th-large:before{content:""}.fa-th:before{content:""}.fa-th-list:before{content:""}.fa-check:before{content:""}.fa-close:before,.fa-remove:before,.fa-times:before{content:""}.fa-search-plus:before{content:""}.fa-search-minus:before{content:""}.fa-power-off:before{content:""}.fa-signal:before{content:""}.fa-cog:before,.fa-gear:before{content:""}.fa-trash-o:before{content:""}.fa-home:before,.icon-home:before{content:""}.fa-file-o:before{content:""}.fa-clock-o:before{content:""}.fa-road:before{content:""}.fa-download:before,.rst-content code.download span:first-child:before,.rst-content tt.download 
span:first-child:before{content:""}.fa-arrow-circle-o-down:before{content:""}.fa-arrow-circle-o-up:before{content:""}.fa-inbox:before{content:""}.fa-play-circle-o:before{content:""}.fa-repeat:before,.fa-rotate-right:before{content:""}.fa-refresh:before{content:""}.fa-list-alt:before{content:""}.fa-lock:before{content:""}.fa-flag:before{content:""}.fa-headphones:before{content:""}.fa-volume-off:before{content:""}.fa-volume-down:before{content:""}.fa-volume-up:before{content:""}.fa-qrcode:before{content:""}.fa-barcode:before{content:""}.fa-tag:before{content:""}.fa-tags:before{content:""}.fa-book:before,.icon-book:before{content:""}.fa-bookmark:before{content:""}.fa-print:before{content:""}.fa-camera:before{content:""}.fa-font:before{content:""}.fa-bold:before{content:""}.fa-italic:before{content:""}.fa-text-height:before{content:""}.fa-text-width:before{content:""}.fa-align-left:before{content:""}.fa-align-center:before{content:""}.fa-align-right:before{content:""}.fa-align-justify:before{content:""}.fa-list:before{content:""}.fa-dedent:before,.fa-outdent:before{content:""}.fa-indent:before{content:""}.fa-video-camera:before{content:""}.fa-image:before,.fa-photo:before,.fa-picture-o:before{content:""}.fa-pencil:before{content:""}.fa-map-marker:before{content:""}.fa-adjust:before{content:""}.fa-tint:before{content:""}.fa-edit:before,.fa-pencil-square-o:before{content:""}.fa-share-square-o:before{content:""}.fa-check-square-o:before{content:""}.fa-arrows:before{content:""}.fa-step-backward:before{content:""}.fa-fast-backward:before{content:""}.fa-backward:before{content:""}.fa-play:before{content:""}.fa-pause:before{content:""}.fa-stop:before{content:""}.fa-forward:before{content:""}.fa-fast-forward:before{content:""}.fa-step-forward:before{content:""}.fa-eject:before{content:""}.fa-chevron-left:before{content:""}.fa-chevron-right:before{content:""}.fa-plus-circle:before{content:""}.fa-minus-circle:before{content
:""}.fa-times-circle:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before{content:""}.fa-check-circle:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before{content:""}.fa-question-circle:before{content:""}.fa-info-circle:before{content:""}.fa-crosshairs:before{content:""}.fa-times-circle-o:before{content:""}.fa-check-circle-o:before{content:""}.fa-ban:before{content:""}.fa-arrow-left:before{content:""}.fa-arrow-right:before{content:""}.fa-arrow-up:before{content:""}.fa-arrow-down:before{content:""}.fa-mail-forward:before,.fa-share:before{content:""}.fa-expand:before{content:""}.fa-compress:before{content:""}.fa-plus:before{content:""}.fa-minus:before{content:""}.fa-asterisk:before{content:""}.fa-exclamation-circle:before,.rst-content .admonition-title:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before{content:""}.fa-gift:before{content:""}.fa-leaf:before{content:""}.fa-fire:before,.icon-fire:before{content:""}.fa-eye:before{content:""}.fa-eye-slash:before{content:""}.fa-exclamation-triangle:before,.fa-warning:before{content:""}.fa-plane:before{content:""}.fa-calendar:before{content:""}.fa-random:before{content:""}.fa-comment:before{content:""}.fa-magnet:before{content:""}.fa-chevron-up:before{content:""}.fa-chevron-down:before{content:""}.fa-retweet:before{content:""}.fa-shopping-cart:before{content:""}.fa-folder:before{content:""}.fa-folder-open:before{content:""}.fa-arrows-v:before{content:""}.fa-arrows-h:before{content:""}.fa-bar-chart-o:before,.fa-bar-chart:before{content:""}.fa-twitter-square:before{content:""}.fa-facebook-square:before{content:""}.fa-camera-retro:before{content:""}.fa-key:before{content:""}.fa-cogs:before,.fa-gears:before{content:""}.fa-comments:before{content:""}.fa-thumbs-o-up:before{content:""}.fa-thumbs-o-down:before{content:""}.fa-star-half:before
{content:""}.fa-heart-o:before{content:""}.fa-sign-out:before{content:""}.fa-linkedin-square:before{content:""}.fa-thumb-tack:before{content:""}.fa-external-link:before{content:""}.fa-sign-in:before{content:""}.fa-trophy:before{content:""}.fa-github-square:before{content:""}.fa-upload:before{content:""}.fa-lemon-o:before{content:""}.fa-phone:before{content:""}.fa-square-o:before{content:""}.fa-bookmark-o:before{content:""}.fa-phone-square:before{content:""}.fa-twitter:before{content:""}.fa-facebook-f:before,.fa-facebook:before{content:""}.fa-github:before,.icon-github:before{content:""}.fa-unlock:before{content:""}.fa-credit-card:before{content:""}.fa-feed:before,.fa-rss:before{content:""}.fa-hdd-o:before{content:""}.fa-bullhorn:before{content:""}.fa-bell:before{content:""}.fa-certificate:before{content:""}.fa-hand-o-right:before{content:""}.fa-hand-o-left:before{content:""}.fa-hand-o-up:before{content:""}.fa-hand-o-down:before{content:""}.fa-arrow-circle-left:before,.icon-circle-arrow-left:before{content:""}.fa-arrow-circle-right:before,.icon-circle-arrow-right:before{content:""}.fa-arrow-circle-up:before{content:""}.fa-arrow-circle-down:before{content:""}.fa-globe:before{content:""}.fa-wrench:before{content:""}.fa-tasks:before{content:""}.fa-filter:before{content:""}.fa-briefcase:before{content:""}.fa-arrows-alt:before{content:""}.fa-group:before,.fa-users:before{content:""}.fa-chain:before,.fa-link:before,.icon-link:before{content:""}.fa-cloud:before{content:""}.fa-flask:before{content:""}.fa-cut:before,.fa-scissors:before{content:""}.fa-copy:before,.fa-files-o:before{content:""}.fa-paperclip:before{content:""}.fa-floppy-o:before,.fa-save:before{content:""}.fa-square:before{content:""}.fa-bars:before,.fa-navicon:before,.fa-reorder:before{content:""}.fa-list-ul:before{content:""}.fa-list-ol:before{content:""}.fa-strikethrough:before{content:""}.fa-underline:before{content:""}.fa-table:before{content:""}.fa-magi
c:before{content:""}.fa-truck:before{content:""}.fa-pinterest:before{content:""}.fa-pinterest-square:before{content:""}.fa-google-plus-square:before{content:""}.fa-google-plus:before{content:""}.fa-money:before{content:""}.fa-caret-down:before,.icon-caret-down:before,.wy-dropdown .caret:before{content:""}.fa-caret-up:before{content:""}.fa-caret-left:before{content:""}.fa-caret-right:before{content:""}.fa-columns:before{content:""}.fa-sort:before,.fa-unsorted:before{content:""}.fa-sort-desc:before,.fa-sort-down:before{content:""}.fa-sort-asc:before,.fa-sort-up:before{content:""}.fa-envelope:before{content:""}.fa-linkedin:before{content:""}.fa-rotate-left:before,.fa-undo:before{content:""}.fa-gavel:before,.fa-legal:before{content:""}.fa-dashboard:before,.fa-tachometer:before{content:""}.fa-comment-o:before{content:""}.fa-comments-o:before{content:""}.fa-bolt:before,.fa-flash:before{content:""}.fa-sitemap:before{content:""}.fa-umbrella:before{content:""}.fa-clipboard:before,.fa-paste:before{content:""}.fa-lightbulb-o:before{content:""}.fa-exchange:before{content:""}.fa-cloud-download:before{content:""}.fa-cloud-upload:before{content:""}.fa-user-md:before{content:""}.fa-stethoscope:before{content:""}.fa-suitcase:before{content:""}.fa-bell-o:before{content:""}.fa-coffee:before{content:""}.fa-cutlery:before{content:""}.fa-file-text-o:before{content:""}.fa-building-o:before{content:""}.fa-hospital-o:before{content:""}.fa-ambulance:before{content:""}.fa-medkit:before{content:""}.fa-fighter-jet:before{content:""}.fa-beer:before{content:""}.fa-h-square:before{content:""}.fa-plus-square:before{content:""}.fa-angle-double-left:before{content:""}.fa-angle-double-right:before{content:""}.fa-angle-double-up:before{content:""}.fa-angle-double-down:before{content:""}.fa-angle-left:before{content:""}.fa-angle-right:before{content:""}.fa-angle-up:before{content:""}.fa-angle-down:before{content:""}.fa-desktop:before{content:""}.fa-l
aptop:before{content:""}.fa-tablet:before{content:""}.fa-mobile-phone:before,.fa-mobile:before{content:""}.fa-circle-o:before{content:""}.fa-quote-left:before{content:""}.fa-quote-right:before{content:""}.fa-spinner:before{content:""}.fa-circle:before{content:""}.fa-mail-reply:before,.fa-reply:before{content:""}.fa-github-alt:before{content:""}.fa-folder-o:before{content:""}.fa-folder-open-o:before{content:""}.fa-smile-o:before{content:""}.fa-frown-o:before{content:""}.fa-meh-o:before{content:""}.fa-gamepad:before{content:""}.fa-keyboard-o:before{content:""}.fa-flag-o:before{content:""}.fa-flag-checkered:before{content:""}.fa-terminal:before{content:""}.fa-code:before{content:""}.fa-mail-reply-all:before,.fa-reply-all:before{content:""}.fa-star-half-empty:before,.fa-star-half-full:before,.fa-star-half-o:before{content:""}.fa-location-arrow:before{content:""}.fa-crop:before{content:""}.fa-code-fork:before{content:""}.fa-chain-broken:before,.fa-unlink:before{content:""}.fa-question:before{content:""}.fa-info:before{content:""}.fa-exclamation:before{content:""}.fa-superscript:before{content:""}.fa-subscript:before{content:""}.fa-eraser:before{content:""}.fa-puzzle-piece:before{content:""}.fa-microphone:before{content:""}.fa-microphone-slash:before{content:""}.fa-shield:before{content:""}.fa-calendar-o:before{content:""}.fa-fire-extinguisher:before{content:""}.fa-rocket:before{content:""}.fa-maxcdn:before{content:""}.fa-chevron-circle-left:before{content:""}.fa-chevron-circle-right:before{content:""}.fa-chevron-circle-up:before{content:""}.fa-chevron-circle-down:before{content:""}.fa-html5:before{content:""}.fa-css3:before{content:""}.fa-anchor:before{content:""}.fa-unlock-alt:before{content:""}.fa-bullseye:before{content:""}.fa-ellipsis-h:before{content:""}.fa-ellipsis-v:before{content:""}.fa-rss-square:before{content:""}.fa-play-circle:before{content:""}.fa-ticket:before{content:""}.fa-minus-square:before{content:
""}.fa-minus-square-o:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before{content:""}.fa-level-up:before{content:""}.fa-level-down:before{content:""}.fa-check-square:before{content:""}.fa-pencil-square:before{content:""}.fa-external-link-square:before{content:""}.fa-share-square:before{content:""}.fa-compass:before{content:""}.fa-caret-square-o-down:before,.fa-toggle-down:before{content:""}.fa-caret-square-o-up:before,.fa-toggle-up:before{content:""}.fa-caret-square-o-right:before,.fa-toggle-right:before{content:""}.fa-eur:before,.fa-euro:before{content:""}.fa-gbp:before{content:""}.fa-dollar:before,.fa-usd:before{content:""}.fa-inr:before,.fa-rupee:before{content:""}.fa-cny:before,.fa-jpy:before,.fa-rmb:before,.fa-yen:before{content:""}.fa-rouble:before,.fa-rub:before,.fa-ruble:before{content:""}.fa-krw:before,.fa-won:before{content:""}.fa-bitcoin:before,.fa-btc:before{content:""}.fa-file:before{content:""}.fa-file-text:before{content:""}.fa-sort-alpha-asc:before{content:""}.fa-sort-alpha-desc:before{content:""}.fa-sort-amount-asc:before{content:""}.fa-sort-amount-desc:before{content:""}.fa-sort-numeric-asc:before{content:""}.fa-sort-numeric-desc:before{content:""}.fa-thumbs-up:before{content:""}.fa-thumbs-down:before{content:""}.fa-youtube-square:before{content:""}.fa-youtube:before{content:""}.fa-xing:before{content:""}.fa-xing-square:before{content:""}.fa-youtube-play:before{content:""}.fa-dropbox:before{content:""}.fa-stack-overflow:before{content:""}.fa-instagram:before{content:""}.fa-flickr:before{content:""}.fa-adn:before{content:""}.fa-bitbucket:before,.icon-bitbucket:before{content:""}.fa-bitbucket-square:before{content:""}.fa-tumblr:before{content:""}.fa-tumblr-square:before{content:""}.fa-long-arrow-down:before{content:""}.fa-long-arrow-up:before{content:""}.fa-long-arrow-left:before{content:""}.fa-long-arrow-right:before{content:""}.fa-a
pple:before{content:""}.fa-windows:before{content:""}.fa-android:before{content:""}.fa-linux:before{content:""}.fa-dribbble:before{content:""}.fa-skype:before{content:""}.fa-foursquare:before{content:""}.fa-trello:before{content:""}.fa-female:before{content:""}.fa-male:before{content:""}.fa-gittip:before,.fa-gratipay:before{content:""}.fa-sun-o:before{content:""}.fa-moon-o:before{content:""}.fa-archive:before{content:""}.fa-bug:before{content:""}.fa-vk:before{content:""}.fa-weibo:before{content:""}.fa-renren:before{content:""}.fa-pagelines:before{content:""}.fa-stack-exchange:before{content:""}.fa-arrow-circle-o-right:before{content:""}.fa-arrow-circle-o-left:before{content:""}.fa-caret-square-o-left:before,.fa-toggle-left:before{content:""}.fa-dot-circle-o:before{content:""}.fa-wheelchair:before{content:""}.fa-vimeo-square:before{content:""}.fa-try:before,.fa-turkish-lira:before{content:""}.fa-plus-square-o:before,.wy-menu-vertical li button.toctree-expand:before{content:""}.fa-space-shuttle:before{content:""}.fa-slack:before{content:""}.fa-envelope-square:before{content:""}.fa-wordpress:before{content:""}.fa-openid:before{content:""}.fa-bank:before,.fa-institution:before,.fa-university:before{content:""}.fa-graduation-cap:before,.fa-mortar-board:before{content:""}.fa-yahoo:before{content:""}.fa-google:before{content:""}.fa-reddit:before{content:""}.fa-reddit-square:before{content:""}.fa-stumbleupon-circle:before{content:""}.fa-stumbleupon:before{content:""}.fa-delicious:before{content:""}.fa-digg:before{content:""}.fa-pied-piper-pp:before{content:""}.fa-pied-piper-alt:before{content:""}.fa-drupal:before{content:""}.fa-joomla:before{content:""}.fa-language:before{content:""}.fa-fax:before{content:""}.fa-building:before{content:""}.fa-child:before{content:""}.fa-paw:before{content:""}.fa-spoon:before{content:""}.fa-cube:before{content:""}.fa-cubes:before{content:""}.fa-behance:before{content:""}.fa-behance-squa
re:before{content:""}.fa-steam:before{content:""}.fa-steam-square:before{content:""}.fa-recycle:before{content:""}.fa-automobile:before,.fa-car:before{content:""}.fa-cab:before,.fa-taxi:before{content:""}.fa-tree:before{content:""}.fa-spotify:before{content:""}.fa-deviantart:before{content:""}.fa-soundcloud:before{content:""}.fa-database:before{content:""}.fa-file-pdf-o:before{content:""}.fa-file-word-o:before{content:""}.fa-file-excel-o:before{content:""}.fa-file-powerpoint-o:before{content:""}.fa-file-image-o:before,.fa-file-photo-o:before,.fa-file-picture-o:before{content:""}.fa-file-archive-o:before,.fa-file-zip-o:before{content:""}.fa-file-audio-o:before,.fa-file-sound-o:before{content:""}.fa-file-movie-o:before,.fa-file-video-o:before{content:""}.fa-file-code-o:before{content:""}.fa-vine:before{content:""}.fa-codepen:before{content:""}.fa-jsfiddle:before{content:""}.fa-life-bouy:before,.fa-life-buoy:before,.fa-life-ring:before,.fa-life-saver:before,.fa-support:before{content:""}.fa-circle-o-notch:before{content:""}.fa-ra:before,.fa-rebel:before,.fa-resistance:before{content:""}.fa-empire:before,.fa-ge:before{content:""}.fa-git-square:before{content:""}.fa-git:before{content:""}.fa-hacker-news:before,.fa-y-combinator-square:before,.fa-yc-square:before{content:""}.fa-tencent-weibo:before{content:""}.fa-qq:before{content:""}.fa-wechat:before,.fa-weixin:before{content:""}.fa-paper-plane:before,.fa-send:before{content:""}.fa-paper-plane-o:before,.fa-send-o:before{content:""}.fa-history:before{content:""}.fa-circle-thin:before{content:""}.fa-header:before{content:""}.fa-paragraph:before{content:""}.fa-sliders:before{content:""}.fa-share-alt:before{content:""}.fa-share-alt-square:before{content:""}.fa-bomb:before{content:""}.fa-futbol-o:before,.fa-soccer-ball-o:before{content:""}.fa-tty:before{content:""}.fa-binoculars:before{content:""}.fa-plug:before{content:""}.fa-slideshare:before{content:""}.fa-twitch:before{conten
t:""}.fa-yelp:before{content:""}.fa-newspaper-o:before{content:""}.fa-wifi:before{content:""}.fa-calculator:before{content:""}.fa-paypal:before{content:""}.fa-google-wallet:before{content:""}.fa-cc-visa:before{content:""}.fa-cc-mastercard:before{content:""}.fa-cc-discover:before{content:""}.fa-cc-amex:before{content:""}.fa-cc-paypal:before{content:""}.fa-cc-stripe:before{content:""}.fa-bell-slash:before{content:""}.fa-bell-slash-o:before{content:""}.fa-trash:before{content:""}.fa-copyright:before{content:""}.fa-at:before{content:""}.fa-eyedropper:before{content:""}.fa-paint-brush:before{content:""}.fa-birthday-cake:before{content:""}.fa-area-chart:before{content:""}.fa-pie-chart:before{content:""}.fa-line-chart:before{content:""}.fa-lastfm:before{content:""}.fa-lastfm-square:before{content:""}.fa-toggle-off:before{content:""}.fa-toggle-on:before{content:""}.fa-bicycle:before{content:""}.fa-bus:before{content:""}.fa-ioxhost:before{content:""}.fa-angellist:before{content:""}.fa-cc:before{content:""}.fa-ils:before,.fa-shekel:before,.fa-sheqel:before{content:""}.fa-meanpath:before{content:""}.fa-buysellads:before{content:""}.fa-connectdevelop:before{content:""}.fa-dashcube:before{content:""}.fa-forumbee:before{content:""}.fa-leanpub:before{content:""}.fa-sellsy:before{content:""}.fa-shirtsinbulk:before{content:""}.fa-simplybuilt:before{content:""}.fa-skyatlas:before{content:""}.fa-cart-plus:before{content:""}.fa-cart-arrow-down:before{content:""}.fa-diamond:before{content:""}.fa-ship:before{content:""}.fa-user-secret:before{content:""}.fa-motorcycle:before{content:""}.fa-street-view:before{content:""}.fa-heartbeat:before{content:""}.fa-venus:before{content:""}.fa-mars:before{content:""}.fa-mercury:before{content:""}.fa-intersex:before,.fa-transgender:before{content:""}.fa-transgender-alt:before{content:""}.fa-venus-double:before{content:""}.fa-mars-double:before{content:""}.fa-venus-mars:before{content:""}.fa-m
ars-stroke:before{content:""}.fa-mars-stroke-v:before{content:""}.fa-mars-stroke-h:before{content:""}.fa-neuter:before{content:""}.fa-genderless:before{content:""}.fa-facebook-official:before{content:""}.fa-pinterest-p:before{content:""}.fa-whatsapp:before{content:""}.fa-server:before{content:""}.fa-user-plus:before{content:""}.fa-user-times:before{content:""}.fa-bed:before,.fa-hotel:before{content:""}.fa-viacoin:before{content:""}.fa-train:before{content:""}.fa-subway:before{content:""}.fa-medium:before{content:""}.fa-y-combinator:before,.fa-yc:before{content:""}.fa-optin-monster:before{content:""}.fa-opencart:before{content:""}.fa-expeditedssl:before{content:""}.fa-battery-4:before,.fa-battery-full:before,.fa-battery:before{content:""}.fa-battery-3:before,.fa-battery-three-quarters:before{content:""}.fa-battery-2:before,.fa-battery-half:before{content:""}.fa-battery-1:before,.fa-battery-quarter:before{content:""}.fa-battery-0:before,.fa-battery-empty:before{content:""}.fa-mouse-pointer:before{content:""}.fa-i-cursor:before{content:""}.fa-object-group:before{content:""}.fa-object-ungroup:before{content:""}.fa-sticky-note:before{content:""}.fa-sticky-note-o:before{content:""}.fa-cc-jcb:before{content:""}.fa-cc-diners-club:before{content:""}.fa-clone:before{content:""}.fa-balance-scale:before{content:""}.fa-hourglass-o:before{content:""}.fa-hourglass-1:before,.fa-hourglass-start:before{content:""}.fa-hourglass-2:before,.fa-hourglass-half:before{content:""}.fa-hourglass-3:before,.fa-hourglass-end:before{content:""}.fa-hourglass:before{content:""}.fa-hand-grab-o:before,.fa-hand-rock-o:before{content:""}.fa-hand-paper-o:before,.fa-hand-stop-o:before{content:""}.fa-hand-scissors-o:before{content:""}.fa-hand-lizard-o:before{content:""}.fa-hand-spock-o:before{content:""}.fa-hand-pointer-o:before{content:""}.fa-hand-peace-o:before{content:""}.fa-trademark:before{content:""}.fa-registered:before{content:""}.fa-creative-commons
:before{content:""}.fa-gg:before{content:""}.fa-gg-circle:before{content:""}.fa-tripadvisor:before{content:""}.fa-odnoklassniki:before{content:""}.fa-odnoklassniki-square:before{content:""}.fa-get-pocket:before{content:""}.fa-wikipedia-w:before{content:""}.fa-safari:before{content:""}.fa-chrome:before{content:""}.fa-firefox:before{content:""}.fa-opera:before{content:""}.fa-internet-explorer:before{content:""}.fa-television:before,.fa-tv:before{content:""}.fa-contao:before{content:""}.fa-500px:before{content:""}.fa-amazon:before{content:""}.fa-calendar-plus-o:before{content:""}.fa-calendar-minus-o:before{content:""}.fa-calendar-times-o:before{content:""}.fa-calendar-check-o:before{content:""}.fa-industry:before{content:""}.fa-map-pin:before{content:""}.fa-map-signs:before{content:""}.fa-map-o:before{content:""}.fa-map:before{content:""}.fa-commenting:before{content:""}.fa-commenting-o:before{content:""}.fa-houzz:before{content:""}.fa-vimeo:before{content:""}.fa-black-tie:before{content:""}.fa-fonticons:before{content:""}.fa-reddit-alien:before{content:""}.fa-edge:before{content:""}.fa-credit-card-alt:before{content:""}.fa-codiepie:before{content:""}.fa-modx:before{content:""}.fa-fort-awesome:before{content:""}.fa-usb:before{content:""}.fa-product-hunt:before{content:""}.fa-mixcloud:before{content:""}.fa-scribd:before{content:""}.fa-pause-circle:before{content:""}.fa-pause-circle-o:before{content:""}.fa-stop-circle:before{content:""}.fa-stop-circle-o:before{content:""}.fa-shopping-bag:before{content:""}.fa-shopping-basket:before{content:""}.fa-hashtag:before{content:""}.fa-bluetooth:before{content:""}.fa-bluetooth-b:before{content:""}.fa-percent:before{content:""}.fa-gitlab:before,.icon-gitlab:before{content:""}.fa-wpbeginner:before{content:""}.fa-wpforms:before{content:""}.fa-envira:before{content:""}.fa-universal-access:before{content:""}.fa-wheelchair-alt:before{content:""}.fa-question-circle-o:before{conten
t:""}.fa-blind:before{content:""}.fa-audio-description:before{content:""}.fa-volume-control-phone:before{content:""}.fa-braille:before{content:""}.fa-assistive-listening-systems:before{content:""}.fa-american-sign-language-interpreting:before,.fa-asl-interpreting:before{content:""}.fa-deaf:before,.fa-deafness:before,.fa-hard-of-hearing:before{content:""}.fa-glide:before{content:""}.fa-glide-g:before{content:""}.fa-sign-language:before,.fa-signing:before{content:""}.fa-low-vision:before{content:""}.fa-viadeo:before{content:""}.fa-viadeo-square:before{content:""}.fa-snapchat:before{content:""}.fa-snapchat-ghost:before{content:""}.fa-snapchat-square:before{content:""}.fa-pied-piper:before{content:""}.fa-first-order:before{content:""}.fa-yoast:before{content:""}.fa-themeisle:before{content:""}.fa-google-plus-circle:before,.fa-google-plus-official:before{content:""}.fa-fa:before,.fa-font-awesome:before{content:""}.fa-handshake-o:before{content:""}.fa-envelope-open:before{content:""}.fa-envelope-open-o:before{content:""}.fa-linode:before{content:""}.fa-address-book:before{content:""}.fa-address-book-o:before{content:""}.fa-address-card:before,.fa-vcard:before{content:""}.fa-address-card-o:before,.fa-vcard-o:before{content:""}.fa-user-circle:before{content:""}.fa-user-circle-o:before{content:""}.fa-user-o:before{content:""}.fa-id-badge:before{content:""}.fa-drivers-license:before,.fa-id-card:before{content:""}.fa-drivers-license-o:before,.fa-id-card-o:before{content:""}.fa-quora:before{content:""}.fa-free-code-camp:before{content:""}.fa-telegram:before{content:""}.fa-thermometer-4:before,.fa-thermometer-full:before,.fa-thermometer:before{content:""}.fa-thermometer-3:before,.fa-thermometer-three-quarters:before{content:""}.fa-thermometer-2:before,.fa-thermometer-half:before{content:""}.fa-thermometer-1:before,.fa-thermometer-quarter:before{content:""}.fa-thermometer-0:before,.fa-thermometer-empty:before{content:""}.fa-shower:befo
re{content:""}.fa-bath:before,.fa-bathtub:before,.fa-s15:before{content:""}.fa-podcast:before{content:""}.fa-window-maximize:before{content:""}.fa-window-minimize:before{content:""}.fa-window-restore:before{content:""}.fa-times-rectangle:before,.fa-window-close:before{content:""}.fa-times-rectangle-o:before,.fa-window-close-o:before{content:""}.fa-bandcamp:before{content:""}.fa-grav:before{content:""}.fa-etsy:before{content:""}.fa-imdb:before{content:""}.fa-ravelry:before{content:""}.fa-eercast:before{content:""}.fa-microchip:before{content:""}.fa-snowflake-o:before{content:""}.fa-superpowers:before{content:""}.fa-wpexplorer:before{content:""}.fa-meetup:before{content:""}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0,0,0,0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto}.fa,.icon,.rst-content .admonition-title,.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content code.download span:first-child,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 .headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink,.rst-content tt.download span:first-child,.wy-dropdown .caret,.wy-inline-validate.wy-inline-validate-danger .wy-input-context,.wy-inline-validate.wy-inline-validate-info .wy-input-context,.wy-inline-validate.wy-inline-validate-success .wy-input-context,.wy-inline-validate.wy-inline-validate-warning .wy-input-context,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li button.toctree-expand{font-family:inherit}.fa:before,.icon:before,.rst-content .admonition-title:before,.rst-content .code-block-caption 
.headerlink:before,.rst-content .eqno .headerlink:before,.rst-content code.download span:first-child:before,.rst-content dl dt .headerlink:before,.rst-content h1 .headerlink:before,.rst-content h2 .headerlink:before,.rst-content h3 .headerlink:before,.rst-content h4 .headerlink:before,.rst-content h5 .headerlink:before,.rst-content h6 .headerlink:before,.rst-content p.caption .headerlink:before,.rst-content p .headerlink:before,.rst-content table>caption .headerlink:before,.rst-content tt.download span:first-child:before,.wy-dropdown .caret:before,.wy-inline-validate.wy-inline-validate-danger .wy-input-context:before,.wy-inline-validate.wy-inline-validate-info .wy-input-context:before,.wy-inline-validate.wy-inline-validate-success .wy-input-context:before,.wy-inline-validate.wy-inline-validate-warning .wy-input-context:before,.wy-menu-vertical li.current>a button.toctree-expand:before,.wy-menu-vertical li.on a button.toctree-expand:before,.wy-menu-vertical li button.toctree-expand:before{font-family:FontAwesome;display:inline-block;font-style:normal;font-weight:400;line-height:1;text-decoration:inherit}.rst-content .code-block-caption a .headerlink,.rst-content .eqno a .headerlink,.rst-content a .admonition-title,.rst-content code.download a span:first-child,.rst-content dl dt a .headerlink,.rst-content h1 a .headerlink,.rst-content h2 a .headerlink,.rst-content h3 a .headerlink,.rst-content h4 a .headerlink,.rst-content h5 a .headerlink,.rst-content h6 a .headerlink,.rst-content p.caption a .headerlink,.rst-content p a .headerlink,.rst-content table>caption a .headerlink,.rst-content tt.download a span:first-child,.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a button.toctree-expand,.wy-menu-vertical li a button.toctree-expand,a .fa,a .icon,a .rst-content .admonition-title,a .rst-content .code-block-caption .headerlink,a .rst-content .eqno .headerlink,a .rst-content code.download span:first-child,a .rst-content dl dt .headerlink,a 
.rst-content h1 .headerlink,a .rst-content h2 .headerlink,a .rst-content h3 .headerlink,a .rst-content h4 .headerlink,a .rst-content h5 .headerlink,a .rst-content h6 .headerlink,a .rst-content p.caption .headerlink,a .rst-content p .headerlink,a .rst-content table>caption .headerlink,a .rst-content tt.download span:first-child,a .wy-menu-vertical li button.toctree-expand{display:inline-block;text-decoration:inherit}.btn .fa,.btn .icon,.btn .rst-content .admonition-title,.btn .rst-content .code-block-caption .headerlink,.btn .rst-content .eqno .headerlink,.btn .rst-content code.download span:first-child,.btn .rst-content dl dt .headerlink,.btn .rst-content h1 .headerlink,.btn .rst-content h2 .headerlink,.btn .rst-content h3 .headerlink,.btn .rst-content h4 .headerlink,.btn .rst-content h5 .headerlink,.btn .rst-content h6 .headerlink,.btn .rst-content p .headerlink,.btn .rst-content table>caption .headerlink,.btn .rst-content tt.download span:first-child,.btn .wy-menu-vertical li.current>a button.toctree-expand,.btn .wy-menu-vertical li.on a button.toctree-expand,.btn .wy-menu-vertical li button.toctree-expand,.nav .fa,.nav .icon,.nav .rst-content .admonition-title,.nav .rst-content .code-block-caption .headerlink,.nav .rst-content .eqno .headerlink,.nav .rst-content code.download span:first-child,.nav .rst-content dl dt .headerlink,.nav .rst-content h1 .headerlink,.nav .rst-content h2 .headerlink,.nav .rst-content h3 .headerlink,.nav .rst-content h4 .headerlink,.nav .rst-content h5 .headerlink,.nav .rst-content h6 .headerlink,.nav .rst-content p .headerlink,.nav .rst-content table>caption .headerlink,.nav .rst-content tt.download span:first-child,.nav .wy-menu-vertical li.current>a button.toctree-expand,.nav .wy-menu-vertical li.on a button.toctree-expand,.nav .wy-menu-vertical li button.toctree-expand,.rst-content .btn .admonition-title,.rst-content .code-block-caption .btn .headerlink,.rst-content .code-block-caption .nav .headerlink,.rst-content .eqno .btn 
.headerlink,.rst-content .eqno .nav .headerlink,.rst-content .nav .admonition-title,.rst-content code.download .btn span:first-child,.rst-content code.download .nav span:first-child,.rst-content dl dt .btn .headerlink,.rst-content dl dt .nav .headerlink,.rst-content h1 .btn .headerlink,.rst-content h1 .nav .headerlink,.rst-content h2 .btn .headerlink,.rst-content h2 .nav .headerlink,.rst-content h3 .btn .headerlink,.rst-content h3 .nav .headerlink,.rst-content h4 .btn .headerlink,.rst-content h4 .nav .headerlink,.rst-content h5 .btn .headerlink,.rst-content h5 .nav .headerlink,.rst-content h6 .btn .headerlink,.rst-content h6 .nav .headerlink,.rst-content p .btn .headerlink,.rst-content p .nav .headerlink,.rst-content table>caption .btn .headerlink,.rst-content table>caption .nav .headerlink,.rst-content tt.download .btn span:first-child,.rst-content tt.download .nav span:first-child,.wy-menu-vertical li .btn button.toctree-expand,.wy-menu-vertical li.current>a .btn button.toctree-expand,.wy-menu-vertical li.current>a .nav button.toctree-expand,.wy-menu-vertical li .nav button.toctree-expand,.wy-menu-vertical li.on a .btn button.toctree-expand,.wy-menu-vertical li.on a .nav button.toctree-expand{display:inline}.btn .fa-large.icon,.btn .fa.fa-large,.btn .rst-content .code-block-caption .fa-large.headerlink,.btn .rst-content .eqno .fa-large.headerlink,.btn .rst-content .fa-large.admonition-title,.btn .rst-content code.download span.fa-large:first-child,.btn .rst-content dl dt .fa-large.headerlink,.btn .rst-content h1 .fa-large.headerlink,.btn .rst-content h2 .fa-large.headerlink,.btn .rst-content h3 .fa-large.headerlink,.btn .rst-content h4 .fa-large.headerlink,.btn .rst-content h5 .fa-large.headerlink,.btn .rst-content h6 .fa-large.headerlink,.btn .rst-content p .fa-large.headerlink,.btn .rst-content table>caption .fa-large.headerlink,.btn .rst-content tt.download span.fa-large:first-child,.btn .wy-menu-vertical li button.fa-large.toctree-expand,.nav 
.fa-large.icon,.nav .fa.fa-large,.nav .rst-content .code-block-caption .fa-large.headerlink,.nav .rst-content .eqno .fa-large.headerlink,.nav .rst-content .fa-large.admonition-title,.nav .rst-content code.download span.fa-large:first-child,.nav .rst-content dl dt .fa-large.headerlink,.nav .rst-content h1 .fa-large.headerlink,.nav .rst-content h2 .fa-large.headerlink,.nav .rst-content h3 .fa-large.headerlink,.nav .rst-content h4 .fa-large.headerlink,.nav .rst-content h5 .fa-large.headerlink,.nav .rst-content h6 .fa-large.headerlink,.nav .rst-content p .fa-large.headerlink,.nav .rst-content table>caption .fa-large.headerlink,.nav .rst-content tt.download span.fa-large:first-child,.nav .wy-menu-vertical li button.fa-large.toctree-expand,.rst-content .btn .fa-large.admonition-title,.rst-content .code-block-caption .btn .fa-large.headerlink,.rst-content .code-block-caption .nav .fa-large.headerlink,.rst-content .eqno .btn .fa-large.headerlink,.rst-content .eqno .nav .fa-large.headerlink,.rst-content .nav .fa-large.admonition-title,.rst-content code.download .btn span.fa-large:first-child,.rst-content code.download .nav span.fa-large:first-child,.rst-content dl dt .btn .fa-large.headerlink,.rst-content dl dt .nav .fa-large.headerlink,.rst-content h1 .btn .fa-large.headerlink,.rst-content h1 .nav .fa-large.headerlink,.rst-content h2 .btn .fa-large.headerlink,.rst-content h2 .nav .fa-large.headerlink,.rst-content h3 .btn .fa-large.headerlink,.rst-content h3 .nav .fa-large.headerlink,.rst-content h4 .btn .fa-large.headerlink,.rst-content h4 .nav .fa-large.headerlink,.rst-content h5 .btn .fa-large.headerlink,.rst-content h5 .nav .fa-large.headerlink,.rst-content h6 .btn .fa-large.headerlink,.rst-content h6 .nav .fa-large.headerlink,.rst-content p .btn .fa-large.headerlink,.rst-content p .nav .fa-large.headerlink,.rst-content table>caption .btn .fa-large.headerlink,.rst-content table>caption .nav .fa-large.headerlink,.rst-content tt.download .btn 
span.fa-large:first-child,.rst-content tt.download .nav span.fa-large:first-child,.wy-menu-vertical li .btn button.fa-large.toctree-expand,.wy-menu-vertical li .nav button.fa-large.toctree-expand{line-height:.9em}.btn .fa-spin.icon,.btn .fa.fa-spin,.btn .rst-content .code-block-caption .fa-spin.headerlink,.btn .rst-content .eqno .fa-spin.headerlink,.btn .rst-content .fa-spin.admonition-title,.btn .rst-content code.download span.fa-spin:first-child,.btn .rst-content dl dt .fa-spin.headerlink,.btn .rst-content h1 .fa-spin.headerlink,.btn .rst-content h2 .fa-spin.headerlink,.btn .rst-content h3 .fa-spin.headerlink,.btn .rst-content h4 .fa-spin.headerlink,.btn .rst-content h5 .fa-spin.headerlink,.btn .rst-content h6 .fa-spin.headerlink,.btn .rst-content p .fa-spin.headerlink,.btn .rst-content table>caption .fa-spin.headerlink,.btn .rst-content tt.download span.fa-spin:first-child,.btn .wy-menu-vertical li button.fa-spin.toctree-expand,.nav .fa-spin.icon,.nav .fa.fa-spin,.nav .rst-content .code-block-caption .fa-spin.headerlink,.nav .rst-content .eqno .fa-spin.headerlink,.nav .rst-content .fa-spin.admonition-title,.nav .rst-content code.download span.fa-spin:first-child,.nav .rst-content dl dt .fa-spin.headerlink,.nav .rst-content h1 .fa-spin.headerlink,.nav .rst-content h2 .fa-spin.headerlink,.nav .rst-content h3 .fa-spin.headerlink,.nav .rst-content h4 .fa-spin.headerlink,.nav .rst-content h5 .fa-spin.headerlink,.nav .rst-content h6 .fa-spin.headerlink,.nav .rst-content p .fa-spin.headerlink,.nav .rst-content table>caption .fa-spin.headerlink,.nav .rst-content tt.download span.fa-spin:first-child,.nav .wy-menu-vertical li button.fa-spin.toctree-expand,.rst-content .btn .fa-spin.admonition-title,.rst-content .code-block-caption .btn .fa-spin.headerlink,.rst-content .code-block-caption .nav .fa-spin.headerlink,.rst-content .eqno .btn .fa-spin.headerlink,.rst-content .eqno .nav .fa-spin.headerlink,.rst-content .nav .fa-spin.admonition-title,.rst-content code.download 
.btn span.fa-spin:first-child,.rst-content code.download .nav span.fa-spin:first-child,.rst-content dl dt .btn .fa-spin.headerlink,.rst-content dl dt .nav .fa-spin.headerlink,.rst-content h1 .btn .fa-spin.headerlink,.rst-content h1 .nav .fa-spin.headerlink,.rst-content h2 .btn .fa-spin.headerlink,.rst-content h2 .nav .fa-spin.headerlink,.rst-content h3 .btn .fa-spin.headerlink,.rst-content h3 .nav .fa-spin.headerlink,.rst-content h4 .btn .fa-spin.headerlink,.rst-content h4 .nav .fa-spin.headerlink,.rst-content h5 .btn .fa-spin.headerlink,.rst-content h5 .nav .fa-spin.headerlink,.rst-content h6 .btn .fa-spin.headerlink,.rst-content h6 .nav .fa-spin.headerlink,.rst-content p .btn .fa-spin.headerlink,.rst-content p .nav .fa-spin.headerlink,.rst-content table>caption .btn .fa-spin.headerlink,.rst-content table>caption .nav .fa-spin.headerlink,.rst-content tt.download .btn span.fa-spin:first-child,.rst-content tt.download .nav span.fa-spin:first-child,.wy-menu-vertical li .btn button.fa-spin.toctree-expand,.wy-menu-vertical li .nav button.fa-spin.toctree-expand{display:inline-block}.btn.fa:before,.btn.icon:before,.rst-content .btn.admonition-title:before,.rst-content .code-block-caption .btn.headerlink:before,.rst-content .eqno .btn.headerlink:before,.rst-content code.download span.btn:first-child:before,.rst-content dl dt .btn.headerlink:before,.rst-content h1 .btn.headerlink:before,.rst-content h2 .btn.headerlink:before,.rst-content h3 .btn.headerlink:before,.rst-content h4 .btn.headerlink:before,.rst-content h5 .btn.headerlink:before,.rst-content h6 .btn.headerlink:before,.rst-content p .btn.headerlink:before,.rst-content table>caption .btn.headerlink:before,.rst-content tt.download span.btn:first-child:before,.wy-menu-vertical li button.btn.toctree-expand:before{opacity:.5;-webkit-transition:opacity .05s ease-in;-moz-transition:opacity .05s ease-in;transition:opacity .05s ease-in}.btn.fa:hover:before,.btn.icon:hover:before,.rst-content 
.btn.admonition-title:hover:before,.rst-content .code-block-caption .btn.headerlink:hover:before,.rst-content .eqno .btn.headerlink:hover:before,.rst-content code.download span.btn:first-child:hover:before,.rst-content dl dt .btn.headerlink:hover:before,.rst-content h1 .btn.headerlink:hover:before,.rst-content h2 .btn.headerlink:hover:before,.rst-content h3 .btn.headerlink:hover:before,.rst-content h4 .btn.headerlink:hover:before,.rst-content h5 .btn.headerlink:hover:before,.rst-content h6 .btn.headerlink:hover:before,.rst-content p .btn.headerlink:hover:before,.rst-content table>caption .btn.headerlink:hover:before,.rst-content tt.download span.btn:first-child:hover:before,.wy-menu-vertical li button.btn.toctree-expand:hover:before{opacity:1}.btn-mini .fa:before,.btn-mini .icon:before,.btn-mini .rst-content .admonition-title:before,.btn-mini .rst-content .code-block-caption .headerlink:before,.btn-mini .rst-content .eqno .headerlink:before,.btn-mini .rst-content code.download span:first-child:before,.btn-mini .rst-content dl dt .headerlink:before,.btn-mini .rst-content h1 .headerlink:before,.btn-mini .rst-content h2 .headerlink:before,.btn-mini .rst-content h3 .headerlink:before,.btn-mini .rst-content h4 .headerlink:before,.btn-mini .rst-content h5 .headerlink:before,.btn-mini .rst-content h6 .headerlink:before,.btn-mini .rst-content p .headerlink:before,.btn-mini .rst-content table>caption .headerlink:before,.btn-mini .rst-content tt.download span:first-child:before,.btn-mini .wy-menu-vertical li button.toctree-expand:before,.rst-content .btn-mini .admonition-title:before,.rst-content .code-block-caption .btn-mini .headerlink:before,.rst-content .eqno .btn-mini .headerlink:before,.rst-content code.download .btn-mini span:first-child:before,.rst-content dl dt .btn-mini .headerlink:before,.rst-content h1 .btn-mini .headerlink:before,.rst-content h2 .btn-mini .headerlink:before,.rst-content h3 .btn-mini .headerlink:before,.rst-content h4 .btn-mini 
.headerlink:before,.rst-content h5 .btn-mini .headerlink:before,.rst-content h6 .btn-mini .headerlink:before,.rst-content p .btn-mini .headerlink:before,.rst-content table>caption .btn-mini .headerlink:before,.rst-content tt.download .btn-mini span:first-child:before,.wy-menu-vertical li .btn-mini button.toctree-expand:before{font-size:14px;vertical-align:-15%}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning,.wy-alert{padding:12px;line-height:24px;margin-bottom:24px;background:#e7f2fa}.rst-content .admonition-title,.wy-alert-title{font-weight:700;display:block;color:#fff;background:#6ab0de;padding:6px 12px;margin:-12px -12px 12px}.rst-content .danger,.rst-content .error,.rst-content .wy-alert-danger.admonition,.rst-content .wy-alert-danger.admonition-todo,.rst-content .wy-alert-danger.attention,.rst-content .wy-alert-danger.caution,.rst-content .wy-alert-danger.hint,.rst-content .wy-alert-danger.important,.rst-content .wy-alert-danger.note,.rst-content .wy-alert-danger.seealso,.rst-content .wy-alert-danger.tip,.rst-content .wy-alert-danger.warning,.wy-alert.wy-alert-danger{background:#fdf3f2}.rst-content .danger .admonition-title,.rst-content .danger .wy-alert-title,.rst-content .error .admonition-title,.rst-content .error .wy-alert-title,.rst-content .wy-alert-danger.admonition-todo .admonition-title,.rst-content .wy-alert-danger.admonition-todo .wy-alert-title,.rst-content .wy-alert-danger.admonition .admonition-title,.rst-content .wy-alert-danger.admonition .wy-alert-title,.rst-content .wy-alert-danger.attention .admonition-title,.rst-content .wy-alert-danger.attention .wy-alert-title,.rst-content .wy-alert-danger.caution .admonition-title,.rst-content .wy-alert-danger.caution .wy-alert-title,.rst-content .wy-alert-danger.hint 
.admonition-title,.rst-content .wy-alert-danger.hint .wy-alert-title,.rst-content .wy-alert-danger.important .admonition-title,.rst-content .wy-alert-danger.important .wy-alert-title,.rst-content .wy-alert-danger.note .admonition-title,.rst-content .wy-alert-danger.note .wy-alert-title,.rst-content .wy-alert-danger.seealso .admonition-title,.rst-content .wy-alert-danger.seealso .wy-alert-title,.rst-content .wy-alert-danger.tip .admonition-title,.rst-content .wy-alert-danger.tip .wy-alert-title,.rst-content .wy-alert-danger.warning .admonition-title,.rst-content .wy-alert-danger.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-danger .admonition-title,.wy-alert.wy-alert-danger .rst-content .admonition-title,.wy-alert.wy-alert-danger .wy-alert-title{background:#f29f97}.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .warning,.rst-content .wy-alert-warning.admonition,.rst-content .wy-alert-warning.danger,.rst-content .wy-alert-warning.error,.rst-content .wy-alert-warning.hint,.rst-content .wy-alert-warning.important,.rst-content .wy-alert-warning.note,.rst-content .wy-alert-warning.seealso,.rst-content .wy-alert-warning.tip,.wy-alert.wy-alert-warning{background:#ffedcc}.rst-content .admonition-todo .admonition-title,.rst-content .admonition-todo .wy-alert-title,.rst-content .attention .admonition-title,.rst-content .attention .wy-alert-title,.rst-content .caution .admonition-title,.rst-content .caution .wy-alert-title,.rst-content .warning .admonition-title,.rst-content .warning .wy-alert-title,.rst-content .wy-alert-warning.admonition .admonition-title,.rst-content .wy-alert-warning.admonition .wy-alert-title,.rst-content .wy-alert-warning.danger .admonition-title,.rst-content .wy-alert-warning.danger .wy-alert-title,.rst-content .wy-alert-warning.error .admonition-title,.rst-content .wy-alert-warning.error .wy-alert-title,.rst-content .wy-alert-warning.hint .admonition-title,.rst-content .wy-alert-warning.hint 
.wy-alert-title,.rst-content .wy-alert-warning.important .admonition-title,.rst-content .wy-alert-warning.important .wy-alert-title,.rst-content .wy-alert-warning.note .admonition-title,.rst-content .wy-alert-warning.note .wy-alert-title,.rst-content .wy-alert-warning.seealso .admonition-title,.rst-content .wy-alert-warning.seealso .wy-alert-title,.rst-content .wy-alert-warning.tip .admonition-title,.rst-content .wy-alert-warning.tip .wy-alert-title,.rst-content .wy-alert.wy-alert-warning .admonition-title,.wy-alert.wy-alert-warning .rst-content .admonition-title,.wy-alert.wy-alert-warning .wy-alert-title{background:#f0b37e}.rst-content .note,.rst-content .seealso,.rst-content .wy-alert-info.admonition,.rst-content .wy-alert-info.admonition-todo,.rst-content .wy-alert-info.attention,.rst-content .wy-alert-info.caution,.rst-content .wy-alert-info.danger,.rst-content .wy-alert-info.error,.rst-content .wy-alert-info.hint,.rst-content .wy-alert-info.important,.rst-content .wy-alert-info.tip,.rst-content .wy-alert-info.warning,.wy-alert.wy-alert-info{background:#e7f2fa}.rst-content .note .admonition-title,.rst-content .note .wy-alert-title,.rst-content .seealso .admonition-title,.rst-content .seealso .wy-alert-title,.rst-content .wy-alert-info.admonition-todo .admonition-title,.rst-content .wy-alert-info.admonition-todo .wy-alert-title,.rst-content .wy-alert-info.admonition .admonition-title,.rst-content .wy-alert-info.admonition .wy-alert-title,.rst-content .wy-alert-info.attention .admonition-title,.rst-content .wy-alert-info.attention .wy-alert-title,.rst-content .wy-alert-info.caution .admonition-title,.rst-content .wy-alert-info.caution .wy-alert-title,.rst-content .wy-alert-info.danger .admonition-title,.rst-content .wy-alert-info.danger .wy-alert-title,.rst-content .wy-alert-info.error .admonition-title,.rst-content .wy-alert-info.error .wy-alert-title,.rst-content .wy-alert-info.hint .admonition-title,.rst-content .wy-alert-info.hint .wy-alert-title,.rst-content 
.wy-alert-info.important .admonition-title,.rst-content .wy-alert-info.important .wy-alert-title,.rst-content .wy-alert-info.tip .admonition-title,.rst-content .wy-alert-info.tip .wy-alert-title,.rst-content .wy-alert-info.warning .admonition-title,.rst-content .wy-alert-info.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-info .admonition-title,.wy-alert.wy-alert-info .rst-content .admonition-title,.wy-alert.wy-alert-info .wy-alert-title{background:#6ab0de}.rst-content .hint,.rst-content .important,.rst-content .tip,.rst-content .wy-alert-success.admonition,.rst-content .wy-alert-success.admonition-todo,.rst-content .wy-alert-success.attention,.rst-content .wy-alert-success.caution,.rst-content .wy-alert-success.danger,.rst-content .wy-alert-success.error,.rst-content .wy-alert-success.note,.rst-content .wy-alert-success.seealso,.rst-content .wy-alert-success.warning,.wy-alert.wy-alert-success{background:#dbfaf4}.rst-content .hint .admonition-title,.rst-content .hint .wy-alert-title,.rst-content .important .admonition-title,.rst-content .important .wy-alert-title,.rst-content .tip .admonition-title,.rst-content .tip .wy-alert-title,.rst-content .wy-alert-success.admonition-todo .admonition-title,.rst-content .wy-alert-success.admonition-todo .wy-alert-title,.rst-content .wy-alert-success.admonition .admonition-title,.rst-content .wy-alert-success.admonition .wy-alert-title,.rst-content .wy-alert-success.attention .admonition-title,.rst-content .wy-alert-success.attention .wy-alert-title,.rst-content .wy-alert-success.caution .admonition-title,.rst-content .wy-alert-success.caution .wy-alert-title,.rst-content .wy-alert-success.danger .admonition-title,.rst-content .wy-alert-success.danger .wy-alert-title,.rst-content .wy-alert-success.error .admonition-title,.rst-content .wy-alert-success.error .wy-alert-title,.rst-content .wy-alert-success.note .admonition-title,.rst-content .wy-alert-success.note .wy-alert-title,.rst-content .wy-alert-success.seealso 
.admonition-title,.rst-content .wy-alert-success.seealso .wy-alert-title,.rst-content .wy-alert-success.warning .admonition-title,.rst-content .wy-alert-success.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-success .admonition-title,.wy-alert.wy-alert-success .rst-content .admonition-title,.wy-alert.wy-alert-success .wy-alert-title{background:#1abc9c}.rst-content .wy-alert-neutral.admonition,.rst-content .wy-alert-neutral.admonition-todo,.rst-content .wy-alert-neutral.attention,.rst-content .wy-alert-neutral.caution,.rst-content .wy-alert-neutral.danger,.rst-content .wy-alert-neutral.error,.rst-content .wy-alert-neutral.hint,.rst-content .wy-alert-neutral.important,.rst-content .wy-alert-neutral.note,.rst-content .wy-alert-neutral.seealso,.rst-content .wy-alert-neutral.tip,.rst-content .wy-alert-neutral.warning,.wy-alert.wy-alert-neutral{background:#f3f6f6}.rst-content .wy-alert-neutral.admonition-todo .admonition-title,.rst-content .wy-alert-neutral.admonition-todo .wy-alert-title,.rst-content .wy-alert-neutral.admonition .admonition-title,.rst-content .wy-alert-neutral.admonition .wy-alert-title,.rst-content .wy-alert-neutral.attention .admonition-title,.rst-content .wy-alert-neutral.attention .wy-alert-title,.rst-content .wy-alert-neutral.caution .admonition-title,.rst-content .wy-alert-neutral.caution .wy-alert-title,.rst-content .wy-alert-neutral.danger .admonition-title,.rst-content .wy-alert-neutral.danger .wy-alert-title,.rst-content .wy-alert-neutral.error .admonition-title,.rst-content .wy-alert-neutral.error .wy-alert-title,.rst-content .wy-alert-neutral.hint .admonition-title,.rst-content .wy-alert-neutral.hint .wy-alert-title,.rst-content .wy-alert-neutral.important .admonition-title,.rst-content .wy-alert-neutral.important .wy-alert-title,.rst-content .wy-alert-neutral.note .admonition-title,.rst-content .wy-alert-neutral.note .wy-alert-title,.rst-content .wy-alert-neutral.seealso .admonition-title,.rst-content .wy-alert-neutral.seealso 
.wy-alert-title,.rst-content .wy-alert-neutral.tip .admonition-title,.rst-content .wy-alert-neutral.tip .wy-alert-title,.rst-content .wy-alert-neutral.warning .admonition-title,.rst-content .wy-alert-neutral.warning .wy-alert-title,.rst-content .wy-alert.wy-alert-neutral .admonition-title,.wy-alert.wy-alert-neutral .rst-content .admonition-title,.wy-alert.wy-alert-neutral .wy-alert-title{color:#404040;background:#e1e4e5}.rst-content .wy-alert-neutral.admonition-todo a,.rst-content .wy-alert-neutral.admonition a,.rst-content .wy-alert-neutral.attention a,.rst-content .wy-alert-neutral.caution a,.rst-content .wy-alert-neutral.danger a,.rst-content .wy-alert-neutral.error a,.rst-content .wy-alert-neutral.hint a,.rst-content .wy-alert-neutral.important a,.rst-content .wy-alert-neutral.note a,.rst-content .wy-alert-neutral.seealso a,.rst-content .wy-alert-neutral.tip a,.rst-content .wy-alert-neutral.warning a,.wy-alert.wy-alert-neutral a{color:#2980b9}.rst-content .admonition-todo p:last-child,.rst-content .admonition p:last-child,.rst-content .attention p:last-child,.rst-content .caution p:last-child,.rst-content .danger p:last-child,.rst-content .error p:last-child,.rst-content .hint p:last-child,.rst-content .important p:last-child,.rst-content .note p:last-child,.rst-content .seealso p:last-child,.rst-content .tip p:last-child,.rst-content .warning p:last-child,.wy-alert p:last-child{margin-bottom:0}.wy-tray-container{position:fixed;bottom:0;left:0;z-index:600}.wy-tray-container li{display:block;width:300px;background:transparent;color:#fff;text-align:center;box-shadow:0 5px 5px 0 rgba(0,0,0,.1);padding:0 24px;min-width:20%;opacity:0;height:0;line-height:56px;overflow:hidden;-webkit-transition:all .3s ease-in;-moz-transition:all .3s ease-in;transition:all .3s ease-in}.wy-tray-container li.wy-tray-item-success{background:#27ae60}.wy-tray-container li.wy-tray-item-info{background:#2980b9}.wy-tray-container li.wy-tray-item-warning{background:#e67e22}.wy-tray-container 
li.wy-tray-item-danger{background:#e74c3c}.wy-tray-container li.on{opacity:1;height:56px}@media screen and (max-width:768px){.wy-tray-container{bottom:auto;top:0;width:100%}.wy-tray-container li{width:100%}}button{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle;cursor:pointer;line-height:normal;-webkit-appearance:button;*overflow:visible}button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0}button[disabled]{cursor:default}.btn{display:inline-block;border-radius:2px;line-height:normal;white-space:nowrap;text-align:center;cursor:pointer;font-size:100%;padding:6px 12px 8px;color:#fff;border:1px solid rgba(0,0,0,.1);background-color:#27ae60;text-decoration:none;font-weight:400;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 2px -1px hsla(0,0%,100%,.5),inset 0 -2px 0 0 rgba(0,0,0,.1);outline-none:false;vertical-align:middle;*display:inline;zoom:1;-webkit-user-drag:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none;-webkit-transition:all .1s linear;-moz-transition:all .1s linear;transition:all .1s linear}.btn-hover{background:#2e8ece;color:#fff}.btn:hover{background:#2cc36b;color:#fff}.btn:focus{background:#2cc36b;outline:0}.btn:active{box-shadow:inset 0 -1px 0 0 rgba(0,0,0,.05),inset 0 2px 0 0 rgba(0,0,0,.1);padding:8px 12px 6px}.btn:visited{color:#fff}.btn-disabled,.btn-disabled:active,.btn-disabled:focus,.btn-disabled:hover,.btn:disabled{background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled = 
false);filter:alpha(opacity=40);opacity:.4;cursor:not-allowed;box-shadow:none}.btn::-moz-focus-inner{padding:0;border:0}.btn-small{font-size:80%}.btn-info{background-color:#2980b9!important}.btn-info:hover{background-color:#2e8ece!important}.btn-neutral{background-color:#f3f6f6!important;color:#404040!important}.btn-neutral:hover{background-color:#e5ebeb!important;color:#404040}.btn-neutral:visited{color:#404040!important}.btn-success{background-color:#27ae60!important}.btn-success:hover{background-color:#295!important}.btn-danger{background-color:#e74c3c!important}.btn-danger:hover{background-color:#ea6153!important}.btn-warning{background-color:#e67e22!important}.btn-warning:hover{background-color:#e98b39!important}.btn-invert{background-color:#222}.btn-invert:hover{background-color:#2f2f2f!important}.btn-link{background-color:transparent!important;color:#2980b9;box-shadow:none;border-color:transparent!important}.btn-link:active,.btn-link:hover{background-color:transparent!important;color:#409ad5!important;box-shadow:none}.btn-link:visited{color:#9b59b6}.wy-btn-group .btn,.wy-control .btn{vertical-align:middle}.wy-btn-group{margin-bottom:24px;*zoom:1}.wy-btn-group:after,.wy-btn-group:before{display:table;content:""}.wy-btn-group:after{clear:both}.wy-dropdown{position:relative;display:inline-block}.wy-dropdown-active .wy-dropdown-menu{display:block}.wy-dropdown-menu{position:absolute;left:0;display:none;float:left;top:100%;min-width:100%;background:#fcfcfc;z-index:100;border:1px solid #cfd7dd;box-shadow:0 2px 2px 0 rgba(0,0,0,.1);padding:12px}.wy-dropdown-menu>dd>a{display:block;clear:both;color:#404040;white-space:nowrap;font-size:90%;padding:0 12px;cursor:pointer}.wy-dropdown-menu>dd>a:hover{background:#2980b9;color:#fff}.wy-dropdown-menu>dd.divider{border-top:1px solid #cfd7dd;margin:6px 0}.wy-dropdown-menu>dd.search{padding-bottom:12px}.wy-dropdown-menu>dd.search 
input[type=search]{width:100%}.wy-dropdown-menu>dd.call-to-action{background:#e3e3e3;text-transform:uppercase;font-weight:500;font-size:80%}.wy-dropdown-menu>dd.call-to-action:hover{background:#e3e3e3}.wy-dropdown-menu>dd.call-to-action .btn{color:#fff}.wy-dropdown.wy-dropdown-up .wy-dropdown-menu{bottom:100%;top:auto;left:auto;right:0}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu{background:#fcfcfc;margin-top:2px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a{padding:6px 12px}.wy-dropdown.wy-dropdown-bubble .wy-dropdown-menu a:hover{background:#2980b9;color:#fff}.wy-dropdown.wy-dropdown-left .wy-dropdown-menu{right:0;left:auto;text-align:right}.wy-dropdown-arrow:before{content:" ";border-bottom:5px solid #f5f5f5;border-left:5px solid transparent;border-right:5px solid transparent;position:absolute;display:block;top:-4px;left:50%;margin-left:-3px}.wy-dropdown-arrow.wy-dropdown-arrow-left:before{left:11px}.wy-form-stacked select{display:block}.wy-form-aligned .wy-help-inline,.wy-form-aligned input,.wy-form-aligned label,.wy-form-aligned select,.wy-form-aligned textarea{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-form-aligned .wy-control-group>label{display:inline-block;vertical-align:middle;width:10em;margin:6px 12px 0 0;float:left}.wy-form-aligned .wy-control{float:left}.wy-form-aligned .wy-control label{display:block}.wy-form-aligned .wy-control select{margin-top:6px}fieldset{margin:0}fieldset,legend{border:0;padding:0}legend{width:100%;white-space:normal;margin-bottom:24px;font-size:150%;*margin-left:-7px}label,legend{display:block}label{margin:0 0 
.3125em;color:#333;font-size:90%}input,select,textarea{font-size:100%;margin:0;vertical-align:baseline;*vertical-align:middle}.wy-control-group{margin-bottom:24px;max-width:1200px;margin-left:auto;margin-right:auto;*zoom:1}.wy-control-group:after,.wy-control-group:before{display:table;content:""}.wy-control-group:after{clear:both}.wy-control-group.wy-control-group-required>label:after{content:" *";color:#e74c3c}.wy-control-group .wy-form-full,.wy-control-group .wy-form-halves,.wy-control-group .wy-form-thirds{padding-bottom:12px}.wy-control-group .wy-form-full input[type=color],.wy-control-group .wy-form-full input[type=date],.wy-control-group .wy-form-full input[type=datetime-local],.wy-control-group .wy-form-full input[type=datetime],.wy-control-group .wy-form-full input[type=email],.wy-control-group .wy-form-full input[type=month],.wy-control-group .wy-form-full input[type=number],.wy-control-group .wy-form-full input[type=password],.wy-control-group .wy-form-full input[type=search],.wy-control-group .wy-form-full input[type=tel],.wy-control-group .wy-form-full input[type=text],.wy-control-group .wy-form-full input[type=time],.wy-control-group .wy-form-full input[type=url],.wy-control-group .wy-form-full input[type=week],.wy-control-group .wy-form-full select,.wy-control-group .wy-form-halves input[type=color],.wy-control-group .wy-form-halves input[type=date],.wy-control-group .wy-form-halves input[type=datetime-local],.wy-control-group .wy-form-halves input[type=datetime],.wy-control-group .wy-form-halves input[type=email],.wy-control-group .wy-form-halves input[type=month],.wy-control-group .wy-form-halves input[type=number],.wy-control-group .wy-form-halves input[type=password],.wy-control-group .wy-form-halves input[type=search],.wy-control-group .wy-form-halves input[type=tel],.wy-control-group .wy-form-halves input[type=text],.wy-control-group .wy-form-halves input[type=time],.wy-control-group .wy-form-halves input[type=url],.wy-control-group 
.wy-form-halves input[type=week],.wy-control-group .wy-form-halves select,.wy-control-group .wy-form-thirds input[type=color],.wy-control-group .wy-form-thirds input[type=date],.wy-control-group .wy-form-thirds input[type=datetime-local],.wy-control-group .wy-form-thirds input[type=datetime],.wy-control-group .wy-form-thirds input[type=email],.wy-control-group .wy-form-thirds input[type=month],.wy-control-group .wy-form-thirds input[type=number],.wy-control-group .wy-form-thirds input[type=password],.wy-control-group .wy-form-thirds input[type=search],.wy-control-group .wy-form-thirds input[type=tel],.wy-control-group .wy-form-thirds input[type=text],.wy-control-group .wy-form-thirds input[type=time],.wy-control-group .wy-form-thirds input[type=url],.wy-control-group .wy-form-thirds input[type=week],.wy-control-group .wy-form-thirds select{width:100%}.wy-control-group .wy-form-full{float:left;display:block;width:100%;margin-right:0}.wy-control-group .wy-form-full:last-child{margin-right:0}.wy-control-group .wy-form-halves{float:left;display:block;margin-right:2.35765%;width:48.82117%}.wy-control-group .wy-form-halves:last-child,.wy-control-group .wy-form-halves:nth-of-type(2n){margin-right:0}.wy-control-group .wy-form-halves:nth-of-type(odd){clear:left}.wy-control-group .wy-form-thirds{float:left;display:block;margin-right:2.35765%;width:31.76157%}.wy-control-group .wy-form-thirds:last-child,.wy-control-group .wy-form-thirds:nth-of-type(3n){margin-right:0}.wy-control-group .wy-form-thirds:nth-of-type(3n+1){clear:left}.wy-control-group.wy-control-group-no-input .wy-control,.wy-control-no-input{margin:6px 0 0;font-size:90%}.wy-control-no-input{display:inline-block}.wy-control-group.fluid-input input[type=color],.wy-control-group.fluid-input input[type=date],.wy-control-group.fluid-input input[type=datetime-local],.wy-control-group.fluid-input input[type=datetime],.wy-control-group.fluid-input input[type=email],.wy-control-group.fluid-input 
input[type=month],.wy-control-group.fluid-input input[type=number],.wy-control-group.fluid-input input[type=password],.wy-control-group.fluid-input input[type=search],.wy-control-group.fluid-input input[type=tel],.wy-control-group.fluid-input input[type=text],.wy-control-group.fluid-input input[type=time],.wy-control-group.fluid-input input[type=url],.wy-control-group.fluid-input input[type=week]{width:100%}.wy-form-message-inline{padding-left:.3em;color:#666;font-size:90%}.wy-form-message{display:block;color:#999;font-size:70%;margin-top:.3125em;font-style:italic}.wy-form-message p{font-size:inherit;font-style:italic;margin-bottom:6px}.wy-form-message p:last-child{margin-bottom:0}input{line-height:normal}input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;*overflow:visible}input[type=color],input[type=date],input[type=datetime-local],input[type=datetime],input[type=email],input[type=month],input[type=number],input[type=password],input[type=search],input[type=tel],input[type=text],input[type=time],input[type=url],input[type=week]{-webkit-appearance:none;padding:6px;display:inline-block;border:1px solid #ccc;font-size:80%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;box-shadow:inset 0 1px 3px #ddd;border-radius:0;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}input[type=datetime-local]{padding:.34375em 
.625em}input[disabled]{cursor:default}input[type=checkbox],input[type=radio]{padding:0;margin-right:.3125em;*height:13px;*width:13px}input[type=checkbox],input[type=radio],input[type=search]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}input[type=search]::-webkit-search-cancel-button,input[type=search]::-webkit-search-decoration{-webkit-appearance:none}input[type=color]:focus,input[type=date]:focus,input[type=datetime-local]:focus,input[type=datetime]:focus,input[type=email]:focus,input[type=month]:focus,input[type=number]:focus,input[type=password]:focus,input[type=search]:focus,input[type=tel]:focus,input[type=text]:focus,input[type=time]:focus,input[type=url]:focus,input[type=week]:focus{outline:0;outline:thin dotted\9;border-color:#333}input.no-focus:focus{border-color:#ccc!important}input[type=checkbox]:focus,input[type=file]:focus,input[type=radio]:focus{outline:thin dotted #333;outline:1px auto #129fea}input[type=color][disabled],input[type=date][disabled],input[type=datetime-local][disabled],input[type=datetime][disabled],input[type=email][disabled],input[type=month][disabled],input[type=number][disabled],input[type=password][disabled],input[type=search][disabled],input[type=tel][disabled],input[type=text][disabled],input[type=time][disabled],input[type=url][disabled],input[type=week][disabled]{cursor:not-allowed;background-color:#fafafa}input:focus:invalid,select:focus:invalid,textarea:focus:invalid{color:#e74c3c;border:1px solid #e74c3c}input:focus:invalid:focus,select:focus:invalid:focus,textarea:focus:invalid:focus{border-color:#e74c3c}input[type=checkbox]:focus:invalid:focus,input[type=file]:focus:invalid:focus,input[type=radio]:focus:invalid:focus{outline-color:#e74c3c}input.wy-input-large{padding:12px;font-size:100%}textarea{overflow:auto;vertical-align:top;width:100%;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif}select,textarea{padding:.5em .625em;display:inline-block;border:1px solid 
#ccc;font-size:80%;box-shadow:inset 0 1px 3px #ddd;-webkit-transition:border .3s linear;-moz-transition:border .3s linear;transition:border .3s linear}select{border:1px solid #ccc;background-color:#fff}select[multiple]{height:auto}select:focus,textarea:focus{outline:0}input[readonly],select[disabled],select[readonly],textarea[disabled],textarea[readonly]{cursor:not-allowed;background-color:#fafafa}input[type=checkbox][disabled],input[type=radio][disabled]{cursor:not-allowed}.wy-checkbox,.wy-radio{margin:6px 0;color:#404040;display:block}.wy-checkbox input,.wy-radio input{vertical-align:baseline}.wy-form-message-inline{display:inline-block;*display:inline;*zoom:1;vertical-align:middle}.wy-input-prefix,.wy-input-suffix{white-space:nowrap;padding:6px}.wy-input-prefix .wy-input-context,.wy-input-suffix .wy-input-context{line-height:27px;padding:0 8px;display:inline-block;font-size:80%;background-color:#f3f6f6;border:1px solid #ccc;color:#999}.wy-input-suffix .wy-input-context{border-left:0}.wy-input-prefix .wy-input-context{border-right:0}.wy-switch{position:relative;display:block;height:24px;margin-top:12px;cursor:pointer}.wy-switch:before{left:0;top:0;width:36px;height:12px;background:#ccc}.wy-switch:after,.wy-switch:before{position:absolute;content:"";display:block;border-radius:4px;-webkit-transition:all .2s ease-in-out;-moz-transition:all .2s ease-in-out;transition:all .2s ease-in-out}.wy-switch:after{width:18px;height:18px;background:#999;left:-3px;top:-3px}.wy-switch span{position:absolute;left:48px;display:block;font-size:12px;color:#ccc;line-height:1}.wy-switch.active:before{background:#1e8449}.wy-switch.active:after{left:24px;background:#27ae60}.wy-switch.disabled{cursor:not-allowed;opacity:.8}.wy-control-group.wy-control-group-error .wy-form-message,.wy-control-group.wy-control-group-error>label{color:#e74c3c}.wy-control-group.wy-control-group-error input[type=color],.wy-control-group.wy-control-group-error 
input[type=date],.wy-control-group.wy-control-group-error input[type=datetime-local],.wy-control-group.wy-control-group-error input[type=datetime],.wy-control-group.wy-control-group-error input[type=email],.wy-control-group.wy-control-group-error input[type=month],.wy-control-group.wy-control-group-error input[type=number],.wy-control-group.wy-control-group-error input[type=password],.wy-control-group.wy-control-group-error input[type=search],.wy-control-group.wy-control-group-error input[type=tel],.wy-control-group.wy-control-group-error input[type=text],.wy-control-group.wy-control-group-error input[type=time],.wy-control-group.wy-control-group-error input[type=url],.wy-control-group.wy-control-group-error input[type=week],.wy-control-group.wy-control-group-error textarea{border:1px solid #e74c3c}.wy-inline-validate{white-space:nowrap}.wy-inline-validate .wy-input-context{padding:.5em .625em;display:inline-block;font-size:80%}.wy-inline-validate.wy-inline-validate-success .wy-input-context{color:#27ae60}.wy-inline-validate.wy-inline-validate-danger .wy-input-context{color:#e74c3c}.wy-inline-validate.wy-inline-validate-warning .wy-input-context{color:#e67e22}.wy-inline-validate.wy-inline-validate-info .wy-input-context{color:#2980b9}.rotate-90{-webkit-transform:rotate(90deg);-moz-transform:rotate(90deg);-ms-transform:rotate(90deg);-o-transform:rotate(90deg);transform:rotate(90deg)}.rotate-180{-webkit-transform:rotate(180deg);-moz-transform:rotate(180deg);-ms-transform:rotate(180deg);-o-transform:rotate(180deg);transform:rotate(180deg)}.rotate-270{-webkit-transform:rotate(270deg);-moz-transform:rotate(270deg);-ms-transform:rotate(270deg);-o-transform:rotate(270deg);transform:rotate(270deg)}.mirror{-webkit-transform:scaleX(-1);-moz-transform:scaleX(-1);-ms-transform:scaleX(-1);-o-transform:scaleX(-1);transform:scaleX(-1)}.mirror.rotate-90{-webkit-transform:scaleX(-1) rotate(90deg);-moz-transform:scaleX(-1) rotate(90deg);-ms-transform:scaleX(-1) 
rotate(90deg);-o-transform:scaleX(-1) rotate(90deg);transform:scaleX(-1) rotate(90deg)}.mirror.rotate-180{-webkit-transform:scaleX(-1) rotate(180deg);-moz-transform:scaleX(-1) rotate(180deg);-ms-transform:scaleX(-1) rotate(180deg);-o-transform:scaleX(-1) rotate(180deg);transform:scaleX(-1) rotate(180deg)}.mirror.rotate-270{-webkit-transform:scaleX(-1) rotate(270deg);-moz-transform:scaleX(-1) rotate(270deg);-ms-transform:scaleX(-1) rotate(270deg);-o-transform:scaleX(-1) rotate(270deg);transform:scaleX(-1) rotate(270deg)}@media only screen and (max-width:480px){.wy-form button[type=submit]{margin:.7em 0 0}.wy-form input[type=color],.wy-form input[type=date],.wy-form input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=text],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week],.wy-form label{margin-bottom:.3em;display:block}.wy-form input[type=color],.wy-form input[type=date],.wy-form input[type=datetime-local],.wy-form input[type=datetime],.wy-form input[type=email],.wy-form input[type=month],.wy-form input[type=number],.wy-form input[type=password],.wy-form input[type=search],.wy-form input[type=tel],.wy-form input[type=time],.wy-form input[type=url],.wy-form input[type=week]{margin-bottom:0}.wy-form-aligned .wy-control-group label{margin-bottom:.3em;text-align:left;display:block;width:100%}.wy-form-aligned .wy-control{margin:1.5em 0 0}.wy-form-message,.wy-form-message-inline,.wy-form .wy-help-inline{display:block;font-size:80%;padding:6px 0}}@media screen and (max-width:768px){.tablet-hide{display:none}}@media screen and (max-width:480px){.mobile-hide{display:none}}.float-left{float:left}.float-right{float:right}.full-width{width:100%}.rst-content table.docutils,.rst-content 
table.field-list,.wy-table{border-collapse:collapse;border-spacing:0;empty-cells:show;margin-bottom:24px}.rst-content table.docutils caption,.rst-content table.field-list caption,.wy-table caption{color:#000;font:italic 85%/1 arial,sans-serif;padding:1em 0;text-align:center}.rst-content table.docutils td,.rst-content table.docutils th,.rst-content table.field-list td,.rst-content table.field-list th,.wy-table td,.wy-table th{font-size:90%;margin:0;overflow:visible;padding:8px 16px}.rst-content table.docutils td:first-child,.rst-content table.docutils th:first-child,.rst-content table.field-list td:first-child,.rst-content table.field-list th:first-child,.wy-table td:first-child,.wy-table th:first-child{border-left-width:0}.rst-content table.docutils thead,.rst-content table.field-list thead,.wy-table thead{color:#000;text-align:left;vertical-align:bottom;white-space:nowrap}.rst-content table.docutils thead th,.rst-content table.field-list thead th,.wy-table thead th{font-weight:700;border-bottom:2px solid #e1e4e5}.rst-content table.docutils td,.rst-content table.field-list td,.wy-table td{background-color:transparent;vertical-align:middle}.rst-content table.docutils td p,.rst-content table.field-list td p,.wy-table td p{line-height:18px}.rst-content table.docutils td p:last-child,.rst-content table.field-list td p:last-child,.wy-table td p:last-child{margin-bottom:0}.rst-content table.docutils .wy-table-cell-min,.rst-content table.field-list .wy-table-cell-min,.wy-table .wy-table-cell-min{width:1%;padding-right:0}.rst-content table.docutils .wy-table-cell-min input[type=checkbox],.rst-content table.field-list .wy-table-cell-min input[type=checkbox],.wy-table .wy-table-cell-min input[type=checkbox]{margin:0}.wy-table-secondary{color:grey;font-size:90%}.wy-table-tertiary{color:grey;font-size:80%}.rst-content table.docutils:not(.field-list) tr:nth-child(2n-1) td,.wy-table-backed,.wy-table-odd td,.wy-table-striped tr:nth-child(2n-1) 
td{background-color:#f3f6f6}.rst-content table.docutils,.wy-table-bordered-all{border:1px solid #e1e4e5}.rst-content table.docutils td,.wy-table-bordered-all td{border-bottom:1px solid #e1e4e5;border-left:1px solid #e1e4e5}.rst-content table.docutils tbody>tr:last-child td,.wy-table-bordered-all tbody>tr:last-child td{border-bottom-width:0}.wy-table-bordered{border:1px solid #e1e4e5}.wy-table-bordered-rows td{border-bottom:1px solid #e1e4e5}.wy-table-bordered-rows tbody>tr:last-child td{border-bottom-width:0}.wy-table-horizontal td,.wy-table-horizontal th{border-width:0 0 1px;border-bottom:1px solid #e1e4e5}.wy-table-horizontal tbody>tr:last-child td{border-bottom-width:0}.wy-table-responsive{margin-bottom:24px;max-width:100%;overflow:auto}.wy-table-responsive table{margin-bottom:0!important}.wy-table-responsive table td,.wy-table-responsive table th{white-space:nowrap}a{color:#2980b9;text-decoration:none;cursor:pointer}a:hover{color:#3091d1}a:visited{color:#9b59b6}html{height:100%}body,html{overflow-x:hidden}body{font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;font-weight:400;color:#404040;min-height:100%;background:#edf0f2}.wy-text-left{text-align:left}.wy-text-center{text-align:center}.wy-text-right{text-align:right}.wy-text-large{font-size:120%}.wy-text-normal{font-size:100%}.wy-text-small,small{font-size:80%}.wy-text-strike{text-decoration:line-through}.wy-text-warning{color:#e67e22!important}a.wy-text-warning:hover{color:#eb9950!important}.wy-text-info{color:#2980b9!important}a.wy-text-info:hover{color:#409ad5!important}.wy-text-success{color:#27ae60!important}a.wy-text-success:hover{color:#36d278!important}.wy-text-danger{color:#e74c3c!important}a.wy-text-danger:hover{color:#ed7669!important}.wy-text-neutral{color:#404040!important}a.wy-text-neutral:hover{color:#595959!important}.rst-content .toctree-wrapper>p.caption,h1,h2,h3,h4,h5,h6,legend{margin-top:0;font-weight:700;font-family:Roboto 
Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif}p{line-height:24px;font-size:16px;margin:0 0 24px}h1{font-size:175%}.rst-content .toctree-wrapper>p.caption,h2{font-size:150%}h3{font-size:125%}h4{font-size:115%}h5{font-size:110%}h6{font-size:100%}hr{display:block;height:1px;border:0;border-top:1px solid #e1e4e5;margin:24px 0;padding:0}.rst-content code,.rst-content tt,code{white-space:nowrap;max-width:100%;background:#fff;border:1px solid #e1e4e5;font-size:75%;padding:0 5px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#e74c3c;overflow-x:auto}.rst-content tt.code-large,code.code-large{font-size:90%}.rst-content .section ul,.rst-content .toctree-wrapper ul,.rst-content section ul,.wy-plain-list-disc,article ul{list-style:disc;line-height:24px;margin-bottom:24px}.rst-content .section ul li,.rst-content .toctree-wrapper ul li,.rst-content section ul li,.wy-plain-list-disc li,article ul li{list-style:disc;margin-left:24px}.rst-content .section ul li p:last-child,.rst-content .section ul li ul,.rst-content .toctree-wrapper ul li p:last-child,.rst-content .toctree-wrapper ul li ul,.rst-content section ul li p:last-child,.rst-content section ul li ul,.wy-plain-list-disc li p:last-child,.wy-plain-list-disc li ul,article ul li p:last-child,article ul li ul{margin-bottom:0}.rst-content .section ul li li,.rst-content .toctree-wrapper ul li li,.rst-content section ul li li,.wy-plain-list-disc li li,article ul li li{list-style:circle}.rst-content .section ul li li li,.rst-content .toctree-wrapper ul li li li,.rst-content section ul li li li,.wy-plain-list-disc li li li,article ul li li li{list-style:square}.rst-content .section ul li ol li,.rst-content .toctree-wrapper ul li ol li,.rst-content section ul li ol li,.wy-plain-list-disc li ol li,article ul li ol li{list-style:decimal}.rst-content .section ol,.rst-content .section ol.arabic,.rst-content .toctree-wrapper ol,.rst-content .toctree-wrapper ol.arabic,.rst-content 
section ol,.rst-content section ol.arabic,.wy-plain-list-decimal,article ol{list-style:decimal;line-height:24px;margin-bottom:24px}.rst-content .section ol.arabic li,.rst-content .section ol li,.rst-content .toctree-wrapper ol.arabic li,.rst-content .toctree-wrapper ol li,.rst-content section ol.arabic li,.rst-content section ol li,.wy-plain-list-decimal li,article ol li{list-style:decimal;margin-left:24px}.rst-content .section ol.arabic li ul,.rst-content .section ol li p:last-child,.rst-content .section ol li ul,.rst-content .toctree-wrapper ol.arabic li ul,.rst-content .toctree-wrapper ol li p:last-child,.rst-content .toctree-wrapper ol li ul,.rst-content section ol.arabic li ul,.rst-content section ol li p:last-child,.rst-content section ol li ul,.wy-plain-list-decimal li p:last-child,.wy-plain-list-decimal li ul,article ol li p:last-child,article ol li ul{margin-bottom:0}.rst-content .section ol.arabic li ul li,.rst-content .section ol li ul li,.rst-content .toctree-wrapper ol.arabic li ul li,.rst-content .toctree-wrapper ol li ul li,.rst-content section ol.arabic li ul li,.rst-content section ol li ul li,.wy-plain-list-decimal li ul li,article ol li ul li{list-style:disc}.wy-breadcrumbs{*zoom:1}.wy-breadcrumbs:after,.wy-breadcrumbs:before{display:table;content:""}.wy-breadcrumbs:after{clear:both}.wy-breadcrumbs>li{display:inline-block;padding-top:5px}.wy-breadcrumbs>li.wy-breadcrumbs-aside{float:right}.rst-content .wy-breadcrumbs>li code,.rst-content .wy-breadcrumbs>li tt,.wy-breadcrumbs>li .rst-content tt,.wy-breadcrumbs>li code{all:inherit;color:inherit}.breadcrumb-item:before{content:"/";color:#bbb;font-size:13px;padding:0 6px 0 3px}.wy-breadcrumbs-extra{margin-bottom:0;color:#b3b3b3;font-size:80%;display:inline-block}@media screen and (max-width:480px){.wy-breadcrumbs-extra,.wy-breadcrumbs li.wy-breadcrumbs-aside{display:none}}@media print{.wy-breadcrumbs 
li.wy-breadcrumbs-aside{display:none}}html{font-size:16px}.wy-affix{position:fixed;top:1.618em}.wy-menu a:hover{text-decoration:none}.wy-menu-horiz{*zoom:1}.wy-menu-horiz:after,.wy-menu-horiz:before{display:table;content:""}.wy-menu-horiz:after{clear:both}.wy-menu-horiz li,.wy-menu-horiz ul{display:inline-block}.wy-menu-horiz li:hover{background:hsla(0,0%,100%,.1)}.wy-menu-horiz li.divide-left{border-left:1px solid #404040}.wy-menu-horiz li.divide-right{border-right:1px solid #404040}.wy-menu-horiz a{height:32px;display:inline-block;line-height:32px;padding:0 16px}.wy-menu-vertical{width:300px}.wy-menu-vertical header,.wy-menu-vertical p.caption{color:#55a5d9;height:32px;line-height:32px;padding:0 1.618em;margin:12px 0 0;display:block;font-weight:700;text-transform:uppercase;font-size:85%;white-space:nowrap}.wy-menu-vertical ul{margin-bottom:0}.wy-menu-vertical li.divide-top{border-top:1px solid #404040}.wy-menu-vertical li.divide-bottom{border-bottom:1px solid #404040}.wy-menu-vertical li.current{background:#e3e3e3}.wy-menu-vertical li.current a{color:grey;border-right:1px solid #c9c9c9;padding:.4045em 2.427em}.wy-menu-vertical li.current a:hover{background:#d6d6d6}.rst-content .wy-menu-vertical li tt,.wy-menu-vertical li .rst-content tt,.wy-menu-vertical li code{border:none;background:inherit;color:inherit;padding-left:0;padding-right:0}.wy-menu-vertical li button.toctree-expand{display:block;float:left;margin-left:-1.2em;line-height:18px;color:#4d4d4d;border:none;background:none;padding:0}.wy-menu-vertical li.current>a,.wy-menu-vertical li.on a{color:#404040;font-weight:700;position:relative;background:#fcfcfc;border:none;padding:.4045em 1.618em}.wy-menu-vertical li.current>a:hover,.wy-menu-vertical li.on a:hover{background:#fcfcfc}.wy-menu-vertical li.current>a:hover button.toctree-expand,.wy-menu-vertical li.on a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.current>a button.toctree-expand,.wy-menu-vertical li.on a 
button.toctree-expand{display:block;line-height:18px;color:#333}.wy-menu-vertical li.toctree-l1.current>a{border-bottom:1px solid #c9c9c9;border-top:1px solid #c9c9c9}.wy-menu-vertical .toctree-l1.current .toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .toctree-l11>ul{display:none}.wy-menu-vertical .toctree-l1.current .current.toctree-l2>ul,.wy-menu-vertical .toctree-l2.current .current.toctree-l3>ul,.wy-menu-vertical .toctree-l3.current .current.toctree-l4>ul,.wy-menu-vertical .toctree-l4.current .current.toctree-l5>ul,.wy-menu-vertical .toctree-l5.current .current.toctree-l6>ul,.wy-menu-vertical .toctree-l6.current .current.toctree-l7>ul,.wy-menu-vertical .toctree-l7.current .current.toctree-l8>ul,.wy-menu-vertical .toctree-l8.current .current.toctree-l9>ul,.wy-menu-vertical .toctree-l9.current .current.toctree-l10>ul,.wy-menu-vertical .toctree-l10.current .current.toctree-l11>ul{display:block}.wy-menu-vertical li.toctree-l3,.wy-menu-vertical li.toctree-l4{font-size:.9em}.wy-menu-vertical li.toctree-l2 a,.wy-menu-vertical li.toctree-l3 a,.wy-menu-vertical li.toctree-l4 a,.wy-menu-vertical li.toctree-l5 a,.wy-menu-vertical li.toctree-l6 a,.wy-menu-vertical li.toctree-l7 a,.wy-menu-vertical li.toctree-l8 a,.wy-menu-vertical li.toctree-l9 a,.wy-menu-vertical li.toctree-l10 a{color:#404040}.wy-menu-vertical li.toctree-l2 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l3 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l4 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l5 a:hover 
button.toctree-expand,.wy-menu-vertical li.toctree-l6 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l7 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l8 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l9 a:hover button.toctree-expand,.wy-menu-vertical li.toctree-l10 a:hover button.toctree-expand{color:grey}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a,.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a,.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a,.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a,.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a,.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a,.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a,.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{display:block}.wy-menu-vertical li.toctree-l2.current>a{padding:.4045em 2.427em}.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{padding:.4045em 1.618em .4045em 4.045em}.wy-menu-vertical li.toctree-l3.current>a{padding:.4045em 4.045em}.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{padding:.4045em 1.618em .4045em 5.663em}.wy-menu-vertical li.toctree-l4.current>a{padding:.4045em 5.663em}.wy-menu-vertical li.toctree-l4.current li.toctree-l5>a{padding:.4045em 1.618em .4045em 7.281em}.wy-menu-vertical li.toctree-l5.current>a{padding:.4045em 7.281em}.wy-menu-vertical li.toctree-l5.current li.toctree-l6>a{padding:.4045em 1.618em .4045em 8.899em}.wy-menu-vertical li.toctree-l6.current>a{padding:.4045em 8.899em}.wy-menu-vertical li.toctree-l6.current li.toctree-l7>a{padding:.4045em 1.618em .4045em 10.517em}.wy-menu-vertical li.toctree-l7.current>a{padding:.4045em 10.517em}.wy-menu-vertical li.toctree-l7.current li.toctree-l8>a{padding:.4045em 1.618em .4045em 12.135em}.wy-menu-vertical li.toctree-l8.current>a{padding:.4045em 12.135em}.wy-menu-vertical li.toctree-l8.current li.toctree-l9>a{padding:.4045em 1.618em .4045em 
13.753em}.wy-menu-vertical li.toctree-l9.current>a{padding:.4045em 13.753em}.wy-menu-vertical li.toctree-l9.current li.toctree-l10>a{padding:.4045em 1.618em .4045em 15.371em}.wy-menu-vertical li.toctree-l10.current>a{padding:.4045em 15.371em}.wy-menu-vertical li.toctree-l10.current li.toctree-l11>a{padding:.4045em 1.618em .4045em 16.989em}.wy-menu-vertical li.toctree-l2.current>a,.wy-menu-vertical li.toctree-l2.current li.toctree-l3>a{background:#c9c9c9}.wy-menu-vertical li.toctree-l2 button.toctree-expand{color:#a3a3a3}.wy-menu-vertical li.toctree-l3.current>a,.wy-menu-vertical li.toctree-l3.current li.toctree-l4>a{background:#bdbdbd}.wy-menu-vertical li.toctree-l3 button.toctree-expand{color:#969696}.wy-menu-vertical li.current ul{display:block}.wy-menu-vertical li ul{margin-bottom:0;display:none}.wy-menu-vertical li ul li a{margin-bottom:0;color:#d9d9d9;font-weight:400}.wy-menu-vertical a{line-height:18px;padding:.4045em 1.618em;display:block;position:relative;font-size:90%;color:#d9d9d9}.wy-menu-vertical a:hover{background-color:#4e4a4a;cursor:pointer}.wy-menu-vertical a:hover button.toctree-expand{color:#d9d9d9}.wy-menu-vertical a:active{background-color:#2980b9;cursor:pointer;color:#fff}.wy-menu-vertical a:active button.toctree-expand{color:#fff}.wy-side-nav-search{display:block;width:300px;padding:.809em;margin-bottom:.809em;z-index:200;background-color:#2980b9;text-align:center;color:#fcfcfc}.wy-side-nav-search input[type=text]{width:100%;border-radius:50px;padding:6px 12px;border-color:#2472a4}.wy-side-nav-search img{display:block;margin:auto auto .809em;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-side-nav-search .wy-dropdown>a,.wy-side-nav-search>a{color:#fcfcfc;font-size:100%;font-weight:700;display:inline-block;padding:4px 6px;margin-bottom:.809em;max-width:100%}.wy-side-nav-search .wy-dropdown>a:hover,.wy-side-nav-search>a:hover{background:hsla(0,0%,100%,.1)}.wy-side-nav-search .wy-dropdown>a 
img.logo,.wy-side-nav-search>a img.logo{display:block;margin:0 auto;height:auto;width:auto;border-radius:0;max-width:100%;background:transparent}.wy-side-nav-search .wy-dropdown>a.icon img.logo,.wy-side-nav-search>a.icon img.logo{margin-top:.85em}.wy-side-nav-search>div.version{margin-top:-.4045em;margin-bottom:.809em;font-weight:400;color:hsla(0,0%,100%,.3)}.wy-nav .wy-menu-vertical header{color:#2980b9}.wy-nav .wy-menu-vertical a{color:#b3b3b3}.wy-nav .wy-menu-vertical a:hover{background-color:#2980b9;color:#fff}[data-menu-wrap]{-webkit-transition:all .2s ease-in;-moz-transition:all .2s ease-in;transition:all .2s ease-in;position:absolute;opacity:1;width:100%;opacity:0}[data-menu-wrap].move-center{left:0;right:auto;opacity:1}[data-menu-wrap].move-left{right:auto;left:-100%;opacity:0}[data-menu-wrap].move-right{right:-100%;left:auto;opacity:0}.wy-body-for-nav{background:#fcfcfc}.wy-grid-for-nav{position:absolute;width:100%;height:100%}.wy-nav-side{position:fixed;top:0;bottom:0;left:0;padding-bottom:2em;width:300px;overflow-x:hidden;overflow-y:hidden;min-height:100%;color:#9b9b9b;background:#343131;z-index:200}.wy-side-scroll{width:320px;position:relative;overflow-x:hidden;overflow-y:scroll;height:100%}.wy-nav-top{display:none;background:#2980b9;color:#fff;padding:.4045em .809em;position:relative;line-height:50px;text-align:center;font-size:100%;*zoom:1}.wy-nav-top:after,.wy-nav-top:before{display:table;content:""}.wy-nav-top:after{clear:both}.wy-nav-top a{color:#fff;font-weight:700}.wy-nav-top img{margin-right:12px;height:45px;width:45px;background-color:#2980b9;padding:5px;border-radius:100%}.wy-nav-top i{font-size:30px;float:left;cursor:pointer;padding-top:inherit}.wy-nav-content-wrap{margin-left:300px;background:#fcfcfc;min-height:100%}.wy-nav-content{padding:1.618em 
3.236em;height:100%;max-width:800px;margin:auto}.wy-body-mask{position:fixed;width:100%;height:100%;background:rgba(0,0,0,.2);display:none;z-index:499}.wy-body-mask.on{display:block}footer{color:grey}footer p{margin-bottom:12px}.rst-content footer span.commit tt,footer span.commit .rst-content tt,footer span.commit code{padding:0;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:1em;background:none;border:none;color:grey}.rst-footer-buttons{*zoom:1}.rst-footer-buttons:after,.rst-footer-buttons:before{width:100%;display:table;content:""}.rst-footer-buttons:after{clear:both}.rst-breadcrumbs-buttons{margin-top:12px;*zoom:1}.rst-breadcrumbs-buttons:after,.rst-breadcrumbs-buttons:before{display:table;content:""}.rst-breadcrumbs-buttons:after{clear:both}#search-results .search li{margin-bottom:24px;border-bottom:1px solid #e1e4e5;padding-bottom:24px}#search-results .search li:first-child{border-top:1px solid #e1e4e5;padding-top:24px}#search-results .search li a{font-size:120%;margin-bottom:12px;display:inline-block}#search-results .context{color:grey;font-size:90%}.genindextable li>ul{margin-left:24px}@media screen and (max-width:768px){.wy-body-for-nav{background:#fcfcfc}.wy-nav-top{display:block}.wy-nav-side{left:-300px}.wy-nav-side.shift{width:85%;left:0}.wy-menu.wy-menu-vertical,.wy-side-nav-search,.wy-side-scroll{width:auto}.wy-nav-content-wrap{margin-left:0}.wy-nav-content-wrap .wy-nav-content{padding:1.618em}.wy-nav-content-wrap.shift{position:fixed;min-width:100%;left:85%;top:0;height:100%;overflow:hidden}}@media screen and (min-width:1100px){.wy-nav-content-wrap{background:rgba(0,0,0,.05)}.wy-nav-content{margin:0;background:#fcfcfc}}@media print{.rst-versions,.wy-nav-side,footer{display:none}.wy-nav-content-wrap{margin-left:0}}.rst-versions{position:fixed;bottom:0;left:0;width:300px;color:#fcfcfc;background:#1f1d1d;font-family:Lato,proxima-nova,Helvetica Neue,Arial,sans-serif;z-index:400}.rst-versions 
a{color:#2980b9;text-decoration:none}.rst-versions .rst-badge-small{display:none}.rst-versions .rst-current-version{padding:12px;background-color:#272525;display:block;text-align:right;font-size:90%;cursor:pointer;color:#27ae60;*zoom:1}.rst-versions .rst-current-version:after,.rst-versions .rst-current-version:before{display:table;content:""}.rst-versions .rst-current-version:after{clear:both}.rst-content .code-block-caption .rst-versions .rst-current-version .headerlink,.rst-content .eqno .rst-versions .rst-current-version .headerlink,.rst-content .rst-versions .rst-current-version .admonition-title,.rst-content code.download .rst-versions .rst-current-version span:first-child,.rst-content dl dt .rst-versions .rst-current-version .headerlink,.rst-content h1 .rst-versions .rst-current-version .headerlink,.rst-content h2 .rst-versions .rst-current-version .headerlink,.rst-content h3 .rst-versions .rst-current-version .headerlink,.rst-content h4 .rst-versions .rst-current-version .headerlink,.rst-content h5 .rst-versions .rst-current-version .headerlink,.rst-content h6 .rst-versions .rst-current-version .headerlink,.rst-content p .rst-versions .rst-current-version .headerlink,.rst-content table>caption .rst-versions .rst-current-version .headerlink,.rst-content tt.download .rst-versions .rst-current-version span:first-child,.rst-versions .rst-current-version .fa,.rst-versions .rst-current-version .icon,.rst-versions .rst-current-version .rst-content .admonition-title,.rst-versions .rst-current-version .rst-content .code-block-caption .headerlink,.rst-versions .rst-current-version .rst-content .eqno .headerlink,.rst-versions .rst-current-version .rst-content code.download span:first-child,.rst-versions .rst-current-version .rst-content dl dt .headerlink,.rst-versions .rst-current-version .rst-content h1 .headerlink,.rst-versions .rst-current-version .rst-content h2 .headerlink,.rst-versions .rst-current-version .rst-content h3 .headerlink,.rst-versions 
.rst-current-version .rst-content h4 .headerlink,.rst-versions .rst-current-version .rst-content h5 .headerlink,.rst-versions .rst-current-version .rst-content h6 .headerlink,.rst-versions .rst-current-version .rst-content p .headerlink,.rst-versions .rst-current-version .rst-content table>caption .headerlink,.rst-versions .rst-current-version .rst-content tt.download span:first-child,.rst-versions .rst-current-version .wy-menu-vertical li button.toctree-expand,.wy-menu-vertical li .rst-versions .rst-current-version button.toctree-expand{color:#fcfcfc}.rst-versions .rst-current-version .fa-book,.rst-versions .rst-current-version .icon-book{float:left}.rst-versions .rst-current-version.rst-out-of-date{background-color:#e74c3c;color:#fff}.rst-versions .rst-current-version.rst-active-old-version{background-color:#f1c40f;color:#000}.rst-versions.shift-up{height:auto;max-height:100%;overflow-y:scroll}.rst-versions.shift-up .rst-other-versions{display:block}.rst-versions .rst-other-versions{font-size:90%;padding:12px;color:grey;display:none}.rst-versions .rst-other-versions hr{display:block;height:1px;border:0;margin:20px 0;padding:0;border-top:1px solid #413d3d}.rst-versions .rst-other-versions dd{display:inline-block;margin:0}.rst-versions .rst-other-versions dd a{display:inline-block;padding:6px;color:#fcfcfc}.rst-versions.rst-badge{width:auto;bottom:20px;right:20px;left:auto;border:none;max-width:300px;max-height:90%}.rst-versions.rst-badge .fa-book,.rst-versions.rst-badge .icon-book{float:none;line-height:30px}.rst-versions.rst-badge.shift-up .rst-current-version{text-align:right}.rst-versions.rst-badge.shift-up .rst-current-version .fa-book,.rst-versions.rst-badge.shift-up .rst-current-version .icon-book{float:left}.rst-versions.rst-badge>.rst-current-version{width:auto;height:30px;line-height:30px;padding:0 6px;display:block;text-align:center}@media screen and (max-width:768px){.rst-versions{width:85%;display:none}.rst-versions.shift{display:block}}.rst-content 
.toctree-wrapper>p.caption,.rst-content h1,.rst-content h2,.rst-content h3,.rst-content h4,.rst-content h5,.rst-content h6{margin-bottom:24px}.rst-content img{max-width:100%;height:auto}.rst-content div.figure,.rst-content figure{margin-bottom:24px}.rst-content div.figure .caption-text,.rst-content figure .caption-text{font-style:italic}.rst-content div.figure p:last-child.caption,.rst-content figure p:last-child.caption{margin-bottom:0}.rst-content div.figure.align-center,.rst-content figure.align-center{text-align:center}.rst-content .section>a>img,.rst-content .section>img,.rst-content section>a>img,.rst-content section>img{margin-bottom:24px}.rst-content abbr[title]{text-decoration:none}.rst-content.style-external-links a.reference.external:after{font-family:FontAwesome;content:"\f08e";color:#b3b3b3;vertical-align:super;font-size:60%;margin:0 .2em}.rst-content blockquote{margin-left:24px;line-height:24px;margin-bottom:24px}.rst-content pre.literal-block{white-space:pre;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;display:block;overflow:auto}.rst-content div[class^=highlight],.rst-content pre.literal-block{border:1px solid #e1e4e5;overflow-x:auto;margin:1px 0 24px}.rst-content div[class^=highlight] div[class^=highlight],.rst-content pre.literal-block div[class^=highlight]{padding:0;border:none;margin:0}.rst-content div[class^=highlight] td.code{width:100%}.rst-content .linenodiv pre{border-right:1px solid #e6e9ea;margin:0;padding:12px;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;user-select:none;pointer-events:none}.rst-content div[class^=highlight] pre{white-space:pre;margin:0;padding:12px;display:block;overflow:auto}.rst-content div[class^=highlight] pre .hll{display:block;margin:0 -12px;padding:0 12px}.rst-content .linenodiv pre,.rst-content div[class^=highlight] pre,.rst-content 
pre.literal-block{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;font-size:12px;line-height:1.4}.rst-content div.highlight .gp,.rst-content div.highlight span.linenos{user-select:none;pointer-events:none}.rst-content div.highlight span.linenos{display:inline-block;padding-left:0;padding-right:12px;margin-right:12px;border-right:1px solid #e6e9ea}.rst-content .code-block-caption{font-style:italic;font-size:85%;line-height:1;padding:1em 0;text-align:center}@media print{.rst-content .codeblock,.rst-content div[class^=highlight],.rst-content div[class^=highlight] pre{white-space:pre-wrap}}.rst-content .admonition,.rst-content .admonition-todo,.rst-content .attention,.rst-content .caution,.rst-content .danger,.rst-content .error,.rst-content .hint,.rst-content .important,.rst-content .note,.rst-content .seealso,.rst-content .tip,.rst-content .warning{clear:both}.rst-content .admonition-todo .last,.rst-content .admonition-todo>:last-child,.rst-content .admonition .last,.rst-content .admonition>:last-child,.rst-content .attention .last,.rst-content .attention>:last-child,.rst-content .caution .last,.rst-content .caution>:last-child,.rst-content .danger .last,.rst-content .danger>:last-child,.rst-content .error .last,.rst-content .error>:last-child,.rst-content .hint .last,.rst-content .hint>:last-child,.rst-content .important .last,.rst-content .important>:last-child,.rst-content .note .last,.rst-content .note>:last-child,.rst-content .seealso .last,.rst-content .seealso>:last-child,.rst-content .tip .last,.rst-content .tip>:last-child,.rst-content .warning .last,.rst-content .warning>:last-child{margin-bottom:0}.rst-content .admonition-title:before{margin-right:4px}.rst-content .admonition table{border-color:rgba(0,0,0,.1)}.rst-content .admonition table td,.rst-content .admonition table th{background:transparent!important;border-color:rgba(0,0,0,.1)!important}.rst-content .section ol.loweralpha,.rst-content .section 
ol.loweralpha>li,.rst-content .toctree-wrapper ol.loweralpha,.rst-content .toctree-wrapper ol.loweralpha>li,.rst-content section ol.loweralpha,.rst-content section ol.loweralpha>li{list-style:lower-alpha}.rst-content .section ol.upperalpha,.rst-content .section ol.upperalpha>li,.rst-content .toctree-wrapper ol.upperalpha,.rst-content .toctree-wrapper ol.upperalpha>li,.rst-content section ol.upperalpha,.rst-content section ol.upperalpha>li{list-style:upper-alpha}.rst-content .section ol li>*,.rst-content .section ul li>*,.rst-content .toctree-wrapper ol li>*,.rst-content .toctree-wrapper ul li>*,.rst-content section ol li>*,.rst-content section ul li>*{margin-top:12px;margin-bottom:12px}.rst-content .section ol li>:first-child,.rst-content .section ul li>:first-child,.rst-content .toctree-wrapper ol li>:first-child,.rst-content .toctree-wrapper ul li>:first-child,.rst-content section ol li>:first-child,.rst-content section ul li>:first-child{margin-top:0}.rst-content .section ol li>p,.rst-content .section ol li>p:last-child,.rst-content .section ul li>p,.rst-content .section ul li>p:last-child,.rst-content .toctree-wrapper ol li>p,.rst-content .toctree-wrapper ol li>p:last-child,.rst-content .toctree-wrapper ul li>p,.rst-content .toctree-wrapper ul li>p:last-child,.rst-content section ol li>p,.rst-content section ol li>p:last-child,.rst-content section ul li>p,.rst-content section ul li>p:last-child{margin-bottom:12px}.rst-content .section ol li>p:only-child,.rst-content .section ol li>p:only-child:last-child,.rst-content .section ul li>p:only-child,.rst-content .section ul li>p:only-child:last-child,.rst-content .toctree-wrapper ol li>p:only-child,.rst-content .toctree-wrapper ol li>p:only-child:last-child,.rst-content .toctree-wrapper ul li>p:only-child,.rst-content .toctree-wrapper ul li>p:only-child:last-child,.rst-content section ol li>p:only-child,.rst-content section ol li>p:only-child:last-child,.rst-content section ul li>p:only-child,.rst-content section ul 
li>p:only-child:last-child{margin-bottom:0}.rst-content .section ol li>ol,.rst-content .section ol li>ul,.rst-content .section ul li>ol,.rst-content .section ul li>ul,.rst-content .toctree-wrapper ol li>ol,.rst-content .toctree-wrapper ol li>ul,.rst-content .toctree-wrapper ul li>ol,.rst-content .toctree-wrapper ul li>ul,.rst-content section ol li>ol,.rst-content section ol li>ul,.rst-content section ul li>ol,.rst-content section ul li>ul{margin-bottom:12px}.rst-content .section ol.simple li>*,.rst-content .section ol.simple li ol,.rst-content .section ol.simple li ul,.rst-content .section ul.simple li>*,.rst-content .section ul.simple li ol,.rst-content .section ul.simple li ul,.rst-content .toctree-wrapper ol.simple li>*,.rst-content .toctree-wrapper ol.simple li ol,.rst-content .toctree-wrapper ol.simple li ul,.rst-content .toctree-wrapper ul.simple li>*,.rst-content .toctree-wrapper ul.simple li ol,.rst-content .toctree-wrapper ul.simple li ul,.rst-content section ol.simple li>*,.rst-content section ol.simple li ol,.rst-content section ol.simple li ul,.rst-content section ul.simple li>*,.rst-content section ul.simple li ol,.rst-content section ul.simple li ul{margin-top:0;margin-bottom:0}.rst-content .line-block{margin-left:0;margin-bottom:24px;line-height:24px}.rst-content .line-block .line-block{margin-left:24px;margin-bottom:0}.rst-content .topic-title{font-weight:700;margin-bottom:12px}.rst-content .toc-backref{color:#404040}.rst-content .align-right{float:right;margin:0 0 24px 24px}.rst-content .align-left{float:left;margin:0 24px 24px 0}.rst-content .align-center{margin:auto}.rst-content .align-center:not(table){display:block}.rst-content .code-block-caption .headerlink,.rst-content .eqno .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink,.rst-content dl dt .headerlink,.rst-content h1 .headerlink,.rst-content h2 .headerlink,.rst-content h3 .headerlink,.rst-content h4 .headerlink,.rst-content h5 .headerlink,.rst-content h6 
.headerlink,.rst-content p.caption .headerlink,.rst-content p .headerlink,.rst-content table>caption .headerlink{opacity:0;font-size:14px;font-family:FontAwesome;margin-left:.5em}.rst-content .code-block-caption .headerlink:focus,.rst-content .code-block-caption:hover .headerlink,.rst-content .eqno .headerlink:focus,.rst-content .eqno:hover .headerlink,.rst-content .toctree-wrapper>p.caption .headerlink:focus,.rst-content .toctree-wrapper>p.caption:hover .headerlink,.rst-content dl dt .headerlink:focus,.rst-content dl dt:hover .headerlink,.rst-content h1 .headerlink:focus,.rst-content h1:hover .headerlink,.rst-content h2 .headerlink:focus,.rst-content h2:hover .headerlink,.rst-content h3 .headerlink:focus,.rst-content h3:hover .headerlink,.rst-content h4 .headerlink:focus,.rst-content h4:hover .headerlink,.rst-content h5 .headerlink:focus,.rst-content h5:hover .headerlink,.rst-content h6 .headerlink:focus,.rst-content h6:hover .headerlink,.rst-content p.caption .headerlink:focus,.rst-content p.caption:hover .headerlink,.rst-content p .headerlink:focus,.rst-content p:hover .headerlink,.rst-content table>caption .headerlink:focus,.rst-content table>caption:hover .headerlink{opacity:1}.rst-content p a{overflow-wrap:anywhere}.rst-content .wy-table td p,.rst-content .wy-table td ul,.rst-content .wy-table th p,.rst-content .wy-table th ul,.rst-content table.docutils td p,.rst-content table.docutils td ul,.rst-content table.docutils th p,.rst-content table.docutils th ul,.rst-content table.field-list td p,.rst-content table.field-list td ul,.rst-content table.field-list th p,.rst-content table.field-list th ul{font-size:inherit}.rst-content .btn:focus{outline:2px solid}.rst-content table>caption .headerlink:after{font-size:12px}.rst-content .centered{text-align:center}.rst-content .sidebar{float:right;width:40%;display:block;margin:0 0 24px 24px;padding:24px;background:#f3f6f6;border:1px solid #e1e4e5}.rst-content .sidebar dl,.rst-content .sidebar p,.rst-content .sidebar 
ul{font-size:90%}.rst-content .sidebar .last,.rst-content .sidebar>:last-child{margin-bottom:0}.rst-content .sidebar .sidebar-title{display:block;font-family:Roboto Slab,ff-tisa-web-pro,Georgia,Arial,sans-serif;font-weight:700;background:#e1e4e5;padding:6px 12px;margin:-24px -24px 24px;font-size:100%}.rst-content .highlighted{background:#f1c40f;box-shadow:0 0 0 2px #f1c40f;display:inline;font-weight:700}.rst-content .citation-reference,.rst-content .footnote-reference{vertical-align:baseline;position:relative;top:-.4em;line-height:0;font-size:90%}.rst-content .citation-reference>span.fn-bracket,.rst-content .footnote-reference>span.fn-bracket{display:none}.rst-content .hlist{width:100%}.rst-content dl dt span.classifier:before{content:" : "}.rst-content dl dt span.classifier-delimiter{display:none!important}html.writer-html4 .rst-content table.docutils.citation,html.writer-html4 .rst-content table.docutils.footnote{background:none;border:none}html.writer-html4 .rst-content table.docutils.citation td,html.writer-html4 .rst-content table.docutils.citation tr,html.writer-html4 .rst-content table.docutils.footnote td,html.writer-html4 .rst-content table.docutils.footnote tr{border:none;background-color:transparent!important;white-space:normal}html.writer-html4 .rst-content table.docutils.citation td.label,html.writer-html4 .rst-content table.docutils.footnote td.label{padding-left:0;padding-right:0;vertical-align:top}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{display:grid;grid-template-columns:auto minmax(80%,95%)}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{display:inline-grid;grid-template-columns:max-content auto}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{display:grid;grid-template-columns:auto 
auto minmax(.65rem,auto) minmax(40%,95%)}html.writer-html5 .rst-content aside.citation>span.label,html.writer-html5 .rst-content aside.footnote>span.label,html.writer-html5 .rst-content div.citation>span.label{grid-column-start:1;grid-column-end:2}html.writer-html5 .rst-content aside.citation>span.backrefs,html.writer-html5 .rst-content aside.footnote>span.backrefs,html.writer-html5 .rst-content div.citation>span.backrefs{grid-column-start:2;grid-column-end:3;grid-row-start:1;grid-row-end:3}html.writer-html5 .rst-content aside.citation>p,html.writer-html5 .rst-content aside.footnote>p,html.writer-html5 .rst-content div.citation>p{grid-column-start:4;grid-column-end:5}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.field-list,html.writer-html5 .rst-content dl.footnote{margin-bottom:24px}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dt{padding-left:1rem}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.field-list>dd,html.writer-html5 .rst-content dl.field-list>dt,html.writer-html5 .rst-content dl.footnote>dd,html.writer-html5 .rst-content dl.footnote>dt{margin-bottom:0}html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{font-size:.9rem}html.writer-html5 .rst-content dl.citation>dt,html.writer-html5 .rst-content dl.footnote>dt{margin:0 .5rem .5rem 0;line-height:1.2rem;word-break:break-all;font-weight:400}html.writer-html5 .rst-content dl.citation>dt>span.brackets:before,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:before{content:"["}html.writer-html5 .rst-content dl.citation>dt>span.brackets:after,html.writer-html5 .rst-content dl.footnote>dt>span.brackets:after{content:"]"}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref,html.writer-html5 .rst-content 
dl.footnote>dt>span.fn-backref{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref>a{word-break:keep-all}html.writer-html5 .rst-content dl.citation>dt>span.fn-backref>a:not(:first-child):before,html.writer-html5 .rst-content dl.footnote>dt>span.fn-backref>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content dl.citation>dd,html.writer-html5 .rst-content dl.footnote>dd{margin:0 0 .5rem;line-height:1.2rem}html.writer-html5 .rst-content dl.citation>dd p,html.writer-html5 .rst-content dl.footnote>dd p{font-size:.9rem}html.writer-html5 .rst-content aside.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content div.citation{padding-left:1rem;padding-right:1rem;font-size:.9rem;line-height:1.2rem}html.writer-html5 .rst-content aside.citation p,html.writer-html5 .rst-content aside.footnote p,html.writer-html5 .rst-content div.citation p{font-size:.9rem;line-height:1.2rem;margin-bottom:12px}html.writer-html5 .rst-content aside.citation span.backrefs,html.writer-html5 .rst-content aside.footnote span.backrefs,html.writer-html5 .rst-content div.citation span.backrefs{text-align:left;font-style:italic;margin-left:.65rem;word-break:break-word;word-spacing:-.1rem;max-width:5rem}html.writer-html5 .rst-content aside.citation span.backrefs>a,html.writer-html5 .rst-content aside.footnote span.backrefs>a,html.writer-html5 .rst-content div.citation span.backrefs>a{word-break:keep-all}html.writer-html5 .rst-content aside.citation span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content aside.footnote span.backrefs>a:not(:first-child):before,html.writer-html5 .rst-content div.citation span.backrefs>a:not(:first-child):before{content:" "}html.writer-html5 .rst-content aside.citation span.label,html.writer-html5 .rst-content aside.footnote 
span.label,html.writer-html5 .rst-content div.citation span.label{line-height:1.2rem}html.writer-html5 .rst-content aside.citation-list,html.writer-html5 .rst-content aside.footnote-list,html.writer-html5 .rst-content div.citation-list{margin-bottom:24px}html.writer-html5 .rst-content dl.option-list kbd{font-size:.9rem}.rst-content table.docutils.footnote,html.writer-html4 .rst-content table.docutils.citation,html.writer-html5 .rst-content aside.footnote,html.writer-html5 .rst-content aside.footnote-list aside.footnote,html.writer-html5 .rst-content div.citation-list>div.citation,html.writer-html5 .rst-content dl.citation,html.writer-html5 .rst-content dl.footnote{color:grey}.rst-content table.docutils.footnote code,.rst-content table.docutils.footnote tt,html.writer-html4 .rst-content table.docutils.citation code,html.writer-html4 .rst-content table.docutils.citation tt,html.writer-html5 .rst-content aside.footnote-list aside.footnote code,html.writer-html5 .rst-content aside.footnote-list aside.footnote tt,html.writer-html5 .rst-content aside.footnote code,html.writer-html5 .rst-content aside.footnote tt,html.writer-html5 .rst-content div.citation-list>div.citation code,html.writer-html5 .rst-content div.citation-list>div.citation tt,html.writer-html5 .rst-content dl.citation code,html.writer-html5 .rst-content dl.citation tt,html.writer-html5 .rst-content dl.footnote code,html.writer-html5 .rst-content dl.footnote tt{color:#555}.rst-content .wy-table-responsive.citation,.rst-content .wy-table-responsive.footnote{margin-bottom:0}.rst-content .wy-table-responsive.citation+:not(.citation),.rst-content .wy-table-responsive.footnote+:not(.footnote){margin-top:24px}.rst-content .wy-table-responsive.citation:last-child,.rst-content .wy-table-responsive.footnote:last-child{margin-bottom:24px}.rst-content table.docutils th{border-color:#e1e4e5}html.writer-html5 .rst-content table.docutils th{border:1px solid #e1e4e5}html.writer-html5 .rst-content table.docutils 
td>p,html.writer-html5 .rst-content table.docutils th>p{line-height:1rem;margin-bottom:0;font-size:.9rem}.rst-content table.docutils td .last,.rst-content table.docutils td .last>:last-child{margin-bottom:0}.rst-content table.field-list,.rst-content table.field-list td{border:none}.rst-content table.field-list td p{line-height:inherit}.rst-content table.field-list td>strong{display:inline-block}.rst-content table.field-list .field-name{padding-right:10px;text-align:left;white-space:nowrap}.rst-content table.field-list .field-body{text-align:left}.rst-content code,.rst-content tt{color:#000;font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;padding:2px 5px}.rst-content code big,.rst-content code em,.rst-content tt big,.rst-content tt em{font-size:100%!important;line-height:normal}.rst-content code.literal,.rst-content tt.literal{color:#e74c3c;white-space:normal}.rst-content code.xref,.rst-content tt.xref,a .rst-content code,a .rst-content tt{font-weight:700;color:#404040;overflow-wrap:normal}.rst-content kbd,.rst-content pre,.rst-content samp{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace}.rst-content a code,.rst-content a tt{color:#2980b9}.rst-content dl{margin-bottom:24px}.rst-content dl dt{font-weight:700;margin-bottom:12px}.rst-content dl ol,.rst-content dl p,.rst-content dl table,.rst-content dl ul{margin-bottom:12px}.rst-content dl dd{margin:0 0 12px 24px;line-height:24px}.rst-content dl dd>ol:last-child,.rst-content dl dd>p:last-child,.rst-content dl dd>table:last-child,.rst-content dl dd>ul:last-child{margin-bottom:0}html.writer-html4 .rst-content dl:not(.docutils),html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple){margin-bottom:24px}html.writer-html4 .rst-content dl:not(.docutils)>dt,html.writer-html5 .rst-content 
dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{display:table;margin:6px 0;font-size:90%;line-height:normal;background:#e7f2fa;color:#2980b9;border-top:3px solid #6ab0de;padding:6px;position:relative}html.writer-html4 .rst-content dl:not(.docutils)>dt:before,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:before{color:#6ab0de}html.writer-html4 .rst-content dl:not(.docutils)>dt .headerlink,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt{margin-bottom:6px;border:none;border-left:3px solid #ccc;background:#f0f0f0;color:#555}html.writer-html4 .rst-content dl:not(.docutils) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) dl:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt .headerlink{color:#404040;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils)>dt:first-child,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple)>dt:first-child{margin-top:0}html.writer-html4 .rst-content dl:not(.docutils) code.descclassname,html.writer-html4 .rst-content dl:not(.docutils) 
code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descclassname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{background-color:transparent;border:none;padding:0;font-size:100%!important}html.writer-html4 .rst-content dl:not(.docutils) code.descname,html.writer-html4 .rst-content dl:not(.docutils) tt.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) code.descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) tt.descname{font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .optional,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .optional{display:inline-block;padding:0 4px;color:#000;font-weight:700}html.writer-html4 .rst-content dl:not(.docutils) .property,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .property{display:inline-block;padding-right:8px;max-width:100%}html.writer-html4 .rst-content dl:not(.docutils) .k,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .k{font-style:italic}html.writer-html4 
.rst-content dl:not(.docutils) .descclassname,html.writer-html4 .rst-content dl:not(.docutils) .descname,html.writer-html4 .rst-content dl:not(.docutils) .sig-name,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descclassname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .descname,html.writer-html5 .rst-content dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.citation):not(.glossary):not(.simple) .sig-name{font-family:SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,Courier,monospace;color:#000}.rst-content .viewcode-back,.rst-content .viewcode-link{display:inline-block;color:#27ae60;font-size:80%;padding-left:24px}.rst-content .viewcode-back{display:block;float:right}.rst-content p.rubric{margin-bottom:12px;font-weight:700}.rst-content code.download,.rst-content tt.download{background:inherit;padding:inherit;font-weight:400;font-family:inherit;font-size:inherit;color:inherit;border:inherit;white-space:inherit}.rst-content code.download span:first-child,.rst-content tt.download span:first-child{-webkit-font-smoothing:subpixel-antialiased}.rst-content code.download span:first-child:before,.rst-content tt.download span:first-child:before{margin-right:4px}.rst-content .guilabel{border:1px solid #7fbbe3;background:#e7f2fa;font-size:80%;font-weight:700;border-radius:4px;padding:2.4px 6px;margin:auto 2px}.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>.kbd,.rst-content :not(dl.option-list)>:not(dt):not(kbd):not(.kbd)>kbd{color:inherit;font-size:80%;background-color:#fff;border:1px solid #a6a6a6;border-radius:4px;box-shadow:0 2px grey;padding:2.4px 6px;margin:auto 0}.rst-content .versionmodified{font-style:italic}@media screen and (max-width:480px){.rst-content 
.sidebar{width:100%}}span[id*=MathJax-Span]{color:#404040}.math{text-align:center}@font-face{font-family:Lato;src:url(fonts/lato-normal.woff2?bd03a2cc277bbbc338d464e679fe9942) format("woff2"),url(fonts/lato-normal.woff?27bd77b9162d388cb8d4c4217c7c5e2a) format("woff");font-weight:400;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold.woff2?cccb897485813c7c256901dbca54ecf2) format("woff2"),url(fonts/lato-bold.woff?d878b6c29b10beca227e9eef4246111b) format("woff");font-weight:700;font-style:normal;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-bold-italic.woff2?0b6bb6725576b072c5d0b02ecdd1900d) format("woff2"),url(fonts/lato-bold-italic.woff?9c7e4e9eb485b4a121c760e61bc3707c) format("woff");font-weight:700;font-style:italic;font-display:block}@font-face{font-family:Lato;src:url(fonts/lato-normal-italic.woff2?4eb103b4d12be57cb1d040ed5e162e9d) format("woff2"),url(fonts/lato-normal-italic.woff?f28f2d6482446544ef1ea1ccc6dd5892) format("woff");font-weight:400;font-style:italic;font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:400;src:url(fonts/Roboto-Slab-Regular.woff2?7abf5b8d04d26a2cafea937019bca958) format("woff2"),url(fonts/Roboto-Slab-Regular.woff?c1be9284088d487c5e3ff0a10a92e58c) format("woff");font-display:block}@font-face{font-family:Roboto Slab;font-style:normal;font-weight:700;src:url(fonts/Roboto-Slab-Bold.woff2?9984f4a9bda09be08e83f2506954adbe) format("woff2"),url(fonts/Roboto-Slab-Bold.woff?bed5564a116b05148e3b3bea6fb1162a) format("woff");font-display:block} diff --git a/site/css/theme_extra.css b/site/css/theme_extra.css new file mode 100644 index 0000000..9f4b063 --- /dev/null +++ b/site/css/theme_extra.css @@ -0,0 +1,191 @@ +/* + * Wrap inline code samples otherwise they shoot of the side and + * can't be read at all. 
+ * + * https://github.com/mkdocs/mkdocs/issues/313 + * https://github.com/mkdocs/mkdocs/issues/233 + * https://github.com/mkdocs/mkdocs/issues/834 + */ +.rst-content code { + white-space: pre-wrap; + word-wrap: break-word; + padding: 2px 5px; +} + +/** + * Make code blocks display as blocks and give them the appropriate + * font size and padding. + * + * https://github.com/mkdocs/mkdocs/issues/855 + * https://github.com/mkdocs/mkdocs/issues/834 + * https://github.com/mkdocs/mkdocs/issues/233 + */ +.rst-content pre code { + white-space: pre; + word-wrap: normal; + display: block; + padding: 12px; + font-size: 12px; +} + +/** + * Fix code colors + * + * https://github.com/mkdocs/mkdocs/issues/2027 + */ +.rst-content code { + color: #E74C3C; +} + +.rst-content pre code { + color: #000; + background: #f8f8f8; +} + +/* + * Fix link colors when the link text is inline code. + * + * https://github.com/mkdocs/mkdocs/issues/718 + */ +a code { + color: #2980B9; +} +a:hover code { + color: #3091d1; +} +a:visited code { + color: #9B59B6; +} + +/* + * The CSS classes from highlight.js seem to clash with the + * ReadTheDocs theme causing some code to be incorrectly made + * bold and italic. + * + * https://github.com/mkdocs/mkdocs/issues/411 + */ +pre .cs, pre .c { + font-weight: inherit; + font-style: inherit; +} + +/* + * Fix some issues with the theme and non-highlighted code + * samples. Without and highlighting styles attached the + * formatting is broken. 
+ * + * https://github.com/mkdocs/mkdocs/issues/319 + */ +.rst-content .no-highlight { + display: block; + padding: 0.5em; + color: #333; +} + + +/* + * Additions specific to the search functionality provided by MkDocs + */ + +.search-results { + margin-top: 23px; +} + +.search-results article { + border-top: 1px solid #E1E4E5; + padding-top: 24px; +} + +.search-results article:first-child { + border-top: none; +} + +form .search-query { + width: 100%; + border-radius: 50px; + padding: 6px 12px; /* csslint allow: box-model */ + border-color: #D1D4D5; +} + +/* + * Improve inline code blocks within admonitions. + * + * https://github.com/mkdocs/mkdocs/issues/656 + */ + .rst-content .admonition code { + color: #404040; + border: 1px solid #c7c9cb; + border: 1px solid rgba(0, 0, 0, 0.2); + background: #f8fbfd; + background: rgba(255, 255, 255, 0.7); +} + +/* + * Account for wide tables which go off the side. + * Override borders to avoid weirdness on narrow tables. + * + * https://github.com/mkdocs/mkdocs/issues/834 + * https://github.com/mkdocs/mkdocs/pull/1034 + */ +.rst-content .section .docutils { + width: 100%; + overflow: auto; + display: block; + border: none; +} + +td, th { + border: 1px solid #e1e4e5 !important; /* csslint allow: important */ + border-collapse: collapse; +} + +/* + * Without the following amendments, the navigation in the theme will be + * slightly cut off. This is due to the fact that the .wy-nav-side has a + * padding-bottom of 2em, which must not necessarily align with the font-size of + * 90 % on the .rst-current-version container, combined with the padding of 12px + * above and below. These amendments fix this in two steps: First, make sure the + * .rst-current-version container has a fixed height of 40px, achieved using + * line-height, and then applying a padding-bottom of 40px to this container. In + * a second step, the items within that container are re-aligned using flexbox. 
+ * + * https://github.com/mkdocs/mkdocs/issues/2012 + */ + .wy-nav-side { + padding-bottom: 40px; +} + +/* + * The second step of above amendment: Here we make sure the items are aligned + * correctly within the .rst-current-version container. Using flexbox, we + * achieve it in such a way that it will look like the following: + * + * [No repo_name] + * Next >> // On the first page + * << Previous Next >> // On all subsequent pages + * + * [With repo_name] + * Next >> // On the first page + * << Previous Next >> // On all subsequent pages + * + * https://github.com/mkdocs/mkdocs/issues/2012 + */ +.rst-versions .rst-current-version { + padding: 0 12px; + display: flex; + font-size: initial; + justify-content: space-between; + align-items: center; + line-height: 40px; +} + +/* + * Please note that this amendment also involves removing certain inline-styles + * from the file ./mkdocs/themes/readthedocs/versions.html. + * + * https://github.com/mkdocs/mkdocs/issues/2012 + */ +.rst-current-version span { + flex: 1; + text-align: center; +} diff --git a/site/img/favicon.ico b/site/img/favicon.ico new file mode 100644 index 0000000..e85006a Binary files /dev/null and b/site/img/favicon.ico differ diff --git a/site/index.html b/site/index.html new file mode 100644 index 0000000..817c29f --- /dev/null +++ b/site/index.html @@ -0,0 +1,232 @@ + + + + + + + + Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
    +
+
+
+
+
+ +

Welcome to the Ensemblex documentation!

+

Ensemblex is an accuracy-weighted ensemble framework for genetic demultiplexing of pooled single-cell RNA sequencing (scRNAseq) data. By addressing the limitations of individual genetic demultiplexing tools, we demonstrated that Ensemblex:

+
    +
  • Achieves higher demultiplexing accuracy
  • Limits the introduction of technical noise into scRNAseq analysis
  • Retains a high proportion of cells for downstream analyses.
+

The ensemble method capitalizes on the added confidence of combining distinct statistical frameworks for genetic demultiplexing, while the modular algorithm adapts to the overall performance of its constituent tools on the respective dataset, making it resilient against a poorly performing constituent tool.

+

Ensemblex can be used to demultiplex pools with or without prior genotype information. When demultiplexing with prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools:

+
    +
  1. Demuxalot (Rogozhnikov et al.)
  2. Demuxlet (Kang et al.)
  3. Souporcell (Heaton et al.)
  4. Vireo-GT (Huang et al.)
+

When demultiplexing without prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools:

+
    +
  1. Demuxalot (Rogozhnikov et al.)
  2. Freemuxlet (Kang et al.)
  3. Souporcell (Heaton et al.)
  4. Vireo (Huang et al.)
+

Upon demultiplexing pools with each of the four constituent genetic demultiplexing tools, Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools:

+

Step 1: Probabilistic-weighted ensemble
+Step 2: Graph-based doublet detection
+Step 3: Ensemble-independent doublet detection

+

As output, Ensemblex returns its own cell-specific sample labels, the corresponding assignment probabilities, and a singlet confidence score, as well as the sample labels and corresponding assignment probabilities for each of its constituent tools. The demultiplexed sample labels can then be used to perform downstream analyses.

+

+ +

+ +

Figure 1. Overview of the Ensemblex workflow. A) The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework. B) The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection. C) As output, Ensemblex returns its own sample-cell assignments as well as the sample-cell assignments of each of its constituent tools. D) Ensemblex's sample-cell assignments can be used to perform downstream analysis on the pooled scRNAseq data.

+

To facilitate the application of Ensemblex, we provide a pipeline that demultiplexes pooled cells by each of the individual constituent genetic demultiplexing tools and processes the outputs with the Ensemblex algorithm. In this documentation, we outline each step of the Ensemblex pipeline, illustrate how to run the pipeline, define best practices, and provide a tutorial with publicly available datasets.

+

For a comprehensive description of Ensemblex, ground-truth benchmarking, and application to real-world datasets, see our pre-print manuscript: Pre-print

+
+

Contents

+ + +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + + + diff --git a/site/installation/index.html b/site/installation/index.html new file mode 100644 index 0000000..c0bfb1b --- /dev/null +++ b/site/installation/index.html @@ -0,0 +1,248 @@ + + + + + + + + Installation - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
    +
+
+
+
+
+ +

Installation

+

The Ensemblex container is freely available under an MIT open-source license at https://zenodo.org/records/11639103.

+

The Ensemblex container can be downloaded using the following code:

+
## Download the Ensemblex container
+curl "https://zenodo.org/records/11639103/files/ensemblex.pip.zip?download=1" --output ensemblex.pip.zip
+
+## Unzip the Ensemblex container
+unzip ensemblex.pip.zip
+
+

If the download and unzip steps were successful, the following directory structure will be available:

+
ensemblex.pip
+├── gt
+│   ├── configs
+│   │   └── ensemblex_config.ini
+│   └── scripts
+│       ├── demuxalot
+│       │   ├── pipeline_demuxalot.sh
+│       │   └── pipline_demuxalot.py
+│       ├── demuxlet
+│       │   └── pipeline_demuxlet.sh
+│       ├── ensemblexing
+│       │   ├── ensemblexing.R
+│       │   ├── functions.R
+│       │   └── pipeline_ensemblexing.sh
+│       ├── souporcell
+│       │   └── pipeline_souporcell_generate.sh
+│       └── vireo
+│           └── pipeline_vireo.sh
+├── launch
+│   ├── launch_gt.sh
+│   └── launch_nogt.sh
+├── launch_ensemblex.sh
+├── nogt
+│   ├── configs
+│   │   └── ensemblex_config.ini
+│   └── scripts
+│       ├── demuxalot
+│       │   ├── pipeline_demuxalot.py
+│       │   └── pipeline_demuxalot.sh
+│       ├── ensemblexing
+│       │   ├── ensemblexing_nogt.R
+│       │   ├── functions_nogt.R
+│       │   └── pipeline_ensemblexing.sh
+│       ├── freemuxlet
+│       │   └── pipeline_freemuxlet.sh
+│       ├── souporcell
+│       │   └── pipeline_souporcell_generate.sh
+│       └── vireo
+│           └── pipeline_vireo.sh
+├── README
+├── soft
+│   └── ensemblex.sif
+└── tools
+    ├── sort_vcf_same_as_bam.sh
+    └── utils.sh
+
+
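As a quick sanity check on the unzipped directory, a sketch along the following lines can confirm that a few of the files listed in the tree above are present. The function name `check_ensemblex_layout` and the particular subset of files it checks are illustrative assumptions; nothing like it ships with Ensemblex.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Ensemblex distribution): verify that a
# handful of files from the directory tree above exist under ensemblex.pip.
check_ensemblex_layout() {
    local home="$1"
    local missing=0
    local f
    for f in \
        launch_ensemblex.sh \
        launch/launch_gt.sh \
        launch/launch_nogt.sh \
        gt/configs/ensemblex_config.ini \
        nogt/configs/ensemblex_config.ini \
        soft/ensemblex.sif
    do
        if [ ! -e "$home/$f" ]; then
            # Report each missing path on stderr and remember the failure
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (path is a placeholder):
# check_ensemblex_layout /path/to/ensemblex.pip && echo "layout looks complete"
```

The function returns a non-zero status if any of the checked paths is absent, so it can be chained with `&&` in front of later commands.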

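In addition to the Ensemblex container, users must have Apptainer installed and available in their environment. For example, on a system that provides Apptainer through environment modules: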

+
## Load Apptainer
+module load apptainer/1.2.4 
+
+
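Because the module name and version above are specific to one computing environment, it can help to confirm that the `apptainer` executable is actually on the `PATH` before continuing. A minimal sketch, where `require_apptainer` is a hypothetical helper name introduced only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Apptainer version if the executable is found
# on PATH, otherwise print a hint on stderr and return a non-zero status.
require_apptainer() {
    if command -v apptainer >/dev/null 2>&1; then
        apptainer --version
    else
        echo "apptainer not found on PATH; load or install it first" >&2
        return 1
    fi
}

# Example call:
# require_apptainer || exit 1
```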

To test if the Ensemblex container is installed properly, run the following code:

+
## Define the path to ensemblex.pip
+ensemblex_HOME=/path/to/ensemblex.pip
+
+## Print help message
+bash $ensemblex_HOME/launch_ensemblex.sh -h
+
+

This should return the following help message:

+
------------------- 
+Usage:  /home/fiorini9/scratch/ensemblex.pip/launch_ensemblex.sh [arguments]
+        mandatory arguments:
+                -d  (--dir)  = Working directory (where all the outputs will be printed) (give full path)
+                --steps  =  Specify the steps to execute. Begin by selecting either init-GT or init-noGT to establish the working directory. 
+                       For GT: vireo, demuxalot, demuxlet, souporcell, ensemblexing 
+                       For noGT: vireo, demuxalot, freemuxlet, souporcell, ensemblexing 
+
+        optional arguments:
+                -h  (--help)  = See helps regarding the pipeline arguments 
+                --vcf  = The path of vcf file 
+                --bam  = The path of bam file 
+                --sortout  = The path snd nsme of vcf generated using sort  
+ ------------------- 
+ For a comprehensive help, visit  https://neurobioinfo.github.io/ensemblex/site/ for documentation. 
+
+
+
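Going by the help message above, the two mandatory arguments are `-d` (a full path to the working directory) and `--steps`, and the first step is either `init-GT` or `init-noGT`. The wrapper below is only a sketch of how those arguments fit together; `init_ensemblex` is a hypothetical name, and the paths in the example call are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Ensemblex) around the two mandatory
# arguments shown in the help message: -d and --steps.
init_ensemblex() {
    local home="$1" workdir="$2" step="$3"
    case "$step" in
        init-GT|init-noGT) ;;   # the two initialisation steps named in the help text
        *)
            echo "step must be init-GT or init-noGT" >&2
            return 2
            ;;
    esac
    bash "$home/launch_ensemblex.sh" -d "$workdir" --steps "$step"
}

# Example call (placeholder paths):
# init_ensemblex /path/to/ensemblex.pip /path/to/working_directory init-GT
```

Rejecting any other step value up front keeps a typo from reaching the launcher; the remaining steps listed in the help message would be passed to `launch_ensemblex.sh` directly.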

After installing the Ensemblex container, we can proceed to Step 1, where we will initiate the Ensemblex pipeline for demultiplexing: Set up

+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/js/html5shiv.min.js b/site/js/html5shiv.min.js new file mode 100644 index 0000000..1a01c94 --- /dev/null +++ b/site/js/html5shiv.min.js @@ -0,0 +1,4 @@ +/** +* @preserve HTML5 Shiv 3.7.3 | @afarkas @jdalton @jon_neal @rem | MIT/GPL2 Licensed +*/ +!function(a,b){function c(a,b){var c=a.createElement("p"),d=a.getElementsByTagName("head")[0]||a.documentElement;return c.innerHTML="x",d.insertBefore(c.lastChild,d.firstChild)}function d(){var a=t.elements;return"string"==typeof a?a.split(" "):a}function e(a,b){var c=t.elements;"string"!=typeof c&&(c=c.join(" ")),"string"!=typeof a&&(a=a.join(" ")),t.elements=c+" "+a,j(b)}function f(a){var b=s[a[q]];return b||(b={},r++,a[q]=r,s[r]=b),b}function g(a,c,d){if(c||(c=b),l)return c.createElement(a);d||(d=f(c));var e;return e=d.cache[a]?d.cache[a].cloneNode():p.test(a)?(d.cache[a]=d.createElem(a)).cloneNode():d.createElem(a),!e.canHaveChildren||o.test(a)||e.tagUrn?e:d.frag.appendChild(e)}function h(a,c){if(a||(a=b),l)return a.createDocumentFragment();c=c||f(a);for(var e=c.frag.cloneNode(),g=0,h=d(),i=h.length;i>g;g++)e.createElement(h[g]);return e}function i(a,b){b.cache||(b.cache={},b.createElem=a.createElement,b.createFrag=a.createDocumentFragment,b.frag=b.createFrag()),a.createElement=function(c){return t.shivMethods?g(c,a,b):b.createElem(c)},a.createDocumentFragment=Function("h,f","return function(){var n=f.cloneNode(),c=n.createElement;h.shivMethods&&("+d().join().replace(/[\w\-:]+/g,function(a){return b.createElem(a),b.frag.createElement(a),'c("'+a+'")'})+");return n}")(t,b.frag)}function j(a){a||(a=b);var d=f(a);return!t.shivCSS||k||d.hasCSS||(d.hasCSS=!!c(a,"article,aside,dialog,figcaption,figure,footer,header,hgroup,main,nav,section{display:block}mark{background:#FF0;color:#000}template{display:none}")),l||i(a,d),a}var 
k,l,m="3.7.3",n=a.html5||{},o=/^<|^(?:button|map|select|textarea|object|iframe|option|optgroup)$/i,p=/^(?:a|b|code|div|fieldset|h1|h2|h3|h4|h5|h6|i|label|li|ol|p|q|span|strong|style|table|tbody|td|th|tr|ul)$/i,q="_html5shiv",r=0,s={};!function(){try{var a=b.createElement("a");a.innerHTML="",k="hidden"in a,l=1==a.childNodes.length||function(){b.createElement("a");var a=b.createDocumentFragment();return"undefined"==typeof a.cloneNode||"undefined"==typeof a.createDocumentFragment||"undefined"==typeof a.createElement}()}catch(c){k=!0,l=!0}}();var t={elements:n.elements||"abbr article aside audio bdi canvas data datalist details dialog figcaption figure footer header hgroup main mark meter nav output picture progress section summary template time video",version:m,shivCSS:n.shivCSS!==!1,supportsUnknownElements:l,shivMethods:n.shivMethods!==!1,type:"default",shivDocument:j,createElement:g,createDocumentFragment:h,addElements:e};a.html5=t,j(b),"object"==typeof module&&module.exports&&(module.exports=t)}("undefined"!=typeof window?window:this,document); diff --git a/site/js/jquery-3.6.0.min.js b/site/js/jquery-3.6.0.min.js new file mode 100644 index 0000000..c4c6022 --- /dev/null +++ b/site/js/jquery-3.6.0.min.js @@ -0,0 +1,2 @@ +/*! 
jQuery v3.6.0 | (c) OpenJS Foundation and other contributors | jquery.org/license */ +!function(e,t){"use strict";"object"==typeof module&&"object"==typeof module.exports?module.exports=e.document?t(e,!0):function(e){if(!e.document)throw new Error("jQuery requires a window with a document");return t(e)}:t(e)}("undefined"!=typeof window?window:this,function(C,e){"use strict";var t=[],r=Object.getPrototypeOf,s=t.slice,g=t.flat?function(e){return t.flat.call(e)}:function(e){return t.concat.apply([],e)},u=t.push,i=t.indexOf,n={},o=n.toString,v=n.hasOwnProperty,a=v.toString,l=a.call(Object),y={},m=function(e){return"function"==typeof e&&"number"!=typeof e.nodeType&&"function"!=typeof e.item},x=function(e){return null!=e&&e===e.window},E=C.document,c={type:!0,src:!0,nonce:!0,noModule:!0};function b(e,t,n){var r,i,o=(n=n||E).createElement("script");if(o.text=e,t)for(r in c)(i=t[r]||t.getAttribute&&t.getAttribute(r))&&o.setAttribute(r,i);n.head.appendChild(o).parentNode.removeChild(o)}function w(e){return null==e?e+"":"object"==typeof e||"function"==typeof e?n[o.call(e)]||"object":typeof e}var f="3.6.0",S=function(e,t){return new S.fn.init(e,t)};function p(e){var t=!!e&&"length"in e&&e.length,n=w(e);return!m(e)&&!x(e)&&("array"===n||0===t||"number"==typeof t&&0+~]|"+M+")"+M+"*"),U=new RegExp(M+"|>"),X=new RegExp(F),V=new RegExp("^"+I+"$"),G={ID:new RegExp("^#("+I+")"),CLASS:new RegExp("^\\.("+I+")"),TAG:new RegExp("^("+I+"|[*])"),ATTR:new RegExp("^"+W),PSEUDO:new RegExp("^"+F),CHILD:new RegExp("^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\("+M+"*(even|odd|(([+-]|)(\\d*)n|)"+M+"*(?:([+-]|)"+M+"*(\\d+)|))"+M+"*\\)|)","i"),bool:new RegExp("^(?:"+R+")$","i"),needsContext:new RegExp("^"+M+"*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\("+M+"*((?:-\\d)?\\d*)"+M+"*\\)|)(?=[^-]|$)","i")},Y=/HTML$/i,Q=/^(?:input|select|textarea|button)$/i,J=/^h\d$/i,K=/^[^{]+\{\s*\[native \w/,Z=/^(?:#([\w-]+)|(\w+)|\.([\w-]+))$/,ee=/[+~]/,te=new 
RegExp("\\\\[\\da-fA-F]{1,6}"+M+"?|\\\\([^\\r\\n\\f])","g"),ne=function(e,t){var n="0x"+e.slice(1)-65536;return t||(n<0?String.fromCharCode(n+65536):String.fromCharCode(n>>10|55296,1023&n|56320))},re=/([\0-\x1f\x7f]|^-?\d)|^-$|[^\0-\x1f\x7f-\uFFFF\w-]/g,ie=function(e,t){return t?"\0"===e?"\ufffd":e.slice(0,-1)+"\\"+e.charCodeAt(e.length-1).toString(16)+" ":"\\"+e},oe=function(){T()},ae=be(function(e){return!0===e.disabled&&"fieldset"===e.nodeName.toLowerCase()},{dir:"parentNode",next:"legend"});try{H.apply(t=O.call(p.childNodes),p.childNodes),t[p.childNodes.length].nodeType}catch(e){H={apply:t.length?function(e,t){L.apply(e,O.call(t))}:function(e,t){var n=e.length,r=0;while(e[n++]=t[r++]);e.length=n-1}}}function se(t,e,n,r){var i,o,a,s,u,l,c,f=e&&e.ownerDocument,p=e?e.nodeType:9;if(n=n||[],"string"!=typeof t||!t||1!==p&&9!==p&&11!==p)return n;if(!r&&(T(e),e=e||C,E)){if(11!==p&&(u=Z.exec(t)))if(i=u[1]){if(9===p){if(!(a=e.getElementById(i)))return n;if(a.id===i)return n.push(a),n}else if(f&&(a=f.getElementById(i))&&y(e,a)&&a.id===i)return n.push(a),n}else{if(u[2])return H.apply(n,e.getElementsByTagName(t)),n;if((i=u[3])&&d.getElementsByClassName&&e.getElementsByClassName)return H.apply(n,e.getElementsByClassName(i)),n}if(d.qsa&&!N[t+" "]&&(!v||!v.test(t))&&(1!==p||"object"!==e.nodeName.toLowerCase())){if(c=t,f=e,1===p&&(U.test(t)||z.test(t))){(f=ee.test(t)&&ye(e.parentNode)||e)===e&&d.scope||((s=e.getAttribute("id"))?s=s.replace(re,ie):e.setAttribute("id",s=S)),o=(l=h(t)).length;while(o--)l[o]=(s?"#"+s:":scope")+" "+xe(l[o]);c=l.join(",")}try{return H.apply(n,f.querySelectorAll(c)),n}catch(e){N(t,!0)}finally{s===S&&e.removeAttribute("id")}}}return g(t.replace($,"$1"),e,n,r)}function ue(){var r=[];return function e(t,n){return r.push(t+" ")>b.cacheLength&&delete e[r.shift()],e[t+" "]=n}}function le(e){return e[S]=!0,e}function ce(e){var t=C.createElement("fieldset");try{return!!e(t)}catch(e){return!1}finally{t.parentNode&&t.parentNode.removeChild(t),t=null}}function 
fe(e,t){var n=e.split("|"),r=n.length;while(r--)b.attrHandle[n[r]]=t}function pe(e,t){var n=t&&e,r=n&&1===e.nodeType&&1===t.nodeType&&e.sourceIndex-t.sourceIndex;if(r)return r;if(n)while(n=n.nextSibling)if(n===t)return-1;return e?1:-1}function de(t){return function(e){return"input"===e.nodeName.toLowerCase()&&e.type===t}}function he(n){return function(e){var t=e.nodeName.toLowerCase();return("input"===t||"button"===t)&&e.type===n}}function ge(t){return function(e){return"form"in e?e.parentNode&&!1===e.disabled?"label"in e?"label"in e.parentNode?e.parentNode.disabled===t:e.disabled===t:e.isDisabled===t||e.isDisabled!==!t&&ae(e)===t:e.disabled===t:"label"in e&&e.disabled===t}}function ve(a){return le(function(o){return o=+o,le(function(e,t){var n,r=a([],e.length,o),i=r.length;while(i--)e[n=r[i]]&&(e[n]=!(t[n]=e[n]))})})}function ye(e){return e&&"undefined"!=typeof e.getElementsByTagName&&e}for(e in d=se.support={},i=se.isXML=function(e){var t=e&&e.namespaceURI,n=e&&(e.ownerDocument||e).documentElement;return!Y.test(t||n&&n.nodeName||"HTML")},T=se.setDocument=function(e){var t,n,r=e?e.ownerDocument||e:p;return r!=C&&9===r.nodeType&&r.documentElement&&(a=(C=r).documentElement,E=!i(C),p!=C&&(n=C.defaultView)&&n.top!==n&&(n.addEventListener?n.addEventListener("unload",oe,!1):n.attachEvent&&n.attachEvent("onunload",oe)),d.scope=ce(function(e){return a.appendChild(e).appendChild(C.createElement("div")),"undefined"!=typeof e.querySelectorAll&&!e.querySelectorAll(":scope fieldset div").length}),d.attributes=ce(function(e){return e.className="i",!e.getAttribute("className")}),d.getElementsByTagName=ce(function(e){return e.appendChild(C.createComment("")),!e.getElementsByTagName("*").length}),d.getElementsByClassName=K.test(C.getElementsByClassName),d.getById=ce(function(e){return a.appendChild(e).id=S,!C.getElementsByName||!C.getElementsByName(S).length}),d.getById?(b.filter.ID=function(e){var t=e.replace(te,ne);return function(e){return 
e.getAttribute("id")===t}},b.find.ID=function(e,t){if("undefined"!=typeof t.getElementById&&E){var n=t.getElementById(e);return n?[n]:[]}}):(b.filter.ID=function(e){var n=e.replace(te,ne);return function(e){var t="undefined"!=typeof e.getAttributeNode&&e.getAttributeNode("id");return t&&t.value===n}},b.find.ID=function(e,t){if("undefined"!=typeof t.getElementById&&E){var n,r,i,o=t.getElementById(e);if(o){if((n=o.getAttributeNode("id"))&&n.value===e)return[o];i=t.getElementsByName(e),r=0;while(o=i[r++])if((n=o.getAttributeNode("id"))&&n.value===e)return[o]}return[]}}),b.find.TAG=d.getElementsByTagName?function(e,t){return"undefined"!=typeof t.getElementsByTagName?t.getElementsByTagName(e):d.qsa?t.querySelectorAll(e):void 0}:function(e,t){var n,r=[],i=0,o=t.getElementsByTagName(e);if("*"===e){while(n=o[i++])1===n.nodeType&&r.push(n);return r}return o},b.find.CLASS=d.getElementsByClassName&&function(e,t){if("undefined"!=typeof t.getElementsByClassName&&E)return t.getElementsByClassName(e)},s=[],v=[],(d.qsa=K.test(C.querySelectorAll))&&(ce(function(e){var t;a.appendChild(e).innerHTML="",e.querySelectorAll("[msallowcapture^='']").length&&v.push("[*^$]="+M+"*(?:''|\"\")"),e.querySelectorAll("[selected]").length||v.push("\\["+M+"*(?:value|"+R+")"),e.querySelectorAll("[id~="+S+"-]").length||v.push("~="),(t=C.createElement("input")).setAttribute("name",""),e.appendChild(t),e.querySelectorAll("[name='']").length||v.push("\\["+M+"*name"+M+"*="+M+"*(?:''|\"\")"),e.querySelectorAll(":checked").length||v.push(":checked"),e.querySelectorAll("a#"+S+"+*").length||v.push(".#.+[+~]"),e.querySelectorAll("\\\f"),v.push("[\\r\\n\\f]")}),ce(function(e){e.innerHTML="";var 
t=C.createElement("input");t.setAttribute("type","hidden"),e.appendChild(t).setAttribute("name","D"),e.querySelectorAll("[name=d]").length&&v.push("name"+M+"*[*^$|!~]?="),2!==e.querySelectorAll(":enabled").length&&v.push(":enabled",":disabled"),a.appendChild(e).disabled=!0,2!==e.querySelectorAll(":disabled").length&&v.push(":enabled",":disabled"),e.querySelectorAll("*,:x"),v.push(",.*:")})),(d.matchesSelector=K.test(c=a.matches||a.webkitMatchesSelector||a.mozMatchesSelector||a.oMatchesSelector||a.msMatchesSelector))&&ce(function(e){d.disconnectedMatch=c.call(e,"*"),c.call(e,"[s!='']:x"),s.push("!=",F)}),v=v.length&&new RegExp(v.join("|")),s=s.length&&new RegExp(s.join("|")),t=K.test(a.compareDocumentPosition),y=t||K.test(a.contains)?function(e,t){var n=9===e.nodeType?e.documentElement:e,r=t&&t.parentNode;return e===r||!(!r||1!==r.nodeType||!(n.contains?n.contains(r):e.compareDocumentPosition&&16&e.compareDocumentPosition(r)))}:function(e,t){if(t)while(t=t.parentNode)if(t===e)return!0;return!1},j=t?function(e,t){if(e===t)return l=!0,0;var n=!e.compareDocumentPosition-!t.compareDocumentPosition;return n||(1&(n=(e.ownerDocument||e)==(t.ownerDocument||t)?e.compareDocumentPosition(t):1)||!d.sortDetached&&t.compareDocumentPosition(e)===n?e==C||e.ownerDocument==p&&y(p,e)?-1:t==C||t.ownerDocument==p&&y(p,t)?1:u?P(u,e)-P(u,t):0:4&n?-1:1)}:function(e,t){if(e===t)return l=!0,0;var n,r=0,i=e.parentNode,o=t.parentNode,a=[e],s=[t];if(!i||!o)return e==C?-1:t==C?1:i?-1:o?1:u?P(u,e)-P(u,t):0;if(i===o)return pe(e,t);n=e;while(n=n.parentNode)a.unshift(n);n=t;while(n=n.parentNode)s.unshift(n);while(a[r]===s[r])r++;return r?pe(a[r],s[r]):a[r]==p?-1:s[r]==p?1:0}),C},se.matches=function(e,t){return se(e,null,null,t)},se.matchesSelector=function(e,t){if(T(e),d.matchesSelector&&E&&!N[t+" "]&&(!s||!s.test(t))&&(!v||!v.test(t)))try{var n=c.call(e,t);if(n||d.disconnectedMatch||e.document&&11!==e.document.nodeType)return n}catch(e){N(t,!0)}return 0":{dir:"parentNode",first:!0}," 
":{dir:"parentNode"},"+":{dir:"previousSibling",first:!0},"~":{dir:"previousSibling"}},preFilter:{ATTR:function(e){return e[1]=e[1].replace(te,ne),e[3]=(e[3]||e[4]||e[5]||"").replace(te,ne),"~="===e[2]&&(e[3]=" "+e[3]+" "),e.slice(0,4)},CHILD:function(e){return e[1]=e[1].toLowerCase(),"nth"===e[1].slice(0,3)?(e[3]||se.error(e[0]),e[4]=+(e[4]?e[5]+(e[6]||1):2*("even"===e[3]||"odd"===e[3])),e[5]=+(e[7]+e[8]||"odd"===e[3])):e[3]&&se.error(e[0]),e},PSEUDO:function(e){var t,n=!e[6]&&e[2];return G.CHILD.test(e[0])?null:(e[3]?e[2]=e[4]||e[5]||"":n&&X.test(n)&&(t=h(n,!0))&&(t=n.indexOf(")",n.length-t)-n.length)&&(e[0]=e[0].slice(0,t),e[2]=n.slice(0,t)),e.slice(0,3))}},filter:{TAG:function(e){var t=e.replace(te,ne).toLowerCase();return"*"===e?function(){return!0}:function(e){return e.nodeName&&e.nodeName.toLowerCase()===t}},CLASS:function(e){var t=m[e+" "];return t||(t=new RegExp("(^|"+M+")"+e+"("+M+"|$)"))&&m(e,function(e){return t.test("string"==typeof e.className&&e.className||"undefined"!=typeof e.getAttribute&&e.getAttribute("class")||"")})},ATTR:function(n,r,i){return function(e){var t=se.attr(e,n);return null==t?"!="===r:!r||(t+="","="===r?t===i:"!="===r?t!==i:"^="===r?i&&0===t.indexOf(i):"*="===r?i&&-1:\x20\t\r\n\f]*)[\x20\t\r\n\f]*\/?>(?:<\/\1>|)$/i;function j(e,n,r){return m(n)?S.grep(e,function(e,t){return!!n.call(e,t,e)!==r}):n.nodeType?S.grep(e,function(e){return e===n!==r}):"string"!=typeof n?S.grep(e,function(e){return-1)[^>]*|#([\w-]+))$/;(S.fn.init=function(e,t,n){var r,i;if(!e)return this;if(n=n||D,"string"==typeof e){if(!(r="<"===e[0]&&">"===e[e.length-1]&&3<=e.length?[null,e,null]:q.exec(e))||!r[1]&&t)return!t||t.jquery?(t||n).find(e):this.constructor(t).find(e);if(r[1]){if(t=t instanceof S?t[0]:t,S.merge(this,S.parseHTML(r[1],t&&t.nodeType?t.ownerDocument||t:E,!0)),N.test(r[1])&&S.isPlainObject(t))for(r in t)m(this[r])?this[r](t[r]):this.attr(r,t[r]);return this}return(i=E.getElementById(r[2]))&&(this[0]=i,this.length=1),this}return 
e.nodeType?(this[0]=e,this.length=1,this):m(e)?void 0!==n.ready?n.ready(e):e(S):S.makeArray(e,this)}).prototype=S.fn,D=S(E);var L=/^(?:parents|prev(?:Until|All))/,H={children:!0,contents:!0,next:!0,prev:!0};function O(e,t){while((e=e[t])&&1!==e.nodeType);return e}S.fn.extend({has:function(e){var t=S(e,this),n=t.length;return this.filter(function(){for(var e=0;e\x20\t\r\n\f]*)/i,he=/^$|^module$|\/(?:java|ecma)script/i;ce=E.createDocumentFragment().appendChild(E.createElement("div")),(fe=E.createElement("input")).setAttribute("type","radio"),fe.setAttribute("checked","checked"),fe.setAttribute("name","t"),ce.appendChild(fe),y.checkClone=ce.cloneNode(!0).cloneNode(!0).lastChild.checked,ce.innerHTML="",y.noCloneChecked=!!ce.cloneNode(!0).lastChild.defaultValue,ce.innerHTML="",y.option=!!ce.lastChild;var ge={thead:[1,"","
"],col:[2,"","
"],tr:[2,"","
"],td:[3,"","
"],_default:[0,"",""]};function ve(e,t){var n;return n="undefined"!=typeof e.getElementsByTagName?e.getElementsByTagName(t||"*"):"undefined"!=typeof e.querySelectorAll?e.querySelectorAll(t||"*"):[],void 0===t||t&&A(e,t)?S.merge([e],n):n}function ye(e,t){for(var n=0,r=e.length;n",""]);var me=/<|&#?\w+;/;function xe(e,t,n,r,i){for(var o,a,s,u,l,c,f=t.createDocumentFragment(),p=[],d=0,h=e.length;d\s*$/g;function je(e,t){return A(e,"table")&&A(11!==t.nodeType?t:t.firstChild,"tr")&&S(e).children("tbody")[0]||e}function De(e){return e.type=(null!==e.getAttribute("type"))+"/"+e.type,e}function qe(e){return"true/"===(e.type||"").slice(0,5)?e.type=e.type.slice(5):e.removeAttribute("type"),e}function Le(e,t){var n,r,i,o,a,s;if(1===t.nodeType){if(Y.hasData(e)&&(s=Y.get(e).events))for(i in Y.remove(t,"handle events"),s)for(n=0,r=s[i].length;n").attr(n.scriptAttrs||{}).prop({charset:n.scriptCharset,src:n.url}).on("load error",i=function(e){r.remove(),i=null,e&&t("error"===e.type?404:200,e.type)}),E.head.appendChild(r[0])},abort:function(){i&&i()}}});var _t,zt=[],Ut=/(=)\?(?=&|$)|\?\?/;S.ajaxSetup({jsonp:"callback",jsonpCallback:function(){var e=zt.pop()||S.expando+"_"+wt.guid++;return this[e]=!0,e}}),S.ajaxPrefilter("json jsonp",function(e,t,n){var r,i,o,a=!1!==e.jsonp&&(Ut.test(e.url)?"url":"string"==typeof e.data&&0===(e.contentType||"").indexOf("application/x-www-form-urlencoded")&&Ut.test(e.data)&&"data");if(a||"jsonp"===e.dataTypes[0])return r=e.jsonpCallback=m(e.jsonpCallback)?e.jsonpCallback():e.jsonpCallback,a?e[a]=e[a].replace(Ut,"$1"+r):!1!==e.jsonp&&(e.url+=(Tt.test(e.url)?"&":"?")+e.jsonp+"="+r),e.converters["script json"]=function(){return o||S.error(r+" was not called"),o[0]},e.dataTypes[0]="json",i=C[r],C[r]=function(){o=arguments},n.always(function(){void 0===i?S(C).removeProp(r):C[r]=i,e[r]&&(e.jsonpCallback=t.jsonpCallback,zt.push(r)),o&&m(i)&&i(o[0]),o=i=void 0}),"script"}),y.createHTMLDocument=((_t=E.implementation.createHTMLDocument("").body).innerHTML="
",2===_t.childNodes.length),S.parseHTML=function(e,t,n){return"string"!=typeof e?[]:("boolean"==typeof t&&(n=t,t=!1),t||(y.createHTMLDocument?((r=(t=E.implementation.createHTMLDocument("")).createElement("base")).href=E.location.href,t.head.appendChild(r)):t=E),o=!n&&[],(i=N.exec(e))?[t.createElement(i[1])]:(i=xe([e],t,o),o&&o.length&&S(o).remove(),S.merge([],i.childNodes)));var r,i,o},S.fn.load=function(e,t,n){var r,i,o,a=this,s=e.indexOf(" ");return-1").append(S.parseHTML(e)).find(r):e)}).always(n&&function(e,t){a.each(function(){n.apply(this,o||[e.responseText,t,e])})}),this},S.expr.pseudos.animated=function(t){return S.grep(S.timers,function(e){return t===e.elem}).length},S.offset={setOffset:function(e,t,n){var r,i,o,a,s,u,l=S.css(e,"position"),c=S(e),f={};"static"===l&&(e.style.position="relative"),s=c.offset(),o=S.css(e,"top"),u=S.css(e,"left"),("absolute"===l||"fixed"===l)&&-1<(o+u).indexOf("auto")?(a=(r=c.position()).top,i=r.left):(a=parseFloat(o)||0,i=parseFloat(u)||0),m(t)&&(t=t.call(e,n,S.extend({},s))),null!=t.top&&(f.top=t.top-s.top+a),null!=t.left&&(f.left=t.left-s.left+i),"using"in t?t.using.call(e,f):c.css(f)}},S.fn.extend({offset:function(t){if(arguments.length)return void 0===t?this:this.each(function(e){S.offset.setOffset(this,t,e)});var e,n,r=this[0];return r?r.getClientRects().length?(e=r.getBoundingClientRect(),n=r.ownerDocument.defaultView,{top:e.top+n.pageYOffset,left:e.left+n.pageXOffset}):{top:0,left:0}:void 0},position:function(){if(this[0]){var 
e,t,n,r=this[0],i={top:0,left:0};if("fixed"===S.css(r,"position"))t=r.getBoundingClientRect();else{t=this.offset(),n=r.ownerDocument,e=r.offsetParent||n.documentElement;while(e&&(e===n.body||e===n.documentElement)&&"static"===S.css(e,"position"))e=e.parentNode;e&&e!==r&&1===e.nodeType&&((i=S(e).offset()).top+=S.css(e,"borderTopWidth",!0),i.left+=S.css(e,"borderLeftWidth",!0))}return{top:t.top-i.top-S.css(r,"marginTop",!0),left:t.left-i.left-S.css(r,"marginLeft",!0)}}},offsetParent:function(){return this.map(function(){var e=this.offsetParent;while(e&&"static"===S.css(e,"position"))e=e.offsetParent;return e||re})}}),S.each({scrollLeft:"pageXOffset",scrollTop:"pageYOffset"},function(t,i){var o="pageYOffset"===i;S.fn[t]=function(e){return $(this,function(e,t,n){var r;if(x(e)?r=e:9===e.nodeType&&(r=e.defaultView),void 0===n)return r?r[i]:e[t];r?r.scrollTo(o?r.pageXOffset:n,o?n:r.pageYOffset):e[t]=n},t,e,arguments.length)}}),S.each(["top","left"],function(e,n){S.cssHooks[n]=Fe(y.pixelPosition,function(e,t){if(t)return t=We(e,n),Pe.test(t)?S(e).position()[n]+"px":t})}),S.each({Height:"height",Width:"width"},function(a,s){S.each({padding:"inner"+a,content:s,"":"outer"+a},function(r,o){S.fn[o]=function(e,t){var n=arguments.length&&(r||"boolean"!=typeof e),i=r||(!0===e||!0===t?"margin":"border");return $(this,function(e,t,n){var r;return x(e)?0===o.indexOf("outer")?e["inner"+a]:e.document.documentElement["client"+a]:9===e.nodeType?(r=e.documentElement,Math.max(e.body["scroll"+a],r["scroll"+a],e.body["offset"+a],r["offset"+a],r["client"+a])):void 0===n?S.css(e,t,i):S.style(e,t,n,i)},s,n?e:void 0,n)}})}),S.each(["ajaxStart","ajaxStop","ajaxComplete","ajaxError","ajaxSuccess","ajaxSend"],function(e,t){S.fn[t]=function(e){return this.on(t,e)}}),S.fn.extend({bind:function(e,t,n){return this.on(e,null,t,n)},unbind:function(e,t){return this.off(e,null,t)},delegate:function(e,t,n,r){return this.on(t,e,n,r)},undelegate:function(e,t,n){return 
1===arguments.length?this.off(e,"**"):this.off(t,e||"**",n)},hover:function(e,t){return this.mouseenter(e).mouseleave(t||e)}}),S.each("blur focus focusin focusout resize scroll click dblclick mousedown mouseup mousemove mouseover mouseout mouseenter mouseleave change select submit keydown keypress keyup contextmenu".split(" "),function(e,n){S.fn[n]=function(e,t){return 0"),n("table.docutils.footnote").wrap("
"),n("table.docutils.citation").wrap("
"),n(".wy-menu-vertical ul").not(".simple").siblings("a").each((function(){var t=n(this);expand=n(''),expand.on("click",(function(n){return e.toggleCurrent(t),n.stopPropagation(),!1})),t.prepend(expand)}))},reset:function(){var n=encodeURI(window.location.hash)||"#";try{var e=$(".wy-menu-vertical"),t=e.find('[href="'+n+'"]');if(0===t.length){var i=$('.document [id="'+n.substring(1)+'"]').closest("div.section");0===(t=e.find('[href="#'+i.attr("id")+'"]')).length&&(t=e.find('[href="#"]'))}if(t.length>0){$(".wy-menu-vertical .current").removeClass("current").attr("aria-expanded","false"),t.addClass("current").attr("aria-expanded","true"),t.closest("li.toctree-l1").parent().addClass("current").attr("aria-expanded","true");for(let n=1;n<=10;n++)t.closest("li.toctree-l"+n).addClass("current").attr("aria-expanded","true");t[0].scrollIntoView()}}catch(n){console.log("Error expanding nav for anchor",n)}},onScroll:function(){this.winScroll=!1;var n=this.win.scrollTop(),e=n+this.winHeight,t=this.navBar.scrollTop()+(n-this.winPosition);n<0||e>this.docHeight||(this.navBar.scrollTop(t),this.winPosition=n)},onResize:function(){this.winResize=!1,this.winHeight=this.win.height(),this.docHeight=$(document).height()},hashChange:function(){this.linkScroll=!0,this.win.one("hashchange",(function(){this.linkScroll=!1}))},toggleCurrent:function(n){var e=n.closest("li");e.siblings("li.current").removeClass("current").attr("aria-expanded","false"),e.siblings().find("li.current").removeClass("current").attr("aria-expanded","false");var t=e.find("> ul li");t.length&&(t.removeClass("current").attr("aria-expanded","false"),e.toggleClass("current").attr("aria-expanded",(function(n,e){return"true"==e?"false":"true"})))}},"undefined"!=typeof window&&(window.SphinxRtdTheme={Navigation:n.exports.ThemeNav,StickyNav:n.exports.ThemeNav}),function(){for(var n=0,e=["ms","moz","webkit","o"],t=0;t + + + + + + + Downloading data - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Data Download

+ +
+

Introduction

+

For this tutorial, we will use a pooled scRNAseq dataset produced by Jerber et al. This pool contains induced pluripotent stem cell (iPSC) lines from 9 healthy controls that were differentiated towards a dopaminergic neuron state.

+

In this section of the tutorial, we will:

+
    +
  1. Download and process the pooled scRNAseq data with the CellRanger count pipeline
  2. Download and process the sample genotype data
  3. Download reference genotype data
  4. Download a reference genome file
+

Before we begin, we will create a designated folder for the Ensemblex tutorial:

+
mkdir ensemblex_tutorial
+cd ensemblex_tutorial
+
+
+

Downloading and processing scRNAseq data

+

We will begin by downloading the pooled scRNAseq data from the Sequence Read Archive (SRA):

+
## Create a folder to place pooled scRNAseq data
+mkdir pooled_scRNAseq
+cd pooled_scRNAseq
+
+
+## Download pooled scRNAseq FASTQ files
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_1.fastq.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_2.fastq.gz
+
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_1.fastq.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_2.fastq.gz
+
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_1.fastq.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_2.fastq.gz
+
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_1.fastq.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_2.fastq.gz
+
+
+## Rename pooled scRNAseq FASTQ files
+mv ERR4700019_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R1_001.fastq.gz 
+mv ERR4700019_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R2_001.fastq.gz 
+
+mv ERR4700020_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R1_001.fastq.gz 
+mv ERR4700020_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R2_001.fastq.gz
+
+mv ERR4700021_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R1_001.fastq.gz
+mv ERR4700021_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R2_001.fastq.gz
+
+mv ERR4700022_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R1_001.fastq.gz
+mv ERR4700022_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R2_001.fastq.gz
+
+
+
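
The renaming step matters because CellRanger locates FASTQ files by name. As a minimal sketch, assuming the Illumina-style pattern `<sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz`, the check below flags any file in the current folder that does not follow it. The function name `is_cellranger_fastq_name` is hypothetical and not part of the Ensemblex pipeline.

```shell
# Return 0 if a file name follows the assumed Illumina-style pattern
# <sample>_S<n>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz, and 1 otherwise.
is_cellranger_fastq_name() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Za-z0-9_-]+_S[0-9]+_L[0-9]{3}_[RI][12]_001\.fastq\.gz$'
}

# Report every FASTQ file in the current directory that breaks the pattern.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue      # no FASTQ files present: nothing to check
    is_cellranger_fastq_name "$f" || echo "Unexpected FASTQ name: $f"
done
```

Running this inside `~/ensemblex_tutorial/pooled_scRNAseq` after the `mv` commands should print nothing if all eight files were renamed.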

Next, we will process the pooled scRNAseq data with the CellRanger count pipeline:

+
## Create CellRanger directory
+cd ~/ensemblex_tutorial
+mkdir CellRanger
+cd CellRanger
+
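+## Run cellranger count on the renamed FASTQ files.
+## $HOME is used instead of ~ because ~ is not expanded after "=" in an option.
+cellranger count \
+--id=pool \
+--fastqs="$HOME/ensemblex_tutorial/pooled_scRNAseq" \
+--sample=pool \
+--transcriptome="$HOME/10xGenomics/refdata-cellranger-GRCh37"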
+
+

If the CellRanger count pipeline completes successfully, it will have generated the following files, which we will use for genetic demultiplexing downstream:

+
    +
  • possorted_genome_bam.bam
  • possorted_genome_bam.bam.bai
  • barcodes.tsv
+
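
Before moving on, it can help to confirm that these files exist and are not empty. The helper below is a minimal sketch: `report_missing` is a hypothetical name, and the directory passed to it is whatever folder holds the CellRanger outputs on your system.

```shell
# Print each expected file that is missing or empty under a directory.
# Usage: report_missing <directory> <file>...
# Returns 0 when every file is present and non-empty, 1 otherwise.
report_missing() {
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name"
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical call; the outs/ location is an assumption about the output layout:
# report_missing "$HOME/ensemblex_tutorial/CellRanger/pool/outs" \
#     possorted_genome_bam.bam possorted_genome_bam.bam.bai
```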

NOTE: For more information regarding the CellRanger count pipeline, please see the 10X documentation.

+
+

Downloading sample genotype data

+

Next, we will download the whole-exome .vcf files corresponding to the nine pooled individuals from which the iPSC lines were derived. We will download the .vcf files from the European Nucleotide Archive (ENA):

+
## Create a folder to place sample genotype data
+cd ~/ensemblex_tutorial
+mkdir sample_genotype
+cd sample_genotype
+
+## HPSI0115i-hecn_6
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi
+
+## HPSI0214i-pelm_3 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz.tbi
+
+## HPSI0314i-sojd_3 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz.tbi
+
+## HPSI0414i-sebn_3 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi
+
+## HPSI0514i-uenn_3 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi
+
+## HPSI0714i-pipw_4 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi
+
+## HPSI0715i-meue_5 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi
+
+## HPSI0914i-vaka_5 
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi
+
+## HPSI1014i-quls_2
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz
+wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi
+
+

Upon downloading the individual genotype data, we will merge the individual files to generate a single .vcf file.

+
## Merge .vcf files
+module load bcftools
+bcftools merge *.vcf.gz > sample_genotype_merge.vcf
+
+

The resulting sample_genotype_merge.vcf file will be used as prior genotype information for genetic demultiplexing downstream.
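
A quick sanity check on the merged file is to count its sample columns. The sketch below reads the `#CHROM` header line of an uncompressed VCF, where the first nine columns are fixed and every later column is one sample. The function name `count_vcf_samples` is hypothetical.

```shell
# Count the sample columns declared on the #CHROM header line of an
# uncompressed VCF. Columns 1-9 (CHROM ... FORMAT) are fixed; the rest are samples.
count_vcf_samples() {
    awk -F '\t' '/^#CHROM/ { print (NF > 9 ? NF - 9 : 0); exit }' "$1"
}

# Example call for this tutorial:
# count_vcf_samples sample_genotype_merge.vcf
```

If each of the nine downloaded files holds a single sample, the merged file would be expected to report nine samples.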

+
+

Downloading reference genotype data

+

Next, we will download a reference genotype file from the 1000 Genomes Project, Phase 3:

+
## Create a folder to place the reference files
+cd ~/ensemblex_tutorial
+mkdir reference_files
+cd reference_files
+
+## Download reference .vcf
+wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz
+wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz.tbi
+
+## Unzip .vcf file
+gunzip ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz
+
+## Only keep SNPs
+module load vcftools
+vcftools --vcf ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf --remove-indels --recode --recode-INFO-all --out SNPs_only
+
+
+## Only keep common variants
+module load bcftools
+bcftools filter -e 'AF<0.01' SNPs_only.recode.vcf > common_SNPs_only.recode.vcf
+
+

The resulting common_SNPs_only.recode.vcf file will be used as reference genotype data for genetic demultiplexing downstream.

+
+

Downloading genome reference file

+

Finally, we will prepare a reference genome. For our tutorial we will use the GRCh37 10X reference genome. For information regarding references, see the 10X documentation.

+
## Copy pre-built reference genome to working directory
+cp /cvmfs/soft.mugqic/CentOS6/genomes/species/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa ~/ensemblex_tutorial/reference_files
+
+

We will use the genome.fa reference genome for genetic demultiplexing downstream.

+
+

To run the Ensemblex pipeline on the downloaded data, please see the Ensemblex with prior genotype information section of the tutorial.

+ +
+
+ +
+
+ +
+ +
+ +
+ + + + « Previous + + + Next » + + +
+ + + + + + + + + diff --git a/site/outputs/index.html b/site/outputs/index.html new file mode 100644 index 0000000..7169a7f --- /dev/null +++ b/site/outputs/index.html @@ -0,0 +1,427 @@ + + + + + + + + Ensemblex outputs - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Ensemblex algorithm outputs

+ +
+

Introduction

+

After applying the Ensemblex algorithm to the output files of the constituent genetic demultiplexing tools in Step 4, the ~/working_directory/ensemblex folder will have the following structure:

+
working_directory
+└── ensemblex
+    ├── constituent_tool_merge.csv
+    ├── step1
+    ├── step2
+    ├── step3
+    └── confidence
+
+
    +
  • constituent_tool_merge.csv contains the merged outputs from each constituent genetic demultiplexing tool.
  • step1/ contains the outputs from Step 1: accuracy-weighted probabilistic ensemble.
  • step2/ contains the outputs from Step 2: graph-based doublet detection.
  • step3/ contains the outputs from Step 3: ensemble-independent doublet detection.
  • confidence/ contains the final Ensemblex output file, whose sample labels have been annotated with the Ensemblex singlet confidence score.
+

Note: If users re-run a step of the Ensemblex workflow, the outputs from the previous run will automatically be overwritten. If you do not want to lose the outputs from a previous run, it is important to copy the materials to a separate directory.

+
+

Outputs

+

Merging constituent output files

+

Ensemblex begins by merging the output files of the constituent genetic demultiplexing tools by cell barcode, which produces the constituent_tool_merge.csv file. In this file, each constituent genetic demultiplexing tool has two columns corresponding to its sample labels:

+
    +
  • demuxalot_assignment
  • demuxalot_best_assignment
  • demuxlet_assignment
  • demuxlet_best_assignment
  • souporcell_assignment
  • souporcell_best_assignment
  • vireo_assignment
  • vireo_best_assignment
+

Taking Vireo as an example, vireo_assignment shows Vireo's sample labels after applying its recommended probability threshold; thus, cells that do not meet the threshold are labeled as "unassigned". In turn, vireo_best_assignment shows Vireo's best-guess assignments without applying the recommended probability threshold; thus, cells that do not meet the threshold still show their best sample label and are not labeled as "unassigned".

+

The constituent_tool_merge.csv file also contains a general_consensus column. These are not Ensemblex's sample labels. The general_consensus column simply shows the sample labels that result from a majority vote classifier; split decisions are labeled as "unassigned".
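
To make the idea of a majority vote concrete, here is a minimal sketch for a single cell. It assumes that the most frequent label wins and that only an exact tie for first place counts as a split decision; the rule Ensemblex actually applies may differ. The function name `consensus_label` is hypothetical.

```shell
# Print the label with strictly more votes than any other label,
# or "unassigned" when the highest vote count is tied.
# Usage: consensus_label <label>...   (one label per constituent tool)
consensus_label() {
    printf '%s\n' "$@" | awk '
        { votes[$0]++ }
        END {
            best = "unassigned"; top = 0; tied = 0
            for (label in votes) {
                if (votes[label] > top)       { top = votes[label]; best = label; tied = 0 }
                else if (votes[label] == top) { tied = 1 }
            }
            print (tied ? "unassigned" : best)
        }'
}

consensus_label sample1 sample1 sample1 sample2   # three of four tools agree
consensus_label sample1 sample1 sample2 sample2   # two-versus-two split
```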

+
+

Step 1: Accuracy-weighted probabilistic ensemble

+

After running Step 1 of the Ensemblex algorithm, the step1/ folder will contain the following files:

+
working_directory
+└── ensemblex
+    └── step1
+        ├── ARI_demultiplexing_tools.pdf
+        ├── BA_demultiplexing_tools.pdf
+        ├── Balanced_accuracy_summary.csv
+        └── Step1_cell_assignment.csv
+
| Output type | Name | Description |
| --- | --- | --- |
| Figure | ARI_demultiplexing_tools.pdf | Heatmap showing the Adjusted Rand Index (ARI) between the sample labels of the constituent genetic demultiplexing tools. |
| Figure | BA_demultiplexing_tools.pdf | Barplot showing the estimated balanced accuracy for each constituent genetic demultiplexing tool. |
| File | Balanced_accuracy_summary.csv | Summary file describing the estimated balanced accuracy computation for each constituent genetic demultiplexing tool. |
| File | Step1_cell_assignment.csv | Data file containing Ensemblex's sample labels after Step 1: accuracy-weighted probabilistic ensemble. |
+

The Step1_cell_assignment.csv file contains the following important columns:

+
    +
  • ensemblex_assignment: Ensemblex sample labels after performing accuracy-weighted probabilistic ensemble.
  • ensemblex_probability: Accuracy-weighted ensemble probability corresponding to Ensemblex's sample labels.
+

NOTE: Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.

+
+

Step 2: Graph-based doublet detection

+

After running Step 2 of the Ensemblex algorithm, the step2/ folder will contain the following files:

+
working_directory
+└── ensemblex
+    └── step2
+        ├── optimal_nCD.pdf
+        ├── optimal_pT.pdf
+        ├── PC1_var_contrib.pdf
+        ├── PC2_var_contrib.pdf
+        ├── PCA1_graph_based_doublet_detection.pdf
+        ├── PCA2_graph_based_doublet_detection.pdf
+        ├── PCA3_graph_based_doublet_detection.pdf
+        ├── PCA_plot.pdf
+        ├── PCA_scree_plot.pdf
+        └── Step2_cell_assignment.csv
+
| Output type | Name | Description |
| --- | --- | --- |
| Figure | optimal_nCD.pdf | Dot plot showing the optimal nCD value. |
| Figure | optimal_pT.pdf | Dot plot showing the optimal pT value. |
| Figure | PC1_var_contrib.pdf | Bar plot showing the contribution of each variable to the variation across the first principal component. |
| Figure | PC2_var_contrib.pdf | Bar plot showing the contribution of each variable to the variation across the second principal component. |
| Figure | PCA1_graph_based_doublet_detection.pdf | PCA showing Ensemblex sample labels (singlet or doublet) prior to performing graph-based doublet detection. |
| Figure | PCA2_graph_based_doublet_detection.pdf | PCA showing the cells identified as the n most confident doublets in the pool. |
| Figure | PCA3_graph_based_doublet_detection.pdf | PCA showing Ensemblex sample labels (singlet or doublet) after performing graph-based doublet detection. |
| Figure | PCA_plot.pdf | PCA of pooled cells. |
| Figure | PCA_scree_plot.pdf | Bar plot showing the variance explained by each principal component. |
| File | Step2_cell_assignment.csv | Data file containing Ensemblex's sample labels after Step 2: graph-based doublet detection. |
+

The Step2_cell_assignment.csv file contains the following important column:

+
    +
  • ensemblex_assignment: Ensemblex sample labels after performing graph-based doublet detection.
+

NOTE: Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.

+
+

Step 3: Ensemble-independent doublet detection

+

After running Step 3 of the Ensemblex algorithm, the step3/ folder will contain the following files:

+
working_directory
+└── ensemblex
+    └── step3
+        ├── Doublet_overlap_no_threshold.pdf
+        ├── Doublet_overlap_threshold.pdf
+        ├── Number_ensemblex_doublets_EID_no_threshold.pdf
+        ├── Number_ensemblex_doublets_EID_threshold.pdf
+        └── Step3_cell_assignment.csv
+
+
| Output type | Name | Description |
| --- | --- | --- |
| Figure | Doublet_overlap_no_threshold.pdf | Proportion of doublet calls overlapping between constituent genetic demultiplexing tools without applying assignment probability thresholds. |
| Figure | Doublet_overlap_threshold.pdf | Proportion of doublet calls overlapping between constituent genetic demultiplexing tools after applying assignment probability thresholds. |
| Figure | Number_ensemblex_doublets_EID_no_threshold.pdf | Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, without applying assignment probability thresholds. |
| Figure | Number_ensemblex_doublets_EID_threshold.pdf | Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, after applying assignment probability thresholds. |
| File | Step3_cell_assignment.csv | Data file containing Ensemblex's sample labels after Step 3: ensemble-independent doublet detection. |
+

The Step3_cell_assignment.csv file contains the following important column:

+
    +
  • ensemblex_assignment: Ensemblex sample labels after performing ensemble-independent doublet detection.
+

NOTE: Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.

+
+

Singlet confidence score

+

After computing the Ensemblex singlet confidence score, the confidence/ folder will contain the following file:

+
working_directory
+└── ensemblex
+    └── confidence
+        └── ensemblex_final_cell_assignment.csv
+
+
+
| Output type | Name | Description |
| --- | --- | --- |
| File | ensemblex_final_cell_assignment.csv | Data file containing Ensemblex's final sample labels after computing the singlet confidence score. |
+

The ensemblex_final_cell_assignment.csv file contains the following important columns:

+
    +
  • ensemblex_assignment: Ensemblex sample labels after applying the recommended singlet confidence score threshold; singlets with a confidence score < 1 are labeled as "unassigned".
  • ensemblex_best_assignment: Ensemblex's best-guess assignments without applying the recommended confidence score threshold; singlets with a confidence score < 1 still show their best sample label and are not labeled as "unassigned".
  • ensemblex_singlet_confidence: Ensemblex singlet confidence score.
+

NOTE: We recommend using the sample labels from ensemblex_assignment for downstream analyses.
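
As a rough illustration of how the final file could be summarised, the sketch below tallies the values of one named column. It assumes a plain comma-separated file with an unquoted header row and no commas inside fields, which may not match the real file exactly. The function name `tally_column` is hypothetical.

```shell
# Tally the values of one named column in a simple comma-separated file,
# printing "<value>,<count>" sorted by value.
# Usage: tally_column <file.csv> <column_name>
tally_column() {
    awk -F ',' -v col="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { count[$idx]++ }
        END { for (v in count) print v "," count[v] }' "$1" | sort
}

# Example call:
# tally_column ensemblex_final_cell_assignment.csv ensemblex_assignment
```

The output gives the number of cells per sample label, including how many were labeled as doublets or "unassigned".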

+ +
+
+ +
+
+ +
+ +
+ +
+ + + + « Previous + + + Next » + + +
+ + + + + + + + + diff --git a/site/overview/index.html b/site/overview/index.html new file mode 100644 index 0000000..18a7031 --- /dev/null +++ b/site/overview/index.html @@ -0,0 +1,267 @@ + + + + + + + + Ensemblex algorithm overview - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Ensemblex algorithm overview

+ +
+

Workflow

+

The Ensemblex workflow begins by demultiplexing pooled cells with each of its constituent tools: Demuxalot, Demuxlet, Souporcell, and Vireo-GT when prior genotype information is available, or Demuxalot, Freemuxlet, Souporcell, and Vireo when it is not.

+

+ +

+ +

Figure 1. Input into the Ensemblex framework. The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework.

+

Upon demultiplexing pools with each individual constituent genetic demultiplexing tool, Ensemblex processes the outputs in a three-step pipeline:

+ +

+ +

+ +

Figure 2. Overview of the three-step Ensemblex framework. The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection.

+

For demonstration purposes throughout this section, we used simulated pools with known ground-truth sample labels that were generated with 80 independently sequenced induced pluripotent stem cell (iPSC) lines from individuals with Parkinson's disease and neurologically healthy controls. The lines were differentiated towards a dopaminergic cell fate as part of the Foundational Data Initiative for Parkinson's disease (FOUNDIN-PD; Bressan et al.).

+
+

Step 1: Accuracy-weighted probabilistic ensemble

+

The accuracy-weighted probabilistic ensemble component of the Ensemblex framework uses an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool’s assignment probability distribution by its estimated balanced accuracy for the dataset, in a framework largely inspired by the work of Large et al. To estimate the balanced accuracy of a particular constituent tool (e.g. Demuxalot) for real-world datasets lacking ground-truth labels, Ensemblex uses the cells with a consensus assignment across the three remaining tools (e.g. Demuxlet, Souporcell, and Vireo-GT) as a proxy for ground truth. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell.
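
The weighting idea can be sketched for a single cell as follows. This is a simplified illustration and not the published formula: it assumes each tool reports one label with one probability, multiplies that probability by the tool's estimated balanced accuracy, and sums the products per label. The function name `weighted_label` and the input layout are hypothetical.

```shell
# Read "<tool> <balanced_accuracy> <label> <probability>" rows for ONE cell
# on stdin, sum accuracy * probability per label, and print the label with
# the largest total.
weighted_label() {
    awk '
        { score[$3] += $2 * $4 }
        END {
            best = ""; top = -1
            for (label in score)
                if (score[label] > top) { top = score[label]; best = label }
            print best
        }'
}

# Toy values: sample1 scores 0.9*0.8 + 0.7*0.6 = 1.14, sample2 scores 0.6*0.9 + 0.8*0.5 = 0.94.
printf '%s\n' \
    'demuxalot 0.9 sample1 0.8' \
    'demuxlet 0.6 sample2 0.9' \
    'souporcell 0.7 sample1 0.6' \
    'vireo 0.8 sample2 0.5' | weighted_label
```

In this toy case the two tools with higher estimated accuracy outweigh a confident call from a less accurate tool.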

+

+ +

+ +

Figure 3. Graphical representation of the accuracy-weighted probabilistic ensemble component of the Ensemblex framework.

+
+

Step 2: Graph-based doublet detection

+

The graph-based doublet detection component of the Ensemblex framework was implemented to identify doublets that are incorrectly labeled as singlets by the accuracy-weighted probabilistic ensemble component (Step 1). To demonstrate Step 2 of the Ensemblex framework, we used a simulated pool comprising 24 pooled samples, 17,384 cells, and a 15% doublet rate.

+

+ +

+ +

Figure 4. Graphical representation of the graph-based doublet detection component of the Ensemblex framework.

+

The graph-based doublet detection component begins by leveraging select variables returned from each constituent tool:

+
    +
  1. Demuxalot: doublet probability;
  2. Demuxlet/Freemuxlet: singlet log likelihood – doublet log likelihood;
  3. Demuxlet/Freemuxlet: number of single nucleotide polymorphisms (SNP) per cell;
  4. Demuxlet/Freemuxlet: number of reads per cell;
  5. Souporcell: doublet log probability;
  6. Vireo: doublet probability;
  7. Vireo: doublet log likelihood ratio.
+

+ +

+ +

Figure 5. Select variables returned by the constituent genetic demultiplexing tools used for graph-based doublet detection.

+

Using these variables, Ensemblex screens each pooled cell to identify the n most confident doublets in the pool and performs a principal component analysis (PCA).

+

+ +

+ +

Figure 6. PCA of pooled cells using select variables returned by the constituent genetic demultiplexing tools. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting the n most confident doublets identified by Ensemblex.

+

The PCA embedding is then converted into a Euclidean distance matrix, and each cell is assigned a percentile rank based on its distance to each confident doublet. After performing an automated parameter sweep, Ensemblex labels as doublets the droplets that appear most frequently amongst the nearest neighbours of confident doublets.
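
The nearest-neighbour part of this step can be illustrated with a small sketch. It only shows how cells could be ranked by Euclidean distance to one confident doublet in a two-dimensional embedding; the percentile ranking and parameter sweep used by Ensemblex are not reproduced here. The function name `nearest_cells` and the input layout are hypothetical.

```shell
# Read "<cell> <PC1> <PC2>" rows on stdin and print the k cells closest
# (Euclidean distance in the two-dimensional embedding) to the point (x, y).
# Usage: nearest_cells <x> <y> <k>
nearest_cells() {
    awk -v x="$1" -v y="$2" \
        '{ printf "%.6f %s\n", sqrt(($2 - x)^2 + ($3 - y)^2), $1 }' |
        sort -n | head -n "$3" | awk '{ print $2 }'
}

# Toy embedding: the two cells nearest to a confident doublet at (0, 0).
printf '%s\n' 'cell1 0 0' 'cell2 1 1' 'cell3 5 5' | nearest_cells 0 0 2
```

Repeating this for every confident doublet and counting how often each cell appears would give the kind of frequency the paragraph above describes.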

+

+ +

+ +

Figure 7. PCA of pooled cells labeled according to Ensemblex labels prior to and after graph-based doublet detection. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting Ensemblex's labels prior to graph-based doublet detection. C) PCA highlighting Ensemblex's labels after graph-based doublet detection.

+
+

Step 3: Ensemble-independent doublet detection

+

The ensemble-independent doublet detection component of the Ensemblex framework was implemented to further improve Ensemblex's ability to identify doublets. Benchmarking on simulated pools with known ground-truth sample labels revealed that certain genetic demultiplexing tools, namely Demuxalot and Vireo, showed high doublet detection specificity.

+

+ +

+ +

Figure 8. Constituent genetic demultiplexing tool doublet specificity on computationally multiplexed pools with ground truth sample labels. Doublet specificity was evaluated on pools ranging in size from 4 to 80 multiplexed samples.

+

However, Steps 1 and 2 of the Ensemblex workflow failed to correctly label a subset of doublet calls by these tools. To mitigate this issue and maximize the rate of doublet identification, Ensemblex labels the cells that are identified as doublets by Vireo or Demuxalot as doublets, by default; however, users can nominate different tools for the ensemble-independent doublet detection component depending on the desired doublet detection stringency.
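
The rule described above amounts to a union of doublet calls. As a minimal sketch, assuming one row per cell with the ensemble label followed by the labels of the nominated tools, a cell becomes a doublet if any nominated tool calls it one. The function name `apply_nominated_doublets` and the input layout are hypothetical.

```shell
# Read "<barcode> <ensemble_label> <tool_label>..." rows on stdin and print
# "<barcode> <final_label>". The final label is "doublet" if any nominated
# tool column says "doublet"; otherwise the ensemble label is kept.
apply_nominated_doublets() {
    awk '{
        final = $2
        for (i = 3; i <= NF; i++) if ($i == "doublet") final = "doublet"
        print $1, final
    }'
}

# Toy rows: columns 3 and 4 stand in for the Vireo and Demuxalot calls.
printf '%s\n' 'AAAC sample1 singlet doublet' 'CCCG sample2 singlet singlet' |
    apply_nominated_doublets
```

Nominating more tools widens the union and so labels more cells as doublets, which is the stringency trade-off mentioned above.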

+

+ +

+ +

Figure 9. Graphical representation of the ensemble-independent doublet detection component of the Ensemblex framework.

+
+

Contribution of each step to overall demultiplexing accuracy

+

We sequentially applied each step of the Ensemblex framework to 96 computationally multiplexed pools with known ground truth sample labels ranging in size from 4 to 80 samples. The proportion of correctly classified singlets and doublets identified by Ensemblex after each step of the framework is shown in Figure 10.

+

+ +

+ +

Figure 10. Contribution of each component of the Ensemblex framework to demultiplexing accuracy. The average proportion of correctly classified A) singlets and B) doublets across replicates at a given pool size is shown after sequentially applying each step of the Ensemblex framework. The right panels show the average proportion of correct classifications across all 96 pools. The blue points show the proportion of cells that were correctly classified by at least one tool: Demuxalot, Demuxlet, Souporcell, or Vireo.

+
+

For the detailed methodology, please see our pre-print manuscript.

+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/overview_pipeline/index.html b/site/overview_pipeline/index.html new file mode 100644 index 0000000..e90f417 --- /dev/null +++ b/site/overview_pipeline/index.html @@ -0,0 +1,181 @@ + + + + + + + + Ensemblex pipeline overview - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Ensemblex pipeline overview

+

The Ensemblex pipeline was developed to facilitate the application of each of Ensemblex's constituent demultiplexing tools and seamlessly integrate the output files into the Ensemblex framework. We provide two distinct, yet highly comparable pipelines:

+
  1. Demultiplexing with prior genotype information
  2. Demultiplexing without prior genotype information
+

Both pipelines comprise four distinct steps:

+
  1. Selection of Ensemblex pipeline and establishing the working directory (Set up)
  2. Prepare input files for constituent genetic demultiplexing tools
  3. Genetic demultiplexing by constituent demultiplexing tools
  4. Application of the Ensemblex framework
+

+ +

+ +

Each step of the pipeline is comprehensively described in the following sections of the Ensemblex documentation.

+
+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/reference/index.html b/site/reference/index.html new file mode 100644 index 0000000..4aebd28 --- /dev/null +++ b/site/reference/index.html @@ -0,0 +1,653 @@ + + + + + + + + Execution parameters - Ensemblex + + + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ +

Adjustable execution parameters for the Ensemblex pipeline

+ +
+

Introduction

+

Prior to running the Ensemblex pipeline, users should modify the execution parameters for the constituent genetic demultiplexing tools and the Ensemblex algorithm. Upon running Step 1: Set up, a /job_info folder will be created in the working directory. Within the /job_info folder is a /configs folder, which contains ensemblex_config.ini; this .ini file contains all of the adjustable parameters for the Ensemblex pipeline.

+
working_directory
+└── job_info
+    ├── configs
+    │   └── ensemblex_config.ini
+    ├── logs
+    └── summary_report.txt
+
+

To ensure replicability, the execution parameters are documented in ~/working_directory/job_info/summary_report.txt.

+
+

How to modify the parameter files

+

The following section illustrates how to modify the ensemblex_config.ini parameter file directly from the terminal. To begin, navigate to the /configs folder and view its contents:

+
cd ~/working_directory/job_info/configs
+ls
+
+

The following file will be available: ensemblex_config.ini

+

To modify the ensemblex_config.ini parameter file directly in the terminal we will use Nano:

+
nano ensemblex_config.ini
+
+

This will open ensemblex_config.ini in the terminal and allow users to modify the parameters. To save the modifications and exit the parameter file, press ctrl+o (then Enter to confirm the file name) followed by ctrl+x.
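For scripted or repeated runs, the same edit can be made without opening an editor. The following is a minimal Python sketch, assuming that ensemblex_config.ini stores one parameter per line as NAME=VALUE. That layout, the helper name set_parameter, and the example values are assumptions for illustration, so inspect the generated file before relying on it.

```python
import re


# Illustrative sketch only; assumes one NAME=VALUE pair per line.
def set_parameter(config_text, name, value):
    """Return config_text with the value of `name` replaced by `value`.

    Raises KeyError if no line starting with `name=` is found, so that a
    misspelled parameter name does not pass silently.
    """
    pattern = re.compile(
        rf"^({re.escape(name)}[ \t]*=[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids backslash handling in the new value.
    new_text, n_replaced = pattern.subn(
        lambda match: match.group(1) + str(value), config_text
    )
    if n_replaced == 0:
        raise KeyError(name)
    return new_text


example_config = "PAR_vireo_N=NULL\nPAR_vireo_processes=20\n"
updated = set_parameter(example_config, "PAR_vireo_N", 9)
print(updated)
```

To apply this to a real file, the text could be read and written back with `pathlib.Path.read_text` and `pathlib.Path.write_text`; keeping a copy of the original file first is a sensible precaution.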

+
+

Constituent genetic demultiplexing tools with prior genotype information

+

Demuxalot

+

The following parameters are adjustable for Demuxalot:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_demuxalot_genotype_names | NULL | List of sample IDs in the sample VCF file (e.g., 'Sample_1,Sample_2,Sample_3').
PAR_demuxalot_minimum_coverage | 200 | Minimum read coverage.
PAR_demuxalot_minimum_alternative_coverage | 10 | Minimum alternative read coverage.
PAR_demuxalot_n_best_snps_per_donor | 100 | Number of best SNPs for each donor to use for demultiplexing.
PAR_demuxalot_genotypes_prior_strength | 1 | Genotype prior strength.
PAR_demuxalot_doublet_prior | 0.25 | Doublet prior strength.
+
+

Demuxlet

+

The following parameters are adjustable for Demuxlet:

+ + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_demuxlet_field | GT | Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file.
+

NOTE: We are currently working on expanding the execution parameters for Demuxlet.

+
+

Vireo

+

The following parameters are adjustable for Vireo:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_vireo_N | NULL | Number of pooled samples.
PAR_vireo_type | GT | Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file.
PAR_vireo_processes | 20 | Number of subprocesses for computing.
PAR_vireo_minMAF | 0.1 | Minimum minor allele frequency.
PAR_vireo_minCOUNT | 20 | Minimum aggregated count.
PAR_vireo_forcelearnGT | T | Whether or not to treat donor GT as prior only.
+

NOTE: We are currently working on expanding the execution parameters for Vireo.

+
+

Souporcell

+

The following parameters are adjustable for Souporcell:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_minimap2 | -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no | For information regarding the minimap2 parameters, please see the documentation.
PAR_freebayes | -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 | For information regarding the freebayes parameters, please see the documentation.
PAR_vartrix_umi | TRUE | Whether or not to consider UMI information when populating coverage matrices.
PAR_vartrix_mapq | 30 | Minimum read mapping quality.
PAR_vartrix_threads | 8 | Number of threads for computing.
PAR_souporcell_k | NULL | Number of pooled samples.
PAR_souporcell_t | 8 | Number of threads for computing.
+

NOTE: We are currently working on expanding the execution parameters for Souporcell.

+
+

Constituent genetic demultiplexing tools without prior genotype information

+

Demuxalot

+

The following parameters are adjustable for Demuxalot:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_demuxalot_genotype_names | NULL | List of sample IDs in the sample VCF file generated by Freemuxlet: outs.clust1.vcf (e.g., 'CLUST0,CLUST1,CLUST2').
PAR_demuxalot_minimum_coverage | 200 | Minimum read coverage.
PAR_demuxalot_minimum_alternative_coverage | 10 | Minimum alternative read coverage.
PAR_demuxalot_n_best_snps_per_donor | 100 | Number of best SNPs for each donor to use for demultiplexing.
PAR_demuxalot_genotypes_prior_strength | 1 | Genotype prior strength.
PAR_demuxalot_doublet_prior | 0.25 | Doublet prior strength.
+
+

Freemuxlet

+

The following parameters are adjustable for Freemuxlet:

+ + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_freemuxlet_nsample | NULL | Number of pooled samples.
+

NOTE: We are currently working on expanding the execution parameters for Freemuxlet.

+
+

Vireo

+

The following parameters are adjustable for Vireo:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_vireo_N | NULL | Number of pooled samples.
PAR_vireo_processes | 20 | Number of subprocesses for computing.
PAR_vireo_minMAF | 0.1 | Minimum minor allele frequency.
PAR_vireo_minCOUNT | 20 | Minimum aggregated count.
+

NOTE: We are currently working on expanding the execution parameters for Vireo.

+
+

Souporcell

+

The following parameters are adjustable for Souporcell:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
PAR_minimap2 | -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no | For information regarding the minimap2 parameters, please see the documentation.
PAR_freebayes | -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 | For information regarding the freebayes parameters, please see the documentation.
PAR_vartrix_umi | TRUE | Whether or not to consider UMI information when populating coverage matrices.
PAR_vartrix_mapq | 30 | Minimum read mapping quality.
PAR_vartrix_threads | 8 | Number of threads for computing.
PAR_souporcell_k | NULL | Number of pooled samples.
PAR_souporcell_t | 8 | Number of threads for computing.
+

NOTE: We are currently working on expanding the execution parameters for Souporcell.

+
+

Ensemblex

+

The following parameters are adjustable for the Ensemblex algorithm:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parameter | Default | Description
Pool parameters
PAR_ensemblex_sample_size | NULL | Number of samples multiplexed in the pool.
PAR_ensemblex_expected_doublet_rate | NULL | Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation.
Set up parameters
PAR_ensemblex_merge_constituents | Yes | Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to "Yes". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to "No" as the pipeline will automatically detect the previously generated merged file.
Step 1 parameters: Probabilistic-weighted ensemble
PAR_ensemblex_probabilistic_weighted_ensemble | Yes | Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to "Yes". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to "No" as the pipeline will automatically detect the previously generated Step 1 output file.
Step 2 parameters: Graph-based doublet detection
PAR_ensemblex_preliminary_parameter_sweep | No | Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively.
PAR_ensemblex_nCD | NULL | Manually defined number of confident doublets in the pool (nCD). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to "Yes".
PAR_ensemblex_pT | NULL | Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to "Yes".
PAR_ensemblex_graph_based_doublet_detection | Yes | Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values.
Step 3 parameters: Ensemble-independent doublet detection
PAR_ensemblex_preliminary_ensemble_independent_doublet | No | Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters.
PAR_ensemblex_ensemble_independent_doublet | Yes | Whether or not to perform Step 3: Ensemble-independent doublet detection.
PAR_ensemblex_doublet_Demuxalot_threshold | Yes | Whether or not to label doublets identified by Demuxalot as doublets. Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Demuxalot_no_threshold | No | Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Demuxlet_threshold | No | Whether or not to label doublets identified by Demuxlet as doublets. Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Demuxlet_no_threshold | No | Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Souporcell_threshold | No | Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Souporcell_no_threshold | No | Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability.
PAR_ensemblex_doublet_Vireo_threshold | Yes | Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex.
PAR_ensemblex_doublet_Vireo_no_threshold | No | Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability.
Confidence score parameters
PAR_ensemblex_compute_singlet_confidence | Yes | Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses.
+ +
+
+ +
+
+ +
+ +
+ +
+ + + + + + + + + diff --git a/site/search.html b/site/search.html new file mode 100644 index 0000000..fd7077b --- /dev/null +++ b/site/search.html @@ -0,0 +1,155 @@ + + + + + + + + Ensemblex + + + + + + + + + + + +
+ + +
+ +
+
+
+
+
+
+
+ + +

Search Results

+ + + +
+ Searching... +
+ + +
+
+ +
+
+ +
+ +
+ +
+ + + + + +
+ + + + + + + + + diff --git a/site/search/lunr.js b/site/search/lunr.js new file mode 100644 index 0000000..aca0a16 --- /dev/null +++ b/site/search/lunr.js @@ -0,0 +1,3475 @@ +/** + * lunr - http://lunrjs.com - A bit like Solr, but much smaller and not as bright - 2.3.9 + * Copyright (C) 2020 Oliver Nightingale + * @license MIT + */ + +;(function(){ + +/** + * A convenience function for configuring and constructing + * a new lunr Index. + * + * A lunr.Builder instance is created and the pipeline setup + * with a trimmer, stop word filter and stemmer. + * + * This builder object is yielded to the configuration function + * that is passed as a parameter, allowing the list of fields + * and other builder parameters to be customised. + * + * All documents _must_ be added within the passed config function. + * + * @example + * var idx = lunr(function () { + * this.field('title') + * this.field('body') + * this.ref('id') + * + * documents.forEach(function (doc) { + * this.add(doc) + * }, this) + * }) + * + * @see {@link lunr.Builder} + * @see {@link lunr.Pipeline} + * @see {@link lunr.trimmer} + * @see {@link lunr.stopWordFilter} + * @see {@link lunr.stemmer} + * @namespace {function} lunr + */ +var lunr = function (config) { + var builder = new lunr.Builder + + builder.pipeline.add( + lunr.trimmer, + lunr.stopWordFilter, + lunr.stemmer + ) + + builder.searchPipeline.add( + lunr.stemmer + ) + + config.call(builder, builder) + return builder.build() +} + +lunr.version = "2.3.9" +/*! + * lunr.utils + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * A namespace containing utils for the rest of the lunr library + * @namespace lunr.utils + */ +lunr.utils = {} + +/** + * Print a warning message to the console. + * + * @param {String} message The message to be printed. 
+ * @memberOf lunr.utils + * @function + */ +lunr.utils.warn = (function (global) { + /* eslint-disable no-console */ + return function (message) { + if (global.console && console.warn) { + console.warn(message) + } + } + /* eslint-enable no-console */ +})(this) + +/** + * Convert an object to a string. + * + * In the case of `null` and `undefined` the function returns + * the empty string, in all other cases the result of calling + * `toString` on the passed object is returned. + * + * @param {Any} obj The object to convert to a string. + * @return {String} string representation of the passed object. + * @memberOf lunr.utils + */ +lunr.utils.asString = function (obj) { + if (obj === void 0 || obj === null) { + return "" + } else { + return obj.toString() + } +} + +/** + * Clones an object. + * + * Will create a copy of an existing object such that any mutations + * on the copy cannot affect the original. + * + * Only shallow objects are supported, passing a nested object to this + * function will cause a TypeError. + * + * Objects with primitives, and arrays of primitives are supported. + * + * @param {Object} obj The object to clone. + * @return {Object} a clone of the passed object. + * @throws {TypeError} when a nested object is passed. 
+ * @memberOf Utils + */ +lunr.utils.clone = function (obj) { + if (obj === null || obj === undefined) { + return obj + } + + var clone = Object.create(null), + keys = Object.keys(obj) + + for (var i = 0; i < keys.length; i++) { + var key = keys[i], + val = obj[key] + + if (Array.isArray(val)) { + clone[key] = val.slice() + continue + } + + if (typeof val === 'string' || + typeof val === 'number' || + typeof val === 'boolean') { + clone[key] = val + continue + } + + throw new TypeError("clone is not deep and does not support nested objects") + } + + return clone +} +lunr.FieldRef = function (docRef, fieldName, stringValue) { + this.docRef = docRef + this.fieldName = fieldName + this._stringValue = stringValue +} + +lunr.FieldRef.joiner = "/" + +lunr.FieldRef.fromString = function (s) { + var n = s.indexOf(lunr.FieldRef.joiner) + + if (n === -1) { + throw "malformed field ref string" + } + + var fieldRef = s.slice(0, n), + docRef = s.slice(n + 1) + + return new lunr.FieldRef (docRef, fieldRef, s) +} + +lunr.FieldRef.prototype.toString = function () { + if (this._stringValue == undefined) { + this._stringValue = this.fieldName + lunr.FieldRef.joiner + this.docRef + } + + return this._stringValue +} +/*! + * lunr.Set + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * A lunr set. + * + * @constructor + */ +lunr.Set = function (elements) { + this.elements = Object.create(null) + + if (elements) { + this.length = elements.length + + for (var i = 0; i < this.length; i++) { + this.elements[elements[i]] = true + } + } else { + this.length = 0 + } +} + +/** + * A complete set that contains all elements. + * + * @static + * @readonly + * @type {lunr.Set} + */ +lunr.Set.complete = { + intersect: function (other) { + return other + }, + + union: function () { + return this + }, + + contains: function () { + return true + } +} + +/** + * An empty set that contains no elements. 
+ * + * @static + * @readonly + * @type {lunr.Set} + */ +lunr.Set.empty = { + intersect: function () { + return this + }, + + union: function (other) { + return other + }, + + contains: function () { + return false + } +} + +/** + * Returns true if this set contains the specified object. + * + * @param {object} object - Object whose presence in this set is to be tested. + * @returns {boolean} - True if this set contains the specified object. + */ +lunr.Set.prototype.contains = function (object) { + return !!this.elements[object] +} + +/** + * Returns a new set containing only the elements that are present in both + * this set and the specified set. + * + * @param {lunr.Set} other - set to intersect with this set. + * @returns {lunr.Set} a new set that is the intersection of this and the specified set. + */ + +lunr.Set.prototype.intersect = function (other) { + var a, b, elements, intersection = [] + + if (other === lunr.Set.complete) { + return this + } + + if (other === lunr.Set.empty) { + return other + } + + if (this.length < other.length) { + a = this + b = other + } else { + a = other + b = this + } + + elements = Object.keys(a.elements) + + for (var i = 0; i < elements.length; i++) { + var element = elements[i] + if (element in b.elements) { + intersection.push(element) + } + } + + return new lunr.Set (intersection) +} + +/** + * Returns a new set combining the elements of this and the specified set. + * + * @param {lunr.Set} other - set to union with this set. + * @return {lunr.Set} a new set that is the union of this and the specified set. + */ + +lunr.Set.prototype.union = function (other) { + if (other === lunr.Set.complete) { + return lunr.Set.complete + } + + if (other === lunr.Set.empty) { + return this + } + + return new lunr.Set(Object.keys(this.elements).concat(Object.keys(other.elements))) +} +/** + * A function to calculate the inverse document frequency for + * a posting. 
This is shared between the builder and the index + * + * @private + * @param {object} posting - The posting for a given term + * @param {number} documentCount - The total number of documents. + */ +lunr.idf = function (posting, documentCount) { + var documentsWithTerm = 0 + + for (var fieldName in posting) { + if (fieldName == '_index') continue // Ignore the term index, its not a field + documentsWithTerm += Object.keys(posting[fieldName]).length + } + + var x = (documentCount - documentsWithTerm + 0.5) / (documentsWithTerm + 0.5) + + return Math.log(1 + Math.abs(x)) +} + +/** + * A token wraps a string representation of a token + * as it is passed through the text processing pipeline. + * + * @constructor + * @param {string} [str=''] - The string token being wrapped. + * @param {object} [metadata={}] - Metadata associated with this token. + */ +lunr.Token = function (str, metadata) { + this.str = str || "" + this.metadata = metadata || {} +} + +/** + * Returns the token string that is being wrapped by this object. + * + * @returns {string} + */ +lunr.Token.prototype.toString = function () { + return this.str +} + +/** + * A token update function is used when updating or optionally + * when cloning a token. + * + * @callback lunr.Token~updateFunction + * @param {string} str - The string representation of the token. + * @param {Object} metadata - All metadata associated with this token. + */ + +/** + * Applies the given function to the wrapped string token. + * + * @example + * token.update(function (str, metadata) { + * return str.toUpperCase() + * }) + * + * @param {lunr.Token~updateFunction} fn - A function to apply to the token string. + * @returns {lunr.Token} + */ +lunr.Token.prototype.update = function (fn) { + this.str = fn(this.str, this.metadata) + return this +} + +/** + * Creates a clone of this token. Optionally a function can be + * applied to the cloned token. 
+ * + * @param {lunr.Token~updateFunction} [fn] - An optional function to apply to the cloned token. + * @returns {lunr.Token} + */ +lunr.Token.prototype.clone = function (fn) { + fn = fn || function (s) { return s } + return new lunr.Token (fn(this.str, this.metadata), this.metadata) +} +/*! + * lunr.tokenizer + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * A function for splitting a string into tokens ready to be inserted into + * the search index. Uses `lunr.tokenizer.separator` to split strings, change + * the value of this property to change how strings are split into tokens. + * + * This tokenizer will convert its parameter to a string by calling `toString` and + * then will split this string on the character in `lunr.tokenizer.separator`. + * Arrays will have their elements converted to strings and wrapped in a lunr.Token. + * + * Optional metadata can be passed to the tokenizer, this metadata will be cloned and + * added as metadata to every token that is created from the object to be tokenized. 
+ * + * @static + * @param {?(string|object|object[])} obj - The object to convert into tokens + * @param {?object} metadata - Optional metadata to associate with every token + * @returns {lunr.Token[]} + * @see {@link lunr.Pipeline} + */ +lunr.tokenizer = function (obj, metadata) { + if (obj == null || obj == undefined) { + return [] + } + + if (Array.isArray(obj)) { + return obj.map(function (t) { + return new lunr.Token( + lunr.utils.asString(t).toLowerCase(), + lunr.utils.clone(metadata) + ) + }) + } + + var str = obj.toString().toLowerCase(), + len = str.length, + tokens = [] + + for (var sliceEnd = 0, sliceStart = 0; sliceEnd <= len; sliceEnd++) { + var char = str.charAt(sliceEnd), + sliceLength = sliceEnd - sliceStart + + if ((char.match(lunr.tokenizer.separator) || sliceEnd == len)) { + + if (sliceLength > 0) { + var tokenMetadata = lunr.utils.clone(metadata) || {} + tokenMetadata["position"] = [sliceStart, sliceLength] + tokenMetadata["index"] = tokens.length + + tokens.push( + new lunr.Token ( + str.slice(sliceStart, sliceEnd), + tokenMetadata + ) + ) + } + + sliceStart = sliceEnd + 1 + } + + } + + return tokens +} + +/** + * The separator used to split a string into tokens. Override this property to change the behaviour of + * `lunr.tokenizer` behaviour when tokenizing strings. By default this splits on whitespace and hyphens. + * + * @static + * @see lunr.tokenizer + */ +lunr.tokenizer.separator = /[\s\-]+/ +/*! + * lunr.Pipeline + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * lunr.Pipelines maintain an ordered list of functions to be applied to all + * tokens in documents entering the search index and queries being ran against + * the index. + * + * An instance of lunr.Index created with the lunr shortcut will contain a + * pipeline with a stop word filter and an English language stemmer. Extra + * functions can be added before or after either of these functions or these + * default functions can be removed. 
+ * + * When run the pipeline will call each function in turn, passing a token, the + * index of that token in the original list of all tokens and finally a list of + * all the original tokens. + * + * The output of functions in the pipeline will be passed to the next function + * in the pipeline. To exclude a token from entering the index the function + * should return undefined, the rest of the pipeline will not be called with + * this token. + * + * For serialisation of pipelines to work, all functions used in an instance of + * a pipeline should be registered with lunr.Pipeline. Registered functions can + * then be loaded. If trying to load a serialised pipeline that uses functions + * that are not registered an error will be thrown. + * + * If not planning on serialising the pipeline then registering pipeline functions + * is not necessary. + * + * @constructor + */ +lunr.Pipeline = function () { + this._stack = [] +} + +lunr.Pipeline.registeredFunctions = Object.create(null) + +/** + * A pipeline function maps lunr.Token to lunr.Token. A lunr.Token contains the token + * string as well as all known metadata. A pipeline function can mutate the token string + * or mutate (or add) metadata for a given token. + * + * A pipeline function can indicate that the passed token should be discarded by returning + * null, undefined or an empty string. This token will not be passed to any downstream pipeline + * functions and will not be added to the index. + * + * Multiple tokens can be returned by returning an array of tokens. Each token will be passed + * to any downstream pipeline functions and all will returned tokens will be added to the index. + * + * Any number of pipeline functions may be chained together using a lunr.Pipeline. + * + * @interface lunr.PipelineFunction + * @param {lunr.Token} token - A token from the document being processed. + * @param {number} i - The index of this token in the complete list of tokens for this document/field. 
+ * @param {lunr.Token[]} tokens - All tokens for this document/field. + * @returns {(?lunr.Token|lunr.Token[])} + */ + +/** + * Register a function with the pipeline. + * + * Functions that are used in the pipeline should be registered if the pipeline + * needs to be serialised, or a serialised pipeline needs to be loaded. + * + * Registering a function does not add it to a pipeline, functions must still be + * added to instances of the pipeline for them to be used when running a pipeline. + * + * @param {lunr.PipelineFunction} fn - The function to check for. + * @param {String} label - The label to register this function with + */ +lunr.Pipeline.registerFunction = function (fn, label) { + if (label in this.registeredFunctions) { + lunr.utils.warn('Overwriting existing registered function: ' + label) + } + + fn.label = label + lunr.Pipeline.registeredFunctions[fn.label] = fn +} + +/** + * Warns if the function is not registered as a Pipeline function. + * + * @param {lunr.PipelineFunction} fn - The function to check for. + * @private + */ +lunr.Pipeline.warnIfFunctionNotRegistered = function (fn) { + var isRegistered = fn.label && (fn.label in this.registeredFunctions) + + if (!isRegistered) { + lunr.utils.warn('Function is not registered with pipeline. This may cause problems when serialising the index.\n', fn) + } +} + +/** + * Loads a previously serialised pipeline. + * + * All functions to be loaded must already be registered with lunr.Pipeline. + * If any function from the serialised data has not been registered then an + * error will be thrown. + * + * @param {Object} serialised - The serialised pipeline to load. 
+ * @returns {lunr.Pipeline} + */ +lunr.Pipeline.load = function (serialised) { + var pipeline = new lunr.Pipeline + + serialised.forEach(function (fnName) { + var fn = lunr.Pipeline.registeredFunctions[fnName] + + if (fn) { + pipeline.add(fn) + } else { + throw new Error('Cannot load unregistered function: ' + fnName) + } + }) + + return pipeline +} + +/** + * Adds new functions to the end of the pipeline. + * + * Logs a warning if the function has not been registered. + * + * @param {lunr.PipelineFunction[]} functions - Any number of functions to add to the pipeline. + */ +lunr.Pipeline.prototype.add = function () { + var fns = Array.prototype.slice.call(arguments) + + fns.forEach(function (fn) { + lunr.Pipeline.warnIfFunctionNotRegistered(fn) + this._stack.push(fn) + }, this) +} + +/** + * Adds a single function after a function that already exists in the + * pipeline. + * + * Logs a warning if the function has not been registered. + * + * @param {lunr.PipelineFunction} existingFn - A function that already exists in the pipeline. + * @param {lunr.PipelineFunction} newFn - The new function to add to the pipeline. + */ +lunr.Pipeline.prototype.after = function (existingFn, newFn) { + lunr.Pipeline.warnIfFunctionNotRegistered(newFn) + + var pos = this._stack.indexOf(existingFn) + if (pos == -1) { + throw new Error('Cannot find existingFn') + } + + pos = pos + 1 + this._stack.splice(pos, 0, newFn) +} + +/** + * Adds a single function before a function that already exists in the + * pipeline. + * + * Logs a warning if the function has not been registered. + * + * @param {lunr.PipelineFunction} existingFn - A function that already exists in the pipeline. + * @param {lunr.PipelineFunction} newFn - The new function to add to the pipeline. 
+ */ +lunr.Pipeline.prototype.before = function (existingFn, newFn) { + lunr.Pipeline.warnIfFunctionNotRegistered(newFn) + + var pos = this._stack.indexOf(existingFn) + if (pos == -1) { + throw new Error('Cannot find existingFn') + } + + this._stack.splice(pos, 0, newFn) +} + +/** + * Removes a function from the pipeline. + * + * @param {lunr.PipelineFunction} fn The function to remove from the pipeline. + */ +lunr.Pipeline.prototype.remove = function (fn) { + var pos = this._stack.indexOf(fn) + if (pos == -1) { + return + } + + this._stack.splice(pos, 1) +} + +/** + * Runs the current list of functions that make up the pipeline against the + * passed tokens. + * + * @param {Array} tokens The tokens to run through the pipeline. + * @returns {Array} + */ +lunr.Pipeline.prototype.run = function (tokens) { + var stackLength = this._stack.length + + for (var i = 0; i < stackLength; i++) { + var fn = this._stack[i] + var memo = [] + + for (var j = 0; j < tokens.length; j++) { + var result = fn(tokens[j], j, tokens) + + if (result === null || result === void 0 || result === '') continue + + if (Array.isArray(result)) { + for (var k = 0; k < result.length; k++) { + memo.push(result[k]) + } + } else { + memo.push(result) + } + } + + tokens = memo + } + + return tokens +} + +/** + * Convenience method for passing a string through a pipeline and getting + * strings out. This method takes care of wrapping the passed string in a + * token and mapping the resulting tokens back to strings. + * + * @param {string} str - The string to pass through the pipeline. + * @param {?object} metadata - Optional metadata to associate with the token + * passed to the pipeline. + * @returns {string[]} + */ +lunr.Pipeline.prototype.runString = function (str, metadata) { + var token = new lunr.Token (str, metadata) + + return this.run([token]).map(function (t) { + return t.toString() + }) +} + +/** + * Resets the pipeline by removing any existing processors. 
+ * + */ +lunr.Pipeline.prototype.reset = function () { + this._stack = [] +} + +/** + * Returns a representation of the pipeline ready for serialisation. + * + * Logs a warning if the function has not been registered. + * + * @returns {Array} + */ +lunr.Pipeline.prototype.toJSON = function () { + return this._stack.map(function (fn) { + lunr.Pipeline.warnIfFunctionNotRegistered(fn) + + return fn.label + }) +} +/*! + * lunr.Vector + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * A vector is used to construct the vector space of documents and queries. These + * vectors support operations to determine the similarity between two documents or + * a document and a query. + * + * Normally no parameters are required for initializing a vector, but in the case of + * loading a previously dumped vector the raw elements can be provided to the constructor. + * + * For performance reasons vectors are implemented with a flat array, where an elements + * index is immediately followed by its value. E.g. [index, value, index, value]. This + * allows the underlying array to be as sparse as possible and still offer decent + * performance when being used for vector calculations. + * + * @constructor + * @param {Number[]} [elements] - The flat list of element index and element value pairs. + */ +lunr.Vector = function (elements) { + this._magnitude = 0 + this.elements = elements || [] +} + + +/** + * Calculates the position within the vector to insert a given index. + * + * This is used internally by insert and upsert. If there are duplicate indexes then + * the position is returned as if the value for that index were to be updated, but it + * is the callers responsibility to check whether there is a duplicate at that index + * + * @param {Number} insertIdx - The index at which the element should be inserted. 
+ * @returns {Number} + */ +lunr.Vector.prototype.positionForIndex = function (index) { + // For an empty vector the tuple can be inserted at the beginning + if (this.elements.length == 0) { + return 0 + } + + var start = 0, + end = this.elements.length / 2, + sliceLength = end - start, + pivotPoint = Math.floor(sliceLength / 2), + pivotIndex = this.elements[pivotPoint * 2] + + while (sliceLength > 1) { + if (pivotIndex < index) { + start = pivotPoint + } + + if (pivotIndex > index) { + end = pivotPoint + } + + if (pivotIndex == index) { + break + } + + sliceLength = end - start + pivotPoint = start + Math.floor(sliceLength / 2) + pivotIndex = this.elements[pivotPoint * 2] + } + + if (pivotIndex == index) { + return pivotPoint * 2 + } + + if (pivotIndex > index) { + return pivotPoint * 2 + } + + if (pivotIndex < index) { + return (pivotPoint + 1) * 2 + } +} + +/** + * Inserts an element at an index within the vector. + * + * Does not allow duplicates, will throw an error if there is already an entry + * for this index. + * + * @param {Number} insertIdx - The index at which the element should be inserted. + * @param {Number} val - The value to be inserted into the vector. + */ +lunr.Vector.prototype.insert = function (insertIdx, val) { + this.upsert(insertIdx, val, function () { + throw "duplicate index" + }) +} + +/** + * Inserts or updates an existing index within the vector. + * + * @param {Number} insertIdx - The index at which the element should be inserted. + * @param {Number} val - The value to be inserted into the vector. 
+ * @param {function} fn - A function that is called for updates, the existing value and the + * requested value are passed as arguments + */ +lunr.Vector.prototype.upsert = function (insertIdx, val, fn) { + this._magnitude = 0 + var position = this.positionForIndex(insertIdx) + + if (this.elements[position] == insertIdx) { + this.elements[position + 1] = fn(this.elements[position + 1], val) + } else { + this.elements.splice(position, 0, insertIdx, val) + } +} + +/** + * Calculates the magnitude of this vector. + * + * @returns {Number} + */ +lunr.Vector.prototype.magnitude = function () { + if (this._magnitude) return this._magnitude + + var sumOfSquares = 0, + elementsLength = this.elements.length + + for (var i = 1; i < elementsLength; i += 2) { + var val = this.elements[i] + sumOfSquares += val * val + } + + return this._magnitude = Math.sqrt(sumOfSquares) +} + +/** + * Calculates the dot product of this vector and another vector. + * + * @param {lunr.Vector} otherVector - The vector to compute the dot product with. + * @returns {Number} + */ +lunr.Vector.prototype.dot = function (otherVector) { + var dotProduct = 0, + a = this.elements, b = otherVector.elements, + aLen = a.length, bLen = b.length, + aVal = 0, bVal = 0, + i = 0, j = 0 + + while (i < aLen && j < bLen) { + aVal = a[i], bVal = b[j] + if (aVal < bVal) { + i += 2 + } else if (aVal > bVal) { + j += 2 + } else if (aVal == bVal) { + dotProduct += a[i + 1] * b[j + 1] + i += 2 + j += 2 + } + } + + return dotProduct +} + +/** + * Calculates the similarity between this vector and another vector. + * + * @param {lunr.Vector} otherVector - The other vector to calculate the + * similarity with. + * @returns {Number} + */ +lunr.Vector.prototype.similarity = function (otherVector) { + return this.dot(otherVector) / this.magnitude() || 0 +} + +/** + * Converts the vector to an array of the elements within the vector. 
+ * + * @returns {Number[]} + */ +lunr.Vector.prototype.toArray = function () { + var output = new Array (this.elements.length / 2) + + for (var i = 1, j = 0; i < this.elements.length; i += 2, j++) { + output[j] = this.elements[i] + } + + return output +} + +/** + * A JSON serializable representation of the vector. + * + * @returns {Number[]} + */ +lunr.Vector.prototype.toJSON = function () { + return this.elements +} +/* eslint-disable */ +/*! + * lunr.stemmer + * Copyright (C) 2020 Oliver Nightingale + * Includes code from - http://tartarus.org/~martin/PorterStemmer/js.txt + */ + +/** + * lunr.stemmer is an english language stemmer, this is a JavaScript + * implementation of the PorterStemmer taken from http://tartarus.org/~martin + * + * @static + * @implements {lunr.PipelineFunction} + * @param {lunr.Token} token - The string to stem + * @returns {lunr.Token} + * @see {@link lunr.Pipeline} + * @function + */ +lunr.stemmer = (function(){ + var step2list = { + "ational" : "ate", + "tional" : "tion", + "enci" : "ence", + "anci" : "ance", + "izer" : "ize", + "bli" : "ble", + "alli" : "al", + "entli" : "ent", + "eli" : "e", + "ousli" : "ous", + "ization" : "ize", + "ation" : "ate", + "ator" : "ate", + "alism" : "al", + "iveness" : "ive", + "fulness" : "ful", + "ousness" : "ous", + "aliti" : "al", + "iviti" : "ive", + "biliti" : "ble", + "logi" : "log" + }, + + step3list = { + "icate" : "ic", + "ative" : "", + "alize" : "al", + "iciti" : "ic", + "ical" : "ic", + "ful" : "", + "ness" : "" + }, + + c = "[^aeiou]", // consonant + v = "[aeiouy]", // vowel + C = c + "[^aeiouy]*", // consonant sequence + V = v + "[aeiou]*", // vowel sequence + + mgr0 = "^(" + C + ")?" + V + C, // [C]VC... is m>0 + meq1 = "^(" + C + ")?" + V + C + "(" + V + ")?$", // [C]VC[V] is m=1 + mgr1 = "^(" + C + ")?" + V + C + V + C, // [C]VCVC... is m>1 + s_v = "^(" + C + ")?" 
+ v; // vowel in stem + + var re_mgr0 = new RegExp(mgr0); + var re_mgr1 = new RegExp(mgr1); + var re_meq1 = new RegExp(meq1); + var re_s_v = new RegExp(s_v); + + var re_1a = /^(.+?)(ss|i)es$/; + var re2_1a = /^(.+?)([^s])s$/; + var re_1b = /^(.+?)eed$/; + var re2_1b = /^(.+?)(ed|ing)$/; + var re_1b_2 = /.$/; + var re2_1b_2 = /(at|bl|iz)$/; + var re3_1b_2 = new RegExp("([^aeiouylsz])\\1$"); + var re4_1b_2 = new RegExp("^" + C + v + "[^aeiouwxy]$"); + + var re_1c = /^(.+?[^aeiou])y$/; + var re_2 = /^(.+?)(ational|tional|enci|anci|izer|bli|alli|entli|eli|ousli|ization|ation|ator|alism|iveness|fulness|ousness|aliti|iviti|biliti|logi)$/; + + var re_3 = /^(.+?)(icate|ative|alize|iciti|ical|ful|ness)$/; + + var re_4 = /^(.+?)(al|ance|ence|er|ic|able|ible|ant|ement|ment|ent|ou|ism|ate|iti|ous|ive|ize)$/; + var re2_4 = /^(.+?)(s|t)(ion)$/; + + var re_5 = /^(.+?)e$/; + var re_5_1 = /ll$/; + var re3_5 = new RegExp("^" + C + v + "[^aeiouwxy]$"); + + var porterStemmer = function porterStemmer(w) { + var stem, + suffix, + firstch, + re, + re2, + re3, + re4; + + if (w.length < 3) { return w; } + + firstch = w.substr(0,1); + if (firstch == "y") { + w = firstch.toUpperCase() + w.substr(1); + } + + // Step 1a + re = re_1a + re2 = re2_1a; + + if (re.test(w)) { w = w.replace(re,"$1$2"); } + else if (re2.test(w)) { w = w.replace(re2,"$1$2"); } + + // Step 1b + re = re_1b; + re2 = re2_1b; + if (re.test(w)) { + var fp = re.exec(w); + re = re_mgr0; + if (re.test(fp[1])) { + re = re_1b_2; + w = w.replace(re,""); + } + } else if (re2.test(w)) { + var fp = re2.exec(w); + stem = fp[1]; + re2 = re_s_v; + if (re2.test(stem)) { + w = stem; + re2 = re2_1b_2; + re3 = re3_1b_2; + re4 = re4_1b_2; + if (re2.test(w)) { w = w + "e"; } + else if (re3.test(w)) { re = re_1b_2; w = w.replace(re,""); } + else if (re4.test(w)) { w = w + "e"; } + } + } + + // Step 1c - replace suffix y or Y by i if preceded by a non-vowel which is not the first letter of the word (so cry -> cri, by -> by, say -> say) + re = 
re_1c; + if (re.test(w)) { + var fp = re.exec(w); + stem = fp[1]; + w = stem + "i"; + } + + // Step 2 + re = re_2; + if (re.test(w)) { + var fp = re.exec(w); + stem = fp[1]; + suffix = fp[2]; + re = re_mgr0; + if (re.test(stem)) { + w = stem + step2list[suffix]; + } + } + + // Step 3 + re = re_3; + if (re.test(w)) { + var fp = re.exec(w); + stem = fp[1]; + suffix = fp[2]; + re = re_mgr0; + if (re.test(stem)) { + w = stem + step3list[suffix]; + } + } + + // Step 4 + re = re_4; + re2 = re2_4; + if (re.test(w)) { + var fp = re.exec(w); + stem = fp[1]; + re = re_mgr1; + if (re.test(stem)) { + w = stem; + } + } else if (re2.test(w)) { + var fp = re2.exec(w); + stem = fp[1] + fp[2]; + re2 = re_mgr1; + if (re2.test(stem)) { + w = stem; + } + } + + // Step 5 + re = re_5; + if (re.test(w)) { + var fp = re.exec(w); + stem = fp[1]; + re = re_mgr1; + re2 = re_meq1; + re3 = re3_5; + if (re.test(stem) || (re2.test(stem) && !(re3.test(stem)))) { + w = stem; + } + } + + re = re_5_1; + re2 = re_mgr1; + if (re.test(w) && re2.test(w)) { + re = re_1b_2; + w = w.replace(re,""); + } + + // and turn initial Y back to y + + if (firstch == "y") { + w = firstch.toLowerCase() + w.substr(1); + } + + return w; + }; + + return function (token) { + return token.update(porterStemmer); + } +})(); + +lunr.Pipeline.registerFunction(lunr.stemmer, 'stemmer') +/*! + * lunr.stopWordFilter + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * lunr.generateStopWordFilter builds a stopWordFilter function from the provided + * list of stop words. + * + * The built in lunr.stopWordFilter is built using this generator and can be used + * to generate custom stopWordFilters for applications or non English languages. 
+ * + * @function + * @param {Array} token The token to pass through the filter + * @returns {lunr.PipelineFunction} + * @see lunr.Pipeline + * @see lunr.stopWordFilter + */ +lunr.generateStopWordFilter = function (stopWords) { + var words = stopWords.reduce(function (memo, stopWord) { + memo[stopWord] = stopWord + return memo + }, {}) + + return function (token) { + if (token && words[token.toString()] !== token.toString()) return token + } +} + +/** + * lunr.stopWordFilter is an English language stop word list filter, any words + * contained in the list will not be passed through the filter. + * + * This is intended to be used in the Pipeline. If the token does not pass the + * filter then undefined will be returned. + * + * @function + * @implements {lunr.PipelineFunction} + * @params {lunr.Token} token - A token to check for being a stop word. + * @returns {lunr.Token} + * @see {@link lunr.Pipeline} + */ +lunr.stopWordFilter = lunr.generateStopWordFilter([ + 'a', + 'able', + 'about', + 'across', + 'after', + 'all', + 'almost', + 'also', + 'am', + 'among', + 'an', + 'and', + 'any', + 'are', + 'as', + 'at', + 'be', + 'because', + 'been', + 'but', + 'by', + 'can', + 'cannot', + 'could', + 'dear', + 'did', + 'do', + 'does', + 'either', + 'else', + 'ever', + 'every', + 'for', + 'from', + 'get', + 'got', + 'had', + 'has', + 'have', + 'he', + 'her', + 'hers', + 'him', + 'his', + 'how', + 'however', + 'i', + 'if', + 'in', + 'into', + 'is', + 'it', + 'its', + 'just', + 'least', + 'let', + 'like', + 'likely', + 'may', + 'me', + 'might', + 'most', + 'must', + 'my', + 'neither', + 'no', + 'nor', + 'not', + 'of', + 'off', + 'often', + 'on', + 'only', + 'or', + 'other', + 'our', + 'own', + 'rather', + 'said', + 'say', + 'says', + 'she', + 'should', + 'since', + 'so', + 'some', + 'than', + 'that', + 'the', + 'their', + 'them', + 'then', + 'there', + 'these', + 'they', + 'this', + 'tis', + 'to', + 'too', + 'twas', + 'us', + 'wants', + 'was', + 'we', + 'were', + 'what', + 
'when', + 'where', + 'which', + 'while', + 'who', + 'whom', + 'why', + 'will', + 'with', + 'would', + 'yet', + 'you', + 'your' +]) + +lunr.Pipeline.registerFunction(lunr.stopWordFilter, 'stopWordFilter') +/*! + * lunr.trimmer + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * lunr.trimmer is a pipeline function for trimming non word + * characters from the beginning and end of tokens before they + * enter the index. + * + * This implementation may not work correctly for non latin + * characters and should either be removed or adapted for use + * with languages with non-latin characters. + * + * @static + * @implements {lunr.PipelineFunction} + * @param {lunr.Token} token The token to pass through the filter + * @returns {lunr.Token} + * @see lunr.Pipeline + */ +lunr.trimmer = function (token) { + return token.update(function (s) { + return s.replace(/^\W+/, '').replace(/\W+$/, '') + }) +} + +lunr.Pipeline.registerFunction(lunr.trimmer, 'trimmer') +/*! + * lunr.TokenSet + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * A token set is used to store the unique list of all tokens + * within an index. Token sets are also used to represent an + * incoming query to the index, this query token set and index + * token set are then intersected to find which tokens to look + * up in the inverted index. + * + * A token set can hold multiple tokens, as in the case of the + * index token set, or it can hold a single token as in the + * case of a simple query token set. + * + * Additionally token sets are used to perform wildcard matching. + * Leading, contained and trailing wildcards are supported, and + * from this edit distance matching can also be provided. + * + * Token sets are implemented as a minimal finite state automata, + * where both common prefixes and suffixes are shared between tokens. + * This helps to reduce the space used for storing the token set. 
+ * + * @constructor + */ +lunr.TokenSet = function () { + this.final = false + this.edges = {} + this.id = lunr.TokenSet._nextId + lunr.TokenSet._nextId += 1 +} + +/** + * Keeps track of the next, auto increment, identifier to assign + * to a new tokenSet. + * + * TokenSets require a unique identifier to be correctly minimised. + * + * @private + */ +lunr.TokenSet._nextId = 1 + +/** + * Creates a TokenSet instance from the given sorted array of words. + * + * @param {String[]} arr - A sorted array of strings to create the set from. + * @returns {lunr.TokenSet} + * @throws Will throw an error if the input array is not sorted. + */ +lunr.TokenSet.fromArray = function (arr) { + var builder = new lunr.TokenSet.Builder + + for (var i = 0, len = arr.length; i < len; i++) { + builder.insert(arr[i]) + } + + builder.finish() + return builder.root +} + +/** + * Creates a token set from a query clause. + * + * @private + * @param {Object} clause - A single clause from lunr.Query. + * @param {string} clause.term - The query clause term. + * @param {number} [clause.editDistance] - The optional edit distance for the term. + * @returns {lunr.TokenSet} + */ +lunr.TokenSet.fromClause = function (clause) { + if ('editDistance' in clause) { + return lunr.TokenSet.fromFuzzyString(clause.term, clause.editDistance) + } else { + return lunr.TokenSet.fromString(clause.term) + } +} + +/** + * Creates a token set representing a single string with a specified + * edit distance. + * + * Insertions, deletions, substitutions and transpositions are each + * treated as an edit distance of 1. + * + * Increasing the allowed edit distance will have a dramatic impact + * on the performance of both creating and intersecting these TokenSets. + * It is advised to keep the edit distance less than 3. + * + * @param {string} str - The string to create the token set from. + * @param {number} editDistance - The allowed edit distance to match. 
+ * @returns {lunr.Vector} + */ +lunr.TokenSet.fromFuzzyString = function (str, editDistance) { + var root = new lunr.TokenSet + + var stack = [{ + node: root, + editsRemaining: editDistance, + str: str + }] + + while (stack.length) { + var frame = stack.pop() + + // no edit + if (frame.str.length > 0) { + var char = frame.str.charAt(0), + noEditNode + + if (char in frame.node.edges) { + noEditNode = frame.node.edges[char] + } else { + noEditNode = new lunr.TokenSet + frame.node.edges[char] = noEditNode + } + + if (frame.str.length == 1) { + noEditNode.final = true + } + + stack.push({ + node: noEditNode, + editsRemaining: frame.editsRemaining, + str: frame.str.slice(1) + }) + } + + if (frame.editsRemaining == 0) { + continue + } + + // insertion + if ("*" in frame.node.edges) { + var insertionNode = frame.node.edges["*"] + } else { + var insertionNode = new lunr.TokenSet + frame.node.edges["*"] = insertionNode + } + + if (frame.str.length == 0) { + insertionNode.final = true + } + + stack.push({ + node: insertionNode, + editsRemaining: frame.editsRemaining - 1, + str: frame.str + }) + + // deletion + // can only do a deletion if we have enough edits remaining + // and if there are characters left to delete in the string + if (frame.str.length > 1) { + stack.push({ + node: frame.node, + editsRemaining: frame.editsRemaining - 1, + str: frame.str.slice(1) + }) + } + + // deletion + // just removing the last character from the str + if (frame.str.length == 1) { + frame.node.final = true + } + + // substitution + // can only do a substitution if we have enough edits remaining + // and if there are characters left to substitute + if (frame.str.length >= 1) { + if ("*" in frame.node.edges) { + var substitutionNode = frame.node.edges["*"] + } else { + var substitutionNode = new lunr.TokenSet + frame.node.edges["*"] = substitutionNode + } + + if (frame.str.length == 1) { + substitutionNode.final = true + } + + stack.push({ + node: substitutionNode, + editsRemaining: 
frame.editsRemaining - 1, + str: frame.str.slice(1) + }) + } + + // transposition + // can only do a transposition if there are edits remaining + // and there are enough characters to transpose + if (frame.str.length > 1) { + var charA = frame.str.charAt(0), + charB = frame.str.charAt(1), + transposeNode + + if (charB in frame.node.edges) { + transposeNode = frame.node.edges[charB] + } else { + transposeNode = new lunr.TokenSet + frame.node.edges[charB] = transposeNode + } + + if (frame.str.length == 1) { + transposeNode.final = true + } + + stack.push({ + node: transposeNode, + editsRemaining: frame.editsRemaining - 1, + str: charA + frame.str.slice(2) + }) + } + } + + return root +} + +/** + * Creates a TokenSet from a string. + * + * The string may contain one or more wildcard characters (*) + * that will allow wildcard matching when intersecting with + * another TokenSet. + * + * @param {string} str - The string to create a TokenSet from. + * @returns {lunr.TokenSet} + */ +lunr.TokenSet.fromString = function (str) { + var node = new lunr.TokenSet, + root = node + + /* + * Iterates through all characters within the passed string + * appending a node for each character. + * + * When a wildcard character is found then a self + * referencing edge is introduced to continually match + * any number of any characters. + */ + for (var i = 0, len = str.length; i < len; i++) { + var char = str[i], + final = (i == len - 1) + + if (char == "*") { + node.edges[char] = node + node.final = final + + } else { + var next = new lunr.TokenSet + next.final = final + + node.edges[char] = next + node = next + } + } + + return root +} + +/** + * Converts this TokenSet into an array of strings + * contained within the TokenSet. + * + * This is not intended to be used on a TokenSet that + * contains wildcards, in these cases the results are + * undefined and are likely to cause an infinite loop. 
+ * + * @returns {string[]} + */ +lunr.TokenSet.prototype.toArray = function () { + var words = [] + + var stack = [{ + prefix: "", + node: this + }] + + while (stack.length) { + var frame = stack.pop(), + edges = Object.keys(frame.node.edges), + len = edges.length + + if (frame.node.final) { + /* In Safari, at this point the prefix is sometimes corrupted, see: + * https://github.com/olivernn/lunr.js/issues/279 Calling any + * String.prototype method forces Safari to "cast" this string to what + * it's supposed to be, fixing the bug. */ + frame.prefix.charAt(0) + words.push(frame.prefix) + } + + for (var i = 0; i < len; i++) { + var edge = edges[i] + + stack.push({ + prefix: frame.prefix.concat(edge), + node: frame.node.edges[edge] + }) + } + } + + return words +} + +/** + * Generates a string representation of a TokenSet. + * + * This is intended to allow TokenSets to be used as keys + * in objects, largely to aid the construction and minimisation + * of a TokenSet. As such it is not designed to be a human + * friendly representation of the TokenSet. + * + * @returns {string} + */ +lunr.TokenSet.prototype.toString = function () { + // NOTE: Using Object.keys here as this.edges is very likely + // to enter 'hash-mode' with many keys being added + // + // avoiding a for-in loop here as it leads to the function + // being de-optimised (at least in V8). From some simple + // benchmarks the performance is comparable, but allowing + // V8 to optimize may mean easy performance wins in the future. + + if (this._str) { + return this._str + } + + var str = this.final ? '1' : '0', + labels = Object.keys(this.edges).sort(), + len = labels.length + + for (var i = 0; i < len; i++) { + var label = labels[i], + node = this.edges[label] + + str = str + label + node.id + } + + return str +} + +/** + * Returns a new TokenSet that is the intersection of + * this TokenSet and the passed TokenSet. 
+ * + * This intersection will take into account any wildcards + * contained within the TokenSet. + * + * @param {lunr.TokenSet} b - An other TokenSet to intersect with. + * @returns {lunr.TokenSet} + */ +lunr.TokenSet.prototype.intersect = function (b) { + var output = new lunr.TokenSet, + frame = undefined + + var stack = [{ + qNode: b, + output: output, + node: this + }] + + while (stack.length) { + frame = stack.pop() + + // NOTE: As with the #toString method, we are using + // Object.keys and a for loop instead of a for-in loop + // as both of these objects enter 'hash' mode, causing + // the function to be de-optimised in V8 + var qEdges = Object.keys(frame.qNode.edges), + qLen = qEdges.length, + nEdges = Object.keys(frame.node.edges), + nLen = nEdges.length + + for (var q = 0; q < qLen; q++) { + var qEdge = qEdges[q] + + for (var n = 0; n < nLen; n++) { + var nEdge = nEdges[n] + + if (nEdge == qEdge || qEdge == '*') { + var node = frame.node.edges[nEdge], + qNode = frame.qNode.edges[qEdge], + final = node.final && qNode.final, + next = undefined + + if (nEdge in frame.output.edges) { + // an edge already exists for this character + // no need to create a new node, just set the finality + // bit unless this node is already final + next = frame.output.edges[nEdge] + next.final = next.final || final + + } else { + // no edge exists yet, must create one + // set the finality bit and insert it + // into the output + next = new lunr.TokenSet + next.final = final + frame.output.edges[nEdge] = next + } + + stack.push({ + qNode: qNode, + output: next, + node: node + }) + } + } + } + } + + return output +} +lunr.TokenSet.Builder = function () { + this.previousWord = "" + this.root = new lunr.TokenSet + this.uncheckedNodes = [] + this.minimizedNodes = {} +} + +lunr.TokenSet.Builder.prototype.insert = function (word) { + var node, + commonPrefix = 0 + + if (word < this.previousWord) { + throw new Error ("Out of order word insertion") + } + + for (var i = 0; i < 
word.length && i < this.previousWord.length; i++) { + if (word[i] != this.previousWord[i]) break + commonPrefix++ + } + + this.minimize(commonPrefix) + + if (this.uncheckedNodes.length == 0) { + node = this.root + } else { + node = this.uncheckedNodes[this.uncheckedNodes.length - 1].child + } + + for (var i = commonPrefix; i < word.length; i++) { + var nextNode = new lunr.TokenSet, + char = word[i] + + node.edges[char] = nextNode + + this.uncheckedNodes.push({ + parent: node, + char: char, + child: nextNode + }) + + node = nextNode + } + + node.final = true + this.previousWord = word +} + +lunr.TokenSet.Builder.prototype.finish = function () { + this.minimize(0) +} + +lunr.TokenSet.Builder.prototype.minimize = function (downTo) { + for (var i = this.uncheckedNodes.length - 1; i >= downTo; i--) { + var node = this.uncheckedNodes[i], + childKey = node.child.toString() + + if (childKey in this.minimizedNodes) { + node.parent.edges[node.char] = this.minimizedNodes[childKey] + } else { + // Cache the key for this node since + // we know it can't change anymore + node.child._str = childKey + + this.minimizedNodes[childKey] = node.child + } + + this.uncheckedNodes.pop() + } +} +/*! + * lunr.Index + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * An index contains the built index of all documents and provides a query interface + * to the index. + * + * Usually instances of lunr.Index will not be created using this constructor, instead + * lunr.Builder should be used to construct new indexes, or lunr.Index.load should be + * used to load previously built and serialized indexes. + * + * @constructor + * @param {Object} attrs - The attributes of the built search index. + * @param {Object} attrs.invertedIndex - An index of term/field to document reference. + * @param {Object} attrs.fieldVectors - Field vectors + * @param {lunr.TokenSet} attrs.tokenSet - An set of all corpus tokens. + * @param {string[]} attrs.fields - The names of indexed document fields. 
+ * @param {lunr.Pipeline} attrs.pipeline - The pipeline to use for search terms. + */ +lunr.Index = function (attrs) { + this.invertedIndex = attrs.invertedIndex + this.fieldVectors = attrs.fieldVectors + this.tokenSet = attrs.tokenSet + this.fields = attrs.fields + this.pipeline = attrs.pipeline +} + +/** + * A result contains details of a document matching a search query. + * @typedef {Object} lunr.Index~Result + * @property {string} ref - The reference of the document this result represents. + * @property {number} score - A number between 0 and 1 representing how similar this document is to the query. + * @property {lunr.MatchData} matchData - Contains metadata about this match including which term(s) caused the match. + */ + +/** + * Although lunr provides the ability to create queries using lunr.Query, it also provides a simple + * query language which itself is parsed into an instance of lunr.Query. + * + * For programmatically building queries it is advised to directly use lunr.Query, the query language + * is best used for human entered text rather than program generated text. + * + * At its simplest queries can just be a single term, e.g. `hello`, multiple terms are also supported + * and will be combined with OR, e.g `hello world` will match documents that contain either 'hello' + * or 'world', though those that contain both will rank higher in the results. + * + * Wildcards can be included in terms to match one or more unspecified characters, these wildcards can + * be inserted anywhere within the term, and more than one wildcard can exist in a single term. Adding + * wildcards will increase the number of documents that will be found but can also have a negative + * impact on query performance, especially with wildcards at the beginning of a term. + * + * Terms can be restricted to specific fields, e.g. `title:hello`, only documents with the term + * hello in the title field will match this query. 
Using a field not present in the index will lead + * to an error being thrown. + * + * Modifiers can also be added to terms, lunr supports edit distance and boost modifiers on terms. A term + * boost will make documents matching that term score higher, e.g. `foo^5`. Edit distance is also supported + * to provide fuzzy matching, e.g. 'hello~2' will match documents with hello with an edit distance of 2. + * Avoid large values for edit distance to improve query performance. + * + * Each term also supports a presence modifier. By default a term's presence in a document is optional, however + * this can be changed to either required or prohibited. For a term's presence to be required in a document the + * term should be prefixed with a '+', e.g. `+foo bar` is a search for documents that must contain 'foo' and + * optionally contain 'bar'. Conversely a leading '-' sets the term's presence to prohibited, i.e. it must not + * appear in a document, e.g. `-foo bar` is a search for documents that do not contain 'foo' but may contain 'bar'. + * + * To escape special characters the backslash character '\' can be used, this allows searches to include + * characters that would normally be considered modifiers, e.g. `foo\~2` will search for a term "foo~2" instead + * of attempting to apply an edit distance of 2 to the search term "foo". + * + * @typedef {string} lunr.Index~QueryString + * @example Simple single term query + * hello + * @example Multiple term query + * hello world + * @example term scoped to a field + * title:hello + * @example term with a boost of 10 + * hello^10 + * @example term with an edit distance of 2 + * hello~2 + * @example terms with presence modifiers + * -foo +bar baz + */ + +/** + * Performs a search against the index using lunr query syntax. + * + * Results will be returned sorted by their score, the most relevant results + * will be returned first. 
For details on how the score is calculated, please see + * the {@link https://lunrjs.com/guides/searching.html#scoring|guide}. + * + * For more programmatic querying use lunr.Index#query. + * + * @param {lunr.Index~QueryString} queryString - A string containing a lunr query. + * @throws {lunr.QueryParseError} If the passed query string cannot be parsed. + * @returns {lunr.Index~Result[]} + */ +lunr.Index.prototype.search = function (queryString) { + return this.query(function (query) { + var parser = new lunr.QueryParser(queryString, query) + parser.parse() + }) +} + +/** + * A query builder callback provides a query object to be used to express + * the query to perform on the index. + * + * @callback lunr.Index~queryBuilder + * @param {lunr.Query} query - The query object to build up. + * @this lunr.Query + */ + +/** + * Performs a query against the index using the yielded lunr.Query object. + * + * If performing programmatic queries against the index, this method is preferred + * over lunr.Index#search so as to avoid the additional query parsing overhead. + * + * A query object is yielded to the supplied function which should be used to + * express the query to be run against the index. + * + * Note that although this function takes a callback parameter it is _not_ an + * asynchronous operation, the callback is just yielded a query object to be + * customized. + * + * @param {lunr.Index~queryBuilder} fn - A function that is used to build the query. 
+ * @returns {lunr.Index~Result[]} + */ +lunr.Index.prototype.query = function (fn) { + // for each query clause + // * process terms + // * expand terms from token set + // * find matching documents and metadata + // * get document vectors + // * score documents + + var query = new lunr.Query(this.fields), + matchingFields = Object.create(null), + queryVectors = Object.create(null), + termFieldCache = Object.create(null), + requiredMatches = Object.create(null), + prohibitedMatches = Object.create(null) + + /* + * To support field level boosts a query vector is created per + * field. An empty vector is eagerly created to support negated + * queries. + */ + for (var i = 0; i < this.fields.length; i++) { + queryVectors[this.fields[i]] = new lunr.Vector + } + + fn.call(query, query) + + for (var i = 0; i < query.clauses.length; i++) { + /* + * Unless the pipeline has been disabled for this term, which is + * the case for terms with wildcards, we need to pass the clause + * term through the search pipeline. A pipeline returns an array + * of processed terms. Pipeline functions may expand the passed + * term, which means we may end up performing multiple index lookups + * for a single query term. + */ + var clause = query.clauses[i], + terms = null, + clauseMatches = lunr.Set.empty + + if (clause.usePipeline) { + terms = this.pipeline.runString(clause.term, { + fields: clause.fields + }) + } else { + terms = [clause.term] + } + + for (var m = 0; m < terms.length; m++) { + var term = terms[m] + + /* + * Each term returned from the pipeline needs to use the same query + * clause object, e.g. the same boost and or edit distance. The + * simplest way to do this is to re-use the clause object but mutate + * its term property. 
+ */ + clause.term = term + + /* + * From the term in the clause we create a token set which will then + * be used to intersect the indexes token set to get a list of terms + * to lookup in the inverted index + */ + var termTokenSet = lunr.TokenSet.fromClause(clause), + expandedTerms = this.tokenSet.intersect(termTokenSet).toArray() + + /* + * If a term marked as required does not exist in the tokenSet it is + * impossible for the search to return any matches. We set all the field + * scoped required matches set to empty and stop examining any further + * clauses. + */ + if (expandedTerms.length === 0 && clause.presence === lunr.Query.presence.REQUIRED) { + for (var k = 0; k < clause.fields.length; k++) { + var field = clause.fields[k] + requiredMatches[field] = lunr.Set.empty + } + + break + } + + for (var j = 0; j < expandedTerms.length; j++) { + /* + * For each term get the posting and termIndex, this is required for + * building the query vector. + */ + var expandedTerm = expandedTerms[j], + posting = this.invertedIndex[expandedTerm], + termIndex = posting._index + + for (var k = 0; k < clause.fields.length; k++) { + /* + * For each field that this query term is scoped by (by default + * all fields are in scope) we need to get all the document refs + * that have this term in that field. + * + * The posting is the entry in the invertedIndex for the matching + * term from above. + */ + var field = clause.fields[k], + fieldPosting = posting[field], + matchingDocumentRefs = Object.keys(fieldPosting), + termField = expandedTerm + "/" + field, + matchingDocumentsSet = new lunr.Set(matchingDocumentRefs) + + /* + * if the presence of this term is required ensure that the matching + * documents are added to the set of required matches for this clause. 
+ * + */ + if (clause.presence == lunr.Query.presence.REQUIRED) { + clauseMatches = clauseMatches.union(matchingDocumentsSet) + + if (requiredMatches[field] === undefined) { + requiredMatches[field] = lunr.Set.complete + } + } + + /* + * if the presence of this term is prohibited ensure that the matching + * documents are added to the set of prohibited matches for this field, + * creating that set if it does not yet exist. + */ + if (clause.presence == lunr.Query.presence.PROHIBITED) { + if (prohibitedMatches[field] === undefined) { + prohibitedMatches[field] = lunr.Set.empty + } + + prohibitedMatches[field] = prohibitedMatches[field].union(matchingDocumentsSet) + + /* + * Prohibited matches should not be part of the query vector used for + * similarity scoring and no metadata should be extracted so we continue + * to the next field + */ + continue + } + + /* + * The query field vector is populated using the termIndex found for + * the term and a unit value with the appropriate boost applied. + * Using upsert because there could already be an entry in the vector + * for the term we are working with. In that case we just add the scores + * together. 
+ */ + queryVectors[field].upsert(termIndex, clause.boost, function (a, b) { return a + b }) + + /** + * If we've already seen this term, field combo then we've already collected + * the matching documents and metadata, no need to go through all that again + */ + if (termFieldCache[termField]) { + continue + } + + for (var l = 0; l < matchingDocumentRefs.length; l++) { + /* + * All metadata for this term/field/document triple + * are then extracted and collected into an instance + * of lunr.MatchData ready to be returned in the query + * results + */ + var matchingDocumentRef = matchingDocumentRefs[l], + matchingFieldRef = new lunr.FieldRef (matchingDocumentRef, field), + metadata = fieldPosting[matchingDocumentRef], + fieldMatch + + if ((fieldMatch = matchingFields[matchingFieldRef]) === undefined) { + matchingFields[matchingFieldRef] = new lunr.MatchData (expandedTerm, field, metadata) + } else { + fieldMatch.add(expandedTerm, field, metadata) + } + + } + + termFieldCache[termField] = true + } + } + } + + /** + * If the presence was required we need to update the requiredMatches field sets. + * We do this after all fields for the term have collected their matches because + * the clause terms presence is required in _any_ of the fields not _all_ of the + * fields. 
+ */ + if (clause.presence === lunr.Query.presence.REQUIRED) { + for (var k = 0; k < clause.fields.length; k++) { + var field = clause.fields[k] + requiredMatches[field] = requiredMatches[field].intersect(clauseMatches) + } + } + } + + /** + * Need to combine the field scoped required and prohibited + * matching documents into a global set of required and prohibited + * matches + */ + var allRequiredMatches = lunr.Set.complete, + allProhibitedMatches = lunr.Set.empty + + for (var i = 0; i < this.fields.length; i++) { + var field = this.fields[i] + + if (requiredMatches[field]) { + allRequiredMatches = allRequiredMatches.intersect(requiredMatches[field]) + } + + if (prohibitedMatches[field]) { + allProhibitedMatches = allProhibitedMatches.union(prohibitedMatches[field]) + } + } + + var matchingFieldRefs = Object.keys(matchingFields), + results = [], + matches = Object.create(null) + + /* + * If the query is negated (contains only prohibited terms) + * we need to get _all_ fieldRefs currently existing in the + * index. This is only done when we know that the query is + * entirely prohibited terms to avoid any cost of getting all + * fieldRefs unnecessarily. + * + * Additionally, blank MatchData must be created to correctly + * populate the results. + */ + if (query.isNegated()) { + matchingFieldRefs = Object.keys(this.fieldVectors) + + for (var i = 0; i < matchingFieldRefs.length; i++) { + var matchingFieldRef = matchingFieldRefs[i] + var fieldRef = lunr.FieldRef.fromString(matchingFieldRef) + matchingFields[matchingFieldRef] = new lunr.MatchData + } + } + + for (var i = 0; i < matchingFieldRefs.length; i++) { + /* + * Currently we have document fields that match the query, but we + * need to return documents. The matchData and scores are combined + * from multiple fields belonging to the same document. + * + * Scores are calculated by field, using the query vectors created + * above, and combined into a final document score using addition. 
+ */ + var fieldRef = lunr.FieldRef.fromString(matchingFieldRefs[i]), + docRef = fieldRef.docRef + + if (!allRequiredMatches.contains(docRef)) { + continue + } + + if (allProhibitedMatches.contains(docRef)) { + continue + } + + var fieldVector = this.fieldVectors[fieldRef], + score = queryVectors[fieldRef.fieldName].similarity(fieldVector), + docMatch + + if ((docMatch = matches[docRef]) !== undefined) { + docMatch.score += score + docMatch.matchData.combine(matchingFields[fieldRef]) + } else { + var match = { + ref: docRef, + score: score, + matchData: matchingFields[fieldRef] + } + matches[docRef] = match + results.push(match) + } + } + + /* + * Sort the results objects by score, highest first. + */ + return results.sort(function (a, b) { + return b.score - a.score + }) +} + +/** + * Prepares the index for JSON serialization. + * + * The schema for this JSON blob will be described in a + * separate JSON schema file. + * + * @returns {Object} + */ +lunr.Index.prototype.toJSON = function () { + var invertedIndex = Object.keys(this.invertedIndex) + .sort() + .map(function (term) { + return [term, this.invertedIndex[term]] + }, this) + + var fieldVectors = Object.keys(this.fieldVectors) + .map(function (ref) { + return [ref, this.fieldVectors[ref].toJSON()] + }, this) + + return { + version: lunr.version, + fields: this.fields, + fieldVectors: fieldVectors, + invertedIndex: invertedIndex, + pipeline: this.pipeline.toJSON() + } +} + +/** + * Loads a previously serialized lunr.Index + * + * @param {Object} serializedIndex - A previously serialized lunr.Index + * @returns {lunr.Index} + */ +lunr.Index.load = function (serializedIndex) { + var attrs = {}, + fieldVectors = {}, + serializedVectors = serializedIndex.fieldVectors, + invertedIndex = Object.create(null), + serializedInvertedIndex = serializedIndex.invertedIndex, + tokenSetBuilder = new lunr.TokenSet.Builder, + pipeline = lunr.Pipeline.load(serializedIndex.pipeline) + + if (serializedIndex.version != 
lunr.version) { + lunr.utils.warn("Version mismatch when loading serialised index. Current version of lunr '" + lunr.version + "' does not match serialized index '" + serializedIndex.version + "'") + } + + for (var i = 0; i < serializedVectors.length; i++) { + var tuple = serializedVectors[i], + ref = tuple[0], + elements = tuple[1] + + fieldVectors[ref] = new lunr.Vector(elements) + } + + for (var i = 0; i < serializedInvertedIndex.length; i++) { + var tuple = serializedInvertedIndex[i], + term = tuple[0], + posting = tuple[1] + + tokenSetBuilder.insert(term) + invertedIndex[term] = posting + } + + tokenSetBuilder.finish() + + attrs.fields = serializedIndex.fields + + attrs.fieldVectors = fieldVectors + attrs.invertedIndex = invertedIndex + attrs.tokenSet = tokenSetBuilder.root + attrs.pipeline = pipeline + + return new lunr.Index(attrs) +} +/*! + * lunr.Builder + * Copyright (C) 2020 Oliver Nightingale + */ + +/** + * lunr.Builder performs indexing on a set of documents and + * returns instances of lunr.Index ready for querying. + * + * All configuration of the index is done via the builder, the + * fields to index, the document reference, the text processing + * pipeline and document scoring parameters are all set on the + * builder before indexing. + * + * @constructor + * @property {string} _ref - Internal reference to the document reference field. + * @property {string[]} _fields - Internal reference to the document fields to index. + * @property {object} invertedIndex - The inverted index maps terms to document fields. + * @property {object} documentTermFrequencies - Keeps track of document term frequencies. + * @property {object} documentLengths - Keeps track of the length of documents added to the index. + * @property {lunr.tokenizer} tokenizer - Function for splitting strings into tokens for indexing. + * @property {lunr.Pipeline} pipeline - The pipeline performs text processing on tokens before indexing. 
+ * @property {lunr.Pipeline} searchPipeline - A pipeline for processing search terms before querying the index. + * @property {number} documentCount - Keeps track of the total number of documents indexed. + * @property {number} _b - A parameter to control field length normalization, setting this to 0 disabled normalization, 1 fully normalizes field lengths, the default value is 0.75. + * @property {number} _k1 - A parameter to control how quickly an increase in term frequency results in term frequency saturation, the default value is 1.2. + * @property {number} termIndex - A counter incremented for each unique term, used to identify a terms position in the vector space. + * @property {array} metadataWhitelist - A list of metadata keys that have been whitelisted for entry in the index. + */ +lunr.Builder = function () { + this._ref = "id" + this._fields = Object.create(null) + this._documents = Object.create(null) + this.invertedIndex = Object.create(null) + this.fieldTermFrequencies = {} + this.fieldLengths = {} + this.tokenizer = lunr.tokenizer + this.pipeline = new lunr.Pipeline + this.searchPipeline = new lunr.Pipeline + this.documentCount = 0 + this._b = 0.75 + this._k1 = 1.2 + this.termIndex = 0 + this.metadataWhitelist = [] +} + +/** + * Sets the document field used as the document reference. Every document must have this field. + * The type of this field in the document should be a string, if it is not a string it will be + * coerced into a string by calling toString. + * + * The default ref is 'id'. + * + * The ref should _not_ be changed during indexing, it should be set before any documents are + * added to the index. Changing it during indexing can lead to inconsistent results. + * + * @param {string} ref - The name of the reference field in the document. + */ +lunr.Builder.prototype.ref = function (ref) { + this._ref = ref +} + +/** + * A function that is used to extract a field from a document. 
+ * + * Lunr expects a field to be at the top level of a document, if however the field + * is deeply nested within a document an extractor function can be used to extract + * the right field for indexing. + * + * @callback fieldExtractor + * @param {object} doc - The document being added to the index. + * @returns {?(string|object|object[])} obj - The object that will be indexed for this field. + * @example Extracting a nested field + * function (doc) { return doc.nested.field } + */ + +/** + * Adds a field to the list of document fields that will be indexed. Every document being + * indexed should have this field. Null values for this field in indexed documents will + * not cause errors but will limit the chance of that document being retrieved by searches. + * + * All fields should be added before adding documents to the index. Adding fields after + * a document has been indexed will have no effect on already indexed documents. + * + * Fields can be boosted at build time. This allows terms within that field to have more + * importance when ranking search results. Use a field boost to specify that matches within + * one field are more important than other fields. + * + * @param {string} fieldName - The name of a field to index in all documents. + * @param {object} attributes - Optional attributes associated with this field. + * @param {number} [attributes.boost=1] - Boost applied to all terms within this field. + * @param {fieldExtractor} [attributes.extractor] - Function to extract a field from a document. + * @throws {RangeError} fieldName cannot contain unsupported characters '/' + */ +lunr.Builder.prototype.field = function (fieldName, attributes) { + if (/\//.test(fieldName)) { + throw new RangeError ("Field '" + fieldName + "' contains illegal character '/'") + } + + this._fields[fieldName] = attributes || {} +} + +/** + * A parameter to tune the amount of field length normalisation that is applied when + * calculating relevance scores. 
A value of 0 will completely disable any normalisation + * and a value of 1 will fully normalise field lengths. The default is 0.75. Values of b + * will be clamped to the range 0 - 1. + * + * @param {number} number - The value to set for this tuning parameter. + */ +lunr.Builder.prototype.b = function (number) { + if (number < 0) { + this._b = 0 + } else if (number > 1) { + this._b = 1 + } else { + this._b = number + } +} + +/** + * A parameter that controls the speed at which a rise in term frequency results in term + * frequency saturation. The default value is 1.2. Setting this to a higher value will give + * slower saturation levels, a lower value will result in quicker saturation. + * + * @param {number} number - The value to set for this tuning parameter. + */ +lunr.Builder.prototype.k1 = function (number) { + this._k1 = number +} + +/** + * Adds a document to the index. + * + * Before adding fields to the index the index should have been fully setup, with the document + * ref and all fields to index already having been specified. + * + * The document must have a field name as specified by the ref (by default this is 'id') and + * it should have all fields defined for indexing, though null or undefined values will not + * cause errors. + * + * Entire documents can be boosted at build time. Applying a boost to a document indicates that + * this document should rank higher in search results than other documents. + * + * @param {object} doc - The document to add to the index. + * @param {object} attributes - Optional attributes associated with this document. + * @param {number} [attributes.boost=1] - Boost applied to all terms within this document. 
+ */ +lunr.Builder.prototype.add = function (doc, attributes) { + var docRef = doc[this._ref], + fields = Object.keys(this._fields) + + this._documents[docRef] = attributes || {} + this.documentCount += 1 + + for (var i = 0; i < fields.length; i++) { + var fieldName = fields[i], + extractor = this._fields[fieldName].extractor, + field = extractor ? extractor(doc) : doc[fieldName], + tokens = this.tokenizer(field, { + fields: [fieldName] + }), + terms = this.pipeline.run(tokens), + fieldRef = new lunr.FieldRef (docRef, fieldName), + fieldTerms = Object.create(null) + + this.fieldTermFrequencies[fieldRef] = fieldTerms + this.fieldLengths[fieldRef] = 0 + + // store the length of this field for this document + this.fieldLengths[fieldRef] += terms.length + + // calculate term frequencies for this field + for (var j = 0; j < terms.length; j++) { + var term = terms[j] + + if (fieldTerms[term] == undefined) { + fieldTerms[term] = 0 + } + + fieldTerms[term] += 1 + + // add to inverted index + // create an initial posting if one doesn't exist + if (this.invertedIndex[term] == undefined) { + var posting = Object.create(null) + posting["_index"] = this.termIndex + this.termIndex += 1 + + for (var k = 0; k < fields.length; k++) { + posting[fields[k]] = Object.create(null) + } + + this.invertedIndex[term] = posting + } + + // add an entry for this term/fieldName/docRef to the invertedIndex + if (this.invertedIndex[term][fieldName][docRef] == undefined) { + this.invertedIndex[term][fieldName][docRef] = Object.create(null) + } + + // store all whitelisted metadata about this token in the + // inverted index + for (var l = 0; l < this.metadataWhitelist.length; l++) { + var metadataKey = this.metadataWhitelist[l], + metadata = term.metadata[metadataKey] + + if (this.invertedIndex[term][fieldName][docRef][metadataKey] == undefined) { + this.invertedIndex[term][fieldName][docRef][metadataKey] = [] + } + + this.invertedIndex[term][fieldName][docRef][metadataKey].push(metadata) + } + } 
+ + } +} + +/** + * Calculates the average document length for this index + * + * @private + */ +lunr.Builder.prototype.calculateAverageFieldLengths = function () { + + var fieldRefs = Object.keys(this.fieldLengths), + numberOfFields = fieldRefs.length, + accumulator = {}, + documentsWithField = {} + + for (var i = 0; i < numberOfFields; i++) { + var fieldRef = lunr.FieldRef.fromString(fieldRefs[i]), + field = fieldRef.fieldName + + documentsWithField[field] || (documentsWithField[field] = 0) + documentsWithField[field] += 1 + + accumulator[field] || (accumulator[field] = 0) + accumulator[field] += this.fieldLengths[fieldRef] + } + + var fields = Object.keys(this._fields) + + for (var i = 0; i < fields.length; i++) { + var fieldName = fields[i] + accumulator[fieldName] = accumulator[fieldName] / documentsWithField[fieldName] + } + + this.averageFieldLength = accumulator +} + +/** + * Builds a vector space model of every document using lunr.Vector + * + * @private + */ +lunr.Builder.prototype.createFieldVectors = function () { + var fieldVectors = {}, + fieldRefs = Object.keys(this.fieldTermFrequencies), + fieldRefsLength = fieldRefs.length, + termIdfCache = Object.create(null) + + for (var i = 0; i < fieldRefsLength; i++) { + var fieldRef = lunr.FieldRef.fromString(fieldRefs[i]), + fieldName = fieldRef.fieldName, + fieldLength = this.fieldLengths[fieldRef], + fieldVector = new lunr.Vector, + termFrequencies = this.fieldTermFrequencies[fieldRef], + terms = Object.keys(termFrequencies), + termsLength = terms.length + + + var fieldBoost = this._fields[fieldName].boost || 1, + docBoost = this._documents[fieldRef.docRef].boost || 1 + + for (var j = 0; j < termsLength; j++) { + var term = terms[j], + tf = termFrequencies[term], + termIndex = this.invertedIndex[term]._index, + idf, score, scoreWithPrecision + + if (termIdfCache[term] === undefined) { + idf = lunr.idf(this.invertedIndex[term], this.documentCount) + termIdfCache[term] = idf + } else { + idf = 
termIdfCache[term] + } + + score = idf * ((this._k1 + 1) * tf) / (this._k1 * (1 - this._b + this._b * (fieldLength / this.averageFieldLength[fieldName])) + tf) + score *= fieldBoost + score *= docBoost + scoreWithPrecision = Math.round(score * 1000) / 1000 + // Converts 1.23456789 to 1.234. + // Reducing the precision so that the vectors take up less + // space when serialised. Doing it now so that they behave + // the same before and after serialisation. Also, this is + // the fastest approach to reducing a number's precision in + // JavaScript. + + fieldVector.insert(termIndex, scoreWithPrecision) + } + + fieldVectors[fieldRef] = fieldVector + } + + this.fieldVectors = fieldVectors +} + +/** + * Creates a token set of all tokens in the index using lunr.TokenSet + * + * @private + */ +lunr.Builder.prototype.createTokenSet = function () { + this.tokenSet = lunr.TokenSet.fromArray( + Object.keys(this.invertedIndex).sort() + ) +} + +/** + * Builds the index, creating an instance of lunr.Index. + * + * This completes the indexing process and should only be called + * once all documents have been added to the index. + * + * @returns {lunr.Index} + */ +lunr.Builder.prototype.build = function () { + this.calculateAverageFieldLengths() + this.createFieldVectors() + this.createTokenSet() + + return new lunr.Index({ + invertedIndex: this.invertedIndex, + fieldVectors: this.fieldVectors, + tokenSet: this.tokenSet, + fields: Object.keys(this._fields), + pipeline: this.searchPipeline + }) +} + +/** + * Applies a plugin to the index builder. + * + * A plugin is a function that is called with the index builder as its context. + * Plugins can be used to customise or extend the behaviour of the index + * in some way. A plugin is just a function, that encapsulated the custom + * behaviour that should be applied when building the index. + * + * The plugin function will be called with the index builder as its argument, additional + * arguments can also be passed when calling use. 
The function will be called + * with the index builder as its context. + * + * @param {Function} plugin The plugin to apply. + */ +lunr.Builder.prototype.use = function (fn) { + var args = Array.prototype.slice.call(arguments, 1) + args.unshift(this) + fn.apply(this, args) +} +/** + * Contains and collects metadata about a matching document. + * A single instance of lunr.MatchData is returned as part of every + * lunr.Index~Result. + * + * @constructor + * @param {string} term - The term this match data is associated with + * @param {string} field - The field in which the term was found + * @param {object} metadata - The metadata recorded about this term in this field + * @property {object} metadata - A cloned collection of metadata associated with this document. + * @see {@link lunr.Index~Result} + */ +lunr.MatchData = function (term, field, metadata) { + var clonedMetadata = Object.create(null), + metadataKeys = Object.keys(metadata || {}) + + // Cloning the metadata to prevent the original + // being mutated during match data combination. + // Metadata is kept in an array within the inverted + // index so cloning the data can be done with + // Array#slice + for (var i = 0; i < metadataKeys.length; i++) { + var key = metadataKeys[i] + clonedMetadata[key] = metadata[key].slice() + } + + this.metadata = Object.create(null) + + if (term !== undefined) { + this.metadata[term] = Object.create(null) + this.metadata[term][field] = clonedMetadata + } +} + +/** + * An instance of lunr.MatchData will be created for every term that matches a + * document. However only one instance is required in a lunr.Index~Result. This + * method combines metadata from another instance of lunr.MatchData with this + * objects metadata. + * + * @param {lunr.MatchData} otherMatchData - Another instance of match data to merge with this one. 
+ * @see {@link lunr.Index~Result} + */ +lunr.MatchData.prototype.combine = function (otherMatchData) { + var terms = Object.keys(otherMatchData.metadata) + + for (var i = 0; i < terms.length; i++) { + var term = terms[i], + fields = Object.keys(otherMatchData.metadata[term]) + + if (this.metadata[term] == undefined) { + this.metadata[term] = Object.create(null) + } + + for (var j = 0; j < fields.length; j++) { + var field = fields[j], + keys = Object.keys(otherMatchData.metadata[term][field]) + + if (this.metadata[term][field] == undefined) { + this.metadata[term][field] = Object.create(null) + } + + for (var k = 0; k < keys.length; k++) { + var key = keys[k] + + if (this.metadata[term][field][key] == undefined) { + this.metadata[term][field][key] = otherMatchData.metadata[term][field][key] + } else { + this.metadata[term][field][key] = this.metadata[term][field][key].concat(otherMatchData.metadata[term][field][key]) + } + + } + } + } +} + +/** + * Add metadata for a term/field pair to this instance of match data. 
+ * + * @param {string} term - The term this match data is associated with + * @param {string} field - The field in which the term was found + * @param {object} metadata - The metadata recorded about this term in this field + */ +lunr.MatchData.prototype.add = function (term, field, metadata) { + if (!(term in this.metadata)) { + this.metadata[term] = Object.create(null) + this.metadata[term][field] = metadata + return + } + + if (!(field in this.metadata[term])) { + this.metadata[term][field] = metadata + return + } + + var metadataKeys = Object.keys(metadata) + + for (var i = 0; i < metadataKeys.length; i++) { + var key = metadataKeys[i] + + if (key in this.metadata[term][field]) { + this.metadata[term][field][key] = this.metadata[term][field][key].concat(metadata[key]) + } else { + this.metadata[term][field][key] = metadata[key] + } + } +} +/** + * A lunr.Query provides a programmatic way of defining queries to be performed + * against a {@link lunr.Index}. + * + * Prefer constructing a lunr.Query using the {@link lunr.Index#query} method + * so the query object is pre-initialized with the right index fields. + * + * @constructor + * @property {lunr.Query~Clause[]} clauses - An array of query clauses. + * @property {string[]} allFields - An array of all available fields in a lunr.Index. + */ +lunr.Query = function (allFields) { + this.clauses = [] + this.allFields = allFields +} + +/** + * Constants for indicating what kind of automatic wildcard insertion will be used when constructing a query clause. + * + * This allows wildcards to be added to the beginning and end of a term without having to manually do any string + * concatenation. + * + * The wildcard constants can be bitwise combined to select both leading and trailing wildcards. 
+ * + * @constant + * @default + * @property {number} wildcard.NONE - The term will have no wildcards inserted, this is the default behaviour + * @property {number} wildcard.LEADING - Prepend the term with a wildcard, unless a leading wildcard already exists + * @property {number} wildcard.TRAILING - Append a wildcard to the term, unless a trailing wildcard already exists + * @see lunr.Query~Clause + * @see lunr.Query#clause + * @see lunr.Query#term + * @example query term with trailing wildcard + * query.term('foo', { wildcard: lunr.Query.wildcard.TRAILING }) + * @example query term with leading and trailing wildcard + * query.term('foo', { + * wildcard: lunr.Query.wildcard.LEADING | lunr.Query.wildcard.TRAILING + * }) + */ + +lunr.Query.wildcard = new String ("*") +lunr.Query.wildcard.NONE = 0 +lunr.Query.wildcard.LEADING = 1 +lunr.Query.wildcard.TRAILING = 2 + +/** + * Constants for indicating what kind of presence a term must have in matching documents. + * + * @constant + * @enum {number} + * @see lunr.Query~Clause + * @see lunr.Query#clause + * @see lunr.Query#term + * @example query term with required presence + * query.term('foo', { presence: lunr.Query.presence.REQUIRED }) + */ +lunr.Query.presence = { + /** + * Term's presence in a document is optional, this is the default value. + */ + OPTIONAL: 1, + + /** + * Term's presence in a document is required, documents that do not contain + * this term will not be returned. + */ + REQUIRED: 2, + + /** + * Term's presence in a document is prohibited, documents that do contain + * this term will not be returned. + */ + PROHIBITED: 3 +} + +/** + * A single clause in a {@link lunr.Query} contains a term and details on how to + * match that term against a {@link lunr.Index}. + * + * @typedef {Object} lunr.Query~Clause + * @property {string[]} fields - The fields in an index this clause should be matched against. + * @property {number} [boost=1] - Any boost that should be applied when matching this clause. 
+ * @property {number} [editDistance] - Whether the term should have fuzzy matching applied, and how fuzzy the match should be. + * @property {boolean} [usePipeline] - Whether the term should be passed through the search pipeline. + * @property {number} [wildcard=lunr.Query.wildcard.NONE] - Whether the term should have wildcards appended or prepended. + * @property {number} [presence=lunr.Query.presence.OPTIONAL] - The terms presence in any matching documents. + */ + +/** + * Adds a {@link lunr.Query~Clause} to this query. + * + * Unless the clause contains the fields to be matched all fields will be matched. In addition + * a default boost of 1 is applied to the clause. + * + * @param {lunr.Query~Clause} clause - The clause to add to this query. + * @see lunr.Query~Clause + * @returns {lunr.Query} + */ +lunr.Query.prototype.clause = function (clause) { + if (!('fields' in clause)) { + clause.fields = this.allFields + } + + if (!('boost' in clause)) { + clause.boost = 1 + } + + if (!('usePipeline' in clause)) { + clause.usePipeline = true + } + + if (!('wildcard' in clause)) { + clause.wildcard = lunr.Query.wildcard.NONE + } + + if ((clause.wildcard & lunr.Query.wildcard.LEADING) && (clause.term.charAt(0) != lunr.Query.wildcard)) { + clause.term = "*" + clause.term + } + + if ((clause.wildcard & lunr.Query.wildcard.TRAILING) && (clause.term.slice(-1) != lunr.Query.wildcard)) { + clause.term = "" + clause.term + "*" + } + + if (!('presence' in clause)) { + clause.presence = lunr.Query.presence.OPTIONAL + } + + this.clauses.push(clause) + + return this +} + +/** + * A negated query is one in which every clause has a presence of + * prohibited. These queries require some special processing to return + * the expected results. 
+ * + * @returns boolean + */ +lunr.Query.prototype.isNegated = function () { + for (var i = 0; i < this.clauses.length; i++) { + if (this.clauses[i].presence != lunr.Query.presence.PROHIBITED) { + return false + } + } + + return true +} + +/** + * Adds a term to the current query, under the covers this will create a {@link lunr.Query~Clause} + * to the list of clauses that make up this query. + * + * The term is used as is, i.e. no tokenization will be performed by this method. Instead conversion + * to a token or token-like string should be done before calling this method. + * + * The term will be converted to a string by calling `toString`. Multiple terms can be passed as an + * array, each term in the array will share the same options. + * + * @param {object|object[]} term - The term(s) to add to the query. + * @param {object} [options] - Any additional properties to add to the query clause. + * @returns {lunr.Query} + * @see lunr.Query#clause + * @see lunr.Query~Clause + * @example adding a single term to a query + * query.term("foo") + * @example adding a single term to a query and specifying search fields, term boost and automatic trailing wildcard + * query.term("foo", { + * fields: ["title"], + * boost: 10, + * wildcard: lunr.Query.wildcard.TRAILING + * }) + * @example using lunr.tokenizer to convert a string to tokens before using them as terms + * query.term(lunr.tokenizer("foo bar")) + */ +lunr.Query.prototype.term = function (term, options) { + if (Array.isArray(term)) { + term.forEach(function (t) { this.term(t, lunr.utils.clone(options)) }, this) + return this + } + + var clause = options || {} + clause.term = term.toString() + + this.clause(clause) + + return this +} +lunr.QueryParseError = function (message, start, end) { + this.name = "QueryParseError" + this.message = message + this.start = start + this.end = end +} + +lunr.QueryParseError.prototype = new Error +lunr.QueryLexer = function (str) { + this.lexemes = [] + this.str = str + this.length 
= str.length + this.pos = 0 + this.start = 0 + this.escapeCharPositions = [] +} + +lunr.QueryLexer.prototype.run = function () { + var state = lunr.QueryLexer.lexText + + while (state) { + state = state(this) + } +} + +lunr.QueryLexer.prototype.sliceString = function () { + var subSlices = [], + sliceStart = this.start, + sliceEnd = this.pos + + for (var i = 0; i < this.escapeCharPositions.length; i++) { + sliceEnd = this.escapeCharPositions[i] + subSlices.push(this.str.slice(sliceStart, sliceEnd)) + sliceStart = sliceEnd + 1 + } + + subSlices.push(this.str.slice(sliceStart, this.pos)) + this.escapeCharPositions.length = 0 + + return subSlices.join('') +} + +lunr.QueryLexer.prototype.emit = function (type) { + this.lexemes.push({ + type: type, + str: this.sliceString(), + start: this.start, + end: this.pos + }) + + this.start = this.pos +} + +lunr.QueryLexer.prototype.escapeCharacter = function () { + this.escapeCharPositions.push(this.pos - 1) + this.pos += 1 +} + +lunr.QueryLexer.prototype.next = function () { + if (this.pos >= this.length) { + return lunr.QueryLexer.EOS + } + + var char = this.str.charAt(this.pos) + this.pos += 1 + return char +} + +lunr.QueryLexer.prototype.width = function () { + return this.pos - this.start +} + +lunr.QueryLexer.prototype.ignore = function () { + if (this.start == this.pos) { + this.pos += 1 + } + + this.start = this.pos +} + +lunr.QueryLexer.prototype.backup = function () { + this.pos -= 1 +} + +lunr.QueryLexer.prototype.acceptDigitRun = function () { + var char, charCode + + do { + char = this.next() + charCode = char.charCodeAt(0) + } while (charCode > 47 && charCode < 58) + + if (char != lunr.QueryLexer.EOS) { + this.backup() + } +} + +lunr.QueryLexer.prototype.more = function () { + return this.pos < this.length +} + +lunr.QueryLexer.EOS = 'EOS' +lunr.QueryLexer.FIELD = 'FIELD' +lunr.QueryLexer.TERM = 'TERM' +lunr.QueryLexer.EDIT_DISTANCE = 'EDIT_DISTANCE' +lunr.QueryLexer.BOOST = 'BOOST' +lunr.QueryLexer.PRESENCE = 
'PRESENCE' + +lunr.QueryLexer.lexField = function (lexer) { + lexer.backup() + lexer.emit(lunr.QueryLexer.FIELD) + lexer.ignore() + return lunr.QueryLexer.lexText +} + +lunr.QueryLexer.lexTerm = function (lexer) { + if (lexer.width() > 1) { + lexer.backup() + lexer.emit(lunr.QueryLexer.TERM) + } + + lexer.ignore() + + if (lexer.more()) { + return lunr.QueryLexer.lexText + } +} + +lunr.QueryLexer.lexEditDistance = function (lexer) { + lexer.ignore() + lexer.acceptDigitRun() + lexer.emit(lunr.QueryLexer.EDIT_DISTANCE) + return lunr.QueryLexer.lexText +} + +lunr.QueryLexer.lexBoost = function (lexer) { + lexer.ignore() + lexer.acceptDigitRun() + lexer.emit(lunr.QueryLexer.BOOST) + return lunr.QueryLexer.lexText +} + +lunr.QueryLexer.lexEOS = function (lexer) { + if (lexer.width() > 0) { + lexer.emit(lunr.QueryLexer.TERM) + } +} + +// This matches the separator used when tokenising fields +// within a document. These should match otherwise it is +// not possible to search for some tokens within a document. +// +// It is possible for the user to change the separator on the +// tokenizer so it _might_ clash with any other of the special +// characters already used within the search string, e.g. :. +// +// This means that it is possible to change the separator in +// such a way that makes some words unsearchable using a search +// string. 
+lunr.QueryLexer.termSeparator = lunr.tokenizer.separator + +lunr.QueryLexer.lexText = function (lexer) { + while (true) { + var char = lexer.next() + + if (char == lunr.QueryLexer.EOS) { + return lunr.QueryLexer.lexEOS + } + + // Escape character is '\' + if (char.charCodeAt(0) == 92) { + lexer.escapeCharacter() + continue + } + + if (char == ":") { + return lunr.QueryLexer.lexField + } + + if (char == "~") { + lexer.backup() + if (lexer.width() > 0) { + lexer.emit(lunr.QueryLexer.TERM) + } + return lunr.QueryLexer.lexEditDistance + } + + if (char == "^") { + lexer.backup() + if (lexer.width() > 0) { + lexer.emit(lunr.QueryLexer.TERM) + } + return lunr.QueryLexer.lexBoost + } + + // "+" indicates term presence is required + // checking for length to ensure that only + // leading "+" are considered + if (char == "+" && lexer.width() === 1) { + lexer.emit(lunr.QueryLexer.PRESENCE) + return lunr.QueryLexer.lexText + } + + // "-" indicates term presence is prohibited + // checking for length to ensure that only + // leading "-" are considered + if (char == "-" && lexer.width() === 1) { + lexer.emit(lunr.QueryLexer.PRESENCE) + return lunr.QueryLexer.lexText + } + + if (char.match(lunr.QueryLexer.termSeparator)) { + return lunr.QueryLexer.lexTerm + } + } +} + +lunr.QueryParser = function (str, query) { + this.lexer = new lunr.QueryLexer (str) + this.query = query + this.currentClause = {} + this.lexemeIdx = 0 +} + +lunr.QueryParser.prototype.parse = function () { + this.lexer.run() + this.lexemes = this.lexer.lexemes + + var state = lunr.QueryParser.parseClause + + while (state) { + state = state(this) + } + + return this.query +} + +lunr.QueryParser.prototype.peekLexeme = function () { + return this.lexemes[this.lexemeIdx] +} + +lunr.QueryParser.prototype.consumeLexeme = function () { + var lexeme = this.peekLexeme() + this.lexemeIdx += 1 + return lexeme +} + +lunr.QueryParser.prototype.nextClause = function () { + var completedClause = this.currentClause + 
this.query.clause(completedClause) + this.currentClause = {} +} + +lunr.QueryParser.parseClause = function (parser) { + var lexeme = parser.peekLexeme() + + if (lexeme == undefined) { + return + } + + switch (lexeme.type) { + case lunr.QueryLexer.PRESENCE: + return lunr.QueryParser.parsePresence + case lunr.QueryLexer.FIELD: + return lunr.QueryParser.parseField + case lunr.QueryLexer.TERM: + return lunr.QueryParser.parseTerm + default: + var errorMessage = "expected either a field or a term, found " + lexeme.type + + if (lexeme.str.length >= 1) { + errorMessage += " with value '" + lexeme.str + "'" + } + + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } +} + +lunr.QueryParser.parsePresence = function (parser) { + var lexeme = parser.consumeLexeme() + + if (lexeme == undefined) { + return + } + + switch (lexeme.str) { + case "-": + parser.currentClause.presence = lunr.Query.presence.PROHIBITED + break + case "+": + parser.currentClause.presence = lunr.Query.presence.REQUIRED + break + default: + var errorMessage = "unrecognised presence operator'" + lexeme.str + "'" + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + var nextLexeme = parser.peekLexeme() + + if (nextLexeme == undefined) { + var errorMessage = "expecting term or field, found nothing" + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + switch (nextLexeme.type) { + case lunr.QueryLexer.FIELD: + return lunr.QueryParser.parseField + case lunr.QueryLexer.TERM: + return lunr.QueryParser.parseTerm + default: + var errorMessage = "expecting term or field, found '" + nextLexeme.type + "'" + throw new lunr.QueryParseError (errorMessage, nextLexeme.start, nextLexeme.end) + } +} + +lunr.QueryParser.parseField = function (parser) { + var lexeme = parser.consumeLexeme() + + if (lexeme == undefined) { + return + } + + if (parser.query.allFields.indexOf(lexeme.str) == -1) { + var possibleFields = 
parser.query.allFields.map(function (f) { return "'" + f + "'" }).join(', '), + errorMessage = "unrecognised field '" + lexeme.str + "', possible fields: " + possibleFields + + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + parser.currentClause.fields = [lexeme.str] + + var nextLexeme = parser.peekLexeme() + + if (nextLexeme == undefined) { + var errorMessage = "expecting term, found nothing" + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + switch (nextLexeme.type) { + case lunr.QueryLexer.TERM: + return lunr.QueryParser.parseTerm + default: + var errorMessage = "expecting term, found '" + nextLexeme.type + "'" + throw new lunr.QueryParseError (errorMessage, nextLexeme.start, nextLexeme.end) + } +} + +lunr.QueryParser.parseTerm = function (parser) { + var lexeme = parser.consumeLexeme() + + if (lexeme == undefined) { + return + } + + parser.currentClause.term = lexeme.str.toLowerCase() + + if (lexeme.str.indexOf("*") != -1) { + parser.currentClause.usePipeline = false + } + + var nextLexeme = parser.peekLexeme() + + if (nextLexeme == undefined) { + parser.nextClause() + return + } + + switch (nextLexeme.type) { + case lunr.QueryLexer.TERM: + parser.nextClause() + return lunr.QueryParser.parseTerm + case lunr.QueryLexer.FIELD: + parser.nextClause() + return lunr.QueryParser.parseField + case lunr.QueryLexer.EDIT_DISTANCE: + return lunr.QueryParser.parseEditDistance + case lunr.QueryLexer.BOOST: + return lunr.QueryParser.parseBoost + case lunr.QueryLexer.PRESENCE: + parser.nextClause() + return lunr.QueryParser.parsePresence + default: + var errorMessage = "Unexpected lexeme type '" + nextLexeme.type + "'" + throw new lunr.QueryParseError (errorMessage, nextLexeme.start, nextLexeme.end) + } +} + +lunr.QueryParser.parseEditDistance = function (parser) { + var lexeme = parser.consumeLexeme() + + if (lexeme == undefined) { + return + } + + var editDistance = parseInt(lexeme.str, 10) + + if 
(isNaN(editDistance)) { + var errorMessage = "edit distance must be numeric" + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + parser.currentClause.editDistance = editDistance + + var nextLexeme = parser.peekLexeme() + + if (nextLexeme == undefined) { + parser.nextClause() + return + } + + switch (nextLexeme.type) { + case lunr.QueryLexer.TERM: + parser.nextClause() + return lunr.QueryParser.parseTerm + case lunr.QueryLexer.FIELD: + parser.nextClause() + return lunr.QueryParser.parseField + case lunr.QueryLexer.EDIT_DISTANCE: + return lunr.QueryParser.parseEditDistance + case lunr.QueryLexer.BOOST: + return lunr.QueryParser.parseBoost + case lunr.QueryLexer.PRESENCE: + parser.nextClause() + return lunr.QueryParser.parsePresence + default: + var errorMessage = "Unexpected lexeme type '" + nextLexeme.type + "'" + throw new lunr.QueryParseError (errorMessage, nextLexeme.start, nextLexeme.end) + } +} + +lunr.QueryParser.parseBoost = function (parser) { + var lexeme = parser.consumeLexeme() + + if (lexeme == undefined) { + return + } + + var boost = parseInt(lexeme.str, 10) + + if (isNaN(boost)) { + var errorMessage = "boost must be numeric" + throw new lunr.QueryParseError (errorMessage, lexeme.start, lexeme.end) + } + + parser.currentClause.boost = boost + + var nextLexeme = parser.peekLexeme() + + if (nextLexeme == undefined) { + parser.nextClause() + return + } + + switch (nextLexeme.type) { + case lunr.QueryLexer.TERM: + parser.nextClause() + return lunr.QueryParser.parseTerm + case lunr.QueryLexer.FIELD: + parser.nextClause() + return lunr.QueryParser.parseField + case lunr.QueryLexer.EDIT_DISTANCE: + return lunr.QueryParser.parseEditDistance + case lunr.QueryLexer.BOOST: + return lunr.QueryParser.parseBoost + case lunr.QueryLexer.PRESENCE: + parser.nextClause() + return lunr.QueryParser.parsePresence + default: + var errorMessage = "Unexpected lexeme type '" + nextLexeme.type + "'" + throw new lunr.QueryParseError (errorMessage, 
nextLexeme.start, nextLexeme.end) + } +} + + /** + * export the module via AMD, CommonJS or as a browser global + * Export code from https://github.com/umdjs/umd/blob/master/returnExports.js + */ + ;(function (root, factory) { + if (typeof define === 'function' && define.amd) { + // AMD. Register as an anonymous module. + define(factory) + } else if (typeof exports === 'object') { + /** + * Node. Does not work with strict CommonJS, but + * only CommonJS-like environments that support module.exports, + * like Node. + */ + module.exports = factory() + } else { + // Browser globals (root is window) + root.lunr = factory() + } + }(this, function () { + /** + * Just return a value to define the module export. + * This example returns an object, but the module + * can return a function as the exported value. + */ + return lunr + })) +})(); diff --git a/site/search/main.js b/site/search/main.js new file mode 100644 index 0000000..a5e469d --- /dev/null +++ b/site/search/main.js @@ -0,0 +1,109 @@ +function getSearchTermFromLocation() { + var sPageURL = window.location.search.substring(1); + var sURLVariables = sPageURL.split('&'); + for (var i = 0; i < sURLVariables.length; i++) { + var sParameterName = sURLVariables[i].split('='); + if (sParameterName[0] == 'q') { + return decodeURIComponent(sParameterName[1].replace(/\+/g, '%20')); + } + } +} + +function joinUrl (base, path) { + if (path.substring(0, 1) === "/") { + // path starts with `/`. Thus it is absolute. 
+ return path; + } + if (base.substring(base.length-1) === "/") { + // base ends with `/` + return base + path; + } + return base + "/" + path; +} + +function escapeHtml (value) { + return value.replace(/&/g, '&amp;') + .replace(/"/g, '&quot;') + .replace(/</g, '&lt;') + .replace(/>/g, '&gt;'); +} + +function formatResult (location, title, summary) { + return '<article><h3><a href="' + escapeHtml(joinUrl(base_url, location)) + '">' + escapeHtml(title) + '</a></h3><p>' + escapeHtml(summary) + '</p></article>'; +} + +function displayResults (results) { + var search_results = document.getElementById("mkdocs-search-results"); + while (search_results.firstChild) { + search_results.removeChild(search_results.firstChild); + } + if (results.length > 0){ + for (var i=0; i < results.length; i++){ + var result = results[i]; + var html = formatResult(result.location, result.title, result.summary); + search_results.insertAdjacentHTML('beforeend', html); + } + } else { + var noResultsText = search_results.getAttribute('data-no-results-text'); + if (!noResultsText) { + noResultsText = "No results found"; + } + search_results.insertAdjacentHTML('beforeend', '<p>' + noResultsText + '</p>
'); + } +} + +function doSearch () { + var query = document.getElementById('mkdocs-search-query').value; + if (query.length > min_search_length) { + if (!window.Worker) { + displayResults(search(query)); + } else { + searchWorker.postMessage({query: query}); + } + } else { + // Clear results for short queries + displayResults([]); + } +} + +function initSearch () { + var search_input = document.getElementById('mkdocs-search-query'); + if (search_input) { + search_input.addEventListener("keyup", doSearch); + } + var term = getSearchTermFromLocation(); + if (term) { + search_input.value = term; + doSearch(); + } +} + +function onWorkerMessage (e) { + if (e.data.allowSearch) { + initSearch(); + } else if (e.data.results) { + var results = e.data.results; + displayResults(results); + } else if (e.data.config) { + min_search_length = e.data.config.min_search_length-1; + } +} + +if (!window.Worker) { + console.log('Web Worker API not supported'); + // load index in main thread + $.getScript(joinUrl(base_url, "search/worker.js")).done(function () { + console.log('Loaded worker'); + init(); + window.postMessage = function (msg) { + onWorkerMessage({data: msg}); + }; + }).fail(function (jqxhr, settings, exception) { + console.error('Could not load worker.js'); + }); +} else { + // Wrap search in a web worker + var searchWorker = new Worker(joinUrl(base_url, "search/worker.js")); + searchWorker.postMessage({init: true}); + searchWorker.onmessage = onWorkerMessage; +} diff --git a/site/search/search_index.json b/site/search/search_index.json new file mode 100644 index 0000000..9ccbead --- /dev/null +++ b/site/search/search_index.json @@ -0,0 +1 @@ +{"config":{"indexing":"full","lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"Welcome to the Ensemblex documentation! Ensemblex is an accuracy-weighted ensemble framework for genetic demultiplexing of pooled single-cell RNA sequencing (scRNAseq) data. 
By addressing the limitations of individual genetic demultiplexing tools, we demonstrated that Ensemblex: Achieves higher demultiplexing accuracy Limits the introduction of technical noise into scRNAseq analysis Retains a high proportion of cells for downstream analyses. The ensemble method capitalizes on the added confidence of combining distinct statistical frameworks for genetic demultiplexing, but the modular algorithm can adapt to the overall performance of its constituent tools on the respective dataset, making it resilient against a poorly performing constituent tool. Ensemblex can be used to demultiplex pools with or without prior genotype information. When demultiplexing with prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools: Demuxalot ( Rogozhnikov et al. ) Demuxlet ( Kang et al. ) Souporcell ( Heaton et al. ) Vireo-GT ( Huang et al. ) When demultiplexing without prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools: Demuxalot ( Rogozhnikov et al. ) Freemuxlet ( Kang et al. ) Souporcell ( Heaton et al. ) Vireo ( Huang et al. ) Upon demultiplexing pools with each of the four constituent genetic demultiplexing tools, Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools: Step 1 : Probabilistic-weighted ensemble Step 2 : Graph-based doublet detection Step 3 : Ensemble-independent doublet detection As output, Ensemblex returns its own cell-specific sample labels and corresponding assignment probabilities and singlet confidence score, as well as the sample labels and corresponding assignment probabilities for each of its constituents. The demultiplexed sample labels can then be used to perform downstream analyses. Figure 1. Overview of the Ensemblex workflow. 
A) The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework. B) The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection. C) As output, Ensemblex returns its own sample-cell assignments as well as the sample-cell assignments of each of its constituent tools. D) Ensemblex's sample-cell assignments can be used to perform downstream analysis on the pooled scRNAseq data. To facilitate the application of Ensemblex, we provide a pipeline that demultiplexes pooled cells by each of the individual constituent genetic demultiplexing tools and processes the outputs with the Ensemblex algorithm. In this documentation, we outline each step of the Ensemblex pipeline, illustrate how to run the pipeline, define best practices, and provide a tutorial with publicly available datasets. For a comprehensive description of Ensemblex, ground-truth benchmarking, and application to real-world datasets, see our pre-print manuscript: Pre-print Contents The Ensemblex Algorithm: Ensemblex algorithm overview The Ensemblex Pipeline: Ensemblex pipeline overview Installation Step 1: Set up Step 2: Preparation of input files Step 3: Genetic demultiplexing by constituent tools Step 4: Application of Ensemblex Documentation: Execution parameters Ensemblex outputs Tutorial: Downloading data Ensemblex with prior genotype information About: Help and Feedback Acknowledgement License","title":"Home"},{"location":"#welcome-to-the-ensemblex-documentation","text":"Ensemblex is an accuracy-weighted ensemble framework for genetic demultiplexing of pooled single-cell RNA sequencing (scRNAseq) data. 
By addressing the limitations of individual genetic demultiplexing tools, we demonstrated that Ensemblex: Achieves higher demultiplexing accuracy Limits the introduction of technical noise into scRNAseq analysis Retains a high proportion of cells for downstream analyses. The ensemble method capitalizes on the added confidence of combining distinct statistical frameworks for genetic demultiplexing, but the modular algorithm can adapt to the overall performance of its constituent tools on the respective dataset, making it resilient against a poorly performing constituent tool. Ensemblex can be used to demultiplex pools with or without prior genotype information. When demultiplexing with prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools: Demuxalot ( Rogozhnikov et al. ) Demuxlet ( Kang et al. ) Souporcell ( Heaton et al. ) Vireo-GT ( Huang et al. ) When demultiplexing without prior genotype information, Ensemblex leverages the sample assignments of four individual, constituent genetic demultiplexing tools: Demuxalot ( Rogozhnikov et al. ) Freemuxlet ( Kang et al. ) Souporcell ( Heaton et al. ) Vireo ( Huang et al. ) Upon demultiplexing pools with each of the four constituent genetic demultiplexing tools, Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools: Step 1 : Probabilistic-weighted ensemble Step 2 : Graph-based doublet detection Step 3 : Ensemble-independent doublet detection As output, Ensemblex returns its own cell-specific sample labels and corresponding assignment probabilities and singlet confidence score, as well as the sample labels and corresponding assignment probabilities for each of its constituents. The demultiplexed sample labels can then be used to perform downstream analyses. Figure 1. Overview of the Ensemblex workflow. 
A) The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework. B) The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection. C) As output, Ensemblex returns its own sample-cell assignments as well as the sample-cell assignments of each of its constituent tools. D) Ensemblex's sample-cell assignments can be used to perform downstream analysis on the pooled scRNAseq data. To facilitate the application of Ensemblex, we provide a pipeline that demultiplexes pooled cells by each of the individual constituent genetic demultiplexing tools and processes the outputs with the Ensemblex algorithm. In this documentation, we outline each step of the Ensemblex pipeline, illustrate how to run the pipeline, define best practices, and provide a tutorial with publicly available datasets. For a comprehensive description of Ensemblex, ground-truth benchmarking, and application to real-world datasets, see our pre-print manuscript: Pre-print","title":"Welcome to the Ensemblex documentation!"},{"location":"#contents","text":"The Ensemblex Algorithm: Ensemblex algorithm overview The Ensemblex Pipeline: Ensemblex pipeline overview Installation Step 1: Set up Step 2: Preparation of input files Step 3: Genetic demultiplexing by constituent tools Step 4: Application of Ensemblex Documentation: Execution parameters Ensemblex outputs Tutorial: Downloading data Ensemblex with prior genotype information About: Help and Feedback Acknowledgement License","title":"Contents"},{"location":"Acknowledgement/","text":"Acknowledgement The Ensemblex pipeline was produced for projects funded by the Canadian Institute of Health Research and Michael J. 
Fox Foundation Parkinson's Progression Markers Initiative (MJFF PPMI) in collaboration with The Neuro's Early Drug Discovery Unit (EDDU), McGill University. It is written by Michael Fiorini and Saeid Amiri with supervision from Rhalena Thomas and Sali Farhan at the Montreal Neurological Institute-Hospital. Copyright belongs to MNI BIOINFO CORE .","title":"Acknowledgement"},{"location":"Acknowledgement/#acknowledgement","text":"The Ensemblex pipeline was produced for projects funded by the Canadian Institute of Health Research and Michael J. Fox Foundation Parkinson's Progression Markers Initiative (MJFF PPMI) in collaboration with The Neuro's Early Drug Discovery Unit (EDDU), McGill University. It is written by Michael Fiorini and Saeid Amiri with supervision from Rhalena Thomas and Sali Farhan at the Montreal Neurological Institute-Hospital. Copyright belongs to MNI BIOINFO CORE .","title":"Acknowledgement"},{"location":"Dataset1/","text":"Ensemblex pipeline with prior genotype information Introduction Installation Step 1: Set up Step 2: Preparation of input files Step 3: Genetic demultiplexing by constituent tools Step 4: Application of Ensemblex Resource requirements Introduction This guide illustrates how to use the Ensemblex pipeline to demultiplex pooled scRNAseq samples with prior genotype information. Here, we will leverage a pooled scRNAseq dataset produced by Jerber et al. . This pool contains induced pluripotent cell lines (iPSC) from 9 healthy controls that were differentiated towards a dopaminergic neuron state. The Ensemblex pipeline is illustrated in the diagram below: NOTE : To download the necessary files for the tutorial please see the Downloading data section of the Ensemblex documentation. Installation [to be completed] module load StdEnv/2023 module load apptainer/1.2.4 Step 1: Set up In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use. 
First, create a dedicated folder for the analysis (hereafter referred to as the working directory). Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory cd ensemblex_tutorial mkdir working_directory cd ~/ensemblex_tutorial/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=~/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=~/ensemblex_tutorial/working_directory Next, we can set up the working directory and choose the Ensemblex pipeline for demultiplexing with prior genotype information ( --step init-GT ) using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT After running the above code, the working directory should have the following structure: ensemblex_tutorial \u2514\u2500\u2500 working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 demuxlet \u251c\u2500\u2500 ensemblex_gt \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo_gt Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools. Step 2: Preparation of input files In Step 2, we will define the necessary files needed for ensemblex's constituent genetic demultiplexing tools and will place them within the working directory. Note : For the tutorial we will be using the data downloaded in the Downloading data section of the Ensemblex documentation. 
First, define all of the required files: BAM=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam BAM_INDEX=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam.bai BARCODES=~/ensemblex_tutorial/CellRanger/outs/filtered_gene_bc_matrices/refdata-cellranger-GRCh37/barcodes.tsv SAMPLE_VCF=~/ensemblex_tutorial/sample_genotype/sample_genotype_merge.vcf REFERENCE_VCF=~/ensemblex_tutorial/reference_files/common_SNPs_only.recode.vcf REFERENCE_FASTA=~/ensemblex_tutorial/reference_files/genome.fa REFERENCE_FASTA_INDEX=~/ensemblex_tutorial/reference_files/genome.fa.fai Next, we will sort the pooled samples and reference .vcf files according to the .bam file and place them within the working directory: ## Sort pooled samples .vcf file bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/pooled_samples.vcf --step sort --vcf $SAMPLE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam ## Sort reference .vcf file bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/reference.vcf --step sort --vcf $REFERENCE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam NOTE : To sort the vcf files we use the pipeline produced by the authors of Demuxlet/Freemuxlet ( Kang et al. ). 
Next, we will place the remaining necessary files within the working directory: cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai After running the above code, $ensemblex_PWD/input_files should contain the following files: input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 pooled_samples.vcf \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE : It is important that the file names match those listed above as they are necessary for the Ensemblex pipeline to recognize them. Step 3: Genetic demultiplexing by constituent tools In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools: Demuxalot Demuxlet Souporcell Vireo-GT First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for each of the constituent genetic demultiplexing tools: ## Navigate to the .ini file cd $ensemblex_PWD/job_info/configs ## Open the .ini file and adjust parameters directly in the terminal nano ensemblex_config.ini For the tutorial, we set the following parameters for the constituent genetic demultiplexing tools: Parameter Value PAR_demuxalot_genotype_names 'HPSI0115i-hecn_6,HPSI0214i-pelm_3,HPSI0314i-sojd_3,HPSI0414i-sebn_3,HPSI0514i-uenn_3,HPSI0714i-pipw_4,HPSI0715i-meue_5,HPSI0914i-vaka_5,HPSI1014i-quls_2' PAR_demuxalot_prior_strength 100 PAR_demuxalot_minimum_coverage 200 PAR_demuxalot_minimum_alternative_coverage 10 PAR_demuxalot_n_best_snps_per_donor 100 PAR_demuxalot_genotypes_prior_strength 1 PAR_demuxalot_doublet_prior 0.25 PAR_demuxlet_field GT PAR_vireo_N 9 PAR_vireo_type GT PAR_vireo_processes 20 
PAR_vireo_minMAF 0.1 PAR_vireo_minCOUNT 20 PAR_vireo_forcelearnGT T PAR_minimap2 '-ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no' PAR_freebayes '-iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6' PAR_vartrix_umi TRUE PAR_vartrix_mapq 30 PAR_vartrix_threads 8 PAR_souporcell_k 9 PAR_souporcell_t 8 Now that the parameters have been defined, we can demultiplex the pools with the constituent genetic demultiplexing tools. Demuxalot To run Demuxalot use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in $ensemblex_PWD/demuxalot : demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas Demuxlet To run Demuxlet use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet If Demuxlet completed successfully, the following files should be available in $ensemblex_PWD/demuxlet : demuxlet \u251c\u2500\u2500 outs.best \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500 pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz Souporcell To run Souporcell use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in $ensemblex_PWD/souporcell : souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam \u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf \u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt Vireo To run Vireo-GT use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If 
Vireo-GT completed successfully, the following files should be available in $ensemblex_PWD/vireo_gt : vireo_gt \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx \u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 fig_GT_distance_input.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications. NOTE : To minimize computation time for the tutorial, we have provided the necessary output files from the constituent tools here . 
To access the files and place them in the working directory, use the following code: ## Demuxalot cd $ensemblex_PWD/demuxalot wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/Demuxalot_result.csv ## Demuxlet cd $ensemblex_PWD/demuxlet wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/outs.best ## Souporcell cd $ensemblex_PWD/souporcell wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/clusters.tsv ## Vireo cd $ensemblex_PWD/vireo_gt wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/donor_ids.tsv Step 4: Application of Ensemblex In Step 4, we will process the output files of the four constituent genetic demultiplexing tools with the three-step Ensemblex algorithm: Step 1: Probabilistic-weighted ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for the Ensemblex algorithm: ## Navigate to the .ini file cd $ensemblex_PWD/job_info/configs ## Open the .ini file and adjust parameters directly in the terminal nano ensemblex_config.ini For the tutorial, we set the following parameters for the Ensemblex algorithm: Parameter Value Pool parameters PAR_ensemblex_sample_size 9 PAR_ensemblex_expected_doublet_rate 0.10 Set up parameters PAR_ensemblex_merge_constituents Yes Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No PAR_ensemblex_nCD NULL PAR_ensemblex_pT NULL PAR_ensemblex_graph_based_doublet_detection Yes Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No PAR_ensemblex_ensemble_independent_doublet Yes 
PAR_ensemblex_doublet_Demuxalot_threshold Yes PAR_ensemblex_doublet_Demuxalot_no_threshold No PAR_ensemblex_doublet_Demuxlet_threshold No PAR_ensemblex_doublet_Demuxlet_no_threshold No PAR_ensemblex_doublet_Souporcell_threshold No PAR_ensemblex_doublet_Souporcell_no_threshold No PAR_ensemblex_doublet_Vireo_threshold Yes PAR_ensemblex_doublet_Vireo_no_threshold No Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes If Ensemblex completed successfully, the following files should be available in $ensemblex_PWD/ensemblex_gt : ensemblex_gt \u251c\u2500\u2500 confidence \u2502 \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u2502 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2502 \u2514\u2500\u2500 step1_cell_assignment.csv \u251c\u2500\u2500 step2 \u2502 \u251c\u2500\u2500 optimal_nCD.pdf \u2502 \u251c\u2500\u2500 optimal_pT.pdf \u2502 \u251c\u2500\u2500 PC1_var_contrib.pdf \u2502 \u251c\u2500\u2500 PC2_var_contrib.pdf \u2502 \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA_plot.pdf \u2502 \u251c\u2500\u2500 PCA_scree_plot.pdf \u2502 \u2514\u2500\u2500 Step2_cell_assignment.csv \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv Ensemblex's final assignments are described in the ensemblex_final_cell_assignment.csv file. 
Specifically, the ensemblex_assignment column describes Ensemblex's final assignments after application of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are labelled as unassigned); we recommend that users use this column to label their cells for downstream analyses. The ensemblex_best_assignment column describes Ensemblex's best assignments, independent of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are NOT labelled as unassigned). The cell barcodes listed under the barcode column can be used to add the ensemblex_final_cell_assignment.csv information to the metadata of a Seurat object. Resource requirements The following table describes the computational resources used in this tutorial for genetic demultiplexing by the constituent tools and application of the Ensemblex algorithm. Tool Time CPU Memory Demuxalot 01:34:59 6 12.95 GB Demuxlet 03:16:03 6 138.32 GB Souporcell 2-14:49:21 1 21.83 GB Vireo 2-01:30:24 6 29.42 GB Ensemblex 02:05:27 1 5.67 GB","title":"Ensemblex with prior genotype information"},{"location":"Dataset1/#ensemblex-pipeline-with-prior-genotype-information","text":"Introduction Installation Step 1: Set up Step 2: Preparation of input files Step 3: Genetic demultiplexing by constituent tools Step 4: Application of Ensemblex Resource requirements","title":"Ensemblex pipeline with prior genotype information"},{"location":"Dataset1/#introduction","text":"This guide illustrates how to use the Ensemblex pipeline to demultiplex pooled scRNAseq samples with prior genotype information. Here, we will leverage a pooled scRNAseq dataset produced by Jerber et al. . This pool contains induced pluripotent cell lines (iPSC) from 9 healthy controls that were differentiated towards a dopaminergic neuron state. 
The Ensemblex pipeline is illustrated in the diagram below: NOTE : To download the necessary files for the tutorial please see the Downloading data section of the Ensemblex documentation.","title":"Introduction"},{"location":"Dataset1/#installation","text":"[to be completed] module load StdEnv/2023 module load apptainer/1.2.4","title":"Installation"},{"location":"Dataset1/#step-1-set-up","text":"In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use. First, create a dedicated folder for the analysis (hereafter referred to as the working directory). Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory cd ensemblex_tutorial mkdir working_directory cd ~/ensemblex_tutorial/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=~/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=~/ensemblex_tutorial/working_directory Next, we can set up the working directory and choose the Ensemblex pipeline for demultiplexing with prior genotype information ( --step init-GT ) using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT After running the above code, the working directory should have the following structure: ensemblex_tutorial \u2514\u2500\u2500 working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 demuxlet \u251c\u2500\u2500 ensemblex_gt \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo_gt Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools.","title":"Step 1: Set 
up"},{"location":"Dataset1/#step-2-preparation-of-input-files","text":"In Step 2, we will define the necessary files needed for Ensemblex's constituent genetic demultiplexing tools and will place them within the working directory. Note : For the tutorial we will be using the data downloaded in the Downloading data section of the Ensemblex documentation. First, define all of the required files: BAM=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam BAM_INDEX=~/ensemblex_tutorial/CellRanger/outs/possorted_genome_bam.bam.bai BARCODES=~/ensemblex_tutorial/CellRanger/outs/filtered_gene_bc_matrices/refdata-cellranger-GRCh37/barcodes.tsv SAMPLE_VCF=~/ensemblex_tutorial/sample_genotype/sample_genotype_merge.vcf REFERENCE_VCF=~/ensemblex_tutorial/reference_files/common_SNPs_only.recode.vcf REFERENCE_FASTA=~/ensemblex_tutorial/reference_files/genome.fa REFERENCE_FASTA_INDEX=~/ensemblex_tutorial/reference_files/genome.fa.fai Next, we will sort the pooled samples and reference .vcf files according to the .bam file and place them within the working directory: ## Sort pooled samples .vcf file bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/pooled_samples.vcf --step sort --vcf $SAMPLE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam ## Sort reference .vcf file bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD/input_files/reference.vcf --step sort --vcf $REFERENCE_VCF --bam $ensemblex_PWD/input_files/pooled_bam.bam NOTE : To sort the vcf files we use the pipeline produced by the authors of Demuxlet/Freemuxlet ( Kang et al. ). 
Next, we will place the remaining necessary files within the working directory: cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai After running the above code, $ensemblex_PWD/input_files should contain the following files: input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 pooled_samples.vcf \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE : It is important that the file names match those listed above as they are necessary for the Ensemblex pipeline to recognize them.","title":"Step 2: Preparation of input files"},{"location":"Dataset1/#step-3-genetic-demultiplexing-by-constituent-tools","text":"In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools: Demuxalot Demuxlet Souporcell Vireo-GT First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for each of the constituent genetic demultiplexing tools: ## Navigate to the .ini file cd $ensemblex_PWD/job_info/configs ## Open the .ini file and adjust parameters directly in the terminal nano ensemblex_config.ini For the tutorial, we set the following parameters for the constituent genetic demultiplexing tools: Parameter Value PAR_demuxalot_genotype_names 'HPSI0115i-hecn_6,HPSI0214i-pelm_3,HPSI0314i-sojd_3,HPSI0414i-sebn_3,HPSI0514i-uenn_3,HPSI0714i-pipw_4,HPSI0715i-meue_5,HPSI0914i-vaka_5,HPSI1014i-quls_2' PAR_demuxalot_prior_strength 100 PAR_demuxalot_minimum_coverage 200 PAR_demuxalot_minimum_alternative_coverage 10 PAR_demuxalot_n_best_snps_per_donor 100 PAR_demuxalot_genotypes_prior_strength 1 PAR_demuxalot_doublet_prior 0.25 
PAR_demuxlet_field GT PAR_vireo_N 9 PAR_vireo_type GT PAR_vireo_processes 20 PAR_vireo_minMAF 0.1 PAR_vireo_minCOUNT 20 PAR_vireo_forcelearnGT T PAR_minimap2 '-ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no' PAR_freebayes '-iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6' PAR_vartrix_umi TRUE PAR_vartrix_mapq 30 PAR_vartrix_threads 8 PAR_souporcell_k 9 PAR_souporcell_t 8 Now that the parameters have been defined, we can demultiplex the pools with the constituent genetic demultiplexing tools.","title":"Step 3: Genetic demultiplexing by constituent tools"},{"location":"Dataset1/#demuxalot","text":"To run Demuxalot use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in $ensemblex_PWD/demuxalot : demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas","title":"Demuxalot"},{"location":"Dataset1/#demuxlet","text":"To run Demuxlet use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet If Demuxlet completed successfully, the following files should be available in $ensemblex_PWD/demuxlet : demuxlet \u251c\u2500\u2500 outs.best \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500 pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz","title":"Demuxlet"},{"location":"Dataset1/#souporcell","text":"To run Souporcell use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in $ensemblex_PWD/souporcell : souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam 
\u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf \u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt","title":"Souporcell"},{"location":"Dataset1/#vireo","text":"To run Vireo-GT use the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If Vireo-GT completed successfully, the following files should be available in $ensemblex_PWD/vireo_gt : vireo_gt \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx \u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 fig_GT_distance_input.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications. NOTE : To minimize computation time for the tutorial, we have provided the necessary output files from the constituent tools here . 
To access the files and place them in the working directory, use the following code: ## Demuxalot cd $ensemblex_PWD/demuxalot wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/Demuxalot_result.csv ## Demuxlet cd $ensemblex_PWD/demuxlet wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/outs.best ## Souporcell cd $ensemblex_PWD/souporcell wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/clusters.tsv ## Vireo cd $ensemblex_PWD/vireo_gt wget https://github.com/neurobioinfo/ensemblex/blob/caad8c250566bfa9a6d7a78b77d2cc338468a58e/tutorial/donor_ids.tsv","title":"Vireo"},{"location":"Dataset1/#step-4-application-of-ensemblex","text":"In Step 4, we will process the output files of the four constituent genetic demultiplexing tools with the three-step Ensemblex algorithm: Step 1: Probabilistic-weighted ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection First, we will navigate to the ensemblex_config.ini file to adjust the demultiplexing parameters for the Ensemblex algorithm: ## Navigate to the .ini file cd $ensemblex_PWD/job_info/configs ## Open the .ini file and adjust parameters directly in the terminal nano ensemblex_config.ini For the tutorial, we set the following parameters for the Ensemblex algorithm: Parameter Value Pool parameters PAR_ensemblex_sample_size 9 PAR_ensemblex_expected_doublet_rate 0.10 Set up parameters PAR_ensemblex_merge_constituents Yes Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No PAR_ensemblex_nCD NULL PAR_ensemblex_pT NULL PAR_ensemblex_graph_based_doublet_detection Yes Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No 
PAR_ensemblex_ensemble_independent_doublet Yes PAR_ensemblex_doublet_Demuxalot_threshold Yes PAR_ensemblex_doublet_Demuxalot_no_threshold No PAR_ensemblex_doublet_Demuxlet_threshold No PAR_ensemblex_doublet_Demuxlet_no_threshold No PAR_ensemblex_doublet_Souporcell_threshold No PAR_ensemblex_doublet_Souporcell_no_threshold No PAR_ensemblex_doublet_Vireo_threshold Yes PAR_ensemblex_doublet_Vireo_no_threshold No Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes If Ensemblex completed successfully, the following files should be available in $ensemblex_PWD/ensemblex_gt : ensemblex_gt \u251c\u2500\u2500 confidence \u2502 \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u2502 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2502 \u2514\u2500\u2500 step1_cell_assignment.csv \u251c\u2500\u2500 step2 \u2502 \u251c\u2500\u2500 optimal_nCD.pdf \u2502 \u251c\u2500\u2500 optimal_pT.pdf \u2502 \u251c\u2500\u2500 PC1_var_contrib.pdf \u2502 \u251c\u2500\u2500 PC2_var_contrib.pdf \u2502 \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA_plot.pdf \u2502 \u251c\u2500\u2500 PCA_scree_plot.pdf \u2502 \u2514\u2500\u2500 Step2_cell_assignment.csv \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv Ensemblex's final assignments are described in the ensemblex_final_cell_assignment.csv file. 
Specifically, the ensemblex_assignment column describes Ensemblex's final assignments after application of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are labelled as unassigned); we recommend that users use this column to label their cells for downstream analyses. The ensemblex_best_assignment column describes Ensemblex's best assignments, independent of the singlet confidence threshold (i.e., singlets that fail to meet a singlet confidence of 1.0 are NOT labelled as unassigned). The cell barcodes listed under the barcode column can be used to add the ensemblex_final_cell_assignment.csv information to the metadata of a Seurat object.","title":"Step 4: Application of Ensemblex"},{"location":"Dataset1/#resource-requirements","text":"The following table describes the computational resources used in this tutorial for genetic demultiplexing by the constituent tools and application of the Ensemblex algorithm. Tool Time CPU Memory Demuxalot 01:34:59 6 12.95 GB Demuxlet 03:16:03 6 138.32 GB Souporcell 2-14:49:21 1 21.83 GB Vireo 2-01:30:24 6 29.42 GB Ensemblex 02:05:27 1 5.67 GB","title":"Resource requirements"},{"location":"Dataset2/","text":"HTO analysis track: PBMC dataset Contents Introduction Downloading the pbmc dataset Installation scrnabox.slurm installation CellRanger installation R library preparation and R package installation scRNAbox: HTO Analysis Track Step 0: Set up Step 1: FASTQ to gene expression matrix Step 2: Create Seurat object and remove ambient RNA Step 3: Quality control and filtering Step 4: Demultiplexing and doublet detection Publication-ready figures Job Configurations Introduction This guide illustrates the steps taken for our analysis of the PBMC dataset in our pre-print manuscript . Here, we are using the HTO analysis track of scRNAbox to analyze a publicly available scRNAseq dataset produced by Stoeckius et al. . 
This data set describes peripheral blood mononuclear cells (PBMC) from eight human donors, which were tagged with sample-specific barcodes, pooled, and sequenced together in a single run. Downloading the PBMC dataset If you want to use the PBMC dataset to test the scRNAbox pipeline, please see here for detailed instructions on how to download the publicly available data. Installation scrnabox.slurm installation To download the latest version of scrnabox.slurm (v0.1.52.50) run the following command: wget https://github.com/neurobioinfo/scrnabox/releases/download/v0.1.52.5/scrnabox.slurm.zip unzip scrnabox.slurm.zip For a description of the options for running scrnabox.slurm run the following command: bash /pathway/to/scrnabox.slurm/launch_scrnabox.sh -h If the scrnabox.slurm has been installed properly, the above command should return the following: scrnabox pipeline version 0.1.52.50 ------------------- mandatory arguments: -d (--dir) = Working directory (where all the outputs will be printed) (give full path) --steps = Specify what steps, e.g., 2 to run step 2. 2-6, run steps 2 through 6 optional arguments: -h (--help) = See helps regarding the pipeline arguments. --method = Select your preferred method: HTO and SCRNA for hashtag, and Standard scRNA, respectively. --msd = You can get the hashtag labels by running the following code (HTO Step 4). --markergsea = Identify marker genes for each cluster and run marker gene set enrichment analysis (GSEA) using EnrichR libraries (Step 7). --knownmarkers = Profile the individual or aggregated expression of known marker genes. --referenceannotation = Generate annotation predictions based on the annotations of a reference Seurat object (Step 7). --annotate = Add clustering annotations to Seurat object metadata (Step 7). --addmeta = Add metadata columns to the Seurat object (Step 8). --rundge = Perform differential gene expression contrasts (Step 8). 
--seulist = You can directly call the list of Seurat objects to the pipeline. --rcheck = You can identify which libraries are not installed. ------------------- For a comprehensive help, visit https://neurobioinfo.github.io/scrnabox/site/ for documentation. CellRanger installation For information regarding the installation of CellRanger, please visit the 10X Genomics documentation . If CellRanger is already installed on your HPC system, you may skip the CellRanger installation procedures. For our analysis of the midbrain dataset we used the 10XGenomics GRCh38-3.0.0 reference genome and CellRanger v5.0.1. For more information regarding how to prepare reference genomes for the CellRanger counts pipeline, please see the 10X Genomics documentation . R library preparation and R package installation We must prepare a common R library where we will load all of the required R packages. If the required R packages are already installed on your HPC system in a common R library, you may skip the following procedures. We will first install R . The analyses presented in our pre-print manuscript were conducted using v4.2.1. # install R module load r/4.2.1 Then, we will run the installation code, which creates a directory where the R packages will be loaded and will install the required R packages: # Folder for R packages R_PATH=~/path/to/R/library mkdir -p $R_PATH # Install package Rscript ./scrnabox.slurm/soft/R/install_packages.R $R_PATH scRNAbox pipeline Step 0: Set up Now that scrnabox.slurm , CellRanger , R , and the required R packages have been installed, we can proceed to our analysis with the scRNAbox pipeline. 
We will create a pipeline folder designated for the analysis and run Step 0, selecting the HTO analysis track ( --method HTO ), using the following code: mkdir pipeline cd pipeline export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 0 \\ --method HTO Next, we will navigate to the scrnabox_config.ini file in ~/pipeline/job_info/configs to define the HPC account holder ( ACCOUNT ), the path to the environmental module ( MODULEUSE ), the path to CellRanger from the environmental module directory ( CELLRANGER ), CellRanger version ( CELLRANGER_VERSION ), R version ( R_VERSION ), and the path to the R library ( R_LIB_PATH ): cd ~/pipeline/job_info/configs nano scrnabox_config.ini ACCOUNT=account-name MODULEUSE=/path/to/environmental/module CELLRANGER=/path/to/cellranger/from/module/directory CELLRANGER_VERSION=5.0.1 R_VERSION=4.2.1 R_LIB_PATH=/path/to/R/library Next, we can check to see if all of the required R packages have been properly installed using the following command: bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 0 \\ --rcheck Step 1: FASTQ to gene expression matrix In Step 1, we will run the CellRanger counts pipeline to generate feature-barcode expression matrices from the FASTQ files. While it is possible to manually prepare the library.csv and feature_ref.csv files for the sequencing run prior to running Step 1, for this analysis we are going to opt for automated library preparation. For more information regarding the manual preparation of library.csv and feature_ref.csv files, please see the CellRanger library preparation tutorial. 
For our analysis of the PBMC dataset we set the following execution parameters for Step 1 ( ~/pipeline/job_info/parameters/step1_par.txt ): Parameter Value par_automated_library_prep Yes par_fastq_directory /path/to/directory/contaning/fastqs par_RNA_run_names run1GEX par_HTO_run_names run1HTO par_seq_run_names run1 par_paired_end_seq Yes par_id Hash1, Hash2, Hash3, Hash4, Hash5, Hash6, Hash7, Hash8 par_name A_TotalSeqA, B_TotalSeqA, C_TotalSeqA, D_TotalSeqA, E_TotalSeqA, F_TotalSeqA, G_TotalSeqA, H_TotalSeqA par_read R2 par_pattern 5P(BC) par_sequence AGGACCATCCAA, ACATGTTACCGT, AGCTTACTATCC, TCGATAATGCGA, GAGGCTGAGCTA, GTGTGACGTATT, ACTGTCTAACGG, TATCACATCGGT par_ref_dir_grch ~/genome/10xGenomics/refdata-cellranger-GRCh38-3.0.0 par_r1_length NULL (commented out) par_r2_length NULL (commented out) par_mempercode 30 par_include_introns NULL (commented out) par_no_target_umi_filter NULL (commented out) par_expect_cells NULL (commented out) par_force_cells NULL (commented out) par_no_bam NULL (commented out) Note: The parameters file for each step is located in ~/pipeline/job_info/parameters . For a comprehensive description of the execution parameters for each step see here . Given that CellRanger runs a user interface and is not submitted as a Job, it is recommended to run Step 1 in a 'screen' which will allow the task to keep running if the connection is broken. To run Step 1, use the following command: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline screen -S run_PBMC_application_case bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 1 The outputs of the CellRanger counts pipeline are deposited into ~/pipeline/step1 . Step 2: Create Seurat object and remove ambient RNA In Step 2, we are going to begin by correcting the RNA assay for ambient RNA removal using SoupX ( Young et al. 2020 ). We will then use the ambient RNA-corrected feature-barcode matrices to create a Seurat object. 
For our analysis of the PBMC dataset we set the following execution parameters for Step 2 ( ~/pipeline/job_info/parameters/step2_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_ambient_RNA Yes par_normalization.method LogNormalize par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 We can run Step 2 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 2 Step 2 produces the following outputs: ~/pipeline step2 \u251c\u2500\u2500 figs2 \u2502 \u251c\u2500\u2500 ambient_RNA_estimation_run1.pdf \u2502 \u251c\u2500\u2500 ambient_RNA_markers_run1.pdf \u2502 \u251c\u2500\u2500 cell_cyle_dim_plot_run1.pdf \u2502 \u251c\u2500\u2500 vioplot_run1.pdf \u2502 \u2514\u2500\u2500 zoomed_in_vioplot_run1.pdf \u251c\u2500\u2500 info2 \u2502 \u251c\u2500\u2500 estimated_ambient_RNA_run1.txt \u2502 \u251c\u2500\u2500 MetaData_1.txt \u2502 \u251c\u2500\u2500 meta_info_1.txt \u2502 \u251c\u2500\u2500 run1_ambient_rna_summary.rds \u2502 \u251c\u2500\u2500 sessionInfo.txt \u2502 \u251c\u2500\u2500 seu1_RNA.txt \u2502 \u2514\u2500\u2500 summary_seu1.txt \u251c\u2500\u2500 objs2 \u2502 \u2514\u2500\u2500 run1.rds \u2514\u2500\u2500 step2_ambient \u2514\u2500\u2500 run1 \u251c\u2500\u2500 barcodes.tsv \u251c\u2500\u2500 genes.tsv \u2514\u2500\u2500 matrix.mtxs Note: For a comprehensive description of the outputs for each analytical step, please see the Outputs section of the scRNAbox documentation. Figure 1. Figures produced by Step 2 of the scRNAbox pipeline. A) Estimated ambient RNA contamination rate (Rho) by SoupX. Estimates of the RNA contamination rate using various estimators are visualized via a frequency distribution; the true contamination rate is assigned as the most frequent estimate (red line; 8.7%). B) Log10 ratios of observed counts to expected counts for marker genes from each cluster. 
Clusters are defined by the CellRanger counts pipeline. The red line displays the estimated RNA contamination rate if the estimation was based entirely on the corresponding gene. C) Principal component analysis (PCA) of Seurat S and G2M cell cycle reference genes. D) Violin plots showing the distribution of cells according to quality control metrics calculated in Step 2. E) Zoomed in violin plots, from the minimum to the mean, showing the distribution of cells according to quality control metrics calculated in Step 2. Step 3: Quality control and filtering In Step 3, we are going to perform quality control procedures and filter out low quality cells. We are going to filter out cells with < 50 unique RNA transcripts, > 6000 unique RNA transcripts, < 200 total RNA transcripts, > 7000 total RNA transcripts, and > 50% mitochondria. For our analysis of the PBMC dataset we set the following execution parameters for Step 3 ( ~/pipeline/job_info/parameters/step3_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_seurat_object NULL par_nFeature_RNA_L 50 par_nFeature_RNA_U 6000 par_nCount_RNA_L 200 par_nCount_RNA_U 7000 par_mitochondria_percent_L 0 par_mitochondria_percent_U 50 par_ribosomal_percent_L 0 par_ribosomal_percent_U 100 par_remove_mitochondrial_genes No par_remove_ribosomal_genes No par_remove_genes NULL par_regress_cell_cycle_genes Yes par_normalization.method LogNormalize par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 par_top 10 par_npcs_pca 30 We can run Step 3 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 3 Step 3 produces the following outputs. 
step3 \u251c\u2500\u2500 figs3 \u2502 \u251c\u2500\u2500 dimplot_pca_run1.pdf \u2502 \u251c\u2500\u2500 elbowplot_run1.pdf \u2502 \u251c\u2500\u2500 filtered_QC_vioplot_run1.pdf \u2502 \u2514\u2500\u2500 VariableFeaturePlot_run1.pdf \u251c\u2500\u2500 info3 \u2502 \u251c\u2500\u2500 MetaData_run1.txt \u2502 \u251c\u2500\u2500 meta_info_run1.txt \u2502 \u251c\u2500\u2500 most_variable_genes_run1.txt \u2502 \u251c\u2500\u2500 run1_RNA.txt \u2502 \u251c\u2500\u2500 sessionInfo.txt \u2502 \u2514\u2500\u2500 summary_run1.txt \u2514\u2500\u2500 objs3 \u2514\u2500\u2500 run1.rds Figure 2. Figures produced by Step 3 of the scRNAbox pipeline. A) Violin plots showing the distribution of cells according to quality control metrics after filtering by user-defined thresholds. B) Scatter plot showing the top 2500 most variable features; the top 10 most variable features are labelled. C) Principal component analysis (PCA) visualizing the first two principal components (PCs). D) Elbow plot to visualize the percentage of variance explained by each PC. Step 4: Demultiplexing and doublet detection In Step 4, we are going to demultiplex the pooled samples and remove doublets (erroneous libraries produced by two or more cells) based on the expression of the sample-specific barcodes (antibody assay). If the barcode labels used in the analysis are unknown, the first step is to retrieve them from the Seurat object. To do this, we do not need to modify the execution parameters and can go straight to running the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 4 \\ --msd T The above code produces the following file: step4 \u251c\u2500\u2500 figs4 \u251c\u2500\u2500 info4 \u2502 \u2514\u2500\u2500 seu1.rds_old_antibody_label_MULTIseqDemuxHTOcounts.csv \u2514\u2500\u2500 objs4 Which contains the names of the barcode labels (i.e. 
A_TotalSeqA , B_TotalSeqA , C_TotalSeqA , D_TotalSeqA , E_TotalSeqA , F_TotalSeqA , G_TotalSeqA , H_TotalSeqA , Doublet , Negative ). Now that we know the barcode labels used in the PBMC dataset, we can perform demultiplexing and doublet detection. For our analysis of the PBMC dataset we set the following execution parameters for Step 4 ( ~/pipeline/job_info/parameters/step4_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_normalization.method CLR par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 par_dimensionality_reduction Yes par_npcs_pca 30 par_dims_umap 3 par_n.neighbor 65 par_dropDN Yes par_label_dropDN Doublet, Negative par_quantile 0.9 par_autoThresh TRUE par_maxiter 5 par_RidgePlot_ncol 3 par_old_antibody_label A-TotalSeqA, B-TotalSeqA, C-TotalSeqA, D-TotalSeqA, E-TotalSeqA, F-TotalSeqA, G-TotalSeqA, H-TotalSeqA, Doublet par_new_antibody_label sample-A, sample-B, sample-C, sample-D, sample-E, sample-F, sample-G, sample-H, Doublet We can run Step 4 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 4 Step 4 produces the following outputs. step4 \u251c\u2500\u2500 figs4 \u2502 \u251c\u2500\u2500 run1_DotPlot_HTO_MSD.pdf \u2502 \u251c\u2500\u2500 run1_Heatmap_HTO_MSD.pdf \u2502 \u251c\u2500\u2500 run1_HTO_dimplot_pca.pdf \u2502 \u251c\u2500\u2500 run1_HTO_dimplot_umap.pdf \u2502 \u251c\u2500\u2500 run1_nCounts_RNA_MSD.pdf \u2502 \u2514\u2500\u2500 run1_Ridgeplot_HTO_MSD.pdf \u251c\u2500\u2500 info4 \u2502 \u251c\u2500\u2500 run1_filtered_MULTIseqDemuxHTOcounts.csv \u2502 \u251c\u2500\u2500 run1_MetaData.txt \u2502 \u251c\u2500\u2500 run1_meta_info_.txt \u2502 \u251c\u2500\u2500 run1_MULTIseqDemuxHTOcounts.csv \u2502 \u251c\u2500\u2500 run1_RNA.txt \u2502 \u2514\u2500\u2500 sessionInfo.txt \u2514\u2500\u2500 objs4 \u2514\u2500\u2500 run1.rds Figure 3. 
Figures produced by Step 4 of the Cell Hashtag Analysis Track. A) Uniform Manifold Approximation and Projection (UMAP) plot, taking the first three principal components (PCs) of the antibody assay as input. B) Principal component analysis (PCA) showing the first two PCs of the antibody assay. C) Ridge plot visualizing the enrichment of barcode labels across sample assignments at the sample level. D) Dot plot visualizing the enrichment of barcode labels across sample assignments at the sample level. E) Heatmap visualizing the enrichment of barcode labels across sample assignments at the cell level. F) Violin plot visualizing the distribution of the number of total RNA transcripts identified per cell, stratified by sample assignment. Publication-ready figures The code used to produce the publication-ready figures used in our pre-print manuscript is available here . Job Configurations The following job configurations were used for our analysis of the PBMC dataset. Job configurations can be modified for each analytical step in the scrnabox_config.ini file in ~/pipeline/job_info/configs Step THREADS_ARRAY MEM_ARRAY WALLTIME_ARRAY Step2 4 16g 00-05:00 Step3 4 16g 00-05:00 Step4 4 16g 00-05:00","title":"Run pipeline on processed data"},{"location":"Dataset2/#hto-analysis-track-pbmc-dataset","text":"","title":"HTO analysis track: PBMC dataset"},{"location":"Dataset2/#contents","text":"Introduction Downloading the PBMC dataset Installation scrnabox.slurm installation CellRanger installation R library preparation and R package installation scRNAbox: HTO Analysis Track Step 0: Set up Step 1: FASTQ to gene expression matrix Step 2: Create Seurat object and remove ambient RNA Step 3: Quality control and filtering Step 4: Demultiplexing and doublet detection Publication-ready figures Job Configurations","title":"Contents"},{"location":"Dataset2/#introduction","text":"This guide illustrates the steps taken for our analysis of the PBMC dataset in our pre-print manuscript . 
Here, we are using the HTO analysis track of scRNAbox to analyze a publicly available scRNAseq dataset produced by Stoeckius et al. . This dataset describes peripheral blood mononuclear cells (PBMC) from eight human donors, which were tagged with sample-specific barcodes, pooled, and sequenced together in a single run.","title":"Introduction"},{"location":"Dataset2/#downloading-the-pbmc-dataset","text":"If you want to use the PBMC dataset to test the scRNAbox pipeline, please see here for detailed instructions on how to download the publicly available data.","title":"Downloading the PBMC dataset"},{"location":"Dataset2/#installation","text":"","title":"Installation"},{"location":"Dataset2/#scrnaboxslurm-installation","text":"To download the latest version of scrnabox.slurm (v0.1.52.50) run the following command: wget https://github.com/neurobioinfo/scrnabox/releases/download/v0.1.52.5/scrnabox.slurm.zip unzip scrnabox.slurm.zip For a description of the options for running scrnabox.slurm run the following command: bash /pathway/to/scrnabox.slurm/launch_scrnabox.sh -h If scrnabox.slurm has been installed properly, the above command should return the following: scrnabox pipeline version 0.1.52.50 ------------------- mandatory arguments: -d (--dir) = Working directory (where all the outputs will be printed) (give full path) --steps = Specify what steps, e.g., 2 to run step 2. 2-6, run steps 2 through 6 optional arguments: -h (--help) = See helps regarding the pipeline arguments. --method = Select your preferred method: HTO and SCRNA for hashtag, and Standard scRNA, respectively. --msd = You can get the hashtag labels by running the following code (HTO Step 4). --markergsea = Identify marker genes for each cluster and run marker gene set enrichment analysis (GSEA) using EnrichR libraries (Step 7). --knownmarkers = Profile the individual or aggregated expression of known marker genes. 
--referenceannotation = Generate annotation predictions based on the annotations of a reference Seurat object (Step 7). --annotate = Add clustering annotations to Seurat object metadata (Step 7). --addmeta = Add metadata columns to the Seurat object (Step 8). --rundge = Perform differential gene expression contrasts (Step 8). --seulist = You can directly call the list of Seurat objects to the pipeline. --rcheck = You can identify which libraries are not installed. ------------------- For a comprehensive help, visit https://neurobioinfo.github.io/scrnabox/site/ for documentation.","title":"scrnabox.slurm installation"},{"location":"Dataset2/#cellranger-installation","text":"For information regarding the installation of CellRanger, please visit the 10X Genomics documentation . If CellRanger is already installed on your HPC system, you may skip the CellRanger installation procedures. For our analysis of the PBMC dataset we used the 10XGenomics GRCh38-3.0.0 reference genome and CellRanger v5.0.1. For more information regarding how to prepare reference genomes for the CellRanger counts pipeline, please see the 10X Genomics documentation .","title":"CellRanger installation"},{"location":"Dataset2/#r-library-preparation-and-r-package-installation","text":"We must prepare a common R library where we will load all of the required R packages. If the required R packages are already installed on your HPC system in a common R library, you may skip the following procedures. We will first install R . The analyses presented in our pre-print manuscript were conducted using v4.2.1. 
# install R module load r/4.2.1 Then, we will run the installation code, which creates a directory where the R packages will be loaded and will install the required R packages: # Folder for R packages R_PATH=~/path/to/R/library mkdir -p $R_PATH # Install package Rscript ./scrnabox.slurm/soft/R/install_packages.R $R_PATH","title":"R library preparation and R package installation"},{"location":"Dataset2/#scrnabox-pipeline","text":"","title":"scRNAbox pipeline"},{"location":"Dataset2/#step-0-set-up","text":"Now that scrnabox.slurm , CellRanger , R , and the required R packages have been installed, we can proceed to our analysis with the scRNAbox pipeline. We will create a pipeline folder designated for the analysis and run Step 0, selecting the HTO analysis track ( --method HTO ), using the following code: mkdir pipeline cd pipeline export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 0 \\ --method HTO Next, we will navigate to the scrnabox_config.ini file in ~/pipeline/job_info/configs to define the HPC account holder ( ACCOUNT ), the path to the environmental module ( MODULEUSE ), the path to CellRanger from the environmental module directory ( CELLRANGER ), CellRanger version ( CELLRANGER_VERSION ), R version ( R_VERSION ), and the path to the R library ( R_LIB_PATH ): cd ~/pipeline/job_info/configs nano scrnabox_config.ini ACCOUNT=account-name MODULEUSE=/path/to/environmental/module CELLRANGER=/path/to/cellranger/from/module/directory CELLRANGER_VERSION=5.0.1 R_VERSION=4.2.1 R_LIB_PATH=/path/to/R/library Next, we can check to see if all of the required R packages have been properly installed using the following command: bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 0 \\ --rcheck","title":"Step 0: Set up"},{"location":"Dataset2/#step-1-fastq-to-gene-expression-matrix","text":"In Step 1, we will run the CellRanger counts pipeline to generate 
feature-barcode expression matrices from the FASTQ files. While it is possible to manually prepare the library.csv and feature_ref.csv files for the sequencing run prior to running Step 1, for this analysis we are going to opt for automated library preparation. For more information regarding the manual preparation of library.csv and feature_ref.csv files, please see the CellRanger library preparation tutorial. For our analysis of the PBMC dataset we set the following execution parameters for Step 1 ( ~/pipeline/job_info/parameters/step1_par.txt ): Parameter Value par_automated_library_prep Yes par_fastq_directory /path/to/directory/contaning/fastqs par_RNA_run_names run1GEX par_HTO_run_names run1HTO par_seq_run_names run1 par_paired_end_seq Yes par_id Hash1, Hash2, Hash3, Hash4, Hash5, Hash6, Hash7, Hash8 par_name A_TotalSeqA, B_TotalSeqA, C_TotalSeqA, D_TotalSeqA, E_TotalSeqA, F_TotalSeqA, G_TotalSeqA, H_TotalSeqA par_read R2 par_pattern 5P(BC) par_sequence AGGACCATCCAA, ACATGTTACCGT, AGCTTACTATCC, TCGATAATGCGA, GAGGCTGAGCTA, GTGTGACGTATT, ACTGTCTAACGG, TATCACATCGGT par_ref_dir_grch ~/genome/10xGenomics/refdata-cellranger-GRCh38-3.0.0 par_r1_length NULL (commented out) par_r2_length NULL (commented out) par_mempercode 30 par_include_introns NULL (commented out) par_no_target_umi_filter NULL (commented out) par_expect_cells NULL (commented out) par_force_cells NULL (commented out) par_no_bam NULL (commented out) Note: The parameters file for each step is located in ~/pipeline/job_info/parameters . For a comprehensive description of the execution parameters for each step see here . Given that CellRanger runs a user interface and is not submitted as a job, it is recommended to run Step 1 in a 'screen' session, which will allow the task to keep running if the connection is broken. 
To run Step 1, use the following command: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline screen -S run_PBMC_application_case bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 1 The outputs of the CellRanger counts pipeline are deposited into ~/pipeline/step1 .","title":"Step 1: FASTQ to gene expression matrix"},{"location":"Dataset2/#step-2-create-seurat-object-and-remove-ambient-rna","text":"In Step 2, we are going to begin by correcting the RNA assay for ambient RNA contamination using SoupX ( Young et al. 2020 ). We will then use the ambient RNA-corrected feature-barcode matrices to create a Seurat object. For our analysis of the PBMC dataset we set the following execution parameters for Step 2 ( ~/pipeline/job_info/parameters/step2_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_ambient_RNA Yes par_normalization.method LogNormalize par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 We can run Step 2 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 2 Step 2 produces the following outputs: ~/pipeline step2 \u251c\u2500\u2500 figs2 \u2502 \u251c\u2500\u2500 ambient_RNA_estimation_run1.pdf \u2502 \u251c\u2500\u2500 ambient_RNA_markers_run1.pdf \u2502 \u251c\u2500\u2500 cell_cyle_dim_plot_run1.pdf \u2502 \u251c\u2500\u2500 vioplot_run1.pdf \u2502 \u2514\u2500\u2500 zoomed_in_vioplot_run1.pdf \u251c\u2500\u2500 info2 \u2502 \u251c\u2500\u2500 estimated_ambient_RNA_run1.txt \u2502 \u251c\u2500\u2500 MetaData_1.txt \u2502 \u251c\u2500\u2500 meta_info_1.txt \u2502 \u251c\u2500\u2500 run1_ambient_rna_summary.rds \u2502 \u251c\u2500\u2500 sessionInfo.txt \u2502 \u251c\u2500\u2500 seu1_RNA.txt \u2502 \u2514\u2500\u2500 summary_seu1.txt \u251c\u2500\u2500 objs2 \u2502 \u2514\u2500\u2500 run1.rds \u2514\u2500\u2500 step2_ambient \u2514\u2500\u2500 run1 
\u251c\u2500\u2500 barcodes.tsv \u251c\u2500\u2500 genes.tsv \u2514\u2500\u2500 matrix.mtxs Note: For a comprehensive description of the outputs for each analytical step, please see the Outputs section of the scRNAbox documentation. Figure 1. Figures produced by Step 2 of the scRNAbox pipeline. A) Estimated ambient RNA contamination rate (Rho) by SoupX. Estimates of the RNA contamination rate using various estimators are visualized via a frequency distribution; the true contamination rate is assigned as the most frequent estimate (red line; 8.7%). B) Log10 ratios of observed counts to expected counts for marker genes from each cluster. Clusters are defined by the CellRanger counts pipeline. The red line displays the estimated RNA contamination rate if the estimation was based entirely on the corresponding gene. C) Principal component analysis (PCA) of Seurat S and G2M cell cycle reference genes. D) Violin plots showing the distribution of cells according to quality control metrics calculated in Step 2. E) Zoomed-in violin plots, from the minimum to the mean, showing the distribution of cells according to quality control metrics calculated in Step 2.","title":"Step 2: Create Seurat object and remove ambient RNA"},{"location":"Dataset2/#step-3-quality-control-and-filtering","text":"In Step 3, we are going to perform quality control procedures and filter out low-quality cells. We are going to filter out cells with < 50 unique RNA transcripts, > 6000 unique RNA transcripts, < 200 total RNA transcripts, > 7000 total RNA transcripts, and > 50% mitochondrial transcripts. 
For our analysis of the PBMC dataset we set the following execution parameters for Step 3 ( ~/pipeline/job_info/parameters/step3_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_seurat_object NULL par_nFeature_RNA_L 50 par_nFeature_RNA_U 6000 par_nCount_RNA_L 200 par_nCount_RNA_U 7000 par_mitochondria_percent_L 0 par_mitochondria_percent_U 50 par_ribosomal_percent_L 0 par_ribosomal_percent_U 100 par_remove_mitochondrial_genes No par_remove_ribosomal_genes No par_remove_genes NULL par_regress_cell_cycle_genes Yes par_normalization.method LogNormalize par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 par_top 10 par_npcs_pca 30 We can run Step 3 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 3 Step 3 produces the following outputs. step3 \u251c\u2500\u2500 figs3 \u2502 \u251c\u2500\u2500 dimplot_pca_run1.pdf \u2502 \u251c\u2500\u2500 elbowplot_run1.pdf \u2502 \u251c\u2500\u2500 filtered_QC_vioplot_run1.pdf \u2502 \u2514\u2500\u2500 VariableFeaturePlot_run1.pdf \u251c\u2500\u2500 info3 \u2502 \u251c\u2500\u2500 MetaData_run1.txt \u2502 \u251c\u2500\u2500 meta_info_run1.txt \u2502 \u251c\u2500\u2500 most_variable_genes_run1.txt \u2502 \u251c\u2500\u2500 run1_RNA.txt \u2502 \u251c\u2500\u2500 sessionInfo.txt \u2502 \u2514\u2500\u2500 summary_run1.txt \u2514\u2500\u2500 objs3 \u2514\u2500\u2500 run1.rds Figure 2. Figures produced by Step 3 of the scRNAbox pipeline. A) Violin plots showing the distribution of cells according to quality control metrics after filtering by user-defined thresholds. B) Scatter plot showing the top 2500 most variable features; the top 10 most variable features are labelled. C) Principal component analysis (PCA) visualizing the first two principal components (PCs). 
D) Elbow plot to visualize the percentage of variance explained by each PC.","title":"Step 3: Quality control and filtering"},{"location":"Dataset2/#step-4-demultiplexing-and-doublet-detection","text":"In Step 4, we are going to demultiplex the pooled samples and remove doublets (erroneous libraries produced by two or more cells) based on the expression of the sample-specific barcodes (antibody assay). If the barcode labels used in the analysis are unknown, the first step is to retrieve them from the Seurat object. To do this, we do not need to modify the execution parameters and can go straight to running the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 4 \\ --msd T The above code produces the following file: step4 \u251c\u2500\u2500 figs4 \u251c\u2500\u2500 info4 \u2502 \u2514\u2500\u2500 seu1.rds_old_antibody_label_MULTIseqDemuxHTOcounts.csv \u2514\u2500\u2500 objs4 Which contains the names of the barcode labels (i.e. A_TotalSeqA , B_TotalSeqA , C_TotalSeqA , D_TotalSeqA , E_TotalSeqA , F_TotalSeqA , G_TotalSeqA , H_TotalSeqA , Doublet , Negative ). Now that we know the barcode labels used in the PBMC dataset, we can perform demultiplexing and doublet detection. 
For our analysis of the PBMC dataset we set the following execution parameters for Step 4 ( ~/pipeline/job_info/parameters/step4_par.txt ): Parameter Value par_save_RNA Yes par_save_metadata Yes par_normalization.method CLR par_scale.factor 10000 par_selection.method vst par_nfeatures 2500 par_dimensionality_reduction Yes par_npcs_pca 30 par_dims_umap 3 par_n.neighbor 65 par_dropDN Yes par_label_dropDN Doublet, Negative par_quantile 0.9 par_autoThresh TRUE par_maxiter 5 par_RidgePlot_ncol 3 par_old_antibody_label A-TotalSeqA, B-TotalSeqA, C-TotalSeqA, D-TotalSeqA, E-TotalSeqA, F-TotalSeqA, G-TotalSeqA, H-TotalSeqA, Doublet par_new_antibody_label sample-A, sample-B, sample-C, sample-D, sample-E, sample-F, sample-G, sample-H, Doublet We can run Step 4 using the following code: export SCRNABOX_HOME=~/scrnabox/scrnabox.slurm export SCRNABOX_PWD=~/pipeline bash $SCRNABOX_HOME/launch_scrnabox.sh \\ -d ${SCRNABOX_PWD} \\ --steps 4 Step 4 produces the following outputs. step4 \u251c\u2500\u2500 figs4 \u2502 \u251c\u2500\u2500 run1_DotPlot_HTO_MSD.pdf \u2502 \u251c\u2500\u2500 run1_Heatmap_HTO_MSD.pdf \u2502 \u251c\u2500\u2500 run1_HTO_dimplot_pca.pdf \u2502 \u251c\u2500\u2500 run1_HTO_dimplot_umap.pdf \u2502 \u251c\u2500\u2500 run1_nCounts_RNA_MSD.pdf \u2502 \u2514\u2500\u2500 run1_Ridgeplot_HTO_MSD.pdf \u251c\u2500\u2500 info4 \u2502 \u251c\u2500\u2500 run1_filtered_MULTIseqDemuxHTOcounts.csv \u2502 \u251c\u2500\u2500 run1_MetaData.txt \u2502 \u251c\u2500\u2500 run1_meta_info_.txt \u2502 \u251c\u2500\u2500 run1_MULTIseqDemuxHTOcounts.csv \u2502 \u251c\u2500\u2500 run1_RNA.txt \u2502 \u2514\u2500\u2500 sessionInfo.txt \u2514\u2500\u2500 objs4 \u2514\u2500\u2500 run1.rds Figure 3. Figures produced by Step 4 of the Cell Hashtag Analysis Track. A) Uniform Manifold Approximation and Projection (UMAP) plot, taking the first three principal components (PCs) of the antibody assay as input. B) Principal component analysis (PCA) showing the first two PCs of the antibody assay. 
C) Ridge plot visualizing the enrichment of barcode labels across sample assignments at the sample level. D) Dot plot visualizing the enrichment of barcode labels across sample assignments at the sample level. E) Heatmap visualizing the enrichment of barcode labels across sample assignments at the cell level. F) Violin plot visualizing the distribution of the number of total RNA transcripts identified per cell, stratified by sample assignment.","title":"Step 4: Demultiplexing and doublet detection"},{"location":"Dataset2/#publication-ready-figures","text":"The code used to produce the publication-ready figures used in our pre-print manuscript is available here .","title":"Publication-ready figures"},{"location":"Dataset2/#job-configurations","text":"The following job configurations were used for our analysis of the PBMC dataset. Job configurations can be modified for each analytical step in the scrnabox_config.ini file in ~/pipeline/job_info/configs Step THREADS_ARRAY MEM_ARRAY WALLTIME_ARRAY Step2 4 16g 00-05:00 Step3 4 16g 00-05:00 Step4 4 16g 00-05:00","title":"Job Configurations"},{"location":"LICENSE/","text":"License MIT License Copyright (c) 2022 The Neuro Bioinformatics Core Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.","title":"License"},{"location":"LICENSE/#license","text":"MIT License Copyright (c) 2022 The Neuro Bioinformatics Core Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.","title":"License"},{"location":"Step0/","text":"Step 1: Setting up the Ensemblex pipeline In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use: Demultiplexing with prior genotype information Demultiplexing without prior genotype information Demultiplexing with prior genotype information First, create a dedicated folder for the analysis (hereafter referred to as the working directory). 
Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory mkdir working_directory cd /path/to/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory Next, we can set up the working directory for demultiplexing with prior genotype information using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT After running the above code, the working directory should have the following structure working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 demuxlet \u251c\u2500\u2500 ensemblex_gt \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo_gt Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files Demultiplexing without prior genotype information First, create a dedicated folder for the analysis (hereafter referred to as the working directory). 
Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory mkdir working_directory cd /path/to/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory Next, we can set up the working directory for demultiplexing without prior genotype information using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-noGT After running the above code, the working directory should have the following structure working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 freemuxlet \u251c\u2500\u2500 ensemblex \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files","title":"Step 1: Set up"},{"location":"Step0/#step-1-setting-up-the-ensemblex-pipeline","text":"In Step 1, we will set up the working directory for the Ensemblex pipeline and decide which version of the pipeline we want to use: Demultiplexing with prior genotype information Demultiplexing without prior genotype information","title":"Step 1: Setting up the Ensemblex pipeline"},{"location":"Step0/#demultiplexing-with-prior-genotype-information","text":"First, create a dedicated folder for the analysis (hereafter referred to as the working directory). 
Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory mkdir working_directory cd /path/to/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory Next, we can set up the working directory for demultiplexing with prior genotype information using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-GT After running the above code, the working directory should have the following structure working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 demuxlet \u251c\u2500\u2500 ensemblex_gt \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo_gt Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files","title":"Demultiplexing with prior genotype information"},{"location":"Step0/#demultiplexing-without-prior-genotype-information","text":"First, create a dedicated folder for the analysis (hereafter referred to as the working directory). 
Then, define the path to the working directory and the path to ensemblex.pip: ## Create and navigate to the working directory mkdir working_directory cd /path/to/working_directory ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory Next, we can set up the working directory for demultiplexing without prior genotype information using the following code: bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step init-noGT After running the above code, the working directory should have the following structure working_directory \u251c\u2500\u2500 demuxalot \u251c\u2500\u2500 freemuxlet \u251c\u2500\u2500 ensemblex \u251c\u2500\u2500 input_files \u251c\u2500\u2500 job_info \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u251c\u2500\u2500 logs \u2502 \u2514\u2500\u2500 summary_report.txt \u251c\u2500\u2500 souporcell \u2514\u2500\u2500 vireo Upon setting up the Ensemblex pipeline, we can proceed to Step 2 where we will prepare the input files for Ensemblex's constituent genetic demultiplexing tools: Preparation of input files","title":"Demultiplexing without prior genotype information"},{"location":"Step1/","text":"Step 2: Preparing input files for genetic demultiplexing In Step 2, we will define the necessary files needed for Ensemblex's constituent genetic demultiplexing tools and will place them within the working directory. 
The necessary files vary depending on the version of the Ensemblex pipeline being used: Demultiplexing with prior genotype information Demultiplexing without prior genotype information Demultiplexing with prior genotype information Required files To demultiplex the pooled samples with prior genotype information, the following files are required: File Description gene_expression.bam Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam) gene_expression.bam.bai Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai) barcodes.tsv Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv) pooled_samples.vcf vcf file describing the genotypes of the pooled samples genome_reference.fa Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa) genome_reference.fa.fai Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai) genotype_reference.vcf Population reference vcf file (e.g., 1000 Genomes Project) NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex documentation. 
Placing files into the Ensemblex pipeline working directory First, define all of the required files: BAM=/path/to/possorted_genome_bam.bam BAM_INDEX=/path/to/possorted_genome_bam.bam.bai BARCODES=/path/to/barcodes.tsv SAMPLE_VCF=/path/to/pooled_samples.vcf REFERENCE_VCF=/path/to/genotype_reference.vcf REFERENCE_FASTA=/path/to/genome.fa REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai Then, place the required files in the Ensemblex pipeline working directory: ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory ## Copy the files to the input_files directory in the working directory cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $SAMPLE_VCF $ensemblex_PWD/input_files/pooled_samples.vcf cp $REFERENCE_VCF $ensemblex_PWD/input_files/reference.vcf cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files: working_directory \u2514\u2500\u2500 input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 pooled_samples.vcf \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE: You will notice that the names of the input files have been standardized; it is important that the input files have the corresponding names for the Ensemblex pipeline to work properly.
Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools Demultiplexing without prior genotype information Required files To demultiplex the pooled samples without prior genotype information, the following files are required: File Description gene_expression.bam Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam) gene_expression.bam.bai Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai) barcodes.tsv Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv) genome_reference.fa Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa) genome_reference.fa.fai Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai) genotype_reference.vcf Population reference vcf file (e.g., 1000 Genomes Project) NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex documentation.
Placing files into the Ensemblex pipeline working directory First, define all of the required files: BAM=/path/to/possorted_genome_bam.bam BAM_INDEX=/path/to/possorted_genome_bam.bam.bai BARCODES=/path/to/barcodes.tsv REFERENCE_VCF=/path/to/genotype_reference.vcf REFERENCE_FASTA=/path/to/genome.fa REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai Then, place the required files in the Ensemblex pipeline working directory: ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory ## Copy the files to the input_files directory in the working directory cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $REFERENCE_VCF $ensemblex_PWD/input_files/reference.vcf cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files: working_directory \u2514\u2500\u2500 input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE: You will notice that the names of the input files have been standardized; it is important that the input files have the corresponding names for the Ensemblex pipeline to work properly.
Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools","title":"Step 2: Preparation of input files"},{"location":"Step1/#step-2-preparing-input-files-for-genetic-demultiplexing","text":"In Step 2, we will define the necessary files needed for Ensemblex's constituent genetic demultiplexing tools and will place them within the working directory. The necessary files vary depending on the version of the Ensemblex pipeline being used: Demultiplexing with prior genotype information Demultiplexing without prior genotype information","title":"Step 2: Preparing input files for genetic demultiplexing"},{"location":"Step1/#demultiplexing-with-prior-genotype-information","text":"","title":"Demultiplexing with prior genotype information"},{"location":"Step1/#required-files","text":"To demultiplex the pooled samples with prior genotype information, the following files are required: File Description gene_expression.bam Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam) gene_expression.bam.bai Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai) barcodes.tsv Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv) pooled_samples.vcf vcf file describing the genotypes of the pooled samples genome_reference.fa Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa) genome_reference.fa.fai Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai) genotype_reference.vcf Population reference vcf file (e.g., 1000 Genomes Project) NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex
documentation.","title":"Required files"},{"location":"Step1/#placing-files-into-the-ensemblex-pipeline-working-directory","text":"First, define all of the required files: BAM=/path/to/possorted_genome_bam.bam BAM_INDEX=/path/to/possorted_genome_bam.bam.bai BARCODES=/path/to/barcodes.tsv SAMPLE_VCF=/path/to/pooled_samples.vcf REFERENCE_VCF=/path/to/genotype_reference.vcf REFERENCE_FASTA=/path/to/genome.fa REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai Then, place the required files in the Ensemblex pipeline working directory: ## Define the path to the working directory ensemblex_PWD=/path/to/working_directory ## Copy the files to the input_files directory in the working directory cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $SAMPLE_VCF $ensemblex_PWD/input_files/pooled_samples.vcf cp $REFERENCE_VCF $ensemblex_PWD/input_files/reference.vcf cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files: working_directory \u2514\u2500\u2500 input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 pooled_samples.vcf \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE: You will notice that the names of the input files have been standardized; it is important that the input files have the corresponding names for the Ensemblex pipeline to work properly.
Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools","title":"Placing files into the Ensemblex pipeline working directory"},{"location":"Step1/#demultiplexing-without-prior-genotype-information","text":"","title":"Demultiplexing without prior genotype information"},{"location":"Step1/#required-files_1","text":"To demultiplex the pooled samples without prior genotype information, the following files are required: File Description gene_expression.bam Gene expression bam file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam) gene_expression.bam.bai Gene expression bam index file of the pooled samples (e.g., 10X Genomics possorted_genome_bam.bam.bai) barcodes.tsv Barcodes tsv file of the pooled cells (e.g., 10X Genomics barcodes.tsv) genome_reference.fa Genome reference fasta file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa) genome_reference.fa.fai Genome reference fasta index file (e.g., 10X Genomics: ~/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa.fai) genotype_reference.vcf Population reference vcf file (e.g., 1000 Genomes Project) NOTE: We demonstrate how to download reference vcf and fasta files in the Tutorial section of the Ensemblex documentation.","title":"Required files"},{"location":"Step1/#placing-files-into-the-ensemblex-pipeline-working-directory_1","text":"First, define all of the required files: BAM=/path/to/possorted_genome_bam.bam BAM_INDEX=/path/to/possorted_genome_bam.bam.bai BARCODES=/path/to/barcodes.tsv REFERENCE_VCF=/path/to/genotype_reference.vcf REFERENCE_FASTA=/path/to/genome.fa REFERENCE_FASTA_INDEX=/path/to/genome.fa.fai Then, place the required files in the Ensemblex pipeline working directory: ## Define the path to the working directory
ensemblex_PWD=/path/to/working_directory ## Copy the files to the input_files directory in the working directory cp $BAM $ensemblex_PWD/input_files/pooled_bam.bam cp $BAM_INDEX $ensemblex_PWD/input_files/pooled_bam.bam.bai cp $BARCODES $ensemblex_PWD/input_files/pooled_barcodes.tsv cp $REFERENCE_VCF $ensemblex_PWD/input_files/reference.vcf cp $REFERENCE_FASTA $ensemblex_PWD/input_files/reference.fa cp $REFERENCE_FASTA_INDEX $ensemblex_PWD/input_files/reference.fa.fai If the file transfer was successful, the input_files directory of the Ensemblex pipeline working directory will contain the following files: working_directory \u2514\u2500\u2500 input_files \u251c\u2500\u2500 pooled_bam.bam \u251c\u2500\u2500 pooled_bam.bam.bai \u251c\u2500\u2500 pooled_barcodes.tsv \u251c\u2500\u2500 reference.fa \u251c\u2500\u2500 reference.fa.fai \u2514\u2500\u2500 reference.vcf NOTE: You will notice that the names of the input files have been standardized; it is important that the input files have the corresponding names for the Ensemblex pipeline to work properly. Upon placing the required files in the Ensemblex pipeline, we can proceed to Step 3 where we will demultiplex the pooled samples using Ensemblex's constituent genetic demultiplexing tools: Genetic demultiplexing by constituent tools","title":"Placing files into the Ensemblex pipeline working directory"},{"location":"Step2/","text":"Step 3: Genetic demultiplexing by constituent demultiplexing tools In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools. The constituent genetic demultiplexing tools will vary depending on the version of the Ensemblex pipeline being used: Demultiplexing with prior genotype information Demultiplexing without prior genotype information NOTE : The analytical parameters for each constituent tool can be adjusted using the ensemblex_config.ini file located in ~/working_directory/job_info/configs .
For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline please see Execution parameters . Demultiplexing with prior genotype information When demultiplexing with prior genotype information, Ensemblex leverages the sample labels from Demuxalot Demuxlet Souporcell Vireo-GT Demuxalot To run Demuxalot use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot working_directory \u2514\u2500\u2500 demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas Demuxlet To run Demuxlet use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet If Demuxlet completed successfully, the following files should be available in ~/working_directory/demuxlet working_directory \u2514\u2500\u2500 demuxlet \u251c\u2500\u2500 outs.best \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500 pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz Souporcell To run Souporcell use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell working_directory \u2514\u2500\u2500 souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam \u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf 
\u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt Vireo-GT To run Vireo-GT use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If Vireo-GT completed successfully, the following files should be available in ~/working_directory/vireo_gt working_directory \u2514\u2500\u2500 vireo_gt \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx \u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 fig_GT_distance_input.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex Demultiplexing without prior genotype information When demultiplexing without prior genotype information, Ensemblex leverages the sample labels from Freemuxlet Souporcell Vireo Demuxalot Freemuxlet To run Freemuxlet use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step freemuxlet If Freemuxlet completed successfully, the following files should be available in ~/working_directory/freemuxlet working_directory \u2514\u2500\u2500 freemuxlet \u251c\u2500\u2500 outs.clust1.samples.gz \u251c\u2500\u2500 outs.clust1.vcf \u251c\u2500\u2500 outs.lmix \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500
pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz Souporcell To run Souporcell use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell working_directory \u2514\u2500\u2500 souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam \u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf \u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt Vireo To run Vireo use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If Vireo completed successfully, the following files should be available in ~/working_directory/vireo working_directory \u2514\u2500\u2500 vireo \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx \u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv Demuxalot NOTE : Because the Demuxalot algorithm requires prior genotype information, the Ensemblex pipeline uses the predicted vcf file generated by Freemuxlet as input into Demuxalot when prior genotype information is not available. Therefore, it is important to wait for Freemuxlet to complete before running Demuxalot. 
To check if the required Freemuxlet-generated vcf file is available prior to running Demuxalot, you can use the following code: if test -f /path/to/working_directory/freemuxlet/outs.clust1.vcf; then echo \"File exists.\" fi Upon confirming that the required Freemuxlet-generated file exists, we can run Demuxalot using the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot working_directory \u2514\u2500\u2500 demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex","title":"Step 3: Genetic demultiplexing by constituent tools"},{"location":"Step2/#step-3-genetic-demultiplexing-by-constituent-demultiplexing-tools","text":"In Step 3, we will demultiplex the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools. The constituent genetic demultiplexing tools will vary depending on the version of the Ensemblex pipeline being used: Demultiplexing with prior genotype information Demultiplexing without prior genotype information NOTE : The analytical parameters for each constituent tool can be adjusted using the ensemblex_config.ini file located in ~/working_directory/job_info/configs .
For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline please see Execution parameters .","title":"Step 3: Genetic demultiplexing by constituent demultiplexing tools"},{"location":"Step2/#demultiplexing-with-prior-genotype-information","text":"When demultiplexing with prior genotype information, Ensemblex leverages the sample labels from Demuxalot Demuxlet Souporcell Vireo-GT","title":"Demultiplexing with prior genotype information"},{"location":"Step2/#demuxalot","text":"To run Demuxalot use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot working_directory \u2514\u2500\u2500 demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas","title":"Demuxalot"},{"location":"Step2/#demuxlet","text":"To run Demuxlet use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxlet If Demuxlet completed successfully, the following files should be available in ~/working_directory/demuxlet working_directory \u2514\u2500\u2500 demuxlet \u251c\u2500\u2500 outs.best \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500 pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz","title":"Demuxlet"},{"location":"Step2/#souporcell","text":"To run Souporcell use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell working_directory \u2514\u2500\u2500 souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 
cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam \u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf \u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt","title":"Souporcell"},{"location":"Step2/#vireo-gt","text":"To run Vireo-GT use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If Vireo-GT completed successfully, the following files should be available in ~/working_directory/vireo_gt working_directory \u2514\u2500\u2500 vireo_gt \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx \u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 fig_GT_distance_input.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex","title":"Vireo-GT"},{"location":"Step2/#demultiplexing-without-prior-genotype-information","text":"When demultiplexing without prior genotype information, Ensemblex leverages the sample labels from Freemuxlet Souporcell Vireo Demuxalot","title":"Demultiplexing without prior genotype information"},{"location":"Step2/#freemuxlet","text":"To run Freemuxlet use the following code:
ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step freemuxlet If Freemuxlet completed successfully, the following files should be available in ~/working_directory/freemuxlet working_directory \u2514\u2500\u2500 freemuxlet \u251c\u2500\u2500 outs.clust1.samples.gz \u251c\u2500\u2500 outs.clust1.vcf \u251c\u2500\u2500 outs.lmix \u251c\u2500\u2500 pileup.cel.gz \u251c\u2500\u2500 pileup.plp.gz \u251c\u2500\u2500 pileup.umi.gz \u2514\u2500\u2500 pileup.var.gz","title":"Freemuxlet"},{"location":"Step2/#souporcell_1","text":"To run Souporcell use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step souporcell If Souporcell completed successfully, the following files should be available in ~/working_directory/souporcell working_directory \u2514\u2500\u2500 souporcell \u251c\u2500\u2500 alt.mtx \u251c\u2500\u2500 cluster_genotypes.vcf \u251c\u2500\u2500 clusters_tmp.tsv \u251c\u2500\u2500 clusters.tsv \u251c\u2500\u2500 fq.fq \u251c\u2500\u2500 minimap.sam \u251c\u2500\u2500 minitagged.bam \u251c\u2500\u2500 minitagged_sorted.bam \u251c\u2500\u2500 minitagged_sorted.bam.bai \u251c\u2500\u2500 Pool.vcf \u251c\u2500\u2500 ref.mtx \u2514\u2500\u2500 soup.txt","title":"Souporcell"},{"location":"Step2/#vireo","text":"To run Vireo use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step vireo If Vireo completed successfully, the following files should be available in ~/working_directory/vireo working_directory \u2514\u2500\u2500 vireo \u251c\u2500\u2500 cellSNP.base.vcf.gz \u251c\u2500\u2500 cellSNP.cells.vcf.gz \u251c\u2500\u2500 cellSNP.samples.tsv \u251c\u2500\u2500 cellSNP.tag.AD.mtx \u251c\u2500\u2500 cellSNP.tag.DP.mtx \u251c\u2500\u2500 cellSNP.tag.OTH.mtx 
\u251c\u2500\u2500 donor_ids.tsv \u251c\u2500\u2500 fig_GT_distance_estimated.pdf \u251c\u2500\u2500 GT_donors.vireo.vcf.gz \u251c\u2500\u2500 _log.txt \u251c\u2500\u2500 prob_doublet.tsv.gz \u251c\u2500\u2500 prob_singlet.tsv.gz \u2514\u2500\u2500 summary.tsv","title":"Vireo"},{"location":"Step2/#demuxalot_1","text":"NOTE : Because the Demuxalot algorithm requires prior genotype information, the Ensemblex pipeline uses the predicted vcf file generated by Freemuxlet as input into Demuxalot when prior genotype information is not available. Therefore, it is important to wait for Freemuxlet to complete before running Demuxalot. To check if the required Freemuxlet-generated vcf file is available prior to running Demuxalot, you can use the following code: if test -f /path/to/working_directory/freemuxlet/outs.clust1.vcf; then echo \"File exists.\" fi Upon confirming that the required Freemuxlet-generated file exists, we can run Demuxalot using the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step demuxalot If Demuxalot completed successfully, the following files should be available in ~/working_directory/demuxalot working_directory \u2514\u2500\u2500 demuxalot \u251c\u2500\u2500 Demuxalot_result.csv \u2514\u2500\u2500 new_snps_single_file.betas Upon demultiplexing the pooled samples with each of Ensemblex's constituent genetic demultiplexing tools, we can proceed to Step 4 where we will process the output files of the constituent tools with the Ensemblex algorithm to generate the ensemble sample classifications: Application of Ensemblex","title":"Demuxalot"},{"location":"Step3/","text":"Step 4: Application of Ensemblex Introduction Ensemblex parameters Applying the Ensemblex algorithm Introduction In Step 4, we will process the output files from the constituent genetic demultiplexing tools with the Ensemblex framework.
Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools: Step 1: Probabilistic-weighted ensemble In Step 1, Ensemblex utilizes an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool\u2019s assignment probability distribution by its estimated balanced accuracy for the dataset. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell. Step 2: Graph-based doublet detection In Step 2, Ensemblex utilizes a graph-based approach to identify doublets that were incorrectly labeled as singlets in Step 1. Pooled cells are embedded into PCA space and the most confident doublets in the pool (nCD) are identified. Then, based on the Euclidean distance in PCA space, the pooled cells that surpass the percentile threshold (pT) of the nearest neighbour frequency to the confident doublets are labelled as doublets by Ensemblex. Ensemblex performs an automated parameter sweep to identify the optimal nCD and pT values; however, users can opt to manually define these parameters. Step 3: Ensemble-independent doublet detection In Step 3, Ensemblex utilizes an ensemble-independent approach to further improve doublet detection. Here, cells that are labelled as doublets by Demuxalot or Vireo are labelled as doublets by Ensemblex; however, users can nominate different tools to utilize for Step 3, depending on the desired doublet detection stringency. Ensemblex parameters Users can choose to run each step of the Ensemblex framework sequentially (Steps 1 to 3) or can opt to skip certain steps. While Step 1 is necessary to generate the ensemble sample labels, Steps 2 and 3 were implemented to improve Ensemblex's ability to identify doublets; thus, if users do not want to prioritize doublet detection, they may skip Steps 2 and/or 3.
Nonetheless, we demonstrated in our pre-print manuscript that utilizing the entire Ensemblex framework is important for maximizing the demultiplexing accuracy. Users can define which steps of the Ensemblex framework they want to utilize in the adjustable parameters file. The adjustable parameters file ( ensemblex_config.ini ) is located in ~/working_directory/job_info/configs/ . For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline please see Execution parameters . The following parameters are adjustable when applying the Ensemblex algorithm: Parameter Default Description Pool parameters PAR_ensemblex_sample_size NULL Number of samples multiplexed in the pool. PAR_ensemblex_expected_doublet_rate NULL Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation . Set up parameters PAR_ensemblex_merge_constituents Yes Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated merged file. Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated Step 1 output file. Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. 
Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively. PAR_ensemblex_nCD NULL Manually defined number of confident doublets in the pool (nCD). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_pT NULL Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_graph_based_doublet_detection Yes Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values. Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters. PAR_ensemblex_ensemble_independent_doublet Yes Whether or not to perform Step 3: Ensemble-independent doublet detection. PAR_ensemblex_doublet_Demuxalot_threshold Yes Whether or not to label doublets identified by Demuxalot as doublets.
Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxalot_no_threshold No Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Demuxlet_threshold No Whether or not to label doublets identified by Demuxlet as doublets. Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxlet_no_threshold No Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Souporcell_threshold No Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Souporcell_no_threshold No Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Vireo_threshold Yes Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Vireo_no_threshold No Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability. Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses. 
Applying the Ensemblex algorithm To apply the Ensemblex algorithm use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step ensemblexing If the ensemblex algorithm completed successfully, the following files should be available in ~/working_directory/ensemblex working_directory \u2514\u2500\u2500 ensemblex \u251c\u2500\u2500 confidence \u2502 \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u2502 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2502 \u2514\u2500\u2500 step1_cell_assignment.csv \u251c\u2500\u2500 step2 \u2502 \u251c\u2500\u2500 optimal_nCD.pdf \u2502 \u251c\u2500\u2500 optimal_pT.pdf \u2502 \u251c\u2500\u2500 PC1_var_contrib.pdf \u2502 \u251c\u2500\u2500 PC2_var_contrib.pdf \u2502 \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA_plot.pdf \u2502 \u251c\u2500\u2500 PCA_scree_plot.pdf \u2502 \u2514\u2500\u2500 Step2_cell_assignment.csv \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_Ensemblux_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_Ensemblux_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv For a comprehensive description of the Ensemblex algorithm output files, please see Ensemblex outputs .","title":"Step 4: Application of Ensemblex"},{"location":"Step3/#step-4-application-of-ensemblex","text":"Introduction Ensemblex parameters Applying the Ensemblex algorithm","title":"Step 4: Application of Ensemblex"},{"location":"Step3/#introduction","text":"In 
Step 4, we will process the output files from the constituent genetic demultiplexing tools with the Ensemblex framework. Ensemblex processes the output files in a three-step pipeline to identify the most probable sample label for each cell based on the predictions of the constituent tools: Step 1: Probabilistic-weighted ensemble In Step 1, Ensemblex utilizes an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool\u2019s assignment probability distribution by its estimated balanced accuracy for the dataset. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell. Step 2: Graph-based doublet detection In Step 2, Ensemblex utilizes a graph-based approach to identify doublets that were incorrectly labeled as singlets in Step 1. Pooled cells are embedded into PCA space and the most confident doublets in the pool (nCD) are identified. Then, based on the Euclidean distance in PCA space, the pooled cells that surpass the percentile threshold (pT) of the nearest neighbour frequency to the confident doublets are labelled as doublets by Ensemblex. Ensemblex performs an automated parameter sweep to identify the optimal nCD and pT values; however, users can opt to manually define these parameters. Step 3: Ensemble-independent doublet detection In Step 3, Ensemblex utilizes an ensemble-independent approach to further improve doublet detection. Here, cells that are labelled as doublets by Demuxalot or Vireo are labelled as doublets by Ensemblex; however, users can nominate different tools to utilize for Step 3, depending on the desired doublet detection stringency.","title":"Introduction"},{"location":"Step3/#ensemblex-parameters","text":"Users can choose to run each step of the Ensemblex framework sequentially (Steps 1 to 3) or can opt to skip certain steps. 
While Step 1 is necessary to generate the ensemble sample labels, Steps 2 and 3 were implemented to improve Ensemblex's ability to identify doublets; thus, if users do not want to prioritize doublet detection, they may skip Steps 2 and/or 3. Nonetheless, we demonstrated in our pre-print manuscript that utilizing the entire Ensemblex framework is important for maximizing the demultiplexing accuracy. Users can define which steps of the Ensemblex framework they want to utilize in the adjustable parameters file. The adjustable parameters file ( ensemblex_config.ini ) is located in ~/working_directory/job_info/configs/ . For a comprehensive description of how to adjust the analytical parameters of the Ensemblex pipeline please see Execution parameters . The following parameters are adjustable when applying the Ensemblex algorithm: Parameter Default Description Pool parameters PAR_ensemblex_sample_size NULL Number of samples multiplexed in the pool. PAR_ensemblex_expected_doublet_rate NULL Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation . Set up parameters PAR_ensemblex_merge_constituents Yes Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated merged file. Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". 
Subsequent runs of ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated Step 1 output file. Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively. PAR_ensemblex_nCD NULL Manually defined number of confident doublets in the pool (nCD). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_pT NULL Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_graph_based_doublet_detection Yes Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values. Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. 
Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters. PAR_ensemblex_ensemble_independent_doublet Yes Whether or not to perform Step 3: Ensemble-independent doublet detection. PAR_ensemblex_doublet_Demuxalot_threshold Yes Whether or not to label doublets identified by Demuxalot as doublets. Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxalot_no_threshold No Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Demuxlet_threshold No Whether or not to label doublets identified by Demuxlet as doublets. Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxlet_no_threshold No Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Souporcell_threshold No Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Souporcell_no_threshold No Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Vireo_threshold Yes Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Vireo_no_threshold No Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability. 
Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses.","title":"Ensemblex parameters"},{"location":"Step3/#applying-the-ensemblex-algorithm","text":"To apply the Ensemblex algorithm use the following code: ensemblex_HOME=/path/to/ensemblex.pip ensemblex_PWD=/path/to/working_directory bash $ensemblex_HOME/launch_ensemblex.sh -d $ensemblex_PWD --step ensemblexing If the ensemblex algorithm completed successfully, the following files should be available in ~/working_directory/ensemblex working_directory \u2514\u2500\u2500 ensemblex \u251c\u2500\u2500 confidence \u2502 \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u2502 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u2502 \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2502 \u2514\u2500\u2500 step1_cell_assignment.csv \u251c\u2500\u2500 step2 \u2502 \u251c\u2500\u2500 optimal_nCD.pdf \u2502 \u251c\u2500\u2500 optimal_pT.pdf \u2502 \u251c\u2500\u2500 PC1_var_contrib.pdf \u2502 \u251c\u2500\u2500 PC2_var_contrib.pdf \u2502 \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u2502 \u251c\u2500\u2500 PCA_plot.pdf \u2502 \u251c\u2500\u2500 PCA_scree_plot.pdf \u2502 \u2514\u2500\u2500 Step2_cell_assignment.csv \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_Ensemblux_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_Ensemblux_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv For a comprehensive description of the Ensemblex algorithm output files, 
please see Ensemblex outputs .","title":"Applying the Ensemblex algorithm"},{"location":"contributing/","text":"Help and Feedback Any contributions or suggestions for improving the ensemblex pipeline are welcome and appreciated. You may directly contact Michael Fiorini or Saeid Amiri . If you encounter any issues, please open an issue in the GitHub repository . Alternatively, you are welcome to email the developers directly; for any questions please contact Michael Fiorini: michael.fiorini@mail.mcgill.ca","title":"Help and Feedback"},{"location":"contributing/#help-and-feedback","text":"Any contributions or suggestions for improving the ensemblex pipeline are welcome and appreciated. You may directly contact Michael Fiorini or Saeid Amiri . If you encounter any issues, please open an issue in the GitHub repository . Alternatively, you are welcome to email the developers directly; for any questions please contact Michael Fiorini: michael.fiorini@mail.mcgill.ca","title":"Help and Feedback"},{"location":"installation/","text":"Installation The Ensemblex container is freely available under an MIT open-source license at https://zenodo.org/records/11639103 . 
The Ensemblex container can be downloaded using the following code: ## Download the Ensemblex container curl \"https://zenodo.org/records/11639103/files/ensemblex.pip.zip?download=1\" --output ensemblex.pip.zip ## Unzip the Ensemblex container unzip ensemblex.pip.zip If installation was successful the following will be available: ensemblex.pip \u251c\u2500\u2500 gt \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u2514\u2500\u2500 scripts \u2502 \u251c\u2500\u2500 demuxalot \u2502 \u2502 \u251c\u2500\u2500 pipeline_demuxalot.sh \u2502 \u2502 \u2514\u2500\u2500 pipline_demuxalot.py \u2502 \u251c\u2500\u2500 demuxlet \u2502 \u2502 \u2514\u2500\u2500 pipeline_demuxlet.sh \u2502 \u251c\u2500\u2500 ensemblexing \u2502 \u2502 \u251c\u2500\u2500 ensemblexing.R \u2502 \u2502 \u251c\u2500\u2500 functions.R \u2502 \u2502 \u2514\u2500\u2500 pipeline_ensemblexing.sh \u2502 \u251c\u2500\u2500 souporcell \u2502 \u2502 \u2514\u2500\u2500 pipeline_souporcell_generate.sh \u2502 \u2514\u2500\u2500 vireo \u2502 \u2514\u2500\u2500 pipeline_vireo.sh \u251c\u2500\u2500 launch \u2502 \u251c\u2500\u2500 launch_gt.sh \u2502 \u2514\u2500\u2500 launch_nogt.sh \u251c\u2500\u2500 launch_ensemblex.sh \u251c\u2500\u2500 nogt \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u2514\u2500\u2500 scripts \u2502 \u251c\u2500\u2500 demuxalot \u2502 \u2502 \u251c\u2500\u2500 pipeline_demuxalot.py \u2502 \u2502 \u2514\u2500\u2500 pipeline_demuxalot.sh \u2502 \u251c\u2500\u2500 ensemblexing \u2502 \u2502 \u251c\u2500\u2500 ensemblexing_nogt.R \u2502 \u2502 \u251c\u2500\u2500 functions_nogt.R \u2502 \u2502 \u2514\u2500\u2500 pipeline_ensemblexing.sh \u2502 \u251c\u2500\u2500 freemuxlet \u2502 \u2502 \u2514\u2500\u2500 pipeline_freemuxlet.sh \u2502 \u251c\u2500\u2500 souporcell \u2502 \u2502 \u2514\u2500\u2500 pipeline_souporcell_generate.sh \u2502 \u2514\u2500\u2500 vireo \u2502 \u2514\u2500\u2500 pipeline_vireo.sh 
\u251c\u2500\u2500 README \u251c\u2500\u2500 soft \u2502 \u2514\u2500\u2500 ensemblex.sif \u2514\u2500\u2500 tools \u251c\u2500\u2500 sort_vcf_same_as_bam.sh \u2514\u2500\u2500 utils.sh In addition to the Ensemblex container, users must install Apptainer . For example: ## Load Apptainer module load apptainer/1.2.4 To test if the Ensemblex container is installed properly, run the following code: ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Print help message bash $ensemblex_HOME/launch_ensemblex.sh -h Which should return the following help message: ------------------- Usage: /home/fiorini9/scratch/ensemblex.pip/launch_ensemblex.sh [arguments] mandatory arguments: -d (--dir) = Working directory (where all the outputs will be printed) (give full path) --steps = Specify the steps to execute. Begin by selecting either init-GT or init-noGT to establish the working directory. For GT: vireo, demuxalot, demuxlet, souporcell, ensemblexing For noGT: vireo, demuxalot, freemuxlet, souporcell, ensemblexing optional arguments: -h (--help) = See helps regarding the pipeline arguments --vcf = The path of vcf file --bam = The path of bam file --sortout = The path snd nsme of vcf generated using sort ------------------- For a comprehensive help, visit https://neurobioinfo.github.io/ensemblex/site/ for documentation. After installing the Ensemblex container, we can proceed to Step 1 where we will initiate the Ensemblex pipeline for demultiplexing: Set up","title":"Installation"},{"location":"installation/#installation","text":"The Ensemblex container is freely available under an MIT open-source license at https://zenodo.org/records/11639103 . 
The Ensemblex container can be downloaded using the following code: ## Download the Ensemblex container curl \"https://zenodo.org/records/11639103/files/ensemblex.pip.zip?download=1\" --output ensemblex.pip.zip ## Unzip the Ensemblex container unzip ensemblex.pip.zip If installation was successful the following will be available: ensemblex.pip \u251c\u2500\u2500 gt \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u2514\u2500\u2500 scripts \u2502 \u251c\u2500\u2500 demuxalot \u2502 \u2502 \u251c\u2500\u2500 pipeline_demuxalot.sh \u2502 \u2502 \u2514\u2500\u2500 pipline_demuxalot.py \u2502 \u251c\u2500\u2500 demuxlet \u2502 \u2502 \u2514\u2500\u2500 pipeline_demuxlet.sh \u2502 \u251c\u2500\u2500 ensemblexing \u2502 \u2502 \u251c\u2500\u2500 ensemblexing.R \u2502 \u2502 \u251c\u2500\u2500 functions.R \u2502 \u2502 \u2514\u2500\u2500 pipeline_ensemblexing.sh \u2502 \u251c\u2500\u2500 souporcell \u2502 \u2502 \u2514\u2500\u2500 pipeline_souporcell_generate.sh \u2502 \u2514\u2500\u2500 vireo \u2502 \u2514\u2500\u2500 pipeline_vireo.sh \u251c\u2500\u2500 launch \u2502 \u251c\u2500\u2500 launch_gt.sh \u2502 \u2514\u2500\u2500 launch_nogt.sh \u251c\u2500\u2500 launch_ensemblex.sh \u251c\u2500\u2500 nogt \u2502 \u251c\u2500\u2500 configs \u2502 \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u2502 \u2514\u2500\u2500 scripts \u2502 \u251c\u2500\u2500 demuxalot \u2502 \u2502 \u251c\u2500\u2500 pipeline_demuxalot.py \u2502 \u2502 \u2514\u2500\u2500 pipeline_demuxalot.sh \u2502 \u251c\u2500\u2500 ensemblexing \u2502 \u2502 \u251c\u2500\u2500 ensemblexing_nogt.R \u2502 \u2502 \u251c\u2500\u2500 functions_nogt.R \u2502 \u2502 \u2514\u2500\u2500 pipeline_ensemblexing.sh \u2502 \u251c\u2500\u2500 freemuxlet \u2502 \u2502 \u2514\u2500\u2500 pipeline_freemuxlet.sh \u2502 \u251c\u2500\u2500 souporcell \u2502 \u2502 \u2514\u2500\u2500 pipeline_souporcell_generate.sh \u2502 \u2514\u2500\u2500 vireo \u2502 \u2514\u2500\u2500 pipeline_vireo.sh 
\u251c\u2500\u2500 README \u251c\u2500\u2500 soft \u2502 \u2514\u2500\u2500 ensemblex.sif \u2514\u2500\u2500 tools \u251c\u2500\u2500 sort_vcf_same_as_bam.sh \u2514\u2500\u2500 utils.sh In addition to the Ensemblex container, users must install Apptainer . For example: ## Load Apptainer module load apptainer/1.2.4 To test if the Ensemblex container is installed properly, run the following code: ## Define the path to ensemblex.pip ensemblex_HOME=/path/to/ensemblex.pip ## Print help message bash $ensemblex_HOME/launch_ensemblex.sh -h Which should return the following help message: ------------------- Usage: /home/fiorini9/scratch/ensemblex.pip/launch_ensemblex.sh [arguments] mandatory arguments: -d (--dir) = Working directory (where all the outputs will be printed) (give full path) --steps = Specify the steps to execute. Begin by selecting either init-GT or init-noGT to establish the working directory. For GT: vireo, demuxalot, demuxlet, souporcell, ensemblexing For noGT: vireo, demuxalot, freemuxlet, souporcell, ensemblexing optional arguments: -h (--help) = See helps regarding the pipeline arguments --vcf = The path of vcf file --bam = The path of bam file --sortout = The path snd nsme of vcf generated using sort ------------------- For a comprehensive help, visit https://neurobioinfo.github.io/ensemblex/site/ for documentation. After installing the Ensemblex container, we can proceed to Step 1 where we will initiate the Ensemblex pipeline for demultiplexing: Set up","title":"Installation"},{"location":"midbrain_download/","text":"Data Download Introduction Downloading and processing scRNAseq data Downloading sample genotype data Downloading reference genotype data Downloading genome reference file Introduction For the tutorial, we will leverage a pooled scRNAseq dataset produced by Jerber et al. . This pool contains induced pluripotent cell lines (iPSC) from 9 healthy controls that were differentiated towards a dopaminergic neuron state. 
In this section of the tutorial, we will: Download and process the pooled scRNAseq data with the CellRanger counts pipeline Download and process the sample genotype data Download reference genotype data Download a reference genome file Before we begin, we will create a designated folder for the Ensemblex tutorial: mkdir ensemblex_tutorial cd ensemblex_tutorial Downloading and processing scRNAseq data We will begin by downloading the pooled scRNAseq data from the Sequence Read Archive (SRA): ## Create a folder to place pooled scRNAseq data mkdir pooled_scRNAseq cd pooled_scRNAseq ## Download pooled scRNAseq FASTQ files wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_2.fastq.gz ## Rename pooled scRNAseq FASTQ files mv ERR4700019_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R1_001.fastq.gz mv ERR4700019_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R2_001.fastq.gz mv ERR4700020_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R1_001.fastq.gz mv ERR4700020_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R2_001.fastq.gz mv ERR4700021_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R1_001.fastq.gz mv ERR4700021_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R2_001.fastq.gz mv ERR4700022_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R1_001.fastq.gz mv ERR4700022_2.fastq.gz 
~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R2_001.fastq.gz Next, we will process the pooled scRNAseq data with the CellRanger counts pipeline: ## Create CellRanger directory cd ~/ensemblex_tutorial mkdir CellRanger cd CellRanger cellranger count \\ --id=pool \\ --fastqs=/home/fiorini9/scratch/ensemblex_pipeline_test/ensemblex_tutorial/pooled_scRNAseq \\ --sample=pool \\ --transcriptome=~/10xGenomics/refdata-cellranger-GRCh37 If the CellRanger counts pipeline completed successfully, it will have generated the following files that we will use for genetic demultiplexing downstream: possorted_genome_bam.bam possorted_genome_bam.bam.bai barcodes.tsv NOTE : For more information regarding the CellRanger counts pipeline, please see the 10X documentation . Downloading sample genotype data Next, we will download the whole exome .vcf files corresponding to the nine pooled individuals from which the iPSC lines were derived. We will download the .vcf files from the European Nucleotide Archive (ENA): ## Create a folder to place sample genotype data cd ~/ensemblex_tutorial mkdir sample_genotype cd sample_genotype ## HPSI0115i-hecn_6 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI0214i-pelm_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz.tbi ## HPSI0314i-sojd_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz wget 
ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz.tbi ## HPSI0414i-sebn_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0514i-uenn_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI0714i-pipw_4 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0715i-meue_5 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0914i-vaka_5 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI1014i-quls_2 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget 
ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi Upon downloading the individual genotype data, we will merge the individual files to generate a single .vcf file. ## Merge .vcf files module load bcftools bcftools merge *.vcf.gz > sample_genotype_merge.vcf The resulting sample_genotype_merge.vcf file will be used as prior genotype information for genetic demultiplexing downstream. Downloading reference genotype data Next, we will download a reference genotype file from the 1000 Genomes Project, Phase 3 : ## Create a folder to place the reference files cd ~/ensemblex_tutorial mkdir reference_files cd reference_files ## Download reference .vcf wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz.tbi ## Unzip .vcf file gunzip ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz ## Only keep SNPs module load vcftools vcftools --vcf ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf --remove-indels --recode --recode-INFO-all --out SNPs_only ## Only keep common variants module load bcftools bcftools filter -e 'AF<0.01' SNPs_only.recode.vcf > common_SNPs_only.recode.vcf The resulting common_SNPs_only.recode.vcf file will be used as reference genotype data for genetic demultiplexing downstream. Downloading genome reference file Finally, we will prepare a reference genome. For our tutorial we will use the GRCh37 10X reference genome. For information regarding references, see the 10X documentation . 
## Copy pre-built reference genome to working directory cp /cvmfs/soft.mugqic/CentOS6/genomes/species/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa ~/ensemblex_pipeline_test/ensemblex_tutorial/reference_files We will use the genome.fa reference genome for genetic demultiplexing downstream. To run the Ensemblex pipeline on the downloaded data please see the Ensemblex with prior genotype information section of the Ensemblex pipeline.","title":"Downloading data"},{"location":"midbrain_download/#data-download","text":"Introduction Downloading and processing scRNAseq data Downloading sample genotype data Downloading reference genotype data Downloading genome reference file","title":"Data Download"},{"location":"midbrain_download/#introduction","text":"For the tutorial, we will leverage a pooled scRNAseq dataset produced by Jerber et al. . This pool contains induced pluripotent cell lines (iPSC) from 9 healthy controls that were differentiated towards a dopaminergic neuron state. 
In this section of the tutorial, we will: Download and process the pooled scRNAseq data with the CellRanger counts pipeline Download and process the sample genotype data Download reference genotype data Download a reference genome file Before we begin, we will create a designated folder for the Ensemblex tutorial: mkdir ensemblex_tutorial cd ensemblex_tutorial","title":"Introduction"},{"location":"midbrain_download/#downloading-and-processing-scrnaseq-data","text":"We will begin by downloading the pooled scRNAseq data from the Sequence Read Archive (SRA): ## Create a folder to place pooled scRNAseq data mkdir pooled_scRNAseq cd pooled_scRNAseq ## Download pooled scRNAseq FASTQ files wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/009/ERR4700019/ERR4700019_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/000/ERR4700020/ERR4700020_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/001/ERR4700021/ERR4700021_2.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_1.fastq.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR470/002/ERR4700022/ERR4700022_2.fastq.gz ## Rename pooled scRNAseq FASTQ files mv ERR4700019_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R1_001.fastq.gz mv ERR4700019_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L001_R2_001.fastq.gz mv ERR4700020_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R1_001.fastq.gz mv ERR4700020_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L002_R2_001.fastq.gz mv ERR4700021_1.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R1_001.fastq.gz mv ERR4700021_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L003_R2_001.fastq.gz mv ERR4700022_1.fastq.gz 
~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R1_001.fastq.gz mv ERR4700022_2.fastq.gz ~/ensemblex_tutorial/pooled_scRNAseq/pool_S1_L004_R2_001.fastq.gz Next, we will process the pooled scRNAseq data with the CellRanger counts pipeline: ## Create CellRanger directory cd ~/ensemblex_tutorial mkdir CellRanger cd CellRanger cellranger count \\ --id=pool \\ --fastqs=/home/fiorini9/scratch/ensemblex_pipeline_test/ensemblex_tutorial/pooled_scRNAseq \\ --sample=pool \\ --transcriptome=~/10xGenomics/refdata-cellranger-GRCh37 If the CellRanger counts pipeline completed successfully, it will have generated the following files that we will use for genetic demultiplexing downstream: possorted_genome_bam.bam possorted_genome_bam.bam.bai barcodes.tsv NOTE : For more information regarding the CellRanger counts pipeline, please see the 10X documentation .","title":"Downloading and processing scRNAseq data"},{"location":"midbrain_download/#downloading-sample-genotype-data","text":"Next, we will download the whole exome .vcf files corresponding to the nine pooled individuals from which the iPSC lines derived. 
We will download the .vcf files from the European Nucleotide Archive (ENA): ## Create a folder to place sample genotype data cd ~/ensemblex_tutorial mkdir sample_genotype cd sample_genotype ## HPSI0115i-hecn_6 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487971/HPSI0115i-hecn_6.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI0214i-pelm_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ122/ERZ122924/HPSI0214i-pelm_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20150415.genotypes.vcf.gz.tbi ## HPSI0314i-sojd_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ266/ERZ266723/HPSI0314i-sojd_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20160122.genotypes.vcf.gz.tbi ## HPSI0414i-sebn_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376769/HPSI0414i-sebn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0514i-uenn_3 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ488/ERZ488039/HPSI0514i-uenn_3.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI0714i-pipw_4 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget 
ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376869/HPSI0714i-pipw_4.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0715i-meue_5 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ376/ERZ376787/HPSI0715i-meue_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20161031.genotypes.vcf.gz.tbi ## HPSI0914i-vaka_5 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487965/HPSI0914i-vaka_5.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi ## HPSI1014i-quls_2 wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz wget ftp://ftp.sra.ebi.ac.uk/vol1/analysis/ERZ487/ERZ487886/HPSI1014i-quls_2.wes.exomeseq.SureSelect_HumanAllExon_v5.mpileup.20170327.genotypes.vcf.gz.tbi Upon downloading the individual genotype data, we will merge the individual files to generate a single .vcf file. 
## Merge .vcf files module load bcftools bcftools merge *.vcf.gz > sample_genotype_merge.vcf The resulting sample_genotype_merge.vcf file will be used as prior genotype information for genetic demultiplexing downstream.","title":"Downloading sample genotype data"},{"location":"midbrain_download/#downloading-reference-genotype-data","text":"Next, we will download a reference genotype file from the 1000 Genomes Project, Phase 3 : ## Create a folder to place the reference files cd ~/ensemblex_tutorial mkdir reference_files cd reference_files ## Download reference .vcf wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz.tbi ## Unzip .vcf file gunzip ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf.gz ## Only keep SNPs module load vcftools vcftools --vcf ALL.wgs.phase3_shapeit2_mvncall_integrated_v5c.20130502.sites.vcf --remove-indels --recode --recode-INFO-all --out SNPs_only ## Only keep common variants module load bcftools bcftools filter -e 'AF<0.01' SNPs_only.recode.vcf > common_SNPs_only.recode.vcf The resulting common_SNPs_only.recode.vcf file will be used as reference genotype data for genetic demultiplexing downstream.","title":"Downloading reference genotype data"},{"location":"midbrain_download/#downloading-genome-reference-file","text":"Finally, we will prepare a reference genome. For our tutorial we will use the GRCh37 10X reference genome. For information regarding references, see the 10X documentation . ## Copy pre-built reference genome to working directory cp /cvmfs/soft.mugqic/CentOS6/genomes/species/Homo_sapiens.GRCh37/genome/10xGenomics/refdata-cellranger-GRCh37/fasta/genome.fa ~/ensemblex_pipeline_test/ensemblex_tutorial/reference_files We will use the genome.fa reference genome for genetic demultiplexing downstream. 
To run the Ensemblex pipeline on the downloaded data, please see the Ensemblex with prior genotype information section of the Ensemblex pipeline.","title":"Downloading genome reference file"},{"location":"outputs/","text":"Ensemblex algorithm outputs Introduction Outputs Merging constituent output files Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Singlet confidence score Introduction After applying the Ensemblex algorithm to the output files of the constituent genetic demultiplexing tools in Step 4, the ~/working_directory/ensemblex folder will have the following structure: working_directory \u2514\u2500\u2500 ensemblex \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u251c\u2500\u2500 step2 \u251c\u2500\u2500 step3 \u2514\u2500\u2500 confidence constituent_tool_merge.csv contains the merged outputs from each constituent genetic demultiplexing tool. step1/ contains the outputs from Step 1: accuracy-weighted probabilistic ensemble. step2/ contains the outputs from Step 2: graph-based doublet detection. step3/ contains the outputs from Step 3: ensemble-independent doublet detection. confidence/ contains the final Ensemblex output file, whose sample labels have been annotated with the Ensemblex singlet confidence score. Note: If users re-run a step of the Ensemblex workflow, the outputs from the previous run will automatically be overwritten. If you do not want to lose the outputs from a previous run, it is important to copy the materials to a separate directory. Outputs Merging constituent output files Ensemblex begins by merging the output files of the constituent genetic demultiplexing tools by cell barcode, which produces the constituent_tool_merge.csv file. 
In this file, each constituent genetic demultiplexing tool has two columns corresponding to its sample labels: demuxalot_assignment demuxalot_best_assignment demuxlet_assignment demuxlet_best_assignment souporcell_assignment souporcell_best_assignment vireo_assignment vireo_best_assignment Taking Vireo as an example, vireo_assignment shows Vireo's sample labels after applying its recommended probability threshold; thus, cells that do not meet Vireo's recommended probability threshold will be labeled as \"unassigned\". In turn, vireo_best_assignment shows Vireo's best guess assignments without applying the recommended probability threshold; thus, cells that do not meet Vireo's recommended probability threshold will still show the best sample label and will not be labelled as \"unassigned\". The constituent_tool_merge.csv file also contains a general_consensus column. These are not Ensemblex's sample labels. The general_consensus column simply shows the sample labels that result from a majority vote classifier; split decisions are labeled as unassigned. Step 1: Accuracy-weighted probabilistic ensemble After running Step 1 of the Ensemblex algorithm, the /PWE folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step1 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2514\u2500\u2500 Step1_cell_assignment.csv Output type Name Description Figure ARI_demultiplexing_tools.pdf Heatmap showing the Adjusted Rand Index (ARI) between the sample labels of the constituent genetic demultiplexing tools. Figure BA_demultiplexing_tools.pdf Barplot showing the estimated balanced accuracy for each constituent genetic demultiplexing tool. File Balanced_accuracy_summary.csv Summary file describing the estimated balanced accuracy computation for each constituent genetic demultiplexing tool. 
File Step1_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 1: accuracy-weighted probabilistic ensemble. The Step1_cell_assignment.csv file contains the following important columns: ensemblex_assignment : Ensemblex sample labels after performing accuracy-weighted probabilistic ensemble. ensemblex_probability : Accuracy-weighted ensemble probability corresponding to Ensemblex's sample labels. NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts. Step 2: Graph-based doublet detection After running Step 2 of the Ensemblex algorithm, the /GBD folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step2 \u251c\u2500\u2500 optimal_nCD.pdf \u251c\u2500\u2500 optimal_pT.pdf \u251c\u2500\u2500 PC1_var_contrib.pdf \u251c\u2500\u2500 PC2_var_contrib.pdf \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA_plot.pdf \u251c\u2500\u2500 PCA_scree_plot.pdf \u2514\u2500\u2500 Step2_cell_assignment.csv Output type Name Description Figure optimal_nCD.pdf Dot plot showing the optimal nCD value. Figure optimal_pT.pdf Dot plot showing the optimal pT value. Figure PC1_var_contrib.pdf Bar plot showing the contribution of each variable to the variation across the first principal component. Figure PC2_var_contrib.pdf Bar plot showing the contribution of each variable to the variation across the second principal component. Figure PCA1_graph_based_doublet_detection.pdf PCA showing Ensemblex sample labels (singlet or doublet) prior to performing graph-based doublet detection. 
Figure PCA2_graph_based_doublet_detection.pdf PCA showing the cells identified as the n most confident doublets in the pool. Figure PCA3_graph_based_doublet_detection.pdf PCA showing Ensemblex sample labels (singlet or doublet) after performing graph-based doublet detection. Figure PCA_plot.pdf PCA of pooled cells. Figure PCA_scree_plot.pdf Bar plot showing the variance explained by each principal component. File Step2_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 2: graph-based doublet detection. The Step2_cell_assignment.csv file contains the following important column: ensemblex_assignment : Ensemblex sample labels after performing graph-based doublet detection. NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts. Step 3: Ensemble-independent doublet detection After running Step 3 of the Ensemblex algorithm, the /EID folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv Output type Name Description Figure Doublet_overlap_no_threshold.pdf Proportion of doublet calls overlapping between constituent genetic demultiplexing tools without applying assignment probability thresholds. Figure Doublet_overlap_threshold.pdf Proportion of doublet calls overlapping between constituent genetic demultiplexing tools after applying assignment probability thresholds. 
Figure Number_ensemblex_doublets_EID_no_threshold.pdf Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, without applying assignment probability thresholds. Figure Number_ensemblex_doublets_EID_threshold.pdf Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, after applying assignment probability thresholds. File Step3_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 3: ensemble-independent doublet detection. The Step3_cell_assignment.csv file contains the following important column: ensemblex_assignment : Ensemblex sample labels after performing ensemble-independent doublet detection. NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts. Singlet confidence score After computing the Ensemblex singlet confidence score, the /confidence folder will contain the following file: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 confidence \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv Output type Name Description File ensemblex_final_cell_assignment.csv Data file containing Ensemblex's final sample labels after computing the singlet confidence score. The ensemblex_final_cell_assignment.csv file contains the following important columns: ensemblex_assignment : Ensemblex sample labels after applying the recommended singlet confidence score threshold; singlets with a confidence score < 1 are labeled as \"unassigned\". 
ensemblex_best_assignment : Ensemblex's best guess assignments without applying the recommended confidence score threshold; singlets with a confidence score < 1 will still show the best sample label and will not be labelled as \"unassigned\". ensemblex_singlet_confidence : Ensemblex singlet confidence score. NOTE : We recommend using the sample labels from ensemblex_assignment for downstream analyses.","title":"Ensemblex outputs"},{"location":"outputs/#ensemblex-algorithm-outputs","text":"Introduction Outputs Merging constituent output files Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Singlet confidence score","title":"Ensemblex algorithm outputs"},{"location":"outputs/#introduction","text":"After applying the Ensemblex algorithm to the output files of the constituent genetic demultiplexing tools in Step 4, the ~/working_directory/ensemblex folder will have the following structure: working_directory \u2514\u2500\u2500 ensemblex \u251c\u2500\u2500 constituent_tool_merge.csv \u251c\u2500\u2500 step1 \u251c\u2500\u2500 step2 \u251c\u2500\u2500 step3 \u2514\u2500\u2500 confidence constituent_tool_merge.csv contains the merged outputs from each constituent genetic demultiplexing tool. step1/ contains the outputs from Step 1: accuracy-weighted probabilistic ensemble. step2/ contains the outputs from Step 2: graph-based doublet detection. step3/ contains the outputs from Step 3: ensemble-independent doublet detection. confidence/ contains the final Ensemblex output file, whose sample labels have been annotated with the Ensemblex singlet confidence score. Note: If users re-run a step of the Ensemblex workflow, the outputs from the previous run will automatically be overwritten. 
If you do not want to lose the outputs from a previous run, it is important to copy the materials to a separate directory.","title":"Introduction"},{"location":"outputs/#outputs","text":"","title":"Outputs"},{"location":"outputs/#merging-constituent-output-files","text":"Ensemblex begins by merging the output files of the constituent genetic demultiplexing tools by cell barcode, which produces the constituent_tool_merge.csv file. In this file, each constituent genetic demultiplexing tool has two columns corresponding to its sample labels: demuxalot_assignment demuxalot_best_assignment demuxlet_assignment demuxlet_best_assignment souporcell_assignment souporcell_best_assignment vireo_assignment vireo_best_assignment Taking Vireo as an example, vireo_assignment shows Vireo's sample labels after applying its recommended probability threshold; thus, cells that do not meet Vireo's recommended probability threshold will be labeled as \"unassigned\". In turn, vireo_best_assignment shows Vireo's best guess assignments without applying the recommended probability threshold; thus, cells that do not meet Vireo's recommended probability threshold will still show the best sample label and will not be labelled as \"unassigned\". The constituent_tool_merge.csv file also contains a general_consensus column. These are not Ensemblex's sample labels. 
The general_consensus column simply shows the sample labels that result from a majority vote classifier; split decisions are labeled as unassigned.","title":"Merging constituent output files"},{"location":"outputs/#step-1-accuracy-weighted-probabilistic-ensemble","text":"After running Step 1 of the Ensemblex algorithm, the /PWE folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step1 \u251c\u2500\u2500 ARI_demultiplexing_tools.pdf \u251c\u2500\u2500 BA_demultiplexing_tools.pdf \u251c\u2500\u2500 Balanced_accuracy_summary.csv \u2514\u2500\u2500 Step1_cell_assignment.csv Output type Name Description Figure ARI_demultiplexing_tools.pdf Heatmap showing the Adjusted Rand Index (ARI) between the sample labels of the constituent genetic demultiplexing tools. Figure BA_demultiplexing_tools.pdf Barplot showing the estimated balanced accuracy for each constituent genetic demultiplexing tool. File Balanced_accuracy_summary.csv Summary file describing the estimated balanced accuracy computation for each constituent genetic demultiplexing tool. File Step1_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 1: accuracy-weighted probabilistic ensemble. The Step1_cell_assignment.csv file contains the following important columns: ensemblex_assignment : Ensemblex sample labels after performing accuracy-weighted probabilistic ensemble. ensemblex_probability : Accuracy-weighted ensemble probability corresponding to Ensemblex's sample labels. 
NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.","title":"Step 1: Accuracy-weighted probabilistic ensemble"},{"location":"outputs/#step-2-graph-based-doublet-detection","text":"After running Step 2 of the Ensemblex algorithm, the /GBD folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step2 \u251c\u2500\u2500 optimal_nCD.pdf \u251c\u2500\u2500 optimal_pT.pdf \u251c\u2500\u2500 PC1_var_contrib.pdf \u251c\u2500\u2500 PC2_var_contrib.pdf \u251c\u2500\u2500 PCA1_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA2_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA3_graph_based_doublet_detection.pdf \u251c\u2500\u2500 PCA_plot.pdf \u251c\u2500\u2500 PCA_scree_plot.pdf \u2514\u2500\u2500 Step2_cell_assignment.csv Output type Name Description Figure optimal_nCD.pdf Dot plot showing the optimal nCD value. Figure optimal_pT.pdf Dot plot showing the optimal pT value. Figure PC1_var_contrib.pdf Bar plot showing the contribution of each variable to the variation across the first principal component. Figure PC2_var_contrib.pdf Bar plot showing the contribution of each variable to the variation across the second principal component. Figure PCA1_graph_based_doublet_detection.pdf PCA showing Ensemblex sample labels (singlet or doublet) prior to performing graph-based doublet detection. Figure PCA2_graph_based_doublet_detection.pdf PCA showing the cells identified as the n most confident doublets in the pool. Figure PCA3_graph_based_doublet_detection.pdf PCA showing Ensemblex sample labels (singlet or doublet) after performing graph-based doublet detection. Figure PCA_plot.pdf PCA of pooled cells. Figure PCA_scree_plot.pdf Bar plot showing the variance explained by each principal component. 
File Step2_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 2: graph-based doublet detection. The Step2_cell_assignment.csv file contains the following important column: ensemblex_assignment : Ensemblex sample labels after performing graph-based doublet detection. NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.","title":"Step 2: Graph-based doublet detection"},{"location":"outputs/#step-3-ensemble-independent-doublet-detection","text":"After running Step 3 of the Ensemblex algorithm, the /EID folder will contain the following files: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 step3 \u251c\u2500\u2500 Doublet_overlap_no_threshold.pdf \u251c\u2500\u2500 Doublet_overlap_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_no_threshold.pdf \u251c\u2500\u2500 Number_ensemblex_doublets_EID_threshold.pdf \u2514\u2500\u2500 Step3_cell_assignment.csv Output type Name Description Figure Doublet_overlap_no_threshold.pdf Proportion of doublet calls overlapping between constituent genetic demultiplexing tools without applying assignment probability thresholds. Figure Doublet_overlap_threshold.pdf Proportion of doublet calls overlapping between constituent genetic demultiplexing tools after applying assignment probability thresholds. Figure Number_ensemblex_doublets_EID_no_threshold.pdf Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, without applying assignment probability thresholds. 
Figure Number_ensemblex_doublets_EID_threshold.pdf Number of cells that would be labelled as doublets by Ensemblex if a constituent tool was nominated for ensemble-independent doublet detection, after applying assignment probability thresholds. File Step3_cell_assignment.csv Data file containing Ensemblex's sample labels after Step 3: ensemble-independent doublet detection. The Step3_cell_assignment.csv file contains the following important column: ensemblex_assignment : Ensemblex sample labels after performing ensemble-independent doublet detection. NOTE : Prior to using Ensemblex's sample labels for downstream analyses, we recommend computing the Ensemblex singlet confidence score to identify low-confidence singlet assignments that should be removed from the dataset to mitigate the introduction of technical artifacts.","title":"Step 3: Ensemble-independent doublet detection"},{"location":"outputs/#singlet-confidence-score","text":"After computing the Ensemblex singlet confidence score, the /confidence folder will contain the following file: working_directory \u2514\u2500\u2500 ensemblex \u2514\u2500\u2500 confidence \u2514\u2500\u2500 ensemblex_final_cell_assignment.csv Output type Name Description File ensemblex_final_cell_assignment.csv Data file containing Ensemblex's final sample labels after computing the singlet confidence score. The ensemblex_final_cell_assignment.csv file contains the following important columns: ensemblex_assignment : Ensemblex sample labels after applying the recommended singlet confidence score threshold; singlets with a confidence score < 1 are labeled as \"unassigned\". ensemblex_best_assignment : Ensemblex's best guess assignments without applying the recommended confidence score threshold; singlets with a confidence score < 1 will still show the best sample label and will not be labelled as \"unassigned\". ensemblex_singlet_confidence : Ensemblex singlet confidence score. 
NOTE : We recommend using the sample labels from ensemblex_assignment for downstream analyses.","title":"Singlet confidence score"},{"location":"overview/","text":"Ensemblex algorithm overview Workflow Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Contribution of each step to overall demultiplexing accuracy Workflow The Ensemblex workflow begins by demultiplexing pooled cells with each of its constituent tools: Demuxalot, Demuxlet, Souporcell and Vireo-GT if using prior genotype information, or Demuxalot, Freemuxlet, Souporcell and Vireo if prior genotype information is not available. Figure 1. Input into the Ensemblex framework. The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework. Upon demultiplexing pools with each individual constituent genetic demultiplexing tool, Ensemblex processes the outputs in a three-step pipeline: Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Figure 2. Overview of the three-step Ensemblex framework. The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection. For demonstration purposes throughout this section, we leveraged simulated pools with known ground-truth sample labels that were generated with 80 independently sequenced induced pluripotent stem cell (iPSC) lines from individuals with Parkinson's disease and neurologically healthy controls. The lines were differentiated towards a dopaminergic cell fate as part of the Foundational Data Initiative for Parkinson's disease (FOUNDIN-PD; Bressan et al. 
) Step 1: Accuracy-weighted probabilistic ensemble The accuracy-weighted probabilistic ensemble component of the Ensemblex framework utilizes an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool\u2019s assignment probability distribution by its estimated balanced accuracy for the dataset in a framework that was largely inspired by the work of Large et al. To estimate the balanced accuracy of a particular constituent tool (e.g. Demuxalot) for real-world datasets lacking ground-truth labels, Ensemblex leverages the cells with a consensus assignment across the three remaining tools (e.g. Demuxlet, Souporcell, and Vireo-GT) as a proxy for ground-truth. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell. Figure 3. Graphical representation of the accuracy-weighted probabilistic ensemble component of the Ensemblex framework. Step 2: Graph-based doublet detection The graph-based doublet detection component of the Ensemblex framework was implemented to identify doublets that are incorrectly labeled as singlets by the accuracy-weighted probabilistic ensemble component (Step 1). To demonstrate Step 2 of the Ensemblex framework we leveraged a simulated pool comprising 24 pooled samples, 17,384 cells, and a 15% doublet rate. Figure 4. Graphical representation of the graph-based doublet detection component of the Ensemblex framework. The graph-based doublet detection component begins by leveraging select variables returned from each constituent tool: Demuxalot: doublet probability; Demuxlet/Freemuxlet: singlet log likelihood \u2013 doublet log likelihood; Demuxlet/Freemuxlet: number of single nucleotide polymorphisms (SNP) per cell; Demuxlet/Freemuxlet: number of reads per cell; Souporcell: doublet log probability; Vireo: doublet probability; Vireo: doublet log likelihood ratio Figure 5. 
Select variables returned by the constituent genetic demultiplexing tools used for graph-based doublet detection. Using these variables, Ensemblex screens each pooled cell to identify the n most confident doublets in the pool and performs a principal component analysis (PCA). Figure 6. PCA of pooled cells using select variables returned by the constituent genetic demultiplexing tools. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting the n most confident doublets identified by Ensemblex. The PCA embedding is then converted into a Euclidean distance matrix and each cell is assigned a percentile rank based on its distance to each confident doublet. After performing an automated parameter sweep, Ensemblex identifies the droplets that appear most frequently amongst the nearest neighbours of confident doublets as doublets. Figure 7. PCA of pooled cells labeled according to Ensemblex labels prior to and after graph-based doublet detection. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting Ensemblex's labels prior to graph-based doublet detection. C) PCA highlighting Ensemblex's labels after graph-based doublet detection. Step 3: Ensemble-independent doublet detection The ensemble-independent doublet detection component of the Ensemblex framework was implemented to further improve Ensemblex's ability to identify doublets. Benchmarking on simulated pools with known ground-truth sample labels revealed that certain genetic demultiplexing tools, namely Demuxalot and Vireo, showed high doublet detection specificity. Figure 8. Constituent genetic demultiplexing tool doublet specificity on computationally multiplexed pools with ground truth sample labels. Doublet specificity was evaluated on pools ranging in size from 4 to 80 multiplexed samples. However, Steps 1 and 2 of the Ensemblex workflow failed to correctly label a subset of doublet calls by these tools. 
To mitigate this issue and maximize the rate of doublet identification, Ensemblex labels the cells that are identified as doublets by Vireo or Demuxalot as doublets, by default; however, users can nominate different tools for the ensemble-independent doublet detection component depending on the desired doublet detection stringency. Figure 9. Graphical representation of the ensemble-independent doublet detection component of the Ensemblex framework. Contribution of each step to overall demultiplexing accuracy We sequentially applied each step of the Ensemblex framework to 96 computationally multiplexed pools with known ground truth sample labels ranging in size from 4 to 80 samples. The proportion of correctly classified singlets and doublets identified by Ensemblex after each step of the framework is shown in Figure 10. Figure 10. Contribution of each component of the Ensemblex framework to demultiplexing accuracy. The average proportion of correctly classified A) singlets and B) doublets across replicates at a given pool size is shown after sequentially applying each step of the Ensemblex framework. The right panels show the average proportion of correct classifications across all 96 pools. The blue points show the proportion of cells that were correctly classified by at least one tool: Demuxalot, Demuxlet, Souporcell, or Vireo. 
For detailed methodology please see our pre-print manuscript .","title":"Ensemblex algorithm overview"},{"location":"overview/#ensemblex-algorithm-overview","text":"Workflow Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Contribution of each step to overall demultiplexing accuracy","title":"Ensemblex algorithm overview"},{"location":"overview/#workflow","text":"The Ensemblex workflow begins by demultiplexing pooled cells with each of its constituent tools: Demuxalot, Demuxlet, Souporcell and Vireo-GT if using prior genotype information or Demuxalot, Freemuxlet, Souporcell and Vireo if prior genotype information is not available. Figure 1. Input into the Ensemblex framework. The Ensemblex workflow begins with demultiplexing pooled samples by each of the constituent tools. The outputs from each individual demultiplexing tool are then used as input into the Ensemblex framework. Upon demultiplexing pools with each individual constituent genetic demultiplexing tool, Ensemblex processes the outputs in a three-step pipeline: Step 1: Accuracy-weighted probabilistic ensemble Step 2: Graph-based doublet detection Step 3: Ensemble-independent doublet detection Figure 2. Overview of the three-step Ensemblex framework. The Ensemblex framework comprises three distinct steps that are assembled into a pipeline: 1) accuracy-weighted probabilistic ensemble, 2) graph-based doublet detection, and 3) ensemble-independent doublet detection. For demonstration purposes throughout this section, we leveraged simulated pools with known ground-truth sample labels that were generated with 80 independently sequenced induced pluripotent stem cell (iPSC) lines from individuals with Parkinson's disease and neurologically healthy controls. The lines were differentiated towards a dopaminergic cell fate as part of the Foundational Data Initiative for Parkinson's disease (FOUNDIN-PD; Bressan et al.
)","title":"Workflow"},{"location":"overview/#step-1-accuracy-weighted-probabilistic-ensemble","text":"The accuracy-weighted probabilistic ensemble component of the Ensemblex framework utilizes an unsupervised weighting model to identify the most probable sample label for each cell. Ensemblex weighs each constituent tool\u2019s assignment probability distribution by its estimated balanced accuracy for the dataset in a framework that was largely inspired by the work of Large et al. . To estimate the balanced accuracy of a particular constituent tool (e.g. Demuxalot) for real-world datasets lacking ground-truth labels, Ensemblex leverages the cells with a consensus assignment across the three remaining tools (e.g. Demuxlet, Souporcell, and Vireo-GT) as a proxy for ground-truth. The weighted assignment probabilities across all four constituent tools are then used to inform the most probable sample label for each cell. Figure 3. Graphical representation of the accuracy-weighted probabilistic ensemble component of the Ensemblex framework.","title":"Step 1: Accuracy-weighted probabilistic ensemble"},{"location":"overview/#step-2-graph-based-doublet-detection","text":"The graph-based doublet detection component of the Ensemblex framework was implemented to identify doublets that are incorrectly labeled as singlets by the accuracy-weighted probabilistic ensemble component (Step 1). To demonstrate Step 2 of the Ensemblex framework we leveraged a simulated pool comprising 24 pooled samples, 17,384 cells, and a 15% doublet rate. Figure 4. Graphical representation of the graph-based doublet detection component of the Ensemblex framework.
The graph-based doublet detection component begins by leveraging select variables returned from each constituent tool: Demuxalot: doublet probability; Demuxlet/Freemuxlet: singlet log likelihood \u2013 doublet log likelihood; Demuxlet/Freemuxlet: number of single nucleotide polymorphisms (SNP) per cell; Demuxlet/Freemuxlet: number of reads per cell; Souporcell: doublet log probability; Vireo: doublet probability; Vireo: doublet log likelihood ratio Figure 5. Select variables returned by the constituent genetic demultiplexing tools used for graph-based doublet detection. Using these variables, Ensemblex screens each pooled cell to identify the n most confident doublets in the pool and performs a principal component analysis (PCA). Figure 6. PCA of pooled cells using select variables returned by the constituent genetic demultiplexing tools. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting the n most confident doublets identified by Ensemblex. The PCA embedding is then converted into a Euclidean distance matrix and each cell is assigned a percentile rank based on its distance to each confident doublet. After performing an automated parameter sweep, Ensemblex identifies the droplets that appear most frequently amongst the nearest neighbours of confident doublets as doublets. Figure 7. PCA of pooled cells labeled according to Ensemblex labels prior to and after graph-based doublet detection. A) PCA highlighting ground truth cell labels: singlet or doublet. B) PCA highlighting Ensemblex's labels prior to graph-based doublet detection. C) PCA highlighting Ensemblex's labels after graph-based doublet detection.","title":"Step 2: Graph-based doublet detection"},{"location":"overview/#step-3-ensemble-independent-doublet-detection","text":"The ensemble-independent doublet detection component of the Ensemblex framework was implemented to further improve Ensemblex's ability to identify doublets.
Benchmarking on simulated pools with known ground-truth sample labels revealed that certain genetic demultiplexing tools, namely Demuxalot and Vireo, showed high doublet detection specificity. Figure 8. Constituent genetic demultiplexing tool doublet specificity on computationally multiplexed pools with ground truth sample labels. Doublet specificity was evaluated on pools ranging in size from 4 to 80 multiplexed samples. However, Steps 1 and 2 of the Ensemblex workflow failed to correctly label a subset of doublet calls by these tools. To mitigate this issue and maximize the rate of doublet identification, Ensemblex labels the cells that are identified as doublets by Vireo or Demuxalot as doublets, by default; however, users can nominate different tools for the ensemble-independent doublet detection component depending on the desired doublet detection stringency. Figure 9. Graphical representation of the ensemble-independent doublet detection component of the Ensemblex framework.","title":"Step 3: Ensemble-independent doublet detection"},{"location":"overview/#contribution-of-each-step-to-overall-demultiplexing-accuracy","text":"We sequentially applied each step of the Ensemblex framework to 96 computationally multiplexed pools with known ground truth sample labels ranging in size from 4 to 80 samples. The proportion of correctly classified singlets and doublets identified by Ensemblex after each step of the framework is shown in Figure 10. Figure 10. Contribution of each component of the Ensemblex framework to demultiplexing accuracy. The average proportion of correctly classified A) singlets and B) doublets across replicates at a given pool size is shown after sequentially applying each step of the Ensemblex framework. The right panels show the average proportion of correct classifications across all 96 pools. The blue points show the proportion of cells that were correctly classified by at least one tool: Demuxalot, Demuxlet, Souporcell, or Vireo. 
For detailed methodology please see our pre-print manuscript .","title":"Contribution of each step to overall demultiplexing accuracy"},{"location":"overview_pipeline/","text":"Ensemblex pipeline overview The Ensemblex pipeline was developed to facilitate the application of each of Ensemblex's constituent demultiplexing tools and seamlessly integrate the output files into the Ensemblex framework. We provide two distinct, yet highly comparable pipelines: Demultiplexing with prior genotype information Demultiplexing without prior genotype information The pipelines comprise four distinct steps: Selection of Ensemblex pipeline and establishing the working directory (Set up) Prepare input files for constituent genetic demultiplexing tools Genetic demultiplexing by constituent demultiplexing tools Application of the Ensemblex framework Each step of the pipeline is comprehensively described in the following sections of the Ensemblex documentation.","title":"Ensemblex pipeline overview"},{"location":"overview_pipeline/#ensemblex-pipeline-overview","text":"The Ensemblex pipeline was developed to facilitate the application of each of Ensemblex's constituent demultiplexing tools and seamlessly integrate the output files into the Ensemblex framework.
We provide two distinct, yet highly comparable pipelines: Demultiplexing with prior genotype information Demultiplexing without prior genotype information The pipelines comprise four distinct steps: Selection of Ensemblex pipeline and establishing the working directory (Set up) Prepare input files for constituent genetic demultiplexing tools Genetic demultiplexing by constituent demultiplexing tools Application of the Ensemblex framework Each step of the pipeline is comprehensively described in the following sections of the Ensemblex documentation.","title":"Ensemblex pipeline overview"},{"location":"reference/","text":"Adjustable execution parameters for the Ensemblex pipeline Introduction How to modify the parameter files Constituent genetic demultiplexing tools with prior genotype information Demuxalot Demuxlet Souporcell Vireo Constituent genetic demultiplexing tools without prior genotype information Demuxalot Freemuxlet Souporcell Vireo Ensemblex algorithm Introduction Prior to running the Ensemblex pipeline, users should modify the execution parameters for the constituent genetic demultiplexing tools and the Ensemblex algorithm. Upon running Step 1: Set up , a /job_info folder will be created in the working directory. Within the /job_info folder is a /configs folder which contains the ensemblex_config.ini ; this .ini file contains all of the adjustable parameters for the Ensemblex pipeline. working_directory \u2514\u2500\u2500 job_info \u251c\u2500\u2500 configs \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u251c\u2500\u2500 logs \u2514\u2500\u2500 summary_report.txt To ensure replicability, the execution parameters are documented in ~/working_directory/job_info/summary_report.txt . How to modify the parameter files The following section illustrates how to modify the ensemblex_config.ini parameter file directly from the terminal.
To begin, navigate to the /configs folder and view its contents: cd ~/working_directory/job_info/configs ls The following file will be available: ensemblex_config.ini To modify the ensemblex_config.ini parameter file directly in the terminal we will use Nano : nano ensemblex_config.ini This will open ensemblex_config.ini in the terminal and allow users to modify the parameters. To save the modifications and exit the parameter file, type ctrl+o followed by ctrl+x . Constituent genetic demultiplexing tools with prior genotype information Demuxalot The following parameters are adjustable for Demuxalot: Parameter Default Description PAR_demuxalot_genotype_names NULL List of Sample ID's in the sample VCF file (e.g., 'Sample_1,Sample_2,Sample_3'). PAR_demuxalot_minimum_coverage 200 Minimum read coverage. PAR_demuxalot_minimum_alternative_coverage 10 Minimum alternative read coverage. PAR_demuxalot_n_best_snps_per_donor 100 Number of best snps for each donor to use for demultiplexing. PAR_demuxalot_genotypes_prior_strength 1 Genotype prior strength. PAR_demuxalot_doublet_prior 0.25 Doublet prior strength. Demuxlet The following parameters are adjustable for Demuxlet: Parameter Default Description PAR_demuxlet_field GT Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file. NOTE : We are currently working on expanding the execution parameters for Demuxlet. Vireo The following parameters are adjustable for Vireo: Parameter Default Description PAR_vireo_N NULL Number of pooled samples. PAR_vireo_type GT Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file. PAR_vireo_processes 20 Number of subprocesses for computing. PAR_vireo_minMAF 0.1 Minimum minor allele frequency. PAR_vireo_minCOUNT 20 Minimum aggregated count. PAR_vireo_forcelearnGT T Whether or not to treat donor GT as prior only. 
NOTE : We are currently working on expanding the execution parameters for Vireo. Souporcell The following parameters are adjustable for Souporcell: Parameter Default Description PAR_minimap2 -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no For information regarding the minimap2 parameters, please see the documentation . PAR_freebayes -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 For information regarding the freebayes parameters, please see the documentation . PAR_vartrix_umi TRUE Whether or not to consider UMI information when populating coverage matrices. PAR_vartrix_mapq 30 Minimum read mapping quality. PAR_vartrix_threads 8 Number of threads for computing. PAR_souporcell_k NULL Number of pooled samples. PAR_souporcell_t 8 Number of threads for computing. NOTE : We are currently working on expanding the execution parameters for Souporcell. Constituent genetic demultiplexing tools without prior genotype information Demuxalot The following parameters are adjustable for Demuxalot: Parameter Default Description PAR_demuxalot_genotype_names NULL List of Sample ID's in the sample VCF file generated by Freemuxlet: outs.clust1.vcf (e.g., 'CLUST0,CLUST1,CLUST2'). PAR_demuxalot_minimum_coverage 200 Minimum read coverage. PAR_demuxalot_minimum_alternative_coverage 10 Minimum alternative read coverage. PAR_demuxalot_n_best_snps_per_donor 100 Number of best snps for each donor to use for demultiplexing. PAR_demuxalot_genotypes_prior_strength 1 Genotype prior strength. PAR_demuxalot_doublet_prior 0.25 Doublet prior strength. Freemuxlet The following parameters are adjustable for Freemuxlet: Parameter Default Description PAR_freemuxlet_nsample NULL Number of pooled samples. NOTE : We are currently working on expanding the execution parameters for Freemuxlet. Vireo The following parameters are adjustable for Vireo: Parameter Default Description PAR_vireo_N NULL Number of pooled samples.
PAR_vireo_processes 20 Number of subprocesses for computing. PAR_vireo_minMAF 0.1 Minimum minor allele frequency. PAR_vireo_minCOUNT 20 Minimum aggregated count. NOTE : We are currently working on expanding the execution parameters for Vireo. Souporcell The following parameters are adjustable for Souporcell: Parameter Default Description PAR_minimap2 -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no For information regarding the minimap2 parameters, please see the documentation . PAR_freebayes -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 For information regarding the freebayes parameters, please see the documentation . PAR_vartrix_umi TRUE Whether or not to consider UMI information when populating coverage matrices. PAR_vartrix_mapq 30 Minimum read mapping quality. PAR_vartrix_threads 8 Number of threads for computing. PAR_souporcell_k NULL Number of pooled samples. PAR_souporcell_t 8 Number of threads for computing. NOTE : We are currently working on expanding the execution parameters for Souporcell. Ensemblex The following parameters are adjustable for the Ensemblex algorithm: Parameter Default Description Pool parameters PAR_ensemblex_sample_size NULL Number of samples multiplexed in the pool. PAR_ensemblex_expected_doublet_rate NULL Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation . Set up parameters PAR_ensemblex_merge_constituents Yes Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated merged file.
Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated Step 1 output file. Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively. PAR_ensemblex_nCD NULL Manually defined number of confident doublets in the pool (nCD). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_pT NULL Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_graph_based_doublet_detection Yes Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values. Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection.
Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters. PAR_ensemblex_ensemble_independent_doublet Yes Whether or not to perform Step 3: Ensemble-independent doublet detection. PAR_ensemblex_doublet_Demuxalot_threshold Yes Whether or not to label doublets identified by Demuxalot as doublets. Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxalot_no_threshold No Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Demuxlet_threshold No Whether or not to label doublets identified by Demuxlet as doublets. Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxlet_no_threshold No Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Souporcell_threshold No Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Souporcell_no_threshold No Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Vireo_threshold Yes Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex. 
PAR_ensemblex_doublet_Vireo_no_threshold No Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability. Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses.","title":"Execution parameters"},{"location":"reference/#adjustable-execution-parameters-for-the-ensemblex-pipeline","text":"Introduction How to modify the parameter files Constituent genetic demultiplexing tools with prior genotype information Demuxalot Demuxlet Souporcell Vireo Constituent genetic demultiplexing tools without prior genotype information Demuxalot Freemuxlet Souporcell Vireo Ensemblex algorithm","title":"Adjustable execution parameters for the Ensemblex pipeline"},{"location":"reference/#introduction","text":"Prior to running the Ensemblex pipeline, users should modify the execution parameters for the constituent genetic demultiplexing tools and the Ensemblex algorithm. Upon running Step 1: Set up , a /job_info folder will be created in the working directory. Within the /job_info folder is a /configs folder which contains the ensemblex_config.ini ; this .ini file contains all of the adjustable parameters for the Ensemblex pipeline. working_directory \u2514\u2500\u2500 job_info \u251c\u2500\u2500 configs \u2502 \u2514\u2500\u2500 ensemblex_config.ini \u251c\u2500\u2500 logs \u2514\u2500\u2500 summary_report.txt To ensure replicability, the execution parameters are documented in ~/working_directory/job_info/summary_report.txt .","title":"Introduction"},{"location":"reference/#how-to-modify-the-parameter-files","text":"The following section illustrates how to modify the ensemblex_config.ini parameter file directly from the terminal.
To begin, navigate to the /configs folder and view its contents: cd ~/working_directory/job_info/configs ls The following file will be available: ensemblex_config.ini To modify the ensemblex_config.ini parameter file directly in the terminal we will use Nano : nano ensemblex_config.ini This will open ensemblex_config.ini in the terminal and allow users to modify the parameters. To save the modifications and exit the parameter file, type ctrl+o followed by ctrl+x .","title":"How to modify the parameter files"},{"location":"reference/#constituent-genetic-demultiplexing-tools-with-prior-genotype-information","text":"","title":"Constituent genetic demultiplexing tools with prior genotype information"},{"location":"reference/#demuxalot","text":"The following parameters are adjustable for Demuxalot: Parameter Default Description PAR_demuxalot_genotype_names NULL List of Sample ID's in the sample VCF file (e.g., 'Sample_1,Sample_2,Sample_3'). PAR_demuxalot_minimum_coverage 200 Minimum read coverage. PAR_demuxalot_minimum_alternative_coverage 10 Minimum alternative read coverage. PAR_demuxalot_n_best_snps_per_donor 100 Number of best snps for each donor to use for demultiplexing. PAR_demuxalot_genotypes_prior_strength 1 Genotype prior strength. PAR_demuxalot_doublet_prior 0.25 Doublet prior strength.","title":"Demuxalot"},{"location":"reference/#demuxlet","text":"The following parameters are adjustable for Demuxlet: Parameter Default Description PAR_demuxlet_field GT Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file. NOTE : We are currently working on expanding the execution parameters for Demuxlet.","title":"Demuxlet"},{"location":"reference/#vireo","text":"The following parameters are adjustable for Vireo: Parameter Default Description PAR_vireo_N NULL Number of pooled samples. 
PAR_vireo_type GT Field to extract the genotypes (GT), genotype likelihood (PL), or posterior probability (GP) from the sample .vcf file. PAR_vireo_processes 20 Number of subprocesses for computing. PAR_vireo_minMAF 0.1 Minimum minor allele frequency. PAR_vireo_minCOUNT 20 Minimum aggregated count. PAR_vireo_forcelearnGT T Whether or not to treat donor GT as prior only. NOTE : We are currently working on expanding the execution parameters for Vireo.","title":"Vireo"},{"location":"reference/#souporcell","text":"The following parameters are adjustable for Souporcell: Parameter Default Description PAR_minimap2 -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no For information regarding the minimap2 parameters, please see the documentation . PAR_freebayes -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 For information regarding the freebayes parameters, please see the documentation . PAR_vartrix_umi TRUE Whether or not to consider UMI information when populating coverage matrices. PAR_vartrix_mapq 30 Minimum read mapping quality. PAR_vartrix_threads 8 Number of threads for computing. PAR_souporcell_k NULL Number of pooled samples. PAR_souporcell_t 8 Number of threads for computing. NOTE : We are currently working on expanding the execution parameters for Souporcell.","title":"Souporcell"},{"location":"reference/#constituent-genetic-demultiplexing-tools-without-prior-genotype-information","text":"","title":"Constituent genetic demultiplexing tools without prior genotype information"},{"location":"reference/#demuxalot_1","text":"The following parameters are adjustable for Demuxalot: Parameter Default Description PAR_demuxalot_genotype_names NULL List of Sample ID's in the sample VCF file generated by Freemuxlet: outs.clust1.vcf (e.g., 'CLUST0,CLUST1,CLUST2'). PAR_demuxalot_minimum_coverage 200 Minimum read coverage.
PAR_demuxalot_minimum_alternative_coverage 10 Minimum alternative read coverage. PAR_demuxalot_n_best_snps_per_donor 100 Number of best snps for each donor to use for demultiplexing. PAR_demuxalot_genotypes_prior_strength 1 Genotype prior strength. PAR_demuxalot_doublet_prior 0.25 Doublet prior strength.","title":"Demuxalot"},{"location":"reference/#freemuxlet","text":"The following parameters are adjustable for Freemuxlet: Parameter Default Description PAR_freemuxlet_nsample NULL Number of pooled samples. NOTE : We are currently working on expanding the execution parameters for Freemuxlet.","title":"Freemuxlet"},{"location":"reference/#vireo_1","text":"The following parameters are adjustable for Vireo: Parameter Default Description PAR_vireo_N NULL Number of pooled samples. PAR_vireo_processes 20 Number of subprocesses for computing. PAR_vireo_minMAF 0.1 Minimum minor allele frequency. PAR_vireo_minCOUNT 20 Minimum aggregated count. NOTE : We are currently working on expanding the execution parameters for Vireo.","title":"Vireo"},{"location":"reference/#souporcell_1","text":"The following parameters are adjustable for Souporcell: Parameter Default Description PAR_minimap2 -ax splice -t 8 -G50k -k 21 -w 11 --sr -A2 -B8 -O12,32 -E2,1 -r200 -p.5 -N20 -f1000,5000 -n2 -m20 -s40 -g2000 -2K50m --secondary=no For information regarding the minimap2 parameters, please see the documentation . PAR_freebayes -iXu -C 2 -q 20 -n 3 -E 1 -m 30 --min-coverage 6 For information regarding the freebayes parameters, please see the documentation . PAR_vartrix_umi TRUE Whether or not to consider UMI information when populating coverage matrices. PAR_vartrix_mapq 30 Minimum read mapping quality. PAR_vartrix_threads 8 Number of threads for computing. PAR_souporcell_k NULL Number of pooled samples. PAR_souporcell_t 8 Number of threads for computing.
NOTE : We are currently working on expanding the execution parameters for Souporcell.","title":"Souporcell"},{"location":"reference/#ensemblex","text":"The following parameters are adjustable for the Ensemblex algorithm: Parameter Default Description Pool parameters PAR_ensemblex_sample_size NULL Number of samples multiplexed in the pool. PAR_ensemblex_expected_doublet_rate NULL Expected doublet rate for the pool. If using 10X Genomics, the expected doublet rate can be estimated based on the number of recovered cells. For more information see 10X Genomics Documentation . Set up parameters PAR_ensemblex_merge_constituents Yes Whether or not to merge the output files of the constituent demultiplexing tools. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated merged file. Step 1 parameters: Probabilistic-weighted ensemble PAR_ensemblex_probabilistic_weighted_ensemble Yes Whether or not to perform Step 1: Probabilistic-weighted ensemble. If running Ensemblex on a pool for the first time, this parameter should be set to \"Yes\". Subsequent runs of Ensemblex (e.g., parameter optimization) can have this parameter set to \"No\" as the pipeline will automatically detect the previously generated Step 1 output file. Step 2 parameters: Graph-based doublet detection PAR_ensemblex_preliminary_parameter_sweep No Whether or not to perform a preliminary parameter sweep for Step 2: Graph-based doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define the number of confident doublets in the pool (nCD) and the percentile threshold of the nearest neighbour frequency (pT), which can be defined in the following two parameters, respectively. PAR_ensemblex_nCD NULL Manually defined number of confident doublets in the pool (nCD).
Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_pT NULL Manually defined percentile threshold of the nearest neighbour frequency (pT). Value can be informed by the output files generated by setting PAR_ensemblex_preliminary_parameter_sweep to \"Yes\". PAR_ensemblex_graph_based_doublet_detection Yes Whether or not to perform Step 2: Graph-based doublet detection. If PAR_ensemblex_nCD and PAR_ensemblex_pT are not defined by the user (NULL), Ensemblex will automatically determine the optimal parameter values using an unsupervised parameter sweep. If PAR_ensemblex_nCD and PAR_ensemblex_pT are defined by the user, graph-based doublet detection will be performed with the user-defined values. Step 3 parameters: Ensemble-independent doublet detection PAR_ensemblex_preliminary_ensemble_independent_doublet No Whether or not to perform a preliminary parameter sweep for Step 3: Ensemble-independent doublet detection. Users should utilize the preliminary parameter sweep if they wish to manually define which constituent tools to utilize for ensemble-independent doublet detection. Users can define which tools to utilize for ensemble-independent doublet detection in the following parameters. PAR_ensemblex_ensemble_independent_doublet Yes Whether or not to perform Step 3: Ensemble-independent doublet detection. PAR_ensemblex_doublet_Demuxalot_threshold Yes Whether or not to label doublets identified by Demuxalot as doublets. Only doublets with assignment probabilities exceeding Demuxalot's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxalot_no_threshold No Whether or not to label doublets identified by Demuxalot as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Demuxlet_threshold No Whether or not to label doublets identified by Demuxlet as doublets.
Only doublets with assignment probabilities exceeding Demuxlet's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Demuxlet_no_threshold No Whether or not to label doublets identified by Demuxlet as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Souporcell_threshold No Whether or not to label doublets identified by Souporcell as doublets. Only doublets with assignment probabilities exceeding Souporcell's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Souporcell_no_threshold No Whether or not to label doublets identified by Souporcell as doublets, regardless of the corresponding assignment probability. PAR_ensemblex_doublet_Vireo_threshold Yes Whether or not to label doublets identified by Vireo as doublets. Only doublets with assignment probabilities exceeding Vireo's recommended probability threshold will be labeled as doublets by Ensemblex. PAR_ensemblex_doublet_Vireo_no_threshold No Whether or not to label doublets identified by Vireo as doublets, regardless of the corresponding assignment probability. Confidence score parameters PAR_ensemblex_compute_singlet_confidence Yes Whether or not to compute Ensemblex's singlet confidence score. This will define low confidence assignments which should be removed from downstream analyses.","title":"Ensemblex"}]} \ No newline at end of file diff --git a/site/search/worker.js b/site/search/worker.js new file mode 100644 index 0000000..8628dbc --- /dev/null +++ b/site/search/worker.js @@ -0,0 +1,133 @@ +var base_path = 'function' === typeof importScripts ? '.' 
: '/search/'; +var allowSearch = false; +var index; +var documents = {}; +var lang = ['en']; +var data; + +function getScript(script, callback) { + console.log('Loading script: ' + script); + $.getScript(base_path + script).done(function () { + callback(); + }).fail(function (jqxhr, settings, exception) { + console.log('Error: ' + exception); + }); +} + +function getScriptsInOrder(scripts, callback) { + if (scripts.length === 0) { + callback(); + return; + } + getScript(scripts[0], function() { + getScriptsInOrder(scripts.slice(1), callback); + }); +} + +function loadScripts(urls, callback) { + if( 'function' === typeof importScripts ) { + importScripts.apply(null, urls); + callback(); + } else { + getScriptsInOrder(urls, callback); + } +} + +function onJSONLoaded () { + data = JSON.parse(this.responseText); + var scriptsToLoad = ['lunr.js']; + if (data.config && data.config.lang && data.config.lang.length) { + lang = data.config.lang; + } + if (lang.length > 1 || lang[0] !== "en") { + scriptsToLoad.push('lunr.stemmer.support.js'); + if (lang.length > 1) { + scriptsToLoad.push('lunr.multi.js'); + } + if (lang.includes("ja") || lang.includes("jp")) { + scriptsToLoad.push('tinyseg.js'); + } + for (var i=0; i < lang.length; i++) { + if (lang[i] != 'en') { + scriptsToLoad.push(['lunr', lang[i], 'js'].join('.')); + } + } + } + loadScripts(scriptsToLoad, onScriptsLoaded); +} + +function onScriptsLoaded () { + console.log('All search scripts loaded, building Lunr index...'); + if (data.config && data.config.separator && data.config.separator.length) { + lunr.tokenizer.separator = new RegExp(data.config.separator); + } + + if (data.index) { + index = lunr.Index.load(data.index); + data.docs.forEach(function (doc) { + documents[doc.location] = doc; + }); + console.log('Lunr pre-built index loaded, search ready'); + } else { + index = lunr(function () { + if (lang.length === 1 && lang[0] !== "en" && lunr[lang[0]]) { + this.use(lunr[lang[0]]); + } else if (lang.length > 1) { 
+ this.use(lunr.multiLanguage.apply(null, lang)); // spread operator not supported in all browsers: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_operator#Browser_compatibility + } + this.field('title'); + this.field('text'); + this.ref('location'); + + for (var i=0; i < data.docs.length; i++) { + var doc = data.docs[i]; + this.add(doc); + documents[doc.location] = doc; + } + }); + console.log('Lunr index built, search ready'); + } + allowSearch = true; + postMessage({config: data.config}); + postMessage({allowSearch: allowSearch}); +} + +function init () { + var oReq = new XMLHttpRequest(); + oReq.addEventListener("load", onJSONLoaded); + var index_path = base_path + '/search_index.json'; + if( 'function' === typeof importScripts ){ + index_path = 'search_index.json'; + } + oReq.open("GET", index_path); + oReq.send(); +} + +function search (query) { + if (!allowSearch) { + console.error('Assets for search still loading'); + return; + } + + var resultDocuments = []; + var results = index.search(query); + for (var i=0; i < results.length; i++){ + var result = results[i]; + var doc = documents[result.ref]; + doc.summary = doc.text.substring(0, 200); + resultDocuments.push(doc); + } + return resultDocuments; +} + +if( 'function' === typeof importScripts ) { + onmessage = function (e) { + if (e.data.init) { + init(); + } else if (e.data.query) { + postMessage({ results: search(e.data.query) }); + } else { + console.error("Worker - Unrecognized message: " + e); + } + }; +} diff --git a/site/sitemap.xml b/site/sitemap.xml new file mode 100644 index 0000000..32db3fe --- /dev/null +++ b/site/sitemap.xml @@ -0,0 +1,83 @@ + + + + https://neurobioinfo.github.io/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Acknowledgement/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Dataset1/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Dataset2/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/LICENSE/ + 
2024-06-14 + daily + + + https://neurobioinfo.github.io/Step0/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Step1/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Step2/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/Step3/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/contributing/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/installation/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/midbrain_download/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/outputs/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/overview/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/overview_pipeline/ + 2024-06-14 + daily + + + https://neurobioinfo.github.io/reference/ + 2024-06-14 + daily + + \ No newline at end of file diff --git a/site/sitemap.xml.gz b/site/sitemap.xml.gz new file mode 100644 index 0000000..335c66a Binary files /dev/null and b/site/sitemap.xml.gz differ