doc: expanded the misc. guidance in quickstart, tumor, germline
Showing 4 changed files with 93 additions and 24 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,33 @@ | ||
Germline analysis | ||
================= | ||
|
||
.. TODO - see e-mails, biostars, notes | ||
CNVkit can be used with exome sequencing of constitutional (non-tumor) samples, | ||
for example to detect germline copy number alterations associated with heritable | ||
conditions. However, note that CNVkit is less accurate in detecting CNVs | ||
smaller than 1 Mbp, typically only detecting variants that span multiple exons | ||
or captured regions. When used on exome or target panel datasets, CNVkit will | ||
not detect the small CNVs that are more common in populations. | ||
|
||
CNVkit is less accurate in detecting CNVs smaller than 1 Mbp. | ||
To use CNVkit to detect medium-to-large CNVs or unbalanced SVs in constitutional | ||
samples: | ||
|
||
The ``--drop-low-coverage`` option (see :doc:`tumor`) should not be used; it | ||
will typically remove germline deep deletions altogether, which is not | ||
desirable. | ||
- The :ref:`call` command can be used directly without specifying the | ||
``--purity`` and ``--ploidy`` values, as the defaults will be correct for | ||
mammalian cells. (For non-diploid species, use the correct ``--ploidy``, of | ||
course.) The default ``--method threshold`` assigns integer copy number | ||
similarly to ``--method clonal``, but with smaller thresholds for calling | ||
single-copy changes. The default thresholds allow for mosaicism in CNVs, which | |
have a log2 value of smaller magnitude than a single-copy CNV would indicate. | |
(Mosaic CNVs are more common than often thought.) | |
|
||
Watch for mosaicism in CNVs, resulting in non-integer copy numbers (i.e. smaller | ||
log2 value than a single-copy CNV would indicate); they're more common than | ||
often thought. | ||
- The ``--filter`` option in :ref:`call` can be used to reduce the number of | ||
false-positive segments returned. To use the ``ci`` (recommended) or ``sem`` | ||
filters, first run each sample's segmented .cns file through :ref:`segmetrics` | ||
with the ``--ci`` option, which adds upper and lower confidence limits to the | ||
.cns output that ``call --filter ci`` can then use. | ||
|
||
- The ``--drop-low-coverage`` option (see :doc:`tumor`) should not be used; it | ||
will typically remove germline deep deletions altogether, which is not | ||
desirable. | ||
|
||
- For using CNVkit with whole-genome sequencing datasets, see :doc:`nonhybrid`. |
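The point about mosaicism can be illustrated with a little arithmetic. The following is a minimal sketch, not CNVkit code: the function name ``expected_log2`` and its parameters are hypothetical, and it assumes a simple two-population mixture in which a fraction of cells carry the CNV and the rest carry the normal ploidy.

```python
import math


def expected_log2(copies, cell_fraction=1.0, ploidy=2):
    """Expected log2 ratio for a simple two-population mixture (illustrative only).

    `cell_fraction` of the cells carry `copies` copies of the region;
    the remaining cells carry the normal `ploidy`.
    """
    avg_copies = cell_fraction * copies + (1.0 - cell_fraction) * ploidy
    if avg_copies <= 0:
        # A full homozygous deletion has no finite log2 ratio.
        return float("-inf")
    return math.log2(avg_copies / ploidy)


# Fully penetrant single-copy loss and gain in a diploid genome.
full_loss = expected_log2(1)
full_gain = expected_log2(3)

# The same single-copy loss present in only half of the cells (mosaic).
mosaic_loss = expected_log2(1, cell_fraction=0.5)
```

Under these assumptions the mosaic loss gives a log2 value closer to zero than the fully penetrant loss, which is why calling thresholds tuned for exact single-copy changes could miss it.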
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,42 @@ | ||
Tumor analysis | ||
============== | ||
|
||
Solid tumor samples: Use ``--drop-low-coverage`` in the :ref:`batch` and | ||
:ref:`segment` commands. Virtually all tumor samples, even cancer cell lines, | ||
are not completely homogeneous. Even in regions of homozygous deletion in the | ||
largest tumor-cell clonal population, some sequencing reads will be obtained | ||
from contaminating normal cells without the deletion. | ||
Therefore, extremely low log2 copy ratio values (below -15) do not indicate | ||
homozygous deletions but failed sequencing or mapping in all cells regardless | ||
of copy number status at that site, which are not informative for copy number. | ||
This option in the :ref:`batch` command applies to segmentation; the option is | ||
also available in the :ref:`segment`, :ref:`metrics`, :ref:`segmetrics`, | ||
:ref:`gainloss` and :doc:`heterogeneity` commands. | ||
CNVkit has been used most extensively on solid tumor samples sequenced with a | ||
target panel or whole-exome sequencing protocol. Several options and approaches | ||
are available to support this use case: | ||
|
||
If you have unpaired tumor samples, or no normal samples sequenced on the | ||
same platform, see the :ref:`reference` command for strategies. | ||
- If you have unpaired tumor samples, or no normal samples sequenced on the same | ||
platform, see the :ref:`reference` command for strategies. | ||
|
||
- Use ``--drop-low-coverage`` to ignore bins with log2 normalized coverage | ||
values below -15. Virtually all tumor samples, even cancer cell lines, are | ||
not completely homogeneous. Even in regions of homozygous deletion in the | ||
largest tumor-cell clonal population, some sequencing reads will be obtained | ||
from contaminating normal cells without the deletion. Therefore, extremely low | |
log2 copy ratio values do not indicate homozygous deletions; they reflect failed | |
sequencing or mapping in all cells, regardless of copy number status at that | |
site, and so are not informative for copy number. This option in the | |
:ref:`batch` command applies to segmentation; the option is also available in | ||
the :ref:`segment`, :ref:`metrics`, :ref:`segmetrics`, :ref:`gainloss` and | ||
:doc:`heterogeneity` commands. | ||
|
||
- Why -15? The null log2 value substituted for bins with zero coverage is | ||
-20 (about 1 millionth the average bin's coverage), and the maximum | ||
positive shift that can be introduced by normalizing to the reference is 5 | ||
(for bins with 1/32 the average coverage; bins below this are masked out | ||
by the reference). In a .cnr file, any bins with log2 value below -15 are | ||
probably based on dummy values corresponding to zero-coverage (perhaps | ||
unmappable) bins, and not real observations. | ||
|
||
- The :ref:`batch` command does not directly output integer copy number calls | ||
(see :doc:`heterogeneity`). Instead, use the ``--ploidy`` and ``--purity`` | ||
options in :ref:`call` to calculate copy number for each sample individually | ||
using known or estimated tumor-cell fractions. Also consider using ``--center | ||
median`` in highly aneuploid samples to shift the log2 value of true neutral | ||
regions closer to zero, as it may be slightly off initially. | ||
|
||
- If SNV calls are available in VCF format, use the ``-v``/``--vcf`` option in | ||
the :ref:`call` and :ref:`scatter` commands to calculate or plot b-allele | ||
frequencies alongside each segment's total copy number or log2 ratio. These | ||
values reveal allelic imbalance and loss of heterozygosity (LOH), supporting | ||
and extending the inferred CNVs. |
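The reasoning behind the -15 cutoff can be checked with a few lines of arithmetic. This is only a sketch built from the numbers quoted in the text above (-20 and 1/32); the constant and function names are hypothetical and do not come from CNVkit itself.

```python
import math

# Values taken from the text above; the names are illustrative.
NULL_LOG2 = -20.0            # placeholder log2 value for zero-coverage bins
MIN_REF_FRACTION = 1 / 32    # lowest reference coverage kept before masking

# Fraction of average coverage that the placeholder corresponds to.
null_fraction = 2 ** NULL_LOG2

# Largest positive shift that normalizing to the reference could add.
max_shift = -math.log2(MIN_REF_FRACTION)

# Anything below this is most likely a placeholder, not an observation.
cutoff = NULL_LOG2 + max_shift


def is_probably_dummy(log2_ratio, threshold=cutoff):
    """Return True if a bin's log2 value falls below the derived cutoff."""
    return log2_ratio < threshold
```

Here ``null_fraction`` works out to roughly one millionth, ``max_shift`` to 5, and ``cutoff`` to -15, matching the explanation in the "Why -15?" item.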