Added the functionality of scale.factor in NormalizeData being set to "median" of counts #9389

Open. Wants to merge 9 commits into base: develop.


pranavm2109

Background

The NormalizeData generic has several implementations, dispatched on the class of its first argument: NormalizeData.Assay, NormalizeData.default, NormalizeData.Seurat, NormalizeData.StdAssay, and NormalizeData.V3Matrix*. Each of these supports three built-in normalization methods: "LogNormalize", "CLR", and "RC". "LogNormalize" and "RC" take a scale.factor parameter, by which the normalized values are multiplied during normalization; "CLR" does not use it. The default value is 1e4 for LogNormalize and 1 for RelativeCounts (RC).
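To make the role of scale.factor concrete, here is a minimal base-R sketch of the two scaling-based methods on a toy dense matrix. The helper names `relative_counts` and `log_normalize` and the count values are made up for illustration; this is not Seurat's implementation, which also handles sparse matrices and other input classes.

```r
# Toy counts matrix: rows are genes, columns are cells (made-up values).
counts <- matrix(
  c(1, 0, 3,
    2, 2, 0,
    7, 8, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# "RC"-style scaling (hypothetical helper): divide each column by its
# total, then multiply by scale.factor.
relative_counts <- function(m, scale.factor = 1) {
  sweep(m, 2, colSums(m), "/") * scale.factor
}

# "LogNormalize"-style scaling (hypothetical helper): the same scaling,
# followed by log1p.
log_normalize <- function(m, scale.factor = 1e4) {
  log1p(relative_counts(m, scale.factor))
}

rc <- relative_counts(counts)  # each column sums to scale.factor (here 1)
ln <- log_normalize(counts)
```

With scale.factor = 1, every column of `rc` sums to 1; a larger scale.factor simply rescales every cell to that common total before the optional log transform.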

Updates

I have extended the implementations of LogNormalize, RelativeCounts, and .SparseNormalize so that, when "median" is passed as the scale.factor parameter, they compute the median of the counts across all columns (cells), or across all rows (genes) if margin = 1L in the case of LogNormalize.default, and use that value as the scale.factor.
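The behaviour described above could be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the helper name `resolve_scale_factor` is hypothetical, and "median of the counts" is interpreted here as the median of the per-cell totals (or per-gene totals when margin = 1L).

```r
# Toy counts matrix: rows are genes, columns are cells (made-up values).
counts <- matrix(
  c(1, 0, 3,
    2, 2, 0,
    7, 8, 1),
  nrow = 3, byrow = TRUE
)

# Hypothetical helper: turn the scale.factor argument into a number.
# Assumption: "median" means the median of the totals along the margin.
resolve_scale_factor <- function(m, scale.factor = 1e4, margin = 2L) {
  if (identical(scale.factor, "median")) {
    totals <- if (margin == 1L) rowSums(m) else colSums(m)
    return(stats::median(totals))
  }
  if (!is.numeric(scale.factor) || length(scale.factor) != 1L) {
    stop("scale.factor must be a single number or \"median\"")
  }
  scale.factor
}

# Cell totals are 10, 10 and 4, so the median is 10.
sf <- resolve_scale_factor(counts, "median")

# LogNormalize-style use of the resolved factor.
normalized <- log1p(sweep(counts, 2, colSums(counts), "/") * sf)
```

Resolving the string once, before the scaling step, keeps the numeric path unchanged for callers that pass a number.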

I have also added unit tests to test_preprocessing.R that check that the median is computed correctly when the value passed to the scale.factor parameter is "median".
