This repository has been archived by the owner on Jul 29, 2024. It is now read-only.
Change format style of docs (remove unnecessary linebreaks), update docs to include continuous outcome information, add documentation for tuple argument to be passed.
rsyi committed Aug 25, 2019
1 parent e0421c0 commit b267722
Showing 9 changed files with 79 additions and 287 deletions.
16 changes: 4 additions & 12 deletions docs/contributing.rst
@@ -1,19 +1,11 @@
Contributing
============

We are happy to have you contribute! The project is hosted on
https://github.com/wayfair/pylift. The easiest way to get started is to
check out the “Issues” and “Projects” sections of the repo.
We are happy to have you contribute! The project is hosted on https://github.com/wayfair/pylift. The easiest way to get started is to check out the “Issues” and “Projects” sections of the repo.

In general, we have four main requirements for contribution:

1. **Contributions do not affect the default functionality of existing
methods.** We want to make sure any new features or modifications
still work with older code samples.
2. **Contributions should be accompanied by a unit test to ensure basic
functionality (if applicable).** Tests are contained in the
``./tests/`` directory. The package is automatically tested on
TravisCI with every push.
1. **Contributions do not affect the default functionality of existing methods.** We want to make sure any new features or modifications still work with older code samples.
2. **Contributions should be accompanied by a unit test to ensure basic functionality (if applicable).** Tests are contained in the ``./tests/`` directory. The package is automatically tested on TravisCI with every push.
3. **Contributions are accompanied by documentation.**
4. **New methods are tested, well-understood, and have a defined
purpose.**
4. **New methods are tested, well-understood, and have a defined purpose.**
101 changes: 19 additions & 82 deletions docs/evaluation.rst
@@ -14,12 +9,9 @@ All curves can be plotted using

up.plot(plot_type='qini')

Where ``plot_type`` can be any of the following values. In the formulaic
representations
where ``plot_type`` can be any of the following values, with the formulaic representations given below:

- ``qini``: typical Qini curve (see Radcliffe 2007), except we
normalize by the total number of people in treatment. The typical
definition is
- ``qini``: typical Qini curve (see Radcliffe 2007), except we normalize by the total number of people in treatment. The typical definition is

.. math:: n_{t,1} - n_{c,1} N_t/N_c.

@@ -31,46 +28,20 @@ representations

.. math:: n_{t,1}/n_t - n_{c,1}/n_c.

- ``uplift``: typical uplift curve, calculated the same as cuplift but
only returning the average value within the bin, rather than
cumulatively.
- ``cgains``: cumulative gains curve (see Gutierrez, Gerardy 2016),
defined as
- ``uplift``: typical uplift curve, calculated the same as cuplift but only returning the average value within the bin, rather than cumulatively.
- ``cgains``: cumulative gains curve (see Gutierrez, Gerardy 2016), defined as

.. math:: (n_{t,1}/n_t - n_{c,1}/n_c)\phi.

- ``balance``: ratio of treatment group size to total group size within
each bin,
- ``balance``: ratio of treatment group size to total group size within each bin,

.. math:: n_t/(n_c + n_t).

Above, :math:`\phi` corresponds to the fraction of
individuals targeted – the x-axis of these curves. :math:`n` and :math:`N`
correspond to counts up to :math:`phi` (except for the uplift curve, which
is only within the bin at the :math:`phi` position) or within the entire
group, respectively. The subscript :math:`t` indicates the treatment group,
and :math:`c`, the control. The subscript :math:`1` indicates the subset of
the count for which individuals had a positive outcome.

A number of scores are stored in both the ``test_results_`` and
``train_results_`` objects, containing scores calculated over the test
set and train set, respectively. Namely, there are three important
scores: \* ``Q``: unnormalized area between the qini curve and the
random selection line. \* ``q1``: ``Q``, normalized by the theoretical
maximum value of ``Q``. \* ``q2``: ``Q``, normalized by the practical
maximum value of ``Q``.

Each of these can be accesses as attributes of ``test_results_`` or
``train_results_``. Either ``_qini``, ``_aqini``, or ``_cgains`` can be
appended to obtain the same calculation for the qini curve, adjusted
qini curve, or the cumulative gains curve, respectively. The score most
unaffected by anomalous treatment/control ordering, without any bias to
treatment or control (i.e. if you’re looking at lift between two equally
viable treatments) is the ``q1_cgains`` score, but if you are looking at
a simple treatment vs. control situation, ``q1_aqini`` is preferred.
Because this only really has meaning over an independent holdout [test]
set, the most valuable value to access, then, would likely be
``up.test_results_.q1_aqini``.
Above, :math:`\phi` corresponds to the fraction of individuals targeted – the x-axis of these curves. :math:`n` and :math:`N` correspond to counts up to :math:`\phi` (except for the uplift curve, which is only within the bin at the :math:`\phi` position) or within the entire group, respectively. The subscript :math:`t` indicates the treatment group, and :math:`c`, the control. The subscript :math:`1` indicates the subset of the count for which individuals had a positive outcome.
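To make these definitions concrete, here is a minimal, self-contained sketch of the cumulative gains calculation (the helper name ``cgains_points`` and its toy inputs are hypothetical illustrations, not pylift's internals):

```python
def cgains_points(treatment, outcome, prediction):
    """Compute (phi, cgains) pairs per (n_t1/n_t - n_c1/n_c) * phi,
    accumulating counts down the list sorted by predicted uplift."""
    order = sorted(range(len(prediction)), key=lambda i: -prediction[i])
    n_t = n_c = n_t1 = n_c1 = 0
    points = []
    for rank, i in enumerate(order, start=1):
        if treatment[i]:
            n_t += 1
            n_t1 += outcome[i]
        else:
            n_c += 1
            n_c1 += outcome[i]
        phi = rank / len(order)  # fraction of individuals targeted
        if n_t and n_c:  # both groups must be represented so the rates exist
            points.append((phi, (n_t1 / n_t - n_c1 / n_c) * phi))
    return points

# toy data: two treated individuals, then two control individuals
pts = cgains_points([1, 1, 0, 0], [1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1])
```

The same accumulation pattern, with the appropriate formula swapped in, yields the qini and cuplift curves.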

A number of scores are stored in both the ``test_results_`` and ``train_results_`` objects, containing scores calculated over the test set and train set, respectively. Namely, there are three important scores:

- ``Q``: unnormalized area between the qini curve and the random selection line.
- ``q1``: ``Q``, normalized by the theoretical maximum value of ``Q``.
- ``q2``: ``Q``, normalized by the practical maximum value of ``Q``.

Each of these can be accessed as attributes of ``test_results_`` or ``train_results_``. Either ``_qini``, ``_aqini``, or ``_cgains`` can be appended to obtain the same calculation for the qini curve, adjusted qini curve, or the cumulative gains curve, respectively. The score least affected by anomalous treatment/control ordering, without any bias to treatment or control (i.e. if you’re looking at lift between two equally viable treatments), is the ``q1_cgains`` score, but if you are looking at a simple treatment vs. control situation, ``q1_aqini`` is preferred. Because these scores only really have meaning over an independent holdout (test) set, the most useful value to access is likely ``up.test_results_.q1_aqini``.
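Since ``Q`` is just an area between two curves, its calculation reduces to numerical integration; a hedged sketch under the assumption of a shared ``phi`` grid (the function name is illustrative, not pylift's API):

```python
def area_vs_random(phi, curve):
    """Unnormalized area between a qini-style curve and the random
    selection line (the straight line from (0, 0) to (1, curve[-1])),
    via the trapezoid rule. Assumes phi[0] == 0 and phi[-1] == 1."""
    diffs = [c - p * curve[-1] for c, p in zip(curve, phi)]
    area = 0.0
    for i in range(1, len(phi)):
        area += 0.5 * (diffs[i] + diffs[i - 1]) * (phi[i] - phi[i - 1])
    return area

# a curve that rises quickly then flattens beats random selection
Q = area_vs_random([0.0, 0.5, 1.0], [0.0, 1.0, 1.0])
```

``q1`` and ``q2`` would then divide this quantity by the area of the corresponding maximal curve.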

::

@@ -80,29 +51,11 @@ Maximal curves can also be toggled by passing flags into ``up.plot()``:

- ``show_theoretical_max``
- ``show_practical_max``
- ``show_no_dogs``
- ``show_random_selection``

Each of these curves satisfies shows the maximally attainable curve
given different assumptions about the underlying data. The
``show_theoretical_max`` curve corresponds to a sorting in which we
assume that an individual is persuadable (uplift = 1) if and only if
they respond in the treatment group (and the same reasoning applies to
the control group, for sleeping dogs). The ``show_practical_max`` curve
assumes that all individuals that have a positive outcome in the
treatment group must also have a counterpart (relative to the proportion
of individuals in the treatment and control group) in the control group
that did not respond. This is a more conservative, realistic curve. The
former can only be attained through overfitting, while the latter can
only be attained under very generous circumstances. Within the package,
we also calculate the ``show_no_dogs`` curve, which simply precludes the
possibility of negative effects.

The random selection line is shown by default, but the option to toggle
it off is included in case you’d like to plot multiple plots on top of
each other.

The below code plots the practical max over the aqini curve of a model
contained in the TransformedOutcome object ``up``, then overlays the
aqini curve of a second model contained in ``up1``, also changing the
line color.
Each of these curves shows the maximally attainable curve given different assumptions about the underlying data. The ``show_theoretical_max`` curve corresponds to a sorting in which we assume that an individual is persuadable (uplift = 1) if and only if they respond in the treatment group (and the same reasoning applies to the control group, for sleeping dogs). The ``show_practical_max`` curve assumes that all individuals that have a positive outcome in the treatment group must also have a counterpart (relative to the proportion of individuals in the treatment and control group) in the control group that did not respond. This is a more conservative, realistic curve. The former can only be attained through overfitting, while the latter can only be attained under very generous circumstances. Within the package, we also calculate the ``show_no_dogs`` curve, which simply precludes the possibility of negative effects.

The random selection line is shown by default, but the option to toggle it off is included in case you’d like to plot multiple plots on top of each other.

The below code plots the practical max over the aqini curve of a model contained in the TransformedOutcome object ``up``, then overlays the aqini curve of a second model contained in ``up1``, also changing the line color.

::

@@ -112,40 +65,24 @@
Error bars
~~~~~~~~~~

It is often useful to obtain error bars on your qini curves. We’ve
implemented two ways to do this: 1. ``up.shuffle_fit()``: Seeds the
``train_test_split``, fit the model over the new training data, and
evaluate on the new test data. Average these curves. 1.
``up.noise_fit()``: Randomly shuffle the labels independently of the
features and fit a model. This can help distinguish your evaluation
curves from noise.
It is often useful to obtain error bars on your qini curves. We’ve implemented two ways to do this:

1. ``up.shuffle_fit()``: seeds the ``train_test_split``, fits the model over the new training data, and evaluates on the new test data, averaging the resulting curves.
2. ``up.noise_fit()``: randomly shuffles the labels independently of the features and fits a model. This can help distinguish your evaluation curves from noise.

::

up.shuffle_fit()
up.plot(plot_type='aqini', show_shuffle_fits=True)
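The curve averaging behind ``show_shuffle_fits`` can also be reproduced by hand, e.g. to build custom error bands; a minimal sketch assuming all shuffled curves were evaluated on one common ``phi`` grid (names are illustrative, not pylift's API):

```python
def average_curves(curves):
    """Pointwise mean and a simple spread for several y-value curves
    sampled on the same phi grid -- the basis of a shaded error band."""
    n = len(curves)
    means = [sum(ys) / n for ys in zip(*curves)]
    # population standard deviation at each grid point
    stds = [
        (sum((y - m) ** 2 for y in ys) / n) ** 0.5
        for ys, m in zip(zip(*curves), means)
    ]
    return means, stds

# two shuffled-fit curves sampled at three phi values
means, stds = average_curves([[0.0, 0.2, 0.4], [0.0, 0.4, 0.4]])
```

The mean would be plotted as the central line and ``mean ± std`` as the shaded band.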

Adjustments can also be made to the aesthetics of these curves by
passing in dictionaries that pass down to plot elements. For example,
``shuffle_band_kwargs`` is a dictionary of kwargs that modifies the
``fill_between`` shaded error bar region.
Adjustments can also be made to the aesthetics of these curves by passing in dictionaries that pass down to plot elements. For example, ``shuffle_band_kwargs`` is a dictionary of kwargs that modifies the ``fill_between`` shaded error bar region.

With ``UpliftEval``
-------------------

The ``UpliftEval`` class can also independently be used to apply the
above evaluation visualizations and calculations. Note that the ``up``
object uses ``UpliftEval`` to generate the plots, so the ``UpliftEval``
class object for the train set and test set can be obtained in
``up.train_results_`` and ``up.test_results_``, respectively.
The ``UpliftEval`` class can also independently be used to apply the above evaluation visualizations and calculations. Note that the ``up`` object uses ``UpliftEval`` to generate the plots, so the ``UpliftEval`` class object for the train set and test set can be obtained in ``up.train_results_`` and ``up.test_results_``, respectively.

::

from pylift.eval import UpliftEval
upev = UpliftEval(treatment, outcome, predictions)
upev.plot(plot_type='aqini')

It generally functions the same as the ``up.plot()`` function, except
error bars cannot be obtained. Note that ``UpliftEval`` could still be
used, however, to manually generate the curves that can be aggregated to
make error bars.
It generally functions the same as the ``up.plot()`` function, except error bars cannot be obtained. Note that ``UpliftEval`` could still be used, however, to manually generate the curves that can be aggregated to make error bars.
6 changes: 3 additions & 3 deletions docs/index.rst
@@ -23,10 +23,10 @@ Welcome to pylift's documentation!

**pylift** has two main features:

#. A `TransformedOutcome` class (inheriting a more general `BaseProxyMethod` class) that allows for full end-to-end uplift modeling.
#. An `UpliftEval` class that allows for evaluation of any model prediction. This class is used within the `TransformedOutcome` class, but can be called independently to evaluate the performance of, for example, scores from a modeling approach external to **pylift**.
#. A `TransformedOutcome` class (inheriting a more general `BaseProxyMethod` class) that allows for full end-to-end uplift modeling.
#. An `UpliftEval` class that allows for evaluation of any model prediction. This class is used within the `TransformedOutcome` class, but can be called independently to evaluate the performance of, for example, scores from a modeling approach external to **pylift**.

The `TransformedOutcome` class (and so, the `BaseProxyMethod` class) simply wraps `sklearn` classes and functions. Therefore, it's generally possible to do anything you can do with `sklearn` within `pylift` as well. Advanced usage of **pylift**, therefore, should feel familiar to those well-versed in `sklearn`.
The `TransformedOutcome` class (and so, the `BaseProxyMethod` class) simply wraps `sklearn` classes and functions. Therefore, it's generally possible to do anything you can do with `sklearn` within `pylift` as well. Advanced usage of **pylift**, therefore, should feel familiar to those well-versed in `sklearn`.

Indices and tables
==================
16 changes: 6 additions & 10 deletions docs/installation.rst
@@ -1,8 +1,7 @@
Installation
============

**pylift** has only been tested on Python **3.6** and **3.7**. It
currently requires the following package versions:
**pylift** has only been tested on Python **3.6** and **3.7**. It currently requires the following package versions:

::

@@ -12,19 +11,16 @@ currently requires the following package versions:
scipy >= 1.0.0
xgboost >= 0.6a2

A ``requirements.txt`` file is included in the parent directory of the
github repo that contains these lower-limit package versions, as these
are the versions we have most extensively tested pylift on, but newer
versions generally appear to work.
A ``requirements.txt`` file is included in the parent directory of the github repo that contains these lower-limit package versions, as these are the versions we have most extensively tested pylift on, but newer versions generally appear to work.

At the moment, the package must be built from source. This means cloning
the repo and installing, using the following commands:
The package can be built from source (for the latest version) or installed from PyPI. To install from source, clone the repo and install using the following commands:

::

git clone https://github.com/wayfair/pylift
cd pylift
pip install .

To upgrade, ``git pull origin master`` in the repo folder, and then run
``pip install --upgrade --no-cache-dir .``.
To upgrade, ``git pull origin master`` in the repo folder, and then run ``pip install --upgrade --no-cache-dir .``.

Alternatively, install from pypi by simply running ``pip install pylift``.
