Releases · ilias-ant/adversarial-validation

22 Jul 15:52

ilias-ant

v0.1.1

df7294f

v0.1.1 Latest

Fixed

The preprocessing INFO statement printed to stdout is now wrapped under the verbose functionality, as expected. Previously,
this statement was printed even when verbose=False was passed to the validate function:
```
INFO: Working only with available numerical features, 
categorical features are not yet supported.
```
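
The kind of gating this fix describes can be sketched as follows. This is a minimal illustration, not the package's actual code: `emit_info` and `captured_output` are hypothetical helper names, and the message is joined into a single line here.

```python
import contextlib
import io

# Message text taken from the release note above (joined into one line).
INFO_MESSAGE = (
    "INFO: Working only with available numerical features, "
    "categorical features are not yet supported."
)


def emit_info(message: str, verbose: bool) -> None:
    """Print an informational message only when verbose output is requested."""
    if verbose:
        print(message)


def captured_output(verbose: bool) -> str:
    """Return whatever emit_info writes to stdout for the given flag."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        emit_info(INFO_MESSAGE, verbose)
    return buffer.getvalue()
```

With this pattern, `captured_output(False)` is an empty string, while `captured_output(True)` contains the INFO line.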

22 Jul 12:34

ilias-ant

v0.1.0

7e9c563

v0.1.0

The first non pre-release of the package. 🎉

v0.1.0 is still considered a beta release, as the API has not yet been tested extensively across a wide variety of datasets. So far, I have tested it with 3 different Kaggle datasets.

No functional changes are introduced. The README now references the article https://ilias-ant.github.io/blog/adversarial-validation/, which is meant to serve as additional contextual documentation.

19 Jul 21:22

ilias-ant

v0.1.0-beta

71eb2ef

v0.1.0-beta Pre-release

This is considered the beta pre-release version, introducing some minor additions after a bit of personal testing on 2-3 Kaggle datasets.

Features:

An explicitly passed random_state is now propagated to the underlying classifier as well.
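
To illustrate why propagating the seed matters, here is a minimal standard-library sketch. It assumes a pipeline with two sources of randomness (row shuffling and the model itself); `ToyClassifier`, `shuffled_indices` and `run` are hypothetical stand-ins, not part of advertion.

```python
import random
from typing import List, Optional, Tuple


class ToyClassifier:
    """Hypothetical stand-in for a classifier with internal randomness."""

    def __init__(self, random_state: Optional[int] = None) -> None:
        # A dedicated generator, seeded from the caller's random_state.
        self._rng = random.Random(random_state)

    def draw(self) -> float:
        """Return a number that depends on the classifier's internal seed."""
        return self._rng.random()


def shuffled_indices(n_rows: int, random_state: Optional[int] = None) -> List[int]:
    """Shuffle row indices with a generator seeded from random_state."""
    rng = random.Random(random_state)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices


def run(n_rows: int, random_state: Optional[int] = None) -> Tuple[List[int], float]:
    """Pass the same seed to both the shuffling step and the classifier."""
    order = shuffled_indices(n_rows, random_state)
    classifier = ToyClassifier(random_state=random_state)
    return order, classifier.draw()
```

Calling `run(10, random_state=42)` twice returns identical results. If the seed were passed to the shuffling step only, the classifier's draw could still differ between calls.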

Documentation:

Added a short README/homepage introduction to the concept of adversarial validation and where this package stands.

Also added a homemade package logo (available in the README and on the homepage: https://advertion.readthedocs.io/en/latest/).

19 Jul 07:42

ilias-ant

v0.1.0-alpha

f421afb

v0.1.0-alpha Pre-release

This is considered the alpha pre-release version, introducing some backwards-incompatible changes w.r.t. the previous release.

Features:

The return value of the main public object, advertion.validate, has changed from bool to dict:

```python
import pandas as pd

from advertion import validate

train = pd.read_csv("...")
test = pd.read_csv("...")

validate(
    trainset=train,
    testset=test,
    target="label",
)

# Returns a dict such as:
# {
#     "datasets_follow_same_distribution": True,
#     "mean_roc_auc": 0.5021320833333334,
#     "adversarial_features": ["id"],
# }
```
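
Since this change is backwards-incompatible, code written against the earlier bool response needs adjusting. One possible way to handle both shapes is sketched below. `same_distribution` is a hypothetical helper, not part of the package, and it assumes only the dict key shown above.

```python
from typing import Any, Dict, Union


def same_distribution(result: Union[bool, Dict[str, Any]]) -> bool:
    """Accept either the old bool response or the new dict response."""
    if isinstance(result, bool):
        # Earlier pre-release: validate returned the verdict directly.
        return result
    # This release: the verdict lives under a dedicated key.
    return bool(result["datasets_follow_same_distribution"])


# Example with the dict shape shown in the release note above.
example = {
    "datasets_follow_same_distribution": True,
    "mean_roc_auc": 0.5021320833333334,
    "adversarial_features": ["id"],
}
verdict = same_distribution(example)
```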

Also, when smart=True is selected (which is the default), an improved identification logic for adversarial features is used, based on the Kolmogorov–Smirnov test. With verbose=True, the statistic value and the p-value of the test are printed to the standard output for every feature that is deemed adversarial.
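
For readers unfamiliar with the test, the two-sample Kolmogorov–Smirnov statistic is the largest gap between the empirical cumulative distribution functions of two samples. The sketch below computes only that statistic in plain Python. It is an illustration, not the package's implementation, and `ks_statistic` is a hypothetical name. In practice, `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample KS statistic: maximum gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The empirical CDFs are step functions, so the largest gap is
    # always attained at one of the observed values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

A feature such as a row ID, whose values in the test set lie entirely beyond those in the training set, yields a statistic of 1.0. Identical samples yield 0.0.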

Documentation:

New page on adversarial features: https://advertion.readthedocs.io/en/latest/adversarial-features/. It is also referenced on the standard output when smart=True and verbose=True.

Tests:

Tests have been developed for the package's public interface, reaching 100% test coverage on the project.

CI/CD:

Continuous Integration, enabled through GitHub Actions, has been enriched with 2 additional linters:

autoflake (detects unused imports)
bandit (detects common software security issues)

Also, the test suite now runs against the following combinations:

python-version: ['3.8', '3.9', '3.10', '3.11']
os: [ubuntu-latest, macos-latest, windows-latest]

Last but not least, codecov has been introduced.

For more details, see:

.github/workflows/ci.yml

16 Jul 11:09

ilias-ant

v0.1.0-alpha2

3869755

v0.1.0-alpha2 Pre-release

A follow-up, pre-alpha release that introduces continuous documentation capabilities to the project, through MkDocs + readthedocs. Material for MkDocs has been utilized as the theme.

URL: https://advertion.readthedocs.io/en/latest/

No change to the functionality since the inaugural pre-release v0.1.0-alpha1.

16 Jul 08:51

ilias-ant

v0.1.0-alpha1

1a30431

v0.1.0-alpha1 Pre-release

This inaugural pre-alpha release introduces the core functionality of adversarial validation, exposed to the end user through the following function:

```python
import pandas as pd

from advertion import validate

train = pd.read_csv("...")   # let's say the target variable is "label"
test = pd.read_csv("...")

are_similar = validate(
    train=train,
    test=test,
    target="label",
)
# are_similar = True: train and test follow the same underlying distribution.
# are_similar = False: the test dataset exhibits a different underlying distribution than the train dataset.
```

At the same time:

Passing smart=True employs a pruning strategy of design matrix features based on feature importance. This helps remove features with strongly identifiable properties, such as IDs, timestamps, etc.
Passing an n_splits value controls the number of cross-validation folds that take place internally.
Passing verbose=True prints informative messages on the adversarial validation strategy to the standard output.
Passing a random_state value ensures reproducible output across multiple function calls.
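
The intuition behind adversarial validation, and behind pruning features like IDs, can be sketched without any dependencies. The example below swaps in a simpler technique than the one described above: instead of a cross-validated classifier, it scores each feature on its own with a rank-based ROC AUC for telling test rows apart from train rows. `rank_auc`, `suspicious_features` and the `tolerance` value are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence


def rank_auc(train_values: Sequence[float], test_values: Sequence[float]) -> float:
    """Probability that a random test value exceeds a random train value.

    Ties count as half. A value near 0.5 means the feature does not help
    to distinguish train rows from test rows.
    """
    wins = 0.0
    for t in test_values:
        for r in train_values:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(train_values) * len(test_values))


def suspicious_features(
    train: Dict[str, Sequence[float]],
    test: Dict[str, Sequence[float]],
    tolerance: float = 0.1,  # assumed cut-off, for illustration only
) -> List[str]:
    """List features whose single-feature AUC is far from 0.5."""
    flagged = []
    for name in train:
        if name not in test:
            continue
        if abs(rank_auc(train[name], test[name]) - 0.5) > tolerance:
            flagged.append(name)
    return flagged


# Toy data: "id" keeps increasing across the split, "height" does not.
toy_train = {"id": [1, 2, 3, 4], "height": [1.6, 1.7, 1.8, 1.9]}
toy_test = {"id": [5, 6, 7, 8], "height": [1.65, 1.75, 1.85, 1.7]}
flagged = suspicious_features(toy_train, toy_test)
```

Here only `id` is flagged: every test ID is larger than every train ID, so it separates the two sets perfectly even though it carries no information about the target.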
