[Loss] Poor performance with the NegativeBinomial DistributionLoss #712

Open
Antoine-Schwartz opened this issue Aug 1, 2023 · 1 comment


Antoine-Schwartz commented Aug 1, 2023

What happened + What you expected to happen

I suspect a bug in the NegativeBinomial distribution. Its performance seems to be off compared with the other available distributions, even on positive count data, where it is supposed to perform well.

Perhaps there is a conflict with the way the input data is scaled? I know that PyTorch Forecasting blocks the use of the negative binomial when centered normalization is applied: https://pytorch-forecasting.readthedocs.io/en/stable/_modules/pytorch_forecasting/metrics/distributions.html#NegativeBinomialDistributionLoss
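
For intuition, here is a minimal sketch (my own illustration, not neuralforecast internals) of why a centered scaler conflicts with a count distribution: after centering, scaled targets become negative, which lies outside the NegativeBinomial support of {0, 1, 2, ...}:

import numpy as np

# Positive count data, as in the use case described above
y = np.array([0, 1, 3, 7, 12, 30], dtype=float)

# Median/IQR scaling, analogous in spirit to scaler_type="robust"
median = np.median(y)
iqr = np.percentile(y, 75) - np.percentile(y, 25)
y_scaled = (y - median) / iqr

print(y_scaled)  # the first values are negative, outside the NB support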

I can't share the results on my data, but I've coded a quick example that illustrates the problem.

Versions / Dependencies

neuralforecast==1.7.4
torch==2.3.1+cu121

Reproduction script

import pandas as pd
import numpy as np 
import itertools

from neuralforecast import NeuralForecast
from neuralforecast.models import DeepAR, TFT, NHITS
from neuralforecast.losses.pytorch import DistributionLoss
from neuralforecast.losses.numpy import mae
from neuralforecast.utils import AirPassengersPanel

Y_df = AirPassengersPanel

nf = NeuralForecast(
    models=[
        model(
            h=12,
            input_size=48,
            max_steps=100,
            scaler_type="robust",
            loss=DistributionLoss(distr, level=[]),
            alias=f"{model.__name__}-{distr}",
            enable_model_summary=False,
            enable_checkpointing=False,
            enable_progress_bar=False,
            logger=False,
        )
        # one model per (architecture, distribution) combination
        for model, distr in itertools.product(
            [DeepAR, TFT, NHITS], ["Poisson", "Normal", "StudentT", "NegativeBinomial"]
        )
    ],
    freq="M",
)
cv_df = nf.cross_validation(Y_df, n_windows=5, step_size=12).reset_index()

def evaluate(df):
    # Compare each model's median forecast against a seasonal naive baseline
    eval_ = {}
    df = df.merge(Y_df[["unique_id", "ds", "y_[lag12]"]], how="left").rename(columns={"y_[lag12]": "seasonal_naive"})
    models = ["seasonal_naive"] + list(df.columns[df.columns.str.contains("median")])
    for model in models:
        eval_[model] = {mae.__name__: int(np.round(mae(df["y"].values, df[model].values), 0))}
    return pd.DataFrame(eval_).rename_axis("metric")

cv_df.groupby("cutoff").apply(evaluate)

Output:
[image: table of MAE per cutoff and model, showing the NegativeBinomial variants underperforming the other distributions]

Issue Severity

Medium: It is a significant difficulty but I can work around it.

Antoine-Schwartz (Author) commented:
Hello @jmoralez and @cchallu,

I'm bringing this up again because it's becoming a sticking point for me: I need to get output samples, not quantiles. In my field we deal with count data (i.e., positive integers), and historically we've relied heavily on Tweedie and NegativeBinomial :(
I've tried to narrow down the problem by also looking at NBMM, but it seems to run into the same issue overall. In my opinion it looks correlated with the scaling of the data in some way, as the results are even more catastrophic relative to the other distributions with scaler_type="identity" (with NHITS, for example).
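
For reference, here is a minimal, neuralforecast-independent sketch of the support constraint I mean, using torch.distributions directly (the parameter values below are made up for illustration):

import torch
from torch.distributions import NegativeBinomial

# Hypothetical parameters: total_count must be > 0, probs must lie in (0, 1)
dist = NegativeBinomial(total_count=torch.tensor(5.0), probs=torch.tensor(0.4))

samples = dist.sample((1000,))
print(samples.min())  # never below 0: the support is {0, 1, 2, ...}

# A centered/robust-scaled target can be negative, so on the scaled space the
# model is asked to match values the distribution cannot produce.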

If you have even a hunch, I can take the time to do a deep dive if need be.

Thanks in advance!
