Whether to reuse the Sigmoid function in the confidence loss calculation code? #89

EATMustard · 2024-06-05T02:08:44Z

PyTorch's nn.BCEWithLogitsLoss already applies a sigmoid internally, but the model code also applies nn.Sigmoid before the loss, so the sigmoid appears to be applied twice. Is this a bug?

class TokenConfidence(nn.Module):
    def __init__(self, dim: int) -> None:
        super().__init__()
        self.token = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())  # sigmoid once
        self.loss_fn = nn.BCEWithLogitsLoss(reduction="none")  # sigmoid twice
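
To make the concern concrete, here is a small numeric sketch in plain Python (no PyTorch needed). It assumes the loss follows the formula documented for nn.BCEWithLogitsLoss, that is, binary cross-entropy computed on sigmoid(input). The helper names `sigmoid` and `bce_with_logits` and the example values are illustrative, not taken from the repository.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(x: float, target: float) -> float:
    # Binary cross-entropy where the sigmoid is applied inside the loss,
    # mirroring the documented behaviour of nn.BCEWithLogitsLoss.
    p = sigmoid(x)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


logit, target = 3.0, 1.0  # a confident, correct prediction (illustrative values)

# Intended usage: the raw logit goes into the loss.
loss_on_logit = bce_with_logits(logit, target)

# Pattern from the snippet above: a probability goes into the loss,
# so the loss squashes it with a second sigmoid.
loss_on_prob = bce_with_logits(sigmoid(logit), target)

# After two sigmoids the effective probability is confined to
# (sigmoid(0), sigmoid(1)), roughly (0.5, 0.731).
p_double = sigmoid(sigmoid(logit))

print(f"loss on logit:       {loss_on_logit:.4f}")
print(f"loss on probability: {loss_on_prob:.4f}")
print(f"effective p:         {p_double:.4f}")
```

Under these assumptions the loss for a positive target can never fall below -log(sigmoid(1)), however confident the first sigmoid is. The loss is still monotonic in the underlying logit, so the gradient keeps the correct sign; that may be part of why training still works, though this sketch does not verify it.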


noahzn · 2024-08-06T12:05:49Z

I think this is a bug. I tried removing nn.Sigmoid from self.token and applying torch.sigmoid in forward() instead, but the results are almost the same. Have you tried this?
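
The change described in this comment could look roughly like the sketch below. It is a minimal illustration, not the repository's code: the class name `TokenConfidenceSketch`, the separate `loss` method, and the tensor shapes are assumptions made for the example.

```python
import torch
from torch import nn


class TokenConfidenceSketch(nn.Module):
    """Hypothetical variant in which the head outputs raw logits."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.token = nn.Linear(dim, 1)  # no nn.Sigmoid in the head
        self.loss_fn = nn.BCEWithLogitsLoss(reduction="none")

    def forward(self, desc: torch.Tensor) -> torch.Tensor:
        # Confidence in (0, 1) for inference; sigmoid is applied exactly once.
        return torch.sigmoid(self.token(desc)).squeeze(-1)

    def loss(self, desc: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # The loss receives raw logits and applies the sigmoid internally.
        logits = self.token(desc).squeeze(-1)
        return self.loss_fn(logits, target)


torch.manual_seed(0)
model = TokenConfidenceSketch(dim=8)
desc = torch.randn(2, 5, 8)                     # (batch, tokens, dim), made-up shapes
target = torch.randint(0, 2, (2, 5)).float()    # made-up binary labels

probs = model(desc)
per_token_loss = model.loss(desc, target)
```

With this layout the per-token loss matches plain binary cross-entropy computed on the probabilities returned by forward(), which is one way to check that the sigmoid is applied only once.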
