Getting very different inference results on windows vs. linux/macos #23443

Closed
reyammer opened this issue Jan 21, 2025 · 3 comments
Labels
platform:windows issues related to the Windows platform

Comments

@reyammer

Describe the issue

Hello,

We are using onnxruntime's Python module in google/magika, and we just noticed that inference results can vary significantly (beyond rounding errors) between Windows and Linux/macOS machines. This seems to have always been the case, but we only noticed now because the result for one input changed enough to cause a misclassification on our test set.

/cc @ia0 @invernizzi

To reproduce

I've created a proof of concept via github action runners (details below).

In essence, you can see that the same code and model produce a very different prediction score depending on the OS.

The inference score on the different platforms is:

  • ubuntu: 0.75302654504776
  • macos: 0.7530270218849182
  • windows: 0.928835928440094

As you can see, the Windows score differs by far more than rounding errors can explain (Linux vs. macOS is fine). We checked the results on many files with Linux vs. macOS: they are all the same (within 0.x% rounding errors). The problem seems very specific to Windows.
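A minimal sketch of the "within rounding errors" check described above, using the three scores listed. The 0.1% relative tolerance and the helper name `agree` are assumptions for illustration; the thread only says "0.x%".

```python
import math

# Scores copied from the list above.
scores = {
    "ubuntu": 0.75302654504776,
    "macos": 0.7530270218849182,
    "windows": 0.928835928440094,
}

# Assumed tolerance: a relative difference up to 0.1% counts as rounding noise.
REL_TOL = 1e-3


def agree(a: float, b: float, rel_tol: float = REL_TOL) -> bool:
    """Return True if two scores match within the relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


# Use the Ubuntu score as the reference and collect platforms that disagree.
reference = scores["ubuntu"]
mismatches = {name: s for name, s in scores.items() if not agree(s, reference)}
print(mismatches)  # {'windows': 0.928835928440094}
```

With these numbers, Ubuntu and macOS differ by less than one part in a million, while Windows differs by roughly 0.18 in absolute terms.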

Note that we also have a client written in Rust (so it does not use the onnxruntime Python module), and we see the same very significant discrepancies (the inference scores of the Python module seem to match the ones from the Rust client).

Details to reproduce:

Urgency

Due to this bug, we need to halt the release of Magika's Windows packages, as the results seem too unpredictable.

Platform

Windows

OS Version

Windows Server 2022 (github's windows-latest runner)

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-1.19.2-cp312-cp312-win_amd64.whl

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@eKevinHoang

eKevinHoang commented Jan 21, 2025

It seems that the input blob data differs between Windows and Linux.

While waiting for assistance from experts, please verify whether the libraries and their versions are consistent across Windows and Linux/macOS. Additionally, check whether the input data is identical before it is fed into ONNX Runtime.
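One way to do the input check suggested above is to print a digest of the raw bytes on each runner and compare the values in the CI logs. This is only a sketch: `fingerprint` and `fingerprint_file` are hypothetical helper names, not part of onnxruntime or magika.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw bytes given to feature extraction."""
    return hashlib.sha256(data).hexdigest()


def fingerprint_file(path: str) -> str:
    """Hash a file's content exactly as stored on disk."""
    # Binary mode: Python performs no newline translation on read.
    with open(path, "rb") as f:
        return fingerprint(f.read())


# Any single-byte difference in the input yields a different digest.
print(fingerprint(b"sample input"))
print(fingerprint(b"sample input "))
```

If the digests printed on the Windows and Linux runners differ for the same test file, the discrepancy is upstream of the model.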

I also noticed a warning in your GitHub Actions logs as follows:
UserWarning: Unsupported Windows version (2022server). ONNX Runtime supports Windows 10 and above, only.

Are you running on Windows Server 2022?

@reyammer
Author

About the difference in input: there is a chance you are right... The feature extraction code is so simple that a bug there didn't even cross my mind. Investigating and will report back...

@reyammer
Author

I believe I found the bug, and it has nothing to do with onnxruntime (nor with magika's feature extraction code): it turns out that, on Windows, git's checkout automatically converts "\n" to "\r\n", leading to different extracted features vs. Linux/macOS. I'm pretty sure this is it. Closing the bug for now; I will re-open it if the issue does not go away.

Thanks for the quick reply and for the great project!
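The line-ending conversion described above can be illustrated with a small sketch. The helper `to_crlf` only mimics an LF to CRLF rewrite on a sample byte string; it is a hypothetical stand-in, not git's actual implementation.

```python
def to_crlf(data: bytes) -> bytes:
    """Mimic an LF -> CRLF conversion (normalize first so existing CRLF is not doubled)."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def count_crlf(data: bytes) -> int:
    """Count CRLF pairs, a quick signal that a file went through EOL conversion."""
    return data.count(b"\r\n")


original = b"def f():\n    return 1\n"
checked_out = to_crlf(original)

# Same text to a human reader, but different bytes for a content-based model.
print(len(original), len(checked_out))                # 22 24
print(count_crlf(original), count_crlf(checked_out))  # 0 2
```

Git's end-of-line handling is controlled by the `core.autocrlf` setting and by `text`/`-text` attributes in `.gitattributes`; which of these the project's test setup ended up using is not stated in this thread.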
