Releases with zesInit() implemented or unsupported #650

Closed
bgoglin opened this issue Jun 12, 2023 · 17 comments
Labels
bug, in queue, L0 Sysman (Issue related to L0 Sysman)

Comments

@bgoglin

bgoglin commented Jun 12, 2023

zesInit() might be the right way to finally fix oneapi-src/level-zero#36 but current releases provide zesInit() without actually implementing it (it returns "unsupported feature" in 26032.30).

I know we are supposed to handle API errors, but this means we should do a runtime check at configure time before enabling zesInit() in our code. We can't do that check at runtime because by then it would be too late to fall back to setting ZES_ENABLE_SYSMAN=1 (which has to be done very early in the process, ideally inside a library constructor).

When are you planning to publish a compute-runtime release with proper support for zesInit()? It seems to be in the git tree already.

Could you explicitly list which releases have an unsupported zesInit()? We are going to tell users to avoid these releases. Either they use newer releases with a working zesInit(), or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.
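
A minimal sketch of the library-constructor fallback mentioned above, assuming a GCC/Clang toolchain and a POSIX libc. The function name `hypothetical_enable_sysman` is made up for illustration, and nothing here is taken from an actual project; it only shows how the variable could be set before any Level Zero call is made.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical constructor: runs when the library (or program) is loaded,
 * before main(), so the variable is in place before Level Zero is touched. */
__attribute__((constructor))
static void hypothetical_enable_sysman(void)
{
    /* overwrite = 0 keeps any value the user already exported. */
    setenv("ZES_ENABLE_SYSMAN", "1", 0);
}
```

Passing 0 as the last argument is a design choice: a user who deliberately exported another value is not overridden.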

@saik-intel
Contributor

@bgoglin The zesInit-based implementation is not fully complete yet and we are working on it. Please let me know the test / reproduction steps where you are seeing **unsupported**; we will investigate and update you.

@bgoglin
Author

bgoglin commented Jun 12, 2023

I installed packages from the latest compute-runtime release (https://github.com/intel/compute-runtime/releases/tag/23.13.26032.30) on a Debian bookworm running Linux kernel 6.1. This is a laptop with an old Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz.
The test program just calls zesInit(0) and prints the return value. I get 0x78000003 = ZE_RESULT_ERROR_UNSUPPORTED_FEATURE.
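
For readers decoding the value reported above, here is a small self-contained sketch. It does not call Level Zero; a real reproduction would include `<level_zero/zes_api.h>` and pass the return value of `zesInit(0)` to the reporting helper. The `hyp_` names are hypothetical, and only the two result codes quoted in this thread are mapped.

```c
#include <stdint.h>
#include <stdio.h>

/* Result codes as quoted in this thread; the authoritative definitions
 * are the ze_result_t values in the Level Zero headers. */
#define HYP_RESULT_SUCCESS                   0x0u
#define HYP_RESULT_ERROR_UNSUPPORTED_FEATURE 0x78000003u

/* Maps a raw result code to its symbolic name (hypothetical helper). */
const char *hyp_result_name(uint32_t code)
{
    switch (code) {
    case HYP_RESULT_SUCCESS:
        return "ZE_RESULT_SUCCESS";
    case HYP_RESULT_ERROR_UNSUPPORTED_FEATURE:
        return "ZE_RESULT_ERROR_UNSUPPORTED_FEATURE";
    default:
        return "other (see ze_result_t in the Level Zero headers)";
    }
}

/* Prints the code in hex plus its name, similar to the test program above. */
void hyp_report(uint32_t code)
{
    printf("zesInit(0) returned 0x%08x (%s)\n",
           (unsigned)code, hyp_result_name(code));
}
```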

@saik-intel
Contributor

Thanks @bgoglin, we will look into it and update you on the fix.

@bgoglin
Author

bgoglin commented Jun 20, 2023

I just discovered that zesInit() returns 0 instead of 0x78000003 if ZES_ENABLE_SYSMAN=1 is set in the environment. It took me quite a bit of time to understand why things weren't failing anymore, so I am writing it here in case it helps you reproduce the issue.
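
Since the environment variable can mask the failure, a configure-time probe may want to clear it first. The sketch below assumes a POSIX libc and uses hypothetical `hyp_` helper names. It also assumes that clearing the variable before the first Level Zero call is early enough, which this thread does not confirm.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <string.h>

/* Returns 1 if ZES_ENABLE_SYSMAN=1 is set in the environment, 0 otherwise. */
int hyp_sysman_env_is_set(void)
{
    const char *v = getenv("ZES_ENABLE_SYSMAN");
    return v != NULL && strcmp(v, "1") == 0;
}

/* Clears the variable so that a later zesInit() probe is not masked by it.
 * Returns 1 if the variable had been set to "1", 0 otherwise. */
int hyp_clear_sysman_env(void)
{
    int was_set = hyp_sysman_env_is_set();
    unsetenv("ZES_ENABLE_SYSMAN");
    return was_set;
}
```

A probe could call `hyp_clear_sysman_env()` first and print a warning when it returns 1, since an earlier result of 0 from zesInit() may then have been misleading.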

@JablonskiMateusz added the bug, in queue, and L0 Sysman labels on Jul 7, 2023
@bgoglin
Author

bgoglin commented Oct 26, 2023

Looks like zesInit() works now? Using these packages:
intel-igc-core_1.0.14828.8_amd64.deb
intel-igc-opencl_1.0.14828.8_amd64.deb
intel-level-zero-gpu_1.3.26918.9_amd64.deb
intel-opencl-icd_23.30.26918.9_amd64.deb
libigdgmm12_22.3.0_amd64.deb
It didn't with those:
intel-igc-core_1.0.14062.11_amd64.deb
intel-igc-opencl_1.0.14062.11_amd64.deb
intel-level-zero-gpu_1.3.26516.18_amd64.deb
intel-opencl-icd_23.22.26516.18_amd64.deb
libigdgmm12_22.3.0_amd64.deb
If you know which package and version first brought the fix, I'd like to know so that we can tell people to avoid earlier releases.

@bgoglin
Author

bgoglin commented Oct 26, 2023

From git log, I'd say it's this commit:

commit 73b40a49dcd1b320309cdefd24a2a2b23149ee01
Author: Kulkarni, Ashwin Kumar <[email protected]>
Date:   Wed Jun 7 10:10:13 2023 +0000

    fix(sysman): Enables zesInit flow without setting ZES_ENABLE_SYSMAN
    
    Related-To: LOCI-4458

which first appeared in tag 23.26.26690.12.

@saik-intel
Contributor

@bgoglin Can we close this issue now, if zesInit is working for you?

@bgoglin
Author

bgoglin commented Nov 29, 2023

The question was rather: Could you explicitly list which releases have an unsupported zesInit()? We are going to tell users to avoid these releases. Either they use newer releases with a working zesInit(), or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.

@saik-intel
Contributor

Please start using new releases with zesInit().

@eero-t

eero-t commented Feb 26, 2024

> The question was rather: Could you explicitly list which releases have an unsupported zesInit()? We are going to tell users to avoid these releases. Either they use newer releases with a working zesInit(), or they use older releases without zesInit() at all and we'll set ZES_ENABLE_SYSMAN=1 in the library constructor.

> Please start using new releases with zesInit().

@saik-intel The compute-runtime project releases multiple versions in parallel, and customers may get an old version of it from distros. You did not answer the question or the subject of this ticket: from which release onward is there (full) zesInit() support?

@eero-t

eero-t commented Mar 1, 2024

The zesInit() stub was introduced in Feb 2023 in commit bf481e6.

> From git log, I'd say it's this commit:
>
> commit 73b40a49dcd1b320309cdefd24a2a2b23149ee01
> Author: Kulkarni, Ashwin Kumar <[email protected]>
> Date:   Wed Jun 7 10:10:13 2023 +0000
>
>     fix(sysman): Enables zesInit flow without setting ZES_ENABLE_SYSMAN
>
> which first appeared in tag 23.26.26690.12.

When looking at the last tag series preceding the 23.26.x one:

 $ for tag in $(git tag); do echo "$tag: $(git show --format=fuller $tag|grep ^CommitDate)"; done | grep 2023
...
23.22.26516.33: CommitDate: Tue Dec 12 18:45:47 2023 +0100
23.22.26516.34: CommitDate: Tue Dec 19 11:15:15 2023 +0100
23.22.26516.8: CommitDate: Wed Jun 14 11:01:51 2023 +0200
23.26.26690.12: CommitDate: Tue Jul 25 14:34:41 2023 +0200
23.26.26690.13: CommitDate: Tue Jul 25 16:26:36 2023 +0200
...

23.22.26516.34 does not contain the commit indicated above; only the 23.26.x series and later ones do.

=> Anything with a tag number smaller than 23.26.26690.12 needs to be avoided for zesInit() to work as expected.
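
The rule above can be sketched as a numeric comparison of the four tag components. This is only an illustration with hypothetical `hyp_` names. It assumes tags always have the form `A.B.C.D`, and it says nothing about releases that predate the zesInit() stub entirely.

```c
#include <stdio.h>

/* Parses "A.B.C.D" into four integers.
 * Returns 0 on success, -1 if the string is malformed. */
int hyp_parse_tag(const char *tag, int out[4])
{
    char trailing;
    /* The extra %c detects trailing garbage: a clean tag matches exactly 4 items. */
    int n = sscanf(tag, "%d.%d.%d.%d%c",
                   &out[0], &out[1], &out[2], &out[3], &trailing);
    return n == 4 ? 0 : -1;
}

/* Compares two parsed tags component by component.
 * Returns -1, 0 or 1, like strcmp. */
int hyp_compare_tags(const int a[4], const int b[4])
{
    for (int i = 0; i < 4; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Returns 1 if the tag is at least 23.26.26690.12 (the first tag reported
 * in this thread to contain the fix), 0 if it is older, -1 if unparsable. */
int hyp_has_working_zesinit(const char *tag)
{
    static const int first_fixed[4] = {23, 26, 26690, 12};
    int v[4];
    if (hyp_parse_tag(tag, v) != 0)
        return -1;
    return hyp_compare_tags(v, first_fixed) >= 0;
}
```

Comparing numerically matters because a plain string sort puts 23.22.26516.8 after 23.22.26516.34, as the `git tag` listing above shows.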

@eero-t

eero-t commented Mar 1, 2024

@bgoglin Does the above answer your question, i.e. can this be closed?

@bgoglin
Author

bgoglin commented Mar 1, 2024

"anything with tag number smaller than 23.26.26690.12": do you also mean released in 2023 (equivalent to starting with "23.")?

@eero-t

eero-t commented Mar 1, 2024

Looking at the Git content: while L0 1.5, i.e. zesInit() support, is already in earlier versions, the last tag in the preceding 23.22.x series (from Dec 2023) did not include commit 73b40a4, and the last tags in the series before that all predate that commit and check the env variable.

@saik-intel
Contributor

@bgoglin Can we close this issue now?

@eero-t

eero-t commented Apr 24, 2024

@saik-intel While the relevant backend version is now known, i.e. this can be closed, it still requires manual setup outside the L0 application itself.

Proper support would also require the L0 frontend (init return values) to be fixed, so that clients can differentiate between total backend failure (due to missing HW or a wrong kernel driver) and missing backend functionality (due to a too old backend version). See: oneapi-src/level-zero#140

@bgoglin
Author

bgoglin commented Apr 24, 2024

Agreed, this issue can be closed, but user code is still hard to maintain.
