-
Notifications
You must be signed in to change notification settings - Fork 755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM and SPIRV-LLVM-Translator pulldown (WW50 2024) #16372
Conversation
… (#108413) Introduce `offload-tblgen` and an initial implementation of a subset of the new API. The tablegen files are intended to be the single source of truth for the new API, with the header files, documentation, and others bits of source all automatically generated. **TODO** (based on review feedback so far): - [x] Check in the generated headers - [x] Add an `offload-generate` target to trigger the generation rather than building them every time - [x] Decide how error handling should work - [x] Finish up new error handling implementation - [x] Decide naming convention - [x] Add testing for the new API - [x] Add tablegen specific testing - [x] clang-tidy and use llvm:: types when possible - [x] Add optional code location arguments - [x] Avoid multiple returns from one function ### offload-tblgen See the included [README](https://github.com/callumfare/llvm-project/blob/d80db06491d85444bb6f7e59d8068a22cef3a6b4/offload/new-api/API/README.md) for more information on how the API definition and generation works. I'm happy to answer any questions about it and plan to walk through it in a future LLVM Offload call. It should be noted that struct definitions have not been fully implemented/tested as they aren't used by the initial API definitions, but finishing that off in the future shouldn't be too much work. The tablegen tooling has been designed to be easily extended with new backends, using the classes in `RecordTypes.hpp` to abstract over the tablegen records. ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh $ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git $ cd unified-runtime $ mkdir build $ cd build $ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \ -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \ -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory> $ ninja urinfo export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries> $ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo [cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030 [12030] # Demo with tracing $ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo ---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS ---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS ... ``` ### Open questions and future work * The new API is implemented in a separate library (`liboffload_new.so`). It could just as easily be part of the existing `libomptarget` library - I have no strong feelings on which is better. * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. * It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.
When linking programs with `eld`, we get a link error like below: Error: /inst/clang+llvm-19.1.0-cross-hexagon-unknown-linux-musl/x86_64-linux-gnu/bin/../target/hexagon-unknown-linux-musl//usr/lib/libc.a(scalbn.lo)(.text.scalbn+0x3c): undefined reference to `__hexagon_muldf3' libc has references to the clang_rt builtins library, so the order of the libraries should be reversed.
Separates the stack allocations from prologue in preparation for the stack clash protection support.
This patch support for intrinsics in clang, as well as assembly instructions in the backend. Co-authored-by: Sirish Pande <[email protected]>
Co-authored-by: Pravin Jagtap <[email protected]>
The cfi_offset is incorrect for the RVV registers when cm.push is used.
OPSEL ASM Syntax: opsel:[x,y,z] where, opsel[x] = Inst{11} = src0_modifier{2} opsel[y] = Inst{12} = src1_modifier{2} opsel[z] = Inst{14} = src0_modifier{3} Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax. Co-authored-by: Pravin Jagtap <[email protected]>
OPSEL ASM Syntax: opsel:[x,y,z] where, opsel[x] = Inst{11} = src0_modifier{2} opsel[y] = Inst{12} = src1_modifier{2} opsel[z] = Inst{14} = src0_modifier{3} Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax. Co-authored-by: Pravin Jagtap <[email protected]>
…17382) OPSEL[3] selects low/high 16 bits of dest write. Co-authored-by: Pravin Jagtap <[email protected]>
…7383) OPSEL[0] selects srcword to read. Co-authored-by: Pravin Jagtap <[email protected]>
… (#117384) OPSEL ASM Syntax: opsel:[x,y,z] where, opsel[z] = Inst{14} = src0_modifier{3} Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax. Co-authored-by: Pravin Jagtap <[email protected]>
The standard mandates that this returns the width of the type, which is the number of bits in the value. For bool, that's required to be `1` explicitly. Fixes #117348
This adds the f32/f64/f16/bf16 test cases for below pattern : `fmul x, select(y, A, B)` with just one use of select Inst above. It acts as pre-commit tests for dagCombining above pattern into cheaper ldexp in case of non-inlline 32 bit-constants. (#111109)
…ffset for RVV CSRs. (#117408) getCalleeSavedStackSize() already contains RVPushStackSize. Don't subtract it again.
OPSEL ASM Syntax for v_cvt_scalef32_pk_f32_fp4 : opsel:[x,y,z] where, x & y i.e. OPSEL[1 : 0] selects which src_byte to read. OPSEL ASM Syntax for v_cvt_scalef32_pk_fp4_f32 : opsel:[a,b,c,d] where, c & d i.e. OPSEL[3 : 2] selects which dst_byte to write. Co-authored-by: Pravin Jagtap <[email protected]>
This relands #86209 which was reverted because ./bin/llvm no longer accepted test paths in the source tree instead of the build tree. This was happening because `add_subdirectory(${LLVM_MAIN_SRC_DIR}/utils/llvm-lit` was called before all tsst suites were registered, and therefore it was missing the source->build dir mappings. Original commit message: I am currently trying to test the LLVM runtimes (including compiler-rt) against an installed LLVM tree rather than a build tree (since that is no longer available). Currently, the runtimes build of compiler-rt assumes that LLVM_BINARY_DIR is writable since it uses configure_file() to write there during the CMake configure stage. Instead, generate this file inside CMAKE_CURRENT_BINARY_DIR, which will match LLVM_BINARY_DIR when invoked from llvm/runtimes/CMakeLists.txt. I also needed to make a minor change to the hwasan tests: hwasan_symbolize was previously found in the LLVM_BINARY_DIR, but since it is generated as part of the compiler-rt build it is now inside the CMake build directory instead. I fixed this by passing the output directory to lit as config.compiler_rt_bindir and using llvm_config.add_tool_substitutions(). For testing that we no longer write to the LLVM install directory as part of testing or configuration, I created a read-only bind mount and configured the runtimes builds as follows: ``` $ sudo mount --bind --read-only ~/llvm-install /tmp/upstream-llvm-readonly $ cmake -DCMAKE_BUILD_TYPE=Debug \ -DCMAKE_C_COMPILER=/tmp/upstream-llvm-readonly/bin/clang \ -DCMAKE_CXX_COMPILER=/tmp/upstream-llvm-readonly/bin/clang++ \ -DLLVM_INCLUDE_TESTS=TRUE -DLLVM_ENABLE_ASSERTIONS=TRUE \ -DCOMPILER_RT_INCLUDE_TESTS=TRUE -DCOMPILER_RT_DEBUG=OFF \ -DLLVM_ENABLE_RUNTIMES=compiler-rt \ -DCMAKE_DISABLE_FIND_PACKAGE_LLVM=TRUE \ -DCMAKE_DISABLE_FIND_PACKAGE_Clang=TRUE \ -G Ninja -S ~/upstream-llvm-project/runtimes \ -B ~/upstream-llvm-project/runtimes/cmake-build-debug-llvm-git ``` Pull Request: llvm/llvm-project#114307
…entation (#108413)" This reverts commit 8a2311c.
Retry of pr #109090 and #111099. Co-authored-by: Alexander Richardson <[email protected]>
The default expansion tries to create an illegal integer type after legalization.
The Z size was optional, meaning it matched with the 128-bit SSE instructions as well. Noticed while triaging the strange perf numbers on #110308
…117303) In the DXIL CreateHandle and CreateHandleFromBinding ops, resource bindings are indexed from the beginning of the binding space, not from the binding itself. Translate from an index into the binding to one from the beginning of the space when lowering to these operations.
This update follows up on change #112671 and is mostly a NFC, with the following exceptions: - Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required. - Parameter count is now calculated based on the unique hash count. - Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size. - Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time. - Moved a sorting operation outside of the loop. This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.
The only constructor in current use is the one that takes IndexedMemProfData. Likewise, the only accessor in current use is takeMemProfData.
This reverts commit fdf1f69.
This come up in the context of pr 108889. We always end up sinking the value in the phi if we dispatched via a switch, but not if we'd dispatched via a branch. This is purely an artifact of current lowering.
…#117605) We can move the logic from adjustStackForRVV into adjustReg, which results in the remaining logic being trivially inlined to the two callers and allows a duplicate copy of the same logic in eliminateFrameIndex to be pruned.
…7418) OPSEL ASM Syntax for v_cvt_scalef32_pk_{f|bf}16_fp4 : opsel:[x,y,z] where, x & y i.e. OPSEL[1 : 0] selects which src_byte to read. Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax. Co-authored-by: Pravin Jagtap <[email protected]> Co-authored-by: Pravin Jagtap <[email protected]>
This patch fixes: llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp:476:25: error: unused variable 'ST' [-Werror,-Wunused-variable]
Make it homogenous with other runtime entry points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approve for llvm-spirv changes
@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a /merge. Thanks. |
/merge |
Tue 17 Dec 2024 06:27:32 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes. |
Tue 17 Dec 2024 06:27:34 PM UTC --- Merge failed with error: Please check whether the PR is mergeable |
@jsjj Git version issue again? |
/merge |
Tue 17 Dec 2024 06:41:05 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes. |
Tue 17 Dec 2024 06:41:06 PM UTC --- Merge failed with error: Please check whether the PR is mergeable |
/merge |
Tue 17 Dec 2024 06:47:17 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes. |
Tue 17 Dec 2024 06:47:19 PM UTC --- Merge failed with error: Please check whether the PR is mergeable |
@DoyleLi The bot is still not working, please have a look. Thanks. |
Wed 18 Dec 2024 02:03:33 AM UTC --- Merge failed with error: Please check whether the PR is mergeable |
Wed 18 Dec 2024 02:19:50 AM UTC --- Start to merge the commit into sycl branch. It will take several minutes. |
Wed 18 Dec 2024 02:25:13 AM UTC --- Merge the branch in this PR to base automatically. Will close the PR later. |
LLVM: llvm/llvm-project@53c0a25
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@b8b1b96