forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes) #451
Open
jorickert
wants to merge
859
commits into
feature/fused-ops
Choose a base branch
from
bump_to_0a17bdfc
base: feature/fused-ops
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
llvm#113119) StringMap::find takes StringRef. We don't need to create an instance of std::string from StringRef only to convert it right back to StringRef.
This code intentionally discards the high bits, so set implicitTrunc=true. This is currently NFC but will enable an APInt assertion in the future.
…vm#111575) Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline, which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner, and removes them from the runner. The tests are changed to invoke mlir-opt with this flag before running the runner. The eventual goal is to move all host/device code generation steps out of the runner, like with some of the other runners.
6bac414 added this opcode with the wrong number of operands. It didn't fail on check-llvm for me or on pre-commit CI, but once committed we got buildbot failures. This patch fixes the definition of the instruction and fixes the failing test.
With the truncssat nodes these are relatively simple tablegen patterns to add. The existing intrinsics are converted to shift+truncsat to they can lower using the new patterns. Fixes llvm#112925.
…eturn value is different in pointer / lvalue ref / rvalue ref (llvm#112853) Per https://cplusplus.github.io/CWG/issues/960.html.
This patch adds assembly/disassembly for the following instructions: ldfadd{a,al,l,}, ldbfadd{a,al,l,} ldfmax{a,al,l,}, ldbfmax{a,al,l,} ldfmaxnm{a,al,l,}, ldbfmaxnm{a,al,l,} ldfmin{a,al,l,}, ldbfmin{a,al,l,} ldfminnm{a,al,l,} ldbfminnm{a,al,l,} stfadd{l,}, stbfadd{l,} stfmax{l,}, stbfmax{l,} stfmaxnm{l,}, stbfmaxnm{l,} stfmin{l,}, stbfmin{l,} stfminnm{l,}, stbfminnm{l,} According to [1] [1]https://developer.arm.com/documentation/ddi0602 Co-authored-by: Spencer Abson [[email protected]](mailto:[email protected]) Co-authored-by: Caroline Concatto [[email protected]](mailto:[email protected])
…m#112935) Done in preparation of exploring rtsan on windows.
Add new register classes/operands and their encoder/decoder behaviour required for the new Armv9.6 instructions (see https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension). This work is the basis ofthe 2024 Armv9.6 architecture update effort for SME. Co-authored-by: Caroline Concatto [email protected] Co-authored-by: Marian Lukac [email protected] Co-authored-by: Momchil Velikov [email protected]
…-opt" (llvm#113176) Reverts llvm#111575 This caused build failures: https://lab.llvm.org/buildbot/#/builders/138/builds/5244
Improve the codegen for uaddo node for i64 in 64-bit mode and i32 in 32-bit mode by custom lowering.
Test added by commit 47a6da2 fails on the AIX bot. So XFAIL for now to investigate further.
Previously we were attempting to remove the memprof-related metadata when iterating through instructions in the LTO backend. However, we missed some as there are a number of cases where we skip instructions, or even entire functions. Simplify the cleanup and ensure all is removed by doing a full sweep over all instructions after completing cloning. This is largely NFC except with -memprof-report-hinted-sizes enabled, because we were propagating and simplifying the metadata after inlining in the LTO backend, which caused some stray messages as metadata was re-converted to attributes.
movzx r11d,BYTE PTR [rdx] is four bytes long. Follow-up to llvm#111638
Remove the unused functions and register classes from the change below llvm@4679583
Root gather/buildvector node should be ignored when SLP vectorizer tries to find matching gather nodes, vectorized earlier. This node is definitely the last one in the pipeline and it does not have users. It may cause the compiler crash Fixes llvm#113143
This patch adds a LegalityResultWithReason class for describing the reason why legality decided not to vectorize the code.
…llvm#109850) Special case small int constant in the PPC custom lowering of scalar_to_vector.
This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on. Built by GCC 11. Fix warning In destructor ‘llvm::APInt::~APInt()’, inlined from ‘llvm::APInt::~APInt()’ at llvm-project/llvm/include/llvm/ADT/APInt.h:190:3, inlined from ‘llvm::APSInt::~APSInt()’ at llvm-project/llvm/include/llvm/ADT/APSInt.h:23:21, inlined from ‘bool checkOMPArraySectionConstantForReduction(clang::ASTContext&, const clang::ArraySectionExpr*, bool&, llvm::SmallVectorImpl<llvm::APSInt>&)’ at llvm-project/clang/lib/Sema/SemaOpenMP.cpp:18357:45, inlined from ‘bool actOnOMPReductionKindClause(clang::Sema&, {anonymous}::DSAStackTy*, clang::OpenMPClauseKind, llvm::ArrayRef<clang::Expr*>, clang::SourceLocation, clang::SourceLocation, clang::SourceLocation, clang::SourceLocation, clang::CXXScopeSpec&, const clang::DeclarationNameInfo&, llvm::ArrayRef<clang::Expr*>, {anonymous}::ReductionData&)’ at llvm-project/clang/lib/Sema/SemaOpenMP.cpp:18715:68: llvm-project/llvm/include/llvm/ADT/APInt.h:192:18: error: ‘void operator delete [](void*)’ called on a pointer to an unallocated object ‘1’ [-Werror=free-nonheap-object] 192 | delete[] U.pVal; | ^~~~
This patch fixes: llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Legality.h:85:16: error: private field 'Reason' is not used [-Werror,-Wunused-private-field]
…2765) MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's been identified this message is only really needed for VGPR limited kernels. A kernel becomes VGPR limited if a total number of VGPRs per SIMD / number of used VGPRs is more than a number of wave slots.
…#112990) Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA values. LegalizeData suggested that it fixed data mapping. This change also adds support to fix up ssa values for data clause operations. Effectively, compute regions within a data region use the ssa values from data operations also. The ssa values within data regions but not within compute regions are not updated. This change is to support the requirement in the OpenACC spec which notes that a visible data clause is not just one on the current compute construct but on the lexically containing data construct or visible declare directive.
Based on this RFC: https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306 LLVM intrinsics do not support out params. To get around this limitation implementers will make intrinsics return structs to capture a return type and an out param. This implementation detail should not impact scalarization since these cases should be elementwise operations. ## Three changes are needed. - The CallInst visitor needs to be updated to handle Structs - A new visitor is needed for `ExtractValue` instructions - finsh needs to be update to handle structs so that insert elements are properly propogated. ## Testing changes - Add support for `llvm.frexp` - Add support for `llvm.dx.splitdouble` fixes llvm#111437
…vm#112997) The x86-fold-tables.td has been failing for me and [in CI](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4) if Git happens to decide to check out the baseline file with Windows line endings. This fix for this is to add the `--strip-trailing-cr` option to diff to normalize the line endings before comparing them.
…lvm#112995) llvm#98060 introduced a warning for unterminated string constants, however it was only checking for `\n` which means that it produced strange results on Windows (always blaming column 1) including having the [associated test fail](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4) if Git happened to use Windows newlines when creating the file. This fix for this is to detect both `\r` and `\n`, but don't double-warn for Windows newlines.
…lazy archive symbol to the symbol table on ARM64EC (llvm#113284) On ARM64EC, a function symbol may appear in both mangled and demangled forms: - ARM64EC archives contain only the mangled name, while the demangled symbol is defined by the object file as an alias. - x86_64 archives contain only the demangled name (the mangled name is usually defined by an object referencing the symbol as an alias to a guess exit thunk). - ARM64EC import files contain both the mangled and demangled names for thunks. If more than one archive defines the same function, this could lead to different libraries being used for the same function depending on how they are referenced. Avoid this by checking if the paired symbol is already defined before adding a symbol to the table.
…m#112928) Member pointers refer to data or function members of a `CXXRecordDecl` and require a `MSInheritanceAttr` in order to be complete. Without that we cannot calculate their size in memory. The attempt has been causing a crash further down in the clang AST context. In order to implement the feature, DWARF will need a new attribtue to convey the information. For the moment, this patch teaches LLDB to handle to situation and avoid the crash.
…lvm#111130) Before this patch, redundant COPY couldn't be removed for the following case: ``` $R0 = OP ... ... // Read of %R0 $R1 = COPY killed $R0 ``` This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to: ``` $R1 = OP ... ... // Replace all uses of %R0 with $R1 ```
This PR merges large offsets into the base address loading.
llvm#113309) llvm-cxxfilt can demangle names of data symbols, in addition to function names. $ llvm-cxxfilt _ZN6garden5gnomeE garden::gnome And type names too, on request: $ llvm-cxxfilt -t i int Update some overly specific the wording in the --help and documentation that suggests otherwise.
This patch adds these new vector sizes for neon: mfloat8x16_t and mfloat8x8_t According to the ARM ACLE PR#323[1]. [1] ARM-software/acle#323
llvm#111531) Bot maintainers should be aware and it became too much of a burden for developers. In particular on Windows, where make.exe won't be found in Path typically.
…2867) The Intel C++ Compiler (ICX) passes linker flags through the driver unlike MSVC and clang-cl, and therefore needs them to be prefixed with `/Qoption,link` (the equivalent of -Wl, for gcc on *nix). Use the `LINKER:` prefix for the `/EXPORT:` options in clang-repl, this expands to the correct flag for ICX and nothing for MSVC / clang-cl. RFC: https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446
These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see llvm#108980 (comment)). Note: Both SLEEF and ArmPL state that they do not set `errno`: - https://developer.arm.com/documentation/101004/2410/General-information/Arm-Performance-Libraries-math-functions * "The vector functions in libamath which are available on Linux may not set errno nor raise exceptions" - https://sleef.org/2-references/libm/ * "These functions do not set errno nor raise an exception."
…llvm#113167) Define `OmpIteratorSpecifier` and `OmpIteratorModifier` parser classes, and add parsing for them. Those are reusable between any clauses that use iterator modifiers. Add support for iterator modifiers to the MAP clause up to lowering, where a TODO message is emitted.
…at container-inserter does (llvm#113103) This patch implements LWG4016: container-insertable checks do not match what container-inserter does.
…lvm#111236) The underlying issue with msan was fixed by llvm#113200
When compiling for an SVE target we can use INDEX to generate constant fixed-length step vectors, e.g.: ``` uint32x4_t foo() { return (uint32x4_t){0, 1, 2, 3}; } ``` Currently: ``` foo(): adrp x8, .LCPI1_0 ldr q0, [x8, :lo12:.LCPI1_0] ret ``` With INDEX: ``` foo(): index z0.s, #0, #1 ret ``` The logic for this was already in `LowerBUILD_VECTOR`, though it was hidden under a check for `!Subtarget->isNeonAvailable()`. This patch refactors this to enable the corresponding code path unconditionally for constant step vectors (as long as we can use SVE for them).
jorickert
changed the title
[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14)
[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)
Jan 14, 2025
Does a corresponding PR exist on Xilinx/onnx-mlir? If yes, could you please link them? |
Xilinx/onnx-mlir#272 I will update it today |
I update the onnx-bump, but we will need to include #455 in the same bump too |
jorickert
changed the title
[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)
[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)
Jan 24, 2025
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
As we are blocked by the onnx windows issue, this can be worked around by removing the test "parallel/krnl_parallel_clause_to_omp.mlir"