[AutoBump] Merge with 51365212 (Aug 25) (10) #363

mgehre-amd · 2024-09-20T17:48:31Z

No description provided.

This region is intended to separate alloca operations from reduction variable initialization. This makes it easier to hoist allocas to the entry block before control flow and complex code for initialization. The verifier checks that there is at most one block in the alloc region. This is not sufficient to avoid control flow in general MLIR, but by the time we are converting to LLVMIR structured control flow should already have been lowered to the cf dialect. 1/3 Part 2: llvm#102524 Part 3: llvm#102525

The intention of this change is to ensure that allocas end up in the entry block, not spread out amongst complex reduction variable initialization code. The tests we have are quite minimized for readability and maintainability, making the benefits less obvious. The use case for this is when there are multiple reduction variables, each with multiple blocks inside of the init region for that reduction. 2/3 Part 1: llvm#102522 Part 3: llvm#102525

…finitions and partial specializations (llvm#104030) We need to rebuild the template parameters of out-of-line definitions/specializations of member templates in the context of the current instantiation for the purposes of declaration matching. We already do this for function templates and class templates, but not for variable templates, partial specializations of variable templates, and partial specializations of class templates. This patch fixes the latter cases.

…vm#102752) Currently `mlir.llvm.constant` of structure types requires that the structure type effectively represent a complex type: it must have exactly two fields of the same type, and the field type must be either an integer type or a float type. This PR relaxes this restriction and allows the structure type to have an arbitrary number of fields.

…ble objects (llvm#104778) Whilst dealing with review comments on llvm#96752 I discovered that SCEV does not know about the dereferenceable attribute on function arguments so I have updated getRangeRef to make use of it by calling getPointerDereferenceableBytes.

- Landing page: add link to the libc++ Discord channel
- Landing page: reorder "Getting Involved" above "Design documents"
- Landing page: remove "Notes and Known Issues" which was completely outdated
- Rename "Using Libc++" to "User Documentation" and update contents
- Rename "Building Libc++" to "Vendor Documentation" and update contents

The "BuildingLibcxx" and "UsingLibcxx" pages have basically been used for vendor and user documentation respectively. However, they were named in a way that doesn't really make that clear. Renaming the pages now gives us a location to clearly document what we target at vendors and what we target at users, and to do that separately.

…ns (llvm#105455) This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element. For RISCV specifically, it's worth noting that an alternate padded lowering is available when VL is one less than a power of two, and LMUL <= m1. We could slide the vector operand up by one, and insert the padding via a vslide1up. We don't currently pattern match this, but we could. This form would arguably be better iff the surrounding code wanted VL=4. This patch will force a VL toggle in that case instead. Basically, it comes down to a question of whether we think odd sized vectors are going to appear clustered with odd size vector operations, or mixed in with larger power of two operations. Note there is a potential downside of using vp nodes; we lose any generic DAG combines which might have applied to the widened form.
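
The trade-off between padding and a restricted EVL can be illustrated with a scalar sketch. This is not the RISCV lowering itself; `reducePadded` and `reduceWithEVL` are hypothetical helpers, and an integer add reduction over a 3-element vector widened to 4 lanes is assumed for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: widen a 3-element vector to 4 lanes by padding with the
// neutral element of integer add (0), then reduce every lane.
inline int32_t reducePadded(const std::array<int32_t, 3> &v) {
  std::array<int32_t, 4> wide = {v[0], v[1], v[2], 0}; // lane 3 is padding
  int32_t sum = 0;
  for (int32_t lane : wide)
    sum += lane;
  return sum;
}

// Hypothetical helper: reduce only the first `evl` lanes of the wide vector,
// so the contents of the remaining lanes never matter and no padding is needed.
inline int32_t reduceWithEVL(const std::array<int32_t, 4> &wide,
                             std::size_t evl) {
  int32_t sum = 0;
  for (std::size_t i = 0; i < evl && i < wide.size(); ++i)
    sum += wide[i];
  return sum;
}
```

Both helpers give the same result for the live lanes; the EVL form simply never reads the extra lane, which is why no neutral element has to be materialized.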

…lvm#104689) This is a fairly narrow transform (at the moment) to reduce the VLs of instructions feeding a store with a smaller VL. Note that the goal of this transform isn't really to reduce VL - it's to reduce VL *toggles*. To our knowledge, small reductions in VL without also changing LMUL are generally not profitable on existing hardware. For a single use instruction without side effects, fp exceptions, or a result dependency on VL, reducing VL is legal if only a subset of elements are legal. We'd already implemented this logic for vmv.v.v, and this patch simply applies it to stores as an alternate root. Longer term, I plan to extend this to other root instructions (i.e. different kinds of stores, reduces, etc.), and add a more general recursive walkback through operands. One risk with the dataflow based approach is that we could be reducing VL of an instruction scheduled in a region with the wider VL (i.e. mixed mode computations) forcing an additional VL toggle. An example of this is the @insert_subvector_dag_loop test case, but it doesn't appear to happen widely. I think this is a risk we should accept.

…and has the `nuw` or `nsw` property. (llvm#105914) This patch updates the select operand when the cond has the nuw or nsw property. Considering the semantics of the nuw and nsw flag, if there is no poison value in this expression, this code assumes that X can only be 0, 1 or -1. close: llvm#96765 alive2: https://alive2.llvm.org/ce/z/3n3n2Q

…ephole (llvm#105792) Currently we move the source down to where the vmv.v.v is to make sure that the new passthru dominates, but we do this even if it already does. This adds a simple local dominance check (taken from X86FastPreTileConfig.cpp) and avoids doing the move if it can. It also modifies the move to only move it to just past the passthru definition, and not all the way down to the vmv.v.v. This allows folding to succeed in some edge cases, which prevents regressions in an upcoming patch.

On macOS the dynamic loader prunes dyld specific environment variables such as `DYLD_INSERT_LIBRARIES`, `DYLD_LIBRARY_PATH`, etc. If these are set in the lit config it's safe to assume that the user actually wanted their subprocesses to run with these variables, versus the python interpreter that gets executed with them before they are pruned. This change exports all known variables in the shell script instead of relying on them being passed through.

Followup to llvm#90109.

In Microsoft, our automated scans are warning that LLVM has vulnerable dependencies. Specifically:

* [CVE-2024-35195](https://nvd.nist.gov/vuln/detail/CVE-2024-35195) was fixed in `requests` 2.32.0.
* [CVE-2024-37891](https://nvd.nist.gov/vuln/detail/CVE-2024-37891) was fixed in `urllib3` 2.2.2.

I've updated LLVM's dependencies by running the following commands in `llvm/utils/git`:

```
pip-compile --upgrade --generate-hashes --output-file=requirements.txt requirements.txt.in
pip-compile --upgrade --generate-hashes --output-file=requirements_formatting.txt requirements_formatting.txt.in
```

Note that for `requirements_formatting.txt` this adds `--generate-hashes` (according to my vague understanding, it's highly desirable and was already used for `requirements.txt`) and was locally run within `llvm/utils/git` (changing the recorded command, which apparently was originally run from the repo root - again, `requirements.txt` was already being regenerated with a locally run command, so this increases consistency).

I observe that this has updated the relevant components to pick up the CVE fixes. Note that I am largely clueless in this area, so I hope that (like llvm#90109) no other changes will be necessary.

Followup to llvm#99570.

* `TEST_COMPILER_MSVC` must be tested for `defined`ness, as it is everywhere else.
  + Definition: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/support/test_macros.h#L71-L72
  + Example usage: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/std/utilities/function.objects/func.not_fn/not_fn.pass.cpp#L248
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(33): fatal error C1017: invalid integer constant expression`
* Fix bogus return type: `msvc_is_lock_free_macro_value()` returns `2` or `0`, so it needs to return `int`.
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(41): warning C4305: 'return': truncation from 'int' to 'bool'`
* Clarity improvement: also add parens when mixing bitwise with arithmetic operators.

Fix bug introduced in llvm#105730.

The bug is in how the batch RAUW is implemented. If we have

```
%0 = mov %src
%1 = mov %0
use %0
use %1
```

the use of `%1` is rewritten to `%0`, not `%src`. This PR just looks for a replacement when it maps to the src register, which should transitively propagate the replacements.
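
The transitive propagation described here can be sketched with a plain map. This is only an illustration, not the code from the PR: `Reg`, `recordCopy` and `resolve` are hypothetical names, and registers are modelled as strings.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a virtual register name such as "%0".
using Reg = std::string;

// Record "dst is a copy of src". If src already has a replacement, store that
// replacement instead, so every entry points at the root source register.
inline void recordCopy(std::unordered_map<Reg, Reg> &replacements,
                       const Reg &dst, const Reg &src) {
  auto it = replacements.find(src);
  Reg root = (it != replacements.end()) ? it->second : src;
  replacements[dst] = root;
}

// Look up the register that uses of `r` should be rewritten to.
inline Reg resolve(const std::unordered_map<Reg, Reg> &replacements,
                   const Reg &r) {
  auto it = replacements.find(r);
  return it != replacements.end() ? it->second : r;
}
```

With the copies recorded in program order, a use of `%1` resolves directly to `%src` rather than stopping at `%0`.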

…tions (llvm#105840) This is a follow up to llvm#105455 which updates the VPIntrinsic mappings for the fadd and fmul cases, and supports both ordered and unordered reductions. This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element. This has all the same tradeoffs as the previous patch.

… mangling for MSVC 1920+ / VS2019+ (llvm#104722) Reapply llvm#102848. The description in this PR will detail the changes from the reverted original PR above.

For `auto&&` return types that can partake in reference collapsing we weren't properly handling the mangling that can arise. When collapsing occurs an inner reference is created with the collapsed reference type. If we return `int&` from such a function then an inner reference of `int&` is created within the `auto&&` return type. `getPointeeType` on a reference type goes through all inner references before returning the pointee type, which ends up being a builtin type, `int`, which is unexpected. We can use `getPointeeTypeAsWritten` to get the `AutoType` as expected; however, for the instantiated template declaration reference collapsing already occurred on the return type. This means `auto&&` is turned into `auto&` in our example above. We end up mangling an lvalue reference type. This is unintended as MSVC mangles on the declaration of the return type, `auto&&` in this case, which is treated as an rvalue reference.

```
template<class T>
auto&& AutoReferenceCollapseT(int& x) { return static_cast<int&>(x); }

void test() {
    int x = 1;
    auto&& rref = AutoReferenceCollapseT<void>(x); // "??$AutoReferenceCollapseT@X@@YA$$QEA_PAEAH@Z"
    // Mangled as an rvalue reference to auto
}
```

If we are mangling a template with a placeholder return type we want to get the first template declaration and use its return type to do the mangling of any instantiations. This fixes the bug reported in the original PR that caused the revert with libcxx `std::variant`. I also tested locally with libcxx and the following test code, which fails in the original PR but now works in this PR.

```
#include <variant>

void test() {
    std::variant<int> v{ 1 };
    int& r = std::get<0>(v);
    (void)r;
}
```

) Currently, the process of replacing bitwise operations consisting of `LSR`/`LSL` with `AND` is performed by `DAGCombiner`. However, in certain cases, the `AND` generated by this process can be removed. Consider the following case:

```
lsr x8, x8, #56
and x8, x8, #0xfc
ldr w0, [x2, x8]
ret
```

In this case, we can remove the `AND` by changing the target of `LDR` to `[X2, X8, LSL #2]` and changing the right-shift amount from 56 to 58. After the change:

```
lsr x8, x8, #58
ldr w0, [x2, x8, lsl #2]
ret
```

This patch checks whether the shift + `AND` operation on the load target can be optimized, and optimizes it if it can.
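
The equivalence of the two address computations can be checked with plain integer arithmetic. The following is a sketch of that arithmetic only, not of the DAG transform; `offsetBefore` and `offsetAfter` are hypothetical names for the byte offset each sequence adds to `x2`.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset computed by the original sequence: lsr #56, then and #0xfc.
constexpr uint64_t offsetBefore(uint64_t x) { return (x >> 56) & 0xfcULL; }

// Byte offset after the change: lsr #58, with the load applying lsl #2.
constexpr uint64_t offsetAfter(uint64_t x) { return (x >> 58) << 2; }

// The mask 0xfc only clears the two low bits of an 8-bit value, which is the
// same as shifting two more bits out and scaling the result by 4.
static_assert(offsetBefore(0xFFFFFFFFFFFFFFFFULL) ==
                  offsetAfter(0xFFFFFFFFFFFFFFFFULL),
              "both forms must agree");
```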

…uble (llvm#104929)" ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`. LLVM should not change the behavior depending on host configurations. This reverts commit 14c7e4a. (llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)

If we fail to initialize the ASTContext builtins, LLDB may crash in non-obvious ways down the line, e.g., when it tries to call `ASTContext::getTypeSize` on a builtin like `ast.UnsignedCharTy`, which would dereference a `null` `QualType`. The initialization can fail if we either didn't set the `TypeSystemClang` target triple, or if the embedded clang isn't enabled for a certain target. This patch attempts to help pin-point the failure case post-mortem by adding a log message here that prints the triple. rdar://134260837

jayfoad and others added 30 commits August 22, 2024 11:46

[AMDGPU] GFX12 VMEM loads can write VGPR results out of order (llvm#1…

5506831

…05549) Fix SIInsertWaitcnts to account for this by adding extra waits to avoid WAW dependencies.

[cmake] Include GNUInstallDirs before using variables defined by it. (l…

5bbd598

…lvm#83807) This fixes an odd problem with the regex when `CMAKE_INSTALL_LIBDIR` is not defined: `string sub-command REGEX, mode REPLACE: regex "$" matched an empty string.` Fixes llvm#83802

[DebugInfo][NFC] Constify debug DbgVariableRecord::{isDbgValue,isDbgD…

743e70b

…eclare} (llvm#105570) Constify debug DbgVariableRecord::{isDbgValue,isDbgDeclare}.

Revert "[lldb][swig] Use the correct variable in the return statement"

7323e7e

This reverts commit 6528157. I'm reverting llvm#104523 (llvm@f01f80c) and this fixup belongs to the same series of changes.

Revert "[lldb-dap] Mark hidden frames as "subtle" (llvm#105457)"

aa70f83

This reverts commit 6f45602, which depends on llvm#104523, which I'm reverting.

Revert "[lldb] Extend frame recognizers to hide frames from backtraces (

547917a

llvm#104523)" This reverts commit f01f80c. This commit introduces an msan violation. See the discussion on llvm#104523.

[clang][bytecode] Fix void unary * operators (llvm#105640)

125aa10

Discard the subexpr.

[NFC][VPlan] Correct two typos in comments.

6932f47

[NFC][SetTheory] Refactor to use const pointers and range loops (llvm…

d7da79f

…#105544)
- Refactor SetTheory code to use const pointers when possible.
- Use auto for variables initialized using dyn_cast<>.
- Use range based for loops and early continue.

[libc++] Fix the documentation build

c73b14c

There was a duplicate link target.

[libc++] Add link to the Github conformance table from the documentation

6d30b67

[flang][OpenMP] use reduction alloc region (llvm#102525)

f2027a9

I removed the `*-hlfir*` tests because they are duplicate now that the other tests have been updated to use the HLFIR lowering. 3/3 Part 1: llvm#102522 Part 2: llvm#102524

[InstCombine] Fold scmp(x -nsw y, 0) to scmp(x, y) (llvm#105583)

d163935

Proof: https://alive2.llvm.org/ce/z/v6VtXz
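
As an informal complement to the Alive2 proof, the fold can be checked exhaustively for 8-bit values. This is a sketch under stated assumptions: `scmp` here is a hypothetical scalar model of the three-way comparison, and `foldHoldsForI8` is a hypothetical helper that skips every pair where `x - y` would overflow i8, since `nsw` rules those cases out.

```cpp
#include <cassert>
#include <cstdint>

// Three-way signed comparison returning -1, 0 or 1 (a scalar model of scmp).
constexpr int scmp(int a, int b) { return (a > b) - (a < b); }

// Exhaustive check over i8: whenever x - y does not overflow (the nsw
// precondition), scmp(x - y, 0) must equal scmp(x, y).
inline bool foldHoldsForI8() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      int diff = x - y; // computed in a wider type, so no overflow here
      if (diff < -128 || diff > 127)
        continue; // would overflow i8, so nsw does not cover this pair
      if (scmp(diff, 0) != scmp(x, y))
        return false;
    }
  }
  return true;
}
```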

[clang][bytecode] Allow adding offsets to function pointers (llvm#105641

db94852

) Convert them to Pointers, do the offset calculation and then convert them back to function pointers.

[InstCombine] Add more tests for foldLogOpOfMaskedICmps transform (NFC)

7e3f9dd

Tests for cases that would have been regressed by llvm#104941.

[mlir][OpenMP][NFC] clean up optional reduction region parsing (llvm#…

dd3b43a

…105644) This can be handled in ODS instead of writing custom parsing/printing code. Thanks for the idea @skatrak

[PowerPC] Fix mask for __st[d/w/h/b]cx builtins (llvm#104453)

327edbe

These builtins are currently returning CR0 which will have the format [0, 0, flag_true_if_saved, XER]. We only want to return flag_true_if_saved. This patch adds a shift to remove the XER bit before returning.
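
The effect of the added shift can be sketched on a plain integer. This assumes, for illustration only, that CR0 is held as a 4-bit value in the order given above with the XER bit least significant; `extractStoreFlag` is a hypothetical name and not the builtin's implementation.

```cpp
#include <cassert>

// Assumed layout of the 4-bit CR0 value: [0, 0, flag_true_if_saved, XER],
// with XER in bit 0. Shifting right by one drops the XER bit and leaves only
// the success flag, because the two upper bits are always zero.
constexpr unsigned extractStoreFlag(unsigned cr0) { return cr0 >> 1; }
```

Without the shift, a successful store with the XER bit set would be reported as 3 rather than 1.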

[LLVM][CodeGen][SVE] Increase vector.insert test coverage.

11e1378

[InstCombine] Add more test variants with poison elements (NFC)

c8f40e7

[InstCombine] Handle logical op for and/or of icmp 0/-1

32679e1

This aligns the transform with what foldLogOpOfMaskedICmp() does.

[AArch64] optimise SVE cmp intrinsics with no active lanes (llvm#104779)

29cb1e6

This patch extends llvm#73964 and optimises SVE cmp intrinsics to zero vector when predicate is zero.

[libc++] Post-LLVM19-release docs cleanup (llvm#99667)

58ac764

This patch removes obsolete status pages for projects that were completed: LLVM 18 release, C++20 Ranges and Spaceship support. Co-authored-by: Hristo Hristov <[email protected]>

tbaederr and others added 29 commits August 24, 2024 09:23

[clang][bytecode][NFC] Add an additional assertion (llvm#105927)

99b85ca

Since this must be true, add an assertion instead of just documenting it via the comment.

[Tests] Attempt to fix PowerPC buildbots.

001e423

The intent is that the tests should not be running on PowerPC as the fp128 type will differ. This attempts to fix the bots by using __powerpc__ instead, which appears to be defined in godbolt.

[VPlan] Wrap planContainsAdditionalSimplifications in NDEBUG (NFC)

40975da

Only used for an assertion.

[ConstantFolding] Ensure TLI is valid when simplifying fp128 intrinsics.

83a5c7c

TLI might not be valid for all contexts that constant folding is performed. Add a quick guard that it is not null.

[Analysis] Copy-construct SmallVector (NFC) (llvm#105911)

08acc3f

[Target] Use llvm::replace (NFC) (llvm#105942)

a5d89d5

[IR] Modernize StructuralHashImpl (NFC) (llvm#105951)

d252365

Update my email

6f618a7

[ARM] Add a number of extra vmovimm tests for BE. NFC

9f82f6d

[ARM] Add VECTOR_REG_CAST identity fold.

b9a0276

v16i8 VECTOR_REG_CAST (v16i8 Op) can use v16i8 Op directly, as the VECTOR_REG_CAST is a noop.

[Mips] Remove a trivial variable (NFC) (llvm#105940)

a6f87ab

We assign I->getNumOperands() to J and immediately print that out as a debug message. We don't need to keep J across iterations.

[clang-format] Fix a misannotation of redundant r_paren as CastRParen (…

6bc225e

…llvm#105921) Fixes llvm#105880.

[clang-format] Fix a misannotation of less/greater as angle brackets (l…

0916ae4

…lvm#105941) Fixes llvm#105877.

[X86][AMX] Avoid to construct invalid shape for checking, NFCI (llvm#…

5c94dd7

…105973)

[RISCV][ISel] Move VCIX ISDs to correct position. NFC (llvm#105934)

579fd59

The current VCIX ISDs are placed after FIRST_TARGET_STRICTFP_OPCODE, which is not expected; they should be in the normal opcode area.

[CodeGen] Replace MCPhysReg with MCRegister in MachineBasicBlock::isL…

f22b1da

…iveIn/removeLiveIn. NFC We already used it for addLiveIn.

Reapply "[compiler-rt][nsan] Add support for nan detection" (llvm#105909

5136521

) This reverts commit 1f89cd4.

[AutoBump] Merge with 5136521 (Aug 25)

f0747cd

cferry-AMD approved these changes Sep 30, 2024
