forked from llvm/llvm-project
[AutoBump] Merge with 51365212 (Aug 25) (10) #363
Open
mgehre-amd
wants to merge
427
commits into
bump_to_b96f18b2
from
bump_to_51365212
Conversation
…05549) Fix SIInsertWaitcnts to account for this by adding extra waits to avoid WAW dependencies.
…lvm#83807) This fixes an odd problem with the regex when `CMAKE_INSTALL_LIBDIR` is not defined: `string sub-command REGEX, mode REPLACE: regex "$" matched an empty string.` Fixes llvm#83802
…eclare} (llvm#105570) Constify debug DbgVariableRecord::{isDbgValue,isDbgDeclare}.
This reverts commit 6528157. I'm reverting llvm#104523 (llvm@f01f80c) and this fixup belongs to the same series of changes.
This reverts commit 6f45602, which depends on llvm#104523, which I'm reverting.
llvm#104523)" This reverts commit f01f80c. This commit introduces an msan violation. See the discussion on llvm#104523.
Discard the subexpr.
…#105544)
- Refactor SetTheory code to use const pointers when possible.
- Use auto for variables initialized using dyn_cast<>.
- Use range based for loops and early continue.
There was a duplicate link target.
This region is intended to separate alloca operations from reduction variable initialization. This makes it easier to hoist allocas to the entry block before control flow and complex code for initialization. The verifier checks that there is at most one block in the alloc region. This is not sufficient to avoid control flow in general MLIR, but by the time we are converting to LLVMIR structured control flow should already have been lowered to the cf dialect. 1/3 Part 2: llvm#102524 Part 3: llvm#102525
The intention of this change is to ensure that allocas end up in the entry block, not spread out amongst complex reduction variable initialization code. The tests we have are quite minimized for readability and maintainability, making the benefits less obvious. The use case for this is when there are multiple reduction variables, each with multiple blocks inside of the init region for that reduction. 2/3 Part 1: llvm#102522 Part 3: llvm#102525
I removed the `*-hlfir*` tests because they are duplicate now that the other tests have been updated to use the HLFIR lowering. 3/3 Part 1: llvm#102522 Part 2: llvm#102524
…finitions and partial specializations (llvm#104030) We need to rebuild the template parameters of out-of-line definitions/specializations of member templates in the context of the current instantiation for the purposes of declaration matching. We already do this for function templates and class templates, but not variable templates, partial specializations of variable template, and partial specializations of class templates. This patch fixes the latter cases.
Tests for cases that would have been regressed by llvm#104941.
…vm#102752) Currently `mlir.llvm.constant` of structure types restricts that the structure type effectively represents a complex type -- it must have exactly two fields of the same type and the field type must be either an integer type or a float type. This PR relaxes this restriction and it allows the structure type to have an arbitrary number of fields.
…ble objects (llvm#104778) Whilst dealing with review comments on llvm#96752 I discovered that SCEV does not know about the dereferenceable attribute on function arguments so I have updated getRangeRef to make use of it by calling getPointerDereferenceableBytes.
These builtins are currently returning CR0 which will have the format [0, 0, flag_true_if_saved, XER]. We only want to return flag_true_if_saved. This patch adds a shift to remove the XER bit before returning.
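The shift described above can be illustrated with a minimal sketch. It assumes the returned value is the 4-bit field `[0, 0, flag_true_if_saved, XER]` with the XER bit in the least significant position; `extractSavedFlag` is a hypothetical name used only for illustration and is not the patch's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: `cr0` is assumed to hold [0, 0, flag, XER],
// with the XER bit as the least significant bit.
// Shifting right by one drops the XER bit and leaves only the flag.
constexpr std::uint32_t extractSavedFlag(std::uint32_t cr0) {
  return cr0 >> 1;
}
```

Because the two upper bits are always zero in this layout, no extra mask is needed after the shift.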
This aligns the transform with what foldLogOpOfMaskedICmp() does.
- Landing page: add link to the libc++ Discord channel
- Landing page: reorder "Getting Involved" above "Design documents"
- Landing page: remove "Notes and Known Issues" which was completely outdated
- Rename "Using Libc++" to "User Documentation" and update contents
- Rename "Building Libc++" to "Vendor Documentation" and update contents
The "BuildingLibcxx" and "UsingLibcxx" pages have basically been used for vendor and user documentation respectively. However, they were named in a way that doesn't really make that clear. Renaming the pages now gives us a location to clearly document what we target at vendors and what we target at users, and to do that separately.
…ns (llvm#105455) This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element. For RISCV specifically, it's worth noting that an alternate padded lowering is available when VL is one less than a power of two, and LMUL <= m1. We could slide the vector operand up by one, and insert the padding via a vslide1up. We don't currently pattern match this, but we could. This form would arguably be better iff the surrounding code wanted VL=4. This patch will force a VL toggle in that case instead. Basically, it comes down to a question of whether we think odd sized vectors are going to appear clustered with odd size vector operations, or mixed in with larger power of two operations. Note there is a potential downside of using vp nodes; we lose any generic DAG combines which might have applied to the widened form.
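The two lowering strategies contrasted above can be sketched in scalar C++ for an add reduction over a 3-element vector widened to 4 lanes. This is only a model of the semantics, not the DAG lowering itself; `reduceAddPadded` and `reduceAddWithEVL` are hypothetical names, and the add reduction with neutral element 0 is chosen purely as an example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Padded form: widen to 4 lanes and fill the extra lane with the
// neutral element of the reduction (0 for add), then reduce all lanes.
std::int32_t reduceAddPadded(const std::array<std::int32_t, 3>& v) {
  const std::array<std::int32_t, 4> wide = {v[0], v[1], v[2], 0};
  std::int32_t sum = 0;
  for (std::int32_t lane : wide) sum += lane;
  return sum;
}

// EVL form: operate on the wider vector but only process the first
// `evl` lanes, so the contents of the remaining lanes do not matter.
std::int32_t reduceAddWithEVL(const std::array<std::int32_t, 4>& wide,
                              std::size_t evl) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < evl && i < wide.size(); ++i) sum += wide[i];
  return sum;
}
```

Both forms yield the same result for the original three elements; the EVL form simply never reads the fourth lane, which is why no padding value has to be materialized.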
…lvm#104689) This is a fairly narrow transform (at the moment) to reduce the VLs of instructions feeding a store with a smaller VL. Note that the goal of this transform isn't really to reduce VL - it's to reduce VL *toggles*. To our knowledge, small reductions in VL without also changing LMUL are generally not profitable on existing hardware. For a single use instruction without side effects, fp exceptions, or a result dependency on VL, reducing VL is legal if only a subset of elements are legal. We'd already implemented this logic for vmv.v.v, and this patch simply applies it to stores as an alternate root. Longer term, I plan to extend this to other root instructions (e.g. different kinds of stores, reduces, etc.), and add a more general recursive walkback through operands. One risk with the dataflow based approach is that we could be reducing VL of an instruction scheduled in a region with the wider VL (i.e. mixed mode computations), forcing an additional VL toggle. An example of this is the @insert_subvector_dag_loop test case, but it doesn't appear to happen widely. I think this is a risk we should accept.
This patch extends llvm#73964 and optimises SVE cmp intrinsics to a zero vector when the predicate is zero.
This patch removes obsolete status pages for projects that were completed: LLVM 18 release, C++20 Ranges and Spaceship support. Co-authored-by: Hristo Hristov <[email protected]>
Since this must be true, add an assertion instead of just documenting it via the comment.
…and has the `nuw` or `nsw` property. (llvm#105914) This patch updates the select operand when the cond has the nuw or nsw property. Considering the semantics of the nuw and nsw flag, if there is no poison value in this expression, this code assumes that X can only be 0, 1 or -1. close: llvm#96765 alive2: https://alive2.llvm.org/ce/z/3n3n2Q
The intent is that the tests should not be running on PowerPC as the fp128 type will differ. This attempts to fix the bots by using __powerpc__ instead, which appears to be defined in godbolt.
…ephole (llvm#105792) Currently we move the source down to where the vmv.v.v is to make sure that the new passthru dominates, but we do this even if it already does. This adds a simple local dominance check (taken from X86FastPreTileConfig.cpp) and avoids doing the move if it can. It also modifies the move to only move it to just past the passthru definition, and not all the way down to the vmv.v.v. This allows folding to succeed in some edge cases, which prevents regressions in an upcoming patch.
Only used for an assertion.
TLI might not be valid for all contexts that constant folding is performed. Add a quick guard that it is not null.
On macOS the dynamic loader prunes dyld specific environment variables such as `DYLD_INSERT_LIBRARIES`, `DYLD_LIBRARY_PATH`, etc. If these are set in the lit config it's safe to assume that the user actually wanted their subprocesses to run with these variables, versus the python interpreter that gets executed with them before they are pruned. This change exports all known variables in the shell script instead of relying on them being passed through.
Followup to llvm#90109.
In Microsoft, our automated scans are warning that LLVM has vulnerable dependencies. Specifically:
* [CVE-2024-35195](https://nvd.nist.gov/vuln/detail/CVE-2024-35195) was fixed in `requests` 2.32.0.
* [CVE-2024-37891](https://nvd.nist.gov/vuln/detail/CVE-2024-37891) was fixed in `urllib3` 2.2.2.
I've updated LLVM's dependencies by running the following commands in `llvm/utils/git`:
```
pip-compile --upgrade --generate-hashes --output-file=requirements.txt requirements.txt.in
pip-compile --upgrade --generate-hashes --output-file=requirements_formatting.txt requirements_formatting.txt.in
```
Note that for `requirements_formatting.txt` this adds `--generate-hashes` (according to my vague understanding, it's highly desirable and was already used for `requirements.txt`) and was locally run within `llvm/utils/git` (changing the recorded command, which apparently was originally run from the repo root - again, `requirements.txt` was already being regenerated with a locally run command, so this increases consistency).
I observe that this has updated the relevant components to pick up the CVE fixes. Note that I am largely clueless in this area, so I hope that (like llvm#90109) no other changes will be necessary.
Followup to llvm#99570.
* `TEST_COMPILER_MSVC` must be tested for `defined`ness, as it is everywhere else.
  + Definition: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/support/test_macros.h#L71-L72
  + Example usage: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/std/utilities/function.objects/func.not_fn/not_fn.pass.cpp#L248
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(33): fatal error C1017: invalid integer constant expression`
* Fix bogus return type: `msvc_is_lock_free_macro_value()` returns `2` or `0`, so it needs to return `int`.
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(41): warning C4305: 'return': truncation from 'int' to 'bool'`
* Clarity improvement: also add parens when mixing bitwise with arithmetic operators.
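The first two fixes above can be illustrated with a small standalone sketch. It assumes the detection macro is defined without a value, which the C1017 error suggests; `SKETCH_COMPILER_DETECTED`, `lockFreeMacroValue`, and `truncatedToBool` are hypothetical stand-ins and are not the names used in `atomic_helpers.h`.

```cpp
#include <cassert>

// Hypothetical macro, defined with no value (an assumption about how
// a compiler-detection macro may be defined).
#define SKETCH_COMPILER_DETECTED

// `#if SKETCH_COMPILER_DETECTED` would expand to an empty condition,
// which is not a valid integer constant expression. Testing for
// definedness is well-formed either way.
#if defined(SKETCH_COMPILER_DETECTED)
constexpr bool kCompilerDetected = true;
#else
constexpr bool kCompilerDetected = false;
#endif

// The function yields 2 or 0, so `int` preserves the value.
constexpr int lockFreeMacroValue(bool alwaysLockFree) {
  return alwaysLockFree ? 2 : 0;
}

// With a `bool` return type the value 2 would collapse to `true`,
// which is the truncation the compiler warning points at.
constexpr bool truncatedToBool(bool alwaysLockFree) {
  return lockFreeMacroValue(alwaysLockFree) != 0;
}
```

The `int` version keeps the distinction between 2 and other non-zero values, which a caller comparing against a lock-free macro value would need.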
Fix bug introduced in llvm#105730
The bug is in how the batch RAUW is implemented. If we have
```
%0 = mov %src
%1 = mov %0
use %0
use %1
```
The use of `%1` is rewritten to `%0`, not `%src`. This PR just looks for a replacement when it maps to the src register, which should transitively propagate the replacements.
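The transitive propagation described above can be sketched with a plain replacement map. This is a simplified model under stated assumptions, not the pass's code: registers are modeled as integers, and `recordCopy` is a hypothetical helper name.

```cpp
#include <cassert>
#include <unordered_map>

using Reg = int;  // Registers modeled as plain integers for illustration.

// Record that `dst` is a copy of `src`. If `src` itself already has a
// replacement, map `dst` to that replacement instead, so chains of
// copies always resolve to the original source register.
void recordCopy(std::unordered_map<Reg, Reg>& replacements, Reg dst, Reg src) {
  Reg target = src;
  auto it = replacements.find(src);
  if (it != replacements.end())
    target = it->second;  // Propagate the existing replacement.
  replacements[dst] = target;
}
```

With `%src` modeled as 100, recording `%0 = mov %src` and then `%1 = mov %0` maps both `%0` and `%1` to 100, so the use of `%1` is rewritten to `%src` rather than to `%0`. The lookup result is copied into a local before the map is modified, since inserting may invalidate iterators.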
…tions (llvm#105840) This is a follow up to llvm#105455 which updates the VPIntrinsic mappings for the fadd and fmul cases, and supports both ordered and unordered reductions. This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element. This has all the same tradeoffs as the previous patch.
… mangling for MSVC 1920+ / VS2019+ (llvm#104722) Reapply llvm#102848. The description in this PR will detail the changes from the reverted original PR above.
For `auto&&` return types that can partake in reference collapsing we weren't properly handling the mangling that can arise. When collapsing occurs an inner reference is created with the collapsed reference type. If we return `int&` from such a function then an inner reference of `int&` is created within the `auto&&` return type. `getPointeeType` on a reference type goes through all inner references before returning the pointee type, which ends up being a builtin type, `int`, which is unexpected. We can use `getPointeeTypeAsWritten` to get the `AutoType` as expected; however, for the instantiated template declaration, reference collapsing already occurred on the return type. This means `auto&&` is turned into `auto&` in our example above. We end up mangling an lvalue reference type. This is unintended as MSVC mangles on the declaration of the return type, `auto&&` in this case, which is treated as an rvalue reference.
```
template<class T>
auto&& AutoReferenceCollapseT(int& x) { return static_cast<int&>(x); }

void test() {
    int x = 1;
    auto&& rref = AutoReferenceCollapseT<void>(x); // "??$AutoReferenceCollapseT@X@@ya$$QEA_PAEAH@Z"
    // Mangled as an rvalue reference to auto
}
```
If we are mangling a template with a placeholder return type we want to get the first template declaration and use its return type to do the mangling of any instantiations. This fixes the bug reported in the original PR that caused the revert with libcxx `std::variant`. I also tested locally with libcxx and the following test code, which fails in the original PR but now works in this PR.
```
#include <variant>

void test() {
    std::variant<int> v{ 1 };
    int& r = std::get<0>(v);
    (void)r;
}
```
) Currently, the process of replacing bitwise operations consisting of `LSR`/`LSL` with `And` is performed by `DAGCombiner`. However, in certain cases, the `AND` generated by this process can be removed. Consider the following case:
```
lsr x8, x8, #56
and x8, x8, #0xfc
ldr w0, [x2, x8]
ret
```
In this case, we can remove the `AND` by changing the target of `LDR` to `[X2, X8, LSL #2]` and changing the right-shift amount from 56 to 58. After the change:
```
lsr x8, x8, #58
ldr w0, [x2, x8, lsl #2]
ret
```
This patch checks to see if the `SHIFTING` + `AND` operation on the load target can be optimized and optimizes it if it can.
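The equivalence behind this rewrite can be checked arithmetically. The sketch below models the two address-offset computations on a 64-bit value; `offsetBefore` and `offsetAfter` are hypothetical names, and the code only illustrates the arithmetic identity, not the DAG transform.

```cpp
#include <cassert>
#include <cstdint>

// Offset as computed by the original sequence:
//   lsr x8, x8, #56 ; and x8, x8, #0xfc
constexpr std::uint64_t offsetBefore(std::uint64_t x) {
  return (x >> 56) & 0xfc;
}

// Offset as computed by the rewritten sequence:
//   lsr x8, x8, #58 ; then the load scales the index by `lsl #2`
constexpr std::uint64_t offsetAfter(std::uint64_t x) {
  return (x >> 58) << 2;
}

// Masking with 0xfc clears the two low bits of the 8-bit shifted value,
// which is the same as dropping two more bits and shifting back by two.
static_assert(offsetBefore(0xA700000000000000ull) ==
                  offsetAfter(0xA700000000000000ull),
              "both forms compute the same offset");
```

For any 64-bit input the two forms agree, because the top eight bits with the low two cleared equal the top six bits shifted left by two.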
v16i8 VECTOR_REG_CAST (v16i8 Op) can use v16i8 Op directly, as the VECTOR_REG_CAST is a noop.
We assign I->getNumOperands() to J and immediately print that out as a debug message. We don't need to keep J across iterations.
…uble (llvm#104929)" ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`. LLVM should not change the behavior depending on host configurations. This reverts commit 14c7e4a. (llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
Current VCIX ISDs are placed after FIRST_TARGET_STRICTFP_OPCODE, which is not expected; they should be in the normal OPCODE area.
…iveIn/removeLiveIn. NFC We already used it for addLiveIn.
If we fail to initialize the ASTContext builtins, LLDB may crash in non-obvious ways down the line, e.g., when it tries to call `ASTContext::getTypeSize` on a builtin like `ast.UnsignedCharTy`, which would dereference a `null` `QualType`. The initialization can fail if we either didn't set the `TypeSystemClang` target triple, or if the embedded clang isn't enabled for a certain target. This patch attempts to help pinpoint the failure case post-mortem by adding a log message here that prints the triple. rdar://134260837
) This reverts commit 1f89cd4.
cferry-AMD
approved these changes
Sep 30, 2024
No description provided.