Allow link to llvm shared library for current distros #68
base: amd-stg-open
Commits on Apr 30, 2024
[NFC] Remove method from FoldingSet that already existed in APInt. (llvm#90486)
Noticed that APInt already had a function that updates a FoldingSet, so there was no need for me to add one in llvm#84617.
Commit 9a1386e
[mlir][sparse] fold explicit value during sparsification (llvm#90530)
This ensures the explicit value is generated (and not a load into the values array). Note that not storing the values array at all is still TBD; this is just the very first step.
Commit 65ee8f1
[Attributes] Support Attributes being declared as supporting an experimental late parsing mode "extension" (llvm#88596)
This patch changes the `LateParsed` field of `Attr` in `Attr.td` to be an instantiation of the new `LateAttrParseKind` class. The instantiation can be one of the following:

* `LateAttrParsingNever` - Corresponds with the false value of `LateParsed` prior to this patch (the default for an attribute).
* `LateAttrParseStandard` - Corresponds with the true value of `LateParsed` prior to this patch.
* `LateAttrParseExperimentalExt` - A new mode described below.

`LateAttrParseExperimentalExt` is an experimental extension to `LateAttrParseStandard`. Essentially this allows `Parser::ParseGNUAttributes(...)` to distinguish between these cases:

1. Only `LateAttrParseExperimentalExt` attributes should be late parsed.
2. Both `LateAttrParseExperimentalExt` and `LateAttrParseStandard` attributes should be late parsed.

Callers (and indirect callers) of `Parser::ParseGNUAttributes(...)` indicate the desired behavior by setting a flag in the `LateParsedAttrList` object that is passed to the function.

In addition to the above, a new driver and frontend flag (`-fexperimental-late-parse-attributes`) with a corresponding LangOpt (`ExperimentalLateParseAttributes`) is added that changes how `LateAttrParseExperimentalExt` attributes are parsed.

* When the flag is disabled (default), in cases where only `LateAttrParsingExperimentalOnly` late parsing is requested, the attribute will be parsed immediately (i.e. **NOT** late parsed). This allows the attribute to act just like a `LateAttrParseStandard` attribute when the flag is disabled.
* When the flag is enabled, in cases where only `LateAttrParsingExperimentalOnly` late parsing is requested, the attribute will be late parsed.

The motivation behind this change is to allow the new `counted_by` attribute (part of `-fbounds-safety`) to support late parsing but **only** when `-fexperimental-late-parse-attributes` is enabled. This attribute needs to support late parsing so that it can refer to fields later in a struct definition (or function parameters declared later). However, there is no precedent for supporting late attribute parsing in C, so this flag allows the new behavior to exist in Clang without being on by default. This behavior was requested as part of the `-fbounds-safety` RFC process (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854/68).

This patch doesn't introduce any uses of `LateAttrParseExperimentalExt`. These will be added for the `counted_by` attribute in a future patch (llvm#87596). As a consequence, the new behavior added in this patch is not yet testable, hence the lack of tests covering it.

rdar://125400257
Commit b1867e1
[NewPM][CodeGen] Add `MachineFunctionAnalysis` (llvm#88610)
In the new pass system, `MachineFunction` can be an analysis result again, so machine module passes can now fetch them from the analysis manager. `MachineModuleInfo` no longer owns them. `FreeMachineFunctionPass` is removed and replaced by `InvalidateAnalysisPass<MachineFunctionAnalysis>`. With that replacement, the workaround in `MachineFunctionPassManager` is no longer needed, and there is no difference between `unittests/MIR/PassBuilderCallbacksTest.cpp` and `unittests/IR/PassBuilderCallbacksTest.cpp`.
Commit 6ea0c0a
[X86] Enable EVEX512 when host CPU has AVX512 (llvm#90479)
This is used when `-march=native` is run on a CPU that is unknown to an old version of LLVM.
Commit b329179
[BOLT] Avoid reference updates for non-JT symbol operands (llvm#88838)
Skip updating references for operands that do not directly refer to jump table symbols but fall within a jump table's address range to prevent unintended modifications.
Commit 9d5411f
[C++20] [Modules] [Reduced BMI] Avoid force writing static declarations within module purview
Close llvm#90259
Technically, static declarations shouldn't leak from the module interface; otherwise the program is ill-formed according to the spec. So we can drop the static declarations from the reduced BMI, which closes the above issue. However, existing headers contain a lot of `static inline` code, so doing this globally would be a pretty big breaking change.
Commit 38067c5
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
Commit 1990de9
Change-Id: Icf8748fff11482f16cbeb1f19baf5a3404b57c6e
Jenkins committed Apr 30, 2024
Commit 73aa06a
Disable test for lsan and x86_64h (llvm#90483)
Disable this test on x86_64h for LSan. This test is failing with malformed object only on x86_64h. Disabling for now. rdar://125052424
Commit 62d6560
Commit 326667d
[ELF] --compress-debug-sections=zstd: replace ZSTD_c_nbWorkers parallelism with multi-frame parallelism
https://reviews.llvm.org/D133679 utilizes zstd's multithread API to create one single frame. This provides a higher compression ratio but is significantly slower than concatenating multiple frames. With manual parallelism, it is easier to parallelize memcpy in OutputSection::writeTo. In addition, as the individually allocated decompression buffers are much smaller, we can make a wild guess (compressed_size/4) without worrying that a resize (due to a wrong guess) would waste memory.
Commit 79095b4
[clang-tidy] fix false-negative for macros in `readability-math-missing-parentheses` (llvm#90279)
When a binary operator is the last operand of a macro, the end location that is past the `BinaryOperator` will be inside the macro and therefore an invalid location to insert a `FixIt` into, which is why the check bails when encountering such a pattern. However, the end location is only required for the `FixIt` and the diagnostic can still be emitted, just without an attached fix.
Commit fbe4d99
Commit bd72f7b
[NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to generate module file for C++20 modules instead of PCHGenerator
Previously we were re-using PCHGenerator to generate the module file for C++20 modules, which is slightly odd. This patch uses a new class, 'CXX20ModulesGenerator', to generate the module file for C++20 modules.
Commit 18268ac
[SelectionDAG][RISCV] Move VP_REDUCE* legalization to LegalizeDAG.cpp. (llvm#90522)
LegalizeVectorType is responsible for legalizing nodes that perform an operation on each element and may need to be scalarized. This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR, SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc. This patch drops any nodes with a scalar result from LegalizeVectorOps and handles them in LegalizeDAG instead. This required moving the reduction promotion to LegalizeDAG. I have removed the support for integer promotion as it was incorrect for integer min/max reductions. Since it was untested, it was best to assert on it until it was really needed. There are a couple of regressions that can be fixed with a small DAG combine, which I will do as a follow-up.
Commit 705636a
Commit 6e83058
[C++20] [Modules] Don't skip pragma diagnostic mappings
Close llvm#75057
Previously, I incorrectly thought that diagnostic mappings were not meaningful with modules. This problem was revealed by another recent change. So this patch partially reverts the previous "optimization".
Commit fb21343
[RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions
Marking them as `hasSideEffects=1` stops some optimizations. According to `Target.td`:

> // Does the instruction have side effects that are not captured by any
> // operands of the instruction or other flags?
> bit hasSideEffects = ?;

It seems we don't need to set `hasSideEffects` for vleNff since we have modelled `vl` as an output operand. As for saturating instructions, I think that an explicit Def/Use list counts as side effects captured by the operands of the instruction, so we don't need to set `hasSideEffects` either. And I have just investigated AArch64's implementation: they don't set this flag and don't add a `Def` list.

These changes make optimizations like `performCombineVMergeAndVOps` and MachineCSE possible for these instructions. As a consequence, `copyprop.mir` can't test what we want to test in https://reviews.llvm.org/D155140, so we replace `vssra.vi` with a VCIX instruction (it has side effects).

Reviewers: jacquesguan, topperc, preames, asb, lukel97
Reviewed By: topperc, lukel97
Pull Request: llvm#90049
Commit 940ef96
Revert "[C++20] [Modules] Don't skip pragma diagnostic mappings"
Commit 6b961e2
[RISCV] Add DAG combine for (vmv_s_x_vl (undef) (vmv_x_s X). (llvm#90524
Commit 2524146
[LoongArch] Support parsing la.tls.desc pseudo instruction
Simultaneously implemented parsing support for the `%desc_*` modifiers. Reviewers: SixWeining, heiher, xen0n Reviewed By: xen0n, SixWeining Pull Request: llvm#90158
Commit 4a84d8e
[C++20] [Modules] Don't skip pragma diagnostic mappings
Close llvm#75057
Previously, I incorrectly thought that diagnostic mappings were not meaningful with modules. This problem was revealed by another recent change. So this patch partially reverts the previous "optimization".
Commit ec527b2
Commit f4843ac
[mlir][OpenMP] Extend `omp.private` with a `dealloc` region (llvm#90456)
Extends `omp.private` with a new region: `dealloc` where deallocation logic for Fortran deallocatables will be outlined (this will happen in later PRs).
Commit ce12b12
Commit 09f160c
[lldb][Docs] Remove more subtitles from packets doc (llvm#90443)
This removes various subtitles or converts them to bold text so that the table of contents is less cluttered. This includes "Example", "Notes", "Priority To Implement" and "Response".
Commit ff6c0ca
[LoongArch][Codegen] Add support for TLSDESC
The implementation is only enabled when the `-enable-tlsdesc` option is passed and the TLS model is `dynamic`. LoongArch's GCC has the same option (`-mtls-dialect=`) as RISC-V. Reviewers: heiher, MaskRay, SixWeining Reviewed By: SixWeining, MaskRay Pull Request: llvm#90159
Commit eb148ae
Reapply "[flang] Improve debug info for functions." with regression fixed. (llvm#90484)
The original PR llvm#90083 had to be reverted in PR llvm#90444 as it caused one of the gfortran tests to fail. The issue was using `isIntOrIndex` to check for an integer type. It allowed the index type, which later caused an assertion when calling `getIntOrFloatBitWidth`. I have now replaced it with `isInteger`, which should fix this regression.
Commit 91a8cb7
[RemoveDIs] Fix findDbgValues to return dbg_assign records too (llvm#90471)
In the debug intrinsic class hierarchy, a dbg.assign is a (inherits from) dbg.value, so `findDbgValues` returns dbg.values and dbg.assigns (by design). That hierarchy doesn't exist for DbgRecords, so fix findDbgValues to return dbg_assign records as well as dbg_values, and add a unittest.
Commit 09e7d86
[docs] Document which online sync-ups are no longer happening (llvm#89361)
Some of the online sync-ups on our Getting Involved page seem to no longer be happening. Document them as no longer happening, so that people don't get confused when dialing in to one of these.
Commit 853344d
[Modules] No transitive source location change (llvm#86912)
This is part of the "no transitive change" patch series, specifically "no transitive source location change". I discussed this with @Bigcheese at the WG21 meeting in Tokyo. The idea comes from a post by @jyknight on LLVM Discourse. Consider:

```
// A.cppm
export module A;
...

// B.cppm
export module B;
import A;
...

//--- C.cppm
export module C;
import B;
```

Almost every time A.cppm changes, we need to recompile `B`, because we treat source locations as significant to the semantics. But it would be good to avoid recompiling `C` if the change to `A` doesn't change the BMI of `B`.

# Motivation Example

This patch only deals with source locations, so let's focus on a source location example. The full example is in the attached test.

```
//--- A.cppm
export module A;
export template <class T>
struct C {
  T func() {
    return T(43);
  }
};
export int funcA() {
  return 43;
}

//--- A.v1.cppm
export module A;

export template <class T>
struct C {
  T func() {
    return T(43);
  }
};
export int funcA() {
  return 43;
}

//--- B.cppm
export module B;
import A;
export int funcB() {
  return funcA();
}

//--- C.cppm
export module C;
import A;
export void testD() {
  C<int> c;
  c.func();
}
```

Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. The test then shows that the two BMIs of `B.cppm`, one built with `-fmodule-file=A=A.pcm` and the other with `-fmodule-file=A=A.v1.pcm`, should have bit-wise identical contents. However, it is a different story for C: since C instantiates templates from A, and the instantiation records source information from module A, which differs between `A` and `A.v1`, it is expected that the BMIs `C.pcm` and `C.v1.pcm` can and should differ.

# Internal perspective of status quo

To fully understand the patch, we need to understand how we encode source locations and how we serialize and deserialize them.
For source locations, we encode them as:

```
|
|
| _____ base offset of an imported module
|
|
|
|_____ base offset of another imported module
|
|
|
|
| ___ 0
```

As the diagram shows, we encode the local (unloaded) source locations from 0 towards higher bits, and we allocate the space for source locations from loaded modules from the high bits towards 0. The source locations from the loaded modules are then mapped into our source location space according to the allocated offset. For example, for

```
// a.cppm
export module a;
...

// b.cppm
export module b;
import a;
...
```

assume the offset of a source location (let's name the location `S`) in a.cppm is 45, so we record the value `45` in the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager allocates a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location space. Let's assume the allocated base offset is 90 in this example. To get the location of `S` in the current source location space, we simply add `45` to `90` to get `135`. So the source location of `S` in module B is `135`.

When we write module `b`, we also write the source location of `S` as `135` directly in the BMI. And to make clear that the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90, in the BMI of `b`.

Here is the problem. The base offset of module 'a' is computed from the number of source locations in module 'a', so the base offset of module 'a' recorded in module 'b' changes every time the number of source locations in module 'a' increases or decreases. In other words, the contents of the BMI of B change every time the number of locations in module 'a' changes. This is pretty sensitive: almost every change alters the number of locations. This is the problem this patch wants to solve.
Let's continue with the existing design to understand what's going on. Another interesting case is:

```
// c.cppm
export module c;
import whatever;
import a;
import b;
...
```

In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it; let's say the value somehow becomes `200`. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple: we can get it by `135 + (200 - 90) = 245`. In other words, the offset of a source location in the current module can be computed as `Recorded Offset + Base Offset of its module file - Recorded Base Offset`. That is almost all there is to how we handle source location offsets in the serializers.

# The high level design of current patch

At an abstract level, what we want to do is remove the hardcoded base offset of imported modules while retaining the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of that source location.

So in this patch, for each source location, we store the local offset of the location and the module file index. For the above example, in the existing design the source location of `S` is recorded in `b.pcm` as `135` directly. In the new design, the source location of `S` is recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`, and `45` is the offset of `S` relative to the base offset of module `a`.

So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. I feel it is worth it. The recompilation problem of modules is really annoying and people still complain about it. If we can achieve this (including stopping other changes from propagating transitively), I think it may be a killer feature for modules. And according to @Bigcheese, this should be helpful for clang explicit modules too.
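The offset arithmetic described above can be checked with a small sketch. The names `translateRecordedOffset`, `ModuleRelativeLoc`, and `resolve` are hypothetical and are not Clang's serialization API; the numbers (45, 90, 135, 200) are the ones used in the commit message, and the per-index base table is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Status quo: an imported BMI stores an absolute offset together with the
// base offset it was recorded against. To use it in a new module unit we
// rebase it: Recorded Offset + Current Base - Recorded Base.
inline std::uint32_t translateRecordedOffset(std::uint32_t recordedOffset,
                                             std::uint32_t recordedBase,
                                             std::uint32_t currentBase) {
  assert(recordedOffset >= recordedBase && "offset must lie in the module's range");
  return recordedOffset - recordedBase + currentBase;
}

// New design (hypothetical representation): a location is stored as
// <module file index, local offset>, so no base offset is baked into the BMI.
struct ModuleRelativeLoc {
  std::uint32_t moduleFileIndex; // 1 stands for module `a` in the example
  std::uint32_t localOffset;     // offset relative to the module's base
};

// Resolve a module-relative location against the base offsets that the
// current translation unit allocated for each imported module file.
inline std::uint32_t resolve(ModuleRelativeLoc loc,
                             const std::uint32_t *baseByModuleIndex) {
  return baseByModuleIndex[loc.moduleFileIndex] + loc.localOffset;
}
```

With the numbers from the text, `translateRecordedOffset(135, 90, 200)` gives 245. The pair `<1, 45>` resolves to 135 when the base for index 1 is 90 (as in `b`) and to 245 when it is 200 (as in `c`), without `b.pcm` having to store 90 anywhere.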
On the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. There was no significant change in compilation time. The size of the .pcm files grew from 200M to 204M. I think the trade-off is pretty fair.

# Some low level details

I didn't use another slot to record the module file index. Instead, I use the higher 32 bits of the existing source location encoding to store that information. This design should be safe: we use `unsigned` to store source locations but `uint64_t` in serialization, and `unsigned` is generally 32 bits wide on most platforms, so the bits used to store the module file index were not used before. The new encoding is:

```
|-----------------------|-----------------------|
|           A           |         B         | C |

* A: 32 bit. The index of the module file in the module manager + 1. The +1 here is necessary since we wish 0 stands for the current module file.
* B: 31 bit. The offset of the source location to the module file containing it.
* C: The macro bit. We rotate it to the lowest bit so that we can save some space in case the index of the module file is 0.
```

(B and C are the existing raw encoding for source locations.)

Another reason to reuse the same slot for the source location is to reduce the impact of the patch. A lot of places assume we can store and load a source location from a single slot, and when I tried to add another slot, a lot of code broke. I don't feel that is worth it.

Another impact of this decision is that the existing small optimizations for encoding source locations may be invalidated. The key to those optimizations is turning large values into small values so that the VBR6 format can reduce the size. But if we put the module file index into the higher bits, that may simply no longer work. One example is the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files.
I don't expect this to impact the speed or memory use of compilations. And given my small experiments above, I feel this trade-off is worth it.

# Correctness

The mental model for handling source location offsets is not so complex, and I believe we can solve it by adding the module file index to each stored source location. On the practical side, since source locations are pretty sensitive and the patch passes all the in-tree tests and a small-scale project, I feel it should be correct.

# Future Plans

I'll continue to work on no transitive decl change and no transitive identifier change (if it matters) to achieve the goal of stopping the propagation of unnecessary changes. But all of this depends on this patch, since the source locations are clearly the most sensitive thing.

---

The release notes and documentation will be added separately.
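To make the A/B/C layout from the "low level details" section concrete, here is a minimal sketch of packing and unpacking such a 64-bit value. `DecodedLoc`, `encodeLoc`, and `decodeLoc` are hypothetical names chosen for illustration; this is not the code from the patch, only one possible reading of the bit layout it describes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of the fields A, B, and C described above.
struct DecodedLoc {
  std::uint32_t moduleFileIndexPlusOne; // A: 0 means the current module file
  std::uint32_t offset;                 // B: 31-bit offset within that module file
  bool isMacro;                         // C: the macro bit
};

// Pack A into the upper 32 bits, B into bits 1..31, and C into bit 0.
inline std::uint64_t encodeLoc(const DecodedLoc &loc) {
  assert(loc.offset < (1u << 31) && "offset must fit in 31 bits");
  return (static_cast<std::uint64_t>(loc.moduleFileIndexPlusOne) << 32) |
         (static_cast<std::uint64_t>(loc.offset) << 1) |
         (loc.isMacro ? 1u : 0u);
}

// Reverse of encodeLoc.
inline DecodedLoc decodeLoc(std::uint64_t encoded) {
  DecodedLoc loc;
  loc.moduleFileIndexPlusOne = static_cast<std::uint32_t>(encoded >> 32);
  loc.offset = static_cast<std::uint32_t>((encoded >> 1) & 0x7FFFFFFFu);
  loc.isMacro = (encoded & 1u) != 0;
  return loc;
}
```

A non-macro location at offset 45 in the current module file (A = 0) encodes to the small value 90, which is the case where keeping the macro bit in the lowest position helps a variable-width format; a location from an imported module file sets bits above bit 31 and round-trips through `decodeLoc` unchanged.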
Commit 6c31104
[MLIR] Sprinkle extra asserts in OperationSupport.h (llvm#90465)
Should hopefully help shave some minutes off developer debugging time in the future.
Commit 2464c1c
[MLIR][LLVM] Have LLVM::AddressOfOp implement ConstantLike (llvm#90481)
For all intents and purposes llvm.mlir.addressof acts like a constant, and should be treated as such by passes. In particular, the operation should be propagated rather than passed whenever possible.
Commit 92ca6fc
[mlir][test] Add TD example for peel+vectorize (depthwise conv) (llvm#90200)
Adds an example that combines loop peeling and scalable vectorisation of `linalg.depthwise_conv_2d_nhwc_hwc`. This is similar to transform-op-peel-and-vectorize.mlir and is meant to demonstrate how to avoid masking when vectorising using scalable vectors.
Commit c9d92d2
[clang][Interp] Handle Shifts in OpenCL correctly
We need to adjust the RHS to account for the LHS bitwidth.
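As a rough illustration of what "adjusting the RHS" can mean, the sketch below assumes the OpenCL C rule that only the low log2(N) bits of the shift amount are used, where N is the bit width of the (promoted) left operand. The function names are hypothetical and this is not the interpreter code from the commit.

```cpp
#include <cstdint>

// Shift left with the amount reduced modulo the 32-bit width of the LHS,
// so an oversized amount is well defined instead of undefined behavior.
inline std::uint32_t openclShl32(std::uint32_t lhs, std::uint32_t rhs) {
  return lhs << (rhs & 31u);
}

// Same idea for a 64-bit LHS: only the low 6 bits of the amount are used.
inline std::uint64_t openclShl64(std::uint64_t lhs, std::uint64_t rhs) {
  return lhs << (rhs & 63u);
}
```

Under this assumption, shifting a 32-bit `1` by 33 behaves like shifting by 1, while the same amount applied to a 64-bit `1` is a genuine shift by 33, which is why the adjustment has to depend on the LHS width.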
Commit 74e65ee
Fix lock guards in PipePosix.cpp (llvm#90572)
The guard objects were created without a name, so they were destroyed immediately after creation.
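The pitfall is easy to reproduce outside lldb. The sketch below uses a hypothetical `TrackingMutex` (not a type from PipePosix.cpp) so that the lock state can be observed safely from a single thread; it only illustrates the general C++ lifetime rule, not the actual fix.

```cpp
#include <mutex>

// Hypothetical stand-in for a mutex that records whether it is held.
struct TrackingMutex {
  bool locked = false;
  void lock() { locked = true; }
  void unlock() { locked = false; }
};

// Bug pattern: the guard is an unnamed temporary, so it is destroyed (and the
// mutex released) at the end of the statement that creates it.
inline bool lockedAfterUnnamedGuard(TrackingMutex &m) {
  static_cast<void>(std::lock_guard<TrackingMutex>{m});
  return m.locked; // the "critical section" starts here, already unlocked
}

// Fix: naming the guard keeps it alive until the end of the enclosing scope.
inline bool lockedAfterNamedGuard(TrackingMutex &m) {
  std::lock_guard<TrackingMutex> guard(m);
  return m.locked; // still locked while `guard` is in scope
}
```

Here `lockedAfterUnnamedGuard` returns false and `lockedAfterNamedGuard` returns true, and in both cases the mutex is released again once the function returns.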
Commit 29dda26
[Clang][Sema] fix a bug on template partial specialization (llvm#89862)
Attempt to fix llvm#68885 (comment). Deduction of an NTTP whose type is `decltype(auto)` would create an implicit cast expression to a dependent type and make the type of the primary template definition (`InjectedClassNameSpecialization`) differ from that of its partial specialization. Prevent emitting the cast expression, so that clang knows their types are identical, by removing `CTAK == CTAK_Deduced` when the type is `decltype(auto)`. Co-authored-by: huqizhi <[email protected]>
Commit eaee8aa
[Clang][Sema] Fix a bug on template partial specialization with issue on deduction of nontype template parameter (llvm#90376)
Fix llvm#68885
When building an expression from a deduced argument whose kind is `Declaration` and whose `NTTPType` (declared as `decltype(auto)`) is deduced as a reference type, `BuildExpressionFromDeclTemplateArgument` just creates a `DeclRef`. This is incorrect when we get the type from the expression, since we can't get the original reference type from a `DeclRef`. Create a `SubstNonTypeTemplateParmExpr` expression instead to make the deduction correct. The `Replacement` expression of `SubstNonTypeTemplateParmExpr` just helps the deduction and may not be the same as the original expression. Co-authored-by: huqizhi <[email protected]>
Commit a413c56
[PAC][lldb][Dwarf] Support `__ptrauth`-qualified types in user expressions (llvm#84387)
Depends on llvm#84384 and llvm#90329
This adds support for `DW_TAG_LLVM_ptrauth_type` entries corresponding to explicitly signed types (e.g. free function pointers) in lldb user expressions. Applies PR swiftlang#8239 from Apple's downstream and also adds tests and related code.
---------
Co-authored-by: Jonas Devlieghere <[email protected]>
Commit 64248d7
Commit f78949a
[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (llv…
Commit 7ac1fb0
[C++20] [Modules] Add signature to the BMI recording export imported modules
After llvm#86912, for the following example,

```
export module A;
export import B;
```

the generated BMI of `A` won't change if the source location in `A` changes. Further, we plan to avoid more such changes. However, this is slightly problematic since `export import` should propagate all the changes. So this patch adds a signature to the BMI of C++20 modules so that we can propagate the changes correctly.
Commit b2b463b
[NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to generate module file for C++20 modules instead of PCHGenerator (llvm#90570)
Previously we were re-using PCHGenerator to generate the module file for C++20 modules, which is slightly odd. This patch uses a new class, 'CXX20ModulesGenerator', to generate the module file for C++20 modules.
Commit fce0916
Commit 21f8ced
[NFC] [tests] Don't try to remove and create the same directory
In the test of clang/test/Modules/no-transitive-source-location-change.cppm, there were reports about invalid directory names in windows. The reason may be that we may remove and create the same directory. This patch tries to avoid such patterns for that.
Commit 10aab63
[offload] Fix missing reference decrement introduced by merge resolution
Added line which has been dropped from the 'deinitRuntime()' during merge-conflict resolution. Change-Id: Iee2c8b2fe63d8cd36cdb9befca2e8c93384087d9
Commit 79ca523
[Clang][Sema] Do not accept "vector _Complex" for AltiVec/ZVector (llvm#90467)
The AltiVec (POWER) and ZVector (IBM Z) language extensions do not support using the "vector" keyword when the element type is a complex type, but current code does not verify this. Add a Sema check and diagnostic for this case. Fixes: llvm#88399
Commit f73e87f
[AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (llvm#90201)
image_msaa_load is actually encoded as a VSAMPLE instruction and requires the appropriate waitcnt variant.
Commit 62dea99
Fix output in coro-elide-thinlto.cpp (llvm#90579)
Current dir can be read-only. Use a temp path instead.
Commit fb2d305
Commit f10685f
Commit 0061616
Commit 3fca9d7
[X86] combineMulToPMADDWD/combineMulToPMULDQ/reduceVMULWidth - pull o…
…ut repeated SDLoc(). NFC.
Commit 066dc1e
[X86] Add TODO for getTargetConstantFromBasePtr to support non-zero o…
…ffsets. As noted on llvm#66991 - we sometimes share vector constant pool entries, referencing subvectors within them via pointer offsets
Commit 2cb97c7
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…
…g/ec/llvm-project into amd-staging
Commit 1646003
[InstCombine] Fold `trunc nuw/nsw (x xor y) to i1` to `x != y` (llvm#90408)
Fold:

```llvm
define i1 @src(i8 %x, i8 %y) {
  %xor = xor i8 %x, %y
  %r = trunc nuw/nsw i8 %xor to i1
  ret i1 %r
}

define i1 @tgt(i8 %x, i8 %y) {
  %r = icmp ne i8 %x, %y
  ret i1 %r
}
```

Proof: https://alive2.llvm.org/ce/z/dcuHmn
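As an informal sanity check of this fold (it does not replace the Alive2 proof), the Python sketch below brute-forces every pair of i8 values. It assumes the usual reading of the flags on `trunc`: with `nuw` the dropped bits must be zero, and with `nsw` the dropped bits must all equal the sign bit of the result. The helper name `check_fold` is hypothetical.

```python
def check_fold(bits=8):
    """Brute-force check that trunc-to-i1 of (x ^ y) equals (x != y)
    whenever the nuw or nsw precondition holds (i.e. no poison)."""
    mask = (1 << bits) - 1
    checked = 0
    for x in range(1 << bits):
        for y in range(1 << bits):
            xor = (x ^ y) & mask
            # nuw: all truncated (upper) bits are zero, so xor is 0 or 1.
            nuw_ok = xor in (0, 1)
            # nsw: all truncated bits equal the i1 result bit,
            # so xor is 0 or all-ones (-1 as a signed value).
            nsw_ok = xor in (0, mask)
            if nuw_ok or nsw_ok:
                truncated = xor & 1           # trunc ... to i1
                assert truncated == int(x != y)
                checked += 1
    return checked


if __name__ == "__main__":
    print(check_fold(), "non-poison input pairs checked")
```

Inputs that violate the flag's precondition are skipped because the source function yields poison there, so the fold is free to return anything.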
Commit 34c89ef
Commit 66e1d2c
[RISCV] Remove -riscv-insert-vsetvl-strict-asserts flag (llvm#90171)
This flag has been enabled by default for almost two years now since 1f06398, and at this stage we probably shouldn't be falling back to the fixups. This removes the flag so we always perform the assertion, as well as making sure that CurInfo is always valid on exit: We shouldn't leave emitVSETVLIs with an uninitialized VSETVLIInfo.
Commit 7faf343
Commit af5d41e
Commit 2f9462e
[NFC][Clang] Update P2718R0 implementation status to partial supported (
llvm#90577) Once llvm#85613 is fixed, we can mark this feature as fully supported. Signed-off-by: yronglin <[email protected]>
Commit 6fab3f2
[LTO] Reset DiscardValueNames in optimize(). (llvm#78705)
libLTO parses options late, so at the moment the option is ignored. To fix that, re-set it in optimize(), as at this point the options have been parsed. When LTOCodeGenerator's constructor executes, the options haven't been parsed by the linker to libLTO yet. Note that we keep the value name of `%add = add..` because when the module is imported, DiscardValueNames is still set to false (the default when building with assertions). I tried to improve this in libLTO, but I am not sure if there's a suitable callback when all options have been set. PR: llvm#78705
Commit f3ac55f
Commit bb95f5d
Commit 5cd074f
[LAA] Pass maximum stride to isSafeDependenceDistance. (llvm#90036)
As discussed in llvm#88039, support different strides with isSafeDependenceDistance by passing the maximum of both strides. isSafeDependenceDistance tries to prove that |Dist| > BackedgeTakenCount * Step holds. Choosing the maximum stride computes the maximum range accessed by the loop for all strides. PR: llvm#90036
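The inequality above can be illustrated with plain integers. This is only a sketch: the real analysis reasons about SCEV expressions, and the function name, parameter names, and the assumption that all quantities share one unit are hypothetical.

```python
def is_safe_dependence_distance(dist, backedge_taken_count, stride_a, stride_b):
    """Return True if |dist| > backedge_taken_count * step, where step is
    the larger of the two strides (all values assumed to be in one unit)."""
    # Using the maximum stride bounds the range covered by either access.
    step = max(abs(stride_a), abs(stride_b))
    return abs(dist) > backedge_taken_count * step


if __name__ == "__main__":
    # The larger stride (8) decides the outcome: 9 * 8 = 72.
    print(is_safe_dependence_distance(100, 9, 4, 8))  # distance exceeds the range
    print(is_safe_dependence_distance(64, 9, 4, 8))   # distance is inside the range
```

Taking the smaller stride instead would accept the second case (64 > 36), which is why the conservative choice is the maximum.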
Commit 82219e5
[DAGCombiner] Fix mayAlias not accounting for scalable MMOs with offs…
…ets (llvm#90573) In llvm#70452 DAGCombiner::mayAlias was taught to handle scalable sizes, but when it checks via AA->isNoAlias it didn't take into account the case where the size is scalable but there was an offset too. For the fixed length case the offset was just accounted for by adding to the LocationSize, but for the scalable case there doesn't seem to be a way to represent both a scalable and fixed part in it. So this patch works around it by bailing if there is an offset. Fixes llvm#90559
Commit 5e03c0a
[AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)
Thanks to ExtensionSet::toLLVMFeatureList, all values of ArchExtKind should correspond to a particular -target-feature. The valid values of -target-feature are in turn defined by SubtargetFeature defs. Therefore we can generate ArchExtKind from the tablegen data. This is done by adding an Extension class which derives from SubtargetFeature. Because the Has* FieldNames do not always correspond to the AEK_ names ("extensions", as defined in TargetParser), and AEK_ names do not always correspond to -march strings, some additional enum entries have been added to remap the names. I have renamed these to make the naming consistent, but split them into a separate PR to keep the diff reasonable (llvm#90320)
Commit 61b2a0e
Change-Id: I95739002226a44f9c97a6b2ea2e349ec57b7a9f1
Jenkins committed Apr 30, 2024
Commit c4fa736
Commit 1c17252
Commit e50a857
Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summari…
…es (llvm#90497) GUIDs often have content in the higher bits of a 64-bit entry, so using the unabbreviated encoding is inefficient (lots of VBR control bits). Instead, use an abbrev with two 32-bit fixed-width chunks. The abbrev also helps encode the "count" in one place instead of in every record. Reduces the size of distributed backend summary files by 8.7% in one example app. Co-authored-by: Jan Voung <[email protected]>
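To make the size argument concrete, here is a small Python sketch that counts bits for one 64-bit value. It assumes unabbreviated record operands are emitted as VBR6 (5 payload bits plus 1 continuation bit per chunk); the function names and the high/low ordering of the two 32-bit chunks are hypothetical, chosen only for illustration.

```python
def vbr_bits(value, width=6):
    """Number of bits needed to emit `value` as a VBR with `width`-bit chunks.
    Each chunk carries (width - 1) payload bits and one continuation bit."""
    payload = width - 1
    chunks = 1
    value >>= payload
    while value:
        chunks += 1
        value >>= payload
    return chunks * width


def split_guid(guid):
    """Split a 64-bit GUID into two 32-bit fixed-width chunks (high, low)."""
    return (guid >> 32) & 0xFFFFFFFF, guid & 0xFFFFFFFF


if __name__ == "__main__":
    guid = 0xF123456789ABCDEF  # high bits populated, as is typical for a hash
    print("VBR6 bits: ", vbr_bits(guid))
    print("fixed bits:", 2 * 32)
    print("chunks:    ", [hex(c) for c in split_guid(guid)])
```

A value with its top bits set needs 13 VBR6 chunks (78 bits), while the two fixed-width fields always take 64 bits.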
Commit adabdc1
Commit e4c0f4a
Commit 7ae32bf
Revert "[AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)"
This reverts commit 61b2a0e. Reason: AArch64TargetParserDef.inc not found while building clang
Commit 35e6bae
Commit b60a2b9
Revert "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO…
… summaries" (llvm#90610) Reverts llvm#90497 Broke some LLD tests.
Commit 2aabfc8
Commit c106abf
Do not use R12 for indirect tail calls with PACBTI (llvm#82661)
When compiling for thumbv8.1m with +pacbti and making an indirect tail call, the compiler was free to put the function pointer into R12. This is incorrect because R12 is restored to contain authentication code for the caller's return address. This patch excludes R12 from the set of registers the compiler can put the function pointer in. Fixes llvm#75998
Commit c12bc57
Revert "[Modules] No transitive source location change (llvm#86912)"
This reverts commit 6c31104. Required by the post commit comments: llvm#86912
Commit d333a0d
Commit 8d28e58
Adding memref normalization of affine.prefetch (llvm#89675)
Added support for memref-normalization for prefetch. Signed-off-by: Alexandre Eichenberger <[email protected]>
Commit a7b968a
Commit ea81daf
Commit 622ec1f
[SystemZ] Enable MachineCombiner for FP reassociation (llvm#83546)
Enable MachineCombining for FP add, sub and mul. In order for this to work, the default instruction selection of reg/mem opcodes is disabled for ISD nodes that carry the flags that allow reassociation. The reg/mem folding is instead done after MachineCombiner by PeepholeOptimizer. SystemZInstrInfo optimizeLoadInstr() and foldMemoryOperandImpl() ("LoadMI version") have been implemented for this purpose also by this patch.
Commit 6c32a1f
[RISCV] Use consume_front to parse rv32/rv64 in RISCVISAInfo::parse*A…
…rchString. NFC (llvm#90562) This replaces some starts_with calls with consume_front. This allows us to remove a later assumption that the prefix was 4 characters. We would eventually need to fix this anyway if we ever support rv128. Noticed while reviewing the RISCVISAInfo code for other reasons.
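The difference between the two approaches can be sketched in Python. This is an illustration only, not the RISCVISAInfo code: `consume_front` here mimics the behaviour of `StringRef::consume_front` (test the prefix and strip it in one step), and `parse_xlen` is a hypothetical helper.

```python
def consume_front(s, prefix):
    """Mimic StringRef::consume_front: if `s` starts with `prefix`,
    return (True, remainder); otherwise return (False, s) unchanged."""
    if s.startswith(prefix):
        return True, s[len(prefix):]
    return False, s


def parse_xlen(arch):
    """Hypothetical helper: strip the rv32/rv64 prefix and return
    (xlen, remaining extension string)."""
    for prefix, xlen in (("rv32", 32), ("rv64", 64)):
        matched, rest = consume_front(arch, prefix)
        if matched:
            # No hard-coded "skip 4 characters" step is needed here,
            # so a longer prefix such as "rv128" would work the same way.
            return xlen, rest
    raise ValueError(f"arch string must begin with rv32 or rv64: {arch!r}")


if __name__ == "__main__":
    print(parse_xlen("rv64imac"))
```

Because the prefix is removed at the point where it is matched, later code never has to know how long it was.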
Commit 1b942ae
[flang][cuda] Fix iv store in cuf kernel (llvm#90551)
Store of the current induction value to the user IV was not placed correctly in the body of the cuf kernel. @ImanHosseini
Commit f815d1f
[flang][cuda] Add fir.cuda_alloc/fir.cuda_free operations (llvm#90525)
This patch introduces fir.cuda_alloc/fir.cuda_free. These operations will be used instead of fir.alloca for local CUDA device, managed and unified variables.
Commit a9c73f6
MachineLICM: Remove unnecessary isReg checks
COPY operands are always registers.
Commit 114a59d
[OpenACC] Fix ast-print for OpenACC Clauses
Previously we weren't printing expressions correctly, so this patch adds a test to ensure we do, and fixes how expressions are printed.
Commit cc6113d
Revert "[BOLT] Avoid reference updates for non-JT symbol operands (ll…
…vm#88838)" This reverts commit 9d5411f. Breaks aarch64 buildbot: https://lab.llvm.org/buildbot/#/builders/221/builds/22130
Commit 721c31e
[AMDGPU] Emit s_singleuse_vdst instructions when a register is used m…
…ultiple times in the same instruction. (llvm#89601) Previously, multiple uses of a register within the same instruction were being counted as multiple uses. This has been corrected to only count as a single use as per the specification allowing for more optimisation candidates.
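The counting change can be pictured with a toy model in Python. This is a sketch under simplifying assumptions, not the AMDGPU pass: instructions are modelled as tuples of source register names, and both function names are hypothetical.

```python
from collections import Counter


def count_uses_per_operand(instructions):
    """Old behaviour as described above: every operand occurrence counts."""
    uses = Counter()
    for operands in instructions:
        for reg in operands:
            uses[reg] += 1
    return uses


def count_uses_per_instruction(instructions):
    """Corrected behaviour: a register counts once per instruction,
    no matter how many operand slots it occupies."""
    uses = Counter()
    for operands in instructions:
        for reg in set(operands):
            uses[reg] += 1
    return uses


if __name__ == "__main__":
    # v2 appears twice, but only within a single instruction.
    program = [("v2", "v2"), ("v3",)]
    print(count_uses_per_operand(program))
    print(count_uses_per_instruction(program))
```

Under the per-instruction count, `v2` has a single use and therefore becomes a candidate, which matches the "more optimisation candidates" remark in the commit message.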
Commit d97f25b
[flang][OpenMP] ensure we hit the TODO for intrinsic array reduction (l…
…lvm#90593) Before this patch we crashed when lowering intrinsic array reductions. I think this was lost during a rebase. I've added a test to make sure it doesn't break again. Also fixed the TODO message to be more accurate.
Commit 5ada328
[flang] Adapt PolymorphicOpConversion to run on all top level ops (ll…
…vm#90597) We might use polymorphic ops in top-level operations other than functions some time in the future. We need to ensure that these operations can be lowered. See RFC: https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations Some of the changes are from moving declaration and definition of the constructor function into tablegen (as requested in code review when altering another pass).
Commit df513f8
[VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (llvm#9…
…0502) This intrinsic is the VP version of `experimental.cttz.elts`.
Commit 539f626
[MLIR] Generalize expand_shape to take shape as explicit input (llvm#…
…90040) This patch generalizes tensor.expand_shape and memref.expand_shape to consume the output shape as a list of SSA values. This enables us to implement generic reshape operations with dynamic shapes using collapse_shape/expand_shape pairs. The output_shape input to expand_shape follows the static/dynamic representation that's also used in `tensor.extract_slice`. Differential Revision: https://reviews.llvm.org/D140821 --------- Signed-off-by: Gaurav Shukla<[email protected]> Signed-off-by: Gaurav Shukla <[email protected]> Co-authored-by: Ramiro Leal-Cavazos <[email protected]>
Commit 97069a8
Commit e9305fc
[X86] Add icmp i16 test coverage
Based on llvm#90355 - add basic tests for the cases where i16 comparisons are extended to i32
Commit 38c68e0
[DAG] Pull out repeated SDLoc() from SHL/SRL/SRA combines. NFC.
We were always calling SDLoc(N) at the top of each visitSHL/SRL/SRA for the FoldConstantArithmetic call, so just reuse this as much as possible.
Commit 91c52b9
Commit fbe8d2a
Commit 554be97
[flang][OpenMP] Pass symTable to all genXYZ functions, NFC (llvm#90090)
This will unify the interface a bit more.
Commit 33ccd03
[NFC][OpenMP][MI300] Refactoring of the checkIfAPU() method in prepar…
…ation of an upstream patch. This patch refactors the checkIfAPU method. The revised checkIfAPU() method, using the HSA symbols HSA_AGENT_INFO_AMD_MEMORY_PROPERTIES and HSA_AMD_MEMORY_PROPERTY_AGENT_IS_APU, will be upstreamed. This patch reduces merge conflicts with the upstream method, as the detection of the GFX90a and MI300x is moved to separate methods. As such, the downstream method can be replaced by the upstream implementation. Change-Id: Id10605e7ea2248538f26ebc717341b1735495a01
Commit 3751ac4
Commit 4631e7b
[LegalizeDAG] Simplify interface to PromoteReduction. NFC
Return an SDValue instead of pushing to the Results vector. Let the caller do the push.
Commit 267329d
[VP] Fix unit test failures caused by llvm#90502
Forgot to add vp.cttz.elts into the unittest. Also, I didn't specify the positions of overloaded type parameters.
Commit 6ab49fc
Commit 4cd11c9
[mlir][sparse] handle padding on sparse levels. (llvm#90527)
Peiming Liu authored Apr 30, 2024
Commit dbe3766
[MLIR][Arith] expand-ops: Support mini/maxi (llvm#90575)
Expand `arith.minsi`, `arith.minui`, `arith.maxsi`, `arith.maxui` into `arith.cmpi` and `arith.select`. --------- Co-authored-by: Jakub Kuderski <[email protected]>
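The shape of this expansion can be sketched in Python on fixed-width bit patterns. This is only a model: the helper names are hypothetical, and the exact comparison predicates the pass picks are an assumption (`slt`/`ult` for min, `sgt`/`ugt` for max).

```python
def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def select(cond, a, b):
    """Model of arith.select: pick `a` when cond is true, else `b`."""
    return a if cond else b


def minsi(a, b, bits=8):
    # cmpi slt + select
    return select(to_signed(a, bits) < to_signed(b, bits), a, b)


def minui(a, b, bits=8):
    # cmpi ult + select (bit patterns are already unsigned here)
    return select(a < b, a, b)


def maxsi(a, b, bits=8):
    # cmpi sgt + select
    return select(to_signed(a, bits) > to_signed(b, bits), a, b)


def maxui(a, b, bits=8):
    # cmpi ugt + select
    return select(a > b, a, b)


if __name__ == "__main__":
    a, b = 0xFF, 0x01  # 0xFF is -1 when read as a signed i8
    print(hex(minsi(a, b)), hex(minui(a, b)), hex(maxsi(a, b)), hex(maxui(a, b)))
```

The same bit pattern `0xFF` is the minimum under the signed comparison and the maximum under the unsigned one, which is why four separate operations exist.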
Commit 30badf9
[LangRef] Try to clarify mustprogress wording. (llvm#90510)
Ensure it's clear that: - Infinite loops in non-mustprogress functions are well-defined, even if they're called by mustprogress functions. - Infinite recursion in mustprogress functions is not well-defined. Looking at D86233, it's clear this was the intent, but the "transitive" wording is ambiguous. Instead, just explicitly state that infinite loops written in non-mustprogress functions count as progress.
Commit 600cae7
[libc][stdfix] Fix overflow problem for fixed point sqrt when the inp…
…uts are close to max. (llvm#90558) Fixes llvm#89668
Commit 7dd4ce4
[libc++][NFC] Fixes a status page note and a minor copy & paste error…
… in a test (llvm#90399) - Adds a status page note for P3142R0 - Fixes a copy & paste error in tuple protocol for `complex`
Commit 9af7f40
Commit a754ce0
[RISCV] Handle fixed length vectors with exact VLEN in lowerINSERT_SU…
…BVECTOR (llvm#84107) This is the insert_subvector equivalent to llvm#79949, where we can avoid sliding up by the full LMUL amount if we know the exact subregister the subvector will be inserted into. This mirrors the lowerEXTRACT_SUBVECTOR changes in that we handle this in two parts: - We handle fixed length subvector types by converting the subvector to a scalable vector. But unlike EXTRACT_SUBVECTOR, we may also need to convert the vector being inserted into. - Whenever we don't need a vslideup because either the subvector fits exactly into a vector register group *or* the vector is undef, we need to emit an insert_subreg ourselves because RISCVISelDAGToDAG::Select doesn't correctly handle fixed length subvectors yet: see d7a28f7 A subvector exactly fits into a vector register group if its size is a known multiple of the size of a vector register, and this adds a new overload for TypeSize::isKnownMultipleOf for scalable to scalable comparisons to help reason about this. I've left RISCVISelDAGToDAG::Select untouched for now (minus relaxing an invariant), so that the insert_subvector and extract_subvector code paths are the same. We should teach it to properly handle fixed length subvectors in a follow-up patch, so that the "exact subregister" logic is handled in one place instead of being spread across both RISCVISelDAGToDAG.cpp and RISCVISelLowering.cpp.
Commit f565b79
Commit f0cc373
[libc++] Some tests are missing include for `numeric_limits` (llvm#90345)
Noticed while attempting microsoft/STL#4634
Commit 40083cf
[lldb] Support custom LLVM formatting for variables (llvm#81196)
Adds support for applying LLVM formatting to variables. The reason for this is to support cases such as the following. Let's say you have two separate bytes that you want to print as a combined hex value. Consider the following summary string: ``` ${var.byte1%x}${var.byte2%x} ``` The output of this will be: `0x120x34`. That is, a `0x` prefix is unconditionally applied to each byte. This is unlike printf formatting where you must include the `0x` yourself. Currently, there's no way to do this with summary strings, instead you'll need a summary provider in python or c++. This change introduces formatting support using LLVM's formatter system. This allows users to achieve the desired custom formatting using: ``` ${var.byte1:x-}${var.byte2:x-} ``` Here, each variable is suffixed with `:x-`. This is passed to the LLVM formatter as `{0:x-}`. For integer values, `x` declares the output as hex, and `-` declares that no `0x` prefix is to be used. Further, one could write: ``` ${var.byte1:x-2}${var.byte2:x-2} ``` Where the added `2` results in these bytes being written with a minimum of 2 digits. An alternative considered was to add a new format specifier that would print hex values without the `0x` prefix. The reason that approach was not taken is because in addition to forcing a `0x` prefix, hex values are also forced to use leading zeros. This approach lets the user have full control over formatting.
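The effect of the two styles can be mimicked in plain Python. This is an emulation for illustration, not LLDB code: the function names are hypothetical, and the byte values 0x12 and 0x34 are taken from the `0x120x34` output quoted above.

```python
def percent_x_style(*values):
    """Emulates `${var.byte%x}`: each value gets an unconditional 0x prefix
    (two hex digits assumed here because the variables are single bytes)."""
    return "".join(f"0x{v:02x}" for v in values)


def llvm_x_dash_style(*values, min_digits=0):
    """Emulates `${var.byte:x-}` / `${var.byte:x-N}`: lowercase hex,
    no prefix, padded with zeros to at least `min_digits` digits."""
    return "".join(f"{v:x}".zfill(min_digits) for v in values)


if __name__ == "__main__":
    byte1, byte2 = 0x12, 0x34
    print(percent_x_style(byte1, byte2))                  # prefix on each byte
    print(llvm_x_dash_style(byte1, byte2))                # combined hex value
    print(llvm_x_dash_style(0x1, 0x2, min_digits=2))      # zero-padded bytes
```

The minimum-width variant matters when a byte is below 0x10: without padding, 0x01 and 0x02 would print as `12` and be indistinguishable from the single byte 0x12.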
Commit 7a8d15e
[BOLT] Fix build-time assertion in RewriteInstance (llvm#90540)
We use pwrite() in RewriteInstance to update contents of existing sections. pwrite() requires file position to be set past the written offset which we guarantee at the start of rewriteFile(). Then we had an implicit assumption in patchBuildID() that the file position will be set again in patchELFSymTabs() after being reset in patchELFPHDRTable(). That assumption was broken in llvm#90300. The fix is to save and restore file position in patchELFPHDRTable(). Then we don't have to update it again in patchELFSymTabs().
Commit 49bb993
[mlir][NFC] update code to use `mlir::dyn_cast/cast/isa` (llvm#90633)
Fix compiler warning caused by using deprecated interface (llvm#90413)
Peiming Liu authored Apr 30, 2024
Commit d235369
[WebAssembly] Add preprocessor define for half-precision (llvm#90528)
This adds the preprocessor define for the half-precision feature and also adds preprocessor tests.
Commit 7662f95
[Clang][Sema][Parse] Delay parsing of noexcept-specifiers in friend f…
…unction declarations (llvm#90517) According to [class.mem.general] p8: > A complete-class context of a class (template) is a > - function body, > - default argument, > - default template argument, > - _noexcept-specifier_, or > - default member initializer > > within the member-specification of the class or class template. When testing llvm#90152, it came to my attention that we do _not_ consider the _noexcept-specifier_ of a friend function declaration to be a complete-class context (something which the Microsoft standard library depends on). Although a comment states that this is "consistent with what other implementations do", the only other implementation that exhibits this behavior is GCC (MSVC and EDG both late-parse the _noexcept-specifier_). This patch changes _noexcept-specifiers_ of friend function declarations to be late parsed, which is in agreement with the standard & majority of implementations. Pre-llvm#90152, our existing implementation falls "in between" the implementation consensus: within non-template classes, we would not find later-declared members (qualified and unqualified), while within class templates we would not find later-declared members when named with an unqualified name, but we would find members named with a qualified name (even when the lookup context is the current instantiation). Therefore, this _shouldn't_ be a breaking change -- any code that didn't compile will continue to not compile (since a _noexcept-specifier_ is not part of the deduction substitution loci (see [temp.deduct.general] p7)), and any code which did compile should continue to do so.
Commit f061a39
Reapply "[Clang][Sema] Diagnose class member access expressions namin…
…g non-existent members of the current instantiation prior to instantiation in the absence of dependent base classes (llvm#84050)" (llvm#90152) Reapplies llvm#84050, addressing a bug which causes a crash when an expression with the type of the current instantiation is used as the _postfix-expression_ in a class member access expression (arrow form).
Commit 8009bbe
[OpenACC] Private Clause on Compute Constructs (llvm#90521)
The private clause is the first that takes a 'var-list', thus this has a lot of additional work to enable the var-list type. A 'var' is a traditional variable reference, subscript, member-expression, or array-section, so checking of these is pretty minor. Note: This ran into some issues with array-sections (aka sub-arrays) that will be fixed in a follow-up patch.
Commit fa67986
[GVNSink] Fix incorrect codegen with respect to GEPs llvm#85333 (llvm…
…#88440) As mentioned in llvm#68882 and https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699 Gep arithmetic isn't consistent with different types. GVNSink didn't realize this and sank all geps as long as their operands can be wired via PHIs in a post-dominator. Fixes: llvm#85333
Commit 1c979ab
[libc++][ranges] Implement LWG4053 and LWG4054 (llvm#88612)
Implement - LWG4053 Unary call to `std::views::repeat` does not decay the argument - LWG4054 Repeating a `repeat_view` should repeat the view Signed-off-by: yronglin <[email protected]>
Commit 0ecc164
[OpenACC] Fix test failure from fa67986
Seemingly some other patch went in that altered how much dependence was printed vs the actual names, and it changed the ast-dump results. Commit to fix this test.
Commit 41f9c78
[mlir][sparse] fix sparse tests that uses reshape operations. (llvm#9…
…0637) Due to generalization introduced in llvm#90040
Peiming Liu authored Apr 30, 2024
Commit 7cbaaed
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…
…g/ec/llvm-project into amd-staging
Commit 824380f
Commit 52cb953
This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const unsigned int' and 'const int' [-Werror,-Wsign-compare]
Commit 5f88f0c
Commit 9b07a03
[IR] Use StringRef::operator== instead of StringRef::equals (NFC) (ll…
…vm#90550) I'm planning to remove StringRef::equals in favor of StringRef::operator==.
- StringRef::operator== outnumbers StringRef::equals by a factor of 22 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".
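The readability point can be illustrated with `std::string_view`, which the message names as the model for `StringRef`. This is a minimal sketch using only the standard library; `isFoo` and `isNotStr` are hypothetical helper names, not LLVM APIs.

```cpp
#include <string_view>

// Hypothetical helpers; std::string_view stands in for llvm::StringRef here.
// Equality through operator== reads like an ordinary value comparison.
inline bool isFoo(std::string_view S) { return S == "foo"; }

// The negated form is where the operator spelling helps most: compare this
// with the member-function spelling !Long.Expression.equals("str").
inline bool isNotStr(std::string_view S) { return S != "str"; }
```

Calling `isFoo("foo")` yields true and `isNotStr("str")` yields false; no `equals` member is needed for either.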
Commit 4e6f6fd
[mlir][tensor] Fix integration tests that uses reshape ops. (llvm#90649)
Due to generalization introduced in llvm#90040
Commit a1423ba
Revert "[GVNSink] Fix incorrect codegen with respect to GEPs llvm#85333…
…" (llvm#90658) Reverts llvm#88440

Test failing on Windows: https://lab.llvm.org/buildbot/#/builders/233/builds/9396
```
Input file: <stdin>
# | Check file: C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\test\Transforms\GVNSink\different-gep-types.ll
# |
# | -dump-input=help explains the following input dump.
# |
# | Input was:
# | <<<<<<
# | .
# | .
# | .
# | 42: br label %if.end6
# | 43:
# | 44: if.else5: ; preds = %if.else
# | 45: br label %if.end6
# | 46:
# | 47: if.end6: ; preds = %if.else5, %if.then3, %if.then
# | next:67'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# | next:67'1 with "IF_THEN" equal to "%if\\.then"
# | next:67'2 with "IF_THEN3" equal to "%if\\.then3"
# | next:67'3 with "IF_ELSE5" equal to "%if\\.else5"
# | 48: %.sink1 = phi i32 [ -8, %if.then3 ], [ -4, %if.else5 ], [ 8, %if.then ]
# | next:67'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:67'4 ? possible intended match
# | 49: %0 = load ptr, ptr %__i, align 4
# | next:67'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 50: %incdec.ptr4 = getelementptr inbounds i8, ptr %0, i32 %.sink1
# | next:67'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 51: store ptr %incdec.ptr4, ptr %__i, align 4
# | next:67'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | 52: ret void
# | next:67'0 ~~~~~~~~~~
# | 53: }
# | next:67'0 ~~
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1
```
Commit cf49d07
NFC add a new precommit test case for PPCMIpeephole (llvm#90656)
Add pre-commit MIR test for PR "[Promote Pseudo Opcode from 32-bit to 64-bit after eliminating the extsw instruction in PPCMIPeepholes optimization](llvm#85451)" which fixes bug reported in the issue "[Inconsistent Output at -O1 and -O2 Optimization Levels on PowerPC64 Due to Complex Type Casting and Nested Loop Structure](llvm#71030)".
Commit 70ada5b
[RISCV] Make RISCVISAInfo::updateMaxELen extension checking more robu…
…st. Add inference from V extension. (llvm#90650) We weren't fully checking that we parsed Zve*x/f/d correctly. This could break if a new extension is added that starts with Zve. We were assuming that Zve64d is present whenever V is, so we only inferred from Zve*. It's more correct to infer ELEN from V itself too.
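The stricter parsing described here could look roughly like the sketch below. It is not the code from `RISCVISAInfo::updateMaxELen`; `elenFromExtension` is a hypothetical helper that accepts only names of the exact form `zve<digits>` followed by one of `x`, `f`, or `d`, and treats `v` as implying an ELEN of 64.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: returns the ELEN implied by a single extension name,
// or std::nullopt when the name says nothing about ELEN.
inline std::optional<unsigned> elenFromExtension(std::string_view Ext) {
  if (Ext == "v")
    return 64u; // assumption: V implies ELEN = 64
  if (Ext.substr(0, 3) != "zve")
    return std::nullopt;
  Ext.remove_prefix(3);

  // Parse the decimal element width that must follow "zve".
  unsigned Bits = 0;
  std::size_t I = 0;
  while (I < Ext.size() && std::isdigit(static_cast<unsigned char>(Ext[I]))) {
    Bits = Bits * 10 + static_cast<unsigned>(Ext[I] - '0');
    ++I;
  }

  // Require at least one digit and exactly one trailing suffix character.
  if (I == 0 || I + 1 != Ext.size())
    return std::nullopt;
  char Suffix = Ext[I];
  if (Suffix != 'x' && Suffix != 'f' && Suffix != 'd')
    return std::nullopt;
  return Bits;
}
```

With this shape, a future name that merely starts with `zve` is rejected rather than misread, which is the robustness concern the message raises.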
Commit 05d04f0
[llvm][profdata][NFC] Support 64-bit weights in ProfDataUtils (llvm#8…
…6607) Since some places, like SimplifyCFG, work with 64-bit weights, we supply an API in ProfDataUtils to extract the weights accordingly. We change the API slightly to disambiguate the 64-bit version from the 32-bit version.
Commit 7538df9
[DFSan] Replace `cat` with `cmake -E cat` (llvm#90557)
`CMake` supports [this command](https://cmake.org/cmake/help/latest/manual/cmake.1.html#cmdoption-cmake-E-arg-cat) as of version 3.18. [D151344](https://reviews.llvm.org/D151344) bumped the minimum version to 3.20, so, it is now possible to remove the dependency on the external utility. This helps to cross-compile from Windows to Linux without installing additional tools, such as MSYS2.
Commit 2224dce
[OpenMP][AIX] Implement __kmp_is_address_mapped() for AIX (llvm#90516)
This patch implements `__kmp_is_address_mapped()` for AIX by calling `loadquery()` to get the load info of the process and then checking if the address falls within the range of the data segment of one of the loaded modules.
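The range check at the heart of this description can be sketched as follows. The sketch does not call `loadquery()`; it assumes the data-segment ranges have already been collected, and `DataSegment` and `isAddressMapped` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record for the data segment of one loaded module.
struct DataSegment {
  std::uintptr_t Begin; // first address of the segment
  std::uintptr_t Size;  // length of the segment in bytes
};

// Returns true if Addr lies inside the data segment of any listed module.
inline bool isAddressMapped(std::uintptr_t Addr,
                            const std::vector<DataSegment> &Segments) {
  for (const DataSegment &S : Segments)
    // Subtracting after the lower-bound check avoids overflow in Begin + Size.
    if (Addr >= S.Begin && Addr - S.Begin < S.Size)
      return true;
  return false;
}
```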
Commit 928db7e
SystemZ: Implement copyPhysReg between vr128 and gr128 (llvm#90616)
I have no idea if this is correct and I probably swapped the element ordering somewhere.
Commit 75f4baa
Commit 6992433
Commit 1fb5083
[BOLT] Add ORC validation for the Linux kernel (llvm#90660)
The Linux kernel expects ORC tables to be sorted by IP address (for binary search to work). Add a post-emit pass in LinuxKernelRewriter that validates the written .orc_unwind_ip against that expectation.
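The validation idea reduces to a sortedness check. A minimal sketch, assuming the `.orc_unwind_ip` entries have already been decoded into absolute addresses (the decoding step and the name `isSortedByIP` are not from the patch):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns true if the IP entries are in non-decreasing order, which is the
// precondition for a binary search over the table.
inline bool isSortedByIP(const std::vector<std::uint64_t> &IPs) {
  return std::is_sorted(IPs.begin(), IPs.end());
}
```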
Commit c665e49
[Coroutines][Test] Specify target triple in coro-elide-thinlto (llvm#…
…90549) Resolve test failure on non-x86 linux host
Commit 0232b77
[flang] Remove double pointer indirection for _QQEnvironmentDefaults (l…
…lvm#90615) A double pointer was being passed to the call to FortranStart rather than just a pointer to the EnvironmentDefaults.list. This now passes `null` directly when there's no EnvironmentDefaults.list and passes the list directly when there is, removing the original global variable which was a pointer to a pointer containing null or the EnvironmentDefaults.list global. Fixes llvm#90537
Commit ecec131
[GlobalISel] Fix store merging incorrectly classifying an unknown ind…
…ex expr as 0. (llvm#90375) During analysis, we incorrectly leave the offset part of an address info struct as zero, when in actual fact we failed to decompose it into base + offset. This results in incorrectly assuming that the address is adjacent to another store addr. To fix this we wrap the offset in an optional<> so we can distinguish between real zero and unknown. Fixes issue llvm#90242
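The difference between a real zero offset and an unknown one can be sketched with `std::optional`. This is a simplified illustration, not the GlobalISel code: `AddrInfo` and `isAdjacent` are hypothetical, and the base is reduced to an integer id.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified address record: a base id plus an optional offset.
struct AddrInfo {
  int BaseId;
  std::optional<std::int64_t> Offset; // std::nullopt: decomposition failed
};

// Two stores count as adjacent only when both offsets are known, the bases
// match, and B starts exactly where A (of SizeA bytes) ends.
inline bool isAdjacent(const AddrInfo &A, const AddrInfo &B,
                       std::int64_t SizeA) {
  if (!A.Offset || !B.Offset)
    return false; // unknown offset: never assume adjacency
  return A.BaseId == B.BaseId && *A.Offset + SizeA == *B.Offset;
}
```

If `Offset` were a plain integer defaulting to zero, a failed decomposition would be indistinguishable from a store at offset 0 and could be wrongly paired with a neighbor.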
Commit 19f4d68
[SLP][NFCI]Improve compile time for phis with large number of incomin…
…g values. Added a limit of at most 128 incoming values for PHI nodes to be vectorized, and improved performance by using logarithmic search instead of linear search when the number of incoming values is greater than 4.
Commit 51aac5b
Fix -fno-unsafe-math-optimizations behavior (llvm#89473)
This changes the handling of -fno-unsafe-fp-math to stop having that option imply -ftrapping-math. In gcc, -fno-unsafe-math-optimizations sets -ftrapping-math, but that dependency is based on the fact that -ftrapping-math is enabled by default in gcc. Because clang does not enable -ftrapping-math by default, there is no reason for -fno-unsafe-math-optimizations to set it. On the other hand, -funsafe-math-optimizations continues to imply -fno-trapping-math because this option necessarily disables strict exception semantics. This fixes llvm#87523
Andy Kaylor authored Apr 30, 2024
Commit fb85a28
[flang][cuda] Allow PINNED argument to host dummy (llvm#90651)
Update the `AreCompatibleCUDADataAttrs` function to return true when one argument has the `PINNED` attribute and the other argument is just host data.
Commit 89f8335
Add basic char*_t support for libc (partial WG14 N2653) (llvm#90360)
This PR implements a part of WG14 N2653:
- Define C23 char8_t
- Define C11 char16_t
- Define C11 char32_t

Missing goals are:
- The type of UTF-8 character literals is changed from unsigned char to char8_t. (Since UTF-8 character literals already have type unsigned char, this is not a semantic change).
- New mbrtoc8() and c8rtomb() functions declared in <uchar.h> enable conversions between multibyte characters and UTF-8.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro.
- A new atomic_char8_t typedef name.
Commit cd7a7a5
This patch fixes: bolt/lib/Rewrite/LinuxKernelRewriter.cpp:855:12: error: variable 'PrevIP' set but not used [-Werror,-Wunused-but-set-variable]
Commit 805e08e
Commit d688162
[X86] Rename test to correct bug number. NFC
I accidentally named it pr90688 instead of pr90668.
Commit 805f01f
[RISCV][ISel] Fix types in `tryFoldSelectIntoOp` (llvm#90659)
```
SelectionDAG has 17 nodes:
t0: ch,glue = EntryToken
t6: i64,ch = CopyFromReg t0, Register:i64 %2
t8: i1 = truncate t6
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t7: i1 = truncate t4
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t10: i64,i1 = saddo t2, Constant:i64<1>
t11: i1 = or t8, t10:1
t12: i1 = select t7, t8, t11
t13: i64 = any_extend t12
t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```
`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`, which ignores `ResNo` in `SDValue`. Fix llvm#90652.
Commit 2647bd7
[InstallAPI] Cleanup I/O error handling for input lists (llvm#90664)
Add validation in the FileList reader to check that the headers exist and use similar diagnostics in Options.cpp
Commit 278774e
Revert "[lldb] Support custom LLVM formatting for variables (llvm#81196…
…)" This reverts commit 7a8d15e.
Commit 0f628fd
Commit 85f28cf
[SelectionDAG][X86] Add a NoWrap flag to SelectionDAG::isAddLike. NFC (…
…llvm#90681) If this flag is set, Xor will not be considered AddLike. If an Xor were treated as an Add it may wrap. If we can prove there would be no carry out and thus no wrap, the Xor would be turned into a disjoint Or by DAGCombine. Use this new flag to fix a bug in X86 where an Xor is incorrectly being treated as an NUWAdd. Fixes llvm#90668.
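Why an xor can match an add bit for bit and still wrap can be shown with 8-bit arithmetic. The sketch uses xor with the sign-bit constant as the example; that this is the exact pattern `isAddLike` accepts is an assumption here, and the function names are illustrative only.

```cpp
#include <cstdint>

// Xor with the sign-bit constant flips only the top bit.
inline std::uint8_t xorSignBit(std::uint8_t X) {
  return static_cast<std::uint8_t>(X ^ 0x80u);
}

// Adding the same constant with wraparound produces the same bit pattern.
inline std::uint8_t addSignBitWrapping(std::uint8_t X) {
  return static_cast<std::uint8_t>(X + 0x80u);
}

// True when the unwrapped sum does not fit in 8 bits, i.e. the add would
// not be a valid no-unsigned-wrap add for this input.
inline bool addSignBitWraps(std::uint8_t X) { return X + 0x80u > 0xFFu; }
```

The two functions agree for every input, yet the add wraps whenever the top bit of `X` is already set, so the xor cannot be treated as a no-unsigned-wrap add in general.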
Commit a03eeb0
Commits on May 1, 2024
-
[mlir][Tensor] Fix unpack -> transpose folding pattern for padded unp…
…acks (llvm#90678) Previously if the producer tensor.unpack op had "unpadding" semantics, the folding pattern would construct a destination that does not match with the result type of the transpose. Because both ops are DPS we can just reuse the destination of the transpose. Additionally cleans up a bunch of trailing whitespace in the test file.
Commit 75f7295
[AIX] Add git revision to .file string (llvm#88164)
If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file` string. The revision can be set with `LLVM_FORCE_VC_REVISION`.
Before: `.file "git_revision.cpp",,"LLVM version 19.0.0git"`
After: `.file "git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`
Commit 8cde1cf
[flang] Added fir.dummy_scope operation to preserve dummy arguments a…
…ssociation. (llvm#90642) The new operation is just an abstract attribute that is attached to [hl]fir.declare operations of dummy arguments of a subroutine. Dummy arguments of the same subroutine refer to the same fir.dummy_scope, so they can be recognized as such during FIR AliasAnalysis. Note that the fir.dummy_scope must be specific to the runtime instantiation of a subroutine, so any MLIR inlining/cloning should duplicate and unique it vs using the same fir.dummy_scope for different runtime instantiations. This is why I made it an operation rather than an attribute. The new operation uses a write effect on DebuggingResource, same as [hl]fir.declare, to avoid optimizing it away.
Commit 986f832
[Coroutines][Test] Only run coro-elide-thinlto under x86_64-linux (ll…
…vm#90672) The previous fix, llvm#90549, didn't completely address the Buildbot failures. Some targets may not recognize the target triple. This time, only run the test under x86_64-linux.
Commit b1b1bfa
[cross-project-tests] Update code to use mlir::cast (NFC)
/llvm-project/cross-project-tests/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:41:16: error: 'cast' is deprecated: Use mlir::cast<U>() instead [-Werror,-Wdeprecated-declarations]
VectorType.cast<mlir::ShapedType>(), llvm::ArrayRef<float>{2.0f, 3.0f});
^
/llvm-project/llvm/../mlir/include/mlir/IR/Types.h:345:9: note: 'cast' has been explicitly marked deprecated here
U Type::cast() const {
^
/llvm-project/cross-project-tests/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:41:16: error: 'cast<mlir::ShapedType>' is deprecated: Use mlir::cast<U>() instead [-Werror,-Wdeprecated-declarations]
VectorType.cast<mlir::ShapedType>(), llvm::ArrayRef<float>{2.0f, 3.0f});
^
/llvm-project/llvm/../mlir/include/mlir/IR/Types.h:112:5: note: 'cast<mlir::ShapedType>' has been explicitly marked deprecated here
[[deprecated("Use mlir::cast<U>() instead")]]
^
2 errors generated.
Commit 63a2969
[Windows] Restrict searchpath of dbghelp.dll to System32 (llvm#90520)
LoadLibraryW will lookup dlls in user directories if its search path is left unrestricted. This is a security vulnerability as one can name a shared library the same as that of a system dll in order to run arbitrary code when the shared library is loaded from the path in a user directory. This change modifies it to only search within sys32 when loading dbghelp.dll.
Commit ef1dbcd
[flang][cuda] Update attribute compatibily check for unified matching…
… rule (llvm#90679) This patch updates the compatibility checks for CUDA attributes in preparation for implementing the matching rules described in section 3.2.3. With this patch, the compiler will still emit an error when multiple specific procedures match, since the matching distance is not yet implemented. This will be done in a separate patch. https://docs.nvidia.com/hpc-sdk/archive/24.3/compilers/cuda-fortran-prog-guide/index.html#cfref-var-attr-unified-data gpu=unified and gpu=managed are not part of this patch since these options are not recognized by flang yet.
Commit 86e5d6f
Commit 8e9b1e9
Revert "[flang][cuda] Update attribute compatibily check for unified …
…matching rule" (llvm#90696) Reverts llvm#90679
Commit 306ae14
Change-Id: I4f3510230f3d590f3d875dc0cc78d816bce8bff8
Commit f784bda
[Sema] Avoid an undesired pack expansion while transforming PackIndex…
…ingType (llvm#90195) A pack indexing type can appear in a larger pack expansion, e.g `Pack...[pack_of_indexes]...` so we need to temporarily disable substitution of pack elements. Besides, this patch also fixes an assertion failure in `PackIndexingExpr::classify`: dependent `PackIndexingExpr`s are always LValues and thus we don't need to consider their `IndexExpr`s. Fixes llvm#88925 --------- Co-authored-by: cor3ntin <[email protected]>
Commit 410d635
Commit 240592a
Commit 3e93086
[flang][MLIR] Outline deallocation logic to `omp.private` ops (llvm#9…
…0592) When delayed privatization is enabled, this PR emits the deallocation logic to the newly introduced `dealloc` region on `omp.private` ops.
Commit 0632cb3
Commit 93b9b7c
[Pipelines][Coroutines] Tune coroutine passes only for ThinLTO pre-li…
…nk pipeline (llvm#90690) Follow up to llvm#90310, limit the tune up only to ThinLTO pre-link as coroutine passes are not in MonoLTO backend
Commit bafc5f4
[RemoveDIs] Fix SIGSEGV caused by splitBasicBlock (llvm#90312)
See `llvm/unittests/IR/BasicBlockDbgInfoTest.cpp` for a test case.
Commit 0fb5037
Revert "[alpha.webkit.UncountedCallArgsChecker] Ignore methods of WTF…
… String classes." (llvm#90701) Reverts llvm#90180
Commit 3684a38
[InstCombine] Canonicalize scalable GEPs to use llvm.vscale intrinsic (…
…llvm#90569) Canonicalize getelementptr instructions for scalable vector types into ptradd representation with an explicit llvm.vscale call. This representation has better support in BasicAA, which can reason about llvm.vscale, but not plain scalable GEPs.
Commit 74aa1ab
[RISCV] Convert vsetvli mir tests to use $noreg instead of implicit_d…
…ef. NFC This matches what comes out of isel since a63bd7e. It also adds the undef flag to more closely match the output after regalloc, which will help with the test diffs in llvm#70549
Commit d392520
Tweak BumpPtrAllocator to benefit the hot path (llvm#90571)
This takes the form of three consecutive but related changes:
- Mark the fast path of BumpPtrAllocator as likely-taken.
- Move the slow path of BumpPtrAllocator to a separate function.
- Mark the slow path of BumpPtrAllocator as noinline.

Overall, this saves geomean 0.4% userspace instructions on CTMark -O3, and 0.98% on CTMark -O0 -g.
http://llvm-compile-time-tracker.com/compare.php?from=e1622e189e8c0ef457bfac528f90a7a930d9aad2&to=9eb53a4ed3af4a55e769ae1dd22d034b63d046e3&stat=instructions%3Au
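The three changes can be pictured with a toy allocator. This is a sketch under stated assumptions, not LLVM's `BumpPtrAllocator`: `BumpAlloc`, its slab size, and its members are hypothetical, alignment is ignored, and `__builtin_expect` and `__attribute__((noinline))` are GCC/Clang extensions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified bump allocator for illustration only.
class BumpAlloc {
  static constexpr std::size_t SlabSize = 4096;
  std::vector<void *> Slabs;
  char *Cur = nullptr;
  char *End = nullptr;

  // Slow path: lives in its own function and is never inlined, so the fast
  // path in allocate() stays small.
  __attribute__((noinline)) void *allocateSlow(std::size_t Size) {
    std::size_t N = Size > SlabSize ? Size : SlabSize;
    char *Slab = static_cast<char *>(std::malloc(N));
    if (!Slab)
      std::abort();
    Slabs.push_back(Slab);
    Cur = Slab + Size;
    End = Slab + N;
    return Slab;
  }

public:
  BumpAlloc() = default;
  BumpAlloc(const BumpAlloc &) = delete;
  BumpAlloc &operator=(const BumpAlloc &) = delete;
  ~BumpAlloc() {
    for (void *S : Slabs)
      std::free(S);
  }

  void *allocate(std::size_t Size) {
    // Fast path, hinted as likely: bump the pointer if the slab has room.
    if (__builtin_expect(static_cast<std::size_t>(End - Cur) >= Size, 1)) {
      void *P = Cur;
      Cur += Size;
      return P;
    }
    return allocateSlow(Size);
  }

  std::size_t numSlabs() const { return Slabs.size(); }
};
```

Two consecutive small allocations come from the same slab and sit next to each other; only a request that does not fit takes the out-of-line path and adds a slab.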
Commit cd46c2c
Commit 23f0f7b
[AArch64][MC]Add diagnostic message for Multiple of 2/4 for ZPR128 (l…
…lvm#90600) This patch fixes the crash reported in: llvm#90589
Commit 14b66fe
[lldb][Docs] Sort documented packets alphabetically (llvm#90584)
For the platform and extension doc. Also add links in the extension doc to the GDB specs we're extending.
Commit 0c42fa3
[Modules] Process include files changes (llvm#90319)
There were two diffs that introduced some options useful when you build modules externally and cannot rely on file modification time as the key for detecting input file changes:
- [D67249](https://reviews.llvm.org/D67249) introduced the `-fmodules-validate-input-files-content` option, which allows the use of file content hash in addition to the modification time.
- [D141632](https://reviews.llvm.org/D141632) propagated the use of `-fno-pch-timestamps` with Clang modules.

There is a problem when the size of the input file (header) is not modified but the content is. In this case, Clang cannot detect the file change when the `-fno-pch-timestamps` option is used. The `-fmodules-validate-input-files-content` option should help, but there is an issue with its application: it is not applied when the modification time is stored as zero, which is the case for `-fno-pch-timestamps`.

The issue can be fixed using the same trick that was applied during the processing of `ForceCheckCXX20ModulesInputFiles`:
```
// When ForceCheckCXX20ModulesInputFiles and ValidateASTInputFilesContent
// enabled, it is better to check the contents of the inputs. Since we can't
// get correct modified time information for inputs from overriden inputs.
if (HSOpts.ForceCheckCXX20ModulesInputFiles && ValidateASTInputFilesContent &&
    F.StandardCXXModule && FileChange.Kind == Change::None)
  FileChange = HasInputContentChanged(FileChange);
```
The patch proposes a solution similar to the one presented above and includes a LIT test to verify it.
Commit 9a9cff1
device-libs: Use ballot(true) instead of calling read_exec builtin
The read_exec builtins are implemented with the ballot intrinsic anyway. In the wave32 case, these will optimize down to just use the low 32-bits. This converts a few uses, but others remain. Apparently you can just use exec_hi as a GPR in wave32 though, so I'm not sure we should be treating the raw exec read as assumed 0. Change-Id: Id5621bf31b0bb7fa27456938942138f3dea85a0a
Commit 1a62373
[ORC] Switch ObjectLinkingLayer::Plugins to shared ownership, copy pi…
…peline. Previously ObjectLinkingLayer held unique ownership of Plugins, and links always used the Layer's plugin list at each step. This can cause problems if plugins are added while links are in progress however, as the newly added plugin may receive only some of the callbacks for links that are already running. In this patch each link gets its own copy of the pipeline that remains consistent throughout the link's lifetime, and it is guaranteed that Plugin objects (now with shared ownership) will remain valid until the link completes. Coding my way home: 9.80469S, 139.03167W
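The ownership change can be sketched in miniature. This is not the ORC API: `Plugin`, `NamedPlugin`, `Layer`, and `snapshot` are hypothetical names, and a "link" is reduced to iterating over its own copy of the plugin list.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plugin interface.
struct Plugin {
  virtual ~Plugin() = default;
  virtual void onStep(std::vector<std::string> &Log) = 0;
};

// Hypothetical plugin that records its name on each callback.
struct NamedPlugin : Plugin {
  std::string Name;
  explicit NamedPlugin(std::string N) : Name(std::move(N)) {}
  void onStep(std::vector<std::string> &Log) override { Log.push_back(Name); }
};

class Layer {
  std::mutex M;
  std::vector<std::shared_ptr<Plugin>> Plugins;

public:
  void addPlugin(std::shared_ptr<Plugin> P) {
    std::lock_guard<std::mutex> Lock(M);
    Plugins.push_back(std::move(P));
  }

  // Each link takes its own copy of the list. The shared_ptr copies keep
  // every plugin alive until the link that holds them finishes.
  std::vector<std::shared_ptr<Plugin>> snapshot() {
    std::lock_guard<std::mutex> Lock(M);
    return Plugins;
  }
};
```

A link that took its snapshot before a later `addPlugin` call sees the same plugin set at every step, so a newly added plugin never receives only part of the callbacks for that link.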
Commit 7565b20
Commit 3a3bdd8
[lldb][Docs] Various style improvements to the tutorial (llvm#90594)
* Replace "we" with either "you" (when talking to the reader) or "lldb" (when talking about the project).
* Refer to lldb as lldb not LLDB, to match what the user sees on the command line (I am going to come back later and put the proper name in places where it's talking about the projects themselves)
* Remove a bunch of contractions for example "won't". Which don't (pun intended) seem like a big deal at first but even I as a native English speaker find the text clearer with them expanded.
* Use RST's plain text highlighting for keywords and command names.
* Split some very long lines for easier editing in future.
Commit eb6097a
[AMDGPU][AsmParser][NFC] Generate NamedIntOperand predicates automati…
…cally. (llvm#90576) Part of <llvm#62629>.
Commit 9bebf25
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…
…g/ec/llvm-project into amd-staging
Commit 3f0e2e3
[LLVM][SVE] Improve legalisation of fixed length get.active.lane.mask (…
…llvm#90213) We are effectively performing type and operation legalisation very early within the code generation flow. This results in worse code quality because the DAG is not in canonical form, which DAGCombiner corrects through the introduction of operations that are not legal. This patch splits and moves the code to where type and operation legalisation is typically implemented.
Commit fdf206c
[AMDGPU] Do not optimize away pre-existing waitcnt instructions at -O0 (llvm#90716)
The autogenerated memory legalizer tests use -O0 so this allows us to see the exact waitcnts that were inserted by the memory legalizer without them being optimized away.
Commit 0b21b25
Commit 582c6a8
[AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (llvm#9…
…0595) Code to determine if a waitcnt is required before a barrier instruction only considered S_BARRIER. gfx12 adds barrier_signal/wait, so the existing code needs to be enhanced to look for a barrier start (which is just an S_BARRIER for earlier architectures).
Commit 5fb1e28
[AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12 (llvm#90710)
llvm#90201 made some fixes for gfx12 image_msaa_load waitcnt insertion. That fix might break in some situations for pre-gfx12 - this fixes that by explicitly checking for VSAMPLE, which always requires a s_wait_samplecnt, and leaves the previous logic intact for non-gfx12.
Commit f898161
[AArch64] NFC: Add RUN lines for streaming-compatible code. (llvm#90617)
The intent is to test lowering of vector operations by scalarization, for functions that are streaming-compatible (and thus cannot use NEON) and also don't have the +sve attribute. The generated code is clearly wrong at the moment, but a series of patches will follow to fix up all cases to use scalar instructions. A bit of context: This work will form the base to decouple SME from SVE later on, as it will make sure that no NEON instructions are used in streaming[-compatible] mode. Later this will be followed by a patch that changes `useSVEForFixedLengthVectors` to only return `true` if SVE is available for the given runtime mode, at which point I'll change the `-mattr=+sme -force-streaming-compatible-sve` to `-mattr=+sme -force-streaming-sve` in the RUN lines, so that the tests are considered to be executed in Streaming-SVE mode.
Commit ccb198d
[llvm] Revive constructor of 'ResourceSegments'
582c6a8 removed a constructor of 'ResourceSegments' that is needed in LLVM unit tests.
* Revert 582c6a8
* Update the constructor to take a const reference of `std::list` as pointed out in llvm#89193.
Commit 803e03f
[SLP]Transform stores + reverse to strided stores with stride -1, if …
…profitable. Adds transformation of consecutive vector store + reverse to strided stores with stride -1, if it is profitable

Reviewers: RKSimon, preames
Reviewed By: RKSimon
Pull Request: llvm#90464
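The equivalence behind this transformation can be checked on plain arrays. The sketch below is scalar C++ rather than vector IR, and `storeReversed` and `stridedStore` are hypothetical helpers: reversing a vector and storing it consecutively writes the same memory as storing the original lanes with a stride of -1, starting at the last element.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Reverse the lanes, then store them to consecutive slots starting at Dst.
template <std::size_t N>
void storeReversed(const std::array<int, N> &V, int *Dst) {
  std::array<int, N> R = V;
  std::reverse(R.begin(), R.end());
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = R[I];
}

// Store lane I at Base + I * Stride, with Stride measured in elements.
template <std::size_t N>
void stridedStore(const std::array<int, N> &V, int *Base,
                  std::ptrdiff_t Stride) {
  for (std::size_t I = 0; I < N; ++I)
    Base[static_cast<std::ptrdiff_t>(I) * Stride] = V[I];
}
```

For a 4-lane vector, `storeReversed(V, Dst)` and `stridedStore(V, Dst + 3, -1)` leave identical contents in the destination, which is why the separate reverse can be dropped when the strided form is cheaper.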
Commit 67e726a
[SLP]Improve reordering for consts, splats and ops from same nodes + …
…improved analysis. Improved detection of const/splat candidates, their matching and analysis of instructions from same nodes.

Metric: size..text

Program                                                                    results      results0     diff
test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test  92952.00     93096.00     0.2%
test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test          779832.00    780136.00    0.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test               839923.00    840179.00    0.0%
test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test               392708.00    392740.00    0.0%
test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test     1171131.00   1171147.00   0.0%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test   1391089.00   1391073.00   -0.0%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  1391089.00   1391073.00   -0.0%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test   12352780.00  12352636.00  -0.0%

MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE - small reordering
External/SPEC/CINT2006/464.h264ref/464.h264ref - small better code after reordering
MultiSource/Applications/JM/lencod/lencod - smaller code with less shuffles
MultiSource/Applications/JM/ldecod/ldecod - same
External/SPEC/CFP2017rate/511.povray_r/511.povray_r - 2 extra loads vectorized, smaller code
External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r - better code, size increased because of more constant vectors.
External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s - same
External/SPEC/CFP2017rate/526.blender_r/526.blender_r - small change in the vectorized code, some code a bit better, some a bit worse.

Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: llvm#87091
Commit 576261a
Commit 442990b
[z/OS] add support for z/OS system headers to clang std header wrappers (llvm#89995)
Update the wrappers for the C std headers so that they always forward to the z/OS system headers.
Commit df241b1
This is a second attempt to land llvm#84501, which failed on several targets. This patch adds the HAS_IEE754_FLOAT128 define, which makes the check for typedef'ing float128 more precise: it checks whether __uint128_t is available and whether the host uses __ibm128, which is prevalent on PowerPC targets and replaces IEEE 754 float128s.
Commit 088aa81
[Flang][OpenMP] Handle more character allocatable cases in privatization (llvm#90449)
Fixes llvm#84732, llvm#81947, llvm#81946
Note: This is a fix till we enable delayed privatization.
Commit 57d0d3b
[gn] port 088aa81 (LLVM_HAS_LOGF128)
If we want to turn this on on some platforms, we'll also want to define HAS_LOGF128 for AnalysisTest, see llvm/unittests/Analysis/CMakeLists.txt
Commit 68b863b
[SystemZ][z/OS] Build in ASCII 64 bit mode on z/OS (llvm#90630)
Set the correct build flags on z/OS to build LLVM as a 64-bit ASCII application.
Commit 034912d
Commit efce8a0
Commit 9ebf2f8
Commit 0647b2a
[Offload] Fix CMake detection when it is not found (llvm#90729)
Summary: This variable could be unset if not found or when building standalone. We should check for that and set it to true or false. Fixes: llvm#90708
Commit e312f07
[libcxx][ci] In picolib build, ask clang for the normalised triple (llvm#90722)
This is needed for a workaround to make sure the link later succeeds. I don't know the reason for that but it is definitely needed.

llvm#89234 will/wants to correct the triple normalisation for -none- and this means that clang prior to 19, and clang 19 and above will have different answers and therefore different library paths.

I don't want to bootstrap a clang just for libcxx CI, or require that anyone building for Arm do the same, so ask the compiler what the triple should be. This will be compatible with 17 and 19 when we do update to that version.

I'm assuming $CC is what anyone locally would set to override the compiler, and `cc` is the binary name in our CI containers. It's not perfect but it should cover most use cases.
Commit 167b506
[AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)
Re-land 61b2a0e. Some Windows builds were failing because AArch64TargetParserDef.inc is a generated header which is included transitively into some clang components, but this information is not available to the build system and therefore there is a missing edge in the dependency graph. This patch incorporates the fixes described in ac1ffd3/D142403. Thanks to ExtensionSet::toLLVMFeatureList, all values of ArchExtKind should correspond to a particular -target-feature. The valid values of -target-feature are in turn defined by SubtargetFeature defs. Therefore we can generate ArchExtKind from the tablegen data. This is done by adding an Extension class which derives from SubtargetFeature. Because the Has* FieldNames do not always correspond to the AEK_ names ("extensions", as defined in TargetParser), and AEK_ names do not always correspond to -march strings, some additional enum entries have been added to remap the names. I have renamed these to make the naming consistent, but split them into a separate PR to keep the diff reasonable (llvm#90320)
Commit cfca977
[lldb] Teach LocateExecutableSymbolFile to look into LOCALBASE on FreeBSD (llvm#81355)
FreeBSD ports will now install debuginfo under $LOCALBASE/lib/debug/, where $LOCALBASE is typically /usr/local. On FreeBSD search this path in addition to existing debug info paths.
Relevant change on the FreeBSD side: https://reviews.freebsd.org/D43515
Commit f07a2ed
[CUDA] make kernel stub ICF-proof (llvm#90155)
The MSVC linker merges comdat functions that have an identical set of instructions. CUDA uses the kernel stub function as the key to look up kernels in device executables. If the kernel stub functions for different kernels are merged by ICF, incorrect kernels will be launched. To prevent ICF from merging kernel stub functions, a unique global variable is created for each kernel stub function that has a comdat, and a store is added to the kernel stub function. This makes the set of instructions in each kernel stub function unique.
Fixes: llvm#88883
Commit be5075a
[OpenMP][TR12] change property of map-type modifier. (llvm#90499)
The map-type property changes to "default" instead of "ultimate" as in [OpenMP5.2]. This change allows map-type to be placed at any location within the map modifiers, not only at the last location in the modifiers-list; map-type can also be omitted.
Commit f050660
[UndefOrPoison] [CompileTime] Avoid IDom walk unless required. NFC (llvm#90092)
If the value is not boolean and we are checking for `Undef` or `UndefOrPoison`, we can avoid the potentially expensive IDom walk. This should improve compile time for isGuaranteedNotToBeUndefOrPoison and isGuaranteedNotToBeUndef.
Commit 78270cb
[z/OS] treat text files as text files so auto-conversion is done (llvm#90128)
To support auto-conversion on z/OS, text files need to be opened as text files. These changes will fix a number of LIT failures due to text files not being converted to the internal code page.
- update a number of tools so they open the text files as text files
- add support in the cat.py to open a text file as a text file (Windows will continue to treat all files as binary so new lines are handled correctly)
- add env var definitions to enable auto-conversion in the lit config file.
Commit e22ce61
Commit e83c6dd
Commit 39e24bd
Commit 0606747
[mlir][ArmSME] Add a tests showing liveness issues in the tile allocator (llvm#90447)
This test shows a few cases (not at all complete) where the current ArmSME tile allocator produces incorrect results. The plan is to resolve these issues with a future tile allocator that uses liveness information.
Commit 9226688
[AMDGPU] change order of fp and sp in kernel prologue (llvm#90626)
Change the order of fp and sp in the kernel prologue, and update the related codegen tests, to make it easier to merge code into our downstream branches.
Signed-off-by: gangc <[email protected]>
Commit 167427f
- Revert 8009bbe
  - Revert "Reapply "[Clang][Sema] Diagnose class member access expressions naming non-existent members of the current instantiation prior to instantiation in the absence of dependent base classes (llvm#84050)" (llvm#90152)"
  - Breaks composable kernels and rocthrust builds
- Revert 41f9c78
  - Revert "[OpenACC] Fix test failure from fa67986"
  - Fixes some issues in 8009bbe, so depends on it
- Cherry-pick 803e03f from trunk
  - Fixes unit test failures introduced in trunk earlier
Change-Id: I718574c8a26745a52845d0b5a914ed00db611956
Commit c1c0371
[RemoveDIs] Load into new debug info format by default in LLVM (llvm#89799)
This patch enables parsing and creating modules directly into the new debug info format. Prior to this patch, all modules were constructed with the old debug info format by default, and would be converted into the new format just before running LLVM passes. This is an important milestone, in that this means that every tool will now be exposed to debug records, rather than those that run LLVM passes. As far as I've tested, all LLVM tools/projects now either handle debug records, or convert them to the old intrinsic format.

There are a few unit tests that need updating for this patch; these are either cases of tests that previously needed to set the debug info format to function, or tests that depend on the old debug info format in some way. There should be no visible change in the output of any LLVM tool as a result of this patch, although the likelihood of this patch breaking downstream code means an NFC tag might be a little misleading, if not technically incorrect:

This will probably break some downstream tools that don't already handle debug records. If your downstream code breaks as a result of this change, the simplest fix is to convert the module in question to the old debug format before you process it, using `Module::convertFromNewDbgValues()`. For more information about how to handle debug records or about what has changed, see the migration document: https://llvm.org/docs/RemoveDIsDebugInfo.html
Commit 2f01fd9
Revert "[RemoveDIs] Load into new debug info format by default in LLVM (llvm#89799)"
A unit test was broken by the above commit: https://lab.llvm.org/buildbot/#/builders/139/builds/64627
This reverts commit 2f01fd9.
Commit 00821fe
[llvm-install-name-tool] Error on non-Mach-O binaries (llvm#90351)
Previously if you passed an ELF binary it would be silently copied with no changes.
Commit fa53545
[analysis] assume expr is not mutated after analysis to avoid recursive (llvm#90581)
Fixes: llvm#89376.
Commit 6e31714
[LLDB][ELF] Fix section unification to not just use names. (llvm#90099)
Section unification cannot just use names, because it's valid for ELF binaries to have multiple sections with the same name. We should check other section properties too. Fixes llvm#88001. rdar://124467787
Commit 4cbe760
[libc++] Remove _LIBCPP_DISABLE_ADDITIONAL_DIAGNOSTICS (llvm#90512)
I strongly suspect nobody ever used that macro since it wasn't very well known. Furthermore, it only affects a handful of diagnostics and I think it makes sense to either provide them unconditionally, or to not provide them at all.
Commit a00bbcb
[mlir][Vector] Add patterns for efficient unsigned i4 -> i8 conversion emulation (llvm#89131)
This PR builds on llvm#79494 with an additional path for efficient unsigned `i4 -> i8` type extension for 1D/2D operations. This will impact any i4 -> i8/i16/i32/i64 unsigned extensions as well as sitofp i4 -> f8/f16/f32/f64.
Commit 6dfaecf
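As a rough illustration of what an unsigned `i4 -> i8` extension computes, the following C++ sketch zero-extends packed 4-bit values using a mask and a logical shift. It is not the MLIR pattern from the PR. The packing order (two elements per byte, low nibble first) and the names `extendPackedU4ToU8` and `checkExtendPackedU4` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Zero-extend packed unsigned 4-bit values to 8-bit values.
// Assumption for this sketch: each input byte holds two 4-bit elements,
// with the low nibble being the first element.
std::vector<std::uint8_t>
extendPackedU4ToU8(const std::vector<std::uint8_t> &packed) {
  std::vector<std::uint8_t> out;
  out.reserve(packed.size() * 2);
  for (std::uint8_t byte : packed) {
    // Low nibble: mask off the upper four bits.
    out.push_back(static_cast<std::uint8_t>(byte & 0x0F));
    // High nibble: a logical shift right fills the upper bits with zeros.
    out.push_back(static_cast<std::uint8_t>(byte >> 4));
  }
  return out;
}

// Small usage check for the helper above.
inline void checkExtendPackedU4() {
  const std::vector<std::uint8_t> expected = {1, 15, 10, 2};
  assert(extendPackedU4ToU8({0xF1, 0x2A}) == expected);
}
```

A signed extension would need a sign-preserving step instead of the plain mask, which is why the unsigned case can take a cheaper path.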
[DirectX backend] generate ISG1, OSG1 part for compute shader (llvm#90508)
Empty ISG1 and OSG1 parts are generated for compute shader since there's no signature for compute shader.
Fixes llvm#88778
Commit a764f49
[NFC][libc++] Fixes comment indention.
The output on eel.is has similar oddities, so I expect this was copy pasted.
Commit 754072e
[clang][modules] Allow including module maps to be non-affecting (llvm#89992)
The dependency scanner only puts top-level affecting module map files on the command line for explicitly building a module. This is done because any affecting child module map files should be referenced by the top-level one, meaning listing them explicitly does not have any meaning and only makes the command lines longer.

However, a problem arises whenever the definition of an affecting module lives in a module map that is not top-level. Considering the rules explained above, such module map file would not make it to the command line. That's why 83973cf started marking the parents of an affecting module map file as affecting too. This way, the top-level file does make it into the command line.

This can be problematic, though. On macOS, for example, the Darwin module lives in "/usr/include/Darwin.modulemap" one of many module map files included by "/usr/include/module.modulemap". Reporting the parent on the command line forces explicit builds to parse all the other module map files included by it, which is not necessary and can get expensive in terms of file system traffic.

This patch solves that performance issue by stopping marking parent module map files as affecting, and marking module map files as top-level whenever they are top-level among the set of affecting files, not among the set of all known files. This means that the top-level "/usr/include/module.modulemap" is now not marked as affecting and "/usr/include/Darwin.modulemap" is.
Commit 477c705
Commit 987c036
Commit 6c369cf
[MIR] Serialize MachineFrameInfo::isCalleeSavedInfoValid() (llvm#90561)
In case of functions without a stack frame no "stack" field is serialized into MIR which leads to isCalleeSavedInfoValid being false when reading a MIR file back in. To fix this we should serialize MachineFrameInfo::isCalleeSavedInfoValid() into MIR.
Commit cf2f32c
[NVPTX] Fix 64 bits rotations with large shift values (llvm#89399)
ROTL and ROTR can take a shift amount larger than the element size, in which case the effective shift amount should be the shift amount modulo the element size. This patch adds the modulo step when the shift amount isn't known at compile time. Without it the existing implementation would end up shifting beyond the type size and give incorrect results.
Commit 7396ab1
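The modulo step described in the NVPTX rotation commit above can be sketched in plain C++. This is not the NVPTX lowering itself: `rotl64`, `rotr64`, and `checkRotations` are illustrative names, and the sketch only shows the arithmetic the commit message describes, namely reducing the shift amount modulo the element size before rotating.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left. The shift amount is first reduced modulo the
// element size (64), so amounts of 64 or more wrap around instead of
// shifting past the width of the type.
constexpr std::uint64_t rotl64(std::uint64_t value, std::uint64_t amount) {
  const unsigned shift = static_cast<unsigned>(amount % 64);
  if (shift == 0)
    return value; // also avoids `value >> 64`, which is undefined in C++
  return (value << shift) | (value >> (64 - shift));
}

// Rotate a 64-bit value right, with the same modulo reduction.
constexpr std::uint64_t rotr64(std::uint64_t value, std::uint64_t amount) {
  const unsigned shift = static_cast<unsigned>(amount % 64);
  if (shift == 0)
    return value;
  return (value >> shift) | (value << (64 - shift));
}

// Small usage check: a rotation by 65 behaves like a rotation by 1.
inline void checkRotations() {
  assert(rotl64(1, 65) == rotl64(1, 1));
  assert(rotr64(1, 65) == rotr64(1, 1));
}
```

Without the `% 64` step, a shift amount of 64 or more would shift beyond the type size, which is the incorrect behaviour the commit message refers to.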
[RISCV] Refactor profile selection in RISCVISAInfo::parseArchString. (llvm#90700)
Instead of hardcoding the 4 current profile prefixes, treat profile selection as a fallback if we don't find "rv32" or "rv64". Update the error message accordingly.
Commit 09f4b06
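The fallback ordering described in the RISCV commit above might look roughly like the following C++ sketch. It is not the code from `RISCVISAInfo::parseArchString`: the `ArchKind` enum, the `classifyArchString` function, and the contents of `kKnownProfiles` are hypothetical and exist only to show the control flow (check "rv32"/"rv64" first, consult the profile table only if neither matches).

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Hypothetical result type for this sketch.
enum class ArchKind { RV32, RV64, Profile, Invalid };

// Hypothetical profile table, used for illustration only.
constexpr std::array<std::string_view, 3> kKnownProfiles = {
    "rvi20u32", "rvi20u64", "rva22u64"};

// Classify an arch string: base ISA prefixes are checked first, and profile
// lookup is only a fallback when neither "rv32" nor "rv64" is found.
inline ArchKind classifyArchString(std::string_view arch) {
  if (arch.substr(0, 4) == "rv32")
    return ArchKind::RV32;
  if (arch.substr(0, 4) == "rv64")
    return ArchKind::RV64;
  for (std::string_view profile : kKnownProfiles)
    if (arch.substr(0, profile.size()) == profile)
      return ArchKind::Profile;
  return ArchKind::Invalid; // caller would report the updated error message
}

// Small usage check for the classifier above.
inline void checkClassifyArchString() {
  assert(classifyArchString("rv64gc") == ArchKind::RV64);
  assert(classifyArchString("rva22u64") == ArchKind::Profile);
}
```

With this shape, adding a profile means extending the table rather than adding another hardcoded prefix check.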
[RISCV] Merge RISCVISAInfo::updateFLen/MinVLen/MaxELen into a single function. (llvm#90665)
This simplifies the callers.
Commit cf3c714
Reapply "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries" (llvm#90610) (llvm#90692)
This reverts commit 2aabfc8. Add fixes to LLD and Gold tests missed in original change.
Co-authored-by: Jan Voung <[email protected]>
Commit 28869a7
[flang] always run PolymorphicOpConversion sequentially (llvm#90721)
It was pointed out in post commit review of llvm#90597 that the pass should never have been run in parallel over all functions (and now other top level operations) in the first place. The mutex used in the pass was ineffective at preventing races since each instance of the pass would have a different mutex.
Commit d1b3648
[libc] Implement fcntl() function (llvm#89507)
Fixes llvm#84968. Implements the `fcntl()` function defined in the `fcntl.h` header.
Commit aca5117
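For readers unfamiliar with `fcntl()`, here is a minimal C++ usage sketch on a POSIX system. It is not taken from the libc patch; it only shows a common way callers use the function (reading flags with `F_GETFL` and updating them with `F_SETFL`). The helper names `setNonBlocking`, `isNonBlocking`, and `demoPipeNonBlocking` are made up for this example.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Set O_NONBLOCK on a descriptor using the usual F_GETFL / F_SETFL pair.
// Returns false if either fcntl() call fails (for example, on a bad fd).
inline bool setNonBlocking(int fd) {
  const int flags = fcntl(fd, F_GETFL);
  if (flags == -1)
    return false;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Report whether O_NONBLOCK is currently set on a descriptor.
inline bool isNonBlocking(int fd) {
  const int flags = fcntl(fd, F_GETFL);
  return flags != -1 && (flags & O_NONBLOCK) != 0;
}

// Usage: create a pipe, switch its read end to non-blocking, and verify.
inline void demoPipeNonBlocking() {
  int fds[2];
  const bool created = pipe(fds) == 0;
  assert(created);
  const bool before = isNonBlocking(fds[0]); // pipes start out blocking
  const bool changed = setNonBlocking(fds[0]);
  const bool after = isNonBlocking(fds[0]);
  close(fds[0]);
  close(fds[1]);
  assert(!before && changed && after);
  (void)created; (void)before; (void)changed; (void)after;
}
```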
[alpha.webkit.UncountedCallArgsChecker] Support more trivial expressions. (llvm#90414)
Treat a compound operator such as |=, array subscription, sizeof, and non-type template parameter as trivial so long as subexpressions are also trivial. Also treat true/false boolean literal as trivial.
Commit 1ca6005
[ELF] Catch zlib deflateInit2 error
The function may return Z_MEM_ERROR or Z_STREAM_ERROR. The former does not have a good way of testing. The latter will be possible with a pending change that allows setting the compression level, which will come with a test.
Commit 91fef00
[ELF] Adjust --compress-sections to support compression level
zstd excels at scaling from low-ratio-very-fast to high-ratio-pretty-slow. Some users prioritize speed and prefer disk read speed, while others focus on achieving the highest compression ratio possible, similar to traditional high-ratio codecs like LZMA.

Add an optional `level` to `--compress-sections` (llvm#84855) to cater to these diverse needs. While we initially aimed for a one-size-fits-all approach, this no longer seems to work. (https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)

When --compress-debug-sections is used together, make --compress-sections take precedence since --compress-sections is usually more specific.

Remove the level distinction between -O/-O1 and -O2 for --compress-debug-sections=zlib for a more consistent user experience.

Pull Request: llvm#90567
Commit 6d44a1e
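To make the idea of an optional `level` concrete, the C++ sketch below parses an option value of the assumed form `<section-glob>=<kind>[:<level>]`. This is not lld's parser, and the exact syntax and accepted kinds (`none`, `zlib`, `zstd`) are assumptions made for this example. `CompressSpec`, `parseCompressSections`, and `checkParseCompressSections` are hypothetical names.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical parsed form of one option value.
struct CompressSpec {
  std::string sectionGlob;
  std::string kind;         // "none", "zlib", or "zstd" in this sketch
  std::optional<int> level; // empty when no ":<level>" suffix was given
};

// Parse "<section-glob>=<kind>[:<level>]"; returns nullopt on malformed input.
inline std::optional<CompressSpec> parseCompressSections(std::string_view arg) {
  const auto eq = arg.find('=');
  if (eq == std::string_view::npos || eq == 0)
    return std::nullopt;
  const std::string_view rest = arg.substr(eq + 1);
  const auto colon = rest.find(':');
  const std::string_view kind = rest.substr(0, colon);
  if (kind != "none" && kind != "zlib" && kind != "zstd")
    return std::nullopt;

  CompressSpec spec{std::string(arg.substr(0, eq)), std::string(kind),
                    std::nullopt};
  if (colon != std::string_view::npos) {
    const std::string_view text = rest.substr(colon + 1);
    int value = 0;
    const char *end = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc() || ptr != end)
      return std::nullopt; // level must be a complete integer
    spec.level = value;
  }
  return spec;
}

// Small usage check for the parser above.
inline void checkParseCompressSections() {
  const auto spec = parseCompressSections(".debug_*=zstd:3");
  assert(spec && spec->kind == "zstd" && spec->level == 3);
}
```

Keeping the level optional lets existing invocations without a suffix continue to parse, while users who care about ratio or speed can opt in.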
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
Commit c93f480
Change-Id: I4968e32ce2fcf8592f4ab65f9b2eb89b5fbb67dc
Jenkins committed May 1, 2024
Commit 39fea68
Minor cleanups; replace amd-stg-open with amd-staging
Change-Id: I8d57fc9053f1ee71230ac48337f73b474581188f
Commit e85d0d4
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
Commit fc06b37
Commits on May 2, 2024
Commit 59a2734
Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
Commit c013d6b
Allow link to llvm shared library for current distros
Signed-off-by: "Yiyang Wu <[email protected]>"
Commit 7311c1b