Allow link to llvm shared library for current distros #68

Open. Wants to merge 10,000 commits into base: amd-stg-open.
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Apr 30, 2024

  1. [NFC] Remove method from FoldingSet that already existed in APInt. (llvm#90486)

    Noticed that there already was a function in APInt that updated a
    FoldingSet so there was no need for me to add it in
    llvm#84617.
    andjo403 authored Apr 30, 2024
    9a1386e
  2. [mlir][sparse] fold explicit value during sparsification (llvm#90530)

    This ensures the explicit value is generated (and not a load into the
    values array). Note that actually not storing values array at all is
    still TBD, this is just the very first step.
    aartbik authored Apr 30, 2024
    65ee8f1
  3. [Attributes] Support Attributes being declared as supporting an experimental late parsing mode "extension" (llvm#88596)
    
    This patch changes the `LateParsed` field of `Attr` in `Attr.td` to be
    an instantiation of the new `LateAttrParseKind` class. The instantiation can be one of the following:
    
    * `LateAttrParsingNever` - Corresponds with the false value of `LateParsed` prior to this patch (the default for an attribute).
    * `LateAttrParseStandard` - Corresponds with the true value of `LateParsed` prior to this patch.
    * `LateAttrParseExperimentalExt` - A new mode described below.
    
    `LateAttrParseExperimentalExt` is an experimental extension to
    `LateAttrParseStandard`. Essentially this allows
    `Parser::ParseGNUAttributes(...)` to distinguish between these cases:
    
    1. Only `LateAttrParseExperimentalExt` attributes should be late parsed.
    2. Both `LateAttrParseExperimentalExt` and `LateAttrParseStandard`
      attributes should be late parsed.
    
    Callers (and indirect callers) of `Parser::ParseGNUAttributes(...)`
    indicate the desired behavior by setting a flag in the
    `LateParsedAttrList` object that is passed to the function.
    
    In addition to the above, a new driver and frontend flag
    (`-fexperimental-late-parse-attributes`) with a corresponding LangOpt
    (`ExperimentalLateParseAttributes`) is added that changes how
    `LateAttrParseExperimentalExt` attributes are parsed.
    
    * When the flag is disabled (default), in cases where only
      `LateAttrParsingExperimentalOnly` late parsing is requested, the
      attribute will be parsed immediately (i.e. **NOT** late parsed). This
      allows the attribute to act just like a `LateAttrParseStandard`
      attribute when the flag is disabled.
    
    * When the flag is enabled, in cases where only
      `LateAttrParsingExperimentalOnly` late parsing is requested, the
      attribute will be late parsed.
    
    The motivation behind this change is to allow the new `counted_by`
    attribute (part of `-fbounds-safety`) to support late parsing but
    **only** when `-fexperimental-late-parse-attributes` is enabled. This
    attribute needs to support late parsing to allow it to refer to fields
    later in a struct definition (or function parameters declared later).
    However, there isn't a precedent for supporting late attribute parsing
    in C so this flag allows the new behavior to exist in Clang but not be
    on by default. This behavior was requested as part of the
    `-fbounds-safety` RFC process
    (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854/68).
    
    This patch doesn't introduce any uses of `LateAttrParseExperimentalExt`.
    This will be added for the `counted_by` attribute in a future patch
    (llvm#87596). A consequence is the
    new behavior added in this patch is not yet testable. Hence, the lack of
    tests covering the new behavior.
    
    rdar://125400257
    delcypher authored Apr 30, 2024
    b1867e1
  4. [NewPM][CodeGen] Add MachineFunctionAnalysis (llvm#88610)

    In the new pass system, `MachineFunction` can be an analysis result again,
    so machine module passes can now fetch machine functions from the
    analysis manager. `MachineModuleInfo` no longer owns them.
    Remove `FreeMachineFunctionPass`, replaced by
    `InvalidateAnalysisPass<MachineFunctionAnalysis>`.
    
    Now that `FreeMachineFunction` is replaced by
    `InvalidateAnalysisPass<MachineFunctionAnalysis>`, the workaround in
    `MachineFunctionPassManager` is no longer needed, and there is no
    difference between `unittests/MIR/PassBuilderCallbacksTest.cpp` and
    `unittests/IR/PassBuilderCallbacksTest.cpp`.
    paperchalice authored Apr 30, 2024
    6ea0c0a
  5. [X86] Enable EVEX512 when host CPU has AVX512 (llvm#90479)

    This is used when -march=native is run on a CPU unknown to an old
    version of LLVM.
    phoebewang authored Apr 30, 2024
    b329179
  6. [BOLT] Avoid reference updates for non-JT symbol operands (llvm#88838)

    Skip updating references for operands that do not directly
    refer to jump table symbols but fall within a jump table's
    address range to prevent unintended modifications.
    linsinan1995 authored Apr 30, 2024
    9d5411f
  7. [C++20] [Modules] [Reduced BMI] Avoid force writing static declarations within module purview

    Close llvm#90259
    
    Technically, static declarations shouldn't be leaked from the module
    interface; otherwise it is an illegal program according to the spec. So
    we can drop the static declarations from the reduced BMI, which lets us
    close the above issue.
    
    However, there is too much `static inline` code in existing headers,
    so it would be a pretty big breaking change if we did this globally.
    ChuanqiXu9 committed Apr 30, 2024
    38067c5
  8. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging

    searlmc1 committed Apr 30, 2024
    1990de9
  9. merge main into amd-staging

    Change-Id: Icf8748fff11482f16cbeb1f19baf5a3404b57c6e
    Jenkins committed Apr 30, 2024
    73aa06a
  10. Disable test for lsan and x86_64h (llvm#90483)

    Disable this test on x86_64h for LSan.
    
    This test is failing with malformed object only on x86_64h.
    Disabling for now. 
    
    rdar://125052424
    thetruestblue authored Apr 30, 2024
    62d6560
  11. 326667d
  12. [ELF] --compress-debug-sections=zstd: replace ZSTD_c_nbWorkers parallelism with multi-frame parallelism

    https://reviews.llvm.org/D133679 utilizes zstd's multithread API to
    create one single frame. This provides a higher compression ratio but is
    significantly slower than concatenating multiple frames.
    
    With manual parallelism, it is easier to parallelize memcpy in
    OutputSection::writeTo.
    
    In addition, as the individual allocated decompression buffers are much
    smaller, we can make a wild guess (compressed_size/4) without worrying
    that a resize (due to a wrong guess) would waste memory.
    MaskRay committed Apr 30, 2024
    79095b4
  13. [clang-tidy] fix false-negative for macros in `readability-math-missing-parentheses` (llvm#90279)

    When a binary operator is the last operand of a macro, the end location
    that is past the `BinaryOperator` will be inside the macro and therefore
    an invalid location to insert a `FixIt` into, which is why the check bails
    when encountering such a pattern.
    However, the end location is only required for the `FixIt` and the
    diagnostic can still be emitted, just without an attached fix.
    5chmidti authored Apr 30, 2024
    fbe4d99
  14. bd72f7b
  15. [NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to generate module file for C++20 modules instead of PCHGenerator

    Previously we were re-using PCHGenerator to generate the module file for
    C++20 modules, which is slightly odd. This patch uses a new class,
    'CXX20ModulesGenerator', to generate the module file for C++20 modules.
    ChuanqiXu9 committed Apr 30, 2024
    18268ac
  16. [SelectionDAG][RISCV] Move VP_REDUCE* legalization to LegalizeDAG.cpp. (llvm#90522)

    LegalizeVectorType is responsible for legalizing nodes that perform an
    operation on each element and may need to be scalarized.
    
    This is not true for nodes like VP_REDUCE.*, BUILD_VECTOR,
    SHUFFLE_VECTOR, EXTRACT_SUBVECTOR, etc.
    
    This patch drops any nodes with a scalar result from LegalizeVectorOps
    and handles them in LegalizeDAG instead.
    
    This required moving the reduction promotion to LegalizeDAG. I have
    removed the support for integer promotion as it was incorrect for integer
    min/max reductions. Since it was untested, it was best to assert on it
    until it is really needed.
    
    There are a couple of regressions that can be fixed with a small DAG
    combine, which I will do as a follow-up.
    topperc authored Apr 30, 2024
    705636a
  17. 6e83058
  18. [C++20] [Modules] Don't skip pragma diagnostic mappings

    Close llvm#75057
    
    Previously, I incorrectly thought that diagnostic mappings were not
    meaningful with modules. This problem was revealed by another recent
    change. So this patch partially reverts the previous "optimization".
    ChuanqiXu9 committed Apr 30, 2024
    fb21343
  19. [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions
    
    Marking them as `hasSideEffects=1` stops some optimizations.
    
    According to `Target.td`:
    
    > // Does the instruction have side effects that are not captured by any
    > // operands of the instruction or other flags?
    > bit hasSideEffects = ?;
    
    It seems we don't need to set `hasSideEffects` for vleNff since we have
    modelled `vl` as an output operand.
    
    As for saturating instructions, I think that explicit Def/Use list
    is kind of side effects captured by any operands of the instruction,
    so we don't need to set `hasSideEffects` either. And I have just
    investigated AArch64's implementation, they don't set this flag and
    don't add `Def` list.
    
    These changes make optimizations like `performCombineVMergeAndVOps`
    and MachineCSE possible for these instructions.
    
    As a consequence, `copyprop.mir` can't test what we want to test in
    https://reviews.llvm.org/D155140, so we replace `vssra.vi` with a
    VCIX instruction (it has side effects).
    
    Reviewers: jacquesguan, topperc, preames, asb, lukel97
    
    Reviewed By: topperc, lukel97
    
    Pull Request: llvm#90049
    wangpc-pp authored Apr 30, 2024
    940ef96
  20. Revert "[C++20] [Modules] Don't skip pragma diagnostic mappings" and "[NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to generate module file for C++20 modules instead of PCHGenerator"

    This reverts commit fb21343
    and commit 18268ac.
    
    It looks like there are some problems with linking the compiler.
    ChuanqiXu9 committed Apr 30, 2024
    6b961e2
  21. [RISCV] Add DAG combine for (vmv_s_x_vl (undef) (vmv_x_s X). (llvm#90524)

    We can use the original vector as long as the type of X matches the
    result type of the vmv_s_x_vl.
    topperc authored Apr 30, 2024
    2524146
  22. [LoongArch] Support parsing la.tls.desc pseudo instruction

    Simultaneously implemented parsing support for the `%desc_*` modifiers.
    
    Reviewers: SixWeining, heiher, xen0n
    
    Reviewed By: xen0n, SixWeining
    
    Pull Request: llvm#90158
    wangleiat authored Apr 30, 2024
    4a84d8e
  23. [C++20] [Modules] Don't skip pragma diagnostic mappings

    Close llvm#75057
    
    Previously, I incorrectly thought that diagnostic mappings were not
    meaningful with modules. This problem was revealed by another recent
    change. So this patch partially reverts the previous "optimization".
    ChuanqiXu9 committed Apr 30, 2024
    ec527b2
  24. f4843ac
  25. [mlir][OpenMP] Extend omp.private with a dealloc region (llvm#90456)

    Extends `omp.private` with a new region: `dealloc` where deallocation
    logic for Fortran deallocatables will be outlined (this will happen in
    later PRs).
    ergawy authored Apr 30, 2024
    ce12b12
  26. 09f160c
  27. [lldb][Docs] Remove more subtitles from packets doc (llvm#90443)

    This removes various subtitles or converts them to bold text so that the
    table of contents is less cluttered.
    
    This includes "Example", "Notes", "Priority To Implement" and
    "Response".
    DavidSpickett authored Apr 30, 2024
    ff6c0ca
  28. [LoongArch][Codegen] Add support for TLSDESC

    The implementation is only enabled when the `-enable-tlsdesc` option is
    passed and the TLS model is `dynamic`.
    
    LoongArch's GCC has the same option (-mtls-dialect=) as RISC-V.
    
    Reviewers: heiher, MaskRay, SixWeining
    
    Reviewed By: SixWeining, MaskRay
    
    Pull Request: llvm#90159
    wangleiat authored Apr 30, 2024
    eb148ae
  29. Reapply "[flang] Improve debug info for functions." with regression fixed. (llvm#90484)

    The original PR llvm#90083 had to be reverted in PR llvm#90444 as it caused one
    of the gfortran tests to fail. The issue was using `isIntOrIndex` to
    check for an integer type. It allowed the index type, which later caused
    an assertion failure when calling `getIntOrFloatBitWidth`. I have now
    replaced it with `isInteger`, which should fix this regression.
    abidh authored Apr 30, 2024
    91a8cb7
  30. [RemoveDIs] Fix findDbgValues to return dbg_assign records too (llvm#90471)

    In the debug intrinsic class hierarchy, a dbg.assign is a (inherits from)
    dbg.value, so `findDbgValues` returns dbg.values and dbg.assigns (by
    design). That hierarchy doesn't exist for DbgRecords - fix findDbgValues
    to return dbg_assign records as well as dbg_values and add unittest.
    OCHyams authored Apr 30, 2024
    09e7d86
  31. [docs] Document which online sync-ups are no longer happening (llvm#89361)

    Some of the online sync-ups on our Getting Involved page seem to no
    longer be happening. Document them as no longer happening, so that
    people don't get confused when dialing in to one of these.
    kbeyls authored Apr 30, 2024
    853344d
  32. [Modules] No transitive source location change (llvm#86912)

    This is part of the "no transitive change" patch series: "no transitive
    source location change". I discussed this with @Bigcheese at the WG21
    meeting in Tokyo.
    
    The idea comes from a post by @jyknight on LLVM Discourse. Given:
    
    ```
    // A.cppm
    export module A;
    ...
    
    // B.cppm
    export module B;
    import A;
    ...
    
    //--- C.cppm
    export module C;
    import B;
    ```
    
    Almost every time A.cppm changes, we need to recompile `B`, because we
    treat source locations as significant to the semantics. But it would be
    good if we could avoid recompiling `C` when the change in `A` doesn't
    change the BMI of B.
    
    # Motivation Example
    
    This patch only cares about source locations, so let's focus on a source
    location example. The full example is in the attached test.
    
    ```
    //--- A.cppm
    export module A;
    export template <class T>
    struct C {
        T func() {
            return T(43);
        }
    };
    export int funcA() {
        return 43;
    }
    
    //--- A.v1.cppm
    export module A;
    
    export template <class T>
    struct C {
        T func() {
            return T(43);
        }
    };
    export int funcA() {
        return 43;
    }
    
    //--- B.cppm
    export module B;
    import A;
    
    export int funcB() {
        return funcA();
    }
    
    //--- C.cppm
    export module C;
    import A;
    export void testD() {
        C<int> c;
        c.func();
    }
    ```
    
    Here the only difference between `A.cppm` and `A.v1.cppm` is that
    `A.v1.cppm` has an additional blank line. The test then shows that the two
    BMIs of `B.cppm`, one built with `-fmodule-file=A=A.pcm` and the other
    with `-fmodule-file=A=A.v1.pcm`, should have bit-wise identical
    contents.
    
    However, it is a different story for C: since C instantiates templates
    from A, and the instantiation records source information from module
    A, which differs between `A` and `A.v1`, it is expected that the
    BMIs `C.pcm` and `C.v1.pcm` can and should differ.
    
    # Internal perspective of status quo
    
    To fully understand the patch, we need to understand how we encode
    source locations and how we serialize and deserialize them.
    
    Source locations are encoded as:
    
    ```
    |
    |
    | _____ base offset of an imported module
    |
    |
    |
    |_____ base offset of another imported module
    |
    |
    |
    |
    | ___ 0
    ```
    
    As the diagram shows, we encode the local (unloaded) source location
    from 0 to higher bits. And we allocate the space for source locations
    from the loaded modules from high bits to 0. Then the source locations
    from the loaded modules will be mapped to our source location space
    according to the allocated offset.
    
    For example, for,
    
    ```
    // a.cppm
    export module a;
    ...
    
    // b.cppm
    export module b;
    import a;
    ...
    ```
    
    Assume the offset of a source location (let's call it `S`) in a.cppm is
    45, so we record the value `45` in the BMI `a.pcm`. Then in b.cppm, when
    we import a, the source manager allocates a space for module 'a'
    (according to the recorded number of source locations) and uses it as
    the base offset of module 'a' in the current source location space.
    Let's assume the allocated base offset is 90 in this example. Then, to
    get the location of `S` in the current source location space, we simply
    add `45` to `90` to get `135`. So the source location of `S` in module B
    is `135`.
    
    When we write module `b`, we also write the source location of `S` as
    `135` directly in the BMI. And to make clear that the location `S` comes
    from module `a`, we also need to record the base offset of module `a`,
    90, in the BMI of `b`.
    
    Here the problem arises. The base offset of module 'a' is computed from
    the number of source locations in module 'a'. So in module 'b', the
    recorded base offset of module 'a' changes every time the number of
    source locations in module 'a' increases or decreases. In other words,
    the contents of the BMI of B change every time the number of locations
    in module 'a' changes. This is pretty sensitive: almost every change
    alters the number of locations. This is the problem this patch wants
    to solve.
    
    Let's continue with the existing design to understand what's going on.
    Another interesting case is:
    
    ```
    // c.cppm
    export module c;
    import whatever;
    import a;
    import b;
    ...
    ```
    
    In `c.cppm`, when we import `a`, we still need to allocate a base
    location offset for it; let's say the value is `200`.
    Then when we reach the location `S` recorded in module `b`, we need to
    translate it into the current source location space. The solution is
    quite simple: we can compute it as `135 + (200 - 90) = 245`. In other
    words, the offset of a source location in the current module can be
    computed as `Recorded Offset + Base Offset of its module file - Recorded
    Base Offset`.
    
    Then we're almost done about how we handle the offset of source
    locations in serializers.
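    
    The arithmetic above can be sketched in a few lines of C++. This is only
    an illustration of the formula using the numbers from the example, not
    Clang's serialization code; the helper name `translateOffset` and its
    parameters are hypothetical.
    
    ```cpp
    #include <cstdint>
    
    // Hypothetical helper (not Clang's API): remap an offset that was recorded
    // in an imported BMI into the importer's source location space.
    constexpr std::uint32_t translateOffset(std::uint32_t recordedOffset,
                                            std::uint32_t recordedBase,
                                            std::uint32_t currentBase) {
      // Recorded Offset + Base Offset of its module file - Recorded Base Offset
      return recordedOffset + (currentBase - recordedBase);
    }
    
    // Location S has local offset 45 in module a; a's base was 90 when b was built.
    constexpr std::uint32_t sInB = 90 + 45;
    static_assert(sInB == 135, "S as recorded in b.pcm");
    
    // While compiling c.cppm, module a is assigned the base offset 200.
    static_assert(translateOffset(sInB, 90, 200) == 245, "S as seen from c.cppm");
    ```
    
    Note that the recorded base (90) has to be stored in `b.pcm` for this to
    work, which is exactly the dependency on module `a` described above.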
    
    # The high level design of current patch
    
    At an abstract level, what we want to do is remove the hardcoded
    base offset of imported modules while retaining the ability to calculate
    the source location in a new module unit. To achieve this, we need to be
    able to find the module file owning a source location from the encoding
    of the source location.
    
    So in this patch, for each source location, we store the local
    offset of the location and the module file index. For the above example,
    in the existing design, the source location of `S` is recorded in `b.pcm`
    as `135` directly. In the new design, the source location of `S` is
    recorded as `<1, 45>`. Here `1` stands for the module file index of `a`
    in module `b`, and `45` is the offset of `S` relative to the base offset
    of module `a`.
    
    So the trade-off here is that, to make the BMI more independent, we need
    to record more abstract information. I feel it is worthwhile. The
    recompilation problem of modules is really annoying and people are still
    complaining about it. But if we can make this work (including stopping
    other changes from propagating transitively), I think this may be a
    killer feature for modules. And according to @Bigcheese, this should be
    helpful for clang explicit modules too.
    
    On the benchmarking side, I tested this patch against
    https://github.com/alibaba/async_simple/tree/CXX20Modules. There is no
    significant change in compilation time. The size of the .pcm files grows
    from 200M to 204M. I think the trade-off is pretty fair.
    
    # Some low level details
    
    I didn't use another slot to record the module file index. Instead, I
    use the higher 32 bits of the existing source location encoding to
    store that information. This design should be safe, since we use
    `unsigned` to store source locations but `uint64_t` in serialization,
    and `unsigned` is generally 32 bits wide on most platforms. So it should
    not be a safety problem, since none of the bits we use to store the
    module file index were used before. So the new encoding may be:
    
    ```
       |-----------------------|-----------------------|
       |           A           |         B         | C |
    
      * A: 32 bit. The index of the module file in the module manager + 1. The +1
              here is necessary since we wish 0 stands for the current module file.
      * B: 31 bit. The offset of the source location to the module file containing it.
      * C: The macro bit. We rotate it to the lowest bit so that we can save some 
              space in case the index of the module file is 0.
    ```
    
    (The B and C is the existing raw encoding for source locations)
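    
    As a rough sketch of the layout described above, assuming the A/B/C
    split from the diagram, the packing could look like the following. The
    names `DecodedLoc`, `encodeLoc`, and `decodeLoc` are hypothetical and are
    not Clang's API.
    
    ```cpp
    #include <cstdint>
    
    // Hypothetical decoded form of a serialized source location.
    struct DecodedLoc {
      std::uint32_t moduleFileSlot; // A: 0 = current module file, else index + 1
      std::uint32_t offset;         // B: 31-bit offset within the owning module file
      bool isMacro;                 // C: the macro bit, kept in the lowest bit
    };
    
    constexpr std::uint64_t encodeLoc(std::uint32_t moduleFileSlot,
                                      std::uint32_t offset, bool isMacro) {
      return (static_cast<std::uint64_t>(moduleFileSlot) << 32) |
             (static_cast<std::uint64_t>(offset & 0x7FFFFFFFu) << 1) |
             (isMacro ? 1u : 0u);
    }
    
    constexpr DecodedLoc decodeLoc(std::uint64_t raw) {
      return DecodedLoc{static_cast<std::uint32_t>(raw >> 32),
                        static_cast<std::uint32_t>((raw >> 1) & 0x7FFFFFFFu),
                        (raw & 1u) != 0};
    }
    
    // <1, 45> from the example: slot 1, offset 45, not a macro location.
    static_assert(encodeLoc(1, 45, false) == ((std::uint64_t{1} << 32) | 90),
                  "slot goes to the upper 32 bits");
    static_assert(decodeLoc(encodeLoc(1, 45, false)).offset == 45, "round trip");
    
    // A location in the current module file (slot 0) stays a small value.
    static_assert(encodeLoc(0, 45, false) == 90, "upper half is zero");
    ```
    
    The last assertion shows why the macro bit sits in the lowest position:
    when the slot is 0, the whole value stays small.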
    
    Another reason to reuse the same slot of the source location is to
    reduce the impact of the patch, since a lot of places assume we can
    store and get a source location from a single slot. When I tried to
    add another slot, a lot of code broke. I don't feel it is worthwhile.
    
    Another impact of this decision is that the existing small
    optimizations for encoding source locations may be invalidated. The key
    to these optimizations is that we can turn large values into small values
    and then use the VBR6 format to reduce the size. But if we put the
    module file index into the higher bits, then maybe they simply don't
    work. An example may be the `SourceLocationSequence` optimization.
    
    This will only affect the size of on-disk .pcm files. I don't expect
    it to impact the speed and memory use of compilations. And given my
    small experiments above, I feel this trade-off is worthwhile.
    
    # Correctness
    
    The mental model for handling source location offsets is not so complex
    and I believe we can solve it by adding module file index to each stored
    source location.
    
    On the practical side, since source locations are pretty sensitive
    and the patch passes all the in-tree tests and a small-scale project,
    I feel it should be correct.
    
    # Future Plans
    
    I'll continue to work on no transitive decl change and no transitive
    identifier change (if matters) to achieve the goal to stop the
    propagation of unnecessary changes. But all of this depends on this
    patch. Since, clearly, the source locations are the most sensitive
    thing.
    
    ---
    
    The release notes and documentation will be added separately.
    ChuanqiXu9 authored Apr 30, 2024
    6c31104
  33. [MLIR] Sprinkle extra asserts in OperationSupport.h (llvm#90465)

    Should hopefully help shave some minutes off developer debugging time in
    the future.
    definelicht authored Apr 30, 2024
    2464c1c
  34. [MLIR][LLVM] Have LLVM::AddressOfOp implement ConstantLike (llvm#90481)

    For all intents and purposes, llvm.mlir.addressof acts like a constant and
    should be treated as such by passes. In particular, the operation should
    be propagated rather than passed whenever possible.
    definelicht authored Apr 30, 2024
    92ca6fc
  35. [mlir][test] Add TD example for peel+vectorize (depthwise conv) (llvm#90200)
    
    Adds an example that combines loop peeling and scalable vectorisation of
    `linalg.depthwise_conv_2d_nhwc_hwc`. This is similar to
    transform-op-peel-and-vectorize.mlir and is meant to demonstrate how to
    avoid masking when vectorising using scalable vectors.
    banach-space authored Apr 30, 2024
    c9d92d2
  36. [clang][Interp] Handle Shifts in OpenCL correctly

    We need to adjust the RHS to account for the LHS bitwidth.
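    
    A minimal sketch of the rule, assuming a 32-bit LHS: in OpenCL C only the
    low log2(N) bits of the shift amount are used, where N is the bit width
    of the LHS. The helper `shlOpenCL` is hypothetical and is not the
    interpreter's code.
    
    ```cpp
    #include <cstdint>
    
    // Hypothetical helper: left shift following the OpenCL C rule for a 32-bit
    // LHS, where the shift amount is reduced to its low 5 bits (log2(32) == 5).
    constexpr std::uint32_t shlOpenCL(std::uint32_t lhs, std::uint32_t rhs) {
      return lhs << (rhs & 31u);
    }
    
    static_assert(shlOpenCL(1u, 4u) == 16u, "in-range amounts are unchanged");
    static_assert(shlOpenCL(1u, 33u) == 2u, "33 is reduced to 1");
    static_assert(shlOpenCL(1u, 32u) == 1u, "32 is reduced to 0");
    ```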
    tbaederr committed Apr 30, 2024
    74e65ee
  37. Fix lock guards in PipePosix.cpp (llvm#90572)

    Guard object destroyed immediately after creation without naming.
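    
    The class of bug can be illustrated with a small, self-contained sketch.
    This is not the code from PipePosix.cpp: `TracingMutex` and both helper
    functions are hypothetical and exist only to make the lock state
    observable.
    
    ```cpp
    #include <cassert>
    #include <mutex>
    
    // Hypothetical BasicLockable type that records whether it is held.
    struct TracingMutex {
      bool locked = false;
      void lock() { locked = true; }
      void unlock() { locked = false; }
    };
    
    // Buggy pattern: the guard is an unnamed temporary, so it is destroyed
    // (and the mutex released) at the end of the statement that creates it.
    // The cast to void only silences unused-temporary warnings.
    bool heldWithUnnamedGuard(TracingMutex &m) {
      static_cast<void>(std::lock_guard<TracingMutex>{m});
      return m.locked; // the "critical section" runs without the lock
    }
    
    // Fixed pattern: a named guard lives until the end of the enclosing scope.
    bool heldWithNamedGuard(TracingMutex &m) {
      std::lock_guard<TracingMutex> guard(m);
      return m.locked; // evaluated while the guard is still alive
    }
    ```
    
    Calling `heldWithUnnamedGuard` returns `false`, while
    `heldWithNamedGuard` returns `true`; in both cases the mutex is released
    again once the function returns.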
    dklimkin authored Apr 30, 2024
    29dda26
  38. [Clang][Sema] fix a bug on template partial specialization (llvm#89862)

    An attempt to fix
    llvm#68885 (comment)
    Deduction of an NTTP whose type is `decltype(auto)` would create an
    implicit cast expression to a dependent type, making the type of the
    primary template definition (`InjectedClassNameSpecialization`) and that
    of its partial specialization differ. Prevent emitting the cast
    expression, so that clang knows their types are identical, by removing
    `CTAK == CTAK_Deduced` when the type is `decltype(auto)`.
    
    Co-authored-by: huqizhi <[email protected]>
    jcsxky authored Apr 30, 2024
    eaee8aa
  39. [Clang][Sema] Fix a bug on template partial specialization with issue on deduction of nontype template parameter (llvm#90376)

    Fix llvm#68885
    When building an expression from a deduced argument whose kind is
    `Declaration`, and `NTTPType` (which is declared as `decltype(auto)`) is
    deduced as a reference type, `BuildExpressionFromDeclTemplateArgument`
    just creates a `DeclRef`. This is incorrect when we get the type from
    the expression, since we can't recover the original reference type from
    a `DeclRef`. Create a `SubstNonTypeTemplateParmExpr` expression instead
    to make the deduction correct. The `Replacement` expression of
    `SubstNonTypeTemplateParmExpr` just helps the deduction and may not be
    the same as the original expression.
    
    Co-authored-by: huqizhi <[email protected]>
    jcsxky authored Apr 30, 2024
    a413c56
  40. [PAC][lldb][Dwarf] Support __ptrauth-qualified types in user expres…

    …sions (llvm#84387)
    
    Depends on llvm#84384 and llvm#90329
    
    This adds support for `DW_TAG_LLVM_ptrauth_type` entries corresponding
    to explicitly signed types (e.g. free function pointers) in lldb user
    expressions. Applies PR swiftlang#8239
    from Apple's downstream and also adds tests and related code.
    
    ---------
    
    Co-authored-by: Jonas Devlieghere <[email protected]>
    kovdan01 and JDevlieghere authored Apr 30, 2024
    64248d7
  41. f78949a
  42. [mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (llv…

    …m#90413)
    
    This also removes the member overload in TypeSwitch.
    
    All other users have been removed in
    fac349a and
    bd9fdce.
    chsigg authored Apr 30, 2024
    7ac1fb0
  43. [C++20] [Modules] Add signature to the BMI recording export imported

    modules
    
    After llvm#86912,
    for the following example,
    
    ```
    export module A;
    export import B;
    ```
    
    The generated BMI of `A` won't change if the source location in `A`
    changes. Further, we plan to avoid more such changes.
    
    However, it is slightly problematic since `export import` should
    propagate all the changes.
    
    So this patch adds a signature to the BMI of C++20 modules so that we
    can propagate the changes correctly.
    ChuanqiXu9 committed Apr 30, 2024
    b2b463b
  44. [NFC] [C++20] [Modules] Use new class CXX20ModulesGenerator to genera… (

    llvm#90570)
    
    …te module file for C++20 modules instead of PCHGenerator
    
    Previously we were reusing PCHGenerator to generate the module file for
    C++20 modules, which is slightly odd. This patch uses a new class,
    'CXX20ModulesGenerator', to generate the module file for
    C++20 modules.
    ChuanqiXu9 authored Apr 30, 2024
    fce0916
  45. [flang] Fix debug-fn-info.f90 test

    91a8cb7 was originally written
    before 8d53866 landed. The latter
    changed how main is emitted, which changed the numbering of the
    subprograms in the test output.
    
    To fix this I've added a check for the new _QQmain and renumbered
    the existing checks.
    DavidSpickett committed Apr 30, 2024
    21f8ced
  46. [NFC] [tests] Don't try to remove and create the same directory

    In the test
    clang/test/Modules/no-transitive-source-location-change.cppm, there were
    reports of invalid directory names on Windows. The reason may be that
    the test removes and then re-creates the same directory. This patch
    avoids that pattern.
    ChuanqiXu9 committed Apr 30, 2024
    10aab63
  47. [offload] Fix missing reference decrement introduced by merge resolution

    Re-added a line that was dropped from 'deinitRuntime()' during
    merge-conflict resolution.
    
    Change-Id: Iee2c8b2fe63d8cd36cdb9befca2e8c93384087d9
    mhalk authored and ronlieb committed Apr 30, 2024
    79ca523
  48. [Clang][Sema] Do not accept "vector _Complex" for AltiVec/ZVector (ll…

    …vm#90467)
    
    The AltiVec (POWER) and ZVector (IBM Z) language extensions do not
    support using the "vector" keyword when the element type is a complex
    type, but current code does not verify this.
    
    Add a Sema check and diagnostic for this case.
    
    Fixes: llvm#88399
    uweigand authored Apr 30, 2024
    f73e87f
  49. [AMDGPU] Fix gfx12 waitcnt type for image_msaa_load (llvm#90201)

    image_msaa_load is actually encoded as a VSAMPLE instruction and
    requires the appropriate waitcnt variant.
    dstutt authored Apr 30, 2024
    62dea99
  50. Fix output in coro-elide-thinlto.cpp (llvm#90579)

    Current dir can be read-only. Use a temp path instead.
    dklimkin authored Apr 30, 2024
    fb2d305
  51. f10685f
  52. 0061616
  53. 3fca9d7
  54. [X86] combineMulToPMADDWD/combineMulToPMULDQ/reduceVMULWidth - pull o…

    …ut repeated SDLoc(). NFC.
    RKSimon committed Apr 30, 2024
    066dc1e
  55. [X86] Add TODO for getTargetConstantFromBasePtr to support non-zero o…

    …ffsets.
    
    As noted on llvm#66991 - we sometimes share vector constant pool entries, referencing subvectors within them via pointer offsets
    RKSimon committed Apr 30, 2024
    2cb97c7
  56. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…

    …g/ec/llvm-project into amd-staging
    searlmc1 committed Apr 30, 2024
    1646003
  57. [InstCombine] Fold trunc nuw/nsw (x xor y) to i1 to x != y (llvm#…

    …90408)
    
    Fold:
    ``` llvm
    define i1 @src(i8 %x, i8 %y) {
      %xor = xor i8 %x, %y
      %r = trunc nuw/nsw i8 %xor to i1
      ret i1 %r
    }
    
    define i1 @tgt(i8 %x, i8 %y) {
      %r = icmp ne i8 %x, %y
      ret i1 %r
    }
    ```
    
    Proof: https://alive2.llvm.org/ce/z/dcuHmn
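
As a complement to the Alive2 proof, the fold can also be checked by brute force over all i8 inputs. This is a sketch under stated assumptions: the function names are hypothetical, and the `nuw`/`nsw` rules are modeled as "the truncation is only defined when the i8 value is 0 or 1 (`nuw`) or 0 or -1 (`nsw`)".

```cpp
#include <cstdint>

// trunc nuw i8 -> i1 is only defined when the dropped high bits are zero.
bool truncNuwDefined(uint8_t v) { return v <= 1; }

// trunc nsw i8 -> i1 is only defined when the value is 0 or -1 (0xFF).
bool truncNswDefined(uint8_t v) { return v == 0 || v == 0xFF; }

// Checks, for every pair of i8 inputs where the flagged trunc is defined,
// that the low bit of (x xor y) equals (x != y).
bool foldHoldsWithFlags() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y) {
      const uint8_t v = static_cast<uint8_t>(x ^ y);
      const bool src = (v & 1) != 0; // result of the trunc to i1
      const bool tgt = x != y;       // result of icmp ne
      if ((truncNuwDefined(v) || truncNswDefined(v)) && src != tgt)
        return false;
    }
  return true;
}

// Without the flags the fold is not valid: x xor y == 2 truncates to 0
// even though x != y.
bool foldHoldsWithoutFlags() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if ((((x ^ y) & 1) != 0) != (x != y))
        return false;
  return true;
}
```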
    YanWQ-monad authored Apr 30, 2024
    34c89ef
  58. 66e1d2c
  59. [RISCV] Remove -riscv-insert-vsetvl-strict-asserts flag (llvm#90171)

    This flag has been enabled by default for almost two years now since
    1f06398, and at this stage we probably
    shouldn't be falling back to the fixups.
    
    This removes the flag so we always perform the assertion, as well as
    making sure that CurInfo is always valid on exit: We shouldn't leave
    emitVSETVLIs with an uninitialized VSETVLIInfo.
    lukel97 authored Apr 30, 2024
    7faf343
  60. af5d41e
  61. 2f9462e
  62. [NFC][Clang] Update P2718R0 implementation status to partially supported (

    llvm#90577)
    
    Once llvm#85613 is fixed, we can
    mark this feature as fully supported.
    
    Signed-off-by: yronglin <[email protected]>
    yronglin authored Apr 30, 2024
    6fab3f2
  63. [LTO] Reset DiscardValueNames in optimize(). (llvm#78705)

    libLTO parses options late, so at the moment the option is ignored. To
    fix that, re-set it in optimize(), as at this point the options have been
    parsed. When LTOCodeGenerator's constructor executes, the options
    haven't been parsed by the linker to libLTO yet.
    
    Note that we keep the value name of `%add = add..` because when the
    module is imported, DiscardValueNames is still set to false (the default
    when building with assertions).
    
    I tried to improve this in libLTO, but I am not sure if there's a
    suitable callback when all options have been set.
    
    PR: llvm#78705
    fhahn authored Apr 30, 2024
    f3ac55f
  64. bb95f5d
  65. 5cd074f
  66. [LAA] Pass maximum stride to isSafeDependenceDistance. (llvm#90036)

    As discussed in llvm#88039, support
    different strides with isSafeDependenceDistance by passing the maximum
    of both strides.
    
    isSafeDependenceDistance tries to prove that
        |Dist| > BackedgeTakenCount * Step
    holds. Choosing the maximum stride computes the maximum range accessed by
    the loop for all strides.
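
The inequality above can be illustrated on concrete integers. This is only a simplified sketch: the real analysis reasons about symbolic expressions rather than constants, and the function name and parameters here are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Returns true when |dist| > backedgeTakenCount * step, where step is the
// larger of the two strides. Assumes the product does not overflow.
bool isSafeDistance(int64_t dist, uint64_t backedgeTakenCount,
                    uint64_t strideA, uint64_t strideB) {
  const uint64_t step = std::max(strideA, strideB);
  const uint64_t absDist = dist < 0 ? 0 - static_cast<uint64_t>(dist)
                                    : static_cast<uint64_t>(dist);
  return absDist > backedgeTakenCount * step;
}
```

For example, with a backedge-taken count of 9 and strides 4 and 8, a distance of 64 is rejected (64 is not greater than 9 * 8), whereas using only the smaller stride would have accepted it (64 is greater than 9 * 4).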
    
    PR: llvm#90036
    fhahn authored Apr 30, 2024
    82219e5
  67. [DAGCombiner] Fix mayAlias not accounting for scalable MMOs with offs…

    …ets (llvm#90573)
    
    In llvm#70452 DAGCombiner::mayAlias was taught to handle scalable sizes, but
    when it checks via AA->isNoAlias it didn't take into account the case
    where the size is scalable but there was an offset too.
    
    For the fixed length case the offset was just accounted for by adding to
    the LocationSize, but for the scalable case there doesn't seem to be a
    way to represent both a scalable and fixed part in it. So this patch
    works around it by bailing if there is an offset.
    
    Fixes llvm#90559
    lukel97 authored Apr 30, 2024
    5e03c0a
  68. [AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)

    Thanks to ExtensionSet::toLLVMFeatureList, all values of ArchExtKind
    should correspond to a particular -target-feature. The valid values of
    -target-feature are in turn defined by SubtargetFeature defs.
    
    Therefore we can generate ArchExtKind from the tablegen data. This is
    done by adding an Extension class which derives from SubtargetFeature.
    
    Because the Has* FieldNames do not always correspond to the AEK_
    names ("extensions", as defined in TargetParser), and AEK_ names do
    not always correspond to -march strings, some additional enum entries
    have been added to remap the names. I have renamed these to make the
    naming consistent, but split them into a separate PR to keep the diff
    reasonable (llvm#90320)
    tmatheson-arm authored Apr 30, 2024
    61b2a0e
  69. merge main into amd-staging

    Change-Id: I95739002226a44f9c97a6b2ea2e349ec57b7a9f1
    Jenkins committed Apr 30, 2024
    c4fa736
  70. 1c17252
  71. e50a857
  72. Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summari…

    …es (llvm#90497)
    
    GUIDs often have content in the higher bits of a 64-bit entry, so using
    the unabbrev encoding is inefficient (lots of VBR control bits).
    Instead, use an abbrev with two 32-bit fixed width chunks.
    The abbrev also helps encode the "count" in one place instead of
    in every record.
    
    Reduces size of distributed backend summary files by 8.7% in one
    example app.
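
The size argument can be made concrete with a small bit-count sketch. It assumes that unabbreviated record operands are emitted as VBR6 fields (6-bit chunks carrying 5 payload bits plus a continuation bit); `vbrBits` and `kFixedGuidBits` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bits needed to emit `value` as a VBR field with `width` bits per chunk:
// each chunk carries width-1 payload bits plus one continuation bit.
unsigned vbrBits(uint64_t value, unsigned width) {
  assert(width >= 2 && width <= 32);
  const unsigned payload = width - 1;
  unsigned chunks = 1;
  for (value >>= payload; value != 0; value >>= payload)
    ++chunks;
  return chunks * width;
}

// Two 32-bit fixed-width chunks always cost 64 bits.
constexpr unsigned kFixedGuidBits = 2 * 32;
```

Under these assumptions a value with its top bit set needs 13 chunks, or 78 bits, as VBR6, compared with a constant 64 bits for the fixed-width form.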
    
    Co-authored-by: Jan Voung <[email protected]>
    jvoung and jvoung authored Apr 30, 2024
    adabdc1
  73. e4c0f4a
  74. 7ae32bf
  75. Revert "[AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)"

    This reverts commit 61b2a0e.
    
    Reason: AArch64TargetParserDef.inc not found while building clang
    tmatheson-arm committed Apr 30, 2024
    35e6bae
  76. b60a2b9
  77. Revert "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO…

    … summaries" (llvm#90610)
    
    Reverts llvm#90497
    Broke some LLD tests.
    jvoung authored Apr 30, 2024
    2aabfc8
  78. c106abf
  79. Do not use R12 for indirect tail calls with PACBTI (llvm#82661)

    When compiling for thumbv8.1m with +pacbti and making an indirect tail
    call, the compiler was free to put the function pointer into R12.
    
    This is incorrect because R12 is restored to contain authentication code
    for the caller's return address.
    
    This patch excludes R12 from the set of registers the compiler can put
    the function pointer in.
    
    Fixes llvm#75998
    eleanor-arm authored Apr 30, 2024
    c12bc57
  80. Revert "[Modules] No transitive source location change (llvm#86912)"

    This reverts commit 6c31104.
    
    Required by the post commit comments: llvm#86912
    ChuanqiXu9 committed Apr 30, 2024
    d333a0d
  81. 8d28e58
  82. Adding memref normalization of affine.prefetch (llvm#89675)

    Added support for memref-normalization for prefetch.
    
    Signed-off-by: Alexandre Eichenberger <[email protected]>
    AlexandreEichenberger authored Apr 30, 2024
    a7b968a
  83. [gn build] Port 6ea0c0a

    llvmgnsyncbot committed Apr 30, 2024
    ea81daf
  84. [gn build] Port a5cc951

    llvmgnsyncbot committed Apr 30, 2024
    622ec1f
  85. [SystemZ] Enable MachineCombiner for FP reassociation (llvm#83546)

    Enable MachineCombining for FP add, sub and mul.
    
    In order for this to work, the default instruction selection of reg/mem opcodes is disabled for ISD nodes that carry the flags that allow reassociation. The reg/mem folding is instead done after MachineCombiner by PeepholeOptimizer. SystemZInstrInfo optimizeLoadInstr() and foldMemoryOperandImpl() ("LoadMI version") have been implemented for this purpose also by this patch.
    JonPsson1 authored Apr 30, 2024
    6c32a1f
  86. [RISCV] Use consume_front to parse rv32/rv64 in RISCVISAInfo::parse*A…

    …rchString. NFC (llvm#90562)
    
    This replaces some starts_with calls with consume_front. This allows us
    to remove a later assumption that the prefix was 4 characters. We would
    eventually need to fix this anyway if we ever support rv128.
    
    Noticed while reviewing the RISCVISAInfo code for other reasons.
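
The difference between the two styles can be sketched with `std::string_view`. Both `consumeFront` (modeled on the behavior of `llvm::StringRef::consume_front`) and `parseXLen` are hypothetical helpers for illustration, not the code in RISCVISAInfo.

```cpp
#include <optional>
#include <string_view>

// If `s` starts with `prefix`, drops the prefix from `s` and returns true;
// otherwise leaves `s` untouched and returns false.
bool consumeFront(std::string_view &s, std::string_view prefix) {
  if (s.substr(0, prefix.size()) != prefix)
    return false;
  s.remove_prefix(prefix.size());
  return true;
}

// Returns the XLEN and leaves `arch` pointing at the rest of the string, so
// no later code has to assume how long the matched prefix was.
std::optional<unsigned> parseXLen(std::string_view &arch) {
  if (consumeFront(arch, "rv32"))
    return 32;
  if (consumeFront(arch, "rv64"))
    return 64;
  return std::nullopt;
}
```

With a check-only approach, the caller would still need to strip the prefix by a hard-coded length afterwards, which is the assumption this change removes.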
    topperc authored Apr 30, 2024
    1b942ae
  87. [flang][cuda] Fix iv store in cuf kernel (llvm#90551)

    Store of the current induction value to the user IV was not placed
    correctly in the body of the cuf kernel.
    
    @ImanHosseini
    clementval authored Apr 30, 2024
    f815d1f
  88. [flang][cuda] Add fir.cuda_alloc/fir.cuda_free operations (llvm#90525)

    This patch introduces fir.cuda_alloc/fir.cuda_free. These operations
    will be used instead of fir.alloca for local CUDA device, managed and
    unified variables.
    clementval authored Apr 30, 2024
    a9c73f6
  89. MachineLICM: Remove unnecessary isReg checks

    COPY operands are always registers.
    arsenm committed Apr 30, 2024
    114a59d
  90. [OpenACC] Fix ast-print for OpenACC Clauses

    Previously we weren't printing expressions correctly, so this patch adds
    a test to ensure we do, and fixes how expressions are printed.
    erichkeane committed Apr 30, 2024
    cc6113d
  91. 721c31e
  92. [AMDGPU] Emit s_singleuse_vdst instructions when a register is used m…

    …ultiple times in the same instruction. (llvm#89601)
    
    Previously, multiple uses of a register within the same instruction were
    being counted as multiple uses. This has been corrected to
    only count as a single use as per the specification allowing for
    more optimisation candidates.
    ScottEgerton authored Apr 30, 2024
    d97f25b
  93. [flang][OpenMP] ensure we hit the TODO for intrinsic array reduction (l…

    …lvm#90593)
    
    Before this patch we crashed lowering intrinsic array reductions.
    
    I think this was lost during a rebase. I've added a test to make sure it
    doesn't break again.
    
    Also fixed the TODO message to be more accurate.
    tblah authored Apr 30, 2024
    5ada328
  94. [flang] Adapt PolymorphicOpConversion to run on all top level ops (ll…

    …vm#90597)
    
    We might use polymorphic ops in top-level operations other than
    functions some time in the future. We need to ensure that these
    operations can be lowered.
    
    See RFC:
    
    https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations
    
    Some of the changes are from moving declaration and definition of the
    constructor function into tablegen (as requested in code review when
    altering another pass).
    tblah authored Apr 30, 2024
    df513f8
  95. [VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (llvm#9…

    …0502)
    
    This intrinsic is the VP version of `experimental.cttz.elts`.
    mshockwave authored Apr 30, 2024
    539f626
  96. [MLIR] Generalize expand_shape to take shape as explicit input (llvm#…

    …90040)
    
    This patch generalizes tensor.expand_shape and memref.expand_shape to
    consume the output shape as a list of SSA values. This enables us to
    implement generic reshape operations with dynamic shapes using
    collapse_shape/expand_shape pairs.
    
    The output_shape input to expand_shape follows the static/dynamic
    representation that's also used in `tensor.extract_slice`.
    
    Differential Revision: https://reviews.llvm.org/D140821
    
    ---------
    
    Signed-off-by: Gaurav Shukla<[email protected]>
    Signed-off-by: Gaurav Shukla <[email protected]>
    Co-authored-by: Ramiro Leal-Cavazos <[email protected]>
    Shukla-Gaurav and ramiro050 authored Apr 30, 2024
    97069a8
  97. e9305fc
  98. [X86] Add icmp i16 test coverage

    Based on llvm#90355: add basic tests covering when to extend i16 comparisons to i32
    RKSimon committed Apr 30, 2024
    38c68e0
  99. [DAG] Pull out repeated SDLoc() from SHL/SRL/SRA combines. NFC.

    We were always calling SDLoc(N) at the top of each visitSHL/SRL/SRA for the FoldConstantArithmetic call, so just reuse this as much as possible.
    RKSimon committed Apr 30, 2024
    91c52b9
  100. fbe8d2a
  101. 554be97
  102. [flang][OpenMP] Pass symTable to all genXYZ functions, NFC (llvm#90090)

    This will unify the interface a bit more.
    kparzysz authored Apr 30, 2024
    33ccd03
  103. [NFC][OpenMP][MI300] Refactoring of the checkIfAPU() method in prepar…

    …ation of an upstream patch.
    
    This patch refactors the checkIfAPU method. The revised checkIfAPU() method, using the HSA symbols HSA_AGENT_INFO_AMD_MEMORY_PROPERTIES and HSA_AMD_MEMORY_PROPERTY_AGENT_IS_APU, will be upstreamed.
    
    This patch reduces merge conflicts with the upstream method, as the detection of the GFX90a and MI300x is moved to separate methods. As such, the downstream method can be replaced by the upstream implementation.
    
    Change-Id: Id10605e7ea2248538f26ebc717341b1735495a01
    ThorBl authored and ronlieb committed Apr 30, 2024
    3751ac4
  104. 4631e7b
  105. [LegalizeDAG] Simplify interface to PromoteReduction. NFC

    Return an SDValue instead of pushing to the Results vector. Let
    the caller do the push.
    topperc committed Apr 30, 2024
    267329d
  106. [VP] Fix unit test failures caused by llvm#90502

    Forgot to add vp.cttz.elts into the unittest. Also, I didn't specify the
    positions of overloaded type parameters.
    mshockwave committed Apr 30, 2024
    6ab49fc
  107. 4cd11c9
  108. dbe3766
  109. [MLIR][Arith] expand-ops: Support mini/maxi (llvm#90575)

    Expand `arith.minsi`, `arith.minui`, `arith.maxsi`, `arith.maxui` into
    `arith.cmpi` and `arith.select`.
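
In scalar terms, the expansion is a comparison followed by a select. The sketch below shows the idea in plain C++; the exact comparison predicates chosen by the pass are an assumption here, and the function names simply mirror the op names.

```cpp
#include <cstdint>

// Signed min/max: compare as signed integers, then select.
int32_t minsi(int32_t a, int32_t b) { const bool c = a < b; return c ? a : b; }
int32_t maxsi(int32_t a, int32_t b) { const bool c = a > b; return c ? a : b; }

// Unsigned min/max: same bits, but the comparison is unsigned.
uint32_t minui(uint32_t a, uint32_t b) { const bool c = a < b; return c ? a : b; }
uint32_t maxui(uint32_t a, uint32_t b) { const bool c = a > b; return c ? a : b; }
```

The signed and unsigned variants differ only in the comparison: the bit pattern 0xFFFFFFFF is the minimum against 1 when read as signed (-1) and the maximum when read as unsigned.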
    
    ---------
    
    Co-authored-by: Jakub Kuderski <[email protected]>
    mgehre-amd and kuhar authored Apr 30, 2024
    30badf9
  110. [LangRef] Try to clarify mustprogress wording. (llvm#90510)

    Ensure it's clear that:
    
    - Infinite loops in non-mustprogress functions are well-defined, even if
    they're called by mustprogress functions.
    - Infinite recursion in mustprogress functions is not well-defined.
    
    Looking at D86233, it's clear this was the intent, but the "transitive"
    wording is ambiguous. Instead, just explicitly state that infinite loops
    written in non-mustprogress functions count as progress.
    efriedma-quic authored Apr 30, 2024
    600cae7
  111. 7dd4ce4
  112. [libc++][NFC] Fixes a status page note and a minor copy & paste error…

    … in a test (llvm#90399)
    
    - Adds a status page note for P3142R0
    - Fixes a copy & paste error in tuple protocol for `complex`
    H-G-Hristov authored Apr 30, 2024
    9af7f40
  113. a754ce0
  114. [RISCV] Handle fixed length vectors with exact VLEN in lowerINSERT_SU…

    …BVECTOR (llvm#84107)
    
    This is the insert_subvector equivalent to llvm#79949, where we can avoid
    sliding up by the full LMUL amount if we know the exact subregister the
    subvector will be inserted into.
    
    This mirrors the lowerEXTRACT_SUBVECTOR changes in that we handle this
    in two parts:
    
    - We handle fixed length subvector types by converting the subvector to
    a scalable vector. But unlike EXTRACT_SUBVECTOR, we may also need to
    convert the vector being inserted into too.
    
    - Whenever we don't need a vslideup because either the subvector fits
    exactly into a vector register group *or* the vector is undef, we need
    to emit an insert_subreg ourselves because RISCVISelDAGToDAG::Select
    doesn't correctly handle fixed length subvectors yet: see d7a28f7
    
    A subvector exactly fits into a vector register group if its size is a
    known multiple of the size of a vector register, and this adds a new
    overload for TypeSize::isKnownMultipleOf for scalable to scalable
    comparisons to help reason about this.
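
The "known multiple" reasoning can be sketched with a miniature size type. This is a hypothetical model, not LLVM's `TypeSize`: a `Size` is `minValue` when fixed, and `minValue * vscale` (for some unknown vscale of at least 1) when scalable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a fixed-or-scalable size.
struct Size {
  uint64_t minValue;
  bool scalable;
};

// Returns true only if lhs is a multiple of rhs for every possible vscale.
bool isKnownMultipleOf(Size lhs, Size rhs) {
  assert(rhs.minValue != 0);
  // A fixed size cannot be proven to be a multiple of a scalable one,
  // because the scalable size grows with vscale.
  if (!lhs.scalable && rhs.scalable)
    return false;
  // Scalable vs scalable: the vscale factor cancels out.
  // Scalable vs fixed: vscale * x stays a multiple of y whenever x is.
  return lhs.minValue % rhs.minValue == 0;
}
```

In this model, a scalable subvector of minimum size 128 is a known multiple of a scalable register of minimum size 64, while one of minimum size 96 is not.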
    
    I've left RISCVISelDAGToDAG::Select untouched for now (minus relaxing an
    invariant), so that the insert_subvector and extract_subvector code
    paths are the same.
    
    We should teach it to properly handle fixed length subvectors in a
    follow-up patch, so that the "exact subregister" logic is handled in one
    place instead of being spread across both RISCVISelDAGToDAG.cpp and
    RISCVISelLowering.cpp.
    lukel97 authored Apr 30, 2024
    f565b79
  115. f0cc373
  116. 40083cf
  117. [lldb] Support custom LLVM formatting for variables (llvm#81196)

    Adds support for applying LLVM formatting to variables.
    
    The reason for this is to support cases such as the following.
    
    Let's say you have two separate bytes that you want to print as a
    combined hex value. Consider the following summary string:
    
    ```
    ${var.byte1%x}${var.byte2%x}
    ```
    
    The output of this will be: `0x120x34`. That is, a `0x` prefix is
    unconditionally applied to each byte. This is unlike printf formatting
    where you must include the `0x` yourself.
    
    Currently, there's no way to do this with summary strings; instead,
    you'll need a summary provider in Python or C++.
    
    This change introduces formatting support using LLVM's formatter system.
    This allows users to achieve the desired custom formatting using:
    
    ```
    ${var.byte1:x-}${var.byte2:x-}
    ```
    
    Here, each variable is suffixed with `:x-`. This is passed to the LLVM
    formatter as `{0:x-}`. For integer values, `x` declares the output as
    hex, and `-` declares that no `0x` prefix is to be used. Further, one
    could write:
    
    ```
    ${var.byte1:x-2}${var.byte2:x-2}
    ```
    
    Where the added `2` results in these bytes being written with a minimum
    of 2 digits.
    
    An alternative considered was to add a new format specifier that would
    print hex values without the `0x` prefix. The reason that approach was
    not taken is because in addition to forcing a `0x` prefix, hex values
    are also forced to use leading zeros. This approach lets the user have
    full control over formatting.
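
The two outputs discussed in this message can be reproduced with a printf-style analog. This sketch does not use lldb or LLVM's formatter; the helper names are hypothetical and only mimic the described results of `%x` (forced `0x` prefix) and `:x-2` (no prefix, at least 2 digits).

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Mimics the existing behavior: every byte gets its own 0x prefix.
std::string hexWithPrefix(uint8_t byte) {
  char buf[8];
  std::snprintf(buf, sizeof buf, "0x%02x", static_cast<unsigned>(byte));
  return buf;
}

// Mimics the described result of `:x-2`: no prefix, minimum of 2 digits.
std::string hexBare2(uint8_t byte) {
  char buf[8];
  std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(byte));
  return buf;
}

// Concatenates two bytes, as the summary strings above do.
std::string combinedWithPrefix(uint8_t b1, uint8_t b2) {
  return hexWithPrefix(b1) + hexWithPrefix(b2);
}
std::string combinedBare(uint8_t b1, uint8_t b2) {
  return hexBare2(b1) + hexBare2(b2);
}
```

For bytes 0x12 and 0x34, the first form yields `0x120x34` and the second yields `1234`.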
    kastiglione authored Apr 30, 2024
    7a8d15e
  118. [BOLT] Fix build-time assertion in RewriteInstance (llvm#90540)

    We use pwrite() in RewriteInstance to update contents of existing
    sections. pwrite() requires file position to be set past the written
    offset which we guarantee at the start of rewriteFile(). Then we had an
    implicit assumption in patchBuildID() that the file position will be set
    again in patchELFSymTabs() after being reset in patchELFPHDRTable().
    That assumption was broken in llvm#90300. The fix is to save and restore
    file position in patchELFPHDRTable(). Then we don't have to update it
    again in patchELFSymTabs().
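
The save-and-restore idea can be sketched with a standard stream standing in for the output file. This is not BOLT's code: `patchPreservingPosition` and `demoPatch` are hypothetical names, and `std::ostringstream` is used only so the example is self-contained.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>

// Overwrites bytes at `offset`, then puts the write position back where it
// was, so later code does not depend on another function resetting it.
void patchPreservingPosition(std::ostream &os, std::streamoff offset,
                             std::string_view bytes) {
  const std::ostream::pos_type saved = os.tellp();
  os.seekp(offset);
  os.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
  os.seekp(saved);
}

// Writes ten bytes, patches two of them in place, then appends one more.
std::string demoPatch() {
  std::ostringstream os;
  os << "0123456789";
  patchPreservingPosition(os, 2, "AB");
  os << "X"; // lands at the end because the position was restored
  return os.str();
}
```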
    maksfb authored Apr 30, 2024
    49bb993
  119. [mlir][NFC] update code to use mlir::dyn_cast/cast/isa (llvm#90633)

    Fix compiler warning caused by using deprecated interface
    (llvm#90413)
    Peiming Liu authored Apr 30, 2024
    d235369
  120. [WebAssembly] Add preprocessor define for half-precision (llvm#90528)

    This adds the preprocessor define for the half-precision feature and
    also adds preprocessor tests.
    aheejin authored Apr 30, 2024
    7662f95
  121. [Clang][Sema][Parse] Delay parsing of noexcept-specifiers in friend f…

    …unction declarations (llvm#90517)
    
    According to [class.mem.general] p8:
    > A complete-class context of a class (template) is a
    > - function body,
    > - default argument,
    > - default template argument,
    > - _noexcept-specifier_, or
    > - default member initializer
    >
    > within the member-specification of the class or class template.
    
    When testing llvm#90152, it came to my attention that we do _not_ consider
    the _noexcept-specifier_ of a friend function declaration to be a
    complete-class context (something which the Microsoft standard library
    depends on). Although a comment states that this is "consistent with
    what other implementations do", the only other implementation that
    exhibits this behavior is GCC (MSVC and EDG both late-parse the
    _noexcept-specifier_).
    
    This patch changes _noexcept-specifiers_ of friend function declarations
    to be late parsed, which is in agreement with the standard & majority of
    implementations. Pre-llvm#90152, our existing implementation falls "in
    between" the implementation consensus: within non-template classes, we
    would not find later-declared members (qualified or unqualified),
    while within class templates we would not find later-declared members
    when named with an unqualified name, but we would find members named with a
    qualified name (even when the lookup context is the current instantiation).
    Therefore, this _shouldn't_ be a breaking change -- any code that didn't
    compile will continue to not compile (since a _noexcept-specifier_ is
    not part of the deduction substitution
    loci; see [temp.deduct.general] p7), and any code which
    did compile should continue to do so.
    sdkrystian authored Apr 30, 2024
    f061a39
  122. Reapply "[Clang][Sema] Diagnose class member access expressions namin…

    …g non-existent members of the current instantiation prior to instantiation in the absence of dependent base classes (llvm#84050)" (llvm#90152)
    
    Reapplies llvm#84050, addressing a bug which causes a crash when an
    expression with the type of the current instantiation is used as the
    _postfix-expression_ in a class member access expression (arrow form).
    sdkrystian authored Apr 30, 2024
    8009bbe
  123. [OpenACC] Private Clause on Compute Constructs (llvm#90521)

    The private clause is the first that takes a 'var-list', thus this has a
    lot of additional work to enable the var-list type. A 'var' is a
    traditional variable reference, subscript, member-expression, or
    array-section, so checking of these is pretty minor.
    
    Note: This ran into some issues with array-sections (aka sub-arrays)
    that will be fixed in a follow-up patch.
    erichkeane authored Apr 30, 2024
    fa67986
  124. [GVNSink] Fix incorrect codegen with respect to GEPs llvm#85333 (llvm…

    …#88440)
    
    As mentioned in llvm#68882 and
    https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699
    
    GEP arithmetic isn't consistent across different types. GVNSink didn't
    realize this and sank all GEPs as long as their operands could be wired
    via PHIs in a post-dominator.
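
    To make the type dependence concrete, here is a minimal arithmetic sketch.
    It models only a single-index GEP, where the index is scaled by the size
    of the source element type; `gepByteOffset` and `sameOffset` are
    hypothetical helpers, not GVNSink code.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Byte offset contributed by a single-index GEP: the index is scaled by
    // the size of the GEP's source element type.
    std::int64_t gepByteOffset(std::int64_t ElemSizeInBytes, std::int64_t Index) {
      return ElemSizeInBytes * Index;
    }

    // True only if two single-index GEPs with the same index reach the same
    // byte offset. Illustrative check, not the pass's actual logic.
    bool sameOffset(std::int64_t SizeA, std::int64_t SizeB, std::int64_t Index) {
      return gepByteOffset(SizeA, Index) == gepByteOffset(SizeB, Index);
    }
    ```

    With index 2, an `i32` GEP moves 8 bytes while an `i64` GEP moves 16, so
    merging the two into one GEP and feeding the index through a PHI would
    address the wrong memory on one of the paths.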
    
    Fixes: llvm#85333
    hiraditya authored Apr 30, 2024
    1c979ab
  125. [libc++][ranges] Implement LWG4053 and LWG4054 (llvm#88612)

    Implement
    - LWG4053 Unary call to `std::views::repeat` does not decay the argument
    - LWG4054 Repeating a `repeat_view` should repeat the view
    
    Signed-off-by: yronglin <[email protected]>
    yronglin authored Apr 30, 2024
    0ecc164
  126. [OpenACC] Fix test failure from fa67986

    Seemingly some other patch went in that altered how much dependence was
    printed vs the actual names, and it changed the ast-dump results.
    Commit to fix this test.
    erichkeane committed Apr 30, 2024
    41f9c78
  127. [mlir][sparse] fix sparse tests that uses reshape operations. (llvm#9…

    …0637)
    
    Due to generalization introduced in
    llvm#90040
    Peiming Liu authored Apr 30, 2024
    7cbaaed
  128. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…

    …g/ec/llvm-project into amd-staging
    searlmc1 committed Apr 30, 2024
    824380f
  129. 52cb953
  130. [lldb] Fix a warning

    This patch fixes:
    
      third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
      error: comparison of integers of different signs: 'const unsigned
      int' and 'const int' [-Werror,-Wsign-compare]
    kazutakahirata committed Apr 30, 2024
    5f88f0c
  131. 9b07a03
  132. [IR] Use StringRef::operator== instead of StringRef::equals (NFC) (ll…

    …vm#90550)
    
    I'm planning to remove StringRef::equals in favor of
    StringRef::operator==.
    
    - StringRef::operator== outnumbers StringRef::equals by a factor of 22
      under llvm/ in terms of their usage.
    
    - The elimination of StringRef::equals brings StringRef closer to
      std::string_view, which has operator== but not equals.
    
    - S == "foo" is more readable than S.equals("foo"), especially for
      !Long.Expression.equals("str") vs Long.Expression != "str".
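
    The readability point can be shown without LLVM headers, since
    `std::string_view` offers the same comparison operators. The two functions
    below are hypothetical examples, not code from the patch.

    ```cpp
    #include <cassert>
    #include <string_view>

    // Positive comparison: reads like ordinary equality.
    bool isFoo(std::string_view S) { return S == "foo"; }

    // Negative comparison: no leading '!' far away from the method call.
    bool isNotStr(std::string_view S) { return S != "str"; }
    ```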
    kazutakahirata authored Apr 30, 2024
    4e6f6fd
  133. [mlir][tensor] Fix integration tests that uses reshape ops. (llvm#90649)

    Due to generalization introduced in
    llvm#90040
    hanhanW authored Apr 30, 2024
    a1423ba
  134. Revert "[GVNSink] Fix incorrect codegen with respect to GEPs llvm#85333

    …" (llvm#90658)
    
    Reverts llvm#88440
    
    Test failing on Windows:
    https://lab.llvm.org/buildbot/#/builders/233/builds/9396
    ```
    Input file: <stdin>
    # | Check file: C:\buildbot\as-builder-8\llvm-nvptx-nvidia-win\llvm-project\llvm\test\Transforms\GVNSink\different-gep-types.ll
    # | 
    # | -dump-input=help explains the following input dump.
    # | 
    # | Input was:
    # | <<<<<<
    # |            .
    # |            .
    # |            .
    # |           42:  br label %if.end6 
    # |           43:  
    # |           44: if.else5: ; preds = %if.else 
    # |           45:  br label %if.end6 
    # |           46:  
    # |           47: if.end6: ; preds = %if.else5, %if.then3, %if.then 
    # | next:67'0             X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
    # | next:67'1                                                        with "IF_THEN" equal to "%if\\.then"
    # | next:67'2                                                        with "IF_THEN3" equal to "%if\\.then3"
    # | next:67'3                                                        with "IF_ELSE5" equal to "%if\\.else5"
    # |           48:  %.sink1 = phi i32 [ -8, %if.then3 ], [ -4, %if.else5 ], [ 8, %if.then ] 
    # | next:67'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # | next:67'4      ?                                                                        possible intended match
    # |           49:  %0 = load ptr, ptr %__i, align 4 
    # | next:67'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # |           50:  %incdec.ptr4 = getelementptr inbounds i8, ptr %0, i32 %.sink1 
    # | next:67'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # |           51:  store ptr %incdec.ptr4, ptr %__i, align 4 
    # | next:67'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # |           52:  ret void 
    # | next:67'0     ~~~~~~~~~~
    # |           53: } 
    # | next:67'0     ~~
    # | >>>>>>
    # `-----------------------------
    # error: command failed with exit status: 1
    ```
    hiraditya authored Apr 30, 2024
    cf49d07
  135. NFC add a new precommit test case for PPCMIpeephole (llvm#90656)

    Add pre-commit MIR test for PR "[Promote Pseudo Opcode from 32-bit to
    64-bit after eliminating the extsw instruction in PPCMIPeepholes
    optimization](llvm#85451)" which
    fixes bug reported in the issue "[Inconsistent Output at -O1 and -O2
    Optimization Levels on PowerPC64 Due to Complex Type Casting and Nested
    Loop Structure](llvm#71030)".
    diggerlin authored Apr 30, 2024
    70ada5b
  136. [RISCV] Make RISCVISAInfo::updateMaxELen extension checking more robu…

    …st. Add inference from V extension. (llvm#90650)
    
    We weren't fully checking that we parsed Zve*x/f/d correctly. This could
    break if a new extension is added that starts with Zve.
    
    We were assuming that Zve64d is present whenever V is, so we only
    inferred from Zve*. It's more correct to infer ELEN from V itself too.
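
    A strict parse of this kind might look like the sketch below. It is not
    the code in `RISCVISAInfo`; `elenFromExtension` is a hypothetical helper
    that assumes lowercase extension names and accepts only `v` and the forms
    `zve32x`, `zve32f`, `zve64x`, `zve64f`, and `zve64d`.

    ```cpp
    #include <cassert>
    #include <charconv>
    #include <optional>
    #include <string_view>
    #include <system_error>

    // Hypothetical helper: return the ELEN implied by one extension name, or
    // std::nullopt if the name is not "v" or a well-formed Zve* name.
    std::optional<unsigned> elenFromExtension(std::string_view Ext) {
      if (Ext == "v")
        return 64; // V itself implies ELEN=64.
      constexpr std::string_view Prefix = "zve";
      if (Ext.substr(0, Prefix.size()) != Prefix)
        return std::nullopt;
      Ext.remove_prefix(Prefix.size());
      if (Ext.empty())
        return std::nullopt;
      // Exactly one trailing class letter: x, f, or d.
      const char Class = Ext.back();
      if (Class != 'x' && Class != 'f' && Class != 'd')
        return std::nullopt;
      Ext.remove_suffix(1);
      // Everything in between must be a number with nothing left over.
      unsigned ELen = 0;
      const char *First = Ext.data();
      const char *Last = Ext.data() + Ext.size();
      auto [Ptr, Ec] = std::from_chars(First, Last, ELen);
      if (Ec != std::errc() || Ptr != Last)
        return std::nullopt;
      if (ELen != 32 && ELen != 64)
        return std::nullopt;
      if (Class == 'd' && ELen != 64)
        return std::nullopt; // the sketch treats zve32d as malformed
      return ELen;
    }
    ```

    The point of checking both the suffix and the leftover characters is that
    a future name which merely starts with `zve` is rejected instead of being
    misread as an ELEN.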
    topperc authored Apr 30, 2024
    05d04f0
  137. [llvm][profdata][NFC] Support 64-bit weights in ProfDataUtils (llvm#8…

    …6607)
    
    Since some places, like SimplifyCFG, work with 64-bit weights, we supply
    an API in ProfDataUtils to extract the weights accordingly.
    
    We change the API slightly to disambiguate the 64-bit version from the
    32-bit version.
    ilovepi authored Apr 30, 2024
    7538df9
  138. [DFSan] Replace cat with cmake -E cat (llvm#90557)

    `CMake` supports [this
    command](https://cmake.org/cmake/help/latest/manual/cmake.1.html#cmdoption-cmake-E-arg-cat)
    as of version 3.18. [D151344](https://reviews.llvm.org/D151344) bumped
    the minimum version to 3.20, so, it is now possible to remove the
    dependency on the external utility. This helps to cross-compile from
    Windows to Linux without installing additional tools, such as MSYS2.
    igorkudrin authored Apr 30, 2024
    2224dce
  139. [OpenMP][AIX] Implement __kmp_is_address_mapped() for AIX (llvm#90516)

    This patch implements `__kmp_is_address_mapped()` for AIX by calling
    `loadquery()` to get the load info of the process and then checking if
    the address falls within the range of the data segment of one of the
    loaded modules.
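
    The range check itself can be sketched independently of AIX. The code
    below does not call `loadquery()`; `DataSegment`, `sampleModules`, and
    `isAddressMapped` are hypothetical stand-ins, and the segment values are
    made up for the example.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Hypothetical record for one loaded module's data segment.
    struct DataSegment {
      std::uintptr_t Start;
      std::uintptr_t Size;
    };

    // Made-up module list standing in for what the loader would report.
    std::vector<DataSegment> sampleModules() {
      return {{0x1000, 0x100}, {0x4000, 0x20}};
    }

    // True if Addr lies inside the data segment of any listed module.
    bool isAddressMapped(const std::vector<DataSegment> &Modules,
                         std::uintptr_t Addr) {
      for (const DataSegment &Seg : Modules)
        if (Addr >= Seg.Start && Addr - Seg.Start < Seg.Size)
          return true;
      return false;
    }
    ```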
    xingxue-ibm authored Apr 30, 2024
    928db7e
  140. SystemZ: Implement copyPhysReg between vr128 and gr128 (llvm#90616)

    I have no idea if this is correct and I probably swapped the element
    ordering somewhere.
    arsenm authored Apr 30, 2024
    75f4baa
  141. 6992433
  142. [flang][cuda] Accept variable with UNIFIED attribute in main (llvm#90647

    )
    
    UNIFIED variables are accepted in program scope. Update the check to allow
    them.
    clementval authored Apr 30, 2024
    1fb5083
  143. [BOLT] Add ORC validation for the Linux kernel (llvm#90660)

    The Linux kernel expects ORC tables to be sorted by IP address (for
    binary search to work). Add a post-emit pass in LinuxKernelRewriter that
    validates the written .orc_unwind_ip against that expectation.
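
    The property being validated can be sketched in a few lines. This is not
    the pass in LinuxKernelRewriter; `isSortedByIP` is a hypothetical helper
    operating on already-decoded addresses, and it accepts equal neighbours,
    which the commit message does not specify either way.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Returns true if the IP column is in non-decreasing order, which is what
    // a binary search over the table relies on.
    bool isSortedByIP(const std::vector<std::uint64_t> &IPs) {
      return std::is_sorted(IPs.begin(), IPs.end());
    }
    ```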
    maksfb authored Apr 30, 2024
    c665e49
  144. [Coroutines][Test] Specify target triple in coro-elide-thinlto (llvm#…

    …90549)
    
    Resolve test failure on non-x86 linux host
    apolloww authored Apr 30, 2024
    0232b77
  145. [flang] Remove double pointer indirection for _QQEnvironmentDefaults (l…

    …lvm#90615)
    
    A double pointer was being passed to the call to FortranStart rather than just a pointer to the EnvironmentDefaults.list. This now passes `null` directly when there's no EnvironmentDefaults.list and passes the list directly when there is, removing the original global variable which was a pointer to a pointer containing null or the EnvironmentDefaults.list global.
    
    Fixes llvm#90537
    DavidTruby authored Apr 30, 2024
    ecec131
  146. [GlobalISel] Fix store merging incorrectly classifying an unknown ind…

    …ex expr as 0. (llvm#90375)
    
    During analysis, we incorrectly leave the offset part of an address info
    struct as zero, when in actual fact we failed to decompose it into base +
    offset. This results in incorrectly assuming that the address is adjacent
    to another store addr. To fix this we wrap the offset in an optional<> so
    we can distinguish between real zero and unknown.
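
    The difference between a real zero and an unknown offset can be sketched
    with `std::optional`. The struct and function below are hypothetical and
    far simpler than the GlobalISel store-merging code; they only show why an
    unknown offset must never compare as adjacent.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <optional>

    // Hypothetical, simplified address description: a base id plus an offset
    // that is std::nullopt when decomposition into base + offset failed.
    struct AddrInfo {
      int BaseId;
      std::optional<std::int64_t> Offset;
    };

    // B is adjacent to A if both offsets are known, the bases match, and B
    // starts exactly where A's SizeA bytes end.
    bool isAdjacent(const AddrInfo &A, std::int64_t SizeA, const AddrInfo &B) {
      if (!A.Offset || !B.Offset)
        return false; // unknown offset: never assume adjacency
      return A.BaseId == B.BaseId && *A.Offset + SizeA == *B.Offset;
    }
    ```

    Had the unknown offset been stored as a plain 0, the second case in the
    checks (an undecomposed address next to a store at offset 4) would have
    been accepted as adjacent.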
    
    Fixes issue llvm#90242
    aemerson authored Apr 30, 2024
    19f4d68
  147. [SLP][NFCI]Improve compile time for phis with large number of incomin…

    …g values.
    
    Added a limit of at most 128 incoming values for PHI nodes to be
    vectorized, and improved performance by using logarithmic search instead
    of linear search if the number of incoming values is > 4.
    alexey-bataev committed Apr 30, 2024
    51aac5b
  148. Fix -fno-unsafe-math-optimizations behavior (llvm#89473)

    This changes the handling of -fno-unsafe-fp-math to stop having that
    option imply -ftrapping-math. In gcc, -fno-unsafe-math-optimizations
    sets -ftrapping-math, but that dependency is based on the fact that
    -ftrapping-math is enabled by default in gcc. Because clang does not
    enable -ftrapping-math by default, there is no reason for
    -fno-unsafe-math-optimizations to set it.
    
    On the other hand, -funsafe-math-optimizations continues to imply
    -fno-trapping-math because this option necessarily disables strict
    exception semantics.
    
    This fixes llvm#87523
    Andy Kaylor authored Apr 30, 2024
    fb85a28
  149. [flang][cuda] Allow PINNED argument to host dummy (llvm#90651)

    Update the `AreCompatibleCUDADataAttrs` function to return true when one
    argument has the `PINNED` attribute and the other argument is just host
    data.
    clementval authored Apr 30, 2024
    89f8335
  150. Add basic char*_t support for libc (partial WG14 N2653) (llvm#90360)

    This PR implements a part of WG14 N2653:
    - Define C23 char8_t
    - Define C11 char16_t
    - Define C11 char32_t
    
    Missing goals are:
    - The type of UTF-8 character literals is changed from unsigned char to
      char8_t. (Since UTF-8 character literals already have type unsigned
      char, this is not a semantic change).
    - New mbrtoc8() and c8rtomb() functions declared in <uchar.h> enable
      conversions between multibyte characters and UTF-8.
    - A new ATOMIC_CHAR8_T_LOCK_FREE macro.
    - A new atomic_char8_t typedef name.
    Febbe authored Apr 30, 2024
    cd7a7a5
  151. [BOLT] Fix a warning

    This patch fixes:
    
      bolt/lib/Rewrite/LinuxKernelRewriter.cpp:855:12: error: variable
      'PrevIP' set but not used [-Werror,-Wunused-but-set-variable]
    kazutakahirata committed Apr 30, 2024
    805e08e
  152. d688162
  153. [X86] Rename test to correct bug number. NFC

    I accidentally named it pr90688 instead of pr90668.
    topperc committed Apr 30, 2024
    805f01f
  154. [RISCV][ISel] Fix types in tryFoldSelectIntoOp (llvm#90659)

    ```
    SelectionDAG has 17 nodes:
      t0: ch,glue = EntryToken
        t6: i64,ch = CopyFromReg t0, Register:i64 %2
      t8: i1 = truncate t6
              t4: i64,ch = CopyFromReg t0, Register:i64 %1
            t7: i1 = truncate t4
                t2: i64,ch = CopyFromReg t0, Register:i64 %0
              t10: i64,i1 = saddo t2, Constant:i64<1>
            t11: i1 = or t8, t10:1
          t12: i1 = select t7, t8, t11
        t13: i64 = any_extend t12
      t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
      t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
    ```
    
    `OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`,
    which ignores `ResNo` in `SDValue`.
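
    The pitfall can be modelled with a toy node that has two results, like
    the `saddo` in the dump (`i64` value, `i1` overflow bit). `Node` and
    `Value` below are hypothetical stand-ins, not the SelectionDAG classes;
    they only show how asking the node for result 0 differs from asking the
    value for its own result.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical multi-result node: one type name per result.
    struct Node {
      std::vector<std::string> ResultTypes;
      const std::string &getValueType(std::size_t ResNo) const {
        return ResultTypes[ResNo];
      }
    };

    // Hypothetical (node, result number) pair.
    struct Value {
      const Node *N;
      std::size_t ResNo;
      // Type of the specific result this value refers to.
      const std::string &getValueType() const { return N->getValueType(ResNo); }
    };

    Node makeSaddo() { return Node{{"i64", "i1"}}; }

    // Buggy pattern: goes through the node and hard-codes result 0.
    std::string overflowTypeIgnoringResNo() {
      Node N = makeSaddo();
      Value Overflow{&N, 1};
      return Overflow.N->getValueType(0);
    }

    // Fixed pattern: lets the value pick its own result number.
    std::string overflowTypeUsingResNo() {
      Node N = makeSaddo();
      Value Overflow{&N, 1};
      return Overflow.getValueType();
    }
    ```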
    
    Fix llvm#90652.
    dtcxzyw authored Apr 30, 2024
    2647bd7
  155. [InstallAPI] Cleanup I/O error handling for input lists (llvm#90664)

    Add validation in the FileList reader to check that the headers exist and use similar diagnostics in Options.cpp
    cyndyishida authored Apr 30, 2024
    278774e
  156. 0f628fd
  157. 85f28cf
  158. [SelectionDAG][X86] Add a NoWrap flag to SelectionDAG::isAddLike. NFC (

    …llvm#90681)
    
    If this flag is set, Xor will not be considered AddLike. If an Xor were
    treated as an Add it may wrap. If we can prove there would be no carry out and
    thus no wrap, the Xor would be turned into a disjoint Or by DAGCombine.
    
    Use this new flag to fix a bug in X86 where an Xor is incorrectly being treated
    as an NUWAdd.
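
    A small arithmetic sketch shows how an xor can equal an add and still
    wrap. The commit message does not say which xor form is involved; a
    sign-bit xor on 32-bit values is assumed here purely as an example, and
    all function names are hypothetical.

    ```cpp
    #include <cassert>
    #include <cstdint>

    constexpr std::uint32_t SignBit = 0x80000000u;

    // Flipping the top bit gives the same 32-bit result as adding it...
    std::uint32_t viaXor(std::uint32_t X) { return X ^ SignBit; }
    std::uint32_t viaAdd(std::uint32_t X) { return X + SignBit; } // mod 2^32

    // ...but the add carries out of bit 31 whenever X already has its top bit
    // set, so it cannot be labelled "no unsigned wrap".
    bool addWouldWrap(std::uint32_t X) { return X > UINT32_MAX - SignBit; }
    ```

    For inputs where `addWouldWrap` is true, the xor and the wrapping add
    still agree on the result, yet any transform that relies on the add
    being NUW would be reasoning from a false premise.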
    
    Fixes llvm#90668.
    topperc authored Apr 30, 2024
    a03eeb0

Commits on May 1, 2024

  1. [mlir][Tensor] Fix unpack -> transpose folding pattern for padded unp…

    …acks (llvm#90678)
    
    Previously if the producer tensor.unpack op had "unpadding" semantics,
    the folding pattern would construct a destination that does not match
    with the result type of the transpose. Because both ops are DPS we can
    just reuse the destination of the transpose.
    
    Additionally cleans up a bunch of trailing whitespace in the test file.
    qedawkins authored May 1, 2024
    75f7295
  2. [AIX] Add git revision to .file string (llvm#88164)

    If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file`
    string. The revision can be set with `LLVM_FORCE_VC_REVISION`.
    
    Before:
    `.file	"git_revision.cpp",,"LLVM version 19.0.0git"`
    
    After:
    `.file	"git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`
    jakeegan authored May 1, 2024
    8cde1cf
  3. [flang] Added fir.dummy_scope operation to preserve dummy arguments a…

    …ssociation. (llvm#90642)
    
    The new operation is just an abstract attribute that is attached to
    [hl]fir.declare operations of dummy arguments of a subroutine.
    Dummy arguments of the same subroutine refer to the same
    fir.dummy_scope, so they can be recognized as such during FIR AliasAnalysis.
    Note that the fir.dummy_scope must be specific to the runtime
    instantiation of a subroutine, so any MLIR inlining/cloning should duplicate and
    unique it vs using the same fir.dummy_scope for different runtime instantiations.
    This is why I made it an operation rather than an attribute.
    The new operation uses a write effect on DebuggingResource, same as
    [hl]fir.declare, to avoid optimizing it away.
    vzakhari authored May 1, 2024
    986f832
  4. [Coroutines][Test] Only run coro-elide-thinlto under x86_64-linux (ll…

    …vm#90672)
    
    Previous fix llvm#90549 didn't completely address the Buildbot failures.
    Some targets may not recognize the target triple. This time, only run the
    test under x86_64-linux.
    apolloww authored May 1, 2024
    b1b1bfa
  5. [cross-project-tests] Update code to use mlir::cast (NFC)

    /llvm-project/cross-project-tests/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:41:16:
     error: 'cast' is deprecated: Use mlir::cast<U>() instead [-Werror,-Wdeprecated-declarations]
        VectorType.cast<mlir::ShapedType>(), llvm::ArrayRef<float>{2.0f, 3.0f});
                   ^
    /llvm-project/llvm/../mlir/include/mlir/IR/Types.h:345:9: note: 'cast' has been explicitly marked deprecated here
    U Type::cast() const {
            ^
    /llvm-project/cross-project-tests/debuginfo-tests/llvm-prettyprinters/gdb/mlir-support.cpp:41:16:
     error: 'cast<mlir::ShapedType>' is deprecated: Use mlir::cast<U>() instead [-Werror,-Wdeprecated-declarations]
        VectorType.cast<mlir::ShapedType>(), llvm::ArrayRef<float>{2.0f, 3.0f});
                   ^
    /llvm-project/llvm/../mlir/include/mlir/IR/Types.h:112:5: note: 'cast<mlir::ShapedType>' has been explicitly marked deprecated here
      [[deprecated("Use mlir::cast<U>() instead")]]
        ^
    2 errors generated.
    DamonFool committed May 1, 2024
    63a2969
  6. [Windows] Restrict searchpath of dbghelp.dll to System32 (llvm#90520)

    LoadLibraryW will look up DLLs in user directories if its search path is
    left unrestricted. This is a security vulnerability, as one can give a
    shared library the same name as a system DLL in order to run
    arbitrary code when the shared library is loaded from a path in a user
    directory. This change modifies it to only search within System32 when
    loading dbghelp.dll.
    jofrn authored May 1, 2024
    ef1dbcd
  7. [flang][cuda] Update attribute compatibily check for unified matching…

    … rule (llvm#90679)
    
    This patch updates the compatibility checks for CUDA attributes in
    preparation to implement the matching rules described in section 3.2.3.
    With this patch the compiler will still emit an error when there are
    multiple specific procedures that match, since the matching distance
    is not yet implemented. This will be done in a separate patch.
    
    
    https://docs.nvidia.com/hpc-sdk/archive/24.3/compilers/cuda-fortran-prog-guide/index.html#cfref-var-attr-unified-data
    
    gpu=unified and gpu=managed are not part of this patch since these
    options are not recognized by flang yet.
    clementval authored May 1, 2024
    86e5d6f
  8. 8e9b1e9
  9. 306ae14
  10. merge main into amd-staging

    Change-Id: I4f3510230f3d590f3d875dc0cc78d816bce8bff8
    ronlieb committed May 1, 2024
    f784bda
  11. [Sema] Avoid an undesired pack expansion while transforming PackIndex…

    …ingType (llvm#90195)
    
    A pack indexing type can appear in a larger pack expansion, e.g.
    `Pack...[pack_of_indexes]...` so we need to temporarily disable
    substitution of pack elements.
    
    Besides, this patch also fixes an assertion failure in
    `PackIndexingExpr::classify`: dependent `PackIndexingExpr`s are always
    LValues and thus we don't need to consider their `IndexExpr`s.
    
    Fixes llvm#88925
    
    ---------
    
    Co-authored-by: cor3ntin <[email protected]>
    zyn0217 and cor3ntin authored May 1, 2024
    410d635
  12. 240592a
  13. 3e93086
  14. [flang][MLIR] Outline deallocation logic to omp.private ops (llvm#9…

    …0592)
    
    When delayed privatization is enabled, this PR emits the deallocation
    logic to the newly introduced `dealloc` region on `omp.private` ops.
    ergawy authored May 1, 2024
    0632cb3
  15. 93b9b7c
  16. [Pipelines][Coroutines] Tune coroutine passes only for ThinLTO pre-li…

    …nk pipeline (llvm#90690)
    
    Follow-up to llvm#90310: limit the tuning to the ThinLTO pre-link
    pipeline only, as coroutine passes are not in the MonoLTO backend.
    apolloww authored May 1, 2024
    bafc5f4
  17. [RemoveDIs] Fix SIGSEGV caused by splitBasicBlock (llvm#90312)

    See `llvm/unittests/IR/BasicBlockDbgInfoTest.cpp` for a test case.
    FLZ101 authored May 1, 2024
    0fb5037
  18. 3684a38
  19. [InstCombine] Canonicalize scalable GEPs to use llvm.vscale intrinsic (

    …llvm#90569)
    
    Canonicalize getelementptr instructions for scalable vector types into
    ptradd representation with an explicit llvm.vscale call. This
    representation has better support in BasicAA, which can reason about
    llvm.vscale, but not plain scalable GEPs.
    nikic authored May 1, 2024
    74aa1ab
  20. [RISCV] Convert vsetvli mir tests to use $noreg instead of implicit_d…

    …ef. NFC
    
    This matches what comes out of isel since
    a63bd7e. It also adds the undef flag to
    more closely match the output after regalloc, which will help with the test
    diffs in llvm#70549
    lukel97 committed May 1, 2024
    d392520
  21. Tweak BumpPtrAllocator to benefit the hot path (llvm#90571)

    This takes the form of three consecutive but related changes:
    - Mark the fast path of BumpPtrAllocator as likely-taken.
    - Move the slow path of BumpPtrAllocator to a separate function.
    - Mark the slow path of BumpPtrAllocator as noinline.
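
    The three steps can be sketched on a toy allocator. This is not
    `BumpPtrAllocator`: `TinyBumpAllocator` is a hypothetical class that
    ignores alignment, and it assumes a GCC- or Clang-compatible compiler for
    `__builtin_expect` and `__attribute__((noinline))`.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <memory>
    #include <vector>

    class TinyBumpAllocator {
      static constexpr std::size_t SlabSize = 4096;
      std::vector<std::unique_ptr<char[]>> Slabs;
      char *Cur = nullptr;
      char *End = nullptr;

      // Slow path: kept out of line so the fast path stays small.
      __attribute__((noinline)) void *allocateSlow(std::size_t Size) {
        std::size_t NewSize = SlabSize;
        if (Size > NewSize)
          NewSize = Size; // oversized request gets a slab of its own size
        Slabs.push_back(std::make_unique<char[]>(NewSize));
        Cur = Slabs.back().get();
        End = Cur + NewSize;
        void *Result = Cur;
        Cur += Size;
        return Result;
      }

    public:
      void *allocate(std::size_t Size) {
        // Fast path: the request fits in the current slab. Hinted as likely.
        if (__builtin_expect(
                Cur != nullptr && Size <= static_cast<std::size_t>(End - Cur),
                1)) {
          void *Result = Cur;
          Cur += Size;
          return Result;
        }
        return allocateSlow(Size);
      }

      std::size_t numSlabs() const { return Slabs.size(); }
    };

    // Usage sketch: two small requests share a slab; an oversized one adds
    // another slab through the slow path.
    std::size_t demoSlabCount() {
      TinyBumpAllocator A;
      char *P = static_cast<char *>(A.allocate(16));
      char *Q = static_cast<char *>(A.allocate(16));
      assert(Q == P + 16); // second request came from the fast path
      A.allocate(8000);    // larger than a slab, forces the slow path
      return A.numSlabs();
    }
    ```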
    
    Overall, this saves geomean 0.4% userspace instructions on CTMark -O3,
    and 0.98% on CTMark -O0 -g.
    
    
    http://llvm-compile-time-tracker.com/compare.php?from=e1622e189e8c0ef457bfac528f90a7a930d9aad2&to=9eb53a4ed3af4a55e769ae1dd22d034b63d046e3&stat=instructions%3Au
    resistor authored May 1, 2024
    cd46c2c
  22. 23f0f7b
  23. 14b66fe
  24. [lldb][Docs] Sort documented packets alphabetically (llvm#90584)

    For the platform and extension doc.
    
    Also add links in the extension doc to the GDB specs we're extending.
    DavidSpickett authored May 1, 2024
    0c42fa3
  25. [Modules] Process include files changes (llvm#90319)

    There were two diffs that introduced some options useful when you build
    modules externally and cannot rely on file modification time as the key
    for detecting input file changes:
    - [D67249](https://reviews.llvm.org/D67249) introduced the
    `-fmodules-validate-input-files-content` option, which allows the use of
    file content hash in addition to the modification time.
    - [D141632](https://reviews.llvm.org/D141632) propagated the use of
    `-fno-pch-timestamps` with Clang modules.
    
    There is a problem when the size of the input file (header) is not
    modified but the content is. In this case, Clang cannot detect the file
    change when the `-fno-pch-timestamps` option is used. The
    `-fmodules-validate-input-files-content` option should help, but there
    is an issue with its application: it's not applied when the modification
    time is stored as zero, which is the case for `-fno-pch-timestamps`.
    
    The issue can be fixed using the same trick that was applied during the
    processing of `ForceCheckCXX20ModulesInputFiles`:
    ```
      // When ForceCheckCXX20ModulesInputFiles and ValidateASTInputFilesContent
      // enabled, it is better to check the contents of the inputs. Since we can't
      // get correct modified time information for inputs from overriden inputs.
      if (HSOpts.ForceCheckCXX20ModulesInputFiles && ValidateASTInputFilesContent &&
          F.StandardCXXModule && FileChange.Kind == Change::None)
        FileChange = HasInputContentChanged(FileChange);
    ```
    The patch proposes a solution similar to the one presented above and
    includes a LIT test to verify it.
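    
    The case described above can be sketched in isolation. This is a simplified
    illustration, not Clang's code: `InputInfo`, `ChangeKind`, and
    `detectChange` are hypothetical names, and the sketch only models the rule
    that a stored modification time of zero carries no information, so the
    content hash has to decide.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    enum class ChangeKind { None, Size, ModTime, Content };
    
    // Hypothetical snapshot of an input file (header).
    struct InputInfo {
      std::uint64_t Size;
      std::int64_t ModTime;      // 0 means "not recorded"
      std::uint64_t ContentHash;
    };
    
    ChangeKind detectChange(const InputInfo &Stored, const InputInfo &Now,
                            bool ValidateContent) {
      if (Stored.Size != Now.Size)
        return ChangeKind::Size;
      // A recorded modification time can be compared directly.
      if (Stored.ModTime != 0)
        return Stored.ModTime != Now.ModTime ? ChangeKind::ModTime
                                             : ChangeKind::None;
      // No usable modification time: fall back to the content hash when
      // content validation is requested.
      if (ValidateContent && Stored.ContentHash != Now.ContentHash)
        return ChangeKind::Content;
      return ChangeKind::None;
    }
    ```
    
    Without the fallback, a same-size edit with a zero stored time would be
    reported as `ChangeKind::None`.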
    ivanmurashko authored May 1, 2024
    9a9cff1
  26. device-libs: Use ballot(true) instead of calling read_exec builtin

    The read_exec builtins are implemented with the ballot intrinsic anyway.
    In the wave32 case, these will optimize down to just use the low 32-bits.
    This converts a few uses, but others remain.
    
    Apparently you can just use exec_hi as a GPR in wave32 though, so I'm not sure
    we should be treating the raw exec read as assumed 0.
    
    Change-Id: Id5621bf31b0bb7fa27456938942138f3dea85a0a
    arsenm committed May 1, 2024
    1a62373
  27. [ORC] Switch ObjectLinkingLayer::Plugins to shared ownership, copy pi…

    …peline.
    
    Previously ObjectLinkingLayer held unique ownership of Plugins, and links
    always used the Layer's plugin list at each step. This can cause problems if
    plugins are added while links are in progress however, as the newly added
    plugin may receive only some of the callbacks for links that are already
    running.
    
    In this patch each link gets its own copy of the pipeline that remains
    consistent throughout the link's lifetime, and it is guaranteed that Plugin
    objects (now with shared ownership) will remain valid until the link completes.
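    
    The ownership change can be sketched as follows. This is a minimal
    illustration under stated assumptions, not the ORC implementation:
    `Plugin`, `ToyLayer`, and `ToyLink` are hypothetical types, and the
    locking a real multi-threaded layer would need is omitted.
    
    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>
    
    struct Plugin {
      std::string Name;
    };
    
    // Hypothetical layer holding plugins with shared ownership.
    class ToyLayer {
    public:
      void addPlugin(std::shared_ptr<Plugin> P) { Plugins.push_back(std::move(P)); }
      void clearPlugins() { Plugins.clear(); }
      // Each link takes its own copy of the current pipeline.
      std::vector<std::shared_ptr<Plugin>> snapshot() const { return Plugins; }
    
    private:
      std::vector<std::shared_ptr<Plugin>> Plugins;
    };
    
    // Hypothetical link that keeps a private, stable pipeline for its lifetime.
    class ToyLink {
    public:
      explicit ToyLink(const ToyLayer &L) : Pipeline(L.snapshot()) {}
      std::size_t numPlugins() const { return Pipeline.size(); }
      const Plugin &plugin(std::size_t I) const { return *Pipeline[I]; }
    
    private:
      std::vector<std::shared_ptr<Plugin>> Pipeline;
    };
    ```
    
    A link created before a plugin is added never sees that plugin, and the
    plugins it did capture stay alive even if the layer drops them.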
    
    Coding my way home: 9.80469S, 139.03167W
    lhames committed May 1, 2024
    7565b20
  28. 3a3bdd8
  29. [lldb][Docs] Various style improvements to the tutorial (llvm#90594)

    * Replace "we" with either "you" (when talking to the reader) or "lldb"
    (when talking about the project).
    * Refer to lldb as lldb not LLDB, to match what the user sees on
    the command line (I am going to come back later and put the proper name in places where it's talking about the projects themselves)
    * Remove a bunch of contractions, for example "won't". These don't (pun
    intended) seem like a big deal at first, but even I, as a native English
    speaker, find the text clearer with them expanded.
    * Use RST's plain text highlighting for keywords and command names.
    * Split some very long lines for easier editing in future.
    DavidSpickett authored May 1, 2024
    eb6097a
  30. 9bebf25
  31. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightnin…

    …g/ec/llvm-project into amd-staging
    searlmc1 committed May 1, 2024
    3f0e2e3
  32. [LLVM][SVE] Improve legalisation of fixed length get.active.lane.mask (

    …llvm#90213)
    
    We are effectively performing type and operation legalisation very early
    within the code generation flow. This results in worse code quality
    because the DAG is not in canonical form, which DAGCombiner corrects
    through the introduction of operations that are not legal.
    
    This patch splits and moves the code to where type and operation
    legalisation is typically implemented.
    paulwalker-arm authored May 1, 2024
    fdf206c
  33. [AMDGPU] Do not optimize away pre-existing waitcnt instructions at -O0 (

    llvm#90716)
    
    The autogenerated memory legalizer tests use -O0 so this allows us to
    see the exact waitcnts that were inserted by the memory legalizer
    without them being optimized away.
    jayfoad authored May 1, 2024
    0b21b25
  34. 582c6a8
  35. [AMDGPU] Enhance s_waitcnt insertion before barrier for gfx12 (llvm#9…

    …0595)
    
    Code to determine if a waitcnt is required before a barrier instruction only
    considered S_BARRIER. gfx12 adds barrier_signal/wait, so we need to enhance
    the existing code to look for a barrier start (which is just an S_BARRIER
    for earlier architectures).
    dstutt authored May 1, 2024
    5fb1e28
  36. [AMDGPU] Fix image_msaa_load waitcnt insertion for pre-gfx12 (llvm#90710

    )
    
    llvm#90201 made some fixes for gfx12 image_msaa_load waitcnt insertion.
    That fix might break in some situations for pre-gfx12. This fixes that by
    explicitly checking for VSAMPLE, which always requires an s_wait_samplecnt,
    and leaves the previous logic intact for non-gfx12.
    dstutt authored May 1, 2024
    f898161
  37. [AArch64] NFC: Add RUN lines for streaming-compatible code. (llvm#90617)

    The intent is to test lowering of vector operations by scalarization,
    for functions that are streaming-compatible (and thus cannot use NEON)
    and also don't have the +sve attribute.
    
    The generated code is clearly wrong at the moment, but a series of
    patches will follow to fix up all cases to use scalar instructions.
    
    A bit of context:
    
    This work will form the base to decouple SME from SVE later on, as it
    will make sure that no NEON instructions are used in
    streaming[-compatible] mode. Later this will be followed by a patch that
    changes `useSVEForFixedLengthVectors` to only return `true` if SVE is
    available for the given runtime mode, at which point I'll change the
    `-mattr=+sme -force-streaming-compatible-sve` to `-mattr=+sme
    -force-streaming-sve` in the RUN lines, so that the tests are considered
    to be executed in Streaming-SVE mode.
    sdesmalen-arm authored May 1, 2024
    ccb198d
  38. [llvm] Revive constructor of 'ResourceSegments'

    582c6a8 removed a constructor of
    'ResourceSegments' that is needed in LLVM unit tests.
    
    * Revert 582c6a8
    * Update the constructor to take a const reference of
      `std::list` as pointed out in llvm#89193.
    JOE1994 committed May 1, 2024
    803e03f
  39. [SLP]Transform stores + reverse to strided stores with stride -1, if …

    …profitable.
    
    Adds transformation of consecutive vector store + reverse to strided
    stores with stride -1, if it is profitable
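    
    The equivalence behind this transformation can be shown on scalars. The
    sketch below is illustrative only: `storeReversed` and
    `storeStridedMinusOne` are hypothetical helpers with a fixed width of 4,
    and the profitability check the patch performs is not modelled.
    
    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>
    
    constexpr std::size_t N = 4;
    
    // Form 1: reverse the vector, then store it to consecutive addresses.
    void storeReversed(const std::array<int, N> &V, int *Dst) {
      std::array<int, N> R{};
      for (std::size_t I = 0; I < N; ++I)
        R[I] = V[N - 1 - I];
      for (std::size_t I = 0; I < N; ++I)
        Dst[I] = R[I];
    }
    
    // Form 2: store the original vector with stride -1, starting at the
    // address of the last destination element.
    void storeStridedMinusOne(const std::array<int, N> &V, int *Dst) {
      int *Base = Dst + (N - 1);
      for (std::ptrdiff_t I = 0; I < static_cast<std::ptrdiff_t>(N); ++I)
        Base[-I] = V[I];
    }
    ```
    
    Both forms leave the same values in memory; the second one needs no
    separate reverse shuffle.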
    
    Reviewers: RKSimon, preames
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#90464
    alexey-bataev authored May 1, 2024
    67e726a
  40. [SLP]Improve reordering for consts, splats and ops from same nodes + …

    …improved analysis.
    
    Improved detection of const/splat candidates, their matching and analysis of instructions from same nodes.
    
    Metric: size..text
    
    Program                                                                    size..text
                                                                                   results     results0   diff
    test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test     92952.00     93096.00   0.2%
    test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test            779832.00    780136.00   0.0%
    test-suite :: MultiSource/Applications/JM/lencod/lencod.test                 839923.00    840179.00   0.0%
    test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test                 392708.00    392740.00   0.0%
    test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test      1171131.00   1171147.00   0.0%
    
    test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test    1391089.00   1391073.00  -0.0%
    test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test   1391089.00   1391073.00  -0.0%
    test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test   12352780.00  12352636.00  -0.0%
    
    MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE - small reordering
    External/SPEC/CINT2006/464.h264ref/464.h264ref - slightly better code after reordering
    MultiSource/Applications/JM/lencod/lencod - smaller code with fewer shuffles
    MultiSource/Applications/JM/ldecod/ldecod - same
    External/SPEC/CFP2017rate/511.povray_r/511.povray_r - 2 extra loads vectorized, smaller code
    External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r - better code, size increased because of more constant vectors
    External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s - same
    External/SPEC/CFP2017rate/526.blender_r/526.blender_r - small change in the vectorized code, some code a bit better, some a bit worse
    
    Reviewers: RKSimon
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#87091
    alexey-bataev authored May 1, 2024
    576261a
  41. 442990b
  42. [z/OS] add support for z/OS system headers to clang std header wrappe…

    …rs (llvm#89995)
    
    Update the wrappers for the C std headers so that they always forward to
    the z/OS system headers.
    perry-ca authored May 1, 2024
    df241b1
  43. Constant Fold logf128 calls

    This is a second attempt to land llvm#84501 which failed on several targets.
    
    This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
    typedef'ing float128 more precise by checking whether __uint128_t is available
    and checking that the host does not use __ibm128, which is prevalent on PowerPC
    targets and replaces IEEE754 float128s.
    MDevereau committed May 1, 2024
    088aa81
  44. [Flang][OpenMP] Handle more character allocatable cases in privatizat…

    …ion (llvm#90449)
    
    Fixes llvm#84732, llvm#81947, llvm#81946
    
    Note: This is a fix till we enable delayed privatization.
    kiranchandramohan authored May 1, 2024
    57d0d3b
  45. [gn] port 088aa81 (LLVM_HAS_LOGF128)

    If we want to turn this on on some platforms, we'll also want to
    define HAS_LOGF128 for AnalysisTest, see
    llvm/unittests/Analysis/CMakeLists.txt
    nico committed May 1, 2024
    68b863b
  46. [SystemZ][z/OS] Build in ASCII 64 bit mode on z/OS (llvm#90630)

    Set the correct build flags on z/OS to build LLVM as a 64-bit ASCII
    application.
    fanbo-meng authored May 1, 2024
    034912d
  47. Revert "Constant Fold logf128 calls"

    This reverts commit 088aa81.
    MDevereau committed May 1, 2024
    efce8a0
  48. Revert "[gn] port 088aa81 (LLVM_HAS_LOGF128)"

    This reverts commit 68b863b.
    088aa81 was reverted in efce8a0.
    nico committed May 1, 2024
    9ebf2f8
  49. [gn build] Port df241b1

    llvmgnsyncbot committed May 1, 2024
    0647b2a
  50. [Offload] Fix CMake detection when it is not found (llvm#90729)

    Summary:
    This variable could be unset if not found or when building standalone.
    We should check for that and set it to true or false.
    
    Fixes: llvm#90708
    jhuber6 authored May 1, 2024
    e312f07
  51. [libcxx][ci] In picolib build, ask clang for the normalised triple (l…

    …lvm#90722)
    
    This is needed for a workaround to make sure the link later succeeds. I
    don't know the reason for that but it is definitely needed.
    
    llvm#89234 will/wants to correct
    the triple normalisation for -none- and this means that clang prior to
    19, and clang 19 and above will have different answers and therefore
    different library paths.
    
    I don't want to bootstrap a clang just for libcxx CI, or require that
    anyone building for Arm do the same, so ask the compiler what the triple
    should be.
    
    This will be compatible with 17 and 19 when we do update to that
    version.
    
    I'm assuming $CC is what anyone locally would set to override the
    compiler, and `cc` is the binary name in our CI containers. It's not
    perfect but it should cover most use cases.
    DavidSpickett authored May 1, 2024
    167b506
  52. [AArch64][TargetParser] autogen ArchExtKind enum (llvm#90314)

    Re-land 61b2a0e. Some Windows builds
    were failing because AArch64TargetParserDef.inc is a generated header
    which is included transitively into some clang components, but this
    information is not available to the build system and therefore there is
    a missing edge in the dependency graph. This patch incorporates the
    fixes described in ac1ffd3/D142403.
    
    Thanks to ExtensionSet::toLLVMFeatureList, all values of ArchExtKind
    should correspond to a particular -target-feature. The valid values of
    -target-feature are in turn defined by SubtargetFeature defs.
    
    Therefore we can generate ArchExtKind from the tablegen data. This is
    done by adding an Extension class which derives from SubtargetFeature.
    
    Because the Has* FieldNames do not always correspond to the AEK_
    names ("extensions", as defined in TargetParser), and AEK_ names do
    not always correspond to -march strings, some additional enum entries
    have been added to remap the names. I have renamed these to make the
    naming consistent, but split them into a separate PR to keep the diff
    reasonable (llvm#90320)
    tmatheson-arm committed May 1, 2024
    cfca977
  53. [lldb] Teach LocateExecutableSymbolFile to look into LOCALBASE on Fre…

    …eBSD (llvm#81355)
    
    FreeBSD ports will now install debuginfo under $LOCALBASE/lib/debug/, where $LOCALBASE is typically /usr/local. On FreeBSD search this path in addition to existing debug info paths.
    
    Relevant change on the FreeBSD side: https://reviews.freebsd.org/D43515
    arrowd authored May 1, 2024
    f07a2ed
  54. [CUDA] make kernel stub ICF-proof (llvm#90155)

    The MSVC linker merges comdat functions that have an identical set of
    instructions. CUDA uses the kernel stub function as the key to look up kernels
    in device executables. If the kernel stub functions for different kernels are
    merged by ICF, incorrect kernels will be launched.
    
    To prevent ICF from merging kernel stub functions, a unique global
    variable is created for each kernel stub function having comdat and a
    store is added to the kernel stub function. This makes the set of
    instructions in each kernel stub function unique.
    
    Fixes: llvm#88883
    yxsamliu authored May 1, 2024
    be5075a
  55. [OpenMP][TR12] change property of map-type modifier. (llvm#90499)

    The map-type modifier property changes to "default" instead of "ultimate" (as in OpenMP 5.2).
    
    This change allows map-type to be placed at any location within the map
    modifiers, not only at the last location in the modifiers list; map-type
    can also be omitted.
    jyu2-git authored May 1, 2024
    f050660
  56. [UndefOrPoison] [CompileTime] Avoid IDom walk unless required. NFC (l…

    …lvm#90092)
    
    If the value is not boolean and we are checking for `Undef` or
    `UndefOrPoison`, we can avoid the potentially expensive IDom walk.
        
    This should improve compile time for isGuaranteedNotToBeUndefOrPoison
    and isGuaranteedNotToBeUndef.
    annamthomas authored May 1, 2024
    78270cb
  57. [z/OS] treat text files as text files so auto-conversion is done (llv…

    …m#90128)
    
    To support auto-conversion on z/OS, text files need to be opened as text files. These changes will fix a number of LIT failures due to text files not being converted to the internal code page.
    
    * Update a number of tools so they open the text files as text files.
    * Add support in cat.py to open a text file as a text file (Windows will continue to treat all files as binary so new lines are handled correctly).
    * Add env var definitions to enable auto-conversion in the lit config file.
    perry-ca authored May 1, 2024
    e22ce61
  58. e83c6dd
  59. 39e24bd
  60. 0606747
  61. [mlir][ArmSME] Add a tests showing liveness issues in the tile alloca…

    …tor (llvm#90447)
    
    This test shows a few cases (not at all complete) where the current
    ArmSME tile allocator produces incorrect results. The plan is to resolve
    these issues with a future tile allocator that uses liveness
    information.
    MacDue authored May 1, 2024
    9226688
  62. [AMDGPU] change order of fp and sp in kernel prologue (llvm#90626)

    Change the order of fp and sp in the kernel prologue, and update the related
    codegen tests, to make it easier to merge code into our downstream branches.
    
    Signed-off-by: gangc <[email protected]>
    cmc-rep authored May 1, 2024
    167427f
  63. merge main into amd-staging

      - Revert 8009bbe
        - Revert "Reapply "[Clang][Sema] Diagnose class member access expressions
          naming non-existent members of the current instantiation prior to
          instantiation in the absence of dependent base classes (llvm#84050)"
          (llvm#90152)"
        - Breaks composable kernels and rocthrust builds
      - Revert 41f9c78
        - Revert "[OpenACC] Fix test failure from fa67986"
        - Fixes some issues in 8009bbe, so depends on it
      - Cherry-pick 803e03f from trunk
        - Fixes unit test failures introduced in trunk earlier
    
    Change-Id: I718574c8a26745a52845d0b5a914ed00db611956
    kzhuravl committed May 1, 2024
    c1c0371
  64. [RemoveDIs] Load into new debug info format by default in LLVM (llvm#…

    …89799)
    
    This patch enables parsing and creating modules directly into the new
    debug info format. Prior to this patch, all modules were constructed
    with the old debug info format by default, and would be converted into
    the new format just before running LLVM passes. This is an important
    milestone, in that this means that every tool will now be exposed to
    debug records, rather than only those that run LLVM passes. As far as I've
    tested, all LLVM tools/projects now either handle debug records, or
    convert them to the old intrinsic format.
    
    There are a few unit tests that need updating for this patch; these are
    either cases of tests that previously needed to set the debug info
    format to function, or tests that depend on the old debug info format in
    some way. There should be no visible change in the output of any LLVM
    tool as a result of this patch, although the likelihood of this patch
    breaking downstream code means an NFC tag might be a little misleading,
    if not technically incorrect:
    
    This will probably break some downstream tools that don't already handle
    debug records. If your downstream code breaks as a result of this
    change, the simplest fix is to convert the module in question to the old
    debug format before you process it, using
    `Module::convertFromNewDbgValues()`. For more information about how to
    handle debug records or about what has changed, see the migration
    document:
      https://llvm.org/docs/RemoveDIsDebugInfo.html
    SLTozer authored May 1, 2024
    2f01fd9
  65. 00821fe
  66. [llvm-install-name-tool] Error on non-Mach-O binaries (llvm#90351)

    Previously if you passed an ELF binary it would be silently copied with no changes.
    keith authored May 1, 2024
    fa53545
  67. 6e31714
  68. [LLDB][ELF] Fix section unification to not just use names. (llvm#90099)

    Section unification cannot just use names, because it's valid for ELF
    binaries to have multiple sections with the same name. We should check
    other section properties too.
    
    Fixes llvm#88001.
    
    rdar://124467787
    al45tair authored May 1, 2024
    4cbe760
  69. [libc++] Remove _LIBCPP_DISABLE_ADDITIONAL_DIAGNOSTICS (llvm#90512)

    I strongly suspect nobody ever used that macro since it wasn't very well
    known. Furthermore, it only affects a handful of diagnostics and I think
    it makes sense to either provide them unconditionally, or to not
    provide them at all.
    ldionne authored May 1, 2024
    a00bbcb
  70. [mlir][Vector] Add patterns for efficient unsigned i4 -> i8 conversio…

    …n emulation (llvm#89131)
    
    This PR builds on llvm#79494 with an additional path for efficient unsigned `i4 ->i8` type extension for 1D/2D operations. This will impact any i4 -> i8/i16/i32/i64 unsigned extensions as well as sitofp i4 -> f8/f16/f32/f64.
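    
    The idea of an unsigned `i4 -> i8` extension can be sketched on packed
    bytes. This is an illustration only, not the MLIR rewrite pattern:
    `extendU4ToU8` is a hypothetical helper, and storing the first element in
    the low nibble is an assumption of the sketch.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <vector>
    
    // Zero-extend packed unsigned 4-bit values to 8 bits each.
    // Assumption: the low nibble of each byte holds the earlier element.
    std::vector<std::uint8_t> extendU4ToU8(const std::vector<std::uint8_t> &Packed) {
      std::vector<std::uint8_t> Out;
      Out.reserve(Packed.size() * 2);
      for (std::uint8_t Byte : Packed) {
        Out.push_back(static_cast<std::uint8_t>(Byte & 0x0F)); // mask low nibble
        Out.push_back(static_cast<std::uint8_t>(Byte >> 4));   // logical shift
      }
      return Out;
    }
    ```
    
    Because the values are unsigned, a mask and a logical shift are enough; no
    sign-extending shift pair is required.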
    KoolJBlack authored May 1, 2024
    6dfaecf
  71. [DirectX backend] generate ISG1, OSG1 part for compute shader (llvm#9…

    …0508)
    
    Empty ISG1 and OSG1 parts are generated for compute shader since there's
    no signature for compute shader.
    
    Fixes llvm#88778
    python3kgae authored May 1, 2024
    a764f49
  72. [NFC][libc++] Fixes comment indention.

    The output on eel.is has similar oddities, so I expect this was copy
    pasted.
    mordante committed May 1, 2024
    754072e
  73. [clang][modules] Allow including module maps to be non-affecting (llv…

    …m#89992)
    
    The dependency scanner only puts top-level affecting module map files on
    the command line for explicitly building a module. This is done because
    any affecting child module map files should be referenced by the
    top-level one, meaning listing them explicitly does not have any meaning
    and only makes the command lines longer.
    
    However, a problem arises whenever the definition of an affecting module
    lives in a module map that is not top-level. Considering the rules
    explained above, such module map file would not make it to the command
    line. That's why 83973cf started
    marking the parents of an affecting module map file as affecting too.
    This way, the top-level file does make it into the command line.
    
    This can be problematic, though. On macOS, for example, the Darwin
    module lives in "/usr/include/Darwin.modulemap", one of many module map
    files included by "/usr/include/module.modulemap". Reporting the parent
    on the command line forces explicit builds to parse all the other module
    map files included by it, which is not necessary and can get expensive
    in terms of file system traffic.
    
    This patch solves that performance issue by stopping marking parent
    module map files as affecting, and marking module map files as top-level
    whenever they are top-level among the set of affecting files, not among
    the set of all known files. This means that the top-level
    "/usr/include/module.modulemap" is now not marked as affecting and
    "/usr/include/Darwin.modulemap" is.
    jansvoboda11 authored May 1, 2024
    477c705
  74. 987c036
  75. 6c369cf
  76. [MIR] Serialize MachineFrameInfo::isCalleeSavedInfoValid() (llvm#90561)

    In the case of functions without a stack frame, no "stack" field is
    serialized into MIR, which leads to isCalleeSavedInfoValid being false
    when reading a MIR file back in. To fix this, we should serialize
    MachineFrameInfo::isCalleeSavedInfoValid() into MIR.
    dtellenbach authored May 1, 2024
    cf2f32c
  77. [NVPTX] Fix 64 bits rotations with large shift values (llvm#89399)

    ROTL and ROTR can take a shift amount larger than the element size, in
    which case the effective shift amount should be the shift amount modulo
    the element size.
    
    This patch adds the modulo step when the shift amount isn't known at
    compile time. Without it the existing implementation would end up
    shifting beyond the type size and give incorrect results.
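    
    The modulo step can be shown with plain 64-bit integers. This is a generic
    sketch of rotate semantics, not the NVPTX lowering itself; `rotl64` and
    `rotr64` are hypothetical helper names.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // Rotate left; the amount is reduced modulo the element size (64) first.
    std::uint64_t rotl64(std::uint64_t X, std::uint64_t Amt) {
      unsigned S = static_cast<unsigned>(Amt % 64);
      if (S == 0)
        return X; // avoids shifting by 64, which is undefined in C++
      return (X << S) | (X >> (64 - S));
    }
    
    // Rotate right with the same reduction.
    std::uint64_t rotr64(std::uint64_t X, std::uint64_t Amt) {
      unsigned S = static_cast<unsigned>(Amt % 64);
      if (S == 0)
        return X;
      return (X >> S) | (X << (64 - S));
    }
    ```
    
    With the reduction in place, rotating by 65 behaves like rotating by 1.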
    npmiller authored May 1, 2024
    7396ab1
  78. [RISCV] Refactor profile selection in RISCVISAInfo::parseArchString. (l…

    …lvm#90700)
    
    Instead of hardcoding the 4 current profile prefixes, treat profile
    selection as a fallback if we don't find "rv32" or "rv64".
    
    Update the error message accordingly.
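    
    The fallback order can be sketched as below. This is a simplified
    illustration, not `RISCVISAInfo::parseArchString`: `expandArch`, the
    `Profile` struct, and the two-entry table are hypothetical, and the table
    is only an illustrative subset of profile names.
    
    ```cpp
    #include <cassert>
    #include <string>
    
    // Illustrative subset of profiles and the base arch string each expands to.
    struct Profile {
      const char *Name;
      const char *Arch;
    };
    static const Profile Profiles[] = {{"rvi20u32", "rv32i"}, {"rvi20u64", "rv64i"}};
    
    static bool startsWith(const std::string &S, const std::string &Prefix) {
      return S.compare(0, Prefix.size(), Prefix) == 0;
    }
    
    // Returns the arch string to parse, or "" when nothing matches.
    std::string expandArch(const std::string &Arch) {
      // Primary case: an explicit rv32/rv64 arch string is used as-is.
      if (startsWith(Arch, "rv32") || startsWith(Arch, "rv64"))
        return Arch;
      // Fallback: treat the prefix as a profile name and expand it.
      for (const Profile &P : Profiles)
        if (startsWith(Arch, P.Name))
          return std::string(P.Arch) + Arch.substr(std::string(P.Name).size());
      return "";
    }
    ```
    
    Adding a profile then only means adding a table entry; the prefix check
    for "rv32" and "rv64" does not change.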
    topperc authored May 1, 2024
    09f4b06
  79. [RISCV] Merge RISCVISAInfo::updateFLen/MinVLen/MaxELen into a single …

    …function. (llvm#90665)
    
    This simplifies the callers.
    topperc authored May 1, 2024
    cf3c714
  80. Reapply "Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries" (llvm#90610) (llvm#90692)
    
    This reverts commit 2aabfc8.
    
    Add fixes to LLD and Gold tests missed in original change.
    
    Co-authored-by: Jan Voung <[email protected]>
    jvoung and jvoung authored May 1, 2024
    28869a7
  81. [flang] always run PolymorphicOpConversion sequentially (llvm#90721)

    It was pointed out in post-commit review of llvm#90597 that the pass
    should never have been run in parallel over all functions (and now
    other top-level operations) in the first place. The mutex used in the
    pass was ineffective at preventing races because each instance of the
    pass had its own mutex.
    tblah authored May 1, 2024
    d1b3648
  82. [libc] Implement fcntl() function (llvm#89507)

    Fixes llvm#84968. 
    
    Implements the `fcntl()` function defined in the `fcntl.h` header.
    vinayakdsci authored May 1, 2024
    aca5117
  83. [alpha.webkit.UncountedCallArgsChecker] Support more trivial expressions. (llvm#90414)
    
    Treat a compound operator such as |=, array subscription, sizeof, and
    non-type template parameter as trivial so long as subexpressions are
    also trivial.
    
    Also treat true/false boolean literal as trivial.
    rniwa authored May 1, 2024
    1ca6005
  84. [ELF] Catch zlib deflateInit2 error

    The function may return Z_MEM_ERROR or Z_STREAM_ERROR. The former does not
    have a good way of testing. The latter will be possible with a pending
    change that allows setting the compression level, which will come with a
    test.
    MaskRay committed May 1, 2024
    91fef00
  85. [ELF] Adjust --compress-sections to support compression level

    zstd excels at scaling from low-ratio-very-fast to
    high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
    speed, while others focus on achieving the highest compression ratio
    possible, similar to traditional high-ratio codecs like LZMA.
    
    Add an optional `level` to `--compress-sections` (llvm#84855) to cater to
    these diverse needs. While we initially aimed for a one-size-fits-all
    approach, this no longer seems to work.
    (https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
    
    When --compress-debug-sections is used together, make
    --compress-sections take precedence since --compress-sections is usually
    more specific.
    
    Remove the level distinction between -O/-O1 and -O2 for
    --compress-debug-sections=zlib for a more consistent user experience.
    
    Pull Request: llvm#90567
    MaskRay authored May 1, 2024
    6d44a1e
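To make the optional `level` in the `--compress-sections` commit above concrete, here is a small parsing sketch. It assumes the option value has the shape `<section-glob>=<kind>[:<level>]` with kinds `none`, `zlib`, and `zstd`; that shape, the `CompressSpec` struct, and the `parseCompressSpec` function are assumptions made for illustration and are not taken from the lld sources.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical parsed form of one option value.
struct CompressSpec {
  std::string glob;         // section name pattern, e.g. ".debug_*"
  std::string kind;         // "none", "zlib" or "zstd" (assumed set)
  std::optional<int> level; // empty when no ":level" suffix was given
};

// Parse "<glob>=<kind>[:<level>]" (assumed syntax). Returns std::nullopt on
// malformed input so the caller can report an error.
std::optional<CompressSpec> parseCompressSpec(const std::string &arg) {
  const auto eq = arg.rfind('=');
  if (eq == std::string::npos || eq == 0)
    return std::nullopt;
  CompressSpec spec;
  spec.glob = arg.substr(0, eq);
  std::string rest = arg.substr(eq + 1);
  const auto colon = rest.find(':');
  if (colon != std::string::npos) {
    const std::string levelText = rest.substr(colon + 1);
    if (levelText.empty())
      return std::nullopt;
    char *end = nullptr;
    const long value = std::strtol(levelText.c_str(), &end, 10);
    if (*end != '\0')
      return std::nullopt; // trailing non-numeric characters
    spec.level = static_cast<int>(value);
    rest.erase(colon);
  }
  if (rest != "none" && rest != "zlib" && rest != "zstd")
    return std::nullopt;
  spec.kind = rest;
  return spec;
}
```

Keeping the level optional lets a caller fall back to a default when the user gives only the compression kind.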
  86. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
    searlmc1 committed May 1, 2024
    c93f480
  87. merge main into amd-staging

    Change-Id: I4968e32ce2fcf8592f4ab65f9b2eb89b5fbb67dc
    Jenkins committed May 1, 2024
    39fea68
  88. Minor cleanups; replace amd-stg-open with amd-staging

    Change-Id: I8d57fc9053f1ee71230ac48337f73b474581188f
    searlmc1 committed May 1, 2024
    e85d0d4
  89. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
    searlmc1 committed May 1, 2024
    fc06b37

Commits on May 2, 2024

  1. 59a2734
  2. Merge branch 'amd-staging' of ssh://gerrit-git.amd.com:29418/lightning/ec/llvm-project into amd-staging
    searlmc1 committed May 2, 2024
    c013d6b
  3. Allow link to llvm shared library for current distros

    Signed-off-by: "Yiyang Wu <[email protected]>"
    littlewu2508 authored and LiXueying0309 committed May 2, 2024
    7311c1b