[AutoBump] Merge with 5ec73b7d (Aug 21) (8) #361

Open: 128 commits to be merged into base branch bump_to_e47b5075

Commits on Aug 20, 2024

  1. [CycleAnalysis] Methods to verify cycles and their nesting. (llvm#102300)

    The original implementation provided a simple method to check whether
    the forest of nested cycles is well-formed. This is now augmented with
    other methods to check well-formedness of all cycles, either
    individually, or as the entire forest. These will be used by future
    transforms that modify CycleInfo.
    ssahasra authored Aug 20, 2024
    b432afc
  2. [BasicAA] Use nuw attribute of GEPs (llvm#98608)

    Use the nuw attribute of GEPs to prove that pointers do not alias, in
    cases matching the following:
    
       +                +                     +
       | BaseOffset     |   +<nuw> Indices    |
       ---------------->|-------------------->|
       |-->V2Size       |                     |-------> V1Size
      LHS                                    RHS
    
    If the difference between pointers is Offset +<nuw> Indices then we know
    that the addition does not wrap the pointer index type (add nuw) and the
    constant Offset is a lower bound on the distance between the pointers. We
    can then prove NoAlias via Offset u>= V2Size.
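
    The condition above can be sketched as a one-line check. This is only an
    illustration under stated assumptions: `provesNoAlias` is a hypothetical
    helper, not BasicAA's actual interface, and both values are treated as
    unsigned byte counts in the pointer index type.

    ```cpp
    #include <cstdint>

    // Hypothetical helper (not part of BasicAA).
    // Assumes RHS = LHS + offset +<nuw> indices, where indices is an unknown
    // unsigned amount. Because the addition cannot wrap, `offset` is a lower
    // bound on the distance from LHS to RHS, so an access of `v2Size` bytes
    // starting at LHS cannot reach RHS when offset u>= v2Size.
    constexpr bool provesNoAlias(std::uint64_t offset, std::uint64_t v2Size) {
      return offset >= v2Size;
    }
    ```

    For instance, a 16-byte constant offset rules out overlap with an 8-byte
    access at LHS, while a 4-byte offset does not.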
    hazzlim authored Aug 20, 2024
    ba84cfb
  3. [Flang][OpenMP] Prevent re-composition of composite constructs (llvm#…

    …102613)
    
    After decomposition of OpenMP compound constructs and assignment of
    applicable clauses to each leaf construct, composite constructs are then
    combined again into a single element in the construct queue. This helped
    later lowering stages easily identify composite constructs.
    
    However, as a result of the re-composition stage, the same list of
    clauses is used to produce all MLIR operations corresponding to each
    leaf of the original composite construct. This undoes existing logic
    introducing implicit clauses and deciding to which leaf construct(s)
    each clause applies.
    
    This patch removes construct re-composition logic and updates Flang
    lowering to be able to identify composite constructs from a list of leaf
    constructs. As a result, the right set of clauses is produced for each
    operation representing a leaf of a composite construct.
    
    PR stack:
    - llvm#102612
    - llvm#102613
    skatrak authored Aug 20, 2024
    aa875cf
  4. [MLIR][DLTI] Introduce DLTIQueryInterface and impl for DLTI attrs (ll…

    …vm#104595)
    
    This new interface is supposed to capture the core functionality of
    DLTI: querying for values at keys. As such this new interface unifies
    the ability to query DLTI attributes in a single method: query(). All
    existing DLTI interfaces exposing their own query methods now 1)
    extend this new interface and 2) provide a default implementation for
    `query()`.
    
    As DLTIQueryInterface::query() returns an attribute, it naturally
    enables recursive queries on nested DLTI attrs. A utility function,
    `dlti::query()`, implements the logic for nested lookups.
    
    A new `#dlti.map` attribute is introduced to capture the most generic
    form of a finite DLTI-mapping. One of the benefits is that it allows for
    more easily encoding hierarchical information that is suitably queryable,
    i.e. by means of nested attributes.
    
    In line with the above, `transform.dlti.query` is modified so as to take
    an arbitrary number of keys and to perform a nested lookup using the
    above utility function.
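
    The idea of a nested lookup over a list of keys can be sketched in plain
    C++. This is a minimal model, not the MLIR implementation: `Attr`,
    `AttrMap` and `nestedQuery` are hypothetical names, and attribute values
    are reduced to either an integer leaf or a nested string-keyed map.

    ```cpp
    #include <map>
    #include <memory>
    #include <optional>
    #include <string>
    #include <variant>
    #include <vector>

    // Hypothetical stand-ins for DLTI attributes: a value is either an
    // integer leaf or a nested map from string keys to further values.
    struct Attr;
    using AttrMap = std::map<std::string, std::shared_ptr<Attr>>;
    struct Attr {
      std::variant<int, AttrMap> value;
    };

    // Walk the keys from left to right, descending one map level per key.
    // Returns nullopt if a key is missing, if a non-map value is reached
    // before the keys are exhausted, or if the final value is not a leaf.
    std::optional<int> nestedQuery(const Attr &root,
                                   const std::vector<std::string> &keys) {
      const Attr *current = &root;
      for (const std::string &key : keys) {
        const AttrMap *map = std::get_if<AttrMap>(&current->value);
        if (!map)
          return std::nullopt;
        auto it = map->find(key);
        if (it == map->end())
          return std::nullopt;
        current = it->second.get();
      }
      if (const int *leaf = std::get_if<int>(&current->value))
        return *leaf;
      return std::nullopt;
    }
    ```

    With a map `{"CPU": {"L1_cache_size": 32768}}` (key names invented for
    the example), querying the keys `CPU`, `L1_cache_size` yields the leaf,
    while an unknown key at any level yields an empty result.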
    rolfmorel authored Aug 20, 2024
    34a88bb
  5. [clang][modules] Built-in modules are not correctly enabled for Mac C…

    …atalyst (llvm#104872)
    
    Mac Catalyst is the iOS platform, but it builds against the macOS SDK
    and so it needs to be checking the macOS SDK version instead of the iOS
    one. Add tests against a greater-than SDK version just to make sure this
    works beyond the initially supporting SDKs.
    ian-twilightcoder authored Aug 20, 2024
    b986438
  6. 42067f2
  7. Revert "[CycleAnalysis] Methods to verify cycles and their nesting. (l…

    …lvm#102300)"
    
    This reverts commit b432afc.
    
    Reverted due to linker failures in expensive-checks.
    ssahasra committed Aug 20, 2024
    4aacc60
  8. c99347a
  9. [SimplifyCFG] Add support for hoisting commutative instructions (llvm…

    …#104805)
    
    This extends SimplifyCFG hoisting to also hoist instructions with
    commuted operands, for example a+b on one side and b+a on the other
    side.
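
    The matching rule can be sketched as follows. This is a simplified
    model rather than the SimplifyCFG code: `Inst`, `Opcode` and
    `isIdenticalOrCommuted` are hypothetical, and operands are plain
    integer value ids.

    ```cpp
    // Hypothetical, simplified instruction model for illustration only.
    enum class Opcode { Add, Sub, Mul };

    struct Inst {
      Opcode opcode;
      int lhs; // value id of the first operand
      int rhs; // value id of the second operand
    };

    constexpr bool isCommutative(Opcode op) {
      return op == Opcode::Add || op == Opcode::Mul;
    }

    // Two instructions are candidates for hoisting if they are identical,
    // or if the opcode is commutative and the operands are swapped.
    constexpr bool isIdenticalOrCommuted(const Inst &a, const Inst &b) {
      return a.opcode == b.opcode &&
             ((a.lhs == b.lhs && a.rhs == b.rhs) ||
              (isCommutative(a.opcode) && a.lhs == b.rhs && a.rhs == b.lhs));
    }
    ```

    Under this model a+b and b+a match, whereas a-b and b-a do not.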
    
    This should address the issue mentioned in:
    llvm#91185 (comment)
    nikic authored Aug 20, 2024
    b3fa45b
  10. 3b49d27
  11. [X86] Use correct fp immediate types in _mm_set_ss/sd

    Avoids implicit sint_to_fp which wasn't occurring on strict fp codegen
    
    Fixes llvm#104848
    RKSimon committed Aug 20, 2024
    6dcce42
  12. [gn build] Port 42067f2

    llvmgnsyncbot committed Aug 20, 2024
    21de049
  13. [ScheduleDAG] Dirty height/depth in addPred/removePred even for laten…

    …cy zero (llvm#102915)
    
    A long time ago (back in 2009) there was a commit 52d4d82
    that changed the scheduler to not dirty height/depth when adding or
    removing SUnit predecessors when the latency on the edge was zero. That
    commit message claims that the depth or height isn't affected when
    the latency is zero.
    
    As a matter of fact, the depth/height can change even with a zero
    latency on the edge. For example, when a new SUnit A is added as a
    predecessor of SUnit B over a zero-latency edge, both the height of A
    and the depth of B should be marked as dirty. If B has a greater
    height than A, then the height of A needs to be adjusted even if the
    latency is zero.
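
    A small model makes the point concrete. It assumes the usual
    definition of height as the longest latency-weighted path to an exit
    node; `Edge` and `height` are hypothetical names and nothing here is
    taken from the ScheduleDAG sources.

    ```cpp
    #include <algorithm>
    #include <vector>

    // Hypothetical successor edge: target node index plus edge latency.
    struct Edge {
      int succ;
      unsigned latency;
    };

    // Height of a node: the longest latency-weighted path from the node to
    // an exit, recomputed from scratch (no caching, so never stale).
    unsigned height(const std::vector<std::vector<Edge>> &succs, int node) {
      unsigned h = 0;
      for (const Edge &e : succs[node])
        h = std::max(h, height(succs, e.succ) + e.latency);
      return h;
    }
    ```

    Take nodes A=0, B=1, C=2 with an edge B->C of latency 3. A starts with
    no successors, so its height is 0. After adding A->B with latency 0,
    the height of A becomes 3, so a cached value of 0 would be wrong unless
    it had been marked dirty.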
    
    I think this has been wrong for many years. Downstream we have had
    commit 52d4d82 reverted since back in 2016. There is no
    motivating lit test for 52d4d82 (only an incomplete C level
    reproducer in llvm#3613).
    
    After commit 13d04fa there finally appeared an upstream
    lit test that shows that we get better code if marking height/depth as
    dirty (llvm/test/CodeGen/AArch64/abds.ll).
    bjope authored Aug 20, 2024
    f321456
  14. 3f25f23
  15. [clang][NFC] Split invalid-cpu-note tests (llvm#104601)

    This change does two kinds of splits:
    - Splits each target into a different file. Some targets are left in the
    same files, such as riscv32/64 and x86/_64 as these tests and lists are
    very similar.
    - Splits up the very long 'note:' lines which contain a list of CPUs,
    using `CHECK-SAME`. There was a note about this not being possible
    before, but with `{{^}}`, this is now possible -- I have
    verified that this does the right thing if a single CPU anywhere in the
    list is left out.
    
    These tests had become quite annoying to change when adding a CPU, and I
    believe this change makes these easier to maintain, and should cut down
    on conflicts in these files (or at least makes conflicts easier to
    resolve).
    
    I apologise in advance for downstream conflicts, but hopefully that's a
    small amount of short term pain, in return for fewer conflicts in
    future.
    lenary authored Aug 20, 2024
    39e3085
  16. [llvm-c] Add getters for LLVMContextRef for various types (llvm#99087)

    Small PR to add additional getters for LLVMContextRef in the C API.
    abgeana authored Aug 20, 2024
    7cfc9a3
  17. [LLVM] Add a C API for creating instructions with custom syncscopes. (l…

    …lvm#104775)
    
    Another upstreaming of C API extensions we have in Julia/LLVM.jl.
    Although [we went](maleadt/LLVM.jl#431) with a
    string-based API there, here I'm proposing something that's similar to
    existing metadata/attribute APIs:
    - explicit functions to map syncscope names to IDs, and back
    - `LLVM*SyncScope` versions of builder APIs that already take a
    `SingleThread` argument: atomic rmw, atomic xchg, fence
    - `LLVMGetAtomicSyncScopeID` and `LLVMSetAtomicSyncScopeID` for other
    atomic instructions
    - testing through `llvm-c-test`'s `--echo` functionality
    maleadt authored Aug 20, 2024
    eb7d535
  18. [InstCombine] Adjust fixpoint error message (NFC)

    Add a hint to use the no-verify-fixpoint option.
    nikic committed Aug 20, 2024
    2511cdb
  19. [AArch64] Remove TargetParser CPU/Arch feature tests (llvm#104587)

    These are annoying to update, and are redundant since the tests in
    clang/test/Driver/print-enabled-extensions/ were added.
    tmatheson-arm authored Aug 20, 2024
    34e15ad
  20. [AArch64][NEON] Extend faminmax patterns with fminnm/fmaxnm (llvm#104766)
    
    Patterns were previously added to allow the following reductions
    - fminimum(abs(a), abs(b)) -> famin(a, b)
    - fmaximum(abs(a), abs(b)) -> famax(a, b)
    - llvm#103027
     
    It was suggested by @davemgreen that the following reductions are also
    possible
    - fminnum[nnan](abs(a), abs(b)) -> famin(a, b)
    - fmaxnum[nnan](abs(a), abs(b)) -> famax(a, b)
    
    ('nnan' documentation:
    https://llvm.org/docs/LangRef.html#fast-math-flags)
    
    The 'no NaNs' flag allows optimisations to assume that neither argument
    is a NaN, and so the differing NaN propagation semantics of
    llvm.maxnum/llvm.minnum and FAMAX/FAMIN can be ignored in this
    reduction.
    (llvm.maxnum/llvm.minnum:
    https://llvm.org/docs/LangRef.html#llvm-minnum-intrinsic)
    
    - Changes to LLVM
      - lib/target/AArch64/AArch64InstrInfo.td
        - add 'fminnm_nnan' and 'fmaxnm_nnan'; patfrags on fminnm/fmaxnm that
          are predicated on the intrinsic call having the 'nnan' flag.
        - add AArch64famin and AArch64famax patfrags, containing the new and
          existing reductions.
      - test/CodeGen/AArch64/aarch64-neon-faminmax.ll
        - add positive and negative tests for the new reduction, based on the
          presence of 'nnan' in the IR intrinsic call.
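
    Why the 'nnan' flag matters can be illustrated in scalar C++. This is
    a rough model under assumptions: `std::fmin` stands in for
    llvm.minnum (it returns the non-NaN operand when exactly one input is
    NaN), and `faminModel` is a hypothetical NaN-propagating absolute
    minimum, not a definition of the FAMIN instruction.

    ```cpp
    #include <cmath>

    // minnum-style absolute minimum: a single NaN input is ignored.
    float minnumAbs(float a, float b) {
      return std::fmin(std::fabs(a), std::fabs(b));
    }

    // Assumed model of a NaN-propagating absolute minimum: any NaN input
    // produces a NaN result.
    float faminModel(float a, float b) {
      if (std::isnan(a) || std::isnan(b))
        return std::nanf("");
      return std::fmin(std::fabs(a), std::fabs(b));
    }
    ```

    The two functions agree on every pair of non-NaN inputs and differ only
    when one input is NaN, which is the case the 'nnan' flag excludes.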
    SpencerAbson authored Aug 20, 2024
    5f3c0b2
  21. [llvm][offload] Move AMDGPU offload utilities to LLVM (llvm#102487)

    This patch moves utilities from
    `offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to
    `llvm/Frontend/Offloading/Utility.h` to be reused by
    other projects.
    
    Concretely the following changes were made:
    - Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`.
    - Remove unused fields `KernelObject`, `KernelSegmentSize`,
    `ExplicitArgumentCount` and `ImplicitArgumentCount` from
    `AMDGPUKernelMetaData`.
    - Return the produced error if `ELFObj.sections()` failed instead of
    using `cantFail`.
    - Added `AGPRCount` field to `AMDGPUKernelMetaData`.
    - Added a default invalid value to all the fields in
    `AMDGPUKernelMetaData`.
    fabianmcg authored Aug 20, 2024
    cfc76b6
  22. [SPARC] Remove assertions in printOperand for inline asm operands (ll…

    …vm#104692)
    
    Inline asm operands could contain any kind of relocation, so remove the
    checks.
    
    Fixes llvm#103493
    koachan authored Aug 20, 2024
    576b7a7
  23. [lldb][Windows] Fixed the API test breakpoint_with_realpath_and_sourc…

    …e_map (llvm#104918)
    
    This test is already disabled for Windows because of symlinks. Disable
    it for cross build on Windows host too.
    slydiman authored Aug 20, 2024
    fc04490
  24. [AArch64] Optimize when storing symmetry constants (llvm#93717)

    This change looks for instruction sequences that store a 64-bit
    constant whose upper and lower 32-bit halves are identical (a
    symmetric constant). Such a sequence usually consists of several 'MOV'
    instructions and at most one 'ORR'.
    
    If one is found, only the lower 32-bit constant is materialized, and
    it is stored to both halves using the 'STP' instruction.
    
    For example:
      renamable $x8 = MOVZXi 49370, 0
      renamable $x8 = MOVKXi $x8, 320, 16
      renamable $x8 = ORRXrs $x8, $x8, 32
      STRXui killed renamable $x8, killed renamable $x0, 0
    becomes
      $w8 = MOVZWi 49370, 0
      $w8 = MOVKWi $w8, 320, 16
      STPWi killed renamable $w8, killed renamable $w8, killed renamable $x0, 0
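
    The property being detected can be sketched as a check on the constant
    itself. `hasSymmetricHalves` is a hypothetical helper, not the code
    added by the patch; the constant below is the one built by the MOVZ and
    MOVK in the example (49370, then 320 shifted left by 16).

    ```cpp
    #include <cstdint>

    // Hypothetical helper: true when the upper and lower 32-bit halves of
    // a 64-bit constant are equal, so a pair of 32-bit stores of the low
    // half writes the same bytes as one 64-bit store.
    constexpr bool hasSymmetricHalves(std::uint64_t value) {
      return static_cast<std::uint32_t>(value >> 32) ==
             static_cast<std::uint32_t>(value);
    }

    // Low half built as in the example: MOVZ 49370, then MOVK 320, lsl 16.
    constexpr std::uint32_t lowHalf = (320u << 16) | 49370u;

    // The ORR with a 32-bit shift copies the low half into the high half.
    constexpr std::uint64_t fullConstant =
        (static_cast<std::uint64_t>(lowHalf) << 32) | lowHalf;
    ```

    Here `lowHalf` is 0x0140C0DA and `fullConstant` repeats it in both
    halves, so the check succeeds.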
    
    
    related issue : llvm#51483
    ParkHanbum authored Aug 20, 2024
    ee572ed
  25. Reapply "[CycleAnalysis] Methods to verify cycles and their nesting. (l…

    …lvm#102300)"
    
    This reverts commit 4aacc60.
    
    The original implementation provided a simple method to check whether the forest
    of nested cycles is well-formed. This is now augmented with other methods to
    check well-formedness of every cycle, either individually, or as the entire
    forest. These will be used by future transforms that modify CycleInfo.
    ssahasra committed Aug 20, 2024
    e6da78a
  26. [AArch64] Extend sxtw peephole to uxtw. (llvm#104516)

    This extends the existing sxtw peephole optimization (llvm#96293) to uxtw,
    which in LLVM is an ORRWrr that clears the top bits.
    
    Fixes llvm#98481
    davemgreen authored Aug 20, 2024
    fe946bf
  27. fd83b86
  28. [Driver] Make ffp-model=fast honor non-finite-values, introduce ffp-m…

    …odel=aggressive (llvm#100453)
    
    This change modifies -ffp-model=fast to select options that more closely
    match -funsafe-math-optimizations, and introduces a new model,
    -ffp-model=aggressive which matches the existing behavior (except for a
    minor change in the fp-contract behavior).
    
    The primary motivation for this change is to make -ffp-model=fast more
    user friendly, particularly in light of LLVM's aggressive optimizations
    when -fno-honor-nans and -fno-honor-infinities are used.
    
    This was previously proposed here:
    
    https://discourse.llvm.org/t/making-ffp-model-fast-more-user-friendly/78402
    Andy Kaylor authored Aug 20, 2024
    27e5f50
  29. [CostModel][X86] Add missing costkinds for scalar CTLZ/CTTZ instructions

    Based off worst case llvm-mca numbers for CTLZ/CTTZ(+ZERO_UNDEF) codegen
    
    Prep work for llvm#102885
    RKSimon committed Aug 20, 2024
    254da5a
  30. Reland [CGData] llvm-cgdata llvm#89884 (llvm#101461)

    Reland [CGData] llvm-cgdata llvm#89884 using `Opt` instead of `cl`
    - An action option is required: `--convert`, `--show`, or `--merge`.
    These are similar to the previously implemented sub-commands, but carry
    a `--` prefix.
    - `--format` option is added, which specifies `text` or `binary`.
    
    ---------
    
    Co-authored-by: Kyungwoo Lee <[email protected]>
    kyulee-com and Kyungwoo Lee authored Aug 20, 2024
    9bb5556
  31. [DXIL][Analysis] Add validator version to info collected by Module Me…

    …tadata Analysis (llvm#104828)
    
    Add Validator Version to information collected by Module Metadata
    Analysis pass. An earlier change (llvm#104040) added a default hardcoded
    value for validator version to be associated with DXIL module created
    during HLSL source compilation.
    
    Add tests to verify validator version info collected
     - Updated existing tests
     - Added a test with validator version specified in DXIL metadata
    bharadwajy authored Aug 20, 2024
    74f5ee4
  32. Reenable anon structs (llvm#104922)

    Add back missing includes and revert revert "[clang][ExtractAPI] Stop
    dropping fields of nested anonymous record types when they aren't
    attached to variable declaration (llvm#104600)"
    daniel-grumberg authored Aug 20, 2024
    8f4f3df
  33. [llvm-cgdata] Fix -Wcovered-switch-default (NFC)

    /llvm-project/llvm/tools/llvm-cgdata/llvm-cgdata.cpp:349:3:
    error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
      default:
      ^
    1 error generated.
    DamonFool committed Aug 20, 2024
    723a9b8
  34. [AArch64] fix buildbot by removing dead code

    Failure with -Werror buildbot caused by llvm#104587
    tmatheson-arm committed Aug 20, 2024
    b5f7b69
  35. 1c3955f
  36. [AMDGPU] Move AMDGPUMemoryUtils out of Utils. NFC. (llvm#104930)

    It is only used by CodeGen so does not need to be shared with the
    assembler/disassembler.
    jayfoad authored Aug 20, 2024
    55d744e
  37. [NVPTX] Add elect.sync Intrinsic (llvm#104780)

    This patch adds an NVVM intrinsic and NVPTX codegen for the elect.sync
    PTX instruction. Lit tests are
    added in elect.ll and verified through ptxas.
    
    PTX ISA reference:
    
    https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-elect-sync
    
    Signed-off-by: Durgadoss R <[email protected]>
    durga4github authored Aug 20, 2024
    d5e9691
  38. [NFC] Remove explicit bitcode enumeration from BitCodeFormat.rst (llv…

    …m#102618)
    
    This explicit listing of the bitcodes is out of date, and had fallen out of date in the past as well.
    
    Delete the explicit listing and point users to where they can find it.
    cjappl authored Aug 20, 2024
    5f77734
  39. [MLIR][EmitC] Allow ptrdiff_t as result in sub op (llvm#104921)

    This explicitly allows the `emitc.ptrdiff_t` type for the result of
    subtracting two pointers and changes the example accordingly.
    marbre authored Aug 20, 2024
    5032fa8
  40. [DXIL][Analysis] Delete unnecessary test (llvm#105025)

    Delete an unnecessary test added in an earlier PR.
    bharadwajy authored Aug 20, 2024
    c670cb4
  41. 90a8e5a
  42. [clang][ASTMatcher] Fix execution order of hasOperands submatchers (l…

    …lvm#104148)
    
    `hasOperands` does not always execute matchers in the order they are
    written. This can cause issues in code using bindings when one operand
    matcher relies on a binding set by the other. With this change, the
    first matcher present in the code is always executed first and any
    bindings it sets are available to the second matcher.
    
    Simple example with current version (1 match) and new version (2
    matches):
    ```bash
    > cat tmp.cpp
    int a = 13;
    int b = ((int) a) - a;
    int c = a - ((int) a);
    
    > clang-query tmp.cpp
    clang-query> set traversal IgnoreUnlessSpelledInSource
    clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))
    
    Match #1:
    
    tmp.cpp:1:1: note: "d" binds here
    int a = 13;
    ^~~~~~~~~~
    tmp.cpp:2:9: note: "root" binds here
    int b = ((int)a) - a;
            ^~~~~~~~~~~~
    1 match.
    
    > ./build/bin/clang-query tmp.cpp
    clang-query> set traversal IgnoreUnlessSpelledInSource
    clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))
    
    Match #1:
    
    tmp.cpp:1:1: note: "d" binds here
        1 | int a = 13;
          | ^~~~~~~~~~
    tmp.cpp:2:9: note: "root" binds here
        2 | int b = ((int)a) - a;
          |         ^~~~~~~~~~~~
    
    Match #2:
    
    tmp.cpp:1:1: note: "d" binds here
        1 | int a = 13;
          | ^~~~~~~~~~
    tmp.cpp:3:9: note: "root" binds here
        3 | int c = a - ((int)a);
          |         ^~~~~~~~~~~~
    2 matches.
    ```
    
    If this should be documented or regression tested anywhere please let me
    know where.
    nicovank authored Aug 20, 2024
    f9e2a86
  43. 8f44fee
  44. [Support] Remove unneeded __has_include fallback

    This is a C++17 feature implemented in all supported compilers.
    
    Pull Request: llvm#104898
    MaskRay authored Aug 20, 2024
    c106e8d
  45. [CMake] Remove HAVE_LINK_H

    We can remove the variable from https://reviews.llvm.org/D5610 since
    link.h is available on Linux (glibc/musl/Bionic), FreeBSD, and NetBSD.
    Use `__has_include(<link.h>)` before including it.
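
    The pattern looks roughly like this. The macro name
    `EXAMPLE_HAS_LINK_H` and the function `hasLinkHeader` are invented for
    the illustration and do not appear in LLVM.

    ```cpp
    // Probe for the header at preprocessing time instead of relying on a
    // CMake-generated HAVE_LINK_H definition.
    #if __has_include(<link.h>)
    #include <link.h>
    #define EXAMPLE_HAS_LINK_H 1
    #else
    #define EXAMPLE_HAS_LINK_H 0
    #endif

    // Hypothetical helper that reports the result of the probe.
    constexpr bool hasLinkHeader() { return EXAMPLE_HAS_LINK_H != 0; }
    ```

    Code that needs declarations from link.h can then be guarded by the
    same condition.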
    
    Pull Request: llvm#104893
    MaskRay authored Aug 20, 2024
    7c06786
  46. [mlir] [irdl] Improve IRDL documentation (llvm#104928)

    Updates some of the irdl documentation to be in line with the current
    state of IRDL. Also removes some trailing spaces in this documentation.
    alexarice authored Aug 20, 2024
    61f8ab3
  47. [OpenMP][FIX] Check for requirements early (llvm#104836)

    If we can't transform the region to SPMD, we should not wait till the
    end to decide that. Other AAs might assume SPMD, and we did set the
    constant initializer to indicate SPMD, but we did not change the code
    properly.
    jdoerfert authored Aug 20, 2024
    2641ed7
  48. Fix a warning for -Wcovered-switch-default (llvm#105054)

    This fixes a build break from [llvm/llvm-project] Reland [CGData]
    llvm-cgdata llvm#89884 (PR llvm#101461)
    kyulee-com authored Aug 20, 2024
    dfc3494
  49. 9b25ad8
  50. Recommit "[CodeGenPrepare] Folding urem with loop invariant value"

    Was missing remainder on `Start` value.
    
    Also changed logic as nikic suggested (getting loop from `PN`
    instead of `Rem`). The prior impl increased the complexity of the code
    and made debugging it more difficult.
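
    As a rough source-level picture of this kind of fold (an
    interpretation of the commit title, not the CodeGenPrepare code): a
    remainder of an induction variable by a loop-invariant value can be
    replaced by a counter that is incremented and wrapped. The function
    names are hypothetical, and the sketch assumes `n > 0` and that
    `start + incr` and `i + incr` do not overflow.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Reference version: one remainder operation per iteration.
    std::vector<std::uint32_t> remByDivision(std::uint32_t start,
                                             std::uint32_t end,
                                             std::uint32_t incr,
                                             std::uint32_t n) {
      assert(n > 0);
      std::vector<std::uint32_t> out;
      for (std::uint32_t i = start; i < end; ++i)
        out.push_back((i + incr) % n);
      return out;
    }

    // Folded version: a single remainder before the loop, taken on the
    // start value, then an increment-and-wrap counter inside the loop.
    std::vector<std::uint32_t> remByCounter(std::uint32_t start,
                                            std::uint32_t end,
                                            std::uint32_t incr,
                                            std::uint32_t n) {
      assert(n > 0);
      std::vector<std::uint32_t> out;
      std::uint32_t rem = (start + incr) % n; // remainder on the start value
      for (std::uint32_t i = start; i < end; ++i) {
        out.push_back(rem);
        if (++rem == n)
          rem = 0;
      }
      return out;
    }
    ```

    Omitting the initial `% n` would give wrong results whenever
    `start + incr` is not already smaller than `n`.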
    
    Closes llvm#104877
    goldsteinn committed Aug 20, 2024
    e4c67ba
  51. [RISCV] Add coverage for VP div[u]/rem[u] with non-power-of-2 vectors

    This already works, just adding coverage to show that before a change
    which depends on this functionality.
    preames committed Aug 20, 2024
    aaba552
  52. [Clang] CWG722: nullptr to ellipses (llvm#104704)

    https://cplusplus.github.io/CWG/issues/722.html
    
    nullptr passed to a variadic function is now converted to void* in C++.
    This does not affect C23 nullptr.
    
    Also fixes -Wformat-pedantic so that it no longer warns for nullptr
    passed to %p (because it is converted to void* in C++ and it is allowed
    for va_arg(ap, void*) in C23)
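
    A minimal sketch of the situation the issue addresses:
    `firstVarArgIsNullPointer` is a hypothetical variadic function that
    reads its first variadic argument as `void*`. With the conversion
    described above, passing `nullptr` to it is well-defined.

    ```cpp
    #include <cstdarg>

    // Hypothetical variadic function: reads one variadic argument as a
    // void* and reports whether it is a null pointer. `count` only serves
    // as the named parameter required before the ellipsis.
    bool firstVarArgIsNullPointer(int count, ...) {
      va_list args;
      va_start(args, count);
      void *pointer = va_arg(args, void *);
      va_end(args);
      return pointer == nullptr;
    }
    ```

    A call such as `firstVarArgIsNullPointer(1, nullptr)` passes a
    `std::nullptr_t` through the ellipsis, which is read back as `void*`.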
    MitalAshok authored Aug 20, 2024
    0e24686
  53. [bazel] Port bf68e90 (llvm#104907)

    Add dep on ControlFlowInterfaces for arith td files
    Groverkss authored Aug 20, 2024
    8ac9247
  54. abd3a2d
  55. [RISCV] Add isel optimization for (and (sra y, c2), c1) to recover re…

    …gression from llvm#101751. (llvm#104114)
    
    If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros, and
    c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
    followed by a SHXADD with c4 as the X amount.
    
    Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
    Alive2: https://alive2.llvm.org/ce/z/AwhheR
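
    The non-Zba form of the rewrite can be checked on 64-bit values with a
    few lines of C++. The function names are hypothetical; the sketch
    assumes that `>>` on a negative `int64_t` is an arithmetic shift (true
    on mainstream compilers and guaranteed from C++20), and that
    `c3 + c4 < 64` with `c2 >= c3`.

    ```cpp
    #include <cstdint>

    // Mask with c3 leading zero bits and c4 trailing zero bits.
    constexpr std::uint64_t shiftedMask(unsigned c3, unsigned c4) {
      return (~0ULL >> c3) & (~0ULL << c4);
    }

    // Arithmetic shift right (srai) on a 64-bit register value.
    constexpr std::uint64_t srai(std::uint64_t y, unsigned amount) {
      return static_cast<std::uint64_t>(static_cast<std::int64_t>(y) >> amount);
    }

    // Original pattern: (and (sra y, c2), c1).
    constexpr std::uint64_t original(std::uint64_t y, unsigned c2, unsigned c3,
                                     unsigned c4) {
      return srai(y, c2) & shiftedMask(c3, c4);
    }

    // Rewritten pattern: (slli (srli (srai y, c2 - c3), c3 + c4), c4).
    constexpr std::uint64_t rewritten(std::uint64_t y, unsigned c2, unsigned c3,
                                      unsigned c4) {
      return (srai(y, c2 - c3) >> (c3 + c4)) << c4;
    }
    ```

    Both functions keep exactly bits c4 through 63 - c3 of the
    sign-extended shift result, which is why they agree.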
    topperc authored Aug 20, 2024
    5144817
  56. [HLSL] Implement support for HLSL intrinsic - saturate (llvm#104619)

    Implement support for HLSL intrinsic saturate.
    Implement DXIL codegen for the intrinsic saturate by lowering it to DXIL
    Op dx.saturate.
    Implement SPIRV codegen by transforming saturate(x) to clamp(x, 0.0f,
    1.0f).
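
    The SPIRV lowering corresponds to this scalar computation. It is a
    plain C++ sketch of clamp(x, 0.0f, 1.0f), written with min/max, and
    says nothing about how either backend treats NaN inputs.

    ```cpp
    #include <algorithm>

    // saturate(x) expressed as clamp(x, 0.0f, 1.0f).
    constexpr float saturate(float x) {
      return std::min(std::max(x, 0.0f), 1.0f);
    }
    ```

    Values below 0 map to 0, values above 1 map to 1, and values in
    between are returned unchanged.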
    
    Add tests for DXIL and SPIRV CodeGen.
    bharadwajy authored Aug 20, 2024
    6a38e19
  57. [NFC][TableGen] Elminate use of isalpha/isdigit from TGLexer (llvm#10…

    …4837)
    
    - Replace use of std::isalpha, std::isdigit, std::isxdigit with LLVM's
    StringExtras versions, to avoid possibly locale dependent behavior (e.g.
    glibc).
    - Create helper function for common checks for valid identifier
    characters.
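
    Locale-independent character checks of this kind can be sketched as
    below. The names are hypothetical stand-ins, not the StringExtras or
    TGLexer functions, and the identifier rule shown (letters, digits and
    underscore, with no leading digit) is an assumption made for the
    example.

    ```cpp
    // ASCII-only classification: the result never depends on the C locale.
    constexpr bool isAsciiAlpha(char c) {
      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    constexpr bool isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    constexpr bool isAsciiHexDigit(char c) {
      return isAsciiDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Hypothetical shared helper for identifier characters; `isFirst`
    // rejects a digit in the leading position.
    constexpr bool isValidIdentifierChar(char c, bool isFirst) {
      return c == '_' || isAsciiAlpha(c) || (!isFirst && isAsciiDigit(c));
    }
    ```

    A byte such as 0xE9, which some single-byte locales classify as a
    letter, is always rejected by `isAsciiAlpha`.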
    jurahul authored Aug 20, 2024
    e6751bf
  58. [OpenMP] Map omp_default_mem_alloc to global memory (llvm#104790)

    Summary:
    Currently, we assign this to private memory. This causes failures on
    some SOLLVE tests. The standard isn't clear on the semantics of this
    allocation type, but there seems to be a consensus that it's supposed to
    be shared memory.
    jhuber6 authored Aug 20, 2024
    e0326b6
  59. [libc++][chono] Use hidden friends for leap_second comparison. (llvm#…

    …104713)
    
    The function
    
        template<class Duration>
          requires three_way_comparable_with<sys_seconds, sys_time<Duration>>
        constexpr auto operator<=>(const leap_second& x,
                                   const sys_time<Duration>& y) noexcept;
    
    has a recursive constraint. This caused an infinite loop in GCC and is
    now hit by llvm#102857.
    
    A fix would be to make this function a hidden friend; this solution is
    proposed in LWG4139.
    
    For consistency all comparisons are made hidden friends. Since the issue
    causes compilation failures no additional tests are needed.
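
    The hidden friend idiom itself can be shown with a toy type.
    `example::LeapSecond` is a hypothetical class, unrelated to the
    libc++ sources, and uses plain `==` and `<` instead of `<=>` to keep
    the sketch short.

    ```cpp
    namespace example {

    // Toy type standing in for a value with comparison operators.
    class LeapSecond {
      long long date_;

    public:
      constexpr explicit LeapSecond(long long date) : date_(date) {}

      // Hidden friends: defined inside the class, so they are found only
      // through argument-dependent lookup when a LeapSecond is involved.
      // They are not candidates when comparing unrelated types.
      friend constexpr bool operator==(const LeapSecond &a,
                                       const LeapSecond &b) {
        return a.date_ == b.date_;
      }
      friend constexpr bool operator<(const LeapSecond &a,
                                      const LeapSecond &b) {
        return a.date_ < b.date_;
      }
    };

    } // namespace example
    ```

    Because such operators are invisible to ordinary lookup, checking a
    comparison constraint on other types does not consider them, which is
    the property the fix relies on.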
    
    Fixes: llvm#104700
    mordante authored Aug 20, 2024
    e0441d5
  60. [mlir][spirv] Support gpu in convert-to-spirv pass (llvm#105010)

    This PR adds conversion patterns for GPU to the `convert-to-spirv` pass,
    introduced in llvm#95942. Now the pass is able to convert each `gpu.module`
    and its ops within a `builtin.module` into a `spirv.module`.
    
    **Future Plans**
    - Use `gpu.launch_func` to invoke kernel from host functions
    - Potentially integrate into the `mlir-vulkan-runner` for e2e testing
    angelz913 authored Aug 20, 2024
    93eda08
  61. [mlir][gpu] Add 'cluster_size' attribute to gpu.subgroup_reduce (llvm#104851)
    
    This enables performing several reductions in parallel, each smaller
    than the size of the subgroup.
    
    One potential application is flash attention with subgroup-wide matrix
    multiplication and reduction combined in one kernel. The multiplication
    operation requires a 2D matrix to be distributed over the lanes of the
    subgroup, which then constrains the shape the following reduction can
    have if we want to keep data in registers.
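    
    A host-side sketch of what a clustered reduction computes, under the
    assumptions that clusters are contiguous groups of lanes, that the
    cluster size is non-zero and divides the subgroup size, and that the
    reduction is an integer add. `clusteredReduceAdd` is a hypothetical
    name, not an MLIR API:
    
    ```cpp
    #include <cstddef>
    #include <vector>
    
    // Simulates an add-reduction over the lanes of one subgroup. Each
    // contiguous group of ClusterSize lanes is reduced independently, and
    // every lane receives the sum of its own cluster.
    // Assumes ClusterSize > 0 and that it divides Lanes.size().
    std::vector<int> clusteredReduceAdd(const std::vector<int> &Lanes,
                                        std::size_t ClusterSize) {
      std::vector<int> Result(Lanes.size());
      for (std::size_t Base = 0; Base + ClusterSize <= Lanes.size();
           Base += ClusterSize) {
        int Sum = 0;
        for (std::size_t I = 0; I < ClusterSize; ++I)
          Sum += Lanes[Base + I];
        for (std::size_t I = 0; I < ClusterSize; ++I)
          Result[Base + I] = Sum;
      }
      return Result;
    }
    ```
    
    For eight lanes holding 1 through 8 and a cluster size of 4, the first
    four lanes receive 10 and the last four receive 26.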
    andfau-amd authored Aug 20, 2024
    7aa22f0
  62. [lldb][ClangExpressionParser] Don't leak memory when multiplexing ExternalASTSources (llvm#104799)
    
    When we use `SemaSourceWithPriorities` as the `ASTContext`'s
    ExternalASTSource, we allocate a `ClangASTSourceProxy` (via
    `CreateProxy`) and two `ExternalASTSourceWrapper`. Then we push these
    sources into a vector in `SemaSourceWithPriorities`. The allocated
    `SemaSourceWithPriorities` itself will get properly deallocated because
    the `ASTContext` wraps it in an `IntrusiveRefCntPtr`. But the three
    sources we allocated earlier will never get released.
    
    This patch fixes this by mimicking what `MultiplexExternalSemaSource`
    does (which is what `SemaSourceWithPriorities` is based on anyway).
    I.e., when `SemaSourceWithPriorities` gets constructed, it increments
    the use count of its sources. And on destruction it decrements them.
    
    Similarly, to make sure we deallocate the `ClangASTSourceProxy` properly, the
    `ExternalASTSourceWrapper` now assumes shared ownership of the
    underlying source.
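    
    The retain-on-construction, release-on-destruction pattern described
    above can be sketched as follows. `Source` and `Multiplexer` are
    hypothetical stand-ins with a bare counter; the real LLDB and Clang
    classes use LLVM's intrusive reference counting and differ in detail:
    
    ```cpp
    #include <utility>
    #include <vector>
    
    // Hypothetical minimal ref-counted source for this sketch.
    struct Source {
      int RefCount = 0;
      void Retain() { ++RefCount; }
      // A real implementation would delete the object when the count hits 0.
      void Release() { --RefCount; }
    };
    
    // Holds several sources and shares ownership of each of them for its
    // own lifetime.
    class Multiplexer {
      std::vector<Source *> Sources;
    
    public:
      explicit Multiplexer(std::vector<Source *> S) : Sources(std::move(S)) {
        for (Source *Src : Sources)
          Src->Retain(); // take a reference on construction
      }
      ~Multiplexer() {
        for (Source *Src : Sources)
          Src->Release(); // drop the reference on destruction
      }
      Multiplexer(const Multiplexer &) = delete;
      Multiplexer &operator=(const Multiplexer &) = delete;
    };
    ```
    
    Copying is disabled in the sketch so that every retain is paired with
    exactly one release.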
    Michael137 authored Aug 20, 2024
    770cd24
  63. ddaa828
  64. [lldb][ClangExpressionParser] Implement ExternalSemaSource::ReadUndefinedButUsed (llvm#104817)
    
    While parsing an expression, Clang tries to diagnose usage of decls
    (with possibly non-external linkage) for which it hasn't been provided
    with a definition. This is the case, e.g., for functions with parameters
    that live in an anonymous namespace (those will have `UniqueExternal`
    linkage, this is computed [here in
    computeTypeLinkageInfo](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/AST/Type.cpp#L4647-L4653)).
    Before diagnosing such situations, Clang calls
    `ExternalSemaSource::ReadUndefinedButUsed`. The intended use of this API
    is to extend the set of "used but not defined" decls with additional
    ones that the external source knows about. However, in LLDB's case, we
    never provide `FunctionDecl`s with a definition, and instead rely on the
    expression parser to resolve those symbols by linkage name. Thus, to
    prevent the Clang parser from erroring out in these situations, this patch
    implements `ReadUndefinedButUsed` which just removes the "undefined"
    non-external `FunctionDecl`s that Clang found.
    
    We also had to add an `ExternalSemaSource` to the `clang::Sema` instance
    LLDB creates. We previously didn't have any source on `Sema`. Because we
    add the `ExternalASTSourceWrapper` here, that means we'd also
    technically be adding the `ClangExpressionDeclMap` as an
    `ExternalASTSource` to `Sema`, which is fine because `Sema` will only be
    calling into the `ExternalSemaSource` APIs (though nothing currently
    strictly enforces this, which is a bit worrying).
    
    Note, the decision for whether to put a function into `UndefinedButUsed`
    is done in
    [Sema::MarkFunctionReferenced](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/Sema/SemaExpr.cpp#L18083-L18087).
    The `UniqueExternal` linkage computation is done in
    [getLVForNamespaceScopeDecl](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/AST/Decl.cpp#L821-L833).
    
    Fixes llvm#104712
    Michael137 authored Aug 20, 2024
    8056d92
  65. [lldb] Fix windows debug build after 9d07f43 (llvm#104896)

    This patch tries to fix an issue with the Windows debug builds where the
    PDB file for Python scripted interfaces cannot be opened since its path
    length exceeds the Windows `MAX_PATH` limit:
    
    llvm#101672 (comment)
    
    This patch addresses the issue by building all the interfaces as a
    single library plugin that initializes each component as part of its
    `Initialize` method, instead of building each interface as its own
    library plugin.
    
    This keeps the build artifact path length smaller while respecting the
    naming convention and without making any exception in the build system.
    
    Fixes llvm#104895.
    
    Signed-off-by: Med Ismail Bennani <[email protected]>
    medismailben authored Aug 20, 2024
    3565332
  66. [ctx_prof] Add analysis utility to fetch ID of a callsite (llvm#104491)

    This will be needed when maintaining the contextual profile for ICP or inlining - we'll need to first fetch the ID of a callsite, which is in an instrumentation instruction (intrinsic) preceding the callsite.
    mtrofin authored Aug 20, 2024
    c8a678b
  67. [DirectX] Encapsulate DXILOpLowering's state into a class. NFC

    This introduces an anonymous class "OpLowerer" to help with lowering DXIL ops,
    and moves the DXILOpBuilder there instead of creating a new one for every
    operation. DXILOpBuilder is also changed to own its IRBuilder, since that makes
    it simpler to ensure that it isn't misused.
    
    Pull Request: llvm#104248
    bogner authored Aug 20, 2024
    e56ad22
  68. [mlir][tablegen] Fix tablegen bug with Complex class (llvm#104974)

    The `Complex` class in tablegen tries to store its element type, but due
    to a name collision it actually ends up storing the `type` field of the
    `ConfinedType` superclass, and so `elementType` is always set to
    `AnyComplex`.
    
    This renames the field so that it gets correctly set.
    alexarice authored Aug 20, 2024
    655d62c
  69. 3031840
  70. c442025
  71. llvm.lround: Update verifier to validate support of vector types. (llvm#98950)
    
    Both the IR Verifier and the Machine Verifier are updated.
    sgundapa authored Aug 20, 2024
    b941ba1
  72. [lldb] Disable the API test TestCppBitfields on Windows (llvm#105037)

    This test triggers an assertion in clang CodeGen, and Python crashes with
    error code 0x80000003. See llvm#105019 for more details. Note that the similar
    test lldb/test/API/lang/c/bitfields/TestBitfields.py is already disabled
    on Windows.
    slydiman authored Aug 20, 2024
    31e55d4
  73. [libc++] Fix several double-moves in the code base (llvm#104616)

    This patch hardens the "test iterators" we use to test algorithms by
    ensuring that they don't get double-moved. As a result of this
    hardening, the tests started reporting multiple failures where we would
    double-move iterators, which are being fixed in this patch.
    
    In particular:
    - Fix a double-move in pstl.partition
    - Add coverage for begin()/end() in subrange tests
    - Fix tests for ranges::ends_with and ranges::contains, which were
      incorrectly calling begin() twice on the same subrange containing
      non-copyable input iterators.
    
    Fixes llvm#100709
    ldionne authored Aug 20, 2024
    f73050e
  74. [AArch64][MachO] Add ptrauth ABI version to arm64e cpusubtype. (llvm#104650)
    
    In a mach_header, the cpusubtype is a 32-bit field, but it's split in 2
    subfields:
    - the low 24 bits containing the cpu subtype proper (e.g.,
    CPU_SUBTYPE_ARM64E 2)
    - the high 8 bits containing a capability field used for additional
    feature flags.
    
    Notably, it's only the subtype subfield that participates in fat file
    slice discrimination: the caps are ignored.
    
    arm64e uses the caps subfield to encode a ptrauth ABI version:
    - 0x80 (CPU_SUBTYPE_PTRAUTH_ABI) denotes a versioned binary
    - 0x40 denotes a kernel-ABI binary
    - 0x00-0x0F holds the ptrauth ABI version
    
    This teaches the basic obj tools to decode that (or ignore it when
    unneeded).
    
    It also teaches the MachO writer to default to emitting versioned
    binaries, but with a version of 0 (and without the kernel ABI flag).
    
    Modern arm64e requires versioned binaries: a binary with 0x00 caps in
    cpusubtype is now rejected by the linker and everything after. We can
    live without the sophistication of specifying the version and kernel ABI
    for now.
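    
    The field layout described above can be decoded roughly as follows.
    This is only a sketch: `PtrauthInfo` and `decodeCpuSubtype` are
    hypothetical names, and the masks are taken from the description in
    this message rather than from the LLVM or Mach-O headers:
    
    ```cpp
    #include <cstdint>
    
    // Decoded view of a mach_header cpusubtype value (hypothetical struct).
    struct PtrauthInfo {
      std::uint32_t Subtype; // low 24 bits: the cpu subtype proper
      bool Versioned;        // caps bit 0x80: versioned ptrauth ABI binary
      bool KernelABI;        // caps bit 0x40: kernel-ABI binary
      std::uint8_t Version;  // caps bits 0x0F: ptrauth ABI version
    };
    
    inline PtrauthInfo decodeCpuSubtype(std::uint32_t CpuSubtype) {
      // The high 8 bits hold the capability field.
      const std::uint32_t Caps = CpuSubtype >> 24;
      PtrauthInfo Info;
      Info.Subtype = CpuSubtype & 0x00FFFFFFu;
      Info.Versioned = (Caps & 0x80u) != 0;
      Info.KernelABI = (Caps & 0x40u) != 0;
      Info.Version = static_cast<std::uint8_t>(Caps & 0x0Fu);
      return Info;
    }
    ```
    
    Under this layout, the default described above (versioned, version 0,
    no kernel ABI flag, subtype 2) corresponds to the value 0x80000002.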
    
    Co-authored-by: Francis Visoiu Mistrih <[email protected]>
    ahmedbougacha and francisvm authored Aug 20, 2024
    fd4f952
  75. [mlir][gpu] Add extra value types for gpu::ShuffleOp (llvm#104605)

    Expand the accepted types for gpu.shuffle to any integer, float or 1d vector of integers or floats.
    Also updated the gpu-to-llvm-spv pass to support those types.
    FMarno authored Aug 20, 2024
    552d26e
  76. [llvm-lit][test] Updated built-in cat command tests (llvm#104473)

    This patch makes changes to improve syntax in tests and to add more
    strict checks on cat output. This is a prerequisite for
    llvm#101530.
    connieyzhu authored Aug 20, 2024
    349d76d
  77. [lldb][test] Change unsupported cat -e to cat -v to work with lit internal shell (llvm#104878)
    
    This patch changes the test that uses the `cat -e` option to `cat -v` so
    that the test can be run using lit's internal shell. For `cat`, the `-v`
    option prints non-printing characters in ^ and M- notation, while the
    `-e` option adds `$` to the end of lines in addition to printing
    non-printing characters in ^ and M- notation. This is an alternative
    patch to llvm#102061, opting to
    rewrite the test that uses `cat -e` instead of extending support to the
    `-e` option.
    
    Fixes llvm#102377
    connieyzhu authored Aug 20, 2024
    6558e04
  78. ce132a5
  79. [flang] More support for anonymous parent components in struct constructors (llvm#102642)
    
    A non-conforming extension to Fortran present in a couple of other
    compilers is allowing an anonymous component in a structure constructor
    to initialize a parent (or greater ancestor) component. This was working
    in this compiler only for direct parents, and only when the type was not
    use-associated.
    
    Fixes llvm#102557.
    klausler authored Aug 20, 2024
    10cc4a5
  80. [flang] Fix inheritance of IMPLICIT typing rules (llvm#102692)

    Interfaces don't inherit the IMPLICIT typing rules of their enclosing
    scope, and separate MODULE PROCEDUREs inherit the IMPLICIT typing rules
    of the submodule in which they are defined, not the rules from their
    interface.
    
    Fixes llvm#102558.
    klausler authored Aug 20, 2024
    90d753a
  81. [flang] Silence an inappropriate warning (llvm#104685)

    A bare ALLOCATE statement with no SOURCE= rightly earns a warning about
    an undefined function result, if that result is an allocatable that
    appears in the ALLOCATE. But in the case of a pointer, where the warning
    should care more about the pointer's association status than the value
    of its target, a bare ALLOCATE should suffice to silence the warning.
    klausler authored Aug 20, 2024
    f059017
  82. [flang] Silence spurious error (llvm#104821)

    Don't complain about a local object with an impure final procedure in a
    pure subprogram when the local object is a named constant.
    
    Fixes llvm#104796.
    klausler authored Aug 20, 2024
    143be4e
  83. [flang] Fix IEEE_NEAREST_AFTER folding edge cases (llvm#104846)

    Conversions of infinities from other kinds to real(10) were incorrect,
    and comparisons of real(2) vs real(3) are dicey as conversions in one
    direction can overflow and conversions in the other can lose precision.
    Use real(16) as the common type for comparisons in IEEE_NEAREST_AFTER.
    klausler authored Aug 20, 2024
    1e1cf25
  84. [Attributor] Improve AAUnderlyingObjects (llvm#104835)

    - Allocas and GlobalValues cannot be simplified, so we should not try.
    - If we never used any assumed state, the AAUnderlyingObjects doesn't
    require an additional update.
    - If we have seen an object (or its underlying object) before, we do
    not need to inspect it anymore.
    
    The original logic for "SeenObjects" was flawed and caused us to add
    intermediate values to the underlying object list if a PHI or select
    instruction referenced the same underlying object twice. The test
    changes are all instances of this situation and we now correctly derive
    `memory(none)` for the functions that only access stack memory.
    
    ---------
    
    Co-authored-by: Shilei Tian <[email protected]>
    jdoerfert and shiltian authored Aug 20, 2024
    8266d47
  85. [Driver,DXIL] Fix build

    MaskRay committed Aug 20, 2024
    c932a0e
  86. 0a22655
  87. 5822cc2
  88. 93e0f31
  89. [SandboxIR] Implement CatchSwitchInst (llvm#104652)

    This patch implements sandboxir::CatchSwitchInst mirroring
    llvm::CatchSwitchInst.
    vporpo authored Aug 20, 2024
    6e8c970
  90. AMDGPU/NewPM: Fill out passes in addCodeGenPrepare (llvm#102867)

    AMDGPUAnnotateKernelFeatures hasn't been ported yet, but it
    should soon be removable.
    arsenm authored Aug 20, 2024
    afeef4d
  91. AMDGPU/NewPM: Start filling out addIRPasses (llvm#102884)

    This is not complete, but gets AtomicExpand running. I was able
    to get further than I expected; we're quite close to having all
    the IR codegen passes ported.
    arsenm authored Aug 20, 2024
    33e18b2
  92. [clang] Support -Wa, options -mmsa and -mno-msa (llvm#99615)

    Co-authored-by: Fangrui Song <[email protected]>
    yingopq and MaskRay authored Aug 20, 2024
    26ae316
  93. 66ab4b8
  94. [NFC] Fixed two typos: "__builin_" --> "__builtin_" (llvm#98782)

    Fixed two typos:
    1. `__builin_va_list` --> `__builtin_va_list`
    2. `__builin_suspend` --> `__builtin_suspend`
    ZERICO2005 authored Aug 20, 2024
    d03dcf6
  95. [NFC] Fix a typo in InternalsManual: ActOnCXX -> ActOnXXX (llvm#105207)

    This part of the manual describes uses of `ActOnXXX` and `BuildXXX`.
    mpark authored Aug 20, 2024
    660de53
  96. [flang] Disable failing test (llvm#105327)

    flang/test/Evaluate/fold-nearest.f90 is failing oddly on ppc64le;
    disable it for now while I sort things out.
    klausler authored Aug 20, 2024
    1233df7
  97. [RISCV][GISel] Remove s32 support on RV64 for DIV and REM. (llvm#102519)
    
    Based on experience with SelectionDAG and experimental-rv64-legal-i32, I
    don't believe making s32 a legal type is viable without introducing an
    invariant that s32 values are always sign extended like Mips64 does.
    Mips64 does this with a separate 32-bit register class.
    
    `experimental-rv64-legal-i32` was removed in llvm#102509.
    
    This patch is part of a series to remove s32 support so we can remove
    the isel patterns that SelectionDAG is no longer using. To restore code
    quality, we will need to add custom W nodes like SelectionDAG.
    topperc authored Aug 20, 2024
    2599d69
  98. [Clang] Re-land Overflow Pattern Exclusions (llvm#104889)

    Introduce "-fsanitize-undefined-ignore-overflow-pattern=" which can
    be used to disable sanitizer instrumentation for common overflow-dependent
    code patterns.
    
    For a wide selection of projects, proper overflow sanitization could
    help catch bugs and solve security vulnerabilities. Unfortunately, in
    some cases the integer overflow sanitizers are too noisy for their users
    and are often left disabled. Providing users with a method to disable
    sanitizer instrumentation of common patterns could mean more projects
    actually utilize the sanitizers in the first place.
    
    One such project that has opted to not use integer overflow (or
    truncation) sanitizers is the Linux Kernel. There has been some
    discussion[1] recently concerning mitigation strategies for unexpected
    arithmetic overflow. This discussion is still ongoing and a succinct
    article[2] accurately sums up the discussion. In summary, many Kernel
    developers do not want to introduce more arithmetic wrappers when
    most developers understand the code patterns as they are.
    
    Patterns like:
    
      if (base + offset < base) { ... }
    
    or
    
      while (i--) { ... }
    
    or
    
      #define SOME -1UL
    
    are extremely common in a code base like the Linux Kernel. It is
    perhaps too much to ask of kernel developers to use arithmetic wrappers
    in these cases. For example:
    
      while (wrapping_post_dec(i)) { ... }
    
    which wraps some builtin would not fly. This would incur too many
    changes to existing code; the code churn would be too much, at least too
    much to justify turning on overflow sanitizers.
    
    Currently, this commit tackles three pervasive idioms:
    
    1. "if (a + b < a)" or some logically-equivalent re-ordering like "if (a > b + a)"
    2. "while (i--)" (for unsigned) a post-decrement always overflows here
    3. "-1UL, -2UL, etc" negation of unsigned constants will always overflow
    
    The patterns that are excluded can be chosen from the following list:
    
    - add-overflow-test
    - post-decr-while
    - negated-unsigned-const
    
    These can be enabled with a comma-separated list:
    
      -fsanitize-undefined-ignore-overflow-pattern=add-overflow-test,negated-unsigned-const
    
    "all" or "none" may also be used to specify that all patterns should be
    excluded or that none should be.
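    
    For reference, the three idioms can be written out as small functions.
    This is an illustrative sketch with hypothetical names; the unsigned
    arithmetic shown here wraps by definition in C++, so the code is
    well-defined and only an overflow sanitizer would report it:
    
    ```cpp
    #include <climits>
    
    // Idiom 1 (add-overflow-test): detect wraparound of an unsigned sum by
    // comparing the result against one of the operands.
    inline bool addWouldWrap(unsigned long Base, unsigned long Offset) {
      return Base + Offset < Base;
    }
    
    // Idiom 2 (post-decr-while): the final post-decrement takes I from 0 to
    // the maximum value, which is the intentional wrap.
    inline unsigned countDown(unsigned I) {
      unsigned Iterations = 0;
      while (I--)
        ++Iterations;
      return Iterations;
    }
    
    // Idiom 3 (negated-unsigned-const): negating an unsigned constant wraps
    // to the all-ones value.
    const unsigned long AllOnes = -1UL;
    ```
    
    Each function maps to one entry in the list of pattern names above.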
    
    [1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
    [2] https://lwn.net/Articles/979747/
    
    CCs: @efriedma-quic @kees @jyknight @fmayer @vitalybuka
    Signed-off-by: Justin Stitt <[email protected]>
    Co-authored-by: Bill Wendling <[email protected]>
    JustinStitt and bwendling authored Aug 20, 2024
    295fe0b
  99. [OpenMP] Temporarily disable test to keep bots green

    Summary:
    This test mysteriously fails on the bots but not locally; disable it until
    I can figure out why.
    jhuber6 committed Aug 20, 2024
    e96146c
  100. AMDGPU: Temporarily stop adding AtomicExpand to new PM passes

    This breaks using -passes=atomic-expand (but only sometimes?).
    Somehow an AtomicExpand pass ends up running without a TargetMachine,
    despite always being constructed with one.
    arsenm committed Aug 20, 2024
    dd90c72
  101. [flang] Disable part of failing test (temporary) (llvm#105350)

    A new section of a test is failing on aarch64 and ppc64le; disable it
    while I sort things out.
    klausler authored Aug 20, 2024
    0c48986
  102. [BOLT] Reduce CFI warning verbosity (llvm#105336)

    CFI programs may have more saves than restores and this is completely
    benign from BOLT's perspective. Reduce the verbosity and print the
    warning only under `-v=1` and above.
    maksfb authored Aug 20, 2024
    8f30506
  103. [DAG][RISCV] Use vp.<binop> when widening illegal types for binops which can trap (llvm#105214)
    
    This allows the use of a single wider operation with a restricted EVL
    instead of having to split and cover via decreasing powers-of-two sizes.
    
    On RISCV, this avoids the need for a bunch of vslidedown and vslideup
    instructions to extract subvectors, and VL toggles to switch between the
    various widths.
    
    Note there is a potential downside of using vp nodes; we lose any
    generic DAG combines which might have applied to the split form.
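    
    A scalar model of why a restricted EVL makes widening safe for a
    trapping operation such as division, assuming a 3-element vector widened
    to 4 lanes. `vpUDiv` is a hypothetical helper, and leaving inactive
    lanes at zero is a simplification chosen for this sketch:
    
    ```cpp
    #include <array>
    #include <cstddef>
    
    // Models a predicated unsigned divide on a widened 4-lane vector. Only
    // the first EVL lanes are computed, so padding lanes (which may hold a
    // zero divisor) are never divided.
    inline std::array<unsigned, 4> vpUDiv(const std::array<unsigned, 4> &A,
                                          const std::array<unsigned, 4> &B,
                                          std::size_t EVL) {
      std::array<unsigned, 4> R{}; // inactive lanes stay zero in this model
      for (std::size_t I = 0; I < EVL && I < R.size(); ++I)
        R[I] = A[I] / B[I];
      return R;
    }
    ```
    
    With EVL set to 3, the fourth lane can contain any divisor, including
    zero, without being evaluated.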
    preames authored Aug 20, 2024
    91b423d
  104. [libc] Include startup code when installing all (llvm#105203)

    Previously the libc startup code was marked `EXCLUDE_FROM_ALL` due to
    build issues. This patch removes that as no longer necessary.
    michaelrj-google authored Aug 20, 2024
    2353f48
  105. [cmake] Set up llvm-ml as ASM_MASM tool in WinMsvc.cmake (llvm#104903)

    Nowadays, an ASM_MASM tool is required for building the BLAKE3 assembly
    in llvm/lib/Support - the llvm-ml tool can do this.
    mstorsjo authored Aug 20, 2024
    aeeb74f
  106. [libc] move newheadergen back to safe_load (llvm#105374)

    In llvm#100024 we moved from safe_load to load for reading the yaml in
    newheadergen due to dependency issues. Those should be resolved by now
    so this should be a simple safety improvement.
    michaelrj-google authored Aug 20, 2024
    a3c66c8
  107. [TableGen] Rework EmitIntrinsicToBuiltinMap (llvm#104681)

    Rework `IntrinsicEmitter::EmitIntrinsicToBuiltinMap` for improved
    performance and refactor the code.
    
    Performance:
    - Current generated code does a linear search on the TargetPrefix,
      followed by a binary search on the builtin names for that
      target's builtins.
    - Improve the performance of this code in 2 ways:
      (a) Use binary search on the target prefix to look up the builtin
          table for the target.
      (b) Improve the (common) case of when all builtins for a target
          share a common prefix. Check this common prefix first, and
          then do the binary search in the builtin table using the builtin
          name with the common prefix removed. This should help
          both data size (by creating a smaller static string table) and
          runtime (by reducing the cost of binary search on smaller
          strings).
    
    Refactor:
    - Use range based for loops for iterating over maps.
    - Use formatv() and C++ raw string literals to simplify the emission
      code.
    - Change the generated `getIntrinsicForClangBuiltin` and
      `getIntrinsicForMSBuiltin` to take a `StringRef` instead of
      `const char *` for the prefix.
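    
    The two-level lookup described under "Performance" might look roughly
    like the sketch below. It is not the generated code: the struct names,
    `lookupIntrinsic`, and the convention of returning 0 for "not found" are
    assumptions for illustration, and it uses C++17 `std::string_view`
    rather than `StringRef`:
    
    ```cpp
    #include <algorithm>
    #include <string_view>
    #include <vector>
    
    // One builtin, stored with the target's common prefix stripped.
    struct BuiltinEntry {
      std::string_view Suffix;
      int IntrinsicID; // hypothetical ID; 0 is reserved for "not found"
    };
    
    // Per-target table. Builtins must be sorted by Suffix.
    struct TargetTable {
      std::string_view TargetPrefix;
      std::string_view CommonPrefix;
      std::vector<BuiltinEntry> Builtins;
    };
    
    // Tables must be sorted by TargetPrefix.
    inline int lookupIntrinsic(const std::vector<TargetTable> &Tables,
                               std::string_view TargetPrefix,
                               std::string_view BuiltinName) {
      // Step (a): binary search for the target's table.
      auto T = std::lower_bound(
          Tables.begin(), Tables.end(), TargetPrefix,
          [](const TargetTable &Tab, std::string_view Key) {
            return Tab.TargetPrefix < Key;
          });
      if (T == Tables.end() || T->TargetPrefix != TargetPrefix)
        return 0;
      // Step (b): check the common prefix once, then strip it.
      if (BuiltinName.substr(0, T->CommonPrefix.size()) != T->CommonPrefix)
        return 0;
      BuiltinName.remove_prefix(T->CommonPrefix.size());
      // Binary search on the shorter suffix strings.
      auto B = std::lower_bound(
          T->Builtins.begin(), T->Builtins.end(), BuiltinName,
          [](const BuiltinEntry &E, std::string_view Key) {
            return E.Suffix < Key;
          });
      if (B == T->Builtins.end() || B->Suffix != BuiltinName)
        return 0;
      return B->IntrinsicID;
    }
    ```
    
    Storing only suffixes keeps both the string data and each comparison in
    the inner search shorter, which is the data-size and runtime benefit
    the message describes.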
    jurahul authored Aug 20, 2024
    389f339
  108. 5e6a198
  109. 019e1a3
  110. [flang] Fix test on ppc64le & aarch64 (llvm#105439)

    Don't try to fold x87 extended precision operations in a test unless
    it's targeting x86-64.
    klausler authored Aug 20, 2024
    c9a4c51
  111. [DXIL][Analysis] Update test to match comment. NFC (llvm#105409)

    The mismatch between the comment on this test and the test itself was
    pointed out in
    llvm#100699 (comment),
    but apparently I failed to actually commit the fix.
    bogner authored Aug 20, 2024
    1a2a18f
  112. 26b79f8
  113. b7eac8c
  114. [lldb][test] XFAIL TestAnonNamespaceParamFunc.cpp on Windows

    This recently added test is failing on Windows with:
    ```
    c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe --no-lldbinit -S C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Expr\Output\TestAnonNamespaceParamFunc.cpp.tmp -o run -o "expression func(a)" -o exit | c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp
    executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe' --no-lldbinit -S 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Expr\Output\TestAnonNamespaceParamFunc.cpp.tmp' -o run -o 'expression func(a)' -o exit
    .---command stderr------------
    | TestAnonNamespaceParamFunc.cpp.tmp :: Class 'tagARRAYDESC' has a member 'tdescElem' of type 'tagTYPEDESC' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'tagARRAYDESC' has a member 'tdescElem' of type 'tagTYPEDESC' which does not have a complete definition.
    | (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::partial_ordering' has a member 'less' of type 'std::partial_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::partial_ordering' has a member 'less' of type 'std::partial_ordering' which does not have a complete definition.
    | (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::strong_ordering' has a member 'less' of type 'std::strong_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::strong_ordering' has a member 'less' of type 'std::strong_ordering' which does not have a complete definition.
    | (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::weak_ordering' has a member 'less' of type 'std::weak_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::weak_ordering' has a member 'less' of type 'std::weak_ordering' which does not have a complete definition.
    | (lldb) error: Couldn't look up symbols:
    |   int func(struct `anonymous namespace'::InAnon)
    | Hint: The expression tried to call a function that is not present in the target, perhaps because it was optimized out by the compiler.
    `-----------------------------
    executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp'
    .---command stderr------------
    | C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp:10:11: error: CHECK: expected string not found in input
    | // CHECK: (int) $0 = 15
    |           ^
    | <stdin>:16:26: note: scanning from here
    | (lldb) expression func(a)
    |                          ^
    ```
    
    So the function is still not callable. But AFAICT, this is not a
    regression, since this function wasn't callable prior to the patch
    anyway. I currently do not have a Windows setup to test this on,
    so XFAIL for now.
    Michael137 committed Aug 20, 2024
    8d712b4
  115. [RISCV][GISel] Split LoadStoreActions into LoadActions and StoreActions.

    Remove widenToNextPow2 from StoreActions.
    Reorder clampScalar and lowerIfMemSizeNotByteSizePow2 for StoreActions.
    
    These match AArch64 and got me further on a test case I was playing with
    that contained an i129 store.
    topperc committed Aug 20, 2024
    1e9d002
  116. [lldb] Extend frame recognizers to hide frames from backtraces (llvm#104523)
    
    Compilers and language runtimes often use helper functions that are
    fundamentally uninteresting when debugging anything but the
    compiler/runtime itself. This patch introduces a user-extensible
    mechanism that allows for these frames to be hidden from backtraces and
    automatically skipped over when navigating the stack with `up` and
    `down`.
    
    This does not affect the numbering of frames, so `f <N>` will still
    provide access to the hidden frames. The `bt` output will also print a
    hint that frames have been hidden.
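
A minimal sketch of the behavior described above, written as a plain Python model rather than against the LLDB API. The `Frame` class, the `HIDDEN` regex, and the helpers `backtrace`, `up`, and `select` are all hypothetical names chosen for illustration; the point is only that hidden frames are skipped in the listing and during navigation while the frame numbers stay unchanged.

```python
import re
from dataclasses import dataclass


@dataclass
class Frame:
    index: int      # original frame number; hiding never renumbers frames
    function: str


# Hypothetical recognizer rule: treat libc++ std::function internals as hidden.
HIDDEN = re.compile(r"std::__1::__function::")


def is_hidden(frame):
    return HIDDEN.search(frame.function) is not None


def backtrace(frames, filtered=True):
    """List visible frames, keeping their original numbers."""
    lines = [
        f"frame #{f.index}: {f.function}"
        for f in frames
        if not (filtered and is_hidden(f))
    ]
    if len(lines) < len(frames):
        lines.append("Note: Some frames were hidden by frame recognizers")
    return lines


def up(frames, current):
    """Index of the next visible frame above `current` (or `current` if none)."""
    for f in frames[current + 1:]:
        if not is_hidden(f):
            return f.index
    return current


def select(frames, n):
    """Model of `f <N>`: direct selection still reaches hidden frames."""
    return frames[n]


# Function names shortened from the example output in this commit message.
frames = [
    Frame(0, "foo(x=1, y=1)"),
    Frame(1, "std::__1::__invoke"),
    Frame(2, "std::__1::__invoke_void_return_wrapper<int, false>::__call"),
    Frame(3, "std::__1::__function::__alloc_func<...>::operator()"),
    Frame(4, "std::__1::__function::__func<...>::operator()"),
    Frame(5, "std::__1::__function::__value_func<...>::operator()"),
    Frame(6, "std::__1::function<int (int, int)>::operator()"),
    Frame(7, "main(argc=1, argv=...)"),
]

for line in backtrace(frames):
    print(line)
```

In this model, `up(frames, 2)` jumps straight to frame 6, while `select(frames, 4)` still returns the hidden frame 4.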
    
    My primary motivation for this feature is to hide thunks in the Swift
    programming language, but I'm including an example recognizer for
    `std::function::operator()` that I wished for myself many times while
    debugging LLDB.
    
    rdar://126629381
    
    
    Example output. (Yes, my proof-of-concept recognizer could hide even
    more frames if we had a method that returned the function name without
    the return type or I used something that isn't based off regex, but it's
    really only meant as an example).
    
    before:
    ```
    (lldb) thread backtrace --filtered=false
    * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
      * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
        frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
        frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
        frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
        frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
        frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
        frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
        frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
        frame #8: 0x0000000183cdf154 dyld`start + 2476
    (lldb) 
    ```
    
    after:
    
    ```
    (lldb) bt
    * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
      * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
        frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
        frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
        frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
        frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
        frame #8: 0x0000000183cdf154 dyld`start + 2476
    Note: Some frames were hidden by frame recognizers
    ```
    adrian-prantl authored Aug 20, 2024
    f01f80c
  117. [mlir][linalg] Improve getPreservedProducerResults estimation in ElementwiseOpFusion (llvm#104409)
    
    This commit changes the getPreservedProducerResults function so that it
    takes the consumer into account along with the producer, in order to
    predict which of the producer’s outputs can be dropped during the fusion
    process. It provides a more accurate prediction, considering that the
    fusion process also depends on the consumer.
    DanielLevi6 authored Aug 20, 2024
    4a4b233
  118. a16f0dc
  119. [DirectX] Register a few DXIL passes with the new PM

    This wires up dxil-op-lower, dxil-intrinsic-expansion, dxil-translate-metadata,
    and dxil-pretty-printer to the new pass manager, both as a matter of future
    proofing the backend and so that they can be used more flexibly in tests.
    
    A few arbitrary tests are updated in order to test the new PM path, and we drop
    the "print-dxil-resource-md" pass since it's redundant with the pretty printer.
    
    Pull Request: llvm#104250
    bogner authored Aug 20, 2024
    81ee385
  120. Revert "[RISCV][GISel] Allow >2*XLen integers in isSupportedReturnType."

    It didn't crash so I thought this worked now, but upon further review
    it miscalculates the stack address for the return.
    topperc committed Aug 20, 2024
    a8ef679
  121. [RISCV] Add coverage for int reductions of <3 x i8> vectors

    Specifically, to illustrate our general lowering strategy for
    non-power-of-two vectors.
    preames committed Aug 20, 2024
    3145cff
  122. Fix KCFI types for generated functions with integer normalization (llvm#104826)
    
    With -fsanitize-cfi-icall-experimental-normalize-integers, Clang
    appends ".normalized" to KCFI types in CodeGenModule::CreateKCFITypeId,
    which changes type hashes also for functions that don't have integer
    types in their signatures. However, llvm::setKCFIType does not take
    integer normalization into account, which means LLVM generated
    functions with KCFI types, e.g. sanitizer constructors, will fail KCFI
    checks when integer normalization is enabled in Clang.
    
    Add a cfi-normalize-integers module flag to indicate integer
    normalization is used, and append ".normalized" to KCFI types also in
    llvm::setKCFIType to fix the type mismatch.
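
The mismatch described above can be sketched with a toy model. This is not the hashing Clang or LLVM actually performs: `type_id` is a hypothetical helper, SHA-256 is only a stand-in hash, the type string is illustrative, and the module flag is modeled as a plain boolean. It shows only that appending ".normalized" on one side but not the other yields different type IDs, and that applying the same suffix on both sides makes them agree.

```python
import hashlib


def type_id(type_name: str, normalize_integers: bool) -> int:
    """Toy 32-bit type ID: hash the type name, with the suffix if enabled."""
    name = type_name + (".normalized" if normalize_integers else "")
    digest = hashlib.sha256(name.encode()).digest()  # stand-in hash only
    return int.from_bytes(digest[:4], "little")


VOID_FN = "_ZTSFvvE"  # illustrative name for a function type with no integers

# ID the frontend-emitted check expects when integer normalization is on.
caller_expects = type_id(VOID_FN, normalize_integers=True)

# ID attached to a generated function that ignores the setting (the bug).
before_fix = type_id(VOID_FN, normalize_integers=False)

# ID attached once the generator honors the same setting (the fix).
after_fix = type_id(VOID_FN, normalize_integers=True)

print(caller_expects == before_fix)  # check would fail
print(caller_expects == after_fix)   # check passes
```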
    samitolvanen authored Aug 20, 2024
    e1c36bd

Commits on Aug 21, 2024

  1. [AArch64] Basic SVE PCS support for handling scalable vectors on Darwin.

    For the tests I just added +sve instead of what actual hardware has, which is only SME,
    since otherwise all the test functions need to be marked as streaming mode.
    
    rdar://121864771
    aemerson committed Aug 21, 2024
    39ec1f7
  2. [RISCV][GISel] Merge RISCVCallLowering::lowerReturnVal into RISCVCallLowering::lowerReturn. NFC
    
    This is similar to X86 and AArch64 structure.
    topperc committed Aug 21, 2024
    381a803
  3. [mlir] Fix -Wunused-result in ElementwiseOpFusion.cpp (NFC)

    /llvm-project/mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp:124:7:
    error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
          opOperandsToIgnore.pop_back_val();
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1 error generated.
    DamonFool committed Aug 21, 2024
    d8b6df2
  4. RISC-V: Add fminimumnum and fmaximumnum support (llvm#104411)

    Since 2.2, the `fmin.s/fmax.s` instructions follow IEEE 754-2019 if the
    F extension is available, and `fmin.d/fmax.d` likewise follow
    IEEE 754-2019 if the D extension is available.
    
    So, let's mark them as Legal.
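
For reference, a small Python model of the IEEE 754-2019 minimumNumber operation that fminimumnum corresponds to. The function name `minimum_number` is hypothetical, and this is a sketch of the numeric result only: it does not model exception flags or signaling NaNs. A NaN operand is ignored when the other operand is a number, and -0.0 is ordered below +0.0.

```python
import math


def minimum_number(a: float, b: float) -> float:
    """Result of IEEE 754-2019 minimumNumber, ignoring exception flags."""
    if math.isnan(a):
        return b          # also yields NaN when both operands are NaN
    if math.isnan(b):
        return a
    if a == b == 0.0:
        # Zeros compare equal, so pick the negative one explicitly.
        return a if math.copysign(1.0, a) < 0 else b
    return min(a, b)


print(minimum_number(float("nan"), 1.0))
print(minimum_number(0.0, -0.0))
```

The maximum variant is symmetric: swap `min` for `max` and prefer the positive zero.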
    wzssyqa authored Aug 21, 2024
    2b84fe6
  5. Reland "[gn build] Port d3fb41d (llvm-cgdata)"

    This reverts commit 6476a1d.
    d3fb41d relanded in 9bb5556.
    
    ...amended to incorporate changes from the reland.
    nico committed Aug 21, 2024
    5ec73b7

Commits on Sep 20, 2024

  1. 64e887e