[AutoBump] Merge with 5ec73b7d (Aug 21) (8) #361
base: bump_to_e47b5075
Commits on Aug 20, 2024
[CycleAnalysis] Methods to verify cycles and their nesting. (llvm#102300)
The original implementation provided a simple method to check whether the forest of nested cycles is well-formed. This is now augmented with other methods to check well-formedness of all cycles, either individually, or as the entire forest. These will be used by future transforms that modify CycleInfo.
Commit: b432afc
[BasicAA] Use nuw attribute of GEPs (llvm#98608)
Use the nuw attribute of GEPs to prove that pointers do not alias, in cases matching the following:

    +                +                     +
    | BaseOffset     | +<nuw> Indices      |
    ---------------->|-------------------->|
    |-->V2Size       |                     |-------> V1Size
   LHS                                    RHS

If the difference between pointers is Offset +<nuw> Indices then we know that the addition does not wrap the pointer index type (add nuw) and the constant Offset is a lower bound on the distance between the pointers. We can then prove NoAlias via Offset u>= V2Size.
Commit: ba84cfb
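The BasicAA argument above can be checked by brute force on a toy address space. The sketch below is not the BasicAA implementation: the 8-bit index width and the helper names `proves_no_alias`, `overlaps` and `holds_for_all_pointers` are assumptions made only for illustration.

```python
BITS = 8                 # toy pointer index width (assumption; real targets use 32 or 64)
LIMIT = 1 << BITS


def proves_no_alias(offset: int, v2_size: int) -> bool:
    """The check quoted in the commit message: Offset u>= V2Size."""
    return offset >= v2_size


def overlaps(lhs: int, v2_size: int, rhs: int, v1_size: int) -> bool:
    """True if [lhs, lhs + v2_size) and [rhs, rhs + v1_size) share a byte."""
    return lhs < rhs + v1_size and rhs < lhs + v2_size


def holds_for_all_pointers(offset: int, v2_size: int, v1_size: int) -> bool:
    """Try every LHS and every variable index that respects nuw."""
    for lhs in range(LIMIT):
        for indices in range(LIMIT):
            rhs = lhs + offset + indices
            if rhs >= LIMIT:
                continue  # the addition would wrap, so nuw rules this case out
            if overlaps(lhs, v2_size, rhs, v1_size):
                return False
    return True
```

With `offset = 4` and `v2_size = 4` the check passes and no overlapping pair exists; with `offset = 3` the check fails and an overlap is found as soon as the variable index is zero.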
[Flang][OpenMP] Prevent re-composition of composite constructs (llvm#102613)
After decomposition of OpenMP compound constructs and assignment of applicable clauses to each leaf construct, composite constructs are then combined again into a single element in the construct queue. This helped later lowering stages easily identify composite constructs.
However, as a result of the re-composition stage, the same list of clauses is used to produce all MLIR operations corresponding to each leaf of the original composite construct. This undoes existing logic introducing implicit clauses and deciding to which leaf construct(s) each clause applies.
This patch removes construct re-composition logic and updates Flang lowering to be able to identify composite constructs from a list of leaf constructs. As a result, the right set of clauses is produced for each operation representing a leaf of a composite construct.
PR stack:
- llvm#102612
- llvm#102613
Commit: aa875cf
[MLIR][DLTI] Introduce DLTIQueryInterface and impl for DLTI attrs (llvm#104595)
This new interface is supposed to capture the core functionality of DLTI: querying for values at keys. As such this new interface unifies the ability to query DLTI attributes in a single method: query(). All existing DLTI interfaces exposing their own query methods now 1) extend this new interface and 2) provide a default implementation for `query()`.
As DLTIQueryInterface::query() returns an attribute, it naturally enables recursive queries on nested DLTI attrs. A utility function, `dlti::query()`, implements the logic for nested lookups.
A new `#dlti.map` attribute is introduced to capture the most generic form of a finite DLTI-mapping. One of the benefits is that it allows for more easily encoding hierarchical information that is suitably queryable, i.e. by means of nested attributes.
In line with the above, `transform.dlti.query` is modified so as to take an arbitrary number of keys and to perform a nested lookup using the above utility function.
Commit: 34a88bb
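The nested lookup described for DLTI can be pictured with plain Python dictionaries standing in for nested attributes. This is only a sketch of the idea, not the MLIR API: the function name `query`, the `spec` mapping and its keys are hypothetical.

```python
from typing import Any, Optional, Sequence


def query(attr: Any, keys: Sequence[str]) -> Optional[Any]:
    """Look up keys one at a time, descending into nested mappings."""
    current = attr
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None  # the lookup fails as soon as one level is missing
        current = current[key]
    return current


# Plain dicts stand in for nested DLTI attributes; the keys are made up.
spec = {
    "CPU": {"L1_cache_size_in_bytes": 32768},
    "GPU": {"max_vector_op_width": 128},
}
```

A call such as `query(spec, ["CPU", "L1_cache_size_in_bytes"])` walks one level per key, which is the behaviour the message attributes to a query with an arbitrary number of keys.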
[clang][modules] Built-in modules are not correctly enabled for Mac Catalyst (llvm#104872)
Mac Catalyst is the iOS platform, but it builds against the macOS SDK and so it needs to be checking the macOS SDK version instead of the iOS one. Add tests against a greater-than SDK version just to make sure this works beyond the initially supporting SDKs.
Commit: b986438
Commit: 42067f2
Revert "[CycleAnalysis] Methods to verify cycles and their nesting. (llvm#102300)"
This reverts commit b432afc. Reverted due to linker failures in expensive-checks.
Commit: 4aacc60
Commit: c99347a
[SimplifyCFG] Add support for hoisting commutative instructions (llvm#104805)
This extends SimplifyCFG hoisting to also hoist instructions with commuted operands, for example a+b on one side and b+a on the other side. This should address the issue mentioned in: llvm#91185 (comment)
Commit: b3fa45b
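As a rough illustration of the SimplifyCFG change, two instructions can be treated as the same for hoisting when they match exactly or when a commutative opcode has its two operands swapped. The sketch below is an assumption-laden model, not the SimplifyCFG code: the `Inst` class, the `COMMUTATIVE` set and `same_modulo_commute` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple

# Opcodes treated as commutative in this toy model (assumption).
COMMUTATIVE = {"add", "mul", "and", "or", "xor"}


@dataclass(frozen=True)
class Inst:
    opcode: str
    operands: Tuple[str, ...]


def same_modulo_commute(a: Inst, b: Inst) -> bool:
    """True if a and b compute the same value, allowing swapped operands."""
    if a.opcode != b.opcode:
        return False
    if a.operands == b.operands:
        return True
    return (
        a.opcode in COMMUTATIVE
        and len(a.operands) == 2
        and a.operands == b.operands[::-1]
    )
```

Under this model `add a, b` in one branch and `add b, a` in the other are hoisting candidates, while `sub a, b` and `sub b, a` are not.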
Commit: 3b49d27
[X86] Use correct fp immediate types in _mm_set_ss/sd
Avoids implicit sint_to_fp, which wasn't occurring on strict fp codegen.
Fixes llvm#104848
Commit: 6dcce42
Commit: 21de049
[ScheduleDAG] Dirty height/depth in addPred/removePred even for latency zero (llvm#102915)
A long time ago (back in 2009) there was a commit 52d4d82 that changed the scheduler to not dirty height/depth when adding or removing SUnit predecessors when the latency on the edge was zero. That commit message claims that the depth or height isn't affected when the latency is zero.
As a matter of fact, the depth/height can change even with a zero latency on the edge. For example, when adding a new SUnit A as a predecessor of an SUnit B with zero latency, both the height of A and the depth of B should be marked as dirty. If B has a greater height than A, then the height of A needs to be adjusted even if the latency is zero.
I think this has been wrong for many years. Downstream we have had commit 52d4d82 reverted since back in 2016. There is no motivating lit test for 52d4d82 (only an incomplete C level reproducer in llvm#3613). After commit 13d04fa there finally appeared an upstream lit test that shows that we get better code if marking height/depth as dirty (llvm/test/CodeGen/AArch64/abds.ll).
Commit: f321456
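The zero-latency case in the ScheduleDAG message can be reproduced on a tiny graph. The sketch below assumes height is the longest latency path down to a leaf and depth is the longest latency path from a root; the graph, the node names and the helpers `height` and `depth` are illustrative and are not the ScheduleDAG data structures.

```python
from typing import Dict, List, Tuple

Succs = Dict[str, List[Tuple[str, int]]]  # node -> [(successor, edge latency)]


def height(node: str, succs: Succs) -> int:
    """Longest latency path from node down to any leaf."""
    return max((height(s, succs) + lat for s, lat in succs.get(node, [])), default=0)


def depth(node: str, succs: Succs) -> int:
    """Longest latency path from any root down to node."""
    preds = [(p, lat) for p, edges in succs.items() for s, lat in edges if s == node]
    return max((depth(p, succs) + lat for p, lat in preds), default=0)


# Before: A and B are unrelated. After: A becomes a predecessor of B
# through an edge with latency zero.
before: Succs = {"B": [("C", 5)], "R": [("A", 3)]}
after: Succs = {"B": [("C", 5)], "R": [("A", 3)], "A": [("B", 0)]}
```

Here the height of A goes from 0 to 5 and the depth of B goes from 0 to 3 even though the new edge has latency zero, so cached values for both would be stale if they were not marked dirty.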
[X86][AVX10] Fix unexpected error and warning when using intrinsic (llvm#104781)
E.g.: https://godbolt.org/z/G8zK5svjK
Based on Evgenii's work.
Commit: 3f25f23
[clang][NFC] Split invalid-cpu-note tests (llvm#104601)
This change does two kinds of splits:
- Splits each target into a different file. Some targets are left in the same files, such as riscv32/64 and x86/_64 as these tests and lists are very similar.
- Splits up the very long 'note:' lines which contain a list of CPUs, using `CHECK-SAME`. There was a note about this not being possible before, but with `{{^}}`, this is now possible -- I have verified that this does the right thing if a single CPU anywhere in the list is left out.
These tests had become quite annoying to change when adding a CPU, and I believe this change makes these easier to maintain, and should cut down on conflicts in these files (or at least makes conflicts easier to resolve).
I apologise in advance for downstream conflicts, but hopefully that's a small amount of short term pain, in return for fewer conflicts in future.
Commit: 39e3085
[llvm-c] Add getters for LLVMContextRef for various types (llvm#99087)
Small PR to add additional getters for LLVMContextRef in the C API.
Commit: 7cfc9a3
[LLVM] Add a C API for creating instructions with custom syncscopes. (llvm#104775)
Another upstreaming of C API extensions we have in Julia/LLVM.jl. Although [we went](maleadt/LLVM.jl#431) with a string-based API there, here I'm proposing something that's similar to existing metadata/attribute APIs:
- explicit functions to map syncscope names to IDs, and back
- `LLVM*SyncScope` versions of builder APIs that already take a `SingleThread` argument: atomic rmw, atomic xchg, fence
- `LLVMGetAtomicSyncScopeID` and `LLVMSetAtomicSyncScopeID` for other atomic instructions
- testing through `llvm-c-test`'s `--echo` functionality
Commit: eb7d535
[InstCombine] Adjust fixpoint error message (NFC)
Add a hint to use the no-verify-fixpoint option.
Commit: 2511cdb
[AArch64] Remove TargetParser CPU/Arch feature tests (llvm#104587)
These are annoying to update, and are redundant since the tests in clang/test/Driver/print-enabled-extensions/ were added.
Commit: 34e15ad
[AArch64][NEON] Extend faminmax patterns with fminnm/fmaxnm (llvm#104766)
Patterns were previously added to allow the following reductions:
- fminimum(abs(a), abs(b)) -> famin(a, b)
- fmaximum(abs(a), abs(b)) -> famax(a, b)
- llvm#103027
It was suggested by @davemgreen that the following reductions are also possible:
- fminnum[nnan](abs(a), abs(b)) -> famin(a, b)
- fmaxnum[nnan](abs(a), abs(b)) -> famax(a, b)
('nnan' documentation: https://llvm.org/docs/LangRef.html#fast-math-flags)
The 'no NaNs' flag allows optimisations to assume that neither argument is a NaN, and so the differing NaN propagation semantics of llvm.maxnum/llvm.minnum and FAMAX/FAMIN can be ignored in this reduction. (llvm.maxnum/llvm.minnum: https://llvm.org/docs/LangRef.html#llvm-minnum-intrinsic)
Changes to LLVM:
- lib/target/AArch64/AArch64InstrInfo.td
  - add 'fminnm_nnan' and 'fmaxnm_nnan'; patfrags on fminnm/fmaxnm that are predicated on the intrinsic call having the 'nnan' flag.
  - add AArch64famin and AArch64famax patfrags, containing the new and existing reductions.
- test/CodeGen/AArch64/aarch64-neon-faminmax.ll
  - add positive and negative tests for the new reduction, based on the presence of 'nnan' in the IR intrinsic call.
Commit: 5f3c0b2
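Why the 'nnan' flag matters for the faminmax patterns can be seen by comparing a NaN-propagating minimum with one that returns the non-NaN operand. The sketch below only models those two behaviours in Python floats; the helper names are hypothetical, signalling NaNs are ignored, and it makes no claim about the exact FAMIN/FAMAX semantics.

```python
import math


def fminimum(a: float, b: float) -> float:
    """NaN-propagating minimum, in the style of llvm.minimum."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return min(a, b)


def fminnum(a: float, b: float) -> float:
    """Minimum that returns the other operand when one is NaN, in the style of llvm.minnum."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def agree(a: float, b: float) -> bool:
    """True if both minimum flavours give the same result on abs(a), abs(b)."""
    x = fminimum(abs(a), abs(b))
    y = fminnum(abs(a), abs(b))
    return x == y or (math.isnan(x) and math.isnan(y))
```

For inputs without NaNs the two flavours agree, which is the situation 'nnan' lets the optimiser assume; with exactly one NaN input they differ.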
[llvm][offload] Move AMDGPU offload utilities to LLVM (llvm#102487)
This patch moves utilities from `offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to `llvm/Frontend/Offloading/Utility.h` to be reused by other projects. Concretely the following changes were made:
- Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`.
- Remove unused fields `KernelObject`, `KernelSegmentSize`, `ExplicitArgumentCount` and `ImplicitArgumentCount` from `AMDGPUKernelMetaData`.
- Return the produced error if `ELFObj.sections()` failed instead of using `cantFail`.
- Added `AGPRCount` field to `AMDGPUKernelMetaData`.
- Added a default invalid value to all the fields in `AMDGPUKernelMetaData`.
Commit: cfc76b6
[SPARC] Remove assertions in printOperand for inline asm operands (llvm#104692)
Inline asm operands could contain any kind of relocation, so remove the checks.
Fixes llvm#103493
Commit: 576b7a7
[lldb][Windows] Fixed the API test breakpoint_with_realpath_and_source_map (llvm#104918)
This test is already disabled for Windows because of symlinks. Disable it for cross build on Windows host too.
Commit: fc04490
[AArch64] Optimize when storing symmetry constants (llvm#93717)
This change looks for instructions that store a 64-bit constant whose two 32-bit halves are identical. Such a constant is usually materialized with several 'MOV' instructions and at most one 'ORR'. If found, only the lower 32-bit constant is materialized, and the 'STP' instruction stores it to both the lower and the upper 32 bits.
For example:
    renamable $x8 = MOVZXi 49370, 0
    renamable $x8 = MOVKXi $x8, 320, 16
    renamable $x8 = ORRXrs $x8, $x8, 32
    STRXui killed renamable $x8, killed renamable $x0, 0
becomes
    $w8 = MOVZWi 49370, 0
    $w8 = MOVKWi $w8, 320, 16
    STPWi killed renamable $w8, killed renamable $w8, killed renamable $x0, 0
related issue : llvm#51483
Commit: ee572ed
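The constant built by the MOVZ/MOVK/ORR sequence in the example can be reproduced with integer arithmetic to see why a pair store of the low half is equivalent. This is a sketch under assumptions: the `32` operand of `ORRXrs` is read as a left shift by 32, memory is little-endian, and `halves` and `is_symmetric` are hypothetical helper names.

```python
import struct

MASK64 = (1 << 64) - 1


def halves(value: int) -> tuple:
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


def is_symmetric(value: int) -> bool:
    """True if both 32-bit halves of the 64-bit value are identical."""
    lo, hi = halves(value)
    return lo == hi


# MOVZXi 49370, 0: set bits [15:0], clear everything else.
x8 = 49370
# MOVKXi 320, 16: replace bits [31:16], keep the rest.
x8 = (x8 & ~(0xFFFF << 16)) | (320 << 16)
# ORRXrs x8, x8, 32: OR the register with itself shifted left by 32.
x8 = x8 | ((x8 << 32) & MASK64)

lo, hi = halves(x8)
one_64bit_store = struct.pack("<Q", x8)       # models the STRXui
two_32bit_stores = struct.pack("<II", lo, lo)  # models the STPWi of w8, w8
```

Both byte strings are identical here, which matches the claim that materializing only the lower 32 bits and storing them twice preserves the stored value.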
Reapply "[CycleAnalysis] Methods to verify cycles and their nesting. (llvm#102300)"
This reverts commit 4aacc60.
The original implementation provided a simple method to check whether the forest of nested cycles is well-formed. This is now augmented with other methods to check well-formedness of every cycle, either individually, or as the entire forest. These will be used by future transforms that modify CycleInfo.
Commit: e6da78a
[AArch64] Extend sxtw peephole to uxtw. (llvm#104516)
This extends the existing sxtw peephole optimization (llvm#96293) to uxtw, which in LLVM is an ORRWrr that clears the top bits.
Fixes llvm#98481
Commit: fe946bf
Commit: fd83b86
[Driver] Make ffp-model=fast honor non-finite-values, introduce ffp-model=aggressive (llvm#100453)
This change modifies -ffp-model=fast to select options that more closely match -funsafe-math-optimizations, and introduces a new model, -ffp-model=aggressive, which matches the existing behavior (except for a minor change in the fp-contract behavior).
The primary motivation for this change is to make -ffp-model=fast more user friendly, particularly in light of LLVM's aggressive optimizations when -fno-honor-nans and -fno-honor-infinities are used.
This was previously proposed here: https://discourse.llvm.org/t/making-ffp-model-fast-more-user-friendly/78402
Andy Kaylor authored Aug 20, 2024
Commit: 27e5f50
[CostModel][X86] Add missing costkinds for scalar CTLZ/CTTZ instructions
Based off worst case llvm-mca numbers for CTLZ/CTTZ(+ZERO_UNDEF) codegen.
Prep work for llvm#102885
Commit: 254da5a
Reland [CGData] llvm-cgdata llvm#89884 (llvm#101461)
Reland [CGData] llvm-cgdata llvm#89884 using `Opt` instead of `cl`
- Action options are required, `--convert`, `--show`, `--merge`. This was similar to sub-commands previously implemented, but having a prefix `--`.
- `--format` option is added, which specifies `text` or `binary`.
---------
Co-authored-by: Kyungwoo Lee
Commit: 9bb5556
[DXIL][Analysis] Add validator version to info collected by Module Metadata Analysis (llvm#104828)
Add Validator Version to information collected by Module Metadata Analysis pass.
An earlier change (llvm#104040) added a default hardcoded value for validator version to be associated with DXIL module created during HLSL source compilation.
Add tests to verify validator version info collected:
- Updated existing tests
- Added a test with validator version specified in DXIL metadata
Commit: 74f5ee4
Reenable anon structs (llvm#104922)
Add back missing includes and revert revert "[clang][ExtractAPI] Stop dropping fields of nested anonymous record types when they aren't attached to variable declaration (llvm#104600)"
Commit: 8f4f3df
[llvm-cgdata] Fix -Wcovered-switch-default (NFC)
/llvm-project/llvm/tools/llvm-cgdata/llvm-cgdata.cpp:349:3: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
  default:
  ^
1 error generated.
Commit: 723a9b8
[AArch64] fix buildbot by removing dead code
Failure with -Werror buildbot caused by llvm#104587
Commit: b5f7b69
Commit: 1c3955f
[AMDGPU] Move AMDGPUMemoryUtils out of Utils. NFC. (llvm#104930)
It is only used by CodeGen so does not need to be shared with the assembler/disassembler.
Commit: 55d744e
[NVPTX] Add elect.sync Intrinsic (llvm#104780)
This patch adds an NVVM intrinsic and NVPTX codegen for the elect.sync PTX instruction. Lit tests are added in elect.ll and verified through ptxas.
PTX ISA reference: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-elect-sync
Signed-off-by: Durgadoss R
Commit: d5e9691
[NFC] Remove explicit bitcode enumeration from BitCodeFormat.rst (llvm#102618)
This explicit listing of the bitcodes is out of date, and had fallen out of date in the past as well. Delete the explicit listing and point users to where they can find it.
Commit: 5f77734
[MLIR][EmitC] Allow ptrdiff_t as result in sub op (llvm#104921)
This explicitly allows the `emitc.ptrdiff_t` type for the result of subtracting two pointers and changes the example accordingly.
Commit: 5032fa8
[DXIL][Analysis] Delete unnecessary test (llvm#105025)
Delete an unnecessary test added in an earlier PR.
Commit: c670cb4
Commit: 90a8e5a
[clang][ASTMatcher] Fix execution order of hasOperands submatchers (llvm#104148)
`hasOperands` does not always execute matchers in the order they are written. This can cause issues in code using bindings when one operand matcher is relying on a binding set by the other. With this change, the first matcher present in the code is always executed first and any bindings it sets are available to the second matcher.
Simple example with current version (1 match) and new version (2 matches):
```bash
> cat tmp.cpp
int a = 13;
int b = ((int) a) - a;
int c = a - ((int) a);

> clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
int a = 13;
^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
int b = ((int)a) - a;
        ^~~~~~~~~~~~
1 match.

> ./build/bin/clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
    2 | int b = ((int)a) - a;
      |         ^~~~~~~~~~~~

Match #2:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:3:9: note: "root" binds here
    3 | int c = a - ((int)a);
      |         ^~~~~~~~~~~~
2 matches.
```
If this should be documented or regression tested anywhere please let me know where.
Commit: f9e2a86
Commit: 8f44fee
[Support] Remove unneeded __has_include fallback
This is a C++17 feature implemented in all supported compilers.
Pull Request: llvm#104898
Commit: c106e8d
We can remove the variable from https://reviews.llvm.org/D5610 since link.h is available on Linux (glibc/musl/Bionic), FreeBSD, and NetBSD. Use `__has_include(<link.h>)` before including it.
Pull Request: llvm#104893
Commit: 7c06786
[mlir] [irdl] Improve IRDL documentation (llvm#104928)
Updates some of the irdl documentation to be in line with the current state of IRDL. Also removes some trailing spaces in this documentation.
Commit: 61f8ab3
[OpenMP][FIX] Check for requirements early (llvm#104836)
If we can't transform the region to SPMD, we should not wait till the end to decide that. Other AAs might assume SPMD, and we did set the constant initializer to indicate SPMD, but we did not change the code properly.
Commit: 2641ed7
Fix a warning for -Wcovered-switch-default (llvm#105054)
This fixes a build break from [llvm/llvm-project] Reland [CGData] llvm-cgdata llvm#89884 (PR llvm#101461)
Commit: dfc3494
Commit: 9b25ad8
Recommit "[CodeGenPrepare] Folding `urem` with loop invariant value"
Was missing remainder on `Start` value.
Also changed logic as nikic suggested (getting loop from `PN` instead of `Rem`). The prior impl increased the complexity of the code and made debugging it more difficult.
Closes llvm#104877
Commit: e4c67ba
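The general idea behind folding a `urem` of an induction variable by a loop-invariant value can be sketched as replacing the division with a running counter that wraps. The sketch below assumes a step of one, `n > 0`, and no overflow; the exact conditions CodeGenPrepare requires are not stated in the message, and both function names are hypothetical.

```python
from typing import List


def rems_with_urem(start: int, end: int, n: int) -> List[int]:
    """Reference loop: compute i % n on every iteration."""
    return [i % n for i in range(start, end)]


def rems_with_counter(start: int, end: int, n: int) -> List[int]:
    """Same values without a per-iteration remainder."""
    out = []
    rem = start % n  # the remainder of the Start value, computed once
    for _ in range(start, end):
        out.append(rem)
        rem += 1
        if rem == n:  # wrap instead of dividing
            rem = 0
    return out
```

Seeding the counter with `start % n` rather than `start` is what keeps the two loops in agreement when `start` is not smaller than `n`, which appears to be the case the recommit note about the `Start` value refers to.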
[RISCV] Add coverage for VP div[u]/rem[u] with non-power-of-2 vectors
This already works, just adding coverage to show that before a change which depends on this functionality.
Commit: aaba552
[Clang] CWG722: nullptr to ellipses (llvm#104704)
https://cplusplus.github.io/CWG/issues/722.html
nullptr passed to a variadic function is now converted to void* in C++. This does not affect C23 nullptr.
Also fixes -Wformat-pedantic so that it no longer warns for nullptr passed to %p (because it is converted to void* in C++ and it is allowed for va_arg(ap, void*) in C23).
Commit: 0e24686
[bazel] Port bf68e90 (llvm#104907)
Add dep on ControlFlowInterfaces for arith td files
Commit: 8ac9247
Commit: abd3a2d
[RISCV] Add isel optimization for (and (sra y, c2), c1) to recover regression from llvm#101751. (llvm#104114)
If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros, and c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4) followed by a SHXADD with c4 as the X amount.
Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
Alive2: https://alive2.llvm.org/ce/z/AwhheR
Commit: 5144817
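The shift identity behind the RISC-V pattern (the variant without Zba) can be spot-checked with 64-bit integer arithmetic. The sketch below is an illustration only: the helpers `srai`, `srli`, `slli`, `shifted_mask`, `original` and `rewritten` are hypothetical names, and a 64-bit register width is assumed.

```python
MASK64 = (1 << 64) - 1


def srai(x: int, sh: int) -> int:
    """Arithmetic shift right of a 64-bit value."""
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> sh) & MASK64


def srli(x: int, sh: int) -> int:
    """Logical shift right of a 64-bit value."""
    return x >> sh


def slli(x: int, sh: int) -> int:
    """Logical shift left, truncated to 64 bits."""
    return (x << sh) & MASK64


def shifted_mask(c3: int, c4: int) -> int:
    """64-bit mask with c3 leading zeros and c4 trailing zeros."""
    return (MASK64 >> c3) & (MASK64 << c4) & MASK64


def original(y: int, c1: int, c2: int) -> int:
    return srai(y, c2) & c1


def rewritten(y: int, c2: int, c3: int, c4: int) -> int:
    return slli(srli(srai(y, c2 - c3), c3 + c4), c4)


SAMPLES = (0, 1, 0x8000000000000000, MASK64, 0x123456789ABCDEF0, 0xFEDCBA9876543210)
CASES = ((8, 4, 2), (5, 1, 3), (40, 32, 0), (63, 10, 7))  # (c2, c3, c4) with c2 > c3
```

For each sample value and each `(c2, c3, c4)` triple, `original(y, shifted_mask(c3, c4), c2)` and `rewritten(y, c2, c3, c4)` produce the same 64-bit result; the Alive2 link in the message is the authoritative proof.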
[HLSL] Implement support for HLSL intrinsic - saturate (llvm#104619)
Implement support for HLSL intrinsic saturate. Implement DXIL codegen for the intrinsic saturate by lowering it to DXIL Op dx.saturate. Implement SPIRV codegen by transforming saturate(x) to clamp(x, 0.0f, 1.0f). Add tests for DXIL and SPIRV CodeGen.
Commit: 6a38e19
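The SPIRV lowering mentioned above rewrites saturate(x) as clamp(x, 0.0f, 1.0f). A minimal sketch of that equivalence in Python floats follows; the function names are illustrative and NaN handling is deliberately not modelled.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))


def saturate(x: float) -> float:
    """saturate(x) expressed as clamp(x, 0.0, 1.0)."""
    return clamp(x, 0.0, 1.0)
```

Values below zero map to 0.0, values above one map to 1.0, and values in between pass through unchanged.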
[NFC][TableGen] Eliminate use of isalpha/isdigit from TGLexer (llvm#104837)
- Replace use of std::isalpha, std::isdigit, std::isxdigit with LLVM's StringExtras versions, to avoid possibly locale dependent behavior (e.g. glibc).
- Create helper function for common checks for valid identifier characters.
Commit: e6751bf
[OpenMP] Map `omp_default_mem_alloc` to global memory (llvm#104790)
Summary: Currently, we assign this to private memory. This causes failures on some SOLLVE tests. The standard isn't clear on the semantics of this allocation type, but there seems to be a consensus that it's supposed to be shared memory.
Commit: e0326b6
[libc++][chrono] Use hidden friends for leap_second comparison. (llvm#104713)
The function
    template<class Duration>
      requires three_way_comparable_with<sys_seconds, sys_time<Duration>>
      constexpr auto operator<=>(const leap_second& x, const sys_time<Duration>& y) noexcept;
has a recursive constraint. This caused an infinite loop in GCC and is now hit by llvm#102857.
A fix would be to make this function a hidden friend; this solution is proposed in LWG4139.
For consistency all comparisons are made hidden friends. Since the issue causes compilation failures no additional tests are needed.
Fixes: llvm#104700
Commit: e0441d5
[mlir][spirv] Support `gpu` in `convert-to-spirv` pass (llvm#105010)
This PR adds conversion patterns for GPU to the `convert-to-spirv` pass, introduced in llvm#95942. Now the pass is able to convert each `gpu.module` and its ops within a `builtin.module` into a `spirv.module`.
**Future Plans**
- Use `gpu.launch_func` to invoke kernel from host functions
- Potentially integrate into the `mlir-vulkan-runner` for e2e testing
Commit: 93eda08
[mlir][gpu] Add 'cluster_size' attribute to gpu.subgroup_reduce (llvm#104851)
This enables performing several reductions in parallel, each smaller than the size of the subgroup.
One potential application is flash attention with subgroup-wide matrix multiplication and reduction combined in one kernel. The multiplication operation requires a 2D matrix to be distributed over the lanes of the subgroup, which then constrains the shape the following reduction can have if we want to keep data in registers.
Commit: 7aa22f0
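One way to picture several reductions running in parallel inside a subgroup is to split the lanes into clusters and reduce each cluster independently. The sketch below is a host-side model under assumptions that the message does not spell out: clusters are taken to be contiguous runs of lanes, every lane is taken to receive its own cluster's result, and the Python function `subgroup_reduce` is a stand-in rather than the MLIR operation.

```python
import operator
from typing import Callable, List, Optional


def subgroup_reduce(
    lanes: List[int],
    op: Callable[[int, int], int],
    cluster_size: Optional[int] = None,
) -> List[int]:
    """Reduce each contiguous cluster of lanes and broadcast the result to its lanes."""
    size = len(lanes) if cluster_size is None else cluster_size
    if size <= 0 or len(lanes) % size != 0:
        raise ValueError("cluster_size must evenly divide the subgroup size")
    out: List[int] = []
    for start in range(0, len(lanes), size):
        cluster = lanes[start:start + size]
        acc = cluster[0]
        for value in cluster[1:]:
            acc = op(acc, value)
        out.extend([acc] * size)  # every lane of the cluster sees the same result
    return out


values = [1, 2, 3, 4, 5, 6, 7, 8]
whole_subgroup = subgroup_reduce(values, operator.add)
two_clusters = subgroup_reduce(values, operator.add, cluster_size=4)
```

With eight lanes and `cluster_size=4`, the model yields two independent sums instead of one subgroup-wide sum.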
[lldb][ClangExpressionParser] Don't leak memory when multiplexing ExternalASTSources (llvm#104799)
When we use `SemaSourceWithPriorities` as the `ASTContext`s ExternalASTSource, we allocate a `ClangASTSourceProxy` (via `CreateProxy`) and two `ExternalASTSourceWrapper`s. Then we push these sources into a vector in `SemaSourceWithPriorities`. The allocated `SemaSourceWithPriorities` itself will get properly deallocated because the `ASTContext` wraps it in an `IntrusiveRefCntPtr`. But the three sources we allocated earlier will never get released.
This patch fixes this by mimicking what `MultiplexExternalSemaSource` does (which is what `SemaSourceWithPriorities` is based on anyway). I.e., when `SemaSourceWithPriorities` gets constructed, it increments the use count of its sources. And on destruction it decrements them.
Similarly, to make sure we deallocate the `ClangASTProxy` properly, the `ExternalASTSourceWrapper` now assumes shared ownership of the underlying source.
Commit: 770cd24
Revert "[compiler-rt][fuzzer] implements SetThreadName for fuchsia." (llvm#105162)
Reverts llvm#99953
Commit: ddaa828
[lldb][ClangExpressionParser] Implement ExternalSemaSource::ReadUndef…
…inedButUsed (llvm#104817) While parsing an expression, Clang tries to diagnose usage of decls (with possibly non-external linkage) for which it hasn't been provided with a definition. This is the case, e.g., for functions with parameters that live in an anonymous namespace (those will have `UniqueExternal` linkage, this is computed [here in computeTypeLinkageInfo](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/AST/Type.cpp#L4647-L4653)). Before diagnosing such situations, Clang calls `ExternalSemaSource::ReadUndefinedButUsed`. The intended use of this API is to extend the set of "used but not defined" decls with additional ones that the external source knows about. However, in LLDB's case, we never provide `FunctionDecl`s with a definition, and instead rely on the expression parser to resolve those symbols by linkage name. Thus, to avoid the Clang parser from erroring out in these situations, this patch implements `ReadUndefinedButUsed` which just removes the "undefined" non-external `FunctionDecl`s that Clang found. We also had to add an `ExternalSemaSource` to the `clang::Sema` instance LLDB creates. We previously didn't have any source on `Sema`. Because we add the `ExternalASTSourceWrapper` here, that means we'd also technically be adding the `ClangExpressionDeclMap` as an `ExternalASTSource` to `Sema`, which is fine because `Sema` will only be calling into the `ExternalSemaSource` APIs (though nothing currently strictly enforces this, which is a bit worrying). Note, the decision for whether to put a function into `UndefinedButUsed` is done in [Sema::MarkFunctionReferenced](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/Sema/SemaExpr.cpp#L18083-L18087). The `UniqueExternal` linkage computation is done in [getLVForNamespaceScopeDecl](https://github.com/llvm/llvm-project/blob/ea8bb4d633683f5cbfd82491620be3056f347a02/clang/lib/AST/Decl.cpp#L821-L833). Fixes llvm#104712
Commit 8056d92
-
[lldb] Fix windows debug build after 9d07f43 (llvm#104896)
This patch tries to fix an issue with the Windows debug builds where the PDB file for Python scripted interfaces cannot be opened because its path length exceeds the Windows `MAX_PATH` limit: llvm#101672 (comment)
This patch addresses the issue by building all the interfaces as a single library plugin that initializes each component as part of its `Initialize` method, instead of building each interface as its own library plugin. This keeps the build artifact path length smaller while respecting the naming convention and without making any exception in the build system.
Fixes llvm#104895.
Signed-off-by: Med Ismail Bennani <[email protected]>
Commit 3565332
-
[ctx_prof] Add analysis utility to fetch ID of a callsite (llvm#104491)
This will be needed when maintaining the contextual profile for ICP or inlining - we'll need to first fetch the ID of a callsite, which is in an instrumentation instruction (intrinsic) preceding the callsite.
Commit c8a678b
-
[DirectX] Encapsulate DXILOpLowering's state into a class. NFC
This introduces an anonymous class "OpLowerer" to help with lowering DXIL ops, and moves the DXILOpBuilder there instead of creating a new one for every operation. DXILOpBuilder is also changed to own its IRBuilder, since that makes it simpler to ensure that it isn't misused. Pull Request: llvm#104248
Commit e56ad22
-
[mlir][tablegen] Fix tablegen bug with `Complex` class (llvm#104974)
The `Complex` class in tablegen tries to store its element type, but due to a name collision it actually ends up storing the `type` field of the `ConfinedType` superclass, and so `elementType` is always set to `AnyComplex`. This renames the field so that it gets correctly set.
Commit 655d62c
-
Commit 3031840
-
[mlir][sparse] support sparsification to coiterate operations. (llvm#…
Peiming Liu authored Aug 20, 2024
Commit c442025
-
llvm.lround: Update verifier to validate support of vector types. (llvm#98950)
Both the IR verifier and the machine verifier are updated.
Commit b941ba1
-
[lldb] Disable the API test TestCppBitfields on Windows (llvm#105037)
This test triggers an assertion in Clang CodeGen, and Python crashes with error code 0x80000003. See llvm#105019 for more details. Note that the similar test lldb/test/API/lang/c/bitfields/TestBitfields.py is already disabled on Windows.
Commit 31e55d4
-
[libc++] Fix several double-moves in the code base (llvm#104616)
This patch hardens the "test iterators" we use to test algorithms by ensuring that they don't get double-moved. As a result of this hardening, the tests started reporting multiple failures where we would double-move iterators, which are being fixed in this patch. In particular:
- Fixed a double-move in pstl.partition
- Add coverage for begin()/end() in subrange tests
- Fix tests for ranges::ends_with and ranges::contains, which were incorrectly calling begin() twice on the same subrange containing non-copyable input iterators.
Fixes llvm#100709
Commit f73050e
-
[AArch64][MachO] Add ptrauth ABI version to arm64e cpusubtype. (llvm#104650)
In a mach_header, the cpusubtype is a 32-bit field, but it's split in 2 subfields:
- the low 24 bits containing the cpu subtype proper (e.g., CPU_SUBTYPE_ARM64E 2)
- the high 8 bits containing a capability field used for additional feature flags.
Notably, it's only the subtype subfield that participates in fat file slice discrimination: the caps are ignored.
arm64e uses the caps subfield to encode a ptrauth ABI version:
- 0x80 (CPU_SUBTYPE_PTRAUTH_ABI) denotes a versioned binary
- 0x40 denotes a kernel-ABI binary
- 0x00-0x0F holds the ptrauth ABI version
This teaches the basic obj tools to decode that (or ignore it when unneeded). It also teaches the MachO writer to default to emitting versioned binaries, but with a version of 0 (and without the kernel ABI flag). Modern arm64e requires versioned binaries: a binary with 0x00 caps in cpusubtype is now rejected by the linker and everything after. We can live without the sophistication of specifying the version and kernel ABI for now.
Co-authored-by: Francis Visoiu Mistrih <[email protected]>
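The field layout described in this commit message can be sketched as a small decoder. This is a minimal illustration in Python, not the code from the patch: the names `SUBTYPE_BITS`, `decode_cpusubtype`, and `encode_arm64e` are hypothetical, and only the bit values (low 24 bits, high 8 bits, 0x80, 0x40, 0x0F, and the arm64e subtype value 2) come from the text above.

```python
# Bit layout taken from the commit message; all names here are illustrative.
SUBTYPE_BITS = 0x00FFFFFF          # low 24 bits: the cpu subtype proper
CPU_SUBTYPE_ARM64E = 2
PTRAUTH_ABI_VERSIONED = 0x80       # caps flag: versioned binary
PTRAUTH_ABI_KERNEL = 0x40          # caps flag: kernel-ABI binary
PTRAUTH_ABI_VERSION_MASK = 0x0F    # caps low nibble: ptrauth ABI version


def decode_cpusubtype(cpusubtype):
    """Split a 32-bit cpusubtype into its subtype and caps subfields."""
    caps = (cpusubtype >> 24) & 0xFF
    return {
        "subtype": cpusubtype & SUBTYPE_BITS,
        "versioned": bool(caps & PTRAUTH_ABI_VERSIONED),
        "kernel": bool(caps & PTRAUTH_ABI_KERNEL),
        "version": caps & PTRAUTH_ABI_VERSION_MASK,
    }


def encode_arm64e(version=0, kernel=False):
    """Build a versioned arm64e cpusubtype (version 0, no kernel flag by default)."""
    caps = PTRAUTH_ABI_VERSIONED | (version & PTRAUTH_ABI_VERSION_MASK)
    if kernel:
        caps |= PTRAUTH_ABI_KERNEL
    return (caps << 24) | CPU_SUBTYPE_ARM64E


print(hex(encode_arm64e()))
print(decode_cpusubtype(encode_arm64e()))
```

Because only the low 24 bits take part in fat file slice discrimination, a tool comparing slices would mask with `SUBTYPE_BITS` before comparing, which is why the caps can be ignored there.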
Commit fd4f952
-
[mlir][gpu] Add extra value types for gpu::ShuffleOp (llvm#104605)
Expand the accepted types for gpu.shuffle to any integer, float or 1d vector of integers or floats. Also updated the gpu-to-llvm-spv pass to support those types.
Commit 552d26e
-
[llvm-lit][test] Updated built-in cat command tests (llvm#104473)
This patch improves the syntax in the tests and adds stricter checks on cat output. This is a prerequisite for llvm#101530.
Commit 349d76d
-
[lldb][test] Change unsupported cat -e to cat -v to work with lit internal shell (llvm#104878)
This patch changes the test that uses the `cat -e` option to `cat -v` so that the test can be run using lit's internal shell. For `cat`, the `-v` option prints non-printing characters in ^ and M- notation, while the `-e` option adds `$` to the end of lines in addition to printing non-printing characters in ^ and M- notation. This is an alternative patch to llvm#102061, opting to rewrite the test that uses `cat -e` instead of extending support to the `-e` option.
Fixes llvm#102377
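The difference between the two options can be modeled in a few lines. The sketch below is a simplified Python model of the ^ and M- notation, assuming the conventional `cat` behavior (tab and newline pass through unchanged); it is not lit's built-in `cat`, and the helper names `cat_v` and `cat_e` are hypothetical.

```python
def cat_v(data: bytes) -> str:
    """Render bytes the way `cat -v` is assumed to: ^X for control
    characters, M- for bytes with the high bit set; tab/newline unchanged."""
    out = []
    for b in data:
        prefix = ""
        if b >= 128:
            prefix = "M-"
            b -= 128
        if not prefix and b in (9, 10):   # plain tab and newline pass through
            out.append(chr(b))
        elif b < 32:
            out.append(prefix + "^" + chr(b + 64))
        elif b == 127:
            out.append(prefix + "^?")
        else:
            out.append(prefix + chr(b))
    return "".join(out)


def cat_e(data: bytes) -> str:
    """Model of `cat -e`: the `-v` rendering plus `$` before each newline."""
    return cat_v(data).replace("\n", "$\n")


print(cat_v(b"a\x01b\n"), end="")
print(cat_e(b"a\x01b\n"), end="")
```

A test that only cares about the non-printing characters, and not about the end-of-line marker, gets the same information from the `-v` form, which is why rewriting the test is a workable alternative to supporting `-e`.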
Commit 6558e04
-
Commit ce132a5
-
[flang] More support for anonymous parent components in struct constructors (llvm#102642)
A non-conforming extension to Fortran present in a couple of other compilers is allowing an anonymous component in a structure constructor to initialize a parent (or greater ancestor) component. This was working in this compiler only for direct parents, and only when the type was not use-associated.
Fixes llvm#102557.
Commit 10cc4a5
-
[flang] Fix inheritance of IMPLICIT typing rules (llvm#102692)
Interfaces don't inherit the IMPLICIT typing rules of their enclosing scope, and separate MODULE PROCEDUREs inherit the IMPLICIT typing rules of the submodule in which they are defined, not the rules from their interface. Fixes llvm#102558.
Commit 90d753a
-
[flang] Silence an inappropriate warning (llvm#104685)
A bare ALLOCATE statement with no SOURCE= rightly earns a warning about an undefined function result, if that result is an allocatable that appears in the ALLOCATE. But in the case of a pointer, where the warning should care more about the pointer's association status than the value of its target, a bare ALLOCATE should suffice to silence the warning.
Commit f059017
-
[flang] Silence spurious error (llvm#104821)
Don't complain about a local object with an impure final procedure in a pure subprogram when the local object is a named constant. Fixes llvm#104796.
Commit 143be4e
-
[flang] Fix IEEE_NEAREST_AFTER folding edge cases (llvm#104846)
Conversions of infinities from other kinds to real(10) were incorrect, and comparisons of real(2) vs real(3) are dicey as conversions in one direction can overflow and conversions in the other can lose precision. Use real(16) as the common type for comparisons in IEEE_NEAREST_AFTER.
Commit 1e1cf25
-
[Attributor] Improve AAUnderlyingObjects (llvm#104835)
- Allocas and GlobalValues cannot be simplified, so we should not try.
- If we never used any assumed state, the AAUnderlyingObjects doesn't require an additional update.
- If we have seen an object (or its underlying object) before, we do not need to inspect it anymore.
The original logic for "SeenObjects" was flawed and caused us to add intermediate values to the underlying object list if a PHI or select instruction referenced the same underlying object twice. The test changes are all instances of this situation and we now correctly derive `memory(none)` for the functions that only access stack memory.
---------
Co-authored-by: Shilei Tian <[email protected]>
Commit 8266d47
-
Commit c932a0e
-
Commit 0a22655
-
Commit 5822cc2
-
Commit 93e0f31
-
[SandboxIR] Implement CatchSwitchInst (llvm#104652)
This patch implements sandboxir::CatchSwitchInst mirroring llvm::CatchSwitchInst.
Commit 6e8c970
-
AMDGPU/NewPM: Fill out passes in addCodeGenPrepare (llvm#102867)
AMDGPUAnnotateKernelFeatures hasn't been ported yet, but it should soon be removable.
Commit afeef4d
-
AMDGPU/NewPM: Start filling out addIRPasses (llvm#102884)
This is not complete, but gets AtomicExpand running. I was able to get further than I expected; we're quite close to having all the IR codegen passes ported.
Commit 33e18b2
-
[clang] Support -Wa, options -mmsa and -mno-msa (llvm#99615)
Co-authored-by: Fangrui Song <[email protected]>
Commit 26ae316
-
Commit 66ab4b8
-
[NFC] Fixed two typos: "__builin_" --> "__builtin_" (llvm#98782)
Fixed two typos: 1. `__builin_va_list` --> `__builtin_va_list` 2. `__builin_suspend` --> `__builtin_suspend`
Commit d03dcf6
-
[NFC] Fix a typo in InternalsManual: ActOnCXX -> ActOnXXX (llvm#105207)
This part of the manual describes uses of `ActOnXXX` and `BuildXXX`.
Commit 660de53
-
[flang] Disable failing test (llvm#105327)
flang/test/Evaluate/fold-nearest.f90 is failing oddly on ppc64le; disable it for now while I sort things out.
Commit 1233df7
-
[RISCV][GISel] Remove s32 support on RV64 for DIV, and REM. (llvm#102519)
Based on experience with SelectionDAG and experimental-rv64-legal-i32, I don't believe making s32 a legal type is viable without introducing an invariant that s32 values are always sign extended like Mips64 does. Mips64 does this with a separate 32-bit register class. `experimental-rv64-legal-i32` was removed in llvm#102509. This patch is part of a series to remove s32 support so we can remove the isel patterns that SelectionDAG is no longer using. To restore code quality, we will need to add custom W nodes like SelectionDAG.
Commit 2599d69
-
[Clang] Re-land Overflow Pattern Exclusions (llvm#104889)
Introduce "-fsanitize-undefined-ignore-overflow-pattern=" which can be used to disable sanitizer instrumentation for common overflow-dependent code patterns.
For a wide selection of projects, proper overflow sanitization could help catch bugs and solve security vulnerabilities. Unfortunately, in some cases the integer overflow sanitizers are too noisy for their users and are often left disabled. Providing users with a method to disable sanitizer instrumentation of common patterns could mean more projects actually utilize the sanitizers in the first place.
One such project that has opted to not use integer overflow (or truncation) sanitizers is the Linux Kernel. There has been some discussion[1] recently concerning mitigation strategies for unexpected arithmetic overflow. This discussion is still ongoing and a succinct article[2] accurately sums up the discussion. In summary, many Kernel developers do not want to introduce more arithmetic wrappers when most developers understand the code patterns as they are.
Patterns like:
    if (base + offset < base) { ... }
or
    while (i--) { ... }
or
    #define SOME -1UL
are extremely common in a code base like the Linux Kernel. It is perhaps too much to ask of kernel developers to use arithmetic wrappers in these cases. For example:
    while (wrapping_post_dec(i)) { ... }
which wraps some builtin would not fly. This would incur too many changes to existing code; the code churn would be too much, at least too much to justify turning on overflow sanitizers.
Currently, this commit tackles three pervasive idioms:
1. "if (a + b < a)" or some logically-equivalent re-ordering like "if (a > b + a)"
2. "while (i--)" (for unsigned) a post-decrement always overflows here
3. "-1UL, -2UL, etc" negation of unsigned constants will always overflow
The patterns that are excluded can be chosen from the following list:
- add-overflow-test
- post-decr-while
- negated-unsigned-const
These can be enabled with a comma-separated list:
    -fsanitize-undefined-ignore-overflow-pattern=add-overflow-test,negated-unsigned-const
"all" or "none" may also be used to specify that all patterns should be excluded or that none should be.
[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
[2] https://lwn.net/Articles/979747/
CCs: @efriedma-quic @kees @jyknight @fmayer @vitalybuka
Signed-off-by: Justin Stitt <[email protected]>
Co-authored-by: Bill Wendling <[email protected]>
Commit 295fe0b
-
[OpenMP] Temporarily disable test to keep bots green
Summary: This test mysteriously fails on the bots but not locally; disable it until I can figure out why.
Commit e96146c
-
AMDGPU: Temporarily stop adding AtomicExpand to new PM passes
This breaks using -passes=atomic-expand (but only sometimes?). Somehow an AtomicExpand pass ends up running without a TargetMachine, despite always being constructed with one.
Commit dd90c72
-
[flang] Disable part of failing test (temporary) (llvm#105350)
A new section of a test is failing on aarch64 and ppc64le; disable it while I sort things out.
Commit 0c48986
-
[BOLT] Reduce CFI warning verbosity (llvm#105336)
CFI programs may have more saves than restores and this is completely benign from BOLT's perspective. Reduce the verbosity and print the warning only under `-v=1` and above.
Commit 8f30506
-
[DAG][RISCV] Use vp.<binop> when widening illegal types for binops which can trap (llvm#105214)
This allows the use of a single wider operation with a restricted EVL instead of having to split and cover via decreasing powers-of-two sizes. On RISCV, this avoids the need for a bunch of vslidedown and vslideup instructions to extract subvectors, and VL toggles to switch between the various widths. Note there is a potential downside of using vp nodes: we lose any generic DAG combines which might have applied to the split form.
Commit 91b423d
-
[libc] Include startup code when installing all (llvm#105203)
Previously the libc startup code was marked `EXCLUDE_FROM_ALL` due to build issues. This patch removes that as no longer necessary.
Commit 2353f48
-
[cmake] Set up llvm-ml as ASM_MASM tool in WinMsvc.cmake (llvm#104903)
Nowadays, an ASM_MASM tool is required for building the BLAKE3 assembly in llvm/lib/Support - the llvm-ml tool can do this.
Commit aeeb74f
-
[libc] move newheadergen back to safe_load (llvm#105374)
In llvm#100024 we moved from safe_load to load for reading the yaml in newheadergen due to dependency issues. Those should be resolved by now so this should be a simple safety improvement.
Commit a3c66c8
-
[TableGen] Rework `EmitIntrinsicToBuiltinMap` (llvm#104681)
Rework `IntrinsicEmitter::EmitIntrinsicToBuiltinMap` for improved performance and refactor the code.
Performance:
- Current generated code does a linear search on the TargetPrefix, followed by a binary search on the builtin names for that target's builtins.
- Improve the performance of this code in 2 ways:
  (a) Use binary search on the target prefix to look up the builtin table for the target.
  (b) Improve the (common) case of when all builtins for a target share a common prefix. Check this common prefix first, and then do the binary search in the builtin table using the builtin name with the common prefix removed. This should help both data size (by creating a smaller static string table) and runtime (by reducing the cost of binary search on smaller strings).
Refactor:
- Use range based for loops for iterating over maps.
- Use formatv() and C++ raw string literals to simplify the emission code.
- Change the generated `getIntrinsicForClangBuiltin` and `getIntrinsicForMSBuiltin` to take a `StringRef` instead of `const char *` for the prefix.
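The two-level lookup described under "Performance" can be sketched as follows. This is a minimal Python model of the idea, not the generated C++: the targets `bar` and `foo`, the builtin names, and the intrinsic names are all made up for illustration, and `lookup_intrinsic` is a hypothetical helper.

```python
from bisect import bisect_left

# Hypothetical tables. TARGETS is sorted so it can be binary searched; each
# entry of TABLES holds the common builtin prefix for that target, the sorted
# builtin names with that prefix stripped, and the matching intrinsic names.
TARGETS = ["bar", "foo"]
TABLES = [
    ("__builtin_bar_", ["load", "store"], ["int_bar_load", "int_bar_store"]),
    ("__builtin_foo_", ["add", "mul", "sub"], ["int_foo_add", "int_foo_mul", "int_foo_sub"]),
]


def lookup_intrinsic(target_prefix, builtin_name):
    """Return the intrinsic for a builtin, or None if there is no match."""
    # (a) Binary search on the target prefix to find that target's table.
    t = bisect_left(TARGETS, target_prefix)
    if t == len(TARGETS) or TARGETS[t] != target_prefix:
        return None
    common, suffixes, intrinsics = TABLES[t]
    # (b) Check the shared prefix once, then search only the shorter suffixes.
    if not builtin_name.startswith(common):
        return None
    suffix = builtin_name[len(common):]
    i = bisect_left(suffixes, suffix)
    if i < len(suffixes) and suffixes[i] == suffix:
        return intrinsics[i]
    return None


print(lookup_intrinsic("foo", "__builtin_foo_mul"))
print(lookup_intrinsic("foo", "__builtin_bar_load"))
```

Storing only the suffixes is what shrinks the string table, and each comparison during the binary search touches fewer characters because the shared prefix has already been checked once.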
Commit 389f339
-
Commit 5e6a198
-
Commit 019e1a3
-
[flang] Fix test on ppc64le & aarch64 (llvm#105439)
Don't try to fold x87 extended precision operations in a test unless it's targeting x86-64.
Commit c9a4c51
-
[DXIL][Analysis] Update test to match comment. NFC (llvm#105409)
The mismatch between the comment on this test and the test itself was pointed out in llvm#100699 (comment), but apparently I failed to actually commit the fix.
Commit 1a2a18f
-
Commit 26b79f8
-
[FunctionAttrs] deduce attr `cold` on functions if all CG paths call a `cold` function
Closes llvm#101298
Commit b7eac8c
-
[lldb][test] XFAIL TestAnonNamespaceParamFunc.cpp on Windows
This recently added test is failing on Windows with:
```
c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe --no-lldbinit -S C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Expr\Output\TestAnonNamespaceParamFunc.cpp.tmp -o run -o "expression func(a)" -o exit | c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp
executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe' --no-lldbinit -S 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Expr\Output\TestAnonNamespaceParamFunc.cpp.tmp' -o run -o 'expression func(a)' -o exit
.---command stderr------------
| TestAnonNamespaceParamFunc.cpp.tmp :: Class 'tagARRAYDESC' has a member 'tdescElem' of type 'tagTYPEDESC' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'tagARRAYDESC' has a member 'tdescElem' of type 'tagTYPEDESC' which does not have a complete definition.
| (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::partial_ordering' has a member 'less' of type 'std::partial_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::partial_ordering' has a member 'less' of type 'std::partial_ordering' which does not have a complete definition.
| (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::strong_ordering' has a member 'less' of type 'std::strong_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::strong_ordering' has a member 'less' of type 'std::strong_ordering' which does not have a complete definition.
| (lldb) TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::weak_ordering' has a member 'less' of type 'std::weak_ordering' which does not have a complete definition.error: TestAnonNamespaceParamFunc.cpp.tmp :: Class 'std::weak_ordering' has a member 'less' of type 'std::weak_ordering' which does not have a complete definition.
| (lldb) error: Couldn't look up symbols:
| int func(struct `anonymous namespace'::InAnon)
| Hint: The expression tried to call a function that is not present in the target, perhaps because it was optimized out by the compiler.
`-----------------------------
executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp'
.---command stderr------------
| C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Expr\TestAnonNamespaceParamFunc.cpp:10:11: error: CHECK: expected string not found in input
| // CHECK: (int) $0 = 15
| ^
| <stdin>:16:26: note: scanning from here
| (lldb) expression func(a)
| ^
```
So the function is still not callable. But AFAICT, this is not a regression, since this function wasn't callable prior to the patch anyway. I currently do not have a Windows setup to test this on, so XFAIL for now.
Commit 8d712b4
-
[RISCV][GISel] Split LoadStoreActions in LoadActions and StoreActions.
Remove widenToNextPow2 from StoreActions. Reorder clampScalar and lowerIfMemSizeNotByteSizePow2 for StoreActions. These match AArch64 and got me further on a test case I was playing with that contained an i129 store.
Commit 1e9d002
-
[lldb] Extend frame recognizers to hide frames from backtraces (llvm#104523)
Compilers and language runtimes often use helper functions that are fundamentally uninteresting when debugging anything but the compiler/runtime itself. This patch introduces a user-extensible mechanism that allows for these frames to be hidden from backtraces and automatically skipped over when navigating the stack with `up` and `down`. This does not affect the numbering of frames, so `f <N>` will still provide access to the hidden frames. The `bt` output will also print a hint that frames have been hidden.
My primary motivation for this feature is to hide thunks in the Swift programming language, but I'm including an example recognizer for `std::function::operator()` that I wished for myself many times while debugging LLDB.
rdar://126629381
Example output. (Yes, my proof-of-concept recognizer could hide even more frames if we had a method that returned the function name without the return type or I used something that isn't based off regex, but it's really only meant as an example).
before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb)
```
after
```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
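The behavior described above (hidden frames keep their numbers, and `up`/`down` skip over them) can be modeled in a few lines. This is a standalone Python sketch, not LLDB's recognizer API: the regex, the abbreviated frame names, and the helpers `backtrace`, `up`, and `down` are all hypothetical stand-ins chosen to mirror the example output.

```python
import re

# Abbreviated function names from the example backtrace, indexed by frame number.
FRAMES = [
    "foo(x=1, y=1)",
    "decltype(...) std::__1::__invoke[abi:se200000]<...>(...)",
    "int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<...>(...)",
    "std::__1::__function::__alloc_func<...>::operator()[abi:se200000](...)",
    "std::__1::__function::__func<...>::operator()(...)",
    "std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](...) const",
    "std::__1::function<int (int, int)>::operator()(...) const",
    "main(argc=1, ...)",
    "start + 2476",
]

# Hypothetical pattern. It is anchored at the start of the name, so frames #1
# and #2 (whose names begin with a return type) are not matched.
HIDDEN = re.compile(r"^std::__1::__function::")


def backtrace(frames, filtered=True):
    """Return (frame number, name) pairs; hidden frames keep their numbers."""
    return [(i, n) for i, n in enumerate(frames)
            if not (filtered and HIDDEN.match(n))]


def up(frames, current):
    """Select the next visible frame toward the caller, skipping hidden ones."""
    for i in range(current + 1, len(frames)):
        if not HIDDEN.match(frames[i]):
            return i
    return current


def down(frames, current):
    """Select the next visible frame toward the callee, skipping hidden ones."""
    for i in range(current - 1, -1, -1):
        if not HIDDEN.match(frames[i]):
            return i
    return current


print([i for i, _ in backtrace(FRAMES)])
print(up(FRAMES, 2))
```

Keeping the original indices in the filtered view is what lets a direct selection of frame 3, 4, or 5 by number still work even though those frames are not listed.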
Commit f01f80c
-
[mlir][linalg] Improve getPreservedProducerResults estimation in ElementwiseOpFusion (llvm#104409)
This commit changes the getPreservedProducerResults function so that it takes the consumer into account along with the producer, in order to predict which of the producer’s outputs can be dropped during the fusion process. It provides a more accurate prediction, considering that the fusion process also depends on the consumer.
Commit 4a4b233
-
Commit a16f0dc
-
[DirectX] Register a few DXIL passes with the new PM
This wires up dxil-op-lower, dxil-intrinsic-expansion, dxil-translate-metadata, and dxil-pretty-printer to the new pass manager, both as a matter of future proofing the backend and so that they can be used more flexibly in tests. A few arbitrary tests are updated in order to test the new PM path, and we drop the "print-dxil-resource-md" pass since it's redundant with the pretty printer. Pull Request: llvm#104250
Commit 81ee385
-
Revert "[RISCV][GISel] Allow >2*XLen integers in isSupportedReturnType."
It didn't crash so I thought this worked now, but upon further review it miscalculates the stack address for the return.
Commit a8ef679
[RISCV] Add coverage for int reductions of <3 x i8> vectors
Specifically, to illustrate our general lowering strategy for non-power-of-two vectors.
Commit 3145cff
Fix KCFI types for generated functions with integer normalization (llvm#104826)
With -fsanitize-cfi-icall-experimental-normalize-integers, Clang appends ".normalized" to KCFI types in CodeGenModule::CreateKCFITypeId, which changes type hashes even for functions that have no integer types in their signatures. However, llvm::setKCFIType does not take integer normalization into account, so LLVM-generated functions with KCFI types, e.g. sanitizer constructors, fail KCFI checks when integer normalization is enabled in Clang. Add a cfi-normalize-integers module flag to indicate that integer normalization is used, and append ".normalized" to KCFI types in llvm::setKCFIType as well to fix the type mismatch.
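The mismatch described above can be illustrated with a minimal, self-contained sketch. It is not the LLVM or Clang implementation: `kcfi_type_string`, `stand_in_hash`, and `kcfi_type_id` are hypothetical names, FNV-1a is only a stand-in for whatever hash the compiler actually uses, and the mangled name in the usage note is illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: build the string that gets hashed into a KCFI type ID.
// When integer normalization is on, ".normalized" is appended, as the commit
// message describes.
std::string kcfi_type_string(const std::string& mangled, bool normalize_integers) {
  assert(!mangled.empty());
  return normalize_integers ? mangled + ".normalized" : mangled;
}

// Stand-in 32-bit FNV-1a hash. ASSUMPTION: the real compiler uses a different
// hash function; only "same string in, same ID out" matters for this sketch.
std::uint32_t stand_in_hash(const std::string& s) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Type ID as computed by one side (frontend or middle end).
std::uint32_t kcfi_type_id(const std::string& mangled, bool normalize_integers) {
  return stand_in_hash(kcfi_type_string(mangled, normalize_integers));
}
```

If the caller side computes `kcfi_type_id("_ZTSFvvE", true)` while a generated function is tagged with `kcfi_type_id("_ZTSFvvE", false)`, the two sides hash different strings, so the IDs are not expected to match. Passing the same flag on both sides, which is what the module flag makes possible, hashes identical strings and yields identical IDs.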
Commit e1c36bd
Commits on Aug 21, 2024
-
[AArch64] Basic SVE PCS support for handling scalable vectors on Darwin.
For the tests I just added +sve instead of what actual hardware has, which is only SME, since otherwise all the test functions need to be marked as streaming mode. rdar://121864771
Commit 39ec1f7
[RISCV][GISel] Merge RISCVCallLowering::lowerReturnVal into RISCVCallLowering::lowerReturn. NFC
This is similar to the X86 and AArch64 structure.
Commit 381a803
[mlir] Fix -Wunused-result in ElementwiseOpFusion.cpp (NFC)
/llvm-project/mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp:124:7: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
      opOperandsToIgnore.pop_back_val();
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
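The diagnostic above fires when the result of a `[[nodiscard]]` function is dropped. The commit message does not show the actual fix, so the following is only a sketch of two common ways to silence it. `SmallStack` and `drop_two` are hypothetical stand-ins, not LLVM classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container whose pop_back_val() is [[nodiscard]], mimicking the
// situation in the diagnostic.
template <typename T>
class SmallStack {
public:
  void push_back(T v) { data_.push_back(v); }

  // Removes and returns the last element; ignoring the result warns.
  [[nodiscard]] T pop_back_val() {
    assert(!data_.empty());
    T v = data_.back();
    data_.pop_back();
    return v;
  }

  // Removes the last element without returning it.
  void pop_back() {
    assert(!data_.empty());
    data_.pop_back();
  }

  std::size_t size() const { return data_.size(); }

private:
  std::vector<T> data_;
};

// Drops two elements without triggering -Wunused-result.
std::size_t drop_two(SmallStack<int>& s) {
  s.pop_back();            // Option 1: call the variant that returns nothing.
  (void)s.pop_back_val();  // Option 2: explicitly discard the result.
  return s.size();
}
```

Option 1 is usually the cleaner choice when the value is never needed, since it states the intent directly. The `(void)` cast is the fallback when no non-returning variant exists.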
Commit d8b6df2
RISC-V: Add fminimumnum and fmaximumnum support (llvm#104411)
Since 2.2, the `fmin.s`/`fmax.s` instructions follow IEEE 754-2019 if the F extension is available, and `fmin.d`/`fmax.d` likewise follow IEEE 754-2019 if the D extension is available. So, let's mark them as Legal.
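For reference, the IEEE 754-2019 minimumNumber semantics that fminimumnum models can be sketched in plain C++. This is a software illustration only, not the RISC-V lowering. `minimum_number` and `check_minimum_number` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Sketch of IEEE 754-2019 minimumNumber for float:
//  - if exactly one operand is NaN, the other operand is returned;
//  - if both are NaN, a quiet NaN is returned;
//  - -0.0 is treated as smaller than +0.0.
float minimum_number(float a, float b) {
  if (std::isnan(a)) {
    return std::isnan(b) ? std::numeric_limits<float>::quiet_NaN() : b;
  }
  if (std::isnan(b)) {
    return a;
  }
  if (a == 0.0f && b == 0.0f) {
    // Both are zeros (possibly of different sign): prefer the negative one.
    return std::signbit(a) ? a : b;
  }
  return a < b ? a : b;
}

// Small usage example exercising the special cases.
void check_minimum_number() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(minimum_number(nan, 1.0f) == 1.0f);
  assert(minimum_number(2.0f, nan) == 2.0f);
  assert(std::isnan(minimum_number(nan, nan)));
  assert(std::signbit(minimum_number(0.0f, -0.0f)));
}
```

The maximum variant mirrors this, with `+0.0` preferred over `-0.0` and the comparison reversed.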
Commit 2b84fe6
Commit 5ec73b7
Commits on Sep 20, 2024
-
Commit 64e887e