forked from llvm/llvm-project
[AutoBump] Merge with fixes of 2d50029f (Aug 15, needs torch-mlir bump) (5) #358
Open: mgehre-amd wants to merge 444 commits into bump_to_894d3eeb from bump_to_2d50029f
This changes the bare-metal driver logic such that it _always_ tries multilib.yaml if it exists, and it falls back to the hardwired/default RISC-V multilib selection only if a multilib.yaml doesn't exist. In contrast, the current behavior is that RISC-V can never use multilib.yaml, but other targets will try it if it exists. The flags `-march=` and `-mabi=` are exposed for multilib.yaml to match on. There is no attempt to help YAML file creators to duplicate the existing hard-wired multilib reuse logic -- they will have to implement it using `Mappings`. This should be backwards-compatible with existing sysroots, as multilib.yaml was previously never used for RISC-V, and the behavior doesn't change after this PR if the file doesn't exist.
Defined an AMDGPU DPP operation in MLIR to represent its semantics. Introduced a new enumeration attribute for different permutations and allowed for different types of arguments. Implemented constant attribute handling for the ROCDL::DPPMovOp operation. The operation now correctly accepts constant attributes for dppCtrl, rowMask, bankMask, and boundCtrl, and passes them to the corresponding LLVM intrinsic.
… a few places. (llvm#104555) PR llvm#80309 proposes to have users of APInt's uint64_t constructor opt-in to implicit truncation. Currently, that patch requires SelectionDAG::getConstant to opt-in. This patch adds getSignedConstant so we can start fixing some of the cases that require implicit truncation.
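The distinction between implicit truncation and a value that fits once it is treated as signed can be sketched without any LLVM types. The helpers below are hypothetical (they are not `APInt` or `SelectionDAG` APIs) and only illustrate why a negative constant passed as a `uint64_t` looks like it needs truncation, while the same constant passed as a signed value does not.

```cpp
#include <cstdint>

// Hypothetical helpers, not LLVM APIs. Both assume 1 <= bits <= 64.

// True if `value` can be stored in `bits` bits as an unsigned number
// without dropping any set bits.
bool fitsUnsigned(std::uint64_t value, unsigned bits) {
  return bits >= 64 || (value >> bits) == 0;
}

// True if `value` can be stored in `bits` bits as a two's-complement
// signed number.
bool fitsSigned(std::int64_t value, unsigned bits) {
  if (bits >= 64)
    return true;
  const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  return value >= lo && value <= hi;
}
```

For a 32-bit constant, `-1` fits as a signed value, but the same bit pattern widened to `uint64_t` has its upper 32 bits set, so the unsigned path would have to truncate.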
This PR is a continuation of the [previous one](llvm#101478). As a result, the `emitc::SwitchOp` op was developed, inspired by `scf::IndexSwitchOp`.

Main points of the PR:
- Added the `emitc::SwitchOp` op to the EmitC dialect and the CppEmitter
- Added corresponding tests
- Added conversion from the SCF dialect to the EmitC dialect for the op
CodeGenIntrinsic changes:
- Use `const` Record pointers, and `StringRef` when possible.
- Default initialize several fields with their definition instead of in the constructor.
- Simplify various string checks in the constructor using StringRef starts_with()/ends_with() functions.
- Eliminate first argument to `setDefaultProperties` and use `TheDef` class member instead.

IntrinsicEmitter changes:
- Emit `namespace llvm::Intrinsic` instead of nested namespaces.
- End generated comments with a .
- Use range based for loops, and early continue within loops.
- Emit `static constexpr` instead of `static const` for arrays.
- Change `compareFnAttributes` to use std::tie() to compare intrinsic attributes and return a default value when all attributes are equal.

STLExtras:
- Add std::replace wrapper which takes a range.
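The `std::tie()` comparison pattern mentioned for `compareFnAttributes` could look roughly like the sketch below. The `FnAttrs` struct, its field names, and the `compareAttrs` function are hypothetical and are not taken from the TableGen sources; only the use of `std::tie` and the "default value when all attributes are equal" behavior come from the description above.

```cpp
#include <tuple>

// Hypothetical attribute record; the field names are illustrative only.
struct FnAttrs {
  bool canThrow;
  bool noReturn;
  int memoryEffects;
};

// Returns true if L orders before R, comparing fields lexicographically.
// When every attribute is equal, returns `whenEqual` so the caller can
// decide how ties are handled.
bool compareAttrs(const FnAttrs &L, const FnAttrs &R, bool whenEqual) {
  auto lhs = std::tie(L.canThrow, L.noReturn, L.memoryEffects);
  auto rhs = std::tie(R.canThrow, R.noReturn, R.memoryEffects);
  if (lhs == rhs)
    return whenEqual;
  return lhs < rhs;
}
```

Building both tuples once keeps the field order in a single place, which is the main readability gain over a chain of per-field `if` statements.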
…eGen/bit-int-ubsan.c (llvm#104607) Add the missing `-triple x86_64-pc-linux-gnu` to the RUN line. --------- Co-authored-by: Eänolituri Lómitaurë <[email protected]> Co-authored-by: Aaron Ballman <[email protected]> Co-authored-by: Paul Kirth <[email protected]> Co-authored-by: Vitaly Buka <[email protected]>
Pass a flag to the `PGOInstrumentationGen` pass indicating whether it needs to produce contextual profiling instrumentation. In the process, slightly restructure the places that need to be aware of this (some were unnecessarily in `PGOInstrumentationUse`).
llvm#100367) This is split off from llvm#71764, and moves only the vmv.v.v part of performCombineVMergeAndVOps to work on MachineInstrs. In retrospect trying to handle PseudoVMV_V_V and PseudoVMERGE_VVM in the same function makes the code quite hard to read, so this just does it in a separate peephole. This turns out to be simpler since for PseudoVMV_V_V we don't need to convert the Src instruction to a masked variant, and we don't need to create a fake all ones mask.
This patch implements sandboxir::AtomicRMWInst mirroring llvm::AtomicRMWInst.
Old Headergen needed extra build rules to ensure that it worked in runtimes mode. This patch disables those checks if new headergen is enabled. Also some new headers were not being properly built with new headergen, and that's also fixed.
Similar to llvm#104481. Replace more "Utility" dependencies with "UtilityHeaders" to avoid cyclic dependency when building on macos.
An HLSL function has internal linkage by default unless it is:
1. a shader entry point function
2. marked with the `export` keyword (llvm#92812)
3. a patch constant function (not implemented yet)

This PR adds a link-time pass `DXILFinalizeLinkage` that updates the linkage of functions to make sure only shader entry points and exported functions are visible from the module (have _program linkage_). All other functions will be updated to have internal linkage.

Related spec update: microsoft/hlsl-specs#295

Fixes llvm#92071
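The linkage rule described above can be sketched as a small standalone function. Everything here (`Fn`, `Linkage`, `finalizeLinkage`) is a hypothetical stand-in and not the actual `DXILFinalizeLinkage` implementation; the sketch only models "entry points and exported functions stay visible, everything else becomes internal".

```cpp
#include <string>
#include <vector>

enum class Linkage { External, Internal };

// Hypothetical stand-in for a function in a module.
struct Fn {
  std::string name;
  bool isEntryPoint;
  bool isExported;
  Linkage linkage;
};

// Demotes every function that is neither a shader entry point nor
// exported to internal linkage. Returns how many functions changed.
int finalizeLinkage(std::vector<Fn> &module) {
  int changed = 0;
  for (Fn &f : module) {
    if (f.isEntryPoint || f.isExported)
      continue; // must stay visible from the module
    if (f.linkage != Linkage::Internal) {
      f.linkage = Linkage::Internal;
      ++changed;
    }
  }
  return changed;
}
```

Patch constant functions are left out of the sketch because the description marks them as not implemented yet.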
This reverts commit e592c2d. We can finally reland the PR since the issue that caused the PR to be reverted has been resolved in llvm#104051.
This allows annotating fields of C/C++ structs using API Notes. Previously API Notes supported Objective-C properties, but not fields. rdar://131548377
…vm#102986) When a test case inside of a gtest suite fails, we report a failure which causes the entire `ninja check-lldb` invocation to fail, even if the outer test case is marked as XFAIL - each test case result is reported as its own lit test run. This PR updates lit so it checks whether each test case's parent test suite is XFAIL before setting the status to FAIL. This is especially problematic because the failing tests can't manually be marked as XFAIL, due to llvm#102264.

Fixes llvm#102265

### Repro instructions
1. Modify any gtest test case to generate a failure.
2. Mark the outer lit test with XFAIL using either the `--xfail-tests` flag or the `LIT_XFAIL` env var.
3. Run the tests.
4. Observe the lit test is XFAIL as expected, but the failed child test cases show up as separate failures.

Co-authored-by: kendal <[email protected]>
…llvm#104519) This patch makes `-objc_relative_method_lists` the default on MacOS 10.16+/iOS 14+. Manual override still works if a command line argument is provided. To test this change, many explicit arguments are removed from the test files. Some explicit `-objc_no_objc_relative_method_lists` arguments are also added for tests that don't support this yet. This commit tries to revive llvm#101360, which exposed a bug that broke CI. llvm#104081 has fixed that bug.
) Reverts llvm#104522 Caused crashes on Fuchsia
This feature provided CPM_IOACC_CTL_EL3, a lone system register that has been carried over since the original ARM64 implementation, where it was the only processor-specific register in a long list of architectural sysregs. We don't need it here. It's been used as a generic processor-specific sysreg in tests, but the functionality they target is now better covered in other more exhaustive tests.
This analysis can't be used with other analyses if this isn't set. Pull Request: llvm#104244
…or buffers" (llvm#104517) Some build configs allow `llvm_unreachable` in a constexpr context, but not all, so these functions that map a fully covered enum to a string can't be constexpr. This version fixes that by dropping constexpr from those functions. This reverts commit fcc318f, reapplying 28d577e. Original message follows: This implements the DXILResourceAnalysis pass for `dx.TypedBuffer` and `dx.RawBuffer` types. This should be sufficient to lower `dx.handle.fromBinding` for this set of types, but it leaves a number of TODOs around for other resource types. This also includes a straightforward `print` method in `ResourceInfo` to make the analysis testable. This is deliberately different than the printer in `lib/Target/DirectX/DXILResource.cpp`, which attempts to print bindings in a format compatible with the comments `dxc` prints. We will eventually want to make that functionality driven by this analysis pass, but it isn't sufficient for testing so we need both.
…es to be treated as loads (llvm#99999) This change avoids deleting `!willReturn` intrinsics for which the return value is unused when building the SDAG. Currently, calls to read-only intrinsics not marked with `IntrWillReturn` cannot be deleted at the LLVM IR level but may be deleted when building the SDAG. These calls are unsafe to remove from the IR because the functions are `!willReturn`, and they should also be unsafe to remove from the SDAG for the same reason. This change aligns the behavior of the SDAG with that of LLVM IR. This change also requires that intrinsics not have the `Throws` attribute to be treated as loads, for the same reason.
Summary: This used an old name that I forgot to fix. The linter didn't catch it because it was behind an `ifdef`, and I forgot to update the branch I tested it on to match the one I landed.
Some new headers were not being properly built with new headergen, since they were using the old "add_gen_header" instead of the new "add_header_macro". This patch fixes the issue.
…4613) Flang switches to cc1 when we use `-x cuda`. Make sure we can use fc1 with CUDA Fortran input. The current pipeline will fail at the MLIR level for the moment. llvm#104483
This adds MachO support for emission of authenticated pointer relocations.

We already support AArch64AuthMCExpr, to represent assembly expressions such as:

    .quad <symbol>@AUTH(<key>, <discriminator> [, addr])

For example:

    .quad _g3@AUTH(ib, 1234, addr)

These @AUTH expressions lower to a new kind of MachO relocation: ARM64_RELOC_AUTHENTICATED_POINTER (11)

The relocation points to the referenced symbol. The other data, describing the signing scheme and original addend (only 32 bits instead of 64), is encoded into the addend (in the relocated location):

| 63 | 62 | 61-51 | 50-49 | 48 | 47-32 | 31-0 |
|----|----|-------|-------|------|---------------|--------|
| 1 | 0 | 0 | key | addr | discriminator | addend |
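The bit layout above can be packed with plain shifts, as in the sketch below. The function name `encodeAuthAddend` and the `PtrAuthKey` enum are hypothetical, and the key numbering (ia=0, ib=1, da=2, db=3) is an assumption for illustration; only the field positions are taken from the layout in the commit message.

```cpp
#include <cstdint>

// Assumed key numbering for illustration only.
enum class PtrAuthKey : std::uint64_t { IA = 0, IB = 1, DA = 2, DB = 3 };

// Packs the signing scheme and the 32-bit addend into the 64-bit value
// stored at the relocated location:
//   bit 63     = 1 (marks an authenticated pointer)
//   bit 62     = 0
//   bits 61-51 = 0
//   bits 50-49 = key
//   bit 48     = address diversity
//   bits 47-32 = discriminator
//   bits 31-0  = addend
std::uint64_t encodeAuthAddend(PtrAuthKey key, std::uint16_t discriminator,
                               bool addrDiversity, std::uint32_t addend) {
  std::uint64_t value = std::uint64_t{1} << 63;
  value |= static_cast<std::uint64_t>(key) << 49;
  value |= static_cast<std::uint64_t>(addrDiversity ? 1 : 0) << 48;
  value |= static_cast<std::uint64_t>(discriminator) << 32;
  value |= static_cast<std::uint64_t>(addend);
  return value;
}
```

Under these assumptions, `.quad _g3@AUTH(ib, 1234, addr)` would correspond to `encodeAuthAddend(PtrAuthKey::IB, 1234, true, 0)`.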
…lvm#94059) This patch prevents thread-local constants from being merged within PPCMergeStringPool.cpp. The PPCMergeStringPool pass primarily merges non-thread-local constants together, and thread-local constants should not be mixed with other (non-thread-local) constants. If thread-local and non-thread-local constants are pooled together, the llvm.threadlocal.address intrinsic can fail, because it expects its argument to be a thread-local global value, but the merged string structure created by the PPCMergeStringPool pass is not thread-local as a whole.
There are 3 ways in which `ParseAST::build` can fail and return `std::nullopt`. 2 of the ways we emit the error message using `elog`, but for the 3rd way, `log` is used. We should emit all 3 of these reasons with `elog`.
…lvm#104824) `SBCommand::AddCommand()` requires `SBCommandPluginInterface` to be heap based because it will be stored inside `std::shared_ptr<lldb::SBCommandPluginInterface>` later for reference counting. But lldb-dap passes a pointer to a static `StartDebuggingRequestHandler/ReplModeRequestHandler` object to it, which will cause corruption later during destruction. This PR fixes the issue by making these two handlers heap based. Co-authored-by: jeffreytan81 <[email protected]>
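The ownership problem can be sketched with a toy registry that, like the description says of `SBCommand::AddCommand()`, stores the pointer it receives in a `std::shared_ptr`. The `CommandHandler`, `StartDebuggingHandler`, and `CommandRegistry` types below are hypothetical and are not the lldb classes.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical interface and registry, standing in for the real classes.
struct CommandHandler {
  virtual ~CommandHandler() = default;
  virtual bool run() = 0;
};

struct StartDebuggingHandler : CommandHandler {
  bool run() override { return true; }
};

class CommandRegistry {
public:
  // Takes ownership of `impl`: the shared_ptr will eventually call
  // `delete` on it, so `impl` must have been allocated with `new`.
  void add(const std::string &name, CommandHandler *impl) {
    handlers[name] = std::shared_ptr<CommandHandler>(impl);
  }

  bool run(const std::string &name) {
    auto it = handlers.find(name);
    return it != handlers.end() && it->second->run();
  }

private:
  std::map<std::string, std::shared_ptr<CommandHandler>> handlers;
};

bool demo() {
  CommandRegistry registry;
  // Broken pattern (do not do this): passing the address of a static
  // object would make the shared_ptr delete memory it does not own.
  //   static StartDebuggingHandler handler;
  //   registry.add("start-debugging", &handler);
  registry.add("start-debugging", new StartDebuggingHandler());
  return registry.run("start-debugging");
}
```

An API that accepted `std::unique_ptr` or `std::shared_ptr` directly would make the ownership requirement visible in the signature, but that is a design alternative, not what the PR describes.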
This reverts commit d4f6fcf. Relanding with fixed obj_offset calculation (precedence of operations was wrong), and the suggestion in llvm#95308 (comment)
- When an unterminated open { is detected in the format string, instead of asserting and ignoring the error, replace that string with another to indicate the error, and remove the assert as well. - This will make the error evident in both assert and release builds and make observing the error more convenient (as several uses of this function are in TableGen and it is often built in release mode even in debug builds)
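A minimal sketch of the "replace the string instead of asserting" idea follows. The function `validateFormat`, the error text, and the `{{` escape handling are all assumptions made for illustration; this is not the actual format-string parser.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical checker: returns the format string unchanged if every
// '{' has a matching '}', otherwise returns a visible error marker.
// "{{" is treated as an escaped literal brace (an assumption here).
std::string validateFormat(std::string_view fmt) {
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '{')
      continue;
    if (i + 1 < fmt.size() && fmt[i + 1] == '{') {
      ++i; // skip the escaped brace pair
      continue;
    }
    std::size_t close = fmt.find('}', i + 1);
    if (close == std::string_view::npos)
      return "Unterminated brace sequence in format string"; // hypothetical text
    i = close; // resume scanning after the replacement field
  }
  return std::string(fmt);
}
```

Because the error text ends up in the output, the mistake is visible in release builds too, where an assert would have been compiled out.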
…KnownNonEqual`; NFC Downstream hit this assert. Since it doesn't really make any difference, just change the code to return false.
Error: CommandLine Error: Option 'attributor-manifest-internal' registered more than once During the standalone debug build of offload the above error is seen at app runtime when using a prebuilt llvm with LLVM_LINK_LLVM_DYLIB=ON. This is caused by linking both libLLVM.so and various archives that are found via llvm_map_components_to_libnames for jit support.
… on LLVM Dialect and LLVM Core in CMake build (llvm#104832) This change removes dependencies declared as either 'LINK_LIBS' or 'LINK_COMPONENTS' across several MLIR libraries. The removed dependencies appear to be incorrect and may have been required in older versions of the project. These dependencies cause many high level dialects to have a transitive dependence on the LLVM dialect and the LLVM 'Core' library ('llvm/lib/IR'). Note that if using the 'Ninja' CMake generator, one can inspect the dependencies (including all transitive libraries) of any given MLIR target by using the command `ninja -C <build dir> -t browse` and navigating to the library of interest in a web browser.
) Previously the secondary cache retrieval algorithm would not allow retrievals of memory chunks where the number of unused bytes would be greater than `MaxUnusedCachePages * PageSize` bytes. This meant that even if a memory chunk satisfied the requirements of the optimal fit algorithm, it may not be returned. This remains true if memory tagging is enabled. However, if memory tagging is disabled, a new heuristic has been put in place. Specifically, if a memory chunk is a non-optimal fit, the cache retrieval algorithm will attempt to release the excess memory to force a cache hit while keeping RSS down. In the event that a memory chunk is a non-optimal fit, the retrieval algorithm will release excess memory as long as the amount of memory to be released is less than or equal to 16 KB. If the amount of memory to be released exceeds 16 KB, the retrieval algorithm will not consider that cached memory chunk valid for retrieval.
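The retrieval heuristic can be sketched as a three-way decision. The names `CacheDecision`, `classifyChunk`, and `kMaxReleaseBytes` are hypothetical, and the sketch assumes that "the amount of memory to be released" is the number of unused bytes beyond the `MaxUnusedCachePages * PageSize` threshold; the commit message does not spell that out, so treat it as an assumption rather than the allocator's actual computation.

```cpp
#include <cstddef>

enum class CacheDecision { Hit, HitAfterRelease, Miss };

// 16 KB release limit taken from the description above.
constexpr std::size_t kMaxReleaseBytes = 16 * 1024;

// unusedBytes:    bytes in the cached chunk the request would not use.
// maxUnusedBytes: MaxUnusedCachePages * PageSize.
CacheDecision classifyChunk(std::size_t unusedBytes,
                            std::size_t maxUnusedBytes, bool memoryTagging) {
  if (unusedBytes <= maxUnusedBytes)
    return CacheDecision::Hit; // within the existing threshold
  if (memoryTagging)
    return CacheDecision::Miss; // old behavior is kept with tagging on
  // Assumption: the excess is whatever exceeds the threshold.
  const std::size_t toRelease = unusedBytes - maxUnusedBytes;
  return toRelease <= kMaxReleaseBytes ? CacheDecision::HitAfterRelease
                                       : CacheDecision::Miss;
}
```

With a 4 KB page and a threshold of four pages, a chunk with 32 KB unused would be accepted after releasing 16 KB, while one more unused byte would turn it into a miss.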
Inverse mapping needs to be updated for the result that was remapped (it was previously only updated halfway).
Fix list formatting, improve the wording, and fix the description when both options (note: prefer "option" to "flag" when arguments are supported) are specified. Pull Request: llvm#104886
D57497 added -msmall-data-limit= as an alias for -G and defaulted it to 8 for -fno-pic/-fpie.

The behavior is already different from GCC in a few ways:
* GCC doesn't accept -G.
* GCC -fpie doesn't seem to use -msmall-data-limit=.
* GCC emits .srodata.cst* that we don't use (llvm#82214). Writable contents caused confusion (https://bugs.chromium.org/p/llvm/issues/detail?id=61)

In addition,
* claiming `-shared` means we don't get a desired `-Wunused-command-line-argument` for `clang --target=riscv64-linux-gnu -fpic -c -shared a.c`.
* -mcmodel=large doesn't work for RISC-V yet, so the special case is strange.
* It's quite unusual to emit a warning when an option (unrelated to relocation model) is used with -fpic.
* We don't want future configurations (Android) to continue adding customization to `SetRISCVSmallDataLimit`.

I believe the extra code just doesn't pull its weight and should be cleaned up. This patch also changes the default to 0. GP relaxation users are encouraged to specify these customization options explicitly.

Pull Request: llvm#83093
A quick follow-up fix for llvm#99403

Buildbot [reported](https://lab.llvm.org/buildbot/#/builders/168/builds/2330) an error:

```
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ADT/FunctionExtrasTest.cpp:320:8: error: variable 'ptr' is uninitialized when used here [-Werror,-Wuninitialized]
  320 |       [ptr](void *self) {
      |        ^~~
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ADT/FunctionExtrasTest.cpp:318:12: note: initialize the variable 'ptr' to silence this warning
  318 |   void *ptr;
      |            ^
      |             = nullptr
1 error generated.
```

So this PR does exactly what's suggested.
…dialects on LLVM Dialect and LLVM Core in CMake build (llvm#104832)" This reverts commit 43b5085 since it caused the build to break with BUILD_SHARED_LIBS=ON.
I started out by adding a new pointer type for blocks, and I was fully prepared to compile their AST to bytecode and later call them. ... then I found out that the current interpreter doesn't support calling blocks at all. So we reuse `Function` to support sources other than `FunctionDecl`s and classify `BlockPointerType` as `PT_FnPtr`.
…s. (llvm#104876) This was broken back in llvm#78658 when we transitioned away from archive indexes to parsing lazy object files. Fixes: llvm#94077 Fixes: emscripten-core/emscripten#22008
…able files. (llvm#102978) This change is enough to allow `--strip-debug` to work on object files, without breaking the relocation information or symbol table. A more complete version of this change would instead reconstruct the symbol table and relocation sections, but that is a much larger change. Bug: llvm#102002
We used integer comparisons instead of floating point comparisons resulting in very odd behavior.
We would crash on sufficiently old NV hardware (Volta or so) due to incorrectly marking certain operations legal.
mgehre-amd changed the title from "[AutoBump] Merge with fixes of 2d50029f (Aug 15) (5)" to "[AutoBump] Merge with fixes of 2d50029f (Aug 15, needs torch-mlir bump) (5)" on Sep 20, 2024
mgehre-amd force-pushed the bump_to_2d50029f branch 2 times, most recently from 141a45a to d840dbb (September 20, 2024 09:03)
mgehre-amd force-pushed the bump_to_2d50029f branch from d840dbb to 6f28929 (September 20, 2024 09:36)
cferry-AMD approved these changes on Sep 30, 2024
Xilinx/torch-mlir#359