[AutoBump] Merge with fixes of 2d50029f (Aug 15, needs torch-mlir bump) (5) #358

mgehre-amd · 2024-09-20T08:07:56Z

This changes the bare-metal driver logic such that it _always_ tries multilib.yaml if it exists, and it falls back to the hardwired/default RISC-V multilib selection only if a multilib.yaml doesn't exist. In contrast, the current behavior is that RISC-V can never use multilib.yaml, but other targets will try it if it exists. The flags `-march=` and `-mabi=` are exposed for multilib.yaml to match on. There is no attempt to help YAML file creators to duplicate the existing hard-wired multilib reuse logic -- they will have to implement it using `Mappings`. This should be backwards-compatible with existing sysroots, as multilib.yaml was previously never used for RISC-V, and the behavior doesn't change after this PR if the file doesn't exist.

Defined AMDGPU DPP operation in mlir to represent semantics. Introduced a new enumeration attribute for different permutations and allowed for different types of arguments. Implemented constant attribute handling for ROCDL::DPPMovOp operation. The operation now correctly accepts constant attributes for dppCtrl, rowMask, bankMask, boundCtrl, and passes them to the corresponding LLVM intrinsic.

… a few places. (llvm#104555) PR llvm#80309 proposes to have users of APInt's uint64_t constructor opt-in to implicit truncation. Currently, that patch requires SelectionDAG::getConstant to opt-in. This patch adds getSignedConstant so we can start fixing some of the cases that require implicit truncation.

This PR is a continuation of the [previous one](llvm#101478). As a result, the `emitc::SwitchOp` op was developed, inspired by `scf::IndexSwitchOp`.

Main points of the PR:

- Added the `emitc::SwitchOp` op to the EmitC dialect + CppEmitter
- Corresponding tests were added
- Conversion from the SCF dialect to the EmitC dialect for the op

CodeGenIntrinsic changes:

- Use `const` Record pointers, and `StringRef` when possible.
- Default initialize several fields with their definition instead of in the constructor.
- Simplify various string checks in the constructor using StringRef starts_with()/ends_with() functions.
- Eliminate first argument to `setDefaultProperties` and use `TheDef` class member instead.

IntrinsicEmitter changes:

- Emit `namespace llvm::Intrinsic` instead of nested namespaces.
- End generated comments with a .
- Use range based for loops, and early continue within loops.
- Emit `static constexpr` instead of `static const` for arrays.
- Change `compareFnAttributes` to use std::tie() to compare intrinsic attributes and return a default value when all attributes are equal.

STLExtras:

- Add std::replace wrapper which takes a range.

…eGen/bit-int-ubsan.c (llvm#104607) Add missing -triple x86_64-pc-linux-gnu line into RUN line, which should be here. --------- Co-authored-by: Eänolituri Lómitaurë <vladislav.aranov@ericsson.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com> Co-authored-by: Paul Kirth <paulkirth@google.com> Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>

Passing to the `PGOInstrumentationGen` pass whether it needs to produce contextual profiling instrumentation as a flag, in the process restructuring a bit the places that need to be aware of that (some were unnecessarily in `PGOInstrumentationUse`)

llvm#100367) This is split off from llvm#71764, and moves only the vmv.v.v part of performCombineVMergeAndVOps to work on MachineInstrs. In retrospect trying to handle PseudoVMV_V_V and PseudoVMERGE_VVM in the same function makes the code quite hard to read, so this just does it in a separate peephole. This turns out to be simpler since for PseudoVMV_V_V we don't need to convert the Src instruction to a masked variant, and we don't need to create a fake all ones mask.

This patch implements sandboxir::AtomicRMWInst mirroring llvm::AtomicRMWInst.

Old Headergen needed extra build rules to ensure that it worked in runtimes mode. This patch disables those checks if new headergen is enabled. Also some new headers were not being properly built with new headergen, and that's also fixed.

Similar to llvm#104481. Replace more "Utility" dependencies with "UtilityHeaders" to avoid cyclic dependency when building on macos.

An HLSL function has internal linkage by default unless it is:

1. shader entry point function
2. marked with the `export` keyword (llvm#92812)
3. patch constant function (not implemented yet)

This PR adds a link-time pass `DXILFinalizeLinkage` that updates the linkage of functions to make sure only shader entry points and exported functions are visible from the module (have _program linkage_). All other functions will be updated to have internal linkage.

Related spec update: microsoft/hlsl-specs#295

Fixes llvm#92071
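The linkage rule described above can be sketched as a small standalone model. This is only an illustration in Python, not the `DXILFinalizeLinkage` pass itself: the `Function` record and its field names are hypothetical, and patch constant functions are left out because the description marks them as not implemented yet.

```python
from dataclasses import dataclass


@dataclass
class Function:
    # Hypothetical record; not an LLVM or DXIL data structure.
    name: str
    is_entry_point: bool = False
    is_exported: bool = False
    linkage: str = "internal"


def finalize_linkage(functions):
    """Give program linkage only to entry points and exported functions."""
    for fn in functions:
        if fn.is_entry_point or fn.is_exported:
            fn.linkage = "program"
        else:
            fn.linkage = "internal"
    return functions


module = finalize_linkage([
    Function("main", is_entry_point=True),
    Function("helper", linkage="program"),
    Function("lib_fn", is_exported=True),
])
visible = [fn.name for fn in module if fn.linkage == "program"]
```

In this model `helper` starts out visible and is demoted to internal linkage, which is the effect the description attributes to the pass.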

This reverts commit e592c2d. We can finally reland the PR since the issue that caused the PR to be reverted has been resolved in llvm#104051.

This allows annotating fields of C/C++ structs using API Notes. Previously API Notes supported Objective-C properties, but not fields. rdar://131548377

…vm#102986) When a test case inside of a gtest suite fails, we report a failure which causes the entire `ninja check-lldb` invocation to fail, even if the outer test case is marked as XFAIL - each test case result is reported as its own lit test run. This PR updates lit so it checks whether each test case's parent test suite is XFAIL before setting the status to FAIL. This is especially problematic because the failing tests can't manually be marked as XFAIL, due to llvm#102264.

Fixes llvm#102265

### Repro instructions

1. Modify any gtest test case to generate a failure.
2. Mark the outer lit test with XFAIL using either `--xfail-tests` flag or `LIT_XFAIL` env var.
3. Run the tests.
4. Observe the lit test is XFAIL as expected, but the failed child test cases show up as separate failures.

Co-authored-by: kendal <kendal@thebrowser.company>
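The status decision described in this commit message can be modelled as follows. This is a minimal sketch and does not use lit's real classes: `Suite`, `CaseResult`, and `report_status` are hypothetical names, and the handling of a passing case inside an XFAIL suite is not covered by the description, so the sketch simply reports it as `PASS`.

```python
from dataclasses import dataclass


@dataclass
class Suite:
    # Hypothetical stand-in for the outer lit test wrapping a gtest binary.
    name: str
    xfail: bool = False


@dataclass
class CaseResult:
    # Hypothetical stand-in for one gtest test case result.
    suite: Suite
    name: str
    passed: bool


def report_status(result):
    """Check the parent suite's XFAIL marking before reporting FAIL."""
    if result.passed:
        return "PASS"
    return "XFAIL" if result.suite.xfail else "FAIL"


expected_to_fail = Suite("HostTests", xfail=True)
normal = Suite("UtilityTests")
statuses = [
    report_status(CaseResult(expected_to_fail, "Case.A", passed=False)),
    report_status(CaseResult(normal, "Case.B", passed=False)),
    report_status(CaseResult(normal, "Case.C", passed=True)),
]
```

Only the failing case whose parent suite is not marked XFAIL contributes a hard failure here.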

…ause (llvm#102717)

…llvm#104519) This patch makes `-objc_relative_method_lists` the default on macOS 10.16+/iOS 14+. Manual override still works if a command-line argument is provided. To test this change, many explicit arguments are removed from the test files. Some explicit `-objc_no_objc_relative_method_lists` are also added for tests that don't support this yet. This commit tries to revive llvm#101360, which exposed a bug that broke CI. llvm#104081 has fixed that bug.

) Reverts llvm#104522 Caused crashes on Fuchsia

This feature provided CPM_IOACC_CTL_EL3, a lone system register that has been carried over since the original ARM64 implementation, where it was the only processor-specific register in a long list of architectural sysregs. We don't need it here. It's been used as a generic processor-specific sysreg in tests, but the functionality they target is now better covered in other more exhaustive tests.

This analysis can't be used with other analyses if this isn't set. Pull Request: llvm#104244

…or buffers" (llvm#104517) Some build configs allow `llvm_unreachable` in a constexpr context, but not all, so these functions that map a fully covered enum to a string can't be constexpr. This version fixes that by dropping constexpr from those functions. This reverts commit fcc318f, reapplying 28d577e. Original message follows: This implements the DXILResourceAnalysis pass for `dx.TypedBuffer` and `dx.RawBuffer` types. This should be sufficient to lower `dx.handle.fromBinding` for this set of types, but it leaves a number of TODOs around for other resource types. This also includes a straightforward `print` method in `ResourceInfo` to make the analysis testable. This is deliberately different than the printer in `lib/Target/DirectX/DXILResource.cpp`, which attempts to print bindings in a format compatible with the comments `dxc` prints. We will eventually want to make that functionality driven by this analysis pass, but it isn't sufficient for testing so we need both.

…es to be treated as loads (llvm#99999) This change avoids deleting `!willReturn` intrinsics for which the return value is unused when building the SDAG. Currently, calls to read-only intrinsics not marked with `IntrWillReturn` cannot be deleted at the LLVM IR level but may be deleted when building the SDAG. These calls are unsafe to remove from the IR because the functions are `!willReturn` and should also be unsafe to remove from the SDAG for the same reason. This change aligns the behavior of the SDAG to that of LLVM IR. This change also requires that intrinsics not have the `Throws` attribute to be treated as loads for the same reason.

Summary: This used an old name I forgot to fix. The linter didn't catch it because it was behind an `ifdef`, and I forgot to carry the fix from the branch I tested over to the one I landed.

Some new headers were not being properly built with new headergen, since they were using the old "add_gen_header" instead of the new "add_header_macro". This patch fixes the issue.

Requested on llvm#95394

…4613) Flang switches to cc1 when we use `-x cuda`. Make sure we can use fc1 with CUDA Fortran input. The current pipeline will fail at the MLIR level for the moment. llvm#104483

This adds MachO support for emission of authenticated pointer relocations. We already support AArch64AuthMCExpr, to represent assembly expressions such as:

    .quad <symbol>@AUTH(<key>, <discriminator> [, addr])

For example:

    .quad _g3@AUTH(ib, 1234, addr)

These @AUTH expressions lower to a new kind of MachO relocation: ARM64_RELOC_AUTHENTICATED_POINTER (11)

The relocation points to the referenced symbol. The other data, describing the signing scheme and original addend (only 32 bits instead of 64), is encoded into the addend (in the relocated location):

| 63 | 62 | 61-51 | 50-49 | 48 | 47-32 | 31-0 |
| -- | -- | ----- | ----- | -- | ----- | ---- |
| 1 | 0 | 0 | key | addr | discriminator | addend |
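The bit layout above can be exercised with a short packing sketch. It is written in Python purely for illustration and is not the MC layer code. Two things are assumptions rather than statements from the commit message: the numeric key values (`ia`=0, `ib`=1, `da`=2, `db`=3) and the treatment of the 32-bit addend as an unsigned field.

```python
# Assumed key numbering; the commit message only names the 2-bit field "key".
KEYS = {"ia": 0, "ib": 1, "da": 2, "db": 3}


def encode_auth_addend(key, discriminator, addr_diversity, addend):
    """Pack the fields into the 64-bit value stored at the relocated location."""
    if not 0 <= discriminator <= 0xFFFF:
        raise ValueError("discriminator must fit in 16 bits")
    if not 0 <= addend <= 0xFFFFFFFF:
        raise ValueError("addend must fit in 32 bits")
    value = 1 << 63                      # bit 63 is set, bits 62-51 stay zero
    value |= KEYS[key] << 49             # bits 50-49: key
    value |= int(addr_diversity) << 48   # bit 48: addr
    value |= discriminator << 32         # bits 47-32: discriminator
    value |= addend                      # bits 31-0: addend
    return value


def decode_auth_addend(value):
    """Unpack the same fields again (inverse of encode_auth_addend)."""
    names = {number: name for name, number in KEYS.items()}
    return (
        names[(value >> 49) & 0x3],
        (value >> 32) & 0xFFFF,
        bool((value >> 48) & 0x1),
        value & 0xFFFFFFFF,
    )


# Mirrors the example expression: .quad _g3@AUTH(ib, 1234, addr)
packed = encode_auth_addend("ib", 1234, True, 0)
```

Under these assumptions the example expression packs to `0x800304D200000000`, and decoding returns the original four fields.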

…llvm#104632) Reverts llvm#104613

…lvm#94059) This patch prevents thread-local constants from being merged within PPCMergeStringPool.cpp. The PPCMergeStringPool pass primarily merges non-thread-local constants together, and thread-local constants should not be mixed together with other (non-thread-local) constants. In the event that thread-local and other non-thread-local constants are pooled together, the llvm.threadlocal.address intrinsic can fail as it expects its argument to be a thread-local global value, but the merged string structure created by the PPCMergeStringPool pass is not thread-local as a whole.

…lvm#104803)

There are 3 ways in which `ParseAST::build` can fail and return `std::nullopt`. 2 of the ways we emit the error message using `elog`, but for the 3rd way, `log` is used. We should emit all 3 of these reasons with `elog`.

…lvm#104824) `SBCommand::AddCommand()` requires `SBCommandPluginInterface` to be heap based because it will be stored inside `std::shared_ptr<lldb::SBCommandPluginInterface>` later for reference counting. But lldb-dap passes `StartDebuggingRequestHandler/ReplModeRequestHandler` static function pointer to it which will cause corruption later during destruction. This PR fixes this issue by making these two handler heap based. Co-authored-by: jeffreytan81 <jeffreytan@fb.com>

This reverts commit d4f6fcf. Relanding with fixed obj_offset calculation (precedence of operations was wrong), and the suggestion in llvm#95308 (comment)

- When an unterminated open { is detected in the format string, instead of asserting and ignoring the error, replace that string with another to indicate the error, and remove the assert as well.
- This will make the error evident in both assert and release builds and make observing the error more convenient (as several uses of this function are in TableGen and it is often built in release mode even in debug builds)
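The behavior in the first bullet can be sketched with a toy format-string scanner. This is a simplified Python model, not the LLVM `formatv` implementation: `split_format` and `ERROR_MARKER` are hypothetical names, the marker text is invented, and `{{` escapes are deliberately not handled.

```python
# Invented marker text; the real replacement string is not given above.
ERROR_MARKER = "<unterminated brace sequence>"


def split_format(fmt):
    """Split a format string into ('lit', text) and ('field', text) pieces.

    An unterminated '{' does not raise or assert. The rest of the string is
    replaced with a literal marker so the mistake shows up in the output.
    """
    pieces = []
    i = 0
    while i < len(fmt):
        open_at = fmt.find("{", i)
        if open_at == -1:
            pieces.append(("lit", fmt[i:]))
            break
        if open_at > i:
            pieces.append(("lit", fmt[i:open_at]))
        close_at = fmt.find("}", open_at)
        if close_at == -1:
            pieces.append(("lit", ERROR_MARKER))
            break
        pieces.append(("field", fmt[open_at + 1:close_at]))
        i = close_at + 1
    return pieces


good = split_format("a {0} b")
bad = split_format("x {0")
```

Because the marker is emitted as ordinary literal text, it appears regardless of whether assertions are compiled in, which is the point the second bullet makes.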

…KnownNonEqual`; NFC Downstream hit this assert. Since it doesn't really make any difference, just change the code to return false.

Error: CommandLine Error: Option 'attributor-manifest-internal' registered more than once During the standalone debug build of offload the above error is seen at app runtime when using a prebuilt llvm with LLVM_LINK_LLVM_DYLIB=ON. This is caused by linking both libLLVM.so and various archives that are found via llvm_map_components_to_libnames for jit support.

… on LLVM Dialect and LLVM Core in CMake build (llvm#104832) This change removes dependencies declared as either 'LINK_LIBS' or 'LINK_COMPONENTS' across several MLIR libraries. The removed dependencies appear to be incorrect and may have been required in older versions of the project. These dependencies cause many high level dialects to have transitive dependence on the LLVM dialect and the LLVM 'Core' library ('llvm/lib/IR'). Note that if using the 'Ninja' CMake generator, one can inspect the dependencies (including all transitive libraries) of any given MLIR target by using the command `ninja -C <build dir> -t browse` and navigating to the library of interest in a web browser.

) Previously the secondary cache retrieval algorithm would not allow retrievals of memory chunks where the number of unused bytes would be greater than `MaxUnusedCachePages * PageSize` bytes. This meant that even if a memory chunk satisfied the requirements of the optimal fit algorithm, it may not be returned. This remains true if memory tagging is enabled. However, if memory tagging is disabled, a new heuristic has been put in place. Specifically, if a memory chunk is a non-optimal fit, the cache retrieval algorithm will attempt to release the excess memory to force a cache hit while keeping RSS down. In the event that a memory chunk is a non-optimal fit, the retrieval algorithm will release excess memory as long as the amount of memory to be released is less than or equal to 16 KB. If the amount of memory to be released exceeds 16 KB, the retrieval algorithm will not consider that cached memory chunk valid for retrieval.
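The decision rule described here can be written out as a small model. This is a Python sketch, not scudo code. The page size and the value of `MaxUnusedCachePages` are assumed for illustration, `retrieval_decision` is a hypothetical name, and the sketch assumes that "the amount of memory to be released" means the unused bytes in excess of the `MaxUnusedCachePages * PageSize` limit.

```python
PAGE_SIZE = 4096               # assumed page size
MAX_UNUSED_CACHE_PAGES = 4     # assumed value, for illustration only
MAX_RELEASE_BYTES = 16 * 1024  # the 16 KB bound from the description


def retrieval_decision(unused_bytes, memory_tagging):
    """Return (usable, bytes_to_release) for one cached chunk."""
    limit = MAX_UNUSED_CACHE_PAGES * PAGE_SIZE
    if unused_bytes <= limit:
        return True, 0            # within the limit, nothing to release
    if memory_tagging:
        return False, 0           # old behavior is kept when tagging is on
    excess = unused_bytes - limit
    if excess <= MAX_RELEASE_BYTES:
        return True, excess       # release the excess to force a cache hit
    return False, 0               # too much to release, skip this chunk


limit = MAX_UNUSED_CACHE_PAGES * PAGE_SIZE
within_limit = retrieval_decision(limit, memory_tagging=False)
small_excess = retrieval_decision(limit + 8192, memory_tagging=False)
tagged = retrieval_decision(limit + 8192, memory_tagging=True)
large_excess = retrieval_decision(limit + MAX_RELEASE_BYTES + 1, memory_tagging=False)
```

The four calls cover each branch: a chunk within the limit, a non-optimal fit that is accepted after releasing 8192 bytes, the same chunk rejected under memory tagging, and a chunk rejected because more than 16 KB would have to be released.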

Inverse mapping needs to be updated for the result that was remapped (it was previously only updated halfway).

Fix list formatting, improve the wording, and fix the description when both options (note: prefer "option" to "flag" when arguments are supported) are specified. Pull Request: llvm#104886

D57497 added -msmall-data-limit= as an alias for -G and defaulted it to 8 for -fno-pic/-fpie. The behavior is already different from GCC in a few ways:

* GCC doesn't accept -G.
* GCC -fpie doesn't seem to use -msmall-data-limit=.
* GCC emits .srodata.cst* that we don't use (llvm#82214). Writable contents caused confusion (https://bugs.chromium.org/p/llvm/issues/detail?id=61)

In addition,

* claiming `-shared` means we don't get a desired `-Wunused-command-line-argument` for `clang --target=riscv64-linux-gnu -fpic -c -shared a.c`.
* -mcmodel=large doesn't work for RISC-V yet, so the special case is strange.
* It's quite unusual to emit a warning when an option (unrelated to relocation model) is used with -fpic.
* We don't want future configurations (Android) to continue adding customization to `SetRISCVSmallDataLimit`.

I believe the extra code just doesn't pull its weight and should be cleaned up. This patch also changes the default to 0. GP relaxation users are encouraged to specify these customization options explicitly.

Pull Request: llvm#83093

A quick follow-up fix for llvm#99403

Buildbot [reported](https://lab.llvm.org/buildbot/#/builders/168/builds/2330) an error:

```
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ADT/FunctionExtrasTest.cpp:320:8: error: variable 'ptr' is uninitialized when used here [-Werror,-Wuninitialized]
  320 |       [ptr](void *self) {
      |        ^~~
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/ADT/FunctionExtrasTest.cpp:318:12: note: initialize the variable 'ptr' to silence this warning
  318 |   void *ptr;
      |            ^
      |             = nullptr
1 error generated.
```

So that PR does exactly what's suggested.

…dialects on LLVM Dialect and LLVM Core in CMake build (llvm#104832)" This reverts commit 43b5085 since it caused the build to break with BUILD_SHARED_LIBS=ON.

I started out by adding a new pointer type for blocks, and I was fully prepared to compile their AST to bytecode and later call them. ... then I found out that the current interpreter doesn't support calling blocks at all. So we reuse `Function` to support sources other than `FunctionDecl`s and classify `BlockPointerType` as `PT_FnPtr`.

…s. (llvm#104876) This was broken back in llvm#78658 when we transitioned away from archive indexes to parsing lazy object files. Fixes: llvm#94077 Fixes: emscripten-core/emscripten#22008

…able files. (llvm#102978) This change is enough to allow `--strip-debug` to work on object files, without breaking the relocation information or symbol table. A more complete version of this change would instead reconstruct the symbol table and relocation sections, but that is a much larger change. Bug: llvm#102002

We used integer comparisons instead of floating point comparisons resulting in very odd behavior.
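To show in general why integer comparisons on floating-point data give odd results, the sketch below compares two floats directly and then compares their raw IEEE 754 single-precision bit patterns as signed integers. This is a standalone Python illustration of the hazard, not a reconstruction of the SelectionDAG lowering code, and `float_bits_as_int32` is a hypothetical helper.

```python
import struct


def float_bits_as_int32(x):
    """Reinterpret the IEEE 754 single-precision bits of x as a signed int."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


a, b = -2.0, -1.0

# Floating-point comparison: -2.0 is less than -1.0.
float_less = a < b

# Integer comparison of the same bit patterns reverses the order for
# negative values, because the float format is sign-magnitude.
int_less = float_bits_as_int32(a) < float_bits_as_int32(b)
```

Here `float_less` is `True` while `int_less` is `False`, so a minimum or maximum chosen with the integer ordering would pick the wrong operand for negative inputs.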

We would crash on sufficiently old NV hardware (Volta or so) due to incorrectly marking certain operations legal.

…lvm#104894) Reverts llvm#104807

ArcaneNibble and others added 30 commits August 16, 2024 17:14

[SandboxIR] Implement AtomicRMWInst (llvm#104529)

ab5102d

This patch implements sandboxir::AtomicRMWInst mirroring llvm::AtomicRMWInst.

[bazel] Fix cyclic dependencies for macos (llvm#104528)

fbef911

Similar to llvm#104481. Replace more "Utility" dependencies with "UtilityHeaders" to avoid cyclic dependency when building on macos.

[Attributor] Enable AAAddressSpace in OpenMPOpt (llvm#104363)

907c7eb

This reverts commit e592c2d. We can finally reland the PR since the issue that caused the PR to be reverted has been resolved in llvm#104051.

[APINotes] Support fields of C/C++ structs

b816977

This allows annotating fields of C/C++ structs using API Notes. Previously API Notes supported Objective-C properties, but not fields. rdar://131548377

[Clang][OMPX] Add the code generation for multi-dim thread_limit cl…

0551926

…ause (llvm#102717)

Revert "[libc] Disable old headergen checks unless enabled" (llvm#104627

9791986

) Reverts llvm#104522 Caused crashes on Fuchsia

[DirectX] Add missing Analysis usage to DXILResourceMDWrapper

f999b32

This analysis can't be used with other analyses if this isn't set. Pull Request: llvm#104244

[SelectionDAGISel] Use getSignedConstant for OPC_EmitInteger.

535b209

[libcxx][fix] Rename incorrect filename variable

3c603f8

Summary: This used an old name I forgot to fix. The linter didn't catch it because it was behind an `ifdef`, and I forgot to carry the fix from the branch I tested over to the one I landed.

[libc] Fix generated header definitions in cmake (llvm#104628)

8454610

Some new headers were not being properly built with new headergen, since they were using the old "add_gen_header" instead of the new "add_header_macro". This patch fixes the issue.

AMDGPU: Rename type helper functions in atomic handling

ef56061

Requested on llvm#95394

[flang][cuda][driver] Make sure flang does not switch to cc1 (llvm#10…

e6b9f12

…4613) Flang switches to cc1 when we use `-x cuda`. Make sure we can use fc1 with CUDA Fortran input. The current pipeline will fail at the MLIR level for the moment. llvm#104483

Revert "[flang][cuda][driver] Make sure flang does not switch to cc1" (…

e315ba1

…llvm#104632) Reverts llvm#104613

lntue and others added 23 commits August 19, 2024 17:58

[libc][NFC] Add sollya script to compute worst case range reduction. (l…

54c6b93

…lvm#104803)

Emit BeginSourceFile failure with elog. (llvm#104845)

2405253

There are 3 ways in which `ParseAST::build` can fail and return `std::nullopt`. 2 of the ways we emit the error message using `elog`, but for the 3rd way, `log` is used. We should emit all 3 of these reasons with `elog`.

Reapply "[HWASan] symbolize stack overflows" (llvm#102951) (llvm#104036)

c478139

This reverts commit d4f6fcf. Relanding with fixed obj_offset calculation (precedence of operations was wrong), and the suggestion in llvm#95308 (comment)

[compiler-rt][fuzzer] implements SetThreadName for fuchsia. (llvm#99953)

31cc4cc

[AMDGPU] Add VOPD combine dependency tests. NFC. (llvm#104841)

5fcd059

[ValueTracking] Handle incompatible types instead of asserting in `is…

42ce628

…KnownNonEqual`; NFC Downstream hit this assert. Since it doesn't really make any difference, just change the code to return false.

[MLIR][Transforms] Fix dialect conversion inverse mapping (llvm#104648)

baa6627

Inverse mapping needs to be updated for the result that was remapped (it was previously only updated halfway).

[docs] Revise the doc for __builtin_allow_runtime_check

b5f3e28

Fix list formatting, improve the wording, and fix the description when both options (note: prefer "option" to "flag" when arguments are supported) are specified. Pull Request: llvm#104886

Revert "[mlir] NFC: fix dependence of (Tensor|Linalg|MemRef|Complex) …

06fd808

…dialects on LLVM Dialect and LLVM Core in CMake build (llvm#104832)" This reverts commit 43b5085 since it caused the build to break with BUILD_SHARED_LIBS=ON.

[lld][WebAssembly] Ignore local symbols when parsing lazy object file…

5403123

…s. (llvm#104876) This was broken back in llvm#78658 when we transitioned away from archive indexes to parsing lazy object files. Fixes: llvm#94077 Fixes: emscripten-core/emscripten#22008

[SelectionDAG] Fix lowering of IEEE 754 2019 minimum/maximum

ea1f05e

We used integer comparisons instead of floating point comparisons resulting in very odd behavior.

[NVPTX] Fix bugs involving maximum/minimum and bf16

a9ce181

We would crash on sufficiently old NV hardware (Volta or so) due to incorrectly marking certain operations legal.

Revert "[scudo] Add partial chunk heuristic to retrieval algorithm." (l…

f9031f0

…lvm#104894) Reverts llvm#104807

[AutoBump] Merge with fixes of 2d50029 (Aug 15)

9b53030

mgehre-amd changed the title ~~[AutoBump] Merge with fixes of 2d50029f (Aug 15) (5)~~ [AutoBump] Merge with fixes of 2d50029f (Aug 15, needs torch-mlir bump) (5) Sep 20, 2024

mgehre-amd force-pushed the bump_to_2d50029f branch 2 times, most recently from 141a45a to d840dbb Compare September 20, 2024 09:03

Merge commit 'f9031f00f2c9' into bump_to_2d50029f

6f28929

mgehre-amd force-pushed the bump_to_2d50029f branch from d840dbb to 6f28929 Compare September 20, 2024 09:36

mgehre-amd mentioned this pull request Sep 20, 2024

[AutoBump] Merge with fixes of 98e08023 (Aug 28, needs LLVM bump) (36) Xilinx/torch-mlir#359

Open

cferry-AMD approved these changes Sep 30, 2024
