forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoBump] Merge with 83891777 (May 16) (48) #307
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
As mentioned in llvm#68882 and https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699 Gep arithmetic isn't consistent with different types. GVNSink didn't realize this and sank all geps as long as their operands can be wired via PHIs in a post-dominator. Fixes: llvm#85333 Reapply: llvm#88440 after fixing the non-determinism issues in llvm#90995
Adds the LLVM vector.deinterleave2 intrinsic to the MLIR LLVM dialect. The deinterleave2 intrinsic takes a vector and returns two vectors with the first having even elements and the second with odd elements from the input vector. The inverse of vector.interleave2.
MCOperand has a constructor that permits a nullptr MCInst, and BOLT makes use of that. Adjust MCOperand's dumper to permit such use.
Remove 'Valid' local boolean that has a single use, and return directly instead.
…#91933) Given `foo...[idx]` if idx is value dependent, the expression is type dependent. Fixes llvm#91885 Fixes llvm#91884
Remove excess parentheses and use `boolean ? true-case : false-case` idiom.
When writing the test for this I seemingly forgot to put 'CHECK' on the lines, so I didn't notice that I was printing the identifiers as pointers rather than their names. This patch corrects the tests and the print behavior.
…0448) This patch rewrites the ArmSME tile allocator to use liveness information to make better tile allocation decisions and improve the correctness of the ArmSME dialect. This algorithm used here is a linear scan over live ranges, where live ranges are assigned to tiles as they appear in the program (chronologically). Live ranges release their assigned tile ID when the current program point is passed their end. This is a greedy algorithm (which is mainly to keep the implementation relatively straightforward), and because it seems to be sufficient for most kernels (e.g. matmuls) that use ArmSME. The general steps of this are roughly from https://link.springer.com/content/pdf/10.1007/3-540-45937-5_17.pdf, though there have been a few simplifications and assumptions made for our use case. Hopefully, the only changes needed for a user of the ArmSME dialect is that: - `-allocate-arm-sme-tiles` will no longer be a standalone pass - `-test-arm-sme-tile-allocation` is only for unit tests - `-convert-arm-sme-to-llvm` must happen after `-convert-scf-to-cf` - SME tile allocation is now part of the LLVM conversion By integrating this into the `ArmSME -> LLVM` conversion we can allow high-level (value-based) ArmSME operations to be side-effect-free, as we can guarantee nothing will rearrange ArmSME operations before we emit intrinsics (which could invalidate the tile allocation). The hope is for ArmSME operations to have no hidden state/side effects and allow easily lowering dialects such as `vector` and `arith` to SME, without making assumptions about how the input IR looks, as the semantics of the operations will be the same. That is no (new) side effects and the IR follows the rules of SSA (a value will never change). The aim is correctness, so we have a base for working on optimizations.
A buildbot with expensive checks enabled flagged some problems with my patch. There was also a post-commit nit on the langref changes.
…m#92004) This makes the `vc-rev-enabled` feature unsupported if we fail to retrieve the git revision for any reason, such as if git is not installed.
…straints (llvm#92104) Clangd uses it to determine whether the argument is within the selection range. Fixes clangd/clangd#2033
…filename and location info (llvm#92050)
PR llvm#80680 added bits in the codegen to lazily add convergence intrinsics when required. This logic relied on the LoopStack. The issue is when parsing the condition, the loopstack doesn't yet reflect the correct values, as expected since we are not yet in the loop. However, convergence tokens should sometimes already be available. The solution which seemed the simplest is to greedily generate the tokens when we generate SPIR-V. Fixes llvm#88144 --------- Signed-off-by: Nathan Gauër <brioche@google.com>
Now that we've got (minus some issues around datatypes and invariant loads) working lowerings for address space 7, update the table in the AMDGPU usage guide to properly indicate the nature of these address spaces.
…llvm#92067) cm.push can't save X26 without also saving X27. This removes two other checks for this case. This causes CFI to be emitted since X27 is now explicitly a callee saved register. The affected tests use inline assembly to clobber X26 rather than the whole range of s0-s10.
Allow mixing objects with/without signed class ro data and category class properties as long as it happens before we register the metadata. These combinations are a warning in ld, not a hard error. The only case that is ABI-breaking is if we already registered with the feature enabled but later try to load an object that doesn't support it. rdar://127336061
…inations of 32-bit integers. NFC
tryToCreateDiffCheck has one caller, and exits early if CanUseDiffCheck is false. Hence, we can get/set CanUseDiffCheck in the caller to avoid wastefully calling tryToCreateDiffCheck. This patch is an NFC simplification of program logic.
The target combine is no longer required because InstCombine will transform the DIV by a power of 2 into a multiply, so in practice this case will never trigger. Additionally, the generated code would have been incorrect for streaming(-compatible) functions, because it assumed NEON was available.
…lvm#92086) self.wait_for_running_event(process) is always called after self.runCmd("continue"). It is strange to expect eStateConnected here. This test failed in case of a remote target. The correct state is eStateRunning. Removed incorrect checking.
…#90500) Currently, clang postpones all semantic analysis of unary operators with operands of pointer/pointer to member/array/function type until instantiation whenever that type is dependent (e.g. `T*` where `T` is a type template parameter). Consequently, the uninstantiated AST nodes all have the type `ASTContext::DependentTy` (which, for the purposes of llvm#90152, is undesirable as that type may be the current instantiation! (e.g. `*this`)) This patch moves the point at which we perform semantic analysis for such expression to be prior to instantiation.
llvm#91137 reverted in llvm#92001 A build error fix added in 28d5ece --------- Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Most diagnostics obey https://llvm.org/docs/CodingStandards.html#error-and-warning-messages but some diverge. Fix them. While here, adjust some diagnostics. Pull Request: llvm#92024
Currently, only those global variables which are at compile unit scope are added to the 'globals' list of the DICompileUnit. This does not work for languages which support modules (e.g. Fortran) where hierarchy can be variable -> module -> compile unit. To fix this, if a variable scope points to a module, we walk one level up and see if module is in the compile unit scope. This was initially part of llvm#91582 which adds debug information for Fortran module variables. @kiranchandramohan pointed out that MLIR changes should go in separate PRs.
Use `os.devnull` instead of `/dev/null`.
…#91579) This patch adds nsw flag to the increment of do-variables when a new option is enabled. NOTE 11.10 in the Fortran 2018 standard says they never overflow. See also the discussion in llvm#74709 and the following discourse post. https://discourse.llvm.org/t/rfc-add-nsw-flags-to-arithmetic-integer-operations-using-the-option-fno-wrapv/77584/5
The error checking is only for .macro directives. Move it to the .macro parser to remove one parameter.
…m#90578) This patch add support of intrinsics GNU extension ETIME llvm#84205. Some usage info and example has been added to `flang/docs/Intrinsics.md`. The patch contains both the lowering and the runtime code and works on both Windows and Linux. | System | Implmentation | |-----------|--------------------| | Windows| GetProcessTimes | | Linux |times |
…andled by LegalizeVectorOps. (llvm#92332) The expand code is present, but we were missing the type query code so the nodes would be ignored until LegalizeDAG.
…erleavedMemoryOpCost. (llvm#91825) isLegalInterleavedAccessType expects the subvector type, but getInterleavedMemoryOpCost is called with the full vector type. So we need to divide by Factor.
rdar://127846581
…ter (llvm#92303) As noted in llvm#91440 (comment), if the pass pipeline stops early because of -stop-after any allocated passes added with insertPass will not be freed if they haven't already been added. This was showing up as a failure on the address sanitizer buildbots. We can fix it by instead passing the pass ID instead so that allocation is deferred.
I built it and confirmed this fixes the issue locally. Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Currently the irdl dialect page has no content beyond the header. By referring to the Ops.td in the CMake config, it pulls in all the types, attributes, etc., so that the doc generation can include them all in the page. Rendered locally to confirm it fixes the issue ![image](https://github.com/llvm/llvm-project/assets/2467754/8758f324-6bc3-4f0e-8fa9-8962cdb0177f) Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
…ariables before consuming it (llvm#92218)" This reverts commit 3a4c1b9. This breaks a bot on clang-s390x-linux
This field is present in LLVM, but was missing from the MLIR wrapper type. This addition allows MLIR languages to add proper DWARF info for GPU programs.
…ion" (llvm#92354) Reverts llvm#90578 This broke the premerge linux buildbot.
Wrongly removed in 45cc6bd.
In .macro, \+ expands to the per-macro invocation count. https://sourceware.org/pipermail/binutils/2024-May/134009.html \+ counts from 0 for .irp/.irpc/.rept . Note: We currently prints \q for `.print "\q"` while gas doesn't. This patch does not change this behavior.
If there is only one non-terminator operation in the update region then the update operation can be found and we can try to generate an atomicrmw instruction. Otherwise use the cmpxchg loop. Fixes llvm#91929
Support `R_AARCH64_AUTH_RELATIVE` relocation compression as described in https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#relocation-compression
Addresses old TODO about the exp10 intrinsic not existing.
) Unsupported ops on tile types can become dead after `-convert-arm-sme-to-llvm` resulting in incorrect results. Verify such operations don't exist post-conversion and fail if they do. Based on discussion from https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543
cferry-AMD
approved these changes
Aug 26, 2024
An error occurred while trying to automatically change base from
bump_to_ecce5ccd
to
feature/fused-ops
September 3, 2024 20:08
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.