forked from llvm/llvm-project
[AutoBump] Merge with 894d3eeb (Aug 15) (4) #357
Open: mgehre-amd wants to merge 132 commits into feature/fused-ops from bump_to_894d3eeb
…, fsqrt(, l, f128) to math.yaml. (llvm#103494) Added auto function hdrgen specification for functions: totalordermag(,f, l, f128), dsqrt(l, f128), fsqrt(, l, f128)
Also combine the GlobalISel tests into the SelectionDAG ones.
This commit adds three matchers that, unlike the m_NonZero matcher, match not only constants but also operations that implement the InferIntRangeInterface. These matchers can then match a non-zero value or a value that is not minus one based on the inferred range. Additionally, the commit uses the new matchers in the getSpeculatability functions of Arith's signed and unsigned integer divisions. At the moment, the matchers only look at the defining operation to avoid expensive IR walks. These range-based matchers can be useful when hoisting divisions out of a loop, which requires knowing that the divisor is non-zero and, for signed divisions, not minus one. Just checking for a constant divisor may not be sufficient if the divisor is, for example, the result of an operation that returns the number of threads of a team of threads.
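The speculatability rule described above can be illustrated with a minimal standalone sketch. It does not use the MLIR matchers or `InferIntRangeInterface`; the `SignedRange` type and the function names are hypothetical, and the range is treated as a closed interval of signed values purely for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an inferred integer range: the closed interval
// [lo, hi] of signed values the divisor may take at run time.
struct SignedRange {
  int64_t lo;
  int64_t hi;
};

// True when the range proves the value can never equal `v`.
bool rangeExcludes(const SignedRange &r, int64_t v) {
  assert(r.lo <= r.hi && "range must be non-empty");
  return v < r.lo || v > r.hi;
}

// Unsigned division is only undefined for a zero divisor.
bool canSpeculateUnsignedDiv(const SignedRange &divisor) {
  return rangeExcludes(divisor, 0);
}

// Signed division is additionally undefined for INT_MIN / -1, so the
// divisor must be known to be neither 0 nor -1.
bool canSpeculateSignedDiv(const SignedRange &divisor) {
  return rangeExcludes(divisor, 0) && rangeExcludes(divisor, -1);
}
```

For instance, a divisor known to lie in [1, 1024] (such as a thread count) would pass both checks, while a divisor in [-1, 8] would pass neither.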
Allow subvector extraction as long as at least one operand extraction is free. Refactor existing cases into a switch statement to allow easier reuse + future expansion.
...because it is too noisy to be useful right now, and its architecture is terrible, so it can't act as a starting point of future development. The main problem with this checker is that it tries to do (or at least fake) path-sensitive analysis without actually using the established path-sensitive analysis engine. Instead of actually tracking the symbolic values and the known constraints on them, this checker blindly gropes the AST and uses heuristics like "this variable was seen in a comparison operator expression that is not a loop condition, so it's probably not too large" (which was improved in a separate commit to at least ignore comparison operators that appear after the actual `malloc()` call). This might have been acceptable in 2011 (when this checker was added), but since then we developed a significantly better standard approach for analysis and this old relic doesn't deserve to remain in the codebase. Needless to say, this primitive approach causes lots of false positives (and presumably false negatives as well), which ensures that this alpha checker won't be missed by the users. Moreover, the goals of this checker would be questionable even if it had a perfect implementation. It's very aggressive to assume that the argument of malloc can overflow by default (unless the checker sees a bounds check); and this produces too many false positives -- perhaps even for an optin checker. It may be possible to eventually create a useful (and properly path-sensitive) optin checker for these kinds of suspicious code, but this is a very low priority goal. Also note that we already have `alpha.security.TaintedAlloc` which provides more practical heuristics for detecting somewhat similar "argument of malloc may be too large" vulnerabilities.
…sions There's some coverage in RISCVISAInfoTest, but it's worth adding a quick test to ensure nothing happens to the frontend handling of this option.
When instantiating a delayed template, the recorded token stream is passed to `Parser::ParseLateTemplatedFuncDef` which will append the current token "so it doesn't get lost". With incremental extensions enabled, this is `repl_input_end` which subsequently needs support for (de)serialization.
… order (llvm#102844) Put the newest standards first, same as for the [C++ status page](https://clang.llvm.org/cxx_status.html). The diff is pretty busted, but I swear I copy & pasted faithfully 😅 The only change beyond shuffling sections around is unfolding the sections for C99/C11 (6dbce28), which isn't necessary anymore now that they're safely tucked away towards the end of the page.
…file (llvm#103004)" This reverts commit 2d53f0a. This causes warnings when building with MSVC.
getRawData exposes some internal details of APInt. The code was iterating over the uint64_t pieces and then breaking each of them into 4 uint16_t pieces. This patch changes the code to extract 16-bit pieces directly from the APInt without using getRawData.
…te global data (llvm#101224) This patch aims to reduce TOC usage by merging internal and private global data. Moreover, we also add the GlobalMerge pass within the PPCTargetMachine pipeline, which is disabled by default. This transformation can be enabled by -ppc-global-merge.
…nstant into a signed comparison (llvm#103480) Given an unsigned integer comparison of `add nsw X, C1` with some constant `C2` we can fold it into a signed comparison of `X` and `C2 - C1` under the following conditions:
* There's a `nsw` flag on the addition
* `C2` is non-negative
* `X + C1` is non-negative
* `C2 - C1` is non-negative
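The conditions listed above can be sanity-checked by brute force on 8-bit integers. The following is a minimal sketch, not the InstCombine implementation: the function name is hypothetical, only the unsigned less-than predicate is exercised, and "`X + C1` is non-negative" is modeled by simply skipping the inputs where it does not hold.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, for i8, that `(X + C1) u< C2` equals `X s< (C2 - C1)`
// whenever the four preconditions hold. Returns false on a counterexample.
bool foldHoldsForAllI8() {
  for (int x = -128; x <= 127; ++x) {
    for (int c1 = -128; c1 <= 127; ++c1) {
      for (int c2 = 0; c2 <= 127; ++c2) { // C2 is non-negative
        int sum = x + c1;
        if (sum < -128 || sum > 127)
          continue; // signed overflow: the nsw add would be poison
        if (sum < 0)
          continue; // X + C1 must be non-negative
        int diff = c2 - c1;
        if (diff < 0 || diff > 127)
          continue; // C2 - C1 must be a non-negative i8 value
        bool original = static_cast<uint8_t>(sum) < static_cast<uint8_t>(c2);
        bool folded = x < diff;
        if (original != folded)
          return false;
      }
    }
  }
  return true;
}
```

The check works because, once both sides of the unsigned comparison are known to be non-negative, the unsigned and signed orderings coincide, and subtracting `C1` from both sides cannot wrap.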
…lvm#103392) ... wherever we have the Decl for it, and even when we don't keep the SourceLocation of it aimed at the call site. Fixes: llvm#102983
In preparation for upcoming patches, just move the call to the proper place, which is NFC for now.
…p_atomics (llvm#103732) This commit adds support for `amdgpu-unsafe-fp-atomics` attribute plumbing via the introduction of `rocdl.unsafe_fp_atomics`. This adds the missing translation for the `amdgpu-waves-per-eu` attribute.
…vm#103927) This commit changes the LLVM dialect's inliner interface to no longer be registered at dialect initialization. Instead, it is now a promised interface, that needs to be registered explicitly. This change is desired to avoid pulling in a lot of dependencies into the `MLIRLLVMDialect` library, especially considering future patches that plan to extend it further with strong IR analysis.
…rs, NFC GatheredScalars is a full copy of E->Scalars in these places and can be safely used for now. Unifies the code across the function.
…to combine (srl (sra X, C1), ShAmt) -> sra(X, C1+ShAmt) (llvm#101751) If the upper bits of the shr aren't demanded. This helps with cases where the outer srl was originally an sra and was converted to a srl by SimplifyDemandedBits before it had a chance to combine with the inner sra. This can occur when the inner sra was part of a sign_extend_inreg expansion. There are some regressions in ARM and Thumb2.
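To illustrate why the combine is only valid when the upper bits are not demanded, here is a minimal brute-force sketch on 16-bit values. It is not the DAG combiner code; the helper names are hypothetical, and "demanded bits" is modeled as a mask covering the low `16 - ShAmt` bits.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right on a 16-bit value (zero fill).
uint16_t srl16(uint16_t x, unsigned s) {
  assert(s < 16);
  return static_cast<uint16_t>(x >> s);
}

// Arithmetic shift right on a 16-bit value (sign fill), written without
// relying on the behavior of >> for negative signed integers.
uint16_t sra16(uint16_t x, unsigned s) {
  assert(s < 16);
  uint32_t shifted = static_cast<uint32_t>(x) >> s;
  uint32_t signFill = (x & 0x8000u) ? ((0xFFFFu << (16 - s)) & 0xFFFFu) : 0u;
  return static_cast<uint16_t>(shifted | signFill);
}

// Checks that srl(sra(X, C1), ShAmt) and sra(X, C1 + ShAmt) agree on every
// bit except the top ShAmt bits, for all X and all in-range shift amounts.
bool combineHoldsForAllI16() {
  for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
    for (unsigned c1 = 0; c1 < 16; ++c1) {
      for (unsigned sh = 0; c1 + sh < 16; ++sh) {
        uint16_t demanded = static_cast<uint16_t>(0xFFFFu >> sh);
        uint16_t before = srl16(sra16(static_cast<uint16_t>(x), c1), sh);
        uint16_t after = sra16(static_cast<uint16_t>(x), c1 + sh);
        if ((before & demanded) != (after & demanded))
          return false;
      }
    }
  }
  return true;
}
```

The two forms differ only in the top `ShAmt` bits: the `srl` fills them with zeros while the merged `sra` fills them with copies of the sign bit, which is why those bits must not be demanded.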
…lvm#102952) This PR addresses the issue detailed in iree-org/iree#17948. The problem occurs when distributed types are set to NULL, leading to compilation crashes. --------- Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
…eded for explicit symbol visibility (llvm#103900) In multiple source files, function definitions never see their declarations because the declaring header is never included, causing linker errors when explicit symbol visibility macros/dllexport are added to the declarations. Most of these were originally found by @tstellar in llvm#67502. TargetRegistry.h is needed in MCExternalSymbolizer.cpp for createMCSymbolizer. Analysis/Passes.h is needed in LazyValueInfo.cpp and RegionInfo.cpp for createLazyValueInfoPass and createRegionInfoPass. Transforms/Scalar.h is needed in SpeculativeExecution.cpp for createSpeculativeExecutionPass.
…+ / VS2019+ (llvm#102848) Partial fix for llvm#92204. This PR just fixes VS2019+ since that is the suite of compilers that I require link compatibility with at the moment. I still intend to fix VS2017 and to update llvm-undname in future PRs. Once those are also finished and merged I'll close out llvm#92204. I am hoping to get the llvm-undname PR up in a couple of weeks to be able to demangle the VS2019+ name mangling. MSVC 1920+ mangles placeholder return types for non-templated functions with "@". For example `auto foo() { return 0; }` is mangled as `?foo@@YA@XZ`. MSVC 1920+ mangles placeholder return types for templated functions as the qualifiers of the AutoType followed by "_P" for `auto` and "_T" for `decltype(auto)`. For example `template<class T> auto foo() { return 0; }` is mangled as `??$foo@H@@YA?A_PXZ` when `foo` is instantiated as `foo<int>()`. Lambdas with placeholder return types are still mangled with clang's custom mangling since MSVC lambda mangling hasn't been deciphered yet. Similarly, any pointers in the return type with an address space are mangled with clang's custom mangling since that is a clang extension. We cannot augment `mangleType` to support this mangling scheme as the mangling schemes for variables and functions differ. auto variables are encoded with the fully deduced type whereas auto return types are not. The following two functions with a static variable are mangled the same:
```
template<class T>
int test()
{
    static int i = 0; // "?i@?1???$test@H@@YAHXZ@4HA"
    return i;
}

template<class T>
int test()
{
    static auto i = 0; // "?i@?1???$test@H@@YAHXZ@4HA"
    return i;
}
```
Inside `mangleType`, once we get to mangling the `AutoType`, we have no context about whether we came from a variable encoding or some other encoding. Therefore it was easier to handle any special casing for `AutoType` return types with a separate function instead of using the `mangleType` infrastructure.
FindCountedByField can be used in more places than CodeGen. Move it into FieldDecl to avoid layering issues.
To align with gas's latest changes. Related gas patch: https://sourceware.org/pipermail/binutils/2024-May/134360.html
llvm#96649) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. The support of vector (vp, vp.reduce, vector.reduce), experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of the box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, while they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX of RISC-V instructions follows the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a future patch. Background: https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which behave differently on different platforms for NUM vs sNaN:
1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.
The fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.
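As a rough illustration of the minimumNumber semantics referenced above, the following sketch returns the numeric operand when exactly one input is a NaN and treats -0 as smaller than +0. It is only a model of the value returned: the function names are hypothetical, it is not the new intrinsic's lowering, and it does not model the invalid-operation exception that a signaling NaN would raise.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Sketch of IEEE754-2019 minimumNumber on doubles (value semantics only).
double minimumNumberSketch(double a, double b) {
  if (std::isnan(a) && std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN(); // both NaN: quiet NaN
  if (std::isnan(a))
    return b; // exactly one NaN: return the number
  if (std::isnan(b))
    return a;
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? a : b; // -0 orders below +0
  return a < b ? a : b;
}

// A few illustrative cases.
void minimumNumberExamples() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(minimumNumberSketch(nan, 1.0) == 1.0);
  assert(minimumNumberSketch(2.0, nan) == 2.0);
  assert(std::signbit(minimumNumberSketch(0.0, -0.0)));
}
```

The NaN handling is the point of difference from a plain `a < b ? a : b`, which would return `b` whenever either operand is a NaN.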
This patch adds a verifier to `tosa.table` which fixes a crash. Fix llvm#103086.
Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp
3 MLIR tests `FAIL` on SPARC, both Solaris/sparcv9 and Linux/sparc64:
```
MLIR :: Conversion/ArithToSPIRV/arith-to-spirv-le-specific.mlir
MLIR :: IR/elements-attr-interface.mlir
MLIR :: Target/LLVMIR/llvmir-le-specific.mlir
```
The issue is always the same: the tests in question are little-endian-only currently, so this patch `XFAIL`s them on `sparc*` as is already done for `s390x`. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
…lvm#103722) `Flang :: Lower/default-initialization-globals.f90` `FAIL`s on SPARC, both Solaris/sparcv9 and Linux/sparc64. The failure mode is same as on AIX/PowerPC, so both targets being big-endian, this patch treats them the same. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
…C) (llvm#103723) This makes `LayoutAlignElem` / `PointerAlignElem` and `AlignTypeEnum` inner types of `DataLayout`. The types are also renamed to match their meaning (LangRef refers to them as "specification" and "specifier"). Pull Request: llvm#103723
Removing them simplifies the content and means we don't confuse anyone who joined after the Phabricator shutdown. You could use them for review archaeology but this is only a subset of the names you'd encounter there anyway. So I don't think this is a good reason to keep them here. With a couple of exceptions the Phabricator/GitHub names are the same and/or related to their full name anyway.
…lvm#103730) `Flang :: Driver/fveclib-codegen.f90` currently `FAIL`s on SPARC, both Solaris/sparcv9 and Linux/sparc64:
```
bin/flang-new -S -Ofast -fveclib=LIBMVEC -o - /vol/llvm/src/llvm-project/local/flang/test/Driver/fveclib-codegen.f90
flang/test/Driver/fveclib-codegen.f90:11:10: error: CHECK: expected string not found in input
! CHECK: _ZGVbN4vv_powf
         ^
```
The code in question only contains calls to `powf`. Given that `glibc` only supports `libmvec` on `aarch64` and `x86_64`, this test targets only those if possible. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
Until llvm#103056 lands or another more appropriate check can be found. This test fails on Ubuntu Focal where zdump is built with 32 bit time_t but passes on Ubuntu Jammy where zdump is built with 64 bit time_t. Marking it unsupported means Linaro can upgrade its bots to Ubuntu Jammy without getting an unexpected pass.
This commit introduces a slicing utility that can be used to walk arbitrary IR slices. It additionally ships logic to determine control flow predecessors, which allows users to walk backward slices without dealing with both `RegionBranchOpInterface` and `BranchOpInterface`. This utility is used to improve the `noalias` propagation in the LLVM dialect's inliner interface. Before this change, it broke down as soon as pointers were passed through region control flow operations.
Adds m_FPToUI/m_FPToSI matchers for ISD::FP_TO_UINT/ISD::FP_TO_SINT in SDPatternMatch.h with suitable test coverage. Fixes llvm#103872
…llvm#104037) The target needs to be initialized in order to compute the correct target triple from the command line. Without initialized targets the OS component of the triple might not reflect what would be computed by the driver for an actual compiler invocation. Fixes llvm#61762
…pNestOp (llvm#103731) This patch adds an assert to `genLoopNestClauses` to ensure that the number of symbols matches the number of corresponding loop wrapper entry block arguments. This is checked by some of the callers, but it makes more sense to move it into the function itself and avoid having to replicate it.
This updates the "dxil-metadata-emit" pass flag to be spelled "dxil-translate-metadata" to better match the pass name. Pull Request: llvm#104249
cferry-AMD approved these changes Sep 30, 2024
An error occurred while trying to automatically change base from bump_to_98119718 to feature/fused-ops (October 4, 2024 14:33)
No description provided.