[AutoBump] Merge with 894d3eeb (Aug 15) (4) #357

mgehre-amd · 2024-09-20T08:04:22Z

No description provided.

…, fsqrt(, l, f128) to math.yaml. (llvm#103494) Added auto function hdrgen specification for functions: totalordermag(,f, l, f128), dsqrt(l, f128), fsqrt(, l, f128)

Also combine the GlobalISel tests into the SelectionDAG ones.

This commit adds three matchers that, unlike the m_NonZero matcher, match not only constants but also operations that implement the InferIntRangeInterface. These matchers can then match a non-zero value, or a value that is not minus one, based on the inferred range. Additionally, the commit uses the new matchers in the getSpeculatability functions of Arith's signed and unsigned integer divisions. At the moment, the matchers only look at the defining operation to avoid expensive IR walks. These range-based matchers can be useful when hoisting divisions out of a loop, which requires knowing that the divisor is non-zero and, for signed divisions, not minus one. Just checking for a constant divisor may not be sufficient if the divisor is, for example, the result of an operation that returns the number of threads of a team of threads.
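As a rough illustration of the speculatability condition described above, the sketch below checks whether an inferred signed range rules out the problematic divisors. `IntRange`, `isKnownNonZero`, `isKnownNotMinusOne`, and `isSignedDivSpeculatable` are hypothetical names made up for this example; they are not the matchers added by the commit, and the range type used by MLIR carries more information than this stand-in.

```cpp
#include <cstdint>

// Hypothetical stand-in for an inferred signed integer range [smin, smax].
struct IntRange {
  int64_t smin;
  int64_t smax;
};

// True if the range cannot contain 0.
bool isKnownNonZero(const IntRange &r) { return r.smin > 0 || r.smax < 0; }

// True if the range cannot contain -1.
bool isKnownNotMinusOne(const IntRange &r) {
  return r.smin > -1 || r.smax < -1;
}

// A signed division can be hoisted only if the divisor can be neither 0
// (division by zero) nor -1 (overflow when the dividend is INT_MIN).
bool isSignedDivSpeculatable(const IntRange &divisor) {
  return isKnownNonZero(divisor) && isKnownNotMinusOne(divisor);
}
```

With this sketch, a divisor known to lie in `[1, 1024]` (such as a thread count) is accepted even though it is not a constant, while `[-1, 5]` and `[0, 5]` are rejected.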

Allow subvector extraction as long as at least one operand extraction is free. Refactor existing cases into a switch statement to allow easier reuse + future expansion.

) It seems that the parameters can be passed through the class members.

…03767)

...because it is too noisy to be useful right now, and its architecture is terrible, so it can't act as a starting point of future development. The main problem with this checker is that it tries to do (or at least fake) path-sensitive analysis without actually using the established path-sensitive analysis engine. Instead of actually tracking the symbolic values and the known constraints on them, this checker blindly gropes the AST and uses heuristics like "this variable was seen in a comparison operator expression that is not a loop condition, so it's probably not too large" (which was improved in a separate commit to at least ignore comparison operators that appear after the actual `malloc()` call). This might have been acceptable in 2011 (when this checker was added), but since then we developed a significantly better standard approach for analysis and this old relic doesn't deserve to remain in the codebase. Needless to say, this primitive approach causes lots of false positives (and presumably false negatives as well), which ensures that this alpha checker won't be missed by the users. Moreover, the goals of this checker would be questionable even if it had a perfect implementation. It's very aggressive to assume that the argument of malloc can overflow by default (unless the checker sees a bounds check); and this produces too many false positives -- perhaps even for an optin checker. It may be possible to eventually create a useful (and properly path-sensitive) optin checker for this kind of suspicious code, but this is a very low priority goal. Also note that we already have `alpha.security.TaintedAlloc` which provides more practical heuristics for detecting somewhat similar "argument of malloc may be too large" vulnerabilities.

…sions There's some coverage in RISCVISAInfoTest, but it's worth adding a quick test to ensure nothing happens to the frontend handling of this option.

When instantiating a delayed template, the recorded token stream is passed to `Parser::ParseLateTemplatedFuncDef` which will append the current token "so it doesn't get lost". With incremental extensions enabled, this is `repl_input_end` which subsequently needs support for (de)serialization.

… order (llvm#102844) Put the newest standards first, same as for the [C++ status page](https://clang.llvm.org/cxx_status.html). The diff is pretty busted, but I swear I copy & pasted faithfully 😅 The only change beyond shuffling sections around is unfolding the sections for C99/C11 (6dbce28), which isn't necessary anymore now that they're safely tucked away towards the end of the page.

…file (llvm#103004)" This reverts commit 2d53f0a. This causes warnings when building with MSVC.

getRawData exposes some internal details of APInt. The code was iterating over the uint64_t pieces and then breaking each of them into 4 uint16_t pieces. This patch changes the code to extract 16-bit pieces directly from the APInt without using getRawData.
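The idea of pulling 16-bit pieces out by bit position, rather than walking the underlying storage words, can be sketched as follows. This is only an illustration: a plain `uint64_t` stands in for the APInt, and `extractBits` and `split16` are hypothetical helpers, not the code touched by the patch.

```cpp
#include <cstdint>
#include <vector>

// Extract `numBits` bits starting at bit `pos`, zero-extended.
// Assumes 0 < numBits < 64 and pos + numBits <= 64.
uint64_t extractBits(uint64_t value, unsigned numBits, unsigned pos) {
  return (value >> pos) & ((uint64_t{1} << numBits) - 1);
}

// Split a 64-bit value into four 16-bit pieces, least significant first,
// by asking for each piece at its bit offset.
std::vector<uint16_t> split16(uint64_t value) {
  std::vector<uint16_t> pieces;
  for (unsigned pos = 0; pos < 64; pos += 16)
    pieces.push_back(static_cast<uint16_t>(extractBits(value, 16, pos)));
  return pieces;
}
```

The caller never needs to know how the wide integer is stored; it only addresses bits by position.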

…te global data (llvm#101224) This patch aims to reduce TOC usage by merging internal and private global data. Moreover, we also add the GlobalMerge pass within the PPCTargetMachine pipeline, which is disabled by default. This transformation can be enabled by -ppc-global-merge.

…nstant into a signed comparison (llvm#103480) Given an unsigned integer comparison of `add nsw X, C1` with some constant `C2` we can fold it into a signed comparison of `X` and `C2 - C1` under the following conditions:

* There's a `nsw` flag on the addition
* `C2` is non-negative
* `X + C1` is non-negative
* `C2 - C1` is non-negative
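The conditions above can be sanity-checked by brute force on 8-bit values. The sketch below is not the InstCombine implementation; it only models the `ult` predicate with plain integers, and `countFoldMismatches` is a hypothetical helper name. Cases where the `nsw` addition would overflow are skipped, since the result would be poison.

```cpp
#include <cstdint>

// Exhaustively compare `(X + C1) <u C2` with `X <s (C2 - C1)` on i8,
// restricted to the cases allowed by the four conditions. Returns the
// number of (X, C1, C2) triples on which the two comparisons disagree.
int countFoldMismatches() {
  int mismatches = 0;
  for (int x = -128; x <= 127; ++x) {
    for (int c1 = -128; c1 <= 127; ++c1) {
      int sum = x + c1;
      if (sum < -128 || sum > 127) continue;  // nsw violated: poison
      if (sum < 0) continue;                  // X + C1 must be non-negative
      for (int c2 = 0; c2 <= 127; ++c2) {     // C2 must be non-negative
        int diff = c2 - c1;
        if (diff < 0 || diff > 127) continue; // C2 - C1 must be non-negative in i8
        bool original = static_cast<uint8_t>(sum) < static_cast<uint8_t>(c2);
        bool folded = x < diff;
        if (original != folded) ++mismatches;
      }
    }
  }
  return mismatches;
}
```

Under these assumptions the function returns 0: both sides of the unsigned comparison are non-negative, so it agrees with the signed one, and subtracting `C1` from both sides cannot wrap.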

…lvm#103392) ... wherever we have the Decl for it, and even when we don't, keep the SourceLocation of it aimed at the call site. Fixes: llvm#102983

llvm#103935)

In preparation for upcoming patches, just move the call to the proper place; this is NFC for now.

…p_atomics (llvm#103732) This commit adds support for plumbing the amdgpu-unsafe-fp-atomics attr via the introduction of `rocdl.unsafe_fp_atomics`. It also adds the missing translation for the amdgpu-waves-per-eu attr.

…vm#103927) This commit changes the LLVM dialect's inliner interface to no longer be registered at dialect initialization. Instead, it is now a promised interface that needs to be registered explicitly. This change is desired to avoid pulling a lot of dependencies into the `MLIRLLVMDialect` library, especially considering future patches that plan to extend it further with strong IR analysis.

…rs, NFC GatheredScalars is a full copy of E->Scalars in these places and can be safely used for now. Unifies the code across the function.

…to combine (srl (sra X, C1), ShAmt) -> sra(X, C1+ShAmt) (llvm#101751) If the upper bits of the shr aren't demanded. This helps with cases where the outer srl was originally an sra and was converted to a srl by SimplifyDemandedBits before it had a chance to combine with the inner sra. This can occur when the inner sra was part of a sign_extend_inreg expansion. There are some regressions in ARM and Thumb2.
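The reason the combine is safe when the upper bits are not demanded can be checked exhaustively on 8-bit values. The sketch below is an illustration only, not the DAG combine itself; `lshr8`, `ashr8`, and `countShiftMismatches` are hypothetical helpers, and the arithmetic shift is written out by hand so that it does not rely on implementation-defined behavior.

```cpp
#include <cstdint>

// Logical shift right on an 8-bit value.
uint8_t lshr8(uint8_t v, unsigned s) { return static_cast<uint8_t>(v >> s); }

// Arithmetic shift right on an 8-bit value, interpreting `v` as signed.
uint8_t ashr8(uint8_t v, unsigned s) {
  int sv = v >= 128 ? static_cast<int>(v) - 256 : static_cast<int>(v);
  int r = sv >= 0 ? (sv >> s) : ~((~sv) >> s);
  return static_cast<uint8_t>(r);
}

// Compare srl(sra(X, C1), ShAmt) with sra(X, C1 + ShAmt) on the bits that
// remain demanded, i.e. everything except the top ShAmt bits.
int countShiftMismatches() {
  int mismatches = 0;
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c1 = 0; c1 < 8; ++c1)
      for (unsigned sh = 0; c1 + sh < 8; ++sh) {
        uint8_t demanded = static_cast<uint8_t>(0xFFu >> sh);
        uint8_t xv = static_cast<uint8_t>(x);
        uint8_t before = lshr8(ashr8(xv, c1), sh) & demanded;
        uint8_t after = ashr8(xv, c1 + sh) & demanded;
        if (before != after) ++mismatches;
      }
  return mismatches;
}
```

The two forms differ only in the top `ShAmt` bits (zeros for `srl`, sign copies for `sra`), so masking those bits away leaves no mismatches.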

…lvm#102952) This PR addresses the issue detailed in iree-org/iree#17948. The problem occurs when distributed types are set to NULL, leading to compilation crashes. --------- Signed-off-by: Bangtian Liu <liubangtian@gmail.com>


…eded for explicit symbol visibility (llvm#103900) In multiple source files, function definitions never see their declarations because the declaring header is never included, which causes linker errors when explicit symbol visibility macros/dllexport are added to the declarations. Most of these were originally found by @tstellar in llvm#67502.

* TargetRegistry.h is needed in MCExternalSymbolizer.cpp for createMCSymbolizer
* Analysis/Passes.h is needed in LazyValueInfo.cpp and RegionInfo.cpp for createLazyValueInfoPass and createRegionInfoPass
* Transforms/Scalar.h is needed in SpeculativeExecution.cpp for createSpeculativeExecutionPass


…+ / VS2019+ (llvm#102848) Partial fix for llvm#92204. This PR just fixes VS2019+ since that is the suite of compilers that I require link compatibility with at the moment. I still intend to fix VS2017 and to update llvm-undname in future PRs. Once those are also finished and merged I'll close out llvm#92204. I am hoping to get the llvm-undname PR up in a couple of weeks to be able to demangle the VS2019+ name mangling.

MSVC 1920+ mangles placeholder return types for non-templated functions with "@". For example `auto foo() { return 0; }` is mangled as `?foo@@YA@XZ`.

MSVC 1920+ mangles placeholder return types for templated functions as the qualifiers of the AutoType followed by "_P" for `auto` and "_T" for `decltype(auto)`. For example `template<class T> auto foo() { return 0; }` is mangled as `??$foo@H@@YA?A_PXZ` when `foo` is instantiated as `foo<int>()`.

Lambdas with placeholder return types are still mangled with clang's custom mangling since MSVC lambda mangling hasn't been deciphered yet. Similarly, any pointers in the return type with an address space are mangled with clang's custom mangling since that is a clang extension.

We cannot augment `mangleType` to support this mangling scheme as the mangling schemes for variables and functions differ. auto variables are encoded with the fully deduced type, whereas auto return types are not. The following two functions with a static variable are mangled the same:

```
template<class T>
int test()
{
    static int i = 0; // "?i@?1???$test@H@@YAHXZ@4HA"
    return i;
}

template<class T>
int test()
{
    static auto i = 0; // "?i@?1???$test@H@@YAHXZ@4HA"
    return i;
}
```

Inside `mangleType`, once we get to mangling the `AutoType` we have no context on whether we came from a variable encoding or some other encoding. Therefore it was easier to handle any special casing for `AutoType` return types with a separate function instead of using the `mangleType` infrastructure.

FindCountedByField can be used in more places than CodeGen. Move it into FieldDecl to avoid layering issues.

llvm#96649) C23 introduced the new functions fminimum_num and fmaximum_num, which follow minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch only introduces support for scalar values. Support for vectors (vp, vp.reduce, vector.reduce) and experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch work out of the box with fcanonical and fmax/fmin. AArch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but they have no fcanonical support yet; I will add it in future patches. The FMIN/FMAX instructions of RISC-V follow minimumNumber/maximumNumber of IEEE754-2019; support for them can be added in a future patch.

Background: https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which behave differently on different platforms for NUM vs sNaN:

1) Fallback to fmin(3)/fmax(3): return qNaN.
2) ARM64/ARM32+Neon: same as libc.
3) MIPSr6/LoongArch/RISC-V: return NUM.

The fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.
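For readers unfamiliar with the minimumNumber/maximumNumber semantics, here is a minimal scalar sketch. It is an assumption-laden model rather than the lowering added by the patch: the function names are made up for the example, and signaling NaNs and floating-point exception flags are not modeled.

```cpp
#include <cmath>

// Sketch of IEEE754-2019 minimumNumber for doubles: a NaN operand is
// ignored in favor of a number, and -0.0 is treated as smaller than +0.0.
double minimumNumber(double a, double b) {
  if (std::isnan(a)) return b;  // also yields NaN when both are NaN
  if (std::isnan(b)) return a;
  if (a == 0.0 && b == 0.0) return std::signbit(a) ? a : b;
  return a < b ? a : b;
}

// Sketch of IEEE754-2019 maximumNumber, with +0.0 greater than -0.0.
double maximumNumber(double a, double b) {
  if (std::isnan(a)) return b;  // also yields NaN when both are NaN
  if (std::isnan(b)) return a;
  if (a == 0.0 && b == 0.0) return std::signbit(a) ? b : a;
  return a > b ? a : b;
}
```

In this model, `minimumNumber(NaN, 1.0)` returns `1.0`, which corresponds to the "return NUM" behavior listed above.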

To align with gas's latest changes. Related gas patch: https://sourceware.org/pipermail/binutils/2024-May/134360.html

This patch adds a verifier to `tosa.table` which fixes a crash. Fix llvm#103086.

Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp

) For now, the testcases are grouped in a single TEST. I'll sort them out and add more testcases in follow-up commits.

3 MLIR tests `FAIL` on SPARC, both Solaris/sparcv9 and Linux/sparc64:

```
MLIR :: Conversion/ArithToSPIRV/arith-to-spirv-le-specific.mlir
MLIR :: IR/elements-attr-interface.mlir
MLIR :: Target/LLVMIR/llvmir-le-specific.mlir
```

The issue is always the same: the tests in question are little-endian-only currently, so this patch `XFAIL`s them on `sparc*` as is already done for `s390x`. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.

…lvm#103722) `Flang :: Lower/default-initialization-globals.f90` `FAIL`s on SPARC, both Solaris/sparcv9 and Linux/sparc64. The failure mode is the same as on AIX/PowerPC; since both targets are big-endian, this patch treats them the same. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.

…C) (llvm#103723) This makes `LayoutAlignElem` / `PointerAlignElem` and `AlignTypeEnum` inner types of `DataLayout`. The types are also renamed to match their meaning (LangRef refers to them as "specification" and "specifier"). Pull Request: llvm#103723

Removing them simplifies the content and means we don't confuse anyone who joined after the Phabricator shutdown. You could use them for review archaeology but this is only a subset of the names you'd encounter there anyway. So I don't think this is a good reason to keep them here. With a couple of exceptions the Phabricator/GitHub names are the same and/or related to their full name anyway.

…lvm#103730) `Flang :: Driver/fveclib-codegen.f90` currently `FAIL`s on SPARC, both Solaris/sparcv9 and Linux/sparc64:

```
bin/flang-new -S -Ofast -fveclib=LIBMVEC -o - /vol/llvm/src/llvm-project/local/flang/test/Driver/fveclib-codegen.f90
flang/test/Driver/fveclib-codegen.f90:11:10: error: CHECK: expected string not found in input
! CHECK: _ZGVbN4vv_powf
         ^
```

The code in question only contains calls to `powf`. Given that `glibc` only supports `libmvec` on `aarch64` and `x86_64`, this test targets only those if possible. Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.

Until llvm#103056 lands or another more appropriate check can be found. This test fails on Ubuntu Focal where zdump is built with 32 bit time_t but passes on Ubuntu Jammy where zdump is built with 64 bit time_t. Marking it unsupported means Linaro can upgrade its bots to Ubuntu Jammy without getting an unexpected pass.

This commit introduces a slicing utility that can be used to walk arbitrary IR slices. It additionally ships logic to determine control flow predecessors, which allows users to walk backward slices without dealing with both `RegionBranchOpInterface` and `BranchOpInterface`. This utility is used to improve the `noalias` propagation in the LLVM dialect's inliner interface. Before this change, it broke down as soon as pointers were passed through region control flow operations.

…#98586) `cast_or_null` is deprecated. https://github.com/llvm/llvm-project/blob/062844615db5e141da118c1ad780bf102537f40a/llvm/include/llvm/Support/Casting.h#L717-L722

Adds m_FPToUI/m_FPToSI matchers for ISD::FP_TO_UINT/ISD::FP_TO_SINT in SDPatternMatch.h with suitable test coverage. Fixes llvm#103872

…llvm#104037) The target needs to be initialized in order to compute the correct target triple from the command line. Without initialized targets the OS component of the triple might not reflect what would be computed by the driver for an actual compiler invocation. Fixes llvm#61762

…pNestOp (llvm#103731) This patch adds an assert to `genLoopNestClauses` to ensure that the number of symbols matches the number of corresponding loop wrapper entry block arguments. This is checked by some of the callers, but it makes more sense to move the check into the function itself and avoid having to replicate it.

This updates the "dxil-metadata-emit" pass flag to be spelled "dxil-translate-metadata" to better match the pass name. Pull Request: llvm#104249

wldfngrs and others added 30 commits August 14, 2024 08:22

[libc] Add definitions for totalordermag(,f, l, f128), dsqrt(l, f128)…

f53e355

…, fsqrt(, l, f128) to math.yaml. (llvm#103494) Added auto function hdrgen specification for functions: totalordermag(,f, l, f128), dsqrt(l, f128), fsqrt(, l, f128)

[AMDGPU] Generate checks for llvm.amdgcn.is.private/shared (llvm#103859)

df57833

Also combine the GlobalISel tests into the SelectionDAG ones.

[X86] combineEXTRACT_SUBVECTOR - fold extractions from UNPCK nodes.

6183665

Allow subvector extraction as long as at least one operand extraction is free. Refactor existing cases into a switch statement to allow easier reuse + future expansion.

[X86] combineEXTRACT_SUBVECTOR - fold extractions from BLENDI nodes.

503ba62

[NFC][VP] Reduce parameters in LoopVectorizePass::runImpl (llvm#103551

b006007

) It seems that the parameters can be passed through the class members.

LowerAtomic: Use explicit alignment in lowerAtomicCmpXchgInst (llvm#1…

5ececb4

…03767)

[clang][test][RISCV] Add simple litmus test for --print-enabled-exten…

3f0d3fd

…sions There's some coverage in RISCVISAInfoTest, but it's worth adding a quick test to ensure nothing happens to the frontend handling of this option.

AMDGPU: Preserve alignment when custom expanding atomicrmw (llvm#103768)

0edd077

[LV] Add test where diff checks not used when re-trying with RT checks.

3efcc8e

Revert "[clang][Interp][NFC] Move _Complex compiler code to separate …

486adc5

…file (llvm#103004)" This reverts commit 2d53f0a. This causes warnings when building with MSVC.

[gn build] Port 486adc5

cba9166

[clang][AArch64] Point the nofp ABI check diagnostics at the callee (l…

019ef52

…lvm#103392) ... wherever we have the Decl for it, and even when we don't, keep the SourceLocation of it aimed at the call site. Fixes: llvm#102983

[libc][math] Fix missing const in hdrgen signatures for totalordermag* (

80c5ccd

llvm#103935)

[SLP][NFC]Use transform nodes before building external uses, NFC.

d9b9ae6

In preparation for upcoming patches, just move the call to the proper place; this is NFC for now.

[SelectionDAG] Construct SmallVector with ArrayRef (NFC) (llvm#103705)

5ce326c

[MLIR][AMDGPU] Add rocdl.attr translation for waves_per_eu & unsafe_f…

bd42177

…p_atomics (llvm#103732) This commit adds support for plumbing the amdgpu-unsafe-fp-atomics attr via the introduction of `rocdl.unsafe_fp_atomics`. It also adds the missing translation for the amdgpu-waves-per-eu attr.

[Analysis] Use range-based for loops (NFC) (llvm#103540)

1115dee

[SLP][NFC]Use GatheredScalars vector instead of the original E->Scala…

20b2c9f

…rs, NFC GatheredScalars is a full copy of E->Scalars in these places and can be safely used for now. Unifies the code across the function.

[RISCV] Use if init statement to reduce scope of variable. NFC

294ed6a

MaxEW707 and others added 27 commits August 14, 2024 21:51

Fix testcases. Use -emit-llvm and not -S. Use LABEL checking.

6e2d9df

[Clang][NFC] Move FindCountedByField into FieldDecl (llvm#104235)

94b8b11

FindCountedByField can be used in more places than CodeGen. Move it into FieldDecl to avoid layering issues.

Remove failing test until it can be fixed properly.

07a8cba

[ctx_prof] Remove an unneeded include in CtxProfAnalysis.cpp

8d03710

[X86][MC] Remove CMPCCXADD's CondCode flavor. (llvm#103898)

372842b

To align with gas's latest changes. Related gas patch: https://sourceware.org/pipermail/binutils/2024-May/134360.html

[mlir][tosa] Add verifier for tosa.table (llvm#103708)

1e34706

This patch adds a verifier to `tosa.table` which fixes a crash. Fix llvm#103086.

[include-cleaner] Remove two commented-out lines of code.

3eaf483

[VPlan] Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp (NFC).

12763a0

Move VPWidenStoreRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp

Fix warnings in llvm#102848 [-Wunused-but-set-variable]

fa343be

[UnitTests] Convert some data layout parsing tests to GTest (llvm#104346

845431a

) For now, the testcases are grouped in a single TEST. I'll sort them out and add more testcases in follow-up commits.

[llvm][Docs] _or_null -> _if_present in Programmer's Manual (llvm…

5f15c17

…#98586) `cast_or_null` is deprecated. https://github.com/llvm/llvm-project/blob/062844615db5e141da118c1ad780bf102537f40a/llvm/include/llvm/Support/Casting.h#L717-L722

[DAG] Adding m_FPToUI and m_FPToSI to SDPatternMatch.h (llvm#104044)

05dfac2

Adds m_FPToUI/m_FPToSI matchers for ISD::FP_TO_UINT/ISD::FP_TO_SINT in SDPatternMatch.h with suitable test coverage. Fixes llvm#103872

[bazel] Port for 1415365

9a9ce91

[DirectX] Use a more consistent pass name for DXILTranslateMetadata

8107810

This updates the "dxil-metadata-emit" pass flag to be spelled "dxil-translate-metadata" to better match the pass name. Pull Request: llvm#104249

Remove empty line.

894d3ee

[AutoBump] Merge with 894d3ee (Aug 15)

7f35dc8

cferry-AMD approved these changes Sep 30, 2024


Base automatically changed from bump_to_98119718 to feature/fused-ops October 4, 2024 14:33

