[AutoBump] Merge with 235d6841 (17) #274

cferry-AMD · 2024-08-16T14:44:07Z

No description provided.

…6730) Move `GPUOpsLowering.cpp` from `//mlir:GPUCommonTransforms` to `//mlir:GPUToGPURuntimeTransforms` to match the CMake setup. Ideally, header files should be used by only one target, but this is hard because CMake is less strict with headers (no layering check). But even with bazel, headers should only be exported once in the `hdrs` attribute. Other targets may use them in the `srcs` attribute to avoid circular dependencies.

…m#86394) As part of the WebAssembly support work llvm#85566 The README.txt is a bit odd since it only lists issues and problems without talking about what works. It’s also hard to read on the GitHub web view. - Convert to Markdown and link to the command docs https://llvm.org/docs/CommandGuide/llvm-debuginfo-analyzer - Rename some leftover 'elf reader' mentions to 'DWARF reader'.

…cb. (llvm#83375) For targets with Zcb, this patch makes llvm generate more compressed c.lb/lbu/lh/lhu/sb/sh instructions.

Inline a callee if its target-features are a subset of the caller's target-features.

…m#86617)" Changes in Recommit: Add an additional check on sign/zero extend to the same type. Original message: Use the destination data type to measure the LMUL size for latency/throughput cost

Unlike add, sub and mul, we don't have widening instructions for div, rem and logical ops, so we don't have any test coverage if we were to extend combineBinOpOfZExts to handle them. Adding tests coincidentally revealed that logical ops are already narrowed as a generic DAG combine via DAGCombiner::hoistLogicOpWithSameOpcodeHands. So we don't actually need to run combineBinOpOfZExts on them.

When `+sve` is passed on the command line and the targeted architecture is V8.6A/V9.1A or later, `+f32mm` is also added. This enables FEAT_32MM; however, at the time of writing no CPUs support it. As a result, FEAT_32MM instructions are compiled for CPUs that do not support them. This commit removes the automatic enablement; the option can still be enabled by passing `+f32mm`.

Add a reassoc restriction for powi(X,Y) / X, according to the discussion on PR69998.

Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: llvm#86709

Note we can't use vwaddu.wv because it will get combined away with llvm#78403

llvm#78772 added similar support for .def file parser and import library writer. This PR adds missing bits in LLD to propagate EXPORTAS name and allow it in `/export` parser. This syntax is used by MSVC for ARM64EC `__declspec(dllexport)` handling.

…angled ARM64EC symbols. (llvm#86722)

…85460) Currently, the builtins used for implementing `va_list` handling unconditionally take their arguments as unqualified `ptr`s i.e. pointers to AS 0. This does not work for targets where the default AS is not 0 or AS 0 is not a viable AS (for example, a target might choose 0 to represent the constant address space). This patch changes the builtins' signature to take generic `anyptr` args, which corrects this issue. It is noisy due to the number of tests affected. A test for an upstream target which does not use 0 as its default AS (SPIRV for HIP device compilations) is added as well.

Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr. Relands llvm#80855 with fixes

…ing it. NFC.

…ExtractVectorElt Prep work for llvm#85419 to make it easier to reuse in other combines

…vm#86411) Closes llvm#73675 Co-authored-by: Balazs Benics <benicsbalazs@gmail.com> Co-authored-by: NagyDonat <donat.nagy@ericsson.com>

…lvm#86789) Reland of cfa0833

Adjust logic of 1cb9f37 to match freebsd/freebsd-src@9a4d48a645a7a. D113443 is the original attempt to bring this FreeBSD patch to llvm-project, but it never landed. This change is required to build FreeBSD kernel modules with -fstack-protector using a standard LLVM toolchain. The FreeBSD kernel loader does not handle R_X86_64_REX_GOTPCRELX relocations. Fixes llvm#50932.

This upstreams the last bits of Clang API Notes functionality that is currently implemented in the Apple fork: https://github.com/apple/llvm-project/tree/next/clang/lib/APINotes

I'm hoping this will fix the errors we've been seeing the last few days: 2024-03-19T20:44:07.4841482Z 2024/03/19 20:44:07 error signing scorecard json results: error signing payload: getting key from Fulcio: verifying SCT: updating local metadata and targets: error updating to TUF remote mirror: invalid key

…lvm#86724) On z/OS int128 is disabled, causing one of the cases in `saturate_cast.pass.cpp` to fail. The failure is only in 64-bit mode. In this case `std::numeric_limits<long long int>::max()` is within `std::numeric_limits<unsigned long int>::min()` and `std::numeric_limits<unsigned long int>::max()`; therefore, `saturate_cast<unsigned long int>(sBigMax) == LONG_MAX` and not `ULONG_MAX` as in the original test. In 32-bit mode, `saturate_cast<unsigned long int>(sBigMax) == ULONG_MAX`, as on other platforms where int128 is enabled. This PR is required to pass this test case on z/OS and possibly on other platforms where int128 is not supported/enabled. --------- Co-authored-by: Sean Perry <perry@ca.ibm.com>
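To illustrate the arithmetic behind this test change, here is a minimal sketch of a saturating `long long` to `unsigned long` conversion. `saturateToULong` is a hypothetical helper written for this note; it is not libc++'s `saturate_cast` and only mirrors the clamping behaviour the message describes.

```cpp
#include <limits>

// Hypothetical helper (not the libc++ implementation): convert a signed
// 64-bit-or-wider value to unsigned long, clamping instead of wrapping.
constexpr unsigned long saturateToULong(long long v) {
  return v < 0
             ? 0ul  // negative values clamp to the minimum of the target type
             : static_cast<unsigned long long>(v) >
                       std::numeric_limits<unsigned long>::max()
                   ? std::numeric_limits<unsigned long>::max()  // too large: clamp
                   : static_cast<unsigned long>(v);             // fits: unchanged
}
```

Where `unsigned long` is 64 bits wide, the largest `long long` fits and comes back unchanged, which is the `LONG_MAX` case above. Where `unsigned long` is 32 bits wide, the same input clamps to `ULONG_MAX`.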

…} (zext, zext))) (llvm#86779) This narrows unsigned and signed div and rem nodes via combineBinOpOfZExt. Unlike other binary ops, there are no widening div or rem instructions. So we will end up with an extra vzext.vf2. However I'm assuming that div/rem are expensive enough that by reducing their EMUL we will gain back the cost. Alive2 proof: https://alive2.llvm.org/ce/z/Et_L6y

…tion (llvm#86972) Fixes llvm#86917 `FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.

libc++ debug mode verifies that a comparator passed to std::sort defines a strict weak order by calling it with the same element. See also: - RFC that introduced the feature: https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217 - `strict_weak_ordering_check.h` in libc++ sources.
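A minimal sketch of the property that this check probes: a strict weak order must be irreflexive, so `comp(x, x)` has to be false. The names `Less`, `LessEqual` and `isIrreflexiveAt` are hypothetical and exist only for this illustration; the real check lives in libc++'s `strict_weak_ordering_check.h` and is more thorough.

```cpp
// Hypothetical comparators for illustration.
struct Less {
  constexpr bool operator()(int a, int b) const { return a < b; }
};
struct LessEqual {
  constexpr bool operator()(int a, int b) const { return a <= b; }
};

// Returns true when the comparator reports that x is NOT ordered before
// itself, which every strict weak order must satisfy.
template <class Compare>
constexpr bool isIrreflexiveAt(int x, Compare comp) {
  return !comp(x, x);
}
```

A comparator like `LessEqual` returns true for equal elements, so it fails this probe; passing such a comparator to `std::sort` is undefined behaviour, and the debug mode is designed to catch it.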

Adds tests for `inf` and `nan` values to the tests for `strfrom*()` functions. Also marks some variables as `[[maybe_unused]]` to silence unused variable warnings.

This can be adjusted during runtime and it may impact the memory footprint if it's set to a big value or is disabled.

General cleanup in LangRef (and two outdated comments in LLParser.cpp) with the aim of making it easier to understand some of the terminology and subtle idiosyncrasies related to metadata in the IR. I'm still not happy with the fact that "node" is used both informally and with a particular category of metadata in mind, depending on the context. This also bleeds into the type names in the implementation. There are also several places where names from the implementation appear in the document with no other context or definition. In some cases I added a parenthetical to section titles to tie the two together, but I don't think this is ideal. I also think it might be useful to define the "abstract" metadata classes like "DIScope" in the document, so the hierarchy of metadata node kinds is direct, and so we can avoid repetitive descriptions of all of the members of one part of the hierarchy. This inheritance doesn't have to be in terms of C++ classes, but using the same names as the implementation seems helpful, and we already do it for many other things. Finally I added sections for the specialized nodes which are implemented in the IR but didn't have documentation in LangRef yet. These could use some work, and I admit I didn't dig too deep into the specifics beyond enumerating the fields, but I think we would ideally always have a LangRef section for every kind of node.

…pes (llvm#82705)" This reverts commit ca4c4a6. This was intended not to introduce new consistency diagnostics for smart pointer types, but failed to ignore sugar around types when detecting this. Fixed and test added.

llvm#87001) I tried to add representative tests while not duplicating complete coverage. If there are other tests you'd like to see, let me know.

Our CI system makes the source tree read-only. The 'cp' command that copies a directory from the source tree into a temp directory preserves permissions, and the copied files stay read-only. When the test tries to append to one of these files, it fails with a "permission denied" error.

…ofiles (llvm#83507) - Add pointers to code for source of truth. - Move necessary details from doc to code.

This is a follow-up to the profile format change in llvm#82711

In the StackDepot::isValid function, there is work to validate the TabMask variable. Unfortunately, if TabMask is set to the maximum allowed value, TabSize = TabMask + 1 becomes zero and validation passes. Disallow that case to prevent invalid reads into the Tab structure.
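The wraparound can be sketched as follows, assuming a 32-bit mask and a power-of-two size check. Both assumptions, and the names `tabSizeFor`, `isValidNaive` and `isValidStrict`, are hypothetical; this is not the scudo source.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical model: the table size is derived from the mask, as in the
// description above. Unsigned arithmetic wraps, so the maximum mask gives 0.
constexpr std::uint32_t tabSizeFor(std::uint32_t TabMask) {
  return static_cast<std::uint32_t>(TabMask + 1u);
}

// Naive validation (assumed shape): accepts any size with at most one bit
// set. A wrapped size of 0 also satisfies this test.
constexpr bool isValidNaive(std::uint32_t TabMask) {
  return (tabSizeFor(TabMask) & (tabSizeFor(TabMask) - 1u)) == 0u;
}

// Stricter validation: additionally reject the maximum mask, so the size
// can never wrap to zero.
constexpr bool isValidStrict(std::uint32_t TabMask) {
  return TabMask != std::numeric_limits<std::uint32_t>::max() &&
         isValidNaive(TabMask);
}
```

With the maximum mask, the naive check passes even though the derived size is zero, while the strict check rejects it.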

node.

represented by bitwidth without analysis. Need to check that initial ext/trunc nodes can be safely represented using calculated bitwidth before applying it.

Also move another comment to the correct place.

In python3.11 there is a new environment variable PYTHONSAFEPATH which stops python from setting the current directory as the first entry in sys.path. Bazel started setting this to ensure that python targets don't accidentally access things that aren't in their dependency tree. This resulted in lit tests breaking because sys.path didn't include the directory to the lit source files. This is fixed by adding the lit binary to the dependency tree and propagating the import path from it. Fixes llvm#75963

Move all the stale profile matching code into new files so that it can be shared for unit testing.

size_t in PatchItem eliminates the need for casts.

…r than Source Type (llvm#86941) We currently check if the source and promoted types are not equal before generating truncate instructions. This does not work for RV64 where the promoted type is i64 and this led to a crash due to the generation of truncate instructions from i32 to i64. Fixes llvm#86400

This adds the ability to create a Scalar from an APFloat, and to create an APFloat from an APSInt or another APFloat.

This patch adds the nuw (no unsigned wrap) and nsw (no signed wrap) poison-generating flags to the trunc instruction. Discourse thread: https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453
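As a rough illustration of what the two flags promise, the sketch below models an i32 to i8 truncation in C++. The helper names are hypothetical. The conditions follow the LangRef wording: with `nuw`, all truncated bits must be zero; with `nsw`, all truncated bits must equal the sign bit of the result.

```cpp
#include <cstdint>

// Hypothetical helpers modelling `trunc i32 %v to i8`.

// nuw holds when the 24 discarded high bits are all zero, i.e. the unsigned
// value already fits in 8 bits.
constexpr bool truncI32ToI8IsNuwSafe(std::uint32_t v) {
  return (v >> 8) == 0u;
}

// nsw holds when the discarded bits all match the sign bit of the result,
// i.e. the signed value already fits in the i8 range.
constexpr bool truncI32ToI8IsNswSafe(std::int32_t v) {
  return v >= -128 && v <= 127;
}
```

When a flagged `trunc` violates its condition, the result is poison; these helpers only state when that does not happen.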

by default For reduced BMI, it is meaningless to record the local declarations in functions if not required explicitly during the process of writing the function bodies. It wastes time for reduced BMI and may be problematic if we want to avoid transiting unnecessary changes.

chsigg and others added 30 commits March 27, 2024 06:07

[RISCV] Teach RISCVMakeCompressible handle byte/half load/store for Z…

22bfc58

…cb. (llvm#83375) For targets with Zcb, this patch makes llvm generate more compressed c.lb/lbu/lh/lhu/sb/sh instructions.

[NFC][HWASAN] Regenerate test

16993c7

[RISCV] Add areInlineCompatible for riscv target (llvm#86639)

05a7b22

Inline a callee if its target-features are a subset of the caller's target-features.

[clang][AST] Silence unused-value warnings in unittest DeclPrinterTest

577e0ef

Recommit "[RISCV][TTI] Scale the cost of the sext/zext with LMUL (llv…

aa2d5d5

…m#86617)" Changes in Recommit: Add an additional check on sign/zero extend to the same type. Original message: Use the destination data type to measure the LMUL size for latency/throughput cost

[InstCombine] Refactor powi(X,Y) / X to call foldPowiReassoc, NFC

2938f1c

[InstCombine] add restrict reassoc for the powi(X,Y) / X

bd9bb31

Add a reassoc restriction for powi(X,Y) / X, according to the discussion on PR69998.

[RISCV] Add test case to show missing vmerge fold on tied pseudos. NFC

f15b7de

Note we can't use vwaddu.wv because it will get combined away with llvm#78403

[llvm-dlltool][llvm-lib][COFF] Don't override NONAME exports with dem…

c9d1266

…angled ARM64EC symbols. (llvm#86722)

AMDGPU: Fix dead check prefixes in test

ef316da

Reland [AMDGPU] MCExpr-ify MC layer kernel descriptor (llvm#86494)

1103a2a

Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr. Relands llvm#80855 with fixes

[gn build] Port 1103a2a

408c365

[DAG] visitSub - reuse existing SDLoc instead of regenerating it. NFC.

51388fb

[DAG] foldAddSubOfSignBit - reuse existing SDLoc instead of regenerat…

9247f31

…ing it. NFC.

[X86] Add combineExtractFromVectorLoad helper - pulled out of combine…

875aed1

…ExtractVectorElt Prep work for llvm#85419 to make it easier to reuse in other combines

[X86] masked_store.ll - add nounwind to remove cfi noise

e82765b

[analyzer][docs] Document the optin.performance.Padding checker (ll…

b8cc838

…vm#86411) Closes llvm#73675 Co-authored-by: Balazs Benics <benicsbalazs@gmail.com> Co-authored-by: NagyDonat <donat.nagy@ericsson.com>

[NFC][TableGen][GlobalISel] Move MIR pattern parsing out of combiner (l…

4f9aab2

…lvm#86789) Reland of cfa0833

[APINotes] Upstream the remaining API Notes fixes and tests

932949d

This upstreams the last bits of Clang API Notes functionality that is currently implemented in the Apple fork: https://github.com/apple/llvm-project/tree/next/clang/lib/APINotes

[gn build] Port 4f9aab2

b343b02

lukel97 and others added 23 commits March 29, 2024 05:55

[AArch64][GISEL] Consider fcmp true and fcmp false in cond code selec…

c482fad

…tion (llvm#86972) Fixes llvm#86917 `FCMP_TRUE` and `FCMP_FALSE` were previously not considered and we ended up in an llvm_unreachable assertion.

[libc] Add inf/nan tests for strfrom*() functions (llvm#86663)

6373577

Adds tests for `inf` and `nan` values to the tests for `strfrom*()` functions. Also marks some variables as `[[maybe_unused]]` to silence unused variable warnings.

[scudo] Dump ReleaseToOsIntervalMs (llvm#86887)

6b149f7

This can be adjusted during runtime and it may impact the memory footprint if it's set to a big value or is disabled.

[RISCV] Extend pattern matches involving shNadd to support disjoint or (

9ea0396

llvm#87001) I tried to add representative tests while not duplicating complete coverage. If there are other tests you'd like to see, let me know.

[nfc][docs]Generalize header description and ascii art for indexed pr…

d0b4780

…ofiles (llvm#83507) - Add pointers to code for source of truth. - Move necessary details from doc to code.

[docs][TypeProf]Update instrumentation file format document (llvm#83309)

07a1fbe

This is a follow-up to the profile format change in llvm#82711

[SLP][NFC]Add a test with the incorrect sign extension of first ext

338be79

node.

[SLP]Fix PR87011: Do not assume that initial ext/trunc nodes can be

01e02e0

represented by bitwidth without analysis. Need to check that initial ext/trunc nodes can be safely represented using calculated bitwidth before applying it.

[NFC] [HWASan] add example for ring buffer wrap (llvm#87029)

39e8137

Also move another comment to the correct place.

[SampleFDO][NFC] Refactoring SampleProfileMatcher (llvm#86988)

1d99d7a

Move all the stale profile matching code into new files so that it can be shared for unit testing.

[ProfileData] Use size_t in PatchItem (NFC) (llvm#87014)

c64a328

size_t in PatchItem eliminates the need for casts.

[LLDB] Add APFloat helper functions to Scalar class. (llvm#86862)

ba6b2d2

This adds the ability to create a Scalar from an APFloat, and to create an APFloat from an APSInt or another APFloat.

[IR] Add nowrap flags for trunc instruction (llvm#85592)

7d3924c

This patch adds the nuw (no unsigned wrap) and nsw (no signed wrap) poison-generating flags to the trunc instruction. Discourse thread: https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453

[AutoBump] Merge with 235d684

28cc244

cferry-AMD requested review from TinaAMD and josel-amd August 19, 2024 09:05

josel-amd approved these changes Aug 19, 2024

TinaAMD removed their request for review August 20, 2024 08:41

Base automatically changed from bump_to_c6d419c1 to feature/fused-ops August 20, 2024 08:47

cferry-AMD merged commit 0b49c20 into feature/fused-ops Aug 20, 2024
6 checks passed

cferry-AMD deleted the bump_to_235d6841 branch August 20, 2024 10:17
