[AutoBump] Merge with 4ab73549 (Jun 04) (60) #321

mgehre-amd · 2024-08-31T20:14:23Z

No description provided.

goma is deprecated and not maintained anymore. https://chromium.googlesource.com/infra/goma/client/

…ead of std::vector by value. Avoid std::vector copies as setDefaultProperties just iterates across the Records Fixes llvm#89207

…#93815) After recent improvements (llvm#80029) and testing on open-source projects, the checker is ready to move out of the alpha package.

…#91472) SCEVLoopGuardRewriter only replaces operands with equivalent values, so we should be able to transfer the flags from the original expression. PR: llvm#91472

Similar to the change previously made for binops, make m_Trunc() only match instructions, not constant expressions. This is more likely to cause a crash than do something useful. Fixes crash reported at: llvm#92885 (comment)

llvm#94203) In llvm#94167 I found out that `cwg28xx.cpp` has been running without `-pedantic-errors` and fixed that. This patch fixes that for the rest of the test suite. Only one test was affected with a trivial fix (warning was escalated to an error). I'm intentionally leaving out test for CWG2390, because it requires major surgery. It's addressed in llvm#94206.

Also mark the test as nounwind. The unwinding information does not appear to be pertinent to the original intent of the test.

Make sure this test is preserved when icmp constant expressions are removed.

…lvm#69704) Moving the body of member functions out-of-line makes sense for classes defined in implementation files too.

This PR fixes legalize info for G_BITREVERSE.

To make sure these are preserved when icmp constant expressions are removed.

We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging. The MinSize is set in 3 ways:

1. If global-merge-min-data-size is explicitly set, then it uses that value.
2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used.
3. Otherwise, 0 is used, which means all sizes are considered for merging.

We found that this feature allowed us to see the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. This feature allowed us to enable the GlobalMerge pass on RISC-V in our downstream by default because it led to improvements on multiple benchmark suites. I plan to post a separate patch to propose enabling this by default on RISC-V. But I do not want that discussion to be part of the discussion of adding this feature, so I am keeping the patches separate.
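The three-way MinSize selection described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `chooseMinSize` and `isMergeCandidate` are hypothetical names, and the option and SmallDataLimit are modelled as plain parameters.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not taken from the patch.
// explicitMinSize: value of global-merge-min-data-size, if it was set explicitly.
// smallDataLimit:  the SmallDataLimit value, or 0 when it is unset.
std::uint64_t chooseMinSize(std::optional<std::uint64_t> explicitMinSize,
                            std::uint64_t smallDataLimit) {
  if (explicitMinSize)
    return *explicitMinSize;   // 1. an explicit option wins
  if (smallDataLimit != 0)
    return smallDataLimit + 1; // 2. skip everything up to SmallDataLimit
  return 0;                    // 3. all sizes are considered
}

// Data smaller than minSize is not considered for merging.
bool isMergeCandidate(std::uint64_t sizeInBytes, std::uint64_t minSize) {
  return sizeInBytes >= minSize;
}
```

With a SmallDataLimit of 8 and no explicit option, the sketch yields a MinSize of 9, so globals of 8 bytes or fewer are left alone.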

Noticed while triaging the failures on llvm#93673 - the attributor pass doesn't emit any range metadata in these tests

Check whether parsing of the argument failed before attempting to build the expression. Fixes llvm#80474.

Removed foo-registered-target constraints from a bunch of tests, because mostly the driver doesn't need to have a target available. I ran check-clang-driver using a build with only the XCore target, and these all passed. There are ~50 tests that still have foo-registered-target, and it looks like most of them are either doing codegen when they don't need to, or don't really belong in the Driver tests. But that's a task for another day.

…91469)" (llvm#94210) Reverts llvm@3bcccb6 and llvm@9a28272 because llvm#91469 causes a miscompilation llvm#91469 (comment).

When FMV was added to AArch64, it added a dependency expansion step after the -cc1 command line was parsed but before Sema, in AArch64TargetInfo::initFeatureMap. One effect of this is that -target-features specified on the -cc1 command line had some level of incomplete and broken dependency expansion. Since then, many tests have been added which depend on this behaviour. The dependency expansion can be considered broken at this stage because dependency expansion is already performed by the driver to generate the -target-feature flags using an ExtensionSet. This class does dependency evaluation and then generates a flattened representation of the dependency graph in the form of -target-features, which are passed to -cc1 in an arbitrary order (determined by the order of bits in the bitset). Any dependency expansion done after -cc1 will be inherently contradictory. It is impossible to accurately treat negative features once the dependency graph has been flattened and the order randomised. This patch fixes a large number of those tests, specifically ones where only a dependent feature (e.g. -target-feature +sme2p1) was added to the test -cc1 command, and not the necessary dependencies (e.g. -target-feature +sme). See PR llvm#93695 for further details.

…3692) Another bug fix for llvm#83628.

…or.shuffle` (llvm#93858) This PR tries to reland llvm#93595 which was reverted in llvm#93732 due to some issues.

The original PR:
- Add integration test for `vector.shuffle` and `vector.interleave`
- Add `VectorToSPIRV` patterns to `GPUToSPIRVPass`

Description of the issue:
- llvm#93595 (comment)
- Using either `vector.load` or `vector.store` in the kernel function will cause the validation layer to report an error
- Trying to bypass the issue by using `memref.load` and `memref.store` to load/store individual elements from/to the vectors, and populate the vectors using `vector.insertelement` and `vector.extractelement` instead.

…llvm#94204) Since llvm#80801 clang requires a template argument list after the use of the template keyword. https://lab.llvm.org/buildbot/#/builders/176/builds/10230 error: a template argument list is expected after a name prefixed by the template keyword [-Wmissing-template-arg-list-after-template-kw] This fixes the instances found by the AArch64 Linux builds.

Ensure that FormatStringConverter's constructor fails with a sensible error message rather than asserting if the format string is not a narrow string literal. Also, ensure that we don't even get that far in modernize-use-std-print and modernize-use-std-format by checking that the format string parameter is a char pointer. Fixes llvm#92896

Fix typos in AGGRESIVE-->AGGRESSIVE + WAYAGGRESIVE->WAYAGGRESSIVE This also exposed an issue that the WAYAGGRESSIVE run removed a block entirely, so the LABEL check was silently failing. Noticed while triaging the failures on llvm#93673

…ndidates() (llvm#90260) This reduces the time complexity of the main loop of the `findCandidates()` method from $O(n^2)$ to $O(n \log n)$. For small $n$, the modification does not regress the build time, but it helps significantly when $n$ is large. For one application, this reduces the runtime of the main loop from 120 seconds to 28 seconds. This is the first commit for an enhanced version of machine outliner -- see [RFC](https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-1-fulllto-part-2-thinlto-nolto-to-come/78732).

As noted in llvm#93796 (comment), a better way to teach RISCVInsertVSETVLI to work without LiveIntervals is to set VNInfo to nullptr and teach the various methods to handle it. We should try that approach first, so we no longer need this pre-commit patch. This reverts commit 4b4d366.

The std::min behaves like 'a<b?a:b', which does not match libstdc++/libc++ behavior like 'b<a?b:a' when input is NaN. Make it consistent with libstdc++/libc++. Fixes: llvm#93962 Fixes: ROCm/HIP#3502
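The difference between the two comparison forms only shows up when one input is NaN, because every ordered comparison with NaN is false. A minimal sketch, assuming plain host C++ rather than the CUDA/HIP wrapper header itself; `minOld` and `minStd` are illustrative names, not functions from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// The 'a<b?a:b' form: returns b whenever the comparison is false.
template <typename T> T minOld(T a, T b) { return a < b ? a : b; }

// The 'b<a?b:a' form: returns a whenever the comparison is false.
template <typename T> T minStd(T a, T b) { return b < a ? b : a; }

// Helper for the examples: a quiet NaN of type float.
inline float quietNaN() { return std::numeric_limits<float>::quiet_NaN(); }
```

With a NaN first argument, `minOld(quietNaN(), 1.0f)` yields `1.0f` while `minStd(quietNaN(), 1.0f)` yields NaN; with a NaN second argument the results swap. The second form returns its first argument whenever the comparison fails, which is the behavior the commit message attributes to libstdc++/libc++.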

Reorganize PPCInstrP10.td based on comment llvm#92543 (comment). Instructions and patterns defined by the same predicates are currently placed at several different locations; this patch groups them together by predicate.

…lvm#79875) The particular example that led to this is a very long chain of `UsingShadowDecl`s that we hit in our codebase in generated code. To avoid that, check for stack exhaustion when deserializing the declaration. At that point, we can point to the source location of the particular declaration that is being deserialized.

Replace argmemonly readonly with memory(argmem: read).

FirstCand is a reference to RepeatedSequenceLocs[0]. However, that vector is being modified a lot throughout the function, including one place that reassigns the whole vector. I'm not sure whether this can really happen in practice, but it doesn't seem unlikely that this could lead to a use-after-free. Avoid this by directly using RepeatedSequenceLocs[0] at the start of the function (as a lot of other places already do) and only creating FirstCand at the end where no more modifications take place.

…complete codegen line Noticed while triaging the failures on llvm#93673

) The revision unrolls vector.bitcast like:

```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```

to

```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```

The scalable vector is not supported because of the limitation of `vector::createUnrollIterator`. The targetRank could mismatch the final rank during unrolling; there is no direct way to query what the final rank is from the object.

…llvm#94149)
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly

…bcxx/libcxxabi/libunwind. -fvisibility-global-new-delete-hidden is deprecated and clang was warning about it on every build command. These libraries are always built using a stage2 compiler, so we can use the new build flag unconditionally. Reviewers: aeubanks Reviewed By: aeubanks Pull Request: llvm#88459

…achine block address taken. (llvm#94296) These blocks usually show up in the form of branches within inline assembly. Since it's hard to rewire them, we fully omit paths with such blocks from path cloning.

…m#93775) To support the third parameter of the alignment directive, R_LARCH_ALIGN relocations need a non-zero symbol index. In many cases we don't need the third parameter and can set the symbol index to 0. This patch will remove a lot of .Lla-relax-align* symbols and mitigate the size regression due to llvm#72962. Co-authored-by: Jinyang He <hejinyang@loongson.cn> Co-authored-by: Weining Lu <luweining@loongson.cn>

Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would always be enabled in this case; if it is not, `TTI` does not exist. Introduced in 7652a59

It should preserve more analysis results, but it happens immediately after instruction selection.

…632a (llvm#94301)

The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.

* Improve the condition type requirement description ('scalar' -> signless i1), to match what is actually verified.
* Use the `I1` type predicate instead of `AnyBooleanTypeMatch`.

Related discussion: llvm#93351 (comment).

…m#94304) This manifests as `AddressSanitizer: stack-use-after-return` w/o this change. The `~CheckFEnv()` method of checking fenv seems to only work for test fixtures.

…()` (llvm#93927) Fixes a crash uncovered by [pr89651](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/pr89651.f90) in the test suite. Fixes a crash caused by missing handling of `omp.private` ops in `FirOpBuilder::getAllocaBlock()`.

…ing (llvm#94285) A cycle profile of a thin link showed a lot of time spent in sort called from the BitcodeWriter, which was being used to compute the unique references to stack ids in the summaries emitted for each backend in a distributed thinlto build. We were also frequently invoking lower_bound to locate stack id indices in the resulting vector when writing out the referencing memprof records. Change this to use a map to uniquify the references, and to hold the index of the corresponding stack id in the StackIds vector, which is now populated at the same time. This reduced the time of a large thin link by about 10%.
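The general idea of replacing "sort, unique, then `lower_bound`" with a map that records each id's index while the vector is filled can be sketched as below. This is an illustration under assumptions, not the BitcodeWriter code: `StackIdTable` and `getOrInsert` are hypothetical names, and `std::unordered_map` stands in for whichever map type the patch uses.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical table: keeps unique stack ids in first-seen order and
// remembers the index of each id, so later lookups need no binary search.
struct StackIdTable {
  std::vector<std::uint64_t> ids;                      // unique ids, in insertion order
  std::unordered_map<std::uint64_t, unsigned> indexOf; // id -> position in ids

  // Returns the index of id, appending it to ids on first sight.
  unsigned getOrInsert(std::uint64_t id) {
    auto [it, inserted] =
        indexOf.try_emplace(id, static_cast<unsigned>(ids.size()));
    if (inserted)
      ids.push_back(id);
    return it->second;
  }
};
```

Each reference costs one expected constant-time map operation, and the vector and the index are populated in the same pass.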

Sink vscale calls as well when indvars is not widened (-indvars-widen-indvars=false).

…lable vector type. (llvm#93406) FunctionStackPoisoner does not handle `AllocaInst` with a scalable vector type, but it does not filter out struct types containing scalable vectors, which were introduced by c8eb535.

The old use of must-be-executed-context (MBEC) did propagate through calls even if that was not allowed. We now only propagate from call site arguments. If there are calls/intrinsics that allow propagation, we need to add them explicitly.

Fixes: llvm#78507

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>

A cycle profile showed that we were spending a lot of time invoking MapVector::erase. According to https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h, erasing elements one at a time is very inefficient for MapVector and it is better to use remove_if. This change resulted in around 7% time reduction on a large thin link. While here remove an unused function that also invokes erase on MapVectors.
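The cost difference comes from vector-backed storage: erasing one element shifts everything after it, so erasing many elements one at a time is quadratic, while a single predicate-driven pass compacts the survivors once. The sketch below shows the single-pass pattern on a plain `std::vector` of key/value pairs as a stand-in; it does not use `llvm::MapVector`, and `dropZeroValues` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Entry = std::pair<int, int>; // (key, value), stand-in for a MapVector entry

// Removes all entries whose value is zero in one pass:
// std::remove_if compacts the survivors, then erase drops the tail once.
void dropZeroValues(std::vector<Entry> &entries) {
  entries.erase(std::remove_if(entries.begin(), entries.end(),
                               [](const Entry &e) { return e.second == 0; }),
                entries.end());
}
```

The relative order of the surviving entries is preserved, which matters for containers that promise insertion-order iteration.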

Similar to rust-lang/rust#125007

With the change in 2fa0591 we can now use a range for loop.

Move functionality for patching build ID into a separate rewriter class and change the way we do the patching. Support build ID in different note sections in order to update the build ID in the Linux kernel binary, which puts it into the ".notes" section instead of ".note.gnu.build-id".

…93206) This patch picks up llvm#78598 with the hope that we can address such crashes in `tryCaptureVariable()` for unevaluated lambdas. In addition to `tryCaptureVariable()`, this also contains several other fixes on e.g. lambda parsing/dependencies. Fixes llvm#63845 Fixes llvm#67260 Fixes llvm#69307 Fixes llvm#88081 Fixes llvm#89496 Fixes llvm#90669 Fixes llvm#91633

…llvm#94045) In some cases (see iree-org/iree#16285), `memref.subview` ops can't be folded into transfer ops and sub-byte type emulation fails. This issue has been blocking a few things, including the enablement of vector flattening transformations (iree-org/iree#16456). This PR extends the existing sub-byte type emulation support of `memref.subview` to handle multi-dimensional subviews with dynamic offsets and addresses the issues for some of the `memref.subview` cases that can't be folded. Co-authored-by: Diego Caballero <diegocaballero@google.com>

atetubou and others added 30 commits June 3, 2024 07:30

remove goma support from clang (llvm#93942)

9a7bd8a

goma is deprecated and not maintained anymore. https://chromium.googlesource.com/infra/goma/client/

[TableGen] CodeGenIntrinsic - pass DefaultProperties as ArrayRef inst…

72c901f

…ead of std::vector by value. Avoid std::vector copies as setDefaultProperties just iterates across the Records Fixes llvm#89207

[clang][analyzer] Move unix.BlockInCriticalSection out of alpha (llvm…

6ef785c

…#93815) After recent improvements (llvm#80029) and testing on open-source projects, the checker is ready to move out of the alpha package.

[SCEV] Preserve flags in SCEVLoopGuardRewriter for add and mul. (llvm…

4812e9a

…#91472) SCEVLoopGuardRewriter only replaces operands with equivalent values, so we should be able to transfer the flags from the original expression. PR: llvm#91472

[PatternMatch] Do not match constant expression trunc

bda8d1a

Similar to the change previously made for binops, make m_Trunc() only match instructions, not constant expressions. This is more likely to cause a crash than do something useful. Fixes crash reported at: llvm#92885 (comment)

[AArch64] Generate test checks (NFC)

cee6e81

Also mark the test as nounwind. The unwinding information does not appear to be pertinent to the original intent of the test.

[Tests] Move test from Assembler to InstSimplify (NFC)

e8ff03b

Make sure this test is preserved when icmp constant expressions are removed.

[clangd] Allow "move function body out-of-line" in non-header files (l…

955c223

…lvm#69704) Moving the body of member functions out-of-line makes sense for classes defined in implementation files too.

[SPIR-V] Fix legalize info for G_BITREVERSE (llvm#93699)

5ff993a

This PR fixes legalize info for G_BITREVERSE.

[Tests] Move some tests from Assembler to InstSimplify (NFC)

2f1229e

To make sure these are preserved when icmp constant expressions are removed.

[Attributor] Remove unused metadata checks from liveness tests

7952720

Noticed while triaging the failures on llvm#93673 - the attributor pass doesn't emit any range metadata in these tests

[Clang] Fix crash on improper use of __array_extent (llvm#94173)

27fe526

Check whether parsing of the argument failed before attempting to build the expression. Fixes llvm#80474.

Revert "[Reassociate] Drop weight reduction to fix issue 91417 (llvm#…

22b63b9

…91469)" (llvm#94210) Reverts llvm@3bcccb6 and llvm@9a28272 because llvm#91469 causes a miscompilation llvm#91469 (comment).

[X86][AMX] Check also AMX register live out for copy lowering (llvm#9…

8aa33f1

…3692) Another bug fix for llvm#83628.

[CUDA][HIP] Fix std::min in wrapper header (llvm#93976)

987e1b2

The std::min behaves like 'a<b?a:b', which does not match libstdc++/libc++ behavior like 'b<a?b:a' when input is NaN. Make it consistent with libstdc++/libc++. Fixes: llvm#93962 Fixes: ROCm/HIP#3502

[LangRef] Remove mention of argmemonly (NFC)

4023f4e

Replace argmemonly readonly with memory(argmem: read).

[llvm-reduce] reduce-register-defs.mir - fix check prefix typo and in…

26ee42a

…complete codegen line Noticed while triaging the failures on llvm#93673

hanhanW and others added 26 commits June 3, 2024 16:39

Reland "[NewPM][CodeGen] Port selection dag isel to new pass manager" (…

7652a59

…llvm#94149)
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly

[Codegen, BasicBlockSections] Avoid cloning blocks which have their m…

8ec1161

…achine block address taken. (llvm#94296) These blocks usually show up in the form of branches within inline assembly. Since it's hard to rewire them, we fully omit paths with such blocks from path cloning.

[CodeGen] Fix compiler conditional combination (llvm#94297)

cac5d0e

Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would always be enabled in this case; if it is not, `TTI` does not exist. Introduced in 7652a59

[NewPM][CodeGen] Port finalize-isel to new pass manager (llvm#94214)

9b0e1c2

It should preserve more analysis results, but it happens immediately after instruction selection.

Fix lsda-section-name adding back RUN line incorrectly removed in 6ef…

c7b7875

…632a (llvm#94301)

[PowerPC] Remove DAG matching in ADDIStocHA (llvm#93905)

4d20f49

The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.

[mlir][arith] Further clean up select op definition (llvm#93358)

85e4e9d

* Improve the condition type requirement description ('scalar' -> signless i1), to match what is actually verified.
* Use the `I1` type predicate instead of `AnyBooleanTypeMatch`.

Related discussion: llvm#93351 (comment).

[libc][test] Fix TEST->TEST_F typo in getenv_and_setenv_test.cpp (llv…

392ca64

…m#94304) This manifests as `AddressSanitizer: stack-use-after-return` w/o this change. The `~CheckFEnv()` method of checking fenv seems to only work for test fixtures.

[AArch64] Sink llvm.vscale.i32 into blocks for better isel (llvm#93465)

acfc79d

Sink vscale calls as well when indvars is not widened (-indvars-widen-indvars=false).

[Asan] Teach FunctionStackPoisoner to filter out struct type with sca…

e9dd6b2

…lable vector type. (llvm#93406) FunctionStackPoisoner does not handle `AllocaInst` with a scalable vector type, but it does not filter out struct types containing scalable vectors, which were introduced by c8eb535.

[test] Fix filecheck annotation typos (llvm#91854)

fa72a02

Similar to rust-lang/rust#125007

[MemProf][NFC] Use range for loop (llvm#94308)

4973ad4

With the change in 2fa0591 we can now use a range for loop.

[gn build] Port 8ea59ec

4c416a9

[mlir][bazel] Fix BUILD after d041343.

22dcdcc

[mlir][bazel] Really fix BUILD after d041343.

4ab7354

[AutoBump] Merge with 4ab7354 (Jun 04)

75dd1f5

cferry-AMD approved these changes Sep 2, 2024

Base automatically changed from bump_to_12fcca0a to feature/fused-ops September 11, 2024 12:08

mgehre-amd merged commit 75dd1f5 into feature/fused-ops Sep 11, 2024
5 checks passed

mgehre-amd deleted the bump_to_4ab73549 branch September 11, 2024 12:08
