forked from llvm/llvm-project
[AutoBump] Merge with 4ab73549 (Jun 04) (60) #321
Merged
goma is deprecated and not maintained anymore. https://chromium.googlesource.com/infra/goma/client/
…ead of std::vector by value. Avoid std::vector copies as setDefaultProperties just iterates across the Records Fixes llvm#89207
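A minimal sketch of the idea, assuming a callee that only iterates over the container. `Record` and `countNamedRecords` are hypothetical stand-ins for illustration, not the real TableGen types or the actual `setDefaultProperties` signature.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a record type; not the real llvm::Record.
struct Record {
  std::string Name;
};

// The callee only reads the elements, so it takes the vector by const
// reference. Taking it by value would copy every element on each call.
std::size_t countNamedRecords(const std::vector<Record> &Records) {
  std::size_t Count = 0;
  for (const Record &R : Records)
    if (!R.Name.empty())
      ++Count;
  return Count;
}
```

The call site is unchanged by such a signature change; only the copy disappears.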
…#93815) After recent improvements (llvm#80029) and testing on open-source projects, the checker is ready to move out of the alpha package.
…#91472) SCEVLoopGuardRewriter only replaces operands with equivalent values, so we should be able to transfer the flags from the original expression. PR: llvm#91472
Similar to the change previously made for binops, make m_Trunc() only match instructions, not constant expressions. This is more likely to cause a crash than do something useful. Fixes crash reported at: llvm#92885 (comment)
llvm#94203) In llvm#94167 I found out that `cwg28xx.cpp` has been running without `-pedantic-errors` and fixed that. This patch fixes that for the rest of the test suite. Only one test was affected with a trivial fix (warning was escalated to an error). I'm intentionally leaving out test for CWG2390, because it requires major surgery. It's addressed in llvm#94206.
Also mark the test as nounwind. The unwinding information does not appear to be pertinent to the original intent of the test.
Make sure this test is preserved when icmp constant expressions are removed.
…lvm#69704) Moving the body of member functions out-of-line makes sense for classes defined in implementation files too.
This PR fixes legalize info for G_BITREVERSE.
To make sure these are preserved when icmp constant expressions are removed.
We add a feature that prevents the GlobalMerge pass from considering data smaller than a minimum size in bytes for merging. The MinSize is set in 3 ways:
1. If global-merge-min-data-size is explicitly set, then it uses that value.
2. If SmallDataLimit is set and non-zero, then SmallDataLimit + 1 is used.
3. Otherwise, 0 is used, which means all sizes are considered for merging.
We found that this feature allowed us to see the benefit of the GlobalMerge pass while eliminating some merging that was not beneficial. This feature allowed us to enable the GlobalMerge pass on RISC-V in our downstream by default because it led to improvements on multiple benchmark suites. I plan to post a separate patch to propose enabling this by default on RISC-V. But I do not want that discussion to be part of the discussion of adding this feature, so I am keeping the patches separate.
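The three MinSize rules described above can be sketched as a small selection function. This is an illustration under assumptions, not the code in the pass: `computeMinSize` and `isMergeCandidate` are hypothetical names, and the explicit option is modelled as a flag plus a value.

```cpp
#include <cassert>

// Hypothetical helper mirroring the three rules from the commit message.
// HasExplicitMinSize/ExplicitMinSize model an explicitly set
// global-merge-min-data-size; SmallDataLimit == 0 models "not set".
unsigned computeMinSize(bool HasExplicitMinSize, unsigned ExplicitMinSize,
                        unsigned SmallDataLimit) {
  if (HasExplicitMinSize)
    return ExplicitMinSize;    // Rule 1: the explicit option wins.
  if (SmallDataLimit != 0)
    return SmallDataLimit + 1; // Rule 2: skip data within the small-data limit.
  return 0;                    // Rule 3: every size is considered.
}

// Data smaller than MinSize is not considered for merging.
bool isMergeCandidate(unsigned SizeInBytes, unsigned MinSize) {
  return SizeInBytes >= MinSize;
}
```

With a small-data limit of 8, for example, the sketch yields a MinSize of 9, so an 8-byte global is skipped while a 9-byte global is still a candidate.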
Noticed while triaging the failures on llvm#93673 - the attributor pass doesn't emit any range metadata in these tests
Check whether parsing of the argument failed before attempting to build the expression. Fixes llvm#80474.
Removed foo-registered-target constraints from a bunch of tests, because mostly the driver doesn't need to have a target available. I ran check-clang-driver using a build with only the XCore target, and these all passed. There are ~50 tests that still have foo-registered-target, and it looks like most of them are either doing codegen when they don't need to, or don't really belong in the Driver tests. But that's a task for another day.
…91469)" (llvm#94210) Reverts llvm@3bcccb6 and llvm@9a28272 because llvm#91469 causes a miscompilation llvm#91469 (comment).
When FMV was added to AArch64, it added a dependency expansion step after the -cc1 command line was parsed but before Sema, in AArch64TargetInfo::initFeatureMap. One effect of this is that -target-features specified on the -cc1 command line had some level of incomplete and broken dependency expansion. Since then, many tests have been added which depend on this behaviour. The dependency expansion can be considered broken at this stage because dependency expansion is already performed by the driver to generate the -target-feature flags using an ExtensionSet. This class does dependency evaluation and then generates a flattened representation of the dependency graph in the form of -target-features, which are passed to -cc1 in an arbitrary order (determined by the order of bits in the bitset). Any dependency expansion done after -cc1 will be inherently contradictory. It is impossible to accurately treat negative features once the dependency graph has been flattened and the order randomised. This patch fixes a large number of those tests, specifically ones where only a dependent feature (e.g. -target-feature +sme2p1) was added to the test -cc1 command, and not the necessary dependencies (e.g. -target-feature +sme). See PR llvm#93695 for further details.
…3692) Another bug fix for llvm#83628.
…or.shuffle` (llvm#93858) This PR tries to reland llvm#93595 which was reverted in llvm#93732 due to some issues.
The original PR:
- Add integration test for `vector.shuffle` and `vector.interleave`
- Add `VectorToSPIRV` patterns to `GPUToSPIRVPass`
Description of the issue:
- llvm#93595 (comment)
- Using either `vector.load` or `vector.store` in the kernel function will cause the validation layer to report an error
- Trying to bypass the issue by using `memref.load` and `memref.store` to load/store individual elements from/to the vectors, and populate the vectors using `vector.insertelement` and `vector.extractelement` instead.
…llvm#94204) Since llvm#80801 clang requires a template argument list after the use of the template keyword. https://lab.llvm.org/buildbot/#/builders/176/builds/10230 error: a template argument list is expected after a name prefixed by the template keyword [-Wmissing-template-arg-list-after-template-kw] This fixes the instances found by the AArch64 Linux builds.
Ensure that FormatStringConverter's constructor fails with a sensible error message rather than asserting if the format string is not a narrow string literal. Also, ensure that we don't even get that far in modernize-use-std-print and modernize-use-std-format by checking that the format string parameter is a char pointer. Fixes llvm#92896
Fix typos in AGGRESIVE-->AGGRESSIVE + WAYAGGRESIVE->WAYAGGRESSIVE This also exposed an issue that the WAYAGGRESSIVE run removed a block entirely, so the LABEL check was silently failing. Noticed while triaging the failures on llvm#93673
…ndidates() (llvm#90260) This reduces the time complexity of the main loop of the `findCandidates()` method from $O(n^2)$ to $O(n \log n)$. For small $n$, the modification does not regress the build time, but it helps significantly when $n$ is large. For one application, this reduces the runtime of the main loop from 120 seconds to 28 seconds. This is the first commit for an enhanced version of machine outliner -- see [RFC](https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-1-fulllto-part-2-thinlto-nolto-to-come/78732).
As noted in llvm#93796 (comment), a better way to teach RISCVInsertVSETVLI to work without LiveIntervals is to set VNInfo to nullptr and teach the various methods to handle it. We should try that approach first, so we no longer need this pre-commit patch. This reverts commit 4b4d366.
The std::min behaves like 'a<b?a:b', which does not match libstdc++/libc++ behavior like 'b<a?b:a' when input is NaN. Make it consistent with libstdc++/libc++. Fixes: llvm#93962 Fixes: ROCm/HIP#3502
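The difference between the two forms only shows up when the comparison is false for both orderings, which is the case when one input is NaN. A small sketch, with hypothetical function names chosen for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// 'a < b ? a : b': when the comparison is false (for example because one
// operand is NaN), the second argument is returned.
double minReturningSecond(double A, double B) { return A < B ? A : B; }

// 'b < a ? b : a': the form the commit message attributes to
// libstdc++/libc++. When the comparison is false, the first argument is
// returned.
double minReturningFirst(double A, double B) { return B < A ? B : A; }
```

Both forms agree on ordinary inputs; with `(NaN, 1.0)` the first form yields `1.0` and the second yields NaN, and with `(1.0, NaN)` the results are swapped.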
Reorganize PPCInstrP10.td based on the comment llvm#92543 (comment). Instructions and patterns defined under the same predicates are currently placed in several different locations; this patch groups them together by predicate.
…lvm#79875) The particular example that led to this is a very long chain of `UsingShadowDecl`s that we hit in our codebase in generated code. To avoid that, check for stack exhaustion when deserializing the declaration. At that point, we can point to the source location of the particular declaration that is being deserialized.
Replace argmemonly readonly with memory(argmem: read).
FirstCand is a reference to RepeatedSequenceLocs[0]. However, that vector is being modified a lot throughout the function, including one place that reassigns the whole vector. I'm not sure whether this can really happen in practice, but it doesn't seem unlikely that this could lead to a use-after-free. Avoid this by directly using RepeatedSequenceLocs[0] at the start of the function (as a lot of other places already do) and only creating FirstCand at the end where no more modifications take place.
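The hazard and the fix can be sketched with a plain `std::vector`. `Candidate` and `pruneAndGetFirstLength` are hypothetical, simplified stand-ins, and the filtering step is invented for illustration; only the pattern (index the vector directly while it may change, bind the reference at the end) follows the description above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified candidate type.
struct Candidate {
  unsigned Length;
};

// Precondition: RepeatedSequenceLocs is non-empty.
unsigned pruneAndGetFirstLength(std::vector<Candidate> &RepeatedSequenceLocs) {
  // While the vector may still change, read through the vector itself and
  // copy out what is needed instead of holding a reference to element 0.
  unsigned MinLength = RepeatedSequenceLocs[0].Length;

  // Reassigning the whole vector would leave any earlier reference dangling.
  std::vector<Candidate> Kept;
  for (const Candidate &C : RepeatedSequenceLocs)
    if (C.Length >= MinLength)
      Kept.push_back(C);
  RepeatedSequenceLocs = std::move(Kept);

  // No more modifications happen past this point, so a reference is safe.
  const Candidate &FirstCand = RepeatedSequenceLocs[0];
  return FirstCand.Length;
}
```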
…complete codegen line Noticed while triaging the failures on llvm#93673
) The revision unrolls vector.bitcast like:
```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```
to
```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```
The scalable vector is not supported because of the limitation of `vector::createUnrollIterator`. The targetRank could mismatch the final rank during unrolling; there is no direct way to query what the final rank is from the object.
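The same extract, bitcast, insert sequence can be modelled in plain C++ to show why unrolling along the leading dimension preserves the result. This is only an analogy under assumptions: `Row32`, `Row64`, `bitcastRow`, and `unrolledBitcast` are hypothetical names, and `std::memcpy` stands in for the bit reinterpretation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Row32 = std::array<std::uint32_t, 4>; // models vector<4xi32>
using Row64 = std::array<std::uint64_t, 2>; // models vector<2xi64>

// Models the 1-D vector.bitcast: reinterpret the same bits as 2 x i64.
Row64 bitcastRow(const Row32 &In) {
  static_assert(sizeof(Row32) == sizeof(Row64), "rows must have equal size");
  Row64 Out;
  std::memcpy(Out.data(), In.data(), sizeof(Out));
  return Out;
}

// Models the unrolled form: start from a zero-filled result, then for each
// row extract it, bitcast it, and insert it at the same position.
std::array<Row64, 2> unrolledBitcast(const std::array<Row32, 2> &In) {
  std::array<Row64, 2> Result{}; // zero-initialized, like dense<0>
  for (std::size_t I = 0; I < In.size(); ++I)
    Result[I] = bitcastRow(In[I]);
  return Result;
}
```

Because each row is reinterpreted independently, the row-by-row result holds the same bytes as reinterpreting the whole 2-D value at once.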
…llvm#94149)
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly
…bcxx/libcxxabi/libunwind. -fvisibility-global-new-delete-hidden is deprecated and clang was warning about it on every build command. These libraries are always built using a stage2 compiler, so we can use the new build flag unconditionally. Reviewers: aeubanks Reviewed By: aeubanks Pull Request: llvm#88459
…achine block address taken. (llvm#94296) These blocks usually show up in the form of branches within inline assembly. Since it's hard to rewire them, we fully omit paths with such blocks from path cloning.
…m#93775) To support the third parameter of the alignment directive, R_LARCH_ALIGN relocations need a non-zero symbol index. In many cases we don't need the third parameter and can set the symbol index to 0. This patch will remove a lot of .Lla-relax-align* symbols and mitigate the size regression due to llvm#72962. Co-authored-by: Jinyang He <hejinyang@loongson.cn> Co-authored-by: Weining Lu <luweining@loongson.cn>
Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would always be enabled in this case; if it is not, `TTI` does not exist. Introduced in 7652a59
It should preserve more analysis results, but it happens immediately after instruction selection.
The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.
* Improve the condition type requirement description ('scalar' -> signless i1), to match what is actually verified.
* Use the `I1` type predicate instead of `AnyBooleanTypeMatch`.
Related discussion: llvm#93351 (comment).
…m#94304) This manifests as `AddressSanitizer: stack-use-after-return` w/o this change. The `~CheckFEnv()` method of checking fenv seems to only work for test fixtures.
…()` (llvm#93927) Fixes a crash uncovered by [pr89651](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/pr89651.f90) in the test suite. Fixes a crash caused by missing handling of `omp.private` ops in `FirOpBuilder::getAllocaBlock()`.
…ing (llvm#94285) A cycle profile of a thin link showed a lot of time spent in sort called from the BitcodeWriter, which was being used to compute the unique references to stack ids in the summaries emitted for each backend in a distributed thinlto build. We were also frequently invoking lower_bound to locate stack id indices in the resulting vector when writing out the referencing memprof records. Change this to use a map to uniquify the references, and to hold the index of the corresponding stack id in the StackIds vector, which is now populated at the same time. This reduced the time of a large thin link by about 10%.
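A minimal sketch of the uniquify-with-a-map idea, using standard containers rather than the LLVM ones. `StackIdTable` and `getOrInsert` are hypothetical names; the actual BitcodeWriter data structures may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical table: the vector holds each stack id once, in first-seen
// order, and the map records the index of each id in that vector.
struct StackIdTable {
  std::vector<std::uint64_t> StackIds;
  std::unordered_map<std::uint64_t, unsigned> IndexOf;

  // Returns the index of Id, appending it on first sight. This replaces
  // "collect, sort, unique, then lower_bound per lookup" with a single
  // hash lookup per reference.
  unsigned getOrInsert(std::uint64_t Id) {
    auto Inserted = IndexOf.insert(
        std::make_pair(Id, static_cast<unsigned>(StackIds.size())));
    if (Inserted.second)
      StackIds.push_back(Id);
    return Inserted.first->second;
  }
};
```

Populating the vector at insertion time keeps the two structures consistent, so a later write-out step can look up an index without searching the vector.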
Sink vscale calls as well when indvars is not widened (-indvars-widen-indvars=false).
…lable vector type. (llvm#93406) FunctionStackPoisoner does not support `AllocaInst`s with scalable vector type, but it does not filter out struct types containing scalable vectors, which were introduced by c8eb535.
The old use of must-be-executed-context (MBEC) did propagate through calls even if that was not allowed. We now only propagate from call site arguments. If there are calls/intrinsics that allow propagation, we need to add them explicitly. Fixes: llvm#78507 --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
A cycle profile showed that we were spending a lot of time invoking MapVector::erase. According to https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h, erasing elements one at a time is very inefficient for MapVector and it is better to use remove_if. This change resulted in around 7% time reduction on a large thin link. While here remove an unused function that also invokes erase on MapVectors.
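The cost difference can be illustrated with a plain `std::vector` of key/value pairs, which is an assumption made to keep the sketch self-contained; it is not `llvm::MapVector`. Erasing matching elements one at a time shifts the tail on every erase, while a single `remove_if` pass compacts the survivors once. `Entry` and `pruneDeadEntries` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical entry type: (key, reference count).
using Entry = std::pair<int, int>;

// Removes every entry whose count is zero in one linear pass. Survivors
// keep their relative order, and the tail is trimmed once at the end.
void pruneDeadEntries(std::vector<Entry> &Entries) {
  Entries.erase(std::remove_if(Entries.begin(), Entries.end(),
                               [](const Entry &E) { return E.second == 0; }),
                Entries.end());
}
```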
With the change in 2fa0591 we can now use a range for loop.
Move functionality for patching build ID into a separate rewriter class and change the way we do the patching. Support build ID in different note sections in order to update the build ID in the Linux kernel binary, which puts it into the ".notes" section instead of ".note.gnu.build-id".
…93206) This patch picks up llvm#78598 with the hope that we can address such crashes in `tryCaptureVariable()` for unevaluated lambdas. In addition to `tryCaptureVariable()`, this also contains several other fixes on e.g. lambda parsing/dependencies. Fixes llvm#63845 Fixes llvm#67260 Fixes llvm#69307 Fixes llvm#88081 Fixes llvm#89496 Fixes llvm#90669 Fixes llvm#91633
…llvm#94045) In some cases (see iree-org/iree#16285), `memref.subview` ops can't be folded into transfer ops and sub-byte type emulation fails. This issue has been blocking a few things, including the enablement of vector flattening transformations (iree-org/iree#16456). This PR extends the existing sub-byte type emulation support of `memref.subview` to handle multi-dimensional subviews with dynamic offsets and addresses the issues for some of the `memref.subview` cases that can't be folded. Co-authored-by: Diego Caballero <diegocaballero@google.com>
cferry-AMD
approved these changes
Sep 2, 2024
No description provided.