forked from llvm/llvm-project
[AutoBump] Merge with b851c7f1 (Apr 17) (2) #295
Merged
…ze (llvm#83124) When in-place new-ing a local variable of an array of trivial type, the generated code now calls `memset` with the correct size of the array; previously it generated a size equal to the square of the typedef'd array size plus the array size. The cause: given `typedef TYPE TArray[8]; TArray x;`, the type of the declarator is TArray[8], and in `SemaExprCXX.cpp::BuildCXXNew` we check whether it is a typedef of constant size and, if so, retrieve the original type, which works fine for non-dependent cases. In the template case, however, we go through `TreeTransform.h:TransformCXXNewExpr`, where we again check the allocated type, which is TArray[8] and stays that way, so ArraySize=(TArray[8] type, alloc TArray[8*type]), hence the squared size allocation. ArraySize gets calculated earlier in `TreeTransform.h`, so the `if(!ArraySize)` condition was failing. Fix: I changed that condition to `if(ArraySize)`. Fixes llvm#41441
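The scenario described in this commit message can be reproduced with a small regression-style check. The following is a minimal sketch, not the test added by the patch: the helper name `bytesClobberedPastArray` and the guard region are hypothetical, and the array is placed in an oversized buffer (rather than a plain local `TArray x;`) so that an oversized `memset` is detected instead of corrupting the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical helper: value-initializes a typedef'd array in place from
// inside a template and reports how many bytes beyond the array changed.
// With a correct compiler the result is 0.
template <typename TYPE>
std::size_t bytesClobberedPastArray() {
  typedef TYPE TArray[8];

  // Guard region (an assumption of this sketch) large enough to absorb an
  // oversized memset without touching unrelated stack memory.
  constexpr std::size_t Guard = 16 * sizeof(TArray);
  alignas(TYPE) unsigned char storage[sizeof(TArray) + Guard];
  std::memset(storage, 0xFF, sizeof(storage));

  // Value-initialization of an array of trivial type should zero exactly
  // sizeof(TArray) bytes.
  TYPE *p = new (storage) TArray();
  assert(static_cast<void *>(p) == static_cast<void *>(storage));
  (void)p;

  // Count guard bytes that no longer hold the 0xFF fill pattern.
  std::size_t clobbered = 0;
  for (std::size_t i = sizeof(TArray); i < sizeof(storage); ++i)
    if (storage[i] != 0xFF)
      ++clobbered;
  return clobbered;
}
```

Calling `bytesClobberedPastArray<int>()` should return 0 when the size passed to `memset` matches `sizeof(TArray)`.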
Some very basic tests for a case where we could fold BLEND(PERMUTE(X),PERMUTE(Y)) -> PERMUTE(BLEND(X,Y)). These assume the permute masks are the same and "complete" (no undefs/duplicate elements), but we could relax that depending on the blend mask.
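To see why the fold is valid when both permutes share one complete mask, here is a scalar model of the two forms. It is a sketch only: `Vec`, `PermMask`, `BlendMask`, and all function names are hypothetical and unrelated to the X86 backend code. The one subtle point it shows is that the blend mask has to be remapped through the permute when the blend moves inside.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using PermMask = std::array<std::size_t, N>;  // result[i] = src[mask[i]]
using BlendMask = std::array<bool, N>;        // true selects the second operand

Vec permute(const Vec &src, const PermMask &mask) {
  Vec out{};
  for (std::size_t i = 0; i < N; ++i) {
    assert(mask[i] < N && "mask index out of range");
    out[i] = src[mask[i]];
  }
  return out;
}

Vec blend(const Vec &x, const Vec &y, const BlendMask &sel) {
  Vec out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = sel[i] ? y[i] : x[i];
  return out;
}

// Moves the blend mask through the permute: lane mask[i] of the inner blend
// must make the choice that lane i of the outer blend made. This is only
// well defined when mask is a complete permutation (no duplicates).
BlendMask remapBlendMask(const BlendMask &sel, const PermMask &mask) {
  BlendMask out{};
  for (std::size_t i = 0; i < N; ++i)
    out[mask[i]] = sel[i];
  return out;
}

// Original form: BLEND(PERMUTE(X), PERMUTE(Y)).
Vec blendOfPermutes(const Vec &x, const Vec &y, const PermMask &mask,
                    const BlendMask &sel) {
  return blend(permute(x, mask), permute(y, mask), sel);
}

// Folded form: PERMUTE(BLEND(X, Y)) with the remapped blend mask.
Vec permuteOfBlend(const Vec &x, const Vec &y, const PermMask &mask,
                   const BlendMask &sel) {
  return permute(blend(x, y, remapBlendMask(sel, mask)), mask);
}
```

With a duplicated mask element, `remapBlendMask` would write the same lane twice, which is one way to see why the tests assume complete masks.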
This should ensure we explore the same VFs as before 6d66db3. Fixes llvm#88640.
On macOS, file paths start with /Users/..., which clang-cl interprets as the /U switch followed by a preprocessor macro name to undefine. Put the filename after `--` to prevent this. For consistency, move %s to the end of the regular `clang` lines (where this isn't needed) as well.
This patch moves OpenMP-related entities out of `Sema` to a newly created `SemaOpenMP` class. This is a part of the effort to split `Sema` up, and follows the recent example of CUDA, OpenACC, SYCL, HLSL. Additional context can be found in llvm#82217, llvm#84184, llvm#87634.
Summary: AIX headers define this, so we need to work around it. In the future this will be removed but for now we should just rename it to avoid these issues.
…sions" and related commits (llvm#88884) The original change caused widespread breakages in msan/ubsan tests and causes `use-after-free`. Most likely we are adding more cleanups than necessary.
…llvm#87632) We had some instances when LLVM would not inline fixed-count memcpy and ended up attempting to lower it as a libcall, which would not work on AMDGPU as the address space doesn't meet the requirement, causing a compiler crash. The patch relaxes the threshold used for -Os/-Oz compilation so we're always allowed to inline memory copy functions. This patch basically does the same thing as https://reviews.llvm.org/D158226 for AMDGPU. Fix llvm#88497.
… smax/smin intrinsics. We need to check that an unsigned argument can be safely used in smax/smin intrinsics by checking that at least a single sign bit is cleared; otherwise its value may be treated as negative instead of positive.
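The hazard can be illustrated on plain 32-bit integers. This is a minimal sketch with hypothetical helper names, not the vectorizer code: a signed max over the same bit patterns only agrees with the unsigned max when the top bit of each operand is clear.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// True when the top bit is clear, so the value is the same whether the bits
// are read as signed or unsigned.
bool signBitIsClear(std::uint32_t v) { return (v >> 31) == 0; }

// Unsigned max, the result the original unsigned operands call for.
std::uint32_t umax(std::uint32_t a, std::uint32_t b) { return std::max(a, b); }

// Signed max over the same bit patterns, modelling what smax computes.
// The conversions are modular (two's complement) on the compilers LLVM
// supports and are guaranteed to be so from C++20 onwards.
std::uint32_t smaxOnBits(std::uint32_t a, std::uint32_t b) {
  const std::int32_t sa = static_cast<std::int32_t>(a);
  const std::int32_t sb = static_cast<std::int32_t>(b);
  return static_cast<std::uint32_t>(std::max(sa, sb));
}

// Using the signed operation for unsigned inputs is only safe after the
// sign-bit check has succeeded for both operands.
std::uint32_t smaxForUnsigned(std::uint32_t a, std::uint32_t b) {
  assert(signBitIsClear(a) && signBitIsClear(b) &&
         "operand would be treated as negative");
  return smaxOnBits(a, b);
}
```

For example, `smaxOnBits(0x80000000u, 1u)` yields 1 because the first operand reads as negative, whereas `umax(0x80000000u, 1u)` yields `0x80000000`.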
Use v4 of UTC to improve regex matching of argument names, fixing a FileCheck match in a future patch.
…ble, fix possible UB. Using a floating-point type in the compiler is not the best idea; here it is used in a comparison for equality with 0 and may cause undefined behavior in some cases. Reviewers: fhahn Reviewed By: fhahn Pull Request: llvm#87241
…nops tests from shuffle.ll
`self` clauses on compute constructs take an optional condition expression. We again limit the implementation to ONLY compute constructs to ensure we get all the rules correct for others. However, this one will be particularly complicated, as it takes a `var-list` for `update`, so when we get to that construct/clause combination, we need to do that as well. This patch also furthers use of the `OpenACCClauses.def` as it became useful while implementing this (as well as some other minor refactors as I went through). Finally, `self` and `if` clauses interact with each other: if an `if` clause evaluates to `true`, the `self` clause has no effect. While this is intended and can be used 'meaningfully', we are warning on this with a very granular warning, so that this edge case will be noticed by newer users, but can be disabled trivially.
We need file-level - not target-level - dependencies for these custom commands to re-trigger when their dependencies change.
Prior to llvm#85863, the required parameters of llvm::isKnownNonZero were Value and DataLayout. After, they are Value, Depth, and SimplifyQuery, where SimplifyQuery is implicitly constructible from DataLayout. The change to move Depth before SimplifyQuery needed callers to be updated unnecessarily, and as commented in llvm#85863, we actually want Depth to be after SimplifyQuery anyway so that it can be defaulted and the caller does not need to specify it.
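The parameter-order argument can be sketched with stand-in types. Everything below is hypothetical (`DataLayoutStub`, `SimplifyQueryStub`, `ValueStub`, `isKnownNonZeroStub`, and the depth limit); none of it is the real LLVM API. It only shows why placing `Depth` last lets it be defaulted while existing `(Value, DataLayout)` call sites keep compiling through the implicit conversion.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM types named in the commit message.
struct DataLayoutStub {};

struct SimplifyQueryStub {
  const DataLayoutStub &DL;
  // Deliberately non-explicit, so a DataLayoutStub converts implicitly.
  SimplifyQueryStub(const DataLayoutStub &DL) : DL(DL) {}
};

struct ValueStub {
  int V;
};

// Assumed recursion limit for this sketch only.
constexpr unsigned MaxDepthStub = 6;

// Depth comes last and is defaulted, so callers that only have a value and
// a data layout do not need to mention it.
bool isKnownNonZeroStub(const ValueStub &V, const SimplifyQueryStub &Q,
                        unsigned Depth = 0) {
  assert(Depth <= MaxDepthStub && "recursion depth limit exceeded");
  (void)Q;  // A real analysis would consult the query; the sketch does not.
  return V.V != 0;
}
```

A call such as `isKnownNonZeroStub(V, DL)` compiles unchanged, while recursive callers can still pass `Depth + 1` explicitly as the third argument.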
Not sure how best to test this, but I think it fixes the error https://github.com/llvm/mlir-www/actions/runs/8699908058/job/23859264085#step:7:1111 Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com> Co-authored-by: Jacques Pienaar <jpienaar@google.com>
…#88008) (llvm#88014) Use refactored `CheckForConstantInitializer()` to skip checking expr with error. --------- Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
…ccess and bad_function_call (llvm#87390) This patch uses our availability machinery to allow defining a key function for bad_function_call and bad_expected_access at all times but only rely on it when we can. This prevents compilers from complaining about weak vtables and reduces code bloat and the amount of work done by the dynamic linker. rdar://111917845
…vm#86410) When we initially implemented the C++20 synchronization library, we reluctantly accepted for the implementation to be backported to C++03 upon request from the person who provided the patch. This was when we were only starting to have experience with the issues this can create, so we flinched. Nowadays, we have a much stricter stance about not backporting features to previous standards. We have recently started fixing several bugs (and near bugs) in our implementation of the synchronization library. A recurring theme during these reviews has been how difficult to understand the current code is, and upon inspection it becomes clear that being able to use a few recent C++ features (in particular lambdas) would help a great deal. The code would still be pretty intricate, but it would be a lot easier to reason about the flow of callbacks through things like __thread_poll_with_backoff. As a result, this patch deprecates support for the synchronization library before C++20. In the next release, we can remove that support entirely.
This test requires the OpenMP runtime to be present, but the way that the lit config detects it fails when "openmp" is added to RUNTIMES instead of PROJECTS. This caused the tests to be skipped as unsupported in local and upstream tests. The actual bug was a missing word in the message, and putting the check at the wrong line.
This reverts commit 82f479b due to bot breakage.
CASE_VFMA_OPCODE_VV and CASE_VFMA_CHANGE_OPCODE_VV need to match up if we are to avoid "Unexpected opcode" errors, but in CASE_VFMA_CHANGE_OPCODE_VV, CASE_VFMA_CHANGE_OPCODE_LMULS_MF2 had mistakenly been used instead of CASE_VFMA_CHANGE_OPCODE_LMULS_MF4.
This change updates a few of the transformations in foldFMulReassoc to respect absent fast-math flags in cases where fmul and fdiv, fadd, or fsub instructions were being folded but the code was only checking for fast-math flags on the fmul instruction and was transferring flags to the folded instruction that were not present on the other original instructions. This fixes llvm#82857
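One way to picture the rule is to treat fast-math flags as a bitmask and give the folded instruction only the flags that every original instruction carried. This is a sketch under that assumption: the flag encoding, `FPInstStub`, and both helpers are hypothetical, and the actual patch may handle individual flags differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; this is not LLVM's FastMathFlags encoding.
enum FastMathBits : std::uint8_t {
  Reassoc = 1u << 0,
  NoNaNs = 1u << 1,
  NoInfs = 1u << 2,
};

// Hypothetical stand-in for a floating-point instruction.
struct FPInstStub {
  std::uint8_t Flags;
};

// The reassociation is only allowed when both instructions permit it, not
// just the fmul.
bool canFoldReassoc(const FPInstStub &Mul, const FPInstStub &Other) {
  return (Mul.Flags & Reassoc) != 0 && (Other.Flags & Reassoc) != 0;
}

// The folded instruction keeps only flags present on both originals, so a
// flag absent from either input is never introduced.
std::uint8_t foldedFlags(const FPInstStub &Mul, const FPInstStub &Other) {
  assert(canFoldReassoc(Mul, Other) && "fold requires reassoc on both");
  return static_cast<std::uint8_t>(Mul.Flags & Other.Flags);
}
```

Under this model, an fmul marked `Reassoc | NoNaNs` folded with an fdiv marked only `Reassoc` produces an instruction marked only `Reassoc`.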
llvm#88249) …se of tensor pack When the vector sizes are not passed as inputs to the vector transform operation, the vector sizes are queried from the static result shape in the case of tensor.pack op.
Since 97fe519, in ARM64EC mode, we don't define `__aarch64__`. Fix various preprocessor guards to account for this.
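A guard of the kind this commit adjusts might look like the following. It is a sketch, not one of the guards touched by the patch: it assumes the compiler defines `__arm64ec__` (and MSVC-style `_M_ARM64EC`) in ARM64EC mode, and `targetFamily` is a hypothetical helper.

```cpp
#include <cstring>

// Returns a label for the architecture family this translation unit was
// compiled for. Checking __aarch64__ alone would classify ARM64EC builds as
// "other", so the ARM64EC macros are tested as well.
const char *targetFamily() {
#if defined(__aarch64__) || defined(__arm64ec__) || defined(_M_ARM64) || \
    defined(_M_ARM64EC)
  return "arm64";
#else
  return "other";
#endif
}

// Convenience wrapper for code that only needs a yes/no answer.
bool isArm64Family() { return std::strcmp(targetFamily(), "arm64") == 0; }
```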
This reverts commit 7d4e8c1. Contrary to the commit description, this does cause large compile-time regressions (up to 10% on individual files).
- Those special register stores are STORE and their memory operands are input operands instead of output ones. Reviewers: JDevlieghere, arsenm, yinying-lisa-li, koachan, PeimingLiu, jyknight, aartbik, matthias-springer Reviewed By: arsenm Pull Request: llvm#88971
- If a def operand includes multiple sub-operands, count them when generating instr info. - Found issues in x86 and sparc backends, where memory operands of store or store-like instructions are wrongly placed in the output list. Reviewers: jayfoad, arsenm, Pierre-vh Reviewed By: arsenm Pull Request: llvm#88972
…h" (llvm#89006) Reverts llvm#88546 Leak and performance regression. Details in llvm#88546
RFC: https://discourse.llvm.org/t/rfc-introduce-new-clang-builtin-builtin-allow-runtime-check/78281 --------- Co-authored-by: Noah Goldstein <goldstein.w.n@gmail.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
getFileOffsetFor() was replaced with getFileOffsetForAddress().
RewriteInstance::isKSymtabSection() is deprecated.
…ference (llvm#88843) Reapply for llvm#88765. Partially fixes: llvm#60895.
The VTYPE operands of a vsetvli pseudo are always immediates
…6811) Currently the CFI offsets for RVV registers are not handled entirely. This patch adds that information so that stack unwinding and debuggers work correctly on RVV callee-saved stack objects. Depends On D154576 Differential Revision: https://reviews.llvm.org/D156846
The `clang-scan-deps` tool can be used for fast scanning of batches of compilation commands passed in via the `-compilation-database` option. This gets awkward in our tests where we have to resort to using `.in`/`.template` JSON files and running them through `sed` in order to embed LIT's `%t` variable into them. However, most of our tests only need to pass a single compilation command, so this dance is entirely unnecessary. This patch makes sure the existing "per-file" mode (where the compilation command is passed in-line after the `--` argument) works for all output formats, not only `P1689`.
This makes it possible to pass "-o /dev/null" to `clang-scan-deps` and skip some potentially expensive work, making timings less noisy. Also removes the need for stream redirection.
llvm::SmallVector::operator== exactly meets our needs.
…89001) Instead of searching all encodings, we can convert the encoding back to a register and use getMatchingSuperReg.
Per [tab:time.format.spec]: "%z: The offset from UTC as specified in ISO 8601-1:2019, subclause 5.3.4.1. For example -0430 refers to 4 hours 30 minutes behind UTC. If the offset is zero, +0000 is used. The modified commands %Ez and %Oz insert a : between the hours and minutes: -04:30. If the offset information is not available, an exception of type format_error is thrown." Typically the modified versions %Oz or %Ez would have wording like "The modified command %OS produces the locale's alternative representation." In this case the modified version does not depend on the locale. This change is a preparation for formatting sys_info, which has time zone information. The function time_put<_CharT>::put() does not have proper time zone support, therefore it's a manual implementation. Fixes llvm#78184
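The quoted formatting rule is small enough to sketch by hand. The helper below is hypothetical and is not the libc++ implementation; it simply applies the wording above to a UTC offset given in seconds, with `modified` selecting the `%Ez`/`%Oz` form.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a UTC offset as +hhmm (%z) or, when
// `modified` is true, as +hh:mm (%Ez / %Oz). A zero offset uses '+'.
std::string formatUtcOffset(std::chrono::seconds offset, bool modified) {
  const bool negative = offset < std::chrono::seconds::zero();
  const long long total = negative ? -offset.count() : offset.count();
  const long long hours = total / 3600;
  const long long minutes = (total % 3600) / 60;
  assert(hours < 100 && "offset does not fit in two hour digits");

  char buf[16];
  std::snprintf(buf, sizeof(buf),
                modified ? "%c%02lld:%02lld" : "%c%02lld%02lld",
                negative ? '-' : '+', hours, minutes);
  return buf;
}
```

For an offset of 4 hours 30 minutes behind UTC this yields `-0430`, or `-04:30` in the modified form; a zero offset yields `+0000`.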
…lvm#88872) This patch adds a test that assert-fails without the fix.
cferry-AMD
approved these changes
Aug 22, 2024
An error occurred while trying to automatically change base from bump_to_1c076b43 to feature/fused-ops (August 22, 2024 15:06)
No description provided.