[AutoBump] Merge with b851c7f1 (Apr 17) (2) #295

Merged: mgehre-amd merged 120 commits into feature/fused-ops from bump_to_b851c7f1 on Aug 22, 2024

Conversation

mgehre-amd
Collaborator

No description provided.

nico and others added 30 commits April 16, 2024 08:14
…ze (llvm#83124)

When in-place new-ing a local variable of an array of trivial type, the
generated code now calls `memset` with the correct size of the array.
Earlier it was generating the wrong size (squared of the typedef array +
size).

The cause: given `typedef TYPE TArray[8]; TArray x;`, the type of the
declarator is TArray[8]. In `SemaExprCXX.cpp::BuildCXXNew` we check
whether it is a typedef of constant size, and if so we get the original
type, which works fine for non-dependent cases.
In the template case, however, we go through
`TreeTransform.h:TransformCXXNewExpr`, where we again check the allocated
type, which is TArray[8] and stays that way. So ArraySize=(TArray[8]
type, alloc TArray[8*type]), hence the squared size allocation.

ArraySize gets calculated earlier in `TreeTransform.h`, so the
`if(!ArraySize)` condition was failing.
Fix: I changed that condition to `if(ArraySize)`.
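
For illustration, a minimal sketch of the kind of code the description above refers to: value-initializing a typedef'd array of trivial type with placement new inside a template. The names `Guarded`, `sentinel`, and `reinitInPlace` are hypothetical and do not come from the patch; the sentinel member only shows that zeroing should stop at the end of the array.

```cpp
#include <new>

typedef int TArray[8];

// Hypothetical wrapper: `sentinel` sits right after the array, so an
// oversized memset would be likely to clobber it.
struct Guarded {
  TArray data;
  int sentinel;
};

// T is a dependent type here, so the new-expression is rebuilt during
// template instantiation (the TreeTransform path mentioned above).
template <typename T>
int reinitInPlace(Guarded &g) {
  new (&g.data) T();  // value-initializes (zeroes) the array in place
  return g.sentinel;  // expected to be unchanged
}
```

Calling `reinitInPlace<TArray>(g)` should zero exactly `sizeof(TArray)` bytes and leave `g.sentinel` alone.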


Fixes llvm#41441
Some very basic tests for a case where we could fold BLEND(PERMUTE(X),PERMUTE(Y)) -> PERMUTE(BLEND(X,Y))

These assume the permute masks are the same, and "complete" (no undefs/duplicate elements) but we could relax that depending on the blend mask
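
A scalar model of the fold may help here. This is only a sketch on `std::array`, not the backend code; `permute`, `blend`, and `remapBlendMask` are hypothetical helpers. It assumes both permutes use the same complete mask, in which case the blend selector just has to be moved through the inverse permutation.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Mask = std::array<std::size_t, N>;  // result[i] = input[mask[i]]
using Sel = std::array<bool, N>;          // true selects the second operand

Vec permute(const Vec &v, const Mask &m) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = v[m[i]];
  return r;
}

Vec blend(const Vec &x, const Vec &y, const Sel &s) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i) r[i] = s[i] ? y[i] : x[i];
  return r;
}

// Moves the selector for output lane i to input lane m[i]. This is only
// well defined when m is a complete permutation (no duplicates).
Sel remapBlendMask(const Sel &s, const Mask &m) {
  Sel r{};
  for (std::size_t i = 0; i < N; ++i) r[m[i]] = s[i];
  return r;
}

// BLEND(PERMUTE(X), PERMUTE(Y))
Vec blendOfPermutes(const Vec &x, const Vec &y, const Mask &m, const Sel &s) {
  return blend(permute(x, m), permute(y, m), s);
}

// PERMUTE(BLEND(X, Y)) with the selector remapped
Vec permuteOfBlend(const Vec &x, const Vec &y, const Mask &m, const Sel &s) {
  return permute(blend(x, y, remapBlendMask(s, m)), m);
}
```

With a duplicate in the permute mask, `remapBlendMask` could receive conflicting selectors for one input lane, which is why the tests restrict themselves to complete masks.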
This should ensure we explore the same VFs as before 6d66db3.

Fixes llvm#88640.
On macOS, file paths start with /Users/..., which clang-cl interprets
as the /U switch followed by a preprocessor macro name to undefine.

Put the filename after `--` to prevent this. For consistency, move %s
to the end of the regular `clang` lines (where this isn't needed) as
well.
This patch moves OpenMP-related entities out of `Sema` to a newly
created `SemaOpenMP` class. This is a part of the effort to split `Sema`
up, and follows the recent example of CUDA, OpenACC, SYCL, HLSL.
Additional context can be found in
llvm#82217,
llvm#84184,
llvm#87634.
Summary:
AIX headers define this, so we need to work around it. In the future
this will be removed but for now we should just rename it to avoid these
issues.
…sions" and related commits (llvm#88884)

The original change caused widespread breakages in msan/ubsan tests and
causes `use-after-free`. Most likely we are adding more cleanups than
necessary.
…llvm#87632)

We had some instances when LLVM would not inline fixed-count memcpy and
ended up attempting to lower it as a libcall, which would not work on
AMDGPU as the address space doesn't meet the requirement, causing a
compiler crash.

The patch relaxes the threshold used for -Os/-Oz compilation so we're
always allowed to inline memory copy functions.

This patch basically does the same thing as
https://reviews.llvm.org/D158226 for AMDGPU.

Fix llvm#88497.
… smax/smin intrinsics.

We need to check that an unsigned argument can be safely used in
smax/smin intrinsics by checking that at least one sign bit is cleared;
otherwise its value may be treated as negative instead of positive.
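
A small runtime analogue of the problem, as a sketch only: the helper names are hypothetical, and the real check works on known bits at compile time rather than on concrete values. It assumes the usual two's complement conversion between `uint32_t` and `int32_t`.

```cpp
#include <algorithm>
#include <cstdint>

// True if the top (sign) bit of a 32-bit value is clear.
bool signBitIsClear(std::uint32_t v) { return (v >> 31) == 0; }

// Computes a minimum by reinterpreting both operands as signed, the way a
// signed-min intrinsic would see them.
std::uint32_t minViaSigned(std::uint32_t a, std::uint32_t b) {
  const auto sa = static_cast<std::int32_t>(a);
  const auto sb = static_cast<std::int32_t>(b);
  return static_cast<std::uint32_t>(std::min(sa, sb));
}

// The signed minimum matches the unsigned one only when neither operand
// can be mistaken for a negative number.
bool canUseSignedMin(std::uint32_t a, std::uint32_t b) {
  return signBitIsClear(a) && signBitIsClear(b);
}
```

For `0x80000000` and `1`, the unsigned minimum is `1`, but the signed view treats `0x80000000` as the most negative value and picks it instead.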
Use v4 of UTC to improve regex matching of argument names, fixing a FileCheck match in a future patch
…ble, fix possible UB.

Using an fp type in the compiler is not the best idea; here it is used
in a comparison for equality with 0, which may cause undefined behavior
in some cases.

Reviewers: fhahn

Reviewed By: fhahn

Pull Request: llvm#87241
`self` clauses on compute constructs take an optional condition
expression. We again limit the implementation to ONLY compute constructs
to ensure we get all the rules correct for others. However, this one
will be particularly complicated, as it takes a `var-list` for `update`,
so when we get to that construct/clause combination, we need to do that
as well.

This patch also furthers uses of the `OpenACCClauses.def` as it became
useful while implementing this (as well as some other minor refactors as
I went through).

Finally, `self` and `if` clauses interact with each other: if
an `if` clause evaluates to `true`, the `self` clause has no effect.
While this is intended and can be used 'meaningfully', we are warning on
this with a very granular warning, so that this edge case will be
noticed by newer users, but can be disabled trivially.
We need file-level - not target-level - dependencies for these custom
commands to re-trigger when their dependencies change.
Prior to llvm#85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in llvm#85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
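
The parameter-order argument can be sketched with stand-in types. Everything in the `sketch` namespace below is hypothetical and heavily simplified; it only mirrors the shape described above (a query type implicitly constructible from a data layout, with a trailing defaulted depth) and is not the LLVM declaration.

```cpp
namespace sketch {

struct DataLayout {};

struct Value {
  int constant;  // stand-in payload; the real class is far richer
};

struct SimplifyQuery {
  const DataLayout &DL;
  // Intentionally non-explicit so a DataLayout converts implicitly.
  SimplifyQuery(const DataLayout &DL) : DL(DL) {}
};

// Depth comes last so it can be defaulted and omitted by callers.
bool isKnownNonZero(const Value &V, const SimplifyQuery &Q,
                    unsigned Depth = 0) {
  (void)Q;
  (void)Depth;
  return V.constant != 0;
}

}  // namespace sketch
```

With this ordering, an older two-argument call such as `isKnownNonZero(V, DL)` keeps compiling unchanged, which is the property the description asks for.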
Not sure how best to test this, but I think it fixes the error
https://github.com/llvm/mlir-www/actions/runs/8699908058/job/23859264085#step:7:1111

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Jacques Pienaar <jpienaar@google.com>
…#88008) (llvm#88014)

Use refactored `CheckForConstantInitializer()` to skip checking expr
with error.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
…ccess and bad_function_call (llvm#87390)

This patch uses our availability machinery to allow defining a key
function for bad_function_call and bad_expected_access at all times but
only rely on it when we can. This prevents compilers from complaining
about weak vtables and reduces code bloat and the amount of work done by
the dynamic linker.

rdar://111917845
…vm#86410)

When we initially implemented the C++20 synchronization library, we
reluctantly accepted for the implementation to be backported to C++03
upon request from the person who provided the patch. This was when we
were only starting to have experience with the issues this can create,
so we flinched. Nowadays, we have a much stricter stance about not
backporting features to previous standards.

We have recently started fixing several bugs (and near bugs) in our
implementation of the synchronization library. A recurring theme during
these reviews has been how difficult to understand the current code is,
and upon inspection it becomes clear that being able to use a few recent
C++ features (in particular lambdas) would help a great deal. The code
would still be pretty intricate, but it would be a lot easier to reason
about the flow of callbacks through things like
__thread_poll_with_backoff.

As a result, this patch deprecates support for the synchronization
library before C++20. In the next release, we can remove that support
entirely.
This test requires the OpenMP runtime to be present, but the way that
the lit config detects it fails when "openmp" is added to RUNTIMES
instead of PROJECTS. This caused the tests to be skipped as unsupported
in local and upstream tests.

The actual bug was a missing word in the message, and putting the check
at the wrong line.
This reverts commit 82f479b due to bot breakage.
CASE_VFMA_OPCODE_VV and CASE_VFMA_CHANGE_OPCODE_VV need to match up if we
are to avoid "Unexpected opcode" errors, but in CASE_VFMA_CHANGE_OPCODE_VV,
CASE_VFMA_CHANGE_OPCODE_LMULS_MF2 had mistakenly been used instead of
CASE_VFMA_CHANGE_OPCODE_LMULS_MF4.
keith and others added 25 commits April 16, 2024 19:17
This change updates a few of the transformations in foldFMulReassoc to
respect absent fast-math flags in cases where fmul and fdiv, fadd, or fsub
instructions were being folded but the code was only checking for
fast-math flags on the fmul instruction and was transferring flags to
the folded instruction that were not present on the other original 
instructions.
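
One way to picture the rule, as a sketch under assumptions: the flag names, `Inst`, and `flagsForFolded` are hypothetical, and taking the intersection is one reading of the description above rather than a copy of the patch. The folded instruction keeps only the flags that both original instructions carried.

```cpp
#include <cstdint>

// Hypothetical subset of fast-math flags, modeled as a bitmask.
enum FastMathFlag : std::uint8_t {
  Reassoc = 1u << 0,
  NoNaNs = 1u << 1,
  NoInfs = 1u << 2,
};

struct Inst {
  std::uint8_t flags;
};

// A flag may only be kept on the folded instruction if it was present on
// every instruction that took part in the fold.
constexpr std::uint8_t flagsForFolded(Inst fmul, Inst other) {
  return static_cast<std::uint8_t>(fmul.flags & other.flags);
}
```

Checking only the fmul, as the old code did, would correspond to returning `fmul.flags` here, which can claim guarantees the other instruction never made.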

This fixes llvm#82857
llvm#88249)

…se of tensor pack

When the vector sizes are not passed as inputs to the vector transform
operation, the vector sizes are queried from the static result shape in
the case of tensor.pack op.
Since 97fe519, in ARM64EC mode, we don't define `__aarch64__`. Fix
various preprocessor guards to account for this.
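
A guard of the following shape is one way such code can be adjusted. This is a sketch, not a hunk from the commit: `SKETCH_TARGET_IS_ARM64` and `targetIsArm64` are hypothetical names, and it assumes the ARM64EC mode is identified by `__arm64ec__` or `_M_ARM64EC`.

```cpp
// Treat ARM64EC like AArch64 even though __aarch64__ is not defined there.
#if defined(__aarch64__) || defined(__arm64ec__) || defined(_M_ARM64EC)
#define SKETCH_TARGET_IS_ARM64 1
#else
#define SKETCH_TARGET_IS_ARM64 0
#endif

// Compile-time query that the rest of the code can use instead of
// repeating the preprocessor condition.
constexpr bool targetIsArm64() { return SKETCH_TARGET_IS_ARM64 == 1; }
```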
This reverts commit 7d4e8c1.

Contrary to the commit description, this does cause large
compile-time regressions (up to 10% on individual files).
- Those special register stores are STORE and their memory operands are
  input operands instead of output ones.

Reviewers:
JDevlieghere, arsenm, yinying-lisa-li, koachan, PeimingLiu, jyknight, aartbik, matthias-springer

Reviewed By: arsenm

Pull Request: llvm#88971
- If a def operand includes multiple sub-operands, count them when
  generating instr info.
- Found issues in x86 and sparc backends, where memory operands of
  store or store-like instructions are wrongly placed in the output
  list.

Reviewers: jayfoad, arsenm, Pierre-vh

Reviewed By: arsenm

Pull Request: llvm#88972
RFC:
https://discourse.llvm.org/t/rfc-introduce-new-clang-builtin-builtin-allow-runtime-check/78281

---------

Co-authored-by: Noah Goldstein <goldstein.w.n@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
getFileOffsetFor() was replaced with getFileOffsetForAddress().
RewriteInstance::isKSymtabSection() is deprecated.
The VTYPE operands of a vsetvli pseudo are always immediates
…6811)

Currently the CFI offsets for RVV registers are not handled entirely.
This patch adds that information so that stack unwinding and debuggers
work correctly on RVV callee-saved stack objects.

Depends On D154576

Differential Revision: https://reviews.llvm.org/D156846
The `clang-scan-deps` tool can be used for fast scanning of batches of
compilation commands passed in via the `-compilation-database` option.
This gets awkward in our tests where we have to resort to using
`.in`/`.template` JSON files and running them through `sed` in order to
embed LIT's `%t` variable into them. However, most of our tests only
need to pass a single compilation command, so this dance is entirely
unnecessary.

This patch makes sure the existing "per-file" mode (where the
compilation command is passed in-line after the `--` argument) works for
all output formats, not only `P1689`.
This makes it possible to pass "-o /dev/null" to `clang-scan-deps` and
skip some potentially expensive work, making timings less noisy. Also
removes the need for stream redirection.
llvm::SmallVector::operator== exactly meets our needs.
…89001)

Instead of searching all encodings, we can convert the encoding back to
a register and use getMatchingSuperReg.
Per [tab:time.format.spec]
%z  The offset from UTC as specified in ISO 8601-1:2019, subclause
    5.3.4.1. For example -0430 refers to 4 hours 30 minutes behind UTC.
    If the offset is zero, +0000 is used. The modified commands %Ez and
    %Oz insert a : between the hours and minutes: -04:30. If the offset
    information is not available, an exception of type format_error is
    thrown.

Typically the modified versions Oz or Ez would have wording like

  The modified command %OS produces the locale's alternative
  representation.

In this case the modified version does not depend on the locale.

This change is a preparation for formatting sys_info which has time zone
information. The function time_put<_CharT>::put() does not have proper
time zone support, therefore it's a manual implementation.
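
The formatting rule quoted above can be sketched as a standalone helper. `formatUtcOffset` is a hypothetical function, not the libc++ implementation, and it leaves out the `format_error` case for missing offset information.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Formats a UTC offset as +hhmm (%z) or +hh:mm (%Ez / %Oz).
std::string formatUtcOffset(std::chrono::seconds offset, bool modified) {
  long long total = static_cast<long long>(offset.count());
  const char sign = total < 0 ? '-' : '+';  // a zero offset prints as +0000
  if (total < 0) total = -total;

  const long long hours = total / 3600;
  const long long minutes = (total % 3600) / 60;

  char buf[32];
  std::snprintf(buf, sizeof(buf),
                modified ? "%c%02lld:%02lld" : "%c%02lld%02lld",
                sign, hours, minutes);
  return std::string(buf);
}
```

An offset of 4 hours 30 minutes behind UTC gives `-0430` in the plain form and `-04:30` in the modified form, independent of the locale.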

Fixes llvm#78184
…lvm#88872)

This patch adds a test that assert-fails without the fix.
Base automatically changed from bump_to_1c076b43 to feature/fused-ops August 22, 2024 15:06
@mgehre-amd mgehre-amd merged commit ac378c2 into feature/fused-ops Aug 22, 2024
11 checks passed
@mgehre-amd mgehre-amd deleted the bump_to_b851c7f1 branch August 22, 2024 15:06