Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FXML-3548] bump llvm to d13da154a7c7eff77df8686b2de1cfdfa7cc7029 #84

Conversation

TinaAMD
Copy link

@TinaAMD TinaAMD commented Nov 16, 2023

No description provided.

s-barannikov and others added 30 commits September 18, 2023 14:45
…lvm#66571)

Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and
`(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`.

Proof: https://alive2.llvm.org/ce/z/Lj49YM

This is the same as https://reviews.llvm.org/D46505 and
https://reviews.llvm.org/D59208, but for signed opcodes.
If the R_AARCH64_CALL26 against a symbol that has a lower address, then
encodeValueAArch64 will return a wrong value.

Reviewed By: Kepontry, yota9

Differential Revision: https://reviews.llvm.org/D159513
In the place it used to be linked from.
…ack-move optimization (llvm#66618)

Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes llvm#65225
Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D159449
This bitcast is no longer necessary with opaque pointers. This
results in some annoying variable name changes in tests.
It is possible for a derived type extending a type with private
components to define components with the same name as the private
components.

This was not properly handled by lowering where several fir.record type
component names could end-up being the same, leading to bad generated
code (only the first component was accessed via fir.field_index, leading
to bad generated code).

This patch handles the situation by adding the derived type mangled name
to private component.
Summary:
This patch copies a config file for the GPU similar to the
baremetal/embedded implementation. This will configure the
implementations of functions like `sprintf` and `snprintf` to be
compiled into more simple versions that can be run on the GPU. These
functions cannot be enabled yet as Vararg support hasn't landed, but it
will be used then.
llvm#66387)

…(trunci) expansion

This revision adds a rewrite for sequences of vector `bitcast(trunci)`
to use a more efficient sequence of vector operations comprising
`shuffle` and `bitwise` ops.

Such patterns appear naturally when writing quantization /
dequantization functionality with the vector dialect.

The rewrite performs a simple enumeration of each of the bits in the
result vector and determines its provenance in the pre-trunci vector.
The enumeration is used to generate the proper sequence of `shuffle`,
`andi`, `ori` followed by an optional final `trunci`/`extui`.

The rewrite currently only applies to 1-D non-scalable vectors and bails
out if the final vector element type is not a multiple of 8. This is a
failsafe heuristic determined empirically: if the resulting type is not
an even number of bytes, further complexities arise that are not
improved by this pattern: the heavy lifting still needs to be done by
LLVM.
This patch moves the group of OpenMP MLIR passes using after lowering of
Fortran to MLIR into a pipeline to be shared by `flang-new` and `bbc`.
Currently, the `bbc` tool does not produce the expected FIR for offloading-
enabled OpenMP codes due to not running these passes.

Unit tests exercising these passes are updated to check `bbc` output as well.
…tant (llvm#65905)

This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when
one constant mask is the subset of the other.
If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the
following folds:
`icmp ule A, B -> true`
`icmp ugt A, B -> false`
We can apply similar folds for signed predicates when `C1` and `C2` are
the same sign:
`icmp sle A, B -> true`
`icmp sgt A, B -> false`

Alive2: https://alive2.llvm.org/ce/z/Q4ekP5
Fixes llvm#65833.
…tination (llvm#65468)

This revision adds support for empty tensor elimination to
"bufferization.materialize_in_destination" by implementing the
`SubsetInsertionOpInterface`.

Furthermore, the One-Shot Bufferize conflict detection is improved for
"bufferization.materialize_in_destination".
…vm#66385)

This gets rid of the separate parameter enable_modules_lsv in favor of
adding a named option to the enable_modules parameter. The patch also
removes the getModuleFlag helper, which was just a really complicated
way of hardcoding "none".
…NFC)

/data/llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:229:21: error: unused function 'operator<<' [-Werror,-Wunused-function]
static raw_ostream &operator<<(raw_ostream &os,
                    ^
1 error generated.
… undef read (llvm#66211)

Update LiveIntervals after rewriting:
  %reg = INSERT_SUBREG undef %reg, %subreg, subidx
to:
  undef %reg:subidx = COPY %subreg

D113044 implemented this for the non-undef case.
Promotion can add/remove arguments. We need to update the
indices in the allocsize attribute accordingly.

Fixes llvm#66103.
…#66206)

Thanks to Giuseppe D'Angelo for pointing this out on the cpplang Slack!

The example implementation in https://eel.is/c++draft/string.view.comparison#example-1
was necessary when it was written, in C++17, but in C++20 we don't need that
complexity anymore, because of the reversed candidates that are
synthesized by the compiler.
…) -> (assertzext x) fold.

We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.
…pass options, remove bufferization.escape attribute (llvm#66619)

This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option will default to true now,
`create-deallocs` defaults to false and they, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
…ocale "cs_CZ.ISO8859-2"

Reviewers: David Tenty, Mark de Wever
Differential Revision: https://reviews.llvm.org/D126407
…m#66653)

Summary:
The GPU build has a lot of magic around how we package the output.
Generally, the GPU needs to exist as a secondary fatbinary image for
offloading languages. This is because offloading languages pretend like
offloading to an accelerator is a single file. This then needs to be put
into a single file to make it mesh with the existing build
infrastructure. To work with this, the `libc` makes an installed version
of the library that simply embeds the GPU code into an empty stub file.

This wasn't being updated correctly, which lead to the installed `libc`
static library not being updated correctly when the underlying file was
changed. The previous behaviour only updated when the entrypoint itself
was modified, but not any of its headers. By adding a dependcy on the
actual *object* file we should now capture the regular CMake semantics.
…t. NFC (llvm#66199)

Some VP intrinsic definitions were missing the
VP_PROPERTY_FUNCTIONAL_INTRINSIC property. This patch fills them in, and
adds a static_assert that all VP intrinsics have an equivalent opcode or
intrinsic defined so we don't forget them in future.

Some VP intrinsics don't have an equivalent, namely merge and strided
load/store. For those, a new property was added to mark that they don't
have a non-VP equivalent.

This adds a helper method to get the ID of the functionally equivalent
intrinsic, similar to the existing getFunctionalOpcodeForVP and
getConstrainedIntrinsicIDForVP method.
…nts (llvm#66238)

The POINTER= and TARGET= arguments to the intrinsic function
ASSOCIATED() can be the results of references to functions that return
object pointers or procedure pointers. NULL() was working well but not
program-defined pointer-valued functions. Correct the validation of
ASSOCIATED() and extend the infrastructure used to detect and
characterize procedures and pointers.
DanielMcIntosh and others added 23 commits September 19, 2023 16:17
__call_once is large and cluttered with #ifdef preprocessor guards. This
cleans it up a bit by using an exception guard instead of try-catch.

Differential Revision: https://reviews.llvm.org/D112319
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
We want to activate `llvm-header-guard` (llvm#66477) but the current CMake
configuration includes paths that should be `isystem`. This PR restricts
the number of `-I` passed to the clang command line and correctly marks
the llvm libc include path as `isystem`.
Removed lots of outdated statements that were misleading.
This change matches a masked.stride.load from a mgather node whose index
operand is a strided sequence. We can reuse the VID matching from
build_vector lowering for this purpose.

Note that this duplicates the matching done at IR by
RISCVGatherScatterLowering.cpp. Now that we can widen gathers to a wider
SEW, I don't see a good way to remove this duplication. The only obvious
alternative is to move thw widening transform to IR, but that's a no-go
as I want other DAGs to run first. I think we should just live with the
duplication - particularly since the reuse is isSimpleVIDSequence means
the duplication is somewhat minimal.
This parallels the binutils/BSD flag of the same name. Debugging
information is loaded to print line number information for symbols.
Defined symbols are symbolized by their section addresses, and undefined
symbols by their first text reloc with line info.

Differential Revision: https://reviews.llvm.org/D150987
We were losing the function entry count, which is useful to check profile quality. For the original cases where we want
entrypoint-relative MBB frequencies, the user would just need to divide these values by the entrypoint (first MBB, with ID=0) value.
…xt. (llvm#66021)

Per CWG2760, default members initializers should be consider part the
body of constructors, which mean they are evaluated in an immediate
escalating context.

However, this does not apply to static members.

This patch produces some extraneous diagnostics, unfortunately we do not
have a good way to report an error back to the initializer and this is a
pre existing issue

Fixes llvm#65985
Fixes llvm#66562
This is in preparation for adding a KHR variant which does not share the
same parameters and needs a separate attribute.
…L-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
The new python formatting on changed files triggers for all files in the merge from upstream. If we fix those errors, we would get a huge diff to upstream. Therefore temporarily disable the formatter and re-enable it after the bump.
* Use new operator printing syntax
* Change i64 -> i32 for axis of reduction ops
…L-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
@mgehre-amd mgehre-amd marked this pull request as ready for review February 1, 2024 15:37
@mgehre-amd mgehre-amd self-requested a review February 1, 2024 15:37
@mgehre-amd mgehre-amd merged commit 845176a into feature/fused-ops Feb 1, 2024
2 checks passed
@mgehre-amd mgehre-amd deleted the tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029 branch February 1, 2024 15:46
@mgehre-amd mgehre-amd restored the tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029 branch February 2, 2024 10:10
@mgehre-amd mgehre-amd deleted the tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029 branch February 2, 2024 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.