forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FXML-3548] bump llvm to d13da154a7c7eff77df8686b2de1cfdfa7cc7029 #84
Merged
mgehre-amd
merged 10,000 commits into
feature/fused-ops
from
tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
Feb 1, 2024
Merged
[FXML-3548] bump llvm to d13da154a7c7eff77df8686b2de1cfdfa7cc7029 #84
mgehre-amd
merged 10,000 commits into
feature/fused-ops
from
tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
Feb 1, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…lvm#66571) Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and `(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`. Proof: https://alive2.llvm.org/ce/z/Lj49YM This is the same as https://reviews.llvm.org/D46505 and https://reviews.llvm.org/D59208, but for signed opcodes.
If the R_AARCH64_CALL26 against a symbol that has a lower address, then encodeValueAArch64 will return a wrong value. Reviewed By: Kepontry, yota9 Differential Revision: https://reviews.llvm.org/D159513
In the place it used to be linked from.
…ack-move optimization (llvm#66618) Stack-move optimization, the optimization that merges src and dest alloca of the full-size copy, replaces all uses of the dest alloca with src alloca. For safety, we needed to check all uses of the dest alloca locations are dominated by src alloca, to be replaced. This PR adds the check for that. Fixes llvm#65225
…ndly. (llvm#65177) See https://wg21.link/LWG3545 for background and details. Differential Revision: https://reviews.llvm.org/D158922
Reviewed By: Chia-hungDuan Differential Revision: https://reviews.llvm.org/D159449
This bitcast is no longer necessary with opaque pointers. This results in some annoying variable name changes in tests.
It is possible for a derived type extending a type with private components to define components with the same name as the private components. This was not properly handled by lowering where several fir.record type component names could end-up being the same, leading to bad generated code (only the first component was accessed via fir.field_index, leading to bad generated code). This patch handles the situation by adding the derived type mangled name to private component.
Summary: This patch copies a config file for the GPU similar to the baremetal/embedded implementation. This will configure the implementations of functions like `sprintf` and `snprintf` to be compiled into more simple versions that can be run on the GPU. These functions cannot be enabled yet as Vararg support hasn't landed, but it will be used then.
llvm#66387) …(trunci) expansion This revision adds a rewrite for sequences of vector `bitcast(trunci)` to use a more efficient sequence of vector operations comprising `shuffle` and `bitwise` ops. Such patterns appear naturally when writing quantization / dequantization functionality with the vector dialect. The rewrite performs a simple enumeration of each of the bits in the result vector and determines its provenance in the pre-trunci vector. The enumeration is used to generate the proper sequence of `shuffle`, `andi`, `ori` followed by an optional final `trunci`/`extui`. The rewrite currently only applies to 1-D non-scalable vectors and bails out if the final vector element type is not a multiple of 8. This is a failsafe heuristic determined empirically: if the resulting type is not an even number of bytes, further complexities arise that are not improved by this pattern: the heavy lifting still needs to be done by LLVM.
This patch moves the group of OpenMP MLIR passes using after lowering of Fortran to MLIR into a pipeline to be shared by `flang-new` and `bbc`. Currently, the `bbc` tool does not produce the expected FIR for offloading- enabled OpenMP codes due to not running these passes. Unit tests exercising these passes are updated to check `bbc` output as well.
…tant (llvm#65905) This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when one constant mask is the subset of the other. If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the following folds: `icmp ule A, B -> true` `icmp ugt A, B -> false` We can apply similar folds for signed predicates when `C1` and `C2` are the same sign: `icmp sle A, B -> true` `icmp sgt A, B -> false` Alive2: https://alive2.llvm.org/ce/z/Q4ekP5 Fixes llvm#65833.
…tination (llvm#65468) This revision adds support for empty tensor elimination to "bufferization.materialize_in_destination" by implementing the `SubsetInsertionOpInterface`. Furthermore, the One-Shot Bufferize conflict detection is improved for "bufferization.materialize_in_destination".
…vm#66385) This gets rid of the separate parameter enable_modules_lsv in favor of adding a named option to the enable_modules parameter. The patch also removes the getModuleFlag helper, which was just a really complicated way of hardcoding "none".
…NFC) /data/llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:229:21: error: unused function 'operator<<' [-Werror,-Wunused-function] static raw_ostream &operator<<(raw_ostream &os, ^ 1 error generated.
… undef read (llvm#66211) Update LiveIntervals after rewriting: %reg = INSERT_SUBREG undef %reg, %subreg, subidx to: undef %reg:subidx = COPY %subreg D113044 implemented this for the non-undef case.
Promotion can add/remove arguments. We need to update the indices in the allocsize attribute accordingly. Fixes llvm#66103.
… being treated as titles
…#66206) Thanks to Giuseppe D'Angelo for pointing this out on the cpplang Slack! The example implementation in https://eel.is/c++draft/string.view.comparison#example-1 was necessary when it was written, in C++17, but in C++20 we don't need that complexity anymore, because of the reversed candidates that are synthesized by the compiler.
…) -> (assertzext x) fold. We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.
…pass options, remove bufferization.escape attribute (llvm#66619) This commit removes the deallocation capabilities of one-shot-bufferization. One-shot-bufferization should never deallocate any memrefs as this should be entirely handled by the ownership-based-buffer-deallocation pass going forward. This means the `allow-return-allocs` pass option will default to true now, `create-deallocs` defaults to false and they, as well as the escape attribute indicating whether a memref escapes the current region, will be removed. A new `allow-return-allocs-from-loops` option is added as a temporary workaround for some bufferization limitations.
…ocale "cs_CZ.ISO8859-2" Reviewers: David Tenty, Mark de Wever Differential Revision: https://reviews.llvm.org/D126407
…m#66653) Summary: The GPU build has a lot of magic around how we package the output. Generally, the GPU needs to exist as a secondary fatbinary image for offloading languages. This is because offloading languages pretend like offloading to an accelerator is a single file. This then needs to be put into a single file to make it mesh with the existing build infrastructure. To work with this, the `libc` makes an installed version of the library that simply embeds the GPU code into an empty stub file. This wasn't being updated correctly, which lead to the installed `libc` static library not being updated correctly when the underlying file was changed. The previous behaviour only updated when the entrypoint itself was modified, but not any of its headers. By adding a dependcy on the actual *object* file we should now capture the regular CMake semantics.
…t. NFC (llvm#66199) Some VP intrinsic definitions were missing the VP_PROPERTY_FUNCTIONAL_INTRINSIC property. This patch fills them in, and adds a static_assert that all VP intrinsics have an equivalent opcode or intrinsic defined so we don't forget them in future. Some VP intrinsics don't have an equivalent, namely merge and strided load/store. For those, a new property was added to mark that they don't have a non-VP equivalent. This adds a helper method to get the ID of the functionally equivalent intrinsic, similar to the existing getFunctionalOpcodeForVP and getConstrainedIntrinsicIDForVP method.
…nts (llvm#66238) The POINTER= and TARGET= arguments to the intrinsic function ASSOCIATED() can be the results of references to functions that return object pointers or procedure pointers. NULL() was working well but not program-defined pointer-valued functions. Correct the validation of ASSOCIATED() and extend the infrastructure used to detect and characterize procedures and pointers.
__call_once is large and cluttered with #ifdef preprocessor guards. This cleans it up a bit by using an exception guard instead of try-catch. Differential Revision: https://reviews.llvm.org/D112319 Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
We want to activate `llvm-header-guard` (llvm#66477) but the current CMake configuration includes paths that should be `isystem`. This PR restricts the number of `-I` passed to the clang command line and correctly marks the llvm libc include path as `isystem`.
Removed lots of outdated statements that were misleading.
This change matches a masked.stride.load from a mgather node whose index operand is a strided sequence. We can reuse the VID matching from build_vector lowering for this purpose. Note that this duplicates the matching done at IR by RISCVGatherScatterLowering.cpp. Now that we can widen gathers to a wider SEW, I don't see a good way to remove this duplication. The only obvious alternative is to move thw widening transform to IR, but that's a no-go as I want other DAGs to run first. I think we should just live with the duplication - particularly since the reuse is isSimpleVIDSequence means the duplication is somewhat minimal.
This parallels the binutils/BSD flag of the same name. Debugging information is loaded to print line number information for symbols. Defined symbols are symbolized by their section addresses, and undefined symbols by their first text reloc with line info. Differential Revision: https://reviews.llvm.org/D150987
This reverts commit a35a3b7. This broke libc benchmarks.
Yup, a bit of an oversight ;-)
We were losing the function entry count, which is useful to check profile quality. For the original cases where we want entrypoint-relative MBB frequencies, the user would just need to divide these values by the entrypoint (first MBB, with ID=0) value.
…xt. (llvm#66021) Per CWG2760, default members initializers should be consider part the body of constructors, which mean they are evaluated in an immediate escalating context. However, this does not apply to static members. This patch produces some extraneous diagnostics, unfortunately we do not have a good way to report an error back to the initializer and this is a pre existing issue Fixes llvm#65985 Fixes llvm#66562
This is in preparation for adding a KHR variant which does not share the same parameters and needs a separate attribute.
…L-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
The new python formatting on changed files triggers for all files in the merge from upstream. If we fix those errors, we would get a huge diff to upstream. Therefore temporarily disable the formatter and re-enable it after the bump.
…da154a7c7eff77df8686b2de1cfdfa7cc7029
* Use new operator printing syntax * Change i64 -> i32 for axis of reduction ops
…da154a7c7eff77df8686b2de1cfdfa7cc7029
…da154a7c7eff77df8686b2de1cfdfa7cc7029
…L-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
mgehre-amd
approved these changes
Feb 1, 2024
mgehre-amd
deleted the
tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
branch
February 1, 2024 15:46
mgehre-amd
restored the
tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
branch
February 2, 2024 10:10
mgehre-amd
deleted the
tina.FXML-3548-bump-llvm-to-d13da154a7c7eff77df8686b2de1cfdfa7cc7029
branch
February 2, 2024 10:12
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.