forked from llvm/llvm-project
[AutoBump] Merge with f4b9839d (Sep 04) (20) #373
Open
mgehre-amd wants to merge 289 commits into bump_to_1293ab35 from bump_to_f4b9839d
Conversation
…#106835) The target name and the message are wrong -- both should say "cuda" for the filtering to work. Fixes commit 300e5b9 (llvm#93186).
In llvm#92581 the `LibomptargetUtils.cmake` helpers were removed, but only the uses of `libomptarget_say` were migrated. Migrate the remaining few warning and error messages so that the `check-offload` target does not fail due to the missing `libomptarget_warning_say`. While at it, update the `check-offload` unavailability message to say `check-offload` instead of `check-libomptarget`. Fixes llvm#92581
…lvm#106634) Summary: The `langinfo.h` header is a POSIX extension, so ideally we would be able to build the C++ library without it. Currently the LLVM C library doesn't support / provide it. This allows us to build the C++ library with locales enabled. We can either disable it here, or just provide stubs that do nothing as in llvm#106620.
…6632) Summary: We currently do not provide a more complicated rune table, so we want the default.
This patch adds cost model support for [u|s]cmp.
/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21558:14: error: unused variable 'ValLMUL' [-Werror,-Wunused-variable]
    unsigned ValLMUL =
             ^
/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21561:14: error: unused variable 'PartLMUL' [-Werror,-Wunused-variable]
    unsigned PartLMUL =
             ^
2 errors generated.
…mber of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#106449
First step toward supporting the WaveSize attribute described in https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html and https://microsoft.github.io/hlsl-specs/proposals/0013-wave-size-range.html A new attribute, HLSLWaveSizeAttr, is supported in the AST. Implement the wave size and the wave size range together rather than separately, which might require more work. For llvm#70118
HLSL output parameters are denoted with the `inout` and `out` keywords in the function declaration. When an argument is passed to an output parameter, a temporary value is constructed for the argument.

For `inout` parameters the temporary is initialized via copy-initialization from the argument lvalue expression to the parameter type. For `out` parameters the temporary is not initialized before the call.

In both cases, on return of the function the temporary value is written back to the argument lvalue expression through an implicit assignment binary operator, with casting as required.

This change introduces a new HLSLOutArgExpr AST node which represents the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.

Fixes llvm#87526

---------

Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
…lvm#106828) Stop adding liveins for virtual registers. In the livein interface, the register goes through an MCPhysReg, which is a uint16_t. This drops the virtual register bit, making the register alias some nonsense physical register. Recompute the liveins for the continue block to handle any live registers that are needed by instructions that were spliced from the original block. This fixes the machine verifier error, so we can remove that FIXME now.
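The truncation described above can be illustrated with a small sketch. The type alias, the flag constant, and both helper functions below are hypothetical stand-ins rather than the LLVM declarations, and the position of the virtual-register marker bit (bit 31) is an assumption made for the illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; not the LLVM declarations.
using PhysRegId = std::uint16_t;                    // models a 16-bit MCPhysReg
constexpr std::uint32_t VirtualRegFlag = 1u << 31;  // assumed marker bit for virtual registers

// A register number is treated as virtual when the marker bit is set.
constexpr bool isVirtual(std::uint32_t Reg) { return (Reg & VirtualRegFlag) != 0; }

// Narrowing to 16 bits discards the marker bit, so the result is
// indistinguishable from an unrelated physical register number.
constexpr PhysRegId narrowToPhysReg(std::uint32_t Reg) {
  return static_cast<PhysRegId>(Reg);
}
```

Under these assumptions, a virtual register with index 5 narrows to the plain value 5, which is why it ends up aliasing an unrelated physical register.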
llvm#105740) …limination ArgumentPromotion and DeadArgumentElimination passes may change function signatures. This makes bpf tracing difficult, since users are either not aware of the signature change or need to poke into IR or assembly to understand it. This patch emits remarks so that, when recompiling with -foptimization-record-file=<file>, users can check the remarks to see what kind of signature change was made to a particular function. The following are some examples of the implemented remarks:
```
Pass: deadargelim
Name: ReturnValueRemoved
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.c', Line: 572, Column: 0 }
Function: mptcp_check_data_fin
Args:
  - String: 'removing return value '
  - String: '0'

Pass: deadargelim
Name: ArgumentRemoved
DebugLoc: { File: 'bpf-next/kernel/bpf/syscall.c', Line: 1670, Column: 0 }
Function: map_delete_elem
Args:
  - String: 'eliminating argument '
  - ArgName: uattr.coerce0
  - String: '('
  - ArgIndex: '1'
  - String: ')'

Pass: argpromotion
Name: ArgumentPromoted
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.h', Line: 570, Column: 0 }
Function: mptcp_subflow_ctx
Args:
  - String: 'promoting argument '
  - ArgName: sk
  - String: '('
  - ArgIndex: '0'
  - String: ')'
  - String: ' to pass by value'
```
[1] llvm#104678
This applies to function template non-call partial ordering the same provisional wording change applied in the call context: Don't perform the consistency check on return type and parameters which didn't have any template parameters deduced from. Fixes regression introduced in llvm#100692, which was reported on the PR.
…m#101414) We have been discussing changes to our commit access policies recently and based on some feedback from clattner here: https://discourse.llvm.org/t/rfc-new-criteria-for-commit-access/76290/81 We need to update our Developer Policy so that it matches what we are actually doing in this project. We currently grant commit access to anyone with a valid justification, not just contributors who have submitted high-quality patches in the past. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
This matches the MachineBasicBlock liveins used to populate it.
…vm#105728) Basic infrastructure to collect Function properties in Metadata Analysis
- Add a `SmallVector` of entry properties to the metadata information.
- Add a structure to represent function properties. Currently `numthreads` and shader kind properties of shader entry functions are represented.
…llvm#105510) This patch replaces all dominated uses of a condition with true/false to improve context-sensitive optimizations. It eliminates a bunch of branches in llvm-opt-benchmark. As a side effect, it may introduce new phi nodes in some corner cases. See the following case:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
  ret i1 %res
}
```
It will be simplified into:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
  ret i1 %res
}
```
I am planning to fix this in late pipeline/CGP since this problem exists before the patch.
Fix the DeclID not being set in global temporaries and use the same strategy for deciding if a temporary is readable as the current interpreter.
As far as I can tell, there's no way to call this. There are no calls in the X86 directory. It has the same name as a function in MCRegisterInfo, but that function takes a MCRegister and isn't virtual. The function in MCRegisterInfo uses a DenseMap populated by `X86_MC::initLLVMToSEHAndCVRegMapping`. The DenseMap is populated for every physical register using the encoding value. I think that means the function in MCRegisterInfo would return the same value as the function in X86RegisterInfo.
…106886) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes.
…n Windows (llvm#106794) Suppresses the copyright banner for `ml64` compiling BLAKE3 assembly sources with MSVC and Ninja on Windows:
```
[157/3758] Building ASM_MASM object lib\Support\BLAKE3\CMa...upportBlake3.dir\blake3_avx512_x86-64_windows_msvc.asm.obj
Microsoft (R) Macro Assembler (x64) Version 14.41.34120.0
Copyright (C) Microsoft Corporation. All rights reserved.
Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
is now just:
```
Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
We can suppress that last line with `/quiet` in more recent versions of `ml64` (from MSVC 2022 17.6) but it is not supported by all potential MASM compilers.
When the shuffle masks are `PoisonMaskElem`, there is no need to check the cost of `SK_ExtractSubvector`, because the extraction is free. Checking it anyway causes the compiler to crash with: Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) && "SK_ExtractSubvector index out of range"' failed.
…s that emit two instructions. (llvm#107011) All of the test changes are because integer type legalization prefers to promote fp_to_uint to fp_to_sint if neither is "Legal".
Bump the lldb-dap version number so that we can publish an updated version in the Visual Studio Marketplace.
…ions reduced values Need to correctly track reduced values with multiple uses in the same reduction emission attempt. Otherwise, the number of reuses might be calculated incorrectly, which may cause a compiler crash. Fixes llvm#107037
Add three more special cases for loading registers with immediates.

The first allows values in the range of [-255, 255] to be loaded with MOVEQ, even if the register is more than 8 bits and the sign extension is unwanted. This is done by loading the bitwise complement of the desired value, then performing a NOT instruction on the loaded register. This special case is only used when a simple MOVEQ cannot be used, and is only used for 32 bit data registers. Address registers cannot support MOVEQ, and the two-instruction sequence is no faster or smaller than a plain MOVE instruction when loading 16 bit immediates on the 68000, and likely slower for more sophisticated microarchitectures. However, the instruction sequence is both smaller and faster than the corresponding MOVE instruction for 32 bit register widths.

The second special case is for zeroing address registers. This simply expands to subtracting a register from itself, consuming one instruction word rather than 2-3, with a small improvement in speed as well.

The last special case is for assigning sign-extended 16-bit values to a full address register. This takes advantage of the fact that the movea.w instruction sign extends the output, permitting the immediate to be smaller. This is similar to using lea with a 16-bit address, which is not added in this patch as 16-bit absolute addressing is not yet implemented.

This is a v2 submission of llvm#90817. It also creates a 'Data' test directory to better align with the backend's tablegen layout.
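As a rough model of the first special case, the sketch below simulates a MOVEQ of the complemented low byte followed by a NOT. It assumes that the NOT operates on the low byte only, which the commit message does not state; that assumption is what makes the quoted [-255, 255] range work out in this model. All function names are hypothetical and none of this is the backend's code.

```cpp
#include <cstdint>

// Models MOVEQ: sign-extend an 8-bit immediate into a 32-bit data register.
constexpr std::uint32_t moveq(std::int8_t Imm) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(Imm));
}

// Models a byte-sized NOT (assumed operand size): invert only the low 8 bits.
constexpr std::uint32_t notLowByte(std::uint32_t Reg) { return Reg ^ 0xFFu; }

// Loads Value by moving the complement of its low byte (sign-extended by
// MOVEQ) and then inverting that byte. In this model the result matches
// Value only for magnitudes that a plain MOVEQ cannot encode.
constexpr std::uint32_t loadViaMoveqNot(std::int32_t Value) {
  const std::uint8_t Low =
      static_cast<std::uint8_t>(~static_cast<std::uint32_t>(Value) & 0xFFu);
  return notLowByte(moveq(static_cast<std::int8_t>(Low)));
}
```

In this model, 200 is produced by a MOVEQ of 0x37 followed by the NOT. A value such as 100, which a plain MOVEQ already encodes, is not reproduced, which fits the statement that the sequence is only used when a simple MOVEQ cannot be.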
Select only needs branches and moves so we don't need to promote it. Promoting would canonicalize NaNs which select shouldn't do.
The only instance where we weren't already passing a `StringRef` with a known length to `Symbol`'s constructor is where the argument is a string literal. Even in that case, lazy `strlen` calls don't make sense, as the compiler can constant-evaluate the `StringRef(const char*)` constructor. For symbols that go into the symbol table we need the length when calculating the hash anyway. We could get away with not calling `getName()` for local symbols, but the total contribution of `strlen` to the run time is already below 1%, so that would just complicate the code for a negligible benefit.
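The point about constant evaluation can be sketched with `std::string_view`, used here as a stand-in for `StringRef` (an assumption for illustration; the helper name is hypothetical). When the argument is a string literal, the length is computed at compile time, so no `strlen` call remains at run time.

```cpp
#include <cstddef>
#include <string_view>

// std::string_view stands in for StringRef in this sketch.
// Constructing it from a string literal is a constant expression in C++17.
constexpr std::size_t literalLength(std::string_view Name) { return Name.size(); }

// The compiler evaluates the length of the literal during compilation.
static_assert(literalLength("main") == 4, "length is known at compile time");
```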
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
  - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
  - the contexts of the callee (at the inlined callsite) are moved to the caller.
  - the callee context at the inlined callsite is deleted.
…ctorizations reduced values" This reverts commit 98bb354 to fix buildbots https://lab.llvm.org/buildbot/#/builders/155/builds/2056 and https://lab.llvm.org/buildbot/#/builders/11/builds/4407
Uses a static lock to ensure multiple threads reporting issues at the same time don't have printing collisions. This isn't so important now, but will be with continue mode in the future.
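A minimal sketch of the static-lock pattern, assuming a hypothetical `reportIssue` function that writes to an in-memory sink instead of the real output stream. The names are illustrative and not taken from the patch.

```cpp
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical output sink for the sketch; real code would print to stderr.
std::ostringstream &reportSink() {
  static std::ostringstream Sink;
  return Sink;
}

// Every caller shares one function-local static mutex, so a report is
// written in full before another thread can start writing its own.
void reportIssue(const std::string &Message) {
  static std::mutex ReportLock;
  std::lock_guard<std::mutex> Guard(ReportLock);
  reportSink() << "issue: " << Message << '\n';
}
```

A function-local static keeps the lock private to the reporting routine while still being shared across all threads that call it.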
`memory read` will return an error if you try to read more than 1k bytes in a single command, instructing you to set `target.max-memory-read-size` or use `--force` if you intended to read more than that. This is a safeguard for a command where people are being explicit about how much memory they would like lldb to read (either to display, or save to a file) and is an annoyance every time you need to read more than a small amount. If someone confuses the --count argument with the start address, lldb may begin dumping gigabytes of data but I'd rather that behavior than requiring everyone to special-case their way around a common use case. I don't want to remove the setting because many people have added (much larger) default max read sizes to their ~/.lldbinit files after hitting this behavior. Another option would be to stop reading/using the value in Target.cpp, but I see no harm in leaving the setting if someone really does prefer to have a small cap on their memory read size.
Despite the stale comments, none of these actually use TTI, and they're solely generating standard LLVM IR.
…ion) (llvm#107131) [CWG2486](https://cplusplus.github.io/CWG/issues/2486.html) "Call to `noexcept` function via `noexcept(false)` pointer/lvalue" allows `noexcept` functions to be called via `noexcept(false)` pointers or values. There appears to be no implementation divergence whatsoever: https://godbolt.org/z/3afTfeEM8. That said, in C++14 and earlier we do not issue all the diagnostics we issue in C++17 and newer, so I'm specifying the status of the issue accordingly.
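For readers unfamiliar with the issue, the situation it covers looks roughly like the sketch below: a function declared `noexcept` is called through a pointer whose type is not `noexcept`. The function names are hypothetical.

```cpp
// A function that promises not to throw.
int answer() noexcept { return 42; }

// The pointer type below is a pointer to a potentially-throwing function;
// the call through it is still well-formed and reaches the noexcept function.
int callThroughPotentiallyThrowingPointer() {
  int (*Ptr)() = answer;
  return Ptr();
}
```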
This patch implements sandboxir:: ConstantAggregate, ConstantStruct, ConstantArray and ConstantVector, mirroring LLVM IR.
…in/Zvfhmin+Zfbfmin/Zfhmin. (llvm#106637) Previously, if Zfbfmin/Zfhmin were enabled, we only handled build_vectors that could be turned into splat_vectors. We promoted them to f32 splats by extending in the scalar domain and narrowing in the vector domain. This patch fixes a crash where we failed to account for whether the f32 vector type fit in LMUL<=8. Because the new lowering occurs after type legalization, we have to be careful to use XLenVT for the scalar integer type and use custom cast nodes.
…m#107157) The `Kind` argument does not need to be passed separately.
… `outer_dims_perm` attribute (llvm#106687)
This patch covers Core issues about language linkage during declaration matching resolved in [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html), namely [CWG563](https://cplusplus.github.io/CWG/issues/563.html) and [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html).

[CWG563](https://cplusplus.github.io/CWG/issues/563.html) "Linkage specification for objects"
-----------
[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG563](https://cplusplus.github.io/CWG/issues/563.html) is resolved by simplifications that follow its suggestions.

Wording ([[dcl.link]/5](https://eel.is/c++draft/dcl.link#5)):
> In a [linkage-specification](https://eel.is/c++draft/dcl.link#nt:linkage-specification), the specified language linkage applies to the function types of all function declarators and to all functions and variables whose names have external linkage[.](https://eel.is/c++draft/dcl.link#5.sentence-5)

Now the wording clearly says that a linkage-specification applies to variables with external linkage.

[CWG1818](https://cplusplus.github.io/CWG/issues/1818.html) "Visibility and inherited language linkage"
------------
[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG386](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#386), [CWG1839](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1839), [CWG1818](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1818), [CWG2058](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2058), [CWG1900](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1900), and Richard’s observation in [“are non-type names ignored in a class-head-name or enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are resolved by describing the limited lookup that occurs for a declarator-id, including the changes in Richard’s [proposed resolution for CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html) (which also resolves CWG1818 and what of CWG2058 was not resolved along with CWG2059) and rejecting the example from [CWG1477](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1477).

Wording ([[dcl.link]/6](https://eel.is/c++draft/dcl.link#6)):
> A redeclaration of an entity without a linkage specification inherits the language linkage of the entity and (if applicable) its type[.](https://eel.is/c++draft/dcl.link#6.sentence-2)

The answer to the question in the example is `extern "C"`, and not a linkage mismatch. Further analysis of the example is provided as inline comments in the test itself. Note that https://eel.is/c++draft/dcl.link#7 does NOT apply in this example, as it's focused squarely on declarations that are already known to have C language linkage, and declarations of variables in the global scope.
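The inheritance rule quoted from [dcl.link]/6 can be seen in a small sketch. This is not the example from the issue or from the test; the function name is hypothetical.

```cpp
// First declaration: the linkage-specification gives add_one C language linkage.
extern "C" int add_one(int Value);

// Redeclaration (here, the definition) without a linkage-specification:
// it inherits the C language linkage of the earlier declaration rather
// than introducing a conflicting C++-linkage entity.
int add_one(int Value) { return Value + 1; }
```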
These don't look like they've been used since the original 'use-diet' branch was merged in 2008 (f6caff6)
…5614) In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.
…vm#105686) The lowering happens in post-legalizer lowering if any source registers from G_BUILD_VECTOR are not constants. Add a pattern fragment setting `scalar_to_vector ($src)` as equivalent to `vector_insert (undef), ($src), (i61 0)`
…lvm#107041) This patch is provided by @jeliebig. Fixes llvm#107017.
Fixes llvm#105972. Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>
…lvm#106094) Fix `RankedTensorType` equality check in unpack op canonicalization.
…7074) A macro definition needs its own scope stack in the annotator, so we add the MacroBodyScopes stack and use ScopeStack to refer to it when in the macro definition body. Also, we need to have a scope type for a child block because its parent line is parsed (and thus the scope type for the braces is popped off the scope stack) before the lines in the child block are. Fixes llvm#99271.
…SPIRV (llvm#107110) This patch adds a type check for `tensor.extract` in TensorToSPIRV. Only convert `tensor.extract` with a supported element type. Fix llvm#74466.
cferry-AMD approved these changes Sep 30, 2024
No description provided.