[AutoBump] Merge with f4b9839d (Sep 04) (20) #373

mgehre-amd · 2024-09-25T15:27:19Z

No description provided.

…#106835) The target name and the message are wrong -- both should say "cuda" for the filtering to work. Fixes commit 300e5b9 (llvm#93186).

In llvm#92581 the `LibomptargetUtils.cmake` helpers were removed, but only uses of `libomptarget_say` were migrated. Migrate the remaining few warning and error messages so the `check-offload` target does not fail due to a missing `libomptarget_warning_say`. While at it, update the `check-offload` unavailability message to say `check-offload` instead of `check-libomptarget`. Fixes llvm#92581

…lvm#106634) Summary: The `langinfo.h` header is a POSIX extension, so ideally we would be able to build the C++ library without it. Currently the LLVM C library doesn't support / provide it. This allows us to build the C++ library with locales enabled. We can either disable it here, or just provide stubs that do nothing as in llvm#106620.

…6632) Summary: We currently do not provide a more complicated rune table, so we want the default.

This patch adds cost model support for [u|s]cmp.

/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21558:14: error: unused variable 'ValLMUL' [-Werror,-Wunused-variable]
    unsigned ValLMUL =
             ^
/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21561:14: error: unused variable 'PartLMUL' [-Werror,-Wunused-variable]
    unsigned PartLMUL =
             ^
2 errors generated.

…mber of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#106449

First step toward supporting the WaveSize attribute described in https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html and https://microsoft.github.io/hlsl-specs/proposals/0013-wave-size-range.html. A new attribute, HLSLWaveSizeAttr, is added to the AST. Both the wave size and the wave size range are implemented together, since implementing them separately might require more work. For llvm#70118

HLSL output parameters are denoted with the `inout` and `out` keywords in the function declaration. When an argument is passed to an output parameter, a temporary value is constructed for the argument.

For `inout` parameters the argument is initialized via copy-initialization from the argument lvalue expression to the parameter type. For `out` parameters the argument is not initialized before the call.

In both cases on return of the function the temporary value is written back to the argument lvalue expression through an implicit assignment binary operator with casting as required.

This change introduces a new HLSLOutArgExpr AST node which represents the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.

Fixes llvm#87526

---------

Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
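The copy-in/copy-out behavior described in this commit message can be mimicked in plain C++. The following is only a sketch of the semantics, not Clang's implementation: the helper names `callWithInOut`, `callWithOut`, `increment`, `setToSeven`, and `demoInOut` are hypothetical, and the `static_cast` calls stand in for the "casting as required" step.

```cpp
#include <cassert>

// Hypothetical helpers modelling the described semantics; not Clang code.

// `inout`: the temporary is copy-initialized from the argument (converted to
// the parameter type), and written back to the argument after the call.
template <typename ParamT, typename ArgT, typename Fn>
void callWithInOut(ArgT &Arg, Fn Callee) {
  ParamT Tmp = static_cast<ParamT>(Arg);
  Callee(Tmp);
  Arg = static_cast<ArgT>(Tmp);
}

// `out`: the temporary is not initialized from the argument; the callee is
// expected to assign it before returning. It is written back afterwards.
template <typename ParamT, typename ArgT, typename Fn>
void callWithOut(ArgT &Arg, Fn Callee) {
  ParamT Tmp;
  Callee(Tmp);
  Arg = static_cast<ArgT>(Tmp);
}

inline void increment(int &X) { X += 1; } // plays the role of f(inout int X)
inline void setToSeven(int &X) { X = 7; } // plays the role of g(out int X)

// Usage: a float argument passed to an int `inout` parameter is converted to
// int on the way in (2.5 becomes 2) and back to float on the way out.
inline float demoInOut() {
  float Value = 2.5f;
  callWithInOut<int>(Value, increment);
  assert(Value == 3.0f);
  return Value;
}
```

The write-back happens in the caller-side helper, which mirrors how the commit message places the assignment in the argument expression and not in the callee.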

…lvm#106828) Stop adding liveins for virtual registers. In the livein interface, the register goes through a MCPhysReg which is uint16_t. This causes the virtual register bit to be dropped, making it alias some nonsense physical register. Recompute the liveins for the continue block to handle any live registers that are needed by instructions that were spliced from the original block. This fixes the machine verifier error, so we can remove that fixme now.
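A minimal sketch of the aliasing problem described above. It assumes that virtual registers are marked by the top bit of a 32-bit register number; the commit message only says a "virtual register bit" is dropped by the `uint16_t` narrowing, so the exact bit position and all names here (`kVirtualRegFlag`, `PhysRegId`, and the helper functions) are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the virtual-register marker is the top bit of a 32-bit number.
constexpr uint32_t kVirtualRegFlag = 1u << 31; // hypothetical name
using PhysRegId = uint16_t;                    // stands in for MCPhysReg

constexpr uint32_t makeVirtualReg(uint32_t Index) {
  return Index | kVirtualRegFlag;
}
constexpr bool isVirtualReg(uint32_t Reg) {
  return (Reg & kVirtualRegFlag) != 0;
}

// Narrowing to 16 bits silently discards the marker bit.
constexpr PhysRegId narrowToPhysReg(uint32_t Reg) {
  return static_cast<PhysRegId>(Reg);
}

static_assert(narrowToPhysReg(makeVirtualReg(5)) == 5,
              "virtual register 5 now looks like physical register 5");

// Returns true when the narrowed value no longer looks virtual.
inline bool virtualFlagLostByNarrowing(uint32_t Index) {
  uint32_t Virt = makeVirtualReg(Index);
  assert(isVirtualReg(Virt));
  return !isVirtualReg(narrowToPhysReg(Virt));
}
```

Under these assumptions the narrowed value is indistinguishable from an unrelated physical register number, which is the aliasing the commit avoids by not adding such liveins at all.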

llvm#105740) …limination The ArgumentPromotion and DeadArgumentElimination passes may change a function's signature. This makes bpf tracing difficult, since users are either not aware of the signature change or need to poke into IR or assembly to understand it. This patch emits remarks so that, when recompiling with -foptimization-record-file=<file>, users can check the remarks to see what kind of signature change was made to a particular function. The following are some examples of the implemented remarks:
```
Pass: deadargelim
Name: ReturnValueRemoved
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.c', Line: 572, Column: 0 }
Function: mptcp_check_data_fin
Args:
  - String: 'removing return value '
  - String: '0'

Pass: deadargelim
Name: ArgumentRemoved
DebugLoc: { File: 'bpf-next/kernel/bpf/syscall.c', Line: 1670, Column: 0 }
Function: map_delete_elem
Args:
  - String: 'eliminating argument '
  - ArgName: uattr.coerce0
  - String: '('
  - ArgIndex: '1'
  - String: ')'

Pass: argpromotion
Name: ArgumentPromoted
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.h', Line: 570, Column: 0 }
Function: mptcp_subflow_ctx
Args:
  - String: 'promoting argument '
  - ArgName: sk
  - String: '('
  - ArgIndex: '0'
  - String: ')'
  - String: ' to pass by value'
```
[1] llvm#104678

This applies to function template non-call partial ordering the same provisional wording change applied in the call context: Don't perform the consistency check on return type and parameters which didn't have any template parameters deduced from. Fixes regression introduced in llvm#100692, which was reported on the PR.

…m#101414) We have been discussing changes to our commit access policies recently and based on some feedback from clattner here: https://discourse.llvm.org/t/rfc-new-criteria-for-commit-access/76290/81 We need to update our Developer Policy so that it matches what we are actually doing in this project. We currently grant commit access to anyone with a valid justification, not just contributors who have submitted high-quality patches in the past. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>

…m#106489)

This matches the MachineBasicBlock liveins used to populate it.

…vm#105728) Basic infrastructure to collect Function properties in Metadata Analysis - Add a `SmallVector` of entry properties to the metadata information. - Add a structure to represent function properties. Currently `numthreads` and shader kind properties of shader entry functions are represented.

fixes llvm#103300

…llvm#105510) This patch replaces all dominated uses of a condition with true/false to improve context-sensitive optimizations. It eliminates a bunch of branches in llvm-opt-benchmark. As a side effect, it may introduce new phi nodes in some corner cases. See the following case:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
  ret i1 %res
}
```
It will be simplified into:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
  ret i1 %res
}
```
I am planning to fix this in late pipeline/CGP since this problem exists before the patch.

Fixes llvm#106761.

…asts. NFC

Fix the DeclID not being set in global temporaries and use the same strategy for deciding if a temporary is readable as the current interpreter.

As far as I can tell, there's no way to call this. There are no calls in the X86 directory. It has the same name as a function in MCRegisterInfo, but that function takes a MCRegister and isn't virtual. The function in MCRegisterInfo uses a DenseMap populated by `X86_MC::initLLVMToSEHAndCVRegMapping`. The DenseMap is populated for every physical register using the encoding value. I think that means the function in MCRegisterInfo would return the same value as the function in X86RegisterInfo.

…106886) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes.

…lvm#106882)

…n Windows (llvm#106794) Suppresses the copyright banner for `ml64` compiling BLAKE3 assembly sources with MSVC and Ninja on Windows:
```
[157/3758] Building ASM_MASM object lib\Support\BLAKE3\CMa...upportBlake3.dir\blake3_avx512_x86-64_windows_msvc.asm.obj
Microsoft (R) Macro Assembler (x64) Version 14.41.34120.0
Copyright (C) Microsoft Corporation. All rights reserved.
 Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
is now just:
```
 Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
We can suppress that last line with `/quiet` in more recent versions of `ml64` (from MSVC 2022 17.6) but it is not supported by all potential MASM compilers.

) This doesn't seem to have any use other than the possibility of merge conflicts and accidentally forgetting to update `NUM_PREDEF_DECL_IDS`.

When the shuffle masks are `PoisonMaskElem`, there is no need to check the cost of `SK_ExtractSubvector`; it is free. Checking it anyway causes the compiler to crash with: Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) && "SK_ExtractSubvector index out of range"' failed.

…s that emit two instructions. (llvm#107011) All of the test changes are because integer type legalization prefers to promote fp_to_uint to fp_to_sint if neither is "Legal".

Bump the lldb-dap version number so that we can publish an updated version in the Visual Studio Marketplace.

…ions reduced values Need to correctly track reduced values with multiple uses in the same reduction emission attempt. Otherwise, the number of the reuses might be calculated incorrectly, and may cause compiler crash. Fixes llvm#107037

Add three more special cases for loading registers with immediates.

The first allows values in the range of [-255, 255] to be loaded with MOVEQ, even if the register is more than 8 bits and the sign extension is unwanted. This is done by loading the bitwise complement of the desired value, then performing a NOT instruction on the loaded register. This special case is only used when a simple MOVEQ cannot be used, and is only used for 32 bit data registers. Address registers cannot support MOVEQ, and the two-instruction sequence is no faster or smaller than a plain MOVE instruction when loading 16 bit immediates on the 68000, and likely slower for more sophisticated microarchitectures. However, the instruction sequence is both smaller and faster than the corresponding MOVE instruction for 32 bit register widths.

The second special case is for zeroing address registers. This simply expands to subtracting a register with itself, consuming one instruction word rather than 2-3, with a small improvement in speed as well.

The last special case is for assigning sign-extended 16-bit values to a full address register. This takes advantage of the fact that the movea.w instruction sign extends the output, permitting the immediate to be smaller. This is similar to using lea with a 16-bit address, which is not added in this patch as 16-bit absolute addressing is not yet implemented.

This is a v2 submission of llvm#90817. It also creates a 'Data' test directory to better align with the backend's tablegen layout.
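The arithmetic behind the first special case can be checked with a small model. The commit message does not say which operand size the NOT uses; this sketch assumes it inverts only the low byte of the register, since that is the reading under which every value in the stated range is reachable from an 8-bit MOVEQ immediate. All function names below are hypothetical and none of this is the backend's code.

```cpp
#include <cassert>
#include <cstdint>

// Models MOVEQ: an 8-bit immediate sign-extended into a 32-bit data register.
inline uint32_t moveq(int8_t Imm) {
  return static_cast<uint32_t>(static_cast<int32_t>(Imm));
}

// Models a byte-sized NOT (assumption): only the low 8 bits are inverted.
inline uint32_t notLowByte(uint32_t Reg) { return Reg ^ 0xFFu; }

// True when Value is in [-255, 255] but not loadable by a plain MOVEQ.
inline bool needsMoveqNot(int32_t Value) {
  return Value >= -255 && Value <= 255 && (Value < -128 || Value > 127);
}

// The MOVEQ immediate to load before the NOT: Value with its low byte
// complemented. The upper 24 bits of Value are all zeros or all ones, so the
// result is recoverable by sign-extending its low byte.
inline int8_t moveqImmediateBeforeNot(int32_t Value) {
  uint32_t Flipped = static_cast<uint32_t>(Value) ^ 0xFFu;
  return static_cast<int8_t>(Flipped & 0xFFu);
}

// Checks every value the two-instruction sequence is meant to cover.
inline void checkMoveqNotRange() {
  for (int32_t V = -255; V <= 255; ++V) {
    if (!needsMoveqNot(V))
      continue;
    int8_t Imm = moveqImmediateBeforeNot(V);
    assert(notLowByte(moveq(Imm)) == static_cast<uint32_t>(V));
  }
}
```

For example, under this model 200 would be loaded as MOVEQ #55 followed by the NOT, and -200 as MOVEQ #-57 followed by the NOT.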

Select only needs branches and moves so we don't need to promote it. Promoting would canonicalize NaNs which select shouldn't do.

The only instance where we weren't already passing a `StringRef` with a known length to `Symbol`'s constructor is where the argument is a string literal. Even in that case, lazy `strlen` calls don't make sense, as the compiler can constant-evaluate the `StringRef(const char*)` constructor. For symbols that go into the symbol table we need the length when calculating the hash anyway. We could get away with not calling `getName()` for local symbols, but the total contribution of `strlen` to the run time is already below 1%, so that would just complicate the code for a negligible benefit.
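The point about string literals can be illustrated with `std::string_view`, used here as a standard-library analogue of `StringRef` (an assumption made so the sketch compiles without LLVM headers). `NamedSymbol`, `kEntry`, and `nameLength` are hypothetical names, not lld's `Symbol` class.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a symbol that stores its name with a known length.
struct NamedSymbol {
  std::string_view Name;
  constexpr explicit NamedSymbol(std::string_view N) : Name(N) {}
};

// Constructing from a literal: the length is computed during constant
// evaluation, so no strlen call is needed at run time for this object.
constexpr NamedSymbol kEntry{"_start"};
static_assert(kEntry.Name.size() == 6, "length is known at compile time");

// The stored length is then available wherever it is needed, e.g. for hashing.
inline std::size_t nameLength(const NamedSymbol &Sym) {
  assert(!Sym.Name.empty());
  return Sym.Name.size();
}
```

Storing the length eagerly keeps the accessor trivial, which matches the commit's argument that a lazy `strlen` would only add complexity.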

Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
  - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
  - the contexts of the callee (at the inlined callsite) are moved to the caller.
  - the callee context at the inlined callsite is deleted.

…ctorizations reduced values" This reverts commit 98bb354 to fix buildbots https://lab.llvm.org/buildbot/#/builders/155/builds/2056 and https://lab.llvm.org/buildbot/#/builders/11/builds/4407

Uses a static lock to ensure multiple threads reporting issues at the same time don't have printing collisions. This isn't so important now, but will be with continue mode in the future.
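A minimal sketch of what a scoped reporting lock of this kind can look like, assuming a function-local static `std::mutex` and an RAII guard. The names `ScopedReportLock` and `appendReport` are hypothetical; the actual rtsan code is not shown in this commit message and may be structured differently.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical RAII guard: one process-wide mutex serializes all reports.
class ScopedReportLock {
public:
  ScopedReportLock() : Guard(sharedMutex()) {}
  ScopedReportLock(const ScopedReportLock &) = delete;
  ScopedReportLock &operator=(const ScopedReportLock &) = delete;

private:
  // The static mutex is created on first use and shared by every guard.
  static std::mutex &sharedMutex() {
    static std::mutex Mutex;
    return Mutex;
  }
  std::lock_guard<std::mutex> Guard;
};

// Appends a two-line report; holding the lock for the whole report keeps
// lines from different threads from interleaving.
inline void appendReport(std::string &Out, int ThreadId) {
  assert(ThreadId >= 0 && "illustrative IDs are non-negative");
  ScopedReportLock Lock;
  Out += "report from thread " + std::to_string(ThreadId) + "\n";
  Out += "end of report " + std::to_string(ThreadId) + "\n";
}
```

The guard releases the mutex when it goes out of scope, so an early return from the reporting function cannot leave the lock held.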

`memory read` will return an error if you try to read more than 1k bytes in a single command, instructing you to set `target.max-memory-read-size` or use `--force` if you intended to read more than that. This is a safeguard for a command where people are being explicit about how much memory they would like lldb to read (either to display, or save to a file) and is an annoyance every time you need to read more than a small amount. If someone confuses the --count argument with the start address, lldb may begin dumping gigabytes of data but I'd rather that behavior than requiring everyone to special-case their way around a common use case. I don't want to remove the setting because many people have added (much larger) default max read sizes to their ~/.lldbinit files after hitting this behavior. Another option would be to stop reading/using the value in Target.cpp, but I see no harm in leaving the setting if someone really does prefer to have a small cap on their memory read size.

Despite the stale comments, none of these actually use TTI, and they're solely generating standard LLVM IR.

…ion) (llvm#107131) [CWG2486](https://cplusplus.github.io/CWG/issues/2486.html) "Call to `noexcept` function via `noexcept(false)` pointer/lvalue" allows `noexcept` functions to be called via `noexcept(false)` pointers or values. There appears to be no implementation divergence whatsoever: https://godbolt.org/z/3afTfeEM8. That said, in C++14 and earlier we do not issue all the diagnostics we issue in C++17 and newer, so I'm specifying the status of the issue accordingly.
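A small example of the kind of call CWG2486 is about: a `noexcept` function invoked through a pointer whose type does not carry `noexcept`. This is an illustration written for this summary, not one of the tests added by the commit; `answer` and `callThroughPlainPointer` are hypothetical names.

```cpp
#include <cassert>

// A function that promises not to throw.
inline int answer() noexcept { return 42; }

// Calls it through a pointer type without noexcept. In C++17 and later the
// initialization is a function pointer conversion; in earlier standards
// noexcept is not part of the function type at all.
inline int callThroughPlainPointer() {
  int (*Fn)() = answer;
  assert(Fn != nullptr);
  return Fn();
}
```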

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

This patch implements sandboxir:: ConstantAggregate, ConstantStruct, ConstantArray and ConstantVector, mirroring LLVM IR.

…in/Zvfhmin+Zfbfmin/Zfhmin. (llvm#106637) Previously, if Zfbfmin/Zfhmin were enabled, we only handled build_vectors that could be turned into splat_vectors. We promoted them to f32 splats by extending in the scalar domain and narrowing in the vector domain. This patch fixes a crash where we failed to account for whether the f32 vector type fit in LMUL<=8. Because the new lowering occurs after type legalization, we have to be careful to use XLenVT for the scalar integer type and use custom cast nodes.

…m#107157) The `Kind` argument does not need to be passed separately.

… `outer_dims_perm` attribute (llvm#106687)

This patch covers Core issues about language linkage during declaration matching resolved in [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html), namely [CWG563](https://cplusplus.github.io/CWG/issues/563.html) and [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html).

[CWG563](https://cplusplus.github.io/CWG/issues/563.html) "Linkage specification for objects"
-----------

[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG563](https://cplusplus.github.io/CWG/issues/563.html) is resolved by simplifications that follow its suggestions.

Wording ([[dcl.link]/5](https://eel.is/c++draft/dcl.link#5)):
> In a [linkage-specification](https://eel.is/c++draft/dcl.link#nt:linkage-specification), the specified language linkage applies to the function types of all function declarators and to all functions and variables whose names have external linkage[.](https://eel.is/c++draft/dcl.link#5.sentence-5)

Now the wording clearly says that linkage-specification applies to variables with external linkage.

[CWG1818](https://cplusplus.github.io/CWG/issues/1818.html) "Visibility and inherited language linkage"
------------

[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG386](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#386), [CWG1839](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1839), [CWG1818](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1818), [CWG2058](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2058), [CWG1900](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1900), and Richard’s observation in [“are non-type names ignored in a class-head-name or enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are resolved by describing the limited lookup that occurs for a declarator-id, including the changes in Richard’s [proposed resolution for CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html) (which also resolves CWG1818 and what of CWG2058 was not resolved along with CWG2059) and rejecting the example from [CWG1477](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1477).

Wording ([[dcl.link]/6](https://eel.is/c++draft/dcl.link#6)):
> A redeclaration of an entity without a linkage specification inherits the language linkage of the entity and (if applicable) its type[.](https://eel.is/c++draft/dcl.link#6.sentence-2)

Answer to the question in the example is `extern "C"`, and not linkage mismatch. Further analysis of the example is provided as inline comments in the test itself. Note that https://eel.is/c++draft/dcl.link#7 does NOT apply in this example, as it's focused squarely at declarations that are already known to have C language linkage, and declarations of variables in the global scope.
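The [dcl.link]/6 wording quoted above can be illustrated with a simple case. This is a sketch written for this summary and not the example from CWG1818 or the test added by the patch; `add_one` and `demoInheritedLinkage` are hypothetical names.

```cpp
#include <cassert>

// First declaration: the function is given C language linkage.
extern "C" int add_one(int X);

// Redeclaration (here, the definition) without a linkage-specification.
// Per the quoted wording it inherits the C language linkage of the entity,
// so both declarations refer to the same function.
int add_one(int X) { return X + 1; }

inline int demoInheritedLinkage() {
  assert(add_one(1) == 2);
  return add_one(41);
}
```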

These don't look like they've been used since the original 'use-diet' branch was merged in 2008 ( f6caff6)

…5614) In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.

…vm#105686) The lowering happens in post-legalizer lowering if any source registers from G_BUILD_VECTOR are not constants. Add a pattern fragment setting `scalar_to_vector ($src)` as equivalent to `vector_insert (undef), ($src), (i61 0)`

…lvm#107041) This patch is provided by @jeliebig. Fixes llvm#107017.

…07021) Fixes llvm#106994.

Fixes llvm#105972. Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>

…lvm#106094) Fix `RankedTensorType` equality check in unpack op canonicalization.

…7074) A macro definition needs its own scope stack in the annotator, so we add the MacroBodyScopes stack and use ScopeStack to refer to it when in the macro definition body. Also, we need to have a scope type for a child block because its parent line is parsed (and thus the scope type for the braces is popped off the scope stack) before the lines in the child block are. Fixes llvm#99271.

…SPIRV (llvm#107110) This patch adds a type check for `tensor.extract` in TensorToSPIRV. Only convert `tensor.extract` with a supported element type. Fix llvm#74466.

xen0n and others added 30 commits August 31, 2024 07:06

[Offload] Fix disabling of cuda target on unsupported platforms (llvm…

75545b3

…#106835) The target name and the message are wrong -- both should say "cuda" for the filtering to work. Fixes commit 300e5b9 (llvm#93186).

[libcxx] Use the default rune table when using the LLVM libc (llvm#10…

38dbcbd

…6632) Summary: We currently do not provide a more complicated rune table, so we want the default.

[TTI] Add cost model support for [u|s]cmp (llvm#106824)

140e80a

This patch adds cost model support for [u|s]cmp.

[lld] Fix invalid Python escape sequences (llvm#94033)

4514c38

[RISCV] Use MCRegister for return value from allocateRVVReg. NFC

6d9c6f0

[RISCV] Use MCRegister for vectors in CC_RISCV_FastCC. NFC

2afa975

[OpenMP] Support setting POSIX thread name on *BSD's and Solaris (llv…

37e109c

…m#106489)

[SelectionDAGISel] Use MCRegister and Register for LiveInMap. NFC

a3e2936

This matches the MachineBasicBlock liveins used to populate it.

[mlir][irdl] update documentation (llvm#103394)

84580a0

fixes llvm#103300

[NFC] Fix typos (llvm#106817)

4f4bd41

Fixes llvm#106761.

[RISCV] Merge similar code for legalizing i16<->f16 and i<->bf16 bitc…

6f682c2

…asts. NFC

[clang][bytecode] Fix diagnosing reads from temporaries (llvm#106868)

e4f3b56

Fix the DeclID not being set in global temporaries and use the same strategy for deciding if a temporary is readable as the current interpreter.

[RISCV] Custom legalize f16/bf16 FNEG/FABS with Zfhmin/Zbfmin. (llvm#…

3bdec31

…106886) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes.

[clang] NFCI: don't check deduced constraints when partial ordering (l…

840d4d9

…lvm#106882)

[Clang][NFC] Don't manually enumerate the PredefinedDeclIDs (llvm#106891

4fef204

) This doesn't seem to have any use other than the possibility of merge conflicts and accidentally forgetting to update `NUM_PREDEF_DECL_IDS`.

topperc and others added 29 commits September 3, 2024 15:34

[RISCV] Custom promote f16/bf16 fp_to_(s/u)int to reduce isel pattern…

db3792b

…s that emit two instructions. (llvm#107011) All of the test changes are because integer type legalization prefers to promote fp_to_uint to fp_to_sint if neither is "Legal".

[lldb] Bump the lldb-dap version number

7d3b81d

Bump the lldb-dap version number so that we can publish an updated version in the Visual Studio Marketplace.

[RISCV] Don't promote f16/bf16 SELECT with Zfhmin/Zfbfmin. (llvm#107138)

1c874bb

Select only needs branches and moves so we don't need to promote it. Promoting would canonicalize NaNs which select shouldn't do.

Revert "[SLP]Fix PR107037: correctly track origonal/modified after ve…

dce73e1

…ctorizations reduced values" This reverts commit 98bb354 to fix buildbots https://lab.llvm.org/buildbot/#/builders/155/builds/2056 and https://lab.llvm.org/buildbot/#/builders/11/builds/4407

[compiler-rt][rtsan] Add scoped reporting lock (llvm#107167)

18263c3

Uses a static lock to ensure multiple threads reporting issues at the same time don't have printing collisions. This isn't so important now, but will be with continue mode in the future.

Remove "Target" from createXReduction naming [nfc]

3e8840b

Despite the stale comments, none of these actually use TTI, and they're solely generating standard LLVM IR.

[X86][AVX10.2] Support AVX10.2-BF16 new instructions. (llvm#101603)

83ad644

Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

[SandboxIR] Implement ConstantAggregate (llvm#107136)

814aa43

This patch implements sandboxir:: ConstantAggregate, ConstantStruct, ConstantArray and ConstantVector, mirroring LLVM IR.

[gn build] Port 83ad644

48bc8b0

[WebAssembly] Remove Kind argument from WebAssemblyOperand (NFC) (llv…

f1615e3

…m#107157) The `Kind` argument does not need to be passed separately.

[mlir][tensor] Fix consumer fusion for tensor.pack without explicit…

c8763f0

… `outer_dims_perm` attribute (llvm#106687)

[IR] Remove unused MINARITY operand trait tpl args, NFC (llvm#107165)

b057e16

These don't look like they've been used since the original 'use-diet' branch was merged in 2008 ( f6caff6)

[VPlan][NFC] Implement VPWidenMemoryRecipe::computeCost(). (llvm#10…

ed220e1

…5614) In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.

[clang-format] Handle spaces in file paths in git-clang-format.bat (l…

12c0823

…lvm#107041) This patch is provided by @jeliebig. Fixes llvm#107017.

[clang-format] Fix a regression in annotating ObjCBlockLParen (llvm#1…

a27ff17

…07021) Fixes llvm#106994.

[clang][Driver] Define soft float macros for PPC. (llvm#106012)

b55186e

Fixes llvm#105972. Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>

[MLIR][Tensor] Fix source/dest type check in UnPackOp canonicalize (l…

8d08166

…lvm#106094) Fix `RankedTensorType` equality check in unpack op canonicalization.

[mlir][TensorToSPIRV] Add type check for tensor.extract in TensorTo…

f4b9839

…SPIRV (llvm#107110) This patch adds a type check for `tensor.extract` in TensorToSPIRV. Only convert `tensor.extract` with a supported element type. Fix llvm#74466.

[AutoBump] Merge with f4b9839 (Sep 04)

2fff529

cferry-AMD approved these changes Sep 30, 2024
