[AutoBump] Merge with f4b9839d (Sep 04) (20) #373
base: bump_to_1293ab35
Commits on Aug 31, 2024
[Offload] Fix disabling of cuda target on unsupported platforms (llvm#106835)
The target name and the message are wrong -- both should say "cuda" for the filtering to work. Fixes commit 300e5b9 (llvm#93186).
Commit 75545b3
[Offload] Fix stray libomptarget message helper calls (llvm#106837)
In llvm#92581 the `LibomptargetUtils.cmake` helpers were removed, but only uses of `libomptarget_say` were migrated. Migrate the remaining few warning and error messages so the `check-offload` target does not fail due to a missing `libomptarget_warning_say`. While at it, update the `check-offload` unavailability message to say `check-offload` instead of `check-libomptarget`. Fixes llvm#92581
Commit 9adf811
[libcxx] Do not include `langinfo.h` when using the LLVM C library (llvm#106634)
Summary: The `langinfo.h` header is a POSIX extension, so ideally we would be able to build the C++ library without it. Currently the LLVM C library doesn't support / provide it. This allows us to build the C++ library with locales enabled. We can either disable it here, or just provide stubs that do nothing as in llvm#106620.
Commit 109bff1
[libcxx] Use the default rune table when using the LLVM libc (llvm#106632)
Summary: We currently do not provide a more complicated rune table, so we want the default.
Commit 38dbcbd
[TTI] Add cost model support for [u|s]cmp (llvm#106824)
This patch adds cost model support for [u|s]cmp.
Commit 140e80a
[RISCV] Fix -Wunused-variable in RISCVISelLowering.cpp (NFC)
/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21558:14: error: unused variable 'ValLMUL' [-Werror,-Wunused-variable]
    unsigned ValLMUL =
             ^
/llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21561:14: error: unused variable 'PartLMUL' [-Werror,-Wunused-variable]
    unsigned PartLMUL =
             ^
2 errors generated.
Commit 1061c6d
Commit 4514c38
[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands.
Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: llvm#106449
Commit a3ea90f
[HLSL] AST support for WaveSize attribute. (llvm#101240)
First step toward supporting the WaveSize attribute described in https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html and https://microsoft.github.io/hlsl-specs/proposals/0013-wave-size-range.html
A new attribute, HLSLWaveSizeAttr, is supported in the AST. Both the wave size and the wave size range are implemented together, since doing them separately might require more work.
For llvm#70118
Commit e41579a
[HLSL] Implement output parameter (llvm#101083)
HLSL output parameters are denoted with the `inout` and `out` keywords in the function declaration. When an argument to an output parameter is constructed, a temporary value is constructed for the argument.
For `inout` parameters the argument is initialized via copy-initialization from the argument lvalue expression to the parameter type. For `out` parameters the argument is not initialized before the call.
In both cases on return of the function the temporary value is written back to the argument lvalue expression through an implicit assignment binary operator with casting as required.
This change introduces a new HLSLOutArgExpr ast node which represents the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.
Fixes llvm#87526
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
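The copy-in/copy-out behaviour described in this commit message can be sketched in plain C++. This only illustrates the semantics and is not Clang's implementation: the helper names `callInOut` and `callOut` are hypothetical, and the `out` temporary is value-initialized here purely to keep the C++ well-defined (the commit message says HLSL does not initialize it before the call).

```cpp
#include <utility>

// Hypothetical model of an HLSL `inout` argument: a temporary of the
// parameter type is copy-initialized from the argument lvalue, handed to
// the callee, and written back (with a cast) when the callee returns.
template <typename Param, typename Arg, typename Fn>
void callInOut(Arg &ArgLValue, Fn &&Callee) {
  Param Tmp = static_cast<Param>(ArgLValue); // copy-in
  std::forward<Fn>(Callee)(Tmp);
  ArgLValue = static_cast<Arg>(Tmp);         // write-back
}

// Hypothetical model of an HLSL `out` argument: the temporary is not
// initialized from the argument. It is value-initialized in this sketch
// only so that the C++ stays well-defined.
template <typename Param, typename Arg, typename Fn>
void callOut(Arg &ArgLValue, Fn &&Callee) {
  Param Tmp{};
  std::forward<Fn>(Callee)(Tmp);
  ArgLValue = static_cast<Arg>(Tmp);         // write-back
}
```

For instance, calling `callInOut<int>(X, F)` with a `float X` holding `1.5f` passes the callee a temporary `int` holding `1`, and whatever the callee stores there is converted back to `float` afterwards.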
Commit 89fb849
[X86] Fix livein handling in emitStackProbeInlineWindowsCoreCLR64. (llvm#106828)
Stop adding liveins for virtual registers. In the livein interface, the register goes through a MCPhysReg which is uint16_t. This causes the virtual register bit to be dropped making it alias to some nonsense physical register.
Recompute the liveins for the continue block to handle any live registers that are needed by instructions that were spliced from the original block.
This fixes the machine verifier error, so we can remove that FIXME now.
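The truncation problem mentioned in this message can be shown with a few lines of standalone C++. This is a sketch under stated assumptions, not LLVM code: `VirtualRegFlag`, `PhysReg16`, and the helper functions are hypothetical stand-ins, with the virtual-register marker assumed to be the top bit of a 32-bit id.

```cpp
#include <cstdint>

// Assumption: a virtual register is a 32-bit id with the top bit set.
constexpr std::uint32_t VirtualRegFlag = 1u << 31;

// Stand-in for a 16-bit physical register type such as MCPhysReg.
using PhysReg16 = std::uint16_t;

constexpr std::uint32_t makeVirtualReg(std::uint32_t Index) {
  return Index | VirtualRegFlag;
}

constexpr bool isVirtual(std::uint32_t Reg) {
  return (Reg & VirtualRegFlag) != 0;
}

// Narrowing to 16 bits silently discards the flag bit, so the result
// looks like an unrelated physical register number.
constexpr PhysReg16 truncateToPhysReg(std::uint32_t Reg) {
  return static_cast<PhysReg16>(Reg);
}

static_assert(isVirtual(makeVirtualReg(5)), "flag is set before narrowing");
static_assert(truncateToPhysReg(makeVirtualReg(5)) == 5,
              "flag is lost after narrowing");
```

Under these assumptions, virtual register index 5 ends up indistinguishable from physical register number 5 once it has passed through the 16-bit type.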
Commit 8638fe1
Commit 6d9c6f0
[Transforms][IPO] Add remarks for ArgumentPromotion and DeadArgumentElimination (llvm#105740)
The ArgumentPromotion and DeadArgumentElimination passes may change a function's signature. This makes bpf tracing difficult, since users are either unaware of the signature change or need to poke into IR or assembly to understand it.
This patch emits remarks so that, when recompiling with -foptimization-record-file=<file>, users can check the remarks to see what kind of signature change was made to a particular function.
The following are some examples of the implemented remarks:
```
Pass: deadargelim
Name: ReturnValueRemoved
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.c', Line: 572, Column: 0 }
Function: mptcp_check_data_fin
Args:
  - String: 'removing return value '
  - String: '0'

Pass: deadargelim
Name: ArgumentRemoved
DebugLoc: { File: 'bpf-next/kernel/bpf/syscall.c', Line: 1670, Column: 0 }
Function: map_delete_elem
Args:
  - String: 'eliminating argument '
  - ArgName: uattr.coerce0
  - String: '('
  - ArgIndex: '1'
  - String: ')'

Pass: argpromotion
Name: ArgumentPromoted
DebugLoc: { File: 'bpf-next/net/mptcp/protocol.h', Line: 570, Column: 0 }
Function: mptcp_subflow_ctx
Args:
  - String: 'promoting argument '
  - ArgName: sk
  - String: '('
  - ArgIndex: '0'
  - String: ')'
  - String: ' to pass by value'
```
[1] llvm#104678
Commit 470f55f
Commit 2afa975
[clang] function template non-call partial ordering fixes (llvm#106829)
This applies to function template non-call partial ordering the same provisional wording change applied in the call context: Don't perform the consistency check on return type and parameters which didn't have any template parameters deduced from. Fixes regression introduced in llvm#100692, which was reported on the PR.
Commit cfe331b
docs: Clarify commit access requirements in the Developer Policy (llvm#101414)
We have been discussing changes to our commit access policies recently, and based on some feedback from clattner here: https://discourse.llvm.org/t/rfc-new-criteria-for-commit-access/76290/81 we need to update our Developer Policy so that it matches what we are actually doing in this project. We currently grant commit access to anyone with a valid justification, not just contributors who have submitted high-quality patches in the past.
---------
Co-authored-by: Shilei Tian <i@tianshilei.me>
Commit ec58817
Commit 37e109c
[SelectionDAGISel] Use MCRegister and Register for LiveInMap. NFC
This matches the MachineBasicBlock liveins used to populate it.
Commit a3e2936
[DXIL][Analysis] Collect Function properties in Metadata Analysis (llvm#105728)
Basic infrastructure to collect Function properties in Metadata Analysis
- Add a `SmallVector` of entry properties to the metadata information.
- Add a structure to represent function properties. Currently `numthreads` and shader kind properties of shader entry functions are represented.
Commit 8aa8c05
Commits on Sep 1, 2024
Commit 84580a0
[InstCombine] Replace all dominated uses of condition with constants (llvm#105510)
This patch replaces all dominated uses of condition with true/false to improve context-sensitive optimizations. It eliminates a bunch of branches in llvm-opt-benchmark.
As a side effect, it may introduce new phi nodes in some corner cases. See the following case:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
  ret i1 %res
}
```
It will be simplified into:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br i1 %cmp, label %if.then, label %if.else
if.then:
  br label %bb2
if.else:
  br label %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
  ret i1 %res
}
```
I am planning to fix this in late pipeline/CGP since this problem exists before the patch.
Commit 380fa87
Commit 4f4bd41
Commit 6f682c2
[clang][bytecode] Fix diagnosing reads from temporaries (llvm#106868)
Fix the DeclID not being set in global temporaries and use the same strategy for deciding if a temporary is readable as the current interpreter.
Commit e4f3b56
[X86] Remove X86RegisterInfo::getSEHRegNum. (llvm#106866)
As far as I can tell, there's no way to call this. There are no calls in the X86 directory. It has the same name as a function in MCRegisterInfo, but that function takes a MCRegister and isn't virtual. The function in MCRegisterInfo uses a DenseMap populated by `X86_MC::initLLVMToSEHAndCVRegMapping`. The DenseMap is populated for every physical register using the encoding value. I think that means the function in MCRegisterInfo would return the same value as the function in X86RegisterInfo.
Commit feb391c
[RISCV] Custom legalize f16/bf16 FNEG/FABS with Zfhmin/Zbfmin. (llvm#106886)
The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes.
Commit 3bdec31
Commit 840d4d9
[CMake][Support] Use /nologo when compiling BLAKE3 assembly sources on Windows (llvm#106794)
Suppresses the copyright banner for `ml64` compiling BLAKE3 assembly sources with MSVC and Ninja on Windows:
```
[157/3758] Building ASM_MASM object lib\Support\BLAKE3\CMa...upportBlake3.dir\blake3_avx512_x86-64_windows_msvc.asm.obj
Microsoft (R) Macro Assembler (x64) Version 14.41.34120.0
Copyright (C) Microsoft Corporation. All rights reserved.
Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
is now just:
```
Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
```
We can suppress that last line with `/quiet` in more recent versions of `ml64` (from MSVC 2022 17.6) but it is not supported by all potential MASM compilers.
Commit bec1d86
[Clang][NFC] Don't manually enumerate the PredefinedDeclIDs (llvm#106891)
Commit 4fef204
[SLP] Fix crash of shuffle poison (llvm#106857)
When the shuffle masks are `PoisonMaskElem`, there is no need to check the cost of `SK_ExtractSubvector`. It is free. Otherwise, it will cause the compiler to crash:
Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) && "SK_ExtractSubvector index out of range"' failed.
Commit 24a043a
Commit 7c4cffd
Revert "[AMDGPU][LTO] Assume closed world after linking (llvm#105845)" (llvm#106889)
We can't assume closed world even in full LTO post-link stage. It is only true if we are building a "GPU executable". However, AMDGPU does support "dynamic library". I'm not aware of any approach to tell if it is a relocatable link when we create the pass. For now let's revert the patch as it is currently breaking things. We can re-enable it once we can handle it correctly.
Commit 84ed3c2
Commit 57ef16c
Commit 803ab28
[SLP]Fix PR106909: add a check for unsafe FP operations.
NEON has non-IEEE compliant denormal flushing and the compiler should check if it is safe to vectorize instructions for NEON in non-fast math mode. Fixes llvm#106909
Commit 6e68fa9
Commit 7b2fe84
[SDAG] Expand vector [u|s]cmp in VectorLegalizer (llvm#106883)
Address comment llvm#106747 (comment).
Commit affc0c6
[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). (llvm#106047)
Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments. PR: llvm#106731
Commit 9ccf825
[LTO] Reduce memory usage for import lists (llvm#106772)
This patch reduces the memory usage for import lists by employing memory-efficient data structures.
With this patch, an import list for a given destination module is basically DenseSet<uint32_t> with each element indexing into the deduplication table containing tuples of:
{SourceModule, GUID, Definition/Declaration}
In one of our large applications, the peak memory usage goes down by 9.2% from 6.120GB to 5.555GB during the LTO indexing step.
This patch addresses several sources of space inefficiency associated with std::unordered_map:
- std::unordered_map<GUID, ImportKind> takes up 16 bytes because of padding even though ImportKind only carries one bit of information.
- std::unordered_map uses pointers to elements, both in the hash table proper and for collision chains.
- We allocate an instance of std::unordered_map for each {Destination Module, Source Module} pair for which we have at least one import. Most import lists have less than 10 imports, so the metadata like the size of std::unordered_map and the pointer to the hash table costs a lot relative to the actual contents.
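The indexing scheme described in this message (a shared deduplication table plus per-destination sets of 32-bit indices) can be sketched with standard containers. This is only an illustration of the idea, not the LLVM data structure: `ImportTable`, `ImportEntry`, `ImportList`, and `intern` are hypothetical names, and `std::unordered_set` and `std::map` stand in for LLVM's own containers, so the sketch does not reproduce the memory savings.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <unordered_set>
#include <vector>

enum class ImportKind : std::uint8_t { Definition, Declaration };

// One entry of the deduplication table: {SourceModule, GUID, kind}.
using ImportEntry = std::tuple<std::string, std::uint64_t, ImportKind>;

// Hypothetical deduplication table shared by all destination modules.
class ImportTable {
public:
  // Returns the index of E, adding it to the table on first use.
  std::uint32_t intern(const ImportEntry &E) {
    auto [It, Inserted] =
        Ids.try_emplace(E, static_cast<std::uint32_t>(Entries.size()));
    if (Inserted)
      Entries.push_back(E);
    return It->second;
  }

  const ImportEntry &lookup(std::uint32_t Id) const { return Entries[Id]; }
  std::size_t size() const { return Entries.size(); }

private:
  std::map<ImportEntry, std::uint32_t> Ids; // entry -> index
  std::vector<ImportEntry> Entries;         // index -> entry
};

// A per-destination import list is then just a set of table indices.
using ImportList = std::unordered_set<std::uint32_t>;
```

Each destination module keeps only 4-byte indices, and the full tuple is stored once in the shared table no matter how many import lists refer to it.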
Commit 5c0d61e
Commit 984fca5
[RISCV] Add test for llvm.round.i32.f16 RV64+Zfhmin/Zhinxmin. NFC
We have special handling for this in type legalization, but we didn't have a test.
Commit 5aa83eb
[LV] Don't consider branches leaving loop in collectValuesToIgnore.
Branches exiting the loop will remain regardless, so don't consider them in collectValuesToIgnore. This fixes another divergence between legacy and VPlan-based cost model. Fixes llvm#106780.
Commit 654bb4e
[AArch64] Add tests for fused FP literals. NFC (llvm#106731)
This is for an upcoming change to the threshold on Apple targets for using a constant pool for FP literals versus building them with integer moves. This file is based on literal_pools_float.ll. I tried to bolt on to the existing test, but it got messy as that file is already testing a matrix of combinations, so creating this new file instead.
Commit 747d89a
[RISCV] Correct the rounding mode for llvm.lround.i64.f32 with RV64+Zfinx.
We should use RMM instead of DYN.
Commit 776aef1
[RISCV] Custom promote f16 (l)lround/(l)lrint with Zfhmin/Zhinxmin instead of using isel patterns.
Commit 357bd61
Commits on Sep 2, 2024
[lld][ELF] Add `-plugin-opt=time-trace=` as an alias of `--time-trace=` (llvm#106803)
Time trace profiler support was added into LLVMgold in cd3255a. This patch adds its `-plugin-opt` counterpart, which is just an alias to `--time-trace=`, into LLD for compatibility.
Commit 5fe852e
[RISCV][TTI] Scale the cost of FP-Int conversion with LMUL (llvm#87506)
Widening/narrowing the source data type to match the destination data type may require multiple steps. To model the costs, the patch generated the interim type by following the logic in RISCVTargetLowering::lowerVPFPIntConvOp.
Commit 837ee5b
[LoongArch] Remove unnecessary increment operations
`HighMask` is the value that sets bits from `Msb+1` to 63 to 1, while the other bits are set to 0.
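The mask described in this message can be written as a small standalone function. This is a sketch of the arithmetic only, not the LoongArch backend code; `highMask` is a hypothetical name, and the `Msb >= 63` case is handled separately because shifting a 64-bit value by 64 is undefined in C++.

```cpp
#include <cstdint>

// Bits Msb+1 .. 63 are set to 1, all lower bits are 0.
constexpr std::uint64_t highMask(unsigned Msb) {
  return Msb >= 63 ? 0 : ~std::uint64_t{0} << (Msb + 1);
}

static_assert(highMask(31) == 0xFFFFFFFF00000000ULL, "upper 32 bits set");
static_assert(highMask(62) == 0x8000000000000000ULL, "only bit 63 set");
static_assert(highMask(63) == 0, "no bits above bit 63");
```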
Commit 77523f9
[clang][AIX] Fix -print-runtime-dir on AIX (llvm#104806)
Currently the option prints a path to a nonexistent directory with the full triple, `lib/powerpc64-ibm-aix7.2.0.0`. It should only be `lib/aix`.
Commit 27e244f
[RISCV] Move VLDSX0Pred from RISCVSchedSiFive7.td to RISCVScheduleV.td. NFC (llvm#106671)
This predicate isn't bound to the scheduler model and we may want to reuse it in the future. We already moved it to reuse it in our downstream.
Commit c74cc73
Commit 647f892
[Clang][Concepts] Correct the CurContext for friend declarations (llvm#106890)
`FindInstantiatedDecl()` relies on the `CurContext` to find the corresponding class template instantiation for a class template declaration. Previously, we pushed the semantic declaration context for constraint comparison, which is incorrect for constraints on friend declarations.
In issue llvm#78101, the semantic context of the friend is the TU, so we missed the implicit template specialization `Template<void, 4>` when looking for the instantiation of the primary template `Template` at the time of checking the member instantiation; instead, we mistakenly picked up the explicit specialization `Template<float, 5>`, hence the error.
As a bonus, this also fixes a crash when diagnosing constraints. The DeclarationName is not necessarily an identifier, so it's incorrect to call `getName()` on e.g. overloaded operators. Since the DiagnosticBuilder has correctly handled Decl printing, we don't need to find the printable name ourselves.
Fixes llvm#78101
Commit 358165d
Commit da13754
[lldb] Better matching of types in anonymous namespaces (llvm#102111)
This patch extends TypeQuery matching to support anonymous namespaces. A new flag is added to control the behavior.
In the "strict" mode, the query must match the type exactly -- all anonymous namespaces included. The dynamic type resolver in the itanium abi (the motivating use case for this) uses this flag, as it queries using the name from the demangler, which includes anonymous namespaces. This ensures we don't confuse a type with a same-named type in an anonymous namespace. However, this does *not* ensure we don't confuse two types in anonymous namespaces (in different CUs). To resolve this, we would need to use a completely different lookup algorithm, which probably also requires a DWARF extension.
In the "lax" mode (the default), the anonymous namespaces in the query are optional, and this allows one to search for the type using the usual language rules (`::A` matches `::(anonymous namespace)::A`).
This patch also changes the type context computation algorithm in DWARFDIE, so that it includes anonymous namespace information. This causes a slight change in behavior: the algorithm previously stopped computing the context after encountering an anonymous namespace, which caused the outer namespaces to be ignored. This meant that a type like `NS::(anonymous namespace)::A` would be (incorrectly) recognized as `::A`. This can cause code depending on the old behavior to misbehave. The fix is to specify all the enclosing namespaces in the query, or use a non-exact match.
Commit dd5d730
Commit d2ce9dc
[InstCombine] Make backedge check in op of phi transform more precise (llvm#106075)
The op of phi transform wants to prevent moving an operation across a backedge, as this may lead to an infinite combine loop.
Currently, this is done using isPotentiallyReachable(). The problem with that is that all blocks inside a loop are reachable from each other. This means that the op of phi transform is effectively completely disabled for code inside loops, even when it's not actually operating on a loop phi (just a phi that happens to be in a loop).
Fix this by explicitly computing the backedges inside the function instead. Do this via RPOT, which is a bit more efficient than using FindFunctionBackedges() (which does it without any pre-computed analyses).
For irreducible cycles, there may be multiple possible choices of backedge, and this just picks one of them. This is still sufficient to prevent combine loops.
This also removes the last use of LoopInfo in InstCombine -- I'll drop the analysis in a followup.
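Computing backedges from a reverse post-order traversal, as this message describes, can be sketched on a toy control-flow graph. This is a sketch under stated assumptions rather than the InstCombine code: the `CFG` type and `findBackedges` are hypothetical, blocks are plain integers, and block 0 is assumed to be the entry. An edge counts as a backedge when its target has already been visited at the point where its source is visited in reverse post-order.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <utility>
#include <vector>

// Hypothetical CFG: Succs[B] lists the successors of block B.
// Block 0 is assumed to be the entry block.
using CFG = std::vector<std::vector<int>>;

std::set<std::pair<int, int>> findBackedges(const CFG &Succs) {
  const std::size_t N = Succs.size();

  // Depth-first search from the entry to collect a post-order.
  std::vector<int> PostOrder;
  std::vector<bool> Seen(N, false);
  std::function<void(int)> DFS = [&](int B) {
    Seen[B] = true;
    for (int S : Succs[B])
      if (!Seen[S])
        DFS(S);
    PostOrder.push_back(B);
  };
  if (N > 0)
    DFS(0);

  // Walk the blocks in reverse post-order. Any edge whose target was
  // already visited (including a self-loop) is treated as a backedge.
  std::vector<bool> Visited(N, false);
  std::set<std::pair<int, int>> Backedges;
  for (auto It = PostOrder.rbegin(); It != PostOrder.rend(); ++It) {
    const int B = *It;
    Visited[B] = true;
    for (int S : Succs[B])
      if (Visited[S])
        Backedges.insert({B, S});
  }
  return Backedges;
}
```

For a graph with edges 0→1, 1→2, 2→1 and 2→3, only 2→1 is reported, so a phi in block 3 is not penalised merely because other blocks form a loop.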
Commit f044564
[RISCV] Remove zfbfmin.ll. NFC (llvm#106937)
Most of it is redundant with bfloat-convert.ll. One testcase is found in bfloat-imm.ll. The load and stores are more thoroughly tested in bfloat-mem.ll.
Commit c950ecb
[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (llvm#106877)
These would implicitly cast the register to `unsigned`. Switching most of them to use printReg gives more readable output. Change some others to use Register::id() so we can eventually remove the implicit cast to `unsigned`.
Commit cd3667d
[clang] Bump up DIAG_SIZE_SEMA by 500 for downstream diagnostics.
Recently added HLSL diagnostics (89fb849) pushed the Swift compiler over the existing limit. rdar://135126738
Commit 08a72cb
[TSan] fix crash when symbolize on darwin platforms (llvm#99441)
The `dli_sname` field in `Dl_info` may be `NULL`, which could cause a crash.
Commit fe1006b
Commit ed6d9f6
[CGP] Undo constant propagation of pointers across calls
It may be profitable to revert SCCP propagation of C++ static values, if such constants are pointers, in order to avoid redundant pointer computation, since the method returning the constant is non-removable.
Commit e4e0dfb
[APInt] Add default-disabled assertion to APInt constructor (llvm#106524)
If the uint64_t constructor is used, assert that the value is actually a signed or unsigned N-bit integer depending on whether the isSigned flag is set. Provide an implicitTrunc flag to restore the previous behavior, where the argument is silently truncated instead.
In this commit, implicitTrunc is enabled by default, which means that the new assertions are disabled and no actual change in behavior occurs. The plan is to flip the default once all places violating the assertion have been fixed. See llvm#80309 for the scope of the necessary changes.
The primary motivation for this change is to avoid incorrectly specified isSigned flags. A recurring problem we have is that people write something like `APInt(BW, -1)` and this works perfectly fine -- until the code path is hit with `BW > 64`. Most of our i128 specific miscompilations are caused by variants of this issue.
The cost of the change is that we have to specify the correct isSigned flag (and make sure there are no excess bits) for uses where BW is always <= 64 as well.
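The kind of range check this message describes can be sketched as a standalone function. This is a minimal illustration under stated assumptions and not the actual APInt code: `fitsInBits` is a hypothetical name, and the sketch only covers the case where the value is passed as a single `uint64_t`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the asserted condition: is Val, passed as a
// uint64_t, a valid BitWidth-bit integer of the requested signedness?
constexpr bool fitsInBits(unsigned BitWidth, std::uint64_t Val,
                          bool IsSigned) {
  if (BitWidth >= 64)
    return true; // every 64-bit pattern is representable
  if (BitWidth == 0)
    return Val == 0;
  if (!IsSigned)
    return (Val >> BitWidth) == 0; // no excess high bits
  const std::int64_t S = static_cast<std::int64_t>(Val);
  const std::int64_t Min = -(std::int64_t{1} << (BitWidth - 1));
  const std::int64_t Max = (std::int64_t{1} << (BitWidth - 1)) - 1;
  return S >= Min && S <= Max;
}

// -1 converted to uint64_t is all ones: that is not an unsigned 8-bit
// value, but it is a valid signed 8-bit value.
static_assert(!fitsInBits(8, static_cast<std::uint64_t>(-1), false), "");
static_assert(fitsInBits(8, static_cast<std::uint64_t>(-1), true), "");
```

A check of this shape rejects an `APInt(BW, -1)`-style call without the signed flag already at small widths, which is how the mistake would surface before a `BW > 64` code path is ever exercised.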
Commit 30cc198
[ARM] Fix failure to register-allocate CMP_SWAP_64 pseudo-inst (llvm#106721)
This test case was failing to compile with a "ran out of registers during register allocation" error at -O0. This was because CMP_SWAP_64 has 3 operands which must be an even-odd register pair, and two other GPR operands. All of the def operands are also early-clobber, so registers can't be shared between uses and defs.
Because the function has an over-aligned alloca it needs frame and base pointers, so r6 and r11 are both reserved. That leaves r0/r1, r2/r3, r4/r5 and r8/r9 as the only valid register pairs, and if the two individual GPR operands happen to get allocated to registers in different pairs then only 2 pairs will be available for the three GPRPair operands.
To fix this, I've merged the two GPR operands into a single GPRPair operand. This means that the instruction now has 4 GPRPair operands, which can always be allocated without relying on luck. This does constrain register allocation a bit more, but this pseudo instruction is only used at -O0, so I don't think that's a problem.
Commit 9cf6867
[CGP] Regenerate `revert-constant-ptr-propagation-on-calls.ll` test (NFC)
Multiple buildbots were previously failing.
Commit d79c4c1
Commit 5bd3ee0
[InstCombine] Remove optional LoopInfo dependency
llvm#106075 has removed the last dependency on LoopInfo in InstCombine, so don't fetch the analysis anymore and remove the use-loop-info pass option.
Commit 34b10e1
-
Commit 0fa78b6
-
[SLP] Add vectorization support for [u|s]cmp (llvm#106747)
This patch adds vectorization support for [u|s]cmp intrinsic calls.
Commit a156b5a
-
[RuntimeDyld][Windows] Allocate space for dllimport things. (llvm#102586)
We weren't taking account of the space we require in the stubs for things that are dllimported, and as a result we could hit the assertion failure for running out of stub space. Fix that.
rdar://133473673
---------
Co-authored-by: Saleem Abdulrasool <compnerd@compnerd.org>
Co-authored-by: Lang Hames <lhames@gmail.com>
Co-authored-by: Ben Barham <b.n.barham@gmail.com>
Commit a0a2531
-
[flang][runtime] long double isn't always f80 (llvm#106746)
f80 is only a thing on x86, and even then the size of long double can be changed with compiler flags. Instead set the size according to the host system (this is what is already done for integer types).
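As an illustration of why the size cannot be hard-coded, the following standalone sketch inspects the host's `long double` through `std::numeric_limits`. It is not the flang runtime code; `describeLongDouble` is a hypothetical helper, and the mapping from significand width to format name covers only the common cases.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Names the host's long double format based on its significand width.
inline std::string describeLongDouble() {
  constexpr int digits = std::numeric_limits<long double>::digits;
  // The C++ standard requires long double to be at least as precise as double.
  assert(digits >= std::numeric_limits<double>::digits);
  switch (digits) {
  case 53:
    return "binary64 (same as double)";
  case 64:
    return "x87 80-bit extended";
  case 106:
    return "double-double";
  case 113:
    return "binary128";
  default:
    return "other";
  }
}
```

On a typical x86-64 Linux host this is expected to report the x87 format, while other targets or compiler flags can select a different case, which is why sizing by the host system is the safer choice.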
Commit cde3838
-
[clang] The ms-extension __noop should return zero in a constexpr context. (llvm#106849)
Fixes llvm#106713.
Commit eaea4d1
-
Revert "[RuntimeDyld][Windows] Allocate space for dllimport things." (llvm#106954)
Looks like I missed an `override` (maybe that warning was enabled recently?). Will revert and fix.
Reverts llvm#102586
Commit 87d9048
-
[SCCP] Infer return attributes in SCCP as well (llvm#106732)
We can infer the range/nonnull attributes in non-interprocedural SCCP as well. The results may be better after the function has been simplified.
Commit 24fe1d4
-
[llvm][Support] Adjust maximum thread name length to the right value for OpenBSD (llvm#106956)
The thread name length is derived from _MAXCOMLEN which is 24.
Commit d710011
-
[BasicAA] Track nuw through decomposed expressions (llvm#106512)
When we decompose the GEP offset expression, and the arithmetic is not performed using nuw operations, we cannot retain the nuw flag on the decomposed GEP. For example, if we have `gep nuw p, (a-1)`, this is not at all the same as `gep nuw (gep nuw p, a), -1`. Fix this by tracking NUW through linear expression decomposition, similarly to what we already do for the NSW flag. This fixes the miscompilation reported in llvm#105496 (comment).
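A small numeric model may help show why the two forms differ. This sketch treats pointers as 64-bit integers and models a `nuw` add as "no unsigned wrap, otherwise poison" (represented here by an empty `std::optional`). The helper names are hypothetical and this is not BasicAA code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Models an unsigned add with the nuw flag: an empty result stands for poison.
inline std::optional<uint64_t> addNuw(uint64_t base, uint64_t offset) {
  const uint64_t sum = base + offset; // wraps modulo 2^64
  if (sum < base)
    return std::nullopt; // unsigned wrap occurred
  return sum;
}

// gep nuw p, (a - 1): the subtraction happens first (without nuw),
// followed by a single nuw add.
inline std::optional<uint64_t> gepCombined(uint64_t p, uint64_t a) {
  return addNuw(p, a - 1);
}

// gep nuw (gep nuw p, a), -1: two nuw adds, where -1 is the unsigned
// value 2^64 - 1.
inline std::optional<uint64_t> gepDecomposed(uint64_t p, uint64_t a) {
  const std::optional<uint64_t> inner = addNuw(p, a);
  if (!inner)
    return std::nullopt;
  return addNuw(*inner, ~uint64_t{0});
}

inline void checkExample() {
  // With a == 1 the combined form adds 0 and is fine...
  assert(gepCombined(0x1000, 1) == uint64_t{0x1000});
  // ...but the decomposed form adds 2^64 - 1 with nuw, which wraps.
  assert(!gepDecomposed(0x1000, 1).has_value());
}
```

In this model the decomposed form is poison for an input where the original is well defined, so keeping `nuw` on the decomposed GEP would claim more than the original expression guarantees.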
Commit b9bba6c
-
[mlir][ArmSME] Rename slice move operations to insert/extract_tile_slice (llvm#106755)
This renames:
- `arm_sme.move_tile_slice_to_vector` to `arm_sme.extract_tile_slice`
- `arm_sme.move_vector_to_tile_slice` to `arm_sme.insert_tile_slice`

The new names are more consistent with the rest of MLIR and should be easier to understand. The current names (to me personally) are hard to parse and easy to mix up when skimming through code.

Additionally, the syntax for `insert_tile_slice` has changed from:
```mlir
%4 = arm_sme.insert_tile_slice %0, %1, %2 : vector<[16]xi8> into vector<[16]x[16]xi8>
```
To:
```mlir
%4 = arm_sme.insert_tile_slice %0, %1[%2] : vector<[16]xi8> into vector<[16]x[16]xi8>
```
This is for consistency with `extract_tile_slice`, but also helps with readability as it makes it clear which operand is the index.
Commit c425124
-
[llvm][Support] Add support for thread naming under DragonFly BSD and Solaris/illumos (llvm#106944)
Commit 1e65b76
-
Reapply "[MLIR][LLVM] Make DISubprogramAttr cyclic" (llvm#106571) with fixes (llvm#106947)
This reverts commit fa93be4, restoring commit d884b77, with fixes that ensure the CAPI declarations are exported properly.
This commit implements LLVM_DIRecursiveTypeAttrInterface for the DISubprogramAttr to ensure cyclic subprograms can be imported properly. In the process multiple shortcuts around the recently introduced DIImportedEntityAttr can be removed.
Commit 7519755
-
[AutoUpgrade] Preserve attributes when upgrading named struct return
For example, if the argument has an alignment attribute, preserve it.
Commit 5dcea46
-
[DebugInfo][RemoveDIs] Find types hidden in DbgRecords (llvm#106547)
When serialising to textual IR, there can be constant Values referred to by DbgRecords that don't appear anywhere else, and have types hidden even deeper inside them. Enumerate these when enumerating all types. Test by Mikael Holmén.
Commit 25f87f2
-
Commit f79722b
-
[X86] scmp/ucmp - add SSE42/AVX2/AVX512 test coverage to show current state of vector legalization/lowering
Commit f19dff1
-
Commit a9c71d3
-
Commit e90b219
-
[RuntimeDyld][Windows] Allocate space for dllimport things. (llvm#106958)
We weren't taking account of the space we require in the stubs for things that are dllimported, and as a result we could hit the assertion failure for running out of stub space. Fix that.
Also add a couple of `override` specifiers that were missing last time (llvm#102586).
rdar://133473673
Commit bdfd780
-
[Flang][Lower] Handle mangling of a generic name with a homonym specific procedure (llvm#106693)
This may happen when using modules.
Fixes llvm#93707
Commit 4ed9092
-
[clang][bytecode] Implement __noop (llvm#106714)
This does nothing and returns 0.
Commit f838d6b
-
[clang][bytecode] Fix zero-init of first union member (llvm#106962)
... if done via an ImplicitValueInitExpr. We were already doing this later in visitZeroRecordInitializer().
Commit a9006bf
-
Commit 224112f
-
Commit 60ed104
-
[mlir][EmitC] Remove restrictions on include op (llvm#106953)
An `emitc.include` should be usable even though the parent is not a ModuleOp. This requirement is therefore removed.
Commit 8b2ad5c
-
Revert "[compiler-rt][fuzzer] SetThreadName build fix for Mingwin attempt (llvm#106902)"
This reverts commit 7c4cffd. This commit broke compilation in environments that don't use winpthreads.
Commit b32dc67
-
[NFC] Fix dead links in TargetCXXABI.def (llvm#96348)
http://itanium-cxx-abi.github.io/cxx-abi/
> This website may be mirrored in many places, some of which may become stale. The current canonical location is:
> * http://itanium-cxx-abi.github.io/cxx-abi/

https://github.com/ARM-software/abi-aa
> This is the official place for the latest documents of the Application Binary Interface for the Arm® Architecture, both for source files and officially released documents.
Commit dc3f66a
-
[NFC][IR] Add CreateCountTrailingZeroElems helper (llvm#106711)
The LoopIdiomVectorize pass already creates calls to the intrinsic experimental_cttz_elts, but PR llvm#88385 will start calling this more too so I've created a helper for it.
Commit dc6c3ba
-
Commit 0c0bac9
-
[lldb/linux] Make truncated reads work (llvm#106532)
Previously, we were returning an error if we couldn't read the whole region. This doesn't matter most of the time, because lldb caches memory reads, and in that process it aligns them to cache line boundaries. As (LLDB) cache lines are smaller than pages, the reads are unlikely to cross page boundaries.
Nonetheless, this can cause a problem for large reads (which bypass the cache), where we're unable to read anything even if just a single byte of the memory is unreadable. This patch fixes lldb-server to return the partial result in that case, and also changes the linux implementation to reuse any partial results it got from the process_vm_readv call (to avoid having to re-read everything again using ptrace, only to find that it stopped at the same place).
This matches debugserver behavior. It is also consistent with the gdb remote protocol documentation, but -- notably -- not with actual gdbserver behavior (which returns errors instead of partial results). We filed a [clarification bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several years ago. Though we did not really reach a conclusion there, I think this is the most logical behavior.
The associated test does not currently pass on windows, because the windows memory read APIs don't support partial reads (I have a WIP patch to work around that).
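The intended read semantics can be sketched with a toy model. This is not lldb-server code: `FakeTarget` and `readPartial` are hypothetical, and "readable memory" is simulated as a byte vector whose end marks the first unreadable address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical target: addresses [0, bytes.size()) are readable, the rest are not.
struct FakeTarget {
  std::vector<uint8_t> bytes;
};

// Reads up to `size` bytes starting at `addr` and stops at the first
// unreadable byte. The caller treats an empty result as a failed read and a
// short result as a truncated (partial) read.
inline std::vector<uint8_t> readPartial(const FakeTarget &target, size_t addr,
                                        size_t size) {
  std::vector<uint8_t> out;
  for (size_t i = 0; i < size && addr + i < target.bytes.size(); ++i)
    out.push_back(target.bytes[addr + i]);
  return out;
}

inline void checkExample() {
  const FakeTarget target{{10, 11, 12, 13}};
  // The request crosses into unreadable memory: the two readable bytes come back.
  const std::vector<uint8_t> got = readPartial(target, 2, 8);
  assert(got.size() == 2 && got[0] == 12 && got[1] == 13);
  // Nothing readable at the start address: empty result.
  assert(readPartial(target, 4, 8).empty());
}
```

Under the earlier behaviour described above, the first call would have failed outright even though two bytes were available.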
Commit 181cc75
-
[VPlan] Use op from underlying call in computeCost if needed.
This fixes a divergence between the legacy and VPlan-based cost models, e.g. if one of the operands has a first-order recurrence phi as operand.
Commit b0de7fa
-
Win release packaging: Don't try to use rpmalloc for 32-bit x86 (llvm#106969)
because that doesn't work (results in `LINK : error LNK2001: unresolved external symbol malloc`). Based on the title of llvm#91862 it was only intended for use in 64-bit builds.
Commit ef26afc
-
[Analysis] Add getPredicatedExitCount to ScalarEvolution (llvm#105649)
Due to a reviewer request on PR llvm#88385 I have created this patch to add a getPredicatedExitCount function, which is similar to getExitCount except that it uses the predicated backedge taken information. With PR llvm#88385 we will start to care about more loops with multiple exits, and want the ability to query exit counts for a particular exiting block. Such loops may require predicates in order to be vectorised. New tests added here: Analysis/ScalarEvolution/predicated-exit-count.ll
Commit df3d70b
-
[AArch64] Lower partial add reduction to udot or svdot (llvm#101010)
This patch introduces lowering of the partial add reduction intrinsic to a udot or svdot for AArch64. This also involves adding a `shouldExpandPartialReductionIntrinsic` target hook, which AArch64 will return false from in the cases that it can be lowered.
Commit 44cfbef
-
[clangd] Update TidyFastChecks for release/19.x (llvm#106354)
Run for clang-tidy checks available in release/19.x branch.
Some notable findings:
- altera-id-dependent-backward-branch stays slow with 13%.
- misc-const-correctness became faster, going from 261% to 67%, but is still above the 8% threshold.
- misc-header-include-cycle is a new SLOW check with 10% runtime implications.
- readability-container-size-empty went from 16% to 13%, still SLOW.
Commit b47d7ce
-
[NFC][Support] Add FormatVariadic sub-test for validation (llvm#106578)
- Add validation subtest that tests assert failures in assert enabled builds, and that validation is disabled in assert disabled builds.
Commit ad30a05
-
[NFC][TableGen] Refactor `getIntrinsicFnAttributeSet` (llvm#106587)
Fix intrinsic function attributes to not generate attribute sets that are empty in `getIntrinsicFnAttributeSet`. Refactor the code to use helper functions to get effective memory effects for an intrinsic and to check if it has non-default attributes. This eliminates one case statement in `getIntrinsicFnAttributeSet` that we generate today for the case when intrinsic attributes are default ones. Also rename `Intrinsic` to `Int` to follow the naming convention used in this file and adjust emission code to not emit unnecessary empty line between cases generated.
Commit e5c7cde
-
Commit b6a4ab5
-
[clang] Add tests for CWG issues about friend declaration matching (l…
…lvm#106117) This patch covers CWG issues regarding declaration matching when `friend` declarations are involved: [CWG138](https://cplusplus.github.io/CWG/issues/138.html), [CWG386](https://cplusplus.github.io/CWG/issues/386.html), [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html), and [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html). Atypical for our CWG tests, the ones in this patch are quite extensively commented in-line, explaining the mechanics. PR description focuses on high-level concerns and references. [CWG138](https://cplusplus.github.io/CWG/issues/138.html) "Friend declaration name lookup" ----------- [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html): > [CWG138](https://cplusplus.github.io/CWG/issues/138.html) is resolved according to [N1229](http://wg21.link/n1229), except that using-directives that nominate nested namespaces are considered. I find it hard to pin down the scope of this issue, so I'm relying on three examples from the filing to define it. Because of that, it's also hard to pinpoint exact wording changes that resolve it. Relevant references are: [[dcl.meaning.general]/2](http://eel.is/c++draft/dcl.meaning#general-2), [[namespace.udecl]/10](https://eel.is/c++draft/namespace.udecl#10), [[dcl.type.elab]/3](https://eel.is/c++draft/dcl.type.elab#3), [[basic.lookup.elab]/1](https://eel.is/c++draft/basic.lookup.elab#1). 
[CWG386](https://cplusplus.github.io/CWG/issues/386.html) "Friend declaration of name brought in by _using-declaration_" ----------- [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html): > [CWG386](https://cplusplus.github.io/CWG/issues/386.html), [CWG1839](https://cplusplus.github.io/CWG/issues/1839.html), [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html), [CWG2058](https://cplusplus.github.io/CWG/issues/2058.html), [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html), and Richard’s observation in [“are non-type names ignored in a class-head-name or enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are resolved by describing the limited lookup that occurs for a declarator-id, including the changes in Richard’s [proposed resolution for CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html) (which also resolves CWG1818 and what of CWG2058 was not resolved along with CWG2059) and rejecting the example from [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html). Wording ([[dcl.meaning.general]/2](http://eel.is/c++draft/dcl.meaning#general-2)): > — If the [id-expression](http://eel.is/c++draft/expr.prim.id.general#nt:id-expression) E in the [declarator-id](http://eel.is/c++draft/dcl.decl.general#nt:declarator-id) of the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) is a [qualified-id](http://eel.is/c++draft/expr.prim.id.qual#nt:qualified-id) or a [template-id](http://eel.is/c++draft/temp.names#nt:template-id): > — [...] 
> — The [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) shall correspond to one or more declarations found by the lookup; they shall all have the same target scope, and the target scope of the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) is that scope[.](http://eel.is/c++draft/dcl.meaning#general-2.2.2.sentence-1) This issue focuses on interaction of `friend` declarations with template-id and qualified-id with using-declarations. The short answer is that terminal name in such declarations undergo lookup, and using-declarations do what they usually do helping that lookup. Target scope of such friend declaration is the target scope of lookup result, so no conflicts arise with the using-declarations. [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html) "Definition of a `friend` outside its namespace" ----------- [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html): > [...] and rejecting the example from [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html). Wording ([[dcl.meaning.general]/3.4](http://eel.is/c++draft/dcl.meaning#general-3.4)): > Otherwise, the terminal name of the [declarator-id](http://eel.is/c++draft/dcl.decl.general#nt:declarator-id) is not looked up[.](http://eel.is/c++draft/dcl.meaning#general-3.4.sentence-1) If it is a qualified name, the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) shall correspond to one or more declarations nominable in S; all the declarations shall have the same target scope and the target scope of the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) is that scope[.](http://eel.is/c++draft/dcl.meaning#general-3.4.sentence-2) This issue focuses on befriending a function in one scope, then defining it from other scope using qualified-id. Contrary to what P1787R6 says in prose, this example is accepted by the wording in that paper. 
In the wording quote above, note the absence of a statement like "terminal name of the declarator-id is not bound", contrary to similar statements made before that in [dcl.meaning.general] about friend declarations and template-ids. There's also a note in [basic.scope.scope] that supports the rejection, but it's considered incorrect and expected to be removed in the future. This is tracked in cplusplus/draft#7238. [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html) "Do `friend` declarations count as “previous declarations”?" ------------------ [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html): > [CWG386](https://cplusplus.github.io/CWG/issues/386.html), [CWG1839](https://cplusplus.github.io/CWG/issues/1839.html), [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html), [CWG2058](https://cplusplus.github.io/CWG/issues/2058.html), [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html), and Richard’s observation in [“are non-type names ignored in a class-head-name or enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are resolved by describing the limited lookup that occurs for a declarator-id, including the changes in Richard’s [proposed resolution for CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html) (which also resolves CWG1818 and what of CWG2058 was not resolved along with CWG2059) and rejecting the example from [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html). 
Wording ([[dcl.meaning.general]/2.3](http://eel.is/c++draft/dcl.meaning#general-2.3)): > The declaration's target scope is the innermost enclosing namespace scope; if the declaration is contained by a block scope, the declaration shall correspond to a reachable ([[module.reach]](http://eel.is/c++draft/module.reach)) declaration that inhabits the innermost block scope[.](http://eel.is/c++draft/dcl.meaning#general-2.3.sentence-2) Wording ([[basic.scope.scope]/7](http://eel.is/c++draft/basic.scope#scope-7)): > A declaration is [nominable](http://eel.is/c++draft/basic.scope#def:nominable) in a class, class template, or namespace E at a point P if it precedes P, it does not inhabit a block scope, and its target scope is the scope associated with E or, if E is a namespace, any element of the inline namespace set of E ([[namespace.def]](http://eel.is/c++draft/namespace.def))[.](http://eel.is/c++draft/basic.scope#scope-7.sentence-1) Wording ([[dcl.meaning.general]/3.4](http://eel.is/c++draft/dcl.meaning#general-3.4)): > If it is a qualified name, the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) shall correspond to one or more declarations nominable in S; [...] In the new wording it's clear that while `friend` declarations of functions do not bind names, declaration is still introduced, and is nominable, making it eligible for a later definition by qualified-id.
Commit 4a505e1
-
Commit 30d56be
-
[clang][Driver] Add a custom error option in multilib.yaml. (llvm#105684)
Sometimes a collection of multilibs has a gap in it, where a set of driver command-line options can't work with any of the available libraries.
For example, the Arm MVE extension requires special startup code (you need to initialize FPSCR.LTPSIZE), and also benefits greatly from -mfloat-abi=hard. So a multilib provider might build a library for systems without MVE, and another for MVE with -mfloat-abi=hard, anticipating that that's what most MVE users would want. But then if a user compiles for MVE _without_ -mfloat-abi=hard, they can't use either of those libraries: one has an ABI mismatch, and the other will fail to set up LTPSIZE.
In that situation, it's useful to include a multilib.yaml entry for the unworkable intermediate situation, and have it map to a fatal error message rather than a set of actual libraries. Then the user gets a build failure with a sensible explanation, instead of selecting an unworkable library and silently generating bad output. The new regression test demonstrates this case.
This patch introduces extra syntax into multilib.yaml, so that a record in the `Variants` list can omit the `Dir` key, and in its place, provide a `FatalError` key. Then, if that variant is selected, the error message is emitted as a clang diagnostic, and multilib selection fails.
In order to emit the error message in `MultilibSet::select`, I had to pass a `Driver &` to that function, which involved plumbing one through to every call site, and in the unit tests, constructing one specially.
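The selection behaviour described above can be modelled in a few lines. This is a simplified sketch, not clang's `MultilibSet::select`: the `Variant` and `Selection` types, the `selectVariants` function and the flag strings are all hypothetical, and matching is reduced to "every flag of the variant is present".

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a `Variants` record: it carries either a library
// directory (the `Dir` key) or an error message (the `FatalError` key).
struct Variant {
  std::optional<std::string> dir;
  std::optional<std::string> fatalError;
  std::set<std::string> flags; // flags that must all be present to match
};

struct Selection {
  std::vector<std::string> dirs;    // directories of matching variants
  std::optional<std::string> error; // set if an error variant matched
};

inline Selection selectVariants(const std::vector<Variant> &variants,
                                const std::set<std::string> &flags) {
  Selection result;
  for (const Variant &v : variants) {
    bool matches = true;
    for (const std::string &f : v.flags)
      if (flags.count(f) == 0) {
        matches = false;
        break;
      }
    if (!matches)
      continue;
    if (v.fatalError) { // selection fails with the provider's message
      result.dirs.clear();
      result.error = v.fatalError;
      return result;
    }
    if (v.dir)
      result.dirs.push_back(*v.dir);
  }
  return result;
}

// The MVE scenario from the message, with made-up flag and directory names.
inline std::vector<Variant> exampleVariants() {
  return {
      {std::string("lib/nomve"), std::nullopt, {"no-mve"}},
      {std::string("lib/mve-hard"), std::nullopt, {"mve", "hard-float"}},
      {std::nullopt, std::string("MVE needs -mfloat-abi=hard here"),
       {"mve", "soft-float"}},
  };
}
```

With these variants, selecting for `{"mve", "soft-float"}` yields the error message instead of a library directory, which mirrors the build failure with a sensible explanation that the message asks for.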
Commit 26bf0b4
-
[NFC] Update check lines of the test case `llvm/test/CodeGen/AMDGPU/remove-no-kernel-id-attribute.ll`
Commit f32f028
-
Commit cb949b7
-
[clang][AST][NFC] Make ASTContext::UnwrapSimilar{Array,}Types const (llvm#106992)
They don't mutate the context at all, so mark them const.
Commit 38ae53d
-
[RISCV] Remove RISCVISD::FP_EXTEND_BF16. (llvm#106939)
I don't think we need this node. We can isel fp_extend directly. fp_extend to f64 requires two instructions, but we can emit them with an isel pattern. I have not removed RISCVISD::FP_ROUND_BF16 because f64->bf16 needs more work to fix the double rounding.
Commit 55eb93b
-
Reland [AArch64][AsmParser] Directives should clear transitively implied features (llvm#106625) (llvm#106850)
Relands 2497739 addressing the buffer overflow caused when dereferencing an iterator past the end of ExtensionMap.
Commit a586b5a
-
[VPlan] Pass intrinsic inst to TTI in VPWidenCallRecipe::computeCost.
Follow-up to 9ccf825, adjust computeCost to also pass IntrinsicInst to TTI if available, as there are multiple places in TTI which use the IntrinsicInst. Fixes llvm#107016.
Commit 50a02e7
-
[VPlan] Simplify MUL operands at recipe construction.
This moves the logic to create simplified operands using SCEV to MUL recipe creation. This is needed to match the behavior of the legacy cost model. TODOs are to extend to other opcodes and move to a transform. Note that this also restricts the number of SCEV simplifications we apply to more precisely match the cases handled by the legacy cost model. Fixes llvm#107015.
Commit 954ed05
-
Commit ecc9aec
-
Commit 7e8aba2
-
[MIPS] Fix error messages when rejecting certain assembly not supported by ISA (llvm#94695)
… instructions.
This is a fix I stumbled upon while working on something else. I decided to break it out since it seems like a good "first issue" to submit. I updated the comments in the "wrong error" test files to indicate that the messages are no longer incorrect, but I left the names of the test files alone. I was not sure what to do with those, so I would appreciate thoughts or guidance.
Commit 0ba006d
-
[LegalizeVectorOps] Defer UnrollVectorOp in ExpandFNEG to caller. (llvm#106783)
Make ExpandFNEG return SDValue() when it doesn't expand. The caller already knows how to Unroll when Results is empty.
Commit 366ac8c
Commits on Sep 3, 2024
-
Apparently DragonFly BSD and Solaris/illumos call these APIs `pthread_get_name_np` / `pthread_set_name_np` (with an extra underscore) instead of `pthread_getname_np` / `pthread_setname_np`.
Commit b6597f5
-
[RISCV] Correct the scheduler class for FCVT_S_BF16. (llvm#107028)
Use FCvtF16ToF32 instead of FCvtF32ToF16.
Commit ba3c1ed
-
[LTO] Don't make unnecessary copies of ImportIDTable (llvm#106998)
Without this patch, {ImportMapTy,SortedImportList}::{begin,end} make unnecessary copies of ImportIDTable via `map_iterator(Imports.begin(), IDs);`. The second parameter, IDs, is passed by value, so we make a copy of MapVector inside ImportIDTable every time we call begin and end. These begin and end show up as time-consuming functions in the performance profile. This patch fixes the problem by passing IDs by reference with std::cref. While we are at it, this patch deletes the copy constructor and assignment operator. I cannot think of any legitimate reason to make a copy of the deduplication table.
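The by-value versus `std::cref` difference can be reproduced in isolation. In this sketch `Table`, `Mapper` and `makeMapper` are hypothetical stand-ins (for the ID table and for an adaptor that stores its function object by value); only `std::cref` is the real facility named in the message.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Stand-in for an expensive-to-copy lookup table; counts its copies.
struct Table {
  static inline int copies = 0;
  Table() = default;
  Table(const Table &) { ++copies; }
  int operator()(int key) const { return key * 2; }
};

// Adaptor that stores its function object by value, as iterator adaptors
// commonly do.
template <typename Fn> struct Mapper {
  Fn fn;
  int apply(int key) const { return fn(key); }
};

template <typename Fn> Mapper<Fn> makeMapper(Fn fn) {
  return Mapper<Fn>{std::move(fn)};
}

inline void checkExample() {
  Table table;

  Table::copies = 0;
  auto byValue = makeMapper(table); // copies the whole table
  assert(Table::copies > 0);

  Table::copies = 0;
  auto byRef = makeMapper(std::cref(table)); // copies only a reference wrapper
  assert(Table::copies == 0);

  assert(byValue.apply(3) == 6);
  assert(byRef.apply(3) == 6);
}
```

Deleting the copy constructor, as the patch does for the real table, would turn the accidental by-value call into a compile error instead of a hidden cost.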
Commit 9a1d14a
-
[RISCV] Rename test cases in bfloat-arith.ll and half-arith.ll. NFC
Use _bf16 or _h instead of _s. The _s was copied from float-arith.ll
Commit dc19b59
-
Revert "[C++20] [Modules] Embed all source files for C++20 Modules (llvm#102444)"
This reverts commit 2eeeff8. See the post commit discussion in llvm@2eeeff8
Commit 2cbd1bc
-
[clang][Sema] Fix diagnostic for function overloading in extern "C" (llvm#106033)
Fixes llvm#80235
When trying to overload a function within `extern "C"`, the diagnostic `functions that differ only in their return type cannot be overloaded` is given. This diagnostic is inappropriate because overloading is basically not allowed in the C language. However, if the redeclared function has the `((overloadable))` attribute, it should be diagnosed as `functions that differ only in their return type cannot be overloaded`.
This patch uses `isExternC()` to provide an appropriate diagnostic during the diagnostic process. `isExternC()` updates the linkage information cache internally, so calling it before merging functions can cause clang to crash. An example is declaring `static void foo()` and `void foo()` within an `extern "C"` block. Therefore, I decided to call `isExternC()` after the compilation error is confirmed and select the diagnostic message.
The diagnostic message is `conflicting types for 'func'` similar to the diagnostic in C, and `functions that differ only in their return type cannot be overloaded` if the `((overloadable))` attribute is given.
Regression tests verify that the expected diagnostics are given when trying to overload functions within `extern "C"` and when the `((overloadable))` attribute is present.
---------
Co-authored-by: Sirraide <aeternalmail@gmail.com>
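For reference, the kind of code that triggers the diagnostic looks roughly like the sketch below. The conflicting declaration is left commented out so that the snippet still compiles; the function name `func` follows the diagnostic text quoted above, and its body is made up for illustration.

```cpp
#include <cassert>

extern "C" {
int func(int x);

// Uncommenting the next declaration is an error: it differs from the one
// above only in its return type. Per the message above, clang is expected
// to report "conflicting types for 'func'" here after this patch.
// float func(int x);
}

// A definition so that the example links and can be called.
extern "C" int func(int x) { return x + 1; }

inline void checkExample() { assert(func(1) == 2); }
```

The same pair of declarations outside `extern "C"` would still be an invalid overload in C++, so the change affects which message is shown, not whether the code is accepted.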
Commit 78abeca
[RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (llvm#107039)
The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes, similar to what we did in llvm#106886 for FNEG and FABS. Special care is needed to handle the Sign operand being a different type.
Commit 9a1eded
Commit 0421049
Commit 8e5b43c
[lldb-dap][test] Fix: Typo in unresolved test (llvm#107030)
There is a typo in an assertion that causes the instruction break-point test to be unresolved
Commit 7d7d2d2
[MachinePipeliner] Make Recurrence MII More Accurate (llvm#105475)
Current RecMII calculation is bigger than it needs to be. The calculation was refined in this patch.
Commit 00c198b
[RISCV] Rename `vcix_state` register to `sf_vcix_state`. NFC (llvm#106995)
Since it's a SiFive VCIX-specific register, it's better to have a prefix so that it's more understandable.
Commit 7e6bad1
[compiler-rt] [docs] Mention Windows as one of the supported OSes (llvm#106874)
Compiler-rt can be built for Windows, and most parts of it work. Some parts only really work on x86/x86_64 (like address sanitizers), but the OS overall is supported.
Commit af5c18a
Commit 525ffd6
[lldb] Support partial memory reads on windows (llvm#106981)
ReadProcessMemory will not perform the read if part of the memory is unreadable, even though the API has a `number_of_bytes_read` argument. To make this work, I explicitly inspect the memory region being read and only read the accessible part.
Commit 04ed12c
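The partial-read idea in 04ed12c can be illustrated with a small model: before reading, work out how much of the requested range is contiguously readable and read only that prefix. This is a sketch under stated assumptions, not lldb's code. `Region`, `readable_span`, and `partial_read` are hypothetical names, and a `bytearray` stands in for process memory.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical stand-in for a memory region reported by the OS."""
    base: int
    size: int
    readable: bool


def readable_span(regions, addr, length):
    """Return how many bytes starting at addr are contiguously readable."""
    end = addr + length
    cursor = addr
    # Walk regions in address order; stop at the first hole or unreadable region.
    for region in sorted(regions, key=lambda r: r.base):
        region_end = region.base + region.size
        if region_end <= cursor:
            continue  # region lies entirely before the cursor
        if region.base > cursor or not region.readable:
            break  # gap or unreadable memory: the read has to stop here
        cursor = min(end, region_end)
        if cursor == end:
            break
    return cursor - addr


def partial_read(memory, regions, addr, length):
    """Read only the accessible prefix of the requested range."""
    n = readable_span(regions, addr, length)
    return bytes(memory[addr:addr + n])
```

A read that starts in readable memory and runs into an unreadable region returns the readable prefix instead of failing outright.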
[Analysis] getIntrinsicForCallSite - add vectorization support for acos/asin/atan and cosh/sinh/tanh libcalls (llvm#106844)
Followup to llvm#106584 - ensure acos/asin/atan and cosh/sinh/tanh libcalls correctly map to the llvm intrinsic equivalents.
Commit 6c8746b
[clang][bytecode] Print Pointers via APValue (llvm#107056)
Instead of doing this ourselves, just rely on printing the APValue.
Commit 733a92d
[bazel] Attempt to fix issue fetching remote blob
Bazel builds currently fail with `Failed to fetch blobs because they do not exist remotely.` These extra bazel flags hopefully fix it.
Commit a70d999
Commit 6c59dfb
Commit 851bacb
[flang][semantics][OpenMP] store DSA using ultimate sym (llvm#107002)
Previously we tracked data sharing attributes by the symbol itself not by the ultimate symbol. When the private clause came first, subsequent uses of the symbol found a host-associated version instead of the ultimate symbol and so the check didn't consider them to be the same symbol. Always adding and checking for the ultimate symbol ensures that we have the same behaviour no matter the order of clauses. The modified list is only used for this multiple clause check. Closes llvm#78235
Commit 4befe65
[X86] canCreateUndefOrPoisonForTargetNode - X86ISD::CMPP (CMPPS/D) nodes do not generate poison
Commit 377045e
Commit fe1a1ee
[Utils][SPIR-V] Adding spirv-sim to LLVM (llvm#104020)
Currently, the testing infrastructure for SPIR-V is based on FileCheck. Those tests are great for checking some level of codegen, but when a test needs to check both the CFG layout and the content of each basic block, things become messy.
- Because the CHECK/CHECK-DAG/CHECK-NEXT state is limited, it is sometimes hard to catch the right block: if 2 basic blocks have similar instructions, FileCheck can match the wrong one.
- Cross-lane interaction can be a bit difficult to understand, and writing a FileCheck test that is strong enough to catch bad CFG transforms while not being broken every time some unrelated codegen part changes is hard.
And lastly, the spirv-val tooling we have checks that the generated SPIR-V respects the spec, not that it is correct with regard to the source IR.
For those reasons, I believe the best way to test the structurizer is to:
- run spirv-val to make sure the CFG respects the spec.
- simulate the function to validate the result for each lane, making sure the generated code is correct.
This simulator has no dependencies other than core Python. It also only supports a very limited set of instructions, as we can test most features through control flow and some basic cross-lane interactions.
As-is, the added tests are just a harness for the simulator itself. If this gets merged, the structurizer PR will benefit from this as I'll be able to add extensive testing using this.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Commit c3d8124
[AMDGPU] Create dir for amdgpu specific machineverifier tests (llvm#106960)
Move the AMDGPU target-specific test cases in MachineVerifier into a separate new directory. Reference: llvm#105494 (comment)
Commit d24a2fd
[mlir][vector] Refactor vector-transfer-to-vector-load-store.mlir (NFC) (llvm#105509)
Overview of changes:
- All memref input arguments are renamed to %mem.
- All vector input arguments are renamed to %vec.
- All index input arguments are renamed to %idx.
- All tensor input arguments are renamed to %src/%dst.
- LIT variables were updated to be consistent with input arguments.
- Renamed all output arguments as %res.
- Removed unused argument in `transfer_write_broadcast_unit_dim`.
- Unified indentation of `FileCheck` commands.
- Split `transfer_write_permutations` and `transfer_write_broadcast_unit_dim` into tensor and memref variants.
- Renamed `transfer_write_permutations_tensor` as `transfer_write_permutations_tensor_masked`.
Commit 4d8903b
[LoopUnroll] Avoid undef values in test (NFC)
Avoid most of the code being optimized away as a result of optimization improvements.
Commit 52b8795
Revert "[Utils][SPIR-V] Adding spirv-sim to LLVM" (llvm#107084)
Reverts llvm#104020. Looks like it caused build failures.
Commit 8861328
[libc++][NFC] Canonicalize the benchmark suite a bit
This replaces `BENCHMARK_TEMPLATE` with `BENCHMARK` and uses `BENCHMARK_MAIN()` when possible.
Commit 5e19e31
Commit a5f03b4
[lldb/windows] Reset MainLoop events after handling them (llvm#107061)
This prevents the callback function from being called in a busy loop. Discovered by @slydiman on llvm#106955.
Commit 4353530
[lldb] Add a callback version of TCPSocket::Accept (llvm#106955)
The existing function already used the MainLoop class, which allows one to wait on multiple events at once. It needed to do this in order to wait for v4 and v6 connections simultaneously. However, since it was creating its own instance of MainLoop, this meant that it was impossible to multiplex these sockets with anything else. This patch simply adds a version of this function which uses an externally provided main loop instance, which allows the caller to add any events it deems necessary. The previous function becomes a very thin wrapper over the new one.
Commit 3d5e1ec
[AArch64][GlobalISel] Legalize 128-bit types for FABS (llvm#104753)
This patch adds a common lower action for `G_FABS`, which generates `and x8, x8, #0x7fffffffffffffff` to reset the sign bit. The action does not support vectors since `G_AND` does not support fp128. This approach is different than what SDAG is doing. SDAG stores the value onto stack, clears the sign bit in the most significant byte, and loads the value back into register. This involves multiple memory ops and sounds slower.
Commit 0748f42
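The sign-bit trick described for 0748f42 can be sketched outside the compiler: clearing the top bit with the mask `0x7fffffffffffffff` gives the absolute value. The example below applies the mask to a 64-bit double, and to a 128-bit value modelled as two 64-bit halves on the assumption that the sign lives in bit 63 of the high half. The function names are hypothetical and this is not the GlobalISel lowering itself.

```python
import struct

SIGN_CLEAR_MASK = 0x7FFFFFFFFFFFFFFF  # every bit set except bit 63


def fabs_f64(x: float) -> float:
    """Compute |x| by clearing the sign bit of the IEEE-754 double encoding."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<Q", bits & SIGN_CLEAR_MASK))[0]


def fabs_f128_halves(lo: int, hi: int):
    """Model an fp128 value as (low, high) 64-bit halves.

    Only the high half carries the sign bit, so a single AND suffices.
    """
    return lo, hi & SIGN_CLEAR_MASK
```

No memory traffic is involved, in contrast to the store, patch, and reload sequence the commit message attributes to SDAG.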
[analyzer] Fix false positive for stack-addr leak on simple param ptr (llvm#107003)
Assigning to a pointer parameter does not leak the stack address because it stays within the function and is not shared with the caller. The previous implementation reported any association of a pointer parameter with a local address, which is too broad. This fix enforces that the pointer to a stack variable is related by at least one level of indirection.
CPP-5642
Fixes llvm#106834
Commit aa4f81e
Commit f77f604
[clang][bytecode] Pass FPOptions to floating point ops (llvm#107063)
So we don't have to retrieve them from the InterpFrame, which is slow.
Commit 0f5f440
[SCCP] Avoid use of undef value in test (NFC)
Avoid optimizing away most of the code if we resolve this to a specific value.
Commit c80cabf
[Offload] Change x86_64-pc-linux to x86_64-unknown-linux (llvm#107023)
It appears that the RUNTIMES build prefers the x86_64-unknown-linux-gnu triple notation for the host. This fixes runtime / test breakages when compiler-rt is used as the CLANG_DEFAULT_RTLIB.
Commit 1a0cf24
[profile] Change __llvm_profile_counter_bias etc. types to match llvm (llvm#102747)
As detailed in Issue llvm#101667, two `profile` tests `FAIL` on 32-bit SPARC, both Linux/sparc64 and Solaris/sparcv9 (where the tests work when enabled):
```
Profile-sparc :: ContinuousSyncMode/runtime-counter-relocation.c
Profile-sparc :: ContinuousSyncMode/set-file-object.c
```
The Solaris linker provides the crucial clue as to what's wrong:
```
ld: warning: symbol '__llvm_profile_counter_bias' has differing sizes: (file runtime-counter-relocation-17ff25.o value=0x8; file libclang_rt.profile-sparc.a(InstrProfilingFile.c.o) value=0x4); runtime-counter-relocation-17ff25.o definition taken
```
In fact, the types in `llvm` and `compiler-rt` differ:
- `__llvm_profile_counter_bias`/`INSTR_PROF_PROFILE_COUNTER_BIAS_VAR` is created in `llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp` (`InstrLowerer::getCounterAddress`) as `int64_t`, while `compiler-rt/lib/profile/InstrProfilingFile.c` uses `intptr_t`. While this doesn't matter in the 64-bit case, the type sizes differ for 32-bit.
- `__llvm_profile_bitmap_bias`/`INSTR_PROF_PROFILE_BITMAP_BIAS_VAR` has the same issue: created in `InstrProfiling.cpp` (`InstrLowerer::getBitmapAddress`) as `int64_t`, while `InstrProfilingFile.c` again uses `intptr_t`.
This patch changes the `compiler-rt` types to match `llvm`. At the same time, the affected testcases are enabled on Solaris, too, where they now just `PASS`.
Tested on `sparc64-unknown-linux-gnu`, `sparcv9-sun-solaris2.11`, `x86_64-pc-linux-gnu`, and `amd64-pc-solaris2.11`.
Commit 70a19ad
[SLP]Fix PR107036: Check if the type of the user is sizable before requesting its size.
Only some instructions, not all, should be considered as potentially reducing the size of the operand types.
Fixes llvm#107036
Commit f381cd0
[SCCP] Explicitly mark gep as overdefined if ct eval fails
Don't just leave the result as unknown. I think this currently works out thanks to undef resolution, but the correct thing to do is set it to overdefined explicitly.
Commit 0797c18
[LV] Update call widening decision when scalarizing calls.
collectInstsToScalarize may decide to scalarize a call. If so, we have to update the widening decision for the call, otherwise the call won't be scalarized as expected during VPlan construction. This issue was uncovered by f82543d509.
Commit dd94537
[SLP]Check for the whole vector vectorization in unique scalars analysis
We need to check that a whole number of registers is being vectorized before actually trying to build the node, to avoid a compiler crash.
Commit b74e09c
Commit ce8ec31
[compiler-rt][rtsan] Record pc and bp higher up in the stack (llvm#107014)
Functionally, this change affects only our printed stack traces. The new version does not expose any of rtsan's inner workings.
Commit a424b79
[Vectorize] Fix -Wunused-variable in SLPVectorizer.cpp (NFC)
/llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10310:26: error: unused variable 'isExtractSubvectorMask' [-Werror,-Wunused-variable] bool isExtractSubvectorMask = ^ 1 error generated.
Commit 20fa37b
Commit d7c44ef
[BPF] Make -mcpu=v3 as the default (llvm#107008)
Before llvm20, (void)__sync_fetch_and_add(...) always generates locked xadd insns. In linux kernel upstream discussion [1], it is found that for arm64 architecture, the original semantics of (void)__sync_fetch_and_add(...), i.e., __atomic_fetch_add(...), is preferred in order for jit to emit proper native barrier insns. In llvm commits [2] and [3], (void)__sync_fetch_and_add(...) will generate the following insns: - for cpu v1/v2: locked xadd insns to keep backward compatibility - for cpu v3/v4: __atomic_fetch_add() insns To ensure proper barrier semantics for (void)__sync_fetch_and_add(...), cpu v3/v4 is recommended. This patch enables cpu=v3 as the default cpu version. For users wanting to use cpu v1, -mcpu=v1 needs to be explicitly added to clang/llc command line. [1] https://lore.kernel.org/bpf/ZqqiQQWRnz7H93Hc@google.com/T/#mb68d67bc8f39e35a0c3db52468b9de59b79f021f [2] llvm#101428 [3] llvm#106494
Commit 7852ebc
[clang][bytecode][NFC] Move Call ops into Interp.cpp (llvm#107104)
They are quite long and not templated.
Commit f70ccda
[GISEL][AArch64][NFC] Stop using wip_match_opcode for some opcodes (llvm#106702)
This patch moves to the new style of writing patterns for matching opcodes and thus deprecates using wip_match_opcode. It moves G_FCONSTANT, G_ICMP, G_STORE, and G_OR.
Commit df159d3
LICM: use IRBuilder in hoist BO assoc (llvm#106978)
Use IRBuilder when creating the new invariant instruction, so that the constant-folder has an opportunity to constant-fold the new Instruction that we desire to create.
Commit 05f5a91
[ThinLTO] Don't always print ModulesToCompile debugging information (llvm#106769)
Nothing went wrong in this case, we just successfully matched a module by identifier. No need to print to std::error like we would for something that should be user-visible.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Commit fedc755
Commit 3b6e255
[RISCV] Rename sf_vcix_state to sf.vcix_state. NFC (llvm#107115)
llvm#106995 named the vendor CSR incorrectly: the prefix should be `sf.` rather than `sf_`.
Commit b7017ef
Commit e1bde1c
[RISCV] Use RNE rounding mode for fcvt.s.bf16. Don't print the rounding mode if RNE. (llvm#106948)
The rounding mode has no effect on the instruction behavior. Using RNE matches what we do for fcvt.s.h, fcvt.d.f, fcvt.d.h, which are similarly not affected by the rounding mode. This appears to match the behavior of binutils. According to Compiler Explorer, objdump is unable to disassemble fcvt.s.bf16 with a non-zero rounding mode.
Commit 2a9f93b
[ADT] Deprecate DenseMap::getOrInsertDefault (llvm#107040)
This patch deprecates DenseMap::getOrInsertDefault in favor of DenseMap::operator[], which does the same thing, has been around longer, and is also a household name as part of std::map and std::unordered_map.
Note that DenseMap provides several equivalent ways to insert or default-construct a key-value pair:
- operator[Key]
- try_emplace(Key).first->second
- getOrInsertDefault(Key)
- FindAndConstruct(Key).second
Commit 59a3b41
[clang][ExtractAPI] Remove erroneous module name check in MacroCallbacks (llvm#107059)
rdar://135044923
Commit 86835d2
Commit 93857af
[libclc] More cross compilation fixes (llvm#97811)
* Move the setup_host_tool calls to the directories of their tool. Although it works to call it in libclc, it can only appear in a single location so it fails the "what if everyone did this?" test and causes problems for downstream code that also wants to use native versions of these tools from other projects.
* Correct the TARGET "${${tool}_target}" check. "${${tool}_target}" may be set to the path to the executable, which works in dependencies but cannot be tested using if(TARGET). For lack of a better alternative, just check that "${${tool}_target}" is non-empty and trust that if it is, it is set to a meaningful value. If somehow it turns out to be a valid target, its value will still show up in error messages anyway.
* Account for llvm-spirv possibly being provided in-tree. Per https://github.com/KhronosGroup/SPIRV-LLVM-Translator?tab=readme-ov-file#llvm-in-tree-build it is possible to drop llvm-spirv into LLVM and have it built as part of LLVM's build. In this configuration, cross builds of LLVM require a native version of llvm-spirv to be built.
Commit 903d1c6
LICM: extend hoist BO assoc to mul case (llvm#106991)
Trivially extend hoistBOAssociation to also handle the BinaryOperator Mul. Alive2 proofs: https://alive2.llvm.org/ce/z/zjtR5g
Commit f1ef67d
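For f1ef67d, the benefit of reassociating a multiply can be shown with a toy loop. The sketch assumes the transform has the shape `(x * c1) * c2` to `x * (c1 * c2)`, with `x` loop-variant and `c1`, `c2` loop-invariant; that shape is an assumption, since the commit message does not spell it out. The function names are hypothetical and the example uses plain integers.

```python
def loop_before(xs, c1, c2):
    """Two multiplies per iteration: (x * c1) * c2."""
    return [(x * c1) * c2 for x in xs]


def loop_after(xs, c1, c2):
    """Reassociated form: the invariant product is computed once, outside the loop."""
    hoisted = c1 * c2
    return [x * hoisted for x in xs]
```

Both versions produce the same integers because multiplication is associative; the second leaves one multiply in the loop body.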
[gn build] Add missing llvm-strings dependency to check-lld (llvm#106896)
This has been required by `lld/test/ELF/zsectionheader.s` since it was added in 5d972c5.
Commit 4da0aa3
[bazel] Change cache-silo-key to fix blob fetch issue.
Bazel builds currently fail with `Failed to fetch blobs because they do not exist remotely.` Set a cache-silo-key to start a new cache.
Commit df4746d
Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (llvm#106770)
This is a follow-up to 924907b, and is mostly motivated by consistency but does include one additional optimization. In general, we prefer 0.0 over -0.0 as the identity value for an fadd. We use that value in several places, but don't in others. So, let's be consistent and use the same identity (when nsz allows) everywhere.
This creates a bunch of test churn, but due to 924907b, most of that churn doesn't actually indicate a change in codegen. The exception is that this change enables the use of 0.0 for nsz, but *not* reassoc, fadd reductions. Or said differently, it allows the neutral value of an ordered fadd reduction to be 0.0.
Commit 2c7786e
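Why 2c7786e ties the choice of 0.0 to `nsz` can be checked with ordinary IEEE-754 arithmetic. The sketch below runs an in-order fadd reduction with each candidate start value; `fadd_reduce` and `is_negative_zero` are hypothetical helper names. Only the sign of a zero result differs, which is exactly what `nsz` permits ignoring.

```python
import math


def fadd_reduce(values, identity):
    """In-order (non-reassociated) floating-point add reduction."""
    acc = identity
    for v in values:
        acc = acc + v
    return acc


def is_negative_zero(x: float) -> bool:
    return x == 0.0 and math.copysign(1.0, x) < 0


# -0.0 is the exact identity: it preserves a -0.0 input.
exact = fadd_reduce([-0.0], -0.0)
# 0.0 loses the sign of zero: 0.0 + -0.0 rounds to +0.0.
relaxed = fadd_reduce([-0.0], 0.0)
```

For any input with a non-zero sum the two start values give the same result, so the difference is observable only through the sign of zero.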
[M68k] Fix compilation pipeline check
- After 'RemoveLoadsIntoFakeUses' is enabled to support llvm.fake.use
Commit 8e4b815
[clang][bytecode][NFC] Simplify builtin-functions.cpp (llvm#107118)
The effect is the same, but this version doesn't take as long to evaluate.
Commit 9626e84
[LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC]
These recurrence types don't have a meaningful identity, and the routine was abused to return the start value instead. Out of the three callers to this routine, only one actually wants this behavior. This is a prep change for removing the routine entirely and commoning it with other copies of the same logic.
Commit 0b2f253
[MLIR][AMDGPU] Add support for fp8 ops on gfx12 (llvm#106388)
This PR is adding support for `fp8` and `bfp8` on gfx12
Commit a8e1c6f
[SPIR-V] Improve correctness of emitted MIR between passes for branching instructions (llvm#106966)
This PR improves the correctness of emitted MIR between passes for branching instructions and thus increases the number of passing tests when expensive checks are on. Specifically, we address the following machine verifier issues:
* fix switch generation: generate correct successors and undo the "address taken" status to reflect that a successor doesn't actually correspond to an IR-level basic block;
* fix incorrect definition of OpBranch and OpBranchConditional in TableGen (SPIRVInstrInfo.td) to set isBarrier status properly and set a correct type of virtual registers;
* fix a case when Phi refers to a type definition that goes after the Phi instruction, so that the virtual register definition of the type doesn't dominate all uses.
This PR decreases the number of failing tests under expensive checks from 56 to 50.
Commit ebdadcf
[SPIR-V] Ensure that OpExtInst instructions generated by NonSemantic_Shader_DebugInfo_100 are not mixed up with other OpExtInst instructions (llvm#107007)
This PR is to ensure that OpExtInst instructions generated by NonSemantic_Shader_DebugInfo_100 are not mixed up with other OpExtInst instructions.
The original implementation (llvm#97558) introduced an issue by moving an OpExtInst instruction whose 3rd operand equals DebugSource (value 35) or DebugCompilationUnit (value 1) even if the OpExtInst was not generated by the NonSemantic_Shader_DebugInfo_100 implementation code.
The reproducer is attached as a new test case. The code of the test case reproduces the issue, because "lgamma" has the same code (35) inside OpenCL_std as DebugSource inside NonSemantic_Shader_DebugInfo_100.
Commit 4f403e8
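The collision fixed in 4f403e8 comes down to opcode numbers being unique only within one extended instruction set. The sketch below models an OpExtInst as a (set name, opcode) pair, using the values quoted in the commit message. `ExtInst` and both predicates are hypothetical names, not the SPIR-V backend's code.

```python
from typing import NamedTuple


class ExtInst(NamedTuple):
    """Hypothetical model of an OpExtInst: which set it targets and its opcode."""
    inst_set: str
    opcode: int


DEBUG_SET = "NonSemantic_Shader_DebugInfo_100"
DEBUG_OPCODES = {1, 35}  # DebugCompilationUnit, DebugSource (values from the commit message)


def looks_like_debug_inst_by_opcode_only(inst: ExtInst) -> bool:
    """Too broad: matches any instruction set that reuses these numbers."""
    return inst.opcode in DEBUG_OPCODES


def is_debug_inst(inst: ExtInst) -> bool:
    """Check the instruction set as well as the opcode number."""
    return inst.inst_set == DEBUG_SET and inst.opcode in DEBUG_OPCODES


lgamma = ExtInst("OpenCL_std", 35)      # shares number 35 with DebugSource
debug_source = ExtInst(DEBUG_SET, 35)
```

With the set name included in the check, `lgamma` is no longer mistaken for `DebugSource`.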
[SandboxIR] Add tracking for ShuffleVectorInst::commute. (llvm#106644)
Track it as an operand swap + a `setShuffleMask` and delegate to the `llvm::ShuffleVectorInst` implementation.
Commit e89bcfc
[NFC][opt] Rename VerifierKind enums (llvm#106789)
Make into enum class. Output really should be InputOutput since it also verifies the input IR.
Commit fdc1b5d
[libc++] Add missing `std::is_virtual_base_of` to `type_traits.inc` (llvm#107009)
`std::is_virtual_base_of` was implemented in llvm#105847
Commit 4640736
[CMake][compiler-rt] Support for using compiler-rt atomic library (llvm#106603)
Not every toolchain provides or wants to use libatomic, which is a part of GCC; some toolchains may opt into using the compiler-rt atomic library.
Commit 26a4edf
[SandboxIR] Implement remaining ConstantInt functions (llvm#106775)
This patch adds the remaining ConstantInt:: functions and it also implements the IntegerType class.
Commit b91b1f0
[PGO][Pipeline] Enable PGOForceFunctionAttrs in PGO optimization pipelines (llvm#106790)
Remove the flag that turns on the PGOForceFunctionAttrs pass and always add it to default pipelines when using PGO. This is NFC by default since PGOOpt->ColdOptType is by default ColdFuncOpt::Default. Remove the -O2 RUN line in basic.ll since we now have the pipeline tests.
Commit fb14f1d
[libc++] Fix __datasizeof_v for Clang17 and 18 in C++03 (llvm#106832)
This also disables the use of `__datasizeof`, since it's currently broken for empty types.
Commit 42f5277
Commit 24b6b82
Revert "[SLP]Check for the whole vector vectorization in unique scalars analysis"
This reverts commit b74e09c after post-commit review. The number of parts is calculated incorrectly.
Commit 884d7c1
Revert "[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands."
This reverts commit a3ea90f after the post-commit review. The number of parts is calculated incorrectly.
Commit 571c8c2
[SLPVectorizer] Use DenseMap::{find,try_emplace} (NFC) (llvm#107123)
I'm planning to deprecate and eventually remove DenseMap::FindAndConstruct in favor of operator[].
Commit 126940b
[BOLT][YAML] Allow unknown keys in the input (llvm#100824)
This ensures forward compatibility, where old BOLT versions can consume the profile created by newer versions with extra keys. Test Plan: added yaml-unknown-keys.test
-
Copy full SHA for 15fa3ba - Browse repository at this point
Copy the full SHA 15fa3baView commit details -
[Clang] Fix handling of placeholder variables name in init captures (llvm#107055)
We were incorrectly not deduplicating results when looking up `_` which, for a lambda init capture, would result in an ambiguous lookup. The same bug caused some diagnostic notes to be emitted twice. Fixes llvm#107024
-
Copy full SHA for eec1fac - Browse repository at this point
Copy the full SHA eec1facView commit details -
[LV] Prefer FLT_MIN/MAX for fmin/fmax reductions with ninf (llvm#107141)
Analogous to 2c7786e, clean up a case where the vectorizer is emitting a non-canonical identity value given the available flags. We use the largest/smallest value during ISEL and VP expansion, but not during vectorization. Since the fmin/fmax/fminimum/fmaximum intrinsics don't require a start value, this difference is only visible when masking of inactive lanes is required. The primary motivation of this change is simply to remove a difference between versions of code which reason about the identity value of a reduction so I can kill all but one off. In review, it was pointed out that this is actually a functional fix as well. The old code used inf on a noinf reduction instruction - whose result is poison! That wasn't the intent of the code.
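To illustrate why the identity value only matters when inactive lanes are masked, here is a minimal scalar sketch of a masked fmin reduction. It is not the vectorizer's code: `maskedFMin`, `fminIdentityNoInf` and `fmaxIdentityNoInf` are hypothetical names, and the choice of `std::numeric_limits<float>::lowest()` (that is, `-FLT_MAX`) as the fmax identity is an assumption about what "smallest value" means here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumed identities when infinities are ruled out: the largest finite
// float for fmin, the most negative finite float for fmax.
float fminIdentityNoInf() { return std::numeric_limits<float>::max(); }
float fmaxIdentityNoInf() { return std::numeric_limits<float>::lowest(); }

// Masked fmin reduction: inactive lanes are replaced by the identity,
// so they can never change the result.
float maskedFMin(const std::vector<float> &Lanes,
                 const std::vector<bool> &Mask, float Identity) {
  float Acc = Identity;
  for (std::size_t I = 0; I < Lanes.size(); ++I) {
    float V = Mask[I] ? Lanes[I] : Identity;
    Acc = V < Acc ? V : Acc;
  }
  return Acc;
}
```

With lanes `{3, 1, 2}` and mask `{true, false, true}`, the masked-off `1` is replaced by the identity and the reduction yields `2`.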
-
Copy full SHA for 1fbb6b4 - Browse repository at this point
Copy the full SHA 1fbb6b4View commit details -
-
Copy full SHA for 451a313 - Browse repository at this point
Copy the full SHA 451a313View commit details -
[clang] [docs] Clarify the issue with compiler-rt on Windows/MSVC (llvm#106875)
Compiler-rt does support Windows just fine, even if outdated docs pages didn't list it as one of the supported OSes; this is being rectified in llvm#106874. MinGW is another environment configuration on Windows, where compiler-rt or libgcc is linked in automatically, so there's no issue with having such builtin functions available. For MSVC style environments, compiler-rt builtins do work just fine, but Clang doesn't automatically link them in. See e.g. https://discourse.llvm.org/t/improve-autolinking-of-compiler-rt-and-libc-on-windows-with-lld-link/71392 for a discussion on how to improve this situation. But none of this means that compiler-rt itself doesn't support Windows.
-
Copy full SHA for eb05e8f - Browse repository at this point
Copy the full SHA eb05e8fView commit details -
[clang] Don't add DWARF debug info when assembling .s with clang-cl /Z7 (llvm#106686)
This fixes a regression from f58330c. That commit changed the clang-cl options /Zi and /Z7 to be implemented as aliases of -g rather than having separate handling. This had the unintended effect that, when assembling .s files with clang-cl, the /Z7 option (which implies using CodeView debug info) was treated as a -g option, which causes `ClangAs::ConstructJob` to pick up the option as part of `Args.getLastArg(options::OPT_g_Group)`, which sets the `WantDebug` variable. Within `Clang::ConstructJob`, we check for whether explicit `-gdwarf` or `-gcodeview` options have been set, and if not, we pick the default debug format for the current toolchain. However, in `ClangAs`, if debug info has been enabled, it always adds DWARF debug info. Add similar logic in `ClangAs` - check if the user has explicitly requested either DWARF or CodeView, otherwise look up the toolchain default. If we (either implicitly or explicitly) should be producing CodeView, don't enable the default `ClangAs` DWARF generation. This fixes the issue where assembling a single `.s` file with clang-cl, with the /Z7 option, causes the file to contain some DWARF sections. This causes the output executable to contain DWARF, in addition to the separate intended main PDB file. By having the output executable contain DWARF sections, LLDB only looks at the (very little) DWARF info in the executable, rather than looking for a separate standalone PDB file. This caused an issue with LLDB's tests, llvm#101710.
-
Copy full SHA for fcb7b39 - Browse repository at this point
Copy the full SHA fcb7b39View commit details -
[LV] Honor forced scalars in setVectorizedCallDecision.
Similarly to dd94537, setVectorizedCallDecision also did not consider ForcedScalars. This led to VPlans not reflecting the decision by the legacy cost model (cost computation would use scalar cost, VPlan would have VPWidenCallRecipe). To fix this, check if the call has been forced to scalar in setVectorizedCallDecision. Note that this requires moving setVectorizedCallDecision after collectLoopUniforms (which sets ForcedScalars). collectLoopUniforms does not depend on call decisions and can safely be moved. Fixes llvm#107051.
-
Copy full SHA for 3bd161e - Browse repository at this point
Copy the full SHA 3bd161eView commit details -
[clang] [test] Fix the debug-options-as.c test on macOS
Separate the path, which may begin with e.g. /Users, with "--" from the other options, to make it clear that it is a path, not an option. This fixes a test from fcb7b39.
-
Copy full SHA for 70f3511 - Browse repository at this point
Copy the full SHA 70f3511View commit details -
[RISCV] Custom promote f16/bf16 (s/u)int_to_fp. (llvm#107026)
This avoids having isel patterns that emit two instructions. It also allows us to remove sext.w and slli+srli pairs by using fcvt.s.w(u) on RV64.
-
Copy full SHA for ec8e1c6 - Browse repository at this point
Copy the full SHA ec8e1c6View commit details -
[Clang][Sema] clang generates incorrect fix-its for API_AVAILABLE (llvm#105855)
Apple's API_AVAILABLE macro has its own notion of platform names which are supported by `__API_AVAILABLE_PLATFORM_<name>` macros. They don't follow a consistent naming convention, but there's at least one that matches a valid availability attribute platform name. Instead of lowercasing the source spelling name, search for a defined macro and use that in the fix-it.
-
Copy full SHA for 319e8cd - Browse repository at this point
Copy the full SHA 319e8cdView commit details -
[X86] Don't save/restore fp/bp around terminator (llvm#106462)
In function spillFPBP we already try to skip the terminator, but there is a logic error: when the terminator is the only instruction in the MBB, we still try to save/restore fp/bp around it if the terminator clobbers fp/bp, for example a tail call with the ghc calling convention. This patch really skips the terminator even if it is the only instruction in the MBB.
-
Copy full SHA for cdab6ff - Browse repository at this point
Copy the full SHA cdab6ffView commit details -
[clang] [test] Fix the debug-options-as.c test on PowerPC
Use an explicit MSVC triple with an architecture that does have proper handling for MSVC style targets. This fixes a test from fcb7b39.
-
Copy full SHA for cbb5f03 - Browse repository at this point
Copy the full SHA cbb5f03View commit details -
[scudo] Update secondary cache released pages bound. (llvm#106466)
`MaxReleasedCachePages` has been set to 4. Initially, in llvm#105009, we set `MaxReleasedCachePages` to 0 so that the partial chunk heuristic could be introduced incrementally as we observed its impact on retrieval order and more generally, performance. Co-authored-by: Joshua Baehring <josh.baehring@yale.edu>
-
Copy full SHA for 0ef7b1d - Browse repository at this point
Copy the full SHA 0ef7b1dView commit details -
[HLSL] Adjust resource binding diagnostic flags code (llvm#106657)
Adjust register binding diagnostic flags code in a couple of ways:
- Store the resource class in the Flags struct to avoid duplicated scanning for HLSLResourceClassAttribute
- Avoid unnecessary indirection when converting resource class to register type
- Remove recursion and reduce duplicated code

Also fixes a case where struct with an array was incorrectly diagnosed unfit for `c` register binding. This will also simplify work that is needed to be done in this area for llvm#104861.
-
Copy full SHA for 334d123 - Browse repository at this point
Copy the full SHA 334d123View commit details -
[flang][cuda] Convert global allocation for pinned variable (llvm#106807)
-
Copy full SHA for dfc21ac - Browse repository at this point
Copy the full SHA dfc21acView commit details -
This patch fixes: clang/lib/Sema/SemaHLSL.cpp:838:12: error: unused variable 'TheVarDecl' [-Werror,-Wunused-variable] clang/lib/Sema/SemaHLSL.cpp:840:19: error: unused variable 'CBufferOrTBuffer' [-Werror,-Wunused-variable]
-
Copy full SHA for b2dabd2 - Browse repository at this point
Copy the full SHA b2dabd2View commit details -
-
Copy full SHA for d966d47 - Browse repository at this point
Copy the full SHA d966d47View commit details -
[lldb] Avoid FileSpec indirection where we can use SupportFiles directly
Now that more parts of LLDB know about SupportFiles, avoid going through FileSpec (and losing the Checksum in the process). Instead, use the SupportFile directly.
-
Copy full SHA for 98bde7f - Browse repository at this point
Copy the full SHA 98bde7fView commit details -
[SLPVectorizer] Avoid two successive hash lookups on the same key (llvm#107143)
This patch replaces the find-try_emplace sequence with just one call to try_emplace, thereby avoiding two successive hash lookups on the same key. I am not using the "inserted" boolean from try_emplace to preserve the original behavior (that is, before PR 107123) that checks to see if the value is nullptr or not.
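The pattern described above can be sketched as follows. This is an illustration only, not the SLPVectorizer code: `std::unordered_map` stands in for `llvm::DenseMap`, and `Cache`, `slotTwoLookups`, `slotOneLookup` and `getOrCompute` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical cache; std::unordered_map is a stand-in for llvm::DenseMap.
using Cache = std::unordered_map<int, const std::string *>;

// Before: find() hashes the key, then try_emplace() hashes it again on a miss.
const std::string *&slotTwoLookups(Cache &C, int Key) {
  auto It = C.find(Key);
  if (It == C.end())
    It = C.try_emplace(Key, nullptr).first;
  return It->second;
}

// After: a single try_emplace() either inserts a null entry or returns
// the iterator to the existing one.
const std::string *&slotOneLookup(Cache &C, int Key) {
  return C.try_emplace(Key, nullptr).first->second;
}

// Tests the stored value for nullptr instead of using the "inserted" flag,
// so an existing entry that still holds nullptr is treated like a miss.
const std::string *getOrCompute(Cache &C, int Key, const std::string &Value) {
  const std::string *&Slot = slotOneLookup(C, Key);
  if (!Slot)
    Slot = &Value;
  return Slot;
}
```

Checking the value rather than the "inserted" flag matters when a key can already be present with a null value: the flag would report "not inserted", while the null check still fills the slot.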
-
Copy full SHA for 53d3d1a - Browse repository at this point
Copy the full SHA 53d3d1aView commit details -
-
Copy full SHA for db8ca88 - Browse repository at this point
Copy the full SHA db8ca88View commit details -
[Docs] Use cacheable myst_heading_slug_func value
Avoid creating an uncacheable conf variable by using a string instead of a function reference. Also has the effect of avoiding triggering the "config.cache" sphinx warning. Requires myst_parser 0.19.0 (specifically executablebooks/MyST-Parser#696) which is over a year old by now. Do we mandate any minimum version for these dependencies?
-
Copy full SHA for 18cf14e - Browse repository at this point
Copy the full SHA 18cf14eView commit details -
[RISCV] Custom promote f16/bf16 fp_to_(s/u)int to reduce isel patterns that emit two instructions. (llvm#107011)
All of the test changes are because integer type legalization prefers to promote fp_to_uint to fp_to_sint if neither is "Legal".
-
Copy full SHA for db3792b - Browse repository at this point
Copy the full SHA db3792bView commit details -
[lldb] Bump the lldb-dap version number
Bump the lldb-dap version number so that we can publish an updated version in the Visual Studio Marketplace.
-
Copy full SHA for 7d3b81d - Browse repository at this point
Copy the full SHA 7d3b81dView commit details -
[SLP]Fix PR107037: correctly track origonal/modified after vectorizations reduced values
Need to correctly track reduced values with multiple uses in the same reduction emission attempt. Otherwise, the number of reuses might be calculated incorrectly, which may cause a compiler crash. Fixes llvm#107037
-
Copy full SHA for 98bb354 - Browse repository at this point
Copy the full SHA 98bb354View commit details -
[M68k] Introduce more MOVI cases (llvm#98377)
Add three more special cases for loading registers with immediates. The first allows values in the range of [-255, 255] to be loaded with MOVEQ, even if the register is more than 8 bits and the sign extension is unwanted. This is done by loading the bitwise complement of the desired value, then performing a NOT instruction on the loaded register. This special case is only used when a simple MOVEQ cannot be used, and is only used for 32 bit data registers. Address registers cannot support MOVEQ, and the two-instruction sequence is no faster or smaller than a plain MOVE instruction when loading 16 bit immediates on the 68000, and likely slower for more sophisticated microarchitectures. However, the instruction sequence is both smaller and faster than the corresponding MOVE instruction for 32 bit register widths. The second special case is for zeroing address registers. This simply expands to subtracting a register with itself, consuming one instruction word rather than 2-3, with a small improvement in speed as well. The last special case is for assigning sign-extended 16-bit values to a full address register. This takes advantage of the fact that the movea.w instruction sign extends the output, permitting the immediate to be smaller. This is similar to using lea with a 16-bit address, which is not added in this patch as 16-bit absolute addressing is not yet implemented. This is a v2 submission of llvm#90817. It also creates a 'Data' test directory to better align with the backend's tablegen layout.
-
Copy full SHA for d3c10b5 - Browse repository at this point
Copy the full SHA d3c10b5View commit details -
[RISCV] Don't promote f16/bf16 SELECT with Zfhmin/Zfbfmin. (llvm#107138)
Select only needs branches and moves so we don't need to promote it. Promoting would canonicalize NaNs which select shouldn't do.
-
Copy full SHA for 1c874bb - Browse repository at this point
Copy the full SHA 1c874bbView commit details -
[lld-macho] Always store symbol name length eagerly (NFC) (llvm#106906)
The only instance where we weren't already passing a `StringRef` with a known length to `Symbol`'s constructor is where the argument is a string literal. Even in that case, lazy `strlen` calls don't make sense, as the compiler can constant-evaluate the `StringRef(const char*)` constructor. For symbols that go into the symbol table we need the length when calculating the hash anyway. We could get away with not calling `getName()` for local symbols, but the total contribution of `strlen` to the run time is already below 1%, so that would just complicate the code for a negligible benefit.
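As a rough illustration of the constant-evaluation point, the sketch below uses `std::string_view` as a stand-in for `llvm::StringRef` and a hypothetical `Symbol` type that stores the name length eagerly. It is not the lld-macho class; the symbol name is only example data.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical symbol type that stores pointer and length up front.
struct Symbol {
  const char *NameData;
  std::size_t NameSize;

  constexpr explicit Symbol(std::string_view Name)
      : NameData(Name.data()), NameSize(Name.size()) {}

  // No strlen() here: the length was captured at construction time.
  constexpr std::string_view getName() const { return {NameData, NameSize}; }
};

// Constructing from a string literal: the length is computed by the
// compiler, as the static_assert below demonstrates.
constexpr Symbol Header{"__mh_execute_header"};
static_assert(Header.getName().size() == sizeof("__mh_execute_header") - 1);
```

Because the `std::string_view(const char *)` constructor is `constexpr`, building the view from a literal costs nothing at run time, which is the same argument the commit makes for `StringRef`.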
-
Copy full SHA for b24a304 - Browse repository at this point
Copy the full SHA b24a304View commit details -
[ctx_prof] Add Inlining support (llvm#106154)
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant. Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
  - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
  - the contexts of the callee (at the inlined callsite) are moved to the caller.
  - the callee context at the inlined callsite is deleted.
-
Copy full SHA for 3209766 - Browse repository at this point
Copy the full SHA 3209766View commit details -
Revert "[SLP]Fix PR107037: correctly track origonal/modified after vectorizations reduced values"
This reverts commit 98bb354 to fix buildbots https://lab.llvm.org/buildbot/#/builders/155/builds/2056 and https://lab.llvm.org/buildbot/#/builders/11/builds/4407
-
Copy full SHA for dce73e1 - Browse repository at this point
Copy the full SHA dce73e1View commit details -
[compiler-rt][rtsan] Add scoped reporting lock (llvm#107167)
Uses a static lock to ensure multiple threads reporting issues at the same time don't have printing collisions. This isn't so important now, but will be with continue mode in the future.
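A scoped reporting lock of this kind can be sketched with standard C++ primitives. This is an assumption-laden illustration, not the rtsan implementation (which uses sanitizer-internal primitives): `ScopedReportLock`, `report` and `runReporters` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical RAII guard: every instance locks the same static mutex,
// so only one thread can be inside a report at a time.
class ScopedReportLock {
public:
  ScopedReportLock() : Guard(getMutex()) {}

private:
  static std::mutex &getMutex() {
    static std::mutex M;
    return M;
  }
  std::lock_guard<std::mutex> Guard;
};

// Writes a multi-part report into a shared log while holding the lock.
void report(std::string &Log, char Tag) {
  ScopedReportLock Lock;
  for (int I = 0; I < 4; ++I)
    Log.push_back(Tag);
  Log.push_back('\n');
}

// Runs several reporting threads concurrently and returns the combined log.
std::string runReporters(int NumThreads) {
  std::string Log;
  std::vector<std::thread> Threads;
  for (int T = 0; T < NumThreads; ++T)
    Threads.emplace_back(report, std::ref(Log), static_cast<char>('A' + T));
  for (std::thread &Th : Threads)
    Th.join();
  return Log;
}
```

The order of the reports in the log is unspecified, but each report's characters stay contiguous because the guard is held for the whole body of `report`.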
-
Copy full SHA for 18263c3 - Browse repository at this point
Copy the full SHA 18263c3View commit details -
[lldb] Remove limit on max memory read size (llvm#105765)
`memory read` will return an error if you try to read more than 1k bytes in a single command, instructing you to set `target.max-memory-read-size` or use `--force` if you intended to read more than that. This is a safeguard for a command where people are being explicit about how much memory they would like lldb to read (either to display, or save to a file) and is an annoyance every time you need to read more than a small amount. If someone confuses the --count argument with the start address, lldb may begin dumping gigabytes of data but I'd rather that behavior than requiring everyone to special-case their way around a common use case. I don't want to remove the setting because many people have added (much larger) default max read sizes to their ~/.lldbinit files after hitting this behavior. Another option would be to stop reading/using the value in Target.cpp, but I see no harm in leaving the setting if someone really does prefer to have a small cap on their memory read size.
-
Copy full SHA for b076f66 - Browse repository at this point
Copy the full SHA b076f66View commit details
Commits on Sep 4, 2024
-
Remove "Target" from createXReduction naming [nfc]
Despite the stale comments, none of these actually use TTI, and they're solely generating standard LLVM IR.
-
Copy full SHA for 3e8840b - Browse repository at this point
Copy the full SHA 3e8840bView commit details -
[clang] Add test for CWG2486 (`noexcept` and function pointer conversion) (llvm#107131)
[CWG2486](https://cplusplus.github.io/CWG/issues/2486.html) "Call to `noexcept` function via `noexcept(false)` pointer/lvalue" allows `noexcept` functions to be called via `noexcept(false)` pointers or values. There appears to be no implementation divergence whatsoever: https://godbolt.org/z/3afTfeEM8. That said, in C++14 and earlier we do not issue all the diagnostics we issue in C++17 and newer, so I'm specifying the status of the issue accordingly.
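For readers unfamiliar with the issue, a small sketch of the situation it covers, assuming C++17 or newer. This is not the test added by the commit; `answer`, `Ptr` and `callThrough` are made-up names.

```cpp
#include <cassert>

// A function that promises not to throw.
int answer() noexcept { return 42; }

// A pointer to a noexcept function converts implicitly to a pointer to a
// potentially-throwing function of otherwise the same type.
int (*Ptr)() = answer;

// The call expression through the pointer is no longer known to be
// non-throwing, even though the callee is noexcept.
static_assert(noexcept(answer()));
static_assert(!noexcept(Ptr()));

// Calling the noexcept function through a noexcept(false) pointer.
int callThrough(int (*Fn)()) { return Fn(); }
```

Here `callThrough(answer)` and `Ptr()` both call the `noexcept` function through a `noexcept(false)` pointer, which is the call the issue is concerned with.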
-
Copy full SHA for eaa95a1 - Browse repository at this point
Copy the full SHA eaa95a1View commit details -
-
Copy full SHA for 83ad644 - Browse repository at this point
Copy the full SHA 83ad644View commit details -
[SandboxIR] Implement ConstantAggregate (llvm#107136)
This patch implements sandboxir:: ConstantAggregate, ConstantStruct, ConstantArray and ConstantVector, mirroring LLVM IR.
-
Copy full SHA for 814aa43 - Browse repository at this point
Copy the full SHA 814aa43View commit details -
-
Copy full SHA for 48bc8b0 - Browse repository at this point
Copy the full SHA 48bc8b0View commit details -
[RISCV] Bitcast fixed length bf16/f16 build_vector to i16 with Zvfbfmin/Zvfhmin+Zfbfmin/Zfhmin. (llvm#106637)
Previously, if Zfbfmin/Zfhmin were enabled, we only handled build_vectors that could be turned into splat_vectors. We promoted them to f32 splats by extending in the scalar domain and narrowing in the vector domain. This patch fixes a crash where we failed to account for whether the f32 vector type fit in LMUL<=8. Because the new lowering occurs after type legalization, we have to be careful to use XLenVT for the scalar integer type and use custom cast nodes.
-
Copy full SHA for ff0f201 - Browse repository at this point
Copy the full SHA ff0f201View commit details -
[WebAssembly] Remove Kind argument from WebAssemblyOperand (NFC) (llvm#107157)
The `Kind` argument does not need to be passed separately.
-
Copy full SHA for f1615e3 - Browse repository at this point
Copy the full SHA f1615e3View commit details -
[mlir][tensor] Fix consumer fusion for `tensor.pack` without explicit `outer_dims_perm` attribute (llvm#106687)
-
Copy full SHA for c8763f0 - Browse repository at this point
Copy the full SHA c8763f0View commit details -
[clang] Add tests for CWG issues about language linkage (llvm#107019)
This patch covers Core issues about language linkage during declaration matching resolved in [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html), namely [CWG563](https://cplusplus.github.io/CWG/issues/563.html) and [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html).

[CWG563](https://cplusplus.github.io/CWG/issues/563.html) "Linkage specification for objects"
-----------
[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG563](https://cplusplus.github.io/CWG/issues/563.html) is resolved by simplifications that follow its suggestions.

Wording ([[dcl.link]/5](https://eel.is/c++draft/dcl.link#5)):
> In a [linkage-specification](https://eel.is/c++draft/dcl.link#nt:linkage-specification), the specified language linkage applies to the function types of all function declarators and to all functions and variables whose names have external linkage[.](https://eel.is/c++draft/dcl.link#5.sentence-5)

Now the wording clearly says that linkage-specification applies to variables with external linkage.

[CWG1818](https://cplusplus.github.io/CWG/issues/1818.html) "Visibility and inherited language linkage"
------------
[P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
> [CWG386](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#386), [CWG1839](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1839), [CWG1818](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1818), [CWG2058](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2058), [CWG1900](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1900), and Richard’s observation in [“are non-type names ignored in a class-head-name or enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are resolved by describing the limited lookup that occurs for a declarator-id, including the changes in Richard’s [proposed resolution for CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html) (which also resolves CWG1818 and what of CWG2058 was not resolved along with CWG2059) and rejecting the example from [CWG1477](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1477).

Wording ([[dcl.link]/6](https://eel.is/c++draft/dcl.link#6)):
> A redeclaration of an entity without a linkage specification inherits the language linkage of the entity and (if applicable) its type[.](https://eel.is/c++draft/dcl.link#6.sentence-2)

The answer to the question in the example is `extern "C"`, and not linkage mismatch. Further analysis of the example is provided as inline comments in the test itself. Note that https://eel.is/c++draft/dcl.link#7 does NOT apply in this example, as it's focused squarely at declarations that are already known to have C language linkage, and declarations of variables in the global scope.
-
Copy full SHA for 99f02a8 - Browse repository at this point
Copy the full SHA 99f02a8View commit details -
[IR] Remove unused MINARITY operand trait tpl args, NFC (llvm#107165)
These don't look like they've been used since the original 'use-diet' branch was merged in 2008 (f6caff6)
-
Copy full SHA for b057e16 - Browse repository at this point
Copy the full SHA b057e16View commit details -
[VPlan][NFC] Implement `VPWidenMemoryRecipe::computeCost()`. (llvm#105614)
In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.
-
Copy full SHA for ed220e1 - Browse repository at this point
Copy the full SHA ed220e1View commit details -
[AArch64][GlobalISel] Lower G_BUILD_VECTOR to G_INSERT_VECTOR_ELT (llvm#105686)
The lowering happens in post-legalizer lowering if any source registers from G_BUILD_VECTOR are not constants. Add pattern fragment setting `scalar_to_vector ($src)` as equivalent to `vector_insert (undef), ($src), (i64 0)`
-
Copy full SHA for 9b5971a - Browse repository at this point
Copy the full SHA 9b5971aView commit details -
[clang-format] Handle spaces in file paths in git-clang-format.bat (llvm#107041)
This patch is provided by @jeliebig. Fixes llvm#107017.
-
Copy full SHA for 12c0823 - Browse repository at this point
Copy the full SHA 12c0823View commit details -
-
Copy full SHA for a27ff17 - Browse repository at this point
Copy the full SHA a27ff17View commit details -
[clang][Driver] Define soft float macros for PPC. (llvm#106012)
Fixes llvm#105972. Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>
-
Copy full SHA for b55186e - Browse repository at this point
Copy the full SHA b55186eView commit details -
[MLIR][Tensor] Fix source/dest type check in UnPackOp canonicalize (llvm#106094)
Fix `RankedTensorType` equality check in unpack op canonicalization.
-
Copy full SHA for 8d08166 - Browse repository at this point
Copy the full SHA 8d08166View commit details -
[clang-format] Handle pointer/reference in macro definitions (llvm#107074)
A macro definition needs its own scope stack in the annotator, so we add the MacroBodyScopes stack and use ScopeStack to refer to it when in the macro definition body. Also, we need to have a scope type for a child block because its parent line is parsed (and thus the scope type for the braces is popped off the scope stack) before the lines in the child block are. Fixes llvm#99271.
-
Copy full SHA for 812c96e - Browse repository at this point
Copy the full SHA 812c96eView commit details -
[mlir][TensorToSPIRV] Add type check for `tensor.extract` in TensorToSPIRV (llvm#107110)
This patch adds a type check for `tensor.extract` in TensorToSPIRV. Only convert `tensor.extract` with a supported element type. Fixes llvm#74466.
-
Copy full SHA for f4b9839 - Browse repository at this point
Copy the full SHA f4b9839View commit details
Commits on Sep 25, 2024
-
-
Copy full SHA for 2fff529 - Browse repository at this point
Copy the full SHA 2fff529View commit details