[AutoBump] Merge with f4b9839d (Sep 04) (20) #373

Open
wants to merge 289 commits into
base: bump_to_1293ab35
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Aug 31, 2024

  1. [Offload] Fix disabling of cuda target on unsupported platforms (llvm…

    …#106835)
    
    The target name and the message are wrong -- both should say "cuda" for
    the filtering to work.
    
    Fixes commit 300e5b9 (llvm#93186).
    xen0n authored Aug 31, 2024
    75545b3
  2. [Offload] Fix stray libomptarget message helper calls (llvm#106837)

    In llvm#92581 the `LibomptargetUtils.cmake` helpers have been removed, but
    only uses of `libomptarget_say` were migrated. Migrate the remaining few
    warning and error messages so the `check-offload` target would not fail
    due to missing `libomptarget_warning_say`.
    
    While at it, update the `check-offload` unavailability message to say
    `check-offload` instead of `check-libomptarget`.
    
    Fixes llvm#92581
    xen0n authored Aug 31, 2024
    9adf811
  3. [libcxx] Do not include langinfo.h when using the LLVM C library (l…

    …lvm#106634)
    
    Summary:
    The `langinfo.h` header is a POSIX extension, so ideally we would be
    able to build the C++ library without it. Currently the LLVM C library
    doesn't support / provide it. This allows us to build the C++ library
    with locales enabled. We can either disable it here, or just provide
    stubs that do nothing as in
    llvm#106620.
    jhuber6 authored Aug 31, 2024
    109bff1
  4. [libcxx] Use the default rune table when using the LLVM libc (llvm#10…

    …6632)
    
    Summary:
    We currently do not provide a more complicated rune table, so we want
    the default.
    jhuber6 authored Aug 31, 2024
    38dbcbd
  5. [TTI] Add cost model support for [u|s]cmp (llvm#106824)

    This patch adds cost model support for [u|s]cmp.
    dtcxzyw authored Aug 31, 2024
    140e80a
  6. [RISCV] Fix -Wunused-variable in RISCVISelLowering.cpp (NFC)

    /llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21558:14: error: unused variable 'ValLMUL' [-Werror,-Wunused-variable]
        unsigned ValLMUL =
                 ^
    /llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:21561:14: error: unused variable 'PartLMUL' [-Werror,-Wunused-variable]
        unsigned PartLMUL =
                 ^
    2 errors generated.
    DamonFool committed Aug 31, 2024
    1061c6d
  7. 4514c38
  8. [SLP]Initial support for non-power-of-2 (but still whole register) nu…

    …mber of elements in operands.
    
    Patch adds basic support for non-power-of-2 number of elements in
    operands. The patch still requires that this number addresses whole
    registers.
    
    Reviewers: RKSimon
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#106449
    alexey-bataev committed Aug 31, 2024
    a3ea90f
  9. [HLSL] AST support for WaveSize attribute. (llvm#101240)

    First step toward supporting the WaveSize attribute described in
    https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html
    and
    https://microsoft.github.io/hlsl-specs/proposals/0013-wave-size-range.html
    
    A new attribute, HLSLWaveSizeAttr, is now supported in the AST.
    
    Implement both the wave size and the wave size range together, rather
    than separately, which might require more work.
    
    For llvm#70118
    python3kgae authored Aug 31, 2024
    e41579a
  10. [HLSL] Implement output parameter (llvm#101083)

    HLSL output parameters are denoted with the `inout` and `out` keywords
    in the function declaration. When an argument to an output parameter is
    constructed a temporary value is constructed for the argument.
    
    For `inout` parameters the argument is initialized via copy-initialization
    from the argument lvalue expression to the parameter type. For `out`
    parameters the argument is not initialized before the call.
    
    In both cases on return of the function the temporary value is written
    back to the argument lvalue expression through an implicit assignment
    binary operator with casting as required.
    
    This change introduces a new HLSLOutArgExpr ast node which represents
    the output argument behavior. The OutArgExpr has three defined children:
    - An OpaqueValueExpr of the argument lvalue expression.
    - An OpaqueValueExpr of the copy-initialized parameter.
    - A BinaryOpExpr assigning the first with the value of the second.
    
    Fixes llvm#87526
    
    ---------
    
    Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
    Co-authored-by: John McCall <rjmccall@gmail.com>
    3 people authored Aug 31, 2024
    89fb849
  11. [X86] Fix livein handling in emitStackProbeInlineWindowsCoreCLR64. (l…

    …lvm#106828)
    
    Stop adding liveins for virtual registers. In the livein interface, the
    register goes through a MCPhysReg which is uint16_t. This causes the
    virtual register bit to be dropped making it alias to some nonsense
    physical register.
    
    Recompute the liveins for the continue block to handle any live
    registers that are needed by instructions that were spliced from the
    original block. This fixes the machine verifier error, so we can remove
    that FIXME now.
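    
    The truncation described above can be sketched in isolation. This is a
    minimal illustration, not the X86 code: it assumes virtual registers are
    tagged by the top bit of a 32-bit id, and the names `VirtualRegFlag`,
    `PhysRegId`, `makeVirtualReg`, `isVirtualReg` and `narrowToPhysReg` are
    hypothetical stand-ins chosen for this example.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // Assumption for this sketch: a virtual register is a 32-bit id with the
    // top bit set as a tag.
    constexpr uint32_t VirtualRegFlag = 1u << 31;
    
    // Stand-in for MCPhysReg, which the commit message says is uint16_t.
    using PhysRegId = uint16_t;
    
    inline uint32_t makeVirtualReg(uint32_t Index) {
      assert((Index & VirtualRegFlag) == 0 && "index must not use the tag bit");
      return Index | VirtualRegFlag;
    }
    
    inline bool isVirtualReg(uint32_t Reg) { return (Reg & VirtualRegFlag) != 0; }
    
    // Narrowing to 16 bits silently discards the tag bit, so the result looks
    // like an unrelated physical register number.
    inline PhysRegId narrowToPhysReg(uint32_t Reg) {
      return static_cast<PhysRegId>(Reg);
    }
    ```
    
    With these helpers, `narrowToPhysReg(makeVirtualReg(5))` yields `5`, which
    no longer tests as virtual.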
    topperc authored Aug 31, 2024
    8638fe1
  12. 6d9c6f0
  13. [Transforms][IPO] Add remarks for ArgumentPromotion and DeadArgumentE… (

    llvm#105740)
    
    …limination
    
    The ArgumentPromotion and DeadArgumentElimination passes may change a
    function's signature. This makes bpf tracing difficult, since users are
    either not aware of the signature change or need to poke into IR or
    assembly to understand it.
    
    This patch emits remarks so that, when recompiling with
    -foptimization-record-file=<file>, users can check the remarks to see
    what kind of signature change was made to a particular function. The
    following are some examples of the implemented remarks:
    ```
      Pass:            deadargelim
      Name:            ReturnValueRemoved
      DebugLoc:        { File: 'bpf-next/net/mptcp/protocol.c', Line: 572, Column: 0 }
      Function:        mptcp_check_data_fin
      Args:
        - String:          'removing return value '
        - String:          '0'
    
      Pass:            deadargelim
      Name:            ArgumentRemoved
      DebugLoc:        { File: 'bpf-next/kernel/bpf/syscall.c', Line: 1670, Column: 0 }
      Function:        map_delete_elem
      Args:
          - String:          'eliminating argument '
          - ArgName:         uattr.coerce0
          - String:          '('
          - ArgIndex:        '1'
          - String:          ')'
    
      Pass:            argpromotion
      Name:            ArgumentPromoted
      DebugLoc:        { File: 'bpf-next/net/mptcp/protocol.h', Line: 570, Column: 0 }
      Function:        mptcp_subflow_ctx
      Args:
        - String:          'promoting argument '
        - ArgName:         sk
        - String:          '('
        - ArgIndex:        '0'
        - String:          ')'
        - String:          ' to pass by value'
    ```
      [1] llvm#104678
    yonghong-song authored Aug 31, 2024
    470f55f
  14. 2afa975
  15. [clang] function template non-call partial ordering fixes (llvm#106829)

    This applies to function template non-call partial ordering the same
    provisional wording change applied in the call context: Don't perform
    the consistency check on return type and parameters which didn't have
    any template parameters deduced from.
    
    Fixes regression introduced in llvm#100692, which was reported on the PR.
    mizvekov authored Aug 31, 2024
    cfe331b
  16. docs: Clarify commit access requirements in the Developer Policy (llv…

    …m#101414)
    
    We have been discussing changes to our commit access policies recently
    and based on some feedback from clattner here:
    
    https://discourse.llvm.org/t/rfc-new-criteria-for-commit-access/76290/81
    
    We need to update our Developer Policy so that it matches what we are
    actually doing in this project. We currently grant commit access to
    anyone with a valid justification, not just contributors who have
    submitted high-quality patches in the past.
    
    ---------
    
    Co-authored-by: Shilei Tian <i@tianshilei.me>
    tstellar and shiltian authored Aug 31, 2024
    ec58817
  17. 37e109c
  18. [SelectionDAGISel] Use MCRegister and Register for LiveInMap. NFC

    This matches the MachineBasicBlock liveins used to populate it.
    topperc committed Aug 31, 2024
    a3e2936
  19. [DXIL][Analysis] Collect Function properties in Metadata Analysis (ll…

    …vm#105728)
    
    Basic infrastructure to collect Function properties in Metadata Analysis
    - Add a `SmallVector` of entry properties to the metadata information.
    - Add a structure to represent function properties. Currently
    `numthreads` and shader kind properties of shader entry functions are
    represented.
    bharadwajy authored Aug 31, 2024
    8aa8c05

Commits on Sep 1, 2024

  1. 84580a0
  2. [InstCombine] Replace all dominated uses of condition with constants (

    …llvm#105510)
    
    This patch replaces all dominated uses of condition with true/false to
    improve context-sensitive optimizations. It eliminates a bunch of
    branches in llvm-opt-benchmark.
    
    As a side effect, it may introduce new phi nodes in some corner cases.
    See the following case:
    ```
    define i1 @test(i1 %cmp, i1 %cond) {
    entry:
       br i1 %cond, label %bb1, label %bb2
    bb1:
       br i1 %cmp, label %if.then, label %if.else
    if.then:
       br %bb2
    if.else:
       br %bb2
    bb2:
      %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
      ret i1 %res
    }
    ```
    It will be simplified into:
    ```
    define i1 @test(i1 %cmp, i1 %cond) {
    entry:
       br i1 %cond, label %bb1, label %bb2
    bb1:
       br i1 %cmp, label %if.then, label %if.else
    if.then:
       br %bb2
    if.else:
       br %bb2
    bb2:
      %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
      ret i1 %res
    }
    ```
    
    I am planning to fix this in late pipeline/CGP since this problem exists
    before the patch.
    dtcxzyw authored Sep 1, 2024
    380fa87
  3. [NFC] Fix typos (llvm#106817)

    c8ef authored Sep 1, 2024
    4f4bd41
  4. 6f682c2
  5. [clang][bytecode] Fix diagnosing reads from temporaries (llvm#106868)

    Fix the DeclID not being set in global temporaries and use the same
    strategy for deciding if a temporary is readable as the current
    interpreter.
    tbaederr authored Sep 1, 2024
    e4f3b56
  6. [X86] Remove X86RegisterInfo::getSEHRegNum. (llvm#106866)

    As far as I can tell, there's no way to call this. There are no calls in
    the X86 directory. It has the same name as a function in MCRegisterInfo,
    but that function takes a MCRegister and isn't virtual.
    
    The function in MCRegisterInfo uses a DenseMap populated by
    `X86_MC::initLLVMToSEHAndCVRegMapping`. The DenseMap is populated for
    every physical register using the encoding value. I think that means the
    function in MCRegisterInfo would return the same value as the function
    in X86RegisterInfo.
    topperc authored Sep 1, 2024
    feb391c
  7. [RISCV] Custom legalize f16/bf16 FNEG/FABS with Zfhmin/Zbfmin. (llvm#…

    …106886)
    
    The LegalizeDAG expansion will go through memory since i16 isn't a legal
    type. Avoid this by using FMV nodes.
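    
    As a rough illustration of why no memory round trip is needed: once the
    16-bit pattern has been moved into an integer register, negation and
    absolute value are plain sign-bit operations. The sketch below works on
    raw bit patterns; the helper names `fnegBits` and `fabsBits` are
    hypothetical and this is not the lowering code itself.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // Bit 15 is the sign bit in both the IEEE half (f16) and bfloat16 encodings.
    constexpr uint16_t SignBit16 = 0x8000;
    
    // Negation flips the sign bit and leaves exponent and mantissa untouched.
    inline uint16_t fnegBits(uint16_t Bits) {
      return static_cast<uint16_t>(Bits ^ SignBit16);
    }
    
    // Absolute value clears the sign bit.
    inline uint16_t fabsBits(uint16_t Bits) {
      return static_cast<uint16_t>(Bits & ~SignBit16);
    }
    
    // Usage: 0x3C00 encodes 1.0 in IEEE half precision, 0xBC00 encodes -1.0.
    inline void fnegFabsExample() {
      assert(fnegBits(0x3C00) == 0xBC00);
      assert(fabsBits(0xBC00) == 0x3C00);
    }
    ```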
    topperc authored Sep 1, 2024
    3bdec31
  8. 840d4d9
  9. [CMake][Support] Use /nologo when compiling BLAKE3 assembly sources o…

    …n Windows (llvm#106794)
    
    Suppresses the copyright banner for `ml64` compiling BLAKE3 assembly
    sources with MSVC and Ninja on Windows:
    
    ```
    [157/3758] Building ASM_MASM object lib\Support\BLAKE3\CMa...upportBlake3.dir\blake3_avx512_x86-64_windows_msvc.asm.obj
    Microsoft (R) Macro Assembler (x64) Version 14.41.34120.0
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
     Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
    ```
    
    is now just:
    
    ```
     Assembling: C:\path\to\llvm-project\llvm\lib\Support\BLAKE3\blake3_avx512_x86-64_windows_msvc.asm
    ```
    
    We can suppress that last line with `/quiet` in more recent versions of
    `ml64` (from MSVC 2022 17.6) but it is not supported by all potential
    MASM compilers.
    MattBolitho authored Sep 1, 2024
    bec1d86
  10. [Clang][NFC] Don't manually enumerate the PredefinedDeclIDs (llvm#106891

    )
    
    This doesn't seem to have any use other than the possibility of merge
    conflicts and accidentally forgetting to update `NUM_PREDEF_DECL_IDS`.
    philnik777 authored Sep 1, 2024
    4fef204
  11. [SLP] Fix crash of shuffle poison (llvm#106857)

    When the shuffle masks are `PoisonMaskElem`, there is no need to check
    the cost of `SK_ExtractSubvector`, since it is free. Checking it anyway
    causes the compiler to crash with:
    
    Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) &&
    "SK_ExtractSubvector index out of range"' failed.
    tcwzxx authored Sep 1, 2024
    24a043a
  12. 7c4cffd
  13. Revert "[AMDGPU][LTO] Assume closed world after linking (llvm#105845)" (

    llvm#106889)
    
    We can't assume a closed world even in the full LTO post-link stage. It
    is only true if we are building a "GPU executable". However, AMDGPU does
    support "dynamic library". I'm not aware of any approach to tell if it is
    a relocatable link when we create the pass. For now let's revert the
    patch as it is currently breaking things. We can re-enable it once we
    can handle it correctly.
    shiltian authored Sep 1, 2024
    84ed3c2
  14. 57ef16c
  15. 803ab28
  16. [SLP]Fix PR106909: add a check for unsafe FP operations.

    NEON has non-IEEE compliant denormal flushing and the compiler should
    check if it is safe to vectorize instructions for NEON in non-fast math
    mode.
    
    Fixes llvm#106909
    alexey-bataev committed Sep 1, 2024
    6e68fa9
  17. 7b2fe84
  18. affc0c6
  19. [VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). (llvm#106047)

    Implement cost computation for VPWidenCallRecipe. In some cases, targets
    use argument info to compute intrinsic costs. If all operands of the
    call are VPValues with an underlying IR value, use the IR values as
    arguments.
    
    PR: llvm#106731
    fhahn authored Sep 1, 2024
    9ccf825
  20. [LTO] Reduce memory usage for import lists (llvm#106772)

    This patch reduces the memory usage for import lists by employing
    memory-efficient data structures.
    
    With this patch, an import list for a given destination module is
    basically DenseSet<uint32_t> with each element indexing into the
    deduplication table containing tuples of:
    
      {SourceModule, GUID, Definition/Declaration}
    
    In one of our large applications, the peak memory usage goes down by
    9.2% from 6.120GB to 5.555GB during the LTO indexing step.
    
    This patch addresses several sources of space inefficiency associated
    with std::unordered_map:
    
    - std::unordered_map<GUID, ImportKind> takes up 16 bytes because of
      padding even though ImportKind only carries one bit of information.
    
    - std::unordered_map uses pointers to elements, both in the hash table
      proper and for collision chains.
    
    - We allocate an instance of std::unordered_map for each
      {Destination Module, Source Module} pair for which we have at least
      one import.  Most import lists have less than 10 imports, so the
      metadata like the size of std::unordered_map and the pointer to the
      hash table costs a lot relative to the actual contents.
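    
    The layout described above can be sketched with standard containers. This
    is only an illustration of the idea (one shared deduplication table plus
    small per-destination sets of 32-bit ids); the class `ImportTable`, the
    alias `ImportList` and the use of `std::map`/`std::unordered_set` instead
    of LLVM's own containers are assumptions made for this example.
    
    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <map>
    #include <string>
    #include <tuple>
    #include <unordered_set>
    #include <vector>
    
    enum class ImportKind : uint8_t { Definition, Declaration };
    
    // One deduplicated entry: {SourceModule, GUID, Definition/Declaration}.
    using ImportEntry = std::tuple<std::string, uint64_t, ImportKind>;
    
    // Hypothetical table that hands out one 32-bit id per distinct entry.
    class ImportTable {
      std::map<ImportEntry, uint32_t> Ids;
      std::vector<ImportEntry> Entries;
    
    public:
      // Returns the existing id for E, or assigns the next free one.
      uint32_t intern(const ImportEntry &E) {
        auto [It, Inserted] =
            Ids.try_emplace(E, static_cast<uint32_t>(Entries.size()));
        if (Inserted)
          Entries.push_back(E);
        return It->second;
      }
    
      const ImportEntry &lookup(uint32_t Id) const {
        assert(Id < Entries.size() && "unknown import id");
        return Entries[Id];
      }
    
      std::size_t size() const { return Entries.size(); }
    };
    
    // An import list for one destination module is just a set of ids.
    using ImportList = std::unordered_set<uint32_t>;
    ```
    
    Each destination module then stores 4 bytes per import, and the full
    tuple is stored once no matter how many lists refer to it.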
    kazutakahirata authored Sep 1, 2024
    5c0d61e
  21. 984fca5
  22. [RISCV] Add test for llvm.round.i32.f16 RV64+Zfhmin/Zhinxmin. NFC

    We have special handling for this in type legalization, but we
    didn't have a test.
    topperc committed Sep 1, 2024
    5aa83eb
  23. [LV] Don't consider branches leaving loop in collectValuesToIgnore.

    Branches exiting the loop will remain regardless, so don't consider them
    in collectValuesToIgnore.
    
    This fixes another divergence between legacy and VPlan-based cost model.
    
    Fixes llvm#106780.
    fhahn committed Sep 1, 2024
    654bb4e
  24. [AArch64] Add tests for fused FP literals. NFC (llvm#106731)

    This is for an upcoming change to the threshold on Apple targets for
    using a constant pool for FP literals versus building them with integer
    moves.
    
    This file is based on literal_pools_float.ll. I tried to bolt on to the
    existing test, but it got messy as that file is already testing a matrix
    of combinations, so creating this new file instead.
    citymarina authored Sep 1, 2024
    747d89a
  25. [RISCV] Correct the rounding mode for llvm.lround.i64.f32 with RV64+Z…

    …finx.
    
    We should use RMM instead of DYN.
    topperc committed Sep 1, 2024
    776aef1
  26. [RISCV] Custom promote f16 (l)lround/(l)lrint with Zfhmin/Zhinxmin in…

    …stead of using isel patterns.
    topperc committed Sep 1, 2024
    357bd61

Commits on Sep 2, 2024

  1. [lld][ELF] Add -plugin-opt=time-trace= as an alias of `--time-trace…

    …=` (llvm#106803)
    
    Time trace profiler support was added into LLVMgold in
    cd3255a. This patch adds its
    `-plugin-opt` counterpart, which is just an alias to `--time-trace=`,
    into LLD for compatibility.
    mshockwave authored Sep 2, 2024
    5fe852e
  2. [RISCV][TTI] Scale the cost of FP-Int conversion with LMUL (llvm#87506)

    Widening/narrowing the source data type to match the destination data
    type may require multiple steps.
    To model the costs, the patch generates the interim type by following
    the logic in RISCVTargetLowering::lowerVPFPIntConvOp.
    arcbbb authored Sep 2, 2024
    837ee5b
  3. [LoongArch] Remove unnecessary increment operations

    `HighMask` is the value that sets bits from `Msb+1` to 63 to 1, while
    the other bits are set to 0.
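    
    One way to build a mask of that shape is shown below. This is a
    standalone sketch, not the LoongArch code; the function name `highMask`
    is hypothetical.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // Returns a mask with bits Msb+1..63 set to 1 and bits 0..Msb set to 0.
    inline uint64_t highMask(unsigned Msb) {
      assert(Msb < 64 && "Msb must name a bit of a 64-bit value");
      // (2 << Msb) - 1 has bits 0..Msb set. For Msb == 63 the unsigned shift
      // wraps to 0 and the subtraction yields all ones, so the result is 0.
      return ~((uint64_t{2} << Msb) - 1);
    }
    ```
    
    For example, `highMask(3)` is `0xFFFFFFFFFFFFFFF0` and `highMask(63)` is
    `0`.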
    wangleiat committed Sep 2, 2024
    77523f9
  4. [clang][AIX] Fix -print-runtime-dir on AIX (llvm#104806)

    Currently the option prints a path to a nonexistent directory with the
    full triple, `lib/powerpc64-ibm-aix7.2.0.0`. It should only be
    `lib/aix`.
    jakeegan authored Sep 2, 2024
    27e244f
  5. [RISCV] Move VLDSX0Pred from RISCVSchedSiFive7.td to RISCVScheduleV.t…

    …d. NFC (llvm#106671)
    
    This predicate isn't bound to the scheduler model and we may want to
    reuse it in the future. We already moved it to reuse it in our
    downstream.
    topperc authored Sep 2, 2024
    c74cc73
  6. 647f892
  7. [Clang][Concepts] Correct the CurContext for friend declarations (llv…

    …m#106890)
    
    `FindInstantiatedDecl()` relies on the `CurContext` to find the
    corresponding class template instantiation for a class template
    declaration.
    
    Previously, we pushed the semantic declaration context for constraint
    comparison, which is incorrect for constraints on friend declarations.
    In issue llvm#78101, the semantic context of the friend is the TU, so we
    missed the implicit template specialization `Template<void, 4>` when
    looking for the instantiation of the primary template `Template` at the
    time of checking the member instantiation; instead, we mistakenly picked
    up the explicit specialization `Template<float, 5>`, hence the error.
    
    As a bonus, this also fixes a crash when diagnosing constraints. The
    DeclarationName is not necessarily an identifier, so it's incorrect to
    call `getName()` on e.g. overloaded operators. Since the
    DiagnosticBuilder has correctly handled Decl printing, we don't need to
    find the printable name ourselves.
    
    Fixes llvm#78101
    zyn0217 authored Sep 2, 2024
    358165d
  8. da13754
  9. [lldb] Better matching of types in anonymous namespaces (llvm#102111)

    This patch extends TypeQuery matching to support anonymous namespaces. A
    new flag is added to control the behavior. In the "strict" mode, the
    query must match the type exactly -- all anonymous namespaces included.
    The dynamic type resolver in the itanium abi (the motivating use case
    for this) uses this flag, as it queries using the name from the
    demangler, which includes anonymous namespaces.
    
    This ensures we don't confuse a type with a same-named type in an
    anonymous namespace. However, this does *not* ensure we don't confuse
    two types in anonymous namespaces (in different CUs). To resolve this, we
    would need to use a completely different lookup algorithm, which
    probably also requires a DWARF extension.
    
    In the "lax" mode (the default), the anonymous namespaces in the query
    are optional, and this allows one to search for the type using the usual
    language rules (`::A` matches `::(anonymous namespace)::A`).
    
    This patch also changes the type context computation algorithm in
    DWARFDIE, so that it includes anonymous namespace information. This
    causes a slight change in behavior: the algorithm previously stopped
    computing the context after encountering an anonymous namespace, which
    caused the outer namespaces to be ignored. This meant that a type like
    `NS::(anonymous namespace)::A` would be (incorrectly) recognized as
    `::A`. This can cause code depending on the old behavior to misbehave.
    The fix is to specify all the enclosing namespaces in the query, or use
    a non-exact match.
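    
    The difference between the two modes can be sketched on plain lists of
    context names. This is a simplified model for exact (fully qualified)
    queries only, not the TypeQuery implementation; the function `matches`
    and the representation of a context as a vector of strings are
    assumptions made for this example.
    
    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <vector>
    
    const std::string AnonNS = "(anonymous namespace)";
    
    // Query and Type are outermost-first context lists, e.g. {"NS", "A"}.
    // Strict: every component of Type must appear in Query, in order.
    // Lax: anonymous namespaces in Type may be omitted from Query.
    inline bool matches(const std::vector<std::string> &Query,
                        const std::vector<std::string> &Type, bool Strict) {
      assert(!Type.empty() && "a type has at least its own name");
      std::size_t Q = 0;
      for (const std::string &Component : Type) {
        if (Q < Query.size() && Query[Q] == Component) {
          ++Q; // The query names this component explicitly.
          continue;
        }
        if (!Strict && Component == AnonNS)
          continue; // Lax mode: an unmentioned anonymous namespace is skipped.
        return false;
      }
      return Q == Query.size();
    }
    ```
    
    Under this model, the query `{"A"}` matches the type
    `{"(anonymous namespace)", "A"}` only in lax mode, and it does not match
    `{"NS", "(anonymous namespace)", "A"}` in either mode because the outer
    namespace `NS` is no longer ignored.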
    labath authored Sep 2, 2024
    dd5d730
  10. d2ce9dc
  11. [InstCombine] Make backedge check in op of phi transform more precise (

    …llvm#106075)
    
    The op of phi transform wants to prevent moving an operation across a
    backedge, as this may lead to an infinite combine loop.
    
    Currently, this is done using isPotentiallyReachable(). The problem with
    that is that all blocks inside a loop are reachable from each other.
    This means that the op of phi transform is effectively completely
    disabled for code inside loops, even when it's not actually operating on
    a loop phi (just a phi that happens to be in a loop).
    
    Fix this by explicitly computing the backedges inside the function
    instead. Do this via RPOT, which is a bit more efficient than using
    FindFunctionBackedges() (which does it without any pre-computed
    analyses).
    
    For irreducible cycles, there may be multiple possible choices of
    backedge, and this just picks one of them. This is still sufficient to
    prevent combine loops.
    
    This also removes the last use of LoopInfo in InstCombine -- I'll drop
    the analysis in a followup.
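    
    The backedge computation described above can be sketched on a small
    adjacency-list graph. This is an illustration of the general technique
    (walk blocks in reverse post-order and treat any edge to an already
    visited block as a backedge), not the InstCombine code; `Graph`,
    `postOrder` and `findBackedges` are hypothetical names.
    
    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <set>
    #include <utility>
    #include <vector>
    
    // Graph[B] lists the successors of block B.
    using Graph = std::vector<std::vector<int>>;
    
    // Depth-first walk that appends each block after all of its successors.
    static void postOrder(const Graph &G, int B, std::vector<bool> &Seen,
                          std::vector<int> &Out) {
      Seen[B] = true;
      for (int Succ : G[B])
        if (!Seen[Succ])
          postOrder(G, Succ, Seen, Out);
      Out.push_back(B);
    }
    
    // Visit blocks in reverse post-order from Entry. An edge whose target has
    // already been visited (including a self-loop) is recorded as a backedge.
    inline std::set<std::pair<int, int>> findBackedges(const Graph &G, int Entry) {
      assert(Entry >= 0 && static_cast<std::size_t>(Entry) < G.size());
      std::vector<bool> Seen(G.size(), false);
      std::vector<int> Order;
      postOrder(G, Entry, Seen, Order);
      std::reverse(Order.begin(), Order.end());
    
      std::vector<bool> Visited(G.size(), false);
      std::set<std::pair<int, int>> Backedges;
      for (int B : Order) {
        Visited[B] = true;
        for (int Succ : G[B])
          if (Visited[Succ])
            Backedges.insert({B, Succ});
      }
      return Backedges;
    }
    ```
    
    Unlike a reachability query, this marks only the edge that closes a loop,
    so other edges between blocks inside the loop stay eligible for the
    transform.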
    nikic authored Sep 2, 2024
    f044564
  12. [RISCV] Remove zfbfmin.ll. NFC (llvm#106937)

    Most of it is redundant with bfloat-convert.ll. One testcase is found in
    bfloat-imm.ll. The load and stores are more thoroughly tested in
    bfloat-mem.ll.
    topperc authored Sep 2, 2024
    c950ecb
  13. [CodeGen] Update a few places that were passing Register to raw_ostre…

    …am::operator<< (llvm#106877)
    
    These would implicitly cast the register to `unsigned`. Switch most of
    them to use printReg, which gives more readable output. Change some
    others to use Register::id() so we can eventually remove the implicit
    cast to `unsigned`.
    topperc authored Sep 2, 2024
    cd3667d
  14. [clang] Bump up DIAG_SIZE_SEMA by 500 for downstream diagnostics.

    Recently added HLSL diagnostics (89fb849) pushed the Swift compiler over
    the existing limit.
    
    rdar://135126738
    lhames committed Sep 2, 2024
    08a72cb
  15. [TSan] fix crash when symbolize on darwin platforms (llvm#99441)

    The `dli_sname` field in `Dl_info` may be `NULL`, which could cause a
    crash.
    pudge62 authored Sep 2, 2024
    fe1006b
  16. ed6d9f6
  17. [CGP] Undo constant propagation of pointers across calls

    It may be profitable to revert SCCP propagation of C++ static values,
    if such constants are pointers, in order to avoid redundant pointer
    computation, since the method returning the constant is non-removable.
    antoniofrighetto committed Sep 2, 2024
    e4e0dfb
  18. [APInt] Add default-disabled assertion to APInt constructor (llvm#106524)
    
    If the uint64_t constructor is used, assert that the value is actually a
    signed or unsigned N-bit integer depending on whether the isSigned flag
    is set. Provide an implicitTrunc flag to restore the previous behavior,
    where the argument is silently truncated instead.
    
    In this commit, implicitTrunc is enabled by default, which means that
    the new assertions are disabled and no actual change in behavior occurs.
    The plan is to flip the default once all places violating the assertion
    have been fixed. See llvm#80309 for the scope of the necessary changes.
    
    The primary motivation for this change is to avoid incorrectly specified
    isSigned flags. A recurring problem we have is that people write
    something like `APInt(BW, -1)` and this works perfectly fine -- until
    the code path is hit with `BW > 64`. Most of our i128 specific
    miscompilations are caused by variants of this issue.
    
    The cost of the change is that we have to specify the correct isSigned
    flag (and make sure there are no excess bits) for uses where BW is
    always <= 64 as well.
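    
    The failure mode described above can be sketched with a toy model. This is
    not `llvm::APInt`; `Wide128` and `widen` are hypothetical names, and the
    128-bit value is modelled as two 64-bit words. The sketch only shows why a
    64-bit `-1` payload widened without the signed flag is not `-1` at 128 bits.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Toy 128-bit value stored as two 64-bit words (an assumption for this
    // example, not the real APInt representation).
    struct Wide128 {
      uint64_t lo;
      uint64_t hi;
    };

    // Widen a 64-bit payload to 128 bits: zero-extend when isSigned is false,
    // sign-extend when isSigned is true.
    Wide128 widen(uint64_t val, bool isSigned) {
      bool negative = isSigned && (val >> 63) != 0;
      return Wide128{val, negative ? ~uint64_t{0} : uint64_t{0}};
    }
    ```

    With `static_cast<uint64_t>(-1)` as the payload, the unsigned path leaves the
    upper word at zero, while the signed path fills it with ones. At 64 bits or
    fewer the two results are indistinguishable, which is why the bug hides until
    a wider bit width is used.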
    nikic authored Sep 2, 2024
    30cc198
  19. [ARM] Fix failure to register-allocate CMP_SWAP_64 pseudo-inst (llvm#106721)
    
    This test case was failing to compile with a "ran out of registers
    during register allocation" error at -O0. This was because CMP_SWAP_64
    has 3 operands which must be an even-odd register pair, and two other
    GPR operands. All of the def operands are also early-clobber, so
    registers can't be shared between uses and defs. Because the function
    has an over-aligned alloca it needs frame and base pointers, so r6 and
    r11 are both reserved. That leaves r0/r1, r2/r3, r4/r5 and r8/r9 as the
    only valid register pairs, and if the two individual GPR operands happen
    to get allocated to registers in different pairs then only 2 pairs will
    be available for the three GPRPair operands.
    
    To fix this, I've merged the two GPR operands into a single GPRPair
    operand. This means that the instruction now has 4 GPRPair operands,
    which can always be allocated without relying on luck. This does
    constrain register allocation a bit more, but this pseudo instruction is
    only used at -O0, so I don't think that's a problem.
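    
    The pair-counting argument above can be checked with a small sketch. The
    pair list comes from the description; `freePairs` is a hypothetical helper
    written for this example and has nothing to do with the real register
    allocator.

    ```cpp
    #include <cassert>
    #include <set>
    #include <utility>
    #include <vector>

    // Even-odd pairs that remain once r6 and r11 are reserved.
    const std::vector<std::pair<int, int>> kPairs = {
        {0, 1}, {2, 3}, {4, 5}, {8, 9}};

    // Count the pairs in which neither register is already taken by a
    // single-GPR operand.
    int freePairs(const std::set<int> &usedSingles) {
      int n = 0;
      for (const auto &p : kPairs)
        if (!usedSingles.count(p.first) && !usedSingles.count(p.second))
          ++n;
      return n;
    }
    ```

    If the two single GPR operands land in r0 and r2, only two pairs stay free
    for the three pair operands. If they share one pair, three stay free, which
    is what merging them into a single GPRPair operand guarantees.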
    ostannard authored Sep 2, 2024
    9cf6867
  20. [CGP] Regenerate revert-constant-ptr-propagation-on-calls.ll test (NFC)
    
    Multiple buildbots were previously failing.
    antoniofrighetto committed Sep 2, 2024
    d79c4c1
  21. 5bd3ee0
  22. [InstCombine] Remove optional LoopInfo dependency

    llvm#106075 has removed the
    last dependency on LoopInfo in InstCombine, so don't fetch the
    analysis anymore and remove the use-loop-info pass option.
    nikic committed Sep 2, 2024
    34b10e1
  23. 0fa78b6
  24. [SLP] Add vectorization support for [u|s]cmp (llvm#106747)

    This patch adds vectorization support for [u|s]cmp intrinsic calls.
    dtcxzyw authored Sep 2, 2024
    a156b5a
  25. [RuntimeDyld][Windows] Allocate space for dllimport things. (llvm#102586)
    
    We weren't taking account of the space we require in the stubs for
    things that are dllimported, and as a result we could hit the assertion
    failure for running out of stub space. Fix that.
    
    rdar://133473673
    
    ---------
    
    Co-authored-by: Saleem Abdulrasool <compnerd@compnerd.org>
    Co-authored-by: Lang Hames <lhames@gmail.com>
    Co-authored-by: Ben Barham <b.n.barham@gmail.com>
    4 people authored Sep 2, 2024
    a0a2531
  26. [flang][runtime] long double isn't always f80 (llvm#106746)

    f80 is only a thing on x86, and even then the size of long double can be
    changed with compiler flags. Instead set the size according to the host
    system (this is what is already done for integer types).
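    
    To see what "according to the host system" can mean in practice, here is a
    minimal sketch that classifies the host's `long double` by its significand
    width. `describeFormat` and `hostLongDouble` are hypothetical helpers, not
    part of the flang runtime.

    ```cpp
    #include <cassert>
    #include <limits>
    #include <string>

    // Map the significand precision in bits to a descriptive format name.
    std::string describeFormat(int digits) {
      switch (digits) {
      case 53:  return "IEEE binary64 (same as double)";
      case 64:  return "x87 80-bit extended";
      case 106: return "double-double";
      case 113: return "IEEE binary128";
      default:  return "other";
      }
    }

    // Query the host compiler's long double instead of assuming f80.
    std::string hostLongDouble() {
      return describeFormat(std::numeric_limits<long double>::digits);
    }
    ```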
    tblah authored Sep 2, 2024
    cde3838
  27. eaea4d1
  28. Revert "[RuntimeDyld][Windows] Allocate space for dllimport things." (llvm#106954)
    
    Looks like I missed an `override` (maybe that warning was enabled recently?). Will revert and fix.
    
    Reverts llvm#102586
    al45tair authored Sep 2, 2024
    87d9048
  29. [SCCP] Infer return attributes in SCCP as well (llvm#106732)

    We can infer the range/nonnull attributes in non-interprocedural SCCP as
    well. The results may be better after the function has been simplified.
    nikic authored Sep 2, 2024
    24fe1d4
  30. [llvm][Support] Adjust maximum thread name length to the right value for OpenBSD (llvm#106956)
    
    The thread name length is derived from _MAXCOMLEN which is 24.
    brad0 authored Sep 2, 2024
    d710011
  31. [BasicAA] Track nuw through decomposed expressions (llvm#106512)

    When we decompose the GEP offset expression, and the arithmetic is not
    performed using nuw operations, we cannot retain the nuw flag on the
    decomposed GEP.
    
    For example, if we have `gep nuw p, (a-1)`, this is not at all the same
    as `gep nuw (gep nuw p, a), -1`.
    
    Fix this by tracking NUW through linear expression decomposition,
    similarly to what we already do for the NSW flag.
    
    This fixes the miscompilation reported in
    llvm#105496 (comment).
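    
    The difference between the two GEP forms can be illustrated with plain
    integers. This sketch assumes pointers are 64-bit integers, the element size
    is one byte, and a violated `nuw` is reported as `false`. `addNuw`,
    `singleGep` and `splitGep` are hypothetical names for this example only.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Unsigned add that reports wrap-around. Returning false models the
    // poison result of a violated nuw flag.
    bool addNuw(uint64_t base, uint64_t off, uint64_t &out) {
      out = base + off;   // unsigned arithmetic wraps modulo 2^64
      return out >= base; // false if the addition wrapped
    }

    // gep nuw p, (a - 1): one nuw add, offset computed by a plain subtraction.
    bool singleGep(uint64_t p, uint64_t a, uint64_t &out) {
      return addNuw(p, a - 1, out);
    }

    // gep nuw (gep nuw p, a), -1: two nuw adds, the second with offset -1.
    bool splitGep(uint64_t p, uint64_t a, uint64_t &out) {
      uint64_t tmp = 0;
      return addNuw(p, a, tmp) &&
             addNuw(tmp, static_cast<uint64_t>(-1), out);
    }
    ```

    For `p = 0x1000` and `a = 1`, the single form adds an offset of zero and
    succeeds, while the split form wraps when it adds the all-ones offset. That
    is why the flag cannot simply be copied onto the decomposed expression.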
    nikic authored Sep 2, 2024
    b9bba6c
  32. [mlir][ArmSME] Rename slice move operations to insert/extract_tile_slice (llvm#106755)
    
    This renames:
    
    - `arm_sme.move_tile_slice_to_vector` to `arm_sme.extract_tile_slice`
    - `arm_sme.move_vector_to_tile_slice` to `arm_sme.insert_tile_slice`
    
    The new names are more consistent with the rest of MLIR and should be
    easier to understand. The current names (to me personally) are hard to
    parse and easy to mix up when skimming through code.
    
    Additionally, the syntax for `insert_tile_slice` has changed from:
    
    ```mlir
    %4 = arm_sme.insert_tile_slice %0, %1, %2
      : vector<[16]xi8> into vector<[16]x[16]xi8>
    ```
    
    To:
    
    ```mlir
    %4 = arm_sme.insert_tile_slice %0, %1[%2]
      : vector<[16]xi8> into vector<[16]x[16]xi8>
    ```
    
    This is for consistency with `extract_tile_slice`, but also helps with
    readability as it makes it clear which operand is the index.
    MacDue authored Sep 2, 2024
    c425124
  33. 1e65b76
  34. Reapply "[MLIR][LLVM] Make DISubprogramAttr cyclic" (llvm#106571) with fixes (llvm#106947)
    
    This reverts commit fa93be4, restoring
    commit d884b77, with fixes that ensure the CAPI declarations are
    exported properly.
    
    This commit implements LLVM_DIRecursiveTypeAttrInterface for the
    DISubprogramAttr to ensure cyclic subprograms can be imported properly.
    In the process multiple shortcuts around the recently introduced
    DIImportedEntityAttr can be removed.
    gysit authored Sep 2, 2024
    7519755
  35. [AutoUpgrade] Preserve attributes when upgrading named struct return

    For example, if the argument has an alignment attribute, preserve it.
    nikic committed Sep 2, 2024
    5dcea46
  36. [DebugInfo][RemoveDIs] Find types hidden in DbgRecords (llvm#106547)

    When serialising to textual IR, there can be constant Values referred to
    by DbgRecords that don't appear anywhere else, and have types hidden
    even deeper inside them. Enumerate these when enumerating all types.
    
    Test by Mikael Holmén.
    jmorse authored Sep 2, 2024
    25f87f2
  37. f79722b
  38. [X86] scmp/ucmp - add SSE42/AVX2/AVX512 test coverage to show current state of vector legalization/lowering
    RKSimon committed Sep 2, 2024
    f19dff1
  39. a9c71d3
  40. e90b219
  41. [RuntimeDyld][Windows] Allocate space for dllimport things. (llvm#106958)
    
    We weren't taking account of the space we require in the stubs for
    things that are dllimported, and as a result we could hit the assertion
    failure for running out of stub space. Fix that.
    
    Also add a couple of `override` specifiers that were missing last time
    (llvm#102586).
    
    rdar://133473673
    al45tair authored Sep 2, 2024
    bdfd780
  42. [Flang][Lower] Handle mangling of a generic name with a homonym specific procedure (llvm#106693)
    
    This may happen when using modules.
    
    Fixes llvm#93707
    rofirrim authored Sep 2, 2024
    4ed9092
  43. [clang][bytecode] Implement __noop (llvm#106714)

    This does nothing and returns 0.
    tbaederr authored Sep 2, 2024
    f838d6b
  44. [clang][bytecode] Fix zero-init of first union member (llvm#106962)

    ... if done via an ImplicitValueInitExpr.
    We were already doing this later in visitZeroRecordInitializer().
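    
    For readers unfamiliar with the language rule involved, here is a small
    sketch of zero-initializing the first member of a union in a constant
    expression. Whether this exact spelling goes through the
    ImplicitValueInitExpr path in the bytecode interpreter is an assumption; the
    union `U` and `firstMember` are made up for the example.

    ```cpp
    #include <cassert>

    union U {
      int a;
      float b;
    };

    // Empty-brace initialization initializes the first member, so reading `a`
    // in a constant expression is valid and yields 0.
    constexpr U kZero{};
    static_assert(kZero.a == 0, "first union member is zero-initialized");

    int firstMember() { return kZero.a; }
    ```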
    tbaederr authored Sep 2, 2024
    a9006bf
  45. 224112f
  46. 60ed104
  47. [mlir][EmitC] Remove restrictions on include op (llvm#106953)

    An `emitc.include` should be usable even though the parent is not a
    ModuleOp. This requirement is therefore removed.
    marbre authored Sep 2, 2024
    8b2ad5c
  48. Revert "[compiler-rt][fuzzer] SetThreadName build fix for Mingwin attempt (llvm#106902)"
    
    This reverts commit 7c4cffd.
    
    This commit broke compilation in environments that don't use
    winpthreads.
    mstorsjo committed Sep 2, 2024
    b32dc67
  49. [NFC] Fix dead links in TargetCXXABI.def (llvm#96348)

    http://itanium-cxx-abi.github.io/cxx-abi/
    
    > This website may be mirrored in many places, some of which may become
    stale. The current canonical location is:
    >  * http://itanium-cxx-abi.github.io/cxx-abi/
    
    https://github.com/ARM-software/abi-aa
    
    > This is the official place for the latest documents of the Application
    Binary Interface for the Arm® Architecture, both for source files and
    officially released documents.
    MitalAshok authored Sep 2, 2024
    dc3f66a
  50. [NFC][IR] Add CreateCountTrailingZeroElems helper (llvm#106711)

    The LoopIdiomVectorize pass already creates calls to the intrinsic
    experimental_cttz_elts, but PR llvm#88385 will start calling this more
    too so I've created a helper for it.
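    
    As a reminder of what the intrinsic computes, here is a scalar sketch,
    assuming "trailing" counts from element 0 and that an all-zero input yields
    the vector length. `countTrailingZeroElems` is a hypothetical model written
    for this example, not the new IRBuilder helper.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Number of zero elements before the first non-zero one, counting from
    // element 0. Returns the vector length if every element is zero.
    std::size_t countTrailingZeroElems(const std::vector<int> &v) {
      std::size_t n = 0;
      while (n < v.size() && v[n] == 0)
        ++n;
      return n;
    }
    ```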
    david-arm authored Sep 2, 2024
    dc6c3ba
  51. 0c0bac9
  52. [lldb/linux] Make truncated reads work (llvm#106532)

    Previously, we were returning an error if we couldn't read the whole
    region. This doesn't matter most of the time, because lldb caches memory
    reads, and in that process it aligns them to cache line boundaries. As
    (LLDB) cache lines are smaller than pages, the reads are unlikely to
    cross page boundaries.
    
    Nonetheless, this can cause a problem for large reads (which bypass the
    cache), where we're unable to read anything even if just a single byte
    of the memory is unreadable. This patch fixes lldb-server to return
    partial results in that case, and also changes the linux implementation
    to reuse any partial results it got from the process_vm_readv call (to
    avoid having to re-read everything again using ptrace, only to find
    that it stopped at the same place).
    
    This matches debugserver behavior. It is also consistent with the gdb
    remote protocol documentation, but -- notably -- not with actual
    gdbserver behavior (which returns errors instead of partial results). We
    filed a
    [clarification
    bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several
    years ago. Though we did not really reach a conclusion there, I think
    this is the most logical behavior.
    
    The associated test does not currently pass on windows, because the
    windows memory read APIs don't support partial reads (I have a WIP patch
    to work around that).
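    
    The partial-read contract described above can be sketched against a fake
    address space. `FakeMemory` and `readMemory` are hypothetical and only model
    the "return what was readable" behaviour; they are not the lldb-server code.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical address space: one readable region starting at `base`.
    // Everything outside it is unreadable.
    struct FakeMemory {
      uint64_t base;
      std::vector<uint8_t> bytes;
    };

    // Copy as many bytes as are readable starting at `addr` and return that
    // count. A result smaller than `size` signals a truncated read; zero means
    // nothing at `addr` was readable.
    std::size_t readMemory(const FakeMemory &mem, uint64_t addr, uint8_t *buf,
                           std::size_t size) {
      if (addr < mem.base || addr - mem.base >= mem.bytes.size())
        return 0;
      std::size_t offset = static_cast<std::size_t>(addr - mem.base);
      std::size_t avail = mem.bytes.size() - offset;
      std::size_t n = size < avail ? size : avail;
      for (std::size_t i = 0; i < n; ++i)
        buf[i] = mem.bytes[offset + i];
      return n;
    }
    ```

    A caller asking for eight bytes two bytes before the end of the region gets
    those two bytes back rather than an error.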
    labath authored Sep 2, 2024
    181cc75
  53. [VPlan] Use op from underlying call in computeCost if needed.

    This fixes a divergence between legacy and VPlan-based cost model, e.g.
    if one of the operands has a first-order recurrence phi as operand.
    fhahn committed Sep 2, 2024
    b0de7fa
  54. Win release packaging: Don't try to use rpmalloc for 32-bit x86 (llvm#106969)
    
    because that doesn't work (results in `LINK : error LNK2001: unresolved
    external symbol malloc`).
    Based on the title of llvm#91862 it was only intended for use in 64-bit
    builds.
    zmodem authored Sep 2, 2024
    ef26afc
  55. [Analysis] Add getPredicatedExitCount to ScalarEvolution (llvm#105649)

    Due to a reviewer request on PR llvm#88385 I have created this patch
    to add a getPredicatedExitCount function, which is similar to
    getExitCount except that it uses the predicated backedge taken
    information. With PR llvm#88385 we will start to care about more
    loops with multiple exits, and want the ability to query exit
    counts for a particular exiting block. Such loops may require
    predicates in order to be vectorised.
    
    New tests added here:
    
    Analysis/ScalarEvolution/predicated-exit-count.ll
    david-arm authored Sep 2, 2024
    df3d70b
  56. [AArch64] Lower partial add reduction to udot or svdot (llvm#101010)

    This patch introduces lowering of the partial add reduction intrinsic to
    a udot or svdot for AArch64. This also involves adding a
    `shouldExpandPartialReductionIntrinsic` target hook, which AArch64 will
    return false from in the cases that it can be lowered.
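    
    To make the target of this lowering concrete, here is a scalar model of an
    unsigned dot-product accumulate over sixteen byte lanes, which is one valid
    way to perform a partial add reduction of the sixteen products into four
    lanes. `udotModel` and `sumLanes` are hypothetical helpers; this is a sketch
    of the arithmetic, not of the AArch64 lowering code.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstdint>

    // Each 32-bit accumulator lane adds the products of its four byte pairs.
    std::array<uint32_t, 4> udotModel(std::array<uint32_t, 4> acc,
                                      const std::array<uint8_t, 16> &a,
                                      const std::array<uint8_t, 16> &b) {
      for (int lane = 0; lane < 4; ++lane)
        for (int j = 0; j < 4; ++j)
          acc[lane] += uint32_t{a[4 * lane + j]} * uint32_t{b[4 * lane + j]};
      return acc;
    }

    // A final horizontal add gives the same total as summing all 16 products.
    uint32_t sumLanes(const std::array<uint32_t, 4> &v) {
      return v[0] + v[1] + v[2] + v[3];
    }
    ```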
    SamTebbs33 authored Sep 2, 2024
    44cfbef
  57. [clangd] Update TidyFastChecks for release/19.x (llvm#106354)

    Run for clang-tidy checks available in release/19.x branch.
    
    Some notable findings:
    - altera-id-dependent-backward-branch stays slow at 13%.
    - misc-const-correctness became faster, going from 261% to 67%, but is
      still above the 8% threshold.
    - misc-header-include-cycle is a new SLOW check with a 10% runtime
      cost.
    - readability-container-size-empty went from 16% to 13%, still SLOW.
    kadircet authored Sep 2, 2024
    b47d7ce
  58. [NFC][Support] Add FormatVariadic sub-test for validation (llvm#106578)

    - Add validation subtest that tests assert failures in assert enabled
      builds, and that validation is disabled in assert disabled builds.
    jurahul authored Sep 2, 2024
    ad30a05
  59. [NFC][TableGen] Refactor getIntrinsicFnAttributeSet (llvm#106587)

    Fix intrinsic function attributes to not generate attribute sets that
    are empty in `getIntrinsicFnAttributeSet`. Refactor the code to use
    helper functions to get effective memory effects for an intrinsic and to
    check if it has non-default attributes.
    
    This eliminates one case statement in `getIntrinsicFnAttributeSet` that
    we generate today for the case when intrinsic attributes are default
    ones.
    
    Also rename `Intrinsic` to `Int` to follow the naming convention used in
    this file and adjust emission code to not emit unnecessary empty line
    between cases generated.
    jurahul authored Sep 2, 2024
    e5c7cde
  60. b6a4ab5
  61. [clang] Add tests for CWG issues about friend declaration matching (llvm#106117)
    
    This patch covers CWG issues regarding declaration matching when
    `friend` declarations are involved:
    [CWG138](https://cplusplus.github.io/CWG/issues/138.html),
    [CWG386](https://cplusplus.github.io/CWG/issues/386.html),
    [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html), and
    [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html). Atypical
    for our CWG tests, the ones in this patch are quite extensively
    commented in-line, explaining the mechanics. PR description focuses on
    high-level concerns and references.
    
    [CWG138](https://cplusplus.github.io/CWG/issues/138.html) "Friend
    declaration name lookup"
    -----------
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    > [CWG138](https://cplusplus.github.io/CWG/issues/138.html) is resolved
    according to [N1229](http://wg21.link/n1229), except that
    using-directives that nominate nested namespaces are considered.
    
    I find it hard to pin down the scope of this issue, so I'm relying on
    three examples from the filing to define it. Because of that, it's also
    hard to pinpoint exact wording changes that resolve it. Relevant
    references are:
    [[dcl.meaning.general]/2](http://eel.is/c++draft/dcl.meaning#general-2),
    [[namespace.udecl]/10](https://eel.is/c++draft/namespace.udecl#10),
    [[dcl.type.elab]/3](https://eel.is/c++draft/dcl.type.elab#3),
    [[basic.lookup.elab]/1](https://eel.is/c++draft/basic.lookup.elab#1).
    
    [CWG386](https://cplusplus.github.io/CWG/issues/386.html) "Friend
    declaration of name brought in by _using-declaration_"
    -----------
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    > [CWG386](https://cplusplus.github.io/CWG/issues/386.html),
    [CWG1839](https://cplusplus.github.io/CWG/issues/1839.html),
    [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html),
    [CWG2058](https://cplusplus.github.io/CWG/issues/2058.html),
    [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html), and
    Richard’s observation in [“are non-type names ignored in a
    class-head-name or
    enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are
    resolved by describing the limited lookup that occurs for a
    declarator-id, including the changes in Richard’s [proposed resolution
    for
    CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html)
    (which also resolves CWG1818 and what of CWG2058 was not resolved along
    with CWG2059) and rejecting the example from
    [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html).
    
    Wording
    ([[dcl.meaning.general]/2](http://eel.is/c++draft/dcl.meaning#general-2)):
    > — If the
    [id-expression](http://eel.is/c++draft/expr.prim.id.general#nt:id-expression)
    E in the
    [declarator-id](http://eel.is/c++draft/dcl.decl.general#nt:declarator-id)
    of the
    [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) is a
    [qualified-id](http://eel.is/c++draft/expr.prim.id.qual#nt:qualified-id)
    or a [template-id](http://eel.is/c++draft/temp.names#nt:template-id):
    > &nbsp;&nbsp;&nbsp;&nbsp; — [...]
    > &nbsp;&nbsp;&nbsp;&nbsp; — The
    [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator)
    shall correspond to one or more declarations found by the lookup; they
    shall all have the same target scope, and the target scope of the
    [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator) is
    that
    scope[.](http://eel.is/c++draft/dcl.meaning#general-2.2.2.sentence-1)
    
    This issue focuses on the interaction of `friend` declarations that use a
    template-id or qualified-id with using-declarations. The short answer
    is that the terminal name in such declarations undergoes lookup, and
    using-declarations do what they usually do, helping that lookup. The
    target scope of such a friend declaration is the target scope of the
    lookup result, so no conflicts arise with the using-declarations.
    
    [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html) "Definition
    of a `friend` outside its namespace"
    -----------
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    > [...] and rejecting the example from
    [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html).
    
    Wording
    ([[dcl.meaning.general]/3.4](http://eel.is/c++draft/dcl.meaning#general-3.4)):
    > Otherwise, the terminal name of the
    [declarator-id](http://eel.is/c++draft/dcl.decl.general#nt:declarator-id)
    is not looked
    up[.](http://eel.is/c++draft/dcl.meaning#general-3.4.sentence-1)
    If it is a qualified name, the
    [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator)
    shall correspond to one or more declarations nominable in S; all the
    declarations shall have the same target scope and the target scope of
    the [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator)
    is that
    scope[.](http://eel.is/c++draft/dcl.meaning#general-3.4.sentence-2)
    
    This issue focuses on befriending a function in one scope, then defining
    it from another scope using a qualified-id. Contrary to what P1787R6 says in
    prose, this example is accepted by the wording in that paper. In the
    wording quote above, note the absence of a statement like "terminal name
    of the declarator-id is not bound", contrary to similar statements made
    before that in [dcl.meaning.general] about friend declarations and
    template-ids.
    
    There's also a note in [basic.scope.scope] that supports the rejection,
    but it's considered incorrect and expected to be removed in the future.
    This is tracked in cplusplus/draft#7238.
    
    [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html) "Do `friend`
    declarations count as “previous declarations”?"
    ------------------
    
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    > [CWG386](https://cplusplus.github.io/CWG/issues/386.html),
    [CWG1839](https://cplusplus.github.io/CWG/issues/1839.html),
    [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html),
    [CWG2058](https://cplusplus.github.io/CWG/issues/2058.html),
    [CWG1900](https://cplusplus.github.io/CWG/issues/1900.html), and
    Richard’s observation in [“are non-type names ignored in a
    class-head-name or
    enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are
    resolved by describing the limited lookup that occurs for a
    declarator-id, including the changes in Richard’s [proposed resolution
    for
    CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html)
    (which also resolves CWG1818 and what of CWG2058 was not resolved along
    with CWG2059) and rejecting the example from
    [CWG1477](https://cplusplus.github.io/CWG/issues/1477.html).
    
    Wording
    ([[dcl.meaning.general]/2.3](http://eel.is/c++draft/dcl.meaning#general-2.3)):
    > The declaration's target scope is the innermost enclosing namespace
    scope; if the declaration is contained by a block scope, the declaration
    shall correspond to a reachable
    ([[module.reach]](http://eel.is/c++draft/module.reach)) declaration that
    inhabits the innermost block
    scope[.](http://eel.is/c++draft/dcl.meaning#general-2.3.sentence-2)
    
    Wording
    ([[basic.scope.scope]/7](http://eel.is/c++draft/basic.scope#scope-7)):
    > A declaration is
    [nominable](http://eel.is/c++draft/basic.scope#def:nominable) in a
    class, class template, or namespace E at a point P if it precedes P, it
    does not inhabit a block scope, and its target scope is the scope
    associated with E or, if E is a namespace, any element of the inline
    namespace set of E
    ([[namespace.def]](http://eel.is/c++draft/namespace.def))[.](http://eel.is/c++draft/basic.scope#scope-7.sentence-1)
    
    Wording
    ([[dcl.meaning.general]/3.4](http://eel.is/c++draft/dcl.meaning#general-3.4)):
    > If it is a qualified name, the
    [declarator](http://eel.is/c++draft/dcl.decl.general#nt:declarator)
    shall correspond to one or more declarations nominable in S; [...]
    
    In the new wording it's clear that while `friend` declarations of
    functions do not bind names, a declaration is still introduced and is
    nominable, making it eligible for a later definition by qualified-id.
    Endilll authored Sep 2, 2024
    4a505e1
  62. 30d56be
  63. [clang][Driver] Add a custom error option in multilib.yaml. (llvm#105684)
    
    Sometimes a collection of multilibs has a gap in it, where a set of
    driver command-line options can't work with any of the available
    libraries.
    
    For example, the Arm MVE extension requires special startup code (you
    need to initialize FPSCR.LTPSIZE), and also benefits greatly from
    -mfloat-abi=hard. So a multilib provider might build a library for
    systems without MVE, and another for MVE with -mfloat-abi=hard,
    anticipating that that's what most MVE users would want. But then if a
    user compiles for MVE _without_ -mfloat-abi=hard, they can't use either
    of those libraries – one has an ABI mismatch, and the other will fail to
    set up LTPSIZE.
    
    In that situation, it's useful to include a multilib.yaml entry for the
    unworkable intermediate situation, and have it map to a fatal error
    message rather than a set of actual libraries. Then the user gets a
    build failure with a sensible explanation, instead of selecting an
    unworkable library and silently generating bad output. The new
    regression test demonstrates this case.
    
    This patch introduces extra syntax into multilib.yaml, so that a record
    in the `Variants` list can omit the `Dir` key, and in its place, provide
    a `FatalError` key. Then, if that variant is selected, the error message
    is emitted as a clang diagnostic, and multilib selection fails.
    
    In order to emit the error message in `MultilibSet::select`, I had to
    pass a `Driver &` to that function, which involved plumbing one through
    to every call site, and in the unit tests, constructing one specially.
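    
    The selection behaviour described above can be sketched with a toy model.
    Everything here is hypothetical: `Variant`, `Selection` and `selectVariant`
    are invented for the example, the flag strings are abstract labels rather
    than real driver flags, and the "first match wins" rule is a simplification
    that is not claimed to match `MultilibSet::select`.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <string>
    #include <vector>

    // Simplified variant record: exactly one of dir or fatalError is set.
    struct Variant {
      std::vector<std::string> flags; // labels that must all be present
      std::string dir;                // library directory, or empty
      std::string fatalError;         // error message, or empty
    };

    struct Selection {
      bool ok;
      std::string dirOrMessage;
    };

    // Take the first variant whose flags are all present. A variant carrying
    // an error message makes selection fail with that message.
    Selection selectVariant(const std::vector<Variant> &variants,
                            const std::vector<std::string> &userFlags) {
      for (const Variant &v : variants) {
        bool matches = true;
        for (const std::string &f : v.flags)
          if (std::find(userFlags.begin(), userFlags.end(), f) ==
              userFlags.end())
            matches = false;
        if (!matches)
          continue;
        if (!v.fatalError.empty())
          return {false, v.fatalError};
        return {true, v.dir};
      }
      return {false, "no matching multilib"};
    }
    ```

    With variants for "mve plus hard-float", "mve alone" (an error entry) and a
    catch-all, a user who asks for MVE without the hard-float label gets the
    error message instead of silently falling through to the catch-all library.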
    statham-arm authored Sep 2, 2024
    26bf0b4
  64. [NFC] Update check lines of the test case `llvm/test/CodeGen/AMDGPU/remove-no-kernel-id-attribute.ll`
    shiltian committed Sep 2, 2024
    f32f028
  65. cb949b7
  66. [clang][AST][NFC] Make ASTContext::UnwrapSimilar{Array,}Types const (llvm#106992)
    
    They don't mutate the context at all, so mark them const.
    tbaederr authored Sep 2, 2024
    38ae53d
  67. [RISCV] Remove RISCVISD::FP_EXTEND_BF16. (llvm#106939)

    I don't think we need this node. We can isel fp_extend directly.
    fp_extend to f64 requires two instructions, but we can emit them with an
    isel pattern.
    
    I have not removed RISCVISD::FP_ROUND_BF16 because f64->bf16 needs more
    work to fix the double rounding.
    topperc authored Sep 2, 2024
    55eb93b
  68. Reland [AArch64][AsmParser] Directives should clear transitively impl…

    …ied features (llvm#106625) (llvm#106850)
    
    Relands 2497739 addressing the buffer overflow caused when
    dereferencing an iterator past the end of ExtensionMap.
    labrinea authored Sep 2, 2024
    a586b5a
  69. [VPlan] Pass intrinsic inst to TTI in VPWidenCallRecipe::computeCost.

    Follow-up to 9ccf825, adjust computeCost to also pass IntrinsicInst to
    TTI if available, as there are multiple places in TTI which use the
    IntrinsicInst.
    
    Fixes llvm#107016.
    fhahn committed Sep 2, 2024
    50a02e7
  70. [VPlan] Simplify MUL operands at recipe construction.

    This moves the logic to create simplified operands using SCEV to MUL
    recipe creation. This is needed to match the behavior of the legacy cost
    model. TODOs are to extend to other opcodes and move to a transform.
    
    Note that this also restricts the number of SCEV simplifications we
    apply to more precisely match the cases handled by the legacy cost
    model.
    
    Fixes llvm#107015.
    fhahn committed Sep 2, 2024
    954ed05
  71. ecc9aec
  72. 7e8aba2
  73. [MIPS] Fix error messages when rejecting certain assembly not support…

    …ed by ISA (llvm#94695)
    
    … instructions.
    
    This is a fix I stumbled upon while working on something else. I decided
    to break it out since it seems like a good "first issue" to submit. I
    updated the comments in the "wrong error" test files to indicate that
    the messages are no longer incorrect, but I left the names of the test
    files alone. I was not sure what to do with those, so I would appreciate
    thoughts or guidance.
    jdeguire authored Sep 2, 2024
    0ba006d
  74. [LegalizeVectorOps] Defer UnrollVectorOp in ExpandFNEG to caller. (ll…

    …vm#106783)
    
    Make ExpandFNEG return SDValue() when it doesn't expand. The caller
    already knows how to Unroll when Results is empty.
    topperc authored Sep 2, 2024
    366ac8c

Commits on Sep 3, 2024

  1. [gn] port 1e65b76

    Apparently DragonFly BSD and Solaris/illumos call these APIs
    `pthread_get_name_np` / `pthread_set_name_np` (with an extra
    underscore) instead of `pthread_getname_np` / `pthread_setname_np`.
    nico committed Sep 3, 2024
    b6597f5
  2. [RISCV] Correct the scheduler class for FCVT_S_BF16. (llvm#107028)

    Use FCvtF16ToF32 instead of FCvtF32ToF16.
    topperc authored Sep 3, 2024
    ba3c1ed
  3. [LTO] Don't make unnecessary copies of ImportIDTable (llvm#106998)

    Without this patch, {ImportMapTy,SortedImportList}::{begin,end} make
    unnecessary copies of ImportIDTable via:
    
      map_iterator(Imports.begin(), IDs);
    
    The second parameter, IDs, is passed by value, so we make a copy of
    MapVector inside ImportIDTable every time we call begin and end.
    These begin and end show up as time-consuming functions in the
    performance profile.
    
    This patch fixes the problem by passing IDs by reference with
    std::cref.
    
    While we are at it, this patch deletes the copy constructor and
    assignment operator. I cannot think of any legitimate reason to
    make a copy of the deduplication table.
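
    The by-value versus `std::cref` difference described above can be
    sketched in isolation. This is only an illustration: `IdTable`,
    `Mapped`, and `makeMapped` are hypothetical stand-ins, not the LLVM
    types, and the copy counter exists purely to make the copies visible.

    ```cpp
    #include <cassert>
    #include <functional>
    #include <utility>
    #include <vector>

    // Global counter of IdTable copies (illustration only).
    int g_copies = 0;

    // Hypothetical stand-in for a deduplication table that is costly to copy.
    struct IdTable {
      std::vector<int> values{10, 20, 30};
      IdTable() = default;
      IdTable(const IdTable &other) : values(other.values) { ++g_copies; }
      int operator()(int id) const { return values[id]; }
    };

    // Hypothetical adaptor that stores its callable by value, as many
    // iterator adaptors do.
    template <typename Fn> struct Mapped {
      Fn fn;
      int apply(int id) const { return fn(id); }
    };

    template <typename Fn> Mapped<Fn> makeMapped(Fn fn) {
      return Mapped<Fn>{std::move(fn)};
    }

    // Passing the table itself copies it into the adaptor at least once.
    int copiesWhenPassedByValue() {
      IdTable table;
      g_copies = 0;
      auto mapped = makeMapped(table);
      (void)mapped.apply(1);
      return g_copies;
    }

    // Passing std::cref(table) stores only a reference wrapper: no table copy.
    int copiesWhenPassedByCref() {
      IdTable table;
      g_copies = 0;
      auto mapped = makeMapped(std::cref(table));
      (void)mapped.apply(1);
      return g_copies;
    }
    ```

    With `std::cref`, the adaptor holds a `std::reference_wrapper`, so the
    table must outlive the adaptor; that is the usual trade-off of this
    approach.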
    kazutakahirata authored Sep 3, 2024
    9a1d14a
  4. [RISCV] Rename test cases in bfloat-arith.ll and half-arith.ll. NFC

    Use _bf16 or _h instead of _s. The _s was copied from float-arith.ll
    topperc committed Sep 3, 2024
    dc19b59
  5. Revert "[C++20] [Modules] Embed all source files for C++20 Modules (l…

    …lvm#102444)"
    
    This reverts commit 2eeeff8.
    
    See the post commit discussion in
    llvm@2eeeff8
    ChuanqiXu9 committed Sep 3, 2024
    2cbd1bc
  6. [clang][Sema] Fix diagnostic for function overloading in extern "C" (l…

    …lvm#106033)
    
    Fixes llvm#80235
    
    When trying to overload a function within `extern "C"`, the diagnostic
    `functions that differ only in their return type cannot be overloaded`
    is given. This diagnostic is inappropriate because overloading is
    basically not allowed in the C language. However, if the redeclared
    function has the `((overloadable))` attribute, it should be diagnosed as
    `functions that differ only in their return type cannot be overloaded`.
    
    This patch uses `isExternC()` to provide an appropriate diagnostic
    during the diagnostic process. `isExternC()` updates the linkage
    information cache internally, so calling it before merging functions can
    cause clang to crash. An example is declaring `static void foo()` and
    `void foo()` within an `extern "C"` block. Therefore, I decided to call
    `isExternC()` after the compilation error is confirmed and select the
    diagnostic message. The diagnostic message is `conflicting types for
    'func'` similar to the diagnostic in C, and `functions that differ only
    in their return type cannot be overloaded` if the `((overloadable))`
    attribute is given.
    
    Regression tests verify that the expected diagnostics are given when
    trying to overload functions within `extern "C"` and when the
    `((overloadable))` attribute is present.
    
    ---------
    
    Co-authored-by: Sirraide <aeternalmail@gmail.com>
    s-watanabe314 and Sirraide authored Sep 3, 2024
    78abeca
  7. [RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (llvm#…

    …107039)
    
    The LegalizeDAG expansion will go through memory since i16 isn't a legal
    type. Avoid this by using FMV nodes.
    
    Similar to what we did for llvm#106886 for FNEG and FABS. Special care is
    needed to handle the Sign operand being a different type.
    topperc authored Sep 3, 2024
    9a1eded
  8. 0421049
  9. 8e5b43c
  10. [lldb-dap][test] Fix: Typo in unresolved test (llvm#107030)

    There is a typo in an assertion that causes the instruction breakpoint
    test to be unresolved.
    Da-Viper authored Sep 3, 2024
    7d7d2d2
  11. [MachinePipeliner] Make Recurrence MII More Accurate (llvm#105475)

    The current RecMII calculation yields a larger value than necessary.
    This patch refines the calculation.
    mmarjieh authored Sep 3, 2024
    00c198b
  12. [RISCV] Rename vcix_state register to sf_vcix_state. NFC (llvm#10…

    …6995)
    
    Since it's a SiFive VCIX-specific register, it's better to have a prefix
    so that it's more understandable.
    4vtomat authored Sep 3, 2024
    7e6bad1
  13. [compiler-rt] [docs] Mention Windows as one of the supported OSes (ll…

    …vm#106874)
    
    Compiler-rt can be built for Windows, and most parts of it work. Some
    parts only really work on x86/x86_64 (like address sanitizers), but the
    OS overall is supported.
    mstorsjo authored Sep 3, 2024
    af5c18a
  14. 525ffd6
  15. [lldb] Support partial memory reads on windows (llvm#106981)

    ReadProcessMemory will not perform the read if part of the memory is
    unreadable (even though the API has a `number_of_bytes_read`
    argument). To make this work, I explicitly inspect the memory region
    being read and only read the accessible part.
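
    The "only read the accessible part" idea can be sketched without any
    Windows API. The `Region` struct and `accessiblePrefix` helper below
    are hypothetical and only model the clamping step; they are not the
    lldb implementation, and they look at a single region rather than
    merging adjacent readable ones.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Hypothetical description of one memory region of the target process.
    struct Region {
      std::uint64_t base;
      std::uint64_t size;
      bool readable;
    };

    // Returns how many of the `len` bytes starting at `addr` lie inside the
    // readable region that contains `addr`. Returns 0 if `addr` is unmapped
    // or falls in an unreadable region.
    std::uint64_t accessiblePrefix(const std::vector<Region> &regions,
                                   std::uint64_t addr, std::uint64_t len) {
      for (const Region &r : regions) {
        if (addr >= r.base && addr - r.base < r.size) {
          if (!r.readable)
            return 0;
          const std::uint64_t remaining = r.size - (addr - r.base);
          return std::min(len, remaining);
        }
      }
      return 0;
    }
    ```

    A caller would then request only `accessiblePrefix(...)` bytes from
    the all-or-nothing read primitive and report a short read.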
    labath authored Sep 3, 2024
    04ed12c
  16. [Analysis] getIntrinsicForCallSite - add vectorization support for ac…

    …os/asin/atan and cosh/sinh/tanh libcalls (llvm#106844)
    
    Followup to llvm#106584 - ensure acos/asin/atan and cosh/sinh/tanh libcalls correctly map to the llvm intrinsic equivalents
    RKSimon authored Sep 3, 2024
    6c8746b
  17. [clang][bytecode] Print Pointers via APValue (llvm#107056)

    Instead of doing this ourselves, just rely on printing the APValue.
    tbaederr authored Sep 3, 2024
    733a92d
  18. [bazel] Attempt to fix issue fetching remote blob

    Bazel builds currently fail with `Failed to fetch blobs because they do not exist remotely.`. These extra bazel flags hopefully fix it.
    chsigg authored Sep 3, 2024
    a70d999
  19. 6c59dfb
  20. 851bacb
  21. [flang][semantics][OpenMP] store DSA using ultimate sym (llvm#107002)

    Previously we tracked data sharing attributes by the symbol itself not
    by the ultimate symbol. When the private clause came first, subsequent
    uses of the symbol found a host-associated version instead of the
    ultimate symbol and so the check didn't consider them to be the same
    symbol. Always adding and checking for the ultimate symbol ensures that
    we have the same behaviour no matter the order of clauses.
    
    The modified list is only used for this multiple clause check.
    
    Closes llvm#78235
    tblah authored Sep 3, 2024
    4befe65
  22. [X86] canCreateUndefOrPoisonForTargetNode - X86ISD::CMPP (CMPPS/D) no…

    …des do not generate poison
    RKSimon committed Sep 3, 2024
    377045e
  23. fe1a1ee
  24. [Utils][SPIR-V] Adding spirv-sim to LLVM (llvm#104020)

    Currently, the testing infrastructure for SPIR-V is based on FileCheck.
    Those tests are great for checking some level of codegen, but when a test
    needs to check both the CFG layout and the content of each basic block,
    things become messy.
    
    - Because the CHECK/CHECK-DAG/CHECK-NEXT state is limited, it is
    sometimes hard to match the right block: if two basic blocks have similar
    instructions, FileCheck can match the wrong one.
    
    - Cross-lane interaction can be a bit difficult to understand, and
    writing a FileCheck test that is strong enough to catch bad CFG
    transforms while not breaking every time some unrelated codegen part
    changes is hard.
    
    Lastly, the spirv-val tooling we have checks that the generated
    SPIR-V respects the spec, not that it is correct with regard to the
    source IR.
    
    For those reasons, I believe the best way to test the structurizer is
    to:
    - run spirv-val to make sure the CFG respects the spec.
    - simulate the function to validate the result for each lane, making sure
    the generated code is correct.
    
    This simulator has no other dependencies than core python. It also only
    supports a very limited set of instructions as we can test most features
    through control-flow and some basic cross-lane interactions.
    
    As-is, the added tests are just a harness for the simulator itself. If
    this gets merged, the structurizer PR will benefit from this as I'll be
    able to add extensive testing using this.
    
    ---------
    
    Signed-off-by: Nathan Gauër <brioche@google.com>
    Keenuts authored Sep 3, 2024
    c3d8124
  25. [AMDGPU] Create dir for amdgpu specific machineverifier tests (llvm#1…

    …06960)
    
    Move the AMDGPU target-specific test cases in MachineVerifier into a
    new, separate directory.
    Reference:
    llvm#105494 (comment)
    AditiRM authored Sep 3, 2024
    d24a2fd
  26. [mlir][vector] Refactor vector-transfer-to-vector-load-store.mlir (NF…

    …C) (llvm#105509)
    
    Overview of changes:
    
    - All memref input arguments are re-named to %mem.
    - All vector input arguments are re-named to %vec.
    - All index input arguments are re-named to %idx.
    - All tensor input arguments are re-named to %src/%dst.
    - LIT variables were updated to be consistent with input arguments.
    - Renamed all output arguments as %res.
    - Removed unused argument in `transfer_write_broadcast_unit_dim`.
    - Unified indentation of `FileCheck` commands.
    - Split `transfer_write_permutations` and `transfer_write_broadcast_unit_dim` into tensor and memref variants.
    - Renamed `transfer_write_permutations_tensor` as `transfer_write_permutations_tensor_masked`.
    pabloantoniom authored Sep 3, 2024
    4d8903b
  27. [LoopUnroll] Avoid undef values in test (NFC)

    Avoid most of the code being optimized away as a result of
    optimization improvements.
    nikic committed Sep 3, 2024
    52b8795
  28. Revert "[Utils][SPIR-V] Adding spirv-sim to LLVM" (llvm#107084)

    Reverts llvm#104020
    
    Looks like it caused build failures.
    Keenuts authored Sep 3, 2024
    8861328
  29. [libc++][NFC] Canonicalize the benchmark suite a bit

    This replaces `BENCHMARK_TEMPLATE` with `BENCHMARK` and uses
    `BENCHMARK_MAIN()` when possible.
    philnik777 committed Sep 3, 2024
    5e19e31
  30. a5f03b4
  31. [lldb/windows] Reset MainLoop events after handling them (llvm#107061)

    This prevents the callback function from being called in a busy loop.
    Discovered by @slydiman on llvm#106955.
    labath authored Sep 3, 2024
    4353530
  32. [lldb] Add a callback version of TCPSocket::Accept (llvm#106955)

    The existing function already used the MainLoop class, which allows one
    to wait on multiple events at once. It needed to do this in order to
    wait for v4 and v6 connections simultaneously. However, since it was
    creating its own instance of MainLoop, this meant that it was impossible
    to multiplex these sockets with anything else.
    
    This patch simply adds a version of this function which uses an
    externally provided main loop instance, which allows the caller to add
    any events it deems necessary. The previous function becomes a very thin
    wrapper over the new one.
    labath authored Sep 3, 2024
    3d5e1ec
  33. [AArch64][GlobalISel] Legalize 128-bit types for FABS (llvm#104753)

    This patch adds a common lower action for `G_FABS`, which generates `and
    x8, x8, #0x7fffffffffffffff` to reset the sign bit. The action does not
    support vectors since `G_AND` does not support fp128.
    
    
    This approach is different from what SDAG does. SDAG stores the
    value onto the stack, clears the sign bit in the most significant byte,
    and loads the value back into a register. This involves multiple memory
    ops and is likely slower.
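
    The sign-bit masking idea can be illustrated outside the compiler. The
    sketch below assumes IEEE-754 binary64 for `double` and uses a
    hypothetical two-word `Bits128` struct to stand in for a 128-bit
    value whose sign bit is the top bit of the high word; it is not the
    GlobalISel lowering itself.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <cstring>

    // Mask that keeps every bit except the top (sign) bit of a 64-bit word.
    constexpr std::uint64_t kClearSignMask = 0x7fffffffffffffffULL;

    // fabs for a double by clearing the sign bit of its bit pattern.
    double fabsViaMask(double x) {
      std::uint64_t bits;
      std::memcpy(&bits, &x, sizeof bits);
      bits &= kClearSignMask;
      std::memcpy(&x, &bits, sizeof x);
      return x;
    }

    // Hypothetical 128-bit bit pattern split into two 64-bit halves.
    struct Bits128 {
      std::uint64_t lo;
      std::uint64_t hi;
    };

    // Clearing the sign of a 128-bit value only touches the high half.
    Bits128 clearSign128(Bits128 v) {
      v.hi &= kClearSignMask;
      return v;
    }
    ```

    Both helpers stay in registers: there is no store to the stack and
    reload, which is the contrast the commit message draws with SDAG.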
    Him188 authored Sep 3, 2024
    0748f42
  34. [analyzer] Fix false positive for stack-addr leak on simple param ptr (

    …llvm#107003)
    
    Assigning to a pointer parameter does not leak the stack address because
    it stays within the function and is not shared with the caller.
    
    Previous implementation reported any association of a pointer parameter
    with a local address, which is too broad.
    
    This fix enforces that the pointer to a stack variable is related by at
    least one level of indirection.
    
    CPP-5642
    
    Fixes llvm#106834
    necto authored Sep 3, 2024
    aa4f81e
  35. f77f604
  36. [clang][bytecode] Pass FPOptions to floating point ops (llvm#107063)

    So we don't have to retrieve them from the InterpFrame, which is slow.
    tbaederr authored Sep 3, 2024
    0f5f440
  37. [SCCP] Avoid use of undef value in test (NFC)

    Avoid optimizing away most of the code if we resolve this to
    a specific value.
    nikic committed Sep 3, 2024
    c80cabf
  38. [Offload] Change x86_64-pc-linux to x86_64-unknown-linux (llvm#107023)

    It appears that the RUNTIMES build prefers the x86_64-unknown-linux-gnu
    triple notation for the host. This fixes runtime / test breakages when
    compiler-rt is used as the CLANG_DEFAULT_RTLIB.
    jplehr authored Sep 3, 2024
    1a0cf24
  39. [profile] Change __llvm_profile_counter_bias etc. types to match llvm (

    …llvm#102747)
    
    As detailed in Issue llvm#101667, two `profile` tests `FAIL` on 32-bit
    SPARC, both Linux/sparc64 and Solaris/sparcv9 (where the tests work when
    enabled):
    ```
      Profile-sparc :: ContinuousSyncMode/runtime-counter-relocation.c
      Profile-sparc :: ContinuousSyncMode/set-file-object.c
    ```
    The Solaris linker provides the crucial clue as to what's wrong:
    ```
    ld: warning: symbol '__llvm_profile_counter_bias' has differing sizes:
    	(file runtime-counter-relocation-17ff25.o value=0x8; file libclang_rt.profile-sparc.a(InstrProfilingFile.c.o) value=0x4);
    	runtime-counter-relocation-17ff25.o definition taken
    ```
    In fact, the types in `llvm` and `compiler-rt` differ:
    - `__llvm_profile_counter_bias`/`INSTR_PROF_PROFILE_COUNTER_BIAS_VAR` is
    created in `llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp`
    (`InstrLowerer::getCounterAddress`) as `int64_t`, while
    `compiler-rt/lib/profile/InstrProfilingFile.c` uses `intptr_t`. While
    this doesn't matter in the 64-bit case, the type sizes differ for
    32-bit.
    - `__llvm_profile_bitmap_bias`/`INSTR_PROF_PROFILE_BITMAP_BIAS_VAR` has
    the same issue: created in `InstrProfiling.cpp`
    (`InstrLowerer::getBitmapAddress`) as `int64_t`, while
    `InstrProfilingFile.c` again uses `intptr_t`.
    
    This patch changes the `compiler-rt` types to match `llvm`. At the same
    time, the affected testcases are enabled on Solaris, too, where they now
    just `PASS`.
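
    The size mismatch behind the linker warning can be checked in a few
    lines. The helper names below are hypothetical, and the sketch assumes
    that `intptr_t` is pointer-sized, which holds on common targets but is
    not guaranteed by the C++ standard.

    ```cpp
    #include <cassert>
    #include <climits>
    #include <cstddef>
    #include <cstdint>

    // Size in bytes of the type the runtime side used before the patch.
    constexpr std::size_t runtimeBiasSize() { return sizeof(std::intptr_t); }

    // Size in bytes of the type the compiler side emits (always 64 bits).
    constexpr std::size_t compilerBiasSize() { return sizeof(std::int64_t); }

    // True when both sides agree on the symbol size for this target.
    constexpr bool biasSizesAgree() {
      return runtimeBiasSize() == compilerBiasSize();
    }
    ```

    On a 64-bit target `biasSizesAgree()` is expected to be true, while on
    a 32-bit target the runtime side would be 4 bytes against 8, matching
    the `value=0x4` versus `value=0x8` in the warning above.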
    
    Tested on `sparc64-unknown-linux-gnu`, `sparcv9-sun-solaris2.11`,
    `x86_64-pc-linux-gnu`, and `amd64-pc-solaris2.11`.
    rorth authored Sep 3, 2024
    70a19ad
  40. [SLP]Fix PR107036: Check if the type of the user is sizable before re…

    …questing its size.
    
    Only some instructions should be considered as potentially reducing the
    size of the operand types, not all of them.
    
    Fixes llvm#107036
    alexey-bataev committed Sep 3, 2024
    f381cd0
  41. [SCCP] Explicitly mark gep as overdefined if ct eval fails

    Don't just leave the result as unknown. I think this currently
    works out thanks to undef resolution, but the correct thing to
    do is set it to overdefined explicitly.
    nikic committed Sep 3, 2024
    0797c18
  42. [LV] Update call widening decision when scalarizing calls.

    collectInstsToScalarize may decide to scalarize a call. If so, we have
    to update the widening decision for the call, otherwise the call won't
    be scalarized as expected during VPlan construction.
    
    This issue was uncovered by f82543d509.
    fhahn committed Sep 3, 2024
    dd94537
  43. [SLP]Check for the whole vector vectorization in unique scalars analysis

    Need to check that an attempt is made to vectorize a whole number of
    registers before actually trying to build the node, to avoid a compiler
    crash.
    alexey-bataev committed Sep 3, 2024
    b74e09c
  44. ce8ec31
  45. [compiler-rt][rtsan] Record pc and bp higher up in the stack (llvm#10…

    …7014)
    
    Functionally, this change affects only our printed stack traces. The new
    version does not expose any of rtsan's inner workings.
    cjappl authored Sep 3, 2024
    a424b79
  46. [Vectorize] Fix -Wunused-variable in SLPVectorizer.cpp (NFC)

    /llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10310:26:
    error: unused variable 'isExtractSubvectorMask' [-Werror,-Wunused-variable]
                        bool isExtractSubvectorMask =
                             ^
    1 error generated.
    DamonFool committed Sep 3, 2024
    20fa37b
  47. d7c44ef
  48. [BPF] Make -mcpu=v3 as the default (llvm#107008)

    Before llvm20, (void)__sync_fetch_and_add(...) always generates locked
    xadd insns. In linux kernel upstream discussion [1], it is found that
    for arm64 architecture, the original semantics of
    (void)__sync_fetch_and_add(...), i.e., __atomic_fetch_add(...), is
    preferred in order for jit to emit proper native barrier insns.
    
    In llvm commits [2] and [3], (void)__sync_fetch_and_add(...) will
    generate the following insns:
      - for cpu v1/v2: locked xadd insns to keep backward compatibility
      - for cpu v3/v4: __atomic_fetch_add() insns
    
    To ensure proper barrier semantics for (void)__sync_fetch_and_add(...),
    cpu v3/v4 is recommended.
    
    This patch enables cpu=v3 as the default cpu version. For users wanting
    to use cpu v1, -mcpu=v1 needs to be explicitly added to clang/llc
    command line.
    
      [1]
    https://lore.kernel.org/bpf/ZqqiQQWRnz7H93Hc@google.com/T/#mb68d67bc8f39e35a0c3db52468b9de59b79f021f
      [2] llvm#101428
      [3] llvm#106494
    yonghong-song authored Sep 3, 2024
    7852ebc
  49. [clang][bytecode][NFC] Move Call ops into Interp.cpp (llvm#107104)

    They are quite long and not templated.
    tbaederr authored Sep 3, 2024
    f70ccda
  50. [GISEL][AArch64][NFC] Stop using wip_match_opcode for some opcodes (l…

    …lvm#106702)
    
    This patch moves to the new style of writing
    patterns for matching opcodes and thus deprecates using wip_match_opcode.
    It moves G_FCONSTANT, G_ICMP, G_STORE, and G_OR.
    madhur13490 authored Sep 3, 2024
    df159d3
  51. LICM: use IRBuilder in hoist BO assoc (llvm#106978)

    Use IRBuilder when creating the new invariant instruction, so that the
    constant-folder has an opportunity to constant-fold the new Instruction
    that we desire to create.
    artagnon authored Sep 3, 2024
    05f5a91
  52. [ThinLTO] Don't always print ModulesToCompile debugging information (l…

    …lvm#106769)
    
    Nothing went wrong in this case, we just successfully matched a module
    by identifier. No need to print to std::error like we would for
    something that should be user-visible.
    
    Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
    sarnex authored Sep 3, 2024
    fedc755
  53. 3b6e255
  54. [RISCV] Rename sf_vcix_state to sf.vcix_state. NFC (llvm#107115)

    PR llvm#106995 named the vendor CSR incorrectly: the prefix should be
    `sf.` rather than `sf_`.
    4vtomat authored Sep 3, 2024
    b7017ef
  55. [SDAG] Fix a typo in comment

    preames committed Sep 3, 2024
    e1bde1c
  56. [RISCV] Use RNE rounding mode for fcvt.s.bf16. Don't print the roundi…

    …ng mode if RNE. (llvm#106948)
    
    The rounding mode has no effect on the instruction behavior. Using RNE
    matches what we do for fcvt.s.h, fcvt.d.f, fcvt.d.h, which are similarly
    not affected by the rounding mode.
    
    This appears to match the behavior of binutils. According to Compiler
    Explorer, objdump is unable to disassemble fcvt.s.bf16 with a non-zero
    rounding mode.
    topperc authored Sep 3, 2024
    2a9f93b
  57. [ADT] Deprecate DenseMap::getOrInsertDefault (llvm#107040)

    This patch deprecates DenseMap::getOrInsertDefault in favor of
    DenseMap::operator[], which does the same thing, has been around
    longer, and is also a household name as part of std::map and
    std::unordered_map.
    
    Note that DenseMap provides several equivalent ways to insert or
    default-construct a key-value pair:
    
    - operator[Key]
    - try_emplace(Key).first->second
    - getOrInsertDefault(Key)
    - FindAndConstruct(Key).second
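
    The equivalence between the first two spellings can be tried with the
    standard containers the message mentions. The sketch below uses
    `std::map` as a stand-in rather than `DenseMap`, and the two counting
    helpers are hypothetical names chosen for illustration.

    ```cpp
    #include <cassert>
    #include <map>
    #include <string>

    // operator[] value-initializes a missing entry (0 for int), then returns it.
    int countWithSubscript(std::map<std::string, int> &counts,
                           const std::string &word) {
      return ++counts[word];
    }

    // try_emplace with no value arguments (C++17) does the same: it inserts
    // a value-initialized entry only if the key is absent.
    int countWithTryEmplace(std::map<std::string, int> &counts,
                            const std::string &word) {
      return ++counts.try_emplace(word).first->second;
    }
    ```

    Both helpers leave the map in the same state for the same sequence of
    keys, which is why a separate named method adds little beyond
    `operator[]`.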
    kazutakahirata authored Sep 3, 2024
    59a3b41
  58. 86835d2
  59. 93857af
  60. [libclc] More cross compilation fixes (llvm#97811)

    * Move the setup_host_tool calls to the directories of their tool.
    Although it works to call it in libclc, it can only appear in a single
    location so it fails the "what if everyone did this?" test and causes
    problems for downstream code that also wants to use native versions of
    these tools from other projects.
    * Correct the TARGET "${${tool}_target}" check. "${${tool}_target}" may
    be set to the path to the executable, which works in dependencies but
    cannot be tested using if(TARGET). For lack of a better alternative,
    just check that "${${tool}_target}" is non-empty and trust that if it
    is, it is set to a meaningful value. If somehow it turns out to be a
    valid target, its value will still show up in error messages anyway.
    * Account for llvm-spirv possibly being provided in-tree. Per
    https://github.com/KhronosGroup/SPIRV-LLVM-Translator?tab=readme-ov-file#llvm-in-tree-build
    it is possible to drop llvm-spirv into LLVM and have it built as part of
    LLVM's build. In this configuration, cross builds of LLVM require a
    native version of llvm-spirv to be built.
    hvdijk authored Sep 3, 2024
    903d1c6
  61. LICM: extend hoist BO assoc to mul case (llvm#106991)

    Trivially extend hoistBOAssociation to also handle the BinaryOperator
    Mul.
    
    Alive2 proofs: https://alive2.llvm.org/ce/z/zjtR5g
    artagnon authored Sep 3, 2024
    f1ef67d
  62. [gn build] Add missing llvm-strings dependency to check-lld (llvm#106896)
    
    This has been required by `lld/test/ELF/zsectionheader.s` since it was
    added in 5d972c5.
    BertalanD authored Sep 3, 2024
    4da0aa3
  63. [bazel] Change cache-silo-key to fix blob fetch issue.

    Bazel builds currently fail with `Failed to fetch blobs because they do not exist remotely.`. 
    
    Set a cache-silo-key to start a new cache.
    chsigg authored Sep 3, 2024
    df4746d
  64. Prefer use of 0.0 over -0.0 for fadd reductions w/nsz (in IR) (llvm#106770)
    
    This is a follow up to 924907b, and is mostly motivated by consistency
    but does include one additional optimization. In general, we prefer 0.0
    over -0.0 as the identity value for an fadd. We use that value in
    several places, but don't in others. So, let's be consistent and use the
    same identity (when nsz allows) everywhere.
    
    This creates a bunch of test churn, but due to 924907b, most of that
    churn doesn't actually indicate a change in codegen. The exception is
    that this change enables the use of 0.0 for nsz, but *not* reassoc, fadd
    reductions. Or said differently, it allows the neutral value of an
    ordered fadd reduction to be 0.0.
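
A small sketch of why the choice of identity depends on nsz, in plain C++ under IEEE 754 semantics (the helper name is hypothetical, not part of LLVM): -0.0 preserves every addend exactly, while 0.0 only differs in the sign of a zero result, which nsz permits ignoring.

```cpp
#include <cassert>
#include <cmath>

// True if Identity + X reproduces X exactly, including the sign of zero.
bool preservesValueAndSign(double Identity, double X) {
  double R = Identity + X;
  return R == X && std::signbit(R) == std::signbit(X);
}

// Usage: -0.0 is a strict identity; 0.0 only loses the sign of -0.0.
void demoFAddIdentity() {
  assert(preservesValueAndSign(-0.0, -0.0));
  assert(!preservesValueAndSign(0.0, -0.0));
  // Ignoring the sign of zero (what nsz allows), 0.0 works as well.
  assert(0.0 + (-0.0) == -0.0);
}
```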
    preames authored Sep 3, 2024
    2c7786e
  65. [M68k] Fix compilation pipeline check

    - After 'RemoveLoadsIntoFakeUses' is enabled to support llvm.fake.use
    darkbuck committed Sep 3, 2024
    8e4b815
  66. [clang][bytecode][NFC] Simplify builtin-functions.cpp (llvm#107118)

    The effect is the same, but this version doesn't take as long to
    evaluate.
    tbaederr authored Sep 3, 2024
    9626e84
  67. [LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC]

    These recurrence types don't have a meaningful identity, and the
    routine was abused to return the start value instead.  Out of the
    three callers to this routine, only one actually wants this
    behavior.  This is a prep change for removing the routine entirely
    and commoning it with other copies of the same logic.
    preames committed Sep 3, 2024
    0b2f253
  68. [MLIR][AMDGPU] Add support for fp8 ops on gfx12 (llvm#106388)

    This PR is adding support for `fp8` and `bfp8` on gfx12
    giuseros authored Sep 3, 2024
    a8e1c6f
  69. [SPIR-V] Improve correctness of emitted MIR between passes for branching instructions (llvm#106966)
    
    This PR improves the correctness of emitted MIR between passes for
    branching instructions and thus increases the number of passing tests
    when expensive checks are on. Specifically, it addresses the following
    machine verifier issues:
    * fix switch generation: generate correct successors and undo the
    "address taken" status to reflect that a successor doesn't actually
    correspond to an IR-level basic block;
    * fix incorrect definition of OpBranch and OpBranchConditional in
    TableGen (SPIRVInstrInfo.td) to set isBarrier status properly and set a
    correct type of virtual registers;
    * fix a case when Phi refers to a type definition that goes after the
    Phi instruction, so that the virtual register definition of the type
    doesn't dominate all uses.
    
    This PR decreases the number of failing tests under expensive checks
    from 56 to 50.
    VyacheslavLevytskyy authored Sep 3, 2024
    ebdadcf
  70. [SPIR-V] Ensure that OpExtInst instructions generated by NonSemantic_Shader_DebugInfo_100 are not mixed up with other OpExtInst instructions (llvm#107007)
    
    This PR is to ensure that OpExtInst instructions generated by
    NonSemantic_Shader_DebugInfo_100 are not mixed up with other OpExtInst
    instructions.
    
    Original implementation
    (llvm#97558) has introduced an
    issue by moving OpExtInst instruction with the 3rd operand equal to
    DebugSource (value 35) or DebugCompilationUnit (value 1) even if
    OpExtInst is not generated by NonSemantic_Shader_DebugInfo_100
    implementation code.
    
    The reproducer is attached as a new test case. The code of the test case
    reproduces the issue, because "lgamma" has the same code (35) inside
    OpenCL_std as DebugSource inside NonSemantic_Shader_DebugInfo_100.
    VyacheslavLevytskyy authored Sep 3, 2024
    4f403e8
  71. [SandboxIR] Add tracking for ShuffleVectorInst::commute. (llvm#106644)

    Track it as an operand swap + a `setShuffleMask` and delegate to the
    `llvm::ShuffleVectorInst` implementation.
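
As a rough sketch of the idea, a commute can be modelled as an operand swap plus a mask rewrite, and tracking it amounts to saving both so they can be restored. `ToyShuffle` and `CommuteRecord` are hypothetical toy types, not the SandboxIR classes; the mask remapping follows the usual convention that indices below the operand width select from the first operand.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a two-operand shuffle; not the SandboxIR class.
struct ToyShuffle {
  int Op0, Op1;          // operand ids
  int NumOpElts;         // elements per input vector
  std::vector<int> Mask; // -1 marks an undefined lane

  // Swap the operands and retarget every defined mask index.
  void commute() {
    std::swap(Op0, Op1);
    for (int &Idx : Mask) {
      if (Idx == -1)
        continue;
      assert(Idx >= 0 && Idx < 2 * NumOpElts && "mask index out of range");
      Idx = Idx < NumOpElts ? Idx + NumOpElts : Idx - NumOpElts;
    }
  }
};

// Saves the state that commute() changes so it can be restored later.
struct CommuteRecord {
  int Op0, Op1;
  std::vector<int> Mask;
  explicit CommuteRecord(const ToyShuffle &S)
      : Op0(S.Op0), Op1(S.Op1), Mask(S.Mask) {}
  void revert(ToyShuffle &S) const {
    S.Op0 = Op0;
    S.Op1 = Op1;
    S.Mask = Mask;
  }
};
```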
    slackito authored Sep 3, 2024
    e89bcfc
  72. [NFC][opt] Rename VerifierKind enums (llvm#106789)

    Make into enum class.
    
    Output really should be InputOutput since it also verifies the input IR.
    aeubanks authored Sep 3, 2024
    fdc1b5d
  73. [libc++] Add missing std::is_virtual_base_of to type_traits.inc (llvm#107009)
    
    std::is_virtual_base_of was implemented in llvm#105847
    H-G-Hristov authored Sep 3, 2024
    4640736
  74. [CMake][compiler-rt] Support for using compiler-rt atomic library (llvm#106603)
    
    Not every toolchain provides or wants to use libatomic, which is part
    of GCC; some toolchains may opt into using the compiler-rt atomic library.
    petrhosek authored Sep 3, 2024
    26a4edf
  75. [SandboxIR] Implement remaining ConstantInt functions (llvm#106775)

    This patch adds the remaining ConstantInt:: functions and it also
    implements the IntegerType class.
    vporpo authored Sep 3, 2024
    b91b1f0
  76. [PGO][Pipeline] Enable PGOForceFunctionAttrs in PGO optimization pipelines (llvm#106790)
    
    Remove flag that turns on the PGOForceFunctionAttrs pass and always add
    it to default pipelines when using PGO.
    
    This is NFC by default since PGOOpt->ColdOptType is by default
    ColdFuncOpt::Default.
    
    Remove -O2 RUN line in basic.ll since we now have the pipeline tests.
    aeubanks authored Sep 3, 2024
    fb14f1d
  77. [libc++] Fix __datasizeof_v for Clang17 and 18 in C++03 (llvm#106832)

    This also disables the use of `__datasizeof`, since it's currently
    broken for empty types.
    philnik777 authored Sep 3, 2024
    42f5277
  78. 24b6b82
  79. Revert "[SLP]Check for the whole vector vectorization in unique scalars analysis"
    
    This reverts commit b74e09c after
    post-commit review. The number of parts is calculated incorrectly.
    alexey-bataev committed Sep 3, 2024
    884d7c1
  80. Revert "[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands."
    
    This reverts commit a3ea90f after the
    post commit review. The number of parts is calculated incorrectly.
    alexey-bataev committed Sep 3, 2024
    571c8c2
  81. [SLPVectorizer] Use DenseMap::{find,try_emplace} (NFC) (llvm#107123)

    I'm planning to deprecate and eventually remove
    DenseMap::FindAndConstruct in favor of operator[].
    kazutakahirata authored Sep 3, 2024
    126940b
  82. [BOLT][YAML] Allow unknown keys in the input (llvm#100824)

    This ensures forward compatibility, where old BOLT versions can consume
    the profile created by newer versions with extra keys.
    
    Test Plan: added yaml-unknown-keys.test
    aaupov authored Sep 3, 2024
    15fa3ba
  83. [Clang] Fix handling of placeholder variables name in init captures (llvm#107055)
    
    We were incorrectly not deduplicating results when looking up `_` which,
    for a lambda init capture, would result in an ambiguous lookup.
    
    The same bug caused some diagnostic notes to be emitted twice.
    
    Fixes llvm#107024
    cor3ntin authored Sep 3, 2024
    eec1fac
  84. [LV] Prefer FLT_MIN/MAX for fmin/fmax reductions with ninf (llvm#107141)

    Analogous to 2c7786e, cleanup a case
    where the vectorizer is emitting a non-canonical identity value given
    the available flags. We use largest/smallest value during ISEL, and VP
    expansion, but not during vectorization.
    
    Since the fmin/fmax/fminimum/fmaximum intrinsics don't require a start
    value, this difference is only visible when masking of inactive lanes is
    required.
    
    The primary motivation of this change is simply to remove a difference
    between the versions of the code that reason about the identity value
    of a reduction, so I can kill all but one off.
    
    In review, it was pointed out that this is actually a functional fix as well. 
    The old code used inf on a noinf reduction instruction - whose
    result is poison!  That wasn't the intent of the code.
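
To illustrate where the identity value becomes visible, here is a minimal sketch of a masked min reduction in plain C++ (a hypothetical helper, not the vectorizer's code). Inactive lanes are filled with a neutral value; assuming no lane holds an infinity, which is what ninf promises, the largest finite float `FLT_MAX` is neutral for min. The mirrored choice for a max reduction would be `-FLT_MAX`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cfloat>
#include <cstddef>

// Min-reduce only the active lanes; inactive lanes are replaced by Neutral.
template <std::size_t N>
float maskedMinReduce(const std::array<float, N> &Lanes,
                      const std::array<bool, N> &Active, float Neutral) {
  float Result = Neutral;
  for (std::size_t I = 0; I < N; ++I)
    Result = std::min(Result, Active[I] ? Lanes[I] : Neutral);
  return Result;
}

// Usage: the masked-off 0.5 must not influence the result.
void demoMaskedMin() {
  std::array<float, 4> Lanes{3.0f, 1.0f, 2.0f, 0.5f};
  std::array<bool, 4> Active{true, true, true, false};
  assert(maskedMinReduce(Lanes, Active, FLT_MAX) == 1.0f);
}
```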
    preames authored Sep 3, 2024
    1fbb6b4
  85. 451a313
  86. [clang] [docs] Clarify the issue with compiler-rt on Windows/MSVC (llvm#106875)
    
    Compiler-rt does support Windows just fine, even if outdated docs pages
    didn't list it as one of the supported OSes; this is being rectified in
    llvm#106874.
    
    MinGW is another environment configuration on Windows, where compiler-rt
    or libgcc is linked in automatically, so there's no issue with having
    such builtin functions available.
    
    For MSVC style environments, compiler-rt builtins do work just fine, but
    Clang doesn't automatically link them in. See e.g.
    https://discourse.llvm.org/t/improve-autolinking-of-compiler-rt-and-libc-on-windows-with-lld-link/71392
    for a discussion on how to improve this situation. But none of this
    means that compiler-rt itself doesn't support Windows.
    mstorsjo authored Sep 3, 2024
    eb05e8f
  87. [clang] Don't add DWARF debug info when assembling .s with clang-cl /Z7 (llvm#106686)
    
    This fixes a regression from f58330c.
    
    That commit changed the clang-cl options /Zi and /Z7 to be implemented
    as aliases of -g rather than having separate handling.
    
    This had the unintended effect, that when assembling .s files with
    clang-cl, the /Z7 option (which implies using CodeView debug info) was
    treated as a -g option, which causes `ClangAs::ConstructJob` to pick up
    the option as part of `Args.getLastArg(options::OPT_g_Group)`, which
    sets the `WantDebug` variable.
    
    Within `Clang::ConstructJob`, we check for whether explicit `-gdwarf` or
    `-gcodeview` options have been set, and if not, we pick the default
    debug format for the current toolchain. However, in `ClangAs`, if debug
    info has been enabled, it always adds DWARF debug info.
    
    Add similar logic in `ClangAs` - check if the user has explicitly
    requested either DWARF or CodeView, otherwise look up the toolchain
    default. If we (either implicitly or explicitly) should be producing
    CodeView, don't enable the default `ClangAs` DWARF generation.
    
    This fixes the issue, where assembling a single `.s` file with clang-cl,
    with the /Z7 option, causes the file to contain some DWARF sections.
    This causes the output executable to contain DWARF, in addition to the
    separate intended main PDB file.
    
    By having the output executable contain DWARF sections, LLDB only looks
    at the (very little) DWARF info in the executable, rather than looking
    for a separate standalone PDB file. This caused an issue with LLDB's
    tests, llvm#101710.
    mstorsjo authored Sep 3, 2024
    fcb7b39
  88. [LV] Honor forced scalars in setVectorizedCallDecision.

    Similarly to dd94537, setVectorizedCallDecision also did not consider
    ForcedScalars. This led to VPlans not reflecting the decision by the
    legacy cost model (cost computation would use scalar cost, VPlan would
    have VPWidenCallRecipe).
    
    To fix this, check if the call has been forced to scalar in
    setVectorizedCallDecision.
    
    Note that this requires moving setVectorizedCallDecision after
    collectLoopUniforms (which sets ForcedScalars). collectLoopUniforms does
    not depend on call decisions and can safely be moved.
    
    Fixes llvm#107051.
    fhahn committed Sep 3, 2024
    3bd161e
  89. [clang] [test] Fix the debug-options-as.c test on macOS

    Separate the path, which may begin with e.g. /Users, with "--" from
    the other options, to make it clear that it is a path, not an
    option.
    
    This fixes a test from fcb7b39.
    mstorsjo committed Sep 3, 2024
    70f3511
  90. [RISCV] Custom promote f16/bf16 (s/u)int_to_fp. (llvm#107026)

    This avoids having isel patterns that emit two instructions. It also
    allows us to remove sext.w and slli+srli pairs by using fcvt.s.w(u) on
    RV64.
    topperc authored Sep 3, 2024
    ec8e1c6
  91. [Clang][Sema] clang generates incorrect fix-its for API_AVAILABLE (llvm#105855)
    
    Apple's API_AVAILABLE macro has its own notion of platform names which
    are supported by \_\_API_AVAILABLE_PLATFORM_<name> macros. They don't
    follow a consistent naming convention, but there's at least one that
    matches a valid availability attribute platform name. Instead of
    lowercasing the source spelling name, search for a defined macro and use
    that in the fix-it.
    ian-twilightcoder authored Sep 3, 2024
    319e8cd
  92. [X86] Don't save/restore fp/bp around terminator (llvm#106462)

    In function spillFPBP we already try to skip terminator, but there is a
    logic error, so when there is only terminator instruction in the MBB, it
    still tries to save/restore fp/bp around it if the terminator clobbers
    fp/bp, for example a tail call with ghc calling convention.
    
    Now this patch really skips terminator even if it is the only
    instruction in the MBB.
    weiguozhi authored Sep 3, 2024
    cdab6ff
  93. [clang] [test] Fix the debug-options-as.c test on PowerPC

    Use an explicit MSVC triple with an architecture that does
    have proper handling for MSVC style targets.
    
    This fixes a test from fcb7b39.
    mstorsjo committed Sep 3, 2024
    cbb5f03
  94. [scudo] Update secondary cache released pages bound. (llvm#106466)

    `MaxReleasedCachePages` has been set to 4. Initially, in llvm#105009, we
    set `MaxReleasedCachePages` to 0 so that the partial chunk heuristic
    could be introduced incrementally as we observed its impact on retrieval
    order and more generally, performance.
    
    Co-authored-by: Joshua Baehring <josh.baehring@yale.edu>
    JoshuaMBa and JoshuaMBa authored Sep 3, 2024
    0ef7b1d
  95. [HLSL] Adjust resource binding diagnostic flags code (llvm#106657)

    Adjust register binding diagnostic flags code in a couple of ways:
    - Store the resource class in the Flags struct to avoid duplicated
    scanning for HLSLResourceClassAttribute
    - Avoid unnecessary indirection when converting resource class to
    register type
    - Remove recursion and reduce duplicated code
    
    Also fixes a case where a struct with an array was incorrectly diagnosed
    as unfit for `c` register binding.
    
    This will also simplify the work that needs to be done in this area for
    llvm#104861.
    hekota authored Sep 3, 2024
    334d123
  96. [flang][cuda] Convert global allocation for pinned variable (llvm#106807)
    
    ALLOCATE/DEALLOCATE statements for module allocatable variables with the
    pinned attribute can be lowered to the standard runtime call and do not
    need further action since these variables will have a unique descriptor
    that is on the host.
    clementval authored Sep 3, 2024
    dfc21ac
  97. [Sema] Fix warnings

    This patch fixes:
    
      clang/lib/Sema/SemaHLSL.cpp:838:12: error: unused variable
      'TheVarDecl' [-Werror,-Wunused-variable]
    
      clang/lib/Sema/SemaHLSL.cpp:840:19: error: unused variable
      'CBufferOrTBuffer' [-Werror,-Wunused-variable]
    kazutakahirata committed Sep 3, 2024
    b2dabd2
  98. d966d47
  99. [lldb] Avoid FileSpec indirection where we can use SupportFiles directly

    Now that more parts of LLDB know about SupportFiles, avoid going through
    FileSpec (and losing the Checksum in the process). Instead, use the
    SupportFile directly.
    JDevlieghere committed Sep 3, 2024
    98bde7f
  100. [SLPVectorizer] Avoid two successive hash lookups on the same key (llvm#107143)
    
    This patch replaces the find-try_emplace sequence with just one call
    to try_emplace, thereby avoiding two successive hash lookups on the
    same key.  I am not using the "inserted" boolean from try_emplace to
    preserve the original behavior (that is, before PR 107123) that checks
    to see if the value is nullptr or not.
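
The pattern described above might look roughly like this in standard C++. It is a sketch only: `std::unordered_map` stands in for `DenseMap`, and `Node`, `NodeMap`, and `getOrCreate` are hypothetical names. The point is that one `try_emplace` call both finds and inserts, and the null check on the stored value (rather than the "inserted" flag) decides whether to fill the slot.

```cpp
#include <cassert>
#include <unordered_map>

struct Node { int Id; };  // hypothetical payload type
using NodeMap = std::unordered_map<int, Node *>;

// One hash lookup: try_emplace finds the slot or creates it holding nullptr.
Node *getOrCreate(NodeMap &M, int Key, Node &Fresh) {
  Node *&Slot = M.try_emplace(Key, nullptr).first->second;
  // Check the value, not the "inserted" flag, so that an existing entry
  // that maps to nullptr is also populated.
  if (!Slot)
    Slot = &Fresh;
  assert(Slot && "slot is populated on return");
  return Slot;
}
```

Testing the value instead of the flag matters when a key can already be present with a null pointer; relying on "inserted" would leave that entry empty.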
    kazutakahirata authored Sep 3, 2024
    53d3d1a
  101. db8ca88
  102. [Docs] Use cacheable myst_heading_slug_func value

    Avoid creating an uncacheable conf variable by using a string instead of
    a function reference. Also has the effect of avoiding triggering the
    "config.cache" sphinx warning.
    
    Requires myst_parser 0.19.0 (specifically
    executablebooks/MyST-Parser#696) which is over a
    year old by now. Do we mandate any minimum version for these
    dependencies?
    slinder1 committed Sep 3, 2024
    18cf14e
  103. [RISCV] Custom promote f16/bf16 fp_to_(s/u)int to reduce isel patterns that emit two instructions. (llvm#107011)
    
    All of the test changes are because integer type legalization prefers to promote
    fp_to_uint to fp_to_sint if neither is "Legal".
    topperc authored Sep 3, 2024
    db3792b
  104. [lldb] Bump the lldb-dap version number

    Bump the lldb-dap version number so that we can publish an updated
    version in the Visual Studio Marketplace.
    JDevlieghere committed Sep 3, 2024
    7d3b81d
  105. [SLP]Fix PR107037: correctly track original/modified after vectorizations reduced values
    
    Need to correctly track reduced values with multiple uses in the same
    reduction emission attempt. Otherwise, the number of reuses might be
    calculated incorrectly, which may cause a compiler crash.
    
    Fixes llvm#107037
    alexey-bataev committed Sep 3, 2024
    98bb354
  106. [M68k] Introduce more MOVI cases (llvm#98377)

    Add three more special cases for loading registers with immediates.
    
    The first allows values in the range of [-255, 255] to be loaded with
    MOVEQ, even if the register is more than 8 bits and the sign extension
    is unwanted. This is done by loading the bitwise complement of the
    desired value, then performing a NOT instruction on the loaded register.
    
    This special case is only used when a simple MOVEQ cannot be used, and
    is only used for 32 bit data registers. Address registers cannot support
    MOVEQ, and the two-instruction sequence is no faster or smaller than a
    plain MOVE instruction when loading 16 bit immediates on the 68000, and
    likely slower for more sophisticated microarchitectures. However, the
    instruction sequence is both smaller and faster than the corresponding
    MOVE instruction for 32 bit register widths.
    
    The second special case is for zeroing address registers. This simply
    expands to subtracting a register with itself, consuming one instruction
    word rather than 2-3, with a small improvement in speed as well.
    
    The last special case is for assigning sign-extended 16-bit values to a
    full address register. This takes advantage of the fact that the movea.w
    instruction sign extends the output, permitting the immediate to be
    smaller. This is similar to using lea with a 16-bit address, which is
    not added in this patch as 16-bit absolute addressing is not yet
    implemented.
    
    This is a v2 submission of llvm#90817. It also creates a 'Data' test
    directory to better align with the backend's tablegen layout.
    n8pjl authored Sep 3, 2024
    d3c10b5
  107. [RISCV] Don't promote f16/bf16 SELECT with Zfhmin/Zfbfmin. (llvm#107138)

    Select only needs branches and moves so we don't need to promote it.
    Promoting would canonicalize NaNs which select shouldn't do.
    topperc authored Sep 3, 2024
    1c874bb
  108. [lld-macho] Always store symbol name length eagerly (NFC) (llvm#106906)

    The only instance where we weren't already passing a `StringRef` with a
    known length to `Symbol`'s constructor is where the argument is a string
    literal. Even in that case, lazy `strlen` calls don't make sense, as the
    compiler can constant-evaluate the `StringRef(const char*)` constructor.
    
    For symbols that go into the symbol table we need the length when
    calculating the hash anyway. We could get away with not calling
    `getName()` for local symbols, but the total contribution of `strlen` to
    the run time is already below 1%, so that would just complicate the code
    for a negligible benefit.
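
A minimal sketch of the constant-evaluation point, using `std::string_view` as a stand-in for `StringRef` and a hypothetical `Symbol` class (not lld's). With a string literal, the length is computed at compile time, so storing it eagerly costs nothing at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical symbol type that stores its name length at construction.
class Symbol {
public:
  constexpr explicit Symbol(std::string_view Name)
      : NameData(Name.data()), NameSize(Name.size()) {}
  constexpr std::string_view getName() const { return {NameData, NameSize}; }

private:
  const char *NameData;
  std::size_t NameSize;
};

// For a literal, the string_view(const char *) constructor is evaluated at
// compile time, so no strlen call remains at run time.
constexpr Symbol Entry{std::string_view("_main")};
static_assert(Entry.getName().size() == 5);

// Usage: the stored name round-trips unchanged.
void demoSymbol() { assert(Entry.getName() == "_main"); }
```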
    BertalanD authored Sep 3, 2024
    b24a304
  109. [ctx_prof] Add Inlining support (llvm#106154)

    Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.
    
    Post-inlining, the update mainly consists of:
    - making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
    - in the contextual profile:
       - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
       - the contexts of the callee (at the inlined callsite) are moved to the caller.
       - the callee context at the inlined callsite is deleted.
    mtrofin authored Sep 3, 2024
    3209766
  110. dce73e1
  111. [compiler-rt][rtsan] Add scoped reporting lock (llvm#107167)

    Uses a static lock to ensure multiple threads reporting issues at the
    same time don't have printing collisions. This isn't so important now,
    but will be with continue mode in the future.
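
The idea of a scoped lock over a static mutex can be sketched in standard C++ as follows. `ScopedReportLock` and `report` are hypothetical names, and `std::mutex` is used here for illustration; the rtsan runtime has its own primitives.

```cpp
#include <cassert>
#include <mutex>
#include <sstream>

// RAII guard over a single process-wide mutex (hypothetical name).
class ScopedReportLock {
public:
  ScopedReportLock() : Guard(getMutex()) {}

private:
  static std::mutex &getMutex() {
    static std::mutex M; // one lock shared by every reporting thread
    return M;
  }
  std::lock_guard<std::mutex> Guard;
};

// Writes a two-part report; holding the lock keeps the parts adjacent even
// if several threads report at the same time.
void report(std::ostringstream &Out, int ThreadId) {
  ScopedReportLock Lock;
  Out << "[begin " << ThreadId << "]";
  Out << "[end " << ThreadId << "]\n";
}

// Usage: the lock is released at scope exit, so back-to-back calls work.
void demoReport() {
  std::ostringstream Out;
  report(Out, 1);
  report(Out, 2);
  assert(Out.str() == "[begin 1][end 1]\n[begin 2][end 2]\n");
}
```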
    cjappl authored Sep 3, 2024
    18263c3
  112. [lldb] Remove limit on max memory read size (llvm#105765)

    `memory read` will return an error if you try to read more than 1k bytes
    in a single command, instructing you to set
    `target.max-memory-read-size` or use `--force` if you intended to read
    more than that. This is a safeguard for a command where people are being
    explicit about how much memory they would like lldb to read (either to
    display, or save to a file) and is an annoyance every time you need to
    read more than a small amount. If someone confuses the --count argument
    with the start address, lldb may begin dumping gigabytes of data but I'd
    rather that behavior than requiring everyone to special-case their way
    around a common use case.
    
    I don't want to remove the setting because many people have added (much
    larger) default max read sizes to their ~/.lldbinit files after hitting
    this behavior. Another option would be to stop reading/using the value
    in Target.cpp, but I see no harm in leaving the setting if someone
    really does prefer to have a small cap on their memory read size.
    jasonmolenda authored Sep 3, 2024
    b076f66

Commits on Sep 4, 2024

  1. Remove "Target" from createXReduction naming [nfc]

    Despite the stale comments, none of these actually use TTI, and they're
    solely generating standard LLVM IR.
    preames committed Sep 4, 2024
    3e8840b
  2. [clang] Add test for CWG2486 (noexcept and function pointer conversion) (llvm#107131)
    
    [CWG2486](https://cplusplus.github.io/CWG/issues/2486.html) "Call to
    `noexcept` function via `noexcept(false)` pointer/lvalue" allows
    `noexcept` functions to be called via `noexcept(false)` pointers or
    values. There appears to be no implementation divergence whatsoever:
    https://godbolt.org/z/3afTfeEM8. That said, in C++14 and earlier we do
    not issue all the diagnostics we issue in C++17 and newer, so I'm
    specifying the status of the issue accordingly.
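
The situation CWG2486 covers can be illustrated with a short sketch (the function names are hypothetical and this is not the test added by the patch): a `noexcept` function is called through a pointer whose type does not carry `noexcept`.

```cpp
#include <cassert>

int answer() noexcept { return 42; }

// A pointer to a potentially-throwing function type may point at a noexcept
// function; the call through it is well-formed.
int callThroughPlainPointer() {
  int (*Fn)() = answer;
  return Fn();
}

// The reverse direction would add a guarantee the callee never made and is
// ill-formed in C++17 and newer:
//   int mayThrow();
//   int (*Bad)() noexcept = mayThrow;

// Usage: the call behaves exactly like a direct call to answer().
void demoCall() { assert(callThroughPlainPointer() == 42); }
```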
    Endilll authored Sep 4, 2024
    eaa95a1
  3. 83ad644
  4. [SandboxIR] Implement ConstantAggregate (llvm#107136)

    This patch implements sandboxir:: ConstantAggregate, ConstantStruct,
    ConstantArray and ConstantVector, mirroring LLVM IR.
    vporpo authored Sep 4, 2024
    814aa43
  5. [gn build] Port 83ad644

    llvmgnsyncbot committed Sep 4, 2024
    48bc8b0
  6. [RISCV] Bitcast fixed length bf16/f16 build_vector to i16 with Zvfbfmin/Zvfhmin+Zfbfmin/Zfhmin. (llvm#106637)
    
    Previously, if Zfbfmin/Zfhmin were enabled, we only handled
    build_vectors that could be turned into splat_vectors. We promoted them
    to f32 splats by extending in the scalar domain and narrowing in the
    vector domain.
    
    This patch fixes a crash where we failed to account for whether the f32
    vector type fit in LMUL<=8.
    
    Because the new lowering occurs after type legalization, we have to be
    careful to use XLenVT for the scalar integer type and use custom cast
    nodes.
    topperc authored Sep 4, 2024
    ff0f201
  7. [WebAssembly] Remove Kind argument from WebAssemblyOperand (NFC) (llvm#107157)
    
    The `Kind` argument does not need to be passed separately.
    aheejin authored Sep 4, 2024
    f1615e3
  8. [mlir][tensor] Fix consumer fusion for tensor.pack without explicit `outer_dims_perm` attribute (llvm#106687)

    Yun-Fly authored Sep 4, 2024
    c8763f0
  9. [clang] Add tests for CWG issues about language linkage (llvm#107019)

    This patch covers Core issues about language linkage during declaration
    matching resolved in
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html),
    namely [CWG563](https://cplusplus.github.io/CWG/issues/563.html) and
    [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html).
    
    [CWG563](https://cplusplus.github.io/CWG/issues/563.html) "Linkage
    specification for objects"
    -----------
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    > [CWG563](https://cplusplus.github.io/CWG/issues/563.html) is resolved
    by simplifications that follow its suggestions.
    
    Wording ([[dcl.link]/5](https://eel.is/c++draft/dcl.link#5)):
    > In a
    [linkage-specification](https://eel.is/c++draft/dcl.link#nt:linkage-specification),
    the specified language linkage applies to the function types of all
    function declarators and to all functions and variables whose names have
    external linkage[.](https://eel.is/c++draft/dcl.link#5.sentence-5)
    
    Now the wording clearly says that linkage-specification applies to
    variables with external linkage.
    
    [CWG1818](https://cplusplus.github.io/CWG/issues/1818.html) "Visibility
    and inherited language linkage"
    ------------
    
    [P1787R6](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html):
    >
    [CWG386](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#386),
    [CWG1839](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1839),
    [CWG1818](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1818),
    [CWG2058](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2058),
    [CWG1900](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1900),
    and Richard’s observation in [“are non-type names ignored in a
    class-head-name or
    enum-head-name?”](http://lists.isocpp.org/core/2017/01/1604.php) are
    resolved by describing the limited lookup that occurs for a
    declarator-id, including the changes in Richard’s [proposed resolution
    for
    CWG1839](http://wiki.edg.com/pub/Wg21cologne2019/CoreWorkingGroup/cwg1839.html)
    (which also resolves CWG1818 and what of CWG2058 was not resolved along
    with CWG2059) and rejecting the example from
    [CWG1477](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1477).
    
    Wording ([[dcl.link]/6](https://eel.is/c++draft/dcl.link#6)):
    > A redeclaration of an entity without a linkage specification inherits
    the language linkage of the entity and (if applicable) its
    type[.](https://eel.is/c++draft/dcl.link#6.sentence-2)
    
    The answer to the question in the example is `extern "C"`, not a linkage
    mismatch. Further analysis of the example is provided as inline comments
    in the test itself. Note that https://eel.is/c++draft/dcl.link#7 does
    NOT apply in this example, as it is focused squarely on declarations that
    are already known to have C language linkage, and on declarations of
    variables in the global scope.
    Endilll authored Sep 4, 2024
    99f02a8
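A small sketch of the two quoted [dcl.link] rules, not the example from CWG1818 or from the test added by the commit. The names `counter` and `bump` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

extern "C" {
// [dcl.link]/5: the linkage-specification applies to variables whose
// names have external linkage, so `counter` has C language linkage.
int counter = 0;

// A function declaration with C language linkage.
int bump(int by);
}

// [dcl.link]/6: redeclarations without a linkage-specification inherit
// the language linkage of the entity, so neither of these is a mismatch.
extern int counter;

int bump(int by) {
  counter += by;
  return counter;
}
```

Starting from the initial state, `bump(2)` returns 2 and a following `bump(3)` returns 5. Both later declarations are accepted because they pick up C language linkage from the first ones.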
  10. [IR] Remove unused MINARITY operand trait tpl args, NFC (llvm#107165)

    These don't look like they've been used since the original 'use-diet'
    branch was merged in 2008 (f6caff6).
    rnk authored Sep 4, 2024
    b057e16
  11. [VPlan][NFC] Implement VPWidenMemoryRecipe::computeCost(). (llvm#105614)
    
    In this patch, we implement the `computeCost()` function in
    `VPWidenMemoryRecipe`.
    ElvisWang123 authored Sep 4, 2024
    ed220e1
  12. [AArch64][GlobalISel] Lower G_BUILD_VECTOR to G_INSERT_VECTOR_ELT (llvm#105686)
    
    The lowering happens in post-legalizer lowering if any source registers
    from G_BUILD_VECTOR are not constants.
    
    Add pattern fragment setting `scalar_to_vector ($src)` as equivalent to
    `vector_insert (undef), ($src), (i64 0)`
    chuongg3 authored Sep 4, 2024
    9b5971a
  13. 12c0823
  14. a27ff17
  15. [clang][Driver] Define soft float macros for PPC. (llvm#106012)

    Fixes llvm#105972.
    
    Co-authored-by: Qiu Chaofan <qcf@ecnelises.com>
    alexrp and ecnelises authored Sep 4, 2024
    b55186e
  16. [MLIR][Tensor] Fix source/dest type check in UnPackOp canonicalize (llvm#106094)
    
    Fix `RankedTensorType` equality check in unpack op canonicalization.
    yifeizh2 authored Sep 4, 2024
    8d08166
  17. [clang-format] Handle pointer/reference in macro definitions (llvm#107074)
    
    A macro definition needs its own scope stack in the annotator, so we add
    the MacroBodyScopes stack and use ScopeStack to refer to it when in the
    macro definition body.
    
    Also, we need to have a scope type for a child block because its parent
    line is parsed (and thus the scope type for the braces is popped off the
    scope stack) before the lines in the child block are.
    
    Fixes llvm#99271.
    owenca authored Sep 4, 2024
    812c96e
  18. [mlir][TensorToSPIRV] Add type check for tensor.extract in TensorToSPIRV (llvm#107110)
    
    This patch adds a type check for `tensor.extract` in TensorToSPIRV.
    Only `tensor.extract` ops with a supported element type are converted.
    Fixes llvm#74466.
    CoTinker authored Sep 4, 2024
    f4b9839

Commits on Sep 25, 2024

  1. 2fff529