[AutoBump] Merge with 51365212 (Aug 25) (10) #363

Open
wants to merge 427 commits into
base: bump_to_b96f18b2
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Aug 22, 2024

  1. [AMDGPU] GFX12 VMEM loads can write VGPR results out of order (llvm#1…

    …05549)
    
    Fix SIInsertWaitcnts to account for this by adding extra waits to avoid
    WAW dependencies.
    jayfoad authored Aug 22, 2024
    5506831
  2. [cmake] Include GNUInstallDirs before using variables defined by it. (l…

    …lvm#83807)
    
    This fixes an odd problem with the regex when `CMAKE_INSTALL_LIBDIR` is
    not defined:
    
    `string sub-command REGEX, mode REPLACE: regex "$" matched an empty
    string.`
    
    Fixes llvm#83802
    vgvassilev authored Aug 22, 2024
    5bbd598
  3. [DebugInfo][NFC] Constify debug DbgVariableRecord::{isDbgValue,isDbgD…

    …eclare} (llvm#105570)
    
    Constify debug DbgVariableRecord::{isDbgValue,isDbgDeclare}.
    enferex authored Aug 22, 2024
    743e70b
  4. Revert "[lldb][swig] Use the correct variable in the return statement"

    This reverts commit 6528157.
    
    I'm reverting llvm#104523
    (llvm@f01f80c)
    and this fixup belongs to the same series of changes.
    gribozavr committed Aug 22, 2024
    7323e7e
  5. Revert "[lldb-dap] Mark hidden frames as "subtle" (llvm#105457)"

    This reverts commit 6f45602, which
    depends on llvm#104523, which I'm
    reverting.
    gribozavr committed Aug 22, 2024
    aa70f83
  6. Revert "[lldb] Extend frame recognizers to hide frames from backtraces (

    llvm#104523)"
    
    This reverts commit f01f80c.
    
    This commit introduces an msan violation. See the discussion on llvm#104523.
    gribozavr committed Aug 22, 2024
    547917a
  7. [clang][bytecode] Fix void unary * operators (llvm#105640)

    Discard the subexpr.
    tbaederr authored Aug 22, 2024
    125aa10
  8. 6932f47
  9. [NFC][SetTheory] Refactor to use const pointers and range loops (llvm…

    …#105544)
    
    - Refactor SetTheory code to use const pointers when possible.
    - Use auto for variables initialized using dyn_cast<>.
    - Use range based for loops and early continue.
    jurahul authored Aug 22, 2024
    d7da79f
  10. [libc++] Fix the documentation build

    There was a duplicate link target.
    ldionne committed Aug 22, 2024
    c73b14c
  11. 6d30b67
  12. [mlir][OpenMP] Add optional alloc region to reduction decl (llvm#102522)

    This region is intended to separate alloca operations from reduction
    variable initialization. This makes it easier to hoist allocas to the
    entry block before control flow and complex code for initialization.
    
    The verifier checks that there is at most one block in the alloc region.
    This is not sufficient to avoid control flow in general MLIR, but by the
    time we are converting to LLVMIR, structured control flow should already
    have been lowered to the cf dialect.
    
    1/3
    Part 2: llvm#102524
    Part 3: llvm#102525
    tblah authored Aug 22, 2024
    a964635
  13. [mlir][OpenMP] Convert reduction alloc region to LLVMIR (llvm#102524)

    The intention of this change is to ensure that allocas end up in the
    entry block not spread out amongst complex reduction variable
    initialization code.
    
    The tests we have are quite minimized for readability and
    maintainability, making the benefits less obvious. The use case for this
    is when there are multiple reduction variables, each with multiple blocks
    inside the init region for that reduction.
    
    2/3
    Part 1: llvm#102522
    Part 3: llvm#102525
    tblah authored Aug 22, 2024
    2efc81a
  14. [flang][OpenMP] use reduction alloc region (llvm#102525)

    I removed the `*-hlfir*` tests because they are duplicates now that the
    other tests have been updated to use the HLFIR lowering.
    
    3/3
    Part 1: llvm#102522
    Part 2: llvm#102524
    tblah authored Aug 22, 2024
    f2027a9
  15. d163935
  16. [Clang][Sema] Rebuild template parameters for out-of-line template de…

    …finitions and partial specializations (llvm#104030)
    
    We need to rebuild the template parameters of out-of-line
    definitions/specializations of member templates in the context of the
    current instantiation for the purposes of declaration matching. We
    already do this for function templates and class templates, but not
    variable templates, partial specializations of variable templates, and
    partial specializations of class templates. This patch fixes the latter
    cases.
    sdkrystian authored Aug 22, 2024
    c82f797
  17. [clang][bytecode] Allow adding offsets to function pointers (llvm#105641

    )
    
    Convert them to Pointers, do the offset calculation and then convert
    them back to function pointers.
    tbaederr authored Aug 22, 2024
    db94852
  18. [InstCombine] Add more tests for foldLogOpOfMaskedICmps transform (NFC)

    Tests for cases that would have been regressed by
    llvm#104941.
    nikic committed Aug 22, 2024
    7e3f9dd
  19. [mlir][OpenMP][NFC] clean up optional reduction region parsing (llvm#…

    …105644)
    
    This can be handled in ODS instead of writing custom parsing/printing
    code.
    
    Thanks for the idea @skatrak
    tblah authored Aug 22, 2024
    dd3b43a
  20. [mlir][LLVM] Add support for constant struct with multiple fields (ll…

    …vm#102752)
    
    Currently, `mlir.llvm.constant` of structure types requires that the
    structure type effectively represent a complex type: it must have
    exactly two fields of the same type, and the field type must be either an
    integer type or a float type.
    
    This PR relaxes this restriction and allows the structure type to
    have an arbitrary number of fields.
    Lancern authored Aug 22, 2024
    318b067
  21. [Analysis] Teach ScalarEvolution::getRangeRef about more dereferencea…

    …ble objects (llvm#104778)
    
    Whilst dealing with review comments on llvm#96752, I discovered that
    SCEV does not know about the dereferenceable attribute on function
    arguments, so I have updated getRangeRef to make use of it by calling
    getPointerDereferenceableBytes.
    david-arm authored Aug 22, 2024
    d46812a
  22. [PowerPC] Fix mask for __st[d/w/h/b]cx builtins (llvm#104453)

    These builtins are currently returning CR0 which will have the format
    [0, 0, flag_true_if_saved, XER].
    We only want to return flag_true_if_saved. This patch adds a shift to
    remove the XER bit before returning.
    syzaara authored Aug 22, 2024
    327edbe
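    The shift described in this commit can be illustrated with a toy bit
    model. This is only a sketch, not the code from the patch: it assumes the
    four CR0 bits are listed most significant first, so that the XER bit is
    the lowest bit, and `makeCR0`, `storeSucceeded` and `selfCheckCR0` are
    hypothetical names.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Toy model of the 4-bit CR0 field, assumed to be listed MSB first:
    // [0, 0, flag_true_if_saved, XER]. Bit 0 is XER, bit 1 is the flag.
    constexpr std::uint32_t makeCR0(bool saved, bool xer) {
      return (static_cast<std::uint32_t>(saved) << 1) |
             static_cast<std::uint32_t>(xer);
    }

    // Shift the XER bit out so that only flag_true_if_saved remains.
    constexpr std::uint32_t storeSucceeded(std::uint32_t cr0) { return cr0 >> 1; }

    // Without the shift, a set XER bit leaks into the returned value.
    inline void selfCheckCR0() {
      assert(makeCR0(true, true) == 3u);                  // raw CR0: flag and XER
      assert(storeSucceeded(makeCR0(true, true)) == 1u);  // flag only
      assert(storeSucceeded(makeCR0(false, true)) == 0u); // XER alone is dropped
    }
    ```

    In this model the raw field can take the values 0 to 3, while the shifted
    result is always 0 or 1.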
  23. 11e1378
  24. c8f40e7
  25. [InstCombine] Handle logical op for and/or of icmp 0/-1

    This aligns the transform with what foldLogOpOfMaskedICmp() does.
    nikic committed Aug 22, 2024
    32679e1
  26. [libc++][docs] Major update to the documentation

    - Landing page: add link to the libc++ Discord channel
    - Landing page: reorder "Getting Involved" above "Design documents"
    - Landing page: remove "Notes and Known Issues" which was completely outdated
    - Rename "Using Libc++" to "User Documentation" and update contents
    - Rename "Building Libc++" to "Vendor Documentation" and update contents
    
    The "BuildingLibcxx" and "UsingLibcxx" pages have basically been used for
    vendor and user documentation respectively. However, they were named in
    a way that doesn't really make that clear. Renaming the pages now gives
    us a location to clearly document what we target at vendors and what we
    target at users, and to do that separately.
    ldionne committed Aug 22, 2024
    41dcdfb
  27. [DAG][RISCV] Use vp_reduce_* when widening illegal types for reductio…

    …ns (llvm#105455)
    
    This allows the use of a single wider operation with a restricted EVL
    instead of padding the vector with the neutral element.
    
    For RISCV specifically, it's worth noting that an alternate padded
    lowering is available when VL is one less than a power of two, and LMUL
    <= m1. We could slide the vector operand up by one, and insert the
    padding via a vslide1up. We don't currently pattern match this, but we
    could. This form would arguably be better iff the surrounding code
    wanted VL=4. This patch will force a VL toggle in that case instead.
    
    Basically, it comes down to a question of whether we think odd sized
    vectors are going to appear clustered with odd size vector operations,
    or mixed in with larger power of two operations.
    
    Note there is a potential downside of using vp nodes; we lose any
    generic DAG combines which might have applied to the widened form.
    preames authored Aug 22, 2024
    00baa1a
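    The two widening strategies contrasted in this commit can be modelled in
    scalar code. This is a rough sketch for an add reduction over three
    lanes widened to four, not the DAG legalization code; `reducePadded`,
    `reduceWithEVL` and `selfCheckReduce` are hypothetical names.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>

    // Padded form: widen 3 lanes to 4 and fill the extra lane with the
    // neutral element of the reduction (0 for add).
    inline int reducePadded(const std::array<int, 3> &v) {
      std::array<int, 4> wide = {v[0], v[1], v[2], 0};
      int sum = 0;
      for (int x : wide)
        sum += x;
      return sum;
    }

    // EVL form: use the wide vector as-is, but let only the first `evl`
    // lanes participate. The contents of the remaining lanes do not matter.
    inline int reduceWithEVL(const std::array<int, 4> &wide, std::size_t evl) {
      int sum = 0;
      for (std::size_t i = 0; i < evl && i < wide.size(); ++i)
        sum += wide[i];
      return sum;
    }

    inline void selfCheckReduce() {
      // The fourth lane holds garbage (99) and is ignored because evl == 3.
      assert(reducePadded(std::array<int, 3>{1, 2, 3}) ==
             reduceWithEVL(std::array<int, 4>{1, 2, 3, 99}, 3));
    }
    ```

    The EVL form needs no neutral element at all, which is the property the
    commit message relies on.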
  28. [RISCV] Introduce local peephole to reduce VLs based on demanded VL (l…

    …lvm#104689)
    
    This is a fairly narrow transform (at the moment) to reduce the VLs of
    instructions feeding a store with a smaller VL. Note that the goal of
    this transform isn't really to reduce VL - it's to reduce VL *toggles*.
    To our knowledge, small reductions in VL without also changing LMUL are
    generally not profitable on existing hardware.
    
    For a single use instruction without side effects, fp exceptions, or a
    result dependency on VL, reducing VL is legal if only a subset of
    elements are legal. We'd already implemented this logic for vmv.v.v, and
    this patch simply applies it to stores as an alternate root.
    
    Longer term, I plan to extend this to other root instructions (e.g.
    different kinds of stores, reduces, etc.), and add a more general
    recursive walkback through operands.
    
    One risk with the dataflow based approach is that we could be reducing
    VL of an instruction scheduled in a region with the wider VL (i.e. mixed
    mode computations) forcing an additional VL toggle. An example of this
    is the @insert_subvector_dag_loop test case, but it doesn't appear to
    happen widely. I think this is a risk we should accept.
    preames authored Aug 22, 2024
    26a8a85
  29. [AArch64] optimise SVE cmp intrinsics with no active lanes (llvm#104779)

    This patch extends llvm#73964 and
    optimises SVE cmp intrinsics to a zero vector when the predicate is zero.
    Lukacma authored Aug 22, 2024
    29cb1e6
  30. [libc++] Post-LLVM19-release docs cleanup (llvm#99667)

    This patch removes obsolete status pages for projects that were
    completed: LLVM 18 release, C++20 Ranges and Spaceship support.
    
    Co-authored-by: Hristo Hristov <zingam@outlook.com>
    H-G-Hristov and Zingam authored Aug 22, 2024
    58ac764
  31. [SimplifyCFG] Fold switch over ucmp/scmp to icmp and br (llvm#105636)

    If we switch over ucmp/scmp and have two switch cases going to the same
    destination, we can convert into icmp+br.
    
    Fixes llvm#105632.
    nikic authored Aug 22, 2024
    4d85285
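    A source-level analogy may help show why this fold is valid. The sketch
    below is not the IR transform itself: `threeWay` stands in for scmp, and
    all function names are hypothetical. When two of the three switch cases
    share a destination, the switch collapses to one comparison and a
    two-way branch.

    ```cpp
    #include <cassert>

    // Three-way signed comparison returning -1, 0 or 1, standing in for scmp.
    constexpr int threeWay(int a, int b) { return (a > b) - (a < b); }

    // Switch over the three-way result; cases -1 and 0 share a destination.
    constexpr int viaSwitch(int a, int b) {
      switch (threeWay(a, b)) {
      case -1:
      case 0:
        return 10; // shared destination
      default:
        return 20; // remaining case (1)
      }
    }

    // The same control flow as one comparison plus a two-way branch.
    constexpr int viaCompare(int a, int b) { return a <= b ? 10 : 20; }

    inline void selfCheckSwitchFold() {
      assert(viaSwitch(1, 2) == viaCompare(1, 2)); // less than
      assert(viaSwitch(2, 2) == viaCompare(2, 2)); // equal
      assert(viaSwitch(3, 2) == viaCompare(3, 2)); // greater than
    }
    ```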
  32. [SLP]Do not count extractelement costs in unreachable/landing pad blo…

    …cks.
    
    If the external user of the scalar to be extracted is in an
    unreachable/landing pad block, we can skip counting its cost.
    
    Reviewers: RKSimon
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#105667
    alexey-bataev authored Aug 22, 2024
    9402bb0
  33. [NFC] Replace bool <= bool comparison (llvm#102948)

    Static analyser tool cppcheck flags ordered comparison with `bool`s.
    Replace with equivalent logical operators to prevent this.
    
    Closes llvm#102912
    MitalAshok authored Aug 22, 2024
    ec5e585
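    For two `bool` operands, `a <= b` is the implication `!a || b`. The
    sketch below checks that equivalence exhaustively. It is not code from
    the patch, and the function names are hypothetical.

    ```cpp
    #include <cassert>
    #include <initializer_list>

    // Ordered comparison on bools: relies on false < true after promotion.
    // This is the kind of expression the static analyser flags.
    constexpr bool impliesOrdered(bool a, bool b) { return a <= b; }

    // Equivalent logical form: "a implies b".
    constexpr bool impliesLogical(bool a, bool b) { return !a || b; }

    // Compare both forms over all four possible inputs.
    inline void checkFormsAgree() {
      for (bool a : {false, true})
        for (bool b : {false, true})
          assert(impliesOrdered(a, b) == impliesLogical(a, b));
    }
    ```

    The only input for which either form yields `false` is `a == true`,
    `b == false`.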
  34. [AMDGPU] Generate checks for vector indexing. NFC. (llvm#105668)

    This allows combining some test files that were only split because
    adding new RUN lines introduced too much churn in the checks.
    jayfoad authored Aug 22, 2024
    c4c5fdd
  35. [RISCV][GISel] Implement canLowerReturn. (llvm#105465)

    This allows us to handle return values that are too large to fit in x10
    and x11. They will be converted to a sret by passing a pointer to where
    to store the return value.
    topperc authored Aug 22, 2024
    8ba2ae3
  36. [DwarfEhPrepare] Assign dummy debug location for more inserted _Unwin…

    …d_Resume calls (llvm#105513)
    
    Similar to the fix for llvm#57469, ensure that the other `_Unwind_Resume`
    call emitted by DwarfEHPrepare has a debug location if needed.
    
    This fixes nbdd0121/unwinding#34.
    sunfishcode authored Aug 22, 2024
    e76db25
  37. [SLP]Improve/fix subvectors in gather/buildvector nodes handling

    SLP vectorizer has an estimation for gather/buildvector nodes, which
    contain some scalar loads. SLP vectorizer performs a pretty similar (but
    large in SLOCs) estimation, which is not always correct. Instead, this
    patch implements clustering analysis and actual node allocation with the
    full analysis for the vectorized clustered scalars (not only loads, but
    also some other instructions) with the correct cost estimation and
    vector insert instructions. Improves overall vectorization quality and
    simplifies analysis/estimations.
    
    Reviewers: RKSimon
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#104144
    alexey-bataev authored Aug 22, 2024
    69332bb
  38. 3c54aa1
  39. [lldb] Pick the correct architecture when target and core file disagr…

    …ee (llvm#105576)
    
    In f9f3316, Adrian fixed an issue where LLDB wouldn't update the
    target's architecture when the process reported a different triple that
    only differed in its sub-architecture.
    
    This unintentionally regressed core file debugging when the core file
    reports the base architecture (e.g. armv7) while the main binary knows
    the correct CPU subtype (e.g. armv7em). After the aforementioned change,
    we update the target architecture from armv7em to armv7. Fix the issue
    by trusting the target architecture over the ProcessMachCore process.
    
    rdar://133834304
    JDevlieghere authored Aug 22, 2024
    9f41805
  40. [ARM] Fix missing ELF FPU attributes for fp-armv8-fullfp16-d16 (llvm#…

    …105677)
    
    An assembly input with
    
    >   .fpu fp-armv8-fullfp16-d16
    
    crashes the compiler because the ELF FPU attribute emitter misses the
    respective entry. This patch fixes this.
    
    Interestingly, compiling with -mfpu=fp-armv8-fullfp16-d16 does not cause
    the crash because FPv5_D16 is an alias in the compiler and
    
    >   .fpu fpv5-d16
    
    is emitted instead, which does not crash.
    
    The existing .fpu directive test with multiple FPUs serves the purpose
    of verifying that each possible FPU option is defined, but does not
    trigger the crash because only the last .fpu directive goes effectively
    down the code path. Therefore one test for each FPU is required.
    
    Fixes llvm#105674.
    rgwott authored Aug 22, 2024
    fe5d1f9
  41. b21756f
  42. [AArch64] Lower aarch64_neon_saddlv via SADDLV nodes. (llvm#103307)

    This mirrors what GISel already does, extending the existing lowering of
    aarch64_neon_saddlv/aarch64_neon_uaddlv to SADDLV/UADDLV. This allows us
    to remove some tablegen patterns, and provides a little nicer codegen in
    places as the nodes represent the result being in a vector register
    correctly.
    davemgreen authored Aug 22, 2024
    8ab6140
  43. 24740ec
  44. Reland "[asan] Remove debug tracing from report_globals (llvm#104404)…

    …" (llvm#105601)
    
    This reverts commit 2704b80
    and relands llvm#104404.
    
    Darwin should not fail after llvm#105599.
    vitalybuka authored Aug 22, 2024
    8c6f8c2
  45. [Vectorize] Fix warnings

    This patch fixes warnings of the form:
    
      llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9300:23: error: loop
      variable '[E, Idx]' creates a copy from type 'const value_type' (aka
      'const std::pair<const llvm::slpvectorizer::BoUpSLP::TreeEntry *,
      unsigned int>') [-Werror,-Wrange-loop-construct]
    kazutakahirata committed Aug 22, 2024
    a625435
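    The warning quoted in this commit fires when a range-based for loop
    binds its loop variable by value and thereby copies each element. One
    common way to silence it is to bind by const reference, as in the sketch
    below. The commit message does not say which fix the patch applied, and
    `ToyEntry`, `ToyPair` and `sumIndices` are hypothetical stand-ins for the
    types named in the diagnostic.

    ```cpp
    #include <cassert>
    #include <utility>
    #include <vector>

    // Hypothetical stand-ins for the pair type named in the diagnostic.
    struct ToyEntry {
      int Weight;
    };
    using ToyPair = std::pair<const ToyEntry *, unsigned>;

    inline unsigned sumIndices(const std::vector<ToyPair> &Entries) {
      unsigned Total = 0;
      // Writing `const auto [E, Idx]` here would copy each pair; binding
      // the structured binding by reference avoids that copy.
      for (const auto &[E, Idx] : Entries) {
        (void)E; // the entry pointer is not needed in this toy example
        Total += Idx;
      }
      return Total;
    }

    inline void selfCheckSumIndices() {
      ToyEntry A{1};
      std::vector<ToyPair> V{{&A, 2u}, {&A, 3u}};
      assert(sumIndices(V) == 5u);
    }
    ```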
  46. [AArch64] Fix a warning

    This patch fixes:
    
      llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:6102:9: error:
      unused variable 'OpVT' [-Werror,-Wunused-variable]
    kazutakahirata committed Aug 22, 2024
    0bd90ec
  47. [AArch64,ELF] Allow implicit $d/$x at section beginning

    The start state of a new section is `EMS_None`, often leading to a
    $d/$x at offset 0. Introduce a MCTargetOption/cl::opt
    "implicit-mapsyms" to allow an alternative behavior
    (ARM-software/abi-aa#274):
    
    * Set the start state to `EMS_Data` or `EMS_A64`.
    * For text sections, add an ending $x only if the final data is not instructions.
    * For non-text sections, add an ending $d only if the final data is not data commands.
    
    ```
    .section .text.1,"ax"
    nop
    // emit $d
    .long 42
    // emit $x
    
    .section .text.2,"ax"
    nop
    ```
    
    This new behavior decreases the .symtab size significantly:
    
    ```
    % bloaty a64-2/bin/clang -- a64-0/bin/clang
        FILE SIZE        VM SIZE
     --------------  --------------
      -5.4% -1.13Mi  [ = ]       0    .strtab
     -50.9% -4.09Mi  [ = ]       0    .symtab
      -4.0% -5.22Mi  [ = ]       0    TOTAL
    ```
    
    ---
    
    This scheme works as long as the user can rule out some error scenarios:
    
    * .text.1 assembled using the traditional behavior is combined with .text.2 using the new behavior
    * A linker script combining non-text sections and text sections. The
      lack of mapping symbols in the non-text sections could make them
      treated as code, unless the linker inserts extra mapping symbols.
    
    The above mix-and-match scenarios aren't an issue at all for a
    significant portion of users.
    
    A text section may start with data commands in rare cases (e.g.
    -fsanitize=function) that many users don't care about. When combining
    `(.text.0; .word 0)` and `(.text.1; .word 0)`, the ending $x of .text.0
    and the initial $d of .text.1 may have the same address. If both
    sections reside in the same file, ensure the ending symbol comes before
    the initial $d of .text.1, so that a dumb linker respecting the symbol
    order will place the ending $x before the initial $d.
    
    Disassemblers using stable sort will see both symbols at the same
    address, and the second will win.
    
    When section ordering mechanisms (e.g. --symbol-ordering-file,
    --call-graph-profile-sort, `.text : { second.o(.text) first.o(.text) }`)
    are involved, the initial data in a text section following a text
    section with trailing data could be misidentified as code, but the issue
    is local and the risk could be acceptable.
    
    Pull Request: llvm#99718
    MaskRay authored Aug 22, 2024
    46707b0
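    The mapping symbol rules for text sections in this commit can be
    approximated with a small state machine. This is a toy model under
    stated assumptions, not the MC implementation: it handles text sections
    only, `MapState`, `Item` and `mappingSymbols` are hypothetical names, and
    the traditional scheme is modelled simply as starting in the `None`
    state with no ending symbol.

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    enum class MapState { None, Data, A64 };
    enum class Item { Insn, Data };

    // Returns the mapping symbols emitted for one text section, in order.
    inline std::vector<std::string> mappingSymbols(const std::vector<Item> &Items,
                                                   bool Implicit) {
      std::vector<std::string> Syms;
      // Implicit scheme: a text section starts in the A64 state.
      MapState State = Implicit ? MapState::A64 : MapState::None;
      for (Item It : Items) {
        MapState Want = It == Item::Insn ? MapState::A64 : MapState::Data;
        if (Want != State) {
          Syms.push_back(Want == MapState::A64 ? "$x" : "$d");
          State = Want;
        }
      }
      // Implicit scheme: add an ending $x only if the section does not end
      // in instructions.
      if (Implicit && State != MapState::A64)
        Syms.push_back("$x");
      return Syms;
    }

    inline void selfCheckMappingSymbols() {
      // .text.1 from the example: nop, then .long 42.
      assert(mappingSymbols({Item::Insn, Item::Data}, true).size() == 2);
      // .text.2 from the example: a single nop needs no symbol at all.
      assert(mappingSymbols({Item::Insn}, true).empty());
      // In the traditional model the same section gets a leading $x.
      assert(mappingSymbols({Item::Insn}, false).size() == 1);
    }
    ```

    In this model the saving comes from sections such as `.text.2` that
    contain only instructions, which no longer need any mapping symbol.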
  48. [AMDGPU][GlobalISel] Disable fixed-point iteration in all Combiners (l…

    …lvm#105517)
    
    Disable fixed-point iteration in all AMDGPU Combiners after llvm#102163.
    
    This saves around 2% compile time in ad hoc testing on some large
    graphics shaders. I did not notice any regressions in the generated
    code, just a bunch of harmless differences in instruction selection and
    register allocation.
    jayfoad authored Aug 22, 2024
    2012b25
  49. 0926255
  50. 83fc989
  51. [Driver] Add -Wa, options -mmapsyms={default,implicit}

    -Wa,-mmapsyms=implicit enables the alternative mapping symbol scheme
    discussed at llvm#99718.
    
    While not conforming to the current aaelf64 ABI, the option is
    invaluable for those with full control over their toolchain, no reliance
    on weird relocatable files, and a strong focus on minimizing both
    relocatable and executable sizes.
    
    The option is discouraged when portability of the relocatable objects is
    a concern.
    https://maskray.me/blog/2024-07-21-mapping-symbols-rethinking-for-efficiency
    elaborates on the risk.
    
    Pull Request: llvm#104542
    MaskRay authored Aug 22, 2024
    eb549da
  52. 6ec4c9c
  53. 933f722
  54. [C23] Remove WG14 N2517 from the status page

    This paper proposes no normative changes, just updates an example in
    the standard. It was incorrect for us to have marked it as No in the
    first place.
    AaronBallman committed Aug 22, 2024
    27727d8
  55. [WebAssembly] Change half-precision feature name to fp16. (llvm#105434)

    This better aligns with how the feature is being referred to and what
    runtimes (V8) are calling it.
    brendandahl authored Aug 22, 2024
    7d373ce
  56. bc860b4
  57. [clang][bytecode] Fix 'if consteval' in non-constant contexts (llvm#1…

    …04707)
    
    The previous code made this a compile-time decision but it's not.
    tbaederr authored Aug 22, 2024
    b9c4c4c
  58. [libc++] Adjust armv7 XFAIL target triple for the setfill_wchar_max t…

    …est. (llvm#105586)
    
    Also allow XFAIL for armv7-*-linux-gnueabihf targets, not only for
    armv7l-*.
    vvereschaka authored Aug 22, 2024
    4a2a1b5
  59. [lldb] Change the two remaining SInt64 settings in Target to uint (ll…

    …vm#105460)
    
    TargetProperties.td had a few settings listed as signed integral values,
    but the Target.cpp methods reading those values were reading them as
    unsigned. Examples are target.max-memory-read-size, some accesses of
    target.max-children-count (still today), and previously
    target.max-string-summary-length.
    
    After Jonas' change to use templates to read these values in
    https://reviews.llvm.org/D149774, when the code tried to fetch these
    values, we'd eventually end up calling OptionValue::GetAsUInt64 which
    checks that the value is actually a UInt64 before returning it; finding
    that it was an SInt64, it would drop the user setting and return the
    default value. This manifested as a bug that target.max-memory-read-size
    is never used for memory read.
    
    target.max-children-count is less straightforward, where one read of
    that setting was fetching it as an int64_t, the other as a uint64_t.
    
    I suspect all of these settings were originally marked as SInt64 so a
    user could do -1 for "infinite", getting it static_cast to a UINT64_MAX
    value along the way. I can't find any documentation for this behavior,
    but it seems like something Greg would have done. We've partially lost
    that behavior already via
    llvm#72233 for
    target.max-string-summary-length, and this further removes it.
    
    We're still fetching UInt64's and returning them as uint32_t's but I'm
    not overly pressed about someone setting a count/size limit over 4GB.
    
    I added a simple API test for the memory read setting limit.
    jasonmolenda authored Aug 22, 2024
    c1e401f
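    The failure mode described in this commit, a type-checked getter that
    silently falls back to the default, can be sketched with a toy option
    value. This is an illustration only: `ToyOptionValue`, `getAsUInt64` and
    `readSetting` are hypothetical and are not LLDB's `OptionValue` API.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <limits>
    #include <optional>
    #include <variant>

    // Hypothetical stand-in for a typed option value: signed or unsigned.
    using ToyOptionValue = std::variant<std::int64_t, std::uint64_t>;

    // Type-checked getter: yields a value only if the stored type is unsigned.
    inline std::optional<std::uint64_t> getAsUInt64(const ToyOptionValue &V) {
      if (const auto *U = std::get_if<std::uint64_t>(&V))
        return *U;
      return std::nullopt;
    }

    // Reader that falls back to the default when the type check fails.
    inline std::uint64_t readSetting(const ToyOptionValue &V,
                                     std::uint64_t Default) {
      return getAsUInt64(V).value_or(Default);
    }

    // The "-1 means infinite" idea: a signed -1 converts to UINT64_MAX.
    static_assert(static_cast<std::uint64_t>(std::int64_t{-1}) ==
                  std::numeric_limits<std::uint64_t>::max());

    inline void selfCheckReadSetting() {
      // Stored as signed: the user's 4096 is dropped in favour of the default.
      assert(readSetting(ToyOptionValue{std::int64_t{4096}}, 1024u) == 1024u);
      // Stored as unsigned: the user's value is honoured.
      assert(readSetting(ToyOptionValue{std::uint64_t{4096}}, 1024u) == 4096u);
    }
    ```

    In the toy model, declaring the setting with the unsigned type is what
    makes the user's value reach the reader.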
  60. Recommit "[FunctionAttrs] deduce attr cold on functions if all CG p…

    …aths call a `cold` function"
    
    Fixed up the uar test that was failing. It seems with the new `cold`
    attribute the order of the functions is different. As far as I can
    tell this is not a concern.
    
    Closes llvm#105559
    goldsteinn committed Aug 22, 2024
    6b11573
  61. [IR] Simplify comparisons with std::optional (NFC) (llvm#105624)

    For variable X of type std::optional, X && X.value_or(Y) == Z is
    equivalent to X == Z when Y != Z.
    kazutakahirata authored Aug 22, 2024
    b2cd81c
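    The equivalence stated in this commit can be checked directly with
    `std::optional<int>`. The sketch below is not code from the patch, and
    `longForm` and `shortForm` are hypothetical names. Comparing an empty
    optional against a plain value with `==` yields `false`, which is what
    makes the short form safe.

    ```cpp
    #include <cassert>
    #include <optional>

    // Long form: check engagement, then compare value_or(Y) against Z.
    inline bool longForm(const std::optional<int> &X, int Y, int Z) {
      return X && X.value_or(Y) == Z;
    }

    // Short form: optional == value is false for an empty optional.
    inline bool shortForm(const std::optional<int> &X, int Z) { return X == Z; }

    inline void selfCheckOptionalCompare() {
      const int Y = 0, Z = 5; // Y != Z, as the commit message requires
      assert(longForm(std::optional<int>{5}, Y, Z) ==
             shortForm(std::optional<int>{5}, Z)); // engaged and equal
      assert(longForm(std::optional<int>{4}, Y, Z) ==
             shortForm(std::optional<int>{4}, Z)); // engaged, not equal
      assert(longForm(std::nullopt, Y, Z) ==
             shortForm(std::nullopt, Z)); // empty
    }
    ```

    The `Y != Z` condition matters for a bare `X.value_or(Y) == Z` without
    the leading `X &&`: with `Y == Z`, an empty optional would compare equal
    there, while `X == Z` would not.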
  62. [MCA][X86] Add scatter instruction test coverage for llvm#105675

    Missed IceLakeServer when I updated the other CPUs in 6ec4c9c
    RKSimon committed Aug 22, 2024
    7faf2c9
  63. [MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data

    This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should
    
    Fixes llvm#105675
    RKSimon committed Aug 22, 2024
    2c1f064
  64. c5a0c37
  65. [VPlan] Move EVL memory recipes to VPlanRecipes.cpp (NFC)

    Move VPWiden[Load|Store]EVLRecipe::execute to VPlanRecipes.cpp in line
    with other ::execute implementations that don't depend on anything
    defined in LoopVectorization.cpp
    fhahn committed Aug 22, 2024
    1fa6c99
  66. [libc++] Fix transform_error.mandates.verify.cpp test on msvc (llvm#1…

    …04635)
    
    PR llvm#102851 changes clang to diagnose reference types in unions as an
    error on msvc, so 'transform_error.mandates.verify.cpp' no longer fails
    on msvc with ToT clang. However, none of the libcxx buildbots build
    clang from source, so this test still fails on these bots,
    which is incorrect. This patch changes the expected error message of
    the test so that it passes with both release-branch clang and ToT clang.
    zeroomega authored Aug 22, 2024
    e31322b
  67. [libc] Add ctype.h locale variants (llvm#102711)

    Summary:
    This patch adds all the libc ctype variants. These ignore the locale
    information completely, so they're pretty much just stubs. Because these
    use locale information, which is system scope, we do not enable building
    them outside of full build mode.
    jhuber6 authored Aug 22, 2024
    8f005f8
  68. Revert " [libc] Add ctype.h locale variants (llvm#102711)"

    This reverts commit 8f005f8.
    jhuber6 committed Aug 22, 2024
    2f4232d
  69. [libc] Initial support for 'locale.h' in the LLVM libc (llvm#102689)

    Summary:
    This patch adds the macros and entrypoints associated with
    `locale.h`.  These are mostly stubs, as we (for now and the
    foreseeable future) only expect to support the C and maybe C.UTF-8
    locales in the LLVM libc.
    jhuber6 authored Aug 22, 2024
    78d8ab2
  70. [NFC] [Docs] add missing space

    fmayer committed Aug 22, 2024
    f3a47b9
  71. 518b1f0
  72. [HLSL][SPIRV]Add SPIRV generation for HLSL dot (llvm#104656)

    This adds the SPIRV fdot, sdot, and udot intrinsics and allows them to
    be created at codegen depending on the target architecture. This
    required moving some of the DXIL-specific choices out of codegen and
    into DXIL instruction expansion, and providing it with a more generic
    fdot intrinsic as well.
    
    Removed some stale comments that gave the obsolete impression that type
    conversions should be expected to match overloads.
    
    The SPIRV intrinsic handling involves generating multiply and add
    operations for integers and the existing OpDot operation for floating
    point.
    
    New tests for generating SPIRV float and integer dot intrinsics are
    added as well as expanding HLSL tests to include SPIRV generation
    
    Used new dot product intrinsic generation to implement normalize() in SPIRV
    
    Incidentally changed existing dot intrinsic definitions to use
    DefaultAttrsIntrinsic to match the newly added intrinsics
    
    Fixes llvm#88056
    pow2clk authored Aug 22, 2024
    319c7a4
  73. Fix dap stacktrace perf issue (llvm#104874)

    We have received several customer reports of slow stepping in VSCode
    over the past year.
    Profiling shows the slow stepping is caused by `stackTrace` request
    which can take around 1 second for certain targets. Since VSCode sends
    `stackTrace` during each stop event, the slow `stackTrace` request would
    slow down stepping in VSCode. Below is the hot path:
    
    ```
                   |--68.75%--lldb_dap::DAP::HandleObject(llvm::json::Object const&)
                   |          |
                   |          |--57.70%--(anonymous namespace)::request_stackTrace(llvm::json::Object const&)
                   |          |          |
                   |          |          |--54.43%--lldb::SBThread::GetCurrentExceptionBacktrace()
                   |          |          |          lldb_private::Thread::GetCurrentExceptionBacktrace()
                   |          |          |          lldb_private::Thread::GetCurrentException()
                   |          |          |          lldb_private::ItaniumABILanguageRuntime::GetExceptionObjectForThread(std::shared_ptr<lldb_private::Thread>)
                   |          |          |          |
                   |          |          |          |--53.43%--lldb_private::FunctionCaller::ExecuteFunction(lldb_private::ExecutionContext&, unsigned long*, lldb_private::EvaluateExpressionOptions const&, lldb_private::DiagnosticManager&, lldb_private::Value&)
                   |          |          |          |          |
                   |          |          |          |          |--25.23%--lldb_private::FunctionCaller::InsertFunction(lldb_private::ExecutionContext&, unsigned long&, lldb_private::DiagnosticManager&)
                   |          |          |          |          |          |
                   |          |          |          |          |          |--24.56%--lldb_private::FunctionCaller::WriteFunctionWrapper(lldb_private::ExecutionContext&, lldb_private::DiagnosticManager&)
                   |          |          |          |          |          |          |
                   |          |          |          |          |          |          |--19.73%--lldb_private::ExpressionParser::PrepareForExecution(unsigned long&, unsigned long&, std::shared_ptr<lldb_private::IRExecutionUnit>&, lldb_private::ExecutionContext&, bool&, lldb_private::ExecutionPolicy)
                   |          |          |          |          |          |          |          lldb_private::ClangExpressionParser::DoPrepareForExecution(unsigned long&, unsigned long&, std::shared_ptr<lldb_private::IRExecutionUnit>&, lldb_private::ExecutionContext&, bool&, lldb_private::ExecutionPolicy)
                   |          |          |          |          |          |          |          lldb_private::IRExecutionUnit::GetRunnableInfo(lldb_private::Status&, unsigned long&, unsigned long&)
                   |          |          |          |          |          |          |          |
    ```
    
    The hot path was added by https://reviews.llvm.org/D156465, which should
    at least be disabled on Linux. Note: I am seeing a similar hot path
    on Mac.
    
    This PR hides the feature behind the `enableDisplayExtendedBacktrace`
    option, which needs to be enabled on demand.
    
    ---------
    
    Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
    jeffreytan81 and jeffreytan81 authored Aug 22, 2024
    e5140ae
  74. [AMDGPU] Correctly insert s_nops for dst forwarding hazard (llvm#100276)

    MI300 ISA section 4.5 states there is a hazard between "VALU op which
    uses OPSEL or SDWA with changes the result’s bit position" and "VALU op
    consumes result of that op"
    
    This includes the case where the second op is SDWA with same dest and
    dst_sel != DWORD && dst_unused == UNUSED_PRESERVE. In this case, there
    is an implicit read of the first op dst and the compiler needs to
    resolve this hazard. Confirmed with HW team.
    
    We model dst_unused == UNUSED_PRESERVE as tied-def of implicit operand,
    so this PR checks for that.
    
    MI300_SP_MAS section 1.3.9.2 specifies that CVT_SR_FP8_F32 and
    CVT_SR_BF8_F32 with opsel[3:2] != 0 have a dest forwarding issue.
    Currently, we only check for CVT_SR_FP8_F32 with opsel[3] != 0 --
    this PR adds support for opsel[2] != 0 as well.
    jrbyrnes authored Aug 22, 2024
    7bcf4d6
  75. a7c8f41
  76. [libc] Add ctype.h locale variants (llvm#102711)

    Summary:
    This patch adds all the libc ctype variants. These ignore the locale
    information completely, so they're pretty much just stubs. Because these
    use locale information, which is system scope, we do not enable building
    them outside of full build mode.
    jhuber6 committed Aug 22, 2024
    856dadb
  77. c2a96a2
  78. Revert "[MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedu… (

    llvm#105716)
    
    …le data"
    
    This reverts commit 2c1f064.
    
    Many build failures in: CodeGen/X86/scatter-schedule.ll
    
    Example of a build failure:
    https://lab.llvm.org/buildbot/#/builders/155/builds/1675
    cjappl authored Aug 22, 2024
    e738c81
  79. [LTO] Introduce helper functions to add GUIDs to ImportList (NFC) (ll…

    …vm#105555)
    
    The new helper functions make the intent clearer while hiding
    implementation details, including how we handle previously added
    entries.  Note that:
    
    - If we are adding a GUID as a GlobalValueSummary::Definition, then we
      override a previously added GlobalValueSummary::Declaration entry
      for the same GUID.
    
    - If we are adding a GUID as a GlobalValueSummary::Declaration, then a
      previously added GlobalValueSummary::Definition entry for the same
      GUID takes precedence, and no change is made.
    kazutakahirata authored Aug 22, 2024
    3082a38
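
The precedence rules in the two bullets of the LTO helper commit above can be sketched with a plain map. This is only an illustration under assumed names: `ImportKind`, `ImportSet`, `addDefinition`, `maybeAddDeclaration` and `kindOf` are hypothetical stand-ins, and the real LLVM data structures are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for the two import kinds named in the commit message.
enum class ImportKind { Declaration, Definition };

using Guid = std::uint64_t;
using ImportSet = std::unordered_map<Guid, ImportKind>;

// A definition always wins: it overrides an earlier declaration entry.
inline void addDefinition(ImportSet &Imports, Guid G) {
  Imports[G] = ImportKind::Definition;
}

// A declaration is recorded only if the GUID has no entry yet, so an
// existing definition takes precedence and nothing changes.
inline void maybeAddDeclaration(ImportSet &Imports, Guid G) {
  Imports.try_emplace(G, ImportKind::Declaration);
}

// Look up how a GUID was recorded; the GUID must have been added before.
inline ImportKind kindOf(const ImportSet &Imports, Guid G) {
  auto It = Imports.find(G);
  assert(It != Imports.end() && "GUID was never added");
  return It->second;
}
```

Hiding the map update behind two named helpers keeps the "definition beats declaration" rule in one place instead of repeating it at every call site.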
  80. AMDGPU: Remove global/flat atomic fadd intrinsics (llvm#97051)

    These have been replaced with atomicrmw.
    arsenm authored Aug 22, 2024
    ee08d9c
  81. [VPlan] Factor out precomputing costs from LVP::cost (NFC).

    Move the logic for pre-computing costs of certain instructions to a
    separate helper function, allowing re-use in a follow-up patch.
    fhahn committed Aug 22, 2024
    e454d31
  82. [LLD][COFF] Generate X64 thunks for ARM64EC entry points and patchabl…

    …e functions. (llvm#105499)
    
    This implements Fast-Forward Sequences documented in ARM64EC
    ABI https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi.
    
    There are two cases in which the linker should generate such thunks:
    
    - For each exported ARM64EC function.
    It applies only to ARM64EC functions (we may also have pure x64
    functions, for which no thunk is needed). MSVC linker creates
    `EXP+<mangled export name>` symbol in those cases that points to the
    thunk and uses that symbol for the export. It's observable from the
    module: it's possible to reference such symbols as I did in the test.
    Note that it uses export name, not name of the symbol that's exported
    (as in `foo` in `/EXPORT:foo=bar`). This implies that if the same
    function is exported multiple times, it will have multiple thunks. I
    followed this MSVC behavior.
    
    - For hybrid_patchable functions.
    The linker tries to generate a thunk for each undefined `EXP+*` symbol
    (and such symbols are created by the compiler as a target of weak alias
    from the demangled name). MSVC linker tries to find the corresponding
    `*$hp_target` symbol and, if it fails to do so, outputs a cryptic error
    like `LINK : fatal error LNK1000: Internal error during
    IMAGE::BuildImage`. I just skip generating the thunk in such a case (which
    causes an undefined reference error). MSVC linker additionally checks that
    the symbol complex type is a function (see also llvm#102898). We generally
    don't do such checks in LLD, so I made it less strict. It should be
    fine: if it's some data symbol, it will not have `$hp_target` symbol, so
    we will skip it anyway.
    cjacek authored Aug 22, 2024
    a2d8743
  83. [VPlan] Don't trigger VF assertion if VPlan has extra simplifications.

    There are cases where VPlans contain some simplifications that are very
    hard to accurately account for up-front in the legacy cost model. Those
    cases are caused by un-simplified inputs, which trigger the assert
    ensuring both the legacy and VPlan-based cost model agree on the VF.
    
    To avoid false positives due to missed simplifications in general, only
    trigger the assert if the chosen VPlan doesn't contain any additional
    simplifications.
    
    Fixes llvm#104714.
    Fixes llvm#105713.
    fhahn committed Aug 22, 2024
    cb4efe1
  84. [VPlan] Fix typo in cb4efe1.

    fhahn committed Aug 22, 2024
    768dba7
  85. [libunwind] Stop installing the mach-o module map (llvm#105616)

    libunwind shouldn't know that compact_unwind_encoding.h is part of a
    MachO module that it doesn't own. Delete the mach-o module map, and let
    whatever is in charge of the mach-o directory be the one to say how its
    module is organized and where compact_unwind_encoding.h fits in.
    ian-twilightcoder authored Aug 22, 2024
    172c4a4
  86. [clang][rtsan] Introduce realtime sanitizer codegen and driver (llvm#…

    …102622)
    
    Introduce the `-fsanitize=realtime` flag in clang driver
    
    Plug in the RealtimeSanitizer PassManager pass in Codegen, and attribute
    a function based on whether it has the `[[clang::nonblocking]]` function
    effect.
    cjappl authored Aug 22, 2024
    d010ec6
  87. [Clang] [Parser] Improve diagnostic for friend concept (llvm#105121)

    Diagnose this early after parsing declaration specifiers; this allows us
    to issue a better diagnostic. This also checks for `concept friend` and
    concept declarations w/o a template-head because it’s easiest to do that
    at the same time.
    
    Fixes llvm#45182.
    Sirraide authored Aug 22, 2024
    8b5f606
  88. [compiler-rt][test] Change tests to remove the use of unset command…

    … in lit internal shell (llvm#104880)
    
    This patch rewrites tests to remove the use of the `unset` command,
    which is not supported in the lit internal shell. The tests now use the
    `env -u` to unset environment variables.
    
    The `unset` command is used in shell environments to remove the
    environment variable. However, because the lit internal shell does not
    support the `unset` command, using it in tests would result in errors or
    other unexpected behavior. To overcome this limitation, the tests have
    been updated to use the `env -u` command instead. `env -u` is supported
    by lit and effectively removes specified environment variables. This
    allows the tests to achieve the same goal of unsetting environment
    variables while ensuring compatibility with the lit internal shell.
    
    This change is relevant for [[RFC] Enabling the Lit Internal Shell by
    Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3)
    Fixes: llvm#102397
    Harini0924 authored Aug 22, 2024
    42d06b8
  89. [mlir][SCF]-Fix loop coalescing with iteration arguments (llvm#105488)

    Fix a bug found when coalescing loops which have iteration arguments,
    such that the inner loop's terminator may have operands of the inner
    loop iteration arguments which are about to be replaced by the outer
    loop's iteration arguments.
    
    The current flow leads to a crash within the IR code.
    amirBish authored Aug 22, 2024
    d7fc779
  90. [NFC][ADT] Add reverse iterators and value_type to StringRef (llvm#…

    …105579)
    
    - Add reverse iterators and `value_type` to StringRef.
    - Add unit test for all 4 iterator flavors.
    - This prepares StringRef to be used with `SequenceToOffsetTable`.
    jurahul authored Aug 22, 2024
    911e246
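
To illustrate what reverse iterators and a `value_type` member give a string view, here is a minimal sketch that uses `std::string_view` as a stand-in for `llvm::StringRef`. That substitution is an assumption made so the example has no LLVM dependency, and the helper names `reversed` and `lastChar` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// std::string_view stands in for llvm::StringRef here (assumption).
// Generic code can query the element type through value_type.
static_assert(std::is_same_v<std::string_view::value_type, char>);

// Build a reversed copy by walking the view from back to front.
inline std::string reversed(std::string_view S) {
  return std::string(S.rbegin(), S.rend());
}

// Read the last character through the reverse iterator.
inline char lastChar(std::string_view S) {
  assert(!S.empty() && "view must not be empty");
  return *S.rbegin();
}
```

Generic helpers that expect a standard-style sequence typically rely on exactly these members, which is presumably why the commit mentions `SequenceToOffsetTable`.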
  91. a1e9b7e
  92. [Vectorize] Fix a warning

    This patch fixes:
    
      llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7245:1: error:
      unused function 'planContainsAdditionalSimplifications'
      [-Werror,-Wunused-function]
    kazutakahirata committed Aug 22, 2024
    4e6ff75
  93. [LTO] Use a helper function to add a definition (NFC) (llvm#105721)

    I missed this one when I introduced helper functions in:
    
      commit 3082a38
      Author: Kazu Hirata <kazu@google.com>
      Date:   Thu Aug 22 12:06:47 2024 -0700
    kazutakahirata authored Aug 22, 2024
    ca48b01
  94. [RISCV][TTI] Use legalized element types when costing casts (llvm#105723

    )
    
    This fixes a crash introduced by my
    ac6e1fd.
    
    I had failed to consider the case where a vector is truncated to an
    illegal element type. The resulting intermediate VT wasn't an MVT and
    we'd fail an assertion. Surprisingly, SLP does query illegal element
    types in some cases.
    preames authored Aug 22, 2024
    424b87b
  95. [SandboxIR] Implement CatchReturnInst (llvm#105605)

    This patch implements sandboxir::CatchReturnInst mirroring
    llvm::CatchReturnInst.
    vporpo authored Aug 22, 2024
    0d21c2b
  96. Revert "[clang] Merge lifetimebound and GSL code paths for lifetime a…

    …nalysis (llvm#104906)" (llvm#105752)
    
    Revert as it breaks libc++ tests, see llvm#104906.
    
    This reverts commit c368a72.
    vitalybuka authored Aug 22, 2024
    1df1504
  97. [clang][NFC] order C++ standards in reverse in release notes (llvm#10…

    …4866)
    
    Noticed that the release notes currently have a weird order: C++17,
    C++14(!), C++20, C++23, C++2c. Reorder them in reverse chronological
    order, which also matches the [status
    page](https://clang.llvm.org/cxx_status.html).
    h-vetinari authored Aug 22, 2024
    ecfceb8

Commits on Aug 23, 2024

  1. [ScalarizeMaskedMemIntr] Don't use a scalar mask on GPUs (llvm#104842)

    ScalarizeMaskedMemIntr contains an optimization where the <N x i1> mask
    is bitcast into an iN and then bit-tests with powers of two are used to
    determine whether to load/store/... or not.
    
    However, on machines with branch divergence (mainly GPUs), this is a
    mis-optimization, since each i1 in the mask will be stored in a
    condition register - that is, each of these "i1"s is likely to be a word
    or two wide, making these bit operations counterproductive.
    
    Therefore, amend this pass to skip the optimization on targets that it
    pessimizes.
    
    Pre-commit tests llvm#104645
    krzysz00 authored Aug 23, 2024
    25d976b
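
The scalar-mask strategy described in the ScalarizeMaskedMemIntr commit above can be modelled in ordinary C++. This is a rough sketch rather than the pass itself: `NumLanes`, `packMask` and `maskedLoad` are hypothetical names, and the mapping of lane `i` to bit `i` is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t NumLanes = 8;

// Pack an 8-lane boolean mask into one integer, lane I -> bit I
// (the analogue of bitcasting <8 x i1> to i8).
inline std::uint8_t packMask(const std::array<bool, NumLanes> &Mask) {
  std::uint8_t Bits = 0;
  for (std::size_t I = 0; I < NumLanes; ++I)
    if (Mask[I])
      Bits |= static_cast<std::uint8_t>(1u << I);
  return Bits;
}

// Scalarized masked load: lane I is loaded only if the power-of-two bit
// (1 << I) is set in the packed mask; otherwise the pass-through value stays.
inline std::array<int, NumLanes> maskedLoad(const int *Ptr, std::uint8_t Bits,
                                            std::array<int, NumLanes> PassThru) {
  assert(Ptr != nullptr && "source pointer must be valid");
  for (std::size_t I = 0; I < NumLanes; ++I)
    if (Bits & (1u << I))
      PassThru[I] = Ptr[I];
  return PassThru;
}
```

On a CPU the packed mask lives in one register, so the bit tests are cheap. The commit's point is that on divergent targets each `i1` already occupies its own condition register, so packing and re-testing only adds work.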
  2. [llvm][NVPTX] Fix quadratic runtime in ProxyRegErasure (llvm#105730)

    This pass performs RAUW by walking the machine function for each RAUW
    operation. For large functions, the runtime of this pass starts to blow
    up. Linearize the pass by applying all the RAUW ops in one batch.
    Mogball authored Aug 23, 2024
    08e5a1d
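
The batching idea from the ProxyRegErasure commit above can be sketched as follows: instead of walking every operand once per replacement, collect all replacements in a map and walk the operands once. The types and the function name are hypothetical, and the sketch assumes that no replacement target is itself scheduled for replacement.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

using Reg = unsigned;

// Apply all register replacements in a single walk over the operands.
// Cost is one hash lookup per operand, independent of the number of
// replacements. Assumes no replacement target is itself a key (no chains).
inline void replaceAllBatched(std::vector<Reg> &Operands,
                              const std::unordered_map<Reg, Reg> &Repl) {
  for (Reg &Op : Operands) {
    auto It = Repl.find(Op);
    if (It != Repl.end()) {
      assert(Repl.count(It->second) == 0 && "chained replacement");
      Op = It->second;
    }
  }
}
```

With R replacements and M operands, the per-replacement walk costs on the order of R times M steps, while the batched walk costs roughly R plus M.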
  3. [bazel] Move lldb-dap cc_binary to lldb/BUILD.bazel (llvm#105733)

    On Linux, lldb-dap uses the location of the lldb-dap binary to search for
    lldb-server. Previously these were produced in different directories
    corresponding to the BUILD file paths. It's not ideal that the BUILD
    file location matters for the binary at runtime but it doesn't hurt to
    have this tool here too like the others.
    keith authored Aug 23, 2024
    be8ee09
  4. [mlir][tensor] Add consumer fusion for tensor.pack op. (llvm#103715)

    Add missing `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile` to `tensor.pack` and enable fusing it as a consumer. Note that it currently only supports the perfect-tiling scenario without padding semantics.
    Yun-Fly authored Aug 23, 2024
    f06563a
  5. [NFC][TableGen] Emit more readable builtin string table (llvm#105445)

    - Add `EmitStringLiteralDef` to StringToOffsetTable class to emit more
    readable string table.
    - Use that in `EmitIntrinsicToBuiltinMap`.
    jurahul authored Aug 23, 2024
    381405f
  6. [AMDGPU] Refactor code for GETPC bundle updates in hazards (NFCI)

    As suggested in review for PR llvm#100067.
    Refactor code for S_GETPC_B64 bundle updates for use with multiple
    hazard mitigations.
    perlfu committed Aug 23, 2024
    987ffc3
  7. [clang-format] Don't insert a space between :: and * (llvm#105043)

    Also, don't insert a space after ::* for method pointers.
    
    See
    llvm#86253 (comment).
    
    Fixes llvm#100841.
    owenca authored Aug 23, 2024
    714033a
  8. Revert "[Vectorize] Fix warnings" (llvm#105771)

    Triggers assert in compiler
    https://lab.llvm.org/buildbot/#/builders/51/builds/2836
    
    ```
    Instructions.cpp:1700: llvm::ShuffleVectorInst::ShuffleVectorInst(Value *, Value *, ArrayRef<int>, const Twine &, InsertPosition): Assertion `isValidOperands(V1, V2, Mask) && "Invalid shuffle vector instruction operands!"' failed.
    ```
    
    This reverts commit a625435.
    vitalybuka authored Aug 23, 2024
    1519451
  9. [SPIRV] Emitting DebugSource, DebugCompileUnit (llvm#97558)

    This commit introduces emission of DebugSource and DebugCompileUnit from
    NonSemantic.Shader.DebugInfo.100, along with the required OpString
    holding the filename.
    Following DWARF, NonSemantic.Shader.DebugInfo.100 is divided into two
    main concepts: emitting DIEs and emitting line information.
    In DWARF, the .debug_abbrev and .debug_info sections are responsible for
    emitting a tree of information (DIEs) about e.g. types and the
    compilation unit. Correspondingly, NonSemantic.Shader.DebugInfo.100 has
    instructions like DebugSource, DebugCompileUnit etc., which perform the
    same role in a SPIR-V file. The difference is that SPIR-V has no
    sections, only a logical layout that enforces the order of instruction
    emission.
    NonSemantic.Shader.DebugInfo.100 requires this type of global
    information to be emitted after the OpTypeXXX and OpConstantXXX
    instructions.
    One of the goals was to minimize changes to and interaction with
    SPIRVModuleAnalysis as much as possible, which the current commit
    achieves by emitting its instructions directly into the MachineFunction.
    The possibility of duplicates is mitigated by a guard inside the pass,
    which emits the global information only once, in one function, so
    duplicates cannot be emitted.
    From this point, adding new debug global instructions should be
    straightforward.
    bwlodarcz authored Aug 23, 2024
    62da359
  10. [ORC] Add an identifier-override argument to loadRelocatableObject an…

    …d friends.
    
    API clients may want to use things other than paths as the buffer identifiers.
    
    No testcase -- I haven't thought of a good way to expose this via the regression
    testing tools.
    
    rdar://133536831
    lhames committed Aug 23, 2024
    e15abb7
  11. 351f4a5
  12. [Scalar] Remove an unused variable (llvm#105767)

    The last use was removed by:
    
      commit 89fe570
      Author: Philip Reames <listmail@philipreames.com>
      Date:   Tue May 12 23:39:23 2015 +0000
    kazutakahirata authored Aug 23, 2024
    fdaaa87
  13. [clang-format] Change BinPackParameters to enum and add AlwaysOnePerL…

    …ine (llvm#101882)
    
    Related issues that have requested this feature:
    llvm#51833
    llvm#23796 
    llvm#53190 (partially solved; this issue requests the feature for both
    arguments and parameters)
    VolatileAcorn authored Aug 23, 2024
    7c3237d
  14. [LTO] Turn ImportMapTy into a proper class (NFC) (llvm#105748)

    This patch turns type alias ImportMapTy into a proper class to provide
    a more intuitive interface like:
    
      ImportList.addDefinition(...)
    
    as opposed to:
    
      FunctionImporter::addDefinition(ImportList, ...)
    
    Also, this patch requires all non-const accesses to go through
    addDefinition, maybeAddDeclaration, and addGUID while providing const
    accesses via:
    
      const ImportMapTyImpl &getImportMap() const { return ImportMap; }
    
    I realize ImportMapTy may not be the best name as a class (maybe OK as
    a type alias).  I am not renaming ImportMapTy in this patch at least
    because there are 47 mentions of ImportMapTy under llvm/.
    kazutakahirata authored Aug 23, 2024
    3563907
  15. Revert "[SLP]Improve/fix subvectors in gather/buildvector nodes handl…

    …ing" (llvm#105780)
    
    with "[Vectorize] Fix warnings"
    
    It introduced compiler crashes, see llvm#104144.
    
    This reverts commit 69332bb and
    351f4a5.
    vitalybuka authored Aug 23, 2024
    96b3166
  16. [memref] Handle edge case in subview of full static size fold (llvm#1…

    …05635)
    
    It is possible to have a subview with a fully static size and a type
    that matches the source type, but a dynamic offset that may be
    different. However, currently the memref dialect folds:
    
    ```mlir
    func.func @subview_of_static_full_size(
      %arg0:  memref<16x4xf32,  strided<[4, 1], offset: ?>>, %idx: index)
      -> memref<16x4xf32,  strided<[4, 1], offset: ?>>
    {
      %0 = memref.subview %arg0[%idx, 0][16, 4][1, 1]
       : memref<16x4xf32,  strided<[4, 1], offset: ?>>
         to memref<16x4xf32,  strided<[4, 1], offset: ?>>
      return %0 : memref<16x4xf32,  strided<[4, 1], offset: ?>>
    }
    ```
    
    To:
    
    ```mlir
    func.func @subview_of_static_full_size(
      %arg0: memref<16x4xf32, strided<[4, 1], offset: ?>>, %arg1: index)
      -> memref<16x4xf32, strided<[4, 1], offset: ?>>
    {
      return %arg0 : memref<16x4xf32, strided<[4, 1], offset: ?>>
    }
    ```
    
    Which drops the dynamic offset from the `subview` op.
    MacDue authored Aug 23, 2024
    84aa02d
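
The edge case in the memref commit above comes down to when a full-size subview is really the identity. The following sketch states one plausible condition; it is an assumption about the shape of the check, not the code from the patch, and `MaybeStatic` and `isIdentitySubview` are hypothetical names. A dynamic value is modelled as an empty optional.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// std::nullopt models a dynamic (SSA value) entry; a number models a static one.
using MaybeStatic = std::optional<std::int64_t>;

// Hypothetical check: a subview whose static sizes and type match the source
// may be replaced by the source only if every offset is statically zero and
// every stride is statically one. A dynamic offset may differ at runtime.
inline bool isIdentitySubview(const std::vector<MaybeStatic> &Offsets,
                              const std::vector<MaybeStatic> &Strides,
                              bool SizesAndTypeMatchSource) {
  assert(Offsets.size() == Strides.size() && "rank mismatch");
  if (!SizesAndTypeMatchSource)
    return false;
  for (const MaybeStatic &O : Offsets)
    if (O != 0) // true for a dynamic offset and for a non-zero static one
      return false;
  for (const MaybeStatic &S : Strides)
    if (S != 1) // true for a dynamic stride and for a non-unit static one
      return false;
  return true;
}
```

For the example in the commit message, the offsets are `[%idx, 0]`, so the first offset is dynamic and the predicate returns false, which keeps the `subview` op in place.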
  17. [MIPS] Optimize sortRelocs for o32

    The o32 ABI specifies:
    
    > Each relocation type of R_MIPS_HI16 must have an associated R_MIPS_LO16 entry immediately following it in the list of relocations. [...] the addend AHL is computed as (AHI << 16) + (short)ALO
    
    In practice, the high-part and low-part relocations may not be adjacent
    in assembly files, requiring the assembler to reorder relocations.
    http://reviews.llvm.org/D19718 performed the reordering, but did not
    optimize for the common case where a %lo immediately follows its
    matching %hi. The quadratic time complexity could make sections with
    many relocations very slow to process.
    
    This patch implements the fast path, simplifies the code, and makes the
    behavior more similar to GNU assembler (for the .rel.mips_hilo_8b test).
    We also remove `OriginalSymbol`, removing overhead for other targets.
    
    Fix llvm#104562
    
    Pull Request: llvm#104723
    MaskRay authored Aug 23, 2024
    59721f2
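
The addend formula quoted from the o32 ABI in the MIPS commit above, `AHL = (AHI << 16) + (short)ALO`, is easy to get wrong because the low half is sign-extended. A small sketch, with the hypothetical helper name `combineHiLo` and with both halves passed as 16-bit field values:

```cpp
#include <cassert>
#include <cstdint>

// Combine the addends of a paired R_MIPS_HI16 / R_MIPS_LO16 relocation:
// AHL = (AHI << 16) + (short)ALO, where ALO is sign-extended from 16 bits.
inline std::int32_t combineHiLo(std::uint32_t AHI, std::uint32_t ALO) {
  assert(AHI <= 0xFFFF && ALO <= 0xFFFF && "both halves are 16-bit fields");
  std::uint32_t Hi = AHI << 16;
  // Reinterpret the low half as a signed 16-bit value, then widen it.
  std::int32_t Lo = static_cast<std::int16_t>(static_cast<std::uint16_t>(ALO));
  // Add in unsigned arithmetic so wrap-around is well defined.
  return static_cast<std::int32_t>(Hi + static_cast<std::uint32_t>(Lo));
}
```

Because the low half can be negative, the high half alone does not determine the final address, which is why each `R_MIPS_HI16` needs its matching `R_MIPS_LO16` next to it.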
  18. a69ba0a
  19. [NFCI] [C++20] [Modules] Relax the case for duplicated declaration in…

    … multiple module units for explicit specialization
    
    Relax the case for duplicated declaration in multiple module units for
    explicit specialization and refactor the implementation of
    checkMultipleDefinitionInNamedModules a little bit.
    
    This is intended to not affect any end users since it only relaxes the
    condition to emit an error.
    ChuanqiXu9 committed Aug 23, 2024
    e5f196e
  20. [NFCI] [Serialization] Use demoteThisDefinitionToDeclaration instead …

    …of setCompleteDefinition(false) for CXXRecordDecl
    
    When we merge the definition for CXXRecordDecl, we would use
    setCompleteDefinition(false) to mark the merged definition. But this was
    not the correct/good interface. We can't know that the merged definition
    was a definition then. And actually, we provided an interface for this:
    demoteThisDefinitionToDeclaration.
    
    So this patch tries to use the correct API.
    
    This was found during downstream development. This is not strictly NFC,
    but it is intended to be NFC for all end users.
    ChuanqiXu9 committed Aug 23, 2024
    39986f0
  21. 85b6aac
  22. 28133d9
  23. f53bfa3
  24. [AMDGPU] Simplify use of hasMovrel and hasVGPRIndexMode (llvm#105680)

    The generic subtarget has neither of these features. Rather than forcing
    HasMovrel on, it is simpler to expand dynamic vector indexing to a
    sequence of compare/select instructions.
    
    NFC for real subtargets.
    jayfoad authored Aug 23, 2024
    b02b5b7
  25. [Matrix] Preserve signedness when extending matrix index expression. (l…

    …lvm#103044)
    
    As per [1] the indices for a matrix element access operator shall have
    integral or unscoped enumeration types and be non-negative. At the
    moment, the index expression is converted to SizeType irrespective of
    the signedness of the index expression. This causes implicit sign
    conversion warnings if any of the indices is signed.
    
    As per the spec, using signed types as indices is allowed and should not
    cause any warnings. If the index expression is signed, extend to
    SignedSizeType to avoid the warning.
    
    [1]
    https://clang.llvm.org/docs/MatrixTypes.html#matrix-type-element-access-operator
    
    PR: llvm#103044
    fhahn authored Aug 23, 2024
    96509bb
  26. [AMDGPU] Remove one case of vmcnt loop header flushing for GFX12 (llv…

    …m#105550)
    
    When a loop contains a VMEM load whose result is only used outside the
    loop, do not bother to flush vmcnt in the loop head on GFX12. A wait for
    vmcnt will be required inside the loop anyway, because VMEM instructions
    can write their VGPR results out of order.
    jayfoad authored Aug 23, 2024
    fa2dccb
  27. [MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data …

    …(REAPPLIED)
    
    This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should
    
    Reapplied with codegen fix for scatter-schedule.ll
    
    Fixes llvm#105675
    RKSimon committed Aug 23, 2024
    cf6cd1f
  28. [C++20] [Modules] Warn for duplicated decls in multiple module units (l…

    …lvm#105799)
    
    It is a long-standing issue that duplicated declarations in multiple
    module units slow down compilation, and there have been many questions
    and issue reports about it. So I think it is better to add a warning
    for it.
    
    Since this does not mean the users' code violates the language
    specification or any best practices, the warning is disabled by default
    even if `-Wall` is specified. Users need to enable the warning
    explicitly or use `-Weverything`.
    
    The documentation will be added separately.
    ChuanqiXu9 authored Aug 23, 2024
    3cca522
  29. c8ba317
  30. [AArch64] Scalarize i128 add/sub/mul/and/or/xor vectors

    This mirrors what we do for SDAG, scalarizing i128 vectors with
    add/sub/mul/and/or/xor operators.
    davemgreen committed Aug 23, 2024
    646478f
  31. [clang][bytecode][NFC] Remove containsErrors() check from delegate (l…

    …lvm#105804)
    
    This check was removed a while ago from visit(), remove it from
    delegate() as well.
    tbaederr authored Aug 23, 2024
    38b8e54
  32. [clang][bytecode] Reject void InitListExpr differently (llvm#105802)

    This reverts c79d1fa and
    125aa10
    
    Instead, use the previous approach but allow void-typed InitListExprs
    with 0 initializers.
    tbaederr authored Aug 23, 2024
    7b4b85b
  33. [ORC] Expose a non-destructive check-macho-buffer overload.

    This allows clients to check buffers that they don't own.
    
    rdar://133536831
    lhames committed Aug 23, 2024
    4a12722
  34. cbf34a5
  35. 2b4b909
  36. 2f144ac
  37. [flang][NFC] turn fir.call is_bind_c into enum for procedure flags (l…

    …lvm#105691)
    
    First patch to fix a BIND(C) ABI issue
    (llvm#102113). I need to keep
    track of BIND(C) in more locations (fir.dispatch and func.func
    operations), and I need to fix a few passes that are dropping the
    attribute on the floor. Since I expect more procedure attributes that
    cannot be reflected in mlir::FunctionType will be needed for ABI,
    optimizations, or debug info, this NFC patch adds a new enum attribute
    to keep track of procedure attributes in the IR.
    
    This patch is not updating lowering to lower more attributes, this will
    be done in a separate patch to keep the test changes low here.
    
    Adding the attribute on fir.dispatch and func.func will also be done in
    separate patches.
    jeanPerier authored Aug 23, 2024
    2051a7b
  38. [NFC][TableGen] Refactor StringToOffsetTable (llvm#105655)

    - Make `EmitString` const by not mutating `AggregateString`.
    - Use C++17 structured bindings in `GetOrAddStringOffset`.
    - Use StringExtras version of isDigit instead of std::isdigit.
    jurahul authored Aug 23, 2024
    04ab647
  39. [Serialization] Fix a warning

    This patch fixes:
    
      clang/lib/Serialization/ASTReader.cpp:9978:27: error: lambda capture
      'this' is not used [-Werror,-Wunused-lambda-capture]
    kazutakahirata committed Aug 23, 2024
    1e3dc8c
  40. 0d1d95e
  41. 5def27c
  42. [RISCV] Let -data-sections also work on sbss/sdata sections (llvm#87040)

    Add a unique suffix to .sbss/.sdata if -fdata-sections.
    Without assigning a unique .sbss/.sdata section to each symbol, a
    linker may not be able to remove the unused parts during section
    garbage collection (--gc-sections), since used and unused symbols are
    all mixed in the same .sbss/.sdata section.
    I believe this also matches the behavior of gcc.
    KaiYG authored Aug 23, 2024
    4d348f7
  43. [mlir][mem2reg] Fix Mem2Reg attempting to promote in graph regions (l…

    …lvm#104910)
    
    Mem2Reg assumes SSA dependencies but did not check for graph regions.
    This fixes it.
    
    ---------
    
    Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
    Moxinilian and Dinistro authored Aug 23, 2024
    b084111
  44. 2617023
  45. [NFC] Use stable_hash_combine instead of hash_combine (llvm#105619)

    I found the current stable hash is not deterministic across multiple
    runs on a specific platform. This is because it uses `hash_combine`
    instead of `stable_hash_combine`.
    kyulee-com authored Aug 23, 2024
    c9b6339
  46. [AMDGPU] Improve uniform argument handling in InstCombineIntrinsic (l…

    …lvm#105812)
    
    Common up handling of intrinsics that are a no-op on uniform arguments.
    This catches a couple of new cases:
    
    readlane (readlane x, y), z -> readlane x, y
    (for any z, does not have to equal y).
    
    permlane64 (readfirstlane x) -> readfirstlane x
    (and likewise for any other uniform argument to permlane64).
    jayfoad authored Aug 23, 2024
    f142f8a
  47. [SLP]Improve/fix subvectors in gather/buildvector nodes handling

    SLP vectorizer has an estimation for gather/buildvector nodes, which
    contain some scalar loads. SLP vectorizer performs pretty similar (but
    large in SLOCs) estimation, which is not always correct. Instead, this
    patch implements clustering analysis and actual node allocation with the
    full analysis for the vectorized clustered scalars (not only loads, but
    also some other instructions) with the correct cost estimation and
    vector insert instructions. Improves overall vectorization quality and
    simplifies analysis/estimations.
    
    Reviewers: RKSimon
    
    Reviewed By: RKSimon
    
    Pull Request: llvm#104144
    alexey-bataev committed Aug 23, 2024
    f3d2609
  48. [RISCV][MC] Name the vector tuple registers. NFC (llvm#102726)

    Currently vector tuple registers don't have the specified names, the
    default name is, for example: `VRN3M2` -> `V8M2_V10M2_V12M2`, however
    it's equivalent to `v8` in the assembly.
    4vtomat authored Aug 23, 2024
    002ba17
  49. Revert "[clang] Increase the default expression nesting limit (llvm#1…

    …04717)"
    
    This reverts commit 7597e09.
    
    It caused several buildbot failures due to stack overflows with the
    parser test.
    AaronBallman committed Aug 23, 2024
    e3ce979
  50. 67a9093
  51. [SLP]Fix a crash for the strided nodes with reversed order and extern…

    …ally used pointer.
    
    If the strided node is reversed, we need to check the last instruction,
    not the first one in the list of scalars, when checking if the root
    pointer must be extracted.
    alexey-bataev committed Aug 23, 2024
    dab19da
  52. Revert "[RISCV] Add isel optimization for (and (sra y, c2), c1) to re…

    …cover regression from llvm#101751. (llvm#104114)"
    
    This caused an assert to fire:
    
      llvm/include/llvm/Support/Casting.h:566:
      decltype(auto) llvm::cast(const From &) [To = llvm::ConstantSDNode, From = llvm::SDValue]:
      Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
    
    see comment on the PR.
    
    > If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If
    > c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
    > followed by a SHXADD with c4 as the X amount.
    >
    > Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
    > Alive2: https://alive2.llvm.org/ce/z/AwhheR
    
    This reverts commit 5144817.
    zmodem committed Aug 23, 2024
    858afe9
  53. [PS5][clang][test] x86_64-scei-ps5 -> x86_64-sie-ps5 in tests (llvm#1…

    …05810)
    
    `x86_64-sie-ps5` is the triple we share with PS5 toolchain users who
    have reason to care about such things. The vast majority of PS5 checks
    and tests already use this variant. Quashing the handful of stragglers
    will help prevent future copy+paste of the discouraged variant.
    playstation-edd authored Aug 23, 2024
    05ce95e
  54. [VPlan] Skip branches marked as dead in cost precomputation.

    Don't consider the cost of branches marked to be skipped in VPlan cost
    pre-computation. Those aren't included in the legacy cost, so they
    should not be included in the VPlan cost.
    fhahn committed Aug 23, 2024
    885c436
  55. Revert "Reland "[asan] Remove debug tracing from report_globals (ll…

    …vm#104404)" (llvm#105601)"
    
    that change still breaks
    
      SanitizerCommon-asan-x86_64-Darwin :: Darwin/print-stack-trace-in-code-loaded-after-fork.cpp
    
    > This reverts commit 2704b80
    > and relands llvm#104404.
    >
    > The Darwin should not fail after llvm#105599.
    
    This reverts commit 8c6f8c2.
    zmodem committed Aug 23, 2024
    6a8f738
  56. [clang][rtsan] Reland realtime sanitizer codegen and driver (llvm#102622)
    
    This reverts commit a1e9b7e
    This relands commit d010ec6
    
    No modifications from the original patch. It was determined that the
    ubsan build failure was happening even after the revert, some examples:
    
    https://lab.llvm.org/buildbot/#/builders/159/builds/4477 
    https://lab.llvm.org/buildbot/#/builders/159/builds/4478 
    https://lab.llvm.org/buildbot/#/builders/159/builds/4479
    cjappl authored Aug 23, 2024
    f77e8f7
  57. [C23] Update status page for TS 18661 integration (llvm#105693)

    WG14 N2401 was removed from the list because it was library-only changes
    that don't impact the compiler.
    
    Everything having to do with decimal floating-point types was changed to
    No because we do not currently have any support for those.
    
    WG14 N2314 remains Unknown because it has changes to Annex F for binary
    floating-point types.
    AaronBallman authored Aug 23, 2024
    3faf5b9
  58. [BOLT][test] Removed the use of parentheses in BOLT tests with lit in…

    …ternal shell (llvm#105720)
    
    This patch addresses compatibility issues with the lit internal shell by
    removing the use of subshell execution (parentheses and subshell syntax)
    in the `BOLT` tests. The lit internal shell does not support
    parentheses, so the tests have been refactored to use separate command
    invocations, with outputs redirected to temporary files where necessary.
    
    This change is relevant for enabling the lit internal shell by default,
    as outlined in [[RFC] Enabling the Lit Internal Shell by
    Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179)
    
    fixes: llvm#102401
    Harini0924 authored Aug 23, 2024
    7f37932
  59. [SCF][PIPELINE] Handle the case when values from the peeled prologue …

    …may escape out of the loop (llvm#105755)
    
    Previously the values in the peeled prologue that weren't treated with
    the `predicateFn` were passed to the loop body without any other
    predication. If those values are later used outside of the loop body,
    they may be incorrect if the number of iterations is smaller than the
    number of stages minus 1. We need similar masking for those, as is done
    in the main loop body, using already existing predicates.
    pawelszczerbuk authored Aug 23, 2024
    7c90081
  60. [Clang] Implement P2747 constexpr placement new (llvm#104586)

    The implementation follows the resolution of CWG2922
    cor3ntin authored Aug 23, 2024
    6e78aef
  61. 8075576
  62. [libc++] Remove status pages tracking SpecialMath and Zip (llvm#105672)

    Instead of tracking those using our static CSV files, I created lists of
    subtasks in their respective issues (llvm#99939 and llvm#105169) to track the
    work that is still left.
    ldionne authored Aug 23, 2024
    ff5552c
  63. b8f1505
  64. 5a25854
  65. [mlir][Transforms][NFC] Move ReconcileUnrealizedCasts implementation (llvm#104671)
    
    Move the implementation of `ReconcileUnrealizedCasts` to
    `DialectConversion.cpp`, so that it can be called from there in a future
    commit.
    
    This commit is in preparation of decoupling argument/source/target
    materializations from the dialect conversion framework. The existing
    logic around unresolved materializations that predicts IR changes to
    decide if a cast op can be folded/erased will become obsolete, as
    `ReconcileUnrealizedCasts` will perform these kinds of foldings on fully
    materialized IR.
    
    ---------
    
    Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
    matthias-springer and zero9178 authored Aug 23, 2024
    a9f6224
  66. Reland "[clang] Merge lifetimebound and GSL code paths for lifetime a…

    …nalysis (llvm#104906)" (llvm#105838)
    
    Reland without the `EnableLifetimeWarnings` removal. I will remove the
    EnableLifetimeWarnings in a follow-up patch.
    
    I have added a test to prevent regression.
    hokein authored Aug 23, 2024
    b1560bd
  67. fd7904a
  68. Recommit "[RISCV] Add isel optimization for (and (sra y, c2), c1) to …

    …recover regression from llvm#101751. (llvm#104114)"
    
    Fixed an incorrect cast.
    
    Original message:
    
    If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If
    c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
    followed by a SHXADD with c4 as the X amount.
    
    Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
    Alive2: https://alive2.llvm.org/ce/z/AwhheR
    topperc committed Aug 23, 2024
    0381e01
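    The rewrite described in the commit message above can be sanity-checked outside the compiler. The following is a minimal sketch on 64-bit values, not the isel code itself: the helper names are hypothetical, and it assumes that `>>` on a negative `std::int64_t` is an arithmetic shift (guaranteed since C++20, and the behaviour of GCC and Clang before that).

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Reference form: (and (sra y, c2), c1).
    std::uint64_t andOfSra(std::int64_t y, unsigned c2, std::uint64_t c1) {
      return static_cast<std::uint64_t>(y >> c2) & c1;
    }

    // Rewritten form without Zba: (slli (srli (srai y, c2 - c3), c3 + c4), c4).
    std::uint64_t shiftSequence(std::int64_t y, unsigned c2, unsigned c3,
                                unsigned c4) {
      assert(c2 > c3 && c2 < 64 && c3 + c4 < 64);
      std::uint64_t t = static_cast<std::uint64_t>(y >> (c2 - c3)); // srai
      t >>= (c3 + c4);                                              // srli
      return t << c4;                                               // slli
    }

    // Builds a shifted mask with c3 leading zeros and c4 trailing zeros.
    std::uint64_t shiftedMask(unsigned c3, unsigned c4) {
      assert(c3 + c4 < 64);
      return (~std::uint64_t{0} >> (c3 + c4)) << c4;
    }
    ```

    For example, with `c2 = 20`, `c3 = 8`, and `c4 = 3`, both forms should agree for any `y`; the Alive2 link in the commit message is the authoritative proof, this sketch only spot-checks individual values.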
  69. 3d18cea
  70. InstructionSelect: Use GISelChangeObserver instead of MachineFunction…

    …::Delegate (llvm#105725)
    
    The main difference is that it's possible for multiple change observers
    to be installed at the same time whereas there can only be one
    MachineFunction delegate installed. This allows downstream targets to
    continue to use observers to recursively select. The target in question
    was selecting a gMIR instruction to a machine instruction plus some gMIR
    around it and relying on observers to ensure it correctly selected any
    gMIR it created before returning to the main loop.
    dsandersllvm authored Aug 23, 2024
    0bf5846
  71. [SCCP] fix non-determinism (llvm#105758)

    the visit order depended on hashing because we iterated over a
    SmallPtrSet
    fmayer authored Aug 23, 2024
    aec3ec0
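    Iterating over a pointer-keyed hash set visits elements in an order that depends on their addresses, so the order can change from run to run. One common remedy is a set that also records insertion order. The sketch below is a generic illustration with a hypothetical class name; it is not LLVM's container and not necessarily the fix this commit applied.

    ```cpp
    #include <unordered_set>
    #include <vector>

    // Hypothetical helper: deduplicates like a set, but iterates in
    // insertion order, so the visit order does not depend on hashing.
    template <typename T>
    class InsertionOrderedSet {
    public:
      // Returns true if the value was newly inserted.
      bool insert(const T &value) {
        if (!seen_.insert(value).second)
          return false; // already present, order is unchanged
        order_.push_back(value);
        return true;
      }

      // Elements in the order they were first inserted.
      const std::vector<T> &items() const { return order_; }

    private:
      std::unordered_set<T> seen_;
      std::vector<T> order_;
    };
    ```

    The cost is one extra vector of elements; in exchange, any pass that walks `items()` sees the same sequence on every run for the same input.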
  72. [X86] Add some initial test coverage for half libcall expansion/promo…

    …tion
    
    We can add additional tests in the future, but this is an initial placeholder
    
    Inspired by llvm#105775
    RKSimon committed Aug 23, 2024
    df97673
  73. [NFC] Fix an incorrect comment about operator precedence. (llvm#105784)

    The comment talks about left-associative operators twice, when the
    latter mention is actually describing right-associative operators.
    mpark authored Aug 23, 2024
    1821cb3
  74. [ctx_prof] Remove the dependency on the "name" GlobalVariable (llvm#1…

    …05731)
    
    We don't need that name variable for contextual instrumentation, we just
    use the function to get its GUID which we pass to the runtime, and rely
    on metadata to capture it through the various optimization passes. This
    change removes the need for the name global variable.
    mtrofin authored Aug 23, 2024
    960a210
  75. [orc][mach-o] Unlock the JITDylib state mutex during +load (llvm#105333)

    Similar to what was already done for static initializers, we need to
    unlock the state mutex when calling out to libobjc to run +load methods
    in case they cause us to reenter the runtime, which was previously
    deadlocking. No test for now, because we don't have any code paths in
    llvm-jitlink itself that could lead to this deadlock. If we interpose
    calls to dlopen to go back to the JIT in the future then calling dlopen
    from a +load is the easiest way to reproduce this.
    
    rdar://133430490
    benlangmuir authored Aug 23, 2024
    fa089ef
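    The general pattern behind this fix, releasing a lock before calling out to code that may re-enter, can be sketched in isolation. The class and member names below are hypothetical and do not come from the ORC runtime; the sketch only shows why holding a non-recursive mutex across the callback would deadlock and how unlocking around it avoids that.

    ```cpp
    #include <functional>
    #include <mutex>

    // Hypothetical stand-in for a runtime with one state mutex.
    class RuntimeState {
    public:
      // Runs `initializer` with the state mutex released, so the callback
      // may safely call back into registerObject().
      void runInitializer(const std::function<void()> &initializer) {
        std::unique_lock<std::mutex> lock(stateMutex_);
        ++started_;
        lock.unlock();  // release before calling out
        initializer();  // may re-enter this object
        lock.lock();    // re-acquire to update state
        ++completed_;
      }

      // Re-entrant entry point; takes the same mutex.
      void registerObject() {
        std::lock_guard<std::mutex> lock(stateMutex_);
        ++registered_;
      }

      int registered() {
        std::lock_guard<std::mutex> lock(stateMutex_);
        return registered_;
      }

      int completed() {
        std::lock_guard<std::mutex> lock(stateMutex_);
        return completed_;
      }

    private:
      std::mutex stateMutex_;
      int started_ = 0;
      int completed_ = 0;
      int registered_ = 0;
    };
    ```

    The trade-off is that state may change while the lock is released, so anything read before the callback has to be treated as stale afterwards.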
  76. Implement resource binding type prefix mismatch diagnostic infrastruc…

    …ture (llvm#97103)
    
    There are currently no diagnostics being emitted for when a resource is
    bound to a register with an incorrect binding type prefix. For example,
    a CBuffer type resource should be bound with a binding type prefix of
    'b', but if instead the prefix is 'u', no errors will be emitted. This
    PR implements such diagnostics. The focus of this PR is to implement
    both the flag setting and diagnostic emission steps specified in the
    relevant spec: microsoft/hlsl-specs#230
    The relevant issue is: llvm#57886
    This is a continuation / refresh of this PR:
    llvm#87578
    bob80905 authored Aug 23, 2024
    ebc4a66
  77. [mlir][sparse] partially support lowering sparse coiteration loops to…

    … scf.while/for. (llvm#105565)
    Peiming Liu authored Aug 23, 2024
    f607102
  78. [Flang][OpenMP] Align map clause generation and fix issue with non-sh…

    …ared allocations for assumed shape/size descriptor types (llvm#97855)
    
    This PR aims to unify the map argument generation behavior across both
    the implicit capture (captured in a target region) and the explicit
    capture (process map). Currently the varPtr field of the MapInfo for the
    same variable will be different depending on how it's captured. This PR
    tries to align that across the generations of MapInfoOp in the OpenMP
    lowering.
    
    Currently, I have opted to utilise the rawInput (input memref to a HLFIR
    DeclareInfoOp) as opposed to the addr field which includes more
    information. The side effect of this is that we have to deal with
    BoxTypes less often, which will result in simpler maps in these cases.
    The negative side effect of this is that we don't have access to the
    bounds information through the resulting value, however, I believe the
    bounds information we require in our case is still appropriately stored
    in the map bounds, and this seems to be the case from testing so far.
    
    The other fix is for cases where we end up with a BoxType argument into
    a function (certain assumed shape and sizes cases do this) that has no
    fir.ref wrapping it. As we need the Box to be a reference type to
    actually utilise the operation to access the base address stored inside
    and create the correct mappings we currently generate an intermediate
    allocation in these cases, and then store into it, and utilise this as
    the map argument, as opposed to the original.
    
    However, as we were not sharing the same intermediate allocation across
    all of the maps for a variable, this resulted in errors in certain cases
    when detaching/attaching the data e.g. via enter and exit. This PR
    adjusts this for cases
    
    Currently we only maintain tracking of all intermediate allocations for
    the current function scope, as opposed to module. Primarily as the only
    case I am aware of that this is required is in cases where we pass
    certain types of arguments to functions (so I opted to minimize the
    overhead of the pass for now). It could likely be extended to module
    scope if required if we find other cases where it's applicable and
    causing issues.
    agozillon authored Aug 23, 2024
    f4cf93f
  79. d86349c
  80. b7c1be1
  81. 3c0fba4
  82. 9e9e823
  83. Revert "Revert "[lldb][swig] Use the correct variable in the return s…

    …tatement""
    
    This reverts commit 7323e7e.
    adrian-prantl committed Aug 23, 2024
    ad75775
  84. 11d2de4
  85. [TableGen] Refactor SequenceToOffsetTable class (llvm#104986)

    - Replace use of std::isalnum/ispunct with StringExtras version to avoid
    possibly locale dependent behavior.
    - Remove `static` from printChar (so it's deduplicated when linking).
    - Use range based for loops and structured bindings.
    - No need to use `llvm::` for code in llvm namespace.
    jurahul authored Aug 23, 2024
    a968ae6
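    The first bullet above concerns character classification that can vary with the C locale. As a rough illustration of the locale-independent alternative, here is a standalone sketch with hypothetical function names; it is not the StringExtras implementation, only the same idea expressed with plain ASCII range checks.

    ```cpp
    // Locale-independent ASCII classification (hypothetical names).
    // Unlike std::isalnum, these never consult the current C locale and
    // are well defined for every char value, including negative ones.
    constexpr bool isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    constexpr bool isAsciiAlpha(char c) {
      return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    constexpr bool isAsciiAlnum(char c) {
      return isAsciiDigit(c) || isAsciiAlpha(c);
    }
    ```

    A generator that decides how to print a character based on such checks produces the same output regardless of the environment it runs in, which is the property the refactoring is after.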
  86. [mlir][sparse] refactoring sparse_tensor.iterate lowering pattern imp…

    …lementation. (llvm#105566)
    Peiming Liu authored Aug 23, 2024
    7186704
  87. [Clang] Assert non-null enum definition in CGDebugInfo::CreateTypeDef…

    …inition(const EnumType*) (llvm#105556)
    
    This commit adds an assert to check for a non-null enum definition in
    CGDebugInfo::CreateTypeDefinition(const EnumType*), ensuring
    precondition validity.
    
    Previous discussion on llvm#97105
    smanna12 authored Aug 23, 2024
    8f08b75
  88. [flang][runtime] Add FLANG_RUNTIME_NO_REAL_3 flag to build (llvm#105856)

    Allow a runtime build to disable SELECTED_REAL_KIND from returning kind
    3 (16-bit truncated form of 32-bit IEEE-754 floating point, a/k/a "brain
    float" or bfloat16).
    klausler authored Aug 23, 2024
    57b89fd
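    The commit message describes kind 3 as the 16-bit truncated form of a 32-bit IEEE-754 value. The relationship between the two formats can be sketched as follows. The function names are hypothetical and this is not code from the flang runtime; it assumes `float` is a 32-bit IEEE-754 type and uses plain truncation, whereas real conversions often round instead.

    ```cpp
    #include <cstdint>
    #include <cstring>

    // Keep the upper 16 bits of a binary32 value: sign, 8 exponent bits,
    // and the top 7 fraction bits. The low 16 fraction bits are dropped.
    std::uint16_t truncateToBFloat16(float value) {
      static_assert(sizeof(float) == sizeof(std::uint32_t),
                    "sketch assumes a 32-bit float");
      std::uint32_t bits;
      std::memcpy(&bits, &value, sizeof(bits));
      return static_cast<std::uint16_t>(bits >> 16);
    }

    // Widen back by placing the 16 bits in the upper half and zero-filling.
    float widenFromBFloat16(std::uint16_t half) {
      std::uint32_t bits = static_cast<std::uint32_t>(half) << 16;
      float value;
      std::memcpy(&value, &bits, sizeof(value));
      return value;
    }
    ```

    Because the exponent field is unchanged, the format keeps the range of a 32-bit float while giving up most of its precision.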
  89. [LLD][COFF] Add support for CHPE redirection metadata. (llvm#105739)

    This is part of CHPE metadata containing a sorted list of x86_64 export
    thunks RVAs and RVAs of ARM64EC functions associated with them. It's
    stored in a dedicated .a64xrm section.
    cjacek authored Aug 23, 2024
    caa844e
  90. ceb587a
  91. [mlir][SCF] Allow canonicalization of zero-trip count scf.forall with empty mapping. (llvm#105793)
    
    Current folding of one-trip count loop does not kick in with an empty
    mapping. Enable this for empty mapping.
    
    Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
    MaheshRavishankar authored Aug 23, 2024
    00620ab
  92. [DXIL][Analysis] Uniquify duplicate resources in DXILResourceAnalysis

    If a resource is used multiple times, we should only have one resource record
    for it. This comes up most prominently with arrays of resources like so:
    
    ```hlsl
    RWBuffer<float4> BufferArray[10] : register(u0, space4);
    RWBuffer<float4> B1 = BufferArray[0];
    RWBuffer<float4> B2 = BufferArray[SomeIndex];
    RWBuffer<float4> B3 = BufferArray[3];
    ```
    
    In this case, there's only one resource, but we'll generate 3 different
    `dx.handle.fromBinding` calls to access different slices.
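
    A minimal sketch of the uniquifying idea, assuming a resource can be identified by its register space, lower bound and array size. `Binding` and `uniquify` are hypothetical names for illustration, not the DXILResourceAnalysis API.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <set>
    #include <tuple>
    #include <vector>

    // Hypothetical stand-in for the binding a handle-creation call refers to.
    struct Binding {
      uint32_t Space;
      uint32_t LowerBound;
      uint32_t Size;
      bool operator<(const Binding &O) const {
        return std::tie(Space, LowerBound, Size) <
               std::tie(O.Space, O.LowerBound, O.Size);
      }
    };

    // Collapse the bindings seen at every call site into one record per
    // distinct resource. The result is sorted by (Space, LowerBound, Size).
    std::vector<Binding> uniquify(const std::vector<Binding> &Seen) {
      std::set<Binding> Unique(Seen.begin(), Seen.end());
      return std::vector<Binding>(Unique.begin(), Unique.end());
    }
    ```

    With the array example above, three accesses into the same `u0, space4` array would all carry the same binding and collapse to a single record.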
    
    Note that this adds some API that won't be used until llvm#104447 later in the
    stack. Trying to avoid that results in unnecessary churn.
    
    Fixes llvm#105143
    
    Pull Request: llvm#105602
    bogner authored Aug 23, 2024
    782bc4f
  93. a0fac6f
  94. [LLD][COFF] Add support for CHPE code ranges metadata. (llvm#105741)

    This is part of CHPE metadata containing a sorted list of x86_64 export
    thunks RVAs and sizes.
    cjacek authored Aug 23, 2024
    52a7116
  95. Deprecate -fheinous-gnu-extensions; introduce a new warning flag (llvm#105821)
    
    The new warning flag is `-Winvalid-gnu-asm-cast`, which is enabled by
    default and is a downgradable diagnostic which defaults to an error.
    
    This language dialect flag only controls whether a single diagnostic is
    emitted as a warning or as an error, and has never been expanded to
    include other behaviors. Given the rather pejorative name, it's better
    for us to just expose a diagnostic flag for the one warning in question
    and let the user elect to do `-Wno-error=` if they need to.
    
    There's not a lot of use of the language dialect flag in the wild, but
    there is some use of it. For the time being, this aliases the -f flag to
    `-Wno-error=invalid-gnu-asm-cast`, but the -f flag can eventually be
    removed.
    AaronBallman authored Aug 23, 2024
    c505ce9
  96. a74f0ab
  97. [DirectX] Lower @llvm.dx.handle.fromBinding to DXIL ops

    The `@llvm.dx.handle.fromBinding` intrinsic is lowered either to the
    `CreateHandle` op or a pair of `CreateHandleFromBinding` and `AnnotateHandle`
    ops, depending on the DXIL version. Regardless of the DXIL version we need to
    emit metadata about the binding, but that's left to a separate change.
    
    These DXIL ops all need to return the `%dx.types.Handle` type, but the llvm
    intrinsic returns a target extension type. To facilitate changing the type of
    the operation and all of its users, we introduce `%llvm.dx.cast.handle`, which
    can cast between the two handle representations.
    
    Pull Request: llvm#104251
    bogner authored Aug 23, 2024
    aa61925
  98. [GDBRemote] Fix processing of comma-separated memory region entries (llvm#105873)
    
    The existing algorithm was performing the following comparisons for an
    input such as `aaa,bbb,ccc,ddd`:
    
    aaa\0bbb,ccc,ddd == "stack"
    aaa\0bbb\0ccc,ddd == "stack"
    aaa\0bbb\0ccc\0ddd == "stack"
    
    Which wouldn't work. This commit just dispatches to a known algorithm
    implementation.
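
    For contrast with the comparisons listed above, here is a hedged sketch of what a correct per-entry check looks like: split on commas first, then compare each piece on its own. `splitOnComma` and `hasEntry` are hypothetical helpers written for this illustration; the commit itself delegates to an existing implementation rather than hand-rolling one.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string_view>
    #include <vector>

    // Split "aaa,bbb,ccc" into {"aaa", "bbb", "ccc"} without copying.
    std::vector<std::string_view> splitOnComma(std::string_view S) {
      std::vector<std::string_view> Parts;
      while (true) {
        std::size_t Pos = S.find(',');
        Parts.push_back(S.substr(0, Pos)); // npos means "rest of the string"
        if (Pos == std::string_view::npos)
          break;
        S.remove_prefix(Pos + 1);
      }
      return Parts;
    }

    // Compare every entry individually against the wanted name.
    bool hasEntry(std::string_view List, std::string_view Wanted) {
      for (std::string_view Part : splitOnComma(List))
        if (Part == Wanted)
          return true;
      return false;
    }
    ```

    Here `hasEntry("aaa,stack,ccc", "stack")` finds the middle entry, which the prefix-style comparisons shown above could never match.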
    felipepiovezan authored Aug 23, 2024
    8b4147d
  99. [nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPropertiesAnalysis (llvm#104867)
    
    We need the dominator tree analysis for loop info analysis, which we need to get features like most nested loop and number of top level loops. Invalidating and recomputing these from scratch after each successful inlining can sometimes lead to lengthy compile times. We don't need to recompute from scratch, though, since we have some boundary information about where the changes to the CFG happen; moreover, for dom tree, the API supports incrementally updating the analysis result.
    
    This change addresses the dom tree part. The loop info is still recomputed from scratch. This does reduce the compile time quite significantly already, though (~5x in a specific case)
    
    The loop info change might be more involved and would follow in a subsequent PR.
    mtrofin authored Aug 23, 2024
    a2a5508
  100. [mlir][Linalg] Avoid doing op replacement in linalg::dropUnitDims. (llvm#105749)
    
    It is better to do the replacement in the caller. This avoids the
    footgun if the caller needs the original operation. Instead return the
    produced operation and replacement values.
    
    Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
    MaheshRavishankar authored Aug 23, 2024
    4dbaef6
  101. [mlir][Transforms] Dialect conversion: Make materializations optional (llvm#104668)
    
    This commit makes source/target/argument materializations (via the
    `TypeConverter` API) optional.
    
    By default (`ConversionConfig::buildMaterializations = true`), the
    dialect conversion infrastructure tries to legalize all unresolved
    materializations right after the main transformation process has
    succeeded. If at least one unresolved materialization fails to resolve,
    the dialect conversion fails. (With an error message such as `failed to
    legalize unresolved materialization ...`.) Automatic materializations
    through the `TypeConverter` API can now be deactivated. In that case,
    every unresolved materialization will show up as a
    `builtin.unrealized_conversion_cast` op in the output IR.
    
    There used to be a complex and error-prone analysis in the dialect
    conversion that predicted the future uses of unresolved
    materializations. Based on that logic, some casts (that were deemed
    unnecessary) were folded. This analysis was needed because folding
    happened at a point of time when some IR changes (e.g., op replacements)
    had not materialized yet.
    
    This commit removes that analysis. Any folding of cast ops now happens
    after all other IR changes have been materialized and the uses can
    directly be queried from the IR. This simplifies the analysis
    significantly. And certain helper data structures such as
    `inverseMapping` are no longer needed for the analysis. The folding
    itself is done by `reconcileUnrealizedCasts` (which also exists as a
    standalone pass).
    
    After casts have been folded, the remaining casts are materialized
    through the `TypeConverter`, as usual. This last step can be deactivated
    in the `ConversionConfig`.
    
    `ConversionConfig::buildMaterializations = false` can be used to debug
    error messages such as `failed to legalize unresolved materialization
    ...`. (It is also useful in case automatic materializations are not
    needed.) The materializations that failed to resolve can then be seen as
    `builtin.unrealized_conversion_cast` ops in the resulting IR. (This is
    better than running with `-debug`, because `-debug` shows IR where some
    IR changes have not been materialized yet.)
    matthias-springer authored Aug 23, 2024
    d7073c5
  102. [rtsan][compiler-rt] Prevent UB hang in rtsan lock unit tests (llvm#104733)
    
    It is undefined behavior to lock or unlock an uninitialized lock, or to
    unlock a lock that isn't locked.
    
    Introduce a fixture to set up and tear down the locks where
    appropriate, and separate the checks into two tests (realtime death and
    non-realtime survival) so each test is guaranteed a fresh lock.
    cjappl authored Aug 23, 2024
    64afbf0
  103. [Bitcode] Use DenseSet instead of std::set (NFC) (llvm#105851)

    DefOrUseGUIDs is used only for membership checking purposes.  We don't
    need std::set's strengths like iterators staying valid or the ability
    to traverse in a sorted order.
    
    While I am at it, this patch replaces count with contains for slightly
    increased readability.
    kazutakahirata authored Aug 23, 2024
    3b703d4
  104. [InstCombine] Fold `(x < y) ? -1 : zext(x > y)` and `(x > y) ? 1 : sext(x < y)` to `ucmp/scmp(x, y)` (llvm#105272)
    
    This patch expands already existing functionality to include these two
    additional folds, which are nearly identical to the ones already
    implemented.
    
    Proofs: https://alive2.llvm.org/ce/z/Xy7s4j
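
    As an informal sanity check alongside the Alive2 proofs, the two select patterns can be modeled in plain C++ and compared with a three-way comparison. This is only a sketch for the unsigned case; the function names are made up for the example and the `int` return type stands in for the wider result type.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Reference three-way unsigned comparison: -1, 0 or 1.
    int ucmpRef(uint32_t X, uint32_t Y) { return (X > Y) - (X < Y); }

    // Models (x < y) ? -1 : zext(x > y); zero-extending a true i1 yields 1.
    int selectZext(uint32_t X, uint32_t Y) {
      return (X < Y) ? -1 : static_cast<int>(X > Y);
    }

    // Models (x > y) ? 1 : sext(x < y); sign-extending a true i1 yields -1.
    int selectSext(uint32_t X, uint32_t Y) {
      return (X > Y) ? 1 : -static_cast<int>(X < Y);
    }
    ```

    Both models agree with `ucmpRef` for every pair of inputs, which is the equivalence the fold relies on.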
    Poseydon42 authored Aug 23, 2024
    da6f423
  105. [compiler-rt][nsan] Add support for nan detection (llvm#101531)

    Add support for nan detection.
    llvm#100305
    cseslowpoke authored Aug 23, 2024
    283dff4
  106. [mlir][sparse] unify block arguments order between iterate/coiterate operations. (llvm#105567)

    Peiming Liu authored Aug 23, 2024
    b48ef8d
  107. [SPIRV] Fix return type mismatch for createSPIRVEmitNonSemanticDIPass (llvm#105889)
    
    The declaration in SPIRV.h had this returning a `MachineFunctionPass *`,
    but the implementation returned a `FunctionPass *`. This showed up as a
    build error on windows, but it was clearly a mistake regardless.
    
    I also updated the pass to include SPIRV.h rather than using its own
    declarations for pass initialization, as this results in better errors
    for this kind of typo.
    
    Fixes a build break after llvm#97558
    bogner authored Aug 23, 2024
    3e763db
  108. Reland "[asan] Remove debug tracing from report_globals (llvm#104404)" (llvm#105895)
    
    Reland llvm#104404.
    
    In addition to llvm#104404, this raises the verbosity required for
    stack tracing on global registration; that tracing confuses a
    symbolizer test on Darwin.
    
    This reverts commit 6a8f738.
    vitalybuka authored Aug 23, 2024
    10407be
  109. [mlir][tensor] Add TilingInterface support for fusing tensor.pad (llvm#105892)
    
    This adds implementations for the two TilingInterface methods required
    for fusion to `tensor.pad`: `getIterationDomainTileFromResultTile` and
    `generateResultTileValue`, allowing fusion of pad with a tiled consumer.
    qedawkins authored Aug 23, 2024
    91e57c6
  110. Fix bot failures after PR llvm#104867

    An assert was left over after addressing feedback. While fixing that, I
    realized the way I had addressed the feedback was also incomplete.
    mtrofin committed Aug 23, 2024
    cdd11d6
  111. ca53611

Commits on Aug 24, 2024

  1. [IR] Introduce ModuleToSummariesForIndexTy (NFC) (llvm#105906)

    This patch introduces type alias ModuleToSummariesForIndexTy.
    
    I'm planning to change the type slightly to allow heterogeneous lookup
    (that is, std::map<K, V, std::less<>>) in a subsequent patch.  The
    problem is that changing the type affects many places.  Using a type
    alias reduces the impact.
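
    To illustrate why heterogeneous lookup is attractive, here is a small sketch with a transparent comparator. The alias name `CountsByModuleTy` and the `int` value type are placeholders for the example, not the actual definition of ModuleToSummariesForIndexTy.

    ```cpp
    #include <cassert>
    #include <functional>
    #include <map>
    #include <string>
    #include <string_view>

    // Call sites name the alias, so switching the comparator is a one-line change.
    using CountsByModuleTy = std::map<std::string, int, std::less<>>;

    // With std::less<>, find() accepts a std::string_view directly and does
    // not need to build a temporary std::string for the key.
    int lookupOrZero(const CountsByModuleTy &M, std::string_view Key) {
      auto It = M.find(Key);
      return It == M.end() ? 0 : It->second;
    }
    ```

    With the default `std::less<std::string>` comparator the same `find(Key)` call would not compile for a `std::string_view`, because the conversion to `std::string` is explicit.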
    kazutakahirata authored Aug 24, 2024
    dbd7ce0
  2. 1f89cd4
  3. [include-cleaner] Turn new/delete usages to ambiguous references (llvm#105844)
    
    In practice most of these expressions just resolve to implicitly
    provided `operator new` and standard says it's not necessary to include
    `<new>` for that.
    Hence this is resulting in a lot of churn in cases where inclusion of
    `<new>` doesn't matter, and might even be undesired by the developer.
    
    By switching to an ambiguous reference we try to find a middle ground
    here: we don't drop providers of `operator new` when the developer
    explicitly listed them in the includes, and otherwise we assume it's
    the implicitly provided `operator new` and don't insert an include.
    kadircet authored Aug 24, 2024
    74b538d
  4. [clang-format] Treat new expressions as simple functions (llvm#105168)

    ccae7b4 improved handling for nested
    calls, but this resulted in a lot of changes near `new` expressions.
    
    This patch tries to restore previous behavior around new expressions, by
    treating them as simple functions, which seems to align with the concept.
    
    Fixes llvm#105133.
    kadircet authored Aug 24, 2024
    e439fdf
  5. [SandboxIR] Implement CleanupReturnInst (llvm#105750)

    This patch implements sandboxir::CleanupReturnInst mirroring
    llvm::CleanupReturnInst.
    vporpo authored Aug 24, 2024
    d021321
  6. [StableHash] Implement with xxh3_64bits (llvm#105849)

    This is a follow-up to address a suggestion from
    llvm#105619.
    The main goal of this change is to efficiently implement stable hash
    functions using the xxh3 64bits API.
    `stable_hash_combine_range` and `stable_hash_combine_array` functions
    are removed and consolidated into a more general `stable_hash_combine`
    function that takes an `ArrayRef<stable_hash>` as input.
    kyulee-com authored Aug 24, 2024
    7615c0b
  7. [docs] Fix links in github user guide - graphite section

    Mistakenly used markdown style rather than rst in llvm#104499.
    mtrofin authored Aug 24, 2024
    6260125
  8. 75ef955
  9. [clang][bytecode] Fix IntegralAP::is{Positive,Negative} (llvm#105924)

    This depends on signed-ness.
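
    A toy model of the point being made, assuming an 8-bit value: the same bit pattern is negative only when it is interpreted as signed. `Int8Bits` and both helpers are hypothetical and far simpler than IntegralAP.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Hypothetical model: a raw 8-bit pattern plus a signedness flag.
    struct Int8Bits {
      uint8_t Bits;
      bool Signed;
    };

    // Only a signed value with the top bit set is negative.
    bool isNegative(Int8Bits V) { return V.Signed && (V.Bits & 0x80) != 0; }

    // In this model every value that is not negative counts as positive.
    bool isPositive(Int8Bits V) { return !isNegative(V); }
    ```

    The pattern `0xFF` is therefore negative as a signed value and positive as an unsigned one.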
    tbaederr authored Aug 24, 2024
    c81d666
  10. 68030f8
  11. 62e7b59
  12. e185850
  13. [Clang] Overflow Pattern Exclusion - rename some patterns, enhance docs (llvm#105709)
    
    From @vitalybuka's review on
    llvm#104889:
    - [x] remove unused variable in tests
    - [x] rename `post-decr-while` --> `unsigned-post-decr-while`
    - [x] split `add-overflow-test` into `add-unsigned-overflow-test` and
    `add-signed-overflow-test`
    - [x] be more clear about defaults within docs
    - [x] add table to docs
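
    For readers unfamiliar with the idioms behind these names, the sketch below shows the kind of code the unsigned patterns appear to refer to. The mapping from pattern name to idiom is my reading of the names, and the function names are invented for the example; only the well-defined unsigned cases are shown.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Idiom in the style of `add-unsigned-overflow-test`: the sum wraps
    // around on overflow, so comparing it with an operand detects the wrap.
    bool addWouldWrap(uint32_t A, uint32_t B) { return A + B < A; }

    // Idiom in the style of `unsigned-post-decr-while`: the loop runs N
    // times, and the last post-decrement wraps N from 0 to UINT32_MAX.
    uint32_t countIterations(uint32_t N) {
      uint32_t Iterations = 0;
      while (N--)
        ++Iterations;
      return Iterations;
    }
    ```

    Both functions rely on intentional unsigned wraparound, which is the sort of code a sanitizer would otherwise flag.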
    
    Here's a screenshot of the rendered table so you don't have to build the
    html docs yourself to inspect the layout:
    
    ![image](https://github.com/user-attachments/assets/5d3497c4-5f5a-4579-b29b-96a0fd192faa)
    
    
    CCs: @vitalybuka
    
    ---------
    
    Signed-off-by: Justin Stitt <justinstitt@google.com>
    Co-authored-by: Vitaly Buka <vitalybuka@google.com>
    JustinStitt and vitalybuka authored Aug 24, 2024
    76236fa
  14. [clang][bytecode][NFC] Add an additional assertion (llvm#105927)

    Since this must be true, add an assertion instead of just documenting it
    via the comment.
    tbaederr authored Aug 24, 2024
    99b85ca
  15. [InstCombine] Update the select operand when the cond is trunc and has the `nuw` or `nsw` property. (llvm#105914)
    
    This patch updates the select operand when the cond has the nuw or nsw
    property. Considering the semantics of the nuw and nsw flag, if there is
    no poison value in this expression, this code assumes that X can only be
    0, 1 or -1.
    
    close: llvm#96765
    alive2: https://alive2.llvm.org/ce/z/3n3n2Q
    c8ef authored Aug 24, 2024
    43c6fb2
  16. [Tests] Attempt to fix PowerPC buildbots.

    The intent is that the tests should not be running on PowerPC as the fp128 type
    will differ. This attempts to fix the bots by using __powerpc__ instead, which
    appears to be defined in godbolt.
    davemgreen committed Aug 24, 2024
    001e423
  17. [RISCV] Don't move source if passthru already dominates in vmv.v.v peephole (llvm#105792)
    
    Currently we move the source down to the vmv.v.v to make sure that the
    new passthru dominates, but we do this even if it already does.
    
    This adds a simple local dominance check (taken from
    X86FastPreTileConfig.cpp) and avoids doing the move if it can.
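
    A local dominance check of this kind can be sketched as a single walk over the block: whichever of the two instructions is reached first dominates the other. The model below uses strings as stand-ins for instructions and a hypothetical `dominatesLocally` helper; it is not the code from either file.

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    // Within one basic block, A dominates B if A is reached first when
    // walking from the top. Both are assumed to be in the block.
    bool dominatesLocally(const std::vector<std::string> &Block,
                          const std::string &A, const std::string &B) {
      for (const std::string &Inst : Block) {
        if (Inst == A)
          return true;
        if (Inst == B)
          return false;
      }
      return false; // Neither instruction was found in this block.
    }
    ```

    If the passthru definition already precedes the source in this sense, no move is needed.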
    
    It also modifies the move to only move it to just past the passthru
    definition, and not all the way down to the vmv.v.v.
    
    This allows folding to succeed in some edge cases, which prevents
    regressions in an upcoming patch.
    lukel97 authored Aug 24, 2024
    be5ecc3
  18. [VPlan] Wrap planContainsAdditionalSimplifications in NDEBUG (NFC)

    Only used for an assertion.
    fhahn committed Aug 24, 2024
    40975da
  19. [ConstantFolding] Ensure TLI is valid when simplifying fp128 intrinsics.

    TLI might not be valid in every context where constant folding is performed. Add
    a quick guard that it is not null.
    davemgreen committed Aug 24, 2024
    83a5c7c
  20. 08acc3f
  21. [lit] Export env vars in script to avoid pruning (llvm#105759)

    On macOS the dynamic loader prunes dyld specific environment variables
    such as `DYLD_INSERT_LIBRARIES`, `DYLD_LIBRARY_PATH`, etc. If these are
    set in the lit config it's safe to assume that the user actually wanted
    their subprocesses to run with these variables, versus the python
    interpreter that gets executed with them before they are pruned. This
    change exports all known variables in the shell script instead of
    relying on them being passed through.
    keith authored Aug 24, 2024
    65b7cbb
  22. Update Python requirements to fix more CVEs (llvm#105853)

    Followup to llvm#90109.
    
    In Microsoft, our automated scans are warning that LLVM has vulnerable
    dependencies. Specifically:
    
    * [CVE-2024-35195](https://nvd.nist.gov/vuln/detail/CVE-2024-35195) was
    fixed in `requests` 2.32.0.
    * [CVE-2024-37891](https://nvd.nist.gov/vuln/detail/CVE-2024-37891) was
    fixed in `urllib3` 2.2.2.
    
    I've updated LLVM's dependencies by running the following commands in
    `llvm/utils/git`:
    
    ```
    pip-compile --upgrade --generate-hashes --output-file=requirements.txt requirements.txt.in
    pip-compile --upgrade --generate-hashes --output-file=requirements_formatting.txt requirements_formatting.txt.in
    ```
    
    Note that for `requirements_formatting.txt` this adds
    `--generate-hashes` (according to my vague understanding, it's highly
    desirable and was already used for `requirements.txt`) and was locally
    run within `llvm/utils/git` (changing the recorded command, which
    apparently was originally run from the repo root - again,
    `requirements.txt` was already being regenerated with a locally run
    command, so this increases consistency).
    
    I observe that this has updated the relevant components to pick up the
    CVE fixes. Note that I am largely clueless in this area, so I hope that
    (like llvm#90109) no other changes will be necessary.
    StephanTLavavej authored Aug 24, 2024
    7036394
  23. [libc++][test] Fix msvc_is_lock_free_macro_value() (llvm#105876)

    Followup to llvm#99570.
    
    * `TEST_COMPILER_MSVC` must be tested for `defined`ness, as it is
    everywhere else.
    + Definition:
    https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/support/test_macros.h#L71-L72
    + Example usage:
    https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/std/utilities/function.objects/func.not_fn/not_fn.pass.cpp#L248
    + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(33): fatal
    error C1017: invalid integer constant expression`
    * Fix bogus return type: `msvc_is_lock_free_macro_value()` returns `2`
    or `0`, so it needs to return `int`.
    + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(41): warning
    C4305: 'return': truncation from 'int' to 'bool'`
    * Clarity improvement: also add parens when mixing bitwise with
    arithmetic operators.
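
    The first two fixes can be illustrated with a short sketch. It assumes the compiler-detection macro is defined without a value, which is consistent with the C1017 error quoted above; `HYPOTHETICAL_COMPILER_FLAG` and `lockFreeValue` are stand-ins, not the names used in `atomic_helpers.h`.

    ```cpp
    #include <cassert>

    // Defined, but expands to nothing (assumed shape of the real macro).
    #define HYPOTHETICAL_COMPILER_FLAG

    // `#if HYPOTHETICAL_COMPILER_FLAG` would be ill-formed because the
    // macro expands to an empty expression; `defined(...)` is always valid.
    #if defined(HYPOTHETICAL_COMPILER_FLAG)
    constexpr bool FlagSeen = true;
    #else
    constexpr bool FlagSeen = false;
    #endif

    // The result is 2 or 0, so the return type must be int; a bool return
    // type would collapse 2 to true.
    constexpr int lockFreeValue(bool AlwaysLockFree) {
      return AlwaysLockFree ? 2 : 0;
    }
    ```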
    StephanTLavavej authored Aug 24, 2024
    886b761
  24. a5d89d5
  25. [llvm][NVPTX] Fix RAUW bug in NVPTXProxyRegErasure (llvm#105871)

    Fix bug introduced in llvm#105730
    
    The bug is in how the batch RAUW is implemented. If we have 
    
    ```
    %0 = mov %src
    %1 = mov %0
    
    use %0
    use %1
    ```
    
    The use of `%1` is rewritten to `%0`, not `%src`. This PR just looks for
    a replacement when it maps to the src register, which should
    transitively propagate the replacements.
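
    One way to picture the transitive behavior is a lookup that keeps following recorded replacements until it reaches a register that has none. This is a hedged model with a hypothetical `resolve` helper and string register names, not the pass's actual code, and it assumes the replacement chain has no cycles.

    ```cpp
    #include <cassert>
    #include <map>
    #include <string>

    // Follow the chain of replacements starting at Reg until a register
    // without a recorded replacement is reached.
    std::string resolve(const std::map<std::string, std::string> &Replacements,
                        std::string Reg) {
      auto It = Replacements.find(Reg);
      while (It != Replacements.end()) {
        Reg = It->second;
        It = Replacements.find(Reg);
      }
      return Reg;
    }
    ```

    For the example above, with `%0 -> %src` and `%1 -> %0` recorded, resolving `%1` yields `%src` rather than stopping at `%0`.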
    Mogball authored Aug 24, 2024
    31b4bf9
  26. [DAG][RISCV] Use vp_reduce_fadd/fmul when widening types for FP reductions (llvm#105840)
    
    This is a follow up to llvm#105455 which updates the VPIntrinsic mappings
    for the fadd and fmul cases, and supports both ordered and unordered
    reductions. This allows the use of a single wider operation with a
    restricted EVL instead of padding the vector with the neutral element.
    
    This has all the same tradeoffs as the previous patch.
    preames authored Aug 24, 2024
    2cb25d5
  27. d252365
  28. Update my email

    ecnelises committed Aug 24, 2024
    6f618a7
  29. 9f82f6d
  30. [clang-cl] [AST] Reapply llvm#102848 Fix placeholder return type name mangling for MSVC 1920+ / VS2019+ (llvm#104722)
    
    Reapply llvm#102848.
    
    The description in this PR will detail the changes from the reverted
    original PR above.
    
    For `auto&&` return types that can partake in reference collapsing we
    weren't properly handling the mangling that can arise.
    When collapsing occurs an inner reference is created with the collapsed
    reference type. If we return `int&` from such a function then an inner
    reference of `int&` is created within the `auto&&` return type.
    `getPointeeType` on a reference type goes through all inner references
    before returning the pointee type which ends up being a builtin type,
    `int`, which is unexpected.
    
    We can use `getPointeeTypeAsWritten` to get the `AutoType` as expected
    however for the instantiated template declaration reference collapsing
    already occurred on the return type. This means `auto&&` is turned into
    `auto&` in our example above.
    We end up mangling an lvalue reference type.
    This is unintended as MSVC mangles on the declaration of the return
    type, `auto&&` in this case, which is treated as an rvalue reference.
    ```
    template<class T>
    auto&& AutoReferenceCollapseT(int& x) { return static_cast<int&>(x); }
    
    void test() 
    {
        int x = 1;
        auto&& rref = AutoReferenceCollapseT<void>(x); // "??$AutoReferenceCollapseT@X@@ya$$QEA_PAEAH@Z"
        // Mangled as an rvalue reference to auto
    }
    ```
    
    If we are mangling a template with a placeholder return type we want to
    get the first template declaration and use its return type to do the
    mangling of any instantiations.
    
    This fixes the bug reported in the original PR that caused the revert
    with libcxx `std::variant`.
    I also tested locally with libcxx and the following test code which
    fails in the original PR but now works in this PR.
    ```
    #include <variant>
    
    void test()
    {
        std::variant<int> v{ 1 };
        int& r = std::get<0>(v);
        (void)r;
    }
    ```
    MaxEW707 authored Aug 24, 2024
    43b8885
  31. [AArch64] Replace AND with LSL#2 for LDR target (llvm#34101) (llvm#89531)
    
    Currently, the replacement of bitwise operations consisting of
    `LSR`/`LSL` with `AND` is performed by `DAGCombiner`.
    
    However, in certain cases, the `AND` generated by this process
    can be removed.
    
    Consider the following case:
    ```
            lsr x8, x8, #56
            and x8, x8, #0xfc
            ldr w0, [x2, x8]
            ret
    ```
    
    In this case, we can remove the `AND` by changing the target of `LDR`
    to `[X2, X8, LSL #2]` and changing the right-shift amount from 56 to 58.
    
    After the change:
    ```
            lsr x8, x8, #58
            ldr w0, [x2, x8, lsl #2]
            ret
    ```
    
    This patch checks whether a shift + `AND` sequence feeding a load
    target can be optimized in this way, and optimizes it if so.
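
    The equivalence behind the example can be checked with ordinary 64-bit arithmetic. The sketch below models only the address offset computed by each instruction sequence; the function names are invented for the illustration.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Offset produced by the original sequence: lsr #56, then and #0xfc.
    uint64_t offsetBefore(uint64_t X) { return (X >> 56) & 0xfc; }

    // Offset produced by the rewritten sequence: lsr #58, with the load
    // applying lsl #2 to the index register.
    uint64_t offsetAfter(uint64_t X) { return (X >> 58) << 2; }
    ```

    Shifting right by two more bits and then left by two clears the same low two bits that the `0xfc` mask clears, so both forms give the same offset for every input.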
    ParkHanbum authored Aug 24, 2024
    77fccb3
  32. [ARM] Add VECTOR_REG_CAST identity fold.

    v16i8 VECTOR_REG_CAST (v16i8 Op) can use v16i8 Op directly, as the
    VECTOR_REG_CAST is a noop.
    davemgreen committed Aug 24, 2024
    b9a0276
  33. [Mips] Remove a trivial variable (NFC) (llvm#105940)

    We assign I->getNumOperands() to J and immediately print that out as a
    debug message.  We don't need to keep J across iterations.
    kazutakahirata authored Aug 24, 2024
    a6f87ab
  34. Revert "Enable logf128 constant folding for hosts with 128bit long double (llvm#104929)"
    
    ConstantFolding behaves differently depending on host's `HAS_IEE754_FLOAT128`.
    LLVM should not change the behavior depending on host configurations.
    
    This reverts commit 14c7e4a.
    (llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
    chapuni committed Aug 24, 2024
    3ef64f7

Commits on Aug 25, 2024

  1. 6bc225e
  2. 0916ae4
  3. 5c94dd7
  4. [RISCV][ISel] Move VCIX ISDs to correct position. NFC (llvm#105934)

    The VCIX ISDs are currently placed after FIRST_TARGET_STRICTFP_OPCODE,
    which is not expected; they should be in the normal opcode area.
    4vtomat authored Aug 25, 2024
    579fd59
  5. [CodeGen] Replace MCPhysReg with MCRegister in MachineBasicBlock::isLiveIn/removeLiveIn. NFC
    
    We already used it for addLiveIn.
    topperc committed Aug 25, 2024
    f22b1da
  6. [lldb][TypeSystemClang][NFC] Log failure to InitBuiltinTypes

    If we fail to initialize the ASTContext builtins, LLDB
    may crash in non-obvious ways down-the-line, e.g., when
    it tries to call `ASTContext::getTypeSize` on a builtin like
    `ast.UnsignedCharTy`, which would dereference a `null` `QualType`.
    
    The initialization can fail if we either didn't set the
    `TypeSystemClang` target triple, or if the embedded clang isn't
    enabled for a certain target.
    
    This patch attempts to help pin-point the failure case post-mortem
    by adding a log message here that prints the triple.
    
    rdar://134260837
    Michael137 committed Aug 25, 2024
    2847020
  7. 5136521

Commits on Sep 20, 2024

  1. f0747cd