[AutoBump] Merge with fe0dee4d (Jun 10) (62) #323

Closed
wants to merge 640 commits into from
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jun 7, 2024

  1. c007883
  2. [memprof] Use std::move in ContextEdge::ContextEdge (NFC) (llvm#94687)

    Since the constructor of ContextEdge takes ContextIds by value, we
    should move it to the corresponding member variable as suggested by
    clang-tidy's performance-unnecessary-value-param.
    
    While we are at it, this patch updates a couple of callers.  To avoid
    the ambiguity in the evaluation order among the constructor arguments,
    I'm calling computeAllocType before calling the constructor.
    kazutakahirata authored Jun 7, 2024
    b7d976d
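The by-value-then-move pattern and the evaluation-order concern from the message above can be sketched as follows. This is a minimal illustration, not the actual memprof code: `Edge`, `computeKind`, and `makeEdge` are hypothetical stand-ins for `ContextEdge`, `computeAllocType`, and a caller.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-in for ContextEdge: the ID set is taken by value (a
// "sink" parameter) and moved into the member instead of being copied again.
struct Edge {
  int Kind;
  std::set<unsigned> Ids;
  Edge(int Kind, std::set<unsigned> Ids) : Kind(Kind), Ids(std::move(Ids)) {}
};

// Hypothetical stand-in for computeAllocType: it only reads the IDs.
inline int computeKind(const std::set<unsigned> &Ids) {
  return static_cast<int>(Ids.size());
}

// Everything that reads Ids is computed before the constructor call, so the
// result does not depend on the order in which the arguments are evaluated.
inline Edge makeEdge(std::set<unsigned> Ids) {
  int Kind = computeKind(Ids);       // reads Ids while it is still intact
  return Edge(Kind, std::move(Ids)); // then hands ownership to the edge
}
```

Writing `Edge(computeKind(Ids), std::move(Ids))` instead would leave it unspecified whether the set has already been moved from when `computeKind` runs, which is why the read is hoisted into its own statement.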
  3. [ORC] Switch ExecutionSession::ErrorReporter to use unique_function.

    This allows the ReportError functor to hold move-only types.
    lhames committed Jun 7, 2024
    4a7b800
  4. f21c2fa
  5. d224a03
  6. [LoongArch] Add a pass to rewrite rd to r0 for non-computational inst…

    …rs whose return values are unused (llvm#94590)
    
    This patch adds a peephole pass `LoongArchDeadRegisterDefinitions`. It
    rewrites `rd` to `r0` when `rd` is marked as dead. It may improve the
    register allocation and reduce pipeline hazards on CPUs without register
    renaming and OOO.
    heiher authored Jun 7, 2024
    240512c
  7. [clang][Interp][NFC] Add GetPtrFieldPop opcode

    And change the previous GetPtrField to only peek() the base pointer.
    We can get rid of a whole bunch of DupPtr ops this way.
    tbaederr committed Jun 7, 2024
    c15b867
  8. [analyzer][NFC] Factor out NoOwnershipChangeVisitor (llvm#94357)

    In preparation for adding essentially the same visitor to StreamChecker,
    this patch factors this visitor out to a common header.
    
    I'll be the first to admit that the interface of these classes is not
    terrific, but it is rather tightly held back by its main technical debt,
    which is NoStoreFuncVisitor, the main descendant of
    NoStateChangeVisitor.
    
    Change-Id: I99d73ccd93a18dd145bbbc83afadbb432dd42b90
    Szelethus authored Jun 7, 2024
    e622996
  9. be18daa
  10. [docs] Fix benchmarking tips (llvm#94724)

    This PR fixes an incorrect line for setting scaling_governor in
    benchmarking tips.
    maekawatoshiki authored Jun 7, 2024
    8ef5c98
  11. 36bc741
  12. [clang][Interp] Remove StorageKind limitation in Pointer assign operators

    It's not strictly needed and did cause some test failures.
    tbaederr committed Jun 7, 2024
    1c0063b
  13. ac40463
  14. [MLIR] Translate DIStringType. (llvm#94480)

    This PR handles translation of DIStringType. Mostly mechanical changes to
    translate DIStringType to/from DIStringTypeAttr. The 'stringLength'
    field is 'DIVariable' in DIStringType. As there was no `DIVariableAttr`
    previously, it has been added to ease the translation.
    
    ---------
    
    Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
    abidh and gysit authored Jun 7, 2024
    4f320e6
  15. 5f1adf0
  16. [flang][Transforms][NFC] Remove boilerplate from vscale range pass (l…

    …lvm#94598)
    
    Use tablegen to generate the pass constructor.
    
    This pass is supposed to add function attributes so it does not need to
    operate on other top level operations.
    tblah authored Jun 7, 2024
    8f11649
  17. 0749b01
  18. 3453ded
  19. [ARM] Add NEON support for ISD::ABDS/ABDU nodes. (llvm#94504)

    As noted on llvm#94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns.
    
    This patch flags basic ABDS/ABDU for neon types as legal and updates all tablegen patterns to use abds/abdu instead.
    
    Fixes llvm#94466
    RKSimon authored Jun 7, 2024
    c0b4685
  20. 0d1b367
  21. [DebugInfo] Add DW_OP_LLVM_extract_bits (llvm#93990)

    This operation extracts a number of bits at a given offset and sign or
    zero extends them, which is done by emitting it as a left shift followed
    by a right shift.
    
    This is being added for use in clang for C++ structured bindings of
    bitfields that have offset or size that aren't a byte multiple. A new
    operation is being added, instead of shifts being used directly, as it
    makes correctly handling it in optimisations (which will be done in a
    later patch) much easier.
    john-brawn-arm authored Jun 7, 2024
    1721c14
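The "left shift followed by a right shift" lowering described above can be illustrated on a plain 64-bit integer. This is only a sketch of the arithmetic, not the DWARF expression that gets emitted. It assumes the offset is counted from the least significant bit and that signed right shifts are arithmetic (guaranteed since C++20, true in practice on mainstream compilers before that). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extending extract: shift the field up to the top of the word, then
// bring it back down with a logical (unsigned) right shift.
inline uint64_t extractBitsZExt(uint64_t Value, unsigned Offset,
                                unsigned Size) {
  assert(Size > 0 && Offset + Size <= 64 && "field must fit in 64 bits");
  uint64_t Shifted = Value << (64 - Offset - Size);
  return Shifted >> (64 - Size);
}

// Sign-extending extract: the same left shift, but the right shift is done
// on a signed value so the field's top bit is replicated.
inline int64_t extractBitsSExt(uint64_t Value, unsigned Offset,
                               unsigned Size) {
  assert(Size > 0 && Offset + Size <= 64 && "field must fit in 64 bits");
  int64_t Shifted = static_cast<int64_t>(Value << (64 - Offset - Size));
  return Shifted >> (64 - Size);
}
```

For example, extracting 4 bits at offset 2 from `0xB4` (binary `1011 0100`) selects the field `1101`, which reads as 13 when zero-extended and as -3 when sign-extended.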
  22. Add checks before hoisting out in loop pipelining (llvm#90872)

    Currently, during a loop pipelining transformation, operations may be
    hoisted out without any checks on the loop bounds, which leads to
    incorrect transformations and unexpected behaviour. The following
    [issue](llvm#90870) describes the problem more extensively, including
    an example. The proposed fix adds checks on the loop bounds before
    applying the maximum hoisting.
    fotiskoun authored Jun 7, 2024
    192cd68
  23. 5d6acf8
  24. [clang][Interp] Fix refers_to_enclosing_variable_or_capture DREs

    They do not count into lambda captures, so visit them lazily.
    tbaederr committed Jun 7, 2024
    3a31eae
  25. [SimplifyCFG] Remove bogus UTC line from test (NFC)

    The check lines in this test were clearly not generated by UTC.
    nikic committed Jun 7, 2024
    1934c1a
  26. [SimplifyCFG] Regenerate switch to lookup tests (NFC)

    Regenerate these with --check-globals. The manual global CHECKS
    get dropped during regeneration otherwise.
    
    Annoyingly UTC insists on putting the globals directly before the
    first function, so the first comment is a bit out of place now.
    nikic committed Jun 7, 2024
    8719cb8
  27. [mlir][vector] Add n-d deinterleave lowering (llvm#94237)

    This patch implements the lowering of vector
    deinterleave for n-dimensional vectors. The process
    involves unrolling the n-d vector into a series
    of one-dimensional vectors. The deinterleave
    operation is then applied to these vectors.
    
    From:
    ```
    %0, %1 = vector.deinterleave %arg0 : vector<2x8xi32> -> vector<2x4xi32>
    ```
    
    To:
    ```
    %cst = arith.constant dense<0> : vector<2x4xi32>
    %0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
    %res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
    %1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
    %2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
    %3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
    %res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
    %4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
    %5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
    ...etc.
    ```
    mub-at-arm authored Jun 7, 2024
    b87a80d
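The unrolling strategy in the message above can be modelled on ordinary containers. The sketch below handles only the 2-D case and uses nested `std::vector`s rather than MLIR vectors. `deinterleave1D` and `deinterleave2D` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Row = std::vector<int>;
using Matrix = std::vector<Row>;

// 1-D deinterleave: even-indexed elements go to the first result,
// odd-indexed elements go to the second.
inline std::pair<Row, Row> deinterleave1D(const Row &In) {
  assert(In.size() % 2 == 0 && "element count must be even");
  Row Even, Odd;
  for (std::size_t I = 0; I < In.size(); I += 2) {
    Even.push_back(In[I]);
    Odd.push_back(In[I + 1]);
  }
  return {Even, Odd};
}

// 2-D deinterleave by unrolling: extract each row, deinterleave it as a 1-D
// vector, and insert the two halves into the matching rows of the results.
inline std::pair<Matrix, Matrix> deinterleave2D(const Matrix &In) {
  Matrix Res1, Res2;
  for (const Row &R : In) {
    std::pair<Row, Row> Halves = deinterleave1D(R);
    Res1.push_back(Halves.first);
    Res2.push_back(Halves.second);
  }
  return {Res1, Res2};
}
```

Each loop iteration corresponds to one extract / deinterleave / insert group in the lowered IR shown above.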
  28. [ARM] r11 is reserved when using -mframe-chain=aapcs (llvm#86951)

    When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options,
    we cannot use r11 as an allocatable register, even if
    -fomit-frame-pointer is also used. This is so that r11 will always point
    to a valid frame record, even if we don't create one in every function.
    ostannard authored Jun 7, 2024
    1a52392
  29. [DAG] Always allow folding XOR patterns to ABS pre-legalization (llvm…

    …#94601)
    
    Removes residual ARM handling for vXi64 ABS nodes to prevent infinite loops.
    RKSimon authored Jun 7, 2024
    af3ffff
  30. fix(mlir/**.py): fix comparison to None (llvm#94019)

    from PEP8
    (https://peps.python.org/pep-0008/#programming-recommendations):
    
    > Comparisons to singletons like None should always be done with is or
    is not, never the equality operators.
    
    Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
    e-kwsm and e-kwsm authored Jun 7, 2024
    fd45dcc
  31. [ARM] Add support for Cortex-R52+ (llvm#94633)

    Cortex-R52+ is an Armv8-R AArch32 CPU.
    
    Technical Reference Manual for Cortex-R52+:
       https://developer.arm.com/documentation/102199/latest/
    jthackray authored Jun 7, 2024
    917afa8
  32. 537165b
  33. [clang][test] Skip interpreter value test on Arm 32 bit

    llvm#89811 caused this test to fail,
    somehow.
    
    I think it may not be at fault, but actually be exposing some
    existing undefined behaviour, see
    llvm#94741.
    
    Skipping this for now to get the bots green again.
    DavidSpickett committed Jun 7, 2024
    54c5dbe
  34. [gn build] Port e622996

    llvmgnsyncbot committed Jun 7, 2024
    d3e531c
  35. 6fe5428
  36. [clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (llvm#89796)

    This change seeks to add support for vendor flavoured SPIRV - more
    specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
    carries some extra bits of information that are only usable by AMDGCN
    targets, forfeiting absolute genericity to obtain greater expressiveness
    for target features:
    
    - AMDGCN inline ASM is allowed/supported, under the assumption that the
    [SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
    extension is enabled/used
    - AMDGCN target specific builtins are allowed/supported, under the
    assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
    enabled when using the downstream translator
    - the featureset matches the union of AMDGCN targets' features
    - the datalayout string is overspecified to affix both the program
    address space and the alloca address space, the latter under the
    assumption that the
    [SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
    extension is enabled/used, case in which the extant SPIRV datalayout
    string would lead to pointers to function pointing to the private
    address space, which would be wrong.
    
    Existing AMDGCN tests are extended to cover this new target. It is
    currently dormant / will require some additional changes, but I thought
    I'd rather put it up for review to get feedback as early as possible. I
    will note that an alternative option is to place this under AMDGPU, but
    that seems slightly less natural, since this is still SPIRV, albeit
    relaxed in terms of preconditions & constrained in terms of
    postconditions, and only guaranteed to be usable on AMDGCN targets (it
    is still possible to obtain pristine portable SPIRV through usage of the
    flavoured target, though).
    AlexVlx authored Jun 7, 2024
    88e2bb4
  37. [BOLT][NFC] Infallible fns return void (llvm#92018)

    Both `reverseBranchCondition` and `replaceBranchTarget` return a success boolean. But all-but-one caller ignores the return value, and the exception emits a fatal error on failure.
    
    Thus, just return nothing.
    urnathan authored Jun 7, 2024
    3fefb3c
  38. [CodeGen][SDAG] Remove CombinedNodes SmallPtrSet (llvm#94609)

    This "small" set grows quite large and it's more performant to store
    whether a node has been combined before in the node itself.
    
    As this information is only relevant for nodes that are currently not in
    the worklist, add a second state to the CombinerWorklistIndex (-2) to
    indicate that a node is currently not in a worklist, but was combined
    before.
    
    This brings a substantial performance improvement.
    aengelke authored Jun 7, 2024
    74d62c2
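The idea of recording "was combined before" in the node itself, rather than in a side set, can be sketched with a small worklist. This is a simplified model, not the DAGCombiner code: `Node`, `Worklist`, and their members are hypothetical, and the meaning of -1 as "not in the worklist, never combined" is an assumption. Only the -2 state comes from the message above.

```cpp
#include <cassert>
#include <vector>

struct Node {
  // >= 0: position in the worklist.
  //   -1: not in the worklist and not combined yet (assumed sentinel).
  //   -2: not in the worklist, but combined before (the new state).
  int WorklistIndex = -1;
};

class Worklist {
  std::vector<Node *> Items;

public:
  // Queue a node unless it is already queued.
  void add(Node *N) {
    if (N->WorklistIndex >= 0)
      return;
    N->WorklistIndex = static_cast<int>(Items.size());
    Items.push_back(N);
  }

  // Pop a node and record in the node itself that it has been combined.
  Node *popAndMarkCombined() {
    if (Items.empty())
      return nullptr;
    Node *N = Items.back();
    Items.pop_back();
    N->WorklistIndex = -2;
    return N;
  }

  // No set lookup needed: the answer is stored in the node.
  static bool wasCombined(const Node *N) { return N->WorklistIndex == -2; }
};
```

Because the flag shares storage with the worklist index, it is only meaningful while the node is outside the worklist, which matches the observation in the message that the information is only needed in that case.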
  39. [clang][Interp] Check ConstantExpr results for initialization

    They need to be fully initialized, similar to global variables.
    tbaederr committed Jun 7, 2024
    9ece3eb
  40. 9eb8a13
  41. [clang][Interp] Limit lambda capture lazy visiting to actual captures

    Check this by looking at the VarDecl.
    tbaederr committed Jun 7, 2024
    b8cc85b
  42. [serialization] no transitive decl change (llvm#92083)

    Follow-up to llvm#86912
    
    The motivation of the patch series is that, for a module interface unit
    `X`, when the modules that `X` depends on change in ways that are not
    relevant to `X`, the BMI of `X` should not change. This specific patch
    targets irrelevant declaration changes: they should leave the BMI of
    `X` unchanged. **However**, I found the patch itself is not very useful
    in practice, since adding or removing declarations changes the state of
    identifiers and types in most cases.
    
    That is, for the simplest example,
    
    ```
    // partA.cppm
    export module m:partA;
    
    // partA.v1.cppm
    export module m:partA;
    export void a() {}
    
    // partB.cppm
    export module m:partB;
    export void b() {}
    
    // m.cppm
    export module m;
    export import :partA;
    export import :partB;
    
    // onlyUseB;
    export module onlyUseB;
    import m;
    export inline void onluUseB() {
        b();
    }
    ```
    
    the BMI of `onlyUseB` will change after we change the implementation of
    `partA.cppm` to `partA.v1.cppm`, since `partA.v1.cppm` introduces new
    identifiers and types (the function prototype).
    
    So in this patch, we have to write the tests as:
    
    ```
    // partA.cppm
    export module m:partA;
    export int getA() { ... }
    export int getA2(int) { ... }
    
    // partA.v1.cppm
    export module m:partA;
    export int getA() { ... }
    export int getA(int) { ... }
    export int getA2(int) { ... }
    
    // partB.cppm
    export module m:partB;
    export void b() {}
    
    // m.cppm
    export module m;
    export import :partA;
    export import :partB;
    
    // onlyUseB;
    export module onlyUseB;
    import m;
    export inline void onluUseB() {
        b();
    }
    ```
    
    so that the newly introduced declaration `int getA(int)` doesn't introduce
    new identifiers or types, and the BMI of `onlyUseB` stays unchanged.
    
    While it does not look great, this patch should be the base of the patch
    that erases the transitive change for identifiers and types, since I don't
    know how we can introduce new types and identifiers without introducing
    new declarations. Given how tight the relationship between declarations,
    types and identifiers is, I think we can only reach the ideal state after
    the series covers all three entities.
    
    The design of the patch is similar to
    llvm#86912, which extends the
    32-bit DeclID to 64 bits and uses the higher bits to store the module file
    index and the lower bits to store the Local Decl ID.
    
    A slight difference is that we only use 48 bits to store the new DeclID,
    since we try to use the higher 16 bits to store the module ID in the
    prefix of the Decl class. Previously, we used 32 bits to store the module
    ID and 32 bits to store the DeclID. I didn't want to allocate additional
    space, so I tried to keep the total at 64 bits. A potentially interesting
    point here is the relationship between the module ID and the module file
    index. I feel we can get the module file index from the module ID, but I
    have neither proved nor implemented it, since I want to keep the patch
    itself as small as possible. We can do it in the future if we want.
    
    Another change in the patch is the new concept of a Decl Index, which means
    the index into the very big array `DeclsLoaded` in ASTReader. Previously,
    the index of a loaded declaration was simply the Decl ID minus
    PREDEFINED_DECL_NUMs, so in some places the two were used
    interchangeably. This patch splits these two concepts.
    
    As with llvm#86912, the change will
    increase the on-disk PCM file sizes. As declaration IDs may be the most
    numerous IDs in the PCM file, this can have the biggest impact on the size.
    In my experiments, this change brings a 6.6% increase in the on-disk
    PCM size. No compile-time performance regression was observed. Given the
    benefits in the motivating example, I think the cost is worthwhile.
    ChuanqiXu9 committed Jun 7, 2024
    5a0181f
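The ID layout discussed in the message above (module file index in the higher bits, local declaration ID in the lower bits, 48 bits in total) can be sketched with plain bit operations. The exact split is not stated in the message, so the 16-bit index and 32-bit local ID below are assumptions made for illustration, as are all function and constant names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed split, for illustration only: a 32-bit local declaration ID in the
// low bits and a 16-bit module file index above it, 48 bits in total. The
// top 16 bits of the 64-bit word stay free for other uses.
constexpr unsigned LocalIDBits = 32;
constexpr unsigned ModuleFileIndexBits = 16;
constexpr uint64_t LocalIDMask = (uint64_t{1} << LocalIDBits) - 1;

// Pack a (module file index, local ID) pair into one 64-bit value.
inline uint64_t makeGlobalDeclID(uint32_t ModuleFileIndex, uint32_t LocalID) {
  assert(ModuleFileIndex < (1u << ModuleFileIndexBits) &&
         "module file index must fit in the assumed 16 bits");
  return (uint64_t{ModuleFileIndex} << LocalIDBits) | LocalID;
}

// Recover the module file index from the high part of the ID.
inline uint32_t getModuleFileIndex(uint64_t ID) {
  return static_cast<uint32_t>(ID >> LocalIDBits);
}

// Recover the module-local declaration ID from the low part of the ID.
inline uint32_t getLocalDeclID(uint64_t ID) {
  return static_cast<uint32_t>(ID & LocalIDMask);
}
```

With such an encoding, a declaration's ID inside one module file does not shift when an unrelated module gains or loses declarations, which is the property the patch series is after.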
  43. [AMDGPU] Fix interaction between WQM and llvm.amdgcn.init.exec (llvm#…

    …93680)
    
    Whole quad mode requires inserting a copy of the initial EXEC mask. In a
    function that also uses llvm.amdgcn.init.exec, insert the COPY after
    initializing EXEC.
    jayfoad authored Jun 7, 2024
    df6750e
  44. [Frontend][OpenMP] Sort all the things in OMP.td, NFC (llvm#94653)

    The file OMP.td is becoming tedious to update by hand due to the
    seemingly random ordering of various items in it. This patch brings
    order to it by sorting most of the contents.
    
    The clause definitions are sorted alphabetically with respect to the
    spelling of the clause.[1]
    
    The directive definitions are split into two groups: leaf directives and
    compound directives.[2] Within each, definitions are sorted
    alphabetically with respect to the spelling, with the exception that
    "end xyz" directives are placed immediately following the definition of
    "xyz".[3]
    
    Within each directive definition, the lists of clauses are also sorted
    alphabetically.
    
    [1] All spellings are made of lowercase letters, _, or space. Ordering
    that includes non-letters follows the order assumed by the `sort`
    utility.
    [2] Compound directives refer to the constituent leaf directives, hence
    the leaf definitions must come first.
    [3] Some of the "end xyz" directives have properties derived from the
    corresponding "xyz" directive. This exception guarantees that "xyz"
    precedes the "end xyz".
    kparzysz authored Jun 7, 2024
    acc927a
  45. [flang][OpenMP] Lower target .. private(..) to omp.private ops (l…

    …lvm#94195)
    
    Extends delayed privatization support to `target .. private(..)`. With
    this PR, `private` is supported for `target` **only** in delayed
    privatization mode.
    ergawy authored Jun 7, 2024
    913a824
  46. [libc] Correctly pass the C++ standard to NVPTX internal builds

    Summary:
    The NVPTX build wasn't getting the `C++20` standard necessary for a few
    files.
    jhuber6 committed Jun 7, 2024
    2c3723d
  47. [mlir][linalg] Support lowering unpack with outer_dims_perm (llvm#94477)

    This commit adds support for lowering `tensor.unpack` with a
    non-identity `outer_dims_perm`. This was previously left as a
    not-yet-implemented case.
    ryan-holt-1 authored Jun 7, 2024
    5b2f7a1
  48. [mlir] Add reshape propagation patterns for tensor.pad (llvm#94489)

    This PR adds fusion by collapsing and fusion by expansion patterns for
    `tensor.pad` ops in ElementwiseOpFusion. Pad ops can be expanded or
    collapsed as long as none of the padded dimensions will be expanded or
    collapsed.
    Max191 authored Jun 7, 2024
    c886d66
  49. [mlir] Fix bugs in expand_shape patterns after semantics changes (llv…

    …m#94631)
    
    After the `output_shape` field was added to `expand_shape` ops,
    dynamically sized expand shapes are now possible, but this was not
    accounted for in the folder. This PR tightens the constraints of the
    folder to fix this.
    Max191 authored Jun 7, 2024
    2117677
  50. [ARM] Clean up neon_vabd.ll, vaba.ll and vabd.ll tests a bit. NFC

    Change the target triple to remove some unnecessary instructions.
    davemgreen committed Jun 7, 2024
    ac02168
  51. [arm64] Add tan intrinsic lowering (llvm#94545)

    This change is an implementation of the
    llvm#87367 investigation on
    supporting IEEE math operations as intrinsics,
    which was discussed in this RFC:
    https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
    
    This PR is just for Tan.
    
    Now that the x86 tan backend has landed
    (llvm#90503), we can add other
    backends since the shared pieces are in tree now.
    
    Changes:
    - `llvm/include/llvm/Analysis/VecFuncs.def` - vectorization of tan for
    arm64 backends.
    - `llvm/lib/Target/AArch64/AArch64FastISel.cpp` - Add tan to the libcall
    table
    - `llvm/lib/Target/AArch64/AArch64ISelLowering.cpp` - Add tan expansion
    for f128, f16, and vector/NEON operations
    - `llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp` - Define
    `G_FTAN` as a legal arm64 instruction
    
    resolves llvm#94755
    farzonl authored Jun 7, 2024
    2f0308e
  52. c5fcc2e
  53. [Clang] Add timeout for GPU detection utilities (llvm#94751)

    Summary:
    The utilities `nvptx-arch` and `amdgpu-arch` are used to support
    `--offload-arch=native` among other utilities in clang. However, these
    rely on the GPU drivers to query the features. In certain cases these
    drivers can become locked up, which will lead to indefinite hangs on any
    compiler jobs running in the meantime.
    
    This patch adds a ten second timeout period for these utilities before
    it kills the job and errors out.
    jhuber6 authored Jun 7, 2024
    2981f3a
  54. 2afea72
  55. [MachineOutliner] Sort by Benefit to Cost Ratio (llvm#90264)

    This PR depends on llvm#90260
    
    We changed the order in which functions are outlined in Machine
    Outliner.
    
    The formula for priority is found via a black-box Bayesian optimization
    toolbox. Using this formula for sorting consistently reduces the
    uncompressed size of large real-world mobile apps. We also ran a few
    benchmarks using LLVM test suites, and showed that sorting by priority
    consistently reduces the text segment size.
    
    |run (CTMark/)   |baseline (1)|priority (2)|diff (1 -> 2)|
    |----------------|------------|------------|-------------|
    |lencod          |349624      |349264      |-0.1030%     |
    |SPASS           |219672      |219480      |-0.0874%     |
    |kc              |271956      |251200      |-7.6321%     |
    |sqlite3         |223920      |223708      |-0.0947%     |
    |7zip-benchmark  |405364      |402624      |-0.6759%     |
    |bullet          |139820      |139500      |-0.2289%     |
    |consumer-typeset|295684      |290196      |-1.8560%     |
    |pairlocalalign  |72236       |72092       |-0.1993%     |
    |tramp3d-v4      |189572      |189292      |-0.1477%     |
    
    This is part of an enhanced version of machine outliner -- see
    [RFC](https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-1-fulllto-part-2-thinlto-nolto-to-come/78732).
    xuanzhang816 authored Jun 7, 2024
    3b16630
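The commit title describes ordering outlining candidates by their benefit-to-cost ratio. The actual priority formula is not given in the message (it was found by black-box optimization), so the sketch below only shows the general shape of such a sort. `Candidate`, its fields, and `sortByBenefitToCost` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical outlining candidate; the fields are assumptions.
struct Candidate {
  unsigned Id;
  uint64_t Benefit; // e.g. bytes saved if the sequence is outlined
  uint64_t Cost;    // e.g. bytes added by the outlined function and calls
};

// Sort so that candidates with the highest Benefit / Cost ratio come first.
// The ratios are compared by cross-multiplication, which avoids division and
// floating point; ties keep their original relative order.
inline void sortByBenefitToCost(std::vector<Candidate> &Cs) {
  std::stable_sort(Cs.begin(), Cs.end(),
                   [](const Candidate &A, const Candidate &B) {
                     return A.Benefit * B.Cost > B.Benefit * A.Cost;
                   });
}
```

A stable sort is used here so that candidates with equal ratios are processed in a deterministic order; whether the real implementation makes the same choice is not stated in the message.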
  56. [memprof] Clean up IndexedMemProfReader (NFC) (llvm#94710)

    Parameter "Version" is confusing in deserializeV012 and deserializeV3
    because we also have member variable "Version".  Fortunately,
    parameter "Version" and member variable "Version" always have the same
    value because IndexedMemProfReader::deserialize initializes the member
    variable and passes it to deserializeV012 and deserializeV3.
    
    This patch removes the parameter.
    kazutakahirata authored Jun 7, 2024
    eb33e46
  57. 55bdb36
  58. [memprof] Use CallStackRadixTreeBuilder in the V3 format (llvm#94708)

    This patch integrates CallStackRadixTreeBuilder into the V3 format,
    reducing the profile size to about 27% of the V2 profile size.
    
    - Serialization: writeMemProfCallStackArray just needs to write out
      the radix tree array prepared by CallStackRadixTreeBuilder.
      Mappings from CallStackIds to LinearCallStackIds are moved by new
      function CallStackRadixTreeBuilder::takeCallStackPos.
    
    - Deserialization: Deserializing a call stack is the same as
      deserializing an array encoded in the obvious manner -- the length
      followed by the payload, except that we need to follow a pointer to
      the parent to take advantage of common prefixes once in a while.
      This patch teaches LinearCallStackIdConverter how to handle those
      pointers.
    kazutakahirata authored Jun 7, 2024
    c348e26
  59. [mlir][vector] Remove Emulated Sub-directory (llvm#94742)

    The "Emulated" sub-directories under "ArmSVE" and
    "ArmSME" have been removed. Associated tests
    have been moved up a directory and now include
    the "REQUIRES" constraint for the arm-emulator.
    mub-at-arm authored Jun 7, 2024
    7d69095
  60. d099d6c
  61. [gn] port cb7690a (ntdll dep)

    nico committed Jun 7, 2024
    fc95645
  62. [KnownBits] Remove hasConflict() assertions (llvm#94568)

    Allow KnownBits to represent "always poison" values via conflict.
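    
    To make "conflict" concrete, here is a toy 8-bit model. `ToyKnownBits`
    and `countPossibleValues` are hypothetical names for this sketch and
    are not the LLVM class; the only idea borrowed is that a bit recorded
    as both known-zero and known-one cannot be satisfied by any value.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // A bit set in Zero is known to be 0; a bit set in One is known to be 1.
    struct ToyKnownBits {
      uint8_t Zero = 0;
      uint8_t One = 0;
    
      // A bit claimed to be both 0 and 1 is a conflict.
      bool hasConflict() const { return (Zero & One) != 0; }
    
      // Every bit is known and nothing conflicts.
      bool isConstant() const {
        return !hasConflict() && static_cast<uint8_t>(Zero | One) == 0xFF;
      }
    
      uint8_t getConstant() const {
        assert(isConstant() && "can only read a fully known value");
        return One;
      }
    
      // True if Value agrees with every known bit.
      bool isPossibleValue(uint8_t Value) const {
        return (Value & Zero) == 0 && (Value & One) == One;
      }
    };
    
    // Count the 8-bit values that are consistent with the known bits.
    int countPossibleValues(const ToyKnownBits &KB) {
      int Count = 0;
      for (int V = 0; V < 256; ++V)
        if (KB.isPossibleValue(static_cast<uint8_t>(V)))
          ++Count;
      return Count;
    }
    ```
    
    With a conflicting bit no concrete value is possible, which is why a
    conflict can stand in for a value that is never well defined.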
    
    close: llvm#94436
    c8ef authored Jun 7, 2024
    b25b1db
  63. [libc++][test][AIX] Only XFAIL atomic tests for before clang 19 (llvm#94646)
    
    These tests pass on 64-bit. They were fixed by 5fdd094 on 32-bit.
    So XFAIL only for 32-bit before clang 19.
    jakeegan authored Jun 7, 2024
    790992d
  64. [AArch64] Add patterns for add(uzp1(x,y), uzp2(x, y)) -> addp.

    If we are extracting the even lanes and the odd lanes and adding them, we can
    use an addp instruction.
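    
    A scalar model of why the fold holds, as a minimal sketch on four
    32-bit lanes. The functions `uzp1`, `uzp2`, `add` and `addp` below are
    plain C++ stand-ins written for this example (not intrinsics), and
    `concatLane` is a hypothetical helper.
    
    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    
    using Vec4 = std::array<uint32_t, 4>;
    
    // Lane Index of the 8-lane concatenation of X followed by Y.
    uint32_t concatLane(const Vec4 &X, const Vec4 &Y, std::size_t Index) {
      assert(Index < 8 && "the concatenation has 8 lanes");
      return Index < 4 ? X[Index] : Y[Index - 4];
    }
    
    // uzp1 keeps the even lanes of the concatenation.
    Vec4 uzp1(const Vec4 &X, const Vec4 &Y) {
      Vec4 R{};
      for (std::size_t I = 0; I < 4; ++I)
        R[I] = concatLane(X, Y, 2 * I);
      return R;
    }
    
    // uzp2 keeps the odd lanes of the concatenation.
    Vec4 uzp2(const Vec4 &X, const Vec4 &Y) {
      Vec4 R{};
      for (std::size_t I = 0; I < 4; ++I)
        R[I] = concatLane(X, Y, 2 * I + 1);
      return R;
    }
    
    // Lane-wise addition.
    Vec4 add(const Vec4 &A, const Vec4 &B) {
      Vec4 R{};
      for (std::size_t I = 0; I < 4; ++I)
        R[I] = A[I] + B[I];
      return R;
    }
    
    // Pairwise addition: each result lane sums two adjacent lanes of the
    // concatenation.
    Vec4 addp(const Vec4 &X, const Vec4 &Y) {
      Vec4 R{};
      for (std::size_t I = 0; I < 4; ++I)
        R[I] = concatLane(X, Y, 2 * I) + concatLane(X, Y, 2 * I + 1);
      return R;
    }
    ```
    
    Lane `I` of `add(uzp1(X, Y), uzp2(X, Y))` is the sum of concatenation
    lanes `2*I` and `2*I+1`, which is the definition of `addp` used here.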
    davemgreen committed Jun 7, 2024
    f7018ba
  65. 4f9c0fa
  66. [libc++][regex] Correctly adjust match prefix for zero-length matches. (llvm#94550)
    
    For regex patterns that produce zero-length matches, there is one
    (imaginary) match in-between every character in the sequence being
    searched (as well as before the first character and after the last
    character). It's easiest to demonstrate using replacement:
    `std::regex_replace("abc"s, std::regex(""), "!")` should produce `!a!b!c!`, where
    each exclamation mark makes a zero-length match visible.
    
    Currently our implementation doesn't correctly set the prefix of each
    zero-length match, "swallowing" the characters separating the imaginary
    matches -- e.g. when going through zero-length matches within `abc`, the
    corresponding prefixes should be `{'', 'a', 'b', 'c'}`, but before this
    patch they will all be empty (`{'', '', '', ''}`). This happens in the
    implementation of `regex_iterator::operator++`. Note that the Standard
    spells out quite explicitly that the prefix might need to be adjusted
    when dealing with zero-length matches in
    [`re.regiter.incr`](http://eel.is/c++draft/re.regiter.incr):
    > In all cases in which the call to `regex_search` returns `true`,
    `match.prefix().first` shall be equal to the previous value of
    `match[0].second`... It is unspecified how the implementation makes
    these adjustments.
    
    [Reproduction example](https://godbolt.org/z/8ve6G3dav)
    ```cpp
    #include <iostream>
    #include <regex>
    #include <string>
    
    int main() {
      std::string str = "abc";
      std::regex empty_matching_pattern("");
    
      { // The underlying problem is that `regex_iterator::operator++` doesn't update
        // the prefix correctly.
        std::sregex_iterator i(str.begin(), str.end(), empty_matching_pattern), e;
        std::cout << "\"";
        for (; i != e; ++i) {
          const std::ssub_match& prefix = i->prefix();
          std::cout << prefix.str();
        }
        std::cout << "\"\n";
        // Before the patch: ""
        // After the patch: "abc"
      }
    
      { // `regex_replace` makes the problem very visible.
        std::string replaced = std::regex_replace(str, empty_matching_pattern, "!");
        std::cout << "\"" << replaced << "\"\n";
        // Before the patch: "!!!!"
        // After the patch: "!a!b!c!"
      }
    }
    ```
    
    Fixes llvm#64451
    
    rdar://119912002
    var-const authored Jun 7, 2024
    e9adcc4
  67. Reapply PR/87550 (llvm#94625)

    Re-apply llvm#87550 with fixes.
    
    Details:
    Some tests in fuchsia failed because of the newly added assertion.
    This was because `GetExceptionBreakpoint()` could be called before
    `g_dap.debugger` was initialized.
    
    The fix here is to lazily populate the list in
    `GetExceptionBreakpoint()` rather than assuming it has already been
    initialized. (There is some nuisance here because we can't simply
    populate it in `DAP::DAP()`, which is a global ctor and is called before
    `SBDebugger::Initialize()`.)
    oontvoo authored Jun 7, 2024
    35fa2de
  68. [libc++] Undeprecate shared_ptr atomic access APIs (llvm#92920)

    This patch reverts 9b832b7 (llvm#87111):
    - [libc++] Deprecated `shared_ptr` Atomic Access APIs as per P0718R2
    - [libc++] Implemented P2869R3: Remove Deprecated `shared_ptr` Atomic Access APIs from C++26
    
    As explained in [1], the suggested replacement in P2869R3 is `__cpp_lib_atomic_shared_ptr`,
    which libc++ does not yet implement. Let's not deprecate the old way of doing things before
    the new way of doing things exists.
    
    [1]: llvm#87111 (comment)
    nico authored Jun 7, 2024
    716ed5f
  69. [Reassociate] shifttest.ll - generate test checks to replace custom grep expression
    
    (and remove an unused argument)
    RKSimon committed Jun 7, 2024
    97b12df
  70. [flang][runtime] add SHAPE runtime interface (llvm#94702)

    Add SHAPE runtime API (will be used for assumed-rank, lowering is
    generating other cases inline).
    
    I tried to design it so that there is no dynamic allocation in the
    runtime, and no deallocation expected to be inserted by inline code,
    for arrays that we know are small (lowering will just always stack
    allocate a rank 15 array to avoid dynamic stack allocation or heap
    allocation).
    jeanPerier authored Jun 7, 2024
    b01ac51
  71. 1539da4
  72. [OpenMP] Fix passing target id features to AMDGPU offloading (llvm#94765)
    
    Summary:
    AMDGPU supports a `target-id` feature which is used to qualify targets
    with different incompatible features. These are both rules and target
    features. Currently, we pass `-target-cpu` twice when offloading to
    OpenMP, and do not pass the target-id features at all. The effect was
    that passing something like `--offload-arch=gfx90a:xnack+` would show up
    as `-target-cpu=gfx90a:xnack+ -target-cpu=gfx90a`, so the xnack setting
    was ignored completely and the CPU was passed twice. This patch fixes
    that by passing the CPU once and separating out the features, as HIP
    does.
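    
    The separation step can be pictured with the sketch below. It is only
    an illustration: `TargetId` and `splitTargetId` are hypothetical names,
    and the assumption that each `name+` / `name-` suffix becomes a
    `+name` / `-name` feature string is mine, not taken from the patch.
    
    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <vector>
    
    struct TargetId {
      std::string Processor;             // e.g. "gfx90a"
      std::vector<std::string> Features; // e.g. "+xnack"
    };
    
    // Split "cpu:feat+:feat-" at the colons. The first part is the
    // processor; each later part has its trailing sign moved to the front.
    TargetId splitTargetId(const std::string &Id) {
      TargetId Result;
      std::size_t Start = 0;
      bool First = true;
      while (Start <= Id.size()) {
        const std::size_t Colon = Id.find(':', Start);
        const std::string Part =
            Id.substr(Start, Colon == std::string::npos ? std::string::npos
                                                        : Colon - Start);
        if (First) {
          Result.Processor = Part;
          First = false;
        } else if (!Part.empty()) {
          const char Sign = Part.back();
          assert((Sign == '+' || Sign == '-') && "feature must end in + or -");
          Result.Features.push_back(Sign + Part.substr(0, Part.size() - 1));
        }
        if (Colon == std::string::npos)
          break;
        Start = Colon + 1;
      }
      return Result;
    }
    ```
    
    For `gfx90a:xnack+` this yields the processor `gfx90a` once, plus a
    single feature entry, instead of two processor strings.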
    jhuber6 authored Jun 7, 2024
    374f655
  73. Fixed grammatical error in "enum specifier" error msg llvm#94443 (llvm#94592)
    
    As discussed in llvm#94443, this PR changes the wording to be more correct.
    kper authored Jun 7, 2024
    bbddedb
  74. b59567b
  75. Check if LLD is built when checking if lto_supported (llvm#92752)

    Otherwise, older copies of LLD may not understand the latest bitcode
    versions (for example, if we increase
    `ModuleSummaryIndex::BitCodeSummaryVersion`)
    
    Related to
    llvm#90692 (comment)
    jvoung authored Jun 7, 2024
    c467e60
  76. [mlir][vector][NFC] Make function name more meaningful in lit tests. (llvm#94538)
    
    It also moves the test near other similar test cases.
    hanhanW authored Jun 7, 2024
    b653357
  77. [SDISel][Builder] Fix the instantiation of <1 x bfloat|half> (llvm#94591)
    
    Prior to this change, `SelectionDAGBuilder` was producing `SDNode`s of
    the form: `f32 = extract_vector_elt <1 x bfloat|half>, i32 0` when
    lowering phis of `<1 x bfloat|half>` and running on a target that
    promotes this type to `f32` (like some x86 or AMDGPU targets.)
    
    This construct is invalid since this type of node only allows type
    extensions for integer types.
    It went unnoticed because the `extract_vector_elt` node is later broken
    down into `bitcast` followed by `bf16_to_fp|fp_extend`. However, when the
    argument of the phi is a constant we were crashing because the existing
    code would try to constant fold this `extract_vector_elt` into an
    `any_ext`.
    
    This patch fixes this by using a proper decomposition for `<1 x
    bfloat|half>`:
    ```
    bfloat|half = bitcast <1 x bfloat|half>
    float = fp_extend bfloat|half
    ```
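    
    On raw bits, the two steps look roughly like this for the bfloat case
    only (half needs a real format conversion and is not covered). Both
    function names are hypothetical, and the sketch assumes a 32-bit IEEE
    `float`.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <cstring>
    
    // Step 1, "bitcast": a one-element vector and its scalar element have
    // the same bits, so this only reinterprets the storage.
    uint16_t bitcastVectorToScalar(const uint16_t (&Vec)[1]) { return Vec[0]; }
    
    // Step 2, "fp_extend": a bfloat holds the top 16 bits of a float32, so
    // widening places those bits in the upper half and zero-fills the rest.
    float extendBFloat(uint16_t Bits) {
      const uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
      float Result;
      static_assert(sizeof(Result) == sizeof(Wide), "float must be 32 bits");
      std::memcpy(&Result, &Wide, sizeof(Result));
      return Result;
    }
    ```
    
    The point of the decomposition is that each step is a floating-point
    operation, so nothing tries to treat the element as an integer to be
    extended.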
    
    This change should be NFC for the non-constant-folding cases and fix the
    SDISel crashes (reported in
    llvm#94449) for the folding
    cases.
    
    Note: The change on the arm test is a missing fp16 to f32 constant folding
    exposed by this patch. I'll push a separate improvement for that.
    qcolombet authored Jun 7, 2024
    0605e98
  78. [RISCV] Fold (vXi8 (trunc (vselect (setltu, X, 256), X, (sext (setgt X, 0))))) to vmax+vnclipu. (llvm#94720)
    
    This pattern is an obscure way to express saturating a signed value
    into a smaller unsigned value.
    
    If (setltu, X, 256) is true, then the value is already in the desired
    range so we can pick X. If it's false, we select (sext (setgt X, 0))
    which is 0 for negative values and all ones for positive values. The all
    ones value when truncated to the final type will still be all ones like
    we want.
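    
    The reasoning above can be checked exhaustively on one 16-bit lane. The
    sketch below is a scalar model with hypothetical function names; the
    second function is my reading of "vmax+vnclipu" as a signed max with
    zero followed by an unsigned clip to 8 bits, not generated code.
    
    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    
    // The original pattern: trunc(select(X <u 256, X, sext(X > 0))).
    uint8_t saturateViaSelect(int16_t X) {
      const bool InRange = static_cast<uint16_t>(X) < 256;  // setltu X, 256
      const int16_t AllOnesIfPositive = X > 0 ? int16_t{-1} : int16_t{0};
      const int16_t Selected = InRange ? X : AllOnesIfPositive;
      return static_cast<uint8_t>(Selected);  // trunc keeps the low 8 bits
    }
    
    // The replacement: clamp negatives to 0, then clip to the 8-bit range.
    uint8_t saturateViaMaxClip(int16_t X) {
      const int16_t NonNegative = std::max<int16_t>(X, 0);
      return static_cast<uint8_t>(std::min<int16_t>(NonNegative, 255));
    }
    
    // Compare both forms on every 16-bit input.
    void checkSaturation() {
      for (int V = INT16_MIN; V <= INT16_MAX; ++V) {
        const int16_t X = static_cast<int16_t>(V);
        assert(saturateViaSelect(X) == saturateViaMaxClip(X));
      }
    }
    ```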
    topperc authored Jun 7, 2024
    e9fa6ff
  79. [RISCV] Add .insn alias for addresses without the leading immediate. (llvm#94698)
    
    Most other instructions accept addresses that start with a '(' without
    an immediate before it. The .insn cases were missing. This is also
    supported by binutils.
    topperc authored Jun 7, 2024
    cce10cc
  80. Revert "Reapply PR/87550 (llvm#94625)"

    This reverts commit 35fa2de.
    
    It broke the LLDB bots on green dragon
    felipepiovezan committed Jun 7, 2024
    adcf33f
  81. [AArch64] Add patterns for fadd(uzp1(x,y), uzp2(x, y)) -> faddp.

    Similar to f7018ba, this adds patterns for
    floating point faddp from an fadd and shuffles.
    davemgreen committed Jun 7, 2024
    e564852
  82. [libc++][NFC] Fix typo

    ldionne committed Jun 7, 2024
    f9ae07b
  83. [CGSCC] Verify that call graph is valid after iteration (llvm#94692)

    Only in expensive checks, to match other LazyCallGraph verification.
    
    Is helpful for verifying LazyCallGraph updates. Many issues only surface
    when we reuse the LazyCallGraph.
    aeubanks authored Jun 7, 2024
    6f2c610
  84. Fix #pragma (packed, n) not emitting the alignment in debug info (llvm#94673)
    
    Debug info generation won't emit the alignment of types that have a
    standard alignment. It was not taking the packed case into account.
    
    rdar://127785973
    augusto2112 authored Jun 7, 2024
    66df614
  85. [clang] Add fixit for using declaration with a (qualified) namespace (llvm#94762)
    
    For `using std::literals`, we now output:
    
        error: using declaration cannot refer to a namespace
            4 |   using std::literals;
              |         ~~~~~^
        note: did you mean 'using namespace'?
            4 |   using std::literals;
              |         ^
              |         namespace
    
    Previously, we didn't have the note.
    
    This only fires for qualified namespaces. Just `using std;` doesn't
    trigger this, since using declarations without cxx scope specifier are
    rejected earlier. Making that work is an exercise for future selves :)
    nico authored Jun 7, 2024
    2c047e6
  86. [libc] Add baremetal printf (llvm#94078)

    For baremetal targets that don't support FILE, this version of printf
    just writes directly to a function provided by a vendor. To do this both
    printf and vprintf were moved to /generic (vprintf since they need the
    same flags and cmake gets funky about setting variables in one file and
    reading them in another).
    michaelrj-google authored Jun 7, 2024
    11d643f
  87. [PseudoProbe] Make probe discriminator compatible with dwarf base discriminator (llvm#94506)
    
    It's useful if the probe-based build can consume a dwarf-based
    profile (e.g. during a profile transition). Before this change the two
    discriminator encodings conflict; this change tries to mitigate the
    issue by encoding the dwarf base discriminator into the probe
    discriminator. As the probe id (numbered over basic blocks and calls)
    starts from 1, there is some unused space, so we try to reuse some bits
    of the probe id.
    The new encoding rule is:
    - Use bit [28:28] to indicate whether the dwarf base discriminator is
    encoded (fortunately we can borrow this bit from the `PseudoProbeType`).
    - If the bit is set, use [15:3] for the probe id and [18:16] for the
    dwarf base discriminator. Otherwise, still use [18:3] for the probe id.
    
    Note that this doesn't affect the original probe id capacity: we still
    prioritize probe id encoding, i.e. the base discriminator is not encoded
    when the probe id is too big to fit in [15:3].
     
    Then adjust `getBaseDiscriminatorFromDiscriminator` to use the base
    discriminator from the probe discriminator.
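    
    A minimal sketch of the bit layout described above, covering only the
    fields this message names. `encodeProbeBits`, `decodeProbeBits` and
    `ProbeBits` are hypothetical names; every other bit of the real
    discriminator is simply left at zero here, and the precondition that
    the base discriminator fits in three bits is an assumption of the
    sketch.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    constexpr uint32_t DwarfBaseFlag = 1u << 28;  // bit [28:28]
    
    // Pack a probe id and a dwarf base discriminator. The probe id has
    // priority: if it needs more than bits [15:3], the base discriminator
    // is dropped and the flag bit stays clear.
    uint32_t encodeProbeBits(uint32_t ProbeId, uint32_t DwarfBase) {
      assert(ProbeId < (1u << 16) && "probe id must fit in [18:3]");
      assert(DwarfBase < (1u << 3) && "base discriminator must fit in [18:16]");
      if (ProbeId >= (1u << 13))
        return ProbeId << 3;
      return DwarfBaseFlag | (DwarfBase << 16) | (ProbeId << 3);
    }
    
    struct ProbeBits {
      uint32_t ProbeId;
      uint32_t DwarfBase;
      bool HasDwarfBase;
    };
    
    // Read the fields back, choosing the field widths from the flag bit.
    ProbeBits decodeProbeBits(uint32_t Discriminator) {
      if (Discriminator & DwarfBaseFlag)
        return {(Discriminator >> 3) & 0x1FFFu, (Discriminator >> 16) & 0x7u,
                true};
      return {(Discriminator >> 3) & 0xFFFFu, 0u, false};
    }
    ```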
    wlei-llvm authored Jun 7, 2024
    e20b904
  88. [Driver,test] Add -Wno-msvc-not-found to gcc-param.c

    Fixes: 56c4971
    
    If the default target triple uses visualstudio::Linker::ConstructJob,
    when a MSVC installation cannot be found, there will be a
    -Wmsvc-not-found diagnostic, which is turned to an error due to -Werror.
    
    We have many driver tests that don't specify --target= and would get a
    -Wmsvc-not-found warning, but this might be the only one that uses
    -Werror and is not skipped by an `UNSUPPORTED`.
    MaskRay committed Jun 7, 2024
    c3a5087
  89. [clang][driver] Enable '-flto' on bare-metal (llvm#94738)

    Pass the linker LTO options enabled by the clang '-flto' command line
    options when targeting bare-metal.
    
    ---------
    
    Co-authored-by: Keith Walker <keith.walker@arm.com>
    walkerkd and Keith Walker authored Jun 7, 2024
    bd6e324
  90. [CMake] Fix building on Haiku (llvm#94721)

    Needed for getaddrinfo().
    brad0 authored Jun 7, 2024
    d3bcd9b
  91. [SPIR-V] Improve type inference, addrspacecast and dependencies between SPIR-V entities and required capability/extensions (llvm#94626)
    
    This PR continues llvm#94467 and
    contains fixes in emission of type intrinsics, constant recording and
    corresponding test cases:
    * type-deduce-global-dup.ll -- fix of integer constant emission on
    32-bit platforms and correct type deduction for globals
    * type-deduce-simple-for.ll -- fix of GEP translation (there was an
    issue previously that led to incorrect translation/broken logic of
    for-range implementation)
    
    This PR also:
    * fixes a cast between identical storage classes and updates the test
    case to include validation run by spirv-val,
    * ensures that Bitcast for pointers satisfies the requirement that the
    address spaces must match and adds the corresponding test case,
    * improves the encoding in TableGen and the decoding in code of
    dependencies between SPIR-V entities and required capability/extensions,
    * prevents emission of identical OpTypePointer instructions.
    VyacheslavLevytskyy authored Jun 7, 2024
    9a73710
  92. 4196c18
  93. [RISCV][GISel] Do libcall for G_FPTOSI, G_FPTOUI when no D or F support (llvm#94613)
    
    When compiling the following code:
    ```cpp
    #include <stdio.h>
    #include <stdlib.h>
    #include <stddef.h>
    #include <stdbool.h>
    
    int main() {
        int a;
        float f;
        scanf("%d", &a);
    
        scanf("%f", &f);
        a += (int)f;
        
        return a;
    }
    ``` 
    for `-march=rv32ima_zbb` we get a libcall:
    ```
    call    scanf
            lw      a0, -20(s0)
            call    __fixsfsi
            mv      a1, a0
    ```
    When we try to use GlobalISel we get this error:
    ```
     error in backend: unable to legalize instruction: %9:_(s32) = G_FPTOSI %8:_(s32) (in function: main)
    ```
    
    (Here is a link to a reproducer in Godbolt:
    https://godbolt.org/z/f67vEEb41 )
    
    The goal of this PR is to do a libcall for the legalization of
    `G_FPTOSI` and `G_FPTOUI` instead of doing a fallback to Selection DAG
    to do the same libcall later.
    spaits authored Jun 7, 2024
    28dd55b
  94. [llvm-dwarfdump] Add a null-check in prettyPrintBaseTypeRef. (llvm#93156)
    
    Fixes llvm#93104
    
    Prevent a crash by only printing DWARFUnit-unaware information in cases
    in which `DWARFUnit* U` is `nullptr`.
    mgschossmann authored Jun 7, 2024
    c6e9371
  95. 27084f7
  96. [InstCombine] Fold (icmp eq/ne (xor x, y), C1) even if multiuse

    Two folds unlocked:
        `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
        `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`
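    
    Both equivalences are easy to confirm on 8-bit values. In the sketch
    below the function names are hypothetical, and `C2` is taken to be
    `C0 ^ C1`, which is the value that makes the first fold an identity;
    only the `eq` forms are shown since `ne` is their negation.
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // (icmp eq (xor x, C0), C1) and the folded (icmp eq x, C2).
    bool xorThenCompare(uint8_t X, uint8_t C0, uint8_t C1) {
      return static_cast<uint8_t>(X ^ C0) == C1;
    }
    bool compareFolded(uint8_t X, uint8_t C0, uint8_t C1) {
      const uint8_t C2 = static_cast<uint8_t>(C0 ^ C1);
      return X == C2;
    }
    
    // (icmp eq (xor x, y), 0) and the folded (icmp eq x, y).
    bool xorIsZero(uint8_t X, uint8_t Y) {
      return static_cast<uint8_t>(X ^ Y) == 0;
    }
    bool operandsEqual(uint8_t X, uint8_t Y) { return X == Y; }
    
    // Compare each pair over all 8-bit inputs.
    void checkXorFolds() {
      for (unsigned X = 0; X < 256; ++X) {
        const uint8_t X8 = static_cast<uint8_t>(X);
        for (unsigned A = 0; A < 256; ++A) {
          const uint8_t A8 = static_cast<uint8_t>(A);
          assert(xorIsZero(X8, A8) == operandsEqual(X8, A8));
          for (unsigned B = 0; B < 256; ++B) {
            const uint8_t B8 = static_cast<uint8_t>(B);
            assert(xorThenCompare(X8, A8, B8) == compareFolded(X8, A8, B8));
          }
        }
      }
    }
    ```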
    
    This fixes regressions associated with llvm#87180
    
    Closes llvm#87275
    goldsteinn committed Jun 7, 2024
    166c184
  97. [OpenMP][Offload] - Ensure OPENMP_STANDALONE_BUILD is defined (llvm#94801)
    
    Without a value set, conditional checks like
    if(NOT ${OPENMP_STANDALONE_BUILD})
    will not be able to evaluate to true.
    Fixes an issue introduced by PR llvm#93463, which did not allow the OMPT
    variable to be propagated up to offload during a runtimes build.
    estewart08 authored Jun 7, 2024
    89c92b0
  98. InstCombine: Fix testing of pow libcall in errno case (llvm#94772)

    There were some tests in this file with "noerrno" in the name, but all
    the tests were no errno since all the libcalls were declared with
    memory(none). Ensure we have adequate coverage for the errno and
    no-errno cases by duplicating the libcall transform cases into errno and
    non-errno versions with callsite attributes.
    arsenm authored Jun 7, 2024
    75b89cc
  99. [lldb] Encode operands and arity in Dwarf.def and use them in LLDB. (llvm#94679)
    
    This PR extends Dwarf.def to include the number of operands and the arity (the
    number of entries on the DWARF stack).
    
      - The arity is used in LLDB's DWARF expression evaluator.
      - The number of operands is unused, but is present in the table to avoid
        confusing the arity with the operands. Keeping the latter up to date should
        be straightforward as it maps directly to a table present in the DWARF
        standard.
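    
    The general shape of such a table can be sketched with an X-macro. This
    is a toy, not the contents of Dwarf.def: the macro names, `OpInfo`,
    `lookupOp` and `hasEnoughStackEntries` are all hypothetical, and only
    three operations are listed.
    
    ```cpp
    #include <cassert>
    #include <cstring>
    
    // Toy table: name, number of operands, arity (stack entries needed).
    #define TOY_DW_OPS(HANDLE)                                              \
      HANDLE(DW_OP_const1u, 1, 0)                                           \
      HANDLE(DW_OP_dup, 0, 1)                                               \
      HANDLE(DW_OP_plus, 0, 2)
    
    struct OpInfo {
      const char *Name;
      unsigned Operands;
      unsigned Arity;
    };
    
    // Expand every table row into one array element.
    #define TOY_MAKE_ROW(NAME, OPERANDS, ARITY) {#NAME, OPERANDS, ARITY},
    constexpr OpInfo OpTable[] = {TOY_DW_OPS(TOY_MAKE_ROW)};
    #undef TOY_MAKE_ROW
    
    // Linear search by name; returns nullptr for unknown operations.
    const OpInfo *lookupOp(const char *Name) {
      for (const OpInfo &Info : OpTable)
        if (std::strcmp(Info.Name, Name) == 0)
          return &Info;
      return nullptr;
    }
    
    // An evaluator can reject an operation up front when the stack holds
    // fewer entries than the operation's arity.
    bool hasEnoughStackEntries(const char *Name, unsigned StackDepth) {
      const OpInfo *Info = lookupOp(Name);
      assert(Info && "operation is not in this toy table");
      return StackDepth >= Info->Arity;
    }
    ```
    
    Keeping operands and arity as separate columns makes it harder to mix
    up "bytes that follow the opcode" with "entries taken from the stack".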
    JDevlieghere authored Jun 7, 2024
    96d01a3
  100. [AArch64][LoopIdiom] Generalize AArch64LoopIdiomTransform into LoopIdiomVectorize (llvm#94081)
    
    To facilitate sharing LoopIdiomTransform between AArch64 and RISC-V,
    this first patch moves AArch64LoopIdiomTransform from lib/Target/AArch64
    to lib/Transforms/Vectorize and renames it to LoopIdiomVectorize. The
    following patch (llvm#94082) will teach LoopIdiomVectorize how to generate VP
    intrinsics (in addition to the current masked vector style) in favor of
    RVV.
    mshockwave authored Jun 7, 2024
    37e309f
  101. [ELF] Implement --force-group-allocation

    GNU ld's relocatable linking behaviors:
    
    * Sections with the `SHF_GROUP` flag are handled like sections matched
      by the `--unique=pattern` option. They are processed like orphan
      sections and ignored by input section descriptions.
    * Section groups' (usually named `.group`) content is updated as the
      section indexes are updated. Section groups can be discarded with
      `/DISCARD/ : { *(.group) }`.
    
    `-r --force-group-allocation` discards section groups and allows
    sections with the `SHF_GROUP` flag to be matched like normal sections.
    If two section group members are placed into the same output section,
    their relocation sections (if present) are combined as well.
    This behavior can be useful when -r output is used as a pseudo shared
    object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).
    
    This patch implements --force-group-allocation:
    
    * Input SHT_GROUP sections are discarded.
    * Input sections do not get the SHF_GROUP flag, so `addInputSec`
      will combine relocation sections if their relocated section group
      members are combined.
    
    The default behavior is:
    
    * Input SHT_GROUP sections are retained.
    * Input SHF_GROUP sections can be matched (unlike GNU ld)
    * Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec`
      will create different OutputDesc copies.
    
    GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not
    implemented.
    
    Pull Request: llvm#94704
    MaskRay authored Jun 7, 2024
    4d9020c
  102. d1b5a4b
  103. 06188d9
  104. [BOLT][NFC] Unset UseAssemblerInfoForParsing for emission (llvm#94778)

    Summary:
    Use workaround for quadratic behavior inside
    AttemptToFoldSymbolOffsetDifference called from BinaryEmitter::emitLSDA.
    
    
    llvm@b06e736#commitcomment-142836456
    aaupov authored Jun 7, 2024
    7520d0c
  105. bfa937a
  106. [RISCV] Add TargetConstraintType=2 to vnclip pseudoinstructions. NFC

    These instructions are very similar to narrowing shift instructions
    which already have this.
    
    Remove TargetConstraintType parameter from VPseudoBinaryV_WV
    class. Only 2 was ever passed to it. Pass 2 directly to the classes
    instantiated from VPseudoBinaryV_WV instead.
    topperc committed Jun 7, 2024
    06e12b4
  107. 0cdb0b7
  108. [clang-tidy] new check misc-use-internal-linkage (llvm#90830)

    Add new check misc-use-internal-linkage to detect variables and
    functions that can be marked as static.
    
    ---------
    
    Co-authored-by: Danny Mösch <danny.moesch@icloud.com>
    HerrCai0907 and SimplyDanny authored Jun 7, 2024
    c4f83a0
  109. 4346c38
  110. 32d8596
  111. Reland "[python] Bump Python minimum version to 3.8 (llvm#78828)"

    This reverts commit b6824c9.
    
    This relands commit 0a6c74e.
    
    The original commit was reverted due to buildbot failures. These bots
    should be updated now, so hopefully this will stick.
    boomanaiden154 committed Jun 7, 2024
    33f4a77
  112. [mlir][loops] Add getters for multi dim loop variables in `LoopLikeOpInterface` (llvm#94516)
    
    This patch adds `getLoopInductionVars`, `getLoopLowerBounds`,
    `getLoopBounds`, `getLoopSteps` interface methods to
    `LoopLikeOpInterface`. The corresponding single value versions have been
    moved to shared class declaration and have been implemented based on the
    new interface methods.
    srcarroll authored Jun 7, 2024
    6b4c122
  113. [MemProf] Add matching statistics and tracing (llvm#94814)

    To help debug or surface matching issues, add more statistics to the
    matching. Also add optional emission of each context seen in the
    function profiles along with its allocation type, size in bytes, and
    whether it was matched. This information is emitted along with a hash of
    the full stack context, to allow deduplication across modules for
    allocations within header files.
    teresajohnson authored Jun 7, 2024
    7536474
  114. 211edca
  115. 507b372
  116. [RISCV] Remove CarryIn and Constraint parameters from VPseudoTiedBinaryCarryIn. NFC
    
    They were always passed the same values, 1 for CarryIn and "" for
    Constraint.
    topperc committed Jun 7, 2024
    017e240
  117. [RISCV] Rename VPseudoBinaryCarryIn to VPseudoBinaryCarry. NFC

    It doesn't always have a CarryIn. One of the parameters is named
    CarryIn. It always has CarryOut or a CarryIn and in some cases both.
    topperc committed Jun 7, 2024
    c8eff87

Commits on Jun 8, 2024

  1. Add AllowRepeats to SBCommandInterpreterRunOptions. (llvm#94786)

    This is useful if you have a transcript of a user session and want to
    rerun those commands with RunCommandInterpreter. The same functionality
    is also useful in testing.
    
    I'm adding it primarily for the second reason. In a subsequent patch,
    I'm adding the ability for Python based commands to provide their
    "auto-repeat" command. Among other things, that will allow potentially
    state destroying user commands to prevent auto-repeat. Testing this with
    Shell or pexpect tests is not nearly as accurate or convenient as using
    RunCommandInterpreter, but to use that I need to allow auto-repeat.
    
    I think for consistency's sake, having interactive sessions always do
    auto-repeats is the right choice, though that's a lightly held
    opinion...
    jimingham authored Jun 8, 2024
    435dd97
  2. [memprof] Improve deserialization performance in V3 (llvm#94787)

    We call llvm::sort in a couple of places in the V3 encoding:
    
    - We sort Frames by FrameIds for stability of the output.
    
    - We sort call stacks in the dictionary order to maximize the length
      of the common prefix between adjacent call stacks.
    
    It turns out that we can improve the deserialization performance by
    modifying the comparison functions -- without changing the format at
    all.  Both places take advantage of the histogram of Frames -- how
    many times each Frame occurs in the call stacks.
    
    - Frames: We serialize popular Frames in the descending order of
      popularity for improved cache locality.  For two equally popular
      Frames, we break a tie by serializing one that tends to appear
      earlier in call stacks.  Here, "earlier" means a smaller index
      within llvm::SmallVector<FrameId>.
    
    - Call Stacks: We sort the call stacks to reduce the number of times
      we follow pointers to parents during deserialization.  Specifically,
      instead of comparing two call stacks in the strcmp style -- integer
      comparisons of FrameIds, we compare two FrameIds F1 and F2 with
      Histogram[F1] < Histogram[F2] at respective indexes.  Since we
      encode from the end of the sorted list of call stacks, we tend to
      encode popular call stacks first.
    
    Since the two places use the same histogram, we compute it once and
    share it in the two places.
    
    Sorting the call stacks reduces the number of "jumps" by 74% when we
    deserialize all MemProfRecords.  The cycle and instruction counts go
    down by 10% and 1.5%, respectively.
    
    If we sort the Frames in addition to the call stacks, then the cycle
    and instruction counts go down by 14% and 1.6%, respectively, relative
    to the same baseline (that is, without this patch).
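    
    The histogram-driven ordering of call stacks described above can be
    sketched roughly as follows. This is a minimal illustration, not the
    code in the patch: `CallStack`, `Histogram`, `buildHistogram`, and
    `sortCallStacks` are hypothetical names, frames are compared from the
    front of each stack, and the tie-break on raw FrameId is an assumption.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    using FrameId = std::uint64_t;
    using CallStack = std::vector<FrameId>;
    using Histogram = std::unordered_map<FrameId, std::size_t>;

    // Count how many times each frame occurs across all call stacks.
    Histogram buildHistogram(const std::vector<CallStack> &Stacks) {
      Histogram H;
      for (const CallStack &CS : Stacks)
        for (FrameId F : CS)
          ++H[F];
      return H;
    }

    // Order call stacks position by position, comparing frames by popularity
    // rather than by raw FrameId. Less popular frames sort first, so the
    // most popular stacks end up at the back of the list.
    void sortCallStacks(std::vector<CallStack> &Stacks, const Histogram &H) {
      std::sort(Stacks.begin(), Stacks.end(),
                [&H](const CallStack &A, const CallStack &B) {
                  return std::lexicographical_compare(
                      A.begin(), A.end(), B.begin(), B.end(),
                      [&H](FrameId F1, FrameId F2) {
                        std::size_t C1 = H.at(F1), C2 = H.at(F2);
                        // Assumed tie-break: fall back to the FrameId.
                        return C1 != C2 ? C1 < C2 : F1 < F2;
                      });
                });
    }
    ```

    Because the histogram is built once and passed by reference, the same
    counts could also drive the Frame ordering, which matches the sharing
    the message describes.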
    kazutakahirata authored Jun 8, 2024
    dc3f8c2
  3. [InstCombine] Preserve the nsw/nuw flags for (X | Op01C) + Op1C --> X…

    … + (Op01C + Op1C) (llvm#94586)
    
    This patch simplifies `sdiv` to `udiv` by preserving the `nsw` flag for
    `(X | Op01C) + Op1C --> X + (Op01C + Op1C)` if the sum of `Op01C` and
    `Op1C` will not overflow, and preserves the `nuw` flag unconditionally.
    
    Alive2 Proofs (provided by @nikic): https://alive2.llvm.org/ce/z/nrdCZT,
    https://alive2.llvm.org/ce/z/YnJHnH
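    
    As a sanity check on the underlying identity (not on the flag
    propagation, which the Alive2 proofs cover), the fold can be
    brute-forced over 8-bit values. The sketch assumes the `or` is
    disjoint, that is, `X` and `Op01C` share no set bits; the commit
    message does not spell out that precondition, and the function name is
    hypothetical.

    ```cpp
    #include <cstdint>

    // Returns true if (X | C0) + C1 == X + (C0 + C1) for every 8-bit X that
    // shares no set bits with C0, using wrapping unsigned arithmetic.
    bool foldHoldsForDisjointOr(std::uint8_t C0, std::uint8_t C1) {
      for (unsigned V = 0; V < 256; ++V) {
        std::uint8_t X = static_cast<std::uint8_t>(V);
        if ((X & C0) != 0)
          continue; // The or is not disjoint, so it is not an add.
        std::uint8_t Sum = static_cast<std::uint8_t>(C0 + C1);
        std::uint8_t Lhs = static_cast<std::uint8_t>((X | C0) + C1);
        std::uint8_t Rhs = static_cast<std::uint8_t>(X + Sum);
        if (Lhs != Rhs)
          return false;
      }
      return true;
    }
    ```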
    csstormq authored Jun 8, 2024
    96af114
  4. [lld] Discard SHT_LLVM_LTO sections in relocatable links (llvm#92825)

    So long as ld -r links using bitcode always result in an ELF object, and
    not a merged bitcode object, the output from a relocatable link using
    FatLTO objects should not have a .llvm.lto section. Prior to this, using
    the object code sections would cause the bitcode section in the output
    of a relocatable link to be corrupted, by concatenating all the
    .llvm.lto sections together.
    
    This patch discards SHT_LLVM_LTO sections when not using
    --fat-lto-objects, so that the relocatable ELF output won't contain
    invalid bitcode.
    ilovepi authored Jun 8, 2024
    608fb46
  5. [ProfileData] Use default member initialization (NFC) (llvm#94817)

    While we are at it, this patch changes the type of ValueCounts to
    std::array<double, ...> so that we can use std::array::fill.
    
    Identified with modernize-use-default-member-init.
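    
    A small sketch of the two idioms mentioned above, default member
    initializers and `std::array::fill`. `CounterSet`, `NumKinds`, and
    `IsValid` are hypothetical stand-ins rather than the ProfileData class.

    ```cpp
    #include <array>
    #include <cstddef>

    // Hypothetical stand-in for a class holding per-kind counters.
    struct CounterSet {
      static constexpr std::size_t NumKinds = 3;

      // Default member initializers replace a constructor initializer list.
      bool IsValid = false;
      std::array<double, NumKinds> ValueCounts = {};

      // std::array::fill assigns the same value to every element.
      void reset(double V = 0.0) { ValueCounts.fill(V); }
    };
    ```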
    kazutakahirata authored Jun 8, 2024
    4c28844
  6. 4cff8ef
  7. 18c67bf
  8. 4e0ff05
  9. [gn build] Port c4f83a0

    llvmgnsyncbot committed Jun 8, 2024
    4d95850
  10. [RISCV] Rename VPseudoVWALU_VV_VX_VI to VPseudoVWSLL. NFC

    The scheduler class name is hardcoded in the class so it's not a
    general class.
    topperc committed Jun 8, 2024
    5422b5f
  11. [RISCV] Refactor VPseudoVROL and VPseudoVROR multiclasses to use inhe…

    …ritance. NFC
    
    VPseudoVROR can inherit from VPseudoVROL. Adjust the names to
    VPseudoVROT_VV_VX and VPseudoVROT_VV_VX_VI.
    topperc committed Jun 8, 2024
    5fc1b82
  12. [RISCV] Rename VPseudoBinaryNoMaskTU->VPseudoBinaryNoMaskPolicy. NFC

    These pseudoinstructions have a policy operand so calling them
    TU is confusing.
    topperc committed Jun 8, 2024
    7d203b1
  13. [RISCV] Rename VPatBinarySwapped to VPatBinaryMSwapped. NFC

    This class is most closely related to VPatBinaryM.
    topperc committed Jun 8, 2024
    5e94163
  14. [RISCV] Flatten VPatBinaryW_VI_VWSLL and VPatBinaryW_VX_VWSLL into VP…

    …atBinaryW_VV_VX_VI_VWSLL. NFC
    topperc committed Jun 8, 2024
    84b3fe6
  15. [workflows] Add post-commit job that periodically runs the clang stat…

    …ic analyzer (llvm#94106)
    
    This job will run once per day on the main branch, and for every commit
    on a release branch. It currently only builds llvm, but could add more
    sub-projects in the future.
    
    OpenSSF Best Practices recommends running a static analyzer on software
    before it is released:
    https://www.bestpractices.dev/en/criteria/0#0.static_analysis
    tstellar authored Jun 8, 2024
    81671fe
  16. [mlir] Handle the newly-added "Reserved" FramePointerKind for 1a52392

    …(NFC)
    
    /llvm-project/mlir/lib/Target/LLVMIR/ModuleImport.cpp:48:
    tools/mlir/include/mlir/Dialect/LLVMIR/LLVMConversionEnumsFromLLVM.inc:158:11:
    error: enumeration value 'Reserved' not handled in switch [-Werror,-Wswitch]
      switch (value) {
              ^~~~~
    1 error generated.
    DamonFool committed Jun 8, 2024
    c0a1214
  17. [dfsan] Fix release_shadow_space.c (llvm#94770)

    DFSan's sscanf is incorrect
    (llvm#94769), which results in
    erroneous matches when scraping RSS from /proc/maps. This patch works
    around the issue by using strstr as a secondary check.
    
    It also adds a loose validity check for the initial RSS measurement, to
    guard against regressions in get_rss_kb().
    
    Fixes llvm#91287
    thurstond authored Jun 8, 2024
    221336c
  18. [HLSL] Use llvm::Triple::EnvironmentType instead of HLSLShaderAttr::S…

    …haderType (llvm#93847)
    
    `HLSLShaderAttr::ShaderType` enum is a subset of
    `llvm::Triple::EnvironmentType`. We can use
    `llvm::Triple::EnvironmentType` directly and avoid converting one enum
    to another.
    hekota authored Jun 8, 2024
    5d87ba1
  19. [CMake] Update CMake cache file for the ARM/Aarch64 cross toolchain b…

    …uilds. NFC. (llvm#94835)
    
    * generate Clang configuration file with provided target sysroot
    (TOOLCHAIN_TARGET_SYSROOTFS)
    * explicitly pass provided target sysroot into the compiler-rt tests
    configuration.
    * added ability to configure a type of the build libraries -- shared or
    static (TOOLCHAIN_SHARED_LIBS, default OFF)
    
    On behalf of: llvm#94284
    vvereschaka authored Jun 8, 2024
    5aabbf0
  20. [RISCV] Remove many ImmType parameters from tablegen classes. NFC

    These usually have a single value that is always used. We can just
    hardcode into the class body.
    topperc committed Jun 8, 2024
    950605b
  21. [RISCV] Remove unused defaults for sew paramters in tablegen. NFC

    Also remove some unused Constraint parameters that appeared before
    the sew parameter.
    topperc committed Jun 8, 2024
    2fa14fc
  22. [lldb] Remove redundant c_str() calls in stream output (NFC) (llvm#94839

    )
    
    Passing the result of c_str() to a stream is slow and redundant. This
    change removes unnecessary c_str() calls and uses the string object
    directly.
    
    Caught by cppcheck -
    lldb/tools/debugserver/source/JSON.cpp:398:19: performance: Passing the
    result of c_str() to a stream is slow and redundant. [stlcstrStream]
    lldb/tools/debugserver/source/JSON.cpp:408:64: performance: Passing the
    result of c_str() to a stream is slow and redundant. [stlcstrStream]
    lldb/tools/debugserver/source/JSON.cpp:420:54: performance: Passing the
    result of c_str() to a stream is slow and redundant. [stlcstrStream]
    lldb/tools/debugserver/source/JSON.cpp:46:13: performance: Passing the
    result of c_str() to a stream is slow and redundant. [stlcstrStream]
    
    Fix llvm#91212
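    
    To illustrate the cppcheck finding, here is a small sketch comparing
    the two ways of streaming a `std::string`. The function names are
    hypothetical and `std::ostringstream` stands in for whatever stream the
    debugserver code uses. Streaming the object directly uses its stored
    length; streaming `c_str()` makes the stream scan for the terminating
    NUL, and it also stops at an embedded NUL.

    ```cpp
    #include <sstream>
    #include <string>

    // Streams the string object directly; its length is already known.
    std::string quoteDirect(const std::string &S) {
      std::ostringstream OS;
      OS << '"' << S << '"';
      return OS.str();
    }

    // Streams through c_str(); the stream must rescan for the NUL byte.
    std::string quoteViaCStr(const std::string &S) {
      std::ostringstream OS;
      OS << '"' << S.c_str() << '"';
      return OS.str();
    }
    ```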
    xgupta authored Jun 8, 2024
    d3fc5cf
  23. Revert "[lld][AArch64][ELF][PAC] Support .relr.auth.dyn section" (l…

    …lvm#94843)
    
    Reverts llvm#87635
    
    In some corner cases, lld generated an object file with an empty REL
    section with `sh_info` set to 0. This file triggers an lld error when
    used as its input. See
    llvm#87635 (comment)
    for details.
    kovdan01 authored Jun 8, 2024
    2e1788f
  24. a294e89
  25. 3f0f2cd
  26. [Support] Do not use llvm::size in getLoopPreheader (llvm#94540)

    `BlockT *LoopBase<BlockT, LoopT>::getLoopPreheader()` was changed in
    7243607 to use `llvm::size` rather than
    checking that `child_begin() + 1 == child_end()`. `llvm::size`
    requires that `std::distance` be O(1) and hence that clients support
    random access. Use `llvm::hasSingleElement` instead.
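    
    The idea behind a single-element check that does not need random
    access can be sketched as below. `hasExactlyOneElement` is an
    illustrative helper, not the `llvm::hasSingleElement` implementation;
    it needs only forward iterators and takes at most one step.

    ```cpp
    #include <forward_list>
    #include <iterator>

    // Illustrative helper: true if the range holds exactly one element.
    // Works with forward iterators, unlike a size() built on O(1) distance.
    template <typename Range>
    bool hasExactlyOneElement(const Range &R) {
      auto B = std::begin(R), E = std::end(R);
      return B != E && std::next(B) == E;
    }
    ```

    A `std::forward_list<int>` works here even though its iterators cannot
    be subtracted in constant time.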
    bnbarham authored Jun 8, 2024
    6885281
  27. [SystemZ] Fix handling of triples.

    Some Ubuntu builds were broken after 20d497c "[Driver] Remove unneeded
    *-linux-gnu after D158183".
    
    This patch by Fangrui Song fixes this with special handling in config.guess.
    JonPsson1 authored Jun 8, 2024
    7f5d1f1
  28. [mlir][Transforms][NFC] GreedyPatternRewriteDriver: Use composition…

    … instead of inheritance (llvm#92785)
    
    This commit simplifies the design of the `GreedyPatternRewriteDriver`
    class. This class used to inherit from both `PatternRewriter` and
    `RewriterBase::Listener` and then attached itself as a listener.
    
    In the new design, the class has a `PatternRewriter` field instead of
    inheriting from `PatternRewriter`, which is generally preferred in
    object-oriented programming.
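    
    The shape of that design can be sketched with stand-in types.
    `Listener`, `Rewriter`, and `Driver` below are hypothetical and are not
    the MLIR classes; the sketch assumes the driver remains a listener and
    registers itself with the rewriter it owns.

    ```cpp
    #include <string>
    #include <vector>

    // Hypothetical stand-ins; these are not the MLIR classes.
    struct Listener {
      virtual ~Listener() = default;
      virtual void notifyOperationModified(const std::string &Op) = 0;
    };

    class Rewriter {
    public:
      void setListener(Listener *L) { TheListener = L; }
      void modify(const std::string &Op) {
        if (TheListener)
          TheListener->notifyOperationModified(Op);
      }

    private:
      Listener *TheListener = nullptr;
    };

    // The driver is a listener, but it owns a rewriter instead of being one.
    class Driver : public Listener {
    public:
      Driver() { TheRewriter.setListener(this); }
      Driver(const Driver &) = delete; // The rewriter points back at *this.
      Driver &operator=(const Driver &) = delete;

      void run(const std::string &Op) { TheRewriter.modify(Op); }
      const std::vector<std::string> &worklist() const { return Worklist; }

    private:
      void notifyOperationModified(const std::string &Op) override {
        Worklist.push_back(Op);
      }

      Rewriter TheRewriter;
      std::vector<std::string> Worklist;
    };
    ```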
    
    ---------
    
    Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
    matthias-springer and zero9178 authored Jun 8, 2024
    6b3e000
  29. [clang] Report erroneous floating point results in _Complex math (llv…

    …m#90588)
    
    Use handleFloatFloatBinOp to properly diagnose NaN results and divisions
    by zero.
    
    Fixes llvm#84871
    tbaederr authored Jun 8, 2024
    9ddc014
  30. [SDISel][Combine] Constant fold FP16_TO_FP (llvm#94790)

    In some cases, constants can survive early constant folding optimizations
    because they are hidden behind several layers of type changes.
    
    E.g., consider the following sequence (extracted from the arm test that
    this commit changes):
    ```
        t2: v1f16 = BUILD_VECTOR ConstantFP:f16<APFloat(0)>
        t4: v1f16 = insert_vector_elt t2, ConstantFP:f16<APFloat(0)>, Constant:i32<0>
      t5: f16 = bitcast t4
    t6: f32 = fp_extend t5
    ```
    
    Because the constant (APFloat(0)) is hidden behind a <1 x ty> type, all
    the constant folding that normally happens for scalar nodes when using
    `SelectionDAG::getNode` is blocked.
    
    As a result the constant manages to survive as an actual conversion
    instruction down to the select phase:
    ```
    t11: f32 = fp16_to_fp Constant:i32<0>
    ```
    
    With the change in this patch, we try to do constant folding one more
    time during dag combine, which in the motivating example results in the
    much better sequence:
    ```
    t7: ch = CopyToReg t0, Register:f32 %0, ConstantFP:f32<0.000000e+00>
    ```
    
    Note: I'm sure we have this problem in a lot of other places. Generally
    speaking I believe SDISel is not that good with <1 x ty> compared to
    pure scalar. However, I only changed what I could easily test.
    qcolombet authored Jun 8, 2024
    25506f4
  31. [compiler-rt] Replace deprecated aligned_storage with aligned byte ar…

    …ray (llvm#94171)
    
    `std::aligned_storage` is deprecated with C++23, see
    [here](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1413r3.pdf).
    
    This replaces the usages of `std::aligned_storage` within compiler-rt
    with an aligned `std::byte` array.
    I will provide patches for other subcomponents as well.
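    
    A minimal sketch of the replacement pattern, assuming C++17.
    `ManualSlot` is a hypothetical holder type, not code from compiler-rt;
    it shows an `alignas` byte array standing in for
    `std::aligned_storage`.

    ```cpp
    #include <cstddef>
    #include <new>
    #include <utility>

    // Illustrative holder: raw storage for one T, constructed on demand.
    template <typename T>
    class ManualSlot {
    public:
      template <typename... Args>
      T &emplace(Args &&...A) {
        return *::new (static_cast<void *>(Storage)) T(std::forward<Args>(A)...);
      }
      T &get() { return *std::launder(reinterpret_cast<T *>(Storage)); }
      void destroy() { get().~T(); }

    private:
      // Replaces std::aligned_storage<sizeof(T), alignof(T)>::type.
      alignas(T) std::byte Storage[sizeof(T)];
    };
    ```

    The caller stays responsible for pairing `emplace` with `destroy`, as
    it was with `std::aligned_storage`.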
    marcauberer authored Jun 8, 2024
    cac7821
  32. lld/test: Make sure removing %t at first

    2e1788f reverted llvm#94843. It was creating `%t` as a directory, which
    causes an error in incremental builds.
    chapuni committed Jun 8, 2024
    82f6cde
  33. Enable LLDB tests in Linux pre-merge CI (llvm#94208)

    This patch removes LLDB from a list of projects that are excluded from
    building and testing on pre-merge CI on Linux.
    
    The Windows environment needs to be prepared in order to test LLDB
    (llvm#94208 (comment)),
    but we don't have enough maintenance resources to do that at the moment.
    
    Because LLDB has been in the list of projects that need to be tested on
    Clang changes, this PR makes this happen on Linux. This seems to be the
    consensus in the discussion of this PR.
    Endilll authored Jun 8, 2024
    d4eed43
  34. [SimplifyCFG] Don't use a mask for lookup tables generated from switc…

    …hes with an unreachable default case (llvm#94468)
    
    When transforming a switch with holes into a lookup table, we currently
    use a mask to check if the current index is handled by the switch or if
    it is a hole. If it is a hole, we skip loading from the lookup table.
    Normally, if the switch's default case is unreachable this has no
    impact, as the mask test gets optimized away by subsequent passes.
    However, if the switch is large enough that the number of lookup table
    entries exceeds the target's register width, we won't be able to fit all
    the cases into a mask and the switch won't get transformed into a lookup
    table. If we know that the switch's default case is unreachable, we know
    that the mask is unnecessary and can skip constructing it entirely,
    which allows us to transform the switch into a lookup table.
    
    [Example](https://godbolt.org/z/7x7qfx8M1)
    
    In the future, it might be interesting to consider allowing lookup table
    masks to be more than one register large (e.g. using a constant array of
    bit flags, similar to `std::bitset`).
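    
    A source-level analogy of the two lowerings may help. The pass itself
    works on LLVM IR, so the functions below are only an illustration, with
    made-up case values and hypothetical names. The hole at index 2 is
    what the mask guards against.

    ```cpp
    #include <cstdint>

    // Source form: a switch with a hole at index 2.
    int viaSwitch(unsigned I) {
      switch (I) {
      case 0: return 10;
      case 1: return 20;
      case 3: return 40;
      default: return -1;
      }
    }

    // Lookup table guarded by a hole mask: bit I is set if case I exists.
    // The mask must fit in one register, which limits the table size.
    int viaMaskedTable(unsigned I) {
      static const int Table[4] = {10, 20, 0, 40};
      const std::uint64_t Mask = 0b1011;
      if (I < 4 && ((Mask >> I) & 1))
        return Table[I];
      return -1; // Default case.
    }

    // With an unreachable default, I is promised to be 0, 1, or 3, so the
    // mask is unnecessary and the table size is no longer limited by it.
    int viaUnmaskedTable(unsigned I) {
      static const int Table[4] = {10, 20, 0, 40};
      return Table[I];
    }
    ```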
    DaMatrix authored Jun 8, 2024
    540f68c
  35. 2d21851
  36. [DAGCombine] Fix miscompilation caused by PR94008 (llvm#94850)

    The PR description in llvm#94008 does not match the code.
    > + When VT is smaller than ShiftVT, it is safe to use trunc.
    > + When VT is larger than ShiftVT, it is safe to use zext iff
    > `is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
    > also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    > proofs.
    
    Closes llvm#94824.
    dtcxzyw authored Jun 8, 2024
    d9507a3
  37. [Reassociate] Use uint64_t for repeat count (llvm#94232)

    This patch relands llvm#91469 and uses `uint64_t` for repeat count to avoid
    a miscompilation caused by overflow
    llvm#91469 (comment).
    dtcxzyw authored Jun 8, 2024
    645fb04
  38. [X86] Support ATOMIC_LOAD_FP_BINOP_MI for other binops (llvm#87524)

    Since we can bitcast and then do the same thing sub does in the table
    section above, I figured it was trivial to add fsub, fmul, and fdiv.
    AreaZR authored Jun 8, 2024
    bca7864
  39. c870882
  40. [ProfileData] Use a range-based for loop (NFC) (llvm#94856)

    While I am at it, this patch adds const to a couple of places.
    kazutakahirata authored Jun 8, 2024
    38124fe
  41. [memprof] Remove redundant virtual (NFC) (llvm#94858)

    'override' makes 'virtual' redundant.
    
    Identified with modernize-use-override.
    kazutakahirata authored Jun 8, 2024
    6834e6d
  42. [libc++][NFC] Simplify the implementation of __promote (llvm#81379)

    This depends on enabling the use of extensions.
    philnik777 authored Jun 8, 2024
    c8992fb
  43. [RISCV][MC] Implicit 0-offset aliases for JR/JALR (llvm#94688)

    This broadly follows how in almost all places, we accept `(<reg>)` to
    mean `0(<reg>)`, but I think these are the first like this for Jumps
    rather than Loads/Stores. These are accepted by binutils but not by
    LLVM: https://godbolt.org/z/GK7MGE7q7
    lenary authored Jun 8, 2024
    bafff3e
  44. [ProfileData] Use default member initialization (NFC) (llvm#94860)

    Identified with modernize-use-default-member-init.
    kazutakahirata authored Jun 8, 2024
    80d00bf
  45. [lldb] Use const reference for range variables to improve performance…

    … (NFC) (llvm#94840)
    
    Cppcheck recommends using a const reference for range variables in a
    for-each loop.
    This avoids unnecessary copying of elements, improving performance.
    
    Caught by cppcheck -
    lldb/source/API/SBBreakpoint.cpp:717:22: performance: Range variable
    'name' should be declared as const reference. [iterateByValue]
    lldb/source/API/SBTarget.cpp:1150:15: performance: Range variable 'name'
    should be declared as const reference. [iterateByValue]
    lldb/source/Breakpoint/Breakpoint.cpp:888:26: performance: Range
    variable 'name' should be declared as const reference. [iterateByValue]
    lldb/source/Breakpoint/BreakpointIDList.cpp:262:26: performance: Range
    variable 'name' should be declared as const reference. [iterateByValue]
    
    Fix llvm#91213
    Fix llvm#91217
    Fix llvm#91219
    Fix llvm#91220
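    
    The difference cppcheck flags can be seen in a small sketch. The
    function names are hypothetical and the element type is assumed to be
    `std::string`, as the range variable `name` suggests.

    ```cpp
    #include <cstddef>
    #include <string>
    #include <vector>

    // Copies every element into `Name` before using it.
    std::size_t totalLengthByValue(const std::vector<std::string> &Names) {
      std::size_t Total = 0;
      for (std::string Name : Names)
        Total += Name.size();
      return Total;
    }

    // Binds `Name` to each element in place; no copies are made.
    std::size_t totalLengthByConstRef(const std::vector<std::string> &Names) {
      std::size_t Total = 0;
      for (const std::string &Name : Names)
        Total += Name.size();
      return Total;
    }
    ```

    Both functions return the same value; only the second avoids a string
    copy per iteration.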
    xgupta authored Jun 8, 2024
    1e92ad4
  46. [libc][math][c23] fmul correcly rounded to all rounding modes (llvm#9…

    …1537)
    
    This is an implementation of floating point multiplication:
    
    It will consist of 
       - `double x double -> float`
    Jobhdez authored Jun 8, 2024
    263be9f
  47. [libc][math][C23] Implemented remquof128 function (llvm#94809)

    Added remquof128 function. Closes llvm#94312
    HendrikHuebner authored Jun 8, 2024
    44aecca
  48. [VPlan] Check if only first part is used for all per-part VPInsts.

    Apply the onlyFirstPartUsed logic generally to all per-part
    VPInstructions. Note that the test changes remove the second part
    of an unused first-order recurrence splice.
    fhahn committed Jun 8, 2024
    a43d999
  49. [RISCV][GISel] Add calling convention support for half (llvm#94110)

    This patch adds initial support for the half type on RISC-V.
    dtcxzyw authored Jun 8, 2024
    643e471
  50. [VPlan] Mark FirstOrderRecurrenceSplice as not having side-effects.

    Now that FOR exit and resume value creation is explicitly modeled in
    VPlan (05e1b53, 07b3301), it doesn't depend on the first
    order recurrence splice being preserved and it can now be marked as not
    having side-effects. This allows removal of first-order-recurrence-splice
    if the FOR is only used in the exit or as scalar ph resume value.
    fhahn committed Jun 8, 2024
    998c33e
  51. [ProfileData] Simplify calls to readNext in readBinaryIdsInternal (NF…

    …C) (llvm#94862)
    
    readNext has two variants:
    
    - readNext<uint64_t, endian>(ptr)
    - readNext<uint64_t>(ptr, endian)
    
    This patch uses the latter to simplify readBinaryIdsInternal.  Both
    forms default to unaligned.
    kazutakahirata authored Jun 8, 2024
    e62c214
  52. febfbff
  53. c2d68c4
  54. [InstCombine] Propagate flags when folding consecutative shifts

    When we fold `(shift (shift C0, x), C1)` we can propagate flags that
    are common to both shifts.
    
    Proofs: https://alive2.llvm.org/ce/z/LkEzXD
    
    Closes llvm#94872
    goldsteinn committed Jun 8, 2024
    2900d03
  55. 2e482b2
  56. [MC] Simplify Sec.getFragmentList().insert(Sec.begin(), F). NFC

    Decrease the uses of getFragmentList() to make it easier to change the
    fragment list representation.
    MaskRay committed Jun 8, 2024
    dcb71c0

Commits on Jun 9, 2024

  1. [SPARC][IAS] Add GNU extension for addc

    Transform `addc imm, %rs, %rd` into `addc %rs, imm, %rd`.
    This is used in some GNU and Linux code.
    
    Reviewers: s-barannikov, rorth, jrtc27, brad0
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94245
    koachan authored Jun 9, 2024
    f20d8b9
  2. [SPARC][IAS] Add support for %uhi and %ulo extensions

    This adds support for GNU %uhi and %ulo extensions.
    Those resolve to the same relocations as %hh and %hm.
    
    Reviewers:
    cyndyishida, dcci, brad0, jrtc27, aaupov, Endilll, rorth, maksfb, #reviewers-libcxxabi, s-barannikov, rafaelauler, ayermolo, #reviewers-libunwind, #reviewers-libcxx, daniel-grumberg, tbaederr
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94246
    koachan authored Jun 9, 2024
    44f9357
  3. [SPARC][IAS] Add aliases for %asr20-21 as defined in JPS1

    This adds %set_softint and %clear_softint aliases for %asr20 and %asr21
    as defined in JPS1.
    
    Reviewers: jrtc27, brad0, s-barannikov, rorth
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94247
    koachan authored Jun 9, 2024
    715a5d8
  4. [clang][Interp][NFC] Refactor lvalue-to-rvalue conversion code

    Always perform the conversion if the flag is set, and don't make
    it dependent on whether we're checking the result for initialization.
    tbaederr committed Jun 9, 2024
    cc8fa1e
  5. [clang-tidy] Ignore non-math operators in readability-math-missing-pa…

    …rentheses (llvm#94654)
    
    Do not emit warnings for non-math operators.
    
    Closes llvm#92516
    PiotrZSL authored Jun 9, 2024
    d211abc
  6. 338cbfe
  7. [ARM] vector-store.ll - add big-endian test coverage

    Based on feedback on llvm#94863
    RKSimon committed Jun 9, 2024
    32b7043
  8. [clang-tidy] Ignore implicit functions in readability-implicit-bool-c…

    …onversion (llvm#94512)
    
    Ignore implicit declarations and defaulted functions. This helps with
    issues in generated code, such as the C++ spaceship operator.
    
    Closes llvm#93409
    PiotrZSL authored Jun 9, 2024
    e329bfc
  9. [clang-tidy] Extend modernize-use-designated-initializers with new op…

    …tions (llvm#94651)
    
    Add StrictCStandardCompliance and StrictCppStandardCompliance options
    that default to true.
    
    Closes llvm#83732
    PiotrZSL authored Jun 9, 2024
    31b84d4
  10. [clang-tidy] Improve bugprone-multi-level-implicit-pointer-conversion (

    …llvm#94524)
    
    Ignore implicit pointer conversions that are part of a cast expression
    
    Closes llvm#93959
    PiotrZSL authored Jun 9, 2024
    b55fb56
  11. 46d94bd
  12. [DAG] FoldConstantArithmetic - allow binop folding to work with diffe…

    …ring bitcasted constants (llvm#94863)
    
    We currently only constant fold binop(bitcast(c1),bitcast(c2)) if c1 and c2 are both bitcasted and from the same type.
    
    This patch relaxes this assumption to allow the constant build vector to originate from different types (and allow cases where only one operand was bitcasted).
    
    We still ensure we bitcast back to one of the original types if both operands were bitcasted (we assume that if we have a non-bitcasted constant then it's legal to keep using that type).
    RKSimon authored Jun 9, 2024
    53fecef
  13. [DAG] Fold fdiv X, c2 -> fmul X, 1/c2 without AllowReciprocal if exact (

    llvm#93882)
    
    This moves the combine of fdiv by constant to fmul out of an
    'if (Options.UnsafeFPMath || Flags.hasAllowReciprocal())' block,
    so that it triggers if the divide is exact. An extra check for
    Recip.isDenormal() is added as multiple places make reference
    to it being unsafe or slow on certain platforms.
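    
    The exactness condition can be illustrated on plain doubles. This is a
    sketch under assumptions, not the DAG combine: `hasExactReciprocal` and
    `divideOrMultiply` are hypothetical names, and the `std::fma` check is
    one way to test that `1/C` was computed without rounding.

    ```cpp
    #include <cmath>

    // True if 1/C is exactly representable as a normal double, in which
    // case X / C and X * (1/C) round to the same result for every X.
    bool hasExactReciprocal(double C) {
      if (C == 0.0 || !std::isfinite(C))
        return false;
      double Recip = 1.0 / C;
      // fma computes Recip * C - 1 with a single rounding, so a zero
      // result means Recip * C is exactly 1. Denormal reciprocals are
      // rejected, mirroring the extra check described above.
      return std::isnormal(Recip) && std::fma(Recip, C, -1.0) == 0.0;
    }

    // Uses the multiply form only when it cannot change the result.
    double divideOrMultiply(double X, double C) {
      return hasExactReciprocal(C) ? X * (1.0 / C) : X / C;
    }
    ```

    Powers of two such as 4.0 qualify, while 3.0 does not because 1/3 has
    no finite binary expansion.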
    davemgreen committed Jun 9, 2024
    a284bdb
  14. [VPlan] Handle more cases in VPInstruction::onlyFirstPartUsed.

    Handle binary ops and a few other instructions in onlyFirstPartUsed;
    they only use the first part if they themselves only have their first
    part used.
    fhahn committed Jun 9, 2024
    2f4ebf8
  15. cb8e936
  16. 69cd2d2
  17. 5bb9c08
  18. [AMDGPU] Swap range metadata to attribute for workitem id. (llvm#94871)

    Swap out range metadata to range attribute for calls to be able to
    deprecate range metadata on calls in the future.
    andjo403 authored Jun 9, 2024
    cc19374
  19. [SPARC][IAS] Add named prefetch tag constants

    This adds named tag constants (such as `#one_write` and `#one_read`)
    for the prefetch instruction.
    
    Reviewers: jrtc27, rorth, brad0, s-barannikov
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94249
    koachan authored Jun 9, 2024
    2388129
  20. [SPARC][IAS] Add support for prefetcha instruction

    This adds support for the `prefetcha` instruction for prefetching from
    alternate address spaces.
    
    Reviewers: jrtc27, brad0, rorth, s-barannikov
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94250
    koachan authored Jun 9, 2024
    41f2ea0
  21. 8901f71
  22. [SPARC][IAS] Handle the case of non-4-byte aligned writeNopData

    If the Count passed into writeNopData is not a multiple of four,
    write a small number of zero bytes before the NOP stream.
    This makes it match the behavior of GNU binutils.
    
    Reviewers: brad0, rorth, s-barannikov, jrtc27
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94251
    koachan authored Jun 9, 2024
    2bc36af
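
The padding rule described in this message can be sketched as follows. This is only an illustration, not the backend's `writeNopData`: the function name `makeNopPadding` is hypothetical, and it assumes a big-endian target and the usual SPARC `nop` encoding `0x01000000`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: build `count` bytes of padding. The count % 4 leftover
// bytes are emitted first as zeros, followed by whole 4-byte NOP words.
inline std::vector<std::uint8_t> makeNopPadding(std::uint64_t count) {
  std::vector<std::uint8_t> out;
  out.reserve(count);
  // Leading zero bytes so that the NOP stream that follows is a multiple of 4.
  for (std::uint64_t i = 0; i < count % 4; ++i)
    out.push_back(0x00);
  // Assumed NOP encoding 0x01000000, written most significant byte first.
  for (std::uint64_t i = 0; i < count / 4; ++i) {
    out.push_back(0x01);
    out.push_back(0x00);
    out.push_back(0x00);
    out.push_back(0x00);
  }
  return out;
}
```

For example, a request for 6 bytes yields two zero bytes followed by one NOP word.
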
  23. [SPARC][IAS] Add movr(n)e alias for movr(n)z

    This adds the alternate mnemonics for movrz and movrnz.
    
    Reviewers: s-barannikov, jrtc27, brad0, rorth
    
    Reviewed By: s-barannikov
    
    Pull Request: llvm#94252
    koachan authored Jun 9, 2024
    e0b9cce
  24. [libc++][TZDB] Implements time_zone get_info(local_time). (llvm#89537)

    Implements parts of:
    - P0355 Extending chrono to Calendars and Time Zones
    mordante authored Jun 9, 2024
    de736d9
  25. [Instrumentation] Remove an extraneous ArrayRef (NFC) (llvm#94890)

    We can implicitly convert RemainingVDs to an ArrayRef.  Note that
    RemainingVDs is of type SmallVector<InstrProfValueData, 24>.
    kazutakahirata authored Jun 9, 2024
    f7ccb32
  26. e090bac
  27. 089c4bb
  28. [RISCV] Cleanup some Constraint parameters in RISCVInstrInfoVPseudos.td. NFC
    
    Remove unneeded parameters or sync into class if they are only
    ever used with one value.
    topperc committed Jun 9, 2024
    add8908
  29. e4b0655
  30. GlobalISel: Remove faulty assert in buildAtomicRMW op

    Vectors are supported for fp operations now, so remove the assert. The
    supported type/operation combinations are best left for the verifier.
    Avoids regression in future commit that starts treating some vector
    cases as legal.
    arsenm committed Jun 9, 2024
    014446c
  31. [NFC][mlir][gpu] Fully-qualify all namespaces in the GPU compilation interfaces (llvm#94908)
    
    Fully qualify all namespaces appearing in `GPUTargetAttrInterface` and
    `OffloadingLLVMTranslationAttrInterface`. If they're not fully qualified
    then out-of-tree dialects might encounter name resolution errors.
    fabianmcg authored Jun 9, 2024
    d639b91
  32. [Clang][OpenMP] throw compilation error instead of crash in Stmt::OMPScopeDirectiveClass case (llvm#77535) (llvm#84135)
    
    Fix llvm#77535: change the unstable assertion into a compilation error and
    add a test for it.
    Puellaquae authored Jun 9, 2024
    dbe63e3
  33. [ProfileData] Refactor VTableNamePtr and CompressedVTableNamesLen (NFC) (llvm#94859)
    
    VTableNamePtr and CompressedVTableNamesLen are always used together to
    create a StringRef in getSymtab.
    
    We can create the StringRef ahead of time in readHeader.  This way,
    IndexedInstrProfReader becomes a tiny bit simpler with fewer member
    variables.  Also, StringRef default-constructs itself with its Data
    and Length set to nullptr and 0, respectively, which is exactly what
    we need.
    kazutakahirata authored Jun 9, 2024
    521238d
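
The shape of this refactoring can be sketched with `std::string_view` standing in for `llvm::StringRef`. The struct names below are hypothetical and the real `IndexedInstrProfReader` has many more members; the point is only that one view replaces a pointer/length pair and default-constructs to the empty state.

```cpp
#include <cstddef>
#include <string_view>

// Before (sketch): two members that are always consumed together.
struct ReaderFieldsBefore {
  const char *VTableNamePtr = nullptr;
  std::size_t CompressedVTableNamesLen = 0;

  // Every consumer has to reassemble the view from the two fields.
  std::string_view names() const {
    return VTableNamePtr
               ? std::string_view(VTableNamePtr, CompressedVTableNamesLen)
               : std::string_view();
  }
};

// After (sketch): a single view that is built once when the header is read.
struct ReaderFieldsAfter {
  // Default-constructed: data() == nullptr and size() == 0.
  std::string_view VTableName;
};
```
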
  34. [libc++][TZDB] Implements time_zone::to_sys. (llvm#90394)

    This implements the throwing overload and the exception classes thrown by
    this overload.
    
    Implements parts of:
    - P0355 Extending chrono to Calendars and Time Zones
    mordante authored Jun 9, 2024
    77116bd
  35. [NFC][mlir][gpu] Make sym_name an inherent attr in GPUModuleOp (llvm#94918)
    
    Make `sym_name` an inherent attr in GPUModuleOp so that it doesn't show
    in the discardable attributes.
    The change is safe as the attribute is always expected to be present.
    fabianmcg authored Jun 9, 2024
    54373e0
  36. MCInst: decrease inline element count to 6. NFC

    MCInst is primarily used in local variables and MCRelaxableFragment
    (mostly JMP/JCC for x86). Reducing the inline element count can make
    MCRelaxableFragment smaller, potentially leading to a lower peak RSS.
    
    When compiling sqlite3.c, x86-64 has the largest maximum numOperands.
    
    aarch64: 5; ppc64: 6; riscv64: 3; s390x: 6; x86-64: 8
    
    Here is the frequency table for x86-64:
    
    max getNumOperands: 8
    0: 676
    1: 37892
    2: 84046
    3: 26767
    4: 1640
    5: 1222
    6: 80794
    7: 768
    8: 22
    
    Pull Request: llvm#94913
    MaskRay authored Jun 9, 2024
    acf6721
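
To put the frequency table above in perspective, the following sketch sums it, assuming each row is the number of MCInsts seen with that operand count. The names `kOperandCounts` and `countFitting` are illustrative only. Under that reading, 233,037 of 233,827 instructions (about 99.7%) have at most 6 operands and would fit in the reduced inline storage.

```cpp
#include <array>
#include <cstdint>

// Counts copied from the x86-64 table above; the index is the operand count.
constexpr std::array<std::uint64_t, 9> kOperandCounts = {
    676, 37892, 84046, 26767, 1640, 1222, 80794, 768, 22};

// Number of instructions whose operand count is at most `inlineCapacity`.
constexpr std::uint64_t countFitting(unsigned inlineCapacity) {
  std::uint64_t total = 0;
  for (unsigned i = 0; i < kOperandCounts.size() && i <= inlineCapacity; ++i)
    total += kOperandCounts[i];
  return total;
}
```
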

Commits on Jun 10, 2024

  1. 63ef2ec
  2. cbd7eab
  3. [mlir][python] Fix attribute registration in ir.py (llvm#94615)

    This PR fixes attribute registration for `SI8Attr` and `UI8Attr` in
    `ir.py`.
    eospadov authored Jun 10, 2024
    367d502
  4. [ProfileData] Refactor BinaryIdsStart and BinaryIdsSize (NFC) (llvm#94922)
    
    BinaryIdsStart and BinaryIdsSize in IndexedInstrProfReader are always
    used together, so this patch packages them into an ArrayRef<uint8_t>.
    
    For now, readBinaryIdsInternal immediately unpacks ArrayRef into its
    constituents to avoid touching the rest of readBinaryIdsInternal.
    kazutakahirata authored Jun 10, 2024
    4403cdb
  5. [MC,test] Reorganize relax-recompute-align.s & layout-interdependency.s

    relax-recompute-align.s might change when we change the fragment
    relaxation approach.
    MaskRay committed Jun 10, 2024
    bf0d76d
  6. cb1a727
  7. [MC] Relax fragments eagerly

    Lazy relaxation caused hash table lookups (`getFragmentOffset`) and
    complex use/compute interdependencies. Some expressions involving
    forward-declared symbols (e.g. `subsection-if.s`) cannot be computed.
    Recursion detection requires complex `IsBeingLaidOut`
    (https://reviews.llvm.org/D79570).
    
    D76114's `invalidateFragmentsFrom` makes lazy relaxation even less
    useful.
    
    Switch to eager relaxation to greatly simplify code and resolve these
    issues. This change also removes a `getPrevNode` use, which makes it
    more feasible to replace the fragment representation, which might yield
    a large peak RSS win.
    
    Minor downsides: The number of section relaxations may increase (offset
    by avoiding the hash table lookup). For relax-recompute-align.s, the
    computed layout is not optimal.
    MaskRay committed Jun 10, 2024
    9d0754a
  8. [mlir][bufferization] Fix handling of indirect function calls (llvm#94896)
    
    This commit fixes a crash in the ownership-based buffer deallocation
    pass when indirectly calling a function via SSA value. Such functions
    must be conservatively assumed to be public.
    
    Fixes llvm#94780.
    matthias-springer authored Jun 10, 2024
    13896b6
  9. bb4ee27
  10. [libc++][TZDB] Implements time_zone::to_sys. (llvm#90901)

    This implements the overload with the choose argument and adds this
    enum.
    
    Implements parts of:
    - P0355 Extending chrono to Calendars and Time Zones
    mordante authored Jun 10, 2024
    87cedbe
  11. a47e40b
  12. a6929db
  13. [CodeGen] Simplify codegen for array initialization (llvm#93956)

    This makes codegen for array initialization simpler in two ways:
    1. Drop the zero-index GEP at the start, which is no longer needed with
    opaque pointers.
    2. Emit GEPs directly to the correct element, instead of having a long
    chain of +1 GEPs. This is more canonical, and also avoids regressions in
    unoptimized builds from llvm#93823.
    nikic authored Jun 10, 2024
    12d24e0
  14. [TLI] ReplaceWithVecLib: drop Instruction support (llvm#94365)

    Refactor the pass to only support `IntrinsicInst` calls.
    
    `ReplaceWithVecLib` used to support instructions, as AArch64 was using
    this pass to replace a vectorized frem instruction with the fmod vector
    library call (through TLI).
    
    As this replacement is now done by the codegen (llvm#83859), there is no
    need for this pass to support instructions.
    
    Additionally, removed 'frem' tests from:
    - AArch64/replace-with-veclib-armpl.ll
    - AArch64/replace-with-veclib-sleef-scalable.ll
    - AArch64/replace-with-veclib-sleef.ll
    
    Such testing is done at codegen level:
    - llvm#83859
    paschalis-mpeis authored Jun 10, 2024
    e4790ce
  15. [dexter] Correctly identify stop-reason while driving VisualStudio (llvm#94754)
    
    Prior to this patch, VisualStudio._get_step_info incorrectly identified
    the reason the debugger had stopped; e.g., stepping through a program
    would be reported as a StopReason.Breakpoint rather than
    StopReason.Step.
    
    Fix. No test added as there are no VisualStudio tests (tested locally).
    OCHyams authored Jun 10, 2024
    832b91f
  16. e58f830
  17. Reapply [ConstantFold] Remove non-trivial gep-of-gep fold (llvm#93823)

    Reapply after llvm#93956, which
    changed clang array initialization codegen to avoid size regressions
    for unoptimized builds.
    
    -----
    
    This fold is subtly incorrect, because DL-unaware constant folding does
    not know the correct index type to use, and just performs the addition
    in the type that happens to already be there. This is incorrect, since
    sext(X)+sext(Y) is generally not the same as sext(X+Y). See the
    `@constexpr_gep_of_gep_with_narrow_type()` for a miscompile with the
    current implementation.
    
    One could try to restrict the fold to cases where no overflow occurs,
    but I'm not bothering with that here, because the DL-aware constant
    folding will take care of this anyway. I've only kept the
    straightforward zero-index case, where we just concatenate two GEPs.
    nikic committed Jun 10, 2024
    cc158d4
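
The claim in this message that sext(X)+sext(Y) is generally not the same as sext(X+Y) can be checked with a small standalone sketch. It uses 8-bit values widened to 64 bits purely for illustration; the function names are hypothetical and this is not the constant folder's code.

```cpp
#include <cstdint>

// sext(X) + sext(Y): widen both operands first, then add in 64 bits.
inline std::int64_t extendThenAdd(std::int8_t x, std::int8_t y) {
  return static_cast<std::int64_t>(x) + static_cast<std::int64_t>(y);
}

// sext(X + Y): add in 8 bits with wrap-around, then widen the result.
inline std::int64_t addThenExtend(std::int8_t x, std::int8_t y) {
  // Perform the narrow addition in unsigned arithmetic and keep the low 8 bits.
  const unsigned wrapped =
      (static_cast<unsigned>(x) + static_cast<unsigned>(y)) & 0xFFu;
  // Reinterpret the 8-bit pattern as a signed value (two's complement).
  return wrapped >= 0x80u ? static_cast<std::int64_t>(wrapped) - 0x100
                          : static_cast<std::int64_t>(wrapped);
}
```

For X = Y = 100 the two orders disagree: widening first gives 200, while adding in the narrow type wraps to -56. For small operands such as X = Y = -1 both give -2, which is why the fold only misbehaves when the narrow addition overflows.
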
  18. c0b65a2
  19. [clang][analyzer] Improved PointerSubChecker (llvm#93676)

    The checker is made more exact (only pointers into arrays are allowed,
    and the array index is checked), and more tests are added.
    balazske authored Jun 10, 2024
    26224ca
  20. [RemoveDIs] C API: Add before-dbg-record versions of IRBuilder position funcs (llvm#92417)
    
    Add `LLVMPositionBuilderBeforeDbgRecords` and
    `LLVMPositionBuilderBeforeInstrAndDbgRecords` to `llvm/include/llvm-c/Core.h`
    which behave the same as `LLVMPositionBuilder` and `LLVMPositionBuilderBefore`
    except that the position is set before debug records attached to the target
    instruction (the existing functions set the insertion point to after any
    attached debug records).
    
    More info on debug records and the migration towards using them can be found
    here: https://llvm.org/docs/RemoveDIsDebugInfo.html
    
    The distinction is important in some situations. An important example is when
    inserting a phi before another instruction which has debug records attached to
    it (these come "before" the instruction). Inserting before the instruction but
    after the debug records would result in having debug records before a phi, which
    is illegal. That results in an assertion failure:
    
    `llvm/lib/IR/Instruction.cpp:166: Assertion '!isa<PHINode>(this) && "Inserting
    PHI after debug-records!"' failed.`
    
    In llvm (C++) we've added a bit to instruction iterators that carries around the
    extra information. Adding dedicated functions seemed like the least invasive and
    least surprising way to update the C API.
    
    Update llvm/tools/llvm-c-test/debuginfo.c to test this functionality.
    
    Update the OCaml bindings, the migration docs and release notes.
    OCHyams authored Jun 10, 2024
    d732a32
  21. [flang] lower SHAPE with assumed-rank arguments (llvm#94812)

    Allocate result statically on the stack (using max rank) and use the
    runtime to fill it in correctly.
    jeanPerier authored Jun 10, 2024
    0257f9c
  22. [lldb] Fix redundant condition in compression type check (NFC) (llvm#94841)
    
    The `else if` condition for checking `m_compression_type` is redundant
    as it matches with a previous `if` condition, making the expression
    always false. Reported by cppcheck as a possible cut-and-paste error.
    
    Caught by cppcheck -
    
    lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp:543:35:
    style: Expression is always false because 'else if' condition matches
    previous condition at line 535. [multiCondition]
    
    Fix llvm#91222
    xgupta authored Jun 10, 2024
    0af2e75
  23. [lldb] Remove redundant condition in watch mask check (NFC) (llvm#94842)

    This issue is reported by cppcheck as a pointless test in the watch mask
    check. The `else if` condition is opposite to the previous `if`
    condition, making the expression always true.
    
    Caught by cppcheck -
    
    lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp:509:25:
    style: Expression is always true because 'else if' condition is opposite
    to previous condition at line 505. [multiCondition]
    
    Fix llvm#91223
    xgupta authored Jun 10, 2024
    30bfab3
  24. NFC fix typos from llvm#92417

    OCHyams committed Jun 10, 2024
    38c01c3
  25. 760d880
  26. a0faf79
  27. [lldb] Gracefully down TestCoroutineHandle test in case the 'coroutine' feature is missing (llvm#94903)
    
    Do not let compilation fail when the target platform does not support
    the C++ 'coroutine' feature. Just compile without it and let lldb know
    that the feature is missing or unsupported.
    slydiman authored Jun 10, 2024
    23b8f59
  28. [KnownBits] Speed up ForeachKnownBits in unit test. NFC. (llvm#94939)

    Use fast unsigned arithmetic before constructing an APInt. This gives
    me a ~2x speed up when running this in my Release+Asserts build:
    
    $ unittests/Support/SupportTests
    --gtest_filter=KnownBitsTest.*Exhaustive
    jayfoad authored Jun 10, 2024
    f97bcdb
  29. c9fd7b1
  30. [clang-tidy] doesNotMutateObject: Handle calls to member functions and operators that have non-const overloads. (llvm#94362)

    This allows `unnecessary-copy-initialization` to warn on more cases.
    
    The common case is a class with a set of const/non-const overloads
    (e.g. std::vector::operator[]).
    
    ```
    void F() {
      std::vector<Expensive> v;
      // ...
    
      const Expensive e = v[i];
    }
    ```
    legrosbuffle authored Jun 10, 2024
    415a82c
  31. 317ed77
  32. [flang] use hlfir base when translating assumed-rank entity to fir::ExtendedValue (llvm#94822)
    
    The hlfir::Entity to fir::ExtendedValue conversion usually uses the "fir
    base" output of hlfir.declare (which is the same as the input) to avoid
    introducing temporary descriptors for the sole purpose of updating
    lower bound information. This is possible because local lower
    bounds, if any, are tracked in a vector inside the fir::ExtendedValue.
    
    With assumed-ranks, the lower bounds cannot be tracked inside the
    fir::ExtendedValue vector (their number is unknown at compile time).
    Hence, the fir.box/fir.class used in fir::ExtendedValue in lowering must
    always contain accurate local lower bound information.
    jeanPerier authored Jun 10, 2024
    81469a2
  33. [flang][Transforms][NFC] reduce boilerplate in func attr pass (llvm#94739)
    
    Use tablegen to automatically create the pass constructor.
    
    The purpose of this pass is to add attributes to functions, so it
    doesn't need to work on other top level operations.
    tblah authored Jun 10, 2024
    a6129a5
  34. ae9d89d
  35. [KnownBits] Speed up conflict handling in ForeachKnownBits in unit test. (llvm#94943)
    
    Exit early if known bits have a conflict. This gives me a ~15% speed up
    when running this in my Release+Asserts build:
    
    $ unittests/Support/SupportTests
    --gtest_filter=KnownBitsTest.*Exhaustive
    jayfoad authored Jun 10, 2024
    ecb9d94
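
The idea of enumerating known-bits states with plain unsigned arithmetic and skipping conflicting ones early can be sketched as below. This is a hypothetical stand-in, not the unit test's `ForeachKnownBits`: it models a known-bits value as a pair of (zero, one) masks, treats a bit set in both as a conflict, and assumes a width below 32.

```cpp
#include <cstdint>

// Hypothetical sketch: visit every conflict-free (zero, one) mask pair of the
// given bit width and return how many pairs were visited.
template <typename Fn>
std::uint64_t forEachKnownBits(unsigned width, Fn visit) {
  const std::uint32_t limit = 1u << width; // assumes width < 32
  std::uint64_t visited = 0;
  for (std::uint32_t zero = 0; zero < limit; ++zero) {
    for (std::uint32_t one = 0; one < limit; ++one) {
      // Conflict: some bit is claimed to be both 0 and 1. Skip before doing
      // any further (more expensive) work on this pair.
      if ((zero & one) != 0)
        continue;
      visit(zero, one);
      ++visited;
    }
  }
  return visited;
}
```

Each bit is independently unknown, known zero, or known one, so a width of n produces 3^n conflict-free pairs; a caller would build the heavier arbitrary-precision objects only inside `visit`.
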
  36. [flang][OpenMP] Fix unused prefixes in function-filtering-2 test (llvm#94330)
    
    Co-authored-by: Andrew Gozillon <Andrew.Gozillon@amd.com>
    tblah and agozillon authored Jun 10, 2024
    8dc8b9f
  37. [libc++][TZDB] Implements time_zone::to_local. (llvm#91003)

    Implements parts of:
    - P0355 Extending chrono to Calendars and Time Zones
    mordante authored Jun 10, 2024
    da03175
  38. fe0dee4
  39. [mlir][emitc] Remove copy from scf.for lowering (llvm#94898)

    Remove the copy into fresh variables done when lowering scf.for into
    emitc.for and use the variables carrying the init and iter values as
    the loop's results.
    aniragil authored Jun 10, 2024
    8b7e836

Commits on Sep 5, 2024

  1. 6f11615
  2. 6ae8317