[AutoBump] Merge with 9edd998e (Aug 29) (14) #367

Open
wants to merge 315 commits into
base: bump_to_d4f97da1
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Aug 27, 2024

  1. [lldb] Turn lldb_private::Status into a value type. (llvm#106163)

    This patch removes all of the Set.* methods from Status.
    
    This cleanup is part of a series of patches that make it harder to use
    the anti-pattern of keeping a long-lived Status object around and
    updating it while dropping any errors it contains on the floor.
    
    This patch is largely NFC; the more interesting next steps it enables
    are to:
    1. remove Status.Clear()
    2. assert that Status::operator=() never overwrites an error
    3. remove Status::operator=()
    
    Note that step (2) will bring 90% of the benefits for users, and step
    (3) will dramatically clean up the error handling code in various
    places. In the end my goal is to convert all APIs that are of the form
    
    `ResultTy DoFoo(Status& error)`
    
    to
    
    `llvm::Expected<ResultTy> DoFoo()`
    
    How to read this patch?
    
    The interesting changes are in Status.h and Status.cpp, all other
    changes are mostly
    
    `perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git grep -l SetErrorString lldb/source)`
    
    plus the occasional manual cleanup.
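    
    The API shape change described above can be sketched without LLVM
    headers. In the sketch below, `Status` and `Expected` are simplified
    stand-ins (assumptions for illustration, not the real
    `lldb_private::Status` or `llvm::Expected`), and `ParseDigitOld` and
    `ParseDigit` are hypothetical functions:
    
    ```cpp
    #include <cassert>
    #include <string>
    #include <utility>
    #include <variant>
    
    // Simplified stand-in for lldb_private::Status (assumption): an error
    // message that is empty on success.
    class Status {
    public:
      Status() = default;
      static Status FromErrorString(std::string Msg) {
        Status S;
        S.Message = std::move(Msg);
        return S;
      }
      bool Fail() const { return !Message.empty(); }
    
    private:
      std::string Message;
    };
    
    // Simplified stand-in for llvm::Expected<T> (assumption): a value or an error.
    template <typename T> using Expected = std::variant<T, Status>;
    
    // Old shape: the result is returned and the error travels in an out-parameter.
    int ParseDigitOld(char C, Status &Error) {
      if (C < '0' || C > '9') {
        Error = Status::FromErrorString("not a digit");
        return 0;
      }
      return C - '0';
    }
    
    // New shape: the caller receives either a value or an error, never both.
    Expected<int> ParseDigit(char C) {
      if (C < '0' || C > '9')
        return Status::FromErrorString("not a digit");
      return C - '0';
    }
    ```
    
    With the second shape there is no long-lived error object for a later
    call to overwrite, which is the anti-pattern the series targets.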
    adrian-prantl authored Aug 27, 2024
    Commit: 0642cd7
  2. Commit: d1d8edf
  3. Commit: c349ded
  4. [llvm-exegesis] Switch from intptr_t to uintptr_t in most cases (llvm#102860)
    
    This patch switches most of the uses of intptr_t to uintptr_t within
    llvm-exegesis for the subprocess memory support. In the vast majority of
    cases we do not want a signed component of the address, hence making
    intptr_t undesirable. intptr_t is left for error handling, for example
    when making syscalls and we need to see if the syscall returned -1.
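    
    A minimal sketch of the distinction drawn above, using hypothetical
    helper names (`AlignDownToPage`, `IsSyscallFailure`) that are not taken
    from llvm-exegesis:
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    
    // Addresses are unsigned, so masking never involves a sign bit.
    // Assumes PageSize is a power of two.
    std::uintptr_t AlignDownToPage(std::uintptr_t Addr, std::uintptr_t PageSize) {
      return Addr & ~(PageSize - 1);
    }
    
    // A raw syscall-style result stays signed so that -1 can signal failure.
    bool IsSyscallFailure(std::intptr_t Result) { return Result == -1; }
    ```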
    boomanaiden154 authored Aug 27, 2024
    Commit: fac87b8
  5. [libc++] Add missing include to three_way_comp_ref_type.h

    We were using a `_LIBCPP_ASSERT_FOO` macro without including `<__assert>`.
    
    rdar://134425695
    ldionne committed Aug 27, 2024
    Commit: 0df7812
  6. [AIX][PGO] Handle atexit functions when dlclose'ing shared libraries (llvm#102940)
    
    Problem:
    On AIX, functions registered by atexit in a shared library are not run
    when the library is dlclosed, but instead run (and fail because the
    function pointer is no longer valid) during main program exit.
    
    The profile-rt registers some functions with atexit:
    
     1. writeFileWithoutReturn that writes out the profile file
     2. llvm_delete_reset_function_list that does some cleanup in the gcov 
        instrumentation library (not sure)
    
    And so right now, we get an "Illegal instruction (core dumped)" when an
    instrumented shared object is dlopen'ed and dlclosed.
    
    Solution:
      When a shared library is dlclose'd, destructors from the library are
      called. So create a destructor function that iterates over all known
      functions that profile-rt registers with atexit, and unregister the ones
      that have been registered and execute them.
    
    Scenarios tested:
    (0) gcov dlopen/dlclose                                       (AIX/gcov-dlopen-dlclose.test)
    (1) multiple dlopen/dlclose of the same lib and multiple libs (instrprof-dlopen-dlclose.test)
    (2) dlopen but no dlclose                                     (exists: Posix/instrprof-dlopen.test)
    (3) a simple fork testcase with dlopen/dlclose                (instrprof-dlopen-dlclose.test)
    (4) dlopen/dlclose by multiple threads.                       (instrprof-dlopen-dlclose.test)
    (5) regular dynamic-linking of instrumented shared libs       (exists: AIX/shared-bexpall-pgo.c)
    (6) a simple fork testcase produces correct profile           (instrprof-fork.c)
    
    
    ---------
    
    Co-authored-by: Hubert Tong <hstong@ca.ibm.com>
    w2yehia and Hubert Tong authored Aug 27, 2024
    Commit: 2abed78
  7. [BOLT] Handle internal calls in ValidateInternalCalls (llvm#105736)

    Move handling of all internal calls into the designated pass. Preserve
    NOPs and mark functions as non-simple on non-X86 platforms.
    maksfb authored Aug 27, 2024
    Commit: abd69b3
  8. [SandboxIR] Implement VAArgInst (llvm#106247)

    This patch implements sandboxir::VAArgInst mirroring llvm::VAArgInst.
    vporpo authored Aug 27, 2024
    Commit: ff81f9f
  9. Commit: 155e3aa
  10. [InstCombine] Simplify (add/sub (sub/add) (sub/add)) irrelevant of use-count
    
    Added folds:
        - `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
        - `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`
    
    The fold is typically handled in the `Reassociate` pass, but it fails
    if the inner `sub`/`add` are multi-use. Less importantly, Reassociate
    doesn't propagate flags correctly.
    
    This patch adds the fold explicitly to InstCombine.
    
    Proofs: https://alive2.llvm.org/ce/z/p6JyRP
    
    Closes llvm#105866
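    
    The two identities behind the added folds can be checked with wrapping
    unsigned arithmetic. This is only an arithmetic sketch with hypothetical
    function names. It does not model IR flags such as nsw/nuw, and the
    linked Alive2 proofs remain the authoritative check:
    
    ```cpp
    #include <cassert>
    
    // (add (sub X, Y), (sub Z, X)) as written before the fold.
    constexpr unsigned AddOfSubs(unsigned X, unsigned Y, unsigned Z) {
      return (X - Y) + (Z - X);
    }
    
    // Folded form: (sub Z, Y). X cancels out.
    constexpr unsigned AddOfSubsFolded(unsigned Y, unsigned Z) { return Z - Y; }
    
    // (sub (add X, Y), (add X, Z)) as written before the fold.
    constexpr unsigned SubOfAdds(unsigned X, unsigned Y, unsigned Z) {
      return (X + Y) - (X + Z);
    }
    
    // Folded form: (sub Y, Z). X cancels out.
    constexpr unsigned SubOfAddsFolded(unsigned Y, unsigned Z) { return Y - Z; }
    
    // Unsigned arithmetic wraps, so the identities also hold across overflow.
    static_assert(AddOfSubs(5u, 9u, 2u) == AddOfSubsFolded(9u, 2u), "fold 1");
    static_assert(SubOfAdds(~0u, 3u, 7u) == SubOfAddsFolded(3u, 7u), "fold 2");
    ```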
    goldsteinn committed Aug 27, 2024
    Commit: a6edcea
  11. [TypeProf][ICP]Allow vtable-comparison as long as vtable count is comparable with function count for each candidate (llvm#106260)
    
    The current cost-benefit analysis between vtable comparison and function
    comparison requires the indirect fallback branch to be cold. This is too
    conservative.
    
    This change allows vtable-comparison as long as vtable count is
    comparable with function count for each function candidate and removes
    the cold indirect fallback requirement.
    
    Tested:
    1. Testing this on benchmarks uplifts the measurable performance wins.
    Counting the (possibly duplicated) remarks (because of linkonce_odr
    functions and cross-module import of functions) shows the number of
    vtable remarks increases from roughly 30k to roughly 50k.
    2. https://gcc.godbolt.org/z/sbGK7Pacn shows vtable-comparison doesn't
    happen today (using the same IR input)
    minglotus-6 authored Aug 27, 2024
    Commit: 511500e
  12. [SampleFDO][NFC] Refactoring sample reader to support on-demand read profiles for given functions (llvm#104654)
    
    Currently, in the extended binary format, the sample reader only reads
    the profiles of functions that are in the current module at
    initialization time. This patch extends the support to read arbitrary
    profiles for given input functions at a later stage. It's used for
    llvm#101053.
    wlei-llvm authored Aug 27, 2024
    Commit: 23144e8
  13. Commit: b2dd840
  14. [MachO] Give the CPUSubTypeARM64 enum uint32_t type. NFCI.

    We recently added various CPU_SUBTYPE_ARM64E values, notably including
    CPU_SUBTYPE_ARM64E_VERSIONED_PTRAUTH_ABI_MASK, which is 0x80000000U.
    The enum is better off as a uint32_t to accommodate that.
    
    This also hopefully helps silence GCC warnings reported on a ternary in
    CPU_SUBTYPE_ARM64E_WITH_PTRAUTH_VERSION.
    
    The subtype is already generally treated as a uint32_t elsewhere, so
    while there, change the new helpers to explicitly pass/return the
    subtype as uint32_t, and the individual narrower components as either
    bool or unsigned.
    ahmedbougacha committed Aug 27, 2024
    Commit: 1d7bb2b
  15. [X86] Check if there is stack access in the spilled FP/BP range (llvm#106035)
    
    In the clobbered FP/BP range, we can't use it as a normal FP/BP to access
    the stack. So if there are stack accesses due to register spills,
    scheduling, or other back-end optimizations, we should report an error
    instead of silently generating wrong code.
    
    Also try to minimize the save/restore range of the clobbered FP/BP if
    the FrameSetup doesn't change stack size.
    weiguozhi authored Aug 27, 2024
    Commit: edbd9d1
  16. [SLP] Support vectorizing 2^N-1 reductions (llvm#106266)

    Build on the -slp-vectorize-non-power-of-2 experimental option, and
    support vectorizing reductions with 2^N-1 sized vector.
    
    Specifically, two related changes:
    1) When searching for a profitable VL, start with the 2^N-1 reduction
       width. If the cost model does not select that VL, return to power of
       two boundaries when halving the search VL. The latter is mostly for
       simplicity.
    2) Reduce the minimum reduction width from 4 to 3 when supporting
       non-power of two vectors. This is required to support <3 x Ty> cases.
    
    One thing which isn't directly related to this change, but which I want
    to note for clarity, is that the non-power-of-two vectorization appears
    to be sensitive to the operand order of the reduction. I haven't yet
    fully figured out why, but I suspect this is non-power-of-two specific.
    preames authored Aug 27, 2024
    Commit: ed03070
  17. Commit: 5e64520
  18. Commit: 6a74b0e
  19. Commit: b24ffa6
  20. [ICP] Fix warnings

    This patch fixes:
    
      llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp:845:12:
      error: variable 'RemainingVTableCount' set but not used
      [-Werror,-Wunused-but-set-variable]
    
      llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp:306:23:
      error: private field 'PSI' is not used
      [-Werror,-Wunused-private-field]
    
    Here are a couple of domino effects:
    
    - Once I remove PSI, I need to update the constructor and its caller.
    
    - Once I remove RemainingVTableCount, I don't need TotalCount, so I am
      updating the caller as well.
    kazutakahirata committed Aug 27, 2024
    Commit: 2bdc0da
  21. [LTO] Introduce a helper lambda in gatherImportedSummariesForModule (NFC) (llvm#106251)
    
    This patch forward ports the heterogeneous std::map::operator[]() from
    C++26 so that we can look up the map without allocating an instance of
    std::string when the key-value pair exists in the map.
    
    The background is as follows.  I'm planning to reduce the memory
    footprint of ThinLTO indexing by changing ImportMapTy, the data
    structure used for an import list.  The new list will be a hash set of
    tuples (SourceModule, GUID, ImportType) represented in a space
    efficient manner.  That means that as we iterate over the hash set, we
    encounter SourceModule as many times as GUID.  We don't want to create
    a temporary instance of std::string every time we look up
    ModuleToSummariesForIndex like:
    
    auto &SummariesForIndex =
    ModuleToSummariesForIndex[std::string(ILI.first)];
    
    This patch removes the need to create the temporaries by enabling
    heterogeneous lookup with std::map<K, V, std::less<>> and forward
    porting std::map::operator[]() from C++26.
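    
    A minimal sketch of the lookup pattern described above, assuming a
    `std::string_view` key and hypothetical names (`SummaryMap`,
    `LookupOrInsert`). The actual helper lambda in the patch may differ:
    
    ```cpp
    #include <cassert>
    #include <functional>
    #include <map>
    #include <string>
    #include <string_view>
    #include <vector>
    
    // std::less<> is a transparent comparator, which enables heterogeneous
    // lookup: find() can compare stored std::string keys against a
    // std::string_view directly.
    using SummaryMap = std::map<std::string, std::vector<int>, std::less<>>;
    
    // Behaves like operator[] but only builds a std::string when the key is
    // missing and has to be inserted.
    std::vector<int> &LookupOrInsert(SummaryMap &Map, std::string_view Key) {
      auto It = Map.find(Key); // no temporary std::string on this path
      if (It == Map.end())
        It = Map.try_emplace(std::string(Key)).first;
      return It->second;
    }
    ```
    
    Repeated lookups of the same module name then cost one allocation in
    total, on first insertion, rather than one per lookup.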
    kazutakahirata authored Aug 27, 2024
    Commit: 29bb523
  22. [DataLayout] Change return type of getStackAlignment to MaybeAlign (llvm#105478)
    
    Currently, `getStackAlignment` asserts if the stack alignment wasn't
    specified. This makes it inconvenient to use and complicates testing.
    
    This change also makes `exceedsNaturalStackAlignment` method redundant.
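    
    The effect on callers can be sketched with `std::optional` standing in
    for `MaybeAlign`. `ExampleLayout` and `ExceedsStackAlignment` are
    hypothetical names for illustration, not the real `DataLayout` API:
    
    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <optional>
    
    // Hypothetical layout type: the stack alignment may be unspecified.
    struct ExampleLayout {
      std::optional<std::uint64_t> StackAlign;
    
      // Returns an empty optional instead of asserting when unspecified.
      std::optional<std::uint64_t> getStackAlignment() const { return StackAlign; }
    };
    
    // With an optional return value, the "exceeds" check becomes a one-liner
    // at the call site, so a dedicated helper method is no longer needed.
    bool ExceedsStackAlignment(const ExampleLayout &DL, std::uint64_t Requested) {
      std::optional<std::uint64_t> SA = DL.getStackAlignment();
      return SA && Requested > *SA;
    }
    ```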
    s-barannikov authored Aug 27, 2024
    Commit: 4d7a0ab
  23. [AMDGPU] adjust tests to prevent fpclass bitcast folding (llvm#106268)

    Make some minor tweaks to AMDGPU tests to ensure they still work as
    intended after llvm#97762. These
    tests can be radically simplified after bitcast aware fpclass deduction.
    AlexMaclean authored Aug 27, 2024
    Commit: 4c4908c
  24. [SLP] Remove -slp-optimize-identity-hor-reduction-ops option (llvm#106238)
    
    This code has been unchanged for two years; let's simplify the code
    and remove configurability which makes the code harder to follow.
    preames authored Aug 27, 2024
    Commit: ee764a2
  25. [libc++] Disallow character types being index types of extents (llvm#105832)
    
    llvm#78086 provided the trait we want to use for this: `__libcpp_integer`.
    
    In some `libcxx/containers/views/mdspan` tests, improper uses of `char` 
    are replaced with `signed char`. 
    
    Fixes llvm#73715
    frederick-vs-ja authored Aug 27, 2024
    Commit: 74e70ba
  26. Commit: 2a3d735
  27. Commit: bcb6e27
  28. Commit: fc51797
  29. [libc++] Deprecate and remove std::uncaught_exception (llvm#101830)

    Works towards P0619R4/llvm#99985.
    
    - std::uncaught_exception was not previously deprecated. This patch
      deprecates it since C++17 as per N4259. std::uncaught_exceptions is
      used instead as libc++ unconditionally provides this function.
    
    - _LIBCPP_ENABLE_CXX20_REMOVED_UNCAUGHT_EXCEPTION restores
      std::uncaught_exception.
    
    - As a drive-by, this patch updates the C++20 status page to 
      explain that D.11 is already done, since it was done in 
      578d09c.
    frederick-vs-ja authored Aug 27, 2024
    Commit: 4ea2c73
  30. [Headers][X86] Add a test for MMX/SSE intrinsics (llvm#105852)

    Certain intrinsics map to builtins that require an immediate (literal)
    argument; make sure we report non-literal arguments.
    
    This has been kicking around downstream for a while, and the recent
    removal of the MMX builtins caused me to notice it again.
    pogo59 authored Aug 27, 2024
    Commit: b6b6482
  31. [Clang] Support initializing structured bindings from an array with direct-list-initialization (llvm#102581)
    
    When initializing structured bindings from an array with
    direct-list-initialization, array copy will be performed, which is a
    special case not following list-initialization.
    
    This PR adds support for this case.
    
    Fixes llvm#31813.
    zwuis authored Aug 27, 2024
    Commit: 377257f
  32. [SandboxIR][NFC] Create a DEF_CONST() macro in SandboxIRValues.def (llvm#106269)
    
    This helps with Constant::classof().
    vporpo authored Aug 27, 2024
    Commit: 751e681
  33. Revert "[LLDB][SBSaveCore] Add selectable memory regions to SBSaveCor… (llvm#106293)
    
    Reverts llvm#105442 because of `TestSkinnyCoreFailing`; root-causing the
    failure will likely take longer than EOD.
    Jlalond authored Aug 27, 2024
    Commit: b959532
  34. [libc++] Move some macOS CI jobs to Github actions (llvm#89083)

    This patch decouples macOS CI testing from BuildKite, which makes the
    maintenance of macOS CI easier and more accessible to all contributors.
    Right now, the macOS CI is running entirely on machines owned by the
    LLVM Foundation with only a small set of contributors having direct
    access to them. In particular, updating these machines is currently
    a very time-consuming manual process that requires taking the machines
    offline, and using Github-provided instances makes that an order of
    magnitude easier.
    
    The story for performing back-deployment testing still needs to be
    figured out, so for now we are retaining some jobs under BuildKite.
    ldionne authored Aug 27, 2024
    Commit: e19c3a7
  35. [MachineOutliner][NFC] Refactor (llvm#105398)

    This patch prepares the NFC groundwork for global outlining using
    CGData, which will follow
    llvm#90074.
    
    - The `MinRepeats` parameter is now explicitly passed to the
    `getOutliningCandidateInfo` function, rather than relying on a default
    value of 2. For local outlining, the minimum number of repetitions is
    typically 2, but for the global outlining (mentioned above), we will
    optimistically create a single `Candidate` for each `OutlinedFunction`
    if stable hashes match a specific code sequence. This parameter is
    adjusted accordingly in global outlining scenarios.
    - I have also implemented `unique_ptr` for `OutlinedFunction` to ensure
    safe and efficient memory management within `FunctionList`, avoiding
    unnecessary implicit copies.
    
    This depends on llvm#101461.
    This is a patch for
    https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
    kyulee-com authored Aug 27, 2024
    Commit: 93b8d07
  36. Commit: ff2baf0
  37. [libc++] Do not redeclare lgamma_r when targeting the LLVM C library (llvm#102036)
    
    We use lgamma_r for the random normal distribution support. In this
    code we redeclare it, which causes issues with the LLVM C library as
    this function is marked noexcept in LLVM libc. This patch ensures that
    we don't redeclare that function when targeting LLVM libc.
    jhuber6 authored Aug 27, 2024
    Commit: 5f2389d
  38. Commit: 67eb727
  39. [lldb] Don't scan more than 10MB of assembly insns (llvm#105890)

    For supported architectures, lldb will do a static scan of the assembly
    instructions of a function to detect stack/frame pointer changes,
    register stores and loads, so we can retrieve register values for the
    caller stack frames. We trust that the function address range reflects
    the actual function range, but in a stripped binary or other unusual
    environment, we can end up scanning all of the text as a single
    "function" which is (1) incorrect and useless, but more importantly (2)
    slow.
    
    Cap the max size we will profile to 10MB of instructions. There will
    surely be functions longer than this with no unwind info, and we will
    miss the final epilogue or mid-function epilogues past the first 10MB,
    but I think this will be unusual, and the failure mode of missing the
    epilogue is that the user will need to step out an extra time or two as
    the StackID is not correctly calculated mid-epilogue. I think this is a
    good tradeoff of behaviors.
    
    rdar://134391577
    jasonmolenda authored Aug 27, 2024
    Commit: 3280292
  40. Commit: f2f78b2
  41. [StableHash] Implement stable global name for the hash computation (llvm#106156)
    
    LLVM often extends global names by adding suffixes to distinguish unique
    identities. However, these suffixes are not always stable across
    different runs and build environments. To address this issue, I
    implemented `get_stable_name` to ignore such suffixes and obtain the
    original name. This approach is not new, as PGO or Bolt already handle
    this issue similarly. Using the stable name obtained from
    `get_stable_name`, I implemented `stable_hash_name` while utilizing the
    same underlying `xxh3_64bit` algorithm as before.
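    
    The suffix-stripping idea can be sketched as follows. The markers
    `.llvm.` and `.__uniq.` are assumptions chosen for illustration, since
    the message does not list the suffixes that `get_stable_name` ignores.
    `StripFrom` and `ExampleStableName` are hypothetical names, and the
    hashing step is omitted:
    
    ```cpp
    #include <cassert>
    #include <string_view>
    
    // Drops everything from the last occurrence of Marker onward, if present.
    std::string_view StripFrom(std::string_view Name, std::string_view Marker) {
      const auto Pos = Name.rfind(Marker);
      return Pos == std::string_view::npos ? Name : Name.substr(0, Pos);
    }
    
    // Recovers a base name by ignoring the assumed unstable suffixes.
    std::string_view ExampleStableName(std::string_view Name) {
      return StripFrom(StripFrom(Name, ".llvm."), ".__uniq.");
    }
    ```
    
    Hashing the stripped name instead of the full one means two builds that
    differ only in the generated suffix produce the same hash input.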
    kyulee-com authored Aug 27, 2024
    Commit: f9ad249
  42. [profile] Move test into Posix

    Fix Windows after llvm#102940
    vitalybuka committed Aug 27, 2024
    Commit: c2cac7e
  43. Commit: d48b0f8
  44. Commit: 47667ee
  45. [compiler-rt] Fix definition of usize on 32-bit Windows

    32-bit Windows uses `unsigned int` for uintptr_t and size_t.
    Commit 18e06e3 changed uptr to
    unsigned long, so it no longer matches the real size_t/uintptr_t and
    therefore the current definition of usize results in:
    `error C2821: first formal parameter to 'operator new' must be 'size_t'`
    
    However, the real problem is that uptr is wrong to work around the fact
    that we have local SIZE_T and SSIZE_T typedefs that trample on the
    basetsd.h definitions of the same name and therefore need to match
    exactly. Unlike size_t/ssize_t the uppercase ones always use unsigned
    long (even on 32-bit).
    
    This commit works around the build breakage by keeping the existing
    definitions of uptr/sptr and just changing usize. A follow-up change
    will attempt to fix this properly.
    
    Fixes: llvm#101998
    
    Reviewed By: mstorsjo
    
    Pull Request: llvm#106151
    arichardson authored Aug 27, 2024
    Commit: bb27dd8
  46. [ctx_prof] Move the "from json" logic more centrally to reuse it from test. (llvm#106129)
    
    Making the synthesis of a contextual profile file from a JSON descriptor more reusable, for unittest authoring purposes.
    
    The functionality round-trips through the binary format - no reason, currently, to support other ways of loading contextual profiles.
    mtrofin authored Aug 27, 2024
    Commit: 1022323
  47. Commit: de687ea
  48. [mlir][gpu] Add metadata attributes for storing kernel metadata in GPU objects (llvm#95292)
    
    This patch adds the `#gpu.kernel_metadata` and `#gpu.kernel_table`
    attributes. The `#gpu.kernel_metadata` attribute allows storing metadata
    related to a compiled kernel, for example, the number of scalar
    registers used by the kernel. The attribute only has 2 required
    parameters, the name and function type. It also has 2 optional
    parameters, the arguments attributes and generic dictionary for storing
    all other metadata.
    
    The `#gpu.kernel_table` stores a table of `#gpu.kernel_metadata`,
    mapping the name of the kernel to the metadata.
    
    Finally, the function `ROCDL::getAMDHSAKernelsELFMetadata` was added to
    collect ELF metadata from a binary, and to test the class methods in
    both attributes.
    
    Example:
    ```mlir
    gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, kernels = #gpu.kernel_table<[
        #gpu.kernel_metadata<"kernel0", (i32) -> (), metadata = {sgpr_count = 255}>,
        #gpu.kernel_metadata<"kernel1", (i32, f32) -> (), arg_attrs = [{llvm.read_only}, {}]>
      ]> , bin = "BLOB">]
    
    ```
    The motivation behind these attributes is to provide useful information
    for things like tuning.
    
    ---------
    
    Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
    fabianmcg and joker-eph authored Aug 27, 2024
    Commit: 016e1eb
  49. [ctx_prof] Add support for ICP (llvm#105469)

    An overload of `llvm::promoteCallWithIfThenElse` that updates the contextual profile.
    
    High-level, this is very simple: after creating the `if... then (direct call) else (indirect call)` structure, we instrument the new callsites and BBs (the instrumentation will help with tracking for other IPO transformations, and, ultimately, to match counter values before flattening to `MD_prof`).
    
    In more detail:
    
    - move the callsite instrumentation of the indirect call to the `else` BB, before the indirect call
    - create a new callsite instrumentation for the direct call
    - create instrumentation for both the `then` and `else` BBs - we could instrument just one (MST-style) but we're not running the binary with this instrumentation, and at most this would save some space (fewer counters tracked). For simplicity, we instrument both at this point
    - update each context belonging to the caller by updating the counters, and moving the indirect callee to the new, direct callsite ID
    
    Issue llvm#89287
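    
    At source level, the `if... then (direct call) else (indirect call)`
    structure looks roughly like the sketch below. The function names are
    hypothetical, and the two counters are only a stand-in for the `then`
    and `else` BB instrumentation described above, not the actual
    contextual profiling intrinsics:
    
    ```cpp
    #include <cassert>
    
    using Fn = int (*)(int);
    
    int LikelyTarget(int X) { return X + 1; }
    int OtherTarget(int X) { return X * 2; }
    
    // Before promotion: a plain indirect call.
    int CallIndirect(Fn Callee, int X) { return Callee(X); }
    
    // After promotion: compare against the likely target, call it directly on
    // a match, and fall back to the indirect call otherwise.
    int CallPromoted(Fn Callee, int X, unsigned &ThenCount, unsigned &ElseCount) {
      if (Callee == &LikelyTarget) {
        ++ThenCount; // stand-in for the `then` BB counter
        return LikelyTarget(X);
      }
      ++ElseCount; // stand-in for the `else` BB counter
      return Callee(X);
    }
    ```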
    mtrofin authored Aug 27, 2024
    Commit: 73c3b73
  50. Commit: d22bee1
  51. [llvm/llvm-project][Coroutines] Improve debugging and minor refactoring (llvm#104642)
    
    No Functional Changes
    
    * Fix comments in several places
    * Instead of using BB->getName() (in dump methods) use
    getBasicBlockLabel. This fixes the poor output of the dumped info that
    resulted in missing BB labels.
    * Use RPO when dumping SuspendCrossingInfo. Without this the dump order
    is determined by the ptr addresses and so is not consistent from run to
    run making IR diffs difficult to read.
    * Inference -> Interference
    * Pull the logic that determines insertion location out of insertSpills
    and into getSpillInsertionPt, to differentiate between these two
    operations.
    * Use Shape getters for CoroId instead of getting it manually.
    
    ---------
    
    Co-authored-by: tnowicki <tnowicki.nowicki@amd.com>
    TylerNowicki and tnowicki authored Aug 27, 2024
    Commit: 51aceb5
  52. [mlir][GPU] Fix docs modified by llvm#94910 (llvm#106295)

    Fix docs modified by llvm#94910 by adding information about the `module`
    argument in `gpu::TargetAttrInterface::createObject`.
    
    ---------
    
    Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
    fabianmcg and joker-eph authored Aug 27, 2024
    Commit: aaed557
  53. Commit: 91e09c3
  54. [lldb][ClangExpressionParser] Remove duplicate construction of ExternalASTSourceWrapper
    
    This is an oversight from llvm#104817 where the intention
    was to hoist the ExternalASTSourceWrapper construction out of the
    conditional so it can be set on both the `SemaSourceWithPriorities` and
    be added as an external source to Sema. But the inner
    `ExternalASTSourceWrapper` allocation wasn't actually removed.
    
    This currently all works fine because all these AST sources are
    refcounted and point to the same underlying AST sources. But this
    patch cleans this up regardless.
    Michael137 committed Aug 27, 2024
    Commit: 0b1c8fd

Commits on Aug 28, 2024

  1. Commit: 6ae657b
  2. Commit: 32abe5d
  3. [RISCV][MCP] Remove redundant move from tail duplication (llvm#89865)

    Tail duplication generates a redundant move before the return, because
    MachineCopyPropagation can't recognize a COPY after post-RA
    pseudoExpand.
    
    This patch makes MachineCopyPropagation recognize `%0 = ADDI %1, 0` as a
    COPY.
    BeMg authored Aug 28, 2024
    Commit: 2def1c4
  4. [flang][cuda] Add missing dependency (llvm#106298)

    Add a missing dependency that sometimes makes the build fail with ninja.
    clementval authored Aug 28, 2024
    f215447
  5. [flang][cuda] Use declare op results instead of memref (llvm#106287)

    llvm#106120 simplified the data transfer when possible by using the reference
    and a shape. This bypasses the declare op. In order to keep the declare op
    around, use the second result of the declare op, which achieves the same.
    clementval authored Aug 28, 2024
    ccbee71
  6. 1601879
  7. 82db08e
  8. c1a4896
  9. [orc] Fix asan error in RTDyldObjectLinkingLayer.cpp (llvm#106300)

    `JITDylibSearchOrderResolver` local variable can be destroyed before
    completion of all callbacks. Capture it together with `Deps` in
    `OnEmitted` callback.
    
    Original error:
    
    ```
    ==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978
    READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm)
        #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58
        #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25
        #2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5
        #3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12
        #4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr,
    ```
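    
    The failure mode above can be sketched without any ORC types: a callback that captures a local by reference dangles once the enclosing function returns, whereas capturing shared ownership keeps the state alive until the last callback has run. `Resolver` and `makeOnEmitted` are hypothetical names, and the use of `std::shared_ptr` is an assumption for illustration, not necessarily what the patch does:
    
    ```cpp
    #include <functional>
    #include <memory>
    #include <string>
    
    // Hypothetical stand-in for the resolver object used by the callbacks.
    struct Resolver {
      std::string name;
    };
    
    // Builds a callback that may run after this function has returned.
    std::function<std::string()> makeOnEmitted() {
      auto resolver = std::make_shared<Resolver>(Resolver{"main-dylib"});
      // Capturing `resolver` by value copies the shared_ptr, so the Resolver
      // outlives this stack frame. Capturing a plain local by reference here
      // would leave the callback with a dangling reference.
      return [resolver] { return resolver->name; };
    }
    ```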
    ezhulenev authored Aug 28, 2024
    c0ebc18
  10. [mlir][Linalg] Fix match convolution message (llvm#106197)

    Fix the message part of bugfix commit `2ef3dcf`.
    yifeizh2 authored Aug 28, 2024
    bacf312
  11. Revert "[Clang] [Test] Use lit Syntax for Environment Variables in Clang subproject" (llvm#106267)
    
    Reverts llvm#102647
    
    I am reverting this change because the `readfile` doesn't actually
    perform any useful operation, and yet, for some reason, the test still
    passed. This indicates that the modification was unnecessary and could
    lead to confusion or unexpected behavior in the future.
    Harini0924 authored Aug 28, 2024
    815bf0f
  12. 656d5aa
  13. [clang-format] Insert a space between new/delete and a C-style cast (llvm#106175)
    
    It doesn't make sense to remove the space between new/delete and a
    C-style cast when SpaceBeforeParensOptions.AfterPlacementOperator is set
    to false.
    
    Fixes llvm#105628.
    owenca authored Aug 28, 2024
    fac7e87
  14. [Release] Add keith to valid archive uploaders (llvm#106018)

    I am interested in helping contribute macOS binaries since we're
    generally sporadic with uploading these.
    
    Fixes llvm#106016
    keith authored Aug 28, 2024
    d2b420c
  15. b1d1c33
  16. [ORC] Generalize loadRelocatableObject to loadLinkableFile, add archive support.
    
    This allows us to rewrite part of StaticLibraryDefinitionGenerator in terms of
    loadLinkableFile.
    
    It's also useful for clients who may not know (either from file extensions or
    context) whether a given path will be an object file, an archive, or a
    universal binary.
    
    rdar://134638070
    lhames committed Aug 28, 2024
    7a4013f
  17. [mlir] Add option to control the emissionKind to DIScopeForLLVMFuncOp pass (llvm#106229)
    
    This is currently not controllable by the user and always set to
    `DIEmissionKind::LineTablesOnly`.
    The added option allows setting it to the other values accepted by LLVM
    (`None`, `Full`, and `DebugDirectivesOnly`).
    
    ---------
    
    Co-authored-by: jingzec <jingzec@nvidia.com>
    Observer007 and jingzec authored Aug 28, 2024
    097138f
  18. [lldb] unique_ptr-ify some GetUserExpression APIs. (llvm#106034)

    These methods already returned a uniquely owned object, this just makes
    them self-documenting.
    lhames authored Aug 28, 2024
    3c5ab5a
  19. Revert "[lldb] unique_ptr-ify some GetUserExpression APIs. (llvm#106034)"
    
    This reverts commit 3c5ab5a while I investigate
    bot failures (e.g. https://lab.llvm.org/buildbot/#/builders/163/builds/4286).
    lhames committed Aug 28, 2024
    e6cbea1
  20. [AArch64] Fix buildbot breakage of ubsan

    Fix the ERROR: UndefinedBehaviorSanitizer, reproduced by
      BUILDBOT_REVISION=43ffe2eed llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_bootstrap_ubsan.sh
    It might also be related to llvm#76202
    vfdff committed Aug 28, 2024
    8067b88
  21. [AArch64] Fold more load.x into load.i with large offset

    The list of load.x instructions follows canFoldIntoAddrMode from D152828.
    Also support LDRSroX, which was missed in canFoldIntoAddrMode.
    vfdff committed Aug 28, 2024
    e5a5ac0
  22. [analyzer] Detect leaks of stack addresses via output params, indirect globals 3/3 (llvm#105648)
    
    Fix some false negatives of StackAddrEscapeChecker:
    - Output parameters
      ```
      void top(int **out) {
        int local = 42;
        *out = &local; // Noncompliant
      }
      ```
    - Indirect global pointers
      ```
      int **global;
    
      void top() {
        int local = 42;
        *global = &local; // Noncompliant
      }
      ```
    
    Note that now StackAddrEscapeChecker produces a diagnostic if a function
    with an output parameter is analyzed as top-level or as a callee. I took
    special care to make sure the reports point to the same primary location
    and, in many cases, feature the same primary message. That is the
    motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp.
    
    To avoid false positive reports when a global indirect pointer is
    assigned a local address, invalidated, and then reset, I rely on the
    fact that the invalidation symbol will be a DerivedSymbol of a
    ConjuredSymbol that refers to the same memory region.
    
    The checker still has a false negative for non-trivial escaping via a
    returned value. It requires a more sophisticated traversal akin to
    scanReachableSymbols, which is out of the scope of this change.
    
    CPP-4734
    
    ---------
    
    This is the last of the 3 stacked PRs, it must not be merged before
    llvm#105652 and
    llvm#105653
    necto authored Aug 28, 2024
    190449a
  23. 3dbb6be
  24. [llvm-cxxfilt][macOS] Don't strip underscores on macOS by default (llvm#106233)
    
    Currently, `llvm-cxxfilt` will strip the leading underscore of its input
    on macOS. Historically MachO symbols were prefixed with an extra
    underscore and this is why this default exists. However, nowadays, the
    `ItaniumDemangler` supports all of the following mangling prefixes:
    `_Z`, `__Z`, `___Z`, `____Z`. So really `llvm-cxxfilt` can simply
    forward the mangled name to the demangler and let the library decide
    whether it's a valid encoding.
    
    Compiling C++ on macOS nowadays will generate symbols with `_Z` and
    `___Z` prefixes. So users trying to demangle these symbols will have to
    know that they need to pass the `-n` option. This routinely catches
    people off-guard.
    
    This patch removes the `-n` default for macOS and allows calling into
    the `ItaniumDemangler` with all the `_Z` prefixes that the demangler
    supports (1-4 underscores).
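    
    The prefix rule described above can be approximated in a few lines. This is only a sketch of the "1-4 underscores followed by `Z`" check, not the demangler's implementation, and `hasItaniumPrefix` is a hypothetical name:
    
    ```cpp
    #include <cstddef>
    #include <string_view>
    
    // Hypothetical helper: true if `name` starts with 1 to 4 underscores
    // followed by 'Z', i.e. one of _Z, __Z, ___Z, ____Z.
    bool hasItaniumPrefix(std::string_view name) {
      std::size_t underscores = 0;
      while (underscores < name.size() && name[underscores] == '_')
        ++underscores;
      return underscores >= 1 && underscores <= 4 &&
             underscores < name.size() && name[underscores] == 'Z';
    }
    ```
    
    A name that passes this check can be handed to the demangler unchanged, which is why stripping an underscore up front is no longer needed.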
    
    rdar://132714940
    Michael137 authored Aug 28, 2024
    0b554dd
  25. [LoongArch] Format LoongArchL{A}SXInstrInfo.td. NFC

    Alignment and start with an upper-case letter.
    wangleiat committed Aug 28, 2024
    175aa86
  26. 6332c36
  27. [llvm] Prefer StringRef::substr to StringRef::slice (NFC) (llvm#106330)

    S.substr(N) is simpler than S.slice(N, StringRef::npos). Also, substr
    is probably more recognizable than slice thanks to
    std::string_view::substr.
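    
    To keep the comparison self-contained, the sketch below uses `std::string_view` instead of `StringRef`. `sliceLike` is a hypothetical stand-in for a `slice(Start, End)` operation with clamped bounds; `tailViaSlice` and `tailViaSubstr` are illustrative names as well:
    
    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <string_view>
    
    // Hypothetical stand-in for a slice(Start, End) operation on a string view:
    // returns the characters in [start, end), clamping both bounds to the size.
    std::string_view sliceLike(std::string_view s, std::size_t start,
                               std::size_t end) {
      start = std::min(start, s.size());
      end = std::clamp(end, start, s.size());
      return s.substr(start, end - start);
    }
    
    // "Everything from index n onward", written both ways.
    std::string_view tailViaSlice(std::string_view s, std::size_t n) {
      return sliceLike(s, n, std::string_view::npos);
    }
    
    std::string_view tailViaSubstr(std::string_view s, std::size_t n) {
      return s.substr(n);  // throws std::out_of_range if n > s.size()
    }
    ```
    
    Both functions return the same tail for in-range `n`; the `substr` form simply says so with one argument.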
    kazutakahirata authored Aug 28, 2024
    22e55ba
  28. [libc++][math] Provide overloads for cv-unqualified floating point types for `std::isnormal` (llvm#104773)
    
    ## Why
    Currently, the following does not work when compiled with clang:
    
    ```c++
    #include <cmath>
    
    struct ConvertibleToFloat {
        operator float();
    };
    
    bool test(ConvertibleToFloat x) {
        return std::isnormal(x);
    }
    ```
    See https://godbolt.org/z/5bos8v67T for differences with respect to
    msvc, gcc or icx. It fails for `float`, `double` and `long double` (all
    cv-unqualified floating-point types).
    
    ## What
    Test and provide overloads as expected by the ISO C++ standard. The
    classification/comparison function `isnormal` is defined since C++11
    until C++23 as
    ```c++
    bool isnormal( float num );
    bool isnormal( double num );
    bool isnormal( long double num );
    ```
    and since C++23 as
    ```c++
    constexpr bool isnormal( /* floating-point-type */ num );
    ```
    for which "the library provides overloads for all cv-unqualified
    floating-point types as the type of the parameter num". See §28.7.1/1 in
    the [ISO C++
    standard](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf)
    or check
    [cppreference](https://en.cppreference.com/w/cpp/numeric/math/isnormal).
    robincaloudis authored Aug 28, 2024
    866bec7
  29. [clang] Add lifetimebound attr to std::span/std::string_view constructor (llvm#103716)
    
    With this patch, clang now automatically adds
    `[[clang::lifetimebound]]` to the parameters of the `std::span` and
    `std::string_view` constructors. This enables Clang to catch more cases
    where the returned reference outlives the object.
    
    
    Fixes llvm#100567
    hokein authored Aug 28, 2024
    902b2a2
  30. [Coroutines] Salvage the debug information for coroutine frames within optimizations
    
    This patch tries to salvage the debug information for coroutine
    frames under optimization by creating the helper alloca variables
    in optimized builds too. We didn't do this when I initially implemented
    it. As far as I remember, the reason was that we felt the additional
    helper alloca variables might pessimize performance, which is almost
    the most important thing under optimization. But it now looks like the
    newly inserted helper alloca variables can be optimized out by the
    subsequent optimizations, so it seems like the right time to enable
    this under optimization.
    
    It also looks like the subsequent optimizations will convert the
    generated dbg.declare intrinsic into a dbg.value intrinsic.
    
    In LLVM's tests, there is a slight regression: a dbg.declare for the
    promise object is no longer preserved after this change. But it looks
    like we won't have a chance to see a dbg.declare for the promise object
    when we split the coroutine anyway, as that dbg.declare will be
    converted into a dbg.value at an early stage.
    
    So everything looks fine.
    ChuanqiXu9 committed Aug 28, 2024
    07514fa
  31. [analyzer] Fix false positive for mutexes inheriting mutex_base (llvm#106240)
    
    If a mutex interface is split across an inheritance chain, e.g. struct mutex
    has `unlock` and inherits `lock` from __mutex_base, then calls to m.lock()
    and m.unlock() have different "this" targets: m and the __mutex_base of
    m, which used to confuse the `ActiveCritSections` list.
    
    Taking the base region canonicalizes the region used to identify a critical
    section and enables searching the ActiveCritSections list regardless of
    which class the callee is a member of.
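    
    The code shape that triggered the false positive looks roughly like the sketch below. `MutexBase`, `Mutex`, and `lockThenUnlock` are hypothetical stand-ins for the library types named above, kept minimal so the split interface is easy to see:
    
    ```cpp
    // Hypothetical stand-ins for a mutex whose interface is split across
    // an inheritance chain.
    struct MutexBase {
      void lock() { locked = true; }  // `this` is the MutexBase subobject
      bool locked = false;
    };
    
    struct Mutex : MutexBase {
      void unlock() { locked = false; }  // `this` is the whole Mutex object
    };
    
    // One critical section: lock() comes from the base class, unlock() from
    // the derived class, yet both operate on the same mutex.
    bool lockThenUnlock() {
      Mutex m;
      m.lock();
      const bool wasLocked = m.locked;
      m.unlock();
      return wasLocked && !m.locked;
    }
    ```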
    
    This likely fixes llvm#104241
    
    CPP-5541
    necto authored Aug 28, 2024
    82e314e
  32. [LLVM][C API] Clearing initializer and personality by passing NULL (llvm#105521)
    
    This is similar to how the C++ API supports passing `nullptr` to
    `setPersonalityFn` or `setInitializer`.
    maleadt authored Aug 28, 2024
    0bd5130
  33. dfde1a7
  34. [LoopUnrollAnalyzer] Use constant folding API for loads

    Use ConstantFoldLoadFromConst() instead of a partial re-implementation.
    This makes the code slightly more generic by not depending on the
    exact structure of the constant.
    nikic committed Aug 28, 2024
    fe182dd
  35. [clang] Update C++ DR page (llvm#106299)

    [CWG2917](https://cplusplus.github.io/CWG/issues/2917.html) got a new
    proposed resolution that is different from the one the test has been
    written against.
    
    For [CWG2922](https://cplusplus.github.io/CWG/issues/2922.html), the
    initial "possible resolution" was apparently approved without changes.
    Endilll authored Aug 28, 2024
    9cf052d
  36. Revert "[clang] Add nuw attribute to GEPs" (llvm#106343)

    Reverts llvm#105496
    
    This patch breaks:
    https://lab.llvm.org/buildbot/#/builders/25/builds/1952
    https://lab.llvm.org/buildbot/#/builders/52/builds/1775
    
    Somehow output is different with sanitizers.
    Maybe non-determinism in the code?
    vitalybuka authored Aug 28, 2024
    69437a3
  37. [LoopUnrollAnalyzer] Don't simplify signed pointer comparison

    We're generally not able to simplify signed pointer comparisons
    (because we don't have no-wrap flags that would permit it), so
    we shouldn't pretend that we can in the cost model.
    
    The unsigned comparison case is also not modelled correctly,
    as explained in the added comment. As this is a cost model
    inaccuracy at worst, I'm leaving it alone for now.
    nikic committed Aug 28, 2024
    69c4346
  38. [LSR] Use computeConstantDifference()

    This API is faster than getMinusSCEV() and a SCEVConstant cast.
    nikic committed Aug 28, 2024
    7660981
  39. [X86] Add additional test coverage for half libcall expansion/promotion

    Just need to add powi test with llvm#105775
    RKSimon committed Aug 28, 2024
    760b172
  40. [libc++][math] Remove constrained overloads of `std::{isnan, isinf, isfinite}` (llvm#106224)
    
    ## Why
    Since llvm#98841 and
    llvm#98952, the constrained
    overloads are unused and not needed anymore as we added explicit
    overloads for all floating point types. I forgot to remove them in the
    mentioned PRs.
    
    ## What
    Remove them.
    robincaloudis authored Aug 28, 2024
    2f0661c
  41. 53d1c21
  42. [Clang] [Docs] Document runtime config directory options (llvm#66593)

    In the clang user manual the build options `CLANG_CONFIG_FILE_USER_DIR`
    and `CLANG_CONFIG_FILE_SYSTEM_DIR` are documented, but the run time
    overrides `--config-user-dir` and `--config-system-dir` are not.
    
    I have updated the manual to add these run time arguments.
    xbjfk authored Aug 28, 2024
    15405b3
  43. [IndVars] Check if WideInc available before trying to use it

    WideInc/WideIncExpr can be null. Previously this worked out
    because the comparison with WideIncExpr would fail. Now we have
    accesses to WideInc prior to that. Avoid the issue with an
    explicit check.
    
    Fixes llvm#106239.
    nikic committed Aug 28, 2024
    c9a5e1b
  44. fix(llvm/**.py): fix comparison to None (llvm#94018)

    from PEP8
    (https://peps.python.org/pep-0008/#programming-recommendations):
    
    > Comparisons to singletons like None should always be done with is or
    is not, never the equality operators.
    
    Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
    e-kwsm and e-kwsm authored Aug 28, 2024
    94ed47f
  45. [clang][bytecode] Diagnose array-to-pointer decays of dummy pointers (llvm#106366)
    
    We have type information for them now, so we can do this.
    tbaederr authored Aug 28, 2024
    f7a74ec
  46. [clang-format] js handle anonymous classes (llvm#106242)

    Addresses a regression in JavaScript when formatting anonymous classes.
    
    ---------
    
    Co-authored-by: Owen Pan <owenpiano@gmail.com>
    krasimirgg and owenca authored Aug 28, 2024
    77d63cf
  47. Move stepvector intrinsic out of experimental namespace (llvm#98043)

    This patch moves the stepvector intrinsic out of the experimental
    namespace.
    
    The intrinsic has existed in LLVM for several years now and is widely used.
    mgabka authored Aug 28, 2024
    95d2d1c
  48. 8fd9ec5
  49. [libc] Disable failing scanf test on AMDGPU temporarily

    Summary:
    This test currently fails in the `amdgpu-attributor` pass. I haven't
    figured out anything beyond that yet as it's difficult to reduce.
    jhuber6 committed Aug 28, 2024
    439d7de
  50. f4e7e5d
  51. 71ede8d
  52. [libc++][ranges] P2609R3: Relaxing Ranges Just A Smidge (llvm#101715)

    This patch implements https://wg21.link/p2609r3.
    The test code was originally authored by JMazurkiewicz.
    
    Notes:
    - P2609R3 is not officially a Defect Report, but MSVC STL
      implements it in C++20 mode.
    
      Moreover, P2609R3 and P2997R1 touch exactly the same set of
      concepts, and MSVC STL and libc++ have already treated P2997R1
      as a DR.
    
    - This patch also adjusted feature-test macros.
      + In C++20 mode, the value of __cpp_lib_ranges should be `202110L` because
        - `202202L` covers `range_adaptor_closure` (P2387R3), and
        - `202207L` covers move-only types in range adaptors (P2494R2).
      And all of these changes are only available since C++23 mode.
    
      + In C++23 mode, the value should be `202406L` because
        - `202211L` covers removing poison overloads (P2602R2),
        - `202302L` covers relaxing projected value types (P2609R3), and
        - `202406L` covers removing requirements on `iter_common_reference_t` (P2997R1).
      And all of these changes are already implemented or being implemented.
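    
    The macro values listed above can be read as an ordered ladder. The sketch below encodes only the values and papers named in these notes; `newestRangesPaper` is a hypothetical helper, not part of libc++:
    
    ```cpp
    #include <string_view>
    
    // Hypothetical helper: names the newest paper from the notes above that a
    // given __cpp_lib_ranges value covers.
    constexpr std::string_view newestRangesPaper(long value) {
      if (value >= 202406L) return "P2997R1";
      if (value >= 202302L) return "P2609R3";
      if (value >= 202211L) return "P2602R2";
      if (value >= 202207L) return "P2494R2";
      if (value >= 202202L) return "P2387R3";
      return "none of the listed papers";
    }
    ```
    
    With a standard library that defines the macro, calling `newestRangesPaper(__cpp_lib_ranges)` after including `<version>` reports the level for the current language mode.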
    
    Fixes llvm#105253.
    
    Co-authored-by: Jakub Mazurkiewicz <mazkuba3@gmail.com>
    frederick-vs-ja and JMazurkiewicz authored Aug 28, 2024
    026210e
  53. [VPlan] Move properlyDominates to VPDominatorTree (NFCI).

    This allows for easier re-use in additional places in the future. Also
    move code to VPlanAnalysis.cpp
    fhahn committed Aug 28, 2024
    96e1320
  54. [libc++] Switch to the current XCode beta on macOS builders (llvm#106363)
    
    This unblocks a ton of work including llvm#76756 as it updates to a newer
    version of AppleClang.
    philnik777 authored Aug 28, 2024
    ec9f36a
  55. [ValueLattice] Move intersect from LVI into ValueLattice API (NFC)

    So we can reuse the logic inside IPSCCP.
    nikic committed Aug 28, 2024
    a5b6068
  56. [RemoveDIs] Simplify spliceDebugInfo, fixing splice-to-end edge case (llvm#105670)
    
    Not quite NFC, fixes splitBasicBlockBefore case when we split before an
    instruction with debug records (but without the headBit set, i.e., we are
    splitting before the instruction but after the debug records that come before
    it). splitBasicBlockBefore splices the instructions before the split point into
    a new block. Prior to this patch, the debug records would get shifted up to the
    front of the spliced instructions (as seen in the modified unittest - I believe
    the unittest was checking erroneous behaviour). We instead want to leave those
    debug records at the end of the spliced instructions.
    
    The functionality of the deleted `else if` branch is covered by the remaining
    `if` now that `DestMarker` is set to the trailing marker if `Dest` is `end()`.
    Previously the "===" markers were sometimes detached, now we always detach
    them and always reattach them.
    
    Note: `deleteTrailingDbgRecords` only "unlinks" the trailing marker from the
    block, it doesn't delete anything. The trailing marker is still cleaned up
    properly inside the final `if` body with `DestMarker->eraseFromParent();`.
    
    Part 1 of 2 needed for llvm#105571
    OCHyams authored Aug 28, 2024
    f581553
  57. [libc++] Run the Lit test suite against an installed version of the library (llvm#96910)
    
    We always strive to test libc++ as close as possible to the way we are
    actually shipping it. This was approximated reasonably well by setting
    up the minimal driver flags when running the test suite, however we were
    running the test suite against the library located in the build
    directory.
    
    This patch improves the situation by installing the library (the
    headers, the built library, modules, etc) into a fake location and then
    running the test suite against that fake "installation root".
    
    This should open the door to getting rid of the temporary copy of the
    headers we make during the build process, however this is left for a
    future improvement.
    
    Note that this adds quite a bit of verbosity whenever running the test
    suite because we install the headers beforehand every time. We should be
    able to override this to silence it, however CMake doesn't currently
    give us a way to do that, see https://gitlab.kitware.com/cmake/cmake/-/issues/26085.
    ldionne authored Aug 28, 2024
    0e8208e
  58. [libc++] P2747R2: constexpr placement new (library part) (llvm#105768)

    This patch implements https://wg21.link/P2747R2.
    
    The library changes affect direct `operator new` and `operator new[]`
    calls even when the core language changes are absent.
    
    The changes are not available for MS ABI because the `operator new` and
    `operator new[]` are from VCRuntime's `<vcruntime_new.h>`. A feature
    request was submitted for that [1].
    
    As a drive-by change, the patch reformatted the whole `new.pass.cpp` and
    `new_array.pass.cpp` tests.
    
    Closes llvm#105427
    
    [1]: https://developercommunity.visualstudio.com/t/constexpr-for-placement-operator-newope/10730304.
    frederick-vs-ja authored Aug 28, 2024
    7808541
  59. [mlir][tensor] Add a test for invalid tensor.pack (llvm#106246)

    Adds a missing test for when the rank of the output tensor doesn't match
    the input tensor rank + number of blocking factors.
    banach-space authored Aug 28, 2024
    74d1960
  60. [flang] Update the date_and_time intrinsic for AIX (llvm#104849)

    Currently, strftime is called to get the timezone for the ZONE argument.
    On AIX, this routine requires an environment variable to be set in order to
    return the required format. This patch adds a computation of the time
    difference from UTC for that platform.
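    
    One portable way to compute such an offset, shown here only as a sketch and not as the code in this patch, is to compare the broken-down local and UTC representations of the same instant and format the difference as `+hhmm` or `-hhmm`. `utcOffsetSeconds` and `formatZone` are hypothetical names:
    
    ```cpp
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <string>
    
    // Offset of local time from UTC in seconds, computed by comparing the
    // broken-down local and UTC representations of the same instant.
    // Uses std::gmtime/std::localtime, which are not thread-safe.
    long utcOffsetSeconds(std::time_t now) {
      const std::tm utc = *std::gmtime(&now);  // copy before the next call
      const std::tm local = *std::localtime(&now);
      // The two calendars differ by at most one day; compare years first so
      // the New Year boundary is handled.
      int dayDiff = local.tm_yday - utc.tm_yday;
      if (local.tm_year != utc.tm_year)
        dayDiff = local.tm_year > utc.tm_year ? 1 : -1;
      return dayDiff * 86400L + (local.tm_hour - utc.tm_hour) * 3600L +
             (local.tm_min - utc.tm_min) * 60L + (local.tm_sec - utc.tm_sec);
    }
    
    // Formats an offset as "+hhmm" or "-hhmm", the shape used by ZONE.
    std::string formatZone(long offsetSeconds) {
      const char sign = offsetSeconds < 0 ? '-' : '+';
      const long minutes = std::labs(offsetSeconds) / 60;
      char buf[32];
      std::snprintf(buf, sizeof buf, "%c%02ld%02ld", sign, minutes / 60,
                    minutes % 60);
      return buf;
    }
    ```
    
    This approach does not depend on any environment variable beyond whatever the C library itself uses to determine local time.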
    kkwli authored Aug 28, 2024
    8b198ee
  61. [clang] Minor updates to C++ DR page design (llvm#106360)

    This patch updates the `make_cxx_dr_status` script to use the same
    spoiler-like way to hide additional details that `cxx_status.html` uses.
    This gives implemented yet unresolved DRs a new but very familiar look:
    
    ![s9EpO0E](https://github.com/user-attachments/assets/54852d7b-5fdd-4595-8dca-20628797f952)
    
    I also took the opportunity to fix a spelling inconsistency pointed out by
    @zygoloid in
    llvm#106299 (comment).
    
    I got tired of counting `%s`s when we substitute data into the HTML
    template, so I replaced them with an f-string (available since Python
    3.6), because I had to touch this code anyway.
    Endilll authored Aug 28, 2024
    fc39cc1
  62. b8c0e8a
  63. [mlir][amdgpu] Improve Chipset version utility (llvm#106169)

    * Fix an OOB access
    * Add comparison operators
    * Add documentation
    * Add unit tests
    kuhar authored Aug 28, 2024
    b2f1d06
  64. 8a50e35
  65. [InstCombine][X86] Only demand used bits for PSHUFB mask values (llvm#106377)
    
    (V)PSHUFB only uses the sign bit (for zeroing) and the lower 4 bits (to index per-lane byte 0-15) - so use SimplifyDemandedBits to ignore anything touching the remaining bits.
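    
    A scalar model of one 16-byte lane makes the demanded bits concrete. The `Lane` alias and `pshufbLane` function are illustrative names, not LLVM code:
    
    ```cpp
    #include <array>
    #include <cstddef>
    #include <cstdint>
    
    using Lane = std::array<std::uint8_t, 16>;
    
    // Scalar model of PSHUFB on one 16-byte lane.
    Lane pshufbLane(const Lane &src, const Lane &mask) {
      Lane out{};
      for (std::size_t i = 0; i < out.size(); ++i) {
        // Bit 7 set: the result byte is zeroed. Otherwise bits 0-3 pick the
        // source byte; bits 4-6 never affect the result.
        out[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
      }
      return out;
    }
    ```
    
    Two masks that differ only in bits 4-6 therefore produce the same result, which is what lets the optimizer ignore anything that only touches those bits.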
    
    Fixes llvm#106256
    RKSimon authored Aug 28, 2024
    51a0951
  66. 158ba73
  67. [libc++] Mark a few papers as done or "Nothing To Do"

    Please refer to the Github issues for details on why those are marked
    as resolved. Huge thanks to @frederick-vs-ja for the analysis.
    
    Closes llvm#104336
    Closes llvm#100042
    Closes llvm#100615
    ldionne committed Aug 28, 2024
    cc0f2d5
  68. [MachineOutliner][NFC] Remove unnecessary RepeatedSequenceLocs.clear() (llvm#106171)
    
    - When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no
    `OutlinedFunction` is created), there is no need to clear the input
    argument, `RepeatedSequenceLocs`, as it's already being cleared in the
    main loop of `findCandidates()`.
    - Replaced `2` by `MinRepeats`, which I missed from
    llvm#105398
    kyulee-com authored Aug 28, 2024
    140381d
  69. [compiler-rt][rtsan] NFC: Introduce __rtsan_expect_not_realtime helper (llvm#106314)
    
    We are extracting this function into the C API so we can eventually
    install it when a user marks a function [[clang::blocking]].
    cjappl authored Aug 28, 2024
    fee4836
  70. [lldb][lldb-dap][test] Enable variable tests on Windows

    At least for our Windows on Arm machine compiling with clang-cl,
    it has inverted which variables get a `::` prefix.
    
    Would not surprise me if msvc does the opposite so feel free to
    revert if these tests fail for you.
    DavidSpickett committed Aug 28, 2024
    a3cd8d7
  71. [VPlan] Move logic to create interleave groups to VPlanTransforms (NFC).

    This is a step towards further breaking up the rather large
    tryToBuildVPlanWithVPRecipes. It moves the logic to create interleave groups to
    VPlanTransforms.cpp, where similar replacements for other recipes are
    defined as well (e.g. EVL-based ones)
    fhahn committed Aug 28, 2024
    16910a2
  72. [PowerPC] fix legalization crash (llvm#105563)

    If v2i64 scalar_to_vector is made custom, llc can crash in certain
    legalization cases where v2i64 vectors are injected, even if they
    weren't otherwise present. The code generated would be fine, but that
    operation is not handled in ReplaceNodeResults. Add handling.
    RolandF77 authored Aug 28, 2024
    89bbcbe
  73. [flang] Warn when F128 is unsupported (llvm#102147)

    This generates `warning: REAL(KIND=16) is not an enabled type for this
    target` if that type is used in a build not correctly configured to
    support this type. Uses of `selected_real_kind(30)` return -1.
    tblah authored Aug 28, 2024
    114ff99
  74. 37d0841
  75. [X86][LegalizeDAG] FPOWI: promote f16 operand (llvm#105775)

    Fixes llvm#105747
    
    ---------
    
    Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
    v01dXYZ and v01dxyz authored Aug 28, 2024
    ecd9e0b
  76. [LLVM][NVPTX] Remove nonexistent ftz ops (llvm#106100)

    According to the PTX
    [spec](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-max),
    max & min instructions do not support the `ftz` modifier for `bf16` &
    `bf16x2` types. This PR removes them from instr info, and the non-ftz
    legal versions will be emitted instead.
    zyx-billy authored Aug 28, 2024
    82113a4
  77. [CodeGen] Create IFUNCs in the program address space, not hard-coded 0 (llvm#105726)
    
    Commit 0d527e5 ("GlobalIFunc: Make ifunc respect function address
    spaces") added support for this within LLVM, but Clang does not properly
    honour the target's address spaces when creating IFUNCs, crashing with
    RAUW and verifier assertion failures when compiling C code on a target
    with a non-zero program address space, so fix this.
    jrtc27 authored Aug 28, 2024
    73e0aa5
  78. [InterleavedAccess] Use SmallVectorImpl references. NFC

    Instead of repeating SmallVector size in multiple places.
    topperc committed Aug 28, 2024
    829c47f
  79. [lldb][lldb-dap][test] Enable more attach tests on Windows

    By adding the equivalent includes.
    DavidSpickett committed Aug 28, 2024
    af3ee62
  80. be7014e
  81. [clang][bytecode] Fix llvm#55390 here as well (llvm#106395)

    Ignore the multiplication overflow but report the 0 denominator.
    tbaederr authored Aug 28, 2024
    40db261
  82. b40677c
  83. [AMDGPU] Don't realign already allocated LDS. Point fix for 106412 (llvm#106421)
    
    Fixes 106412. The logic that skips the pass on already-lowered variables
    doesn't cover the path that increases alignment of variables. If a
    variable is allocated at 24 and then given 16 byte alignment, the
    backend notices and fatal-errors on the inconsistency.
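
    The inconsistency in that example is plain arithmetic: offset 24 is not a
    multiple of 16. A minimal sketch of such a check, using a hypothetical
    `isAligned` helper rather than any backend function:

    ```cpp
    #include <cassert>
    #include <cstdint>

    // True when `offset` is a multiple of `alignment`.
    // `alignment` must be a non-zero power of two.
    bool isAligned(std::uint64_t offset, std::uint64_t alignment) {
      assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
      return (offset & (alignment - 1)) == 0;
    }
    ```

    Here `isAligned(24, 8)` holds but `isAligned(24, 16)` does not, so raising
    the alignment of an already placed variable from 8 to 16 bytes
    contradicts its fixed address.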
    JonChesterfield authored Aug 28, 2024
    1bde8e0
  84. [LTO] Introduce new type alias ImportListsTy (NFC) (llvm#106420)

    The background is as follows.  I'm planning to reduce the memory
    footprint of ThinLTO indexing by changing ImportMapTy, the data
    structure used for an import list.  Once this patch lands, I'm
    planning to change the type slightly.  The new type alias allows us to
    update the type without touching many places.
    kazutakahirata authored Aug 28, 2024
    4f15039
  85. [libc++] Replace 'tags' in CSV status pages by inline notes (llvm#105581)
    
    This patch replaces 'tags' in the CSV status pages by inline notes
    that optionally describe more details about the paper/LWG issue.
    
    Tags were not really useful anymore because we have a vastly superior
    tagging system via Github issues, and keeping the tags up-to-date
    between CSV files and Github is going to be really challenging.
    
    This patch also adds support for encoding custom notes in the CSV
    files via Github issues. To encode a note in the CSV file, the
    body (initial description) of a Github issue can be edited to contain
    the following markers:
    
        BEGIN-RST-NOTES
        text that will be added as a note in the RST
        END-RST-NOTES
    
    Amongst other things, this solves the problem of conveying that a
    paper has been implemented as a DR, and it gives a unified way to
    add notes to the status pages from Github.
    ldionne authored Aug 28, 2024
    c2cac69
  86. ef403f9
  87. a4989cd
  88. [VPlan] Pass live-ins used as exit values straight to live-out.

    Live-ins that are used as exit values don't need to be extracted, they
    can be passed through directly. This fixes a crash when trying to
    extract from a live-in.
    
    Fixes llvm#106257.
    fhahn committed Aug 28, 2024
    4b84288
  89. DAG: Change round-mode operand type to i32 for FPTRUNC_ROUND (llvm#106424)
    
    We need this immediate type to be consistent. This is the pre-commit for
    llvm#105761
    changpeng authored Aug 28, 2024
    41b5507
  90. [gn build] Port 71ede8d

    llvmgnsyncbot committed Aug 28, 2024
    b5977b5
  91. [gn build] Port 7a4013f

    llvmgnsyncbot committed Aug 28, 2024
    5a5cf51
  92. [gn build] Port f9ad249

    llvmgnsyncbot committed Aug 28, 2024
    efbafbc
  93. [SandboxIR] Add test that checks if classof() is missing. (llvm#106313)

    Forgetting to implement an `<Instruction Subclass>::classof()` function
    does not cause any failures because it falls back to
    Instruction::classof(). This patch adds an explicit check for all
    instruction classes to confirm that they have a classof implementation.
    vporpo authored Aug 28, 2024
    3cf1018
  94. [LV] Add extra tests with interleave groups and different insert pos.

    Add additional test coverage for interleave groups with different insert
    positions.
    fhahn committed Aug 28, 2024
    7912abe
  95. [compiler-rt][test] Rewrote test to remove curly braces (llvm#105696)

    This patch removes curly braces from a test, as lit's internal shell
    implementation does not support curly brace syntax.
    
    Fixes llvm#102382.
    connieyzhu authored Aug 28, 2024
    b978bcc
  96. [profile][test] Build Posix/instrprof-dlopen-norpath.test objects as PIC (llvm#106406)
    
    `Profile-x86_64 :: Posix/instrprof-dlopen-norpath.test` `FAILs` on
    Solaris/amd64 and similarly on Solaris/sparcv9:
    ```
    RUN: at line 10: ./a.out 2>&1 | FileCheck compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test -check-prefix=CHECK-FOO
    + ./a.out
    + FileCheck compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test -check-prefix=CHECK-FOO
    compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test:24:12: error: CHECK-FOO: expected string not found in input
    CHECK-FOO: foo:
               ^
    <stdin>:1:1: note: scanning from here
    unable to lookup symbol 'foo': ld.so.1: a.out: invalid handle: 0x0
    ```
    The problem turned out to be two-fold: `OPEN_AND_RUN` didn't check the
    `dlopen` return value and the objects linked into the shared objects to
    be `dlopen`ed aren't built as PIC.
    
    This patch fixes the latter.
    
    Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
    `x86_64-pc-linux-gnu`.
    rorth authored Aug 28, 2024
    e03669a
  97. [RISCV] Add cost model coverage for insert/extract element w/ 2^N - 1 types
    
    We currently return costs which are too low for these.
    preames committed Aug 28, 2024
    c43190f
  98. [NVPTX] Support __usAtomicCAS builtin (llvm#99646)

    Supported `__usAtomicCAS` builtin originally defined in
    `/usr/local/cuda/include/crt/sm_70_rt.hpp`
    
    ---------
    
    Co-authored-by: Denis Gerasimov <Denis.Gerasimov@baikalelectronics.ru>
    Co-authored-by: Gonzalo Brito Gadeschi <gonzalob@nvidia.com>
    Co-authored-by: Denis.Gerasimov <dengzmm@gmail.com>
    4 people authored Aug 28, 2024
    2d1fba6
  99. [LTO] Turn ImportListsTy into a proper class (NFC) (llvm#106427)

    This patch turns ImportListsTy into a class that wraps
    DenseMap<StringRef, ImportMapTy>.
    
    Here is the background.  I'm planning to reduce the memory footprint
    of ThinLTO indexing.  Specifically, ImportMapTy, the list of imports
    for a given destination module, will be a hash set of integer IDs
    indexing into a deduplication table of pairs (SourceModule, GUID),
    which is a lot like string interning.  I'm planning to put this
    deduplication table as part of ImportListsTy and have each instance of
    ImportMapTy hold a reference to the deduplication table.
    
    Another reason to wrap the DenseMap is that I need to intercept
    operator[]() so that I can construct an instance of ImportMapTy with a
    reference to the deduplication table.  Note that the default
    implementation of operator[]() would default-construct ImportMapTy,
    which I am going to disable.
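
    The plan described above can be sketched roughly as follows. This is not
    the LLVM implementation: `DedupTable`, `ImportSet`, and `ImportLists` are
    hypothetical stand-ins, and standard containers replace DenseMap and
    StringRef. It shows interning (SourceModule, GUID) pairs as integer IDs
    and a wrapper whose `operator[]` builds each per-module set with a
    reference to the shared table.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <map>
    #include <string>
    #include <unordered_set>
    #include <utility>
    #include <vector>

    // (SourceModule, GUID) pair; hypothetical alias for illustration.
    using ImportKey = std::pair<std::string, std::uint64_t>;

    // Deduplication table: each distinct pair gets one small integer ID.
    class DedupTable {
      std::map<ImportKey, std::uint32_t> Ids;
      std::vector<ImportKey> Keys;

    public:
      std::uint32_t intern(const ImportKey &K) {
        auto [It, Inserted] =
            Ids.try_emplace(K, static_cast<std::uint32_t>(Keys.size()));
        if (Inserted)
          Keys.push_back(K);
        return It->second;
      }
      const ImportKey &lookup(std::uint32_t Id) const {
        assert(Id < Keys.size());
        return Keys[Id];
      }
      std::size_t size() const { return Keys.size(); }
    };

    // Import list for one destination module: a hash set of IDs plus a
    // reference to the shared table, so it cannot be default-constructed.
    class ImportSet {
      DedupTable &Table;
      std::unordered_set<std::uint32_t> Ids;

    public:
      explicit ImportSet(DedupTable &T) : Table(T) {}
      void add(const std::string &Module, std::uint64_t Guid) {
        Ids.insert(Table.intern({Module, Guid}));
      }
      std::size_t size() const { return Ids.size(); }
    };

    // Wrapper that owns the table and intercepts operator[] so new entries
    // are constructed with a reference to it.
    class ImportLists {
      DedupTable Table;
      std::map<std::string, ImportSet> Lists;

    public:
      ImportLists() = default;
      ImportLists(const ImportLists &) = delete; // sets refer to our Table
      ImportSet &operator[](const std::string &Dest) {
        return Lists.try_emplace(Dest, Table).first->second;
      }
      std::size_t tableSize() const { return Table.size(); }
    };
    ```

    With this shape, two destination modules that import the same
    (SourceModule, GUID) pair store only one copy of the pair in the table
    and one integer each in their own sets.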
    kazutakahirata authored Aug 28, 2024
    e61d606
  100. [clang] check deduction consistency when partial ordering function templates (llvm#100692)
    
    This makes partial ordering of function templates consistent with other
    entities, by implementing [temp.deduct.type]p1 in that case.
    
    Fixes llvm#18291
    mizvekov authored Aug 28, 2024
    aa7497a
  101. [ADT] Relax iterator constraints on all_equal (llvm#106400)

    The previous `all_equal` implementation contained `Begin + 1`, which
    implicitly requires `Begin` to model the
    [random_access_iterator](https://en.cppreference.com/w/cpp/iterator/random_access_iterator)
    concept due to the usage of the `+` operator. By swapping this out with
    `std::next`, this method can be used with weaker iterator concepts, such
    as
    [forward_iterator](https://en.cppreference.com/w/cpp/iterator/forward_iterator).
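
    A minimal sketch of the idea, assuming a hypothetical `allEqual` helper
    that is not the LLVM source: stepping with `std::next` only needs a
    forward iterator, so the helper also works on `std::forward_list`, where
    `Begin + 1` would not compile.

    ```cpp
    #include <cassert>
    #include <forward_list>
    #include <iterator>

    // Returns true when every element of R compares equal to the first one.
    // An empty range counts as all-equal.
    template <typename Range> bool allEqual(const Range &R) {
      auto Begin = std::begin(R);
      auto End = std::end(R);
      if (Begin == End)
        return true;
      // std::next works for forward iterators; `Begin + 1` would require
      // random access.
      for (auto It = std::next(Begin); It != End; ++It)
        if (!(*It == *Begin))
          return false;
      return true;
    }
    ```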
    
    ---------
    
    Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
    aws-taylor and kuhar authored Aug 28, 2024
    6b4b8dc
  102. ec360d6
  103. 898d52b
  104. [clang][HLSL] Update DXIL/SPIRV hybrid CodeGen tests to use temp var (llvm#105930)
    
    Update all hybrid DXIL/SPIRV codegen tests to use a temp variable
    representing the interchange target
    
    Fixes: llvm#105710
    AmrDeveloper authored Aug 28, 2024
    e99aa4a
  105. [mlir][spirv] Add an argmax integration test with mlir-vulkan-runner (llvm#106426)
    
    This PR adds an integration test for an argmax kernel with
    `mlir-vulkan-runner`. This test exercises the `convert-to-spirv` pass
    (landed in llvm#95942) and demonstrates that we can use SPIR-V ops as
    "intrinsics" among higher-level dialects.
    
    The support for `index` dialect in `mlir-vulkan-runner` is also added.
    angelz913 authored Aug 28, 2024
    17b7a9d
  106. Disable ThreadPlanSingleThreadTimeout during step over breakpoint (llvm#104532)
    
    This PR fixes another race condition in
    llvm#90930. The failure was found
    by @labath with this log: https://paste.debian.net/hidden/30235a5c/:
    ```
    dotest_wrapper.  <  15> send packet: $z0,224505,1#65
    ...
    b-remote.async>  <  22> send packet: $vCont;s:p1dcf.1dcf#4c
    intern-state     GDBRemoteClientBase::Lock::Lock sent packet: \x03
    b-remote.async>  < 818> read packet: $T13thread:p1dcf.1dcf;name:a.out;threads:1dcf,1dd2;jstopinfo:5b7b226e616d65223a22612e6f7574222c22726561736f6e223a227369676e616c222c227369676e616c223a31392c22746964223a373633317d2c7b226e616d65223a22612e6f7574222c22746964223a373633347d5d;thread-pcs:0000000000224505,00007f4e4302119a;00:0000000000000000;01:0000000000000000;02:0100000000000000;03:0000000000000000;04:9084997dfc7f0000;05:a8742a0000000000;06:b084997dfc7f0000;07:6084997dfc7f0000;08:0000000000000000;09:00d7e5424e7f0000;0a:d0d9e5424e7f0000;0b:0202000000000000;0c:80cc290000000000;0d:d8cc1c434e7f0000;0e:2886997dfc7f0000;0f:0100000000000000;10:0545220000000000;11:0602000000000000;12:3300000000000000;13:0000000000000000;14:0000000000000000;15:2b00000000000000;16:80fbe5424e7f0000;17:0000000000000000;18:0000000000000000;19:0000000000000000;reason:signal;#b9
    ```
    It shows an async interrupt "\x03" was sent immediately after `vCont;s`
    single step over breakpoint at address `0x224505` (which was disabled
    before vCont). And the later stop was still at the original PC
    (0x224505) not moving forward.
    
    The investigation shows that the failure happens when the timeout is
    short and the async interrupt is sent to lldb-server immediately after
    vCont: ptrace() resumes the debuggee and the interrupt stops it again
    before it gets a chance to execute and move the PC, so it stops at the
    original PC. `ThreadPlanStepOverBreakpoint` does not expect the PC to
    stay put and reports a stop at the original place.
    
    To fix this, the PR prevents `ThreadPlanSingleThreadTimeout` from being
    created during `ThreadPlanStepOverBreakpoint` by introducing a new
    `SupportsResumeOthers()` method, for which `ThreadPlanStepOverBreakpoint`
    returns false. This makes sense because we should never resume other
    threads while stepping over a breakpoint anyway; otherwise they might
    miss the breakpoint.
    
    ---------
    
    Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
    jeffreytan81 and jeffreytan81 authored Aug 28, 2024
    38b252a
  107. 0281339
  108. AMDGPU: Rename fail.llvm.fptrunc.round.ll to llvm.fptrunc.round.err.ll (llvm#106452)
    
    Also correct the suffix of the intrinsic
    changpeng authored Aug 28, 2024
    53d95f3
  109. [LTO] Make getImportType a proper function (NFC) (llvm#106450)

    I'm planning to reduce the memory footprint of ThinLTO indexing by
    changing ImportMapTy.  A look-up of the import type will involve data
    private to ImportMapTy, so it must be done by a member function of
    ImportMapTy.  This patch turns getImportType into a member function so
    that a subsequent "real" change will just have to update the
    implementation of the function in place.
    kazutakahirata authored Aug 28, 2024
    eb9c49c
  110. [DXIL] Don't generate per-variable guards for DirectX (llvm#106096)

    Thread init guards are generated for local static variables when using
    the Microsoft CXX ABI. This ABI is also used for HLSL generation, but
    DXIL doesn't need the corresponding _Init_thread_header/footer calls and
    doesn't really have a way to handle them in its output targets.
    
    This modifies the language ops when the target is DXIL to exclude this
    so that they won't be generated and an alternate guardvar method is used
    that is compatible with the usage.
    
    Done to facilitate testing for llvm#89806, but isn't really related
    pow2clk authored Aug 28, 2024
    26c582b
  111. 18c79ca
  112. 1bc7057
  113. [clang][bytecode] Implement constexpr vector unary operators +, -, ~, ! (llvm#105996)
    
    Implement constexpr vector unary operators +, -, ~ and ! .
    
    - Follow the current constant interpreter. All of our boolean operations
    on vector types should be '-1' for the 'truth' type.
    - Move the following functions from `Sema` to `ASTContext`, because they
    are used in the new interpreter.
    ```C++
    QualType GetSignedVectorType(QualType V);
    QualType GetSignedSizelessVectorType(QualType V);
    ```
    
    ---------
    
    Signed-off-by: yronglin <yronglin777@gmail.com>
    yronglin authored Aug 28, 2024
    ee0d706
  114. [OpenMP][NFC] Remove executable cases from declaration switch (llvm#106438)
    
    The executable directives are handled earlier.
    mikerice1969 authored Aug 28, 2024
    13fa78c
  115. [RISCV] Remove effectively duplicate RUN lines from fixed-vectors-fp.ll. NFC
    
    We had RUN lines with +v,+f and +v,+f,+d. +v implies +f and +d so
    these are equivalent.
    topperc committed Aug 28, 2024
    431db18
  116. a7ba73b

Commits on Aug 29, 2024

  1. [llvm-profdata] Enabled functionality to write split-layout profile (llvm#101795)
    
    Using the flag `-split_layout` in llvm-profdata merge, the output
    profile can write profiles with and without inlined functions into two
    different extbinary sections (and their FuncOffsetTable too). The
    section without inlined functions is marked with `SecFlagFlat` and is
    skipped by ThinLTO because it provides no useful info.
    
    The split layout feature was already implemented in SampleProfWriter, but
    previously there was no way to use it from llvm-profdata.
    huangjd authored Aug 29, 2024
    75e9d19
  2. [NFC] Fix formatv() usage in preparation of validation (llvm#106454)

    Fix several uses of formatv() that would be flagged as invalid by an
    upcoming change that will add additional validation to formatv().
    jurahul authored Aug 29, 2024
    b75fe11
  3. [MachineLoopInfo] Fix getLoopID to handle multi latches. (llvm#106195)

    This patch also fixed `CodegenPrepare` to preserve loop metadata when
    merging blocks.
    
    This fixes issue llvm#102632
    FreddyLeaf authored Aug 29, 2024
    3a5c578
  4. workflows/release-binaries: Enable flang builds on Windows (llvm#101344)

    Flang for Windows depends on compiler-rt, so we need to enable it for
    the stage1 builds. This also fixes failures building the flang tests on
    macOS.
    
    Fixes llvm#100202.
    tstellar authored Aug 29, 2024
    8927576
  5. [clang-format] Revert "[clang-format][NFC] Delete TT_LambdaArrow (#70519)" (llvm#105923)
    
    This reverts commit e00d32a and adds a
    test for lambda arrow SplitPenalty.
    
    Fixes llvm#105480.
    owenca authored Aug 29, 2024
    438ad9f
  6. [X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (llvm#96878)
    
    This is the SimplifyCFG part of
    llvm#95515
    
    In this PR, we support hoisting load/store with conditional faulting in
    `SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional
    branches.
    This is for cases like
    ```
    void test (int a, int *b) {
      if (a)
       *b = a;
    }
    ``` 
    
    In the following patches, we will support the hoist in
    `SimplifyCFGOpt::hoistCommonCodeFromSuccessors`.
    That is for cases like
    ```
    void test (int a, int *c, int *d) {
      if (a)
       *c = a;
      else 
       *d = a;
    }
    ```
    KanRobert authored Aug 29, 2024
    87c86aa
  7. [SLP] Fix the Vec lane overridden by the shuffle mask (llvm#106341)

    Currently, SLP uses shuffle for the external user of `InsertElementInst`
    and iterates through the `InsertElementInst` chain to fill the mask with
    constant indices. However, it may override the original Vec lane. Using
    the original Vec lane is sufficient.
    tcwzxx authored Aug 29, 2024
    121fb2c
  8. ee6961d
  9. [LLDB][Minidumps] Read x64 registers as 64b and handle truncation in the file builder (llvm#106473)
    
    This patch addresses a bug where `cs`/`fs` and other segmentation flags
    were being identified as having a type of `32b` and `64b` for `rflags`.
    In that case the register value was returning the fail value `0xF...`
    and this was corrupting some minidumps. Here we just read it as a 64b
    value and truncate it.
    
    In addition to that fix, the generic minidump test now compares the
    registers from the live process with those from the loaded core.
    Previously there were only ARM register tests, which explains why this
    was not detected before.
    Jlalond authored Aug 29, 2024
    82ebd33
  10. Reapply "[mlir] NFC: fix dependence of (Tensor|Linalg|MemRef|Complex) dialects on LLVM Dialect and LLVM Core in CMake build (llvm#104832)" (llvm#105703)
    
    Reapply the commit 43b5085 with
    additional fixes for building with
    BUILD_SHARED_LIBS=ON.
    christopherbate authored Aug 29, 2024
    8bf69ce
  11. [C++20] [Modules] Merge lambdas in source to imported lambdas (llvm#106483)
    
    Close llvm#102721
    
    Generally, the type of merged decls is reused in ASTContext. But for a
    lambda, in the import-then-include case, we can't determine its previous
    decl in the imported modules before creating it, because we can't decide
    its numbering until then, so we can't assign the previous decl before
    creating its type. We therefore have to assign the previous decl and the
    canonical type after creating it, which is unusual and slightly hacky.
    ChuanqiXu9 authored Aug 29, 2024
    55cdb3c
  12. [RISCV] Fix v[f]slide1down.vx having VL changed (llvm#106110)

    v[f]slide1down.vx uses VL to determine where the element is inserted
    into, so changing the VL changes the result.
    
    This fixes this by setting ActiveElementsAffectsResult, but it's overly
    conservative. We should relax this later by modelling that it's ok to
    change the mask, just not VL.
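
    A scalar sketch of why VL matters here, assuming the usual unmasked
    semantics (elements below VL - 1 slide down by one, the scalar lands at
    index VL - 1, elements at or above VL keep their old value). The
    function and parameter names are illustrative, not taken from the patch.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Unmasked model of a slide-down-by-one with scalar insert.
    // `dest` supplies the tail elements (index >= vl), which stay unchanged.
    std::vector<std::int64_t> vslide1down(std::vector<std::int64_t> dest,
                                          const std::vector<std::int64_t> &src,
                                          std::int64_t scalar, std::size_t vl) {
      assert(vl <= src.size() && dest.size() == src.size());
      for (std::size_t i = 0; i + 1 < vl; ++i)
        dest[i] = src[i + 1]; // slide active elements down by one
      if (vl > 0)
        dest[vl - 1] = scalar; // insertion point depends on vl
      return dest;
    }
    ```

    With source elements 10, 11, 12, 13 and scalar 99, VL = 4 puts 12 at
    index 1 while VL = 2 puts 99 there, so shrinking VL changes elements
    that are still active.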
    
    Fixes llvm#106109
    lukel97 authored Aug 29, 2024
    619efd7
  13. 051054e
  14. [Attributor] Fix an issue that could potentially cause `AccessList` and `OffsetBins` out of sync (llvm#106187)
    
    The implementation of `AAPointerInfo::RangeList::set_difference` doesn't
    consider the case where two ranges have the same offset but different
    sizes. This could cause `AccessList` and `OffsetBins` to get out of sync,
    because a range has already been updated in `AccessList` but is missing
    from `ToRemove`.
    
    I do have a reproducer, but it is 248kb and `llvm-reduce` can't reduce it
    further. Not sure how I can make a smaller reproducer.
    
    Fixes: SWDEV-479757.
    shiltian authored Aug 29, 2024
    572d2fd
  15. workflows/release-tasks: Pass required secrets to all called workflows (llvm#106286)
    
    Called workflows don't have access to secrets by default, so we need to
    explicitly pass secrets that we use.
    tstellar authored Aug 29, 2024
    9d81e7e
  16. [mlir] fix missing LLVMDialect dependency for MLIRSCFToControlFlow

    This is a fix-forward for 8bf69ce.
    The SCF-to-ControlFlow pass has an explicit LLVMDialect dependency.
    christopherbate committed Aug 29, 2024
    95361cf
  17. [RISCV] Fix a place that converts an immediate to MCRegister and back …

    …to immediate.
    
    This dropped the upper 32 bits of the immediate, but I'm not sure
    it is ever non-zero.
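
A small illustration of the truncation mentioned above. It assumes, for the sake of the example, that the register wrapper stores a 32-bit unsigned value; `through_32bit_field` is a hypothetical helper, not an LLVM function.

```python
def through_32bit_field(imm: int) -> int:
    """Model storing a 64-bit immediate in a 32-bit unsigned field and reading it back."""
    return imm & 0xFFFF_FFFF


small = 0x1234          # fits in 32 bits, survives the round trip
large = 0x1_0000_0042   # has a bit set above bit 31, which is lost
```

The round trip is harmless only as long as the upper 32 bits are zero, which matches the uncertainty expressed in the commit message.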
    topperc committed Aug 29, 2024
    62c5de3
  18. 2adc94c
  19. [RISCV] Decompose LMUL > 1 reverses into LMUL * M1 vrgather.vv (llvm#…

    …104574)
    
    As far as I'm aware, vrgather.vv is quadratic in LMUL on most
    microarchitectures today due to each output register needing to read
    from each input register in the group.
    
    For example, the reciprocal throughput for vrgather.vv on the
    spacemit-x60 is listed on
    https://camel-cdr.github.io/rvv-bench-results/bpi_f3 as:
    
        LMUL1   LMUL2   LMUL4   LMUL8
        4.0     16.0    64.0    256.1
    
    Vector reverses are commonly emitted by the loop vectorizer and are
    lowered as vrgather.vvs, but since the loop vectorizer uses LMUL 2 by
    default they end up being quadratic.
    
    The output registers in a reverse only need to read from one input
    register though, so we can decompose this into LMUL * M1 vrgather.vvs to
    get linear performance.
    
    This gives a 0.43% runtime improvement on 526.blender_r at rva22u64_v O3
    on the Banana Pi F3.
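
The arithmetic behind the quadratic-versus-linear argument can be sketched as follows. This is a simplified cost model, assuming the LMUL=1 figure from the table above is the cost of one M1 `vrgather.vv` and ignoring any extra instructions the decomposition may need; the function names are made up for illustration.

```python
M1_COST = 4.0  # reciprocal throughput at LMUL=1, taken from the table above

# Measured values quoted in the commit message.
measured = {1: 4.0, 2: 16.0, 4: 64.0, 8: 256.1}


def whole_group_cost(lmul: int) -> float:
    """Quadratic model: each output register reads every input register."""
    return M1_COST * lmul * lmul


def decomposed_cost(lmul: int) -> float:
    """Linear model: LMUL separate M1 gathers, one per output register."""
    return M1_COST * lmul


for lmul in (1, 2, 4, 8):
    print(lmul, whole_group_cost(lmul), decomposed_cost(lmul))
```

Under this model the quadratic estimate tracks the measured numbers closely, and at the loop vectorizer's default of LMUL 2 the decomposed form costs half as much.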
    lukel97 authored Aug 29, 2024
    3b64ede
  20. [bugpoint] Fix bugpoint for LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABL…

    …ES=Off.
    
    Building with -DLLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off should not
    prevent use of bugpoint plugins.
    
    This fix uses the approach implemented in
    llvm#101741.
    lhames committed Aug 29, 2024
    8f96be9
  21. [AVR] Fix 16-bit LDDs with immediate overflows (llvm#104923)

    16-bit loads are expanded into a pair of 8-bit loads, so the maximum
    offset of such 16-bit loads must be 62, not 63.
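
The bound can be checked with a few lines. This sketch assumes the 6-bit displacement range of `LDD` (0 to 63) and models a 16-bit load as two byte loads at `offset` and `offset + 1`; `fits_ldd` is a hypothetical helper.

```python
MAX_LDD_DISPLACEMENT = 63  # LDD encodes a 6-bit displacement


def fits_ldd(offset: int, size_bytes: int) -> bool:
    """True if every byte load of the access has a displacement in 0..63."""
    last = offset + size_bytes - 1
    return 0 <= offset and last <= MAX_LDD_DISPLACEMENT
```

An 8-bit load at offset 63 is fine, but a 16-bit load there would need a second byte load at offset 64, so 62 is the largest usable offset.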
    Patryk27 authored Aug 29, 2024
    c7a4efa
  22. [IPSCCP] Intersect attribute info for interprocedural args (llvm#106397)

    IPSCCP can currently return worse results than SCCP for arguments that
    are tracked interprocedurally, because information from attributes is
    not used for them.
    
    Fix this by intersecting in the attribute information when propagating
    lattice values from calls.
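
A toy model of the intersection step, assuming for illustration that both the propagated lattice value and the attribute information can be written as closed integer intervals. The real lattice is richer, and `intersect` is a hypothetical helper.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # closed interval [lo, hi]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersection of two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


from_call_sites = (0, 1000)  # what propagation from the calls alone gives
from_attribute = (0, 9)      # what an attribute on the argument promises

refined = intersect(from_call_sites, from_attribute)
```

Ignoring the attribute leaves the wider interval, which is how the interprocedural result could end up worse than the intraprocedural one.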
    nikic authored Aug 29, 2024
    7f59264
  23. [lldb][lldb-dap][test] Enable more tests on Windows

    These tests "just work" on our Windows On Arm machine.
    DavidSpickett committed Aug 29, 2024
    c954306
  24. [C++20] [Modules] Don't insert class not in named modules to PendingE…

    …mittingVTables (llvm#106501)
    
    Close llvm#102933
    
    The root cause of the issue is an oversight in llvm#102287: I didn't
    notice that PendingEmittingVTables should only accept classes in named
    modules.
    ChuanqiXu9 authored Aug 29, 2024
    47615ff
  25. [clang-repl] Fix clang-repl for LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECU…

    …TABLES=Off.
    
    clang-repl should still work when LLVM is built with
    -DLLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off.
    
    This fix uses the approach implemented in
    llvm#101741.
    
    rdar://134910110
    lhames committed Aug 29, 2024
    e5b55e6
  26. [C++20] [Modules] Embed all source files for C++20 Modules (llvm#102444)

    Close llvm#72383
    
    The rationale for this implementation is that I don't want to pass
    `-fmodules-embed-all-files` all the time, since we can't test it in lit
    tests (we're using `clang_cc1`). So I tried to set it in FrontendActions
    for modules.
    ChuanqiXu9 authored Aug 29, 2024
    2eeeff8
  27. [Driver] Add -mbranch-protection to ARM and AArch64 multilib flags (l…

    …lvm#106391)
    
    This adds the `-mbranch-protection` command line option to the set of
    flags used by the multilib selection for ARM and AArch64 targets.
    pratlucas authored Aug 29, 2024
    b822b69
  28. [mlir] Apply ClangTidyPerformance finding (NFC).

    Use const reference for loop variable.
    akuegel committed Aug 29, 2024
    b7981a7
  29. [LLD][COFF] Add support for range extension thunks for ARM64EC target…

    …s. (llvm#106289)
    
    Thunks themselves are the same as regular ARM64 thunks; they just need
    to report the correct machine type. When processing the code, we also
    need to use the current chunk's machine type instead of the global one:
    we don't want to treat x86_64 thunks as ARM64EC, and we need to report
    the correct machine type in hybrid binaries.
    cjacek authored Aug 29, 2024
    efad561
  30. [llvm][Docs] Update TestSuiteGuide.md (llvm#79613)

    Update svn to git & virtualenv to venv
    VisdaVokhshoori authored Aug 29, 2024
    f9ee9f5
  31. [lldb][lldb-dap][test] Skip logpoint test on Windows again

    This one snuck into the previous patch. The test program needs
    updating if it's ever going to work on Windows.
    DavidSpickett committed Aug 29, 2024
    ae34257
  32. [AMDGPU] Graph-based Module Splitting Rewrite (llvm#104763)

    Major rewrite of the AMDGPUSplitModule pass in order to better support
    it long-term.
    
    Highlights:
    - Removal of the "SML" logging system in favor of just using CL options
    and LLVM_DEBUG, like any other pass in LLVM.
    - The SML system started from good intentions, but it was too flawed and
    messy to be of any real use. It was also a real pain to use and made the
    code more annoying to maintain.
     - Graph-based module representation with DOTGraph printing support
    - The graph represents the module accurately, with bidirectional, typed
    edges between nodes (a node usually represents one function).
    - Nodes are assigned IDs starting from 0, which allows us to represent a
    set of nodes as a BitVector. This makes comparing 2 sets of nodes to
    find common dependencies a trivial task. Merging two clusters of nodes
    together is also really trivial.
     - No more defaulting to "P0" for external calls
    - Roots that can reach non-copyable dependencies (such as external
    calls) are now grouped together in a single "cluster" that can go into
    any partition.
     - No more defaulting to "P0" for indirect calls
    - New representation for module splitting proposals that can be graded
    and compared.
    - Graph-search algorithm that can explore multiple branches/assignments
    for a cluster of functions, up to a maximum depth.
    - With the default max depth of 8, we can create up to 256 propositions
    to try and find the best one.
    - We can still fall back to a greedy approach upon reaching max depth.
    That greedy approach uses almost identical heuristics to the previous
    version of the pass.
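
The node-set idea from the highlights above can be sketched with plain integers used as bit vectors. This is an illustration, not the pass's data structure; the node IDs and helper names are made up, and reading "256 propositions" as two choices per level over 8 levels is an assumption.

```python
from typing import Iterable, List


def to_bits(node_ids: Iterable[int]) -> int:
    """Encode a set of node IDs (starting from 0) as a bit vector."""
    bits = 0
    for n in node_ids:
        bits |= 1 << n
    return bits


def to_ids(bits: int) -> List[int]:
    """Decode a bit vector back into a sorted list of node IDs."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


deps_a = to_bits([0, 2, 5])
deps_b = to_bits([2, 3, 5])

common = deps_a & deps_b  # common dependencies: one AND
merged = deps_a | deps_b  # merging two clusters: one OR

max_depth = 8
proposals = 2 ** max_depth  # consistent with the 256 quoted above
```

Both set operations are single bitwise instructions per machine word, which is why comparing and merging clusters is cheap in this representation.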
    
    All of this gives us a lot of room to experiment with new heuristics or
    even entirely different splitting strategies if we need to. For
    instance, the graph representation has room for abstract nodes, e.g. if
    we need to represent some global variables or external constraints. We
    could also introduce more edge types to model other types of relations
    between nodes, etc.
    
    I also designed the graph representation & the splitting strategies to
    be as fast as possible, and it seems to have paid off. Some quick tests
    showed that we spend pretty much all of our time in the CloneModule
    function, with the actual splitting logic being <1% of the runtime.
    Pierre-vh authored Aug 29, 2024
    c9b6e01
  33. [mlir][ArmSME] Merge consecutive arm_sme.intr.zero ops (llvm#106215)

    This merges consecutive SME zero intrinsics within a basic block, which
    avoids the backend eventually emitting multiple zero instructions when
    it could just use one.
    
    Note: This kind of peephole optimization could be implemented in the
    backend too.
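
A rough model of the peephole, assuming each zero op carries a tile bitmask and that two adjacent masks can be combined with a bitwise OR. The op list is a hypothetical stand-in for a basic block, not MLIR's API.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, Optional[int]]  # (op name, tile mask or None)


def merge_consecutive_zeros(ops: List[Op]) -> List[Op]:
    """Fold runs of adjacent "zero" ops into one op with the OR of their masks."""
    out: List[Op] = []
    for name, operand in ops:
        if name == "zero" and out and out[-1][0] == "zero":
            out[-1] = ("zero", out[-1][1] | operand)
        else:
            out.append((name, operand))
    return out


block = [("zero", 0b0001), ("zero", 0b0100), ("other", None), ("zero", 0b1000)]
merged_block = merge_consecutive_zeros(block)
```

Only directly adjacent zero ops are merged here; the intervening op keeps the last one separate.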
    MacDue authored Aug 29, 2024
    e37d6d2
  34. [AMDGPU][llvm-split] Remove declarations-debug

    Test didn't have a FileCheck line and is obsolete after llvm#104763
    Pierre-vh committed Aug 29, 2024
    31684c6
  35. b9f4afa
  36. [AMDGPU][llvm-split] Make declarations test more stable

    Delete the previous files if present, to ensure it won't fail if the output directory of the tests wasn't cleared.
    Pierre-vh committed Aug 29, 2024
    575be3e
  37. AMDGPU/NewPM Port GCNDPPCombine to NPM (llvm#105816)

    Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
    optimisan and optimisan authored Aug 29, 2024
    fdca2c3
  38. [Flang][OpenMP] Don't expect block arguments using early privatization (

    llvm#105842)
    
    There are some spots where all symbols to privatize collected by a
    `DataSharingProcessor` instance are expected to have corresponding entry
    block arguments associated regardless of whether delayed privatization
    was enabled.
    
    This can result in compiler crashes if a `DataSharingProcessor` instance
    created with `useDelayedPrivatization=false` is queried in this way. The
    solution proposed by this patch is to provide another public method to
    query specifically delayed privatization symbols, which will either be
    empty or point to the complete set of symbols to privatize accordingly.
    skatrak authored Aug 29, 2024
    60e9fb9
  39. c28b84e
  40. [compiler-rt][RISCV][NFC] Update code_model with latest spec (llvm#10…

    …6498)
    
    The spec can be found here:
    riscv-non-isa/riscv-c-api-doc#74
    
    This patch updates the following symbols:
    
    ```
    mVendorID -> mvendorid
    mArchID -> marchid
    mImplID -> mimpid
    ```
    BeMg authored Aug 29, 2024
    2505546
  41. PPC: Custom lower ppcf128 is_fpclass if is_fpclass is custom (llvm#10…

    …5540)
    
    Unfortunately expandIS_FPCLASS is called directly in SelectionDAGBuilder
    depending on whether IS_FPCLASS is custom or not. This helps avoid ppc test
    regressions in a future patch where the custom lowering would be bypassed.
    arsenm authored Aug 29, 2024
    911b960
  42. DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (llvm#…

    …105577)
    
    For some reason, isOperationLegalOrCustom is not the same as
    isOperationLegal || isOperationCustom. Unfortunately, it checks
    if the type is legal, which makes it useless for custom lowering
    on non-legal types (which is always ppcf128).
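
The difference between the two predicates can be modelled in a few lines. This is a simplified sketch based only on the description above (the combined query also requires a legal type); the function names mirror the C++ ones but are hypothetical Python stand-ins.

```python
from enum import Enum


class Action(Enum):
    LEGAL = "legal"
    CUSTOM = "custom"
    EXPAND = "expand"


def is_operation_legal(type_is_legal: bool, action: Action) -> bool:
    return type_is_legal and action is Action.LEGAL


def is_operation_custom(type_is_legal: bool, action: Action) -> bool:
    # Does not look at type legality in this model.
    return action is Action.CUSTOM


def is_operation_legal_or_custom(type_is_legal: bool, action: Action) -> bool:
    # Requires a legal type in addition to a Legal or Custom action.
    return type_is_legal and action in (Action.LEGAL, Action.CUSTOM)


# A custom-lowered operation on a type that is not legal (the ppcf128 case).
type_is_legal, action = False, Action.CUSTOM
combined = is_operation_legal_or_custom(type_is_legal, action)
separate = (is_operation_legal(type_is_legal, action)
            or is_operation_custom(type_is_legal, action))
```

In this model the combined query rejects the ppcf128 case while the explicit custom check accepts it, which is the mismatch the patch works around.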
    
    Really the DAG builder shouldn't be going to expand this in the
    builder, it makes it difficult to work with. It's only here to work
    around the DAG requiring legal integer types the same size as
    the FP type after type legalization.
    arsenm authored Aug 29, 2024
    7b7b0b9
  43. [analyzer] Add missing include <unordered_map> to llvm/lib/Support/Z3…

    …Solver.cpp (llvm#106410)
    
    Resolves llvm#106361. Adding #include <unordered_map> to
    llvm/lib/Support/Z3Solver.cpp fixes compilation errors for homebrew
    build on macOS with Xcode 14.
    https://github.com/Homebrew/homebrew-core/actions/runs/10604291631/job/29390993615?pr=181351
    shows that this is resolved when the include is patched in (Linux CI
    failure is due to unrelated timeout).
    lukeshingles authored Aug 29, 2024
    fcb3a04
  44. [X86, MC] Recognize OSIZE=64b when EVEX.W = 1, EVEX.pp = 01 (llvm#103816

    )
    
    In the legacy space, if both the 66 prefix and REX.W=1 are present, the
    REX.W=1 takes precedence and makes OSIZE=64b. EVEX map 4 inherits this
    convention, with EVEX.pp=01 and EVEX.W playing the roles of the 66
    prefix and REX.W. So if EVEX.pp=00, the OSIZE can only be 64b or 32b,
    depending on whether EVEX.W=1 or not. But if EVEX.pp=01, then OSIZE is
    either 64b or 16b depending on whether EVEX.W=1 or not.
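
The rule above reduces to a small decision function. This sketch covers only the two `pp` values discussed (00 and 01); `operand_size` is a hypothetical helper, not part of the LLVM MC layer.

```python
def operand_size(pp: int, w: int) -> int:
    """Operand size in bits for EVEX map 4, given EVEX.pp (0b00 or 0b01) and EVEX.W."""
    if w == 1:
        # W=1 takes precedence over the 66-prefix equivalent, as with REX.W.
        return 64
    return 16 if pp == 0b01 else 32
```

For example, `operand_size(0b01, 1)` is 64 rather than 16, which is the case the patch teaches the MC layer to recognize.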
    FreddyLeaf authored Aug 29, 2024
    36b7c30
  45. [SLP] Move some of X86 tests to common directory (llvm#106401)

    Some of the tests from X86 directory can be generalized for AArch64 to
    improve its coverage.
    ElvinaYakubova authored Aug 29, 2024
    ddbc8f3
  46. [DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (l…

    …lvm#105524)
    
    Fixes: llvm#104695
    
    This patch adds the is_stmt flag to line table entries for the first
    instruction with a non-0 line location in each basic block, to ensure
    that it will be used for stepping even if the last instruction in the
    previous basic block had the same line number; this is important for
    cases where the new BB is reachable from BBs other than the preceding
    block.
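
A simplified model of the rule, assuming as a baseline that `is_stmt` is otherwise set only when the line number changes from the previous instruction. Real line-table emission is more involved; blocks here are just lists of line numbers, with 0 meaning "no line".

```python
from typing import List


def mark_is_stmt(blocks: List[List[int]]) -> List[List[bool]]:
    """Return is_stmt flags for each instruction in each basic block."""
    flags = []
    prev_line = None
    for block in blocks:
        seen_first = False
        block_flags = []
        for line in block:
            if line == 0:
                block_flags.append(False)
                continue
            # The first non-line-0 instruction in a block always gets is_stmt,
            # even if the previous block ended on the same line.
            force = not seen_first
            block_flags.append(force or line != prev_line)
            seen_first = True
            prev_line = line
        flags.append(block_flags)
    return flags


# The second block starts (after a line-0 instruction) on line 7, the same
# line the first block ended on.
result = mark_is_stmt([[5, 5, 7], [0, 7, 7]])
```

Without the forced flag, the first line-7 instruction of the second block would not be a stepping location, even though the block may be entered from somewhere other than the preceding block.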
    SLTozer authored Aug 29, 2024
    3ef37e2
  47. [MLIR][Flang][OpenMP] Remove omp.parallel from loop wrapper ops (llvm…

    …#105833)
    
    This patch updates the `omp.parallel` operation according to the results
    of the discussion in [this
    RFC](https://discourse.llvm.org/t/rfc-disambiguation-between-loop-and-block-associated-omp-parallelop/79972).
    It is removed from the set of loop wrapper operations, changing the
    expected MLIR representation for composite `distribute parallel do/for`
    into the following:
    
    ```mlir
    omp.parallel {
      ...
      omp.distribute {
        omp.wsloop {
          omp.loop_nest ... { ... }
          omp.terminator
        }
        omp.terminator
      }
      ...
      omp.terminator
    }
    ```
    
    MLIR verifiers for operations impacted by this representation change are
    updated, as well as related tests. The `LoopWrapperInterface` is also
    updated, since it's no longer representing an optional "role" of an
    operation but a mandatory set of restrictions instead.
    skatrak authored Aug 29, 2024
    2784060
  48. [Flang][OpenMP] Move loop privatization out of dispatch (llvm#106066)

    This patch moves the creation of `DataSharingProcessor` instances for
    loop constructs out of `genOMPDispatch()` and into their corresponding
    codegen functions. This is a necessary first step to enable a proper
    handling of privatization on composite constructs.
    
    Some tests are updated due to a change of order between clause
    processing and privatization.
    skatrak authored Aug 29, 2024
    0f206b1
  49. [AArch64] optimise SVE cvt intrinsics with no active lanes (llvm#104809)

    This patch extends llvm#73964 and
    optimises SVE cvt intrinsics away when predicate is zero.
    Lukacma authored Aug 29, 2024
    113806d
  50. [Flang][OpenMP] DISTRIBUTE PARALLEL DO lowering (llvm#106207)

    This patch adds PFT to MLIR lowering support for `distribute parallel
    do` composite constructs.
    skatrak authored Aug 29, 2024
    9c8ce5f
  51. [Flang][OpenMP] DISTRIBUTE PARALLEL DO SIMD lowering (llvm#106211)

    This patch adds PFT to MLIR lowering support for `distribute parallel do
    simd` composite constructs.
    skatrak authored Aug 29, 2024
    57726c4
  52. [SLP]Fix a crash when requesting the cost for buildvector cmp nodes ty…

    …pes.
    
    Need to use original cmp type i1 when estimating the cost for the
    buildvector node, not its operand types to prevent compiler crash upon
    TTI cost estimation.
    alexey-bataev committed Aug 29, 2024
    fdf72c9
  53. c3cb273
  54. [DebugInfo][NFC] Make is_stmt-at-block-start test X86-specific

    Fixes failure on the llvm-clang-aarch64-darwin buildbot:
    https://lab.llvm.org/buildbot/#/builders/190/builds/4660/
    
    The test mentioned does not rely on any unique property of X86, but does
    rely on the layout of the basic blocks produced by llc, which varies
    between targets. Although the test could be duplicated for other targets,
    it seems unnecessary since the behaviour being tested is not
    target-specific.
    SLTozer committed Aug 29, 2024
    616f7d3
  55. [LV] Use SCEV to analyze second operand for cost query.

    Improve operand analysis using SCEV for cost purposes. This fixes a
    divergence between legacy and VPlan-based cost-modeling after
    533e6bb.
    
    Fixes llvm#106248.
    fhahn committed Aug 29, 2024
    0a272d3
  56. Revert "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instructio…

    …n in BB (llvm#105524)"
    
    Reverted (along with the NFC followup fix) due to buildbot failure:
    https://lab.llvm.org/buildbot/#/builders/160/builds/4142
    
    This reverts commit 3ef37e2, and commit
    616f7d3.
    SLTozer committed Aug 29, 2024
    926f097
  57. [LAA] Add test cases where evaluating AddRecs at symbolic max BTC wraps.

    The underlying issue was discovered by an assert added in
    a800533 by a test case provided by @mstorsjo.
    fhahn committed Aug 29, 2024
    606a934
  58. 50515db
  59. 9167667
  60. [clang][bytecode] Properly diagnose non-const reads (llvm#106514)

    If the global variable is constant (but not constexpr), we need to
    diagnose, but keep evaluating.
    tbaederr authored Aug 29, 2024
    cb608cc
  61. 25c9410
  62. [InstCombine][X86] Only demand used bits for VPERMILPD/VPERMILPS mask…

    … values
    
    VPERMILPS uses bits 0-1 (to index per-lane i32/f32 0-3)
    VPERMILPD uses bit 1 (to index per-lane i64/f64 0-1)
    
    Use SimplifyDemandedBits to ignore anything touching the remaining bits.
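
The demanded bits can be illustrated by extracting the per-lane index from a mask element. This sketch relies on the variable-mask forms of the instructions using bits 1:0 of each control element for VPERMILPS and bit 1 for VPERMILPD; the helper names are hypothetical.

```python
def vpermilps_index(mask_element: int) -> int:
    """Per-lane source index (0-3) selected by a VPERMILPS mask element."""
    return mask_element & 0b11


def vpermilpd_index(mask_element: int) -> int:
    """Per-lane source index (0-1) selected by a VPERMILPD mask element."""
    return (mask_element >> 1) & 1
```

Any two mask values that agree on those bits select the same element, so everything that only affects the remaining bits can be ignored.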
    
    Part of llvm#106413
    RKSimon committed Aug 29, 2024
    d57c046
  63. Restrict LLVM_TARGETS_TO_BUILD in Windows release packaging (llvm#106059

    )
    
    When including all targets, some files become too large for the NSIS
    installer to handle.
    
    Fixes llvm#101994
    zmodem authored Aug 29, 2024
    2a28df6
  64. [lldb][lldb-dap][test] Enable Launch tests

    Add Windows include equivalents for includes and shell command.
    DavidSpickett committed Aug 29, 2024
    b2a820f
  65. 0a48482
  66. [libc][x86] Use prefetch for write for memcpy (llvm#90450)

    Currently when `LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING` is set we
    prefetch memory for read on the source buffer. This patch adds prefetch
    for write on the destination buffer.
    gchatelet authored Aug 29, 2024
    73ef397
  67. [include-cleaner] Mark RecordDecls referenced in UsingDecls as explic…

    …it (llvm#106430)
    
    We were reporting ambiguous references from using declarations, as the
    user can be depending on different overloads of a function just because
    they are visible in the TU.
    This doesn't apply to records or primary templates, as the declaration
    being referenced in such cases is unambiguous; the ambiguity applies to
    specializations though.
    
    Hence this patch returns an explicit reference to record decls and
    primary templates of those.
    kadircet authored Aug 29, 2024
    acff429
  68. [SPARC][IAS] Add illtrap alias for unimp (llvm#105928)

    This follows Solaris behavior of allowing both mnemonics all the time.
    
    Fixes llvm#105639.
    koachan authored Aug 29, 2024
    7955760
  69. ba52a09
  70. [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (llvm#105671)

    Fix llvm#105571 which demonstrates an end() iterator dereference when
    performing a non-empty splice to end() from a region that ends at
    Src::end().
    
    Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
    (which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
    doesn't clean up the source marker, which in this case we know is a trailing
    marker, so we have to do that manually.
    OCHyams authored Aug 29, 2024
    43661a1
  71. [NFC][AMDGPU] Autogenerate tests for uniform i32 promo in ISel (llvm#…

    …106382)
    
    Many tests were easy to update, but these are quite big and I think it's
    better to autogenerate them to see the difference well.
    Pierre-vh authored Aug 29, 2024
    1f8f2ed
  72. [clang][bytecode] Diagnose member calls on deleted blocks (llvm#106529)

    This requires a bit of restructuring of ctor calls when checking for a
    potential constant expression.
    tbaederr authored Aug 29, 2024
    df11ee2
  73. [LoopVectorize][X86] amdlibm-calls.ll - cleanup test checks for 2/4/8…

    …/16 vector widths
    
    This cleans up the existing tests and shows the gaps in the test checks (for instance we're often testing VF4 + VF16 but not VF8 even though amdlibm supports it).
    RKSimon committed Aug 29, 2024
    c57abc6
  74. [LoopVectorize][X86] amdlibm-calls.ll - add additional 2/4/8/16 vecto…

    …r widths test checks
    
    This should cover most amdlibm functions, but it still does not add every VF combo (e.g. 2f32/16f64 often vectorises to the llvm intrinsic for that vector type)
    RKSimon committed Aug 29, 2024
    2f95298
  75. [lldb][lldb-dap] Enable more tests on Windows

    These few worked without changes.
    DavidSpickett committed Aug 29, 2024
    f7d6dfa
  76. [Analysis] Guard logf128 cst folding (llvm#106543)

    LLVM has a CMake variable to control whether to consider logf128
    constant folding which libAnalysis ignores. This patch changes the
    logf128 check to rely on the global LLVM_HAS_LOGF128 setting made in
    config-ix.cmake.
    RoboTux authored Aug 29, 2024
    56152fa
  77. Reapply "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instructi…

    …on in BB (llvm#105524)"
    
    Fixes the previous buildbot error by adding an explicit triple to the test,
    ensuring that llc can produce a valid object file.
    
    This reverts commit 926f097.
    SLTozer committed Aug 29, 2024
    5fef40c
  78. Revert "[flang] Warn when F128 is unsupported" (llvm#106561)

    Reverts llvm#102147
    
    It seems some systems which should support F128 are wrongly detected as
    not supporting it.
    
    This might be due to checking `LDBL_MANT_DIG` instead of
    `__LDBL_MANT_DIG__`. I will investigate.
    tblah authored Aug 29, 2024
    8ae877a
  79. 9edd998

Commits on Sep 24, 2024

  1. e9c77eb