[AutoBump] Merge with 9edd998e (Aug 29) (14) #367
base: bump_to_d4f97da1
Commits on Aug 27, 2024
-
[lldb] Turn lldb_private::Status into a value type. (llvm#106163)
This patch removes all of the Set.* methods from Status. This cleanup is part of a series of patches that make it harder to use the anti-pattern of keeping a long-lived Status object around and updating it while dropping any errors it contains on the floor.
This patch is largely NFC. The more interesting next steps it enables are to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step (3) will dramatically clean up the error handling code in various places. In the end my goal is to convert all APIs that are of the form ` ResultTy DoFoo(Status& error) ` to ` llvm::Expected<ResultTy> DoFoo() `.
How to read this patch? The interesting changes are in Status.h and Status.cpp; all other changes are mostly ` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git grep -l SetErrorString lldb/source) ` plus the occasional manual cleanup.
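To illustrate the target API shape named above, here is a minimal sketch of a function that returns either a value or an error instead of filling in an out-parameter. The `Expected` template below is a hypothetical stand-in written for this note, not `llvm::Expected`, and `ParsePort` is an invented example function.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for llvm::Expected<T>: holds either a value or an
// error message. It only mimics the general shape of the real class.
template <typename T>
class Expected {
  struct Error { std::string message; };
  std::variant<T, Error> storage_;
  explicit Expected(Error e) : storage_(std::move(e)) {}

public:
  Expected(T value) : storage_(std::move(value)) {}
  static Expected failure(std::string message) {
    return Expected(Error{std::move(message)});
  }
  // True when a value (not an error) is stored.
  explicit operator bool() const { return std::holds_alternative<T>(storage_); }
  const T &value() const { return std::get<T>(storage_); }
  const std::string &error() const { return std::get<Error>(storage_).message; }
};

// New-style API: the result and the error travel together in the return
// value, so there is no long-lived status object for a caller to overwrite.
Expected<int> ParsePort(const std::string &text) {
  if (text.empty())
    return Expected<int>::failure("empty port");
  int port = 0;
  for (char c : text) {
    if (c < '0' || c > '9')
      return Expected<int>::failure("not a number: " + text);
    port = port * 10 + (c - '0');
    if (port > 65535)
      return Expected<int>::failure("port out of range: " + text);
  }
  return port;
}
```

A caller checks the returned object before reading the value, which makes it hard to drop an error silently.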
Commit 0642cd7
-
Commit d1d8edf
-
Commit c349ded
-
[llvm-exegesis] Switch from intptr_t to uintptr_t in most cases (llvm#102860)
This patch switches most of the uses of intptr_t to uintptr_t within llvm-exegesis for the subprocess memory support. In the vast majority of cases we do not want a signed component of the address, hence making intptr_t undesirable. intptr_t is left for error handling, for example when making syscalls and we need to see if the syscall returned -1.
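A small sketch of the distinction drawn above, with unsigned values for addresses and a signed value only where -1 signals failure. The helper names are hypothetical and are not taken from llvm-exegesis.

```cpp
#include <cstdint>

// Addresses are unsigned: rounding down to a page boundary is plain masking.
// PageSize is assumed to be a power of two.
constexpr std::uintptr_t PageBase(std::uintptr_t Address, std::uintptr_t PageSize) {
  return Address & ~(PageSize - 1);
}

// A signed type remains useful for syscall-style results, where -1 means failure.
constexpr bool IsFailure(std::intptr_t Result) { return Result == -1; }

// True when the top bit of the address is set. Stored in an intptr_t, such an
// address would read as a negative number.
constexpr bool InUpperHalf(std::uintptr_t Address) {
  return Address > (~std::uintptr_t{0} >> 1);
}
```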
Commit fac87b8
-
[libc++] Add missing include to three_way_comp_ref_type.h
We were using a `_LIBCPP_ASSERT_FOO` macro without including `<__assert>`. rdar://134425695
Commit 0df7812
-
[AIX][PGO] Handle atexit functions when dlclose'ing shared libraries (llvm#102940)
Problem: On AIX, functions registered by atexit in a shared library are not run when the library is dlclose'd, but instead run (and fail because the function pointer is no longer valid) during main program exit. The profile-rt registers some functions with atexit:
1. writeFileWithoutReturn that writes out the profile file
2. llvm_delete_reset_function_list that does some cleanup in the gcov instrumentation library (not sure)
And so right now, we get an "Illegal instruction (core dumped)" when an instrumented shared object is dlopen'ed and dlclose'd.
Solution: When a shared library is dlclose'd, destructors from the library are called. So create a destructor function that iterates over all known functions that profile-rt registers with atexit, unregisters the ones that have been registered, and executes them.
Scenarios tested:
(0) gcov dlopen/dlclose (AIX/gcov-dlopen-dlclose.test)
(1) multiple dlopen/dlclose of the same lib and multiple libs (instrprof-dlopen-dlclose.test)
(2) dlopen but no dlclose (exists: Posix/instrprof-dlopen.test)
(3) a simple fork testcase with dlopen/dlclose (instrprof-dlopen-dlclose.test)
(4) dlopen/dlclose by multiple threads. (instrprof-dlopen-dlclose.test)
(5) regular dynamic-linking of instrumented shared libs (exists: AIX/shared-bexpall-pgo.c)
(6) a simple fork testcase produces correct profile (instrprof-fork.c)
---------
Co-authored-by: Hubert Tong <hstong@ca.ibm.com>
Commit 2abed78
-
[BOLT] Handle internal calls in ValidateInternalCalls (llvm#105736)
Move handling of all internal calls into the designated pass. Preserve NOPs and mark functions as non-simple on non-X86 platforms.
Commit abd69b3
-
[SandboxIR] Implement VAArgInst (llvm#106247)
This patch implements sandboxir::VAArgInst mirroring llvm::VAArgInst.
Commit ff81f9f
-
Commit 155e3aa
-
[InstCombine] Simplify `(add/sub (sub/add) (sub/add))` irrelevant of use-count
Added folds:
- `(add (sub X, Y), (sub Z, X))` -> `(sub Z, Y)`
- `(sub (add X, Y), (add X, Z))` -> `(sub Y, Z)`
The fold is typically handled in the `Reassociate` pass, but it fails if the inner `sub`/`add` are multi-use. Less importantly, Reassociate doesn't propagate flags correctly. This patch adds the fold explicitly to InstCombine.
Proofs: https://alive2.llvm.org/ce/z/p6JyRP
Closes llvm#105866
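The two folds above are plain algebraic identities in wrapping arithmetic. The sketch below checks them with unsigned 32-bit integers, which wrap the same way as flag-free `add`/`sub`. The function names are made up for illustration, and the sketch does not model poison-generating flags such as nsw/nuw.

```cpp
#include <cstdint>

// (add (sub X, Y), (sub Z, X)) as written before the fold.
constexpr std::uint32_t AddOfSubs(std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
  return (X - Y) + (Z - X);
}

// Folded form (sub Z, Y): X cancels out.
constexpr std::uint32_t AddOfSubsFolded(std::uint32_t Y, std::uint32_t Z) {
  return Z - Y;
}

// (sub (add X, Y), (add X, Z)) as written before the fold.
constexpr std::uint32_t SubOfAdds(std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
  return (X + Y) - (X + Z);
}

// Folded form (sub Y, Z): X cancels out again.
constexpr std::uint32_t SubOfAddsFolded(std::uint32_t Y, std::uint32_t Z) {
  return Y - Z;
}

// The identities hold even when intermediate results wrap around.
static_assert(AddOfSubs(5u, 7u, 3u) == AddOfSubsFolded(7u, 3u), "add-of-subs fold");
static_assert(SubOfAdds(0xFFFFFFFFu, 2u, 9u) == SubOfAddsFolded(2u, 9u), "sub-of-adds fold");
```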
Commit a6edcea
-
[TypeProf][ICP]Allow vtable-comparison as long as vtable count is comparable with function count for each candidate (llvm#106260)
The current cost-benefit analysis between vtable comparison and function comparison requires the indirect fallback branch to be cold. This is too conservative. This change allows vtable-comparison as long as vtable count is comparable with function count for each function candidate and removes the cold indirect fallback requirement.
Tested:
1. Testing this on benchmarks uplifts the measurable performance wins. Counting the (possibly-duplicated) remarks (because of linkonce_odr functions, cross-module import of functions) shows the number of vtable remarks increases from roughly 30k to roughly 50k.
2. https://gcc.godbolt.org/z/sbGK7Pacn shows vtable-comparison doesn't happen today (using the same IR input)
Commit 511500e
-
[SampleFDO][NFC] Refactoring sample reader to support on-demand read profiles for given functions (llvm#104654)
Currently, in the extended binary format, the sample reader only reads the profiles of functions that are in the current module at initialization time. This extends the support to read arbitrary profiles for given input functions at a later stage. It's used for llvm#101053.
Commit 23144e8
-
Commit b2dd840
-
[MachO] Give the CPUSubTypeARM64 enum uint32_t type. NFCI.
We recently added various CPU_SUBTYPE_ARM64E values, notably including CPU_SUBTYPE_ARM64E_VERSIONED_PTRAUTH_ABI_MASK, which is 0x80000000U. The enum is better off as a uint32_t to accommodate that. This also hopefully helps silence GCC warnings reported on a ternary in CPU_SUBTYPE_ARM64E_WITH_PTRAUTH_VERSION. The subtype is already generally treated as a uint32_t elsewhere, so while there, change the new helpers to explicitly pass/return the subtype as uint32_t, and the individual narrower components as either bool or unsigned.
Commit 1d7bb2b
-
[X86] Check if there is stack access in the spilled FP/BP range (llvm#106035)
In the clobbered FP/BP range, we can't use it as a normal FP/BP to access the stack. So if there are stack accesses due to register spill, scheduling or other back end optimization, we should report an error instead of silently generating wrong code. Also try to minimize the save/restore range of the clobbered FP/BP if the FrameSetup doesn't change stack size.
Commit edbd9d1
-
[SLP] Support vectorizing 2^N-1 reductions (llvm#106266)
Build on the -slp-vectorize-non-power-of-2 experimental option, and support vectorizing reductions with 2^N-1 sized vectors. Specifically, two related changes:
1) When searching for a profitable VL, start with the 2^N-1 reduction width. If the cost model does not select that VL, return to power-of-two boundaries when halving the search VL. The latter is mostly for simplicity.
2) Reduce the minimum reduction width from 4 to 3 when supporting non-power-of-two vectors. This is required to support <3 x Ty> cases.
One thing which isn't directly related to this change, but which I want to note for clarity, is that the non-power-of-two vectorization appears to be sensitive to the operand order of the reduction. I haven't yet fully figured out why, but I suspect this is non-power-of-two specific.
Commit ed03070
-
Commit 5e64520
-
Commit 6a74b0e
-
Commit b24ffa6
-
This patch fixes:
llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp:845:12: error: variable 'RemainingVTableCount' set but not used [-Werror,-Wunused-but-set-variable]
llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp:306:23: error: private field 'PSI' is not used [-Werror,-Wunused-private-field]
Here are a couple of domino effects:
- Once I remove PSI, I need to update the constructor and its caller.
- Once I remove RemainingVTableCount, I don't need TotalCount, so I am updating the caller as well.
Commit 2bdc0da
-
[LTO] Introduce a helper lambda in gatherImportedSummariesForModule (NFC) (llvm#106251)
This patch forward ports the heterogeneous std::map::operator[]() from C++26 so that we can look up the map without allocating an instance of std::string when the key-value pair exists in the map.
The background is as follows. I'm planning to reduce the memory footprint of ThinLTO indexing by changing ImportMapTy, the data structure used for an import list. The new list will be a hash set of tuples (SourceModule, GUID, ImportType) represented in a space efficient manner. That means that as we iterate over the hash set, we encounter SourceModule as many times as GUID. We don't want to create a temporary instance of std::string every time we look up ModuleToSummariesForIndex like:
auto &SummariesForIndex = ModuleToSummariesForIndex[std::string(ILI.first)];
This patch removes the need to create the temporaries by enabling the heterogeneous lookup with std::map<K, V, std::less<>> and forward porting std::map::operator[]() from C++26.
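A minimal sketch of the lookup technique described above: with the transparent comparator `std::less<>`, `find` accepts a `std::string_view` directly, so a `std::string` is built only when the key is missing. `SummaryMap` and `LookupOrCreate` are hypothetical names for this sketch and are not the helper lambda from the patch.

```cpp
#include <map>
#include <string>
#include <string_view>
#include <vector>

// std::less<> is a transparent comparator, which enables heterogeneous find().
using SummaryMap = std::map<std::string, std::vector<int>, std::less<>>;

// Behaves like operator[] but takes a string_view key. The std::string
// temporary is constructed only on the insertion path.
std::vector<int> &LookupOrCreate(SummaryMap &Map, std::string_view Key) {
  auto It = Map.find(Key);  // compares string_view against stored keys, no allocation
  if (It == Map.end())
    It = Map.try_emplace(std::string(Key)).first;
  return It->second;
}
```

Repeated lookups of the same module name then cost one comparison chain each instead of one string allocation each.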
Commit 29bb523
-
[DataLayout] Change return type of `getStackAlignment` to `MaybeAlign` (llvm#105478)
Currently, `getStackAlignment` asserts if the stack alignment wasn't specified. This makes it inconvenient to use and complicates testing. This change also makes the `exceedsNaturalStackAlignment` method redundant.
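The idea of returning an optional alignment instead of asserting can be sketched as follows. `LayoutSketch` is a hypothetical stand-in that uses `std::optional`, not the real `llvm::DataLayout` or `llvm::MaybeAlign`, and `ExceedsStackAlignment` only illustrates the kind of caller-side check that becomes possible.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a layout description.
struct LayoutSketch {
  std::optional<std::uint64_t> StackAlignment;  // empty when unspecified

  // Returns an empty optional instead of asserting when nothing was specified.
  std::optional<std::uint64_t> getStackAlignment() const { return StackAlignment; }
};

// With an optional return value, callers can express this check themselves.
bool ExceedsStackAlignment(const LayoutSketch &DL, std::uint64_t Requested) {
  const auto Align = DL.getStackAlignment();
  return Align.has_value() && Requested > *Align;
}
```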
Commit 4d7a0ab
-
[AMDGPU] adjust tests to prevent fpclass bitcast folding (llvm#106268)
Make some minor tweaks to AMDGPU tests to ensure they still work as intended after llvm#97762. These tests can be radically simplified after bitcast aware fpclass deduction.
Commit 4c4908c
-
[SLP] Remove -slp-optimize-identity-hor-reduction-ops option (llvm#106238)
This code has been unchanged for two years; let's simplify the code and remove configurability which makes the code harder to follow.
Commit ee764a2
-
[libc++] Disallow character types being index types of `extents` (llvm#105832)
llvm#78086 provided the trait we want to use for this: `__libcpp_integer`. In some `libcxx/containers/views/mdspan` tests, improper uses of `char` are replaced with `signed char`. Fixes llvm#73715
Commit 74e70ba
-
[bazel][mlir] Add ConvertToSPIRV dep to mlir-vulkan-runner (llvm#106285)
New dep needed for 2bf2468
Commit 2a3d735
-
Commit bcb6e27
-
Commit fc51797
-
[libc++] Deprecate and remove std::uncaught_exception (llvm#101830)
Works towards P0619R4/llvm#99985.
- std::uncaught_exception was not previously deprecated. This patch deprecates it since C++17 as per N4259. std::uncaught_exceptions is used instead as libc++ unconditionally provides this function.
- _LIBCPP_ENABLE_CXX20_REMOVED_UNCAUGHT_EXCEPTION restores std::uncaught_exception.
- As a drive-by, this patch updates the C++20 status page to explain that D.11 is already done, since it was done in 578d09c.
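For code that still calls `std::uncaught_exception`, the usual replacement is to compare `std::uncaught_exceptions()` counts. The sketch below shows that pattern with a small scope guard. `ExitTracker` and `ExitedByException` are illustrative names, not part of libc++.

```cpp
#include <exception>

// Records whether the enclosing scope is being left because of an exception.
class ExitTracker {
public:
  explicit ExitTracker(bool &Unwinding)
      : Unwinding_(Unwinding), CountAtEntry_(std::uncaught_exceptions()) {}

  // More in-flight exceptions than at construction means this destructor
  // runs as part of stack unwinding.
  ~ExitTracker() { Unwinding_ = std::uncaught_exceptions() > CountAtEntry_; }

private:
  bool &Unwinding_;
  int CountAtEntry_;
};

bool ExitedByException(bool ShouldThrow) {
  bool Unwinding = false;
  try {
    ExitTracker Tracker(Unwinding);
    if (ShouldThrow)
      throw 1;
  } catch (int) {
    // Swallow the test exception; the tracker has already recorded the result.
  }
  return Unwinding;
}
```

Comparing counts, rather than testing a single boolean, keeps the guard correct when it is itself constructed during unwinding.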
Commit 4ea2c73
-
[Headers][X86] Add a test for MMX/SSE intrinsics (llvm#105852)
Certain intrinsics map to builtins that require an immediate (literal) argument; make sure we report non-literal arguments. This has been kicking around downstream for a while, and the recent removal of the MMX builtins caused me to notice it again.
Commit b6b6482
-
[Clang] Support initializing structured bindings from an array with direct-list-initialization (llvm#102581)
When initializing structured bindings from an array with direct-list-initialization, array copy will be performed, which is a special case not following list-initialization. This PR adds support for this case. Fixes llvm#31813.
Commit 377257f
-
[SandboxIR][NFC] Create a DEF_CONST() macro in SandboxIRValues.def (llvm#106269)
This helps with Constant::classof().
Commit 751e681
-
Revert "[LLDB][SBSaveCore] Add selectable memory regions to SBSaveCor… (llvm#106293)
Reverts llvm#105442. Reverted due to `TestSkinnyCoreFailing`; root-causing the failure will likely take longer than EOD.
Commit b959532
-
[libc++] Move some macOS CI jobs to Github actions (llvm#89083)
This patch decouples macOS CI testing from BuildKite, which makes the maintenance of macOS CI easier and more accessible to all contributors. Right now, the macOS CI is running entirely on machines owned by the LLVM Foundation with only a small set of contributors having direct access to them. In particular, updating these machines is currently a very time-consuming manual process that requires taking the machines offline, and using Github-provided instances makes that an order of magnitude easier. The story for performing back-deployment testing still needs to be figured out, so for now we are retaining some jobs under BuildKite.
Commit e19c3a7
-
[MachineOutliner][NFC] Refactor (llvm#105398)
This patch prepares the NFC groundwork for global outlining using CGData, which will follow llvm#90074.
- The `MinRepeats` parameter is now explicitly passed to the `getOutliningCandidateInfo` function, rather than relying on a default value of 2. For local outlining, the minimum number of repetitions is typically 2, but for the global outlining (mentioned above), we will optimistically create a single `Candidate` for each `OutlinedFunction` if stable hashes match a specific code sequence. This parameter is adjusted accordingly in global outlining scenarios.
- I have also implemented `unique_ptr` for `OutlinedFunction` to ensure safe and efficient memory management within `FunctionList`, avoiding unnecessary implicit copies.
This depends on llvm#101461. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
Commit 93b8d07
-
Revert "[lldb] Add transitional backwards-compatible API to Status"
This reverts commit d1d8edf.
Commit ff2baf0
-
[libc++] Do not redeclare lgamma_r when targeting the LLVM C library (llvm#102036)
We use lgamma_r for the random normal distribution support. In this code we redeclare it, which causes issues with the LLVM C library as this function is marked noexcept in LLVM libc. This patch ensures that we don't redeclare that function when targeting LLVM libc.
Commit 5f2389d
-
Commit 67eb727
-
[lldb] Don't scan more than 10MB of assembly insns (llvm#105890)
For supported architectures, lldb will do a static scan of the assembly instructions of a function to detect stack/frame pointer changes, register stores and loads, so we can retrieve register values for the caller stack frames. We trust that the function address range reflects the actual function range, but in a stripped binary or other unusual environment, we can end up scanning all of the text as a single "function" which is (1) incorrect and useless, but more importantly (2) slow. Cap the max size we will profile to 10MB of instructions. There will surely be functions longer than this with no unwind info, and we will miss the final epilogue or mid-function epilogues past the first 10MB, but I think this will be unusual, and the failure mode of missing the epilogue is that the user will need to step out an extra time or two as the StackID is not correctly calculated mid-epilogue. I think this is a good tradeoff of behaviors. rdar://134391577
Commit 3280292
-
Commit f2f78b2
-
[StableHash] Implement stable global name for the hash computation (llvm#106156)
LLVM often extends global names by adding suffixes to distinguish unique identities. However, these suffixes are not always stable across different runs and build environments. To address this issue, I implemented `get_stable_name` to ignore such suffixes and obtain the original name. This approach is not new, as PGO or Bolt already handle this issue similarly. Using the stable name obtained from `get_stable_name`, I implemented `stable_hash_name` while utilizing the same underlying `xxh3_64bit` algorithm as before.
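The approach of hashing a name only after dropping unstable suffixes can be sketched as below. This is not the patch's `get_stable_name`: the suffix markers `.llvm.` and `.__uniq.` are assumptions chosen for illustration, and FNV-1a is swapped in for xxh3 so that the sketch stays self-contained.

```cpp
#include <cstdint>
#include <string_view>

// Drops everything from the last occurrence of each assumed suffix marker.
std::string_view StableNameSketch(std::string_view Name) {
  constexpr std::string_view Markers[] = {".llvm.", ".__uniq."};
  for (std::string_view Marker : Markers) {
    const auto Pos = Name.rfind(Marker);
    if (Pos != std::string_view::npos)
      Name = Name.substr(0, Pos);
  }
  return Name;
}

// Hashes the stable part of the name with 64-bit FNV-1a (a stand-in for xxh3).
std::uint64_t StableHashSketch(std::string_view Name) {
  std::uint64_t Hash = 14695981039346656037ULL;  // FNV-1a offset basis
  for (unsigned char C : StableNameSketch(Name)) {
    Hash ^= C;
    Hash *= 1099511628211ULL;  // FNV-1a prime
  }
  return Hash;
}
```

Two names that differ only in such a suffix then hash to the same value across builds.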
Commit f9ad249
-
Commit c2cac7e
-
Commit d48b0f8
-
Commit 47667ee
-
[compiler-rt] Fix definition of `usize` on 32-bit Windows
32-bit Windows uses `unsigned int` for uintptr_t and size_t. Commit 18e06e3 changed uptr to unsigned long, so it no longer matches the real size_t/uintptr_t and therefore the current definition of usize results in: `error C2821: first formal parameter to 'operator new' must be 'size_t'`
However, the real problem is that uptr is wrong to work around the fact that we have local SIZE_T and SSIZE_T typedefs that trample on the basetsd.h definitions of the same name and therefore need to match exactly. Unlike size_t/ssize_t the uppercase ones always use unsigned long (even on 32-bit). This commit works around the build breakage by keeping the existing definitions of uptr/sptr and just changing usize. A follow-up change will attempt to fix this properly.
Fixes: llvm#101998
Reviewed By: mstorsjo
Pull Request: llvm#106151
Commit bb27dd8
-
[ctx_prof] Move the "from json" logic more centrally to reuse it from test. (llvm#106129)
Making the synthesis of a contextual profile file from a JSON descriptor more reusable, for unittest authoring purposes. The functionality round-trips through the binary format - no reason, currently, to support other ways of loading contextual profiles.
Commit 1022323
-
Commit de687ea
-
[mlir][gpu] Add metadata attributes for storing kernel metadata in GPU objects (llvm#95292)
This patch adds the `#gpu.kernel_metadata` and `#gpu.kernel_table` attributes. The `#gpu.kernel_metadata` attribute allows storing metadata related to a compiled kernel, for example, the number of scalar registers used by the kernel. The attribute only has 2 required parameters, the name and function type. It also has 2 optional parameters, the arguments attributes and generic dictionary for storing all other metadata. The `#gpu.kernel_table` stores a table of `#gpu.kernel_metadata`, mapping the name of the kernel to the metadata. Finally, the function `ROCDL::getAMDHSAKernelsELFMetadata` was added to collect ELF metadata from a binary, and to test the class methods in both attributes.
Example:
```mlir
gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, kernels = #gpu.kernel_table<[
    #gpu.kernel_metadata<"kernel0", (i32) -> (), metadata = {sgpr_count = 255}>,
    #gpu.kernel_metadata<"kernel1", (i32, f32) -> (), arg_attrs = [{llvm.read_only}, {}]>
  ]> , bin = "BLOB">]
```
The motivation behind these attributes is to provide useful information for things like tuning.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Commit 016e1eb
-
[ctx_prof] Add support for ICP (llvm#105469)
An overload of `llvm::promoteCallWithIfThenElse` that updates the contextual profile. High-level, this is very simple: after creating the `if... then (direct call) else (indirect call)` structure, we instrument the new callsites and BBs (the instrumentation will help with tracking for other IPO transformations, and, ultimately, to match counter values before flattening to `MD_prof`).
In more detail:
- move the callsite instrumentation of the indirect call to the `else` BB, before the indirect call
- create a new callsite instrumentation for the direct call
- create instrumentation for both the `then` and `else` BBs - we could instrument just one (MST-style) but we're not running the binary with this instrumentation, and at most this would save some space (fewer counters tracked). For simplicity instrumenting both at this point
- update each context belonging to the caller by updating the counters, and moving the indirect callee to the new, direct callsite ID
Issue llvm#89287
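The `if... then (direct call) else (indirect call)` structure described above looks roughly like this at the source level. The names are hypothetical, and the two counters merely stand in for the per-branch instrumentation. The sketch is not the LLVM transformation itself.

```cpp
using Callee = int (*)(int);

int HotTarget(int X) { return X + 1; }
int ColdTarget(int X) { return X * 2; }

// Stand-ins for the counters attached to the `then` and `else` blocks.
struct Counters {
  unsigned Direct = 0;
  unsigned Indirect = 0;
};

// Shape of a promoted indirect call: compare against the likely target,
// call it directly on a match, and fall back to the indirect call otherwise.
int CallPromoted(Callee Fn, int Arg, Counters &C) {
  if (Fn == &HotTarget) {
    ++C.Direct;  // `then` block: direct call
    return HotTarget(Arg);
  }
  ++C.Indirect;  // `else` block: original indirect call
  return Fn(Arg);
}
```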
Commit 73c3b73
-
Commit d22bee1
-
[llvm/llvm-project][Coroutines] Improve debugging and minor refactoring (llvm#104642)
No Functional Changes
* Fix comments in several places
* Instead of using BB->getName() (in dump methods) use getBasicBlockLabel. This fixes the poor output of the dumped info that resulted in missing BB labels.
* Use RPO when dumping SuspendCrossingInfo. Without this the dump order is determined by the ptr addresses and so is not consistent from run to run, making IR diffs difficult to read.
* Inference -> Interference
* Pull the logic that determines insertion location out of insertSpills and into getSpillInsertionPt, to differentiate between these two operations.
* Use Shape getters for CoroId instead of getting it manually.
---------
Co-authored-by: tnowicki <tnowicki.nowicki@amd.com>
Commit 51aceb5
-
[mlir][GPU] Fix docs modified by llvm#94910 (llvm#106295)
Fix docs modified by llvm#94910 by adding information about the `module` argument in `gpu::TargetAttrInterface::createObject`. --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Commit aaed557
-
Commit 91e09c3
-
[lldb][ClangExpressionParser] Remove duplicate construction of ExternalASTSourceWrapper
This is an oversight from llvm#104817 where the intention was to hoist the ExternalASTSourceWrapper construction out of the conditional so it can be set on both the `SemaSourceWithPriorities` and be added as an external source to Sema. But the inner `ExternalASTSourceWrapper` allocation wasn't actually removed. This currently all works fine because all these AST sources are refcounted and point to the same underlying AST sources. But this patch cleans this up regardless.
Commit 0b1c8fd
Commits on Aug 28, 2024
-
Commit 6ae657b
-
Commit 32abe5d
-
[RISCV][MCP] Remove redundant move from tail duplication (llvm#89865)
Tail duplication can generate a redundant move before the return. This happens because MachineCopyPropagation can't recognize a COPY after post-RA pseudo expansion. This patch makes MachineCopyPropagation recognize `%0 = ADDI %1, 0` as a COPY.
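The recognition rule above amounts to treating an add-immediate of zero as a register copy. Below is a minimal sketch with a made-up instruction model, not the LLVM MachineInstr API.

```cpp
#include <optional>

// Toy instruction model, invented for this sketch.
enum class Opcode { COPY, ADDI, ADD };

struct Instr {
  Opcode Op;
  unsigned Dst;
  unsigned Src;
  long Imm;
};

struct CopyPair {
  unsigned Dst;
  unsigned Src;
};

// Returns the (destination, source) pair when the instruction behaves like a
// copy: either a real COPY or `Dst = ADDI Src, 0`.
std::optional<CopyPair> AsCopy(const Instr &I) {
  if (I.Op == Opcode::COPY || (I.Op == Opcode::ADDI && I.Imm == 0))
    return CopyPair{I.Dst, I.Src};
  return std::nullopt;
}
```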
Commit 2def1c4
-
[flang][cuda] Add missing dependency (llvm#106298)
Add a missing dependency that sometimes makes the build fail with ninja.
Commit f215447
-
[flang][cuda] Use declare op results instead of memref (llvm#106287)
llvm#106120 simplified the data transfer, when possible, by using the reference and a shape. This bypasses the declare op. In order to keep the declare op around, use the second result of the declare op, which achieves the same.
Commit ccbee71
-
[compiler-rt][nsan] Fix strsep interceptor (llvm#106307)
Fix strsep interceptor. For strsep description see https://www.man7.org/linux/man-pages/man3/strsep.3.html
Commit 1601879
-
Commit 82db08e
-
Commit c1a4896
-
[orc] Fix asan error in RTDyldObjectLinkingLayer.cpp (llvm#106300)
`JITDylibSearchOrderResolver` local variable can be destroyed before completion of all callbacks. Capture it together with `Deps` in `OnEmitted` callback. Original error: ``` ==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978 READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm) #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58 #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25 #2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5 #3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12 #4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, ```
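The fix described above follows a general rule: a callback that may run after its creating function returns must own the state it uses. The sketch below moves a local into the closure instead of referencing it. `MakeOnEmitted` and its string/vector stand-ins are hypothetical and far simpler than the ORC types involved.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Builds a callback that may be invoked after this function has returned.
std::function<std::size_t()> MakeOnEmitted(std::vector<std::string> Deps) {
  std::string Resolver = "resolver";  // local state that the callback needs

  // Capturing Resolver by reference here would leave a dangling reference
  // once MakeOnEmitted returns (the stack-use-after-return pattern).
  // Moving both objects into the closure gives the callback its own copies.
  return [Resolver = std::move(Resolver), Deps = std::move(Deps)]() {
    return Resolver.size() + Deps.size();
  };
}
```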
Commit c0ebc18
[mlir][Linalg] Fix match convolution message (llvm#106197)
Fix the message part of bugfix commit `2ef3dcf`.
Commit bacf312
Revert "[Clang] [Test] Use lit Syntax for Environment Variables in Clang subproject" (llvm#106267)
Reverts llvm#102647. I am reverting this change because the `readfile` doesn't actually perform any useful operation, and yet, for some reason, the test still passed. This indicates that the modification was unnecessary and could lead to confusion or unexpected behavior in the future.
Commit 815bf0f
Commit 656d5aa
[clang-format] Insert a space between new/delete and a C-style cast (llvm#106175)
It doesn't make sense to remove the space between new/delete and a C-style cast when SpaceBeforeParensOptions.AfterPlacementOperator is set to false. Fixes llvm#105628.
Commit fac7e87
[Release] Add keith to valid archive uploaders (llvm#106018)
I am interested in helping contribute macOS binaries since we're generally sporadic with uploading these. Fixes llvm#106016
Commit d2b420c
Commit b1d1c33
[ORC] Generalize loadRelocatableObject to loadLinkableFile, add archive support.
This allows us to rewrite part of StaticLibraryDefinitionGenerator in terms of loadLinkableFile. It's also useful for clients who may not know (either from file extensions or context) whether a given path will be an object file, an archive, or a universal binary. rdar://134638070
Commit 7a4013f
[mlir] Add option to control the `emissionKind` to DIScopeForLLVMFuncOp pass (llvm#106229)
This is currently not controllable by the user and is always set to `DIEmissionKind::LineTablesOnly`. The added option allows setting it to the other values accepted by LLVM (`None`, `Full`, and `DebugDirectivesOnly`).
---------
Co-authored-by: jingzec <jingzec@nvidia.com>
Commit 097138f
[lldb] unique_ptr-ify some GetUserExpression APIs. (llvm#106034)
These methods already returned a uniquely owned object, this just makes them self-documenting.
Commit 3c5ab5a
Revert "[lldb] unique_ptr-ify some GetUserExpression APIs. (llvm#106034)"
This reverts commit 3c5ab5a while I investigate bot failures (e.g. https://lab.llvm.org/buildbot/#/builders/163/builds/4286).
Commit e6cbea1
[AArch64] Fix buildbot breakage of ubsan
Fix the `ERROR: UndefinedBehaviorSanitizer` failure, reproduced by `BUILDBOT_REVISION=43ffe2eed llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_bootstrap_ubsan.sh`. It might also be related to llvm#76202.
Commit 8067b88
[AArch64] Fold more load.x into load.i with large offset
The list of load.x instructions follows canFoldIntoAddrMode from D152828. Also support LDRSroX, which canFoldIntoAddrMode missed.
Commit e5a5ac0
[analyzer] Detect leaks of stack addresses via output params, indirect globals 3/3 (llvm#105648)
Fix some false negatives of StackAddrEscapeChecker:
- Output parameters
```
void top(int **out) {
  int local = 42;
  *out = &local; // Noncompliant
}
```
- Indirect global pointers
```
int **global;
void top() {
  int local = 42;
  *global = &local; // Noncompliant
}
```
Note that now StackAddrEscapeChecker produces a diagnostic if a function with an output parameter is analyzed as top-level or as a callee. I took special care to make sure the reports point to the same primary location and, in many cases, feature the same primary message. That is the motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp.
To avoid false positive reports when a global indirect pointer is assigned a local address, invalidated, and then reset, I rely on the fact that the invalidation symbol will be a DerivedSymbol of a ConjuredSymbol that refers to the same memory region.
The checker still has a false negative for non-trivial escaping via a returned value. It requires a more sophisticated traversal akin to scanReachableSymbols, which is out of the scope of this change.
CPP-4734
---------
This is the last of the 3 stacked PRs; it must not be merged before llvm#105652 and llvm#105653.
Commit 190449a
Commit 3dbb6be
[llvm-cxxfilt][macOS] Don't strip underscores on macOS by default (llvm#106233)
Currently, `llvm-cxxfilt` will strip the leading underscore of its input on macOS. Historically MachO symbols were prefixed with an extra underscore and this is why this default exists. However, nowadays, the `ItaniumDemangler` supports all of the following mangling prefixes: `_Z`, `__Z`, `___Z`, `____Z`. So really `llvm-cxxfilt` can simply forward the mangled name to the demangler and let the library decide whether it's a valid encoding.
Compiling C++ on macOS nowadays will generate symbols with `_Z` and `___Z` prefixes. So users trying to demangle these symbols will have to know that they need to add the `-n` option. This routinely catches people off-guard.
This patch removes the `-n` default for macOS and allows calling into the `ItaniumDemangler` with all the `_Z` prefixes that the demangler supports (1-4 underscores).
rdar://132714940
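To make the "1-4 underscores" rule concrete, here is a minimal sketch of a prefix check. It is only an illustration of the prefixes listed in the commit message, not the demangler's actual logic; `hasItaniumManglingPrefix` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string_view>

// Returns true if Name starts with one to four underscores followed by 'Z',
// i.e. one of the prefixes _Z, __Z, ___Z, ____Z named in the commit message.
// This checks the prefix only; it does not validate the rest of the encoding.
bool hasItaniumManglingPrefix(std::string_view Name) {
  std::size_t Underscores = 0;
  while (Underscores < Name.size() && Name[Underscores] == '_')
    ++Underscores;
  return Underscores >= 1 && Underscores <= 4 &&
         Underscores < Name.size() && Name[Underscores] == 'Z';
}
```

Under this sketch, a tool can pass any name that satisfies the check straight to the demangling library and leave the decision about validity to it.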
Commit 0b554dd
[LoongArch] Format LoongArchL{A}SXInstrInfo.td. NFC
Alignment and start with an upper-case letter.
Commit 175aa86
Commit 6332c36
[llvm] Prefer StringRef::substr to StringRef::slice (NFC) (llvm#106330)
S.substr(N) is simpler than S.slice(N, StringRef::npos). Also, substr is probably more recognizable than slice thanks to std::string_view::substr.
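The same comparison can be sketched with `std::string_view`, which the message cites as the familiar precedent. The `slice` helper below is a hypothetical stand-in for a `slice(Start, End)` style API and is not LLVM's `StringRef::slice`; both helpers clamp out-of-range positions so they never throw.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical slice(Start, End) helper on top of std::string_view: returns
// the characters in [Start, End), with both bounds clamped to the size.
std::string_view slice(std::string_view S, std::size_t Start, std::size_t End) {
  Start = std::min(Start, S.size());
  End = std::min(std::max(Start, End), S.size());
  return S.substr(Start, End - Start);
}

// Dropping the first N characters: the substr form needs no explicit end.
std::string_view dropFrontWithSubstr(std::string_view S, std::size_t N) {
  return S.substr(std::min(N, S.size()));
}

// The slice form has to spell out "until the end" with npos.
std::string_view dropFrontWithSlice(std::string_view S, std::size_t N) {
  return slice(S, N, std::string_view::npos);
}
```

Both functions return the same view; the first simply states less.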
Commit 22e55ba
[libc++][math] Provide overloads for cv-unqualified floating point types for `std::isnormal` (llvm#104773)
## Why
Currently, the following does not work when compiled with clang:
```c++
#include <cmath>
struct ConvertibleToFloat { operator float(); };
bool test(ConvertibleToFloat x) { return std::isnormal(x); }
```
See https://godbolt.org/z/5bos8v67T for differences with respect to msvc, gcc or icx. It fails for `float`, `double` and `long double` (all cv-unqualified floating-point types).
## What
Test and provide overloads as expected by the ISO C++ standard. The classification/comparison function `isnormal` is defined since C++11 until C++23 as
```c++
bool isnormal( float num );
bool isnormal( double num );
bool isnormal( long double num );
```
and since C++23 as
```c++
constexpr bool isnormal( /* floating-point-type */ num );
```
for which "the library provides overloads for all cv-unqualified floating-point types as the type of the parameter num". See §28.7.1/1 in the [ISO C++ standard](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf) or check [cppreference](https://en.cppreference.com/w/cpp/numeric/math/isnormal).
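One plausible way such a call can fail is when the function exists only as a template constrained to arithmetic types: deduction then sees the class type itself and never considers its conversion operator. The sketch below illustrates that mechanism under this assumption; the names in `namespace sketch` are hypothetical and this is not the libc++ implementation.

```cpp
#include <cmath>
#include <type_traits>

struct ConvertibleToFloat {
  operator float() const { return 1.0f; }
};

namespace sketch {
// Template-only version: for an argument of type ConvertibleToFloat, T is
// deduced as ConvertibleToFloat, which is not arithmetic, so this candidate
// is removed and a call with that argument would not compile.
template <class T,
          typename std::enable_if<std::is_arithmetic<T>::value, int>::type = 0>
bool isNormalTemplated(T X) {
  return std::isnormal(static_cast<double>(X));
}

// Explicit overloads: ordinary overload resolution applies the user-defined
// conversion to float and picks the float overload as the best match.
inline bool isNormalOverloaded(float X) { return std::isnormal(X); }
inline bool isNormalOverloaded(double X) { return std::isnormal(X); }
inline bool isNormalOverloaded(long double X) { return std::isnormal(X); }
} // namespace sketch
```

With these definitions, `sketch::isNormalOverloaded(ConvertibleToFloat{})` compiles, while `sketch::isNormalTemplated(ConvertibleToFloat{})` would be rejected.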
Commit 866bec7
[clang] Add lifetimebound attr to std::span/std::string_view constructor (llvm#103716)
With this patch, clang now automatically adds ``[[clang::lifetimebound]]`` to the parameters of the `std::span` and `std::string_view` constructors. This enables Clang to catch more cases where the returned reference outlives the object. Fixes llvm#100567
Commit 902b2a2
[Coroutines] Salvage the debug information for coroutine frames within optimizations
This patch tries to salvage the debug information for coroutine frames within optimizations by creating the helper alloca variables under optimizations too. We didn't do this when I implemented it initially. I roughly remember the reason was that we felt the additional helper alloca variables might pessimize performance, which is almost the most important thing under optimizations. But now, it looks like the newly inserted helper alloca variables can be optimized out by the following optimizations. So it looks like it is time to make this available within optimizations. Also, it looks like the following optimizations will convert the generated dbg.declare intrinsic into a dbg.value intrinsic within optimizations. In LLVM's tests, there is a slight regression: a dbg.declare for the promise object fails to remain after this change. But it looks like we won't have a chance to see a dbg.declare for the promise object when we split the coroutine, as that dbg.declare will be converted into a dbg.value at an early stage. So everything looks fine.
Commit 07514fa
[analyzer] Fix false positive for mutexes inheriting mutex_base (llvm#106240)
If a mutex interface is split across an inheritance chain, e.g. struct mutex has `unlock` and inherits `lock` from __mutex_base, then calls to m.lock() and m.unlock() have different "this" targets: m and the __mutex_base of m, which used to confuse the `ActiveCritSections` list. Taking the base region canonicalizes the region used to identify a critical section and enables a search in the ActiveCritSections list regardless of which class the callee is a member of. This likely fixes llvm#104241 CPP-5541
Commit 82e314e
[LLVM][C API] Clearing initializer and personality by passing NULL (llvm#105521)
This is similar to how the C++ API supports passing `nullptr` to `setPersonalityFn` or `setInitializer`.
Commit 0bd5130
[LV][NFC] Update and clean up the test case LoopVectorize/RISCV/inloop-reduction.ll. (llvm#102907)
Commit dfde1a7
[LoopUnrollAnalyzer] Use constant folding API for loads
Use ConstantFoldLoadFromConst() instead of a partial re-implementation. This makes the code slightly more generic by not depending on the exact structure of the constant.
Commit fe182dd
[clang] Update C++ DR page (llvm#106299)
[CWG2917](https://cplusplus.github.io/CWG/issues/2917.html) got a new proposed resolution that is different from the one the test has been written against. For [CWG2922](https://cplusplus.github.io/CWG/issues/2922.html), the initial "possible resolution" was apparently approved without changes.
Commit 9cf052d
Revert "[clang] Add nuw attribute to GEPs" (llvm#106343)
Reverts llvm#105496
This patch breaks:
https://lab.llvm.org/buildbot/#/builders/25/builds/1952
https://lab.llvm.org/buildbot/#/builders/52/builds/1775
Somehow output is different with sanitizers. Maybe non-determinism in the code?
Commit 69437a3
[LoopUnrollAnalyzer] Don't simplify signed pointer comparison
We're generally not able to simplify signed pointer comparisons (because we don't have no-wrap flags that would permit it), so we shouldn't pretend that we can in the cost model. The unsigned comparison case is also not modelled correctly, as explained in the added comment. As this is a cost model inaccuracy at worst, I'm leaving it alone for now.
Commit 69c4346
[LSR] Use computeConstantDifference()
This API is faster than getMinusSCEV() and a SCEVConstant cast.
Commit 7660981
[X86] Add additional test coverage for half libcall expansion/promotion
Just need to add powi test with llvm#105775
Commit 760b172
[libc++][math] Remove constrained overloads of `std::{isnan, isinf, isfinite}` (llvm#106224)
## Why
Since llvm#98841 and llvm#98952, the constrained overloads are unused and not needed anymore as we added explicit overloads for all floating point types. I forgot to remove them in the mentioned PRs.
## What
Remove them.
Commit 2f0661c
Commit 53d1c21
[Clang] [Docs] Document runtime config directory options (llvm#66593)
In the clang user manual the build options `CLANG_CONFIG_FILE_USER_DIR` and `CLANG_CONFIG_FILE_SYSTEM_DIR` are documented, but the run time overrides `--config-user-dir` and `--config-system-dir` are not. I have updated the manual to add these run time arguments.
Commit 15405b3
[IndVars] Check if WideInc available before trying to use it
WideInc/WideIncExpr can be null. Previously this worked out because the comparison with WideIncExpr would fail. Now we have accesses to WideInc prior to that. Avoid the issue with an explicit check. Fixes llvm#106239.
Commit c9a5e1b
fix(llvm/**.py): fix comparison to None (llvm#94018)
From PEP8 (https://peps.python.org/pep-0008/#programming-recommendations):
> Comparisons to singletons like None should always be done with is or is not, never the equality operators.

Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
Commit 94ed47f
[clang][bytecode] Diagnose array-to-pointer decays of dummy pointers (llvm#106366)
We have type information for them now, so we can do this.
Commit f7a74ec
[clang-format] js handle anonymous classes (llvm#106242)
Addresses a regression in JavaScript when formatting anonymous classes. --------- Co-authored-by: Owen Pan <owenpiano@gmail.com>
Commit 77d63cf
Move stepvector intrinsic out of experimental namespace (llvm#98043)
This patch moves the stepvector intrinsic out of the experimental namespace. The intrinsic has existed in LLVM for several years and is widely used.
Commit 95d2d1c
Commit 8fd9ec5
[libc] Disable failing scanf test on AMDGPU temporarily
Summary: This test currently fails in the `amdgpu-attributor` pass. I haven't figured out anything beyond that yet as it's difficult to reduce.
Commit 439d7de
Commit f4e7e5d
Commit 71ede8d
[libc++][ranges] P2609R3: Relaxing Ranges Just A Smidge (llvm#101715)
This patch implements https://wg21.link/p2609r3. The test code was originally authored by JMazurkiewicz.
Notes:
- P2609R3 is not officially a Defect Report, but MSVC STL implements it in C++20 mode. Moreover, P2609R3 and P2997R1 touch exactly the same set of concepts, and MSVC STL and libc++ have already treated P2997R1 as a DR.
- This patch also adjusted feature-test macros.
  + In C++20 mode, the value of __cpp_lib_ranges should be `202110L` because
    - `202202L` covers `range_adaptor_closure` (P2387R3), and
    - `202207L` covers move-only types in range adaptors (P2494R2).

    And all of these changes are only available since C++23 mode.
  + In C++23 mode, the value should be `202406L` because
    - `202211L` covers removing poison overloads (P2602R2),
    - `202302L` covers relaxing projected value types (P2609R3), and
    - `202406L` covers removing requirements on `iter_common_reference_t` (P2997R1).

    And all of these changes are already or being implemented.

Fixes llvm#105253.
Co-authored-by: Jakub Mazurkiewicz <mazkuba3@gmail.com>
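The macro values in these notes form an ordered ladder, which can be sketched as a lookup. The values and paper numbers below are copied from the commit message; the function name `newestRangesPaperFor` is hypothetical and the function is only an illustration, not part of libc++.

```cpp
#include <string_view>

// Maps a __cpp_lib_ranges value to the newest paper that the commit message
// associates with that value. Returns an empty view for values below the
// first threshold listed there.
std::string_view newestRangesPaperFor(long Value) {
  if (Value >= 202406L) return "P2997R1"; // iter_common_reference_t requirements
  if (Value >= 202302L) return "P2609R3"; // relaxing projected value types
  if (Value >= 202211L) return "P2602R2"; // removing poison overloads
  if (Value >= 202207L) return "P2494R2"; // move-only types in range adaptors
  if (Value >= 202202L) return "P2387R3"; // range_adaptor_closure
  return {};
}
```

Read this way, the C++20 value `202110L` sits below every paper in the list, and the C++23 value `202406L` covers all of them.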
Commit 026210e
[VPlan] Move properlyDominates to VPDominatorTree (NFCI).
This allows for easier re-use in additional places in the future. Also move code to VPlanAnalysis.cpp
Commit 96e1320
[libc++] Switch to the current XCode beta on macOS builders (llvm#106363)
This unblocks a ton of work including llvm#76756 as it updates to a newer version of AppleClang.
Commit ec9f36a
[ValueLattice] Move intersect from LVI into ValueLattice API (NFC)
So we can reuse the logic inside IPSCCP.
Commit a5b6068
[RemoveDIs] Simplify spliceDebugInfo, fixing splice-to-end edge case (llvm#105670)
Not quite NFC: fixes the splitBasicBlockBefore case when we split before an instruction with debug records (but without the headBit set, i.e., we are splitting before the instruction but after the debug records that come before it). splitBasicBlockBefore splices the instructions before the split point into a new block. Prior to this patch, the debug records would get shifted up to the front of the spliced instructions (as seen in the modified unittest - I believe the unittest was checking erroneous behaviour). We instead want to leave those debug records at the end of the spliced instructions.
The functionality of the deleted `else if` branch is covered by the remaining `if` now that `DestMarker` is set to the trailing marker if `Dest` is `end()`. Previously the "===" markers were sometimes detached, now we always detach them and always reattach them.
Note: `deleteTrailingDbgRecords` only "unlinks" the trailing marker from the block, it doesn't delete anything. The trailing marker is still cleaned up properly inside the final `if` body with `DestMarker->eraseFromParent();`.
Part 1 of 2 needed for llvm#105571
Commit f581553
[libc++] Run the Lit test suite against an installed version of the library (llvm#96910)
We always strive to test libc++ as close as possible to the way we are actually shipping it. This was approximated reasonably well by setting up the minimal driver flags when running the test suite, however we were running the test suite against the library located in the build directory.
This patch improves the situation by installing the library (the headers, the built library, modules, etc) into a fake location and then running the test suite against that fake "installation root". This should open the door to getting rid of the temporary copy of the headers we make during the build process, however this is left for a future improvement.
Note that this adds quite a bit of verbosity whenever running the test suite because we install the headers beforehand every time. We should be able to override this to silence it, however CMake doesn't currently give us a way to do that, see https://gitlab.kitware.com/cmake/cmake/-/issues/26085.
Commit 0e8208e
[libc++] P2747R2: `constexpr` placement new (library part) (llvm#105768)
This patch implements https://wg21.link/P2747R2. The library changes affect direct `operator new` and `operator new[]` calls even when the core language changes are absent. The changes are not available for MS ABI because the `operator new` and `operator new[]` are from VCRuntime's `<vcruntime_new.h>`. A feature request was submitted for that [1]. As a drive-by change, the patch reformatted the whole `new.pass.cpp` and `new_array.pass.cpp` tests. Closes llvm#105427
[1]: https://developercommunity.visualstudio.com/t/constexpr-for-placement-operator-newope/10730304.
Commit 7808541
[mlir][tensor] Add a test for invalid tensor.pack (llvm#106246)
Adds a missing test for when the rank of the output tensor doesn't match the input tensor rank + number of blocking factors.
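The rank rule this test exercises can be written as a one-line predicate. This is a minimal sketch of the arithmetic only, not MLIR's verifier; `hasValidPackedRank` and its parameter names are hypothetical.

```cpp
#include <cstddef>

// Rank rule stated in the commit message: the output tensor rank must equal
// the input tensor rank plus the number of blocking factors.
bool hasValidPackedRank(std::size_t InputRank, std::size_t NumBlockingFactors,
                        std::size_t OutputRank) {
  return OutputRank == InputRank + NumBlockingFactors;
}
```

For example, a rank-2 input with two blocking factors needs a rank-4 output; any other output rank is the invalid case the new test covers.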
Commit 74d1960
[flang] Update the date_and_time intrinsic for AIX (llvm#104849)
Currently, strftime is called to get the timezone for the ZONE argument. On AIX, this routine requires an environment variable to be set in order to return the required format. This patch adds a computation of the time difference from UTC for the platform.
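A minimal sketch of computing the difference from UTC without `strftime` follows. It is an illustration using only the C standard library, not the code added to the flang runtime; `utcOffsetMinutes` and `formatZone` are hypothetical names, the `+hhmm`/`-hhmm` layout follows the ZONE format of `DATE_AND_TIME`, and behaviour exactly at DST transitions is not handled carefully.

```cpp
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <string>

// Approximates the local offset from UTC, in minutes, at time Now. The UTC
// broken-down time is reinterpreted as local time by mktime, so the gap
// between the two results is the offset.
int utcOffsetMinutes(std::time_t Now) {
  std::tm Local = *std::localtime(&Now); // copy: the result may be shared storage
  std::tm Utc = *std::gmtime(&Now);
  Utc.tm_isdst = Local.tm_isdst; // interpret both with the same DST flag
  double Seconds = std::difftime(std::mktime(&Local), std::mktime(&Utc));
  return static_cast<int>(Seconds / 60);
}

// Formats an offset in minutes as "+hhmm" or "-hhmm".
std::string formatZone(int OffsetMinutes) {
  char Buf[16];
  int Abs = std::abs(OffsetMinutes);
  std::snprintf(Buf, sizeof(Buf), "%c%02d%02d", OffsetMinutes < 0 ? '-' : '+',
                Abs / 60, Abs % 60);
  return Buf;
}
```

A caller would combine the two as `formatZone(utcOffsetMinutes(std::time(nullptr)))`.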
Commit 8b198ee
[clang] Minor updates to C++ DR page design (llvm#106360)
This patch updates the `make_cxx_dr_status` script to use the same spoiler-like way to hide additional details that `cxx_status.html` uses. This gives implemented yet unresolved DRs a new but very familiar look: ![s9EpO0E](https://github.com/user-attachments/assets/54852d7b-5fdd-4595-8dca-20628797f952) I also took the opportunity to fix a spelling inconsistency pointed out by @zygoloid in llvm#106299 (comment). I got tired of counting `%s`s when we substitute data into the HTML template, so I replaced them with an f-string (available since Python 3.6), because I had to touch this code anyway.
Commit fc39cc1
Commit b8c0e8a
[mlir][amdgpu] Improve Chipset version utility (llvm#106169)
* Fix an OOB access
* Add comparison operators
* Add documentation
* Add unit tests
Commit b2f1d06
Commit 8a50e35
[InstCombine][X86] Only demand used bits for PSHUFB mask values (llvm#106377)
(V)PSHUFB only uses the sign bit (for zeroing) and the lower 4 bits (to index per-lane byte 0-15) - so use SimplifyDemandedBits to ignore anything touching the remaining bits. Fixes llvm#106256
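The claim that only the sign bit and the low four bits of each mask byte matter can be checked with a scalar model of one 128-bit lane. This is an illustrative sketch of the instruction's per-byte behaviour as described above, not the InstCombine code; `Lane` and `shuffleBytes` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Lane = std::array<std::uint8_t, 16>;

// Scalar model of PSHUFB on a single 16-byte lane: if the mask byte's sign
// bit is set the result byte is zero, otherwise the low 4 bits select a
// source byte. Bits 4-6 of the mask byte never influence the result.
Lane shuffleBytes(const Lane &Src, const Lane &Mask) {
  Lane Out{};
  for (std::size_t I = 0; I < Out.size(); ++I) {
    std::uint8_t M = Mask[I];
    Out[I] = (M & 0x80) ? 0 : Src[M & 0x0F];
  }
  return Out;
}
```

Two masks that differ only in bits 4-6 therefore produce identical results, which is what lets the optimizer ignore computations that affect only those bits.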
Commit 51a0951
Commit 158ba73
[libc++] Mark a few papers as done or "Nothing To Do"
Please refer to the Github issues for details on why those are marked as resolved. Huge thanks to @frederick-vs-ja for the analysis. Closes llvm#104336 Closes llvm#100042 Closes llvm#100615
Commit cc0f2d5
[MachineOutliner][NFC] Remove unnecessary RepeatedSequenceLocs.clear() (llvm#106171)
- When `getOutliningCandidateInfo()` returns `std::nullopt` (meaning no `OutlinedFunction` is created), there is no need to clear the input argument, `RepeatedSequenceLocs`, as it's already being cleared in the main loop of `findCandidates()`.
- Replaced `2` by `MinRepeats`, which I missed from llvm#105398
Commit 140381d
[compiler-rt][rtsan] NFC: Introduce __rtsan_expect_not_realtime helper (llvm#106314)
We are extracting this function into the C API so we can eventually install it when a user marks a function [[clang::blocking]].
Commit fee4836
[lldb][lldb-dap][test] Enable variable tests on Windows
At least for our Windows on Arm machine compiling with clang-cl, it has inverted which variables get a `::` prefix. Would not surprise me if msvc does the opposite so feel free to revert if these tests fail for you.
Commit a3cd8d7
[VPlan] Move logic to create interleave groups to VPlanTransforms (NFC).
This is a step towards further breaking up the rather large tryToBuildVPlanWithVPRecipes. It moves the logic to create interleave groups to VPlanTransforms.cpp, where similar replacements for other recipes are defined as well (e.g. EVL-based ones).
Commit 16910a2
[PowerPC] fix legalization crash (llvm#105563)
If v2i64 scalar_to_vector is made custom, llc can crash in certain legalization cases where v2i64 vectors are injected, even if they weren't otherwise present. The code generated would be fine, but that operation is not handled in ReplaceNodeResults. Add handling.
Commit 89bbcbe
[flang] Warn when F128 is unsupported (llvm#102147)
This generates `warning: REAL(KIND=16) is not an enabled type for this target` if that type is used in a build not correctly configured to support this type. Uses of `selected_real_kind(30)` return -1.
Commit 114ff99
Commit 37d0841
[X86][LegalizeDAG] FPOWI: promote f16 operand (llvm#105775)
Fixes llvm#105747 --------- Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
Commit ecd9e0b
[LLVM][NVPTX] Remove nonexistent ftz ops (llvm#106100)
According to the PTX [spec](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-max), max & min instructions do not support the `ftz` modifier for `bf16` & `bf16x2` types. This PR removes them from instr info, and the non-ftz legal versions will be emitted instead.
Commit 82113a4
[CodeGen] Create IFUNCs in the program address space, not hard-coded 0 (llvm#105726)
Commit 0d527e5 ("GlobalIFunc: Make ifunc respect function address spaces") added support for this within LLVM, but Clang does not properly honour the target's address spaces when creating IFUNCs, crashing with RAUW and verifier assertion failures when compiling C code on a target with a non-zero program address space, so fix this.
Commit 73e0aa5
[InterleavedAccess] Use SmallVectorImpl references. NFC
Instead of repeating SmallVector size in multiple places.
Commit 829c47f
[lldb][lldb-dap][test] Enable more attach tests on Windows
By adding the equivalent includes.
Commit af3ee62
Commit be7014e
[clang][bytecode] Fix llvm#55390 here as well (llvm#106395)
Ignore the multiplication overflow but report the 0 denominator.
Commit 40db261
Commit b40677c
-
[AMDGPU] Don't realign already allocated LDS. Point fix for 106412 (llvm#106421)
Fixes 106412. The logic that skips the pass on already-lowered variables doesn't cover the path that increases alignment of variables. If a variable is allocated at 24 and then given 16 byte alignment, the backend notices and fatal-errors on the inconsistency.
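The inconsistency in the example above is plain arithmetic: an offset of 24 is not a multiple of 16. A minimal sketch, using hypothetical helper names rather than anything from the AMDGPU backend:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not the backend's code.
// True when Offset is a multiple of Align (Align must be non-zero).
constexpr bool isAlignedSketch(uint64_t Offset, uint64_t Align) {
  return Offset % Align == 0;
}

// Smallest multiple of Align that is >= Offset.
constexpr uint64_t alignToSketch(uint64_t Offset, uint64_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

// A variable already placed at offset 24 cannot honour a 16-byte alignment
// that is requested afterwards; the next valid offset would be 32.
static_assert(isAlignedSketch(24, 8), "24 is 8-byte aligned");
static_assert(!isAlignedSketch(24, 16), "24 is not 16-byte aligned");
static_assert(alignToSketch(24, 16) == 32, "next 16-byte aligned offset");
```

Raising the alignment after the offset is fixed would require moving the variable, which is why the pass should leave already-allocated variables alone.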
Commit 1bde8e0
-
[LTO] Introduce new type alias ImportListsTy (NFC) (llvm#106420)
The background is as follows. I'm planning to reduce the memory footprint of ThinLTO indexing by changing ImportMapTy, the data structure used for an import list. Once this patch lands, I'm planning to change the type slightly. The new type alias allows us to update the type without touching many places.
Commit 4f15039
-
[libc++] Replace 'tags' in CSV status pages by inline notes (llvm#105581)
This patch replaces 'tags' in the CSV status pages by inline notes that optionally describe more details about the paper/LWG issue. Tags were not really useful anymore because we have a vastly superior tagging system via Github issues, and keeping the tags up-to-date between CSV files and Github is going to be really challenging. This patch also adds support for encoding custom notes in the CSV files via Github issues. To encode a note in the CSV file, the body (initial description) of a Github issue can be edited to contain the following markers:
BEGIN-RST-NOTES
text that will be added as a note in the RST
END-RST-NOTES
Amongst other things, this solves the problem of conveying that a paper has been implemented as a DR, and it gives a unified way to add notes to the status pages from Github.
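To make the marker convention concrete, here is a minimal sketch of pulling the note text out of an issue body. It is only an illustration: the function name is hypothetical, trimming surrounding whitespace is an assumption, and the project's actual tooling is not shown in this commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: returns the text between BEGIN-RST-NOTES and
// END-RST-NOTES, or std::nullopt when either marker is missing.
// Trimming the surrounding whitespace is an assumption of this sketch.
std::optional<std::string> extractRstNotes(const std::string &Body) {
  const std::string Begin = "BEGIN-RST-NOTES";
  const std::string End = "END-RST-NOTES";
  std::size_t B = Body.find(Begin);
  if (B == std::string::npos)
    return std::nullopt;
  B += Begin.size();
  std::size_t E = Body.find(End, B);
  if (E == std::string::npos)
    return std::nullopt;
  std::string Notes = Body.substr(B, E - B);
  const char *Whitespace = " \t\r\n";
  std::size_t First = Notes.find_first_not_of(Whitespace);
  if (First == std::string::npos)
    return std::string(); // Markers present, but nothing between them.
  std::size_t Last = Notes.find_last_not_of(Whitespace);
  return Notes.substr(First, Last - First + 1);
}
```

An issue body without both markers yields no note, so the status page entry would simply carry none.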
Commit c2cac69
-
[CGData] Document for llvm-cgdata (llvm#106320)
This is a follow-up for llvm#101461. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
Commit ef403f9
-
Commit a4989cd
-
[VPlan] Pass live-ins used as exit values straight to live-out.
Live-ins that are used as exit values don't need to be extracted, they can be passed through directly. This fixes a crash when trying to extract from a live-in. Fixes llvm#106257.
Commit 4b84288
-
DAG: Change round-mode operand type to i32 for FPTRUNC_ROUND (llvm#106424)
We need this immediate type to be consistent. This is the pre-commit for llvm#105761.
Commit 41b5507
-
Commit b5977b5
-
Commit 5a5cf51
-
Commit efbafbc
-
[SandboxIR] Add test that checks if classof() is missing. (llvm#106313)
Forgetting to implement an `<Instruction Subclass>::classof()` function does not cause any failures because it falls back to Instruction::classof(). This patch adds an explicit check for all instruction classes to confirm that they have a classof implementation.
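The silent fallback is easy to reproduce outside LLVM. The sketch below uses a hypothetical three-class hierarchy and an `isa`-like helper. The address comparison at the end is just one possible way to detect a missing `classof` and is not necessarily what the added test does.

```cpp
#include <cassert>

// Hypothetical miniature hierarchy; the real classes live in SandboxIR.
struct InstructionSketch {
  enum class Kind { Add, Load };
  Kind K;
  explicit InstructionSketch(Kind K) : K(K) {}
  // Base implementation: accepts every instruction.
  static bool classof(const InstructionSketch *) { return true; }
};

struct AddSketch : InstructionSketch {
  AddSketch() : InstructionSketch(Kind::Add) {}
  static bool classof(const InstructionSketch *I) { return I->K == Kind::Add; }
};

// classof() was forgotten here, so LoadSketch::classof silently resolves to
// InstructionSketch::classof and compiles without any diagnostic.
struct LoadSketch : InstructionSketch {
  LoadSketch() : InstructionSketch(Kind::Load) {}
};

// isa-like helper that dispatches through To::classof.
template <typename To> bool isaSketch(const InstructionSketch *I) {
  return To::classof(I);
}

// One possible explicit check: a subclass that defines its own classof has a
// different function address than the inherited base implementation.
template <typename T> bool hasOwnClassof() {
  return &T::classof != &InstructionSketch::classof;
}
```

With these definitions `isaSketch<LoadSketch>` accepts an `AddSketch`, which is exactly the kind of wrong answer that never shows up as a build or test failure on its own.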
Commit 3cf1018
-
[LV] Add extra tests with interleave groups and different insert pos.
Add additional test coverage for interleave groups with different insert positions.
Commit 7912abe
-
[compiler-rt][test] Rewrote test to remove curly braces (llvm#105696)
This patch removes curly braces from a test, as lit's internal shell implementation does not support curly brace syntax. Fixes llvm#102382.
Commit b978bcc
-
[profile][test] Build Posix/instrprof-dlopen-norpath.test objects as PIC (llvm#106406)
`Profile-x86_64 :: Posix/instrprof-dlopen-norpath.test` `FAILs` on Solaris/amd64 and similarly on Solaris/sparcv9:
```
RUN: at line 10: ./a.out 2>&1 | FileCheck compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test -check-prefix=CHECK-FOO
+ ./a.out
+ FileCheck compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test -check-prefix=CHECK-FOO
compiler-rt/test/profile/Posix/instrprof-dlopen-norpath.test:24:12: error: CHECK-FOO: expected string not found in input
CHECK-FOO: foo:
           ^
<stdin>:1:1: note: scanning from here
unable to lookup symbol 'foo': ld.so.1: a.out: invalid handle: 0x0
```
The problem turned out to be two-fold: `OPEN_AND_RUN` didn't check the `dlopen` return value and the objects linked into the shared objects to be `dlopen`ed aren't built as PIC. This patch fixes the latter. Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and `x86_64-pc-linux-gnu`.
Commit e03669a
-
[RISCV] Add cost model coverage for insert/extract element w/ 2^N - 1 types
We currently return costs which are too low for these.
Commit c43190f
-
[NVPTX] Support __usAtomicCAS builtin (llvm#99646)
Adds support for the `__usAtomicCAS` builtin, originally defined in `/usr/local/cuda/include/crt/sm_70_rt.hpp`.
---------
Co-authored-by: Denis Gerasimov <Denis.Gerasimov@baikalelectronics.ru>
Co-authored-by: Gonzalo Brito Gadeschi <gonzalob@nvidia.com>
Co-authored-by: Denis.Gerasimov <dengzmm@gmail.com>
Commit 2d1fba6
-
[LTO] Turn ImportListsTy into a proper class (NFC) (llvm#106427)
This patch turns ImportListsTy into a class that wraps DenseMap<StringRef, ImportMapTy>. Here is the background. I'm planning to reduce the memory footprint of ThinLTO indexing. Specifically, ImportMapTy, the list of imports for a given destination module, will be a hash set of integer IDs indexing into a deduplication table of pairs (SourceModule, GUID), which is a lot like string interning. I'm planning to put this deduplication table as part of ImportListsTy and have each instance of ImportMapTy hold a reference to the deduplication table. Another reason to wrap the DenseMap is that I need to intercept operator[]() so that I can construct an instance of ImportMapTy with a reference to the deduplication table. Note that the default implementation of operator[]() would default-construct ImportMapTy, which I am going to disable.
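The planned design can be sketched with standard containers. Everything below is a hypothetical illustration of the idea described above (an interning table shared by every import list, and an `operator[]` that constructs entries with a reference to it). None of the names or container choices come from the actual ThinLTO code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// All names are hypothetical; this is not the LLVM implementation.
using ImportKey = std::pair<std::string, uint64_t>; // (SourceModule, GUID)

// Deduplication table: each distinct pair is stored once and gets an ID.
class DedupTableSketch {
  std::map<ImportKey, unsigned> IDs;
  std::vector<ImportKey> Keys;

public:
  unsigned intern(const ImportKey &K) {
    auto [It, Inserted] = IDs.try_emplace(K, static_cast<unsigned>(Keys.size()));
    if (Inserted)
      Keys.push_back(K);
    return It->second;
  }
  const ImportKey &lookup(unsigned ID) const { return Keys.at(ID); }
  std::size_t size() const { return Keys.size(); }
};

// Import list for one destination module: a hash set of IDs plus a reference
// to the shared table. It deliberately has no default constructor.
class ImportMapSketch {
  DedupTableSketch &Table;
  std::unordered_set<unsigned> Imports;

public:
  explicit ImportMapSketch(DedupTableSketch &T) : Table(T) {}
  void add(const std::string &SourceModule, uint64_t GUID) {
    Imports.insert(Table.intern({SourceModule, GUID}));
  }
  std::size_t size() const { return Imports.size(); }
};

// Wrapper that owns the table and intercepts operator[] so that new entries
// are constructed with a reference to it instead of being default-constructed.
class ImportListsSketch {
  DedupTableSketch Table;
  std::unordered_map<std::string, ImportMapSketch> Lists;

public:
  ImportListsSketch() = default;
  ImportListsSketch(const ImportListsSketch &) = delete;
  ImportMapSketch &operator[](const std::string &DestModule) {
    return Lists.try_emplace(DestModule, Table).first->second;
  }
  std::size_t uniquePairs() const { return Table.size(); }
};
```

Two destination modules importing the same `(SourceModule, GUID)` pair then share one table entry and each store only a small integer, which is where the memory saving would come from.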
Commit e61d606
-
[clang] check deduction consistency when partial ordering function templates (llvm#100692)
This makes partial ordering of function templates consistent with other entities, by implementing [temp.deduct.type]p1 in that case. Fixes llvm#18291.
Commit aa7497a
-
[ADT] Relax iterator constraints on all_equal (llvm#106400)
The previous `all_equal` implementation contained `Begin + 1`, which implicitly requires `Begin` to model the [random_access_iterator](https://en.cppreference.com/w/cpp/iterator/random_access_iterator) concept due to the usage of the `+` operator. By swapping this out with `std::next`, this method can be used with weaker iterator concepts, such as [forward_iterator](https://en.cppreference.com/w/cpp/iterator/forward_iterator). --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
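A minimal sketch of the idea, written against the standard library only. `allEqualSketch` is a hypothetical stand-in and not the actual `llvm::all_equal` source. The point is that `std::next` needs only a forward iterator, so a `std::forward_list` works.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <vector>

// Hypothetical stand-in for llvm::all_equal; not the actual ADT code.
// Only forward-iterator operations are used: std::next replaces `Begin + 1`.
template <typename Range> bool allEqualSketch(const Range &R) {
  auto Begin = std::begin(R);
  auto End = std::end(R);
  if (Begin == End)
    return true; // An empty range is trivially all-equal.
  // Compare every element after the first against the first one.
  for (auto It = std::next(Begin); It != End; ++It)
    if (!(*It == *Begin))
      return false;
  return true;
}
```

With `Begin + 1` in place of `std::next(Begin)`, the `std::forward_list` instantiation would fail to compile, because forward iterators do not provide `operator+`.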
Commit 6b4b8dc
-
Commit ec360d6
-
[compiler-rt][rtsan] Fix failing file permissions test by checking umask (llvm#106095)
This reverts: d8d8d65
Commit 898d52b
-
[clang][HLSL] Update DXIL/SPIRV hybrid CodeGen tests to use temp var (llvm#105930)
Update all hybrid DXIL/SPIRV codegen tests to use a temp variable representing the interchange target. Fixes: llvm#105710
Commit e99aa4a
-
[mlir][spirv] Add an argmax integration test with `mlir-vulkan-runner` (llvm#106426)
This PR adds an integration test for an argmax kernel with `mlir-vulkan-runner`. This test exercises the `convert-to-spirv` pass (landed in llvm#95942) and demonstrates that we can use SPIR-V ops as "intrinsics" among higher-level dialects. Support for the `index` dialect in `mlir-vulkan-runner` is also added.
Commit 17b7a9d
-
Disable ThreadPlanSingleThreadTimeout during step over breakpoint (llvm#104532)
This PR fixes another race condition in llvm#90930. The failure was found by @labath with this log: https://paste.debian.net/hidden/30235a5c/:
```
dotest_wrapper. < 15> send packet: $z0,224505,1#65
...
b-remote.async> < 22> send packet: $vCont;s:p1dcf.1dcf#4c
intern-state GDBRemoteClientBase::Lock::Lock sent packet: \x03
b-remote.async> < 818> read packet: $T13thread:p1dcf.1dcf;name:a.out;threads:1dcf,1dd2;jstopinfo:5b7b226e616d65223a22612e6f7574222c22726561736f6e223a227369676e616c222c227369676e616c223a31392c22746964223a373633317d2c7b226e616d65223a22612e6f7574222c22746964223a373633347d5d;thread-pcs:0000000000224505,00007f4e4302119a;00:0000000000000000;01:0000000000000000;02:0100000000000000;03:0000000000000000;04:9084997dfc7f0000;05:a8742a0000000000;06:b084997dfc7f0000;07:6084997dfc7f0000;08:0000000000000000;09:00d7e5424e7f0000;0a:d0d9e5424e7f0000;0b:0202000000000000;0c:80cc290000000000;0d:d8cc1c434e7f0000;0e:2886997dfc7f0000;0f:0100000000000000;10:0545220000000000;11:0602000000000000;12:3300000000000000;13:0000000000000000;14:0000000000000000;15:2b00000000000000;16:80fbe5424e7f0000;17:0000000000000000;18:0000000000000000;19:0000000000000000;reason:signal;#b9
```
The log shows that an async interrupt "\x03" was sent immediately after the `vCont;s` single step over the breakpoint at address `0x224505` (which was disabled before the vCont), and that the later stop was still at the original PC (0x224505), which had not moved forward. The investigation shows that the failure happens when the timeout is short and the async interrupt is sent to lldb-server immediately after vCont: ptrace() resumes the debuggee, and the async interrupt then stops it before it gets a chance to execute and advance the PC, so it stops immediately at the original PC. `ThreadPlanStepOverBreakpoint` does not expect the PC to stay where it was and reports a stop at the original location.
To fix this, the PR prevents `ThreadPlanSingleThreadTimeout` from being created during `ThreadPlanStepOverBreakpoint` by introducing a new `SupportsResumeOthers()` method, for which `ThreadPlanStepOverBreakpoint` returns false. This makes sense because we should never resume other threads while stepping over a breakpoint anyway; otherwise other threads might miss the breakpoint.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Commit 38b252a
-
Revert "[CodeGen] Use MachineInstr::{all_uses,all_defs} (NFC)" (llvm#…
Commit 0281339
-
AMDGPU: Rename fail.llvm.fptrunc.round.ll to llvm.fptrunc.round.err.ll (llvm#106452)
Also correct the suffix of the intrinsic.
Commit 53d95f3
-
[LTO] Make getImportType a proper function (NFC) (llvm#106450)
I'm planning to reduce the memory footprint of ThinLTO indexing by changing ImportMapTy. A look-up of the import type will involve data private to ImportMapTy, so it must be done by a member function of ImportMapTy. This patch turns getImportType into a member function so that a subsequent "real" change will just have to update the implementation of the function in place.
Commit eb9c49c
-
[DXIL] Don't generate per-variable guards for DirectX (llvm#106096)
Thread init guards are generated for local static variables when using the Microsoft CXX ABI. This ABI is also used for HLSL generation, but DXIL doesn't need the corresponding _Init_thread_header/footer calls and doesn't really have a way to handle them in its output targets. This modifies the language options when the target is DXIL so that these calls are not generated and an alternate guard-variable method, compatible with this usage, is used instead. Done to facilitate testing for llvm#89806, but isn't really related.
Commit 26c582b
-
Commit 18c79ca
-
Revert "[mlir][spirv] Add an argmax integration test with `mlir-vulkan-runner`" (llvm#106457)
Reverts llvm#106426. This caused failures on nvidia: https://lab.llvm.org/buildbot/#/builders/138/builds/2853
Commit 1bc7057
-
[clang][bytecode] Implement constexpr vector unary operators +, -, ~, ! (llvm#105996)
Implement constexpr vector unary operators +, -, ~ and !.
- Follow the current constant interpreter. All of our boolean operations on vector types should be '-1' for the 'truth' type.
- Move the following functions from `Sema` to `ASTContext`, because they are used in the new interpreter.
```C++
QualType GetSignedVectorType(QualType V);
QualType GetSignedSizelessVectorType(QualType V);
```
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Commit ee0d706
-
[OpenMP][NFC] Remove executable cases from declaration switch (llvm#106438)
The executable directives are handled earlier.
Commit 13fa78c
-
[RISCV] Remove effectively duplicate RUN lines from fixed-vectors-fp.ll. NFC
We had RUN lines with +v,+f and +v,+f,+d. +v implies +f and +d so these are equivalent.
Commit 431db18
-
Commit a7ba73b
Commits on Aug 29, 2024
-
[llvm-profdata] Enabled functionality to write split-layout profile (llvm#101795)
With the `-split_layout` flag in `llvm-profdata merge`, profiles with and without inlined functions are written into two different extbinary sections (along with their FuncOffsetTables). The section without inlined functions is marked with `SecFlagFlat` and is skipped by ThinLTO because it provides no useful info. The split layout feature was already implemented in SampleProfWriter, but previously there was no way to use it from llvm-profdata.
Commit 75e9d19
-
[NFC] Fix formatv() usage in preparation of validation (llvm#106454)
Fix several uses of formatv() that would be flagged as invalid by an upcoming change that will add additional validation to formatv().
Commit b75fe11
-
[MachineLoopInfo] Fix getLoopID to handle multi latches. (llvm#106195)
This patch also fixed `CodegenPrepare` to preserve loop metadata when merging blocks. This fixes issue llvm#102632
Commit 3a5c578
-
workflows/release-binaries: Enable flang builds on Windows (llvm#101344)
Flang for Windows depends on compiler-rt, so we need to enable it for the stage1 builds. This also fixes failures building the flang tests on macOS. Fixes llvm#100202.
Commit 8927576
-
[clang-format] Revert "[clang-format][NFC] Delete TT_LambdaArrow (#70519)" (llvm#105923)
This reverts commit e00d32a and adds a test for lambda arrow SplitPenalty. Fixes llvm#105480.
Commit 438ad9f
-
[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (llvm#96878)
This is the SimplifyCFG part of llvm#95515. In this PR, we support hoisting load/store with conditional faulting in `SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional branches. This is for cases like
```
void test (int a, int *b) {
  if (a)
    *b = a;
}
```
In the following patches, we will support the hoist in `SimplifyCFGOpt::hoistCommonCodeFromSuccessors`. That is for cases like
```
void test (int a, int *c, int *d) {
  if (a)
    *c = a;
  else
    *d = a;
}
```
Commit 87c86aa
-
[SLP] Fix the Vec lane overridden by the shuffle mask (llvm#106341)
Currently, SLP uses shuffle for the external user of `InsertElementInst` and iterates through the `InsertElementInst` chain to fill the mask with constant indices. However, it may override the original Vec lane. Using the original Vec lane is sufficient.
Commit 121fb2c
-
[clangd] Do not collect macros when clang-tidy checks call into the preprocessor (llvm#106329)
Fixes llvm#99617
Commit ee6961d
-
[LLDB][Minidumps] Read x64 registers as 64b and handle truncation in the file builder (llvm#106473)
This patch addresses a bug where `cs`/`fs` and other segmentation flags were being identified as having a type of `32b` and `64b` for `rflags`. In that case the register value was returning the fail value `0xF...` and this was corrupting some minidumps. Here we just read it as a 64b value and truncate it. In addition to that fix, I added a comparison of the registers from the live process against the loaded core to the generic minidump test. Previously only ARM registers were tested this way, which explains why this was not detected before.
Commit 82ebd33
-
Reapply "[mlir] NFC: fix dependence of (Tensor|Linalg|MemRef|Complex) dialects on LLVM Dialect and LLVM Core in CMake build (llvm#104832)" (llvm#105703)
Reapply the commit 43b5085 with additional fixes for building with BUILD_SHARED_LIBS=ON.
Commit 8bf69ce
-
[C++20] [Modules] Merge lambdas in source to imported lambdas (llvm#106483)
Close llvm#102721. Generally, the type of merged decls will be reused in ASTContext. But for a lambda in the import-then-include case, we can't determine its previous decl in the imported modules, because we can't decide its numbering before creating it, so we can't assign the previous decl before creating the type for it. We therefore have to assign the previous decl and the canonical type after creating it, which is unusual and slightly hacky.
Commit 55cdb3c
-
[RISCV] Fix v[f]slide1down.vx having VL changed (llvm#106110)
v[f]slide1down.vx uses VL to determine where the element is inserted into, so changing the VL changes the result. This fixes this by setting ActiveElementsAffectsResult, but it's overly conservative. We should relax this later by modelling that it's ok to change the mask, just not VL. Fixes llvm#106109
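The dependence on VL can be seen in a scalar model of the instruction. This is only a sketch: masking and tail elements are ignored, and the function name is hypothetical. It models just the first VL result elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of the body elements of vslide1down.vx (hypothetical helper,
// masking and tail policy ignored): element I takes Src[I + 1], and the
// scalar is written at index VL - 1.
std::vector<int64_t> slide1DownSketch(const std::vector<int64_t> &Src,
                                      int64_t Scalar, std::size_t VL) {
  assert(VL <= Src.size());
  std::vector<int64_t> Body(VL);
  for (std::size_t I = 0; I + 1 < VL; ++I)
    Body[I] = Src[I + 1];
  if (VL > 0)
    Body[VL - 1] = Scalar; // The insertion point moves with VL.
  return Body;
}
```

For a source of `{10, 20, 30, 40}` and scalar 99, VL = 4 gives `{20, 30, 40, 99}` while VL = 2 gives `{20, 99}`: element 1 changes from 30 to 99, so shrinking VL is not a safe transformation here.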
Commit 619efd7
-
[clang][RISCV] Remove `experimental` for vector crypto intrinsics (llvm#106359)
The C intrinsic spec is ratified: riscv-non-isa/rvv-intrinsic-doc#234.
Commit 051054e
-
[Attributor] Fix an issue that could potentially cause `AccessList` and `OffsetBins` out of sync (llvm#106187)
The implementation of `AAPointerInfo::RangeList::set_difference` doesn't consider the case where two ranges have the same offset but different sizes. This could cause `AccessList` and `OffsetBins` to go out of sync, because a range has already been updated in `AccessList` but is missing from `ToRemove`. I do have a reproducer, but the reproducer itself is 248kb and `llvm-reduce` can't reduce it further. Not sure how I can make a smaller reproducer. Fixes: SWDEV-479757.
Commit 572d2fd
-
workflows/release-tasks: Pass required secrets to all called workflows (llvm#106286)
Called workflows don't have access to secrets by default, so we need to explicitly pass secrets that we use.
Commit 9d81e7e
-
[mlir] fix missing LLVMDialect dependency for MLIRSCFToControlFlow
This is a fix-forward for 8bf69ce. The SCF-to-ControlFlow pass has an explicit LLVMDialect dependency.
Commit 95361cf
-
[RISCV] Fix a place that converts an immediate to MCRegister and back to immediate.
This dropped the upper 32 bits of the immediate, but I'm not sure it is ever non-zero.
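The truncation is the ordinary effect of passing a 64-bit value through a 32-bit field. A minimal sketch, assuming a register wrapper that stores a 32-bit id. `RegSketch` is hypothetical and only stands in for the role `MCRegister` plays here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper that stores a 32-bit register id.
struct RegSketch {
  uint32_t Id;
  explicit RegSketch(uint32_t Id) : Id(Id) {}
};

// Round-trips an immediate through the wrapper, as the old code effectively
// did: the narrowing conversion discards bits 32..63.
uint64_t roundTripThroughReg(uint64_t Imm) {
  RegSketch R(static_cast<uint32_t>(Imm));
  return R.Id;
}
```

Immediates that fit in 32 bits survive the round trip unchanged, which is consistent with the remark that the dropped bits may never have been non-zero in practice.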
Commit 62c5de3
-
Commit 2adc94c
-
[RISCV] Decompose LMUL > 1 reverses into LMUL * M1 vrgather.vv (llvm#104574)
As far as I'm aware, vrgather.vv is quadratic in LMUL on most microarchitectures today due to each output register needing to read from each input register in the group. For example, the reciprocal throughput for vrgather.vv on the spacemit-x60 is listed on https://camel-cdr.github.io/rvv-bench-results/bpi_f3 as:

| LMUL1 | LMUL2 | LMUL4 | LMUL8 |
|-------|-------|-------|-------|
| 4.0   | 16.0  | 64.0  | 256.1 |

Vector reverses are commonly emitted by the loop vectorizer and are lowered as vrgather.vvs, but since the loop vectorizer uses LMUL 2 by default they end up being quadratic. The output registers in a reverse only need to read from one input register though, so we can decompose this into LMUL * M1 vrgather.vvs to get linear performance. This gives a 0.43% runtime improvement on 526.blender_r at rva22u64_v O3 on the Banana Pi F3.
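The decomposition can be modelled on plain vectors: output register I of the reversed group is just the reverse of input register LMUL - 1 - I. The sketch below uses hypothetical names, and its cost functions are a rough reading of the throughput figures quoted above (about 4 * LMUL^2 for one wide gather), not a measured model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Reg = std::vector<int>;      // Elements of one M1 register.
using RegGroup = std::vector<Reg>; // LMUL consecutive registers.

// Reverse a whole register group with one M1-sized reverse per output
// register: each output register reads from exactly one input register.
RegGroup reverseGroupSketch(const RegGroup &In) {
  RegGroup Out(In.size());
  for (std::size_t I = 0; I < In.size(); ++I) {
    const Reg &Src = In[In.size() - 1 - I];
    Out[I].assign(Src.rbegin(), Src.rend());
  }
  return Out;
}

// Rough cost estimates derived from the quoted figures (assumptions):
// one LMUL-wide gather costs about 4 * LMUL^2, while the decomposed form
// costs about LMUL gathers at the LMUL1 rate.
double wideGatherCost(unsigned LMUL) { return 4.0 * LMUL * LMUL; }
double decomposedCost(unsigned LMUL) { return 4.0 * LMUL; }
```

Under this estimate an LMUL 2 reverse drops from about 16 to about 8, and the gap widens with LMUL. The estimate ignores any extra instructions the real lowering may need.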
Commit 3b64ede
-
[bugpoint] Fix bugpoint for LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off.
Building with -DLLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off should not prevent use of bugpoint plugins. This fix uses the approach implemented in llvm#101741.
Commit 8f96be9
-
[AVR] Fix 16-bit LDDs with immediate overflows (llvm#104923)
16-bit loads are expanded into a pair of 8-bit loads, so the maximum offset of such 16-bit loads must be 62, not 63.
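The bound follows from the 6-bit displacement of `LDD` (0 to 63): a 16-bit load at offset q touches q and q + 1, and both must be encodable. A minimal sketch with hypothetical names:

```cpp
#include <cassert>

// LDD encodes a 6-bit displacement, so valid byte offsets are 0..63.
constexpr int MaxLddDisplacement = 63;

// Hypothetical check: a 16-bit load at Offset expands into byte loads at
// Offset and Offset + 1, and both displacements must fit.
constexpr bool fits16BitLdd(int Offset) {
  return Offset >= 0 && Offset + 1 <= MaxLddDisplacement;
}

static_assert(fits16BitLdd(62), "62 and 63 both fit");
static_assert(!fits16BitLdd(63), "the high byte would need offset 64");
```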
Commit c7a4efa
-
[IPSCCP] Intersect attribute info for interprocedural args (llvm#106397)
IPSCCP can currently return worse results than SCCP for arguments that are tracked interprocedurally, because information from attributes is not used for them. Fix this by intersecting in the attribute information when propagating lattice values from calls.
Commit 7f59264
-
[lldb][lldb-dap][test] Enable more tests on Windows
These tests "just work" on our Windows On Arm machine.
Commit c954306
-
[C++20] [Modules] Don't insert class not in named modules to PendingEmittingVTables (llvm#106501)
Close llvm#102933. The root cause of the issue is an oversight in llvm#102287: I didn't notice that PendingEmittingVTables should only accept classes in named modules.
Commit 47615ff
-
[clang-repl] Fix clang-repl for LLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off.
clang-repl should still work when LLVM is built with -DLLVM_ENABLE_EXPORTED_SYMBOLS_IN_EXECUTABLES=Off. This fix uses the approach implemented in llvm#101741. rdar://134910110
Commit e5b55e6
-
[C++20] [Modules] Embed all source files for C++20 Modules (llvm#102444)
Close llvm#72383 The implementation rationale is, I don't want to pass `-fmodules-embed-all-files` all the time since we can't test it in lit tests (we're using `clang_cc1`). So I tried to set it in FrontendActions for modules.
Commit 2eeeff8
-
[Driver] Add -mbranch-protection to ARM and AArch64 multilib flags (llvm#106391)
This adds the `-mbranch-protection` command line option to the set of flags used by the multilib selection for ARM and AArch64 targets.
Commit b822b69
-
[mlir] Apply ClangTidyPerformance finding (NFC).
Use const reference for loop variable.
Commit b7981a7
-
[LLD][COFF] Add support for range extension thunks for ARM64EC targets. (llvm#106289)
Thunks themselves are the same as regular ARM64 thunks; they just need to report the correct machine type. When processing the code, we also need to use the current chunk's machine type instead of the global one: we don't want to treat x86_64 thunks as ARM64EC, and we need to report the correct machine type in hybrid binaries.
Commit efad561
-
[llvm][Docs] Update TestSuiteGuide.md (llvm#79613)
Update svn to git & virtualenv to venv
Commit f9ee9f5
-
[lldb][lldb-dap][test] Skip logpoint test on Windows again
This one snuck into the previous patch. The test program needs updating if it's ever going to work on Windows.
Commit ae34257
-
[AMDGPU] Graph-based Module Splitting Rewrite (llvm#104763)
Major rewrite of the AMDGPUSplitModule pass in order to better support it long-term. Highlights: - Removal of the "SML" logging system in favor of just using CL options and LLVM_DEBUG, like any other pass in LLVM. - The SML system started from good intentions, but it was too flawed and messy to be of any real use. It was also a real pain to use and made the code more annoying to maintain. - Graph-based module representation with DOTGraph printing support - The graph represents the module accurately, with bidirectional, typed edges between nodes (a node usually represents one function). - Nodes are assigned IDs starting from 0, which allows us to represent a set of nodes as a BitVector. This makes comparing 2 sets of nodes to find common dependencies a trivial task. Merging two clusters of nodes together is also really trivial. - No more defaulting to "P0" for external calls - Roots that can reach non-copyable dependencies (such as external calls) are now grouped together in a single "cluster" that can go into any partition. - No more defaulting to "P0" for indirect calls - New representation for module splitting proposals that can be graded and compared. - Graph-search algorithm that can explore multiple branches/assignments for a cluster of functions, up to a maximum depth. - With the default max depth of 8, we can create up to 256 propositions to try and find the best one. - We can still fall back to a greedy approach upon reaching max depth. That greedy approach uses almost identical heuristics to the previous version of the pass. All of this gives us a lot of room to experiment with new heuristics or even entirely different splitting strategies if we need to. For instance, the graph representation has room for abstract nodes, e.g. if we need to represent some global variables or external constraints. We could also introduce more edge types to model other type of relations between nodes, etc. 
I also designed the graph representation & the splitting strategies to be as fast as possible, and it seems to have paid off. Some quick tests showed that we spend pretty much all of our time in the CloneModule function, with the actual splitting logic being <1% of the runtime.
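The node-set idea in the commit message above (node IDs starting from 0, sets stored as a bit vector) can be pictured roughly as follows. This is a minimal sketch, not the pass's code: `NodeSet` is a hypothetical stand-in for `llvm::BitVector`, and the method names `intersect`, `merge`, and `count` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::BitVector: one bit per node ID.
class NodeSet {
public:
  explicit NodeSet(std::size_t numNodes)
      : words((numNodes + 63) / 64, 0), size(numNodes) {}

  // Add the node with the given ID to the set.
  void set(std::size_t id) {
    assert(id < size && "node ID out of range");
    words[id / 64] |= (std::uint64_t{1} << (id % 64));
  }

  // Return true if the node with the given ID is in the set.
  bool test(std::size_t id) const {
    assert(id < size && "node ID out of range");
    return ((words[id / 64] >> (id % 64)) & 1) != 0;
  }

  // Nodes present in both sets, e.g. the common dependencies of two roots.
  NodeSet intersect(const NodeSet &other) const {
    assert(size == other.size && "sets must cover the same graph");
    NodeSet result(size);
    for (std::size_t i = 0; i < words.size(); ++i)
      result.words[i] = words[i] & other.words[i];
    return result;
  }

  // Merge another cluster into this one (set union).
  void merge(const NodeSet &other) {
    assert(size == other.size && "sets must cover the same graph");
    for (std::size_t i = 0; i < words.size(); ++i)
      words[i] |= other.words[i];
  }

  // Number of nodes in the set.
  std::size_t count() const {
    std::size_t total = 0;
    for (std::uint64_t w : words)
      for (; w != 0; w &= w - 1) // clear the lowest set bit each iteration
        ++total;
    return total;
  }

private:
  std::vector<std::uint64_t> words;
  std::size_t size;
};
```

Both operations are a single pass of word-wise AND or OR, which is presumably why the message calls them trivial.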
Commit c9b6e01
[mlir][ArmSME] Merge consecutive `arm_sme.intr.zero` ops (llvm#106215)
This merges consecutive SME zero intrinsics within a basic block, which avoids the backend eventually emitting multiple zero instructions when it could just use one. Note: This kind of peephole optimization could be implemented in the backend too.
Commit e37d6d2
[AMDGPU][llvm-split] Remove declarations-debug
Test didn't have a FileCheck line and is obsolete after llvm#104763
Commit 31684c6
Commit b9f4afa
[AMDGPU][llvm-split] Make declarations test more stable
Delete the previous files if present, to ensure it won't fail if the output directory of the tests wasn't cleared.
Commit 575be3e
AMDGPU/NewPM Port GCNDPPCombine to NPM (llvm#105816)
Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
Commit fdca2c3
[Flang][OpenMP] Don't expect block arguments using early privatization (llvm#105842)
There are some spots where all symbols to privatize collected by a `DataSharingProcessor` instance are expected to have corresponding entry block arguments associated regardless of whether delayed privatization was enabled. This can result in compiler crashes if a `DataSharingProcessor` instance created with `useDelayedPrivatization=false` is queried in this way. The solution proposed by this patch is to provide another public method to query specifically delayed privatization symbols, which will either be empty or point to the complete set of symbols to privatize accordingly.
Commit 60e9fb9
Commit c28b84e
[compiler-rt][RISCV][NFC] Update code_model with latest spec (llvm#106498)
The spec can be found here: riscv-non-isa/riscv-c-api-doc#74. This patch updates the following symbols:
```
mVendorID -> mvendorid
mArchID -> marchid
mImplID -> mimpid
```
Commit 2505546
PPC: Custom lower ppcf128 is_fpclass if is_fpclass is custom (llvm#105540)
Unfortunately expandIS_FPCLASS is called directly in SelectionDAGBuilder depending on whether IS_FPCLASS is custom or not. This helps avoid ppc test regressions in a future patch where the custom lowering would be bypassed.
Commit 911b960
DAG: Check if is_fpclass is custom, instead of isLegalOrCustom (llvm#105577)
For some reason, isOperationLegalOrCustom is not the same as isOperationLegal || isOperationCustom. Unfortunately, it checks if the type is legal, which makes it useless for custom lowering on non-legal types (which is always the case for ppcf128). Really the DAG builder shouldn't be going to expand this in the builder; it makes it difficult to work with. It's only here to work around the DAG requiring legal integer types the same size as the FP type after type legalization.
Commit 7b7b0b9
[analyzer] Add missing include <unordered_map> to llvm/lib/Support/Z3Solver.cpp (llvm#106410)
Resolves llvm#106361. Adding #include <unordered_map> to llvm/lib/Support/Z3Solver.cpp fixes compilation errors for homebrew build on macOS with Xcode 14. https://github.com/Homebrew/homebrew-core/actions/runs/10604291631/job/29390993615?pr=181351 shows that this is resolved when the include is patched in (Linux CI failure is due to unrelated timeout).
Commit fcb3a04
[X86, MC] Recognize OSIZE=64b when EVEX.W = 1, EVEX.pp = 01 (llvm#103816)
In the legacy space, if both the 66 prefix and REX.W=1 are present, the REX.W=1 takes precedence and makes OSIZE=64b. EVEX map 4 inherits this convention, with EVEX.pp=01 and EVEX.W playing the roles of the 66 prefix and REX.W. So if EVEX.pp=00, the OSIZE can only be 64b or 32b, depending on whether EVEX.W=1 or not. But if EVEX.pp=01, then OSIZE is either 64b or 16b depending on whether EVEX.W=1 or not.
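The rule stated in this commit message can be written as a small decision function. This is a sketch of the rule as described, not the MC layer's code: `operandSizeBits` and its parameters are hypothetical names, and only the two `pp` values discussed in the message (00 and 01) are handled.

```cpp
#include <cassert>

// Hypothetical helper: operand size in bits for an EVEX map 4 instruction,
// following the rule described in the commit message.
//   evexW  : the EVEX.W bit (plays the role of REX.W)
//   evexPP : the EVEX.pp field; 0b01 plays the role of the 66 prefix
inline unsigned operandSizeBits(bool evexW, unsigned evexPP) {
  assert(evexPP <= 1 && "this sketch only covers pp=00 and pp=01");
  if (evexW)
    return 64;                  // W=1 takes precedence over pp=01
  return evexPP == 1 ? 16 : 32; // pp=01 selects 16b, pp=00 selects 32b
}
```

Checking `evexW` first mirrors the legacy convention that REX.W=1 overrides the 66 prefix.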
Commit 36b7c30
[SLP] Move some of X86 tests to common directory (llvm#106401)
Some of the tests from X86 directory can be generalized for AArch64 to improve its coverage.
Commit ddbc8f3
[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (llvm#105524)
Fixes: llvm#104695
This patch adds the is_stmt flag to line table entries for the first instruction with a non-0 line location in each basic block, to ensure that it will be used for stepping even if the last instruction in the previous basic block had the same line number; this is important for cases where the new BB is reachable from BBs other than the preceding block.
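The behaviour described in this commit message can be modelled on a toy representation. This is a sketch under stated assumptions and not the DWARF emission code: `Instr`, `Block`, and `markIsStmt` are hypothetical names, and the baseline rule assumed here is that an entry is otherwise marked is_stmt only when its line differs from the previous non-0 line.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model: an instruction carries a source line (0 = no line)
// and the is_stmt flag that would go into its line table entry.
struct Instr {
  unsigned line;
  bool isStmt;
};
using Block = std::vector<Instr>;

// Mark is_stmt on an instruction when its line differs from the previous
// non-0 line (assumed baseline), and additionally on the first non-line-0
// instruction of every basic block (the rule described above).
inline void markIsStmt(std::vector<Block> &blocks) {
  unsigned prevLine = 0;
  for (Block &bb : blocks) {
    bool seenFirstInBlock = false;
    for (Instr &inst : bb) {
      if (inst.line == 0)
        continue; // line-0 instructions never get is_stmt in this model
      bool lineChanged = inst.line != prevLine;
      inst.isStmt = lineChanged || !seenFirstInBlock;
      seenFirstInBlock = true;
      prevLine = inst.line;
    }
  }
}
```

With two consecutive blocks that both sit on line 5, the second block's first non-line-0 instruction is still marked, which is the case the message says matters when that block is reachable from elsewhere.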
Commit 3ef37e2
[MLIR][Flang][OpenMP] Remove omp.parallel from loop wrapper ops (llvm#105833)
This patch updates the `omp.parallel` operation according to the results of the discussion in [this RFC](https://discourse.llvm.org/t/rfc-disambiguation-between-loop-and-block-associated-omp-parallelop/79972). It is removed from the set of loop wrapper operations, changing the expected MLIR representation for composite `distribute parallel do/for` into the following:
```mlir
omp.parallel {
  ...
  omp.distribute {
    omp.wsloop {
      omp.loop_nest ... {
        ...
      }
      omp.terminator
    }
    omp.terminator
  }
  ...
  omp.terminator
}
```
MLIR verifiers for operations impacted by this representation change are updated, as well as related tests. The `LoopWrapperInterface` is also updated, since it's no longer representing an optional "role" of an operation but a mandatory set of restrictions instead.
Commit 2784060
[Flang][OpenMP] Move loop privatization out of dispatch (llvm#106066)
This patch moves the creation of `DataSharingProcessor` instances for loop constructs out of `genOMPDispatch()` and into their corresponding codegen functions. This is a necessary first step to enable a proper handling of privatization on composite constructs. Some tests are updated due to a change of order between clause processing and privatization.
Commit 0f206b1
[AArch64] optimise SVE cvt intrinsics with no active lanes (llvm#104809)
This patch extends llvm#73964 and optimises SVE cvt intrinsics away when predicate is zero.
Commit 113806d
[Flang][OpenMP] DISTRIBUTE PARALLEL DO lowering (llvm#106207)
This patch adds PFT to MLIR lowering support for `distribute parallel do` composite constructs.
Commit 9c8ce5f
[Flang][OpenMP] DISTRIBUTE PARALLEL DO SIMD lowering (llvm#106211)
This patch adds PFT to MLIR lowering support for `distribute parallel do simd` composite constructs.
Commit 57726c4
[SLP] Fix a crash when requesting the cost for buildvector cmp nodes types.
Need to use the original cmp type i1 when estimating the cost for the buildvector node, not its operand types, to prevent a compiler crash upon TTI cost estimation.
Commit fdf72c9
Commit c3cb273
[DebugInfo][NFC] Make is_stmt-at-block-start test X86-specific
Fixes failure on the llvm-clang-aarch64-darwin buildbot: https://lab.llvm.org/buildbot/#/builders/190/builds/4660/ The test mentioned does not rely on any unique property of X86, but does rely on the layout of the basic blocks produced by llc, which varies between targets. Although the test could be duplicated for other targets, it seems unnecessary since the behaviour being tested is not target-specific.
Commit 616f7d3
[LV] Use SCEV to analyze second operand for cost query.
Improve operand analysis using SCEV for cost purposes. This fixes a divergence between legacy and VPlan-based cost-modeling after 533e6bb. Fixes llvm#106248.
Commit 0a272d3
Revert "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (llvm#105524)"
Reverted (along with the NFC followup fix) due to buildbot failure: https://lab.llvm.org/buildbot/#/builders/160/builds/4142
This reverts commit 3ef37e2, and commit 616f7d3.
Commit 926f097
[LAA] Add test cases where evaluating AddRecs at symbolic max BTC wraps.
Commit 606a934
Commit 50515db
Commit 9167667
[clang][bytecode] Properly diagnose non-const reads (llvm#106514)
If the global variable is constant (but not constexpr), we need to diagnose, but keep evaluating.
Commit cb608cc
Commit 25c9410
[InstCombine][X86] Only demand used bits for VPERMILPD/VPERMILPS mask values
VPERMILPS lower bits0-3 (to index per-lane i32/f32 0-3)
VPERMILPD uses bit1 (to index per-lane i64/f64 0-1)
Use SimplifyDemandedBits to ignore anything touching the remaining bits.
Part of llvm#106413
Commit d57c046
Restrict LLVM_TARGETS_TO_BUILD in Windows release packaging (llvm#106059)
When including all targets, some files become too large for the NSIS installer to handle. Fixes llvm#101994
Commit 2a28df6
[lldb][lldb-dap][test] Enable Launch tests
Add Windows include equivalents for includes and shell command.
Commit b2a820f
Restore missing link in CodeOfConduct.rst (llvm#106385)
Link restored from the original policy outlined here https://discourse.llvm.org/t/code-of-conduct-changes-related-to-llvm-project-policy-changes/64197
Commit 0a48482
[libc][x86] Use prefetch for write for memcpy (llvm#90450)
Currently when `LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING` is set we prefetch memory for read on the source buffer. This patch adds prefetch for write on the destination buffer.
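The idea of prefetching the source for read and the destination for write can be illustrated with the GCC/Clang `__builtin_prefetch` builtin, whose second argument selects read (0) or write (1). This is a minimal sketch and not the libc implementation: `copyWithPrefetch`, the 64-byte block size, and the prefetch distance are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical block-wise copy that prefetches ahead on both buffers.
// Requires a compiler that provides __builtin_prefetch (GCC or Clang).
inline void copyWithPrefetch(char *dst, const char *src, std::size_t n) {
  assert(dst != nullptr && src != nullptr);
  constexpr std::size_t kBlock = 64;         // assumed cache-line size
  constexpr std::size_t kAhead = 4 * kBlock; // assumed prefetch distance
  std::size_t i = 0;
  for (; i + kBlock <= n; i += kBlock) {
    if (i + kAhead < n) {
      // Prefetch for read on the source buffer.
      __builtin_prefetch(src + i + kAhead, /*rw=*/0, /*locality=*/3);
      // Prefetch for write on the destination buffer.
      __builtin_prefetch(dst + i + kAhead, /*rw=*/1, /*locality=*/3);
    }
    std::memcpy(dst + i, src + i, kBlock);
  }
  // Copy the tail that does not fill a whole block.
  std::memcpy(dst + i, src + i, n - i);
}
```

Prefetches are only hints, so the copy produces the same bytes with or without them; any benefit depends on the hardware and on the distance chosen.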
Commit 73ef397
[include-cleaner] Mark RecordDecls referenced in UsingDecls as explicit (llvm#106430)
We were reporting ambiguous references from using declarations, as the user can be depending on different overloads of a function just because they are visible in the TU. This doesn't apply to records or primary templates, as the declaration being referenced in such cases is unambiguous; the ambiguity applies to specializations though. Hence this patch returns an explicit reference to record decls and primary templates of those.
Commit acff429
[SPARC][IAS] Add `illtrap` alias for `unimp` (llvm#105928)
This follows Solaris behavior of allowing both mnemonics all the time. Fixes llvm#105639.
Commit 7955760
Commit ba52a09
[RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (llvm#105671)
Fix llvm#105571 which demonstrates an end() iterator dereference when performing a non-empty splice to end() from a region that ends at Src::end(). Rather than calling Instruction::adoptDbgRecords from Dest, create a marker (which takes an iterator) and absorbDebugValues onto that. The "absorb" variant doesn't clean up the source marker, which in this case we know is a trailing marker, so we have to do that manually.
Commit 43661a1
[NFC][AMDGPU] Autogenerate tests for uniform i32 promo in ISel (llvm#106382)
Many tests were easy to update, but these are quite big and I think it's better to autogenerate them to see the difference well.
Commit 1f8f2ed
[clang][bytecode] Diagnose member calls on deleted blocks (llvm#106529)
This requires a bit of restructuring of ctor calls when checking for a potential constant expression.
Commit df11ee2
[LoopVectorize][X86] amdlibm-calls.ll - cleanup test checks for 2/4/8/16 vector widths
This cleans up the existing tests and shows the gaps in the test checks (for instance we're often testing VF4 + VF16 but not VF8 even though amdlibm supports it).
Commit c57abc6
[LoopVectorize][X86] amdlibm-calls.ll - add additional 2/4/8/16 vector widths test checks
This should cover most amdlibm functions, but not every VF combo has been added yet (e.g. 2f32/16f64 often vectorises to the llvm intrinsic for that vector type).
Commit 2f95298
[lldb][lldb-dap] Enable more tests on Windows
These few worked without changes.
Commit f7d6dfa
[Analysis] Guard logf128 cst folding (llvm#106543)
LLVM has a CMake variable to control whether to consider logf128 constant folding which libAnalysis ignores. This patch changes the logf128 check to rely on the global LLVM_HAS_LOGF128 setting made in config-ix.cmake.
Commit 56152fa
Reapply "[DebugInfo][DWARF] Set is_stmt on first non-line-0 instruction in BB (llvm#105524)"
Fixes the previous buildbot error by adding an explicit triple to the test, ensuring that llc can produce a valid object file.
This reverts commit 926f097.
Commit 5fef40c
Revert "[flang] Warn when F128 is unsupported" (llvm#106561)
Reverts llvm#102147
It seems some systems which should support F128 are wrongly detected as not supporting it. This might be due to checking `LDBL_MANT_DIG` instead of `__LDBL_MANT_DIG__`. I will investigate.
Commit 8ae877a
Commit 9edd998
Commits on Sep 24, 2024
Commit e9c77eb