[AutoBump] Merge with fe0dee4d (Jun 10) (62) #323
Commits on Jun 7, 2024
Revert "[X86] Assign AVX10_1 feature priority to align with gcc. (llvm#94557)" (llvm#94730)
This reverts commit d843c02.
Commit c007883
[memprof] Use std::move in ContextEdge::ContextEdge (NFC) (llvm#94687)
Since the constructor of ContextEdge takes ContextIds by value, we should move it to the corresponding member variable as suggested by clang-tidy's performance-unnecessary-value-param. While we are at it, this patch updates a couple of callers. To avoid the ambiguity in the evaluation order among the constructor arguments, I'm calling computeAllocType before calling the constructor.
Commit b7d976d
[ORC] Switch ExecutionSession::ErrorReporter to use unique_function.
This allows the ReportError functor to hold move-only types.
Commit 4a7b800
[LoongArch] Set isReMaterializable on LU{12,32,52}I.D/ADDI.D and {X}ORI instructions (llvm#94552)
Commit f21c2fa
Commit d224a03
[LoongArch] Add a pass to rewrite rd to r0 for non-computational instrs whose return values are unused (llvm#94590)
This patch adds a peephole pass `LoongArchDeadRegisterDefinitions`. It rewrites `rd` to `r0` when `rd` is marked as dead. This may improve register allocation and reduce pipeline hazards on CPUs without register renaming and out-of-order execution.
Commit 240512c
[clang][Interp][NFC] Add GetPtrFieldPop opcode
And change the previous GetPtrField to only peek() the base pointer. We can get rid of a whole bunch of DupPtr ops this way.
Commit c15b867
[analyzer][NFC] Factor out NoOwnershipChangeVisitor (llvm#94357)
In preparation for adding essentially the same visitor to StreamChecker, this patch factors this visitor out to a common header. I'll be the first to admit that the interface of these classes is not terrific, but it is rather tightly held back by its main technical debt, which is NoStoreFuncVisitor, the main descendant of NoStateChangeVisitor.
Change-Id: I99d73ccd93a18dd145bbbc83afadbb432dd42b90
Commit e622996
Commit be18daa
[docs] Fix benchmarking tips (llvm#94724)
This PR fixes an incorrect line for setting scaling_governor in the benchmarking tips.
Commit 8ef5c98
Commit 36bc741
[clang][Interp] Remove StorageKind limitation in Pointer assign operators
It's not strictly needed and did cause some test failures.
Commit 1c0063b
Commit ac40463
[MLIR] Translate DIStringType. (llvm#94480)
This PR handles translation of DIStringType. Mostly mechanical changes to translate DIStringType to/from DIStringTypeAttr. The 'stringLength' field is 'DIVariable' in DIStringType. As there was no `DIVariableAttr` previously, it has been added to ease the translation.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Commit 4f320e6
Commit 5f1adf0
[flang][Transforms][NFC] Remove boilerplate from vscale range pass (llvm#94598)
Use tablegen to generate the pass constructor. This pass is supposed to add function attributes so it does not need to operate on other top level operations.
Commit 8f11649
Commit 0749b01
Commit 3453ded
[ARM] Add NEON support for ISD::ABDS/ABDU nodes. (llvm#94504)
As noted on llvm#94466, NEON has ABDS/ABDU instructions but only handles them via intrinsics, plus some VABDL custom patterns. This patch flags basic ABDS/ABDU for neon types as legal and updates all tablegen patterns to use abds/abdu instead. Fixes llvm#94466
Commit c0b4685
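For readers unfamiliar with the ABDS/ABDU nodes, the operation they stand for is an absolute difference with signed or unsigned operands. The following is a minimal Python sketch of that arithmetic on single lanes, not the ARM backend code; the function names, the 8-bit default lane width, and the choice to return the result as an unsigned bit pattern are assumptions made for illustration.

```python
def abdu(a: int, b: int, bits: int = 8) -> int:
    """Unsigned absolute difference of two `bits`-wide lanes."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    return max(a, b) - min(a, b)


def abds(a: int, b: int, bits: int = 8) -> int:
    """Signed absolute difference; result is returned as a `bits`-wide bit pattern."""
    mask = (1 << bits) - 1

    def sign_extend(v: int) -> int:
        # Interpret the low `bits` bits of v as a two's complement number.
        v &= mask
        return v - (1 << bits) if v >> (bits - 1) else v

    sa, sb = sign_extend(a), sign_extend(b)
    return (max(sa, sb) - min(sa, sb)) & mask
```

The same bit pattern gives different results under the two interpretations: with 8-bit lanes, 250 is 250 when unsigned but -6 when signed, so `abdu(250, 5)` and `abds(250, 5)` differ.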
Commit 0d1b367
[DebugInfo] Add DW_OP_LLVM_extract_bits (llvm#93990)
This operation extracts a number of bits at a given offset and sign or zero extends them, which is done by emitting it as a left shift followed by a right shift. This is being added for use in clang for C++ structured bindings of bitfields that have offset or size that aren't a byte multiple. A new operation is being added, instead of shifts being used directly, as it makes correctly handling it in optimisations (which will be done in a later patch) much easier.
Commit 1721c14
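The shift-based extraction described in the DW_OP_LLVM_extract_bits message can be modelled in a few lines. This is a minimal Python sketch of the arithmetic only, not the LLVM implementation: the 64-bit width, the function name `extract_bits`, and its parameters are assumptions made for illustration.

```python
WIDTH = 64  # assumed register width for this sketch
MASK = (1 << WIDTH) - 1


def extract_bits(value: int, offset: int, size: int, signed: bool) -> int:
    """Extract `size` bits starting at bit `offset` with a left then right shift.

    Requires size > 0 and offset + size <= WIDTH.
    """
    # Left shift so the top bit of the field lands in the top bit of the register.
    shifted = (value << (WIDTH - offset - size)) & MASK
    if signed and shifted >> (WIDTH - 1):
        # Reinterpret as negative so that >> below behaves as an arithmetic shift.
        shifted -= 1 << WIDTH
    # Right shift moves the field down to bit 0, extending with sign or zero bits.
    return shifted >> (WIDTH - size)
```

For a 3-bit field holding the pattern `101`, the unsigned extraction yields 5 and the signed extraction yields -3, which is the distinction a bitfield of signed type needs.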
Add checks before hoisting out in loop pipelining (llvm#90872)
Currently, during a loop pipelining transformation, operations may be hoisted out without any checks on the loop bounds, which leads to incorrect transformations and unexpected behaviour. The following [issue](llvm#90870) describes the problem more extensively, including an example. The proposed fix adds checks on the loop bounds beforehand and then applies the maximum hoisting.
Commit 192cd68
Commit 5d6acf8
[clang][Interp] Fix refers_to_enclosing_variable_or_capture DREs
They do not count into lambda captures, so visit them lazily.
Commit 3a31eae
[SimplifyCFG] Remove bogus UTC line from test (NFC)
The check lines in this test were clearly not generated by UTC.
Commit 1934c1a
[SimplifyCFG] Regenerate switch to lookup tests (NFC)
Regenerate these with --check-globals. The manual global CHECKS get dropped during regeneration otherwise. Annoyingly UTC insists on putting the globals directly before the first function, so the first comment is a bit out of place now.
Commit 8719cb8
[mlir][vector] Add n-d deinterleave lowering (llvm#94237)
This patch implements the lowering of vector deinterleave for n-dimensional vectors. The process unrolls the n-d vector into a series of one-dimensional vectors, and the deinterleave operation is then applied to each of them.

From:
```
%0, %1 = vector.deinterleave %a : vector<2x8xi8> -> vector<2x4xi8>
```
To:
```
%cst = arith.constant dense<0> : vector<2x4xi32>
%0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
%res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
%1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
%2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
%3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
%res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
%4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
%5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
...etc.
```
Commit b87a80d
[ARM] r11 is reserved when using -mframe-chain=aapcs (llvm#86951)
When using the -mframe-chain=aapcs or -mframe-chain=aapcs-leaf options, we cannot use r11 as an allocatable register, even if -fomit-frame-pointer is also used. This is so that r11 will always point to a valid frame record, even if we don't create one in every function.
Commit 1a52392
[DAG] Always allow folding XOR patterns to ABS pre-legalization (llvm#94601)
Removes residual ARM handling for vXi64 ABS nodes to prevent infinite loops.
Commit af3ffff
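As background for the XOR-to-ABS fold, one well-known branch-free way to write an absolute value is `(x + s) ^ s`, where `s` is `x` arithmetically shifted right by the bit width minus one. The Python sketch below only checks that identity on 32-bit values; it is an assumption that this is representative of the patterns the combiner recognizes, and the names `abs_via_xor` and `to_signed` are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(v: int) -> int:
    """Wrap an integer to a BITS-wide two's complement value."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def abs_via_xor(x: int) -> int:
    """Compute abs(x) for a BITS-wide signed x without branching."""
    sign = x >> (BITS - 1)  # 0 for non-negative x, -1 for negative x
    return to_signed((x + sign) ^ sign)
```

Note that the most negative value maps to itself, because its magnitude does not fit in the signed range.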
fix(mlir/**.py): fix comparison to None (llvm#94019)
From PEP 8 (https://peps.python.org/pep-0008/#programming-recommendations):
> Comparisons to singletons like None should always be done with is or is not, never the equality operators.

Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
Commit fd45dcc
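The PEP 8 recommendation matters because `==` dispatches to a user-defined `__eq__`, while `is` compares object identity. A small sketch follows; the class `AlwaysEqual` is hypothetical and exists only to make the difference visible.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Discouraged form: the result depends on obj's __eq__ implementation.
equals_none = (obj == None)  # noqa: E711

# Recommended form: identity check, which cannot be overridden.
is_none = (obj is None)
```

Here `equals_none` is `True` even though `obj` is clearly not `None`, while `is_none` is `False`.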
[ARM] Add support for Cortex-R52+ (llvm#94633)
Cortex-R52+ is an Armv8-R AArch32 CPU. Technical Reference Manual for Cortex-R52+: https://developer.arm.com/documentation/102199/latest/
Commit 917afa8
Commit 537165b
[clang][test] Skip interpreter value test on Arm 32 bit
llvm#89811 caused this test to fail, somehow. I think it may not be at fault, but actually be exposing some existing undefined behaviour, see llvm#94741. Skipping this for now to get the bots green again.
Commit 54c5dbe
Commit d3e531c
Commit 6fe5428
[clang][SPIR-V] Add support for AMDGCN flavoured SPIRV (llvm#89796)
This change seeks to add support for vendor flavoured SPIRV - more specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that carries some extra bits of information that are only usable by AMDGCN targets, forfeiting absolute genericity to obtain greater expressiveness for target features:
- AMDGCN inline ASM is allowed/supported, under the assumption that the [SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc) extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program address space and the alloca address space, the latter under the assumption that the [SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc) extension is enabled/used, in which case the extant SPIRV datalayout string would lead to function pointers pointing to the private address space, which would be wrong.

Existing AMDGCN tests are extended to cover this new target. It is currently dormant / will require some additional changes, but I thought I'd rather put it up for review to get feedback as early as possible. I will note that an alternative option is to place this under AMDGPU, but that seems slightly less natural, since this is still SPIRV, albeit relaxed in terms of preconditions & constrained in terms of postconditions, and only guaranteed to be usable on AMDGCN targets (it is still possible to obtain pristine portable SPIRV through usage of the flavoured target, though).
Commit 88e2bb4
[BOLT][NFC] Infallible fns return void (llvm#92018)
Both `reverseBranchCondition` and `replaceBranchTarget` return a success boolean. But all-but-one caller ignores the return value, and the exception emits a fatal error on failure. Thus, just return nothing.
Commit 3fefb3c
[CodeGen][SDAG] Remove CombinedNodes SmallPtrSet (llvm#94609)
This "small" set grows quite large and it's more performant to store whether a node has been combined before in the node itself. As this information is only relevant for nodes that are currently not in the worklist, add a second state to the CombinerWorklistIndex (-2) to indicate that a node is currently not in a worklist, but was combined before. This brings a substantial performance improvement.
Commit 74d62c2
[clang][Interp] Check ConstantExpr results for initialization
They need to be fully initialized, similar to global variables.
Commit 9ece3eb
Commit 9eb8a13
[clang][Interp] Limit lambda capture lazy visiting to actual captures
Check this by looking at the VarDecl.
Commit b8cc85b
[serialization] no transitive decl change (llvm#92083)
Follow-up to llvm#86912.

The motivation of the patch series is that, for a module interface unit `X`, when the modules that `X` depends on change in ways that are not relevant to `X`, we hope the BMI of `X` won't change. For this specific patch, we hope that irrelevant declaration changes won't change the BMI of `X`.

**However**, I found the patch itself is not very useful in practice, since adding or removing declarations will change the state of identifiers and types in most cases. That said, for the simplest example,

```
// partA.cppm
export module m:partA;

// partA.v1.cppm
export module m:partA;
export void a() {}

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB;
export module onlyUseB;
import m;
export inline void onluUseB() {
    b();
}
```

the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`, since `partA.v1.cppm` introduces new identifiers and types (the function prototype). So in this patch, we have to write the tests as:

```
// partA.cppm
export module m:partA;
export int getA() { ... }
export int getA2(int) { ... }

// partA.v1.cppm
export module m:partA;
export int getA() { ... }
export int getA(int) { ... }
export int getA2(int) { ... }

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB;
export module onlyUseB;
import m;
export inline void onluUseB() {
    b();
}
```

so that the newly introduced declaration `int getA(int)` doesn't introduce new identifiers and types, and the BMI of `onlyUseB` can stay unchanged.

While it doesn't look great, the patch should be the base of the patch to erase the transitive change for identifiers and types, since I don't know how we can introduce new types and identifiers without introducing new declarations. Given how tight the relationship between declarations, types and identifiers is, I think we can only reach the ideal state after we have done the series for all three entities.

The design of the patch is similar to llvm#86912, which extends the 32-bit DeclID to 64 bits and uses the higher bits to store the module file index and the lower bits to store the local Decl ID. A slight difference is that we only use 48 bits to store the new DeclID, since we try to use the higher 16 bits to store the module ID in the prefix of the Decl class. Previously, we used 32 bits to store the module ID and 32 bits to store the DeclID. I don't want to allocate additional space, so I tried to keep the total at 64 bits.

A potentially interesting thing here is the relationship between the module ID and the module file index. I feel we can get the module file index from the module ID, but I didn't prove it or implement it, since I want to keep the patch itself as small as possible. We can do it in the future if we want.

Another change in the patch is the new concept of a Decl Index, which means the index into the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration was simply the Decl ID minus PREDEFINED_DECL_NUMs, so there were some places where the two were used ambiguously. This patch tries to split these two concepts.

As llvm#86912 did, the change will increase the on-disk PCM file sizes. As declaration IDs may be the most numerous IDs in the PCM file, this can have the biggest impact on the size. In my experiments, this change brings a 6.6% increase in the on-disk PCM size. No compile-time performance regression was observed. Given the benefits in the motivating example, I think the cost is worthwhile.
Commit 5a0181f
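The ID layout described in the serialization message (module file index in the higher bits, local declaration ID in the lower bits, 48 bits in total) can be sketched as plain bit packing. The message does not state how the 48 bits are divided, so the 16/32 split below is an assumption, as are the function names; this is a Python illustration, not the clang code.

```python
LOCAL_ID_BITS = 32    # assumption: width of the local declaration ID
FILE_INDEX_BITS = 16  # assumption: the remaining bits of the 48-bit ID


def pack_decl_id(module_file_index: int, local_id: int) -> int:
    """Combine a module file index and a local ID into one 48-bit value."""
    if not 0 <= module_file_index < (1 << FILE_INDEX_BITS):
        raise ValueError("module file index out of range")
    if not 0 <= local_id < (1 << LOCAL_ID_BITS):
        raise ValueError("local ID out of range")
    return (module_file_index << LOCAL_ID_BITS) | local_id


def unpack_decl_id(decl_id: int) -> tuple[int, int]:
    """Split a packed ID back into (module file index, local ID)."""
    return decl_id >> LOCAL_ID_BITS, decl_id & ((1 << LOCAL_ID_BITS) - 1)
```

With such a layout, a local ID stays the same when unrelated module files change; only the index part identifies which file the declaration came from.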
[AMDGPU] Fix interaction between WQM and llvm.amdgcn.init.exec (llvm#93680)
Whole quad mode requires inserting a copy of the initial EXEC mask. In a function that also uses llvm.amdgcn.init.exec, insert the COPY after initializing EXEC.
Commit df6750e
[Frontend][OpenMP] Sort all the things in OMP.td, NFC (llvm#94653)
The file OMP.td is becoming tedious to update by hand due to the seemingly random ordering of various items in it. This patch brings order to it by sorting most of the contents.

The clause definitions are sorted alphabetically with respect to the spelling of the clause.[1] The directive definitions are split into two groups: leaf directives and compound directives.[2] Within each group, definitions are sorted alphabetically with respect to the spelling, with the exception that "end xyz" directives are placed immediately following the definition of "xyz".[3] Within each directive definition, the lists of clauses are also sorted alphabetically.

[1] All spellings are made of lowercase letters, _, or space. Ordering that includes non-letters follows the order assumed by the `sort` utility.
[2] Compound directives refer to the constituent leaf directives, hence the leaf definitions must come first.
[3] Some of the "end xyz" directives have properties derived from the corresponding "xyz" directive. This exception guarantees that "xyz" precedes the "end xyz".
Commit acc927a
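The ordering rule for directives (alphabetical, but with "end xyz" directly after "xyz") can be expressed as a sort key. The sketch below is a Python illustration of that rule only: it ignores the leaf/compound split, the helper name `directive_sort_key` is hypothetical, and the sample list is chosen for the example rather than taken from OMP.td.

```python
def directive_sort_key(spelling: str) -> tuple[str, int]:
    """Sort "end xyz" as if it were "xyz", but after the plain "xyz" entry."""
    prefix = "end "
    if spelling.startswith(prefix):
        return (spelling[len(prefix):], 1)
    return (spelling, 0)


directives = ["target", "end do", "parallel", "do", "end target", "do simd"]
ordered = sorted(directives, key=directive_sort_key)
```

With this key, `ordered` places "end do" right after "do" and before "do simd", and "end target" right after "target".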
[flang][OpenMP] Lower `target .. private(..)` to `omp.private` ops (llvm#94195)
Extends delayed privatization support to `target .. private(..)`. With this PR, `private` is supported for `target` **only** in delayed privatization mode.
Commit 913a824
[libc] Correctly pass the C++ standard to NVPTX internal builds
Summary: The NVPTX build wasn't getting the `C++20` standard necessary for a few files.
Commit 2c3723d
[mlir][linalg] Support lowering unpack with outer_dims_perm (llvm#94477)
This commit adds support for lowering `tensor.unpack` with a non-identity `outer_dims_perm`. This was previously left as a not-yet-implemented case.
Commit 5b2f7a1
[mlir] Add reshape propagation patterns for tensor.pad (llvm#94489)
This PR adds fusion by collapsing and fusion by expansion patterns for `tensor.pad` ops in ElementwiseOpFusion. Pad ops can be expanded or collapsed as long as none of the padded dimensions will be expanded or collapsed.
Commit c886d66
[mlir] Fix bugs in expand_shape patterns after semantics changes (llvm#94631)
After the `output_shape` field was added to `expand_shape` ops, dynamically sized expand shapes are now possible, but this was not accounted for in the folder. This PR tightens the constraints of the folder to fix this.
Commit 2117677
[ARM] Clean up neon_vabd.ll, vaba.ll and vabd.ll tests a bit. NFC
Change the target triple to remove some unnecessary instructions.
Commit ac02168
[arm64] Add tan intrinsic lowering (llvm#94545)
This change is an implementation of the llvm#87367 investigation on supporting IEEE math operations as intrinsics, which was discussed in this RFC: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This PR is just for Tan. Now that the x86 tan backend has landed (llvm#90503), we can add other backends, since the shared pieces are in tree now.

Changes:
- `llvm/include/llvm/Analysis/VecFuncs.def` - vectorization of tan for arm64 backends.
- `llvm/lib/Target/AArch64/AArch64FastISel.cpp` - Add tan to the libcall table
- `llvm/lib/Target/AArch64/AArch64ISelLowering.cpp` - Add tan expansion for f128, f16, and vector/neon operations
- `llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp` - define `G_FTAN` as a legal arm64 instruction

resolves llvm#94755
Commit 2f0308e
Commit c5fcc2e
[Clang] Add timeout for GPU detection utilities (llvm#94751)
Summary: The utilities `nvptx-arch` and `amdgpu-arch` are used to support `--offload-arch=native` among other utilities in clang. However, these rely on the GPU drivers to query the features. In certain cases these drivers can become locked up, which will lead to indefinite hangs on any compiler jobs running in the meantime. This patch adds a ten-second timeout period for these utilities before it kills the job and errors out.
Commit 2981f3a
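The general pattern of the GPU-detection fix (run a helper tool, kill it and report an error if it exceeds a time budget) can be sketched with Python's `subprocess` module. This is not how clang implements it; the function name `query_with_timeout` is hypothetical, and the demo uses the Python interpreter as a stand-in for a tool such as `amdgpu-arch`.

```python
import subprocess
import sys


def query_with_timeout(cmd: list[str], timeout_s: float = 10.0) -> str:
    """Run cmd and return its stdout; raise RuntimeError if it exceeds timeout_s."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"{cmd[0]} timed out after {timeout_s} s") from exc
    return result.stdout


# Stand-in for a detection utility that answers promptly.
fast = query_with_timeout([sys.executable, "-c", "print('ok')"])
```

Turning the timeout into an error, rather than waiting, is what keeps one wedged driver from stalling every compile job that asks for the native architecture.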
[RISCV] Codegen support for XCVmem extension (llvm#76916)
All post-increment load/store, register-register load/store.
Spec: https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst
Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie, @realqhc
Commit 2afea72
[MachineOutliner] Sort by Benefit to Cost Ratio (llvm#90264)
This PR depends on llvm#90260.

We changed the order in which functions are outlined in Machine Outliner. The formula for priority is found via a black-box Bayesian optimization toolbox. Using this formula for sorting consistently reduces the uncompressed size of large real-world mobile apps. We also ran a few benchmarks using LLVM test suites, and showed that sorting by priority consistently reduces the text segment size.

| run (CTMark/)    | baseline (1) | priority (2) | diff (1 -> 2) |
|------------------|--------------|--------------|---------------|
| lencod           | 349624       | 349264       | -0.1030%      |
| SPASS            | 219672       | 219480       | -0.0874%      |
| kc               | 271956       | 251200       | -7.6321%      |
| sqlite3          | 223920       | 223708       | -0.0947%      |
| 7zip-benchmark   | 405364       | 402624       | -0.6759%      |
| bullet           | 139820       | 139500       | -0.2289%      |
| consumer-typeset | 295684       | 290196       | -1.8560%      |
| pairlocalalign   | 72236        | 72092        | -0.1993%      |
| tramp3d-v4       | 189572       | 189292       | -0.1477%      |

This is part of an enhanced version of machine outliner -- see [RFC](https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-1-fulllto-part-2-thinlto-nolto-to-come/78732).
Commit 3b16630
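To see why sorting outlining candidates by a ratio can differ from sorting by raw benefit, consider the toy Python sketch below. The `Candidate` fields, the numbers, and the plain `benefit / cost` key are all assumptions for illustration; the actual priority formula used by the patch is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    benefit: int  # hypothetical: bytes saved if this candidate is outlined
    cost: int     # hypothetical: bytes added by outlining it (must be > 0)


def by_ratio(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates by benefit-to-cost ratio, highest first."""
    return sorted(candidates, key=lambda c: c.benefit / c.cost, reverse=True)


def by_benefit(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates by raw benefit, highest first."""
    return sorted(candidates, key=lambda c: c.benefit, reverse=True)


cands = [Candidate("A", 40, 20), Candidate("B", 90, 60), Candidate("C", 30, 10)]
ratio_order = [c.name for c in by_ratio(cands)]
benefit_order = [c.name for c in by_benefit(cands)]
```

In this toy data the two orders are exact opposites, which shows that the choice of key decides which candidates are outlined first.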
[memprof] Clean up IndexedMemProfReader (NFC) (llvm#94710)
Parameter "Version" is confusing in deserializeV012 and deserializeV3 because we also have member variable "Version". Fortunately, parameter "Version" and member variable "Version" always have the same value because IndexedMemProfReader::deserialize initializes the member variable and passes it to deserializeV012 and deserializeV3. This patch removes the parameter.
Commit eb33e46
Commit 55bdb36
[memprof] Use CallStackRadixTreeBuilder in the V3 format (llvm#94708)
This patch integrates CallStackRadixTreeBuilder into the V3 format, reducing the profile size to about 27% of the V2 profile size.
- Serialization: writeMemProfCallStackArray just needs to write out the radix tree array prepared by CallStackRadixTreeBuilder. Mappings from CallStackIds to LinearCallStackIds are moved by the new function CallStackRadixTreeBuilder::takeCallStackPos.
- Deserialization: Deserializing a call stack is the same as deserializing an array encoded in the obvious manner (the length followed by the payload), except that once in a while we need to follow a pointer to the parent to take advantage of common prefixes. This patch teaches LinearCallStackIdConverter how to handle those pointers.
Commit c348e26
[mlir][vector] Remove Emulated Sub-directory (llvm#94742)
The "Emulated" sub-directories under "ArmSVE" and "ArmSME" have been removed. Associated tests have been moved up a directory and now include the "REQUIRES" constraint for the arm-emulator.
Commit 7d69095
Commit d099d6c
Commit fc95645
[KnownBits] Remove `hasConflict()` assertions (llvm#94568)
Allow KnownBits to represent "always poison" values via conflict.
close: llvm#94436
Commit b25b1db
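A "conflict" in this context means that some bit is claimed to be both known-zero and known-one. The Python model below is a simplified sketch of that idea, not LLVM's `KnownBits` class: the field names and the `union_with` helper are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    zero: int  # mask of bits known to be 0
    one: int   # mask of bits known to be 1

    def has_conflict(self) -> bool:
        """True if any bit is claimed to be both 0 and 1."""
        return (self.zero & self.one) != 0

    def union_with(self, other: "KnownBits") -> "KnownBits":
        """Combine facts from two sources that both describe the same value."""
        return KnownBits(self.zero | other.zero, self.one | other.one)


low_bit_clear = KnownBits(zero=0b01, one=0b00)
low_bit_set = KnownBits(zero=0b00, one=0b01)
combined = low_bit_clear.union_with(low_bit_set)
```

Here `combined` has a conflict on bit 0. Since no value can satisfy both facts, tolerating the state instead of asserting lets it stand for a value that is always poison.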
[libc++][test][AIX] Only XFAIL atomic tests for before clang 19 (llvm…
Commit 790992d
[AArch64] Add patterns for add(uzp1(x,y), uzp2(x, y)) -> addp.
If we are extracting the even lanes and the odd lanes and adding them, we can use an addp instruction.
Commit f7018ba
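The equivalence behind the addp pattern can be checked on plain lists. In this Python sketch, vectors are modelled as lists of integers with an even number of lanes; the helper names mirror the instruction mnemonics, but the code is only a model of their lane selection and ignores lane-width wraparound.

```python
def uzp1(x: list[int], y: list[int]) -> list[int]:
    """Even-indexed lanes of the concatenation of x and y."""
    return (x + y)[0::2]


def uzp2(x: list[int], y: list[int]) -> list[int]:
    """Odd-indexed lanes of the concatenation of x and y."""
    return (x + y)[1::2]


def addp(x: list[int], y: list[int]) -> list[int]:
    """Pairwise add: sum each adjacent pair of lanes of x followed by y."""
    z = x + y
    return [z[i] + z[i + 1] for i in range(0, len(z), 2)]


x, y = [1, 2, 3, 4], [10, 20, 30, 40]
via_uzp = [a + b for a, b in zip(uzp1(x, y), uzp2(x, y))]
via_addp = addp(x, y)
```

Both computations produce the same lanes, which is why the two-shuffle-plus-add sequence can be replaced by a single pairwise add.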
Commit 4f9c0fa
[libc++][regex] Correctly adjust match prefix for zero-length matches. (llvm#94550)
For regex patterns that produce zero-length matches, there is one (imaginary) match in-between every character in the sequence being searched (as well as before the first character and after the last character). It's easiest to demonstrate using replacement: `std::regex_replace("abc"s, std::regex(""), "!")` should produce `!a!b!c!`, where each exclamation mark makes a zero-length match visible.

Currently our implementation doesn't correctly set the prefix of each zero-length match, "swallowing" the characters separating the imaginary matches. For example, when going through zero-length matches within `abc`, the corresponding prefixes should be `{'', 'a', 'b', 'c'}`, but before this patch they will all be empty (`{'', '', '', ''}`). This happens in the implementation of `regex_iterator::operator++`.

Note that the Standard spells out quite explicitly that the prefix might need to be adjusted when dealing with zero-length matches in [`re.regiter.incr`](http://eel.is/c++draft/re.regiter.incr):
> In all cases in which the call to `regex_search` returns `true`, `match.prefix().first` shall be equal to the previous value of `match[0].second`... It is unspecified how the implementation makes these adjustments.

[Reproduction example](https://godbolt.org/z/8ve6G3dav)
```cpp
#include <iostream>
#include <regex>
#include <string>

int main() {
  std::string str = "abc";
  std::regex empty_matching_pattern("");

  { // The underlying problem is that `regex_iterator::operator++` doesn't update
    // the prefix correctly.
    std::sregex_iterator i(str.begin(), str.end(), empty_matching_pattern), e;
    std::cout << "\"";
    for (; i != e; ++i) {
      const std::ssub_match& prefix = i->prefix();
      std::cout << prefix.str();
    }
    std::cout << "\"\n";
    // Before the patch: ""
    // After the patch: "abc"
  }

  { // `regex_replace` makes the problem very visible.
    std::string replaced = std::regex_replace(str, empty_matching_pattern, "!");
    std::cout << "\"" << replaced << "\"\n";
    // Before the patch: "!!!!"
    // After the patch: "!a!b!c!"
  }
}
```

Fixes llvm#64451

rdar://119912002
Commit e9adcc4
Re-apply llvm#87550 with fixes. Details: Some tests in fuchsia failed because of the newly added assertion. This was because `GetExceptionBreakpoint()` could be called before `g_dap.debugger` was initialized. The fix here is to lazily populate the list in `GetExceptionBreakpoint()` rather than assuming it has already been initialized. (There is some nuance here because we can't simply populate it in `DAP::DAP()`, which is a global ctor and is called before `SBDebugger::Initialize()` is called.)
Commit 35fa2de
[libc++] Undeprecate shared_ptr atomic access APIs (llvm#92920)
This patch reverts 9b832b7 (llvm#87111):
- [libc++] Deprecated `shared_ptr` Atomic Access APIs as per P0718R2
- [libc++] Implemented P2869R3: Remove Deprecated `shared_ptr` Atomic Access APIs from C++26

As explained in [1], the suggested replacement in P2869R3 is `std::atomic<std::shared_ptr<T>>` (feature-test macro `__cpp_lib_atomic_shared_ptr`), which libc++ does not yet implement. Let's not deprecate the old way of doing things before the new way of doing things exists.

[1]: llvm#87111 (comment)
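For readers unfamiliar with the "old way", the following is a minimal sketch of the free-function atomic access API for `std::shared_ptr` declared in `<memory>`. The names `Config`, `g_config`, `publish` and `snapshot` are hypothetical and exist only for illustration; this is not code from the patch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type, used only for illustration.
struct Config {
  int version;
};

// Shared slot that several threads may read and replace.
static std::shared_ptr<Config> g_config;

// Atomically replace the shared pointer using the free-function API.
inline void publish(int version) {
  std::atomic_store(&g_config, std::make_shared<Config>(Config{version}));
}

// Atomically read the current pointer; the returned copy keeps the object alive.
inline std::shared_ptr<Config> snapshot() {
  return std::atomic_load(&g_config);
}
```

A reader thread would call `snapshot()` and work on the returned copy, while a writer calls `publish()`. Standard libraries that mark these overloads as deprecated may emit a warning when this is compiled in newer language modes.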
Commit 716ed5f
[Reassociate] shifttest.ll - generate test checks to replace custom grep expression (and remove an unused argument)
Commit 97b12df
[flang][runtime] add SHAPE runtime interface (llvm#94702)
Add SHAPE runtime API (it will be used for assumed-rank; lowering generates the other cases inline). I tried to design it so that no dynamic allocation happens in the runtime and no deallocation needs to be inserted by inline code for arrays that we know are small (lowering will just always stack allocate a rank 15 array to avoid dynamic stack allocation or heap allocation).
Commit b01ac51
Commit 1539da4
[OpenMP] Fix passing target id features to AMDGPU offloading (llvm#94765)
Summary: AMDGPU supports a `target-id` feature which is used to qualify targets with different incompatible features. These are both rules and target features. Currently, we pass `-target-cpu` twice when offloading to OpenMP, and do not pass the target-id features at all. The effect was that passing something like `--offload-arch=gfx90a:xnack+` would show up as `-target-cpu=gfx90a:xnack+ -target-cpu=gfx90a`, so the processor was passed twice and the `xnack` setting was ignored completely. This patch fixes that to pass the processor once and then separate out the features the way HIP does.
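The separation step can be sketched as follows. This is an illustrative parser, not the driver's actual code: the `SplitTargetId` type and `splitTargetId` function are hypothetical, and the `+name` / `-name` output form is an assumption about how the features are passed on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: a processor name plus feature flags
// in "+name" / "-name" form (assumed output syntax).
struct SplitTargetId {
  std::string cpu;
  std::vector<std::string> features;
};

// Split e.g. "gfx90a:xnack+:sramecc-" into cpu "gfx90a" and
// features {"+xnack", "-sramecc"}. Sketch only: components without
// a trailing '+' or '-' are skipped rather than diagnosed.
inline SplitTargetId splitTargetId(const std::string &targetId) {
  SplitTargetId result;
  std::size_t pos = targetId.find(':');
  result.cpu = targetId.substr(0, pos);
  while (pos != std::string::npos) {
    std::size_t next = targetId.find(':', pos + 1);
    std::size_t len =
        next == std::string::npos ? std::string::npos : next - pos - 1;
    std::string item = targetId.substr(pos + 1, len);
    if (item.size() > 1 && (item.back() == '+' || item.back() == '-'))
      result.features.push_back(item.back() +
                                item.substr(0, item.size() - 1));
    pos = next;
  }
  return result;
}
```

With this split, `gfx90a:xnack+` yields the processor `gfx90a` once and the feature `+xnack` separately, instead of two processor arguments.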
Commit 374f655
Fixed grammatical error in "enum specifier" error msg llvm#94443 (llvm#94592)
As discussed in llvm#94443, this PR changes the wording to be more correct.
Commit bbddedb
Commit b59567b
Check if LLD is built when checking if lto_supported (llvm#92752)
Otherwise, older copies of LLD may not understand the latest bitcode versions (for example, if we increase `ModuleSummaryIndex::BitCodeSummaryVersion`) Related to llvm#90692 (comment)
Commit c467e60
[mlir][vector][NFC] Make function name more meaningful in lit tests. (llvm#94538)
It also moves the test near other similar test cases.
Commit b653357
[SDISel][Builder] Fix the instantiation of <1 x bfloat|half> (llvm#94591)
Prior to this change, `SelectionDAGBuilder` was producing `SDNode`s of the form: `f32 = extract_vector_elt <1 x bfloat|half>, i32 0` when lowering phis of `<1 x bfloat|half>` and running on a target that promotes this type to `f32` (like some x86 or AMDGPU targets).

This construct is invalid since this type of node only allows type extensions for integer types. It went unnoticed because the `extract_vector_elt` node is later broken down into a `bitcast` followed by `bf16_to_fp|fp_extend`. However, when the argument of the phi is a constant we were crashing because the existing code would try to constant fold this `extract_vector_elt` into an `any_ext`.

This patch fixes this by using a proper decomposition for `<1 x bfloat|half>`:

```
bfloat|half = bitcast <1 x bfloat|half>
float = fp_extend bfloat|half
```

This change should be NFC for the non-constant-folding cases and fix the SDISel crashes (reported in llvm#94449) for the folding cases.

Note: The change on the arm test is a missing fp16 to f32 constant folding exposed by this patch. I'll push a separate improvement for that.
Commit 0605e98
[RISCV] Fold (vXi8 (trunc (vselect (setltu, X, 256), X, (sext (setgt X, 0))))) to vmax+vnclipu. (llvm#94720)
This pattern is an obscured way to express saturating a signed value into a smaller unsigned value. If (setltu, X, 256) is true, then the value is already in the desired range so we can pick X. If it's false, we select (sext (setgt X, 0)) which is 0 for negative values and all ones for positive values. The all ones value when truncated to the final type will still be all ones like we want.
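The reasoning above can be checked with a scalar model. The sketch below assumes 16-bit signed elements narrowed to 8 bits, which is one instance of the pattern; the function names are hypothetical and the code models the arithmetic only, not the actual DAG combine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Scalar model of trunc(select(X <u 256, X, sext(X > 0))).
inline std::uint8_t obscuredSaturate(std::int16_t x) {
  bool inRange = static_cast<std::uint16_t>(x) < 256;  // unsigned compare
  std::int16_t allOnesOrZero = x > 0 ? -1 : 0;         // sext of the i1 result
  std::int16_t selected = inRange ? x : allOnesOrZero;
  return static_cast<std::uint8_t>(selected);          // trunc to 8 bits
}

// Direct form: clamp below at 0 (the vmax step), then clip to the
// unsigned 8-bit range (the narrowing clip step).
inline std::uint8_t directSaturate(std::int16_t x) {
  int nonNegative = std::max<int>(x, 0);
  return static_cast<std::uint8_t>(std::min(nonNegative, 255));
}

// Exhaustively compare both forms over every 16-bit input.
inline bool patternsAgree() {
  for (int v = -32768; v <= 32767; ++v) {
    std::int16_t x = static_cast<std::int16_t>(v);
    if (obscuredSaturate(x) != directSaturate(x))
      return false;
  }
  return true;
}
```

Negative inputs fail the unsigned comparison because they look like large unsigned values, so they take the `sext` branch and become 0; inputs above 255 take the same branch and become all ones, which truncates to 255.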
Commit e9fa6ff
[RISCV] Add .insn alias for addresses without the leading immediate. (llvm#94698)
Most other instructions accept addresses that start with a '(' without an immediate before it. The .insn cases were missing. This is also supported by binutils.
Commit cce10cc
Revert "Reapply PR/87550 (llvm#94625)"
This reverts commit 35fa2de. It broke the LLDB bots on green dragon
Commit adcf33f
[AArch64] Add patterns for fadd(uzp1(x,y), uzp2(x, y)) -> faddp.
Similar to f7018ba, this adds patterns for floating point faddp from an fadd and shuffles.
Commit e564852
Commit f9ae07b
[CGSCC] Verify that call graph is valid after iteration (llvm#94692)
Only in expensive checks, to match other LazyCallGraph verification. This is helpful for verifying LazyCallGraph updates. Many issues only surface when we reuse the LazyCallGraph.
Commit 6f2c610
Fix #pragma (packed, n) not emitting the alignment in debug info (llvm#94673)
Debug info generation won't emit the alignment of types that have a standard alignment. It was not taking this case into account.
rdar://127785973
Commit 66df614
[clang] Add fixit for using declaration with a (qualified) namespace (llvm#94762)
For `using std::literals`, we now output:

error: using declaration cannot refer to a namespace
4 | using std::literals;
| ~~~~~^
note: did you mean 'using namespace'?
4 | using std::literals;
| ^
| namespace

Previously, we didn't have the note. This only fires for qualified namespaces. Just `using std;` doesn't trigger this, since using declarations without cxx scope specifier are rejected earlier. Making that work is an exercise for future selves :)
Commit 2c047e6
[libc] Add baremetal printf (llvm#94078)
For baremetal targets that don't support FILE, this version of printf just writes directly to a function provided by a vendor. To do this both printf and vprintf were moved to /generic (vprintf since they need the same flags and cmake gets funky about setting variables in one file and reading them in another).
Commit 11d643f
[PseudoProbe] Make probe discriminator compatible with dwarf base discriminator (llvm#94506)
It's useful if the probe-based build can consume a dwarf-based profile (e.g. during a profile transition). Before this change there is a conflict for the discriminator; this change tries to mitigate the issue by encoding the dwarf base discriminator into the probe discriminator. As the probe id (the number of basic blocks and calls) starts from 1, there is some unused space, so we try to reuse some bits of the probe id. The new encoding rule is:
- Use bit [28:28] to indicate whether the dwarf base discriminator is encoded (fortunately we can borrow this bit from the `PseudoProbeType`).
- If the bit is set, use [15:3] for the probe id and [18:16] for the dwarf base discriminator. Otherwise, still use [18:3] for the probe id.

Note that this doesn't affect the original probe id capacity: we still prioritize probe id encoding, i.e. the base discriminator is not encoded when the probe id does not fit in [15:3]. Then adjust `getBaseDiscriminatorFromDiscriminator` to use the base discriminator from the probe discriminator.
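The encoding rule can be sketched as plain bit packing. The code below covers only the three fields named in the message (bit 28, the probe id and the dwarf base discriminator) and ignores every other part of the real discriminator. All names are hypothetical, and skipping a base discriminator wider than 3 bits is an assumption, not something the message states.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions taken from the commit message; names are hypothetical.
constexpr std::uint32_t kDwarfFlagBit = 1u << 28;  // [28:28]
constexpr std::uint32_t kIdShift = 3;
constexpr std::uint32_t kNarrowIdMask = 0x1FFF;    // 13 bits, [15:3]
constexpr std::uint32_t kWideIdMask = 0xFFFF;      // 16 bits, [18:3]
constexpr std::uint32_t kBaseShift = 16;
constexpr std::uint32_t kBaseMask = 0x7;           // 3 bits, [18:16]

// Pack the probe id and, when it fits, the dwarf base discriminator.
inline std::uint32_t packProbeFields(std::uint32_t probeId,
                                     std::uint32_t dwarfBase) {
  // The probe id has priority: only share the bits when the id fits in
  // [15:3]. Assumption: an oversized base discriminator is also skipped.
  if (probeId <= kNarrowIdMask && dwarfBase <= kBaseMask)
    return kDwarfFlagBit | (dwarfBase << kBaseShift) | (probeId << kIdShift);
  return (probeId & kWideIdMask) << kIdShift;
}

// Recover the probe id, using the narrow field when the flag is set.
inline std::uint32_t unpackProbeId(std::uint32_t d) {
  std::uint32_t mask = (d & kDwarfFlagBit) ? kNarrowIdMask : kWideIdMask;
  return (d >> kIdShift) & mask;
}

// Recover the dwarf base discriminator, or 0 when it was not encoded.
inline std::uint32_t unpackDwarfBase(std::uint32_t d) {
  return (d & kDwarfFlagBit) ? (d >> kBaseShift) & kBaseMask : 0;
}
```

A probe id of 5 with base discriminator 3 shares the word, while a probe id of 0x2000 needs the full [18:3] field and drops the base discriminator.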
Commit e20b904
[Driver,test] Add -Wno-msvc-not-found to gcc-param.c
Fixes: 56c4971 If the default target triple uses visualstudio::Linker::ConstructJob, when a MSVC installation cannot be found, there will be a -Wmsvc-not-found diagnostic, which is turned into an error due to -Werror. We have many driver tests that don't specify --target= and would get a -Wmsvc-not-found warning, but this might be the only one that uses -Werror and is not skipped by an `UNSUPPORTED`.
Commit c3a5087
[clang][driver] Enable '-flto' on bare-metal (llvm#94738)
Pass the linker LTO options enabled by the clang '-flto' command line options when targeting bare-metal. --------- Co-authored-by: Keith Walker <keith.walker@arm.com>
Commit bd6e324
Commit d3bcd9b
[SPIR-V] Improve type inference, addrspacecast and dependencies between SPIR-V entities and required capability/extensions (llvm#94626)
This PR continues llvm#94467 and contains fixes in emission of type intrinsics, constant recording and corresponding test cases:
* type-deduce-global-dup.ll -- fix of integer constant emission on 32-bit platforms and correct type deduction for globals
* type-deduce-simple-for.ll -- fix of GEP translation (there was an issue previously that led to incorrect translation/broken logic of for-range implementation)

This PR also:
* fixes a cast between identical storage classes and updates the test case to include validation run by spirv-val,
* ensures that Bitcast for pointers satisfies the requirement that the address spaces must match and adds the corresponding test case,
* improves the encoding in Tablegen and the decoding in code of dependencies between SPIR-V entities and required capability/extensions,
* prevents emission of identical OpTypePointer instructions.
Commit 9a73710
Commit 4196c18
[RISCV][GISel] Do libcall for G_FPTOSI, G_FPTOUI when no D or F support (llvm#94613)
When compiling the following code:

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdbool.h>

int main() {
  int a;
  float f;
  scanf("%d", &a);
  scanf("%f", &f);
  a += (int)f;
  return a;
}
```

for `-march=rv32ima_zbb` we get a libcall:

```
call scanf
lw a0, -20(s0)
call __fixsfsi
mv a1, a0
```

When we try to use GlobalISel we get this error:

```
error in backend: unable to legalize instruction: %9:_(s32) = G_FPTOSI %8:_(s32) (in function: main)
```

(Here is a link to a reproducer in Godbolt: https://godbolt.org/z/f67vEEb41 )

The goal of this PR is to do a libcall for the legalization of `G_FPTOSI` and `G_FPTOUI` instead of doing a fallback to Selection DAG to do the same libcall later.
Commit 28dd55b
[llvm-dwarfdump] Add a null-check in `prettyPrintBaseTypeRef`. (llvm#93156)
Fixes llvm#93104

Prevent a crash by only printing DWARFUnit-unaware information in cases in which `DWARFUnit* U` is `nullptr`.
Commit c6e9371
Commit 27084f7
[InstCombine] Fold `(icmp eq/ne (xor x, y), C1)` even if multiuse
Two folds unlocked:
`(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
`(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180

Closes llvm#87275
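Both folds rest on xor being its own inverse. The message does not spell out `C2`; the sketch below assumes `C2 = C0 ^ C1`, which is the value that makes the equality hold algebraically, and checks both identities exhaustively over 8-bit values. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Check ((x ^ c0) == c1) <=> (x == c2) for every 8-bit x,
// with c2 assumed to be c0 ^ c1.
inline bool xorConstFoldHolds(std::uint8_t c0, std::uint8_t c1) {
  const std::uint8_t c2 = static_cast<std::uint8_t>(c0 ^ c1);
  for (int x = 0; x < 256; ++x) {
    bool before = static_cast<std::uint8_t>(x ^ c0) == c1;
    bool after = x == c2;
    if (before != after)
      return false;
  }
  return true;
}

// Check ((x ^ y) == 0) <=> (x == y) for every pair of 8-bit values.
inline bool xorZeroFoldHolds() {
  for (int x = 0; x < 256; ++x)
    for (int y = 0; y < 256; ++y)
      if (((x ^ y) == 0) != (x == y))
        return false;
  return true;
}
```

The `ne` variants follow by negating both sides of each equivalence.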
Commit 166c184
[OpenMP][Offload] - Ensure OPENMP_STANDALONE_BUILD is defined (llvm#94801)
Without a value set, conditional checks like if(NOT ${OPENMP_STANDALONE_BUILD}) will not be able to evaluate to true. Fixes an issue introduced by PR llvm#93463, which did not allow the OMPT variable to be propagated up to offload during a runtimes build.
Commit 89c92b0
InstCombine: Fix testing of pow libcall in errno case (llvm#94772)
There were some tests in this file with "noerrno" in the name, but all the tests were no errno since all the libcalls were declared with memory(none). Ensure we have adequate coverage for the errno and no-errno cases by duplicating the libcall transform cases into errno and non-errno versions with callsite attributes.
Commit 75b89cc
[lldb] Encode operands and arity in Dwarf.def and use them in LLDB. (llvm#94679)
This PR extends Dwarf.def to include the number of operands and the arity (the number of entries on the DWARF stack).
- The arity is used in LLDB's DWARF expression evaluator.
- The number of operands is unused, but is present in the table to avoid confusing the arity with the operands. Keeping the latter up to date should be straightforward as it maps directly to a table present in the DWARF standard.
Commit 96d01a3
[AArch64][LoopIdiom] Generalize AArch64LoopIdiomTransform into LoopIdiomVectorize (llvm#94081)
To facilitate sharing LoopIdiomTransform between AArch64 and RISC-V, this first patch moves AArch64LoopIdiomTransform from lib/Target/AArch64 to lib/Transforms/Vectorize and renames it to LoopIdiomVectorize. The following patch (llvm#94082) will teach LoopIdiomVectorize how to generate VP intrinsics (in addition to the current masked vector style) in favor of RVV.
Commit 37e309f
[ELF] Implement --force-group-allocation
GNU ld's relocatable linking behaviors:
* Sections with the `SHF_GROUP` flag are handled like sections matched by the `--unique=pattern` option. They are processed like orphan sections and ignored by input section descriptions.
* Section groups' (usually named `.group`) content is updated as the section indexes are updated. Section groups can be discarded with `/DISCARD/ : { *(.group) }`.

`-r --force-group-allocation` discards section groups and allows sections with the `SHF_GROUP` flag to be matched like normal sections. If two section group members are placed into the same output section, their relocation sections (if present) are combined as well. This behavior can be useful when -r output is used as a pseudo shared object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).

This patch implements --force-group-allocation:
* Input SHT_GROUP sections are discarded.
* Input sections do not get the SHF_GROUP flag, so `addInputSec` will combine relocation sections if their relocated section group members are combined.

The default behavior is:
* Input SHT_GROUP sections are retained.
* Input SHF_GROUP sections can be matched (unlike GNU ld)
* Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec` will create different OutputDesc copies.

GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not implemented.

Pull Request: llvm#94704
Commit 4d9020c
Commit d1b5a4b
Commit 06188d9
[BOLT][NFC] Unset UseAssemblerInfoForParsing for emission (llvm#94778)
Summary: Use workaround for quadratic behavior inside AttemptToFoldSymbolOffsetDifference called from BinaryEmitter::emitLSDA. llvm@b06e736#commitcomment-142836456
Commit 7520d0c
Commit bfa937a
[RISCV] Add TargetConstraintType=2 to vnclip pseudoinstructions. NFC
These instructions are very similar to narrowing shift instructions which already have this. Remove TargetConstraintType parameter from VPseudoBinaryV_WV class. Only 2 was ever passed to it. Pass 2 directly to the classes instantiated from VPseudoBinaryV_WV instead.
Commit 06e12b4
Commit 0cdb0b7
[clang-tidy] new check misc-use-internal-linkage (llvm#90830)
Add new check misc-use-internal-linkage to detect variables and functions that can be marked as static. --------- Co-authored-by: Danny Mösch <danny.moesch@icloud.com>
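As a rough illustration of the kind of code such a check is aimed at, the sketch below shows a helper with external linkage that is only used in its own translation unit, next to two ways of giving a declaration internal linkage. The names are hypothetical, and the exact conditions under which the check fires are not described in the message, so treat this as an assumption about its intent.

```cpp
#include <cassert>

// Used only in this file and not declared in any header, yet it has
// external linkage: a candidate for being marked 'static'.
int helperWithExternalLinkage(int x) { return x * 2; }

// The same helper with internal linkage via 'static'.
static int helperWithInternalLinkage(int x) { return x * 2; }

// An unnamed namespace also gives its members internal linkage.
namespace {
int callCount = 0;
}

// Public entry point that uses the file-local helpers.
int entryPoint(int x) {
  ++callCount;
  return helperWithExternalLinkage(x) + helperWithInternalLinkage(x);
}
```

Internal linkage keeps the symbol out of the object file's exported symbol table and lets the compiler diagnose the helper if it becomes unused.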
Commit c4f83a0
[libc][math][c23] Temporarily disable fmodf16 on AArch64 (llvm#94813)
See Buildbot failure: https://lab.llvm.org/buildbot/#/builders/138/builds/67337.
Commit 4346c38
Commit 32d8596
Reland "[python] Bump Python minimum version to 3.8 (llvm#78828)"
Commit 33f4a77
[mlir][loops] Add getters for multi dim loop variables in `LoopLikeOpInterface` (llvm#94516)
This patch adds `getLoopInductionVars`, `getLoopLowerBounds`, `getLoopBounds`, `getLoopSteps` interface methods to `LoopLikeOpInterface`. The corresponding single value versions have been moved to the shared class declaration and have been implemented based on the new interface methods.
Commit 6b4c122
[MemProf] Add matching statistics and tracing (llvm#94814)
To help debug or surface matching issues, add more statistics to the matching. Also add optional emission of each context seen in the function profiles along with its allocation type, size in bytes, and whether it was matched. This information is emitted along with a hash of the full stack context, to allow deduplication across modules for allocations within header files.
Commit 7536474
Commit 211edca
Commit 507b372
[RISCV] Remove CarryIn and Constraint parameters from VPseudoTiedBinaryCarryIn. NFC
They were always passed the same values, 1 for CarryIn and "" for Constraint.
Commit 017e240
[RISCV] Rename VPseudoBinaryCarryIn to VPseudoBinaryCarry. NFC
It doesn't always have a CarryIn. One of the parameters is named CarryIn. It always has CarryOut or a CarryIn and in some cases both.
Commit c8eff87
Commits on Jun 8, 2024
-
Add AllowRepeats to SBCommandInterpreterRunOptions. (llvm#94786)
This is useful if you have a transcript of a user session and want to rerun those commands with RunCommandInterpreter. The same functionality is also useful in testing. I'm adding it primarily for the second reason. In a subsequent patch, I'm adding the ability for Python-based commands to provide their "auto-repeat" command. Among other things, that will allow potentially state-destroying user commands to prevent auto-repeat. Testing this with Shell or pexpect tests is not nearly as accurate or convenient as using RunCommandInterpreter, but to use that I need to allow auto-repeat. I think for consistency's sake, having interactive sessions always do auto-repeats is the right choice, though that's a lightly held opinion...
Commit 435dd97
[memprof] Improve deserialization performance in V3 (llvm#94787)
We call llvm::sort in a couple of places in the V3 encoding:
- We sort Frames by FrameIds for stability of the output.
- We sort call stacks in the dictionary order to maximize the length of the common prefix between adjacent call stacks.

It turns out that we can improve the deserialization performance by modifying the comparison functions -- without changing the format at all. Both places take advantage of the histogram of Frames -- how many times each Frame occurs in the call stacks.
- Frames: We serialize popular Frames in the descending order of popularity for improved cache locality. For two equally popular Frames, we break a tie by serializing one that tends to appear earlier in call stacks. Here, "earlier" means a smaller index within llvm::SmallVector<FrameId>.
- Call Stacks: We sort the call stacks to reduce the number of times we follow pointers to parents during deserialization. Specifically, instead of comparing two call stacks in the strcmp style -- integer comparisons of FrameIds, we compare two FrameIds F1 and F2 with Histogram[F1] < Histogram[F2] at respective indexes. Since we encode from the end of the sorted list of call stacks, we tend to encode popular call stacks first.

Since the two places use the same histogram, we compute it once and share it in the two places.

Sorting the call stacks reduces the number of "jumps" by 74% when we deserialize all MemProfRecords. The cycle and instruction counts go down by 10% and 1.5%, respectively. If we sort the Frames in addition to the call stacks, then the cycle and instruction counts go down by 14% and 1.6%, respectively, relative to the same baseline (that is, without this patch).
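The two histogram-driven orderings can be sketched with standard containers. This is an illustration under several assumptions, not the patch's code: all type and function names are hypothetical, "tends to appear earlier" is modelled as a smaller sum of positions, and ties on popularity are broken by `FrameId` so that both comparators are deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using FrameId = std::uint64_t;
using CallStack = std::vector<FrameId>;

// Per-frame statistics gathered in one pass over all call stacks.
struct FrameStat {
  std::uint64_t count = 0;        // how often the frame occurs
  std::uint64_t positionSum = 0;  // sum of its indexes (assumed tie-breaker)
};
using Histogram = std::unordered_map<FrameId, FrameStat>;

// Build the shared histogram once.
inline Histogram buildHistogram(const std::vector<CallStack> &stacks) {
  Histogram h;
  for (const CallStack &s : stacks)
    for (std::size_t i = 0; i < s.size(); ++i) {
      FrameStat &st = h[s[i]];
      ++st.count;
      st.positionSum += i;
    }
  return h;
}

// Frames: most popular first, then the one that appears earlier.
inline std::vector<FrameId> orderFrames(const Histogram &h) {
  std::vector<FrameId> frames;
  for (const auto &kv : h)
    frames.push_back(kv.first);
  std::sort(frames.begin(), frames.end(), [&](FrameId a, FrameId b) {
    const FrameStat &sa = h.at(a);
    const FrameStat &sb = h.at(b);
    if (sa.count != sb.count)
      return sa.count > sb.count;
    if (sa.positionSum != sb.positionSum)
      return sa.positionSum < sb.positionSum;
    return a < b;
  });
  return frames;
}

// Call stacks: lexicographic order, comparing frames by popularity
// instead of by their integer ids.
inline void sortCallStacks(std::vector<CallStack> &stacks, const Histogram &h) {
  auto frameLess = [&](FrameId a, FrameId b) {
    std::uint64_t ca = h.at(a).count, cb = h.at(b).count;
    return ca != cb ? ca < cb : a < b;
  };
  std::sort(stacks.begin(), stacks.end(),
            [&](const CallStack &x, const CallStack &y) {
              return std::lexicographical_compare(x.begin(), x.end(),
                                                  y.begin(), y.end(), frameLess);
            });
}
```

Because less popular frames compare as smaller, stacks made of popular frames end up at the back of the sorted list, which matches the message's remark that encoding starts from the end.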
Commit dc3f8c2
[InstCombine] Preserve the nsw/nuw flags for (X | Op01C) + Op1C --> X + (Op01C + Op1C) (llvm#94586)
This patch simplifies `sdiv` to `udiv` by preserving the `nsw` flag for `(X | Op01C) + Op1C --> X + (Op01C + Op1C)` if the sum of `Op01C` and `Op1C` will not overflow, and preserves the `nuw` flag unconditionally. Alive2 Proofs (provided by @nikic): https://alive2.llvm.org/ce/z/nrdCZT, https://alive2.llvm.org/ce/z/YnJHnH
Commit 96af114
[lld] Discard SHT_LLVM_LTO sections in relocatable links (llvm#92825)
So long as ld -r links using bitcode always result in an ELF object, and not a merged bitcode object, the output from a relocatable link using FatLTO objects should not have a .llvm.lto section. Prior to this, using the object code sections would cause the bitcode section in the output of a relocatable link to be corrupted, by concatenating all the .llvm.lto sections together. This patch discards SHT_LLVM_LTO sections when not using --fat-lto-objects, so that the relocatable ELF output won't contain invalid bitcode.
Commit 608fb46
[ProfileData] Use default member initialization (NFC) (llvm#94817)
While we are at it, this patch changes the type of ValueCounts to std::array<double, ...> so that we can use std::array::fill. Identified with modernize-use-default-member-init.
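A small sketch of the two idioms mentioned here, default member initializers and `std::array::fill`. The `CounterSummary` struct and its members are hypothetical stand-ins, not the actual ProfileData class.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in used only to illustrate the idioms.
struct CounterSummary {
  static constexpr std::size_t NumKinds = 3;

  // Default member initializers replace assignments in a constructor body.
  std::size_t NumSamples = 0;
  std::array<double, NumKinds> ValueCounts{};  // all elements start at 0.0

  // std::array offers fill(), which a plain C array does not.
  void reset() {
    NumSamples = 0;
    ValueCounts.fill(0.0);
  }
};
```

With the initializers on the members, every constructor starts from the same state without repeating the assignments.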
Commit 4c28844
Commit 4cff8ef
Commit 18c67bf
Commit 4e0ff05
Commit 4d95850
[RISCV] Rename VPseudoVWALU_VV_VX_VI to VPseudoVWSLL. NFC
The scheduler class name is hardcoded in the class so it's not a general class.
Commit 5422b5f
[RISCV] Refactor VPseudoVROL and VPseudoVROR multiclasses to use inheritance. NFC
VPseudoVROR can inherit from VPseudoVROL. Adjust the names to VPseudoVROT_VV_VX and VPseudoVROT_VV_VX_VI.
Commit 5fc1b82
[RISCV] Rename VPseudoBinaryNoMaskTU->VPseudoBinaryNoMaskPolicy. NFC
These pseudoinstructions have a policy operand so calling them TU is confusing.
Commit 7d203b1
[RISCV] Rename VPatBinarySwapped to VPatBinaryMSwapped. NFC
This class is most closely related to VPatBinaryM.
Commit 5e94163
[RISCV] Flatten VPatBinaryW_VI_VWSLL and VPatBinaryW_VX_VWSLL into VPatBinaryW_VV_VX_VI_VWSLL. NFC
Commit 84b3fe6
[workflows] Add post-commit job that periodically runs the clang static analyzer (llvm#94106)
This job will run once per day on the main branch, and for every commit on a release branch. It currently only builds llvm, but could add more sub-projects in the future. OpenSSF Best Practices recommends running a static analyzer on software before it is released: https://www.bestpractices.dev/en/criteria/0#0.static_analysis
Commit 81671fe
[mlir] Handle the newly-added "Reserved" FramePointerKind for 1a52392 (NFC)
/llvm-project/mlir/lib/Target/LLVMIR/ModuleImport.cpp:48: tools/mlir/include/mlir/Dialect/LLVMIR/LLVMConversionEnumsFromLLVM.inc:158:11: error: enumeration value 'Reserved' not handled in switch [-Werror,-Wswitch] switch (value) { ^~~~~ 1 error generated.
Commit c0a1214
[dfsan] Fix release_shadow_space.c (llvm#94770)
DFSan's sscanf is incorrect (llvm#94769), which results in erroneous matches when scraping RSS from /proc/maps. This patch works around the issue by using strstr as a secondary check. It also adds a loose validity check for the initial RSS measurement, to guard against regressions in get_rss_kb(). Fixes llvm#91287
Commit 221336c
[HLSL] Use llvm::Triple::EnvironmentType instead of HLSLShaderAttr::ShaderType (llvm#93847)
`HLSLShaderAttr::ShaderType` enum is a subset of `llvm::Triple::EnvironmentType`. We can use `llvm::Triple::EnvironmentType` directly and avoid converting one enum to another.
Commit 5d87ba1
[CMake] Update CMake cache file for the ARM/Aarch64 cross toolchain builds. NFC. (llvm#94835)
* generate a Clang configuration file with the provided target sysroot (TOOLCHAIN_TARGET_SYSROOTFS)
* explicitly pass the provided target sysroot into the compiler-rt tests configuration
* add the ability to configure the type of the built libraries, shared or static (TOOLCHAIN_SHARED_LIBS, default OFF)
On behalf of: llvm#94284
Commit 5aabbf0
-
[RISCV] Remove many ImmType parameters from tablegen classes. NFC
These usually have a single value that is always used. We can just hardcode into the class body.
Commit 950605b
-
[RISCV] Remove unused defaults for sew parameters in tablegen. NFC
Also remove some unused Constraint parameters that appeared before the sew parameter.
Commit 2fa14fc
-
[lldb] Remove redundant c_str() calls in stream output (NFC) (llvm#94839)
Passing the result of c_str() to a stream is slow and redundant. This change removes unnecessary c_str() calls and uses the string object directly.
Caught by cppcheck:
- lldb/tools/debugserver/source/JSON.cpp:398:19: performance: Passing the result of c_str() to a stream is slow and redundant. [stlcstrStream]
- lldb/tools/debugserver/source/JSON.cpp:408:64: performance: Passing the result of c_str() to a stream is slow and redundant. [stlcstrStream]
- lldb/tools/debugserver/source/JSON.cpp:420:54: performance: Passing the result of c_str() to a stream is slow and redundant. [stlcstrStream]
- lldb/tools/debugserver/source/JSON.cpp:46:13: performance: Passing the result of c_str() to a stream is slow and redundant. [stlcstrStream]
Fix llvm#91212
Commit d3fc5cf
-
Revert "[lld][AArch64][ELF][PAC] Support `.relr.auth.dyn` section" (llvm#94843)
Reverts llvm#87635. In some corner cases, lld generated an object file with an empty REL section with `sh_info` set to 0. This file triggers an lld error when used as its input. See llvm#87635 (comment) for details.
Commit 2e1788f
-
Commit a294e89
-
Commit 3f0f2cd
-
[Support] Do not use `llvm::size` in `getLoopPreheader` (llvm#94540)
`BlockT *LoopBase<BlockT, LoopT>::getLoopPreheader()` was changed in 7243607 to use `llvm::size` rather than checking that `child_begin() + 1 == child_end()`. `llvm::size` requires that `std::distance` be O(1) and hence that clients support random access. Use `llvm::hasSingleElement` instead.
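The single-element check described above can be sketched without any random-access requirement. This is a minimal illustration only: `hasSingleElementSketch` is a hypothetical stand-in written for this note, not the LLVM helper itself, and `std::forward_list` is just a convenient forward-only container.

```cpp
#include <forward_list>
#include <iterator>

// Hypothetical stand-in for a "has exactly one element" query.
// It advances the begin iterator at most once, so it is O(1) even for
// forward-only ranges where std::distance would have to walk every element.
template <typename Range>
bool hasSingleElementSketch(const Range &R) {
  auto Begin = std::begin(R);
  auto End = std::end(R);
  return Begin != End && std::next(Begin) == End;
}
```

The point of the design is that only equality comparison and a single increment are needed, which every forward iterator supports.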
Commit 6885281
-
[SystemZ] Fix handling of triples.
Some Ubuntu builds were broken after 20d497c "[Driver] Remove unneeded *-linux-gnu after D158183". This patch by Fangrui Song fixes this with a handling in config.guess.
Commit 7f5d1f1
-
[mlir][Transforms][NFC] `GreedyPatternRewriteDriver`: Use composition instead of inheritance (llvm#92785)
This commit simplifies the design of the `GreedyPatternRewriteDriver` class. This class used to inherit from both `PatternRewriter` and `RewriterBase::Listener` and then attached itself as a listener. In the new design, the class has a `PatternRewriter` field instead of inheriting from `PatternRewriter`, which is generally preferred in object-oriented programming.
Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Commit 6b3e000
-
[clang] Report erroneous floating point results in _Complex math (llvm#90588)
Use handleFloatFloatBinOp to properly diagnose NaN results and divisions by zero. Fixes llvm#84871
Commit 9ddc014
-
[SDISel][Combine] Constant fold FP16_TO_FP (llvm#94790)
In some cases, constants can survive the early constant folding optimizations because they are hidden behind several layers of type changes. E.g., consider the following sequence (extracted from the arm test that this commit changes):
```
t2: v1f16 = BUILD_VECTOR ConstantFP:f16<APFloat(0)>
t4: v1f16 = insert_vector_elt t2, ConstantFP:f16<APFloat(0)>, Constant:i32<0>
t5: f16 = bitcast t4
t6: f32 = fp_extend t5
```
Because the constant (APFloat(0)) is hidden behind a <1 x ty> type, all the constant folding that normally happens for scalar nodes when using `SelectionDAG::getNode` is blocked. As a result, the constant manages to survive as an actual conversion instruction down to the select phase:
```
t11: f32 = fp16_to_fp Constant:i32<0>
```
With the change in this patch, we try to do constant folding one more time during dag combine, which in the motivating example results in the much better sequence:
```
t7: ch = CopyToReg t0, Register:f32 %0, ConstantFP:f32<0.000000e+00>
```
Note: I'm sure we have this problem in a lot of other places. Generally speaking, I believe SDISel is not that good with <1 x ty> compared to pure scalar. However, I only changed what I could easily test.
Commit 25506f4
-
[compiler-rt] Replace deprecated aligned_storage with aligned byte array (llvm#94171)
`std::aligned_storage` is deprecated with C++23, see [here](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1413r3.pdf). This replaces the usages of `std::aligned_storage` within compiler-rt with an aligned `std::byte` array. I will provide patches for other subcomponents as well.
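As a rough illustration of the replacement pattern (not the actual compiler-rt change), an aligned `std::byte` array can hold a lazily constructed object. `LazySlot` and its members are hypothetical names invented for this sketch, which assumes C++17.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical holder that constructs a T on demand inside raw storage.
template <typename T>
class LazySlot {
  // Previously this kind of member would have been spelled
  //   typename std::aligned_storage<sizeof(T), alignof(T)>::type Storage;
  alignas(T) std::byte Storage[sizeof(T)];
  bool Live = false;

public:
  LazySlot() = default;
  LazySlot(const LazySlot &) = delete;
  LazySlot &operator=(const LazySlot &) = delete;
  ~LazySlot() { reset(); }

  // Destroy any current object, then placement-new a fresh one.
  template <typename... Args>
  T &emplace(Args &&...As) {
    reset();
    T *Obj = ::new (static_cast<void *>(Storage)) T(std::forward<Args>(As)...);
    Live = true;
    return *Obj;
  }

  // Run the destructor of the stored object, if there is one.
  void reset() {
    if (Live) {
      std::launder(reinterpret_cast<T *>(Storage))->~T();
      Live = false;
    }
  }

  bool hasValue() const { return Live; }
};
```

The `alignas(T)` specifier carries the alignment guarantee that `std::aligned_storage` used to provide, and `sizeof(T)` supplies the size.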
Commit cac7821
-
lld/test: Make sure removing %t at first
2e1788f (llvm#94843) reverted llvm#87635. It was creating `%t` as a directory, which causes an error in incremental builds.
Commit 82f6cde
-
Enable LLDB tests in Linux pre-merge CI (llvm#94208)
This patch removes LLDB from the list of projects that are excluded from building and testing on pre-merge CI on Linux. The Windows environment needs to be prepared in order to test LLDB (llvm#94208 (comment)), but we don't have enough maintenance resources to do that at the moment. Because LLDB has been in the list of projects that need to be tested on Clang changes, this PR makes this happen on Linux. This seems to be the consensus in the discussion of this PR.
Commit d4eed43
-
[SimplifyCFG] Don't use a mask for lookup tables generated from switches with an unreachable default case (llvm#94468)
When transforming a switch with holes into a lookup table, we currently use a mask to check if the current index is handled by the switch or if it is a hole. If it is a hole, we skip loading from the lookup table. Normally, if the switch's default case is unreachable this has no impact, as the mask test gets optimized away by subsequent passes. However, if the switch is large enough that the number of lookup table entries exceeds the target's register width, we won't be able to fit all the cases into a mask and the switch won't get transformed into a lookup table.
If we know that the switch's default case is unreachable, we know that the mask is unnecessary and can skip constructing it entirely, which allows us to transform the switch into a lookup table. [Example](https://godbolt.org/z/7x7qfx8M1)
In the future, it might be interesting to consider allowing lookup table masks to be more than one register large (e.g. using a constant array of bit flags, similar to `std::bitset`).
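To make the masked and unmasked forms concrete, here is a hand-written C++ analogue of what the transform conceptually produces. It is only a sketch: the real output is LLVM IR and may look different, all function names are hypothetical, and `__builtin_unreachable()` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cstdint>

// A switch with holes (2 and 5 are unhandled) and an unreachable default.
int viaSwitch(unsigned Idx) {
  switch (Idx) {
  case 0: return 10;
  case 1: return 11;
  case 3: return 13;
  case 4: return 14;
  case 6: return 16;
  default: __builtin_unreachable();
  }
}

// Masked lookup: bit i of Mask is set when index i is a handled case.
// The mask must fit in one register, which limits the table size.
int viaMaskedTable(unsigned Idx, int Default) {
  static const int Table[7] = {10, 11, 0, 13, 14, 0, 16};
  const std::uint64_t Mask = 0b1011011; // bits 0, 1, 3, 4, 6
  if (Idx < 7 && ((Mask >> Idx) & 1))
    return Table[Idx];
  return Default;
}

// With an unreachable default, holes can never be indexed, so the mask
// test is unnecessary and the table size is no longer tied to a register.
int viaPlainTable(unsigned Idx) {
  static const int Table[7] = {10, 11, 0, 13, 14, 0, 16};
  return Table[Idx];
}
```

For every handled index the three functions agree; only the masked form has a defined result for a hole.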
Commit 540f68c
-
Commit 2d21851
-
[DAGCombine] Fix miscompilation caused by PR94008 (llvm#94850)
The PR description in llvm#94008 does not match the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff `is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`).
See also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2 proofs. Closes llvm#94824.
Commit d9507a3
-
[Reassociate] Use uint64_t for repeat count (llvm#94232)
This patch relands llvm#91469 and uses `uint64_t` for repeat count to avoid a miscompilation caused by overflow llvm#91469 (comment).
Commit 645fb04
-
[X86] Support ATOMIC_LOAD_FP_BINOP_MI for other binops (llvm#87524)
Since we can bitcast and then do the same thing sub does in the table section above, I figured it was trivial to add fsub, fmul, and fdiv.
Commit bca7864
-
Commit c870882
-
[ProfileData] Use a range-based for loop (NFC) (llvm#94856)
While I am at it, this patch adds const to a couple of places.
Commit 38124fe
-
[memprof] Remove redundant virtual (NFC) (llvm#94858)
'override' makes 'virtual' redundant. Identified with modernize-use-override.
Commit 6834e6d
-
[libc++][NFC] Simplify the implementation of `__promote` (llvm#81379)
This depends on enabling the use of extensions.
Commit c8992fb
-
[RISCV][MC] Implicit 0-offset aliases for JR/JALR (llvm#94688)
This broadly follows how in almost all places, we accept `(<reg>)` to mean `0(<reg>)`, but I think these are the first like this for Jumps rather than Loads/Stores. These are accepted by binutils but not by LLVM: https://godbolt.org/z/GK7MGE7q7
Commit bafff3e
-
[ProfileData] Use default member initialization (NFC) (llvm#94860)
Identified with modernize-use-default-member-init.
Commit 80d00bf
-
[lldb] Use const reference for range variables to improve performance (NFC) (llvm#94840)
Cppcheck recommends using a const reference for range variables in a for-each loop. This avoids unnecessary copying of elements, improving performance.
Caught by cppcheck:
- lldb/source/API/SBBreakpoint.cpp:717:22: performance: Range variable 'name' should be declared as const reference. [iterateByValue]
- lldb/source/API/SBTarget.cpp:1150:15: performance: Range variable 'name' should be declared as const reference. [iterateByValue]
- lldb/source/Breakpoint/Breakpoint.cpp:888:26: performance: Range variable 'name' should be declared as const reference. [iterateByValue]
- lldb/source/Breakpoint/BreakpointIDList.cpp:262:26: performance: Range variable 'name' should be declared as const reference. [iterateByValue]
Fix llvm#91213 Fix llvm#91217 Fix llvm#91219 Fix llvm#91220
Commit 1e92ad4
-
[libc][math][c23] fmul correctly rounded to all rounding modes (llvm#91537)
This is an implementation of floating point multiplication. It will consist of:
- `double x double -> float`
Commit 263be9f
-
[libc][math][C23] Implemented remquof128 function (llvm#94809)
Added remquof128 function. Closes llvm#94312
Commit 44aecca
-
[VPlan] Check if only first part is used for all per-part VPInsts.
Apply the onlyFirstPartUsed logic generally to all per-part VPInstructions. Note that the test changes remove the second part of an unused first-order recurrence splice.
Commit a43d999
-
[RISCV][GISel] Add calling convention support for half (llvm#94110)
This patch adds initial support for the half type on RISC-V.
Commit 643e471
-
[VPlan] Mark FirstOrderRecurrenceSplice as not having side-effects.
Now that FOR exit and resume value creation is explicitly modeled in VPlan (05e1b53, 07b3301), it doesn't depend on the first-order recurrence splice being preserved, and the splice can now be marked as not having side-effects. This allows removal of the first-order-recurrence-splice if the FOR is only used in the exit or as scalar ph resume value.
Commit 998c33e
-
[ProfileData] Simplify calls to readNext in readBinaryIdsInternal (NFC) (llvm#94862)
readNext has two variants:
- readNext<uint64_t, endian>(ptr)
- readNext<uint64_t>(ptr, endian)
This patch uses the latter to simplify readBinaryIdsInternal. Both forms default to unaligned.
Commit e62c214
-
Commit febfbff
-
Commit c2d68c4
-
[InstCombine] Propagate flags when folding consecutive shifts
When we fold `(shift (shift C0, x), C1)` we can propagate flags that are common to both shifts. Proofs: https://alive2.llvm.org/ce/z/LkEzXD Closes llvm#94872
Commit 2900d03
-
Commit 2e482b2
-
[MC] Simplify Sec.getFragmentList().insert(Sec.begin(), F). NFC
Decrease the uses of getFragmentList() to make it easier to change the fragment list representation.
Commit dcb71c0
Commits on Jun 9, 2024
-
[SPARC][IAS] Add GNU extension for `addc`
Transform `addc imm, %rs, %rd` into `addc %rs, imm, %rd`. This is used in some GNU and Linux code.
Reviewers: s-barannikov, rorth, jrtc27, brad0
Reviewed By: s-barannikov
Pull Request: llvm#94245
Commit f20d8b9
-
[SPARC][IAS] Add support for %uhi and %ulo extensions
This adds support for GNU %uhi and %ulo extensions. Those resolve to the same relocations as %hh and %hm.
Reviewers: cyndyishida, dcci, brad0, jrtc27, aaupov, Endilll, rorth, maksfb, #reviewers-libcxxabi, s-barannikov, rafaelauler, ayermolo, #reviewers-libunwind, #reviewers-libcxx, daniel-grumberg, tbaederr
Reviewed By: s-barannikov
Pull Request: llvm#94246
Commit 44f9357
-
[SPARC][IAS] Add aliases for %asr20-21 as defined in JPS1
This adds the %set_softint and %clear_softint aliases for %asr20 and %asr21 as defined in JPS1.
Reviewers: jrtc27, brad0, s-barannikov, rorth
Reviewed By: s-barannikov
Pull Request: llvm#94247
Commit 715a5d8
-
[clang][Interp][NFC] Refactor lvalue-to-rvalue conversion code
Really perform the conversion always if the flag is set and don't make it dependent on whether we're checking the result for initialization.
Commit cc8fa1e
-
[clang-tidy] Ignore non-math operators in readability-math-missing-parentheses (llvm#94654)
Do not emit warnings for non-math operators. Closes llvm#92516
Commit d211abc
-
Commit 338cbfe
-
[ARM] vector-store.ll - add big-endian test coverage
Based on feedback on llvm#94863
Commit 32b7043
-
[clang-tidy] Ignore implicit functions in readability-implicit-bool-conversion (llvm#94512)
Ignore implicit declarations and defaulted functions. This helps with issues in generated code, such as the C++ spaceship operator. Closes llvm#93409
Commit e329bfc
-
[clang-tidy] Extend modernize-use-designated-initializers with new options (llvm#94651)
Add StrictCStandardCompliance and StrictCppStandardCompliance options that default to true. Closes llvm#83732
Commit 31b84d4
-
[clang-tidy] Improve bugprone-multi-level-implicit-pointer-conversion (llvm#94524)
Ignore implicit pointer conversions that are part of a cast expression. Closes llvm#93959
Commit b55fb56
-
Commit 46d94bd
-
[DAG] FoldConstantArithmetic - allow binop folding to work with differing bitcasted constants (llvm#94863)
We currently only constant fold binop(bitcast(c1),bitcast(c2)) if c1 and c2 are both bitcasted and from the same type. This patch relaxes this assumption to allow the constant build vectors to originate from different types (and allows cases where only one operand was bitcasted). We still ensure we bitcast back to one of the original types if both operands were bitcasted (we assume that if we have a non-bitcasted constant then it's legal to keep using that type).
Commit 53fecef
-
[DAG] Fold fdiv X, c2 -> fmul X, 1/c2 without AllowReciprocal if exact (llvm#93882)
This moves the combine of fdiv by constant to fmul out of an `if (Options.UnsafeFPMath || Flags.hasAllowReciprocal())` block, so that it triggers if the divide is exact. An extra check for Recip.isDenormal() is added, as multiple places refer to it being unsafe or slow on certain platforms.
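One way to picture the "exact" condition: for binary floating point, `1/c` is exactly representable when `c` is a power of two, so `x / c` and `x * (1/c)` give the same result. The helper below is a hypothetical sketch of such a check on `float`, including a rejection of denormal reciprocals; it is not the DAG combiner code and the name `getExactReciprocal` is invented here.

```cpp
#include <cmath>

// Hypothetical helper: returns true and stores 1/C in Recip only when the
// reciprocal is exact (C is a power of two) and is a normal number.
bool getExactReciprocal(float C, float &Recip) {
  if (!std::isfinite(C) || C == 0.0f)
    return false;
  int Exp = 0;
  // C == Mant * 2^Exp with |Mant| in [0.5, 1); powers of two have |Mant| == 0.5.
  float Mant = std::frexp(C, &Exp);
  if (std::fabs(Mant) != 0.5f)
    return false;
  float R = 1.0f / C;
  // Reject denormal results (and overflow to infinity for tiny C).
  if (std::fpclassify(R) != FP_NORMAL)
    return false;
  Recip = R;
  return true;
}
```

For example, a divisor of 4 yields the reciprocal 0.25, while a divisor of 3 is rejected because 1/3 has no exact binary representation.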
Commit a284bdb
-
[VPlan] Handle more cases in VPInstruction::onlyFirstPartUsed.
Handle binary ops and a few other instructions in onlyFirstPartUsed; they only use the first part if they themselves only have their first part used.
Commit 2f4ebf8
-
Commit cb8e936
-
Commit 69cd2d2
-
Commit 5bb9c08
-
[AMDGPU] Swap range metadata to attribute for workitem id. (llvm#94871)
Swap out range metadata to range attribute for calls to be able to deprecate range metadata on calls in the future.
Commit cc19374
-
[SPARC][IAS] Add named prefetch tag constants
This adds named tag constants (such as `#one_write` and `#one_read`) for the prefetch instruction.
Reviewers: jrtc27, rorth, brad0, s-barannikov
Reviewed By: s-barannikov
Pull Request: llvm#94249
Commit 2388129
-
[SPARC][IAS] Add support for `prefetcha` instruction
This adds support for the `prefetcha` instruction for prefetching from alternate address spaces.
Reviewers: jrtc27, brad0, rorth, s-barannikov
Reviewed By: s-barannikov
Pull Request: llvm#94250
Commit 41f2ea0
-
Commit 8901f71
-
[SPARC][IAS] Handle the case of non-4-byte aligned writeNopData
If the Count passed into writeNopData is not a multiple of four, pad with a small number of zero bytes before writing the NOP stream. This makes it match the behavior of GNU binutils.
Reviewers: brad0, rorth, s-barannikov, jrtc27
Reviewed By: s-barannikov
Pull Request: llvm#94251
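The padding rule can be sketched as a standalone function. This is an illustration under assumptions, not the backend code: `makeNopPadding` is a hypothetical name, the leading zero count is taken to be `Count % 4`, and the NOP is assumed to be the 4-byte SPARC encoding `0x01000000` emitted big-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: build Count bytes of padding, with Count % 4 zero
// bytes first and whole 4-byte NOP instructions after them.
std::vector<std::uint8_t> makeNopPadding(std::uint64_t Count) {
  std::vector<std::uint8_t> Out;
  // Leading zero bytes cover the part that cannot hold a full instruction.
  Out.insert(Out.end(), static_cast<std::size_t>(Count % 4), std::uint8_t{0});
  // Assumed NOP encoding 0x01000000, most significant byte first.
  for (std::uint64_t I = 0, N = Count / 4; I != N; ++I) {
    Out.push_back(0x01);
    Out.push_back(0x00);
    Out.push_back(0x00);
    Out.push_back(0x00);
  }
  return Out;
}
```

Putting the zeros first keeps every NOP that follows on a 4-byte boundary relative to the end of the padded region.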
Commit 2bc36af
-
[SPARC][IAS] Add movr(n)e alias for movr(n)z
This adds the alternate mnemonics for movrz and movrnz.
Reviewers: s-barannikov, jrtc27, brad0, rorth
Reviewed By: s-barannikov
Pull Request: llvm#94252
Commit e0b9cce
-
[libc++][TZDB] Implements time_zone get_info(local_time). (llvm#89537)
Implements parts of:
- P0355 Extending chrono to Calendars and Time Zones
Commit de736d9
-
[Instrumentation] Remove an extraneous ArrayRef (NFC) (llvm#94890)
We can implicitly convert RemainingVDs to an ArrayRef. Note that RemainingVDs is of type SmallVector<InstrProfValueData, 24>.
Commit f7ccb32
-
Commit e090bac
-
Commit 089c4bb
-
[RISCV] Cleanup some Constraint parameters in RISCVInstrInfoVPseudos.td. NFC
Remove unneeded parameters or sync into class if they are only ever used with one value.
Commit add8908
-
[InstCombine] Fix missing argument typo in `InstCombinerImpl::foldICmpShlConstant` (llvm#94899)
Closes llvm#94897.
Commit e4b0655
-
GlobalISel: Remove faulty assert in buildAtomicRMW op
Vectors are supported for fp operations now, so remove the assert. The supported type/operation combinations are best left for the verifier. Avoids regression in future commit that starts treating some vector cases as legal.
Commit 014446c
-
[NFC][mlir][gpu] Fully-qualify all namespaces in the GPU compilation interfaces (llvm#94908)
Fully qualify all namespaces appearing in `GPUTargetAttrInterface` and `OffloadingLLVMTranslationAttrInterface`. If they're not fully qualified then out-of-tree dialects might encounter name resolution errors.
Commit d639b91
-
[Clang][OpenMP] throw compilation error instead of crash in Stmt::OMPScopeDirectiveClass case (llvm#77535) (llvm#84135)
Fix llvm#77535: change the unstable assertion into a compilation error and add a test for it.
Commit dbe63e3
-
[ProfileData] Refactor VTableNamePtr and CompressedVTableNamesLen (NFC) (llvm#94859)
VTableNamePtr and CompressedVTableNamesLen are always used together to create a StringRef in getSymtab. We can create the StringRef ahead of time in readHeader. This way, IndexedInstrProfReader becomes a tiny bit simpler with fewer member variables. Also, StringRef default-constructs itself with its Data and Length set to nullptr and 0, respectively, which is exactly what we need.
Commit 521238d
-
[libc++][TZDB] Implements time_zone::to_sys. (llvm#90394)
This implements the throwing overload and the exception classes thrown by this overload.
Implements parts of:
- P0355 Extending chrono to Calendars and Time Zones
Commit 77116bd
-
[NFC][mlir][gpu] Make sym_name an inherent attr in GPUModuleOp (llvm#94918)
Make `sym_name` an inherent attr in GPUModuleOp so that it doesn't show in the discardable attributes. The change is safe as the attribute is always expected to be present.
Commit 54373e0
-
MCInst: decrease inline element count to 6. NFC
MCInst is primarily used in local variables and MCRelaxableFragment (mostly JMP/JCC for x86). Reducing the inline element count can make MCRelaxableFragment smaller, potentially leading to a lower peak RSS.
When compiling sqlite3.c, x86-64 has the largest maximum numOperands.
aarch64: 5; ppc64: 6; riscv64: 3; s390x: 6; x86-64: 8
Here is the frequency table for x86-64:
max getNumOperands: 8
0: 676
1: 37892
2: 84046
3: 26767
4: 1640
5: 1222
6: 80794
7: 768
8: 22
Pull Request: llvm#94913
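The frequency table makes it easy to estimate how many instructions would not fit in the inline storage. The sketch below assumes each row reads "operand count: number of instructions" and uses hypothetical helper names; it only does arithmetic on the figures quoted above.

```cpp
#include <array>
#include <cstdint>

// x86-64 figures from the commit message, indexed by operand count 0..8.
constexpr std::array<std::uint64_t, 9> OperandCountFreq = {
    676, 37892, 84046, 26767, 1640, 1222, 80794, 768, 22};

// Total number of instructions in the table.
constexpr std::uint64_t totalInsts() {
  std::uint64_t N = 0;
  for (std::uint64_t F : OperandCountFreq)
    N += F;
  return N;
}

// Number of instructions with more operands than the inline capacity.
constexpr std::uint64_t instsAbove(unsigned InlineCapacity) {
  std::uint64_t N = 0;
  for (unsigned I = InlineCapacity + 1; I < OperandCountFreq.size(); ++I)
    N += OperandCountFreq[I];
  return N;
}
```

Under that reading, 790 of 233827 instructions (about 0.34%) have more than 6 operands, so only a small fraction would exceed an inline capacity of 6.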
Commit acf6721
Commits on Jun 10, 2024
-
Commit 63ef2ec
-
Commit cbd7eab
-
[mlir][python] Fix attribute registration in ir.py (llvm#94615)
This PR fixes attribute registration for `SI8Attr` and `UI8Attr` in `ir.py`.
Commit 367d502
-
[ProfileData] Refactor BinaryIdsStart and BinaryIdsSize (NFC) (llvm#94922)
BinaryIdsStart and BinaryIdsSize in IndexedInstrProfReader are always used together, so this patch packages them into an ArrayRef<uint8_t>. For now, readBinaryIdsInternal immediately unpacks ArrayRef into its constituents to avoid touching the rest of readBinaryIdsInternal.
Commit 4403cdb
-
[MC,test] Reorganize relax-recompute-align.s & layout-interdependency.s
relax-recompute-align.s might change when we change the fragment relaxation approach.
Commit bf0d76d
-
Commit cb1a727
-
Lazy relaxation caused hash table lookups (`getFragmentOffset`) and complex use/compute interdependencies. Some expressions involving forward-declared symbols (e.g. `subsection-if.s`) cannot be computed. Recursion detection requires complex `IsBeingLaidOut` (https://reviews.llvm.org/D79570). D76114's `invalidateFragmentsFrom` makes lazy relaxation even less useful. Switch to eager relaxation to greatly simplify code and resolve these issues. This change also removes a `getPrevNode` use, which makes it more feasible to replace the fragment representation, which might yield a large peak RSS win. Minor downsides: The number of section relaxations may increase (offset by avoiding the hash table lookup). For relax-recompute-align.s, the computed layout is not optimal.
Commit 9d0754a
-
[mlir][bufferization] Fix handling of indirect function calls (llvm#94896)
This commit fixes a crash in the ownership-based buffer deallocation pass when indirectly calling a function via SSA value. Such functions must be conservatively assumed to be public. Fixes llvm#94780.
Commit 13896b6
-
Configuration menu - View commit details
-
Copy full SHA for bb4ee27 - Browse repository at this point
Copy the full SHA bb4ee27View commit details -
[libc++][TZDB] Implements time_zone::to_sys. (llvm#90901)
This implements the overload with the choose argument and adds this enum. Implements parts of: - P0355 Extending chrono to Calendars and Time Zones
Commit 87cedbe
Commit a47e40b
[lld] Remove const qualifier on symbolKind (NFC) (llvm#94753)
The symbol including this member is being overwritten by memcpy here: https://github.com/llvm/llvm-project/blob/2117677e304d334326f6591f3c75fb2f34dc4bcb/lld/COFF/SymbolTable.cpp#L496-L509
Commit a6929db
[CodeGen] Simplify codegen for array initialization (llvm#93956)
This makes codegen for array initialization simpler in two ways: 1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers. 2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from llvm#93823.
Commit 12d24e0
[TLI] ReplaceWithVecLib: drop Instruction support (llvm#94365)
Refactor the pass to only support `IntrinsicInst` calls. `ReplaceWithVecLib` used to support instructions, as AArch64 was using this pass to replace a vectorized frem instruction with the fmod vector library call (through TLI). As this replacement is now done by the codegen (llvm#83859), there is no need for this pass to support instructions. Additionally, removed 'frem' tests from:
- AArch64/replace-with-veclib-armpl.ll
- AArch64/replace-with-veclib-sleef-scalable.ll
- AArch64/replace-with-veclib-sleef.ll

Such testing is done at codegen level:
- llvm#83859
Commit e4790ce
[dexter] Correctly identify stop-reason while driving VisualStudio (l…
…lvm#94754) Prior to this patch VisualStudio._get_step_info incorrectly identifies the reason the debugger has stopped. e.g., stepping through a program would be reported as a StopReason.Breakpoint rather than StopReason.Step. Fix. No test added as there are no VisualStudio tests (tested locally).
Commit 832b91f
Commit e58f830
Reapply [ConstantFold] Remove non-trivial gep-of-gep fold (llvm#93823)
Reapply after llvm#93956, which changed clang array initialization codegen to avoid size regressions for unoptimized builds. ----- This fold is subtly incorrect, because DL-unaware constant folding does not know the correct index type to use, and just performs the addition in the type that happens to already be there. This is incorrect, since sext(X)+sext(Y) is generally not the same as sext(X+Y). See the `@constexpr_gep_of_gep_with_narrow_type()` for a miscompile with the current implementation. One could try to restrict the fold to cases where no overflow occurs, but I'm not bothering with that here, because the DL-aware constant folding will take care of this anyway. I've only kept the straightforward zero-index case, where we just concatenate two GEPs.
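The claim that sext(X)+sext(Y) can differ from sext(X+Y) can be checked with a small worked example. The sketch below models 8-bit indices with `std::int8_t` and a 64-bit index type with `std::int64_t`. These widths and the function names are assumptions chosen for illustration, and the code is not taken from the patch.

```cpp
#include <cstdint>

// Add two 8-bit indices in the narrow type, wrapping modulo 2^8 the way an
// i8 add would, and only then sign-extend the result to 64 bits.
constexpr std::int64_t addThenSext(std::int8_t X, std::int8_t Y) {
  const auto Wrapped = static_cast<std::uint8_t>(
      static_cast<std::uint8_t>(X) + static_cast<std::uint8_t>(Y));
  return static_cast<std::int64_t>(static_cast<std::int8_t>(Wrapped));
}

// Sign-extend each 8-bit index to 64 bits first, then add in the wide type.
constexpr std::int64_t sextThenAdd(std::int8_t X, std::int8_t Y) {
  return static_cast<std::int64_t>(X) + static_cast<std::int64_t>(Y);
}

// 100 + 100 overflows 8 bits, so the two orders disagree.
static_assert(addThenSext(100, 100) == -56, "narrow add wraps around");
static_assert(sextThenAdd(100, 100) == 200, "wide add keeps the true sum");
// Without overflow, both orders agree.
static_assert(addThenSext(3, 4) == sextThenAdd(3, 4), "no overflow, same result");
```

Folding two GEP indices in whatever narrow type they already have corresponds to `addThenSext`, while the address computation that actually takes place corresponds to `sextThenAdd`.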
Commit cc158d4
Commit c0b65a2
[clang][analyzer] Improved PointerSubChecker (llvm#93676)
The checker is made more exact (only pointers into an array are allowed, and the array index is checked), and more tests are added.
Commit 26224ca
[RemoveDIs] C API: Add before-dbg-record versions of IRBuilder positi…
…on funcs (llvm#92417) Add `LLVMPositionBuilderBeforeDbgRecords` and `LLVMPositionBuilderBeforeInstrAndDbgRecords` to `llvm/include/llvm-c/Core.h` which behave the same as `LLVMPositionBuilder` and `LLVMPositionBuilderBefore` except that the position is set before debug records attached to the target instruction (the existing functions set the insertion point to after any attached debug records). More info on debug records and the migration towards using them can be found here: https://llvm.org/docs/RemoveDIsDebugInfo.html The distinction is important in some situations. An important example is when inserting a phi before another instruction which has debug records attached to it (these come "before" the instruction). Inserting before the instruction but after the debug records would result in having debug records before a phi, which is illegal. That results in an assertion failure: `llvm/lib/IR/Instruction.cpp:166: Assertion '!isa<PHINode>(this) && "Inserting PHI after debug-records!"' failed.` In llvm (C++) we've added a bit to instruction iterators that carries around the extra information. Adding dedicated functions seemed like the least invasive and least surprising way to update the C API. Update llvm/tools/llvm-c-test/debuginfo.c to test this functionality. Update the OCaml bindings, the migration docs and release notes.
Commit d732a32
[flang] lower SHAPE with assumed-rank arguments (llvm#94812)
Allocate result statically on the stack (using max rank) and use the runtime to fill it in correctly.
Commit 0257f9c
[lldb] Fix redundant condition in compression type check (NFC) (llvm#…
…94841) The `else if` condition for checking `m_compression_type` is redundant as it matches with a previous `if` condition, making the expression always false. Reported by cppcheck as a possible cut-and-paste error. Caught by cppcheck - lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunication.cpp:543:35: style: Expression is always false because 'else if' condition matches previous condition at line 535. [multiCondition] Fix llvm#91222
Commit 0af2e75
[lldb] Remove redundant condition in watch mask check (NFC) (llvm#94842)
This issue is reported by cppcheck as a pointless test in the watch mask check. The `else if` condition is opposite to the previous `if` condition, making the expression always true. Caught by cppcheck - lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp:509:25: style: Expression is always true because 'else if' condition is opposite to previous condition at line 505. [multiCondition] Fix llvm#91223
Commit 30bfab3
Commit 38c01c3
Commit 760d880
Commit a0faf79
[lldb] Gracefully down TestCoroutineHandle test in case the 'coroutin…
…e' feature is missing (llvm#94903) Do not let compilation fail if the target platform does not support the 'coroutine' C++ features. Just compile without them and let lldb know about the missing/unsupported feature.
Commit 23b8f59
[KnownBits] Speed up ForeachKnownBits in unit test. NFC. (llvm#94939)
Use fast unsigned arithmetic before constructing an APInt. This gives me a ~2x speed up when running this in my Release+Asserts build: $ unittests/Support/SupportTests --gtest_filter=KnownBitsTest.*Exhaustive
Commit f97bcdb
Commit c9fd7b1
[clang-tidy] `doesNotMutateObject`: Handle calls to member functions … (llvm#94362)
…and operators that have non-const overloads. This allows `unnecessary-copy-initialization` to warn on more cases. The common case is a class with a set of const/non-const overloads (e.g. `std::vector::operator[]`).
```
void F() {
  std::vector<Expensive> v;
  // ...
  const Expensive e = v[i];
}
```
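Why overloaded accessors matter here can be seen in a small sketch. `IntBox` and `readFirst` are hypothetical names invented for illustration and do not come from clang-tidy. The point is only that, on a non-const object, overload resolution selects the non-const `operator[]` even when the caller merely reads the element, so an analysis that treats every non-const member call as a mutation would miss such read-only uses.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical container with the usual const / non-const operator[] pair.
struct IntBox {
  int Items[3];
  int &operator[](std::size_t I) { return Items[I]; }
  const int &operator[](std::size_t I) const { return Items[I]; }
};

// On a non-const object the non-const overload is chosen, even for a read.
static_assert(
    std::is_same<decltype(std::declval<IntBox &>()[0]), int &>::value,
    "non-const object selects the non-const overload");
// Only a const object (or const reference) selects the const overload.
static_assert(
    std::is_same<decltype(std::declval<const IntBox &>()[0]),
                 const int &>::value,
    "const object selects the const overload");

// Reads one element through the non-const overload without mutating Box.
inline int readFirst(IntBox &Box) {
  const int Copy = Box[0];
  return Copy;
}
```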
Commit 415a82c
Commit 317ed77
[flang] use hlfir base when translating assumed-rank entity to fir::E…
…xtendedValue (llvm#94822) The hlfir::Entity to fir::ExtendedValue conversion usually uses the "fir base" output of hlfir.declare (which is the same as the input) to avoid introducing temporary descriptors for the sole purpose of updating lower bound information. This is possible because local lower bounds, if any, are tracked in a vector inside the fir::ExtendedValue. With assumed-ranks, the lower bounds cannot be tracked inside the fir::ExtendedValue vector (their number is unknown at compile time). Hence, the fir.box/fir.class used in fir::ExtendedValue in lowering must always contain accurate local lower bound information.
Commit 81469a2
[flang][Transforms][NFC] reduce boilerplate in func attr pass (llvm#9…
…4739) Use tablegen to automatically create the pass constructor. The purpose of this pass is to add attributes to functions, so it doesn't need to work on other top level operations.
Commit a6129a5
[Clang][C++23] update constexpr diagnostics for missing return statem…
…ents per P2448 (llvm#94123) Fixes llvm#92583
Commit ae9d89d
[KnownBits] Speed up conflict handling in ForeachKnownBits in unit te…
…st. (llvm#94943) Exit early if known bits have a conflict. This gives me a ~15% speed up when running this in my Release+Asserts build: $ unittests/Support/SupportTests --gtest_filter=KnownBitsTest.*Exhaustive
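The early exit on conflicting known bits can be sketched as below. This is a simplified model under stated assumptions, not the unit-test helper itself: plain `unsigned` masks stand in for `KnownBits`, `countValuesMatching` is a hypothetical name, and `Bits` is assumed to be smaller than 32.

```cpp
// Count the values of width Bits that are consistent with the known-zero mask
// Zero and the known-one mask One. A bit set in both masks is a conflict.
constexpr unsigned countValuesMatching(unsigned Zero, unsigned One,
                                       unsigned Bits) {
  // Early exit: no value can have the same bit equal to both 0 and 1, so the
  // loop below would reject every candidate anyway.
  if ((Zero & One) != 0)
    return 0;

  const unsigned Max = 1u << Bits; // assumes Bits < 32
  unsigned Count = 0;
  for (unsigned N = 0; N < Max; ++N) {
    // N must have no bit set that is known zero, and must have every bit set
    // that is known one.
    if ((N & Zero) == 0 && (~N & One) == 0)
      ++Count;
  }
  return Count;
}
```

With 4 bits, bit 0 known zero and bit 3 known one, two bits remain free, so four values match. Working on plain unsigned integers in the loop also reflects the earlier idea of deferring `APInt` construction until a candidate is actually needed.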
Commit ecb9d94
[flang][OpenMP] Fix unused prefixes in function-filtering-2 test (llv…
…m#94330) Co-authored-by: Andrew Gozillon <Andrew.Gozillon@amd.com>
Commit 8dc8b9f
[libc++][TZDB] Implements time_zone::to_local. (llvm#91003)
Implements parts of: - P0355 Extending chrono to Calendars and Time Zones
Commit da03175
Commit fe0dee4
[mlir][emitc] Remove copy from scf.for lowering (llvm#94898)
Remove the copy into fresh variables done when lowering scf.for into emitc.for and use the variables carrying the init and iter values as the loop's results.
Commit 8b7e836
Commits on Sep 5, 2024
Commit 6f11615
Commit 6ae8317