[AutoBump] Merge with 51365212 (Aug 25) (10) #363
base: bump_to_b96f18b2
Commits on Aug 22, 2024
[AMDGPU] GFX12 VMEM loads can write VGPR results out of order (llvm#105549)
Fix SIInsertWaitcnts to account for this by adding extra waits to avoid WAW dependencies.
Commit 5506831
[cmake] Include GNUInstallDirs before using variables defined by it. (llvm#83807)
This fixes an odd problem with the regex when `CMAKE_INSTALL_LIBDIR` is not defined: `string sub-command REGEX, mode REPLACE: regex "$" matched an empty string.` Fixes llvm#83802
Commit 5bbd598
[DebugInfo][NFC] Constify debug DbgVariableRecord::{isDbgValue,isDbgDeclare} (llvm#105570)
Constify debug DbgVariableRecord::{isDbgValue,isDbgDeclare}.
Commit 743e70b
Revert "[lldb][swig] Use the correct variable in the return statement"
This reverts commit 6528157. I'm reverting llvm#104523 (llvm@f01f80c) and this fixup belongs to the same series of changes.
Commit 7323e7e
Revert "[lldb-dap] Mark hidden frames as "subtle" (llvm#105457)"
This reverts commit 6f45602, which depends on llvm#104523, which I'm reverting.
Commit aa70f83
Revert "[lldb] Extend frame recognizers to hide frames from backtraces (llvm#104523)"
This reverts commit f01f80c. This commit introduces an msan violation. See the discussion on llvm#104523.
Commit 547917a
[clang][bytecode] Fix void unary * operators (llvm#105640)
Discard the subexpr.
Commit 125aa10
Commit 6932f47
[NFC][SetTheory] Refactor to use const pointers and range loops (llvm#105544)
- Refactor SetTheory code to use const pointers when possible.
- Use auto for variables initialized using dyn_cast<>.
- Use range based for loops and early continue.
Commit d7da79f
[libc++] Fix the documentation build
There was a duplicate link target.
Commit c73b14c
Commit 6d30b67
[mlir][OpenMP] Add optional alloc region to reduction decl (llvm#102522)
This region is intended to separate alloca operations from reduction variable initialization. This makes it easier to hoist allocas to the entry block before control flow and complex code for initialization. The verifier checks that there is at most one block in the alloc region. This is not sufficient to avoid control flow in general MLIR, but by the time we are converting to LLVMIR structured control flow should already have been lowered to the cf dialect. 1/3 Part 2: llvm#102524 Part 3: llvm#102525
Commit a964635
[mlir][OpenMP] Convert reduction alloc region to LLVMIR (llvm#102524)
The intention of this change is to ensure that allocas end up in the entry block, not spread out amongst complex reduction variable initialization code. The tests we have are quite minimized for readability and maintainability, making the benefits less obvious. The use case for this is when there are multiple reduction variables, each with multiple blocks inside of the init region for that reduction. 2/3 Part 1: llvm#102522 Part 3: llvm#102525
Commit 2efc81a
[flang][OpenMP] use reduction alloc region (llvm#102525)
I removed the `*-hlfir*` tests because they are duplicate now that the other tests have been updated to use the HLFIR lowering. 3/3 Part 1: llvm#102522 Part 2: llvm#102524
Commit f2027a9
Commit d163935
[Clang][Sema] Rebuild template parameters for out-of-line template definitions and partial specializations (llvm#104030)
We need to rebuild the template parameters of out-of-line definitions/specializations of member templates in the context of the current instantiation for the purposes of declaration matching. We already do this for function templates and class templates, but not variable templates, partial specializations of variable template, and partial specializations of class templates. This patch fixes the latter cases.
Commit c82f797
[clang][bytecode] Allow adding offsets to function pointers (llvm#105641)
Commit db94852
[InstCombine] Add more tests for foldLogOpOfMaskedICmps transform (NFC)
Tests for cases that would have been regressed by llvm#104941.
Commit 7e3f9dd
[mlir][OpenMP][NFC] clean up optional reduction region parsing (llvm#…
Commit dd3b43a
[mlir][LLVM] Add support for constant struct with multiple fields (llvm#102752)
Currently `mlir.llvm.constant` of structure types restricts that the structure type effectively represents a complex type -- it must have exactly two fields of the same type and the field type must be either an integer type or a float type. This PR relaxes this restriction and it allows the structure type to have an arbitrary number of fields.
Commit 318b067
[Analysis] Teach ScalarEvolution::getRangeRef about more dereferenceable objects (llvm#104778)
Whilst dealing with review comments on llvm#96752 I discovered that SCEV does not know about the dereferenceable attribute on function arguments so I have updated getRangeRef to make use of it by calling getPointerDereferenceableBytes.
Commit d46812a
[PowerPC] Fix mask for __st[d/w/h/b]cx builtins (llvm#104453)
These builtins are currently returning CR0 which will have the format [0, 0, flag_true_if_saved, XER]. We only want to return flag_true_if_saved. This patch adds a shift to remove the XER bit before returning.
Commit 327edbe
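A minimal sketch of the shift described in the PowerPC commit above, modelling CR0 as a 4-bit value laid out most significant bit first as [0, 0, flag_true_if_saved, XER]. The helper names `makeCR0` and `storeSucceeded` are hypothetical and only illustrate the bit arithmetic; they are not the builtin's lowering.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the CR0 field from the commit message:
// bit 1 holds flag_true_if_saved, bit 0 holds the XER bit.
constexpr std::uint32_t makeCR0(bool Saved, bool XER) {
  return (static_cast<std::uint32_t>(Saved) << 1) |
         static_cast<std::uint32_t>(XER);
}

// Shift the XER bit out so that only flag_true_if_saved remains.
constexpr std::uint32_t storeSucceeded(std::uint32_t CR0) {
  return (CR0 >> 1) & 1u;
}

// The result no longer depends on the XER bit.
static_assert(storeSucceeded(makeCR0(true, true)) == 1u);
static_assert(storeSucceeded(makeCR0(false, true)) == 0u);
```

Without the shift, a caller testing the raw value for non-zero would see a failed store as successful whenever the XER bit happened to be set.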
Commit 11e1378
Commit c8f40e7
[InstCombine] Handle logical op for and/or of icmp 0/-1
This aligns the transform with what foldLogOpOfMaskedICmp() does.
Commit 32679e1
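The commit message above does not spell the fold out. Assuming it is the familiar identity that two compares against 0 (or against -1) collapse into one compare of the or (or and) of the operands, the arithmetic can be checked exhaustively on 8-bit values. The function names are hypothetical, and C++ `&&` only approximates LLVM's logical (select-based) and, since it says nothing about poison.

```cpp
#include <cassert>
#include <cstdint>

// Two compares joined by a short-circuit ("logical") and.
constexpr bool bothZeroLogical(std::uint8_t A, std::uint8_t B) {
  return A == 0 && B == 0;
}
// Folded form: a single compare of the or of both operands.
constexpr bool bothZeroFolded(std::uint8_t A, std::uint8_t B) {
  return (A | B) == 0;
}

// The all-ones (-1) variant uses and instead of or.
constexpr bool bothAllOnesLogical(std::uint8_t A, std::uint8_t B) {
  return A == 0xFF && B == 0xFF;
}
constexpr bool bothAllOnesFolded(std::uint8_t A, std::uint8_t B) {
  return (A & B) == 0xFF;
}

// Exhaustively compare both forms over every pair of 8-bit inputs.
inline bool foldsAgree() {
  for (int A = 0; A < 256; ++A)
    for (int B = 0; B < 256; ++B) {
      const auto X = static_cast<std::uint8_t>(A);
      const auto Y = static_cast<std::uint8_t>(B);
      if (bothZeroLogical(X, Y) != bothZeroFolded(X, Y) ||
          bothAllOnesLogical(X, Y) != bothAllOnesFolded(X, Y))
        return false;
    }
  return true;
}
```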
[libc++][docs] Major update to the documentation
- Landing page: add link to the libc++ Discord channel
- Landing page: reorder "Getting Involved" above "Design documents"
- Landing page: remove "Notes and Known Issues" which was completely outdated
- Rename "Using Libc++" to "User Documentation" and update contents
- Rename "Building Libc++" to "Vendor Documentation" and update contents
The "BuildingLibcxx" and "UsingLibcxx" pages have basically been used for vendor and user documentation respectively. However, they were named in a way that doesn't really make that clear. Renaming the pages now gives us a location to clearly document what we target at vendors and what we target at users, and to do that separately.
Commit 41dcdfb
[DAG][RISCV] Use vp_reduce_* when widening illegal types for reductions (llvm#105455)
This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element.
For RISCV specifically, it's worth noting that an alternate padded lowering is available when VL is one less than a power of two, and LMUL <= m1. We could slide the vector operand up by one, and insert the padding via a vslide1up. We don't currently pattern match this, but we could. This form would arguably be better iff the surrounding code wanted VL=4. This patch will force a VL toggle in that case instead.
Basically, it comes down to a question of whether we think odd sized vectors are going to appear clustered with odd size vector operations, or mixed in with larger power of two operations.
Note there is a potential downside of using vp nodes; we lose any generic DAG combines which might have applied to the widened form.
Commit 00baa1a
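The two widening strategies contrasted above can be modelled in scalar code. This is only a sketch with hypothetical names: a 3-element add reduction is widened to 4 lanes either by padding with the neutral element (0 for add) or by reducing just the first EVL lanes and ignoring whatever sits in the extra lane.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// Padded lowering: fill the extra lane with the neutral element for add,
// then reduce all four lanes.
inline int reduceAddPadded(const std::array<int, 3> &V) {
  const Vec4 Wide{V[0], V[1], V[2], 0};
  int Sum = 0;
  for (int X : Wide)
    Sum += X;
  return Sum;
}

// EVL-style lowering: the extra lane may hold anything, because only the
// first EVL lanes take part in the reduction.
inline int reduceAddEVL(const Vec4 &Wide, std::size_t EVL) {
  int Sum = 0;
  for (std::size_t I = 0; I < std::min(EVL, Wide.size()); ++I)
    Sum += Wide[I];
  return Sum;
}
```

Both forms give the same sum for the three live elements; the second avoids materialising the padding at the cost of a length-restricted operation.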
[RISCV] Introduce local peephole to reduce VLs based on demanded VL (llvm#104689)
This is a fairly narrow transform (at the moment) to reduce the VLs of instructions feeding a store with a smaller VL. Note that the goal of this transform isn't really to reduce VL - it's to reduce VL *toggles*. To our knowledge, small reductions in VL without also changing LMUL are generally not profitable on existing hardware.
For a single use instruction without side effects, fp exceptions, or a result dependency on VL, reducing VL is legal if only a subset of elements are legal. We'd already implemented this logic for vmv.v.v, and this patch simply applies it to stores as an alternate root.
Longer term, I plan to extend this to other root instructions (i.e. different kind of stores, reduces, etc..), and add a more general recursive walkback through operands.
One risk with the dataflow based approach is that we could be reducing VL of an instruction scheduled in a region with the wider VL (i.e. mixed mode computations) forcing an additional VL toggle. An example of this is the @insert_subvector_dag_loop test case, but it doesn't appear to happen widely. I think this is a risk we should accept.
Commit 26a8a85
[AArch64] optimise SVE cmp intrinsics with no active lanes (llvm#104779)
This patch extends llvm#73964 and optimises SVE cmp intrinsics to zero vector when predicate is zero.
Commit 29cb1e6
[libc++] Post-LLVM19-release docs cleanup (llvm#99667)
This patch removes obsolete status pages for projects that were completed: LLVM 18 release, C++20 Ranges and Spaceship support. Co-authored-by: Hristo Hristov <zingam@outlook.com>
Commit 58ac764
[SimplifyCFG] Fold switch over ucmp/scmp to icmp and br (llvm#105636)
If we switch over ucmp/scmp and have two switch cases going to the same destination, we can convert into icmp+br. Fixes llvm#105632.
Commit 4d85285
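A source-level analogy for the SimplifyCFG fold above, with hypothetical function names: a three-way compare yields -1, 0 or 1, and when two of the three switch cases share a destination, the whole switch is equivalent to one ordinary comparison and branch.

```cpp
#include <cassert>

// Three-way signed compare in the style of scmp: returns -1, 0 or 1.
constexpr int scmp(int A, int B) { return (A > B) - (A < B); }

// Switch over the three-way result; cases -1 and 0 go to the same place.
constexpr int viaSwitch(int A, int B) {
  switch (scmp(A, B)) {
  case -1:
  case 0:
    return 10; // shared destination
  default:
    return 20; // only the result 1 reaches this
  }
}

// The same control flow expressed as a single compare and branch.
constexpr int viaCompare(int A, int B) { return A <= B ? 10 : 20; }

static_assert(viaSwitch(1, 2) == viaCompare(1, 2));
static_assert(viaSwitch(2, 2) == viaCompare(2, 2));
static_assert(viaSwitch(3, 2) == viaCompare(3, 2));
```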
[SLP]Do not count extractelement costs in unreachable/landing pad blocks.
If the external user of the scalar to be extracted is in an unreachable/landing pad block, we can skip counting its cost.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: llvm#105667
Commit 9402bb0
[NFC] Replace bool <= bool comparison (llvm#102948)
Static analyser tool cppcheck flags ordered comparison with `bool`s. Replace with equivalent logical operators to prevent this. Closes llvm#102912
Commit ec5e585
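For reference, `A <= B` on two `bool` values is logical implication, so it can be written with logical operators instead. The commit message does not say which exact rewrite was used; `!A || B` below is one equivalent form, and the function names are hypothetical.

```cpp
#include <cassert>

// Ordered comparison on bools: the pattern the static analyser flags.
constexpr bool impliesOrdered(bool A, bool B) { return A <= B; }

// Equivalent logical form: "A implies B".
constexpr bool impliesLogical(bool A, bool B) { return !A || B; }

// All four input combinations agree.
static_assert(impliesOrdered(false, false) == impliesLogical(false, false));
static_assert(impliesOrdered(false, true) == impliesLogical(false, true));
static_assert(impliesOrdered(true, false) == impliesLogical(true, false));
static_assert(impliesOrdered(true, true) == impliesLogical(true, true));
```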
[AMDGPU] Generate checks for vector indexing. NFC. (llvm#105668)
This allows combining some test files that were only split because adding new RUN lines introduced too much churn in the checks.
Commit c4c5fdd
[RISCV][GISel] Implement canLowerReturn. (llvm#105465)
This allows us to handle return values that are too large to fit in x10 and x11. They will be converted to a sret by passing a pointer to where to store the return value.
Commit 8ba2ae3
[DwarfEhPrepare] Assign dummy debug location for more inserted _Unwind_Resume calls (llvm#105513)
Similar to the fix for llvm#57469, ensure that the other `_Unwind_Resume` call emitted by DwarfEHPrepare has a debug location if needed. This fixes nbdd0121/unwinding#34.
Commit e76db25
[SLP]Improve/fix subvectors in gather/buildvector nodes handling
SLP vectorizer has an estimation for gather/buildvector nodes, which contain some scalar loads. SLP vectorizer performs a pretty similar (but large in SLOCs) estimation, which is not always correct. Instead, this patch implements clustering analysis and actual node allocation with the full analysis for the vectorized clustered scalars (not only loads, but also some other instructions) with the correct cost estimation and vector insert instructions. Improves overall vectorization quality and simplifies analysis/estimations.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: llvm#104144
Commit 69332bb
Commit 3c54aa1
[lldb] Pick the correct architecture when target and core file disagree (llvm#105576)
In f9f3316, Adrian fixed an issue where LLDB wouldn't update the target's architecture when the process reported a different triple that only differed in its sub-architecture. This unintentionally regressed core file debugging when the core file reports the base architecture (e.g. armv7) while the main binary knows the correct CPU subtype (e.g. armv7em). After the aforementioned change, we update the target architecture from armv7em to armv7. Fix the issue by trusting the target architecture over the ProcessMachCore process. rdar://133834304
Commit 9f41805
[ARM] Fix missing ELF FPU attributes for fp-armv8-fullfp16-d16 (llvm#105677)
An assembly input with
> .fpu fp-armv8-fullfp16-d16
crashes the compiler because the ELF FPU attribute emitter misses the respective entry. This patch fixes this.
Interestingly, compiling with -mfpu=fp-armv8-fullfp16-d16 does not cause the crash because FPv5_D16 is an alias in the compiler and
> .fpu fpv5-d16
is emitted instead, which does not crash.
The existing .fpu directive test with multiple FPUs serves the purpose of verifying that each possible FPU option is defined, but does not trigger the crash because only the last .fpu directive goes effectively down the code path. Therefore one test for each FPU is required. Fixes llvm#105674.
Commit fe5d1f9
Commit b21756f
[AArch64] Lower aarch64_neon_saddlv via SADDLV nodes. (llvm#103307)
This mirrors what GISel already does, extending the existing lowering of aarch64_neon_saddlv/aarch64_neon_uaddlv to SADDLV/UADDLV. This allows us to remove some tablegen patterns, and provides a little nicer codegen in places as the nodes represent the result being in a vector register correctly.
Commit 8ab6140
Commit 24740ec
Reland "[asan] Remove debug tracing from `report_globals` (llvm#104404)" (llvm#105601)
This reverts commit 2704b80 and relands llvm#104404. The Darwin should not fail after llvm#105599.
Commit 8c6f8c2
This patch fixes warnings of the form: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9300:23: error: loop variable '[E, Idx]' creates a copy from type 'const value_type' (aka 'const std::pair<const llvm::slpvectorizer::BoUpSLP::TreeEntry *, unsigned int>') [-Werror,-Wrange-loop-construct]
Commit a625435
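The warning quoted above fires when a range-based for loop binds each element by value. A common way to silence it is to bind by const reference; whether the patch did exactly that is not shown here, so treat this as a sketch. `TreeEntry` is a hypothetical stand-in, not the SLP vectorizer's class.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real tree entry type.
struct TreeEntry {
  unsigned Id;
};

using EntryList = std::vector<std::pair<const TreeEntry *, unsigned>>;

inline unsigned sumIdsAndIndices(const EntryList &Entries) {
  unsigned Sum = 0;
  // `for (const auto [E, Idx] : Entries)` would copy every pair, which is
  // what -Wrange-loop-construct reports. Binding by reference avoids the copy.
  for (const auto &[E, Idx] : Entries)
    Sum += E->Id + Idx;
  return Sum;
}
```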
This patch fixes: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:6102:9: error: unused variable 'OpVT' [-Werror,-Wunused-variable]
Commit 0bd90ec
[AArch64,ELF] Allow implicit $d/$x at section beginning
The start state of a new section is `EMS_None`, often leading to a $d/$x at offset 0. Introduce a MCTargetOption/cl::opt "implicit-mapsyms" to allow an alternative behavior (ARM-software/abi-aa#274):
* Set the start state to `EMS_Data` or `EMS_A64`.
* For text sections, add an ending $x only if the final data is not instructions.
* For non-text sections, add an ending $d only if the final data is not data commands.
```
.section .text.1,"ax"
nop
// emit $d
.long 42
// emit $x
.section .text.2,"ax"
nop
```
This new behavior decreases the .symtab size significantly:
```
% bloaty a64-2/bin/clang -- a64-0/bin/clang
    FILE SIZE        VM SIZE
 --------------  --------------
  -5.4% -1.13Mi  [ = ]       0    .strtab
 -50.9% -4.09Mi  [ = ]       0    .symtab
  -4.0% -5.22Mi  [ = ]       0    TOTAL
```
---
This scheme works as long as the user can rule out some error scenarios:
* .text.1 assembled using the traditional behavior is combined with .text.2 using the new behavior
* A linker script combining non-text sections and text sections. The lack of mapping symbols in the non-text sections could make them treated as code, unless the linker inserts extra mapping symbols.
The above mix-and-match scenarios aren't an issue at all for a significant portion of users.
A text section may start with data commands in rare cases (e.g. -fsanitize=function) that many users don't care about. When combining `(.text.0; .word 0)` and `(.text.1; .word 0)`, the ending $x of .text.0 and the initial $d of .text.1 may have the same address. If both sections reside in the same file, ensure the ending symbol comes before the initial $d of .text.1, so that a dumb linker respecting the symbol order will place the ending $x before the initial $d. Disassemblers using stable sort will see both symbols at the same address, and the second will win.
When section ordering mechanisms (e.g. --symbol-ordering-file, --call-graph-profile-sort, `.text : { second.o(.text) first.o(.text) }`) are involved, the initial data in a text section following a text section with trailing data could be misidentified as code, but the issue is local and the risk could be acceptable.
Pull Request: llvm#99718
Commit 46707b0
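The three rules in the commit above can be modelled as a small state machine. This is a rough sketch, not the MC layer's code: all names are hypothetical, and it assumes text sections start in the code state and non-text sections in the data state (the message only says the start state becomes `EMS_Data` or `EMS_A64`).

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Code, Data };

// Returns the mapping symbols emitted for one section, in order.
inline std::vector<std::string>
mappingSymbols(const std::vector<Kind> &Items, bool IsText, bool Implicit) {
  enum class State { None, Code, Data };
  auto toState = [](Kind K) {
    return K == Kind::Code ? State::Code : State::Data;
  };
  auto name = [](State S) { return S == State::Code ? "$x" : "$d"; };

  // Assumed default state for the section kind under the implicit scheme.
  const State Home = IsText ? State::Code : State::Data;
  // Traditional scheme: start with no state, so the first item always
  // emits a symbol. Implicit scheme: start in the section's home state.
  State Cur = Implicit ? Home : State::None;

  std::vector<std::string> Out;
  for (Kind K : Items) {
    const State Next = toState(K);
    if (Next != Cur) {
      Out.push_back(name(Next));
      Cur = Next;
    }
  }
  // Implicit scheme only: close the section with its home symbol if the
  // last item left it in the other state.
  if (Implicit && Cur != Home)
    Out.push_back(name(Home));
  return Out;
}
```

For the `.text.1` example (a `nop` followed by `.long 42`) the model yields `$x`, `$d` under the traditional scheme and `$d`, `$x` under the implicit one; for `.text.2` (a lone `nop`) it yields one `$x` versus no symbols at all, which is where the `.symtab` savings come from.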
[AMDGPU][GlobalISel] Disable fixed-point iteration in all Combiners (llvm#105517)
Disable fixed-point iteration in all AMDGPU Combiners after llvm#102163. This saves around 2% compile time in ad hoc testing on some large graphics shaders. I did not notice any regressions in the generated code, just a bunch of harmless differences in instruction selection and register allocation.
Commit 2012b25
Commit 0926255
Commit 83fc989
[Driver] Add -Wa, options -mmapsyms={default,implicit}
-Wa,-mmapsyms=implicit enables the alternative mapping symbol scheme discussed at llvm#99718. While not conforming to the current aaelf64 ABI, the option is invaluable for those with full control over their toolchain, no reliance on weird relocatable files, and a strong focus on minimizing both relocatable and executable sizes. The option is discouraged when portability of the relocatable objects is a concern. https://maskray.me/blog/2024-07-21-mapping-symbols-rethinking-for-efficiency elaborates the risk. Pull Request: llvm#104542
Commit eb549da
Commit 6ec4c9c
Commit 933f722
[C23] Remove WG14 N2517 from the status page
This paper proposes no normative changes, just updates an example in the standard. It was incorrect for us to have marked it as No in the first place.
Commit 27727d8
[WebAssembly] Change half-precision feature name to fp16. (llvm#105434)
This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.
Commit 7d373ce
Commit bc860b4
[clang][bytecode] Fix 'if consteval' in non-constant contexts (llvm#104707)
The previous code made this a compile-time decision but it's not.
Commit b9c4c4c
[libc++] Adjust armv7 XFAIL target triple for the setfill_wchar_max test. (llvm#105586)
Also allow XFAIL for armv7-*-linux-gnueabihf targets, not only for armv7l-*.
Commit 4a2a1b5
[lldb] Change the two remaining SInt64 settings in Target to uint (llvm#105460)
TargetProperties.td had a few settings listed as signed integral values, but the Target.cpp methods reading those values were reading them as unsigned. e.g. target.max-memory-read-size, some accesses of target.max-children-count, still today, previously target.max-string-summary-length.
After Jonas' change to use templates to read these values in https://reviews.llvm.org/D149774, when the code tried to fetch these values, we'd eventually end up calling OptionValue::GetAsUInt64 which checks that the value is actually a UInt64 before returning it; finding that it was an SInt64, it would drop the user setting and return the default value. This manifested as a bug that target.max-memory-read-size is never used for memory read. target.max-children-count is less straightforward, where one read of that setting was fetching it as an int64_t, the other as a uint64_t.
I suspect all of these settings were originally marked as SInt64 so a user could do -1 for "infinite", getting it static_cast to a UINT64_MAX value along the way. I can't find any documentation for this behavior, but it seems like something Greg would have done. We've partially lost that behavior already via llvm#72233 for target.max-string-summary-length, and this further removes it.
We're still fetching UInt64's and returning them as uint32_t's but I'm not overly pressed about someone setting a count/size limit over 4GB.
I added a simple API test for the memory read setting limit.
Commit c1e401f
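The failure mode described above can be sketched with a tagged value and a type-checked getter. `OptionValue`, `getAsUInt64` and `readSetting` here are hypothetical stand-ins built on `std::variant`, not LLDB's classes; they only illustrate how a setting stored as signed falls back to the default when read as unsigned, and what the old "-1 means infinite" convention relied on.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <variant>

// Hypothetical stand-in for a typed option value: signed or unsigned.
using OptionValue = std::variant<std::int64_t, std::uint64_t>;

// Type-checked getter: yields a value only if it was stored as unsigned.
inline std::optional<std::uint64_t> getAsUInt64(const OptionValue &V) {
  if (const auto *U = std::get_if<std::uint64_t>(&V))
    return *U;
  return std::nullopt;
}

// Reader that silently falls back to the default on a type mismatch.
inline std::uint64_t readSetting(const OptionValue &V, std::uint64_t Default) {
  return getAsUInt64(V).value_or(Default);
}

// The "-1 for infinite" convention: converting -1 gives the maximum value.
static_assert(static_cast<std::uint64_t>(std::int64_t{-1}) ==
              std::numeric_limits<std::uint64_t>::max());
```

In this model a user value of 4096 stored as `std::int64_t` is ignored and the default is returned, while the same value stored as `std::uint64_t` is honoured, which mirrors why declaring the settings as unsigned fixes the read.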
Recommit "[FunctionAttrs] deduce attr `cold` on functions if all CG paths call a `cold` function"
Fixed up the uar test that was failing. It seems with the new `cold` attribute the order of the functions is different. As far as I can tell this is not a concern. Closes llvm#105559
Commit 6b11573
[IR] Simplify comparisons with std::optional (NFC) (llvm#105624)
For variable X of type std::optional, X && X.value_or(Y) == Z is equivalent to X == Z when Y != Z.
Commit b2cd81c
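A small check of the equivalence stated above, using `std::optional<int>` and hypothetical function names. Comparing an optional directly with a value is true only when the optional is engaged and holds that value, so an empty optional compares unequal to any `Z`.

```cpp
#include <cassert>
#include <optional>

// Long form from the commit message: test for a value, then compare.
constexpr bool longForm(std::optional<int> X, int Y, int Z) {
  return X && X.value_or(Y) == Z;
}

// Variant without the explicit test; it matches only when Y != Z, because
// an empty optional then falls back to Y and compares unequal to Z.
constexpr bool valueOrForm(std::optional<int> X, int Y, int Z) {
  return X.value_or(Y) == Z;
}

// Simplified form: compare the optional with the value directly.
constexpr bool shortForm(std::optional<int> X, int Z) { return X == Z; }

static_assert(longForm(1, 0, 1) == shortForm(1, 1));
static_assert(longForm(std::nullopt, 0, 1) == shortForm(std::nullopt, 1));
```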
[MCA][X86] Add scatter instruction test coverage for llvm#105675
Missed IceLakeServer when I updated the other CPUs in 6ec4c9c
Commit 7faf2c9
[MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data
This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should Fixes llvm#105675
Commit 2c1f064
Commit c5a0c37
[VPlan] Move EVL memory recipes to VPlanRecipes.cpp (NFC)
Move VPWiden[Load|Store]EVLRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp
Commit 1fa6c99
[libc++] Fix transform_error.mandates.verify.cpp test on msvc (llvm#104635)
PR llvm#102851 marks reference types in union as error on msvc by changing clang, so 'transform_error.mandates.verify.cpp' no longer fails on msvc with ToT clang. However, the libcxx buildbots do not build clang from source, so this test will still fail on these bots, which is incorrect. This patch changed the expected error message of this test so it can pass with both release branch clang and ToT clang.
Commit e31322b
[libc] Add `ctype.h` locale variants (llvm#102711)
Summary: This patch adds all the libc ctype variants. These ignore the locale information completely, so they're pretty much just stubs. Because these use locale information, which is system scope, we do not enable building them outside of full build mode.
Commit 8f005f8
Revert "[libc] Add `ctype.h` locale variants (llvm#102711)"
This reverts commit 8f005f8.
Commit 2f4232d
[libc] Initial support for 'locale.h' in the LLVM libc (llvm#102689)
Summary: This patch adds the macros and entrypoints associated with the `locale.h` entrypoints. These are mostly stubs, as we (for now and the foreseeable future) only expect to support the C and maybe C.UTF-8 locales in the LLVM libc.
Commit 78d8ab2
Commit f3a47b9
Commit 518b1f0
[HLSL][SPIRV]Add SPIRV generation for HLSL dot (llvm#104656)
This adds the SPIRV fdot, sdot, and udot intrinsics and allows them to be created at codegen depending on the target architecture. This required moving some of the DXIL-specific choices to DXIL instruction expansion out of codegen and providing it with at a more generic fdot intrinsic as well. Removed some stale comments that gave the obsolete impression that type conversions should be expected to match overloads. The SPIRV intrinsic handling involves generating multiply and add operations for integers and the existing OpDot operation for floating point. New tests for generating SPIRV float and integer dot intrinsics are added as well as expanding HLSL tests to include SPIRV generation Used new dot product intrinsic generation to implement normalize() in SPIRV Incidentally changed existing dot intrinsic definitions to use DefaultAttrsIntrinsic to match the newly added inrinsics Fixes llvm#88056
Commit 319c7a4
Fix dap stacktrace perf issue (llvm#104874)
We have received several customer reports of slow stepping in VSCode over the past year. Profiling shows that the slow stepping is caused by the `stackTrace` request, which can take around 1 second for certain targets. Since VSCode sends `stackTrace` on each stop event, a slow `stackTrace` request slows down stepping in VSCode. Below is the hot path:
```
|--68.75%--lldb_dap::DAP::HandleObject(llvm::json::Object const&)
|          |
|          |--57.70%--(anonymous namespace)::request_stackTrace(llvm::json::Object const&)
|          |          |
|          |          |--54.43%--lldb::SBThread::GetCurrentExceptionBacktrace()
|          |          |          lldb_private::Thread::GetCurrentExceptionBacktrace()
|          |          |          lldb_private::Thread::GetCurrentException()
|          |          |          lldb_private::ItaniumABILanguageRuntime::GetExceptionObjectForThread(std::shared_ptr<lldb_private::Thread>)
|          |          |          |
|          |          |          |--53.43%--lldb_private::FunctionCaller::ExecuteFunction(lldb_private::ExecutionContext&, unsigned long*, lldb_private::EvaluateExpressionOptions const&, lldb_private::DiagnosticManager&, lldb_private::Value&)
|          |          |          |          |
|          |          |          |          |--25.23%--lldb_private::FunctionCaller::InsertFunction(lldb_private::ExecutionContext&, unsigned long&, lldb_private::DiagnosticManager&)
|          |          |          |          |          |
|          |          |          |          |          |--24.56%--lldb_private::FunctionCaller::WriteFunctionWrapper(lldb_private::ExecutionContext&, lldb_private::DiagnosticManager&)
|          |          |          |          |          |          |
|          |          |          |          |          |          |--19.73%--lldb_private::ExpressionParser::PrepareForExecution(unsigned long&, unsigned long&, std::shared_ptr<lldb_private::IRExecutionUnit>&, lldb_private::ExecutionContext&, bool&, lldb_private::ExecutionPolicy)
|          |          |          |          |          |          |          lldb_private::ClangExpressionParser::DoPrepareForExecution(unsigned long&, unsigned long&, std::shared_ptr<lldb_private::IRExecutionUnit>&, lldb_private::ExecutionContext&, bool&, lldb_private::ExecutionPolicy)
|          |          |          |          |          |          |          lldb_private::IRExecutionUnit::GetRunnableInfo(lldb_private::Status&, unsigned long&, unsigned long&)
|          |          |          |          |          |          |          |
```
The hot path was added by https://reviews.llvm.org/D156465, which should at least be disabled for Linux. Note: I am seeing a similar performance hot path on Mac. This PR hides the feature behind the `enableDisplayExtendedBacktrace` option, which needs to be enabled on demand.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Commit e5140ae
[AMDGPU] Correctly insert s_nops for dst forwarding hazard (llvm#100276)
MI300 ISA section 4.5 states there is a hazard between "VALU op which uses OPSEL or SDWA with changes the result’s bit position" and "VALU op consumes result of that op". This includes the case where the second op is SDWA with the same dest and dst_sel != DWORD && dst_unused == UNUSED_PRESERVE. In this case, there is an implicit read of the first op's dst, and the compiler needs to resolve this hazard. Confirmed with HW team. We model dst_unused == UNUSED_PRESERVE as a tied-def of an implicit operand, so this PR checks for that. MI300_SP_MAS section 1.3.9.2 specifies that CVT_SR_FP8_F32 and CVT_SR_BF8_F32 with opsel[3:2] != 0 have a dest forwarding issue. Currently, we only add a check for CVT_SR_FP8_F32 with opsel[3] != 0; this PR adds support for opsel[2] != 0 as well.
Commit 7bcf4d6
Commit a7c8f41
[libc] Add `ctype.h` locale variants (llvm#102711)
Summary: This patch adds all the libc ctype variants. These ignore the locale information completely, so they're pretty much just stubs. Because these use locale information, which is system scope, we do not enable building them outside of full build mode.
Commit 856dadb
Commit c2a96a2
Revert "[MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data" (llvm#105716)
This reverts commit 2c1f064. Many build failures in CodeGen/X86/scatter-schedule.ll. Example of a build failure: https://lab.llvm.org/buildbot/#/builders/155/builds/1675
Commit e738c81
[LTO] Introduce helper functions to add GUIDs to ImportList (NFC) (llvm#105555)
The new helper functions make the intent clearer while hiding implementation details, including how we handle previously added entries. Note that:
- If we are adding a GUID as a GlobalValueSummary::Definition, then we override a previously added GlobalValueSummary::Declaration entry for the same GUID.
- If we are adding a GUID as a GlobalValueSummary::Declaration, then a previously added GlobalValueSummary::Definition entry for the same GUID takes precedence, and no change is made.
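The precedence rule described in this commit message can be illustrated with a small Python model. This is only a sketch of the stated semantics, not the LLVM implementation: the class `ImportList`, the enum `ImportKind`, and the method names below are hypothetical stand-ins chosen for the example.

```python
from enum import Enum


class ImportKind(Enum):
    DECLARATION = 0
    DEFINITION = 1


class ImportList:
    """Toy model: maps a GUID to the kind of import recorded for it."""

    def __init__(self):
        self._kinds = {}

    def add_definition(self, guid):
        # A definition always wins, replacing any earlier declaration entry.
        self._kinds[guid] = ImportKind.DEFINITION

    def maybe_add_declaration(self, guid):
        # A declaration is recorded only if the GUID has no entry yet, so an
        # existing definition takes precedence and nothing changes.
        self._kinds.setdefault(guid, ImportKind.DECLARATION)

    def kind(self, guid):
        return self._kinds.get(guid)


imports = ImportList()
imports.maybe_add_declaration(0x1234)  # recorded as a declaration
imports.add_definition(0x1234)         # upgraded to a definition
imports.maybe_add_declaration(0x1234)  # no effect
```

Keeping both rules inside the two methods means callers never need to inspect the previous entry themselves, which is the kind of hiding the commit message describes.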
Commit 3082a38
AMDGPU: Remove global/flat atomic fadd intrinsics (llvm#97051)
These have been replaced with atomicrmw.
Commit ee08d9c
[VPlan] Factor out precomputing costs from LVP::cost (NFC).
Move the logic for pre-computing costs of certain instructions to a separate helper function, allowing re-use in a follow-up patch.
Commit e454d31
[LLD][COFF] Generate X64 thunks for ARM64EC entry points and patchable functions. (llvm#105499)
This implements Fast-Forward Sequences documented in ARM64EC ABI https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi. There are two conditions when the linker should generate such thunks:
- For each exported ARM64EC function. It applies only to ARM64EC functions (we may also have pure x64 functions, for which no thunk is needed). MSVC linker creates an `EXP+<mangled export name>` symbol in those cases that points to the thunk and uses that symbol for the export. It's observable from the module: it's possible to reference such symbols as I did in the test. Note that it uses the export name, not the name of the symbol that's exported (as in `foo` in `/EXPORT:foo=bar`). This implies that if the same function is exported multiple times, it will have multiple thunks. I followed this MSVC behavior.
- For hybrid_patchable functions. The linker tries to generate a thunk for each undefined `EXP+*` symbol (and such symbols are created by the compiler as a target of weak alias from the demangled name). MSVC linker tries to find the corresponding `*$hp_target` symbol and, if it fails to do so, outputs a cryptic error like `LINK : fatal error LNK1000: Internal error during IMAGE::BuildImage`. I just skip generating the thunk in such a case (which causes an undefined reference error). MSVC linker additionally checks that the symbol complex type is a function (see also llvm#102898). We generally don't do such checks in LLD, so I made it less strict. It should be fine: if it's some data symbol, it will not have a `$hp_target` symbol, so we will skip it anyway.
Commit a2d8743
[VPlan] Don't trigger VF assertion if VPlan has extra simplifications.
There are cases where VPlans contain some simplifications that are very hard to accurately account for up-front in the legacy cost model. Those cases are caused by un-simplified inputs, which trigger the assert ensuring both the legacy and VPlan-based cost model agree on the VF. To avoid false positives due to missed simplifications in general, only trigger the assert if the chosen VPlan doesn't contain any additional simplifications. Fixes llvm#104714. Fixes llvm#105713.
Commit cb4efe1
Commit 768dba7
[libunwind] Stop installing the mach-o module map (llvm#105616)
libunwind shouldn't know that compact_unwind_encoding.h is part of a MachO module that it doesn't own. Delete the mach-o module map, and let whatever is in charge of the mach-o directory be the one to say how its module is organized and where compact_unwind_encoding.h fits in.
Commit 172c4a4
[clang][rtsan] Introduce realtime sanitizer codegen and driver (llvm#102622)
Introduce the `-fsanitize=realtime` flag in the clang driver. Plug in the RealtimeSanitizer PassManager pass in Codegen, and attribute a function based on whether it has the `[[clang::nonblocking]]` function effect.
Commit d010ec6
[Clang] [Parser] Improve diagnostic for `friend concept` (llvm#105121)
Diagnose this early after parsing declaration specifiers; this allows us to issue a better diagnostic. This also checks for `concept friend` and concept declarations w/o a template-head because it’s easiest to do that at the same time. Fixes llvm#45182.
Commit 8b5f606
[compiler-rt][test] Change tests to remove the use of `unset` command in lit internal shell (llvm#104880)
This patch rewrites tests to remove the use of the `unset` command, which is not supported in the lit internal shell. The tests now use `env -u` to unset environment variables. The `unset` command is used in shell environments to remove an environment variable. However, because the lit internal shell does not support the `unset` command, using it in tests would result in errors or other unexpected behavior. To overcome this limitation, the tests have been updated to use the `env -u` command instead. `env -u` is supported by lit and effectively removes the specified environment variables. This allows the tests to achieve the same goal of unsetting environment variables while ensuring compatibility with the lit internal shell. This change is relevant for [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179/3) Fixes: llvm#102397
Commit 42d06b8
[mlir][SCF]-Fix loop coalescing with iteration arguments (llvm#105488)
Fix a bug found when coalescing loops which have iteration arguments: the inner loop's terminator may have operands that are inner-loop iteration arguments, which are about to be replaced by the outer loop's iteration arguments. The current flow leads to a crash within the IR code.
Commit d7fc779
[NFC][ADT] Add reverse iterators and `value_type` to StringRef (llvm#105579)
- Add reverse iterators and `value_type` to StringRef.
- Add unit test for all 4 iterator flavors.
- This prepares StringRef to be used with `SequenceToOffsetTable`.
Commit 911e246
Revert "[clang][rtsan] Introduce realtime sanitizer codegen and driver (llvm#102622)" (llvm#105744)
This reverts commit d010ec6. Build failure: https://lab.llvm.org/buildbot/#/builders/159/builds/4466
Commit a1e9b7e
This patch fixes: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7245:1: error: unused function 'planContainsAdditionalSimplifications' [-Werror,-Wunused-function]
Commit 4e6ff75
[LTO] Use a helper function to add a definition (NFC) (llvm#105721)
I missed this one when I introduced helper functions in: commit 3082a38 Author: Kazu Hirata <kazu@google.com> Date: Thu Aug 22 12:06:47 2024 -0700
Commit ca48b01
[RISCV][TTI] Use legalized element types when costing casts (llvm#105723)
This fixes a crash introduced by my ac6e1fd. I had failed to consider the case where a vector is truncated to an illegal element type. The resulting intermediate VT wasn't an MVT and we'd fail an assertion. Surprisingly, SLP does query illegal element types in some cases.
Commit 424b87b
[SandboxIR] Implement CatchReturnInst (llvm#105605)
This patch implements sandboxir::CatchReturnInst mirroring llvm::CatchReturnInst.
Commit 0d21c2b
Revert "[clang] Merge lifetimebound and GSL code paths for lifetime analysis (llvm#104906)" (llvm#105752)
Revert as it breaks libc++ tests, see llvm#104906. This reverts commit c368a72.
Commit 1df1504
[clang][NFC] order C++ standards in reverse in release notes (llvm#104866)
Noticed that the release notes currently have a weird order: C++17, C++14(!), C++20, C++23, C++2c. Reorder them in reverse chronological order, which also matches the [status page](https://clang.llvm.org/cxx_status.html).
Commit ecfceb8
Commits on Aug 23, 2024
-
[ScalarizeMaskedMemIntr] Don't use a scalar mask on GPUs (llvm#104842)
ScalarizeMaskedMemIntr contains an optimization where the <N x i1> mask is bitcast into an iN and then bit-tests with powers of two are used to determine whether to load/store/... or not. However, on machines with branch divergence (mainly GPUs), this is a mis-optimization, since each i1 in the mask will be stored in a condition register - that is, each of these "i1"s is likely to be a word or two wide, making these bit operations counterproductive. Therefore, amend this pass to skip the optimization on targets that it pessimizes. Pre-commit tests llvm#104645
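To make the two strategies concrete, here is a small Python model of a scalarized masked load. It is a sketch only: the function names are made up for illustration, and placing lane 0 in bit 0 of the packed mask is an assumption of this example rather than something the commit message specifies.

```python
def pack_mask(mask):
    """Model the bitcast of an <N x i1> mask into one integer (iN)."""
    packed = 0
    for lane, bit in enumerate(mask):
        if bit:
            packed |= 1 << lane  # assumption: lane 0 lands in bit 0
    return packed


def masked_load_scalar_mask(memory, mask, passthru):
    """Scalarized load that tests the packed mask against powers of two."""
    packed = pack_mask(mask)
    result = list(passthru)
    for lane in range(len(mask)):
        if packed & (1 << lane):
            result[lane] = memory[lane]
    return result


def masked_load_per_lane(memory, mask, passthru):
    """Scalarized load that checks each mask element directly."""
    return [memory[i] if mask[i] else passthru[i] for i in range(len(mask))]


memory = [10, 20, 30, 40]
mask = [True, False, True, False]
passthru = [0, 0, 0, 0]
```

Both variants compute the same result. The commit message's point is about cost: when each mask element already lives in its own condition register, the packing step in the first variant is extra work rather than a saving.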
Commit 25d976b
[llvm][NVPTX] Fix quadratic runtime in ProxyRegErasure (llvm#105730)
This pass performs RAUW (replace all uses with) by walking the machine function once for each RAUW operation. For large functions, the runtime of this pass starts to blow up. Linearize the pass by batching all the RAUW operations into a single walk.
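The difference between the two approaches can be sketched in Python, modelling a function as a list of instructions and each instruction as a list of operand names. The helper names are hypothetical, and the sketch assumes that no replacement target is itself scheduled for replacement; otherwise the two versions could diverge.

```python
def rauw_one_at_a_time(instructions, replacements):
    """One full walk per replacement: O(len(replacements) * len(instructions))."""
    for old, new in replacements:
        for inst in instructions:
            inst[:] = [new if op == old else op for op in inst]


def rauw_batched(instructions, replacements):
    """Collect all replacements first, then rewrite operands in a single walk."""
    mapping = dict(replacements)
    for inst in instructions:
        inst[:] = [mapping.get(op, op) for op in inst]


replacements = [("r1", "a"), ("r3", "b")]
slow = [["r1", "r2"], ["r2", "r3"], ["r1", "r3"]]
fast = [list(inst) for inst in slow]

rauw_one_at_a_time(slow, replacements)
rauw_batched(fast, replacements)
```

With a dictionary lookup per operand, the batched version visits every instruction once regardless of how many replacements are pending.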
Commit 08e5a1d
[bazel] Move lldb-dap cc_binary to lldb/BUILD.bazel (llvm#105733)
On linux lldb-dap uses the location of the lldb-dap binary to search for lldb-server. Previously these were produced in different directories corresponding to the BUILD file paths. It's not ideal that the BUILD file location matters for the binary at runtime but it doesn't hurt to have this tool here too like the others.
Commit be8ee09
[mlir][tensor] Add consumer fusion for `tensor.pack` op. (llvm#103715)
Add missing `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile` to `tensor.pack` and enable fusing it as a consumer. Note that it currently only expects a perfect tiling scenario without padding semantics.
Commit f06563a
[NFC][TableGen] Emit more readable builtin string table (llvm#105445)
- Add `EmitStringLiteralDef` to StringToOffsetTable class to emit more readable string table.
- Use that in `EmitIntrinsicToBuiltinMap`.
Commit 381405f
[AMDGPU] Refactor code for GETPC bundle updates in hazards (NFCI)
As suggested in review for PR llvm#100067. Refactor code for S_GETPC_B64 bundle updates for use with multiple hazard mitigations.
Commit 987ffc3
[clang-format] Don't insert a space between :: and * (llvm#105043)
Also, don't insert a space after ::* for method pointers. See llvm#86253 (comment). Fixes llvm#100841.
Commit 714033a
Revert "[Vectorize] Fix warnings" (llvm#105771)
Triggers assert in compiler https://lab.llvm.org/buildbot/#/builders/51/builds/2836
```
Instructions.cpp:1700: llvm::ShuffleVectorInst::ShuffleVectorInst(Value *, Value *, ArrayRef<int>, const Twine &, InsertPosition): Assertion `isValidOperands(V1, V2, Mask) && "Invalid shuffle vector instruction operands!"' failed.
```
This reverts commit a625435.
Commit 1519451
[SPIRV] Emitting DebugSource, DebugCompileUnit (llvm#97558)
This commit introduces emission of DebugSource and DebugCompileUnit from NonSemantic.Shader.DebugInfo.100, along with the required OpString holding the filename. NonSemantic.Shader.DebugInfo.100 is divided, following DWARF, into two main concepts: emitting DIEs and emitting Line information. In DWARF, the .debug_abbrev and .debug_info sections are responsible for emitting a tree with information (DIEs) about, e.g., types and the compilation unit. Correspondingly, NonSemantic.Shader.DebugInfo.100 has instructions like DebugSource, DebugCompileUnit, etc., which perform the same role in a SPIR-V file. The difference is that in SPIR-V there are no sections but a logical layout, which forces the order of instruction emission. NonSemantic.Shader.DebugInfo.100 requires this type of global information to be emitted after the OpTypeXXX and OpConstantXXX instructions. One of the goals was to minimize changes to and interaction with SPIRVModuleAnalysis as much as possible, which the current commit achieves by emitting its instructions directly into the MachineFunction. The possibility of duplicates is mitigated by a guard inside the pass, which emits the global information only once, in one function. With that method, duplicates have no chance to be emitted. From this point, adding new debug global instructions should be straightforward.
Commit 62da359
[ORC] Add an identifier-override argument to loadRelocatableObject and friends.
API clients may want to use things other than paths as the buffer identifiers. No testcase -- I haven't thought of a good way to expose this via the regression testing tools. rdar://133536831
Commit e15abb7
Reland "[Vectorize] Fix warnings"" (llvm#105772)
The revert was wrong; the bot is still broken: https://lab.llvm.org/buildbot/#/builders/51/builds/2838 Reverts llvm#105771
Commit 351f4a5
[Scalar] Remove an unused variable (llvm#105767)
The last use was removed by: commit 89fe570 Author: Philip Reames <listmail@philipreames.com> Date: Tue May 12 23:39:23 2015 +0000
Commit fdaaa87
[clang-format] Change BinPackParameters to enum and add AlwaysOnePerLine (llvm#101882)
Related issues that have requested this feature: llvm#51833 llvm#23796 llvm#53190 Partially solves - this issue's request is for both arguments and parameters
Commit 7c3237d
[LTO] Turn ImportMapTy into a proper class (NFC) (llvm#105748)
This patch turns type alias ImportMapTy into a proper class to provide a more intuitive interface like `ImportList.addDefinition(...)` as opposed to `FunctionImporter::addDefinition(ImportList, ...)`. Also, this patch requires all non-const accesses to go through addDefinition, maybeAddDeclaration, and addGUID while providing const accesses via `const ImportMapTyImpl &getImportMap() const { return ImportMap; }`. I realize ImportMapTy may not be the best name as a class (maybe OK as a type alias). I am not renaming ImportMapTy in this patch at least because there are 47 mentions of ImportMapTy under llvm/.
Commit 3563907
Revert "[SLP]Improve/fix subvectors in gather/buildvector nodes handling" (llvm#105780)
with "[Vectorize] Fix warnings"
It introduced compiler crashes, see llvm#104144. This reverts commit 69332bb and 351f4a5.
Commit 96b3166
[memref] Handle edge case in subview of full static size fold (llvm#105635)
It is possible to have a subview with a fully static size and a type that matches the source type, but a dynamic offset that may be different. However, currently the memref dialect folds:
```mlir
func.func @subview_of_static_full_size(
    %arg0: memref<16x4xf32, strided<[4, 1], offset: ?>>, %idx: index)
    -> memref<16x4xf32, strided<[4, 1], offset: ?>> {
  %0 = memref.subview %arg0[%idx, 0][16, 4][1, 1]
      : memref<16x4xf32, strided<[4, 1], offset: ?>>
        to memref<16x4xf32, strided<[4, 1], offset: ?>>
  return %0 : memref<16x4xf32, strided<[4, 1], offset: ?>>
}
```
To:
```mlir
func.func @subview_of_static_full_size(
    %arg0: memref<16x4xf32, strided<[4, 1], offset: ?>>, %arg1: index)
    -> memref<16x4xf32, strided<[4, 1], offset: ?>> {
  return %arg0 : memref<16x4xf32, strided<[4, 1], offset: ?>>
}
```
Which drops the dynamic offset from the `subview` op.
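The edge case can be modelled with a small Python predicate. This is a hedged sketch, not the actual fold in the memref dialect: the function name and the use of `None` to stand for a dynamic (`?`) value are assumptions made for this example. It only illustrates why matching sizes and types are not enough to treat a subview as a no-op.

```python
DYNAMIC = None  # hypothetical stand-in for a dynamic ("?") offset, size, or stride


def is_full_size_noop_subview(source_shape, offsets, sizes, strides):
    """Return True only when the subview provably covers the source unchanged.

    A dynamic offset compares unequal to 0, so it blocks the fold even when
    the static sizes match the source shape exactly.
    """
    return (
        all(offset == 0 for offset in offsets)
        and list(sizes) == list(source_shape)
        and all(stride == 1 for stride in strides)
    )


# The subview from the commit message: offsets [%idx, 0], sizes [16, 4], strides [1, 1].
dynamic_offset_case = is_full_size_noop_subview([16, 4], [DYNAMIC, 0], [16, 4], [1, 1])
# The same subview with all-zero static offsets really is the identity.
static_zero_case = is_full_size_noop_subview([16, 4], [0, 0], [16, 4], [1, 1])
```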
Commit 84aa02d
[MIPS] Optimize sortRelocs for o32
The o32 ABI specifies:
> Each relocation type of R_MIPS_HI16 must have an associated R_MIPS_LO16 entry immediately following it in the list of relocations. [...] the addend AHL is computed as (AHI << 16) + (short)ALO

In practice, the high-part and low-part relocations may not be adjacent in assembly files, requiring the assembler to reorder relocations. http://reviews.llvm.org/D19718 performed the reordering, but did not optimize for the common case where a %lo immediately follows its matching %hi. The quadratic time complexity could make sections with many relocations very slow to process. This patch implements the fast path, simplifies the code, and makes the behavior more similar to GNU assembler (for the .rel.mips_hilo_8b test). We also remove `OriginalSymbol`, removing overhead for other targets.
Fix llvm#104562
Pull Request: llvm#104723
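The addend formula quoted from the ABI, (AHI << 16) + (short)ALO, depends on the low half being sign-extended, which is why a HI16 relocation cannot be resolved without its matching LO16. A short Python sketch of the arithmetic follows; the function names are illustrative and not taken from LLVM.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed short."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def combined_addend(ahi, alo):
    """Compute AHL = (AHI << 16) + (short)ALO."""
    return (ahi << 16) + sign_extend_16(alo)


positive_low = combined_addend(0x1234, 0x5678)  # low half is non-negative
negative_low = combined_addend(0x0001, 0x8000)  # low half sign-extends to -0x8000
```

In the second case the low half subtracts from the shifted high half, so the combined addend is 0x8000 rather than 0x18000.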
Commit 59721f2
Commit a69ba0a
[NFCI] [C++20] [Modules] Relax the case for duplicated declaration in multiple module units for explicit specialization
Relax the case for duplicated declaration in multiple module units for explicit specialization and refactor the implementation of checkMultipleDefinitionInNamedModules a little bit. This is intended to not affect any end users since it only relaxes the condition to emit an error.
Commit e5f196e
[NFCI] [Serialization] Use demoteThisDefinitionToDeclaration instead of setCompleteDefinition(false) for CXXRecordDecl
When we merge the definition for a CXXRecordDecl, we would use setCompleteDefinition(false) to mark the merged definition. But this was not the correct interface: afterwards we can't know that the merged definition was a definition. We actually provide an interface for this: demoteThisDefinitionToDeclaration. So this patch uses the correct API. This was found during downstream development. This is not strictly NFC, but it is intended to be NFC for all end users.
Commit 39986f0
Commit 85b6aac
Commit 28133d9
Commit f53bfa3
[AMDGPU] Simplify use of hasMovrel and hasVGPRIndexMode (llvm#105680)
The generic subtarget has neither of these features. Rather than forcing HasMovrel on, it is simpler to expand dynamic vector indexing to a sequence of compare/select instructions. NFC for real subtargets.
Commit b02b5b7
[Matrix] Preserve signedness when extending matrix index expression. (llvm#103044)
As per [1] the indices for a matrix element access operator shall have integral or unscoped enumeration types and be non-negative. At the moment, the index expression is converted to SizeType irrespective of the signedness of the index expression. This causes implicit sign conversion warnings if any of the indices is signed. As per the spec, using signed types as indices is allowed and should not cause any warnings. If the index expression is signed, extend to SignedSizeType to avoid the warning.
[1] https://clang.llvm.org/docs/MatrixTypes.html#matrix-type-element-access-operator
PR: llvm#103044
Commit 96509bb
[AMDGPU] Remove one case of vmcnt loop header flushing for GFX12 (llvm#105550)
When a loop contains a VMEM load whose result is only used outside the loop, do not bother to flush vmcnt in the loop head on GFX12. A wait for vmcnt will be required inside the loop anyway, because VMEM instructions can write their VGPR results out of order.
Commit fa2dccb
[MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data (REAPPLIED)
This doesn't match uops.info yet - but it matches the existing vpscatterdq/vscatterqpd entries like uops.info says it should. Reapplied with codegen fix for scatter-schedule.ll. Fixes llvm#105675
Commit cf6cd1f
[C++20] [Modules] Warn for duplicated decls in multiple module units (llvm#105799)
It is a long-standing issue that duplicated declarations in multiple module units slow down compilation. There are many questions and issue reports about it, so I think it is better to add a warning for it. Given that this does not arise because the users' code violates the language specification or any best practices, the warning is disabled by default even if `-Wall` is specified. Users need to enable the warning explicitly or use `-Weverything`. The documentation will be added separately.
Commit 3cca522
Commit c8ba317
[AArch64] Scalarize i128 add/sub/mul/and/or/xor vectors
This mirrors what we do for SDAG, scalarizing i128 vectors with add/sub/mul/and/or/xor operators.
Commit 646478f
[clang][bytecode][NFC] Remove containsErrors() check from delegate (llvm#105804)
This check was removed a while ago from visit(), remove it from delegate() as well.
Commit 38b8e54
[clang][bytecode] Reject void InitListExpr differently (llvm#105802)
Commit 7b4b85b
[ORC] Expose a non-destructive check-macho-buffer overload.
This allows clients to check buffers that they don't own. rdar://133536831
Commit 4a12722
Commit cbf34a5
Commit 2b4b909
Commit 2f144ac
[flang][NFC] turn fir.call is_bind_c into enum for procedure flags (llvm#105691)
First patch to fix a BIND(C) ABI issue (llvm#102113). I need to keep track of BIND(C) in more locations (fir.dispatch and func.func operations), and I need to fix a few passes that are dropping the attribute on the floor. Since I expect more procedure attributes that cannot be reflected in mlir::FunctionType will be needed for ABI, optimizations, or debug info, this NFC patch adds a new enum attribute to keep track of procedure attributes in the IR. This patch is not updating lowering to lower more attributes, this will be done in a separate patch to keep the test changes low here. Adding the attribute on fir.dispatch and func.func will also be done in separate patches.
Configuration menu - View commit details
-
Copy full SHA for 2051a7b - Browse repository at this point
Copy the full SHA 2051a7bView commit details -
[NFC][TableGen] Refactor StringToOffsetTable (llvm#105655)
- Make `EmitString` const by not mutating `AggregateString`. - Use C++17 structured bindings in `GetOrAddStringOffset`. - Use StringExtras version of isDigit instead of std::isdigit.
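For readers unfamiliar with the data structure being refactored, the idea of a string-to-offset table can be sketched as follows. This is a simplified Python model, not the TableGen implementation: the class and method names only mirror the ones in the commit message, and the NUL-terminated layout is an assumption made for illustration.

```python
class StringToOffsetTable:
    """Toy model: interns strings into one NUL-separated blob and hands out offsets."""

    def __init__(self):
        self._offsets = {}          # string -> offset of its first byte in the blob
        self._blob = bytearray()    # all interned strings, each followed by a NUL

    def get_or_add_string_offset(self, text: str) -> int:
        # A string that was interned before keeps its original offset.
        if text in self._offsets:
            return self._offsets[text]
        offset = len(self._blob)
        self._offsets[text] = offset
        self._blob += text.encode("utf-8") + b"\0"
        return offset

    def blob(self) -> bytes:
        # Read-only view of the aggregate; emitting it does not mutate the table.
        return bytes(self._blob)


table = StringToOffsetTable()
first = table.get_or_add_string_offset("add")
second = table.get_or_add_string_offset("sub")
again = table.get_or_add_string_offset("add")   # same offset as `first`
```

Keeping the emit step read-only is what lets a method like `EmitString` be `const` in the refactored class.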
Configuration menu - View commit details
-
Copy full SHA for 04ab647 - Browse repository at this point
Copy the full SHA 04ab647View commit details -
This patch fixes: clang/lib/Serialization/ASTReader.cpp:9978:27: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
Configuration menu - View commit details
-
Copy full SHA for 1e3dc8c - Browse repository at this point
Copy the full SHA 1e3dc8cView commit details -
Configuration menu - View commit details
-
Copy full SHA for 0d1d95e - Browse repository at this point
Copy the full SHA 0d1d95eView commit details -
Configuration menu - View commit details
-
Copy full SHA for 5def27c - Browse repository at this point
Copy the full SHA 5def27cView commit details -
[RISCV] Let -data-sections also work on sbss/sdata sections (llvm#87040)
Add a unique suffix to .sbss/.sdata when -fdata-sections is used. Without a unique .sbss/.sdata section for each symbol, a linker may not be able to remove the unused parts during section garbage collection, since used and unused symbols are all mixed in the same .sbss/.sdata section. I believe this also matches the behavior of gcc.
Configuration menu - View commit details
-
Copy full SHA for 4d348f7 - Browse repository at this point
Copy the full SHA 4d348f7View commit details -
[mlir][mem2reg] Fix Mem2Reg attempting to promote in graph regions (l…
…lvm#104910) Mem2Reg assumes SSA dependencies but did not check for graph regions. This fixes it. --------- Co-authored-by: Christian Ulmann <christianulmann@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for b084111 - Browse repository at this point
Copy the full SHA b084111View commit details -
Configuration menu - View commit details
-
Copy full SHA for 2617023 - Browse repository at this point
Copy the full SHA 2617023View commit details -
[NFC] Use stable_hash_combine instead of hash_combine (llvm#105619)
I found the current stable hash is not deterministic across multiple runs on a specific platform. This is because it uses `hash_combine` instead of `stable_hash_combine`.
Configuration menu - View commit details
-
Copy full SHA for c9b6339 - Browse repository at this point
Copy the full SHA c9b6339View commit details -
[AMDGPU] Improve uniform argument handling in InstCombineIntrinsic (l…
…lvm#105812) Common up handling of intrinsics that are a no-op on uniform arguments. This catches a couple of new cases: readlane (readlane x, y), z -> readlane x, y (for any z, does not have to equal y). permlane64 (readfirstlane x) -> readfirstlane x (and likewise for any other uniform argument to permlane64).
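The two new folds can be checked against a toy lane model. The sketch below is an assumption-laden simplification: it models a 64-lane wave as a Python list with every lane active, treats `readlane`/`readfirstlane` as broadcasting one lane's value, and treats `permlane64` as swapping the two 32-lane halves. It is not the InstCombine code.

```python
WAVE_SIZE = 64  # assumed wave64 with every lane active


def readlane(values, lane):
    """Broadcast the value held by `lane`; the result is uniform across the wave."""
    return [values[lane]] * WAVE_SIZE


def readfirstlane(values):
    """Broadcast the value of the first active lane (lane 0 in this model)."""
    return [values[0]] * WAVE_SIZE


def permlane64(values):
    """Swap the lower and upper 32-lane halves of the wave."""
    half = WAVE_SIZE // 2
    return values[half:] + values[:half]


x = [lane * 3 + 1 for lane in range(WAVE_SIZE)]   # a divergent input

# readlane (readlane x, y), z -> readlane x, y, for any z
inner = readlane(x, 5)
outer = readlane(inner, 17)

# permlane64 (readfirstlane x) -> readfirstlane x
uniform = readfirstlane(x)
permuted = permlane64(uniform)
```

Both folds rest on the same observation: once a value is uniform, any intrinsic that only moves data between lanes returns it unchanged.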
Configuration menu - View commit details
-
Copy full SHA for f142f8a - Browse repository at this point
Copy the full SHA f142f8aView commit details -
[SLP]Improve/fix subvectors in gather/buildvector nodes handling
SLP vectorizer has an estimation for gather/buildvector nodes, which contain some scalar loads. SLP vectorizer performs a pretty similar (but large in SLOCs) estimation, which is not always correct. Instead, this patch implements clustering analysis and actual node allocation with the full analysis for the vectorized clustered scalars (not only loads, but also some other instructions) with the correct cost estimation and vector insert instructions. Improves overall vectorization quality and simplifies analysis/estimations. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#104144
Configuration menu - View commit details
-
Copy full SHA for f3d2609 - Browse repository at this point
Copy the full SHA f3d2609View commit details -
[RISCV][MC] Name the vector tuple registers. NFC (llvm#102726)
Currently vector tuple registers don't have the specified names, the default name is, for example: `VRN3M2` -> `V8M2_V10M2_V12M2`, however it's equivalent to `v8` in the assembly.
Configuration menu - View commit details
-
Copy full SHA for 002ba17 - Browse repository at this point
Copy the full SHA 002ba17View commit details -
Revert "[clang] Increase the default expression nesting limit (llvm#1…
Configuration menu - View commit details
-
Copy full SHA for e3ce979 - Browse repository at this point
Copy the full SHA e3ce979View commit details -
Configuration menu - View commit details
-
Copy full SHA for 67a9093 - Browse repository at this point
Copy the full SHA 67a9093View commit details -
[SLP]Fix a crash for the strided nodes with reversed order and extern…
…ally used pointer. If the strided node is reversed, need to check for the last instruction, not the first one in the list of scalars, when checking if the root pointer must be extracted.
Configuration menu - View commit details
-
Copy full SHA for dab19da - Browse repository at this point
Copy the full SHA dab19daView commit details -
Revert "[RISCV] Add isel optimization for (and (sra y, c2), c1) to re…
…cover regression from llvm#101751. (llvm#104114)" This caused an assert to fire: llvm/include/llvm/Support/Casting.h:566: decltype(auto) llvm::cast(const From &) [To = llvm::ConstantSDNode, From = llvm::SDValue]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed. see comment on the PR. > If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If > c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4) > followed by a SHXADD with c4 as the X amount. > > Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4). > Alive2: https://alive2.llvm.org/ce/z/AwhheR This reverts commit 5144817.
Configuration menu - View commit details
-
Copy full SHA for 858afe9 - Browse repository at this point
Copy the full SHA 858afe9View commit details -
[PS5][clang][test] x86_64-scei-ps5 -> x86_64-sie-ps5 in tests (llvm#1…
…05810) `x86_64-sie-ps5` is the triple we share with PS5 toolchain users who have reason to care about such things. The vast majority of PS5 checks and tests already use this variant. Quashing the handful of stragglers will help prevent future copy+paste of the discouraged variant.
Configuration menu - View commit details
-
Copy full SHA for 05ce95e - Browse repository at this point
Copy the full SHA 05ce95eView commit details -
[VPlan] Skip branches marked as dead in cost precomputation.
Don't consider the cost of branches marked to be skipped in VPlan cost pre-computation. Those aren't included in the legacy cost, so they should not be included in the VPlan cost.
Configuration menu - View commit details
-
Copy full SHA for 885c436 - Browse repository at this point
Copy the full SHA 885c436View commit details -
Revert "Reland "[asan] Remove debug tracing from `report_globals` (llvm#104404)" (llvm#105601)"
That change still breaks SanitizerCommon-asan-x86_64-Darwin :: Darwin/print-stack-trace-in-code-loaded-after-fork.cpp > This reverts commit 2704b80 > and relands llvm#104404. > > The Darwin should not fail after llvm#105599. This reverts commit 8c6f8c2.
Configuration menu - View commit details
-
Copy full SHA for 6a8f738 - Browse repository at this point
Copy the full SHA 6a8f738View commit details -
[clang][rtsan] Reland realtime sanitizer codegen and driver (llvm#102622)
This reverts commit a1e9b7e. This relands commit d010ec6. No modifications from the original patch. It was determined that the ubsan build failure was happening even after the revert, some examples: https://lab.llvm.org/buildbot/#/builders/159/builds/4477 https://lab.llvm.org/buildbot/#/builders/159/builds/4478 https://lab.llvm.org/buildbot/#/builders/159/builds/4479
Configuration menu - View commit details
-
Copy full SHA for f77e8f7 - Browse repository at this point
Copy the full SHA f77e8f7View commit details -
[C23] Update status page for TS 18661 integration (llvm#105693)
WG14 N2401 was removed from the list because it was library-only changes that don't impact the compiler. Everything having to do with decimal floating-point types was changed to No because we do not currently have any support for those. WG14 N2314 remains Unknown because it has changes to Annex F for binary floating-point types.
Configuration menu - View commit details
-
Copy full SHA for 3faf5b9 - Browse repository at this point
Copy the full SHA 3faf5b9View commit details -
[BOLT][test] Removed the use of parentheses in BOLT tests with lit in…
…ternal shell (llvm#105720) This patch addresses compatibility issues with the lit internal shell by removing the use of subshell execution (parentheses and subshell syntax) in the `BOLT` tests. The lit internal shell does not support parentheses, so the tests have been refactored to use separate command invocations, with outputs redirected to temporary files where necessary. This change is relevant for enabling the lit internal shell by default, as outlined in [[RFC] Enabling the Lit Internal Shell by Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179) fixes: llvm#102401
Configuration menu - View commit details
-
Copy full SHA for 7f37932 - Browse repository at this point
Copy the full SHA 7f37932View commit details -
[SCF][PIPELINE] Handle the case when values from the peeled prologue …
…may escape out of the loop (llvm#105755) Previously the values in the peeled prologue that weren't treated with the `predicateFn` were passed to the loop body without any other predication. If those values are later used outside of the loop body, they may be incorrect if the num iterations is smaller than num stages - 1. We need similar masking for those, as is done in the main loop body, using already existing predicates.
Configuration menu - View commit details
-
Copy full SHA for 7c90081 - Browse repository at this point
Copy the full SHA 7c90081View commit details -
[Clang] Implement P2747 constexpr placement new (llvm#104586)
The implementation follows the resolution of CWG2922
Configuration menu - View commit details
-
Copy full SHA for 6e78aef - Browse repository at this point
Copy the full SHA 6e78aefView commit details -
Configuration menu - View commit details
-
Copy full SHA for 8075576 - Browse repository at this point
Copy the full SHA 8075576View commit details -
[libc++] Remove status pages tracking SpecialMath and Zip (llvm#105672)
Instead of tracking those using our static CSV files, I created lists of subtasks in their respective issues (llvm#99939 and llvm#105169) to track the work that is still left.
Configuration menu - View commit details
-
Copy full SHA for ff5552c - Browse repository at this point
Copy the full SHA ff5552cView commit details -
Configuration menu - View commit details
-
Copy full SHA for b8f1505 - Browse repository at this point
Copy the full SHA b8f1505View commit details -
Configuration menu - View commit details
-
Copy full SHA for 5a25854 - Browse repository at this point
Copy the full SHA 5a25854View commit details -
[mlir][Transforms][NFC] Move `ReconcileUnrealizedCasts` implementation (llvm#104671)
Move the implementation of `ReconcileUnrealizedCasts` to `DialectConversion.cpp`, so that it can be called from there in a future commit. This commit is in preparation of decoupling argument/source/target materializations from the dialect conversion framework. The existing logic around unresolved materializations that predicts IR changes to decide if a cast op can be folded/erased will become obsolete, as `ReconcileUnrealizedCasts` will perform these kinds of foldings on fully materialized IR. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for a9f6224 - Browse repository at this point
Copy the full SHA a9f6224View commit details -
Reland "[clang] Merge lifetimebound and GSL code paths for lifetime a…
…nalysis (llvm#104906)" (llvm#105838) Reland without the `EnableLifetimeWarnings` removal. I will remove the EnableLifetimeWarnings in a follow-up patch. I have added a test to prevent regression.
Configuration menu - View commit details
-
Copy full SHA for b1560bd - Browse repository at this point
Copy the full SHA b1560bdView commit details -
Revert "[lldb] Speculative fix for trap_frame_sym_ctx.test"
This reverts commit 19d3f34.
Configuration menu - View commit details
-
Copy full SHA for fd7904a - Browse repository at this point
Copy the full SHA fd7904aView commit details -
Recommit "[RISCV] Add isel optimization for (and (sra y, c2), c1) to …
…recover regression from llvm#101751. (llvm#104114)" Fixed an incorrect cast. Original message: If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4) followed by a SHXADD with c4 as the X amount. Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4). Alive2: https://alive2.llvm.org/ce/z/AwhheR
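The non-Zba form of the rewrite can be sanity-checked by brute force at a small bit width. The sketch below is an independent Python model (8-bit registers instead of XLEN, hypothetical helper names), not the RISC-V isel code, and it does not model the SHXADD variant.

```python
WIDTH = 8                   # small register width so every input can be enumerated
MASK = (1 << WIDTH) - 1


def to_signed(value):
    """Interpret a WIDTH-bit pattern as a two's-complement integer."""
    value &= MASK
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value


def sra(value, amount):
    """Arithmetic shift right (srai)."""
    return (to_signed(value) >> amount) & MASK


def srl(value, amount):
    """Logical shift right (srli)."""
    return (value & MASK) >> amount


def sll(value, amount):
    """Logical shift left (slli), truncated to WIDTH bits."""
    return (value << amount) & MASK


def shifted_mask(c3, c4):
    """Mask with c3 leading zeros, c4 trailing zeros and ones in between."""
    return ((1 << (WIDTH - c3 - c4)) - 1) << c4


def original(y, c1, c2):
    return sra(y, c2) & c1                               # (and (sra y, c2), c1)


def rewritten(y, c2, c3, c4):
    return sll(srl(sra(y, c2 - c3), c3 + c4), c4)        # slli (srli (srai ...))


def check_all():
    for c3 in range(WIDTH):
        for c4 in range(WIDTH - c3):                     # keep at least one set bit
            c1 = shifted_mask(c3, c4)
            for c2 in range(c3 + 1, WIDTH):              # requires c2 > c3
                for y in range(1 << WIDTH):
                    if original(y, c1, c2) != rewritten(y, c2, c3, c4):
                        return False
    return True
```

For example, with `y = 0b10110100`, `c2 = 3`, `c3 = 2` and `c4 = 1`, both forms produce `0b00110110`.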
Configuration menu - View commit details
-
Copy full SHA for 0381e01 - Browse repository at this point
Copy the full SHA 0381e01View commit details -
Configuration menu - View commit details
-
Copy full SHA for 3d18cea - Browse repository at this point
Copy the full SHA 3d18ceaView commit details -
InstructionSelect: Use GISelChangeObserver instead of MachineFunction…
…::Delegate (llvm#105725) The main difference is that it's possible for multiple change observers to be installed at the same time whereas there can only be one MachineFunction delegate installed. This allows downstream targets to continue to use observers to recursively select. The target in question was selecting a gMIR instruction to a machine instruction plus some gMIR around it and relying on observers to ensure it correctly selected any gMIR it created before returning to the main loop.
Configuration menu - View commit details
-
Copy full SHA for 0bf5846 - Browse repository at this point
Copy the full SHA 0bf5846View commit details -
[SCCP] fix non-determinism (llvm#105758)
the visit order depended on hashing because we iterated over a SmallPtrSet
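The general remedy for this class of bug is to iterate a container whose order follows insertion rather than hashing (in LLVM, `SetVector` is a common choice; whether this commit uses it is not stated here). A minimal Python analogue, with a hypothetical `Block` class standing in for the visited objects:

```python
class Block:
    """Stand-in for an IR object whose default hash depends on its address."""

    def __init__(self, name):
        self.name = name


def visit_order(blocks):
    # dict.fromkeys deduplicates while preserving first-insertion order,
    # so the traversal does not depend on hash values or addresses.
    ordered_unique = dict.fromkeys(blocks)
    return [block.name for block in ordered_unique]


entry, loop, exit_block = Block("entry"), Block("loop"), Block("exit")
order = visit_order([entry, loop, exit_block, entry])
```

Iterating a plain `set` of these objects instead would give an order tied to object addresses, which can differ from run to run.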
Configuration menu - View commit details
-
Copy full SHA for aec3ec0 - Browse repository at this point
Copy the full SHA aec3ec0View commit details -
[X86] Add some initial test coverage for half libcall expansion/promo…
…tion We can add additional tests in the future, but this is an initial placeholder Inspired by llvm#105775
Configuration menu - View commit details
-
Copy full SHA for df97673 - Browse repository at this point
Copy the full SHA df97673View commit details -
[NFC] Fix an incorrect comment about operator precedence. (llvm#105784)
The comment talks about left-associative operators twice, when the latter mention is actually describing right-associative operators.
Configuration menu - View commit details
-
Copy full SHA for 1821cb3 - Browse repository at this point
Copy the full SHA 1821cb3View commit details -
[ctx_prof] Remove the dependency on the "name" GlobalVariable (llvm#1…
…05731) We don't need that name variable for contextual instrumentation, we just use the function to get its GUID which we pass to the runtime, and rely on metadata to capture it through the various optimization passes. This change removes the need for the name global variable.
Configuration menu - View commit details
-
Copy full SHA for 960a210 - Browse repository at this point
Copy the full SHA 960a210View commit details -
[orc][mach-o] Unlock the JITDylib state mutex during +load (llvm#105333)
Similar to what was already done for static initializers, we need to unlock the state mutex when calling out to libobjc to run +load methods in case they cause us to reenter the runtime, which was previously deadlocking. No test for now, because we don't have any code paths in llvm-jitlink itself that could lead to this deadlock. If we interpose calls to dlopen to go back to the JIT in the future then calling dlopen from a +load is the easiest way to reproduce this. rdar://133430490
Configuration menu - View commit details
-
Copy full SHA for fa089ef - Browse repository at this point
Copy the full SHA fa089efView commit details -
Implement resource binding type prefix mismatch diagnostic infrastruc…
…ture (llvm#97103) There are currently no diagnostics being emitted for when a resource is bound to a register with an incorrect binding type prefix. For example, a CBuffer type resource should be bound with a binding type prefix of 'b', but if instead the prefix is 'u', no errors will be emitted. This PR implements such diagnostics. The focus of this PR is to implement both the flag setting and diagnostic emission steps specified in the relevant spec: microsoft/hlsl-specs#230 The relevant issue is: llvm#57886 This is a continuation / refresh of this PR: llvm#87578
Configuration menu - View commit details
-
Copy full SHA for ebc4a66 - Browse repository at this point
Copy the full SHA ebc4a66View commit details -
[mlir][sparse] partially support lowering sparse coiteration loops to…
… scf.while/for. (llvm#105565)
Peiming Liu authored Aug 23, 2024 Configuration menu - View commit details
-
Copy full SHA for f607102 - Browse repository at this point
Copy the full SHA f607102View commit details -
[Flang][OpenMP] Align map clause generation and fix issue with non-sh…
…ared allocations for assumed shape/size descriptor types (llvm#97855) This PR aims to unify the map argument generation behavior across both the implicit capture (captured in a target region) and the explicit capture (process map). Currently the varPtr field of the MapInfo for the same variable will be different depending on how it's captured. This PR tries to align that across the generations of MapInfoOp in the OpenMP lowering. Currently, I have opted to utilise the rawInput (input memref to a HLFIR DeclareInfoOp) as opposed to the addr field, which includes more information. The side effect of this is that we have to deal with BoxTypes less often, which will result in simpler maps in these cases. The negative side effect of this is that we don't have access to the bounds information through the resulting value; however, I believe the bounds information we require in our case is still appropriately stored in the map bounds, and this seems to be the case from testing so far. The other fix is for cases where we end up with a BoxType argument into a function (certain assumed shape and size cases do this) that has no fir.ref wrapping it. As we need the Box to be a reference type to actually utilise the operation to access the base address stored inside and create the correct mappings, we currently generate an intermediate allocation in these cases, store into it, and utilise this as the map argument, as opposed to the original. However, as we were not sharing the same intermediate allocation across all of the maps for a variable, this resulted in errors in certain cases when detaching/attaching the data, e.g. via enter and exit. This PR adjusts this for such cases. Currently we only maintain tracking of all intermediate allocations for the current function scope, as opposed to module scope,
primarily because the only case I am aware of where this is required is when we pass certain types of arguments to functions (so I opted to minimize the overhead of the pass for now). It could likely be extended to module scope if we find other cases where it's applicable and causing issues.
Configuration menu - View commit details
-
Copy full SHA for f4cf93f - Browse repository at this point
Copy the full SHA f4cf93fView commit details -
Configuration menu - View commit details
-
Copy full SHA for d86349c - Browse repository at this point
Copy the full SHA d86349cView commit details -
Revert "Revert "[lldb] Speculative fix for trap_frame_sym_ctx.test""
This reverts commit fd7904a.
Configuration menu - View commit details
-
Copy full SHA for b7c1be1 - Browse repository at this point
Copy the full SHA b7c1be1View commit details -
Revert "Revert "[lldb] Extend frame recognizers to hide frames from b…
…acktraces (llvm#104523)"" This reverts commit 547917a.
Configuration menu - View commit details
-
Copy full SHA for 3c0fba4 - Browse repository at this point
Copy the full SHA 3c0fba4View commit details -
Revert "Revert "[lldb-dap] Mark hidden frames as "subtle" (llvm#105457)…
…"" This reverts commit aa70f83.
Configuration menu - View commit details
-
Copy full SHA for 9e9e823 - Browse repository at this point
Copy the full SHA 9e9e823View commit details -
Revert "Revert "[lldb][swig] Use the correct variable in the return s…
…tatement"" This reverts commit 7323e7e.
Configuration menu - View commit details
-
Copy full SHA for ad75775 - Browse repository at this point
Copy the full SHA ad75775View commit details -
Configuration menu - View commit details
-
Copy full SHA for 11d2de4 - Browse repository at this point
Copy the full SHA 11d2de4View commit details -
[TableGen] Refactor SequenceToOffsetTable class (llvm#104986)
- Replace use of std::isalnum/ispunct with StringExtras version to avoid possibly locale dependent behavior. - Remove `static` from printChar (so it's deduplicated when linking). - Use range based for loops and structured bindings. - No need to use `llvm::` for code in llvm namespace.
Configuration menu - View commit details
-
Copy full SHA for a968ae6 - Browse repository at this point
Copy the full SHA a968ae6View commit details -
[mlir][sparse] refactoring sparse_tensor.iterate lowering pattern imp…
…lementation. (llvm#105566)
Peiming Liu authored Aug 23, 2024 Configuration menu - View commit details
-
Copy full SHA for 7186704 - Browse repository at this point
Copy the full SHA 7186704View commit details -
[Clang] Assert non-null enum definition in CGDebugInfo::CreateTypeDef…
…inition(const EnumType*) (llvm#105556) This commit adds an assert to check for a non-null enum definition in CGDebugInfo::CreateTypeDefinition(const EnumType*), ensuring precondition validity. Previous discussion on llvm#97105
Configuration menu - View commit details
-
Copy full SHA for 8f08b75 - Browse repository at this point
Copy the full SHA 8f08b75View commit details -
[flang][runtime] Add FLANG_RUNTIME_NO_REAL_3 flag to build (llvm#105856)
Allow a runtime build to disable SELECTED_REAL_KIND from returning kind 3 (16-bit truncated form of 32-bit IEEE-754 floating point, a/k/a "brain float" or bfloat16).
Configuration menu - View commit details
-
Copy full SHA for 57b89fd - Browse repository at this point
Copy the full SHA 57b89fdView commit details -
[LLD][COFF] Add support for CHPE redirection metadata. (llvm#105739)
This is part of CHPE metadata containing a sorted list of x86_64 export thunks RVAs and RVAs of ARM64EC functions associated with them. It's stored in a dedicated .a64xrm section.
Configuration menu - View commit details
-
Copy full SHA for caa844e - Browse repository at this point
Copy the full SHA caa844eView commit details -
Configuration menu - View commit details
-
Copy full SHA for ceb587a - Browse repository at this point
Copy the full SHA ceb587aView commit details -
[mlir][SCF] Allow canonicalization of zero-trip count `scf.forall` with empty mapping. (llvm#105793)
Current folding of one-trip count loop does not kick in with an empty mapping. Enable this for empty mapping. Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 00620ab - Browse repository at this point
Copy the full SHA 00620abView commit details -
[DXIL][Analysis] Uniquify duplicate resources in DXILResourceAnalysis
If a resource is used multiple times, we should only have one resource record for it. This comes up most prominently with arrays of resources like so:
```hlsl
RWBuffer<float4> BufferArray[10] : register(u0, space4);
RWBuffer<float4> B1 = BufferArray[0];
RWBuffer<float4> B2 = BufferArray[SomeIndex];
RWBuffer<float4> B3 = BufferArray[3];
```
In this case, there's only one resource, but we'll generate 3 different `dx.handle.fromBinding` calls to access different slices. Note that this adds some API that won't be used until llvm#104447 later in the stack. Trying to avoid that results in unnecessary churn. Fixes llvm#105143 Pull Request: llvm#105602
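The deduplication idea can be sketched as grouping handle-creation calls by the binding they refer to. Everything below is hypothetical: the `Binding` fields, the call labels and the grouping key are illustration only and are not the actual `DXILResourceAnalysis` data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """Hypothetical key describing where a resource (or resource array) is bound."""
    resource_class: str
    space: int
    lower_bound: int
    size: int


def unique_resources(handle_calls):
    """Group handle-creation calls by binding so each binding yields one record."""
    records = {}
    for call_name, binding in handle_calls:
        records.setdefault(binding, []).append(call_name)
    return records


# BufferArray[10] : register(u0, space4), accessed through three handle calls.
buffer_array = Binding(resource_class="UAV", space=4, lower_bound=0, size=10)
calls = [("B1", buffer_array), ("B2", buffer_array), ("B3", buffer_array)]
records = unique_resources(calls)
```

Here the three calls collapse into a single record that still remembers every call site using it.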
Configuration menu - View commit details
-
Copy full SHA for 782bc4f - Browse repository at this point
Copy the full SHA 782bc4fView commit details -
Configuration menu - View commit details
-
Copy full SHA for a0fac6f - Browse repository at this point
Copy the full SHA a0fac6fView commit details -
[LLD][COFF] Add support for CHPE code ranges metadata. (llvm#105741)
This is part of CHPE metadata containing a sorted list of x86_64 export thunks RVAs and sizes.
Configuration menu - View commit details
-
Copy full SHA for 52a7116 - Browse repository at this point
Copy the full SHA 52a7116View commit details -
Deprecate -fheinous-gnu-extensions; introduce a new warning flag (llv…
…m#105821) The new warning flag is `-Winvalid-gnu-asm-cast`, which is enabled by default and is a downgradable diagnostic which defaults to an error. This language dialect flag only controls whether a single diagnostic is emitted as a warning or as an error, and has never been expanded to include other behaviors. Given the rather pejorative name, it's better for us to just expose a diagnostic flag for the one warning in question and let the user elect to do `-Wno-error=` if they need to. There's not a lot of use of the language dialect flag in the wild, but there is some use of it. For the time being, this aliases the -f flag to `-Wno-error=invalid-gnu-asm-cast`, but the -f flag can eventually be removed.
Configuration menu - View commit details
-
Copy full SHA for c505ce9 - Browse repository at this point
Copy the full SHA c505ce9View commit details -
Configuration menu - View commit details
-
Copy full SHA for a74f0ab - Browse repository at this point
Copy the full SHA a74f0abView commit details -
[DirectX] Lower `@llvm.dx.handle.fromBinding` to DXIL ops
The `@llvm.dx.handle.fromBinding` intrinsic is lowered either to the `CreateHandle` op or a pair of `CreateHandleFromBinding` and `AnnotateHandle` ops, depending on the DXIL version. Regardless of the DXIL version we need to emit metadata about the binding, but that's left to a separate change. These DXIL ops all need to return the `%dx.types.Handle` type, but the llvm intrinsic returns a target extension type. To facilitate changing the type of the operation and all of its users, we introduce `%llvm.dx.cast.handle`, which can cast between the two handle representations. Pull Request: llvm#104251
Configuration menu - View commit details
-
Copy full SHA for aa61925 - Browse repository at this point
Copy the full SHA aa61925View commit details -
[GDBRemote] Fix processing of comma-separated memory region entries (l…
…lvm#105873) The existing algorithm was performing the following comparisons for an `aaa,bbb,ccc,ddd`: aaa\0bbb,ccc,ddd == "stack" aaa\0bbb\0ccc,ddd == "stack" aaa\0bbb\0ccc\0ddd == "stack" Which wouldn't work. This commit just dispatches to a known algorithm implementation.
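The intended behaviour, splitting the value on commas and comparing each entry separately, looks roughly like this in Python. The function name is hypothetical and this is not the lldb code, which dispatches to an existing C++ splitting routine.

```python
def region_has_type(value: str, wanted: str) -> bool:
    """Return True if `wanted` is one of the comma-separated entries in `value`."""
    # Each entry is compared on its own, never the remaining buffer as a whole.
    return wanted in value.split(",")


found_in_middle = region_has_type("aaa,bbb,stack,ddd", "stack")
not_present = region_has_type("aaa,bbb,ccc,ddd", "stack")
```

Comparing the whole buffer after each separator, as the old loop effectively did, can only ever match when the value consists of a single entry.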
Configuration menu - View commit details
-
Copy full SHA for 8b4147d - Browse repository at this point
Copy the full SHA 8b4147dView commit details -
[nfc][mlgo] Incrementally update DominatorTreeAnalysis in FunctionPro…
…pertiesAnalysis (llvm#104867) We need the dominator tree analysis for loop info analysis, which we need to get features like most nested loop and number of top level loops. Invalidating and recomputing these from scratch after each successful inlining can sometimes lead to lengthy compile times. We don't need to recompute from scratch, though, since we have some boundary information about where the changes to the CFG happen; moreover, for dom tree, the API supports incrementally updating the analysis result. This change addresses the dom tree part. The loop info is still recomputed from scratch. This does reduce the compile time quite significantly already, though (~5x in a specific case) The loop info change might be more involved and would follow in a subsequent PR.
Configuration menu - View commit details
-
Copy full SHA for a2a5508 - Browse repository at this point
Copy the full SHA a2a5508View commit details -
[mlir][Linalg] Avoid doing op replacement in `linalg::dropUnitDims`. (llvm#105749)
It is better to do the replacement in the caller. This avoids the footgun if the caller needs the original operation. Instead return the produced operation and replacement values. Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 4dbaef6 - Browse repository at this point
Copy the full SHA 4dbaef6View commit details -
[mlir][Transforms] Dialect conversion: Make materializations optional (…
…llvm#104668) This commit makes source/target/argument materializations (via the `TypeConverter` API) optional. By default (`ConversionConfig::buildMaterializations = true`), the dialect conversion infrastructure tries to legalize all unresolved materializations right after the main transformation process has succeeded. If at least one unresolved materialization fails to resolve, the dialect conversion fails. (With an error message such as `failed to legalize unresolved materialization ...`.) Automatic materializations through the `TypeConverter` API can now be deactivated. In that case, every unresolved materialization will show up as a `builtin.unrealized_conversion_cast` op in the output IR. There used to be a complex and error-prone analysis in the dialect conversion that predicted the future uses of unresolved materializations. Based on that logic, some casts (that were deemed unnecessary) were folded. This analysis was needed because folding happened at a point in time when some IR changes (e.g., op replacements) had not materialized yet. This commit removes that analysis. Any folding of cast ops now happens after all other IR changes have been materialized and the uses can directly be queried from the IR. This simplifies the analysis significantly. And certain helper data structures such as `inverseMapping` are no longer needed for the analysis. The folding itself is done by `reconcileUnrealizedCasts` (which also exists as a standalone pass). After casts have been folded, the remaining casts are materialized through the `TypeConverter`, as usual. This last step can be deactivated in the `ConversionConfig`. `ConversionConfig::buildMaterializations = false` can be used to debug error messages such as `failed to legalize unresolved materialization ...`. (It is also useful in case automatic materializations are not needed.) The materializations that failed to resolve can then be seen as `builtin.unrealized_conversion_cast` ops in the resulting IR. 
(This is better than running with `-debug`, because `-debug` shows IR where some IR changes have not been materialized yet.)
Configuration menu - View commit details
-
Copy full SHA for d7073c5 - Browse repository at this point
Copy the full SHA d7073c5View commit details -
[rtsan][compiler-rt] Prevent UB hang in rtsan lock unit tests (llvm#1…
…04733) It is undefined behavior to lock or unlock an uninitialized lock, or to unlock a lock that isn't locked. Introduce a fixture to set up and tear down the locks where appropriate, and split the checks into two tests (realtime death and non-realtime survival) so each test is guaranteed a fresh lock.
Configuration menu - View commit details
-
Copy full SHA for 64afbf0 - Browse repository at this point
Copy the full SHA 64afbf0View commit details -
[Bitcode] Use DenseSet instead of std::set (NFC) (llvm#105851)
DefOrUseGUIDs is used only for membership checking purposes. We don't need std::set's strengths like iterators staying valid or the ability to traverse in a sorted order. While I am at it, this patch replaces count with contains for slightly increased readability.
Commit 3b703d4
-
[InstCombine] Fold `(x < y) ? -1 : zext(x > y)` and `(x > y) ? 1 : sext(x < y)` to `ucmp/scmp(x, y)` (llvm#105272)
This patch expands the existing functionality to include these two additional folds, which are nearly identical to the ones already implemented. Proofs: https://alive2.llvm.org/ce/z/Xy7s4j
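As a rough illustration of why the two select forms are equivalent to a three-way comparison, the following standalone sketch models them in plain C++ and compares them exhaustively over 8-bit values. It is not the InstCombine code; the function names (`threeWay`, `selectLtForm`, `selectGtForm`, `formsMatchThreeWay`) are hypothetical, and the signed instantiation stands in for `scmp` while the unsigned one stands in for `ucmp`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Reference three-way comparison: -1 if x < y, 0 if equal, 1 if x > y.
// Hypothetical stand-in for what scmp (signed T) / ucmp (unsigned T) compute.
template <typename T> int threeWay(T x, T y) { return (x > y) - (x < y); }

// Models `(x < y) ? -1 : zext(x > y)`: the bool widens to 0 or 1.
template <typename T> int selectLtForm(T x, T y) {
  return (x < y) ? -1 : static_cast<int>(x > y);
}

// Models `(x > y) ? 1 : sext(x < y)`: a true i1 sign-extends to -1.
template <typename T> int selectGtForm(T x, T y) {
  return (x > y) ? 1 : -static_cast<int>(x < y);
}

// Exhaustively checks both forms against the reference for every pair of
// values representable in T (intended for 8-bit types).
template <typename T> bool formsMatchThreeWay() {
  const int lo = std::numeric_limits<T>::min();
  const int hi = std::numeric_limits<T>::max();
  for (int a = lo; a <= hi; ++a) {
    for (int b = lo; b <= hi; ++b) {
      const T x = static_cast<T>(a);
      const T y = static_cast<T>(b);
      if (selectLtForm(x, y) != threeWay(x, y)) return false;
      if (selectGtForm(x, y) != threeWay(x, y)) return false;
    }
  }
  return true;
}
```

Instantiating `formsMatchThreeWay` with `std::int8_t` and `std::uint8_t` covers the signed and unsigned cases; the Alive2 link in the commit message remains the authoritative proof.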
Commit da6f423
-
[compiler-rt][nsan] Add support for nan detection (llvm#101531)
Add support for nan detection. llvm#100305
Commit 283dff4
-
[mlir][sparse] unify block arguments order between iterate/coiterate operations. (llvm#105567)
Peiming Liu authored Aug 23, 2024
Commit b48ef8d
-
[SPIRV] Fix return type mismatch for createSPIRVEmitNonSemanticDIPass (llvm#105889)
The declaration in SPIRV.h had this returning a `MachineFunctionPass *`, but the implementation returned a `FunctionPass *`. This showed up as a build error on Windows, but it was clearly a mistake regardless. I also updated the pass to include SPIRV.h rather than using its own declarations for pass initialization, as this results in better errors for this kind of typo. Fixes a build break after llvm#97558.
Commit 3e763db
-
"Reland "[asan] Remove debug tracing from `report_globals` (llvm#104404)" (llvm#105895)
Reland llvm#104404. In addition to llvm#104404, it raises the verbosity required for stack tracing on global registration. It confuses a symbolizer test on Darwin. This reverts commit 6a8f738.
Commit 10407be
-
[mlir][tensor] Add TilingInterface support for fusing tensor.pad (llvm#105892)
This adds implementations for the two TilingInterface methods required for fusion to `tensor.pad`: `getIterationDomainTileFromResultTile` and `generateResultTileValue`, allowing fusion of pad with a tiled consumer.
Commit 91e57c6
-
Fix bot failures after PR llvm#104867
An assert was left over after addressing feedback. While fixing that, I realized the way I had addressed the feedback was also incomplete.
Commit cdd11d6
-
Commit ca53611
Commits on Aug 24, 2024
-
[IR] Introduce ModuleToSummariesForIndexTy (NFC) (llvm#105906)
This patch introduces type alias ModuleToSummariesForIndexTy. I'm planning to change the type slightly to allow heterogeneous lookup (that is, std::map<K, V, std::less<>>) in a subsequent patch. The problem is that changing the type affects many places. Using a type alias reduces the impact.
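To make the motivation concrete, here is a small standalone sketch of heterogeneous lookup with `std::less<>`. The alias name `SummaryMap`, its `int` value type, and the helper `hasModule` are hypothetical; the real key and value types behind `ModuleToSummariesForIndexTy` are not given in this message. The point is only that a transparent comparator lets `find` accept a `std::string_view` without first building a temporary `std::string`, and that a single alias confines such a change to one place.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Hypothetical alias: callers spell the alias, so changing the comparator
// later touches only this one line.
using SummaryMap = std::map<std::string, int, std::less<>>;

// With the transparent comparator std::less<>, find() compares the
// std::string keys against the std::string_view directly.
bool hasModule(const SummaryMap &modules, std::string_view name) {
  return modules.find(name) != modules.end();
}
```

This sketch assumes C++17 for `std::string_view`; the transparent-comparator overloads of `find` themselves date from C++14.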
Commit dbd7ce0
-
Commit 1f89cd4
-
[include-cleaner] Turn new/delete usages to ambiguous references (llvm#105844)
In practice most of these expressions just resolve to the implicitly provided `operator new`, and the standard says it's not necessary to include `<new>` for that. Hence this results in a lot of churn in cases where the inclusion of `<new>` doesn't matter and might even be undesired by the developer. By switching to an ambiguous reference we try to find a middle ground: we don't drop providers of `operator new` when the developer explicitly listed them in the includes, and in other cases we choose to believe it's the implicitly provided `operator new` and don't insert an include.
Commit 74b538d
-
[clang-format] Treat new expressions as simple functions (llvm#105168)
ccae7b4 improved handling for nested calls, but this resulted in a lot of changes near `new` expressions. This patch tries to restore the previous behavior around `new` expressions by treating them as simple functions, which seems to align with the concept. Fixes llvm#105133.
Commit e439fdf
-
[SandboxIR] Implement CleanupReturnInst (llvm#105750)
This patch implements sandboxir::CleanupReturnInst mirroring llvm::CleanupReturnInst.
Commit d021321
-
[StableHash] Implement with xxh3_64bits (llvm#105849)
This is a follow-up to address a suggestion from llvm#105619. The main goal of this change is to efficiently implement stable hash functions using the xxh3 64bits API. `stable_hash_combine_range` and `stable_hash_combine_array` functions are removed and consolidated into a more general `stable_hash_combine` function that takes an `ArrayRef<stable_hash>` as input.
Commit 7615c0b
-
[docs] Fix links in github user guide - graphite section
Mistakenly used markdown style rather than rst in llvm#104499.
Commit 6260125
-
Commit 75ef955
-
[clang][bytecode] Fix IntegralAP::is{Positive,Negative} (llvm#105924)
This depends on signed-ness.
Commit c81d666
-
Commit 68030f8
-
Commit 62e7b59
-
Revert ""Reland "[asan] Remove debug tracing from `report_globals` (llvm#104404)"" (llvm#105926)
Reverts llvm#105895. Still breaks the test: https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/1864/
Commit e185850
-
[Clang] Overflow Pattern Exclusion - rename some patterns, enhance docs (llvm#105709)
From @vitalybuka's review on llvm#104889:
- [x] remove unused variable in tests
- [x] rename `post-decr-while` --> `unsigned-post-decr-while`
- [x] split `add-overflow-test` into `add-unsigned-overflow-test` and `add-signed-overflow-test`
- [x] be more clear about defaults within docs
- [x] add table to docs
Here's a screenshot of the rendered table so you don't have to build the html docs yourself to inspect the layout: ![image](https://github.com/user-attachments/assets/5d3497c4-5f5a-4579-b29b-96a0fd192faa)
CCs: @vitalybuka
---------
Signed-off-by: Justin Stitt <justinstitt@google.com>
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
Commit 76236fa
-
[clang][bytecode][NFC] Add an additional assertion (llvm#105927)
Since this must be true, add an assertion instead of just documenting it via the comment.
Commit 99b85ca
-
[InstCombine] Update the `select` operand when the `cond` is `trunc` and has the `nuw` or `nsw` property. (llvm#105914)
This patch updates the select operand when the cond has the nuw or nsw property. Considering the semantics of the nuw and nsw flags, if there is no poison value in this expression, this code assumes that X can only be 0, 1 or -1. close: llvm#96765 alive2: https://alive2.llvm.org/ce/z/3n3n2Q
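The reasoning can be sketched with a small standalone model, assuming the usual reading of the flags: a non-poison `trunc nuw X to i1` implies X is 0 or 1, and a non-poison `trunc nsw X to i1` implies X is 0 or -1. Under that assumption, the arm of the select that uses X can be replaced by a constant. The helper names below are hypothetical, 8-bit integers stand in for the source type, and `std::nullopt` stands in for poison; this is not the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Model of `trunc nuw i8 X to i1`: poison (nullopt) unless X is 0 or 1.
std::optional<bool> truncNuwToI1(std::int8_t x) {
  if (x != 0 && x != 1) return std::nullopt;
  return x == 1;
}

// Model of `trunc nsw i8 X to i1`: poison (nullopt) unless X is 0 or -1.
std::optional<bool> truncNswToI1(std::int8_t x) {
  if (x != 0 && x != -1) return std::nullopt;
  return x == -1;
}

// For every non-poison X, checks that the X operand of the select can be
// replaced by the constant implied by the condition:
//   true arm:  1 for nuw, -1 for nsw;  false arm: 0 in both cases.
bool selectOperandFoldHolds() {
  const std::int8_t y = 42; // arbitrary other operand
  for (int xi = -128; xi <= 127; ++xi) {
    const std::int8_t x = static_cast<std::int8_t>(xi);
    if (auto c = truncNuwToI1(x)) {
      if ((*c ? x : y) != (*c ? 1 : y)) return false;
      if ((*c ? y : x) != (*c ? y : 0)) return false;
    }
    if (auto c = truncNswToI1(x)) {
      if ((*c ? x : y) != (*c ? -1 : y)) return false;
      if ((*c ? y : x) != (*c ? y : 0)) return false;
    }
  }
  return true;
}
```

The Alive2 link in the commit message is the authoritative proof; the model above only spells out the value ranges the message alludes to.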
Commit 43c6fb2
-
[Tests] Attempt to fix PowerPC buildbots.
The intent is that the tests should not be running on PowerPC as the fp128 type will differ. This attempts to fix the bots by using __powerpc__ instead, which appears to be defined in godbolt.
Commit 001e423
-
[RISCV] Don't move source if passthru already dominates in vmv.v.v peephole (llvm#105792)
Currently we move the source down to the vmv.v.v to make sure that the new passthru dominates, but we do this even if it already does. This adds a simple local dominance check (taken from X86FastPreTileConfig.cpp) and avoids doing the move if it can. It also modifies the move to only move the source to just past the passthru definition, and not all the way down to the vmv.v.v. This allows folding to succeed in some edge cases, which prevents regressions in an upcoming patch.
Commit be5ecc3
-
[VPlan] Wrap planContainsAdditionalSimplifications in NDEBUG (NFC)
Only used for an assertion.
Commit 40975da
-
[ConstantFolding] Ensure TLI is valid when simplifying fp128 intrinsics.
TLI might not be valid in all contexts where constant folding is performed. Add a quick guard that it is not null.
Commit 83a5c7c
-
Commit 08acc3f
-
[lit] Export env vars in script to avoid pruning (llvm#105759)
On macOS the dynamic loader prunes dyld specific environment variables such as `DYLD_INSERT_LIBRARIES`, `DYLD_LIBRARY_PATH`, etc. If these are set in the lit config it's safe to assume that the user actually wanted their subprocesses to run with these variables, versus the python interpreter that gets executed with them before they are pruned. This change exports all known variables in the shell script instead of relying on them being passed through.
Commit 65b7cbb
-
Update Python requirements to fix more CVEs (llvm#105853)
Followup to llvm#90109. At Microsoft, our automated scans are warning that LLVM has vulnerable dependencies. Specifically:
* [CVE-2024-35195](https://nvd.nist.gov/vuln/detail/CVE-2024-35195) was fixed in `requests` 2.32.0.
* [CVE-2024-37891](https://nvd.nist.gov/vuln/detail/CVE-2024-37891) was fixed in `urllib3` 2.2.2.
I've updated LLVM's dependencies by running the following commands in `llvm/utils/git`:
```
pip-compile --upgrade --generate-hashes --output-file=requirements.txt requirements.txt.in
pip-compile --upgrade --generate-hashes --output-file=requirements_formatting.txt requirements_formatting.txt.in
```
Note that for `requirements_formatting.txt` this adds `--generate-hashes` (according to my vague understanding, it's highly desirable and was already used for `requirements.txt`) and was locally run within `llvm/utils/git` (changing the recorded command, which apparently was originally run from the repo root - again, `requirements.txt` was already being regenerated with a locally run command, so this increases consistency). I observe that this has updated the relevant components to pick up the CVE fixes. Note that I am largely clueless in this area, so I hope that (like llvm#90109) no other changes will be necessary.
Commit 7036394
-
[libc++][test] Fix `msvc_is_lock_free_macro_value()` (llvm#105876)
Followup to llvm#99570.
* `TEST_COMPILER_MSVC` must be tested for `defined`ness, as it is everywhere else.
  + Definition: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/support/test_macros.h#L71-L72
  + Example usage: https://github.com/llvm/llvm-project/blob/52a7116f5c6ada234f47f7794aaf501a3692b997/libcxx/test/std/utilities/function.objects/func.not_fn/not_fn.pass.cpp#L248
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(33): fatal error C1017: invalid integer constant expression`
* Fix bogus return type: `msvc_is_lock_free_macro_value()` returns `2` or `0`, so it needs to return `int`.
  + Fixes: `llvm-project\libcxx\test\support\atomic_helpers.h(41): warning C4305: 'return': truncation from 'int' to 'bool'`
* Clarity improvement: also add parens when mixing bitwise with arithmetic operators.
Commit 886b761
-
Commit a5d89d5
-
[llvm][NVPTX] Fix RAUW bug in NVPTXProxyRegErasure (llvm#105871)
Fix bug introduced in llvm#105730. The bug is in how the batch RAUW is implemented. If we have
```
%0 = mov %src
%1 = mov %0
use %0
use %1
```
The use of `%1` is rewritten to `%0`, not `%src`. This PR just looks for a replacement when it maps to the src register, which should transitively propagate the replacements.
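The idea of propagating replacements transitively can be sketched outside of LLVM as follows. This is a minimal model and not the NVPTXProxyRegErasure code: registers are plain strings, and `ReplacementMap`, `recordMov`, and `rewriteUse` are hypothetical names. When a `mov` is recorded whose source already has a replacement, the destination is mapped to that existing replacement, so a chain of movs resolves to the original source.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Reg = std::string;
using ReplacementMap = std::unordered_map<Reg, Reg>;

// Records `dst = mov src`. If src is itself scheduled for replacement,
// dst is mapped to src's replacement rather than to src, which is what
// makes the replacements propagate through a chain of movs.
void recordMov(ReplacementMap &replacements, const Reg &dst, const Reg &src) {
  auto it = replacements.find(src);
  // Copy the target before inserting, since insertion may rehash the map.
  const Reg target = (it != replacements.end()) ? it->second : src;
  replacements[dst] = target;
}

// Returns the register a use of `reg` should be rewritten to.
Reg rewriteUse(const ReplacementMap &replacements, const Reg &reg) {
  auto it = replacements.find(reg);
  return (it != replacements.end()) ? it->second : reg;
}
```

With the movs from the example recorded in program order (`%0 = mov %src`, then `%1 = mov %0`), uses of both `%0` and `%1` are rewritten to `%src` in this model.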
Commit 31b4bf9
-
[DAG][RISCV] Use vp_reduce_fadd/fmul when widening types for FP reductions (llvm#105840)
This is a follow up to llvm#105455 which updates the VPIntrinsic mappings for the fadd and fmul cases, and supports both ordered and unordered reductions. This allows the use of a single wider operation with a restricted EVL instead of padding the vector with the neutral element. This has all the same tradeoffs as the previous patch.
Commit 2cb25d5
-
Commit d252365
-
Commit 6f618a7
-
Commit 9f82f6d
-
[clang-cl] [AST] Reapply llvm#102848 Fix placeholder return type name mangling for MSVC 1920+ / VS2019+ (llvm#104722)
Reapply llvm#102848. The description in this PR details the changes from the reverted original PR above.
For `auto&&` return types that can partake in reference collapsing, we weren't properly handling the mangling that can arise. When collapsing occurs, an inner reference is created with the collapsed reference type. If we return `int&` from such a function, then an inner reference of `int&` is created within the `auto&&` return type. `getPointeeType` on a reference type goes through all inner references before returning the pointee type, which ends up being a builtin type, `int`, which is unexpected. We can use `getPointeeTypeAsWritten` to get the `AutoType` as expected; however, for the instantiated template declaration, reference collapsing has already occurred on the return type. This means `auto&&` is turned into `auto&` in our example above, and we end up mangling an lvalue reference type. This is unintended, as MSVC mangles on the declaration of the return type, `auto&&` in this case, which is treated as an rvalue reference.
```
template<class T>
auto&& AutoReferenceCollapseT(int& x) { return static_cast<int&>(x); }

void test() {
    int x = 1;
    auto&& rref = AutoReferenceCollapseT<void>(x); // "??$AutoReferenceCollapseT@X@@ya$$QEA_PAEAH@Z"
    // Mangled as an rvalue reference to auto
}
```
If we are mangling a template with a placeholder return type, we want to get the first template declaration and use its return type to do the mangling of any instantiations. This fixes the bug reported in the original PR that caused the revert with libcxx `std::variant`. I also tested locally with libcxx and the following test code, which fails in the original PR but now works in this PR.
```
#include <variant>

void test() {
    std::variant<int> v{ 1 };
    int& r = std::get<0>(v);
    (void)r;
}
```
Commit 43b8885
-
[AArch64] Replace AND with LSL#2 for LDR target (llvm#34101) (llvm#89531)
Currently, `DAGCombiner` replaces bitwise operations consisting of `LSR`/`LSL` with an `AND`. However, in certain cases, the `AND` generated by this process can be removed. Consider the following case:
```
lsr x8, x8, #56
and x8, x8, #0xfc
ldr w0, [x2, x8]
ret
```
In this case, we can remove the `AND` by changing the target of the `LDR` to `[X2, X8, LSL #2]` and changing the right-shift amount from 56 to 58. After the change:
```
lsr x8, x8, #58
ldr w0, [x2, x8, lsl #2]
ret
```
This patch checks whether the `SHIFTING` + `AND` operation on the load target can be optimized, and optimizes it if so.
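The arithmetic behind this rewrite can be checked with a short standalone sketch. The two functions below (hypothetical names) compute the load offset in both forms for a 64-bit register value: shifting right by 56 and masking with `0xfc` clears the two low bits of the remaining byte, which is the same as shifting right by 58 and then scaling by 4 (`lsl #2`).

```cpp
#include <cassert>
#include <cstdint>

// Offset as computed by `lsr x8, x8, #56` followed by `and x8, x8, #0xfc`.
constexpr std::uint64_t offsetShiftThenMask(std::uint64_t x) {
  return (x >> 56) & 0xfc;
}

// Offset as computed by `lsr x8, x8, #58` with the `lsl #2` folded into
// the addressing mode of the load.
constexpr std::uint64_t offsetShiftThenScale(std::uint64_t x) {
  return (x >> 58) << 2;
}

// Only the top byte of x matters, so a few compile-time spot checks cover
// the interesting bit patterns.
static_assert(offsetShiftThenMask(0xFFFFFFFFFFFFFFFFull) ==
                  offsetShiftThenScale(0xFFFFFFFFFFFFFFFFull),
              "all-ones top byte");
static_assert(offsetShiftThenMask(0xA500000000000000ull) ==
                  offsetShiftThenScale(0xA500000000000000ull),
              "mixed top byte");
```

Since only the top 8 bits of the input survive the shift, looping over the 256 possible top bytes is enough to confirm the equivalence exhaustively.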
Commit 77fccb3
-
[ARM] Add VECTOR_REG_CAST identity fold.
v16i8 VECTOR_REG_CAST (v16i8 Op) can use v16i8 Op directly, as the VECTOR_REG_CAST is a noop.
Commit b9a0276
-
[Mips] Remove a trivial variable (NFC) (llvm#105940)
We assign I->getNumOperands() to J and immediately print that out as a debug message. We don't need to keep J across iterations.
Commit a6f87ab
-
Revert "Enable logf128 constant folding for hosts with 128bit long double (llvm#104929)"
ConstantFolding behaves differently depending on the host's `HAS_IEE754_FLOAT128`. LLVM should not change its behavior depending on host configurations. This reverts commit 14c7e4a. (llvmorg-20-init-3262-g14c7e4a18449 and llvmorg-20-init-3498-g001e423ac626)
Commit 3ef64f7
Commits on Aug 25, 2024
-
[clang-format] Fix a misannotation of redundant r_paren as CastRParen (llvm#105921)
Fixes llvm#105880.
Commit 6bc225e
-
[clang-format] Fix a misannotation of less/greater as angle brackets (llvm#105941)
Fixes llvm#105877.
Commit 0916ae4
-
Commit 5c94dd7
-
[RISCV][ISel] Move VCIX ISDs to correct position. NFC (llvm#105934)
The VCIX ISDs are currently placed after FIRST_TARGET_STRICTFP_OPCODE, which is not expected; they should be in the normal opcode area.
Commit 579fd59
-
[CodeGen] Replace MCPhysReg with MCRegister in MachineBasicBlock::isLiveIn/removeLiveIn. NFC
We already used it for addLiveIn.
Commit f22b1da
-
[lldb][TypeSystemClang][NFC] Log failure to InitBuiltinTypes
If we fail to initialize the ASTContext builtins, LLDB may crash in non-obvious ways down the line, e.g., when it tries to call `ASTContext::getTypeSize` on a builtin like `ast.UnsignedCharTy`, which would dereference a `null` `QualType`. The initialization can fail if we either didn't set the `TypeSystemClang` target triple, or if the embedded clang isn't enabled for a certain target. This patch attempts to help pinpoint the failure case post-mortem by adding a log message here that prints the triple. rdar://134260837
Commit 2847020
-
Reapply "[compiler-rt][nsan] Add support for nan detection" (llvm#105909)
This reverts commit 1f89cd4.
Commit 5136521
Commits on Sep 20, 2024
-
Commit f0747cd