[AutoBump] Merge with 6a3982f8 (May 30) (56) #317

Merged
merged 231 commits into from
Sep 11, 2024

Conversation

mgehre-amd
Collaborator

No description provided.

StephanTLavavej and others added 30 commits May 28, 2024 12:15
…to_string` as C++26 (llvm#93255)

[P2845R8](https://wg21.link/P2845R8) "Formatting of
`std::filesystem::path`" and [P2587R3](https://wg21.link/P2587R3)
"`to_string` or not `to_string`" are C++26 features, so they should be
marked accordingly in `generate_feature_test_macro_components.py`.

I verified that without my changes, running the script produced no
edits. Then with my changes, I ran the script to regenerate all files,
with no other manual edits.

Found while running libc++'s tests with MSVC's STL, which noticed this
because it's currently a C++23-only implementation.

Note that @H-G-Hristov has a draft implementation of P2587R3: llvm#78100
Found while running libc++'s tests with MSVC's STL.

* Avoid MSVC warning C5101: use of preprocessor directive in
  function-like macro argument list is undefined behavior.
  + We can easily make this portable by extracting `const bool is_newlib`.
  + Followup to llvm#73440.
  + See llvm#73598.
  + See llvm#73836.
* Avoid MSVC warning C4267: 'return': conversion from 'size_t' to 'int',
  possible loss of data.
  + This warning is valid, but harmless for the test, so
    `static_cast<int>` will avoid it.
* Avoid MSVC warning C4146: unary minus operator applied to unsigned
  type, result still unsigned.
  + This warning is also valid (the scenario is sometimes intentional, but
    surprising enough that it's worth warning about). This is a C++17 test,
    so we can easily avoid it by testing `is_signed_v` at compile-time
    before testing `m < 0` and `n < 0` at run-time.
* Silence MSVC warning C4310: cast truncates constant value.
  + These warnings are being emitted by `T(255)`. Disabling the warning is
    simpler than attempting to restructure the code.
  + Followup to llvm#79791.
* MSVC no longer emits warning C4521: multiple copy constructors
  specified.
  + This warning was removed from the compiler, since at least 2021-12-09.
* Guard `std::__make_from_tuple_impl` tests with `#ifdef _LIBCPP_VERSION` and `LIBCPP_STATIC_ASSERT`.
* Change `_LIBCPP_CONSTEXPR_SINCE_CXX20` to `TEST_CONSTEXPR_CXX20`.
  + Other functions in `variant.swap/swap.pass.cpp` were already using the proper test macro.
* Mark `what` as `[[maybe_unused]]` when used by `TEST_LIBCPP_REQUIRE`.
  + This updates one occurrence in `libcxx/test/libcxx` for consistency.
* Windows `_putenv_s()` takes 2 arguments, not 3.
  + See MSVC documentation: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/putenv-s-wputenv-s?view=msvc-170
  + POSIX `setenv()` takes `int overwrite`, but Windows `_putenv_s()` always overwrites.
* Avoid non-Standard zero-length arrays.
  + Followup to llvm#74183 and llvm#79792.
* Add `operator++()` to `unsized_it`.
  + The Standard requires this due to [N4981][] [move.iter.requirements]/1 "The template parameter `Iterator` shall
    either meet the *Cpp17InputIterator* requirements ([input.iterators])
    or model `input_iterator` ([iterator.concept.input])."
  + MSVC's STL requires this because it has a strengthened exception
    specification in `move_iterator` that inspects the underlying iterator's
    increment operator.
* `uniform_int_distribution` forbids `int8_t`/`uint8_t`.
  + See [N4981][] [rand.req.genl]/1.5. MSVC's STL enforces this.
  + Note that when changing the distribution's `IntType`, we need to be
    careful to preserve the original value range of `[0, max_input]`.
* fstreams are constructible from `const fs::path::value_type*` on wide systems.
  + See [ifstream.cons], [ofstream.cons], [fstream.cons].
* In `msvc_stdlib_force_include.h`, map `_HAS_CXX23` to `TEST_STD_VER` 23 instead of 99.
  + On 2023-05-23, llvm@7140050
    started recognizing 23 as a distinct value.
* Fix test name typo: `destory_elements.pass.cpp` => `destroy_elements.pass.cpp`
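
The C4146 item above can be illustrated with a small sketch. This is not the code from the libc++ test; `magnitude` is a hypothetical helper that only shows the pattern of checking `is_signed_v` at compile time so that unary minus is never instantiated for an unsigned type.

```cpp
#include <type_traits>

// Hypothetical helper (not from the libc++ test suite): returns |m| for
// signed T and m unchanged for unsigned T.
template <class T>
constexpr T magnitude(T m) {
  if constexpr (std::is_signed_v<T>) {
    // This branch is discarded for unsigned T, so the unary minus below
    // is never applied to an unsigned type and C4146 cannot fire.
    if (m < 0) {
      return static_cast<T>(-m);
    }
  }
  return m;
}
```

With this shape, `magnitude(-5)` yields `5`, while `magnitude(5u)` never evaluates `m < 0` or `-m` at all.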

[N4981]: https://wg21.link/N4981
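
As a rough sketch of the `uniform_int_distribution` item above: one way to keep drawing byte-sized values without instantiating the distribution on `uint8_t` is to use a wider permitted `IntType` and keep the bounds at `[0, max_input]`. The helper name `random_byte` and the choice of `unsigned short` are assumptions for illustration, not what the test itself does.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: draws a value in [0, max_input] using a wider
// IntType, because uniform_int_distribution<uint8_t> is not permitted.
inline std::uint8_t random_byte(std::mt19937& gen, std::uint8_t max_input) {
  // The bounds stay [0, max_input], so the value range is preserved even
  // though the distribution's result type is wider than uint8_t.
  std::uniform_int_distribution<unsigned short> dist(0, max_input);
  return static_cast<std::uint8_t>(dist(gen));
}
```

The narrowing cast is safe here because every result is at most `max_input`, which already fits in a `uint8_t`.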
…llvm#93602)

Despite the name, the test is used to test merge/show roundtrips for
different MemProf versions.  This patch renames the test to match the
reality.
…P arithmetic. (llvm#92799)

This adds VPSExtPromotedInteger and VPZExtPromotedInteger and uses them
to promote many arithmetic operations.
    
VPSExtPromotedInteger uses a shift pair because we don't have
VP_SIGN_EXTEND_INREG yet.
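
The shift pair mentioned above can be sketched on scalar values. This is only an illustration of the technique on a 32-bit integer, not the legalizer code; `sext_in_reg` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical scalar model of sign-extending the low `bits` bits of a
// promoted 32-bit value with a shift pair. Requires 1 <= bits <= 32.
constexpr std::int32_t sext_in_reg(std::uint32_t x, unsigned bits) {
  const unsigned shift = 32u - bits;
  // Shift left as unsigned (no overflow UB), then shift right as signed so
  // the sign bit of the narrow value is replicated into the upper bits.
  return static_cast<std::int32_t>(x << shift) >> shift;
}
```

For example, `sext_in_reg(0x80, 8)` gives `-128`, and any garbage in the upper 24 bits of the input is discarded by the left shift.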
…erPC) (llvm#93117)

The original pull request
(llvm#92838) was reverted due to a
PowerPC buildbot breakage
(llvm@df626dd).
This reland limits the scope of the change to non-PowerPC platforms. I
am unaware of any PowerPC use cases that would benefit from a larger
kNumStackOriginDescrs constant.

Original CL description: This increases the constant size of
kNumStackOriginDescrs to 4M (64GB of BSS across two arrays), which ought
to be enough for anybody.

This is the easier alternative suggested by eugenis@ in
llvm#92826.
This is an experiment to see if we can prevent some of the compiler OOMs
happening without unduly impacting the Windows build latency.
…#92659)

- Reduce disk IO usage by adding a cache for the realpath call introduced by
llvm#81985
Swapped code blocks of parameter and variable, which have been confused
(in a clang-tidy doc file)
I think test files for the legacy and the new EH (exnref) are better kept
separate, and I'd like to use the current test file names for the new
EH, rather than keeping the current files and naming the new ones
`-new` or something.
This patch adds Version 3 for development purposes.  For now, this
patch adds V3 as a copy of V2.

For the most part, this patch adds "case Version3:" wherever "case
Version2:" appears.  One exception is writeMemProfV3, which is copied
from writeMemProfV2 but updated to write out memprof::Version3 to the
MemProf header.  We'll incrementally modify writeMemProfV3 in
subsequent patches.
I discovered while working on something else that we were using the
location of the directive name as the 'beginloc' which caused some
problems in a few places.  This patch makes it so our beginloc is the
'#' as we originally designed, and then adds a DirectiveLoc concept to a
construct for use diagnosing the name.
…n with a sample profile (llvm#93286)

Currently if a callsite is hot as determined by the sample profile, it
is unconditionally inlined barring invalid cases (such as recursion).
Inline cost check should still apply because a function's hotness and
its inline cost are two different things.
For example if a function is calling another very large function
multiple times (at different code paths), the large function should not
be inlined even if it's hot.
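
A simplified model of the change described above may help. Everything here is hypothetical (the `CallSite` and `Thresholds` structs, the function names, and the idea of a separate hot threshold); it only shows the difference between inlining hot callsites unconditionally and still applying a cost check.

```cpp
// Hypothetical model of a callsite; not an LLVM data structure.
struct CallSite {
  bool hot;        // hot according to the sample profile
  bool recursive;  // stands in for the "invalid cases"
  int cost;        // estimated inline cost
};

// Hypothetical thresholds; assumes hot callsites get a larger budget.
struct Thresholds {
  int normal;
  int hot;
};

// Before: a hot callsite is inlined regardless of its cost.
constexpr bool inline_before(const CallSite& cs, const Thresholds& t) {
  if (cs.recursive) return false;
  if (cs.hot) return true;
  return cs.cost <= t.normal;
}

// After: hotness only selects the budget; the cost check always applies.
constexpr bool inline_after(const CallSite& cs, const Thresholds& t) {
  if (cs.recursive) return false;
  return cs.cost <= (cs.hot ? t.hot : t.normal);
}
```

Under this model, a hot callee with a very large cost is inlined by `inline_before` but rejected by `inline_after`, which matches the example of a very large function called from several code paths.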
Avoids regression in future commit which starts producing
illegal instances.
…st check host JIT support (llvm#84758)

fea7399 had removed the unused function that was still there when I tested.
For some reason I was using writeStmtRef when I meant writeStmt, so this
corrects that.
This patch adds bind c names to functions and subroutines in cudadevice
so they can be lowered and not hit the intrinsic procedure TODOs.
…obal addresses. (llvm#93352)

This allows register allocation to rematerialize these instead of
spilling and reloading. We need to make it a single instruction due to
limitations in rematerialization.

This pseudo is expanded to an LUI+ADDI pair between regalloc and post RA
scheduling.

This improves the dynamic instruction count on 531.deepsjeng_r from
spec2017 by 3.2% for the train dataset. 500.perlbench and 502.gcc see a
1% improvement. There are a couple of regressions, but they are 0.1% or
smaller.

AArch64 has similar pseudo instructions, such as MOVaddr.
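
For readers unfamiliar with the LUI+ADDI pair, the following sketch shows how a 32-bit address is split into the two immediates. It is a plain arithmetic model under the assumption of a 32-bit address space, not the pseudo expansion code; `HiLo`, `split_address`, and `materialize` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical pair of immediates for LUI (upper 20 bits) and ADDI
// (signed 12-bit offset).
struct HiLo {
  std::uint32_t hi20;
  std::int32_t lo12;
};

constexpr HiLo split_address(std::uint32_t addr) {
  // ADDI sign-extends its immediate, so the upper part is rounded by
  // adding 0x800 before shifting.
  const std::uint32_t hi20 = ((addr + 0x800u) >> 12) & 0xFFFFFu;
  // Sign-extend the low 12 bits into the range [-2048, 2047].
  const std::int32_t lo12 =
      static_cast<std::int32_t>((addr & 0xFFFu) ^ 0x800u) - 0x800;
  return {hi20, lo12};
}

// Models LUI followed by ADDI with 32-bit wraparound.
constexpr std::uint32_t materialize(HiLo parts) {
  return (parts.hi20 << 12) + static_cast<std::uint32_t>(parts.lo12);
}
```

Because the pair always recombines to the original address, the two instructions can be treated as one rematerializable unit until they are expanded.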
This patch adds hidden visibility to the variable
that is used by the single byte counters mode in
source-based code coverage.
I've been running a cronjob on my local machine to restart preempted
libc++ CI runs. This is bad and brittle. This upstreams a much better
version of the restarter.

It works by matching on check run annotations, looking for a mention
of the machine being shut down.

If there are both preempted jobs and failing jobs, we don't restart
the workflow. Maybe we should change that?
…#93199)

Integer range analysis will not update the range of an operation when
any of the inferred input lattices are uninitialized. In the current
behavior, all lattice values for non integer types are uninitialized.

For operations like arith.cmpf

```mlir
%3 = arith.cmpf ugt, %arg0, %arg1 : f32
```

that will result in the range of the output also being uninitialized,
and so on for any consumer of the arith.cmpf result. When control-flow
ops are involved, the lack of propagation results in incorrect ranges,
as the back edges for loop carried values are not properly joined with
the definitions from the body region.

For example, an scf.while loop whose body region produces a value that
is in a dataflow relationship with some floating-point values through an
arith.cmpf operation:

```mlir
func.func @test_bad_range(%arg0: f32, %arg1: f32) -> (index, index) {
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index

  %3 = arith.cmpf ugt, %arg0, %arg1 : f32

  %1:2 = scf.while (%arg2 = %c0, %arg3 = %c0) : (index, index) -> (index, index) {
    %2 = arith.cmpi ult, %arg2, %c4 : index
    scf.condition(%2) %arg2, %arg3 : index, index
  } do {
  ^bb0(%arg2: index, %arg3: index):
    %4 = arith.select %3, %arg3, %arg3 : index
    %5 = arith.addi %arg2, %c1 : index
    scf.yield %5, %4 : index, index
  }

  return %1#0, %1#1 : index, index
}
```

The existing behavior results in the control condition %2 being
optimized to true, turning the while loop into an infinite loop. The
update to %arg2 through the body region is never factored into the range
calculation, as the ranges for the body ops all test as uninitialized.

This change causes all values initialized with setToEntryState to be set
to some initialized range, even if the values are not integers.
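
The propagation problem can be modelled outside MLIR with a toy lattice. The types and functions below are hypothetical and much simpler than the real integer range analysis; they only illustrate that one uninitialized operand blocks every downstream update, and that giving non-integer values an initialized entry state unblocks it.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <optional>

// Toy model: an empty optional stands for an uninitialized lattice value.
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};
using Lattice = std::optional<Range>;

inline Range max_range() {
  return {std::numeric_limits<std::int64_t>::min(),
          std::numeric_limits<std::int64_t>::max()};
}

// Entry state of a value. The flag models the change: when set,
// non-integer values also receive some initialized range.
inline Lattice entry_state(bool is_integer, bool initialize_non_integers) {
  if (is_integer || initialize_non_integers) return max_range();
  return std::nullopt;
}

// A comparison yields a boolean range, but only if both inputs are known.
inline Lattice infer_cmp(const Lattice& lhs, const Lattice& rhs) {
  if (!lhs || !rhs) return std::nullopt;
  return Range{0, 1};
}

// A select yields the union of its two value ranges, but is skipped
// entirely if any operand (including the condition) is uninitialized.
inline Lattice infer_select(const Lattice& cond, const Lattice& a,
                            const Lattice& b) {
  if (!cond || !a || !b) return std::nullopt;
  return Range{std::min(a->lo, b->lo), std::max(a->hi, b->hi)};
}
```

In this model, a select fed by a comparison of two `f32` arguments stays uninitialized under the old entry state and gets a proper range under the new one.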

---------

Co-authored-by: Spenser Bauman <sabauma@fastmail>
Recommit 435ea21.
As the comment added by a077271
suggests, these `*Triples` lists should shrink over time.

https://reviews.llvm.org/D158183 allows *-unknown-linux-gnu to detect
*-linux-gnu. If we additionally allow x86_64-unknown-linux-gnu
-m32/-mx32 to detect x86_64-linux-gnu, we can mostly remove these
*-linux-gnu elements.

Retain x86_64-linux-gnu for now to work around llvm#93609.
(In addition, Debian /usr/bin/clang --version uses x86_64-pc-linux-gnu).
Retain i586-linux-gnu for now to work around llvm#93502.
We are using PLTs for cortex-m33 which only supports thumb. More
specifically, this is for a very restricted use case. There's no MMU so
there's no sharing of virtual addresses between two processes, but this
is fine. The MCU is used for running [chre
nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md)
for android. Each nanoapp is a shared library (but effectively acts as
an executable containing a test suite) that is loaded and run on the MCU
one binary at a time and there's only one process running at a time, so
we ensure that the same text segment cannot be shared by two different
running executables. GNU LD supports thumb PLTs but we want to migrate
to a clang toolchain and use LLD, so thumb PLTs are needed.
…itcast (sra (v2Xi16 (bitcast X)), 15)) (llvm#93565)

Similar for i16 and i64 elements for both fixed and scalable vectors.

This reduces the number of vector instructions, but increases vl/vtype
toggles.

This reduces some code in 525.x264_r from SPEC2017. In that usage, the
vectors are fixed with a small number of elements so vsetivli can be
used.

This is similar to `performMulVectorCmpZeroCombine` from AArch64.
topperc and others added 26 commits May 29, 2024 16:50
These can be implemented with multiple vnclips.
)

This is one of the major changes we (Microsoft) have made in the version
of asan we ship with Visual Studio.

@amyw-msft wrote a blog post outlining this work at
https://devblogs.microsoft.com/cppblog/msvc-address-sanitizer-one-dll-for-all-runtime-configurations/

> With Visual Studio 2022 version 17.7 Preview 3, we have refactored the
MSVC Address Sanitizer (ASan) to depend on one runtime DLL regardless of
the runtime configuration. This simplifies project onboarding and
supports more scenarios, particularly for projects statically linked
(/MT, /MTd) to the C Runtimes. However, static configurations have a new
dependency on the ASan runtime DLL.

> Summary of the changes:

> * ASan now works with /MT or /MTd built DLLs when the host EXE was not
>   compiled with ASan. This includes Windows services, COM components, and
>   plugins.
> * Configuring your project with ASan is now simpler, since your project
>   doesn’t need to uniformly specify the same [runtime
>   configuration](https://learn.microsoft.com/en-us/cpp/build/reference/md-mt-ld-use-run-time-library?view=msvc-170)
>   (/MT, /MTd, /MD, /MDd).
> * ASan workflows and pipelines for /MT or /MTd built projects will need to
>   ensure the ASan DLL (clang_rt.asan_dynamic-<arch>.dll) is available on
>   PATH.
> * The names of the ASan .lib files needed by the linker have changed (the
>   linker normally takes care of this if not manually specifying lib names
>   via /INFERASANLIBS)
> * You cannot mix ASan-compiled binaries from previous versions of the MSVC
>   Address Sanitizer (this is always true, but especially true in this
>   case).

Here's the description of these changes from our internal PR

1. Build one DLL that includes everything debug mode needs (not included
here, already contributed upstream).
* Remove #if _DEBUG checks everywhere.
* In some places, this needed to be replaced with a runtime check. In
asan_win.cpp, IsDebugRuntimePresent was added where we are searching for
allocations prior to ASAN initialization.
* In asan_win_runtime_functions.cpp and interception_win.cpp, we need to
be aware of debug runtime DLLs even when not built with _DEBUG.
2. Redirect statically linked functions to the ASAN DLL for /MT
* New exports for each of the C allocation APIs so that the statically
linked portion of the runtime can call them (see asan_malloc_win.cpp,
search MALLOC_DLL_EXPORT). Since we want our stack trace information to
be accurate and without noise, this means we need to capture stack frame
info from the original call and tell it to our DLL export. For this, I
have reused the __asan_win_new_delete_data used for op new/delete
support from asan_win_new_delete_thunk_common.h and moved it into
asan_win_thunk_common.h renamed as __asan_win_stack_data.
* For the C allocation APIs, a new file is included in the
statically-linked /WHOLEARCHIVE lib - asan_malloc_win_thunk.cpp. These
functions simply provide definitions for malloc/free/etc to be used
instead of the UCRT's definitions for /MT and instead call the ASAN DLL
export. /INFERASANLIBS ensures libucrt.lib will not take precedence via
/WHOLEARCHIVE.
* For other APIs, the interception code was called, so a new export is
provided: __sanitizer_override_function.
__sanitizer_override_function_by_addr is also provided to support
__except_handler4 on x86 (due to the security cookie being per-module).
3. Support weak symbols for /MD
* We have customers (CoreCLR) that rely on this behavior and would force
/MT to get it.
* There was sanitizer_win_weak_interception.cpp before, which did some
stuff for setting up the .WEAK section, but this only worked on /MT. Now
stuff registered in the .WEAK section is passed to the ASAN DLL via new
export __sanitizer_register_weak_function (impl in
sanitizer_win_interception.cpp). Unlike linux, multiple weak symbol
registrations are possible here. Current behavior is to give priority on
module load order such that whoever loads last (so priority is given to
the EXE) will have their weak symbol registered.
* Unfortunately, the registration can only occur during the user module
startup, which is after ASAN DLL startup, so any weak symbols used by
ASAN during initialization will not be picked up. This is most notable
for __asan_default_options and friends (see asan_flags.cpp). A mechanism
was made to add a callback for when a certain weak symbol was
registered, so now we process __asan_default_options during module
startup instead of ASAN startup. This is a change in behavior, but
there's no real way around this due to how DLLs are.
4. Build reorganization
* I noticed that our current build configuration is very MSVC-specific
and so did a bit of reworking. Removed a lot of
create_multiple_windows_obj_lib use since it's no longer needed and it
changed how we needed to refer to each object_lib by adding runtime
configuration to the name, conflicting with how it works for non-MSVC.
* No more Win32 static build, use /MD everywhere.
* Building with /Zl to avoid defaultlib warnings.
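
The weak-symbol behavior from item 3 can be sketched as a small registry. This is a hypothetical model (the class, its method names, and the `OptionsFn` signature are assumptions), not the sanitizer runtime code; it only shows "last registration wins" plus a callback that fires on each registration.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical signature for a weak hook returning an options string.
using OptionsFn = const char* (*)();

// Hypothetical registry modelling the described behavior.
class WeakRegistry {
 public:
  // Ask to be notified whenever `name` is registered by some module.
  void on_register(const std::string& name,
                   std::function<void(OptionsFn)> callback) {
    callbacks_[name].push_back(std::move(callback));
  }

  // Called once per module at module startup. A later registration
  // replaces an earlier one, so the module loaded last takes priority.
  void register_weak(const std::string& name, OptionsFn fn) {
    table_[name] = fn;
    auto it = callbacks_.find(name);
    if (it == callbacks_.end()) return;
    for (const auto& callback : it->second) callback(fn);
  }

  // Returns the currently registered function, or nullptr if none.
  OptionsFn lookup(const std::string& name) const {
    auto it = table_.find(name);
    return it == table_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, OptionsFn> table_;
  std::map<std::string, std::vector<std::function<void(OptionsFn)>>> callbacks_;
};

// Two example definitions, standing in for a DLL and the EXE.
inline const char* dll_options() { return "verbosity=0"; }
inline const char* exe_options() { return "verbosity=1"; }
```

If the DLL registers first and the EXE second, `lookup` returns the EXE's function, and the callback gives the runtime a chance to reprocess options at module startup rather than at its own startup.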

In addition:
* I've reapplied "[sanitizer][asan][win] Intercept _strdup on Windows
instead of strdup" which broke the previous static asan runtime. That
runtime is gone now and this change is required for the strdup tests to
work.
* I've modified the MSVC clang driver to support linking the correct
asan libraries, including via defining _DLL (which triggers different
defaultlibs and should result in the asan dll thunk being linked, along
with the dll CRT, via defaultlib directives).
* I've made passing -static-libsan an error on windows, and made
-shared-libsan the default. I'm not sure I did this correctly, or in the
best way.
* Modified the test harnesses to add substitutions for the dynamic and
static thunks and to make the library substitutions point to the dynamic
asan runtime for all test configurations on windows. Both the static and
dynamic windows test configurations remain, because they correspond to
the static and dynamic CRT, not the static and dynamic asan runtime
library.

---------

Co-authored-by: Amy Wishnousky <amyw@microsoft.com>
…llvm#93448)

Since they can also occur as the template name of
template specializations, handle them from TemplateName printing instead
of TemplateArgument.
Extend NVPTX DAG combining logic to distribute a mul instruction across
an add of 1 into a mad where possible. In addition, add support for
transposing a mul through a select with an option of 1, if that would
allow further mul folding.
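
The algebra behind these combines can be checked with a short sketch. It models the operations on 32-bit unsigned integers with wraparound and reflects one reading of the description; the function names are hypothetical and this is not the DAG combiner code.

```cpp
#include <cstdint>

// mad(a, b, c) = a * b + c on 32-bit wraparound arithmetic.
constexpr std::uint32_t mad(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
  return a * b + c;
}

// Before the combine: mul(a, add(b, 1)).
constexpr std::uint32_t mul_of_add_one(std::uint32_t a, std::uint32_t b) {
  return a * (b + 1u);
}

// After the combine: mad(a, b, a), since a * (b + 1) == a * b + a.
constexpr std::uint32_t as_mad(std::uint32_t a, std::uint32_t b) {
  return mad(a, b, a);
}

// Before: mul(a, select(c, 1, b)).
constexpr std::uint32_t mul_of_select(bool c, std::uint32_t a, std::uint32_t b) {
  return a * (c ? 1u : b);
}

// After: select(c, a, mul(a, b)); the multiply by 1 disappears and the
// remaining mul is exposed for further folding.
constexpr std::uint32_t select_of_mul(bool c, std::uint32_t a, std::uint32_t b) {
  return c ? a : a * b;
}

static_assert(mul_of_add_one(7u, 9u) == as_mad(7u, 9u), "distribution holds");
static_assert(mul_of_select(true, 7u, 9u) == select_of_mul(true, 7u, 9u),
              "select transposition holds");
```

Both identities also hold when the intermediate values wrap, which is why the rewrite is valid for fixed-width integer multiplication.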
…rs/230/builds/29066

Failure was introduced in llvm#81545

On 64-bit targets for i32 return type, there will be extension in the function
prototype.
This reverts commit 2e0cfe6.

Buildbots are broken.
…fferent pointer bases (llvm#91453)

This patch enhances the SCEVAAResult::alias() interface to handle two
pointers with different pointer bases.

Before calling getMinusSCEV(), we first try to explicitly convert
these two pointers into ptrtoint expressions.

Either both pointers are used with ptrtoint or neither, so we can't
end up with a ptr + int mix.
The patch introduces the gmock-based unittest infrastructure for PGO
Instrumentation and adds some test cases to check whether the
instrumentation has taken place. The testing infrastructure for analysis
modules was borrowed from the LoopPassManagerTest unittest and
simplified a bit to handle module analysis passes only. Actually, we are
testing whether the result of a trivial analysis pass was invalidated by
the PGOInstrumentGen one: we exploit the fact the pass invalidates all
the analysis results after a module was instrumented.

NFC.
Follow-up to a previous simplification
2473b1a.

The xor difference between a SHT_NOTE and a read-only SHT_PROGBITS
(previously >=NOT_SPECIAL) should be smaller than RF_EXEC. Otherwise,
for the following section layout, `findOrphanPos` would place .text
before note.

```
// simplified from linkerscript/custom-section-type.s
non orphans:
progbits 0x8060c00 NOT_SPECIAL
note     0x8040003

orphan:
.text    0x8061000 NOT_SPECIAL
```

---

Identical to 2e0cfe6.
The revert 30c10fd is wrong.
This reverts commit f639b57.

The premerge bot is still broken with failing bolt test.
…llvm#91681)

In llvm#88323, I changed the logic
within `add_compiler_rt_runtime` to only explicitly code sign the
resulting library if an older version of Apple's ld64 was in use. This
was based on the assumption that newer versions of ld64 and the new
Apple linker always ad-hoc sign their output binaries. This is true in
most cases, but not when using Apple's new linker with the
`-darwin-target-variant` flag to build Mac binaries that are compatible
with Catalyst.

Rather than adding increasingly complicated logic to detect the exact
scenarios that require explicit code signing, I've opted to always
explicitly code sign when using any Apple linker. We instead detect and
use the 'linker-signed' codesigning option when possible to match the
signatures that the linker would otherwise create. This avoids having
non-'linker-signed' ad-hoc signatures which was the underlying problem
that llvm#88323 was intended to
address.

Co-authored-by: Mark Rowe <markrowe@chromium.org>
Move VPlan verification functions to avoid the need to pass VPDT across
multiple calls. This also allows easier extensions in the future.
…ns (llvm#93764)

When a module contains globals and/or function declarations only, the
'__llvm_profile_raw_version' variable should not be generated because
the module was not instrumented at all.

NFC
Implements HLSL availability diagnostics' default and relaxed mode.

HLSL availability diagnostics emit errors or warnings when unavailable
shader APIs are used. Unavailable shader APIs are APIs that are exposed
in HLSL code but are not available in the target shader stage or shader
model version.

In the default mode, the compiler emits an error when an unavailable API
is found in code that is reachable from the shader entry point
function. In the future, this check will also be extended to exported
library functions (llvm#92073). The relaxed diagnostic mode is the same
except that the compiler emits a warning. This mode is enabled by
``-Wno-error=hlsl-availability``.

See HLSL Availability Diagnostics design doc
[here](https://github.com/llvm/llvm-project/blob/main/clang/docs/HLSL/AvailabilityDiagnostics.rst)
for more details.

Fixes llvm#90095
…ereferencing pointer to pointers. (llvm#81545)"

This reverts commit aeccfee, and dependents:

Revert "[NFC] Fix PPC buildbot failure https://lab.llvm.org/buildbot/#/builders/230/builds/29066"
This reverts commit 2b1d1c5.

Revert "Fix test - remove unnecessary/incorrect `-S`, in favor of `-emit-llvm`"
This reverts commit ea1ecb5.

The test is failing on macOS and Windows.
Since bf16 is supported by MLIR, similar to
complex128/complex64/float16, we need an implementation of the bf16 ctype in
the Python bindings. Furthermore, to resolve the absence of bf16 support in
NumPy, the third-party package [ml_dtypes](https://github.com/jax-ml/ml_dtypes)
is introduced to add the bf16 extension; the same approach is used in the
`torch-mlir` project.

See motivation and discussion in:
https://discourse.llvm.org/t/how-to-run-executionengine-with-bf16-dtype-in-mlir-python-bindings/79025
DWARFDebugInfo doesn't know how to resolve the "file_index" component of
a DIERef. This patch removes GetUnit (in favor of existing
GetUnitContainingDIEOffset) and changes GetDIE to take only the
components it actually uses.
This commit changes the LLVM dialect's inliner interface to stop
assuming that the inlined function only contained unstructured control
flow. This is not necessarily true, and it led to the noalias
information not being propagated properly.
Base automatically changed from bump_to_debdbeda to feature/fused-ops September 10, 2024 07:08
@mgehre-amd mgehre-amd merged commit 72366b7 into feature/fused-ops Sep 11, 2024
11 checks passed
@mgehre-amd mgehre-amd deleted the bump_to_6a3982f8 branch September 11, 2024 12:08