Warn when `--iree-llvmcpu-target-cpu` defaults to "generic". #18587

ScottTodd · 2024-09-23T23:39:20Z

Progress on #18561.

We have these high level flags (and MLIR attributes) controlling what code the llvm-cpu compiler target generates using LLVM:

- `--iree-llvmcpu-target-triple`
  - e.g. `x86_64-pc-windows-msvc`
  - See https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/TargetParser/Triple.h
- `--iree-llvmcpu-target-cpu`
  - e.g. `znver2`
- `--iree-llvmcpu-target-cpu-features`
  - e.g. `+prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,...`

Some targets (x86 and riscv64) in LLVM can infer the "features" for a given "cpu" with the getFeaturesForCPU() functions. Other targets may be able to look up available features on the host via llvm::sys::getHostCPUFeatures(). Short of either of those, we must fall back to having users explicitly list the features they want the compiler to use.

If the target CPU and target CPU features are both omitted, we currently fall back to "generic", meaning no "features". This PR sends more warning information to stderr if that case is detected. We could go a step further and make the default case propagate an error through the compiler, requiring users to explicitly set "host", "generic", a known CPU type, or a known list of features.
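A minimal sketch of the fallback described above, assuming a simplified stand-in for the real options type. `ResolvedTarget` and `resolveTarget` are hypothetical names for illustration, not the code in `LLVMTargetOptions.cpp`, and the warning text is a placeholder. It also assumes that a features-only configuration still resolves the CPU to `generic`, just without a warning.

```cpp
#include <string>

// Hypothetical stand-in for the resolved target description (not IREE's type).
struct ResolvedTarget {
  std::string cpu;
  std::string cpuFeatures;
  std::string warning;  // Empty when there is nothing to report.
};

// Resolves the effective CPU. The warning is only produced when BOTH the CPU
// and the feature list were omitted, since a features-only configuration is a
// deliberate choice on some targets.
ResolvedTarget resolveTarget(const std::string &cpu,
                             const std::string &cpuFeatures) {
  ResolvedTarget resolved;
  resolved.cpu = cpu.empty() ? "generic" : cpu;
  resolved.cpuFeatures = cpuFeatures;
  if (cpu.empty() && cpuFeatures.empty()) {
    resolved.warning =
        "warning: target cpu and cpu features unspecified, defaulting to "
        "`generic`. Performance may suffer. Use "
        "`--iree-llvmcpu-target-cpu=` or "
        "`--iree-llvmcpu-target-cpu-features=` to set them.\n";
  }
  return resolved;
}
```

Returning the warning as a string instead of writing to stderr directly keeps the decision testable. A caller could print it, or later promote it to a hard error.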

TODO

- Update docs (https://iree.dev/guides/deployment-configurations/cpu/, source: docs/website/docs/guides/deployment-configurations/cpu.md)
- Audit the repository and other projects for usage of `--iree-hal-target-backends=llvm-cpu` that is missing an explicit `--iree-llvmcpu-target-cpu` or `--iree-llvmcpu-target-cpu-features`
- Test the different code paths (manually or with lit tests / automation)
- Add more context to error messages and docs, including an enumeration of accepted values for each flag

Debugging tips while working on this:

- Run with `--debug-only=iree-llvm-cpu-target` to see the computed LLVMTarget get logged
- Note that multiple code paths run this code, including CLI flags that build `defaultOptions_` for `getDefaultExecutableTargets()` and `getVariantTarget`/`loadFromConfigAttr` that may come from already constructed IR. We want error handling for all of them.

stellaraccident · 2024-09-23T23:43:55Z

Thanks - I think that starting with warnings is a good way to flush this out. Then we have an easy thing to upgrade to an error.

bjacob · 2024-09-24T01:56:50Z

Thanks!

The warning message could helpfully dump a list of accepted CPU values for the target architecture.

At that point, passing the missing CPU flag would be so easy that you might as well mandate it, promoting this warning into an error. If a user is technical enough to specify a non-host target triple, and you do the work for them to look up the known CPUs for that target triple, they can be bothered to pick one. Then they can even choose to just pass generic (and the message may suggest so) if that's what they want. At least they will blame their own poor (command) line choices for bad performance.
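As a rough sketch of what such a message could look like: the table below is a hand-written, deliberately incomplete placeholder that only uses CPU names mentioned in this thread. A real implementation would presumably query LLVM for the CPUs known to the target architecture instead. `knownCpus`, `archFromTriple`, and `missingCpuError` are hypothetical helper names.

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative placeholder table: architecture -> a few accepted CPU names.
const std::map<std::string, std::vector<std::string>> &knownCpus() {
  static const std::map<std::string, std::vector<std::string>> table = {
      {"x86_64", {"generic", "znver2", "znver4", "cascadelake"}},
  };
  return table;
}

// Returns the architecture component, i.e. everything before the first '-'.
std::string archFromTriple(const std::string &triple) {
  return triple.substr(0, triple.find('-'));
}

// Builds an error that leads with copy-pasteable flags and then enumerates
// the CPUs known for the triple's architecture, if any are in the table.
std::string missingCpuError(const std::string &triple) {
  std::string message =
      "error: no target cpu specified for target triple '" + triple + "'.\n";
  message += "  Pass --iree-llvmcpu-target-cpu=host to target this machine,\n";
  message += "  or --iree-llvmcpu-target-cpu=generic for portable code.\n";
  auto it = knownCpus().find(archFromTriple(triple));
  if (it != knownCpus().end()) {
    message += "  Known CPUs for this architecture:";
    for (const std::string &cpu : it->second) {
      message += " " + cpu;
    }
    message += "\n";
  }
  return message;
}
```

Putting the `host` and `generic` suggestions first means users can copy a flag from the top of the message without reading the full enumeration.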

ScottTodd · 2024-09-25T19:07:04Z

compiler/plugins/target/LLVMCPU/LLVMTargetOptions.cpp

+      llvm::errs() << "warning: LLVMTarget cpu unspecified, defaulting to "
+                      "`generic`. Performance may suffer. Use "
+                      "`--iree-llvmcpu-target-cpu=` or the target attribute to "
+                      "set the target cpu.\n";


Maybe this warning should only be logged if cpuFeatures is also empty.

The warning could also mention the other flag (--iree-llvmcpu-target-cpu-features) and attribute. The logs and our docs should cover what @bjacob explained here: #18561 (comment)

> On x86, people want to talk in terms of "CPU" (meaning microarchitecture) such as znver4 or cascadelake. People do not typically want to talk in terms of CPU features on x86 because that is very cumbersome.
>
> On RISC-V, people want to talk in terms of CPU features, and there are many, but they are not too cumbersome thanks to very short names.
>
> On Arm, people typically specify baseline Arm architecture version plus a few CPU features, e.g. armv8.2-a+i8mm.

So: "you used these flags/attributes, but that is omitting some important information for getting peak performance, this is what you should do to fix that"
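One way to turn that into an architecture-aware hint is sketched below. The prefix matching on the triple and the helper names `startsWith` and `targetHint` are assumptions for illustration. The wording of each hint is a placeholder that only restates the per-architecture conventions quoted above.

```cpp
#include <string>

// True if `text` begins with `prefix`.
bool startsWith(const std::string &text, const std::string &prefix) {
  return text.compare(0, prefix.size(), prefix) == 0;
}

// Picks the suggestion that matches how users of each architecture usually
// describe their target. The prefix checks are a simplification.
std::string targetHint(const std::string &triple) {
  if (startsWith(triple, "x86_64")) {
    return "set --iree-llvmcpu-target-cpu= to a microarchitecture such as "
           "znver4 or cascadelake";
  }
  if (startsWith(triple, "riscv")) {
    return "list the extensions you need with "
           "--iree-llvmcpu-target-cpu-features=";
  }
  if (startsWith(triple, "aarch64") || startsWith(triple, "arm")) {
    return "specify a baseline architecture version plus features, "
           "e.g. armv8.2-a+i8mm";
  }
  return "set --iree-llvmcpu-target-cpu= or "
         "--iree-llvmcpu-target-cpu-features=";
}
```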

Indeed. Want to chat?

Can link to docs if the message gets long: https://iree.dev/guides/deployment-configurations/cpu/
Source for docs: https://github.com/iree-org/iree/blob/main/docs/website/docs/guides/deployment-configurations/cpu.md

(Any logs that are dependent on the code running, like "list all features for the given CPU" would still want to be put here though)

MaheshRavishankar · 2024-09-25T20:47:30Z

I wouldn't say I have the full context of all the moving pieces here (especially all the RISC-V stuff), but I would personally vote for making host the default for iree-llvmcpu-target-cpu. That's the main use case people come with when compiling for CPU. I am not sure I fully understand the value of iree-llvmcpu-target-cpu=generic at this point. If someone is compiling for a different target CPU then they can explicitly specify that CPU and its features, but in terms of defaults, I think host is what most folks are looking for?

benvanik · 2024-09-25T21:29:29Z

generic is useful for people who want to run something on a machine that is not their own, which is why it's the default for things like clang/gcc. host is only useful if you are compiling and running on the same machine, and never useful if you're compiling and running separately. Defaulting to host has the danger of making every vmfb produced crash on any machine but the one it was compiled on, and that's why AOT compilers never default to that. Frameworks/etc. hosting IREE that know they are JITing can default to it safely, or if we want to be extremely explicit in all documentation that defaults will produce non-portable vmfbs then we could default iree-compile to it, but I really don't like debugging SIGILLs :)

benvanik · 2024-09-25T21:37:52Z

(note that there's a spectrum between "generic" and "host" - we can decide we want avx2 as a baseline in our generic config instead of emitting scalar code, etc - it's mostly a way to trade off "here's what we expect our minimum requirements for default vmfbs to be" vs "here's something that only runs on one particular machine").

bjacob · 2024-09-26T01:29:29Z

I was going to advocate for least-surprisal as the overriding principle here, so I dug into what compilers are doing, and unfortunately it's fragmented:

- C/C++ compilers are of course defaulting to generic. This gives determinism (I can reproduce your command line as long as it doesn't explicitly involve host) and running everywhere, but slow by default.
- nvcc defaults to sm_52 which means Maxwell. That's akin to what @benvanik suggests above with defaulting to something like avx2 as a baseline. That's still deterministic and still runs on 100% of currently supported devices, while providing better performance than generic.
  - The caveat here is that by providing improved performance over generic, this makes the residual performance gap less discoverable, while potentially still very large.
- hipcc seems to be defaulting to native (i.e. host) in my experience, but I wasn't able to find docs saying so. All I know is that just hipcc with no flags gives me all the CDNA3 stuff when invoked on such a machine, while that fails to compile on older machines.

This fragmentation means that whichever default we pick here will be surprising to some users, unfortunately.

Given that, I think I'd like to explore ways to make host/native-as-default work, but with improved diagnostics. For example, we could check immediately on loading a module that the machine we are running on supports the features required by the module. I'm still cringing thinking of the non-reproducible bug reports (iree-compile command lines will be non-reproducible by default) but I also cringe thinking of the lost performance, so we have to pick one poison.
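The load-time check could look roughly like the sketch below. It assumes the module records its required features in the same `+name,-name` string form as `--iree-llvmcpu-target-cpu-features`, and that the runtime can obtain a set of host feature names somehow. Both are assumptions, and `enabledFeatures` and `missingFeatures` are hypothetical helpers, not existing IREE APIs.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Parses "+avx,+aes,-xop" into the set of enabled names {"aes", "avx"}.
// Entries prefixed with '-' are disabled and therefore not required.
std::set<std::string> enabledFeatures(const std::string &featureString) {
  std::set<std::string> enabled;
  std::istringstream stream(featureString);
  std::string token;
  while (std::getline(stream, token, ',')) {
    if (token.size() > 1 && token[0] == '+') {
      enabled.insert(token.substr(1));
    }
  }
  return enabled;
}

// Returns the features the module requires that the host does not provide,
// in sorted order. An empty result means the module is safe to load.
std::vector<std::string> missingFeatures(
    const std::string &requiredFeatureString,
    const std::set<std::string> &hostFeatures) {
  std::vector<std::string> missing;
  for (const std::string &feature : enabledFeatures(requiredFeatureString)) {
    if (hostFeatures.count(feature) == 0) {
      missing.push_back(feature);
    }
  }
  return missing;
}
```

Failing at load time with the list of missing features would replace a SIGILL deep inside generated code with a diagnostic that names the mismatch.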

stellaraccident · 2024-09-26T03:41:45Z

In practice, I think they only move the nvcc version up when they physically drop support in the software for anything prior (or at least I have a recollection of seeing announcements about that).

I think we just need to make the flag required. And probably add the other diagnostics and checks you mention, because it will get screwed up.

bjacob · 2024-09-26T11:07:09Z

Oh yes, just make the flag required, with diagnostic, and with helpful messages - so if people would like host, they can just copy and paste a flag from the start of the explanation, which can then proceed with enumerating all the accepted values.

bjacob · 2024-10-03T17:42:23Z

@ScottTodd, @benvanik, @stellaraccident, here is what I have at the moment: #18682. It folds in some good ideas from this PR (the general idea, defaulting CPU to "" instead of "generic", and resolving the host triple in LLVMTarget::create).

ScottTodd · 2024-10-18T17:35:04Z

Closing in favor of #18682. Thanks @bjacob !

ScottTodd added 2 commits September 23, 2024 15:20

Hacking on llvmcpu cpu and cpu feature options.

2cd0aeb

Cleanup, handle more cases, expand logging.

48c8305

ScottTodd added the codegen/llvm LLVM code generation compiler backend label Sep 23, 2024

ScottTodd mentioned this pull request Sep 23, 2024

Compiling for llvm-cpu without targeting a specific CPU is a bad experience #18561

Open

Attempt to patch use-after-free by switching to std::string with copies.

176c9a1

These strings are small enough that we don't benefit much from std::string_view.
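For context, a minimal illustration of why owning copies avoid this class of bug. The struct and function names here are hypothetical and do not reproduce the actual change in commit 176c9a1.

```cpp
#include <string>

// Hypothetical options struct. Because the members are std::string, the
// struct owns its characters. With std::string_view members it would only
// point at storage owned by someone else.
struct TargetOptions {
  std::string cpu;
  std::string cpuFeatures;
};

// Copies both arguments into the returned struct.
TargetOptions makeOptions(const std::string &cpu,
                          const std::string &cpuFeatures) {
  TargetOptions options;
  options.cpu = cpu;
  options.cpuFeatures = cpuFeatures;
  return options;
}

// `scratch` is destroyed when this function returns. The returned options
// stay valid because they hold copies. A std::string_view member referring
// to `scratch` would dangle at this point (use-after-free).
TargetOptions optionsFromLocalBuffer() {
  std::string scratch = "zn";
  scratch += "ver2";
  return makeOptions(scratch, "+avx");
}
```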

bjacob mentioned this pull request Oct 3, 2024

Warn when --iree-llvmcpu-target-cpu defaults to "generic". #18682

Merged

ScottTodd closed this Oct 18, 2024
