Warn when --iree-llvmcpu-target-cpu defaults to "generic". #18682

Merged
bjacob merged 5 commits into iree-org:main on Oct 17, 2024

bjacob
Contributor

@bjacob bjacob commented Oct 3, 2024

Progress on #18561. This introduces a warning (which we intend to promote to an error in the future) when targeting a generic CPU without explicitly asking for it. This addresses a performance footgun as that IREE default results in low performance.

Along the way this grew into a substantial change to e2e testing rules:

  • TARGET_CPU and TARGET_CPU_FEATURES arguments are gone (were redundant with COMPILER_FLAGS).
  • For TARGET_CPU_FEATURES_VARIANTS, the special value "default" is renamed to "generic" and a new value "host" is also supported.

Example warning (this is customized to the target architecture, here x86):

/home/benoit/matmul_i8.mlir:0:0: warning: while creating CPU target: 
Defaulting to targeting a generic CPU for the target architecture will result in poor performance. Please specify a target CPU and/or a target CPU feature set. If it is intended to target a generic CPU, specify "generic" as the CPU.

This can be done in two ways:
1. With command-line flags:
    --iree-llvmcpu-target-cpu=...
    --iree-llvmcpu-target-cpu-features=...
2. Within the IR:
    #hal.executable.target< ... , cpu="...", cpu_features="...">

In the rest of this message, these fields are referred to as just `cpu` and `cpu_features`.

Examples:

    cpu=generic
        Target a generic CPU of the target architecture. The generated code will have poor performance, but will run on any CPU.

    cpu=host
        Target the host CPU. The generated code will have optimal performance on the host CPU but will crash on other CPUs not supporting the same CPU features.

    cpu="name"
        Target a specific CPU. This is mostly used on x86. The accepted values are the same as in Clang command lines.
        List of accepted x86 CPUs: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, arrowlake, arrowlake-s, lunarlake, gracemont, pantherlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, clearwaterforest, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, znver5, x86-64, x86-64-v2, x86-64-v3, x86-64-v4

    cpu_features="+feature1,..."
        Target a CPU supporting the comma-separated of (+-prefixed) features. The accepted values are the same as in Clang command lines.
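
The defaulting rule that this warning describes can be sketched in a few lines of Python. This is only an illustration, not the IREE implementation: the function name `resolve_cpu_target` is hypothetical, the warning text is abbreviated, and the assumption that the warning fires only when both `cpu` and `cpu_features` are left unspecified is inferred from the message's "a target CPU and/or a target CPU feature set" wording.

```python
# Hypothetical sketch of the defaulting rule; not IREE code.
GENERIC_CPU_WARNING = (
    "Defaulting to targeting a generic CPU for the target architecture "
    "will result in poor performance."
)


def resolve_cpu_target(cpu="", cpu_features=""):
    """Return (cpu, cpu_features, warning_or_None)."""
    if not cpu and not cpu_features:
        # Assumption: nothing was requested, so fall back to "generic"
        # and report that the fallback happened.
        return "generic", "", GENERIC_CPU_WARNING
    # Anything explicit, including cpu="generic", is taken as-is.
    return cpu, cpu_features, None
```

Under these assumptions, passing `cpu="generic"` explicitly yields the same target as passing nothing, but without the warning.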

@ScottTodd
Member

WDYT @benvanik @stellaraccident @ScottTodd ?

nit about GitHub: Please put tags like these in comments, not commit messages (or PR descriptions that then become commit messages). If someone pushes a commit containing such a tag to their fork, it will notify those users. I don't really need to know whenever someone rebases or otherwise rewrites history in a fork :P

@bjacob bjacob changed the title: CPU feature logging → Warn when --iree-llvmcpu-target-cpu defaults to "generic". Oct 3, 2024
Member

@ScottTodd ScottTodd left a comment

Nice! I like the refactoring into the helper file and methods.

We should think through the diagnostics and error handling a bit more, but beyond that I like the new code behavior.

@bjacob
Contributor Author

bjacob commented Oct 16, 2024

@ScottTodd , @benvanik , @stellaraccident , I just pushed what I have. It addresses all of @ScottTodd 's earlier review comments, and produces the expected results in iree-compile command-line invocations with target-backends=llvm-cpu. The main problem that I am struggling with is that it also generates the warning message about the generic CPU fallback... for other target-backends. That is because at the moment, the warning is being printed from LLVMCPUTargetCLOptions::getTargetOptions, which is called at target backend registration, and all target backends are registered regardless of which one is actually used.

I don't know how to fix this. @benvanik suggested buildConfigurationPassPipeline, but that too is being called for the LLVMCPU backend in an iree-compile invocation with vulkan-spirv.
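
To make the problem concrete, here is a toy Python model of it. It is not IREE code and makes no claim about how the follow-up commit resolves the issue; the `Backend` class and both functions are hypothetical. It only contrasts emitting a diagnostic while every backend is being set up with emitting it for the selected backend alone.

```python
# Toy model of the diagnostic-timing problem; all names are hypothetical.
class Backend:
    def __init__(self, name, warning=None):
        self.name = name
        self.warning = warning  # diagnostic this backend would emit, if any


def warnings_at_registration(backends, selected):
    # Every registered backend gets to emit its diagnostic,
    # no matter which backend the user actually selected.
    return [b.warning for b in backends if b.warning]


def warnings_for_selected(backends, selected):
    # Only the backend that is actually used emits its diagnostic.
    return [b.warning for b in backends if b.name == selected and b.warning]
```

With an `llvm-cpu` backend that carries the generic-CPU warning and a `vulkan-spirv` backend that carries none, the first function reports the CPU warning even when `vulkan-spirv` is selected, which is the unwanted behavior described above.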

@bjacob
Contributor Author

bjacob commented Oct 16, 2024

After chatting some more with @ScottTodd , pushed a 2nd commit that makes it work AFAICS: 18748f7

WDYT?

@stellaraccident
Collaborator

After chatting some more with @ScottTodd , pushed a 2nd commit that makes it work AFAICS: 18748f7

WDYT?

That seems ok to me.

@ScottTodd ScottTodd added the codegen/llvm, compiler/tools, and quality of life 😊 labels Oct 16, 2024
Member

Can you resolve the merge conflicts on this file (and the CMakeLists.txt), so the github workflows run?

Member

@ScottTodd ScottTodd left a comment

Nice! No major concerns on my end with this approach.

Comment on lines 165 to 170
if(_RULE_TARGET_CPU)
  list(APPEND _BASE_COMPILER_FLAGS "--iree-llvmcpu-target-cpu=${_RULE_TARGET_CPU}")
endif()
if(_RULE_TARGET_CPU_FEATURES)
  list(APPEND _BASE_COMPILER_FLAGS "--iree-llvmcpu-target-cpu-features=${_RULE_TARGET_CPU_FEATURES}")
endif()
Member

What's the rationale for these getting their own flags instead of using COMPILER_FLAGS?

In general, I want these build system functions to do less, not more - especially for global flags controlling target backend/device-specific options.

I don't really mind being this explicit in our tests:

  TARGET_BACKEND
    "llvm-cpu"
  COMPILER_FLAGS
    "--iree-llvmcpu-target-cpu=generic"
  DRIVER
    "local-task"

Contributor Author

Good catch! I just made that change in all related rules.

build_tools/cmake/iree_e2e_generated_runner_test.cmake Outdated Show resolved Hide resolved
compiler/plugins/target/LLVMCPU/LLVMTargetOptions.cpp Outdated Show resolved Hide resolved
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@bjacob
Contributor Author

bjacob commented Oct 17, 2024

@ScottTodd , this feels ready for review now.

I have followed your suggestion to drop TARGET_CPU and TARGET_CPU_FEATURES. While doing so, I found the place which had originally motivated them: this is where we pass a --requirements flag to the test runner so that it can check at runtime that the host CPU supports the expected features. Now that there isn't a separate TARGET_CPU_FEATURES field anymore, we resort to parsing the COMPILER_FLAGS, which is not too bad and is a tiny detail, so it is the right call to deal with it rather than have those TARGET_CPU and TARGET_CPU_FEATURES flags everywhere.
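
For readers who don't read CMake, the flag parsing described here can be sketched in Python. This is an illustrative equivalent, not code from this PR: the function name `requirements_flags` is hypothetical, and only the flag spellings `--iree-llvmcpu-target-cpu-features=` and `--requirements=` are taken from the discussion.

```python
# Illustrative Python equivalent of deriving --requirements from COMPILER_FLAGS.
CPU_FEATURES_PREFIX = "--iree-llvmcpu-target-cpu-features="


def requirements_flags(compiler_flags):
    """Turn each CPU-features compiler flag into a --requirements flag."""
    flags = []
    for flag in compiler_flags:
        if flag.startswith(CPU_FEATURES_PREFIX):
            # Keep only the feature list that follows the flag prefix.
            features = flag[len(CPU_FEATURES_PREFIX):]
            flags.append("--requirements=" + features)
    return flags
```

Flags that don't start with the CPU-features prefix are ignored, so a test that only passes `--iree-llvmcpu-target-cpu=generic` produces no `--requirements` flag.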

The TARGET_CPU_FEATURES_VARIANTS field used to accept a special value "default" and default to that. I renamed that to "generic" because that's what it does (so let's not have two different names for that same configuration, "default" and "generic") and I added support for "host" so we can easily add tests that test host codegen and/or decide to default to testing that in addition to "generic" in a follow-up.

As part of this CL, I had to add "--iree-llvmcpu-target-cpu=generic" to a number of iree_check_single_backend_test_suite rules. If those were iree_check_test_suite, this wouldn't be needed, as those take TARGET_CPU_FEATURES_VARIANTS, which defaults to "generic". This discrepancy between iree_check_single_backend_test_suite and iree_check_test_suite seems unfortunate and is not reflected in the names. We should address that in a follow-up. My preferred resolution actually would be to drop iree_check_single_backend_test_suite other than as a local implementation helper within iree_check_test.cmake. Everyone should be able to just use iree_check_test_suite, and it's not even more verbose.

@bjacob bjacob marked this pull request as ready for review October 17, 2024 16:31
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@ScottTodd ScottTodd self-requested a review October 17, 2024 18:25
@ScottTodd
Member

I have followed your suggestion to drop TARGET_CPU and TARGET_CPU_FEATURES. While doing so, I found the place which had originally motivated them: this is where we pass a --requirements flag to the test runner so that it can check at runtime that the host CPU supports the expected features. Now that there isn't a separate TARGET_CPU_FEATURES field anymore, we resort to parsing the COMPILER_FLAGS, which is not too bad and is a tiny detail, so it is the right call to deal with it rather than have those TARGET_CPU and TARGET_CPU_FEATURES flags everywhere.

Ah. That's on my list to remove too: https://github.com/iree-org/iree-test-suites/blob/3a0ea13cbdc954365d653a48cc99cc63a9ff09b3/linalg_ops/matmul/generate_e2e_matmul_tests.py#L529-L531

    # TODO(scotttodd): drop this and whatever logic in the test tool used it
    #     multiple backends should be able to use the same input IR, so the
    #     input IR shouldn't need things like CPU features in it

Also for code review, please try to avoid force pushing. I don't see a way to view the diff between when I last reviewed and the latest code >_>

@ScottTodd
Member

(GitHub has a few ways to view diffs between commits, but force pushing with a rebase defeats most of them)

  • Changes since last review shows no diff
  • "Compare" on the force push shows all changes, including commits unrelated to the PR

@bjacob
Contributor Author

bjacob commented Oct 17, 2024

@ScottTodd , for my education - when my PR has a conflict with main, so I have to rebase it, that in itself seems to require a force-push, right -- so do I correctly understand your suggestion in such cases to refrain from squashing the commits in this PR, so that the force-push of the rebased first commit doesn't prevent you from looking at the other commits on top of it?

@ScottTodd
Member

@ScottTodd , for my education - when my PR has a conflict with main, so I have to rebase it, that in itself seems to require a force-push, right -- so do I correctly understand your suggestion in such cases to refrain from squashing the commits in this PR, so that the force-push of the rebased first commit doesn't prevent you from looking at the other commits on top of it?

git merge upstream/main then resolve conflicts should avoid the need to rebase or force push.

@bjacob
Contributor Author

bjacob commented Oct 17, 2024

TIL, thanks. Because we ultimately rebase-and-squash when merging PRs, I had followed that workflow also locally. It hadn't occurred to me that I could create merge commits locally and on my PR branch, as that would still ultimately get rebased-and-squashed on merging.

Member

@ScottTodd ScottTodd left a comment

Mostly LGTM. Just a few lingering comments.

Comment on lines 90 to 92
if(_RULE_COMPILER_TARGET_BACKEND STREQUAL "llvm-cpu")
  list(APPEND _TRANSLATE_FLAGS "--iree-llvmcpu-target-cpu=generic")
endif()
Member

Contributor Author

done.

Comment on lines -254 to +252
- if(_RULE_TARGET_CPU_FEATURES)
-   list(APPEND _GENERATOR_STANDARD_FLAGS "--requirements=${_RULE_TARGET_CPU_FEATURES}")
- endif()
+ foreach(_COMPILER_FLAG IN LISTS _RULE_COMPILER_FLAGS)
+   set(_CPU_FEATURES_REGEX "^--iree-llvmcpu-target-cpu-features=")
+   if (_COMPILER_FLAG MATCHES "${_CPU_FEATURES_REGEX}")
+     string(REGEX REPLACE "${_CPU_FEATURES_REGEX}" "" _CPU_FEATURES "${_COMPILER_FLAG}")
+     list(APPEND _GENERATOR_STANDARD_FLAGS "--requirements=${_CPU_FEATURES}")
+   endif()
+ endforeach()
Member

This is fine for now / in this repo.

I have followed your suggestion to drop TARGET_CPU and TARGET_CPU_FEATURES. While doing so, I found the place which had originally motivated them: this is where we pass a --requirements flag to the test runner so that it can check at runtime that the host CPU supports the expected features. Now that there isn't a separate TARGET_CPU_FEATURES field anymore, we resort to parsing the COMPILER_FLAGS, which is not too bad and is a tiny detail, so it is the right call to deal with it rather than have those TARGET_CPU and TARGET_CPU_FEATURES flags everywhere.

Ah. That's on my list to remove too: https://github.com/iree-org/iree-test-suites/blob/3a0ea13cbdc954365d653a48cc99cc63a9ff09b3/linalg_ops/matmul/generate_e2e_matmul_tests.py#L529-L531

    # TODO(scotttodd): drop this and whatever logic in the test tool used it
    #     multiple backends should be able to use the same input IR, so the
    #     input IR shouldn't need things like CPU features in it

(Over in iree-test-suites) I think I also want to push this logic to the leaves, similar to how we have if(IREE_HIP_TEST_TARGET_CHIP) controlling defining and running tests for the ROCm backend + HIP driver, we can check explicit options like IREE_TEST_SUITES_INCLUDE_ARM_SME_TESTS or check automatically/programmatically for support.

Basically, each set of tests should make it obvious how the tests are enabled. The existing logic, which you are updating here, hides some of that decision making in helper functions. For remote targets like Android devices and local targets like GPUs, we can't as easily enumerate devices and supported features at configure time, and I don't want to special case any backend in the helper functions.

Comment on lines 52 to 55
flags = [
    "--iree-hal-target-backends=%s" % target_backend,
    "--iree-llvmcpu-target-cpu=generic",
] + compiler_flags + input_type_flags
Member

Is this implicit flag needed in the Bazel helper function? It looks like you already updated several call sites to pass it explicitly. We shouldn't always set an llvmcpu flag, even for tests using other targets.

I don't care as much about the Bazel side of that though, and it looks like CMake has the behavior that I want already.

Contributor Author

Good catch. There was something to change here, but to match what we do in the CMake implementation, it had to be something else. The top-level rules like iree_check_test_suite and iree_generated_e2e_runner_test take the special target_cpu_features_variants parameter. This defaults to ["generic"] on CMake, so it should also default to that on Bazel. So the "generic" stays implicit but moves one level up within the rules.
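
A rough Python sketch of that layering, under stated assumptions: the function names are hypothetical, only the "generic" and "host" variants are handled, and skipping the CPU flag for non-CPU backends is an assumption drawn from the review comment above rather than from the actual rules.

```python
# Hypothetical sketch: the low-level helper adds nothing implicit, and the
# top-level rule supplies the default variant list.
def single_backend_flags(target_backend, compiler_flags):
    """Low-level helper: only the flags it is explicitly given."""
    return ["--iree-hal-target-backends=%s" % target_backend] + list(compiler_flags)


def suite_flags(target_backend, compiler_flags=(), target_cpu_features_variants=None):
    """Top-level rule: one list of flags per CPU variant."""
    if target_backend != "llvm-cpu":
        # Assumption: CPU variants only matter for the llvm-cpu backend.
        return [single_backend_flags(target_backend, compiler_flags)]
    variants = target_cpu_features_variants
    if variants is None:
        variants = ["generic"]  # the implicit default lives at this level
    result = []
    for variant in variants:
        if variant not in ("generic", "host"):
            raise ValueError("unsupported variant in this sketch: %r" % variant)
        cpu_flag = "--iree-llvmcpu-target-cpu=%s" % variant
        result.append(
            single_backend_flags(target_backend, [cpu_flag] + list(compiler_flags))
        )
    return result
```

Keeping the default in the top-level function means callers of the low-level helper must spell out their CPU target, which matches the explicit `COMPILER_FLAGS` style preferred earlier in this review.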

build_tools/bazel/iree_e2e_generated_runner_test.bzl Outdated Show resolved Hide resolved
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@bjacob bjacob merged commit 0c6a151 into iree-org:main Oct 17, 2024
36 checks passed
@bjacob
Contributor Author

bjacob commented Oct 18, 2024

@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.

@ScottTodd
Member

Build failures on MSVC: https://github.com/iree-org/iree/actions/runs/11401187470/job/31723569010#step:7:7793

@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.

See the pinned issue: #18432

@bjacob
Contributor Author

bjacob commented Oct 18, 2024

bjacob added a commit that referenced this pull request Oct 18, 2024
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
@ScottTodd
Member

@stellaraccident , i'm blissfully unaware of release schedules. If there is any upcoming release, this one is worth having on it.

See the pinned issue: #18432

@bjacob I added a bullet to the rolling release notes there:

  • Added warnings for when --iree-llvmcpu-target-cpu= or --iree-llvmcpu-target-cpu-features are missing, since the default of "generic" is slow. These flags will be required in the future: 0c6a151

When we next cut a stable release, the release notes on that issue will be included. Feel free to edit the issue if you have other phrasing you want us to use.
