data-tiling: introduce upper_bound_tile_size op to defer padding-size choice to MaterializeEncoding. (#14349)

This fixes #11632 by introducing a materializable
`upper_bound_tile_size` op instead of hardcoding a fixed padding amount
at Flow. The fix is general enough to also solve the problem for narrow
matmuls, which is explained in more detail below as it is an important
part of what this PR does.

For each combination of element types and each target, the
MaterializeEncoding pass selects appropriate matmul tile shapes. Input
tensors get padded to the next multiple of the tile size. The padding
increases the inherent arithmetic cost of the problem at hand. When,
along some dimension, the original tensor size is smaller than the tile
size, that can result in particularly large overhead. The extreme case,
which is also a very common case, is matrix-times-vector multiplication.
The "vector" right-hand side is really a matrix with one dimension size
equal to 1, so if the general matmul tile shape along that dimension is
8 or 16, as is usually the case, that can be an 8x or 16x increase in
the inherent arithmetic cost of the matmul op.
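
To make the overhead concrete, here is a minimal arithmetic sketch. The
helper names (`pad_to_multiple`, `matmul_cost`, `padding_overhead`) are
hypothetical and are not part of IREE; the sketch only counts
multiply-accumulates before and after padding each dimension to a
multiple of its tile size.

```python
def pad_to_multiple(size: int, tile: int) -> int:
    """Round `size` up to the next multiple of `tile`."""
    return -(-size // tile) * tile  # ceiling division, then scale back up


def matmul_cost(m: int, k: int, n: int) -> int:
    """Multiply-accumulate count of an (M x K) times (K x N) matmul."""
    return m * k * n


def padding_overhead(m: int, k: int, n: int,
                     tile_m: int, tile_k: int, tile_n: int) -> float:
    """Ratio of padded arithmetic cost to original arithmetic cost."""
    padded = matmul_cost(
        pad_to_multiple(m, tile_m),
        pad_to_multiple(k, tile_k),
        pad_to_multiple(n, tile_n),
    )
    return padded / matmul_cost(m, k, n)
```

With these definitions, a 512x512 matrix times a vector (N = 1) padded
to a tile size of 16 along N gives `padding_overhead(512, 512, 1, 16, 1, 16)`
equal to 16.0, matching the 16x figure above.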

The solution is to adjust MaterializeEncoding tile shapes to narrow
dimensions. We had some logic in place for that, but #11632 rendered it
moot: the Flow-level padding of everything to the next multiple of 16
meant that this logic never had a chance to kick in. Fixing #11632 in
this PR was the opportunity to fix that as well, and to ensure that the
solution to #11632 also works for narrow cases. As matrix-times-vector
products were the common case that suffered the most from #11632, it
would have been unfortunate to "solve" #11632 without addressing them.
Matrix-times-vector is only the extreme case; other narrow cases matter
too. When, e.g. on AVX-512, the general matmul tile size is 16, even
width-8 matmuls (MxKx8) suffered from 2x widening. So the solution in
this PR addresses all narrow cases, defined as any case where a tensor
dimension size is less than the general tile size.
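
As an illustration of what adjusting a tile size to a narrow dimension
could look like, here is a small sketch. The rounding rule used here
(smallest power of two that covers the dimension, capped at the general
tile size) is an assumption made for this example, not necessarily the
heuristic this PR implements, and both function names are hypothetical.

```python
from typing import Optional


def next_power_of_two(x: int) -> int:
    """Smallest power of two that is >= x, for x >= 1."""
    return 1 << (x - 1).bit_length()


def adjust_tile_for_narrow_dim(dim_size: Optional[int], general_tile: int) -> int:
    """Pick a tile size for one dimension.

    `dim_size` is None for a dynamic dimension. A dimension is "narrow"
    when its static size is less than the general tile size; only then is
    the tile shrunk. The power-of-two rounding is an assumed policy.
    """
    if dim_size is None or dim_size >= general_tile:
        return general_tile
    return min(general_tile, next_power_of_two(dim_size))
```

Under this assumed rule, a matrix-times-vector case (size 1) gets a
tile size of 1, and a width-8 matmul on a target whose general tile size
is 16 gets a tile size of 8, so neither pays for widening.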

The difficulty was that when MaterializeEncoding runs on a dispatch
function, it runs on an already-padded tensor. Introducing
`upper_bound_tile_size` only makes it possible to select the right
padding amount; there is still a `tensor.pad` op, and it still gets in
the way of knowing the actual, original tensor shape for the purpose of
adjusting tile shapes for narrow cases. Moreover, as
`MaterializeEncoding` is a type-converter pass, it can't just walk from
a Value up to its defining op to find the pre-padding tensor. There are
no values there, only types. So the information about the pre-padding
tensor shape has to be part of the tensor type that
`MaterializeEncoding` sees, that is, the padded tensor type.

The solution to that problem in this PR is to add an `original_type`
field to `EncodingAttr`.
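
The idea of carrying the pre-padding shape on the type itself can be
modeled with plain data classes. This is a conceptual sketch only:
`TensorType`, `Encoding` and `narrow_dims` are hypothetical stand-ins,
not the MLIR or IREE classes, and the only name taken from this PR is
the `original_type` field.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Encoding:
    # Pre-padding tensor type, recorded on the encoding so that a pass
    # that only sees types can still recover the original shape.
    original_type: Optional["TensorType"] = None


@dataclass(frozen=True)
class TensorType:
    shape: Tuple[int, ...]
    element_type: str
    encoding: Optional[Encoding] = None


def narrow_dims(padded: TensorType, general_tile: Tuple[int, ...]) -> Tuple[bool, ...]:
    """Per dimension, report whether the original size is below the tile size.

    Only the type is inspected; no defining op is consulted. If no
    original type is recorded, the padded shape is used as a fallback.
    """
    enc = padded.encoding
    source = enc.original_type if enc and enc.original_type else padded
    return tuple(size < tile for size, tile in zip(source.shape, general_tile))
```

For example, a 512x1 right-hand side padded to 512x16 still reports its
second dimension as narrow when the padded type carries the original
type, whereas the same padded type without it does not.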

Fixes #11632.

Fixes a compiler issue encountered in #14398 but not the originally
reported runtime crash by itself.

This now also includes the removal of a now-useless VMVX pass, which was
originally split into #14383.
bjacob authored Jul 17, 2023
1 parent 49335b7 commit 09685ee
Showing 50 changed files with 1,309 additions and 789 deletions.
1 change: 1 addition & 0 deletions compiler/src/iree/compiler/Codegen/BUILD.bazel
@@ -22,6 +22,7 @@ iree_compiler_cc_library(
    ],
    deps = [
        "//compiler/src/iree/compiler/Codegen/Common",
        "//compiler/src/iree/compiler/Codegen/Common/CPU:CommonCPUPasses",
        "//compiler/src/iree/compiler/Codegen/Common/GPU:CommonGPUPasses",
        "//compiler/src/iree/compiler/Codegen/Dialect:IREECodegenDialect",
        "//compiler/src/iree/compiler/Codegen/LLVMCPU",
1 change: 1 addition & 0 deletions compiler/src/iree/compiler/Codegen/CMakeLists.txt
@@ -21,6 +21,7 @@ iree_cc_library(
    IREELinalgExtPasses
    MLIRPass
    iree::compiler::Codegen::Common
    iree::compiler::Codegen::Common::CPU::CommonCPUPasses
    iree::compiler::Codegen::Common::GPU::CommonGPUPasses
    iree::compiler::Codegen::Dialect::IREECodegenDialect
    iree::compiler::Codegen::LLVMCPU
90 changes: 90 additions & 0 deletions compiler/src/iree/compiler/Codegen/Common/CPU/BUILD.bazel
@@ -0,0 +1,90 @@
# Copyright 2023 The IREE Authors
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

load("//build_tools/bazel:build_defs.oss.bzl", "iree_compiler_cc_library", "iree_gentbl_cc_library")

package(
    default_visibility = ["//visibility:public"],
    features = ["layering_check"],
    licenses = ["notice"],  # Apache 2.0
)

iree_gentbl_cc_library(
    name = "PassesIncGen",
    tbl_outs = [
        (
            ["--gen-pass-decls"],
            "Passes.h.inc",
        ),
    ],
    tblgen = "@llvm-project//mlir:mlir-tblgen",
    td_file = "Passes.td",
    deps = ["@llvm-project//mlir:PassBaseTdFiles"],
)

iree_compiler_cc_library(
    name = "PassHeaders",
    hdrs = [
        "PassDetail.h",
        "Passes.h",
        "Passes.h.inc",
    ],
    deps = [
        ":PassesIncGen",
        "//compiler/src/iree/compiler/Codegen/Dialect:IREECodegenDialect",
        "//compiler/src/iree/compiler/Dialect/HAL/IR",
        "//compiler/src/iree/compiler/Utils",
        "@llvm-project//mlir:Pass",
        "@llvm-project//mlir:Transforms",
    ],
)

iree_compiler_cc_library(
    name = "CommonCPUPasses",
    srcs = [
        "CPUMaterializeEncodingPass.cpp",
        "Passes.cpp",
    ],
    hdrs = [
        "Passes.h",
    ],
    deps = [
        ":PassHeaders",
        ":PassesIncGen",
        "//compiler/src/iree/compiler/Codegen/Common",
        "//compiler/src/iree/compiler/Codegen/Dialect:IREECodegenDialect",
        "//compiler/src/iree/compiler/Codegen/Transforms",
        "//compiler/src/iree/compiler/Codegen/Utils",
        "//compiler/src/iree/compiler/Dialect/HAL/IR",
        "//llvm-external-projects/iree-dialects:IREELinalgExtDialect",
        "//llvm-external-projects/iree-dialects:IREELinalgExtTransforms",
        "//llvm-external-projects/iree-dialects:IREELinalgExtUtils",
        "@llvm-project//llvm:Support",
        "@llvm-project//mlir:AffineDialect",
        "@llvm-project//mlir:AffineTransforms",
        "@llvm-project//mlir:AffineUtils",
        "@llvm-project//mlir:ArithDialect",
        "@llvm-project//mlir:BufferizationDialect",
        "@llvm-project//mlir:DestinationStyleOpInterface",
        "@llvm-project//mlir:FuncDialect",
        "@llvm-project//mlir:IR",
        "@llvm-project//mlir:LinalgDialect",
        "@llvm-project//mlir:LinalgUtils",
        "@llvm-project//mlir:MemRefDialect",
        "@llvm-project//mlir:MemRefTransforms",
        "@llvm-project//mlir:Pass",
        "@llvm-project//mlir:SCFDialect",
        "@llvm-project//mlir:SCFTransforms",
        "@llvm-project//mlir:SCFUtils",
        "@llvm-project//mlir:SideEffectInterfaces",
        "@llvm-project//mlir:Support",
        "@llvm-project//mlir:TensorDialect",
        "@llvm-project//mlir:Transforms",
        "@llvm-project//mlir:VectorDialect",
        "@llvm-project//mlir:VectorToSCF",
        "@llvm-project//mlir:VectorTransforms",
    ],
)
85 changes: 85 additions & 0 deletions compiler/src/iree/compiler/Codegen/Common/CPU/CMakeLists.txt
@@ -0,0 +1,85 @@
################################################################################
# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from #
# compiler/src/iree/compiler/Codegen/Common/CPU/BUILD.bazel #
# #
# Use iree_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary #
# CMake-only content. #
# #
# To disable autogeneration for this file entirely, delete this header. #
################################################################################

iree_add_all_subdirs()

iree_tablegen_library(
  NAME
    PassesIncGen
  TD_FILE
    "Passes.td"
  OUTS
    --gen-pass-decls Passes.h.inc
)

iree_cc_library(
  NAME
    PassHeaders
  HDRS
    "PassDetail.h"
    "Passes.h"
    "Passes.h.inc"
  DEPS
    ::PassesIncGen
    MLIRPass
    MLIRTransforms
    iree::compiler::Codegen::Dialect::IREECodegenDialect
    iree::compiler::Dialect::HAL::IR
    iree::compiler::Utils
  PUBLIC
)

iree_cc_library(
  NAME
    CommonCPUPasses
  HDRS
    "Passes.h"
  SRCS
    "CPUMaterializeEncodingPass.cpp"
    "Passes.cpp"
  DEPS
    ::PassHeaders
    ::PassesIncGen
    IREELinalgExtDialect
    IREELinalgExtTransforms
    IREELinalgExtUtils
    LLVMSupport
    MLIRAffineDialect
    MLIRAffineTransforms
    MLIRAffineUtils
    MLIRArithDialect
    MLIRBufferizationDialect
    MLIRDestinationStyleOpInterface
    MLIRFuncDialect
    MLIRIR
    MLIRLinalgDialect
    MLIRLinalgUtils
    MLIRMemRefDialect
    MLIRMemRefTransforms
    MLIRPass
    MLIRSCFDialect
    MLIRSCFTransforms
    MLIRSCFUtils
    MLIRSideEffectInterfaces
    MLIRSupport
    MLIRTensorDialect
    MLIRTransforms
    MLIRVectorDialect
    MLIRVectorToSCF
    MLIRVectorTransforms
    iree::compiler::Codegen::Common
    iree::compiler::Codegen::Dialect::IREECodegenDialect
    iree::compiler::Codegen::Transforms
    iree::compiler::Codegen::Utils
    iree::compiler::Dialect::HAL::IR
  PUBLIC
)

### BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_BELOW_THIS_LINE ###