From 91ad66f1902675484588050e43211962c45a59b4 Mon Sep 17 00:00:00 2001 From: Sandeep Dasgupta Date: Fri, 11 Aug 2023 22:07:38 +0000 Subject: [PATCH] revised plan --- rfcs/20230622-quantized-reduction.md | 93 +++++++++++++++++++++++++++- 1 file changed, 92 insertions(+), 1 deletion(-) diff --git a/rfcs/20230622-quantized-reduction.md b/rfcs/20230622-quantized-reduction.md index 1251a59f4d6..2152a5cc9ea 100644 --- a/rfcs/20230622-quantized-reduction.md +++ b/rfcs/20230622-quantized-reduction.md @@ -2,7 +2,9 @@ Status: Review
Initial version: 06/22/2023
-Last updated: 07/13/2023
+updated: 07/13/2023
: Minor refactoring of the examples. +Last updated: 08/11/2023
: Revision of the proposal to introduce an +attribute to capture accumulation type.
Discussion thread: [GitHub](https://github.com/openxla/stablehlo/pull/1664) ## Version log @@ -512,3 +514,92 @@ More formally: element_type(input_types(output_conversion))`. * (C11) `shape(operand) = shape(result)`. + +## [11 Aug'23] Revised proposal + +### Context + +Option #2 should be avoided because it is hard to control the transformation +which might disrupt the pattern to be matched. The option #1 sounds good except +that the extra input/output conversion blocks are surplus information. The +specification would benefit if the intent of the conversion blocks can be +expressed precisely. The conversion blocks provides a way to capture the +accumulation type needed to compute the accumulative operation on. + +The revised proposal is: + +* To capture the accumulation type via an additional StableHLO attribute like + `accumulation_element_type`. +* The attribute seems beneficial for other ops as well like `dot_general` and + `convolution`. +* `precision_config`, currently used for `dot_general` and `convolution`, is + used to override the precision specified by the input parameters, allowing the + choice of low precision vs high precision computation. We should consider + adding `precision_config` to all reduction based op as well. + +### Few implementation details + +#### On StableHLO side + +The reduce syntax to be augmented with a optional [type +attribute](https://github.com/llvm/llvm-project/blob/51a57074bc63842970c4c160b05c1a7e42db7523/mlir/include/mlir/IR/OpBase.td#L1466) +as follows: + +```mlir +%0 = stablehlo.reduce(%arg0 init: %arg1) across dimensions = [0] { + accumulation_type = tensor> +} : (tensor<16x!quant.uniform>, tensor>) -> tensor> + reducer(%arg2: tensor>, %arg3: tensor>) { + %1 = stablehlo.add %arg2, %arg3 : tensor> + stablehlo.return %1 : tensor> + } + +// using tablegen specification like +// OptionalAttr>:$accumulation_type +``` + +Note that the main difference between this option and the option #1 is that the +input and output conversion blocks are no longer used as their intent is +specified via the `accumulation_type` attribute. However, the reducer block +still needs to express the computation in accumulation type only. + +**Why optional attribute?** + +* At times, it might be desirable not to hard-code the accumulation type. For + example, when we would like to write a generic code and let the downstream + compilation tools to decide the exact accumulation type based on the hardware + of choice. +* It allows the stablehlo, used in various existing pipelines, to remain + largely unaffected by this change. + +Next, the StableHLO specification should be updated with the syntax and +semantics aspects of this attribute. + +#### On StableHLO Consumers side + +The consumers can pattern match the op taking the accumulation type in account +if the targeted hardware supports accumulation at higher type. +There are still to explore things about maintaining StableHLO-HLO parity which +needs to be addresses as well. + +### Action Plan + +I propose to follow the action plan (order matters): + +* Update the specification of ReduceOp, ReduceWindowOp, and SelectAndScatterOp + op, taking the accumulation type into account, via [open + pr](https://github.com/openxla/stablehlo/pull/1538). +* Finalize the quantized specification of AllReduceOp, BatchNormTrainingOp, + BatchNormGradOp and ReduceScatterOp, whose semantics depend on ReduceOp, + via [open ticket](https://github.com/openxla/stablehlo/issues/1666). +* Add implementation for additional attribute in the above ops. This includes +updating the tablegen spec/verifiers/type inferencers. [Need a ticket for this]. +* Address the disparity between StableHLO and HLO because of the introduction of +this new attribute in StableHLO: Should/How XLA should consume this additional +attribute? [Need a ticket for this]. +* Spec the behavior of `precision_config` in DotGeneralOp. [open +issue](https://github.com/openxla/stablehlo/issues/755) +* Consider adding `precision_config` in reduction op. [need a ticket for this +* Consider adding `accumulation_type` to `dot_general`/`convolution op`. +[need a ticket for this item]. +item].