diff --git a/rfcs/20230622-quantized-reduction.md b/rfcs/20230622-quantized-reduction.md
index 248d97c097a..3ec7a8a26af 100644
--- a/rfcs/20230622-quantized-reduction.md
+++ b/rfcs/20230622-quantized-reduction.md
@@ -42,14 +42,14 @@ The RFC introduces the following proposal, emerged out of discussion in the
 , along with their tradeoffs.
 
 The proposal allows the reducer block to express the computation in a different
-element type (preferably higher accumulation type) than the one used in reduce
+element type (preferably wider accumulation type) than the one used in reduce
 op's ops arguments and return type. For illustrative purposes, in the following
-example, the operand element type `tensor>` is different from the element type for
- reduction region's block arguments. Similarly, the element type of the
- reduce op's result `!quant.uniform>` is
- different from that of block return (`tensor>`).
+example, the operand element type
+`tensor>` is different from the
+element type for reduction region's block arguments. Similarly, the element
+type of the reduce op's result
+`!quant.uniform>` is different from that of
+block return (`tensor>`).
 
 ```mlir
 %result = "stablehlo.reduce"(%input, %init_value) ({
@@ -71,32 +71,32 @@ example, the operand element type
   `tensor, ..., tensor, tensor, ...,`
   `tensor) -> (tensor, ..., tensor)` where
-  `is_integer(element_type(inputs[i])) = is_integer(element_type(Ei]` or
-  `is_float(element_type(inputs[i])) = is_float(element_type(Ei]` or
-  `is_complex(element_type(inputs[i])) = is_complex(element_type(Ei]` or
-  `is_quantized(element_type(inputs[i])) = is_quantized(element_type(Ei]`.
+  `is_integer(element_type(inputs[i])) = is_integer(element_type(E[i]))` or
+  `is_float(element_type(inputs[i])) = is_float(element_type(E[i]))` or
+  `is_complex(element_type(inputs[i])) = is_complex(element_type(E[i]))` or
+  `is_quantized(element_type(inputs[i])) = is_quantized(element_type(E[i]))`.
 * (C?) `shape(results...) = shape(inputs...)` except that the dimension sizes
   of `inputs...` corresponding to `dimensions` are not included.
@@ -170,10 +170,10 @@ portions of the spec which needs modification.
 * (C?) `baseline_element_type(inputs...) = baseline_element_type(results...)`.
 * (C?) `body` has type `tensor, ..., tensor, tensor, ...,`
   `tensor) -> (tensor, ..., tensor)` where
-  `is_integer(element_type(inputs[i])) = is_integer(element_type(Ei]` or
-  `is_float(element_type(inputs[i])) = is_float(element_type(Ei]` or
-  `is_complex(element_type(inputs[i])) = is_complex(element_type(Ei]` or
-  `is_quantized(element_type(inputs[i])) = is_quantized(element_type(Ei]`.
+  `is_integer(element_type(inputs[i])) = is_integer(element_type(E[i]))` or
+  `is_float(element_type(inputs[i])) = is_float(element_type(E[i]))` or
+  `is_complex(element_type(inputs[i])) = is_complex(element_type(E[i]))` or
+  `is_quantized(element_type(inputs[i])) = is_quantized(element_type(E[i]))`.
 
 ### Revised specification of select_and_scatter op
 
@@ -190,10 +190,10 @@ not need additional conversion functions associated with `select`. But the
 * (C3) `element_type(init_value) = element_type(operand)`.
 * (C?) `baseline_element_type(inputs...) = baseline_element_type(results...)`.
 * (C10) `scatter` has type `(tensor, tensor) -> tensor` where
-  `is_integer(element_type(operand)) = is_integer(element_type(E]` or
-  `is_float(element_type(operand)) = is_float(element_type(E]` or
-  `is_complex(element_type(operand)) = is_complex(element_type(E]` or
-  `is_quantized(element_type(operand)) = is_quantized(element_type(E]`.
+  `is_integer(element_type(operand)) = is_integer(element_type(E))` or
+  `is_float(element_type(operand)) = is_float(element_type(E))` or
+  `is_complex(element_type(operand)) = is_complex(element_type(E))` or
+  `is_quantized(element_type(operand)) = is_quantized(element_type(E))`.
 
 ### Action Plan
 
@@ -204,18 +204,18 @@ I propose to follow the action plan (order matters):
   op, taking the accumulation type into account, via
   [open pr](https://github.com/openxla/stablehlo/pull/1538).
 * Finalize the quantized specification of AllReduceOp, BatchNormTrainingOp,
- BatchNormGradOp and ReduceScatterOp, whose semantics depend on ReduceOp,
- via [open ticket](https://github.com/openxla/stablehlo/issues/1666).
+  BatchNormGradOp and ReduceScatterOp, whose semantics depend on ReduceOp,
+  via [open ticket](https://github.com/openxla/stablehlo/issues/1666).
 * Spec the behavior of `precision_config` in DotGeneralOp.
   [open issue](https://github.com/openxla/stablehlo/issues/755)
 * Consider adding `precision_config` in reduction op. `precision_config`,
-currently used for `dot_general` and `convolution`, to override the precision
-specified by the input parameters, allowing the choice of low precision vs high
-precision computation. We should consider adding `precision_config` to all
-reduction based op as well. [need a ticket for this]
+  currently used for `dot_general` and `convolution`, overrides the precision
+  specified by the input parameters, allowing the choice of low precision vs
+  high precision computation. We should consider adding `precision_config` to
+  all reduction-based ops as well. [need a ticket for this]
 * Consider adding `accumulation_type` to `dot_general`/`convolution op`. The
-attribute seems beneficial for ops like `dot_general` and `convolution` which
-does not have an explicit reduction function. [need a ticket for this item].
+  attribute seems beneficial for ops like `dot_general` and `convolution`,
+  which do not have an explicit reduction function. [need a ticket for this item].
 
 ## Summary of previous proposals
 
@@ -340,9 +340,9 @@ Here we will informally propose the semantics of the additional functions
 * (-) The disadvantage of this representation is that the syntax is more
   verbose and requires significant changes to the specification.
 * (-) The extra input/output conversion blocks are surplus information. The
-intent of conversion blocks is to capture the accumulation type needed to
-compute the accumulative operation on. The specification would benefit if the
-intent can be expressed succinctly.
+  intent of conversion blocks is to capture the accumulation type needed to
+  compute the accumulative operation on. The specification would benefit if the
+  intent can be expressed succinctly.
 
 ### Introduce accumulation type attribute
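
---

For context, the reduce form the patch describes, where the reducer block computes in a wider accumulation type than the op's quantized operands and result, can be sketched roughly as below. This is an illustrative sketch, not the RFC's exact example: the tensor shapes and quantization parameters (`2x3`, `i8:f32, 0.1:5`) are hypothetical placeholders, since the exact types are abridged in the hunks above.

```mlir
// Illustrative sketch only; shapes and quantization parameters are
// hypothetical placeholders. The op's operands and result carry a quantized
// 8-bit element type, while the reducer block's arguments, computation, and
// return value all use a wider i32 accumulation type.
%result = "stablehlo.reduce"(%input, %init_value) ({
  ^bb0(%arg0: tensor<i32>, %arg1: tensor<i32>):
    %0 = "stablehlo.add"(%arg0, %arg1) : (tensor<i32>, tensor<i32>) -> tensor<i32>
    "stablehlo.return"(%0) : (tensor<i32>) -> ()
}) {
  dimensions = dense<1> : tensor<1xi64>
} : (tensor<2x3x!quant.uniform<i8:f32, 0.1:5>>,
     tensor<!quant.uniform<i8:f32, 0.1:5>>)
    -> tensor<2x!quant.uniform<i8:f32, 0.1:5>>
```

The mismatch between the quantized element type at the op boundary and the `i32` element type inside the region is exactly what the revised `(C?)` constraints permit, by requiring only that the two types agree in kind (`is_quantized`/`is_integer`/`is_float`/`is_complex`) rather than being equal.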