[Epic][CPU] Enable predictable performance for matrix multiplication using data-tiling and micro-kernels #13216

dcaballe · 2023-04-21T02:19:36Z

This epic tracks all micro-kernel-related work.

Tasks related to core functionality.

- [x] Enable use of microkernels through plugins
- [x] Enable use of microkernels through bitcode linking (#12112); WIP PR: #13460
- [ ] Benchmark the data-tiling approach on models, using microkernels as plugins (until #12112 is addressed)
- [ ] #11360
- [ ] Avoid unpacking the GEMM results by writing them directly in unpacked form
- [ ] Improve how CPU code generation distributes work to threads.
- [ ] https://github.com/openxla/iree/issues/14431

Blocking issues with data-tiling on e2e workloads

- [ ] https://github.com/openxla/iree/issues/14414
- [ ] https://github.com/openxla/iree/issues/14398
- [ ] https://github.com/openxla/iree/issues/14252

Amortizing costs related to packing/unpacking by using propagation

### Tasks
- [ ] Add an early matmul -> data tiling pass to the preprocessing stage
- [ ] Add the DataLayoutPropagation pass to the preprocessing or flow stage

Enable propagation in models where packing can be done without padding, starting with MobileBert fp32.

Some support has already been developed in -upstream. To enable it for MobileBert, we need:

### Tasks
- [ ] #13530 
- [ ] #13531 
- [ ] #13532
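
To make the no-padding case concrete, here is a minimal pure-Python sketch of why propagation is straightforward when the shape divides evenly by the tile size. It is an illustration only, not IREE's or MLIR's `tensor.pack` implementation: the helpers `pack`, `unpack`, `map_matrix`, and `map_packed`, the 2x2 tile size, and the element-wise function are all hypothetical names and values chosen for the example. It shows that packing after an element-wise op gives the same result as applying the op to already-packed data, so the pack can be moved earlier.

```python
def pack(matrix, tile_rows, tile_cols):
    """Pack a 2-D list into tiles; result[i][j] is one tile_rows x tile_cols tile."""
    rows, cols = len(matrix), len(matrix[0])
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("shape must be divisible by the tile size (no padding here)")
    return [
        [
            [row[j:j + tile_cols] for row in matrix[i:i + tile_rows]]
            for j in range(0, cols, tile_cols)
        ]
        for i in range(0, rows, tile_rows)
    ]


def unpack(packed):
    """Inverse of pack: stitch the tiles back into a plain 2-D list."""
    tile_rows = len(packed[0][0])
    matrix = []
    for tile_row in packed:
        for r in range(tile_rows):
            matrix.append([v for tile in tile_row for v in tile[r]])
    return matrix


def map_matrix(fn, matrix):
    """Apply an element-wise function to an unpacked matrix."""
    return [[fn(v) for v in row] for row in matrix]


def map_packed(fn, packed):
    """Apply the same element-wise function directly on the packed layout."""
    return [[[[fn(v) for v in r] for r in tile] for tile in tile_row]
            for tile_row in packed]


# Hypothetical 4x4 input and element-wise op, chosen only for illustration.
A = [[float(4 * r + c) for c in range(4)] for r in range(4)]
shifted_relu = lambda v: max(v - 5.0, 0.0)

late = pack(map_matrix(shifted_relu, A), 2, 2)    # op first, then pack
early = map_packed(shifted_relu, pack(A, 2, 2))   # pack first, then op

assert late == early                              # the pack can be hoisted
assert unpack(early) == map_matrix(shifted_relu, A)
```

Because every element of the packed tensor corresponds to exactly one element of the original, any element-wise op commutes with the layout change in this setting.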

Evaluating propagation in models where packing needs padding as well.

Propagating tensor.pack ops that carry padding values is very challenging. We can currently propagate them only in a limited set of cases; this part needs more investigation.
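
One plausible way to see the difficulty, sketched in pure Python under stated assumptions: when the shape is not a multiple of the tile size, the pack introduces padding elements, and an op applied after the pack also transforms those padding elements. The helper names (`pack_padded`, `map_packed`), the 3x3 input, the 2x2 tile, and the zero padding value are all hypothetical and are not taken from IREE or MLIR. The sketch compares "op then pack" with "pack then op" for two element-wise ops.

```python
def pack_padded(matrix, tile_rows, tile_cols, pad_value):
    """Pack a 2-D list into tiles, padding the ragged edge with pad_value."""
    rows, cols = len(matrix), len(matrix[0])
    padded_rows = -(-rows // tile_rows) * tile_rows   # round up to a tile multiple
    padded_cols = -(-cols // tile_cols) * tile_cols
    padded = [
        [matrix[r][c] if r < rows and c < cols else pad_value
         for c in range(padded_cols)]
        for r in range(padded_rows)
    ]
    return [
        [
            [row[j:j + tile_cols] for row in padded[i:i + tile_rows]]
            for j in range(0, padded_cols, tile_cols)
        ]
        for i in range(0, padded_rows, tile_rows)
    ]


def map_packed(fn, packed):
    """Apply an element-wise function on the packed layout, padding included."""
    return [[[[fn(v) for v in r] for r in tile] for tile in tile_row]
            for tile_row in packed]


def commutes_with_pack(fn, matrix, pad_value=0.0):
    """True if 'op then pack' equals 'pack then op' for this input."""
    late = pack_padded([[fn(v) for v in row] for row in matrix], 2, 2, pad_value)
    early = map_packed(fn, pack_padded(matrix, 2, 2, pad_value))
    return late == early


# Hypothetical 3x3 input: not divisible by the 2x2 tile, so padding is needed.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# Scaling maps the zero padding to zero, so hoisting the pack is safe.
propagates_through_scale = commutes_with_pack(lambda v: 2.0 * v, A)
# Adding a constant turns the zero padding into 1.0, so the results differ.
propagates_through_add = commutes_with_pack(lambda v: v + 1.0, A)

assert propagates_through_scale is True
assert propagates_through_add is False
```

In this toy setting, propagation is only safe when the op maps the padding value to itself; otherwise the padding value would have to be adjusted or the padded region re-written after the op.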

dcaballe added the codegen Shared code generation infrastructure and dialects label Apr 21, 2023

aaron-schneider added the epic Tracking issues for related deliverables (as in Agile dev) label Apr 26, 2023

MaheshRavishankar added codegen/llvm LLVM code generation compiler backend codegen Shared code generation infrastructure and dialects and removed codegen Shared code generation infrastructure and dialects labels May 15, 2023

This was referenced May 15, 2023

[Epic] Prototype data layout propagation for data tiling #13533

Closed

[Epic] Data tiling: Complete x86 microkernels, invoke non-VMVX microKernel #13512

Closed

MaheshRavishankar added the p1 label May 15, 2023

MaheshRavishankar assigned MaheshRavishankar and hanhanW May 15, 2023

dcaballe commented Apr 21, 2023 • edited by hanhanW

[Epic][CPU] Enable predictable performance for matrix multiplication using data-tiling and micro-kernels #13216

[Epic][CPU] Enable predictable performance for matrix multiplication using data-tiling and micro-kernels #13216

Comments

dcaballe commented Apr 21, 2023 • edited by hanhanW Loading

Tasks related to core functionality.

Blocking issues with data-tiling on e2e workloads

Amortizing costs related to packing/unpacking by using propagation

Enable propagation in models where packing can be done without padding, starting with MobileBertfp32.

Evaluating propagation in models where packing needs padding as well.

dcaballe commented Apr 21, 2023 •

edited by hanhanW

Loading