November 2024 Tkw Release Notes #181

Open
14 of 38 tasks
harsh-nod opened this issue Oct 1, 2024 · 0 comments

harsh-nod commented Oct 1, 2024

This issue lists all feature requests and improvements slated for the Nov 2024 Tkw release.

  • Ensure that mappings modify the index sequence
  • Paper Submission
  • Masking & Mapping Section
  • IGEMM Performance Results
  • Paper - Language (operators & semantics, constraints)
  • Paper - Compiler & optimizations
  • Paper - Runtime
  • Paper - Sample kernels
  • Paper - Shape & type propagation
  • Broadcast Support (thread-shape analysis)
  • Reduction on non-accumulator (type fixes)
  • Chained Matmul
  • MMA + Reduction (handling mma and vector shapes)
  • Flash Attention Implementation
  • Flash Attention Performance Improvements
  • IGEMM SDXL Shapes Functionality
  • IGEMM Shared Memory Optimizations
  • IGEMM Performance Parity with Tuned IREE on SDXL Shapes
  • Compare IGEMM Performance with CK on SDXL Shapes
  • Obtain GEMM shapes (SDXL and LLAMA, ~20 shapes)
  • Dynamic shapes for GEMMs
  • Add LLVM scheduling intrinsics for GEMMs
  • Add shared memory bank conflict resolution with padding
  • GEMM Performance Parity with Tuned IREE
  • Compare GEMM Performance with hipBLASLt
  • GEMM & IGEMM Tuning Capability
  • GEMM Non-temporal loads
  • GEMM + SiLU fusion kernel
  • Performance Dashboard
  • Auto-tuning Capability
  • Batch GEMM support
  • Unaligned shapes for GEMMs
  • GEMM with fused elementwise operations
  • MoE Kernel
  • Temporary File API for benchmarking
  • Use DLPack instead of NumPy to copy data between torch and IREE
  • Debugger support (add breakpoints and inspect stack on GPU)
  • Profiling support
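
The "shared memory bank conflict resolution with padding" item above can be illustrated with a small model. This is only a sketch under stated assumptions, not the Tkw implementation: it assumes 32 banks that are each one 4-byte word wide, and the helper name `max_bank_conflict` is hypothetical. It counts how many threads in a group hit the same bank when each thread reads the same column of a different row of a tile.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, each one 4-byte word wide


def max_bank_conflict(row_stride_words, num_threads=32, column=0,
                      num_banks=NUM_BANKS):
    """Worst-case number of threads that map to the same bank.

    Thread t reads the word at offset t * row_stride_words + column,
    i.e. every thread reads the same column of a different row.
    """
    banks = Counter(
        (t * row_stride_words + column) % num_banks
        for t in range(num_threads)
    )
    return max(banks.values())


# A 32-word row stride sends every thread to the same bank.
unpadded = max_bank_conflict(row_stride_words=32)
# Padding each row by one word makes consecutive rows start in different banks.
padded = max_bank_conflict(row_stride_words=33)
print(unpadded, padded)
```

In this model the unpadded tile serializes all 32 accesses, while one word of padding per row spreads them across distinct banks. The cost is a small amount of extra shared memory per row.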

================================================

Week 1 (Oct 5th)
First version of the paper with a description of the language (mapping & masking), operators in the language, and compiler optimizations.
Performance comparisons for IGEMM and GEMM.
Flash Attention working.
MI250 & MI300.

Week 2 (Oct 12th)
Paper complete with sections on language, compiler.
FA working.
FP8 GEMM working.
IGEMM fixes landed.
Implement prefill, extend, and decode attention and get them functional.

Week 3 (Oct 18th)

Week 4 (Oct 25th)

Deadline (Oct 31)

harsh-nod changed the title from "November 2024 Release" to "November 2024 Tkw Release Notes" on Oct 1, 2024