Add generic aie array description paragraph (Xilinx#1191)

Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Co-authored-by: Jack Lo <jack.lo@amd.com>
fifield · Apr 10, 2024 · 21ef061
1 parent 56d198c
Showing 4 changed files with 37 additions and 12 deletions.
diff --git a/programming_guide/README.md b/programming_guide/README.md
@@ -8,9 +8,17 @@
 // 
 //===----------------------------------------------------------------------===//-->
 
-# <ins>MLIR-AIE Programming Guide</ins>
+# <ins>IRON AIE Programming Guide</ins>
 
-MLIR-AIE is an MLIR-based representation for AI Engine design. It provides a foundation from which complex and performant AI Engine designs can be defined and is supported by simulation and hardware impelemenation infrastructure. To better understand how AI Engine designs are defined at the MLIR level, it is recommended that you spend some time going through the [MLIR tutorial](../tutorials/) material. However, this programming guide is intended to lead you through a higher level abstraction (python) of the underlying MLIR-AIE framework and provide design examples and programming tips to allow users to build designs directly. Keep in mind also that MLIR-AIE is a foundational layer in a AI Engine software development framework and while this guide provides a programmer's view for using AI Engines, it also serves as a lower layer for higher abstraction MLIR layers such as [MLIR-AIR](https://github.com/Xilinx/mlir-air).
+<img align="right" width="300" height="300" src="./assets/AIEarray.svg">
+
+The AI Engine (AIE) array is a spatial compute architecture: a modular and scalable system with spatially distributed compute and memories. Its compute-dense vector processing runs independently of, and concurrently with, explicitly scheduled data movement. Since the vector compute core (green) of each AIE can only operate on data in its L1 scratchpad memory (light blue), data movement accelerators (purple) transport this data bi-directionally over a switched (dark blue) interconnect network, from any level in the memory hierarchy.
+
+Programming the AIE-array configures all its spatial building blocks: the compute cores' program memory, the data movers' buffer descriptors, the interconnect with its switches, etc. This guide introduces our Interface Representation for hands-ON (IRON) close-to-metal programming of the AIE-array. IRON is an open access toolkit enabling performance engineers to build fast and efficient, often specialized, designs through a set of Python language bindings around MLIR-AIE, our MLIR-based representation of the AIE-array. MLIR-AIE provides the foundation from which complex and performant AI Engine designs can be defined and is supported by simulation and hardware implementation infrastructure.
+
+> **NOTE:**  For those interested in better understanding how AI Engine designs are defined at the MLIR level, take a look through the [MLIR tutorial](../tutorials/) material. MLIR-AIE also serves as a lower layer for other higher-level abstraction MLIR layers such as [MLIR-AIR](https://github.com/Xilinx/mlir-air).
+
+This IRON AIE programming guide first introduces the language bindings for the AIE-array's structural elements (section 1). Section 2 explains how to set up explicit data movement to transport the necessary data, after which you can run your first program on the AIE compute core (section 3). Section 4 adds tracing for performance analysis and explains how to exploit the compute-dense vector operations. More vector design examples, both basic and larger (ML or computer vision), are given in sections 5 and 6. Finally, the quick reference summarizes the most important API elements.
 
 ## Outline
 <details><summary><a href="./section-1">Section 1 - Basic AI Engine building blocks</a></summary>

diff --git a/programming_guide/assets/AIEarray.svg b/programming_guide/assets/AIEarray.svg
diff --git a/programming_guide/section-1/README.md b/programming_guide/section-1/README.md
@@ -10,11 +10,10 @@
 
 # <ins>Section 1 - Basic AI Engine building blocks</ins>
 
-When we program for AI Engines, our MLIR-AIE framework serves as the entry point to declare and configure the structural building blocks that make up an array of AI Engines. Details for these building blocks, along with the general architecture of AI Engines are described in the [MLIR tutorials](../../mlir_tutorials). Read through the synopsis on first page of the tutorial before continuing here.
+When we program the AIE-array, we need to declare and configure its structural building blocks: compute tiles for vector processing, memory tiles as larger level-2 shared scratchpads, and shim tiles supporting data movement to external memory. In this programming guide, we will be utilizing the IRON python bindings for MLIR-AIE components to describe our design at the tile level of granularity. Later on, when we focus on kernel programming, we will explore vector programming in C/C++. But let's first look at a basic python source file (named [aie2.py](./aie2.py)) for an MLIR-AIE design.
 
-In this programming guide, we will be utilizing the python bindings for MLIR-AIE components to describe our design at the tile level of granularity. Later on, when we focus on kernel programming, we will explore vector programming in C/C++. But let's first look at a basic python source file (named [aie2.py](./aie2.py)) for an MLIR-AIE design.
-
-At the top of this python source, we include modules that define the mlir-aie dialect and the mlir ctx wrapper which encapsulates the definition of our AI Engine enabled device (e.g. xcvc1902) and its associated structural building blocks.
+## <ins>Walkthrough of python source file (aie2.py)</ins>
+At the top of this python source, we include modules that define the mlir-aie dialect and the mlir ctx wrapper which encapsulates the definition of our AI Engine enabled device (e.g. ipu or xcvc1902) and its associated structural building blocks.
 
 ```
 from aie.dialects.aie import *                     # primary mlir-aie dialect definitions
@@ -27,12 +26,12 @@ def mlir_aie_design():
     # ctx wrapper - to convert python to mlir
     with mlir_mod_ctx() as ctx:
 ```
-Within our ctx wrapper, we finally get down to declaring our AI Engine device via `@device(AIEDevice.xcvc1902)` and the blocks within the device. Inside the `def device_body():` , we instantiate our AI Engine blocks, which in this first example is simply the AI Engine tiles. The arguments for the tile delcaration are the tile coordinates (column, row) and we assign it a variable tile name in our python program.
+Within our ctx wrapper, we finally get down to declaring our AIE device via `@device(AIEDevice.ipu)` or `@device(AIEDevice.xcvc1902)` and the blocks within the device. Inside `def device_body():`, we instantiate our AI Engine blocks, which in this first example are simply AIE compute tiles. The arguments for the tile declaration are the tile coordinates (column, row), and we assign each declared tile to a variable in our python program.
 
-> **NOTE:**  The actual tile coordinates run on the device may deviate from the ones declared here. In Ryzen AI, for example, these coordinates tend to be relative corodinates as the runtime scheduler may assign it to a different available column.
+> **NOTE:**  The actual tile coordinates used on the device may deviate from the ones declared here. For example, on the NPU on Ryzen AI (`@device(AIEDevice.ipu)`), these coordinates tend to be relative coordinates, as the runtime scheduler may assign the design to a different available column.
 
 ```
-        # Dvice declaration - here using aie2 device xcvc1902
+        # Device declaration - here using aie2 device xcvc1902
         @device(AIEDevice.xcvc1902)
         def device_body():
 
@@ -41,7 +40,7 @@ Within our ctx wrapper, we finally get down to declaring our AI Engine device vi
             ComputeTile = tile(2, 3)
             ComputeTile = tile(2, 4)
 ```
-Once we are done declaring our blocks (and connections), we print the ctx wrapped design python defined design is converted to mlir and printed to stdout. Then we finish our python code by calling the structural design function.
+Once we are done declaring our blocks (and connections), we print the ctx-wrapped design: the python-defined design is converted to mlir and printed to stdout. We finish our python code by calling the structural design function that we defined.
 ```
     # print the mlir conversion
     print(ctx.module)
@@ -50,12 +49,29 @@ Once we are done declaring our blocks (and connections), we print the ctx wrappe
 mlir_aie_design()
 ```
 
+## <ins>Other Tile Types</ins>
+In addition to the compute tiles, an AIE-array also contains data movers for accessing L3 memory (also called shim DMAs) and larger L2 scratchpads (called mem tiles), which have been available since the AIE-ML generation - see [the introduction of this programming guide](../README.md). Declaring these other types of structural blocks follows the same syntax but requires physical layout details for the specific target device. Shim DMAs typically occupy row zero, while mem tiles (when available) often reside on the following row(s). The following code segment declares all the different tile types found in a single NPU column.
+
+```
+        # Device declaration - here using aie2 device ipu
+        @device(AIEDevice.ipu)
+        def device_body():
+
+            # Tile declarations
+            ShimTile     = tile(0, 0)
+            MemTile      = tile(0, 1)
+            ComputeTile1 = tile(0, 2)
+            ComputeTile2 = tile(0, 3)
+            ComputeTile3 = tile(0, 4)
+            ComputeTile4 = tile(0, 5)
+```
+
 ## <u>Exercises</u>
 1. To run our python program from the command line, we type `python3 aie2.py`, which converts our python structural design into mlir source code. This works from the command line if our design environment already contains the mlir-aie python-bound dialect module. We included this in the [Makefile](./Makefile) so go ahead and run `make` now. Then take a look at the generated mlir source under `build/aie.mlir`.
 
 2. Run `make clean` to remove the generated files. Then introduce an error into the python source, such as misspelling `tile` as `tilex`, and then run `make` again. What messages do you see? <img src="../../mlir_tutorials/images/answer1.jpg" title="There is a python error because tilex is not recognized." height=25>
 
 3. Run `make clean` again. Now change the error by renaming `tilex` back to `tile` but change the coordinates to (-1,3), which is an invalid location. Run `make` again. What messages do you see now? <img src="../../mlir_tutorials/images/answer1.jpg" title="No error is generated." height=25>
 
-4. No error is generated but our code is invalid. Take a look at the generated mlir code under `build/aie.mlir`. This generaed mlir syntax is invalid and running our mlir-aie tools on this mlir source will generate an error. We do, however, have some additional python structural syntax checks that can be enabled if change the `print(ctx.module)` to `print(ctx.module.operation.verify())`. Make this change and run `make` again. What message do you see now? <img src="../../mlir_tutorials/images/answer1.jpg" title="It now says column value fails to satisfy the constraint because the minimum value is 0" height=25> 
+4. No error is generated but our code is invalid. Take a look at the generated mlir code under `build/aie.mlir`. This generated mlir syntax is invalid and running our mlir-aie tools on this mlir source will generate an error. We do, however, have some additional python structural syntax checks that can be enabled if we change the `print(ctx.module)` to `print(ctx.module.operation.verify())`. Make this change and run `make` again. What message do you see now? <img src="../../mlir_tutorials/images/answer1.jpg" title="It now says column value fails to satisfy the constraint because the minimum value is 0" height=25> 
 
diff --git a/programming_guide/section-1/aie2.py b/programming_guide/section-1/aie2.py
@@ -14,7 +14,7 @@ def mlir_aie_design():
     # ctx wrapper - to convert python to mlir
     with mlir_mod_ctx() as ctx:
 
-        # Dvice declaration - aie2 device xcvc1902
+        # Device declaration - aie2 device xcvc1902
         @device(AIEDevice.xcvc1902)
         def device_body():
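
Review note: the "Other Tile Types" section added in this commit describes a row layout for a single NPU column (shim tile on row 0, mem tile on row 1, compute tiles on rows 2 to 5), and exercises 3 and 4 rely on the rule that coordinates have a minimum value of 0. The sketch below restates both in plain Python so they can be checked without the mlir-aie bindings installed. It is only an illustration: `NPU_ROW_KINDS` and `tile_kind` are hypothetical names, not part of the IRON or MLIR-AIE API, and the six-row layout is assumed from the example in the README rather than queried from a device model.

```python
# Hypothetical helper, NOT part of the mlir-aie / IRON API.
# Assumed layout, taken from the single NPU column declared in section 1:
# row 0 = shim tile, row 1 = mem tile, rows 2-5 = compute tiles.
NPU_ROW_KINDS = {
    0: "shim",
    1: "mem",
    2: "compute",
    3: "compute",
    4: "compute",
    5: "compute",
}


def tile_kind(col: int, row: int) -> str:
    """Return the assumed tile type at (col, row) in an NPU column."""
    if col < 0 or row < 0:
        # Mirrors the constraint from exercise 4: the minimum coordinate value is 0.
        raise ValueError(f"invalid tile coordinates ({col}, {row}): minimum value is 0")
    if row not in NPU_ROW_KINDS:
        raise ValueError(f"row {row} is outside the assumed 6-row NPU column")
    return NPU_ROW_KINDS[row]


if __name__ == "__main__":
    # Walk one column from the shim tile up to the last compute tile.
    for row in range(6):
        print((0, row), tile_kind(0, row))
```

Unlike `tile(-1, 3)` in exercise 3, which only fails once the verifier runs, this helper rejects a negative coordinate immediately. That is a design choice of the sketch, not a description of how the bindings behave.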