From e52503d540d895904d541d4b32a44380452f3540 Mon Sep 17 00:00:00 2001
From: Andra Bisca
Date: Thu, 25 Apr 2024 19:59:00 +0200
Subject: [PATCH] [asplos] Add section summaries to section2 (#1410)

---
 programming_guide/section-2/section-2a/README.md | 11 +++++++++++
 programming_guide/section-2/section-2b/README.md | 11 +++++++++++
 programming_guide/section-2/section-2c/README.md | 11 +++++++++++
 programming_guide/section-2/section-2d/README.md | 11 +++++++++++
 programming_guide/section-2/section-2e/README.md | 11 +++++++++++
 programming_guide/section-2/section-2f/README.md | 11 +++++++++++
 programming_guide/section-2/section-2g/README.md | 11 +++++++++++
 7 files changed, 77 insertions(+)

diff --git a/programming_guide/section-2/section-2a/README.md b/programming_guide/section-2/section-2a/README.md
index 2eb790472e..0876393bf3 100644
--- a/programming_guide/section-2/section-2a/README.md
+++ b/programming_guide/section-2/section-2a/README.md
@@ -10,6 +10,17 @@
 # Section 2a - Introduction
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * Section 2a - Introduction
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * [Section 2e - Practical Examples](../section-2e/)
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 ### Initializing an Object FIFO
 An Object FIFO represents the data movement connection between a point A and a point B. In the AIE array, these points are AIE tiles (see [Section 1 - Basic AI Engine building blocks](../../section-1/)). Under the hood, the data movement configuration for different types of tiles (Shim tiles, Mem tiles, and compute tile) is different, but there is no difference between them when using an Object FIFO.
diff --git a/programming_guide/section-2/section-2b/README.md b/programming_guide/section-2/section-2b/README.md
index ac883aef17..eec84db2c4 100644
--- a/programming_guide/section-2/section-2b/README.md
+++ b/programming_guide/section-2/section-2b/README.md
@@ -10,6 +10,17 @@
 # Section 2b - Key Object FIFO Patterns
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * Section 2b - Key Object FIFO Patterns
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * [Section 2e - Practical Examples](../section-2e/)
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 The Object FIFO primitive supports several data movement patterns. We will now describe each of the currently supported patterns in three subsections and provide links to more in-depth practical code examples that showcase each of them.
 Object FIFO Reuse Pattern
diff --git a/programming_guide/section-2/section-2c/README.md b/programming_guide/section-2/section-2c/README.md
index d1a7ff98ad..8c5c36480d 100644
--- a/programming_guide/section-2/section-2c/README.md
+++ b/programming_guide/section-2/section-2c/README.md
@@ -10,6 +10,17 @@
 # Section 2c - Data Layout Transformations
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * Section 2c - Data Layout Transformations
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * [Section 2e - Practical Examples](../section-2e/)
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 While the Object FIFO primitive aims to reduce the complexity tied to data movement configuration on the AI Engine array, it also gives the user control over some advanced features of the underlying architecture. One such feature is the ability to do data layout transformations on the fly using the tile's dedicated hardware: the Data Movement Accelerators (DMAs). **This is available on AIE-ML devices.** Tile DMAs interact directly with the memory modules of their tiles and are responsible for pushing and retrieving data to and from the AXI stream interconnect. When data is pushed onto the stream, the user can program the DMA's n-dimensional address generation scheme such that the data's layout when pushed may be different than how it is stored in the tile's local memory. In the same way, a user can also specify in what layout a DMA should store the data retrieved from the AXI stream.
diff --git a/programming_guide/section-2/section-2d/README.md b/programming_guide/section-2/section-2d/README.md
index d9008649c2..f36a1a494b 100644
--- a/programming_guide/section-2/section-2d/README.md
+++ b/programming_guide/section-2/section-2d/README.md
@@ -10,6 +10,17 @@
 # Section 2d - Programming for multiple cores
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * Section 2d - Programming for multiple cores
+    * [Section 2e - Practical Examples](../section-2e/)
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 This section will focus on the process of taking code written for a single core and transforming it into a design with multiple cores relatively quickly. For this we will start with the code in [aie2.py](./aie2.py) which contains a simple design running on a single compute tile, and progressively turn it into the code in [aie2_multi.py](./aie2_multi.py) which contains the same design that distributes the work to three compute tiles. The first step in the design is the tile declaration. In the simple design we use one Shim tile to bring data from external memory into the AIE array inside of a Mem tile that will then send the data to a compute tile, wait for the output and send it back to external memory through the Shim tile.
 Below is how those tiles are declared in the simple design:
diff --git a/programming_guide/section-2/section-2e/README.md b/programming_guide/section-2/section-2e/README.md
index 18dea5ecea..5763e10de9 100644
--- a/programming_guide/section-2/section-2e/README.md
+++ b/programming_guide/section-2/section-2e/README.md
@@ -10,6 +10,17 @@
 # Section 2e - Practical Examples
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * Section 2e - Practical Examples
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 This section introduces several examples with common Object FIFO data movement patterns. These examples are intended to be simple enough so as to be easily imported and adapted into other designs.
 Example 01 - Single / Double Buffer
diff --git a/programming_guide/section-2/section-2f/README.md b/programming_guide/section-2/section-2f/README.md
index 9f06315a1a..5b389e3ba5 100644
--- a/programming_guide/section-2/section-2f/README.md
+++ b/programming_guide/section-2/section-2f/README.md
@@ -10,6 +10,17 @@
 # Section 2f - Data Movement Without Object FIFOs
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * [Section 2e - Practical Examples](../section-2e/)
+    * Section 2f - Data Movement Without Object FIFOs
+    * [Section 2g - Runtime Data Movement](../section-2g/)
+
+-----
+
 Not all data movement patterns can be described with Object FIFOs. This **advanced** section goes into detail about how a user can express data movement using the Data Movement Accelerators (or `DMA`) on AIE tiles. To better understand the code and concepts introduced in this section it is recommended to first read the [Advanced Topic of Section - 2a on DMAs](../section-2a/README.md/#advanced-topic--data-movement-accelerators). The AIE architecture currently has three different types of tiles: compute tiles, referred to as "tile", memory tiles referred to as "Mem tiles", and external memory interface tiles referred to as "Shim tiles". Each of these tiles have their own attributes regarding compute capabilities and memory capacity, but the base design of their DMAs is the same.
 The different types of DMAs can be intialized using the constructors in [aie.py](../../../python/dialects/aie.py):
diff --git a/programming_guide/section-2/section-2g/README.md b/programming_guide/section-2/section-2g/README.md
index db52d0b827..1ed9ae0f4b 100644
--- a/programming_guide/section-2/section-2g/README.md
+++ b/programming_guide/section-2/section-2g/README.md
@@ -10,6 +10,17 @@
 # Section 2g - Runtime Data Movement
+* [Section 2 - Data Movement (Object FIFOs)](../../section-2/)
+    * [Section 2a - Introduction](../section-2a/)
+    * [Section 2b - Key Object FIFO Patterns](../section-2b/)
+    * [Section 2c - Data Layout Transformations](../section-2c/)
+    * [Section 2d - Programming for multiple cores](../section-2d/)
+    * [Section 2e - Practical Examples](../section-2e/)
+    * [Section 2f - Data Movement Without Object FIFOs](../section-2f/)
+    * Section 2g - Runtime Data Movement
+
+-----
+
 In the preceding sections, we looked at how we can describe data movement between tiles *within* the AIE-array. However, to do anything useful, we need to get data from outside the array, i.e. from the "host", into the AIE-array and back. On NPU devices, we can achieve this with the operations described in this section. The operations that will be described in this section must be placed in a separate `sequence` function. The arguments to this function describe buffers that will be available on the host side; the body of the function describes how those buffers are moved into the AIE-array. [Section 3](../../../programming_examples/) contains an example.