Merge branch 'asplos' into tututorialDescription
denolf authored Apr 9, 2024
2 parents af0cd66 + 46d0bbf commit 8bdc26d
Showing 9 changed files with 336 additions and 21 deletions.
66 changes: 66 additions & 0 deletions programming_examples/utils/test_utils.py
@@ -0,0 +1,66 @@
# test_utils.py -*- Python -*-
#
# Copyright (C) 2024, Advanced Micro Devices, Inc. All rights reserved.
# SPDX-License-Identifier: MIT

import argparse

# options
def parse_args(args):
    p = argparse.ArgumentParser()
    p.add_argument(
        "-x", "--xclbin", required=True, dest="xclbin", help="the input xclbin path"
    )
    p.add_argument(
        "-k",
        "--kernel",
        required=True,
        dest="kernel",
        default="MLIR_AIE",
        help="the kernel name in the XCLBIN (for instance MLIR_AIE)",
    )
    p.add_argument(
        "-v", "--verbosity", default=0, type=int, help="the verbosity of the output"
    )
    p.add_argument(
        "-i",
        "--instr",
        dest="instr",
        default="instr.txt",
        help="path of file containing userspace instructions sent to the NPU",
    )
    p.add_argument(
        "--verify",
        dest="verify",
        default=True,
        help="whether to verify the AIE computed output",
    )
    p.add_argument(
        "--iters",
        dest="iters",
        default=1,
        type=int,
        help="number of benchmark iterations",
    )
    p.add_argument(
        "--warmup",
        dest="warmup_iters",
        default=0,
        type=int,
        help="number of warmup iterations",
    )
    p.add_argument(
        "-t",
        "--trace_sz",
        dest="trace_size",
        default=0,
        type=int,
        help="trace size in bytes",
    )
    p.add_argument(
        "--trace_file",
        dest="trace_file",
        default="trace.txt",
        help="where to store trace output",
    )
    return p.parse_args(args)
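
A short usage sketch for a parser of this shape. It rebuilds only three of the options so that it runs on its own; the `str2bool` helper, the `build_parser` name, and the sample file name are hypothetical and not part of the commit. It also illustrates an argparse behaviour worth keeping in mind: without a `type`, any value passed to `--verify` (including the text `false`) arrives as a non-empty string, which is truthy. Separately, a `default` on an option marked `required=True` is never used, because the option must always be supplied.

```python
import argparse


def str2bool(text):
    # Hypothetical helper (not in the commit): map common spellings to a bool.
    value = text.strip().lower()
    if value in ("1", "true", "yes", "on"):
        return True
    if value in ("0", "false", "no", "off"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {text!r}")


def build_parser():
    # Reduced stand-in for parse_args above: only three options are rebuilt.
    p = argparse.ArgumentParser()
    p.add_argument("-x", "--xclbin", required=True, dest="xclbin")
    p.add_argument("--verify", dest="verify", default=True, type=str2bool)
    p.add_argument("--iters", dest="iters", default=1, type=int)
    return p


# Sample invocation; "final.xclbin" is a made-up file name.
opts = build_parser().parse_args(
    ["-x", "final.xclbin", "--verify", "false", "--iters", "10"]
)
print(opts.xclbin, opts.verify, opts.iters)

# Without type=..., the same flag yields the string "false", which is truthy.
untyped = argparse.ArgumentParser()
untyped.add_argument("--verify", default=True)
raw = untyped.parse_args(["--verify", "false"]).verify
print(repr(raw), bool(raw))
```

Whether `--verify` should take a value at all, or be a plain on/off flag, is a design choice this sketch does not settle.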
4 changes: 0 additions & 4 deletions programming_examples/utils/unsign.sh

This file was deleted.

6 changes: 6 additions & 0 deletions programming_guide/quick_reference.md
@@ -18,6 +18,12 @@

## Object FIFO Bindings

| Syntax | Definition | Example | Notes |
|--------|------------|---------|-------|
| \<name\> = object_fifo(name, producerTile, consumerTiles, depth, datatype) | Initialize Object FIFO | of0 = object_fifo("objfifo0", A, B, 3, T.memref(256, T.i32())) | The `producerTile` and `consumerTiles` inputs are AI Engine tiles. The `consumerTiles` may also be specified as an array of tiles for multiple consumers. |
| \<name\> = \<objfifo_name\>.acquire(port, num_elem) | Acquire from Object FIFO | elem0 = of0.acquire(ObjectFifoPort.Produce, 1) | The `port` input is either `ObjectFifoPort.Produce` or `ObjectFifoPort.Consume`. The output may be either a single object or an array of objects which can then be indexed in an array-like fashion. |
| \<objfifo_name\>.release(port, num_elem) | Release from Object FIFO | of0.release(ObjectFifoPort.Consume, 2) | The `port` input is either `ObjectFifoPort.Produce` or `ObjectFifoPort.Consume`. |
| object_fifo_link(fifoIns, fifoOuts) | Create a link between Object FIFOs | object_fifo_link(of0, of1) | The inputs `fifoIns` and `fifoOuts` may be either a single Object FIFO or a list of them. Both can be specified either using their Python variables or their names. Currently, if one of the two inputs is a list of Object FIFOs then the other can only be a single Object FIFO. |

## Python helper functions
| Function | Description |
13 changes: 12 additions & 1 deletion programming_guide/section-2/README.md
@@ -20,16 +20,27 @@ To understand the need for a data movement abstraction we must first understand

*Note: For more in-depth, low-level material on Object FIFO programming in MLIR please see the MLIR-AIE [tutorials](../mlir_tutorials).*

This guide is split into three sections, where each section builds on top of the previous ones:
This guide is split into four sections, where each section builds on top of the previous ones:

<details><summary><a href="./section-2a">Section 2a - Introduction</a></summary>

* Initializing an Object FIFO
* Accessing the objects of an Object FIFO
* Object FIFOs with same producer / consumer
</details>
<details><summary><a href="./section-2b">Section 2b - Key Object FIFO Patterns</a></summary>

* Introduce data movement patterns supported by the Object FIFO
* Reuse
* Broadcast
* Distribute
* Join
</details>
<details><summary><a href="./section-2c">Section 2c - Data Layout Transformations</a></summary>

* Introduce data layout transformation capabilities
</details>
<details><summary><a href="./section-2d">Section 2d - Programming for multiple cores</a></summary>

* Walkthrough of the process of efficiently upgrading to designs with multiple cores
</details>
46 changes: 36 additions & 10 deletions programming_guide/section-2/section-2a/README.md
@@ -32,7 +32,7 @@ We will now go over each of the inputs, what they represents and why they are re

First of all, an Object FIFO has a unique `name`. It functions as an ordered buffer that has `depth`-many objects of specified `datatype`. Currently, all objects in an Object FIFO have to be of the same datatype. The datatype is a tensor-like attribute where the size of the tensor and the type of the individual elements are specified at the same time (e.g. `<16xi32>`). The `depth` can be either an integer or an array of integers. The latter is used to support a specific dependency that can arise when working with multiple Object FIFOs and it is further explained in the Key Object FIFO Patterns [section](../section-2b/README.md#broadcast).

An Object FIFO is created between a producer or source tile and a consumer or destination tile. Below, you can see an example of an Object FIFO created between producer tile A and consumer tile B:
An Object FIFO is created between a producer, or source tile, and a consumer, or destination tile. The tiles are where producer and consumer processes accessing the Object FIFO will be executed. Below, you can see an example of an Object FIFO created between producer tile A and consumer tile B:
```
A = tile(1, 2)
B = tile(1, 3)
@@ -44,39 +44,65 @@ As you will see in the Key Object FIFO Patterns [section](../section-2b/README.m

### Accessing the objects of an Object FIFO

An Object FIFO can be accessed by the processes running on the producer and consumer tiles registered to it. Before a process can have access to the objects it has to acquire them from the Object FIFO. This is because the Object FIFO is a synchronized communication primitive and two processes may not access the same object at the same time. Once a process has finished working with an object and has no further use for it, it should release it so that another process will be able to acquire and access it. The patterns in which a producer or a consumer process acquires and releases objects from an Object FIFO are called `access patterns`. We can specifically refer to the acquire and release patterns as well.
An Object FIFO can be accessed by the processes running on the producer and consumer tiles registered to it. Before a process can have access to the objects it has to acquire them from the Object FIFO. This is because the Object FIFO is a synchronized communication primitive that leverages the synchronization mechanism available in the target hardware architecture to ensure that two processes can't access the same object at the same time. Once a process has finished working with an object and has no further use for it, it should release it so that another process will be able to acquire and access it. The patterns in which a producer or a consumer process acquires and releases objects from an Object FIFO are called `access patterns`. We can specifically refer to the acquire and release patterns as well.

To acquire one or multiple objects users should use the acquire function of the `object_fifo` class:
```
def acquire(self, port, num_elem)
```
Based on the `num_elem` input representing the number of acquired elements, the acquire function will either directly return an object, or an array of objects that can be accessed in an array-like fashion.
Based on the `num_elem` input representing the number of acquired elements, the acquire function will either directly return an object, or an array of objects that can be accessed in an array-like fashion.

The Object FIFO is an ordered primitive, and for each process the API keeps track of which object it will have access to next when acquiring, based on how many objects it has already acquired and released. Specifically, the first time a process acquires an object it will have access to the first object of the Object FIFO, and after releasing it and acquiring a new one, it will have access to the second object, and so on until the last object, after which the order starts from the first one again. When acquiring multiple objects and accessing them in the returned array, the object at index 0 will always be the <u>oldest</u> object that the process has access to, which may not be the first object in the pool of that Object FIFO.
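
The ordering described above can be pictured with a small plain-Python model. This is only an illustrative sketch, not the MLIR-AIE API: the `acquire_indices` function and its parameters are hypothetical, and it assumes the process holds no objects at the moment it acquires.

```python
def acquire_indices(depth, already_released, num_elem):
    """Toy model: pool slots seen by a process that currently holds nothing."""
    # The oldest accessible object comes first; slots wrap around after the last one.
    return [(already_released + i) % depth for i in range(num_elem)]


depth = 3
released = 0
history = []
for _ in range(5):
    (index,) = acquire_indices(depth, released, 1)
    history.append(index)
    released += 1  # the single object is released before the next acquire
print(history)  # [0, 1, 2, 0, 1]

# After one release, acquiring two objects returns pool slots 1 and 2,
# so index 0 of the result is not the first object of the pool.
print(acquire_indices(depth, 1, 2))  # [1, 2]
```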

To release one or multiple objects users should use the release function of the `object_fifo` class:
```
def release(self, port, num_elem)
```
A process may release one, some or all of the objects it has acquired. The release function will release objects from oldest to youngest in acquired order. If a process does not release all of the objects it has acquired, then the next time it acquires objects the oldest objects will be those that were not released. This functionality is intended to achieve the behaviour of a sliding window through the Object FIFO primitive. (TODO: add link to ref design or subsection) (TODO: merge PR to make the port optional) (TODO: make it clear that to access old unreleased objects, users should use the result of the new acquire)
A process may release one, some or all of the objects it has acquired. The release function will release objects from oldest to youngest in acquired order. If a process does not release all of the objects it has acquired, then the next time it acquires objects the oldest objects will be those that were not released. This functionality is intended to achieve the behaviour of a sliding window through the Object FIFO primitive. This is described further in the Key Object FIFO Patterns [section](../section-2b/README.md#reuse).

When acquiring the objects of an Object FIFO using the acquire function it is important to note that any <u>unreleased objects from a previous acquire</u> will also be returned by the <u>most recent</u> acquire call. Unreleased objects will not be reacquired in the sense that the synchronization mechanism used under the hood has already been set in place such that the process already has the sole access rights to the unreleased objects from the previous acquire. As such, two acquire calls back-to-back without a release call in-between will result in the same objects being returned by both acquire calls. This decision was made to facilitate the understanding of releasing objects between calls to the acquire function as well as to ensure a proper lowering through the Object FIFO primitive. A code example of this behaviour is available in the Key Object FIFO Patterns [section](../section-2b/README.md#reuse).
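
A minimal plain-Python sketch of this behaviour, seen from a single process. The `ToyObjectFifoView` class is hypothetical and is not part of the Object FIFO bindings; it ignores the other end of the FIFO entirely and only tracks which pool slots the process holds.

```python
class ToyObjectFifoView:
    """One process's view of an Object FIFO pool (illustrative model only)."""

    def __init__(self, depth):
        self.depth = depth
        self.next_index = 0  # next pool slot this process has not yet acquired
        self.held = []       # slots acquired and not yet released, oldest first

    def acquire(self, num_elem):
        if num_elem > self.depth:
            raise ValueError("cannot hold more objects than the FIFO depth")
        # Objects that are still held are returned again; only the rest are new.
        while len(self.held) < num_elem:
            self.held.append(self.next_index)
            self.next_index = (self.next_index + 1) % self.depth
        return list(self.held[:num_elem])

    def release(self, num_elem):
        if num_elem > len(self.held):
            raise ValueError("cannot release more objects than are held")
        del self.held[:num_elem]  # oldest objects are released first


view = ToyObjectFifoView(depth=3)
first = view.acquire(2)   # slots [0, 1]
second = view.acquire(2)  # no release in between: the same slots again
view.release(1)           # drop the oldest object only
third = view.acquire(2)   # sliding window: slots [1, 2]
print(first, second, third)
```

Releasing one object and acquiring two again moves the window forward by one slot, which is the sliding-window pattern mentioned above.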

Below you can see an example of two processes that are accessing the `of0` Object FIFO that we initialized in the previous section, one running on the producer tile and the other on the consumer tile. The producer process runs a loop of ten iterations and during each of them it acquires one object from `of0`, calls a `test_func` function on the acquired object, and releases the object. The consumer process only runs once and acquires two objects from `of0`. It then calls a `test_func2` function to which it gives as input each of the two objects it acquired, before releasing them both at the end.
The `port` input of both the acquire and the release functions indicates whether the calling process is a producer or a consumer process, and it is an important hint for the Object FIFO lowering to properly leverage the underlying synchronization mechanism. Its value may be either `ObjectFifoPort.Produce` or `ObjectFifoPort.Consume`. Note, however, that the terms producer and consumer mainly provide a logical reference for a human user to keep track of which process is at which end of the data movement; they <u>do not restrict the behaviour of that process</u>, i.e., a producer process may simply access an object to read it and is not required to modify it.

Below you can see an example of two processes that are <u>iterating over the objects of the Object FIFO</u> `of0` that we initialized in the previous section, one running on the producer tile and the other on the consumer tile. To do this, the producer process runs a loop of three iterations, equal to the depth of `of0`, and during each iteration it acquires one object from `of0`, calls a `test_func` function on the acquired object, and releases the object. The consumer process only runs once and acquires all three objects from `of0` at once and stores them in the `elems` array, from which it can <u>access each object individually in any order</u>. It then calls a `test_func2` function three times and in each call it gives as input one of the objects it acquired, before releasing all three objects at the end.
```
A = tile(1, 2)
B = tile(1, 3)
of0 = object_fifo("objfifo0", A, B, 3, T.memref(256, T.i32()))
@core(A)
def core_body():
    for _ in range_(10):
    for _ in range_(3):
        elem0 = of0.acquire(ObjectFifoPort.Produce, 1)
        call(test_func, [elem0])
        of0.release(ObjectFifoPort.Produce, 1)
        yield_([])
@core(B)
def core_body():
    elems = of0.acquire(ObjectFifoPort.Consume, 2)
    call(test_func2, [elems[0], elems[1]])
    of0.release(ObjectFifoPort.Consume, 2)
    elems = of0.acquire(ObjectFifoPort.Consume, 3)
    call(test_func2, [elems[0]])
    call(test_func2, [elems[1]])
    call(test_func2, [elems[2]])
    of0.release(ObjectFifoPort.Consume, 3)
```
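
The producer and consumer pattern above can be mimicked on a host machine with plain Python threads. This is a sketch under stated assumptions: the two counting semaphores are an assumed stand-in for the hardware synchronization mechanism, which the text does not detail, and `producer`, `consumer`, `pool` and the stored values are all hypothetical names and data.

```python
import threading

DEPTH = 3
pool = [None] * DEPTH                    # stands in for the three objects of of0
free_slots = threading.Semaphore(DEPTH)  # objects the producer may still acquire
filled_slots = threading.Semaphore(0)    # objects ready for the consumer
consumed = []


def producer():
    for i in range(DEPTH):
        free_slots.acquire()             # plays the role of acquire(Produce, 1)
        pool[i] = i * 10                 # stand-in for test_func writing the object
        filled_slots.release()           # plays the role of release(Produce, 1)


def consumer():
    for _ in range(DEPTH):               # acquire(Consume, 3): wait for all three
        filled_slots.acquire()
    for value in pool:                   # stand-in for the three test_func2 calls
        consumed.append(value)
    for _ in range(DEPTH):               # release(Consume, 3)
        free_slots.release()


threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(consumed)  # [0, 10, 20]
```

The consumer thread is started first on purpose: it blocks until the producer has released all three objects, so the result does not depend on thread scheduling.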

### Object FIFOs with same producer / consumer

An Object FIFO can be created with the same tile as both its producer and consumer tile. This is mostly done in order to ensure proper synchronization within the process itself, as opposed to synchronization across multiple processes running on different tiles as we've seen in examples up until this point. All of the functionalities described up until this point apply in the same way. Below is an example of how such an Object FIFO can be initialized and accessed:
```
A = tile(1, 2)
of0 = object_fifo("objfifo0", A, A, 3, T.memref(256, T.i32()))
TODO: add description of initializing an OF with the same producer and consumer tile
@core(A)
def core_body():
    for _ in range_(3):
        elem0 = of0.acquire(ObjectFifoPort.Produce, 1)
        call(test_func, [elem0])
        of0.release(ObjectFifoPort.Produce, 1)
        elem1 = of0.acquire(ObjectFifoPort.Consume, 1)
        call(test_func2, [elem1])
        of0.release(ObjectFifoPort.Consume, 1)
        yield_([])
```