Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aievec] Add new shuffle ops #1516

Merged
merged 1 commit into from
May 30, 2024
Merged

Conversation

jsetoain
Copy link
Collaborator

This replaces the old aievec.shuffle op with a new one with a better syntax that supports all cases supported by the intrinsics, and has strong type guarantees.

For legacy purposes, we leave the old shuffle instruction renamed as aievec.legacyshuffle.

os << emitter.getOrCreateName(rhs);
os << ", ";
}
os << "eShuffleMode::shuffle_T" << &stringifyEnum(mode).data()[1];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we can do substr(1) here, which is easier to understand...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup! You caught me being lazy 😂

return emitError() << "shuffle mode '" << stringifyEnum(mode)
<< "' requires vectors of " << modeBitWidth
<< "-bit elements";

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also check if the operands and results are in 512-bit vectors?

Copy link
Collaborator Author

@jsetoain jsetoain May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's being checked by the type constraints in the op definition, this exclusively verifies whether the shuffle mode is "compatible" with the element type and number of operands. It's up for debate whether we want to impose this restriction or not, since the hardware is agnostic to the mode (it just shuffles bytes around), but I'd rather be conservative for the time being.

@@ -38,7 +38,7 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.endianness"
// CHECK: %[[C16:.*]] = arith.constant 16 : index
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[T0:.*]] = aievec.upd %[[A1]][%[[C0:.*]]] {index = 0 : i8, offset = 0 : i32} : memref<?xi8>, vector<64xi8>
// CHECK: %[[T1:.*]] = aievec.shuffle %[[T0:.*]] {mode = 0 : i32} : vector<64xi8>, vector<64xi8>
// CHECK: %[[T1:.*]] = aievec.legacyshuffle %[[T0:.*]] {mode = 0 : i32} : vector<64xi8>, vector<64xi8>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the mode=0 (shuffle_T8_64x2_lo), I'm a bit confused why this can work without a second operand...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Shuffle" always takes two arguments. When translating to C++, the single operand "shuffle" C++ intrinsic calls a two operand version with "undef" in the rhs. For the new shuffle, we will have to inject the undef when lowering to LLVM IR. With the new constraints, I'm just trying to keep things semantically consistent.

@@ -80,7 +80,7 @@ func.func @conv2d (%A: memref<18x288xi8>, %B: memref<48xi8>, %C: memref<16x256xi
// CHECK: %[[C8:.*]] = arith.constant 8 : i32
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[T0:.*]] = aievec.upd %[[A1]][%[[C0]]] {index = 0 : i8, offset = 0 : i32} : memref<48xi8>, vector<64xi8>
// CHECK: %[[T1:.*]] = aievec.shuffle %[[T0]] {mode = 0 : i32} : vector<64xi8>, vector<64xi8>
// CHECK: %[[T1:.*]] = aievec.legacyshuffle %[[T0]] {mode = 0 : i32} : vector<64xi8>, vector<64xi8>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one too...

AIEVec_ShuffleModeAttr:$mode)>,
Results<(outs AnyVector:$result)> {
let summary = "AIE2 shuffle";
let description = [{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nice and clear explanation!

Copy link
Collaborator

@david-vc david-vc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work! It's great to see the symbolic mode in the textual representation.

include/aie/Dialect/AIEVec/IR/AIEVecOps.td Show resolved Hide resolved

Shuffle Mode | Operands | Types Supported
:------------------:|:------------------:|:------------------:
t8_8x4 | `lhs` | `vector<2x32xi8>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entries in the 'Types Supported' seem inconsistent.
I would expect to see only 1D vectors. So for the entry here, I would expect vector<64xi8>.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something I'd like to discuss. I stumbled upon this "issue" when looking over each of the shuffle modes. For some of them, like this one, it performs the transposition on the two halves of the vector, as if we had two (in this case) 8x4 matrices. I'm inclined to treat it as a flat 64xi8 and this would be yet another set of "special" cases. The ISA is not particularly semantically consistent. It does what it does, except when it doesn't 😉

t8_64x2_hi | ^ | ^
t8_2x64_lo | ^ | ^
t8_2x64_hi | ^ | ^
t16_4x2 | `lhs` | `vector<4x8xi16>` or `vector<4x8xbf16>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 3rd column should read vector<8xi16> or vector<8xbf16>.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anything, vector<32xi16>/vector<32xbf16>. See above.

This replaces the old `aievec.shuffle` op with a new one with a better
syntax that supports all cases supported by the intrinsics, and has
strong type guarantees.

For legacy purpose, we leave the old shuffle instruction renamed as
`aievec.legacyshuffle`.
@jsetoain jsetoain added this pull request to the merge queue May 30, 2024
Merged via the queue into Xilinx:main with commit 652933e May 30, 2024
51 checks passed
@jsetoain jsetoain deleted the add-aievec-shuffle-ops branch May 30, 2024 10:39
singagan pushed a commit that referenced this pull request Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants