[aievec] to-llvm flow for aievec.ext op with I128.I512 extract intrinsic #1490

jamestcl-amd · 2024-05-15T20:47:59Z

This PR adds support for lowering the aievec.ext op through the to-llvm flow using the extract.I128.I512 intrinsic op.

Changes:

Update the extract.I128.I512 intrinsic ops in the XLLVM dialect.
Add an aievec-to-llvm conversion pattern and tests for lowering the aievec.ext op to the extract.I128.I512 intrinsic.
Add a test for translating to external LLVM.

jsetoain · 2024-05-16T08:49:08Z

test/Conversion/AIEVecToLLVM/test-ext.mlir

+// CHECK-SAME: (vector<16xi32>) -> vector<4xi32>
+// CHECK-NEXT: %[[RES0:.*]] = llvm.bitcast %[[EXT0]] : vector<4xi32> to vector<4xf32>
+// CHECK-NEXT: %[[UNDEF1:.*]] = "xllvm.intr.aie2.v16int32"() : () -> vector<16xi32>
+// CHECK-NEXT: %[[CST48:.*]] = llvm.mlir.constant(48 : i32) : i32


Is this correct? If I understand correctly, the process to extract a subvector from a given position is to shift the whole vector by the number of bits up to that position, and then extract the lowest part. For 16-bit element vectors and an extraction from position 3, that's 3 x 16 = 48 bits, but for 32-bit element vectors, shouldn't it be 3 x 32 = 96 bits? Am I misunderstanding something?

The shift amount for the shift op/intrinsic is given in bytes, not bits. The I128.I512 extract intrinsic is special: it always extracts the lowest 128-bit vector from the 512-bit source vector. For an index=0 extraction, we don't need any vector shift beforehand, so I have updated the conversion pattern for this scenario. For an index=3 extraction, the shift amount is 48 bytes = 384 bits = 3*128 bits. After the right shift by 48 bytes, the I128.I512 intrinsic extracts the lowest 128-bit vector, which is the correct result. I have also updated the aievec.ext/shift op descriptions to explicitly state that the shift amount is in bytes and that the extraction index ranges over either 0--1 or 0--3. Let me know if there is anything else I can do :).
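The byte arithmetic above can be checked with a small Python model. This is only a sketch of the described semantics, not the actual lowering or intrinsic: the function names (`shift_amount_bytes`, `shift_right_bytes`, `extract_low_128`, `ext_128_from_512`) are hypothetical, the 512-bit vector is modelled as a list of 16 32-bit lanes, and the lanes vacated by the shift are filled with 0 purely as a modelling assumption.

```python
LANE_BYTES = 4      # one 32-bit lane is 4 bytes
SUBVEC_BYTES = 16   # a 128-bit subvector is 16 bytes
SUBVEC_LANES = SUBVEC_BYTES // LANE_BYTES  # 4 lanes of 32 bits


def shift_amount_bytes(index):
    """Byte shift needed before extracting 128-bit subvector `index` (0..3)."""
    if not 0 <= index <= 3:
        raise ValueError("index must be in 0..3 for a 128-from-512 extraction")
    return index * SUBVEC_BYTES


def shift_right_bytes(vec, nbytes):
    """Model of a right shift by `nbytes`: drop the lowest lanes.

    Assumption: vacated upper lanes are filled with 0 in this model only.
    """
    lanes = nbytes // LANE_BYTES
    return vec[lanes:] + [0] * lanes


def extract_low_128(vec):
    """Model of an extract that always returns the lowest 128 bits."""
    return vec[:SUBVEC_LANES]


def ext_128_from_512(vec, index):
    """Shift by index * 16 bytes (skipped for index 0), then take the low 128 bits."""
    shift = shift_amount_bytes(index)
    if shift == 0:
        return extract_low_128(vec)
    return extract_low_128(shift_right_bytes(vec, shift))


source = list(range(16))                 # lanes 0..15 of a vector<16xi32>
print(shift_amount_bytes(3))             # 48
print(ext_128_from_512(source, 3))       # [12, 13, 14, 15]
```

In this model the shift amount depends only on the 128-bit subvector index, not on the element width, which is why 48 shows up for both 16-bit and 32-bit element vectors.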

Aaaah! Got it, I was trying to make the maths work in my head and failing, thanks for the clarification 🙂

jsetoain

Looks good to me!

jamestcl-amd added 4 commits May 15, 2024 13:40

Correct the I128.I512 ext intrinsic in xllvm

b70dc12

update xllvm to external translation test

f7bcd50

Add conversion pattern for I128.I512 ext intrinsic op

48977e4

Add lit test converage for the conversion pattern

c2ca3dd

jamestcl-amd requested a review from jsetoain as a code owner May 15, 2024 20:47

jamestcl-amd requested a review from david-vc May 15, 2024 20:48

jsetoain reviewed May 16, 2024

jamestcl-amd added 2 commits May 16, 2024 12:59

Adding explicit clarification of the semantic of aievec.ext/shift ops

ca17f59

Improve the lowering pattern

a30f0e5

jsetoain approved these changes May 16, 2024

Merge branch 'main' into aievec_ext_llvm_128_512

5cefa6a

jamestcl-amd enabled auto-merge May 16, 2024 23:13

jamestcl-amd added this pull request to the merge queue May 16, 2024

Merged via the queue into Xilinx:main with commit 3dc49bf May 17, 2024
51 checks passed

jamestcl-amd deleted the aievec_ext_llvm_128_512 branch May 17, 2024 00:23
