forked from Xilinx/mlir-aie
Commit
Signed-off-by: Abhishek Varma <abhvarma@amd.com> Co-authored-by: Javier Setoain <jsetoain@users.noreply.github.com> Co-authored-by: James Newling <james.newling@gmail.com> Co-authored-by: Maksim Levental <maksim.levental@gmail.com> Co-authored-by: Joseph Melber <jgmelber@gmail.com> Co-authored-by: Abhishek Varma <abhvarma@amd.com> Co-authored-by: erwei-xilinx <erweiw@xilinx.com> Co-authored-by: AndraBisca <andrab@amd.com> Started writing the objfifo intro tutorial Vectorize vec scalar (Xilinx#1135) Added new programming guide section placedholders (Xilinx#1138) [EXAMPLE] An element-wise add example (Xilinx#1148) Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jack Lo <jack.lo@amd.com> Moved objfifo design example 1 to programming_guide sections 3. Started objectFifo programming guide Continue section 3 guide. ObjFifo guide: access patterns Add example to objFifo guide Update objfifo guide Separate section 3 of the guide into 3 subsections [ASPLOS] Weight expand asplos (Xilinx#1158) Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: Phil James-Roxby <pjr@amd.com> Reorganize subsections in section 3 of the guide. 
Updated sections 3a and 3b Pjr vector exp (Xilinx#1166) Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: Joseph Melber <jgmelber@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> [SOFTMAX] Single column rapid test (Xilinx#1168) Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: Joseph Melber <jgmelber@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Vector softmax (Xilinx#1172) Co-authored-by: pjr <pjr@xilinx.com> [MERGE] This has gone horribly wrong, so fixing up in place Update exp.cc Merge resolved Added new programming guide section placedholders (Xilinx#1174) Swapped section-2 and section-3 (Xilinx#1177) Update section-1 and section-3 examples (Xilinx#1179) Update section-3 (Xilinx#1181) Reorganize tutorials and reference_designs to programming_examples (Xilinx#1182) Fix for mmult lit (Xilinx#1187) Revert "Fix for mmult lit" (Xilinx#1188) [ASPLOS][WIP] python host code example (Xilinx#1185) Extract arg parse from test.py (Xilinx#1192) Add finished write-up for sections 2a and 2b Add objectfifo bindings to quick references Update section2 subsection list [ASPLOS][WIP] initial version of asplos24 tutorial description (Xilinx#1184) Co-authored-by: Jack Lo <jack.lo@amd.com> Fix for lit tests (Xilinx#1189) Rename tutorials folder (Xilinx#1190) (Xilinx#1197) Co-authored-by: AndraBisca <andrab@amd.com> Co-authored-by: Jack Lo <36210336+jackl-xilinx@users.noreply.github.com> [ASPLOS] Rename directories (Xilinx#1196) Co-authored-by: Jeff Fifield <jeff.fifield@amd.com> Co-authored-by: AndraBisca <andrab@amd.com> Co-authored-by: Jack Lo <36210336+jackl-xilinx@users.noreply.github.com> Add section 2c. Update tiles in sections 2a and 2b. 
Add generic aie array description paragraph (Xilinx#1191) Co-authored-by: Joseph Melber <jgmelber@gmail.com> Co-authored-by: Jack Lo <jack.lo@amd.com> ReLU with tracing (Xilinx#1204) ReLU example with tracing Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: Joseph Melber <jgmelber@gmail.com> Ml eltwise add and mul (Xilinx#1207) Move around of the eltwise add (put it in ml) and a new eltwise mul kernel Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: Jeff Fifield <jeff.fifield@amd.com> Moved test_lib to runtime_lib/test_lib for now Pjr reduce (Xilinx#1222) Reduce programming examples Co-authored-by: pjr <pjr@xilinx.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> [ASPLOS][WIP] Passthrough kernel in basic examples (Xilinx#1216) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> fix paths run.lit passthrough _kernel (Xilinx#1225) Fixed CMakeLists.txt reference to test_utils.h (Xilinx#1223) Minor CMakeLists.txt and Makefile fixes for programming_examples (Xilinx#1227)
1 parent 8426d43, commit eb09a5c
Showing 96 changed files with 3,510 additions and 489 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
//===- scale.cc -------------------------------------------------*- C++ -*-===// | ||
// | ||
// This file is licensed under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
// Copyright (C) 2023, Advanced Micro Devices, Inc. | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#define __AIENGINE__ 2 | ||
#define NOCPP | ||
#define __AIEARCH__ 20 | ||
|
||
#include <stdint.h> | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <type_traits> | ||
|
||
#include <aie_api/aie.hpp> | ||
|
||
template <typename T_in, typename T_out, const int N> | ||
void eltwise_add(T_in *a, T_in *b, T_out *c) { | ||
for (int i = 0; i < N; i++) { | ||
c[i] = a[i] + b[i]; | ||
} | ||
} | ||
|
||
template <typename T_in, typename T_out, const int N> | ||
void eltwise_vadd(T_in *a, T_in *b, T_out *c) { | ||
|
||
constexpr int vec_factor = 16; | ||
event0(); | ||
T_in *__restrict pA1 = a; | ||
T_in *__restrict pB1 = b; | ||
T_out *__restrict pC1 = c; | ||
const int F = N / vec_factor; | ||
for (int i = 0; i < F; i++) | ||
chess_prepare_for_pipelining chess_loop_range(16, ) { | ||
aie::vector<T_in, vec_factor> A0 = aie::load_v<vec_factor>(pA1); | ||
pA1 += vec_factor; | ||
aie::vector<T_in, vec_factor> B0 = aie::load_v<vec_factor>(pB1); | ||
pB1 += vec_factor; | ||
aie::vector<T_out, vec_factor> cout = aie::add(A0, B0); | ||
aie::store_v(pC1, cout); | ||
pC1 += vec_factor; | ||
} | ||
event1(); | ||
} | ||
|
||
extern "C" { | ||
|
||
void eltwise_add_bf16_scalar(bfloat16 *a_in, bfloat16 *b_in, bfloat16 *c_out) { | ||
eltwise_add<bfloat16, bfloat16, 1024>(a_in, b_in, c_out); | ||
} | ||
|
||
void eltwise_add_bf16_vector(bfloat16 *a_in, bfloat16 *b_in, bfloat16 *c_out) { | ||
eltwise_vadd<bfloat16, bfloat16, 1024>(a_in, b_in, c_out); | ||
} | ||
|
||
} // extern "C" |
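The vectorized kernel above runs `N / vec_factor` iterations, so it covers the whole buffer only when `N` is a multiple of 16 (true for the 1024-element wrappers). The following is a minimal portable sketch of the same chunking pattern with a scalar tail added, assuming plain `float` instead of `bfloat16` and no AIE intrinsics. The names `chunked_add_model` and `kVecFactor` are hypothetical and do not appear in the commit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant mirroring vec_factor in the kernel above.
constexpr std::size_t kVecFactor = 16;

// Portable model of the chunked loop: full 16-element blocks first,
// then a scalar tail for any leftover elements (the AIE kernel has no tail).
void chunked_add_model(const float *a, const float *b, float *c,
                       std::size_t n) {
  assert(a != nullptr && b != nullptr && c != nullptr);
  const std::size_t full_blocks = n / kVecFactor;
  std::size_t i = 0;
  for (std::size_t blk = 0; blk < full_blocks; ++blk) {
    // Stands in for one load/load/add/store step of the vector kernel.
    for (std::size_t lane = 0; lane < kVecFactor; ++lane, ++i)
      c[i] = a[i] + b[i];
  }
  // Remainder that a pure N / vec_factor loop would skip.
  for (; i < n; ++i)
    c[i] = a[i] + b[i];
}
```

With `n = 1024` the tail loop never executes, which matches the fixed-size wrappers in the diff. The tail only matters if the kernel were instantiated with a size that 16 does not divide.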
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
#include <stdint.h> | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <type_traits> | ||
|
||
#include <aie_api/aie.hpp> | ||
|
||
void vector(int32_t *restrict in, int32_t *restrict out) { | ||
|
||
v16int32 tiny = broadcast_to_v16int32((int32_t)-2147483648); | ||
int32_t input_size = 1024; | ||
int32_t vector_size = 16; | ||
v16int32 after_vector; | ||
v16int32 running_max = tiny; | ||
for (int32_t i = 0; i < input_size; i += vector_size) | ||
chess_prepare_for_pipelining chess_loop_range(64, 64) { | ||
v16int32 next = *(v16int32 *)(in + i); | ||
v16int32 test = max(running_max, next); | ||
running_max = test; | ||
} | ||
after_vector = running_max; | ||
v16int32 first = shift_bytes(after_vector, after_vector, 32); | ||
v16int32 second = max(after_vector, first); | ||
v16int32 second_shift = shift_bytes(second, second, 16); | ||
v16int32 third = max(second, second_shift); | ||
v16int32 third_shift = shift_bytes(third, third, 8); | ||
v16int32 fourth = max(third, third_shift); | ||
v16int32 fourth_shift = shift_bytes(fourth, fourth, 4); | ||
v16int32 fifth = max(fourth, fourth_shift); | ||
int32_t last = extract_elem(fifth, 0); | ||
*(int32_t *)out = last; | ||
return; | ||
} | ||
|
||
void scalar(int32_t *restrict in, int32_t *restrict out) { | ||
int32_t input_size = 1024; | ||
int32_t running_max = (int32_t)-2147483648; | ||
for (int32_t i = 0; i < input_size; i++) { | ||
if (in[i] > running_max) | ||
running_max = in[i]; | ||
} | ||
*(int32_t *)out = running_max; | ||
|
||
return; | ||
} | ||
|
||
extern "C" { | ||
|
||
void vector_max(int32_t *a_in, int32_t *c_out) { vector(a_in, c_out); } | ||
|
||
void scalar_max(int32_t *a_in, int32_t *c_out) { scalar(a_in, c_out); } | ||
|
||
} // extern "C" |
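The vector kernel above keeps a per-lane running maximum and then folds the 16 lanes together in four shift-and-max steps (32, 16, 8, and 4 bytes, that is 8, 4, 2, and 1 `int32` lanes). Below is a minimal portable model of that reduction, assuming `shift_bytes(v, v, k)` behaves as a rotation of the vector by `k` bytes. With a rotation, the direction does not change the final result. `Vec`, `rotate_lanes`, `lane_max`, and `reduce_max_model` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

constexpr std::size_t kLanes = 16; // lane count of v16int32
using Vec = std::array<int32_t, kLanes>;

// Assumed model of shift_bytes(v, v, 4 * lanes): rotate by whole lanes.
Vec rotate_lanes(const Vec &v, std::size_t lanes) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = v[(i + lanes) % kLanes];
  return out;
}

// Element-wise maximum, standing in for the vector max intrinsic.
Vec lane_max(const Vec &a, const Vec &b) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = std::max(a[i], b[i]);
  return out;
}

int32_t reduce_max_model(const int32_t *in, std::size_t n) {
  assert(n % kLanes == 0); // the kernel also assumes whole vectors
  Vec running;
  running.fill(std::numeric_limits<int32_t>::min());
  // Phase 1: per-lane running maximum over the input.
  for (std::size_t i = 0; i < n; i += kLanes) {
    Vec next{};
    std::copy(in + i, in + i + kLanes, next.begin());
    running = lane_max(running, next);
  }
  // Phase 2: log2(16) = 4 fold steps with shifts of 8, 4, 2, 1 lanes.
  for (std::size_t shift = kLanes / 2; shift >= 1; shift /= 2)
    running = lane_max(running, rotate_lanes(running, shift));
  return running[0]; // every lane now holds the global maximum
}
```

The min kernel in the same commit follows the same shape with `min` and an initial value of `INT32_MAX`.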
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
#include <stdint.h> | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <type_traits> | ||
|
||
#include <aie_api/aie.hpp> | ||
|
||
void vector(int32_t *restrict in, int32_t *restrict out) { | ||
|
||
v16int32 massive = broadcast_to_v16int32((int32_t)2147483647); | ||
int32_t input_size = 1024; | ||
int32_t vector_size = 16; | ||
v16int32 after_vector; | ||
v16int32 running_min = massive; | ||
for (int32_t i = 0; i < input_size; i += vector_size) | ||
chess_prepare_for_pipelining chess_loop_range(64, 64) { | ||
v16int32 next = *(v16int32 *)(in + i); | ||
v16int32 test = min(running_min, next); | ||
running_min = test; | ||
} | ||
after_vector = running_min; | ||
v16int32 first = shift_bytes(after_vector, after_vector, 32); | ||
v16int32 second = min(after_vector, first); | ||
v16int32 second_shift = shift_bytes(second, second, 16); | ||
v16int32 third = min(second, second_shift); | ||
v16int32 third_shift = shift_bytes(third, third, 8); | ||
v16int32 fourth = min(third, third_shift); | ||
v16int32 fourth_shift = shift_bytes(fourth, fourth, 4); | ||
v16int32 fifth = min(fourth, fourth_shift); | ||
int32_t last = extract_elem(fifth, 0); | ||
*(int32_t *)out = last; | ||
return; | ||
} | ||
|
||
void scalar(int32_t *restrict in, int32_t *restrict out) { | ||
int32_t input_size = 1024; | ||
int32_t running_min = (int32_t)2147483647; | ||
for (int32_t i = 0; i < input_size; i++) { | ||
if (in[i] < running_min) | ||
running_min = in[i]; | ||
} | ||
*(int32_t *)out = running_min; | ||
|
||
return; | ||
} | ||
|
||
extern "C" { | ||
|
||
void vector_min(int32_t *a_in, int32_t *c_out) { vector(a_in, c_out); } | ||
|
||
void scalar_min(int32_t *a_in, int32_t *c_out) { scalar(a_in, c_out); } | ||
|
||
} // extern "C" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
//===- scale.cc -------------------------------------------------*- C++ -*-===// | ||
// | ||
// This file is licensed under the Apache License v2.0 with LLVM Exceptions. | ||
// See https://llvm.org/LICENSE.txt for license information. | ||
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
// | ||
// Copyright (C) 2023, Advanced Micro Devices, Inc. | ||
// | ||
//===----------------------------------------------------------------------===// | ||
|
||
#define __AIENGINE__ 2 | ||
#define NOCPP | ||
#define __AIEARCH__ 20 | ||
|
||
#include <stdint.h> | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <type_traits> | ||
|
||
#include <aie_api/aie.hpp> | ||
|
||
void relu(bfloat16 *restrict a, bfloat16 *restrict c, const int TILE_SIZE) { | ||
const int v_factor = 32; | ||
v32bfloat16 zeroes = broadcast_zero_bfloat16(); | ||
|
||
event0(); | ||
for (int i = 0; i < TILE_SIZE; i += v_factor) | ||
chess_prepare_for_pipelining chess_loop_range(32, 32) { | ||
v32bfloat16 input = *(v32bfloat16 *)(a + i); | ||
v32bfloat16 output = max(input, zeroes); | ||
*(v32bfloat16 *)(c + i) = output; | ||
} | ||
event1(); | ||
return; | ||
} | ||
|
||
extern "C" { | ||
|
||
void bf16_relu(bfloat16 *a_in, bfloat16 *c_out) { relu(a_in, c_out, 1024); } | ||
|
||
} // extern "C" |
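The ReLU kernel above computes `max(input, 0)` on 32 `bfloat16` lanes per iteration. A host-side reference for checking its output could look like the sketch below, which treats a `bfloat16` value as the upper 16 bits of an IEEE-754 `float`. The helper names are hypothetical and not part of the commit, and this sketch maps NaN and negative zero to `+0`, which may or may not match the hardware `max`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Widen a raw bfloat16 bit pattern to float (bf16 is the top half of a float).
float bf16_bits_to_float(uint16_t h) {
  const uint32_t bits = static_cast<uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Narrow a float to bfloat16 by truncation (no rounding); illustrative only.
uint16_t float_to_bf16_bits_trunc(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return static_cast<uint16_t>(bits >> 16);
}

// Scalar reference for ReLU on raw bfloat16 bit patterns.
void relu_bf16_reference(const uint16_t *in, uint16_t *out, std::size_t n) {
  assert(in != nullptr && out != nullptr);
  for (std::size_t i = 0; i < n; ++i) {
    const float x = bf16_bits_to_float(in[i]);
    // Positive values pass through unchanged; everything else becomes +0.
    out[i] = (x > 0.0f) ? in[i] : static_cast<uint16_t>(0);
  }
}
```

A test harness would fill a 1024-element buffer, run `bf16_relu` on the device, and compare the result against `relu_bf16_reference` element by element.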
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
##===- Makefile -----------------------------------------------------------===## | ||
# | ||
# This file is licensed under the Apache License v2.0 with LLVM Exceptions. | ||
# See https://llvm.org/LICENSE.txt for license information. | ||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
# | ||
##===----------------------------------------------------------------------===## | ||
|
||
include ../../makefile-common | ||
|
||
all: build/final.xclbin | ||
|
||
targetname = eltwise_exp | ||
|
||
build/lut_based_ops.o: | ||
mkdir -p ${@D} | ||
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -c ../../../../aie_runtime_lib/AIE2/lut_based_ops.cpp -o ${@F} | ||
|
||
build/exp.o: | ||
mkdir -p ${@D} | ||
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -I ../../../../aie_runtime_lib/AIE2 -c ../../../../aie_kernels/aie2/bf16_exp.cc -o ${@F} | ||
|
||
build/kernels.a: build/exp.o build/lut_based_ops.o | ||
ar rvs $@ $+ | ||
|
||
build/aie.mlir: aie2.py | ||
mkdir -p ${@D} | ||
python3 $< > $@ | ||
|
||
build/final.xclbin: build/aie.mlir build/kernels.a | ||
mkdir -p ${@D} | ||
cd ${@D} && aiecc.py --aie-generate-cdo --aie-generate-ipu --no-compile-host \ | ||
--xclbin-name=${@F} --ipu-insts-name=insts.txt ${<F} | ||
|
||
${targetname}.exe: test.cpp | ||
rm -rf _build | ||
mkdir -p _build | ||
cd _build && ${powershell} cmake .. -DTARGET_NAME=${targetname} | ||
cd _build && ${powershell} cmake --build . --config Release | ||
ifeq "${powershell}" "powershell.exe" | ||
cp _build/${targetname}.exe $@ | ||
else | ||
cp _build/${targetname} $@ | ||
endif | ||
|
||
run: ${targetname}.exe build/final.xclbin build/insts.txt | ||
${powershell} ./$< -x build/final.xclbin -i build/insts.txt -k MLIR_AIE | ||
|
||
run_py: build/final.xclbin build/insts.txt | ||
${powershell} python3 test.py -x build/final.xclbin -i build/insts.txt -k MLIR_AIE | ||
|
||
clean: | ||
rm -rf build _build ${targetname}.exe |