Keep pkt header (Xilinx#1125)
Signed-off-by: Abhishek Varma <abhvarma@amd.com>
Co-authored-by: Javier Setoain <jsetoain@users.noreply.github.com>
Co-authored-by: James Newling <james.newling@gmail.com>
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Co-authored-by: Abhishek Varma <abhvarma@amd.com>
Co-authored-by: erwei-xilinx <erweiw@xilinx.com>
Co-authored-by: AndraBisca <andrab@amd.com>

Started writing the objfifo intro tutorial

Vectorize vec scalar (Xilinx#1135)

Added new programming guide section placeholders (Xilinx#1138)

[EXAMPLE] An element-wise add example (Xilinx#1148)

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jack Lo <jack.lo@amd.com>

Moved objfifo design example 1 to programming_guide sections 3.

Started objectFifo programming guide

Continue section 3 guide.

ObjFifo guide: access patterns

Add example to objFifo guide

Update objfifo guide

Separate section 3 of the guide into 3 subsections

[ASPLOS] Weight expand asplos (Xilinx#1158)

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: Phil James-Roxby <pjr@amd.com>

Reorganize subsections in section 3 of the guide.

Updated sections 3a and 3b

Pjr vector exp (Xilinx#1166)

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

[SOFTMAX] Single column rapid test (Xilinx#1168)

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Vector softmax (Xilinx#1172)

Co-authored-by: pjr <pjr@xilinx.com>

[MERGE] This has gone horribly wrong, so fixing up in place

Update exp.cc

Merge resolved

Added new programming guide section placeholders (Xilinx#1174)

Swapped section-2 and section-3 (Xilinx#1177)

Update section-1 and section-3 examples (Xilinx#1179)

Update section-3 (Xilinx#1181)

Reorganize tutorials and reference_designs to programming_examples (Xilinx#1182)

Fix for mmult lit (Xilinx#1187)

Revert "Fix for mmult lit" (Xilinx#1188)

[ASPLOS][WIP] python host code example (Xilinx#1185)

Extract arg parse from test.py (Xilinx#1192)

Add finished write-up for sections 2a and 2b

Add objectfifo bindings to quick references

Update section2 subsection list

[ASPLOS][WIP] initial version of asplos24 tutorial description (Xilinx#1184)

Co-authored-by: Jack Lo <jack.lo@amd.com>

Fix for lit tests (Xilinx#1189)

Rename tutorials folder (Xilinx#1190) (Xilinx#1197)

Co-authored-by: AndraBisca <andrab@amd.com>
Co-authored-by: Jack Lo <36210336+jackl-xilinx@users.noreply.github.com>

[ASPLOS] Rename directories (Xilinx#1196)

Co-authored-by: Jeff Fifield <jeff.fifield@amd.com>
Co-authored-by: AndraBisca <andrab@amd.com>
Co-authored-by: Jack Lo <36210336+jackl-xilinx@users.noreply.github.com>

Add section 2c. Update tiles in sections 2a and 2b.

Add generic aie array description paragraph (Xilinx#1191)

Co-authored-by: Joseph Melber <jgmelber@gmail.com>
Co-authored-by: Jack Lo <jack.lo@amd.com>

ReLU with tracing (Xilinx#1204)

ReLU example with tracing

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: Joseph Melber <jgmelber@gmail.com>

Ml eltwise add and mul (Xilinx#1207)

Move around of the eltwise add (put it in ml) and a new eltwise mul kernel

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: Jeff Fifield <jeff.fifield@amd.com>

Moved test_lib to runtime_lib/test_lib for now

Pjr reduce (Xilinx#1222) Reduce programming examples

Co-authored-by: pjr <pjr@xilinx.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

[ASPLOS][WIP] Passthrough kernel in basic examples (Xilinx#1216)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

fix paths run.lit passthrough _kernel (Xilinx#1225)

Fixed CMakeLists.txt reference to test_utils.h (Xilinx#1223)

Minor CMakeLists.txt and Makefile fixes for programming_examples (Xilinx#1227)
jackl-xilinx authored and fifield committed Apr 12, 2024
1 parent 8426d43 commit eb09a5c
Showing 96 changed files with 3,510 additions and 489 deletions.
61 changes: 61 additions & 0 deletions aie_kernels/aie2/add.cc
@@ -0,0 +1,61 @@
//===- add.cc ---------------------------------------------------*- C++ -*-===//
//
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// Copyright (C) 2023, Advanced Micro Devices, Inc.
//
//===----------------------------------------------------------------------===//

#define __AIENGINE__ 2
#define NOCPP
#define __AIEARCH__ 20

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <type_traits>

#include <aie_api/aie.hpp>

template <typename T_in, typename T_out, const int N>
void eltwise_add(T_in *a, T_in *b, T_out *c) {
  for (int i = 0; i < N; i++) {
    c[i] = a[i] + b[i];
  }
}

template <typename T_in, typename T_out, const int N>
void eltwise_vadd(T_in *a, T_in *b, T_out *c) {

  constexpr int vec_factor = 16;
  event0();
  T_in *__restrict pA1 = a;
  T_in *__restrict pB1 = b;
  T_out *__restrict pC1 = c;
  const int F = N / vec_factor;
  for (int i = 0; i < F; i++)
    chess_prepare_for_pipelining chess_loop_range(16, ) {
      aie::vector<T_in, vec_factor> A0 = aie::load_v<vec_factor>(pA1);
      pA1 += vec_factor;
      aie::vector<T_in, vec_factor> B0 = aie::load_v<vec_factor>(pB1);
      pB1 += vec_factor;
      aie::vector<T_out, vec_factor> cout = aie::add(A0, B0);
      aie::store_v(pC1, cout);
      pC1 += vec_factor;
    }
  event1();
}

extern "C" {

void eltwise_add_bf16_scalar(bfloat16 *a_in, bfloat16 *b_in, bfloat16 *c_out) {
  eltwise_add<bfloat16, bfloat16, 1024>(a_in, b_in, c_out);
}

void eltwise_add_bf16_vector(bfloat16 *a_in, bfloat16 *b_in, bfloat16 *c_out) {
  eltwise_vadd<bfloat16, bfloat16, 1024>(a_in, b_in, c_out);
}

} // extern "C"
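Note on `eltwise_vadd` in add.cc: the loop runs `F = N / vec_factor` iterations over 16-element chunks and has no tail loop, so it only covers every element when `N` is a multiple of 16 (true for the 1024 used in this commit). The sketch below mirrors that chunked structure in portable host-side C++, with `float` standing in for `bfloat16`. The names `eltwise_add_chunked` and `kVecFactor` are hypothetical and are not part of the commit or the AIE API; this is an illustration of the loop shape, not the kernel itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side stand-in for the chunk width used by eltwise_vadd.
constexpr std::size_t kVecFactor = 16;

// Adds a and b into c, one kVecFactor-wide chunk at a time.
// Like the kernel's F = N / vec_factor loop, there is no tail handling,
// so n must be a multiple of kVecFactor.
inline void eltwise_add_chunked(const float *a, const float *b, float *c,
                                std::size_t n) {
  assert(n % kVecFactor == 0 && "no tail loop: n must be a multiple of 16");
  for (std::size_t chunk = 0; chunk < n / kVecFactor; ++chunk) {
    const std::size_t base = chunk * kVecFactor;
    // This inner loop plays the role of one 16-lane vector add.
    for (std::size_t lane = 0; lane < kVecFactor; ++lane) {
      c[base + lane] = a[base + lane] + b[base + lane];
    }
  }
}
```

A model like this could serve as a golden reference when checking the kernel's output on the host, assuming the comparison allows for bfloat16 rounding.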
@@ -15,30 +15,21 @@
#include <stdio.h>
#include <stdlib.h>

#define REL_WRITE 0
#define REL_READ 1

#include <aie_api/aie.hpp>

template <typename T, int N>
__attribute__((noinline)) void passThrough_aie(T *restrict in, T *restrict out,
const int32_t height,
const int32_t width) {
//::aie::vector<T, N> data_out;
//::aie::mask<N> temp_val;
event0();

v64uint8 *restrict outPtr = (v64uint8 *)out;
v64uint8 *restrict inPtr = (v64uint8 *)in;

for (int j = 0; j < (height * width); j += N) // Nx samples per loop
chess_prepare_for_pipelining chess_loop_range(6, ) {
//::aie::vector<T, N> tmpVector = ::aie::load_v(in);
//::aie::store_v(out, tmpVector);

*outPtr++ = *inPtr++;
chess_prepare_for_pipelining chess_loop_range(6, ) { *outPtr++ = *inPtr++; }

// in += N;
// out += N;
}
event1();
}

extern "C" {
53 changes: 53 additions & 0 deletions aie_kernels/generic/vector_max.cc
@@ -0,0 +1,53 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <type_traits>

#include <aie_api/aie.hpp>

void vector(int32_t *restrict in, int32_t *restrict out) {

  v16int32 tiny = broadcast_to_v16int32((int32_t)-2147483648);
  int32_t input_size = 1024;
  int32_t vector_size = 16;
  v16int32 after_vector;
  v16int32 running_max = tiny;
  for (int32_t i = 0; i < input_size; i += vector_size)
    chess_prepare_for_pipelining chess_loop_range(64, 64) {
      v16int32 next = *(v16int32 *)(in + i);
      v16int32 test = max(running_max, next);
      running_max = test;
    }
  after_vector = running_max;
  v16int32 first = shift_bytes(after_vector, after_vector, 32);
  v16int32 second = max(after_vector, first);
  v16int32 second_shift = shift_bytes(second, second, 16);
  v16int32 third = max(second, second_shift);
  v16int32 third_shift = shift_bytes(third, third, 8);
  v16int32 fourth = max(third, third_shift);
  v16int32 fourth_shift = shift_bytes(fourth, fourth, 4);
  v16int32 fifth = max(fourth, fourth_shift);
  int32_t last = extract_elem(fifth, 0);
  *(int32_t *)out = last;
  return;
}

void scalar(int32_t *restrict in, int32_t *restrict out) {
  size_t input_size = 1024;
  int32_t running_max = (int32_t)-2147483648;
  for (int32_t i = 0; i < input_size; i++) {
    if (in[i] > running_max)
      running_max = in[i];
  }
  *(int32_t *)out = running_max;

  return;
}

extern "C" {

void vector_max(int32_t *a_in, int32_t *c_out) { vector(a_in, c_out); }

void scalar_max(int32_t *a_in, int32_t *c_out) { scalar(a_in, c_out); }

} // extern "C"
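Note on the reduction in vector_max.cc: after the lane-wise running max, the kernel folds the 16 lanes down to lane 0 with byte shifts of 32, 16, 8, and 4, which for 4-byte `int32_t` lanes correspond to lane shifts of 8, 4, 2, and 1. The sketch below is a portable host-side model of that pattern. The names `horizontal_max16` and `vector_max_model` are hypothetical, and the shift is modelled as a rotation, which is an assumption about `shift_bytes` when both operands are the same vector; only lane 0 is read at the end, and lane 0 never depends on a wrapped-around lane.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Folds 16 lanes into lane 0 in four steps (lane shifts 8, 4, 2, 1).
// Assumption: the vector shift is modelled here as a rotation.
inline int32_t horizontal_max16(std::array<int32_t, 16> v) {
  for (std::size_t shift = 8; shift >= 1; shift /= 2) {
    const std::array<int32_t, 16> prev = v; // snapshot, like a vector register
    for (std::size_t lane = 0; lane < 16; ++lane) {
      v[lane] = std::max(prev[lane], prev[(lane + shift) % 16]);
    }
  }
  return v[0];
}

// Models the whole kernel: a lane-wise running max over 16-element chunks,
// followed by the horizontal reduction. n must be a multiple of 16.
inline int32_t vector_max_model(const int32_t *in, std::size_t n) {
  assert(n % 16 == 0 && "model has no tail handling");
  std::array<int32_t, 16> running;
  running.fill(std::numeric_limits<int32_t>::min());
  for (std::size_t i = 0; i < n; i += 16) {
    for (std::size_t lane = 0; lane < 16; ++lane) {
      running[lane] = std::max(running[lane], in[i + lane]);
    }
  }
  return horizontal_max16(running);
}
```

The same structure applies to a minimum reduction by swapping `std::max` for `std::min` and starting from the largest `int32_t` value.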
53 changes: 53 additions & 0 deletions aie_kernels/generic/vector_min.cc
@@ -0,0 +1,53 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <type_traits>

#include <aie_api/aie.hpp>

void vector(int32_t *restrict in, int32_t *restrict out) {

  v16int32 massive = broadcast_to_v16int32((int32_t)2147483647);
  int32_t input_size = 1024;
  int32_t vector_size = 16;
  v16int32 after_vector;
  v16int32 running_min = massive;
  for (int32_t i = 0; i < input_size; i += vector_size)
    chess_prepare_for_pipelining chess_loop_range(64, 64) {
      v16int32 next = *(v16int32 *)(in + i);
      v16int32 test = min(running_min, next);
      running_min = test;
    }
  after_vector = running_min;
  v16int32 first = shift_bytes(after_vector, after_vector, 32);
  v16int32 second = min(after_vector, first);
  v16int32 second_shift = shift_bytes(second, second, 16);
  v16int32 third = min(second, second_shift);
  v16int32 third_shift = shift_bytes(third, third, 8);
  v16int32 fourth = min(third, third_shift);
  v16int32 fourth_shift = shift_bytes(fourth, fourth, 4);
  v16int32 fifth = min(fourth, fourth_shift);
  int32_t last = extract_elem(fifth, 0);
  *(int32_t *)out = last;
  return;
}

void scalar(int32_t *restrict in, int32_t *restrict out) {
  size_t input_size = 1024;
  int32_t running_min = (int32_t)2147483647;
  for (int32_t i = 0; i < input_size; i++) {
    if (in[i] < running_min)
      running_min = in[i];
  }
  *(int32_t *)out = running_min;

  return;
}

extern "C" {

void vector_min(int32_t *a_in, int32_t *c_out) { vector(a_in, c_out); }

void scalar_min(int32_t *a_in, int32_t *c_out) { scalar(a_in, c_out); }

} // extern "C"
41 changes: 41 additions & 0 deletions aie_kernels/relu.cc
@@ -0,0 +1,41 @@
//===- relu.cc --------------------------------------------------*- C++ -*-===//
//
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// Copyright (C) 2023, Advanced Micro Devices, Inc.
//
//===----------------------------------------------------------------------===//

#define __AIENGINE__ 2
#define NOCPP
#define __AIEARCH__ 20

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <type_traits>

#include <aie_api/aie.hpp>

void relu(bfloat16 *restrict a, bfloat16 *restrict c, const int TILE_SIZE) {
  const int v_factor = 32;
  v32bfloat16 zeroes = broadcast_zero_bfloat16();

  event0();
  for (size_t i = 0; i < TILE_SIZE; i += v_factor)
    chess_prepare_for_pipelining chess_loop_range(32, 32) {
      v32bfloat16 input = *(v32bfloat16 *)(a + i);
      v32bfloat16 output = max(input, zeroes);
      *(v32bfloat16 *)(c + i) = output;
    }
  event1();
  return;
}

extern "C" {

void bf16_relu(bfloat16 *a_in, bfloat16 *c_out) { relu(a_in, c_out, 1024); }

} // extern "C"
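Note on relu.cc: the kernel computes `max(input, 0)` over 32-lane bfloat16 vectors, and with `TILE_SIZE` = 1024 and `v_factor` = 32 the loop runs 1024 / 32 = 32 iterations, which appears to be what `chess_loop_range(32, 32)` declares. A scalar host-side model of the same operation might look like the sketch below. `relu_reference` is a hypothetical name, `float` stands in for `bfloat16`, and the handling of NaN or negative zero inputs here is not claimed to match the hardware `max`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical golden model: out[i] = max(in[i], 0), with float standing in
// for bfloat16. Not part of the commit.
inline void relu_reference(const float *in, float *out, std::size_t n) {
  assert(in != nullptr && out != nullptr);
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = in[i] > 0.0f ? in[i] : 0.0f;
  }
}
```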
53 changes: 53 additions & 0 deletions programming_examples/basic/eltwise_exp/Makefile
@@ -0,0 +1,53 @@
##===- Makefile -----------------------------------------------------------===##
#
# This file licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
##===----------------------------------------------------------------------===##

include ../../makefile-common

all: build/final.xclbin

targetname = eltwise_exp

build/lut_based_ops.o:
	mkdir -p ${@D}
	cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -c ../../../../aie_runtime_lib/AIE2/lut_based_ops.cpp -o ${@F}

build/exp.o:
	mkdir -p ${@D}
	cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -I ../../../../aie_runtime_lib/AIE2 -c ../../../../aie_kernels/aie2/bf16_exp.cc -o ${@F}

build/kernels.a: build/exp.o build/lut_based_ops.o
	ar rvs $@ $+

build/aie.mlir: aie2.py
	mkdir -p ${@D}
	python3 $< > $@

build/final.xclbin: build/aie.mlir build/kernels.a
	mkdir -p ${@D}
	cd ${@D} && aiecc.py --aie-generate-cdo --aie-generate-ipu --no-compile-host \
		--xclbin-name=${@F} --ipu-insts-name=insts.txt ${<F}

${targetname}.exe: test.cpp
	rm -rf _build
	mkdir -p _build
	cd _build && ${powershell} cmake .. -DTARGET_NAME=${targetname}
	cd _build && ${powershell} cmake --build . --config Release
ifeq "${powershell}" "powershell.exe"
	cp _build/${targetname}.exe $@
else
	cp _build/${targetname} $@
endif

run: ${targetname}.exe build/final.xclbin build/insts.txt
	${powershell} ./$< -x build/final.xclbin -i build/insts.txt -k MLIR_AIE

run_py: build/final.xclbin build/insts.txt
	${powershell} python3 test.py -x build/final.xclbin -i build/insts.txt -k MLIR_AIE

clean:
	rm -rf build _build ${targetname}.exe
8 changes: 7 additions & 1 deletion programming_examples/basic/eltwise_mul/CMakeLists.txt
@@ -13,9 +13,14 @@
# cmake needs this line
cmake_minimum_required(VERSION 3.1)

set(CMAKE_CXX_STANDARD 23)
set(CMAKE_CXX_STANDARD_REQUIRED YES)

find_program(WSL NAMES powershell.exe)

if (NOT WSL)
set(CMAKE_C_COMPILER gcc-13)
set(CMAKE_CXX_COMPILER g++-13)
set(BOOST_ROOT /usr/include/boost CACHE STRING "Path to Boost install")
set(XRT_INC_DIR /opt/xilinx/xrt/include CACHE STRING "Path to XRT cloned repo")
set(XRT_LIB_DIR /opt/xilinx/xrt/lib CACHE STRING "Path to xrt_coreutil.lib")
@@ -40,6 +45,7 @@ project(${ProjectName})
find_package(Boost REQUIRED)

add_executable(${currentTarget}
${CMAKE_CURRENT_SOURCE_DIR}/../../../runtime_lib/test_lib/test_utils.cpp
test.cpp
)

@@ -48,7 +54,7 @@ target_compile_definitions(${currentTarget} PUBLIC DISABLE_ABI_CHECK=1)
target_include_directories (${currentTarget} PUBLIC
${XRT_INC_DIR}
${Boost_INCLUDE_DIRS}
../../../programming_examples/utils
${CMAKE_CURRENT_SOURCE_DIR}/../../../runtime_lib/test_lib
)

target_link_directories(${currentTarget} PUBLIC
7 changes: 3 additions & 4 deletions programming_examples/basic/eltwise_mul/Makefile
@@ -6,15 +6,15 @@
#
##===----------------------------------------------------------------------===##

include ../../../programming_examples/basic/makefile-common
include ../../makefile-common

all: build/final.xclbin

targetname = myEltwiseMul

build/mul.o:
mkdir -p ${@D}
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -c ${REPO_ROOT}/aie_kernels/aie2/mul.cc -o ${@F}
cd ${@D} && xchesscc_wrapper ${CHESSCCWRAP2_FLAGS} -I. -c ../../../../aie_kernels/aie2/mul.cc -o ${@F}

build/aie.mlir: aie2.py
mkdir -p ${@D}
@@ -28,8 +28,7 @@ build/final.xclbin: build/aie.mlir build/mul.o
${targetname}.exe: test.cpp
rm -rf _build
mkdir -p _build
# cd _build && ${powershell} cmake .. -DTARGET_NAME=${targetname}
cd _build && ${powershell} cmake -E env CXXFLAGS="-std=c++23 -ggdb" cmake .. -D CMAKE_C_COMPILER=gcc-13 -D CMAKE_CXX_COMPILER=g++-13 -DTARGET_NAME=${targetname} -Dsubdir=${subdir}
cd _build && ${powershell} cmake .. -DTARGET_NAME=${targetname}
cd _build && ${powershell} cmake --build . --config Release
ifeq "${powershell}" "powershell.exe"
cp _build/${targetname}.exe $@
2 changes: 1 addition & 1 deletion programming_examples/basic/eltwise_mul/test.py
@@ -9,7 +9,7 @@
import sys
import time

sys.path.append("../../programming_examples/utils")
sys.path.append("../../python")
import test_utils

# ------------------------------------------------------