trtri undefined references in cuda/11.2.2 build with no eti #2231

Open
ndellingwood opened this issue Jun 4, 2024 · 1 comment

@ndellingwood
Contributor

Builds configured without ETI (--no-default-eti), at least with cuda/11.2.2, fail to link trtri with undefined references:

17:34:43 [ 63%] Linking CXX executable KokkosKernels_lapack_serial
17:34:45 CMakeFiles/KokkosKernels_lapack_serial.dir/backends/Test_Serial_Lapack.cpp.o: In function `int KokkosLapack::trtri<Kokkos::View<double**, Kokkos::LayoutLeft, Kokkos::Serial> >(char const*, char const*, Kokkos::View<double**, Kokkos::LayoutLeft, Kokkos::Serial> const&)':
17:34:45 tmpxft_00007941_00000000-6_Test_Serial_Lapack.cudafe1.cpp:(.text._ZN12KokkosLapack5trtriIN6Kokkos4ViewIPPdJNS1_10LayoutLeftENS1_6SerialEEEEEEiPKcS9_RKT_[_ZN12KokkosLapack5trtriIN6Kokkos4ViewIPPdJNS1_10LayoutLeftENS1_6SerialEEEEEEiPKcS9_RKT_]+0x460): undefined reference to `KokkosLapack::Impl::TRTRI<Kokkos::View<int, Kokkos::LayoutRight, Kokkos::HostSpace, Kokkos::MemoryTraits<1u> >, Kokkos::View<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >, false, false>::trtri(Kokkos::View<int, Kokkos::LayoutRight, Kokkos::HostSpace, Kokkos::MemoryTraits<1u> > const&, char const*, char const*, Kokkos::View<double**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> > const&)'
17:34:45 CMakeFiles/KokkosKernels_lapack_serial.dir/backends/Test_Serial_Lapack.cpp.o: In function `int KokkosLapack::trtri<Kokkos::View<Kokkos::complex<double>**, Kokkos::LayoutLeft, Kokkos::Serial> >(char const*, char const*, Kokkos::View<Kokkos::complex<double>**, Kokkos::LayoutLeft, Kokkos::Serial> const&)':
17:34:45 tmpxft_00007941_00000000-6_Test_Serial_Lapack.cudafe1.cpp:(.text._ZN12KokkosLapack5trtriIN6Kokkos4ViewIPPNS1_7complexIdEEJNS1_10LayoutLeftENS1_6SerialEEEEEEiPKcSB_RKT_[_ZN12KokkosLapack5trtriIN6Kokkos4ViewIPPNS1_7complexIdEEJNS1_10LayoutLeftENS1_6SerialEEEEEEiPKcSB_RKT_]+0x460): undefined reference to `KokkosLapack::Impl::TRTRI<Kokkos::View<int, Kokkos::LayoutRight, Kokkos::HostSpace, Kokkos::MemoryTraits<1u> >, Kokkos::View<Kokkos::complex<double>**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> >, false, false>::trtri(Kokkos::View<int, Kokkos::LayoutRight, Kokkos::HostSpace, Kokkos::MemoryTraits<1u> > const&, char const*, char const*, Kokkos::View<Kokkos::complex<double>**, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1u> > const&)'
17:34:45 collect2: error: ld returned 1 exit status

Reproducer: (weaver rhel8)

bsub -Is -n 1 -q rhel8 -gpu "num=1" bash

source /etc/profile.d/modules.sh
source /projects/ppc64le-pwr9-rhel8/legacy-env.sh

module load cuda/11.2.2/gcc/8.3.1 cmake/3.23.1

${KOKKOSKERNELS_PATH}/cm_generate_makefile.bash --with-cuda --with-serial --compiler=${KOKKOS_PATH}/bin/nvcc_wrapper --arch=Volta70,Power9 --with-cuda-options=enable_lambda --kokkos-path=${KOKKOS_PATH} --kokkoskernels-path=${KOKKOSKERNELS_PATH} --with-scalars='double,complex_double' --with-ordinals=int --with-offsets=int,size_t --with-layouts=LayoutLeft --cxxstandard=17 --no-default-eti

make -j16
@cwpearson cwpearson self-assigned this Jun 11, 2024
@cwpearson
Contributor

cwpearson commented Sep 17, 2024

Actually, if you let the build continue, many other APIs fail to link in the same way for this configuration.

Summary

  1. This configuration does not ETI Serial, but it does enable Serial in Kokkos
  2. Therefore, we build the unit tests with the serial backend
  3. This configuration does not set KokkosKernels_TEST_ETI_ONLY=OFF, so KokkosKernels_TEST_ETI_ONLY defaults to ON
  4. When TEST_ETI_ONLY is ON, the definitions of non-ETI'ed APIs are compiled out, so the Serial test references Serial functions that are never defined and fails to link

@lucbv what are we supposed to be doing?

References

Here we build the KokkosKernels_lapack_serial test whenever the Serial backend is enabled, regardless of what we're ETI'ing:

IF (KOKKOS_ENABLE_SERIAL)
  KOKKOSKERNELS_ADD_UNIT_TEST(
    lapack_serial
    SOURCES
      ${PACKAGE_SOURCE_DIR}/test_common/Test_Main.cpp
      backends/Test_Serial_Lapack.cpp
    COMPONENTS lapack
  )
ENDIF ()

Here we declare template <..., false> struct TRTRI, but only define it when both of the following hold:

  • KOKKOSKERNELS_ETI_ONLY is not defined
  • KOKKOSKERNELS_IMPL_COMPILE_LIBRARY is false

template <class RVIT, class AVIT, bool tpl_spec_avail = trtri_tpl_spec_avail<RVIT, AVIT>::value,
          bool eti_spec_avail = trtri_eti_spec_avail<RVIT, AVIT>::value>
struct TRTRI {
  static void trtri(const RVIT& R, const char uplo[], const char diag[], const AVIT& A);
};
#if !defined(KOKKOSKERNELS_ETI_ONLY) || KOKKOSKERNELS_IMPL_COMPILE_LIBRARY
template <class RVIT, class AVIT>
struct TRTRI<RVIT, AVIT, false, KOKKOSKERNELS_IMPL_COMPILE_LIBRARY> {
  static void trtri(const RVIT& R, const char uplo[], const char diag[], const AVIT& A) {
    static_assert(Kokkos::is_view<AVIT>::value, "AVIT must be a Kokkos::View.");

As expected, in every generated ETI file KOKKOSKERNELS_IMPL_COMPILE_LIBRARY is true, so those files define template <..., true> struct TRTRI, but not the missing template <..., false> struct TRTRI.

However, when we compile the serial test file, KOKKOSKERNELS_ETI_ONLY is defined and KOKKOSKERNELS_IMPL_COMPILE_LIBRARY is not. The test therefore uses template <..., false> struct TRTRI, which is only declared, not defined, under those circumstances (correctly, as it is not ETI'ed).
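To make the mechanism concrete, here is a minimal sketch of the same pattern outside KokkosKernels. All names (DemoOp, demo_eti_spec_avail, demo_call, DEMO_ETI_ONLY, DEMO_COMPILE_LIBRARY) are hypothetical stand-ins, not KokkosKernels identifiers. The primary template only declares run(), and the defining specialization sits behind the same kind of guard as TRTRI.

```cpp
// Hypothetical stand-ins, not real KokkosKernels names:
//   DEMO_ETI_ONLY        plays the role of KOKKOSKERNELS_ETI_ONLY
//   DEMO_COMPILE_LIBRARY plays the role of KOKKOSKERNELS_IMPL_COMPILE_LIBRARY
#ifndef DEMO_COMPILE_LIBRARY
#define DEMO_COMPILE_LIBRARY false
#endif

// Trait reporting whether T was explicitly instantiated in the library.
// Nothing is ETI'ed in this demo, so it is always false.
template <class T>
struct demo_eti_spec_avail {
  static constexpr bool value = false;
};

// Primary template: run() is declared but never defined.
template <class T, bool eti_spec_avail = demo_eti_spec_avail<T>::value>
struct DemoOp {
  static int run(const T& x);
};

// Fallback definition for the non-ETI'ed case. Defining DEMO_ETI_ONLY
// (while DEMO_COMPILE_LIBRARY is false) compiles this specialization out.
#if !defined(DEMO_ETI_ONLY) || DEMO_COMPILE_LIBRARY
template <class T>
struct DemoOp<T, DEMO_COMPILE_LIBRARY> {
  static constexpr int run(const T& x) { return static_cast<int>(x) + 1; }
};
#endif

// Caller, analogous to the serial unit test. It always compiles, because
// the primary template supplies a declaration of run().
inline int demo_call(double x) { return DemoOp<double>::run(x); }
```

As written, DemoOp<double> resolves to the specialization and a program calling demo_call links. Compiling the same file with -DDEMO_ETI_ONLY removes the specialization, so the call falls back to the declaration-only primary template and the failure shows up at link time as an undefined reference, not at compile time.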

Test_Serial.hpp has

#if defined(KOKKOSKERNELS_TEST_ETI_ONLY) && !defined(KOKKOSKERNELS_ETI_ONLY)
#define KOKKOSKERNELS_ETI_ONLY
#endif
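The effect of that guard can be checked in isolation. The sketch below uses hypothetical DEMO_ macros in place of the real ones and assumes the build has defined the test-side macro, as happens when TEST_ETI_ONLY is ON.

```cpp
// Hypothetical stand-in for KOKKOSKERNELS_TEST_ETI_ONLY being defined by the build.
#define DEMO_TEST_ETI_ONLY

// Same shape as the guard in Test_Serial.hpp: the test-side macro
// switches on the library-side macro if it is not already defined.
#if defined(DEMO_TEST_ETI_ONLY) && !defined(DEMO_ETI_ONLY)
#define DEMO_ETI_ONLY
#endif

// Expose the resulting preprocessor state as a compile-time constant.
#if defined(DEMO_ETI_ONLY)
constexpr bool demo_eti_only = true;
#else
constexpr bool demo_eti_only = false;
#endif
```

With DEMO_TEST_ETI_ONLY defined, demo_eti_only is true, which is the state in which the header-only fallback definitions are compiled out of the test translation unit.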

cm_test_all_sandia sets ENABLE_TEST_ETI_ONLY=True, which means -DKokkosKernels_TEST_ETI_ONLY=OFF is never passed to CMake, so KOKKOSKERNELS_TEST_ETI_ONLY ends up defined.
