bufr/hdf5 issue with spack-stack-1.6.0/gsi-addon-dev install on Gaea-C6 #1359

Open
DavidBurrows-NCO opened this issue Oct 29, 2024 · 11 comments
Labels: bug (Something is not working)

@DavidBurrows-NCO

Describe the bug
I am attempting to build the GSI package on Gaea-C6 and receive warning messages like:

ld: warning: libhdf5_parallel_intel.so.310, needed by /autofs/ncrc-svm1_proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/intel/2023.2.0/bufr-11.7.0-z6kzf5u/lib64/libbufr_d.so, not found (try using -rpath or -rpath-link).

I’ve tracked this down to differences in the bufr hdf5 libraries between C5 and C6. On C6, I see:

ldd /autofs/ncrc-svm1_proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/intel/2023.2.0/bufr-11.7.0-z6kzf5u/lib64/libbufr_d.so | grep hdf
libhdf5_parallel_intel.so.310 => not found
libhdf5_fortran_parallel_intel.so.310 => not found

but on C5, I see:

ldd /autofs/ncrc-svm1_proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/intel/2023.2.0/bufr-11.7.0-z6kzf5u/lib64/libbufr_d.so | grep hdf
libhdf5_parallel_intel.so.310 => /opt/cray/pe/hdf5-parallel/1.14.3.1/intel/2023.2/lib/libhdf5_parallel_intel.so.310 (0x00007fafe6f2c000)
libhdf5_fortran_parallel_intel.so.310 => /opt/cray/pe/hdf5-parallel/1.14.3.1/intel/2023.2/lib/libhdf5_fortran_parallel_intel.so.310 (0x00007fafe7922000)

To Reproduce
Run ldd /autofs/ncrc-svm1_proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/intel/2023.2.0/bufr-11.7.0-z6kzf5u/lib64/libbufr_d.so on Gaea-C5 and on Gaea-C6, then compare the output.
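
For anyone comparing the two systems, the check above can be scripted. The following is a minimal sketch, assuming ldd prints unresolved dependencies as `name => not found` as in the output quoted above. The function name `unresolved_libs` and the sample strings are illustrative, not part of spack-stack.

```python
import re

# Matches ldd lines such as "libfoo.so.1 => not found".
_NOT_FOUND = re.compile(r"^\s*(\S+)\s+=>\s+not found\s*$")


def unresolved_libs(ldd_output: str) -> list[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        match = _NOT_FOUND.match(line)
        if match:
            missing.append(match.group(1))
    return missing


# Sample lines copied from the ldd output reported on C6 and C5.
sample_c6 = (
    "libhdf5_parallel_intel.so.310 => not found\n"
    "libhdf5_fortran_parallel_intel.so.310 => not found\n"
)
sample_c5 = (
    "libhdf5_parallel_intel.so.310 => "
    "/opt/cray/pe/hdf5-parallel/1.14.3.1/intel/2023.2/lib/"
    "libhdf5_parallel_intel.so.310 (0x00007fafe6f2c000)\n"
)

print(unresolved_libs(sample_c6))
print(unresolved_libs(sample_c5))
```

To run it against a real library, the text could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout` on each system.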

Expected behavior
No build warnings and no runtime failures due to missing libraries.

System:
Gaea-C5 versus C6

Additional context
NA

@DavidBurrows-NCO added the bug (Something is not working) label Oct 29, 2024
@DavidBurrows-NCO
Author

This issue refs NOAA-EMC/global-workflow #3011 and NOAA-EMC/GSI #800.

@AlexanderRichert-NOAA
Collaborator

This is bizarre... there are a ton of Cray libraries that it depends on that bufr doesn't use-- the hdf5 libs, libpmi and libpmi2, libpals, libmpifort_intel, libpnetcdf_intel... It looks like the Cray wrappers are linking everything under the sun. It doesn't seem to happen if I build NCEPLIBS-bufr directly with cmake+Cray wrappers, so I'll see if it's somehow related to Spack or our site config.

@AlexanderRichert-NOAA
Collaborator

I'm still looking at this, I haven't been able to reproduce the issue yet. @RatkoVasic-NOAA can you think of any reason that, e.g., the cray parallel hdf5 library would have been linked? It's almost certainly something to do with the Cray wrappers, but I don't know why it would be linking all those libraries in, especially since we're not loading those modules.

@RatkoVasic-NOAA
Collaborator

> I'm still looking at this, I haven't been able to reproduce the issue yet. @RatkoVasic-NOAA can you think of any reason that, e.g., the cray parallel hdf5 library would have been linked? It's almost certainly something to do with the Cray wrappers, but I don't know why it would be linking all those libraries in, especially since we're not loading those modules.

Unfortunately, I'm puzzled as you are.

@AlexanderRichert-NOAA
Collaborator

I should have noted sooner-- The path you're pointed to is an installation for Gaea C5, so in any case I think you'll need to switch to /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/gsi-addon unless Ratko says otherwise. Meanwhile, there's this separate issue of the weird dynamic library linkages which I'm still investigating.

@AlexanderRichert-NOAA
Collaborator

@RatkoVasic-NOAA I can (basically) reproduce the issue if I load various modules first in my calling environment and then build the stack, but in a clean environment it doesn't happen-- the linked libraries are as expected. So it's probably worth rebuilding that stack in a clean environment.
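
One way to confirm the calling environment is clean before rebuilding is to look at which modules are loaded. This is a rough sketch, assuming the module system exports the colon-separated `LOADEDMODULES` variable (Environment Modules and Lmod both do). The helper names, the example module names, and the `allowed_prefixes` value are hypothetical and not taken from the Gaea site config.

```python
import os


def loaded_modules(env=None):
    """Return the modules listed in LOADEDMODULES (colon-separated)."""
    env = os.environ if env is None else env
    raw = env.get("LOADEDMODULES", "")
    return [name for name in raw.split(":") if name]


def unexpected_modules(env=None, allowed_prefixes=()):
    """Return loaded modules whose names do not start with an allowed prefix."""
    allowed = tuple(allowed_prefixes)
    return [m for m in loaded_modules(env) if not m.startswith(allowed)]


# Illustrative environment only; module names and versions are made up.
example_env = {"LOADEDMODULES": "craype/1.0:cray-hdf5-parallel/1.0:cray-mpich/1.0"}

# Hypothetical policy: only the compiler driver module is expected.
print(unexpected_modules(example_env, allowed_prefixes=("craype/",)))
```

An empty result would suggest nothing beyond the allowed modules is loaded; it does not by itself prove the Cray wrappers will link only the expected libraries.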

@RatkoVasic-NOAA
Collaborator

@AlexanderRichert-NOAA I can reinstall this weekend. Do you mean Gaea-C5, spack-stack-1.6.0 (both unified env and gsi addon)? What changes should I make to install it correctly?

@DavidBurrows-NCO
Author

> I should have noted sooner-- The path you're pointed to is an installation for Gaea C5, so in any case I think you'll need to switch to /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/gsi-addon unless Ratko says otherwise. Meanwhile, there's this separate issue of the weird dynamic library linkages which I'm still investigating.

Thanks @AlexanderRichert-NOAA, I didn't see the c6 versions of spack-stack. Switching to c6/gsi-addon corrected my issue, and GSI regression tests are passing. Do you know if /ncrc is cross-mounted between C5 and C6?

@RatkoVasic-NOAA
Collaborator

@DavidBurrows-NCO yes, it is. You can access it from both machines.

@RatkoVasic-NOAA
Collaborator

RatkoVasic-NOAA commented Nov 4, 2024

Just to confirm:

C5: /ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/Core
C6: /ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/gsi-addon/install/modulefiles/Core
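
Since the original problem came from using a C5 install path on C6, a small guard could flag the mismatch early. The sketch below is a heuristic based only on the two path layouts shown in this thread (a `c6` directory directly under `spack-stack` for C6, none for C5). The function name `install_target` is hypothetical, and other spack-stack versions or sites may lay paths out differently.

```python
def install_target(path: str) -> str:
    """Guess which Gaea system a spack-stack path targets (heuristic only)."""
    parts = path.split("/")
    if "spack-stack" not in parts:
        return "unknown"
    idx = parts.index("spack-stack")
    # In this thread, C6 installs live under .../spack-stack/c6/...
    if idx + 1 < len(parts) and parts[idx + 1] == "c6":
        return "C6"
    return "C5"


c5_core = "/ncrc/proj/epic/spack-stack/spack-stack-1.6.0/envs/gsi-addon-dev/install/modulefiles/Core"
c6_core = "/ncrc/proj/epic/spack-stack/c6/spack-stack-1.6.0/envs/gsi-addon/install/modulefiles/Core"

print(install_target(c5_core))
print(install_target(c6_core))
```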

@DavidBurrows-NCO
Author

@RatkoVasic-NOAA I assume you're asking @AlexanderRichert-NOAA, but those are the installs on C5 and C6 that I'm pointing to. As far as my reported issue goes, this is resolved. Feel free to close, or keep open to track the other issues found. Thank you both!

3 participants