XPMEM runtime warning/error #46
Comments
Hello, I have a similar issue as well, running with Open MPI 4.1.1. I built Open MPI using UCX and enabled flag Benchmark:
The Open UCX team has started maintaining their own fork of XPMEM. Your issue might get more attention if you post it there: https://github.com/openucx/xpmem
Indeed. I have had limited time to respond to the issues. This one may be an XPMEM issue, or a UCX and MVAPICH issue. I intend to bring all the UCX fork fixes here and then move the repository to hpc/xpmem, where I can add more people to review fixes.
@hjelmn As a user, it would be great to have a single source for XPMEM. I don't have an opinion on who should own the repository, but would prefer to have just one.
@jdinan Agreed. That is why I want to move it to the LANL-collaborative hpc org. UCX should not own the main fork, but in the hpc org I can give UCX developers more access :)
I will try to get that done tomorrow.
As I'm randomly cruising through here:
Hey @gkatev 🙂, sorry to bother you here. I encountered a problem with The Open MPI MCA output is like:
It may occur in (
    mca_smsc_xpmem_component.my_seg_id = xpmem_make(0, XPMEM_MAXADDR_SIZE, XPMEM_PERMIT_MODE,
                                                    (void *) 0666);
    if (-1 == mca_smsc_xpmem_component.my_seg_id) {
        return OPAL_ERR_NOT_AVAILABLE;
    }
I'm running my program on a supercomputer platform, so I don't have root permission, and there is no
Hi @jywangx, definitely don't expect XPMEM to work without the kernel module inserted and /dev/xpmem present. The failure is most likely related to this: xpmem_make (and other ops) work via ioctl on /dev/xpmem. I imagine the -16 you see is the value of
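As a rough way to check this without root, a small C probe can try to open the device node and report the error it gets. This is only a sketch under stated assumptions: `probe_device` and the `PROBE_MAIN` macro are hypothetical names, not part of XPMEM or Open MPI, and the probe assumes the library needs read/write access to /dev/xpmem. A result of 2 (ENOENT) would be consistent with the "Error code: 2 (No such file or directory)" line in the original report of this issue.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not an XPMEM or Open MPI API): try to open the
 * device node at `path` read/write. Returns 0 on success, or the errno
 * value set by open(2) on failure (for example ENOENT or EACCES). */
int probe_device(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return errno;
    }
    close(fd);
    return 0;
}

/* Build with -DPROBE_MAIN to get a standalone checker. */
#ifdef PROBE_MAIN
int main(void)
{
    const char *path = "/dev/xpmem";
    int err = probe_device(path);

    if (err == 0) {
        printf("%s can be opened by this user\n", path);
    } else {
        printf("%s: error %d (%s)\n", path, err, strerror(err));
    }
    return err == 0 ? 0 : 1;
}
#endif
```

Compiled with something like `cc -DPROBE_MAIN probe.c -o probe` (the file name is arbitrary) and run on a compute node, it shows whether the device node is missing (ENOENT) or present but not accessible to your user (EACCES). It does not prove that xpmem_make itself will succeed, only that the first step of opening the device is possible.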
Got it :) Thank you, this really helped me.
KERNEL: Use pte offset kernel function for mapped pages
I am trying to use XPMEM with Open MPI 4.x, and used the configure command below to configure Open MPI 4.1.0:
$ ompi_info --all|grep 'command line'
Configure command line: '--prefix=/home/server/ompi4_xmem' '--with-xpmem=/home/server/xpmm' '--enable-mpi-fortran' '--enable-mpi-cxx' '--enable-shared=yes' '--enable-static=yes' '--enable-mpi1-compatibility'
User-specified command line parameters passed to ROMIO's configure script
Complete set of command-line parameters passed to ROMIO's configure script
But I am getting a warning/error when running the FFTW inbuilt MPI benchmark.
$ mpirun --map-by core -rank-by core --bind-to core ./mpi-bench -s ic1000000
WARNING: Could not generate an xpmem segment id for this process'
address space.
The vader shared memory BTL will fall back on another single-copy
mechanism if one is available. This may result in lower performance.
Local host: lib-server-03
Error code: 2 (No such file or directory)
Problem: ic1000000, setup: 580.97 ms, time: 1.76 ms, ``mflops'': 56555
[lib-server-03:1297333] 127 more processes have sent help message help-btl-vader.txt / xpmem-make-failed
[lib-server-03:1297333] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Due to the above warning/error, I am not sure if the MPI program is using XPMEM or CMA.
Can you please help me in resolving this warning/error?
Thanks in advance.