MPI-parallelization of sktwocnt #94

Open
vanderhe wants to merge 11 commits into main from sktwocntMpi
Conversation

@vanderhe (Member) commented Aug 16, 2024

Straightforward (non-distributed) and optional MPI parallelization of the sktwocnt binary, employing the following strategy:

  • fixed maximum distance (static tabulation)
    Creates a single batch of dimer distances that is distributed over the available MPI ranks. Nothing fancy; MPI ranks exceeding the total number of distances idle. We might want to print a warning or even block such calculations.

  • dynamic batches with maximum distance (2 cases: converged or max. distance reached)
    If the number of MPI ranks is smaller than the number of dimer distances contained in the default 1 Bohr batch length, the distances are distributed over the available ranks. If the number of ranks exceeds the number of distances within the default 1 Bohr batch, the batch size is automatically increased to accommodate as many dimer distances as there are MPI ranks. This mostly avoids idling ranks, but an increased batch size later requires the Hamiltonian and overlap data to be cropped to the size one would have obtained with the default batch length of 1 Bohr; otherwise the length of the converged SK-tables would depend on the number of ranks (see the sketch below this list).
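A minimal sketch of the static-tabulation case, using hypothetical names (dists, myRank, nRanks) rather than the identifiers actually used in this PR, and a round-robin partitioning that may differ from the one implemented here:

subroutine process_static_batch(dists, myRank, nRanks)
  use, intrinsic :: iso_fortran_env, only : dp => real64
  implicit none
  !> Dimer distances of the batch
  real(dp), intent(in) :: dists(:)
  !> Zero-based MPI rank of this process and total number of ranks
  integer, intent(in) :: myRank, nRanks
  integer :: iDist

  ! Round-robin distribution: rank i handles distances i+1, i+1+nRanks, ...
  ! Ranks with myRank >= size(dists) get an empty loop and simply idle.
  do iDist = myRank + 1, size(dists), nRanks
    ! ... compute Hamiltonian and overlap integrals for dists(iDist) ...
  end do
end subroutine process_static_batch

For the dynamic case, the cropping mentioned above would amount to something like skham(:, 1:nDefault) after gathering, where nDefault (a hypothetical name) is the number of distances in the default 1 Bohr batch, so that the final table length does not depend on the rank count.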

To be merged after #90 (needs to be rebased).

@vanderhe vanderhe added the enhancement New feature or request label Aug 16, 2024
@vanderhe vanderhe marked this pull request as ready for review August 16, 2024 13:15
@vanderhe vanderhe marked this pull request as draft August 16, 2024 13:16
@vanderhe vanderhe force-pushed the sktwocntMpi branch 4 times, most recently from 432b601 to 51cbe34 on August 16, 2024 13:31
@vanderhe vanderhe marked this pull request as ready for review August 16, 2024 14:53
@vanderhe (Member, Author) commented:

Rebased on #90 to incorporate the shelf-search bug fix.

@vanderhe vanderhe added this to the 0.4 milestone Aug 23, 2024
@bhourahine (Member) left a comment:

Once this is rebased after the merge of #90 I'll have another look.

- name: Compile and Install libXC
run: |
git clone https://gitlab.com/libxc/libxc.git
cd libxc/
git checkout 6.2.2
cmake -DCMAKE_INSTALL_PREFIX=${PWD}/${BUILD_DIR}/${INSTALL_DIR} -DENABLE_FORTRAN=True -B ${BUILD_DIR} .
cd ${BUILD_DIR}
make -j 2
make -j2
Comment (Member):

Any reason not to just use all available resources? (likewise for the ctest commands)

Suggested change
make -j2
make -j

Reply (Member Author):

Not sure whether there is hyper-threading on the VMs. If there is, I would prefer to keep the -j2 version so that only the physical cores are used.

if: contains(matrix.mpi, 'openmpi') || contains(matrix.mpi, 'mpich')
run: |
pushd ${BUILD_DIR}
ctest -j1 --output-on-failure
Comment (Member):

Why -j1?

Suggested change
ctest -j1 --output-on-failure
ctest --output-on-failure

Reply (Member Author):

It is the -DTEST_MPI_PROCS=2 run. Is there any difference between plain ctest and ctest -j1? The latter might be a bit more verbose, but maybe a bit more transparent too. I don't really mind either way.

utils/test/check_submodule_commits (outdated review thread, resolved)
write(stdOut, "(A,A)") "!!! Parsing error: ", txt
write(stdOut, "(2X,A,A)") "File: ", trim(fname)
write(stdOut, "(2X,A,I0)") "Line number: ", iLine
write(stdOut, "(2X,A,A,A)") "Line: '", trim(line), "'"

stop
Comment (Member):

Shouldn't there be an MPI abort first?

Reply (Member Author):

Should be fixed by the latest commit.
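For reference, a minimal sketch of an MPI-aware error stop of the kind discussed here (hypothetical helper name and stdOut source; not necessarily the code added in that commit):

subroutine abort_with_error(txt)
  use mpi_f08, only : mpi_initialized, mpi_abort, mpi_comm_world
  use, intrinsic :: iso_fortran_env, only : stdOut => output_unit
  implicit none
  !> Error message to print before terminating
  character(len=*), intent(in) :: txt
  logical :: tInitialized
  integer :: iErr

  write(stdOut, "(A,A)") "!!! Parsing error: ", txt
  call mpi_initialized(tInitialized, iErr)
  if (tInitialized) then
    ! Bring down all ranks, not only the one that hit the parsing error
    call mpi_abort(mpi_comm_world, 1, iErr)
  end if
  stop 1
end subroutine abort_with_error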

@@ -17,8 +18,19 @@ subroutine write_sktables(skham, skover)
!> Hamiltonian and overlap matrix
real(dp), intent(in) :: skham(:,:), skover(:,:)

call write_sktable_("at1-at2.ham.dat", skham)
call write_sktable_("at1-at2.over.dat", skover)
if (size(skham, dim=2) > 0) then
Comment (Member):

Not sure if I've followed the logic fully, but what happens if non-lead processes get here?

Reply (Member Author):

They don't/shouldn't call this routine. If they did, all processes would write exactly the same file, with the same name and content (I don't know how the filesystem handles that), since all of the data is present on all ranks.
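A minimal sketch of the intended call pattern (hypothetical rank variable, assuming rank 0 is the lead process):

! Only the lead rank writes the SK tables; skham/skover hold identical data on
! all ranks at this point, so restricting the write avoids concurrent file access.
if (myRank == 0) then
  call write_sktables(skham, skover)
end if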

vanderhe and others added 3 commits August 25, 2024 21:39
Co-authored-by: Ben Hourahine <benjamin.hourahine@strath.ac.uk>
Co-authored-by: Ben Hourahine <benjamin.hourahine@strath.ac.uk>
Labels: enhancement (New feature or request)
Projects: none yet
2 participants