[FEA] Test UCX 1.16 protov2 changes #9628

Open
abellina opened this issue Nov 3, 2023 · 0 comments
Labels
feature request New feature or request shuffle things that impact the shuffle plugin

Comments

@abellina
Collaborator

abellina commented Nov 3, 2023

UCX 1.16 is rewriting large parts of the codebase (protov2). We should start testing against it, as the UCX-py team is already doing: rapidsai/ucx-py#992

  • We should turn on UCX in our performance cluster to make sure RDMA continues to work, and verify RoCE traffic with Wireshark.
  • We should also try UCX in an NVLinked environment (DGX) to make sure that it can use NVLink successfully and that it sees all NICs in that case.
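
As a first pass before the Wireshark and DGX runs, the two checks above could be partly scripted by inspecting what UCX reports on a node. The sketch below parses the text printed by `ucx_info -d` and lists the devices under each transport. It assumes the output contains lines shaped like `#      Transport: rc_verbs` followed by `#         Device: mlx5_0:1`; the helper names, the sample text, and the choice of `rc_mlx5` and `cuda_ipc` as the transports to look for are illustrative assumptions, not something this issue specifies.

```python
import re
from collections import defaultdict

# Assumed line shapes in `ucx_info -d` output (the format may differ by UCX version):
#   "#      Transport: rc_verbs"
#   "#         Device: mlx5_0:1"
TRANSPORT_RE = re.compile(r"^#\s*Transport:\s*(\S+)")
DEVICE_RE = re.compile(r"^#\s*Device:\s*(\S+)")

# Hypothetical sample text, used only to illustrate the parser.
SAMPLE = """\
#      Transport: rc_verbs
#         Device: mlx5_0:1
#      Transport: rc_verbs
#         Device: mlx5_1:1
#      Transport: cuda_ipc
#         Device: cuda
"""


def parse_transports(ucx_info_text):
    """Map each transport name to the devices listed under it."""
    devices_by_transport = defaultdict(list)
    current = None
    for line in ucx_info_text.splitlines():
        transport = TRANSPORT_RE.match(line)
        if transport:
            current = transport.group(1)
            devices_by_transport[current]  # record the transport even with no device
            continue
        device = DEVICE_RE.match(line)
        if device and current is not None:
            devices_by_transport[current].append(device.group(1))
    return dict(devices_by_transport)


def missing_transports(found, required):
    """Return the required transport names that UCX did not report, sorted."""
    return sorted(set(required) - set(found))


if __name__ == "__main__":
    found = parse_transports(SAMPLE)
    for name, devices in found.items():
        print(name, devices)
    print("missing:", missing_transports(found, ["rc_mlx5", "cuda_ipc"]))
```

On a real node the text would come from something like `subprocess.run(["ucx_info", "-d"], capture_output=True, text=True).stdout`. This only shows which transports and devices UCX can see, not which ones the shuffle actually uses at runtime, so it complements rather than replaces the packet capture and the NVLink run.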
@abellina abellina added feature request New feature or request ? - Needs Triage Need team to review and classify shuffle things that impact the shuffle plugin labels Nov 3, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Nov 7, 2023

2 participants