-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make pip source installs a bit easier #1048
Conversation
docs/source/install.rst
Outdated
@@ -170,6 +170,6 @@ Building and installing UCX-Py can be done via `pip install`. For example: | |||
conda activate ucx | |||
git clone https://github.com/rapidsai/ucx-py.git | |||
cd ucx-py | |||
pip install -v . | |||
pip install -v -C rapidsai.matrix-entry="cuda=12.2" . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than always including this, should the default development instructions not enable CUDA, and therefore not specify a matrix entry? This assumes we make the other change I suggest above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a question for @pentschev ?
These docs are under a section that says it's assumed you're building on a system with CUDA.
ucx-py/docs/source/install.rst
Lines 50 to 53 in 7b70211
Source | |
------ | |
The following instructions assume you'll be using UCX-Py on a CUDA enabled system and is in a `Conda environment <https://docs.conda.io/projects/conda/en/latest/>`_. |
I'm happy to change this PR either way, but I don't feel qualified to make the call about whether or not the CUDA-enabled build should be the default.
I DO think that we should continue specifying this flag (and therefore depending on the libucx
package) in docs builds here. Since we've seen that those docs builds end up also carrying some test coverage to catch works-without-CUDA types of issues (rapidsai/ucxx#231).
So @pentschev , the specific question for you is, which of these would you prefer?
pip install .
works with 0 other flags, but requires a system installation of UCXpip install .
requires passing-C rapidsai.matrix-entry="cuda12.2"
or similar, but in exchange the locally-built package depends onlibucx
wheels (so no system version required)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just pushed 7b675a4 implementing this, by the way, so you can look at the diff and see what it'd look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs assume a CUDA system because of the --with-cuda=$CUDA_HOME
build flag, which is our primary use case, if that flag is omitted then it should work on non-CUDA systems as well, but we don't document that at the moment.
the specific question for you is, which of these would you prefer?
pip install .
works with 0 other flags, but requires a system installation of UCXpip install .
requires passing-C rapidsai.matrix-entry="cuda12.2"
or similar, but in exchange the locally-built package depends onlibucx
wheels (so no system version required)
Can I have both? 🙂
Ideally, I would like both to work, but given your phrasing it seems this is not possible. So let's assume you have UCX installed on the system and libucx
installed, what would happen then both with and without -C rapidsai.matrix-entry="cuda12.2"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assume you have
libucx
installed
This question is about whether or not pip install .
pulls that dependency in for you, not about what happens if it happens to already be installed.
If you've already done something like the following:
apt-get install openucx
pip install libucx-cu12
Then import ucp
will call libucx.load_library()
, which should find that system installation of libucs
, libucm
, etc., but will fall back to the libucx
-bundled ones if finding them fails for some reason.
Ok so what's the point of -C rapidsai.matrix-entry
? libucx
isn't distributed as the name libucx
.... you have to pick libucx-cu11
(CUDA 11) or libucx-cu12
(CUDA 12). That selection of major CUDA version can happen automatically in rapids-build-backend
, but that detection is being turned off here via the use of disable-cuda = true
, to support source installation on non-CUDA systems by default.
Another option we could pursue, if you want, is to choose a specific CUDA verrsion as the default one for source installations by e.g. recording libucx-cu12
as a dependency in pyproject.toml
. Then pip install .
would "just work" with no additional flags on a system with or without CUDA available and with or without a system installation of UCX. But anyone using CUDA 11 would have to pass -C rapidsai.matrix-entry="cuda11.8"
or similar, or they'd get a libucx-cu12
(built against CUDA 12), which might lead to runtime issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in a39811c. Would you mind having a look if everything seems correct to you @jameslamb ?
Also, I've accidentally committed this local change that I was going to push to another PR. It's harmless and corrects the environment now, so if you don't mind let's leave it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's harmless and corrects the environment now, so if you don't mind let's leave it.
Sure, no problem, it's ok to leave it.
Would you mind having a look
Looking right now! Thanks for doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd written up a long comment about how the docs should indicate that CPU-only is totally supported by pip install ucx-py-cu{11,12}
.... then re-read what you wrote and saw that you already said exactly that 🤦🏻
Anyway, in case it's interesting, here's how I tested that to convince myself:
how I tested that (click me)
Ran the following on my macOS laptop (so no possibility of accidentally finding CUDA):
docker run \
--rm \
-it python:3.11 \
bash
# system-install UCX so we don't link to the the wheel one
apt-get update
apt-get install -y --no-install-recommends \
libucx-dev
# install ucx-py-cu12 (which also pulls in libucx-cu12)
pip install --extra-index-url https://pypi.nvidia.com 'ucx-py-cu12>=0.38'
# import ucx-py and check where the loader found libraries
python -c "import ucp"
ldconfig -p | grep libuc
# libuct.so.0 (libc6,AArch64) => /lib/aarch64-linux-gnu/libuct.so.0
# libuct.so (libc6,AArch64) => /lib/aarch64-linux-gnu/libuct.so
# libucs_signal.so.0 (libc6,AArch64) => /lib/aarch64-linux-gnu/libucs_signal.so.0
# libucs_signal.so (libc6,AArch64) => /lib/aarch64-linux-gnu/libucs_signal.so
# ... etc., etc. ... it found them all
# write the demo code from https://github.com/rapidsai/ucx-py/blob/branch-0.39/docs/source/quickstart.rst
cat > server.py <<EOF
import asyncio
import time
import ucp
import numpy as np
n_bytes = 2**30
host = ucp.get_address(ifname='eth0') # ethernet device name
port = 13337
async def send(ep):
# recv buffer
arr = np.empty(n_bytes, dtype='u1')
await ep.recv(arr)
assert np.count_nonzero(arr) == np.array(0, dtype=np.int64)
print("Received NumPy array")
# increment array and send back
arr += 1
print("Sending incremented NumPy array")
await ep.send(arr)
await ep.close()
lf.close()
async def main():
global lf
lf = ucp.create_listener(send, port)
while not lf.closed():
await asyncio.sleep(0.1)
if __name__ == '__main__':
asyncio.run(main())
EOF
cat > ./client.py <<EOF
import asyncio
import ucp
import numpy as np
port = 13337
n_bytes = 2**30
async def main():
host = ucp.get_address(ifname='eth0') # ethernet device name
ep = await ucp.create_endpoint(host, port)
msg = np.zeros(n_bytes, dtype='u1') # create some data to send
# send message
print("Send Original NumPy array")
await ep.send(msg) # send the real message
# recv response
print("Receive Incremented NumPy arrays")
resp = np.empty_like(msg)
await ep.recv(resp) # receive the echo
await ep.close()
np.testing.assert_array_equal(msg + 1, resp)
print("successfully received")
if __name__ == '__main__':
asyncio.run(main())
EOF
# start the client and server
python ./server.py &
python ./client.py &
Saw output like this:
Send Original NumPy array
Receive Incremented NumPy arrays
Received NumPy array
Sending incremented NumPy array
successfully received
Maybe that'd be a helpful smoke test to add for rapidsai/ucxx#231.
Anyway... everything you wrote looks correct and clear to me! I think this is good to merge. Thanks as always for your patience working through it with me 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smoke tests are always useful, my concern is how would that work in CI? Not sure if it's worth the trouble if we have to add another job at this time, from existing jobs it's probably relatively complicated to do so in a "safe" manner (i.e., ensuring linkage isn't resolved via the proper PyPI/conda UCX package).
I'm also +1 on merging this, I've approved the PR. Thanks so much James for the work here! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's worth the trouble
For sure, this wasn't a proposal to do add a new job or anything right now.
Thanks! I'll merge this once CI passes.
# by default, do not rename the package 'ucx-py-cu${ver}' | ||
# (this is overridden in wheel publishing) | ||
disable-cuda=true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does ucx-py
actually do anything different between CUDA versions? My understanding was no
Only libucx
would potentially do different things for different CUDA versions
So am curious why we want to suffix here. Is this just to provide a way to influence libucx
indirectly? Or is there not really a need?
Edit: Recognize we may just be agreeing, but want to double check that there isn't still something else going on with the suffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll let @pentschev or @vyasr elaborate, but I can share a bit.
why we want to suffix here
I can say confidently it's not ONLY to influence the libucx
dependency, because ucx-py
has been published with CUDA suffixes as far back as v0.32 (https://pypi.nvidia.com/ucx-py-cu12/), and libucx
was only introduced in v0.38 (#1041).
But now that we do have libucx
... the suffixing does serve that purpose of influencing which version you get.
e.g. cugraph-cu12
depends on ucx-py-cu12
which depends on libucx-cu12
Recognize we may just be agreeing
I don't think so.
This disable-cuda=true
in this part of the diff here is removing that suffixing for source builds like pip install .
, based on @pentschev 's feedback that he wanted that workflow to be as simple as possible, especially since in some cases ucx-py
is used without any CUDA at all.
If this PR was accepted in its current state, we'd still be publishing ucx-py-cu{version}
wheels, via pip wheel --config-settings rapidsai.disable-cuda=false
(see changes to ci/build_wheel.sh
in this PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does
ucx-py
actually do anything different between CUDA versions? My understanding was noOnly
libucx
would potentially do different things for different CUDA versions
John is right, UCX-Py doesn't have a direct dependency on CUDA.
So am curious why we want to suffix here. Is this just to provide a way to influence
libucx
indirectly? Or is there not really a need?
I have previously asked the same question myself, which James answered above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, couldn't a user just specify the libucx-cu*
install if they want that? What does it provide to have ucx-py
also including the same information?
Would add on the Conda side we don't provide this distinction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, couldn't a user just specify the
libucx-cu*
install if they want that? What does it provide to haveucx-py
also including the same information?
I think the issue is that ucx-py
needs libucx
, if we don't encode the CUDA version in ucx-py
then the user is responsible to install the appropriate libucx-cu*
which can be error prone. I.e., installing ucx-py
without libucx-cu*
as a dependency will leave ucx-py
in an unusable state, whereas ucx-py
with a libucx-cu*
dependency would make the correct CUDA version uninstallable (would ucx-py
depend on libucx-cu11
or libucx-cu12
?). Please correct me if I'm wrong @jameslamb .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...on the Conda side we don't provide this distinction
Right, because there's ucx
from conda-forge which builds variants for different CUDA versions.
And can constrain everything without needing to use names to disambiguate, by relying on the cuda-version
metapackage.
With wheels and pip
those same mechanisms don't exist (part of what @msarahan and others have been talking about in https://discuss.python.org/t/implementation-variants-rehashing-and-refocusing/54884).
If you want to use ucx-py
with CUDA-enabled UCX, something somewhere has to get that CUDA-enabled UCX installed on the system... and one that was built against the correct version of CUDA. My understanding of rapidsai/build-planning#57 was that this idea of distributing libucx
wheels was a response to difficulties users were facing when asked to install UCX themselves outside of ucx-py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this idea of distributing
libucx
wheels was a response to difficulties users were facing when asked to install UCX themselves outside ofucx-py
.
I wouldn't even go as far as saying "users", even for ourselves it was quite challenging to bundle UCX with the libraries that need it, so much so that we had UCX broken for many months. One step further, this also bloated package sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior to the libucx wheels existing, ucx-py statically linked/bundled components of the CTK in order to support running with CUDA. In that world, yes it did have a CUDA dependency because the CUDA transport layer built into the wheel was specific to a given CUDA version. That CUDA dependency has now migrated to the libucx package, so the purpose of the suffix in ucx-py is now to select the appropriate libucx dependency.
What does it provide to have ucx-py also including the same information?
This is generally what we've done throughout RAPIDS because if a pure Python package A depends on a CUDA-dependent package B there is no way to express the which CUDA version it should select for B without having separate versions of A. For instance, we do this in dask-cudf-cuXY
, which has a hard dependency on cudf-cuXY
. ucx-py is a bit of a special case because it works without CUDA, so the CUDA versions are truly optional unlike with something like dask-cudf
(which requires cudf
and therefore must pull a specific CUDA versioned one). In the past we've discussed is having some more automated way of having a default choice of CUDA version, but that still requires a default behavior so you can't get around the different packages in any way that I can tell.
For ucx-py, CUDA support is optional , and since rapidsai/ucx-wheels#5 we allow installing libucx on a system without a GPU. I don't see an option for the dependency specification that provides a good UX in all cases though. You either have to force users to manually install dependencies or you make the package not work by default out of the box AFAICT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that! The used-to-statically-link-bits-of-the-CTK was a part I was missing.
I think influencing the libucx
dependency via a CUDA suffix should continue to be the approach here.
…k to 'ucx' from the conda environment
…etting a bit of test coverage for libucx in CUDA-less environments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @jameslamb !
/merge |
1 similar comment
/merge |
@vyasr maybe you have to approve too because you've been explicitly asked to review? Not sure to be honest. |
@pentschev This was my mistake, we still need a I've asked @vyasr offline to to help us with one. |
Oh, now I see that. I always miss the grey text. 😅 |
@@ -7,7 +7,7 @@ dependencies: | |||
- python=3.9 | |||
- cudatoolkit=11.5 | |||
- setuptools | |||
- cython>=0.29.14,<3.0.0a0 | |||
- cython>=3.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did this change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pentschev accidentally committed it (it was intentional, just not intended for this PR) and proposed just keeping it: #1048 (comment)
Follow-up to #1044
Contributes to rapidsai/build-planning#31
Related to #1047
libucx
build requirements so they can be satisfied by only packages from pypi.org--extra-index-url https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/
inpip install
calls, now thatrapids-build-backend
is on pypi.org (https://pypi.org/project/rapids-build-backend/)rapids-build-backend
configdisable-cuda=true
inpyproject.toml
pip install .
will not requirenvcc
or produce a wheel with a suffix like-cu12
-cu${ver}
suffixesNotes for Reviewers
These changes came out of an offline conversation with @pentschev and @vyasr