Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make pip source installs a bit easier #1048

Merged
merged 9 commits into from
Jul 3, 2024
2 changes: 0 additions & 2 deletions .readthedocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,6 @@ build:
post_create_environment:
- |
pip install \
--extra-index-url "https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/" \
-C rapidsai.disable-cuda=true \
-C rapidsai.matrix-entry="cuda=12.2" \
.

Expand Down
2 changes: 1 addition & 1 deletion ci/build_wheel.sh
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ rapids-generate-version > ./VERSION

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

python -m pip wheel . -w dist --no-deps --disable-pip-version-check
python -m pip wheel . -w dist --no-deps --disable-pip-version-check --config-settings rapidsai.disable-cuda=false

mkdir -p final_dist
python -m auditwheel repair \
Expand Down
16 changes: 10 additions & 6 deletions dependencies.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -142,17 +142,20 @@ dependencies:
- --extra-index-url=https://pypi.nvidia.com
- --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple
specific:
# very tight >=x.x.x,<x.x.(x+1) here allows for installation of
# post release like 1.15.0.post1
- output_types: [requirements, pyproject]
matrices:
- matrix: {cuda: "12.*"}
packages:
- libucx-cu12==1.15.0
- libucx-cu12>=1.15.0,<1.15.1
vyasr marked this conversation as resolved.
Show resolved Hide resolved
- matrix: {cuda: "11.*"}
packages:
- libucx-cu11==1.15.0
- libucx-cu11>=1.15.0,<1.15.1
# this fallback is intentionally empty... it simplifies building from source
# without CUDA, e.g. 'pip install .'
- matrix: null
packages:
- libucx==1.15.0
packages: null
depends_on_ucx_run:
common:
- output_types: conda
Expand All @@ -172,9 +175,10 @@ dependencies:
- matrix: {cuda: "11.*"}
packages:
- libucx-cu11>=1.15.0,<1.16
# this fallback is intentionally empty... it simplifies building from source
# without CUDA, e.g. 'pip install .'
- matrix: null
packages:
- libucx>=1.15.0,<1.16
packages: null
test_python:
common:
- output_types: [conda, requirements, pyproject]
Expand Down
5 changes: 3 additions & 2 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@ authors = [
license = { text = "BSD-3-Clause" }
requires-python = ">=3.9"
dependencies = [
"libucx>=1.15.0,<1.16",
"numpy>=1.23,<2.0a0",
"pynvml>=11.4.1",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit dependencies.yaml and run `rapids-dependency-file-generator`.
Expand Down Expand Up @@ -116,10 +115,12 @@ build-backend = "setuptools.build_meta"
commit-files = [
"ucp/COMMIT_FILE"
]
# by default, do not rename the package 'ucx-py-cu${ver}'
# (this is overridden in wheel publishing)
disable-cuda=true
Comment on lines +118 to +120
Copy link
Member

@jakirkham jakirkham Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ucx-py actually do anything different between CUDA versions? My understanding was no

Only libucx would potentially do different things for different CUDA versions

So am curious why we want to suffix here. Is this just to provide a way to influence libucx indirectly? Or is there not really a need?

Edit: Recognize we may just be agreeing, but want to double check that there isn't still something else going on with the suffix

Copy link
Member Author

@jameslamb jameslamb Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let @pentschev or @vyasr elaborate, but I can share a bit.

why we want to suffix here

I can say confidently it's not ONLY to influence the libucx dependency, because ucx-py has been published with CUDA suffixes as far back as v0.32 (https://pypi.nvidia.com/ucx-py-cu12/), and libucx was only introduced in v0.38 (#1041).

But now that we do have libucx... the suffixing does serve that purpose of influencing which version you get.

e.g. cugraph-cu12 depends on ucx-py-cu12 which depends on libucx-cu12

Recognize we may just be agreeing

I don't think so.

This disable-cuda=true in this part of the diff here is removing that suffixing for source builds like pip install ., based on @pentschev 's feedback that he wanted that workflow to be as simple as possible, especially since in some cases ucx-py is used without any CUDA at all.

If this PR was accepted in its current state, we'd still be publishing ucx-py-cu{version} wheels, via pip wheel --config-settings rapidsai.disable-cuda=false (see changes to ci/build_wheel.sh in this PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ucx-py actually do anything different between CUDA versions? My understanding was no

Only libucx would potentially do different things for different CUDA versions

John is right, UCX-Py doesn't have a direct dependency on CUDA.

So am curious why we want to suffix here. Is this just to provide a way to influence libucx indirectly? Or is there not really a need?

I have previously asked the same question myself, which James answered above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, couldn't a user just specify the libucx-cu* install if they want that? What does it provide to have ucx-py also including the same information?

Would add on the Conda side we don't provide this distinction

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, couldn't a user just specify the libucx-cu* install if they want that? What does it provide to have ucx-py also including the same information?

I think the issue is that ucx-py needs libucx, if we don't encode the CUDA version in ucx-py then the user is responsible to install the appropriate libucx-cu* which can be error prone. I.e., installing ucx-py without libucx-cu* as a dependency will leave ucx-py in an unusable state, whereas ucx-py with a libucx-cu* dependency would make the correct CUDA version uninstallable (would ucx-py depend on libucx-cu11 or libucx-cu12?). Please correct me if I'm wrong @jameslamb .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...on the Conda side we don't provide this distinction

Right, because there's ucx from conda-forge which builds variants for different CUDA versions.

https://github.com/conda-forge/ucx-split-feedstock/blob/b60fb426784cbb6d5cbd384a76c86920737838f0/recipe/meta.yaml#L51-L53

And can constrain everything without needing to use names to disambiguate, by relying on the cuda-version metapackage.

With wheels and pip those same mechanisms don't exist (part of what @msarahan and others have been talking about in https://discuss.python.org/t/implementation-variants-rehashing-and-refocusing/54884).

If you want to use ucx-py with CUDA-enabled UCX, something somewhere has to get that CUDA-enabled UCX installed on the system... and one that was built against the correct version of CUDA. My understanding of rapidsai/build-planning#57 was that this idea of distributing libucx wheels was a response to difficulties users were facing when asked to install UCX themselves outside of ucx-py.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this idea of distributing libucx wheels was a response to difficulties users were facing when asked to install UCX themselves outside of ucx-py.

I wouldn't even go as far as saying "users", even for ourselves it was quite challenging to bundle UCX with the libraries that need it, so much so that we had UCX broken for many months. One step further, this also bloated package sizes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to the libucx wheels existing, ucx-py statically linked/bundled components of the CTK in order to support running with CUDA. In that world, yes it did have a CUDA dependency because the CUDA transport layer built into the wheel was specific to a given CUDA version. That CUDA dependency has now migrated to the libucx package, so the purpose of the suffix in ucx-py is now to select the appropriate libucx dependency.

What does it provide to have ucx-py also including the same information?

This is generally what we've done throughout RAPIDS because if a pure Python package A depends on a CUDA-dependent package B there is no way to express the which CUDA version it should select for B without having separate versions of A. For instance, we do this in dask-cudf-cuXY, which has a hard dependency on cudf-cuXY. ucx-py is a bit of a special case because it works without CUDA, so the CUDA versions are truly optional unlike with something like dask-cudf (which requires cudf and therefore must pull a specific CUDA versioned one). In the past we've discussed is having some more automated way of having a default choice of CUDA version, but that still requires a default behavior so you can't get around the different packages in any way that I can tell.

For ucx-py, CUDA support is optional , and since rapidsai/ucx-wheels#5 we allow installing libucx on a system without a GPU. I don't see an option for the dependency specification that provides a good UX in all cases though. You either have to force users to manually install dependencies or you make the package not work by default out of the box AFAICT.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that! The used-to-statically-link-bits-of-the-CTK was a part I was missing.

I think influencing the libucx dependency via a CUDA suffix should continue to be the approach here.

dependencies-file = "dependencies.yaml"
requires = [
"cython>=3.0.0",
"libucx==1.15.0",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit dependencies.yaml and run `rapids-dependency-file-generator`.

[tool.setuptools]
Expand Down
Loading