
Relax PyTorch upper bound (allowing 2.4) #4703

Open · wants to merge 11 commits into base: branch-24.12
Conversation

@jakirkham (Member) commented Oct 7, 2024

As the issue around PyTorch being built without NumPy was fixed in conda-forge, we can now relax these upper bounds to allow PyTorch 2.4.

xref: conda-forge/pytorch-cpu-feedstock#254
xref: conda-forge/pytorch-cpu-feedstock#266
xref: #4615
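For anyone who wants to sanity-check an install locally, a minimal sketch along these lines (not part of this PR's changes) confirms that the installed PyTorch build has NumPy interop; builds compiled without NumPy typically raise a RuntimeError from Tensor.numpy():

```python
import numpy as np
import torch

# A PyTorch build compiled without NumPy support typically raises
# RuntimeError ("Numpy is not available") on the .numpy() call below.
tensor = torch.arange(4)
array = tensor.numpy()
assert isinstance(array, np.ndarray)
print(torch.__version__, np.__version__, "NumPy interop OK")
```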

@jakirkham requested review from a team as code owners on October 7, 2024 at 19:45
@jakirkham mentioned this pull request on Oct 7, 2024
@jakirkham added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels on Oct 7, 2024
@alexbarghi-nv (Member) left a comment

👍

@jakirkham (Member, Author) commented Oct 7, 2024

Odd, looks like the S3 upload of wheels built on CI failed. Going to cancel the builds and restart.

[rapids-upload-to-s3] Path to upload is a directory, creating .tar.gz
./
./cugraph_equivariant_cu11-24.12.0a40-py3-none-any.whl
upload failed: - to s3://rapids-downloads/ci/cugraph/pull-request/4703/fb7a6a6/cugraph_wheel_python_cugraph-equivariant_cu11.tar.gz An error occurred (InternalError) when calling the PutObject operation (reached max retries: 2): We encountered an internal error. Please try again.
Error: Process completed with exit code 1.

Edit: Restarting appears to have fixed this

@jameslamb (Member) left a comment

Great! Always happy to see a `<` removed when possible.

@jakirkham (Member, Author)

One CI build failed because it pulled in CuPy 12.2.0 alongside NumPy 2, which is an incompatible combination.

Looks like this is due to a repodata patch not getting applied to older CuPy packages in conda-forge. Addressing in upstream PR: conda-forge/conda-forge-repodata-patches-feedstock#873

Not expecting any other action needed here (beyond restarting the failed build when the upstream PR lands)
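As an illustration only, a small environment guard could flag the bad solve described above, where an old CuPy lands next to NumPy 2; the threshold mirrors the cupy>=13.3 pin discussed later in this thread, and the sketch assumes the packaging library is available:

```python
# Illustrative sketch only: flag environments where an old CuPy was solved
# alongside NumPy 2 (CuPy 12.2.0 broke here, while CuPy 13.3.0 worked).
from packaging.version import Version

import cupy
import numpy

if Version(numpy.__version__) >= Version("2.0") and Version(cupy.__version__) < Version("13.3"):
    raise RuntimeError(
        f"cupy {cupy.__version__} may not be compatible with numpy {numpy.__version__}"
    )
print("cupy/numpy combination looks OK:", cupy.__version__, numpy.__version__)
```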

@alexbarghi-nv (Member)

WholeGraph feature store tests are failing. I'm looking into it.

@jakirkham (Member, Author)

Thanks Alex! 🙏

Meaning the failures in this CI job:

FAILED tests/data_store/test_gnn_feat_storage_wholegraph.py::test_feature_storage_wholegraph_backend - assert 0 > 0
FAILED tests/data_store/test_gnn_feat_storage_wholegraph.py::test_feature_storage_wholegraph_backend_mg - assert 0 > 0

Looks like that job does have CuPy 13.3.0 + NumPy 2, which are compatible. So this is unrelated to the CuPy repodata patch above

Agree these are worth looking into

Did notice this line when going through the log

pytorch                   2.4.1           cpu_mkl_py312hf535c18_100    conda-forge

So maybe this is a different dependency resolution issue. Have a hunch as to what this might be, but will need to investigate further
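
If the hunch is that the CPU-only cpu_mkl variant of PyTorch got installed (which would leave GPU-backed WholeGraph tests with no usable devices), a quick check in that environment might look like this sketch:

```python
import torch

# Report which PyTorch build is installed and whether it can see any GPUs.
# A CPU-only variant (such as the cpu_mkl build in the log above) reports
# no CUDA support and a device count of 0.
print("torch version:", torch.__version__)
print("built with CUDA:", torch.version.cuda)       # None for CPU-only builds
print("CUDA available:", torch.cuda.is_available())
print("visible devices:", torch.cuda.device_count())
```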

@alexbarghi-nv (Member)

Oh, good catch on the pytorch 2.4.1 cpu_mkl_py312hf535c18_100 line in the log. That is very likely the cause.

@alexbarghi-nv (Member)

I'm having a hard time replicating that dependency resolution locally. Instead, mamba is resolving:

  + pytorch                                  2.3.1  cuda120_py312h26b3cf7_300            conda-forge            25MB

@alexbarghi-nv (Member)

Never mind, I just replicated it.

@alexbarghi-nv (Member)

This error goes away when adding -c pytorch, which correctly installs pytorch from the pytorch channel. Let me push a change to this PR.

@alexbarghi-nv (Member)

It looks like we failed due to selecting an older version of cupy. Can we safely bind cupy to >=13.3 in dependencies.yaml? @jakirkham

@jakirkham (Member, Author)

Think this has to do with the issue mentioned above ( #4703 (comment) ), which was recently resolved

Let's give this another try

Am going to try without the pytorch channel if that is ok

Let's look at the results of that

We can then discuss afterwards whether we want any more changes

@jakirkham (Member, Author)

Now seeing errors like these in the failing CI jobs:

NotImplementedError: `ego_graph' is not implemented by ['cugraph'] backends. To remedy this, you may enable automatic conversion to more backends (including 'networkx') by adding them to `nx.config.backend_priority`, or you may specify a backend to use with the `backend=` keyword argument.

Are we missing something in cuGraph?
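
For reference, the error message itself suggests two workarounds on the NetworkX side; a minimal sketch with a toy graph (not the failing test's setup):

```python
import networkx as nx

G = nx.karate_club_graph()

# Option 1 (per the error message): add fallback backends to
# nx.config.backend_priority so automatic conversion can happen.
# Option 2: request a backend explicitly for a single call, e.g.:
H = nx.ego_graph(G, 0, radius=1, backend="networkx")
print(H.number_of_nodes())
```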

@jakirkham (Member, Author)

From offline discussion, sounds like this was fixed in PR: #4717

Updated this PR to pull in those changes

@jakirkham (Member, Author)

Looks like this is passing! 🥳

@alexbarghi-nv @jameslamb would you like to give it one last look?

@jameslamb (Member) left a comment

Looks fine to me. Now that we have CI set up in https://github.com/rapidsai/cugraph-gnn, can you replicate these changes over there as well? Hopefully soon we can move cugraph-dgl and cugraph-pyg out of this repo completely and not have to do that duplicate work.

Labels: ci, conda, improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), python