Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

conda-build may raise false alarm on Windows? #13

Closed
leofang opened this issue Nov 10, 2023 · 15 comments · Fixed by #14
Closed

conda-build may raise false alarm on Windows? #13

leofang opened this issue Nov 10, 2023 · 15 comments · Fixed by #14

Comments

@leofang
Copy link
Member

leofang commented Nov 10, 2023

During the testing of CUDA 12 + Windows (conda-forge/cupy-feedstock#228), we see this warning

WARNING (cupy,Lib/site-packages/cupy_backends/cuda/api/runtime.cp310-win_amd64.pyd): Needed DSO Library/bin/cudart64_12.dll found in ['cuda-cudart_win-64']
WARNING (cupy,Lib/site-packages/cupy_backends/cuda/api/runtime.cp310-win_amd64.pyd): .. but ['cuda-cudart_win-64'] not in reqs/run, (i.e. it is overlinking) (likely) or a missing dependency (less likely)

which seems to be saying we missed a run_exports somewhere in packages generated from this feedstock, but it's not the case. cuda-cudart_win-64 is installed in both build and test stages just fine. Also, there's no problem on Linux, and the export relation (cuda-cudart-dev exporting cuda-cudart that depends on cuda-cudart_{{ platform }}) is the same on both OS. Seems like a false alarm that can be safely ignored.

@jakirkham
Copy link
Member

In the Linux case, we don't see issues as libraries link to the symlinks in cuda-cudart instead

If we look at the Linux case, we see something like the following

INFO (cupy,lib/python3.9/site-packages/cupy/cuda/thrust.cpython-39-x86_64-linux-gnu.so): Needed DSO lib/libcudart.so.12 found in conda-forge::cuda-cudart-12.0.107-h59595ed_6

It doesn't warn here as the the library links to the SO symlinks, which ship with cuda-cudart

Interestingly conda-build doesn't warn that the symlink points to a file in a different package (cuda-cudart_linux_64). Maybe it should?

- name: cuda-cudart
files: # [linux]
- lib/libcu*.so.* # [linux]

@jakirkham
Copy link
Member

One option would be to modify cuda-cudart-dev's run_exports

build:
run_exports:
- {{ pin_subpackage("cuda-cudart", max_pin="x") }}

...to include...

     - {{ pin_subpackage("cuda-cudart_" + target_platform, max_pin="x") }}

This is a bit redundant

Also not sure offhand if this would have other consequences. It could be restricted to Windows

Maybe there are better solutions

@isuruf
Copy link
Member

isuruf commented Nov 10, 2023

conda-build warning is correct. cudart64_12.dll should be in cuda-cudart just like libcudart.so.*.

@jakirkham
Copy link
Member

jakirkham commented Nov 11, 2023

What would happen to cuda-cudart_win-64 then?

In the Linux case those are just symlinks, the actual files are in cuda-cudart_linux-* and are located in targets

AIUI this structure was used for cross-compilation support

@carterbox
Copy link
Member

I am seeing this same error on the libmagma-feedstock. Except on the libmagma feedstock, the warning is promoted to an error.

@isuruf
Copy link
Member

isuruf commented Nov 16, 2023

What would happen to cuda-cudart_win-64 then?

It's not needed as we don't do cross-compilation for windows.

@jakirkham
Copy link
Member

Thought we wanted to avoid special casing for Windows though ( conda-forge/staged-recipes#21723 (comment) )

Currently am seeing the following:

If we start dropping the various *_{{ target_platform }} packages on Windows, we would need to special case Windows in multiple cases

Am wondering whether this is worth it to avoid a warning

What other options might we have that would involve fewer touch points?

@carterbox
Copy link
Member

carterbox commented Nov 23, 2023

We could try to get symlinks working for the Windows package like they do for Linux. According to this blog post from 2016, you no longer need admin priviledge to create symlinks in Windows 10, but you may need to enable "Developer Mode". I'm not sure whether "Developer Mode" is enabled on the Azure workers.

@carterbox
Copy link
Member

I prototyped using relative Symlinks on Windows to mirror how the Linux packages work. Unfortunately, it seems that conda does not handle installting symlinks on Windows correctly.

ERROR conda.core.link:_execute(740): An error occurred while installing package 'local::cuda-cudart-12.0.107-h63175ca_6'.
Rolling back transaction: ...working... done
Traceback (most recent call last):
  File "C:\Users\dchin\miniconda3\Scripts\conda-build-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\cli\main_build.py", line 495, in main
    execute(sys.argv[1:])
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\cli\main_build.py", line 475, in execute
    outputs = api.build(
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\api.py", line 180, in build
    return build_tree(
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\build.py", line 3089, in build_tree
    test(pkg, config=metadata.config.copy(), stats=stats)
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\build.py", line 2879, in test
    environ.create_env(metadata.config.test_prefix, actions, config=metadata.config,
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda_build\environ.py", line 1058, in create_env
    execute_actions(actions, index)
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\common\io.py", line 84, in decorated
    return f(*args, **kwds)
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\plan.py", line 322, in execute_actions
    execute_instructions(plan, index, verbose)
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\plan.py", line 532, in execute_instructions
    cmd(state, arg)
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\instructions.py", line 73, in UNLINKLINKTRANSACTION_CMD
    unlink_link_transaction.execute()
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\core\link.py", line 283, in execute
    self._execute(tuple(chain(*chain(*zip(*self.prefix_action_groups.values())))))
  File "C:\Users\dchin\miniconda3\lib\site-packages\conda\core\link.py", line 756, in _execute
    raise CondaMultiError(
conda.CondaMultiError: [Errno 2] No such file or directory: 'C:\\Users\\dchin\\miniconda3\\pkgs\\cuda-cudart-12.0.107-h63175ca_6\\Library\\bin\\cudart64_12.dll'
()

It's weird because if I unpack the archives manually, the symlinks seem to be intact, so I'm not sure what's up. There are some other open issues related to links on Windows, though.

conda/conda#12465
conda/conda#10521

@jakirkham
Copy link
Member

Sorry, but I think I'm not quite following the proposal, Daniel

How would symlinks be used here (assuming we could get them to work)?

@msarahan
Copy link
Member

I think @carterbox might be proposing to match the behavior on Linux. That is, symlink from the targets folder (provided by the cuda-cudart_win-* package(s)) to %PREFIX%/Library/bin.

Unfortunately, symlinks on Windows are still a dumpster fire that we can't rely on. Beyond that, there's myriad code built around the assumption that symlinks don't work, so it would be a major effort to support them. Requiring users to change to "developer mode" or whatever is just too much to ask, unfortunately.

I agree that we need to avoid special casing on Windows. Even if we decide to throw cross-compiling out the window on all OSes, NVIDIA still wants to keep the targets/... folder structure, because that matches CUDA packages distributed in other places. Breaking the targets structure is more disruptive to NVCC and friends.

My inclination is to follow @jakirkham suggestion at #13 (comment), assuming that it works to fix the issue. If this affected only one consumer package, I'd just ignore the warning/error with a Conda-build option. Since this seems like it will affect any package using CUDA, we need a less ignorant approach.

@carterbox
Copy link
Member

I think @carterbox might be proposing to match the behavior on Linux. That is, symlink from the targets folder (provided by the cuda-cudart_win-* package(s)) to %PREFIX%/Library/bin.

Yes, exactly.

Requiring users to change to "developer mode" or whatever is just too much to ask, unfortunately.

For clarity, my understanding is the "Developer Mode" is only required for creating new links, not using them once they exist, so I don't think it would require every Conda user to enable "Developer Mode" only creators of packages that contain symlinks. No comment on the usability/reliability of symlinks once they are created.

Why don't we just create copies instead of symlinks? That way the library is in the right place for runtime and also is available for cross-compiling if/when Windows ARM gains support for CUDA. With copy instead of link, we might also trim some dependencies, so that end users don't download two copies of the same file.

@adibbley
Copy link
Contributor

Why don't we just create copies instead of symlinks?

Copies are probably a good option here given this should only be required for cudart. Would we need to duplicate .lib files as well? Either way it would only increase package size by <5MB, so may not even be worth trimming dependencies to help keep things less disruptive.

@jakirkham
Copy link
Member

jakirkham commented Nov 30, 2023

Whether symlinks or copies, am not sure this makes sense in the Windows case as we don't have a targets directory (as on Linux), which means this would create files that exist at the same location in both packages leading to clobbering issues

At least atm, am leaning towards the run_exports solution above. It seems like the smallest and least disruptive change we have come up with

@carterbox
Copy link
Member

Whether symlinks or copies, am not sure this makes sense in the Windows case as we don't have a targets directory (as on Linux), which means this would create files that exist at the same location in both packages leading to clobbering issues

The upstream Windows package doesn't have "targets", but it does have a "lib\x64" folder. I assumed that the corresponding hypothetical arm package would have an "lib\arm64" folder. It's just that in this conda recipe the contents of "lib/x64" is unpacked directly into "lib". So I guess my proposal is to have two copies of whatever files. One in "lib" and the others in "lib\x64" and/or "lib\arm64".

I also just realized that the Windows artifacts in this recipe are probably incorrectly split because of differences between how Unix and Windows artifacts are used:

UNIX:
needed at compile/link time: headers and .SO/.DYLIB (dynamic) or .A (static)
needed at run time: .SO (dynamic) files

WINDOWS:
needed at compile/link time: headers and .LIB (static or dynamic) files
needed at run time: .DLL (dynamic) files

Thus to support the cross-compiling for Windows, the artifact split should be a bit different from unix since DLLs are not needed for cross-compile, but the LIB files are? Thus, the DLL should only appear in the native package, while the LIB file should be copied/linked and appear in cross-compile packages, so something like this:

  # This package is runtime deps for native, we only need DLL at runtime
  - name: cuda-cudart
    files:
      - lib/libcu*.so.*                            # [linux]
      - Library\bin\cudart*.dll                    # [win]
  # This package is compile/link deps for native; we need LIB and header files, let's dump they directly into lib\
  - name: cuda-cudart-dev
    build:
      run_exports:
        - {{ pin_subpackage("cuda-cudart", max_pin="x") }}
    files:
      - lib/libcu*.so  # [linux]
      - lib/pkgconfig  # [linux]
      - Library\lib\cuda*.lib  # [win]
  # This package is compile/link deps for cross-compile; we need LIB and header files keep the LIB files in their target subdirectory
  - name: cuda-cudart-dev_{{ target_platform }}
    build:
      noarch: generic
      run_exports:
        - {{ pin_subpackage("cuda-cudart", max_pin="x") }}
    files:
      - targets/{{ target_name }}/include                               # [linux]
      - targets/{{ target_name }}/lib/*.so                              # [linux]
      - Library\include                                                 # [win]
      - Library\lib\{{ target_name }}\cudart*.lib                       # [win]
  # This package is compile/link deps for cross-compile; unused by Windows because you wouldn't use DLLs when target != host
  - name: cuda-cudart_{{ target_platform }}
    build:
      noarch: generic
    files:
      - targets/{{ target_name }}/lib/libcu*.so.*  # [linux]

The only thing that confuses me is whether headers are target specific as they seem to be fore unix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants