
Runtime detection MacOS #258

Merged: 24 commits, Nov 12, 2023
Conversation

SignalRT
Collaborator

@SignalRT SignalRT commented Nov 6, 2023

  • Add MacOS x86_64 to the build
  • Copy the different runtimes to the output folder
  • Dynamically load the right MacOS library
  • Tested on MacOS Arm64
  • Tested on MacOS x86_64
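The "dynamically load the right MacOS library" step can be sketched like this (a Python illustration of the selection logic only; the directory names are assumptions, not necessarily LLamaSharp's exact layout):

```python
import platform

# Sketch of the dynamic selection logic this PR describes: pick the native
# llama library matching the current macOS architecture. The directory
# layout below is illustrative, not LLamaSharp's exact paths.
def select_macos_library(machine: str = "") -> str:
    machine = machine or platform.machine()
    if machine == "arm64":
        return "runtimes/macos-arm64/libllama.dylib"
    if machine == "x86_64":
        return "runtimes/macos-x86_64/libllama.dylib"
    raise RuntimeError(f"unsupported macOS architecture: {machine}")
```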

@SignalRT SignalRT mentioned this pull request Nov 6, 2023
@martindevans
Member

I guess https://github.com/SciSharp/LLamaSharp/blob/6334f25627d4a5903e14345fb50fd4cf03389cef/LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec# probably needs updating to reference this new binary?

Also, does the MacMetal nuspec still need to exist? I think we dropped that as a specific backend a while ago?

@SignalRT
Collaborator Author

SignalRT commented Nov 6, 2023

@martindevans, this is related to Issue #251. I reorganized the binaries into directories (only the MacOS binaries), changed the dynamic detection, changed the build to include Intel binaries, and updated the MacOS binaries to run a test in my branch (https://github.com/SignalRT/LLamaSharp/actions/runs/6777594696), where the Mac tests on the Intel platform are included.

There is no work on Windows / Linux binaries.

@martindevans
Member

What I was meaning about the nuspec is at the moment it does this:

<file src="runtimes/libllama.dylib" target="runtimes\osx-x64\native\libllama.dylib" />
<file src="runtimes/ggml-metal.metal" target="runtimes\osx-x64\native\ggml-metal.metal" />
<file src="runtimes/libllama.dylib" target="runtimes\osx-arm64\native\libllama.dylib" />
<file src="runtimes/ggml-metal.metal" target="runtimes\osx-arm64\native\ggml-metal.metal" />

i.e. the same files on ARM64 and x64.

Should that be changed to something like this:

<file src="runtimes/macos-x86_64/libllama.dylib" target="runtimes\macos-x86_64\native\libllama.dylib" />
<file src="runtimes/macos-arm64/libllama.dylib" target="runtimes\macos-arm64\native\libllama.dylib" />
<file src="runtimes/macos-arm64/ggml-metal.metal" target="runtimes\macos-arm64\native\ggml-metal.metal" />

Note: I haven't worked with nuspec before, so I'm not sure that's correct!

@SignalRT
Collaborator Author

SignalRT commented Nov 7, 2023

@martindevans, @AsakusaRinne I did not change any NuGet package because there are several possible directions for these packages. With the dynamic library load it could be only "one" NuGet package, as the most extreme solution, or a NuGet package per OS version, etc.

Talking about the Mac use case only, to limit the options: the Intel build supports CPU only, while the ARM build supports, with the same binary, both CPU and GPU (configured via the number of GPU layers). So it could make sense to put all of this into only ONE NuGet package. But that package would be neither CPU-only nor Metal-only; it would just be a MacOS package that uses the best available option.

Should we keep the current packages, or think about another NuGet package distribution (by OS, for example)?

Once it is clear whether you want to keep the current NuGet packages or group them into another layout, I can make the changes.

@AsakusaRinne
Collaborator

AsakusaRinne commented Nov 7, 2023

@SignalRT Thank you a lot for this great work! Each of these options has its own advantages and disadvantages, and we need to make a choice among them.

In my opinion, the following options are ranked from high to low priority:

  1. Keep all binaries in one package, but allow using a self-compiled binary without installing the backend.
  2. Split binaries by OS.
  3. Keep the current distribution.

However, my view may be limited, so I'll explain my reasons, and I'm open to suggestions and different voices.

The binaries are not large (the sum of all of them is less than 20MB), so they won't be a burden in most cases. Besides, "the backend is not installed" is one of the most frequently asked questions, which makes me think stacking all binaries in one package may be a good choice. However, we should also support the case where users want to compile the binaries themselves, or just don't want to keep all of them.

There's an even more aggressive, but not bad, option: keep all binaries along with our main library in the LLamaSharp package, and instead name the one without binaries LLamaSharp.Lite (for those who want to run on mobile, for example).

Splitting by OS is a good compromise. To be honest, I'm quite hesitant between the first two options. The risk comes from the users: I think the gap of moving from packages split by computation ability to packages split by OS is larger than that of moving to one package.

The last point relates to the risk of options 1 and 2. Changing the package distribution has so many effects, especially on other libraries that depend on us, that I'm wondering if we should bump the major version to 1.0.0. But is this already a good moment to publish 1.0.0? I'm not sure either. Keeping the current distribution has only one advantage: we can make the major version change when everything is ready in the future.

cc @martindevans @saddam213 @xbotter @Oceania2018 @sagilio I'd also like to hear from you, please.

@martindevans
Member

One option we could consider would be the extreme version SignalRT mentioned, where each package has exactly one DLL (e.g. LLamaSharp.Backend.CPU.AVX512, LLamaSharp.Backend.CUDA12.AVX2, etc.), but then we distribute "meta" packages which simply depend on those other packages. For example, LLamaSharp.Backend.CPU would depend on every single CPU package for all platforms (relying on runtime feature detection to pick the right one). That way users can pick a very specific package if they want, but the default would look a lot like it does today: either install Backend.CPU or Backend.CUDA_VERSION and it all just works.

My concern with this approach is that some users might try to install various different specific packages to debug DLL dependency issues and end up with a bit of a mess.


For this immediate release I would vote for keeping it simple: add the MacOS binaries for both architectures into the Backend.CPU package (since that's the "default" backend, even if the name isn't quite right given that MacOS+ARM64 includes Metal).

@SignalRT
Collaborator Author

SignalRT commented Nov 7, 2023

@martindevans When I talk about the extreme solution, I mean having only ONE package (option 1 that @AsakusaRinne refers to).

It's OK for me to move my changes into the Backend.CPU package. In that case I think the Metal package should disappear.

I will try to make the changes today.

@martindevans
Member

Oh sorry, I misunderstood what you meant.

When I was talking about "extreme" I meant we would have:

  • LLamaSharp.Backend.CPU, which depends on:
    • LLamaSharp.Backend.Windows-x64-AVX
    • LLamaSharp.Backend.Windows-x64-AVX2
    • LLamaSharp.Backend.Windows-x64-AVX512
    • LLamaSharp.Backend.Linux-x64-AVX
    • etc...
  • LLamaSharp.Backend.CUDA12, which depends on:
    • LLamaSharp.Backend.Windows-x64-CUDA12-AVX
    • LLamaSharp.Backend.Windows-x64-CUDA12-AVX2
    • LLamaSharp.Backend.Windows-x64-CUDA12-AVX512
    • LLamaSharp.Backend.Linux-x64-CUDA12-AVX
    • etc...

So we would be shipping a lot of packages. Users would have the ability to pick one exactly specific backend if they want (by just installing one single package). But they could also just install the LLamaSharp.Backend.CPU to get the "best" option for them.

I'm not actually sure if I like this option, to be honest; I'm just presenting it as another option to consider!
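For illustration, the meta-package layout described above might look roughly like this in a nuspec (a sketch only: the package ids are the hypothetical ones from the list above, and I haven't validated this against NuGet):

```xml
<!-- Hypothetical LLamaSharp.Backend.CPU.nuspec fragment: a meta-package
     with no files of its own, depending on one package per native binary. -->
<metadata>
  <id>LLamaSharp.Backend.CPU</id>
  <dependencies>
    <dependency id="LLamaSharp.Backend.Windows-x64-AVX" version="$version$" />
    <dependency id="LLamaSharp.Backend.Windows-x64-AVX2" version="$version$" />
    <dependency id="LLamaSharp.Backend.Windows-x64-AVX512" version="$version$" />
    <dependency id="LLamaSharp.Backend.Linux-x64-AVX" version="$version$" />
    <!-- etc. -->
  </dependencies>
</metadata>
```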

Delete Backend.Metal because it is not needed anymore.
Do not include .metal in x86_64 binaries
@SignalRT
Collaborator Author

SignalRT commented Nov 7, 2023

Eliminated Backend.Metal.
Included all the MacOS binaries and the metal file in Backend.CPU.

@AsakusaRinne
Collaborator

> Oh sorry, I misunderstood what you meant. When I was talking about "extreme" I meant we would have: […]

I'd agree with that if the number of packages were not so large. However, as you can see, the combinations could reach 20, which may confuse users who are not familiar with this area. Besides, if other options are added to llama.cpp in the future, we'll have to release new packages, which seems inconvenient. Anyway, thanks a lot for your suggestion.

@SignalRT @martindevans Can the feature detection select the best package now? I think stacking all binaries in one package is the best choice if it can; otherwise we should keep the current distribution. But I'm still open to other voices.

@martindevans
Member

We can put everything in one package except CUDA; we don't currently have a runtime way to detect that. In theory it could be done in the future, and then we could truly have just one package with all backends.
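If we do want runtime CUDA detection later, one common approach is simply probing for the CUDA runtime library. A minimal sketch (illustrative only, not LLamaSharp code; the candidate library names are assumptions about common cudart sonames):

```python
import ctypes

# Illustrative sketch of runtime CUDA detection: try to load a CUDA runtime
# library before choosing a CUDA backend binary. The loader is injectable
# so the logic can be tested without a real CUDA installation.
def cuda_available(candidates=("libcudart.so", "libcudart.so.12", "cudart64_12.dll"),
                   loader=ctypes.CDLL) -> bool:
    for name in candidates:
        try:
            loader(name)  # succeeds only if a CUDA runtime is present
            return True
        except OSError:
            continue
    return False
```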

@AsakusaRinne
Collaborator

@martindevans @SignalRT I think I found a method to detect CUDA and AVX support. I've pushed my changes in #268, please take a look. Since it's already quite late for me, it's only a template for now and I haven't verified that it works. If it does, we'd have one more choice: cherry-pick my commit and support feature detection based on it. The file structure I listed in that PR is only for experiment; feel free to change it.

BTW, whether we detect CUDA and AVX or not, we should consider how to provide a convenient way for users to run LLamaSharp with a self-compiled DLL. I haven't had a good idea yet. Do you have any ideas about it?
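One possible shape for the self-compiled DLL case is an explicit override that is checked before the bundled binary. A minimal sketch, assuming an invented environment variable name (nothing here is existing LLamaSharp behavior):

```python
import os

# Hypothetical sketch of a self-compiled-binary override: check a
# user-provided path first (the env var name is invented for illustration),
# then fall back to the binary shipped by the backend package.
def resolve_native_library(bundled_path: str,
                           env=os.environ,
                           exists=os.path.exists) -> str:
    override = env.get("LLAMA_NATIVE_LIBRARY_PATH")  # invented variable name
    if override and exists(override):
        return override  # the user's self-compiled binary wins
    return bundled_path
```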

@SignalRT
Collaborator Author

SignalRT commented Nov 9, 2023

Reverted #268

@martindevans
Member

What's left to do before we merge this PR?

@SignalRT
Collaborator Author

@martindevans To test the NuGet package. I will do it only if there are no layout changes before the release. If the plan is to release changes to the NuGet packages, it's better to wait and test the final configuration.

@martindevans
Member

It's up to @AsakusaRinne but I think the plan is to do a release as soon as this is merged?

@SignalRT SignalRT marked this pull request as ready for review November 10, 2023 15:28
@AsakusaRinne
Collaborator

> It's up to @AsakusaRinne but I think the plan is to do a release as soon as this is merged?

Yes, I think we won't change the distribution of the NuGet packages and will include #244 in the next release. It seems that we didn't reach an agreement, so I'll open a vote after investigating the possibility of CUDA feature detection. To be honest, I'm also not sure which approach the users would prefer. 😶‍🌫️

@AsakusaRinne
Collaborator

There's a conflict caused by downgrading the semantic-kernel version in the examples in #244. @SignalRT Could you please resolve it and test the NuGet package on MacOS? I'd like to help if there's a problem generating the NuGet package (I don't have a MacOS machine). Thank you for this hard work!

@AsakusaRinne AsakusaRinne mentioned this pull request Nov 11, 2023
@SignalRT
Collaborator Author

@AsakusaRinne I think it is OK. I built the NuGet package and tested the CPU package on ARM. The tests pass.

I changed the prepare_release.sh script to be able to execute it on osx (there are some pending changes). I hope I haven't broken anything in the Linux execution, but the osx bash seems to have some incompatibilities.


@AsakusaRinne AsakusaRinne left a comment


LGTM, it's good work. :) However, we'll delay the next release for a while, for two reasons:

  1. I have to test on my fork to check whether the packages are generated correctly by the workflow.
  2. This PR removed the MacMetal backend, which is a breaking change. Therefore we'll publish a minor release instead, and include feat: cuda feature detection (#275) if possible.

.github/prepare_release.sh (review comment resolved)
@AsakusaRinne
Collaborator

One more question: why is the Mac CI so slow? 🤣 It won't block the merging of this PR though.

@martindevans
Member

martindevans commented Nov 11, 2023

It's been waiting for an entire hour on this step!?

Waiting for a runner to pick up this job...

@AsakusaRinne
Collaborator

> It's been waiting for an entire hour on this step!?

Not sure if it's because of the limited resources of the GitHub CI cluster. But anyway, one hour of waiting is too long. Are Mac runners very rare? 😹

@SignalRT
Collaborator Author

@AsakusaRinne ...Occam's razor... It was a mismatch in renaming macOS to osx.

@AsakusaRinne AsakusaRinne merged commit ed479d1 into SciSharp:master Nov 12, 2023
5 checks passed
@SignalRT SignalRT deleted the RuntimeDetection branch November 13, 2023 19:32