
Multi Context #90

Merged: 13 commits into SciSharp:master from proposal_multi_context, Aug 17, 2023
Conversation

@martindevans (Member) commented Aug 8, 2023

Added higher-level multi-context support: multiple contexts sharing the same model weights. Added a new demo, "TalkToYourself", where you can see this in use.

The biggest single change is renaming LLamaModel to LLamaContext, to more closely follow the naming of the original project.
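For illustration, a minimal sketch of the new flow, using the same @params placeholder and calls that appear in the memory-management examples later in this thread:

// Load the model weights once
using var weights = LLamaWeights.LoadFromFile(@params);

// Create several independent contexts, all sharing the same weights
using var ctx1 = weights.CreateContext(@params, Encoding.UTF8);
using var ctx2 = weights.CreateContext(@params, Encoding.UTF8);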

@saddam213 (Collaborator)

This approach works really well :)

As your code comments suggest, we will need a wrapper class for the ModelHandle; I personally think LLamaWeight is a good name.

We could even wrap LLamaWeight and a collection of LLamaContext at a slightly higher level to ensure all contexts are closed and disposed when a LLamaWeight is unloaded.

This would make more sense in another PR, as an appropriate name for that class would probably be LLamaModel, and doing it separately avoids nightmare merging further down the track. I think it would be nice if we managed it a bit for end users: most will be building chat-like apps, and tracking contexts will be a common feature.

Amazing work man!!
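A minimal sketch of the wrapper idea described above (all names hypothetical; neither this class nor the ModelParams type used here is part of this PR):

using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical wrapper: owns the weights plus every context created
// from them, so disposing the wrapper tears everything down together.
public sealed class LLamaModel : IDisposable
{
    private readonly LLamaWeights _weights;
    private readonly List<LLamaContext> _contexts = new();

    public LLamaModel(LLamaWeights weights) => _weights = weights;

    public LLamaContext CreateContext(ModelParams @params, Encoding encoding)
    {
        var ctx = _weights.CreateContext(@params, encoding);
        _contexts.Add(ctx);
        return ctx;
    }

    public void Dispose()
    {
        // Dispose all contexts before releasing the weights
        foreach (var ctx in _contexts)
            ctx.Dispose();
        _contexts.Clear();
        _weights.Dispose();
    }
}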

@martindevans (Member, Author)

Thanks for the feedback! I'll continue developing this a bit more and get rid of some of those TODO comments.

@martindevans (Member, Author)

I think this is ready for review now

@martindevans (Member, Author)

Memory Management Thoughts

At the moment the SafeLlamaModelHandle is reference counted: the LLamaWeights object holds one reference, and each LLamaContext holds another. This means the weights will only be unloaded from memory once the LLamaWeights and all of the LLamaContext objects have been disposed.

// Load weights
var weights = LLamaWeights.LoadFromFile(@params);

// Unload weights
weights.Dispose();

// This will fail, because the weights are unloaded
var ctx = weights.CreateContext(@params, Encoding.UTF8);

However, this is a bit surprising:

// Load weights
var weights = LLamaWeights.LoadFromFile(@params);

// Create a context, increasing the reference count by one
var ctx = weights.CreateContext(@params, Encoding.UTF8);

// Unload weights
weights.Dispose();

// You can use ctx here!

// Now the weights will be unloaded
ctx.Dispose();

In some ways this is actually quite convenient: contexts automatically keep the weights loaded. However, in other ways it means that calling Dispose() on the weights does not free up all the memory, which is odd!

Should this be changed? i.e. should weights.Dispose() immediately unload them from memory and invalidate all the contexts?

@martindevans (Member, Author)

I added a couple of extra tests. Our test coverage is still terrible, but it's a start!

@SignalRT (Collaborator) commented Aug 9, 2023

I tested it on macOS running on the GPU (Metal) and this test fails:

[Fact]
public void EmbedCompare()
{
    var cat = _embedder.GetEmbeddings("cat");
    var kitten = _embedder.GetEmbeddings("kitten");
    var spoon = _embedder.GetEmbeddings("spoon");

    var close = Dot(cat, kitten);
    var far = Dot(cat, spoon);

    Assert.True(close < far);
}
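The Dot helper isn't shown in the snippet; a plain dot-product version (assuming GetEmbeddings returns float[]) might look like the sketch below. Note that a raw dot product is a similarity measure, where higher means closer, so the direction of the assertion depends on how Dot is actually defined.

// Sketch of a plain dot product over two embedding vectors
private static float Dot(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length");

    var sum = 0f;
    for (var i = 0; i < a.Length; i++)
        sum += a[i] * b[i];
    return sum;
}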

I will check whether it works on CPU and where the problem is.

@martindevans (Member, Author)

Thanks for testing; I have no way to test these things on macOS.

If you're debugging it, try printing out the generated vectors; my best guess is that it's returning all zeros for some reason.
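For example, a quick (hypothetical) check for the all-zeros case, reusing the _embedder from the test above:

// Print the first few values and check for an all-zero embedding
// (requires using System; using System.Linq;)
var cat = _embedder.GetEmbeddings("cat");
Console.WriteLine(string.Join(",", cat.Take(8)));
Console.WriteLine(cat.All(v => v == 0) ? "all zeros!" : "non-zero values present");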

@SignalRT (Collaborator) commented Aug 9, 2023

I debugged this, and close (13.xxx) is not less than far (8.yyy). I will try to debug this properly in the next few days.

@martindevans (Member, Author)

Huh, in that case it must just be returning the wrong vectors when the GPU is in use, which is very concerning!

In fact it might be good to add a test for that: generate a few "known good" vectors and then check that those exact vectors are produced in a unit test.
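A sketch of such a regression test (the expected values below are placeholders, to be filled in from a known-good run):

[Fact]
public void EmbeddingMatchesKnownGoodVector()
{
    // Placeholder: paste the first few values from a known-good run here
    var expected = new float[] { /* ... */ };

    var actual = _embedder.GetEmbeddings("cat");

    for (var i = 0; i < expected.Length; i++)
        Assert.Equal(expected[i], actual[i], 3); // compare to 3 decimal places
}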

@SignalRT (Collaborator) commented Aug 9, 2023

Same results with the CPU, so it's not a GPU/CPU problem.

I will review the values returned by the _embedder.GetEmbeddings invocation to compare the differences between platforms.

@martindevans (Member, Author)

@SignalRT do you see the same problems with the code in master, by the way? If so, then at least the macOS test problem isn't an issue with this PR.

@SignalRT (Collaborator)

@martindevans I don't know if this is a macOS-specific issue. This is the output of the first values of "cat" on the three OSes:

[screenshot: first embedding values of "cat" on Windows, Linux and macOS]

On each OS, Debug and Release executions of the test (i.e. separate runs) produce the same values.

@martindevans (Member, Author)

Well, that's clearly broken! I'm fairly confident it's not a problem with this PR at least - I've barely touched the Embedder; it's even using the old model loading method!

@SignalRT (Collaborator)

I will try to look into this issue.

@martindevans (Member, Author) commented Aug 10, 2023

I just added an output to the CI tests to grab the values for the "cat" embedding. They're the same as you reported.

CI Output:

  • Windows: -0.12730388,-0.67805725,-0.08524404,-0.9569152,-0.6386326...
  • Linux: -0.09917596,-0.71790683,-0.008531962,-0.9898389,-0.66339684...

@martindevans (Member, Author)

I've added this test into another PR on top of master (#97), so we can see whether this issue exists independently of this PR. If the issue reproduces in the other PR, I'll remove the test from this one.

@martindevans (Member, Author)

OK, I got exactly the same results in the new PR, so I'm going to remove the test from this PR to unblock it.

@SignalRT (Collaborator)

@martindevans It's happening in master. I used the same approach in my fork.

Commit notes from this PR:

…in use in `TalkToYourself`, along with notes on what still needs improving. The biggest single change is renaming `LLamaModel` to `LLamaContext`.
- Sanity checking that weights are not disposed when creating a context from them
- Further simplified `Utils.InitLLamaContextFromModelParams`
- Sealed some classes not intended to be extended
@martindevans (Member, Author)

@AsakusaRinne this is ready for your review. It's a pretty big change, so I've held off merging it myself.

@AsakusaRinne (Collaborator)

Thank you for all your contributions! Does the difference in embeddings between Windows, macOS and Linux matter? I'm not sure whether it affects the performance of the model.

[Review comment on LLama/LLamaContext.cs, resolved]
@AsakusaRinne (Collaborator)

LGTM, it's really a good job! @Oceania2018 In this PR LLamaModel is renamed to LLamaContext, along with some API changes. Will it have much impact on BotSharp?

@martindevans (Member, Author)

> Thank you for all your contributions! Does the difference in embeddings between Windows, macOS and Linux matter? I'm not sure whether it affects the performance of the model.

We've tracked that down to an issue with llama.cpp itself, so it's not a problem with this PR. PR #97 has a test which fails due to that bug, and saddam213 reported it upstream in ggerganov/llama.cpp#2582.

@AsakusaRinne merged commit 6233185 into SciSharp:master on Aug 17, 2023 (4 checks passed).

@martindevans deleted the proposal_multi_context branch on August 17, 2023.