Pull ollama model while creating new instances #113
Conversation
```rust
let res = runtime.block_on(async { instance.pull_model().await });

match res {
    Err(e) => error!("ERROR: {:?}", e.to_string()),
    _ => info!("Model pulled successfully!"),
};
```
How much overhead would calling pull_model on every chat completion request add? Would it be possible to only try pulling the model if a chat completion request first fails because the model has not been pulled? That might even be something better configured server side instead of in the extension 🤔
The pull_model call will only account for a single request, but yes, this can introduce overhead.
We can go with the flow you suggested: first call the model, then pull it if that call fails.

> That might even be something better configured server side instead of in the extension

In my opinion, it's better if this stays within the extension, just for ease of use.
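A minimal sketch of that lazy-pull flow. All names here (`Instance`, `chat_completion`, `pull_model`, the "model not found" error text) are hypothetical stand-ins for the extension's real types and errors, not the actual API:

```rust
// Hypothetical mock of an Ollama instance; the real type would make
// HTTP calls to the Ollama server instead of tracking a bool.
#[derive(Default)]
struct Instance {
    pulled: bool,
}

impl Instance {
    fn chat_completion(&self, prompt: &str) -> Result<String, String> {
        if self.pulled {
            Ok(format!("response to: {prompt}"))
        } else {
            Err("model not found".to_string())
        }
    }

    fn pull_model(&mut self) -> Result<(), String> {
        self.pulled = true;
        Ok(())
    }
}

/// Try the completion first; pull the model and retry only when the
/// failure indicates a missing model. The common path (model already
/// pulled) pays no pull overhead.
fn complete_with_lazy_pull(instance: &mut Instance, prompt: &str) -> Result<String, String> {
    match instance.chat_completion(prompt) {
        Ok(resp) => Ok(resp),
        Err(e) if e.contains("model not found") => {
            instance.pull_model()?;
            instance.chat_completion(prompt)
        }
        Err(e) => Err(e),
    }
}

fn main() {
    let mut instance = Instance::default();
    let resp = complete_with_lazy_pull(&mut instance, "hello").unwrap();
    println!("{resp}");
}
```

The retry-on-specific-error shape keeps the pull off the hot path while still making first use of a model work without manual setup.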
I agree the convenience is nice...
The server in ./vector-serve always pulls a model if it does not already exist (unless that is disabled via an env var). So if the model does not exist, it is typically downloaded during the vectorize.table call, when that function calls the server to get the embedding model's dimension. Calling the vector-serve endpoint for model info ends up triggering the download. Maybe there is something similar we can do here, where the call to get the model info for Ollama models is what triggers the pull?
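A sketch of that alternative, where fetching model info is what triggers the pull. `OllamaInstance`, `model_info`, and `pull_model` are hypothetical names for illustration, not the extension's actual API:

```rust
// Hypothetical mock: the real pull_model would hit the Ollama pull
// endpoint; here we just record the model as available.
#[derive(Default)]
struct OllamaInstance {
    available_models: Vec<String>,
}

impl OllamaInstance {
    fn pull_model(&mut self, model: &str) -> Result<(), String> {
        self.available_models.push(model.to_string());
        Ok(())
    }

    /// Fetching model info pulls the model as a side effect when it is
    /// missing, mirroring how ./vector-serve downloads a model during
    /// the model-info call. Callers never pull explicitly.
    fn model_info(&mut self, model: &str) -> Result<String, String> {
        if !self.available_models.iter().any(|m| m == model) {
            self.pull_model(model)?;
        }
        Ok(format!("metadata for {model}"))
    }
}

fn main() {
    let mut inst = OllamaInstance::default();
    // The first info call triggers the pull; later calls find the
    // model already available and skip it.
    let info = inst.model_info("some-model").unwrap();
    println!("{info}");
}
```

This keeps the pull tied to the one place that already has to contact the server about the model, so no per-completion check is needed.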
Okay, I'll try to implement something similar. We already have the function; I just need to find the right place to call it.
@destrex271 - are you ok if we close this and re-open it later if it is still something we want to implement?
Oops! Forgot to close it earlier!
Closing #113 also