
LLamaEmbedder 2.0 #902

Merged: 2 commits into SciSharp:master on Aug 31, 2024

Conversation

martindevans
Member

Totally rewrote the LLamaEmbedder, based on https://github.com/ggerganov/llama.cpp/tree/master/examples/embedding. The new embedder properly handles pooling, returning either one embedding for the whole sequence or one per token (a minimal mean-pooling sketch follows the notes below). This rewrite does not support batching; it still processes just one string at a time.

  • Added `Encode` methods to `LLamaContext`
  • Moved some native methods from `NativeApi` to `SafeLLamaContextHandle` and wrapped them properly
  • Added a `HasDecoder` property to `SafeLlamaModelHandle`. The underlying function doesn't exist in the current version of llama.cpp and will need to be hooked up in the next binary update
  • Added some normalization methods as extensions on span/array (a sketch of the idea follows this list). This required adding a dependency on `System.Numerics.Tensors`
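
As a rough illustration of the normalization extensions, here is a minimal sketch of in-place Euclidean (L2) normalization built on the `System.Numerics.Tensors` package. The class and method names are illustrative, not necessarily the ones added by this PR.

```csharp
using System;
using System.Numerics.Tensors;

// Hypothetical sketch: in-place Euclidean (L2) normalization as a Span extension,
// using the System.Numerics.Tensors package this PR takes a dependency on.
public static class SpanNormalizationExtensions
{
    public static void EuclideanNormalize(this Span<float> vector)
    {
        // Euclidean norm: sqrt of the sum of squares.
        var norm = TensorPrimitives.Norm(vector);

        // Divide every element by the norm (skip an all-zero vector to avoid NaNs).
        if (norm > 0)
            TensorPrimitives.Divide(vector, norm, vector);
    }
}
```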

DRAFT until the `HasDecoder` function exists, i.e. after the next binary update.
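
For context on what "one embedding for the whole sequence" means, here is a standalone sketch of mean pooling, i.e. averaging per-token embeddings into a single vector. In the embedder itself the pooling is performed by llama.cpp according to the context's pooling type; this example only illustrates the computation and is not code from this PR.

```csharp
using System;
using System.Collections.Generic;

public static class PoolingSketch
{
    // Collapse per-token embeddings into one sequence embedding by averaging each dimension.
    public static float[] MeanPool(IReadOnlyList<float[]> tokenEmbeddings)
    {
        if (tokenEmbeddings.Count == 0)
            throw new ArgumentException("Need at least one token embedding", nameof(tokenEmbeddings));

        var dim = tokenEmbeddings[0].Length;
        var pooled = new float[dim];

        // Sum each dimension across all tokens...
        foreach (var token in tokenEmbeddings)
            for (var i = 0; i < dim; i++)
                pooled[i] += token[i];

        // ...then divide by the token count to get the mean.
        for (var i = 0; i < dim; i++)
            pooled[i] /= tokenEmbeddings.Count;

        return pooled;
    }
}
```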

@martindevans
Member Author

Rebased onto #905

@martindevans marked this pull request as ready for review on August 26, 2024 at 14:00
Commit message (duplicating the description above, plus one addition):

 - Using `llama_set_embeddings` to toggle embedding mode on, so it no longer needs to be specified in the params (a sketch of the underlying native call follows)
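
llama.cpp exposes embedding mode as a runtime toggle via `llama_set_embeddings(struct llama_context *, bool)`. Below is a minimal P/Invoke sketch of that native call; the library name and wrapper class are assumptions for illustration, since in this PR the call is wrapped on `SafeLLamaContextHandle`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical standalone binding for llama.cpp's llama_set_embeddings.
// The native library name ("llama") and this class are illustrative assumptions;
// the PR itself wraps the call on SafeLLamaContextHandle.
public static class EmbeddingToggleNative
{
    // C signature: void llama_set_embeddings(struct llama_context * ctx, bool embeddings);
    [DllImport("llama", CallingConvention = CallingConvention.Cdecl)]
    public static extern void llama_set_embeddings(IntPtr ctx, [MarshalAs(UnmanagedType.I1)] bool embeddings);
}
```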
@martindevans merged commit 6025979 into SciSharp:master on Aug 31, 2024
6 checks passed
@martindevans deleted the llama_embedder_2 branch on August 31, 2024 at 19:04