
Return TokenData in Inference calls #199

Closed

Conversation

saddam213 (Collaborator) commented Oct 19, 2023

This PR is NOT intended to be merged, but to make the idea I had easier to explain.

Adding an overload to ILLamaExecutor that returns the full TokenData in the Infer call instead of a string:

IAsyncEnumerable<TokenData> InferTokensAsync(string text, IInferenceParams? inferenceParams = null, CancellationToken cancellationToken = default)

public record TokenData(int Id)
{
    public float Logit { get; set; }
    public float Probability { get; set; }
    public string Content { get; set; }
}
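
For illustration, consuming the proposed overload might look something like this (the executor variable, prompt, and inferenceParams are hypothetical placeholders, not part of this PR):

// Hypothetical consumption of the proposed overload.
await foreach (TokenData token in executor.InferTokensAsync(prompt, inferenceParams))
{
    Console.Write(token.Content);

    // Per-token metadata that the current string-based Infer call discards:
    Console.WriteLine($"  (id={token.Id}, logit={token.Logit}, p={token.Probability})");
}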

We could also possibly include the n tokens that had the next highest probability:

public record TokenData(int Id)
{
    public float Logit { get; set; }
    public float Probability { get; set; }
    public string Content { get; set; }
    public TokenData[] LastN { get; set; }
}
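
That would let a caller surface the runner-up candidates at each step, for example (hypothetical usage; assumes System.Linq is in scope and that LastN holds the next-most-probable tokens):

// Hypothetical: print each chosen token alongside its nearest alternatives.
await foreach (var token in executor.InferTokensAsync(prompt))
{
    var alternatives = string.Join(", ",
        token.LastN.Select(t => $"{t.Content} ({t.Probability:P1})"));
    Console.WriteLine($"{token.Content}  [also likely: {alternatives}]");
}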

This extra data would be super helpful to a lot of end users.

I have a working example similar to this, and the code in this PR was shoehorned out of it, but I am not sure of the best way to implement this properly, so I thought I would open this draft :)

martindevans (Member) commented:

That's an interesting idea.

Are you thinking of something that allows the user to modify the tokens before they're selected (to influence token selection) or just returning the data?

I've been playing with ideas to split executors into "stages" which get chained together into an inference pipeline (e.g. Infer -> LogitBias -> Temperature -> TopK), but I haven't come up with anything good yet. That may be similar to what you're looking for?
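
A minimal sketch of what one of those stages could look like (every name here is hypothetical, not part of LLamaSharp's API):

// Hypothetical pipeline stage: mutates raw logits in place before sampling.
public interface ILogitStage
{
    void Process(Span<float> logits);
}

public sealed class TemperatureStage : ILogitStage
{
    private readonly float _temperature;

    public TemperatureStage(float temperature) => _temperature = temperature;

    // Dividing logits by the temperature sharpens (<1) or flattens (>1)
    // the distribution the sampler draws from.
    public void Process(Span<float> logits)
    {
        for (var i = 0; i < logits.Length; i++)
            logits[i] /= _temperature;
    }
}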

saddam213 (Collaborator, Author) commented:

Just returning the data at this point. Having the logits will make using LogitBias a lot easier, and token IDs would be far better for use in output filters than strings.

But you are right, this could also be used for many other things in the pipeline.

There is currently no nice way for third-party libraries to get this data without jumping through extra hoops.
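
As a hypothetical illustration of the filtering point, an output filter keyed on token IDs avoids string comparisons entirely (the set contents below are placeholders):

// Hypothetical output filter keyed on token IDs instead of strings.
var banned = new HashSet<int> { /* token ids to suppress */ };
await foreach (var token in executor.InferTokensAsync(prompt))
{
    if (banned.Contains(token.Id))
        continue; // drop the token without any string matching

    Console.Write(token.Content);
}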

saddam213 (Collaborator, Author) commented Oct 29, 2023

Found a workaround that doesn't require maintaining my custom executors:

Run inference, then re-run to get the embeddings, then use Tokenize to match the token data with the token strings, and zip it all together.

It is a lot slower, but there is no need to change LLamaSharp for the data I need :)
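
A rough sketch of that workaround, with executor/context method names that are assumptions for illustration rather than exact LLamaSharp signatures:

// Rough sketch of the workaround; method names are assumed, not exact.
var builder = new StringBuilder();
await foreach (var piece in executor.InferAsync(prompt, inferenceParams))
    builder.Append(piece);                   // 1. run inference normally

// (the second pass for embeddings mentioned above is elided here)

var tokenIds = context.Tokenize(builder.ToString()); // 2. re-tokenize the output

// 3. zip ids and strings back together into TokenData records
var tokens = tokenIds
    .Select(id => new TokenData(id) { Content = context.TokenToString(id) })
    .ToList();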

saddam213 closed this Oct 29, 2023