
Separate options for amount of lines 'before' and 'after' the current line in FIM prompts #298

Open
AndrewRocky opened this issue Aug 26, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

AndrewRocky (Contributor) commented Aug 26, 2024

Is your feature request related to a problem? Please describe.
When editing the beginning of a long file, prompt evaluation takes a lot of time. The reason for that is explained in the Additional context section below.

Currently we send a similar number of lines from the top and from the bottom. I believe there are good reasons to make the bottom part smaller:

  1. It takes a long time to re-evaluate the bottom lines.
  2. The bottom lines often aren't as important (IMO), so shrinking them leaves more of the context window for the top lines.

Describe the solution you'd like
I'd like separate Context Length options for the lines 'before' and 'after' the cursor.

Describe alternatives you've considered
Alternatively, leave the current Twinny: Context Length setting as is, but add an optional override for the number of bottom lines.
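
For illustration, the settings could look something like this (a rough sketch; the key names below are made up, not Twinny's actual configuration schema):

```ts
// Rough sketch of what separate settings could look like.
// These names are invented for illustration, not Twinny's real config keys.
interface FimContextOptions {
  // Number of lines taken from above the cursor (the FIM prefix).
  contextLinesBefore: number
  // Number of lines taken from below the cursor (the FIM suffix).
  // Keeping this small means edits invalidate less of llama.cpp's prompt cache.
  contextLinesAfter: number
}

const defaults: FimContextOptions = {
  contextLinesBefore: 85,
  contextLinesAfter: 15,
}
```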

Additional context
AFAIK (this is mostly based on my assumptions), llama.cpp doesn't have to re-evaluate the part of the prompt prefix that hasn't changed since the last generation, but the moment it encounters a change, it re-evaluates everything after that point.
So when we have two requests in a row with these prompts:

```
<|fim▁begin|>
import numpy
<|fim▁hole|>
print('Hello World!')<|fim▁end|>
```

and

```
<|fim▁begin|>
import numpy
import<|fim▁hole|>
print('Hello World!')<|fim▁end|>
```

It won't have to spend time evaluating `import numpy`. However, it will still have to re-evaluate everything after `<|fim▁hole|>`, because llama.cpp only checks for a matching prefix of the prompt.
(Example of llama.cpp output, not for this exact case: `Llama.generate: 2978 prefix-match hit, remaining 8 prompt tokens to eval`)
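
To make the idea concrete, here is a minimal sketch of how a prompt builder could apply an asymmetric line budget (the function and its parameters are hypothetical, not Twinny's actual code; the tokens follow the DeepSeek-style FIM format from the example above):

```ts
// Hypothetical prompt builder with an asymmetric line budget.
// Token strings follow the DeepSeek-style FIM format from the example above.
function buildFimPrompt(
  lines: string[],     // the document split into lines
  cursorLine: number,  // 0-based index of the line being edited
  linesBefore: number, // generous budget above the cursor
  linesAfter: number   // small budget below the cursor, so edits
                       // invalidate less of llama.cpp's prefix cache
): string {
  const prefix = lines
    .slice(Math.max(0, cursorLine - linesBefore), cursorLine)
    .join("\n")
  const suffix = lines
    .slice(cursorLine + 1, cursorLine + 1 + linesAfter)
    .join("\n")
  return `<|fim▁begin|>${prefix}\n<|fim▁hole|>\n${suffix}<|fim▁end|>`
}
```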

@AndrewRocky AndrewRocky changed the title Add option to have a different number of lines before and after the current line in FIM prompt Separate options for amount of lines 'before' and 'after' the current line in FIM prompts Aug 26, 2024
@rjmacarthy rjmacarthy added the enhancement New feature or request label Aug 27, 2024
rjmacarthy (Collaborator) commented
Hey, currently we use 0.85 for the prefix and 0.15 for the context; I guess we could make it configurable though.
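
If it were configurable, the split could be derived from a single ratio setting, e.g. (a rough sketch; `fimPrefixRatio` is a made-up setting name, with the default matching today's 0.85 split):

```ts
// Sketch: derive the before/after budgets from one configurable ratio.
// `fimPrefixRatio` is a made-up setting name; 0.85 matches the current split.
const fimPrefixRatio = 0.85
const totalContextLines = 200 // illustrative total line budget

const linesBefore = Math.floor(totalContextLines * fimPrefixRatio) // 170 here
const linesAfter = totalContextLines - linesBefore                 // 30 here
```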
