Correctly format followup messages in turn-based (chat) inference #299

Closed · wants to merge 2 commits

Conversation

philippjbauer (Contributor)

While working with a different model than in my previous tests, I noticed that the model always started its responses to the second, third, etc. question I asked with the token that indicates the start of a turn (e.g. <|assistant|>).

I investigated and found that it is not enough to simply send in the raw text for this model, as it happened to be for the model I initially tested with.

I changed the code so that the correct turn template, as defined by the HistoryTransform class, is applied before the prompt is sent into the InternalChatAsync method.

This resolved the issue, and turn-based inference started working as expected.
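
For context, here is a minimal sketch of the idea behind the fix, written against LLamaSharp-style names (ChatHistory, AuthorRole, IHistoryTransform.HistoryToText); the exact namespaces and signatures are assumptions and may differ from the version this PR targets:

```csharp
using LLama.Common;        // ChatHistory, AuthorRole (assumed namespace)
using LLama.Abstractions;  // IHistoryTransform (assumed namespace)

// Instead of passing the raw followup text into InternalChatAsync, wrap it
// in a one-message ChatHistory and let the session's IHistoryTransform
// render the turn template (role tokens such as <|user|> / <|assistant|>)
// around it, matching what the model saw during the first turn.
static string FormatFollowup(IHistoryTransform transform, string userMessage)
{
    var turn = new ChatHistory();
    turn.AddMessage(AuthorRole.User, userMessage);
    return transform.HistoryToText(turn);
}
```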

AsakusaRinne (Collaborator) left a comment:

Overall, this looks good. Thank you for the contribution!

@@ -159,15 +159,15 @@ public async IAsyncEnumerable<string> ChatAsync(string prompt, IInferenceParams?
InteractiveExecutorState state = (InteractiveExecutorState)executor.GetStateData();
prompt = state.IsPromptRun
? HistoryTransform.HistoryToText(History)
AsakusaRinne (Collaborator):

There's an unexpected behaviour here. When the user inputs the prompt for the first run, they want it treated as raw text instead of transformed text. You could run the ChatSessionWithRoleName example and set a breakpoint to check this.

It's not your fault; the abstraction here makes the boundary vague. Some logic in the executors and the chat session needs to be refactored to fix this completely. However, a quick fix for now would be better. :)

philippjbauer (Contributor, Author):

Yeah, I see what you're saying. Do you think the change to the overload that uses the ChatHistory is OK to go in by itself? The followup messages a user sends need the template from the IHistoryTransform implementation applied so that the model has the appropriate tokens to continue the conversation.
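
For reference, a hedged sketch of how the ChatHistory-based overload mentioned above might be used; the ChatAsync(ChatHistory, IInferenceParams?) signature is assumed from context, and session and inferenceParams are hypothetical locals:

```csharp
// Build the history explicitly; the session applies its IHistoryTransform,
// so the model receives correctly templated turns.
var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a helpful assistant.");
history.AddMessage(AuthorRole.User, "What is the capital of France?");

await foreach (var token in session.ChatAsync(history, inferenceParams))
{
    Console.Write(token);
}
```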

AsakusaRinne (Collaborator):

I think it's better to check whether the history is empty before processing the prompt here. If the history is empty, the current prompt is the first input, so we don't modify it. Otherwise, we can resume from the history.
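
A minimal sketch of that quick fix inside ChatAsync, assuming the surrounding code from the diff above and a History.Messages collection on the session; treat the exact member names as assumptions:

```csharp
// First input: the history is still empty, so keep the prompt as raw text
// (callers such as ChatSessionWithRoleName pre-format their opening prompt).
// Followup inputs: wrap the prompt in a one-message history and apply the
// turn template so the model sees the expected role tokens.
if (History.Messages.Count > 0)
{
    var turn = new ChatHistory();
    turn.AddMessage(AuthorRole.User, prompt);
    prompt = HistoryTransform.HistoryToText(turn);
}
```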

philippjbauer (Contributor, Author) · Nov 24, 2023:

If a session was loaded, wouldn't the history be duplicated in the state?

I think this problem arises because the actual text is not saved when a session is saved.

I'm working on a proposal for the ChatSession class that communicates the behavior of the class better through its interface and puts more guardrails in place so it can't be used in unexpected ways.

I'll send that in as a separate PR and close this one. I don't think we'll get this right without breaking changes.

AsakusaRinne (Collaborator):

Actually, we may refactor some parts soon as well. If you'd like to make some aggressive breaking changes, you could also open a PR against the preview branch, so you won't feel limited. :)
