-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ai-proxy): enable compatibility with LLM SDKs, model selection by request-parameters #12807
Conversation
a0234ae
to
8227a29
Compare
5d1b3e6
to
f2bef33
Compare
699b3ff
to
4b89b4e
Compare
This one is ready to go. |
4b89b4e
to
78d5b46
Compare
84378da
to
01b693b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've one remaining gripe about a kong.log.warn()
, but we're just about good to go.
Nice work!
4ed111d
to
3c28675
Compare
@flrgh Fixed it, oops |
3c28675
to
3461c29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of style comments and questions, overall looks good.
return false, "cannot use own model for this instance" | ||
end | ||
|
||
-- noop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can this check be completely be removed in the cohere driver, but the anthropic driver still has a check (albeit only checking for matching names?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
request_table.truncate = request_table.truncate or "END" | ||
request_table.return_likelihoods = request_table.return_likelihoods or "NONE" | ||
request_table.p = request_table.top_p or model.options.top_p | ||
request_table.k = request_table.top_k or model.options.top_k |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it OK to have request_table.p
be false
if top_p
is neither set in the request nor in the options? In the other drivers, there code to merge the request table with the model options, but it is done in a per-option fashion. It might be better to have a generalized merging function that works for all drivers, in the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- and move the LAST message (from 'user') into "message" string | ||
if #request_table.messages > 1 then | ||
local chat_history = table_new(#request_table.messages - 1, 0) | ||
for i, v in ipairs(request_table.messages) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A numeric for loop from 1
to #request_table.messages - 1
would be better than ipairs
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
messages.parameters.max_new_tokens = request_table.max_tokens or (model.options and model.options.max_tokens) | ||
messages.parameters.top_p = request_table.top_p or (model.options and model.options.top_p) | ||
messages.parameters.top_k = request_table.top_k or (model.options and model.options.top_k) | ||
messages.parameters.temperature = request_table.temperature or (model.options and model.options.temperature) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems being yet another pattern of merging request_table
and model.options
, a unification across drivers would make sense.
end | ||
|
||
return request | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we have the merge function that all drivers should be using.
) | ||
parsed_url = socket_url.parse(url) | ||
end | ||
|
||
if string.sub(parsed_url.path, 1, 1) ~= "/" then | ||
parsed_url.path = "/" .. parsed_url.path | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A common function to prepend a slash to relative paths should be introduced and used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR was re-done, due to messy rebases from other features.
See: https://github.com/Kong/kong/pull/12903/files#r1575189707
kong/llm/drivers/anthropic.lua
Outdated
if body and body.model then | ||
return nil, "cannot use own model for this instance" | ||
if body and body.model and (body.model ~= conf.model.name) then | ||
return nil, "requested model does not match the configured plugin model" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it sufficient to check the name of the model for equivalence?
72e8164
to
8bd5fe3
Compare
This is going to be superseded by a different PR, closing. |
Summary
Our users have noted that they wanted to have the option to not use the plugin configured model "tuning" options, but rather use something like e.g. the OpenAI SDK, and send the options in per-request.
This function enables this.
Checklist
changelog/unreleased/kong
orskip-changelog
label added on PR if changelog is unnecessary. README.mdIssue reference
KAG-4127
https://konghq.atlassian.net/browse/KAG-4124