feat(ai-proxy): enable compatibility with LLM SDKs, model selection by request-parameters #12807

tysoekong · 2024-03-28T23:38:49Z

Summary

Users have asked for the option to bypass the model "tuning" options configured in the plugin and instead send those options per request, for example through the OpenAI SDK.

This change enables that.

Checklist

The Pull Request has tests
A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

KAG-4127
https://konghq.atlassian.net/browse/KAG-4124

ttyS0e · 2024-04-01T03:02:39Z

This one is ready to go.

kong/llm/drivers/azure.lua

kong/llm/drivers/cohere.lua

kong/llm/drivers/anthropic.lua

kong/llm/drivers/openai.lua

kong/llm/drivers/shared.lua

flrgh

I've one remaining gripe about a kong.log.warn(), but we're just about good to go.

Nice work!

tysoekong · 2024-04-11T11:26:51Z

@flrgh Fixed it, oops

hanshuebner

A couple of style comments and questions, overall looks good.

hanshuebner · 2024-04-15T12:11:00Z

kong/llm/drivers/cohere.lua

-    return false, "cannot use own model for this instance"
-  end
-
+  -- noop


Why can this check be removed completely in the cohere driver, while the anthropic driver still has a check (albeit one that only checks for matching names)?

https://github.com/Kong/kong/pull/12903/files#r1575194343

hanshuebner · 2024-04-15T12:13:09Z

kong/llm/drivers/cohere.lua

+  request_table.truncate = request_table.truncate or "END"
+  request_table.return_likelihoods = request_table.return_likelihoods or "NONE"
+  request_table.p = request_table.top_p or model.options.top_p
+  request_table.k = request_table.top_k or model.options.top_k


Is it OK to have request_table.p be false if top_p is set neither in the request nor in the options? In the other drivers, there is code to merge the request table with the model options, but it is done per option. It might be better to have a generalized merging function that works the same way for all drivers.
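A generalized merge along the lines suggested here could look like the sketch below. The helper name `merge_model_options` and its `keys` argument are hypothetical and do not come from the PR. Note that in plain Lua, `a or b` evaluates to `nil` (not `false`) when both operands are `nil`, so an unset option simply stays absent from the table. Provider-specific renames such as Cohere's `top_p` to `p` are not handled and would need an extra mapping step.

```lua
-- Hypothetical helper, not part of the PR: fill in request parameters
-- that the caller did not send from the plugin's configured model options.
local function merge_model_options(request_table, options, keys)
  options = options or {}
  for _, key in ipairs(keys) do
    -- Compare against nil explicitly so that a caller-supplied `false`
    -- is preserved (the `or` idiom would overwrite it).
    if request_table[key] == nil then
      request_table[key] = options[key]
    end
  end
  return request_table
end

-- Usage: the request value wins, configured options fill the gaps.
local request = merge_model_options(
  { temperature = 0.2 },
  { temperature = 1.0, top_p = 0.9 },
  { "temperature", "top_p", "top_k" }
)
-- request.temperature stays 0.2, request.top_p is taken from the options,
-- and request.top_k stays nil because neither side sets it.
```

Passing the list of keys explicitly keeps each driver in control of which options it forwards, while the precedence rule lives in one place.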

https://github.com/Kong/kong/pull/12903/files#r1575191319

hanshuebner · 2024-04-15T12:15:45Z

kong/llm/drivers/cohere.lua

+    -- and move the LAST message (from 'user') into "message" string
+    if #request_table.messages > 1 then
+      local chat_history = table_new(#request_table.messages - 1, 0)
+      for i, v in ipairs(request_table.messages) do


A numeric for loop from 1 to #request_table.messages - 1 would be better than ipairs here.
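As a rough illustration of this suggestion, a numeric loop can stop before the last element without an index check inside the body. The `messages` data below is made up, and a plain `{}` is used instead of `table_new` so the sketch runs on stock Lua. The real driver may also transform each entry rather than copy it as-is.

```lua
-- Illustrative data; in the driver this would be request_table.messages.
local messages = {
  { role = "system", content = "You are helpful." },
  { role = "user", content = "Hi" },
  { role = "assistant", content = "Hello!" },
  { role = "user", content = "What is Kong?" },
}

local chat_history = {}

-- Copy every message except the last one into the history.
for i = 1, #messages - 1 do
  chat_history[i] = messages[i]
end

-- The last message is handled separately, outside the loop.
local last = messages[#messages]
```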

https://github.com/Kong/kong/pull/12903/files#r1575192689

hanshuebner · 2024-04-15T12:18:10Z

kong/llm/drivers/llama2.lua

+  messages.parameters.max_new_tokens = request_table.max_tokens or (model.options and model.options.max_tokens)
+  messages.parameters.top_p = request_table.top_p or (model.options and model.options.top_p)
+  messages.parameters.top_k = request_table.top_k or (model.options and model.options.top_k)
+  messages.parameters.temperature = request_table.temperature or (model.options and model.options.temperature)


This seems to be yet another pattern of merging request_table and model.options; unifying it across drivers would make sense.

hanshuebner · 2024-04-15T12:18:43Z

kong/llm/drivers/openai.lua

+  end
+
+  return request
+end


Here we have the merge function that all drivers should be using.

hanshuebner · 2024-04-15T12:23:13Z

kong/llm/drivers/azure.lua

    )
    parsed_url = socket_url.parse(url)
  end

+  if string.sub(parsed_url.path, 1, 1) ~= "/" then
+    parsed_url.path = "/" .. parsed_url.path
+  end


A common function to prepend a slash to relative paths should be introduced and used.
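One possible shape for such a shared helper is sketched below. The name `ensure_leading_slash` is hypothetical, and treating a missing or empty path as `"/"` is an assumption of this sketch; the code in the diff does not handle that case.

```lua
-- Hypothetical shared helper; the name is not taken from the PR.
local function ensure_leading_slash(path)
  -- Assumption: a missing or empty path is treated as the root path.
  if path == nil or path == "" then
    return "/"
  end

  if string.sub(path, 1, 1) ~= "/" then
    return "/" .. path
  end

  return path
end
```

A driver could then write `parsed_url.path = ensure_leading_slash(parsed_url.path)` instead of repeating the check.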

This PR was re-done, due to messy rebases from other features.

See: https://github.com/Kong/kong/pull/12903/files#r1575189707

hanshuebner · 2024-04-15T12:24:54Z

kong/llm/drivers/anthropic.lua

-  if body and body.model then
-    return nil, "cannot use own model for this instance"
+  if body and body.model and (body.model ~= conf.model.name) then
+    return nil, "requested model does not match the configured plugin model"


Is it sufficient to check the name of the model for equivalence?

tysoekong · 2024-04-19T13:34:58Z

This is going to be superseded by a different PR, closing.

pull-request-size bot added the size/M label Mar 28, 2024

github-actions bot assigned tysoekong Mar 28, 2024

github-actions bot added the cherry-pick kong-ee (schedule this PR for cherry-picking to kong/kong-ee), plugins/ai-proxy, plugins/ai-request-transformer, and plugins/ai-response-transformer labels Mar 28, 2024

tysoekong force-pushed the feat/KAG-4127-ai-proxy-allow-override-defaults branch 2 times, most recently from a0234ae to 8227a29 Compare March 29, 2024 03:25

pull-request-size bot added size/L and removed size/M labels Mar 29, 2024

tysoekong force-pushed the feat/KAG-4127-ai-proxy-allow-override-defaults branch 2 times, most recently from 5d1b3e6 to f2bef33 Compare March 29, 2024 03:27

tysoekong marked this pull request as ready for review March 29, 2024 03:27

tysoekong force-pushed the feat/KAG-4127-ai-proxy-allow-override-defaults branch from 699b3ff to 4b89b4e Compare April 1, 2024 01:04

tysoekong force-pushed the feat/KAG-4127-ai-proxy-allow-override-defaults branch from 4b89b4e to 78d5b46 Compare April 8, 2024 13:48

RobSerafini added this to the 3.7.0 milestone Apr 8, 2024

pull-request-size bot added size/XL and removed size/L labels Apr 10, 2024

tysoekong changed the title ~~feat(ai-proxy): allow overriding default model options~~ feat(ai-proxy): enable compatibility with LLM SDKs Apr 10, 2024

tysoekong changed the title ~~feat(ai-proxy): enable compatibility with LLM SDKs~~ feat(ai-proxy): enable compatibility with LLM SDKs, model selection by request-parameters Apr 10, 2024