The new APIM policies for Azure OpenAI do not play nicely with some models like GPT-4-Vision. #58

Merged
38 changes: 28 additions & 10 deletions infra/modules/apim/main.tf
@@ -130,16 +130,34 @@ resource "azurerm_api_management_api_policy" "policy" {
 </required-claims>
 </validate-jwt>

-<azure-openai-emit-token-metric
-namespace="AzureOpenAI">
-<dimension name="API ID" />
-<dimension name="Operation ID" />
-<dimension name="Client IP" value="@(context.Request.IpAddress)" />
-</azure-openai-emit-token-metric>
-
-<azure-openai-token-limit
-counter-key="@(context.Request.IpAddress)"
-tokens-per-minute="10000" estimate-prompt-tokens="false" remaining-tokens-variable-name="remainingTokens" />
+<choose>
+<when condition="@(context.Request.Body.As<JObject>(preserveContent: true)["messages"]?.All(message => message["content"].All(content => !(content is JObject))) == true)">
+
+<!-- If all type properties are 'text' or there are no type properties, apply the new Azure OpenAI policies -->
+
+<trace source="Azure OpenAI Policies" severity="information">
+<message>Using Azure OpenAI policies.</message>
+<metadata name="Using_Azure_OpenAI_Policies" value="true" />
+</trace>
+
+<azure-openai-emit-token-metric
+namespace="AzureOpenAI">
+<dimension name="API ID" />
+<dimension name="Operation ID" />
+<dimension name="Client IP" value="@(context.Request.IpAddress)" />
+</azure-openai-emit-token-metric>
+
+<azure-openai-token-limit
+counter-key="@(context.Request.IpAddress)"
+tokens-per-minute="10000" estimate-prompt-tokens="false" remaining-tokens-variable-name="remainingTokens" />
+</when>
+<otherwise>
+<trace source="Azure OpenAI Policies" severity="information">
+<message>Not using Azure OpenAI policies.</message>
+<metadata name="Using_Azure_OpenAI_Policies" value="false" />
+</trace>
+</otherwise>
+</choose>

 <set-backend-service backend-id="${azapi_resource.apim_backend_pool.name}" />
 </inbound>
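
In plain terms, the new `<choose>` block applies the `azure-openai-emit-token-metric` and `azure-openai-token-limit` policies only when every `messages[].content` value is a plain string. GPT-4-Vision style requests send `content` as an array of typed parts (JSON objects), so they fall into the `<otherwise>` branch and skip those policies. Below is a minimal Python sketch that mirrors the policy expression against two hypothetical request bodies; the function name and sample payloads are illustrative only and are not part of this repository.

```python
import json

def uses_text_only_content(body: str) -> bool:
    # Mirrors the policy expression:
    #   messages?.All(message => message["content"].All(content => !(content is JObject)))
    # Returns True when no message carries structured content parts.
    messages = json.loads(body).get("messages") or []
    for message in messages:
        content = message.get("content")
        if isinstance(content, list) and any(isinstance(part, dict) for part in content):
            return False
    return True

# Plain chat completion request: the token metric/limit policies apply.
text_request = json.dumps({
    "messages": [{"role": "user", "content": "Summarise this document."}]
})

# GPT-4-Vision style request: content is an array of typed parts,
# so the request falls into the <otherwise> branch and the policies are skipped.
vision_request = json.dumps({
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
        ],
    }]
})

assert uses_text_only_content(text_request) is True
assert uses_text_only_content(vision_request) is False
```

The trade-off, as the diff shows, is that vision-style requests bypass the per-IP token limit and token metrics entirely and only emit the `<otherwise>` trace, which is what keeps these policies from breaking models like GPT-4-Vision.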