Unable to set property to null on AgentPool #2462
Comments
Hey @jaxxstorm! Thanks for opening this - is this reproducible in other languages as well, or is it only a Python issue?
In the line Unless the example snippet was manually edited and |
I tried to repro but there's no auto-approvable quota for multi-GPU VM SKUs. Sigh. Submitted a support request.
import * as resources from "@pulumi/azure-native/resources";
import * as containerservice from "@pulumi/azure-native/containerservice";

const resourceGroup = new resources.ResourceGroup("resourceGroup");

// Parent cluster with a single inline system node pool.
const managedCluster = new containerservice.ManagedCluster("managedCluster", {
    addonProfiles: {},
    agentPoolProfiles: [{
        count: 3,
        enableNodePublicIP: true,
        mode: "System",
        name: "nodepool1",
        osType: "Linux",
        type: "VirtualMachineScaleSets",
        vmSize: "Standard_DS2_v2",
    }],
    dnsPrefix: "dnsprefix1",
    enableRBAC: true,
    kubernetesVersion: "",
    networkProfile: {
        loadBalancerProfile: {
            managedOutboundIPs: {
                count: 2,
            },
        },
        loadBalancerSku: "standard",
        outboundType: "loadBalancer",
    },
    resourceGroupName: resourceGroup.name,
    resourceName: "clustername1",
    servicePrincipalProfile: {
        // Values redacted in the original comment.
        clientId: "REDACTED",
        secret: "REDACTED",
    },
    sku: {
        name: "Base",
        tier: "Free",
    },
});

// Standalone GPU agent pool; gpuInstanceProfile is the property under test.
const pool = new containerservice.AgentPool("sa", {
    agentPoolName: "agentpool1",
    count: 1,
    gpuInstanceProfile: "MIG1g",
    orchestratorVersion: "",
    osType: "Linux",
    resourceGroupName: resourceGroup.name,
    resourceName: managedCluster.name,
    vmSize: "Standard_ND96asr_v4",
});
Starting to familiarise myself with this resource. Here are the facts I think are true.
It looks like the AgentPool is a nested resource of the managed cluster. You can create AgentPoolProfiles inline with the parent, but the AgentPool itself is a standalone resource, though it shares all of the same properties as the profiles - with the addition of the
ARM resource docs: https://learn.microsoft.com/en-us/azure/templates/microsoft.containerservice/managedclusters/agentpools
Example VM size supporting GPUs: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/ndasra100v4-series
Our default version is using 2023-04-01 but the latest ARM docs are using 2024-06-02-preview.
The specs don't indicate how to unset this field. The empty string is not valid and, as far as I can remember, ARM update operations don't require all properties, so a missing field is not treated as a removal.
I expect this field might be set to a default value server-side for VMs with GPU support, but removed if the VM doesn't support GPUs. There's no indication of this behaviour in the spec.
It's also possible that this is a field which Azure treats as immutable once created, but hasn't documented via the spec. The API might just discard the field it can't update, as we've observed this behaviour in other resources.
Next step will be retrying the program above.
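To make the "missing field is not treated as a removal" point concrete, here is a minimal sketch of partial-update semantics. It is a hypothetical model only: `applyPartialUpdate` and the `Props` type are illustrative names, not part of ARM or the provider, and the real service behaviour is not documented in the spec.

```typescript
// Hypothetical model of a partial (PATCH-style) update; not Azure's actual implementation.
type Props = Record<string, string | number | undefined>;

// Merge only the keys that are present and defined in the request into the stored state.
// A key that is absent (or undefined) in the request leaves the stored value untouched.
function applyPartialUpdate(stored: Props, request: Props): Props {
    const result: Props = { ...stored };
    for (const key of Object.keys(request)) {
        const value = request[key];
        if (value !== undefined) {
            result[key] = value;
        }
    }
    return result;
}

const stored: Props = { count: 1, vmSize: "Standard_ND96asr_v4", gpuInstanceProfile: "MIG1g" };
// The program no longer sets gpuInstanceProfile, so the key is simply absent from the request.
const updated = applyPartialUpdate(stored, { count: 2, vmSize: "Standard_ND96asr_v4" });
console.log(updated.gpuInstanceProfile); // still "MIG1g"
```

Under this model, dropping the property from the program changes nothing on the server, which would match the symptom reported in the issue.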
Test Plan
Notes:
After 24 failed attempts to deploy an agent pool with GPUs by varying the vmSize, I'm going to need to put this back on hold until we can find a provisionable VM size.
Docs on what multi-instance GPUs actually are: https://learn.microsoft.com/en-us/azure/aks/gpu-multi-instance From the page above:
Therefore, it's not possible to update this value, either to a new value or by removing it. To resolve this issue, we'll mark this property as causing replacement. Also related, and giving a little more insight into what can be updated: https://learn.microsoft.com/en-us/cli/azure/aks/nodepool?view=azure-cli-latest#az-aks-nodepool-update
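As a rough illustration of what "causing replacement" means for a diff, the sketch below classifies a change as an in-place update or a replacement. The names `planChange`, `replaceOnChange`, and `AgentPoolProps` are hypothetical and this is not the provider's actual diff code; it only assumes that `gpuInstanceProfile` is flagged as replace-on-change.

```typescript
// Hypothetical diff helper; the names and property list are illustrative only.
type AgentPoolProps = { count?: number; vmSize?: string; gpuInstanceProfile?: string };
type Plan = "same" | "update" | "replace";

// Properties assumed to force a replacement when their value changes or is removed.
const replaceOnChange: (keyof AgentPoolProps)[] = ["gpuInstanceProfile"];

function planChange(oldProps: AgentPoolProps, newProps: AgentPoolProps): Plan {
    const keys: (keyof AgentPoolProps)[] = ["count", "vmSize", "gpuInstanceProfile"];
    // A removed property compares as undefined, so removal counts as a change.
    const changed = keys.filter((k) => oldProps[k] !== newProps[k]);
    if (changed.length === 0) {
        return "same";
    }
    return changed.some((k) => replaceOnChange.indexOf(k) !== -1) ? "replace" : "update";
}

const before: AgentPoolProps = { count: 1, vmSize: "Standard_ND96asr_v4", gpuInstanceProfile: "MIG1g" };
console.log(planChange(before, { count: 2, vmSize: "Standard_ND96asr_v4", gpuInstanceProfile: "MIG1g" })); // "update"
console.log(planChange(before, { count: 1, vmSize: "Standard_ND96asr_v4" })); // "replace"
```

With the property flagged this way, removing `gpuInstanceProfile` produces a delete-and-recreate of the node pool instead of an update request that the API would silently ignore.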
As noted on https://learn.microsoft.com/en-us/azure/aks/gpu-multi-instance?tabs=azure-cli#gpu-instance-profiles
> You can't change the applied GPU instance profile after node pool creation.
As noted on https://learn.microsoft.com/en-us/azure/aks/manage-node-pools#limitations
> You can't change the VM size of a node pool after you create it.
Fixes #2462
This issue has been addressed in PR #3655 and shipped in release v2.68.0.
What happened?
When creating a container service agent pool, modifying the gpu_instance_profile doesn't work for null values.
Expected Behavior
If you set the property to None or any other nullable value, it should modify the resource.
Steps to reproduce
Create an agent pool with a gpu_instance_profile
Then modify this to remove the GPU instance profile:
Output of pulumi about
N/A
Additional context
No response
Contributing
Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).