Update Filestore docs
nstogner committed Oct 17, 2024
1 parent 3645588 commit 06873fc
Showing 4 changed files with 61 additions and 8 deletions.
15 changes: 14 additions & 1 deletion charts/kubeai/templates/role.yaml
@@ -9,10 +9,10 @@ rules:
- ""
resources:
- pods
- persistentvolumeclaims
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
@@ -25,6 +25,19 @@ rules:
verbs:
- create
- delete
- deletecollection
- get
- list
- patch
- update
- watch
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- create
- delete
- get
- list
- patch
50 changes: 45 additions & 5 deletions docs/how-to/cache-models-with-gcp-filestore.md
@@ -15,21 +15,34 @@ gcloud services enable file.googleapis.com

Apply a Model with the cache profile set to `standard-filestore` (defined in the reference [GKE Helm values file](https://github.com/substratusai/kubeai/blob/main/charts/kubeai/values-gke.yaml)).

<details markdown="1">
<summary>TIP: If you want to use `premium-filestore` you will need to ensure you have quota.</summary>
Open the cloud console quotas page: https://console.cloud.google.com/iam-admin/quotas. Make sure your project is selected in the top left.

Ensure that you have at least 2.5 TiB of `PremiumStorageGbPerRegion` quota in the region where your cluster is deployed.

![Premium Storage Quota Screenshot](../screenshots/gcp-quota-premium-storage-gb-per-region)

</details>
<br>

NOTE: If you already installed the models chart, you will need to edit your values file and run `helm upgrade`.

```bash
helm install kubeai-models $REPO_DIR/charts/models -f - <<EOF
helm install kubeai-models kubeai/models -f - <<EOF
catalog:
opt-125m-cpu:
llama-3.1-8b-instruct-fp8-l4:
enabled: true
cacheProfile: standard-filestore
llama-3.1-8b-instruct-fp8-l4-nocache:
enabled: true
EOF
```

Wait for the Model to be fully cached.
Wait for the Model to be fully cached. This may take a while if the Filestore instance needs to be created.

```bash
kubectl wait --timeout 10m --for=jsonpath='{.status.cache.loaded}'=true model/opt-125m-cpu
kubectl wait --timeout 10m --for=jsonpath='{.status.cache.loaded}'=true model/llama-3.1-8b-instruct-fp8-l4
```

This model will now be loaded from Filestore when it is served.
@@ -44,14 +57,41 @@ Ensure that the Filestore CSI driver is enabled by checking for the existence of
kubectl get storageclass standard-rwx premium-rwx
```

### PersistentVolumeClaim
### PersistentVolumes

Check the PersistentVolumeClaim (that should be created by KubeAI).

```bash
kubectl describe pvc shared-model-cache-
```
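
The PVC name in the command above is only a prefix, so the full name has to be looked up first. A minimal sketch of one way to do that, assuming the claims KubeAI creates all start with `shared-model-cache-` (as the command suggests) and that `kubectl` points at the namespace where KubeAI runs. `filter_cache_pvcs` is a hypothetical helper name, not part of KubeAI or kubectl.

```bash
# Hypothetical helper (not part of KubeAI): keep only the PVC names that start
# with the shared cache prefix. Reads `kubectl get pvc -o name` style lines
# (for example "persistentvolumeclaim/foo") on stdin and prints bare names.
filter_cache_pvcs() {
  sed -n 's|^persistentvolumeclaim/\(shared-model-cache-.*\)$|\1|p'
}

# Usage (assumes kubectl is configured for the KubeAI namespace):
#   kubectl get pvc -o name | filter_cache_pvcs | while read -r name; do
#     kubectl describe pvc "$name"
#   done
```

Keeping the filter separate from the `kubectl` call means it can be checked against saved output without cluster access.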

<details markdown="1">
<summary>Example: Out-of-quota error</summary>
```
Warning ProvisioningFailed 11m (x26 over 21m) filestore.csi.storage.gke.io_gke-50826743a27a4d52bf5b-7fac-9607-vm_b4bdb2ec-b58b-4363-adec-15c270a14066 failed to provision volume with StorageClass "premium-rwx": rpc error: code = ResourceExhausted desc = googleapi: Error 429: Quota limit 'PremiumStorageGbPerRegion' has been exceeded. Limit: 0 in region us-central1.
Details:
[
{
"@type": "type.googleapis.com/google.rpc.QuotaFailure",
"violations": [
{
"description": "Quota 'PremiumStorageGbPerRegion' exhausted. Limit 0 in region us-central1",
"subject": "project:819220466562"
}
]
}
]
```
</details>
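
If the `describe` output is long, the quota failure shown in the example can be searched for directly. A rough sketch, assuming the event text contains `ResourceExhausted` or the `Quota limit ... has been exceeded` wording from the example above; other provisioning failures will not match. `has_quota_error` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed (exit 0) when the text on stdin looks like the
# out-of-quota provisioning failure shown in the example above.
has_quota_error() {
  grep -Eq 'ResourceExhausted|Quota limit .* has been exceeded'
}

# Usage (assumes <name> is the full PVC name found earlier):
#   kubectl describe pvc <name> | has_quota_error && echo "Filestore quota exhausted"
```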

Check to see if the PersistentVolume has been fully provisioned.

```bash
kubectl get pv
# Find name of corresponding pv...
kubectl describe pv <name>
```
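
Finding the corresponding PV by eye can be tedious on a busy cluster. A minimal sketch that narrows the list, assuming the PV's `spec.claimRef.name` holds the name of the bound PVC and that the cache claims start with `shared-model-cache-`. `pv_for_cache_claims` is a hypothetical helper name.

```bash
# Hypothetical helper: read "PV-NAME CLAIM-NAME" pairs on stdin and print the
# PV names whose claim starts with the shared cache prefix.
pv_for_cache_claims() {
  awk '$2 ~ /^shared-model-cache-/ { print $1 }'
}

# Usage (assumes kubectl access to the cluster):
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,CLAIM:.spec.claimRef.name \
#     | pv_for_cache_claims
```

Each printed name can then be passed to `kubectl describe pv`.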

### Model Loading Job

Check to see if there is an ongoing model loader Job.
4 changes: 2 additions & 2 deletions skaffold.yaml
@@ -43,8 +43,8 @@ profiles:

- name: kubeai-only-gke
build:
artifacts:
- image: substratusai/kubeai
local:
push: true
deploy:
helm:
releases: