Add bare bones doc about managing models (#149)
Helps clarify #145
nstogner authored Aug 30, 2024
1 parent 7e3579f commit 7b8c9d8
Showing 2 changed files with 73 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -108,6 +108,7 @@ Any vLLM or Ollama model can be served by KubeAI. Some examples of popular model
## Guides

* [Cloud Installation](./docs/cloud-install.md) - Deploy on Kubernetes clusters in the cloud
* [Model Management](./docs/model-management.md) - Manage ML models

## OpenAI API Compatibility

72 changes: 72 additions & 0 deletions docs/model-management.md
@@ -0,0 +1,72 @@
# Model Management

KubeAI uses Model [Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) to configure which ML models are available in the system.

Example:

```yaml
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: llama-3.1-8b-instruct-fp8-l4
spec:
  features: ["TextGeneration"]
  owner: neuralmagic
  url: hf://neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8
  engine: VLLM
  args:
    - --max-model-len=16384
    - --max-num-batched-tokens=16384
    - --gpu-memory-utilization=0.9
  minReplicas: 0
  maxReplicas: 3
  resourceProfile: L4:1
```

### Listing Models

You can view all installed models through the Kubernetes API using `kubectl get models` (use the `-o yaml` flag for more details).

You can also list all models via the OpenAI-compatible `/v1/models` endpoint:

```bash
curl http://your-deployed-kubeai-endpoint/openai/v1/models
```
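The same request can be made from a script. The sketch below assumes the endpoint returns the standard OpenAI list shape (a top-level `data` array whose entries carry an `id`); the helper names `model_ids` and `fetch_model_ids` are illustrative and not part of KubeAI.

```python
import json
import urllib.request


def model_ids(payload: dict) -> list[str]:
    """Return the `id` of every entry in an OpenAI-style model list.

    Assumes the payload looks like {"object": "list", "data": [{"id": ...}, ...]}.
    """
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_model_ids(base_url: str) -> list[str]:
    """GET <base_url>/v1/models and return the model ids it reports."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        return model_ids(json.load(resp))
```

For example, `fetch_model_ids("http://your-deployed-kubeai-endpoint/openai")` uses the same placeholder host as the curl command above; replace it with your own endpoint.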

### Installing a predefined Model using Helm

When you are defining your Helm values, you can install a predefined Model by setting `enabled: true`:

```yaml
models:
  catalog:
    llama-3.1-8b-instruct-fp8-l4:
      enabled: true
```

You can also optionally override settings for a given model:

```yaml
models:
  catalog:
    llama-3.1-8b-instruct-fp8-l4:
      enabled: true
      env:
        MY_CUSTOM_ENV_VAR: "some-value"
```

### Adding Custom Models

You can add your own model by defining a Model YAML file and applying it using `kubectl apply -f model.yaml`.

If you have a running cluster with KubeAI installed, you can inspect the schema for a Model using `kubectl explain`:

```bash
kubectl explain models
kubectl explain models.spec
kubectl explain models.spec.engine
```
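If you maintain several custom models, generating the manifests from a script can be less error-prone than editing each file by hand. The following is a minimal sketch, not a KubeAI tool: the function `model_manifest` and its parameter names are hypothetical, and the fields it emits are only those shown in the example at the top of this page. It prints JSON, which `kubectl apply -f` accepts in the same way as YAML.

```python
import json


def model_manifest(name, url, engine, resource_profile, features,
                   args=(), min_replicas=0, max_replicas=1):
    """Build a Model manifest as a plain dict (hypothetical helper).

    Field names mirror the example Model on this page; check
    `kubectl explain models.spec` for the schema your cluster actually serves.
    """
    return {
        "apiVersion": "kubeai.org/v1",
        "kind": "Model",
        "metadata": {"name": name},
        "spec": {
            "features": list(features),
            "url": url,
            "engine": engine,
            "args": list(args),
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "resourceProfile": resource_profile,
        },
    }


# Values copied from the example Model above.
manifest = model_manifest(
    name="llama-3.1-8b-instruct-fp8-l4",
    url="hf://neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8",
    engine="VLLM",
    resource_profile="L4:1",
    features=["TextGeneration"],
    args=["--max-model-len=16384"],
    max_replicas=3,
)
print(json.dumps(manifest, indent=2))
```

Redirect the output to a file such as `model.json` and apply it with `kubectl apply -f model.json`.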

### Model Management UI

We are considering adding a UI for managing models in a running KubeAI instance. Give the [GitHub Issue](https://github.com/substratusai/kubeai/issues/148) a thumbs up if you would be interested in this feature.
