Releases: dstackai/dstack
0.16.4
CUDO Compute
The `0.16.4` update introduces the `cudo` backend, which allows running workloads with CUDO Compute, a cloud GPU marketplace.

To configure the `cudo` backend, specify your CUDO Compute project ID and API key in `~/.dstack/server/config.yml`:
```yaml
projects:
  - name: main
    backends:
      - type: cudo
        project_id: my-cudo-project
        creds:
          type: api_key
          api_key: 7487240a466624b48de22865589
```
Once it's done, you can restart the `dstack server` and use the `dstack` CLI or API to run workloads.
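For example, after the server restarts, a minimal dev environment configuration (a sketch reusing the format shown under 0.15.0 below; the resource values are illustrative) can be run on CUDO Compute like on any other backend:

```yaml
type: dev-environment
python: "3.11"
ide: vscode
resources:
  gpu: 24GB
```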
Note
Limitations
- The `dstack gateway` feature is not yet compatible with `cudo`, but it is expected to be supported in version `0.17.0`, planned for release within a week.
- The `cudo` backend cannot yet be used with dstack Sky, but it will also be enabled within a week.
Full changelog: 0.16.3...0.16.4
0.16.3
Bug-fixes
- [Bug] The `shm_size` property in `resources` doesn't take effect #1006
- [Bug] It's not possible to configure projects other than main via `~/.dstack/server/config.yml` #991
- [Bug] Spot instances don't work on GCP if the username has upper case letters #975
Full changelog: 0.16.2...0.16.3
0.16.1
Improvements to `dstack pool`
- Change default idle duration for `dstack pool add` to `72h` #964
- Set the default spot policy in `dstack pool add` to `on-demand` #962
- Add pool support for `lambda`, `azure`, and `tensordock` #923
- Allow to pass idle duration and spot policy in `dstack pool add` #918
- `dstack run` does not respect pool-related `profiles.yml` parameters #949 (a sketch of such parameters follows this list)
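As a rough sketch of the pool-related defaults mentioned above, these could live in `.dstack/profiles.yml`. Note that the field names `spot_policy` and `termination_idle_time` are assumptions here, not confirmed by these notes, so check the profiles reference for the exact schema:

```yaml
profiles:
  - name: default
    # Assumed field names; consult the profiles reference
    spot_policy: on-demand
    termination_idle_time: 72h
    default: true
```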
Bug-fixes
- Runs submitted via Python API have no termination policy #955
- The `vastai` backend doesn't show any offers since `0.16.0` #958
- Handle permission error when adding Include to `~/.ssh/config` #937
- The SSH tunnel fails because of a messy `~/.ssh/config` #933
- The `PATH` is overridden when logging in via SSH #930
- The SSH tunnel fails with `Too many authentication failures` #927
We've also updated our guide on how to add new backends. It's now available here.
New contributors
- @iRohith made their first contribution in #959
- @spott made their first contribution in #934
- @KevKibe made their first contribution in #917
Full Changelog: 0.16.0...0.16.1
0.16.0
Pools
The `0.16.0` release is the next major update which, in addition to many bug fixes, introduces pools: a major new feature that enables more efficient management of instance lifecycles and the reuse of instances across runs.
dstack run
Previously, when running a dev environment, task, or service, `dstack` provisioned an instance in a configured backend and, upon completion of the run, deleted the instance.

Now, the `dstack run` command first tries to reuse an instance from a pool. If no ready instance meets the requirements, `dstack` automatically provisions a new one and adds it to the pool.

Once the workload finishes, the instance is marked as `idle`. If the instance remains idle for the configured duration, `dstack` tears it down.
dstack pool
The `dstack pool` command allows for managing instances within pools.

To manually add an instance to a pool, use `dstack pool add`:
```shell
dstack pool add --gpu 80GB --idle-duration 1d
```
The `dstack pool add` command allows specifying resource requirements, along with the spot policy, idle duration, max price, retry policy, and other policies.

If no idle duration is configured, `dstack` sets it to `72h` by default. To override it, use the `--idle-duration DURATION` argument.
To learn more about pools, refer to the official documentation. To learn more about `0.16.0`, refer to the changelog.
What's changed
- Add dstack pool by @TheBits in #880
- Pools: fix failed instance status by @Egor-S in #889
- Add columns to `dstack pool show` by @TheBits in #898
- Add submit stop by @TheBits in #895
- Add kubernetes logo by @plutov in #900
- Handle exceptions from backend.compute().get_offers by @r4victor in #904
- Fix process_finished_jobs parsing None job_model.job_provisioning_data by @r4victor in #905
- Validate run_name by @r4victor in #906
- Filter out private subnets when provisioning in custom aws vpc by @r4victor in #909
- Issue 894 rework failed instance status by @TheBits in #899
- Handle unexpected exceptions from run_job by @r4victor in #911
- Request GPU in docker with --gpus=all by @Egor-S in #913
- Issue 918 fix CLI arguments for dstack pool add by @TheBits in #919
- Added router tests for pools by @TheBits in #916
- Fix #921 by @TheBits in #922
Full changelog: 0.15.1...0.16.0
0.15.2rc2
Bug-fixes
- Exclude private subnets when provisioning in AWS #908
- Ollama doesn't detect the GPU (requires `--gpus=all` instead of `--runtime=nvidia`) #910
Full changelog: 0.15.1...0.15.2rc2
0.15.1
Kubernetes
With the latest update, it's now possible to configure a Kubernetes backend. If you run a workload, `dstack` will provision infrastructure within your Kubernetes cluster. This may work with both self-managed and managed clusters.
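A server configuration for it might look roughly like this (a sketch: the `kubeconfig` field layout is an assumption, so consult the backend reference for the exact schema):

```yaml
projects:
  - name: main
    backends:
      - type: kubernetes
        # Assumed field layout; check the docs for the exact schema
        kubeconfig:
          filename: ~/.kube/config
```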
Specifying a custom VPC for AWS
If you're using dstack with AWS, it's now possible to configure a `vpc_name` via `~/.dstack/server/config.yml`.
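For example (a sketch; `creds: type: default` assumes the default AWS credentials chain, and the VPC name is illustrative):

```yaml
projects:
  - name: main
    backends:
      - type: aws
        creds:
          type: default # assumes default AWS credentials
        vpc_name: my-custom-vpc
```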
**Learn more about the new features in detail on the changelog page.**
What's changed
- Print total offers count in run plan by @Egor-S in #862
- Add OpenAPI reference to the docs by @Egor-S in #863
- Fixes #864 by pinning the APScheduler dep to < 4 by @tleyden in #867
- Support gateway creation for Kubernetes by @r4victor in #870
- Improve `get_latest_runner_build` by @Egor-S in #871
- Added ruff by @TheBits in #850
- Handle ResourceNotExistsError instead of 404 by @r4victor in #875
- Simplify Kubernetes backend config by @r4victor in #879
- Add SSH keys to GCP metadata by @Egor-S in #881
- Allow to configure VPC for an AWS backend by @r4victor in #883
Full Changelog: 0.15.0...0.15.1
0.15.0
Resources
It is now possible to configure resources in the YAML configuration file:
```yaml
type: dev-environment

python: 3.11
ide: vscode

# (Optional) Configure `gpu`, `memory`, `disk`, etc.
resources:
  gpu: 24GB
```
Supported properties include: `gpu`, `cpu`, `memory`, `disk`, and `shm_size`.
If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a range (e.g. `24GB..`, `24GB..80GB`, or `..80GB`).
The `gpu` property allows specifying not only memory size but also GPU names and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A100), `A100:80GB` (one A100 of 80GB), `A100:2` (two A100s), `24GB..40GB:2` (two GPUs with between 24GB and 40GB of memory), etc.
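Putting one of those examples into a configuration (an illustrative combination of the syntax above):

```yaml
resources:
  # Two A100 GPUs with 80GB of memory each
  gpu: A100:80GB:2
```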
Authorization in services
Service endpoints now require the `Authorization` header with `"Bearer <dstack token>"`. This also includes the OpenAI-compatible endpoints.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com",
    api_key="<dstack token>"
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
    ]
)

print(completion.choices[0].message)
```
Authentication can be disabled by setting `auth` to `false` in the service configuration file.
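For instance (a minimal sketch; the command and port are placeholders):

```yaml
type: service

commands:
  - python -m http.server 8000
port: 8000

# Disable the authorization requirement for this endpoint
auth: false
```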
OpenAI format in model mapping
Model mapping (required to enable the OpenAI-compatible interface) now supports `format: openai`.

For example, if you run vLLM in OpenAI mode, it's possible to configure model mapping for it.
```yaml
type: service

python: "3.11"

env:
  - MODEL=NousResearch/Llama-2-7b-chat-hf

commands:
  - pip install vllm
  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000

port: 8000

resources:
  gpu: 24GB

model:
  format: openai
  type: chat
  name: NousResearch/Llama-2-7b-chat-hf
```
What's changed
- Configuration resources & ranges by @Egor-S in #844
- Range.str always returns a string by @Egor-S in #845
- Add infinity example by @deep-diver in #847
- error in documentation: use --url instead of --server by @promsoft in #852
- Support authorization on the gateway by @Egor-S in #851
- Implement Kubernetes backend by @r4victor in #853
- Add gpu support for kubernetes by @r4victor in #856
- Resources parse and store by @Egor-S in #857
- Use python3.11 in generate-json-schema by @r4victor in #859
- Implement OpenAI to OpenAI adapter for gateway by @Egor-S in #860
New contributors
- @deep-diver made their first contribution in #847
- @promsoft made their first contribution in #852
Full Changelog: 0.14.0...0.15.0
0.14.0
OpenAI-compatible endpoints
With the latest update, we are extending the service configuration in dstack to enable you to optionally map your custom LLM to an OpenAI-compatible endpoint.
To learn more about the new feature, read our blog post on it.
What's changed
- Make gateway active by @Egor-S in #829
- Implement OpenAI streaming for TGI by @Egor-S in #833
- Make get_latest_runner_build robuster for editable installs by @Egor-S in #834
- Fix descending logs by @r4victor in #839
- Reraise Jinja2 TemplateError by @Egor-S in #840
Full Changelog: 0.13.1...0.14.0
0.13.1
Mounting repos via Python API
If you submit a task or a service via the Python API, you can now specify the `repo` argument of the `Client.runs.submit` method.

This argument accepts an instance of `dstack.api.LocalRepo` (which allows you to mount additional files to the run from a local folder), `dstack.api.RemoteRepo` (which allows you to mount additional files to the run from a remote Git repo), or `dstack.api.VirtualRepo` (which allows you to mount additional files to the run programmatically).
Here's an example:
```python
from dstack.api import Client, RemoteRepo

# Load the server address, user token, and project name
# from the local CLI config
client = Client.from_config()

repo = RemoteRepo.from_url(
    repo_url="https://github.com/dstackai/dstack-examples",
    repo_branch="main"
)
client.repos.init(repo)

run = client.runs.submit(
    configuration=...,
    repo=repo,
)
```
This allows you to access the additional files in your run from the mounted repo.
More examples are now available in the API documentation.
Note that the Python API is just one possible way to manage runs. Another is the CLI, which automatically mounts the repo in the current folder.
Bug-fixes
Among other improvements, the update addresses the issue that previously prevented the ability to pass custom arguments to the run using `${{ run.args }}` in the YAML configuration.
Here's an example:
```yaml
type: task

python: "3.11" # (Optional) If not specified, your local version is used

commands:
  - pip install -r requirements.txt
  - python train.py ${{ run.args }}
```

Now, you can pass custom arguments to the run via `dstack run`:

```shell
dstack run . -f train.dstack.yml --gpu A100 --train_batch_size=1 --num_train_epochs=100
```
In this case, `--train_batch_size=1 --num_train_epochs=100` will be passed to `python train.py`.
Contribution guide
Last but not least, we've extended our contribution guide with a new wiki page that guides you through the steps of adding a custom backend. This can be helpful if you decide to extend dstack with support for a custom backend (cloud provider).
Feel free to check out this new wiki page and share your feedback. As always, if you need help with adding custom backend support, you can always ask for assistance from our team.
0.13.0
Disk size
Previously, `dstack` set the disk size to `100GB` regardless of the cloud provider. Now, to accommodate larger language models and datasets, `dstack` enables setting a custom disk size using `--disk` in `dstack run` or via the `disk` property in `.dstack/profiles.yml`.
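As a sketch of the `profiles.yml` route (only the `disk` property is confirmed above; the surrounding profile fields are assumed):

```yaml
profiles:
  - name: large-disk
    # Request a 200GB disk instead of the previous 100GB default
    disk: 200GB
    default: true # assumed field; makes this the default profile
```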
CUDA 12.1
We've upgraded the default Docker image's CUDA drivers to 12.1 (for better compatibility with modern libraries).
Mixtral 8x7B
Lastly, and most importantly, we've added an example showing how to deploy Mixtral 8x7B as a service.