This repository has been archived by the owner on May 10, 2024. It is now read-only.

v0.1.3

@asaiacai asaiacai released this 11 Aug 19:00
· 18 commits to main since this release

This patch includes some bugfixes and adds support for passing Hugging Face tokens to access gated/private models for serving and training. This update also enables tensor parallelism across all GPUs for a given model, so larger models like llama-70b can be served on a multi-GPU instance.
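
As a rough illustration of why tensor parallelism matters for a model of this size, here is a back-of-the-envelope sketch. It is not part of this project's code: the helper `min_gpus_for_weights` is hypothetical, and it assumes 16-bit weights (2 bytes per parameter) and counts only the weights, ignoring KV cache, activations, and framework overhead, so real requirements will be higher.

```python
import math


def min_gpus_for_weights(n_params: float, bytes_per_param: int, gpu_mem_gb: float) -> int:
    """Hypothetical helper: fewest GPUs whose combined memory holds the weights alone."""
    weights_gb = n_params * bytes_per_param / 1e9  # total weight size in GB
    return math.ceil(weights_gb / gpu_mem_gb)


# Assumption: a 70-billion-parameter model stored in fp16/bf16 (2 bytes per parameter).
print(70e9 * 2 / 1e9)                      # weight size in GB
print(min_gpus_for_weights(70e9, 2, 80))   # GPUs needed with 80 GB cards
print(min_gpus_for_weights(70e9, 2, 40))   # GPUs needed with 40 GB cards
```

Under these assumptions the weights alone exceed the memory of a single 80 GB GPU, which is why sharding the model across the GPUs of one instance is needed to serve it.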

I promise to write a detailed changelog in v0.1.4!