This repository has been archived by the owner on May 10, 2024. It is now read-only.

v0.1.3

asaiacai released this 11 Aug 19:00

18 commits to main since this release

d5e2a32

This patch includes some bugfixes and adds support for passing Hugging Face tokens to access gated/private models for serving and training. It also enables tensor parallelism across all available GPUs, which allows serving larger models like llama-70b on a multi-GPU instance.

I promise to write a detailed changelog in v0.1.4!
