Prepare for open-source release #5

nitinkedia7 · 2024-05-08T16:29:22Z

Description

This PR open-sources the code for "Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve".
This codebase also serves as the baseline for fidelity tests of Vidur, the LLM inference system simulator.

Able to set up this repo from a fresh clone.
Able to run examples/offline_inference.py for multiple models with different tensor-parallel and pipeline-parallel combinations.

nitinkedia7 added 2 commits May 8, 2024 10:24

Make this branch exactly like internal fork

49d825d

Fix workflow files

c274eac

nitinkedia7 requested a review from AgrawalAmey May 8, 2024 16:29

nitinkedia7 changed the title ~~Prepare for Open-Source Release~~ Prepare for open-source release May 8, 2024

nitinkedia7 added 7 commits May 8, 2024 21:34

Specify commit id in .gitmodules

44e730f

format

fef8d05

format

cf0b7dd

Remove publish and pylint workflows

1497483

Fix python version to 3.10.x, add missing tqdm dependency

980f323

minor

5f434f8

minor

7b6bc3b

AgrawalAmey merged commit 8d8c986 into main May 8, 2024
2 checks passed