Add examples/gke/tgi-tpu-deployment/ for TGI on TPU #62

Draft
wants to merge 7 commits into main

Commits on Jul 29, 2024

  1. c084a9a
  2. 2e46df4
  3. c9b8b6e
  4. Set BATCH_SIZE=2 to skip MAX_BATCH_PREFILL_TOKENS

    Text Generation Inference (TGI) internally sets
    `MAX_BATCH_PREFILL_TOKENS` to `MAX_INPUT_TOKENS + 50`, and the TGI on
    TPU model warm-up validates that `MAX_BATCH_PREFILL_TOKENS <=
    MAX_INPUT_TOKENS * BATCH_SIZE`. Setting `BATCH_SIZE=2` makes
    `MAX_INPUT_TOKENS + 50 < MAX_INPUT_TOKENS * 2` (true whenever
    `MAX_INPUT_TOKENS` exceeds 50), so the validation passes.
    Alternatively, one could set `MAX_BATCH_PREFILL_TOKENS` explicitly to
    a value less than or equal to `MAX_INPUT_TOKENS` (ideally equal).
    alvarobartt committed Jul 29, 2024
    3e68329
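
The arithmetic in the `BATCH_SIZE=2` commit message can be checked with a small sketch. This is not TGI code: the function names are hypothetical, the `+ 50` default and the `<=` check are modeled only on the commit message's description, and the token counts in the usage note are illustrative.

```python
from typing import Optional


def default_max_batch_prefill_tokens(max_input_tokens: int) -> int:
    """Default described in the commit message: MAX_INPUT_TOKENS + 50."""
    return max_input_tokens + 50


def warmup_check_passes(
    max_input_tokens: int,
    batch_size: int,
    max_batch_prefill_tokens: Optional[int] = None,
) -> bool:
    """Model of the warm-up validation as described in the commit message:
    MAX_BATCH_PREFILL_TOKENS <= MAX_INPUT_TOKENS * BATCH_SIZE.
    """
    if max_batch_prefill_tokens is None:
        # Fall back to the assumed internal default when not set explicitly.
        max_batch_prefill_tokens = default_max_batch_prefill_tokens(max_input_tokens)
    return max_batch_prefill_tokens <= max_input_tokens * batch_size
```

Under these assumptions, `warmup_check_passes(1024, 1)` fails because 1074 exceeds 1024, while `warmup_check_passes(1024, 2)` passes because 1074 is below 2048. Passing `max_batch_prefill_tokens=1024` with `batch_size=1` also passes, which corresponds to the alternative mentioned in the commit message.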

Commits on Jul 30, 2024

  1. 9c7c1a7
  2. 21692e4
  3. 4159af3