Pull requests: ROCm/vllm


- fp8 moe configs. Mixtral-8x(7B,22B) TP=1,2,4,8 (#250, opened Oct 29, 2024 by divakar-amd)
- Fix kernel cache miss and add RDNA configs (#246, opened Oct 25, 2024 by hyoon1)
- Fused ROPE and reshape cache kernel (#229, opened Oct 11, 2024 by maleksan85)
- Update run-amd-test.sh (#192, opened Sep 17, 2024 by Alexei-V-Ivanov-AMD)
- multi-gpu fused_moe tuning support (#143, opened Aug 16, 2024 by divakar-amd)
- [DO NOT MERGE] Vinayak/moe final hashem (#127, opened Aug 11, 2024 by carlushuang)
- Add max-batch-size to benchmark_throughput.py (#122, opened Aug 7, 2024 by dllehr-amd)
- Add truncate to all files after json dump (#117, opened Aug 2, 2024 by jpvillam-amd)
- [Misc] Use main triton branch (#115, opened Aug 1, 2024 by binarman)
- Adding SHM broadcast to ROCm/vllm (#113, opened Jul 31, 2024 by Lzy17) [stale]
- optimizations for process output step (#104, opened Jul 25, 2024 by sanyalington) [stale]
- Update QueueLLM (#97, opened Jul 22, 2024 by gyulaz-htec) [stale]
- Add benchmark_latency_batched.py (#96, opened Jul 22, 2024 by dllehr-amd) [stale]
- Add VLLM_SCHED_PREFILL_KVC_FREEPCT (#89, opened Jul 18, 2024 by sanyalington) [stale]
- Torchrun api server (#71, opened Jun 27, 2024 by gshtras) [stale]
- Update on naive_attn module (#21, opened May 28, 2024 by seungrokj)