Roberta pr test2 #196
base: main
…oject#8872) Signed-off-by: kevin <kevin@anyscale.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: mgoin <michael@neuralmagic.com>
vllm-project#8378) Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Signed-off-by: tylertitsworth <tyler.titsworth@intel.com> Co-authored-by: youkaichao <youkaichao@126.com>
Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
…20241008 Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
This reverts commit baeec70. Signed-off-by: Jefferson Fialho <jfialho@ibm.com>
commit 935c58d9e70ed6e84559e95f696c65dfb282e422 Author: Max de Bayser <mbayser@br.ibm.com> Date: Fri Oct 11 14:28:57 2024 -0300 add registry of encoder-only models Signed-off-by: Max de Bayser <mbayser@br.ibm.com> commit 579337372e694d900a4e01899b81fe0afcf82c10 Merge: e7044a61c 30b0f2156 Author: Max de Bayser <mbayser@br.ibm.com> Date: Tue Oct 8 10:34:17 2024 -0300 Merge branch 'bert' into roberta_embedding Signed-off-by: Max de Bayser <mbayser@br.ibm.com> commit 30b0f2156bccbfb11def0d7902acb8b56d24a98a Merge: 80c18855f 8c746226c Author: Max de Bayser <mbayser@br.ibm.com> Date: Tue Oct 8 10:33:05 2024 -0300 Merge branch 'upstream_main' into bert commit 8c746226c956f7c8a4672689fee91c7d22befed6 Author: Brendan Wong <35351983+LunrEclipse@users.noreply.github.com> Date: Mon Oct 7 22:51:43 2024 -0700 [Frontend] API support for beam search for MQLLMEngine (#9117) commit 80c18855fcff195175b7046923c4b0c3815f141a Author: laishzh <laishengzhang@gmail.com> Date: Mon Oct 7 12:04:34 2024 +0800 feat: update with origin/main commit 6440795f407c652ecdb045d1b141913afdb8b5e1 Merge: 04b0bc6ff 487678d04 Author: laishzh <laishengzhang@gmail.com> Date: Mon Oct 7 11:28:19 2024 +0800 Merge branch 'origin/main' commit 04b0bc6ff534495a9627f5548767f5bfb95268e8 Author: laishzh <laishengzhang@gmail.com> Date: Mon Oct 7 02:54:55 2024 +0800 feat: revert embedding_block_manager commit 352d8b2641d11ffa0e153462fd89b54525998843 Merge: 3fbfdf429 107d9c207 Author: laishzh <laishengzhang@gmail.com> Date: Mon Oct 7 00:45:52 2024 +0800 Merge remote-tracking branch 'maxdebayser/bert' commit e7044a61cebf6b9229a50a8396fdef104e799a9e Merge: a14b4e39d 107d9c207 Author: Max de Bayser <mbayser@br.ibm.com> Date: Wed Oct 2 18:04:38 2024 -0300 Merge branch 'bert' into roberta_embedding commit 107d9c207808c6f070ef086e3ea748cecbc9d809 Merge: 57bdd6049 7f60520de Author: Max de Bayser <mbayser@br.ibm.com> Date: Wed Oct 2 17:52:52 2024 -0300 Merge branch 'upstream_main' into bert Signed-off-by: Max de Bayser 
<mbayser@br.ibm.com> commit a14b4e39d26eb953c569ebb219aa3cb7203699ec Merge: 08f1781d6 57bdd6049 Author: Max de Bayser <mbayser@br.ibm.com> Date: Thu Sep 26 17:25:28 2024 -0300 Merge branch 'bert' into roberta_embedding Signed-off-by: Max de Bayser <mbayser@br.ibm.com> commit 57bdd6049129b43244d3c70ea876e784762e96e9 Merge: 2c8a5b922 7193774b1 Author: Max de Bayser <mbayser@br.ibm.com> Date: Thu Sep 26 17:15:18 2024 -0300 Merge branch 'upstream_main' into bert Signed-off-by: Max de Bayser <mbayser@br.ibm.com> commit 3fbfdf42966c7324466e266dc6d4b5c26131aee5 Merge: 2c8a5b922 873edda6c Author: laishzh <laishengzhang@gmail.com> Date: Thu Sep 26 23:23:39 2024 +0800 Merge remote-tracking branch 'origin/main' # Conflicts: # vllm/inputs/data.py commit 08f1781d6bd49653bd62ffdfde4f86d903f0c65a Author: Max de Bayser <maxdebayser@gmail.com> Date: Mon Sep 23 17:04:35 2024 -0300 add head size 32 Signed-off-by: Max de Bayser <maxdebayser@gmail.com> commit 2c8a5b9224ce9e26b2e43bb2312be91e2c74de9c Merge: 15be7fa8b f2bd246c1 Author: Max de Bayser <maxdebayser@gmail.com> Date: Mon Sep 23 13:48:10 2024 -0300 Merge branch 'main' into bert Signed-off-by: Max de Bayser <maxdebayser@gmail.com> commit 30c875e9e61f1e9e4d556014f49362adff76269a Merge: afd997ba9 464a90f4e Author: Max de Bayser <maxdebayser@gmail.com> Date: Mon Sep 23 13:59:23 2024 -0300 Merge branch 'bert' into roberta_embedding commit 464a90f4e09165ab724de26b35e9d7913c5d6560 Merge: 15be7fa8b f2bd246c1 Author: Max de Bayser <maxdebayser@gmail.com> Date: Mon Sep 23 13:48:10 2024 -0300 Merge branch 'main' into bert Signed-off-by: Max de Bayser <maxdebayser@gmail.com> commit afd997ba9f6ec2513145c0ca469a15783e0c96e5 Merge: 7d0ecb90c 15be7fa8b Author: Max de Bayser <maxdebayser@gmail.com> Date: Mon Sep 23 13:14:29 2024 -0300 Merge branch '5447' into roberta_embedding commit 15be7fa8bce185f64fafecaabdb8c828e83f4ad8 Author: laishzh <laishengzhang@gmail.com> Date: Mon Sep 9 23:04:44 2024 +0800 feat: fix lint commit 
0ea4da1c549bf35c8456c47729da46dd33481cac Author: laishzh <laishengzhang@gmail.com> Date: Mon Sep 9 23:01:22 2024 +0800 feat: fix lint commit 776dcbdae9d693dbd6546b7784712c06e6ef473c Merge: 3ff2d3637 4ef41b847 Author: laishzh <laishengzhang@gmail.com> Date: Mon Sep 9 10:32:46 2024 +0800 Merge branch 'main' of https://github.com/vllm-project/vllm # Conflicts: # vllm/core/embedding_model_block_manager.py commit 3ff2d36375d9560f87c56860ffff8a774a217cf9 Author: laishzh <laishengzhang@gmail.com> Date: Mon Sep 9 10:29:01 2024 +0800 feat: some changes on test_embedding.py commit e351bfd0febe4bbf8030fcd07f39eef5cce97641 Author: laishzh <laishengzhang@gmail.com> Date: Sun Sep 8 23:50:18 2024 +0800 feat: bert embedding implemented, but still have some bugs with mistral, commit 7d0ecb90c5034d41f0d9b38eede25f50bf941e3d Author: Max de Bayser <mbayser@br.ibm.com> Date: Wed Aug 28 16:35:25 2024 -0300 Add support for Roberta embedding models It's almost identical to the Bert models Signed-off-by: Max de Bayser <mbayser@br.ibm.com> commit 612cf1a969fa46105c3685b2eb025cde6416747d Author: laishzh <laishengzhang@gmail.com> Date: Tue Aug 27 15:19:50 2024 +0800 feat: modify test_embedding commit fc1f2b7ceb69f9588799820831145babf29aaa64 Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 15:39:33 2024 +0800 chore: fix lint commit d09860763500b85193230588386f0e3d515e231c Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 15:24:51 2024 +0800 feat: remove embedding_model_block_manager.py commit 37f698b4241a42c9634030e372e419b47e2a1e9c Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 15:16:34 2024 +0800 feat: move BertEmbeddingModel to the end of file commit 6f006f5ad698d76599e0b005520e65921042d07b Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 15:06:21 2024 +0800 chore: fix lint commit bfd7ec9e043cf304e6dea024912eb2a18c786bd6 Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 14:59:06 2024 +0800 feat: model input commit 
8b107a24a4ef9abb194686066c3bebc6923c6876 Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 13:41:49 2024 +0800 feat: fix lint commit e15d0cce60e3f39f2aaf8c3f62314a6d6b4ea091 Merge: b76da51c0 f710fb526 Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 12:45:26 2024 +0800 Merge branch 'main' into main commit b76da51c0d9ba1b4e39d432b8fb557ed8319034f Author: laishzh <laishengzhang@gmail.com> Date: Mon Aug 19 11:35:22 2024 +0800 feat: enc_dec_runner base commit b99d783bd852eb4cae228fcd8faf3344cd9a6fed Author: laishzh <laishengzhang@gmail.com> Date: Sun Aug 18 00:49:57 2024 +0800 feat: remove embedding block space manager commit 7e1196d25054d76d92b3777bc077d3cffd742599 Author: laishzh <laishengzhang@gmail.com> Date: Sat Aug 17 14:43:32 2024 +0800 fix: fix hint commit ce9a599194dbc3a208a6a4a21fdccaaa5c26ece8 Author: laishzh <laishengzhang@gmail.com> Date: Sat Aug 17 02:18:54 2024 +0800 feat: bos_token_id commit 275f49de32136eb9e4298d42aa85a1e2dc56924c Author: laishzh <laishengzhang@gmail.com> Date: Sat Aug 17 01:03:55 2024 +0800 feat: embedding model prompt commit 0b3f55c66e5eb40808f46ebde3c38213478050c7 Author: laishzh <laishengzhang@gmail.com> Date: Fri Aug 16 15:12:51 2024 +0800 feat: fix lint commit 91e23d8ad2b45790590889d6ee437702f5003792 Author: laishzh <laishengzhang@gmail.com> Date: Fri Aug 16 15:04:30 2024 +0800 feat: fix lint commit 7657af3f49cdb567bc96b44157c89f18cc4d0a22 Author: laishzh <laishengzhang@gmail.com> Date: Fri Aug 16 15:01:26 2024 +0800 feat: fix lint commit f2158848b9abd839c515c568acd592d0416c6682 Author: laishzh <laishengzhang@gmail.com> Date: Fri Aug 16 11:21:54 2024 +0800 chore: recover commit a0ad0df28c9de89bdd66b587502f6af9265065be Author: laishzh <laishengzhang@gmail.com> Date: Fri Aug 16 11:15:28 2024 +0800 chore: recover unchanged files commit 872e79531b39d1bf12ea81ddcd5bf919dd97265d Author: laishzh <laishengzhang@gmail.com> Date: Thu Aug 15 21:40:55 2024 +0800 feat: embedding model forward commit 
682c455bb0b8c950e1e00b43a6841f433f62db97 Author: laishzh <laishengzhang@gmail.com> Date: Thu Aug 15 14:36:40 2024 +0800 feat: recover sequence commit aca786e4359ef55d0af006199728c8b941558579 Author: laishzh <laishengzhang@gmail.com> Date: Thu Aug 15 13:44:03 2024 +0800 feat: default bos_token_id of encoder model commit 76b47fb1b7920fb50a889f19e1c1421e4385d1ca Author: laishzh <laishengzhang@gmail.com> Date: Thu Aug 15 13:18:53 2024 +0800 chore: recover commit 37bcba01408d37b192063e2ee2b9ac1c3087393c Author: laishzh <laishengzhang@gmail.com> Date: Wed Aug 14 17:47:05 2024 +0800 feat: full pipeline commit 63fb7a582cef08ec29a8b30024a01602dc5ee636 Author: laishzh <laishengzhang@gmail.com> Date: Wed Aug 14 02:39:31 2024 +0800 WIP: bert embedding commit 53c5148e9f5024f2eb6a83bbf7af191dc88fe555 Author: laishzh <laishengzhang@gmail.com> Date: Tue Aug 13 16:11:53 2024 +0800 (WIP)feat: EmbeddingModelRunner support encoder model commit 12a9869b5324fa9a4f7090eb8967c81f47f87f75 Merge: 59bf8c44d 97a6be95b Author: laishzh <laishengzhang@gmail.com> Date: Tue Aug 13 11:22:44 2024 +0800 Merge remote-tracking branch 'origin/main' # Conflicts: # .buildkite/test-pipeline.yaml # examples/offline_inference_encoder_decoder.py # tests/conftest.py # tests/core/test_scheduler_encoder_decoder.py # tests/kernels/test_encoder_decoder_attn.py # tests/models/test_bart.py # tests/worker/test_encoder_decoder_model_runner.py # vllm/core/scheduler.py # vllm/engine/llm_engine.py # vllm/inputs/__init__.py # vllm/inputs/data.py # vllm/model_executor/models/bart.py # vllm/sequence.py # vllm/utils.py # vllm/worker/enc_dec_model_runner.py # vllm/worker/worker.py commit 59bf8c44dd79c832a37949d0698bacef6ecc2136 Merge: a40828921 a936faa57 Author: laishzh <laishengzhang@gmail.com> Date: Thu Jul 25 23:02:34 2024 +0800 Merge remote-tracking branch 'bert_deps/afeldman-nm/infra_enc_dec_model_runner' commit a936faa57000aca5be159de260fae8c8849148b6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 
10:52:50 2024 -0400 removed prefix caching from enc/dec modelrunner commit 4bb7fc442f67dd162a001900e485d02d64fa24ed Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 10:45:03 2024 -0400 removed chunked prefill logic/docstring text from enc/dec modelrunner commit f0abcc27e642dda6371eb1440de519166642a9e7 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 10:37:45 2024 -0400 format commit d1751db42bac1baf50b5fa542c770fbab13ba9ff Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 10:35:45 2024 -0400 removed flashinfer references from enc/dec modelrunner commit 64685acfe52177d1e01362ece71d3faab73e8e45 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 10:13:44 2024 -0400 Sequence docstring commit 035d90dfc21bbc12d12d2368a2d5d5175ead31ca Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 10:01:31 2024 -0400 updated RequestOutput docstring commit 1bb7ad9f2f5e4c84e283c5c0c59006d817440609 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 09:59:34 2024 -0400 updated RequestOutput docstring commit 47c5548936cd7bfe476d31e8248e3208a8a663d1 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 09:53:23 2024 -0400 checked out examples/offline_inference.py from main commit 3327e5be3b07bc35a607a1f4fa1fba2fc4f5904e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 09:49:44 2024 -0400 removed lora & vision & mm code from enc/dec modelrunner commit 175ea95baf0537209a8aa0e9c94f711f794f0f51 Merge: c2cc010ac 316a41ac1 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 09:25:53 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit c2cc010acc1bb632bb7297da970ff865b22c7f27 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 01:33:04 2024 -0400 Removed lora from enc/dec model runner commit fb5a2bcb2baa984b884ba8bdd6293dd06cb8756b Merge: 393515eb0 9e169a4c6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: 
Thu Jul 25 00:52:21 2024 -0400 upstream merge commit 393515eb07a84c3d1604f0c0bc52eb2d8f7c5ae0 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 00:50:27 2024 -0400 formatting commit 47b4eb2a06bf0811f143668fbfe1f8c2caedc827 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 25 00:50:08 2024 -0400 fixed bug caused by upstream refactoring commit bed9bcd356c3526f5697ddfc2052d5bfca5fa9d2 Merge: 0af58ec10 740374d45 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 21:04:09 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 0af58ec10ac6eb9cab3f78abfa62390ade9ca64c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 05:10:20 2024 -0400 responses to feedback commit d82b27346b444778eeba42e015ac716883c37f76 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 05:01:27 2024 -0400 enc/dec example comments' commit 4b5b2cf956141e3adbc22a7a2aa2ebbb9bad8979 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 04:51:48 2024 -0400 removed unnecessary argument reordering commit ed4a56b9ca31cdf06033611887114920318ad397 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 04:46:49 2024 -0400 formatting commit 5a270ff49f3ebafecf8fb45e090f08d705aa416a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 04:46:32 2024 -0400 refactoring commit 02114bdcd5a832c3610318a8d0b8cfb26070f3ef Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 04:31:32 2024 -0400 _free_seq_group() -> _free_seq_group_cross_attn_blocks() commit be58d8ab92fd4ddab1f48b246a5233ee3a71bcf0 Merge: c493d4029 ccc4a7325 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 04:20:18 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit c493d402929d023a0924018a928502cb05605a2f Merge: f36ffb569 5e8ca973e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 00:34:07 2024 -0400 Merge branch 'main' into 
infra_enc_dec_model_runner_reviews commit f36ffb5695b0694947f4ae9e7417cc1afa85e19c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 00:33:47 2024 -0400 example includes prompt zipper commit 61d2ad2cc7791b6e32c8678b8e88ed99bbab4118 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 24 00:28:20 2024 -0400 fixed bugs in handling non-text formats for individual prompts commit dd784b5423ba21fc6b8188908df417d128376a1f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 21:37:19 2024 -0400 typing fix commit 0b29fd27f17f2751550262f218e6ef1afbef7087 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 21:35:25 2024 -0400 enc/dec handles empty str and None decoder prompts correctly commit aa01d71f90f0c3cda8a7ea419ff4f1fb6dc9d13c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 20:56:51 2024 -0400 empty-string decoder input is now handled for encoder/decoder commit 4a6e39e67c2bb4c2d685df9031cbf64956be4255 Merge: 7e7bbd9e1 87525fab9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 20:16:21 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 7e7bbd9e16900449e350bf8634d584e4b1a5c2f0 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 16:57:41 2024 -0400 deleted unnecessary dependency commit 229847b431469bd17b2d13f3651b322c7b280274 Merge: 059273f3c 1bedf210e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 16:56:27 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 059273f3ca43947413572a0014c1437a53e33b8a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 16:56:07 2024 -0400 wip commit b283544d820bfd96ac80845d2ddd7ad057cca6e9 Merge: 48a742d41 b01937f0c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 04:15:18 2024 -0400 Merge branch 'infra_enc_dec_model_runner_correctness' into infra_enc_dec_model_runner_reviews commit 48a742d4155cba0ffc7effb1c9fdad0706493c43 Merge: 
427032a08 bb2fc0807 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 04:15:03 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit b01937f0ce29bc9e417e85cb4dd18ddb47a98e3b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 04:14:06 2024 -0400 set up None/empty str tests which are not passing commit c51a1682be7443ec7d32062491868bd49c631eb8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 23 01:47:43 2024 -0400 fixed bug in how conftest was handling HF encoder/decoder outputs; disabled HF engram repeat checks commit 427032a085cd48701f7abf64518563929a844d6c Merge: 14831b09d fea59c771 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 17:14:13 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 14831b09da05f6d8e689568c77f7dfc5c33895ab Merge: c43a6ed19 b90b6b6ff Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 13:52:34 2024 -0400 Merge branch 'infra_enc_dec_model_runner_reviews' into infra_enc_dec_model_runner commit b90b6b6ffb4417ec64b382e9211273bca1eebbb7 Merge: b174c7ab2 739b61a34 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 13:51:35 2024 -0400 upstream merge commit a40828921c18faf70f4239d90e599da4311b284e Merge: 7ace684da c43a6ed19 Author: laishzh <laishengzhang@gmail.com> Date: Mon Jul 22 19:00:06 2024 +0800 Merge remote-tracking branch 'bert_deps/afeldman-nm/infra_enc_dec_model_runner' commit c43a6ed191e76f81bfd27f25e2ca8bac1fc01bcc Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 04:03:59 2024 -0400 commented out BART TP=4 commit b174c7ab2da60e24a2ca576eccee671541ae142a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 04:02:56 2024 -0400 bart is parallelized, modulo an unfortunate hack for QKVParallelLinear in cross-attention commit 3551b6bf56ab74228c923b698e59a88b06bac81c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 03:59:22 2024 -0400 fixed bug where 
underlying Attention was constructed using full head-count commit fdf71de8557d588ff3b5767e96df09de4e9278d5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 03:48:35 2024 -0400 parallelized enc/dec cross-attention, using a slight hack commit 9bbed43ab159063a8dff27587dae909b11e1a703 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 03:20:20 2024 -0400 parallelized LM head commit 74abe22287374c9dd801ef059692016ef09777cb Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 03:01:07 2024 -0400 encoder attention & decoder self-attention parallelized commit e5bb9de596bd7f4b5d85ab6d0a2440cae06f982a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 02:33:02 2024 -0400 all attention layer output linears are parallelized commit fb3227f68714ba6ed00e67e8a242db88288cdb8e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 02:25:12 2024 -0400 parallelized BART learned positional embedding commit 00198a633605b786c5f1fdef007c965d6284b39b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 02:22:01 2024 -0400 BART MLPs parallelized commit abbb42749a628f5d199b62046200a6eb85025ab8 Merge: a33b50171 a16cabb90 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 01:54:59 2024 -0400 Merge branch 'infra_enc_dec_model_runner' into infra_enc_dec_model_runner_parallel_bart commit a16cabb9029d86221a69975935622dd53084a554 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 01:54:22 2024 -0400 equalized some generation/sampling config settings between enc/dec HF,vLLM, nonetheless still not perfect match commit a33b50171b6147ad1ff3db16adef4bb3a7819b33 Merge: 584c01e87 32967c1ca Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 01:35:22 2024 -0400 Merge branch 'infra_enc_dec_model_runner' into infra_enc_dec_model_runner_parallel_bart commit 32967c1ca7d706f1e59cbd604b58588210aeeee3 Merge: c00e0a8b5 89c1c6a19 Author: Andrew Feldman <afeldman@neuralmagic.com> 
Date: Mon Jul 22 01:30:53 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit c00e0a8b561a8243080ef40b1c1b8f0b8257d026 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 22 00:28:29 2024 -0400 CommonMetadataBuilder sets block_tables constructor arg of metadata commit a22f56c8bbb1dde2bd3a440bb0c037ed73ca17e1 Merge: ffa99b2dd 42de2cefc Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jul 21 22:28:38 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit ffa99b2dd61cfe21222a98ed2f95d608d6f6a8a2 Merge: 41ccf0c8c 9364f74ee Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sat Jul 20 16:08:20 2024 -0400 additional merge commit 41ccf0c8ce9079a89ace594a3a0f2eb573c2d6c0 Merge: 9fdd04705 a5314e869 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sat Jul 20 16:06:16 2024 -0400 wip merge commit 7ace684da139b43f38a4ebc328e17056ebfbe18a Merge: fe7786c8a c092ed476 Author: laishzh <laishengzhang@gmail.com> Date: Fri Jul 19 00:27:56 2024 +0800 Merge remote-tracking branch 'bert_deps/afeldman-nm/infra_enc_dec_model_runner' commit 584c01e875e12d870312ab210dec809325482ae3 Merge: 69f0379d2 9fdd04705 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 16:59:40 2024 -0400 Merge branch 'infra_enc_dec_model_runner_reviews' into infra_enc_dec_model_runner_parallel_bart commit 9fdd0470597025057a473eb8e20946f71db54daf Merge: c092ed476 5f0b9933e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 16:59:18 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 69f0379d24323958dd9b332884f7c57a222acfc6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 13:23:42 2024 -0400 wip: commit d7bd617c84880f477a0ce7ae3d1de1215e26748f Merge: 31e335fd2 c092ed476 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 13:13:04 2024 -0400 Merge branch 'infra_enc_dec_model_runner' into infra_enc_dec_model_runner_parallel_bart commit 
c092ed47621f9061395ea3e89386c997f856c6b3 Merge: 949ac02c5 2fa4623d9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 13:09:14 2024 -0400 merged in upstream changes; left some formatting issues which I expect to be fixed upstream commit 31e335fd206985f5b3791b6a3cfaa021d21d3629 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 13:03:58 2024 -0400 wip activation parallelization commit 88c058e8fe5ae00b39f88f57be745d1b819dbca5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 12:23:31 2024 -0400 wip parallelizing BART commit 949ac02c5694069edf3338b2202717dffda276e6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 11:18:01 2024 -0400 formatting commit 6c940f886950ba0ae77ccb9002a161cf95b686ad Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 11:00:34 2024 -0400 modified HF behavior in BART test to be truly greedy commit f15eacf140810512335a7ac422b09788a1c1964e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 10:55:46 2024 -0400 wip commit 180884605ffd911c553c6b2585c2993204e4a629 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 09:34:42 2024 -0400 formatting commit 1f8c52fac27ed8f10b94a3ecb08e15c1118c186a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 09:34:29 2024 -0400 tweaks to enc/dec example commit 9da8fb3ef77b64c0152e3699513053e1ea4e21a4 Merge: 94c904fb5 a9a2e74d2 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 09:24:19 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 94c904fb5ff01f7e1c93b8d4a5f195ca2bea5bc0 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 08:43:16 2024 -0400 wip parallel bart but encountering GPU count issue commit 9f5a02c21e785704114f8c15bb829f4fe4cded55 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 08:27:53 2024 -0400 RequestOutput & SequenceGroup now include encoder prompt in output, as does encoder/decoder example. 
commit 597a07da54fa4c399e42bccbb4a14957d782e37c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:59:42 2024 -0400 refactor commit f54f2762f4b4d14165371e3dfc300f1ef3afa9b6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:53:12 2024 -0400 wip refactoring commit cac6283f60f1edc55950eaae54e74db0902ebfd8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:25:58 2024 -0400 added encoder/decoder example to examples test commit b277180575d7d9c85708e2622cc6c32afbc0a383 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:17:40 2024 -0400 formatting commit 50ad5ffc753d1e7b39dfd55822ac0e405533168d Merge: ef9462321 e09ce759a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:16:28 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit ef94623218a718a437526917a8c95e933d614ee9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 07:16:10 2024 -0400 added examples utils w/ context manager for backend override; applied to enc/dec example to force XFormers commit aee5f1615347dcfe2acea9abe16ac61df3404a99 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 06:14:51 2024 -0400 fixed sequence bug commit 3656dc6c843cbf41b99ab4b0c88a974d1cedba2e Merge: 0cc14abc5 5fa6e9876 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 05:23:05 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 0cc14abc5a5569c6ae641c5d3efc0251fd946507 Merge: 1c6e06d0b 10383887e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 02:10:34 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 1c6e06d0be66bf8cbf98cc8429a060b60bb65700 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 02:10:12 2024 -0400 bugfix commit 31127faf0c4637c6b80540c9693c7d5f135416d5 Merge: c2ff615de 1d094fd7c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 00:48:22 2024 -0400 Merge 
branch 'main' into infra_enc_dec_model_runner_reviews commit c2ff615deebea4457721a457103d8e405346b1a5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 00:44:16 2024 -0400 format commit f8dd4a5955ec478720531c47945ddc26e450f743 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 00:43:52 2024 -0400 fixed scheduler bug commit ef80c85f7dd3febc9c76c793427c444f9e62caa6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 00:35:57 2024 -0400 wip commit 03aea187652fc0418d9a66f7eb5af6bc53c9e535 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 17 00:34:45 2024 -0400 wip commit 16c9aa2278e7f9d9b5f5ccffb085b0142a7e20ec Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 22:36:44 2024 -0400 bugfix commit 159c7bcf47aa86e4abbd88ad72a34e196c56626e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 21:58:15 2024 -0400 fixed decoder-only bug commit aea8d34385a64d6e6efa87729fee8fa4c4f15818 Merge: 713d095b4 7f62077af Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 21:09:06 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 713d095b4036404f4580225720da17d7d4e431cb Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 14:49:17 2024 -0400 incorporated encoder sequence into request-add functionality commit 87ed3b6fe380f75ebdafd3bc4da003b42802c18c Merge: 97d81f0a5 94162beb9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 14:17:29 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 97d81f0a53506cf6292f24117e8ecbfca5803805 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 14:17:09 2024 -0400 encoder/decoder input processing; formatting commit e534ffc156479d1b4dbec905ccc0877b746cc068 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 13:25:27 2024 -0400 wip commit 3c7e19d3d0e4c53ca363f40712fe2df160be1d9e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 
16 10:44:23 2024 -0400 zip enc/dec prompts; formatting commit 850a97e812662645452989341eb44b79aa4b3276 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 10:25:38 2024 -0400 bart parallel vocab commit 42ac66b469891ba3085eaa1265c2bd9d445e0839 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:59:04 2024 -0400 VllmRunner encoder/decoder methods commit 796d7a3e7f8a67b644f6a88446e4162a09a1fbac Merge: 374880f71 7508a3dc3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:55:37 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 374880f71d6f81bd2a933b237ff6fa46e0324e6b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:49:30 2024 -0400 input preparation now includes encoder-oriented input setup: commit c5846ac9b31777d131bb0e3af2ad62a74eab1978 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:40:46 2024 -0400 Hfrunner greedy logprobs limit commit 92d9f486b2455ff5ea5215eb61b9cb1e375b17ff Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:33:41 2024 -0400 conftest: encoder/decoder example prompts commit 54ff1420cac3edccff6c751e4930f7fa1b3be247 Merge: ddaf0ade2 7a3d2a5b9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:28:46 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit ddaf0ade21142daafc504df83e15d31911dee497 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 16 09:28:21 2024 -0400 wip commit 914134749aee12e273f38273ed4cfda866ec837f Merge: 251f899ea ec9933f4a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 16:33:24 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 251f899ea158af33ffe1367c57137ac9ed9212ad Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 16:33:10 2024 -0400 wip commit f85997b4bb63352fc1bad72b54eea358f89ec5b0 Merge: 46397c74e 64fdc08c7 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon 
Jul 15 13:30:57 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 46397c74e7c094d86d4f49fc3230cb92985d5fc5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 13:30:21 2024 -0400 wip commit 336a77d62d2d31a2ed6c9eba9e36190b50cca713 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 09:34:47 2024 -0400 formatting commit 8dccaa510a67e8de71811c13371468024843b71d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 09:34:14 2024 -0400 correctly constructing enc/dec sequences commit dd4031c8e3201ee2e874e40df69c1bd52e7c54be Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 09:11:34 2024 -0400 wip but having wllm.commit_id error commit 552551137b19a9e9c2ebc13856c8e5a22834ae1b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 08:51:18 2024 -0400 Sequence may be constructed with encoder/decoder LLMInput configurations commit 7b0803b1bb9fbf222be2b719729b3494ade79087 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:41:25 2024 -0400 formatting? 
commit 304caed04dcbc25b76d8e80321da00414ac7dc17 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:36:33 2024 -0400 formatting commit 6c953808f11122a0c5482786b41825a79788a9a4 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:25:01 2024 -0400 wip engine is_encoder_decoder() setting commit 78d3d3c00d30af324dbd1ca0973c1dd68d4cdb5b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:20:50 2024 -0400 modified LLM.generate() error message commit 10ed7145053546d2112ed98252dc46f782a04b72 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:18:13 2024 -0400 Format commit 83c5c43dd6e06d13d9d05c01882b6d705a5aefaa Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:14:34 2024 -0400 prompt type checks commit 94c083cabff971da175eca616ff4b2c94299573b Merge: 64d71980c 0cca1646d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:00:30 2024 -0400 Merge branch 'infra_enc_dec_model_runner_reviews' into infra_enc_dec_model_runner commit 0cca1646dce64fbdf2419b7f075e15da6264ee84 Merge: db5539a85 6ae1597dd Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 07:00:07 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 64d71980c823c167239d5c7338096a144586b7f3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 06:59:49 2024 -0400 wip commit ff940f7adf771465e92a6fad350fb2f1aca4f694 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 06:18:58 2024 -0400 formatting commit 8b8d9812f7b7317448d4872db32cffcb45444c02 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 06:17:41 2024 -0400 refactored AttentionType and related imports; skip BART test definitions entirely if on vllm CPU version (to avoid xformers import commit 590a240fe53dd78e62c78f7ac0263b0c3fda6949 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 06:05:18 2024 -0400 Formatting commit 
760355bfeea93c7b85cf440f597485e11a7357b1 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 06:04:43 2024 -0400 bart test skipped on CPU version of vllm commit db5539a85f83ceaa929e2c02129a1a174fa29424 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 05:00:25 2024 -0400 format commit 3d5bb888cfc10c835ff17c18ca367c930a335785 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 04:48:48 2024 -0400 EncoderDecoderModelInput correctly handles encoder token/position fields commit 447a5c7e10b09c1e5cff95e907198d6d050f1ffa Merge: 9ce2da454 22e79ee8f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 15 04:29:30 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 9ce2da45412de77bb358c2ce97521fa6a8b7990d Merge: c5ceb2348 eeceadaec Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sat Jul 13 19:26:27 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit c5ceb23486c3f3ddd15faf8fcf06fcc1ba722fe1 Merge: 196f30cd7 41708e503 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sat Jul 13 02:18:32 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 196f30cd7f25a682dc3d2320d994f949b00084a2 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jul 12 11:15:56 2024 -0400 enc/dec decoder test working, sans sampling check commit 9c898f5b28113ea53758c447175fd9cfd67b2066 Merge: 685604cfc f7160d946 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jul 12 09:41:15 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 685604cfcb90b6e74e37dbf5b5ee478e157f8191 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jul 12 09:40:42 2024 -0400 wip modelrunner commit f6499442e7c434c3ce4a187b344481988f106471 Merge: 9a63f51bd b422d4961 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 10 12:51:51 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner commit 
9a63f51bde8059fc361cc7abb2249ce1efb54163 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 10 12:50:40 2024 -0400 wip model runner commit fe7786c8a510d2280f3e25a8461474bb17ab8e11 Merge: 26b6271ca a5c28fca8 Author: laishzh <laishengzhang@gmail.com> Date: Thu Jul 11 00:27:08 2024 +0800 Merge remote-tracking branch 'bert_deps/afeldman-nm/infra_enc_dec_model_runner' commit 6a71f8f4359dab04b9811b84d338db40dafa72bc Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 17:23:01 2024 -0400 formatting commit b4a461d983ed0215777c89f6b2ecbaa754422d4e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 17:18:56 2024 -0400 formatting commit d1343aac0fe6c0063f950e3600f9264aacb0836d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 17:07:43 2024 -0400 scheduler test passes commit c95adf50adcdc315f63b276f52ac9a6a2d35b5fa Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 16:49:34 2024 -0400 scheduler supports encoder-/cross-attention & passes existing scheduler tests, but needs new encoder/decoder-specific tests commit 4c01f1300161bb4a16fdc27612cdace516aedebc Merge: 2c80185fb 4d6ada947 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 16:38:22 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 2c80185fb81602a9a39afe4137bc5f59bcb69f57 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 16:36:11 2024 -0400 formatting commit bd14d29177dda7bd1f2ddd41ccba71703dbaa07d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jul 9 16:17:24 2024 -0400 wip scheduler commit c90140fba9d3ec2ee8a065a267aef571e93c64db Merge: 88e284a53 4f0e0ea13 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 17:55:07 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner2 commit 88e284a5344699e099e5510e5a353b9c5a54d0c7 Merge: db49d48f2 543aa4857 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 13:26:10 2024 -0400 merge from main commit 
db49d48f2a0913251385e324b28af06bd81cc121 Merge: 22d013c1d 6cd595c3c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 11:15:43 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner2 commit 6cd595c3c879d4ee603bb6a5bc0f1724647a5135 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:47:20 2024 -0400 formatting commit 5df73fc708bf3370a5f6d7f85cce4772d5c679b5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:47:04 2024 -0400 xformers backend cleanup commit d8a692b7dde0656696b726497030970aac0b53d3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:39:37 2024 -0400 cleaning up a number of backends & backends utils.py commit 097aff2029e4560ae28bd7a7acf0f20509f803fe Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:36:05 2024 -0400 vllm/attention/backends/flash_attn.py cleanup commit 45fc9f71641bdd17c67997598463f12ead3998b2 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:35:00 2024 -0400 vllm/attention/backends/blocksparse_attn.py cleanup commit 5ee30fed1d27dbef98dc3e4512741c9ca301197c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:31:09 2024 -0400 vllm/attention/backends/abstract.py cleanup commit 4f27946dcfb73f0a60420eb3ca6c9a74f6c6d3d1 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:27:35 2024 -0400 tests/kernels/utils.py cleanup commit a1bf65212cab0933b2520d8557a9d9132fff8c3d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 10:17:04 2024 -0400 test_encoder_decoder_attn.py cleanup commit 9ae6728ecfe48769f578b0fad3f8e3950daa683d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 09:46:58 2024 -0400 fixed specific point-changes requested by woosuk commit 7ce9a51d4fb3e286fdaa3a3ba12e60d0908d2d64 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 09:38:03 2024 -0400 merged in first pieces of woosuk feedback & latest main; formatting commit 
e837a73be0b61434116d1f332a84266d05cd61fc Merge: 07df0e158 7e0bc5725 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 09:36:30 2024 -0400 Merge branch 'infra_enc_dec_cross_attn_reviews' into infra_enc_dec_cross_attn commit 7e0bc572541e6018a7cfcebd16ea08b26826b975 Merge: 13f5b5078 717f4bcea Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jul 8 09:35:30 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 07df0e158a60b7d2a90407eecc868eaa10a58180 Author: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com> Date: Mon Jul 8 09:33:03 2024 -0400 Update vllm/attention/layer.py Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> commit 5dbebbc6f3aafe706a5555119fefa519b71c4634 Author: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com> Date: Mon Jul 8 09:32:43 2024 -0400 Update vllm/attention/backends/torch_sdpa.py nit: This will reduce the number of line changes and make the code look better. Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> commit 13f5b5078cdd81f58ed88a653ecc8ddc0968c073 Merge: d81662c57 abad5746a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jul 5 15:07:21 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 22d013c1de08aa8bc5747c513b12e0c3dd59d144 Merge: ba09fbcd6 d81662c57 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jul 4 00:24:29 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner2 commit d81662c572948ca9e01db21ec5f14f71c9fd1764 Merge: 2f0eb9b59 3dd507083 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 3 22:59:32 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 2f0eb9b591f298879df48be6d0a74196cf32a5cf Merge: 65e47db5a 966fe7214 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 3 18:58:24 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit ba09fbcd6b7efff359b1a0cef47c385d130b777d Author: Andrew Feldman 
<afeldman@neuralmagic.com> Date: Wed Jul 3 11:32:18 2024 -0400 refactored where a number of constants are stored, primarily constants related to encoder/decoder commit b085795eefcf31303c5e38bd734544664b5757c5 Merge: 44c62708f 65e47db5a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 3 11:22:23 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner2 commit 44c62708f3645f8a82b17a63849c1822a2dca645 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 3 10:15:57 2024 -0400 manually merged BART code in from previous modelrunner attempt, it won't work tho until new modelrunner is finished commit 65e47db5a59087af005e97df20f9d1a5be466a3c Merge: 2828aa793 7cd2ebb02 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jul 3 07:52:12 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 2828aa7936adab0d2ee3b49ffb0cfd01848581ab Merge: 5ff9c7686 af9ad46fc Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 30 20:16:34 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 5ff9c7686339f8d5f8e42060c1772f43468f2459 Merge: 8d36458fb 7836fdcc1 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 30 18:21:25 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 8d36458fb640e61fd70844739d107f41c0f3e631 Merge: 64981b535 75aa1442d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sat Jun 29 14:15:30 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 64981b535c557ada816b338f83cccf8c11ad0f83 Merge: 83d474e93 2cd402e16 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 28 15:37:00 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 83d474e93559ebbaf51194ef818f2308fd16ef1a Merge: a5018499e 57f09a419 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 28 10:18:17 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 
a5018499e3b8475749a8d1af80e14c8d172cf2c7 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 27 18:57:56 2024 -0400 reverted unnecessarily vllm/utils.py changes commit c8f8d59d4ce7e1a3c104bd417f256e9b8f954815 Merge: bcccc3486 c3dde367f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 27 17:34:16 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit bcccc34863f5864307ef9c781471cef4e5d38ba8 Merge: 75756b91e 3fd02bda5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 27 13:59:00 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 75756b91e3753a9c2a60dbae42b2e46d3612ece5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 27 11:28:19 2024 -0400 removed redundant elif commit c24697fe82c844e13c820db916efef0a6b789374 Merge: 7ca0d7a39 e9d32d077 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 27 11:23:21 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 7ca0d7a399da475099cf501b1f4981a7dffc067a Merge: 4dabe1974 294104c3f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 26 19:37:30 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit a5c28fca8f5e21653c6e5874719467e08d3d8503 Merge: ba4e2c12e 4dabe1974 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 15:52:22 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit 4dabe1974766c6db8fd6ce8b6688c25bbd85b633 Merge: e2a46e3b7 dd248f767 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 15:48:31 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit ba4e2c12e6f1a03e3381cabda8902d55df9a292e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 04:05:23 2024 -0400 Removed unnecessary position arguments from BART routine; formatting commit 41e31e861b01896a99fba2f2ea44b717164c4398 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 
03:59:48 2024 -0400 BART with new explanatory comments & passing formatting tests commit e61385d90e40b423e1e5d98839413a76ffcd11fb Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 03:49:18 2024 -0400 fixed bug caused by overzealous refactoring commit 4400d7733f7dca2acffac916a00f5edc6a89e14e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 03:36:28 2024 -0400 some reformatting commit 5169a2a6518d5ae338001eae0eae6dad64bf52eb Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 03:25:40 2024 -0400 removed unnecessary positions arguments from BART encoder, decoder forward() commit d43141f20514e77963e1c13ba857b1d3cb71c210 Merge: 753bab068 e2a46e3b7 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 03:16:19 2024 -0400 merge; a lot of formatting fixes to bart code but not fully passing commit e2a46e3b7b9f9d1a9cc751046c3cddd1522620ed Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:53:35 2024 -0400 formatting commit 1a6e5a31846e2ef886b66e9cc9216ffe983d0ec0 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:52:04 2024 -0400 moved make_tensor_with_pad() helper function back to vllm.utils commit d23c28466765496049a1696d0a053a0a2505ce9a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:38:08 2024 -0400 typing and formatting; fixed escape sequences in comments commit 2f0b05bb805513e73eb0609ea87b6367ec9d4803 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:35:34 2024 -0400 typing and formatting commit 47c9f396fdcd40895597423ebfefe585b014c2f3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:32:52 2024 -0400 removed attention_type commit 06c7f7500140c574d20a12079dbd1ef83db29688 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:28:42 2024 -0400 reorganized helper functions that were only being used for testing into tests/kernels/utils.py from vllm/utils.py commit 
a178b7a8c9838665ee7e169471206b70d62e1b71 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:20:00 2024 -0400 changed nested if/else to elif/else in xformers mask computation code commit 597526a49e041ec99329add79ef272ce6e457b9e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:18:02 2024 -0400 removed extra line commit 125e5dc46724155f5d81e93a7644a3889e864a2f Merge: 5ce2dd083 e9de9dd55 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:16:21 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 753bab06880a05726b2b8274a20d8f9d179c9576 Merge: 919bf88f8 e9de9dd55 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:14:20 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 919bf88f8925b2e60c765f309df655318c392c2e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 25 02:13:52 2024 -0400 BART e2e test runs but does not pass commit b7ff75fc3d3cb5d447503daa8a4a78aa6bf1a18d Merge: 2d8429e1b ba991d5c8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 19:25:24 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 2d8429e1b0002eccb7deaa805d25ebb6d5616187 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 18:47:19 2024 -0400 fixed a number of bugs related to BART decode-phase; added support for the particular architecture alias used by bart-large-cnn commit 8f9ee625557ec34ec29787b6b66ec760ff390e77 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 18:06:10 2024 -0400 wip bart-cnn summarization example commit d58e8c8464d5bcf41121a582b035f5f290658657 Merge: 6fd4c020a 1744cc99b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 15:50:28 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 6fd4c020a9c5ee8ecbf6e086d8b9dfefb3f8396f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 15:42:09 2024 -0400 fixed prompt 
processing bug that was preventing inference from starting commit 7d2fcf90a6516be432ffd39f4571ed0a524438b2 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 15:39:07 2024 -0400 BART passes profile run commit 3b95225850af9b81a15142344c4c8bae7257a519 Merge: 8b8c40943 b8d5637c5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 13:19:42 2024 -0400 Merge branch 'infra_enc_dec_model_runner_bart' into infra_enc_dec_model_runner_reviews commit 8b8c40943e2e0a4b104ca65c76441d3db03a017d Merge: 42c364439 5ce2dd083 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 13:04:54 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit 5ce2dd08345da9e5a19a913214e5a73ed4923c8d Merge: ce88fa36e c24621295 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 12:55:03 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit b8d5637c510b42a6503d9b0c4d810fe3568314dd Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 24 12:50:25 2024 -0400 wip bart commit 59caabecf2666c33306625843908b1d9dc2ffa8b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 21:42:39 2024 -0400 BART almost passing profile_run() commit f2dac1ce0ae1033b5143b8f1cd234e1eee5e67ee Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 20:13:05 2024 -0400 wip commit 082be510533d1e39008db19ca8754a91aa4879d3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 19:36:46 2024 -0400 loading tied weights commit 42c36443981dd89c9defaf2f51c1481ddb0a5751 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 16:24:26 2024 -0400 encoder decoder model runner fails for unsupported scenarios commit 9ad5143ab290419d27fcde1287d9bea853a58be3 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 16:00:15 2024 -0400 refactored backend constants commit 001cb185141278b6ea3a2fbbf6200032104229e0 Merge: 6219d9590 ce88fa36e Author: Andrew Feldman 
<afeldman@neuralmagic.com> Date: Fri Jun 21 15:40:19 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit ce88fa36e6cdbe0352348207a6a4dc405fcd9d76 Merge: ca68c63db f1e72cc19 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 15:39:06 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 6219d9590dfae14c574d598ce879af58fe97177f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 15:36:36 2024 -0400 Formatting commit 576c26c86a9b210fcca29180ed20fd15770f2660 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 15:35:11 2024 -0400 first pass a BART load_weights; probably not handling qkv correctly commit c11db0fd30e326d2273da95439c5087e83725b04 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 15:21:15 2024 -0400 integrating BART weight loading code commit 2123517ef5fc8a5593e693b7d28d8c217c729282 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 15:13:36 2024 -0400 formatting commit 97cad4b875ee09ebeff455a20fdf351eef9d2f16 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 14:40:40 2024 -0400 wip BART model cleanup commit 45a53877dc815398f1f190fa7e7d513db7928b6f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 14:28:59 2024 -0400 pruning out training functionality & unnecessary code from BART commit 30becae9d35d4b994bcd995c81603a97b93d0e3d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 13:45:48 2024 -0400 profiling fix; wip bart commit d2ad2328e41ad7a8898ddbb37db8c1bfaf2ae803 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 13:37:27 2024 -0400 wip bart integration commit ed610b0b9a6abcdaf874d16225a441509a207076 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 12:09:51 2024 -0400 pulled in bart model code commit 28f0d2fff6752a90227aa8aa07ca32e43bee395d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 12:06:56 2024 -0400 
pulled in bart code commit 213dc597274da4c963510b1d72166d0a8eddbc7b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 12:03:50 2024 -0400 test_bart.py commit 49c7162d70441963ec6c26430a8e36426fbfe1aa Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 12:01:59 2024 -0400 formatting commit 84c0dcc5fe2b653cb0517df523504a107055061a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 11:58:45 2024 -0400 scheduler tests commit c15731710bd5c317638fef4d861567031d6126b8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 11:30:25 2024 -0400 free sequence groups commit 614de4e13869f1b2938d1f30369bbb98752a20c6 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 10:54:25 2024 -0400 formatting commit b6d4383e141e1fc23ee0c8c6bb9a7d172949266a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 10:46:15 2024 -0400 enc/dec integrated in Scheduler.schedule() commit 89b0e445bb32bbd5758bdcc05cd1bb869101029e Merge: beec4f571 ca68c63db Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 10:27:42 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit ca68c63db6ef8b9fcd132e84ffc6db1b7c7f618f Merge: e9d7ede3b bd620b01f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 10:26:54 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit beec4f5717d5c8193d70449c066f2aa469bf50b0 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 10:24:50 2024 -0400 enc/dec support in LLMEngine._add_processed_request() commit a1ab7a110c334f54dc451f1b273c3b0f0345332e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 09:50:37 2024 -0400 removing BART test commit 7000573396666a58cf5ca06d626f2b4c2e4f8bb2 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 09:49:37 2024 -0400 temporarily removing BART work commit 1bd916c2f91f7b8d755a9142ee3daeb7d5e489cb Merge: 2b2d2e9df bd620b01f Author: 
Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 09:38:05 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 2b2d2e9df2b1535883e36b8353a26d52200f7783 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 08:55:19 2024 -0400 wip encoder/decoder API integration; WIP BART integration; WIP BART example commit e9ecd25cb733b220785611056295ea9787b1ce47 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 05:48:50 2024 -0400 added unoptimized BART example commit 2fccd1832a0933dca8537e436449dad4d52fa0c3 Merge: de967174d 0f645112d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 02:28:07 2024 -0400 Merge branch 'infra_enc_dec_model_runner_reviews' into infra_enc_dec_model_runner_bart commit 0f645112de4e1784cd43be505e659f3d3bd56581 Merge: 58139e380 e9d7ede3b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 02:27:25 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit e9d7ede3bfef92527a643809f4beb20cb780e7c0 Merge: 67ed41961 d9a252bc8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 02:26:01 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit de967174dcbbdb5e81d975edf158416bcbeb74cd Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 02:25:36 2024 -0400 wip bart test commit 58139e3808060c550264c800e605129d0082af5c Merge: f8569facd d9a252bc8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 01:55:08 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit f8569facd10b0cbf05689bfc364831a37bb48b45 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 00:35:24 2024 -0400 formatting commit eb5819be6025f0e598831e7e13c0656e184e9524 Merge: a0068fc91 1f5674218 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 00:23:07 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit 
a0068fc9112c5acefe69f5a8e30470c73a90a039 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 21 00:21:05 2024 -0400 Encoder/decoder model runner passes prefill/decode/empty-SG tests commit f0094bd8a90cc26325f1ea7ca1506fc459a312c9 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Thu Jun 20 10:59:52 2024 -0400 wip enc/dec modelrunner prepare_prompt test commit 736cf45223517f5720aedc53b65258ee8a75a25c Merge: 1581eb7f9 f9f9ae39e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 19 22:56:31 2024 -0400 Merge branch 'infra_enc_dec_model_runner_reviews' into infra_enc_dec_model_runner_bart commit f9f9ae39eea1dd6367cec3b2e878e1d2f3bef4ad Merge: a8a52d293 67ed41961 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 19 22:31:41 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit 67ed419619301a39c04417b29c90822a837e6362 Merge: ea37e17ab 3730a1c83 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 19 22:29:04 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 1581eb7f978a83690e0aaa2b390be491b42ffb15 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 19 22:28:28 2024 -0400 wip commit fbec309f0cc8d94df6ba7ab3f71f172d30f73531 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Wed Jun 19 01:14:35 2024 -0400 moved enc/dec error strings to top-level vllm utils commit a8a52d2935d5a2ab969c05d498ec2423ae19507b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 23:39:15 2024 -0400 some formatting fixes commit 37aeed66141b10b0d43c8e6d56613806dc7108ff Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 23:35:11 2024 -0400 enc dec model runner testable if only for encoder decoder model commit e3ba61e368f0085fe64e8dae3d80494f5254164c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 22:44:23 2024 -0400 wip commit 3311aac9bddd474d0a7037b53c53dfc515df0bcc Merge: f9314fd7d 59a1eb59c Author: Andrew 
Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 22:43:23 2024 -0400 Merge branch 'main' into infra_enc_dec_model_runner_reviews commit f9314fd7d1ae0d3146d7456eb41e6885f0055a5d Merge: 89fdb8116 ea37e17ab Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 22:43:07 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit ea37e17ab5ad7c084c13bf8e8492039d6a9bcdbf Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 19:16:38 2024 -0400 merge conflict; typing; formatting commit 91cbaa63d35e72ed0c14b65ed7f79bffdda2da97 Merge: 525303c7c 2bd231a7b Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 19:15:10 2024 -0400 merge; resolve conflicts commit 525303c7c61127900680ff06b6cc09610001b71e Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 18:06:33 2024 -0400 num encoder tokens commit 5f8c7f6cd6776cbda8289a5cee28e5cd8b858f4d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 11:26:24 2024 -0400 Moved attention type for attn_metadata to attention forward(); added NotImplement failures to backends in non-decoder-only scenarios commit c3f7da7620921e14e6c7efabeb0c54fd3d08b30b Merge: 7b9cb7f43 13db4369d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 11:01:28 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 7b9cb7f4339364b66180bf5cf7015f8fea67479d Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 11:01:05 2024 -0400 Replace attn_metadata.attention_type and attn_metadata._attn_type with attn_type argument to forward() commit d0fd9e10ff13157183fc24dfcb558f83c716ead6 Merge: addde7d22 4ad7b53e5 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 09:58:57 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 26b6271caa9b776b0093b874ab94dc8df0bb36b9 Merge: 3ea38598e db5ec52ad Author: laishzh <laishengzhang@gmail.com> Date: Tue Jun 18 17:49:40 2024 +0800 Merge branch 
'vllm-project:main' into main commit addde7d22cda9ab0d006538ec0f900ac593c9292 Merge: 47586807a 114d7270f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 00:53:01 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 89fdb811629bfe86ce5aaf85e078ce953e03e700 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Tue Jun 18 00:52:29 2024 -0400 first pass at _prepare_encoder_model_input() commit c7bf81228dc06a1ed2c9d7e7e6f0d61e476e7e7b Merge: 830a05126 47586807a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 17 10:37:42 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit 47586807a3e8e75c6e9c27d1d17aeb22b0dff63d Merge: 90aec385a e2b85cf86 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 17 10:35:45 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit 830a051267732f60b04b99a15552ea984b9f43f8 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Mon Jun 17 01:16:25 2024 -0400 format commit e5c029926043518e63b85739d369b6cbbb9eddda Merge: 9cb8ee685 90aec385a Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 16 22:59:32 2024 -0400 Merge branch 'infra_enc_dec_cross_attn' into infra_enc_dec_model_runner_reviews commit 90aec385a0e77574f5b575257e29b194f6974521 Merge: e229e0018 845a3f26f Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 16 22:50:21 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit e229e0018138698bf13135f067eaf32a8cbf9167 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 16 22:47:04 2024 -0400 format commit 4dccd51c91fd3c1ae3a9ecea4baa46cad2a5f7dd Merge: b3c3411e2 f07d51332 Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Sun Jun 16 20:26:41 2024 -0400 Merge branch 'main' into infra_enc_dec_cross_attn_reviews commit b3c3411e26b7cf6f27604825d99a920c34605c9c Author: Andrew Feldman <afeldman@neuralmagic.com> Date: Fri Jun 14 16:39:35 2024 -0400 
formatting commit f06c6873d77962c7b27fc…
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: maxdebayser The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@maxdebayser: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
This PR handles the compile-mode unwrap bug for the indices length fix in LoRA.
PR needs rebase.
* update custom PA kernel with support for fp8 kv cache dtype; change custom PA partition size to 512 to prefer throughput scenarios at cost of latency
* Fix lint
* Fix BF16 with FP8 KV cache (scaled conversion incorrectly done in fp16)
* Fix custom PA tests
* Merge branch 'main' of git@github.com:ROCm/vllm.git into mawong/fix_custom_pa_tests
* Fix partition sizes for PAv2, PAcustom
* Fix linting
* Fix a few names and variable scopes
* Rename custom to rocm as per suggestion

---------

Co-authored-by: Shomy Sanyal <shomy.sanyal@amd.com>