[OSPP23] Add llama2 lowering end2end #216
Conversation
Force-pushed from 2663d80 to ffa3527 (compare).
Test script and readme.md are in examples/MLIRLlama/.
@weilinquan Nice!
We also need to add the following items:
examples/DLModel/makefile (Outdated)
```make
	--reconcile-unrealized-casts | \
	${MLIR_TRANSLATE} \
	-mlir-to-llvmir | \
	${LLC} -mtriple=x86_64 -filetype=obj --relocation-model=pic ${OPT_FLAG} -o resnet18.o
```
This PR is for LLaMA-related implementations only; don't modify the ResNet part here.
examples/MLIRLlama/requirements.txt (Outdated)
We have added the requirements.txt at the root of our project.
https://github.com/buddy-compiler/buddy-mlir/blob/main/requirements.txt
frontend/Python/ops/tosa.py (Outdated)
Should this `tosa.py` be in Yuliang's PR?
examples/BuddyLlama/llama-main.cpp (Outdated)
```diff
   // Print the tokenized result
   cout << "Get User input:" << pureStrContainer.revert(pureStrContainer)
        << endl;
-  cout << "[Buddy] Tokenize input time: " << buddyTokenizeTime * 1000 << "ms"
+  cout << "[Buddy] Tokenize input time: " << buddyTokenizeTime.count() * 1000 << "ms"
```
What is the unit of time measurement here? The value multiplied by 1000 here is labeled “ms”, while the value divided by 1000 below is labeled “s”.
Yes! I have changed buddyTokenizeTime's time unit to milliseconds and updated this cout.
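For illustration, here is a minimal Python sketch of the unit-consistency point raised above: measure once, convert to milliseconds immediately, and never scale again at the print site. The `tokenize` helper is a hypothetical stand-in, not code from this PR.

```python
import time

def tokenize(text):
    # Hypothetical stand-in for the real tokenizer, only here so the
    # sketch runs on its own.
    return text.split()

start = time.perf_counter()                # seconds, as a float
tokens = tokenize("Hello Buddy")
tokenize_time_ms = (time.perf_counter() - start) * 1000  # convert once
print(f"[Buddy] Tokenize input time: {tokenize_time_ms:.3f}ms")
```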
Please update the README by adding the necessary steps to run the E2E example.
frontend/Python/ops/linalg.py (Outdated)
```diff
@@ -0,0 +1,2532 @@
+# ===- linalg.py -----------------------------------------------------------------
```
Wrong format. Please follow the 80-column limit.
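For reference, a header line that fits the limit could look like this (illustrative only, not the exact line that landed):

```python
# ===- linalg.py --------------------------------------------------------
```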
frontend/Python/frontend.py (Outdated)
```diff
@@ -241,7 +275,7 @@ def generated_func(*args):
     for output_arg in output_node_args:
         op = self._symbol_table.get((str(output_arg), 0))
         returns.append(op)

+    returns = returns[0]
```
Why return only the first `returns` value? This will trigger a `check-buddy` failure (the `test_var_mean` case returns two values).
In llama2, the graph returns many tensors, including intermediate tensors kept for gradient computation, but we only need the first tensor to generate the next token.
Maybe we should find a better way to solve this.
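One possible shape for such a fix, sketched here with hypothetical names rather than this PR's actual code: unwrap only when the graph has a single output, so multi-output cases like `test_var_mean` still return every value.

```python
def pack_returns(returns):
    # Hypothetical helper: unwrap single-output graphs, keep
    # multi-output graphs (e.g. test_var_mean) intact as a tuple.
    return returns[0] if len(returns) == 1 else tuple(returns)

print(pack_returns(["next_token_logits"]))   # llama2-style single output
print(pack_returns(["var", "mean"]))         # two-output test case
```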
examples/BuddyLlama/test-llama2.py (Outdated)
```python
from buddy.compiler.frontend import DynamoCompiler
from buddy.compiler.ops import tosa

tokenizer = LlamaTokenizer.from_pretrained('/llama-2-7B-hf')
```
The `/llama-2-7B-hf` path seems to cause an error:

```
huggingface_hub.utils._validators.HFValidationError:
Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden,
'-' and '.' cannot start or end the name, max length is 96: '/llama-2-7B-hf'.
```

Did I miss something important, or should we remove the `/`?
Ah, I recalled the details about the configurations here. We should modify the path to the huggingface version of the llama model, right?
Yes
I will change it from '/llama-2-7B-hf' to 'path to huggingface llama2 model'.
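A sketch of what the configurable version might look like (the environment variable name and fallback string are assumptions, not this PR's code):

```python
import os
from transformers import LlamaTokenizer

# Hypothetical: take the local checkpoint directory from an environment
# variable instead of hard-coding '/llama-2-7B-hf'. A nonexistent local
# path falls through to Hugging Face repo-id validation, which rejects
# the leading '/'.
model_path = os.environ.get("LLAMA_MODEL_PATH", "path/to/llama-2-7b-hf")
tokenizer = LlamaTokenizer.from_pretrained(model_path)
```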
LGTM! Congrats 🎉🎉🎉
Thank you for your contribution. Hope you enjoyed the OSPP project!