
[OSPP23] Add llama2 lowering end2end #216

Merged
merged 9 commits on Oct 28, 2023

Conversation

guessmewho123888
Contributor

No description provided.

@zhanghb97 zhanghb97 added this to the llama-inference milestone Oct 9, 2023
@notion-workspace

Get the importer ready.


@weilinquan
Contributor

The test script and README.md are in examples/MLIRLlama/.

Member

@zhanghb97 zhanghb97 left a comment


@weilinquan Nice!

We also need to add the following items:

  • Docstrings for conversion functions (see the example for addmm)
  • Tests for these conversion functions (see here as an example)
  • The current example process is somewhat complex, and we expect to use CMake to integrate the various processes, which we will discuss in detail at our group meeting.

--reconcile-unrealized-casts | \
${MLIR_TRANSLATE} \
-mlir-to-llvmir | \
${LLC} -mtriple=x86_64 -filetype=obj --relocation-model=pic ${OPT_FLAG} -o resnet18.o
Member


This PR is for LLaMA-related implementations only; don't modify the ResNet part here.

Member


We have added the requirements.txt at the root of our project.
https://github.com/buddy-compiler/buddy-mlir/blob/main/requirements.txt

Member


Should this tosa.py be in Yuliang's PR?

// Print the tokenized result
cout << "Get User input:" << pureStrContainer.revert(pureStrContainer)
<< endl;
- cout << "[Buddy] Tokenize input time: " << buddyTokenizeTime * 1000 << "ms"
+ cout << "[Buddy] Tokenize input time: " << buddyTokenizeTime.count() * 1000 << "ms"
Contributor


What is the unit of time measurement here?
The value multiplied by 1000 here is labeled “ms”, while the value divided by 1000 below is labeled “s”.

Contributor


Yes! I have changed buddyTokenizeTime's time unit to milliseconds and updated this cout.
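This class of bug can be avoided by converting to one unit at the measurement site and carrying the unit in the variable name. A minimal sketch (in Python; not the project's actual C++ timing code, where the equivalent fix is to call `.count()` on the `std::chrono` duration and settle on milliseconds once):

```python
import time

def tokenize_with_timing(tokenize, text):
    # Measure once, convert to milliseconds once, and keep the unit in the
    # variable name so every print site agrees on the unit.
    start = time.perf_counter()
    tokens = tokenize(text)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return tokens, elapsed_ms

tokens, elapsed_ms = tokenize_with_timing(str.split, "hello buddy compiler")
print(f"[Buddy] Tokenize input time: {elapsed_ms:.3f}ms")
```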

Member

@zhanghb97 zhanghb97 left a comment


Please update the README by adding the necessary steps to run the E2E example.

@@ -0,0 +1,2532 @@
# ===- linalg.py -----------------------------------------------------------------
Member


Wrong format.
Please follow the 80-column limit.

@@ -241,7 +275,7 @@ def generated_func(*args):
for output_arg in output_node_args:
op = self._symbol_table.get((str(output_arg), 0))
returns.append(op)

returns = returns[0]
Member


Why return only the first value of returns? This will cause check-buddy to fail (the test_var_mean case returns two values).

Contributor


In llama2, the graph returns many tensors, including intermediate tensors for gradient computation, but we only need the first tensor to generate the next token.

Contributor


Maybe we should find a better way to solve this.
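One hedged sketch of such an approach (names are illustrative, not the project's actual code): preserve all graph outputs and normalize only at the boundary, so single-output graphs stay convenient while multi-output cases like test_var_mean still pass:

```python
def normalize_returns(returns):
    # Unwrap the common one-output case, but return a tuple for
    # multi-output graphs (e.g. var_mean returns two values),
    # instead of unconditionally taking returns[0].
    if len(returns) == 1:
        return returns[0]
    return tuple(returns)
```

The llama2 caller, which only needs the logits for the next token, can then select the first output explicitly at the call site rather than inside the compiler:

```python
outs = normalize_returns(["logits", "hidden_state"])  # illustrative names
first = outs[0] if isinstance(outs, tuple) else outs
```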

from buddy.compiler.frontend import DynamoCompiler
from buddy.compiler.ops import tosa

tokenizer = LlamaTokenizer.from_pretrained('/llama-2-7B-hf')
Member


The /llama-2-7B-hf seems to cause an error:

huggingface_hub.utils._validators.HFValidationError: 
Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, 
'-' and '.' cannot start or end the name, max length is 96: '/llama-2-7B-hf'.

Did I miss something important, or should we remove the /?

Member


Ah, I recalled the details about the configuration here. We should point this path to the Hugging Face version of the LLaMA model, right?

Contributor


Yes

Contributor


I will change it from '/llama-2-7B-hf' to 'path to huggingface llama2 model'.
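For background on the error above: from_pretrained accepts either an existing local directory or a Hub repo id, and a string like '/llama-2-7B-hf' that is not an existing directory falls through to repo-id validation, which rejects the leading '/'. A simplified sketch of that dispatch (the checks approximate the rules quoted in the error message, not transformers' exact implementation):

```python
import os
import re

# Simplified approximation of the Hugging Face repo-id rules from the error
# message above: alphanumerics plus '-', '_', '.', no leading or trailing
# '-' or '.', no '--' or '..', at most one '/' separating namespace and
# name, max length 96. Not transformers' actual code.
_SEGMENT_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?$")

def classify_pretrained_arg(name_or_path):
    # An existing local directory is loaded from disk directly.
    if os.path.isdir(name_or_path):
        return "local directory"
    # Otherwise the string must look like a valid Hub repo id.
    parts = name_or_path.split("/")
    ok = (
        len(name_or_path) <= 96
        and 1 <= len(parts) <= 2
        and all(_SEGMENT_RE.match(p) and "--" not in p and ".." not in p
                for p in parts)
    )
    return "hub repo id" if ok else "invalid"
```

Under this sketch, '/llama-2-7B-hf' splits into an empty first segment and is classified as invalid, which matches the HFValidationError seen above when the directory does not exist.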

Member

@zhanghb97 zhanghb97 left a comment


LGTM! Congrats 🎉🎉🎉
Thank you for your contribution. Hope you enjoyed the OSPP project!

@zhanghb97 zhanghb97 merged commit e922c64 into buddy-compiler:main Oct 28, 2023

ShiHaoGao pushed a commit to ShiHaoGao/buddy-mlir that referenced this pull request Oct 18, 2024