Update NeMo/Megatron #302
base: main
else:
    kv_channels = self.config.kv_channels

extra_kwargs["softmax_scale"] = softmax_scale
This is moved outside of `if is_te_min_version("1.10.0")`. Otherwise we can just call `super().__init__` directly without an override.
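For readers without the full diff, here is a minimal sketch of the pattern the comment describes: a keyword argument set unconditionally, while other arguments stay behind the version gate. It is an illustration under stated assumptions, not the code in this PR. `is_te_min_version` is reimplemented here as a stand-in that takes the installed version explicitly, and `build_extra_kwargs` and its parameters are hypothetical names.

```python
from typing import Any, Dict, Optional, Tuple


def is_te_min_version(version: str, installed: Tuple[int, ...]) -> bool:
    """Stand-in for the real helper (assumption): True if `installed` >= `version`."""
    required = tuple(int(part) for part in version.split("."))
    return installed >= required


def build_extra_kwargs(
    config_kv_channels: int,
    softmax_scale: Optional[float],
    installed_te_version: Tuple[int, ...],
) -> Dict[str, Any]:
    """Hypothetical helper collecting keyword arguments for the parent __init__."""
    extra_kwargs: Dict[str, Any] = {}

    if is_te_min_version("1.10.0", installed_te_version):
        # Version-gated argument: only passed when the installed version is new enough.
        extra_kwargs["kv_channels"] = config_kv_channels

    # Set outside the version check, so it is passed for every version.
    extra_kwargs["softmax_scale"] = softmax_scale
    return extra_kwargs


old = build_extra_kwargs(64, 0.125, installed_te_version=(1, 9, 0))
new = build_extra_kwargs(64, 0.125, installed_te_version=(1, 10, 0))
```

With the assignment outside the gate, both `old` and `new` carry `softmax_scale`; only `new` carries `kv_channels`.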
This is related in some ways to #304, maybe sync with @farhadrgh and make sure that you are on a new enough NeMo for his needs as well? I think his stuff was recently merged.
28505eb to dcd025e
Pass esm2 golden value tests.
Bugs
Pytest errors
dtype error comes from Megatron-LM