Required prerequisites
System information
conda environment
torch=2.0.1
transformers=4.29.2
...
Problem description
I ran the evaluate_zh.py script on an A100 (80 GB) to evaluate the Baichuan model, but GPU memory usage kept growing until it ran out of memory. I then found that the model is loaded without being switched to eval mode, and that inference runs without `torch.no_grad()`.
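For illustration, here is a minimal sketch of the pattern I would expect the evaluation loop to follow. It is not the actual code of evaluate_zh.py: `score_batch` and `toy_model` are hypothetical names, and a tiny `nn.Sequential` stands in for the real checkpoint. `model.eval()` turns off training-only behaviour such as dropout, while `torch.no_grad()` is the part that stops autograd from keeping activations around for a backward pass that never happens, which is the likely source of the memory growth.

```python
import torch
from torch import nn


def score_batch(model: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Run a forward pass for evaluation only (hypothetical helper)."""
    model.eval()  # switch off dropout and other training-only behaviour
    with torch.no_grad():  # do not record the autograd graph
        return model(inputs)


# Tiny stand-in model (assumption); the real script would load the Baichuan checkpoint.
toy_model = nn.Sequential(nn.Linear(8, 16), nn.Dropout(0.5), nn.Linear(16, 4))
logits = score_batch(toy_model, torch.randn(2, 8))

# The output carries no autograd history, and the model is in eval mode.
print(logits.requires_grad, toy_model.training)
```

Applying the same two calls around the forward pass in the evaluation script should keep memory roughly flat across batches, assuming nothing else holds references to the outputs.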
Reproducible example code
The Python snippets:
Command lines:
Extra dependencies:
Steps to reproduce:
Traceback
No response
Expected behavior
No response
Additional context
No response
Checklist