matmulnbits zero_point fix #3566
base: develop
https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.MatMulNBits
According to the spec, the zero point can only be uint8/int32/float16/float?
And if the zero point isn't specified, it can be inferred to be
@kahmed10, btw, I don't see any type checking in this parser code -- unless it is done somewhere else.
I don't think we can assume it is handled. If things are type constrained, you'll need to add that for those inputs.
Ok. Let me enhance this PR to include the basic type checking for this operator.
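For illustration only, the kind of basic type check discussed above could look roughly like the sketch below. It is not the code from this PR: `elem_type`, `type_name`, and `check_zero_point_type` are hypothetical names, and the allowed set is passed in by the caller rather than hard-coded, since which types the spec permits is exactly what is being asked in this thread.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser's element-type enum (not the project's real type).
enum class elem_type { uint8, int8, int32, half, float32 };

// Human-readable name for error messages.
inline std::string type_name(elem_type t)
{
    switch(t)
    {
    case elem_type::uint8: return "uint8";
    case elem_type::int8: return "int8";
    case elem_type::int32: return "int32";
    case elem_type::half: return "half";
    case elem_type::float32: return "float";
    }
    return "unknown";
}

// Throws std::invalid_argument if the zero_point input's element type is not
// in the caller-supplied allowed set; returns normally otherwise.
inline void check_zero_point_type(elem_type actual, const std::vector<elem_type>& allowed)
{
    if(std::find(allowed.begin(), allowed.end(), actual) == allowed.end())
    {
        throw std::invalid_argument("MatMulNBits: unsupported zero_point type: " +
                                    type_name(actual));
    }
}
```

A parser would call `check_zero_point_type` once on the zero_point input before building the dequantization, so an unsupported type fails early with a clear message instead of being silently treated as a fixed type.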
This build is not recommended to merge 🔴
❌ bert_base_cased_fp16: ERROR - check error output
❌ bert_large_uncased_fp16: ERROR - check error output
❌ tinyllama: ERROR - check error output
❌ whisper-tiny-decoder: ERROR - check error output
❌ distilgpt2_fp16: ERROR - check error output
Currently the matmulnbits parsing introduces a fixed type `uint8_type` for zero_point, and misses `int8_type`.