Convert model.bin (fp32) to model.bin (int8) #1761
The model.bin format is quite strict: the weights and their sizes are stored contiguously. There is currently no way to save the quantized weights while loading the model.
However, quantization can be applied while converting the model from, say, OpenNMT-py, and the quantized weights are saved at that point. I need this to reduce the size of the model.bin file. Would it be possible to add an API that quantizes an existing model and saves it as a new model.bin file? That seems logical to me, since such an option exists in most frameworks that support quantization.
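For the common case where the original OpenNMT-py checkpoint is still available, the conversion can simply be re-run with int8 quantization so the quantized weights are written directly into a new model.bin. A minimal sketch using the Python converter API; the file names `model.pt` and `ct2_int8` are placeholders, not names from this issue:

```python
# Sketch: re-convert an OpenNMT-py checkpoint with int8 quantization.
# Assumes the original checkpoint "model.pt" is still available;
# the paths here are illustrative placeholders.
import ctranslate2

converter = ctranslate2.converters.OpenNMTPyConverter("model.pt")
converter.convert("ct2_int8", quantization="int8")
```

The same effect can be obtained from the command line with the converter's `--quantization int8` option; either way the size reduction happens at conversion time, not at load time.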
I think it could be implemented fairly easily with the existing code, but I cannot figure out how to get the model_spec of an existing model.bin. Once that is available, the converter code could be modified to save the quantized model again.
We could develop a new feature to save the quantized tensors in a new model.bin, but it isn't simple (it would require a new converter that loads a model from model.bin back into a spec). Currently, we have no plan to do it.
I have a pretrained model.bin file which was earlier converted using OpenNMTPy converter using fp32 quantisation. Now i want to reduce the size of the model and thought of quantising it to int8. But, I was only able to find ways to quantise it upon loading of the model, not able to find how to save it for further use.
Any idea how it can be achieved?
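For reference, the quantize-on-load path mentioned above looks roughly like the sketch below. The model directory name is a placeholder, and this approach only reduces memory use at runtime; it does not shrink the model.bin on disk:

```python
# Sketch: load an fp32 CTranslate2 model and quantize its weights to int8 at load time.
# This affects runtime memory only; the model.bin on disk is unchanged.
import ctranslate2

translator = ctranslate2.Translator(
    "ct2_fp32_model",      # placeholder path to the converted fp32 model
    device="cpu",
    compute_type="int8",   # weights are quantized when the model is loaded
)

# Illustrative call; the actual tokens depend on the model's tokenizer.
print(translator.translate_batch([["▁Hello", "▁world"]]))
```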