
Register pt2e static quantization #1761

Merged
merged 18 commits into from
May 9, 2024
Conversation

yiliu30
Collaborator

@yiliu30 yiliu30 commented Apr 28, 2024

Type of Change

feature or bug fix or documentation or validation or others
API changed or not

Description

Register pt2e static quantization

  • Align the W8A8StaticQuantizer with Quantizer
  • Add an export API
  • Map the StaticQuantConfig to X86InductorQuantizer's config
  • Export support for IPEX will be added in a separate PR
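To illustrate the config-mapping bullet above, here is a framework-free sketch of translating an INC-style static-quant config into the kind of spec X86InductorQuantizer consumes. All class and field names here are simplified stand-ins, not the actual neural_compressor or torch.ao.quantization APIs:

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-in for neural_compressor's StaticQuantConfig;
# the real class carries many more fields (per-op overrides, dtypes, etc.).
@dataclass
class StaticQuantConfig:
    w_dtype: str = "int8"
    act_dtype: str = "uint8"
    act_sym: bool = False          # symmetric vs. affine activation quant
    granularity: str = "per_tensor"

def map_to_x86_inductor_config(cfg: StaticQuantConfig) -> dict:
    """Sketch of mapping an INC static-quant config onto an
    X86InductorQuantizer-style spec (names are illustrative only)."""
    return {
        "weight": {"dtype": cfg.w_dtype, "qscheme": "per_channel_symmetric"},
        "activation": {
            "dtype": cfg.act_dtype,
            "qscheme": ("per_tensor_symmetric" if cfg.act_sym
                        else "per_tensor_affine"),
        },
    }

spec = map_to_x86_inductor_config(StaticQuantConfig())
print(spec["activation"]["qscheme"])  # per_tensor_affine
```

The point of such a mapping layer is that users keep writing one INC config while the backend-specific quantizer (here, the x86 Inductor one) receives the settings in its own vocabulary.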

(diagram: static_quant_path)

Usage

# User script
model = UserModel()
example_inputs = ...

# Quantization script

# import intel_extension_for_pytorch  # <--- uncomment if the user wants to use IPEX's static quant

import torch
from torch.export import Dim

from neural_compressor.torch.quantization import get_default_static_config, prepare, convert
from neural_compressor.torch.export import export

# export
dynamic_shapes = {"input_ids": (None, Dim("seq_len"))}
exported_model = export(model, example_inputs=example_inputs, dynamic_shapes=dynamic_shapes)

# prepare
quant_config = get_default_static_config()
prepared_model = prepare(exported_model, quant_config)

# calibrate
run_fn(prepared_model)

# convert
converted_model = convert(prepared_model)

# compile and inference
opt_model = torch.compile(converted_model)
out = opt_model(*example_inputs)
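As a rough illustration of what the calibrate step (`run_fn(prepared_model)`) accomplishes: the observers that `prepare` inserts record activation ranges, which `convert` later turns into scales and zero-points. A minimal framework-free sketch, with all names hypothetical (the real flow uses torch.ao.quantization observers):

```python
# Minimal sketch of min/max observation during calibration; not the actual
# observer implementation used by the pt2e flow.
class MinMaxObserver:
    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        # Track the running range of activations seen during calibration.
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def qparams(self, qmin=0, qmax=255):
        # Affine (asymmetric) uint8 quantization parameters from the range.
        scale = (self.max_val - self.min_val) / (qmax - qmin)
        zero_point = round(qmin - self.min_val / scale)
        return scale, zero_point

obs = MinMaxObserver()
for batch in ([-1.0, 0.5], [0.0, 3.0]):  # stand-in calibration data
    obs.observe(batch)
scale, zero_point = obs.qparams()
print(round(scale, 6), zero_point)  # 0.015686 64
```

This is why running `run_fn` on representative data matters: the scales and zero-points baked in at `convert` time are only as good as the activation ranges observed during calibration.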

@ftian1 @xin3he @violetch24

How has this PR been tested?

Pre-CI

Dependency Change?

None

Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 marked this pull request as ready for review April 29, 2024 05:33

github-actions bot commented Apr 29, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/__init__.py, neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/export/__init__.py, neural_compressor/torch/export/_export.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, neural_compressor/torch/utils/constants.py, neural_compressor/torch/utils/environ.py, neural_compressor/torch/utils/utility.py, test/3x/torch/algorithms/pt2e_quant/test_pt2e_w8a8.py, test/3x/torch/quantization/test_pt2e_quant.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

@yiliu30 yiliu30 added PyTorch Related to PyTorch F/W INC3.X WIP labels Apr 30, 2024
yiliu30 and others added 8 commits May 8, 2024 11:32
@yiliu30 yiliu30 removed the WIP label May 8, 2024
@chensuyue
Contributor

@xin3he please review.

@yiliu30 yiliu30 merged commit 43c3580 into master May 9, 2024
30 checks passed
@yiliu30 yiliu30 deleted the pt2e_entry branch May 9, 2024 06:56