
Compile LayoutLMv3 using Neuron SDK for AWS Inferentia (inf1) #1610

Open
murilosimao opened this issue Aug 12, 2024 · 0 comments

Comments

@murilosimao

I've fine-tuned the LayoutLMv3 model and can run inference on both CPU and GPU without any issues. However, when I try to compile the model for inference on AWS inf1 instances, tracing always fails with the same error. I tried modifying the forward function, but without success. Here's the error I'm encountering:

>>> traced = torch.jit.trace(model, [input_ids, attention_mask, bbox, pixel_values])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/jit/_trace.py", line 759, in trace
    return trace_module(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/jit/_trace.py", line 976, in trace_module
    module._c._create_method_from_trace(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "<stdin>", line 5, in custom_forward
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 1099, in forward
    outputs = self.layoutlmv3(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 907, in forward
    embedding_output = self.embeddings(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 338, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)

How to reproduce:

from transformers import AutoModelForTokenClassification, AutoProcessor
import torch

processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-large", apply_ocr=False)

# The processor returns a BatchEncoding (a dict-like object), so the tensors
# must be pulled out by key rather than tuple-unpacked:
encoding = processor(
    any_PIL_image,
    ["first", "second"],
    boxes=[[100, 100, 100, 100], [22, 200, 200, 200]],
    return_tensors="pt",
    padding="max_length",
    truncation=True,
)
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]
bbox = encoding["bbox"]
pixel_values = encoding["pixel_values"]

model = AutoModelForTokenClassification.from_pretrained("microsoft/layoutlmv3-large")
model.eval()
traced = torch.jit.trace(model, [input_ids, attention_mask, bbox, pixel_values])
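A likely cause (my assumption from the traceback, not confirmed): `torch.jit.trace` passes the example inputs positionally, and `LayoutLMv3ForTokenClassification.forward` takes its arguments in the order `(input_ids, bbox, attention_mask, token_type_ids, ...)`. So with the list above, `pixel_values` (a float tensor) lands in the `token_type_ids` slot and is fed to `token_type_embeddings`, which is exactly where the `FloatTensor` embedding error fires. A small wrapper module that forwards the traced positional inputs as keyword arguments should sidestep the mismatch (the wrapper name is mine, just for illustration):

```python
import torch


class LayoutLMv3TraceWrapper(torch.nn.Module):
    """Forward torch.jit.trace's positional example inputs as keyword
    arguments, so pixel_values is not mistaken for token_type_ids."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask, bbox, pixel_values):
        outputs = self.model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            bbox=bbox,
            pixel_values=pixel_values,
        )
        # Return a plain tensor: tracing does not handle ModelOutput objects.
        return outputs.logits
```

Then trace the wrapper instead of the raw model: `traced = torch.jit.trace(LayoutLMv3TraceWrapper(model).eval(), [input_ids, attention_mask, bbox, pixel_values])`.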

My references for compiling models with Neuron:
https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuron/inference/trocr/TrOCR.ipynb
https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuron/inference/beit/BEiT.ipynb

@NielsRogge can you help to solve that?
