
Compile LayoutLMv3 using Neuron SDK for AWS Inferentia (inf1) #1610

Open
murilosimao opened this issue Aug 12, 2024 · 0 comments

Comments

@murilosimao

I've fine-tuned the LayoutLMv3 model and can run inference on both CPU and GPU without any issues. However, when I try to compile the model for inference on AWS inf1 instances, tracing always fails with the same error. I tried modifying the forward function, but without success. Here's the error I'm encountering:

>>> traced = torch.jit.trace(model, [input_ids, attention_mask, bbox, pixel_values])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/jit/_trace.py", line 759, in trace
    return trace_module(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/jit/_trace.py", line 976, in trace_module
    module._c._create_method_from_trace(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "<stdin>", line 5, in custom_forward
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 1099, in forward
    outputs = self.layoutlmv3(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 907, in forward
    embedding_output = self.embeddings(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 338, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/content/aws_neuron_venv_pytorch_inf1/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)

How to reproduce:

from transformers import AutoModelForTokenClassification, AutoProcessor
import torch

processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-large", apply_ocr=False)

# The processor returns a BatchEncoding (a dict-like object), so the tensors
# must be pulled out by key rather than tuple-unpacked:
encoding = processor(
    any_PIL_image,
    ["first", "second"],
    boxes=[[100, 100, 100, 100], [22, 200, 200, 200]],
    return_tensors="pt",
    padding="max_length",
    truncation=True,
)
input_ids = encoding["input_ids"]
attention_mask = encoding["attention_mask"]
bbox = encoding["bbox"]
pixel_values = encoding["pixel_values"]

model = AutoModelForTokenClassification.from_pretrained("microsoft/layoutlmv3-large")
model.eval()
traced = torch.jit.trace(model, [input_ids, attention_mask, bbox, pixel_values])
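A likely cause (my assumption from the traceback, not confirmed): `torch.jit.trace` passes the example inputs positionally, and `LayoutLMv3ForTokenClassification.forward` takes its arguments in the order `(input_ids, bbox, attention_mask, token_type_ids, ...)`. So with the list above, `pixel_values` (a float tensor) lands in the `token_type_ids` slot and is fed to `token_type_embeddings`, which is exactly where the `FloatTensor` embedding error fires. A small wrapper module that forwards the traced positional inputs as keyword arguments should sidestep the mismatch (the wrapper name is mine, just for illustration):

```python
import torch


class LayoutLMv3TraceWrapper(torch.nn.Module):
    """Forward torch.jit.trace's positional example inputs as keyword
    arguments, so pixel_values is not mistaken for token_type_ids."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask, bbox, pixel_values):
        outputs = self.model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            bbox=bbox,
            pixel_values=pixel_values,
        )
        # Return a plain tensor: tracing does not handle ModelOutput objects.
        return outputs.logits
```

Then trace the wrapper instead of the raw model: `traced = torch.jit.trace(LayoutLMv3TraceWrapper(model).eval(), [input_ids, attention_mask, bbox, pixel_values])`.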

My references for compiling models with Neuron:
https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuron/inference/trocr/TrOCR.ipynb
https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuron/inference/beit/BEiT.ipynb

@NielsRogge can you help to solve that?
