[BUG] Missing intrinsics for AArch32 instructions VMLA.F16 and VMLS.F16 #216

Open
Maratyszcza opened this issue Oct 2, 2022 · 1 comment

Maratyszcza commented Oct 2, 2022

Alongside VFMA.F16/VFMS.F16, AArch32 offers the VMLA.F16/VMLS.F16 instructions, which perform a multiply-add operation with intermediate rounding. Importantly, the vector-by-lane form (e.g. VMLA.F16 Qd, Qn, Dm[x]) is supported on AArch32 only for the VMLA/VMLS instructions, not for the VFMA/VFMS instructions.

The NEON intrinsics specification lacks intrinsics for the VMLA.F16/VMLS.F16 instructions. In particular, this makes it impossible to achieve peak performance on half-precision matrix-matrix multiplication in AArch32 using NEON intrinsics, because the optimal implementation would use the VMLA.F16 Qd, Qn, Dm[x] instruction.

I request that the NEON intrinsics specification be updated to include the following intrinsics for AArch32:

  • vmla_f16 (VMLA.F16 Dd, Dn, Dm)
  • vmls_f16 (VMLS.F16 Dd, Dn, Dm)
  • vmlaq_f16 (VMLA.F16 Qd, Qn, Qm)
  • vmlsq_f16 (VMLS.F16 Qd, Qn, Qm)
  • vmla_lane_f16 (VMLA.F16 Dd, Dn, Dm[x])
  • vmls_lane_f16 (VMLS.F16 Dd, Dn, Dm[x])
  • vmlaq_lane_f16/vmlaq_laneq_f16 (VMLA.F16 Qd, Qn, Dm[x])
  • vmlsq_lane_f16/vmlsq_laneq_f16 (VMLS.F16 Qd, Qn, Dm[x])
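As a rough illustration of why the lane form matters for matrix multiplication, the following scalar C sketch models what the requested `vmlaq_lane_f16` would compute and how a micro-kernel might use it. Everything here is an assumption made for illustration: the names `model_vmlaq_lane` and `model_gemm_8x4`, the packed layouts of A and B, and the use of `float` in place of `float16_t` so the code compiles on any target.

```c
#include <stddef.h>

enum { QLANES = 8, DLANES = 4 };  /* f16 lanes in a Q and a D register */

/* Hypothetical scalar model of the requested vmlaq_lane_f16:
   acc[i] += a[i] * b[lane] for every lane i of the Q register.
   The product is stored before the add to mimic intermediate rounding. */
void model_vmlaq_lane(float acc[QLANES], const float a[QLANES],
                      const float b[DLANES], int lane) {
    for (int i = 0; i < QLANES; ++i) {
        volatile float product = a[i] * b[lane];
        acc[i] += product;
    }
}

/* 8x4 micro-kernel: c[j] holds column j of the 8x4 output tile.
   a is packed as k groups of 8 values, b as k groups of 4 values
   (an assumed layout, not taken from the issue). */
void model_gemm_8x4(size_t k, const float *a, const float *b,
                    float c[DLANES][QLANES]) {
    for (size_t p = 0; p < k; ++p) {
        for (int j = 0; j < DLANES; ++j) {
            /* One call here stands for one VMLA.F16 Qd, Qn, Dm[x]. */
            model_vmlaq_lane(c[j], a + p * QLANES, b + p * DLANES, j);
        }
    }
}
```

In this model each step of the inner loop needs one Q register of A, one D register of B, and one instruction per output column, with no separate broadcast of the B element.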
Maratyszcza added the bug label Oct 2, 2022
vhscampos (Member) commented

Hi @Maratyszcza, thanks for your issue report, and apologies for the late response.

If possible, we encourage you to contribute a pull request that addresses this issue. We will be happy to review it.

vhscampos added the proposal label and removed the bug label Feb 27, 2023