libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-06-15 08:26:06 +08:00

History

George Steed 1c31461771 [AArch64] Add Neon dot-product implementation for ARGBGrayRow We can use dot product instructions to apply the coefficients without needing to use LD4 deinterleaving load instructions, and then TBL to mix in the original alpha component. This is significantly faster on some micro-architectures where LD4 instructions are known to be slow compared to normal loads. Reduction in cycle counts observed compared to existing Neon code: Cortex-A55: -12.6% Cortex-A510: -48.6% Cortex-A76: -39.7% Cortex-A720: -52.3% Cortex-X1: -63.5% Cortex-X2: -67.0% Bug: b/42280946 Change-Id: I3641785e74873438acc00d675f5bc490dfa95b50 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5785972 Reviewed-by: Justin Green <greenjustin@google.com> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2024-09-16 04:31:11 +00:00
..
libyuv	[AArch64] Add Neon dot-product implementation for ARGBGrayRow	2024-09-16 04:31:11 +00:00
libyuv.h	NV12 Copy, include scale_uv.h	2020-12-08 18:54:16 +00:00

George Steed 1c31461771 [AArch64] Add Neon dot-product implementation for ARGBGrayRow

We can use dot product instructions to apply the coefficients without
needing to use LD4 deinterleaving load instructions, and then TBL to mix
in the original alpha component. This is significantly faster on some
micro-architectures where LD4 instructions are known to be slow compared
to normal loads.

Reduction in cycle counts observed compared to existing Neon code:

 Cortex-A55: -12.6%
Cortex-A510: -48.6%
 Cortex-A76: -39.7%
Cortex-A720: -52.3%
  Cortex-X1: -63.5%
  Cortex-X2: -67.0%

Bug: b/42280946
Change-Id: I3641785e74873438acc00d675f5bc490dfa95b50
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5785972
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>

2024-09-16 04:31:11 +00:00

libyuv

[AArch64] Add Neon dot-product implementation for ARGBGrayRow

2024-09-16 04:31:11 +00:00

libyuv.h

NV12 Copy, include scale_uv.h

2020-12-08 18:54:16 +00:00