libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-08-01 01:06:29 +08:00

History

libyuv	[AArch64] Add SVE2 implementations for RAWTo{ARGB,RGBA}Row	2024-07-06 22:40:15 +00:00
libyuv.h	NV12 Copy, include scale_uv.h	2020-12-08 18:54:16 +00:00

George Steed c613c3f102 [AArch64] Add SVE2 implementations for RAWTo{ARGB,RGBA}Row

We can construct particular predicates to load only up to 3/4 of a full
vector, allowing us to use TBL to shuffle elements into the correct
place rather than needing to rely on more expensive LD3 or ST4
instructions.

Reduction in runtimes observed compared to the existing Neon
implementation:

            | RAWToARGBRow | RAWToRGBARow
Cortex-A510 |       -32.4% |       -31.9%
Cortex-A720 |       -15.7% |       -15.6%
  Cortex-X2 |       -24.6% |       -24.4%

Bug: libyuv:973
Change-Id: I271c625d97bab3b0e08ac1e9d7fcf7d18f3d6894
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5631542
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>

2024-07-06 22:40:15 +00:00
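
The predicate-and-TBL idea from the commit message can be modelled in portable scalar C. This is only a sketch, not the libyuv SVE2 code: it assumes a fixed 16-byte vector (real SVE2 vectors may be longer), the names ModelTbl and RawToArgbRowModel are hypothetical, and the way alpha is merged here is an assumption rather than something the commit states. It loads 12 bytes (3/4 of the vector, i.e. four R,G,B pixels), uses a table lookup to reorder them into B,G,R,A slots, and fills the alpha lanes with 255.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { kVecBytes = 16 }; /* assumed vector length for this model */

/* Scalar model of a byte table lookup: out-of-range indices produce 0. */
static void ModelTbl(const uint8_t* table, const uint8_t* idx, uint8_t* out) {
  for (int i = 0; i < kVecBytes; ++i) {
    out[i] = idx[i] < kVecBytes ? table[idx[i]] : 0;
  }
}

/* Hypothetical helper, not a libyuv function. Converts `width` RAW pixels
 * (R,G,B in memory) to ARGB pixels (B,G,R,A in memory). */
static void RawToArgbRowModel(const uint8_t* src_raw, uint8_t* dst_argb,
                              int width) {
  /* Index 255 is out of range, so those lanes come back as 0. */
  static const uint8_t kShuffle[kVecBytes] = {2, 1, 0, 255, 5,  4,  3, 255,
                                              8, 7, 6, 255, 11, 10, 9, 255};
  /* Opaque alpha for every fourth byte (assumed merge step). */
  static const uint8_t kAlpha[kVecBytes] = {0, 0, 0, 255, 0, 0, 0, 255,
                                            0, 0, 0, 255, 0, 0, 0, 255};
  while (width > 0) {
    int pixels = width < 4 ? width : 4;
    uint8_t in[kVecBytes] = {0};
    uint8_t out[kVecBytes];

    /* Models the predicated load: only 3 bytes per active pixel are read,
     * at most 12 of the 16 lanes; the remaining lanes stay 0. */
    memcpy(in, src_raw, (size_t)pixels * 3);

    ModelTbl(in, kShuffle, out);
    for (int i = 0; i < kVecBytes; ++i) {
      out[i] |= kAlpha[i];
    }

    /* Models the predicated store: only the active pixels are written. */
    memcpy(dst_argb, out, (size_t)pixels * 4);

    src_raw += pixels * 3;
    dst_argb += pixels * 4;
    width -= pixels;
  }
}
```

Under the same assumptions, an RGBA variant (A,B,G,R in memory) would change only the two index tables. The point of the model is that one contiguous load plus one shuffle replaces a de-interleaving load (LD3) or an interleaving store (ST4).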
