libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-07-31 00:36:32 +08:00

libyuv	[AArch64] Add SVE2 implementation of I422ToARGB1555Row	2024-10-24 21:27:39 +00:00
libyuv.h	NV12 Copy, include scale_uv.h	2020-12-08 18:54:16 +00:00

George Steed f4eaeca22a [AArch64] Add SVE2 implementation of I422ToARGB1555Row

This makes use of the same approach as the Neon code to avoid redundant
narrowing and then widening shifts: the values are placed in the top
portion of the lanes and shifted down from there.

Observed reduction in runtime compared to the existing Neon code:

Cortex-A510: -41.8%
Cortex-A520: -42.6%
Cortex-A715: -22.5%
Cortex-A720: -22.6%
  Cortex-X2: -22.7%
  Cortex-X3: -22.4%
  Cortex-X4: -19.4%
Cortex-X925: -27.0%

Bug: b/42280942
Change-Id: I24b092bb352d9858e3d969d82b55940bb00ac7e0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5802967
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>

2024-10-24 21:27:39 +00:00
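
The shift trick described in the commit message can be illustrated with a small scalar sketch. This is not the SVE2 or Neon code from the change. It is a one-lane model under stated assumptions: each channel arrives as a 16-bit value whose top 8 bits hold the colour sample, the output layout is A in bit 15, R in bits 14..10, G in bits 9..5 and B in bits 4..0, and the packing is done with a shift-right-and-insert step. The helper names (`sri16`, `pack_argb1555_narrow_widen`, `pack_argb1555_top_aligned`) are hypothetical and do not exist in libyuv.

```c
#include <stdint.h>

/* Baseline: narrow each channel to 8 bits, then widen again and shift
 * into place. The narrowing and widening steps are the redundant work. */
static uint16_t pack_argb1555_narrow_widen(uint16_t b16, uint16_t g16,
                                           uint16_t r16) {
  uint8_t b = (uint8_t)(b16 >> 8); /* narrow to 8 bits */
  uint8_t g = (uint8_t)(g16 >> 8);
  uint8_t r = (uint8_t)(r16 >> 8);
  return (uint16_t)(0x8000u | ((uint16_t)(r >> 3) << 10) |
                    ((uint16_t)(g >> 3) << 5) | (uint16_t)(b >> 3));
}

/* Scalar model of a shift-right-and-insert on one 16-bit lane:
 * keep the top n bits of dst, fill the rest with src >> n (1 <= n <= 15). */
static uint16_t sri16(uint16_t dst, uint16_t src, unsigned n) {
  uint16_t keep = (uint16_t)~(0xFFFFu >> n);
  return (uint16_t)((dst & keep) | (src >> n));
}

/* Alternative: leave the samples top-aligned in their 16-bit lanes and
 * shift them down directly into their final bit positions. */
static uint16_t pack_argb1555_top_aligned(uint16_t b16, uint16_t g16,
                                          uint16_t r16) {
  uint16_t out = 0x8000u;   /* opaque alpha in bit 15 */
  out = sri16(out, r16, 1);  /* R lands in bits 14..10 */
  out = sri16(out, g16, 6);  /* G lands in bits 9..5 */
  out = sri16(out, b16, 11); /* B lands in bits 4..0 */
  return out;
}
```

Both functions return the same 16-bit pixel for any input. The second reaches it with three shifts and no intermediate 8-bit values, which is the kind of saving the commit message refers to.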
