libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-08-01 01:06:29 +08:00

libyuv	[AArch64] Add SVE2 implementations for AYUVTo{UV,VU}Row	2024-06-04 18:18:07 +00:00
libyuv.h	NV12 Copy, include scale_uv.h	2020-12-08 18:54:16 +00:00

George Steed 004352ba16 [AArch64] Add SVE2 implementations for AYUVTo{UV,VU}Row

These kernels are mostly identical to each other except for the order of
the results, so we can use a single macro to parameterize the pairwise
addition and use the same macro for both implementations, just with the
register order flipped.
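
The shared-macro idea can be illustrated in plain C. This is a minimal scalar sketch, not the SVE2 code from this change: the names SKETCH_AYUV_ROW, SketchAYUVToUVRow and SketchAYUVToVURow are hypothetical, and it assumes each AYUV pixel is stored as the four bytes V, U, Y, A and that the width is even.

```c
#include <stdint.h>

/* Hypothetical scalar sketch; not the SVE2 kernels from this change.
 * Assumes each AYUV pixel is stored as the bytes V, U, Y, A and that
 * width is even. FIRST and SECOND are the byte offsets, within a pixel,
 * of the channels written to dst[0] and dst[1]. */
#define SKETCH_AYUV_ROW(NAME, FIRST, SECOND)                               \
  void NAME(const uint8_t* src, int src_stride, uint8_t* dst, int width) { \
    const uint8_t* next = src + src_stride; /* second row of the 2x2 */    \
    for (int x = 0; x < width; x += 2) {                                   \
      /* Rounded average of one channel over a 2x2 block of pixels. */     \
      dst[0] = (uint8_t)((src[FIRST] + src[4 + FIRST] + next[FIRST] +      \
                          next[4 + FIRST] + 2) >> 2);                      \
      dst[1] = (uint8_t)((src[SECOND] + src[4 + SECOND] + next[SECOND] +   \
                          next[4 + SECOND] + 2) >> 2);                     \
      src += 8;                                                            \
      next += 8;                                                           \
      dst += 2;                                                            \
    }                                                                      \
  }

/* Same body for both outputs; only the channel order is flipped. */
SKETCH_AYUV_ROW(SketchAYUVToUVRow, 1, 0) /* writes U then V */
SKETCH_AYUV_ROW(SketchAYUVToVURow, 0, 1) /* writes V then U */
```

Both functions expand from the same body, so any fix to the averaging applies to both; only the two offset arguments differ.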

Similar to other 2x2 kernels, the implementation here differs slightly
for the last element if the problem size is odd, so use an "any" kernel
to avoid needing to handle this in the common code path.
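
One way to picture an "any" wrapper is sketched below in scalar C. It is an illustration under stated assumptions, not libyuv's actual wrapper: the names SketchEvenUVKernel and SketchAYUVToUVRowAny are hypothetical, and the V, U, Y, A byte order is assumed. The fast path handles only even widths; for an odd width, the wrapper duplicates the last pixel column into a small temporary buffer and runs the same kernel on it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fast path: even widths only. Assumes V, U, Y, A bytes. */
void SketchEvenUVKernel(const uint8_t* src, int src_stride,
                        uint8_t* dst, int width) {
  const uint8_t* next = src + src_stride; /* second row of the 2x2 */
  for (int x = 0; x < width; x += 2) {
    dst[0] = (uint8_t)((src[1] + src[5] + next[1] + next[5] + 2) >> 2); /* U */
    dst[1] = (uint8_t)((src[0] + src[4] + next[0] + next[4] + 2) >> 2); /* V */
    src += 8;
    next += 8;
    dst += 2;
  }
}

/* Hypothetical "any" wrapper: accepts any width >= 1. */
void SketchAYUVToUVRowAny(const uint8_t* src, int src_stride,
                          uint8_t* dst, int width) {
  int even = width & ~1; /* largest even count not exceeding width */
  if (even > 0) {
    SketchEvenUVKernel(src, src_stride, dst, even);
  }
  if (width & 1) {
    /* Duplicate the last pixel of each row so the kernel sees a full
     * 2x2 block. With both columns equal, (2a + 2b + 2) >> 2 gives the
     * same result as the two-sample average (a + b + 1) >> 1. */
    uint8_t tmp[16];
    const uint8_t* last0 = src + even * 4;
    const uint8_t* last1 = last0 + src_stride;
    memcpy(tmp, last0, 4);
    memcpy(tmp + 4, last0, 4);
    memcpy(tmp + 8, last1, 4);
    memcpy(tmp + 12, last1, 4);
    SketchEvenUVKernel(tmp, 8, dst + even, 2);
  }
}
```

Keeping the odd-width case in the wrapper means the fast kernel never needs a branch for the final element.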

Observed reduction in runtime compared to the existing Neon code:

            | AYUVToUVRow | AYUVToVURow
Cortex-A510 |      -33.1% |      -33.0%
Cortex-A720 |      -25.1% |      -25.1%
  Cortex-X2 |      -59.5% |      -53.9%
  Cortex-X4 |      -39.2% |      -39.4%

Bug: libyuv:973
Change-Id: I957db9ea31c8830535c243175790db0ff2a3ccae
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5522316
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>

2024-06-04 18:18:07 +00:00
