George Steed 418b6df0de [AArch64] Add SME implementation of Convert16To8Row
Mostly just a straightforward copy of the Neon code ported to
Streaming-SVE, we can use predication to avoid needing an `Any` kernel.
SVE has a "widening multiply get high half" instruction in UMULH,
however using the same technique as the Neon code to avoid the need for
a widening multiply at all is more performant here.

These is no benefit from this kernel when the SVE vector length is only
128 bits, so skip writing a non-streaming SVE implementation.

Change-Id: Ib12699c5b8b168d004ebc74c0281ea3772ca8d32
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6070786
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2024-12-12 03:01:55 -08:00
..
libyuv [AArch64] Add SME implementation of Convert16To8Row 2024-12-12 03:01:55 -08:00
libyuv.h NV12 Copy, include scale_uv.h 2020-12-08 18:54:16 +00:00