mirror of
https://chromium.googlesource.com/libyuv/libyuv
synced 2025-12-07 17:26:49 +08:00
For HalfFloat1Row, SVE has direct 16-bit integer to half-float
conversion instructions so there is no need to widen to 32-bits.
For HalfFloatRow, SVE zero-extending loads avoid the need for seperate
UXTL(2) instructions.
Observed reductions in runtime compared to the existing Neon code:
| HalfFloat1Row | HalfFloatRow
Cortex-A510 | -38.3% | -17.3%
Cortex-A520 | -37.6% | -18.8%
Cortex-A720 | -50.1% | -7.8%
Cortex-X2 | -50.2% | -0.4%
Cortex-X4 | -51.5% | -12.5%
Bug: b/42280942
Change-Id: I445071ccd453113144ce42d465ba03c9ee89ec9e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5975319
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
||
|---|---|---|
| .. | ||
| libyuv | ||
| libyuv.h | ||