Mirror of https://chromium.googlesource.com/libyuv/libyuv
Synced 2025-12-09 10:16:46 +08:00
SVE can make use of the UMULH instruction to avoid needing separate
widening multiply and narrowing steps for the scale application.
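
As a rough illustration of the point above, the following scalar C sketch models the two approaches on a single 8-bit lane. It is not the libyuv kernel: the function names, the 8-bit lane width, and the shift of 8 are assumptions chosen for illustration, and the real Convert8To8Row_SVE2 code may use different lane widths and also apply a bias. The sketch only shows that keeping the high half of the product in one operation gives the same result as widening, multiplying, and narrowing.

```c
#include <stddef.h>
#include <stdint.h>

/* Two-step form (hypothetical helper): widen the 8-bit value to 16 bits,
 * multiply, then narrow by keeping the upper 8 bits of the product. */
static uint8_t scale_widen_narrow(uint8_t v, uint8_t scale) {
  uint16_t wide = (uint16_t)v;                   /* widening step */
  uint16_t product = (uint16_t)(wide * scale);   /* max 255 * 255 fits in 16 bits */
  return (uint8_t)(product >> 8);                /* narrowing step */
}

/* Scalar model of an unsigned multiply-high on one 8-bit lane
 * (hypothetical helper): returns the upper half of the 16-bit product
 * directly, which is what a multiply-high instruction produces per lane. */
static uint8_t umulh8(uint8_t a, uint8_t b) {
  return (uint8_t)(((unsigned)a * (unsigned)b) >> 8);
}

/* Hypothetical row helper that applies the scale with the multiply-high
 * model. Assumes src and dst each hold at least `width` bytes. */
static void scale_row_umulh(const uint8_t* src, uint8_t* dst, uint8_t scale,
                            size_t width) {
  for (size_t i = 0; i < width; ++i) {
    dst[i] = umulh8(src[i], scale);
  }
}
```

Both helpers return the same value for every input pair, so a vector unit with a multiply-high instruction can skip the explicit widen and narrow steps without changing the result.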
Change in runtime for Convert8To8Row_SVE2 relative to the existing Neon
implementation (negative values mean the SVE2 version is faster):
Cortex-A510: -13.2%
Cortex-A520: -16.4%
Cortex-A710: -37.1%
Cortex-A715: -38.5%
Cortex-A720: -38.4%
Cortex-X2: -33.2%
Cortex-X3: -31.8%
Cortex-X4: -31.8%
Cortex-X925: -13.9%
Change-Id: I17c0cb81661c5fbce786b47cdf481549cfdcbfc7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6207692
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
| File |
|---|
| basic_types.h |
| compare_row.h |
| compare.h |
| convert_argb.h |
| convert_from_argb.h |
| convert_from.h |
| convert.h |
| cpu_id.h |
| cpu_support.h |
| loongson_intrinsics.h |
| macros_msa.h |
| mjpeg_decoder.h |
| planar_functions.h |
| rotate_argb.h |
| rotate_row.h |
| rotate.h |
| row_sve.h |
| row.h |
| scale_argb.h |
| scale_rgb.h |
| scale_row.h |
| scale_uv.h |
| scale.h |
| version.h |
| video_common.h |