mirror of
https://chromium.googlesource.com/libyuv/libyuv
synced 2025-12-11 11:17:28 +08:00
There is no nice way of forming the TBL permute indices here since we are operating on sets of three bytes at a time, so instead load the appropriate indices from a static array.

We can make use of SVE predication to ensure we are operating on a multiple of three bytes for the load/store instructions rather than needing to make use of more expensive LD3 or ST3 instructions.

Reduction in runtime observed compared to the existing Neon implementation:

- Cortex-A510: -39.2%
- Cortex-A720: -34.5%
- Cortex-X2: -31.0%

Bug: libyuv:973
Change-Id: I68560bde7a529e5cec150b0e9d3ffe4341038fb8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5631543
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>

| Name | | |
|---|---|---|
| .. | ||
| basic_types.h | ||
| compare_row.h | ||
| compare.h | ||
| convert_argb.h | ||
| convert_from_argb.h | ||
| convert_from.h | ||
| convert.h | ||
| cpu_id.h | ||
| loongson_intrinsics.h | ||
| macros_msa.h | ||
| mjpeg_decoder.h | ||
| planar_functions.h | ||
| rotate_argb.h | ||
| rotate_row.h | ||
| rotate.h | ||
| row.h | ||
| scale_argb.h | ||
| scale_rgb.h | ||
| scale_row.h | ||
| scale_uv.h | ||
| scale.h | ||
| version.h | ||
| video_common.h | ||
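
The approach described in the commit message above can be illustrated with a portable scalar sketch. This is not the code from the change: the function name `SwapFirstThirdRow_Sketch`, the table `kSwapFirstThirdIndices`, the fixed 16-byte vector length, and the choice of permutation (swapping the first and third byte of each 3-byte pixel) are all assumptions made for illustration. Real SVE code is vector-length agnostic and would use predicated load, TBL, and predicated store instructions instead of the loops shown here.

```c
#include <stddef.h>
#include <stdint.h>

// Assumed vector length in bytes. Hypothetical: SVE hardware may use a
// different length, and real code would query it at run time.
enum { kVectorBytes = 16 };
// Largest multiple of three that fits in one vector (15 for 16 bytes).
// This plays the role of the predicate on the loads and stores.
enum { kActiveBytes = (kVectorBytes / 3) * 3 };

// Static permute indices: within each 3-byte group, swap bytes 0 and 2.
// The final lane is never active, so it holds an out-of-range index.
static const uint8_t kSwapFirstThirdIndices[kVectorBytes] = {
    2, 1, 0, 5, 4, 3, 8, 7, 6, 11, 10, 9, 14, 13, 12, 255};

// Hypothetical row function: permutes `width` 3-byte pixels from src to dst.
void SwapFirstThirdRow_Sketch(const uint8_t* src, uint8_t* dst, int width) {
  size_t remaining = (size_t)width * 3;
  while (remaining > 0) {
    // Number of active lanes: a whole number of 3-byte pixels, also
    // covering the tail of the row.
    size_t active = remaining < kActiveBytes ? remaining : kActiveBytes;
    uint8_t in[kVectorBytes] = {0};
    uint8_t out[kVectorBytes];
    size_t i;

    // Emulated predicated load: inactive lanes read as zero.
    for (i = 0; i < active; ++i) {
      in[i] = src[i];
    }
    // Emulated TBL: each output lane picks the input lane named by the
    // index table; out-of-range indices produce zero.
    for (i = 0; i < kVectorBytes; ++i) {
      uint8_t idx = kSwapFirstThirdIndices[i];
      out[i] = idx < kVectorBytes ? in[idx] : 0;
    }
    // Emulated predicated store: only active lanes are written.
    for (i = 0; i < active; ++i) {
      dst[i] = out[i];
    }

    src += active;
    dst += active;
    remaining -= active;
  }
}
```

With a 16-byte vector, each iteration handles five whole pixels (15 bytes) and leaves the last lane inactive, so no pixel straddles a vector boundary and no de-interleaving load or store is required.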