George Steed c4a0c8d34a [AArch64] Add SVE2 and SME implementations for Convert8To8Row
SVE can make use of the UMULH instruction to avoid needing separate
widening multiply and narrowing steps for the scale application.
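
As a rough illustration of the point above, the following scalar C sketch models the two approaches on a single 8-bit lane. It is not the libyuv kernel: the function names are hypothetical, and it assumes the scale is applied as an 8-bit fixed-point multiply that keeps the upper 8 bits of the 16-bit product. A "multiply returning high half" instruction such as UMULH yields that upper half directly, so the widened intermediate and the separate narrowing shift are not needed.

```c
#include <stdint.h>

/* Hypothetical helper: two-step form. Widen both operands to 16 bits,
 * multiply, then narrow by keeping bits 15..8 of the product. */
static uint8_t scale_widen_then_narrow(uint8_t src, uint8_t scale) {
  uint16_t wide = (uint16_t)((uint16_t)src * (uint16_t)scale);
  return (uint8_t)(wide >> 8);
}

/* Hypothetical helper: scalar model of an unsigned multiply-high on one
 * 8-bit lane. The result is the upper 8 bits of the 16-bit product. */
static uint8_t mulh_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(((unsigned)a * (unsigned)b) >> 8);
}

/* Hypothetical row loop using the multiply-high model. The real SVE2 code
 * would process a whole vector of lanes per iteration; this is scalar. */
static void scale_row_model(const uint8_t* src, uint8_t* dst, uint8_t scale,
                            int width) {
  for (int x = 0; x < width; ++x) {
    dst[x] = mulh_u8(src[x], scale);
  }
}
```

Both helpers compute the same value for every input pair; the difference on real hardware is only in how many instructions it takes to get there.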

Reduction in runtime for Convert8To8Row_SVE2 observed compared to the
existing Neon implementation:

        Cortex-A510: -13.2%
        Cortex-A520: -16.4%
        Cortex-A710: -37.1%
        Cortex-A715: -38.5%
        Cortex-A720: -38.4%
          Cortex-X2: -33.2%
          Cortex-X3: -31.8%
          Cortex-X4: -31.8%
        Cortex-X925: -13.9%

Change-Id: I17c0cb81661c5fbce786b47cdf481549cfdcbfc7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6207692
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2025-01-28 15:53:26 -08:00
basic_types.h Disable old int types by default. 2018-07-09 21:16:47 +00:00
compare_row.h Avoid duplication of CPU feature disable macros 2024-09-23 09:28:24 +00:00
compare.h Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00
convert_argb.h YUY2ToARGBMatrix and UYVYToARGBMatrix added to allow any color matrix 2024-01-19 21:21:37 +00:00
convert_from_argb.h MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
convert_from.h Add 10/12 bit YUV To YUV functions 2021-02-25 23:16:54 +00:00
convert.h J420ToI420 using planar 8 bit scaling 2025-01-22 02:50:24 -08:00
cpu_id.h avx10_2 detect 2025-01-21 13:53:19 -08:00
cpu_support.h Re-enable SME when building for AArch64 Android 2024-10-04 17:43:26 +00:00
loongson_intrinsics.h RAWToJ400 faster version for ARM 2022-03-18 07:22:36 +00:00
macros_msa.h Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
mjpeg_decoder.h add YUV24 and AYUV formats 2019-03-05 02:53:56 +00:00
planar_functions.h J420ToI420 using planar 8 bit scaling 2025-01-22 02:50:24 -08:00
rotate_argb.h Switch to C99 types 2018-01-23 19:16:05 +00:00
rotate_row.h [AArch64] Re-enable SME only for Linux and new versions of Clang 2024-09-23 09:29:53 +00:00
rotate.h Add 10 bit rotate methods. 2023-01-04 21:10:01 +00:00
row_sve.h [AArch64] Add SVE2 and SME implementations for Convert8To8Row 2025-01-28 15:53:26 -08:00
row.h [AArch64] Add SVE2 and SME implementations for Convert8To8Row 2025-01-28 15:53:26 -08:00
scale_argb.h Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_rgb.h RGBScale function using 3 steps: RGB24ToARGB, ARGBScale, ARGBToRGB24 2022-03-19 01:44:06 +00:00
scale_row.h [RVV] Optimize ScaleARGBFilterCols with RVV 2024-12-29 17:32:00 -08:00
scale_uv.h add yuvconvstants util 2021-02-12 19:45:16 +00:00
scale.h Add NV24 scaling support to libyuv 2024-12-12 02:46:11 -08:00
version.h J420ToI420 AVX2 2025-01-27 11:23:44 -08:00
video_common.h Add support for AR64 format 2021-03-13 20:55:21 +00:00