George Steed f40042533c [AArch64] Add SVE2 implementation of I422ToRGB565Row
This makes use of the same approach as the Neon code to avoid redundant
narrowing and then widening shifts: the values are placed in the top
portion of the lanes and then shifted down from there.
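
The idea described above can be illustrated with a scalar model of one
16-bit lane. This is only a sketch, not the code from this change: the
function names are hypothetical, and it assumes each 8-bit channel sits in
the upper byte of its lane and that a "shift right and insert" operation
(as the A64 SRI instruction provides) is available. The first packer
narrows to 8 bits and then widens again; the second keeps the values at
the top of the lane and only shifts down.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of "shift right and insert" on one 16-bit lane: the top
// `shift` bits of dst are kept, the rest are replaced by src >> shift.
// Valid for shift in the range 1..15.
static inline uint16_t ShiftRightInsert16(uint16_t dst, uint16_t src,
                                          int shift) {
  const uint16_t kept_mask = static_cast<uint16_t>(~(0xFFFFu >> shift));
  return static_cast<uint16_t>((dst & kept_mask) | (src >> shift));
}

// Baseline (hypothetical name): channels already narrowed to 8 bits are
// shifted down to drop precision, then shifted back up into position.
static inline uint16_t PackRgb565NarrowThenWiden(uint8_t r, uint8_t g,
                                                 uint8_t b) {
  return static_cast<uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Sketch of the top-of-lane approach (hypothetical name): each input holds
// its channel in bits 15..8, so two shift-right-inserts build RGB565.
static inline uint16_t PackRgb565FromTop(uint16_t r_top, uint16_t g_top,
                                         uint16_t b_top) {
  uint16_t out = r_top;                    // R occupies bits 15..8
  out = ShiftRightInsert16(out, g_top, 5);   // keep 5 bits of R, insert G
  out = ShiftRightInsert16(out, b_top, 11);  // keep R and 6 bits of G, insert B
  return out;
}

// Checks that both packers agree over a sampled grid of channel values.
static inline void CheckPackersAgree() {
  for (int r = 0; r <= 255; r += 3) {
    for (int g = 0; g <= 255; g += 3) {
      for (int b = 0; b <= 255; b += 3) {
        const uint16_t expected = PackRgb565NarrowThenWiden(
            static_cast<uint8_t>(r), static_cast<uint8_t>(g),
            static_cast<uint8_t>(b));
        const uint16_t actual = PackRgb565FromTop(
            static_cast<uint16_t>(r << 8), static_cast<uint16_t>(g << 8),
            static_cast<uint16_t>(b << 8));
        assert(expected == actual);
      }
    }
  }
}
```

In this model any bits below the channel in each lane are overwritten or
shifted out by the inserts, so no separate narrowing step is needed.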

Observed reduction in runtime compared to the existing Neon code:

Cortex-A510: -41.1%
Cortex-A520: -38.2%
Cortex-A715: -21.5%
Cortex-A720: -21.6%
  Cortex-X2: -21.6%
  Cortex-X3: -22.0%
  Cortex-X4: -23.5%
Cortex-X925: -21.7%

Bug: b/42280942
Change-Id: Id84872141435566bbf94a4bbf0227554b5b5fb91
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5802966
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-10-24 21:27:39 +00:00
compare_common.cc clang-tidy applied 2021-04-01 21:42:47 +00:00
compare_gcc.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
compare_msa.cc use unix line endings 2018-06-20 23:19:59 +00:00
compare_neon64.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
compare_neon.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
compare_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
compare.cc [AArch64] Add Neon implementation of HashDjb2 2024-05-01 19:37:31 +00:00
convert_argb.cc [AArch64] Add SVE2 implementation of I422ToRGB565Row 2024-10-24 21:27:39 +00:00
convert_from_argb.cc [AArch64] Add I8MM implementation of ARGBToUV444Row 2024-07-16 17:32:52 +00:00
convert_from.cc Change ScalePlane,ScalePlane_16,... to return int 2023-11-03 23:53:24 +00:00
convert_jpeg.cc PlaneScale, UVScale and ARGBScale test 3x and 4x down sample. 2020-10-28 20:41:59 +00:00
convert_to_argb.cc Make functions that malloc check for ubsan math overflow 2024-10-08 21:08:34 +00:00
convert_to_i420.cc Make functions that malloc check for ubsan math overflow 2024-10-08 21:08:34 +00:00
convert.cc [AArch64] Use full Neon vectors in RGB565To{ARGB,UV,Y}Row_NEON 2024-09-16 04:35:47 +00:00
cpu_id.cc Untangle arm and aarch64 #ifdefs in GetCpuFlags() 2024-09-20 23:40:19 +00:00
mjpeg_decoder.cc Add AMXINT8 cpu detect 2024-02-15 21:44:47 +00:00
mjpeg_validate.cc Update to r1732 for more robust jpeg 2019-07-01 22:32:36 +00:00
planar_functions.cc [AArch64] Add Neon dot-product implementation for ARGBSepiaRow 2024-09-16 04:31:35 +00:00
rotate_any.cc [AArch64] Fix rotate by odd sizes 2024-07-15 18:13:31 +00:00
rotate_argb.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
rotate_common.cc [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
rotate_gcc.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
rotate_lsx.cc [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
rotate_msa.cc cpuid show vector length on ARM and RISCV 2024-07-02 18:10:56 +00:00
rotate_neon64.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
rotate_neon.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
rotate_sme.cc [AArch64] Re-enable SME only for Linux and new versions of Clang 2024-09-23 09:29:53 +00:00
rotate_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
rotate.cc Rotate use NULL for C compatability 2024-07-23 18:02:47 +00:00
row_any.cc [AArch64] Unroll ARGB1555ToARGBRow_NEON to use full Neon vectors 2024-09-16 04:36:43 +00:00
row_common.cc Change ARGBMultiplyRow_C to match Neon 2024-09-23 21:48:33 +00:00
row_gcc.cc Convert16To8Row_AVX512BW using vpmovuswb 2024-08-15 20:13:33 +00:00
row_lasx.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
row_lsx.cc [AArch64] Fix SVE/SME vector length printing in cpuid 2024-07-02 19:44:41 +00:00
row_msa.cc Fix Bugs on mips platform V2. 2022-03-01 13:16:31 +00:00
row_neon64.cc [AArch64] Unroll ARGB1555ToARGBRow_NEON to use full Neon vectors 2024-09-16 04:36:43 +00:00
row_neon.cc Fix -Wmissing-prototypes warnings 2024-08-12 19:08:24 +00:00
row_rvv.cc Fix -Wmissing-prototypes warnings 2024-08-12 19:08:24 +00:00
row_sve.cc [AArch64] Add SVE2 implementation of I422ToRGB565Row 2024-10-24 21:27:39 +00:00
row_win.cc Fix tidy warning that uint32_t dither4 should not be const 2023-06-02 00:42:02 +00:00
scale_any.cc [AArch64] Unroll and use TBL in ScaleRowDown34_NEON 2024-09-16 15:37:27 +00:00
scale_argb.cc Make functions that malloc check for ubsan math overflow 2024-10-08 21:08:34 +00:00
scale_common.cc Fix warnings for missing prototypes 2023-06-30 17:46:56 +00:00
scale_gcc.cc cpuid show vector length on ARM and RISCV 2024-07-02 18:10:56 +00:00
scale_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
scale_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_neon64.cc [AArch64] Rework data loading in ScaleFilterCols_NEON 2024-10-24 21:25:23 +00:00
scale_neon.cc scale_neon.cc: Fix -Wmissing-prototypes warnings 2024-08-13 03:50:51 +00:00
scale_rgb.cc Make functions that malloc check for ubsan math overflow 2024-10-08 21:08:34 +00:00
scale_rvv.cc Add volatile for gcc inline to avoid being removed 2024-07-02 01:25:24 +00:00
scale_uv.cc Fix for ARGB scaling down by 4x horizontally but not vertically 2024-09-24 18:00:47 +00:00
scale_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
scale.cc ScalePlaneDown34: test dst_width%24 == 0 for armv7 2024-09-27 23:00:19 +00:00
test.sh Optimze ABGRToI420 for AVX2 2020-06-04 18:24:45 +00:00
video_common.cc Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00