George Steed 73f6e82b1a [AArch64] Add missing clobber, fix zero-init for compare kernels
The "memory" clobber needs to be present even if the asm does not store
anything to memory, since otherwise the compiler would be allowed to
move earlier stores to the pointed-to data past the asm that reads that
data.
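
A minimal sketch of the clobber rule described above. It is not code from libyuv: `LoadViaAsm` is a hypothetical helper, and the non-AArch64 branch exists only so the example builds on other targets.

```cpp
#include <cstdint>

// Hypothetical helper (not from libyuv): reads one 32-bit value through `p`
// from inside an inline asm statement.
uint32_t LoadViaAsm(const uint32_t* p) {
#if defined(__aarch64__)
  uint32_t v;
  asm volatile("ldr %w0, [%1]"
               : "=r"(v)     // output: the loaded value
               : "r"(p)      // input: only the pointer, not the data behind it
               : "memory");  // the asm reads *p, so earlier stores to *p must
                             // not be moved past this statement
  return v;
#else
  return *p;  // portable fallback, assumed here purely for illustration
#endif
}
```

Without the "memory" clobber the compiler only knows that the asm consumes the pointer value `p`. It would then be free to sink a preceding `*p = ...` store below the asm, and the load could observe stale data.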

Also fix up the zero-initialisation of accumulators in
SumSquareError_NEON, since EOR'ing a register by itself is not a
recognised zeroing idiom on most AArch64 micro-architectures.
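
The zero-initialisation point can be sketched as follows. This is an illustration only, not the actual SumSquareError_NEON code: `SumFourLanes` is a hypothetical helper, and it assumes that MOVI with an immediate of zero is the preferred way to clear a vector register, in place of `eor v, v, v`.

```cpp
#include <cstdint>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper (not from libyuv): zeroes an accumulator, adds four
// 32-bit values into it and returns the horizontal sum.
uint32_t SumFourLanes(const uint32_t* src) {
#if defined(__aarch64__)
  uint32x4_t acc;
  // MOVI with #0 has no dependency on the previous register contents,
  // unlike "eor %0.16b, %0.16b, %0.16b", which reads the register it clears.
  asm("movi %0.4s, #0" : "=w"(acc));
  acc = vaddq_u32(acc, vld1q_u32(src));  // accumulate four lanes
  return vaddvq_u32(acc);                // add across lanes
#else
  // Portable fallback so the sketch builds on other targets.
  return src[0] + src[1] + src[2] + src[3];
#endif
}
```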

Bug: libyuv:976
Change-Id: I3175367abf6f59db8371b4478f1156950277d7c5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5378705
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-04-19 06:38:06 +00:00
compare_common.cc clang-tidy applied 2021-04-01 21:42:47 +00:00
compare_gcc.cc MT2T Warning fixes for fuchsia 2022-12-06 19:54:40 +00:00
compare_msa.cc use unix line endings 2018-06-20 23:19:59 +00:00
compare_neon64.cc [AArch64] Add missing clobber, fix zero-init for compare kernels 2024-04-19 06:38:06 +00:00
compare_neon.cc Scale by even factor low level row function 2020-11-03 21:25:18 +00:00
compare_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
compare.cc MT2T Warning fixes for fuchsia 2022-12-06 19:54:40 +00:00
convert_argb.cc [AArch64] Add SVE2 implementation for I444ToARGBRow 2024-04-09 03:11:01 +00:00
convert_from_argb.cc [AArch64] Use Neon dot-product instructions in ARGBToYMatrixRow 2024-04-09 03:09:36 +00:00
convert_from.cc Change ScalePlane,ScalePlane_16,... to return int 2023-11-03 23:53:24 +00:00
convert_jpeg.cc PlaneScale, UVScale and ARGBScale test 3x and 4x down sample. 2020-10-28 20:41:59 +00:00
convert_to_argb.cc Remove M420 and refactor NV12ToI420 2020-05-26 18:48:00 +00:00
convert_to_i420.cc Fix ConvertToI420 when using YUY2 or UYVY with odd crop_x. 2021-07-19 22:22:22 +00:00
convert.cc [AArch64] Use Neon dot-product instructions in ARGBToYMatrixRow 2024-04-09 03:09:36 +00:00
cpu_id.cc [AArch64] Use getauxval(AT_HWCAP{,2}) for feature detection 2024-04-19 06:37:04 +00:00
mjpeg_decoder.cc Add AMXINT8 cpu detect 2024-02-15 21:44:47 +00:00
mjpeg_validate.cc Update to r1732 for more robust jpeg 2019-07-01 22:32:36 +00:00
planar_functions.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
rotate_any.cc Remove MMI support 2022-01-26 08:41:33 +00:00
rotate_argb.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
rotate_common.cc Fix warnings for missing prototypes 2023-06-30 17:46:56 +00:00
rotate_gcc.cc Transpose 4x4 for SSE2 and AVX2 2023-03-03 17:46:23 +00:00
rotate_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
rotate_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
rotate_neon64.cc MergeUV_AVX512BW for I420ToNV12 2023-02-13 20:14:57 +00:00
rotate_neon.cc GCC warning fix for MT2T 2023-03-16 06:57:20 +00:00
rotate_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
rotate.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
row_any.cc [AArch64] Use Neon dot-product instructions in ARGBToYMatrixRow 2024-04-09 03:09:36 +00:00
row_common.cc [RVV] Support AR64ToAB64 and RGBA-family color conversions 2023-09-05 22:44:48 +00:00
row_gcc.cc YUY2ToARGB use ymm6/7 for shuffle constants 2024-01-22 21:47:23 +00:00
row_lasx.cc AVX10 cpuid detect added 2024-01-10 00:08:22 +00:00
row_lsx.cc Fix compilation errors. 2024-01-03 19:15:56 +00:00
row_msa.cc Fix Bugs on mips platform V2. 2022-03-01 13:16:31 +00:00
row_neon64.cc [AArch64] Load full vectors in ARGB{Add,Subtract}Row 2024-04-18 19:02:43 +00:00
row_neon.cc ARGBAttenuate use (a + b + 255) >> 8 2023-06-16 21:37:53 +00:00
row_rvv.cc [RVV] Support AR64ToAB64 and RGBA-family color conversions 2023-09-05 22:44:48 +00:00
row_sve.cc [AArch64] Add SVE2 implementation for I444ToARGBRow 2024-04-09 03:11:01 +00:00
row_win.cc Fix tidy warning that uint32_t dither4 should not be const 2023-06-02 00:42:02 +00:00
scale_any.cc UVScale down by 2 fix for C and optimize for NEON 2023-04-12 22:49:20 +00:00
scale_argb.cc Add HAS_SCALEARGBROWDOWNEVEN_RVV marco and disable it by default 2023-12-07 22:54:23 +00:00
scale_common.cc Fix warnings for missing prototypes 2023-06-30 17:46:56 +00:00
scale_gcc.cc ScaleRowUp2_Bilinear_12_SSSE3 preserve xmm7 for Windows 2022-10-21 19:35:17 +00:00
scale_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
scale_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_neon64.cc [AArch64] Optimize ScaleARGBRowDown2Box_NEON 2024-04-10 20:07:22 +00:00
scale_neon.cc UVScale down by 2 fix for C and optimize for NEON 2023-04-12 22:49:20 +00:00
scale_rgb.cc RGBScale function using 3 steps: RGB24ToARGB, ARGBScale, ARGBToRGB24 2022-03-19 01:44:06 +00:00
scale_rvv.cc Add HAS_SCALEARGBROWDOWNEVEN_RVV marco and disable it by default 2023-12-07 22:54:23 +00:00
scale_uv.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
scale_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
scale.cc malloc return 1 for failures and assert for internal functions 2023-12-04 22:55:20 +00:00
test.sh Optimze ABGRToI420 for AVX2 2020-06-04 18:24:45 +00:00
video_common.cc Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00