libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-06-15 08:26:06 +08:00

George Steed 949cb623bf	2025-05-22 13:33:06 -07:00

Add SVE2 and SME implementations of I444ToRGB24Row

Move the READYUV444_SVE_2X and I444TORGB_SVE_2X macros to row_sve.h so
they are usable in both SVE2 and SME implementations, and use them to
add new I444ToRGB24Row implementations for SVE2 and SME.

We need to use the unrolled versions here to use the ST3B interleaving
store instructions, since there is no partial vector version of this
store instruction.

Reduction in time taken observed for the new SVE2 implementation,
compared to the existing Neon implementation:

Cortex-A510: -57.6%
Cortex-A520: -38.1%
Cortex-A710: -15.5%
Cortex-A715: -9.2%
Cortex-A720: -9.2%
Cortex-X2: -25.8%
Cortex-X3: -26.2%
Cortex-X4: -23.2%
Cortex-X925: -17.8%

Change-Id: I6acd0b798a35e5352d4fad664769f12d3d938ed7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6530646
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>

compare_common.cc	clang-tidy applied	2021-04-01 21:42:47 +00:00
compare_gcc.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
compare_msa.cc	use unix line endings	2018-06-20 23:19:59 +00:00
compare_neon64.cc	ARGBToUV 64 bit use ymm8 for shuffler	2025-05-12 15:09:40 -07:00
compare_neon.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
compare_win.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
compare.cc	[AArch64] Add Neon implementation of HashDjb2	2024-05-01 19:37:31 +00:00
convert_argb.cc	Add SVE2 and SME implementations of I444ToRGB24Row	2025-05-22 13:33:06 -07:00
convert_from_argb.cc	Add Neon I8MM implementations of ARGB to UV and variants	2025-05-12 11:14:00 -07:00
convert_from.cc	Sub sampling conversions use CopyPlane for Y channel	2025-01-02 13:34:11 -08:00
convert_jpeg.cc	PlaneScale, UVScale and ARGBScale test 3x and 4x down sample.	2020-10-28 20:41:59 +00:00
convert_to_argb.cc	Apply clang format	2025-01-02 13:31:20 -08:00
convert_to_i420.cc	Apply clang format	2025-01-02 13:31:20 -08:00
convert.cc	Add Neon I8MM implementations of ARGB to UV and variants	2025-05-12 11:14:00 -07:00
cpu_id.cc	Detect SME without SVE dependency	2025-03-31 17:27:40 -07:00
mjpeg_decoder.cc	Add AMXINT8 cpu detect	2024-02-15 21:44:47 +00:00
mjpeg_validate.cc	Update to r1732 for more robust jpeg	2019-07-01 22:32:36 +00:00
planar_functions.cc	[AArch64] Add SVE2 and SME implementations for Convert8To8Row	2025-01-28 15:53:26 -08:00
rotate_any.cc	[AArch64] Fix rotate by odd sizes	2024-07-15 18:13:31 +00:00
rotate_argb.cc	Apply clang format	2025-01-02 13:31:20 -08:00
rotate_common.cc	[AArch64] Use full vectors in TransposeWx{8 => 16}_NEON	2024-05-21 07:46:42 +00:00
rotate_gcc.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
rotate_lsx.cc	[AArch64] Use full vectors in TransposeWx{8 => 16}_NEON	2024-05-21 07:46:42 +00:00
rotate_msa.cc	cpuid show vector length on ARM and RISCV	2024-07-02 18:10:56 +00:00
rotate_neon64.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
rotate_neon.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
rotate_sme.cc	[AArch64] Re-enable SME only for Linux and new versions of Clang	2024-09-23 09:29:53 +00:00
rotate_win.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
rotate.cc	[AArch64] Add SME implementation of CopyRow	2024-12-12 03:02:07 -08:00
row_any.cc	Add Neon I8MM implementations of ARGB to UV and variants	2025-05-12 11:14:00 -07:00
row_common.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
row_gcc.cc	ARGBToUV 64 bit use ymm8 for shuffler	2025-05-12 15:09:40 -07:00
row_lasx.cc	Fix unified sources build for LoongArch LASX	2025-04-01 09:48:19 -07:00
row_lsx.cc	Fix unified sources build for LoongArch LASX	2025-04-01 09:48:19 -07:00
row_msa.cc	Fix Bugs on mips platform V2.	2022-03-01 13:16:31 +00:00
row_neon64.cc	ARGBToUV 64 bit use ymm8 for shuffler	2025-05-12 15:09:40 -07:00
row_neon.cc	ARGBToUV allow 32 bit x86 build	2025-04-28 12:11:00 -07:00
row_rvv.cc	Apply clang format	2025-01-02 13:31:20 -08:00
row_sme.cc	Add SVE2 and SME implementations of I444ToRGB24Row	2025-05-22 13:33:06 -07:00
row_sve.cc	Add SVE2 and SME implementations of I444ToRGB24Row	2025-05-22 13:33:06 -07:00
row_win.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
scale_any.cc	[AArch64] Unroll and use TBL in ScaleRowDown34_NEON	2024-09-16 15:37:27 +00:00
scale_argb.cc	RVV disable 64 bit elements and vcombine_v	2025-03-25 12:51:25 -07:00
scale_common.cc	[AArch64] Add SME implementations of InterpolateRow{,_16,_16To8}	2024-12-12 03:03:41 -08:00
scale_gcc.cc	ARGBToUV 64 bit use ymm8 for shuffler	2025-05-12 15:09:40 -07:00
scale_lsx.cc	DetilePlane and unittest for NEON	2022-01-31 20:05:55 +00:00
scale_msa.cc	Switch to C99 types	2018-01-23 19:16:05 +00:00
scale_neon64.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
scale_neon.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
scale_rgb.cc	Apply clang format	2025-01-02 13:31:20 -08:00
scale_rvv.cc	RVV disable 64 bit elements and vcombine_v	2025-03-25 12:51:25 -07:00
scale_sme.cc	Apply clang format	2025-01-02 13:31:20 -08:00
scale_uv.cc	Apply clang format	2025-01-02 13:31:20 -08:00
scale_win.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
scale.cc	J420ToI420 using planar 8 bit scaling	2025-01-22 02:50:24 -08:00
test.sh	Optimze ABGRToI420 for AVX2	2020-06-04 18:24:45 +00:00
video_common.cc	Lint cleanup after C99 change CL	2018-01-24 19:16:03 +00:00