Frank Barchard 1b1c058787 ARGBToUV for SSE use pshufb/pmaddubsw
Was: ARGBToJ420_Opt (377 ms)
Now: ARGBToJ420_Opt (340 ms)

Bug: None
Change-Id: Iada2d6e9ecdb141b9e2acbdf343f890e4aaebe34
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6967754
Reviewed-by: Justin Green <greenjustin@google.com>
2025-09-19 12:39:39 -07:00
compare_common.cc clang-tidy applied 2021-04-01 21:42:47 +00:00
compare_gcc.cc ARGBToJ444 use 256 for fixed point scale UV 2025-02-27 13:04:15 -08:00
compare_msa.cc use unix line endings 2018-06-20 23:19:59 +00:00
compare_neon64.cc Add hybrid detect for Intel laptop cpus 2025-06-13 13:22:54 -07:00
compare_neon.cc Apply format with no code changes 2025-02-24 23:57:01 -08:00
compare_win.cc ARGBToJ444 use 256 for fixed point scale UV 2025-02-27 13:04:15 -08:00
compare.cc [AArch64] Add Neon implementation of HashDjb2 2024-05-01 19:37:31 +00:00
convert_argb.cc Add SVE2 and SME implementations of I422ToAR30Row 2025-05-27 11:39:00 -07:00
convert_from_argb.cc [AArch64] Add SME implementation of ARGBToUVRow and similar 2025-06-30 09:20:23 -07:00
convert_from.cc Sub sampling conversions use CopyPlane for Y channel 2025-01-02 13:34:11 -08:00
convert_jpeg.cc PlaneScale, UVScale and ARGBScale test 3x and 4x down sample. 2020-10-28 20:41:59 +00:00
convert_to_argb.cc Apply clang format 2025-01-02 13:31:20 -08:00
convert_to_i420.cc Apply clang format 2025-01-02 13:31:20 -08:00
convert.cc [AArch64] Add SME implementation of ARGBToUVRow and similar 2025-06-30 09:20:23 -07:00
cpu_id.cc loong64: Use HWCAP instead of CPUCFG to detect LSX/LASX 2025-07-24 23:43:54 -07:00
mjpeg_decoder.cc Add AMXINT8 cpu detect 2024-02-15 21:44:47 +00:00
mjpeg_validate.cc Update to r1732 for more robust jpeg 2019-07-01 22:32:36 +00:00
planar_functions.cc [AArch64] Add SME implementation of Convert8To16Row_SME 2025-06-23 11:32:56 -07:00
rotate_any.cc [AArch64] Fix rotate by odd sizes 2024-07-15 18:13:31 +00:00
rotate_argb.cc Apply clang format 2025-01-02 13:31:20 -08:00
rotate_common.cc [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
rotate_gcc.cc ARGBToJ444 use 256 for fixed point scale UV 2025-02-27 13:04:15 -08:00
rotate_lsx.cc [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
rotate_msa.cc cpuid show vector length on ARM and RISCV 2024-07-02 18:10:56 +00:00
rotate_neon64.cc Apply format with no code changes 2025-02-24 23:57:01 -08:00
rotate_neon.cc Apply format with no code changes 2025-02-24 23:57:01 -08:00
rotate_sme.cc [AArch64] Re-enable SME only for Linux and new versions of Clang 2024-09-23 09:29:53 +00:00
rotate_win.cc ARGBToJ444 use 256 for fixed point scale UV 2025-02-27 13:04:15 -08:00
rotate.cc [AArch64] Add SME implementation of CopyRow 2024-12-12 03:02:07 -08:00
row_any.cc [AArch64] Add SME implementation of ARGBToUVRow and similar 2025-06-30 09:20:23 -07:00
row_common.cc ARGBToUV SSE use average of 4 pixels 2025-06-17 11:55:27 -07:00
row_gcc.cc ARGBToUV for SSE use pshufb/pmaddubsw 2025-09-19 12:39:39 -07:00
row_lasx.cc loong64: UV subsample's 4-pixel rounding average and ARGBToJ444 fixed-point scaling 2025-09-03 12:22:44 -07:00
row_lsx.cc loong64: UV subsample's 4-pixel rounding average and ARGBToJ444 fixed-point scaling 2025-09-03 12:22:44 -07:00
row_msa.cc Fix Bugs on mips platform V2. 2022-03-01 13:16:31 +00:00
row_neon64.cc [AArch64] Fix compilation due to incorrect register constraint 2025-08-05 11:23:20 -07:00
row_neon.cc Convert8To16 use VPSRLW instead of VPMULHUW for better lunarlake performance 2025-08-04 12:42:50 -07:00
row_rvv.cc Apply clang format 2025-01-02 13:31:20 -08:00
row_sme.cc Convert8To16 use VPSRLW instead of VPMULHUW for better lunarlake performance 2025-08-04 12:42:50 -07:00
row_sve.cc [AArch64] Add SME implementation of ARGBToUVRow and similar 2025-06-30 09:20:23 -07:00
row_win.cc Convert8To16 use VPSRLW instead of VPMULHUW for better lunarlake performance 2025-08-04 12:42:50 -07:00
scale_any.cc [AArch64] Unroll and use TBL in ScaleRowDown34_NEON 2024-09-16 15:37:27 +00:00
scale_argb.cc RVV disable 64 bit elements and vcombine_v 2025-03-25 12:51:25 -07:00
scale_common.cc [AArch64] Add SME implementations of InterpolateRow{,_16,_16To8} 2024-12-12 03:03:41 -08:00
scale_gcc.cc ARGBToUV 64 bit use ymm8 for shuffler 2025-05-12 15:09:40 -07:00
scale_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
scale_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_neon64.cc Apply format with no code changes 2025-02-24 23:57:01 -08:00
scale_neon.cc Apply format with no code changes 2025-02-24 23:57:01 -08:00
scale_rgb.cc Apply clang format 2025-01-02 13:31:20 -08:00
scale_rvv.cc RVV disable 64 bit elements and vcombine_v 2025-03-25 12:51:25 -07:00
scale_sme.cc Apply clang format 2025-01-02 13:31:20 -08:00
scale_uv.cc Apply clang format 2025-01-02 13:31:20 -08:00
scale_win.cc ARGBToJ444 use 256 for fixed point scale UV 2025-02-27 13:04:15 -08:00
scale.cc J420ToI420 using planar 8 bit scaling 2025-01-22 02:50:24 -08:00
test.sh Optimize ABGRToI420 for AVX2 2020-06-04 18:24:45 +00:00
video_common.cc Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00