Frank Barchard 893eacf9b4 ARGBToY for AVX512
- add ARGBToYMatrixRow_AVX512BW
- refactor SSE and AVX to use Matrix functions, making old functions
  call the new ones.

Zen5 1280x720
Was AVX2   LibYUVConvertTest.ARGBToI444_Opt (1125 ms)
Now AVX512 LibYUVConvertTest.ARGBToI444_Opt (641 ms)

Details by Gemini:
  1. Created 3 new Matrix functions:
    Added ARGBToYMatrixRow_SSSE3, ARGBToYMatrixRow_AVX2, and
    ARGBToYMatrixRow_AVX512BW to source/row_gcc.cc. These take the
    const struct ArgbConstants* c parameter, like
    ARGBToUV444MatrixRow_*. The x86 vector code derives the values it
    needs from the fields of the constants struct at run time, and the
    AVX512 code uses vpmaddwd to make up for the lack of a native
    vphaddw.
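
    A scalar sketch of the idea in item 1, not the code from this change:
    the struct layout, the names SketchArgbConstants, kSketchI601 and
    SketchARGBToYMatrixRow_C, and the B, G, R, A byte order are assumptions
    made for illustration. The weights are the common 8-bit fixed-point
    BT.601 limited-range values. The two stages mirror a pairwise
    multiply-add followed by a horizontal add; a multiply-add of 16-bit
    pairs against a vector of ones (what vpmaddwd computes) gives the same
    pair sums a horizontal add would.

```c
#include <stdint.h>

// Hypothetical stand-in for the constants struct; the real ArgbConstants
// layout is not shown in this commit message.
struct SketchArgbConstants {
  int16_t coeff[4];  // weights for the 4 bytes of a pixel, in memory order
  int32_t bias;      // rounding term plus output offset, pre-scaled by 256
};

// BT.601 limited-range weights for pixels stored as B, G, R, A (assumed).
static const struct SketchArgbConstants kSketchI601 = {{25, 129, 66, 0},
                                                       0x1080};

// Scalar model of a matrix-driven row function: one Y byte per 4-byte pixel.
static void SketchARGBToYMatrixRow_C(const uint8_t* src_argb,
                                     uint8_t* dst_y,
                                     int width,
                                     const struct SketchArgbConstants* c) {
  for (int x = 0; x < width; ++x) {
    const uint8_t* p = src_argb + x * 4;
    // Stage 1: multiply-add adjacent byte pairs, giving two partial sums.
    int32_t lo = p[0] * c->coeff[0] + p[1] * c->coeff[1];
    int32_t hi = p[2] * c->coeff[2] + p[3] * c->coeff[3];
    // Stage 2: add the neighbouring partial sums. Multiplying each by 1 and
    // summing the pair is what a multiply-add against all-ones produces.
    int32_t sum = lo * 1 + hi * 1;
    dst_y[x] = (uint8_t)((sum + c->bias) >> 8);
  }
}
```

    With these weights a white pixel maps to 235 and a black pixel to 16,
    the usual limited-range luma endpoints.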

  2. Replaced Old Functions with Wrappers:
    Modified the existing implementations of ARGBToYRow_SSSE3,
    ARGBToYJRow_SSSE3, ABGRToYRow_SSSE3, ABGRToYJRow_SSSE3,
    RGBAToYRow_SSSE3, RGBAToYJRow_SSSE3, BGRAToYRow_SSSE3 (and their
    _AVX2 equivalents) in source/row_gcc.cc to act as inline wrappers
    calling the new ARGBToYMatrixRow_* functions, passing the right
    matrix parameters (e.g. &kArgbI601Constants, &kArgbJPEGConstants,
    &kAbgrI601Constants).
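
    The wrapper pattern in item 2 might look roughly like the sketch
    below. Everything prefixed Demo or kDemo is hypothetical, as is the
    struct layout; only the general shape (one matrix-driven function,
    thin per-format wrappers that pass a constants table) comes from the
    description above. The sketch assumes ARGB is stored B, G, R, A in
    memory and ABGR is stored R, G, B, A, and uses the common 8-bit
    fixed-point BT.601 and full-range (JPEG) luma weights.

```c
#include <stdint.h>

// Hypothetical constants struct, for illustration only.
struct DemoArgbConstants {
  int16_t coeff[4];  // per-byte weights in memory order
  int32_t bias;      // rounding and offset, pre-scaled by 256
};

static const struct DemoArgbConstants kDemoArgbI601 = {{25, 129, 66, 0},
                                                       0x1080};
static const struct DemoArgbConstants kDemoArgbJPEG = {{29, 150, 77, 0}, 128};
// Same weights as kDemoArgbI601 with the R and B positions swapped.
static const struct DemoArgbConstants kDemoAbgrI601 = {{66, 129, 25, 0},
                                                       0x1080};

// The single shared implementation; a SIMD build would replace this body.
static void DemoToYMatrixRow(const uint8_t* src,
                             uint8_t* dst_y,
                             int width,
                             const struct DemoArgbConstants* c) {
  for (int x = 0; x < width; ++x) {
    const uint8_t* p = src + x * 4;
    int32_t sum = p[0] * c->coeff[0] + p[1] * c->coeff[1] +
                  p[2] * c->coeff[2] + p[3] * c->coeff[3];
    dst_y[x] = (uint8_t)((sum + c->bias) >> 8);
  }
}

// The old entry points become one-line wrappers around the matrix function.
static inline void DemoARGBToYRow(const uint8_t* src, uint8_t* dst_y, int w) {
  DemoToYMatrixRow(src, dst_y, w, &kDemoArgbI601);
}
static inline void DemoARGBToYJRow(const uint8_t* src, uint8_t* dst_y, int w) {
  DemoToYMatrixRow(src, dst_y, w, &kDemoArgbJPEG);
}
static inline void DemoABGRToYRow(const uint8_t* src, uint8_t* dst_y, int w) {
  DemoToYMatrixRow(src, dst_y, w, &kDemoAbgrI601);
}
```

    Callers keep their existing signatures, and supporting another byte
    order or colour matrix only needs a new constants table.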

  3. Added row_any.cc Handlers:
    Added ANY11MC definitions to source/row_any.cc to autogenerate
    ARGBToYMatrixRow_Any_SSSE3, ARGBToYMatrixRow_Any_AVX2, and
    ARGBToYMatrixRow_Any_AVX512BW, which safely handle non-aligned
    tails.
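
    A minimal sketch of how an Any_ style wrapper can handle a width that
    is not a multiple of the kernel's block size. It is not the ANY11MC
    macro itself: DemoKernel16, DemoToYRow_Any and the 16-pixel block are
    hypothetical. The body is processed in place, and the leftover pixels
    are copied into a padded scratch buffer so the kernel never reads or
    writes past the caller's row.

```c
#include <stdint.h>
#include <string.h>

enum { kBlock = 16 };  // pixels per step of the hypothetical SIMD kernel

// Stand-in for a SIMD kernel that only accepts widths that are a multiple
// of kBlock. Written in scalar C here so the sketch runs anywhere.
static void DemoKernel16(const uint8_t* src_argb, uint8_t* dst_y, int width) {
  for (int x = 0; x < width; ++x) {
    const uint8_t* p = src_argb + x * 4;
    dst_y[x] = (uint8_t)((25 * p[0] + 129 * p[1] + 66 * p[2] + 0x1080) >> 8);
  }
}

// Wrapper that accepts any width.
static void DemoToYRow_Any(const uint8_t* src_argb, uint8_t* dst_y, int width) {
  uint8_t tmp_in[kBlock * 4];
  uint8_t tmp_out[kBlock];
  int tail = width % kBlock;  // pixels left over after whole blocks
  int body = width - tail;    // largest multiple of kBlock
  if (body > 0) {
    DemoKernel16(src_argb, dst_y, body);
  }
  if (tail > 0) {
    memset(tmp_in, 0, sizeof(tmp_in));  // pad so the kernel reads defined data
    memcpy(tmp_in, src_argb + body * 4, (size_t)tail * 4);
    DemoKernel16(tmp_in, tmp_out, kBlock);  // one full block on scratch
    memcpy(dst_y + body, tmp_out, (size_t)tail);  // keep only valid outputs
  }
}
```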

  4. Updated include/libyuv/row.h:
    Added the void function declarations for all new Matrix and Any_
    variants. Also defined HAS_ARGBTOYROW_AVX512BW in the CPU macros.
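
    A feature macro of this kind is typically used to compile a faster
    path in or out and then pick it at run time. The sketch below shows
    that general pattern only; DEMO_HAS_TOYROW_WIDE, DemoRowFn,
    DemoToYRow_C, DemoToYRow_Wide and DemoPickToYRow are all hypothetical
    names, and the cpu_has_wide flag stands in for a real CPU feature
    query.

```c
#include <stdint.h>

#define DEMO_HAS_TOYROW_WIDE 1  // hypothetical analogue of a HAS_* macro

typedef void (*DemoRowFn)(const uint8_t* src_argb, uint8_t* dst_y, int width);

// Portable fallback that is always available.
static void DemoToYRow_C(const uint8_t* src_argb, uint8_t* dst_y, int width) {
  for (int x = 0; x < width; ++x) {
    const uint8_t* p = src_argb + x * 4;
    dst_y[x] = (uint8_t)((25 * p[0] + 129 * p[1] + 66 * p[2] + 0x1080) >> 8);
  }
}

#if defined(DEMO_HAS_TOYROW_WIDE)
// Placeholder for a wide SIMD version; it forwards to the C code here so
// the sketch stays portable.
static void DemoToYRow_Wide(const uint8_t* src_argb, uint8_t* dst_y,
                            int width) {
  DemoToYRow_C(src_argb, dst_y, width);
}
#endif

// Chooses a row function once, before looping over the rows of an image.
static DemoRowFn DemoPickToYRow(int cpu_has_wide) {
  DemoRowFn fn = DemoToYRow_C;
#if defined(DEMO_HAS_TOYROW_WIDE)
  if (cpu_has_wide) {
    fn = DemoToYRow_Wide;
  }
#endif
  (void)cpu_has_wide;  // silences the unused warning when the macro is off
  return fn;
}
```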

  5. Tested the Implementations:
    Compiled and tested on Linux x86, with all tests passing. All
    Windows 32-bit build checks also completed successfully, guarding
    against 32-bit regressions.

Bug: 477295731
Change-Id: I4f5eec9a961e24a9d760d0a1c0810fb5e29a0bd1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7759494
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-13 17:26:07 -07:00
basic_types.h Disable old int types by default. 2018-07-09 21:16:47 +00:00
compare_row.h Deprecate MIPS and MSA support. 2025-10-16 12:20:40 -07:00
compare.h Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00
convert_argb.h Add ABGR versions of the ArgbConstants structures 2026-03-17 17:28:51 -07:00
convert_from_argb.h Unify X86/X64 versions of ARGBToI4xxMatrix functions 2026-03-18 16:27:07 -07:00
convert_from.h Add 10/12 bit YUV To YUV functions 2021-02-25 23:16:54 +00:00
convert.h Forward-declare ArgbConstants in convert.h to fix visibility error 2026-04-09 08:53:56 -07:00
cpu_id.h Experimental SVE FMMLA detect 2025-10-27 14:34:55 -07:00
cpu_support.h Disable Arm SME and SVE assembly code under MSan 2025-04-03 11:27:31 -07:00
loongson_intrinsics.h RAWToJ400 faster version for ARM 2022-03-18 07:22:36 +00:00
mjpeg_decoder.h add YUV24 and AYUV formats 2019-03-05 02:53:56 +00:00
planar_functions.h Remove duplicate code in planar_functions.h 2025-04-04 15:48:23 -07:00
rotate_argb.h Switch to C99 types 2018-01-23 19:16:05 +00:00
rotate_row.h Deprecate MIPS and MSA support. 2025-10-16 12:20:40 -07:00
rotate.h Add 10 bit rotate methods. 2023-01-04 21:10:01 +00:00
row_sve.h [AArch64] Add SME implementation of ARGBToUVRow and similar 2025-06-30 09:20:23 -07:00
row.h ARGBToY for AVX512 2026-04-13 17:26:07 -07:00
scale_argb.h Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_rgb.h RGBScale function using 3 steps: RGB24ToARGB, ARGBScale, ARGBToRGB24 2022-03-19 01:44:06 +00:00
scale_row.h RVV: Enable some function for intrinsic >= v1.0 2025-10-17 11:44:14 -07:00
scale_uv.h add yuvconvstants util 2021-02-12 19:45:16 +00:00
scale.h Add NV24 scaling support to libyuv 2024-12-12 02:46:11 -08:00
version.h ARGBToY for AVX512 2026-04-13 17:26:07 -07:00
video_common.h Add support for AR64 format 2021-03-13 20:55:21 +00:00