22 Commits

Author SHA1 Message Date
Frank Barchard
62c19d062d [libyuv] Remove all x86 SSE optimizations
Removed all SSE functions, macros, dispatching logic, and related
unit tests across the repository to reduce code size and complexity.
Left cpuid detection intact. Supported architectures like AVX2, NEON,
SVE, etc. are unaffected.

R=rrwinterton@gmail.com

Bug: None
Test: Build and run libyuv_unittest
Change-Id: Id19608dba35b79c4c8fc31f920a6a968883d300f
2026-04-29 16:56:03 -07:00
Wan-Teh Chang
2895faed32 Use GTEST_SKIP() macro to skip TestI400LargeSize
Change-Id: I3a8f5c498b07f26dea5468fbecad9081f8bbe6d5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800542
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 12:30:58 -07:00
Frank Barchard
d445250d8b Replace RAWToY/RGB24ToY with RGBToYMatrix
Bug: libyuv:42280902
Change-Id: I6ddebd492036c416550fc045eb39493dea73246b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7784094
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-21 17:11:14 -07:00
Frank Barchard
81f698829b Add RGBToNV21Matrix function
- implement wrappers with RAW, RGB24, NV21 and JNV21 to call it.

Zen5
Was [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1146 ms)
Now [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1446 ms)
reason - the new code uses 1 pass for RAWToY but 2 pass for RAWToARGB,ARGBToUV.  needs 1 RGBToUV

Bug: libyuv:42280902
Change-Id: Ife6fbed0829484045409e6d42b85cec1d1fd6052
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7780026
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-20 18:03:34 -07:00
Frank Barchard
9f13b2814d add RGBToYMatrixRow_AVX2
Adds RGBToYMatrixRow_AVX2 which reads 24 bit RGB values by reading 3 vectors instead of 4 and permutes them into 4 ARGB vectors before conversion.
Also adds RGBToYMatrixRow_Opt and RGBToYMatrixRow_2Step_Opt to convert_argb_test.cc to benchmark and compare the direct AVX2 conversion vs a 2-step approach.

./libyuv_test '--gunit_filter=*RAWToJ400_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1

AMD Zen 5
Was LibYUVConvertTest.RAWToJ400_Opt (757 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (699 ms)

Intel Skylake
Was LibYUVConvertTest.RAWToJ400_Opt (1705 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (1426 ms)

Bug: 477295731
Change-Id: I29866baf4ad5fe7a3725e4a01f2fe24649510a7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7777325
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-20 12:52:44 -07:00
Frank Barchard
ddc6764d13 ARGBToUVMatrixRow_RVV replace vlseg8 with vlseg4,
implementing horizontal paired adds and accumulation to improve
performance on SiFive x280, and fixes the remainder logic to use valid
vlseg4 loads. Adds TestARGBToUVRow_Any to test odd-width remainder
handling.

Also fixes a build break for non-RVV compilations by ensuring all RVV
functions and their closing cplusplus braces are correctly wrapped in
#if !defined(LIBYUV_DISABLE_RVV).

Also adds NV12ToNV21 as a macro alias for NV21ToNV12 in
planar_functions.h, as the conversion is bidirectional (swapping byte
pairs in the interleaved chroma plane). (Patch from
https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762904)

Bug: libyuv:42280902
Change-Id: If2d6cbb3e232d63d43e32aba33fa9b2eee8190e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7772164
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-17 15:04:45 -07:00
Frank Barchard
94644361b4 row_win.cc rewrite into intrinsics
- remove inline asm which was only for 32 bit
- add ARGBToYMatrixRow_AVX2
- add gn flag libyuv_enable_rowwin=true

Example of building with GN and Ninja:

Without the new flag:
  gn gen out/Release "--args=is_debug=false"
  ninja -C out/Release

With the new flag:
 gn gen out/Release "--args=is_debug=false libyuv_enable_rowwin=true"
 ninja -C out/Release

Bug: libyuv:42280806, 477295731, libyuv:42280902, libyuv:439628764
R=​dalecurtis@chromium.org, rrwinterton@gmail.com

Change-Id: I451bf814622fba690005c02fbf5816819c6a08c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7765790
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-15 19:53:16 -07:00
Frank Barchard
4f4e1ac553 Fix 2 failing golden tests
- Add ifdef for LIBYUV_UNLIMITED_DATA

Fixed by Gemini just telling it how to build and run the test and to fix it.

Bug: libyuv:353545922
Change-Id: I117a25b75b9616ee2ce6122aa163c2085ed4dc7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7742120
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-09 11:51:13 -07:00
Frank Barchard
a61882c049 ARGBToUV AVX2 for x86_64
Icelake
Was SSSE3+SSSE3 ARGBToJ420_Opt (356 ms)
Was SSSE3+AVX2  ARGBToJ420_Opt (301 ms)
Now AVX2+AVX2   ARGBToJ420_Opt (227 ms)

Change-Id: I2cb427bc164b225b3ad4c5f43c09d6da6ca496d5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6943036
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2025-09-16 11:33:54 -07:00
Frank Barchard
5eea781282 Fix util/cpuid hybrid detect
- clang-format applied

Bug: None
Change-Id: If8aec0bbb3d3461886f176a77e029833f5dc197d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6805445
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-08-04 16:29:34 -07:00
Frank Barchard
cdd3bae848 TestI400LargeSize fix for warning message build error
- change %ld to %zd for size_t printf warnings
- disable TestI400LargeSize when disabling SLOW_TESTS
- disable cpuid tests that read proc/cpuinfo test data files
- add ifdef around timers to allow hexagon build
- remove faulty hybrid detect
- remove old mips LIBYUV_DISABLE_DSPR2 reference in gyp build
- apply clang-format

Bug: 434382656
Change-Id: Id74812e6ef29d4a8d0ff967a9189d249b80816d4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6812825
Reviewed-by: Jeremy Leconte <jleconte@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2025-08-01 12:03:11 -07:00
Frank Barchard
3ff31b2a5f Make LibYUVConvertTest.TestI400LargeSize skip test on low end arm cpu
- detect lack of dot product instruction to infer the cpu is low end
- only run the test on higher end arm

Bug: 416842099
Change-Id: Idd2dd16a624bbba280cf531644440024b12f7ecf
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6804632
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2025-07-31 02:41:17 -07:00
Frank Barchard
6f729fbe65 ARGBToUV SSE use average of 4 pixels
- Was using avgb twice for non-exact and C for exact.

On Skylake Xeon:

Now SSE3
ARGBToJ420_Opt (326 ms)

Was
Exact C
ARGBToJ420_Opt (871 ms)
Not exact AVX2
ARGBToJ420_Opt (237 ms)
Not exact SSSE3
ARGBToJ420_Opt (312 ms)

Bug: 381138208
Change-Id: I6d1081bb52e36f06736c0c6575fa82bb2268629b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6629605
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ben Weiss <bweiss@google.com>
2025-06-17 11:55:27 -07:00
Frank Barchard
889613683a Add hybrid detect for Intel laptop cpus
- Add +i8mm build option for sve ARGBToUV which uses usdot
- util/cpuid Get cpu count (windows, macos, linux)
- For each x86 cpu, detect hybrid (e-core)
- Includes a comment fix for ubsan unittest
- Bump version
- Apply clang format to util/*.c as well as all *.cc/*.h

Bug: 424637372
Change-Id: I08310e18051fff62c9e4e4a10d1e4361871119ac
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6635640
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-06-13 13:22:54 -07:00
Frank Barchard
843cda7e7b TestI400LargeSize test __x86_64__, _M_X64, or __aarch64__
- apply clang-format to row_neon64.cc

Bug: 416842099
Change-Id: Ic21f08d8b65bb86cf72eba82d45591f6558170ec
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6634515
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-06-10 15:53:02 -07:00
Frank Barchard
4ac0a3ae3d ubsan compliant '_any' functions using ptrdiff_t for pointer math
Bug: 416842099
Change-Id: I1e3c7bc1b363c11baeb3b529ee78e5ac8878c359
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6634217
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-06-10 15:01:52 -07:00
Frank Barchard
e0040eb318 Apply clang format
Bug: None
Change-Id: I0d9db4b384144523e61ae32b6ab3f72e93a0c265
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6138934
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-01-02 13:31:20 -08:00
Wan-Teh Chang
ec6f15079f Remove unneeded #ifdef HAVE_JPEG code
Change-Id: Ic7e1393b48bec735625197243b3d436ea01cfb07
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5529467
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-05-09 23:02:18 +00:00
Frank Barchard
914624f0b8 YUY2ToARGBMatrix and UYVYToARGBMatrix added to allow any color matrix
Bug: libyuv:971
Change-Id: If15d4598d75500a3717f07d02c0c295fdc58254e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5214453
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-19 21:21:37 +00:00
Frank Barchard
af6ac8265b AVX10 cpuid detect added
Replace unused popcount feature bit

Bug: libyuv:911
Change-Id: Icd88fcc732751d39b0950d5f09a58bc9ac2c4e30
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5179911
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-10 00:08:22 +00:00
Frank Barchard
6dc03dacbf Split scale_test and scale_plane_test to allow building on small devices
Bug: libyuv:956
Change-Id: I1903aa616243e891440ed92836dfb0992d31d4cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5107257
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-12-09 18:39:41 +00:00
Frank Barchard
9e61d7f9c1 Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests.
Update build files to include the new tests and source code.

Bug: libyuv:956
Change-Id: I6ec0beb6dc9570f0597d7df1835d616489dbaece
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5103585
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-12-08 13:39:56 +00:00