2962 Commits

Author SHA1 Message Date
Wan-Teh Chang
c98edcc8dc Don't coalesce rows if width*height would overflow
Audit all occurrences of "width *= height;" in the libyuv source code.
Make sure height > 0 and (ptrdiff_t)width * height <= INT_MAX before
executing width *= height.

Bug: chromium:517339758
Change-Id: I143a41c66492a6e4c48b6aa2a1c4a2ae974ceeb1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7883816
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-29 11:57:47 -07:00
Frank Barchard
e449eb2172 J400ToARGB switch from SSE2 to AVX2
- port for row_win
- remove unused HAS_ macros

Was C/SSE2
MSVC  J400ToARGB_Opt (1967 ms)
Clang J400ToARGB_Opt (568 ms)

Now AVX2
MSVC  J400ToARGB_Opt (411 ms)
Clang J400ToARGB_Opt (418 ms)

Test: libyuv_unittest --gtest_filter=*J400ToARGB*
Bug: libyuv:508639302

Change-Id: Ifdfb026832b708b61f55477250cc5ee52449f421
TAG=agy
CONV=186608fc-966a-4ea7-bf57-9fe07cc1383c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7877368
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
2026-05-28 21:24:32 -07:00
Wan-Teh Chang
904f562d86 Remove redundant #include <stddef.h>
"libyuv/basic_types.h" includes <stddef.h>. So it is not necessary to
include both <stddef.h> and "libyuv/basic_types.h".

Change-Id: I5a461258a3c6820d1007ac635838f910237f367f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7884381
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-28 17:10:22 -07:00
Wan-Teh Chang
ebe6fef903 Fix integer overflow in multiplications of stride
Audit all occurrences of "stride *" in the libyuv source tree. Ensure
that these multiplications are performed in the ptrdiff_t type.

For functions not declared in a public header (such as static
functions), prefer to declare the stride parameters (typically named
src_stride and dst_stride) and related stride local variables as
ptrdiff_t. If this is not possible, add ptrdiff_t casts to the stride
parameters in multiplications. If intptr_t or int64_t casts were used,
change them to ptrdiff_t casts.

Bug: chromium:516986556
Change-Id: I6cd8a8eb00cbb5380db828bf83e4d89ff95891f3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7882967
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-28 14:12:37 -07:00
Andrew Grieve
de63bd90f4 Stop setting --dynamic-linker for target_os="android"
lld ignores it for shared libraries, but wild linker adds an .interp
section because of it. Regardless, llvm has long known to set the
correct linker flag when building for android.

Bug: 40208899
Change-Id: I50dc015946382f9e99289333e7d8b870409ed8d6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7874019
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Andrew Grieve <agrieve@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2026-05-27 04:34:07 -07:00
Frank Barchard
9d98aaefe7 InterpolateRow for Visual C
- remove InterpolateRow_SSSE3
- optimize ARGBToUV444MatrixRow_AVX2 to use unsigned pixels

5.7x faster on AMD Zen4

Was C
TestInterpolatePlane (144 ms)
TestInterpolatePlane_16 (142 ms)

Now AVX2
TestInterpolatePlane (25 ms)
TestInterpolatePlane_16 (48 ms)

Was signed
ARGBToJ444_Opt (157 ms)
Now unsigned
ARGBToJ444_Opt (155 ms)

Bug: None
Change-Id: I903109668ff9cfedaddad1ad75411393b3226f41
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7856498
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-18 17:28:46 -07:00
Frank Barchard
9f751100d2 InterpolateRow_16_AVX2 for row_gcc
On AMD Zen4
Was C
TestInterpolatePlane_16 (143 ms)
Now AVX2
TestInterpolatePlane_16 (48 ms)

Was
I210ToI420_Opt (87 ms)
 35.60% InterpolateRow_16To8_AVX2
 31.03% Convert16To8Row_AVX512BW
 21.35% Convert16To8Row_AVX2

Now
I210ToI420_Opt (69 ms)
 37.57% Convert16To8Row_AVX512BW
 32.69% InterpolateRow_16_AVX2
  7.18% Convert16To8Row_AVX2
  5.23% InterpolateRow_16To8_AVX2

Bug: None
Change-Id: Ica9b9c5dbd847068ae076b682c487e1753d3c812
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7855648
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-18 14:29:36 -07:00
Frank Barchard
cda55fcf53 Mirrow AVX2 functions for Visual C
Bug: libyuv:42280902
Change-Id: Iabbec9af3a4f4dd89294e60145823c7fc4dd6ec6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7843378
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-15 15:05:31 -07:00
Sergey Silkin
0f320a03f7 Fix linear interpolation
C interpolator applied to chroma plane at scaling NV12 on Mac/ARM used
(0x7f ^ f) which is (127-f) instead of (128-f). This resulted in changes
like 128 -> 127 when scaling flat colors and caused visually noticeable
difference.

Bug: b/465721312
Change-Id: Iecf5d2ca2a85602de4146cba7e0f64ecb4b2c1fe
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830198
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
2026-05-13 05:33:33 -07:00
Frank Barchard
c6c8689c74 Fix I444 and J444 parameter names/order
Bug: libyuv:42280902
Change-Id: Ia2c45f2d996d071534b08381f61adf8cb8ef35b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841767
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-12 15:30:35 -07:00
Frank Barchard
dd8b46630a ARGBToUV444MatrixRow_AVX2 intrinsics for Visual C
Was C
LibYUVConvertTest.ARGBToI444_Opt (1027 ms)

Now AVX2
LibYUVConvertTest.ARGBToI444_Opt (310 ms)

Bug: libyuv:508639302
Change-Id: I0bc7f5c5b72160d24226a98d5fddb184a004ed00
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841655
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-12 14:19:58 -07:00
Frank Barchard
cb061d0378 Unittests use ASSERT instead of EXPECT
Bug: libyuv:508639302
Change-Id: I22c35e08f3b6db1a656192877c1fb1bf4e96d6f5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7838659
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-11 19:10:47 -07:00
Frank Barchard
e23282704f ARGBToYRow_AVX512BW preserve XMM6-XMM15 due to Windows stack alignment
Bug: 505124541
Change-Id: Id5ae539f57b314980182bec76a788e33273b2392
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7835639
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-11 13:12:22 -07:00
Frank Barchard
4b4e68b372 ABGRToJ420 call ARGBToI420Matrix
- Standardize libyuv ARGB-family (ARGB, ABGR, RGBA, BGRA) to YUV conversion by utilizing the generic MatrixRow architecture and explicit ArgbConstants.
- Consolidated ARGBToI420, ABGRToI420, BGRAToI420, and RGBAToI420 as wrappers for ARGBToI420Matrix.
- Refactored ABGRToJ420, ABGRToJ422, and ABGRToI422 to use generic matrix functions.
- Added matrix-based versions for NV21, I400, YUY2, and UYVY.
- Updated RAW and RGB24 to I420/I422/I444 dispatchers to use MatrixRow logic and explicit constants.
- Fixed parameter swap bugs in ARGBToI422, ARGBToJ422, and ABGRToJ422.
- Fixed a bug in the generic C implementation of matrix row functions ensuring all 4 channels are processed correctly for all ARGB-family formats.
- Moved kShuffleAARRGGBB in row_gcc.cc to the top of the libyuv namespace for visibility.
- Cleaned up redundant format-specific row implementations.

Bug: libyuv:42280902
Change-Id: I67ffa4c476abc0d2dcc4650510d7bda91b65988e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830291
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-08 15:23:30 -07:00
Frank Barchard
4aacbbdfb4 Refactored RGB/RAW to YUV color conversion functions to use generic Matrix-based functions parameterized by ArgbConstants.
This consolidation standardizes conversion logic, improves code
maintainability, and provides flexible support for various color spaces
(e.g., BT.601, JPEG full
  range).

Key Modifications:
 - Function Consolidation: Refactored several high-level conversion functions into lightweight wrappers around generic Matrix variants:
     - ARGBToI420 → ARGBToI420Matrix
     - ARGBToI444 → ARGBToI444Matrix
     - ARGBToI422 → ARGBToI422Matrix
     - ARGBToNV12 → ARGBToNV12Matrix
     - RAWToJ400, RGB24ToJ400 → RGBToI400Matrix
     - RAWToI444, RAWToJ444 → RGBToI444Matrix
 - 2-Pass Conversions: Updated RGB565ToI420, ARGB1555ToI420, and ARGB4444ToI420 to utilize 2-pass conversions via RGBToI420Matrix.
 - Standardization: Refactored ARGBToNV21, ARGBToYUY2, and ARGBToUYVY to use parameterized matrix row functions (ARGBToYMatrixRow,
   ARGBToUVMatrixRow).
 - Legacy Cleanup: Replaced legacy calls to ARGBToYJRow with the parameterized ARGBToYMatrixRow in the ARGBSobelize helper.
 - Internal Integration: Included libyuv/convert_from_argb.h in planar_functions.cc and ensured all new matrix symbols are properly
   declared/exported (LIBYUV_API).

Bug: libyuv:42280902
Change-Id: Ied5fd9899767427e3a03cdcfbeaff3e9d502374a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7822033
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-06 20:02:47 -07:00
Mirko Bonadei
8773064a72 Revert "Fix rounding in scaling routines."
This reverts commit 37a848135b2b5df6177a6fd42e1d60eb0fa9539d.

Reason for revert: The CL was not ready to land.

Original change's description:
> Fix rounding in scaling routines.
>
> * before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv
> * after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9
> * source: https://screenshot.googleplex.com/9yevVP7URe3XfSm
>
> Bug: b/465721312
> Change-Id: I2aede005db252b2912ceef23379463f176675205
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417
> Reviewed-by: Frank Barchard <fbarchard@google.com>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>

Bug: b/465721312
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Change-Id: Ic3d0de4d4475942bc91fbee17d012bc1b656589f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7816864
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Bot-Commit: rubber-stamper@appspot.gserviceaccount.com <rubber-stamper@appspot.gserviceaccount.com>
2026-05-06 03:30:24 -07:00
Sergey Silkin
37a848135b Fix rounding in scaling routines.
* before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv
* after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9
* source: https://screenshot.googleplex.com/9yevVP7URe3XfSm

Bug: b/465721312
Change-Id: I2aede005db252b2912ceef23379463f176675205
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-05 19:05:43 -07:00
Frank Barchard
125f151316 ARGBToNV12 use Matrix
Refactored Matrix functions (ARGBToI420Matrix, ARGBToI422Matrix, ARGBToI444Matrix and ARGBToNV12Matrix)
  and updated their CPU dispatch logic.

ARGBToNV12 clang
 68.05% ARGBToUVMatrixRow_AVX512BW
 21.04% ARGBToYMatrixRow_AVX512BW
  2.88% MergeUVRow_AVX512BW

ARGBToNV12 rowwin
 61.26% ARGBToUVMatrixRow_AVX2
 25.43% ARGBToYMatrixRow_AVX2
  3.09% MergeUVRow_AVX2

ARM on One Plus 15
 42.98% libyuv::ARGBToUVMatrixRow_SVE_SC()
 38.95% ARGBToYMatrixRow_NEON_DotProd
  2.96% MergeUVRow_NEON
  0.18% ARGBToUVMatrixRow_SVE2

ARGBToI420
 72.28% ARGBToUVMatrixRow_AVX512BW
 19.04% ARGBToYMatrixRow_AVX512BW

ARGBToI422
 77.46% ARGBToUVMatrixRow_AVX512BW
 15.55% ARGBToYMatrixRow_AVX512BW

ARGBToI444
 67.03% ARGBToYMatrixRow_AVX512BW
 24.80% ARGBToUV444MatrixRow_AVX512BW

Bug: libyuv:42280902
Change-Id: I463ebcdb70cb669a1ce1a81102b8fd2fb3943bd3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7819051
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-05 18:17:19 -07:00
Frank Barchard
561a9780e2 YUV to RGB avoid avx assist
Here are the functions flagged for mixing both SSE and AVX (or AVX-512)
instructions, which can trigger an AVX transition/assist performance
penalty:

Libyuv Functions addressed in this CL
   * I422ToARGBRow_AVX512BW
   * HalfFloatRow_SSE2

Not addressed:
   * ScaleFilterCols_SSSE3

Bug: libyuv:509681367
Change-Id: I8ced6065dfe0c516d05857086393782c8590062a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7814945
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-05 12:57:55 -07:00
Frank Barchard
5a17753597 libyuv: Optimize Convert8To8Row_NEON for 32-bit ARM
Benchmark (Convert8To8Plane 1280x720, 1000 repeats):

32-bit: 106 ms -> 44 ms
64-bit: 52 ms (unchanged)
Bug: libyuv:42280902
Change-Id: I389a482f93404984759ef6223d7d191579d3578d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7812450
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-04 13:00:21 -07:00
Frank Barchard
2143edfa7a ARGBToUVMatrixRow_NEON arm32 reimplemented for GCC
Bug: libyuv:508639302
Change-Id: Ib120373d799c66926a64c980873034be262d8848
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7810481
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
2026-05-04 11:38:45 -07:00
Frank Barchard
f2ac6db694 RAWToNV21 using SME, SVE, I8MM or Neon
Pixel 9 Now SVE2 2 pass LibYUVConvertTest.RAWToNV21_Opt (364 ms)
 31.76% libyuv::ARGBToUVMatrixRow_SVE_SC()
 30.38% RAWToARGBRow_SVE2
 26.81% ARGBToYMatrixRow_NEON_DotProd
  3.26% MergeUVRow_NEON

Was NEON 1 pass LibYUVConvertTest.RAWToJNV21_Opt (295 ms)
 44.14% RAWToYJRow_NEON
 41.91% RAWToUVJRow_NEON
  5.11% MergeUVRow_NEON

Clang on Intel Skylake clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt
(301 ms) visual c (row_win) [ OK ] LibYUVConvertTest.RAWToJNV21_Opt
(2056 ms)

clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (275 ms) visual c [ OK ]
LibYUVConvertTest.RAWToJNV21_Opt (365 ms)

Bug: libyuv:42280902
Change-Id: Iaba558ebe96ce6b9881ee9335ba72b8aac390cde
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7802432
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-29 13:11:04 -07:00
Wan-Teh Chang
b438739c8b Use ptrdiff_t for buffer offsets
Use ptrdiff_t instead of intptr_t for buffer offsets, such as stride,
width_temp, and src_step*.

Change-Id: I64e6701fa71ab59c94325a6dad8762d040035208
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800070
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 18:21:42 -07:00
Frank Barchard
9a0226cb3f Add GEMINI.md with guidelines on libyuv
Bug: None
Change-Id: If5d2d84ff88b3c7069f0f6e9c98a4acb76078618
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800069
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-28 16:41:45 -07:00
Wan-Teh Chang
a7849e8a5e Fix yi * src_stride overflow in ScalePlaneVertical
Fix int overflow of yi * src_stride overflow in ScalePlaneVertical(),
ScalePlaneVertical_16(), and ScalePlaneVertical_16To8() by casting the
operand src_stride to ptrdiff_t.

Adapted from the patches by Victor Miura <vmiura@google.com>.

Bug: 505814332
Change-Id: I4a4751041a213f7208b01eb18c43c9e196a36261
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796558
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-04-28 12:34:12 -07:00
Wan-Teh Chang
2895faed32 Use GTEST_SKIP() macro to skip TestI400LargeSize
Change-Id: I3a8f5c498b07f26dea5468fbecad9081f8bbe6d5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800542
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 12:30:58 -07:00
Wan-Teh Chang
54d40344ca No need to cast ptrdiff_t src_stride to intptr_t
ptrdiff_t is the appropriate type for a buffer offset. intptr_t is
intended for a different purpose.

Change-Id: I475c548338b61f573fb11766c24cde6d31fbbed8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796559
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 09:50:46 -07:00
Frank Barchard
4afb965416 RAWToARGB use AVX512BW
Bug: libyuv:42280902
Change-Id: I7a80fd64d97b6d411316819df0fd917d609a173b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787163
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-22 16:56:46 -07:00
Frank Barchard
bd2c4c76ec RAWToARGB AVX512VBMI
Bug: libyuv:42280902
Change-Id: I1c7f432f004079357a00515785bc524c459ed4b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787160
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-22 14:48:29 -07:00
Frank Barchard
d445250d8b Replace RAWToY/RGB24ToY with RGBToYMatrix
Bug: libyuv:42280902
Change-Id: I6ddebd492036c416550fc045eb39493dea73246b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7784094
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-21 17:11:14 -07:00
Frank Barchard
81f698829b Add RGBToNV21Matrix function
- implement wrappers with RAW, RGB24, NV21 and JNV21 to call it.

Zen5
Was [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1146 ms)
Now [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1446 ms)
reason - the new code uses 1 pass for RAWToY but 2 pass for RAWToARGB,ARGBToUV.  needs 1 RGBToUV

Bug: libyuv:42280902
Change-Id: Ife6fbed0829484045409e6d42b85cec1d1fd6052
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7780026
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-20 18:03:34 -07:00
Frank Barchard
9f13b2814d add RGBToYMatrixRow_AVX2
Adds RGBToYMatrixRow_AVX2 which reads 24 bit RGB values by reading 3 vectors instead of 4 and permutes them into 4 ARGB vectors before conversion.
Also adds RGBToYMatrixRow_Opt and RGBToYMatrixRow_2Step_Opt to convert_argb_test.cc to benchmark and compare the direct AVX2 conversion vs a 2-step approach.

./libyuv_test '--gunit_filter=*RAWToJ400_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1

AMD Zen 5
Was LibYUVConvertTest.RAWToJ400_Opt (757 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (699 ms)

Intel Skylake
Was LibYUVConvertTest.RAWToJ400_Opt (1705 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (1426 ms)

Bug: 477295731
Change-Id: I29866baf4ad5fe7a3725e4a01f2fe24649510a7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7777325
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-20 12:52:44 -07:00
Frank Barchard
ddc6764d13 ARGBToUVMatrixRow_RVV replace vlseg8 with vlseg4,
implementing horizontal paired adds and accumulation to improve
performance on SiFive x280, and fixes the remainder logic to use valid
vlseg4 loads. Adds TestARGBToUVRow_Any to test odd-width remainder
handling.

Also fixes a build break for non-RVV compilations by ensuring all RVV
functions and their closing cplusplus braces are correctly wrapped in
#if !defined(LIBYUV_DISABLE_RVV).

Also adds NV12ToNV21 as a macro alias for NV21ToNV12 in
planar_functions.h, as the conversion is bidirectional (swapping byte
pairs in the interleaved chroma plane). (Patch from
https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762904)

Bug: libyuv:42280902
Change-Id: If2d6cbb3e232d63d43e32aba33fa9b2eee8190e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7772164
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-17 15:04:45 -07:00
Frank Barchard
ace7c4573c Add ARGBToUV444MatrixRow_RVV, ARGBToUVMatrixRow_RVV, and wrappers
This change implements ARGBToUV444MatrixRow_RVV, ARGBToUVMatrixRow_RVV,
and their wrappers (ARGBToUVRow_RVV, ARGBToUVJRow_RVV, etc.) using RVV
intrinsics, mirroring the NEON/AVX2 designs. It wires them into the
build and dispatch systems.

LIBYUV_RVV_HAS_TUPLE_TYPE is always true on new compilers. This macro
has been removed, assuming it is true everywhere, reducing the amount of
code in row_rvv.cc, scale_rvv.cc, and row.h.

Tested via: ~/bin/doyuv3v && ~/bin/runyuv3v TestARGBToI444Matrix
~/bin/doyuv3av

Bug: libyuv:42280902
Change-Id: I36d305386b297d69023c068aa9c62ab6b2ad039c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7769956
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-16 20:52:43 -07:00
Chema Gonzalez
dec8272138 Fix typo
Change-Id: I4dea1bcacc7d10dd2db74f4b221db42e2deade83
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762903
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-16 14:27:40 -07:00
Frank Barchard
94644361b4 row_win.cc rewrite into intrinsics
- remove inline asm which was only for 32 bit
- add ARGBToYMatrixRow_AVX2
- add gn flag libyuv_enable_rowwin=true

Example of building with GN and Ninja:

Without the new flag:
  gn gen out/Release "--args=is_debug=false"
  ninja -C out/Release

With the new flag:
 gn gen out/Release "--args=is_debug=false libyuv_enable_rowwin=true"
 ninja -C out/Release

Bug: libyuv:42280806, 477295731, libyuv:42280902, libyuv:439628764
R=​dalecurtis@chromium.org, rrwinterton@gmail.com

Change-Id: I451bf814622fba690005c02fbf5816819c6a08c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7765790
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-15 19:53:16 -07:00
Frank Barchard
0d8494abc0 Add Bazel build support
Change-Id: Idf205997010a95f975dbd347e268e36c2072f797
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7745020
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-15 18:20:26 -07:00
Frank Barchard
e034c41661 Port ARGBToUVMatrixRow from AVX2 to AVX512BW
Benchmark on Icelake Xeon
Now AVX512BW:
[       OK ] LibYUVConvertTest.ARGBToNV12_Opt (1723 ms)
Was AVX2:
[       OK ] LibYUVConvertTest.ARGBToNV12_Opt (2144 ms)

- Added `ARGBToUVMatrixRow_AVX512BW` implementation in `source/row_gcc.cc`.
- Added corresponding `ARGBToUVRow_AVX512BW` and `ABGRToUVRow_AVX512BW` functions.
- Added unaligned wrappers `ARGBToUVRow_Any_AVX512BW` and `ABGRToUVRow_Any_AVX512BW` in `source/row_any.cc`.
- Updated `source/row_any.cc` to correctly size `vin` and `vout` buffers for AVX512BW width and adjusted the `ANY12MS` and `ANY12S` macros to handle `MASK=63`.
- Updated `include/libyuv/row.h` with the required AVX512BW headers and definitions, scoped appropriately.
- Wired all callers of `ARGBToUVRow_AVX2` and related functions in `source/convert.cc` and `source/convert_from_argb.cc` to dynamically use the `AVX512BW` implementations if the CPU flag indicates AVX-512BW support.
- Optimized AVX-512 code to generate the `-1` multiplier in a single instruction (`vpternlogd`) and reused it across word (`vpmaddwd`) dot products. Handled the resulting negation by replacing a subtraction with `vpaddw` offset adjustment.

Bug: 477295731
R=dalecurtis@chromium.org, rrwinterton@gmail.com

Change-Id: Ida5fb27e59ae4c1c3824737f009b80549cd20a06
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7763257
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-14 16:15:31 -07:00
Frank Barchard
cbc64c353c Port ARGBToYRow_AVX2 usages to dynamically use ARGBToYRow_AVX512BW
I have successfully ported the usage of ARGBToYRow_AVX2 to dynamically detect and utilize ARGBToYRow_AVX512BW when available.

  Here's a summary of the changes:
   1. Source Modifications: In both source/convert.cc and source/convert_from_argb.cc, I searched for all references where ARGBToYRow_AVX2 was
      being conditionally used (which operates on 32 pixels).
   2. AVX512BW Detection: Immediately following those blocks, I injected a new check for kCpuHasAVX512BW. If the CPU flag is present, the logic
      now utilizes ARGBToYRow_Any_AVX512BW by default, falling back to the fully aligned ARGBToYRow_AVX512BW when the width is aligned to 64
      bytes.
   3. Profiling: After building and compiling the tests (doyuv3x), I validated the change using perfyuv3 ARGBToNV12_Opt | cat. The test
      successfully executed and the performance profile indicated that ARGBToYRow_AVX512BW successfully executed (taking up ~18% of CPU cycles,
      replacing the previous AVX2 specific instruction overhead for the Y row extraction).

  The HAS_ARGBTOYROW_AVX512BW macro implementation now fully supports all AVX2 conversion paths to utilize AVX512BW when the system processor
  flags allow it!

R=richard, rrwinterton@gmail.com

Change-Id: Iad811e12d301f5621e6f6d039105420861ade43e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7760779
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-14 11:42:59 -07:00
Frank Barchard
59ca5d8074 Fix parameter names and comments for ARGB/BGRA/RGBA/ABGR functions
In all functions that start with ARGB, BGRA, RGBA or ABGR in the include/libyuv/ headers, make sure the parameter variable name has the same 4 letters, but lower case, and the comment before the function should have the same matching name. Then make sure the implementation in source/ folder has the same variable names.

Change-Id: Idadbbbb993156eea16e318719f4888cb3bed5f6a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7760057
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-13 18:28:37 -07:00
Frank Barchard
893eacf9b4 ARGBToY for AVX512
- add ARGBToYMatrixRow_AVX512BW
- refactor SSE and AVX to use Matrix functions, making old functions
  call the new ones.

Zen5 1280x720
Was AVX2   LibYUVConvertTest.ARGBToI444_Opt (1125 ms)
Now AVX512 LibYUVConvertTest.ARGBToI444_Opt (641 ms)

Details by Gemini:
  1. Created 3 new Matrix functions:
    Added ARGBToYMatrixRow_SSSE3, ARGBToYMatrixRow_AVX2, and
    ARGBToYMatrixRow_AVX512BW to source/row_gcc.cc. These take the
    const struct ArgbConstants* c parameter similarly to
    ARGBToUV444MatrixRow_*. The x86 vector instructions dynamically
    calculate the needed values using the properties of the constants
    struct, including using vpmaddwd inside the AVX512 code to offset
    the lack of a native vphaddw.

  2. Replaced Old Functions with Wrappers:
    Modified the existing implementations of ARGBToYRow_SSSE3,
    ARGBToYJRow_SSSE3, ABGRToYRow_SSSE3, ABGRToYJRow_SSSE3,
    RGBAToYRow_SSSE3, RGBAToYJRow_SSSE3, BGRAToYRow_SSSE3 (and their
    _AVX2 equivalents) in source/row_gcc.cc to act as inline wrappers
    calling the new ARGBToYMatrixRow_* functions, passing the right
    matrix parameters (e.g. &kArgbI601Constants, &kArgbJPEGConstants,
    &kAbgrI601Constants).

  3. Added row_any.cc Handlers:
    Added ANY11MC definitions to source/row_any.cc to autogenerate
    ARGBToYMatrixRow_Any_SSSE3, ARGBToYMatrixRow_Any_AVX2, and
    ARGBToYMatrixRow_Any_AVX512BW which safely handles non-aligned
    tails.

  4. Updated include/libyuv/row.h:
    Updated the headers with the proper void declarations for all newly
    generated Matrix and Any_ variants. Also defined
    HAS_ARGBTOYROW_AVX512BW in the CPU macros.

  5. Tested the Implementations:
    Compiled and tested on Linux x86, which resulted in all tests passing
    cleanly. Also successfully completed all Windows 32-bit build checks
    ensuring 32-bit regression prevention without issues.

Bug: 477295731
Change-Id: I4f5eec9a961e24a9d760d0a1c0810fb5e29a0bd1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7759494
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-13 17:26:07 -07:00
Frank Barchard
644251f252 Fix buffer sizes in ANY macros and ANY11MC typo
Increases buffer sizes from 128 to 256 in ANY11, ANY11C, ANY11MC, ANY12,
and ANY12M macros to safely accommodate AVX512BW processing which can
write up to 256 bytes per operation.

Bug: libyuv:42280902, libyuv:502250231, 501882928

Change-Id: Icfba1982dc5fb6545255464f7decb2baec7be90f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7758060
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-13 13:01:49 -07:00
Frank Barchard
5cfaa44d71 Replace strtok_r with strchr in RISC-V CPU capability detection
This fixes a build failure on bare-metal toolchains like
riscv64-unknown-elf-clang++ where strtok_r may be undeclared.

Bug: 477295731
Change-Id: If4edd6c6d2e975ae34278f479700ef9b996c0a3e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7744872
Reviewed-by: James Zern <jzern@google.com>
2026-04-10 12:33:43 -07:00
Frank Barchard
5b5a2f6b92 Fix 'ghost AVX512' detection on Alder Lake CPUs
Adds a check for the AVX512F feature bit (cpu_info7[1] & 0x00010000)
before enabling AVX512 features. Alder Lake CPUs can report OS support
for YMM/ZMM but not actually support AVX512F, leading to incorrect
capability detection and crashes.

Bug: libyuv:500318522
Change-Id: I84167ee3fcfc7a2572afba148bbb275bd3ccb1e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7746229
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-09 17:52:24 -07:00
Frank Barchard
4f4e1ac553 Fix 2 failing golden tests
- Add ifdef for LIBYUV_UNLIMITED_DATA

Fixed by Gemini just telling it how to build and run the test and to fix it.

Bug: libyuv:353545922
Change-Id: I117a25b75b9616ee2ce6122aa163c2085ed4dc7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7742120
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-09 11:51:13 -07:00
Sam Maier
e3ceea1e67 Forward-declare ArgbConstants in convert.h to fix visibility error
The libyuv into Chromium roller is currently broken, see bug 500795092.

This change adds a forward declaration for struct ArgbConstants in
include/libyuv/convert.h. This resolves a -Wvisibility error where the
struct was being declared within a function prototype, making it
invisible outside that scope and breaking automated binding generation
(e.g., for crabbyavif).

Verified building crabbyavif_libyuv_bindings locally and this patch
fixed it.

Bug: 500795092
Change-Id: Ie0126650ab346940f4610bd4d2e8a5b3ef9ce103
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7739974
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-09 08:53:56 -07:00
Frank Barchard
4c3d7d517a ARGBToUV444 for AVX512
1.27x faster on AMD Zen5 (turin)

Now AVX512
perf record ./libyuv_test '--gunit_filter=*ARGBToI444_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1

[       OK ] LibYUVConvertTest.ARGBToI444_Opt (1071 ms)
Overhead  Symbol
  53.49%  ARGBToYRow_AVX2
  44.70%  ARGBToUV444Row_AVX512BW

Was AVX2
[       OK ] LibYUVConvertTest.ARGBToI444_Opt (1369 ms)
  61.06%  ARGBToUV444Row_AVX2
  37.67%  ARGBToYRow_AVX2

Bug:  libyuv:42280902
Change-Id: I306fbac656d6f7834ce1559e86d01eb34931ec3c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7738362
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-08 19:25:41 -07:00
Sam Maier
7903a6c632 Fix deprecated usage of strtok
The latest Android NDK marks strtok as deprecated and suggests using
strtok_r instead.

Bug: 477295731
Change-Id: I2b20a2ae0a9e19ec93e31669ec380802e6902090
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7739107
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2026-04-08 11:34:51 -07:00
Dale Curtis
1170363ce5 Add Gemini implementation for NEON32 RGB to YUV matrix operations
These are about 25% faster than the C versions.

Bug: libyuv:42280902

Change-Id: I8b298670ee5f3ed5db35527fc41d6d9a51b020a1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7573682
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
2026-03-23 16:30:44 -07:00
Frank Barchard
4183733af5 Rename MergeUVRow_ variable to MergeUVRow
Bug:  libyuv:42280902
Change-Id: I9935bf958b901ddf84cf91b2097c8cd5d6efadde
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7683070
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-03-18 17:18:25 -07:00