2979 Commits

Author SHA1 Message Date
Frank Barchard
d23308a2a7 add bmm detect and vdpphps in util/cpuid
Bug: None
Change-Id: I9954f96a74e653e3ecd3fbeba533299fa8e57d95
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7914867
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-06-09 14:52:48 -07:00
Frank Barchard
3bdb3b94ca I420ToRAW use 2 step AVX512
On Icelake
Was AVX2
I420ToRAW_Opt (283 ms)
  67.55%  I422ToARGBRow_AVX2
  26.46%  ARGBToRGB24Row_AVX2

Now AVX512VBMI
I420ToRAW_Opt (238 ms)
  73.08%  I422ToARGBRow_AVX512BW
  21.59%  ARGBToRGB24Row_AVX512VBMI

Bug: 42280902
Change-Id: I9d4d21faed30c529a5e593819f103be115709f37
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7909924
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-06-08 14:32:13 -07:00
Frank Barchard
4be798d7c5 BGRAToI420 use BgraConstants for a direct conversion using AVX512BW
row win (msvc)
Was C/SSSE3
BGRAToARGB_Opt (594 ms)
BGRAToARGB_Endswap_Opt (609 ms)
BGRAToI420_Opt (122 ms)

Now AVX2
BGRAToARGB_Opt (100 ms)
BGRAToARGB_Endswap_Opt (99 ms)
BGRAToI420_Opt (115 ms)

Clang/GCC AVX512BW
BGRAToARGB_Opt (86 ms)
BGRAToARGB_Endswap_Opt (91 ms)
BGRAToI420_Opt (110 ms)


Bug: 42280902
Change-Id: I52cb2b0cacea8f2f0b138ec3cc521185dbef8595
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7905821
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-06-08 12:21:47 -07:00
Wan-Teh Chang
95eedb9687 ConvertToARGB: compute buffer offsets in ptrdiff_t
Also validate crop_x, crop_y, crop_width, crop_height and make sure the
crop region stays inside the source rectangle.
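
A minimal sketch of the kind of crop validation described above. CropIsValid is a hypothetical helper, not a libyuv function, and it ignores libyuv's negative-height (vertical flip) convention for brevity.

```c
#include <stdbool.h>

// Hypothetical check (not libyuv API): the crop rectangle must be non-empty
// and lie entirely inside a src_width x src_height source image.
bool CropIsValid(int src_width, int src_height,
                 int crop_x, int crop_y,
                 int crop_width, int crop_height) {
  if (src_width <= 0 || src_height <= 0) {
    return false;
  }
  if (crop_x < 0 || crop_y < 0 || crop_width <= 0 || crop_height <= 0) {
    return false;
  }
  if (crop_width > src_width || crop_height > src_height) {
    return false;
  }
  // Subtract instead of computing crop_x + crop_width, which could overflow.
  return crop_x <= src_width - crop_width &&
         crop_y <= src_height - crop_height;
}
```

Comparing against src_width - crop_width keeps every intermediate value within int range even for hostile inputs.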

Change-Id: I68748e14b21307b262d8b283147bce5ace8108d2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7904591
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-06-05 18:38:42 -07:00
Wan-Teh Chang
ccd415101d Fix int negation overflow in ConvertToARGB/I420
Fix int negation overflow in ConvertToARGB() and ConvertToI420().

Change-Id: Ia8e1f1a2994962a0372f4c31f6cc9c8972d8a954
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7904588
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-06-05 12:34:38 -07:00
Wan-Teh Chang
f722313c74 Validate int param is not INT_MIN before negating
Validate that an int parameter is not equal to INT_MIN before negating
it.

Remove redundant src_width > 32768 || src_height > 32768 checks in
callers of ScalePlane(), ScalePlane_16(), ScalePlane_12(), and
UVScale().

Change UVScale() to validate its parameters in the same way as
ScalePlane(), ScalePlane_16(), and ScalePlane_12().
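
A minimal sketch of the INT_MIN check described above. FlipNegativeHeight is a hypothetical helper used only for illustration; the actual libyuv functions perform the check inline and return an error code.

```c
#include <limits.h>
#include <stdbool.h>

// Hypothetical helper (not libyuv API): converts a negative height, which
// requests a vertical flip, to its positive magnitude.
// Returns false without modifying *height when negation would overflow.
bool FlipNegativeHeight(int* height) {
  if (*height == INT_MIN) {
    return false;  // -INT_MIN is not representable in int.
  }
  if (*height < 0) {
    *height = -*height;
  }
  return true;
}
```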

Change-Id: I64e03257cf090760030c966b49c4d23e4cec25e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7902889
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-06-04 21:55:57 -07:00
Wan-Teh Chang
826ab02fcc Remove __attribute__((no_sanitize("cfi-icall")))
Remove __attribute__((no_sanitize("cfi-icall"))) from
ARGBToUVMatrixRow_AVX2(). This breaks MSVC compilation, and no other
libyuv function is marked with this attribute.

Change-Id: I2bb6a688e296dd4acff325c5bd750573a577f246
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7904777
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-06-04 18:04:11 -07:00
Sun Yuechi
b7c959cab5 Add missing files for riscv64 GYP build
There are a few added source files since the (re-)addition of GYP build
support, for better SIMD optimization support (AArch64 SME & SVE,
LoongArch LSX & LASX, RISC-V RVV). This CL covers the RISC-V RVV part in
preparation of fixing GYP builds for this architecture.

The files' arch-specific contents are all gated behind preprocessor
macro checks, so it is safe to have everything included in the build
unconditionally.

Bug: None
Change-Id: Id2d5c7fcc1e274cef6c83e2ad5945610e6c52f9d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7872114
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-06-03 20:11:31 -07:00
James Zern
af36de328f Android.mk: add a check for NDK_ROOT
This simplifies integration with the Android platform and prevents the
files from being used when a non-NDK build is performed. In that case
Android.bp is preferred.

Change-Id: Ic669f33931ad294d6570341b4e39fccd7e7f1ad8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7897896
Commit-Queue: James Zern <jzern@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-06-03 17:00:30 -07:00
Wan-Teh Chang
62fffa9eeb Fix integer overflow when flipping negative height
Treat height == INT_MIN as invalid. Omit explicit height == INT_MIN
check if we disallow height < -32768.

Perform multiplications of stride in the ptrdiff_t type.

Add checks for invalid width and height to some functions.

Bug: 518806561
Change-Id: I5e39fffed7f806852a8758d4b59df919839c0a3b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7891415
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-06-03 16:17:37 -07:00
Frank Barchard
e14b0e2c60 RGB565ToARGB use AVX2 instead of SSE2
Now AVX2/AVX512
ARGB4444ToI420_Opt (204 ms)
RGB565ToI420_Opt (211 ms)
ARGB1555ToI420_Opt (231 ms)
RAWToI420_Opt (197 ms)
RGB24ToI420_Opt (197 ms)

Was SSE2/AVX2
ARGB4444ToI420_Opt (276 ms)
RGB565ToI420_Opt (292 ms)
ARGB1555ToI420_Opt (332 ms)
RAWToI420_Opt (237 ms)
RGB24ToI420_Opt (232 ms)

Bug: libyuv:508639302
Change-Id: I2005189d1b6af15cb5ebef1f6d66b426fa9df8eb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7891416
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-06-02 18:28:02 -07:00
Wan-Teh Chang
06cc67fd2f Don't ignore UVCopy() and UVCopy_16() return value
Change-Id: I9d7944da60bf73ec6a578a43540c5a247ad00417
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7891418
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-06-01 16:28:08 -07:00
Frank Barchard
3c5fa6ef27 [libyuv] Replace hardcoded RGB to YUV functions with Matrix variants
Removes non-matrix implementations for RGB24, RAW, RGB565, ARGB1555,
and ARGB4444 conversions. Introduces RGBToYMatrixRow, RGBToUVMatrixRow,
and equivalent functions for 16-bit and 24-bit formats. These functions
utilize a 2-step conversion internally (to ARGB, then to YUV) inside
row_common.cc for C, AVX2, and NEON, allowing the high-level
convert.cc logic to execute in a single pass using ArgbConstants.

Benchmark on Zen4
Test: libyuv_unittest --gtest_filter=*RGB*ToI420*

Was BT.601-only
ARGBToI420_Opt (115 ms)
ARGB4444ToI420_Opt (190 ms)
RGB565ToI420_Opt (194 ms)
ARGB1555ToI420_Opt (207 ms)
RGB24ToI420_Opt (143 ms)
RGBAToI420_Opt (167 ms)
28.07% ARGBToUVMatrixRow_AVX512BW
19.65% ARGBToYMatrixRow_AVX512BW
11.32% RGBAToUVRow_SSSE3
10.24% ARGB1555ToARGBRow_SSE2
 8.56% ARGB4444ToARGBRow_SSE2
 8.47% RGB565ToARGBRow_SSE2
 4.17% RGBAToYRow_AVX512BW
 4.04% RGB24ToARGBRow_AVX512BW

Now Matrix
ARGBToI420_Opt (124 ms)
ARGB4444ToI420_Opt (287 ms)
RGB565ToI420_Opt (292 ms)
ARGB1555ToI420_Opt (324 ms)
RGB24ToI420_Opt (236 ms)
RGBAToI420_Opt (126 ms)
29.74% ARGBToUVMatrixRow_AVX2
14.58% ARGB1555ToARGBRow_SSE2
12.59% RGB565ToARGBRow_SSE2
11.32% ARGB4444ToARGBRow_SSE2
 9.35% ARGBToYMatrixRow_AVX2
 8.45% RGB24ToARGBRow_SSSE3
 5.56% ARGBToYMatrixRow_AVX512BW
 1.37% ARGBToUVMatrixRow_Any_AVX2
 0.74% ARGBToYMatrixRow_Any_AVX2
 0.49% ARGB4444ToARGBRow_Any_SSE2
 0.46% RGB565ToARGBRow_Any_SSE2
 0.39% ARGB1555ToARGBRow_Any_SSE2
 0.28% RGB24ToARGBRow_Any_SSSE3
 0.11% ARGB4444ToYMatrixRow_AVX2
 0.09% RGB565ToUVMatrixRow_AVX2
 0.09% RGB565ToYMatrixRow_AVX2
 0.07% RGBToYMatrixRow_AVX2
 0.05% ARGB1555ToUVMatrixRow_AVX2
 0.04% ARGB1555ToYMatrixRow_AVX2
 0.03% RGBToUVMatrixRow_AVX2
 0.02% ARGB4444ToUVMatrixRow_AVX2

Bug: libyuv:508639302
Change-Id: I362c0cfe4c86ee1f3ffb569fa4f784b84148f11a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7891045
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-06-01 14:04:07 -07:00
Frank Barchard
957f295ea9 [libyuv] Fix potential UV coalescing overflow in NV12ToI420
Adds a safety check to prevent signed integer overflow in the UV
plane coalescing logic within NV12ToI420. This ensures that
halfwidth * halfheight does not overflow INT_MAX, matching the Y
plane coalescing check and preventing potential undefined behavior
(signed integer overflow) which could lead to negative widths being
passed to SIMD functions.

Test: libyuv_unittest --gtest_filter=*NV12Crop*
Bug: None

CONV=6401df25-4d5d-4595-a231-f72c2c8e78df
TAG=agy
R=wtc@google.com

Change-Id: I15a51609a1e000a82f4b6958b4ada444efb1f2f4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7886824
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2026-05-29 22:53:29 -07:00
Wan-Teh Chang
d2c6dd5e6a Fix integer overflow in two convert functions
Fix integer overflow in buffer allocation size calculations in the
align_buffer_64() macro and the I422ToNV21() and
Android420ToARGBMatrix() functions.
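
A sketch of an overflow-checked allocation size calculation in the spirit of this fix. PlaneBufferSize is a hypothetical helper, not part of libyuv; the real change is made in the align_buffer_64() macro and its callers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not libyuv API): computes width * height in size_t.
// Returns false if a dimension is not positive or the product would not fit.
bool PlaneBufferSize(int width, int height, size_t* size) {
  if (width <= 0 || height <= 0) {
    return false;
  }
  size_t w = (size_t)width;
  size_t h = (size_t)height;
  if (w > SIZE_MAX / h) {
    return false;  // w * h would wrap around in size_t.
  }
  *size = w * h;
  return true;
}
```

For example, 46341 x 46341 exceeds INT_MAX when multiplied as int, but the size_t product above is well defined.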

Based on a CL autogenerated by MendIt (go/androidmendit):
https://googleplex-android-review.googlesource.com/c/platform/external/libyuv/+/39981732

Bug: 511821134
Change-Id: Ie1728c3ad337d460d9b85979489a817cc97e3bf3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7886817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-05-29 19:26:14 -07:00
Wan-Teh Chang
b7389e99be Fix integer overflows in ConvertToI420()
Validate the input parameters crop_x, crop_y, crop_width, crop_height.

Ensure all calculations of buffer sizes and offsets are performed using
the size_t or ptrdiff_t type.

Bug: 511820801
Change-Id: I43f82133c4049e2874c87d2ada147a7c3022f3c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7886366
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-29 19:22:01 -07:00
Frank Barchard
ef08f21f6d [libyuv] Fix security vulnerabilities in ScalePlane and ARGBAffineRow_C
This CL addresses two security findings related to integer overflows:

1. Input validation in ScalePlane, ScalePlane_16, and ScalePlane_12:
   Added checks to reject invalid dimensions (e.g. width <= 0, height
   == 0) and dimensions larger than 32768 (or smaller than -32768 for
   height). This prevents FixedDiv signed integer overflows that can
   lead to division by zero/overflow crashes (SIGFPE on x86) or
   incorrect step calculations.

2. Stride overflow in ARGBAffineRow_C:
   Cast pointer arithmetic operands to ptrdiff_t before multiplication
   (y * stride and x * 4) to ensure 64-bit calculations, preventing
   signed 32-bit integer overflow when calculating source pixel offsets.

Added unit tests to verify the input validation in ScalePlane functions.
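
The dimension limits in item 1 can be sketched as follows. ScaleDimsAreValid is a hypothetical name, and the exact set of checks in ScalePlane may differ from this illustration.

```c
#include <stdbool.h>

// Hypothetical check (not libyuv API) mirroring the limits listed above:
// width must be in [1, 32768]; height must be non-zero and in
// [-32768, 32768], where a negative height requests a vertical flip.
bool ScaleDimsAreValid(int width, int height) {
  return width > 0 && width <= 32768 &&
         height != 0 && height >= -32768 && height <= 32768;
}
```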

Test: libyuv_unittest --gtest_filter=*InvalidInputs*
Test: libyuv_unittest --gtest_filter=*Scale*
Test: libyuv_unittest --gtest_filter=*TestAffine*
Bug: None

TAG=agy
CONV=0e990960-611b-4f38-94ec-24e79b66242e
R=wtc@google.com

Change-Id: I252af47a98e45dff8bb5f06308c3739c6eead741
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7886217
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-29 18:30:38 -07:00
Wan-Teh Chang
c98edcc8dc Don't coalesce rows if width*height would overflow
Audit all occurrences of "width *= height;" in the libyuv source code.
Make sure height > 0 and (ptrdiff_t)width * height <= INT_MAX before
executing width *= height.
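
A minimal sketch of the guard described above, assuming a target where ptrdiff_t is 64 bits wide. CanCoalesceRows is a hypothetical helper; libyuv writes the condition inline at each call site.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper (not libyuv API): returns true when a contiguous plane
// (stride == width) can be treated as one long row of width * height pixels
// without overflowing int. Assumes width > 0 and a 64-bit ptrdiff_t.
bool CanCoalesceRows(int width, int height, int stride) {
  return stride == width && height > 0 &&
         (ptrdiff_t)width * height <= INT_MAX;
}
```

A caller would execute width *= height (and set height = 1) only when this returns true.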

Bug: chromium:517339758
Change-Id: I143a41c66492a6e4c48b6aa2a1c4a2ae974ceeb1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7883816
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-29 11:57:47 -07:00
Frank Barchard
e449eb2172 J400ToARGB switch from SSE2 to AVX2
- port for row_win
- remove unused HAS_ macros

Was C/SSE2
MSVC  J400ToARGB_Opt (1967 ms)
Clang J400ToARGB_Opt (568 ms)

Now AVX2
MSVC  J400ToARGB_Opt (411 ms)
Clang J400ToARGB_Opt (418 ms)

Test: libyuv_unittest --gtest_filter=*J400ToARGB*
Bug: libyuv:508639302

Change-Id: Ifdfb026832b708b61f55477250cc5ee52449f421
TAG=agy
CONV=186608fc-966a-4ea7-bf57-9fe07cc1383c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7877368
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
2026-05-28 21:24:32 -07:00
Wan-Teh Chang
904f562d86 Remove redundant #include <stddef.h>
"libyuv/basic_types.h" includes <stddef.h>. So it is not necessary to
include both <stddef.h> and "libyuv/basic_types.h".

Change-Id: I5a461258a3c6820d1007ac635838f910237f367f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7884381
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-28 17:10:22 -07:00
Wan-Teh Chang
ebe6fef903 Fix integer overflow in multiplications of stride
Audit all occurrences of "stride *" in the libyuv source tree. Ensure
that these multiplications are performed in the ptrdiff_t type.

For functions not declared in a public header (such as static
functions), prefer to declare the stride parameters (typically named
src_stride and dst_stride) and related stride local variables as
ptrdiff_t. If this is not possible, add ptrdiff_t casts to the stride
parameters in multiplications. If intptr_t or int64_t casts were used,
change them to ptrdiff_t casts.
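
A minimal sketch of the casting pattern described above. RowOffset and RowPointer are hypothetical helpers for illustration, not libyuv functions.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not libyuv API): byte offset of row y.
// Casting one operand to ptrdiff_t makes the whole product a ptrdiff_t,
// so it does not wrap at 32 bits on targets with a 64-bit ptrdiff_t.
ptrdiff_t RowOffset(int y, int stride) {
  return (ptrdiff_t)y * stride;
}

// Hypothetical helper (not libyuv API): pointer to row y of a plane.
const uint8_t* RowPointer(const uint8_t* src, int y, int stride) {
  return src + RowOffset(y, stride);
}
```

Without the cast, y * stride is evaluated in int, and the overflow has already happened by the time the result is added to the pointer.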

Bug: chromium:516986556
Change-Id: I6cd8a8eb00cbb5380db828bf83e4d89ff95891f3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7882967
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-05-28 14:12:37 -07:00
Andrew Grieve
de63bd90f4 Stop setting --dynamic-linker for target_os="android"
lld ignores it for shared libraries, but wild linker adds an .interp
section because of it. Regardless, LLVM has long known how to set the
correct linker flag when building for android.

Bug: 40208899
Change-Id: I50dc015946382f9e99289333e7d8b870409ed8d6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7874019
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Andrew Grieve <agrieve@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2026-05-27 04:34:07 -07:00
Frank Barchard
9d98aaefe7 InterpolateRow for Visual C
- remove InterpolateRow_SSSE3
- optimize ARGBToUV444MatrixRow_AVX2 to use unsigned pixels

5.7x faster on AMD Zen4

Was C
TestInterpolatePlane (144 ms)
TestInterpolatePlane_16 (142 ms)

Now AVX2
TestInterpolatePlane (25 ms)
TestInterpolatePlane_16 (48 ms)

Was signed
ARGBToJ444_Opt (157 ms)
Now unsigned
ARGBToJ444_Opt (155 ms)

Bug: None
Change-Id: I903109668ff9cfedaddad1ad75411393b3226f41
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7856498
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-18 17:28:46 -07:00
Frank Barchard
9f751100d2 InterpolateRow_16_AVX2 for row_gcc
On AMD Zen4
Was C
TestInterpolatePlane_16 (143 ms)
Now AVX2
TestInterpolatePlane_16 (48 ms)

Was
I210ToI420_Opt (87 ms)
 35.60% InterpolateRow_16To8_AVX2
 31.03% Convert16To8Row_AVX512BW
 21.35% Convert16To8Row_AVX2

Now
I210ToI420_Opt (69 ms)
 37.57% Convert16To8Row_AVX512BW
 32.69% InterpolateRow_16_AVX2
  7.18% Convert16To8Row_AVX2
  5.23% InterpolateRow_16To8_AVX2

Bug: None
Change-Id: Ica9b9c5dbd847068ae076b682c487e1753d3c812
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7855648
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-18 14:29:36 -07:00
Frank Barchard
cda55fcf53 Mirror AVX2 functions for Visual C
Bug: libyuv:42280902
Change-Id: Iabbec9af3a4f4dd89294e60145823c7fc4dd6ec6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7843378
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-15 15:05:31 -07:00
Sergey Silkin
0f320a03f7 Fix linear interpolation
The C interpolator applied to the chroma plane when scaling NV12 on Mac/ARM
used (0x7f ^ f), which is (127-f) instead of (128-f). This resulted in changes
like 128 -> 127 when scaling flat colors and caused a visually noticeable
difference.
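
The arithmetic can be sketched as follows, assuming a 7-bit blend of the form (a * w + b * f) >> 7 with f in [0, 127]. BlendOld and BlendFixed are hypothetical names, and the actual libyuv interpolator may differ in form and rounding.

```c
#include <stdint.h>

// Assumed 7-bit blend (illustration only, not the libyuv source).
// Old weight for a is (0x7f ^ f) == 127 - f, so the weights sum to 127.
uint8_t BlendOld(uint8_t a, uint8_t b, int f) {
  return (uint8_t)((a * (0x7f ^ f) + b * f) >> 7);
}

// Corrected weight for a is 128 - f, so the weights sum to 128.
uint8_t BlendFixed(uint8_t a, uint8_t b, int f) {
  return (uint8_t)((a * (128 - f) + b * f) >> 7);
}
```

With a flat input a == b == 128, BlendOld computes (128 * 127) >> 7 == 127 for every f, while BlendFixed computes (128 * 128) >> 7 == 128.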

Bug: b/465721312
Change-Id: Iecf5d2ca2a85602de4146cba7e0f64ecb4b2c1fe
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830198
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
2026-05-13 05:33:33 -07:00
Frank Barchard
c6c8689c74 Fix I444 and J444 parameter names/order
Bug: libyuv:42280902
Change-Id: Ia2c45f2d996d071534b08381f61adf8cb8ef35b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841767
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-12 15:30:35 -07:00
Frank Barchard
dd8b46630a ARGBToUV444MatrixRow_AVX2 intrinsics for Visual C
Was C
LibYUVConvertTest.ARGBToI444_Opt (1027 ms)

Now AVX2
LibYUVConvertTest.ARGBToI444_Opt (310 ms)

Bug: libyuv:508639302
Change-Id: I0bc7f5c5b72160d24226a98d5fddb184a004ed00
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841655
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-12 14:19:58 -07:00
Frank Barchard
cb061d0378 Unittests use ASSERT instead of EXPECT
Bug: libyuv:508639302
Change-Id: I22c35e08f3b6db1a656192877c1fb1bf4e96d6f5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7838659
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-11 19:10:47 -07:00
Frank Barchard
e23282704f ARGBToYRow_AVX512BW preserve XMM6-XMM15 due to Windows stack alignment
Bug: 505124541
Change-Id: Id5ae539f57b314980182bec76a788e33273b2392
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7835639
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-11 13:12:22 -07:00
Frank Barchard
4b4e68b372 ABGRToJ420 call ARGBToI420Matrix
- Standardize libyuv ARGB-family (ARGB, ABGR, RGBA, BGRA) to YUV conversion by utilizing the generic MatrixRow architecture and explicit ArgbConstants.
- Consolidated ARGBToI420, ABGRToI420, BGRAToI420, and RGBAToI420 as wrappers for ARGBToI420Matrix.
- Refactored ABGRToJ420, ABGRToJ422, and ABGRToI422 to use generic matrix functions.
- Added matrix-based versions for NV21, I400, YUY2, and UYVY.
- Updated RAW and RGB24 to I420/I422/I444 dispatchers to use MatrixRow logic and explicit constants.
- Fixed parameter swap bugs in ARGBToI422, ARGBToJ422, and ABGRToJ422.
- Fixed a bug in the generic C implementation of matrix row functions ensuring all 4 channels are processed correctly for all ARGB-family formats.
- Moved kShuffleAARRGGBB in row_gcc.cc to the top of the libyuv namespace for visibility.
- Cleaned up redundant format-specific row implementations.

Bug: libyuv:42280902
Change-Id: I67ffa4c476abc0d2dcc4650510d7bda91b65988e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830291
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-08 15:23:30 -07:00
Frank Barchard
4aacbbdfb4 Refactored RGB/RAW to YUV color conversion functions to use generic Matrix-based functions parameterized by ArgbConstants.
This consolidation standardizes conversion logic, improves code
maintainability, and provides flexible support for various color spaces
(e.g., BT.601, JPEG full range).

Key Modifications:
 - Function Consolidation: Refactored several high-level conversion functions into lightweight wrappers around generic Matrix variants:
     - ARGBToI420 → ARGBToI420Matrix
     - ARGBToI444 → ARGBToI444Matrix
     - ARGBToI422 → ARGBToI422Matrix
     - ARGBToNV12 → ARGBToNV12Matrix
     - RAWToJ400, RGB24ToJ400 → RGBToI400Matrix
     - RAWToI444, RAWToJ444 → RGBToI444Matrix
 - 2-Pass Conversions: Updated RGB565ToI420, ARGB1555ToI420, and ARGB4444ToI420 to utilize 2-pass conversions via RGBToI420Matrix.
 - Standardization: Refactored ARGBToNV21, ARGBToYUY2, and ARGBToUYVY to use parameterized matrix row functions (ARGBToYMatrixRow,
   ARGBToUVMatrixRow).
 - Legacy Cleanup: Replaced legacy calls to ARGBToYJRow with the parameterized ARGBToYMatrixRow in the ARGBSobelize helper.
 - Internal Integration: Included libyuv/convert_from_argb.h in planar_functions.cc and ensured all new matrix symbols are properly
   declared/exported (LIBYUV_API).

Bug: libyuv:42280902
Change-Id: Ied5fd9899767427e3a03cdcfbeaff3e9d502374a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7822033
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-06 20:02:47 -07:00
Mirko Bonadei
8773064a72 Revert "Fix rounding in scaling routines."
This reverts commit 37a848135b2b5df6177a6fd42e1d60eb0fa9539d.

Reason for revert: The CL was not ready to land.

Original change's description:
> Fix rounding in scaling routines.
>
> * before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv
> * after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9
> * source: https://screenshot.googleplex.com/9yevVP7URe3XfSm
>
> Bug: b/465721312
> Change-Id: I2aede005db252b2912ceef23379463f176675205
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417
> Reviewed-by: Frank Barchard <fbarchard@google.com>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>

Bug: b/465721312
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Change-Id: Ic3d0de4d4475942bc91fbee17d012bc1b656589f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7816864
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Bot-Commit: rubber-stamper@appspot.gserviceaccount.com <rubber-stamper@appspot.gserviceaccount.com>
2026-05-06 03:30:24 -07:00
Sergey Silkin
37a848135b Fix rounding in scaling routines.
* before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv
* after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9
* source: https://screenshot.googleplex.com/9yevVP7URe3XfSm

Bug: b/465721312
Change-Id: I2aede005db252b2912ceef23379463f176675205
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-05 19:05:43 -07:00
Frank Barchard
125f151316 ARGBToNV12 use Matrix
Refactored Matrix functions (ARGBToI420Matrix, ARGBToI422Matrix, ARGBToI444Matrix and ARGBToNV12Matrix)
  and updated their CPU dispatch logic.

ARGBToNV12 clang
 68.05% ARGBToUVMatrixRow_AVX512BW
 21.04% ARGBToYMatrixRow_AVX512BW
  2.88% MergeUVRow_AVX512BW

ARGBToNV12 rowwin
 61.26% ARGBToUVMatrixRow_AVX2
 25.43% ARGBToYMatrixRow_AVX2
  3.09% MergeUVRow_AVX2

ARM on One Plus 15
 42.98% libyuv::ARGBToUVMatrixRow_SVE_SC()
 38.95% ARGBToYMatrixRow_NEON_DotProd
  2.96% MergeUVRow_NEON
  0.18% ARGBToUVMatrixRow_SVE2

ARGBToI420
 72.28% ARGBToUVMatrixRow_AVX512BW
 19.04% ARGBToYMatrixRow_AVX512BW

ARGBToI422
 77.46% ARGBToUVMatrixRow_AVX512BW
 15.55% ARGBToYMatrixRow_AVX512BW

ARGBToI444
 67.03% ARGBToYMatrixRow_AVX512BW
 24.80% ARGBToUV444MatrixRow_AVX512BW

Bug: libyuv:42280902
Change-Id: I463ebcdb70cb669a1ce1a81102b8fd2fb3943bd3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7819051
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-05 18:17:19 -07:00
Frank Barchard
561a9780e2 YUV to RGB avoid avx assist
Here are the functions flagged for mixing both SSE and AVX (or AVX-512)
instructions, which can trigger an AVX transition/assist performance
penalty:

Libyuv Functions addressed in this CL
   * I422ToARGBRow_AVX512BW
   * HalfFloatRow_SSE2

Not addressed:
   * ScaleFilterCols_SSSE3

Bug: libyuv:509681367
Change-Id: I8ced6065dfe0c516d05857086393782c8590062a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7814945
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-05-05 12:57:55 -07:00
Frank Barchard
5a17753597 libyuv: Optimize Convert8To8Row_NEON for 32-bit ARM
Benchmark (Convert8To8Plane 1280x720, 1000 repeats):

32-bit: 106 ms -> 44 ms
64-bit: 52 ms (unchanged)

Bug: libyuv:42280902
Change-Id: I389a482f93404984759ef6223d7d191579d3578d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7812450
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-05-04 13:00:21 -07:00
Frank Barchard
2143edfa7a ARGBToUVMatrixRow_NEON arm32 reimplemented for GCC
Bug: libyuv:508639302
Change-Id: Ib120373d799c66926a64c980873034be262d8848
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7810481
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
2026-05-04 11:38:45 -07:00
Frank Barchard
f2ac6db694 RAWToNV21 using SME, SVE, I8MM or Neon
Pixel 9 Now SVE2 2 pass LibYUVConvertTest.RAWToNV21_Opt (364 ms)
 31.76% libyuv::ARGBToUVMatrixRow_SVE_SC()
 30.38% RAWToARGBRow_SVE2
 26.81% ARGBToYMatrixRow_NEON_DotProd
  3.26% MergeUVRow_NEON

Was NEON 1 pass LibYUVConvertTest.RAWToJNV21_Opt (295 ms)
 44.14% RAWToYJRow_NEON
 41.91% RAWToUVJRow_NEON
  5.11% MergeUVRow_NEON

Clang on Intel Skylake
clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (301 ms)
visual c (row_win) [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (2056 ms)

clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (275 ms)
visual c [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (365 ms)

Bug: libyuv:42280902
Change-Id: Iaba558ebe96ce6b9881ee9335ba72b8aac390cde
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7802432
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-29 13:11:04 -07:00
Wan-Teh Chang
b438739c8b Use ptrdiff_t for buffer offsets
Use ptrdiff_t instead of intptr_t for buffer offsets, such as stride,
width_temp, and src_step*.

Change-Id: I64e6701fa71ab59c94325a6dad8762d040035208
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800070
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 18:21:42 -07:00
Frank Barchard
9a0226cb3f Add GEMINI.md with guidelines on libyuv
Bug: None
Change-Id: If5d2d84ff88b3c7069f0f6e9c98a4acb76078618
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800069
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-28 16:41:45 -07:00
Wan-Teh Chang
a7849e8a5e Fix yi * src_stride overflow in ScalePlaneVertical
Fix int overflow of yi * src_stride in ScalePlaneVertical(),
ScalePlaneVertical_16(), and ScalePlaneVertical_16To8() by casting the
operand src_stride to ptrdiff_t.

Adapted from the patches by Victor Miura <vmiura@google.com>.

Bug: 505814332
Change-Id: I4a4751041a213f7208b01eb18c43c9e196a36261
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796558
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2026-04-28 12:34:12 -07:00
Wan-Teh Chang
2895faed32 Use GTEST_SKIP() macro to skip TestI400LargeSize
Change-Id: I3a8f5c498b07f26dea5468fbecad9081f8bbe6d5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800542
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 12:30:58 -07:00
Wan-Teh Chang
54d40344ca No need to cast ptrdiff_t src_stride to intptr_t
ptrdiff_t is the appropriate type for a buffer offset. intptr_t is
intended for a different purpose.

Change-Id: I475c548338b61f573fb11766c24cde6d31fbbed8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796559
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2026-04-28 09:50:46 -07:00
Frank Barchard
4afb965416 RAWToARGB use AVX512BW
Bug: libyuv:42280902
Change-Id: I7a80fd64d97b6d411316819df0fd917d609a173b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787163
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-22 16:56:46 -07:00
Frank Barchard
bd2c4c76ec RAWToARGB AVX512VBMI
Bug: libyuv:42280902
Change-Id: I1c7f432f004079357a00515785bc524c459ed4b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787160
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-22 14:48:29 -07:00
Frank Barchard
d445250d8b Replace RAWToY/RGB24ToY with RGBToYMatrix
Bug: libyuv:42280902
Change-Id: I6ddebd492036c416550fc045eb39493dea73246b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7784094
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-21 17:11:14 -07:00
Frank Barchard
81f698829b Add RGBToNV21Matrix function
- implement wrappers with RAW, RGB24, NV21 and JNV21 to call it.

Zen5
Was [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1146 ms)
Now [       OK ] LibYUVConvertTest.RAWToJNV21_Opt (1446 ms)
Reason: the new code uses 1 pass for RAWToY but 2 passes (RAWToARGB, then ARGBToUV) for UV. It needs a 1-pass RGBToUV.

Bug: libyuv:42280902
Change-Id: Ife6fbed0829484045409e6d42b85cec1d1fd6052
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7780026
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2026-04-20 18:03:34 -07:00
Frank Barchard
9f13b2814d add RGBToYMatrixRow_AVX2
Adds RGBToYMatrixRow_AVX2 which reads 24 bit RGB values by reading 3 vectors instead of 4 and permutes them into 4 ARGB vectors before conversion.
Also adds RGBToYMatrixRow_Opt and RGBToYMatrixRow_2Step_Opt to convert_argb_test.cc to benchmark and compare the direct AVX2 conversion vs a 2-step approach.

./libyuv_test '--gunit_filter=*RAWToJ400_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1

AMD Zen 5
Was LibYUVConvertTest.RAWToJ400_Opt (757 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (699 ms)

Intel Skylake
Was LibYUVConvertTest.RAWToJ400_Opt (1705 ms)
Now LibYUVConvertTest.RAWToJ400_Opt (1426 ms)

Bug: 477295731
Change-Id: I29866baf4ad5fe7a3725e4a01f2fe24649510a7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7777325
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2026-04-20 12:52:44 -07:00
Frank Barchard
ddc6764d13 ARGBToUVMatrixRow_RVV replace vlseg8 with vlseg4,
implementing horizontal paired adds and accumulation to improve
performance on SiFive x280, and fixes the remainder logic to use valid
vlseg4 loads. Adds TestARGBToUVRow_Any to test odd-width remainder
handling.

Also fixes a build break for non-RVV compilations by ensuring all RVV
functions and their closing cplusplus braces are correctly wrapped in
#if !defined(LIBYUV_DISABLE_RVV).

Also adds NV12ToNV21 as a macro alias for NV21ToNV12 in
planar_functions.h, as the conversion is bidirectional (swapping byte
pairs in the interleaved chroma plane). (Patch from
https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762904)
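
To illustrate why the conversion is bidirectional, here is a plain C sketch of the byte-pair swap on one interleaved chroma row. SwapUVRow is a hypothetical function written for this example, not the libyuv implementation.

```c
#include <stdint.h>

// Hypothetical illustration (not libyuv API): swaps each U,V byte pair of an
// interleaved chroma row. width is the number of UV pairs in the row.
void SwapUVRow(const uint8_t* src_uv, uint8_t* dst_vu, int width) {
  for (int x = 0; x < width; ++x) {
    dst_vu[2 * x + 0] = src_uv[2 * x + 1];
    dst_vu[2 * x + 1] = src_uv[2 * x + 0];
  }
}
```

Applying the swap twice restores the original row, which is why one routine can serve both NV12 to NV21 and NV21 to NV12.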

Bug: libyuv:42280902
Change-Id: If2d6cbb3e232d63d43e32aba33fa9b2eee8190e5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7772164
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2026-04-17 15:04:45 -07:00