libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-06-15 08:26:06 +08:00

Author	SHA1	Message	Date
Frank Barchard	9d98aaefe7	InterpolateRow for Visual C - remove InterpolateRow_SSSE3 - optimize ARGBToUV444MatrixRow_AVX2 to use unsigned pixels 5.7x faster on AMD Zen4 Was C TestInterpolatePlane (144 ms) TestInterpolatePlane_16 (142 ms) Now AVX2 TestInterpolatePlane (25 ms) TestInterpolatePlane_16 (48 ms) Was signed ARGBToJ444_Opt (157 ms) Now unsigned ARGBToJ444_Opt (155 ms) Bug: None Change-Id: I903109668ff9cfedaddad1ad75411393b3226f41 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7856498 Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-05-18 17:28:46 -07:00
Frank Barchard	9f751100d2	InterpolateRow_16_AVX2 for row_gcc On AMD Zen4 Was C TestInterpolatePlane_16 (143 ms) Now AVX2 TestInterpolatePlane_16 (48 ms) Was I210ToI420_Opt (87 ms) 35.60% InterpolateRow_16To8_AVX2 31.03% Convert16To8Row_AVX512BW 21.35% Convert16To8Row_AVX2 Now I210ToI420_Opt (69 ms) 37.57% Convert16To8Row_AVX512BW 32.69% InterpolateRow_16_AVX2 7.18% Convert16To8Row_AVX2 5.23% InterpolateRow_16To8_AVX2 Bug: None Change-Id: Ica9b9c5dbd847068ae076b682c487e1753d3c812 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7855648 Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-18 14:29:36 -07:00
Frank Barchard	cda55fcf53	Mirror AVX2 functions for Visual C Bug: libyuv:42280902 Change-Id: Iabbec9af3a4f4dd89294e60145823c7fc4dd6ec6 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7843378 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-05-15 15:05:31 -07:00
Sergey Silkin	0f320a03f7	Fix linear interpolation C interpolator applied to chroma plane at scaling NV12 on Mac/ARM used (0x7f ^ f) which is (127-f) instead of (128-f). This resulted in changes like 128 -> 127 when scaling flat colors and caused visually noticeable difference. Bug: b/465721312 Change-Id: Iecf5d2ca2a85602de4146cba7e0f64ecb4b2c1fe Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830198 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Mirko Bonadei <mbonadei@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>	2026-05-13 05:33:33 -07:00
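The arithmetic behind this fix can be sketched in C. This is a minimal illustration, not the libyuv source: the names `blend_old` and `blend_fixed`, the 7-bit fraction range, and the `>> 7` divide are assumptions chosen so that the two weights match the `(127-f)` and `(128-f)` forms quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical 7-bit blend; f is the fraction in [0, 127].
// Old form: for a 7-bit f, (0x7f ^ f) equals 127 - f, so the two weights
// sum to 127 while the result is still divided by 128.
static uint8_t blend_old(uint8_t a, uint8_t b, int f) {
  assert(f >= 0 && f <= 127);
  return (uint8_t)((a * (0x7f ^ f) + b * f) >> 7);
}

// Fixed form: the weights (128 - f) and f sum to 128, matching the >> 7.
static uint8_t blend_fixed(uint8_t a, uint8_t b, int f) {
  assert(f >= 0 && f <= 127);
  return (uint8_t)((a * (128 - f) + b * f) >> 7);
}
```

Under these assumptions, blending two equal samples of 128 with `blend_old` yields 127 for every `f`, which is the 128 -> 127 drift on flat colors that the message describes, while `blend_fixed` returns 128.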
Frank Barchard	c6c8689c74	Fix I444 and J444 parameter names/order Bug: libyuv:42280902 Change-Id: Ia2c45f2d996d071534b08381f61adf8cb8ef35b9 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841767 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-12 15:30:35 -07:00
Frank Barchard	dd8b46630a	ARGBToUV444MatrixRow_AVX2 intrinsics for Visual C Was C LibYUVConvertTest.ARGBToI444_Opt (1027 ms) Now AVX2 LibYUVConvertTest.ARGBToI444_Opt (310 ms) Bug: libyuv:508639302 Change-Id: I0bc7f5c5b72160d24226a98d5fddb184a004ed00 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7841655 Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-05-12 14:19:58 -07:00
Frank Barchard	cb061d0378	Unittests use ASSERT instead of EXPECT Bug: libyuv:508639302 Change-Id: I22c35e08f3b6db1a656192877c1fb1bf4e96d6f5 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7838659 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-11 19:10:47 -07:00
Frank Barchard	e23282704f	ARGBToYRow_AVX512BW preserve XMM6-XMM15 due to Windows stack alignment Bug: 505124541 Change-Id: Id5ae539f57b314980182bec76a788e33273b2392 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7835639 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: James Zern <jzern@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-11 13:12:22 -07:00
Frank Barchard	4b4e68b372	ABGRToJ420 call ARGBToI420Matrix - Standardize libyuv ARGB-family (ARGB, ABGR, RGBA, BGRA) to YUV conversion by utilizing the generic MatrixRow architecture and explicit ArgbConstants. - Consolidated ARGBToI420, ABGRToI420, BGRAToI420, and RGBAToI420 as wrappers for ARGBToI420Matrix. - Refactored ABGRToJ420, ABGRToJ422, and ABGRToI422 to use generic matrix functions. - Added matrix-based versions for NV21, I400, YUY2, and UYVY. - Updated RAW and RGB24 to I420/I422/I444 dispatchers to use MatrixRow logic and explicit constants. - Fixed parameter swap bugs in ARGBToI422, ARGBToJ422, and ABGRToJ422. - Fixed a bug in the generic C implementation of matrix row functions ensuring all 4 channels are processed correctly for all ARGB-family formats. - Moved kShuffleAARRGGBB in row_gcc.cc to the top of the libyuv namespace for visibility. - Cleaned up redundant format-specific row implementations. Bug: libyuv:42280902 Change-Id: I67ffa4c476abc0d2dcc4650510d7bda91b65988e Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7830291 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-08 15:23:30 -07:00
Frank Barchard	4aacbbdfb4	Refactored RGB/RAW to YUV color conversion functions to use generic Matrix-based functions parameterized by ArgbConstants. This consolidation standardizes conversion logic, improves code maintainability, and provides flexible support for various color spaces (e.g., BT.601, JPEG full range). Key Modifications: - Function Consolidation: Refactored several high-level conversion functions into lightweight wrappers around generic Matrix variants: - ARGBToI420 → ARGBToI420Matrix - ARGBToI444 → ARGBToI444Matrix - ARGBToI422 → ARGBToI422Matrix - ARGBToNV12 → ARGBToNV12Matrix - RAWToJ400, RGB24ToJ400 → RGBToI400Matrix - RAWToI444, RAWToJ444 → RGBToI444Matrix - 2-Pass Conversions: Updated RGB565ToI420, ARGB1555ToI420, and ARGB4444ToI420 to utilize 2-pass conversions via RGBToI420Matrix. - Standardization: Refactored ARGBToNV21, ARGBToYUY2, and ARGBToUYVY to use parameterized matrix row functions (ARGBToYMatrixRow, ARGBToUVMatrixRow). - Legacy Cleanup: Replaced legacy calls to ARGBToYJRow with the parameterized ARGBToYMatrixRow in the ARGBSobelize helper. - Internal Integration: Included libyuv/convert_from_argb.h in planar_functions.cc and ensured all new matrix symbols are properly declared/exported (LIBYUV_API). Bug: libyuv:42280902 Change-Id: Ied5fd9899767427e3a03cdcfbeaff3e9d502374a Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7822033 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-06 20:02:47 -07:00
Mirko Bonadei	8773064a72	Revert "Fix rounding in scaling routines." This reverts commit 37a848135b2b5df6177a6fd42e1d60eb0fa9539d. Reason for revert: The CL was not ready to land. Original change's description: > Fix rounding in scaling routines. > > * before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv > * after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9 > * source: https://screenshot.googleplex.com/9yevVP7URe3XfSm > > Bug: b/465721312 > Change-Id: I2aede005db252b2912ceef23379463f176675205 > Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417 > Reviewed-by: Frank Barchard <fbarchard@google.com> > Reviewed-by: richard winterton <rrwinterton@gmail.com> Bug: b/465721312 No-Presubmit: true No-Tree-Checks: true No-Try: true Change-Id: Ic3d0de4d4475942bc91fbee17d012bc1b656589f Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7816864 Commit-Queue: Mirko Bonadei <mbonadei@chromium.org> Bot-Commit: rubber-stamper@appspot.gserviceaccount.com <rubber-stamper@appspot.gserviceaccount.com>	2026-05-06 03:30:24 -07:00
Sergey Silkin	37a848135b	Fix rounding in scaling routines. * before: https://screenshot.googleplex.com/3ujxU7drx8J9aVv * after: https://screenshot.googleplex.com/5twPmxuBUKjvVD9 * source: https://screenshot.googleplex.com/9yevVP7URe3XfSm Bug: b/465721312 Change-Id: I2aede005db252b2912ceef23379463f176675205 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7813417 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-05-05 19:05:43 -07:00
Frank Barchard	125f151316	ARGBToNV12 use Matrix Refactored Matrix functions (ARGBToI420Matrix, ARGBToI422Matrix, ARGBToI444Matrix and ARGBToNV12Matrix) and updated their CPU dispatch logic. ARGBToNV12 clang 68.05% ARGBToUVMatrixRow_AVX512BW 21.04% ARGBToYMatrixRow_AVX512BW 2.88% MergeUVRow_AVX512BW ARGBToNV12 rowwin 61.26% ARGBToUVMatrixRow_AVX2 25.43% ARGBToYMatrixRow_AVX2 3.09% MergeUVRow_AVX2 ARM on One Plus 15 42.98% libyuv::ARGBToUVMatrixRow_SVE_SC() 38.95% ARGBToYMatrixRow_NEON_DotProd 2.96% MergeUVRow_NEON 0.18% ARGBToUVMatrixRow_SVE2 ARGBToI420 72.28% ARGBToUVMatrixRow_AVX512BW 19.04% ARGBToYMatrixRow_AVX512BW ARGBToI422 77.46% ARGBToUVMatrixRow_AVX512BW 15.55% ARGBToYMatrixRow_AVX512BW ARGBToI444 67.03% ARGBToYMatrixRow_AVX512BW 24.80% ARGBToUV444MatrixRow_AVX512BW Bug: libyuv:42280902 Change-Id: I463ebcdb70cb669a1ce1a81102b8fd2fb3943bd3 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7819051 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-05 18:17:19 -07:00
Frank Barchard	561a9780e2	YUV to RGB avoid avx assist Here are the functions flagged for mixing both SSE and AVX (or AVX-512) instructions, which can trigger an AVX transition/assist performance penalty: Libyuv Functions addressed in this CL * I422ToARGBRow_AVX512BW * HalfFloatRow_SSE2 Not addressed: * ScaleFilterCols_SSSE3 Bug: libyuv:509681367 Change-Id: I8ced6065dfe0c516d05857086393782c8590062a Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7814945 Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-05-05 12:57:55 -07:00
Frank Barchard	5a17753597	libyuv: Optimize Convert8To8Row_NEON for 32-bit ARM Benchmark (Convert8To8Plane 1280x720, 1000 repeats): 32-bit: 106 ms -> 44 ms 64-bit: 52 ms (unchanged) Bug: libyuv:42280902 Change-Id: I389a482f93404984759ef6223d7d191579d3578d Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7812450 Reviewed-by: Justin Green <greenjustin@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-05-04 13:00:21 -07:00
Frank Barchard	2143edfa7a	ARGBToUVMatrixRow_NEON arm32 reimplemented for GCC Bug: libyuv:508639302 Change-Id: Ib120373d799c66926a64c980873034be262d8848 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7810481 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Justin Green <greenjustin@google.com>	2026-05-04 11:38:45 -07:00
Frank Barchard	f2ac6db694	RAWToNV21 using SME, SVE, I8MM or Neon Pixel 9 Now SVE2 2 pass LibYUVConvertTest.RAWToNV21_Opt (364 ms) 31.76% libyuv::ARGBToUVMatrixRow_SVE_SC() 30.38% RAWToARGBRow_SVE2 26.81% ARGBToYMatrixRow_NEON_DotProd 3.26% MergeUVRow_NEON Was NEON 1 pass LibYUVConvertTest.RAWToJNV21_Opt (295 ms) 44.14% RAWToYJRow_NEON 41.91% RAWToUVJRow_NEON 5.11% MergeUVRow_NEON Clang on Intel Skylake clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (301 ms) visual c (row_win) [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (2056 ms) clang [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (275 ms) visual c [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (365 ms) Bug: libyuv:42280902 Change-Id: Iaba558ebe96ce6b9881ee9335ba72b8aac390cde Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7802432 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Dale Curtis <dalecurtis@chromium.org>	2026-04-29 13:11:04 -07:00
Wan-Teh Chang	b438739c8b	Use ptrdiff_t for buffer offsets Use ptrdiff_t instead of intptr_t for buffer offsets, such as stride, width_temp, and src_step*. Change-Id: I64e6701fa71ab59c94325a6dad8762d040035208 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7800070 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Wan-Teh Chang <wtc@google.com>	2026-04-28 18:21:42 -07:00
Wan-Teh Chang	a7849e8a5e	Fix yi * src_stride overflow in ScalePlaneVertical Fix int overflow of yi * src_stride in ScalePlaneVertical(), ScalePlaneVertical_16(), and ScalePlaneVertical_16To8() by casting the operand src_stride to ptrdiff_t. Adapted from the patches by Victor Miura <vmiura@google.com>. Bug: 505814332 Change-Id: I4a4751041a213f7208b01eb18c43c9e196a36261 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796558 Commit-Queue: Wan-Teh Chang <wtc@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com>	2026-04-28 12:34:12 -07:00
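A short C sketch of the cast described in this commit. The helpers `row_ptr` and `row_offset` are hypothetical and are not taken from ScalePlaneVertical(); the sketch also assumes a platform where `ptrdiff_t` is 64 bits wide and `int` is 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: pointer to row yi of a plane with the given stride.
// Casting one operand to ptrdiff_t makes the multiply happen in ptrdiff_t,
// so the product is not truncated to int when it exceeds INT_MAX.
static const uint8_t* row_ptr(const uint8_t* src, int yi, int src_stride) {
  return src + yi * (ptrdiff_t)src_stride;
}

// Same computation, returning only the byte offset for inspection.
static ptrdiff_t row_offset(int yi, int src_stride) {
  return yi * (ptrdiff_t)src_stride;
}
```

Without the cast, `yi * src_stride` would be evaluated in `int`, and a product such as 70000 * 70000 would overflow before being added to the pointer.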
Wan-Teh Chang	54d40344ca	No need to cast ptrdiff_t src_stride to intptr_t ptrdiff_t is the appropriate type for a buffer offset. intptr_t is intended for a different purpose. Change-Id: I475c548338b61f573fb11766c24cde6d31fbbed8 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7796559 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Wan-Teh Chang <wtc@google.com>	2026-04-28 09:50:46 -07:00
Frank Barchard	4afb965416	RAWToARGB use AVX512BW Bug: libyuv:42280902 Change-Id: I7a80fd64d97b6d411316819df0fd917d609a173b Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787163 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-04-22 16:56:46 -07:00
Frank Barchard	bd2c4c76ec	RAWToARGB AVX512VBMI Bug: libyuv:42280902 Change-Id: I1c7f432f004079357a00515785bc524c459ed4b9 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7787160 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-04-22 14:48:29 -07:00
Frank Barchard	d445250d8b	Replace RAWToY/RGB24ToY with RGBToYMatrix Bug: libyuv:42280902 Change-Id: I6ddebd492036c416550fc045eb39493dea73246b Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7784094 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-04-21 17:11:14 -07:00
Frank Barchard	81f698829b	Add RGBToNV21Matrix function - implement wrappers with RAW, RGB24, NV21 and JNV21 to call it. Zen5 Was [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (1146 ms) Now [ OK ] LibYUVConvertTest.RAWToJNV21_Opt (1446 ms) reason - the new code uses 1 pass for RAWToY but 2 pass for RAWToARGB,ARGBToUV. needs 1 RGBToUV Bug: libyuv:42280902 Change-Id: Ife6fbed0829484045409e6d42b85cec1d1fd6052 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7780026 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2026-04-20 18:03:34 -07:00
Frank Barchard	9f13b2814d	add RGBToYMatrixRow_AVX2 Adds RGBToYMatrixRow_AVX2 which reads 24 bit RGB values by reading 3 vectors instead of 4 and permutes them into 4 ARGB vectors before conversion. Also adds RGBToYMatrixRow_Opt and RGBToYMatrixRow_2Step_Opt to convert_argb_test.cc to benchmark and compare the direct AVX2 conversion vs a 2-step approach. ./libyuv_test '--gunit_filter=*RAWToJ400_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1 AMD Zen 5 Was LibYUVConvertTest.RAWToJ400_Opt (757 ms) Now LibYUVConvertTest.RAWToJ400_Opt (699 ms) Intel Skylake Was LibYUVConvertTest.RAWToJ400_Opt (1705 ms) Now LibYUVConvertTest.RAWToJ400_Opt (1426 ms) Bug: 477295731 Change-Id: I29866baf4ad5fe7a3725e4a01f2fe24649510a7d Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7777325 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Justin Green <greenjustin@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-20 12:52:44 -07:00
Frank Barchard	ddc6764d13	ARGBToUVMatrixRow_RVV replace vlseg8 with vlseg4, implementing horizontal paired adds and accumulation to improve performance on SiFive x280, and fixes the remainder logic to use valid vlseg4 loads. Adds TestARGBToUVRow_Any to test odd-width remainder handling. Also fixes a build break for non-RVV compilations by ensuring all RVV functions and their closing cplusplus braces are correctly wrapped in #if !defined(LIBYUV_DISABLE_RVV). Also adds NV12ToNV21 as a macro alias for NV21ToNV12 in planar_functions.h, as the conversion is bidirectional (swapping byte pairs in the interleaved chroma plane). (Patch from https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762904) Bug: libyuv:42280902 Change-Id: If2d6cbb3e232d63d43e32aba33fa9b2eee8190e5 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7772164 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-04-17 15:04:45 -07:00
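The reason a single routine can serve both directions is that swapping each byte pair is its own inverse. A minimal C sketch of that idea follows; `swap_uv_row` is a hypothetical name and a plain scalar loop, not the libyuv row function or the macro added in this commit.

```c
#include <stdint.h>

// Swap the two bytes of every chroma pair in an interleaved row.
// width is the number of chroma pairs, so the row holds 2 * width bytes.
static void swap_uv_row(const uint8_t* src_uv, uint8_t* dst_vu, int width) {
  for (int x = 0; x < width; ++x) {
    uint8_t u = src_uv[2 * x + 0];
    uint8_t v = src_uv[2 * x + 1];
    dst_vu[2 * x + 0] = v;
    dst_vu[2 * x + 1] = u;
  }
}
```

Applying `swap_uv_row` twice returns the original row, so the same code converts an NV12 chroma plane to NV21 and an NV21 chroma plane to NV12.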
Frank Barchard	ace7c4573c	Add ARGBToUV444MatrixRow_RVV, ARGBToUVMatrixRow_RVV, and wrappers This change implements ARGBToUV444MatrixRow_RVV, ARGBToUVMatrixRow_RVV, and their wrappers (ARGBToUVRow_RVV, ARGBToUVJRow_RVV, etc.) using RVV intrinsics, mirroring the NEON/AVX2 designs. It wires them into the build and dispatch systems. LIBYUV_RVV_HAS_TUPLE_TYPE is always true on new compilers. This macro has been removed, assuming it is true everywhere, reducing the amount of code in row_rvv.cc, scale_rvv.cc, and row.h. Tested via: ~/bin/doyuv3v && ~/bin/runyuv3v TestARGBToI444Matrix ~/bin/doyuv3av Bug: libyuv:42280902 Change-Id: I36d305386b297d69023c068aa9c62ab6b2ad039c Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7769956 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-16 20:52:43 -07:00
Chema Gonzalez	dec8272138	Fix typo Change-Id: I4dea1bcacc7d10dd2db74f4b221db42e2deade83 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7762903 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-04-16 14:27:40 -07:00
Frank Barchard	94644361b4	row_win.cc rewrite into intrinsics - remove inline asm which was only for 32 bit - add ARGBToYMatrixRow_AVX2 - add gn flag libyuv_enable_rowwin=true Example of building with GN and Ninja: Without the new flag: gn gen out/Release "--args=is_debug=false" ninja -C out/Release With the new flag: gn gen out/Release "--args=is_debug=false libyuv_enable_rowwin=true" ninja -C out/Release Bug: libyuv:42280806, 477295731, libyuv:42280902, libyuv:439628764 R=dalecurtis@chromium.org, rrwinterton@gmail.com Change-Id: I451bf814622fba690005c02fbf5816819c6a08c2 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7765790 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-15 19:53:16 -07:00
Frank Barchard	e034c41661	Port ARGBToUVMatrixRow from AVX2 to AVX512BW Benchmark on Icelake Xeon Now AVX512BW: [ OK ] LibYUVConvertTest.ARGBToNV12_Opt (1723 ms) Was AVX2: [ OK ] LibYUVConvertTest.ARGBToNV12_Opt (2144 ms) - Added `ARGBToUVMatrixRow_AVX512BW` implementation in `source/row_gcc.cc`. - Added corresponding `ARGBToUVRow_AVX512BW` and `ABGRToUVRow_AVX512BW` functions. - Added unaligned wrappers `ARGBToUVRow_Any_AVX512BW` and `ABGRToUVRow_Any_AVX512BW` in `source/row_any.cc`. - Updated `source/row_any.cc` to correctly size `vin` and `vout` buffers for AVX512BW width and adjusted the `ANY12MS` and `ANY12S` macros to handle `MASK=63`. - Updated `include/libyuv/row.h` with the required AVX512BW headers and definitions, scoped appropriately. - Wired all callers of `ARGBToUVRow_AVX2` and related functions in `source/convert.cc` and `source/convert_from_argb.cc` to dynamically use the `AVX512BW` implementations if the CPU flag indicates AVX-512BW support. - Optimized AVX-512 code to generate the `-1` multiplier in a single instruction (`vpternlogd`) and reused it across word (`vpmaddwd`) dot products. Handled the resulting negation by replacing a subtraction with `vpaddw` offset adjustment. Bug: 477295731 R=dalecurtis@chromium.org, rrwinterton@gmail.com Change-Id: Ida5fb27e59ae4c1c3824737f009b80549cd20a06 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7763257 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-14 16:15:31 -07:00
Frank Barchard	cbc64c353c	Port ARGBToYRow_AVX2 usages to dynamically use ARGBToYRow_AVX512BW I have successfully ported the usage of ARGBToYRow_AVX2 to dynamically detect and utilize ARGBToYRow_AVX512BW when available. Here's a summary of the changes: 1. Source Modifications: In both source/convert.cc and source/convert_from_argb.cc, I searched for all references where ARGBToYRow_AVX2 was being conditionally used (which operates on 32 pixels). 2. AVX512BW Detection: Immediately following those blocks, I injected a new check for kCpuHasAVX512BW. If the CPU flag is present, the logic now utilizes ARGBToYRow_Any_AVX512BW by default, falling back to the fully aligned ARGBToYRow_AVX512BW when the width is aligned to 64 bytes. 3. Profiling: After building and compiling the tests (doyuv3x), I validated the change using perfyuv3 ARGBToNV12_Opt \| cat. The test successfully executed and the performance profile indicated that ARGBToYRow_AVX512BW successfully executed (taking up ~18% of CPU cycles, replacing the previous AVX2 specific instruction overhead for the Y row extraction). The HAS_ARGBTOYROW_AVX512BW macro implementation now fully supports all AVX2 conversion paths to utilize AVX512BW when the system processor flags allow it! R=richard, rrwinterton@gmail.com Change-Id: Iad811e12d301f5621e6f6d039105420861ade43e Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7760779 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-04-14 11:42:59 -07:00
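The dispatch order described in this message (AVX2 first, then an AVX512BW check that overrides it) can be sketched as follows. Everything here is a stand-in: the flag values, the `RowFn` type, and the stub kernels are hypothetical and only tag the output so the chosen path is observable. The 32 and 64 pixel widths follow the message.

```c
#include <stdint.h>

typedef void (*RowFn)(const uint8_t* src, uint8_t* dst, int width);

// Hypothetical stubs: each writes a distinct tag instead of converting pixels.
static void Row_C(const uint8_t* s, uint8_t* d, int w) { (void)s; (void)w; d[0] = 0; }
static void Row_Any_AVX2(const uint8_t* s, uint8_t* d, int w) { (void)s; (void)w; d[0] = 1; }
static void Row_AVX2(const uint8_t* s, uint8_t* d, int w) { (void)s; (void)w; d[0] = 2; }
static void Row_Any_AVX512BW(const uint8_t* s, uint8_t* d, int w) { (void)s; (void)w; d[0] = 3; }
static void Row_AVX512BW(const uint8_t* s, uint8_t* d, int w) { (void)s; (void)w; d[0] = 4; }

enum { kFlagAVX2 = 1 << 0, kFlagAVX512BW = 1 << 1 };  // hypothetical CPU flags

// Later checks override earlier ones, so the widest supported path wins.
static RowFn choose_row(int cpu_flags, int width) {
  RowFn fn = Row_C;
  if (cpu_flags & kFlagAVX2) {
    fn = Row_Any_AVX2;                        // handles any width
    if ((width & 31) == 0) fn = Row_AVX2;     // width is a multiple of 32
  }
  if (cpu_flags & kFlagAVX512BW) {
    fn = Row_Any_AVX512BW;                    // handles any width
    if ((width & 63) == 0) fn = Row_AVX512BW; // width is a multiple of 64
  }
  return fn;
}
```

With both flags set, a width of 128 selects the aligned AVX512BW stub and a width of 100 selects the `Any` variant; with no flags set the C fallback is used.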
Frank Barchard	59ca5d8074	Fix parameter names and comments for ARGB/BGRA/RGBA/ABGR functions In all functions that start with ARGB, BGRA, RGBA or ABGR in the include/libyuv/ headers, make sure the parameter variable name has the same 4 letters, but lower case, and the comment before the function should have the same matching name. Then make sure the implementation in source/ folder has the same variable names. Change-Id: Idadbbbb993156eea16e318719f4888cb3bed5f6a Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7760057 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-13 18:28:37 -07:00
Frank Barchard	893eacf9b4	ARGBToY for AVX512 - add ARGBToYMatrixRow_AVX512BW - refactor SSE and AVX to use Matrix functions, making old functions call the new ones. Zen5 1280x720 Was AVX2 LibYUVConvertTest.ARGBToI444_Opt (1125 ms) Now AVX512 LibYUVConvertTest.ARGBToI444_Opt (641 ms) Details by Gemini: 1. Created 3 new Matrix functions: Added ARGBToYMatrixRow_SSSE3, ARGBToYMatrixRow_AVX2, and ARGBToYMatrixRow_AVX512BW to source/row_gcc.cc. These take the const struct ArgbConstants* c parameter similarly to ARGBToUV444MatrixRow_. The x86 vector instructions dynamically calculate the needed values using the properties of the constants struct, including using vpmaddwd inside the AVX512 code to offset the lack of a native vphaddw. 2. Replaced Old Functions with Wrappers: Modified the existing implementations of ARGBToYRow_SSSE3, ARGBToYJRow_SSSE3, ABGRToYRow_SSSE3, ABGRToYJRow_SSSE3, RGBAToYRow_SSSE3, RGBAToYJRow_SSSE3, BGRAToYRow_SSSE3 (and their _AVX2 equivalents) in source/row_gcc.cc to act as inline wrappers calling the new ARGBToYMatrixRow_ functions, passing the right matrix parameters (e.g. &kArgbI601Constants, &kArgbJPEGConstants, &kAbgrI601Constants). 3. Added row_any.cc Handlers: Added ANY11MC definitions to source/row_any.cc to autogenerate ARGBToYMatrixRow_Any_SSSE3, ARGBToYMatrixRow_Any_AVX2, and ARGBToYMatrixRow_Any_AVX512BW which safely handles non-aligned tails. 4. Updated include/libyuv/row.h: Updated the headers with the proper void declarations for all newly generated Matrix and Any_ variants. Also defined HAS_ARGBTOYROW_AVX512BW in the CPU macros. 5. Tested the Implementations: Compiled and tested on Linux x86, which resulted in all tests passing cleanly. Also successfully completed all Windows 32-bit build checks ensuring 32-bit regression prevention without issues. Bug: 477295731 Change-Id: I4f5eec9a961e24a9d760d0a1c0810fb5e29a0bd1 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7759494 Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2026-04-13 17:26:07 -07:00
Frank Barchard	644251f252	Fix buffer sizes in ANY macros and ANY11MC typo Increases buffer sizes from 128 to 256 in ANY11, ANY11C, ANY11MC, ANY12, and ANY12M macros to safely accommodate AVX512BW processing which can write up to 256 bytes per operation. Bug: libyuv:42280902, libyuv:502250231, 501882928 Change-Id: Icfba1982dc5fb6545255464f7decb2baec7be90f Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7758060 Reviewed-by: James Zern <jzern@google.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2026-04-13 13:01:49 -07:00
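The role of those temporary buffers can be sketched with a generic "any width" wrapper: the block kernel handles the part of the row that is a whole number of blocks, and the remainder is copied into a padded buffer, processed as one full block, and copied back. The kernel, the names, and the 64-byte block size below are hypothetical; only the 256-byte buffer size is taken from the commit message.

```c
#include <stdint.h>
#include <string.h>

enum { kBlock = 64 };  // bytes per kernel step; hypothetical

// Stand-in for a SIMD kernel: only valid when width is a multiple of kBlock.
static void InvertRow_Block(const uint8_t* src, uint8_t* dst, int width) {
  for (int x = 0; x < width; ++x) dst[x] = (uint8_t)(255 - src[x]);
}

// Wrapper that accepts any width by routing the tail through padded buffers.
static void InvertRow_Any(const uint8_t* src, uint8_t* dst, int width) {
  uint8_t vin[256];   // sized as in the commit; this kernel uses only kBlock
  uint8_t vout[256];
  int r = width & (kBlock - 1);  // leftover bytes
  int n = width - r;             // bytes covered by whole blocks
  if (n > 0) InvertRow_Block(src, dst, n);
  if (r > 0) {
    memset(vin, 0, sizeof(vin));
    memcpy(vin, src + n, (size_t)r);
    InvertRow_Block(vin, vout, kBlock);  // full block on the padded copy
    memcpy(dst + n, vout, (size_t)r);    // keep only the valid bytes
  }
}
```

The temporary buffers must be at least as large as the widest read or write the kernel can perform in one step, which is why a wider kernel calls for larger buffers.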
Frank Barchard	5cfaa44d71	Replace strtok_r with strchr in RISC-V CPU capability detection This fixes a build failure on bare-metal toolchains like riscv64-unknown-elf-clang++ where strtok_r may be undeclared. Bug: 477295731 Change-Id: If4edd6c6d2e975ae34278f479700ef9b996c0a3e Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7744872 Reviewed-by: James Zern <jzern@google.com>	2026-04-10 12:33:43 -07:00
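Tokenizing with `strchr` instead of `strtok_r` can be sketched as below. This is an illustration of the general technique only: `has_token`, the separator, and the example strings are hypothetical and do not reproduce the parsing code in libyuv's RISC-V capability detection.

```c
#include <stddef.h>
#include <string.h>

// Returns 1 if name appears as a whole token in list, where tokens are
// separated by sep. Unlike strtok/strtok_r, this never modifies list.
static int has_token(const char* list, char sep, const char* name) {
  size_t name_len = strlen(name);
  const char* p = list;
  while (p != NULL && *p != '\0') {
    const char* end = strchr(p, sep);  // next separator, or NULL at the end
    size_t len = end ? (size_t)(end - p) : strlen(p);
    if (len == name_len && memcmp(p, name, len) == 0) return 1;
    p = end ? end + 1 : NULL;          // step past the separator
  }
  return 0;
}
```

Because `strchr` is part of standard C, a loop like this also builds on toolchains that do not declare `strtok_r`.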
Frank Barchard	5b5a2f6b92	Fix 'ghost AVX512' detection on Alder Lake CPUs Adds a check for the AVX512F feature bit (cpu_info7[1] & 0x00010000) before enabling AVX512 features. Alder Lake CPUs can report OS support for YMM/ZMM but not actually support AVX512F, leading to incorrect capability detection and crashes. Bug: libyuv:500318522 Change-Id: I84167ee3fcfc7a2572afba148bbb275bd3ccb1e5 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7746229 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Dale Curtis <dalecurtis@chromium.org>	2026-04-09 17:52:24 -07:00
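The bit test quoted in this message can be sketched as a pair of small C functions. The mask 0x00010000 is taken from the commit and corresponds to bit 16 of EBX for CPUID leaf 7, sub-leaf 0; the function names and the `os_saves_zmm` parameter are hypothetical and stand in for whatever OS-support check precedes it.

```c
#include <stdint.h>

// CPUID leaf 7, sub-leaf 0 reports AVX512F in EBX bit 16 (mask 0x00010000).
static int has_avx512f_bit(uint32_t cpuid7_ebx) {
  return (cpuid7_ebx & 0x00010000u) != 0;
}

// Hypothetical gate: OS support for ZMM state alone is not sufficient;
// the CPU must also report the AVX512F feature bit.
static int can_use_avx512(int os_saves_zmm, uint32_t cpuid7_ebx) {
  return os_saves_zmm && has_avx512f_bit(cpuid7_ebx);
}
```

In this sketch a CPU that reports OS support but leaves bit 16 clear is rejected, which is the "ghost AVX512" case the commit describes.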
Frank Barchard	4c3d7d517a	ARGBToUV444 for AVX512 1.27x faster on AMD Zen5 (turin) Now AVX512 perf record ./libyuv_test '--gunit_filter=*ARGBToI444_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=10000 --libyuv_flags=-1 --libyuv_cpu_info=-1 [ OK ] LibYUVConvertTest.ARGBToI444_Opt (1071 ms) Overhead Symbol 53.49% ARGBToYRow_AVX2 44.70% ARGBToUV444Row_AVX512BW Was AVX2 [ OK ] LibYUVConvertTest.ARGBToI444_Opt (1369 ms) 61.06% ARGBToUV444Row_AVX2 37.67% ARGBToYRow_AVX2 Bug: libyuv:42280902 Change-Id: I306fbac656d6f7834ce1559e86d01eb34931ec3c Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7738362 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Dale Curtis <dalecurtis@chromium.org>	2026-04-08 19:25:41 -07:00
Sam Maier	7903a6c632	Fix deprecated usage of strtok The latest Android NDK marks strtok as deprecated and suggests using strtok_r instead. Bug: 477295731 Change-Id: I2b20a2ae0a9e19ec93e31669ec380802e6902090 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7739107 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Wan-Teh Chang <wtc@google.com> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2026-04-08 11:34:51 -07:00
Dale Curtis	1170363ce5	Add Gemini implementation for NEON32 RGB to YUV matrix operations These are about 25% faster than the C versions. Bug: libyuv:42280902 Change-Id: I8b298670ee5f3ed5db35527fc41d6d9a51b020a1 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7573682 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Dale Curtis <dalecurtis@chromium.org>	2026-03-23 16:30:44 -07:00
Frank Barchard	4183733af5	Rename MergeUVRow_ variable to MergeUVRow Bug: libyuv:42280902 Change-Id: I9935bf958b901ddf84cf91b2097c8cd5d6efadde Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7683070 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Dale Curtis <dalecurtis@chromium.org>	2026-03-18 17:18:25 -07:00
Dale Curtis	b1cacfb38f	Unify X86/X64 versions of ARGBToI4xxMatrix functions Change-Id: Iead13414414543e5f10ba9ba47a6ceaeb3113dee Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7562443 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: Wan-Teh Chang <wtc@google.com>	2026-03-18 16:27:07 -07:00
Dale Curtis	f69a479f04	Add ARGBToNV12Matrix implementation This one reuses the SIMD implementations for MergeUVRow_ from the existing ARGBToNV12 functions. Bug: libyuv:42280902 Change-Id: If0a4be133d657ed0262f29fdd568dac90b49636c Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7564317 Reviewed-by: Wan-Teh Chang <wtc@google.com> Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Dale Curtis <dalecurtis@chromium.org>	2026-03-18 16:26:59 -07:00
Dale Curtis	2c21d57319	Add ABGR versions of the ArgbConstants structures This allows for ABGR conversion using the same methods Bug: libyuv:42280902 Change-Id: I5566e3150b30573a2326a900ce31ab095f8935f9 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7564316 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: Wan-Teh Chang <wtc@google.com>	2026-03-17 17:28:51 -07:00
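Why a second set of constants is enough to support another byte order can be shown with a scalar sketch. The struct `YCoeffs`, the constant names, and the row function below are hypothetical and do not reflect the real ArgbConstants layout. The coefficients are the commonly used 8-bit BT.601 limited-range luma approximation (66, 129, 25 with a bias of 0x1080), stated here as an assumption.

```c
#include <stdint.h>

// Hypothetical layout: one luma coefficient per byte of a 4-byte pixel,
// in memory order, plus a combined rounding and offset bias.
struct YCoeffs {
  uint8_t k[4];
  uint16_t bias;
};

// ARGB is stored B, G, R, A in memory; ABGR is stored R, G, B, A.
// Only the coefficient order differs between the two sets.
static const struct YCoeffs kSketchArgb601 = {{25, 129, 66, 0}, 0x1080};
static const struct YCoeffs kSketchAbgr601 = {{66, 129, 25, 0}, 0x1080};

// One generic row function serves every byte order via its constants.
static void ToYMatrixRow_Sketch(const uint8_t* src, uint8_t* dst_y, int width,
                                const struct YCoeffs* c) {
  for (int x = 0; x < width; ++x) {
    const uint8_t* p = src + 4 * x;
    int sum = p[0] * c->k[0] + p[1] * c->k[1] + p[2] * c->k[2] +
              p[3] * c->k[3] + c->bias;
    dst_y[x] = (uint8_t)(sum >> 8);
  }
}
```

Passing `&kSketchArgb601` for ARGB input and `&kSketchAbgr601` for ABGR input yields the same luma for the same color, without a format-specific row function.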
Dale Curtis	30809ff64a	Add ARGBToI4xxMatrix variants This was implemented by Gemini followed by manual review and some tweaking for style. The 601 and JPEG constants are fully verified against the existing non-matrix implementations. On x86 the C-only versions appear to be about 25% slower than the optimized ones. Bug: libyuv:42280902 Change-Id: Ia5b7cb499bad5c76faec53f36086ebb18f2b530f Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7512030 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Wan-Teh Chang <wtc@google.com> Commit-Queue: Dale Curtis <dalecurtis@chromium.org>	2026-03-04 10:55:06 -08:00
Valentin Haudiquet	022efdb0b7	RVV: Enable RVV on GCC GCC now supports vector segment load and store, which was previously missing; and the reason why it was disabled. Change-Id: I923fd8a15476de8dcc2103bb8335d4fcc3ca96a9 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7241606 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Wan-Teh Chang <wtc@google.com> Commit-Queue: Wan-Teh Chang <wtc@google.com>	2026-01-06 11:16:24 -08:00
Frank Barchard	900da61d3c	Experimental SVE FMMLA detect Detect whether the Arm CPU supports the FMMLA instruction Bug: None Change-Id: Ia7b83bf2735ddeeb8a85da44177e708c34e4b1fb Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7085486 Reviewed-by: Wan-Teh Chang <wtc@google.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2025-10-27 14:34:55 -07:00
Frank Barchard	2b4453d46f	Deprecate MIPS and MSA support. - Remove *_msa.cc source files - Update build files - Update header references, planar ifdefs for row functions - Update documentation on supported platforms - Version bumped to 1921 - clang-format applied Bug: 434383432 Change-Id: I072d6aac4956f0ed668e64614ac8557612171f76 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/7045953 Reviewed-by: Justin Green <greenjustin@google.com>	2025-10-16 12:20:40 -07:00
Frank Barchard	94417b9d21	Pass rgbconstants via struct pointer instead of elements with m Now 66 instructions SYM ARGBToUVRow_SSSE3: 62ccd0: BASE push ebp 62ccd1: BASE mov ebp, esp 62ccd3: BASE push ebx 62ccd4: BASE push edi 62ccd5: BASE push esi 62ccd6: BASE and esp, 0xfffffffc 62ccd9: BASE sub esp, 0xc 62ccdc: BASE call 0x62cce1 <ARGBToUVRow_SSSE3+0x11> 62cce1: BASE pop eax 62cce2: BASE add eax, 0xe1c27 62cce8: BASE mov ecx, dword ptr [ebp+0xc] 62cceb: BASE mov edx, dword ptr [ebp+0x8] 62ccee: BASE mov esi, dword ptr [ebp+0x10] 62ccf1: BASE mov edi, dword ptr [ebp+0x18] 62ccf4: BASE mov dword ptr [esp+0x8], edi 62ccf8: BASE mov edi, dword ptr [ebp+0x14] 62ccfb: BASE lea ebx, ptr [eax-0x5ecf88] 62cd01: SSE2 movdqa xmm4, xmmword ptr [ebx] 62cd05: SSE2 movdqa xmm5, xmmword ptr [ebx+0x10] 62cd0a: SSE2 pcmpeqb xmm6, xmm6 62cd0e: SSSE3 pabsb xmm6, xmm6 62cd13: SSE2 movdqa xmm7, xmmword ptr [eax-0x5ecfa8] 62cd1b: BASE sub edi, esi 62cd1d: SSE2 movdqu xmm0, xmmword ptr [edx] 62cd21: SSE2 movdqu xmm1, xmmword ptr [edx+0x10] 62cd26: SSE2 movdqu xmm2, xmmword ptr [edx+ecx*1] 62cd2b: SSE2 movdqu xmm3, xmmword ptr [edx+ecx*1+0x10] 62cd31: SSSE3 pshufb xmm0, xmm7 62cd36: SSSE3 pshufb xmm1, xmm7 62cd3b: SSSE3 pshufb xmm2, xmm7 62cd40: SSSE3 pshufb xmm3, xmm7 62cd45: SSSE3 pmaddubsw xmm0, xmm6 62cd4a: SSSE3 pmaddubsw xmm1, xmm6 62cd4f: SSSE3 pmaddubsw xmm2, xmm6 62cd54: SSSE3 pmaddubsw xmm3, xmm6 62cd59: SSE2 paddw xmm0, xmm2 62cd5d: SSE2 paddw xmm1, xmm3 62cd61: SSE2 pxor xmm2, xmm2 62cd65: SSE2 psrlw xmm0, 0x1 62cd6a: SSE2 psrlw xmm1, 0x1 62cd6f: SSE2 pavgw xmm0, xmm2 62cd73: SSE2 pavgw xmm1, xmm2 62cd77: SSE2 packuswb xmm0, xmm1 62cd7b: SSE2 movdqa xmm2, xmm6 62cd7f: SSE2 psllw xmm2, 0xf 62cd84: SSE2 movdqa xmm1, xmm0 62cd88: SSSE3 pmaddubsw xmm1, xmm5 62cd8d: SSSE3 pmaddubsw xmm0, xmm4 62cd92: SSSE3 phaddw xmm0, xmm1 62cd97: SSE2 psubw xmm2, xmm0 62cd9b: SSE2 psrlw xmm2, 0x8 62cda0: SSE2 packuswb xmm2, xmm2 62cda4: SSE2 movd dword ptr [esi], xmm2 62cda8: SSE2 pshufd xmm2, xmm2, 0x55 62cdad: SSE2 movd dword ptr [esi+edi*1], xmm2 62cdb2: BASE lea edx, ptr [edx+0x20] 62cdb5: BASE lea esi, ptr [esi+0x4] 62cdb8: BASE sub dword ptr [esp+0x8], 0x8 62cdbd: BASE jnle 0x62cd1d <ARGBToUVRow_SSSE3+0x4d> 62cdc3: BASE lea esp, ptr [ebp-0xc] 62cdc6: BASE pop esi 62cdc7: BASE pop edi 62cdc8: BASE pop ebx 62cdc9: BASE pop ebp 62cdca: BASE ret Was 68 instructions ARGBToUVRow_SSSE3: 62ccd0: BASE push ebp 62ccd1: BASE mov ebp, esp 62ccd3: BASE push edi 62ccd4: BASE push esi 62ccd5: BASE and esp, 0xfffffff0 62ccd8: BASE sub esp, 0x30 62ccdb: BASE call 0x62cce0 <ARGBToUVRow_SSSE3+0x10> 62cce0: BASE pop eax 62cce1: BASE add eax, 0xe1c28 62cce7: BASE mov ecx, dword ptr [ebp+0xc] 62ccea: BASE mov edx, dword ptr [ebp+0x8] 62cced: BASE mov esi, dword ptr [ebp+0x10] 62ccf0: BASE mov edi, dword ptr [ebp+0x18] 62ccf3: BASE mov dword ptr [esp+0xc], edi 62ccf7: BASE mov edi, dword ptr [ebp+0x14] 62ccfa: SSE movaps xmm0, xmmword ptr [eax-0x5ecf88] 62cd01: SSE movaps xmmword ptr [esp+0x20], xmm0 62cd06: SSE movaps xmm0, xmmword ptr [eax-0x5ecf78] 62cd0d: SSE movaps xmmword ptr [esp+0x10], xmm0 62cd12: SSE2 movdqa xmm4, xmmword ptr [esp+0x20] 62cd18: SSE2 movdqa xmm5, xmmword ptr [esp+0x10] 62cd1e: SSE2 pcmpeqb xmm6, xmm6 62cd22: SSSE3 pabsb xmm6, xmm6 62cd27: SSE2 movdqa xmm7, xmmword ptr [eax-0x5ecfa8] 62cd2f: BASE sub edi, esi 62cd31: SSE2 movdqu xmm0, xmmword ptr [edx] 62cd35: SSE2 movdqu xmm1, xmmword ptr [edx+0x10] 62cd3a: SSE2 movdqu xmm2, xmmword ptr [edx+ecx*1] 62cd3f: SSE2 movdqu xmm3, xmmword ptr [edx+ecx*1+0x10] 62cd45: SSSE3 pshufb xmm0, xmm7 62cd4a: SSSE3 pshufb xmm1, xmm7 62cd4f: SSSE3 pshufb xmm2, xmm7 62cd54: SSSE3 pshufb xmm3, xmm7 62cd59: SSSE3 pmaddubsw xmm0, xmm6 62cd5e: SSSE3 pmaddubsw xmm1, xmm6 62cd63: SSSE3 pmaddubsw xmm2, xmm6 62cd68: SSSE3 pmaddubsw xmm3, xmm6 62cd6d: SSE2 paddw xmm0, xmm2 62cd71: SSE2 paddw xmm1, xmm3 62cd75: SSE2 pxor xmm2, xmm2 62cd79: SSE2 psrlw xmm0, 0x1 62cd7e: SSE2 psrlw xmm1, 0x1 62cd83: SSE2 pavgw xmm0, xmm2 62cd87: SSE2 pavgw xmm1, xmm2 62cd8b: SSE2 packuswb xmm0, xmm1 62cd8f: SSE2 movdqa xmm2, xmm6 62cd93: SSE2 psllw xmm2, 0xf 62cd98: SSE2 movdqa xmm1, xmm0 62cd9c: SSSE3 pmaddubsw xmm1, xmm5 62cda1: SSSE3 pmaddubsw xmm0, xmm4 62cda6: SSSE3 phaddw xmm0, xmm1 62cdab: SSE2 psubw xmm2, xmm0 62cdaf: SSE2 psrlw xmm2, 0x8 62cdb4: SSE2 packuswb xmm2, xmm2 62cdb8: SSE2 movd dword ptr [esi], xmm2 62cdbc: SSE2 pshufd xmm2, xmm2, 0x55 62cdc1: SSE2 movd dword ptr [esi+edi*1], xmm2 62cdc6: BASE lea edx, ptr [edx+0x20] 62cdc9: BASE lea esi, ptr [esi+0x4] 62cdcc: BASE sub dword ptr [esp+0xc], 0x8 62cdd1: BASE jnle 0x62cd31 <ARGBToUVRow_SSSE3+0x61> 62cdd7: BASE lea esp, ptr [ebp-0x8] 62cdda: BASE pop esi 62cddb: BASE pop edi 62cddc: BASE pop ebp 62cddd: BASE ret 62cdde: BASE int3 BUG=444157316 Change-Id: Iad044f851359f5b052091c7bdab9b96946fc3682 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6987370 Reviewed-by: Justin Green <greenjustin@google.com>	2025-09-29 12:34:36 -07:00
Daniel.L (Byoungchan Lee)	5b22f31cb5	Fix compilation issue for 32bit PIC build Currently, ARGBToUVMatrixRow_AVX2 and ARGBToUVMatrixRow_SSSE3 fail to compile with clang on 32bit PIC build with the error message: inline assembly requires more registers than available This is because in PIC code EBX is reserved for the GOT and with a frame pointer EBP is also unavailable. Fix this by copying the RGB-to-UV constants to stack locals first and let the asm use simple stack-relative addressing. Bug: 444157316 Change-Id: Ica90f0c35039303ecaa145534683f59659fb5d7f Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6980714 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2025-09-25 13:49:02 -07:00
Frank Barchard	1b1c058787	ARGBToUV for SSE use pshufb/pmaddubsw Was ARGBToJ420_Opt (377 ms) Now ARGBToJ420_Opt (340 ms) Bug: None Change-Id: Iada2d6e9ecdb141b9e2acbdf343f890e4aaebe34 Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6967754 Reviewed-by: Justin Green <greenjustin@google.com>	2025-09-19 12:39:39 -07:00
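
The ARGBToUVRow_SSSE3 listing in commit 94417b9d21 is hard to follow as a flat instruction stream, so here is a scalar C sketch of what one loop iteration appears to compute for a single 2x2 block, read off the disassembly. It is only a model of the arithmetic, not libyuv code: the struct and function names are hypothetical, the weight values are illustrative full-range numbers chosen for this sketch, and the B, G, R, A channel order after pshufb is an assumption because the shuffle mask contents are not shown in the listing.

```c
#include <stdint.h>

// Hypothetical constants struct (not libyuv's): signed 8-bit weights in
// B, G, R, A order, pre-negated so that 0x8000 - dot gives biased chroma.
// The values are illustrative full-range weights chosen for this sketch.
struct SketchUVConstants {
  int8_t u[4];
  int8_t v[4];
};

static const struct SketchUVConstants kSketchConstants = {
    {-128, 85, 43, 0},   // U = 128 + (128*B - 85*G - 43*R) / 256
    {21, 107, -128, 0},  // V = 128 + (128*R - 107*G - 21*B) / 256
};

// Rounded 2x2 average of one channel: pmaddubsw by 1 and paddw form the
// 4-pixel sum, psrlw 1 halves it, pavgw against zero halves it again
// with rounding.
static uint8_t SketchAvg2x2(const uint8_t* row0, const uint8_t* row1, int ch) {
  int sum = row0[ch] + row0[4 + ch] + row1[ch] + row1[4 + ch];
  return (uint8_t)(((sum >> 1) + 1) >> 1);
}

// One chroma sample: pmaddubsw and phaddw form the dot product, psubw
// takes 0x8000 - dot in wrapping 16-bit arithmetic, psrlw 8 keeps the
// high byte. Saturation inside pmaddubsw is not modeled; the weights
// above stay within the signed 16-bit range.
static uint8_t SketchChroma(const uint8_t px[4], const int8_t w[4]) {
  int dot = px[0] * w[0] + px[1] * w[1] + px[2] * w[2] + px[3] * w[3];
  return (uint8_t)(((0x8000 - dot) & 0xFFFF) >> 8);
}

// Computes U and V for one 2x2 block of 4-byte pixels. The constants are
// passed by pointer, mirroring the struct-pointer change in the commit.
static void SketchUVFrom2x2(const uint8_t* row0, const uint8_t* row1,
                            const struct SketchUVConstants* c,
                            uint8_t* dst_u, uint8_t* dst_v) {
  uint8_t avg[4];
  for (int ch = 0; ch < 4; ++ch) {
    avg[ch] = SketchAvg2x2(row0, row1, ch);
  }
  *dst_u = SketchChroma(avg, c->u);
  *dst_v = SketchChroma(avg, c->v);
}
```

With these illustrative weights a uniformly gray block maps to U = V = 128, which is a quick sanity check that the 0x8000 bias and the shift by 8 line up. The real routine processes 8 source pixels from each of two rows per iteration and writes 4 U and 4 V bytes.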

2042 Commits