libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-06-15 08:26:06 +08:00

Frank Barchard 03d8b0990b I420ToRAW and I420ToRGB24 1 pass AVX2		2026-06-08 19:33:58 -07:00

Replaced the 2-pass conversion (I420 -> ARGB -> RGB24/RAW) with a highly optimized 1-pass AVX2 implementation. This avoids intermediate stack buffering and significantly reduces memory bandwidth.

Implemented `I422ToRGB24Row_AVX2` in:
- `row_gcc.cc`: Inline assembly for GCC/Clang.
- `row_win.cc`: C++ intrinsics for MSVC (also verified with Clang).

Optimized the width alignment requirement: changed from 32-pixel to 16-pixel alignment in `convert_argb.cc` and `row_any.cc`. This allows the optimized AVX2 path to be used for more common video resolutions.

Performance results (1080p, 100 iterations):
- C Reference: ~18.5 ms
- AVX2 2-Pass (Baseline): ~412 us (~45x speedup)
- AVX2 1-Pass (GCC Assembly): ~411 us (~45x speedup)
- AVX2 1-Pass (Intrinsics): ~365 us (~50x speedup, 11% faster than asm)

Test: libyuv_unittest --gunit_filter=I420ToRGB24
Test: libyuv_unittest --gunit_filter=I420ToRAW
Bug: 42280902
Change-Id: I07c0505c95410ea16a6218c858844791a11ef073
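
To illustrate what "1 pass" means in the commit above, here is a minimal scalar sketch of a row function that writes packed 3-byte pixels straight from Y, U and V, with no intermediate ARGB row. It is not libyuv's code: the name `SketchI422ToRGB24Row`, the helper `Clamp8`, the common BT.601 limited-range integer coefficients, and the B, G, R byte order in memory are all assumptions made for illustration, and the real AVX2 path processes many pixels per iteration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of libyuv): clamp an int to 0..255.
static inline uint8_t Clamp8(int v) {
  return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// Hypothetical one-pass row converter (a sketch, not libyuv's implementation).
// Reads one row of Y plus half-width U and V rows and writes 3 bytes per
// pixel directly, so no temporary ARGB row is needed.
// Assumptions: BT.601 limited range, 8-bit fixed point, output bytes in
// B, G, R order.
void SketchI422ToRGB24Row(const uint8_t* src_y,
                          const uint8_t* src_u,
                          const uint8_t* src_v,
                          uint8_t* dst_rgb24,
                          int width) {
  for (int x = 0; x < width; ++x) {
    const int c = src_y[x] - 16;       // luma with black-level offset removed
    const int d = src_u[x / 2] - 128;  // each U sample covers two pixels
    const int e = src_v[x / 2] - 128;  // each V sample covers two pixels
    const int r = (298 * c + 409 * e + 128) >> 8;
    const int g = (298 * c - 100 * d - 208 * e + 128) >> 8;
    const int b = (298 * c + 516 * d + 128) >> 8;
    dst_rgb24[0] = Clamp8(b);
    dst_rgb24[1] = Clamp8(g);
    dst_rgb24[2] = Clamp8(r);
    dst_rgb24 += 3;  // 3 bytes per output pixel
  }
}
```

A 2-pass version would first fill a 4-byte-per-pixel ARGB row and then repack it to 3 bytes per pixel; the sketch skips that temporary row, which is the memory traffic the commit message says the AVX2 change avoids.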
compare_common.cc	clang-tidy applied	2021-04-01 21:42:47 +00:00
compare_gcc.cc	ARGBToJ444 use 256 for fixed point scale UV	2025-02-27 13:04:15 -08:00
compare_neon64.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
compare_neon.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
compare_win.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
compare.cc	Don't coalesce rows if width*height would overflow	2026-05-29 11:57:47 -07:00
convert_argb.cc	I420ToRAW and I420ToRGB24 1 pass AVX2	2026-06-08 19:33:58 -07:00
convert_from_argb.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
convert_from.cc	Fix integer overflow when flipping negative height	2026-06-03 16:17:37 -07:00
convert_jpeg.cc	PlaneScale, UVScale and ARGBScale test 3x and 4x down sample.	2020-10-28 20:41:59 +00:00
convert_to_argb.cc	ConvertToARGB: compute buffer offsets in ptrdiff_t	2026-06-05 18:38:42 -07:00
convert_to_i420.cc	Fix int negation overflow in ConvertToARGB/I420	2026-06-05 12:34:38 -07:00
convert.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
cpu_id.cc	Replace strtok_r with strchr in RISC-V CPU capability detection	2026-04-10 12:33:43 -07:00
mjpeg_decoder.cc	Add AMXINT8 cpu detect	2024-02-15 21:44:47 +00:00
mjpeg_validate.cc	Update to r1732 for more robust jpeg	2019-07-01 22:32:36 +00:00
planar_functions.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
rotate_any.cc	Deprecate MIPS and MSA support.	2025-10-16 12:20:40 -07:00
rotate_argb.cc	Fix integer overflow when flipping negative height	2026-06-03 16:17:37 -07:00
rotate_common.cc	Remove redundant #include <stddef.h>	2026-05-28 17:10:22 -07:00
rotate_gcc.cc	Use ptrdiff_t for buffer offsets	2026-04-28 18:21:42 -07:00
rotate_lsx.cc	[AArch64] Use full vectors in TransposeWx{8 => 16}_NEON	2024-05-21 07:46:42 +00:00
rotate_neon64.cc	Fix integer overflow in multiplications of stride	2026-05-28 14:12:37 -07:00
rotate_neon.cc	Fix integer overflow in multiplications of stride	2026-05-28 14:12:37 -07:00
rotate_sme.cc	[AArch64] Re-enable SME only for Linux and new versions of Clang	2024-09-23 09:29:53 +00:00
rotate_win.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
rotate.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_any.cc	I420ToRAW and I420ToRGB24 1 pass AVX2	2026-06-08 19:33:58 -07:00
row_common.cc	I420ToRAW and I420ToRGB24 1 pass AVX2	2026-06-08 19:33:58 -07:00
row_gcc.cc	I420ToRAW and I420ToRGB24 1 pass AVX2	2026-06-08 19:33:58 -07:00
row_lasx.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_lsx.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_neon64.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_neon.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_rvv.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_sme.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_sve.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
row_win.cc	I420ToRAW and I420ToRGB24 1 pass AVX2	2026-06-08 19:33:58 -07:00
scale_any.cc	Deprecate MIPS and MSA support.	2025-10-16 12:20:40 -07:00
scale_argb.cc	Fix integer overflow when flipping negative height	2026-06-03 16:17:37 -07:00
scale_common.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
scale_gcc.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
scale_lsx.cc	DetilePlane and unittest for NEON	2022-01-31 20:05:55 +00:00
scale_neon64.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
scale_neon.cc	Apply format with no code changes	2025-02-24 23:57:01 -08:00
scale_rgb.cc	Fix integer overflow when flipping negative height	2026-06-03 16:17:37 -07:00
scale_rvv.cc	Replace RAWToY/RGB24ToY with RGBToYMatrix	2026-04-21 17:11:14 -07:00
scale_sme.cc	Apply clang format	2025-01-02 13:31:20 -08:00
scale_uv.cc	Validate int param is not INT_MIN before negating	2026-06-04 21:55:57 -07:00
scale_win.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
scale.cc	BGRAToI420 use BgraConstants for a direct conversion using AVX512BW	2026-06-08 12:21:47 -07:00
test.sh	Optimize ABGRToI420 for AVX2	2020-06-04 18:24:45 +00:00
video_common.cc	Lint cleanup after C99 change CL	2018-01-24 19:16:03 +00:00