Frank Barchard 67f3f17d9a aarch32 J420ToI420
benchmark on medium core
adbrun -- taskset 10 blaze-bin/third_party/libyuv/libyuv_test '--gunit_filter=*J420ToI420*' --gunit_also_run_disabled_tests --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

Now Neon
J420ToI420_Opt (159 ms)
Was C
J420ToI420_Opt (215 ms)

AArch64
J420ToI420_Opt (93 ms)

The compiler generates this loop for the C version (Convert8To8Row_C):
vld1.8	{d20, d21}, [r6]!
vorr	q12, q8, q8
subs	r4, #16
vmovl.u8	q11, d21
vmovl.u8	q10, d20
vmul.i16	q11, q9, q11
vmul.i16	q10, q9, q10
vsra.u16	q12, q11, #8
vorr	q11, q8, q8
vsra.u16	q11, q10, #8
vmovn.i16	d21, q12
vmovn.i16	d20, q11
vst1.8	{d20, d21}, [r5]!
bne	0x3d9078 <Convert8To8Row_C+0x36> @ imm = #-54

Explanation of the compiled C code above
vorr copies the bias (16, held in q8) into the accumulator register
vsra shifts the product right by 8 and accumulates it into that register
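
As a rough scalar sketch of what the loop above computes per byte: multiply by a scale, shift right by 8, then add the bias. The function name ScaleBiasRow_Sketch is hypothetical, and the scale of 220 used in the note below is only an assumed value for illustration; the bias of 16 comes from the explanation above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper, not the libyuv source: models the arithmetic the
// disassembly implies, dst = ((src * scale) >> 8) + bias, one byte at a time.
static void ScaleBiasRow_Sketch(const uint8_t* src,
                                uint8_t* dst,
                                int scale,
                                int bias,
                                int width) {
  // Assumption: scale fits in 8 bits so the product fits in 16 bits.
  assert(scale >= 0 && scale <= 255);
  for (int x = 0; x < width; ++x) {
    // vmovl + vmul: widen and multiply; vsra #8: shift and add to the bias;
    // vmovn: keep the low 8 bits of the result.
    dst[x] = (uint8_t)(((src[x] * scale) >> 8) + bias);
  }
}
```

With the assumed scale of 220 and a bias of 16, an input of 0 maps to 16 and an input of 255 maps to 235.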

Compared to the aarch64 Neon version
instead of mull, the compiled C uses vmovl + vmul
instead of uzp2, the compiled C uses vsra #8 + vmovn, which takes 2 vmovn vs 1 uzp2 per 16 bytes
instead of add, the compiled C uses vorr + vsra
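
The two instruction sequences compared above should produce the same bytes. A scalar model in C can check this exhaustively for one bias value. All function names here are hypothetical, and the model assumes the aarch64 add is a plain wrapping 8-bit add.

```c
#include <stdint.h>

// Models the aarch64 sequence: mull, uzp2 (keep the high byte), add.
static uint8_t HighByteThenAdd(uint8_t src, uint8_t scale, uint8_t bias) {
  uint16_t product = (uint16_t)(src * scale);  // mull: 8 x 8 -> 16 bits
  uint8_t high = (uint8_t)(product >> 8);      // uzp2: odd (high) bytes
  return (uint8_t)(high + bias);               // add: assumed wrapping
}

// Models the compiled C sequence: vmovl, vmul, vorr, vsra #8, vmovn.
static uint8_t ShiftAccumulateThenNarrow(uint8_t src,
                                         uint8_t scale,
                                         uint8_t bias) {
  uint16_t wide = src;                          // vmovl.u8: widen to 16 bits
  uint16_t product = (uint16_t)(wide * scale);  // vmul.i16
  uint16_t acc = bias;                          // vorr: copy of the bias
  acc = (uint16_t)(acc + (product >> 8));       // vsra.u16 #8
  return (uint8_t)acc;                          // vmovn.i16: low byte
}

// Returns 1 if both models agree for every src and scale at this bias.
static int SequencesAgree(uint8_t bias) {
  for (int s = 0; s < 256; ++s) {
    for (int k = 0; k < 256; ++k) {
      if (HighByteThenAdd((uint8_t)s, (uint8_t)k, bias) !=
          ShiftAccumulateThenNarrow((uint8_t)s, (uint8_t)k, bias)) {
        return 0;
      }
    }
  }
  return 1;
}
```

Calling SequencesAgree(16) covers every source byte and every 8-bit scale for the bias mentioned in the commit message.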

Change-Id: I9648f06e52ccbafaecf07bd89f8ffff27565d025
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6189497
Reviewed-by: Justin Green <greenjustin@google.com>
2025-01-22 13:47:09 -08:00
Name | Last commit | Date
build_overrides | Roll chromium_revision af3d01376b..f2539aa88a (1174635:1398488) | 2025-01-08 06:40:17 -08:00
docs | avx10_2 detect | 2025-01-21 13:53:19 -08:00
include | aarch32 J420ToI420 | 2025-01-22 13:47:09 -08:00
infra/config | Revert "Remove linux_tsan2 bot from CQ." | 2025-01-09 00:26:58 -08:00
riscv_script | Support RVV v0.12 intrinsics for row_rvv.cc & scale_rvv.cc | 2024-06-17 18:01:49 +00:00
source | aarch32 J420ToI420 | 2025-01-22 13:47:09 -08:00
tools_libyuv | Remove libyuv dependency on base/ | 2025-01-08 09:04:21 -08:00
unit_test | J420ToI420 using planar 8 bit scaling | 2025-01-22 02:50:24 -08:00
util | avx10_2 detect | 2025-01-21 13:53:19 -08:00
.clang-format | clang-format libyuv | 2016-11-07 17:37:23 -08:00
.gitignore | DetilePlane and unittest for NEON | 2022-01-31 20:05:55 +00:00
.gn | Roll chromium_revision 829c6df33d..7d683aeda8 (945687:1050091) | 2022-09-22 14:56:57 +00:00
.vpython3 | Update protobuf version in .vpython3. | 2025-01-09 00:32:24 -08:00
Android.bp | Split scale_test and scale_plane_test to allow building on small devices | 2023-12-09 18:39:41 +00:00
Android.mk | Split scale_test and scale_plane_test to allow building on small devices | 2023-12-09 18:39:41 +00:00
AUTHORS | [DEPS] Remove cleanup_links pre_deps_hooks | 2024-04-08 15:47:48 +00:00
BUILD.gn | Roll chromium_revision af3d01376b..f2539aa88a (1174635:1398488) | 2025-01-08 06:40:17 -08:00
CM_linux_packages.cmake | Use grep extended regex for version | 2024-11-13 02:11:17 +00:00
CMakeLists.txt | Fix bugs in ARGBAttenuateRow_LASX/LSX function | 2024-11-30 23:09:04 +00:00
codereview.settings | [infra] remove no longer supported git cl upload setting. | 2021-04-28 12:47:52 +00:00
DEPS | Remove libyuv dependency on base/ | 2025-01-08 09:04:21 -08:00
DIR_METADATA | Move metadata in OWNERS files to DIR_METADATA files | 2021-02-09 19:34:43 +00:00
download_vs_toolchain.py | Update pylintrc to a pep-8 like style | 2024-12-18 05:38:56 -08:00
libyuv.gni | Revert "Do not enable libyuv_use_sme for is_android" | 2024-10-15 18:20:36 +00:00
libyuv.gyp | Add libyuv.gyp build files | 2022-03-21 23:48:16 +00:00
libyuv.gypi | Fix missing headers in GN/GYP build files | 2024-04-01 09:19:24 +00:00
LICENSE | Update Copyright notice to follow new chromium conventions. | 2012-08-08 19:04:24 +00:00
linux.mk | Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests. | 2023-12-08 13:39:56 +00:00
OWNERS | add jansson@google.com to infra owners to cover when Mirko is OOO | 2022-10-28 09:46:02 +00:00
PATENTS | LibYuv: Adding PATENT and LICENSE files | 2011-10-25 16:15:49 +00:00
PRESUBMIT.py | Update pylintrc to a pep-8 like style | 2024-12-18 05:38:56 -08:00
public.mk | use unix line endings | 2018-06-20 23:19:59 +00:00
pylintrc | Update pylintrc to a pep-8 like style | 2024-12-18 05:38:56 -08:00
README.chromium | J420ToI420 using planar 8 bit scaling | 2025-01-22 02:50:24 -08:00
README.md | Update README.md and environment_variables.md for Arm | 2024-09-20 00:29:33 +00:00
winarm.mk | NV12 Copy, include scale_uv.h | 2020-12-08 18:54:16 +00:00

libyuv is an open source project that includes YUV scaling and conversion functionality.

  • Scale YUV to prepare content for compression, with point, bilinear or box filter.
  • Convert to YUV from webcam formats for compression.
  • Convert to RGB formats for rendering/effects.
  • Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
  • Optimized for SSSE3/AVX2 on x86/x64.
  • Optimized for Neon/SVE2/SME on Arm.
  • Optimized for MSA on MIPS.
  • Optimized for RVV on RISC-V.

Development

See Getting started for instructions on how to begin developing.

You can also browse the docs directory for more documentation.