George Steed 867bdc51ed [AArch64] Unroll I422ToAR30Row_{SVE2,SME}
The existing STOREAR30_SVE macro works well on out-of-order cores, but
on in-order cores the long run of consecutive dependent vector
instructions hurts performance.

We can improve this by unrolling the loop to process two sets of vectors
at a time, so that little cores can work on two independent streams of
vector instructions at once. Using one set of ZIP instructions at the
end allows us to (a) avoid ST4, which we know is slow on some
micro-architectures, and (b) enable the use of predication and avoid
the need for separate "any" kernels.
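
The unrolling idea can be illustrated with a scalar sketch in plain C. This is not the SVE2/SME kernel from the commit. The names PackAR30 and PackAR30Row_Unrolled2 are hypothetical, the inputs are assumed to be already-converted 10-bit B, G and R values, and the AR30 layout (B in the low 10 bits, then G, then R, with 2 alpha bits on top) is stated here as an assumption about the format. The two packs in the loop body do not depend on each other, which is the property an in-order core benefits from, and the tail is handled in the same function, loosely mirroring what predication provides in place of a separate "any" kernel.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: pack 10-bit B, G, R into one 32-bit AR30 word.
// Assumed layout: B in bits 0-9, G in bits 10-19, R in bits 20-29,
// and both alpha bits set (opaque) in bits 30-31.
static uint32_t PackAR30(uint32_t b, uint32_t g, uint32_t r) {
  return (b & 0x3ffu) | ((g & 0x3ffu) << 10) | ((r & 0x3ffu) << 20) |
         0xc0000000u;
}

// Hypothetical scalar analogue of a loop unrolled by two.
static void PackAR30Row_Unrolled2(const uint16_t* b,
                                  const uint16_t* g,
                                  const uint16_t* r,
                                  uint32_t* dst,
                                  size_t width) {
  size_t i = 0;
  for (; i + 2 <= width; i += 2) {
    // p0 and p1 have no data dependency on each other, so their
    // instructions can be interleaved instead of forming one long chain.
    uint32_t p0 = PackAR30(b[i], g[i], r[i]);
    uint32_t p1 = PackAR30(b[i + 1], g[i + 1], r[i + 1]);
    dst[i] = p0;
    dst[i + 1] = p1;
  }
  // Odd trailing pixel, handled here rather than in a separate function.
  if (i < width) {
    dst[i] = PackAR30(b[i], g[i], r[i]);
  }
}
```

In the vector kernels the unit of work is a whole set of vectors rather than a single pixel, so this sketch only shows the dependency structure, not the instruction selection.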

Reduction in run times of I422ToAR30Row_SVE2 observed compared to the
previous SVE2 implementation:

Cortex-A510: -37.7%
Cortex-A520: -38.8%
Cortex-A710: -14.8%
Cortex-A715: -17.1%
Cortex-A720: -16.9%
  Cortex-X2: -10.3%
  Cortex-X3:  -6.7%
  Cortex-X4:  -9.4%
Cortex-X925:  -7.1%

Change-Id: I160fb41300d2d08fce2e6eb92181324fd723a02d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6632916
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2025-06-12 14:09:49 -07:00
build_overrides [tracing] Remove enable_base_tracing 2025-05-15 01:22:38 -07:00
docs Fix typo, remove mips as title already contain mips 2025-04-15 14:02:28 -07:00
include [AArch64] Unroll I422ToAR30Row_{SVE2,SME} 2025-06-12 14:09:49 -07:00
infra/config infra: Remove reclient properties from infra config 2025-06-02 00:54:25 -07:00
riscv_script Support RVV v0.12 intrinsics for row_rvv.cc & scale_rvv.cc 2024-06-17 18:01:49 +00:00
source TestI400LargeSize test __x86_64__, _M_X64, or __aarch64__ 2025-06-10 15:53:02 -07:00
tools_libyuv Remove libyuv dependency on base/ 2025-01-08 09:04:21 -08:00
unit_test TestI400LargeSize test __x86_64__, _M_X64, or __aarch64__ 2025-06-10 15:53:02 -07:00
util Add SME2 detect 2025-03-27 11:08:08 -07:00
.clang-format clang-format libyuv 2016-11-07 17:37:23 -08:00
.gitignore DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
.gn Roll chromium_revision 908f3898af..3d4d5701ea (1403569:1445131) 2025-04-10 04:15:34 -07:00
.vpython3 Update protobuf version in .vpython3. 2025-01-09 00:32:24 -08:00
Android.bp Enable CFI assembly support 2025-03-31 10:47:47 -07:00
Android.mk Split scale_test and scale_plane_test to allow building on small devices 2023-12-09 18:39:41 +00:00
AUTHORS [DEPS] Remove cleanup_links pre_deps_hooks 2024-04-08 15:47:48 +00:00
BUILD.gn Support Siso builds 2025-05-27 01:54:30 -07:00
CM_linux_packages.cmake Use grep extended regex for version 2024-11-13 02:11:17 +00:00
CMakeLists.txt Call cmake_minimum_required(VERSION 3.16) first 2025-04-13 10:08:52 -07:00
codereview.settings [infra] remove no longer supported git cl upload setting. 2021-04-28 12:47:52 +00:00
DEPS Support Siso builds 2025-05-27 01:54:30 -07:00
DIR_METADATA Move metadata in OWNERS files to DIR_METADATA files 2021-02-09 19:34:43 +00:00
download_vs_toolchain.py Update pylintrc to a pep-8 like style 2024-12-18 05:38:56 -08:00
libyuv.gni Revert "Do not enable libyuv_use_sme for is_android" 2024-10-15 18:20:36 +00:00
libyuv.gyp Enable explicit control over LoongArch LSX & LASX for GYP builds 2025-05-30 10:17:27 -07:00
libyuv.gypi Add missing files for loong64 GYP build 2025-04-15 14:03:27 -07:00
LICENSE Update Copyright notice to follow new chromium conventions. 2012-08-08 19:04:24 +00:00
linux.mk Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests. 2023-12-08 13:39:56 +00:00
OWNERS add jansson@google.com to infra owners to cover when Mirko is OOO 2022-10-28 09:46:02 +00:00
PATENTS LibYuv: Adding PATENT and LICENSE files 2011-10-25 16:15:49 +00:00
PRESUBMIT.py Update pylintrc to a pep-8 like style 2024-12-18 05:38:56 -08:00
public.mk use unix line endings 2018-06-20 23:19:59 +00:00
pylintrc Update pylintrc to a pep-8 like style 2024-12-18 05:38:56 -08:00
README.chromium ubsan compliant '_any' functions using ptrdiff_t for pointer math 2025-06-10 15:01:52 -07:00
README.md Update README.md and environment_variables.md for Arm 2024-09-20 00:29:33 +00:00
winarm.mk NV12 Copy, include scale_uv.h 2020-12-08 18:54:16 +00:00

libyuv is an open source project that includes YUV scaling and conversion functionality.

  • Scale YUV to prepare content for compression, with point, bilinear or box filter.
  • Convert to YUV from webcam formats for compression.
  • Convert to RGB formats for rendering/effects.
  • Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
  • Optimized for SSSE3/AVX2 on x86/x64.
  • Optimized for Neon/SVE2/SME on Arm.
  • Optimized for MSA on MIPS.
  • Optimized for RVV on RISC-V.

Development

See Getting started for instructions on how to get started developing.

You can also browse the docs directory for more documentation.