George Steed 5bac99fe09 [AArch64] Rework data loading in ScaleARGBFilterCols_NEON
The existing code uses lane-indexed LD2 instructions to load the input
data; however, this creates a strong dependency chain between
consecutive load instructions. We can shorten this dependency chain by
instead loading two vectors with wider lane-indexed LD1 instructions and
then performing a permute to unzip the data.

We can also avoid the need for a complex sequence of DUP + EXT
instructions by using TBL to permute the data exactly as we want it.
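
The load-then-unzip idea above can be modelled in portable C. This is only an illustrative sketch under stated assumptions, not the assembly from this change: `vec4`, `load_pair`, `unzip` and `gather_pairs` are hypothetical names, a 128-bit register is modelled as four 32-bit ARGB pixels, and the TBL step is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register holding four ARGB pixels. */
typedef struct {
  uint32_t lane[4];
} vec4;

/* Models a 64-bit lane-indexed load: two adjacent pixels go into one half
 * (half = 0 or 1) of the register in a single step. */
static void load_pair(vec4* v, int half, const uint32_t* src) {
  memcpy(&v->lane[half * 2], src, 2 * sizeof(uint32_t));
}

/* Models an unzip of 32-bit lanes across the concatenation a:b.
 * Even-indexed lanes go to `even`, odd-indexed lanes go to `odd`. */
static void unzip(const vec4* a, const vec4* b, vec4* even, vec4* odd) {
  even->lane[0] = a->lane[0];
  even->lane[1] = a->lane[2];
  even->lane[2] = b->lane[0];
  even->lane[3] = b->lane[2];
  odd->lane[0] = a->lane[1];
  odd->lane[1] = a->lane[3];
  odd->lane[2] = b->lane[1];
  odd->lane[3] = b->lane[3];
}

/* For four source positions x[0..3], gathers src[x[i]] into `left` and
 * src[x[i] + 1] into `right`. The caller must ensure x[i] + 1 is in range. */
void gather_pairs(const uint32_t* src, const int x[4], vec4* left,
                  vec4* right) {
  assert(src && x && left && right);
  vec4 v0, v1;
  /* Four independent pair loads spread over two registers. */
  load_pair(&v0, 0, src + x[0]);
  load_pair(&v0, 1, src + x[1]);
  load_pair(&v1, 0, src + x[2]);
  load_pair(&v1, 1, src + x[3]);
  /* One permute separates the left and right neighbours. */
  unzip(&v0, &v1, left, right);
}
```

In this model each register receives only two loads, whereas loading every pair into lanes of the same two registers would chain four loads through each of them.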

Reduction in runtimes observed compared to the existing Neon
implementation:

 Cortex-A55:  =0.0%
Cortex-A510: -44.2%
Cortex-A520: -47.6%
 Cortex-A76: -45.8%
Cortex-A715: -58.3%
Cortex-A720: -58.4%
  Cortex-X1: -66.7%
  Cortex-X2: -68.0%
  Cortex-X3: -67.9%
  Cortex-X4: -70.0%

Change-Id: I8a1d1fe08d8a2ddb0b86d4a44f0d49b69ab03ece
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5683126
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-07-10 23:10:43 +00:00
build_overrides Define enable_safe_libcxx in build_overrides/build.gni. 2023-05-03 06:08:40 +00:00
docs [docs] Add documentation on AArch64 SME for feature detection 2024-06-03 22:39:03 +00:00
include [AArch64] Add SVE2 implementations of ARGBTo{RAW,RGB24}Row 2024-07-08 20:27:54 +00:00
infra/config infra: Move libyuv ci/try jobs from Ubuntu-18.04 to Ubuntu-22.04 2024-05-21 15:55:24 +00:00
riscv_script Support RVV v0.12 intrinsics for row_rvv.cc & scale_rvv.cc 2024-06-17 18:01:49 +00:00
source [AArch64] Rework data loading in ScaleARGBFilterCols_NEON 2024-07-10 23:10:43 +00:00
tools_libyuv Do not roll the Fuchsia SDK. 2023-07-03 09:18:00 +00:00
unit_test [AArch64] Fix SVE/SME vector length printing in cpuid 2024-07-02 19:44:41 +00:00
util Change inline to __asm__ for C 2024-07-10 23:05:29 +00:00
.clang-format clang-format libyuv 2016-11-07 17:37:23 -08:00
.gitignore DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
.gn Roll chromium_revision 829c6df33d..7d683aeda8 (945687:1050091) 2022-09-22 14:56:57 +00:00
.vpython remove swarming_client 2021-09-09 07:11:45 +00:00
.vpython3 Update vpython3 requests 2023-06-01 19:06:40 +00:00
Android.bp Split scale_test and scale_plane_test to allow building on small devices 2023-12-09 18:39:41 +00:00
Android.mk Split scale_test and scale_plane_test to allow building on small devices 2023-12-09 18:39:41 +00:00
AUTHORS [DEPS] Remove cleanup_links pre_deps_hooks 2024-04-08 15:47:48 +00:00
BUILD.gn [Arm][AArch64] Stop explicitly optimising for speed in BUILD.gn 2024-06-14 17:01:27 +00:00
CM_linux_packages.cmake Reduce cmake verbosity and update min version 2022-08-03 06:59:54 +00:00
CMakeLists.txt [AArch64] Add initial build system support for SME 2024-06-08 23:32:41 +00:00
codereview.settings [infra] remove no longer supported git cl upload setting. 2021-04-28 12:47:52 +00:00
DEPS [DEPS] Remove cleanup_links pre_deps_hooks 2024-04-08 15:47:48 +00:00
DIR_METADATA Move metadata in OWNERS files to DIR_METADATA files 2021-02-09 19:34:43 +00:00
download_vs_toolchain.py Update gclient instructions + environment 2022-02-24 15:19:23 +00:00
libyuv.gni [AArch64] Add initial build system support for SME 2024-06-08 23:32:41 +00:00
libyuv.gyp Add libyuv.gyp build files 2022-03-21 23:48:16 +00:00
libyuv.gypi Fix missing headers in GN/GYP build files 2024-04-01 09:19:24 +00:00
LICENSE Update Copyright notice to follow new chromium conventions. 2012-08-08 19:04:24 +00:00
linux.mk Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests. 2023-12-08 13:39:56 +00:00
OWNERS add jansson@google.com to infra owners to cover when Mirko is OOO 2022-10-28 09:46:02 +00:00
PATENTS LibYuv: Adding PATENT and LICENSE files 2011-10-25 16:15:49 +00:00
PRESUBMIT.py Update PRESUBMIT, cleanup_links and autoroller to py3 2022-02-24 13:34:14 +00:00
public.mk use unix line endings 2018-06-20 23:19:59 +00:00
pylintrc Use DEPS for all dependencies + add PRESUBMIT.py 2017-02-03 11:36:53 +00:00
README.chromium [AArch64] Fix SVE/SME vector length printing in cpuid 2024-07-02 19:44:41 +00:00
README.md Add RAWToARGBRow_RVV,RAWToRGBARow_RVV,RAWToRGB24Row_RVV 2023-04-07 18:45:08 +00:00
winarm.mk NV12 Copy, include scale_uv.h 2020-12-08 18:54:16 +00:00

libyuv is an open source project that includes YUV scaling and conversion functionality.

  • Scale YUV to prepare content for compression, with point, bilinear or box filter.
  • Convert to YUV from webcam formats for compression.
  • Convert to RGB formats for rendering/effects.
  • Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
  • Optimized for SSSE3/AVX2 on x86/x64.
  • Optimized for Neon on Arm.
  • Optimized for MSA on Mips.
  • Optimized for RVV on RISC-V.
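
As a rough illustration of the bilinear filter mentioned in the first bullet, the following is a minimal scalar sketch of horizontal ARGB scaling. It is not libyuv's implementation or API: `blend_channel`, `blend_pixel` and `scale_argb_cols_bilinear` are hypothetical names, and the 16.16 fixed-point stepping with a 16-bit blend fraction is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Blends one 8-bit channel: a * (1 - f) + b * f, where f is a 16-bit
 * fraction in [0, 65535]. Rounds to nearest. */
static uint8_t blend_channel(uint8_t a, uint8_t b, uint32_t f) {
  return (uint8_t)((a * (65536u - f) + b * f + 32768u) >> 16);
}

/* Blends two packed 32-bit ARGB pixels channel by channel. */
static uint32_t blend_pixel(uint32_t a, uint32_t b, uint32_t f) {
  uint32_t out = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    uint8_t ca = (uint8_t)(a >> shift);
    uint8_t cb = (uint8_t)(b >> shift);
    out |= (uint32_t)blend_channel(ca, cb, f) << shift;
  }
  return out;
}

/* Writes dst_width pixels. x is the starting source position and dx the
 * step per output pixel, both in 16.16 fixed point. The caller must ensure
 * that src[(x >> 16) + 1] is readable for every position visited. */
void scale_argb_cols_bilinear(uint32_t* dst, const uint32_t* src,
                              int dst_width, uint32_t x, uint32_t dx) {
  assert(dst && src && dst_width >= 0);
  for (int i = 0; i < dst_width; ++i) {
    uint32_t xi = x >> 16;    /* integer source index */
    uint32_t f = x & 0xffffu; /* fractional position between xi and xi + 1 */
    dst[i] = blend_pixel(src[xi], src[xi + 1], f);
    x += dx;
  }
}
```

With `dx = 0x8000` (a step of 0.5 source pixels) the sketch doubles the width of a row, blending each pair of neighbours halfway for the in-between outputs.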

Development

See Getting started for instructions on how to get started developing.

You can also browse the docs directory for more documentation.