George Steed 4f7fd808b7 [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON
The existing Neon code only makes use of 64-bit vectors throughout which
limits the performance on larger cores. To avoid this, swap the Neon
code from a Wx8 implementation to a Wx16 implementation and process
blocks of 16 full vectors at a time.

The original code also handled widths that were not exact multiples of
16, however this should already be handled by the "any" kernel so it is
removed.

Finally, avoid duplicating the TransposeWx16_C fallback kernel
definition in all architectures that need it, and just put it once in
rotate_common.cc instead.

Observed speedups for TransposePlane across a range of
micro-architectures:

 Cortex-A53: -40.0%
 Cortex-A55: -20.7%
 Cortex-A57: -43.9%
Cortex-A510: -43.5%
Cortex-A520: -43.9%
Cortex-A720: -31.1%
  Cortex-X2: -38.3%
  Cortex-X4: -43.6%

Change-Id: Ic7c4d5f24eb27091d743ddc00cd95ef178b6984e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5545459
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-05-21 07:46:42 +00:00
build_overrides Define enable_safe_libcxx in build_overrides/build.gni. 2023-05-03 06:08:40 +00:00
docs Add AMXINT8 cpu detect 2024-02-15 21:44:47 +00:00
include [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
infra/config infra/config: remove goma property 2023-08-17 06:06:53 +00:00
riscv_script [RISC-V] Support CMake build with custom compiler flags 2023-07-25 09:21:59 +00:00
source [AArch64] Use full vectors in TransposeWx{8 => 16}_NEON 2024-05-21 07:46:42 +00:00
tools_libyuv Do not roll the Fuchsia SDK. 2023-07-03 09:18:00 +00:00
unit_test Remove unneeded #ifdef HAVE_JPEG code 2024-05-09 23:02:18 +00:00
util [AArch64] Enable detection of additional architecture features 2024-04-05 17:48:22 +00:00
.clang-format clang-format libyuv 2016-11-07 17:37:23 -08:00
.gitignore DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
.gn Roll chromium_revision 829c6df33d..7d683aeda8 (945687:1050091) 2022-09-22 14:56:57 +00:00
.vpython remove swarming_client 2021-09-09 07:11:45 +00:00
.vpython3 Update vpython3 requests 2023-06-01 19:06:40 +00:00
Android.bp Split scale_test and scale_plane_test to allow building on small devices 2023-12-09 18:39:41 +00:00
Android.mk Split scale_test and scale_plane_test to allow building on small devices 2023-12-09 18:39:41 +00:00
AUTHORS [DEPS] Remove cleanup_links pre_deps_hooks 2024-04-08 15:47:48 +00:00
BUILD.gn [AArch64] Add :libyuv_sve library in preparation for SVE kernels 2024-04-09 03:10:01 +00:00
CM_linux_packages.cmake Reduce cmake verbosity and update min version 2022-08-03 06:59:54 +00:00
CMakeLists.txt CMake: Use CMAKE_SOURCE_DIR in GTEST_SRC_DIR 2024-04-11 09:09:46 +00:00
codereview.settings [infra] remove no longer supported git cl upload setting. 2021-04-28 12:47:52 +00:00
DEPS [DEPS] Remove cleanup_links pre_deps_hooks 2024-04-08 15:47:48 +00:00
DIR_METADATA Move metadata in OWNERS files to DIR_METADATA files 2021-02-09 19:34:43 +00:00
download_vs_toolchain.py Update gclient instructions + environment 2022-02-24 15:19:23 +00:00
libyuv.gni [AArch64] Add :libyuv_sve library in preparation for SVE kernels 2024-04-09 03:10:01 +00:00
libyuv.gyp Add libyuv.gyp build files 2022-03-21 23:48:16 +00:00
libyuv.gypi Fix missing headers in GN/GYP build files 2024-04-01 09:19:24 +00:00
LICENSE Update Copyright notice to follow new chromium conventions. 2012-08-08 19:04:24 +00:00
linux.mk Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests. 2023-12-08 13:39:56 +00:00
OWNERS add jansson@google.com to infra owners to cover when Mirko is OOO 2022-10-28 09:46:02 +00:00
PATENTS LibYuv: Adding PATENT and LICENSE files 2011-10-25 16:15:49 +00:00
PRESUBMIT.py Update PRESUBMIT, cleanup_links and autoroller to py3 2022-02-24 13:34:14 +00:00
public.mk use unix line endings 2018-06-20 23:19:59 +00:00
pylintrc Use DEPS for all dependencies + add PRESUBMIT.py 2017-02-03 11:36:53 +00:00
README.chromium Fix environment variable LIBYUV_CPU_INFO for unittests 2024-04-20 17:41:56 +00:00
README.md Add RAWToARGBRow_RVV,RAWToRGBARow_RVV,RAWToRGB24Row_RVV 2023-04-07 18:45:08 +00:00
winarm.mk NV12 Copy, include scale_uv.h 2020-12-08 18:54:16 +00:00

libyuv is an open source project that includes YUV scaling and conversion functionality.

  • Scale YUV to prepare content for compression, with point, bilinear or box filter.
  • Convert to YUV from webcam formats for compression.
  • Convert to RGB formats for rendering/effects.
  • Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
  • Optimized for SSSE3/AVX2 on x86/x64.
  • Optimized for Neon on Arm.
  • Optimized for MSA on Mips.
  • Optimized for RVV on RISC-V.

Development

See Getting started for instructions on how to get started developing.

You can also browse the docs directory for more documentation.