mirror of
https://chromium.googlesource.com/libyuv/libyuv
synced 2025-12-06 08:46:47 +08:00
benchmark on medium core
adbrun -- taskset 10 blaze-bin/third_party/libyuv/libyuv_test '--gunit_filter=*J420ToI420*' --gunit_also_run_disabled_tests --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
Now Neon
J420ToI420_Opt (159 ms)
Was C
J420ToI420_Opt (215 ms)
AArch64
J420ToI420_Opt (93 ms)
C version does this:
vld1.8 {d20, d21}, [r6]!
vorr q12, q8, q8
subs r4, #16
vmovl.u8 q11, d21
vmovl.u8 q10, d20
vmul.i16 q11, q9, q11
vmul.i16 q10, q9, q10
vsra.u16 q12, q11, #8
vorr q11, q8, q8
vsra.u16 q11, q10, #8
vmovn.i16 d21, q12
vmovn.i16 d20, q11
vst1.8 {d20, d21}, [r5]!
bne 0x3d9078 <Convert8To8Row_C+0x36> @ imm = #-54
Explanation of above C code
vorr moves 16 into register
vsra does shift + accumulate to that register
Compared to aarch64
instead of mull, C uses movl+mul
instead of uzp2, C uses sra #8 + movn. takes 2 movn vs 1 uzp2
instead of add, C does vorr + sra
Change-Id: I9648f06e52ccbafaecf07bd89f8ffff27565d025
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6189497
Reviewed-by: Justin Green <greenjustin@google.com>
|
||
|---|---|---|
| build_overrides | ||
| docs | ||
| include | ||
| infra/config | ||
| riscv_script | ||
| source | ||
| tools_libyuv | ||
| unit_test | ||
| util | ||
| .clang-format | ||
| .gitignore | ||
| .gn | ||
| .vpython3 | ||
| Android.bp | ||
| Android.mk | ||
| AUTHORS | ||
| BUILD.gn | ||
| CM_linux_packages.cmake | ||
| CMakeLists.txt | ||
| codereview.settings | ||
| DEPS | ||
| DIR_METADATA | ||
| download_vs_toolchain.py | ||
| libyuv.gni | ||
| libyuv.gyp | ||
| libyuv.gypi | ||
| LICENSE | ||
| linux.mk | ||
| OWNERS | ||
| PATENTS | ||
| PRESUBMIT.py | ||
| public.mk | ||
| pylintrc | ||
| README.chromium | ||
| README.md | ||
| winarm.mk | ||
libyuv is an open source project that includes YUV scaling and conversion functionality.
- Scale YUV to prepare content for compression, with point, bilinear or box filter.
- Convert to YUV from webcam formats for compression.
- Convert to RGB formats for rendering/effects.
- Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
- Optimized for SSSE3/AVX2 on x86/x64.
- Optimized for Neon/SVE2/SME on Arm.
- Optimized for MSA on Mips.
- Optimized for RVV on RISC-V.
Development
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.