mirror of
https://chromium.googlesource.com/libyuv/libyuv
synced 2026-01-01 03:12:16 +08:00
Roughly: instead of 4 loads and 8 multiplies, use 1 load and 2 multiplies, 4 times over.

The original code, like the C code generated by clang and gcc, did all the loads, then all the math, then the store. The new code does a load, then the math, then the next load, and so on. This schedules better on current ARM64 CPUs. The number of registers is also reduced by reusing the same registers.

HiSilicon ARM A73:
Now
TestGaussRow_Opt (890 ms)
TestGaussCol_Opt (571 ms)
Was
TestGaussRow_Opt (1061 ms)
TestGaussCol_Opt (595 ms)

Qualcomm 821 (Pixel):
Now
TestGaussRow_Opt (571 ms)
TestGaussCol_Opt (474 ms)
Was
TestGaussRow_Opt (751 ms)
TestGaussCol_Opt (520 ms)

TBR=kjellander@chromium.org
BUG=libyuv:719
TEST=LibYUVPlanarTest.TestGaussRow_Opt

Reviewed-on: https://chromium-review.googlesource.com/627478
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Change-Id: I5ec81191d460801f0d4a89f0384f89925ff036de
Reviewed-on: https://chromium-review.googlesource.com/634448
Commit-Queue: Frank Barchard <fbarchard@google.com>
| Name |
|---|
| build_overrides |
| docs |
| include |
| infra/config |
| source |
| tools_libyuv |
| unit_test |
| util |
| .clang-format |
| .gitignore |
| .gn |
| all.gyp |
| Android.mk |
| AUTHORS |
| BUILD.gn |
| cleanup_links.py |
| CM_linux_packages.cmake |
| CMakeLists.txt |
| codereview.settings |
| DEPS |
| download_vs_toolchain.py |
| gyp_libyuv |
| gyp_libyuv.py |
| libyuv_nacl.gyp |
| libyuv_test.gyp |
| libyuv.gni |
| libyuv.gyp |
| libyuv.gypi |
| LICENSE |
| LICENSE_THIRD_PARTY |
| linux.mk |
| OWNERS |
| PATENTS |
| PRESUBMIT.py |
| public.mk |
| pylintrc |
| README.chromium |
| README.md |
| winarm.mk |
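
The reordering described in the commit message at the top of this page can be illustrated with a scalar C sketch. This is not the NEON assembly from the change; it is a rough picture of the two orderings under the assumption of a 1-4-6-4-1 Gaussian column kernel. The function names `gauss_col_batched`, `gauss_col_interleaved` and `gauss_self_check` are hypothetical. A C compiler is free to reschedule either loop, so the ordering only becomes binding in hand-written assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Batched ordering: load all five rows, then do all the math, then store.
   This mirrors the "all loads, then all math, then store" shape. */
void gauss_col_batched(const uint16_t* r0, const uint16_t* r1,
                       const uint16_t* r2, const uint16_t* r3,
                       const uint16_t* r4, uint32_t* dst, int width) {
  for (int i = 0; i < width; ++i) {
    uint32_t a = r0[i], b = r1[i], c = r2[i], d = r3[i], e = r4[i];
    dst[i] = a + b * 4 + c * 6 + d * 4 + e;
  }
}

/* Interleaved ordering: load one row, fold it into a single accumulator,
   then move on to the next load. Fewer values are live at once. */
void gauss_col_interleaved(const uint16_t* r0, const uint16_t* r1,
                           const uint16_t* r2, const uint16_t* r3,
                           const uint16_t* r4, uint32_t* dst, int width) {
  for (int i = 0; i < width; ++i) {
    uint32_t acc = r0[i];
    acc += (uint32_t)r1[i] * 4;
    acc += (uint32_t)r2[i] * 6;
    acc += (uint32_t)r3[i] * 4;
    acc += r4[i];
    dst[i] = acc;
  }
}

/* Checks that both orderings produce identical output on sample data. */
void gauss_self_check(void) {
  enum { W = 8 };
  uint16_t rows[5][W];
  uint32_t out_a[W], out_b[W];
  for (int r = 0; r < 5; ++r) {
    for (int i = 0; i < W; ++i) {
      rows[r][i] = (uint16_t)(r * 100 + i * 7);
    }
  }
  gauss_col_batched(rows[0], rows[1], rows[2], rows[3], rows[4], out_a, W);
  gauss_col_interleaved(rows[0], rows[1], rows[2], rows[3], rows[4], out_b, W);
  for (int i = 0; i < W; ++i) {
    assert(out_a[i] == out_b[i]);
  }
}
```

Both functions compute the same values; only the order in which loads and arithmetic are issued differs, which is the property the commit relies on to keep results unchanged while improving scheduling.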
libyuv is an open source project that includes YUV scaling and conversion functionality.
- Scale YUV to prepare content for compression, with point, bilinear or box filter.
- Convert to YUV from webcam formats.
- Convert from YUV to formats for rendering/effects.
- Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
- Optimized for SSE2/SSSE3/AVX2 on x86/x64.
- Optimized for Neon on Arm.
- Optimized for DSP R2 on MIPS.
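
To give a feel for the box filter mentioned in the list above, here is a minimal plain C sketch that halves a single 8-bit plane (for example the Y plane) by averaging each 2x2 block. The function `box_downscale_2x` is a hypothetical helper written for illustration. It is not part of the libyuv API and does not reflect how libyuv implements scaling.

```c
#include <stdint.h>

/* Hypothetical helper, not a libyuv function: halve a plane in both
   dimensions by averaging each 2x2 block of source pixels (box filter).
   Strides are in bytes; odd trailing rows and columns are ignored. */
void box_downscale_2x(const uint8_t* src, int src_stride,
                      int src_width, int src_height,
                      uint8_t* dst, int dst_stride) {
  int dst_width = src_width / 2;
  int dst_height = src_height / 2;
  for (int y = 0; y < dst_height; ++y) {
    const uint8_t* row0 = src + (2 * y) * src_stride;
    const uint8_t* row1 = row0 + src_stride;
    for (int x = 0; x < dst_width; ++x) {
      int sum = row0[2 * x] + row0[2 * x + 1] +
                row1[2 * x] + row1[2 * x + 1];
      /* Add 2 before shifting so the average rounds to nearest. */
      dst[y * dst_stride + x] = (uint8_t)((sum + 2) >> 2);
    }
  }
}
```

A point filter would instead copy one source pixel per output pixel, which is cheaper but aliases more. Averaging the block trades a little extra arithmetic for a smoother result.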
Development
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.