mirror of
https://chromium.googlesource.com/libyuv/libyuv
synced 2025-12-06 16:56:55 +08:00
I422ToUYVYRow_AVX2 optimized from 7 cycles per 32 pixels to 4.6 cycles. Instead of 2 vpermq and vpunpcklbw: vmovdqu (%1),%%xmm2 vmovdqu 0x00(%1,%2,1),%%xmm3 vpermq $0xd8,%%ymm2,%%ymm2 vpermq $0xd8,%%ymm3,%%ymm3 vpunpcklbw %%ymm3,%%ymm2,%%ymm2 ..use vpmovzxbd to expand the bytes to shorts, then vpslld and vpor vpmovzxbd (%1),%%ymm2 vpmovzxbd 0x00(%1,%2,1),%%ymm3 vpslld $0x10,%%ymm3,%%ymm3 vpor %%ymm3,%%ymm2,%%ymm2 which reduces the port 5 bottleneck by 1 cycle. Bug: libyuv:556 Test: out/Release/libyuv_unittest --gtest_filter=*I42?To*UY*Opt Change-Id: I53799e53cc6b090a1a695c839094c193be3eecaf Reviewed-on: https://chromium-review.googlesource.com/899873 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Cheng Wang <wangcheng@google.com> |
||
|---|---|---|
| build_overrides | ||
| docs | ||
| include | ||
| infra/config | ||
| source | ||
| tools_libyuv | ||
| unit_test | ||
| util | ||
| .clang-format | ||
| .gitignore | ||
| .gn | ||
| .vpython | ||
| all.gyp | ||
| Android.bp | ||
| Android.mk | ||
| AUTHORS | ||
| BUILD.gn | ||
| cleanup_links.py | ||
| CM_linux_packages.cmake | ||
| CMakeLists.txt | ||
| codereview.settings | ||
| DEPS | ||
| download_vs_toolchain.py | ||
| gyp_libyuv | ||
| gyp_libyuv.py | ||
| libyuv_nacl.gyp | ||
| libyuv_test.gyp | ||
| libyuv.gni | ||
| libyuv.gyp | ||
| libyuv.gypi | ||
| LICENSE | ||
| LICENSE_THIRD_PARTY | ||
| linux.mk | ||
| OWNERS | ||
| PATENTS | ||
| PRESUBMIT.py | ||
| public.mk | ||
| pylintrc | ||
| README.chromium | ||
| README.md | ||
| winarm.mk | ||
libyuv is an open source project that includes YUV scaling and conversion functionality.
- Scale YUV to prepare content for compression, with point, bilinear or box filter.
- Convert to YUV from webcam formats for compression.
- Convert to RGB formats for rendering/effects.
- Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
- Optimized for SSSE3/AVX2 on x86/x64.
- Optimized for Neon on Arm.
- Optimized for MSA on Mips.
Development
See [Getting started] 1 for instructions on how to get started developing.
You can also browse the [docs directory] 2 for more documentation.