Removed all SSE functions, macros, dispatching logic, and related
unit tests across the repository to reduce code size and complexity.
Left cpuid detection intact. Supported architectures like AVX2, NEON,
SVE, etc. are unaffected.
R=rrwinterton@gmail.com
Bug: None
Test: Build and run libyuv_unittest
Change-Id: Id19608dba35b79c4c8fc31f920a6a968883d300f
- use negative coefficients for UV to allow -128
- change shift to truncate instead of round for UV
- adapt all row_gcc RGB to UV into matrix functions
- add -DLIBYUV_ENABLE_ROWWIN to allow clang on Windows to use row_win.cc
Bug: 381138208
Change-Id: I6016062c859faf147a8a2cdea6c09976cbf2963c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6277710
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- ARM Planar test use regular asm volatile syntax
- x86 row functions remove volatile from asm
Bug: 347111119, 347112532
Change-Id: I535b3dfa1a7a19824503bd95584a63b047b0e9a1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5637058
Reviewed-by: Justin Green <greenjustin@google.com>
Add tests of all macros used by libyuv public headers
When a 1 step conversion is added, a 2 step test can compare
the old 2 step method to the 1 step. A 1 step unittest is
also added which compares C to SIMD. Making the 2 step
conversions measure performance of the 2 steps allows the
old 2 step performance to be compared to 1 step.
All macros used in public headers are added to an ifdef test.
Showing them in a unittest allows some diagnostics when
a test is failing.
Bug: libyuv:901
Change-Id: I7ffa6ed0cb3b506fa1b7fd4b7b1b729658c3c266
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2857916
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This allows the linker to move the variables from the .data section to
the .rodata section.
Bug: libyuv:254
Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0
Reviewed-on: https://chromium-review.googlesource.com/777033
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
vmovd is an AVX instruction. This will crash on an older CPU with
only SSSE3 but not AVX. Use movd instead.
Bug: libyuv:753
Test: ~/intelsde/sde -mrm -- out/Release/libyuv_unittest --gtest_filter=LibYUVCompareTest.BenchmarkHammingDistance_Opt
Change-Id: I1fb0026039d5f83d124f5d03fed7dc0d2d723e49
Reviewed-on: https://chromium-review.googlesource.com/756200
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Bug: libyuv:701
Test: objdump to confirm code gen
Change-Id: Ibdcb2cc6bc9bf14b4ccb874c49fc9ff664650e1a
Reviewed-on: https://chromium-review.googlesource.com/745390
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
popcnt has a fake dependency on the destination.
This assembly avoids the dependency by using a different
register for each popcnt.
Bug: libyuv:701
Test: LIBYUV_DISABLE_SSSE3=1 out/Release/libyuv_unittest --gtest_filter=*Ham*Opt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: Ie1d202e2613b7fa8a3c02acd433940e92c80eafa
Reviewed-on: https://chromium-review.googlesource.com/731826
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
BUG=libyuv:703
TEST=compile and disassemble. see registers used not stack.
R=wangcheng@google.com
Change-Id: Iaa07ee5d0c35252994491bb2868276e161149efd
Reviewed-on: https://chromium-review.googlesource.com/500427
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
BUG=libyuv:701
TEST=built and disassembled for aarch64
R=kjellander@chromium.org
Change-Id: I7712b1c7934e5dfb55fda1fa7c8405c32d6964ce
Reviewed-on: https://chromium-review.googlesource.com/495327
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Reviewed-by: Cheng Wang <wangcheng@google.com>
clangcl use compare_win for 32 bit, allowing fallback and enabling avx2 code for clang.
move defines/protos to compare_row.h
fix issue with odd width ARGBCopyAlpha functions by copying destination to temp buffer, then doing alpha copy, then copy back to destination.
R=harryjin@google.comTBR=harryjin@google.com
BUG=libyuv:484
Review URL: https://webrtc-codereview.appspot.com/59379004.