A cpu flags value of 1 disables SIMD and uses C code. This used to be 0,
but after the change in auto init behavior, 0 now means uninitialized and
causes auto detect to reinit the cpu info. A value of 1 disables the auto init.
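
A minimal sketch of the flag convention described above, assuming a lazily
initialized global. All names here (cpu_flags, DetectCpu, SetCpuFlags,
TestFlag, the enum values) are hypothetical and are not the libyuv API:

```c
/* Hypothetical names for illustration only; not the libyuv API. */
enum {
  kFlagsUninitialized = 0, /* 0: nothing detected yet, triggers auto init */
  kFlagsInitialized = 1,   /* bit 0: initialized, no features implied */
  kFlagsHasSimd = 2        /* example feature bit, assumed for this sketch */
};

static int cpu_flags = kFlagsUninitialized;
static int detect_calls = 0; /* counts how often detection ran */

/* Stand-in for real hardware detection. */
static int DetectCpu(void) {
  ++detect_calls;
  return kFlagsInitialized | kFlagsHasSimd;
}

/* Force a flags value. 1 means initialized with no features, so the C
   path is used and auto detect does not run. */
static void SetCpuFlags(int flags) {
  cpu_flags = flags;
}

/* Returns nonzero if the feature is available, detecting on first use. */
static int TestFlag(int feature) {
  if (cpu_flags == kFlagsUninitialized) {
    cpu_flags = DetectCpu();
  }
  return cpu_flags & feature;
}
```

With this convention, SetCpuFlags(1) keeps TestFlag from ever calling
DetectCpu, while SetCpuFlags(0) makes the next TestFlag call detect again.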
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1408753004 .
In order to compare C and Neon code, a new command line flag is added.
Historically, environment variables controlled cpu features, but in an
Android apk it is easier to pass a command line option to disable cpu
optimizations.
R=harryjin@google.com
BUG=libyuv:516
Review URL: https://codereview.chromium.org/1407193009 .
Removes low level functions for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with a transposed color matrix.
Adds unittests that do a 1 step conversion vs 2 steps to test that the end swapping versions match direct conversions.
R=harryjin@google.com
BUG=libyuv:518
Review URL: https://codereview.chromium.org/1427993004 .
Allows us to ignore flags passed on to us by Chromium build bots
without having to explicitly disable them. (Thanks pbos!)
TESTED=Running webrtc modules_unittests with a bogus flag did not result
in an error.
R=kjellander@chromium.org
BUG=libyuv:507
Review URL: https://codereview.chromium.org/1417573002 .
A hang in color conversion on arm occurs somewhere in yuv to rgb.
Breaking the color test into its own category of test will help
run selective tests to narrow down the issue.
R=harryjin@google.com
BUG=libyuv:506
Review URL: https://codereview.chromium.org/1405543003 .
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V. But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.
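
A minimal sketch of the idea, assuming hypothetical helper names (ReadNV12,
ReadNV21) that are not libyuv functions. The only difference between the two
formats is the byte order inside the interleaved chroma row, so a reader that
swaps at load time can feed the same conversion constants:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not libyuv functions.
   One chroma pair is shared by two horizontally adjacent pixels. */

/* NV12 stores chroma as U then V. */
static void ReadNV12(const uint8_t* src_uv, int x, uint8_t* u, uint8_t* v) {
  const uint8_t* p = src_uv + (x / 2) * 2;
  *u = p[0];
  *v = p[1];
}

/* NV21 stores chroma as V then U. Swapping here means the conversion
   that follows can use the same yuvconstants as every other format. */
static void ReadNV21(const uint8_t* src_vu, int x, uint8_t* u, uint8_t* v) {
  const uint8_t* p = src_vu + (x / 2) * 2;
  *v = p[0];
  *u = p[1];
}
```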
TBR=harryjin@google.com
BUG=libyuv:500
Review URL: https://codereview.chromium.org/1388273002 .
J444 is the JPEG YUV color space with 444 subsampling.
This implementation uses the existing I444ToARGB conversion, which is
BT.601 color space with 444 subsampling, but passes in the JPEG
color matrix constants.
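
A floating point sketch of how one conversion routine can serve both color
spaces by taking the matrix as a parameter. The struct and function names are
hypothetical, and the coefficients are the commonly quoted full range JPEG
(JFIF) and limited range BT.601 values, not libyuv's fixed point constants:

```c
#include <stdint.h>

/* Hypothetical matrix type for illustration; not libyuv's yuvconstants. */
typedef struct {
  int y_offset;  /* value subtracted from Y before scaling */
  double y_gain; /* scale applied to Y */
  double v_to_r;
  double u_to_g;
  double v_to_g;
  double u_to_b;
} YuvMatrix;

/* Full range JPEG (JFIF) coefficients. */
static const YuvMatrix kJpegMatrix = {0, 1.0, 1.402, -0.344136, -0.714136,
                                      1.772};
/* Limited range BT.601 coefficients, as commonly quoted. */
static const YuvMatrix kBt601Matrix = {16, 1.164, 1.596, -0.391, -0.813,
                                       2.018};

static uint8_t Clamp(double v) {
  if (v < 0.0) return 0;
  if (v > 255.0) return 255;
  return (uint8_t)(v + 0.5);
}

/* Converts one 444 pixel. Output bytes are B, G, R, A in memory. */
static void YuvPixelToArgb(uint8_t y, uint8_t u, uint8_t v,
                           const YuvMatrix* m, uint8_t* argb) {
  const double y1 = (y - m->y_offset) * m->y_gain;
  const double du = u - 128.0;
  const double dv = v - 128.0;
  argb[0] = Clamp(y1 + m->u_to_b * du);
  argb[1] = Clamp(y1 + m->u_to_g * du + m->v_to_g * dv);
  argb[2] = Clamp(y1 + m->v_to_r * dv);
  argb[3] = 255;
}
```

The same Y value maps differently under each matrix: Y=16 with neutral chroma
is black under BT.601 but dark gray (16) under the JPEG matrix.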
TBR=harryjin@google.com
BUG=449
Review URL: https://codereview.chromium.org/1387313002 .
API change - I420AlphaToARGB takes flag indicating if RGB should be
premultiplied by alpha.
This version implements an efficient SSSE3 version for Windows.
C version done in 2 steps.
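
A minimal sketch of what the second step of a 2 step C path could look like:
after the pixels are converted with alpha copied in, a separate pass
premultiplies the color channels. AttenuateRow is a hypothetical name, and the
rounding used here is an assumption rather than libyuv's exact arithmetic:

```c
#include <stdint.h>

/* Hypothetical helper for illustration; not a libyuv function.
   Premultiplies B, G, R by A for a row of ARGB pixels (bytes B, G, R, A
   in memory). Uses round to nearest: (c * a + 127) / 255. */
static void AttenuateRow(const uint8_t* src_argb, uint8_t* dst_argb,
                         int width) {
  for (int x = 0; x < width; ++x) {
    const uint32_t a = src_argb[3];
    dst_argb[0] = (uint8_t)((src_argb[0] * a + 127) / 255);
    dst_argb[1] = (uint8_t)((src_argb[1] * a + 127) / 255);
    dst_argb[2] = (uint8_t)((src_argb[2] * a + 127) / 255);
    dst_argb[3] = (uint8_t)a; /* alpha itself is unchanged */
    src_argb += 4;
    dst_argb += 4;
  }
}
```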
Was
libyuvTest.I420AlphaToARGB_Any (1136 ms)
libyuvTest.I420AlphaToARGB_Unaligned (1210 ms)
libyuvTest.I420AlphaToARGB_Invert (966 ms)
libyuvTest.I420AlphaToARGB_Opt (1031 ms)
libyuvTest.I420AlphaToABGR_Any (1020 ms)
libyuvTest.I420AlphaToABGR_Unaligned (1359 ms)
libyuvTest.I420AlphaToABGR_Invert (1082 ms)
libyuvTest.I420AlphaToABGR_Opt (986 ms)
R=harryjin@google.com
BUG=libyuv:496
Review URL: https://codereview.chromium.org/1367093002 .
random / rand is slow and impacts performance testing.
Although it's only called once to clear a frame, it shows up high in a
typical profile when doing 1000 frames for a benchmark.
95.10% libyuv_unittest libyuv_unittest [.] YUY2ToARGBRow_SSSE3
2.01% libyuv_unittest libc-2.19.so [.] __random_r
1.13% libyuv_unittest libc-2.19.so [.] __random
Replace random with a faster version for unittests.
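
A minimal sketch of one kind of faster replacement, assuming a simple linear
congruential generator. The names (fast_seed, FastRand, FillRandom) and the
multiplier and increment are illustrative choices, not necessarily what the
unittests use:

```c
#include <stdint.h>

/* Hypothetical names for illustration; constants are assumed. */
static uint32_t fast_seed = 0;

/* Linear congruential generator: one multiply and one add per call.
   Returns a 15 bit value taken from the upper half of the state. */
static int FastRand(void) {
  fast_seed = fast_seed * 214013u + 2531011u;
  return (int)((fast_seed >> 16) & 0x7fff);
}

/* Fills a buffer with pseudo random bytes, e.g. to initialize a test frame. */
static void FillRandom(uint8_t* dst, int size) {
  for (int i = 0; i < size; ++i) {
    dst[i] = (uint8_t)(FastRand() & 0xff);
  }
}
```

The quality of the sequence is poor by cryptographic standards, but for
filling test frames only speed and repeatability matter.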
set LIBYUV_WIDTH=1280
set LIBYUV_HEIGHT=720
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1
out\release\libyuv_unittest --gtest_filter=*YUY2ToARGB* | findms
Was
libyuvTest.YUY2ToARGB_Opt (497 ms)
Now
libyuvTest.YUY2ToARGB_Opt (454 ms)
R=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1361813002 .
Reimplements I444ToARGB as a matrix function.
Adds new I444ToABGR as a matrix function with wrappers and any functions.
Allows for future J444 and H444 versions.
I444ToABGR user level function added.
BUG=libyuv:490, libyuv:449
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1355733002 .
clangcl uses compare_win for 32 bit, allowing fallback and enabling avx2 code for clang.
Move defines/protos to compare_row.h.
Fix issue with odd width ARGBCopyAlpha functions by copying the destination to a temp buffer, then doing the alpha copy, then copying back to the destination.
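
A minimal sketch of the temp buffer approach for the leftover pixels, assuming
a row function that only handles widths that are a multiple of 8. The names
CopyAlphaRow8 and CopyAlphaRowAny are hypothetical stand-ins, not the libyuv
functions:

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a SIMD row function that requires width % 8 == 0.
   Copies only the alpha byte of each ARGB pixel. */
static void CopyAlphaRow8(const uint8_t* src, uint8_t* dst, int width) {
  for (int x = 0; x < width; ++x) {
    dst[x * 4 + 3] = src[x * 4 + 3];
  }
}

/* Handles any width. The remainder goes through temp buffers so the
   row function never reads or writes past the end of the real rows. */
static void CopyAlphaRowAny(const uint8_t* src, uint8_t* dst, int width) {
  uint8_t temp_src[8 * 4];
  uint8_t temp_dst[8 * 4];
  const int n = width & ~7; /* pixels handled directly */
  const int r = width & 7;  /* leftover pixels */
  if (n > 0) {
    CopyAlphaRow8(src, dst, n);
  }
  if (r > 0) {
    memset(temp_src, 0, sizeof(temp_src));
    memset(temp_dst, 0, sizeof(temp_dst));
    memcpy(temp_src, src + n * 4, (size_t)r * 4);
    /* Copy the destination first so its color channels are preserved. */
    memcpy(temp_dst, dst + n * 4, (size_t)r * 4);
    CopyAlphaRow8(temp_src, temp_dst, 8);
    memcpy(dst + n * 4, temp_dst, (size_t)r * 4);
  }
}
```

Without the copy of the destination into the temp buffer, the copy back would
overwrite the destination's existing color channels with zeros.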
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:484
Review URL: https://webrtc-codereview.appspot.com/59379004.