Work around for android full debug build runnign out of registers.
5 functions were running out of registers causing the compiler error
error: 'asm' operand has impossible constraints
These functions mostly have 4 pointers, a counter (width) and a tempory
eax register. With fpic and debug using stackframes, 2 registers are
unavailable. So a total of 8 registers are used.
Although fpic and stack frame dont apply to assembly, the compiler
reserves 2 registers. The optimized version builds, so its likely
freeing up the registers once it knows they are not used.
These functions used to build, so compile options and/or compiler may
have updated.. likely fpic was turned on.
An attribute can be done to disable each, and will avoid using the
2 GPR registers, but they are still reserved and unavailable in debug
builds on current compilers (gcc 4.9 and clang 3.8).
R=dhrosa@google.com
BUG=libyuv:602
Review URL: https://codereview.chromium.org/2066933002 .
cpu_info_ is zero for uninitialized state and all bits are off, disabling all cpu optimizations.
the 1 bit indicates cpu_info_ is initialized avoiding calling the detection code again for performance.
MaskCpuFlags initializes the cpu ignoring existing flags, then masks with the supplied flags and stores to cpu_info_.
As a mask, -1 has no effect, enabling all cpu features that were detected, but nothing that wasnt detected.
Setting to 0 will cause the next call to re-initialize the cpu, which is same as enabling all features.
Setting mask to 1 will turn off all cpu features but keep the initialized bit on, so the next detection call wont reinitialize and the cpu features are all disabled.
So normal behavior for command line and programatic masking is:
1 = C
-1 = SIMD
TBR=harryjin@google.com
BUG=libyuv:600
TESTED=out64/Release/bin/run_libyuv_unittest -s libyuv_unittest --verbose --release --gtest_filter=*ARGBExtractAlpha* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=1 --libyuv_cpu_info=1"
Review URL: https://codereview.chromium.org/2042933002 .
width %w size modifier the int width can be passed directly to arm assembly.
For functions that take input constants, the outputs are declared as early
write using &, meaning the outputs use used before all inputs are consumed.
R=harryjin@google.com
BUG=libyuv:598
Review URL: https://codereview.chromium.org/2043073003 .
ifdefs on a function level are not needed for neon functions, unless
they are conditionally enabled in row.h. No functions are conditionally
enabled at this time, so all ifdefs can be removed from row_neon.cc and
row_neon64.cc
TBR=kjellander@chromium.org
BUG=libyuv:599
Review URL: https://codereview.chromium.org/2044223002 .
blur requires memory be aligned. change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.
TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.
Review URL: https://codereview.chromium.org/2019753002 .
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs). This allows the initialize
to zero step to be removed, saving 1 instruction.
BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1895743008 .
This requires you don't have target_os=["ios"] set in your
libyuv root .gclient file, since that will make native_client not
being downloaded due to
https://code.google.com/p/chromium/codesearch#chromium/src/DEPS&l=357
BUG=libyuv:573
TESTED=
rm chromium/.last_sync_chromium
rm chromium/.gclient.tmp_entries
gclient sync
native_client/build/gyp_nacl all.gyp -Dgtest_target_type=executable -Dmsan=0 -Duse_system_yasm=0
ninja -C out/Debug
Review URL: https://codereview.chromium.org/1845003004 .
normally this warning is disabled but for nacl builds that
use clang its not. this CL makes the option universally disabled
for all clang based builds.
BUG=libyuv:581
TESTED=native_client/build/gyp_nacl all.gyp -Dgtest_target_type=executable -Dmsan=0 & ninja -C out/Debug libyuv
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/1865183002 .
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C
BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com
Review URL: https://codereview.chromium.org/1872953002 .
They are not needed, and due to them there was a call to _xgetbv()
without a declaration of the function. This used to work because we
implicitly included intrin.h in all translation units with clang-cl, but
we want to stop doing that.
BUG=chromium:592745
R=fbarchard@google.com
Review URL: https://codereview.chromium.org/1780473003 .
An SSSE3 version already exists, and an AVX2 version is available for
Visual C. This ports the function to AVX2 completing the AVX2 ports of
all YUV to RGB functions for AVX2 on gcc.
TBR=harryjin@google.com
BUG=libyuv:555
Review URL: https://codereview.chromium.org/1687253002 .
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2
NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.
R=harryjin@google.com
BUG=libyuv:554
Review URL: https://codereview.chromium.org/1687953002 .