1262 Commits

Author SHA1 Message Date
Frank Barchard
aa197ee1a3 HalfFloat_SSE2 for Visual C
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows
R=hubbe@chromium.org, wangcheng@google.com

Review URL: https://codereview.chromium.org/2387713002 .
2016-10-03 10:33:38 -07:00
Frank Barchard
4a14cb2e81 HalfFloat_SSE2 port from C algorithm to SSE2
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2381493006 .
2016-09-30 09:47:16 -07:00
Frank Barchard
7fc932ddd3 Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560,chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2371293002 .
2016-09-29 15:06:30 -07:00
Frank Barchard
c11e9b7fb7 bt709 coefficients for video constrained space
Original bt709 color space coefficients were full range yuv for higher
quality.  This change makes the coefficients use the video constrained
color space the same as bt601 which is 16 to 240 for Y and 16 to 235 for
chroma channels.

BUG=libyuv:639
TEST=libyuv unittests run locally
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2367253003 .
2016-09-28 15:07:46 -07:00
Frank Barchard
6732bcbde9 ShortToHalfFloat_AVX2 function
BUG=libyuv:560
TEST=local compile for windows
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2364293002 .
2016-09-27 14:18:32 -07:00
Frank Barchard
618149084e Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function
This patch adds MSA optimized ARGBMirrorRow function in libYUV project.

Performance gain ~3x

R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2368313003 .
2016-09-26 16:28:01 -07:00
Frank Barchard
34a29bf756 fix warning on visual C for mips cpu detect
follow up warning fixs
cpu_id.cc(167): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
lint warning: cpu_id.cc:171:  Missing space before ( in if(  [whitespace/parens] [5]

TBR=manojkumar.bhosale@imgtec.com
BUG=libyuv:634
TEST=try bots for windows.

Review URL: https://codereview.chromium.org/2365813002 .
2016-09-22 18:25:52 -07:00
Frank Barchard
c5323b0fdc Add MIPS SIMD Arch (MSA) optimized MirrorRow function
As per the preparation patch added in Chromium sources at,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds

This patch adds first MSA optimized function in libYUV project.

BUG=libyuv:634
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2285683002 .
2016-09-22 16:12:22 -07:00
Frank Barchard
6ad3aa6ae4 fix multi-line comment warning
../../source/scale_neon.cc:576:1: error: multi-line comment [-Werror=comment]
 // #define BLENDER(a, b, f) (uint8)((int)(a) + \
 ^

BUG=None
TEST=try bots

Review URL: https://codereview.chromium.org/2344203003 .
2016-09-16 15:16:39 -07:00
Frank Barchard
8279df963e Scale by 3/8 only if source is multiple of 8 tall.
BUG=libyuv:635
TEST=try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2347733002 .
2016-09-16 14:57:47 -07:00
Frank Barchard
137aa63afe Fix some comment typos
BUG=None
TEST=try bots

Review URL: https://codereview.chromium.org/2346633002 .
2016-09-15 15:38:19 -07:00
Frank Barchard
de944ed8c7 YuvConstants declare alignment for externs as well as declarations
On visual c 2013 and earlier a warning is generated if externs
are not declared with the same alignment as the declaration, when
using /ltcg

BUG=libyuv:633
TEST=standalong test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg
R=skal@google.com

Review URL: https://codereview.chromium.org/2291533004 .
2016-08-30 11:06:46 -07:00
Frank Barchard
c244a3e9a0 Add SplitUVPlanes and MergeUVPlanes
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.

BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2277603004 .
2016-08-24 16:47:24 -07:00
Frank Barchard
161e5c4569 Allow NULL for dst_y in planar formats. BUG=libyuv:631 TEST=unittests build/pass
BUG=libyuv:631
TEST=unittests build/pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2271053003 .
2016-08-24 10:19:14 -07:00
Frank Barchard
17d31e6a4a NV12 allow NULL for Y
The conversion from NV12 and other Bi or Tri planar formats, differs only in the UV handling.  The helper function supports passing a NULL for the dst_y channel indicating you only want to do the UV conversion.

TBR=harryjin@google.com
TEST=LibYUVConvertTest.NV12ToI420_NullY (601 ms)
BUG=libyuv:626

Review URL: https://codereview.chromium.org/2276703002 .
2016-08-23 19:05:25 -07:00
Frank Barchard
d58297a2df NV12ToI420 use SplitPlane function
TBR=magjed@chromium.org
BUG=libyuv:629
TEST=LibYUVConvertTest.NV12ToI420_Opt

Review URL: https://codereview.chromium.org/2267303002 .
2016-08-22 18:35:55 -07:00
Frank Barchard
36ae08ce1c Suppress MJPEG fprintf() runtime warning
TBR=harryjin@google.com
BUG=libyuv:630
TEST=local build and try bots pass

Review URL: https://codereview.chromium.org/2264293002 .
2016-08-22 16:30:36 -07:00
Frank Barchard
46a8eaaf0c fix typo in YUV
R=braveyao@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2152623002 .
2016-07-13 17:17:19 -07:00
Frank Barchard
1aa4ddd21c Attribute aligned 32 for YUV conversion structure on Intel
Fix for unaligned memory exception.

R=braveyao@chromium.org
BUG=libyuv:616

Review URL: https://codereview.chromium.org/2152553002 .
2016-07-13 12:19:26 -07:00
Frank Barchard
abcb70f183 Test nv21 layout of Android420ToI420 function.
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Any
[       OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Any
[       OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms)

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2146733002 .
2016-07-12 18:34:04 -07:00
Frank Barchard
84e04699c2 Add libyuv:Android420ToI420 function which takes 3 pointers
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2114843002 .
2016-07-12 16:23:51 -07:00
Frank Barchard
2f101fdbda mingw64 fix - guard row_win.cc against mingw build.
The old guard only checked for defined(_M_X64) which is defined by mingw64.  Add a test for defined(_MSC_VER) which is defined for clangcl and visual c but not mingw.  mingw should use row_gcc.cc for both 32 and 64 bit.

R=harryjin@google.com
BUG=webm:1252,libyuv:613
TEST=local gcc/clang builds on linux tested and try bots for others.

Review URL: https://codereview.chromium.org/2105603002 .
2016-06-28 10:21:27 -07:00
Frank Barchard
b8ddb5a2a7 rounding for arm filter
R=wangcheng@google.com, harryjin@google.com
BUG=libyuv:607

Review URL: https://codereview.chromium.org/2093913004 .
2016-06-24 16:07:49 -07:00
Frank Barchard
1b3e4aee47 make count a memory variable for 32 bit
32 bit clang runs out of registers and compiler does core dump.
force 32 bit build to use memory variable for counter.

BUG=libyuv:612
TBR=harryjin@google.com

Review URL: https://codereview.chromium.org/2091913003 .
2016-06-23 20:42:10 -07:00
Frank Barchard
cc88adc620 YUV scale filter columns improved filtering accuracy
upscale a YUV image.  observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.

was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)

BUG=libyuv:605
TEST=try bots
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Niels Möller
365ed3851c Treat YU12 as an alias for I420. Simplify setting of inv_crop_height.
BUG=
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2020193002 .
2016-06-16 12:49:17 +02:00
Frank Barchard
fd3e676e91 android_full_debug x86 fix - use +rm for width count
Work around for android full debug build runnign out of registers.
5 functions were running out of registers causing the compiler error
error: 'asm' operand has impossible constraints
These functions mostly have 4 pointers, a counter (width) and a tempory
eax register.  With fpic and debug using stackframes, 2 registers are
unavailable.  So a total of 8 registers are used.
Although fpic and stack frame dont apply to assembly, the compiler
reserves 2 registers.  The optimized version builds, so its likely
freeing up the registers once it knows they are not used.
These functions used to build, so compile options and/or compiler may
have updated.. likely fpic was turned on.
An attribute can be done to disable each, and will avoid using the
2 GPR registers, but they are still reserved and unavailable in debug
builds on current compilers (gcc 4.9 and clang 3.8).

R=dhrosa@google.com
BUG=libyuv:602

Review URL: https://codereview.chromium.org/2066933002 .
2016-06-14 15:25:28 -07:00
Frank Barchard
026be3cd85 neon64 use width int directly.
width %w size modifier the int width can be passed directly to arm assembly.
For functions that take input constants, the outputs are declared as early
write using &, meaning the outputs use used before all inputs are consumed.

R=harryjin@google.com
BUG=libyuv:598

Review URL: https://codereview.chromium.org/2043073003 .
2016-06-08 10:26:53 -07:00
Frank Barchard
17e8a4d3df Remove ifdefs for neon in row_neon*.cc
ifdefs on a function level are not needed for neon functions, unless
they are conditionally enabled in row.h.  No functions are conditionally
enabled at this time, so all ifdefs can be removed from row_neon.cc and
row_neon64.cc

TBR=kjellander@chromium.org
BUG=libyuv:599

Review URL: https://codereview.chromium.org/2044223002 .
2016-06-07 14:34:13 -07:00
Frank Barchard
6546096269 ARGBExtractAlpha 16 pixels at a time for ARM
arm64   8     TestARGBExtractAlpha (10019 ms) <-original 64 bit code
arm64   8 x2  TestARGBExtractAlpha (7639 ms)
arm64   16    TestARGBExtractAlpha (7369 ms) <- new 64 bit code
thumb32 8     TestARGBExtractAlpha (9505 ms) <- original 32 bit code
thumb32 8 x2  TestARGBExtractAlpha (7400 ms)
thumb32 8 x2i TestARGBExtractAlpha (7266 ms) <- new 32 bit code
arm32   8     TestARGBExtractAlpha (10002 ms)

BUG=libyuv:572
TESTED=local test on nexus 9
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2035573002 .
2016-06-07 10:44:28 -07:00
Frank Barchard
b00d40160a make unittest allocator align to 64 bytes.
blur requires memory be aligned.  change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.

TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.

Review URL: https://codereview.chromium.org/2019753002 .
2016-05-27 18:02:47 -07:00
Frank Barchard
ade85fb55c remove row.h from unittests
add SIMD_ALIGNED to unittest header.

BUG=libyuv:594
TESTED=local build passes with row.h removed from tests.
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2001373002 .
2016-05-27 10:57:49 -07:00
Magnus Jedvert
942db3016a Add ARGBExtractAlpha function
BUG=libyuv:572
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1995293002 .
2016-05-26 10:30:57 +02:00
Frank Barchard
74a69522da white space fixes for MIPS
TBR=kjellander@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2005053004 .
2016-05-24 14:17:18 -07:00
Frank Barchard
7edf572e28 remove includes for duplicate functions
R=harryjin@google.com
BUG=libyuv:592
TESTED=local builds work with fewer headers

Review URL: https://codereview.chromium.org/2006943002 .
2016-05-23 17:38:26 -07:00
Frank Barchard
fbdc43a03c fix wrong HAS_ARGBCOPYALPHAROW_SSE2 ifdef
TBR=kjellander@chromium.org
BUG=libyuv:593
TESTED=try bots pass.

Review URL: https://codereview.chromium.org/2000393002 .
2016-05-23 16:26:02 -07:00
Frank Barchard
cf101116c9 Remove initialize to zero on output variables for inline.
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs).  This allows the initialize
to zero step to be removed, saving 1 instruction.

BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
9c53ff2c57 Fix temporary stride for ConvertToARGB with rotation.
BUG=libyuv:578
TESTED=local unittests pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1879783002 .
2016-04-11 15:21:04 -07:00
Frank Barchard
3c862e3d29 Fix stride bug for msan on I420Interpolate.
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C

BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1872953002 .
2016-04-08 15:58:53 -07:00
Frank Barchard
c7372a323a add if defined(_MSC_FULL_VER) for NaCL
TBR=kjellander@chromium.org
BUG=libyuv:573
TESTED=try bots

Review URL: https://codereview.chromium.org/1850053002 .
2016-04-01 17:48:23 -07:00
Frank Barchard
76aee8ced7 Remove most clang-cl special cases from cpu_id.cc
They are not needed, and due to them there was a call to _xgetbv()
without a declaration of the function.  This used to work because we
implicitly included intrin.h in all translation units with clang-cl, but
we want to stop doing that.

BUG=chromium:592745
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1780473003 .
2016-03-10 14:01:26 -08:00
Frank Barchard
ee99b85126 Port ARGBToRGB565 from aarch64 neon to 32 bit
The 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts and inserts, saving some masking.  The instruction is available for neon 32 bit as well.

R=magjed@chromium.org, harryjin@google.com
BUG=libyuv:571

Review URL: https://codereview.chromium.org/1724393002 .
2016-02-29 12:22:25 -08:00
Frank Barchard
22e062a448 Port ARGBToJ420 to AVX2
ARGBToJ420 had an SSSE3 version, but not AVX2.
ARGBToI420 had an AVX2, so adapt that code to J420.

TBR=harryjin@google.com
BUG=libyuv:553

Review URL: https://codereview.chromium.org/1702373004 .
2016-02-17 23:16:39 -08:00
Frank Barchard
127ff512b3 add perf data files to ignores
document play services update

R=jkellander@chromium.org
BUG=none

Review URL: https://codereview.chromium.org/1712463002 .
2016-02-17 21:37:09 -08:00
Frank Barchard
cc33dc68c7 Port I411ToARGBRow to AVX2.
An SSSE3 version already exists, and an AVX2 version is available for
Visual C.  This ports the function to AVX2 completing the AVX2 ports of
all YUV to RGB functions for AVX2 on gcc.

TBR=harryjin@google.com
BUG=libyuv:555

Review URL: https://codereview.chromium.org/1687253002 .
2016-02-12 10:26:10 -08:00
Frank Barchard
0e554b18fe port NV12ToRGB565Row_AVX2 to gcc
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2

NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.

R=harryjin@google.com
BUG=libyuv:554

Review URL: https://codereview.chromium.org/1687953002 .
2016-02-10 11:13:41 -08:00
Frank Barchard
c39509c8e5 add avx2 wrappers for functions that can call I422ToARGBRow_AVX2
R=harryjin@google.com
BUG=libyuv:557

Review URL: https://codereview.chromium.org/1687713002 .
2016-02-09 17:14:29 -08:00
Frank Barchard
0d880e5bc0 rename MIPS_DSPR2 to DSPR2 for consistency
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor.  This should be named consistently.

Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.

TBR=harryjin@google.com
BUG=libyuv:562

Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
05ed0c539c rework scale code for ubsan
For more info on ubsan, see
http://dev.chromium.org/developers/testing/undefinedbehaviorsanitizer

TESTED=Passing compilation using:
GYP_DEFINES="ubsan=1"
GYP_DEFINES="ubsan_vptr=1"

R=harryjin@google.com, pbos@webrtc.org
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1654253004 .
2016-02-02 11:01:49 -08:00
Frank Barchard
9e39c1f271 ubsan overflow fix for multiply by 0x01010101
This is an UBSan error reported by libjingle

[ RUN      ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride
[000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73
../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int'
[8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms)
[9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms)
Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted

The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes.
Changing the constant to 0x01010101u should avoid overflow.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1657533005 .
2016-02-01 12:29:04 -08:00