377 Commits

Author SHA1 Message Date
Frank Barchard
0815568a50 test for unaligned vs aligned for CopyRow_SSE2
improves performance on older CPUs where movdqa is faster.
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1455463002 .
2015-11-17 00:04:03 -08:00
Frank Barchard
ce4c2fad1d Raw 24 bit RGB to RGB24 (bgr)
Add unittests that do 1 step conversion vs 2 step conversion.

Tests that the end-swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1419103007 .
2015-11-03 10:30:30 -08:00
Frank Barchard
87926cec8b remove store bgra, abgr, raw unused macros
TBR=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1420033004 .
2015-11-02 10:40:03 -08:00
Frank Barchard
2c7aa0070a remove I422ToBGRA and use I422ToRGBA internally
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.

Adds unittests that do 1 step conversion vs 2 steps to test that the end-swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1427993004 .
2015-11-02 10:24:12 -08:00
Frank Barchard
5d97b93369 refactor I420ToABGR to use I420ToARGBRow
Using a transposed conversion matrix, I420ToARGB can output ABGR.

R=harryjin@google.com, xhwang@chromium.org
BUG=libyuv:473

Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
b86dbf24d3 refactor I420AlphaToABGR to use I420AlphaToARGB internally
swap U and V and transpose conversion matrix, so I420AlphaToARGB and
I420AlphaToABGR share low level code.

Having less code with same performance allows more focused
optimization for future ARM versions.

R=harryjin@google.com
TBR=harryjin@chromium.org
BUG=libyuv:473,libyuv:516

Review URL: https://codereview.chromium.org/1422263002 .
2015-10-27 14:17:21 -07:00
Frank Barchard
cf160cdbaa implement I444ToABGR by swapping uv and transpose matrix
U contributes to B and G.  V contributes to R and G.
By swapping U and V, they contribute to the opposite channels.  Adjust the matrix so the U contribution is in the matrix location such that it still contributes to the new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.

As a result the existing I444ToABGRRow functions are no longer needed and are removed.

Previously this function was only Intel AVX2 optimized for Windows.  Now it is also optimized for Arm and GCC.

ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.
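
The U/V swap described above can be illustrated with a small sketch. It is Python rather than the library's C, it uses rounded BT.601 coefficients scaled by 64 that were picked for this example, and every name in it (`BT601`, `mirror`, `yuv_to_argb_pixel`, `yuv_to_abgr_pixel`) is hypothetical. It assumes ARGB is stored in memory as B, G, R, A and ABGR as R, G, B, A.

```python
# Illustrative coefficients (x64 fixed point), not the library's actual tables.
BT601 = {"y": 75, "ub": 129, "ug": -25, "vg": -52, "vr": 102}


def clamp(v):
    return max(0, min(255, v))


def mirror(c):
    """Move the U terms into the V slots and the V terms into the U slots."""
    return {"y": c["y"], "ub": c["vr"], "ug": c["vg"], "vg": c["ug"], "vr": c["ub"]}


def yuv_to_argb_pixel(y, u, v, c):
    """Single shared low level: writes bytes in the order B, G, R, A."""
    yy = c["y"] * (y - 16)
    b = clamp((yy + c["ub"] * (u - 128)) >> 6)
    g = clamp((yy + c["ug"] * (u - 128) + c["vg"] * (v - 128)) >> 6)
    r = clamp((yy + c["vr"] * (v - 128)) >> 6)
    return bytes([b, g, r, 255])


def yuv_to_abgr_pixel(y, u, v):
    """R, G, B, A byte order from the same low level: swap U/V, mirror the matrix."""
    return yuv_to_argb_pixel(y, v, u, mirror(BT601))
```

With the inputs swapped and the matrix mirrored, the first byte written is driven by V (red) and the third by U (blue), so no separate ABGR row function is needed in this model.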

R=xhwang@chromium.org
BUG=libyuv:515

Review URL: https://codereview.chromium.org/1414133006 .
2015-10-27 10:21:21 -07:00
Frank Barchard
2e4466e282 change all pix parameters to width for consistency
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
Frank Barchard
76a599ec3b fix jpeg and bt.709 yuvconstants for neon64.
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces.  But the jpeg and bt.709
colour conversion constants were still in armv7 form.  This changes
the constants for aarch64 builds to be compatible with the code.

yuv constants are now passed as const *

Remove Yvu constants, which were used by an older version of nv21 but not by the new code.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
cc89e3a77b port ARGB to 565 dithering SSE2 code to GCC.
Previously the assembly code was only available to Windows.
This CL ports the SSE2 code to GCC syntax.

When running a profiler on all the unittests, this function
was the slowest of all functions that still ran in C code.
   3.71%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C

Was
ARGBToRGB565Dither_Opt (2894 ms)
Now
ARGBToRGB565Dither_Opt (432 ms)
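
For readers unfamiliar with the operation being ported, here is a minimal Python sketch of dithered ARGB to RGB565 packing. It assumes a 4-entry dither pattern indexed by `x & 3` that is added to each channel before truncation; the function name and signature are hypothetical and are not taken from the commit.

```python
def argb_to_rgb565_dither_row(src_argb, dither4, width):
    """Pack B, G, R, A bytes into 16-bit RGB565 values with per-column dither."""
    out = []
    for x in range(width):
        d = dither4[x & 3]                      # dither value for this column
        b, g, r = src_argb[4 * x], src_argb[4 * x + 1], src_argb[4 * x + 2]
        b5 = min(255, b + d) >> 3               # keep top 5 bits of blue
        g6 = min(255, g + d) >> 2               # keep top 6 bits of green
        r5 = min(255, r + d) >> 3               # keep top 5 bits of red
        out.append(b5 | (g6 << 5) | (r5 << 11))
    return out
```

The dither value nudges channel values that sit just below a quantization step over that step on some columns, which trades banding for fine-grained noise.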

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1397673002 .
2015-10-07 18:24:50 -07:00
Frank Barchard
914a9856c7 Reimplement NV21ToARGB to allow different color matrix.
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V.  But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.
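
The difference between the two approaches can be sketched in Python. NV12 interleaves chroma as U, V and NV21 as V, U; a dedicated NV21 reader hands (u, v) to the shared conversion in the usual order, so the matrix does not have to change. The function names are hypothetical.

```python
def read_nv12_uv(uv_row, x):
    """NV12 chroma row is U, V, U, V, ...; one pair covers two pixels."""
    i = (x // 2) * 2
    return uv_row[i], uv_row[i + 1]


def read_nv21_uv(vu_row, x):
    """NV21 chroma row is V, U, V, U, ...; return (u, v) in NV12 order."""
    i = (x // 2) * 2
    return vu_row[i + 1], vu_row[i]
```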

TBR=harryjin@google.com
BUG=libyuv:500

Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
2cc1a2b233 Remove sse2 functions that also have ssse3
ARGBBlendRow_SSE2, ARGBAttenuateRow_SSE2, and MirrorRow_SSE2
Since the vast majority of CPUs have SSSE3 now, removing the SSE2
versions improves the performance of CPU dispatching.

R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1377053003 .
2015-09-30 14:24:44 -07:00
Frank Barchard
febc26a2c9 win64 version of I422AlphaToARGB.
Was
I420AlphaToARGB_Premult (8861 ms)
I420AlphaToARGB_Opt (7119 ms)
Now
I420AlphaToABGR_Premult (2840 ms)
I420AlphaToARGB_Opt (484 ms)

C function switched to 1 step.
Was
I420AlphaToARGB_Premult (8862 ms)
I420AlphaToABGR_Opt (6718 ms)

Now
I420AlphaToARGB_Premult (8706 ms)
I420AlphaToARGB_Opt (6541 ms)

R=harryjin@google.com
BUG=libyuv:496, libyuv:473

Review URL: https://codereview.chromium.org/1359183003 .
2015-09-25 15:06:41 -07:00
Frank Barchard
9a0e12f5f1 AVX2 1 step I422AlphaToARGB for gcc and win.
C     I420AlphaToARGB_Opt (5169 ms)
SSSE3 I420AlphaToARGB_Opt (432 ms)
AVX2  I420AlphaToARGB_Opt (358 ms)

and with premultiplication as 2 step process:
I420AlphaToARGB_Premult (7029 ms)
I420AlphaToARGB_Premult (757 ms)
I420AlphaToARGB_Premult (508 ms)

R=harryjin@google.com
BUG=libyuv:496,libyuv:473

Review URL: https://codereview.chromium.org/1372653003 .
2015-09-25 13:37:42 -07:00
Frank Barchard
e365cdde3b I420Alpha row function in 1 pass.
API change - I420AlphaToARGB takes flag indicating if RGB should be
premultiplied by alpha.

This version implements an efficient SSSE3 version for Windows.
C version done in 2 steps.
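
A minimal Python sketch of what the premultiply flag means for one pixel, assuming the 2 step form (convert, then attenuate). The rounding used here, `(c * a + 127) // 255`, is an illustrative choice and may not match the library's arithmetic; both function names are hypothetical.

```python
def attenuate(b, g, r, a):
    """Scale each color channel by alpha / 255 (premultiplied alpha)."""
    def mul(c):
        return (c * a + 127) // 255
    return mul(b), mul(g), mul(r), a


def pixel_to_argb(bgr, a, premultiply):
    """Step 1 is assumed done (bgr already converted); step 2 is optional."""
    b, g, r = bgr
    return attenuate(b, g, r, a) if premultiply else (b, g, r, a)
```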

Was
libyuvTest.I420AlphaToARGB_Any (1136 ms)
libyuvTest.I420AlphaToARGB_Unaligned (1210 ms)
libyuvTest.I420AlphaToARGB_Invert (966 ms)
libyuvTest.I420AlphaToARGB_Opt (1031 ms)
libyuvTest.I420AlphaToABGR_Any (1020 ms)
libyuvTest.I420AlphaToABGR_Unaligned (1359 ms)
libyuvTest.I420AlphaToABGR_Invert (1082 ms)
libyuvTest.I420AlphaToABGR_Opt (986 ms)

R=harryjin@google.com
BUG=libyuv:496

Review URL: https://codereview.chromium.org/1367093002 .
2015-09-25 10:29:20 -07:00
Frank Barchard
d4594beefc switch from ebp to ebx.
ebx encodes more efficiently (1 byte less) than ebp for most address modes.
Previously it was used for the 411 format, but the reader now uses pinsrw, avoiding
a gpr register.

BUG=libyuv:488
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1365003003 .
2015-09-24 17:25:11 -07:00
Frank Barchard
000cf89ca8 YUY2ToARGB avx2 in 1 step conversion.
Includes UYVYToARGB ssse3 fix.

Was
YUY2ToARGB_Opt (433 ms)
69.79%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
20.73%  libyuv_unittest  libyuv_unittest      [.] YUY2ToUV422Row_AVX2
 6.04%  libyuv_unittest  libyuv_unittest      [.] YUY2ToYRow_AVX2
 0.77%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_AVX2

Now
YUY2ToARGB_Opt (280 ms)
95.66%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_AVX2
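
The profile above shows three row functions collapsing into one. The reader half of such a 1 step conversion can be sketched in Python as follows; YUY2 packs two pixels as Y0 U Y1 V, so both pixels of a pair share one U and one V. The function name is hypothetical and the sketch assumes the buffer holds whole pixel pairs.

```python
def yuy2_row_to_yuv(src, width):
    """Unpack a YUY2 row into per-pixel (y, u, v) tuples."""
    out = []
    for x in range(width):
        pair = (x // 2) * 4              # each 4-byte group holds two pixels
        y = src[pair + (x & 1) * 2]      # Y0 at offset 0, Y1 at offset 2
        u = src[pair + 1]
        v = src[pair + 3]
        out.append((y, u, v))
    return out
```

Feeding these tuples straight into the YUV to RGB math avoids writing intermediate Y and UV rows, which is the saving the timings suggest.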

BUG=libyuv:494
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1364813002 .
2015-09-23 11:15:18 -07:00
Frank Barchard
2b92ec8d0f Fix git markers introduced on landing previous CL
BUG=none

Review URL: https://codereview.chromium.org/1359023003 .
2015-09-22 15:00:57 -07:00
Frank Barchard
5f3d4270d1 yuy2 to rgb gcc versions
read in read function for yuv conversion

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1355393002 .
2015-09-22 14:27:33 -07:00
Frank Barchard
03cd8584e7 Read Y channel in read function for yuv conversion.
Allows reader to support YUY2 format.
Also contains fix for win64 build for yuv conversion.

TBR=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1355333002 .
2015-09-22 12:05:16 -07:00
Frank Barchard
f96890a0be yuvconstants for all YUV to RGB conversion functions.
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1363503002 .
2015-09-22 10:26:03 -07:00
Frank Barchard
62c49dc811 move constants into common
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1359443005 .
2015-09-18 16:28:44 -07:00
Frank Barchard
28427a53e2 I444ToABGR for android
Reimplements I444ToARGB as a matrix function.
new I444ToABGR as matrix functions with wrappers and any functions.
Allows for future J444 and H444 versions.
I444ToABGR user level function added.

BUG=libyuv:490, libyuv:449
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1355733002 .
2015-09-18 11:20:58 -07:00
Frank Barchard
316e1ab996 avx2 width parameter bug fix
R=harryjin@google.com
BUG=libyuv:489

Review URL: https://codereview.chromium.org/1321773004 .
2015-09-09 11:56:35 -07:00
Frank Barchard
ed55d24d9f H420 functionality
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/54869004 .
2015-09-06 11:01:40 -07:00
Frank Barchard
7060e0d826 I420ToABGRMatrix functions with J420ToABGR wrapper.
Allows direct conversion from JPeg to ABGR for android.

BUG=libyuv:488
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/55719004 .
2015-09-03 10:42:36 -07:00
Frank Barchard
925c3d9e26 I420ToARGB conversion with matrix.
Take color conversion constants as a parameter to row function for I420ToARGBMatrixRow_SSSE3.
Allows future variations of color space using a single low level.
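
As a rough illustration of passing the constants as a parameter, the Python sketch below runs one conversion routine with three sets of coefficients. The class and field names are hypothetical, and the floating-point coefficients are the commonly published BT.601, BT.709 and JPEG (full range) values, not the fixed-point tables used by the row functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YuvConstants:
    y_scale: float
    y_offset: int
    ub: float   # U contribution to B
    ug: float   # U contribution to G
    vg: float   # V contribution to G
    vr: float   # V contribution to R


BT601 = YuvConstants(1.164, 16, 2.018, -0.391, -0.813, 1.596)
BT709 = YuvConstants(1.164, 16, 2.112, -0.213, -0.533, 1.793)
JPEG = YuvConstants(1.0, 0, 1.772, -0.344, -0.714, 1.402)


def yuv_to_rgb(y, u, v, c):
    """One low level for every color space; only the constants differ."""
    def clamp(x):
        return max(0, min(255, round(x)))
    yy = c.y_scale * (y - c.y_offset)
    du, dv = u - 128, v - 128
    return (clamp(yy + c.vr * dv),
            clamp(yy + c.ug * du + c.vg * dv),
            clamp(yy + c.ub * du))
```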

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/56669004 .
2015-09-02 10:45:42 -07:00
Frank Barchard
0bc626a5d7 nolint removed
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/59389004.
2015-08-31 10:52:13 -07:00
Frank Barchard
0735245c52 pinsrw instruction allows reading 2 bytes directly into an xmm register.
Saving a gpr register allows the register to not be pushed for now, and in future it can be used to point to color conversion matrix or alpha channel.

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/52789004.
2015-08-28 17:03:54 -07:00
Frank Barchard
be11f500f0 Use ebp to point to conversion table.
Proof of concept that conversions can take a color matrix as a parameter.

R=harryjin@google.com

BUG=libyuv:472, libyuv:488

Review URL: https://webrtc-codereview.appspot.com/58489004.
2015-08-28 12:00:49 -07:00
Frank Barchard
3c4f5735ce use pointer to inverse table for clangcl
R=harryjin@google.com
TBR=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/54859004.
2015-08-26 12:53:03 -07:00
Frank Barchard
5452cce452 port row to clangcl
BUG=libyuv:487
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/53799005.
2015-08-25 16:15:42 -07:00
Frank Barchard
fa7ce4af3f fixed table for clangcl
R=harryjin@google.com
BUG=libyuv:487

Review URL: https://webrtc-codereview.appspot.com/53799004.
2015-08-25 10:47:30 -07:00
Frank Barchard
ee9aaea02f i422torgb565 is asm for clangcl as well
Merge branch 'master' of https://chromium.googlesource.com/libyuv/libyuv into convertcl

allow lto for llvm but not gcc

R=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/52769004.
2015-08-19 10:46:30 -07:00
Frank Barchard
715a29195b vpermq for avx2 ARGB4444ToARGB, ARGB1555ToARGB and RGB565ToARGB
R=harryjin@google.com
BUG=libyuv:462

Review URL: https://webrtc-codereview.appspot.com/52759004.
2015-07-07 17:06:04 -07:00
Frank Barchard
97b35daf75 disable faulty avx2 in argb conversions and box filter, and extend temporary buffer to 128 for an avx2 any function.
R=harryjin@google.com
BUG=libyuv:462
TESTED=libyuv_unittest run on haswell laptop

Review URL: https://webrtc-codereview.appspot.com/53759004.
2015-07-07 15:40:24 -07:00
Frank Barchard
0686f26938 blend remove alignment 1 pixel loop for less overhead.
R=tpsiaki@google.com
BUG=none
TESTED=libyuvTest.ARGBBlend_Opt

Review URL: https://webrtc-codereview.appspot.com/50289005.
2015-06-24 11:34:12 -07:00
fbarchard@google.com
2e9f3e5cf5 rename source files from row_posix.cc etc to row_gcc.cc to avoid gyp build filtering out source files from build when on windows with clang. The source code contained in row_gcc.cc is gcc syntax inline assembly available for any platform that supports gcc or clang for intel cpus.
BUG=440
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/56579004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1430 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 17:27:52 +00:00
fbarchard@google.com
b07de879b6 enable intrinsics for clangcl if -mssse3 is enabled.
BUG=451
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/52699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1427 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-08 22:48:18 +00:00
fbarchard@google.com
cfce47efc8 Change Sobel to use JPeg Luma calculation instead of extracting G channel. Using luma produces a better sobel that respects all 3 channels of RGB. Historically the G channel was used to improve performance, and because the luma of I420 is a constrained range, hurting quality. Using the JPeg variation of YUV, the luma is more accurate, including cross platform, better optimized for AVX2 and odd widths, and full range.
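
To see why luma respects all 3 channels where G alone does not, consider this small Python sketch. The 7-bit weights (38, 75, 15 out of 128) approximate the full range 0.299, 0.587, 0.114 luma formula and were chosen for this example; the function names are hypothetical.

```python
def luma_jpeg(r, g, b):
    """Full range luma with 7-bit fixed-point weights and rounding."""
    return (38 * r + 75 * g + 15 * b + 64) >> 7


def green_only(r, g, b):
    """The older shortcut: use the G channel as the intensity."""
    return g
```

A pure red pixel next to a pure blue pixel has identical G values (both 0), so an edge detector fed with G sees no edge there, while the luma values differ.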
BUG=444
TESTED=ARGBSobelXY_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/57479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1414 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-27 22:32:26 +00:00
fbarchard@google.com
632c50f29c include posix source for 64 bit clang builds.
BUG=440
TESTED=ninja -C out\Release_x64
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/46259004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1407 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-14 21:40:05 +00:00
fbarchard@google.com
2c44965e8d make row_win windows code built with clangcl include the _posix source code.
depot_tools excludes these source files now, so they need to be manually
included.
BUG=435
TESTED=clangcl local build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1395 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-02 00:39:29 +00:00
fbarchard@google.com
01db3d1d1d Remove declspec(align(32)) from AVX2 functions.
BUG=422
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/43229004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1374 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-20 22:57:04 +00:00
fbarchard@google.com
c7161d1c36 Remove code alignment declspec from Visual C versions for vs2014 compatibility.
BUG=422
TESTED=local vs2013 build still passes.

Review URL: https://webrtc-codereview.appspot.com/45959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1365 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-12 23:54:26 +00:00
fbarchard@google.com
bb5a009d11 ARGB4444ToARGB and ARGB1555ToARGB ported to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGB4444ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48009004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1363 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 23:52:57 +00:00
fbarchard@google.com
8b9f908134 RGB565ToARGB AVX2 vzeroupper before the ret, not after.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1362 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 22:53:12 +00:00
fbarchard@google.com
8f0b32773c ARGBToUV AVX2 functions hooked up.
BUG=none
TESTED=RGB565ToI420
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46829004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1359 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 00:10:52 +00:00
fbarchard@google.com
2827277496 port RGB565ToARGB to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1357 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 19:24:23 +00:00
fbarchard@google.com
d28cd77f99 Enable assembly for clangcl build on Windows. Previously assembly was disabled so clangcl would work, but only with C code. As clangcl mimics both Visual C and GCC, ifdefs need to pick one or the other or often you'll end up with both. In this CL we disable most Visual C code and use the GCC versions which allow assembly for both 32 and 64 bit intel.
BUG=412
TESTED=clang=1 build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1341 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 20:36:31 +00:00
fbarchard@google.com
3b4f5eb7b8 Port J422 colorspace to GCC
BUG=414
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/43809004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1334 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:54:50 +00:00
fbarchard@google.com
92f7f421fd rename I400 to J400 and I400 reference to I400. J400 is a simple replication of values to convert to RGB, which is what the old I400 was. I400 reference is the Y part of the YUV formula, so renaming that to I400.
BUG=none
TESTED=libyuvTest (5925 ms total)
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/50369005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1333 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:01:18 +00:00
fbarchard@google.com
f2fad0faa5 Optimized J422ToARGB.
BUG=414
TESTED=J422ToARGB unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/42799004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1328 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 18:08:30 +00:00
fbarchard@google.com
685b92b0a6 I400ToARGB_AVX2 port from SSE2 to AVX2.
BUG=403
TESTED=libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I400ToARGB*
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1322 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 18:12:17 +00:00
fbarchard@google.com
f5a7b2b48a I411ToARGB AVX2 version
BUG=403
TESTED=I411ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1321 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 00:08:56 +00:00
fbarchard@google.com
cdd80e04c9 Port I444ToARGB to AVX2.
BUG=403
TESTED=I444ToARGB unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1314 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-09 21:56:48 +00:00
fbarchard@google.com
697c5aa831 disable nv12 avx2 for vs9/10 that don't support avx2 instructions.
BUG=409
TESTED=try bots
R=harryjin@google.com, johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/43629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1311 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 19:12:21 +00:00
fbarchard@google.com
bdeb9ac584 switch from 8x8 to 4x4 matrix for dithering
BUG=407
TESTED=Dither unittests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1310 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 18:28:00 +00:00
fbarchard@google.com
0fe4abbc5c ARGBToRGB565 AVX2 with dithering
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/44519004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1309 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 22:31:43 +00:00
fbarchard@google.com
9245317e16 ARGBToRGB565 SSE2 port.
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1308 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 00:00:50 +00:00
fbarchard@google.com
933bd40c3c port ARGBToRGB565 and ARGB1555 to AVX2. Enable functions that use ARGBToRGB565 AVX2 code. Add ARGBToRGB565Dither function.
BUG=403
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1302 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-27 21:15:28 +00:00
fbarchard@google.com
bffd326f74 AVX2 version of ARGBToARGB4444
BUG=403
TESTED=local build on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/43429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1297 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 17:26:28 +00:00
fbarchard@google.com
d96047761e AVX2 version of NV12ToARGB
BUG=403
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40089004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1295 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:45:08 +00:00
fbarchard@google.com
975dd5a699 macros for storing RGB on windows.
BUG=403
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38119004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1283 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-14 00:50:48 +00:00
fbarchard@google.com
2f56d2859f Macro to store ARGB value
BUG=396
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1279 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 18:53:54 +00:00
fbarchard@google.com
d1ac8b17e6 use matrix for win64 version of I420ToARGB
BUG=396
TESTED=local unittests build/pass
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41899004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1276 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 00:57:46 +00:00
fbarchard@google.com
3bb829a44f Add a macro for YUV to RGB on Windows. Allows multiple color matrix structures in the future.
BUG=393
TESTED=local build
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1275 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 23:03:37 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. It's not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
baafc97d6b port YToARGB AVX2 to GCC
BUG=393
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39819004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1262 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 20:17:27 +00:00
fbarchard@google.com
c4e032c543 change Y multiplier and bias to compensate for 257/256 which makes YToARGB exactly match float math.
Histogram Before
hist            -3      -2      -1      0       1       2       3
red             0       0       1809408 13140736        1827072 0       0
green           0       0       1679912 13471329        1625975 0       0
blue            168448  994816  1876480 10655488        1893376 1006336 182272
Histogram After
hist            -3      -2      -1      0       1       2       3
red             0       0       558848  15632128        586240  0       0
green           0       0       209907  16350588        216721  0       0
blue            14848   642816  1989376 11363328        2053120 695040  18688
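
One reading of the 257/256 remark, sketched in Python: if an 8-bit Y is replicated into both bytes of a 16-bit lane, the value becomes y * 257, and taking the high 16 bits of a product divides by 65536 rather than 65535. The multiplier therefore has to be scaled by 256 * 256 / 257 to land on the intended 1.164 gain. The constants below are derived from that formula for this example, assuming 6-bit fixed point; they are not quoted from the commit.

```python
# Derived for this sketch: gain 1.164 in 6-bit fixed point, corrected for x257.
YG = round(1.164 * 64 * 256 * 256 / 257)
# Bias folds in the -16 luma offset plus half an LSB for rounding.
YGB = round(1.164 * 64 * -16 + 64 / 2)


def y_to_grey(y):
    """Convert limited range luma (16..235) to a full range grey level."""
    y1 = (y * 0x0101 * YG) >> 16          # replicate byte, multiply, keep high word
    return max(0, min(255, (y1 + YGB) >> 6))
```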
BUG=394
TESTED=more stringent luma tests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/38859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1259 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-04 19:45:26 +00:00
fbarchard@google.com
3982998c7c YToARGB AVX2 port from SSE2
BUG=393
TESTED=YToARGB unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1258 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-03 01:35:11 +00:00
fbarchard@google.com
29db9b0b89 C version of YToARGB with ubias removed to produce consistent luma ramp.
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1251 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:07:46 +00:00
fbarchard@google.com
080a316492 port yuv chroma improvements to gcc. YUV to RGB is more accurate using a negative matrix. 2% slower but half as much error.
BUG=324
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1249 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 04:35:51 +00:00
fbarchard@google.com
d12a08712b adjust ubias to minimize the centering error of the error histogram.
BUG=324
TESTED=TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37739004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1248 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 22:16:33 +00:00
fbarchard@google.com
eb8dda3ac7 fix for ybias on YToARGB function.
BUG=324
TESTED=libyuvTest.YToARGB_Any
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/36939004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1247 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 18:31:29 +00:00
fbarchard@google.com
b114986477 Change YUV to RGB to subtract the chroma contributions from the bias.
BUG=324
TESTED=win64 build and TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1246 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 04:22:35 +00:00
fbarchard@google.com
c62d30111f adjust bias on Y channel so error histogram is better centered on green channel
BUG=324
TESTED=FullYUVTest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/38689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1245 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-22 19:43:34 +00:00
fbarchard@google.com
319f047710 Compute chroma using negative coefficients to extend range of U contribution on B to 2
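
A possible explanation of why negative coefficients help, as a Python sketch. It assumes the coefficients are signed 8-bit values in 6-bit fixed point (scale 64), which is an assumption of this example rather than something the commit states. The BT.601 U contribution to B is about 2.018; a positive int8 tops out at 127 / 64, while a negative one reaches -128 / 64, that is, a magnitude of exactly 2.

```python
SCALE = 64          # assumed 6-bit fixed point
TARGET_UB = 2.018   # BT.601 U contribution to B


def quantize_int8(coeff):
    """Round to fixed point and clamp to the signed 8-bit range."""
    return max(-128, min(127, round(coeff * SCALE)))


positive = quantize_int8(TARGET_UB)     # clamps at the int8 maximum
negative = quantize_int8(-TARGET_UB)    # clamps at the int8 minimum

positive_gain = positive / SCALE
negative_gain = -negative / SCALE
```

In this model the negated coefficient is applied by subtracting the chroma term instead of adding it, so the larger magnitude is available without widening the coefficient type.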
BUG=324
TESTED=TestI420
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/41569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1238 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-21 18:45:13 +00:00
fbarchard@google.com
e7873910df port YUV luma accuracy to posix
BUG=324
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1236 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-21 00:36:30 +00:00
fbarchard@google.com
c3d09f6021 Improve accuracy of luma channel in YUV to RGB conversion
BUG=324
TESTED=TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/36859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1233 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-20 23:42:15 +00:00
fbarchard@google.com
b2a6af1be6 Change rectangle low level functions to use more conventional row functions including 'any' variations. Previously the yuv function SetPlane stored 32 bit values. Now a more conventional memset() style function is used for YUV that stores bytes. On Haswell a rep stosb is used for YUV. Overall benefit of this CL is improved performance for 'any' width, and simpler row assembly instead of full image assembly. Previously ARGBRect used a low level function that supported a rectangle in assembly. Now it uses a row function, and relies on row coalesce to combine into a single low level call.
BUG=371
TESTED=untested
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1222 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-12 03:58:24 +00:00
fbarchard@google.com
992c3b089a Use HAS_ARGBSETROWS_X86 to detect presence of function.
BUG=none
TESTED=rectangle unittests
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35639004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1218 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-07 00:11:51 +00:00
fbarchard@google.com
966233e5eb Remove sub 16 from yuv conversions and change bias to include it.
BUG=388
TESTED=out\release\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*420ToARGB_Opt  | sortms
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1216 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-31 01:07:02 +00:00
fbarchard@google.com
7892ea1fe1 Fix for ARGBToUV on AVX2
BUG=269
TESTED=local testing
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33669004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1202 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:59:23 +00:00
fbarchard@google.com
ddee77cdbd Fix for I422ToRGBA when I422ToARGB is not enabled for AVX2
BUG=269
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1201 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:28:59 +00:00
fbarchard@google.com
f5f5d15dcd Fix register order for ARGBToUV_AVX2
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1200 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:07:09 +00:00
fbarchard@google.com
540e8af80c remove add 16 from ARGBToYJ and add rounding, for consistency with Windows version. row.h header macros sorted alphabetically.
BUG=269
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32579005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1185 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 22:37:47 +00:00
fbarchard@google.com
c5aac16af9 Remove loop alignment for benefit of modern cpus that don't require alignment.
BUG=none
TESTED=local libyuv unittest passes
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1180 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-24 21:26:22 +00:00
fbarchard@google.com
ef14972df0 MergeUV AVX2 use vextractf128 to store results to avoid shuffling.
BUG=none
TESTED=intel sde on unittests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/33369004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1178 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-22 03:33:33 +00:00
fbarchard@google.com
ef67597b48 ARGBMirror use SSE2 pshufd instruction instead of SSSE3 pshufb.
BUG=269
TESTED=local benchmark for ARGBMirror
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32509004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1176 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-21 19:25:14 +00:00
fbarchard@google.com
91f240c5db Move sub before branch for loops.
Remove CopyRow_x86
Add CopyRow_Any versions for AVX, SSE2 and Neon.
BUG=269
TESTED=local build
R=harryjin@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/26209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1175 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-20 21:14:27 +00:00
fbarchard@google.com
b9d17e1d79 Fix offset in addresses for Windows. The offset is now required within [].
BUG=none
TESTED=local windows build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1168 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 19:50:42 +00:00
fbarchard@google.com
5822505e0a Remove extra unaligned loop from alphablender. Both aligned and unaligned loops were the same, so remove the extra.
BUG=none
TESTED=try bots.
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1166 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:33:07 +00:00
fbarchard@google.com
1eb636d249 remove initial lea in mirror functions and add the offset in the address mode.
BUG=none
TESTED=local libyuv unittests on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26169004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1165 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:16:23 +00:00
fbarchard@google.com
35508d0979 Mirror_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1164 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 23:11:10 +00:00
fbarchard@google.com
91000425a3 ARGBUnattenuate_AVX2 ported to GCC. Minor cleanup of constants to use broadcast to make 16 byte constant instead of 32 byte.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1163 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 17:57:33 +00:00
fbarchard@google.com
ec1f854f86 Use broadcast to duplicate constants from 16 bytes to 32 bytes to save data space.
BUG=none
TESTED=intelsde
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/32029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1161 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-12 01:45:27 +00:00
fbarchard@google.com
ee4bc0d834 vzeroupper moved to just before ret. In one case it was done after ret, which is a bug that would cause a performance stall.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1149 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-30 19:27:21 +00:00
fbarchard@google.com
2edea9454d Fix lint extraneous warning on row_win assembly by disabling the warning for those affected lines.
BUG=none
TESTED=lint row_win.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1144 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-27 16:27:48 +00:00
fbarchard@google.com
f2fa453b94 Port I422ToABGR to AVX2.
BUG=269
TESTED=intelsde on I422ToABGR
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1138 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 17:20:22 +00:00
fbarchard@google.com
22eb5965fc Optimize I422ToRGBA for AVX2 by hoisting ymm5 initialization and using different register for output of unpack.
BUG=269
TESTED=intelsde on I422ToABGR
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29889004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1137 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 23:39:16 +00:00