138 Commits

Author SHA1 Message Date
Frank Barchard
451af5e922 scale by 1 for neon implemented
void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fcvtn      v4.4h, v2.4s                   \n"  // 8 floatsgit
    "fcvtn2     v4.8h, v1.4s                   \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  :
  : "cc", "memory", "v1", "v2", "v4"
  );
}

void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fmul       v2.4s, v2.4s, %3.s[0]          \n"  // adjust exponent
    "fmul       v1.4s, v1.4s, %3.s[0]          \n"
    "uqshrn     v4.4h, v2.4s, #13              \n"  // isolate halffloat
    "uqshrn2    v4.8h, v1.4s, #13              \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  : "w"(scale * 1.9259299444e-34f)    // %3
  : "cc", "memory", "v1", "v2", "v4"
  );
}

TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2430313008 .
2016-10-21 14:30:03 -07:00
Frank Barchard
550cf829fb HalfFloat avx2 unpack bug fix.
AVX unpack parameters were reverse ordered causing incorrect results
on AVX2 hardware.

TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Half*

BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2438893002 .
2016-10-20 15:49:00 -07:00
Frank Barchard
f553db2d30 HalfFloatPlane unittest for denormal half floats
Halffloats have a limited range.  It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero.  This test ensures they behave the same in C and SIMD and tests the performance of denormals.

TEST=TestHalfFloatPlane_denormal
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2424233004 .
2016-10-19 18:13:01 -07:00
Frank Barchard
7fc932ddd3 Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560,chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2371293002 .
2016-09-29 15:06:30 -07:00
Frank Barchard
dc3a1295be add mergeuv test
Add test for SplitUVPlane and MergeUVPlane

Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists.

TEST=SplitUVPlane unittest
BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2279603002 .
2016-08-25 10:29:16 -07:00
Frank Barchard
b00d40160a make unittest allocator align to 64 bytes.
blur requires memory be aligned.  change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.

TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.

Review URL: https://codereview.chromium.org/2019753002 .
2016-05-27 18:02:47 -07:00
Frank Barchard
ade85fb55c remove row.h from unittests
add SIMD_ALIGNED to unittest header.

BUG=libyuv:594
TESTED=local build passes with row.h removed from tests.
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2001373002 .
2016-05-27 10:57:49 -07:00
Magnus Jedvert
942db3016a Add ARGBExtractAlpha function
BUG=libyuv:572
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1995293002 .
2016-05-26 10:30:57 +02:00
Frank Barchard
54bbea1701 Disable I420Blend_Any test that uses C
Also renames Inverted to Invert in test name for consistency.

TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1577973004 .
2016-01-11 18:23:04 -08:00
Frank Barchard
23c6a83561 Fix ifdef mismatch for mirroruv
Macro define and macro ifdef didnt match, leading to C code
being used.  Make macro match function name.

TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1579023002 .
2016-01-11 16:33:36 -08:00
Frank Barchard
3f4d86053e avx2 interpolate use 8 bit
BUG=libyuv:535
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1535833003 .
2015-12-21 10:57:32 -08:00
Frank Barchard
f4447745ae Add rounding to InterpolateRow for improved quality and consistency.
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly.  Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.

BUG=libyuv:535
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1533643005 .
2015-12-17 15:24:06 -08:00
Frank Barchard
ae55e41851 use rounding in scaledown by 2
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
2657688e70 Add support for odd height YUVA alpha blending.
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1507683003 .
2015-12-07 12:03:20 -08:00
Frank Barchard
bea690b3e0 AVX2 YUV alpha blender and improved unittests
AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions.

unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1505433002 .
2015-12-05 22:23:29 -08:00
Frank Barchard
8af0ebf816 planar blend use signed images
R=dhrosa@google.com, harryjin@google.com, jzern@chromium.org
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1491533002 .
2015-12-02 14:20:17 -08:00
Frank Barchard
b6f37bd8ec Interpolate plane initial implementation.
YUV version of interpolation between two images.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:526

Review URL: https://codereview.chromium.org/1479593002 .
2015-11-25 16:11:42 -08:00
Frank Barchard
c629cb3afe add command line cpu info to allow android neon test
in order to compare C and Neon code, a new command line flag is added.
historically environment variables controlled cpu features, but on
android apk it is easier to pass a command line option to disable cpu
optimizations.

R=harryjin@google.com
BUG=libyuv:516

Review URL: https://codereview.chromium.org/1407193009 .
2015-11-03 17:01:48 -08:00
Frank Barchard
26db4de2ae break up unittests into categories
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1399523004 .
2015-10-13 16:01:07 -07:00
Frank Barchard
16f12b58cc Replace random with fastrand
random / rand is slow and impacts performance testing.
Although its only called to clear a frame once, a typical profile shows
it high in the overall profile, when doing 1000 frames for a benchmark.

95.10%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_SSSE3
 2.01%  libyuv_unittest  libc-2.19.so         [.] __random_r
 1.13%  libyuv_unittest  libc-2.19.so         [.] __random

Replace random is a faster version for unittests.

set LIBYUV_WIDTH=1280
set LIBYUV_HEIGHT=720
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1
out\release\libyuv_unittest --gtest_filter=*YUY2ToARGB*  | findms

Was
libyuvTest.YUY2ToARGB_Opt (497 ms)

Now
libyuvTest.YUY2ToARGB_Opt (454 ms)

R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1361813002 .
2015-09-22 15:47:36 -07:00
fbarchard@google.com
d3d8e0d933 make source for planar tests contiguous to test planar functions coalesce into a single low level call.
BUG=431
TESTED=SetPlane unittest
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/51999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1419 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-01 23:28:59 +00:00
fbarchard@google.com
f16f33d4ce All cpu flags to be set so that instead of comparing C code, compare assembler to assembler, for benchmarking purposes.
BUG=none
TESTED=libyuv_unittest.exe
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/50499004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1346 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-26 18:22:29 +00:00
fbarchard@google.com
d28cd77f99 Enable assembly for clangcl build on Windows. Previously assembly was disabled so clangcl would work, but only with C code. As clangcl mimics both Visual C and GCC, ifdefs need to pick one or the other or often you'll end up with both. In this CL we disable most Visual C code and use the GCC versions which allow assembly for both 32 and 64 bit intel.
BUG=412
TESTED=clang=1 build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1341 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 20:36:31 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
8e3db2dc73 Support invert for ARGBRect and SetPlane
BUG=387
TESTED=ARGBRect_Invert
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37539004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1219 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-07 19:02:01 +00:00
fbarchard@google.com
61ffd847d7 Add tests for ARGBRect and SetPlane. Remove comment to test Neon shuffle and Setrows for Neon.
BUG=387
TESTED=libyuvTest.ARGBRect_Opt and libyuvTest.SetPlane_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1217 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-06 22:27:35 +00:00
fbarchard@google.com
ae9a1388a7 Use malloc for row buffers in rotate
BUG=296
TESTED=rotate_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@922 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-26 21:41:11 +00:00
fbarchard@google.com
ff74e023e1 A simple Makefile for libyuv on linux
BUG=286
TEST=make
R=kjellander@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/3979005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@856 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-18 16:56:45 +00:00
fbarchard@google.com
5daa25f9ba Add small test for blur
BUG=none
TEST=Blur*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3309004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@842 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-05 03:35:29 +00:00
fbarchard@google.com
f6bd6c0ac5 Use allocation instead of stack for a unittest that uses a bit too much.
BUG=284
TEST=Unattenuate test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3369004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@840 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-04 20:52:28 +00:00
fbarchard@google.com
092099507e Sobel using max to get abs for SSE2
BUG=none
TEST=none
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@824 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-23 00:51:52 +00:00
fbarchard@google.com
095f33d870 Coalesce rows by changing width/height and dropping into code instead of recursing. Improve coalesce by setting stride to 0 so it can be used even on odd width images. Reduce unittests to improve time to run emulators.
BUG=277
TEST=unittests all build and pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@819 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 19:29:10 +00:00
fbarchard@google.com
8be4b289c7 ARGBSobelToPlane which produces a planar output.
BUG=none
TEST=none
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2415005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@818 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 18:39:07 +00:00
fbarchard@google.com
adef267edf CopyYToAlpha to copy from a plane to alpha channel of ARGB
BUG=275
TESTED=untested
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2415004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@814 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-17 07:32:16 +00:00
fbarchard@google.com
7f67961ec5 ARGBCopyAlpha for effects
BUG=none
TEST=none
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/2385004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@810 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-12 22:27:37 +00:00
fbarchard@google.com
c99db063e2 Change ARGBColorMatrix to a 4x4.
BUG=none
TEST=planar_unitest updates
R=johannkoenig@google.com, ryanpetrie@google.com, thorcarpenter@google.com

Review URL: https://webrtc-codereview.appspot.com/2320008

git-svn-id: http://libyuv.googlecode.com/svn/trunk@805 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-01 01:27:30 +00:00
fbarchard@google.com
a927c6fb87 DrMemory fix for Sobel overread.
BUG=262
TESTED=Sobel* unittests re-enabled.

Review URL: https://webrtc-codereview.appspot.com/2273008

git-svn-id: http://libyuv.googlecode.com/svn/trunk@800 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-24 11:10:22 +00:00
fbarchard@google.com
07c3fe2f61 Fix DrMemory errors in unittests that were not initializing memory.
BUG=263
TEST=set GYP_DEFINES=build_for_tool=drmemory target_arch=ia32 & drmemory out\debug\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*
R=kjellander@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/2270007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@798 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-24 07:39:10 +00:00
fbarchard@google.com
afd1d6b4ec Fix 2 bugs with Luma scale
BUG=267
TEST=luma unittest improved
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2260005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@794 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-20 01:00:54 +00:00
fbarchard@google.com
a1ab194545 Color Table x86 reoptimized and ported to gcc.
BUG=266
TESTED=color table unittests
R=changjun.yang@intel.com

Review URL: https://webrtc-codereview.appspot.com/2216004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@791 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-16 17:01:02 +00:00
fbarchard@google.com
1390aaf69a fix for luma table valgrind uninitialized variable.
BUG=267
TEST=try bots
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2184008

git-svn-id: http://libyuv.googlecode.com/svn/trunk@784 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-10 23:38:49 +00:00
fbarchard@google.com
b38b73d88c ARGBLumaColorTable function.
BUG=267
TEST=Luma*
R=thorcarpenter@google.com

Review URL: https://webrtc-codereview.appspot.com/2202004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@783 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-10 20:34:09 +00:00
fbarchard@google.com
c3c06ec328 polynomial sse2 do 2 pixels at a time.
BUG=265
TEST=*Poly*
R=changjun.yang@intel.com

Review URL: https://webrtc-codereview.appspot.com/2195004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@782 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-10 08:16:06 +00:00
fbarchard@google.com
6da76f3b34 AVX version of Polynomial
BUG=265
TEST=untested
R=thorcarpenter@google.com, yunqingwang@google.com

Review URL: https://webrtc-codereview.appspot.com/2166004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@780 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-07 07:05:06 +00:00
fbarchard@google.com
ae0091e3a7 ARGBPolynomial for applying a 3 term polynomial matrix to pixels.
BUG=265
TEST=ARGBPolynomial
R=thorcarpenter@google.com

Review URL: https://webrtc-codereview.appspot.com/2159005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@778 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-03 19:20:47 +00:00
fbarchard@google.com
c4a70492c0 blur unittest and fix for negative height
BUG=256
TEST=*Blur*
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2027005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@757 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-14 16:16:39 +00:00
fbarchard@google.com
4b4b50fb44 Make unittests to 1280 pixels for simple planar tests, to get more realistic performance metrics than 256 pixels.
BUG=253
TEST=planar tests
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/1994004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@753 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-09 02:02:52 +00:00
fbarchard@google.com
5520710ef7 Add RGBColorTable which is like ARGBColorTable but only does first 3 channels.
BUG=none
TEST=none
R=dingkai@google.com, thorcarpenter@google.com, wuwang@google.com

Review URL: https://webrtc-codereview.appspot.com/1858004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@739 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-07-24 21:35:57 +00:00
fbarchard@google.com
4127a2637d ARGBInterpolate odd width support and inverted odd width test. ARGBToNV12/21 odd height fix. Compare test tolerate small height with warning.
BUG=202
TEST=libyuvTest.ARGBInterpolate85_Any_Invert
Review URL: https://webrtc-codereview.appspot.com/1325004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@663 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-15 10:43:33 +00:00
fbarchard@google.com
cd6056c01c InterpolateAny for unaligned and odd width interpolate. To be used in ARGBScaler in future.
BUG=208
TEST=ARGBInterpolate255_Unaligned
Review URL: https://webrtc-codereview.appspot.com/1324004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@662 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-15 03:05:08 +00:00