101 Commits

Author SHA1 Message Date
Manojkumar Bhosale
a899dea251 Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA          - ~1.1x
ARGBAttenuateRow_Any_MSA      - ~1.1x
ARGBToRGB565DitherRow_MSA     - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA            - ~5.1x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~1.1x
ARGBGrayRow_MSA               - ~2.6x
ARGBSepiaRow_MSA              - ~11.6x

Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA          - ~2.46x
ARGBAttenuateRow_Any_MSA      - ~2.45x
ARGBToRGB565DitherRow_MSA     - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA            - ~5.2x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~4.3x
ARGBGrayRow_MSA               - ~10.5x
ARGBSepiaRow_MSA              - ~12.2x

Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Frank Barchard
dde8ba7009 ConvertFromI420: use halfstride instead of halfwidth
BUG=libyuv:660
TEST=try bots
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2554213003 .
2016-12-07 10:16:16 -08:00
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
10ce829bad Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422ToRGB565Row_MSA             : ~1.5x
I422ToRGB565Row_Any_MSA         : ~1.5x
I422ToARGB4444Row_MSA           : ~1.4x
I422ToARGB4444Row_Any_MSA       : ~1.4x
I422ToARGB1555Row_MSA           : ~1.4x
I422ToARGB1555Row_Any_MSA       : ~1.4x

Performance Gain (vs C non-vectorized)
I422ToRGB565Row_MSA             : ~6.8x
I422ToRGB565Row_Any_MSA         : ~6.8x
I422ToARGB4444Row_MSA           : ~6.6x
I422ToARGB4444Row_Any_MSA       : ~6.6x
I422ToARGB1555Row_MSA           : ~6.6x
I422ToARGB1555Row_Any_MSA       : ~6.6x

Review URL: https://codereview.chromium.org/2445343007 .
2016-10-27 10:47:35 -07:00
Frank Barchard
532f5708a9 Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA      : ~1.4x
I422AlphaToARGBRow_Any_MSA  : ~1.4x
I422ToRGB24Row_MSA          : ~4.8x
I422ToRGB24Row_Any_MSA      : ~4.8x

Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA      : ~7.0x
I422AlphaToARGBRow_Any_MSA  : ~7.0x
I422ToRGB24Row_MSA          : ~7.9x
I422ToRGB24Row_Any_MSA      : ~7.7x

Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
f5d5bd88d6 Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gains :- (vs C vectorized)

I422ToARGBRow_MSA     : ~1.6x
I422ToRGBARow_MSA     : ~1.6x

I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x

Performance Gains :- (vs C non-vectorized)

I422ToARGBRow_MSA     : ~7x
I422ToRGBARow_MSA     : ~7x

I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x

Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.

Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
Frank Barchard
d363ea6527 Remove I411 support.
YUV 411 is very uncommon format.  Remove support.

Update documentation to reflect that 411 is deprecated.

Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.

BUG=libyuv:645
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
7018f5be0f Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains :-

I422ToYUY2Row_MSA     - ~12x
I422ToYUY2Row_Any_MSA - ~7x

I422ToUYVYRow_MSA     - ~12x
I422ToUYVYRow_Any_MSA - ~7x

Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
c244a3e9a0 Add SplitUVPlanes and MergeUVPlanes
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.

BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2277603004 .
2016-08-24 16:47:24 -07:00
Frank Barchard
161e5c4569 Allow NULL for dst_y in planar formats. BUG=libyuv:631 TEST=unittests build/pass
BUG=libyuv:631
TEST=unittests build/pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2271053003 .
2016-08-24 10:19:14 -07:00
Niels Möller
365ed3851c Treat YU12 as an alias for I420. Simplify setting of inv_crop_height.
BUG=
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2020193002 .
2016-06-16 12:49:17 +02:00
Frank Barchard
0d880e5bc0 rename MIPS_DSPR2 to DSPR2 for consistency
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor.  This should be named consistently.

Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.

TBR=harryjin@google.com
BUG=libyuv:562

Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
8377c798fb Fix I420ToNV21 for wrong dst_stride_y parameter.
I420ToNV21 passes the wrong dst_stride_y when it calls I420ToNV12; parameter 8 (convert_from.cc:448) is src_stride_y but should be dst_stride_y.  This causes image corruption when converting I420 -> NV21 with mismatched luminance strides.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:547

Review URL: https://codereview.chromium.org/1582793008 .
2016-01-14 17:38:54 -08:00
Frank Barchard
d95d2169d9 rename yuv matrix constants to be more clear about what they are
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1429693006 .
2015-11-03 17:09:53 -08:00
Frank Barchard
2c7aa0070a remove I422ToBGRA and use I422ToRGBA internally
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.

Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1427993004 .
2015-11-02 10:24:12 -08:00
Frank Barchard
5d97b93369 refactor I420ToABGR to use I420ToARGBRow
Using a transposed conversion matrix, I420ToARGB can output ABGR.

R=harryjin@google.com, xhwang@chromium.org
BUG=libyuv:473

Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
2e4466e282 change all pix parameters to width for consistency
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
Frank Barchard
76a599ec3b fix jpeg and bt.709 yuvconstants for neon64.
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces.  But the jpeg and bt.709
colour conversion constants were still in armv7 form.  This changes
the constants for aarch64 builds to be compatible with the code.

yuv constants are now passed as const *

Remove Yvu constants which were used for older version on nv21 but not new code.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
f96890a0be yuvconstants for all YUV to RGB conversion functions.
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1363503002 .
2015-09-22 10:26:03 -07:00
fbarchard@google.com
32ad6e0e12 Remove unused variable 'I422ToRGB565Row' that breaks osx builds.
BUG=426
TESTED=untested

Review URL: https://webrtc-codereview.appspot.com/48079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1368 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 02:50:35 +00:00
fbarchard@google.com
0e4388aea3 I422ToRGB24 AVX2 and I422ToRAW
BUG=none
TESTED=I422ToRGB24 unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46619004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1337 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 17:25:27 +00:00
yang.zhang@arm.com
e246e6c18f Add ARGBToRGB565DitherRow_NEON for ARM32/64
ARM32/64 NEON versions of ARGBToRGB565DitherRow_NEON are implemented.

BUG=407
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ia689170fb39db964392e5e1113801592ab0628bf

Review URL: https://webrtc-codereview.appspot.com/49409004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1335 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:22:25 +00:00
fbarchard@google.com
24152b2435 Dither from I420 to RGB565 in 2 steps - I420ToARGB then ARGBToRGB565.
BUG=407
TESTED=untested
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1315 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 01:45:04 +00:00
fbarchard@google.com
446fa95587 I422ToRGB565, ARGB4444 and ARGB1555 for AVX2
BUG=403
TESTED=avx2 emulator

Review URL: https://webrtc-codereview.appspot.com/34359004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1293 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:14:46 +00:00
fbarchard@google.com
5ab38f9258 Remove Q420 fourcc support.
BUG=396
TESTED=local build of unittest builds and passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39089004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1278 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 18:20:54 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
9ed836b154 The 'Any' versions of functions can handle any width now, so remove the check from the calling code. This has 2 advantages - less code, and less overhead in calling function when any function is NOT used. Downside is more code for case where any is used.
BUG=373
TESTED=libyuv_unittest still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1143 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 23:29:31 +00:00
fbarchard@google.com
f2fa453b94 Port I422ToABGR to AVX2.
BUG=269
TESTED=intelsde on I422ToABGR
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1138 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 17:20:22 +00:00
fbarchard@google.com
c000955bc0 Port I422ToRGBA to AVX.
BUG=269
TESTED=intelsde on I422ToRGBA
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/28769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1136 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 22:41:39 +00:00
fbarchard@google.com
d81dddd3d0 port I420ToBGRA to AVX2.
BUG=269
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*I420ToBGRA*
R=brucedawson@google.com, harryjin@google.com, magjed@chromium.org

Review URL: https://webrtc-codereview.appspot.com/26869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1127 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 19:35:55 +00:00
fbarchard@google.com
f713691a6f Change elif to endif and if to allow AVX2 as well as SSE2 in future changes instead of one or the other.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1122 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 20:47:22 +00:00
fbarchard@google.com
ca308327d2 Remove unaligned functions, since most function support unaligned memory now. This reduces complexity and improves performance for unaligned cases because C code can be avoided, and overhead is less. Downside is old cpus (core2 and earlier) will be slower for aligned memory case. Except mips, which has alignment requirement, but remove unaligned variant.
BUG=365
TESTED=unittest builds and passes locally
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1113 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 00:59:31 +00:00
fbarchard@google.com
8798e04075 Port conversion functions to c.
BUG=303
TESTED=cl /c /TC /Iinclude source\convert_from.cc source\convert_argb.cc source\convert_from_argb.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/17909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1030 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-08 18:44:57 +00:00
fbarchard@google.com
d9c9f37ac4 Conversions use malloc for row buffers.
BUG=296
TESTED=libyuv convert_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@928 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 02:00:30 +00:00
fbarchard@google.com
99a1298c54 I444ToI420 etc use ScalePlane on Y to allow mirroring.
BUG=291
TESTED=unittests still pass.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@893 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-04 23:36:06 +00:00
fbarchard@google.com
a8e4dcb5d5 Use scaling for YUV to YUV.
BUG=none
TEST=I4*ToI4*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@892 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-04 20:31:53 +00:00
fbarchard@google.com
095f33d870 Coalesce rows by changing width/height and dropping into code instead of recursing. Improve coalesce by setting stride to 0 so it can be used even on odd width images. Reduce unittests to improve time to run emulators.
BUG=277
TEST=unittests all build and pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@819 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 19:29:10 +00:00
fbarchard@google.com
37c0e648d2 Fix crash on wide images
BUG=239
TEST=LIBYUV_WIDTH=10000 out\release\libyuv_unittest
R=changjun.yang@intel.com, johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/1586006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@712 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-05-31 16:55:27 +00:00
fbarchard@google.com
c297d103f1 I420ToARGB for Haswell.
BUG=216
TEST=I420ToARGB
Review URL: https://webrtc-codereview.appspot.com/1314004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@660 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-12 07:26:24 +00:00
fbarchard@google.com
aa7988ff73 Enhanced Rep Mov String version of CopyRow for posix and use cpu detect for ERMS
BUG=213
TEST=none
Review URL: https://webrtc-codereview.appspot.com/1306008

git-svn-id: http://libyuv.googlecode.com/svn/trunk@658 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-12 00:44:33 +00:00
fbarchard@google.com
9b4c00b908 Move vzeroupper to row functions to simplify caller and allow mix of avx2 and sse2. Impact reduced by row coalescing.
BUG=none
TEST=all tests pass with sde
Review URL: https://webrtc-codereview.appspot.com/1269009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@641 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-04 05:54:59 +00:00
fbarchard@google.com
f8e9017685 switch toyuy2 from aligned to unaligned
BUG=211
TESTED=ToYUY2*
Review URL: https://webrtc-codereview.appspot.com/1274005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@633 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-02 21:18:12 +00:00
fbarchard@google.com
5ca144d214 NV12 to/from I420 coalesce rows for Y and UV independently.
BUG=197
TESTED=*NV12*_Opt
Review URL: https://webrtc-codereview.appspot.com/1201004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@607 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-14 17:57:47 +00:00
fbarchard@google.com
07a99dc278 Row coalesce convert_from.cc for I420ToNV12, YUY2ToI422, UYVYToI422
BUG=197
TESTED=I420ToNV12_Opt
Review URL: https://webrtc-codereview.appspot.com/1196004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@605 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-13 17:48:22 +00:00
fbarchard@google.com
51398e0be5 ARGBMirror AVX2
BUG=none
TEST=out\release\libyuv_unittest --gtest_filter=*ARGBMirror*
Review URL: https://webrtc-codereview.appspot.com/1159005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@594 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-06 00:57:48 +00:00
fbarchard@google.com
928483f397 Add I420ToNV12 and NV21 to convert from
BUG=178
TEST=none
Review URL: https://webrtc-codereview.appspot.com/1127007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@581 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-26 03:00:31 +00:00
fbarchard@google.com
b444bae883 ARGBToI400 and ARGBToI411 using AVX2. YUY2ToI420 and UVYVToI420 use AVX2. CopyPlane use rep movsb for AVX2. CopyPlane2 use rep movsb for AVX2 and CopyPlane if strides match AVX2, which will do a single rep movsb for entire image if stride == width. MergeUV for I420ToNV12.
BUG=181
TESTED=unittests pass
Review URL: https://webrtc-codereview.appspot.com/1103007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@569 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-11 20:30:41 +00:00
fbarchard@google.com
92352b7081 mips optimized copy for all functions using CopyRows.
BUG=176
TEST=try bots
Review URL: https://webrtc-codereview.appspot.com/1074010

git-svn-id: http://libyuv.googlecode.com/svn/trunk@556 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-28 19:43:29 +00:00
fbarchard@google.com
cde587092f Replace two spaces with one after .
BUG=none
TEST=lint
Review URL: https://webrtc-codereview.appspot.com/1063010

git-svn-id: http://libyuv.googlecode.com/svn/trunk@553 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-28 00:02:35 +00:00
fbarchard@google.com
9780dd4e81 Remove V210. Quality of this code is insufficient for libyuv. Unable to make V210 pass valgrind. Would require effort to add missing support and optimization.
BUG=91
TEST=valgrind
Review URL: https://webrtc-codereview.appspot.com/1021009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@536 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-11 23:42:00 +00:00