Manojkumar Bhosale
b6e8e9aa97
Add MSA optimized HalfFloatRow function
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: I54a2c57d66093b887c8ba31fd7a21a102165393a
Reviewed-on: https://chromium-review.googlesource.com/628557
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-29 18:40:08 +00:00
Frank Barchard
78e44628c6
Add MSA optimized SplitUV, Set, MirrorUV, SobelX and SobelY row functions.
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: Ie2342f841f1bb8469fc4631b784eddd804f5d53e
Reviewed-on: https://chromium-review.googlesource.com/616765
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-17 18:39:22 +00:00
Manojkumar Bhosale
dbd7c1a9c5
Add MSA optimized ARGBExtractAlpha, ARGBBlend, ARGBQuantize and ARGBColorMatrix row functions
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: I17bd3f87336f613ad363af7d7b9d7af49d725e56
Reviewed-on: https://chromium-review.googlesource.com/613100
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-14 17:38:31 +00:00
Frank Barchard
6d083e2d12
clang 6 build disable some msa functions
...
R=kjellander@chromium.org
Bug: libyuv:715
Test: gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
Change-Id: Ia3943b0afc02e05a8bc32350719b296b0b9d5479
Reviewed-on: https://chromium-review.googlesource.com/592720
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-03 17:44:35 +00:00
Manojkumar Bhosale
45b176d153
Add MSA optimized Interpolate/MergeUV/Misc functions
...
BUG=libyuv:634
Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Performance Gain (vs C auto-vectorized)
InterpolateRow_MSA - ~3.3x
InterpolateRow_Any_MSA - ~2.5x
ARGBSetRow_MSA - ~1.0x
ARGBSetRow_Any_MSA - ~1.0x
ARGBToRGB24Row_MSA - ~1.9x
ARGBToRGB24Row_Any_MSA - ~1.6x
MergeUVRow_MSA - ~1.6x
MergeUVRow_Any_MSA - ~1.2x
Performance Gain (vs C non-vectorized)
InterpolateRow_MSA - ~11.3x
InterpolateRow_Any_MSA - ~ 7.9x
ARGBSetRow_MSA - ~ 6.2x
ARGBSetRow_Any_MSA - ~ 4.0x
ARGBToRGB24Row_MSA - ~ 9.9x
ARGBToRGB24Row_Any_MSA - ~ 8.4x
MergeUVRow_MSA - ~12.7x
MergeUVRow_Any_MSA - ~ 8.0x
Reviewed-on: https://chromium-review.googlesource.com/445817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-23 01:42:22 +00:00
Manojkumar Bhosale
eed66b2028
Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions
...
BUG=libyuv:634
Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Performance Gain (vs C auto-vectorized)
I444ToARGBRow_MSA - ~1.6x
I444ToARGBRow_Any_MSA - ~1.6x
I400ToARGBRow_MSA - ~5.5x
I400ToARGBRow_Any_MSA - ~5.3x
J400ToARGBRow_MSA - ~1.0x
J400ToARGBRow_Any_MSA - ~1.0x
YUY2ToARGBRow_MSA - ~1.6x
YUY2ToARGBRow_Any_MSA - ~1.6x
UYVYToARGBRow_MSA - ~1.6x
UYVYToARGBRow_Any_MSA - ~1.6x
Performance Gain (vs C non-vectorized)
I444ToARGBRow_MSA - ~7.3x
I444ToARGBRow_Any_MSA - ~7.1x
I400ToARGBRow_MSA - ~5.5x
I400ToARGBRow_Any_MSA - ~5.2x
J400ToARGBRow_MSA - ~6.8x
J400ToARGBRow_Any_MSA - ~5.7x
YUY2ToARGBRow_MSA - ~7.2x
YUY2ToARGBRow_Any_MSA - ~7.0x
UYVYToARGBRow_MSA - ~7.1x
UYVYToARGBRow_Any_MSA - ~6.9x
Reviewed-on: https://chromium-review.googlesource.com/439246
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-21 23:22:07 +00:00
Frank Barchard
7a54d0a302
row_msa fix clang build warnings.
...
BUG=libyuv:634
TEST=untested
Change-Id: Ib7f0c99e669ddba0a1efbd15895880281ad6303e
Reviewed-on: https://chromium-review.googlesource.com/435303
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-02-07 09:36:28 +00:00
Manojkumar Bhosale
54ce8f23d6
Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C auto-vectorized)
ARGBToYJRow_MSA - ~3.2x
ARGBToYJRow_Any_MSA - ~2.7x
BGRAToYRow_MSA - ~3.2x
BGRAToYRow_Any_MSA - ~2.7x
ABGRToYRow_MSA - ~3.2x
ABGRToYRow_Any_MSA - ~2.6x
RGBAToYRow_MSA - ~3.1x
RGBAToYRow_Any_MSA - ~2.7x
ARGBToUVJRow_MSA - ~5.5x
ARGBToUVJRow_Any_MSA - ~4.5x
BGRAToUVRow_MSA - ~2.1x
BGRAToUVRow_Any_MSA - ~2.0x
ABGRToUVRow_MSA - ~2.1x
ABGRToUVRow_Any_MSA - ~1.9x
RGBAToUVRow_MSA - ~2.2x
RGBAToUVRow_Any_MSA - ~1.9x
Performance Gain (vs C non-vectorized)
ARGBToYJRow_MSA - ~10.9x
ARGBToYJRow_Any_MSA - ~9.2x
BGRAToYRow_MSA - ~10.9x
BGRAToYRow_Any_MSA - ~9.3x
ABGRToYRow_MSA - ~11.0x
ABGRToYRow_Any_MSA - ~9.3x
RGBAToYRow_MSA - ~10.9x
RGBAToYRow_Any_MSA - ~9.1x
ARGBToUVJRow_MSA - ~12.4x
ARGBToUVJRow_Any_MSA - ~10.5x
BGRAToUVRow_MSA - ~4.7x
BGRAToUVRow_Any_MSA - ~4.4x
ABGRToUVRow_MSA - ~4.7x
ABGRToUVRow_Any_MSA - ~4.5x
RGBAToUVRow_MSA - ~4.8x
RGBAToUVRow_Any_MSA - ~4.4x
Review-Url: https://codereview.chromium.org/2641153003 .
2017-02-01 10:31:28 +05:30
Manojkumar Bhosale
09b8c971b3
Add MSA optimized NV12/21 To RGB row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C auto-vectorized)
NV12ToARGBRow_MSA - ~1.5x
NV12ToARGBRow_Any_MSA - ~1.4x
NV12ToRGB565Row_MSA - ~1.4x
NV12ToRGB565Row_Any_MSA - ~1.4x
NV21ToARGBRow_MSA - ~1.5x
NV21ToARGBRow_Any_MSA - ~1.5x
SobelRow_MSA - ~4.3x
SobelRow_Any_MSA - ~3.4x
SobelToPlaneRow_MSA - ~8.0x
SobelToPlaneRow_Any_MSA - ~4.7x
SobelXYRow_MSA - ~3.0x
SobelXYRow_Any_MSA - ~2.5x
Performance Gain (vs C non-vectorized)
NV12ToARGBRow_MSA - ~6.5x
NV12ToARGBRow_Any_MSA - ~6.5x
NV12ToRGB565Row_MSA - ~6.2x
NV12ToRGB565Row_Any_MSA - ~6.1x
NV21ToARGBRow_MSA - ~6.5x
NV21ToARGBRow_Any_MSA - ~6.5x
SobelRow_MSA - ~14.5x
SobelRow_Any_MSA - ~11.3x
SobelToPlaneRow_MSA - ~34.2x
SobelToPlaneRow_Any_MSA - ~19.4x
SobelXYRow_MSA - ~11.1x
SobelXYRow_Any_MSA - ~9.1x
Review-Url: https://codereview.chromium.org/2636483002 .
2017-01-18 09:24:39 +05:30
Manojkumar Bhosale
7c64163ff4
Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized), speedup factor
ARGB1555ToARGBRow_MSA - 1.85
ARGB1555ToARGBRow_Any_MSA - 1.82
RGB565ToARGBRow_MSA - 2.14
RGB565ToARGBRow_Any_MSA - 2.08
RGB24ToARGBRow_MSA - 8.57
RGB24ToARGBRow_Any_MSA - 7.42
RAWToARGBRow_MSA - 8.57
RAWToARGBRow_Any_MSA - 7.42
ARGB1555ToYRow_MSA - 2.60
ARGB1555ToYRow_Any_MSA - 2.47
RGB565ToYRow_MSA - 2.45
RGB565ToYRow_Any_MSA - 2.33
RGB24ToYRow_MSA - 2.23
RGB24ToYRow_Any_MSA - 2.01
RAWToYRow_MSA - 2.25
RAWToYRow_Any_MSA - 2.02
ARGB1555ToUVRow_MSA - 1.40
ARGB1555ToUVRow_Any_MSA - 1.37
RGB565ToUVRow_MSA - 1.68
RGB565ToUVRow_Any_MSA - 1.63
RGB24ToUVRow_MSA - 3.02
RGB24ToUVRow_Any_MSA - 2.87
RAWToUVRow_MSA - 3.04
RAWToUVRow_Any_MSA - 2.85
Performance Gain (vs C non-vectorized), speedup factor
ARGB1555ToARGBRow_MSA - 4.66
ARGB1555ToARGBRow_Any_MSA - 4.45
RGB565ToARGBRow_MSA - 5.58
RGB565ToARGBRow_Any_MSA - 5.34
RGB24ToARGBRow_MSA - 8.57
RGB24ToARGBRow_Any_MSA - 7.42
RAWToARGBRow_MSA - 8.57
RAWToARGBRow_Any_MSA - 7.42
ARGB1555ToYRow_MSA - 6.38
ARGB1555ToYRow_Any_MSA - 5.98
RGB565ToYRow_MSA - 6.42
RGB565ToYRow_Any_MSA - 6.05
RGB24ToYRow_MSA - 7.87
RGB24ToYRow_Any_MSA - 7.01
RAWToYRow_MSA - 7.98
RAWToYRow_Any_MSA - 7.01
ARGB1555ToUVRow_MSA - 5.39
ARGB1555ToUVRow_Any_MSA - 5.06
RGB565ToUVRow_MSA - 6.39
RGB565ToUVRow_Any_MSA - 5.90
RGB24ToUVRow_MSA - 3.04
RGB24ToUVRow_Any_MSA - 2.87
RAWToUVRow_MSA - 3.04
RAWToUVRow_Any_MSA - 2.88
Review-Url: https://codereview.chromium.org/2600713002 .
2017-01-13 15:43:37 +05:30
Manojkumar Bhosale
a899dea251
Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA - ~1.1x
ARGBAttenuateRow_Any_MSA - ~1.1x
ARGBToRGB565DitherRow_MSA - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA - ~5.1x
ARGBShuffleRow_Any_MSA - ~1.9x
ARGBShadeRow_MSA - ~1.1x
ARGBGrayRow_MSA - ~2.6x
ARGBSepiaRow_MSA - ~11.6x
Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA - ~2.46x
ARGBAttenuateRow_Any_MSA - ~2.45x
ARGBToRGB565DitherRow_MSA - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA - ~5.2x
ARGBShuffleRow_Any_MSA - ~1.9x
ARGBShadeRow_MSA - ~4.3x
ARGBGrayRow_MSA - ~10.5x
ARGBSepiaRow_MSA - ~12.2x
Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Manojkumar Bhosale
83f460be33
Add MSA optimized ARGB Multiply/Add/Subtract row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBMultiplyRow_MSA - 1.4x
ARGBAddRow_MSA - 8.6x
ARGBSubtractRow_MSA - 8.6x
ARGBMultiplyRow_Any_MSA - 1.35x
ARGBAddRow_Any_MSA - 7.3x
ARGBSubtractRow_Any_MSA - 7.2x
Performance Gain (vs C non-vectorized)
ARGBMultiplyRow_MSA - 4.4x
ARGBAddRow_MSA - 27x
ARGBSubtractRow_MSA - 22x
ARGBMultiplyRow_Any_MSA - 3.5x
ARGBAddRow_Any_MSA - 23x
ARGBSubtractRow_Any_MSA - 18x
Review URL: https://codereview.chromium.org/2529983002 .
2016-12-02 15:21:10 +05:30
Frank Barchard
da0c29dada
Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBToRGB565Row_MSA - ~1.6x
ARGBToRGB565Row_Any_MSA - ~1.6x
ARGBToARGB1555Row_MSA - ~1.3x
ARGBToARGB1555Row_Any_MSA - ~1.3x
ARGBToARGB4444Row_MSA - ~3.8x
ARGBToARGB4444Row_Any_MSA - ~3.8x
ARGBToUV444Row_MSA - ~2.4x
ARGBToUV444Row_Any_MSA - ~2.4x
Performance Gain (vs C non-vectorized)
ARGBToRGB565Row_MSA - ~2.8x
ARGBToRGB565Row_Any_MSA - ~2.8x
ARGBToARGB1555Row_MSA - ~2.2x
ARGBToARGB1555Row_Any_MSA - ~2.2x
ARGBToARGB4444Row_MSA - ~6.8x
ARGBToARGB4444Row_Any_MSA - ~6.6x
ARGBToUV444Row_MSA - ~6.7x
ARGBToUV444Row_Any_MSA - ~6.7x
Review URL: https://codereview.chromium.org/2520003004 .
2016-11-22 10:47:55 -08:00
Frank Barchard
b1504a8e48
Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2487913004 .
2016-11-18 15:05:10 -08:00
Frank Barchard
e62309f259
clang-format libyuv
...
BUG=libyuv:654
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
10ce829bad
Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
I422ToRGB565Row_MSA : ~1.5x
I422ToRGB565Row_Any_MSA : ~1.5x
I422ToARGB4444Row_MSA : ~1.4x
I422ToARGB4444Row_Any_MSA : ~1.4x
I422ToARGB1555Row_MSA : ~1.4x
I422ToARGB1555Row_Any_MSA : ~1.4x
Performance Gain (vs C non-vectorized)
I422ToRGB565Row_MSA : ~6.8x
I422ToRGB565Row_Any_MSA : ~6.8x
I422ToARGB4444Row_MSA : ~6.6x
I422ToARGB4444Row_Any_MSA : ~6.6x
I422ToARGB1555Row_MSA : ~6.6x
I422ToARGB1555Row_Any_MSA : ~6.6x
Review URL: https://codereview.chromium.org/2445343007 .
2016-10-27 10:47:35 -07:00
Frank Barchard
532f5708a9
Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA : ~1.4x
I422AlphaToARGBRow_Any_MSA : ~1.4x
I422ToRGB24Row_MSA : ~4.8x
I422ToRGB24Row_Any_MSA : ~4.8x
Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA : ~7.0x
I422AlphaToARGBRow_Any_MSA : ~7.0x
I422ToRGB24Row_MSA : ~7.9x
I422ToRGB24Row_Any_MSA : ~7.7x
Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
2488b3105b
White spaces, comments and lint fixes for msa.
...
No functional changes.
TBR=kjellander@chromium.org
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2446313002 .
2016-10-25 11:36:54 -07:00
Frank Barchard
f5d5bd88d6
Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
I422ToARGBRow_MSA : ~1.6x
I422ToRGBARow_MSA : ~1.6x
I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x
Performance Gain (vs C non-vectorized)
I422ToARGBRow_MSA : ~7x
I422ToRGBARow_MSA : ~7x
I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x
Performance measurement: we created standalone tests that pass row data from a filled 1920x1080 buffer to both the C and MSA functions, and run N iterations of each to get more accurate C vs MSA timings.
Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
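The measurement approach described in commit f5d5bd88d6 above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual test harness: `MirrorRowRef`, `MirrorRowUnrolled`, `RowsMatch`, `TimeRows` and `Speedup` are hypothetical names, the unrolled scalar function merely stands in for an MSA kernel (which needs a MIPS target), and a one-byte-per-pixel frame is assumed for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Frame size taken from the commit message; one byte per pixel is assumed. */
enum { kWidth = 1920, kHeight = 1080 };

/* Hypothetical signature shared by the two row functions under comparison. */
typedef void (*RowFunc)(const uint8_t* src, uint8_t* dst, int width);

/* Hypothetical scalar reference: writes one row in reversed byte order. */
static void MirrorRowRef(const uint8_t* src, uint8_t* dst, int width) {
  for (int x = 0; x < width; ++x) {
    dst[x] = src[width - 1 - x];
  }
}

/* Hypothetical stand-in for an optimized kernel: same result, 4 bytes per step. */
static void MirrorRowUnrolled(const uint8_t* src, uint8_t* dst, int width) {
  int x = 0;
  for (; x + 4 <= width; x += 4) {
    dst[x + 0] = src[width - 1 - x];
    dst[x + 1] = src[width - 2 - x];
    dst[x + 2] = src[width - 3 - x];
    dst[x + 3] = src[width - 4 - x];
  }
  for (; x < width; ++x) { /* leftover bytes when width is not a multiple of 4 */
    dst[x] = src[width - 1 - x];
  }
}

/* Fills the frame with a deterministic, non-constant byte pattern. */
static void FillFrame(uint8_t* frame) {
  for (size_t i = 0; i < (size_t)kWidth * kHeight; ++i) {
    frame[i] = (uint8_t)(i * 31u + 7u);
  }
}

/* Calls fn on every row of the frame, `iterations` times; returns CPU seconds. */
static double TimeRows(RowFunc fn, const uint8_t* src, uint8_t* dst,
                       int iterations) {
  clock_t start = clock();
  for (int i = 0; i < iterations; ++i) {
    for (int y = 0; y < kHeight; ++y) {
      fn(src + (size_t)y * kWidth, dst + (size_t)y * kWidth, kWidth);
    }
  }
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Returns 1 if both functions produce identical output for the whole frame. */
static int RowsMatch(RowFunc a, RowFunc b) {
  size_t size = (size_t)kWidth * kHeight;
  uint8_t* src = (uint8_t*)malloc(size);
  uint8_t* dst_a = (uint8_t*)malloc(size);
  uint8_t* dst_b = (uint8_t*)malloc(size);
  assert(src != NULL && dst_a != NULL && dst_b != NULL);
  FillFrame(src);
  TimeRows(a, src, dst_a, 1);
  TimeRows(b, src, dst_b, 1);
  int same = memcmp(dst_a, dst_b, size) == 0;
  free(src);
  free(dst_a);
  free(dst_b);
  return same;
}

/* Returns baseline time divided by candidate time, or 0 if too fast to time. */
static double Speedup(RowFunc baseline, RowFunc candidate, int iterations) {
  assert(iterations > 0);
  size_t size = (size_t)kWidth * kHeight;
  uint8_t* src = (uint8_t*)malloc(size);
  uint8_t* dst = (uint8_t*)malloc(size);
  assert(src != NULL && dst != NULL);
  FillFrame(src);
  double t_base = TimeRows(baseline, src, dst, iterations);
  double t_cand = TimeRows(candidate, src, dst, iterations);
  volatile uint8_t sink = dst[size - 1]; /* keep the stores observable */
  (void)sink;
  free(src);
  free(dst);
  return t_cand > 0.0 ? t_base / t_cand : 0.0;
}
```

Calling `RowsMatch(MirrorRowRef, MirrorRowUnrolled)` first checks that both variants agree, and `Speedup(MirrorRowRef, MirrorRowUnrolled, N)` then gives a ratio comparable in spirit to the "~1.6x" style figures in these commit messages. Larger values of N reduce timer noise, which is presumably why the original tests repeat the run N times.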
Frank Barchard
78c58ab8aa
Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (MSA vs C auto-vectorized)
ARGB4444ToYRow_MSA : ~3.0x
ARGB4444ToUVRow_MSA : ~1.8x
ARGB4444ToARGBRow_MSA : ~3.4x
ARGB4444ToYRow_Any_MSA : ~2.8x
ARGB4444ToUVRow_Any_MSA : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x
Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
a2891ec77c
Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain
YUY2ToI422, YUY2ToI420:
YUY2ToYRow_MSA : ~10x
YUY2ToUVRow_MSA : ~11x
YUY2ToUV422Row_MSA : ~9x
YUY2ToYRow_Any_MSA : ~6x
YUY2ToUVRow_Any_MSA : ~5x
YUY2ToUV422Row_Any_MSA : ~4x
UYVYToI422, UYVYToI420:
UYVYToYRow_MSA : ~10x
UYVYToUVRow_MSA : ~11x
UYVYToUV422Row_MSA : ~9x
UYVYToYRow_Any_MSA : ~6x
UYVYToUVRow_Any_MSA : ~5x
UYVYToUV422Row_Any_MSA : ~4x
Review URL: https://codereview.chromium.org/2397693002 .
2016-10-07 10:37:22 -07:00
Frank Barchard
7018f5be0f
Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain
I422ToYUY2Row_MSA - ~12x
I422ToYUY2Row_Any_MSA - ~7x
I422ToUYVYRow_MSA - ~12x
I422ToUYVYRow_Any_MSA - ~7x
Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
618149084e
Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function
...
This patch adds an MSA-optimized ARGBMirrorRow function to the libyuv project.
Performance gain: ~3x
R=fbarchard@google.com
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2368313003 .
2016-09-26 16:28:01 -07:00
Frank Barchard
c5323b0fdc
Add MIPS SIMD Arch (MSA) optimized MirrorRow function
...
Following the preparation patch landed in the Chromium sources,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds,
this patch adds the first MSA-optimized function to the libyuv project.
BUG=libyuv:634
R=fbarchard@google.com
Review URL: https://codereview.chromium.org/2285683002 .
2016-09-22 16:12:22 -07:00