1513 Commits

Author SHA1 Message Date
Frank Barchard
27acadbf9d Roll chromium_revision c793ec77b2..7950721f08 (454713:454907)
Change log: c793ec77b2..7950721f08
Full diff: c793ec77b2..7950721f08

Changed dependencies:
* src/base: 8fe126945c..d75864a2c5
* src/build: 8a0a5a27d4..bf8911f59b
* src/ios: 2c58c1ed6b..8b8111f841
* src/testing: 9cacf531de..c2c74bc1d1
* src/third_party: 0ea751c2fe..4c0908d22e
* src/third_party/catapult: 3c626eaf72..353ee60a45
* src/tools: 41a0ccf0e1..14318cc69b
DEPS diff: c793ec77b2..7950721f08/DEPS

No update to Clang.

TBR=kjellander@chromium.org
BUG=libyuv:689

Change-Id: Ife134b4af1c8c1e63aae2b811342d325abe0b600
Reviewed-on: https://chromium-review.googlesource.com/450317
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-03-06 21:49:04 +00:00
Frank Barchard
136aa9d37c any11p fix for buffer overrun
BUG=libyuv:686
TESTED=untested

Change-Id: Idfae93349dd78b1b633a596631e5397e11b77d0b
Reviewed-on: https://chromium-review.googlesource.com/448320
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-03 19:57:35 +00:00
Frank Barchard
91ee9b729e Fix missing return in MipsCpuCaps.
Previously if MipsCpuCaps were called with something other than
dspr2 or msa, the file was closed but still used.

This change assumed the function is only called internally twice:
once for msa and once for dspr2.  If msa is not being detected,
the function assumed dspr2 was being tested and returns dspr2 was
true.

BUG=libyuv:687
TEST=try bots

Change-Id: I80b328eb5ffc7baf5f1ee5a79c16d75c45ff26cc
Reviewed-on: https://chromium-review.googlesource.com/447831
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-01 23:07:03 +00:00
Manojkumar Bhosale
45b176d153 Add MSA optimized Interpolate/MergeUV/Misc functions
BUG=libyuv:634

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126

Performance Gain (vs C auto-vectorized)
InterpolateRow_MSA      - ~3.3x
InterpolateRow_Any_MSA  - ~2.5x
ARGBSetRow_MSA          - ~1.0x
ARGBSetRow_Any_MSA      - ~1.0x
ARGBToRGB24Row_MSA      - ~1.9x
ARGBToRGB24Row_Any_MSA  - ~1.6x
MergeUVRow_MSA          - ~1.6x
MergeUVRow_Any_MSA      - ~1.2x

Performance Gain (vs C non-vectorized)
InterpolateRow_MSA      - ~11.3x
InterpolateRow_Any_MSA  - ~ 7.9x
ARGBSetRow_MSA          - ~ 6.2x
ARGBSetRow_Any_MSA      - ~ 4.0x
ARGBToRGB24Row_MSA      - ~ 9.9x
ARGBToRGB24Row_Any_MSA  - ~ 8.4x
MergeUVRow_MSA          - ~12.7x
MergeUVRow_Any_MSA      - ~ 8.0x

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Reviewed-on: https://chromium-review.googlesource.com/445817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-23 01:42:22 +00:00
Frank Barchard
a041b0ae03 Android.mk for libyuv - unused parameters warning enabled
BUG=libyuv:682
TEST=mm from android tree.

Change-Id: I13be3eaa6a33741797360d57bc5cf5fed91678ef
Reviewed-on: https://chromium-review.googlesource.com/445935
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-22 19:45:12 +00:00
Manojkumar Bhosale
eed66b2028 Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions
BUG=libyuv:634

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be

Performance Gain (vs C auto-vectorized)
I444ToARGBRow_MSA       - ~1.6x
I444ToARGBRow_Any_MSA   - ~1.6x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.3x
J400ToARGBRow_MSA       - ~1.0x
J400ToARGBRow_Any_MSA   - ~1.0x
YUY2ToARGBRow_MSA       - ~1.6x
YUY2ToARGBRow_Any_MSA   - ~1.6x
UYVYToARGBRow_MSA       - ~1.6x
UYVYToARGBRow_Any_MSA   - ~1.6x

Performance Gain (vs C non-vectorized)
I444ToARGBRow_MSA       - ~7.3x
I444ToARGBRow_Any_MSA   - ~7.1x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.2x
J400ToARGBRow_MSA       - ~6.8x
J400ToARGBRow_Any_MSA   - ~5.7x
YUY2ToARGBRow_MSA       - ~7.2x
YUY2ToARGBRow_Any_MSA   - ~7.0x
UYVYToARGBRow_MSA       - ~7.1x
UYVYToARGBRow_Any_MSA   - ~6.9x

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Reviewed-on: https://chromium-review.googlesource.com/439246
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-21 23:22:07 +00:00
Frank Barchard
bbe8c233f2 scale warning fixes for unused parameters
BUG=libyuv:680
TEST=builds and runs with no warnings

Change-Id: I7d60ef44292fa6ad4f7c4e2e2657359b864d2dab
Reviewed-on: https://chromium-review.googlesource.com/442670
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-02-15 21:38:59 +00:00
Frank Barchard
0fb5675902 Fix dspr2 rename changes. Fix unused variables
TBR=kjellander@chromium.org
BUG=libyuv:634
TEST=try bots

Review-Url: https://codereview.chromium.org/2675583002 .
2017-02-01 18:51:06 -08:00
Manojkumar Bhosale
54ce8f23d6 Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
ARGBToYJRow_MSA       - ~3.2x
ARGBToYJRow_Any_MSA   - ~2.7x
BGRAToYRow_MSA        - ~3.2x
BGRAToYRow_Any_MSA    - ~2.7x
ABGRToYRow_MSA        - ~3.2x
ABGRToYRow_Any_MSA    - ~2.6x
RGBAToYRow_MSA        - ~3.1x
RGBAToYRow_Any_MSA    - ~2.7x
ARGBToUVJRow_MSA      - ~5.5x
ARGBToUVJRow_Any_MSA  - ~4.5x
BGRAToUVRow_MSA       - ~2.1x
BGRAToUVRow_Any_MSA   - ~2.0x
ABGRToUVRow_MSA       - ~2.1x
ABGRToUVRow_Any_MSA   - ~1.9x
RGBAToUVRow_MSA       - ~2.2x
RGBAToUVRow_Any_MSA   - ~1.9x

Performance Gain (vs C non-vectorized)
ARGBToYJRow_MSA       - ~10.9x
ARGBToYJRow_Any_MSA   -  ~9.2x
BGRAToYRow_MSA        - ~10.9x
BGRAToYRow_Any_MSA    -  ~9.3x
ABGRToYRow_MSA        - ~11.0x
ABGRToYRow_Any_MSA    -  ~9.3x
RGBAToYRow_MSA        - ~10.9x
RGBAToYRow_Any_MSA    -  ~9.1x
ARGBToUVJRow_MSA      - ~12.4x
ARGBToUVJRow_Any_MSA  - ~10.5x
BGRAToUVRow_MSA       -  ~4.7x
BGRAToUVRow_Any_MSA   -  ~4.4x
ABGRToUVRow_MSA       -  ~4.7x
ABGRToUVRow_Any_MSA   -  ~4.5x
RGBAToUVRow_MSA       -  ~4.8x
RGBAToUVRow_Any_MSA   -  ~4.4x

Review-Url: https://codereview.chromium.org/2641153003 .
2017-02-01 10:31:28 +05:30
Frank Barchard
54f2094a5e Rename mips source files to dspr2.
Add MSA detect to unittest.
Change macro to disable DSPR2 code to LIBYUV_DISABLE_DSPR2

BUG=libyuv:634
TEST=try bots

Change-Id: I9e0aa2452204fc529bb6f9e6fd93c4e1c379bba6
Reviewed-on: https://chromium-review.googlesource.com/433463
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-01-27 23:11:43 +00:00
Frank Barchard
33f52bdac9 Add installer builds to cmake for linux
cd ~/my_projects/libyuv
git pull
mkdir cbuild  # (for out-of-source builds)
cd cbuild
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4
make package

BUG=libyuv:673
TEST=make package

Change-Id: Ia449cbfd0bc118cc90c8648f8199a0526b7ae2a2
Reviewed-on: https://chromium-review.googlesource.com/433440
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-01-26 23:05:17 +00:00
Frank Barchard
749e316ed8 Remove commented out code
TEST=None
BUG=libyuv:672
Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169

Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169
Reviewed-on: https://chromium-review.googlesource.com/430321
Reviewed-by: Aaron Gable <agable@chromium.org>
2017-01-20 02:03:12 +00:00
Manojkumar Bhosale
09b8c971b3 Add MSA optimized NV12/21 To RGB row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
NV12ToARGBRow_MSA       - ~1.5x
NV12ToARGBRow_Any_MSA   - ~1.4x
NV12ToRGB565Row_MSA     - ~1.4x
NV12ToRGB565Row_Any_MSA - ~1.4x
NV21ToARGBRow_MSA       - ~1.5x
NV21ToARGBRow_Any_MSA   - ~1.5x
SobelRow_MSA            - ~4.3x
SobelRow_Any_MSA        - ~3.4x
SobelToPlaneRow_MSA     - ~8.0x
SobelToPlaneRow_Any_MSA - ~4.7x
SobelXYRow_MSA          - ~3.0x
SobelXYRow_Any_MSA      - ~2.5x

Performance Gain (vs C non-vectorized)
NV12ToARGBRow_MSA       - ~6.5x
NV12ToARGBRow_Any_MSA   - ~6.5x
NV12ToRGB565Row_MSA     - ~6.2x
NV12ToRGB565Row_Any_MSA - ~6.1x
NV21ToARGBRow_MSA       - ~6.5x
NV21ToARGBRow_Any_MSA   - ~6.5x
SobelRow_MSA            - ~14.5x
SobelRow_Any_MSA        - ~11.3x
SobelToPlaneRow_MSA     - ~34.2x
SobelToPlaneRow_Any_MSA - ~19.4x
SobelXYRow_MSA          - ~11.1x
SobelXYRow_Any_MSA      - ~9.1x

Review-Url: https://codereview.chromium.org/2636483002 .
2017-01-18 09:24:39 +05:30
Frank Barchard
a7c87e19f0 add Intel Code Analyst markers
add macros to enable/disable code analyst around blocks of code.

Normally these macros should not be used, but if performance
details are wanted for intel code, enable them around the code
and then run via the iaca tool, available on the intel website.

BUG=libyuv:670
TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
R=wangcheng@google.com

Review-Url: https://codereview.chromium.org/2626193002 .
2017-01-13 15:50:24 -08:00
Manojkumar Bhosale
73a6f100a9 Add MSA optimized rotate functions (used 16x16 transpose)
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.7x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Performance Gain (vs C non-vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.8x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Review-Url: https://codereview.chromium.org/2617703002 .
2017-01-13 15:50:02 +05:30
Manojkumar Bhosale
7c64163ff4 Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGB1555ToARGBRow_MSA     - 1.85
ARGB1555ToARGBRow_Any_MSA - 1.82
RGB565ToARGBRow_MSA       - 2.14
RGB565ToARGBRow_Any_MSA   - 2.08
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 2.60
ARGB1555ToYRow_Any_MSA    - 2.47
RGB565ToYRow_MSA          - 2.45
RGB565ToYRow_Any_MSA      - 2.33
RGB24ToYRow_MSA           - 2.23
RGB24ToYRow_Any_MSA       - 2.01
RAWToYRow_MSA             - 2.25
RAWToYRow_Any_MSA         - 2.02
ARGB1555ToUVRow_MSA       - 1.40
ARGB1555ToUVRow_Any_MSA   - 1.37
RGB565ToUVRow_MSA         - 1.68
RGB565ToUVRow_Any_MSA     - 1.63
RGB24ToUVRow_MSA          - 3.02
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.85

Performance Gain (vs C non-vectorized)
ARGB1555ToARGBRow_MSA     - 4.66
ARGB1555ToARGBRow_Any_MSA - 4.45
RGB565ToARGBRow_MSA       - 5.58
RGB565ToARGBRow_Any_MSA   - 5.34
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 6.38
ARGB1555ToYRow_Any_MSA    - 5.98
RGB565ToYRow_MSA          - 6.42
RGB565ToYRow_Any_MSA      - 6.05
RGB24ToYRow_MSA           - 7.87
RGB24ToYRow_Any_MSA       - 7.01
RAWToYRow_MSA             - 7.98
RAWToYRow_Any_MSA         - 7.01
ARGB1555ToUVRow_MSA       - 5.39
ARGB1555ToUVRow_Any_MSA   - 5.06
RGB565ToUVRow_MSA         - 6.39
RGB565ToUVRow_Any_MSA     - 5.90
RGB24ToUVRow_MSA          - 3.04
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.88

Review-Url: https://codereview.chromium.org/2600713002 .
2017-01-13 15:43:37 +05:30
Frank Barchard
000d2fa91a Libyuv MIPS DSPR2 optimizations.
Optimized functions:

I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2

Bug-fixes in functions:

ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2

BUG=

Review-Url: https://codereview.chromium.org/2626123003 .
2017-01-11 12:19:13 -08:00
Manojkumar Bhosale
288bfbefb5 Add MSA optimized remaining scale row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleRowDown2_MSA            - ~22.3x
ScaleRowDown2_Any_MSA        - ~19.9x
ScaleRowDown2Linear_MSA      - ~31.2x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~1x
ScaleRowDown38_Any_MSA       - ~1x
ScaleRowDown38_2_Box_MSA     - ~1.7x
ScaleRowDown38_2_Box_Any_MSA - ~1.7x
ScaleRowDown38_3_Box_MSA     - ~1.7x
ScaleRowDown38_3_Box_Any_MSA - ~1.7x
ScaleAddRow_MSA              - ~1.2x
ScaleAddRow_Any_MSA          - ~1.15x

Performance Gain (vs C non-vectorized)
ScaleRowDown2_MSA            - ~22.4x
ScaleRowDown2_Any_MSA        - ~19.8x
ScaleRowDown2Linear_MSA      - ~31.6x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~3.2x
ScaleRowDown38_Any_MSA       - ~3.2x
ScaleRowDown38_2_Box_MSA     - ~2.4x
ScaleRowDown38_2_Box_Any_MSA - ~2.3x
ScaleRowDown38_3_Box_MSA     - ~2.9x
ScaleRowDown38_3_Box_Any_MSA - ~2.8x
ScaleAddRow_MSA              - ~8x
ScaleAddRow_Any_MSA          - ~7.46x

Review-Url: https://codereview.chromium.org/2559683002 .
2016-12-21 13:39:44 +05:30
Frank Barchard
bd10875846 modified libyuv.gyp so that it no longer depends on libjpeg.gyp, which does not exist anymore.
BUG=libyuv:666
TESTED= unittests built and passed with jpeg disabled.
R=kjellander@chromium.org

Review-Url: https://codereview.chromium.org/2585373002 .
2016-12-19 11:57:49 -08:00
Manojkumar Bhosale
a899dea251 Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA          - ~1.1x
ARGBAttenuateRow_Any_MSA      - ~1.1x
ARGBToRGB565DitherRow_MSA     - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA            - ~5.1x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~1.1x
ARGBGrayRow_MSA               - ~2.6x
ARGBSepiaRow_MSA              - ~11.6x

Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA          - ~2.46x
ARGBAttenuateRow_Any_MSA      - ~2.45x
ARGBToRGB565DitherRow_MSA     - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA            - ~5.2x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~4.3x
ARGBGrayRow_MSA               - ~10.5x
ARGBSepiaRow_MSA              - ~12.2x

Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Manojkumar Bhosale
6fa5e4eb78 Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx8_MSA          - ~2.7x
TransposeWx8_Any_MSA      - ~2.1x
TransposeUVWx8_MSA        - ~2.5x
TransposeUVWx8_Any_MSA    - ~2.7x

Performance Gain (vs C non-vectorized)
TransposeWx8_MSA          - ~4.6x
TransposeWx8_Any_MSA      - ~2.9x
TransposeUVWx8_MSA        - ~4.4x
TransposeUVWx8_Any_MSA    - ~3.7x

Review URL: https://codereview.chromium.org/2553403002 .
2016-12-15 10:06:01 +05:30
Frank Barchard
b18fd21d3c Android420ToI420 - use ptrdiff_t for difference of u and v pointers
The difference was assigned to an int, causing a warning on Visual C.

BUG=662
TEST=tested with try bots.
R=devangelakos@google.com

Review-Url: https://codereview.chromium.org/2574373002 .
2016-12-14 11:53:55 -08:00
Frank Barchard
dde8ba7009 ConvertFromI420: use halfstride instead of halfwidth
BUG=libyuv:660
TEST=try bots
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2554213003 .
2016-12-07 10:16:16 -08:00
Manojkumar Bhosale
56b5bbb0be Add MSA optimized ARGB scaling functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleARGBRowDown2_MSA           - ~2.6x
ScaleARGBRowDown2Linear_MSA     - ~7.9x
ScaleARGBRowDown2Box_MSA        - ~3.7x
ScaleARGBRowDownEven_MSA        - ~1.2x
ScaleARGBRowDownEvenBox_MSA     - ~3.5x

ScaleARGBRowDown2_Any_MSA       - ~2.6x
ScaleARGBRowDown2Linear_Any_MSA - ~7.9x
ScaleARGBRowDown2Box_Any_MSA    - ~3.6x
ScaleARGBRowDownEven_Any_MSA    - ~1.2x
ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x

Performance Gain (vs C non-vectorized)
ScaleARGBRowDown2_MSA           - 2.6x
ScaleARGBRowDown2Linear_MSA     - 13.5x
ScaleARGBRowDown2Box_MSA        - 5.8x
ScaleARGBRowDownEven_MSA        - 1.2x
ScaleARGBRowDownEvenBox_MSA     - 3.7x

ScaleARGBRowDown2_Any_MSA       - 2.6x
ScaleARGBRowDown2Linear_Any_MSA - 13.5x
ScaleARGBRowDown2Box_Any_MSA    - 5.3x
ScaleARGBRowDownEven_Any_MSA    - 1.2x
ScaleARGBRowDownEvenBox_Any_MSA - 3.7x

Review URL: https://codereview.chromium.org/2527983002 .
2016-12-07 11:47:15 +05:30
Manojkumar Bhosale
83f460be33 Add MSA optimized ARGB Multiply/Add/Subtract row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBMultiplyRow_MSA       - 1.4x
ARGBAddRow_MSA            - 8.6x
ARGBSubtractRow_MSA       - 8.6x

ARGBMultiplyRow_Any_MSA   - 1.35x
ARGBAddRow_Any_MSA        - 7.3x
ARGBSubtractRow_Any_MSA   - 7.2x

Performance Gain (vs C non-vectorized)
ARGBMultiplyRow_MSA       - 4.4x
ARGBAddRow_MSA            - 27x
ARGBSubtractRow_MSA       - 22x

ARGBMultiplyRow_Any_MSA   - 3.5x
ARGBAddRow_Any_MSA        - 23x
ARGBSubtractRow_Any_MSA   - 18x

Review URL: https://codereview.chromium.org/2529983002 .
2016-12-02 15:21:10 +05:30
Frank Barchard
da0c29dada Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBToRGB565Row_MSA       - ~1.6x
ARGBToRGB565Row_Any_MSA   - ~1.6x
ARGBToARGB1555Row_MSA     - ~1.3x
ARGBToARGB1555Row_Any_MSA - ~1.3x
ARGBToARGB4444Row_MSA     - ~3.8x
ARGBToARGB4444Row_Any_MSA - ~3.8x
ARGBToUV444Row_MSA        - ~2.4x
ARGBToUV444Row_Any_MSA    - ~2.4x

Performance Gain (vs C non-vectorized)
ARGBToRGB565Row_MSA       - ~2.8x
ARGBToRGB565Row_Any_MSA   - ~2.8x
ARGBToARGB1555Row_MSA     - ~2.2x
ARGBToARGB1555Row_Any_MSA - ~2.2x
ARGBToARGB4444Row_MSA     - ~6.8x
ARGBToARGB4444Row_Any_MSA - ~6.6x
ARGBToUV444Row_MSA        - ~6.7x
ARGBToUV444Row_Any_MSA    - ~6.7x

Review URL: https://codereview.chromium.org/2520003004 .
2016-11-22 10:47:55 -08:00
Frank Barchard
b1504a8e48 Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2487913004 .
2016-11-18 15:05:10 -08:00
Frank Barchard
97fb18b846 disable I422AlphaToARGBRow_SSSE3 for 32 bit fpic
BUG=libyuv:658
TEST=g++ -I include  -fPIC -m32 -msse2 -Os -fno-omit-frame-pointer -c source/row_gcc.cc -o row_gcc.o
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2482263003 .
2016-11-08 16:09:09 -08:00
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
f2c27dafa2 HalfFloat neon armv7 fix for destination pointer.
Improved unittests detect different in arm64 rounding.

TEST=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*Half* -a "--libyuv_width=640 --libyuv_height=360"
BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2478313004 .
2016-11-07 12:13:04 -08:00
Frank Barchard
eca08525cb HalfFloat Neon for ARMv7.
64 bit version made similar to 32 bit with registers 1 for load and store results, and 2 and 3 as expanded float temporary values.

TEST=out/Release/libyuv_unittest --gtest_filter=*Half*

BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2467723002 .
2016-11-01 11:36:51 -07:00
Frank Barchard
10ce829bad Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422ToRGB565Row_MSA             : ~1.5x
I422ToRGB565Row_Any_MSA         : ~1.5x
I422ToARGB4444Row_MSA           : ~1.4x
I422ToARGB4444Row_Any_MSA       : ~1.4x
I422ToARGB1555Row_MSA           : ~1.4x
I422ToARGB1555Row_Any_MSA       : ~1.4x

Performance Gain (vs C non-vectorized)
I422ToRGB565Row_MSA             : ~6.8x
I422ToRGB565Row_Any_MSA         : ~6.8x
I422ToARGB4444Row_MSA           : ~6.6x
I422ToARGB4444Row_Any_MSA       : ~6.6x
I422ToARGB1555Row_MSA           : ~6.6x
I422ToARGB1555Row_Any_MSA       : ~6.6x

Review URL: https://codereview.chromium.org/2445343007 .
2016-10-27 10:47:35 -07:00
Frank Barchard
532f5708a9 Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA      : ~1.4x
I422AlphaToARGBRow_Any_MSA  : ~1.4x
I422ToRGB24Row_MSA          : ~4.8x
I422ToRGB24Row_Any_MSA      : ~4.8x

Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA      : ~7.0x
I422AlphaToARGBRow_Any_MSA  : ~7.0x
I422ToRGB24Row_MSA          : ~7.9x
I422ToRGB24Row_Any_MSA      : ~7.7x

Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
02ae8b60c5 Line continuation at end of line with NOLINT before that.
BUG=libyuv:634
TEST=git cl lint
TBR=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2453013003 .
2016-10-26 10:42:52 -07:00
Frank Barchard
2488b3105b White spaces, comments and lint fixes for msa.
no functional changes.

TBR=kjellander@chromium.org
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2446313002 .
2016-10-25 11:36:54 -07:00
Frank Barchard
c2073823b4 use __OPTIMIZE__ macro to determine debug vs release.
Debug builds of x86 gcc/clang can run out of register.
Previously NDEBUG or _DEBUG was used to detect a debug build.
But those macros are not set by gentoo builds.
This CL switches to the compiler predefine __OPTIMIZE__ which is
built into clang and gcc.

BUG=libyuv:602
TEST=untested
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2451503002 .
2016-10-24 18:02:48 -07:00
Frank Barchard
f5d5bd88d6 Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gains :- (vs C vectorized)

I422ToARGBRow_MSA     : ~1.6x
I422ToRGBARow_MSA     : ~1.6x

I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x

Performance Gains :- (vs C non-vectorized)

I422ToARGBRow_MSA     : ~7x
I422ToRGBARow_MSA     : ~7x

I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x

Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.

Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
Frank Barchard
451af5e922 scale by 1 for neon implemented
void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fcvtn      v4.4h, v2.4s                   \n"  // 8 floatsgit
    "fcvtn2     v4.8h, v1.4s                   \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  :
  : "cc", "memory", "v1", "v2", "v4"
  );
}

void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fmul       v2.4s, v2.4s, %3.s[0]          \n"  // adjust exponent
    "fmul       v1.4s, v1.4s, %3.s[0]          \n"
    "uqshrn     v4.4h, v2.4s, #13              \n"  // isolate halffloat
    "uqshrn2    v4.8h, v1.4s, #13              \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  : "w"(scale * 1.9259299444e-34f)    // %3
  : "cc", "memory", "v1", "v2", "v4"
  );
}

TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2430313008 .
2016-10-21 14:30:03 -07:00
Frank Barchard
550cf829fb HalfFloat avx2 unpack bug fix.
AVX unpack parameters were reverse ordered causing incorrect results
on AVX2 hardware.

TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Half*

BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2438893002 .
2016-10-20 15:49:00 -07:00
Frank Barchard
f553db2d30 HalfFloatPlane unittest for denormal half floats
Halffloats have a limited range.  It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero.  This test ensures they behave the same in C and SIMD and tests the performance of denormals.

TEST=TestHalfFloatPlane_denormal
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2424233004 .
2016-10-19 18:13:01 -07:00
Frank Barchard
78c58ab8aa Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains : (Auto-vectorized C vs MSA SIMD)

ARGB4444ToYRow_MSA        : ~3.0x
ARGB4444ToUVRow_MSA       : ~1.8x
ARGB4444ToARGBRow_MSA     : ~3.4x

ARGB4444ToYRow_Any_MSA    : ~2.8x
ARGB4444ToUVRow_Any_MSA   : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x

Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
2d80fc3133 Port HalfFloatRow_SSE2 to AVX2 but not using F16C.
R=wangcheng@google.com, hubbe@chromium.org
BUG=libyuv:560

Review URL: https://codereview.chromium.org/2421993002 .
2016-10-14 19:01:41 -07:00
Frank Barchard
fdcf524aac Add f16c (halffloat) cpuid
R=wangcheng@google.com, hubbe@chromium.org
BUG=libyuv:560

Review URL: https://codereview.chromium.org/2418763006 .
2016-10-14 16:34:08 -07:00
Frank Barchard
5333e94e70 Port ARGBExtractAlpha_AVX2 function to windows.
BUG=libyuv:572
TEST=try bots
R=wangcheng@google.com, magjed@chromium.org

Review URL: https://codereview.chromium.org/2416783004 .
2016-10-13 23:20:57 -07:00
Frank Barchard
a5e93766a2 Add ARGBExtractAlpha_AVX2 function
Port SSE2 version to AVX2.
BUG=libyuv:572
TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract*
R=wangcheng@google.com, magjed@chromium.org

Review URL: https://codereview.chromium.org/2420553002 .
2016-10-13 16:03:43 -07:00
Frank Barchard
198bce3959 Cast for clang-cl 64 bit build warnings in unittests
R=kjellander@chromium.org
BUG=libyuv:649

Review URL: https://codereview.chromium.org/2414763002 .
2016-10-12 13:09:57 -07:00
Frank Barchard
d363ea6527 Remove I411 support.
YUV 411 is very uncommon format.  Remove support.

Update documentation to reflect that 411 is deprecated.

Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.

BUG=libyuv:645
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
0071f46a1f Side by side 420 test
I420 output can be slow due to multi channel write.
Putting the U and V into a single side by side buffer can improve performance.

TBR=wangcheng@google.com
BUG=None

Review URL: https://codereview.chromium.org/2403223003 .
2016-10-10 19:28:33 -07:00
Frank Barchard
edd3a84d05 libyuv::YUY2ToY for isolating Y channel of YUY2.
This function is the first step of YUY2 To I420.
Provided primarily for diagnostics.

TBR=wangcheng@google.com
BUG=libyuv:647
TESTED=LibYUVConvertTest.YUY2ToY_Opt

Review URL: https://codereview.chromium.org/2399153004 .
2016-10-07 17:20:30 -07:00
Frank Barchard
a2891ec77c Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains as below,

YUY2ToI422, YUY2ToI420 :-

YUY2ToYRow_MSA          : ~10x
YUY2ToUVRow_MSA         : ~11x
YUY2ToUV422Row_MSA      : ~9x
YUY2ToYRow_Any_MSA      : ~6x
YUY2ToUVRow_Any_MSA     : ~5x
YUY2ToUV422Row_Any_MSA  : ~4x

UYVYToI422, UYVYToI420 :-

UYVYToYRow_MSA          : ~10x
UYVYToUVRow_MSA         : ~11x
UYVYToUV422Row_MSA      : ~9x
UYVYToYRow_Any_MSA      : ~6x
UYVYToUVRow_Any_MSA     : ~5x
UYVYToUV422Row_Any_MSA  : ~4x

Review URL: https://codereview.chromium.org/2397693002 .
2016-10-07 10:37:22 -07:00
Frank Barchard
7018f5be0f Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains :-

I422ToYUY2Row_MSA     - ~12x
I422ToYUY2Row_Any_MSA - ~7x

I422ToUYVYRow_MSA     - ~12x
I422ToUYVYRow_Any_MSA - ~7x

Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
aa197ee1a3 HalfFloat_SSE2 for Visual C
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows
R=hubbe@chromium.org, wangcheng@google.com

Review URL: https://codereview.chromium.org/2387713002 .
2016-10-03 10:33:38 -07:00
Frank Barchard
4a14cb2e81 HalfFloat_SSE2 port from C algorithm to SSE2
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2381493006 .
2016-09-30 09:47:16 -07:00
Frank Barchard
7fc932ddd3 Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560,chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2371293002 .
2016-09-29 15:06:30 -07:00
Frank Barchard
c11e9b7fb7 bt709 coefficients for video constrained space
Original bt709 color space coefficients were full range yuv for higher
quality.  This change makes the coefficients use the video constrained
color space the same as bt601 which is 16 to 240 for Y and 16 to 235 for
chroma channels.

BUG=libyuv:639
TEST=libyuv unittests run locally
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2367253003 .
2016-09-28 15:07:46 -07:00
Frank Barchard
6732bcbde9 ShortToHalfFloat_AVX2 function
BUG=libyuv:560
TEST=local compile for windows
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2364293002 .
2016-09-27 14:18:32 -07:00
Frank Barchard
bcd823805c remove guard nolints from all headers
Remove NOLINT from guards

TEST=git cl lint
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2374653002 .
2016-09-26 18:02:09 -07:00
Frank Barchard
51a3500c18 Remove unused macros for msa.
Signed vectors are rarely used in libyuv... remove macros for now.
Remove word shuffler, use byte shuffler in row_msa.
Bump version number.

TBR=manojkumar.bhosale@imgtec.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2375553002 .
2016-09-26 17:34:58 -07:00
Frank Barchard
618149084e Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function
This patch adds MSA optimized ARGBMirrorRow function in libYUV project.

Performance gain ~3x

R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2368313003 .
2016-09-26 16:28:01 -07:00
Frank Barchard
c5323b0fdc Add MIPS SIMD Arch (MSA) optimized MirrorRow function
As per the preparation patch added in Chromium sources at,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds

This patch adds first MSA optimized function in libYUV project.

BUG=libyuv:634
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2285683002 .
2016-09-22 16:12:22 -07:00
Frank Barchard
5da918b48d Enable NEON for unittests on ios 64 bit.
TBR=kjellander@chromium.org
BUG=libyuv:637, chromium:646279

Review URL: https://codereview.chromium.org/2340933005 .
2016-09-16 16:46:46 -07:00
Frank Barchard
8279df963e Scale by 3/8 only if source is multiple of 8 tall.
BUG=libyuv:635
TEST=try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2347733002 .
2016-09-16 14:57:47 -07:00
Frank Barchard
742be44654 Remove references to svn version control.
BUG=libyuv:636
TESTED=try bots
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2339813002 .
2016-09-14 14:54:47 -07:00
Frank Barchard
de944ed8c7 YuvConstants declare alignment for externs as well as declarations
On visual c 2013 and earlier a warning is generated if externs
are not declared with the same alignment as the declaration, when
using /ltcg

BUG=libyuv:633
TEST=standalong test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg
R=skal@google.com

Review URL: https://codereview.chromium.org/2291533004 .
2016-08-30 11:06:46 -07:00
Frank Barchard
dc3a1295be add mergeuv test
Add test for SplitUVPlane and MergeUVPlane

Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists.

TEST=SplitUVPlane unittest
BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2279603002 .
2016-08-25 10:29:16 -07:00
Frank Barchard
c244a3e9a0 Add SplitUVPlanes and MergeUVPlanes
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.

BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2277603004 .
2016-08-24 16:47:24 -07:00
Frank Barchard
17d31e6a4a NV12 allow NULL for Y
The conversion from NV12 and other Bi or Tri planar formats, differs only in the UV handling.  The helper function supports passing a NULL for the dst_y channel indicating you only want to do the UV conversion.

TBR=harryjin@google.com
TEST=LibYUVConvertTest.NV12ToI420_NullY (601 ms)
BUG=libyuv:626

Review URL: https://codereview.chromium.org/2276703002 .
2016-08-23 19:05:25 -07:00
Frank Barchard
d58297a2df NV12ToI420 use SplitPlane function
TBR=magjed@chromium.org
BUG=libyuv:629
TEST=LibYUVConvertTest.NV12ToI420_Opt

Review URL: https://codereview.chromium.org/2267303002 .
2016-08-22 18:35:55 -07:00
Frank Barchard
920151f2b5 Change basic_types.h for fixing build failure
BUG=libyuv:630

TBR=harryjin@google.com
TEST=android build locally tested.

Review URL: https://codereview.chromium.org/2225763003 .

Review URL: https://codereview.chromium.org/2269793002 .
2016-08-22 16:16:49 -07:00
Frank Barchard
74491ba0c5 add blank lines to getting started
BUG=libyuv:626

Review URL: https://codereview.chromium.org/2225763003 .
2016-08-08 15:23:38 -07:00
Frank Barchard
e74086bfe3 Remove DISABLE_X86 from build.gn
Fix for duplicate define
../../third_party/libyuv/include/libyuv/scale_row.h:29:9: error: 'LIBYUV_DISABLE_X86' macro redefined [-Werror,-Wmacro-redefined]
        ^

GYP version relys on headers disabling the optimization.
This CL does the same for BUILD.gn
TBR=kjellander@chromium.org
BUG=libyuv:625

Review URL: https://codereview.chromium.org/2149823003 .
2016-07-14 12:14:22 -07:00
Frank Barchard
46a8eaaf0c fix typo in YUV
R=braveyao@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2152623002 .
2016-07-13 17:17:19 -07:00
Frank Barchard
1aa4ddd21c Attribute aligned 32 for YUV conversion structure on Intel
Fix for unaligned memory exception.

R=braveyao@chromium.org
BUG=libyuv:616

Review URL: https://codereview.chromium.org/2152553002 .
2016-07-13 12:19:26 -07:00
Frank Barchard
a7a6d8cc2e Duplicate prototype for I420ToABGR for remoting
Add alias prototype in convert_argb.h for remoting to build without the header convert_from.h

BUG=libyuv:622
TBR=harryjin@google.com

Review URL: https://codereview.chromium.org/2141923005 .
2016-07-12 19:12:28 -07:00
Frank Barchard
abcb70f183 Test nv21 layout of Android420ToI420 function.
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Any
[       OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Any
[       OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms)

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2146733002 .
2016-07-12 18:34:04 -07:00
Frank Barchard
84e04699c2 Add libyuv:Android420ToI420 function which takes 3 pointers
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2114843002 .
2016-07-12 16:23:51 -07:00
Frank Barchard
4d9146bbb1 include planar functions and convert_argb for webrtc
webrtc doesnt include the headers that the functions are prototyped in.
This CL makes the convert.h include those headers to allow webrtc to
update to the head libyuv.

TBR=harryjin@google.com
BUG=libyuv:620,webrtc:6091,webrtc:6094
TESTED=local build and try bots

Review URL: https://codereview.chromium.org/2141683002 .
2016-07-11 11:37:51 -07:00
Frank Barchard
8b55286ed5 duplicate I420Rect prototype into convert for webrtc
TBR=harryjin@google.com
BUG=libyuv:618

Review URL: https://codereview.chromium.org/2132993003 .
2016-07-08 16:03:38 -07:00
Frank Barchard
303b9f03c8 Avoid gcc 4.4 indexing a vector_size(32) array error.
Mking color conversion use simple arrays within structure, which will be referenced via register pointer.

R=harryjin@google.com
BUG=libyuv:616
TEST=CC=gcc-4.4 CXX=g++-4.4 LD=ld-4.4 make -f linux.mk

Review URL: https://codereview.chromium.org/2127863003 .
2016-07-06 15:14:29 -07:00
Frank Barchard
2f101fdbda mingw64 fix - guard row_win.cc against mingw build.
The old guard only checked for defined(_M_X64) which is defined by mingw64.  Add a test for defined(_MSC_VER) which is defined for clangcl and visual c but not mingw.  mingw should use row_gcc.cc for both 32 and 64 bit.

R=harryjin@google.com
BUG=webm:1252,libyuv:613
TEST=local gcc/clang builds on linux tested and try bots for others.

Review URL: https://codereview.chromium.org/2105603002 .
2016-06-28 10:21:27 -07:00
Frank Barchard
b8ddb5a2a7 rounding for arm filter
R=wangcheng@google.com, harryjin@google.com
BUG=libyuv:607

Review URL: https://codereview.chromium.org/2093913004 .
2016-06-24 16:07:49 -07:00
Frank Barchard
cc88adc620 YUV scale filter columns improved filtering accuracy
upscale a YUV image.  observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.

was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)

BUG=libyuv:605
TEST=try bots
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Frank Barchard
24b9fa6671 use vectorsize on clangcl
the ScaleFilterCols_SSSE3 function fails at runtime if vectorsize is not used.

BUG=libyuv:610,libyuv:605
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2080223007 .
2016-06-23 20:14:22 -07:00
Frank Barchard
e376b06d6a Disable ScaleFilterCols_SSSE3 which produces color shift
upscale a YUV image.  observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.

disable HAS_SCALEFILTERCOLS_SSSE3

R=harryjin@google.com
BUG=libyuv:605

Review URL: https://codereview.chromium.org/2080663003 .
2016-06-20 10:43:09 -07:00
Frank Barchard
fd3e676e91 android_full_debug x86 fix - use +rm for width count
Work around for android full debug build runnign out of registers.
5 functions were running out of registers causing the compiler error
error: 'asm' operand has impossible constraints
These functions mostly have 4 pointers, a counter (width) and a tempory
eax register.  With fpic and debug using stackframes, 2 registers are
unavailable.  So a total of 8 registers are used.
Although fpic and stack frame dont apply to assembly, the compiler
reserves 2 registers.  The optimized version builds, so its likely
freeing up the registers once it knows they are not used.
These functions used to build, so compile options and/or compiler may
have updated.. likely fpic was turned on.
An attribute can be done to disable each, and will avoid using the
2 GPR registers, but they are still reserved and unavailable in debug
builds on current compilers (gcc 4.9 and clang 3.8).

R=dhrosa@google.com
BUG=libyuv:602

Review URL: https://codereview.chromium.org/2066933002 .
2016-06-14 15:25:28 -07:00
Frank Barchard
e2611a7349 document cpuid command line behavior
cpu_info_ is zero for uninitialized state and all bits are off, disabling all cpu optimizations.
the 1 bit indicates cpu_info_ is initialized avoiding calling the detection code again for performance.

MaskCpuFlags initializes the cpu ignoring existing flags, then masks with the supplied flags and stores to cpu_info_.
As a mask, -1 has no effect, enabling all cpu features that were detected, but nothing that wasnt detected.
Setting to 0 will cause the next call to re-initialize the cpu, which is same as enabling all features.
Setting mask to 1 will turn off all cpu features but keep the initialized bit on, so the next detection call wont reinitialize and the cpu features are all disabled.

So normal behavior for command line and programatic masking is:
1 = C
-1 = SIMD

TBR=harryjin@google.com
BUG=libyuv:600
TESTED=out64/Release/bin/run_libyuv_unittest -s libyuv_unittest --verbose --release --gtest_filter=*ARGBExtractAlpha* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=1 --libyuv_cpu_info=1"

Review URL: https://codereview.chromium.org/2042933002 .
2016-06-08 10:38:09 -07:00
Frank Barchard
026be3cd85 neon64 use width int directly.
width %w size modifier the int width can be passed directly to arm assembly.
For functions that take input constants, the outputs are declared as early
write using &, meaning the outputs use used before all inputs are consumed.

R=harryjin@google.com
BUG=libyuv:598

Review URL: https://codereview.chromium.org/2043073003 .
2016-06-08 10:26:53 -07:00
Frank Barchard
6546096269 ARGBExtractAlpha 16 pixels at a time for ARM
arm64   8     TestARGBExtractAlpha (10019 ms) <-original 64 bit code
arm64   8 x2  TestARGBExtractAlpha (7639 ms)
arm64   16    TestARGBExtractAlpha (7369 ms) <- new 64 bit code
thumb32 8     TestARGBExtractAlpha (9505 ms) <- original 32 bit code
thumb32 8 x2  TestARGBExtractAlpha (7400 ms)
thumb32 8 x2i TestARGBExtractAlpha (7266 ms) <- new 32 bit code
arm32   8     TestARGBExtractAlpha (10002 ms)

BUG=libyuv:572
TESTED=local test on nexus 9
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2035573002 .
2016-06-07 10:44:28 -07:00
Frank Barchard
462be27ec8 j422 now uses j420 source code so increase error threshold to match.
R=harryjin@google.com
BUG=libyuv:597

Review URL: https://codereview.chromium.org/2024213003 .
2016-05-31 19:45:34 -07:00
Frank Barchard
b00d40160a make unittest allocator align to 64 bytes.
blur requires memory be aligned.  change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.

TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.

Review URL: https://codereview.chromium.org/2019753002 .
2016-05-27 18:02:47 -07:00
Magnus Jedvert
942db3016a Add ARGBExtractAlpha function
BUG=libyuv:572
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1995293002 .
2016-05-26 10:30:57 +02:00
Frank Barchard
74a69522da white space fixes for MIPS
TBR=kjellander@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2005053004 .
2016-05-24 14:17:18 -07:00
Frank Barchard
60abed3a47 add SIMD_ALIGNED to unit_test.h
avoids need for row.h for some unittests;

R=harryjin@google.com
BUG=libyuv:594
TESTED=try bots tested.

Review URL: https://codereview.chromium.org/2004313004 .
2016-05-24 13:56:25 -07:00
Frank Barchard
7edf572e28 remove includes for duplicate functions
R=harryjin@google.com
BUG=libyuv:592
TESTED=local builds work with fewer headers

Review URL: https://codereview.chromium.org/2006943002 .
2016-05-23 17:38:26 -07:00
Frank Barchard
fbdc43a03c fix wrong HAS_ARGBCOPYALPHAROW_SSE2 ifdef
TBR=kjellander@chromium.org
BUG=libyuv:593
TESTED=try bots pass.

Review URL: https://codereview.chromium.org/2000393002 .
2016-05-23 16:26:02 -07:00
Frank Barchard
07cb92272f If image sizes are greater than 32768, fixed point stepping will overflow an int. This CL changes the max size to 32768 and disables the test if larger.
BUG=libyuv:590
TESTED=LIBYUV_FLAGS=-1 LIBYUV_WIDTH=8192 LIBYUV_HEIGHT=16 out/Release/libyuv_unittest --gtest_filter=*
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1947783002 .
2016-05-05 19:09:02 -07:00
Frank Barchard
6924590212 Add all library source files to linux.mk
Allows arm and mips linux builds.
Add psnr and cpuid utility targets.

BUG=libyuv:586
TESTED=make -f linux.mk
TBR=kjellander@chromium.org

Review URL: https://codereview.chromium.org/1906653003 .
2016-04-20 16:48:53 -07:00
Frank Barchard
cf101116c9 Remove initialize to zero on output variables for inline.
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs).  This allows the initialize
to zero step to be removed, saving 1 instruction.

BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
9c53ff2c57 Fix temporary stride for ConvertToARGB with rotation.
BUG=libyuv:578
TESTED=local unittests pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1879783002 .
2016-04-11 15:21:04 -07:00
Frank Barchard
3c862e3d29 Fix stride bug for msan on I420Interpolate.
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C

BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1872953002 .
2016-04-08 15:58:53 -07:00
Frank Barchard
ddbc63f7b9 Add //build/config/BUILD.gn to exec whitelist for GN.
Affected Linux GN build, not Windows.

R=kjellander@chromium.org
BUG=libyuv:583
TESTED=gn gen out/Debug --args=is_debug=true

Review URL: https://codereview.chromium.org/1866743002 .
2016-04-06 11:23:28 -07:00
Frank Barchard
ef79a9938b cmake move libyuv_unittest target into the if(TEST) condition
BUG=libyuv:579
TESTED=mkdir build && cd build && cmake .. && cmake --build . --config Release
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/1847233002 .
2016-04-01 16:15:34 -07:00
Frank Barchard
837aa1e2af disable assembly in header for msan=1
GYP_DEFINES="target_arch=x64 msan=1" ./gyp_libyuv
ninja -j7 -C out/Release

R=impjdi@google.com
BUG=libyuv:575

Review URL: https://codereview.chromium.org/1805683003 .
2016-03-15 18:45:38 -07:00
Frank Barchard
ee99b85126 Port ARGBToRGB565 from aarch64 neon to 32 bit
The 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts and inserts, saving some masking.  The instruction is available for neon 32 bit as well.

R=magjed@chromium.org, harryjin@google.com
BUG=libyuv:571

Review URL: https://codereview.chromium.org/1724393002 .
2016-02-29 12:22:25 -08:00
Frank Barchard
ab0dfdd4ff Documentation fix for android aarch64 disassembly.
Name of objdump tool updated.

TBR=kjellander@chromium.org
BUG=none

Review URL: https://codereview.chromium.org/1715743003 .

Review URL: https://codereview.chromium.org/1727993002 .
2016-02-23 18:30:35 -08:00
Frank Barchard
127ff512b3 add perf data files to ignores
document play services update

R=jkellander@chromium.org
BUG=none

Review URL: https://codereview.chromium.org/1712463002 .
2016-02-17 21:37:09 -08:00
Frank Barchard
cc33dc68c7 Port I411ToARGBRow to AVX2.
An SSSE3 version already exists, and an AVX2 version is available for
Visual C.  This ports the function to AVX2 completing the AVX2 ports of
all YUV to RGB functions for AVX2 on gcc.

TBR=harryjin@google.com
BUG=libyuv:555

Review URL: https://codereview.chromium.org/1687253002 .
2016-02-12 10:26:10 -08:00
Frank Barchard
0e554b18fe port NV12ToRGB565Row_AVX2 to gcc
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2

NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.

R=harryjin@google.com
BUG=libyuv:554

Review URL: https://codereview.chromium.org/1687953002 .
2016-02-10 11:13:41 -08:00
Frank Barchard
c39509c8e5 add avx2 wrappers for functions that can call I422ToARGBRow_AVX2
R=harryjin@google.com
BUG=libyuv:557

Review URL: https://codereview.chromium.org/1687713002 .
2016-02-09 17:14:29 -08:00
Frank Barchard
6ea3755330 add 'LIBYUV_DISABLE_X86' to msan for unittests
R=harryjin@google.com
BUG=libyuv:564

Review URL: https://codereview.chromium.org/1685723002 .
2016-02-09 11:57:03 -08:00
Frank Barchard
fc2adcfa42 fix for msan builds which set -DLIBYUV_DISABLE_X86=1
TBR=harryjin@google.com
BUG=libyuv:566

Review URL: https://codereview.chromium.org/1673313003 .
2016-02-09 10:51:20 -08:00
Frank Barchard
0d880e5bc0 rename MIPS_DSPR2 to DSPR2 for consistency
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor.  This should be named consistently.

Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.

TBR=harryjin@google.com
BUG=libyuv:562

Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
903c91cc2e fix for ubsan on unittest.h fastrand()
internal math of the fastrand function uses a multiply
and add that overflows a signed int.  This triggers a
ubsan failure:

../../unit_test/../unit_test/unit_test.h:60:33: runtime error: signed integer overflow: 56248274 * 214013 cannot be represented in type 'int'

This change casts the intermediate math to unsigned
int to avoid the overflow.

For more info on ubsan, see
http://dev.chromium.org/developers/testing/undefinedbehaviorsanitizer

TESTED=Passing compilation using:
GYP_DEFINES="ubsan=1"
GYP_DEFINES="ubsan_vptr=1"

R=harryjin@google.com, pbos@webrtc.org
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1662453003 .
2016-02-02 14:32:12 -08:00
Frank Barchard
9e39c1f271 ubsan overflow fix for multiply by 0x01010101
This is an UBSan error reported by libjingle

[ RUN      ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride
[000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73
../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int'
[8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms)
[9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms)
Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted

The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes.
Changing the constant to 0x01010101u should avoid overflow.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1657533005 .
2016-02-01 12:29:04 -08:00
Frank Barchard
1cc0177669 Remove duplicate prototype for MJPGToARGB
MJPGToARGB prototype is in both convert_argb.h and planar_functions.h
Remove the duplicate prototype from planar_functions.h

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:561

Review URL: https://codereview.chromium.org/1638133002 .
2016-01-26 17:02:45 -08:00
Frank Barchard
ad71738f6a Remove svn version build and unittest.
R=harryjin@google.com
TBR=harryjin@google.com, kjellander@google.com
BUG=libyuv:551

Review URL: https://codereview.chromium.org/1612123002 .
2016-01-21 11:22:11 -08:00
Frank Barchard
8c196f4d4c Fix testi420 unittest for odd height
When the image height for unittests was set to an
odd height, the TestI420 unittest would not fill
the complete source buffer.  This change handles
the odd height test case.
No change to library code.

TBR=harryjin@google.com
BUG=libyuv:549

Review URL: https://codereview.chromium.org/1609103002 .
2016-01-19 16:16:39 -08:00
Frank Barchard
58cb534962 Fix memory overwrite in YUY2ToNV12 odd wdiths
When width was odd Y channel wrote an extra pixel.
This change splits the Y from UV into a temporary
buffer and memcpy's to the destination.  Performance
is slower.

Was
YUY2ToNV12_Any (307 ms)
YUY2ToNV12_Unaligned (213 ms)
TestYUY2ToNV12 (181 ms)
YUY2ToNV12_Opt (177 ms)
YUY2ToNV12_Invert (177 ms)

Npw
YUY2ToNV12_Any (300 ms)
YUY2ToNV12_Unaligned (226 ms)
YUY2ToNV12_Invert (206 ms)
TestYUY2ToNV12 (184 ms)
YUY2ToNV12_Opt (181 ms)
TBR=harryjin@google.com
BUG=libyuv:545

Review URL: https://codereview.chromium.org/1593833002 .
2016-01-19 11:28:09 -08:00
Frank Barchard
8377c798fb Fix I420ToNV21 for wrong dst_stride_y parameter.
I420ToNV21 passes the wrong dst_stride_y when it calls I420ToNV12; parameter 8 (convert_from.cc:448) is src_stride_y but should be dst_stride_y.  This causes image corruption when converting I420 -> NV21 with mismatched luminance strides.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:547

Review URL: https://codereview.chromium.org/1582793008 .
2016-01-14 17:38:54 -08:00
Frank Barchard
081475b3c8 refactor ARGBToI422 using ARGBToI420 internally
R=harryjin@google.com
BUG=libyuv:546

Review URL: https://codereview.chromium.org/1574253004 .
2016-01-12 17:05:49 -08:00
Frank Barchard
54bbea1701 Disable I420Blend_Any test that uses C
Also renames Inverted to Invert in test name for consistency.

TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1577973004 .
2016-01-11 18:23:04 -08:00
Frank Barchard
8030a711aa Rename rotate tests to include _Opt and disable _Odd tests
TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1577723003 .
2016-01-11 17:30:27 -08:00
Frank Barchard
23c6a83561 Fix ifdef mismatch for mirroruv
Macro define and macro ifdef didnt match, leading to C code
being used.  Make macro match function name.

TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1579023002 .
2016-01-11 16:33:36 -08:00
Frank Barchard
fc52d8ded2 Odd width variation of scale down by 2 for subsampling
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:538

Review URL: https://codereview.chromium.org/1558093003 .
2016-01-06 15:12:17 -08:00
Frank Barchard
2560df9513 add clang variable for other apps to use
R=dhrosa@google.com
BUG=libyuv:539

Review URL: https://codereview.chromium.org/1557923005 .
2016-01-05 11:47:55 -08:00
Frank Barchard
36615d62a0 fix for InterpolateRow_AVX2
port scaledownby4_avx2 to gcc

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1546763002 .
2015-12-22 12:29:54 -08:00
Frank Barchard
71deb7ba3a bug fix - remove shift from InterpolateRow_AVX2
TBR=harryjin@google.com
BUG=libyuv:537

Review URL: https://codereview.chromium.org/1547703002 .
2015-12-22 10:28:48 -08:00
Frank Barchard
2cb2e9e1ad fix for InterpolateRow_AVX2
TBR=harryjin@google.com
BUG=libyuv:535

Review URL: https://codereview.chromium.org/1543773002 .
2015-12-21 18:35:12 -08:00
Frank Barchard
3f4d86053e avx2 interpolate use 8 bit
BUG=libyuv:535
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1535833003 .
2015-12-21 10:57:32 -08:00
Frank Barchard
f4447745ae Add rounding to InterpolateRow for improved quality and consistency.
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly.  Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.

BUG=libyuv:535
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1533643005 .
2015-12-17 15:24:06 -08:00
Frank Barchard
029f926a14 add NDEBUG for release chromium buids
BUG=libyuv:533

TBR=harryjin@google.com

Review URL: https://codereview.chromium.org/1531143002 .
2015-12-16 16:23:09 -08:00
Frank Barchard
216e93b4e8 Fix MIPS DSPR2 build failure.
Fixing the failure:
 'TransposeWx8_Fast_MIPS_DSPR2' was not declared in this scope

BUG=none
R=fbarchard@chromium.org

Review URL: https://codereview.chromium.org/1527243002 .
2015-12-16 10:37:42 -08:00
Frank Barchard
80ca4514ef change scale down by 4 to use rounding.
TBR=harryjin@google.com
BUG=libyuv:447

Review URL: https://codereview.chromium.org/1525033005 .
2015-12-15 21:25:18 -08:00
Frank Barchard
70445ef2ef avx2 scale down by 2 for gcc
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1520423003 .
2015-12-15 10:59:20 -08:00
Frank Barchard
77346fcb4a disable I411ToARGB assembly if _DEBUG for chromium, as well as DEBUG for other builds.
TBR=harryjin@google.com
BUG=libyuv:533

Review URL: https://codereview.chromium.org/1527903002 .
2015-12-14 21:36:12 -08:00
Frank Barchard
ae55e41851 use rounding in scaledown by 2
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
8bca9fc178 remove unused var in a test
remove include from unittest.cc that is already done by unittest.h

TBR=harryjin@google.com
BUG=libyuv:530

Review URL: https://codereview.chromium.org/1513263004 .
2015-12-10 18:39:36 -08:00
Frank Barchard
44373d8fbb Add check for DEBUG to functions disabled on 386
Some functions run out of registers when compiled for debug,
fpic, with stack frames on 32 bit x86 with clang.
Previously they were enabled based on _DEBUG but that macro
is not set in some build systems.  This CL adds DEBUG macro as
well to cover those environments.

R=harryjin@google.com
BUG=libyuv:532

Review URL: https://codereview.chromium.org/1517693005 .
2015-12-10 15:42:46 -08:00
Frank Barchard
a2ea905679 BlendPlane any width.
Benchmark
out\release\libyuv_unittest --libyuv_width=1279 --libyuv_height=719 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms

Was
I420Blend_Any (2321 ms)
I420Blend_Unaligned (1684 ms)
I420Blend_Opt (1675 ms)
I420Blend_Invert (1653 ms)
BlendPlane_Invert (1556 ms)
BlendPlane_Any (1552 ms)
BlendPlane_Unaligned (1548 ms)
BlendPlane_Opt (1535 ms)
ARGBBlend_Unaligned (659 ms)
ARGBBlend_Any (596 ms)
ARGBBlend_Invert (591 ms)
ARGBBlend_Opt (508 ms)
BlendPlaneRow_Unaligned (186 ms)
BlendPlaneRow_Opt (171 ms)

Now
ARGBBlend_Any (621 ms)
ARGBBlend_Unaligned (585 ms)
ARGBBlend_Invert (564 ms)
ARGBBlend_Opt (512 ms)
I420Blend_Unaligned (347 ms)
I420Blend_Invert (345 ms)
I420Blend_Any (337 ms)
I420Blend_Opt (327 ms)
BlendPlane_Unaligned (187 ms)
BlendPlaneRow_Unaligned (187 ms)
BlendPlane_Invert (186 ms)
BlendPlane_Any (186 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (171 ms)

which is comparable to aligned case
out\release\libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms
ARGBBlend_Any (625 ms)
ARGBBlend_Unaligned (602 ms)
ARGBBlend_Invert (508 ms)
ARGBBlend_Opt (506 ms)
I420Blend_Any (353 ms)
I420Blend_Unaligned (322 ms)
I420Blend_Invert (304 ms)
I420Blend_Opt (301 ms)
BlendPlaneRow_Unaligned (188 ms)
BlendPlane_Unaligned (186 ms)
BlendPlane_Invert (185 ms)
BlendPlane_Any (184 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (169 ms)

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1513443002 .
2015-12-08 18:59:48 -08:00
Frank Barchard
fae1a10545 Work around bug in xgetbv for Visual Studio.
xgetbv is generating bad code, falsely disabling AVX2 and AVX512.
disable optimization for the function affected on older versions of Visual C 32 bit.

R=brucedawson@chromium.org, dhrosa@google.com, harryjin@google.com
BUG=libyuv:529

Review URL: https://codereview.chromium.org/1503393004 .
2015-12-08 18:13:32 -08:00
Frank Barchard
2657688e70 Add support for odd height YUVA alpha blending.
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1507683003 .
2015-12-07 12:03:20 -08:00
Frank Barchard
bea690b3e0 AVX2 YUV alpha blender and improved unittests
AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions.

unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1505433002 .
2015-12-05 22:23:29 -08:00
Frank Barchard
fa2618ee26 Port BlendPlaneRow_SSSE3 to GCC
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1490273006 .
2015-12-04 11:19:41 -08:00
Frank Barchard
8af0ebf816 planar blend use signed images
R=dhrosa@google.com, harryjin@google.com, jzern@chromium.org
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1491533002 .
2015-12-02 14:20:17 -08:00
Frank Barchard
b6f37bd8ec Interpolate plane initial implementation.
YUV version of interpolation between two images.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:526

Review URL: https://codereview.chromium.org/1479593002 .
2015-11-25 16:11:42 -08:00
Frank Barchard
88552486f1 disable 411 on x86 due to compile error
TBR=harryjin@google.com
BUG=libyuv:524

Review URL: https://codereview.chromium.org/1468523002 .
2015-11-20 11:21:39 -08:00
Frank Barchard
526558b2d8 disable debug build of 411 to work around compiler bug
TBR=harryjin@google.com
BUG=libyuv:524

Review URL: https://codereview.chromium.org/1461013002 .
2015-11-19 02:25:00 -08:00
Frank Barchard
b7dfb72559 fix for I411 build error on 32 bit x86
TBR=harrjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1461693004 .
2015-11-19 01:45:14 -08:00
Frank Barchard
528356a128 syntax fix for gcc movzwl
TBR=harryjin@google.com
BUG=libtyv:525

Review URL: https://codereview.chromium.org/1460723003 .
2015-11-18 13:14:15 -08:00
Frank Barchard
50f8cb2db3 port I411 movzx 2 byte reader to gcc
previously the I411 format used movd to read U, V pixels.
But this reads 4 bytes, and can cause a memory exception.
pinsrw can be used, but fails on drmemory 1.5, and is slow.
So in this change a movzxw is used to read 2 bytes into EBX,
then copy to xmm0 with movd.
Slightly slower, but no memory exception
Was LibYUVConvertTest.I411ToARGB_Opt (577 ms)
Now LibYUVConvertTest.I411ToARGB_Opt (608 ms)

TBR=harryjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1457783004 .
2015-11-18 13:05:39 -08:00
Frank Barchard
5eefbe2330 Fix for drmemory failure on I411ToARGB
Before
I420ToARGB_Opt (594 ms)
I422ToARGB_Opt (483 ms)
I411ToARGB_Opt (748 ms) ***
I444ToARGB_Opt (452 ms)
I400ToARGB_Opt (218 ms)

After
I420ToARGB_Opt (591 ms)
I422ToARGB_Opt (454 ms)
I411ToARGB_Opt (502 ms)  ***
I444ToARGB_Opt (441 ms)
I400ToARGB_Opt (216 ms)

TBR=harryjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1459513002 .
2015-11-17 18:00:52 -08:00
Frank Barchard
ec4b258d4e free src_a in unittest to fix leak
TBR=harryjin@google.com
BUG=libyuv:524

Review URL: https://codereview.chromium.org/1452083002 .
2015-11-17 00:29:53 -08:00
Frank Barchard
0815568a50 test for unaligned vs aligned for CopyRow_SSE2
improves performance on older CPUs where movdqa is faster.
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1455463002 .
2015-11-17 00:04:03 -08:00
Frank Barchard
1019e4537f port I444ToARGB avx2 code from Visual C to GCC.
SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)

AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1445893002 .
2015-11-13 18:31:22 -08:00
Frank Barchard
60adcbaf32 scale with conversion using 2 steps with unittest
a prototype function to implement the yuv to rgb with conversion and scale.
replace with 1 step function in future version, using same API.

R=harryjin@google.com
BUG=libyuv:471

Review URL: https://codereview.chromium.org/1421553016 .
2015-11-13 11:25:56 -08:00
Frank Barchard
6100f50f13 fix yvu constants for avx2 yuv to rgb
the yvu matrix for yuv to rgb had an incorrect entry, affecting yuv to bgra,
yuv to abgr and yuv to raw.
fix the matrix and reenable avx2 functions.

R=harryjin@google.com
BUG=libyuv:522

Review URL: https://codereview.chromium.org/1411763004 .
2015-11-10 10:45:44 -08:00
Frank Barchard
72a9e282ec disable more avx2 functions that dont link in chrome
libyuv builds/runs, but when integrated into chromium, produces link errors.  unclear why but this disables affected functions.
will followup with re-enabling them once the root cause in the runtime error is found.

TBR=harryjin@google.com
BUG=libyuv:522

Review URL: https://codereview.chromium.org/1427683004 .
2015-11-09 17:20:02 -08:00
Frank Barchard
fb5ed1f4c5 disable 4 AVX2 YUV to RGB conversions which fails tests.
disable I422ALPHATOARGBROW_AVX2 I422TOARGBROW_AVX2 I422TORGB24ROW_AVX2 I422TORGBAROW_AVX2 in row.h.
SSSE3 versions will be used instead.
Short term fix until issue can be resolved.

R=harryjin@google.com
BUG=libyuv:522

Review URL: https://codereview.chromium.org/1419513009 .
2015-11-09 14:40:08 -08:00
Frank Barchard
98eb102bea set d19 alpha on inner loop
TBR=harryjin@google.com
BUG=libyuv:521

Review URL: https://codereview.chromium.org/1429263004 .
2015-11-06 11:38:21 -08:00
Frank Barchard
431cb3667a YUV to RGB for x64 use registers instead of memory.
On Arm the YVU to RGB conversions move constants into registers.
This change does the same for 64 bit intel builds where additional
registers are available.
The AVX2 saves 3 instructions by because the 2nd argument needs to be a register, so a vmovdqu was avoided.

x64 builds using memory:
AVX2  I420ToARGB_Opt (3059 ms)
SSSE3 I420ToARGB_Opt (3959 ms)

Now using registers
AVX2  I420ToARGB_Opt (2906 ms)
SSSE3 I420ToARGB_Opt (3928 ms)

TBR=harryjin@google.com
BUG=libyuv:520

Review URL: https://codereview.chromium.org/1407353010 .
2015-11-04 16:16:18 -08:00
Frank Barchard
c2bff1a1af add .gn file for gn builds
using a stripped down gn file from webrtc.

BUG=libyuv:411,libyuv:519
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/1417613007 .
2015-11-04 11:09:00 -08:00
Frank Barchard
860cc0357a Neon versions of I420AlphaToARGB
Add alpha version of YUV to RGB to neon code for ARMv7 and aarch64.
For other YUV to RGB conversions, hoist alpha set to 255 out of loop.

TBR=harryjin@google.com
BUG=libyuv:516

Review URL: https://codereview.chromium.org/1413763017 .
2015-11-03 19:21:36 -08:00
Frank Barchard
d95d2169d9 rename yuv matrix constants to be more clear about what they are
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1429693006 .
2015-11-03 17:09:53 -08:00
Frank Barchard
1f1d140bb6 remove mips dsp detect
DSP code is not actually used, only DSPR2.  Remove the detect.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1405043008 .
2015-11-03 16:57:40 -08:00
Frank Barchard
ce4c2fad1d Raw 24 bit RGB to RGB24 (bgr)
Add unittests that do 1 step conversion vs 2 step conversion.

Tests end swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1419103007 .
2015-11-03 10:30:30 -08:00
Frank Barchard
87926cec8b remove store bgra, abgr, raw unused macros
TBR=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1420033004 .
2015-11-02 10:40:03 -08:00
Frank Barchard
2c7aa0070a remove I422ToBGRA and use I422ToRGBA internally
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.

Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1427993004 .
2015-11-02 10:24:12 -08:00
Frank Barchard
5d97b93369 refactor I420ToABGR to use I420ToARGBRow
Using a transposed conversion matrix, I420ToARGB can output ABGR.

R=harryjin@google.com, xhwang@chromium.org
BUG=libyuv:473

Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
254ef01551 disable I420AlphaToARGB for 32 bit intel debug build
R=harryjin@google.com
BUG=libyuv:517

Review URL: https://codereview.chromium.org/1428843003 .
2015-10-29 11:21:36 -07:00
Frank Barchard
cdbdf5b723 Fix debug compilation problems for gcc and 32 bit x86.
In some methods with 7 arguments gcc fails to find enough registers
to compile the assembler code when compiling debug. Simplest solution
is to skip the assembler version in debug of those particular functions

(I422Alpha -> ARBG/ABGR)

R=harryjin@google.com,bratell@opera.com
BUG=libyuv:517

Review URL: https://codereview.chromium.org/1423283002 .
2015-10-28 14:27:29 -07:00
Frank Barchard
811a5ec446 pass clangcl compile options to ignore warnings in gflags.cc
R=ajm@chromium.org, ajm@google.com
BUG=libyuv:513,webrtc:760

Review URL: https://codereview.chromium.org/1427643003 .
2015-10-28 10:58:19 -07:00
Frank Barchard
b86dbf24d3 refactor I420AlphaToABGR to use I420AlphaToARGB internally
swap U and V and transpose conversion matrix, so I420AlphaToARGB and
I420AlphaToABGR share low level code.

Having less code with same performance allows more focused
optimization for future ARM versions.

R=harryjin@google.com
TBR=harryjin@chromium.org
BUG=libyuv:473,libyuv:516

Review URL: https://codereview.chromium.org/1422263002 .
2015-10-27 14:17:21 -07:00
Frank Barchard
cf160cdbaa implement I444ToABGR by swapping uv and transpose matrix
U contributes to B and G.  V contributes to R and G.
By swapping U and V, they contribute to the opposite channels.  Adjust the matrix so the U contribution is in the matrix location such that it till contribute to the
new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.

As a result the existing I444ToABGRRow functions are no longer needed and are removed.

Previously this function was only Intel AVX2 optimized for Windwos.  Now it is also optimized for Arm and GCC.

ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.

R=xhwang@chromium.org
BUG=libyuv:515

Review URL: https://codereview.chromium.org/1414133006 .
2015-10-27 10:21:21 -07:00
Frank Barchard
e8ee175549 add unittest that compares ABGR to ARGB
TBR=harryjin@google.com
BUG=libyuv:515

Review URL: https://codereview.chromium.org/1423663007 .
2015-10-26 17:51:03 -07:00
Frank Barchard
2844662e1c Add avx512bw detection code
R=harryjin@google.com
BUG=libyuv:514

Review URL: https://codereview.chromium.org/1413463004 .
2015-10-26 14:42:49 -07:00
Frank Barchard
1502832a70 switch cpu flags to 0 for unitialized to avoid compare
R=harryjin@google.com
BUG=libyuv:512

Review URL: https://codereview.chromium.org/1418253002 .
2015-10-23 10:57:42 -07:00
Frank Barchard
ad36ba5c48 initialize cpu flags to fix compile error on windows
R=harryjin@google.com
BUG=libyuv:512

Review URL: https://codereview.chromium.org/1422733003 .
2015-10-22 15:16:31 -07:00
Frank Barchard
00f15e3c6c color unittest allow j420 error of 5 for arm
R=harryjin@google.com
BUG=libyuv:511

Review URL: https://codereview.chromium.org/1412683005 .
2015-10-22 11:25:04 -07:00
Frank Barchard
430bb0a0f0 odd width 444 fix
TBR=harryjin@google.com
BUG=libyuv:510

Review URL: https://codereview.chromium.org/1415583003 .
2015-10-21 20:03:19 -07:00
Frank Barchard
ba4b409d51 Fix ARGBToI411 odd width bug.
The any function for handling ARGBToI411 was not handling the pixel
replication correctly.  On 422 and odd width was handled by duplicating
a pixel of source.  411 needs replication for remainders of 1, 2 or 3
pixels.

The C version was handling odd width but with an average of the remainder
pixels, which does not match the SIMD 'any' handling off remainder.
This changes the odd width handling to mimic the any version.

TBR=harryjin@google.com
BUG=libyuv:491

Review URL: https://codereview.chromium.org/1411733004 .
2015-10-21 12:22:24 -07:00
Frank Barchard
9daa550a2e Move cpu_info variable outside ifdef
Fix compile error on arm, mips etc due to undefined variable.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1403373008 .
2015-10-20 16:32:44 -07:00
Frank Barchard
9be6d21ae7 write to cpu_flags once
To make init cpu flags thread safe, there can only be one write to the variable.

R=richard.winterton@intel.com, harryjin@google.com
BUG=libyuv:508

Review URL: https://codereview.chromium.org/1412793006 .
2015-10-20 16:24:01 -07:00
Frank Barchard
e6a54f223a Call AllowCommandLineReparsing in unit tests
Allows us to ignore flags passed on to us by Chromium build bots
without having to explicitly disable them. (Thanks pbos!)

TESTED=webrtc ran modules_unittests with a bogus flag did not result in an
error.

R=kjellander@chromium.org
BUG=libyuv:507

Review URL: https://codereview.chromium.org/1417573002 .
2015-10-19 16:30:41 -07:00
Frank Barchard
94312b695a add gflags support files from webrtc
files needed for command line support with gtest.
These files are copied directly from webrtc.

TBR=kjellander@chromium.org
BUG=libyuv:507

Review URL: https://codereview.chromium.org/1414483002 .
2015-10-16 18:53:25 -07:00
Henrik Kjellander
8dcec019b6 Add gflags dependency
Unit tests currently use environment variables to change behavior.
Using gflags this can be done via command line.

BUG=libyuv:507
TBR=fbarchard@chromium.org

Review URL: https://codereview.chromium.org/1413723002 .
2015-10-16 22:08:43 +02:00
Henrik Kjellander
f80cc26da7 Revert "add gflags to deps to allow command line parameters."
This reverts commit 2dd3d9230ee663e71ed4ad9164033ed672e571de.

Reason: chromium_git is a missing variable, and to properly
add gflags, we need to check in GYP files in third_party/gflags
first, then add the DEPS entry.

BUG=libyuv:507
TBR=fbarchard@chromium.org

Review URL: https://codereview.chromium.org/1406323002 .
2015-10-16 21:46:56 +02:00
Frank Barchard
2dd3d9230e add gflags to deps to allow command line parameters.
unittests currently use environment variables to change behavior.
using gflags this can be done via command line.

R=kjellander@chromium.org
BUG=libyuv:507

Review URL: https://codereview.chromium.org/1402313002 .
2015-10-16 10:57:51 -07:00
Frank Barchard
5d0a871d37 remove have jpeg test
This test is just a printf, not a real test, but somehow
fails on arm.

TBR=harryjin@google.com
BUG=libyuv:506

Review URL: https://codereview.chromium.org/1409913002 .
2015-10-15 19:13:07 -07:00
Frank Barchard
cf19a0c9a2 nv21 any fix
R=harryjin@google.com
BUG=libyuv:507

Review URL: https://codereview.chromium.org/1410643002 .
2015-10-15 16:24:51 -07:00
Frank Barchard
52a5504950 fix for C version of YUV to RGB for Arm
YuvPixel for arm was miscomputing YG.

TBR=harryjin@google.com
BUG=libyuv:506

Review URL: https://codereview.chromium.org/1402333002 .
2015-10-15 12:43:37 -07:00
Frank Barchard
e2417df4cb create color test category of unittests to narrow down arm bug
A hang in color conversion on arm occurs somewhere in yuv to rgb.
Breaking the color test into its own category of test will help
run selective tests to narrow down the issue.

R=harryjin@google.com
BUG=libyuv:506

Review URL: https://codereview.chromium.org/1405543003 .
2015-10-14 16:58:55 -07:00
Frank Barchard
c7c188379b avoid vectors for pnacl which cause linker failure.
R=sergeyu@chromium.org
BUG=chromium:538243

Review URL: https://codereview.chromium.org/1396363004 .
2015-10-14 14:49:48 -07:00
Frank Barchard
26db4de2ae break up unittests into categories
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1399523004 .
2015-10-13 16:01:07 -07:00
Frank Barchard
4abd096548 fix for yuv to rgb on arm64.
fill in aarch64 yuv constants to match how the code expects them.

TBR=harryjin@google.com
BUG=libyuv:502

Review URL: https://codereview.chromium.org/1396253004 .
2015-10-12 12:02:54 -07:00
Frank Barchard
2e4466e282 change all pix parameters to width for consistency
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
Frank Barchard
2d601aaf34 merge neon source files back into single libyuv library
previously the neon source code was broken into a separate
library built with -mfpu=neon for the neon assembly, while
the C code was built without neon.

In this change, the neon code is added to the main library
and all code built with neon.

TBR=harryjin@google.com
BUG=libyuv:371

Review URL: https://codereview.chromium.org/1392043003 .
2015-10-07 21:16:51 -07:00
Frank Barchard
76a599ec3b fix jpeg and bt.709 yuvconstants for neon64.
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces.  But the jpeg and bt.709
colour conversion constants were still in armv7 form.  This changes
the constants for aarch64 builds to be compatible with the code.

yuv constants are now passed as const *

Remove Yvu constants which were used for older version on nv21 but not new code.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
8f0cadede4 port ARGB to 565 dithering AVX2 code to GCC.
Previously the assembly code was only available to Windows.
This CL ports the AVX2 code to GCC syntax.

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1391273003 .
2015-10-07 19:13:59 -07:00
Frank Barchard
cc89e3a77b port ARGB to 565 dithering SSE2 code to GCC.
Previously the assembly code was only available to Windows.
This CL ports the SSE2 code to GCC syntax.

When running a profiler on all the unittests, this function
was the slowest of all functions that still ran in C code.
   3.71%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C

Was
ARGBToRGB565Dither_Opt (2894 ms)
Now
ARGBToRGB565Dither_Opt (432 ms)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1397673002 .
2015-10-07 18:24:50 -07:00
Frank Barchard
3e38762d6b fix avx2 box filter bug for yuv down sampling.
offset to second group of pixels was off by 16.
should have been 32, not 16.
requires avx2 hardware and wide image for test.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:492,libyuv:501

Review URL: https://codereview.chromium.org/1395603002 .
2015-10-07 11:02:33 -07:00