189 Commits

Author SHA1 Message Date
Frank Barchard
94af5319f4 Remove M420 and refactor NV12ToI420
M420 is a row biplanar variation of NV12 supported on Microsoft webcams.
The code was hardcoded to bt.601 and should be jpeg, but the format is
very old and rare.  Is a variation on NV12, so if someone needs it, it
can be re-implemented easily.

Bug: libyuv:858
Change-Id: I246167dba3c190cc76af741b8e91e58e68fde28f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2212608
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-05-26 18:48:00 +00:00
Frank Barchard
da41bca02b I400ToARGBMatrix Pass a color matrix to use different coefficients
32 bit
Neon I400ToARGB_Opt (1937 ms)
64 bit
C I400ToARGB_Opt (8957 ms)
NEON I400ToARGB_Opt (2147 ms)

x86
cI400ToARGB_Opt (1110 ms)
AVX2 I400ToARGB_Opt (213 ms)
SSE2 I400ToARGB_Opt (225 ms)

Bug: libyuv:861, b/156642185
Change-Id: I96b6f4ebba6ff9c4ed8803291ce098de6f93fa4f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2209718
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2020-05-20 20:33:12 +00:00
Frank Barchard
d426247a3b YUV to RGB Matrix functions for color space support
Make all Matrix versions of conversions public.

Bug: libyuv:861, b/156642185
Change-Id: Ida067c95dd041b612e2bab64dbface58b257038a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2202748
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Chong Zhang <chz@google.com>
2020-05-19 16:59:29 +00:00
Shiyou Yin
bed9292f2c Move init process of msa after mmi.
Some processors support both MSA and MMI.
when they are enabled together, MSA will be preferd.
This patch move MSA initialization after MMI, so that
MSA can overide MMI and be setted to effective.

Change-Id: I8a52cce83ee4ec9727d47c99b287c9580329b149
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2155944
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-28 11:01:51 +00:00
Frank Barchard
aabcc477bd RGB24Mirror function
Bug: b/151960427
Change-Id: I413db0011a4ed87eefc0dd166bb8e076b5aa4b1d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2116639
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2020-03-24 20:13:08 +00:00
Frank Barchard
6502179e4c I210ToAR30 support for 422 10 bit to 10 bit RGB
BUG=960620, libyuv:845, b/129864744

Change-Id: I43b152568b7f297f81624d47e56a334c127be17b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1901465
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-06 19:37:22 +00:00
Frank Barchard
1f12946068 Add U444ToABGR, J444ToABGR, H444ToABGR, H444ToARGB and ConvertToARGB support
BUG=960620, libyuv:845, b/129864744

Change-Id: I9f80cda3be8e13298c596fac514f65a23a38d3d0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1900310
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-05 22:11:20 +00:00
Frank Barchard
53e014c99d BT.2020 pull in tests and upstream fixes; expose a few more methods.
This adds some missing prototypes from the BT.2020 CL as well as expands
the H444 and J444 results.

BUG=960620, libyuv:845, b/129864744

Change-Id: I8ea3959379f1bb2edb857d4eb90fb9a1f6aa4e03
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1899093
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-11-05 20:01:59 +00:00
Dale Curtis
f15793d6af Add support for BT.2020.
This pulls in the changes that Firefox made to add BT.2020 support as well
as expands them to the existing 10-bit support. So we now have the following
input formats: U420, U422, U444, U010.

BUG=960620, libyuv:845

Change-Id: If0c47853a465d0ed660f849db08e71489fe1b9c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1884468
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2019-10-29 21:06:48 +00:00
Frank Barchard
22f8aad8bc RAWToRGBA for 3 channel OCR
Replace ARM64 only row function with high level function
that implements SSSE3, 32 bit Neon and C.

Compared to 2 step RAWToARGB + ARGBToRGBA on row level:
3.1x faster on ARM
6.2% faster on Intel

BUG=b/140748379

Change-Id: Ia8636d9e4fcdbe10b8c2e81610a54728e29845cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1860914
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-10-14 22:27:37 +00:00
Frank Barchard
c85a7b3ae3 MMI Optimized functions I422ToARGB for 1080p video
Improves playback performance for 1080p video on www.youku.com

BUG=libyuv:841

Change-Id: Iabe7693fba276162af0290863f46e214ab86fb6c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1790959
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2019-09-11 21:06:21 +00:00
Frank Barchard
681c6c6739 Add LIBYUV_API to NV12ToABGR and I444Rotate, I444Scale
Gaussian blur low levels ported to 32 bit neon.
But they are not hooked up to anything but a unittest.

Bug:b/248041731, b/132108021, b/129908793
Change-Id: Iccebb8ffd6b719810aa11dd770a525227da4c357
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1611206
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Chong Zhang <chz@google.com>
2019-05-14 01:18:06 +00:00
Frank Barchard
413a8d8041 Add AYUVToNV12 and NV21ToNV12
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*ToNV12* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

R=rrwinterton@gmail.com

Change-Id: Id03b4613211fb6a6e163d10daa7c692fe31e36d8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1560080
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2019-04-12 17:48:45 +00:00
Frank Barchard
5b6042fa0d add YUV24 and AYUV formats
Alternatives to RGB24 and AYUV for working with GPU.

BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*NV21To???24* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
R=rrwinterton@gmail.com

Change-Id: I5559c63f4bd4c847492fcb1571f7b03c58146689
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1501735
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-03-05 02:53:56 +00:00
Frank Barchard
b36c86fdfe Port box filter to NEON
Bug: libyuv:821
Change-Id: I4a6b9bee2c2fae199c73c9ec7ecb32bde37c1852
Tested: out/Release/libyuv_unittest --gtest_filter=*ScaleFrom1920x1080_Box --libyuv_width=160 --libyuv_height=90 --libyuv_repeat=1000
Reviewed-on: https://chromium-review.googlesource.com/c/1298598
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-10-25 18:56:29 +00:00
Frank Barchard
97b3990dec NV21ToRAW and NV12ToRAW functions added
RAW is a big endian style RGB buffer with R first in memory, then G and B.
Convert NV21 and NV12 to RAW format.

Performance on SkylakeX for 720p with AVX2
I420ToRAW_Opt (388 ms)
H420ToRAW_Opt (371 ms)
NV12ToRAW_Opt (341 ms)
NV21ToRAW_Opt (339 ms)

SSSE3
I420ToRAW_Opt (507 ms)
H420ToRAW_Opt (481 ms)
NV12ToRAW_Opt (498 ms)
NV21ToRAW_Opt (493 ms)

C
I420ToRAW_Opt (2287 ms)
H420ToRAW_Opt (2246 ms)
NV12ToRAW_Opt (2191 ms)
NV21ToRAW_Opt (2204 ms)

Performance on Pixel 2 for 720p
out/Release/bin/run_libyuv_unittest -v -t 7200 --gtest_filter=*NV??ToR*Opt --libyuv_repeat=1000 --libyuv_width=1280 --libyuv_height=720
LibYUVConvertTest.NV12ToRGB24_Opt (1739 ms)
LibYUVConvertTest.NV21ToRGB24_Opt (1734 ms)
LibYUVConvertTest.NV12ToRAW_Opt (1719 ms)
LibYUVConvertTest.NV21ToRAW_Opt (1691 ms)
LibYUVConvertTest.NV12ToRGB565_Opt (2152 ms)

Bug: libyuv:778, b:117522975
Test: add new NV21ToRAW and NV12ToRAW tests
Change-Id: Ieabb68a2c6d8c26743e609c5696c81bb14fb253f
Reviewed-on: https://chromium-review.googlesource.com/c/1272615
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2018-10-10 18:11:10 +00:00
Martin Storsjö
9b772abf97 Restore the file mode for source files
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.

Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-08-06 18:53:32 +00:00
lixia zhang
21be9122aa libyuv:loongson optimize compare/row/scale/rotate files with mmi.
Currently, libyuv supports MIPS SIMD Arch(MSA),
but libyuv does not supports MultiMedia Instruction(MMI)(such as loongson3a platform).

In order to improve performance of libyuv on loongson3a platform,
this provides optimize 98 functions with mmi.

BUG=libyuv:804

Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-07-20 22:53:04 +00:00
Frank Barchard
3009890c11 NV21ToRGB24_AVX2 and SSSE3
Use 2 step conversion for NV21ToRGB24 to leverage AVX2
low levels instead of C.

Was C
NV21ToRGB24_Opt (882 ms)

Now SSSE3
NV21ToRGB24_Opt (218 ms)

Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I58faf766bbec4cc595aab2e217f6c874dd4b4363
Reviewed-on: https://chromium-review.googlesource.com/951629
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-07 03:58:48 +00:00
Frank Barchard
0ea50cbc74 NV21ToRGB24_NEON conversion
32 bit thumb2 performance:
NV12ToARGB_Opt (472 ms)
NV21ToARGB_Opt (466 ms)
NV12ToRGB24_Opt (457 ms)
NV21ToRGB24_Opt (457 ms)
NV12ToRGB565_Opt (501 ms)

Bug: libyuv:778
Test: add new NV21ToRGB24 test
Change-Id: I330585789835c79ee4b4da61d164716598268df3
Reviewed-on: https://chromium-review.googlesource.com/924646
Reviewed-by: Cheng Wang <wangcheng@google.com>
2018-02-22 22:24:24 +00:00
Frank Barchard
5f0354bde5 clang-tidy and clang-format applied reland
row_neon.cc manually editted for clang format bugs

TBR=braveyao@chromium.org

Bug: None
Test: local arm builds still pass
Change-Id: Ida4aac2f4ee354e2c1bd354b06e76a26b3c0becc
Reviewed-on: https://chromium-review.googlesource.com/930165
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-21 23:30:38 +00:00
Frank Barchard
9c0663d7ce Revert "clang-tidy and clang-format applied"
This reverts commit cfff527a4738cbd125f788937c503558d225d9fa.

Reason for revert: <INSERT REASONING HERE>

Original change's description:
> clang-tidy and clang-format applied
> 
> TBR=braveyao@chromium.org
> Bug: None
> Test: local arm builds still pass
> Change-Id: Iac042fbaad940e01fc4ce228a104d3d561b80f92
> Reviewed-on: https://chromium-review.googlesource.com/929999
> Reviewed-by: Frank Barchard <fbarchard@chromium.org>

TBR=fbarchard@chromium.org,braveyao@chromium.org

Change-Id: I4ee92ceeaa3c34bce3f20bf759dd30593807ad3f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: None
Reviewed-on: https://chromium-review.googlesource.com/930141
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-21 23:21:07 +00:00
Frank Barchard
cfff527a47 clang-tidy and clang-format applied
TBR=braveyao@chromium.org
Bug: None
Test: local arm builds still pass
Change-Id: Iac042fbaad940e01fc4ce228a104d3d561b80f92
Reviewed-on: https://chromium-review.googlesource.com/929999
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-21 22:44:53 +00:00
Frank Barchard
9c9215b218 End swap 10 bit RGB
Bug: libyuv:777
Test: None
Change-Id: I69b81f51c50d7739cfdb3cfb0c3d315c32bd63d2
Reviewed-on: https://chromium-review.googlesource.com/923042
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-02-15 23:50:40 +00:00
Frank Barchard
6630558875 10 bit YUV to 10 bit BGR
BGR variation of 10 bit conversion using swapped U and V
and mirrored matrix to produce AB30 format instead of AR30.

Bug: libyuv:777
Test: LibYUVConvertTest.H010ToAB30_Opt
Change-Id: I96d115a5d1e12138f40cb548871e03aa3ab210eb
Reviewed-on: https://chromium-review.googlesource.com/922284
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-02-15 22:44:36 +00:00
Frank Barchard
8a00c2bb4d Tidy applied with all safe checks on all arm, mips and intel, 32 and 64 bit
Using clang-tidy 7.
warnings=-*,mpi-*,objc-*,llvm-*,hicpp-*,-hicpp-use-noexcept,llvm-*,-hicpp-deprecated-headers,-hicpp-use-auto,bugprone-*,cert-*,google-*,-google-readability-casting,misc-*,,-misc-unused-parameters,-misc-macro-parentheses,cppcoreguidelines-*,-cppcoreguidelines-pro-type-member-init,readability-*,-readability-non-const-parameter,-readability-implicit-bool-conversion,fuchsia-*,-fuchsia-multiple-inheritance,-android-cloexec-*

~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D__ARM_NEON__ -D__arm__   -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D__mips_msa               -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D__aarch64__              -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D_MSC_VER=1600 -D_M_IX86  -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D_MSC_VER=1600 -D_M_X64   -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D__i386__                 -D__clang__ -D__clang_major__=6 -DHAVE_JPEG
~/bin/clang-tidy -fix-errors -format-style=file -checks=$warnings $* -- -Iinclude -D__x86_64__               -D__clang__ -D__clang_major__=6 -DHAVE_JPEG

Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: Ieb0f026c5b5a1d2daf8aca18b9290927fdaaa55c
Reviewed-on: https://chromium-review.googlesource.com/907853
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2018-02-12 18:34:33 +00:00
Frank Barchard
9a765f01bc Revert "tidy applied with readability-*"
This reverts commit 7b9ff4a0355c778f2cf03bdb15029d60a1259061.

Reason for revert: ios build bots are red

Original change's description:
> tidy applied with readability-*
> 
> TBR=braveyao@chromium.org
> Bug: libyuv:750
> Test: builds and runs and passes more tidy tests
> Change-Id: I316822f7d13b370b88b92a693912e880b21f92c8
> Reviewed-on: https://chromium-review.googlesource.com/907371
> Reviewed-by: Frank Barchard <fbarchard@chromium.org>

TBR=fbarchard@chromium.org,braveyao@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: libyuv:750
Change-Id: I4a73ffee2b71664c6cb93f38f2b5d70ebd76953e
Reviewed-on: https://chromium-review.googlesource.com/912175
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-02-09 19:41:26 +00:00
Frank Barchard
7b9ff4a035 tidy applied with readability-*
TBR=braveyao@chromium.org
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I316822f7d13b370b88b92a693912e880b21f92c8
Reviewed-on: https://chromium-review.googlesource.com/907371
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-08 18:13:01 +00:00
Frank Barchard
e1f6c1c0b5 tidy applied with readability-inconsistent-declaration-parameter-name
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2018-02-07 00:24:25 +00:00
Frank Barchard
ff8ab9baf1 AR30ToABGR for 10 to 8 bit RGB on Android
ABGR is the more common format on Android.
This CL converts 10 bit AR30, to standard 8 bit ABGR.
Unoptimized but allows better testing and feature completeness.

Bug: libyuv:751
Test: LibYUVConvertTest.AR30ToABGR_Opt
Change-Id: I0c7e7273158be215129e0a1d355587ae15942299
Reviewed-on: https://chromium-review.googlesource.com/891694
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-29 22:21:42 +00:00
Frank Barchard
92e22cf5b6 Lint cleanup after C99 change CL
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-24 19:16:03 +00:00
Frank Barchard
7e389884a1 Switch to C99 types
Append _t to all sized types.
uint64 becomes uint64_t etc

Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-23 19:16:05 +00:00
Frank Barchard
09db0c4ce2 H010ToAR30 in 1 step with SSSE3 assembly
Switch YUV conversion macro to output 16 bits per channel.
STOREAR30 macro to output AR30.

[ RUN      ] LibYUVConvertTest.TestH420ToARGB
uniques: B 220, G, 220, R 220
[       OK ] LibYUVConvertTest.TestH420ToARGB (0 ms)
[ RUN      ] LibYUVConvertTest.TestH010ToARGB
uniques: B 256, G, 256, R 256
[       OK ] LibYUVConvertTest.TestH010ToARGB (0 ms)
[ RUN      ] LibYUVConvertTest.TestH010ToAR30
uniques: B 883, G, 883, R 883
[       OK ] LibYUVConvertTest.TestH010ToAR30 (0 ms)

Bug: libyuv:751
Test: LibYUVConvertTest.H010ToAR30_Opt
Change-Id: I902b718e2c8b68ede69625ccafebc6519d5af70d
Reviewed-on: https://chromium-review.googlesource.com/869511
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-19 19:46:58 +00:00
Frank Barchard
00d526d4ea H010ToARGB_AVX2 optimized conversion
AVX2 optimized 10 bit YUV to ARGB.

Bug: libyuv:751
Test: H010ToARGB unittest
Change-Id: I705630beb62714b52042c2a5dcdb8b7859e734ae
Reviewed-on: https://chromium-review.googlesource.com/852563
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-01-09 03:17:33 +00:00
Frank Barchard
50f9e618fa Add H010ToABGR, I010ToABGR and I010ToARGB functions
ABGR output is implemented using the same source code as ARGB, by swapping
the u and v and supplying the mirrored conversion matrix.
ABGR format (RGBA in memory) is popular on Android.

Bug: libyuv:751
Test: H010ToABGR, I010ToABGR and I010ToARGB unittests

Change-Id: I0b5103628c58dcb22a6442c03814d4d5972e0339
Reviewed-on: https://chromium-review.googlesource.com/852985
Commit-Queue: Miguel Casas <mcasas@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-08 17:40:33 +00:00
Frank Barchard
9d2cd6a3ef H010ToAR30 optimized to 2 step conversion
Previously H010ToAR30 was done in a 3 step conversion:
H010ToH420, H420ToARGB, ARGBToAR30.
This CL merges the first 2 steps into H010ToARGB, to
improve performance.
Caveat - only 10 bit YUV is supported at this time.
Previously the low level code supported different numbers
of bits - 9, 10, 12 or 16.

Was 3 step conversion:
LibYUVConvertTest.H010ToAR30_Any (1263 ms)
LibYUVConvertTest.H010ToAR30_Unaligned (951 ms)
LibYUVConvertTest.H010ToAR30_Invert (913 ms)
LibYUVConvertTest.H010ToAR30_Opt (901 ms)

Now 2 step conversion:
LibYUVConvertTest.H010ToAR30_Any (853 ms)
LibYUVConvertTest.H010ToAR30_Unaligned (811 ms)
LibYUVConvertTest.H010ToAR30_Invert (781 ms)
LibYUVConvertTest.H010ToAR30_Opt (755 ms)

Bug: libyuv:751
Test: LibYUVConvertTest.H010ToAR30_Opt
Change-Id: Ica7574040401cd57145a4827acdf3c0e58346a2a
Reviewed-on: https://chromium-review.googlesource.com/853288
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-07 08:36:57 +00:00
Frank Barchard
a64658593e I210ToARGB conversion from 10 bit YUV to RGB
SSSE3 optimized 10 bit YUV conversion to ARGB in single step.

Bug: libyuv:751
Test:  I010ToARGB
Change-Id: I234b2850e35992113ee6bd638732bafc7010a60d
Reviewed-on: https://chromium-review.googlesource.com/848238
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-01-05 02:43:38 +00:00
Frank Barchard
790054ff03 Add AR30ToARGB function
Initial AR30ToARGB function to allow converion
from AR30 to other formats if necessary and/or
for testing.
Not optimized at this point.

Bug: libyuv:751
Test: LibYUVConvertTest.AR30ToARGB_Opt
Change-Id: I38ef192315240f3caa7aee0218b38d5e88a2849f
Reviewed-on: https://chromium-review.googlesource.com/833025
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-12-19 01:54:42 +00:00
Frank Barchard
3b81288ece Remove Mips DSPR2 code
Bug: libyuv:765
Test: build for mips still passes
Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37
Reviewed-on: https://chromium-review.googlesource.com/826404
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2017-12-14 18:22:16 +00:00
Frank Barchard
c367751430 ARGBToAR30 SSSE3 use pmulhuw to replicate fields
AR30 is optimized with 3 techniques
1. pmulhuw is used to replicate 8 bits to 10 bits.
2. Two channels are processed at a time.  R and B, and A and G.
3. pshufb is used to shift and mask 2 channels of R and B

Bug: libyuv:751
Test: ARGBToAR30_Opt
Change-Id: I4e62d6caa4df7d0ae80395fa911d3c922b6b897b
Reviewed-on: https://chromium-review.googlesource.com/822520
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2017-12-12 20:12:58 +00:00
Frank Barchard
0f98c3c1df Add ARGBToAR30Row_SSE2 to speed up H010ToAR30
Port ARGBToAR30Row_AVX2 to ARGBToAR30Row_SSE2 using same instructions
but xmm registers and doing half as many pixels per loop.

Bug: libyuv:751
Test: LibYUVConvertTest.ARGBToAR30_Opt
Change-Id: Id644e54639133d1caf28ea3cd11ff6ab6891a673
Reviewed-on: https://chromium-review.googlesource.com/817918
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-12-09 00:11:20 +00:00
Frank Barchard
aabe380890 H010ToAR30 and H010ToARGB optimized YUV buffering
Reduce allocations of row buffers to 1 alloc/free.
Do 2 rows at a time to avoid converting U and V planes twice.

Bug: libyuv:715
Test: LibYUVConvertTest.H010ToAR30_Opt
Change-Id: I2f3a03b4875df5e3b969112a78a1a0b28399fa2f
Reviewed-on: https://chromium-review.googlesource.com/816021
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-12-08 18:55:03 +00:00
Frank Barchard
3541e46a7e Add H010ToARGB for 10 bit YUV to ARGB
Bug: libyuv:751
Test:  LibYUVConvertTest.H010ToARGB_Opt
Change-Id: I668d3f3810e59a4fb6611503aae1c8edc7d596e7
Reviewed-on: https://chromium-review.googlesource.com/815015
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-12-07 20:17:50 +00:00
Frank Barchard
49d9b1039b NV21ToABGR for Android camera conversions
Bug: libyuv:762
Test: NV21ToABGR unittest
Change-Id: I71448ab83930339083f07eeafccf240c6cb41c48
Reviewed-on: https://chromium-review.googlesource.com/795212
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-11-30 20:29:28 +00:00
Frank Barchard
324fa32739 Convert16To8Row_SSSE3 port from AVX2
H010ToAR30 uses Convert16To8Row_SSSE3 to convert 10 bit YUV to 8 bit.
Then standard YUV conversion can be used.  This improves performance
on low end CPUs.
Future CL will by pass this conversion allowing for 10 bit YUV source,
but the function will be useful as a utility for YUV conversions.

Bug: libyuv:559, libyuv:751
Test: out/Release/libyuv_unittest --gtest_filter=*H010ToAR30* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: I9b3ef22d88a5fd861de4cf1900b4c6e8fd24d0af
Reviewed-on: https://chromium-review.googlesource.com/792334
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2017-11-28 19:22:39 +00:00
Lei Zhang
8445617191 Mark a bunch of kArray variables as const.
This allows the linker to move the variables from the .data section to
the .rodata section.

Bug: libyuv:254
Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1

Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0
Reviewed-on: https://chromium-review.googlesource.com/777033
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2017-11-27 23:38:44 +00:00
Frank Barchard
26173eb73e H010ToAR30 for 10 bit bt.709 YUV to 30 bit RGB
This version of the H010ToAR30 provides a 3 step conversion
Convert16To8Row_AVX2
H420ToARGB_AVX2
ARGBToAR30_AVX2

Low level function added to convert 16 bit to 8 bit using multiply
to adjust 10 bit or other bit depths and then save the upper 16 bits.

Bug: libyuv:751
Test: LibYUVPlanarTest.Convert16To8Row_Opt unittest added
Change-Id: I9cc576fda8afa1003cb961d03e0e656e0b478f03
Reviewed-on: https://chromium-review.googlesource.com/783554
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-11-22 23:58:30 +00:00
Manojkumar Bhosale
eed66b2028 Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions
BUG=libyuv:634

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be

Performance Gain (vs C auto-vectorized)
I444ToARGBRow_MSA       - ~1.6x
I444ToARGBRow_Any_MSA   - ~1.6x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.3x
J400ToARGBRow_MSA       - ~1.0x
J400ToARGBRow_Any_MSA   - ~1.0x
YUY2ToARGBRow_MSA       - ~1.6x
YUY2ToARGBRow_Any_MSA   - ~1.6x
UYVYToARGBRow_MSA       - ~1.6x
UYVYToARGBRow_Any_MSA   - ~1.6x

Performance Gain (vs C non-vectorized)
I444ToARGBRow_MSA       - ~7.3x
I444ToARGBRow_Any_MSA   - ~7.1x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.2x
J400ToARGBRow_MSA       - ~6.8x
J400ToARGBRow_Any_MSA   - ~5.7x
YUY2ToARGBRow_MSA       - ~7.2x
YUY2ToARGBRow_Any_MSA   - ~7.0x
UYVYToARGBRow_MSA       - ~7.1x
UYVYToARGBRow_Any_MSA   - ~6.9x

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Reviewed-on: https://chromium-review.googlesource.com/439246
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-21 23:22:07 +00:00
Manojkumar Bhosale
09b8c971b3 Add MSA optimized NV12/21 To RGB row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
NV12ToARGBRow_MSA       - ~1.5x
NV12ToARGBRow_Any_MSA   - ~1.4x
NV12ToRGB565Row_MSA     - ~1.4x
NV12ToRGB565Row_Any_MSA - ~1.4x
NV21ToARGBRow_MSA       - ~1.5x
NV21ToARGBRow_Any_MSA   - ~1.5x
SobelRow_MSA            - ~4.3x
SobelRow_Any_MSA        - ~3.4x
SobelToPlaneRow_MSA     - ~8.0x
SobelToPlaneRow_Any_MSA - ~4.7x
SobelXYRow_MSA          - ~3.0x
SobelXYRow_Any_MSA      - ~2.5x

Performance Gain (vs C non-vectorized)
NV12ToARGBRow_MSA       - ~6.5x
NV12ToARGBRow_Any_MSA   - ~6.5x
NV12ToRGB565Row_MSA     - ~6.2x
NV12ToRGB565Row_Any_MSA - ~6.1x
NV21ToARGBRow_MSA       - ~6.5x
NV21ToARGBRow_Any_MSA   - ~6.5x
SobelRow_MSA            - ~14.5x
SobelRow_Any_MSA        - ~11.3x
SobelToPlaneRow_MSA     - ~34.2x
SobelToPlaneRow_Any_MSA - ~19.4x
SobelXYRow_MSA          - ~11.1x
SobelXYRow_Any_MSA      - ~9.1x

Review-Url: https://codereview.chromium.org/2636483002 .
2017-01-18 09:24:39 +05:30
Manojkumar Bhosale
73a6f100a9 Add MSA optimized rotate functions (used 16x16 transpose)
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.7x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Performance Gain (vs C non-vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.8x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Review-Url: https://codereview.chromium.org/2617703002 .
2017-01-13 15:50:02 +05:30
Manojkumar Bhosale
7c64163ff4 Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGB1555ToARGBRow_MSA     - 1.85
ARGB1555ToARGBRow_Any_MSA - 1.82
RGB565ToARGBRow_MSA       - 2.14
RGB565ToARGBRow_Any_MSA   - 2.08
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 2.60
ARGB1555ToYRow_Any_MSA    - 2.47
RGB565ToYRow_MSA          - 2.45
RGB565ToYRow_Any_MSA      - 2.33
RGB24ToYRow_MSA           - 2.23
RGB24ToYRow_Any_MSA       - 2.01
RAWToYRow_MSA             - 2.25
RAWToYRow_Any_MSA         - 2.02
ARGB1555ToUVRow_MSA       - 1.40
ARGB1555ToUVRow_Any_MSA   - 1.37
RGB565ToUVRow_MSA         - 1.68
RGB565ToUVRow_Any_MSA     - 1.63
RGB24ToUVRow_MSA          - 3.02
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.85

Performance Gain (vs C non-vectorized)
ARGB1555ToARGBRow_MSA     - 4.66
ARGB1555ToARGBRow_Any_MSA - 4.45
RGB565ToARGBRow_MSA       - 5.58
RGB565ToARGBRow_Any_MSA   - 5.34
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 6.38
ARGB1555ToYRow_Any_MSA    - 5.98
RGB565ToYRow_MSA          - 6.42
RGB565ToYRow_Any_MSA      - 6.05
RGB24ToYRow_MSA           - 7.87
RGB24ToYRow_Any_MSA       - 7.01
RAWToYRow_MSA             - 7.98
RAWToYRow_Any_MSA         - 7.01
ARGB1555ToUVRow_MSA       - 5.39
ARGB1555ToUVRow_Any_MSA   - 5.06
RGB565ToUVRow_MSA         - 6.39
RGB565ToUVRow_Any_MSA     - 5.90
RGB24ToUVRow_MSA          - 3.04
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.88

Review-Url: https://codereview.chromium.org/2600713002 .
2017-01-13 15:43:37 +05:30
Frank Barchard
000d2fa91a Libyuv MIPS DSPR2 optimizations.
Optimized functions:

I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2

Bug-fixes in functions:

ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2

BUG=

Review-Url: https://codereview.chromium.org/2626123003 .
2017-01-11 12:19:13 -08:00
Manojkumar Bhosale
a899dea251 Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA          - ~1.1x
ARGBAttenuateRow_Any_MSA      - ~1.1x
ARGBToRGB565DitherRow_MSA     - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA            - ~5.1x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~1.1x
ARGBGrayRow_MSA               - ~2.6x
ARGBSepiaRow_MSA              - ~11.6x

Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA          - ~2.46x
ARGBAttenuateRow_Any_MSA      - ~2.45x
ARGBToRGB565DitherRow_MSA     - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA            - ~5.2x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~4.3x
ARGBGrayRow_MSA               - ~10.5x
ARGBSepiaRow_MSA              - ~12.2x

Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
532f5708a9 Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA      : ~1.4x
I422AlphaToARGBRow_Any_MSA  : ~1.4x
I422ToRGB24Row_MSA          : ~4.8x
I422ToRGB24Row_Any_MSA      : ~4.8x

Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA      : ~7.0x
I422AlphaToARGBRow_Any_MSA  : ~7.0x
I422ToRGB24Row_MSA          : ~7.9x
I422ToRGB24Row_Any_MSA      : ~7.7x

Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
f5d5bd88d6 Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gains :- (vs C vectorized)

I422ToARGBRow_MSA     : ~1.6x
I422ToRGBARow_MSA     : ~1.6x

I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x

Performance Gains :- (vs C non-vectorized)

I422ToARGBRow_MSA     : ~7x
I422ToRGBARow_MSA     : ~7x

I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x

Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.

Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
Frank Barchard
78c58ab8aa Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains : (Auto-vectorized C vs MSA SIMD)

ARGB4444ToYRow_MSA        : ~3.0x
ARGB4444ToUVRow_MSA       : ~1.8x
ARGB4444ToARGBRow_MSA     : ~3.4x

ARGB4444ToYRow_Any_MSA    : ~2.8x
ARGB4444ToUVRow_Any_MSA   : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x

Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
d363ea6527 Remove I411 support.
YUV 411 is very uncommon format.  Remove support.

Update documentation to reflect that 411 is deprecated.

Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.

BUG=libyuv:645
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
7edf572e28 remove includes for duplicate functions
R=harryjin@google.com
BUG=libyuv:592
TESTED=local builds work with fewer headers

Review URL: https://codereview.chromium.org/2006943002 .
2016-05-23 17:38:26 -07:00
Frank Barchard
0d880e5bc0 rename MIPS_DSPR2 to DSPR2 for consistency
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor.  This should be named consistently.

Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.

TBR=harryjin@google.com
BUG=libyuv:562

Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
d95d2169d9 rename yuv matrix constants to be more clear about what they are
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1429693006 .
2015-11-03 17:09:53 -08:00
Frank Barchard
5d97b93369 refactor I420ToABGR to use I420ToARGBRow
Using a transposed conversion matrix, I420ToARGB can output ABGR.

R=harryjin@google.com, xhwang@chromium.org
BUG=libyuv:473

Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
b86dbf24d3 refactor I420AlphaToABGR to use I420AlphaToARGB internally
swap U and V and transpose conversion matrix, so I420AlphaToARGB and
I420AlphaToABGR share low level code.

Having less code with same performance allows more focused
optimization for future ARM versions.

R=harryjin@google.com
TBR=harryjin@chromium.org
BUG=libyuv:473,libyuv:516

Review URL: https://codereview.chromium.org/1422263002 .
2015-10-27 14:17:21 -07:00
Frank Barchard
cf160cdbaa implement I444ToABGR by swapping uv and transpose matrix
U contributes to B and G.  V contributes to R and G.
By swapping U and V, they contribute to the opposite channels.  Adjust the matrix so the U contribution is in the matrix location such that it till contribute to the
new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.

As a result the existing I444ToABGRRow functions are no longer needed and are removed.

Previously this function was only Intel AVX2 optimized for Windwos.  Now it is also optimized for Arm and GCC.

ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.

R=xhwang@chromium.org
BUG=libyuv:515

Review URL: https://codereview.chromium.org/1414133006 .
2015-10-27 10:21:21 -07:00
Frank Barchard
76a599ec3b fix jpeg and bt.709 yuvconstants for neon64.
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces.  But the jpeg and bt.709
colour conversion constants were still in armv7 form.  This changes
the constants for aarch64 builds to be compatible with the code.

yuv constants are now passed as const *

Remove Yvu constants which were used for older version on nv21 but not new code.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
914a9856c7 Reimplement NV21ToARGB to allow different color matrix.
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V.  But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.

TBR=harryjin@google.com
BUG=libyuv:500

Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
f00bc9ef46 Add J444ToARGB conversion function.
J444 is JPeg YUV color space with 444 subsampling.
This implementation uses the existing I444ToARGB conversion, which is
BT.601 color space with 444 subsampling, but passing in the jpeg
color matrix constants.

TBR=harryjin@google.com
BUG=449

Review URL: https://codereview.chromium.org/1387313002 .
2015-10-06 18:46:53 -07:00
Frank Barchard
2cc1a2b233 Remove sse2 functions that also have ssse3
ARGBBlendRow_SSE2, ARGBAttenuateRow_SSE2, and MirrorRow_SSE2
Since vast majority of CPUs have SSSE3 now, removing the SSE2
improves the performance of CPU dispatching.

R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1377053003 .
2015-09-30 14:24:44 -07:00
Frank Barchard
e365cdde3b I420Alpha row function in 1 pass.
API change - I420AlphaToARGB takes flag indicating if RGB should be
premultiplied by alpha.

This version implements an efficient SSSE3 version for Windows.
C version done in 2 steps.

Was
libyuvTest.I420AlphaToARGB_Any (1136 ms)
libyuvTest.I420AlphaToARGB_Unaligned (1210 ms)
libyuvTest.I420AlphaToARGB_Invert (966 ms)
libyuvTest.I420AlphaToARGB_Opt (1031 ms)
libyuvTest.I420AlphaToABGR_Any (1020 ms)
libyuvTest.I420AlphaToABGR_Unaligned (1359 ms)
libyuvTest.I420AlphaToABGR_Invert (1082 ms)
libyuvTest.I420AlphaToABGR_Opt (986 ms)

R=harryjin@google.com
BUG=libyuv:496

Review URL: https://codereview.chromium.org/1367093002 .
2015-09-25 10:29:20 -07:00
Frank Barchard
f96890a0be yuvconstants for all YUV to RGB conversion functions.
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1363503002 .
2015-09-22 10:26:03 -07:00
Frank Barchard
28427a53e2 I444ToABGR for android
Reimplements I444ToARGB as a matrix function.
new I444ToABGR as matrix functions with wrappers and any functions.
Allows for future J444 and H444 versions.
I444ToABGR user level function added.

BUG=libyuv:490, libyuv:449
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1355733002 .
2015-09-18 11:20:58 -07:00
Frank Barchard
ed55d24d9f H420 functionality
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/54869004 .
2015-09-06 11:01:40 -07:00
Frank Barchard
7060e0d826 I420ToABGRMatrix functions with J420ToABGR wrapper.
Allows direct conversion from JPeg to ABGR for android.

BUG=libyuv:488
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/55719004 .
2015-09-03 10:42:36 -07:00
Frank Barchard
4dfdabb552 I420AlphaToABGR for android version of yuva conversion
Same as I420AlphaToARGB but first step converts to ABGR instead of ARGB.

TBR=harryjin@google.com
BUG=libyuv:473

Review URL: https://webrtc-codereview.appspot.com/52779004.
2015-08-20 19:36:59 -07:00
Frank Barchard
278d88f872 Copy Alpha odd width support
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/59369004.
2015-08-13 15:05:14 -07:00
Frank Barchard
8e7a62f22a I420AlphaToARGB conversion for planar YUV with Alpha to ARGB.
R=brucedawson@chromium.org, harryjin@google.com
BUG=libyuv:473

Review URL: https://webrtc-codereview.appspot.com/54829004.
2015-08-12 17:01:24 -07:00
fbarchard@google.com
bb5a009d11 ARGB4444ToARGB and ARGB1555ToARGB ported to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGB4444ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48009004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1363 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 23:52:57 +00:00
fbarchard@google.com
2827277496 port RGB565ToARGB to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1357 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 19:24:23 +00:00
fbarchard@google.com
92f7f421fd rename I400 to J400 and I400 reference to I400. J400 is a simple replication of values to convert to RGB, which is what the old I400 was. I400 reference is the Y part of the YUV formula, so renaming that to I400.
BUG=none
TESTED=libyuvTest (5925 ms total)
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/50369005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1333 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:01:18 +00:00
fbarchard@google.com
685b92b0a6 I400ToARGB_AVX2 port from SSE2 to AVX2.
BUG=403
TESTED=libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I400ToARGB*
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1322 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 18:12:17 +00:00
fbarchard@google.com
f5a7b2b48a I411ToARGB AVX2 version
BUG=403
TESTED=I411ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1321 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 00:08:56 +00:00
fbarchard@google.com
cdd80e04c9 Port I444ToARGB to AVX2.
BUG=403
TESTED=I444ToARGB unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1314 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-09 21:56:48 +00:00
fbarchard@google.com
d96047761e AVX2 version of NV12ToARGB
BUG=403
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40089004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1295 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:45:08 +00:00
fbarchard@google.com
239962fa00 YUY2 and UYVY to ARGB AVX2 versions via wrappers.
BUG=403
TESTED=UNTESTED
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1291 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 18:58:51 +00:00
kjellander@google.com
28d1a582ba Revert "YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions."
This reverts r1288 due to breaking compilation on bots:
http://build.chromium.org/p/client.libyuv/builders/Mac64%20Debug/builds/365
http://build.chromium.org/p/client.libyuv/builders/Linux64%20Debug/builds/667

TBR=fbarchard@google.com
TESTED=Reverted locally and all built fine again.

Review URL: https://webrtc-codereview.appspot.com/40879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1289 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-23 09:22:22 +00:00
fbarchard@google.com
b52606c024 YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions.
BUG=403
TESTED=convert unittest
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/40839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1288 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-21 00:49:35 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
3982998c7c YToARGB AVX2 port from SSE2
BUG=393
TESTED=YToARGB unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1258 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-03 01:35:11 +00:00
fbarchard@google.com
a5a15198b4 Add J422 support which is 2x1 subsampling with jpeg color space.
BUG=391
TESTED=color_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/41479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1228 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-14 19:16:01 +00:00
fbarchard@google.com
40e3457574 J420ToARGB jpeg variation of YUV color space to ARGB.
BUG=241
TESTED=J420ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32929004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1212 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-29 19:17:53 +00:00
fbarchard@google.com
9ed836b154 The 'Any' versions of functions can handle any width now, so remove the check from the calling code. This has 2 advantages - less code, and less overhead in calling function when any function is NOT used. Downside is more code for case where any is used.
BUG=373
TESTED=libyuv_unittest still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1143 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 23:29:31 +00:00
fbarchard@google.com
f713691a6f Change elif to endif and if to allow AVX2 as well as SSE2 in future changes instead of one or the other.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1122 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 20:47:22 +00:00
fbarchard@google.com
ca308327d2 Remove unaligned functions, since most function support unaligned memory now. This reduces complexity and improves performance for unaligned cases because C code can be avoided, and overhead is less. Downside is old cpus (core2 and earlier) will be slower for aligned memory case. Except mips, which has alignment requirement, but remove unaligned variant.
BUG=365
TESTED=unittest builds and passes locally
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1113 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 00:59:31 +00:00
fbarchard@google.com
8798e04075 Port conversion functions to c.
BUG=303
TESTED=cl /c /TC /Iinclude source\convert_from.cc source\convert_argb.cc source\convert_from_argb.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/17909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1030 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-08 18:44:57 +00:00
fbarchard@google.com
2a35da3912 Add ARGBToABGR and ARGBToBGRA as actual functions instead of macros.
BUG=334
TESTED=libyuv unittests pass
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/12659006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1008 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-02 19:24:57 +00:00
fbarchard@google.com
a1f5254a95 Switch to c style casts for all source and includes.
BUG=303
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@952 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 03:03:00 +00:00
fbarchard@google.com
d9c9f37ac4 Conversions use malloc for row buffers.
BUG=296
TESTED=libyuv convert_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@928 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 02:00:30 +00:00
fbarchard@google.com
095f33d870 Coalesce rows by changing width/height and dropping into code instead of recursing. Improve coalesce by setting stride to 0 so it can be used even on odd width images. Reduce unittests to improve time to run emulators.
BUG=277
TEST=unittests all build and pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@819 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 19:29:10 +00:00
fbarchard@google.com
f2aa91a1ac replace static const with static to avoid internal compiler error with gcc
BUG=258
TEST=try bots
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/1944004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@743 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-02 17:48:24 +00:00
fbarchard@google.com
465a5583ef use CONST macro for OSX.
BUG=254
TEST=none
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/1942004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@742 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-02 00:44:29 +00:00