Frank Barchard
f2978400d5
Document AR30 format
...
Bug: libyuv:751
Test: none
Change-Id: If6d5e7b9c5e6e8d2a272e03ce5a1cc199ef364ca
Reviewed-on: https://chromium-review.googlesource.com/779980
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-11-20 22:05:45 +00:00
Frank Barchard
12c904a97c
H420ToRAW and H420ToRGB24 added for bt.709 support.
...
Bug: libyuv:760
Test: LibYUVConvertTest.H420ToRAW_Opt
Change-Id: I050385f477309d5db02bb2218088f224c83392ed
Reviewed-on: https://chromium-review.googlesource.com/775785
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2017-11-17 01:20:05 +00:00
Frank Barchard
46594be758
add ScalePlane_16 unit tests
...
Tests ScalePlane vs ScalePlane_16 match.
Bug: libyuv:749
Test: LibYUVScaleTest.ScalePlaneDownBy4_Box_16
Change-Id: I3f71748da404982d5d48bfb11bbd3ae95a1d021c
Reviewed-on: https://chromium-review.googlesource.com/765045
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2017-11-16 01:40:48 +00:00
Frank Barchard
49d1e3b036
MultiplyRow_16_AVX2 for converting 10 bit YUV
...
When converting from lsb 10 bit formats to msb, the values
need to be shifted to the top 10 bits. Using a multiply
allows the different numbers of bits to be copied:
// 128 = 9 bits
// 64 = 10 bits
// 16 = 12 bits
// 1 = 16 bits
Bug: libyuv:751
Test: LibYUVPlanarTest.MultiplyRow_16_Opt
Change-Id: I9cf226053a164baa14155215cb175065b1c4f169
Reviewed-on: https://chromium-review.googlesource.com/762951
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-10 22:02:32 +00:00
Frank Barchard
2f58d126b9
MergeUV10Row_AVX2 use multiply to handle different bit depths
...
Instead of hardcoded shift, use a multiply by a parameter.
128 = 9 bits
64 = 10 bits
16 = 12 bits
1 = 16 bits
Bug: libyuv:751
Test: LibYUVPlanarTest.MergeUV10Row_Opt
Change-Id: Id925edfdbf91243370c90641b50eb8e7625ec329
Reviewed-on: https://chromium-review.googlesource.com/762523
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-10 03:38:07 +00:00
Frank Barchard
e26b0a7e0e
casting for c89 compatibility and lint cleanup
...
Bug: libyuv:756
Test: CFLAGS="-m32 -static -std=gnu89 -mno-sse -O2" CXXFLAGS="-m32 -x c -static -std=gnu99 -mno-sse -O2" make -f linux.mk libyuv.a
Change-Id: Ic362f93e01ccbb0bea14f361a58585e79297e7d2
Reviewed-on: https://chromium-review.googlesource.com/759423
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Patrik Höglund <phoglund@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-09 18:22:17 +00:00
Frank Barchard
735ace2ed3
Re-enable x86 assembly without requiring -msse2
...
clang does not require -msse2 or -msse for inline, except
the "x" parameter. So change this to "m" for 32 bit. 64 bit
requires sse2 so use "x" for 64 bit.
gcc requires -msse for xmm registers in clobber list.
Reduce compiler requirement from -msse2 to -msse for enabling
assembly.
Bug: libyuv:754, libyuv:757
Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk
Change-Id: I86df72cfee80b7d349561c1fd7c97ad360767255
Reviewed-on: https://chromium-review.googlesource.com/759303
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-09 00:51:06 +00:00
Frank Barchard
68f852d835
Remove DISABLE_CLANG_MSA
...
cleanup to remove ifdefs around functions affected by
a clang bug.
gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
ninja -v -C out/Release libyuv_unittest
Bug: libyuv:634
Test: build for mips with clang
Change-Id: I278b368dbb2fe89082240e280267d0a27a214c78
Reviewed-on: https://chromium-review.googlesource.com/757980
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-08 19:55:14 +00:00
Frank Barchard
d997ac287d
Revert "Enable SSE2 code without -msse"
...
This reverts commit 01e994d74e4e3937ee1a3efdc048320a1e51f818.
Change-Id: Ie76710d0f4e641e071889c5125fd3be23cdcdb59
Reviewed-on: https://chromium-review.googlesource.com/758499
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-11-08 19:33:09 +00:00
Frank Barchard
01e994d74e
Enable SSE2 code without -msse
...
Bug: libyuv:754
Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk
Change-Id: I74bf8d032013694e65ea7637bc38d3253db53ff2
Reviewed-on: https://chromium-review.googlesource.com/758043
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-11-08 02:54:41 +00:00
Frank Barchard
522fd699e6
AVX512 feature detects for cnl and icl
...
Key instruction sets added for each microarchitecture:
AVX512BW, AVX512VL, AVX512DQ - skylake server or later
AVX512_VBMI, AVX512_IFMA - cannon lake or later
AVX512_BITALG, AVX512_VBMI2, AVX512_VPOPCNTDQ, AVX512_VNNI, GFNI, VAES, VPCLMULQDQ - ice lake or later
Bug: libyuv:752
Test: ~/intelsde/sde -icl -- out/Release/libyuv_unittest --gtest_filter=*Cpu*
Change-Id: I9ee28904c90009d66721b9f805a440c5fc2da122
Reviewed-on: https://chromium-review.googlesource.com/755617
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-11-07 00:56:37 +00:00
Frank Barchard
a0c32b9e49
MergeUV10Row_AVX2 for converting H010 to P010
...
H010 is 10 bit planar format with 10 bits in lower bits.
P010 is 10 bit biplanar format with 10 bits in upper bits.
This function weaves the U and V channels and shifts the bits
into the upper bits.
Bug: libyuv:751
Test: LibYUVPlanarTest.MergeUV10Row_Opt
Change-Id: I4a0bac0ef1ff95aa1b8d68261ec8e8e86f2d1fbf
Reviewed-on: https://chromium-review.googlesource.com/752692
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-11-03 18:55:36 +00:00
Frank Barchard
80077a80c2
HammingDistance_X86 using popcnt assembly
...
popcnt has a fake dependency on the destination.
This assembly avoids the dependency by using a different
register for each popcnt.
Bug: libyuv:701
Test: LIBYUV_DISABLE_SSSE3=1 out/Release/libyuv_unittest --gtest_filter=*Ham*Opt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: Ie1d202e2613b7fa8a3c02acd433940e92c80eafa
Reviewed-on: https://chromium-review.googlesource.com/731826
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-23 21:15:12 +00:00
Frank Barchard
8fa02df3c0
mingw fix ifdefs to use gcc source
...
mingw gcc sets the macro _M_IX86 which is normally only set
by Visual C and clangcl which are Visual C style source code
style for assembly, but gcc is not Visual C compatible.
Add _MSC_VER to most ifdefs to detect that its really Visual C
or clangcl and not mingw gcc so the gcc source code will be used.
Bug: libyuv:744
Test: CXXFLAGS=-m32 CXX=~/prebuilts/gcc/linux-x86/host/x86_64-w64-mingw32-4.8/bin/x86_64-w64-mingw32-g++ make -f linux.mk
Change-Id: I3431aa486eb769b145faa8d5eb75ed639f9d6f5e
Reviewed-on: https://chromium-review.googlesource.com/722319
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-17 17:36:35 +00:00
Frank Barchard
1cebe2c622
TestHammingDistance_Opt to test low level matches C reference.
...
The low level hamming distance functions have size limitations
based on counter sizes. The higher level calls the low level
in blocks that avoid overflow and then accumulators in int64.
This test compares the results of the low levels to the high
level and against a known value (all ones) to ensure the
count is correct for any specified size.
The the size is very large, the result is expected to be
different.
Bug: libyuv:701
Test: TestHammingDistance_Opt
Change-Id: I6716af7cd09ac4d88a8afa25bc845a1b62af7c93
Reviewed-on: https://chromium-review.googlesource.com/710800
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-11 20:21:31 +00:00
Frank Barchard
60f433fbd9
Revert "ComputeHammingDistance reduce SIMD loop to 1 call when possible."
...
This reverts commit ec75df5894845b8d6b1341885a78db1de83decd8.
Reason for revert: <INSERT REASONING HERE>
Original change's description:
> ComputeHammingDistance reduce SIMD loop to 1 call when possible.
>
> 32 bit x86 has high overhead due to -fpic. So this reduces the
> number of calls by 1.
>
> TBR=kjellander@chromium.org
> Bug: libyuv:701
> Test: BenchmarkHammingDistance
> Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051
> Reviewed-on: https://chromium-review.googlesource.com/701755
> Reviewed-by: Frank Barchard <fbarchard@google.com>
> Reviewed-by: Cheng Wang <wangcheng@google.com>
TBR=rrwinterton@gmail.com ,fbarchard@google.com,wangcheng@google.com
Change-Id: Ia61e8558a8f083c14be5f51e0e141550b6f2b5c1
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: libyuv:701
Reviewed-on: https://chromium-review.googlesource.com/707823
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-10 01:16:15 +00:00
Frank Barchard
ec75df5894
ComputeHammingDistance reduce SIMD loop to 1 call when possible.
...
32 bit x86 has high overhead due to -fpic. So this reduces the
number of calls by 1.
TBR=kjellander@chromium.org
Bug: libyuv:701
Test: BenchmarkHammingDistance
Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051
Reviewed-on: https://chromium-review.googlesource.com/701755
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-10-09 22:51:23 +00:00
Frank Barchard
1734712a6f
Fix odd length HammingDistance
...
If length of HammingDistance was not a multiple of 4,
the result was incorrect. The old tests did not catch this
so a new test is done to count 1s.
Bug: libyuv:740
Test: LibYUVCompareTest.TestHammingDistance
Change-Id: I93db5437821c597f1f162ac263d4a594bb83231f
Reviewed-on: https://chromium-review.googlesource.com/699614
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-04 22:21:36 +00:00
Frank Barchard
fecd741794
Port HammingDistance to SSSE3
...
Bug: libyuv:701
Test: BenchmarkHammingDistance_Opt
Change-Id: Ibdd5d382677ebef4f82a62e0d5c3b88614a3b6e4
Reviewed-on: https://chromium-review.googlesource.com/696290
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-10-03 19:11:05 +00:00
Frank Barchard
bde789b176
Hamming Distance SSE2 and AVX2 optimized
...
Bug: None
Test: None
Change-Id: Id52663f9c957aac3172fba92d888ad1b041d5cf0
Reviewed-on: https://chromium-review.googlesource.com/692981
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-02 22:32:54 +00:00
Frank Barchard
efbf15754a
Step thru full color test by increments of 5 for better test speed.
...
Full color test is the slowest of the unittests, and not catching any
additional bugs at the moment. Step thru range of 0 to 255 in steps of
5 to speed up the test. 255 is 3 * 5 * 17, so any of those primes would
hit 0 and 255 exactly.
Was LibYUVColorTest.TestFullYUV (896 ms)
Now LibYUVColorTest.TestFullYUV (212 ms)
TBR=kjellander@chromium.org
Bug: libyuv:736
Test: LibYUVColorTest.TestFullYUV
Change-Id: I5b55fb07ada0dc7bdc3c3c20569d36bf09bb3804
Reviewed-on: https://chromium-review.googlesource.com/672064
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-09-19 02:01:53 +00:00
Frank Barchard
00c501fe43
Cast xgetbv from int64 to int to avoid Visual C warning.
...
TBR=kjellander@chromium.org
Bug: libyuv:735
Test: try bots
Change-Id: I00dc06689cd0a23847865c0c8edeb538b0cc81ac
Reviewed-on: https://chromium-review.googlesource.com/669142
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-09-15 22:00:52 +00:00
Frank Barchard
753a91cbcb
fix fmov build error on gcc 4.7 for neon64
...
TBR=kjellander@chromium.org
BUG=libyuv:732
TEST=LibYUVPlanarTest.TestScaleSumSamples_Opt
Change-Id: If80e9510ad5668b080b9384e656c0bd73cf5b4a6
Reviewed-on: https://chromium-review.googlesource.com/663764
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-09-12 22:46:33 +00:00
Frank Barchard
1e16cb5c38
SplitRGBPlane and MergeRGBPlane functions added
...
Converts packed RGB to planar and back.
TBR=kjellander@chromium.org
BUG=libyuv:728
TEST=MergeRGBPlane_Opt and SplitRGBPlane_Opt unittests added
Change-Id: Ida59af940afcb1fc4a48bbf62c714f592665c3cc
Reviewed-on: https://chromium-review.googlesource.com/658069
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-09-11 21:02:04 +00:00
Frank Barchard
367c0d8f81
enable MSA for clang
...
clang version 6.0.0 (trunk 310694) is able to compile MSA code.
Previous versions had an issue with _msa_fill_w(v32)
In this CL the macro DISABLE_CLANG_MSA is not set, allowing clang
to build the full MSA source.
TBR=kjellander@chromium.org
BUG=libyuv:715
TEST=gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
ninja -v -C out/Release libyuv_unittest
Change-Id: I47401e3b1a3e4c57d9626ec2d3cd131c3ccf613c
Reviewed-on: https://chromium-review.googlesource.com/656501
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-09-07 23:50:12 +00:00
Manojkumar Bhosale
2621c91bf1
Add MSA optimized HammingDistance and SumSquareError functions
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: Id0126ba5aff38817525b1efa6044f1dc2cfa1a36
Reviewed-on: https://chromium-review.googlesource.com/625739
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-09-05 21:32:33 +00:00
Frank Barchard
0acc67712f
clang format / lint cleanup for arm scale functions
...
TBR=kjellander@chromium.org
BUG=libyuv:725
TEST=lint
Change-Id: I76f777427f9b1458faba12796fb0011d8e3228d5
Reviewed-on: https://chromium-review.googlesource.com/646586
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-31 22:41:08 +00:00
Manojkumar Bhosale
b6e8e9aa97
Add MSA optimized HalfFloatRow function
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: I54a2c57d66093b887c8ba31fd7a21a102165393a
Reviewed-on: https://chromium-review.googlesource.com/628557
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-29 18:40:08 +00:00
Frank Barchard
8cd3e4f3f2
Add MSA optimized ScaleFilterCols, ScaleARGBCols, ScaleARGBFilterCols and ScaleRowDown34 functions
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: Ib139b9701fc67e24d27a6886377c0cb8b2773fda
Reviewed-on: https://chromium-review.googlesource.com/620791
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-18 17:23:27 +00:00
Frank Barchard
78e44628c6
Add MSA optimized SplitUV, Set, MirrorUV, SobelX and SobelY row functions.
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: Ie2342f841f1bb8469fc4631b784eddd804f5d53e
Reviewed-on: https://chromium-review.googlesource.com/616765
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-17 18:39:22 +00:00
Frank Barchard
56bbcdf422
Reintroduce the max version of scale
...
add ScaleMaxSamples_NEON function with max
done on original values.
TBR=kjellander@chromium.org
BUG=libyuv:717
TEST=LibYUVPlanarTest.TestScaleMaxSamples_Opt
Change-Id: Id99338860782b10ffd24f66242eb42014c2e229e
Reviewed-on: https://chromium-review.googlesource.com/614685
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-14 23:33:56 +00:00
Manojkumar Bhosale
dbd7c1a9c5
Add MSA optimized ARGBExtractAlpha, ARGBBlend, ARGBQuantize and ARGBColorMatrix row functions
...
TBR=kjellander@chromium.org
R=fbarchard@google.com
Bug:libyuv:634
Change-Id: I17bd3f87336f613ad363af7d7b9d7af49d725e56
Reviewed-on: https://chromium-review.googlesource.com/613100
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-14 17:38:31 +00:00
Frank Barchard
83ca1abe09
Change ScaleSumSamples to return Sum of Squares
...
TBR=kjellander@chromium.org
BUG=libyuv:717
TEST=LibYUVPlanarTest.TestScaleSumSamples_Opt
Change-Id: I5208666f3968c5c4b0f1b0c951f24216d78ee3fe
Reviewed-on: https://chromium-review.googlesource.com/607184
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-09 22:19:45 +00:00
Frank Barchard
8676ad7004
scale float samples and return max value
...
BUG=libyuv:717
TEST=ScaleSum unittest to compare C vs Arm implementation
TBR=kjellander@chromium.org
Change-Id: Iaa7af5547d979aad4722f868d31b405340115748
Reviewed-on: https://chromium-review.googlesource.com/600534
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-04 23:34:30 +00:00
Frank Barchard
6d083e2d12
clang 6 build disable some msa functions
...
R=kjellander@chromium.org
Bug: libyuv:715
Test: gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
Change-Id: Ia3943b0afc02e05a8bc32350719b296b0b9d5479
Reviewed-on: https://chromium-review.googlesource.com/592720
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-03 17:44:35 +00:00
Frank Barchard
d8136924bd
Rename convert to yuvconvert for linux.mk
...
TBR=kjellander@chromium.org
BUG=None
TEST=make -f linux.mk
Change-Id: I747c2eb6ed03cacddf3265e65088472507f3436c
Reviewed-on: https://chromium-review.googlesource.com/581874
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-07-21 19:05:11 +00:00
Frank Barchard
db25485ee2
Move compare functions into a unittest class
...
BUG=None
TEST=LibYUVCompareTest.*
R=jkellander@chromium.org
Change-Id: I3131ca73020f855ead08255d09aa7a846bf0d556
Reviewed-on: https://chromium-review.googlesource.com/540064
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-06-19 19:39:10 +00:00
Frank Barchard
6c94ad13b5
Remove ARM NaCL macros from source
...
NaCL has been disabled for awhile, so the code
will still build, but only with C versions.
This change removes the MEMACCESS() macros from
Neon and Neon64 source.
BUG=libyuv:702
TEST=try bots build for arm.
R=kjellander@chromium.org
Change-Id: Id581a5c8ff71e18cc69595e7fee9337f97c44a19
Reviewed-on: https://chromium-review.googlesource.com/528332
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-09 22:22:07 +00:00
Frank Barchard
d981495b42
Hamming Distance using 16 bit accumulators
...
Summing 16 bit hamming codes restricts the maximum length,
but saves an inner loop instruction. The outer loop can sum the
values.
32 bit Neon
Now BenchmarkHammingDistance_Opt (78 ms)
Was BenchmarkHammingDistance_Opt (92 ms)
64 bit Neon
Now BenchmarkHammingDistance_Opt (85 ms)
Was BenchmarkHammingDistance_Opt (92 ms)
R=wangcheng@google.com
TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance
Change-Id: Ie40f0eac2f3339c33b833b42af5d394b122066ae
Reviewed-on: https://chromium-review.googlesource.com/526932
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-07 23:23:24 +00:00
Frank Barchard
baf5248242
HammingDistance_NEON ported to 32 bit
...
TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance
Change-Id: I252efd8a27aa11a0fe7d8030d7c8b57f20f04760
Reviewed-on: https://chromium-review.googlesource.com/525232
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-06 17:58:29 +00:00
Frank Barchard
44abf70187
ScaleDown odd functions adjust math so last pixel is half width source.
...
existing test passes
out/Release/libyuv_unittest --gtest_filter=*Blend* --libyuv_width=33 --libyuv_height=16
new test added
BUG=libyuv:705
TEST=LibYUVScaleTest.TestScaleOdd
Change-Id: Ica91812aee2e4ed9bcc18df4962b089c2e4ae704
Reviewed-on: https://chromium-review.googlesource.com/524932
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-06 01:37:26 +00:00
Frank Barchard
7bffe5e1c5
lint warning fixes for CpuID
...
The CpuId function is a wrapper for the intrinsic, or
implemented with inline if unavailable. It had been
using uint32, but the intrinsics use int, so it was causing
casting and lint warnings. This change makes the internal
implementation use int.
Casting was also done for xgetbv, and the cast is simply
removed, and is not causing a build error.
MipCpuCaps was doing strlen to check for white space after the
instruction set. Arm also does this but with a hard coded offset.
This was causing a cast from size_t to int, which produced a lint
warning. The change removes the white space detect.
In theory the code could be used to detect SSE vs SSE2, and it would
need to check SSE is followed by a space or end of line. But this
code is only used on Arm and Mips, where there there is one form
of SIMD detected. e.g. MSA for mips. If a new instruction set is
added with a similar name, the write space check could be reintroduced.
But its more likely the code can be rewritten to use a better form
of detection by then. Or remove detection and require the instructions
BUG=libyuv:641
TEST=try bots build on all platforms without error and lint is clean
Change-Id: I9f55f8e57bba0f78571bdddbe63b945dea3e8809
Reviewed-on: https://chromium-review.googlesource.com/514524
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Wan-Teh Chang <wtc@chromium.org>
2017-05-25 22:00:17 +00:00
Frank Barchard
8edd2286fd
MaskCpuFlags return cpuinfo so InitCpuFlags can call it
...
Reduce number of atomic references to cpu_info by making
InitCpuFlags call MaskCpuFlags and return the same value.
BUG=libyuv:641
TEST=libyuv_unittests pass
Change-Id: I5dfff8f7a10671bc8ef3ec0ed6f302791e752faa
Reviewed-on: https://chromium-review.googlesource.com/514145
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-24 22:27:03 +00:00
Frank Barchard
651ccc0c3a
Fix data races in libyuv::TestCpuFlag().
...
Detect the compiler's support of C11 atomics, and use C11 atomics when
available.
Note that libyuv::MaskCpuFlags() is still not thread-safe.
BUG=libyuv:641
TEST= cpu_thread_test.cc adds a pthread based test
R=wangcheng@google.com
Change-Id: If05b1e16da833105a0159ed67ef20f4e61bc7abd
Reviewed-on: https://chromium-review.googlesource.com/510079
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-24 02:09:03 +00:00
Frank Barchard
77f6916da2
use __popcnt for visual c HammingDistance_X86
...
BUG=libyuv:701
TEST=HammingDistance unittest performance is comparable to x64
R=wangcheng@google.com
Change-Id: I8abe861e086e0162ba4c7ba6f1ef7d1c006cd9d4
Reviewed-on: https://chromium-review.googlesource.com/505454
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-05-12 22:59:00 +00:00
Frank Barchard
e0615c0e69
Optimize Hamming Distance C code to do 64 bits at a time.
...
BUG=libyuv:701
TEST=LibYUVBaseTest.BenchmarkHammingDistance_C
R=wangcheng@google.com
Change-Id: I243003b098bea8ef3809298bbec349ed52a43d8c
Reviewed-on: https://chromium-review.googlesource.com/499487
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-12 17:53:52 +00:00
Frank Barchard
2136e349da
Hamming code difference of 2 memory blocks
...
BUG=libyuv:701
TEST=built and disassembled for aarch64
R=kjellander@chromium.org
Change-Id: I7712b1c7934e5dfb55fda1fa7c8405c32d6964ce
Reviewed-on: https://chromium-review.googlesource.com/495327
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-08 21:37:51 +00:00
Frank Barchard
945ea1b746
mips switch sgtu to sltu for clang in ndk r14
...
The verion of clang in ndk r14 (3.9) has a built in llvm assembler
that does not have the sgtu pseudo instruction.
sltu is the actual instruction, so switch the 2 operands and use
the instruction instead of the pseudo op.
BUG=libyuv:700
TEST=try bots build mips without error.
Change-Id: I2d5f94f81acbd56cdedea011e7d9308979e19079
Reviewed-on: https://chromium-review.googlesource.com/494026
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-05-02 21:34:13 +00:00
Vignesh Venkatasubramanian
54289f1bb0
Fix mips build on android ndk r14+
...
Revert the workaround and fix it properly by passing the
additional necessary flag to the compiler.
BUG=libyuv:700
Change-Id: I1c893a8acb5079decbee6963b689424bf2f99f4f
Reviewed-on: https://chromium-review.googlesource.com/487881
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-04-26 08:39:55 +00:00
Frank Barchard
3b583396bf
Disable CopyRow_MIPS
...
CopyRow_MIPS produces a compile error on some compilers.
TBR=kjellander@chromium.org
BUG=libyuv:700
TEST=try bots
Change-Id: Ie88f2006ef5cf14bffaf80fd4c0dd1caa409c569
Reviewed-on: https://chromium-review.googlesource.com/486127
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-04-25 01:31:13 +00:00
Frank Barchard
fc02cc3806
Add I422ToRGB565
...
BUG=libyuv:699
TESTED=LibYUVConvertTest.I420ToARGB_RGB565_Opt
Change-Id: I87943bcad056fbbe051301f45c7dc0ae0620c837
Reviewed-on: https://chromium-review.googlesource.com/478578
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-04-17 17:51:17 +00:00
Frank Barchard
bd0faedbd2
add libyuv_unittest to Android.mk
...
BUG=libyuv:698
TESTED=mm libyuv_unittest within android external/libyuv builds unittests
Change-Id: I4b5fed9f5af86c8a910f73b14053ef83f38431cc
Reviewed-on: https://chromium-review.googlesource.com/478572
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-04-14 23:46:27 +00:00
Frank Barchard
8cab2e31d7
I422ToRGB565 fix for odd widths
...
I422ToRGB565Row_Any_AVX2 uses 2 step row conversion that calls
I422ToARGBRow_AVX2 and then ARGBToRGB565.
I422ToARGBRow_AVX2 expects multiple of 16 pixels.
Adjust the I422ToRGB565Row_Any_AVX2 to do multiple of 16 with AVX2
and then remainder in a buffer.
Bug: libyuv: 657
Test: out/Release/libyuv_unittest --gtest_filter=*Convert*I*To* --libyuv_width=1280 --libyuv_height=720
Change-Id: Ice1cb6c7ff6b2295513e8b4a9f77522e1c659810
Reviewed-on: https://chromium-review.googlesource.com/474232
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-04-11 17:24:05 +00:00
Frank Barchard
2adb84e39e
make gflags command line parser optional
...
BUG=libyuv:691
TEST=gn gen out/Release "--args=is_debug=false target_cpu=\"x64\" libyuv_include_tests=true"
Change-Id: Ib481189be884c34d9bbc30bfcf71c7969c6f4dae
Reviewed-on: https://chromium-review.googlesource.com/452736
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-14 01:52:52 +00:00
Frank Barchard
e6fec061cf
lint cleanup for convert RGB24ToI420
...
RGB24, RAW, RGB565, ARGB1555 and ARGB4444 have conditional
2 pass versus direct path. 2 pass method requires a buffer that
is conditionally allocated. ifdef's were confusing lint.
simplifed ifdefs to clean up lint warning
BUG=libyuv:692
TEST=lint source/convert.cc
Change-Id: If868718af30b48824a5e3d28f0d7d01d4609ad55
Reviewed-on: https://chromium-review.googlesource.com/451552
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-09 10:32:23 +00:00
Frank Barchard
27acadbf9d
Roll chromium_revision c793ec77b2..7950721f08 (454713:454907)
...
Change log: c793ec77b2..7950721f08
Full diff: c793ec77b2..7950721f08
Changed dependencies:
* src/base: 8fe126945c..d75864a2c5
* src/build: 8a0a5a27d4..bf8911f59b
* src/ios: 2c58c1ed6b..8b8111f841
* src/testing: 9cacf531de..c2c74bc1d1
* src/third_party: 0ea751c2fe..4c0908d22e
* src/third_party/catapult: 3c626eaf72..353ee60a45
* src/tools: 41a0ccf0e1..14318cc69b
DEPS diff: c793ec77b2..7950721f08 /DEPS
No update to Clang.
TBR=kjellander@chromium.org
BUG=libyuv:689
Change-Id: Ife134b4af1c8c1e63aae2b811342d325abe0b600
Reviewed-on: https://chromium-review.googlesource.com/450317
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-03-06 21:49:04 +00:00
Frank Barchard
136aa9d37c
any11p fix for buffer overrun
...
BUG=libyuv:686
TESTED=untested
Change-Id: Idfae93349dd78b1b633a596631e5397e11b77d0b
Reviewed-on: https://chromium-review.googlesource.com/448320
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-03 19:57:35 +00:00
Frank Barchard
91ee9b729e
Fix missing return in MipsCpuCaps.
...
Previously if MipsCpuCaps were called with something other than
dspr2 or msa, the file was closed but still used.
This change assumed the function is only called internally twice:
once for msa and once for dspr2. If msa is not being detected,
the function assumed dspr2 was being tested and returns dspr2 was
true.
BUG=libyuv:687
TEST=try bots
Change-Id: I80b328eb5ffc7baf5f1ee5a79c16d75c45ff26cc
Reviewed-on: https://chromium-review.googlesource.com/447831
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-01 23:07:03 +00:00
Manojkumar Bhosale
45b176d153
Add MSA optimized Interpolate/MergeUV/Misc functions
...
BUG=libyuv:634
Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Performance Gain (vs C auto-vectorized)
InterpolateRow_MSA - ~3.3x
InterpolateRow_Any_MSA - ~2.5x
ARGBSetRow_MSA - ~1.0x
ARGBSetRow_Any_MSA - ~1.0x
ARGBToRGB24Row_MSA - ~1.9x
ARGBToRGB24Row_Any_MSA - ~1.6x
MergeUVRow_MSA - ~1.6x
MergeUVRow_Any_MSA - ~1.2x
Performance Gain (vs C non-vectorized)
InterpolateRow_MSA - ~11.3x
InterpolateRow_Any_MSA - ~ 7.9x
ARGBSetRow_MSA - ~ 6.2x
ARGBSetRow_Any_MSA - ~ 4.0x
ARGBToRGB24Row_MSA - ~ 9.9x
ARGBToRGB24Row_Any_MSA - ~ 8.4x
MergeUVRow_MSA - ~12.7x
MergeUVRow_Any_MSA - ~ 8.0x
Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Reviewed-on: https://chromium-review.googlesource.com/445817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-23 01:42:22 +00:00
Frank Barchard
a041b0ae03
Android.mk for libyuv - unused parameters warning enabled
...
BUG=libyuv:682
TEST=mm from android tree.
Change-Id: I13be3eaa6a33741797360d57bc5cf5fed91678ef
Reviewed-on: https://chromium-review.googlesource.com/445935
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-22 19:45:12 +00:00
Manojkumar Bhosale
eed66b2028
Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions
...
BUG=libyuv:634
Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Performance Gain (vs C auto-vectorized)
I444ToARGBRow_MSA - ~1.6x
I444ToARGBRow_Any_MSA - ~1.6x
I400ToARGBRow_MSA - ~5.5x
I400ToARGBRow_Any_MSA - ~5.3x
J400ToARGBRow_MSA - ~1.0x
J400ToARGBRow_Any_MSA - ~1.0x
YUY2ToARGBRow_MSA - ~1.6x
YUY2ToARGBRow_Any_MSA - ~1.6x
UYVYToARGBRow_MSA - ~1.6x
UYVYToARGBRow_Any_MSA - ~1.6x
Performance Gain (vs C non-vectorized)
I444ToARGBRow_MSA - ~7.3x
I444ToARGBRow_Any_MSA - ~7.1x
I400ToARGBRow_MSA - ~5.5x
I400ToARGBRow_Any_MSA - ~5.2x
J400ToARGBRow_MSA - ~6.8x
J400ToARGBRow_Any_MSA - ~5.7x
YUY2ToARGBRow_MSA - ~7.2x
YUY2ToARGBRow_Any_MSA - ~7.0x
UYVYToARGBRow_MSA - ~7.1x
UYVYToARGBRow_Any_MSA - ~6.9x
Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Reviewed-on: https://chromium-review.googlesource.com/439246
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-21 23:22:07 +00:00
Frank Barchard
bbe8c233f2
scale warning fixes for unused parameters
...
BUG=libyuv:680
TEST=builds and runs with no warnings
Change-Id: I7d60ef44292fa6ad4f7c4e2e2657359b864d2dab
Reviewed-on: https://chromium-review.googlesource.com/442670
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-02-15 21:38:59 +00:00
Frank Barchard
0fb5675902
Fix dspr2 rename changes. Fix unused variables
...
TBR=kjellander@chromium.org
BUG=libyuv:634
TEST=try bots
Review-Url: https://codereview.chromium.org/2675583002 .
2017-02-01 18:51:06 -08:00
Manojkumar Bhosale
54ce8f23d6
Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C auto-vectorized)
ARGBToYJRow_MSA - ~3.2x
ARGBToYJRow_Any_MSA - ~2.7x
BGRAToYRow_MSA - ~3.2x
BGRAToYRow_Any_MSA - ~2.7x
ABGRToYRow_MSA - ~3.2x
ABGRToYRow_Any_MSA - ~2.6x
RGBAToYRow_MSA - ~3.1x
RGBAToYRow_Any_MSA - ~2.7x
ARGBToUVJRow_MSA - ~5.5x
ARGBToUVJRow_Any_MSA - ~4.5x
BGRAToUVRow_MSA - ~2.1x
BGRAToUVRow_Any_MSA - ~2.0x
ABGRToUVRow_MSA - ~2.1x
ABGRToUVRow_Any_MSA - ~1.9x
RGBAToUVRow_MSA - ~2.2x
RGBAToUVRow_Any_MSA - ~1.9x
Performance Gain (vs C non-vectorized)
ARGBToYJRow_MSA - ~10.9x
ARGBToYJRow_Any_MSA - ~9.2x
BGRAToYRow_MSA - ~10.9x
BGRAToYRow_Any_MSA - ~9.3x
ABGRToYRow_MSA - ~11.0x
ABGRToYRow_Any_MSA - ~9.3x
RGBAToYRow_MSA - ~10.9x
RGBAToYRow_Any_MSA - ~9.1x
ARGBToUVJRow_MSA - ~12.4x
ARGBToUVJRow_Any_MSA - ~10.5x
BGRAToUVRow_MSA - ~4.7x
BGRAToUVRow_Any_MSA - ~4.4x
ABGRToUVRow_MSA - ~4.7x
ABGRToUVRow_Any_MSA - ~4.5x
RGBAToUVRow_MSA - ~4.8x
RGBAToUVRow_Any_MSA - ~4.4x
Review-Url: https://codereview.chromium.org/2641153003 .
2017-02-01 10:31:28 +05:30
Frank Barchard
54f2094a5e
Rename mips source files to dspr2.
...
Add MSA detect to unittest.
Change macro to disable DSPR2 code to LIBYUV_DISABLE_DSPR2
BUG=libyuv:634
TEST=try bots
Change-Id: I9e0aa2452204fc529bb6f9e6fd93c4e1c379bba6
Reviewed-on: https://chromium-review.googlesource.com/433463
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-01-27 23:11:43 +00:00
Frank Barchard
33f52bdac9
Add installer builds to cmake for linux
...
cd ~/my_projects/libyuv
git pull
mkdir cbuild # (for out-of-source builds)
cd cbuild
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4
make package
BUG=libyuv:673
TEST=make package
Change-Id: Ia449cbfd0bc118cc90c8648f8199a0526b7ae2a2
Reviewed-on: https://chromium-review.googlesource.com/433440
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-01-26 23:05:17 +00:00
Frank Barchard
749e316ed8
Remove commented out code
...
TEST=None
BUG=libyuv:672
Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169
Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169
Reviewed-on: https://chromium-review.googlesource.com/430321
Reviewed-by: Aaron Gable <agable@chromium.org>
2017-01-20 02:03:12 +00:00
Manojkumar Bhosale
09b8c971b3
Add MSA optimized NV12/21 To RGB row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C auto-vectorized)
NV12ToARGBRow_MSA - ~1.5x
NV12ToARGBRow_Any_MSA - ~1.4x
NV12ToRGB565Row_MSA - ~1.4x
NV12ToRGB565Row_Any_MSA - ~1.4x
NV21ToARGBRow_MSA - ~1.5x
NV21ToARGBRow_Any_MSA - ~1.5x
SobelRow_MSA - ~4.3x
SobelRow_Any_MSA - ~3.4x
SobelToPlaneRow_MSA - ~8.0x
SobelToPlaneRow_Any_MSA - ~4.7x
SobelXYRow_MSA - ~3.0x
SobelXYRow_Any_MSA - ~2.5x
Performance Gain (vs C non-vectorized)
NV12ToARGBRow_MSA - ~6.5x
NV12ToARGBRow_Any_MSA - ~6.5x
NV12ToRGB565Row_MSA - ~6.2x
NV12ToRGB565Row_Any_MSA - ~6.1x
NV21ToARGBRow_MSA - ~6.5x
NV21ToARGBRow_Any_MSA - ~6.5x
SobelRow_MSA - ~14.5x
SobelRow_Any_MSA - ~11.3x
SobelToPlaneRow_MSA - ~34.2x
SobelToPlaneRow_Any_MSA - ~19.4x
SobelXYRow_MSA - ~11.1x
SobelXYRow_Any_MSA - ~9.1x
Review-Url: https://codereview.chromium.org/2636483002 .
2017-01-18 09:24:39 +05:30
Frank Barchard
a7c87e19f0
add Intel Code Analyst markers
...
add macros to enable/disable code analyst around blocks of code.
Normally these macros should not be used, but if performance
details are wanted for intel code, enable them around the code
and then run via the iaca tool, available on the intel website.
BUG=libyuv:670
TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
R=wangcheng@google.com
Review-Url: https://codereview.chromium.org/2626193002 .
2017-01-13 15:50:24 -08:00
Manojkumar Bhosale
73a6f100a9
Add MSA optimized rotate functions (used 16x16 transpose)
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
TransposeWx16_MSA - ~6.0x
TransposeWx16_Any_MSA - ~4.7x
TransposeUVWx16_MSA - ~6.3x
TransposeUVWx16_Any_MSA - ~5.4x
Performance Gain (vs C non-vectorized)
TransposeWx16_MSA - ~6.0x
TransposeWx16_Any_MSA - ~4.8x
TransposeUVWx16_MSA - ~6.3x
TransposeUVWx16_Any_MSA - ~5.4x
Review-Url: https://codereview.chromium.org/2617703002 .
2017-01-13 15:50:02 +05:30
Manojkumar Bhosale
7c64163ff4
Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGB1555ToARGBRow_MSA - 1.85
ARGB1555ToARGBRow_Any_MSA - 1.82
RGB565ToARGBRow_MSA - 2.14
RGB565ToARGBRow_Any_MSA - 2.08
RGB24ToARGBRow_MSA - 8.57
RGB24ToARGBRow_Any_MSA - 7.42
RAWToARGBRow_MSA - 8.57
RAWToARGBRow_Any_MSA - 7.42
ARGB1555ToYRow_MSA - 2.60
ARGB1555ToYRow_Any_MSA - 2.47
RGB565ToYRow_MSA - 2.45
RGB565ToYRow_Any_MSA - 2.33
RGB24ToYRow_MSA - 2.23
RGB24ToYRow_Any_MSA - 2.01
RAWToYRow_MSA - 2.25
RAWToYRow_Any_MSA - 2.02
ARGB1555ToUVRow_MSA - 1.40
ARGB1555ToUVRow_Any_MSA - 1.37
RGB565ToUVRow_MSA - 1.68
RGB565ToUVRow_Any_MSA - 1.63
RGB24ToUVRow_MSA - 3.02
RGB24ToUVRow_Any_MSA - 2.87
RAWToUVRow_MSA - 3.04
RAWToUVRow_Any_MSA - 2.85
Performance Gain (vs C non-vectorized)
ARGB1555ToARGBRow_MSA - 4.66
ARGB1555ToARGBRow_Any_MSA - 4.45
RGB565ToARGBRow_MSA - 5.58
RGB565ToARGBRow_Any_MSA - 5.34
RGB24ToARGBRow_MSA - 8.57
RGB24ToARGBRow_Any_MSA - 7.42
RAWToARGBRow_MSA - 8.57
RAWToARGBRow_Any_MSA - 7.42
ARGB1555ToYRow_MSA - 6.38
ARGB1555ToYRow_Any_MSA - 5.98
RGB565ToYRow_MSA - 6.42
RGB565ToYRow_Any_MSA - 6.05
RGB24ToYRow_MSA - 7.87
RGB24ToYRow_Any_MSA - 7.01
RAWToYRow_MSA - 7.98
RAWToYRow_Any_MSA - 7.01
ARGB1555ToUVRow_MSA - 5.39
ARGB1555ToUVRow_Any_MSA - 5.06
RGB565ToUVRow_MSA - 6.39
RGB565ToUVRow_Any_MSA - 5.90
RGB24ToUVRow_MSA - 3.04
RGB24ToUVRow_Any_MSA - 2.87
RAWToUVRow_MSA - 3.04
RAWToUVRow_Any_MSA - 2.88
Review-Url: https://codereview.chromium.org/2600713002 .
2017-01-13 15:43:37 +05:30
Frank Barchard
000d2fa91a
Libyuv MIPS DSPR2 optimizations.
...
Optimized functions:
I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2
Bug-fixes in functions:
ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2
BUG=
Review-Url: https://codereview.chromium.org/2626123003 .
2017-01-11 12:19:13 -08:00
Manojkumar Bhosale
288bfbefb5
Add MSA optimized remaining scale row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ScaleRowDown2_MSA - ~22.3x
ScaleRowDown2_Any_MSA - ~19.9x
ScaleRowDown2Linear_MSA - ~31.2x
ScaleRowDown2Linear_Any_MSA - ~29.4x
ScaleRowDown2Box_MSA - ~20.1x
ScaleRowDown2Box_Any_MSA - ~19.6x
ScaleRowDown4_MSA - ~11.7x
ScaleRowDown4_Any_MSA - ~11.2x
ScaleRowDown4Box_MSA - ~15.1x
ScaleRowDown4Box_Any_MSA - ~15.1x
ScaleRowDown38_MSA - ~1x
ScaleRowDown38_Any_MSA - ~1x
ScaleRowDown38_2_Box_MSA - ~1.7x
ScaleRowDown38_2_Box_Any_MSA - ~1.7x
ScaleRowDown38_3_Box_MSA - ~1.7x
ScaleRowDown38_3_Box_Any_MSA - ~1.7x
ScaleAddRow_MSA - ~1.2x
ScaleAddRow_Any_MSA - ~1.15x
Performance Gain (vs C non-vectorized)
ScaleRowDown2_MSA - ~22.4x
ScaleRowDown2_Any_MSA - ~19.8x
ScaleRowDown2Linear_MSA - ~31.6x
ScaleRowDown2Linear_Any_MSA - ~29.4x
ScaleRowDown2Box_MSA - ~20.1x
ScaleRowDown2Box_Any_MSA - ~19.6x
ScaleRowDown4_MSA - ~11.7x
ScaleRowDown4_Any_MSA - ~11.2x
ScaleRowDown4Box_MSA - ~15.1x
ScaleRowDown4Box_Any_MSA - ~15.1x
ScaleRowDown38_MSA - ~3.2x
ScaleRowDown38_Any_MSA - ~3.2x
ScaleRowDown38_2_Box_MSA - ~2.4x
ScaleRowDown38_2_Box_Any_MSA - ~2.3x
ScaleRowDown38_3_Box_MSA - ~2.9x
ScaleRowDown38_3_Box_Any_MSA - ~2.8x
ScaleAddRow_MSA - ~8x
ScaleAddRow_Any_MSA - ~7.46x
Review-Url: https://codereview.chromium.org/2559683002 .
2016-12-21 13:39:44 +05:30
Frank Barchard
bd10875846
modified libyuv.gyp so that it no longer depends on libjpeg.gyp, which does not exist anymore.
...
BUG=libyuv:666
TESTED= unittests built and passed with jpeg disabled.
R=kjellander@chromium.org
Review-Url: https://codereview.chromium.org/2585373002 .
2016-12-19 11:57:49 -08:00
Manojkumar Bhosale
a899dea251
Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA - ~1.1x
ARGBAttenuateRow_Any_MSA - ~1.1x
ARGBToRGB565DitherRow_MSA - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA - ~5.1x
ARGBShuffleRow_Any_MSA - ~1.9x
ARGBShadeRow_MSA - ~1.1x
ARGBGrayRow_MSA - ~2.6x
ARGBSepiaRow_MSA - ~11.6x
Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA - ~2.46x
ARGBAttenuateRow_Any_MSA - ~2.45x
ARGBToRGB565DitherRow_MSA - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA - ~5.2x
ARGBShuffleRow_Any_MSA - ~1.9x
ARGBShadeRow_MSA - ~4.3x
ARGBGrayRow_MSA - ~10.5x
ARGBSepiaRow_MSA - ~12.2x
Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Manojkumar Bhosale
6fa5e4eb78
Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
TransposeWx8_MSA - ~2.7x
TransposeWx8_Any_MSA - ~2.1x
TransposeUVWx8_MSA - ~2.5x
TransposeUVWx8_Any_MSA - ~2.7x
Performance Gain (vs C non-vectorized)
TransposeWx8_MSA - ~4.6x
TransposeWx8_Any_MSA - ~2.9x
TransposeUVWx8_MSA - ~4.4x
TransposeUVWx8_Any_MSA - ~3.7x
Review URL: https://codereview.chromium.org/2553403002 .
2016-12-15 10:06:01 +05:30
Frank Barchard
b18fd21d3c
Android420ToI420 - use ptrdiff_t for difference of u and v pointers
...
The difference was assigned to an int, causing a warning on Visual C.
BUG=662
TEST=tested with try bots.
R=devangelakos@google.com
Review-Url: https://codereview.chromium.org/2574373002 .
2016-12-14 11:53:55 -08:00
Frank Barchard
dde8ba7009
ConvertFromI420: use halfstride instead of halfwidth
...
BUG=libyuv:660
TEST=try bots
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2554213003 .
2016-12-07 10:16:16 -08:00
Manojkumar Bhosale
56b5bbb0be
Add MSA optimized ARGB scaling functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ScaleARGBRowDown2_MSA - ~2.6x
ScaleARGBRowDown2Linear_MSA - ~7.9x
ScaleARGBRowDown2Box_MSA - ~3.7x
ScaleARGBRowDownEven_MSA - ~1.2x
ScaleARGBRowDownEvenBox_MSA - ~3.5x
ScaleARGBRowDown2_Any_MSA - ~2.6x
ScaleARGBRowDown2Linear_Any_MSA - ~7.9x
ScaleARGBRowDown2Box_Any_MSA - ~3.6x
ScaleARGBRowDownEven_Any_MSA - ~1.2x
ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x
Performance Gain (vs C non-vectorized)
ScaleARGBRowDown2_MSA - 2.6x
ScaleARGBRowDown2Linear_MSA - 13.5x
ScaleARGBRowDown2Box_MSA - 5.8x
ScaleARGBRowDownEven_MSA - 1.2x
ScaleARGBRowDownEvenBox_MSA - 3.7x
ScaleARGBRowDown2_Any_MSA - 2.6x
ScaleARGBRowDown2Linear_Any_MSA - 13.5x
ScaleARGBRowDown2Box_Any_MSA - 5.3x
ScaleARGBRowDownEven_Any_MSA - 1.2x
ScaleARGBRowDownEvenBox_Any_MSA - 3.7x
Review URL: https://codereview.chromium.org/2527983002 .
2016-12-07 11:47:15 +05:30
Manojkumar Bhosale
83f460be33
Add MSA optimized ARGB Multiply/Add/Subtract row functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBMultiplyRow_MSA - 1.4x
ARGBAddRow_MSA - 8.6x
ARGBSubtractRow_MSA - 8.6x
ARGBMultiplyRow_Any_MSA - 1.35x
ARGBAddRow_Any_MSA - 7.3x
ARGBSubtractRow_Any_MSA - 7.2x
Performance Gain (vs C non-vectorized)
ARGBMultiplyRow_MSA - 4.4x
ARGBAddRow_MSA - 27x
ARGBSubtractRow_MSA - 22x
ARGBMultiplyRow_Any_MSA - 3.5x
ARGBAddRow_Any_MSA - 23x
ARGBSubtractRow_Any_MSA - 18x
Review URL: https://codereview.chromium.org/2529983002 .
2016-12-02 15:21:10 +05:30
Frank Barchard
da0c29dada
Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
ARGBToRGB565Row_MSA - ~1.6x
ARGBToRGB565Row_Any_MSA - ~1.6x
ARGBToARGB1555Row_MSA - ~1.3x
ARGBToARGB1555Row_Any_MSA - ~1.3x
ARGBToARGB4444Row_MSA - ~3.8x
ARGBToARGB4444Row_Any_MSA - ~3.8x
ARGBToUV444Row_MSA - ~2.4x
ARGBToUV444Row_Any_MSA - ~2.4x
Performance Gain (vs C non-vectorized)
ARGBToRGB565Row_MSA - ~2.8x
ARGBToRGB565Row_Any_MSA - ~2.8x
ARGBToARGB1555Row_MSA - ~2.2x
ARGBToARGB1555Row_Any_MSA - ~2.2x
ARGBToARGB4444Row_MSA - ~6.8x
ARGBToARGB4444Row_Any_MSA - ~6.6x
ARGBToUV444Row_MSA - ~6.7x
ARGBToUV444Row_Any_MSA - ~6.7x
Review URL: https://codereview.chromium.org/2520003004 .
2016-11-22 10:47:55 -08:00
Frank Barchard
b1504a8e48
Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2487913004 .
2016-11-18 15:05:10 -08:00
Frank Barchard
97fb18b846
disable I422AlphaToARGBRow_SSSE3 for 32 bit fpic
...
BUG=libyuv:658
TEST=g++ -I include -fPIC -m32 -msse2 -Os -fno-omit-frame-pointer -c source/row_gcc.cc -o row_gcc.o
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2482263003 .
2016-11-08 16:09:09 -08:00
Frank Barchard
e62309f259
clang-format libyuv
...
BUG=libyuv:654
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
f2c27dafa2
HalfFloat neon armv7 fix for destination pointer.
...
Improved unittests detect different in arm64 rounding.
TEST=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*Half* -a "--libyuv_width=640 --libyuv_height=360"
BUG=libyuv:560
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2478313004 .
2016-11-07 12:13:04 -08:00
Frank Barchard
eca08525cb
HalfFloat Neon for ARMv7.
...
64 bit version made similar to 32 bit with registers 1 for load and store results, and 2 and 3 as expanded float temporary values.
TEST=out/Release/libyuv_unittest --gtest_filter=*Half*
BUG=libyuv:560
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2467723002 .
2016-11-01 11:36:51 -07:00
Frank Barchard
10ce829bad
Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
I422ToRGB565Row_MSA : ~1.5x
I422ToRGB565Row_Any_MSA : ~1.5x
I422ToARGB4444Row_MSA : ~1.4x
I422ToARGB4444Row_Any_MSA : ~1.4x
I422ToARGB1555Row_MSA : ~1.4x
I422ToARGB1555Row_Any_MSA : ~1.4x
Performance Gain (vs C non-vectorized)
I422ToRGB565Row_MSA : ~6.8x
I422ToRGB565Row_Any_MSA : ~6.8x
I422ToARGB4444Row_MSA : ~6.6x
I422ToARGB4444Row_Any_MSA : ~6.6x
I422ToARGB1555Row_MSA : ~6.6x
I422ToARGB1555Row_Any_MSA : ~6.6x
Review URL: https://codereview.chromium.org/2445343007 .
2016-10-27 10:47:35 -07:00
Frank Barchard
532f5708a9
Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA : ~1.4x
I422AlphaToARGBRow_Any_MSA : ~1.4x
I422ToRGB24Row_MSA : ~4.8x
I422ToRGB24Row_Any_MSA : ~4.8x
Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA : ~7.0x
I422AlphaToARGBRow_Any_MSA : ~7.0x
I422ToRGB24Row_MSA : ~7.9x
I422ToRGB24Row_Any_MSA : ~7.7x
Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
02ae8b60c5
Line continuation at end of line with NOLINT before that.
...
BUG=libyuv:634
TEST=git cl lint
TBR=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2453013003 .
2016-10-26 10:42:52 -07:00
Frank Barchard
2488b3105b
White spaces, comments and lint fixes for msa.
...
no functional changes.
TBR=kjellander@chromium.org
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2446313002 .
2016-10-25 11:36:54 -07:00
Frank Barchard
c2073823b4
use __OPTIMIZE__ macro to determine debug vs release.
...
Debug builds of x86 gcc/clang can run out of register.
Previously NDEBUG or _DEBUG was used to detect a debug build.
But those macros are not set by gentoo builds.
This CL switches to the compiler predefine __OPTIMIZE__ which is
built into clang and gcc.
BUG=libyuv:602
TEST=untested
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2451503002 .
2016-10-24 18:02:48 -07:00
Frank Barchard
f5d5bd88d6
Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance Gains :- (vs C vectorized)
I422ToARGBRow_MSA : ~1.6x
I422ToRGBARow_MSA : ~1.6x
I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x
Performance Gains :- (vs C non-vectorized)
I422ToARGBRow_MSA : ~7x
I422ToRGBARow_MSA : ~7x
I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x
Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.
Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
Frank Barchard
451af5e922
scale by 1 for neon implemented
...
void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
asm volatile (
"1: \n"
MEMACCESS(0)
"ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts
"subs %w2, %w2, #8 \n" // 8 pixels per loop
"uxtl v2.4s, v1.4h \n" // 8 int's
"uxtl2 v1.4s, v1.8h \n"
"scvtf v2.4s, v2.4s \n" // 8 floats
"scvtf v1.4s, v1.4s \n"
"fcvtn v4.4h, v2.4s \n" // 8 floatsgit
"fcvtn2 v4.8h, v1.4s \n"
MEMACCESS(1)
"st1 {v4.16b}, [%1], #16 \n" // store 8 shorts
"b.gt 1b \n"
: "+r"(src), // %0
"+r"(dst), // %1
"+r"(width) // %2
:
: "cc", "memory", "v1", "v2", "v4"
);
}
void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
asm volatile (
"1: \n"
MEMACCESS(0)
"ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts
"subs %w2, %w2, #8 \n" // 8 pixels per loop
"uxtl v2.4s, v1.4h \n" // 8 int's
"uxtl2 v1.4s, v1.8h \n"
"scvtf v2.4s, v2.4s \n" // 8 floats
"scvtf v1.4s, v1.4s \n"
"fmul v2.4s, v2.4s, %3.s[0] \n" // adjust exponent
"fmul v1.4s, v1.4s, %3.s[0] \n"
"uqshrn v4.4h, v2.4s, #13 \n" // isolate halffloat
"uqshrn2 v4.8h, v1.4s, #13 \n"
MEMACCESS(1)
"st1 {v4.16b}, [%1], #16 \n" // store 8 shorts
"b.gt 1b \n"
: "+r"(src), // %0
"+r"(dst), // %1
"+r"(width) // %2
: "w"(scale * 1.9259299444e-34f) // %3
: "cc", "memory", "v1", "v2", "v4"
);
}
TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
BUG=libyuv:560
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2430313008 .
2016-10-21 14:30:03 -07:00
Frank Barchard
550cf829fb
HalfFloat avx2 unpack bug fix.
...
AVX unpack parameters were reverse ordered causing incorrect results
on AVX2 hardware.
TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Half*
BUG=libyuv:560
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2438893002 .
2016-10-20 15:49:00 -07:00
Frank Barchard
f553db2d30
HalfFloatPlane unittest for denormal half floats
...
Halffloats have a limited range. It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero. This test ensures they behave the same in C and SIMD and tests the performance of denormals.
TEST=TestHalfFloatPlane_denormal
BUG=libyuv:560
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2424233004 .
2016-10-19 18:13:01 -07:00
Frank Barchard
78c58ab8aa
Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance gains : (Auto-vectorized C vs MSA SIMD)
ARGB4444ToYRow_MSA : ~3.0x
ARGB4444ToUVRow_MSA : ~1.8x
ARGB4444ToARGBRow_MSA : ~3.4x
ARGB4444ToYRow_Any_MSA : ~2.8x
ARGB4444ToUVRow_Any_MSA : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x
Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
2d80fc3133
Port HalfFloatRow_SSE2 to AVX2 but not using F16C.
...
R=wangcheng@google.com , hubbe@chromium.org
BUG=libyuv:560
Review URL: https://codereview.chromium.org/2421993002 .
2016-10-14 19:01:41 -07:00
Frank Barchard
fdcf524aac
Add f16c (halffloat) cpuid
...
R=wangcheng@google.com , hubbe@chromium.org
BUG=libyuv:560
Review URL: https://codereview.chromium.org/2418763006 .
2016-10-14 16:34:08 -07:00
Frank Barchard
5333e94e70
Port ARGBExtractAlpha_AVX2 function to windows.
...
BUG=libyuv:572
TEST=try bots
R=wangcheng@google.com , magjed@chromium.org
Review URL: https://codereview.chromium.org/2416783004 .
2016-10-13 23:20:57 -07:00
Frank Barchard
a5e93766a2
Add ARGBExtractAlpha_AVX2 function
...
Port SSE2 version to AVX2.
BUG=libyuv:572
TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract*
R=wangcheng@google.com , magjed@chromium.org
Review URL: https://codereview.chromium.org/2420553002 .
2016-10-13 16:03:43 -07:00
Frank Barchard
198bce3959
Cast for clang-cl 64 bit build warnings in unittests
...
R=kjellander@chromium.org
BUG=libyuv:649
Review URL: https://codereview.chromium.org/2414763002 .
2016-10-12 13:09:57 -07:00
Frank Barchard
d363ea6527
Remove I411 support.
...
YUV 411 is very uncommon format. Remove support.
Update documentation to reflect that 411 is deprecated.
Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.
BUG=libyuv:645
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
0071f46a1f
Side by side 420 test
...
I420 output can be slow due to multi channel write.
Putting the U and V into a single side by side buffer can improve performance.
TBR=wangcheng@google.com
BUG=None
Review URL: https://codereview.chromium.org/2403223003 .
2016-10-10 19:28:33 -07:00
Frank Barchard
edd3a84d05
libyuv::YUY2ToY for isolating Y channel of YUY2.
...
This function is the first step of YUY2 To I420.
Provided primarily for diagnostics.
TBR=wangcheng@google.com
BUG=libyuv:647
TESTED=LibYUVConvertTest.YUY2ToY_Opt
Review URL: https://codereview.chromium.org/2399153004 .
2016-10-07 17:20:30 -07:00
Frank Barchard
a2891ec77c
Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance gains as below,
YUY2ToI422, YUY2ToI420 :-
YUY2ToYRow_MSA : ~10x
YUY2ToUVRow_MSA : ~11x
YUY2ToUV422Row_MSA : ~9x
YUY2ToYRow_Any_MSA : ~6x
YUY2ToUVRow_Any_MSA : ~5x
YUY2ToUV422Row_Any_MSA : ~4x
UYVYToI422, UYVYToI420 :-
UYVYToYRow_MSA : ~10x
UYVYToUVRow_MSA : ~11x
UYVYToUV422Row_MSA : ~9x
UYVYToYRow_Any_MSA : ~6x
UYVYToUVRow_Any_MSA : ~5x
UYVYToUV422Row_Any_MSA : ~4x
Review URL: https://codereview.chromium.org/2397693002 .
2016-10-07 10:37:22 -07:00
Frank Barchard
7018f5be0f
Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
...
R=fbarchard@google.com
BUG=libyuv:634
Performance gains :-
I422ToYUY2Row_MSA - ~12x
I422ToYUY2Row_Any_MSA - ~7x
I422ToUYVYRow_MSA - ~12x
I422ToUYVYRow_Any_MSA - ~7x
Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
aa197ee1a3
HalfFloat_SSE2 for Visual C
...
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560, chromium:445071
TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows
R=hubbe@chromium.org , wangcheng@google.com
Review URL: https://codereview.chromium.org/2387713002 .
2016-10-03 10:33:38 -07:00
Frank Barchard
4a14cb2e81
HalfFloat_SSE2 port from C algorithm to SSE2
...
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560, chromium:445071
TEST=untested
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2381493006 .
2016-09-30 09:47:16 -07:00
Frank Barchard
7fc932ddd3
Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
...
BUG=libyuv:560,chromium:445071
TEST=untested
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2371293002 .
2016-09-29 15:06:30 -07:00
Frank Barchard
c11e9b7fb7
bt709 coefficients for video constrained space
...
Original bt709 color space coefficients were full range yuv for higher
quality. This change makes the coefficients use the video constrained
color space the same as bt601 which is 16 to 240 for Y and 16 to 235 for
chroma channels.
BUG=libyuv:639
TEST=libyuv unittests run locally
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2367253003 .
2016-09-28 15:07:46 -07:00
Frank Barchard
6732bcbde9
ShortToHalfFloat_AVX2 function
...
BUG=libyuv:560
TEST=local compile for windows
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2364293002 .
2016-09-27 14:18:32 -07:00
Frank Barchard
bcd823805c
remove guard nolints from all headers
...
Remove NOLINT from guards
TEST=git cl lint
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2374653002 .
2016-09-26 18:02:09 -07:00
Frank Barchard
51a3500c18
Remove unused macros for msa.
...
Signed vectors are rarely used in libyuv... remove macros for now.
Remove word shuffler, use byte shuffler in row_msa.
Bump version number.
TBR=manojkumar.bhosale@imgtec.com
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2375553002 .
2016-09-26 17:34:58 -07:00
Frank Barchard
618149084e
Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function
...
This patch adds MSA optimized ARGBMirrorRow function in libYUV project.
Performance gain ~3x
R=fbarchard@google.com
BUG=libyuv:634
Review URL: https://codereview.chromium.org/2368313003 .
2016-09-26 16:28:01 -07:00
Frank Barchard
c5323b0fdc
Add MIPS SIMD Arch (MSA) optimized MirrorRow function
...
As per the preparation patch added in Chromium sources at,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds
This patch adds first MSA optimized function in libYUV project.
BUG=libyuv:634
R=fbarchard@google.com
Review URL: https://codereview.chromium.org/2285683002 .
2016-09-22 16:12:22 -07:00
Frank Barchard
5da918b48d
Enable NEON for unittests on ios 64 bit.
...
TBR=kjellander@chromium.org
BUG=libyuv:637, chromium:646279
Review URL: https://codereview.chromium.org/2340933005 .
2016-09-16 16:46:46 -07:00
Frank Barchard
8279df963e
Scale by 3/8 only if source is multiple of 8 tall.
...
BUG=libyuv:635
TEST=try bots
R=harryjin@google.com
Review URL: https://codereview.chromium.org/2347733002 .
2016-09-16 14:57:47 -07:00
Frank Barchard
742be44654
Remove references to svn version control.
...
BUG=libyuv:636
TESTED=try bots
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2339813002 .
2016-09-14 14:54:47 -07:00
Frank Barchard
de944ed8c7
YuvConstants declare alignment for externs as well as declarations
...
On visual c 2013 and earlier a warning is generated if externs
are not declared with the same alignment as the declaration, when
using /ltcg
BUG=libyuv:633
TEST=standalong test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg
R=skal@google.com
Review URL: https://codereview.chromium.org/2291533004 .
2016-08-30 11:06:46 -07:00
Frank Barchard
dc3a1295be
add mergeuv test
...
Add test for SplitUVPlane and MergeUVPlane
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists.
TEST=SplitUVPlane unittest
BUG=libyuv:629
R=braveyao@chromium.org
Review URL: https://codereview.chromium.org/2279603002 .
2016-08-25 10:29:16 -07:00
Frank Barchard
c244a3e9a0
Add SplitUVPlanes and MergeUVPlanes
...
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.
BUG=libyuv:629
R=braveyao@chromium.org
Review URL: https://codereview.chromium.org/2277603004 .
2016-08-24 16:47:24 -07:00
Frank Barchard
17d31e6a4a
NV12 allow NULL for Y
...
The conversion from NV12 and other Bi or Tri planar formats, differs only in the UV handling. The helper function supports passing a NULL for the dst_y channel indicating you only want to do the UV conversion.
TBR=harryjin@google.com
TEST=LibYUVConvertTest.NV12ToI420_NullY (601 ms)
BUG=libyuv:626
Review URL: https://codereview.chromium.org/2276703002 .
2016-08-23 19:05:25 -07:00
Frank Barchard
d58297a2df
NV12ToI420 use SplitPlane function
...
TBR=magjed@chromium.org
BUG=libyuv:629
TEST=LibYUVConvertTest.NV12ToI420_Opt
Review URL: https://codereview.chromium.org/2267303002 .
2016-08-22 18:35:55 -07:00
Frank Barchard
920151f2b5
Change basic_types.h for fixing build failure
...
BUG=libyuv:630
TBR=harryjin@google.com
TEST=android build locally tested.
Review URL: https://codereview.chromium.org/2225763003 .
Review URL: https://codereview.chromium.org/2269793002 .
2016-08-22 16:16:49 -07:00
Frank Barchard
74491ba0c5
add blank lines to getting started
...
BUG=libyuv:626
Review URL: https://codereview.chromium.org/2225763003 .
2016-08-08 15:23:38 -07:00
Frank Barchard
e74086bfe3
Remove DISABLE_X86 from build.gn
...
Fix for duplicate define
../../third_party/libyuv/include/libyuv/scale_row.h:29:9: error: 'LIBYUV_DISABLE_X86' macro redefined [-Werror,-Wmacro-redefined]
^
GYP version relys on headers disabling the optimization.
This CL does the same for BUILD.gn
TBR=kjellander@chromium.org
BUG=libyuv:625
Review URL: https://codereview.chromium.org/2149823003 .
2016-07-14 12:14:22 -07:00
Frank Barchard
46a8eaaf0c
fix typo in YUV
...
R=braveyao@chromium.org
BUG=None
Review URL: https://codereview.chromium.org/2152623002 .
2016-07-13 17:17:19 -07:00
Frank Barchard
1aa4ddd21c
Attribute aligned 32 for YUV conversion structure on Intel
...
Fix for unaligned memory exception.
R=braveyao@chromium.org
BUG=libyuv:616
Review URL: https://codereview.chromium.org/2152553002 .
2016-07-13 12:19:26 -07:00
Frank Barchard
a7a6d8cc2e
Duplicate prototype for I420ToABGR for remoting
...
Add alias prototype in convert_argb.h for remoting to build without the header convert_from.h
BUG=libyuv:622
TBR=harryjin@google.com
Review URL: https://codereview.chromium.org/2141923005 .
2016-07-12 19:12:28 -07:00
Frank Barchard
abcb70f183
Test nv21 layout of Android420ToI420 function.
...
to Y,U,V and a pixel stride for U and V. The pixel stride is expected to be 1 or 2.
[ RUN ] LibYUVConvertTest.Android420ToI420_1_Any
[ OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_1_Unaligned
[ OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_1_Invert
[ OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_1_Opt
[ OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_2_Any
[ OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_2_Unaligned
[ OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_2_Invert
[ OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms)
[ RUN ] LibYUVConvertTest.Android420ToI420_2_Opt
[ OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms)
TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org
Review URL: https://codereview.chromium.org/2146733002 .
2016-07-12 18:34:04 -07:00
Frank Barchard
84e04699c2
Add libyuv:Android420ToI420 function which takes 3 pointers
...
to Y,U,V and a pixel stride for U and V. The pixel stride is expected to be 1 or 2.
TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org
Review URL: https://codereview.chromium.org/2114843002 .
2016-07-12 16:23:51 -07:00
Frank Barchard
4d9146bbb1
include planar functions and convert_argb for webrtc
...
webrtc doesnt include the headers that the functions are prototyped in.
This CL makes the convert.h include those headers to allow webrtc to
update to the head libyuv.
TBR=harryjin@google.com
BUG=libyuv:620,webrtc:6091,webrtc:6094
TESTED=local build and try bots
Review URL: https://codereview.chromium.org/2141683002 .
2016-07-11 11:37:51 -07:00
Frank Barchard
8b55286ed5
duplicate I420Rect prototype into convert for webrtc
...
TBR=harryjin@google.com
BUG=libyuv:618
Review URL: https://codereview.chromium.org/2132993003 .
2016-07-08 16:03:38 -07:00
Frank Barchard
303b9f03c8
Avoid gcc 4.4 indexing a vector_size(32) array error.
...
Mking color conversion use simple arrays within structure, which will be referenced via register pointer.
R=harryjin@google.com
BUG=libyuv:616
TEST=CC=gcc-4.4 CXX=g++-4.4 LD=ld-4.4 make -f linux.mk
Review URL: https://codereview.chromium.org/2127863003 .
2016-07-06 15:14:29 -07:00
Frank Barchard
2f101fdbda
mingw64 fix - guard row_win.cc against mingw build.
...
The old guard only checked for defined(_M_X64) which is defined by mingw64. Add a test for defined(_MSC_VER) which is defined for clangcl and visual c but not mingw. mingw should use row_gcc.cc for both 32 and 64 bit.
R=harryjin@google.com
BUG=webm:1252,libyuv:613
TEST=local gcc/clang builds on linux tested and try bots for others.
Review URL: https://codereview.chromium.org/2105603002 .
2016-06-28 10:21:27 -07:00
Frank Barchard
b8ddb5a2a7
rounding for arm filter
...
R=wangcheng@google.com , harryjin@google.com
BUG=libyuv:607
Review URL: https://codereview.chromium.org/2093913004 .
2016-06-24 16:07:49 -07:00
Frank Barchard
cc88adc620
YUV scale filter columns improved filtering accuracy
...
upscale a YUV image. observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.
was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)
BUG=libyuv:605
TEST=try bots
R=harryjin@google.com , wangcheng@google.com
Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Frank Barchard
24b9fa6671
use vectorsize on clangcl
...
the ScaleFilterCols_SSSE3 function fails at runtime if vectorsize is not used.
BUG=libyuv:610,libyuv:605
R=wangcheng@google.com
Review URL: https://codereview.chromium.org/2080223007 .
2016-06-23 20:14:22 -07:00
Frank Barchard
e376b06d6a
Disable ScaleFilterCols_SSSE3 which produces color shift
...
upscale a YUV image. observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.
disable HAS_SCALEFILTERCOLS_SSSE3
R=harryjin@google.com
BUG=libyuv:605
Review URL: https://codereview.chromium.org/2080663003 .
2016-06-20 10:43:09 -07:00
Frank Barchard
fd3e676e91
android_full_debug x86 fix - use +rm for width count
...
Work around for android full debug build runnign out of registers.
5 functions were running out of registers causing the compiler error
error: 'asm' operand has impossible constraints
These functions mostly have 4 pointers, a counter (width) and a tempory
eax register. With fpic and debug using stackframes, 2 registers are
unavailable. So a total of 8 registers are used.
Although fpic and stack frame dont apply to assembly, the compiler
reserves 2 registers. The optimized version builds, so its likely
freeing up the registers once it knows they are not used.
These functions used to build, so compile options and/or compiler may
have updated.. likely fpic was turned on.
An attribute can be done to disable each, and will avoid using the
2 GPR registers, but they are still reserved and unavailable in debug
builds on current compilers (gcc 4.9 and clang 3.8).
R=dhrosa@google.com
BUG=libyuv:602
Review URL: https://codereview.chromium.org/2066933002 .
2016-06-14 15:25:28 -07:00
Frank Barchard
e2611a7349
document cpuid command line behavior
...
cpu_info_ is zero for uninitialized state and all bits are off, disabling all cpu optimizations.
the 1 bit indicates cpu_info_ is initialized avoiding calling the detection code again for performance.
MaskCpuFlags initializes the cpu ignoring existing flags, then masks with the supplied flags and stores to cpu_info_.
As a mask, -1 has no effect, enabling all cpu features that were detected, but nothing that wasnt detected.
Setting to 0 will cause the next call to re-initialize the cpu, which is same as enabling all features.
Setting mask to 1 will turn off all cpu features but keep the initialized bit on, so the next detection call wont reinitialize and the cpu features are all disabled.
So normal behavior for command line and programatic masking is:
1 = C
-1 = SIMD
TBR=harryjin@google.com
BUG=libyuv:600
TESTED=out64/Release/bin/run_libyuv_unittest -s libyuv_unittest --verbose --release --gtest_filter=*ARGBExtractAlpha* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=1 --libyuv_cpu_info=1"
Review URL: https://codereview.chromium.org/2042933002 .
2016-06-08 10:38:09 -07:00
Frank Barchard
026be3cd85
neon64 use width int directly.
...
width %w size modifier the int width can be passed directly to arm assembly.
For functions that take input constants, the outputs are declared as early
write using &, meaning the outputs use used before all inputs are consumed.
R=harryjin@google.com
BUG=libyuv:598
Review URL: https://codereview.chromium.org/2043073003 .
2016-06-08 10:26:53 -07:00
Frank Barchard
6546096269
ARGBExtractAlpha 16 pixels at a time for ARM
...
arm64 8 TestARGBExtractAlpha (10019 ms) <-original 64 bit code
arm64 8 x2 TestARGBExtractAlpha (7639 ms)
arm64 16 TestARGBExtractAlpha (7369 ms) <- new 64 bit code
thumb32 8 TestARGBExtractAlpha (9505 ms) <- original 32 bit code
thumb32 8 x2 TestARGBExtractAlpha (7400 ms)
thumb32 8 x2i TestARGBExtractAlpha (7266 ms) <- new 32 bit code
arm32 8 TestARGBExtractAlpha (10002 ms)
BUG=libyuv:572
TESTED=local test on nexus 9
R=harryjin@google.com , wangcheng@google.com
Review URL: https://codereview.chromium.org/2035573002 .
2016-06-07 10:44:28 -07:00
Frank Barchard
462be27ec8
j422 now uses j420 source code so increase error threshold to match.
...
R=harryjin@google.com
BUG=libyuv:597
Review URL: https://codereview.chromium.org/2024213003 .
2016-05-31 19:45:34 -07:00
Frank Barchard
b00d40160a
make unittest allocator align to 64 bytes.
...
blur requires memory be aligned. change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.
TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.
Review URL: https://codereview.chromium.org/2019753002 .
2016-05-27 18:02:47 -07:00
Magnus Jedvert
942db3016a
Add ARGBExtractAlpha function
...
BUG=libyuv:572
R=fbarchard@google.com
Review URL: https://codereview.chromium.org/1995293002 .
2016-05-26 10:30:57 +02:00
Frank Barchard
74a69522da
white space fixes for MIPS
...
TBR=kjellander@chromium.org
BUG=None
Review URL: https://codereview.chromium.org/2005053004 .
2016-05-24 14:17:18 -07:00
Frank Barchard
60abed3a47
add SIMD_ALIGNED to unit_test.h
...
avoids need for row.h for some unittests;
R=harryjin@google.com
BUG=libyuv:594
TESTED=try bots tested.
Review URL: https://codereview.chromium.org/2004313004 .
2016-05-24 13:56:25 -07:00
Frank Barchard
7edf572e28
remove includes for duplicate functions
...
R=harryjin@google.com
BUG=libyuv:592
TESTED=local builds work with fewer headers
Review URL: https://codereview.chromium.org/2006943002 .
2016-05-23 17:38:26 -07:00
Frank Barchard
fbdc43a03c
fix wrong HAS_ARGBCOPYALPHAROW_SSE2 ifdef
...
TBR=kjellander@chromium.org
BUG=libyuv:593
TESTED=try bots pass.
Review URL: https://codereview.chromium.org/2000393002 .
2016-05-23 16:26:02 -07:00
Frank Barchard
07cb92272f
If image sizes are greater than 32768, fixed point stepping will overflow an int. This CL changes the max size to 32768 and disables the test if larger.
...
BUG=libyuv:590
TESTED=LIBYUV_FLAGS=-1 LIBYUV_WIDTH=8192 LIBYUV_HEIGHT=16 out/Release/libyuv_unittest --gtest_filter=*
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1947783002 .
2016-05-05 19:09:02 -07:00
Frank Barchard
6924590212
Add all library source files to linux.mk
...
Allows arm and mips linux builds.
Add psnr and cpuid utility targets.
BUG=libyuv:586
TESTED=make -f linux.mk
TBR=kjellander@chromium.org
Review URL: https://codereview.chromium.org/1906653003 .
2016-04-20 16:48:53 -07:00
Frank Barchard
cf101116c9
Remove initialize to zero on output variables for inline.
...
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs). This allows the initialize
to zero step to be removed, saving 1 instruction.
BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
9c53ff2c57
Fix temporary stride for ConvertToARGB with rotation.
...
BUG=libyuv:578
TESTED=local unittests pass
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1879783002 .
2016-04-11 15:21:04 -07:00
Frank Barchard
3c862e3d29
Fix stride bug for msan on I420Interpolate.
...
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C
BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com
Review URL: https://codereview.chromium.org/1872953002 .
2016-04-08 15:58:53 -07:00
Frank Barchard
ddbc63f7b9
Add //build/config/BUILD.gn to exec whitelist for GN.
...
Affected Linux GN build, not Windows.
R=kjellander@chromium.org
BUG=libyuv:583
TESTED=gn gen out/Debug --args=is_debug=true
Review URL: https://codereview.chromium.org/1866743002 .
2016-04-06 11:23:28 -07:00
Frank Barchard
ef79a9938b
cmake move libyuv_unittest target into the if(TEST) condition
...
BUG=libyuv:579
TESTED=mkdir build && cd build && cmake .. && cmake --build . --config Release
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/1847233002 .
2016-04-01 16:15:34 -07:00
Frank Barchard
837aa1e2af
disable assembly in header for msan=1
...
GYP_DEFINES="target_arch=x64 msan=1" ./gyp_libyuv
ninja -j7 -C out/Release
R=impjdi@google.com
BUG=libyuv:575
Review URL: https://codereview.chromium.org/1805683003 .
2016-03-15 18:45:38 -07:00
Frank Barchard
ee99b85126
Port ARGBToRGB565 from aarch64 neon to 32 bit
...
The 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts and inserts, saving some masking. The instruction is available for neon 32 bit as well.
R=magjed@chromium.org , harryjin@google.com
BUG=libyuv:571
Review URL: https://codereview.chromium.org/1724393002 .
2016-02-29 12:22:25 -08:00
Frank Barchard
ab0dfdd4ff
Documentation fix for android aarch64 disassembly.
...
Name of objdump tool updated.
TBR=kjellander@chromium.org
BUG=none
Review URL: https://codereview.chromium.org/1715743003 .
Review URL: https://codereview.chromium.org/1727993002 .
2016-02-23 18:30:35 -08:00
Frank Barchard
127ff512b3
add perf data files to ignores
...
document play services update
R=jkellander@chromium.org
BUG=none
Review URL: https://codereview.chromium.org/1712463002 .
2016-02-17 21:37:09 -08:00
Frank Barchard
cc33dc68c7
Port I411ToARGBRow to AVX2.
...
An SSSE3 version already exists, and an AVX2 version is available for
Visual C. This ports the function to AVX2 completing the AVX2 ports of
all YUV to RGB functions for AVX2 on gcc.
TBR=harryjin@google.com
BUG=libyuv:555
Review URL: https://codereview.chromium.org/1687253002 .
2016-02-12 10:26:10 -08:00
Frank Barchard
0e554b18fe
port NV12ToRGB565Row_AVX2 to gcc
...
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2
NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.
R=harryjin@google.com
BUG=libyuv:554
Review URL: https://codereview.chromium.org/1687953002 .
2016-02-10 11:13:41 -08:00
Frank Barchard
c39509c8e5
add avx2 wrappers for functions that can call I422ToARGBRow_AVX2
...
R=harryjin@google.com
BUG=libyuv:557
Review URL: https://codereview.chromium.org/1687713002 .
2016-02-09 17:14:29 -08:00
Frank Barchard
6ea3755330
add 'LIBYUV_DISABLE_X86' to msan for unittests
...
R=harryjin@google.com
BUG=libyuv:564
Review URL: https://codereview.chromium.org/1685723002 .
2016-02-09 11:57:03 -08:00
Frank Barchard
fc2adcfa42
fix for msan builds which set -DLIBYUV_DISABLE_X86=1
...
TBR=harryjin@google.com
BUG=libyuv:566
Review URL: https://codereview.chromium.org/1673313003 .
2016-02-09 10:51:20 -08:00
Frank Barchard
0d880e5bc0
rename MIPS_DSPR2 to DSPR2 for consistency
...
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor. This should be named consistently.
Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.
TBR=harryjin@google.com
BUG=libyuv:562
Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
903c91cc2e
fix for ubsan on unittest.h fastrand()
...
internal math of the fastrand function uses a multiply
and add that overflows a signed int. This triggers a
ubsan failure:
../../unit_test/../unit_test/unit_test.h:60:33: runtime error: signed integer overflow: 56248274 * 214013 cannot be represented in type 'int'
This change casts the intermediate math to unsigned
int to avoid the overflow.
For more info on ubsan, see
http://dev.chromium.org/developers/testing/undefinedbehaviorsanitizer
TESTED=Passing compilation using:
GYP_DEFINES="ubsan=1"
GYP_DEFINES="ubsan_vptr=1"
R=harryjin@google.com , pbos@webrtc.org
BUG=libyuv:563
Review URL: https://codereview.chromium.org/1662453003 .
2016-02-02 14:32:12 -08:00
Frank Barchard
9e39c1f271
ubsan overflow fix for multiply by 0x01010101
...
This is an UBSan error reported by libjingle
[ RUN ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride
[000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73
../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int'
[8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms)
[9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms)
Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted
The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes.
Changing the constant to 0x01010101u should avoid overflow.
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:563
Review URL: https://codereview.chromium.org/1657533005 .
2016-02-01 12:29:04 -08:00
Frank Barchard
1cc0177669
Remove duplicate prototype for MJPGToARGB
...
MJPGToARGB prototype is in both convert_argb.h and planar_functions.h
Remove the duplicate prototype from planar_functions.h
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:561
Review URL: https://codereview.chromium.org/1638133002 .
2016-01-26 17:02:45 -08:00
Frank Barchard
ad71738f6a
Remove svn version build and unittest.
...
R=harryjin@google.com
TBR=harryjin@google.com , kjellander@google.com
BUG=libyuv:551
Review URL: https://codereview.chromium.org/1612123002 .
2016-01-21 11:22:11 -08:00
Frank Barchard
8c196f4d4c
Fix testi420 unittest for odd height
...
When the image height for unittests was set to an
odd height, the TestI420 unittest would not fill
the complete source buffer. This change handles
the odd height test case.
No change to library code.
TBR=harryjin@google.com
BUG=libyuv:549
Review URL: https://codereview.chromium.org/1609103002 .
2016-01-19 16:16:39 -08:00
Frank Barchard
58cb534962
Fix memory overwrite in YUY2ToNV12 odd wdiths
...
When width was odd Y channel wrote an extra pixel.
This change splits the Y from UV into a temporary
buffer and memcpy's to the destination. Performance
is slower.
Was
YUY2ToNV12_Any (307 ms)
YUY2ToNV12_Unaligned (213 ms)
TestYUY2ToNV12 (181 ms)
YUY2ToNV12_Opt (177 ms)
YUY2ToNV12_Invert (177 ms)
Npw
YUY2ToNV12_Any (300 ms)
YUY2ToNV12_Unaligned (226 ms)
YUY2ToNV12_Invert (206 ms)
TestYUY2ToNV12 (184 ms)
YUY2ToNV12_Opt (181 ms)
TBR=harryjin@google.com
BUG=libyuv:545
Review URL: https://codereview.chromium.org/1593833002 .
2016-01-19 11:28:09 -08:00
Frank Barchard
8377c798fb
Fix I420ToNV21 for wrong dst_stride_y parameter.
...
I420ToNV21 passes the wrong dst_stride_y when it calls I420ToNV12; parameter 8 (convert_from.cc:448) is src_stride_y but should be dst_stride_y. This causes image corruption when converting I420 -> NV21 with mismatched luminance strides.
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:547
Review URL: https://codereview.chromium.org/1582793008 .
2016-01-14 17:38:54 -08:00
Frank Barchard
081475b3c8
refactor ARGBToI422 using ARGBToI420 internally
...
R=harryjin@google.com
BUG=libyuv:546
Review URL: https://codereview.chromium.org/1574253004 .
2016-01-12 17:05:49 -08:00
Frank Barchard
54bbea1701
Disable I420Blend_Any test that uses C
...
Also renames Inverted to Invert in test name for consistency.
TBR=harryjin@google.com
BUG=libyuv:543
Review URL: https://codereview.chromium.org/1577973004 .
2016-01-11 18:23:04 -08:00
Frank Barchard
8030a711aa
Rename rotate tests to include _Opt and disable _Odd tests
...
TBR=harryjin@google.com
BUG=libyuv:543
Review URL: https://codereview.chromium.org/1577723003 .
2016-01-11 17:30:27 -08:00
Frank Barchard
23c6a83561
Fix ifdef mismatch for mirroruv
...
Macro define and macro ifdef didnt match, leading to C code
being used. Make macro match function name.
TBR=harryjin@google.com
BUG=libyuv:543
Review URL: https://codereview.chromium.org/1579023002 .
2016-01-11 16:33:36 -08:00
Frank Barchard
fc52d8ded2
Odd width variation of scale down by 2 for subsampling
...
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:538
Review URL: https://codereview.chromium.org/1558093003 .
2016-01-06 15:12:17 -08:00
Frank Barchard
2560df9513
add clang variable for other apps to use
...
R=dhrosa@google.com
BUG=libyuv:539
Review URL: https://codereview.chromium.org/1557923005 .
2016-01-05 11:47:55 -08:00
Frank Barchard
36615d62a0
fix for InterpolateRow_AVX2
...
port scaledownby4_avx2 to gcc
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1546763002 .
2015-12-22 12:29:54 -08:00
Frank Barchard
71deb7ba3a
bug fix - remove shift from InterpolateRow_AVX2
...
TBR=harryjin@google.com
BUG=libyuv:537
Review URL: https://codereview.chromium.org/1547703002 .
2015-12-22 10:28:48 -08:00
Frank Barchard
2cb2e9e1ad
fix for InterpolateRow_AVX2
...
TBR=harryjin@google.com
BUG=libyuv:535
Review URL: https://codereview.chromium.org/1543773002 .
2015-12-21 18:35:12 -08:00
Frank Barchard
3f4d86053e
avx2 interpolate use 8 bit
...
BUG=libyuv:535
R=dhrosa@google.com
Review URL: https://codereview.chromium.org/1535833003 .
2015-12-21 10:57:32 -08:00
Frank Barchard
f4447745ae
Add rounding to InterpolateRow for improved quality and consistency.
...
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly. Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.
BUG=libyuv:535
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1533643005 .
2015-12-17 15:24:06 -08:00
Frank Barchard
029f926a14
add NDEBUG for release chromium buids
...
BUG=libyuv:533
TBR=harryjin@google.com
Review URL: https://codereview.chromium.org/1531143002 .
2015-12-16 16:23:09 -08:00
Frank Barchard
216e93b4e8
Fix MIPS DSPR2 build failure.
...
Fixing the failure:
'TransposeWx8_Fast_MIPS_DSPR2' was not declared in this scope
BUG=none
R=fbarchard@chromium.org
Review URL: https://codereview.chromium.org/1527243002 .
2015-12-16 10:37:42 -08:00
Frank Barchard
80ca4514ef
change scale down by 4 to use rounding.
...
TBR=harryjin@google.com
BUG=libyuv:447
Review URL: https://codereview.chromium.org/1525033005 .
2015-12-15 21:25:18 -08:00
Frank Barchard
70445ef2ef
avx2 scale down by 2 for gcc
...
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1520423003 .
2015-12-15 10:59:20 -08:00
Frank Barchard
77346fcb4a
disable I411ToARGB assembly if _DEBUG for chromium, as well as DEBUG for other builds.
...
TBR=harryjin@google.com
BUG=libyuv:533
Review URL: https://codereview.chromium.org/1527903002 .
2015-12-14 21:36:12 -08:00
Frank Barchard
ae55e41851
use rounding in scaledown by 2
...
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:447,libyuv:527
Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
8bca9fc178
remove unused var in a test
...
remove include from unittest.cc that is already done by unittest.h
TBR=harryjin@google.com
BUG=libyuv:530
Review URL: https://codereview.chromium.org/1513263004 .
2015-12-10 18:39:36 -08:00
Frank Barchard
44373d8fbb
Add check for DEBUG to functions disabled on 386
...
Some functions run out of registers when compiled for debug,
fpic, with stack frames on 32 bit x86 with clang.
Previously they were enabled based on _DEBUG but that macro
is not set in some build systems. This CL adds DEBUG macro as
well to cover those environments.
R=harryjin@google.com
BUG=libyuv:532
Review URL: https://codereview.chromium.org/1517693005 .
2015-12-10 15:42:46 -08:00
Frank Barchard
a2ea905679
BlendPlane any width.
...
Benchmark
out\release\libyuv_unittest --libyuv_width=1279 --libyuv_height=719 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms
Was
I420Blend_Any (2321 ms)
I420Blend_Unaligned (1684 ms)
I420Blend_Opt (1675 ms)
I420Blend_Invert (1653 ms)
BlendPlane_Invert (1556 ms)
BlendPlane_Any (1552 ms)
BlendPlane_Unaligned (1548 ms)
BlendPlane_Opt (1535 ms)
ARGBBlend_Unaligned (659 ms)
ARGBBlend_Any (596 ms)
ARGBBlend_Invert (591 ms)
ARGBBlend_Opt (508 ms)
BlendPlaneRow_Unaligned (186 ms)
BlendPlaneRow_Opt (171 ms)
Now
ARGBBlend_Any (621 ms)
ARGBBlend_Unaligned (585 ms)
ARGBBlend_Invert (564 ms)
ARGBBlend_Opt (512 ms)
I420Blend_Unaligned (347 ms)
I420Blend_Invert (345 ms)
I420Blend_Any (337 ms)
I420Blend_Opt (327 ms)
BlendPlane_Unaligned (187 ms)
BlendPlaneRow_Unaligned (187 ms)
BlendPlane_Invert (186 ms)
BlendPlane_Any (186 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (171 ms)
which is comparable to aligned case
out\release\libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms
ARGBBlend_Any (625 ms)
ARGBBlend_Unaligned (602 ms)
ARGBBlend_Invert (508 ms)
ARGBBlend_Opt (506 ms)
I420Blend_Any (353 ms)
I420Blend_Unaligned (322 ms)
I420Blend_Invert (304 ms)
I420Blend_Opt (301 ms)
BlendPlaneRow_Unaligned (188 ms)
BlendPlane_Unaligned (186 ms)
BlendPlane_Invert (185 ms)
BlendPlane_Any (184 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (169 ms)
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1513443002 .
2015-12-08 18:59:48 -08:00
Frank Barchard
fae1a10545
Work around bug in xgetbv for Visual Studio.
...
xgetbv is generating bad code, falsely disabling AVX2 and AVX512.
disable optimization for the function affected on older versions of Visual C 32 bit.
R=brucedawson@chromium.org , dhrosa@google.com , harryjin@google.com
BUG=libyuv:529
Review URL: https://codereview.chromium.org/1503393004 .
2015-12-08 18:13:32 -08:00
Frank Barchard
2657688e70
Add support for odd height YUVA alpha blending.
...
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1507683003 .
2015-12-07 12:03:20 -08:00
Frank Barchard
bea690b3e0
AVX2 YUV alpha blender and improved unittests
...
AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions.
unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255.
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1505433002 .
2015-12-05 22:23:29 -08:00
Frank Barchard
fa2618ee26
Port BlendPlaneRow_SSSE3 to GCC
...
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1490273006 .
2015-12-04 11:19:41 -08:00
Frank Barchard
8af0ebf816
planar blend use signed images
...
R=dhrosa@google.com , harryjin@google.com , jzern@chromium.org
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1491533002 .
2015-12-02 14:20:17 -08:00
Frank Barchard
b6f37bd8ec
Interpolate plane initial implementation.
...
YUV version of interpolation between two images.
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:526
Review URL: https://codereview.chromium.org/1479593002 .
2015-11-25 16:11:42 -08:00
Frank Barchard
88552486f1
disable 411 on x86 due to compile error
...
TBR=harryjin@google.com
BUG=libyuv:524
Review URL: https://codereview.chromium.org/1468523002 .
2015-11-20 11:21:39 -08:00
Frank Barchard
526558b2d8
disable debug build of 411 to work around compiler bug
...
TBR=harryjin@google.com
BUG=libyuv:524
Review URL: https://codereview.chromium.org/1461013002 .
2015-11-19 02:25:00 -08:00
Frank Barchard
b7dfb72559
fix for I411 build error on 32 bit x86
...
TBR=harrjin@google.com
BUG=libyuv:525
Review URL: https://codereview.chromium.org/1461693004 .
2015-11-19 01:45:14 -08:00
Frank Barchard
528356a128
syntax fix for gcc movzwl
...
TBR=harryjin@google.com
BUG=libtyv:525
Review URL: https://codereview.chromium.org/1460723003 .
2015-11-18 13:14:15 -08:00
Frank Barchard
50f8cb2db3
port I411 movzx 2 byte reader to gcc
...
previously the I411 format used movd to read U, V pixels.
But this reads 4 bytes, and can cause a memory exception.
pinsrw can be used, but fails on drmemory 1.5, and is slow.
So in this change a movzxw is used to read 2 bytes into EBX,
then copy to xmm0 with movd.
Slightly slower, but no memory exception
Was LibYUVConvertTest.I411ToARGB_Opt (577 ms)
Now LibYUVConvertTest.I411ToARGB_Opt (608 ms)
TBR=harryjin@google.com
BUG=libyuv:525
Review URL: https://codereview.chromium.org/1457783004 .
2015-11-18 13:05:39 -08:00
Frank Barchard
5eefbe2330
Fix for drmemory failure on I411ToARGB
...
Before
I420ToARGB_Opt (594 ms)
I422ToARGB_Opt (483 ms)
I411ToARGB_Opt (748 ms) ***
I444ToARGB_Opt (452 ms)
I400ToARGB_Opt (218 ms)
After
I420ToARGB_Opt (591 ms)
I422ToARGB_Opt (454 ms)
I411ToARGB_Opt (502 ms) ***
I444ToARGB_Opt (441 ms)
I400ToARGB_Opt (216 ms)
TBR=harryjin@google.com
BUG=libyuv:525
Review URL: https://codereview.chromium.org/1459513002 .
2015-11-17 18:00:52 -08:00
Frank Barchard
ec4b258d4e
free src_a in unittest to fix leak
...
TBR=harryjin@google.com
BUG=libyuv:524
Review URL: https://codereview.chromium.org/1452083002 .
2015-11-17 00:29:53 -08:00
Frank Barchard
0815568a50
test for unaligned vs aligned for CopyRow_SSE2
...
improves performance on older CPUs where movdqa is faster.
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1455463002 .
2015-11-17 00:04:03 -08:00
Frank Barchard
1019e4537f
port I444ToARGB avx2 code from Visual C to GCC.
...
SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN ] LibYUVConvertTest.I444ToARGB_Any
[ OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Unaligned
[ OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Invert
[ OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Opt
[ OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)
AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN ] LibYUVConvertTest.I444ToARGB_Any
[ OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Unaligned
[ OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Invert
[ OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_Opt
[ OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1445893002 .
2015-11-13 18:31:22 -08:00
Frank Barchard
60adcbaf32
scale with conversion using 2 steps with unittest
...
a prototype function to implement the yuv to rgb with conversion and scale.
replace with 1 step function in future version, using same API.
R=harryjin@google.com
BUG=libyuv:471
Review URL: https://codereview.chromium.org/1421553016 .
2015-11-13 11:25:56 -08:00
Frank Barchard
6100f50f13
fix yvu constants for avx2 yuv to rgb
...
the yvu matrix for yuv to rgb had an incorrect entry, affecting yuv to bgra,
yuv to abgr and yuv to raw.
fix the matrix and reenable avx2 functions.
R=harryjin@google.com
BUG=libyuv:522
Review URL: https://codereview.chromium.org/1411763004 .
2015-11-10 10:45:44 -08:00
Frank Barchard
72a9e282ec
disable more avx2 functions that dont link in chrome
...
libyuv builds/runs, but when integrated into chromium, produces link errors. unclear why but this disables affected functions.
will followup with re-enabling them once the root cause in the runtime error is found.
TBR=harryjin@google.com
BUG=libyuv:522
Review URL: https://codereview.chromium.org/1427683004 .
2015-11-09 17:20:02 -08:00
Frank Barchard
fb5ed1f4c5
disable 4 AVX2 YUV to RGB conversions which fails tests.
...
disable I422ALPHATOARGBROW_AVX2 I422TOARGBROW_AVX2 I422TORGB24ROW_AVX2 I422TORGBAROW_AVX2 in row.h.
SSSE3 versions will be used instead.
Short term fix until issue can be resolved.
R=harryjin@google.com
BUG=libyuv:522
Review URL: https://codereview.chromium.org/1419513009 .
2015-11-09 14:40:08 -08:00
Frank Barchard
98eb102bea
set d19 alpha on inner loop
...
TBR=harryjin@google.com
BUG=libyuv:521
Review URL: https://codereview.chromium.org/1429263004 .
2015-11-06 11:38:21 -08:00
Frank Barchard
431cb3667a
YUV to RGB for x64 use registers instead of memory.
...
On Arm the YVU to RGB conversions move constants into registers.
This change does the same for 64 bit intel builds where additional
registers are available.
The AVX2 saves 3 instructions by because the 2nd argument needs to be a register, so a vmovdqu was avoided.
x64 builds using memory:
AVX2 I420ToARGB_Opt (3059 ms)
SSSE3 I420ToARGB_Opt (3959 ms)
Now using registers
AVX2 I420ToARGB_Opt (2906 ms)
SSSE3 I420ToARGB_Opt (3928 ms)
TBR=harryjin@google.com
BUG=libyuv:520
Review URL: https://codereview.chromium.org/1407353010 .
2015-11-04 16:16:18 -08:00
Frank Barchard
c2bff1a1af
add .gn file for gn builds
...
using a stripped down gn file from webrtc.
BUG=libyuv:411,libyuv:519
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/1417613007 .
2015-11-04 11:09:00 -08:00
Frank Barchard
860cc0357a
Neon versions of I420AlphaToARGB
...
Add alpha version of YUV to RGB to neon code for ARMv7 and aarch64.
For other YUV to RGB conversions, hoist alpha set to 255 out of loop.
TBR=harryjin@google.com
BUG=libyuv:516
Review URL: https://codereview.chromium.org/1413763017 .
2015-11-03 19:21:36 -08:00
Frank Barchard
d95d2169d9
rename yuv matrix constants to be more clear about what they are
...
R=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1429693006 .
2015-11-03 17:09:53 -08:00
Frank Barchard
1f1d140bb6
remove mips dsp detect
...
DSP code is not actually used, only DSPR2. Remove the detect.
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1405043008 .
2015-11-03 16:57:40 -08:00
Frank Barchard
ce4c2fad1d
Raw 24 bit RGB to RGB24 (bgr)
...
Add unittests that do 1 step conversion vs 2 step conversion.
Tests end swapping versions match direct conversions.
R=harryjin@google.com
BUG=libyuv:518
Review URL: https://codereview.chromium.org/1419103007 .
2015-11-03 10:30:30 -08:00
Frank Barchard
87926cec8b
remove store bgra, abgr, raw unused macros
...
TBR=harryjin@google.com
BUG=libyuv:518
Review URL: https://codereview.chromium.org/1420033004 .
2015-11-02 10:40:03 -08:00
Frank Barchard
2c7aa0070a
remove I422ToBGRA and use I422ToRGBA internally
...
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.
Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions.
R=harryjin@google.com
BUG=libyuv:518
Review URL: https://codereview.chromium.org/1427993004 .
2015-11-02 10:24:12 -08:00
Frank Barchard
5d97b93369
refactor I420ToABGR to use I420ToARGBRow
...
Using a transposed conversion matrix, I420ToARGB can output ABGR.
R=harryjin@google.com , xhwang@chromium.org
BUG=libyuv:473
Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
254ef01551
disable I420AlphaToARGB for 32 bit intel debug build
...
R=harryjin@google.com
BUG=libyuv:517
Review URL: https://codereview.chromium.org/1428843003 .
2015-10-29 11:21:36 -07:00
Frank Barchard
cdbdf5b723
Fix debug compilation problems for gcc and 32 bit x86.
...
In some methods with 7 arguments gcc fails to find enough registers
to compile the assembler code when compiling debug. Simplest solution
is to skip the assembler version in debug of those particular functions
(I422Alpha -> ARBG/ABGR)
R=harryjin@google.com ,bratell@opera.com
BUG=libyuv:517
Review URL: https://codereview.chromium.org/1423283002 .
2015-10-28 14:27:29 -07:00
Frank Barchard
811a5ec446
pass clangcl compile options to ignore warnings in gflags.cc
...
R=ajm@chromium.org , ajm@google.com
BUG=libyuv:513,webrtc:760
Review URL: https://codereview.chromium.org/1427643003 .
2015-10-28 10:58:19 -07:00
Frank Barchard
b86dbf24d3
refactor I420AlphaToABGR to use I420AlphaToARGB internally
...
swap U and V and transpose conversion matrix, so I420AlphaToARGB and
I420AlphaToABGR share low level code.
Having less code with same performance allows more focused
optimization for future ARM versions.
R=harryjin@google.com
TBR=harryjin@chromium.org
BUG=libyuv:473,libyuv:516
Review URL: https://codereview.chromium.org/1422263002 .
2015-10-27 14:17:21 -07:00
Frank Barchard
cf160cdbaa
implement I444ToABGR by swapping uv and transpose matrix
...
U contributes to B and G. V contributes to R and G.
By swapping U and V, they contribute to the opposite channels. Adjust the matrix so the U contribution is in the matrix location such that it till contribute to the
new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.
As a result the existing I444ToABGRRow functions are no longer needed and are removed.
Previously this function was only Intel AVX2 optimized for Windwos. Now it is also optimized for Arm and GCC.
ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.
R=xhwang@chromium.org
BUG=libyuv:515
Review URL: https://codereview.chromium.org/1414133006 .
2015-10-27 10:21:21 -07:00
Frank Barchard
e8ee175549
add unittest that compares ABGR to ARGB
...
TBR=harryjin@google.com
BUG=libyuv:515
Review URL: https://codereview.chromium.org/1423663007 .
2015-10-26 17:51:03 -07:00
Frank Barchard
2844662e1c
Add avx512bw detection code
...
R=harryjin@google.com
BUG=libyuv:514
Review URL: https://codereview.chromium.org/1413463004 .
2015-10-26 14:42:49 -07:00
Frank Barchard
1502832a70
switch cpu flags to 0 for unitialized to avoid compare
...
R=harryjin@google.com
BUG=libyuv:512
Review URL: https://codereview.chromium.org/1418253002 .
2015-10-23 10:57:42 -07:00
Frank Barchard
ad36ba5c48
initialize cpu flags to fix compile error on windows
...
R=harryjin@google.com
BUG=libyuv:512
Review URL: https://codereview.chromium.org/1422733003 .
2015-10-22 15:16:31 -07:00
Frank Barchard
00f15e3c6c
color unittest allow j420 error of 5 for arm
...
R=harryjin@google.com
BUG=libyuv:511
Review URL: https://codereview.chromium.org/1412683005 .
2015-10-22 11:25:04 -07:00
Frank Barchard
430bb0a0f0
odd width 444 fix
...
TBR=harryjin@google.com
BUG=libyuv:510
Review URL: https://codereview.chromium.org/1415583003 .
2015-10-21 20:03:19 -07:00
Frank Barchard
ba4b409d51
Fix ARGBToI411 odd width bug.
...
The any function for handling ARGBToI411 was not handling the pixel
replication correctly. On 422 and odd width was handled by duplicating
a pixel of source. 411 needs replication for remainders of 1, 2 or 3
pixels.
The C version was handling odd width but with an average of the remainder
pixels, which does not match the SIMD 'any' handling off remainder.
This changes the odd width handling to mimic the any version.
TBR=harryjin@google.com
BUG=libyuv:491
Review URL: https://codereview.chromium.org/1411733004 .
2015-10-21 12:22:24 -07:00
Frank Barchard
9daa550a2e
Move cpu_info variable outside ifdef
...
Fix compile error on arm, mips etc due to undefined variable.
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1403373008 .
2015-10-20 16:32:44 -07:00
Frank Barchard
9be6d21ae7
write to cpu_flags once
...
To make init cpu flags thread safe, there can only be one write to the variable.
R=richard.winterton@intel.com , harryjin@google.com
BUG=libyuv:508
Review URL: https://codereview.chromium.org/1412793006 .
2015-10-20 16:24:01 -07:00
Frank Barchard
e6a54f223a
Call AllowCommandLineReparsing in unit tests
...
Allows us to ignore flags passed on to us by Chromium build bots
without having to explicitly disable them. (Thanks pbos!)
TESTED=webrtc ran modules_unittests with a bogus flag did not result in an
error.
R=kjellander@chromium.org
BUG=libyuv:507
Review URL: https://codereview.chromium.org/1417573002 .
2015-10-19 16:30:41 -07:00
Frank Barchard
94312b695a
add gflags support files from webrtc
...
files needed for command line support with gtest.
These files are copied directly from webrtc.
TBR=kjellander@chromium.org
BUG=libyuv:507
Review URL: https://codereview.chromium.org/1414483002 .
2015-10-16 18:53:25 -07:00
Henrik Kjellander
8dcec019b6
Add gflags dependency
...
Unit tests currently use environment variables to change behavior.
Using gflags this can be done via command line.
BUG=libyuv:507
TBR=fbarchard@chromium.org
Review URL: https://codereview.chromium.org/1413723002 .
2015-10-16 22:08:43 +02:00
Henrik Kjellander
f80cc26da7
Revert "add gflags to deps to allow command line parameters."
...
This reverts commit 2dd3d9230ee663e71ed4ad9164033ed672e571de.
Reason: chromium_git is a missing variable, and to properly
add gflags, we need to check in GYP files in third_party/gflags
first, then add the DEPS entry.
BUG=libyuv:507
TBR=fbarchard@chromium.org
Review URL: https://codereview.chromium.org/1406323002 .
2015-10-16 21:46:56 +02:00
Frank Barchard
2dd3d9230e
add gflags to deps to allow command line parameters.
...
unittests currently use environment variables to change behavior.
using gflags this can be done via command line.
R=kjellander@chromium.org
BUG=libyuv:507
Review URL: https://codereview.chromium.org/1402313002 .
2015-10-16 10:57:51 -07:00
Frank Barchard
5d0a871d37
remove have jpeg test
...
This test is just a printf, not a real test, but somehow
fails on arm.
TBR=harryjin@google.com
BUG=libyuv:506
Review URL: https://codereview.chromium.org/1409913002 .
2015-10-15 19:13:07 -07:00
Frank Barchard
cf19a0c9a2
nv21 any fix
...
R=harryjin@google.com
BUG=libyuv:507
Review URL: https://codereview.chromium.org/1410643002 .
2015-10-15 16:24:51 -07:00
Frank Barchard
52a5504950
fix for C version of YUV to RGB for Arm
...
YuvPixel for arm was miscomputing YG.
TBR=harryjin@google.com
BUG=libyuv:506
Review URL: https://codereview.chromium.org/1402333002 .
2015-10-15 12:43:37 -07:00
Frank Barchard
e2417df4cb
create color test category of unittests to narrow down arm bug
...
A hang in color conversion on arm occurs somewhere in yuv to rgb.
Breaking the color test into its own category of test will help
run selective tests to narrow down the issue.
R=harryjin@google.com
BUG=libyuv:506
Review URL: https://codereview.chromium.org/1405543003 .
2015-10-14 16:58:55 -07:00
Frank Barchard
c7c188379b
avoid vectors for pnacl which cause linker failure.
...
R=sergeyu@chromium.org
BUG=chromium:538243
Review URL: https://codereview.chromium.org/1396363004 .
2015-10-14 14:49:48 -07:00
Frank Barchard
26db4de2ae
break up unittests into categories
...
R=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1399523004 .
2015-10-13 16:01:07 -07:00
Frank Barchard
4abd096548
fix for yuv to rgb on arm64.
...
fill in aarch64 yuv constants to match how the code expects them.
TBR=harryjin@google.com
BUG=libyuv:502
Review URL: https://codereview.chromium.org/1396253004 .
2015-10-12 12:02:54 -07:00
Frank Barchard
2e4466e282
change all pix parameters to width for consistency
...
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
Frank Barchard
2d601aaf34
merge neon source files back into single libyuv library
...
previously the neon source code was broken into a separate
library built with -mfpu=neon for the neon assembly, while
the C code was built without neon.
In this change, the neon code is added to the main library
and all code built with neon.
TBR=harryjin@google.com
BUG=libyuv:371
Review URL: https://codereview.chromium.org/1392043003 .
2015-10-07 21:16:51 -07:00
Frank Barchard
76a599ec3b
fix jpeg and bt.709 yuvconstants for neon64.
...
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces. But the jpeg and bt.709
colour conversion constants were still in armv7 form. This changes
the constants for aarch64 builds to be compatible with the code.
yuv constants are now passed as const *
Remove Yvu constants which were used for older version on nv21 but not new code.
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
8f0cadede4
port ARGB to 565 dithering AVX2 code to GCC.
...
Previously the assembly code was only available to Windows.
This CL ports the AVX2 code to GCC syntax.
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1391273003 .
2015-10-07 19:13:59 -07:00
Frank Barchard
cc89e3a77b
port ARGB to 565 dithering SSE2 code to GCC.
...
Previously the assembly code was only available to Windows.
This CL ports the SSE2 code to GCC syntax.
When running a profiler on all the unittests, this function
was the slowest of all functions that still ran in C code.
3.71% libyuv_unittest libyuv_unittest [.] ARGBToRGB565DitherRow_C
Was
ARGBToRGB565Dither_Opt (2894 ms)
Now
ARGBToRGB565Dither_Opt (432 ms)
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1397673002 .
2015-10-07 18:24:50 -07:00
Frank Barchard
3e38762d6b
fix avx2 box filter bug for yuv down sampling.
...
offset to second group of pixels was off by 16.
should have been 32, not 16.
requires avx2 hardware and wide image for test.
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:492,libyuv:501
Review URL: https://codereview.chromium.org/1395603002 .
2015-10-07 11:02:33 -07:00
Frank Barchard
013080f2d2
Pass yuvconstants to YUV conversions for neon 64 bit
...
SETUP provided by zhongwei.yao@linaro.org
Previously the 64 bit Neon code had hard coded constants in the setup macro
for YUV conversion, while 32 bit Neon code supported the yuvconstants
parameter.
This change accepts the constants passed to the YUV conversion row function,
allowing different color spaces to be respected - naming JPEG and BT.709.
As well as the existing BT.601.
TBR=harryjin@google.com
BUG=libyuv:472
Review URL: https://codereview.chromium.org/1384323002 .
2015-10-06 22:19:14 -07:00
Frank Barchard
914a9856c7
Reimplement NV21ToARGB to allow different color matrix.
...
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V. But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.
TBR=harryjin@google.com
BUG=libyuv:500
Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
f00bc9ef46
Add J444ToARGB conversion function.
...
J444 is JPeg YUV color space with 444 subsampling.
This implementation uses the existing I444ToARGB conversion, which is
BT.601 color space with 444 subsampling, but passing in the jpeg
color matrix constants.
TBR=harryjin@google.com
BUG=449
Review URL: https://codereview.chromium.org/1387313002 .
2015-10-06 18:46:53 -07:00
Frank Barchard
d70293993f
port scale box filter sse2 to gcc
...
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1393653002 .
2015-10-06 16:54:26 -07:00
Frank Barchard
f4c1ac10f0
Speed up rounding to byte test
...
R=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1367403007 .
2015-10-02 15:27:13 -07:00
Frank Barchard
3eefeaeb69
test xsave before calling xgetbv.
...
R=agl@chromium.org , harryjin@google.com
BUG=libyuv:497
Review URL: https://codereview.chromium.org/1382803002 .
2015-09-30 17:25:41 -07:00
Frank Barchard
2cc1a2b233
Remove sse2 functions that also have ssse3
...
ARGBBlendRow_SSE2, ARGBAttenuateRow_SSE2, and MirrorRow_SSE2
Since vast majority of CPUs have SSSE3 now, removing the SSE2
improves the performance of CPU dispatching.
R=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1377053003 .
2015-09-30 14:24:44 -07:00
Frank Barchard
febc26a2c9
win64 version of I422AlphaToARGB.
...
Was
I420AlphaToARGB_Premult (8861 ms)
I420AlphaToARGB_Opt (7119 ms)
Now
I420AlphaToABGR_Premult (2840 ms)
I420AlphaToARGB_Opt (484 ms)
C function switched to 1 step.
Was
I420AlphaToARGB_Premult (8862 ms)
I420AlphaToABGR_Opt (6718 ms)
Now
I420AlphaToARGB_Premult (8706 ms)
I420AlphaToARGB_Opt (6541 ms)
R=harryjin@google.com
BUG=libyuv:496, libyuv:473
Review URL: https://codereview.chromium.org/1359183003 .
2015-09-25 15:06:41 -07:00
Frank Barchard
9a0e12f5f1
AVX2 1 step I422AlphaToARGB for gcc and win.
...
C I420AlphaToARGB_Opt (5169 ms)
SSSE3 I420AlphaToARGB_Opt (432 ms)
AVX2 I420AlphaToARGB_Opt (358 ms)
and with premultiplication as 2 step process:
I420AlphaToARGB_Premult (7029 ms)
I420AlphaToARGB_Premult (757 ms)
I420AlphaToARGB_Premult (508 ms)
R=harryjin@google.com
BUG=libyuv:496,libyuv:473
Review URL: https://codereview.chromium.org/1372653003 .
2015-09-25 13:37:42 -07:00
Frank Barchard
e365cdde3b
I420Alpha row function in 1 pass.
...
API change - I420AlphaToARGB takes flag indicating if RGB should be
premultiplied by alpha.
This version implements an efficient SSSE3 version for Windows.
C version done in 2 steps.
Was
libyuvTest.I420AlphaToARGB_Any (1136 ms)
libyuvTest.I420AlphaToARGB_Unaligned (1210 ms)
libyuvTest.I420AlphaToARGB_Invert (966 ms)
libyuvTest.I420AlphaToARGB_Opt (1031 ms)
libyuvTest.I420AlphaToABGR_Any (1020 ms)
libyuvTest.I420AlphaToABGR_Unaligned (1359 ms)
libyuvTest.I420AlphaToABGR_Invert (1082 ms)
libyuvTest.I420AlphaToABGR_Opt (986 ms)
R=harryjin@google.com
BUG=libyuv:496
Review URL: https://codereview.chromium.org/1367093002 .
2015-09-25 10:29:20 -07:00
Frank Barchard
8fb2048e9f
Fix nv12 64 bit gcc increment.
...
Should be 16 bytes, but was 0x16 causing memory corruption.
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1368693002 .
2015-09-24 10:19:17 -07:00
Frank Barchard
accc04e6d8
NV12ToARGB_AVX2 ported to gcc
...
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1364913002 .
2015-09-23 15:54:16 -07:00
Frank Barchard
000cf89ca8
YUY2ToARGB avx2 in 1 step conversion.
...
Includes UYVYToARGB ssse3 fix.
Was
YUY2ToARGB_Opt (433 ms)
69.79% libyuv_unittest libyuv_unittest [.] I422ToARGBRow_AVX2
20.73% libyuv_unittest libyuv_unittest [.] YUY2ToUV422Row_AVX2
6.04% libyuv_unittest libyuv_unittest [.] YUY2ToYRow_AVX2
0.77% libyuv_unittest libyuv_unittest [.] YUY2ToARGBRow_AVX2
Now
YUY2ToARGB_Opt (280 ms)
95.66% libyuv_unittest libyuv_unittest [.] YUY2ToARGBRow_AVX2
BUG=libyuv:494
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1364813002 .
2015-09-23 11:15:18 -07:00
Frank Barchard
2b92ec8d0f
Fix git markers introduced on landing previous CL
...
BUG=none
Review URL: https://codereview.chromium.org/1359023003 .
2015-09-22 15:00:57 -07:00
Frank Barchard
5f3d4270d1
yuy2 to rgb gcc versions
...
read in read function for yuv conversion
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1355393002 .
2015-09-22 14:27:33 -07:00
Frank Barchard
03cd8584e7
Read Y channel in read function for yuv conversion.
...
Allows reader to support YUY2 format.
Also contains fix for win64 build for yuv conversion.
TBR=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1355333002 .
2015-09-22 12:05:16 -07:00
Frank Barchard
f96890a0be
yuvconstants for all YUV to RGB conversion functions.
...
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1363503002 .
2015-09-22 10:26:03 -07:00
Frank Barchard
62c49dc811
move constants into common
...
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1359443005 .
2015-09-18 16:28:44 -07:00
Frank Barchard
0381673d19
port I444 to ARGB to matrix. Add I444 to ABGR.
...
R=harryjin@google.com
BUG=libyuv:488,libyuv:490
Review URL: https://codereview.chromium.org/1348763005 .
2015-09-18 14:36:15 -07:00
Frank Barchard
28427a53e2
I444ToABGR for android
...
Reimplements I444ToARGB as a matrix function.
new I444ToABGR as matrix functions with wrappers and any functions.
Allows for future J444 and H444 versions.
I444ToABGR user level function added.
BUG=libyuv:490, libyuv:449
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1355733002 .
2015-09-18 11:20:58 -07:00
Frank Barchard
158d4079a3
NEON J422ToABGR and H422ToABGR missing prototypes
...
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1351993003 .
2015-09-17 15:47:55 -07:00
Frank Barchard
bdfd59a728
NEON constants
...
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1351553005 .
2015-09-17 15:28:29 -07:00
Frank Barchard
28ce7d94f5
j422toabgr neon port using i422toabgr matrix function.
...
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1353923003 .
2015-09-17 15:20:55 -07:00
Frank Barchard
6fcbae1409
J422ToARGB Neon but not aarch64
...
TBR=harryjin@google.com
BUG=libyuv:493
Review URL: https://codereview.chromium.org/1348203004 .
2015-09-17 12:43:05 -07:00
Frank Barchard
6a6b67e7a9
Add H422ToARGB armv7 neon version.
...
Patch provided by zhongwei.yao@linaro.org
R=fbarchard@chromium.org , fbarchard@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1344393002 .
2015-09-17 10:38:15 -07:00
Frank Barchard
bb0a521c52
j422 not available on aarch64
...
The aarch64 version does not have I422ToARGBMatrix yet,
so adding this to the ifdef section of row.h
R=harryjin@google.com
TBR=harryjin@google.com , zhongwei.yao@linaro.org
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1347853002 .
2015-09-15 15:26:01 -07:00
Frank Barchard
509c644245
Add J422ToARGB armv7 neon version.
...
R=fbarchard@chromium.org , fbarchard@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1334173005 .
2015-09-15 15:01:48 -07:00
Frank Barchard
fcacbfb27f
validate scan EOI from end for better coverage
...
R=tpsiaki@google.com
BUG=libyuv:478
Review URL: https://codereview.chromium.org/1344623003 .
2015-09-14 10:58:51 -07:00
Frank Barchard
67a9e30225
neon yuv matrix function
...
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://codereview.chromium.org/1337973002 .
2015-09-11 11:12:30 -07:00
Frank Barchard
316e1ab996
avx2 width parameter bug fix
...
R=harryjin@google.com
BUG=libyuv:489
Review URL: https://codereview.chromium.org/1321773004 .
2015-09-09 11:56:35 -07:00
Frank Barchard
8467f14ebb
disable avx2
...
R=harryjin@google.com
BUG=libyuv:489
Review URL: https://codereview.chromium.org/1318893003 .
2015-09-08 11:55:52 -07:00
Frank Barchard
ed55d24d9f
H420 functionality
...
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://webrtc-codereview.appspot.com/54869004 .
2015-09-06 11:01:40 -07:00
Frank Barchard
67b06e66cb
I422ToABGR for win64. Moves any functions to accomidate win64 subset of formats.
...
TBR=harryjin@google.com
BUG=libyuv:488
Review URL: https://webrtc-codereview.appspot.com/57679004 .
2015-09-03 11:00:18 -07:00
Frank Barchard
7060e0d826
I420ToABGRMatrix functions with J420ToABGR wrapper.
...
Allows direct conversion from JPeg to ABGR for android.
BUG=libyuv:488
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/55719004 .
2015-09-03 10:42:36 -07:00
Frank Barchard
fbc3d595e9
define yuvconstants structure all the time, so its can be referred to on all builds.
...
currently only intel code uses this structure, but the prototypes are there for neon and lack of a structure cases a compile error on arm.
R=tpsiaki@google.com
BUG=none
Review URL: https://webrtc-codereview.appspot.com/52799004 .
2015-09-02 14:55:11 -07:00
Frank Barchard
925c3d9e26
I420ToARGB conversion with matrix.
...
Take color conversion constants as a parameter to row function for I420ToARGBMatrixRow_SSSE3.
Allows future variations of color space using a single low level.
R=harryjin@google.com
BUG=libyuv:488
Review URL: https://webrtc-codereview.appspot.com/56669004 .
2015-09-02 10:45:42 -07:00
Frank Barchard
be11f500f0
Use ebp to point to conversion table.
...
Proof of concept that conversions can table color matrix as a parameter.
R=harryjin@google.com
BUG=libyuv:472, libyuv:488
Review URL: https://webrtc-codereview.appspot.com/58489004 .
2015-08-28 12:00:49 -07:00
Frank Barchard
3c4f5735ce
use pointer to inverse table for clangcl
...
R=harryjin@google.com
TBR=harryjin@google.com
BUG=none
Review URL: https://webrtc-codereview.appspot.com/54859004 .
2015-08-26 12:53:03 -07:00
Frank Barchard
d317a70c1d
llvm64 link error fix.
...
R=harryjin@google.com
BUG=libyuv:485
Review URL: https://webrtc-codereview.appspot.com/58479004 .
2015-08-24 14:21:04 -07:00
Frank Barchard
4dfdabb552
I420AlphaToABGR for android version of yuva conversion
...
Same as I420AlphaToARGB but first step converts to ABGR instead of ARGB.
TBR=harryjin@google.com
BUG=libyuv:473
Review URL: https://webrtc-codereview.appspot.com/52779004 .
2015-08-20 19:36:59 -07:00
Frank Barchard
2fb6fd74be
[Android] Remove reference to third_party/android_testrunner.
...
Deleting in https://codereview.chromium.org/1290173003
BUG=chromium:267773
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/54849004 .
2015-08-19 16:13:27 -07:00
Frank Barchard
ee9aaea02f
i422torgb565 is asm for clangcl as well
...
Merge branch 'master' of https://chromium.googlesource.com/libyuv/libyuv into convertcl
allow lto for llvm but not gcc
R=harryjin@google.com
BUG=libyuv:469
Review URL: https://webrtc-codereview.appspot.com/52769004 .
2015-08-19 10:46:30 -07:00
Frank Barchard
bb66c021ff
Re-enable LLVM LTO on Neon targets.
...
LTO was disabled due to a GCC compiler bug that does not affect LLVM.
This fixes the build in the cfi_vptr==1 configuration, which requires LLVM LTO.
R=pcc@google.com
BUG=chromium:469376
Review URL: https://webrtc-codereview.appspot.com/57659004 .
2015-08-18 15:26:52 -07:00
Frank Barchard
94d4269936
clang use scalewin
...
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:469
Review URL: https://webrtc-codereview.appspot.com/51329004 .
2015-08-18 14:50:27 -07:00
Frank Barchard
cda9d38a4e
xmmword cast for clang
...
clangcl use compare_win for 32 bit, allowing fallback and enabling avx2 code for clang.
move defines/protos to compare_row.h
fix issue with odd width ARGBCopyAlpha functions by copying destination to temp buffer, then doing alpha copy, then copy back to destination.
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:484
Review URL: https://webrtc-codereview.appspot.com/59379004 .
2015-08-18 11:13:12 -07:00
Frank Barchard
baf6a3c1bd
Using the visual C source allows clangcl to fallback seamlessly to visual c, and supports SSE41 and AVX2 versions.
...
R=harryjin@google.com
BUG=libyuv:469
Review URL: https://webrtc-codereview.appspot.com/58469004 .
2015-08-17 10:47:43 -07:00
Frank Barchard
278d88f872
Copy Alpha odd width support
...
R=harryjin@google.com
BUG=none
Review URL: https://webrtc-codereview.appspot.com/59369004 .
2015-08-13 15:05:14 -07:00
Frank Barchard
8e7a62f22a
I420AlphaToARGB conversion for planar YUV with Alpha to ARGB.
...
R=brucedawson@chromium.org , harryjin@google.com
BUG=libyuv:473
Review URL: https://webrtc-codereview.appspot.com/54829004 .
2015-08-12 17:01:24 -07:00
Frank Barchard
58f0020137
use visual c 32 bit code for clangcl
...
R=harryjin@google.com
BUG=libyuv:483
Review URL: https://webrtc-codereview.appspot.com/54819004 .
2015-08-11 10:10:45 -07:00
Frank Barchard
9425c4b01a
rotate nv12 any width
...
BUG=libyuv:464
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/55709004 .
2015-08-07 23:48:38 -07:00
Frank Barchard
478ff9608b
Increase error tolerance to 4 for arm on J420 convert
...
BUG=libyuv:479
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/53789004 .
2015-08-07 12:24:25 -07:00
Frank Barchard
6e7ef3fddc
allow xgetbv to be disabled for drmemory testing
...
R=harryjin@google.com
BUG=none
Review URL: https://webrtc-codereview.appspot.com/56649004 .
2015-08-04 15:00:39 -07:00
Frank Barchard
e40384b6d9
remove 32 bit gcc version of UV transpose
...
TBR=harryjin@google.com
BUG=libyuv:481
Review URL: https://webrtc-codereview.appspot.com/52249004 .
2015-08-03 18:03:55 -07:00
Frank Barchard
f14c433916
rotate macros used for source
...
R=brucedawson@chromium.org , harryjin@google.com
BUG=libyuv:481
Review URL: https://webrtc-codereview.appspot.com/52239004 .
2015-08-03 16:12:18 -07:00
Frank Barchard
7cd7f5a80f
avx ifdef for scale HAS_SCALEADDROW_AVX2.
...
R=jzern@google.com
BUG=libyuv:480
Review URL: https://webrtc-codereview.appspot.com/53779004 .
2015-07-31 17:17:14 -07:00
Frank Barchard
f242a4a1a1
ValidateJpeg check for valid pointer and size
...
R=harryjin@google.com
BUG=chromium:497297
Review URL: https://webrtc-codereview.appspot.com/57649004 .
2015-07-30 15:49:48 -07:00
Frank Barchard
93464b926c
Add rotate any support. Fix for sobel for neon which does 16 at a time, not 8. Disable scaling color test that fails on arm. Test is not complete.
...
R=harryjin@google.com
BUG=libyuv:479
Review URL: https://webrtc-codereview.appspot.com/52229004 .
2015-07-28 15:06:20 -07:00
Frank Barchard
cb54e8b69a
rename rotate macros and functions to match
...
BUG=libyuv:477
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/52199004 .
2015-07-27 17:00:41 -07:00
Frank Barchard
6a1d01220a
disable x86 consistently
...
R=harryjin@google.com , jzern@chromium.org
BUG=libyuv:476
Review URL: https://webrtc-codereview.appspot.com/55699004 .
2015-07-27 12:49:54 -07:00
Frank Barchard
18a9027ad9
const warning fix on dither, bump chromium deps and add files to ignore list generated by arm build
...
BUG=none
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/57639004 .
2015-07-27 11:47:01 -07:00
Frank Barchard
2fa4f5a3ea
Adds files and functions for rotate any, but does not hook them up to the caller.
...
rotate any
R=harryjin@google.com
BUG=libyuv:464
Review URL: https://webrtc-codereview.appspot.com/53769004 .
2015-07-27 10:32:08 -07:00
Frank Barchard
5be90d23ee
rotate row included
...
R=tpsiaki@google.com
BUG=libyuv:468
Review URL: https://webrtc-codereview.appspot.com/55679004 .
2015-07-22 17:10:08 -07:00
Frank Barchard
892807d860
move asm out of rotate into win/gcc and header
...
R=harryjin@google.com
BUG=libyuv:468
Review URL: https://webrtc-codereview.appspot.com/51319004 .
2015-07-22 11:22:55 -07:00
Frank Barchard
f5c71e52bb
rowbytes fix for nv12 tests
...
R=harryjin@google.com
BUG=libyuv:466
Review URL: https://webrtc-codereview.appspot.com/50349004 .
2015-07-21 10:48:10 -07:00
Frank Barchard
ce98129951
yuy2tonv12
...
R=bcornell@google.com
BUG=libyuv:466
Review URL: https://webrtc-codereview.appspot.com/51309004 .
2015-07-17 16:22:59 -07:00
Frank Barchard
faa4b14f85
uyvy to nv12
...
R=harryjin@google.com
BUG=libyuv:466
Review URL: https://webrtc-codereview.appspot.com/50339004 .
2015-07-17 14:43:19 -07:00
Frank Barchard
faebf89ce0
src_uv typo fix
...
R=harryjin@google.com
BUG=none
Review URL: https://webrtc-codereview.appspot.com/51299004 .
2015-07-15 18:21:06 -07:00
Frank Barchard
3d190ee9f1
break rotate into files by cpu in preparation for optimization.
...
R=bcornell@google.com
BUG=libyuv:464
Review URL: https://webrtc-codereview.appspot.com/51289004 .
2015-07-14 10:23:10 -07:00
Frank Barchard
673fe7a684
create rotate_row header
...
R=tpsiaki@google.com , tpsiaki
BUG=none
TESTED=local build still works.
Review URL: https://webrtc-codereview.appspot.com/50329004 .
2015-07-09 14:40:35 -07:00
Frank Barchard
0e83b64e88
scalerow avx2 bug fix. was using ymm2 instead of ymm3.
...
R=harryjin@google.com
BUG=libyuv:462
Review URL: https://webrtc-codereview.appspot.com/56639004 .
2015-07-07 17:48:04 -07:00
Frank Barchard
715a29195b
vpermq for avx2 ARGB4444ToARGB, ARGB1555ToARGB and RGB565ToARGB
...
R=harryjin@google.com
BUG=libyuv:462
Review URL: https://webrtc-codereview.appspot.com/52759004 .
2015-07-07 17:06:04 -07:00
Frank Barchard
97b35daf75
disable faulty avx2 in argb conversions and box filter. and extend temporary buffer to 128 for an avx2 any function.
...
R=harryjin@google.com
BUG=libyuv:462
TESTED=libyuv_unittest run on haswell laptop
Review URL: https://webrtc-codereview.appspot.com/53759004 .
2015-07-07 15:40:24 -07:00
Frank Barchard
9487b9d6d8
any allow for avx2 32 pixels at a time of argb
...
R=harryjin@google.com
BUG=libyuv:461
Review URL: https://webrtc-codereview.appspot.com/54779004 .
2015-07-01 17:50:48 -07:00
Frank Barchard
cff11a17d6
remove tools from git that were previously checkin by accident.
...
R=harryjin@google.com , brucedawson@chromium.org
BUG=none
TESTED=untested
Review URL: https://webrtc-codereview.appspot.com/56619004 .
2015-06-30 10:45:24 -07:00
Frank Barchard
82180e8296
rgb24toyuv use 1 or 2 steps consistently.
...
R=bcornell@google.com , impjdi@google.com
BUG=libyuv:459
Review URL: https://webrtc-codereview.appspot.com/52149004 .
2015-06-29 16:51:05 -07:00
Frank Barchard
0686f26938
blend remove alignment 1 pixel loop for less overhead.
...
R=tpsiaki@google.com
BUG=none
TESTED=libyuvTest.ARGBBlend_Opt
Review URL: https://webrtc-codereview.appspot.com/50289005 .
2015-06-24 11:34:12 -07:00
Frank Barchard
553c7f85f1
mirror odd width with simd
...
R=harryjin@google.com
BUG=libyuv:448
Review URL: https://webrtc-codereview.appspot.com/54769004 .
2015-06-23 17:53:02 -07:00
Frank Barchard
6a9ef1ea36
any 1 to 2 with stride use SIMD
...
R=harryjin@google.com
BUG=libyuv:448
Review URL: https://webrtc-codereview.appspot.com/54759004 .
2015-06-23 17:08:08 -07:00
Frank Barchard
6dde4f14bd
argb to uv read 4 not 8
...
R=harryjin@google.com
BUG=libyuv:457
Review URL: https://webrtc-codereview.appspot.com/52139004 .
2015-06-23 14:48:37 -07:00
Frank Barchard
54100b91c1
copy 2 rows for interpolate and use SIMD.
...
R=harryjin@google.com
BUG=libyuv:448
Review URL: https://webrtc-codereview.appspot.com/50279004 .
2015-06-23 10:41:46 -07:00
Frank Barchard
3b5d726a4f
1 to 1 any functions with a parameter use memcpy.
...
R=harryjin@google.com
BUG=libyuv:448
Review URL: https://webrtc-codereview.appspot.com/57619004 .
2015-06-22 15:08:20 -07:00
Frank Barchard
a0fca88b1d
remove fmemcpy and bump version
...
R=harryjin@google.com
BUG=libyuv:448
Review URL: https://webrtc-codereview.appspot.com/50269004 .
2015-06-19 17:58:17 -07:00
Frank Barchard
cae07fb0e0
bump subsampling up
...
BUG=455
TESTED=libyuvTest.ARGBToYUY2_Random
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/58419004 .
2015-06-12 15:25:03 -07:00
Frank Barchard
03da5420bc
use SIMD for I420ToARGB odd widths in a temporary buffer instead of using C for remainder.
...
Enter a description of the change.
use SIMD for I420ToARGB odd widths in a temporary buffer instead of using C for remainder. Currently the C code does not exactly match the SIMD code, so an odd width produces different pixels than an even width, causing a subtle artifact. By using SIMD consistently, there is no difference in even and odd widths. Also the SIMD performance is faster, so even with overhead of memcpy, performance improves.
BUG=447
TESTED=out\release\libyuv_unittest.exe --gtest_filter=*I420ToARGB*
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/55579004 .
2015-06-11 16:38:52 -07:00
fbarchard@google.com
05416e2d9a
Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows.
...
BUG=425
TESTED=out\release\libyuv_unittest --gtest_filter=*ScaleTo1x1*
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/52659004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1428 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 01:05:18 +00:00
fbarchard@google.com
b07de879b6
enable intrinsics for clangcl if -mssse3 is enabled.
...
BUG=451
TESTED=untested
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/52699004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1427 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-08 22:48:18 +00:00
fbarchard@google.com
b3d3db1b33
align clangcl using declspec instead of gcc style vector
...
BUG=451
TESTED=clang=1 build on windows
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/57549004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1425 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-08 21:37:37 +00:00
fbarchard@google.com
d3d8e0d933
make source for planar tests contiguous to test planar functions coalesce into a single low level call.
...
BUG=431
TESTED=SetPlane unittest
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/51999004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1419 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-01 23:28:59 +00:00
fbarchard@google.com
e787144c2d
adjust dimensions for scale factor tests to ensure the scale factor tested is actually used.
...
BUG=none
TESTED=set LIBYUV_WIDTH=1918 libyuvTest.ScaleDownBy3by4_None
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/47349004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1416 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-30 00:45:08 +00:00
fbarchard@google.com
bd2d903e1b
odd width support for ARGBSobel functions. Improves performance for images that are not a multiple of 8 pixels.
...
BUG=444
TESTED=libyuvTest.ARGBSobel_Opt
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/54589004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1415 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-28 22:22:28 +00:00
fbarchard@google.com
cfce47efc8
Change Sobel to use JPeg Luma calculation instead of extracting G channel. Using luma produces a better sobel that respects all 3 channels of RGB. Historically the G channel was used to improve performance, and because the luma of I420 is a constrained range, hurting quality. Using the JPeg variation of YUV, the luma is more accurate, including cross platform, better optimized for AVX2 and odd widths, and full range.
...
BUG=444
TESTED=ARGBSobelXY_Opt
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/57479004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1414 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-27 22:32:26 +00:00
fbarchard@google.com
535a7140f2
Scale Down by factor tests scale down to specified ratio rather than up. This ensures the alignment constrains on the destination dont cause a different factor to be used.
...
BUG=431
TESTED=libyuvTest.ScaleDownBy3_Bilinear
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/47309004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1413 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-26 23:22:01 +00:00
fbarchard@google.com
7c09264ffc
odd width support for scale by even scale factor and box scale down by 4. scale down by 4 uses scale down by 2 internally.
...
BUG=431
TESTED=libyuvTest.ARGBScaleDownBy4_Bilinear
Review URL: https://webrtc-codereview.appspot.com/57399004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1412 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-26 17:56:51 +00:00
fbarchard@google.com
c38aeec322
scale down by 2 on argb images support odd widths using _any function.
...
BUG=431
TESTED=libyuvTest.ARGBScaleDownBy2_Bilinear
Review URL: https://webrtc-codereview.appspot.com/52569004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1410 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-22 21:39:21 +00:00
fbarchard@google.com
3666015261
add nacl macros for arm to YUV422TORGB_SETUP_REG.
...
BUG=415
TESTED=ncval.exe newlib/Release/nacltest_arm.nexe
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/46229005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1406 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-12 21:33:32 +00:00
fbarchard@google.com
7be3bc65a1
enable speed optimization for libyuv
...
BUG=439
TESTED=out\release\libyuv_unittest --gtest_filter=*I420ToARGB_Opt
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/55389004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1405 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-11 21:58:42 +00:00
fbarchard@google.com
b33dc47b54
sobel use LL for constants to be passed in as int64
...
BUG=437
TESTED=local ios build
Review URL: https://webrtc-codereview.appspot.com/47129004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1404 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-06 02:34:16 +00:00
fbarchard@google.com
b0f8352245
row_neon64 additional fixes for warning on ios where int doesnt match %2 size which is 64 bit by default. change size to explicitely 32 bit with %w2.
...
BUG=437
TESTED=try bots
Review URL: https://webrtc-codereview.appspot.com/43349004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1401 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-05 17:26:57 +00:00
fbarchard@google.com
ab6b224675
fix for arm builds where tmp for assembly produces an error if its uninitialized.
...
BUG=libyuv:432
TESTED=try bots
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/49249004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1392 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-30 18:21:19 +00:00
fbarchard@google.com
9f4636e298
AVX2 port of ScaleDownBy4.
...
BUG=314
TESTED=out\release\libyuv_unittest --gtest_filter=*.ScaleDownBy4*
Review URL: https://webrtc-codereview.appspot.com/46159004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1390 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-30 01:58:32 +00:00
fbarchard@google.com
f995021f35
Work around casting warnings in scale_neon64.cc for ios 64 bit.
...
BUG=430
TESTED=untested
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/49799004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1382 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 00:02:46 +00:00
fbarchard@google.com
4e78b8dc2e
scale to 3/4 or 3/8 with odd width destinations efficiently. previously if width was not multiple of what the simd loop would do (24), scaling would fall back on slower C code. This change allows SIMD to be used for most of the scaling and C for the remainder, improving efficiency.
...
BUG=314
TESTED=set LIBYUV_WIDTH=1896 & ScaleDownBy3by4_*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/48249004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1380 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-27 21:56:08 +00:00
fbarchard@google.com
c8a2c236a0
NaCl/GYP: remove references to prep_toolchain from libyuv. prep_toolchain is now a no-op.
...
BUG=none
TESTED=untested
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/49769004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1378 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-22 17:42:21 +00:00
fbarchard@google.com
2b7f6b7dee
ScaleAddRows_Any_SSE2 functions for handling odd widths.
...
BUG=425
TESTED=out\release\libyuv_unittest_old --gtest_filter=*.ScaleDownBy3_*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/45219004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1377 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-22 00:51:56 +00:00
fbarchard@google.com
01db3d1d1d
Remove declspec(align(32)) from AVX2 functions.
...
BUG=422
TESTED=untested
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/43229004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1374 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-20 22:57:04 +00:00
fbarchard@google.com
812f59ed40
box and point sampling use scaledownby4 but linear and bilinear do not.
...
BUG=427
TESTED=out\release\libyuv_unittest --gtest_filter=*.ScaleDownBy4_*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/51689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1373 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-17 18:04:09 +00:00
fbarchard@google.com
8eb887f56f
add empty header for backwards compatibility.
...
BUG=none
TESTED=lint passes
Review URL: https://webrtc-codereview.appspot.com/46919004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1371 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-15 23:09:09 +00:00
fbarchard@google.com
c9986313ac
lsl by 2 requires a number sign for xcode on ios 64 bit build. add the # sign for ios compatibility. remove legacy x86 asm files that are unused. the unused files cause complications in build systems that build all files.
...
BUG=libyuv:423
TESTED=try bots
R=noahric@google.com
Review URL: https://webrtc-codereview.appspot.com/45119004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1369 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 19:57:33 +00:00
fbarchard@google.com
32ad6e0e12
Remove unused variable 'I422ToRGB565Row' that breaks osx builds.
...
BUG=426
TESTED=untested
Review URL: https://webrtc-codereview.appspot.com/48079004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1368 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 02:50:35 +00:00
fbarchard@google.com
013e812275
Port box filter to AVX2.
...
BUG=libyuv:425
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*libyuvTest.ScaleTo640x360_Box
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/43149004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1367 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 00:21:15 +00:00
fbarchard@google.com
b5ea79d845
add rows handle height of 1 using a more general while-style loop.
...
BUG=none
TESTED=try bots
Review URL: https://webrtc-codereview.appspot.com/45999004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1366 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-13 18:56:08 +00:00
fbarchard@google.com
c7161d1c36
Remove code alignment declspec from Visual C versions for vs2014 compatibility.
...
BUG=422
TESTED=local vs2013 build still passes.
Review URL: https://webrtc-codereview.appspot.com/45959004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1365 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-12 23:54:26 +00:00
fbarchard@google.com
1eb51bcf01
Fix bug in YUV to RGB for gcc/clang and enable affected functions.
...
BUG=393
TESTED=sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I422ToARGB*
Review URL: https://webrtc-codereview.appspot.com/48019004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1364 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-08 02:32:33 +00:00
fbarchard@google.com
bb5a009d11
ARGB4444ToARGB and ARGB1555ToARGB ported to AVX2.
...
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGB4444ToARGB*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/48009004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1363 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 23:52:57 +00:00
fbarchard@google.com
8b9f908134
RGB565ToARGB AVX2 vzeroupper before the ret, not after.
...
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/51549004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1362 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 22:53:12 +00:00
yang.zhang@arm.com
5f609856de
Add ScaleARGBFilterCols_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleARGBFilterCols_NEON are implemented.
BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: Ifea62bc25d846bf16cb51d13b408de7bf58dccd4
Review URL: https://webrtc-codereview.appspot.com/46699004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1361 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 03:45:29 +00:00
fbarchard@google.com
2827277496
port RGB565ToARGB to AVX2.
...
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/49609004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1357 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 19:24:23 +00:00
fbarchard@google.com
e2ea106068
shift for arm wants a # sign for nacl and ios.
...
BUG=420
TESTED=d:/src/nacl_sdk/pepper_canary/toolchain/win_arm_newlib/bin/arm-nacl-g++ -o newlib/Release/source/scale_neon_arm.o -c source/scale_neon.cc -g -O2 -pthread -MMD -DNDEBUG -Id:/src/nacl_sdk/pepper_canary/include -Id:/src/nacl_sdk/pepper_canary/include/newlib -I./include
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/47949004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1356 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-03 23:28:44 +00:00
fbarchard@google.com
44b6ba91e4
Scale down by 4 for odd number of destination pixels using 'any' that handles SIMD for multiple of 8 pixels, and C for the remainder.
...
BUG=314
TESTED=local test with width odd
Review URL: https://webrtc-codereview.appspot.com/49599004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1355 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-03 22:12:53 +00:00
fbarchard@google.com
62a9fe303c
code style cleanup of scale functions. no functional change.
...
BUG=none
TESTED=lint
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/48839004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1354 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-02 21:23:52 +00:00
fbarchard@google.com
c70c7c02ff
scale to half size optimization for avx2 - use pmaddubsw instruction to horizontally add bytes, then pavgw to round and divide by 2.
...
BUG=314
TESTED=libyuvTest.ScaleDownBy2*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/45909004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1352 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-31 23:59:27 +00:00
yang.zhang@arm.com
f23d6222ac
Add ScaleARGBCols_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleARGBCols_NEON are implemented.
BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: Id9ad97f7aa5d8a34cd55ace9e648cb6ff028efd9
Review URL: https://webrtc-codereview.appspot.com/47689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1351 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-31 03:03:05 +00:00
fbarchard@google.com
72673ac873
linear and point sample scale to half size for AVX2.
...
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/44959004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1349 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-30 21:46:08 +00:00
fbarchard@google.com
9ef8999ff3
scale to half size use pmadd/pavgw to horizontal averaging.
...
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/50529004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1348 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-27 18:20:21 +00:00
fbarchard@google.com
e6ca9cc2a2
Scale down by 2 AVX2 port. Processes twice as many pixels as SSE2 and takes advantage of 3 argument instructions to reduce register usage and number of instructions.
...
BUG=314
TESTED=libyuvTest.ScaleDownBy2_Box
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/42959004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1347 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-26 23:21:08 +00:00
fbarchard@google.com
f16f33d4ce
All cpu flags to be set so that instead of comparing C code, compare assembler to assembler, for benchmarking purposes.
...
BUG=none
TESTED=libyuv_unittest.exe
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/50499004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1346 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-26 18:22:29 +00:00
fbarchard@google.com
d41fbf40dd
Handle scale down by factor of 2 efficiently by calling SIMD for multiple of 16 destination pixels, and C for remainder.
...
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/48689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1344 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-24 23:25:30 +00:00
fbarchard@google.com
d28cd77f99
Enable assembly for clangcl build on Windows. Previously assembly was disabled so clangcl would work, but only with C code. As clangcl mimics both Visual C and GCC, ifdefs need to pick one or the other or often you'll end up with both. In this CL we disable most Visual C code and use the GCC versions which allow assembly for both 32 and 64 bit intel.
...
BUG=412
TESTED=clang=1 build on windows
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/51389004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1341 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 20:36:31 +00:00
yang.zhang@arm.com
d6d7de5742
Add ScaleFilterCols_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleFilterCols_NEON are implemented.
BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: I5b0838769ffb0182155d7cd6bcc520eb81eb5c4e
Review URL: https://webrtc-codereview.appspot.com/41349004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1340 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 03:55:05 +00:00
fbarchard@google.com
0e4388aea3
I422ToRGB24 AVX2 and I422ToRAW
...
BUG=none
TESTED=I422ToRGB24 unittest
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/46619004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1337 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 17:25:27 +00:00
yang.zhang@arm.com
4d387fc619
Add ScaleARGBRowDown2Linear_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleARGBRowDown2Linear_NEON are implemented.
BUG=319
TESTED=libyuvTest.ARGBScale* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: Ife602c81b51aa36e0d56b9d628f278a24eed96f6
Review URL: https://webrtc-codereview.appspot.com/44689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1336 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:23:59 +00:00
yang.zhang@arm.com
e246e6c18f
Add ARGBToRGB565DitherRow_NEON for ARM32/64
...
ARM32/64 NEON versions of ARGBToRGB565DitherRow_NEON are implemented.
BUG=407
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: Ia689170fb39db964392e5e1113801592ab0628bf
Review URL: https://webrtc-codereview.appspot.com/49409004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1335 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:22:25 +00:00
fbarchard@google.com
3b4f5eb7b8
Port J422 colorspace to GCC
...
BUG=414
TESTED=try bots
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/43809004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1334 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:54:50 +00:00
fbarchard@google.com
92f7f421fd
rename I400 to J400 and I400 reference to I400. J400 is a simple replication of values to convert to RGB, which is what the old I400 was. I400 reference is the Y part of the YUV formula, so renaming that to I400.
...
BUG=none
TESTED=libyuvTest (5925 ms total)
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/50369005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1333 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:01:18 +00:00
fbarchard@google.com
f301777060
Fix YToARGB and tweaks to thresholds in YUV tests.
...
BUG=411
TESTED=libyuvTest.TestYToARGB
R=bcornell@google.com
Review URL: https://webrtc-codereview.appspot.com/44709004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1330 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 19:50:33 +00:00
fbarchard@google.com
f2fad0faa5
Optimized J422ToARGB.
...
BUG=414
TESTED=J422ToARGB unittest
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/42799004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1328 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 18:08:30 +00:00
yang.zhang@arm.com
ca5b1bd58b
Add ScaleAddRows_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleAddRows_NEON are implemented.
BUG=319
TESTED=libyuvTest.Scale* on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: I45b88c2b5f576042ba5b3d8d6f8851257fdb7218
Review URL: https://webrtc-codereview.appspot.com/46379004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1326 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 02:57:46 +00:00
fbarchard@google.com
63726ed9c6
test different ways to round and clamp
...
BUG=none
TESTED=TestRoundToByte
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/41299004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1325 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-13 22:24:45 +00:00
fbarchard@google.com
952ca5f26f
Fix for planar functions SSE2 enable when building with clang for Windows.
...
BUG=412
TESTED=clang=1 for build on windows
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/45679004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1324 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-13 21:37:14 +00:00
fbarchard@google.com
685b92b0a6
I400ToARGB_AVX2 port from SSE2 to AVX2.
...
BUG=403
TESTED=libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I400ToARGB*
R=brucedawson@google.com
Review URL: https://webrtc-codereview.appspot.com/46569004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1322 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 18:12:17 +00:00
fbarchard@google.com
f5a7b2b48a
I411ToARGB AVX2 version
...
BUG=403
TESTED=I411ToARGB unittest
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/42689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1321 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 00:08:56 +00:00
fbarchard@google.com
1e4a14f410
scale avoid math overflow in fixed point for large images
...
BUG=410
TESTED=set LIBYUV_WIDTH=65536 out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=libyuvTest.ScaleTo320x240_None
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/42319004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1320 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 22:30:47 +00:00
fbarchard@google.com
be77e062da
Make TestFullYUV test do full yuv color space by default with randomized Y for inner loop
...
BUG=none
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*TestFullYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/47499005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1319 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 21:26:46 +00:00
fbarchard@google.com
24152b2435
Dither from I420 to RGB565 in 2 steps - I420ToARGB then ARGBToRGB565.
...
BUG=407
TESTED=untested
R=brucedawson@google.com , tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/48429004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1315 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 01:45:04 +00:00
fbarchard@google.com
cdd80e04c9
Port I444ToARGB to AVX2.
...
BUG=403
TESTED=I444ToARGB unittests
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/45589004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1314 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-09 21:56:48 +00:00
fbarchard@google.com
7c55ae4ada
Change YUV full test to use pseudo random order
...
BUG=none
TESTED=libyuvTest.TestFullYUV
Review URL: https://webrtc-codereview.appspot.com/45539004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1313 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-08 23:00:39 +00:00
fbarchard@google.com
9c96867a97
lrintf is not supported by visual studio 2010; replace instances of lrintf with a cast to int.
...
BUG=409
TESTED=python build\gyp_chromium -fninja -G msvs_version=2010 --depth=. libyuv_test.gyp
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/44569004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1312 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 22:20:41 +00:00
fbarchard@google.com
697c5aa831
disable nv12 avx2 for vs9/10 that dont support avx2 instructions.
...
BUG=409
TESTED=try bots
R=harryjin@google.com , johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/43629004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1311 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 19:12:21 +00:00
fbarchard@google.com
bdeb9ac584
switch from 8x8 to 4x4 matrix for dithering
...
BUG=407
TESTED=Dither unittests
R=brucedawson@google.com
Review URL: https://webrtc-codereview.appspot.com/46459004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1310 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 18:28:00 +00:00
fbarchard@google.com
0fe4abbc5c
ARGBToRGB565 AVX2 with dithering
...
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=brucedawson@google.com , harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/44519004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1309 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 22:31:43 +00:00
fbarchard@google.com
9245317e16
ARGBToRGB565 SSE2 port.
...
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41039004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1308 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 00:00:50 +00:00
yang.zhang@arm.com
274c9bce92
Add ScaleRowDown2Linear_NEON for ARM32/64
...
ARM32/64 NEON versions of ScaleRowDown2Linear_NEON are implemented.
BUG=319
TESTED=libyuvTest.ScaleDownBy2_Linear on ARM32/64 with Android
R=fbarchard@google.com
Change-Id: I2c7f43a0d56ed4dfded5bdbbb61765d87d65a2ba
Review URL: https://webrtc-codereview.appspot.com/43519005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1307 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-03 02:17:16 +00:00
fbarchard@google.com
693e0217c8
ARGBToRGB565 C version use unsigned dither matrix pattern to bump pixels to next brighter value.
...
BUG=407
TESTED=unittest passes
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/43539004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1306 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-02 23:46:09 +00:00
fbarchard@google.com
6eaee58589
shift by 16 for neon expects a number sign
...
BUG=408
TESTED=nacl arm build
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/38329004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1305 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-02 18:48:17 +00:00
fbarchard@google.com
933bd40c3c
port ARGBToRGB565 and ARGB1555 to AVX2. Enable functions that use ARGBToRGB565 AVX2 code. Add ARGBToRGB565Dither function.
...
BUG=403
TESTED=local windows build
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/42109004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1302 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-27 21:15:28 +00:00
fbarchard@google.com
a5d26cedb1
if building with gcc and sse2 is not enabled, disable assembly
...
BUG=none
TESTED=nacl build with default options
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/44389004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1298 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 19:28:45 +00:00
fbarchard@google.com
bffd326f74
AVX2 version of ARGBToARGB4444
...
BUG=403
TESTED=local build on windows
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/43429004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1297 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 17:26:28 +00:00
fbarchard@google.com
d96047761e
AVX2 version of NV12ToARGB
...
BUG=403
TESTED=untested
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/40089004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1295 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:45:08 +00:00
fbarchard@google.com
3c11d4bf6e
align avx2 buffers to 32 bytes
...
BUG=403
TESTED=untested
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/40929004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1294 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:31:28 +00:00
fbarchard@google.com
446fa95587
I422ToRGB565, ARGB4444 and ARGB1555 for AVX2
...
BUG=403
TESTED=avx2 emulator
Review URL: https://webrtc-codereview.appspot.com/34359004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1293 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:14:46 +00:00
fbarchard@google.com
e2f1a75474
move mask to last parameter of any functions for consistency.
...
BUG=none
TESTED=local libyuv unittest passes
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/43419004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1292 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 21:18:30 +00:00
fbarchard@google.com
239962fa00
YUY2 and UYVY to ARGB AVX2 versions via wrappers.
...
BUG=403
TESTED=UNTESTED
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/34339004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1291 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 18:58:51 +00:00
kjellander@google.com
28d1a582ba
Revert "YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions."
...
This reverts r1288 due to breaking compilation on bots:
http://build.chromium.org/p/client.libyuv/builders/Mac64%20Debug/builds/365
http://build.chromium.org/p/client.libyuv/builders/Linux64%20Debug/builds/667
TBR=fbarchard@google.com
TESTED=Reverted locally and all built fine again.
Review URL: https://webrtc-codereview.appspot.com/40879004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1289 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-23 09:22:22 +00:00
fbarchard@google.com
b52606c024
YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions.
...
BUG=403
TESTED=convert unittest
R=brucedawson@google.com
Review URL: https://webrtc-codereview.appspot.com/40839004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1288 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-21 00:49:35 +00:00
fbarchard@google.com
6a192487fe
Switch SSSE3 row wrappers from variable sized malloc to fixed size array with loop to process a portion of the row at a time. This helps performance in the case where the image has been coalesced into a single large row and the allocator, although only called once, is slow to clear the pages. Also the smaller temporary buffer fits cache, further improving performance.
...
BUG=403
TESTED=YUY2ToARGB unittest
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/40849004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1286 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-20 22:46:15 +00:00
fbarchard@google.com
194f740d0e
Scan from start of buffer to handle case where an invalid size was passed.
...
BUG=404
TESTED=libyuvTest.ValidateJpegLarge
R=brucedawson@google.com , harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41989004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1285 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-18 01:57:31 +00:00
fbarchard@google.com
a965a97d8e
Unittest to test ValidateJpeg when jpeg is small but buffer is large
...
BUG=404
TESTED=libyuvTest.ValidateJpegLarge
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/36169004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1284 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-17 19:16:14 +00:00
fbarchard@google.com
975dd5a699
macros for storing RGB on windows.
...
BUG=403
TESTED=local windows build
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/38119004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1283 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-14 00:50:48 +00:00
fbarchard@google.com
5ab38f9258
Remove Q420 fourcc support.
...
BUG=396
TESTED=local build of unittest builds and passes
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/39089004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1278 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 18:20:54 +00:00
fbarchard@google.com
738dfa0307
Support odd widths for NV12 format when cropping vertically.
...
BUG=400
TESTED=CropNV12
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/39009004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1272 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 02:18:38 +00:00
fbarchard@google.com
695f42fdb8
fix for odd width but even height in TestI420
...
BUG=400
TESTED=libyuv unittests pass locally with width of 11
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41859004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1271 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 21:40:27 +00:00
fbarchard@google.com
0887315390
Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
...
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/38059004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
35037cb948
For 32 bit x86 with fpic use memory instead of register for count
...
BUG=399
TESTED=try bots
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/36019004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1269 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-08 21:28:40 +00:00
fbarchard@google.com
fd8054791c
build fixe for InterpolateRow_MIPS_DSPR2
...
BUG=398
TESTED=untested
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/37999004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1268 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-07 01:01:30 +00:00
fbarchard@google.com
a246086243
use the same structures for sse and avx yuv to rgb.
...
BUG=396
TESTED=local build still passes on sse
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/34999004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1267 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-06 21:05:38 +00:00
fbarchard@google.com
cf925c50bc
Make Yvu vs Yuv use same code and structure but pass in a different version of the matrix
...
BUG=396
TESTED=ncval
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/34979004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1266 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-06 00:29:08 +00:00
fbarchard@google.com
1663996c52
Remove ifdef __SSE2__ and native client ifdef for r14 in register usage declarations.
...
BUG=395
TESTED=gcc build with nacl
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/34149004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1264 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 23:09:15 +00:00
fbarchard@google.com
baafc97d6b
port YToARGB AVX2 to GCC
...
BUG=393
TESTED=untested
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/39819004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1262 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 20:17:27 +00:00
fbarchard@google.com
f7e5b5e361
Enable AVX I422ToARGB for Windows.
...
BUG=393
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I422ToARGB_Opt
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41789004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1261 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 19:41:10 +00:00
fbarchard@google.com
9d67669697
make histogram use 8 digits for all values for more consistent formatting.
...
BUG=394
TESTED=TestFullYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/34129004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1260 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-04 20:00:38 +00:00
fbarchard@google.com
c4e032c543
change Y multiplier and bias to compensate for 257/256 which makes YToARGB exactly match float math.
...
Histogram Before
hist -3 -2 -1 0 1 2 3
red 0 0 1809408 13140736 1827072 0 0
green 0 0 1679912 13471329 1625975 0 0
blue 168448 994816 1876480 10655488 1893376 1006336 182272
Histogram After
hist -3 -2 -1 0 1 2 3
red 0 0 558848 15632128 586240 0 0
green 0 0 209907 16350588 216721 0 0
blue 14848 642816 1989376 11363328 2053120 695040 18688
BUG=394
TESTED=more stringent luma tests
R=brucedawson@google.com
Review URL: https://webrtc-codereview.appspot.com/38859004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1259 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-04 19:45:26 +00:00
fbarchard@google.com
3982998c7c
YToARGB AVX2 port from SSE2
...
BUG=393
TESTED=YToARGB unittest
R=brucedawson@google.com , harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41679004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1258 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-03 01:35:11 +00:00
fbarchard@google.com
0494ffee81
use lrintf to round from float to int instead of round and then cast.
...
BUG=393
TESTED=local windows test passed.
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/37899004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1257 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-02 21:06:51 +00:00
fbarchard@google.com
c61394789d
Test for YToARGB to ensure ordering of values.
...
BUG=393
TESTED=TestYToARGB
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/37819004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1256 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-02 18:51:29 +00:00
fbarchard@google.com
dc2c90100c
Disable YUV to ARGB AVX2 versions.
...
BUG=393
TESTED=I420ToRGB*
Review URL: https://webrtc-codereview.appspot.com/35879004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1255 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-27 02:40:32 +00:00
fbarchard@google.com
4e155ef6c4
Change test to test for Arm, since all CPUs except arm provide accurate yuv conversion
...
BUG=392
TESTED=try bots
Review URL: https://webrtc-codereview.appspot.com/34059004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1254 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-27 00:22:10 +00:00
fbarchard@google.com
4848b06016
YPixel subtract bias to match C code
...
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/37799004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1253 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:58:20 +00:00
fbarchard@google.com
f0845348fe
Add a test for YToARGB to match exactly I420ToARGB
...
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/38739004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1252 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:22:13 +00:00
fbarchard@google.com
29db9b0b89
C version of YToARGB with ubias removed to produce consistent luma ramp.
...
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/35869004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1251 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:07:46 +00:00
fbarchard@google.com
63882a356f
Disable YToARGB assembly which is off by 1
...
BUG=392
TESTED=libyuvTest.YToARGB_Opt
Review URL: https://webrtc-codereview.appspot.com/40549004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1250 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 17:16:44 +00:00
fbarchard@google.com
080a316492
port yuv chroma improvements to gcc. YUV to RGB is more accurate using a negative matrix. 2% slower but half as much error.
...
BUG=324
TESTED=try bots
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/41629004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1249 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 04:35:51 +00:00
fbarchard@google.com
d12a08712b
adjust ubias to minimize error histogram centering error.
...
BUG=324
TESTED=TestFullYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/37739004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1248 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 22:16:33 +00:00
fbarchard@google.com
eb8dda3ac7
fix for ybias on YToARGB function.
...
BUG=324
TESTED=libyuvTest.YToARGB_Any
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/36939004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1247 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 18:31:29 +00:00
fbarchard@google.com
b114986477
Change YUV to RGB to subtract the chroma contributions from the bias.
...
BUG=324
TESTED=win64 build and TestFullYUV
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/33999004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1246 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 04:22:35 +00:00
fbarchard@google.com
c62d30111f
adjust bias on Y channel so error histogram is better centered on green channel
...
BUG=324
TESTED=FullYUVTest
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/38689004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1245 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-22 19:43:34 +00:00