1997 Commits

Author SHA1 Message Date
Frank Barchard
56bbcdf422 Reintroduce the max version of scale
add ScaleMaxSamples_NEON function with max
done on original values.

TBR=kjellander@chromium.org
BUG=libyuv:717
TEST=LibYUVPlanarTest.TestScaleMaxSamples_Opt

Change-Id: Id99338860782b10ffd24f66242eb42014c2e229e
Reviewed-on: https://chromium-review.googlesource.com/614685
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-14 23:33:56 +00:00
Manojkumar Bhosale
dbd7c1a9c5 Add MSA optimized ARGBExtractAlpha, ARGBBlend, ARGBQuantize and ARGBColorMatrix row functions
TBR=kjellander@chromium.org
R=fbarchard@google.com

Bug:libyuv:634
Change-Id: I17bd3f87336f613ad363af7d7b9d7af49d725e56
Reviewed-on: https://chromium-review.googlesource.com/613100
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-14 17:38:31 +00:00
Frank Barchard
83ca1abe09 Change ScaleSumSamples to return Sum of Squares
TBR=kjellander@chromium.org
BUG=libyuv:717
TEST=LibYUVPlanarTest.TestScaleSumSamples_Opt

Change-Id: I5208666f3968c5c4b0f1b0c951f24216d78ee3fe
Reviewed-on: https://chromium-review.googlesource.com/607184
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-09 22:19:45 +00:00
Frank Barchard
8676ad7004 scale float samples and return max value
BUG=libyuv:717
TEST=ScaleSum unittest to compare C vs Arm implementation
TBR=kjellander@chromium.org

Change-Id: Iaa7af5547d979aad4722f868d31b405340115748
Reviewed-on: https://chromium-review.googlesource.com/600534
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-04 23:34:30 +00:00
Frank Barchard
27036e33e8 Revert "include <new> header for benefit of new clang builds"
This reverts commit 1dda4cb0b7bd564e646d6ec2efee497fcd7146ca.

Reason for revert: build error on jpeg FILE

Original change's description:
> include <new> header for benefit of new clang builds
> 
> TBR=kjellander@chromium.org
> BUG=libyuv:712
> TEST=local builds still work
> 
> Change-Id: I040e8edc40aafd820d2a29629fe7aec5c049bc6b
> Reviewed-on: https://chromium-review.googlesource.com/576971
> Reviewed-by: Frank Barchard <fbarchard@google.com>
> Commit-Queue: Frank Barchard <fbarchard@google.com>

TBR=kjellander@chromium.org,fbarchard@google.com

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: libyuv:712
Change-Id: I4cf4e26eadb476017dc95e6c9578092204f088a3
Reviewed-on: https://chromium-review.googlesource.com/601211
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-08-03 22:03:47 +00:00
Frank Barchard
6d083e2d12 clang 6 build disable some msa functions
R=kjellander@chromium.org

Bug: libyuv:715
Test: gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
Change-Id: Ia3943b0afc02e05a8bc32350719b296b0b9d5479
Reviewed-on: https://chromium-review.googlesource.com/592720
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-08-03 17:44:35 +00:00
Frank Barchard
1dda4cb0b7 include <new> header for benefit of new clang builds
TBR=kjellander@chromium.org
BUG=libyuv:712
TEST=local builds still work

Change-Id: I040e8edc40aafd820d2a29629fe7aec5c049bc6b
Reviewed-on: https://chromium-review.googlesource.com/576971
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-07-19 17:47:31 +00:00
Frank Barchard
6c94ad13b5 Remove ARM NaCL macros from source
NaCL has been disabled for awhile, so the code
will still build, but only with C versions.
This change removes the MEMACCESS() macros from
Neon and Neon64 source.

BUG=libyuv:702
TEST=try bots build for arm.
R=kjellander@chromium.org

Change-Id: Id581a5c8ff71e18cc69595e7fee9337f97c44a19
Reviewed-on: https://chromium-review.googlesource.com/528332
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-09 22:22:07 +00:00
Frank Barchard
5f94a33e0c Lint fix for C casting for rotation code on arm
instead of casting int to int64, pass the int
and use %w modifier to use the word version of the register.

TBR=kjellander@chromium.org
BUG=libyuv:706
TEST=git cl lint
R=wangcheng@google.com

Change-Id: Iee5a70f04d928903ca8efac00066b8821a465e36
Reviewed-on: https://chromium-review.googlesource.com/528381
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-06-09 00:51:00 +00:00
Frank Barchard
d981495b42 Hamming Distance using 16 bit accumulators
Summing 16 bit hamming codes restricts the maximum length,
but saves an inner loop instruction.  The outer loop can sum the
values.

32 bit Neon
Now BenchmarkHammingDistance_Opt (78 ms)
Was BenchmarkHammingDistance_Opt (92 ms)

64 bit Neon
Now BenchmarkHammingDistance_Opt (85 ms)
Was BenchmarkHammingDistance_Opt (92 ms)

R=wangcheng@google.com
TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance

Change-Id: Ie40f0eac2f3339c33b833b42af5d394b122066ae
Reviewed-on: https://chromium-review.googlesource.com/526932
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-07 23:23:24 +00:00
Frank Barchard
790e0634a8 Port HammingDistance_NEON 32 bit code to 64 bit
The 32 bit version of HammingDistance_NEON accumulates
using vertical add and paired adds, which takes 3 instructions
instead of 4.
The instructions are also portable between 32 and 64 bit.

Was BenchmarkHammingDistance_Opt (105 ms)
Now BenchmarkHammingDistance_Opt (90 ms)

TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance

BenchmarkHammingDistance_Opt (90 ms)

Change-Id: If9e621e0bd2fe2492a1532056f8a1b451ba53d7e
Reviewed-on: https://chromium-review.googlesource.com/526365
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-07 01:04:35 +00:00
Frank Barchard
47d6eaa377 HammingDistance_NEON optimized looping
BenchmarkHammingDistance_Opt (93 ms)
BenchmarkHammingDistance_C (389 ms)

TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance

Change-Id: I4ba920751eb130cac6a276e441a7c309c495554a
Reviewed-on: https://chromium-review.googlesource.com/526401
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-07 00:09:59 +00:00
Frank Barchard
baf5248242 HammingDistance_NEON ported to 32 bit
TBR=kjellander@chromium.org
BUG=libyuv:701
TEST=BenchmarkHammingDistance

Change-Id: I252efd8a27aa11a0fe7d8030d7c8b57f20f04760
Reviewed-on: https://chromium-review.googlesource.com/525232
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-06 17:58:29 +00:00
Frank Barchard
44abf70187 ScaleDown odd functions adjust math so last pixel is half width source.
existing test passes
out/Release/libyuv_unittest --gtest_filter=*Blend* --libyuv_width=33 --libyuv_height=16

new test added
BUG=libyuv:705
TEST=LibYUVScaleTest.TestScaleOdd

Change-Id: Ica91812aee2e4ed9bcc18df4962b089c2e4ae704
Reviewed-on: https://chromium-review.googlesource.com/524932
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-06-06 01:37:26 +00:00
Frank Barchard
7bffe5e1c5 lint warning fixes for CpuID
The CpuId function is a wrapper for the intrinsic, or
implemented with inline if unavailable.  It had been
using uint32, but the intrinsics use int, so it was causing
casting and lint warnings.  This change makes the internal
implementation use int.

Casting was also done for xgetbv, and the cast is simply
removed, and is not causing a build error.

MipCpuCaps was doing strlen to check for white space after the
instruction set.  Arm also does this but with a hard coded offset.
This was causing a cast from size_t to int, which produced a lint
warning.  The change removes the white space detect.
In theory the code could be used to detect SSE vs SSE2, and it would
need to check SSE is followed by a space or end of line.  But this
code is only used on Arm and Mips, where there there is one form
of SIMD detected.  e.g. MSA for mips.  If a new instruction set is
added with a similar name, the write space check could be reintroduced.
But its more likely the code can be rewritten to use a better form
of detection by then. Or remove detection and require the instructions

BUG=libyuv:641
TEST=try bots build on all platforms without error and lint is clean

Change-Id: I9f55f8e57bba0f78571bdddbe63b945dea3e8809
Reviewed-on: https://chromium-review.googlesource.com/514524
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Wan-Teh Chang <wtc@chromium.org>
2017-05-25 22:00:17 +00:00
Frank Barchard
8edd2286fd MaskCpuFlags return cpuinfo so InitCpuFlags can call it
Reduce number of atomic references to cpu_info by making
InitCpuFlags call MaskCpuFlags and return the same value.

BUG=libyuv:641
TEST=libyuv_unittests pass

Change-Id: I5dfff8f7a10671bc8ef3ec0ed6f302791e752faa
Reviewed-on: https://chromium-review.googlesource.com/514145
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-24 22:27:03 +00:00
Frank Barchard
651ccc0c3a Fix data races in libyuv::TestCpuFlag().
Detect the compiler's support of C11 atomics, and use C11 atomics when
available.

Note that libyuv::MaskCpuFlags() is still not thread-safe.

BUG=libyuv:641
TEST= cpu_thread_test.cc adds a pthread based test
R=wangcheng@google.com

Change-Id: If05b1e16da833105a0159ed67ef20f4e61bc7abd
Reviewed-on: https://chromium-review.googlesource.com/510079
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-24 02:09:03 +00:00
Frank Barchard
77f6916da2 use __popcnt for visual c HammingDistance_X86
BUG=libyuv:701
TEST=HammingDistance unittest performance is comparable to x64
R=wangcheng@google.com

Change-Id: I8abe861e086e0162ba4c7ba6f1ef7d1c006cd9d4
Reviewed-on: https://chromium-review.googlesource.com/505454
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-05-12 22:59:00 +00:00
Frank Barchard
e0615c0e69 Optimize Hamming Distance C code to do 64 bits at a time.
BUG=libyuv:701
TEST=LibYUVBaseTest.BenchmarkHammingDistance_C
R=wangcheng@google.com

Change-Id: I243003b098bea8ef3809298bbec349ed52a43d8c
Reviewed-on: https://chromium-review.googlesource.com/499487
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-12 17:53:52 +00:00
Frank Barchard
bbbf30eecd Remove volatile from variables to improve performance
BUG=libyuv:703
TEST=compile and disassemble.  see registers used not stack.
R=wangcheng@google.com

Change-Id: Iaa07ee5d0c35252994491bb2868276e161149efd
Reviewed-on: https://chromium-review.googlesource.com/500427
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-09 18:14:21 +00:00
Frank Barchard
2136e349da Hamming code difference of 2 memory blocks
BUG=libyuv:701
TEST=built and disassembled for aarch64
R=kjellander@chromium.org

Change-Id: I7712b1c7934e5dfb55fda1fa7c8405c32d6964ce
Reviewed-on: https://chromium-review.googlesource.com/495327
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-08 21:37:51 +00:00
Frank Barchard
945ea1b746 mips switch sgtu to sltu for clang in ndk r14
The verion of clang in ndk r14 (3.9) has a built in llvm assembler
that does not have the sgtu pseudo instruction.
sltu is the actual instruction, so switch the 2 operands and use
the instruction instead of the pseudo op.

BUG=libyuv:700
TEST=try bots build mips without error.

Change-Id: I2d5f94f81acbd56cdedea011e7d9308979e19079
Reviewed-on: https://chromium-review.googlesource.com/494026
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-05-02 21:34:13 +00:00
Vignesh Venkatasubramanian
54289f1bb0 Fix mips build on android ndk r14+
Revert the workaround and fix it properly by passing the
additional necessary flag to the compiler.

BUG=libyuv:700

Change-Id: I1c893a8acb5079decbee6963b689424bf2f99f4f
Reviewed-on: https://chromium-review.googlesource.com/487881
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-04-26 08:39:55 +00:00
Frank Barchard
3b583396bf Disable CopyRow_MIPS
CopyRow_MIPS produces a compile error on some compilers.

TBR=kjellander@chromium.org
BUG=libyuv:700
TEST=try bots

Change-Id: Ie88f2006ef5cf14bffaf80fd4c0dd1caa409c569
Reviewed-on: https://chromium-review.googlesource.com/486127
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-04-25 01:31:13 +00:00
Frank Barchard
fc02cc3806 Add I422ToRGB565
BUG=libyuv:699
TESTED=LibYUVConvertTest.I420ToARGB_RGB565_Opt

Change-Id: I87943bcad056fbbe051301f45c7dc0ae0620c837
Reviewed-on: https://chromium-review.googlesource.com/478578
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-04-17 17:51:17 +00:00
Frank Barchard
8cab2e31d7 I422ToRGB565 fix for odd widths
I422ToRGB565Row_Any_AVX2 uses 2 step row conversion that calls
I422ToARGBRow_AVX2 and then ARGBToRGB565.
I422ToARGBRow_AVX2 expects multiple of 16 pixels.
Adjust the I422ToRGB565Row_Any_AVX2 to do multiple of 16 with AVX2
and then remainder in a buffer.

Bug: libyuv: 657
Test: out/Release/libyuv_unittest --gtest_filter=*Convert*I*To* --libyuv_width=1280 --libyuv_height=720
Change-Id: Ice1cb6c7ff6b2295513e8b4a9f77522e1c659810
Reviewed-on: https://chromium-review.googlesource.com/474232
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-04-11 17:24:05 +00:00
Frank Barchard
2adb84e39e make gflags command line parser optional
BUG=libyuv:691
TEST=gn gen out/Release "--args=is_debug=false target_cpu=\"x64\" libyuv_include_tests=true"

Change-Id: Ib481189be884c34d9bbc30bfcf71c7969c6f4dae
Reviewed-on: https://chromium-review.googlesource.com/452736
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-14 01:52:52 +00:00
Frank Barchard
d59d3fcd18 Change parameter for '_Any' functions to param to avoid misnomer
BUG=None
TEST=None

Change-Id: I6940fc4753783afd25f83868635381bf801c65f5
Reviewed-on: https://chromium-review.googlesource.com/452962
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-10 23:32:39 +00:00
Frank Barchard
e6fec061cf lint cleanup for convert RGB24ToI420
RGB24, RAW, RGB565, ARGB1555 and ARGB4444 have conditional
2 pass versus direct path.  2 pass method requires a buffer that
is conditionally allocated.  ifdef's were confusing lint.
simplifed ifdefs to clean up lint warning

BUG=libyuv:692
TEST=lint source/convert.cc

Change-Id: If868718af30b48824a5e3d28f0d7d01d4609ad55
Reviewed-on: https://chromium-review.googlesource.com/451552
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-09 10:32:23 +00:00
Frank Barchard
73a603e120 clang-format 5.0 applied to libyuv
BUG=None
TEST=try bots and lint test

Change-Id: I1ab462adf2d309117862c5eb4b244a61ae202951
Reviewed-on: https://chromium-review.googlesource.com/450658
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-03-08 18:50:12 +00:00
Frank Barchard
136aa9d37c any11p fix for buffer overrun
BUG=libyuv:686
TESTED=untested

Change-Id: Idfae93349dd78b1b633a596631e5397e11b77d0b
Reviewed-on: https://chromium-review.googlesource.com/448320
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-03 19:57:35 +00:00
Frank Barchard
91ee9b729e Fix missing return in MipsCpuCaps.
Previously if MipsCpuCaps were called with something other than
dspr2 or msa, the file was closed but still used.

This change assumed the function is only called internally twice:
once for msa and once for dspr2.  If msa is not being detected,
the function assumed dspr2 was being tested and returns dspr2 was
true.

BUG=libyuv:687
TEST=try bots

Change-Id: I80b328eb5ffc7baf5f1ee5a79c16d75c45ff26cc
Reviewed-on: https://chromium-review.googlesource.com/447831
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-03-01 23:07:03 +00:00
Manojkumar Bhosale
45b176d153 Add MSA optimized Interpolate/MergeUV/Misc functions
BUG=libyuv:634

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126

Performance Gain (vs C auto-vectorized)
InterpolateRow_MSA      - ~3.3x
InterpolateRow_Any_MSA  - ~2.5x
ARGBSetRow_MSA          - ~1.0x
ARGBSetRow_Any_MSA      - ~1.0x
ARGBToRGB24Row_MSA      - ~1.9x
ARGBToRGB24Row_Any_MSA  - ~1.6x
MergeUVRow_MSA          - ~1.6x
MergeUVRow_Any_MSA      - ~1.2x

Performance Gain (vs C non-vectorized)
InterpolateRow_MSA      - ~11.3x
InterpolateRow_Any_MSA  - ~ 7.9x
ARGBSetRow_MSA          - ~ 6.2x
ARGBSetRow_Any_MSA      - ~ 4.0x
ARGBToRGB24Row_MSA      - ~ 9.9x
ARGBToRGB24Row_Any_MSA  - ~ 8.4x
MergeUVRow_MSA          - ~12.7x
MergeUVRow_Any_MSA      - ~ 8.0x

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Reviewed-on: https://chromium-review.googlesource.com/445817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-23 01:42:22 +00:00
Manojkumar Bhosale
eed66b2028 Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions
BUG=libyuv:634

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be

Performance Gain (vs C auto-vectorized)
I444ToARGBRow_MSA       - ~1.6x
I444ToARGBRow_Any_MSA   - ~1.6x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.3x
J400ToARGBRow_MSA       - ~1.0x
J400ToARGBRow_Any_MSA   - ~1.0x
YUY2ToARGBRow_MSA       - ~1.6x
YUY2ToARGBRow_Any_MSA   - ~1.6x
UYVYToARGBRow_MSA       - ~1.6x
UYVYToARGBRow_Any_MSA   - ~1.6x

Performance Gain (vs C non-vectorized)
I444ToARGBRow_MSA       - ~7.3x
I444ToARGBRow_Any_MSA   - ~7.1x
I400ToARGBRow_MSA       - ~5.5x
I400ToARGBRow_Any_MSA   - ~5.2x
J400ToARGBRow_MSA       - ~6.8x
J400ToARGBRow_Any_MSA   - ~5.7x
YUY2ToARGBRow_MSA       - ~7.2x
YUY2ToARGBRow_Any_MSA   - ~7.0x
UYVYToARGBRow_MSA       - ~7.1x
UYVYToARGBRow_Any_MSA   - ~6.9x

Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
Reviewed-on: https://chromium-review.googlesource.com/439246
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-21 23:22:07 +00:00
Frank Barchard
bbe8c233f2 scale warning fixes for unused parameters
BUG=libyuv:680
TEST=builds and runs with no warnings

Change-Id: I7d60ef44292fa6ad4f7c4e2e2657359b864d2dab
Reviewed-on: https://chromium-review.googlesource.com/442670
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-02-15 21:38:59 +00:00
Frank Barchard
20ecb366ea I420ToI400 fix for unused u and v arguments
TBR=kjellander@chromium.org
BUG=libyuv:679
TEST=None

Change-Id: I830f31e0de58c967cb02c71f422df4c19c94c367
Reviewed-on: https://chromium-review.googlesource.com/441949
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-14 02:54:58 +00:00
Frank Barchard
1d5630ba2b mjpeg_decoder - fix for unused parameter.
TBR=kjellander@chromium.org
BUG=libyuv:678
TEST=None

Change-Id: I5206de2026b56f155531a76f94aeb6cb1b0cd6c9
Reviewed-on: https://chromium-review.googlesource.com/442090
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-14 02:22:57 +00:00
Frank Barchard
177d344ed7 enable unused parameter warning
android.mk builds have unused parameter warning on by default.
This change for GN makes libyuv build the same way.

BUG=libyuv:681
TEST=build on linux with clang and ninja.

Change-Id: I76c627d446b96653f147725bca915d94a42ce9a6
Reviewed-on: https://chromium-review.googlesource.com/441194
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-14 01:33:57 +00:00
Frank Barchard
6825b161d7 HalfFloat SSE2/AVX2 optimized port scheduling.
Uses 1 add instead of 2 leas to reduce port pressure on ports 1 and 5
used for SIMD instructions.

BUG=libyuv:670
TEST=~/iaca-lin64/bin/iaca.sh -arch HSW out/Release/obj/libyuv/row_gcc.o

Change-Id: I3965ee5dcb49941a535efa611b5988d977f5b65c
Reviewed-on: https://chromium-review.googlesource.com/433391
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-11 01:02:06 +00:00
Frank Barchard
7a54d0a302 row_msa fix clang build warnings.
BUG=libyuv:634
TEST=untested

Change-Id: Ib7f0c99e669ddba0a1efbd15895880281ad6303e
Reviewed-on: https://chromium-review.googlesource.com/435303
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-02-07 09:36:28 +00:00
Frank Barchard
76e7f104ae documentation updates
BUG=None
TEST=Untested

Change-Id: I8ab95654255d1aa9cf05a664ecf59ee6c0757e66
Reviewed-on: https://chromium-review.googlesource.com/434941
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-02 18:31:32 +00:00
Frank Barchard
0fb5675902 Fix dspr2 rename changes. Fix unused variables
TBR=kjellander@chromium.org
BUG=libyuv:634
TEST=try bots

Review-Url: https://codereview.chromium.org/2675583002 .
2017-02-01 18:51:06 -08:00
Manojkumar Bhosale
54ce8f23d6 Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
ARGBToYJRow_MSA       - ~3.2x
ARGBToYJRow_Any_MSA   - ~2.7x
BGRAToYRow_MSA        - ~3.2x
BGRAToYRow_Any_MSA    - ~2.7x
ABGRToYRow_MSA        - ~3.2x
ABGRToYRow_Any_MSA    - ~2.6x
RGBAToYRow_MSA        - ~3.1x
RGBAToYRow_Any_MSA    - ~2.7x
ARGBToUVJRow_MSA      - ~5.5x
ARGBToUVJRow_Any_MSA  - ~4.5x
BGRAToUVRow_MSA       - ~2.1x
BGRAToUVRow_Any_MSA   - ~2.0x
ABGRToUVRow_MSA       - ~2.1x
ABGRToUVRow_Any_MSA   - ~1.9x
RGBAToUVRow_MSA       - ~2.2x
RGBAToUVRow_Any_MSA   - ~1.9x

Performance Gain (vs C non-vectorized)
ARGBToYJRow_MSA       - ~10.9x
ARGBToYJRow_Any_MSA   -  ~9.2x
BGRAToYRow_MSA        - ~10.9x
BGRAToYRow_Any_MSA    -  ~9.3x
ABGRToYRow_MSA        - ~11.0x
ABGRToYRow_Any_MSA    -  ~9.3x
RGBAToYRow_MSA        - ~10.9x
RGBAToYRow_Any_MSA    -  ~9.1x
ARGBToUVJRow_MSA      - ~12.4x
ARGBToUVJRow_Any_MSA  - ~10.5x
BGRAToUVRow_MSA       -  ~4.7x
BGRAToUVRow_Any_MSA   -  ~4.4x
ABGRToUVRow_MSA       -  ~4.7x
ABGRToUVRow_Any_MSA   -  ~4.5x
RGBAToUVRow_MSA       -  ~4.8x
RGBAToUVRow_Any_MSA   -  ~4.4x

Review-Url: https://codereview.chromium.org/2641153003 .
2017-02-01 10:31:28 +05:30
Frank Barchard
b89bcda273 Add comments for ARGBToUV_C and ARGBToUVJ_C
ARGBToUV_C and ARGBToUVJ_C are generated functions with subtle
difference in rounding.  Adding comment to make them easier to find.

TBR=kjellander@chromium.org
BUG=libyuv:634
TEST=untested

Change-Id: I9912d256a1e04c58475d33bdb472c37484f6cab9
Reviewed-on: https://chromium-review.googlesource.com/434980
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-01-30 23:44:05 +00:00
Frank Barchard
54f2094a5e Rename mips source files to dspr2.
Add MSA detect to unittest.
Change macro to disable DSPR2 code to LIBYUV_DISABLE_DSPR2

BUG=libyuv:634
TEST=try bots

Change-Id: I9e0aa2452204fc529bb6f9e6fd93c4e1c379bba6
Reviewed-on: https://chromium-review.googlesource.com/433463
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-01-27 23:11:43 +00:00
Frank Barchard
749e316ed8 Remove commented out code
TEST=None
BUG=libyuv:672
Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169

Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169
Reviewed-on: https://chromium-review.googlesource.com/430321
Reviewed-by: Aaron Gable <agable@chromium.org>
2017-01-20 02:03:12 +00:00
Manojkumar Bhosale
09b8c971b3 Add MSA optimized NV12/21 To RGB row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
NV12ToARGBRow_MSA       - ~1.5x
NV12ToARGBRow_Any_MSA   - ~1.4x
NV12ToRGB565Row_MSA     - ~1.4x
NV12ToRGB565Row_Any_MSA - ~1.4x
NV21ToARGBRow_MSA       - ~1.5x
NV21ToARGBRow_Any_MSA   - ~1.5x
SobelRow_MSA            - ~4.3x
SobelRow_Any_MSA        - ~3.4x
SobelToPlaneRow_MSA     - ~8.0x
SobelToPlaneRow_Any_MSA - ~4.7x
SobelXYRow_MSA          - ~3.0x
SobelXYRow_Any_MSA      - ~2.5x

Performance Gain (vs C non-vectorized)
NV12ToARGBRow_MSA       - ~6.5x
NV12ToARGBRow_Any_MSA   - ~6.5x
NV12ToRGB565Row_MSA     - ~6.2x
NV12ToRGB565Row_Any_MSA - ~6.1x
NV21ToARGBRow_MSA       - ~6.5x
NV21ToARGBRow_Any_MSA   - ~6.5x
SobelRow_MSA            - ~14.5x
SobelRow_Any_MSA        - ~11.3x
SobelToPlaneRow_MSA     - ~34.2x
SobelToPlaneRow_Any_MSA - ~19.4x
SobelXYRow_MSA          - ~11.1x
SobelXYRow_Any_MSA      - ~9.1x

Review-Url: https://codereview.chromium.org/2636483002 .
2017-01-18 09:24:39 +05:30
Frank Barchard
a7c87e19f0 add Intel Code Analyst markers
add macros to enable/disable code analyst around blocks of code.

Normally these macros should not be used, but if performance
details are wanted for intel code, enable them around the code
and then run via the iaca tool, available on the intel website.

BUG=libyuv:670
TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
R=wangcheng@google.com

Review-Url: https://codereview.chromium.org/2626193002 .
2017-01-13 15:50:24 -08:00
Manojkumar Bhosale
73a6f100a9 Add MSA optimized rotate functions (used 16x16 transpose)
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.7x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Performance Gain (vs C non-vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.8x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Review-Url: https://codereview.chromium.org/2617703002 .
2017-01-13 15:50:02 +05:30
Manojkumar Bhosale
7c64163ff4 Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGB1555ToARGBRow_MSA     - 1.85
ARGB1555ToARGBRow_Any_MSA - 1.82
RGB565ToARGBRow_MSA       - 2.14
RGB565ToARGBRow_Any_MSA   - 2.08
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 2.60
ARGB1555ToYRow_Any_MSA    - 2.47
RGB565ToYRow_MSA          - 2.45
RGB565ToYRow_Any_MSA      - 2.33
RGB24ToYRow_MSA           - 2.23
RGB24ToYRow_Any_MSA       - 2.01
RAWToYRow_MSA             - 2.25
RAWToYRow_Any_MSA         - 2.02
ARGB1555ToUVRow_MSA       - 1.40
ARGB1555ToUVRow_Any_MSA   - 1.37
RGB565ToUVRow_MSA         - 1.68
RGB565ToUVRow_Any_MSA     - 1.63
RGB24ToUVRow_MSA          - 3.02
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.85

Performance Gain (vs C non-vectorized)
ARGB1555ToARGBRow_MSA     - 4.66
ARGB1555ToARGBRow_Any_MSA - 4.45
RGB565ToARGBRow_MSA       - 5.58
RGB565ToARGBRow_Any_MSA   - 5.34
RGB24ToARGBRow_MSA        - 8.57
RGB24ToARGBRow_Any_MSA    - 7.42
RAWToARGBRow_MSA          - 8.57
RAWToARGBRow_Any_MSA      - 7.42
ARGB1555ToYRow_MSA        - 6.38
ARGB1555ToYRow_Any_MSA    - 5.98
RGB565ToYRow_MSA          - 6.42
RGB565ToYRow_Any_MSA      - 6.05
RGB24ToYRow_MSA           - 7.87
RGB24ToYRow_Any_MSA       - 7.01
RAWToYRow_MSA             - 7.98
RAWToYRow_Any_MSA         - 7.01
ARGB1555ToUVRow_MSA       - 5.39
ARGB1555ToUVRow_Any_MSA   - 5.06
RGB565ToUVRow_MSA         - 6.39
RGB565ToUVRow_Any_MSA     - 5.90
RGB24ToUVRow_MSA          - 3.04
RGB24ToUVRow_Any_MSA      - 2.87
RAWToUVRow_MSA            - 3.04
RAWToUVRow_Any_MSA        - 2.88

Review-Url: https://codereview.chromium.org/2600713002 .
2017-01-13 15:43:37 +05:30
Frank Barchard
cb11559408 ConvertToARGB: Allows rotation on ARGB input
BUG=libyuv:668
TEST=run unit tests
R=fbarchard@google.com

Review-Url: https://codereview.chromium.org/2620183002 .
2017-01-11 14:38:25 -08:00
Frank Barchard
000d2fa91a Libyuv MIPS DSPR2 optimizations.
Optimized functions:

I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2

Bug-fixes in functions:

ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2

BUG=

Review-Url: https://codereview.chromium.org/2626123003 .
2017-01-11 12:19:13 -08:00
Manojkumar Bhosale
288bfbefb5 Add MSA optimized remaining scale row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleRowDown2_MSA            - ~22.3x
ScaleRowDown2_Any_MSA        - ~19.9x
ScaleRowDown2Linear_MSA      - ~31.2x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~1x
ScaleRowDown38_Any_MSA       - ~1x
ScaleRowDown38_2_Box_MSA     - ~1.7x
ScaleRowDown38_2_Box_Any_MSA - ~1.7x
ScaleRowDown38_3_Box_MSA     - ~1.7x
ScaleRowDown38_3_Box_Any_MSA - ~1.7x
ScaleAddRow_MSA              - ~1.2x
ScaleAddRow_Any_MSA          - ~1.15x

Performance Gain (vs C non-vectorized)
ScaleRowDown2_MSA            - ~22.4x
ScaleRowDown2_Any_MSA        - ~19.8x
ScaleRowDown2Linear_MSA      - ~31.6x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~3.2x
ScaleRowDown38_Any_MSA       - ~3.2x
ScaleRowDown38_2_Box_MSA     - ~2.4x
ScaleRowDown38_2_Box_Any_MSA - ~2.3x
ScaleRowDown38_3_Box_MSA     - ~2.9x
ScaleRowDown38_3_Box_Any_MSA - ~2.8x
ScaleAddRow_MSA              - ~8x
ScaleAddRow_Any_MSA          - ~7.46x

Review-Url: https://codereview.chromium.org/2559683002 .
2016-12-21 13:39:44 +05:30
Manojkumar Bhosale
a899dea251 Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA          - ~1.1x
ARGBAttenuateRow_Any_MSA      - ~1.1x
ARGBToRGB565DitherRow_MSA     - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA            - ~5.1x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~1.1x
ARGBGrayRow_MSA               - ~2.6x
ARGBSepiaRow_MSA              - ~11.6x

Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA          - ~2.46x
ARGBAttenuateRow_Any_MSA      - ~2.45x
ARGBToRGB565DitherRow_MSA     - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA            - ~5.2x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~4.3x
ARGBGrayRow_MSA               - ~10.5x
ARGBSepiaRow_MSA              - ~12.2x

Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Manojkumar Bhosale
6fa5e4eb78 Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx8_MSA          - ~2.7x
TransposeWx8_Any_MSA      - ~2.1x
TransposeUVWx8_MSA        - ~2.5x
TransposeUVWx8_Any_MSA    - ~2.7x

Performance Gain (vs C non-vectorized)
TransposeWx8_MSA          - ~4.6x
TransposeWx8_Any_MSA      - ~2.9x
TransposeUVWx8_MSA        - ~4.4x
TransposeUVWx8_Any_MSA    - ~3.7x

Review URL: https://codereview.chromium.org/2553403002 .
2016-12-15 10:06:01 +05:30
Frank Barchard
b18fd21d3c Android420ToI420 - use ptrdiff_t for difference of u and v pointers
The difference was assigned to an int, causing a warning on Visual C.

BUG=662
TEST=tested with try bots.
R=devangelakos@google.com

Review-Url: https://codereview.chromium.org/2574373002 .
2016-12-14 11:53:55 -08:00
Frank Barchard
dde8ba7009 ConvertFromI420: use halfstride instead of halfwidth
BUG=libyuv:660
TEST=try bots
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2554213003 .
2016-12-07 10:16:16 -08:00
Manojkumar Bhosale
56b5bbb0be Add MSA optimized ARGB scaling functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleARGBRowDown2_MSA           - ~2.6x
ScaleARGBRowDown2Linear_MSA     - ~7.9x
ScaleARGBRowDown2Box_MSA        - ~3.7x
ScaleARGBRowDownEven_MSA        - ~1.2x
ScaleARGBRowDownEvenBox_MSA     - ~3.5x

ScaleARGBRowDown2_Any_MSA       - ~2.6x
ScaleARGBRowDown2Linear_Any_MSA - ~7.9x
ScaleARGBRowDown2Box_Any_MSA    - ~3.6x
ScaleARGBRowDownEven_Any_MSA    - ~1.2x
ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x

Performance Gain (vs C non-vectorized)
ScaleARGBRowDown2_MSA           - 2.6x
ScaleARGBRowDown2Linear_MSA     - 13.5x
ScaleARGBRowDown2Box_MSA        - 5.8x
ScaleARGBRowDownEven_MSA        - 1.2x
ScaleARGBRowDownEvenBox_MSA     - 3.7x

ScaleARGBRowDown2_Any_MSA       - 2.6x
ScaleARGBRowDown2Linear_Any_MSA - 13.5x
ScaleARGBRowDown2Box_Any_MSA    - 5.3x
ScaleARGBRowDownEven_Any_MSA    - 1.2x
ScaleARGBRowDownEvenBox_Any_MSA - 3.7x

Review URL: https://codereview.chromium.org/2527983002 .
2016-12-07 11:47:15 +05:30
Manojkumar Bhosale
83f460be33 Add MSA optimized ARGB Multiply/Add/Subtract row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBMultiplyRow_MSA       - 1.4x
ARGBAddRow_MSA            - 8.6x
ARGBSubtractRow_MSA       - 8.6x

ARGBMultiplyRow_Any_MSA   - 1.35x
ARGBAddRow_Any_MSA        - 7.3x
ARGBSubtractRow_Any_MSA   - 7.2x

Performance Gain (vs C non-vectorized)
ARGBMultiplyRow_MSA       - 4.4x
ARGBAddRow_MSA            - 27x
ARGBSubtractRow_MSA       - 22x

ARGBMultiplyRow_Any_MSA   - 3.5x
ARGBAddRow_Any_MSA        - 23x
ARGBSubtractRow_Any_MSA   - 18x

Review URL: https://codereview.chromium.org/2529983002 .
2016-12-02 15:21:10 +05:30
Frank Barchard
da0c29dada Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBToRGB565Row_MSA       - ~1.6x
ARGBToRGB565Row_Any_MSA   - ~1.6x
ARGBToARGB1555Row_MSA     - ~1.3x
ARGBToARGB1555Row_Any_MSA - ~1.3x
ARGBToARGB4444Row_MSA     - ~3.8x
ARGBToARGB4444Row_Any_MSA - ~3.8x
ARGBToUV444Row_MSA        - ~2.4x
ARGBToUV444Row_Any_MSA    - ~2.4x

Performance Gain (vs C non-vectorized)
ARGBToRGB565Row_MSA       - ~2.8x
ARGBToRGB565Row_Any_MSA   - ~2.8x
ARGBToARGB1555Row_MSA     - ~2.2x
ARGBToARGB1555Row_Any_MSA - ~2.2x
ARGBToARGB4444Row_MSA     - ~6.8x
ARGBToARGB4444Row_Any_MSA - ~6.6x
ARGBToUV444Row_MSA        - ~6.7x
ARGBToUV444Row_Any_MSA    - ~6.7x

Review URL: https://codereview.chromium.org/2520003004 .
2016-11-22 10:47:55 -08:00
Frank Barchard
b1504a8e48 Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2487913004 .
2016-11-18 15:05:10 -08:00
Frank Barchard
3028e1bd97 clang-format row_gcc.cc with some functions disabled
BUG=libyuv:654
TEST=try bots build
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2484083003 .
2016-11-07 18:37:29 -08:00
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
f2c27dafa2 HalfFloat neon armv7 fix for destination pointer.
Improved unittests detect different in arm64 rounding.

TEST=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*Half* -a "--libyuv_width=640 --libyuv_height=360"
BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2478313004 .
2016-11-07 12:13:04 -08:00
Frank Barchard
eca08525cb HalfFloat Neon for ARMv7.
64 bit version made similar to 32 bit with registers 1 for load and store results, and 2 and 3 as expanded float temporary values.

TEST=out/Release/libyuv_unittest --gtest_filter=*Half*

BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2467723002 .
2016-11-01 11:36:51 -07:00
Frank Barchard
10ce829bad Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422ToRGB565Row_MSA             : ~1.5x
I422ToRGB565Row_Any_MSA         : ~1.5x
I422ToARGB4444Row_MSA           : ~1.4x
I422ToARGB4444Row_Any_MSA       : ~1.4x
I422ToARGB1555Row_MSA           : ~1.4x
I422ToARGB1555Row_Any_MSA       : ~1.4x

Performance Gain (vs C non-vectorized)
I422ToRGB565Row_MSA             : ~6.8x
I422ToRGB565Row_Any_MSA         : ~6.8x
I422ToARGB4444Row_MSA           : ~6.6x
I422ToARGB4444Row_Any_MSA       : ~6.6x
I422ToARGB1555Row_MSA           : ~6.6x
I422ToARGB1555Row_Any_MSA       : ~6.6x

Review URL: https://codereview.chromium.org/2445343007 .
2016-10-27 10:47:35 -07:00
Frank Barchard
532f5708a9 Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
I422AlphaToARGBRow_MSA      : ~1.4x
I422AlphaToARGBRow_Any_MSA  : ~1.4x
I422ToRGB24Row_MSA          : ~4.8x
I422ToRGB24Row_Any_MSA      : ~4.8x

Performance Gain (vs C non-vectorized)
I422AlphaToARGBRow_MSA      : ~7.0x
I422AlphaToARGBRow_Any_MSA  : ~7.0x
I422ToRGB24Row_MSA          : ~7.9x
I422ToRGB24Row_Any_MSA      : ~7.7x

Review URL: https://codereview.chromium.org/2454433003 .
2016-10-26 11:12:17 -07:00
Frank Barchard
2488b3105b White spaces, comments and lint fixes for msa.
no functional changes.

TBR=kjellander@chromium.org
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2446313002 .
2016-10-25 11:36:54 -07:00
Frank Barchard
c2073823b4 use __OPTIMIZE__ macro to determine debug vs release.
Debug builds of x86 gcc/clang can run out of register.
Previously NDEBUG or _DEBUG was used to detect a debug build.
But those macros are not set by gentoo builds.
This CL switches to the compiler predefine __OPTIMIZE__ which is
built into clang and gcc.

BUG=libyuv:602
TEST=untested
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2451503002 .
2016-10-24 18:02:48 -07:00
Frank Barchard
f5d5bd88d6 Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gains :- (vs C vectorized)

I422ToARGBRow_MSA     : ~1.6x
I422ToRGBARow_MSA     : ~1.6x

I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x

Performance Gains :- (vs C non-vectorized)

I422ToARGBRow_MSA     : ~7x
I422ToRGBARow_MSA     : ~7x

I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x

Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.

Review URL: https://codereview.chromium.org/2430313005 .
2016-10-24 15:37:08 -07:00
Frank Barchard
451af5e922 scale by 1 for neon implemented
void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fcvtn      v4.4h, v2.4s                   \n"  // 8 floatsgit
    "fcvtn2     v4.8h, v1.4s                   \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  :
  : "cc", "memory", "v1", "v2", "v4"
  );
}

void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
  asm volatile (
  "1:                                          \n"
    MEMACCESS(0)
    "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
    "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
    "uxtl       v2.4s, v1.4h                   \n"  // 8 int's
    "uxtl2      v1.4s, v1.8h                   \n"
    "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
    "scvtf      v1.4s, v1.4s                   \n"
    "fmul       v2.4s, v2.4s, %3.s[0]          \n"  // adjust exponent
    "fmul       v1.4s, v1.4s, %3.s[0]          \n"
    "uqshrn     v4.4h, v2.4s, #13              \n"  // isolate halffloat
    "uqshrn2    v4.8h, v1.4s, #13              \n"
   MEMACCESS(1)
    "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
    "b.gt       1b                             \n"
  : "+r"(src),    // %0
    "+r"(dst),    // %1
    "+r"(width)   // %2
  : "w"(scale * 1.9259299444e-34f)    // %3
  : "cc", "memory", "v1", "v2", "v4"
  );
}

TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2430313008 .
2016-10-21 14:30:03 -07:00
Frank Barchard
550cf829fb HalfFloat avx2 unpack bug fix.
AVX unpack parameters were reverse ordered causing incorrect results
on AVX2 hardware.

TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Half*

BUG=libyuv:560
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2438893002 .
2016-10-20 15:49:00 -07:00
Frank Barchard
f553db2d30 HalfFloatPlane unittest for denormal half floats
Halffloats have a limited range.  It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero.  This test ensures they behave the same in C and SIMD and tests the performance of denormals.

TEST=TestHalfFloatPlane_denormal
BUG=libyuv:560
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2424233004 .
2016-10-19 18:13:01 -07:00
Frank Barchard
78c58ab8aa Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains : (Auto-vectorized C vs MSA SIMD)

ARGB4444ToYRow_MSA        : ~3.0x
ARGB4444ToUVRow_MSA       : ~1.8x
ARGB4444ToARGBRow_MSA     : ~3.4x

ARGB4444ToYRow_Any_MSA    : ~2.8x
ARGB4444ToUVRow_Any_MSA   : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x

Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
e16e3a629f cpu_id cleanup. no functional change.
remove old comment about initialize to zero.
remove ifdef and replace with macro defined to zero.

BUG=None
TEST=try bots
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2425623004 .
2016-10-18 12:26:02 -07:00
Frank Barchard
2d80fc3133 Port HalfFloatRow_SSE2 to AVX2 but not using F16C.
R=wangcheng@google.com, hubbe@chromium.org
BUG=libyuv:560

Review URL: https://codereview.chromium.org/2421993002 .
2016-10-14 19:01:41 -07:00
Frank Barchard
fdcf524aac Add f16c (halffloat) cpuid
R=wangcheng@google.com, hubbe@chromium.org
BUG=libyuv:560

Review URL: https://codereview.chromium.org/2418763006 .
2016-10-14 16:34:08 -07:00
Frank Barchard
5333e94e70 Port ARGBExtractAlpha_AVX2 function to windows.
BUG=libyuv:572
TEST=try bots
R=wangcheng@google.com, magjed@chromium.org

Review URL: https://codereview.chromium.org/2416783004 .
2016-10-13 23:20:57 -07:00
Frank Barchard
a5e93766a2 Add ARGBExtractAlpha_AVX2 function
Port SSE2 version to AVX2.
BUG=libyuv:572
TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract*
R=wangcheng@google.com, magjed@chromium.org

Review URL: https://codereview.chromium.org/2420553002 .
2016-10-13 16:03:43 -07:00
Frank Barchard
d363ea6527 Remove I411 support.
YUV 411 is very uncommon format.  Remove support.

Update documentation to reflect that 411 is deprecated.

Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.

BUG=libyuv:645
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
af87c11c9a YUY2ToI422 coalesce rows for small images
TBR=wangcheng@google.com
BUG=libyuv:647
TESTED=LibYUVConvertTest.YUY2ToI422_Opt

Review URL: https://codereview.chromium.org/2393393006 .
2016-10-07 18:35:42 -07:00
Frank Barchard
edd3a84d05 libyuv::YUY2ToY for isolating Y channel of YUY2.
This function is the first step of YUY2 To I420.
Provided primarily for diagnostics.

TBR=wangcheng@google.com
BUG=libyuv:647
TESTED=LibYUVConvertTest.YUY2ToY_Opt

Review URL: https://codereview.chromium.org/2399153004 .
2016-10-07 17:20:30 -07:00
Frank Barchard
a2891ec77c Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains as below,

YUY2ToI422, YUY2ToI420 :-

YUY2ToYRow_MSA          : ~10x
YUY2ToUVRow_MSA         : ~11x
YUY2ToUV422Row_MSA      : ~9x
YUY2ToYRow_Any_MSA      : ~6x
YUY2ToUVRow_Any_MSA     : ~5x
YUY2ToUV422Row_Any_MSA  : ~4x

UYVYToI422, UYVYToI420 :-

UYVYToYRow_MSA          : ~10x
UYVYToUVRow_MSA         : ~11x
UYVYToUV422Row_MSA      : ~9x
UYVYToYRow_Any_MSA      : ~6x
UYVYToUVRow_Any_MSA     : ~5x
UYVYToUV422Row_Any_MSA  : ~4x

Review URL: https://codereview.chromium.org/2397693002 .
2016-10-07 10:37:22 -07:00
Frank Barchard
3b88a19ab1 YUY2ToI422_Any_Neon clean up to not require 16 pixels
YUY2ToI422_Any_Neon previously required 16 pixels and duplicated
the last pixel.  The replication was not necessary after a previous
change to treat YUY2 to 4 byte macro pixels.

TBR=harryjin@google.com
BUG=libyuv:648
TESTED=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*YUY2ToI422* -a "--libyuv_width=17 --libyuv_height=7 --libyuv_repeat=999 --libyuv_flags=1"

Review URL: https://codereview.chromium.org/2399143002 .
2016-10-06 12:11:40 -07:00
Frank Barchard
7018f5be0f Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains :-

I422ToYUY2Row_MSA     - ~12x
I422ToYUY2Row_Any_MSA - ~7x

I422ToUYVYRow_MSA     - ~12x
I422ToUYVYRow_Any_MSA - ~7x

Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
aa197ee1a3 HalfFloat_SSE2 for Visual C
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows
R=hubbe@chromium.org, wangcheng@google.com

Review URL: https://codereview.chromium.org/2387713002 .
2016-10-03 10:33:38 -07:00
Frank Barchard
4a14cb2e81 HalfFloat_SSE2 port from C algorithm to SSE2
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion.

BUG=libyuv:560, chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2381493006 .
2016-09-30 09:47:16 -07:00
Frank Barchard
7fc932ddd3 Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion.
BUG=libyuv:560,chromium:445071
TEST=untested
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2371293002 .
2016-09-29 15:06:30 -07:00
Frank Barchard
c11e9b7fb7 bt709 coefficients for video constrained space
Original bt709 color space coefficients were full range yuv for higher
quality.  This change makes the coefficients use the video constrained
color space the same as bt601 which is 16 to 240 for Y and 16 to 235 for
chroma channels.

BUG=libyuv:639
TEST=libyuv unittests run locally
R=hubbe@chromium.org

Review URL: https://codereview.chromium.org/2367253003 .
2016-09-28 15:07:46 -07:00
Frank Barchard
6732bcbde9 ShortToHalfFloat_AVX2 function
BUG=libyuv:560
TEST=local compile for windows
R=wangcheng@google.com

Review URL: https://codereview.chromium.org/2364293002 .
2016-09-27 14:18:32 -07:00
Frank Barchard
618149084e Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function
This patch adds MSA optimized ARGBMirrorRow function in libYUV project.

Performance gain ~3x

R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2368313003 .
2016-09-26 16:28:01 -07:00
Frank Barchard
34a29bf756 fix warning on visual C for mips cpu detect
follow up warning fixs
cpu_id.cc(167): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
lint warning: cpu_id.cc:171:  Missing space before ( in if(  [whitespace/parens] [5]

TBR=manojkumar.bhosale@imgtec.com
BUG=libyuv:634
TEST=try bots for windows.

Review URL: https://codereview.chromium.org/2365813002 .
2016-09-22 18:25:52 -07:00
Frank Barchard
c5323b0fdc Add MIPS SIMD Arch (MSA) optimized MirrorRow function
As per the preparation patch added in Chromium sources at,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds

This patch adds first MSA optimized function in libYUV project.

BUG=libyuv:634
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2285683002 .
2016-09-22 16:12:22 -07:00
Frank Barchard
6ad3aa6ae4 fix multi-line comment warning
../../source/scale_neon.cc:576:1: error: multi-line comment [-Werror=comment]
 // #define BLENDER(a, b, f) (uint8)((int)(a) + \
 ^

BUG=None
TEST=try bots

Review URL: https://codereview.chromium.org/2344203003 .
2016-09-16 15:16:39 -07:00
Frank Barchard
8279df963e Scale by 3/8 only if source is multiple of 8 tall.
BUG=libyuv:635
TEST=try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2347733002 .
2016-09-16 14:57:47 -07:00
Frank Barchard
137aa63afe Fix some comment typos
BUG=None
TEST=try bots

Review URL: https://codereview.chromium.org/2346633002 .
2016-09-15 15:38:19 -07:00
Frank Barchard
de944ed8c7 YuvConstants declare alignment for externs as well as declarations
On visual c 2013 and earlier a warning is generated if externs
are not declared with the same alignment as the declaration, when
using /ltcg

BUG=libyuv:633
TEST=standalong test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg
R=skal@google.com

Review URL: https://codereview.chromium.org/2291533004 .
2016-08-30 11:06:46 -07:00
Frank Barchard
c244a3e9a0 Add SplitUVPlanes and MergeUVPlanes
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.

BUG=libyuv:629
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2277603004 .
2016-08-24 16:47:24 -07:00
Frank Barchard
161e5c4569 Allow NULL for dst_y in planar formats. BUG=libyuv:631 TEST=unittests build/pass
BUG=libyuv:631
TEST=unittests build/pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2271053003 .
2016-08-24 10:19:14 -07:00
Frank Barchard
17d31e6a4a NV12 allow NULL for Y
The conversion from NV12 and other Bi or Tri planar formats, differs only in the UV handling.  The helper function supports passing a NULL for the dst_y channel indicating you only want to do the UV conversion.

TBR=harryjin@google.com
TEST=LibYUVConvertTest.NV12ToI420_NullY (601 ms)
BUG=libyuv:626

Review URL: https://codereview.chromium.org/2276703002 .
2016-08-23 19:05:25 -07:00
Frank Barchard
d58297a2df NV12ToI420 use SplitPlane function
TBR=magjed@chromium.org
BUG=libyuv:629
TEST=LibYUVConvertTest.NV12ToI420_Opt

Review URL: https://codereview.chromium.org/2267303002 .
2016-08-22 18:35:55 -07:00
Frank Barchard
36ae08ce1c Suppress MJPEG fprintf() runtime warning
TBR=harryjin@google.com
BUG=libyuv:630
TEST=local build and try bots pass

Review URL: https://codereview.chromium.org/2264293002 .
2016-08-22 16:30:36 -07:00
Frank Barchard
46a8eaaf0c fix typo in YUV
R=braveyao@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2152623002 .
2016-07-13 17:17:19 -07:00
Frank Barchard
1aa4ddd21c Attribute aligned 32 for YUV conversion structure on Intel
Fix for unaligned memory exception.

R=braveyao@chromium.org
BUG=libyuv:616

Review URL: https://codereview.chromium.org/2152553002 .
2016-07-13 12:19:26 -07:00
Frank Barchard
abcb70f183 Test nv21 layout of Android420ToI420 function.
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Any
[       OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_1_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Any
[       OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Unaligned
[       OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Invert
[       OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms)
[ RUN      ] LibYUVConvertTest.Android420ToI420_2_Opt
[       OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms)

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2146733002 .
2016-07-12 18:34:04 -07:00
Frank Barchard
84e04699c2 Add libyuv:Android420ToI420 function which takes 3 pointers
to Y,U,V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.

TEST=LibYUVConvertTest.Android420ToI420_Opt
BUG=libyuv:604
R=braveyao@chromium.org

Review URL: https://codereview.chromium.org/2114843002 .
2016-07-12 16:23:51 -07:00
Frank Barchard
2f101fdbda mingw64 fix - guard row_win.cc against mingw build.
The old guard only checked for defined(_M_X64) which is defined by mingw64.  Add a test for defined(_MSC_VER) which is defined for clangcl and visual c but not mingw.  mingw should use row_gcc.cc for both 32 and 64 bit.

R=harryjin@google.com
BUG=webm:1252,libyuv:613
TEST=local gcc/clang builds on linux tested and try bots for others.

Review URL: https://codereview.chromium.org/2105603002 .
2016-06-28 10:21:27 -07:00
Frank Barchard
b8ddb5a2a7 rounding for arm filter
R=wangcheng@google.com, harryjin@google.com
BUG=libyuv:607

Review URL: https://codereview.chromium.org/2093913004 .
2016-06-24 16:07:49 -07:00
Frank Barchard
1b3e4aee47 make count a memory variable for 32 bit
32 bit clang runs out of registers and compiler does core dump.
force 32 bit build to use memory variable for counter.

BUG=libyuv:612
TBR=harryjin@google.com

Review URL: https://codereview.chromium.org/2091913003 .
2016-06-23 20:42:10 -07:00
Frank Barchard
cc88adc620 YUV scale filter columns improved filtering accuracy
upscale a YUV image.  observe change in hue.. green especially.
disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C
observe hue.. green especially, is better.

was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)

BUG=libyuv:605
TEST=try bots
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Niels Möller
365ed3851c Treat YU12 as an alias for I420. Simplify setting of inv_crop_height.
BUG=
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/2020193002 .
2016-06-16 12:49:17 +02:00
Frank Barchard
fd3e676e91 android_full_debug x86 fix - use +rm for width count
Work around for android full debug build runnign out of registers.
5 functions were running out of registers causing the compiler error
error: 'asm' operand has impossible constraints
These functions mostly have 4 pointers, a counter (width) and a tempory
eax register.  With fpic and debug using stackframes, 2 registers are
unavailable.  So a total of 8 registers are used.
Although fpic and stack frame dont apply to assembly, the compiler
reserves 2 registers.  The optimized version builds, so its likely
freeing up the registers once it knows they are not used.
These functions used to build, so compile options and/or compiler may
have updated.. likely fpic was turned on.
An attribute can be done to disable each, and will avoid using the
2 GPR registers, but they are still reserved and unavailable in debug
builds on current compilers (gcc 4.9 and clang 3.8).

R=dhrosa@google.com
BUG=libyuv:602

Review URL: https://codereview.chromium.org/2066933002 .
2016-06-14 15:25:28 -07:00
Frank Barchard
026be3cd85 neon64 use width int directly.
width %w size modifier the int width can be passed directly to arm assembly.
For functions that take input constants, the outputs are declared as early
write using &, meaning the outputs use used before all inputs are consumed.

R=harryjin@google.com
BUG=libyuv:598

Review URL: https://codereview.chromium.org/2043073003 .
2016-06-08 10:26:53 -07:00
Frank Barchard
17e8a4d3df Remove ifdefs for neon in row_neon*.cc
ifdefs on a function level are not needed for neon functions, unless
they are conditionally enabled in row.h.  No functions are conditionally
enabled at this time, so all ifdefs can be removed from row_neon.cc and
row_neon64.cc

TBR=kjellander@chromium.org
BUG=libyuv:599

Review URL: https://codereview.chromium.org/2044223002 .
2016-06-07 14:34:13 -07:00
Frank Barchard
6546096269 ARGBExtractAlpha 16 pixels at a time for ARM
arm64   8     TestARGBExtractAlpha (10019 ms) <-original 64 bit code
arm64   8 x2  TestARGBExtractAlpha (7639 ms)
arm64   16    TestARGBExtractAlpha (7369 ms) <- new 64 bit code
thumb32 8     TestARGBExtractAlpha (9505 ms) <- original 32 bit code
thumb32 8 x2  TestARGBExtractAlpha (7400 ms)
thumb32 8 x2i TestARGBExtractAlpha (7266 ms) <- new 32 bit code
arm32   8     TestARGBExtractAlpha (10002 ms)

BUG=libyuv:572
TESTED=local test on nexus 9
R=harryjin@google.com, wangcheng@google.com

Review URL: https://codereview.chromium.org/2035573002 .
2016-06-07 10:44:28 -07:00
Frank Barchard
b00d40160a make unittest allocator align to 64 bytes.
blur requires memory be aligned.  change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.

TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.

Review URL: https://codereview.chromium.org/2019753002 .
2016-05-27 18:02:47 -07:00
Frank Barchard
ade85fb55c remove row.h from unittests
add SIMD_ALIGNED to unittest header.

BUG=libyuv:594
TESTED=local build passes with row.h removed from tests.
R=harryjin@google.com

Review URL: https://codereview.chromium.org/2001373002 .
2016-05-27 10:57:49 -07:00
Magnus Jedvert
942db3016a Add ARGBExtractAlpha function
BUG=libyuv:572
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1995293002 .
2016-05-26 10:30:57 +02:00
Frank Barchard
74a69522da white space fixes for MIPS
TBR=kjellander@chromium.org
BUG=None

Review URL: https://codereview.chromium.org/2005053004 .
2016-05-24 14:17:18 -07:00
Frank Barchard
7edf572e28 remove includes for duplicate functions
R=harryjin@google.com
BUG=libyuv:592
TESTED=local builds work with fewer headers

Review URL: https://codereview.chromium.org/2006943002 .
2016-05-23 17:38:26 -07:00
Frank Barchard
fbdc43a03c fix wrong HAS_ARGBCOPYALPHAROW_SSE2 ifdef
TBR=kjellander@chromium.org
BUG=libyuv:593
TESTED=try bots pass.

Review URL: https://codereview.chromium.org/2000393002 .
2016-05-23 16:26:02 -07:00
Frank Barchard
cf101116c9 Remove initialize to zero on output variables for inline.
Inline that uses temporary variables is currently initializing them
to 0 and passing in as output "+r".
This CL replaces the output constraint to "=&r" for most meaning an
output with early write (before inputs).  This allows the initialize
to zero step to be removed, saving 1 instruction.

BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
9c53ff2c57 Fix temporary stride for ConvertToARGB with rotation.
BUG=libyuv:578
TESTED=local unittests pass
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1879783002 .
2016-04-11 15:21:04 -07:00
Frank Barchard
3c862e3d29 Fix stride bug for msan on I420Interpolate.
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C

BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1872953002 .
2016-04-08 15:58:53 -07:00
Frank Barchard
c7372a323a add if defined(_MSC_FULL_VER) for NaCL
TBR=kjellander@chromium.org
BUG=libyuv:573
TESTED=try bots

Review URL: https://codereview.chromium.org/1850053002 .
2016-04-01 17:48:23 -07:00
Frank Barchard
76aee8ced7 Remove most clang-cl special cases from cpu_id.cc
They are not needed, and due to them there was a call to _xgetbv()
without a declaration of the function.  This used to work because we
implicitly included intrin.h in all translation units with clang-cl, but
we want to stop doing that.

BUG=chromium:592745
R=fbarchard@google.com

Review URL: https://codereview.chromium.org/1780473003 .
2016-03-10 14:01:26 -08:00
Frank Barchard
ee99b85126 Port ARGBToRGB565 from aarch64 neon to 32 bit
The 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts and inserts, saving some masking.  The instruction is available for neon 32 bit as well.

R=magjed@chromium.org, harryjin@google.com
BUG=libyuv:571

Review URL: https://codereview.chromium.org/1724393002 .
2016-02-29 12:22:25 -08:00
Frank Barchard
22e062a448 Port ARGBToJ420 to AVX2
ARGBToJ420 had an SSSE3 version, but not AVX2.
ARGBToI420 had an AVX2, so adapt that code to J420.

TBR=harryjin@google.com
BUG=libyuv:553

Review URL: https://codereview.chromium.org/1702373004 .
2016-02-17 23:16:39 -08:00
Frank Barchard
127ff512b3 add perf data files to ignores
document play services update

R=jkellander@chromium.org
BUG=none

Review URL: https://codereview.chromium.org/1712463002 .
2016-02-17 21:37:09 -08:00
Frank Barchard
cc33dc68c7 Port I411ToARGBRow to AVX2.
An SSSE3 version already exists, and an AVX2 version is available for
Visual C.  This ports the function to AVX2 completing the AVX2 ports of
all YUV to RGB functions for AVX2 on gcc.

TBR=harryjin@google.com
BUG=libyuv:555

Review URL: https://codereview.chromium.org/1687253002 .
2016-02-12 10:26:10 -08:00
Frank Barchard
0e554b18fe port NV12ToRGB565Row_AVX2 to gcc
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2

NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.

R=harryjin@google.com
BUG=libyuv:554

Review URL: https://codereview.chromium.org/1687953002 .
2016-02-10 11:13:41 -08:00
Frank Barchard
c39509c8e5 add avx2 wrappers for functions that can call I422ToARGBRow_AVX2
R=harryjin@google.com
BUG=libyuv:557

Review URL: https://codereview.chromium.org/1687713002 .
2016-02-09 17:14:29 -08:00
Frank Barchard
0d880e5bc0 rename MIPS_DSPR2 to DSPR2 for consistency
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor.  This should be named consistently.

Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.

TBR=harryjin@google.com
BUG=libyuv:562

Review URL: https://codereview.chromium.org/1677633002 .
2016-02-05 14:49:54 -08:00
Frank Barchard
05ed0c539c rework scale code for ubsan
For more info on ubsan, see
http://dev.chromium.org/developers/testing/undefinedbehaviorsanitizer

TESTED=Passing compilation using:
GYP_DEFINES="ubsan=1"
GYP_DEFINES="ubsan_vptr=1"

R=harryjin@google.com, pbos@webrtc.org
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1654253004 .
2016-02-02 11:01:49 -08:00
Frank Barchard
9e39c1f271 ubsan overflow fix for multiply by 0x01010101
This is an UBSan error reported by libjingle

[ RUN      ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride
[000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73
../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int'
[8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms)
[9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms)
Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted

The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes.
Changing the constant to 0x01010101u should avoid overflow.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:563

Review URL: https://codereview.chromium.org/1657533005 .
2016-02-01 12:29:04 -08:00
Frank Barchard
58cb534962 Fix memory overwrite in YUY2ToNV12 odd wdiths
When width was odd Y channel wrote an extra pixel.
This change splits the Y from UV into a temporary
buffer and memcpy's to the destination.  Performance
is slower.

Was
YUY2ToNV12_Any (307 ms)
YUY2ToNV12_Unaligned (213 ms)
TestYUY2ToNV12 (181 ms)
YUY2ToNV12_Opt (177 ms)
YUY2ToNV12_Invert (177 ms)

Npw
YUY2ToNV12_Any (300 ms)
YUY2ToNV12_Unaligned (226 ms)
YUY2ToNV12_Invert (206 ms)
TestYUY2ToNV12 (184 ms)
YUY2ToNV12_Opt (181 ms)
TBR=harryjin@google.com
BUG=libyuv:545

Review URL: https://codereview.chromium.org/1593833002 .
2016-01-19 11:28:09 -08:00
Frank Barchard
8377c798fb Fix I420ToNV21 for wrong dst_stride_y parameter.
I420ToNV21 passes the wrong dst_stride_y when it calls I420ToNV12; parameter 8 (convert_from.cc:448) is src_stride_y but should be dst_stride_y.  This causes image corruption when converting I420 -> NV21 with mismatched luminance strides.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:547

Review URL: https://codereview.chromium.org/1582793008 .
2016-01-14 17:38:54 -08:00
Frank Barchard
081475b3c8 refactor ARGBToI422 using ARGBToI420 internally
R=harryjin@google.com
BUG=libyuv:546

Review URL: https://codereview.chromium.org/1574253004 .
2016-01-12 17:05:49 -08:00
Frank Barchard
23c6a83561 Fix ifdef mismatch for mirroruv
Macro define and macro ifdef didnt match, leading to C code
being used.  Make macro match function name.

TBR=harryjin@google.com
BUG=libyuv:543

Review URL: https://codereview.chromium.org/1579023002 .
2016-01-11 16:33:36 -08:00
Frank Barchard
0e462e6f45 Remove use_sysroot=0
use_sysroot=0 is required for webrtc on linux intel builds, but
libyuv doesnt use the affected libraries, so removing this.

R=harryjin@google.com, sbc@chromium.org
BUG=libyuv:534,libyuv:542

Review URL: https://codereview.chromium.org/1566303002 .
2016-01-11 14:57:50 -08:00
Frank Barchard
fc52d8ded2 Odd width variation of scale down by 2 for subsampling
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:538

Review URL: https://codereview.chromium.org/1558093003 .
2016-01-06 15:12:17 -08:00
Frank Barchard
36615d62a0 fix for InterpolateRow_AVX2
port scaledownby4_avx2 to gcc

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1546763002 .
2015-12-22 12:29:54 -08:00
Frank Barchard
71deb7ba3a bug fix - remove shift from InterpolateRow_AVX2
TBR=harryjin@google.com
BUG=libyuv:537

Review URL: https://codereview.chromium.org/1547703002 .
2015-12-22 10:28:48 -08:00
Frank Barchard
2cb2e9e1ad fix for InterpolateRow_AVX2
TBR=harryjin@google.com
BUG=libyuv:535

Review URL: https://codereview.chromium.org/1543773002 .
2015-12-21 18:35:12 -08:00
Frank Barchard
3f4d86053e avx2 interpolate use 8 bit
BUG=libyuv:535
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1535833003 .
2015-12-21 10:57:32 -08:00
Frank Barchard
f4447745ae Add rounding to InterpolateRow for improved quality and consistency.
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly.  Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.

BUG=libyuv:535
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1533643005 .
2015-12-17 15:24:06 -08:00
Frank Barchard
1ccbf8fb7b use memory for loop counter to work around nearly out of registers
TBR=harryjin@google.com
BUG=libyuv:533

Review URL: https://codereview.chromium.org/1535433003 .
2015-12-16 17:13:37 -08:00
Frank Barchard
80ca4514ef change scale down by 4 to use rounding.
TBR=harryjin@google.com
BUG=libyuv:447

Review URL: https://codereview.chromium.org/1525033005 .
2015-12-15 21:25:18 -08:00
Frank Barchard
70445ef2ef avx2 scale down by 2 for gcc
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1520423003 .
2015-12-15 10:59:20 -08:00
Frank Barchard
ae55e41851 use rounding in scaledown by 2
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
b3bbcc1f4e add ifdef for AVX2 so vs2010 can still compile
R=harryjin@google.com
BUG=libyuv:531

Review URL: https://codereview.chromium.org/1515503005 .
2015-12-09 15:23:51 -08:00
Frank Barchard
cb44936403 fix typo in avx2 gcc blend.
was using wrong register on 32 pixel version.

R=harryjin@google.com, dhrosa@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1511433006 .
2015-12-09 10:38:46 -08:00
Frank Barchard
353ffbab80 fix for gcc compile error: variable duplicate define
TBR=harryjin@google.com
BUG=libyuv:529

Review URL: https://codereview.chromium.org/1512793002 .
2015-12-08 19:03:43 -08:00
Frank Barchard
a2ea905679 BlendPlane any width.
Benchmark
out\release\libyuv_unittest --libyuv_width=1279 --libyuv_height=719 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms

Was
I420Blend_Any (2321 ms)
I420Blend_Unaligned (1684 ms)
I420Blend_Opt (1675 ms)
I420Blend_Invert (1653 ms)
BlendPlane_Invert (1556 ms)
BlendPlane_Any (1552 ms)
BlendPlane_Unaligned (1548 ms)
BlendPlane_Opt (1535 ms)
ARGBBlend_Unaligned (659 ms)
ARGBBlend_Any (596 ms)
ARGBBlend_Invert (591 ms)
ARGBBlend_Opt (508 ms)
BlendPlaneRow_Unaligned (186 ms)
BlendPlaneRow_Opt (171 ms)

Now
ARGBBlend_Any (621 ms)
ARGBBlend_Unaligned (585 ms)
ARGBBlend_Invert (564 ms)
ARGBBlend_Opt (512 ms)
I420Blend_Unaligned (347 ms)
I420Blend_Invert (345 ms)
I420Blend_Any (337 ms)
I420Blend_Opt (327 ms)
BlendPlane_Unaligned (187 ms)
BlendPlaneRow_Unaligned (187 ms)
BlendPlane_Invert (186 ms)
BlendPlane_Any (186 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (171 ms)

which is comparable to aligned case
out\release\libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms
ARGBBlend_Any (625 ms)
ARGBBlend_Unaligned (602 ms)
ARGBBlend_Invert (508 ms)
ARGBBlend_Opt (506 ms)
I420Blend_Any (353 ms)
I420Blend_Unaligned (322 ms)
I420Blend_Invert (304 ms)
I420Blend_Opt (301 ms)
BlendPlaneRow_Unaligned (188 ms)
BlendPlane_Unaligned (186 ms)
BlendPlane_Invert (185 ms)
BlendPlane_Any (184 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (169 ms)

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1513443002 .
2015-12-08 18:59:48 -08:00
Frank Barchard
dee77a4ebe Optimize yuv alpha blend AVX2 code to do 32 pixels at time.
out/Release/libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --gtest_filter=*I420Blend_Opt

Was LibYUVPlanarTest.I420Blend_Opt (2335 ms)
Now LibYUVPlanarTest.I420Blend_Opt (1937 ms)

vs SSSE3
LibYUVPlanarTest.I420Blend_Opt (2599 ms)

BUG=libyuv:527
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1505673003 .
2015-12-08 18:20:30 -08:00
Frank Barchard
fae1a10545 Work around bug in xgetbv for Visual Studio.
xgetbv is generating bad code, falsely disabling AVX2 and AVX512.
disable optimization for the function affected on older versions of Visual C 32 bit.

R=brucedawson@chromium.org, dhrosa@google.com, harryjin@google.com
BUG=libyuv:529

Review URL: https://codereview.chromium.org/1503393004 .
2015-12-08 18:13:32 -08:00
Frank Barchard
2657688e70 Add support for odd height YUVA alpha blending.
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1507683003 .
2015-12-07 12:03:20 -08:00
Frank Barchard
b0b22f88b9 Unroll C version of YUV blender for improved performance.
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1502343003 .
2015-12-07 12:02:45 -08:00
Frank Barchard
48a919d86e Bug fix for UYVYToNV12 odd height
TBR=harryjin@google.com
BUG=libyuv:528

Review URL: https://codereview.chromium.org/1506973002 .
2015-12-07 11:39:48 -08:00
Frank Barchard
bea690b3e0 AVX2 YUV alpha blender and improved unittests
AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions.

unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1505433002 .
2015-12-05 22:23:29 -08:00
Frank Barchard
fa2618ee26 Port BlendPlaneRow_SSSE3 to GCC
R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1490273006 .
2015-12-04 11:19:41 -08:00
Frank Barchard
8af0ebf816 planar blend use signed images
R=dhrosa@google.com, harryjin@google.com, jzern@chromium.org
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1491533002 .
2015-12-02 14:20:17 -08:00
Frank Barchard
b6f37bd8ec Interpolate plane initial implementation.
YUV version of interpolation between two images.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:526

Review URL: https://codereview.chromium.org/1479593002 .
2015-11-25 16:11:42 -08:00
Frank Barchard
526558b2d8 disable debug build of 411 to work around compiler bug
TBR=harryjin@google.com
BUG=libyuv:524

Review URL: https://codereview.chromium.org/1461013002 .
2015-11-19 02:25:00 -08:00
Frank Barchard
b7dfb72559 fix for I411 build error on 32 bit x86
TBR=harrjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1461693004 .
2015-11-19 01:45:14 -08:00
Frank Barchard
528356a128 syntax fix for gcc movzwl
TBR=harryjin@google.com
BUG=libtyv:525

Review URL: https://codereview.chromium.org/1460723003 .
2015-11-18 13:14:15 -08:00
Frank Barchard
50f8cb2db3 port I411 movzx 2 byte reader to gcc
previously the I411 format used movd to read U, V pixels.
But this reads 4 bytes, and can cause a memory exception.
pinsrw can be used, but fails on drmemory 1.5, and is slow.
So in this change a movzxw is used to read 2 bytes into EBX,
then copy to xmm0 with movd.
Slightly slower, but no memory exception
Was LibYUVConvertTest.I411ToARGB_Opt (577 ms)
Now LibYUVConvertTest.I411ToARGB_Opt (608 ms)

TBR=harryjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1457783004 .
2015-11-18 13:05:39 -08:00
Frank Barchard
5eefbe2330 Fix for drmemory failure on I411ToARGB
Before
I420ToARGB_Opt (594 ms)
I422ToARGB_Opt (483 ms)
I411ToARGB_Opt (748 ms) ***
I444ToARGB_Opt (452 ms)
I400ToARGB_Opt (218 ms)

After
I420ToARGB_Opt (591 ms)
I422ToARGB_Opt (454 ms)
I411ToARGB_Opt (502 ms)  ***
I444ToARGB_Opt (441 ms)
I400ToARGB_Opt (216 ms)

TBR=harryjin@google.com
BUG=libyuv:525

Review URL: https://codereview.chromium.org/1459513002 .
2015-11-17 18:00:52 -08:00
Frank Barchard
0815568a50 test for unaligned vs aligned for CopyRow_SSE2
improves performance on older CPUs where movdqa is faster.
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1455463002 .
2015-11-17 00:04:03 -08:00
Frank Barchard
1019e4537f port I444ToARGB avx2 code from Visual C to GCC.
SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)

AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1445893002 .
2015-11-13 18:31:22 -08:00
Frank Barchard
60adcbaf32 scale with conversion using 2 steps with unittest
a prototype function to implement the yuv to rgb with conversion and scale.
replace with 1 step function in future version, using same API.

R=harryjin@google.com
BUG=libyuv:471

Review URL: https://codereview.chromium.org/1421553016 .
2015-11-13 11:25:56 -08:00
Frank Barchard
6100f50f13 fix yvu constants for avx2 yuv to rgb
the yvu matrix for yuv to rgb had an incorrect entry, affecting yuv to bgra,
yuv to abgr and yuv to raw.
fix the matrix and reenable avx2 functions.

R=harryjin@google.com
BUG=libyuv:522

Review URL: https://codereview.chromium.org/1411763004 .
2015-11-10 10:45:44 -08:00
Frank Barchard
72a9e282ec disable more avx2 functions that dont link in chrome
libyuv builds/runs, but when integrated into chromium, produces link errors.  unclear why but this disables affected functions.
will followup with re-enabling them once the root cause in the runtime error is found.

TBR=harryjin@google.com
BUG=libyuv:522

Review URL: https://codereview.chromium.org/1427683004 .
2015-11-09 17:20:02 -08:00
Frank Barchard
98eb102bea set d19 alpha on inner loop
TBR=harryjin@google.com
BUG=libyuv:521

Review URL: https://codereview.chromium.org/1429263004 .
2015-11-06 11:38:21 -08:00
Frank Barchard
431cb3667a YUV to RGB for x64 use registers instead of memory.
On Arm the YVU to RGB conversions move constants into registers.
This change does the same for 64 bit intel builds where additional
registers are available.
The AVX2 saves 3 instructions by because the 2nd argument needs to be a register, so a vmovdqu was avoided.

x64 builds using memory:
AVX2  I420ToARGB_Opt (3059 ms)
SSSE3 I420ToARGB_Opt (3959 ms)

Now using registers
AVX2  I420ToARGB_Opt (2906 ms)
SSSE3 I420ToARGB_Opt (3928 ms)

TBR=harryjin@google.com
BUG=libyuv:520

Review URL: https://codereview.chromium.org/1407353010 .
2015-11-04 16:16:18 -08:00
Frank Barchard
860cc0357a Neon versions of I420AlphaToARGB
Add alpha version of YUV to RGB to neon code for ARMv7 and aarch64.
For other YUV to RGB conversions, hoist alpha set to 255 out of loop.

TBR=harryjin@google.com
BUG=libyuv:516

Review URL: https://codereview.chromium.org/1413763017 .
2015-11-03 19:21:36 -08:00
Frank Barchard
d95d2169d9 rename yuv matrix constants to be more clear about what they are
R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1429693006 .
2015-11-03 17:09:53 -08:00
Frank Barchard
1f1d140bb6 remove mips dsp detect
DSP code is not actually used, only DSPR2.  Remove the detect.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1405043008 .
2015-11-03 16:57:40 -08:00
Frank Barchard
ce4c2fad1d Raw 24 bit RGB to RGB24 (bgr)
Add unittests that do 1 step conversion vs 2 step conversion.

Tests end swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1419103007 .
2015-11-03 10:30:30 -08:00
Frank Barchard
87926cec8b remove store bgra, abgr, raw unused macros
TBR=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1420033004 .
2015-11-02 10:40:03 -08:00
Frank Barchard
2c7aa0070a remove I422ToBGRA and use I422ToRGBA internally
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.

Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions.

R=harryjin@google.com
BUG=libyuv:518

Review URL: https://codereview.chromium.org/1427993004 .
2015-11-02 10:24:12 -08:00
Frank Barchard
5d97b93369 refactor I420ToABGR to use I420ToARGBRow
Using a transposed conversion matrix, I420ToARGB can output ABGR.

R=harryjin@google.com, xhwang@chromium.org
BUG=libyuv:473

Review URL: https://codereview.chromium.org/1413573010 .
2015-10-30 11:56:57 -07:00
Frank Barchard
cdbdf5b723 Fix debug compilation problems for gcc and 32 bit x86.
In some methods with 7 arguments gcc fails to find enough registers
to compile the assembler code when compiling debug. Simplest solution
is to skip the assembler version in debug of those particular functions

(I422Alpha -> ARBG/ABGR)

R=harryjin@google.com,bratell@opera.com
BUG=libyuv:517

Review URL: https://codereview.chromium.org/1423283002 .
2015-10-28 14:27:29 -07:00
Frank Barchard
b86dbf24d3 refactor I420AlphaToABGR to use I420AlphaToARGB internally
swap U and V and transpose conversion matrix, so I420AlphaToARGB and
I420AlphaToABGR share low level code.

Having less code with same performance allows more focused
optimization for future ARM versions.

R=harryjin@google.com
TBR=harryjin@chromium.org
BUG=libyuv:473,libyuv:516

Review URL: https://codereview.chromium.org/1422263002 .
2015-10-27 14:17:21 -07:00
Frank Barchard
cf160cdbaa implement I444ToABGR by swapping uv and transpose matrix
U contributes to B and G.  V contributes to R and G.
By swapping U and V, they contribute to the opposite channels.  Adjust the matrix so the U contribution is in the matrix location such that it till contribute to the
new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.

As a result the existing I444ToABGRRow functions are no longer needed and are removed.

Previously this function was only Intel AVX2 optimized for Windwos.  Now it is also optimized for Arm and GCC.

ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.

R=xhwang@chromium.org
BUG=libyuv:515

Review URL: https://codereview.chromium.org/1414133006 .
2015-10-27 10:21:21 -07:00
Frank Barchard
2844662e1c Add avx512bw detection code
R=harryjin@google.com
BUG=libyuv:514

Review URL: https://codereview.chromium.org/1413463004 .
2015-10-26 14:42:49 -07:00
Frank Barchard
1502832a70 switch cpu flags to 0 for unitialized to avoid compare
R=harryjin@google.com
BUG=libyuv:512

Review URL: https://codereview.chromium.org/1418253002 .
2015-10-23 10:57:42 -07:00
Frank Barchard
ad36ba5c48 initialize cpu flags to fix compile error on windows
R=harryjin@google.com
BUG=libyuv:512

Review URL: https://codereview.chromium.org/1422733003 .
2015-10-22 15:16:31 -07:00
Frank Barchard
430bb0a0f0 odd width 444 fix
TBR=harryjin@google.com
BUG=libyuv:510

Review URL: https://codereview.chromium.org/1415583003 .
2015-10-21 20:03:19 -07:00
Frank Barchard
90335f6043 bug fix for odd width 16/24 bit to i420
A bug was introduced on arm when the code for 'any' width switch to
a temporary stack buffer and simd.
The C version handles odd width by doing 1 pixel, instead of averaging 2.
But the SIMD any version is supposed to replicate the last pixel, then
the subsampling in Neon will average the pixel with itself, producing
the same result.
The previous version did this, but only for ARGB 32 bit, which was to
avoid introducing issues with subsampled YUY2 source.  This CL adds
replication for RGB 16 bit values.

TBR=harryjin@google.com
BUG=libyuv:510

Review URL: https://codereview.chromium.org/1418983003 .
2015-10-21 18:23:02 -07:00
Frank Barchard
5bf4de0806 width and 3 bug fix in odd width support of ARGBToI411
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1415213002 .
2015-10-21 12:45:08 -07:00
Frank Barchard
ba4b409d51 Fix ARGBToI411 odd width bug.
The any function for handling ARGBToI411 was not handling the pixel
replication correctly.  On 422 and odd width was handled by duplicating
a pixel of source.  411 needs replication for remainders of 1, 2 or 3
pixels.

The C version was handling odd width but with an average of the remainder
pixels, which does not match the SIMD 'any' handling off remainder.
This changes the odd width handling to mimic the any version.

TBR=harryjin@google.com
BUG=libyuv:491

Review URL: https://codereview.chromium.org/1411733004 .
2015-10-21 12:22:24 -07:00
Frank Barchard
9daa550a2e Move cpu_info variable outside ifdef
Fix compile error on arm, mips etc due to undefined variable.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1403373008 .
2015-10-20 16:32:44 -07:00
Frank Barchard
9be6d21ae7 write to cpu_flags once
To make init cpu flags thread safe, there can only be one write to the variable.

R=richard.winterton@intel.com, harryjin@google.com
BUG=libyuv:508

Review URL: https://codereview.chromium.org/1412793006 .
2015-10-20 16:24:01 -07:00
Frank Barchard
cf19a0c9a2 nv21 any fix
R=harryjin@google.com
BUG=libyuv:507

Review URL: https://codereview.chromium.org/1410643002 .
2015-10-15 16:24:51 -07:00
Frank Barchard
52a5504950 fix for C version of YUV to RGB for Arm
YuvPixel for arm was miscomputing YG.

TBR=harryjin@google.com
BUG=libyuv:506

Review URL: https://codereview.chromium.org/1402333002 .
2015-10-15 12:43:37 -07:00
Frank Barchard
4abd096548 fix for yuv to rgb on arm64.
fill in aarch64 yuv constants to match how the code expects them.

TBR=harryjin@google.com
BUG=libyuv:502

Review URL: https://codereview.chromium.org/1396253004 .
2015-10-12 12:02:54 -07:00
Frank Barchard
2e4466e282 change all pix parameters to width for consistency
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
Frank Barchard
76a599ec3b fix jpeg and bt.709 yuvconstants for neon64.
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces.  But the jpeg and bt.709
colour conversion constants were still in armv7 form.  This changes
the constants for aarch64 builds to be compatible with the code.

yuv constants are now passed as const *

Remove Yvu constants which were used for older version on nv21 but not new code.

TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398623002 .
2015-10-07 19:46:56 -07:00
Frank Barchard
fae8e66d43 Fix for AVX2 dither function.
Fix for 64 bit gcc parameter in dither function which requires m not r,
when ABI uses register.

BUG=none

Review URL: https://codereview.chromium.org/1399463002 .
2015-10-07 19:17:56 -07:00
Frank Barchard
8f0cadede4 port ARGB to 565 dithering AVX2 code to GCC.
Previously the assembly code was only available to Windows.
This CL ports the AVX2 code to GCC syntax.

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1391273003 .
2015-10-07 19:13:59 -07:00
Frank Barchard
cc89e3a77b port ARGB to 565 dithering SSE2 code to GCC.
Previously the assembly code was only available to Windows.
This CL ports the SSE2 code to GCC syntax.

When running a profiler on all the unittests, this function
was the slowest of all functions that still ran in C code.
   3.71%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C

Was
ARGBToRGB565Dither_Opt (2894 ms)
Now
ARGBToRGB565Dither_Opt (432 ms)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1397673002 .
2015-10-07 18:24:50 -07:00
Frank Barchard
3e38762d6b fix avx2 box filter bug for yuv down sampling.
offset to second group of pixels was off by 16.
should have been 32, not 16.
requires avx2 hardware and wide image for test.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:492,libyuv:501

Review URL: https://codereview.chromium.org/1395603002 .
2015-10-07 11:02:33 -07:00
Frank Barchard
013080f2d2 Pass yuvconstants to YUV conversions for neon 64 bit
SETUP provided by zhongwei.yao@linaro.org

Previously the 64 bit Neon code had hard coded constants in the setup macro
for YUV conversion, while 32 bit Neon code supported the yuvconstants
parameter.

This change accepts the constants passed to the YUV conversion row function,
allowing different color spaces to be respected - naming JPEG and BT.709.
As well as the existing BT.601.

TBR=harryjin@google.com
BUG=libyuv:472

Review URL: https://codereview.chromium.org/1384323002 .
2015-10-06 22:19:14 -07:00
Frank Barchard
914a9856c7 Reimplement NV21ToARGB to allow different color matrix.
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V.  But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.

TBR=harryjin@google.com
BUG=libyuv:500

Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
68fa59c873 add box scaling avx2 optimization for gcc
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1392803002 .
2015-10-06 20:01:02 -07:00
Frank Barchard
f00bc9ef46 Add J444ToARGB conversion function.
J444 is JPeg YUV color space with 444 subsampling.
This implementation uses the existing I444ToARGB conversion, which is
BT.601 color space with 444 subsampling, but passing in the jpeg
color matrix constants.

TBR=harryjin@google.com
BUG=449

Review URL: https://codereview.chromium.org/1387313002 .
2015-10-06 18:46:53 -07:00
Frank Barchard
d70293993f port scale box filter sse2 to gcc
TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1393653002 .
2015-10-06 16:54:26 -07:00
Frank Barchard
3eefeaeb69 test xsave before calling xgetbv.
R=agl@chromium.org, harryjin@google.com
BUG=libyuv:497

Review URL: https://codereview.chromium.org/1382803002 .
2015-09-30 17:25:41 -07:00
Frank Barchard
2cc1a2b233 Remove sse2 functions that also have ssse3
ARGBBlendRow_SSE2, ARGBAttenuateRow_SSE2, and MirrorRow_SSE2
Since vast majority of CPUs have SSSE3 now, removing the SSE2
improves the performance of CPU dispatching.

R=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1377053003 .
2015-09-30 14:24:44 -07:00
Frank Barchard
d039ad6e9b Width use memory instead of register for 32 bit fpic.
Code runs out of registers on 32 bit fpic builts.

TBR=harryjin@google.com
BUG=libyuv:496

Review URL: https://codereview.chromium.org/1369053002 .
2015-09-25 15:36:04 -07:00
Frank Barchard
febc26a2c9 win64 version of I422AlphaToARGB.
Was
I420AlphaToARGB_Premult (8861 ms)
I420AlphaToARGB_Opt (7119 ms)
Now
I420AlphaToABGR_Premult (2840 ms)
I420AlphaToARGB_Opt (484 ms)

C function switched to 1 step.
Was
I420AlphaToARGB_Premult (8862 ms)
I420AlphaToABGR_Opt (6718 ms)

Now
I420AlphaToARGB_Premult (8706 ms)
I420AlphaToARGB_Opt (6541 ms)

R=harryjin@google.com
BUG=libyuv:496, libyuv:473

Review URL: https://codereview.chromium.org/1359183003 .
2015-09-25 15:06:41 -07:00
Frank Barchard
9a0e12f5f1 AVX2 1 step I422AlphaToARGB for gcc and win.
C     I420AlphaToARGB_Opt (5169 ms)
SSSE3 I420AlphaToARGB_Opt (432 ms)
AVX2  I420AlphaToARGB_Opt (358 ms)

and with premultiplication as 2 step process:
I420AlphaToARGB_Premult (7029 ms)
I420AlphaToARGB_Premult (757 ms)
I420AlphaToARGB_Premult (508 ms)

R=harryjin@google.com
BUG=libyuv:496,libyuv:473

Review URL: https://codereview.chromium.org/1372653003 .
2015-09-25 13:37:42 -07:00
Frank Barchard
e365cdde3b I420Alpha row function in 1 pass.
API change - I420AlphaToARGB takes flag indicating if RGB should be
premultiplied by alpha.

This version implements an efficient SSSE3 version for Windows.
C version done in 2 steps.

Was
libyuvTest.I420AlphaToARGB_Any (1136 ms)
libyuvTest.I420AlphaToARGB_Unaligned (1210 ms)
libyuvTest.I420AlphaToARGB_Invert (966 ms)
libyuvTest.I420AlphaToARGB_Opt (1031 ms)
libyuvTest.I420AlphaToABGR_Any (1020 ms)
libyuvTest.I420AlphaToABGR_Unaligned (1359 ms)
libyuvTest.I420AlphaToABGR_Invert (1082 ms)
libyuvTest.I420AlphaToABGR_Opt (986 ms)

R=harryjin@google.com
BUG=libyuv:496

Review URL: https://codereview.chromium.org/1367093002 .
2015-09-25 10:29:20 -07:00
Frank Barchard
d4594beefc switch from ebp to ebx.
ebx encodes more efficiently (1 byte less) for most address modes, than ebp.
previously it was used for 411 format, but the reader uses pinsrw now avoiding
gpr register.

BUG=libyuv:488
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1365003003 .
2015-09-24 17:25:11 -07:00
Frank Barchard
8fb2048e9f Fix nv12 64 bit gcc increment.
Should be 16 bytes, but was 0x16 causing memory corruption.

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1368693002 .
2015-09-24 10:19:17 -07:00
Frank Barchard
accc04e6d8 NV12ToARGB_AVX2 ported to gcc
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1364913002 .
2015-09-23 15:54:16 -07:00
Frank Barchard
000cf89ca8 YUY2ToARGB avx2 in 1 step conversion.
Includes UYVYToARGB ssse3 fix.

Was
YUY2ToARGB_Opt (433 ms)
69.79%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBRow_AVX2
20.73%  libyuv_unittest  libyuv_unittest      [.] YUY2ToUV422Row_AVX2
 6.04%  libyuv_unittest  libyuv_unittest      [.] YUY2ToYRow_AVX2
 0.77%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_AVX2

Now
YUY2ToARGB_Opt (280 ms)
95.66%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_AVX2

BUG=libyuv:494
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1364813002 .
2015-09-23 11:15:18 -07:00
Frank Barchard
2b92ec8d0f Fix git markers introduced on landing previous CL
BUG=none

Review URL: https://codereview.chromium.org/1359023003 .
2015-09-22 15:00:57 -07:00
Frank Barchard
5f3d4270d1 yuy2 to rgb gcc versions
read in read function for yuv conversion

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1355393002 .
2015-09-22 14:27:33 -07:00
Frank Barchard
03cd8584e7 Read Y channel in read function for yuv conversion.
Allows reader to support YUY2 format.
Also contains fix for win64 build for yuv conversion.

TBR=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1355333002 .
2015-09-22 12:05:16 -07:00
Frank Barchard
f96890a0be yuvconstants for all YUV to RGB conversion functions.
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1363503002 .
2015-09-22 10:26:03 -07:00
Frank Barchard
62c49dc811 move constants into common
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1359443005 .
2015-09-18 16:28:44 -07:00
Frank Barchard
0381673d19 port I444 to ARGB to matrix. Add I444 to ABGR.
R=harryjin@google.com
BUG=libyuv:488,libyuv:490

Review URL: https://codereview.chromium.org/1348763005 .
2015-09-18 14:36:15 -07:00
Frank Barchard
28427a53e2 I444ToABGR for android
Reimplements I444ToARGB as a matrix function.
new I444ToABGR as matrix functions with wrappers and any functions.
Allows for future J444 and H444 versions.
I444ToABGR user level function added.

BUG=libyuv:490, libyuv:449
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1355733002 .
2015-09-18 11:20:58 -07:00
Frank Barchard
28ce7d94f5 j422toabgr neon port using i422toabgr matrix function.
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1353923003 .
2015-09-17 15:20:55 -07:00
Frank Barchard
6fcbae1409 J422ToARGB Neon but not aarch64
TBR=harryjin@google.com
BUG=libyuv:493

Review URL: https://codereview.chromium.org/1348203004 .
2015-09-17 12:43:05 -07:00
Frank Barchard
6a6b67e7a9 Add H422ToARGB armv7 neon version.
Patch provided by zhongwei.yao@linaro.org

R=fbarchard@chromium.org, fbarchard@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1344393002 .
2015-09-17 10:38:15 -07:00
Frank Barchard
509c644245 Add J422ToARGB armv7 neon version.
R=fbarchard@chromium.org, fbarchard@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1334173005 .
2015-09-15 15:01:48 -07:00
Frank Barchard
73c32d92d7 neon64 use yuvconstants like 32 bit code.
TBR=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1345643002 .
2015-09-14 16:43:07 -07:00
Frank Barchard
a67927c994 use struct instead of vectors
TBR=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1345623003 .
2015-09-14 16:07:58 -07:00
Frank Barchard
909160b3b5 use same macros as row_gcc.cc
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1343863002 .
2015-09-14 15:36:10 -07:00
Frank Barchard
fcacbfb27f validate scan EOI from end for better coverage
R=tpsiaki@google.com
BUG=libyuv:478

Review URL: https://codereview.chromium.org/1344623003 .
2015-09-14 10:58:51 -07:00
Frank Barchard
67a9e30225 neon yuv matrix function
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://codereview.chromium.org/1337973002 .
2015-09-11 11:12:30 -07:00
Frank Barchard
316e1ab996 avx2 width parameter bug fix
R=harryjin@google.com
BUG=libyuv:489

Review URL: https://codereview.chromium.org/1321773004 .
2015-09-09 11:56:35 -07:00
Frank Barchard
ed55d24d9f H420 functionality
R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/54869004 .
2015-09-06 11:01:40 -07:00
Frank Barchard
67b06e66cb I422ToABGR for win64. Moves any functions to accomidate win64 subset of formats.
TBR=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/57679004 .
2015-09-03 11:00:18 -07:00
Frank Barchard
7060e0d826 I420ToABGRMatrix functions with J420ToABGR wrapper.
Allows direct conversion from JPeg to ABGR for android.

BUG=libyuv:488
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/55719004 .
2015-09-03 10:42:36 -07:00
Frank Barchard
925c3d9e26 I420ToARGB conversion with matrix.
Take color conversion constants as a parameter to row function for I420ToARGBMatrixRow_SSSE3.
Allows future variations of color space using a single low level.

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/56669004 .
2015-09-02 10:45:42 -07:00
Frank Barchard
0bc626a5d7 nolint removed
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/59389004.
2015-08-31 10:52:13 -07:00
Frank Barchard
0735245c52 pinsrw instruction allows reading 2 bytes directly into an xmm register.
Saving a gpr register allows the register to not be pushed for now, and in future it can be used to point to color conversion matrix or alpha channel.

R=harryjin@google.com
BUG=libyuv:488

Review URL: https://webrtc-codereview.appspot.com/52789004.
2015-08-28 17:03:54 -07:00
Frank Barchard
be11f500f0 Use ebp to point to conversion table.
Proof of concept that conversions can table color matrix as a parameter.

R=harryjin@google.com

BUG=libyuv:472, libyuv:488

Review URL: https://webrtc-codereview.appspot.com/58489004.
2015-08-28 12:00:49 -07:00
Frank Barchard
3c4f5735ce use pointer to inverse table for clangcl
R=harryjin@google.com
TBR=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/54859004.
2015-08-26 12:53:03 -07:00
Frank Barchard
5452cce452 port row to clangcl
BUG=libyuv:487
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/53799005.
2015-08-25 16:15:42 -07:00
Frank Barchard
fa7ce4af3f fixed table for clangcl
R=harryjin@google.com
BUG=libyuv:487

Review URL: https://webrtc-codereview.appspot.com/53799004.
2015-08-25 10:47:30 -07:00
Frank Barchard
d317a70c1d llvm64 link error fix.
R=harryjin@google.com
BUG=libyuv:485

Review URL: https://webrtc-codereview.appspot.com/58479004.
2015-08-24 14:21:04 -07:00
Frank Barchard
4dfdabb552 I420AlphaToABGR for android version of yuva conversion
Same as I420AlphaToARGB but first step converts to ABGR instead of ARGB.

TBR=harryjin@google.com
BUG=libyuv:473

Review URL: https://webrtc-codereview.appspot.com/52779004.
2015-08-20 19:36:59 -07:00
Frank Barchard
ee9aaea02f i422torgb565 is asm for clangcl as well
Merge branch 'master' of https://chromium.googlesource.com/libyuv/libyuv into convertcl

allow lto for llvm but not gcc

R=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/52769004.
2015-08-19 10:46:30 -07:00
Frank Barchard
94d4269936 clang use scalewin
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/51329004.
2015-08-18 14:50:27 -07:00
Frank Barchard
cda9d38a4e xmmword cast for clang
clangcl use compare_win for 32 bit, allowing fallback and enabling avx2 code for clang.
move defines/protos to compare_row.h
fix issue with odd width ARGBCopyAlpha functions by copying destination to temp buffer, then doing alpha copy, then copy back to destination.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:484

Review URL: https://webrtc-codereview.appspot.com/59379004.
2015-08-18 11:13:12 -07:00
Frank Barchard
baf6a3c1bd Using the visual C source allows clangcl to fallback seamlessly to visual c, and supports SSE41 and AVX2 versions.
R=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/58469004.
2015-08-17 10:47:43 -07:00
Frank Barchard
278d88f872 Copy Alpha odd width support
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/59369004.
2015-08-13 15:05:14 -07:00
Frank Barchard
8e7a62f22a I420AlphaToARGB conversion for planar YUV with Alpha to ARGB.
R=brucedawson@chromium.org, harryjin@google.com
BUG=libyuv:473

Review URL: https://webrtc-codereview.appspot.com/54829004.
2015-08-12 17:01:24 -07:00
Frank Barchard
58f0020137 use visual c 32 bit code for clangcl
R=harryjin@google.com
BUG=libyuv:483

Review URL: https://webrtc-codereview.appspot.com/54819004.
2015-08-11 10:10:45 -07:00
Frank Barchard
9425c4b01a rotate nv12 any width
BUG=libyuv:464
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/55709004.
2015-08-07 23:48:38 -07:00
Frank Barchard
1f461f73d8 remove align directives
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/54809004.
2015-08-04 17:00:03 -07:00
Frank Barchard
6e7ef3fddc allow xgetbv to be disabled for drmemory testing
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/56649004.
2015-08-04 15:00:39 -07:00
Frank Barchard
e40384b6d9 remove 32 bit gcc version of UV transpose
TBR=harryjin@google.com
BUG=libyuv:481

Review URL: https://webrtc-codereview.appspot.com/52249004.
2015-08-03 18:03:55 -07:00
Frank Barchard
f14c433916 rotate macros used for source
R=brucedawson@chromium.org, harryjin@google.com
BUG=libyuv:481

Review URL: https://webrtc-codereview.appspot.com/52239004.
2015-08-03 16:12:18 -07:00
Frank Barchard
7cd7f5a80f avx ifdef for scale HAS_SCALEADDROW_AVX2.
R=jzern@google.com
BUG=libyuv:480

Review URL: https://webrtc-codereview.appspot.com/53779004.
2015-07-31 17:17:14 -07:00
Frank Barchard
f242a4a1a1 ValidateJpeg check for valid pointer and size
R=harryjin@google.com
BUG=chromium:497297

Review URL: https://webrtc-codereview.appspot.com/57649004.
2015-07-30 15:49:48 -07:00
Frank Barchard
93464b926c Add rotate any support. Fix for sobel for neon which does 16 at a time, not 8. Disable scaling color test that fails on arm. Test is not complete.
R=harryjin@google.com
BUG=libyuv:479

Review URL: https://webrtc-codereview.appspot.com/52229004.
2015-07-28 15:06:20 -07:00
Frank Barchard
45230390ff add support for odd width rotate
R=harryjin@google.com
BUG=libyuv:464

Review URL: https://webrtc-codereview.appspot.com/52219004.
2015-07-28 14:30:07 -07:00
Frank Barchard
cb54e8b69a rename rotate macros and functions to match
BUG=libyuv:477
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/52199004.
2015-07-27 17:00:41 -07:00
Frank Barchard
18a9027ad9 const warning fix on dither, bump chromium deps and add files to ignore list generated by arm build
BUG=none
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/57639004.
2015-07-27 11:47:01 -07:00
Frank Barchard
2fa4f5a3ea Adds files and functions for rotate any, but does not hook them up to the caller.
rotate any

R=harryjin@google.com
BUG=libyuv:464

Review URL: https://webrtc-codereview.appspot.com/53769004.
2015-07-27 10:32:08 -07:00
Frank Barchard
3a3a89ccd4 rotate include and proto cleanup
R=harryjin@google.com
BUG=libyuv:468

Review URL: https://webrtc-codereview.appspot.com/55679005.
2015-07-22 18:09:04 -07:00
Frank Barchard
5be90d23ee rotate row included
R=tpsiaki@google.com
BUG=libyuv:468

Review URL: https://webrtc-codereview.appspot.com/55679004.
2015-07-22 17:10:08 -07:00
Frank Barchard
892807d860 move asm out of rotate into win/gcc and header
R=harryjin@google.com
BUG=libyuv:468

Review URL: https://webrtc-codereview.appspot.com/51319004.
2015-07-22 11:22:55 -07:00
Frank Barchard
ce98129951 yuy2tonv12
R=bcornell@google.com
BUG=libyuv:466

Review URL: https://webrtc-codereview.appspot.com/51309004.
2015-07-17 16:22:59 -07:00
Frank Barchard
faa4b14f85 uyvy to nv12
R=harryjin@google.com
BUG=libyuv:466

Review URL: https://webrtc-codereview.appspot.com/50339004.
2015-07-17 14:43:19 -07:00
Frank Barchard
faebf89ce0 src_uv typo fix
R=harryjin@google.com
BUG=none

Review URL: https://webrtc-codereview.appspot.com/51299004.
2015-07-15 18:21:06 -07:00
Frank Barchard
3d190ee9f1 break rotate into files by cpu in preparation for optimization.
R=bcornell@google.com
BUG=libyuv:464

Review URL: https://webrtc-codereview.appspot.com/51289004.
2015-07-14 10:23:10 -07:00
Frank Barchard
673fe7a684 create rotate_row header
R=tpsiaki@google.com, tpsiaki
BUG=none
TESTED=local build still works.

Review URL: https://webrtc-codereview.appspot.com/50329004.
2015-07-09 14:40:35 -07:00
Frank Barchard
0e83b64e88 scalerow avx2 bug fix. was using ymm2 instead of ymm3.
R=harryjin@google.com
BUG=libyuv:462

Review URL: https://webrtc-codereview.appspot.com/56639004.
2015-07-07 17:48:04 -07:00
Frank Barchard
715a29195b vpermq for avx2 ARGB4444ToARGB, ARGB1555ToARGB and RGB565ToARGB
R=harryjin@google.com
BUG=libyuv:462

Review URL: https://webrtc-codereview.appspot.com/52759004.
2015-07-07 17:06:04 -07:00
Frank Barchard
97b35daf75 disable faulty avx2 in argb conversions and box filter. and extend temporary buffer to 128 for an avx2 any function.
R=harryjin@google.com
BUG=libyuv:462
TESTED=libyuv_unittest run on haswell laptop

Review URL: https://webrtc-codereview.appspot.com/53759004.
2015-07-07 15:40:24 -07:00
Frank Barchard
0737ff5bd0 128 for avx2
R=harryjin@google.com
BUG=libyuv:461

Review URL: https://webrtc-codereview.appspot.com/55649004.
2015-07-04 09:13:20 -07:00
Frank Barchard
9487b9d6d8 any allow for avx2 32 pixels at a time of argb
R=harryjin@google.com
BUG=libyuv:461

Review URL: https://webrtc-codereview.appspot.com/54779004.
2015-07-01 17:50:48 -07:00
Frank Barchard
82180e8296 rgb24toyuv use 1 or 2 steps consistently.
R=bcornell@google.com, impjdi@google.com
BUG=libyuv:459

Review URL: https://webrtc-codereview.appspot.com/52149004.
2015-06-29 16:51:05 -07:00
Frank Barchard
0686f26938 blend remove alignment 1 pixel loop for less overhead.
R=tpsiaki@google.com
BUG=none
TESTED=libyuvTest.ARGBBlend_Opt

Review URL: https://webrtc-codereview.appspot.com/50289005.
2015-06-24 11:34:12 -07:00
Frank Barchard
553c7f85f1 mirror odd width with simd
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/54769004.
2015-06-23 17:53:02 -07:00
Frank Barchard
6a9ef1ea36 any 1 to 2 with stride use SIMD
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/54759004.
2015-06-23 17:08:08 -07:00
Frank Barchard
6dde4f14bd argb to uv read 4 not 8
R=harryjin@google.com
BUG=libyuv:457

Review URL: https://webrtc-codereview.appspot.com/52139004.
2015-06-23 14:48:37 -07:00
Frank Barchard
54100b91c1 copy 2 rows for interpolate and use SIMD.
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/50279004.
2015-06-23 10:41:46 -07:00
Frank Barchard
3b5d726a4f 1 to 1 any functions with a parameter use memcpy.
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/57619004.
2015-06-22 15:08:20 -07:00
Frank Barchard
a0fca88b1d remove fmemcpy and bump version
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/50269004.
2015-06-19 17:58:17 -07:00
Frank Barchard
722e87f19f string.h for memcpy
R=harryjin
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/57609004.
2015-06-19 16:40:22 -07:00
Frank Barchard
dfb2120a42 set us simd
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/55629004.
2015-06-19 14:18:48 -07:00
Frank Barchard
6608c100e2 copy last 4
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/54749004.
2015-06-18 17:40:19 -07:00
Frank Barchard
a209d7314b simd for 1 to 1
R=harryjin@google.com, harryjin
BUG=448

Review URL: https://webrtc-codereview.appspot.com/55619004.
2015-06-17 18:22:11 -07:00
Frank Barchard
72a235af9f repeat y for yuy2 so that unittests that check the 2nd y on odd widths will match the C and SIMD. The C code duplicates the last Y.
R=harryjin@google.com
BUG=libyuv:455

Review URL: https://webrtc-codereview.appspot.com/50249004.
2015-06-16 16:27:15 -07:00
Frank Barchard
44ff3c333d split share macro
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/55609004.
2015-06-16 12:44:15 -07:00
Frank Barchard
2edfe0f0c6 merge
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/52119004.
2015-06-16 12:17:53 -07:00
Frank Barchard
bff1e18e51 share functions in any
R=harryjin@google.com
BUG=libyuv:448

Review URL: https://webrtc-codereview.appspot.com/57599004.
2015-06-16 12:05:39 -07:00
Frank Barchard
0b3294af6c disable I422ToYUY2 sse for odd sizes.
BUG=455
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/51239004.
2015-06-16 11:09:03 -07:00
Frank Barchard
68e8d9bebd Math functions need BPP of 4 for odd width support on first source argument
BUG=455
TESTED=ARGBMultply
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/54719004.
2015-06-16 09:34:51 -07:00
Frank Barchard
b071a3d321 subsample yuy2 dest
BUG=455
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*ARGBToYUY2*
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/58429004.
2015-06-15 12:01:28 -07:00
Frank Barchard
58ca9f899e remainder done unconditionally and with a variable
BUG=448
TESTED=local build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/57559004.
2015-06-12 17:21:41 -07:00
Frank Barchard
242cb2554c nv12 odd width support using SIMD for remainder
BUG=libyuv:448
TESTED=NV21ToRGB565_Any etc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/53689004.
2015-06-12 16:07:20 -07:00
Frank Barchard
cae07fb0e0 bump subsampling up
BUG=455
TESTED=libyuvTest.ARGBToYUY2_Random
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/58419004.
2015-06-12 15:25:03 -07:00
Frank Barchard
03da5420bc use SIMD for I420ToARGB odd widths in a temporary buffer instead of using C for remainder.
Enter a description of the change.

use SIMD for I420ToARGB odd widths in a temporary buffer instead of using C for remainder.  Currently the C code does not exactly match the SIMD code, so an odd width produces different pixels than an even width, causing a subtle artifact.  By using SIMD consistently, there is no difference in even and odd widths.  Also the SIMD performance is faster, so even with overhead of memcpy, performance improves.

BUG=447
TESTED=out\release\libyuv_unittest.exe --gtest_filter=*I420ToARGB*
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/55579004.
2015-06-11 16:38:52 -07:00
Frank Barchard
ee351bc2d5 check height is non-zero
BUG=none
TESTED=libyuv unittest with even width
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/51219004.
2015-06-11 16:35:20 -07:00
fbarchard@google.com
2e9f3e5cf5 rename source files from row_posix.cc etc to row_gcc.cc to avoid gyp build filtering out source files from build when on windows with clang. The source code contained in row_gcc.cc is gcc syntax inline assembly available for any platform that supports gcc or clang for intel cpus.
BUG=440
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/56579004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1430 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 17:27:52 +00:00
fbarchard@google.com
05416e2d9a Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows.
BUG=425
TESTED=out\release\libyuv_unittest --gtest_filter=*ScaleTo1x1*
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/52659004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1428 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 01:05:18 +00:00
fbarchard@google.com
b07de879b6 enable intrinsics for clangcl if -mssse3 is enabled.
BUG=451
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/52699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1427 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-08 22:48:18 +00:00
fbarchard@google.com
bd2d903e1b odd width support for ARGBSobel functions. Improves performance for images that are not a multiple of 8 pixels.
BUG=444
TESTED=libyuvTest.ARGBSobel_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/54589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1415 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-28 22:22:28 +00:00
fbarchard@google.com
cfce47efc8 Change Sobel to use JPeg Luma calculation instead of extracting G channel. Using luma produces a better sobel that respects all 3 channels of RGB. Historically the G channel was used to improve performance, and because the luma of I420 is a constrained range, hurting quality. Using the JPeg variation of YUV, the luma is more accurate, including cross platform, better optimized for AVX2 and odd widths, and full range.
BUG=444
TESTED=ARGBSobelXY_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/57479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1414 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-27 22:32:26 +00:00
fbarchard@google.com
7c09264ffc odd width support for scale by even scale factor and box scale down by 4. scale down by 4 uses scale down by 2 internally.
BUG=431
TESTED=libyuvTest.ARGBScaleDownBy4_Bilinear

Review URL: https://webrtc-codereview.appspot.com/57399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1412 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-26 17:56:51 +00:00
fbarchard@google.com
c38aeec322 scale down by 2 on argb images support odd widths using _any function.
BUG=431
TESTED=libyuvTest.ARGBScaleDownBy2_Bilinear

Review URL: https://webrtc-codereview.appspot.com/52569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1410 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-22 21:39:21 +00:00
fbarchard@google.com
632c50f29c include posix source for 64 bit clang builds.
BUG=440
TESTED=ninja -C out\Release_x64
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/46259004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1407 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-14 21:40:05 +00:00
fbarchard@google.com
3666015261 add nacl macros for arm to YUV422TORGB_SETUP_REG.
BUG=415
TESTED=ncval.exe newlib/Release/nacltest_arm.nexe
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/46229005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1406 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-12 21:33:32 +00:00
fbarchard@google.com
b33dc47b54 sobel use LL for constants to be passed in as int64
BUG=437
TESTED=local ios build

Review URL: https://webrtc-codereview.appspot.com/47129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1404 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-06 02:34:16 +00:00
fbarchard@google.com
d3f51b58f0 work arounds for ios 64 bit compiler where int passed into assembly needs to be explicitely cast to 'w' register.
BUG=437
TESTED=local ios build
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/49289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1402 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-05 22:46:16 +00:00
fbarchard@google.com
b0f8352245 row_neon64 additional fixes for warning on ios where int doesnt match %2 size which is 64 bit by default. change size to explicitely 32 bit with %w2.
BUG=437
TESTED=try bots

Review URL: https://webrtc-codereview.appspot.com/43349004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1401 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-05 17:26:57 +00:00
fbarchard@google.com
a20e2c6213 row_neon64 fix for warning on ios where int width doesnt match %2 size which is 64 bit by default. change size to explicitely 32 bit with %w2.
BUG=437
TESTED=try bots
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/47119004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1399 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-04 22:53:53 +00:00
fbarchard@google.com
6d5554661f scale 64 bit fix for warning on ios where int width doesnt match %2 size which is 64 bit by default. change size to explicitely 32 bit with %w2.
BUG=437
TESTED=try bots
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/43339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1398 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-04 22:03:53 +00:00
fbarchard@google.com
e8c90c31ee fix for warning on ios 64 bit that int width doesnt match %2 size which is 64 bit by default. change size to explicitely 32 bit with %w2.
BUG=437
TESTED=try bots
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/49279004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1397 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-04 21:16:03 +00:00
fbarchard@google.com
54e78d12e0 make windows code built with clangcl include the _posix source code.
depot_tools excludes these source files now, so they need to be manually
included.
BUG=435
TESTED=clangcl local build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/52419004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1396 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-02 01:00:01 +00:00
fbarchard@google.com
2c44965e8d make row_win windows code built with clangcl include the _posix source code.
depot_tools excludes these source files now, so they need to be manually
included.
BUG=435
TESTED=clangcl local build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1395 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-02 00:39:29 +00:00
fbarchard@google.com
484e5d2d23 make windows code built with clangcl include the _posix source code. depot_tools excludes these source files now, so they need to be manually included.
BUG=435
TESTED=clangcl local build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46199004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1394 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-05-02 00:15:05 +00:00
fbarchard@google.com
ab6b224675 fix for arm builds where tmp for assembly produces an error if its uninitialized.
BUG=libyuv:432
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1392 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-30 18:21:19 +00:00
fbarchard@google.com
31806d76d9 scale to 3/4 bug fix for odd widths. multiply to index into source by scale factor should be 4 / 3 not 3 / 4.
BUG=433
TESTED=set LIBYUV_WIDTH=1276 out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.Scale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1391 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-30 17:18:13 +00:00
fbarchard@google.com
9f4636e298 AVX2 port of ScaleDownBy4.
BUG=314
TESTED=out\release\libyuv_unittest --gtest_filter=*.ScaleDownBy4*

Review URL: https://webrtc-codereview.appspot.com/46159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1390 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-30 01:58:32 +00:00
fbarchard@google.com
5abb6d4556 disable stucture padded warnings on win64 builds.
BUG=432
TESTED=local win64 build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1389 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 23:18:07 +00:00
fbarchard@google.com
e23274cad2 remove unused function SumBox.
BUG=432
TESTED=untested
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/48279004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1388 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 22:13:44 +00:00
fbarchard@google.com
428fce642d remove unused function ScalePlaneBoxRow_* which was for slow box filter that is no longer used.
BUG=432
TESTED=try bots
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/51779004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1387 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 21:00:16 +00:00
fbarchard@google.com
35aa92a1ea fixed unused variables/code warnings in scale box function
BUG=libyuv:432
TESTED=local windows build with chromium_code =1
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/49849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1385 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 18:41:04 +00:00
fbarchard@google.com
f995021f35 Work around casting warnings in scale_neon64.cc for ios 64 bit.
BUG=430
TESTED=untested
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/49799004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1382 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-28 00:02:46 +00:00
fbarchard@google.com
a81da96c90 Work around for ios 64 bit build warning - use explicit word register for int.
BUG=430
TESTED=local ios 64 bit build
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/47039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1381 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-27 23:53:11 +00:00
fbarchard@google.com
4e78b8dc2e scale to 3/4 or 3/8 with odd width destinations efficiently. previously if width was not multiple of what the simd loop would do (24), scaling would fall back on slower C code. This change allows SIMD to be used for most of the scaling and C for the remainder, improving efficiency.
BUG=314
TESTED=set LIBYUV_WIDTH=1896 & ScaleDownBy3by4_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1380 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-27 21:56:08 +00:00
fbarchard@google.com
1ffb04b43e Allow ScaleRowDown any functions to accept non-power of 2 for destination SIMD multiple.
BUG=none
TESTED=local unittests pass
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/45129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1379 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-24 22:32:12 +00:00
fbarchard@google.com
2b7f6b7dee ScaleAddRows_Any_SSE2 functions for handling odd widths.
BUG=425
TESTED=out\release\libyuv_unittest_old --gtest_filter=*.ScaleDownBy3_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1377 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-22 00:51:56 +00:00
fbarchard@google.com
01db3d1d1d Remove declspec(align(32)) from AVX2 functions.
BUG=422
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/43229004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1374 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-20 22:57:04 +00:00
fbarchard@google.com
812f59ed40 box and point sampling use scaledownby4 but linear and bilinear do not.
BUG=427
TESTED=out\release\libyuv_unittest --gtest_filter=*.ScaleDownBy4_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1373 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-17 18:04:09 +00:00
fbarchard@google.com
e52b9c3405 make box filter upsampler consider a pixel width of less than 1 to be 1. This makes it behave as a point sampler.
BUG=428
TESTED=set LIBYUV_WIDTH=1900 && out\release\libyuv_unittest.exe
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49709004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1372 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-16 21:02:16 +00:00
fbarchard@google.com
c9986313ac lsl by 2 requires a number sign for xcode on ios 64 bit build. add the # sign for ios compatibility. remove legacy x86 asm files that are unused. the unused files cause complications in build systems that build all files.
BUG=libyuv:423
TESTED=try bots
R=noahric@google.com

Review URL: https://webrtc-codereview.appspot.com/45119004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1369 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 19:57:33 +00:00
fbarchard@google.com
32ad6e0e12 Remove unused variable 'I422ToRGB565Row' that breaks osx builds.
BUG=426
TESTED=untested

Review URL: https://webrtc-codereview.appspot.com/48079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1368 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 02:50:35 +00:00
fbarchard@google.com
013e812275 Port box filter to AVX2.
BUG=libyuv:425
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*libyuvTest.ScaleTo640x360_Box
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/43149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1367 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-14 00:21:15 +00:00
fbarchard@google.com
b5ea79d845 add rows handle height of 1 using a more general while-style loop.
BUG=none
TESTED=try bots

Review URL: https://webrtc-codereview.appspot.com/45999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1366 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-13 18:56:08 +00:00
fbarchard@google.com
c7161d1c36 Remove code alignment declspec from Visual C versions for vs2014 compatibility.
BUG=422
TESTED=local vs2013 build still passes.

Review URL: https://webrtc-codereview.appspot.com/45959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1365 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-12 23:54:26 +00:00
fbarchard@google.com
1eb51bcf01 Fix bug in YUV to RGB for gcc/clang and enable affected functions.
BUG=393
TESTED=sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I422ToARGB*

Review URL: https://webrtc-codereview.appspot.com/48019004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1364 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-08 02:32:33 +00:00
fbarchard@google.com
bb5a009d11 ARGB4444ToARGB and ARGB1555ToARGB ported to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGB4444ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48009004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1363 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 23:52:57 +00:00
fbarchard@google.com
8b9f908134 RGB565ToARGB AVX2 vzeroupper before the ret, not after.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1362 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 22:53:12 +00:00
yang.zhang@arm.com
5f609856de Add ScaleARGBFilterCols_NEON for ARM32/64
ARM32/64 NEON versions of ScaleARGBFilterCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ifea62bc25d846bf16cb51d13b408de7bf58dccd4

Review URL: https://webrtc-codereview.appspot.com/46699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1361 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 03:45:29 +00:00
fbarchard@google.com
3d1176a3f8 ARGBToYJRow_AVX2 hooked up for ARGBToJ422
BUG=none
TESTED=ARGBToJ422 unittest

Review URL: https://webrtc-codereview.appspot.com/44079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1360 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 00:39:25 +00:00
fbarchard@google.com
8f0b32773c ARGBToUV AVX2 functions hooked up.
BUG=none
TESTED=RGB565ToI420
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46829004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1359 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 00:10:52 +00:00
fbarchard@google.com
9afabe29b8 Add ARGBToY AVX calls.
BUG=none
TESTED=libyuv unittests all pass with AVX2
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/44999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1358 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 23:11:05 +00:00
fbarchard@google.com
2827277496 port RGB565ToARGB to AVX2.
BUG=421
TESTED=out\release\libyuv_unittest --gtest_filter=*RGB565ToARGB*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1357 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 19:24:23 +00:00
fbarchard@google.com
e2ea106068 shift for arm wants a # sign for nacl and ios.
BUG=420
TESTED=d:/src/nacl_sdk/pepper_canary/toolchain/win_arm_newlib/bin/arm-nacl-g++ -o newlib/Release/source/scale_neon_arm.o -c source/scale_neon.cc -g -O2 -pthread -MMD -DNDEBUG  -Id:/src/nacl_sdk/pepper_canary/include -Id:/src/nacl_sdk/pepper_canary/include/newlib  -I./include
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/47949004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1356 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-03 23:28:44 +00:00
fbarchard@google.com
44b6ba91e4 Scale down by 4 for odd number of destination pixels using 'any' that handles SIMD for multiple of 8 pixels, and C for the remainder.
BUG=314
TESTED=local test with width odd

Review URL: https://webrtc-codereview.appspot.com/49599004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1355 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-03 22:12:53 +00:00
fbarchard@google.com
62a9fe303c code style cleanup of scale functions. no functional change.
BUG=none
TESTED=lint
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1354 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-02 21:23:52 +00:00
yang.zhang@arm.com
c00796c4e4 Fix the issue of q4 register not in clobber list for ARMv7
On ARMv7 platform, q4 is used, but it isn't declared in clobber list.
It results that q4 isn't preserved automatically by compiler. So that the value of q4 is destroyed.

BUG=418
TESTED=libyuvTest.* on ARM32 with Android
R=fbarchard@google.com

Change-Id: Ib9b5eff8231c5057f4d58f1c4029f5452222af55

Review URL: https://webrtc-codereview.appspot.com/47899004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1353 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-02 02:33:23 +00:00
fbarchard@google.com
c70c7c02ff scale to half size optimization for avx2 - use pmaddubsw instruction to horizontally add bytes, then pavgw to round and divide by 2.
BUG=314
TESTED=libyuvTest.ScaleDownBy2*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1352 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-31 23:59:27 +00:00
yang.zhang@arm.com
f23d6222ac Add ScaleARGBCols_NEON for ARM32/64
ARM32/64 NEON versions of ScaleARGBCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Id9ad97f7aa5d8a34cd55ace9e648cb6ff028efd9

Review URL: https://webrtc-codereview.appspot.com/47689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1351 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-31 03:03:05 +00:00
fbarchard@google.com
72673ac873 linear and point sample scale to half size for AVX2.
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/44959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1349 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-30 21:46:08 +00:00
fbarchard@google.com
9ef8999ff3 scale to half size use pmadd/pavgw to horizontal averaging.
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/50529004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1348 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-27 18:20:21 +00:00
fbarchard@google.com
e6ca9cc2a2 Scale down by 2 AVX2 port. Processes twice as many pixels as SSE2 and takes advantage of 3 argument instructions to reduce register usage and number of instructions.
BUG=314
TESTED=libyuvTest.ScaleDownBy2_Box
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/42959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1347 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-26 23:21:08 +00:00
fbarchard@google.com
d41fbf40dd Handle scale down by factor of 2 efficiently by calling SIMD for multiple of 16 destination pixels, and C for remainder.
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/48689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1344 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-24 23:25:30 +00:00
yang.zhang@arm.com
0d3bfab6db Add nacl macros to ScaleFilterCols_NEON on ARM32/64 platform
Add the nacl macros to ARM functions. If not, a bunch of code is failing
to validate.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: I7a36434f18e0de8b8f8a9fe01167bfe50cff8962

Review URL: https://webrtc-codereview.appspot.com/47739004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1343 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-24 08:02:30 +00:00
fbarchard@google.com
d28cd77f99 Enable assembly for clangcl build on Windows. Previously assembly was disabled so clangcl would work, but only with C code. As clangcl mimics both Visual C and GCC, ifdefs need to pick one or the other or often you'll end up with both. In this CL we disable most Visual C code and use the GCC versions which allow assembly for both 32 and 64 bit intel.
BUG=412
TESTED=clang=1 build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1341 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 20:36:31 +00:00
yang.zhang@arm.com
d6d7de5742 Add ScaleFilterCols_NEON for ARM32/64
ARM32/64 NEON versions of ScaleFilterCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: I5b0838769ffb0182155d7cd6bcc520eb81eb5c4e

Review URL: https://webrtc-codereview.appspot.com/41349004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1340 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 03:55:05 +00:00
fbarchard@google.com
70e5c81860 copy width to int64 to pass to assembly to avoid warning on ios 64 bit for implicit: value size does not match register size specified by the constraint and modif
BUG=413
TESTED=local ios 64 bit build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45749004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1338 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 17:56:20 +00:00
fbarchard@google.com
0e4388aea3 I422ToRGB24 AVX2 and I422ToRAW
BUG=none
TESTED=I422ToRGB24 unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46619004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1337 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 17:25:27 +00:00
yang.zhang@arm.com
4d387fc619 Add ScaleARGBRowDown2Linear_NEON for ARM32/64
ARM32/64 NEON versions of ScaleARGBRowDown2Linear_NEON are implemented.

BUG=319
TESTED=libyuvTest.ARGBScale* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ife602c81b51aa36e0d56b9d628f278a24eed96f6

Review URL: https://webrtc-codereview.appspot.com/44689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1336 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:23:59 +00:00
yang.zhang@arm.com
e246e6c18f Add ARGBToRGB565DitherRow_NEON for ARM32/64
ARM32/64 NEON versions of ARGBToRGB565DitherRow_NEON are implemented.

BUG=407
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ia689170fb39db964392e5e1113801592ab0628bf

Review URL: https://webrtc-codereview.appspot.com/49409004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1335 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:22:25 +00:00
fbarchard@google.com
3b4f5eb7b8 Port J422 colorspace to GCC
BUG=414
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/43809004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1334 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:54:50 +00:00
fbarchard@google.com
92f7f421fd rename I400 to J400 and I400 reference to I400. J400 is a simple replication of values to convert to RGB, which is what the old I400 was. I400 reference is the Y part of the YUV formula, so renaming that to I400.
BUG=none
TESTED=libyuvTest (5925 ms total)
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/50369005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1333 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 00:01:18 +00:00
fbarchard@google.com
35f0add66d cpuid ifdefs fixed to remove some duplicate code cases.
BUG=none
TESTED=local windows build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/47619004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1332 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 20:02:04 +00:00
fbarchard@google.com
f301777060 Fix YToARGB and tweaks to thresholds in YUV tests.
BUG=411
TESTED=libyuvTest.TestYToARGB
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/44709004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1330 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 19:50:33 +00:00
fbarchard@google.com
f2fad0faa5 Optimized J422ToARGB.
BUG=414
TESTED=J422ToARGB unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/42799004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1328 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 18:08:30 +00:00
fbarchard@google.com
e408a3759f Improve accuracy of J422 color space using higher precission fixed point and bias.
BUG=414
TESTED=TestFullYUVJ
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/44679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1327 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 18:08:12 +00:00
yang.zhang@arm.com
ca5b1bd58b Add ScaleAddRows_NEON for ARM32/64
ARM32/64 NEON versions of ScaleAddRows_NEON are implemented.

BUG=319
TESTED=libyuvTest.Scale* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: I45b88c2b5f576042ba5b3d8d6f8851257fdb7218

Review URL: https://webrtc-codereview.appspot.com/46379004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1326 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-16 02:57:46 +00:00
fbarchard@google.com
685b92b0a6 I400ToARGB_AVX2 port from SSE2 to AVX2.
BUG=403
TESTED=libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*I400ToARGB*
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1322 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 18:12:17 +00:00
fbarchard@google.com
f5a7b2b48a I411ToARGB AVX2 version
BUG=403
TESTED=I411ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1321 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-11 00:08:56 +00:00
fbarchard@google.com
1e4a14f410 scale avoid math overflow in fixed point for large images
BUG=410
TESTED=set LIBYUV_WIDTH=65536 out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=libyuvTest.ScaleTo320x240_None
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42319004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1320 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 22:30:47 +00:00
fbarchard@google.com
24152b2435 Dither from I420 to RGB565 in 2 steps - I420ToARGB then ARGBToRGB565.
BUG=407
TESTED=untested
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1315 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-10 01:45:04 +00:00
fbarchard@google.com
cdd80e04c9 Port I444ToARGB to AVX2.
BUG=403
TESTED=I444ToARGB unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1314 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-09 21:56:48 +00:00
fbarchard@google.com
697c5aa831 disable nv12 avx2 for vs9/10 that dont support avx2 instructions.
BUG=409
TESTED=try bots
R=harryjin@google.com, johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/43629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1311 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 19:12:21 +00:00
fbarchard@google.com
bdeb9ac584 switch from 8x8 to 4x4 matrix for dithering
BUG=407
TESTED=Dither unittests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1310 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 18:28:00 +00:00
fbarchard@google.com
0fe4abbc5c ARGBToRGB565 AVX2 with dithering
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/44519004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1309 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 22:31:43 +00:00
fbarchard@google.com
9245317e16 ARGBToRGB565 SSE2 port.
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1308 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 00:00:50 +00:00
yang.zhang@arm.com
274c9bce92 Add ScaleRowDown2Linear_NEON for ARM32/64
ARM32/64 NEON versions of ScaleRowDown2Linear_NEON are implemented.

BUG=319
TESTED=libyuvTest.ScaleDownBy2_Linear on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: I2c7f43a0d56ed4dfded5bdbbb61765d87d65a2ba

Review URL: https://webrtc-codereview.appspot.com/43519005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1307 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-03 02:17:16 +00:00
fbarchard@google.com
693e0217c8 ARGBToRGB565 C version use unsigned dither matrix pattern to bump pixels to next brighter value.
BUG=407
TESTED=unittest passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/43539004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1306 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-02 23:46:09 +00:00
fbarchard@google.com
6eaee58589 shift by 16 for neon expects a number sign
BUG=408
TESTED=nacl arm build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1305 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-02 18:48:17 +00:00
yang.zhang@arm.com
e4cf8950d8 Improve the accuracy YUV to RGB for ARM64 NEON
ARM64 NEON version of YUV422TORGB is updated based on C algorithm.
Except TestJ420 and TestYUV, all the other tests are passed.

BUG=324
TESTED=libyuvTest on ARM64 with Android
R=fbarchard@google.com

Change-Id: Ia2663cfdeccc4c8c1d46262c9c0cc67b71d45e70

Review URL: https://webrtc-codereview.appspot.com/35329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1304 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-02 08:30:22 +00:00
fbarchard@google.com
933bd40c3c port ARGBToRGB565 and ARGB1555 to AVX2. Enable functions that use ARGBToRGB565 AVX2 code. Add ARGBToRGB565Dither function.
BUG=403
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1302 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-27 21:15:28 +00:00
fbarchard@google.com
bffd326f74 AVX2 version of ARGBToARGB4444
BUG=403
TESTED=local build on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/43429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1297 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 17:26:28 +00:00
yang.zhang@arm.com
94e3d5a3be Improve the accuracy YUV to RGB for ARMv7 NEON
NEON version of YUV422TORGB is updated based on C algorithm. Accuracy YUV to RGB
of NEON is also updated according to test result. Macro LIBYUV_NEON is added to
identify accuracy YUV to RGB for ARM platform.
Except TestJ420 and TestYUV, all the other tests are passed.

BUG=324
TESTED=libyuvTest on ARMv7 with Android
R=fbarchard@google.com

Change-Id: I492ca628679940534f40341721dc5b6dc2d7a5b6

Review URL: https://webrtc-codereview.appspot.com/40609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1296 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 06:51:29 +00:00
fbarchard@google.com
d96047761e AVX2 version of NV12ToARGB
BUG=403
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40089004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1295 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:45:08 +00:00
fbarchard@google.com
3c11d4bf6e align avx2 buffers to 32 bytes
BUG=403
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40929004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1294 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:31:28 +00:00
fbarchard@google.com
446fa95587 I422ToRGB565, ARGB4444 and ARGB1555 for AVX2
BUG=403
TESTED=avx2 emulator

Review URL: https://webrtc-codereview.appspot.com/34359004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1293 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 23:14:46 +00:00
fbarchard@google.com
e2f1a75474 move mask to last parameter of any functions for consistency.
BUG=none
TESTED=local libyuv unittest passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/43419004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1292 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 21:18:30 +00:00
fbarchard@google.com
239962fa00 YUY2 and UYVY to ARGB AVX2 versions via wrappers.
BUG=403
TESTED=UNTESTED
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1291 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-24 18:58:51 +00:00
kjellander@google.com
28d1a582ba Revert "YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions."
This reverts r1288 due to breaking compilation on bots:
http://build.chromium.org/p/client.libyuv/builders/Mac64%20Debug/builds/365
http://build.chromium.org/p/client.libyuv/builders/Linux64%20Debug/builds/667

TBR=fbarchard@google.com
TESTED=Reverted locally and all built fine again.

Review URL: https://webrtc-codereview.appspot.com/40879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1289 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-23 09:22:22 +00:00
fbarchard@google.com
b52606c024 YUY2ToARGB and UYVYToARGB AVX with C wrapper to call lower level conversions.
BUG=403
TESTED=convert unittest
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/40839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1288 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-21 00:49:35 +00:00
fbarchard@google.com
cfd6f897c4 use named type for pointer
BUG=403
TESTED=try bot

Review URL: https://webrtc-codereview.appspot.com/38179004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1287 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-20 23:26:27 +00:00
fbarchard@google.com
6a192487fe Switch SSSE3 row wrappers from variable sized malloc to fixed size array with loop to process a portion of the row at a time. This helps performance in the case where the image has been coalesced into a single large row and the allocator, although only called once, is slow to clear the pages. Also the smaller temporary buffer fits cache, further improving performance.
BUG=403
TESTED=YUY2ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1286 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-20 22:46:15 +00:00
fbarchard@google.com
194f740d0e Scan from start of buffer to handle case where an invalid size was passed.
BUG=404
TESTED=libyuvTest.ValidateJpegLarge
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41989004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1285 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-18 01:57:31 +00:00
fbarchard@google.com
975dd5a699 macros for storing RGB on windows.
BUG=403
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38119004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1283 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-14 00:50:48 +00:00
fbarchard@google.com
8e16c1a341 Switch to macro for STOREBGRA etc on Posix SSSE3
BUG=393
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1282 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-13 21:53:16 +00:00
fbarchard@google.com
2f56d2859f Macro to store ARGB value
BUG=396
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1279 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 18:53:54 +00:00
fbarchard@google.com
5ab38f9258 Remove Q420 fourcc support.
BUG=396
TESTED=local build of unittest builds and passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39089004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1278 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 18:20:54 +00:00
fbarchard@google.com
d1ac8b17e6 use matrix for win64 version of I420ToARGB
BUG=396
TESTED=local unittests build/pass
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41899004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1276 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-11 00:57:46 +00:00
fbarchard@google.com
3bb829a44f Add a macro for YUV to RGB on Windows. Allows multiple color matrix structures in the future.
BUG=393
TESTED=local build
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1275 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 23:03:37 +00:00
fbarchard@google.com
97a3850ea4 Add a macro to reference YUV structure for future alternative color spaces.
BUG=393
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33299005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1274 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 19:34:17 +00:00
yang.zhang@arm.com
63996ab7ca Fix the bug (0 extension from int to int64) in ScaleARGBRowDownEven_NEON.
Reason of this bug is that ARM64 can't extend the sign bit of a 32-bit integer
to 64-bit integer automatically.
ScaleARGBRowDownEven_NEON is also enabled for ARM64.

BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: Ib8d30a05156239247296aa8bb4faa94b4f69a9c3

Review URL: https://webrtc-codereview.appspot.com/32949004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1273 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 06:04:12 +00:00
fbarchard@google.com
738dfa0307 Support odd widths for NV12 format when cropping vertically.
BUG=400
TESTED=CropNV12
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39009004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1272 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-10 02:18:38 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
35037cb948 For 32 bit x86 with fpic use memory instead of register for count
BUG=399
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/36019004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1269 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-08 21:28:40 +00:00
fbarchard@google.com
fd8054791c build fixe for InterpolateRow_MIPS_DSPR2
BUG=398
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1268 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-07 01:01:30 +00:00
fbarchard@google.com
a246086243 use the same structures for sse and avx yuv to rgb.
BUG=396
TESTED=local build still passes on sse
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1267 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-06 21:05:38 +00:00
fbarchard@google.com
cf925c50bc Make Yvu vs Yuv use same code and structure but pass in a different version of the matrix
BUG=396
TESTED=ncval
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1266 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-06 00:29:08 +00:00
fbarchard@google.com
1663996c52 Remove ifdef __SSE2__ and native client ifdef for r14 in register usage declarations.
BUG=395
TESTED=gcc build with nacl
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1264 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 23:09:15 +00:00
fbarchard@google.com
3cb485533a NaCL port of YToARGB for AVX2
BUG=393
TESTED=d:\src\nacl_sdk\pepper_canary\tools\ncval.exe newlib/Release/nacltest_x86_32.nexe
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39829004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1263 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 21:32:15 +00:00
fbarchard@google.com
baafc97d6b port YToARGB AVX2 to GCC
BUG=393
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39819004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1262 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-05 20:17:27 +00:00
fbarchard@google.com
c4e032c543 change Y multiplier and bias to compensate for 257/256 which makes YToARGB exactly match float math.
Histogram Before
hist            -3      -2      -1      0       1       2       3
red             0       0       1809408 13140736        1827072 0       0
green           0       0       1679912 13471329        1625975 0       0
blue            168448  994816  1876480 10655488        1893376 1006336 182272
Histogram After
hist            -3      -2      -1      0       1       2       3
red             0       0       558848  15632128        586240  0       0
green           0       0       209907  16350588        216721  0       0
blue            14848   642816  1989376 11363328        2053120 695040  18688
BUG=394
TESTED=more stringent luma tests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/38859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1259 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-04 19:45:26 +00:00
fbarchard@google.com
3982998c7c YToARGB AVX2 port from SSE2
BUG=393
TESTED=YToARGB unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1258 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-03 01:35:11 +00:00
fbarchard@google.com
4848b06016 YPixel subtract bias to match C code
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37799004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1253 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:58:20 +00:00
fbarchard@google.com
29db9b0b89 C version of YToARGB with ubias removed to produce consistent luma ramp.
BUG=392
TESTED=TestGreyYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1251 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 23:07:46 +00:00
fbarchard@google.com
63882a356f Disable YToARGB assembly which is off by 1
BUG=392
TESTED=libyuvTest.YToARGB_Opt

Review URL: https://webrtc-codereview.appspot.com/40549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1250 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 17:16:44 +00:00
fbarchard@google.com
080a316492 port yuv chroma improvements to gcc. YUV to RGB is more accurate using a negative matrix. 2% slower but half as much error.
BUG=324
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1249 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-26 04:35:51 +00:00
fbarchard@google.com
d12a08712b adjust ubias to minimize error histogram centering error.
BUG=324
TESTED=TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37739004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1248 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 22:16:33 +00:00
fbarchard@google.com
eb8dda3ac7 fix for ybias on YToARGB function.
BUG=324
TESTED=libyuvTest.YToARGB_Any
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/36939004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1247 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 18:31:29 +00:00
fbarchard@google.com
b114986477 Change YUV to RGB to subtract the chroma contributions from the bias.
BUG=324
TESTED=win64 build and TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1246 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-23 04:22:35 +00:00
fbarchard@google.com
c62d30111f adjust bias on Y channel so error histogram is better centered on green channel
BUG=324
TESTED=FullYUVTest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/38689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1245 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-22 19:43:34 +00:00
fbarchard@google.com
b089593610 xmm4 is unused - remove from NV21
BUG=324
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/40489004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1243 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-22 18:17:44 +00:00
fbarchard@google.com
319f047710 Compute chroma using negative coefficients to extend range of U contribution on B to 2
BUG=324
TESTED=TestI420
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/41569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1238 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-21 18:45:13 +00:00
fbarchard@google.com
e7873910df port YUV luma accuracy to posix
BUG=324
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1236 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-21 00:36:30 +00:00
fbarchard@google.com
3842299be8 YUV use same constant as asm then multiply by 0x0101 to replicate the value.
BUG=324
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41559004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1235 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-21 00:11:44 +00:00
fbarchard@google.com
c3d09f6021 Improve accuracy of luma channel in YUV to RGB conversion
BUG=324
TESTED=TestFullYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/36859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1233 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-20 23:42:15 +00:00
fbarchard@google.com
292c2286a6 prototype of a YUV to RGB function to achieve higher quality and performance at the same time. The chroma is made more accurate by using negative values that allow more range and then subtract the contributions from the luma contributes. The luma is made more accurate using a multiply that duplicates the Y bits out to 16 bits and then does a 2.14 bit fixed point coefficient. The replication is done for free as part of the multiply.
BUG=391
TESTED=TestYUV
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/36819004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1232 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-20 18:21:16 +00:00
fbarchard@google.com
a5a15198b4 Add J422 support which is 2x1 subsampling with jpeg color space.
BUG=391
TESTED=color_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/41479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1228 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-14 19:16:01 +00:00
fbarchard@google.com
b2a6af1be6 Change rectangle low level functions to use more conventional row functions including 'any' variations. Previously the yuv function SetPlane stored 32 bit values. Now a more conventional memset() style function is used for YUV that stores bytes. On Haswell a rep stosb is used for YUV. Overall benefit of this CL is improved performance for 'any' width, and simpler row assembly instead of full image assembly. Previously ARGBRect used a low level function that supported a rectangle in assembly. Now it uses a row function, and relies on row coalesce to combine into a single low level call.
BUG=371
TESTED=untested
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1222 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-12 03:58:24 +00:00
fbarchard@google.com
89671c4de1 Fix for build on 32 bit neon
BUG=none
TESTED=nacl neon build
R=harryjin@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/34659004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1221 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-09 17:40:56 +00:00
fbarchard@google.com
852f4854c0 Neon version of new SetRow functions for rectangles.
BUG=387
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/39449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1220 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-09 00:15:44 +00:00
fbarchard@google.com
8e3db2dc73 Support invert for ARGBRect and SetPlane
BUG=387
TESTED=ARGBRect_Invert
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37539004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1219 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-07 19:02:01 +00:00
fbarchard@google.com
992c3b089a Use HAS_ARGBSETROWS_X86 to detect presence of function.
BUG=none
TESTED=rectangle unittests
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35639004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1218 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-07 00:11:51 +00:00
fbarchard@google.com
966233e5eb Remove sub 16 from yuv conversions and change bias to include it.
BUG=388
TESTED=out\release\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*420ToARGB_Opt  | sortms
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/34609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1216 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-31 01:07:02 +00:00
fbarchard@google.com
8723fc1109 Syntax fix for change 24 bit conversions to use single asm block instead of 2, but with memory counter
BUG=389, 378
TESTED=out\release\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*420ToRGB24_Opt | sortms
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/39399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1215 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-30 22:26:10 +00:00
fbarchard@google.com
16338ba85f Change 24 bit conversions to use single asm block instead of 2, but with memory counter
BUG=389,378
TESTED=out\release\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*420ToRGB24_Opt | sortms
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29359004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1214 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-30 21:35:35 +00:00
fbarchard@google.com
5304aaaed1 Use post bias to adjust for Y - 16 to improve performance.
BUG=388
TESTED=set LIBYUV_DISABLE_ASM=1 out\release\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*I420ToARGB_Opt
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/35609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1213 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-30 00:02:06 +00:00
fbarchard@google.com
40e3457574 J420ToARGB jpeg variation of YUV color space to ARGB.
BUG=241
TESTED=J420ToARGB unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32929004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1212 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-29 19:17:53 +00:00
yang.zhang@arm.com
002feab4c5 Fix the bug in ARGBColorMatrixRow_NEON
BUG=371
TESTED=libyuv_unittest and test case written by myself
R=fbarchard@google.com

Change-Id: I652dc23e4be75bd51d15a8a7f9d023594c9cd032




git-svn-id: http://libyuv.googlecode.com/svn/trunk@1211 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-19 08:09:04 +00:00
fbarchard@google.com
284d6bdf49 Port I422ToBGRA from Windows version that does 16 pixels at a time, for performance improvement.
BUG=386
TESTED=nacl build
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/36549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1207 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-16 23:56:04 +00:00
fbarchard@google.com
8b55212c83 Make vextop take the register selector parameter to access the upper portion of the avx registers.
BUG=269
TESTED=nacl
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/37399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1205 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-16 00:30:51 +00:00
fbarchard@google.com
7892ea1fe1 Fix for ARGBToUV on AVX2
BUG=269
TESTED=local testing
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/33669004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1202 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:59:23 +00:00
fbarchard@google.com
ddee77cdbd Fix for I422ToRGBA when I422ToARGB is not enabled for AVX2
BUG=269
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1201 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:28:59 +00:00
fbarchard@google.com
f5f5d15dcd Fix register order for ARGBToUV_AVX2
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1200 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-15 18:07:09 +00:00
fbarchard@google.com
11c3015712 Fix for I422ToARGB AVX2
BUG=269
TESTED=untested

Review URL: https://webrtc-codereview.appspot.com/32809004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1199 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-13 19:22:09 +00:00
fbarchard@google.com
ada2a3eb12 Fix for ARGBToY on AVX
BUG=269
TESTED=local build on osx
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29229005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1198 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-13 01:44:33 +00:00
fbarchard@google.com
b0abc62c21 Fix for UYVYToI422 AVX2 version
BUG=269
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1197 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-13 00:15:11 +00:00
fbarchard@google.com
a9734a4492 ARGBMirror for AVX had wrong loop counting. This fixes it to match windows, and reenables the function.
BUG=269
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/33639004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1196 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-12 22:43:55 +00:00
fbarchard@google.com
08daa3e22b Disable AVX2 code that fails on GCC unittests until issues can be resolved.
BUG=269
TESTED=sde-external-7.8.0-2014-10-02-mac/sde -ast -hsw -- out/Release/libyuv_unittest

Review URL: https://webrtc-codereview.appspot.com/29219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1195 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-12 19:30:15 +00:00
fbarchard@google.com
233a931cb6 Port ARGBToUV to AVX2.
BUG=269
TESTED=ncval
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/35449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1194 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-10 22:48:58 +00:00
fbarchard@google.com
e0bb4c26e2 Interpolate Row ported to AVX2 GCC/NaCL.
BUG=269
TESTED=nacl build
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/25329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1193 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-09 22:21:53 +00:00
fbarchard@google.com
044938f485 convert ARGB to UV for SSSE3 use single asm block.
BUG=378
TESTED=nacl build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/28179004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1191 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-06 19:23:12 +00:00
fbarchard@google.com
540e8af80c remove add 16 from ARGBToYJ and add rounding, for consistency with Windows version. row.h header macros sorted alphabetically.
BUG=269
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32579005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1185 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 22:37:47 +00:00
fbarchard@google.com
b036cf700b ARGBToYRow_AVX2 and ARGBToYJRow_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/30299004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1184 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 22:00:08 +00:00
fbarchard@google.com
d0bfd10147 I422ToRGBARow_AVX2 ported to GCC.
BUG=269
TESTED=nacl build
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/32259004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1183 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 18:59:33 +00:00
fbarchard@google.com
702e237d5f I422ToABGR_AVX2 port from Visual C to GCC/NaCL.
BUG=269
TESTED=builds with nacl compiler.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/33449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1182 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 03:54:08 +00:00
fbarchard@google.com
0c472f9d42 gcc port of I422ToARGB_AVX2 from Visual C. Uses Macros for read of I422 and conversion from YUV to RGB. Shares constants from I422ToBGRA structure.
BUG=269
TESTED=nacl builds.
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/27279004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1181 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-12-02 00:22:56 +00:00
fbarchard@google.com
c5aac16af9 Remove loop alignment for benefit of modern cpus that dont require alignment.
BUG=none
TESTED=local libyuv unittest passes
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1180 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-24 21:26:22 +00:00
fbarchard@google.com
ef14972df0 MergeUV AVX2 use vextractf128 to store results to avoid shuffling.
BUG=none
TESTED=intel sde on unittests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/33369004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1178 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-22 03:33:33 +00:00
fbarchard@google.com
147f7b70f5 Quick fix for build gcc - remove unused argument kARGBShuffleMirror from ARGBMirror SSE2.
BUG=none
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/30209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1177 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-22 01:12:19 +00:00
fbarchard@google.com
ef67597b48 ARGBMirror use SSE2 pshufd instruction instead of SSSE3 pshufb.
BUG=269
TESTED=local benchmark for ARGBMirror
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/32509004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1176 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-21 19:25:14 +00:00
fbarchard@google.com
91f240c5db Move sub before branch for loops.
Remove CopyRow_x86
Add CopyRow_Any versions for AVX, SSE2 and Neon.
BUG=269
TESTED=local build
R=harryjin@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/26209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1175 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-20 21:14:27 +00:00
fbarchard@google.com
813bf9f97d Change lea macros from memaccess to memlea to fix nacl 64 bit build errors.
BUG=381
TESTED=local nacl build and validate
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1174 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-19 23:02:04 +00:00
fbarchard@google.com
db7a7f61ff Port ARGBMirror AVX2 code to gcc/NaCL.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1173 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-19 20:03:37 +00:00
fbarchard@google.com
9dd083a512 ARGBMirror Any
BUG=none
TESTED=mirror and rotate unittests
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1172 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-19 00:46:51 +00:00
fbarchard@google.com
59ed448685 MirrorAny functions so assembly can always be used.
BUG=none
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29069004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1170 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-18 01:03:47 +00:00
fbarchard@google.com
55db4ec23b port lea removal for mirror to gcc
BUG=none
TESTED=none
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1169 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 20:06:40 +00:00
fbarchard@google.com
b9d17e1d79 Fix offset in addresses for windows. Wants it within [] now.
BUG=none
TESTED=local windows build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1168 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 19:50:42 +00:00
fbarchard@google.com
5822505e0a Remove extra unaligned loop from alphablender. Both aligned and unaligned loops were the same, so remove the extra.
BUG=none
TESTED=try bots.
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1166 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:33:07 +00:00
fbarchard@google.com
1eb636d249 remove initial lea in mirror functions and add the offset in the address mode.
BUG=none
TESTED=local libyuv unittests on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26169004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1165 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:16:23 +00:00
fbarchard@google.com
35508d0979 Mirror_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1164 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 23:11:10 +00:00
fbarchard@google.com
91000425a3 ARGBUnattenuate_AVX2 ported to GCC. Minor cleanup of constants to use broadcast to make 16 byte constant instead of 32 byte.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1163 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 17:57:33 +00:00
fbarchard@google.com
f8c334473b ARGBAttenuate_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1162 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-12 18:38:06 +00:00
fbarchard@google.com
ec1f854f86 Use broadcast to duplicate constants from 16 bytes to 32 bytes to save data space.
BUG=none
TESTED=intelsde
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/32029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1161 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-12 01:45:27 +00:00
fbarchard@google.com
a843cafbe4 ARGBMultiply_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27139005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1160 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 20:33:33 +00:00
fbarchard@google.com
0387df5186 ARGBSubtract_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1159 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 19:12:38 +00:00
fbarchard@google.com
9e9e26d60a ARGBAdd ported AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1158 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 19:01:29 +00:00
fbarchard@google.com
10d9c0d0a7 MergeUV for AVX2 ported to gcc. Add missing vzeroupper to all avx2 functions.
BUG=none
TESTED=ncval for nacl
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25059005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1157 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-10 19:19:12 +00:00
fbarchard@google.com
d1885bcc73 SplitUVRow_AVX2 ported to GCC/NaCL.
BUG=269
TESTED=validator for nacl.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/28979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1156 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-06 01:39:26 +00:00
fbarchard@google.com
a6025e8b6b ARGBDetect do 2 pixels at a time for improved performance.
BUG=375
TESTED=libyuvTest.BenchmarkARGBDetect_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1155 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 23:23:17 +00:00
fbarchard@google.com
b661b3ee0d Detect Endian of ARGB image.
BUG=375
TESTED=libyuv builds, but no test app for it yet
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1154 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 18:46:06 +00:00
fbarchard@google.com
bb3a4b41e9 vextractf128 requuires a constant argument for which dqword to extract, so add a new macro.
BUG=none
TESTED=local build on clang for osx
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1153 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-04 21:05:55 +00:00
fbarchard@google.com
3f87404769 Port YUY2ToUV, YUY2ToUV422, UYVYToUV and UYVYToUV422 to AVX2 on GCC/Nacl.
BUG=269
TESTED=ncval
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1152 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-04 18:24:10 +00:00
fbarchard@google.com
067892c5a1 Port YUY2ToYRow_AVX2 and UYVYToYRow_AVX2 to gcc/NaCL from Windows AVX code.
BUG=269
TESTED=ncval
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1151 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-03 18:30:17 +00:00
fbarchard@google.com
260e3b2273 now that libyuv requires newer nacl compiler, bundles can be assumed and bundle align macro can be removed. no impact on code gen.
BUG=none
TESTED=validator still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30019004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1150 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-30 20:02:03 +00:00
fbarchard@google.com
ee4bc0d834 vzeroupper moved to just before ret. in one case it was done after ret, which is a bug that would cause a performance stall.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1149 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-30 19:27:21 +00:00
fbarchard@google.com
2edea9454d Fix lint extraneous warning on row_win assembly by disabling the warning for those affected lines.
BUG=none
TESTED=line row_win.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1144 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-27 16:27:48 +00:00
fbarchard@google.com
9ed836b154 The 'Any' versions of functions can handle any width now, so remove the check from the calling code. This has 2 advantages - less code, and less overhead in calling function when any function is NOT used. Downside is more code for case where any is used.
BUG=373
TESTED=libyuv_unittest still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1143 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 23:29:31 +00:00
fbarchard@google.com
88ac01aed0 Change YAny functions to share, and use mask for how many bytes at a time for simd vs C.
BUG=373
TESTED=libyuv_unittest passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31819004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1142 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 22:58:38 +00:00
fbarchard@google.com
78a3a6b345 Change Any functions that convert 1 to 1 formats, memcpy style, so use C for remainder to allow a minimum width of 1. This has some advantages - allows function to be used even with SIMD that only allows aligned memory. Fewer macros, used by more functions. SIMD is not used unaligned avoiding page/cache split. No overlap so it can be used in place. Disadvantage is it will be slower if close to the maximum number of non-SIMD pixels.
BUG=373
TESTED=libyuv_unittest still passes
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1141 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 22:17:59 +00:00
fbarchard@google.com
1f151f62a9 add a check that the simd function should be called. allows any functions to support any width, simplifing and speeding up the calling code.
BUG=373
TESTED=try bots
R=brucedawson@chromium.org, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25949004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1140 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 00:45:27 +00:00
fbarchard@google.com
0a6dab42c0 Add check for minimum of 8 pixels for any functions and multiple of 8 not 16 for neon functions.
BUG=373
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/23189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1139 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 23:05:12 +00:00
fbarchard@google.com
f2fa453b94 Port I422ToABGR to AVX2.
BUG=269
TESTED=intelsde on I422ToABGR
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1138 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 17:20:22 +00:00
fbarchard@google.com
22eb5965fc Optimize I422ToRGBA for AVX2 by hoisting ymm5 initialization and using different register for output of unpack.
BUG=269
TESTED=intelsde on I422ToABGR
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29889004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1137 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 23:39:16 +00:00
fbarchard@google.com
c000955bc0 Port I422ToRGBA to AVX.
BUG=269
TESTED=intelsde on I422ToRGBA
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/28769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1136 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 22:41:39 +00:00
fbarchard@google.com
af6f25245e Reenable AVX2 scaling with bug fix for any width
BUG=376
TESTED=unittest on scale functions
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1135 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 01:15:20 +00:00
fbarchard@google.com
4ec55a21cf Use macros to simplify I422ToARGB for AVX code.
BUG=269
TESTED=local build with Visual C
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1133 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 22:48:32 +00:00
fbarchard@google.com
a063a66de4 Change I422ToARGB_AVX2 register usage to match SSSE3. ymm0 = B, ymm1 = G, ymm2 = R.
BUG=269
TESTED=intelsde passes on unittests.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/28759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1132 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 19:02:06 +00:00
fbarchard@google.com
51b78880c5 gcc version of I422ToBGRA_AVX2. Original copied from https://webrtc-codereview.appspot.com/28729004/ and compatible with, but unrelated to windows version.
BUG=269
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1131 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 02:18:11 +00:00
fbarchard@google.com
d81dddd3d0 port I420ToBGRA to AVX2.
BUG=269
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*I420ToBGRA*
R=brucedawson@google.com, harryjin@google.com, magjed@chromium.org

Review URL: https://webrtc-codereview.appspot.com/26869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1127 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 19:35:55 +00:00
fbarchard@google.com
055725b8fe Neon does 8 at a time, so a check is added for any function of I422ToBGRA that width is >= 8 and for fast path that it is a multiple of 8 not 16.
BUG=373
TESTED=untested
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/24059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1126 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 18:39:21 +00:00
fbarchard@google.com
3dbaaf0032 switch win64 intrinsics to loadu / storeu for unaligned memory.
BUG=372
TESTED=untested
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30729004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1124 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 23:46:48 +00:00
fbarchard@google.com
e737688603 Fix for r1122 to change back to elif for rotate build error on Mac.
BUG=268
TESTED=try bot
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31749004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1123 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 22:21:48 +00:00
fbarchard@google.com
f713691a6f Change elif to endif and if to allow AVX2 as well as SSE2 in future changes instead of one or the other.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1122 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 20:47:22 +00:00
fbarchard@google.com
f6e495169c Copy width to 64 bit register to work around clang 3.4 warning
BUG=none
TESTED=local ios 64 bit build completes without size warnings on xcode 5.1.1
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1120 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-13 23:26:17 +00:00
fbarchard@google.com
4d46be3930 Declare CopyRow_AVX as using xmm usage, not ymm. Should resolve chromium build error for Android Atom.
BUG=libyuv:369
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/31609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1118 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-09 17:54:43 +00:00
zhongwei.yao@arm.com
0eb196f8db clear aarch64 related macro and fix bugs
fix 2 bugs:
 - build bug libyuv.gyp
 - runtime bug in ScaleRowDown38_2_Box_NEON
BUG=
TESTED=libyuv_unittest
R=fbarchard@google.com, fbarchard@chromium.org

Review URL: https://webrtc-codereview.appspot.com/23939004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1117 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-09 02:00:40 +00:00
fbarchard@google.com
205c1440cf Use movdqu then pavgb to allow unaligned memory for rgb subsampling code. Allows this assembly to be used for unaligned pointers as well as aligned ones with no performance hit when memory is aligned on a modern cpu.
BUG=365
TESTED=libyuvTest.ARGBToI420_Unaligned (453 ms)
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1116 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 19:47:06 +00:00
fbarchard@google.com
883ce64a34 ifdefs for UV functions to resolve link error on osx
BUG=365
TESTED=mac local build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/24859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1115 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 17:24:14 +00:00
fbarchard@google.com
008ce53ac4 pavgb with memory op requires alignment. This CL disables conversions that use pavgb, and resolves scale by 3/8 unittest for checking alignment works. The 3/8 code used a pavgb with a memory operand. tests are added for scaling and allow unaligning on purpose.
BUG=365
TESTED=local change to force unaligned memory fails on some conversions and scaling code.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1114 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 01:57:34 +00:00
fbarchard@google.com
ca308327d2 Remove unaligned functions, since most function support unaligned memory now. This reduces complexity and improves performance for unaligned cases because C code can be avoided, and overhead is less. Downside is old cpus (core2 and earlier) will be slower for aligned memory case. Except mips, which has alignment requirement, but remove unaligned variant.
BUG=365
TESTED=unittest builds and passes locally
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1113 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 00:59:31 +00:00
fbarchard@google.com
b720049a54 Make row functions used for planarfunctions and convert use movdqu to relax alignment constraint. Step 1 - make functions unaligned.
BUG=365
TESTED=libyuv_unittest passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26709004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1111 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-03 21:11:37 +00:00
ashok.bhat@gmail.com
147bbede9d Row AArch64 Neon implementation - Part 8
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: If30eb2d255a09dece9d216a9d29317dd748ef496
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/22769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1109 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-03 18:10:05 +00:00
fbarchard@google.com
d1a0e7e71a scale use movdqu for posix
BUG=367
TESTED=libyuvTest.I444ToI420_Unaligned
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1108 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-03 18:03:10 +00:00
fbarchard@google.com
d83f63a3b4 InterpolateRow used for scale handle unaligned memory. Remove HalfRow which is not used.
BUG=367
TESTED=unittest on I422ToI420
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/28639004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1107 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-03 17:37:11 +00:00
fbarchard@google.com
455ae94c60 Make rotate SIMD allow unaligned pointers.
BUG=365
TESTED=libyuv_unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/22899004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1102 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-02 17:56:48 +00:00
fbarchard@google.com
044f914c29 Change scale to unaligned movdqu.
BUG=365
TESTED=scale unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/22879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1101 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-01 01:16:04 +00:00
fbarchard@google.com
9c4c82181b Remove alignment constraint for SSE2. Allows the optimized function to be used with unaligned memory, improving performance in that use case. Hurts performance on core2 and prior where memory was faster with movdqa instruction.
BUG=365
TESTED=psnr, ssim and djb2 unittests pass.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/22859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1100 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-30 18:53:34 +00:00
fbarchard@google.com
1fb68cda67 port/fix CopyRow_AVX to gcc
BUG=363
TESTED=osx build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/28619004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1098 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-30 00:39:41 +00:00
fbarchard@google.com
d33bf86b25 CopyRow_AVX which supports unaligned pointers for Sandy Bridge CPU.
BUG=363
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGBToARGB_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/31489004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1097 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-29 23:53:18 +00:00
ashok.bhat@gmail.com
c379d17195 Row AArch64 Neon implementation - Part 11
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: Id187c5cbdbbb5570598eb9fcd9c3d6699e175f03
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/24759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1096 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-29 18:10:20 +00:00
ashok.bhat@gmail.com
824d9071d7 Remove __ARM_NEON__ define check for AArch64
BUG=319
TESTED=local build
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/28569005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1095 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-29 09:40:37 +00:00
ashok.bhat@gmail.com
fc5ca9280f Row AArch64444 Neon implementation - Part 10
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: I1a11136aa3e4f541f9c2617281d7b530b470f13d
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/23769005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1093 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-26 12:40:57 +00:00
ashok.bhat@gmail.com
c8a34d2e5b Row AArch64 Neon implementation - Part 9
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: Id3af83a6efbd70b4a808a8442c3badbef749c0cc
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/23769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1092 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-26 09:16:48 +00:00
fbarchard@google.com
c52d66d7da Detect asimd as same as Neon for Arm features. Used on Juno aarch64 linux.
BUG=361
TESTED=.\libyuv_unittest --gtest_filter=libyuvTest.TestLinuxNeon
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/31439004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1088 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-22 18:30:17 +00:00
fbarchard@google.com
aec76f2e30 add stride to pointer in C and pass as register to inline.
BUG=357
TESTED=clang on ios
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29489004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1086 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-19 22:51:39 +00:00
fbarchard@google.com
f7d9b9fb13 change vector range notation to a list of registers for clang compatibility. break compare into 2 neon files for consistency with other neon64 files.
BUG=357
TESTED=local ios build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30379004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1085 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-15 23:39:43 +00:00
fbarchard@google.com
a62a97f142 Change branch notation to clang compatible b dot cc
BUG=357
TESTED=local ios a64 build
R=yunqingwang@google.com

Review URL: https://webrtc-codereview.appspot.com/25549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1084 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-15 22:45:32 +00:00
fbarchard@google.com
8cbfc5d41f Change ifdefs for arm 32 and 64 bit so there will only be 32 bit in legacy mode.
BUG=357
TESTED=ios arm64 build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1083 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-15 22:05:01 +00:00
zhongwei.yao@arm.com
60ccea47d9 add TransposeWx8_NEON's aarch64 implementation
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/20259004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1081 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-12 08:11:20 +00:00
ashok.bhat@gmail.com
44c4d0f3b0 Fix the build failure for arm64
TESTED=libyuv_unittest
BUG=357
R=fbarchard@google.com


git-svn-id: http://libyuv.googlecode.com/svn/trunk@1080 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-11 14:49:31 +00:00
ashok.bhat@gmail.com
21cadac909 Fix the build failure for arm64
TESTED=libyuv_unittest
BUG=357
R=fbarchard@google.com

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1079 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-11 14:43:34 +00:00
yang.zhang@arm.com
c386168c76 Rotate ARM64 NEON implementation - TransposeUVWx8_NEON
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: I1dc89b35d4c4bf011cd04b549aaf9d777b1acc65

Review URL: https://webrtc-codereview.appspot.com/23399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1078 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-10 06:02:55 +00:00
zhongwei.yao@arm.com
4667addfb8 Add a placeholder file for ARM64 Rotate Neon implementation
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/18189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1073 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-01 08:50:49 +00:00
zhongwei.yao@arm.com
686e9d0247 implement ARM64 ScaleARGBRowDownEven and ScaleARGBRowDownEvenBox
TESTED=libyuv_unittest
BUG=319
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/19099004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1072 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-01 08:17:36 +00:00
yang.zhang@arm.com
90f971fc5e Scale ARM64 NEON implementation - ScaleRowDown38
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Add the following functions:
- ScaleRowDown38_NEON
- ScaleRowDown38_2_Box_NEON
- ScaleRowDown38_3_Box_NEON

I find that these functions aren't tracked in the gtest.
So that I write the test case myself.

Change-Id: Ie70a00d7f708450dc786dfb388386ff748a21508

Review URL: https://webrtc-codereview.appspot.com/15229004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1071 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-01 03:55:19 +00:00
yang.zhang@arm.com
5497af0ba7 Scale ARM64 NEON implementation - ScaleRowDown34
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Add the following functions:
 - ScaleRowDown34_NEON
 - ScaleRowDown34_0_Box_NEON
 - ScaleRowDown34_1_Box_NEON

Change-Id: If3fe96de602b77033ec67252ef755ef3f88f33aa

Review URL: https://webrtc-codereview.appspot.com/20209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1070 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-01 03:48:10 +00:00
ashok.bhat@gmail.com
2df5743bd4 Row AArch64 Neon implementation - Part 6
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: I5d93eb184ba873d5e7637a3b5a830be39a967c6f
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/15239004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1069 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-29 08:12:51 +00:00
zhongwei.yao@arm.com
4d5c3f3498 implement ARM64 ScaleRowDown4 and ScaleRowDown4Box
TESTED=libyuv_unittest
BUG=319
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/21279004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1068 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-28 06:43:18 +00:00
fbarchard@google.com
3389f8efa4 disable mips assembly for __mips_isa_rev 6
BUG=355
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/22229004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1067 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-27 18:14:58 +00:00
zhongwei.yao@arm.com
4e4396313a Add function ScaleFilterRows_NEON for ARM64 Scale Neon implementation
TESTED=libyuv_unittest
BUG=319
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/22439004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1066 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-27 09:41:54 +00:00
ashok.bhat@gmail.com
218ebde886 Row AArch64 Neon implementation - Part 7
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Change-Id: Idfad43af3d637596678a35f733d76dec29778af2
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/22459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1065 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-26 10:28:14 +00:00
zhongwei.yao@arm.com
298dbf2dc3 implement ScaleRowDown2_NEON && ScaleRowDown2Box_NEON
TESTED=libyuv_unit_test
BUG=319
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/15269004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1064 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-26 02:50:41 +00:00
zhongwei.yao@arm.com
15d1af1574 implement ScaleARGBRowDown2 && ScaleARGBRowDown2Box
TESTED=libyuv_unit_test
BUG=319
R=fbarchard@chromium.org, fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/17199004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1063 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-26 02:45:14 +00:00
fbarchard@google.com
ee43c95c51 fix memory leaks in *ToI420 functions.
BUG=352
TESTED=drmemory out\debug\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=**ToI420_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/21289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1060 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-22 00:37:16 +00:00
fbarchard@google.com
6e95f6f7e1 ifdef headers to avoid intrinsics if built with gcc 64 bit on windows.
BUG=351
TESTED=untested
R=jzern@chromium.org

Review URL: https://webrtc-codereview.appspot.com/22419004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1058 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 22:44:49 +00:00
fbarchard@google.com
aaddd24ba0 ARGBToNV12 fix for memory leak on row_u_mem.
BUG=352
TESTED=libyuv_unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/18209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1057 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 22:40:22 +00:00
ashok.bhat@gmail.com
c1155cb587 Row AArch64 Neon implementation - Part 3
BUG=319
TESTED=libyuv_unittest
R=fbarchard@google.com

Change-Id: Ia818ca62d4a84d76b0144f904983d82d41cab651
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/15149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1056 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 19:13:37 +00:00
ashok.bhat@gmail.com
8f04ca5b9c Row AArch64 Neon implementation - Part 5
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Change-Id: Ia76096088ddd771388f01dd86110089db2faedfc
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/21189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1055 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 10:07:11 +00:00
ashok.bhat@gmail.com
cb8be2fb2b Row AArch64 Neon implementation - Part 4
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Change-Id: If145660d999e95246efeedb64a45ba70bf0fe23e
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/13199004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1054 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 09:55:58 +00:00
fbarchard@google.com
720e3a247f In Q420ToI420 the variable halfheight is initialized but not used. Change it to instantiate the variable but do not initialize it. It will be assigned conditionally later. This warning raised in xcode.
BUG=353
TESTED=local build still works
R=harryjin@google.com, noahric@chromium.org

Review URL: https://webrtc-codereview.appspot.com/13299004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1053 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 00:20:58 +00:00
zhongwei.yao@arm.com
0ce3733797 Add a placeholder file for ARM64 Scale Neon implementation
BUS=319
TESTED=libyuv_unit_test
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/18179004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1051 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-20 02:54:59 +00:00
fbarchard@google.com
bf24367476 Mingw xgetbv use gcc assembly not visual c.
BUG=349
TESTED=c:\mingw64\bin\x86_64-w64-mingw32-c++.exe -m32 -I include source/cpu_id.cc -c -o cpu_id.o
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/17139004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1049 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-18 23:45:59 +00:00
fbarchard@google.com
5bd003f081 fix a lint warning about a space needed after && in ifdef
BUG=348
TESTED=cpplint.py --filter=-readability/casting source/*.cc include/libyuv/*.h
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/21209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1048 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-18 23:22:20 +00:00
ashok.bhat@gmail.com
b8c4fc71c3 Row AArch64 Neon implementation - Part 2
BUG=319
TEST=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Change-Id: Ib1f824c5a7dc3938ff63991f08eafa08fc33f108
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/18109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1047 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-18 08:42:56 +00:00
fbarchard@google.com
64455db9b1 cpuid include intrinsics header before using xgetbv
BUG=282
TESTED=vs2010sp1 build.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/17109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1046 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-15 01:23:54 +00:00
fbarchard@google.com
9e0f21af0b fixes for blank line lint warnings
BUG=348
TESTED=cpplint.py --filter=-casting source/*.cc include/libyuv/*.h
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/18139004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1045 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-14 19:42:48 +00:00
ashok.bhat@gmail.com
de9fa43c60 Row AArch64 Neon implementation - Part 1
BUG=319
TEST=libyuv_unittest
R=fbarchard@google.com

Change-Id: I367ffa7bb0fd0337ab8486d3eb4fb94afea7400c
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/21149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1044 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-13 08:33:17 +00:00
yang.zhang@arm.com
26f43db1ef AArch64:add SumSquareError_NEON armv8 assembly version
BUG=none
TESTED=libyuv_unittest
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/16259004/

the benckmarking result is as follows:
toolchain: gcc 4.9
hardware: A53

| count | C Times/NEON times |
| 16    | 3.35               |
| 128   | 6.63               |
| 512   | 7.47               |
| 1024  | 7.72               |

Change-Id: Ic10bf22d77d069a1a2074b68bd5a310c579ec490

Review URL: https://webrtc-codereview.appspot.com/21159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1043 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-13 06:10:02 +00:00
zhongwei.yao@arm.com
1afdfb3da8 arm64 neon optimization building is enabled
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1042 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-13 03:45:11 +00:00
ashok.bhat@gmail.com
9453f7c494 Add a placeholder file for ARM64 Row Neon implementation
BUG=319
TEST=libyuv_unittest
R=fbarchard@google.com

Change-Id: I9fdc355d285062d32c11dba4e240d32f5b1bcb80
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/16249004

Review URL: https://webrtc-codereview.appspot.com/16249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1041 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-07 13:40:18 +00:00
ashok.bhat@gmail.com
c8f529a48f Remove extra MEMACCESS
TESTED=libyuv_unittest

Change-Id: I25fae71200ea44846eea3604a55bf4a88ea593ce
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>


git-svn-id: http://libyuv.googlecode.com/svn/trunk@1039 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-29 18:32:59 +00:00
fbarchard@google.com
451a7541b7 Check number of functions available to cpuid before fetching function 7 results.
BUG=343
TESTED=local test on Windows.
R=brettw@chromium.org, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/12969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1035 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-14 17:48:35 +00:00
fbarchard@google.com
9ffb92fae0 Detect clang-cl compiler and disable assembly for now.
BUG=341
TESTED=clang-cl /W0 -c -Iinclude source/cpu_id.c
R=harryjin@google.com, rnk@chromium.org

Review URL: https://webrtc-codereview.appspot.com/12939004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1033 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-09 17:55:23 +00:00
fbarchard@google.com
65a324ed34 remove extern "C" from rotate function, since its built with extern "C" around full file.
BUG=341
TESTED=clang -c -Iinclude source/rotate.c
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/17919004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1031 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-08 22:06:56 +00:00
fbarchard@google.com
8798e04075 Port conversion functions to c.
BUG=303
TESTED=cl /c /TC /Iinclude source\convert_from.cc source\convert_argb.cc source\convert_from_argb.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/17909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1030 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-08 18:44:57 +00:00
fbarchard@google.com
a2fbf9dee6 convert source ported to c89.
BUG=303
TESTED=cl /c /TC /Iinclude source/convert.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/21849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1029 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-07 19:05:45 +00:00
fbarchard@google.com
81ba94f58a only enable mips assembly for old 32 bit abi. new 32 bit abi and 64 bit bit able remove t4 to t7 and add a4 to a7
BUG=337
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/20769005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1020 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-24 23:40:52 +00:00
fbarchard@google.com
1b9df4c5c8 Add nacl version check to enable Neon on M37 and bundles for X86 on M33
BUG=333
TESTED=nacl build and validate
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/20769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1019 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-24 22:26:30 +00:00
fbarchard@google.com
e6dd1fa024 Port I420ToARGB to intrinsics for win64
BUG=336
TESTED=out\release_x64\libyuv_unittest --gunit_also_run_disabled_tests --gtest_filter=*I420To*B*
R=bryan.bernhart@intel.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/15809005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1018 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-24 20:45:45 +00:00
fbarchard@google.com
f67b426bdf Add some more bic's for scale nacl code
BUG=333
TESTED=ncval
R=thorcarpenter@google.com

Review URL: https://webrtc-codereview.appspot.com/20719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1017 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-17 23:12:55 +00:00
fbarchard@google.com
4b7a04e864 port neon to arm64. the register names have changes from r0 to w0 or x0 depending on size. Passing them as parameters (e.g. %0) makes the code register name agnostic.
BUG=333
TESTED=32 bit build still works.
R=nfullagar@chromium.org

Review URL: https://webrtc-codereview.appspot.com/20669005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1016 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-17 18:16:29 +00:00
fbarchard@google.com
4e5e44e21e scale neon nacl port
BUG=333
TESTED=ncval
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/18549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1015 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-16 17:09:48 +00:00
fbarchard@google.com
b1df26dc27 rotate neon code port to nacl
BUG=333
TESTED=ncval
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/19759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1014 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-13 23:02:10 +00:00
fbarchard@google.com
0bb310ebc4 Add bic instructions before each load or store for nacl
BUG=333
TESTED=validator
R=jfb@chromium.org

Review URL: https://webrtc-codereview.appspot.com/13669004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1013 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-13 17:20:52 +00:00
fbarchard@google.com
bf3b111147 MEMACCESS macro for NaCL Arm
BUG=333
TESTED=validator passes
R=jfb@chromium.org, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/13649004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1012 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-12 00:19:38 +00:00
fbarchard@google.com
a9ff15b7bb check copy has different address. If same, skip the copy to avoid valgrind error.
BUG=334
TESTED=unittests still pass
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/14679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1011 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-11 00:16:59 +00:00
fbarchard@google.com
2a35da3912 Add ARGBToABGR and ARGBToBGRA as actual functions instead of macros.
BUG=334
TESTED=libyuv unittests pass
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/12659006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1008 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-06-02 19:24:57 +00:00
fbarchard@google.com
37ad8b6507 Port libyuv neon to nacl - compare
BUG=333
TESTED=d:\src\nacl_sdk\pepper_canary\tools\ncval.exe newlib/Release/nacltest_arm.nexe
R=nfullagar@chromium.org

Review URL: https://webrtc-codereview.appspot.com/17599004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1006 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-05-21 19:04:15 +00:00
fbarchard@google.com
b18413e568 YUV scaling with 16 bit planes
BUG=331
TESTED=libyuv_unittest --gunit_also_run_disabled_tests --gtest_filter=**.ScaleFrom1280x720*
R=debargha@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/17569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1005 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-05-20 19:22:30 +00:00
fbarchard@google.com
8b857c0ac6 changes to accommodate libjpeg 9 interface.
BUG=327
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/15489005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1004 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-05-13 19:00:01 +00:00
fbarchard@google.com
fdaaea0f83 Change r9 to a parameter which will map to x9 for arm64.
BUG=319
TESTED=untested
R=thorcarpenter@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/11139004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@998 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-04-03 18:37:32 +00:00
kjellander@google.com
614c9fa853 Revert accidental checkin in r992
I dunno how this happened, since that file belonged
to a another gcl change that I use to test the trybots.

TBR=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/10899005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@993 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-28 19:40:39 +00:00
kjellander@google.com
d5a521cd23 Minor fixes to gyp_libyuv.py
Fixed invalid references left from
the copied gyp_webrtc.

This CL will also add svn:ignore on a
bunch of directories to speed up builds
(less unnecessary delete + redownload).

It also adds the executable bit to
gyp_libyuv.

R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/10889004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@992 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-28 19:38:05 +00:00
fbarchard@google.com
a3cda5080c Port format_conversion (bayer) to C
BUG=303
TESTED=cl /c /TC /Iinclude source/format_conversion.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/10709004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@990 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-26 17:29:20 +00:00
fbarchard@google.com
91dc3eddeb Fix C89 compile error for cpu detect. Make mips detection assume DSP if cpuinfo file can not be opened, so that if run in a sandbox, DSP is assumed true, like the arm version.
BUG=303
TESTED=cl /c /TC /Iinclude source/cpu_id.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/10549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@986 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-24 18:24:22 +00:00
fbarchard@google.com
efc5d9b930 Warning fix for implicite cast in scaling from int64 to int.
BUG=none
TESTED=local visual c build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/10169004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@985 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-18 22:23:15 +00:00
fbarchard@google.com
398de7d0be ARGBScale down bilinear clip to edge of image to avoid overread.
BUG=317
TESTED=drmemory out\debug\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*ARGBScale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/10159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@984 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-18 21:39:42 +00:00
fbarchard@google.com
6e618676dc More wordy comments about Neon
BUG=315
TESTED=untested
R=wuwang@google.com

Review URL: https://webrtc-codereview.appspot.com/9599004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@982 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-08 00:23:04 +00:00
fbarchard@google.com
cb3344ae25 If libyuv built with Neon, assume Neon is present on CPU.
BUG=315
TESTED=untested
R=nfullagar@chromium.org

Review URL: https://webrtc-codereview.appspot.com/9589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@980 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-03-07 21:17:24 +00:00
fbarchard@google.com
02ff4e77e5 clang compatibility ifdef
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/7809004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@978 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-31 00:27:40 +00:00
fbarchard@google.com
16ea9c816b libyuv::MJPGToI420() and libyuv::MJPGToARGB() return failure if callback to JPeg fails.
BUG=309
TESTED=try bots still pass
R=braveyao@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/7709004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@976 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-28 03:08:59 +00:00
fbarchard@google.com
fa83188365 scale port to c. completes all scaling functions.
BUG=303
TESTED=cl /c /TC /Iinclude source/scale.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/7319004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@974 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-18 01:30:58 +00:00
fbarchard@google.com
0283648407 ARGB Scale ported to C
BUG=303
TESTED=cl /c /TC /Iinclude source/scale_argb.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/7169004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@972 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-15 02:25:15 +00:00
fbarchard@google.com
2c8108e6c2 Detect pnacl and disable x86 specific code.
BUG=none
TESTED=untested
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/7099004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@968 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-14 00:36:31 +00:00
fbarchard@google.com
ae997018e5 Add extern c around jpeg header
BUG=305
TESTED=try bots
R=michaelbai@chromium.org

Review URL: https://webrtc-codereview.appspot.com/7069004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@967 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 19:43:43 +00:00
fbarchard@google.com
9c6e52791f Port compare to C89 / Visual C.
BUG=303
TESTED=cl /c /TC /Iinclude source/compare.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/7019006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@966 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 18:57:30 +00:00
fbarchard@google.com
70bc4995a0 Planarfunctions (mainly effects) converted to C89/VisualC.
BUG=303
TESTED=cl /c /TC /Iinclude source/planar_functions.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@965 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 18:56:25 +00:00
fbarchard@google.com
6d82347dda Conversion functions ported to C89 / Visual C.
BUG=303
TESTED=cl /c /TC /Iinclude source/convert_to_argb.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@964 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 18:32:37 +00:00
fbarchard@google.com
e28b2084d7 Rotate functions ported to C. This completes all rotate functionality under c89, for overall 71% complete port.
BUG=303
TESTED=cl /c /TC /Iinclude source/rotate_argb.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@963 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 18:31:35 +00:00
fbarchard@google.com
5aa39953cc Port scale to C moving variable definitions to top of functions.
BUG=303
TESTED=gyp builds still build/pass.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6949004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@962 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-11 04:59:01 +00:00
fbarchard@google.com
ecf5a1446e common functions (c row functions) ported to C89.
BUG=303
TESTED=cl /c /TC /Iinclude source/scale_common.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@961 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-10 20:55:39 +00:00
fbarchard@google.com
da443d7adc Remainder calc needs to be after blocks are done. Move calc to old location.
BUG=303
TESTED=Djb2 unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@960 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-09 20:29:27 +00:00
fbarchard@google.com
6700a27c97 Scale mirror bug fix.
BUG=304
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6789005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@959 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-09 20:21:24 +00:00
fbarchard@google.com
9124ac893a compare_common visual c port
BUG=303
TESTED=cl /c /TC /Iinclude source/compare_common.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@958 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-09 19:11:09 +00:00
fbarchard@google.com
167d5d1c2f Porting parts of compare to c89
BUG=303
TESTED=try bots still build, gcc and vc direct for c testing.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6739004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@956 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-08 00:59:40 +00:00
fbarchard@google.com
53a7923b15 cast malloc to uint8*
BUG=303
TESTED=visual c higher warnings
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6639004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@955 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 06:06:01 +00:00
fbarchard@google.com
1f923e3ea6 Declare parameters that are unused, since C does not let you give a type without name.
BUG=303
TEST=compile -x c

Review URL: https://webrtc-codereview.appspot.com/6599006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@954 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 05:42:27 +00:00
fbarchard@google.com
db73518b19 use LIBYUV_BOOL instead of bool
BUG=303
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6519006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@953 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 03:59:31 +00:00
fbarchard@google.com
a1f5254a95 Switch to c style casts for all source and includes.
BUG=303
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@952 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 03:03:00 +00:00
fbarchard@google.com
959b290a96 Port a few functions to C
BUG=303
TESTED=try bots
R=johannkoenig@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6599005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@950 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-06 22:37:28 +00:00
fbarchard@google.com
dd49958058 Only do 1x1 work around for large source
BUG=302
TESTED=hammer effects

Review URL: https://webrtc-codereview.appspot.com/6549005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@949 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 08:42:55 +00:00
fbarchard@google.com
a1b92bd744 Warning fixes for implicite casts that vs2012 complains about with higher warning levels
BUG=302
TESTED=hammer build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6559004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@948 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 02:38:49 +00:00
fbarchard@google.com
c23b817eab 64 bit clip for argb scale down
BUG=302
TEST=out\release\libyuv_unittest --gtest_filter=*ARGBScaleDownClipBy3by4*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6549004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@947 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 02:13:35 +00:00
fbarchard@google.com
909c76e317 point sample 64 bit column filter
BUG=302
TESTED=ARGBScaleClipTo320x240_None etc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6539004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@946 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 02:03:07 +00:00
fbarchard@google.com
667de22fd3 clip by adjusting pointer
BUG=302
TEST=ARGBScaleDownClipBy2_None
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6529004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@945 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 01:43:21 +00:00
fbarchard@google.com
6dc80ab585 gargantuan width support on ARGBScale
BUG=302
TEST=libyuv ARGBScale tests with LIBYUV_WIDTH=90000
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6519005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@944 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 01:15:31 +00:00
fbarchard@google.com
ac9b96c076 Work around for 1 pixel destination
BUG=302
TEST=*1x1*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6519004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@943 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 00:51:12 +00:00
fbarchard@google.com
90a36b29d3 Use 64 bit fixed point for scaling columns if source is 32k or wider.
BUG=302
TESTED=out\release\libyuv_unittest --gtest_filter=*I*ToI*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6509004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@942 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-03 00:34:55 +00:00
fbarchard@google.com
88c0b01cdd Use 64 bit Sum for planar function to remove size limitation
BUG=302
TESTED=out\release\libyuv_unittest --gtest_filter=*Psnr
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6499004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@941 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-02 22:57:06 +00:00
fbarchard@google.com
5dba58cb1e FixedDiv1 using a single 64/32 divide. Removes size restriction from slope.
BUG=302
TESTED=libyuv scale tests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6489004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@940 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-02 22:32:09 +00:00
fbarchard@google.com
277378723a Add little endian 555/565 kCMPixelFormat's to alias list
BUG=none
TESTED=unittests added
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@939 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-02 20:31:40 +00:00
fbarchard@google.com
d1206caefa Apple uses 'BGRA' to mean 'ARGB', so map this on Apple machines.
BUG=229
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6459005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@934 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-31 19:01:14 +00:00
fbarchard@google.com
9fd689e5bf Combines multiple allocs into one call.
BUG=300
TESTED=libyuv_unitests pass
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@932 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-30 21:11:21 +00:00
fbarchard@google.com
a12284b906 sobel use one alloc instead of 3.
BUG=300
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@931 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-30 18:39:43 +00:00
fbarchard@google.com
49db7b7e4a Add edge to sobel buffers to avoid overwrites.
BUG=296
TESTED=Sobel unittest in Effects
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@930 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 22:54:51 +00:00
fbarchard@google.com
6b6eb8cd36 lint fixes
BUG=none
TEST=LINT
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6409004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@929 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 02:09:58 +00:00
fbarchard@google.com
d9c9f37ac4 Conversions use malloc for row buffers.
BUG=296
TESTED=libyuv convert_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@928 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 02:00:30 +00:00
fbarchard@google.com
b2a51d042d Sobel use malloc for row buffers
BUG=296
TESTED=Sobel*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@927 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 01:29:14 +00:00
fbarchard@google.com
05d025df22 Convert common low levels use malloc
BUG=296
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6379004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@926 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 01:21:49 +00:00
fbarchard@google.com
e86abbd244 Use malloc for row buffers in scalers removing size limitations.
BUG=296
TESTED=libyuvTest.Scale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6369004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@925 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 01:11:26 +00:00
fbarchard@google.com
aab73bbe8a format conversion use malloc
BUG=296
TESTED=convert_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@924 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-26 23:25:03 +00:00
fbarchard@google.com
ae9a1388a7 Use malloc for row buffers in rotate
BUG=296
TESTED=rotate_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6329004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@922 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-26 21:41:11 +00:00
fbarchard@google.com
cf17f0cd2b Scale exit early if simple version used
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6319004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@921 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-26 20:00:54 +00:00
fbarchard@google.com
06ed625808 Neon RGBToUV more accurate coefficients and subsample averaging. Instead of adding 4 pixels and making coefficients 4x smaller, this makes the coefficients 2x small and does a shift, for best accuracy.
BUG=297
TESTED=try bots
R=tpsiaki@google.com, yunqingwang@google.com

Review URL: https://webrtc-codereview.appspot.com/6309004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@920 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-26 19:17:21 +00:00
fbarchard@google.com
b14f46fa30 NaCL pepper_33 port of scale and compare using lock/unlock. Remove less useful scaling tests and change default size to a multiple of 16 for better assembly coverage.
BUG=none
TESTED=ncval
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/5939005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@917 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-17 18:27:06 +00:00
fbarchard@google.com
f2bd31538e Change name of variable for convert to crop_width/height instead of dst_width/height to clarify that it is used to crop the original before rotation and is not the final destination size.
BUG=none
TESTED=local builds still work
R=wjia@google.com, wjia@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/5859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@915 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-16 18:23:04 +00:00
nfullagar@google.com
6f0a1dca2c First pass using bundle_lock / bundle_unlock from binutils update.
Remove hand placed BUNDLEALIGN and add new asm directives into
psuedo-instruction macros. Moving forward, this will make for easier and
more consistent psuedo-instruction alignment with bundle boundaries.
This is mostly a NaCl CL but does include a few changes that will slightly
change loop alignments in non-NaCl builds.
BUG=253
TEST=trybots,ncval
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/4889004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@913 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-13 22:40:53 +00:00
fbarchard@google.com
0287a3cb53 Fix for I422ToARGB which used movq instead of movd
BUG=293
TESTED=LIBYUV_DISABLE_ASM=1 valgrind out/Debug/remoting_unittests --single-process-tests --gtest_filter=*YuvToRgb*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5669004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@911 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-12 01:59:45 +00:00
fbarchard@google.com
dd2fca5f9c scale down 4
BUG=none
TEST=none
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/5389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@907 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-09 21:07:37 +00:00
fbarchard@google.com
a6b8e0da51 Reduce filter to None if 1 pixel wide.
BUG=none
TESTED=talk media_unittest YuvScalerTest.TestScaleUp1x6OptInt
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5449005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@906 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-09 19:56:16 +00:00
fbarchard@google.com
5316f38375 clamp pointer to max row to avoid stepping off source image.
BUG=292
TESTED=BackgroundOverlayKernelTest.ProcessVideo_ForegroundBackground

Review URL: https://webrtc-codereview.appspot.com/5379004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@905 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-08 21:13:09 +00:00
fbarchard@google.com
0db78ad127 Switch from xor/mov bx, to movzx ebx, which still passes drmemory and valgrind.
BUG=none
TESTED=drmemory

Review URL: https://webrtc-codereview.appspot.com/5339004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@904 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-08 19:23:41 +00:00
fbarchard@google.com
5f29eaafae Fix for off by one in scale - only source should be src - 0x10001 because dest will hit exact pixel.
BUG=292
TESTED=valgrind

Review URL: https://webrtc-codereview.appspot.com/5359004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@902 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-08 18:03:00 +00:00
fbarchard@google.com
aae7deb5cf yuv use scale slope calc
BUG=none
TEST=drmem
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5319004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@899 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-07 00:55:23 +00:00
fbarchard@google.com
3f5a860be8 Swap x and dx to get slope right.
BUG=none
TEST=ARGBScaleClipFrom320x240_Bilinear
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5299004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@898 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-06 19:49:56 +00:00
fbarchard@google.com
5e24e55f98 Couple fixes for scale common
BUG=none
TESTED=local build
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@897 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-06 19:20:40 +00:00
fbarchard@google.com
980150f7f1 Compute slope considering filtering, mirror.
BUG=261
TEST=valgrind
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/5199004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@896 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-06 18:13:46 +00:00
fbarchard@google.com
ec0cc5bb2d Function to switch filters to a simplier one based on scale factors.
BUG=none
TEST=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4989004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@894 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-05 00:28:12 +00:00
fbarchard@google.com
99a1298c54 I444ToI420 etc use ScalePlane on Y to allow mirroring.
BUG=291
TESTED=unittests still pass.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@893 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-04 23:36:06 +00:00
fbarchard@google.com
a8e4dcb5d5 Use scaling for YUV to YUV.
BUG=none
TEST=I4*ToI4*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@892 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-04 20:31:53 +00:00
fbarchard@google.com
48e5364313 Use xor/mov bx instead of movzx to avoid drmemory bug
BUG=none
TEST=none
R=johannkoenig@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4879004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@891 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-04 03:04:58 +00:00
fbarchard@google.com
064d2768a8 Windows Arm makefile and build fix.
BUG=290
TESTED=make -f winarm.mk
R=kjellander@google.com, mflodman@webrtc.org, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@890 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-03 19:13:53 +00:00
fbarchard@google.com
d0a4b4ac01 Switch I4xxToI420 to point sample to pass drmemory.
BUG=289
TESTED=drmemory
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/4809004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@886 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-03 07:00:11 +00:00
fbarchard@google.com
04f40278df yasm ALIGN uppercase
BUG=none
TEST=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4769005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@885 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-03 00:51:14 +00:00
fbarchard@google.com
545a51c1d3 use scale for subsampling to handle odd source width to even destination width.
BUG=289
TEST=drmemory
R=nfullagar@google.com, ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/4779004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@884 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-03 00:27:29 +00:00
fbarchard@google.com
0014ce0056 test odd width and fix for unaligned used on odd width conversion.
BUG=283
TESTED=try bots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4729004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@883 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-02 20:07:29 +00:00
fbarchard@google.com
06f730d8a6 Change do while loops to for loops to allow 0 or 1 wide
BUG=289
TESTED=drmemory on odd width
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@882 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-02 20:03:04 +00:00
fbarchard@google.com
c2295807bd Reduce alignment for loops from 16 bytes to 4 bytes. Reduces outer loop overhead without hurting innerloop time.
BUG=none
TESTED=try bots
R=fbarchard@chromium.org, mflodman@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/4659004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@880 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-02 15:57:39 +00:00
fbarchard@google.com
dbe4814361 Move scale row functions to scale_win etc
BUG=none
TEST=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4509005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@879 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-28 01:16:15 +00:00
fbarchard@google.com
a77571812e Lint fixes for macros
BUG=none
TEST=lint
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4499004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@877 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-27 01:33:09 +00:00
fbarchard@google.com
67e6419668 Port more functions in row_posix.cc to NaCl
BUG=253
TEST=libyuv_unittest,ncval,trybots
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4489004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@876 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-27 01:15:24 +00:00
nfullagar@google.com
a6c94b2227 Port gcc asm code in scale.cc to NaCl x86-64
BUG=253
TEST=manually run tests, trybots and ncval
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/4029005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@872 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-21 19:38:17 +00:00
fbarchard@google.com
2d8824079a Fix ConvertToI420() to properly delete temporary array when rotating result. When rotating an image ConvertToI420() allocates a temporary buffer using new[], but then attempts to delete it using delete instead of delete[].This issue is not the root cause of the referenced bug, but it needs to be addressed in order to fix that bug.
BUG=crbug.com/306876
TESTED=try bots
R=wuwang@google.com

Review URL: https://webrtc-codereview.appspot.com/4129005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@866 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-20 21:18:23 +00:00
fbarchard@google.com
431f5f0388 Fix scaling bug
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3979007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@864 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-20 01:23:04 +00:00
fbarchard@google.com
ba0eab9366 Reduce blur radius based on width. And Makefile clean remove temp files.
BUG=none
TEST=Blur*
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/4019004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@858 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-18 19:18:15 +00:00
fbarchard@google.com
e812e86ea9 Simplify constraints on asm yuv scale columns for benefit of android intel build.
BUG=none
TEST=try bots
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3989005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@857 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-18 17:45:23 +00:00
fbarchard@google.com
a0630d77f0 Report of affine to nacl using %k0
BUG=none
TEST=none
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/3929004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@855 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-15 17:42:44 +00:00
fbarchard@google.com
e8c74b61d3 Faster point samplers using row functions and specialized 2x upsampler.
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@854 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-14 02:03:32 +00:00
fbarchard@google.com
a2311691c6 YUV scale up
BUG=none
TEST=libyuvTest.ScaleTo1280x720_Linear
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3829004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@853 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-13 21:16:17 +00:00
fbarchard@google.com
5c36470985 Port ScaleFilterCols_SSSE3 to gcc
BUG=none
TEST=Scale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3789004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@851 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-12 20:40:55 +00:00
fbarchard@google.com
e37aed6f42 Nacl versions of color tables
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@850 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-12 04:00:10 +00:00
fbarchard@google.com
f7eb04bc41 Port ScaleCols to SSSE3 for Win.
BUG=none
TEST=Scale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@849 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-11 23:13:57 +00:00
fbarchard@google.com
788f757016 Linear interpolation.
BUG=none
TEST=*Linear*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@848 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-11 18:53:19 +00:00
fbarchard@google.com
c2a889eb55 Bump reciprocal up by 1
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3599004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@847 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-11 05:14:13 +00:00
fbarchard@google.com
67a0987dd9 Scale Up2 ported to NaCL.
BUG=none
TEST=none
R=nfullagar@chromium.org, nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/3589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@846 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-07 20:35:17 +00:00
fbarchard@google.com
1428b37c2b Scale up by 2 unfiltered
BUG=none
TESTED=manual test with width/height set to half.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@845 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-07 19:26:31 +00:00
fbarchard@google.com
ed9ddc0765 Port small blur to NaCL
BUG=none
TEST=validate passes
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@844 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-05 23:26:06 +00:00
fbarchard@google.com
191ab18073 Use fixed point for small blurs
BUG=none
TEST=libyuvTest.ARGBBlurSmall_Opt
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@843 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-05 18:19:11 +00:00
fbarchard@google.com
87215c0871 Fix row coalescing for NV12 and NV21 to I420.
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3379004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@841 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-05 02:15:55 +00:00
fbarchard@google.com
f1b8e2a575 4 pixel sse2 point scaler for better use of SSE2 that can hold 4 coordinates.
BUG=none
TEST=*ARGBScaleFrom*None
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3319004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@839 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-04 19:05:33 +00:00
fbarchard@google.com
4a4b7374c1 Load matrix with one vector and splat to 4 different ones.
BUG=none
TEST=none
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3299004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@838 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-01 21:29:45 +00:00
fbarchard@google.com
6368c10c9c Add __declspec(safebuffers) to functions with arrays on stack that have explicit checks to avoid a redundent compiler stack check.
BUG=none
TEST=unitests pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@837 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-01 21:27:31 +00:00
fbarchard@google.com
f3871ce976 safebuffers requires vs2010
BUG=none
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3259004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@836 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-31 22:25:04 +00:00
fbarchard@google.com
bcccd6b78d cpuid for older vc
BUG=263
TEST=none
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@835 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-31 19:15:12 +00:00
fbarchard@google.com
11a0d48e45 pass parameter for yuv conversion
BUG=267
TEST=Luma
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3169005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@834 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-31 05:47:13 +00:00
fbarchard@google.com
15df877b64 Use emit for xgetbv for compatibility with vs2005, vs2008 and vs2010 without sp1.
BUG=none
TEST=none
R=nfullagar@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/3089005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@833 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-30 01:33:11 +00:00
fbarchard@google.com
21796c94aa Move constant to its own asm block to save 3 GPR registers for main loop
BUG=267
TESTED=32 bit mac build

Review URL: https://webrtc-codereview.appspot.com/3099004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@832 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-29 08:43:13 +00:00
fbarchard@google.com
ca8f826ba3 Luma fetch 4 pixels
BUG=267
TEST=Luma*
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@831 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-28 22:53:22 +00:00
fbarchard@google.com
407c4ee73a Register juggling to get gcc 32 bit to build
BUG=267
TEST=Luma* builds under gcc 32 and runs similar performance to other builds
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3059005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@830 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-28 22:28:35 +00:00
fbarchard@google.com
4c736098d6 Use packssdw which is SSE2 not packusdw which is SSSE4.
BUG=none
TEST=Sobel* on AMD cpu
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3069004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@829 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-28 19:12:49 +00:00
fbarchard@google.com
6f7e514caa Full metal BCS
BUG=none
TEST=Luma* unittest
R=thorcarpenter@google.com

Review URL: https://webrtc-codereview.appspot.com/3029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@828 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-28 17:10:49 +00:00
fbarchard@google.com
fb99c03008 NaCL port of CopyAlpha
BUG=none
TEST=ncval
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/2999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@827 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-25 21:26:40 +00:00
fbarchard@google.com
08b24a4232 Bayer GG specialized version for Sobel
BUG=none
TEST=Sobel
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/2849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@826 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-25 07:39:43 +00:00
fbarchard@google.com
1b2ff39cc8 Add space around each nacl macro for clang compatibility with -Wreserved-user-defined-literal
BUG=280
TESTED=try bots
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/2899004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@825 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-24 20:12:36 +00:00