150 Commits

Author SHA1 Message Date
Frank Barchard
88b050f337 MergeUV AVX512BW use assembly
- Convert MergeUVRow_AVX512BW to assembly
- Enable MergeUVRow_AVX512BW for Windows with clangcl
- MergeUVRow_AVX2 use vpmovzxbw and vpsllw
- MergeUVRow_16_AVX2 use vpmovzxbw and vpsllw with different shift for U and V

AMD Zen 4 640x360 100000 iterations
Was
AVX512 MergeUVPlane_Opt (884 ms)
AVX2   MergeUVPlane_Opt (945 ms)
AVX2   MergeUVPlane_16_Opt (2167 ms)

Now
AVX512 MergeUVPlane_Opt (865 ms)
AVX2   MergeUVPlane_Opt (943 ms)
SSE2   MergeUVPlane_Opt (973 ms)
AVX2   MergeUVPlane_16_Opt (2102 ms)

Bug: None
Change-Id: I658ada2a75d44c3f93be8bd3ed96f83d5fa2ab8d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4271230
Reviewed-by: Fritz Koenig <frkoenig@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-02-22 21:19:08 +00:00
Frank Barchard
2bdc210be9 MergeUV_AVX512BW for I420ToNV12
On Skylake Xeon 640x360 100000 iterations
AVX512   MergeUVPlane_Opt (1196 ms)
AVX2     MergeUVPlane_Opt (1565 ms)
SSE2     MergeUVPlane_Opt (1780 ms)
Pixel 7  MergeUVPlane_Opt (1177 ms)

Bug: None
Change-Id: If47d4fa957cf27781bba5fd6a2f0bf554101a5c6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4242247
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-02-13 20:14:57 +00:00
Hao Chen
0809713775 Refine some functions on the Longarch platform.
Add ARGBToYMatrixRow_LSX/LASX, RGBAToYMatrixRow_LSX/LASX and
RGBToYMatrixRow_LSX/LASX functions with RgbConstants argument.

Bug: libyuv:912
Change-Id: I956e639d1f0da4a47a55b79c9d41dcd29e29bdc5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4167860
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-01-18 18:54:14 +00:00
Frank Barchard
f71c83552d I420ToRGB24MatrixFilter function added
- Implemented as 3 steps: Upsample UV to 4:4:4, I444ToARGB, ARGBToRGB24
- Fix some build warnings for missing prototypes.

Pixel 4
I420ToRGB24_Opt (743 ms)
I420ToRGB24Filter_Opt (1331 ms)

Windows with skylake xeon:
x86 32 bit
I420ToRGB24_Opt (387 ms)
I420ToRGB24Filter_Opt (571 ms)
x64 64 bit
I420ToRGB24_Opt (384 ms)
I420ToRGB24Filter_Opt (582 ms)


Bug: libyuv:938, libyuv:830
Change-Id: Ie27f70816ec084437014f8a1c630ae011ee2348c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3900298
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-09-16 19:46:47 +00:00
Frank Barchard
65e7c9d570 MM21ToYUY2 and ABGRToJ420 conversion
MM21 to YUY2 use zip1 for performance

Cortex A510
Was MM21ToYUY2 (612 ms)
Now MM21ToYUY2 (573 ms)

Prefetches help Cortex A53
Was MM21ToYUY2 (4998 ms)
Now MM21ToYUY2 (1900 ms)

Pixel 4 Cortex A76
Was MM21ToYUY2 (215 ms)
Now MM21ToYUY2 (173 ms)

ABGRToJ420
- NEON, SSSE3 and AVX2 row functions
- J400, J420 and J422 formats.
- Added AVX2 for UV on ARGBToJ420.  Was SSSE3

Same code/performance as ARGBToJ420 but with constants re-ordered.
Pixel 4
ABGRToJ420_Opt (623 ms)
ABGRToJ422_Opt (702 ms)
ABGRToJ400_Opt (238 ms)

Skylake Xeon
With LIBYUV_BIT_EXACT which uses C for UV
ABGRToJ420_Opt (988 ms)
ABGRToJ422_Opt (1872 ms)
ABGRToJ400_Opt (186 ms)
Skylake Xeon using AVX2
ABGRToJ420_Opt (251 ms)
ABGRToJ422_Opt (245 ms)
ABGRToJ400_Opt (184 ms)
Skylake Xeon using SSSE3
ABGRToJ420_Opt (328 ms)
ABGRToJ422_Opt (362 ms)
ABGRToJ400_Opt (185 ms)

Bug: b/238137982
Change-Id: I559c3fe3fb80fa2ce5be3d8218736f9cbc627666
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3832111
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-08-16 22:07:38 +00:00
Frank Barchard
95b14b2446 RAWToJ400 faster version for ARM
- Unrolled to 16 pixels
- Take constants via structure, allowing different colorspace and channel order
- Use ADDHN to add 16.5 and take upper 8 bits of 16 bit values, narrowing to 8 bits
- clang-format applied, affecting mips code

On Cortex A510
Was RAWToJ400_Opt (1623 ms)
Now RAWToJ400_Opt (862 ms)

C   RAWToJ400_Opt (1627 ms)

Bug: b/220171611
Change-Id: I06a9baf9650ebe2802fb6ff6dfbd524e2c06ada0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3534023
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-18 07:22:36 +00:00
Hao Chen
91bae707e1 Optimize functions for LASX in row_lasx.cc.
1. Optimize 18 functions in source/row_lasx.cc file.
2. Make small modifications to LSX.
3. Remove some unnecessary content.

Bug: libyuv:912
Change-Id: Ifd1d85366efb9cdb3b99491e30fa450ff1848661
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3507640
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-09 08:52:54 +00:00
Frank Barchard
42d76a342f RAWToJNV21 function with 2 step conversion
RAWToJ420 + J420ToNV21 on row level

Pixel 6
RAWToJNV21_Opt (320 ms)

Skylake Xeon
RAWToJNV21_Opt (302 ms)

Bug: b/220171611
Change-Id: I39dcce9cf56c576b95666bb4fb1baccf9fbc7f7a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3495876
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-01 19:33:49 +00:00
Frank Barchard
2c6bfc02d5 Remove MMI support
Bug: libyuv:916
Change-Id: I345b7e271ceb4b32fe91e292915e66be40812810
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3415817
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-01-26 08:41:33 +00:00
Hao Chen
dfe046d272 Add optimization functions in row_lsx.cc file.
Optimize 44 functions in source/row_lsx.cc file.
All test cases passed on loongarch platform.

Bug: libyuv:913
Change-Id: Ic80a5751314adc2e9bd435f2bbd928ab017a90f9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351467
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-01-21 01:34:38 +00:00
Hao Chen
de8ae8c679 Add optimization functions in row_lasx.cc file.
Optimize 32 functions in source/row_lasx.cc file.
All test cases passed on loongarch platform.

Bug: libyuv:912
Signed-off-by: Hao Chen <chenhao@loongson.cn>
Change-Id: I7d3f649f753f72ca9bd052d5e0562dbc6f6ccfed
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351466
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-01-21 01:34:38 +00:00
Frank Barchard
b179f1847a Enable SIMD for exact RGB to Y conversions
Bug: libyuv:908, b/202888439
Change-Id: Icc5470b85d91b441ded9958ee04b4f32246646f0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3230489
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2021-10-19 07:54:50 +00:00
Frank Barchard
55b97cb48f BIT_EXACT for unattenuate and attenuate.
- reenable Intel SIMD unaffected by BIT_EXACT
- add bit exact version of ARGBAttenuate, which uses ARM version of formula.
- add bit exact version of ARGBUnatenuate, which mimics the AVX code.

Apply clang format to cleanup code.

Bug: libyuv:908, b/202888439
Change-Id: Ie842b1b3956b48f4190858e61c02998caedc2897
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3224702
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-10-15 19:46:02 +00:00
Frank Barchard
daf9778a24 Fix for failed compile with armv-7a neon gcc
Bug: libyuv:907
Change-Id: I955e83c72b57ce5ba45730030b32f337be610a21
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3216739
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-10-12 18:17:50 +00:00
Yuan Tong
f37014fcff Add support for AR64 format
Add following conversions:
ARGB,ABGR <-> AR64,AB64
AR64 <-> AB64

R=fbarchard@chromium.org

Change-Id: I5ca5b40a98bffea11981e136afae4a511ba6c564
Bug: libyuv:886
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2746780
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-03-13 20:55:21 +00:00
Shiyou Yin
d9681c53b3 Refine conditional compilation for MSA and MMI.
This patch is a complement for commit bed9292f2cbba2f8f9ff0f1635a8aa17a311f2f9.
1. Supplement inspection for macro HAS_***TOUV*ROW_MMI/MSA.
2. Reduce calls to function TestCpuFlag().
3. Fix a mistake in source/convert.cc: line 1105.

Change-Id: I5e7f9fe367fa0f6d1db6f7644c5b48d4ad85fedb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2169342
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-29 19:13:23 +00:00
Shiyou Yin
bed9292f2c Move init process of msa after mmi.
Some processors support both MSA and MMI.
when they are enabled together, MSA will be preferd.
This patch move MSA initialization after MMI, so that
MSA can overide MMI and be setted to effective.

Change-Id: I8a52cce83ee4ec9727d47c99b287c9580329b149
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2155944
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-28 11:01:51 +00:00
Frank Barchard
fce0fed542 ARGBToY use 8 bit precision instead of 7 bit.
Neon and GCC Intel optimized, but win32 and mips not optimized.

BUG=libyuv:842, b/141482243

Change-Id: Ia56fa85c8cc1db51f374bd0c89b56d21ec94afa7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1825642
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2019-10-07 23:01:10 +00:00
Frank Barchard
9b63884a3e Add ABGRToNV21 and ABGRToNV12
Fix ARGBToUVJRow_AVX2 constants for win32

BUG=libyuv:833, libyuv:839

Change-Id: Id4731a573d40d7a9b46fcc31c2fee295483e1ff6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1739509
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
2019-08-07 01:29:13 +00:00
Martin Storsjö
9b772abf97 Restore the file mode for source files
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.

Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-08-06 18:53:32 +00:00
lixia zhang
21be9122aa libyuv:loongson optimize compare/row/scale/rotate files with mmi.
Currently, libyuv supports MIPS SIMD Arch(MSA),
but libyuv does not supports MultiMedia Instruction(MMI)(such as loongson3a platform).

In order to improve performance of libyuv on loongson3a platform,
this provides optimize 98 functions with mmi.

BUG=libyuv:804

Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-07-20 22:53:04 +00:00
Johann
b8696fde84 add const to casts
When casting a const value, ensure the cast is const as well.

BUG=webm:1509

Change-Id: I5b597fdcc148d111e9824bc7cf918fc5f24e970f
Reviewed-on: https://chromium-review.googlesource.com/996553
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-04-13 22:52:52 +00:00
Frank Barchard
83aa7512c1 AVX512 VMBI version of ARGBToRGB24
Use VMBI instructions but on AVX2 registers to avoid clockrate change.

Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: Id4f8ad1e0e142a380c8a46c5eab90ce145a10edd
Reviewed-on: https://chromium-review.googlesource.com/956609
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-10 02:04:48 +00:00
Frank Barchard
1d509f2178 ARGBToRGB24_AVX2 version
AVX2 port of SSSE3 conversion to output 24 bit RGB

Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I14f7815522d1b790ecd2bb39d9a3441e803b694a
Reviewed-on: https://chromium-review.googlesource.com/953303
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-03-08 02:38:21 +00:00
Frank Barchard
e1f6c1c0b5 tidy applied with readability-inconsistent-declaration-parameter-name
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
2018-02-07 00:24:25 +00:00
Frank Barchard
664c735677 I420ToYUY2_AVX2 port
I420 and I422 To YUY2 and UYVY ported from SSE2 to AVX2.

Was SSE2
I420ToYUY2_Opt (135 ms)
I420ToUYVY_Opt (148 ms)
I422ToYUY2_Opt (145 ms)
I422ToUYVY_Opt (142 ms)

Now AVX2
I420ToYUY2_Opt (133 ms)
I420ToUYVY_Opt (130 ms)
I422ToYUY2_Opt (127 ms)
I422ToUYVY_Opt (137 ms)

Bug: libyuv:556
Test: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*I42?To*UY*Opt
Change-Id: Ic35f97cee02dc009fd98785589ba17c7cf50bb35
Reviewed-on: https://chromium-review.googlesource.com/892493
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-02-01 00:33:25 +00:00
Frank Barchard
ffec313dbe ABGRToAR30 used AVX2 with reversed shuffler
vpshufb is used to reverse R and B channels;
Code is otherwise the same as ARGBToAR30.

Bug: libyuv:751
Test: ABGRToAR30 unittest
Change-Id: I30e02925f5c729e4496c5963ba4ba4af16633b3b
Reviewed-on: https://chromium-review.googlesource.com/891807
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-01-29 22:31:31 +00:00
Frank Barchard
92e22cf5b6 Lint cleanup after C99 change CL
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-24 19:16:03 +00:00
Frank Barchard
7e389884a1 Switch to C99 types
Append _t to all sized types.
uint64 becomes uint64_t etc

Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-23 19:16:05 +00:00
Frank Barchard
3b81288ece Remove Mips DSPR2 code
Bug: libyuv:765
Test: build for mips still passes
Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37
Reviewed-on: https://chromium-review.googlesource.com/826404
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2017-12-14 18:22:16 +00:00
Frank Barchard
c367751430 ARGBToAR30 SSSE3 use pmulhuw to replicate fields
AR30 is optimized with 3 techniques
1. pmulhuw is used to replicate 8 bits to 10 bits.
2. Two channels are processed at a time.  R and B, and A and G.
3. pshufb is used to shift and mask 2 channels of R and B

Bug: libyuv:751
Test: ARGBToAR30_Opt
Change-Id: I4e62d6caa4df7d0ae80395fa911d3c922b6b897b
Reviewed-on: https://chromium-review.googlesource.com/822520
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2017-12-12 20:12:58 +00:00
Frank Barchard
0f98c3c1df Add ARGBToAR30Row_SSE2 to speed up H010ToAR30
Port ARGBToAR30Row_AVX2 to ARGBToAR30Row_SSE2 using same instructions
but xmm registers and doing half as many pixels per loop.

Bug: libyuv:751
Test: LibYUVConvertTest.ARGBToAR30_Opt
Change-Id: Id644e54639133d1caf28ea3cd11ff6ab6891a673
Reviewed-on: https://chromium-review.googlesource.com/817918
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-12-09 00:11:20 +00:00
Lei Zhang
8445617191 Mark a bunch of kArray variables as const.
This allows the linker to move the variables from the .data section to
the .rodata section.

Bug: libyuv:254
Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1

Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0
Reviewed-on: https://chromium-review.googlesource.com/777033
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2017-11-27 23:38:44 +00:00
Frank Barchard
a98d6cdb17 ARGBToAR30 AVX2 conversion function
Bug: libyuv:751
Test: LibYUVConvertTest.ARGBToAR30_Opt
Change-Id: I09c13eb53ba5f1ce1740c013dc587f8300f1d9e0
Reviewed-on: https://chromium-review.googlesource.com/780437
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2017-11-21 20:37:01 +00:00
Manojkumar Bhosale
45b176d153 Add MSA optimized Interpolate/MergeUV/Misc functions
BUG=libyuv:634

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126

Performance Gain (vs C auto-vectorized)
InterpolateRow_MSA      - ~3.3x
InterpolateRow_Any_MSA  - ~2.5x
ARGBSetRow_MSA          - ~1.0x
ARGBSetRow_Any_MSA      - ~1.0x
ARGBToRGB24Row_MSA      - ~1.9x
ARGBToRGB24Row_Any_MSA  - ~1.6x
MergeUVRow_MSA          - ~1.6x
MergeUVRow_Any_MSA      - ~1.2x

Performance Gain (vs C non-vectorized)
InterpolateRow_MSA      - ~11.3x
InterpolateRow_Any_MSA  - ~ 7.9x
ARGBSetRow_MSA          - ~ 6.2x
ARGBSetRow_Any_MSA      - ~ 4.0x
ARGBToRGB24Row_MSA      - ~ 9.9x
ARGBToRGB24Row_Any_MSA  - ~ 8.4x
MergeUVRow_MSA          - ~12.7x
MergeUVRow_Any_MSA      - ~ 8.0x

Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
Reviewed-on: https://chromium-review.googlesource.com/445817
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-02-23 01:42:22 +00:00
Manojkumar Bhosale
54ce8f23d6 Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C auto-vectorized)
ARGBToYJRow_MSA       - ~3.2x
ARGBToYJRow_Any_MSA   - ~2.7x
BGRAToYRow_MSA        - ~3.2x
BGRAToYRow_Any_MSA    - ~2.7x
ABGRToYRow_MSA        - ~3.2x
ABGRToYRow_Any_MSA    - ~2.6x
RGBAToYRow_MSA        - ~3.1x
RGBAToYRow_Any_MSA    - ~2.7x
ARGBToUVJRow_MSA      - ~5.5x
ARGBToUVJRow_Any_MSA  - ~4.5x
BGRAToUVRow_MSA       - ~2.1x
BGRAToUVRow_Any_MSA   - ~2.0x
ABGRToUVRow_MSA       - ~2.1x
ABGRToUVRow_Any_MSA   - ~1.9x
RGBAToUVRow_MSA       - ~2.2x
RGBAToUVRow_Any_MSA   - ~1.9x

Performance Gain (vs C non-vectorized)
ARGBToYJRow_MSA       - ~10.9x
ARGBToYJRow_Any_MSA   -  ~9.2x
BGRAToYRow_MSA        - ~10.9x
BGRAToYRow_Any_MSA    -  ~9.3x
ABGRToYRow_MSA        - ~11.0x
ABGRToYRow_Any_MSA    -  ~9.3x
RGBAToYRow_MSA        - ~10.9x
RGBAToYRow_Any_MSA    -  ~9.1x
ARGBToUVJRow_MSA      - ~12.4x
ARGBToUVJRow_Any_MSA  - ~10.5x
BGRAToUVRow_MSA       -  ~4.7x
BGRAToUVRow_Any_MSA   -  ~4.4x
ABGRToUVRow_MSA       -  ~4.7x
ABGRToUVRow_Any_MSA   -  ~4.5x
RGBAToUVRow_MSA       -  ~4.8x
RGBAToUVRow_Any_MSA   -  ~4.4x

Review-Url: https://codereview.chromium.org/2641153003 .
2017-02-01 10:31:28 +05:30
Frank Barchard
000d2fa91a Libyuv MIPS DSPR2 optimizations.
Optimized functions:

I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2

Bug-fixes in functions:

ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2

BUG=

Review-Url: https://codereview.chromium.org/2626123003 .
2017-01-11 12:19:13 -08:00
Manojkumar Bhosale
a899dea251 Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBAttenuateRow_MSA          - ~1.1x
ARGBAttenuateRow_Any_MSA      - ~1.1x
ARGBToRGB565DitherRow_MSA     - ~6.4x
ARGBToRGB565DitherRow_Any_MSA - ~6.2x
ARGBShuffleRow_MSA            - ~5.1x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~1.1x
ARGBGrayRow_MSA               - ~2.6x
ARGBSepiaRow_MSA              - ~11.6x

Performance Gain (vs C non-vectorized)
ARGBAttenuateRow_MSA          - ~2.46x
ARGBAttenuateRow_Any_MSA      - ~2.45x
ARGBToRGB565DitherRow_MSA     - ~9.4x
ARGBToRGB565DitherRow_Any_MSA - ~12.5x
ARGBShuffleRow_MSA            - ~5.2x
ARGBShuffleRow_Any_MSA        - ~1.9x
ARGBShadeRow_MSA              - ~4.3x
ARGBGrayRow_MSA               - ~10.5x
ARGBSepiaRow_MSA              - ~12.2x

Review-Url: https://codereview.chromium.org/2559693002 .
2016-12-15 12:06:02 +05:30
Frank Barchard
da0c29dada Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ARGBToRGB565Row_MSA       - ~1.6x
ARGBToRGB565Row_Any_MSA   - ~1.6x
ARGBToARGB1555Row_MSA     - ~1.3x
ARGBToARGB1555Row_Any_MSA - ~1.3x
ARGBToARGB4444Row_MSA     - ~3.8x
ARGBToARGB4444Row_Any_MSA - ~3.8x
ARGBToUV444Row_MSA        - ~2.4x
ARGBToUV444Row_Any_MSA    - ~2.4x

Performance Gain (vs C non-vectorized)
ARGBToRGB565Row_MSA       - ~2.8x
ARGBToRGB565Row_Any_MSA   - ~2.8x
ARGBToARGB1555Row_MSA     - ~2.2x
ARGBToARGB1555Row_Any_MSA - ~2.2x
ARGBToARGB4444Row_MSA     - ~6.8x
ARGBToARGB4444Row_Any_MSA - ~6.6x
ARGBToUV444Row_MSA        - ~6.7x
ARGBToUV444Row_Any_MSA    - ~6.7x

Review URL: https://codereview.chromium.org/2520003004 .
2016-11-22 10:47:55 -08:00
Frank Barchard
b1504a8e48 Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Review URL: https://codereview.chromium.org/2487913004 .
2016-11-18 15:05:10 -08:00
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
78c58ab8aa Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains : (Auto-vectorized C vs MSA SIMD)

ARGB4444ToYRow_MSA        : ~3.0x
ARGB4444ToUVRow_MSA       : ~1.8x
ARGB4444ToARGBRow_MSA     : ~3.4x

ARGB4444ToYRow_Any_MSA    : ~2.8x
ARGB4444ToUVRow_Any_MSA   : ~1.7x
ARGB4444ToARGBRow_Any_MSA : ~3.2x

Review URL: https://codereview.chromium.org/2421843002 .
2016-10-19 11:10:51 -07:00
Frank Barchard
d363ea6527 Remove I411 support.
YUV 411 is very uncommon format.  Remove support.

Update documentation to reflect that 411 is deprecated.

Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.

BUG=libyuv:645
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2406123002 .
2016-10-11 11:14:16 -07:00
Frank Barchard
7018f5be0f Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions
R=fbarchard@google.com
BUG=libyuv:634

Performance gains :-

I422ToYUY2Row_MSA     - ~12x
I422ToYUY2Row_Any_MSA - ~7x

I422ToUYVYRow_MSA     - ~12x
I422ToUYVYRow_Any_MSA - ~7x

Review URL: https://codereview.chromium.org/2378753004 .
2016-10-03 18:21:31 -07:00
Frank Barchard
081475b3c8 refactor ARGBToI422 using ARGBToI420 internally
R=harryjin@google.com
BUG=libyuv:546

Review URL: https://codereview.chromium.org/1574253004 .
2016-01-12 17:05:49 -08:00
Frank Barchard
2e4466e282 change all pix parameters to width for consistency
TBR=harryjin@google.com
BUG=none

Review URL: https://codereview.chromium.org/1398633002 .
2015-10-07 22:30:36 -07:00
fbarchard@google.com
3d1176a3f8 ARGBToYJRow_AVX2 hooked up for ARGBToJ422
BUG=none
TESTED=ARGBToJ422 unittest

Review URL: https://webrtc-codereview.appspot.com/44079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1360 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 00:39:25 +00:00
fbarchard@google.com
8f0b32773c ARGBToUV AVX2 functions hooked up.
BUG=none
TESTED=RGB565ToI420
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/46829004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1359 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-07 00:10:52 +00:00
fbarchard@google.com
9afabe29b8 Add ARGBToY AVX calls.
BUG=none
TESTED=libyuv unittests all pass with AVX2
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/44999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1358 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-04-06 23:11:05 +00:00
yang.zhang@arm.com
e246e6c18f Add ARGBToRGB565DitherRow_NEON for ARM32/64
ARM32/64 NEON versions of ARGBToRGB565DitherRow_NEON are implemented.

BUG=407
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ia689170fb39db964392e5e1113801592ab0628bf

Review URL: https://webrtc-codereview.appspot.com/49409004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1335 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-17 02:22:25 +00:00
fbarchard@google.com
bdeb9ac584 switch from 8x8 to 4x4 matrix for dithering
BUG=407
TESTED=Dither unittests
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/46459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1310 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-06 18:28:00 +00:00
fbarchard@google.com
0fe4abbc5c ARGBToRGB565 AVX2 with dithering
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/44519004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1309 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 22:31:43 +00:00
fbarchard@google.com
9245317e16 ARGBToRGB565 SSE2 port.
BUG=407
TESTED=ARGBToRGB565Dither unittest
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/41039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1308 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-04 00:00:50 +00:00
fbarchard@google.com
933bd40c3c port ARGBToRGB565 and ARGB1555 to AVX2. Enable functions that use ARGBToRGB565 AVX2 code. Add ARGBToRGB565Dither function.
BUG=403
TESTED=local windows build
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/42109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1302 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-27 21:15:28 +00:00
fbarchard@google.com
bffd326f74 AVX2 version of ARGBToARGB4444
BUG=403
TESTED=local build on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/43429004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1297 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-25 17:26:28 +00:00
fbarchard@google.com
0887315390 Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. Its not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues.
BUG=301
TESTED=local unittests build/pass on windows gyp build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/38059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1270 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-02-09 19:58:19 +00:00
fbarchard@google.com
a5a15198b4 Add J422 support which is 2x1 subsampling with jpeg color space.
BUG=391
TESTED=color_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/41479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1228 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-01-14 19:16:01 +00:00
fbarchard@google.com
9ed836b154 The 'Any' versions of functions can handle any width now, so remove the check from the calling code. This has 2 advantages - less code, and less overhead in calling function when any function is NOT used. Downside is more code for case where any is used.
BUG=373
TESTED=libyuv_unittest still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1143 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 23:29:31 +00:00
fbarchard@google.com
f713691a6f Change elif to endif and if to allow AVX2 as well as SSE2 in future changes instead of one or the other.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1122 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 20:47:22 +00:00
fbarchard@google.com
ca308327d2 Remove unaligned functions, since most function support unaligned memory now. This reduces complexity and improves performance for unaligned cases because C code can be avoided, and overhead is less. Downside is old cpus (core2 and earlier) will be slower for aligned memory case. Except mips, which has alignment requirement, but remove unaligned variant.
BUG=365
TESTED=unittest builds and passes locally
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24839004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1113 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 00:59:31 +00:00
fbarchard@google.com
aaddd24ba0 ARGBToNV12 fix for memory leak on row_u_mem.
BUG=352
TESTED=libyuv_unittest
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/18209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1057 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 22:40:22 +00:00
ashok.bhat@gmail.com
8f04ca5b9c Row AArch64 Neon implementation - Part 5
BUG=319
TESTED=libyuv_unittest
R=fbarchard@chromium.org, fbarchard@google.com

Change-Id: Ia76096088ddd771388f01dd86110089db2faedfc
Signed-off-by: Ashok Bhat <ashok.bhat@arm.com>

Review URL: https://webrtc-codereview.appspot.com/21189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1055 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-21 10:07:11 +00:00
fbarchard@google.com
8798e04075 Port conversion functions to c.
BUG=303
TESTED=cl /c /TC /Iinclude source\convert_from.cc source\convert_argb.cc source\convert_from_argb.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/17909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1030 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-07-08 18:44:57 +00:00
fbarchard@google.com
a1f5254a95 Switch to c style casts for all source and includes.
BUG=303
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@952 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 03:03:00 +00:00
fbarchard@google.com
9fd689e5bf Combines multiple allocs into one call.
BUG=300
TESTED=libyuv_unitests pass
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6459004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@932 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-30 21:11:21 +00:00
fbarchard@google.com
d9c9f37ac4 Conversions use malloc for row buffers.
BUG=296
TESTED=libyuv convert_test
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@928 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-27 02:00:30 +00:00
fbarchard@google.com
04f40278df yasm ALIGN uppercase
BUG=none
TEST=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/4769005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@885 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-12-03 00:51:14 +00:00
fbarchard@google.com
6368c10c9c Add __declspec(safebuffers) to functions with arrays on stack that have explicit checks to avoid a redundent compiler stack check.
BUG=none
TEST=unitests pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/3289004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@837 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-11-01 21:27:31 +00:00
fbarchard@google.com
095f33d870 Coalesce rows by changing width/height and dropping into code instead of recursing. Improve coalesce by setting stride to 0 so it can be used even on odd width images. Reduce unittests to improve time to run emulators.
BUG=277
TEST=unittests all build and pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@819 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 19:29:10 +00:00
fbarchard@google.com
7e7c7753ba Remove alignment from ARGBToRGB24 and ARGBToRAW to allow fast code to be used all of the time. Improves performance on Westmere and beyond, hurts performance for aligned buffers on older CPUs.
BUG=230
TESTED=try bot
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/2197007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@785 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-11 01:18:36 +00:00
fbarchard@google.com
7fa21d677c More ifdefs to build all libyuv and not get link errors on missing assembly
BUG=253
TEST=nacl validator
R=nfullagar@google.com, ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2024004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@756 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-13 21:54:23 +00:00
fbarchard@google.com
f2aa91a1ac replace static const with static to avoid internal compiler error with gcc
BUG=258
TEST=try bots
R=johannkoenig@google.com

Review URL: https://webrtc-codereview.appspot.com/1944004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@743 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-02 17:48:24 +00:00
fbarchard@google.com
4127a2637d ARGBInterpolate odd width support and inverted odd width test. ARGBToNV12/21 odd height fix. Compare test tolerate small height with warning.
BUG=202
TEST=libyuvTest.ARGBInterpolate85_Any_Invert
Review URL: https://webrtc-codereview.appspot.com/1325004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@663 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-15 10:43:33 +00:00
fbarchard@google.com
9b4c00b908 Move vzeroupper to row functions to simplify caller and allow mix of avx2 and sse2. Impact reduced by row coalescing.
BUG=none
TEST=all tests pass with sde
Review URL: https://webrtc-codereview.appspot.com/1269009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@641 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-04 05:54:59 +00:00
fbarchard@google.com
91c50c3a7d ARGBToYJ_AVX2 port to AVX2.
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/1272008

git-svn-id: http://libyuv.googlecode.com/svn/trunk@640 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-03 23:47:10 +00:00
fbarchard@google.com
f8e9017685 switch toyuy2 from aligned to unaligned
BUG=211
TESTED=ToYUY2*
Review URL: https://webrtc-codereview.appspot.com/1274005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@633 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-02 21:18:12 +00:00
fbarchard@google.com
050b39a5cb Recomputed JPeg coefficients normalized to 128. Apply to ARGBGray function reusing YJ function/coefficients and rounding.
BUG=201
TESTED=Gray unittest improved
Review URL: https://webrtc-codereview.appspot.com/1269006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@629 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-01 20:07:14 +00:00
fbarchard@google.com
4e0d7cc2c6 Y coefficients for J420 need to be scaled by 255/219 to full range.
BUG=159
TESTED=out\release\libyuv_unittest --gtest_filter=*J*
Review URL: https://webrtc-codereview.appspot.com/1264004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@624 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-27 07:35:03 +00:00
fbarchard@google.com
cfaa66c041 ARGBToJ420 and ARGBToJ400 - Full range YUV Jpeg style.
BUG=159
TEST=*J4*
Review URL: https://webrtc-codereview.appspot.com/1243004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@622 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-26 09:14:46 +00:00
fbarchard@google.com
518833b983 Fix RGB565ToARGB_Any which uses SSE2 that requires ARGB alignment. Add row coalescing to convert_argb.cc. Improve coalescing on planar_functions.cc and convert_from_argb.cc. Use stride * 2 == width to test for even width. Apply coalescing to all functions that have same vertical subsampling.
BUG=197
TESTED=libyuv unittest passes where _Opt uses row coalescing.
Review URL: https://webrtc-codereview.appspot.com/1186004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@601 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-12 21:44:56 +00:00
fbarchard@google.com
4db105148e Fix white space (lint) and sort row.h defines
BUG=197
TEST=lint
Review URL: https://webrtc-codereview.appspot.com/1185004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@600 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-12 03:58:04 +00:00
fbarchard@google.com
11a524362d Coalesce rows
BUG=197
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGBToI400*
Review URL: https://webrtc-codereview.appspot.com/1176004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@598 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-11 18:51:25 +00:00
fbarchard@google.com
1096543eaa ARGBShuffle AVX2
BUG=196
TESTED=BGRAToARGB*
Review URL: https://webrtc-codereview.appspot.com/1171006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@596 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-03-08 23:22:32 +00:00
fbarchard@google.com
b444bae883 ARGBToI400 and ARGBToI411 using AVX2. YUY2ToI420 and UVYVToI420 use AVX2. CopyPlane use rep movsb for AVX2. CopyPlane2 use rep movsb for AVX2 and CopyPlane if strides match AVX2, which will do a single rep movsb for entire image if stride == width. MergeUV for I420ToNV12.
BUG=181
TESTED=unittests pass
Review URL: https://webrtc-codereview.appspot.com/1103007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@569 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-11 20:30:41 +00:00
fbarchard@google.com
cde587092f Replace two spaces with one after .
BUG=none
TEST=lint
Review URL: https://webrtc-codereview.appspot.com/1063010

git-svn-id: http://libyuv.googlecode.com/svn/trunk@553 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-28 00:02:35 +00:00
fbarchard@google.com
41e972ec31 ARGBToI444_SSSE3 UV function ported. Thanks to changjun.yang@intel.com
BUG=148
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGBToI* | grep ms
Review URL: https://webrtc-codereview.appspot.com/1019011

git-svn-id: http://libyuv.googlecode.com/svn/trunk@539 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-16 05:54:56 +00:00
fbarchard@google.com
958a0b0c19 lint cleanup
BUG=none
TEST=lint
Review URL: https://webrtc-codereview.appspot.com/931013

git-svn-id: http://libyuv.googlecode.com/svn/trunk@496 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-17 00:00:23 +00:00
fbarchard@google.com
4a86a836fc On Neon remove aligned SplitUVRow
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/930020

git-svn-id: http://libyuv.googlecode.com/svn/trunk@493 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-16 02:51:31 +00:00
fbarchard@google.com
f08ac6bb09 Rename row functions so they are all SomethingRow_CPU
BUG=133
TEST=still builds
Review URL: https://webrtc-codereview.appspot.com/939020

git-svn-id: http://libyuv.googlecode.com/svn/trunk@491 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-15 00:21:14 +00:00
fbarchard@google.com
9573071950 Neon RGB24 to I420
BUG=none
TEST=convert_test
Review URL: https://webrtc-codereview.appspot.com/965018

git-svn-id: http://libyuv.googlecode.com/svn/trunk@481 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-12 20:42:48 +00:00
fbarchard@google.com
522d757c92 Neon optimized ARGBToI444/422/411/420 Any variations, ARGB1555ToI420 Neon, ARGB4444ToI420
BUG=148
TEST=sudo LIBYUV_REPEAT=1000 nice --5 ./libyuv_unittest --gtest_filter=*R*ToI4* | sed 's/\(.*(\)\([0-9]*\)\( ms)\)/\2 - \1\2\3/g' | sort -rn | grep ms
Review URL: https://webrtc-codereview.appspot.com/936020

git-svn-id: http://libyuv.googlecode.com/svn/trunk@480 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-09 23:14:57 +00:00
fbarchard@google.com
dd2d512e5a 420 subsampler
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/935012

git-svn-id: http://libyuv.googlecode.com/svn/trunk@478 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-07 00:53:52 +00:00
fbarchard@google.com
76e851792c 411 subsampled
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/930013

git-svn-id: http://libyuv.googlecode.com/svn/trunk@477 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-06 22:51:39 +00:00
fbarchard@google.com
c4f443f8fe ARGBToUV422Row_NEON
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/928018

git-svn-id: http://libyuv.googlecode.com/svn/trunk@476 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-06 20:58:24 +00:00
fbarchard@google.com
c673f426de ARGBToI444 for Neon
BUG=none
TEST=libyuvTest.ARGBToI444_Opt
Review URL: https://webrtc-codereview.appspot.com/932013

git-svn-id: http://libyuv.googlecode.com/svn/trunk@475 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-06 18:46:51 +00:00
fbarchard@google.com
bdf7cb5914 RGB formats converted to YUV with Neon
BUG=none
TEST=convert_test
Review URL: https://webrtc-codereview.appspot.com/936013

git-svn-id: http://libyuv.googlecode.com/svn/trunk@471 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-11-05 23:40:11 +00:00
fbarchard@google.com
b7ae15a236 Neon optimized ARGBToY
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/916004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@427 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-19 08:25:48 +00:00
fbarchard@google.com
1bdcc4c3e3 rgb565 and argb1555 neon
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/881004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@420 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-15 17:46:59 +00:00
fbarchard@google.com
c389e8e3b1 Convert ARGB to ARGB4444 with Neon
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/875004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@415 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-13 02:51:48 +00:00
fbarchard@google.com
c7277d08e8 Add convert_from_argb.h for all conversion functions from ARGB to something else.
BUG=none
TEST=convert_test
Review URL: https://webrtc-codereview.appspot.com/857014

git-svn-id: http://libyuv.googlecode.com/svn/trunk@408 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-11 23:10:53 +00:00