Yuan Tong
98ec7c28d5
Fix SSE2 version of ScalePlaneUp2_16_Bilinear
...
- Define HAS_SCALEROWUP2_BILINEAR_16_SSE2: it's now fixed.
- Correct function name to ScaleRowUp2_Bilinear_16_Any_SSE2:
this row function uses only SSE2 instructions.
Bug: libyuv:882
Change-Id: Ib1c7ac5b09997cb5b32bc54109d8c566af762433
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3800842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-08-02 20:35:48 +00:00
Frank Barchard
b028453ba6
Disable bilinear 16 bit scale up for SSE2
...
- Undefine HAS_SCALEROWUP2_BILINEAR_16_SSE2
- Save XMM7 in ScaleRowUp2_Bilinear_16_SSE2().
- Rename HAS_SCALEROWUP2LINEAR_xxx to HAS_SCALEROWUP2_LINEAR_xxx
- DetileSplitUVRow_C() is implemented using SplitUVRow_C().
- Changes to unit_test/planar_test.cc.
Bug: libyuv:882
Change-Id: I0a8e8e5fb43bdf58ded87244e802343eacb789f2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3795063
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-08-01 22:54:48 +00:00
Frank Barchard
eec8dd37e8
Change ScaleUVRowUp2_Bilinear_16_SSE2 to SSE41
...
Bug: libyuv:928
xed -i scale_gcc.o:
SYM ScaleUVRowUp2_Linear_16_SSE2:
XDIS 0: LOGICAL SSE2 660FEFED pxor xmm5, xmm5
XDIS 4: SSE SSE2 660F76E4 pcmpeqd xmm4, xmm4
XDIS 8: SSE SSE2 660F72D41F psrld xmm4, 0x1f
XDIS d: SSE SSE2 660F72F401 pslld xmm4, 0x1
XDIS 12: DATAXFER SSE2 F30F7E07 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER SSE2 F30F7E4F04 movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE SSE2 660F61C5 punpcklwd xmm0, xmm5
XDIS 1f: SSE SSE2 660F61CD punpcklwd xmm1, xmm5
XDIS 23: DATAXFER SSE2 660F6FD0 movdqa xmm2, xmm0
XDIS 27: DATAXFER SSE2 660F6FD9 movdqa xmm3, xmm1
XDIS 2b: SSE SSE2 660F70D24E pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE SSE2 660F70DB4E pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE SSE2 660FFED4 paddd xmm2, xmm4
XDIS 39: SSE SSE2 660FFEDC paddd xmm3, xmm4
XDIS 3d: SSE SSE2 660FFED0 paddd xmm2, xmm0
XDIS 41: SSE SSE2 660FFED9 paddd xmm3, xmm1
XDIS 45: SSE SSE2 660FFEC0 paddd xmm0, xmm0
XDIS 49: SSE SSE2 660FFEC9 paddd xmm1, xmm1
XDIS 4d: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS 51: SSE SSE2 660FFECB paddd xmm1, xmm3
XDIS 55: SSE SSE2 660F72D002 psrld xmm0, 0x2
XDIS 5a: SSE SSE2 660F72D102 psrld xmm1, 0x2
XDIS 5f: SSE SSE4 660F382BC1 packusdw xmm0, xmm1
XDIS 64: DATAXFER SSE2 F30F7F06 movdqu xmmword ptr [rsi], xmm0
XDIS 68: MISC BASE 488D7F08 lea rdi, ptr [rdi+0x8]
XDIS 6c: MISC BASE 488D7610 lea rsi, ptr [rsi+0x10]
XDIS 70: BINARY BASE 83EA04 sub edx, 0x4
XDIS 73: COND_BR BASE 7F9D jnle 0x12 <ScaleUVRowUp2_Linear_16_SSE2+0x12>
XDIS 75: RET BASE C3 ret
SYM ScaleUVRowUp2_Bilinear_16_SSE2:
XDIS 0: LOGICAL SSE2 660FEFFF pxor xmm7, xmm7
XDIS 4: SSE SSE2 660F76F6 pcmpeqd xmm6, xmm6
XDIS 8: SSE SSE2 660F72D61F psrld xmm6, 0x1f
XDIS d: SSE SSE2 660F72F603 pslld xmm6, 0x3
XDIS 12: DATAXFER SSE2 F30F7E07 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER SSE2 F30F7E4F04 movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE SSE2 660F61C7 punpcklwd xmm0, xmm7
XDIS 1f: SSE SSE2 660F61CF punpcklwd xmm1, xmm7
XDIS 23: DATAXFER SSE2 660F6FD0 movdqa xmm2, xmm0
XDIS 27: DATAXFER SSE2 660F6FD9 movdqa xmm3, xmm1
XDIS 2b: SSE SSE2 660F70D24E pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE SSE2 660F70DB4E pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE SSE2 660FFED0 paddd xmm2, xmm0
XDIS 39: SSE SSE2 660FFED9 paddd xmm3, xmm1
XDIS 3d: SSE SSE2 660FFEC0 paddd xmm0, xmm0
XDIS 41: SSE SSE2 660FFEC9 paddd xmm1, xmm1
XDIS 45: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS 49: SSE SSE2 660FFECB paddd xmm1, xmm3
XDIS 4d: DATAXFER SSE2 F30F7E1477 movq xmm2, qword ptr [rdi+rsi*2]
XDIS 52: DATAXFER SSE2 F30F7E5C7704 movq xmm3, qword ptr [rdi+rsi*2+0x4]
XDIS 58: SSE SSE2 660F61D7 punpcklwd xmm2, xmm7
XDIS 5c: SSE SSE2 660F61DF punpcklwd xmm3, xmm7
XDIS 60: DATAXFER SSE2 660F6FE2 movdqa xmm4, xmm2
XDIS 64: DATAXFER SSE2 660F6FEB movdqa xmm5, xmm3
XDIS 68: SSE SSE2 660F70E44E pshufd xmm4, xmm4, 0x4e
XDIS 6d: SSE SSE2 660F70ED4E pshufd xmm5, xmm5, 0x4e
XDIS 72: SSE SSE2 660FFEE2 paddd xmm4, xmm2
XDIS 76: SSE SSE2 660FFEEB paddd xmm5, xmm3
XDIS 7a: SSE SSE2 660FFED2 paddd xmm2, xmm2
XDIS 7e: SSE SSE2 660FFEDB paddd xmm3, xmm3
XDIS 82: SSE SSE2 660FFED4 paddd xmm2, xmm4
XDIS 86: SSE SSE2 660FFEDD paddd xmm3, xmm5
XDIS 8a: DATAXFER SSE2 660F6FE0 movdqa xmm4, xmm0
XDIS 8e: DATAXFER SSE2 660F6FEA movdqa xmm5, xmm2
XDIS 92: SSE SSE2 660FFEE0 paddd xmm4, xmm0
XDIS 96: SSE SSE2 660FFEEE paddd xmm5, xmm6
XDIS 9a: SSE SSE2 660FFEE0 paddd xmm4, xmm0
XDIS 9e: SSE SSE2 660FFEE5 paddd xmm4, xmm5
XDIS a2: SSE SSE2 660F72D404 psrld xmm4, 0x4
XDIS a7: DATAXFER SSE2 660F6FEA movdqa xmm5, xmm2
XDIS ab: SSE SSE2 660FFEEA paddd xmm5, xmm2
XDIS af: SSE SSE2 660FFEC6 paddd xmm0, xmm6
XDIS b3: SSE SSE2 660FFEEA paddd xmm5, xmm2
XDIS b7: SSE SSE2 660FFEE8 paddd xmm5, xmm0
XDIS bb: SSE SSE2 660F72D504 psrld xmm5, 0x4
XDIS c0: DATAXFER SSE2 660F6FC1 movdqa xmm0, xmm1
XDIS c4: DATAXFER SSE2 660F6FD3 movdqa xmm2, xmm3
XDIS c8: SSE SSE2 660FFEC1 paddd xmm0, xmm1
XDIS cc: SSE SSE2 660FFED6 paddd xmm2, xmm6
XDIS d0: SSE SSE2 660FFEC1 paddd xmm0, xmm1
XDIS d4: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS d8: SSE SSE2 660F72D004 psrld xmm0, 0x4
XDIS dd: DATAXFER SSE2 660F6FD3 movdqa xmm2, xmm3
XDIS e1: SSE SSE2 660FFED3 paddd xmm2, xmm3
XDIS e5: SSE SSE2 660FFECE paddd xmm1, xmm6
XDIS e9: SSE SSE2 660FFED3 paddd xmm2, xmm3
XDIS ed: SSE SSE2 660FFED1 paddd xmm2, xmm1
XDIS f1: SSE SSE2 660F72D204 psrld xmm2, 0x4
XDIS f6: SSE SSE4 660F382BE0 packusdw xmm4, xmm0
XDIS fb: DATAXFER SSE2 F30F7F22 movdqu xmmword ptr [rdx], xmm4
XDIS ff: SSE SSE4 660F382BEA packusdw xmm5, xmm2
XDIS 104: DATAXFER SSE2 F30F7F2C4A movdqu xmmword ptr [rdx+rcx*2], xmm5
XDIS 109: MISC BASE 488D7F08 lea rdi, ptr [rdi+0x8]
XDIS 10d: MISC BASE 488D5210 lea rdx, ptr [rdx+0x10]
XDIS 111: BINARY BASE 4183E804 sub r8d, 0x4
XDIS 115: COND_BR BASE 0F8FF7FEFFFF jnle 0x12 <ScaleUVRowUp2_Bilinear_16_SSE2+0x12>
XDIS 11b: RET BASE C3 ret
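For readers who do not want to trace the listing, the arithmetic it performs can be modelled per lane in scalar C. This is only a sketch read off the disassembly above: the function names are hypothetical and are not libyuv API. The rounding constants (2 and 8) and shifts (2 and 4) come from the pslld/psrld immediates, and the final pack is the packusdw that xed flags as SSE4 rather than SSE2.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane; names are illustrative only. */

/* Linear: 3:1 blend of two horizontal neighbours, rounded to nearest. */
static uint16_t Linear3to1(uint16_t near_px, uint16_t far_px) {
  return (uint16_t)((3u * near_px + far_px + 2u) >> 2);
}

/* Bilinear: the same 3:1 blend applied horizontally, then vertically,
 * which gives weights 9:3:3:1 out of 16. */
static uint16_t Bilinear9331(uint16_t near_near, uint16_t near_far,
                             uint16_t far_near, uint16_t far_far) {
  uint32_t near_row = 3u * near_near + near_far; /* horizontal, near row */
  uint32_t far_row = 3u * far_near + far_far;    /* horizontal, far row */
  return (uint16_t)((3u * near_row + far_row + 8u) >> 4);
}

/* One lane of a 32-bit to 16-bit pack with unsigned saturation,
 * which is what packusdw does for each signed 32-bit input. */
static uint16_t PackUnsignedSaturate(int32_t v) {
  if (v < 0) return 0;
  if (v > 65535) return 65535;
  return (uint16_t)v;
}
```

Because the weighted sums never exceed 65535 after the shift, the saturation in the pack step should not trigger for these kernels; the instruction is used only to narrow 32-bit lanes back to 16 bits.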
Change-Id: Ia20860e9c3c45368822cfd8877167ff0bf973dcc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3587602
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-15 18:46:09 +00:00
Wan-Teh Chang
18f9110516
Avoid AVX instructions in ScaleRowUp2_Linear_SSSE3
...
The "vpackuswb %%xmm2,%%xmm0,%%xmm0" and "vmovdqu %%xmm0,(%1)"
instructions in ScaleRowUp2_Linear_SSSE3() are AVX instructions. They
cause an illegal instruction exception on CPUs that do not support AVX.
Bug: libyuv:927
Bug: chromium:1312551
Change-Id: I87b2aaf041e7d185e7e8fb07172d4f37482e9d08
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3585881
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2022-04-15 00:18:39 +00:00
Frank Barchard
49ebc996aa
Make 2 step transitive tests measure 2 step time.
...
Add tests of all macros used by libyuv public headers
When a 1 step conversion is added, a 2 step test can compare
the old 2 step method to the 1 step. A 1 step unittest is
also added which compares C to SIMD. Making the 2 step
conversions measure performance of the 2 steps allows the
old 2 step performance to be compared to 1 step.
All macros used in public headers are added to an ifdef test.
Showing them in a unittest allows some diagnostics when
a test is failing.
Bug: libyuv:901
Change-Id: I7ffa6ed0cb3b506fa1b7fd4b7b1b729658c3c266
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2857916
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-04-30 18:14:57 +00:00
Frank Barchard
5e05f26a2b
Switch win32 to row_gcc for clangcl.
...
Bug: libyuv:900, libyuv:848, b/178283356, b/185922513
Change-Id: I7697953753391c555a778198db36412c853fb29e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2844962
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
2021-04-22 19:32:32 +00:00
Yuan Tong
c41eabe3d4
Add full 16 bit scaling up by 2x function
...
R=fbarchard@chromium.org
Change-Id: I4a869aefdc16e34357a615727711594c5d8e3a80
Bug: libyuv:882
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2719842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-03-02 19:29:02 +00:00
Frank Barchard
d768774299
Add yuvconstants util
...
miscellaneous cleanup of other code/comments
Bug: libyuv:873, libyuv:877
Change-Id: I0d8caf9a65908ff8898b25494f7c724775f84fa3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2692930
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-12 19:45:16 +00:00
Yuan Tong
d4ecb70610
Add P010ToP410 and P210ToP410
...
These are 16 bit bi-planar convert functions to scale UV plane to
Y plane's size using (bi)linear filter.
libyuv_unittest --gtest_filter=*ToP41*
R=fbarchard@chromium.org
Bug: libyuv:872
Change-Id: I3cb4fafe2b2c9eedd0d91cf4c619abb9ee107bc1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2690102
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-12 14:55:24 +00:00
Frank Barchard
12a4a2372c
Rounding added to scaling upsampler
...
Bug: libyuv:872, b/178521093
Change-Id: I86749f73f5e55d5fd8b87ea6938084cbacb1cda7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2686945
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-02-10 18:51:02 +00:00
Yuan Tong
f7fc83f46d
Add NV12ToNV24 and NV16ToNV24
...
These are bi-planar convert functions to scale UV plane to Y plane's size using (bi)linear filter.
libyuv_unittest --gtest_filter=*ToNV24*
R=fbarchard@chromium.org
Change-Id: I3d98f833feeef00af3c903ac9ad0e41bdcbcb51f
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2682152
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-09 07:38:40 +00:00
Frank Barchard
942c508448
BT.2020 Full Range yuvconstants
...
new color util to compute constants needed based on white point.
[ RUN ] LibYUVColorTest.TestFullYUVV
hist         -2        -1         0         1        2
red           0   1627136  13670144   1479936        0
green    319285   3456836   9243059   3440771   317265
blue          0   1561088  14202112   1014016        0
Bug: libyuv:877, b/178283356
Change-Id: If432ebfab76b01302fdb416a153c4f26ca0832d6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2678859
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-02-06 00:26:55 +00:00
Yuan Tong
fc61dde1eb
Add special optimization for I420ToI444 and I422ToI444
...
These functions use (bi)linear filter, to scale U and V planes to the size of Y plane.
This will help enhance the quality of YUV to RGB conversion.
Also added 10bit and 12bit version:
I010ToI410
I210ToI410
I012ToI412
I212ToI412
libyuv_unittest --gtest_filter=LibYUVConvertTest.I42*ToI444*:LibYUVConvertTest.I*1*ToI41*
R=fbarchard@chromium.org
Change-Id: Ie4a711a5ba28f2ff1f44c021f7a5c149022264c5
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2658097
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2021-02-03 10:53:02 +00:00
Frank Barchard
b7a1c5ee5d
Scale by even factor low level row function
...
Bug: b/171884264
Change-Id: I6a94bde0aa05e681bb4590ea8beec33a61ddbfc9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2518361
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-11-03 21:25:18 +00:00
Frank Barchard
cec28e7088
PlaneScale, UVScale and ARGBScale test 3x and 4x down sample.
...
Intel SkylakeX
UVTest3x (1925 ms)
UVTest4x (2915 ms)
PlaneTest3x (2040 ms)
PlaneTest4x (4292 ms)
ARGBTest3x (2079 ms)
ARGBTest4x (1854 ms)
Pixel 2
ARGBTest3x (3602 ms)
ARGBTest4x (4064 ms)
PlaneTest3x (3331 ms)
PlaneTest4x (8977 ms)
UVTest3x (3473 ms)
UVTest4x (6970 ms)
Bug: b/171798872, b/171884264
Change-Id: Iebc70fed907857b6cb71a9baf2aba9861ef1e3f7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2505601
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-28 20:41:59 +00:00
Frank Barchard
a4ec5cf9c2
UVScale down use AVX2 and Neon for aarch32
...
Intel SkylakeX
Was SSSE3 UVScaleDownBy4_Box (2496 ms)
Now AVX2 UVScaleDownBy4_Box (1983 ms)
Was SSSE3 UVScaleDownBy2_Box (380 ms)
Now AVX2 UVScaleDownBy2_Box (360 ms)
Pixel 4 aarch32
Was UVScaleDownBy4_Box (4295 ms)
Now UVScaleDownBy4_Box (3307 ms)
Was UVScaleDownBy2_Box (1022 ms)
Now UVScaleDownBy2_Box (778 ms)
Bug: libyuv:838
Change-Id: Ic823fa15e5761c1b9a897da27341adbf1ed39883
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2470196
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-14 06:23:26 +00:00
Frank Barchard
d730dc2f18
2x down sample for UV planes ported to SSSE3 / NEON
...
Bug: libyuv:838
Change-Id: Id9fb3282a3e86143d76b5e0cb557f0523a88b3c8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2465578
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-10-13 21:42:15 +00:00
Frank Barchard
413a8d8041
Add AYUVToNV12 and NV21ToNV12
...
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*ToNV12* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
R=rrwinterton@gmail.com
Change-Id: Id03b4613211fb6a6e163d10daa7c692fe31e36d8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1560080
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2019-04-12 17:48:45 +00:00
Frank Barchard
664c735677
I420ToYUY2_AVX2 port
...
I420 and I422 To YUY2 and UYVY ported from SSE2 to AVX2.
Was SSE2
I420ToYUY2_Opt (135 ms)
I420ToUYVY_Opt (148 ms)
I422ToYUY2_Opt (145 ms)
I422ToUYVY_Opt (142 ms)
Now AVX2
I420ToYUY2_Opt (133 ms)
I420ToUYVY_Opt (130 ms)
I422ToYUY2_Opt (127 ms)
I422ToUYVY_Opt (137 ms)
Bug: libyuv:556
Test: out/Release/libyuv_unittest --sandbox_unittests --gtest_filter=*I42?To*UY*Opt
Change-Id: Ic35f97cee02dc009fd98785589ba17c7cf50bb35
Reviewed-on: https://chromium-review.googlesource.com/892493
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2018-02-01 00:33:25 +00:00
Frank Barchard
92e22cf5b6
Lint cleanup after C99 change CL
...
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-24 19:16:03 +00:00
Frank Barchard
7e389884a1
Switch to C99 types
...
Append _t to all sized types.
uint64 becomes uint64_t etc
Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-23 19:16:05 +00:00
Frank Barchard
ecab5430c2
Remove MEMOPREG x64 NaCL macros
...
MEMOPREG macros are deprecated in row.h
Regular expressions to remove MEMOPREG macros:
MEMOPREG(movd, 0x00, [u_buf], [v_buf], 1, xmm1) \
MEMOPREG\((.*), (.*), (.*), (.*), (.*), (.*)\)
"\1 \2(%\3,%\4,\5),%%\6 \\n"
MEMOPREG(movdqu,0x00,1,4,1,xmm2)
MEMOPREG\((.*),(.*),(.*),(.*),(.*),(.*)\)
"\1 \2(%\3,%\4,\5),%%\6 \\n"
MEMOPREG(movdqu,0x00,1,4,1,xmm2)
MEMOPREG\((.*),(.*),(.*),(.*),(.*),(.*)\)(.*)(//.*)
"\1 \2(%\3,%\4,\5),%%\6 \\n"
TBR=braveyao@chromium.org
Bug: libyuv:702
Test: try bots pass
Change-Id: If8743abd9af2e8c549d0c7d3d49733a9b0f0ca86
Reviewed-on: https://chromium-review.googlesource.com/865964
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-16 19:10:44 +00:00
Frank Barchard
5088f00165
Remove MEMACCESS x64 NaCL macros
...
MEMACCESS macros are deprecated in row.h
Usage examples
"movdqu " MEMACCESS(0) ",%%xmm0 \n"
"movdqu " MEMACCESS2(0x10,0) ",%%xmm1 \n"
Regular expressions to remove MEMACCESS macros:
" MEMACCESS2\((.*),(.*)\) "(.*)\\n"
\1(%\2)\3 \\n"
" MEMACCESS\((.*)\) "(.*)\\n"
(%\1)\2 \\n"
Bug: libyuv:702
Test: try bots pass
Change-Id: I42f62d5dede8ef2ea643e78c204371a7659d25e6
Reviewed-on: https://chromium-review.googlesource.com/862803
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-12 20:37:41 +00:00
Frank Barchard
e3797d1765
Remove MEMOPARG x64 NaCL macros
...
MEMOPARG macros are deprecated in row.h
#opcode " " #offset "(%" #base ",%" #index "," #scale "),%" #arg "\n"
Usage examples
MEMOPARG(movzwl,0x00,1,3,1,k2) // movzwl (%1,%3,1),%k2
Regular expression to remove MEMOPARG macro:
MEMOPARG\((.*),(.*),(.*),(.*),(.*),(.*)\)(.*//.*)
"\1 \2(%\3,%\4,\5),%\6 \\n"
Bug: libyuv:702
Test: try bots pass
Change-Id: I4a5ad2abf5017e651576f4c8c784be1c8dbf5a83
Reviewed-on: https://chromium-review.googlesource.com/863108
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-12 18:26:06 +00:00
Frank Barchard
3694891922
Remove MEMLEA x64 NaCL macros
...
Bug: libyuv:702
Test: try bots pass
Change-Id: I0ee094551734368f2179c298e7bf423ec80a929c
Reviewed-on: https://chromium-review.googlesource.com/857845
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-10 19:16:16 +00:00
Frank Barchard
55310f92bc
Remove NACL_R14 macro
...
Bug: libyuv:702
Test: try bots still build
Change-Id: I05317e45c885955fcda233bdddbd11ce1d246d90
Reviewed-on: https://chromium-review.googlesource.com/854770
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-01-08 22:41:15 +00:00
Lei Zhang
8445617191
Mark a bunch of kArray variables as const.
...
This allows the linker to move the variables from the .data section to
the .rodata section.
Bug: libyuv:254
Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0
Reviewed-on: https://chromium-review.googlesource.com/777033
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2017-11-27 23:38:44 +00:00
Frank Barchard
bbe8c233f2
scale warning fixes for unused parameters
...
BUG=libyuv:680
TEST=builds and runs with no warnings
Change-Id: I7d60ef44292fa6ad4f7c4e2e2657359b864d2dab
Reviewed-on: https://chromium-review.googlesource.com/442670
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
2017-02-15 21:38:59 +00:00
Frank Barchard
e62309f259
clang-format libyuv
...
BUG=libyuv:654
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
137aa63afe
Fix some comment typos
...
BUG=None
TEST=try bots
Review URL: https://codereview.chromium.org/2346633002 .
2016-09-15 15:38:19 -07:00
Frank Barchard
1b3e4aee47
make count a memory variable for 32 bit
...
32 bit clang runs out of registers and the compiler core dumps.
Force the 32 bit build to use a memory variable for the counter.
BUG=libyuv:612
TBR=harryjin@google.com
Review URL: https://codereview.chromium.org/2091913003 .
2016-06-23 20:42:10 -07:00
Frank Barchard
cc88adc620
YUV scale filter columns improved filtering accuracy
...
To reproduce: upscale a YUV image and observe the change in hue, green especially.
Fix: disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C.
With the C version the hue, green especially, is better.
was ScaleFrom1280x720_Bilinear (1620 ms)
now ScaleFrom1280x720_Bilinear (1907 ms)
BUG=libyuv:605
TEST=try bots
R=harryjin@google.com , wangcheng@google.com
Review URL: https://codereview.chromium.org/2084533006 .
2016-06-23 20:16:55 -07:00
Frank Barchard
cf101116c9
Remove initialize to zero on output variables for inline.
...
Inline assembly that uses temporary variables currently initializes them
to 0 and passes them in as read-write "+r" operands.
This CL changes the constraint to "=&r" for most of them, meaning a
write-only output that is written early (before the inputs are consumed).
This allows the initialize to zero step to be removed, saving 1 instruction.
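A minimal sketch of the constraint in question, assuming GCC or clang with AT&T syntax on x86. AddWithTemp is a hypothetical helper written for illustration, not a libyuv function. The scratch operand is declared "=&r", so it needs no initial value, and the early-clobber modifier keeps the compiler from assigning it the same register as an input that is still needed.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b through a scratch register. */
static uint32_t AddWithTemp(uint32_t a, uint32_t b) {
  uint32_t result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  uint32_t temp; /* left uninitialized on purpose: "=&r" is write-only */
  __asm__(
      "movl %2,%1 \n" /* temp = a, written while b is still unread */
      "addl %3,%1 \n" /* temp += b */
      "movl %1,%0 \n" /* result = temp */
      : "=r"(result), "=&r"(temp)
      : "r"(a), "r"(b));
#else
  result = a + b; /* portable fallback for other compilers and CPUs */
#endif
  return result;
}
```

With "+r" the compiler must materialize a value for temp before the asm statement, which is the extra zeroing instruction the CL removes. Without the "&", temp could legally share a register with b and be overwritten by the first movl.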
BUG=libyuv:580
TESTED=local libyuv build on gcc/linux and try bots
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1895743008 .
2016-04-18 16:24:26 -07:00
Frank Barchard
36615d62a0
fix for InterpolateRow_AVX2
...
port scaledownby4_avx2 to gcc
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1546763002 .
2015-12-22 12:29:54 -08:00
Frank Barchard
80ca4514ef
change scale down by 4 to use rounding.
...
TBR=harryjin@google.com
BUG=libyuv:447
Review URL: https://codereview.chromium.org/1525033005 .
2015-12-15 21:25:18 -08:00
Frank Barchard
70445ef2ef
avx2 scale down by 2 for gcc
...
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:527
Review URL: https://codereview.chromium.org/1520423003 .
2015-12-15 10:59:20 -08:00
Frank Barchard
ae55e41851
use rounding in scaledown by 2
...
When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.
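The difference between the two formulas can be seen with a small scalar sketch. It assumes each average rounds up the way the x86 pavgb instruction does, (a + b + 1) >> 1; the function names are hypothetical and not taken from libyuv.

```c
#include <stdint.h>

/* Rounded average of two bytes, as pavgb defines it. */
static uint8_t Avg2(uint8_t a, uint8_t b) {
  return (uint8_t)((a + b + 1) >> 1);
}

/* Old approach: two levels of averaging, each level rounds up. */
static uint8_t BoxAvgOfAvg(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  return Avg2(Avg2(a, b), Avg2(c, d));
}

/* New approach: sum all four pixels, then round once. */
static uint8_t BoxSumRound(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  return (uint8_t)((a + b + c + d + 2) >> 2);
}
```

For the 2x2 block 0, 1, 0, 0 the summed version gives (1 + 2) >> 2 = 0, while the nested averages give Avg2(1, 0) = 1, so the double rounding biases the result upward.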
R=dhrosa@google.com , harryjin@google.com
BUG=libyuv:447,libyuv:527
Review URL: https://codereview.chromium.org/1513183004 .
2015-12-14 17:25:36 -08:00
Frank Barchard
3e38762d6b
fix avx2 box filter bug for yuv down sampling.
...
The offset to the second group of pixels was 16.
It should have been 32.
requires avx2 hardware and wide image for test.
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:492,libyuv:501
Review URL: https://codereview.chromium.org/1395603002 .
2015-10-07 11:02:33 -07:00
Frank Barchard
914a9856c7
Reimplement NV21ToARGB to allow different color matrix.
...
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V. But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.
TBR=harryjin@google.com
BUG=libyuv:500
Review URL: https://codereview.chromium.org/1388273002 .
2015-10-06 20:34:44 -07:00
Frank Barchard
68fa59c873
add box scaling avx2 optimization for gcc
...
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1392803002 .
2015-10-06 20:01:02 -07:00
Frank Barchard
d70293993f
port scale box filter sse2 to gcc
...
TBR=harryjin@google.com
BUG=libyuv:492
Review URL: https://codereview.chromium.org/1393653002 .
2015-10-06 16:54:26 -07:00
Frank Barchard
94d4269936
clang use scalewin
...
R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:469
Review URL: https://webrtc-codereview.appspot.com/51329004 .
2015-08-18 14:50:27 -07:00
fbarchard@google.com
2e9f3e5cf5
Rename source files from row_posix.cc etc. to row_gcc.cc so that the gyp build does not filter them out when building on Windows with clang. The source code in row_gcc.cc is gcc-syntax inline assembly, available on any platform that supports gcc or clang for Intel CPUs.
...
BUG=440
TESTED=try bots
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/56579004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1430 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-06-09 17:27:52 +00:00